Rob Farley

Rob Rob Farley has been consulting in IT since completing a Computer Science degree with first class honours in 1997. Before moving to Adelaide, he worked in consultancies in Melbourne and London. He runs the development department in one of Australia's leading IT firms, as well as doing database application consultancy and training. He heads up the Adelaide SQL Server User Group, and holds several Microsoft certifications.

Rob has been involved with Microsoft technologies for most of his career, but has also done significant work with Oracle and Unix systems. His preferred database is SQL Server and his preferred language is C#. Recently he has been involved with Microsoft Learning in the US, creating and reviewing new content for the next generation of Microsoft exams.

Over the years, Rob's clients have included BP Oil, OneLink Transit, Accenture, Avanade, Australian Electorial Commission, the Chartered Institute of Personnel and Development, the Royal Borough of Kingston, Help The Aged, Unisys, Department of Treasury and Finance (Vic), National Mutual, the Bible Society and others.

Did you mean to come here? My blog is now at http://msmvps.com/blogs/robfarley



28 April 2005

Common SQL errors and pseudo-code

Recently I've been reading (and commenting) on various SQL newsgroups.

More and more I find that the problems people have are because they haven't defined what they want properly. Perhaps this is a difference between people who learn to program in a structured environment (such as a university), and people who learn to program by being thrown in at the deep end.

When I went to uni, I could already code adequately. I wasn't a great programmer, but I didn't feel like I had major coding issues. At uni, they taught us things like pseudo-code, and I kinda dismissed it, thinking that that was for people who couldn't code. But quite quickly, I saw the value in it.

Pseudo-code is really handy in all kinds of ways. By writing out what I'm trying to do in my own language, I can achieve a stack of things.

Firstly, it makes me think better about what it is I'm trying to achieve. Writing something out in English will help you notice when you're approaching something from the wrong direction.

Secondly, it helps show me where modules should be written. Anything I've written as a sentence that isn't almost a line of code in itself, I put as a function, or better, re-use a function I've written before. If it's new, I write pseudo-code for that function, which helps me work out what information I need to pass in and return.

Thirdly, it breaks what I'm doing down into smaller, more quickly achievable chunks. This is a big win. It means that your project is broken down to really measurable components. If you finish your pseudo-code and find that you have 100 things to write, then you can mark them off your list, and easily see how far you have to go.

But back to SQL...

SQL queries are really not that different. Hopefully your database design was based on a chunk of English in the first place, so that you have a reference to what everything does in a language you understand nicely. So then, when you need to write a query, you start by expressing it in English. Then you explain what you mean by each statement. When it's broken down enough, your query is just about written already.

For example. Recently someone asked about a query that would run on the 'pubs' database to show all the employees that had the next anniversary of their hiredate coming up in the next 30 days.

Well, considering that 'hire_date' is stored for everyone in the employees table, the location of the data is a no-brainer. It's only the computing bit that is going to be tricky.

So let's explain our statement better.

Someone's anniversary is the date this year (or next year if it's already passed) which is an exact number of years added on to the original date.

The 'exact number of years' added on to the original date is just the difference in the number of years between then and now.

Oh yeah - let's not use getdate(), because that makes testing harder. Later, we can replace our variable with getdate(), but that's for later.

datepart(year,hire_date) is the year of hire.
datepart(year,@now) is this year.

So, dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date) is when that date occurred this year.

So then we check to see if it's already been (but comparing with tomorrow, because we don't want to miss today!)


case when dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date) < dateadd(day,1,@now)
then dateadd(year,datepart(year,@now)-datepart(year,hire_date)+1,hire_date)
else dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date)
end


So you see, it pretty much wrote itself! Next put all the values you need to compare in the resultset, and then make your where clause from that:

So you construct:


Select fname, lname, hire_date,
,datepart(year,hire_date)
,dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date)
,case when dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date) < dateadd(day,1,@now)
then dateadd(year,datepart(year,@now)-datepart(year,hire_date)+1,hire_date)
else dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date)
end
From employee
where datediff(day, @now, case when dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date) < dateadd(day,1,@now)
then dateadd(year,datepart(year,@now)-datepart(year,hire_date)+1,hire_date)
else dateadd(year,datepart(year,@now)-datepart(year,hire_date),hire_date)
end) <= 30
order by 7


Looks messy... could maybe be optimised a little. But it's correct, which counts for a lot. And we could verify that it was correct by looking down the list of values in the earlier statement. And because we're using a variable, we could look for interesting cases such as leap years and entries near the New Year.