More people are starting to write SQL who have a background in programming as more and more people are analyzing data as data science related jobs are exploding. SQL is a little different than other languages and here are some common patterns which should be avoided. Here are 5 things, in no particular order, you can do to improve your SQL.
1. Use the correct join for the job
There are a number of different joins you can use in SQL, but there are 3 common ones, Left Join, Right Join and Inner join. When joining two tables you will get only the records which match with an inner join. Any records which cannot be joined will be excluded. For this reason, many people only write left or right joins which automatically return all of the rows in the left or right hand side, respectively. These joins are always slower that Inner joins, and may not be necessary. Evaluate your joins to see if the record count changes if you do an inner join before you do a left or right join.
2. Do not Loop through rows of data
SQL is designed to process data in sets, not one record at a time. In fact, it does not work well processing one row at a time. Looping through records one at a time in a cursor or a while loop is to be avoided. Think about how you might write the code so that it processes the records at one time.
3. Avoid Nesting SQL code
SQL is not an object-oriented language. This means the performance and readability does not improve when views, stored Procedures Or cursors are called from other objects. Try to do all of the work in one place. For more information on nested views, check out my post on the topic.
4. Do not use column names which include spaces or reserved words
While you can get away with using spaces or reserved words by putting the name of the column in brackets, most people are not used to writing code this way and it is inevitable that someone will leave off the square brackets and the code will not work. Have pity on those who will be supporting your code and do not include spaces. If you are unsure of what the reserved words are, see if they change color in the editor.
5.Use Sets in SQL and Avoid Temp tables
Someone who attended I taught a class recently told me that they didn’t think that it was possible to write a stored procedure without a temp table because all of the stored procedures at her work had them. A common pattern which I see is to processing a bunch of records and putting them in a temp table, then processing them again. This pattern should be avoided. You can process records in one query. Look at writing more complex queries which look at all of the data in one query.
Resolving Common SQL Pitfalls for People who Develop Lots of Code
Sometimes all it takes to figure out how to write SQL using sets is to have someone show you how. If you are in Phoenix, I can show you how on February 5th when I am doing a talk at Galvanize on this topic. I would be happy to show you some tips and tricks which would help gather and analyze data for an experiment. Not in the area or can’t make it? Check out my class at DataCamp. You can try the class for free to see what you think. I look forward to hearing your feedback.
Yours Always
Ginger Grant
Data aficionado et SQL Raconteur