THE SQL Server Blog Spot on the Web
Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Aaron Bertrand

My stored procedure "best practices" checklist

When developing stored procedures, there seems to be a lot of emphasis on "get it done fast." Which means type all lower case, pay little attention to formatting, and sometimes throw best practices out the window. Personally, I would rather front-load my development time; I think that the costs I pay in initial development far outweigh what I might have paid in maintenance down the road. Making readable and maintainable code that also performs well and is delivered in a timely manner is something that a lot of us strive for, but we don't always have the luxury. But I have found that it is very easy to fall into the good kind of development habits.

A popular adage is, "you can have it fast, cheap, or good. Pick two." I contend that if you develop habits like these and use them in all of your database programming, the time difference between following those methods and doing it the "lazy" way will be negligible at most; and so, fast and good go hand in hand, rather than trade off for one another.

Once in a while this "disorder" slows me down. I come across code that someone else wrote (almost exclusively it is someone I no longer work with), and I can't even bear to look at it without first re-writing it. Here is a fake but realistic example of the kinds of procedures I see:

create proc foo(@i int,@bar int=null,@hr int output,@xd datetimeas
declare 
@c varchar
declare 
@s nchar(2)
declare @x int
set 
@grok='Beverly'
set @korg='MA'
set @x=5
select customers.customerid,firstname,lastname,orderdate from customers join orders on
customers.customerid=orders.customerid where status=@i or status<=@bar and orderdate<=@xd
set @hr @@rowcount
select customers.customerid,count(*) from customers left join orders on 
customers.customerid
=orders.customerid where customers.city=@c and customers.state=@s 
group by 
customers.customerid having count(*)>=@x
return (@@rowcount)

This kind of feels like the 5th grade all over again, but when I get handed code like this, I start immediately visualizing one of those "find all of the things wrong with this picture" exercises, and feel compelled to fix them all. So, what is wrong with the above sample, you may ask? Well, let me go through my own personal (and quite subjective) subconscious checklist of best practices when I write my own stored procedures. I have never tried to list these all at once, so I may be all over the place, but hopefully I will justify why I choose to have these items on my checklist in the first place.

======================

Upper casing T-SQL keywords and built-in functions

I always use CREATE PROCEDURE and not create procedure or Create Procedure. Same goes for all of the code throughout my objects... you will always see SELECT, FROM, WHERE and not select, from, where. I just find if much more readable when all of the keywords are capitalized. It's not that hard for me to hold down the shift key while typing these words, and there are even IDEs that will do this kind of replacement for you (for example, Apex SQLEdit has a handy "mis-spelled keyword replacement" feature that I think could be used for this purpose also). This is probably one of the few areas where Celko and I actually agree. :-)

======================

Using a proper and consistent naming scheme

Obviously "foo" is a horribly ridiculous name for a procedure, but I have come across many that were equally nondescript. I like to name my objects using {target}_{verb}. So for example, if I have a Customers table, I would have procedures such as:

    dbo.Customer_Create
    dbo.Customer_Update
    dbo.Customer_Delete
    dbo.Customer_GetList
    dbo.Customer_GetDetails

This allows them to sort nicely in Object Explorer / Object Explorer Details, and also narrows down my search quickly in an IntelliSense (or SQLPrompt) auto- complete list. If I have a stored procedures named in the style dbo.GetCustomerList, they get mixed up in the list with dbo.GetClientList and dbo.GetCreditList. You could argue that maybe these should be organized by schema, but in spite of all the buzz, I have not developed a need or desire to use schemas in this way. For most of the applications I develop, ownership/schema is pretty simple and doesn't need to be made more complex.

Of course I NEVER name stored procedures using the sp_ prefix. See Brian Moran's article in SQL Server Magazine back in 2001. Or just ask anybody. :-) I also avoid other identifying object prefixes (like usp_). I don't know that I've ever been in a situation where I couldn't tell that some object was a procedure, or a function, or a table, and where the name really would have helped me all that much. This is especially true for the silly (but common) "tbl" prefix on tables. I don't want to get into that here, but I've always scratched my head at that one. Views may be the only place where I think this is justified, but then it should be a v or View_ prefix on the views only; no need to also identify tables... if it doesn't have a v or View_ prefix, it's a table!

More important than coming up with a proper naming scheme (because that is mostly subjective), it is much more important that you apply your naming scheme consistently. Nobody wants to see procedures named dbo.Customer_Create, dbo.Update_Customer and dbo.GetCustomerDetails.

======================

Using the schema prefix

I always specify the schema prefix when creating stored procedures. This way I know that it will be dbo.procedure_name no matter who I am logged in as when I create it. Similarly, my code always has the schema prefix on all object references. This prevents the database engine from checking for an object under my schema first.

======================

Using parentheses around parameter list

I am not a big fan of using parentheses around the parameter list. I can't really explain it, as I am a proponent of consistency, and this is the syntax required when creating user-defined functions. But I wanted to mention it because you will not see any of my stored procedures using this syntax. I'm open to change if you can suggest a good enough reason for me to do so.

======================

Lining up parameter names, data types, and default values

I find this much easier to read:

CREATE PROCEDURE dbo.User_Update
  
@CustomerID     INT,
  
@FirstName      VARCHAR(32)     NULL,
  
@LastName       VARCHAR(32)     NULL,
  
@Password       VARCHAR(16)     NULL,
  
@EmailAddress   VARCHAR(320)    = NULL,
  
@Active         BIT             1,
  
@LastLogin      SMALLDATETIME   NULL
AS
BEGIN

...

...than this:

CREATE PROCEDURE dbo.User_Update
@CustomerID INT,
@FirstName VARCHAR(32NULL,
@LastName VARCHAR(32NULL,
@Password VARCHAR(16NULL,
@EmailAddress VARCHAR(320NULL,
@Active BIT 1,
@LastLogin SMALLDATETIME NULL
AS
BEGIN
...

======================

Using spaces and line breaks liberally

This is a simple one, but in all comparison operators I like to see spaces between column/variable and operator. So instead of @foo int=null or where @foo>1 I would rather see @foo INT = NULL or WHERE @foo > 1.

I also tend to place at least a carriage return between individual statements, especially in stored procedures where many statements spill over multiple lines.

Both of these are just about readability, nothing more. While in some interpreted languages like JavaScript, size is king, and compressing / obfuscating code to make it as small as possible does provide some benefit, in T- SQL you would be hard-pressed to find a case where this comes into play. So, I lean to the side of readability.

======================

Avoiding data type / function prefixes on column / parameter names

I often see prefixes like @iCustomerID, @prmInputParameter, @varLocalVariable, @strStringVariable. I realize why people do it, I just think it muddies things up. It also makes it much harder to change the data type of a column when not only do you have to change all the variable/parameter declarations but you also have to change @iVarName to @bigintVarName, etc. Otherwise the purpose of the prefixed variable name loses most of its benefit. So, just name the variable for what it is. If you have a column EmailAddress VARCHAR(320), then make your variable/parameter declaration @EmailAddress VARCHAR(320). No need to use @strEmailAddress ... if you need to find out the data type, just go to the declaration line!

======================

Using lengths on parameters, even when optional

I occasionally see people define parameters and local variables as char or varchar, without specifying a length. This is very dangerous, as in many situations you will get silent truncation at 30 characters, and in a few obscure ones, you will get silent truncation at 1 character. This can mean data loss, which is not very good at all. I have asked that this silent truncation at least become consistent throughout the product (see Connect #267605), but nothing has happened yet. Fellow MVP Erland Sommarskog has gone so far as to ask for the length declaration to become mandatory (see Connect #244395) and, failing that, feels that this should be something that raises a warning when using his proposed SET STRICT_CHECKS ON setting (see http://www.sommarskog.se/strict_checks.html#nodefaultlength).

======================

Listing output parameters last

My habit is to list OUTPUT parameters last. I am not sure why that is exactly, except that it is the order that I conceptually think about the parameters... in then out, not the other way around.

======================

Using BEGIN / END liberally

I have seen many people write stuff like this:

CREATE PROCEDURE dbo.ProcedureA
AS
   SELECT 
FROM foo;
   
GO
   
SELECT FROM bar;
GO

They create the procedure, maybe don't notice the extra resultset from bar (or shrug it off), and then wonder why they only get results from foo when they run the procedure. If they had done this:

CREATE PROCEDURE dbo.ProcedureA
AS
BEGIN
   SELECT 
FROM foo;
   
GO
   
SELECT FROM bar;
END
GO

Because GO is not a T-SQL keyword but rather a batch separator for tools like Query Analyzer and SSMS, they would have received these error messages, one from each batch:

Msg 102, Level 15, State 1, Procedure ProcedureA, Line 4
Incorrect syntax near ';'.
Msg 102, Level 15, State 1, Line 2
Incorrect syntax near 'END'.

Yes, errors are bad, and all that, but I would rather have this brought to my face when I try to compile the procedure, then later on when the first user tries to call it.

======================

Using statement terminators

I have quickly adapted to the habit of ending all statements with proper statement terminators (;). This was always a habit in languages like JavaScript (where it is optional) and C# (where it is not). But as T-SQL gets more and more extensions (e.g. CTEs) that require it, I see it becoming a requirement eventually. Maybe I won't even be working with SQL Server by the time that happens, but if I am, I'll be ready. It's one extra keystroke and guarantees that my code will be forward-compatible.

======================

Using SET NOCOUNT ON

I always add SET NOCOUNT ON; as the very first line of the procedure (after BEGIN of course). This prevents DONE_IN_PROC messages from needlessly being sent back to the client after every row-affecting statement, which increases network traffic and in many cases can fool applications into believing there is an additional recordset available for consumption.

    NOTE
    I do not advocate blindly throwing SET NOCOUNT ON into all of your existing stored procedures. If you have existing applications they might actually already be working around the "extra recordset" problem, or there may be .NET applications that are using its result. If you code with SET NOCOUNT ON from the start, and keep track of rows affected in output parameters when necessary, this should never be an issue. Roy Ashbrook got beat up about this topic at a Tampa code camp last summer, and wrote about it here.

======================

Using local variables

When possible, I always use a single DECLARE statement to initialize all of my local variables. Similarly, I try to use a single SELECT to apply values to those variables that are being used like local constants. I see code like this:

declare @foo int
declare 
@bar int
declare 
@x int
set 
@foo 5
set @bar 6
set @x -1

And then some more declare and set statements later on in the code. I find it much harder to track down variables in longer and more complex procedures when the declaration and/or assignments can happen anywhere... I would much rather have as much of this as possible occurring in the beginning of the code. So for the above I would rather see:

DECLARE 
   
@foo    INT,
   
@bar    INT,
   
@x      INT;

SELECT
   
@foo    5,
   
@bar    6,
   
@x      -1;

As a bonus, in SQL Server 2008, the syntax now supports changing the above into a single statement:

DECLARE
   
@foo    INT = 5,
   
@bar    INT = 6,
   
@x      INT = -1;

So much nicer. However, it still leaves a lot to be desired: I also always use meaningful variables names, rather than @i, @x, etc.

Also, some people like listing the commas at the beginning of each new line, e.g.:

DECLARE
   
@foo    INT = 5
   
,@bar   INT = 6
   
,@x     INT = -1;

Not just in variable declarations, but also in parameter lists, columns lists, etc. While I will agree that this makes it easier to comment out individual lines in single steps, I find the readability suffers greatly.

======================

Using table aliases

I use aliases a lot. Nobody wants to read (never mind type) this, even though I have seen *many* examples of it posted to the public SQL Server newsgroups:

SELECT 
   
dbo.table_X_with_long_name.column1,
   
dbo.table_X_with_long_name.column2,
   
dbo.table_X_with_long_name.column3,
   
dbo.table_X_with_long_name.column4,
   
dbo.table_X_with_long_name.column5,
   
dbo.table_H_with_long_name.column1,
   
dbo.table_H_with_long_name.column2,
   
dbo.table_H_with_long_name.column3,
   
dbo.table_H_with_long_name.column4
FROM
   
dbo.table_X_with_long_name
INNER JOIN
   
dbo.table_H_with_long_name
ON
   
dbo.table_X_with_long_name.column1 dbo.table_H_with_long_name.column1
   
OR dbo.table_X_with_long_name.column1 dbo.table_H_with_long_name.column1
   
OR dbo.table_X_with_long_name.column1 dbo.table_H_with_long_name.column1
WHERE
   
dbo.table_X_with_long_name.column1 >= 5
   
AND dbo.table_X_with_long_name.column1 10;

But as long as you alias sensibly, you can make this a much more readable query:

SELECT 
   
X.column1,
   
X.column2,
   
X.column3,
   
X.column4,
   
X.column5,
   
H.column1,
   
H.column2,
   
H.column3,
   
H.column4
FROM
   
dbo.table_X_with_long_name AS X
INNER JOIN
   
dbo.table_H_with_long_name AS H
ON
   
X.column1 H.column1
   
OR X.column2 H.column2
   
OR X.column3 H.column3
WHERE
   
X.column1 >= 5
   
AND X.column1 10;

The "AS" when aliasing tables is optional; I have been trying very hard to make myself use it (only because the standard defines it that way). When writing multi-table queries, I don't give tables meaningless shorthand like a, b, c or t1, t2, t3. This might fly for simple queries, but if the query becomes more complex, you will regret it when you have to go back and edit it.

======================

Using column aliases

I buck against the trend here. A lot of people prefer to alias expressions / columns using this syntax:

SELECT [column expression] AS alias

I much prefer:

SELECT alias [column expression]

The reason is that all of my column names are listed down the left hand side of the column list, instead of being at the end. It is much easier to scan column names when they are vertically aligned.

In addition, I always use column aliases for expressions, even if right now I don't need to reference the column by an alias. This prevents me from having to deal with multiple errors should I ever need to move the query into a subquery, or cte, or derived table, etc.

======================

Using consistent formatting

I am very fussy (some co-workers use a different word) about formatting. I like my queries to be consistently readable and laid out in a predictable way. So for a join that includes a CTE and a subquery, this is how it would look:

   WITH cte AS 
   
(
       
SELECT 
           
t.col1,
           
t.col2,
           
t.col3
       
FROM
           
dbo.sometable AS t
   
)
   
SELECT
       
cte.col1,
       
cte.col2,
       
cte.col3,
       
c.col4
   
FROM
       
cte
   
INNER JOIN
       
dbo.Customers AS c
       
ON c.CustomerID cte.col1
   
WHERE EXISTS
   (
       
SELECT 1
           
FROM dbo.Orders o
           
WHERE o.CustomerID o.CustomerID
   
)
   AND 
c.Status 'LIVE';

Keeping all of the columns in a nice vertical line, and visually separating each table in the join and each where clause. Inside a subquery or derived table, I am less strict about the visual separation, though I still put each fundamental portion on its own line. And I always use SELECT 1 in this type of EXISTS() clause, instead of SELECT * or SELECT COUNT(*), to make it immediately clear to others that the query inside does NOT retrieve data.

======================

Matching case of underlying objects / columns

I always try to match the case of the underlying object, as I can never be too certain that my application will always be on a case-sensitive collation. Going back and correcting the case throughout all of my modules will be a royal pain, at best. This is much easier if you are using SQL Server 2008 Management Studio against a SQL Server 2008 instance, or have invested in Red-Gate's SQL Prompt, as you will automatically get the correct case when selecting from the auto-complete list.

======================

Qualifying column names with table/alias prefix

I always qualify column names when there is more than one table in the query. Heck, sometimes I even use aliases when there is only one table in the query, to ease my maintenance later should the query become more complex. I won't harp on this too much, as fellow MVP Alex Kuznetsov treated this subject a few days ago.

======================

Using RETURN and OUTPUT appropriately

I never use RETURN to provide any data back to the client (e.g. the SCOPE_IDENTITY() value or @@ROWCOUNT). This should be used exclusively for returning stored procedure status, such as ERROR_NUMBER() / @@ERROR. If you need to return data to the caller, use a resultset or an OUTPUT parameter.

======================

Avoiding keyword shorthands

I always use full keywords as opposed to their shorthand equivalents. "BEGIN TRAN" and "CREATE PROC" might save me a few keystrokes, and I'm sure the shorthand equivalents are here to stay, but something just doesn't feel right abou tit. Same with the parameters for built-in functions like DATEDIFF(), DATEADD() and DATEPART(). Why use WK or DW when you can use WEEK or WEEKDAY? (I also never understood why WEEKDAY become DW in shorthand, instead of WD, which is not supported. DW likely means DAYOFWEEK but that is an ODBC function and not supported directly in T-SQL at all. That in and of itself convinced me that it is better to take the expensive hit of typing five extra characters to be explicit and clear.) Finally, I always explicitly say "INNER JOIN or "LEFT OUTER JOIN"... never just "join" or "left join." Again, no real good reason behind that, just habit.

======================

Using parentheses liberally around AND / OR blocks

I always group my clauses when mixing AND and OR. Leaving it up to the optimizer to determine what "x=5 AND y = 4 OR b = 3" really means is not my cup of tea. I wrote a very short article about this a few years ago.

======================

So, after all of that, given the procedure I listed at the start of the article, what would I end up with? Assuming I am using SQL Server 2008, and that I can update the calling application to use the right procedure name, to use sensible input parameter names, and to stop using return values instead of output parameters:

CREATE PROCEDURE dbo.Customer_GetOlderOrders
   
@OrderStatus        INT,
   
@MaxOrderStatus     INT = NULL,
   
@OrderDate          SMALLDATETIME,
   
@RC1                INT OUTPUT,
   
@RC2                INT OUTPUT
AS
BEGIN
   SET 
NOCOUNT ON;

   
DECLARE
       
@City           VARCHAR(32'Beverly',
       
@State          CHAR (2)    'MA',
       
@MinOrderCount  INT         = 5;

   
SELECT
       
c.CustomerID,
       
c.FirstName,
       
c.LastName,
       
c.OrderDate
   
FROM
       
dbo.Customers c
   
INNER JOIN
       
dbo.Orders o
      
ON c.CustomerID o.CustomerID
   
WHERE   
       
(
           
o.OrderStatus       @OrderStatus
           
OR o.OrderStatus    <= @MaxOrderStatus
       
)
       AND 
o.OrderDate         <= @MaxOrderDate;

   
SET @RC1 @@ROWCOUNT;

   
SELECT
       
c.CustomerID,
       
OrderCount COUNT(*)
   
FROM
       
dbo.Customers c
   
LEFT OUTER JOIN
       
dbo.Orders o
       
ON c.CustomerID o.CustomerID
   
WHERE
       
c.City @City
       
AND c.State @State
   
GROUP BY
       
c.CustomerID
   
HAVING
       
COUNT(*) >= @MinOrderCount;

   
SET @RC2 @@ROWCOUNT;

   
RETURN;
END
GO

Okay, so it LOOKS like a lot more code, because the layout is more vertical. But you tell me. Copy both procedures to SSMS or Query Analyzer, and which one is easier to read / understand? And is it worth the three minutes it took me to convert the original query?  It took me a few hours to convert this list from my subconscious to you, so hopefully I have helped you pick up at least one good habit.  And if you think any of these are BAD habits, please drop a line and let me know why!


Published Thursday, October 30, 2008 6:40 PM by AaronBertrand

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Alexander Kuznetsov said:

Aaron,

This is great! Thank you for sharing. A cannot agree more.

Also I just do not allow objects in dbo schema, so all objects must be qualified. Besides, I did some benchmarking, and apparently a single SELECT even works faster that a series of SETs. Also I think enforcing that all aliases must be preceded by AS prevents another error

SELECT col1 col2 --comma omitted

--so col2 is interpreted as an alias

Thanks again!

October 30, 2008 5:55 PM
 

AaronBertrand said:

Thanks Alex, and I have to say, your series on defensive database programming is what inspired me...

October 30, 2008 6:52 PM
 

Jason said:

"

And if you think any of these are BAD habits, please drop a line and let me know why!

"

Local variables and\or changing the values of parameters could be considered a bad practice. http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx#EHKAE

Lot of times it doesn't matter but when it does, it really does.

October 30, 2008 9:03 PM
 

AaronBertrand said:

Well Jason, unless you are talking about moving the declaration and default value assignment to the parameter list instead of in the body of the procedure, which isn't even always possible (e.g. optional date parameters that default to something relative to CURRENT_TIMESTAMP, or other values that are populated by functions that can't be placed on the default value line), I think that is more of a business requirements question than a "best practices" question.  I have seen cases where taking a parameter out of the param list and defining it locally has also had similar effects (e.g. improved performance due to changes in estimation).

Anyway, my point wasn't really to say or imply, "you should use local variables!"  But rather, "when I use local variables, here is what I do with them."

October 30, 2008 9:24 PM
 

AaronBertrand said:

BTW, I didn't mean to sound so harsh, I do agree that declaring a variable just to assign a constant value and then only use it once is silly and pointless.  However, if the variable is used many times in a complex stored procedure, or is likely to change the next time you modify / recompile the procedure, I would be less willing to hard-code that constant everywhere, and instead will investigate workarounds if and when they cause a performance problem.  Search and replace is easy enough, but still tedious.

October 30, 2008 9:28 PM
 

Whitney Weaver said:

Great post Aaron...

Do you really make all join conditions three lines?  I think that might be my only difference of opinion.

I completely agree with the dislike of the "tbl" table prefix.  Having worked on a project that was originally coded that way, if I never see it again I will be a happy consultant.

October 30, 2008 10:27 PM
 

AaronBertrand said:

Yes Whitney, I find it much easier to read.  If you have long table names and column names then putting it all on one line makes you read (and sometimes scroll!) horizontally.  # of actual lines does not really scare me too much, but I like the visual consistency this convention provides me.

October 30, 2008 10:53 PM
 

Community Blogs said:

Web 网页栅格系统研究(1):960的秘密 - (2):蛋糕的切法 - (3):粒度问题 12 Steps To Faster Web Pages With Visual Round Trip Analyzer

October 31, 2008 1:00 AM
 

Reflective Perspective - Chris Alcock » The Morning Brew #213 said:

October 31, 2008 4:32 AM
 

Hugo Kornelis said:

Hi Aaron,

You make a very goood point that can't be repeated often enough: making your code readable is one of the most important -if not THE most important- habit for all developers to get into. And that includes SQL developers.

I agree with most of your points. Some loose thoughts:

* Avoiding parentheses around the parameter list for a stored procedure isn't that hard to explain. Just check BOL and you'll see that they are not even allowed by the documented syntax. :)

* I understand your reasons for using "Alias = Expression" instead of "Expression AS Alias" for column aliases. But the other side of the equation is that the latter is more consistent within the query since the "... AS alias" syntax is also used in the FROM clause.

* I do use the shorthand for some keywords. Mainly for outer joins. I find that all my joins line up veryy nicely if I qualify them all with one of "INNER", "LEFT ", "RIGHT", of "FULL ". (Note the extra space after LEFT and FULL!)

* I'm sure you know that I have a different style of query formatting than you do. But I must admit that yours is much more wide-spread - however, I still prefer to have my code laid out as a few neatly aligned columns. But the most important point is of course to have SOME standard and then follow that religiously!

Finally - the link to Alex Kusnetsov's article is broken. There is an extra space (%20) at the end that I had to remove to get to his post.

Thanks for a great post!

Best, Hugo

October 31, 2008 6:24 AM
 

Michael Swart said:

Great article.

There's an on-line sql formatter that has saved me a lot of time: http://www.wangz.net/cgi-bin/pp/gsqlparser/sqlpp/sqlformat.tpl

Of course it won't enforce all coding standards, but it's a good start.

October 31, 2008 8:27 AM
 

unclebiguns said:

I agree that you should have a standard.  I don't necessarily agree with everything you do.  I don't do keywords in all caps, I do first letter in caps.  I rarely spend time in anything but SSMS or Visual Studio and they colorize the key words which, IMO, eliminates the need for caps.

I also prefer the column AS alias instead of alias = column as I prefer to have the actual column names lined up instead of the alias lined up with column names.

I like to do joins this way:

table1 Join

table2 On

   pk1 = fk2 Join

table3 On

   pk2 = fk3

Although I do see the reason you do it on multiple lines to keep all the key words left aligned.

Lastly I just like I like commas at the end of the line I also like AND's and OR's at the end of the line instead of the beginning.  I prefer that my first column names in a Where or Join clause be left aligned.

In reality, I could learn to do it your way.  The key is to have a standard and stick with it.

October 31, 2008 8:51 AM
 

JeffGrend said:

Some of this code is unreadable to me.  why not just issue statements like this

SELECT *

FROM MYTABLE M1

INNER JOIN MYOTHERTABLE M2 ON M1.ID = M2.ID

LEFT OUTER JOIN ANOTHERTABLE M3.ID = M2.ID

WHERE M3.SOMECOLUMN IS NULL

good grief i think this fomatting business is overkill and anal retentive.

October 31, 2008 9:18 AM
 

Adam Machanic said:

Jeff:

With a simple example like that it probably makes no difference.  Show us some of your more complex code that uses CASE expressions, derived tables, etc.

October 31, 2008 10:11 AM
 

AllenMWhite said:

Aaron, great post!

I use RedGate's SQLReformatter whenever I have to look at anyone else's code just to get it into a consistent layout.  Because I'm essentially lazy (as I often state in my sessions) I'll write out queries I'm building in a less formal way, and when I get it working I'll use SQLReformatter to lay it out properly.  I then paste that code into my stored procedures.

Jeff, 'SELECT *' is bound to get you in trouble when the underlying object changes.  Listing out all columns you need in your procedure will prevent problems like that, and also minimize the time necessary to transfer the data to your procedure, because you're only requesting the data you need.

Allen

October 31, 2008 10:29 AM
 

jerryhung said:

I agree, SQL Prompt and SQL Refactor can save a lot of time (of typing and formatting). Now the problem is that what to do when the product is not installed on the server. SSMS 2008 IntelliSense can only do so much

Consistency is the key!!

October 31, 2008 3:53 PM
 

AaronBertrand said:

Jeff, my whole point is, use what makes sense.  What might be unreadable to you is what someone else likes, and I certainly could say that your stuff is unreadable as well, for my own reasons (I don't like table and column names in all caps, for example, as "MYOTHERTABLE" is really hard to read).  I was trying to explain why I do it the way I do it, and highlight some things you might want to think about when you're not the only one maintaining the code that you write.

As some others have echoed, consistency is king; it's not about whether your formatting can beat up my formatting, but rather whether it makes sense to you (and the people you work with).

October 31, 2008 4:39 PM
 

Madhivanan said:

Aaron,

I agree with all your points except your preference of using

SELECT alias = [column expression]

than

SELECT [column expression] as alias

I have two reasons

(1) I am not sure if forst method is supported in all RDBMSs

(2) It looks like update statement where columns are updated

I also pointed out using this

http://sqlblogcasts.com/blogs/madhivanan/archive/2007/11/14/should-alias-names-be-preceded-by-as.aspx

November 1, 2008 5:43 AM
 

Walter Wilson said:

I like the leading comma, I am pretty sure I was the first to start using it several years ago in posts to the usenet (anonymously).

I also don't like the join in three lines.  I will break the join into extra lines if there are extra conditions on the join with proper indentation (yes I like Javascript too!).

Case is not important to me because keywords are prettized but I do try to maintain case with the referenced object names.

The object naming standard is mandatory IMHO.  TableName_CRUDType is the only way to go.

I am sold on the use of ; as a statement delimiter though.

I review thousands of scripts a month prior to release to production for over a dozen different applications and for the most part they are produced from the Query Designer which means they are formatted for heck.  Is there any hope of seeing progress on this front?

Thanks,

Walter

November 1, 2008 8:01 PM
 

Ted T said:

i must chime in here and do so in all lower case.  the point of much of what you speak of in this article centers around readability.  personally i find that sql keywords are MORE readable when they are lowercase.  because reading sql for the last 15 years has made things more readable to me when keywords take a back seat to object names and typically criteria which i represent most often in mixed case.

okay and there's a critical point you're missing on the tbl prefix.  where this really comes in handy for me is in separating fact tables from code or lookup tables.  fact tables get a tbl prefix and code tables get a tlkp prefix.  this groups them together in mgmt studio and whenever i see a tlkp table embedded deep in a from clause i know exactly what its role is.

we could split hairs here about personal preference all day.  like how you structure your from clause.  i find your method much less readable than putting the table name first on every line of a from clause unless it's a carry over from the previous line and it would be indented in that case.  i can look down the left hand side after my from clause and instantly know exactly what tables are used in that sql statement.

so, again, i think most of what you argue for here is personal preference and i appreciate you sharing your opinion but don't say that the other ways are wrong because they're not.  beauty is in the eye of the beholder so whatever an experienced developer finds most readable for him (or within his team) is what should be used.

you do make very valid points on the use of table aliases but that point is a bit elementary.  i thinking using the inner and as keywords are a waste of time but if it helps you read code better than you should use it.  again, all personal preference but i like a healthy debate....keep 'em coming!!!!

November 2, 2008 2:17 AM
 

Andreew Dixon said:

November 2, 2008 4:36 PM
 

Andrew Dixon said:

November 2, 2008 4:36 PM
 

Michael Kjörling said:

The next version of SQL Server (that is, after 2008) will not support T-SQL statements that are not semicolon-terminated.

http://msdn.microsoft.com/en-us/library/ms143729.aspx

November 3, 2008 3:32 AM
 

2008 November 03 - Links for today « My (almost) Daily Links said:

November 3, 2008 4:30 AM
 

Rob H said:

Nice article.

Any comments about making comments w/in code?  How much is to much and when do you use the double-dash (--) line comments as opposed to using the start/stop comment delimiters (/*  blah blah... */)?

November 3, 2008 9:43 AM
 

Nigel Ainscoe said:

I have to say that while I agree with most of what Aaron says here, I have to add my voice to the lowercase keyword camp. While upper case was useful while when we only had monochrome editors, I think it has lost its attraction with colour editors. It just seems like the equivelant of usenet shouting to me. And with all the keywords in lower, I find the Camel case of TheRestOfTheVariables not so overwhelmed.

November 3, 2008 10:18 AM
 

AaronBertrand said:

Rob H,

Commenting is probably one of the more subjective topics.  Personally I don't think there can be too much commenting unless you have very basic code that really shouldn't need any support.  I try my best to make my code as self-documenting as possible (part of this comes from using descriptive object names, parameter names, variable names, etc).  I use -- when I have one or two lines of comments, and /* */ for anything more.

Nigel,

Again, it is all about preference.  I think lower-case keywords look lazy, and I appreciate the upper-case when I am on a server without a proper editor (some of my clients refuse to put client tools on their servers).

November 3, 2008 10:41 AM
 

Vern Rabe said:

Good list, Aaron. I have never understood, however, why it was considered easier to comment out individual lines when listing the commas at the beginning instead of the end. When at the beginning, you can usually easily comment out all but the first. When at the end, you can usually easily comment out all but the last. And when listing variables you cannot easily comment out two of the columns when the comma is at the beginning if you want to retain the semicolon statement teminator. For instance, in your example of variable declarations/assignments (below), only the @bar line can be easily comment out while retaining the terminator.

DECLARE

  @foo    INT = 5

  ,@bar   INT = 6

  ,@x     INT = -1;

Vern Rabe

November 3, 2008 1:44 PM
 

AaronBertrand said:

Vern, I guess it goes back to the days where statement terminators were not necessary and very seldom used, so the last line is no longer a concern in that case (and often when testing we add parameters to the end and not the middle of the list).  The statement terminator is still not a concern to many people using the current version, but as Michael reminded me, they better start caring soon...

Aside from that, I will confess that I am just guessing as to whether or not that is in fact the most compelling reason for the "leading comma" syntax.  Maybe Tom Moreau or others can share their reasoning and enlighten us all?  We must be missing something.  :-)

November 3, 2008 4:31 PM
 

Thomas Williams said:

Great list Aaron. Thanks for taking the time to put this down - it's well reasoned and explained (and has generated some useful discussion).

November 4, 2008 6:00 PM
 

Dustin Jones said:

great post, I plan to use some of these suggestions at work.

November 5, 2008 2:12 PM
 

Kurt Wimberger said:

I do use the tbl prefix for my tables but it's not because I can't recognize a table for what it is, nor am I afraid anyone else will be scratching their heads.

I use the tbl prefix as a validation hook for RegEx. I figure any string passed in that contains tbl is an injection attempt and I wipe it. I have never yet needed to pass a string value that had a legitimate word with tbl in it.

I know the arguments against allowing strings at all, but sometimes it happens and I'd rather have another hedge available.

November 6, 2008 2:53 PM
 

SQL Server & Cloud Links for the Week | Brent Ozar - SQL Server DBA said:

November 7, 2008 8:16 AM
 

Log Buffer #122: a Carnival of the Vanities for DBAs said:

November 7, 2008 12:31 PM
 

Steve Dassin said:

Well Aaron it seems you've taped into what sql programmers really consider to be the presentation layer. From the overwhelming number of views I think it's clear there's a special affection sql programmers have for their code. Sql users apparently feel if you've got it, flaunt it. And I can't see anything wrong with that. So, borrowing the theme from the great Roy Orbison, here's to "Pretty Sql":  

Pretty sql, running down query analyzer

Pretty sql, the kind I like to read

Pretty sql

I don't believe you, you're not the best

Nothing could look as good as you

Mercy

Pretty sql, won't you pardon me

Pretty sql, I couldn't help but see

Pretty sql

That you look lovely as can be

Are you lonely just like me

Wow

Pretty sql, delay a while

Pretty sql, work a while

Pretty sql, give your SELECTs to me

Pretty sql, yeah yeah yeah

Pretty sql, execute my way

Pretty sql, say you'll compile with me

'Cause I need you, I'll treat you right

Optimize with me baby, be mine tonight

Pretty sql, don't crash on by

Pretty sql, make me cry

Pretty sql, don't scroll away, hey...okay

If that's the way it must be, okay

I guess I'll go on home, it's late

There'll be tomorrow night, but wait

What do I see

Is she compiling back to me

Yeah, she's working back to me

Oh, oh, Pretty sql

November 8, 2008 4:09 AM
 

David Lean said:

Great Article.

My personal preference is "Commas at the start of line". Not for commenting out lines, as you point out the benefit is marginal. But because it forces me to realise that this line is a continuation from the line above.

Sure, this is less of an issue if you space out your code "extremely vertically" as you've done here. However, to me that reduces readibility as I need to scroll to understand the code.

 So to me this is easier to comprehend

WITH cte AS (

SELECT t.col1, t.col2, t.col3

FROM   dbo.sometable AS t

)

SELECT cte.col1, cte.col2, cte.col3, c.col4

FROM   cte

INNER JOIN dbo.Customers AS c ON c.CustomerID = cte.col1

WHERE EXISTS (

SELECT 1

FROM dbo.Orders o

WHERE o.CustomerID = o.CustomerID

)

AND c.Status = 'LIVE';

====

And if the select lists aren't trivial, then break them up to a column per line ie:

WITH cte AS (

SELECT t.col1

 ,t.col2

 ,t.col3

FROM   dbo.sometable AS t

)

SELECT   cte.col1

,cte.col2

,cte.col3

,c.col4

FROM   cte

INNER JOIN dbo.Customers AS c ON c.CustomerID = cte.col1

WHERE EXISTS (

SELECT 1

FROM dbo.Orders o

WHERE o.CustomerID = o.CustomerID

)

AND c.Status = 'LIVE';

November 8, 2008 10:12 PM
 

David Lean said:

PS: All the indents in the code examples above got striped so now it looks like crud. (So here is 2nd attempt, with extra '.' chars to attempt to keep the whitespace.

.WITH cte AS (

.   SELECT t.col1

.         ,t.col2

.         ,t.col3

.   FROM   dbo.sometable AS t

.)

.SELECT   cte.col1

.       ,cte.col2

.       ,cte.col3

.       ,c.col4

.FROM   cte

.INNER JOIN dbo.Customers AS c ON c.CustomerID = cte.col1

.WHERE EXISTS (

.   SELECT 1

.   FROM dbo.Orders o

.   WHERE o.CustomerID = o.CustomerID

.)

.AND c.Status = 'LIVE';

November 8, 2008 10:16 PM
 

Dm Unseen AKA M. Evers said:

A best practice I also use is catching null paramneter values as well as setting defaults on values:

The layout is something like this:

CRTEATE PROC usp_example @test int,

                        @test2 varchar(20)='default'

AS

BEGIN

    DECLARE @i int,

            @value float

Select @test=isnull(@test,0),

      @test2=isnull(@test2,'default'),

      @i=0,

      @value=0.0

.....

November 10, 2008 3:34 AM
 

EzBloke said:

Cracking article!

Must admit, I used to be one of those tblX, txtY, intZ people - maybe it's my age and a hangup from my early days and less visualy pleasant data repositories...

It's funny how simple things like this seem to be missed by the most intelligent of colleagues...

:o)

November 10, 2008 6:16 AM
 

Paul said:

Another Tip re comments

Commenting the start and end of large loops and conditional structures makes the logic more readable. Indenting alone is not as easy to read when the structure is long, nested and complex.

e.g.

IF If @complete = 1

 BEGIN

 ...

 END -- If @complete = 1

November 17, 2008 11:07 AM
 

David said:

Interesting read.

1) Didn't realize that the parentheses around proc parameters was a non-standard construct. I've always used it. Not sure I'm going to stop, but nice to know. :-)

2) oddly enough, I prefer all my built in sql functions to be all capital, but I prefer my datatypes to be lowercase.  I also as a matter of practice always have my sql objects be propercase with words ran together such as CustomerStatusCode, but on the other hand all my variables and parameters are all lower case with underscores between words such as @customer_status_code.  I agree, that as long as it's consistent and legible for the majority of people, then it probably doesn't matter too much.

3) I'm also a fan of the three line join syntax, yet indented...

<br>

FROM [Schema].[Customer] [c]<br>

  INNER JOIN [Schema].[Address] [a]<br>

      ON [c].[CustomerID] = [a].[CustomerID]<br>

4) Obviously, I'm not a devout AS alias user.  I also do not like and would not want (as Ted T mentions) to maintain code without the INNER, OUTER statements in the joins.

5) I have become a fan of Alexanders not allowing objects in the dbo schema, making schema declaration a lot more important.

6) I have a naming convention of s_ for procs, f_ for functions, v_ for views.  I also believe that all table names should be singular.  It's a database for crying out loud.  That's the purpose of a table, to hold data (meaning plural)  So a table would be Customer, not Customers. Now on the other hand, a view can be v_Customers.  This naming convention allows me to distinguish easily between a view, table, proc, or function name.

7) I do have exceptions to the rules.  Sometimes we have systems which we created, but then have to import source data from somewhere else.  That source data is not modified, manipulated, or reformatted at the table level.  A lot of times these tables come from Oracle, DB2, Teradata, etc.  I will usually mimic the exact table names and field names and case for those tables.  But then I will create a view of my own with my own conventions and use that within other code.

8) I like your recommendation against using RETURN for anything except for success/errors.

9) It's going to be a long while before you get me to use statement terminators though...just haven't quite got there yet.

November 17, 2008 7:50 PM
 

Denis Gobo said:

Wow, it has been already a year since I wrote A year in review, The 21 + 1 best blog posts on SQLBlog

December 31, 2008 10:38 AM
 

31 ?????????? ???????? ???????? SQLBlog ???? ?????? 2008 » ???? ???? ???????? ?????? said:

January 1, 2009 5:27 AM

Leave a Comment

(required) 
(optional)
(required) 
Submit

This Blog

Syndication