Monthly Archives: August 2014

Helper Stored Procedures in SQL Server - Part 1

Often I have days where a large amount of my time is spent writing SQL code in SQL Server Management Studio (SSMS). Building tables, creating indices and verifying data are all common tasks. However, I often find myself needing to check which columns are in a table or to take a quick look at the data. I got tired of typing Select Top 10 * From [MyTable] more than five times per day. To get around typing that repeatedly, I made a stored procedure called 'sp_top10'.
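A minimal sketch of what such a procedure could look like is shown here. The parameter name @TableName, the existence check, and the use of QUOTENAME with dynamic SQL are assumptions made for illustration, not necessarily how the original was written.

```sql
-- Hypothetical sketch of sp_top10 (created in the current database).
-- @TableName is an assumed parameter name; it accepts 'MyTable' or 'dbo.MyTable'.
CREATE PROCEDURE dbo.sp_top10
    @TableName nvarchar(776)
AS
BEGIN
    SET NOCOUNT ON;

    -- Resolve the name first so that arbitrary text is never executed.
    DECLARE @ObjectId int = OBJECT_ID(@TableName);

    IF @ObjectId IS NULL
    BEGIN
        RAISERROR(N'Object %s was not found.', 16, 1, @TableName);
        RETURN;
    END;

    -- Rebuild a safely quoted two-part name from the metadata.
    DECLARE @sql nvarchar(max) =
        N'SELECT TOP (10) * FROM '
        + QUOTENAME(OBJECT_SCHEMA_NAME(@ObjectId))
        + N'.'
        + QUOTENAME(OBJECT_NAME(@ObjectId))
        + N';';

    EXEC sys.sp_executesql @sql;
END;
GO
```

Because the name is looked up with OBJECT_ID before the statement is built, text that does not refer to an existing object never reaches the dynamic SQL.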

This stored procedure can be invoked by typing

exec sp_top10 'YourTableHere'

where you substitute in your own table name. The real power comes when you assign your new stored procedure to a shortcut. In SSMS you can do this by going to Tools > Options > Keyboard > Query Shortcuts. Simply add 'sp_top10' to an open key combination and then restart SSMS. You should now be able to highlight (or double-click) a table name and then use your new shortcut to view the top 10 records in that table. I set my shortcut to 'Ctrl+F1'.

An example:
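As a rough illustration, a call could look like the following. The table dbo.Customers and its rows are made up for this sketch, and sp_top10 is assumed to exist already.

```sql
-- Hypothetical demo table; dbo.Customers is an assumption for illustration.
CREATE TABLE dbo.Customers
(
    CustomerId int IDENTITY(1,1) PRIMARY KEY,
    Name       nvarchar(50) NOT NULL
);

INSERT INTO dbo.Customers (Name)
VALUES (N'Ann'), (N'Bob'), (N'Cy');

-- Intended to return the same rows as: SELECT TOP 10 * FROM dbo.Customers
EXEC sp_top10 'dbo.Customers';
```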

Your output should look similar to Figure 1.

Figure 1. sp_top10_example

Feel free to edit the SP to suit your particular situation. I use this almost every day in conjunction with 'sp_help', which has a default shortcut of 'Alt+F1' in SSMS. I can select a table name and press 'Alt+F1' to see what type of columns I have, and then 'Ctrl+F1' to view the first ten records of that table. Hopefully this will help save you time in your daily work.

Note: SSMS 2012 was used for this example.

Filed under Sql Server

Making a Code Book in Sql Server

While working through a Coursera course recently (https://www.coursera.org/course/getdata), I started to think about how a code book could be implemented in SQL Server. Jeff Leek makes three points regarding what a code book should contain:

  1. Information about the variables ...
  2. Information about the summary choices you made
  3. Information about the study design you used

My focus for this article will be on point number one, information about the variables. Often it would be nice to know more about your data than simply which data type each column is defined with. For a simple working example, load the Iris dataset into SQL Server and then add some metadata about the columns to provide further information for the users.
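One way this step might look is sketched here. The column names and types, the three sample rows (taken from the Iris data), and the property name MS_Description (a commonly used name, though any name works) are choices made for illustration.

```sql
-- Hypothetical table layout for the Iris data; names and types are assumptions.
CREATE TABLE dbo.Iris
(
    SepalLength decimal(3,1) NOT NULL,
    SepalWidth  decimal(3,1) NOT NULL,
    PetalLength decimal(3,1) NOT NULL,
    PetalWidth  decimal(3,1) NOT NULL,
    Species     varchar(20)  NOT NULL
);
GO

-- A few rows only; a real load would bring in all 150 observations.
INSERT INTO dbo.Iris (SepalLength, SepalWidth, PetalLength, PetalWidth, Species)
VALUES (5.1, 3.5, 1.4, 0.2, 'setosa'),
       (7.0, 3.2, 4.7, 1.4, 'versicolor'),
       (6.3, 3.3, 6.0, 2.5, 'virginica');
GO

-- Attach a description to a column as an extended property.
EXEC sys.sp_addextendedproperty
    @name = N'MS_Description',
    @value = N'Sepal length in centimeters',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Iris',
    @level2type = N'COLUMN', @level2name = N'SepalLength';

EXEC sys.sp_addextendedproperty
    @name = N'MS_Description',
    @value = N'Iris species: setosa, versicolor or virginica',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Iris',
    @level2type = N'COLUMN', @level2name = N'Species';
GO
```

The remaining columns can be documented the same way, with one call per column.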

 

Now you should have a table called Iris with some basic metadata about the columns. While you can navigate to the extended properties through Object Explorer, it would be nice to access this information through a query.
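A query along the following lines could return that information. It is a sketch that assumes the table is dbo.Iris and reads the sys.extended_properties catalog view; the column aliases are arbitrary.

```sql
-- List every column of dbo.Iris with its data type and any extended properties.
SELECT
    c.name  AS ColumnName,
    t.name  AS DataType,
    ep.name AS PropertyName,
    CAST(ep.value AS nvarchar(4000)) AS PropertyValue
FROM sys.columns AS c
JOIN sys.types AS t
    ON t.user_type_id = c.user_type_id
LEFT JOIN sys.extended_properties AS ep
    ON  ep.class = 1                  -- 1 = object or column
    AND ep.major_id = c.object_id     -- the table
    AND ep.minor_id = c.column_id     -- the column within the table
WHERE c.object_id = OBJECT_ID(N'dbo.Iris')
ORDER BY c.column_id;
```

The LEFT JOIN keeps columns that have no description yet, which makes gaps in the code book easy to spot.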

Now you have the framework for creating a data code book that is self-contained within the table itself. This will prove most useful when you share a SQL Server table with someone else.

 

Filed under Sql Server

Unexpected behavior with summary function in R

I often find myself working with data that includes dates and times. Sometimes I am interested in looking at what happened on a particular calendar day. I usually avoid dealing with actual datetime formats when working at the day level and prefer to use an integer to represent a day, in the form yyyymmdd. For instance, the first day of each quarter can be represented as an integer such as 20140101 (numeric also works for this example). If I want to know the oldest date in my dataset, I can just take the minimum, since I am using a numerical data structure.

[1] 20140101

For some reason which I don't recall, I tried using the summary() function on my dates. The only values that would be valid are the minimum and the maximum, or so I thought.

The output:
#     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
# 20140000 20140000 20140000 20140000 20140000 20140000

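This behavior, which seemed odd to me, is caused by the way summary() deals with numeric data, so I decided to look at what summary is actually doing. To view the code behind the summary function, type its name at the prompt without parentheses (for a numeric vector, the method that runs is summary.default):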

The portion of code that we are interested in is this:

This code chunk gets executed when the object that we pass to summary is numeric.  If we substitute in our object 'mydates' we get the following code.

If you step through the code line by line, you will notice that after line 4, summary produces what we would expect to see for the min and max values. However, line 5 rounds the numbers to a limited number of significant digits (four by default), so what is displayed is no longer the actual values. For example, try signif(20140101, 4):

[1] 20140000

So be careful when using generic functions if you don't know what they are doing. I would encourage you to take a look at the code behind some of the R functions you use the most. For instance, the fivenum() function does not change my min and max values the way summary() did.

Filed under R