Category Archives: Sql Server

Making a code book in sql server - Part 2

Reading Time: 2 minutes

We have already covered how to add a data dictionary to a sql server table by using extended properties (see here). We can go one step further and making a simple stored procedure that will give us quick access to the code book. After building the stored procedure below we can now double click a table and use a keyboard shortcut. I have mine shortcut as Ctrl+5.

Create Procedure [dbo].[GetDataDictionary]

@TableName varchar(200)

as
Begin
SELECT 
	[major_id], 
	[minor_id], 
	t.name AS [Table Name], 
	c.name AS [Column Name], 
	[value] AS [Extended Property],
	infos.[Data_Type],
	infos.[is_nullable],
	infos.[Numeric_Precision],
	infos.[Numeric_Scale]
FROM 
	sys.extended_properties AS ep
	inner join
	sys.tables AS t 
	ON ep.major_id = t.object_id 
	inner join
	sys.columns AS c 
	ON ep.major_id = c.object_id 
	  and ep.minor_id = c.column_id
	inner join 
	INFORMATION_SCHEMA.COLUMNS infos
	on infos.table_name = t.name 
	  and infos.column_name = c.name
Where 
	class = 1
	and t.name =@TableName /* This is our Table name */

End

You can see what my shortcuts look like in the picture below.

If we follow the example in the article for building a code book, we can create a table with the iris data set and add a quick code book. Then when we highlight the table [iris] and use our keyboard shortcut of Ctrl+5, you should see this:

Now we can quickly check the definition of a column without having to leave our SSMS window. If you are interested in the other helper stored procedures besides GetDataDictionary, then take a look at this post.

Leave a Comment

Filed under Productivity, Sql Server

Quick Bar Graph in Sql Server

Reading Time: < 1 minutes

Have you ever needed a bar graph and didn't want to leave Sql Server? Well if you don't have Sql Server 2016 yet I have some code for you. The code below creates a table with the numbers 1-10. We then sample repeatedly from that table and make a histogram of our sample. We should have something approaching a uniform distribution when we are done.

 

If Object_id('tempdb..#Test','U') is not null 
Drop Table #Test
Create Table #Test ( numbers float )

If Object_id('tempdb..#RandomNumbers','U') is not null 
Drop Table #RandomNumbers
Create Table #RandomNumbers ( numbers float )


-- Create Table with numbers 1 -10
Declare @Counter int
Set @Counter=1
While @Counter <=10
Begin
    Insert Into #Test Select @counter
    Set @Counter=@Counter+1
End

-- Repeated Sample with replacement from table
Declare @Counter2 int
Set @Counter2=1
While @Counter2 <=1000
Begin
    Insert Into #RandomNumbers Select Top 1 Numbers From #test Order By newid()
    Set @Counter2=@Counter2+1
End

-- Make histogram
Select
    numbers,
    replicate('>',count(*)) [Levels]
From
    #RandomNumbers
Group By
    numbers
Order By
    count(*) desc

You should now have a nice little histogram showing you the distribution of your data. Another useful thing is to approach it like a pareto chart and order the data descending.

Select
    numbers,
    replicate('>',count(*))[Levels]
From
    #RandomNumbers
Group By
    numbers
Order By
    count(*) desc

sqlhist

Leave a Comment

Filed under Sql Server

Tips For Dealing with Large Datasets in Sql Server

Reading Time: 2 minutes

Are you dealing with large (100 million row +) datasets that live in a sql database? Have you found your old methods not to be satisfactory? Here are a couple of tips from my own experience.

  1. Do not use count(*) to figure out how big of a table you have.
  2.  Do not use the max function to figure out when the last record was inserted

1.

I found this out the hard way when I actually got an error when I tried to run
Select Count(*) From Table  and received an arithmetic overflow error. I was puzzled at first and after some  searching I found the problem. A 'count' in sql server returns the datatype int, which means it can only be 2^31-1 or 2,147,483,647. The table that I was working with had more than 2.1 billion rows, so that caused a problem. Now, you may be thinking that you could just use a Count_Big instead, but that is probably not the right answer. Try using sp_spaceused instead. If you are interested in turning this into a shortcut for SSMS, look here.

2.

Sometimes I need to figure out when the latest record was inserted. Instead of taking a max on a datetime field. I can often get to my answer by using information about an index. Hopefully your table has a unique auto incrementing primary key that can aid you in finding the last record inserted. Make sure you understand the process that builds or alters your table. It could be that the maximum value of your primary key is not related to the most recent records.

Select max(LocalTimeStamp) From Table ( Slow )

vs

Select LocalTimeStamp From Table Where PrimaryKey = ( Select Max(PrimaryKey) From Table ) ( Quick )

Have some useful tips to add? Please post them in the comments.

Leave a Comment

Filed under Sql Server

Text bashing in R for SQL

Reading Time: < 1 minutes

Fairly often, a coworker who is strong in Excel, but weak in writing code will come to me for help in special details about customers in their datasets. Sometimes the reason is to call, email, or snail mail a survey, other times to do some classification grouping on the customer. Whatever the reason, the coworker has a list of ID numbers and needs help getting something out of a SQL database.

When it isn't as simple as just adding quotes and commas to the cells in Excel before copying all the ID's into the WHERE clause of a very basic SELECT statement, I often fall back to R and let it do the work of putting together the SELECT statement and querying the data.

Suppose that you're given an Excel file with 1.2 million ID's and there's some transformation that you need to do first. Obviously, you first read the file in using your protocol and package of choice. Since we're ultimately doing SQL, let's take advantage of the RODBC package's cool features.

library(RODBC)

xlsxFile <- file.path("C:", "MyFile.xlsx")
xlsxChan <- odbcConnectExcel2007(xlsxFile)
xlsxSheets <- sqlTables(xlsxChan)
View(xlsxSheets)

# From here you can choose the table (worksheet) you need to pull the data from.

xlsxData <- sqlFetch(xlsxChan, 'Sheet1$')
odbcClose(xlsxChan)

Now that the data have been pulled into R any manipulations that need to be made can be done until you end up with a list of ID's that you need to query the SQL database for. Let's start by writing the basis of the query.

myQuery <- "SELECT ID, Name, Street, City, State, Zip, Phone, Email FROM CustomerTable WHERE ID in (\s)"

Notice the use of the \s switch. We're ultimately going to be dropping this into sprintf() with the ID's so that we don't have to clutter up the script with a stupidly-long list of hard-coded values. Next we need to make R build that stupidly-long list of values to put into the query.

id <- as.character(xlsxData$ID)
id <- sprintf("'\s'", unique(id))
id <- toString(id)

# Now we can put the two together and send the whole query off.
myQuery <- sprintf(myQuery, id)

# Here do what is appropriate for you database flavor.  I'm pretending to use MySQL on Windows.
dbChan <- odbcConnect("CustDb", uid="jeremy", case="tolower")
dbData <- sqlQuery(dbChan, myQuery, stringsAsFactors=FALSE)
odbcCloseAll()

What you should end up with in dbData is the list of things that you are after and a happy colleague.

Leave a Comment

Filed under R, Sql Server

Checking For Duplicate Records in Sql Server Table

Reading Time: 2 minutes

I often find myself needing to check if a table is unique with respect to a particular column. For instance, if I have a table of customer data I want to ensure that a customer only exists once. There are several ways that you can do this.  If I want to write a quick select if usually looks like this.

-- Method 1
Select count(*),Count(Distinct ColumnofInterest)
From Table

A while back I got tired of writing the above query over and over, so I created a stored procedure (SP). The SP below take two arguments, a table name and a column name and will determine if your column is unique. Currently it is written using a  count(*)  , which is not meant for large tables, i.e. > 2,000,000,000 rows .

-- Method 2
USE [YourDataBase]
GO


SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO



Create Procedure [dbo].[DupeChecker]
@Table nvarchar(128),
@Column nvarchar(128)
as



-- Construct SQL Statement Using Name Provided
Declare @MyQuery as nvarchar(max)='
Select 	@DupeFlag =case when count(*) = count(Distinct ' + @Column + ') then 0 else 1 end 
From ' + @Table + ' with(nolock)
'


Declare @MyNumber smallint
Set @MyNumber = 0

exec sp_executesql @MyQuery, N'@DupeFlag smallint out', @MyNumber out



-- Raise Error if Table is not uqniue
Declare @ErrorText Varchar(100)
Set @ErrorText = 'Your column ' + @Column + ' on table ' + @Table + ' is not unique.'
If (@MyNumber =1)
	Begin
		RAISERROR(@ErrorText,16,1)
	  RETURN
	End

-- If unique print a message to confirm
Declare @ErrorText2 Varchar(100)
Set @ErrorText2 = 'Your column ' + @Column + ' on table ' + @Table + ' is unique.'
If (@MyNumber =0)
	Begin
	Print (@ErrorText2)
	End
		
GO

There is one more method that I commonly use to determine if my table is unique and that is an index.

-- Method 3
Create Unique NonClustered Index [NCI_Dupe_Checker] on TableName
( MyColumn )

So, now you have three different ways to check your table to make sure that your column is unique. Keep in mind that adding an index will take up additional space, where methods 1 and 2 will not.

Leave a Comment

Filed under Sql Server

Helper Stored Procedures in SQL Server - Part 2

Reading Time: 3 minutes

Continuing with last week's post about Helper Stored Procedures, I wanted to introduce two more that I use frequently. The first one is a built in stored procedure (  using SQL Server) called 'sp_helptext'. This stored procedure allows you to view the content of other stored procedures (SP) and more, here is a link to the Microsoft documentation. Here is an example of how you use it:

sp_helptext @objname='sp_helptext'

This will show you the contents of sp_helptext  by using it. This comes in very handy as a keyboard shortcut. I assigned mine to 'Ctrl+3'.  Often I have a SP that does data manipulation and puts the final data in a table that occurs as the last step of my SP. I may not remember which table the data goes into, but I can simply highlight the SP ( I typically double click) and then use 'Ctrl+3' and the contents of the SP appears in my results tab. This also works nicely for function and triggers that you might use.

Another piece of information that I am often curious about is how large is a particular table. If you like going through menus, then you can left click on a table name in the object explorer and choose properties. Then select 'storage' from the left navigation menu. You will see under the 'general' section: "Data Space","Row Count" and "Index Space". While this is okay if you only perform this task once a month, it takes way to long if you are running this 5 + times per day. I wrote a stored procedure called 'sp_tablesize' that is a wrapper for another Microsoft SP, that is sp_spaceused.

USE [YourDatabase]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO


Alter Procedure [dbo].[TableSize_Sp]
@TableName varchar(500)
as


If object_id(@Tablename,'U') is not null
Begin
If Object_ID('tempdb..#TableSummary','U') Is Not Null  
Drop Table #TableSummary

CREATE TABLE #TableSummary
(
[database] varchar(500) default db_name(),
table_name sysname ,
row_count BIGINT,
reserved_size VARCHAR(50),
data_size VARCHAR(50),
index_size VARCHAR(50),
unused_size VARCHAR(50),
[timestamp] datetime default getdate() not null,
Calendardate as convert(datetime,convert(varchar,[timestamp],101))
)

Declare @Sql varchar(max)
Set @Sql='
INSERT #TableSummary ([table_name],[row_count],[reserved_size],[data_size],[index_size],[unused_size])
EXEC sp_spaceused''' + @TableName + '''
'

Exec (@Sql)

If Object_ID('tempdb..#TableSummaryCleaned','U') is not null
Drop Table #TableSummaryCleaned
Select
	a.[Database],
	a.table_name [TableName],
	round(sum(cast(a.row_count as bigint)),2)[Rows],
	round(sum(cast(reserved_size_kb as float)/1024),2)ReservedSizeMB,
	round(sum(cast(data_size_kb as float)/1024),2)DataSizeMB,
	round(sum(cast(index_size_kb as float)/1024),2)IndexSizeMB,
	round(sum(cast(unused_size_kb as float)/1024),2)UnusedSizeMB,
	[Timestamp],
	Calendardate
Into #TableSummaryCleaned	
From
	(
	Select 
		[database],
		table_name ,
		row_count,
		cast(replace(reserved_size, ' KB','') as bigint) reserved_size_kb,
		cast(replace(data_size, ' KB','') as bigint) data_size_kb,
		cast(replace(index_size, ' KB','') as bigint) index_size_kb,
		cast(replace(unused_size, ' KB','') as bigint) unused_size_kb,
		[timestamp],
		[calendardate]
	From #TableSummary
	)a
Group By 
	a.[database],
	a.table_name,
	[Timestamp],
	Calendardate

Alter Table #TableSummaryCleaned
Add TotalMB float

Update #TableSummaryCleaned 
Set TotalMB = ReservedSizeMB 

Alter Table #TableSummaryCleaned
Add TotalGB float

Update #TableSummaryCleaned
Set TotalGB =TotalMB/1024 

Select *
From #TableSummaryCleaned
End
Else
Begin
	Declare @ErrorText varchar(500)
	Set @ErrorText = @TableName+' is not a valid table.'
 RAISERROR(@ErrorText, 16, 1)
End

GO

After you create the SP 'tablesize_SP' you can assign it to a shortcut in SSMS. I choose to use 'Ctrl+4' for myself. We can create a table and add data and an index to make sure that our new SP works.

 

Create Table dbo.TestTable
( numbers float )


-- Create Table with numbers 1 -100
Declare @Counter int
Set @Counter=1
While @Counter <=1000
Begin
    Insert Into dbo.TestTable Select @counter*1000
    Set @Counter=@Counter+1
End

-- Check Table Size
exec tablesize_sp'TestTable'

-- Create Index
Create NonClustered Index [NC_Numbers] on dbo.TestTable
(numbers)

-- Check size after creating index
exec tablesize_sp'TestTable'

You will notice that our index size increased after adding one, which is to be expected. Hopefully this will allow you to quickly check the size of a table. Combine that with 'sp_helptext' and 'sp_top10' you can find out a lot of information about a table quite quickly. Let me leave you with a quick reference guide of which SP to use depending on what you want to know.

 

SP Name Use Shortcut
sp_helptext View the contents of a stored procedure or function Ctrl+3
tablesize_SP Check the size of a table including indicies Ctrl+4
sp_top10 View the top 10 records of a table Ctrl+1
sp_help View Information about an object for instance table schema Alt+F1

Note: SSMS 2012 was used for this example.

Leave a Comment

Filed under Sql Server

Helper Stored Procedures in SQL Server - Part 1

Reading Time: 2 minutes

Often I have days where a large amount of my time is spent writing SQL code in SQL Server Management Studio (SSMS). Building tables, creating indices and verifying data are all common tasks. However, I often myself needing to checking which columns are in a table or to take a quick look at the data. I got tired of typing Select Top 10 * From [MyTable] more than five times per day.   In order to get around typing that repeatedly I made a function called 'sp_top10'.

CREATE Procedure [dbo].[sp_top10]
@Table as varchar(100)

as
Begin
set nocount on

Declare  @dbname as varchar(100)
Set @dbname = db_name()

Declare @MyQuery Varchar(max)
Set @MyQuery='
Select Top 10 *
From ' + @dbname + ' ..' + @Table + ' with(nolock)'

Exec (@MyQuery)
End

This stored procedure can be invoked by typing

exec sp_top10'YourTableHere'  where you substitute in your own table name. The real power comes when you assign your new stored procedure to a shortcut. In SSMS you can do this by going to Tools > Options > Keyboard > Query Shortcuts .  Simply add  'sp_top10' to an open key combination and then restart SSMS. You should now be able to highlight ( or doubleclick) a tablename and then use your new shortcut to view the top 10 records in that table. I set my shortcut to 'Ctrl+F1'.

An example:

-- Create Table To Hold Our Dataset
If Object_ID('tempdb..#TestData','U') is not null
Drop Table #TestData

Create Table #TestData
(ID bigint not null identity(1,1) ,
Letter varchar(1)
)

-- Quick Loop to get all lower case letters into our table
Declare @Counter Int;Set @Counter=97
While @Counter <=122
Begin
Declare @Sql Varchar(Max)
Set @SQL ='
Insert into #TestData (letter)
Select char(' + cast(@Counter as varchar)  + ')
'
--print @SQL
Exec (@Sql)
Set @Counter = @Counter+1
End

-- View all 26 records
Select *
From #TestData

-- Just view the top 10
exec sp_top10'#testdata'

You output should look similar to figure 1.

sp_top10_example

Figure 1.

Feel free to edit the SP to suit your particular situation. I use this in conjunction with 'sp_help' which has a default shortcut of 'Alt+F1'  in SSMS almost everyday. So I can select a tablename and 'Alt+F1' if I need to see what type of columns I have and then 'Ctrl+1' to view the first ten records of that table. Hopefully this will help save you time in your daily work.

Note: SSMS 2012 was used for this example.

1 Comment

Filed under Sql Server

Making a Code Book in Sql Server

Reading Time: 4 minutes

While working through a coursera course recently (https://www.coursera.org/course/getdata) I started to think about how a code book could be implemented in sql server. There are 3 points that Jeff Leek makes regarding a code book and they are as follows:

  1. Information about the variables ...
  2. Information about the Summary Choices  you made
  3. Information about the study design you used

My focus for this article will be able point number one, information about the variables. Often times it would be nice to know more about your data than simply what data structure it is defined by. For a simple working example, load the Iris datasets into Sql Server and then add some metadata about the columns to provide further information for the users.

 

Use YourDataBaseName
Go

-- Create Table 
If Object_Id('YourDataBaseName.dbo.Iris','U') is not null 
Drop Table dbo.Iris

Create Table dbo.Iris
(
IrisID bigint not null identity(1,1) ,
SepalLength numeric,
SepalWidth numeric,
PetalLength numeric,
PetalWidth numeric,
Species nvarchar(100)
)
-- Insert Values
Insert Into dbo.Iris Values('5.1','3.5','1.4','0.2','setosa')
Insert Into dbo.Iris Values('4.9','3','1.4','0.2','setosa')
Insert Into dbo.Iris Values('4.7','3.2','1.3','0.2','setosa')
Insert Into dbo.Iris Values('4.6','3.1','1.5','0.2','setosa')
Insert Into dbo.Iris Values('5','3.6','1.4','0.2','setosa')
Insert Into dbo.Iris Values('5.4','3.9','1.7','0.4','setosa')
Insert Into dbo.Iris Values('4.6','3.4','1.4','0.3','setosa')
Insert Into dbo.Iris Values('5','3.4','1.5','0.2','setosa')
Insert Into dbo.Iris Values('4.4','2.9','1.4','0.2','setosa')
Insert Into dbo.Iris Values('4.9','3.1','1.5','0.1','setosa')
Insert Into dbo.Iris Values('5.4','3.7','1.5','0.2','setosa')
Insert Into dbo.Iris Values('4.8','3.4','1.6','0.2','setosa')
Insert Into dbo.Iris Values('4.8','3','1.4','0.1','setosa')
Insert Into dbo.Iris Values('4.3','3','1.1','0.1','setosa')
Insert Into dbo.Iris Values('5.8','4','1.2','0.2','setosa')
Insert Into dbo.Iris Values('5.7','4.4','1.5','0.4','setosa')
Insert Into dbo.Iris Values('5.4','3.9','1.3','0.4','setosa')
Insert Into dbo.Iris Values('5.1','3.5','1.4','0.3','setosa')
Insert Into dbo.Iris Values('5.7','3.8','1.7','0.3','setosa')
Insert Into dbo.Iris Values('5.1','3.8','1.5','0.3','setosa')
Insert Into dbo.Iris Values('5.4','3.4','1.7','0.2','setosa')
Insert Into dbo.Iris Values('5.1','3.7','1.5','0.4','setosa')
Insert Into dbo.Iris Values('4.6','3.6','1','0.2','setosa')
Insert Into dbo.Iris Values('5.1','3.3','1.7','0.5','setosa')
Insert Into dbo.Iris Values('4.8','3.4','1.9','0.2','setosa')
Insert Into dbo.Iris Values('5','3','1.6','0.2','setosa')
Insert Into dbo.Iris Values('5','3.4','1.6','0.4','setosa')
Insert Into dbo.Iris Values('5.2','3.5','1.5','0.2','setosa')
Insert Into dbo.Iris Values('5.2','3.4','1.4','0.2','setosa')
Insert Into dbo.Iris Values('4.7','3.2','1.6','0.2','setosa')
Insert Into dbo.Iris Values('4.8','3.1','1.6','0.2','setosa')
Insert Into dbo.Iris Values('5.4','3.4','1.5','0.4','setosa')
Insert Into dbo.Iris Values('5.2','4.1','1.5','0.1','setosa')
Insert Into dbo.Iris Values('5.5','4.2','1.4','0.2','setosa')
Insert Into dbo.Iris Values('4.9','3.1','1.5','0.2','setosa')
Insert Into dbo.Iris Values('5','3.2','1.2','0.2','setosa')
Insert Into dbo.Iris Values('5.5','3.5','1.3','0.2','setosa')
Insert Into dbo.Iris Values('4.9','3.6','1.4','0.1','setosa')
Insert Into dbo.Iris Values('4.4','3','1.3','0.2','setosa')
Insert Into dbo.Iris Values('5.1','3.4','1.5','0.2','setosa')
Insert Into dbo.Iris Values('5','3.5','1.3','0.3','setosa')
Insert Into dbo.Iris Values('4.5','2.3','1.3','0.3','setosa')
Insert Into dbo.Iris Values('4.4','3.2','1.3','0.2','setosa')
Insert Into dbo.Iris Values('5','3.5','1.6','0.6','setosa')
Insert Into dbo.Iris Values('5.1','3.8','1.9','0.4','setosa')
Insert Into dbo.Iris Values('4.8','3','1.4','0.3','setosa')
Insert Into dbo.Iris Values('5.1','3.8','1.6','0.2','setosa')
Insert Into dbo.Iris Values('4.6','3.2','1.4','0.2','setosa')
Insert Into dbo.Iris Values('5.3','3.7','1.5','0.2','setosa')
Insert Into dbo.Iris Values('5','3.3','1.4','0.2','setosa')
Insert Into dbo.Iris Values('7','3.2','4.7','1.4','versicolor')
Insert Into dbo.Iris Values('6.4','3.2','4.5','1.5','versicolor')
Insert Into dbo.Iris Values('6.9','3.1','4.9','1.5','versicolor')
Insert Into dbo.Iris Values('5.5','2.3','4','1.3','versicolor')
Insert Into dbo.Iris Values('6.5','2.8','4.6','1.5','versicolor')
Insert Into dbo.Iris Values('5.7','2.8','4.5','1.3','versicolor')
Insert Into dbo.Iris Values('6.3','3.3','4.7','1.6','versicolor')
Insert Into dbo.Iris Values('4.9','2.4','3.3','1','versicolor')
Insert Into dbo.Iris Values('6.6','2.9','4.6','1.3','versicolor')
Insert Into dbo.Iris Values('5.2','2.7','3.9','1.4','versicolor')
Insert Into dbo.Iris Values('5','2','3.5','1','versicolor')
Insert Into dbo.Iris Values('5.9','3','4.2','1.5','versicolor')
Insert Into dbo.Iris Values('6','2.2','4','1','versicolor')
Insert Into dbo.Iris Values('6.1','2.9','4.7','1.4','versicolor')
Insert Into dbo.Iris Values('5.6','2.9','3.6','1.3','versicolor')
Insert Into dbo.Iris Values('6.7','3.1','4.4','1.4','versicolor')
Insert Into dbo.Iris Values('5.6','3','4.5','1.5','versicolor')
Insert Into dbo.Iris Values('5.8','2.7','4.1','1','versicolor')
Insert Into dbo.Iris Values('6.2','2.2','4.5','1.5','versicolor')
Insert Into dbo.Iris Values('5.6','2.5','3.9','1.1','versicolor')
Insert Into dbo.Iris Values('5.9','3.2','4.8','1.8','versicolor')
Insert Into dbo.Iris Values('6.1','2.8','4','1.3','versicolor')
Insert Into dbo.Iris Values('6.3','2.5','4.9','1.5','versicolor')
Insert Into dbo.Iris Values('6.1','2.8','4.7','1.2','versicolor')
Insert Into dbo.Iris Values('6.4','2.9','4.3','1.3','versicolor')
Insert Into dbo.Iris Values('6.6','3','4.4','1.4','versicolor')
Insert Into dbo.Iris Values('6.8','2.8','4.8','1.4','versicolor')
Insert Into dbo.Iris Values('6.7','3','5','1.7','versicolor')
Insert Into dbo.Iris Values('6','2.9','4.5','1.5','versicolor')
Insert Into dbo.Iris Values('5.7','2.6','3.5','1','versicolor')
Insert Into dbo.Iris Values('5.5','2.4','3.8','1.1','versicolor')
Insert Into dbo.Iris Values('5.5','2.4','3.7','1','versicolor')
Insert Into dbo.Iris Values('5.8','2.7','3.9','1.2','versicolor')
Insert Into dbo.Iris Values('6','2.7','5.1','1.6','versicolor')
Insert Into dbo.Iris Values('5.4','3','4.5','1.5','versicolor')
Insert Into dbo.Iris Values('6','3.4','4.5','1.6','versicolor')
Insert Into dbo.Iris Values('6.7','3.1','4.7','1.5','versicolor')
Insert Into dbo.Iris Values('6.3','2.3','4.4','1.3','versicolor')
Insert Into dbo.Iris Values('5.6','3','4.1','1.3','versicolor')
Insert Into dbo.Iris Values('5.5','2.5','4','1.3','versicolor')
Insert Into dbo.Iris Values('5.5','2.6','4.4','1.2','versicolor')
Insert Into dbo.Iris Values('6.1','3','4.6','1.4','versicolor')
Insert Into dbo.Iris Values('5.8','2.6','4','1.2','versicolor')
Insert Into dbo.Iris Values('5','2.3','3.3','1','versicolor')
Insert Into dbo.Iris Values('5.6','2.7','4.2','1.3','versicolor')
Insert Into dbo.Iris Values('5.7','3','4.2','1.2','versicolor')
Insert Into dbo.Iris Values('5.7','2.9','4.2','1.3','versicolor')
Insert Into dbo.Iris Values('6.2','2.9','4.3','1.3','versicolor')
Insert Into dbo.Iris Values('5.1','2.5','3','1.1','versicolor')
Insert Into dbo.Iris Values('5.7','2.8','4.1','1.3','versicolor')
Insert Into dbo.Iris Values('6.3','3.3','6','2.5','virginica')
Insert Into dbo.Iris Values('5.8','2.7','5.1','1.9','virginica')
Insert Into dbo.Iris Values('7.1','3','5.9','2.1','virginica')
Insert Into dbo.Iris Values('6.3','2.9','5.6','1.8','virginica')
Insert Into dbo.Iris Values('6.5','3','5.8','2.2','virginica')
Insert Into dbo.Iris Values('7.6','3','6.6','2.1','virginica')
Insert Into dbo.Iris Values('4.9','2.5','4.5','1.7','virginica')
Insert Into dbo.Iris Values('7.3','2.9','6.3','1.8','virginica')
Insert Into dbo.Iris Values('6.7','2.5','5.8','1.8','virginica')
Insert Into dbo.Iris Values('7.2','3.6','6.1','2.5','virginica')
Insert Into dbo.Iris Values('6.5','3.2','5.1','2','virginica')
Insert Into dbo.Iris Values('6.4','2.7','5.3','1.9','virginica')
Insert Into dbo.Iris Values('6.8','3','5.5','2.1','virginica')
Insert Into dbo.Iris Values('5.7','2.5','5','2','virginica')
Insert Into dbo.Iris Values('5.8','2.8','5.1','2.4','virginica')
Insert Into dbo.Iris Values('6.4','3.2','5.3','2.3','virginica')
Insert Into dbo.Iris Values('6.5','3','5.5','1.8','virginica')
Insert Into dbo.Iris Values('7.7','3.8','6.7','2.2','virginica')
Insert Into dbo.Iris Values('7.7','2.6','6.9','2.3','virginica')
Insert Into dbo.Iris Values('6','2.2','5','1.5','virginica')
Insert Into dbo.Iris Values('6.9','3.2','5.7','2.3','virginica')
Insert Into dbo.Iris Values('5.6','2.8','4.9','2','virginica')
Insert Into dbo.Iris Values('7.7','2.8','6.7','2','virginica')
Insert Into dbo.Iris Values('6.3','2.7','4.9','1.8','virginica')
Insert Into dbo.Iris Values('6.7','3.3','5.7','2.1','virginica')
Insert Into dbo.Iris Values('7.2','3.2','6','1.8','virginica')
Insert Into dbo.Iris Values('6.2','2.8','4.8','1.8','virginica')
Insert Into dbo.Iris Values('6.1','3','4.9','1.8','virginica')
Insert Into dbo.Iris Values('6.4','2.8','5.6','2.1','virginica')
Insert Into dbo.Iris Values('7.2','3','5.8','1.6','virginica')
Insert Into dbo.Iris Values('7.4','2.8','6.1','1.9','virginica')
Insert Into dbo.Iris Values('7.9','3.8','6.4','2','virginica')
Insert Into dbo.Iris Values('6.4','2.8','5.6','2.2','virginica')
Insert Into dbo.Iris Values('6.3','2.8','5.1','1.5','virginica')
Insert Into dbo.Iris Values('6.1','2.6','5.6','1.4','virginica')
Insert Into dbo.Iris Values('7.7','3','6.1','2.3','virginica')
Insert Into dbo.Iris Values('6.3','3.4','5.6','2.4','virginica')
Insert Into dbo.Iris Values('6.4','3.1','5.5','1.8','virginica')
Insert Into dbo.Iris Values('6','3','4.8','1.8','virginica')
Insert Into dbo.Iris Values('6.9','3.1','5.4','2.1','virginica')
Insert Into dbo.Iris Values('6.7','3.1','5.6','2.4','virginica')
Insert Into dbo.Iris Values('6.9','3.1','5.1','2.3','virginica')
Insert Into dbo.Iris Values('5.8','2.7','5.1','1.9','virginica')
Insert Into dbo.Iris Values('6.8','3.2','5.9','2.3','virginica')
Insert Into dbo.Iris Values('6.7','3.3','5.7','2.5','virginica')
Insert Into dbo.Iris Values('6.7','3','5.2','2.3','virginica')
Insert Into dbo.Iris Values('6.3','2.5','5','1.9','virginica')
Insert Into dbo.Iris Values('6.5','3','5.2','2','virginica')
Insert Into dbo.Iris Values('6.2','3.4','5.4','2.3','virginica')
Insert Into dbo.Iris Values('5.9','3','5.1','1.8','virginica')

-- Add Extended Properties
EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N'
Sepal Length of Iris plant in centimetres.
' , @level0type=N'SCHEMA',
	@level0name=N'dbo', 
	@level1type=N'TABLE',
	@level1name=N'Iris', 
	@level2type=N'COLUMN',
	@level2name=N'SepalLength'
GO


EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N'
Sepal width of Iris plant in centimetres.
' , @level0type=N'SCHEMA',
	@level0name=N'dbo', 
	@level1type=N'TABLE',
	@level1name=N'Iris', 
	@level2type=N'COLUMN',
	@level2name=N'SepalWidth'
GO



EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N'
Petal Length of Iris plant in centimetres.
' , @level0type=N'SCHEMA',
	@level0name=N'dbo', 
	@level1type=N'TABLE',
	@level1name=N'Iris', 
	@level2type=N'COLUMN',
	@level2name=N'PetalLength'
GO

EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N'
Petal Width  of Iris plant in centimetres.
' , @level0type=N'SCHEMA',
	@level0name=N'dbo', 
	@level1type=N'TABLE',
	@level1name=N'Iris', 
	@level2type=N'COLUMN',
	@level2name=N'PetalWidth'
GO

 
EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N'
Species of Iris plant.
' , @level0type=N'SCHEMA',
	@level0name=N'dbo', 
	@level1type=N'TABLE',
	@level1name=N'Iris', 
	@level2type=N'COLUMN',
	@level2name=N'Species'
GO

  
 
 

Now, you should have a table called Iris with some basic metadata about the columns. While you can navigate to the extended properties through the object explorer, it would be nice to access this information through a query.

-- View the Code book for the Iris Table and some information about the data structure
SELECT 
	[major_id], 
	[minor_id], 
	[t.name] AS [Table Name], 
	[c.name] AS [Column Name], 
	[value] AS [Extended Property],
	infos.[Data_Type],
	infos.[is_nullable],
	infos.[Numeric_Precision],
	infos.[Numeric_Scale]
FROM 
	sys.extended_properties AS ep
	inner join
	sys.tables AS t 
	ON ep.major_id = t.object_id 
	inner join
	sys.columns AS c 
	ON ep.major_id = c.object_id 
	  and ep.minor_id = c.column_id
	inner join 
	INFORMATION_SCHEMA.COLUMNS infos
	on infos.table_name = t.name 
	  and infos.column_name = c.name
Where 
	class = 1
	and t.name ='Iris' /* This is our Table name */



-- Drop Iris Table to avoid cluttering up your database
Drop Table dbo.Iris

Now you have the framework for creating a data code book that can be self contained within the table itself. This will prove most useful when you can can share a sql server table with someone else.

 

Leave a Comment

Filed under Sql Server