While working through a coursera course recently (https://www.coursera.org/course/getdata) I started to think about how a code book could be implemented in sql server. There are 3 points that Jeff Leek makes regarding a code book and they are as follows:
- Information about the variables ...
- Information about the Summary Choices you made
- Information about the study design you used
My focus for this article will be able point number one, information about the variables. Often times it would be nice to know more about your data than simply what data structure it is defined by. For a simple working example, load the Iris datasets into Sql Server and then add some metadata about the columns to provide further information for the users.
Use YourDataBaseName Go -- Create Table If Object_Id('YourDataBaseName.dbo.Iris','U') is not null Drop Table dbo.Iris Create Table dbo.Iris ( IrisID bigint not null identity(1,1) , SepalLength numeric, SepalWidth numeric, PetalLength numeric, PetalWidth numeric, Species nvarchar(100) ) -- Insert Values Insert Into dbo.Iris Values('5.1','3.5','1.4','0.2','setosa') Insert Into dbo.Iris Values('4.9','3','1.4','0.2','setosa') Insert Into dbo.Iris Values('4.7','3.2','1.3','0.2','setosa') Insert Into dbo.Iris Values('4.6','3.1','1.5','0.2','setosa') Insert Into dbo.Iris Values('5','3.6','1.4','0.2','setosa') Insert Into dbo.Iris Values('5.4','3.9','1.7','0.4','setosa') Insert Into dbo.Iris Values('4.6','3.4','1.4','0.3','setosa') Insert Into dbo.Iris Values('5','3.4','1.5','0.2','setosa') Insert Into dbo.Iris Values('4.4','2.9','1.4','0.2','setosa') Insert Into dbo.Iris Values('4.9','3.1','1.5','0.1','setosa') Insert Into dbo.Iris Values('5.4','3.7','1.5','0.2','setosa') Insert Into dbo.Iris Values('4.8','3.4','1.6','0.2','setosa') Insert Into dbo.Iris Values('4.8','3','1.4','0.1','setosa') Insert Into dbo.Iris Values('4.3','3','1.1','0.1','setosa') Insert Into dbo.Iris Values('5.8','4','1.2','0.2','setosa') Insert Into dbo.Iris Values('5.7','4.4','1.5','0.4','setosa') Insert Into dbo.Iris Values('5.4','3.9','1.3','0.4','setosa') Insert Into dbo.Iris Values('5.1','3.5','1.4','0.3','setosa') Insert Into dbo.Iris Values('5.7','3.8','1.7','0.3','setosa') Insert Into dbo.Iris Values('5.1','3.8','1.5','0.3','setosa') Insert Into dbo.Iris Values('5.4','3.4','1.7','0.2','setosa') Insert Into dbo.Iris Values('5.1','3.7','1.5','0.4','setosa') Insert Into dbo.Iris Values('4.6','3.6','1','0.2','setosa') Insert Into dbo.Iris Values('5.1','3.3','1.7','0.5','setosa') Insert Into dbo.Iris Values('4.8','3.4','1.9','0.2','setosa') Insert Into dbo.Iris Values('5','3','1.6','0.2','setosa') Insert Into dbo.Iris Values('5','3.4','1.6','0.4','setosa') Insert Into dbo.Iris Values('5.2','3.5','1.5','0.2','setosa') Insert Into dbo.Iris Values('5.2','3.4','1.4','0.2','setosa') Insert Into dbo.Iris Values('4.7','3.2','1.6','0.2','setosa') Insert Into dbo.Iris Values('4.8','3.1','1.6','0.2','setosa') Insert Into dbo.Iris Values('5.4','3.4','1.5','0.4','setosa') Insert Into dbo.Iris Values('5.2','4.1','1.5','0.1','setosa') Insert Into dbo.Iris Values('5.5','4.2','1.4','0.2','setosa') Insert Into dbo.Iris Values('4.9','3.1','1.5','0.2','setosa') Insert Into dbo.Iris Values('5','3.2','1.2','0.2','setosa') Insert Into dbo.Iris Values('5.5','3.5','1.3','0.2','setosa') Insert Into dbo.Iris Values('4.9','3.6','1.4','0.1','setosa') Insert Into dbo.Iris Values('4.4','3','1.3','0.2','setosa') Insert Into dbo.Iris Values('5.1','3.4','1.5','0.2','setosa') Insert Into dbo.Iris Values('5','3.5','1.3','0.3','setosa') Insert Into dbo.Iris Values('4.5','2.3','1.3','0.3','setosa') Insert Into dbo.Iris Values('4.4','3.2','1.3','0.2','setosa') Insert Into dbo.Iris Values('5','3.5','1.6','0.6','setosa') Insert Into dbo.Iris Values('5.1','3.8','1.9','0.4','setosa') Insert Into dbo.Iris Values('4.8','3','1.4','0.3','setosa') Insert Into dbo.Iris Values('5.1','3.8','1.6','0.2','setosa') Insert Into dbo.Iris Values('4.6','3.2','1.4','0.2','setosa') Insert Into dbo.Iris Values('5.3','3.7','1.5','0.2','setosa') Insert Into dbo.Iris Values('5','3.3','1.4','0.2','setosa') Insert Into dbo.Iris Values('7','3.2','4.7','1.4','versicolor') Insert Into dbo.Iris Values('6.4','3.2','4.5','1.5','versicolor') Insert Into dbo.Iris Values('6.9','3.1','4.9','1.5','versicolor') Insert Into dbo.Iris Values('5.5','2.3','4','1.3','versicolor') Insert Into dbo.Iris Values('6.5','2.8','4.6','1.5','versicolor') Insert Into dbo.Iris Values('5.7','2.8','4.5','1.3','versicolor') Insert Into dbo.Iris Values('6.3','3.3','4.7','1.6','versicolor') Insert Into dbo.Iris Values('4.9','2.4','3.3','1','versicolor') Insert Into dbo.Iris Values('6.6','2.9','4.6','1.3','versicolor') Insert Into dbo.Iris Values('5.2','2.7','3.9','1.4','versicolor') Insert Into dbo.Iris Values('5','2','3.5','1','versicolor') Insert Into dbo.Iris Values('5.9','3','4.2','1.5','versicolor') Insert Into dbo.Iris Values('6','2.2','4','1','versicolor') Insert Into dbo.Iris Values('6.1','2.9','4.7','1.4','versicolor') Insert Into dbo.Iris Values('5.6','2.9','3.6','1.3','versicolor') Insert Into dbo.Iris Values('6.7','3.1','4.4','1.4','versicolor') Insert Into dbo.Iris Values('5.6','3','4.5','1.5','versicolor') Insert Into dbo.Iris Values('5.8','2.7','4.1','1','versicolor') Insert Into dbo.Iris Values('6.2','2.2','4.5','1.5','versicolor') Insert Into dbo.Iris Values('5.6','2.5','3.9','1.1','versicolor') Insert Into dbo.Iris Values('5.9','3.2','4.8','1.8','versicolor') Insert Into dbo.Iris Values('6.1','2.8','4','1.3','versicolor') Insert Into dbo.Iris Values('6.3','2.5','4.9','1.5','versicolor') Insert Into dbo.Iris Values('6.1','2.8','4.7','1.2','versicolor') Insert Into dbo.Iris Values('6.4','2.9','4.3','1.3','versicolor') Insert Into dbo.Iris Values('6.6','3','4.4','1.4','versicolor') Insert Into dbo.Iris Values('6.8','2.8','4.8','1.4','versicolor') Insert Into dbo.Iris Values('6.7','3','5','1.7','versicolor') Insert Into dbo.Iris Values('6','2.9','4.5','1.5','versicolor') Insert Into dbo.Iris Values('5.7','2.6','3.5','1','versicolor') Insert Into dbo.Iris Values('5.5','2.4','3.8','1.1','versicolor') Insert Into dbo.Iris Values('5.5','2.4','3.7','1','versicolor') Insert Into dbo.Iris Values('5.8','2.7','3.9','1.2','versicolor') Insert Into dbo.Iris Values('6','2.7','5.1','1.6','versicolor') Insert Into dbo.Iris Values('5.4','3','4.5','1.5','versicolor') Insert Into dbo.Iris Values('6','3.4','4.5','1.6','versicolor') Insert Into dbo.Iris Values('6.7','3.1','4.7','1.5','versicolor') Insert Into dbo.Iris Values('6.3','2.3','4.4','1.3','versicolor') Insert Into dbo.Iris Values('5.6','3','4.1','1.3','versicolor') Insert Into dbo.Iris Values('5.5','2.5','4','1.3','versicolor') Insert Into dbo.Iris Values('5.5','2.6','4.4','1.2','versicolor') Insert Into dbo.Iris Values('6.1','3','4.6','1.4','versicolor') Insert Into dbo.Iris Values('5.8','2.6','4','1.2','versicolor') Insert Into dbo.Iris Values('5','2.3','3.3','1','versicolor') Insert Into dbo.Iris Values('5.6','2.7','4.2','1.3','versicolor') Insert Into dbo.Iris Values('5.7','3','4.2','1.2','versicolor') Insert Into dbo.Iris Values('5.7','2.9','4.2','1.3','versicolor') Insert Into dbo.Iris Values('6.2','2.9','4.3','1.3','versicolor') Insert Into dbo.Iris Values('5.1','2.5','3','1.1','versicolor') Insert Into dbo.Iris Values('5.7','2.8','4.1','1.3','versicolor') Insert Into dbo.Iris Values('6.3','3.3','6','2.5','virginica') Insert Into dbo.Iris Values('5.8','2.7','5.1','1.9','virginica') Insert Into dbo.Iris Values('7.1','3','5.9','2.1','virginica') Insert Into dbo.Iris Values('6.3','2.9','5.6','1.8','virginica') Insert Into dbo.Iris Values('6.5','3','5.8','2.2','virginica') Insert Into dbo.Iris Values('7.6','3','6.6','2.1','virginica') Insert Into dbo.Iris Values('4.9','2.5','4.5','1.7','virginica') Insert Into dbo.Iris Values('7.3','2.9','6.3','1.8','virginica') Insert Into dbo.Iris Values('6.7','2.5','5.8','1.8','virginica') Insert Into dbo.Iris Values('7.2','3.6','6.1','2.5','virginica') Insert Into dbo.Iris Values('6.5','3.2','5.1','2','virginica') Insert Into dbo.Iris Values('6.4','2.7','5.3','1.9','virginica') Insert Into dbo.Iris Values('6.8','3','5.5','2.1','virginica') Insert Into dbo.Iris Values('5.7','2.5','5','2','virginica') Insert Into dbo.Iris Values('5.8','2.8','5.1','2.4','virginica') Insert Into dbo.Iris Values('6.4','3.2','5.3','2.3','virginica') Insert Into dbo.Iris Values('6.5','3','5.5','1.8','virginica') Insert Into dbo.Iris Values('7.7','3.8','6.7','2.2','virginica') Insert Into dbo.Iris Values('7.7','2.6','6.9','2.3','virginica') Insert Into dbo.Iris Values('6','2.2','5','1.5','virginica') Insert Into dbo.Iris Values('6.9','3.2','5.7','2.3','virginica') Insert Into dbo.Iris Values('5.6','2.8','4.9','2','virginica') Insert Into dbo.Iris Values('7.7','2.8','6.7','2','virginica') Insert Into dbo.Iris Values('6.3','2.7','4.9','1.8','virginica') Insert Into dbo.Iris Values('6.7','3.3','5.7','2.1','virginica') Insert Into dbo.Iris Values('7.2','3.2','6','1.8','virginica') Insert Into dbo.Iris Values('6.2','2.8','4.8','1.8','virginica') Insert Into dbo.Iris Values('6.1','3','4.9','1.8','virginica') Insert Into dbo.Iris Values('6.4','2.8','5.6','2.1','virginica') Insert Into dbo.Iris Values('7.2','3','5.8','1.6','virginica') Insert Into dbo.Iris Values('7.4','2.8','6.1','1.9','virginica') Insert Into dbo.Iris Values('7.9','3.8','6.4','2','virginica') Insert Into dbo.Iris Values('6.4','2.8','5.6','2.2','virginica') Insert Into dbo.Iris Values('6.3','2.8','5.1','1.5','virginica') Insert Into dbo.Iris Values('6.1','2.6','5.6','1.4','virginica') Insert Into dbo.Iris Values('7.7','3','6.1','2.3','virginica') Insert Into dbo.Iris Values('6.3','3.4','5.6','2.4','virginica') Insert Into dbo.Iris Values('6.4','3.1','5.5','1.8','virginica') Insert Into dbo.Iris Values('6','3','4.8','1.8','virginica') Insert Into dbo.Iris Values('6.9','3.1','5.4','2.1','virginica') Insert Into dbo.Iris Values('6.7','3.1','5.6','2.4','virginica') Insert Into dbo.Iris Values('6.9','3.1','5.1','2.3','virginica') Insert Into dbo.Iris Values('5.8','2.7','5.1','1.9','virginica') Insert Into dbo.Iris Values('6.8','3.2','5.9','2.3','virginica') Insert Into dbo.Iris Values('6.7','3.3','5.7','2.5','virginica') Insert Into dbo.Iris Values('6.7','3','5.2','2.3','virginica') Insert Into dbo.Iris Values('6.3','2.5','5','1.9','virginica') Insert Into dbo.Iris Values('6.5','3','5.2','2','virginica') Insert Into dbo.Iris Values('6.2','3.4','5.4','2.3','virginica') Insert Into dbo.Iris Values('5.9','3','5.1','1.8','virginica') -- Add Extended Properties EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N' Sepal Length of Iris plant in centimetres. ' , @level0type=N'SCHEMA', @level0name=N'dbo', @level1type=N'TABLE', @level1name=N'Iris', @level2type=N'COLUMN', @level2name=N'SepalLength' GO EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N' Sepal width of Iris plant in centimetres. ' , @level0type=N'SCHEMA', @level0name=N'dbo', @level1type=N'TABLE', @level1name=N'Iris', @level2type=N'COLUMN', @level2name=N'SepalWidth' GO EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N' Petal Length of Iris plant in centimetres. ' , @level0type=N'SCHEMA', @level0name=N'dbo', @level1type=N'TABLE', @level1name=N'Iris', @level2type=N'COLUMN', @level2name=N'PetalLength' GO EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N' Petal Width of Iris plant in centimetres. ' , @level0type=N'SCHEMA', @level0name=N'dbo', @level1type=N'TABLE', @level1name=N'Iris', @level2type=N'COLUMN', @level2name=N'PetalWidth' GO EXEC sys.sp_addextendedproperty @name=N'Notes', @value=N' Species of Iris plant. ' , @level0type=N'SCHEMA', @level0name=N'dbo', @level1type=N'TABLE', @level1name=N'Iris', @level2type=N'COLUMN', @level2name=N'Species' GO
Now, you should have a table called Iris with some basic metadata about the columns. While you can navigate to the extended properties through the object explorer, it would be nice to access this information through a query.
-- View the Code book for the Iris Table and some information about the data structure SELECT [major_id], [minor_id], [t.name] AS [Table Name], [c.name] AS [Column Name], [value] AS [Extended Property], infos.[Data_Type], infos.[is_nullable], infos.[Numeric_Precision], infos.[Numeric_Scale] FROM sys.extended_properties AS ep inner join sys.tables AS t ON ep.major_id = t.object_id inner join sys.columns AS c ON ep.major_id = c.object_id and ep.minor_id = c.column_id inner join INFORMATION_SCHEMA.COLUMNS infos on infos.table_name = t.name and infos.column_name = c.name Where class = 1 and t.name ='Iris' /* This is our Table name */ -- Drop Iris Table to avoid cluttering up your database Drop Table dbo.Iris
Now you have the framework for creating a data code book that can be self contained within the table itself. This will prove most useful when you can can share a sql server table with someone else.