While working through a Coursera course recently (https://www.coursera.org/course/getdata), I started to think about how a code book could be implemented in SQL Server. Jeff Leek makes three points about what a code book should contain:
- Information about the variables ...
- Information about the summary choices you made
- Information about the study design you used
My focus for this article will be on point number one, information about the variables. Often it would be nice to know more about your data than simply the data type each column is defined with. For a simple working example, load the Iris dataset into SQL Server and then add some metadata about the columns to give users further information.
Use YourDataBaseName
Go

-- Create Table
If Object_Id('YourDataBaseName.dbo.Iris','U') is not null
    Drop Table dbo.Iris

Create Table dbo.Iris
(
    IrisID bigint not null identity(1,1) ,
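The listing above stops partway through the Create Table statement, so the metadata step is not visible. As a rough sketch of how column descriptions could be attached, the following uses the system procedure sp_addextendedproperty. The table name dbo.IrisDemo, its columns, the description text, and the property name MS_Description are all assumptions made for illustration, not the article's actual script.

```sql
-- Hypothetical demo table; the column names are assumptions, not the article's.
IF OBJECT_ID(N'dbo.IrisDemo', N'U') IS NOT NULL
    DROP TABLE dbo.IrisDemo;

CREATE TABLE dbo.IrisDemo
(
    IrisID      bigint       NOT NULL IDENTITY(1,1),
    SepalLength decimal(4,1) NULL,
    Species     varchar(20)  NULL
);

-- Attach a description to the SepalLength column.
-- The level0/level1/level2 arguments identify schema, table, and column.
EXEC sys.sp_addextendedproperty
    @name       = N'MS_Description',            -- property name (assumed convention)
    @value      = N'Sepal length in centimeters',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'IrisDemo',
    @level2type = N'COLUMN', @level2name = N'SepalLength';

-- Attach a description to the Species column.
EXEC sys.sp_addextendedproperty
    @name       = N'MS_Description',
    @value      = N'Iris species name',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'IrisDemo',
    @level2type = N'COLUMN', @level2name = N'Species';
```

Dropping the table also removes its extended properties, so the sketch can be rerun without a "property already exists" error.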
Now you should have a table called Iris with some basic metadata about the columns. While you can navigate to the extended properties through Object Explorer, it would be nice to access this information through a query.
-- View the Code book for the Iris Table and some information about the data structure
SELECT
    [major_id],
    [minor_id],
    t.[name] AS [Table Name],
    c.[name] AS [Column Name],
    [value] AS [Extended Property],
    infos.[Data_Type],
    infos.[is_nullable],
    infos.[Numeric_Precision],
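The query above is cut off before its FROM clause. One possible complete form, offered as a sketch and not as the author's original, joins sys.extended_properties to sys.tables, sys.columns, and INFORMATION_SCHEMA.COLUMNS. The aliases ep, t, c, and infos and the filter on the table name Iris are assumptions chosen to match the visible column list.

```sql
-- Sketch: list column-level extended properties alongside basic type information.
SELECT
    ep.major_id,
    ep.minor_id,
    t.name                  AS [Table Name],
    c.name                  AS [Column Name],
    ep.value                AS [Extended Property],
    infos.DATA_TYPE,
    infos.IS_NULLABLE,
    infos.NUMERIC_PRECISION
FROM sys.extended_properties AS ep
JOIN sys.tables AS t
    ON ep.major_id = t.object_id
JOIN sys.columns AS c
    ON ep.major_id = c.object_id
   AND ep.minor_id = c.column_id            -- minor_id is the column id for column properties
JOIN INFORMATION_SCHEMA.COLUMNS AS infos
    ON infos.TABLE_SCHEMA = SCHEMA_NAME(t.schema_id)
   AND infos.TABLE_NAME   = t.name
   AND infos.COLUMN_NAME  = c.name
WHERE ep.class = 1                          -- class 1 = object or column
  AND t.name = N'Iris'                      -- change to whichever table carries the properties
ORDER BY c.column_id;
```

Because the inner join on sys.columns requires a matching column_id, table-level properties (where minor_id is 0) are excluded and only column descriptions are returned.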
Now you have the framework for creating a data code book that is self-contained within the table itself. This will prove most useful when you share a SQL Server table with someone else.