Articles for the Month of August 2017

Database Table Design

http://michaeljswart.com/2016/06/t-sql-tuesday-079-its-2016/comment-page-1/#comment-186750There are a number of different ways that you could decide to organize your data in a database. If you are creating a database to be used in a transactional system, your table design should follow a normalized design as much as possible.  Data should be grouped in logical groups, such as customers, products, sales, orders, quotes, tickets etc.  Redundantly repeating data in multiple places will cause problems in the future.  Your design may include hundreds of tables, and that is perfectly fine.

If the purpose of the database is for Power BI or for a data warehouse, dimensional modeling techniques should be deployed.  In this type of database design, generally speaking there are a number of tables containing descriptive data, such as product and customer and few fact tables which contain the actions which happened.  The actions include things like Sales.  The database design will look much like a star with the fact tables in the center and the dimension tables connected to it like satellites.  If you have one dimensional model connected to another dimension, that design is called a snowflake and some applications, like Analysis Services Multi-dimensional will not process it well.  Power BI and Analysis Services Tabular work very well with snowflake dimensions.

SnowflakeModel

Snowflake Data Model from Power BI

 

Table Design Gone Wrong

To paraphrase Ron White, the reason that I described database modeling is so that even people who know nothing about database design could appreciate my interview story. When I was working at a previous location, I assisted in providing technical reviews for database developer jobs.  We asked a number of typical questions about indexing and stored procedures, but I always tried to come up with at least one question which the candidate could not readily answer by cramming interview questions found on the internet.  I decided to ask one candidate, who did correctly answer the previous stock questions something that would let us know what kind of work he had really done.  I asked him “What do you do to determine how to design a table?” I was interested to find out what his thought process was, see if he would mention normal form or describe something he had done in the past.  I was completely surprised by his answer.

“Well, you can only have 256 columns in a table. After that you have to create a new one.”  This answer was a complete surprise.  I was really curious to find out where he had developed this completely warped view of how to determine what fields should go in a table.  It turned out that he learned all of his database skills from a co-worker, who had recently retired.  His co-worker had worked at the same location for a very long time and when he started used mainframes without any databases.  He had migrated some of the applications to databases and they wrote them this way because it “made sense”.   After that the interview was over, and we hired someone else.

I challenge anyone who is learning databases to please look up what people tell you to do on the internet.  This is useful for two reasons, the first being that it will probably help you learn the concept better to read about it another way.   The other reason is that you can find out if the person teaching you really knows what they are doing, so you will learn the correct way to do something.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

 

What to use to Dynamically Write a Date Dimension in Power BI: M or DAX?

Calendar-clip-artRecently I needed to create a date dimension for a Power BI model as there was not one in the source database. There are two different ways that I could do this, using DAX from the Modeling Tab within the Data View or using M via the Query Editor window.  As a general rule, when it is possible data manipulation should be done in M as it offers a greater level of compression.  In this case though I am using a function in DAX, which is not the same as creating a calculated column.

Create a Date Table in DAX

To create a date table in DAX,  in Power BI go to the Data View->Modeling Table.  Click on the  New Table button on the ribbon.  For those who are wondering how you would go about writing either one, here is the source code for the DAX version.

 

DimDate = ADDCOLUMNS( CALENDAR(DATE(2017,1,1), DATE(2020,12,31)) ,
"Date Key", FORMAT ( [Date], "YYYYMMDD" ), //NumericDate
"Year", YEAR([Date]),
"Qtr Number", "Qtr " & FORMAT( [Date], "Q"),
"Q Number", "Q " & FORMAT( [Date], "Q"),
"Month Name" , FORMAT ( [Date], "mmmm" ) ,
"Month Short Name" , FORMAT ( [Date], "mmm" ) ,
"Month Number", MONTH([Date]),
"Month Year", FORMAT ( [Date], "mmm " ) & YEAR([Date]),
"Day Name", FORMAT ( [Date], "dddd" ), //Name for Each day of the week
"Day Short Name", FORMAT ( [Date],  "ddd" ),
"Day Number" , WEEKDAY ( [Date] ) //Sunday is 1
)  

This code uses the DAX CALENDAR function to create a contiguous set of dates between January 1, 2017 and December 31, 2020 with a field is named “Date”.  The remaining fields need field names and comments were added for clarity

Create a Date Table in M

In the Power BI Query Editor, click on the New Source button and select Blank Query. Select the View tab and click on the Advanced Editor button.  The Advanced Editor is where the M query is stored.  Paste the following code in to create a new table called Query1, which of course you can rename.

let
Source = List.Dates( #date(2017,1,1), Number.From( #date(2020, 12,31) - Date.From( #date(2017,1,1) ))+1, #duration(1,0,0,0)),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Changed Type" = Table.TransformColumnTypes(#"Converted to Table",{{"Column1", type date}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Column1", "Date"}}),
#"AddYear" = Table.AddColumn(#"Renamed Columns", "Year", each Text.End(Text.From([#"Date"], "en-US"), 4), type text),
#"AddMonth" = Table.AddColumn(AddYear, "Month Number",   each Date.ToText([Date], "MM")),
#"AddQuarter" = Table.AddColumn(AddMonth, "Quarter Number", each "Qtr " & Number.ToText(Date.QuarterOfYear([Date]))),
#"AddMonthName" = Table.AddColumn(#"AddQuarter", "Month Name", each Date.ToText([Date], "MMMM")),
#"AddMonthShortName" = Table.AddColumn(#"AddMonthName", "Month Short Name", each Date.ToText([Date], "MMM")),
#"AddShortMonthYear" = Table.AddColumn(#"AddMonthShortName", "Short Month Year", each [Month Short Name] &" " &  [Year]),
#"AddDayOfWeek" = Table.AddColumn(#"AddShortMonthYear", "Day of Week", each Date.ToText([Date], "dddd")),
#"AddDay" = Table.AddColumn(#"AddDayOfWeek", "Day", each Date.ToText([Date], "dd")),
#"AddDateKey" = Table.AddColumn(AddDay, "DateKey", each ([Year]&[Month Number]&[Day]))
in
#"AddDateKey"

To Create A Date Table Use Either M or DAX

Now the question is which one should you use?  To be honest it doesn’t matter.  I couldn’t see any difference when I tested it.  To validate this answer I consulted twitter, which sparked some very interesting comments and analysis. Marco Russo b | t  is planning on writing a blog on the details of it, but Jason Thomas b | t gave me this summary. “[The] Dictionary expands as needed as values are inserted–designed to reduce cost of re-alloc and re-org of hash buckets (at cost of memory waste).” The full explanation needs a post of it’s own to be sure, which I am looking forward to reading when Marco writes it.   I’ll quote Kasper De Jonge b | t who summed it up best “I don’t believe it will matter much, the date table will be so small regardless”. Whichever way you chose to add a date table, DAX or M, which are needed for time intelligence, you now have the code to do either one.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur