Articles

Analysis Services Tabular Best Practices – Part 3

This is my last installment in my series on improving Analysis Services Tabular. If you want to start at the beginning, please click here. I started writing this because I wanted one place to point clients to improve the performance and scalability of Tabular Models, and there are a lot of places you need to go to find all of this information.  I hope you find it helpful.  Let me know if you have any questions or comments regarding this post I hope you leave some comments or perhaps ask me a question on twitter.  With that here are more things that you need to do to improve your Analysis Services Tabular implementations.

Optimize your DAX Code

While it is not easy to performance tune DAX you can do it, by evaluating the DAX Query Plan and VeritPaq Queries, and SQLBI’s VertiPaq Analyzer. Also, you can also look to use functions which perform better, for example COUNTROWS instead of DISTINCTCOUNT or ADDCOLUMNS instead of SUMMARIZE. Whenever possible use the CALCULATE function instead of the FILTER function, as CALCULATE filters for context inside the parenthesis and are more efficient. Also all of the iterative functions SUMX, COUNTX etc., should be used sparingly as the row-by-row transactions they create are less efficient and should be used only when SUM or COUNT will not work.  When evaluating if a value missing, if it is possible, use ISEMPTY instead of ISBLANK as ISEMPTY looks only for the presence of a row, which is faster than the evaluation performed by ISBLANK.

Sort by Settings

There are times when the sort needs to be specified as something other than alphabetic.  A common place where this is done is for months, as people want to see January appearing first rather than April.  April will appear by default as the dates are sorted alphabetically.  To change this, modify the Month Name column property Sort by Column to sort by the Month number field so the months will always appear with January first.  This process should be repeated for any column where the alphabetic sort is not likely what people want to see, such as Day of Week.

Formatting Fields

All values should be formatted in a way that will look good on reports. Most of the time this formatting has been included, but there are some instances where the general format is used instead. Most users want numbers which include a thousands separator.  In analysis services tabular, this is accomplished by using a custom format and the format style #,##0.  This will add the thousands separator when needed and will never show a decimal number.

Hiding values

All values not used in a report should be hidden.  This will make things less confusing for people who are writing reports as the values that appear are ones which they actually use. For purposes of joining tables, many times you need to bring in key values to create relationships, but they are not values which should be reported upon as they have no business meaning.

Business Friendly Column and Table Names

When creating Power BI reports, the column names in the table are the names used in the reports.  As a best practice, the names in the tabular model should be business friendly names.   All of the column names visible in the model should be names which can be readily understood by anyone reading the report.  Names should include special characters like spaces and should be appropriately capitalized in the normal function names appear.  For example WK_NUM would appear as something like “Week Number” would be understandable to the user and look appropriate on reports. Using views can also provide the ability to change the names to user readable names outside of the model, which is also considered a best practice as then the model does not need to be changed to have user specified names.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Analysis Services Tabular Best Practices – Part 2

I was mentoring a client recently trying to help them with some tips for using tabular. I couldn’t find all of the information I was looking for, so I decided to write all of them down so that next time I could just go to my blog for the information.  There are a lot of things to consider so I broke them up into 3 parts.  To make sure you get all of them, please subscribe to my blog.  And now, here are some more tips for tabular.

Marking all Date tables as Date

Tabular models include a number of Time Intelligence functions which provide the ability to quickly and easily calculate things such as Rolling 12 months, Year-to-date calculation, and more. In order for this functionality to be possible, the table needs to be marked as date.  In order for this feature to be available, there needs to be one contiguous date field in the table, meaning there should be no gaps in the date field. If there is one default missing date field which is many years from the start of the contiguous date values, the field needs to be removed or all of the dates after that date need to be added ensuring there are no gaps in the dates. You can choose to join on the date value or another key.

Modify Timestamps to Split Date and Time

When there is a field where the date and time are both needed, the values should be separated so that there is both a date field and a time field.   Having date time in two fields assists in the dictionary encoding as the date and time fields can be separately sorted into columns where the values are the same, decreasing the number of dictionary entries.  To further improve compression, only include the seconds if absolutely necessary, as add decreasing the cardinality will increase compression.

Modeling Role Playing Dimensions and Active Relationships

All relationships between tables should be active relationships.  If you have any dashed lines which connect portions of your model, these are inactive relationships and should be removed, unless you are writing DAX Code to use them. DAX code must be written to use inactive relationships and users often find it confusing. As a best practice for situations where there are multiple dates in the fact table, such as Order Date, Ship Date and Due Date, is to have 3 different date tables. The other date tables can be created by using a calculated table to create a second date table and join the other value to that date table.  Step-by-step instructions for creating a calculated table are included in the Reference section1.  Having multiple date tables is a best practice design for tabular models as it is a common for there to be multiple dates in one table.

Partitioning Tabular Models

Partitioning Tabular models does not improve query performance. In fact multiple partitions can actually make performance worse.  Partitioning should be implemented to assist in the loading of the data.  If you do not need to create a partition to make a load timeframe, then there is no reason to do it.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Calculated Tables and Role-Playing Dimensions

http://michaeljswart.com/2016/06/t-sql-tuesday-079-its-2016/comment-page-1/#comment-186750Working with role playing dimensions, which are found when you have say multiple dates in a table and you want to relate them back to a single date table, have always been problematic in SQL Server Analysis Services Tabular. Tabular models only allow one active relationship to a single column at a time. The picture on the left shows how tabular models represent a role playing dimension, and the model on the right is the recommended method for how to model the relationships in Analysis Services Tabular as then users can filter the data on a number of different date tables.

TabularRolePlaying dimension Modeling

 

RolePlayingDimension

The big downside to this is one has to import the date table into the model multiple times, meaning the same data is imported again and again. At least that was the case until SQL Server 2016 was released. This weeks TSQL topic Fixing Old Problems with Shiny New Toys is really good reason to describe a better way of handling this problem.

 

Calculated Columns: The solution for Role Playing Dimensions

SQL Server 2016 provides a new method of solving the role playing dimension problem, using a calculated column. Instead of copying in the source from the date table, instead create a formula to get a copy. First switch to the data view, of the model. Then select Table->New Calculated Table. ThSSASScreenCalcTablee screen will change to the new table screen and the cursor will be pointed to the formula.

In my model I have one table called date. I am going to add a calculated table called order date. The DAX is couldn’t be simpler. Just select the table named ‘Date’ which is shown in the picture below. Rename the table to something more meaningful, like Order Date and that is it. The modeling required is the same, but now the model size does not increase to accommodate all of the date tables needed, as there is only one copy of the date table referenced multiple times. If you are using Power BI this same concept can be used for handling role playing dimensions as well.

SQL Server 2016 had a lot of great new features, and in addition to the flashy ones like R there are a lot of great enhancements to the Tabular model that are worth investigating as well.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur