Articles

Analysis Services Tabular Best Practices – Part 1

Recently I have been working on Analysis Services Tabular as a data source to Power BI and wanted to share some of the best practices for developing a model. Following these guidelines will improve the performance and scale of your project. There is a lot of material to cover, so I have broken up the information into this series.

Analysis Services Internal Dictionary Encoding

Understanding the internal components drives a number of design elements. The rules for Tabular are different because the underlying technology is different.  Analysis Service is a columnar based database, which means that it is most efficient on summarizing columns of data.  This methodology is incorporated in the Vertipaq Engine which is compresses the data in memory. The data is stored within a dictionary and is modified to store information optimally. Internally all of the data stored is translated into an integer value.  This means that two date fields when joined in a model will be as efficient as two integer values, as all values are internally represented as numbers.  Dictionary entries are created for each column.  The most efficient compression would be if the column contains one value as there would only need to have one entry.  If there were 4 different numbers in the column and the values were sorted on that column, there would be 4 dictionary entries.  If instead the numbers were ordered sequentially, such as 1,2,3,4,1,2,3,4, compression would not be possible and there would be an entry for each row.  When possible, data should be sorted so that like numbers are together.   Whenever possible, do not load unique values, as this data cannot be compressed.

Create tables with Underlying Queries with Views

Microsoft recommends that when getting data from a relational database, the data retrieved should be encapsulated with views.  This creates a record in the source data of the information used in by Power BI in the source database which can help ensure that the data structures are not modified in a way that will adversely impact the model.  The views can be materialized which can prevent the structure from changing without changing the view.  This provides visibility to the database administrator of the structure needed for Analysis Services.  Using views also can insulate Analysis Services from being dependent on the existing data model as the view does not have to match existing data types.  All views need to include an explicit list of columns for a deliberate inclusion of requested columns and should be sorted.

Extra columns should be eliminated from the view and the model as they will increase the size for no reason. The views should only contain columns which are actually used in either the DAX, reporting fields or to join tables together.  Do not bring in every field in the source table as it is not a best practice. It is more efficient to only bring in the columns needed rather than hide columns not used.  If the columns are not needed in DAX or for joining tables, do not include them in the model.  The views need to be written to run as efficient as possible on the source system, so be sure to tune the views as well.

Data Type Selection

The data type selected will impact the physical storage used, not the compression of the models in memory.  It is important whenever possible to reduce the cardinality of the data in order to be able to sort the data effectively.  When storing decimal numbers, unless you need many significant digits, store the data as Currency as it will take less space in physical storage than decimal.

Following these rules will help your model be better interpreted for dictionary encoding and will improve the compression of your data meaning it will load faster.  In the next part of this series I will discuss other things you can do to improve usability and structure of the data model. Let me know if you find this helpful by adding a comment or subscribing to this blog.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

 

Data Access and Self-Service Business Intelligence

There is more to implementing Self-Service Business Intelligence Business Intelligence than getting new software like Power BI, mindsets and practices also need to change. The data teams in many companies formed their policies based on history with previous technologies. One of those policies that is fraught with contention is letting the users have access to the data in order to do their own analysis. The reasons for this are based on a story like this one.  Like many a data professional, I worked at a company where we gave a team of users access to the database in order for them to do analysis. It was a replicated database, as we didn’t want to impact production. As these analysts primary skill was marketing not SQL, they wrote a query that took all the resources so no one had the ability to do anything else with the database, and we were required to intervene and kill the query to make the database useful again. After that, we changed their access to only being able to use views created for them to prevent this from happening again. Variations of this story exist all over.

Data Access has changed and so has the need for a 64-bit OS

Self-Service BI is supposed to be a way for Analysts to answer ad-hoc questions from Management about the business. While data professionals certainly could and do answer these questions, at some point a DatabaseAccessfocus line is drawn. If the primary focus is to determine the best way to write a query or implement an appropriate indexing scheme, this person has a technical focus and not a business focus. People with a business focus probably should be the person who use data to drive decision making. While technical people can write reports very efficiently, given the continual requests for answers from the data, keeping up with what the business people want to do can be extremely difficult as the numbers of reports required in various formats can be overwhelming. Like the old argument that “You don’t need a 64-bit OS” have become obsolete, so have the reasons for not giving business users access to the data. Now is the time to give them access. If you only have a 32-bit operating system, you don’t have the memory needed to do much data analysis. Data Analyst need 64 bit OS and access to the data.

What kind of Access should Analyst Have?

Most Analyst use Excel, which has become the de-facto tool of choice for data analysis. One doesn’t need to have a working knowledge of the SQL language to analyze data, and the scenario referenced above still happens. Instead data should be provided in a manner which is easy to consume in a Pivot table, allowing users to select, sort and filter the data at will. Analysis Services cubes, whether they be tabular or multidimensional provided this capability. Using a cube in an excel spreadsheet has very little chance of ever crashing a server, so go ahead and grant access. Give analysts the tools they need to provide the answers they need. Create a collaborative environment to grant access and provide the analyst what they need. In this kind of environment true data based decision making can really happen.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Moving from Power BI to Analysis Services Tabular

Update By: Ginger Grant
Date: August 2, 2015

Power BI is a great tool for creating data mashups and great visualizations. There are a lot of posts on how and why to use Power BI for these tasks, including mine. PowerBIDesktopModelAfter working with a number of clients on how to make implement Power BI in their data environments, I thought it might be helpful to talk about how Power BI works internally to provide guidance to determine how best to implement it. One of Power BI’s big strengths is the ability to bring data in from a lot of different sources and put them together into a single data model. Power Pivot in Excel and Power BI Desktop in the Relationship’s page both allow you to create data models. Power BI on the web, does not, as of this writing allow the ability to create data models from multiple sources. Power BI is able to handle a large amount of data to be stored within it using the Vertipaq engine which compresses the data and loads it in memory. How much data? Well that has to do with how much memory you have.

Having 8 GB of Memory on a 32 bit Operating System is Worthless

Recently I was working at a client who wanted to load five year’s worth of operating information into Power BI. He knew that Power BI needs a lot of RAM, so he requisitioned 8 GB of RAM for his laptop so he would be able to load this much data. However, his IT department had a policy of only installing 32 bit operating systems on the computers. When 64 bit operating systems first became available, often times the computer bios or the mother board wouldn’t support it. Until very recently, outside of developers, most people didn’t have a business reason for 64 bit operating system to do their job.

Power BI will use all the memory you have available. What is available? Here’s some math to explain how much memory is available. A 32 bit Windows operating system can assess 2 32 = 4GB of RAM, and you don’t even get all of that as the system uses some of it. This math applies if a 32 bit version of Office 2013 is installed onto a 64 bit version of the operating system. The client couldn’t load five years of operational data into Excel because he didn’t have enough addressable memory, since with a 32 bit operating system he had 4 GB of RAM he can use, period.

Reasons for Data Modeling in Excel Power Pivot

Power BI is a great tool, but it is not the only tool. If you want to load up years of data into your model, at some point, even if you have a 64 bit OS and 32 GB of RAM, eventually you will run out of memory. Well then what do you do?ImportPowerPivot If you have developed your data model in Excel, no problem you can upgrade it to Analysis Services Tabular. Visual studio has a tool to make migrating to Analysis Services Tabular easy to do. I find Power Pivot model to be one of the best ways to get started on creating a Tabular Model as you can test out your data modeling concepts very quickly. If you have developed your data model in Power BI Desktop, as of this writing there is no way to migrate directly to Analysis Services Tabular. If you are familiar with Power BI, you may be pleasantly surprised at how similar Analysis Service Tabular is. Along with improved performance accessing the data, there are a number of security features which are not available in Power BI in Analysis Services Tabular. With Power BI, you can create a corporate solution from a desktop application, which is one of the neatest things about Power BI.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur