Articles for the Month of January 2014

Power Query and Power Pivot – two ways to load up Excel with data

If you’ve have messed with pivot tables in the past, you know that adding data to an excel spreadsheet is nothing new.  You now can add data three ways, because you can now add data with Power Pivot and Power Query.  The question is which tool should you use when?  Well if you have Office 2013 and didn’t sign up for the Power BI preview, you are limited to using Pivot Tables.  Since I signed up for the Power BI preview I was left with another question.  Which tool is the right one to use for loading data?  Well it’s the same answer you always get when working with computers.  It depends. 

Power Query was designed to readily provide access to data, any data.  Microsoft on their site promotes the fact that you can include, or as they like to say “Mash” data from Wikipedia or other sources together with your business data.  I can see where this feature would be very useful.   I work with a concrete company who has a lot of KPIs that they have for daily production.  Well, when it rains they don’t meet the performance metrics because no one wants to pour concrete in the rain as it affects the way it cures and the concrete can end up bumpy instead of smooth.  If they correlated the weather together with the KPIs they could definitively determine the effect of rain on their business. Power Query also provides a way of merging or appending data together, which makes sense since you may be selecting data from a variety of sources

If you are pulling together internal data so that you can create a Power View report on it, Power Pivot would be the best way to proceed.  Yes, you can do this in Power Query, but Power Query doesn’t provide the same level of filtering that Power Pivot does.  Power Pivot also provides the dangerous feature of allowing users to update data in the database by modifying the data in Power BI a spreadsheets, a feature which most of the time I believe will be turned off. 

With Power Pivot, not only do you load up the data, but you also build the relationships between the data.  It works the same way if you user Power Query for half the tables in your database and Power Pivot for the other half.  I felt compelled to check that out when I was playing around to see if it would make a difference.  Of course Power Pivot can do a lot more than just relationship management of data.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Do I really have the Power?

Is just it me or does anyone else keep on hearing that stupid Power song by the one hit wonder Snap whenever they are hear about Power BI?  Now if you cannot recall this catchy tune this should send it rattling around into your skull http://www.youtube.com/watch?v=_BRv9wGf5pk.  It’s the new Rick Roll.

Is it possible that adding the word Power is supposed to be a subliminal message to people who’s lives are spinning out of control and have been sapped of their personal essence?  But then again I could be wrong.  You add data by Power Query, then manipulate it in Power Pivot, then make a Power View and add a Power Map.  And naturally this task should never be attempted at home and only by Professional Power Users.  I feel that by using it I have increased my power exponentially by a factor of 365 to complete this task.  On the other hand, I could be totally wrong.  Power BI is available as a preview edition, which means for now anyone can load it up as it has not been officially released.  Still though the more time I spend with it the more I generally like it. 

Although you could argue that it might have made more sense for Microsoft to build a reporting app than bloating up Excel, it’s perfectly understandable why they did it.  Based upon what I have seen, there is a real reticence on the part of the IT staff to install anything else on a computer.  Given the things end users wish to install on their computers, one could hardly blame the IT staff.  The idea of self-service BI is for business users to use it, and Excel has been the staple of ad-hoc data storage and analysis for decades.  Since Power BI is a series of task specific bolt-on accessories to the ubiquitous Microsoft spreadsheet, using it isn’t the most intuitive thing in the world either.  It is sort of awkward to click here and there to get the tool to work, but once you figure that out, in the end it turns out to be a cool tool.  It has changed quite a bit since it was first introduced.  Microsoft’s blog from July includes a walk through of Yelp data which will no longer works as written because the expand feature has been removed. 

Power BI really makes Microsoft competitive in an area where they had nothing but a gaping hole which their competitors, namely Tableau, have driven their truck through.  In less than two years they have come up with an application which provides self-service BI to business users, uses in-memory analytics, can be viewed securely on the web, and can be viewed on tablets and phones.  Since Microsoft is slowly but surely getting rid of Silverlight to follow everyone else to HTML5, you can view your Power BI reports on whatever phone and tablet you happen to own. Ok, the caveat is that the application that was created for the Microsoft phone is better, but you are not shut out if you have an iPhone or Samsung tablet either. 

From a marketing perspective, Microsoft has provided perfectly sensible reason that you might want to actually get Office365.  Personally Power BI was the reason that I decided to move to Office 365.  Let’s face it, for most people Office 365 was just another way to load up office.  The whole cloud thing was not really much of a selling feature for end users who didn’t want to use a skydrive when they could easily use a hard drive ensuring your data isn’t spread all over the known universe.  Hosting SharePoint in the cloud provides a feature people might actually use.  There is an argument to be made as to whether or not it is any less annoying to administer SharePoint via a cloud rather than locally, but it is one less server IT needs to monitor.  The price isn’t terrible.  Here’s the info which Microsoft has released on pricing http://www.microsoft.com/en-us/powerBI/pricing.aspx#fbid=5EP_a34ZHLQ

I am giving a talk at SQL Saturday in Albuquerque on Power BI where I will delve the technical features of Power BI in greater detail.

Yours Always,

Ginger Grant

Data aficionado et SQL Raconteur

Data Quality, there is an App for that

On Wednesday, January 8, I will be giving a presentation of an interesting and potentially handy new feature of SQL Server 2012, Data Quality Services [DQS].  I discovered this interesting tool when I was studying for my 70-463 exam, which I did pretty well on and will be discussing in a post coming soon to a screen near you.  This tool is included as part of the standard release of SQL Server 2012.  It’s an application that given it’s usefulness will have a wide implementation in the future.

I will demonstrate how to use this tool in conjunction with third party reference data from the Azure Marketplace to validate some sample address information.  In order to assimilate ever increasing volumes of data, new applications and tools have been created to process it, but the underlying problem still remains.  The data needs to be complete and accurate.  While sometimes it is possible to improve your methods of gathering the data by say tweaking the input phone number field so that you only get valid number in a format you expect, such modifications are not always possible.  There are times where the data providers may benefit by sending more data, not more accurate data, as they are paid to deliver quantity and it is up to the recipient of the data to determine the quality. 

Microsoft has created two tools which they designed to improve the quality of collected data.  The first tool, which was released as part of SQL Server 2008 is Master Data Management.  I have yet to see or hear of anyone who is using this application, and outside of testing I have not used it either.  Master Data Management was designed to assist organizations by providing a framework which transfers ownership of data quality to subject matter experts who know the correct values.

In SQL Server 2012, Data Quality Services was added as a way to fix the data.  As part of the ongoing push to put Business Intelligence into the hands of users and away from IT, DQS has a very user friendly interface designed to develop a set of rules which are used to clean data.  Once the rules have been validated, you can take this knowledge and put it into an SSIS job to automate cleansing of larger quantities of data.  

As this is the first version of DQS, there are some things about it which are a little awkward.  An example of this is the two-step process required to install DQS.  When you install SQL Server 2012, and select DQS as an installation component it appears to be installed, but it isn’t.  After the SQL Server 2012 application is installed, then you have to run DQSInstaller.exe.  One of the other limitations of DQS is it is a bit on the slow side.  Microsoft doesn’t recommend it for millions of records, just a few thousand.  Even with these limitations, DQS can still provide a great deal of useful functionality which can be an important part of ensuring your production data is complete and correct.

Yours Always,

Ginger Grant

Data aficionado et SQL Raconteur