Articles for the Month of May 2015

Upcoming and Recent Events

24HOPPassSpeakingThe PASS organization is a professional organization which sponsors a number of different technical events in the technical community. Recently, I have been honored to be selected to speak at not one but two events hosted by PASS, a professional organization which provides a lot of great resources to improve knowledge of all things SQL Server and related technologies to the world. The PASS Business Intelligence Chapter provides training on all things related to Business Intelligence via the web. I was selected to talk at the last meeting in May. Thank you to all of the people who were able to attend my talk on Top 10 SSIS Tuning Tricks live. If you had to work, no problem all of the talks hosted by the PASS Business Intelligence Virtual Chapter Recordings are available on The recording of my Top 10 SSIS Tuning Tricks session is available here.

24 Hours of PASS

Periodically PASS provides a 24 Hour Training session on SQL Related topics to provide training live to every time zone in the world. As this event is watched by people around the world, it is a real honor to be selected for this event. This time the speakers were selected from people who had not yet spoken at the PASS Summit Convention, as the theme was Growing Our Community. The theme is just another way the PASS organization is working to improve people’s skills. Not only do they provide the opportunity to learn all things data, but also provide professional development through growing the speaking skills by providing many avenues to practice these skills.

Data Analytics with Azure Machine Learning

My abstract on Improving Data Analytics with Azure Machine Learning was selected by the 24 Hours of PASS. As readers of my blog are aware, I have been working on Azure Machine Learning [ML] this year and look forward to discussing how to integrate Azure ML into current environments. Data analytics with ML are yet another way to derive meaning from data being collected and stored. I find the application of data analytic fascinating, and hope to show you why if you are able to attend. There are a number of wonderful talks scheduled at this event, so I encourage you to check out the schedule at attend as many as you can. To be sure I’ll be signing up for a number of sessions as well.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Getting Started with Datazen

Since Microsoft announced they were buying Datazen, I wanted to check out the application. I am interested to know about Datazen’s feature set, and how it compares to functionality in Excel with the Power tools and Power BI Preview. Datazen is comprised of two applications, the Publisher and the Viewer. To create reports, one must install Datazen Publisher and that application will need to use the Datazen Server to make data connections and provide security. I have included a diagram from DataZen’s Website to describe how it is meant to be configured.DataZenArchitecture

The publisher is meant to use the Datazen Server, which is where the connection to your data is made. All of the client communication with the server is performed using a Web App which uses IIS. If you have an Enterprise copy of SQL Server with active Software Assurance coverage, you are in luck. You automatically have the ability to install Datazen Server for no additional cost, which you can do here. What if you just want Microsoft to host the Datazen Server for you, do you have to install a server? The answer to that question is yes, because the Datazen Hosted Service is listed as Coming Soon (scroll down to the bottom of the page if you click the link). You can of course install Datazen to Azure, which will make it cloud based.

Quick Tips for Installing Datazen

To design a dashboard, you will need to install the Datazen Publisher application. While Datazen visualizations can be viewed on Apple and Android devices, to design those visualizations, you will need to use the Windows only client. And by Windows, they mean Windows 8. While I have a machine with Windows 8.1, it’s not my favorite OS, and I work a lot on my Windows 7 PC. I won’t be using that PC to create Datazen reports. Datazen now offers a preview version of the Datazen Publisher, so I assume that the market demand pushed them to support Windows 7.  This version is not a released version, but a preview, so you may have some issues with it. Datazen is designed to be an application to run on Phones and tablets. The software is installed in the same manner phone apps are, you need to go to the Windows Store to download it, rather than click on a link to the site and download it, which is why you need Windows 8. If you can’t find the store, just go to the start screen and search for Store, which is what I had to do. Once in the store, search for Datazen Publisher, which is free.

Running Datazen

After Datazen Publisher has been installed, when you run it you will be prompted to connect to a server. If you don’t have a server, no worries, Datazen provides a demo server for you. The data which is entered by default the first time you run Datazen, as shown below, provides a connection to this server. DataZenClientScreenTo connect, just click on the Connect button on the screen. You will be asked if you want to let publisher run in the background. Generally speaking, click on the Allow button. If you do, you will get updates which have been made to the data sources. I received 31 updates after I installed it.

Datazen Visualizations

Initially you will see KPIS, Top Dashboards and other sample dashboards which have been sorted by sector provided to show the breadth of things available. Datazen has two basic categories of visualizations, KPIs and Dashboards.

KPIs are designed to provide tiles of a single metric, optionally compared to a target or range. You have the ability to comment on the results too, which is a novel ability that I have not seen in other products. Dashboards provide a typical visualization which allow for contextual selecting interactivity to see different detail on the same screen.

Creating Datazen Dashboards

To create your own, right click anywhere on the screen and a toolbar will appear at the bottom of the screen. Click on the create button and you can start either creating from an empty dashboard or starting from the existing dashboard. I will be exploring both of these options in an upcoming post to my blog. If you are interested knowing exactly when that will occur, I recommend subscribing to my blog so that you can be notified when exactly that is going to happen.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur



SQL Server 2016 and PolyBase

The next release of SQL Server, SQL Server 2016 is continuing with a convention which was employed in previous releases, which I call the Cadillac release system.  At General Motors, in the past new features were first offered on their most luxurious brand, Cadillac, and if these features prove successful, they are rolled out to Buick and the rest of the product lines.  Microsoft does the same thing.  Their ‘Cadillac’  is the PDW [Parallel Data Warehouse], Microsoft’s Data Appliance. One notable example of this release model was the addition of column store indexes to SQL Server. Column store indexes were first available on the PDW, or APS as is was known then, and Microsoft later added column store indexes to SQL Server 2012. Later that same year, at SQL PASS Summit 2012, I heard about a really neat feature available in the PDW, PolyBase. The recording I heard is available here, where Dr. David DeWitt of Microsoft explained PolyBase in great detail. I have been waiting to hear that PolyBase was going to be released to SQL Server ever since.  On May the Fourth, 2015, Microsoft announced the preview release of SQL Server 2016. Listed in the release announcement was the feature I’d been waiting for, PolyBase.

Sqoop Limitations

PolyBase provides the ability to integrate a Hadoop cluster with SQL Server, which will allow you to query the data in a Hadoop Cluster from SQL Server. While the Apache environment provided the Sqoop HadoopSqoopapplication to integrate Hadoop with other relational databases, it wasn’t really enough. With Sqoop, the data is actually moved from the Hadoop cluster into SQL Server, or the relational database of your choice. This is problematic because you needed to know before you ran Sqoop that you had enough room within your database to hold all the data. I remembered this the hard way when I ran out of space playing with Sqoop and SQL Server. From a performance perspective, this kind of data transfer is also, shall we say, far from optimal. Another way to look at Sqoop is that it provides the Hadoop answer to SSIS. After all Sqoop is performing a data move, just like SSIS code. The caveat is SSIS is generally faster than Sqoop, and provides a greater feature set too.

Polybase – Hadoop Integration with SQL Server

Unlike Sqoop, PolyBase does not load data into SQL Server. Instead it provides SQL Server with the ability to query Hadoop while leaving the data in the HDFS clusters. Since Hadoop is schema-on-read, within SQL server you generate the schema to apply to your data stored in Hadoop. After the table schema is known, PolyBase provides the ability to then query data outside of SQL Server from within SQL Server. Using PolyBase it is possible to integrate data from two completely different file systems, providing freedom to store the data in either place. No longer will people start automatically equating retrieving data in Hadoop with MapReduce. With PolyBase all of the SQL knowledge accumulated by millions of people becomes a useful tool which provides the ability to retrieve valuable information from Hadoop with SQL. This is a very exciting development which I think will encourage more Hadoop adoption and better yet, integration with existing data. I am really looking forward SQL Server 2016.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur