Articles

Power BI – Beyond the Basics

While helping clients with their Power BI implementations recently, I have noticed that there are some areas where there continue to be a lot of questions. While it is easy to find a plethora of information about getting started with Power BI, when it comes to implementing a solution, the information is scarce. How do you handle releases? Should an implementation contain only one data model? Is Power BI’s data secured in the cloud? Is it required to have Office 365 to use Power BI? Do you have to have Power BI Premium to run Power BI locally?

Advanced Power BI Techniques in Norway

While I have discussed some best practice techniques on my blog, as usual new features released in Power BI have a tendency to change some of the available options. For example, App workspaces, the updated take on Content Packs released a few months ago, now offer a new method for releasing not only dashboards but the reports behind them, along with the ability to easily migrate sources. I am excited that I will have the opportunity to discuss the answers to the questions I have received by doing a full day of training at SQL Saturday Oslo. I am looking forward to visiting Oslo, which is home to the best preserved Viking ship, an Opera House designed to be walked on, and the guy who painted The Scream. If you happen to reside somewhere where it is possible to make the journey to Norway, please register to attend this full day of interactive training. We will cover all of these items and go into detail about Power BI administration, security, new features and design techniques which will improve Power BI implementations.

Norway Parliament Building in Oslo

For those of you who are unable to attend, I feel obliged to answer some of the questions I posed earlier. Implementations generally require more than one data model. Power BI data is encrypted both in transit and at rest. You do not need to have Office 365 to run Power BI. Power BI can be run locally with Power BI Report Server, which is part of SQL Server 2016 Enterprise with Software Assurance, and you do not need to sign up for Power BI Premium to install it.

I hope to see you in Norway.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Upcoming and Recent Events

PASS is a professional organization which sponsors a number of different events in the technical community and provides a lot of great resources to improve knowledge of all things SQL Server and related technologies. Recently, I have been honored to be selected to speak at not one but two events hosted by PASS. The PASS Business Intelligence Virtual Chapter provides training on all things related to Business Intelligence via the web. I was selected to talk at the last meeting in May. Thank you to all of the people who were able to attend my talk on Top 10 SSIS Tuning Tricks live. If you had to work, no problem: recordings of all of the talks hosted by the PASS Business Intelligence Virtual Chapter are available on www.Youtube.com. The recording of my Top 10 SSIS Tuning Tricks session is available here.

24 Hours of PASS

Periodically PASS provides a 24-hour training session on SQL-related topics to provide live training to every time zone in the world. As this event is watched by people around the world, it is a real honor to be selected for it. This time the speakers were selected from people who had not yet spoken at the PASS Summit Convention, as the theme was Growing Our Community. The theme is just another way the PASS organization is working to improve people’s skills. Not only do they provide the opportunity to learn all things data, but they also support professional development by providing many avenues to practice speaking skills.

Data Analytics with Azure Machine Learning

My abstract on Improving Data Analytics with Azure Machine Learning was selected by the 24 Hours of PASS. As readers of my blog are aware, I have been working on Azure Machine Learning [ML] this year and look forward to discussing how to integrate Azure ML into current environments. Data analytics with ML is yet another way to derive meaning from data being collected and stored. I find the application of data analytics fascinating, and hope to show you why if you are able to attend. There are a number of wonderful talks scheduled at this event, so I encourage you to check out the schedule and attend as many as you can. To be sure, I’ll be signing up for a number of sessions as well.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Azure ML, SSIS and the Modern Data Warehouse

Recently I was afforded the opportunity to speak at several different events, all of which I thoroughly enjoyed. I was able to speak on Azure Machine Learning first at the Arizona SQL Server Users Group meeting. I really appreciate all who attended, as we had quite a crowd. Since the meeting is held practically on Arizona State University’s Tempe Campus, it was great to see a number of students attending, most likely due to Ram’s continued marketing efforts on meetup.com. After talking to him about it, I was impressed at his success at improving attendance by promoting the event on Meetup, and wonder if many SQL Server User Groups have experienced the same benefits. If you have, please let me know. Thanks, Joe, for taking a picture of the event too.

Modern Data Warehousing Precon

The second event where I had the opportunity to talk about technology was the Precon at SQL Saturday in Huntington Beach, where I spoke about Modern Data Warehousing. It was a real honor to be selected for this event, and I really enjoyed interacting with all of the attendees. Special thanks to Alan Faulkner for his assistance. We discussed the changing data environment, including cloud based storage, analytics, Hadoop, handling ever increasing amounts of data from different sources, and the increasing demands of users, and we reviewed technology solutions which demonstrate ways attendees can resolve these issues in their environments.

Talking and More Importantly Listening

The following day was SQL Saturday in Huntington Beach #389. Thanks to Andrew, Laurie, Thomas and the rest of the volunteers for making this a great event, as I know a little bit about the work that goes into planning and pulling off the event. My sessions on Azure ML, Predicting the future with Machine Learning and Top 10 SSIS Tuning Tricks were both selected and I had great turnout at both sessions. To follow up on a question I received during my SSIS session, Balanced Data Distributor was first released as a new SSIS transform for SQL Server 2008 and 2008 R2, so you can use it for versions prior to SQL Server 2012. I’ve posted more information about it here. I also got a chance to meet a real live data scientist, the first time that has happened. Not only did I get a chance to speak but a chance to listen. I really enjoyed the sessions from Steve Hughes on Building a Modern Data Warehouse and Analytics Solution in Azure, Kevin Kline, and Julie Koesmarno on Interactive & Actionable Data Visualisation With Power View. As always it’s wonderful to get a chance to visit in person with the people whose technical expertise I read. In addition to listening to technical jokes which people outside of the SQL community would not find humorous, it’s great to discuss technology with other practitioners. Thanks to Mr. Smith for providing me a question to which I didn’t know the answer, which now I feel compelled to go find. I’ll be investigating the scalability of Azure ML and R so that I will be able to have an answer for him next time I see him. I really enjoy the challenge of not only investigating and applying new technology but figuring out how to explain what I’ve learned. I look forward to the opportunity to present again, and when I do I’ll be sure to update this site so hopefully I get a chance to meet the people who read this.
Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

What is a Modern Data Warehouse?

As I was honored enough to be selected to give a PreCon on the Internals of the Modern Data Warehouse, I thought that I would take the time to explain why I felt drawn to the topic. There are a lot of places that haven’t given much thought to the changes in technology which have happened over the last few years. For them, the major feature upgrades to SQL Server in 2012 and 2014 have meant that they can use columnstore indexes, which make things faster, and maybe get better High Availability. While those things are certainly valuable improvements, there is a lot more that you can do to derive value from your data, and companies want more than just a well-organized, running data warehouse.

Data is a Valuable Asset

In 2011, Borders Group Inc. was allowed by the Federal Trade Commission to sell its customer information to Barnes and Noble as part of the bankruptcy sale of its assets. In 2015, RadioShack is doing the same thing. Businesses understand that data is valuable and they are interested in using it to drive decision making. Amazon, Netflix and Target are well known for their use of customer information to drive sales, but they are far from the only ones doing this. This is one of the bigger trends identified recently in the business press. The heads of companies are now looking for their data teams to do more with their data so that they too can have the dream information systems they are reading about.

Total Destruction of the Existing DW is Not Required

While a lot of the time it might be nice to level everything and start over, that is not always an option. The major reason for this is that the data warehouse environment already in place has a lot of value. You want to add to the value already there, not destroy what you have. Also, it would take a long time to recreate the environment and no one is patient enough to wait for that. Alternatively, you could expand into areas of new technology as your data grows. Perhaps this means you archive some of your data from your database to a Hadoop cluster instead of backing up the data in some far off location. This would allow you to use Sqoop to bring the data back when you need it, providing ready access to the data. Perhaps you want to provide the users more self-service BI capabilities, moving the data analysis into the hands of the people who are more familiar with the data? You could add the capabilities of Power View in Excel, Power BI Designer or Tableau to your environment.

Incorporating Social Media Information

The business world does not operate only on a batch cycle. More and more companies want to know what is being said about them so they can respond appropriately. With tools like Azure Event Hubs, Data Factory, Stream Analytics, and Machine Learning this isn’t as hard to do as it might sound. We’ll review these products so that attendees will understand how these tools can provide greater insight not only into their own data, but also into the data building up about them outside of the company firewall.

For More Information

I really hope you can join me in Huntington Beach on April 10 for a full day of exploring these concepts. I always look forward to events like the precon and of course SQL Saturday #389 – Huntington Beach which is the following day.

 

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Complex Data Analysis and Azure Machine Learning Presentation Wrap Up

Thank you to all of the people who signed up for my webinar on Data Analysis with Azure Machine Learning [ML]. I hope after watching it that you find reasons to agree that the most important thing you need to know to get started in Machine Learning is not math, but having good knowledge of the data you want to analyze. There’s no reason not to investigate, as Azure Machine Learning is free. In order to take more time with the questions after the presentation than the webinar format allowed, I am posting my answers here, where I am able to answer them in greater detail.

How would one choose a subset of data to “train” the model? For example, would I choose a random 1000 rows from my data set?

It is important to select a subset of data which is representative of the data which you wish to evaluate. Sometimes a random 1000 rows will do that, and other times you will need to use other criteria, like transactions throughout a given date range, to get a more representative sample. It all comes down to knowing your data well enough to know that the data used for training is similar to what you will ultimately be using for analysis.
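To make the two approaches concrete, here is a minimal Python sketch of a random sample next to a date-range sample. The row layout (a list of dicts with a `date` field), the function names and the sample sizes are all assumptions made for illustration; this is ordinary Python, not something Azure ML provides.

```python
import random
from datetime import date, timedelta


def random_sample(rows, size, seed=0):
    """Pick `size` rows uniformly at random (fixed seed so runs are repeatable)."""
    rng = random.Random(seed)
    return rng.sample(rows, min(size, len(rows)))


def date_range_sample(rows, start, end):
    """Keep every row whose 'date' falls inside [start, end]."""
    return [row for row in rows if start <= row["date"] <= end]


# Hypothetical data: one transaction per day for the year 2014.
transactions = [
    {"id": i, "date": date(2014, 1, 1) + timedelta(days=i), "amount": 10 + i % 7}
    for i in range(365)
]

training_random = random_sample(transactions, 100)
training_q4 = date_range_sample(transactions, date(2014, 10, 1), date(2014, 12, 31))
```

Which of the two is the better training set depends entirely on the data: if the behavior you care about is seasonal, the date-range sample may represent it better than a random draw across the whole year.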

Do you have to rerun or does it save results?

Each time you run an experiment, the data needs to be processed again, as the results of a previous run are not saved.

Does Azure ML use the same logic as data mining?

In a word, no. If you look at the algorithms used for data mining you will see they overlap with some of the models available in Azure ML. Azure ML provides a richer set of models, plus a greater ability to either call models created by others or write custom models.

How much does Azure ML cost?

There is no cost for Azure ML. You can sign up and use it for free.  Click here for more information on Azure ML.

If I am using Data Factory, can I use Azure ML?

Data Factory added the ability to call Azure ML in December, providing another place to incorporate Azure ML analytics. When an Azure experiment is complete, it is published as a web service so that the experiment can be called by any program which chooses to call it. Using the Azure ML experiments from directly within Data Factory decreases the need to write custom code, while allowing the logic to be incorporated into routine data collection processes.
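For anyone curious what "any program which chooses to call it" might look like, here is a rough Python sketch that builds (but does not send) an HTTP request for a published experiment. The endpoint URL, the API key, the column names and the exact shape of the JSON payload are all assumptions for illustration; the API help page for your own published web service is the authority on what it expects.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) an HTTP POST for a published experiment."""
    # Assumed payload layout; confirm it against your service's API help page.
    payload = {
        "Inputs": {
            "input1": {"ColumnNames": column_names, "Values": rows}
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Placeholder endpoint and key; substitute the values for your own service.
request = build_scoring_request(
    "https://example.invalid/score",
    "YOUR_API_KEY",
    ["age", "income"],
    [["42", "55000"]],
)
# To actually call the service: urllib.request.urlopen(request).read()
```

Calling the experiment from within Data Factory saves you from writing and maintaining this kind of glue code yourself.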

http://azure.microsoft.com/blog/2014/12/16/azure-data-factory-updates-integration-with-azure-machine-learning-2/

If you have more questions about Azure ML, or would like to see me present on the topic live and you live in Southern California, I hope you can attend SQL Saturday #389 – Huntington Beach, where I will be presenting on Azure ML and Top 10 SSIS Tuning Tricks. I hope to see you there.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Tips on SSIS at SQL Saturday Albuquerque

On February 7, I was fortunate enough to be selected to speak at SQL Saturday in Albuquerque, New Mexico on Top 10 SSIS Tuning Tricks. Having worked with SSIS for a number of years, I’ve needed to research what the best methods were to ensure my SSIS ETL was running optimally. I’ve compiled the most valuable items, with examples of course, into this presentation. I’m assuming that everyone attending already has been using SSIS for a while, so I will skip straight into more in-depth ways of tuning SSIS. One of the questions that I know I have heard most often is “When should I do X in SQL or SSIS?” If you are able to attend this session, you will have the answer to that question.

I really enjoy the opportunity to speak on data related topics and to meet people who may have come upon my blog in the past. Having spoken at this event last year, I know what a good job Keith, Chris, Meredith and friends do organizing this event. I want to take the time to say thank you for all of your hard work, as I really appreciate it. These events are a great place to learn and keep up with a lot of the changes going on in the industry. I anticipate there will be many lively discussions both before and after the event. That reminds me. If you get a chance, there are two great precons scheduled on Friday, February 6th: PowerShell Basics with Mike Fal and Query Tuning, Troubleshooting and Execution Plans with Jason Kassay. Having been fortunate enough to meet both of them, I know that they are both extremely knowledgeable in their respective topics, and if you are in Albuquerque I encourage you to sign up for either of them, as I am sure both will be excellent.

I hope that you will be able to attend as I know I will enjoy seeing you there.

 

Yours Always
Ginger Grant
Data aficionado et SQL Raconteur

Presentation Follow Up to Data Analytics and Distribution with Power BI

Thank you to all of the people who took the time to view my session on Data Analytics and Distribution with Power BI. I really enjoy getting the chance to decrease the confusion I hear regarding Power BI, and hope that you will find the question and answer section helpful if you are trying to learn more about the product.

Questions and Answers

 In Power BI Designer, is it possible to manipulate the colors of the charts?

While this may change as Power BI Designer is still in preview mode, currently the colors are assigned automatically. As you might guess, this is a feature that other people are interested in so it is on Microsoft’s list of things to add. If Power BI Designer color selections are implemented like they are in Power View, which is very similar to Power BI Designer, it is likely themes of colors will be available, rather than the ability to pick each color like in Report Designer, but we will have to wait and see.

Are Power BI designer/dashboard changes specific to each user?

If you have created a Power BI Designer Dashboard, you have the ability to share it with people in your organization. There are a couple of things that need to happen for this to work. The people that you share it with must have Power BI accounts, and they have to be in the same domain as you are. When you share the reports they are only able to read, not edit them. For more information regarding security and Power BI, see Microsoft’s guide here.

Can I grant access to users outside of our domain?

Power BI’s security model is a separate tenant from the security model for SharePoint in the Office 365 cloud, but they are related, as you can only grant access to Power BI if those users are able to access your version of Office 365 SharePoint. As stated in the previous question, for a Power BI Designer Dashboard, the users must be part of the same domain.

How does Power BI perform predictions? Is it the same logic which is used in Data mining?

Power BI uses the Forecasting and Hindcasting features to perform predictive analytics. There are a number of different analytical categories, and the kind used in Power BI is Time Series. As the name suggests, Time Series models analyze a set of measurements performed over time to determine patterns in the past which can be used as guides going forward. Data Mining looks at variables, which may or may not include time, as it looks for patterns throughout the data. These underlying statistical models are not the same.
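As a rough illustration of the time series idea, the sketch below applies simple exponential smoothing to a handful of measurements. This is a generic textbook technique picked for illustration, not a claim about the algorithm Power BI runs; the function name, the smoothing factor and the sales figures are all made up.

```python
def exponential_smoothing(values, alpha=0.5):
    """Return the smoothed series; the last element doubles as the next-period forecast."""
    if not values:
        raise ValueError("need at least one observation")
    smoothed = [values[0]]
    for observed in values[1:]:
        # New level = weighted blend of the newest observation and the prior level.
        smoothed.append(alpha * observed + (1 - alpha) * smoothed[-1])
    return smoothed


monthly_sales = [100, 110, 105, 120, 130]  # hypothetical measurements over time
levels = exponential_smoothing(monthly_sales, alpha=0.5)
forecast_next = levels[-1]
```

Comparing each smoothed level with the value that was actually observed in the following period is the spirit of hindcasting: you check how well the model would have predicted the past before trusting it with the future.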

Does Power BI have Power Map feature?

Power BI definitely contains Power Map. In fact, Power Map is only supported in preview mode if you do not have Power BI. This link can provide more information about the limits of Power Map in Excel. Power Map is designed to be run as a movie and provide a directed look at the data on the map, rather than providing the interactive drill down mapping features which are available in Power View. You can share a Power Map by saving it as an mp4 video file and posting it anywhere. There are a number of Power Maps on YouTube if you care to search there.

How does “R” play here?

The R language is completely agnostic as to the source of its data, so you have the ability to use Excel as a data source if you want to. If you want to use R within Excel, try the RExcel add-in, which is available here.

Is the PowerPivot where the data is stored for Power BI Designer?

No. Data can come from anywhere, not just from Power Pivot in Excel. For example, if you want to use a website as a data source, you could do that too as there are a number of different available data sources, and that is one of many.

Do you have to use Power BI Designer on the Web?

While Microsoft has designed the Power BI Designer as a web project, so that you can create Power BI Designer Dashboards as part of the preview, there is also an application available for download here. The desktop application works very similarly to the web version, with the exception of course that you will need to upload and Select Power BI Designer file as your data source. Should you wish to modify the dashboard once it is loaded, you can do so.

How would you determine anomalies or freak instances in data versus true trends?

The problem of determining anomalies is one with which the practitioners of predictive analytics are constantly struggling. For trending to occur, the number of what were previously considered anomalies needs to increase. Forecasting within Power BI applies one of the more standard methods for accounting for anomalies: looking at standard deviations and probability. The likelihood that a number will fall within a certain range is based upon the number of times this has happened in the past, which is graphed as a bell curve. The values on the far sides of the bell curve are discarded. The width of the range that is kept is expressed in standard deviations, which in Forecasting in Power BI is represented as 1σ. The number increases with the more variance you wish to represent.
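The standard deviation idea can be sketched in a few lines of Python. This is only an illustration of the general approach, assuming a made-up list of daily order counts and a hypothetical helper name; it is not the code Power BI uses.

```python
from statistics import mean, pstdev


def flag_outliers(values, num_std=1.0):
    """Return the values lying more than `num_std` standard deviations from the mean."""
    center = mean(values)
    spread = pstdev(values)
    return [v for v in values if abs(v - center) > num_std * spread]


daily_orders = [50, 52, 49, 51, 48, 50, 95]  # hypothetical data with one spike
suspects_narrow = flag_outliers(daily_orders, num_std=1.0)
suspects_wide = flag_outliers(daily_orders, num_std=3.0)
```

With a band of one standard deviation the spike of 95 is flagged, while a band of three standard deviations accepts it as normal variation. Whether that spike is a freak instance or the start of a trend is still a judgment that depends on knowing the data.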

Does SharePoint on premise support Power BI Designer?

No. Power BI Designer is currently in preview version in the US for Power BI subscribers. You can download the application to play with it.

Will Power BI be available in next version of SharePoint?

While I cannot speak for Microsoft, I can tell you that it isn’t there now. For more information on SharePoint, check out their website here.

Does this work in Office 365 SharePoint? And this replace the bi feature offered in SharePoint on the cloud?

Since I do not work for Microsoft, I am hesitant to talk much about how their licensing plans really work. For more information, please check out their website.

If we want to start learning Power BI, where do we start?

There are a number of great places to learn about Power BI, the best and most up to date being here. I have included some other places where you might want to go to learn more about Power BI.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Upcoming and Up and Coming Topics

It’s funny the different meanings words have when you put them in a different order, a point which anyone who has imitated the dialect of Yoda can tell you. I find words fascinating, as they are not static but have meanings which change over time. For example, the Iron Maiden meant something totally different before there were electric guitars. Thinking of words and things changing, as one year closes and another year begins, I start to evaluate past and future topics. Earlier this year, I held an informal poll on Twitter to find out how long people tend to talk on the same topic. The answers were quite varied. Some people keep on talking about the same topic as long as there seems to be interest in hearing about it. That way you can get to be a really good speaker on that topic. Another person feels obligated to create a new topic each time out to provide himself a challenge. The answer that I personally related to was to keep on talking about the topic until you are tired of hearing about it, which takes about a year.

SQL Saturday Albuquerque

My first upcoming engagement for 2015 will be at SQL Saturday Albuquerque, where I will be talking about SSIS. I generally talk about things I am interested in or presently working on, and having worked on a lot of ETL recently, I thought that it would be an interesting topic which most people would find helpful. As a consultant, I see a lot of code and wonder why parts of it were written that way. One big reason is that someone thought the design was a good one. Since that is a subjective decision, I thought it might be helpful to clarify design decisions with facts so that people would be able to employ good logic for their design decisions.

Technology changes and their Impact on Data Development

Another topic which really interests me is the changes that new technologies are bringing to the database world. With the increased implementation of Hadoop and the cloud, things are really changing in the way data is being both stored and used. Predictive Analytics, Machine Learning, Cloud implementations and Interactive Data visualizations are changing what people are expecting from the way their data is stored and used. Expectations for data professionals are increasing as the business is looking away from HiPPO (the highest paid person’s opinion) and towards the knowledge that can be found in the data they have gathered or integrated from public sources.

Modern Data Warehouse

I have the pleasure of assisting in a day-long session to talk about Architecting the Modern Data Warehouse. During this one day session we will be showing how to use new technology such as HDInsight and Machine Learning to implement a modern data warehouse. Instead of just talking about new technologies, we will be putting them to use to show how they can be used today. I’m really looking forward to it.

If you are able to attend any of these or any upcoming sessions, please stop by and introduce yourself as I would love to meet readers of my blog in person.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

DQS – DIY Guide to Getting Started with Data Quality Services

Power BI and SQL Server 2012 (and beyond) both have components Microsoft developed to shift the focus of Business Intelligence onto business users. Since there are more people who use Excel than have eaten at McDonald’s, you may know about Power BI and the data features in Excel, but have you heard of Data Quality Services [DQS]? If not, well now you have. To get started, if you don’t have Microsoft’s developer version of SQL Server 2012 or later, or access to an enterprise or BI server somewhere, you will need $59.95 to get a copy of the Developer Edition of SQL Server 2014. Interestingly, Microsoft won’t let you buy the Developer Edition of SQL Server 2012, which I found out when I tried earlier in the year. Unfortunately, although Visual Studio is now free, you still have to pay for the Developer Edition of SQL Server.

After you have installed SQL Server and selected that you wanted Data Quality Services, you would assume you had installed it. While that sounds like sterling logic, it is not correct. Here’s how to tell if you have DQS. Open up SSMS on your computer and look at the list of databases like I did here. Do you see any databases here which start with DQS? No. That is because it hasn’t been installed yet. It sure looks like it has been if you look at the SQL Server installer, which I have included below. I added the red boxes to highlight the fact that I really did select Data Quality Services when I installed.

If you don’t have the Data Quality Services and Data Quality Client installed in SQL Server like they are here, you will need to do that first, but this is only the first step. Once the install screen looks like the one pictured above, you need to go to the Data Quality Services folder in SQL Server and select the SQL Server Data Quality Server Installer. After this package is run, which takes a while, you will finally get a screen that lets you know the installation is finally completed.

After DQS Server has installed, you will see that three databases have been added: DQS_Main, DQS_Projects and DQS_Staging_Data. Once these three databases are installed, you can then start using the DQS Client.
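If you would rather check than eyeball the list in SSMS, the idea can be sketched in Python. The query string below is ordinary T-SQL against `sys.databases` that you could paste into a query window; the helper function, its name and the sample database lists are hypothetical, and the comparison ignores case on the assumption that the database names may appear in upper case on your instance.

```python
EXPECTED_DQS_DATABASES = {"DQS_MAIN", "DQS_PROJECTS", "DQS_STAGING_DATA"}

# T-SQL you could run in SSMS to list any DQS databases on the instance.
LIST_DQS_DATABASES_SQL = "SELECT name FROM sys.databases WHERE name LIKE 'DQS%';"


def missing_dqs_databases(database_names):
    """Return the expected DQS databases absent from `database_names`, sorted."""
    present = {name.upper() for name in database_names}
    return sorted(EXPECTED_DQS_DATABASES - present)


# Hypothetical instance where the DQS Server installer has not been run yet.
before_install = ["master", "model", "msdb", "tempdb"]
after_install = before_install + ["DQS_MAIN", "DQS_PROJECTS", "DQS_STAGING_DATA"]
```

An empty result from `missing_dqs_databases` means all three databases are there and the DQS Client has something to connect to.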


The DQS client does not need to be installed on a server. Since I highly doubt most places will want their business users to be directly accessing their Server, most of the time it will not be installed on the server.

Once your environment is set up, it’s time to start using it. For more information on how to use DQS, please listen to my presentation on the PASS BI virtual chapter on November 26. If you can’t make it, generally speaking it will be available on PASS BI’s YouTube Channel after about a week. I sincerely hope you can make it. Let me know what you think of my presentation by posting feedback to my blog.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

 

Where to go to be in the Know

At the last SQL Server User’s group meeting, I got into a long conversation with someone about where to go to find good training materials online for people at various skill levels. Contrary to popular belief, I do spend time on the internet not on Twitter or shoe shopping, and have come up with a list of sites where I think the training material is quite good, and also free. I included a lot of video content, as a lot of people find that easy to use for learning new things.

This is not meant to be an exhaustive list, just the places I’ve found helpful most recently or have found myself visiting a lot. If you have some suggestions of places you think I should add, please send them to me.

Big Data

MSBIAcademy – This is a really good way to get up to speed on Big Data and other topics.

Apache – If you are interested in Hadoop, you will make your way here to get the latest releases and see what wacky names the latest new tools have.

Hortonworks – I found the training information on Hortonworks’ site to be very good at explaining things.

Learning Map for HDInsight and Azure – SQL Server data is moving to the cloud too and this is a great place to get up to speed.

Mostly SQL Server

A lot of these sites have crossover information too.

SQL University – This site has information for those people just getting started with the Microsoft stack to more advanced topics.

Pragmatic Works Training on the Ts – Every week on Tuesdays and Thursdays, Pragmatic Works provides free webinars on SQL Server and Big Data topics with some of the people who wrote the book, OK, lots of books, about all kinds of SQL and Big Data stuff. Flip through the archives if you are working during the day and don’t have time to view them live.

Microsoft Virtual Academy – This is Microsoft’s site where they offer free training and you get points. I didn’t know that I wanted points before, but I do now.

Microsoft’s SSIS Tech Net Videos – The audio on these is often very lousy, but the content is pretty good. I am not sure how often these are updated, but you can find good best practice material here.

Ola Hallengren’s Site – If anyone ever calls you a DBA, you should know about this site.

CBT Nuggets on YouTube – If you can find anything on YouTube from CBT Nuggets, it probably won’t be a waste of time. This link is for information on SSIS. Be wary of some of the things posted on YouTube, as not everything there is correct or follows best practices, and the quality can be marginal.

Channel Nine – Microsoft has some random-ish videos out here, some of which are really helpful.

SQLServerCentral – This is a great resource. Go create an account here as it is free and there is a treasure trove of information.

SQLPass – Last but certainly not least, check out all of the information archived on SQL Pass. They have a lot of virtual groups on a wide variety of SQL related topics. If you can’t attend the sessions when they are being held, the videos are available for later viewing on the website. The previous PASS Summit information is awesome. They also have a YouTube channel, where you can find interesting things to watch.

SQL Saturday

All of the other stuff I mentioned is archived and available when you have a chance, but there is nothing like being able to ask resident experts about various stuff and network with other SQL Server people. Check out SQL Saturday to see when and where there is going to be an event near you. These events have gone worldwide, so it is very likely there will be an event near you sometime this year.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur