Preparing for Exam 70-774 – Perform Cloud Data Science with Azure Machine Learning

There are a number of reasons why you might want to take a Microsoft cert exam. Maybe you want to focus your studies on a tangible goal, or you think it will help further your career, or you work for a Microsoft Partner that requires a certain number of people to pass the exam to maintain its current partner status. I am not going to get into the long argument about whether or not a cert will help your career, but I can tell you why you might want to take the 70-774 exam. Machine Learning, or Data Science if you prefer, is an important analytic skill to have, and I believe it will only become more useful over time. Azure Machine Learning is a good tool for learning the analysis process. Once you have the concepts down, should you need to use other tools to perform analysis, it is just a matter of learning a new tool. I talk to a number of people who are trying to learn new things, and they study them in their spare time. It’s very easy to spend time vaguely studying something, but you may find that having a target set of items to study will focus your time, and as a bonus you get a neat badge and some measure of proof that you were spending time on the computer learning new things and not just watching cat videos.

Exam 70-774 Preparation Tips


While you could always buy the book for the exam (shameless plug, as I was one of the authors), the book will not be enough; you will still need to write some code and do some additional studying. This exam is one of two needed for the MCSA in Data Science, and you can take the exams in any order. The best place to start is by looking at the 70-774 exam reference page from Microsoft. There are four different sections in the exam, and I have collected some links for each section which will help you prepare. In studying for exams in the past, the best way I have found to prepare is to look at everything on the outline and make sure that I know it.

Prepare Data for Analysis in Azure Machine Learning and Export from Azure Machine Learning

Normalizing Data
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/normalize-data

TanH
https://reference.wolfram.com/language/ref/Tanh.html

ZScore
http://stattrek.com/statistics/dictionary.aspx?definition=z-score
http://howto.commetrics.com/methodology/statistics/normalization/

Min Max
https://www.quora.com/What-is-the-meaning-of-min-max-normalization
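
If you want to see what these normalization options actually do to a column of numbers, here is a minimal Python sketch assuming NumPy and scikit-learn are installed. The values are made up, and this is just my own refresher on the math rather than a reproduction of the Normalize Data module.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# A made-up numeric column to normalize
x = np.array([[2.0], [4.0], [6.0], [100.0]])

# Z-score: subtract the mean, divide by the standard deviation
z = StandardScaler().fit_transform(x)

# Min-Max: rescale the values into the 0-1 range
mm = MinMaxScaler().fit_transform(x)

# Tanh: squash the values into the -1 to 1 range
th = np.tanh(x)

print(z.ravel(), mm.ravel(), th.ravel(), sep="\n")
```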

PCA
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/principal-component-analysis
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/principal-component-analysis
https://stackoverflow.com/questions/9590114/importance-of-pca-or-svd-in-machine-learning
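
Reading about PCA is one thing; running it once makes it stick. This is a small scikit-learn sketch of my own, with made-up data and an arbitrary choice of two components, just to show the reduction and the variance explained.

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up data: 100 rows, 5 correlated columns
rng = np.random.RandomState(0)
base = rng.rand(100, 2)
data = np.hstack([base, base @ rng.rand(2, 3)])

# Reduce the 5 columns down to 2 principal components
pca = PCA(n_components=2)
reduced = pca.fit_transform(data)

# How much of the variance the 2 components retain
print(pca.explained_variance_ratio_)
```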

Singular Value Decomposition (SVD)
http://andrew.gibiansky.com/blog/mathematics/cool-linear-algebra-singular-value-decomposition/

Canonical-correlation analysis (CCA)
https://en.wikipedia.org/wiki/Canonical_correlation

Develop Machine Learning Models

Team Data Science
https://docs.microsoft.com/fi-fi/azure/machine-learning/team-data-science-process/python-data-access

K-Means
https://www.datascience.com/blog/k-means-clustering
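
The K-Means Clustering module hides most of the mechanics, so a quick scikit-learn sketch with made-up points can help make the idea concrete; the cluster count and data below are my own illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two made-up blobs of points in 2-D
rng = np.random.RandomState(42)
points = np.vstack([rng.normal(0, 1, (50, 2)),
                    rng.normal(5, 1, (50, 2))])

# Ask K-Means for 2 clusters
km = KMeans(n_clusters=2, n_init=10, random_state=42).fit(points)

print(km.cluster_centers_)   # the two centroids it found
print(km.labels_[:5])        # cluster assignments for the first 5 points
```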

Confusion Matrix
http://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
https://en.wikipedia.org/wiki/Confusion_matrix
https://en.wikipedia.org/wiki/F1_score
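
The exam expects you to be able to read a confusion matrix and work out precision, recall, and F1 by hand. This little sketch, with made-up labels, is handy for checking your arithmetic against scikit-learn.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# Made-up actual and predicted class labels
actual    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Rows are actual classes, columns are predicted classes
print(confusion_matrix(actual, predicted))

# F1 is the harmonic mean of precision and recall
print(precision_score(actual, predicted),
      recall_score(actual, predicted),
      f1_score(actual, predicted))
```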

Ordinal Regression
https://en.wikipedia.org/wiki/Ordinal_regression

Poisson regression
https://en.wikipedia.org/wiki/Poisson_regression

Mean Absolute Error and Root Mean Squared Error
http://www.eumetrain.org/data/4/451/english/msg/ver_cont_var/uos3/uos3_ko1.htm
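
Mean Absolute Error and Root Mean Squared Error are worth computing by hand at least once. Here is a tiny sketch with made-up actual and predicted values so you can see where the numbers come from.

```python
import numpy as np

# Made-up actual values and model predictions
actual    = np.array([3.0, 5.0, 2.5, 7.0])
predicted = np.array([2.5, 5.0, 4.0, 8.0])

errors = actual - predicted

# MAE: average of the absolute errors
mae = np.mean(np.abs(errors))

# RMSE: square the errors, average them, then take the square root
rmse = np.sqrt(np.mean(errors ** 2))

print(mae, rmse)   # 0.75 and roughly 0.9354
```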

Cross Validation
https://towardsdatascience.com/cross-validation-in-machine-learning-72924a69872f
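
Cross validation shows up both as a concept and as the Cross Validate Model module. Below is a minimal scikit-learn sketch of 5-fold cross validation on a built-in sample dataset; the model choice is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Built-in sample data and a simple classifier
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold cross validation: train on 4 folds, score on the 5th, repeat
scores = cross_val_score(model, X, y, cv=5)

print(scores)          # one accuracy score per fold
print(scores.mean())   # the average across folds
```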

Operationalize and Manage Azure Machine Learning Services

Connect to a published Machine Learning web service
https://docs.microsoft.com/en-us/azure/machine-learning/studio/publish-a-machine-learning-web-service
https://docs.microsoft.com/en-us/azure/machine-learning/studio/consume-web-service-with-web-app-template
https://docs.microsoft.com/en-us/azure/machine-learning/studio/manage-new-webservice
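
Consuming a published web service boils down to an authenticated POST with a JSON payload. The sketch below is only a rough outline: the URL, API key, and input schema are placeholders you would copy from your own service’s consumption page, not real values.

```python
import requests

# Placeholders: copy the real values from your web service's consumption page
url = "https://PLACEHOLDER.services.azureml.net/..."
api_key = "YOUR-API-KEY"

# The exact input schema depends on how the service was published;
# this shape is only an illustration
payload = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["feature1", "feature2"],
            "Values": [["1", "2"]]
        }
    },
    "GlobalParameters": {}
}

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer " + api_key
}

response = requests.post(url, json=payload, headers=headers)
print(response.status_code)
print(response.json())
```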

Use Other Services for Machine Learning

Microsoft Cognitive Toolkit
https://www.microsoft.com/en-us/cognitive-toolkit/

BrainScript
https://docs.microsoft.com/en-us/cognitive-toolkit/brainscript-basic-concepts

Streamline development by using existing resources
https://docs.microsoft.com/en-us/azure/machine-learning/studio/gallery-how-to-use-contribute-publish

Perform database analytics by using SQL Server R Services on Azure
https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/provision-vm
https://docs.microsoft.com/en-us/machine-learning-server/install/r-server-vm-data-science
https://journal.r-project.org/archive/2009-2/RJournal_2009-2_Williams.pdf
http://blog.revolutionanalytics.com/2017/07/xgboost-support-added-to-rattle.html
https://github.com/JohnLangford/vowpal_wabbit/wiki

I hope you have found this test preparation material helpful.  If you passed the exam, let me know by sending me a comment.

Yours Always,

Ginger Grant

Data aficionado et SQL Raconteur

Database Table Design

There are a number of different ways that you could decide to organize your data in a database. If you are creating a database to be used in a transactional system, your table design should follow a normalized design as much as possible. Data should be grouped in logical groups, such as customers, products, sales, orders, quotes, tickets, etc. Redundantly repeating data in multiple places will cause problems in the future. Your design may include hundreds of tables, and that is perfectly fine.

If the purpose of the database is for Power BI or for a data warehouse, dimensional modeling techniques should be employed. In this type of database design, generally speaking, there are a number of tables containing descriptive data, such as product and customer, and a few fact tables which contain the actions which happened. The actions include things like sales. The database design will look much like a star, with the fact tables in the center and the dimension tables connected to them like satellites. If you have one dimension table connected to another dimension, that design is called a snowflake, and some applications, like Analysis Services Multidimensional, will not process it well. Power BI and Analysis Services Tabular work very well with snowflake dimensions.

Snowflake Data Model from Power BI

 

Table Design Gone Wrong

To paraphrase Ron White, the reason I described database modeling is so that even people who know nothing about database design can appreciate my interview story. When I was working at a previous location, I assisted in providing technical reviews for database developer jobs. We asked a number of typical questions about indexing and stored procedures, but I always tried to come up with at least one question which the candidate could not readily answer by cramming interview questions found on the internet. I decided to ask one candidate, who had correctly answered the previous stock questions, something that would let us know what kind of work he had really done. I asked him, “What do you do to determine how to design a table?” I was interested to find out what his thought process was, and to see if he would mention normal form or describe something he had done in the past. I was completely surprised by his answer.

“Well, you can only have 256 columns in a table. After that you have to create a new one.” This answer was a complete surprise. I was really curious to find out where he had developed this completely warped view of how to determine what fields should go in a table. It turned out that he had learned all of his database skills from a co-worker who had recently retired. His co-worker had worked at the same location for a very long time, starting back when it ran mainframes without any databases. He had migrated some of the applications to databases, and they wrote them this way because it “made sense”. After that, the interview was over, and we hired someone else.

I challenge anyone who is learning databases to look up what people tell you to do on the internet. This is useful for two reasons. The first is that reading about a concept explained another way will probably help you learn it better. The other is that you can find out whether the person teaching you really knows what they are doing, so you will learn the correct way to do something.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

 

Why Developers Should Not Deploy Their Own Code

Code is a very expensive business asset, and needs to be treated that way. The code needs to be stored in a source control application in a secured, well-known location, and the process to release it to production needs to be documented and understood. Like backups, it’s important to ensure that code in source control can be modified and installed before a crisis arises where there is a time crunch to fix a huge production issue. To ensure that the code stored in source control remains the expensive, important business asset that is counted on to make the business operate every day, the code needs to be deployed by someone other than the person who wrote it.

Save Money by Validating Code in Source Control

When I first worked in a location which had another team deploy code, I thought it was pointless bureaucracy which did nothing but slow down progress. Watching the problems caused by simple processes which went wrong changed my mind. Checking code in and out of source control is a simple process, whether you are using an open source application like Subversion or have a full-blown TFS Server. If no one checks that the code in source control is the code which is deployed, all sorts of bad things can and do happen. Being the poor slob who came in when everything was a mess and got stuck figuring out some old code was made even worse when I found out that the code in source control was not the code in production, which lived in an area I didn’t have access to view. Unfortunately for me, that discovery did not occur until after I’d changed what I thought was the released code. Writing the code twice and/or going on a code hunt for the right version became a necessary part of the process, adding needless hours to an already complicated task. If only the code in production had been deployed from source control, this mess would have been avoided.

Improving Code Quality

All sorts of things can happen when one person both writes and deploys the code. I know someone who worked in the IT department for a large cell phone company. At the time, working there meant free phone service. One of the devs was a heavy user of the free phone service, and so was his large extended family. His job was to maintain the billing code. After several questionable incidents at work, HR got involved and he was perp-walked out of the building. Due to the circumstances surrounding his departure, his cell phone accounts were checked to ensure that from that point on, he would get a bill. Although his account showed a number of active phones, his balance was always zero. The code in source control was checked, and there was nothing in it which provided a reason why his bill was zero. Upon further investigation, my friend noticed that the version number in production did not match the version number in source control. The code in source control was compiled, and a huge balance appeared for the former employee. If someone else had deployed the code in source control, this chicanery would not have been possible.

Code Deployment Needs to be a Well-Understood Process

Today in many companies, the code may exist a lot longer than the employment of the person who wrote it. Given the life of the code, there need to be well-established, obvious processes to deploy it. Recently I heard from someone who told me about their SQL Server 2012 SSIS project which used package deployment instead of project deployment because only some of the SSIS packages are deployed to production. The packages are installed in many different locations, and they all exist in one project. This project organization turns a simple one-button deployment task into an involved process requiring copiously maintained documentation to ensure that everyone involved knows what to do and where to deploy which code. Most ETL code runs at night, and oftentimes that means a person on call is woken up to fix it. This tired person’s job is made harder when the code deployment moves from a straightforward, one-button deploy process to a byzantine location determined by copious documentation. I can see many potential errors, all of which would be avoided if the organization were changed from one SSIS project containing everything to projects containing locally grouped packages which are created and deployed via a project to folders in an Integration Services Catalog. If the person who developed this project had to explain and document the process they were using to another person who was doing the deployment, chances are this kind of project organization would be exposed like a Sooky Non-Sparkly Vampire to sunlight, and would be burned to ash.

Ensuring the code is in source control and can be modified and moved to production are important steps in maintaining code. Whether that code is a stored proc or a web service is not important; securing it is. Having someone other than the developer deploy the code to production ensures that this valuable asset is truly protected and can live on as long as the company needs it.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Asking for Help

When I was a kid, I liked to climb trees. And there was a time or two when I climbed up pretty high, and then got too scared to come down. The way I came up looked more dangerous when I was trying to come down than it did going up. I panicked, said I could never come down, and my sister went and got my mom, who talked me out of the tree. This blog is proof that I was wrong. With help, I came down. With the clarity that often comes with youth, my sister later told me that I was being stupid. If I had just tried harder and not panicked, I could have come down by myself. While I didn’t appreciate her directness at the time, she was right. I could have helped myself, and probably should have, that time. But there are also times when I should have asked for help, but I didn’t feel comfortable asking, so I wasted a lot of time trying to figure out things that a phone call would have cleared up in an instant. I like to think that I have gotten better at knowing when to ask and when to figure it out on my own. There is a wide body of knowledge available via search engines to answer a ton of questions. Also, I am very fortunate to know people who, when I have asked for help, have literally forgone sleep to help me out. These resources have been invaluable when I have been stuck in a virtual tree, facing a problem I don’t know how to solve.

The Lonely Leading Edge of Technology

Recently there have been a number of new releases of software. Whenever this happens, answers are sparse because people haven’t had a chance to accumulate a large body of knowledge. One reason the internet is such a great place to find answers is that other people ask the same questions I have and post the questions and answers, either on forums or in blog posts. I know I have written a few blog posts after finding the answers to questions I had. I am happy to share what I know, as a way of paying back for all of the help I have received. When software is newly released, chances are the answers are very difficult or nearly impossible to find. There are few people to ask, and the internet comes back empty. This is a problem we all can fix, starting with me.

Call for Answers

Recently I have been working with some new features of SQL Server 2016 and have had questions for which blogs, TechNet, and Stack Overflow provided no answers on the internet. Fortunately, I have found people to help me resolve them. If you go searching for the same errors I had, you will find answers now, as I have posted them. If you have had a problem unique to the latest release of SQL Server, I hope you will take the time to post the question, and the answer if you have it. I’m going to try to be better at answering forum questions, especially now that I have learned a few interesting factoids. I am looking forward to the fact that next time I go looking for an answer, thanks to all of us who have done the same, we can all help each other out. The next person who finds themselves in the same jam will thank you for talking them out of the tree.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

The Five Key Components for a Successful Data Analysis Project

As a consultant, I have been a part of some very successful implementations of Data Analysis Projects. I have also been brought in to fix implementations which didn’t go so well. Having been on a number of different projects, I have seen certain common components emerge. When these things are addressed early in the project, things go well. These items really cannot be put off to the end or ignored completely. They are not related to the software used in data analysis; no matter what tool is selected, the solution will be incomplete if the following five areas are not addressed.

5 Components Every Project Should Include

Security

Business Continuity

Reliability

Distribution

Management Direction

Each of these items is an important component of a successful Data Analytics Management practice. Problems with any of them can turn a successful project into a failed one.

Security

Security is an obvious consideration which needs to be addressed up front. Data is a very valuable commodity, and only people with appropriate access should be allowed to see it. What steps are going to be employed to ensure that happens? How much administration is going to be required to implement it? These questions need to be answered up front.

Business Continuity

Business Continuity means the solution cannot rest on the shoulders of one person, as that is a risky situation. Everyone needs a break to go on vacation or may stop working, so there needs to be a backup who is skilled and able to understand the system and run it alone. This can be a really big problem, especially for smaller organizations which have relied on only one person. I have been brought in to assist companies who until very recently thought that they had a very successful Data Analytics Platform. The problem with it was that there was only one person who had the skill to maintain it, and that person quit.

Business Continuity can be a specific problem for Power BI users, as oftentimes one user owns a report. Reports for an organization should never be owned by one person. All companies using Power BI should have their reports in a series of group workspaces, not belonging to any single person. That way, if the person writing the report quits and their account is deleted, the reports are not deleted as well.

Reliability

Reliability is critical, because who cares about a data analysis project if no one believes the data is correct? In addition to accuracy, the system used needs to be reliable, receiving data updates on a scheduled basis. How and when is the data going to be updated? How is that schedule communicated? The answers to these questions need to be addressed at the beginning of the project. Monitoring and regulating the data loads is key to stability, as the lack of it could result in a full-fledged data crisis. If there are not enough personnel to fill in, monitor, and regulate the data, engaging reputable outside services to manage the environment (for example, Office 365 services if Excel or other Microsoft applications are used) could ensure appropriate data management.

I remember working for one client who showed over a 100-million-dollar loss in a month on a visualization we created. I asked if the data was correct, as that was a huge one-month loss. I was assured that the data was not correct, but no one knew how to resolve the data issue. The reporting tool, whatever it happens to be, is not the place where data is fixed; it should reflect the contents of the source data. Where this rule is not followed, the reports are ignored, as the data is suspect and no one knows why it should be believed when it doesn’t match the source system. How is the source system data going to be fixed? This is oftentimes a management issue, as people need to be appropriately incentivized to fix things.

Management Direction

All data analysis needs Management Direction to set priorities. As there are only so many hours in a day, the important items need to be identified so that they can be addressed. What is important? Everything cannot be a number one priority, because that means nothing is. In many data analytics projects, someone wants a dashboard. Great idea. What numbers show whether or not the company is currently successful? In most companies where I am helping to create a data analysis project, the answer to “What are the Key Performance Indicators [KPIs]?” is that no one has made up their mind yet. Management needs to provide the direction for the KPIs.

Distribution

How are people going to get their hands on whatever it is that you just created? What are people looking for? Reports in their email? Visualizations on their phones? What do people want? Only if you ask the question do you know whether you are providing the data in a way people want to consume it. In order to pick the most appropriate tool or design visualizations people are actually going to use, these questions need to be asked up front. Recently I worked for a client who had selected Tableau as their reporting solution, but they were getting rid of it. Why? The users wanted to do ad-hoc analysis in Excel, so they were using Tableau not to visualize their data or do ad-hoc analysis, but to select data for Excel Pivot Tables. A lot of money and time would have been saved if the question of how the users wanted to use the data had been asked up front.

Hopefully all of your data analysis projects include these components. In today’s environment, data is the new gold. This valuable commodity needs a system which is Reliable, Secure, and important to Management, and which can be Distributed to continually provide value to the organization.

Yours Always,

Ginger Grant

Data aficionado et SQL Raconteur

Data Platform MVP

I am very excited to be able to announce that Microsoft has made me a Data Platform MVP. This is a big thrill. The right words escape me, so I will have to make do with these.

If I Only had a Brain

I love this song from the Wizard of Oz. Unfortunately, the scarecrow never gets a brain; instead he gets an honorary degree. I wish having an MVP award would make me smarter, but unfortunately, it does not do that. Frankly, it means I am in very intimidating mental company, as when I wrote this there were only 370 Data Platform MVPs. Most likely I need to learn a lot more, and maybe write a book, so I can keep up.

One thing I do try to do is share what I know by blogging and speaking, if for no other reason than I don’t want to be a hypocrite. When I was learning SSIS, the person leading the project was tuning SSIS and he would not show me how. He obfuscated, and made SSIS tuning out to be wizardry. I thought to myself at the time that he should tell me what he knew, as that is what I would do. Later I found out the rules, and gave a few talks about SSIS, including one for the PASS Data Warehousing and Business Intelligence Virtual Chapter which was recorded here. If I learn something, I want to tell other people, which is why I blog and speak. I think this is the greatest profession in the world, and I feel bad for people who have chosen to do something other than data platform work, as they are missing out.

Keeping Up

There is a ton of new technology to learn, with more coming all the time. I keep up as much as I can, and when I do learn something, I tend to blog or speak about it. If you subscribe to this blog or follow me on twitter, hopefully keeping up will be easier. I don’t want Microsoft to think that they made a mistake, so I plan on trying to increase the number of blog posts and to speak when I am afforded the chance.

SQL Saturday Phoenix

I wanted to make sure to talk about the next place I will be speaking, SQL Saturday Phoenix, the largest data-related technology event in the state of Arizona. I know it is going to be a great event thanks to Joe Barth and the rest of us on the organizing committee who have volunteered to make it happen. The Arizona SQL Server Users Group was where I learned about the SQL Server community and where I really got motivated to start learning, and I am happy to be a part of it. I hope to see you there.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Who Do You Work For?

“Who do you work for?” seems like an obvious question; after all, you work for Company X or yourself, but is that really the answer? I recently read an interesting blog post from Mike Fal b | t, who just started a new job and talked about the things he finds important when selecting a position. After reading his post, I thought about a comment I heard about working which has stayed in my head ever since.

You work for your Immediate Manager

A few jobs ago, I was working for a company which was purchased by another company. Changes were coming, but they hadn’t happened yet. I was working for Tyler, who was soon not going to be my manager. He knew I would not be working for him much longer, and at that time I didn’t know I would be getting a new manager. We had a conversation about the upcoming changes where Tyler told me that you really don’t work for Company X, you work for your immediate boss. He’s right. After all, working for a company is one thing, but where the rubber meets the road is the person who directs what you do during the day. The ability to make your life miserable or make you happy to come to work comes from your supervisor, not from the company. One person’s influence is a much smaller picture than Company X, more immediate and more intense. When Tyler was no longer my manager, I realized how right he was. I didn’t think much of the new manager and left.

How do you Determine Where to Work?

Because people are such a large part of the working environment, a change in management is a big deal in determining whether you want to stay or not. It also explains why two people who work for Company X may have two different perspectives, especially if it is a large company. A friend of mine quit Company Z, which is a large company that continually gets very high marks for being a great place to work. Employee surveys continually rank it near the top of several best-companies-to-work-for lists. He quit because he didn’t like his manager. He thought a number of people we knew in common were great, but that couldn’t overcome his bad manager.

Weighing the Criteria

When management is not a consideration, the criteria change from people to tasks. Quality of work, ability to learn and apply new skills, career advancement, monetary compensation, working environment, and scheduling are all important considerations. Since one is rarely able to really answer the management question prior to being in a position, these tangible criteria are the only things one can use to decide where to work. Many times, though, this information isn’t enough, and you only find out after you make a decision whether it was the right one.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Non-technical Issues Impacting Data Based Decision Making

Having worked with a number of clients to implement Power BI in their respective environments, I noticed that one factor appeared to be common to all. The success of the project depended greatly upon the relationship between the business analysts and the database team. Since this seems to be an issue which greatly impacts the ability to implement Data Based Decision Making, I decided to talk about it in my recent webinar for the PASS BA Marathon. Too often I see companies which decide to join data together in an analytics platform, such as Power BI, and fail to take advantage of the separate skill sets in the organization. The data team has spent a considerable amount of effort and energy determining the best ways to combine datasets. Logically, one would assume that this expertise would be leveraged to help the business team analyze data. Instead, the business teams are tasked with joining data together. While this approach can work, it will take longer to train the business in areas with which they may not be familiar, and the results will be mixed, especially when considering scalability and maintenance needs over time. To leverage the capabilities of a self-service business tool (which tool doesn’t really matter, as the same issues exist in, for example, Tableau as well as Power BI), the data team needs to be engaged. The skills they have gathered over time allow them to design and plan a data model which can be refreshed automatically without causing issues.

Using Areas of Expertise

Business Analysts’ time is best spent using the unique skills they have gathered over time, too. Their familiarity with the data values allows them to determine at a glance how the business is doing. Codifying this knowledge into meaningful reports which can disseminate that information throughout the organization provides the basis for data based decision making. To be successful, they need a data model which contains all of the information they need and which is well documented, so that they can find the values required to provide meaningful data visualizations. Too often report generation is left to the data team, and many times there is a backlog of reporting items, as there are not enough resources to provide all of the information a business needs.

Team Collaboration

Data Based Decision Making should be an organizational goal, as it has been shown to be a major tool for business success. When the Data Team and Business Analysts work collaboratively, using their specialized skills to create and implement a solution, that solution will be successful. The result will be a model which provides a path for the Business Analyst to continue to use the data to answer everything from routine questions, such as “How successful was the business last month?”, to more obscure questions, such as “What happened to sales volumes after a bad story in the press?”. These and many other questions are answered using the model and tools like Power BI to implement an enterprise-wide solution.

Implementing Successful Data Analytics Management Practices

There is more to implementing a self-service BI tool such as Power BI than merely knowing how to make the tool work. A process and a commitment to work across teams are required as well. I enjoyed the opportunity to talk about integrating the tools with a company’s data management policies at the BA Marathon. If you would like to know more about this topic, please come join me at the PASS Business Analytics Conference in San Jose, May 2-4, as I will be going into more depth than was possible in the webinar.

 

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Data Access and Self-Service Business Intelligence

There is more to implementing Self-Service Business Intelligence than getting new software like Power BI; mindsets and practices also need to change. The data teams in many companies formed their policies based on history with previous technologies. One of those policies that is fraught with contention is letting users have access to the data in order to do their own analysis. The reasons for this are based on a story like this one. Like many a data professional, I worked at a company where we gave a team of users access to the database in order for them to do analysis. It was a replicated database, as we didn’t want to impact production. As these analysts’ primary skill was marketing, not SQL, they wrote a query that took all the resources, so no one had the ability to do anything else with the database, and we were required to intervene and kill the query to make the database useful again. After that, we changed their access to only being able to use views created for them, to prevent this from happening again. Variations of this story exist all over.

Data Access has changed and so has the need for a 64-bit OS

Self-Service BI is supposed to be a way for Analysts to answer ad-hoc questions from Management about the business. While data professionals certainly could and do answer these questions, at some point a focus line is drawn. If the primary focus is to determine the best way to write a query or implement an appropriate indexing scheme, this person has a technical focus and not a business focus. People with a business focus should probably be the ones who use data to drive decision making. While technical people can write reports very efficiently, given the continual requests for answers from the data, keeping up with what the business people want can be extremely difficult, as the number of reports required in various formats can be overwhelming. Just as the old argument that “You don’t need a 64-bit OS” has become obsolete, so have the reasons for not giving business users access to the data. Now is the time to give them access. If you only have a 32-bit operating system, you don’t have the memory needed to do much data analysis. Data Analysts need a 64-bit OS and access to the data.

What kind of Access should Analyst Have?

Most Analysts use Excel, which has become the de facto tool of choice for data analysis. One doesn’t need to have a working knowledge of the SQL language to analyze data, and the scenario referenced above still happens. Instead, data should be provided in a manner which is easy to consume in a Pivot Table, allowing users to select, sort, and filter the data at will. Analysis Services cubes, whether they are tabular or multidimensional, provide this capability. Using a cube in an Excel spreadsheet has very little chance of ever crashing a server, so go ahead and grant access. Give analysts the tools they need to provide the answers they need, and create a collaborative environment to grant that access. In this kind of environment, true data based decision making can really happen.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Lessons Learned About Speaking

As an attendee at PASS Summit, I had the opportunity to learn about a wide variety of topics, including public speaking. I’ll be devoting other blogs to the great technical things I learned, but I thought I would start by talking about the sessions in general. I saw a number of presentations, some of which went well, others of which were beset by technical difficulties. By far the best talk I saw was the keynote with Rimma Nehme and David DeWitt of Microsoft. The presentation was well rehearsed without sounding canned, and the slides were absolutely amazing. You can check out the slides here, as they are publicly available. I am going to remember what made this talk work, and I hope to incorporate what I saw when I speak next. If you are interested in where that will be, check out my Engagement page, as maybe we can meet sometime.

Speaking Techniques

I saw a number of different speaking techniques employed at Speaker Idol. People were really creative. Todd Kleinhans navigated through a game interface. Wes Springbob did an homage to Hitchhiker’s Guide to the Galaxy. By the way, if you haven’t read the series, I think you should, as they are great books. I was surprised that not all of the judges had read the books, but even those who hadn’t thought he gave a great talk. I demonstrated that I had never used a microphone before, which was not positive. Bill Wolf worked to engage the audience throughout his talk. Ed Watson videotaped his demo. This is something I have often heard you should do in case your demo crashes, but this was the first time I have seen anyone actually record the demo. William Durkin brought great stage presence, which I noticed was a common theme among all of the talks I liked. Effective presenters know their topic so well that the talk appears effortless and fun, without seeming like a memorized script being run through. Also, a talk needs a clear point to follow, so that midway through I still remember what it is about. Everyone who did this, I thought, did a great job.

Speaker Idol Results

The finalists for Speaker Idol were William Durkin b | t, Theresa Iserman t, and David Maxwell b | t. My name was not among them, due to my issues with the microphone, which put me off my game. Also, despite my goal of not adding useless words, I threw many an “um” and “so” into my talk. In my round, William did the best job, so it was logical that he went forward. I talked to David and Theresa about their respective talks, and I know they put in a lot of work and practice to make them really good. David was the winner, so I look forward to seeing him at PASS Summit 2016 giving the talk of his choice. As for me, I hope to follow the pattern of fellow Speaker Idol 2014 non-winner Reeves Smith b | t, who spoke at PASS Summit for the first time this year the old-fashioned way: by picking a good topic and writing a good abstract for it.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur