Articles for the Month of March 2018

Limitations in Time Series Data Analysis and the growth of Advanced Data Analytics

As someone who regularly analyzes data, I have done my share of time series analysis to determine trends over time.  I am struck by the fallibility of this sort of analysis.  For those who are unfamiliar with this time of analysis, time series analysis is performed to try to identify patters in the noise of data to help predict future trends through the use of algorithms like ARIMA. As I kid I remember hearing in fast announcer voice the following text “Past performance is no indication of future results”.  As a matter of fact this is a rule that the SEC requires mutual funds to tell all of their investors this statement.  Yet I get asked to do it anyway.  While I enjoy working with data and using advance analysis techniques including R and Python, I think it is important to realize the limitations of this sort of analysis.  It is considered a good experiment in Machine Learning if you are 85% right.  This is not acceptable if you are talking about a self-driving car as running people over 15% of the time is generally not considered acceptable. There are times when looking at the future that the data is not always going to provide an answer.  When looking to find answers in data, that needs to be something people keep in mind.  While you can find some answers in data, other answers will require prognostication or plan old guessing.

Impact on Technology realized in 2026 analysis is all about pattern matching, and while I don’t find it to be infallible, looking at a wideset of data has led me to plan accordingly.  While I am no Faith Popcorn, my analysis of what I see in the marketplace has led me to make some changes in my own life as I believe change is coming to the industry.  Adam Mechanic’s prompting for looking ahead to 2016 has provided the impetus to publish these theories.  What I see in the marketplace is the tools which are used to support databases are improving.  I see the ability of software to provide relevant hints and automate tuning of database queries and performance to continually improve, meaning there will be less of a need to employ people to perform this task.  I see with databases being pushed more and more to the cloud and managed services less and less need to employ many people to perform dba roles. Where I see the industry moving is towards more people being employed in analyzing the data to determine meaning from it.  I see that in 2026 very little data analysis being performed with R and most analysis being performed in Python.  This means that if you are looking ahead, and are employed in areas where people are being supplemented with tools, the time is now to learn skills in areas where there is growth. If you have been thinking about learning data science, Python and advanced analytics tools now is the time to start so that you will be prepared for the future.


Yours Always,

Ginger Grant

Data aficionado et SQL Raconteur