While a number of applications are designed to implement Machine Learning, such as Azure Machine Learning, MATLAB and Octave, a specific package is not required to perform Machine Learning. The algorithms used in Machine Learning experiments can be applied in other languages, such as R.
Machine Learning Algorithms
Learning is often described as a method of applying rules to situations. “Don’t put your finger on the stove. The stove is hot and will burn you.” After being told about stoves, a child can extrapolate the rule to irons, fire and other hot things. Computers learn a little differently, by applying rules or algorithms to data to determine a result. A great example of this was the Kaggle competition to determine, from looking at a picture, whether it showed a cat or a dog. The computer reviewed a number of pictures that were labeled as either a cat or a dog, and then applied the rules it derived from them to pictures that were not labeled. The winning algorithm was right 98.914% of the time at identifying dogs and cats. Sorting pictures into groups is classification, one of the common functions used in Machine Learning. Other popular functions include anomaly detection, regression and clustering. Once experiments are created, a number of different methods can be used to determine their effectiveness, such as Receiver Operating Characteristic [ROC] graphs or a Confusion Matrix.
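To make the idea of a Confusion Matrix concrete, here is a minimal sketch in base R. It is only an illustration under assumptions of my own: it uses the built-in mtcars data set rather than pictures, treats the transmission type (am) as the two-class label, and scores the model on the same rows it was trained on, which overstates how well it would do on new data.

```r
# Built-in data set; 'am' is 0 (automatic) or 1 (manual).
# Using mtcars here is an illustrative assumption, not part of the original post.
data(mtcars)

# Fit a two-class logistic regression on weight and horsepower.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Turn predicted probabilities into class labels with a 0.5 cutoff.
prob <- predict(fit, type = "response")
predicted <- factor(ifelse(prob > 0.5, 1, 0), levels = c(0, 1))
actual <- factor(mtcars$am, levels = c(0, 1))

# The confusion matrix cross-tabulates predicted against actual labels.
confusion <- table(Predicted = predicted, Actual = actual)
print(confusion)

# Accuracy is the share of cases on the diagonal (correct predictions).
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

The diagonal cells are the correct predictions and the off-diagonal cells are the two kinds of mistakes, which is usually more informative than a single accuracy number.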
Determining which algorithm to use can often take a while. Here is a pretty good flowchart for determining which algorithm should be used, given some examples of the desired outcomes and what the data contains. The diagram lists the algorithms which are implemented in Azure ML. The same algorithms can be implemented in R, where there are libraries to help with nearly every task. Here’s a list of libraries and their accompanying links which can be used in Machine Learning. This list is by no means comprehensive, as there are libraries and functions other than the ones listed here, but if you are trying to write a Machine Learning experiment in R and are looking at the flowchart, these R functions and libraries will provide the tools to do the types of Machine Learning analysis listed.
Drawing ROC Curves – ROCR
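As a rough sketch of how ROCR can draw a ROC curve, the example below scores a logistic regression and plots true positive rate against false positive rate. It assumes the ROCR package has already been installed, and the mtcars data and the choice of predictors are my own illustrative assumptions.

```r
# Assumes install.packages("ROCR") has been run beforehand.
library(ROCR)

# Illustrative two-class problem: predict transmission type (am) in mtcars.
data(mtcars)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
scores <- predict(fit, type = "response")

# ROCR pairs the predicted scores with the true labels.
pred <- prediction(scores, mtcars$am)

# True positive rate against false positive rate gives the ROC curve.
roc <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(roc)
abline(a = 0, b = 1, lty = 2)  # diagonal = random guessing

# Area under the curve as a single summary number.
auc <- performance(pred, measure = "auc")@y.values[[1]]
print(auc)
```

A curve that hugs the top left corner, with an area close to 1, indicates a model that separates the two classes well.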
There is a really good list of all of the R regression functions here.
- Linear – lm()
- Poisson – glm() with family = poisson(link = "log")
- Fast Forest
- Decision Forest
- Boosted Decision Tree
- Two-Class Averaged Perceptron
- Two-Class Logistic Regression
- Two-Class Bayes Point Machine
- Two-Class Decision Forest
- Two-Class Decision Jungle
- Multi-Class Logistic Regression
- Multi-Class Decision Forest
- Multi-Class Decision Jungle
- One-vs-All Multiclass
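For the first two entries in the list, here is a minimal sketch of what the calls look like in base R. The data set (mtcars) and the choice of columns are assumptions made purely for illustration; the point is only the shape of the lm() and glm() calls.

```r
# Illustrative data; mtcars ships with base R.
data(mtcars)

# Linear regression: miles per gallon as a function of weight.
linear_fit <- lm(mpg ~ wt, data = mtcars)
print(coef(linear_fit))

# Poisson regression: 'carb' (number of carburetors) is a count,
# so it is modeled with the Poisson family and a log link.
poisson_fit <- glm(carb ~ hp, data = mtcars,
                   family = poisson(link = "log"))
print(coef(poisson_fit))

# Predictions for a hypothetical car. type = "response" returns the
# expected count itself rather than its logarithm.
new_car <- data.frame(wt = 3, hp = 150)
expected_mpg <- predict(linear_fit, newdata = new_car)
expected_carbs <- predict(poisson_fit, newdata = new_car, type = "response")
print(expected_mpg)
print(expected_carbs)
```

Note that the log link is set through the family argument of glm() rather than as a separate argument.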
Applied Machine Learning
Hopefully this list of R libraries will help you apply Machine Learning to data within R. To see how R can be used in Machine Learning, please join me for my upcoming webinar on Machine Learning with R and SQL Server 2016, where I will show how an R program can be created and applied to a production environment.
Data aficionado et SQL Raconteur