Recently, I did a webinar on Machine Learning and R. I received a number of questions during the presentation. Due to time constraints, I was unable to answer all of them, so I have provided the Question and Answers here.
Question: Can I Use R in SQL Server to plot non-linear regression curves? We use IC50 and others in Michaelis-Menten kinetics for bio-chemical work.
R running on SQL Server provides the functionality of standard CRAN R packages with the additional capability to run the SCALER functions provided by SQL Server’s implementation of R. Any other functionality performed in R can therefore also be performed on SQL Server. Like all R code, you will need to install the required R libraries in the appropriate subdirectory. The directory is something like C:\Program Files\Microsoft SQL Server\MSSQL13.<instancename>\R_SERVICES\library
Question: Can you execute/invoke an external R script from T-SQL?
Yes. With SQL Server 2016 it is possible to run an R script from within T-SQL by using the sp_execute_external_script command. This link should help you with more detailed information on how to execute R scripts within T-SQL
Question: Is it possible to run R processes in diffrent boxes other than SQL Server itself for scalability reasons?
You have the option of installing the R Server on another server. Just keep in mind that you do have to account for the additional overhead of moving all the data over the network, which needs to weigh in on your decision to move processing to a different server.
Question: Can we join data generated from an R script to SQL data directly or does it have to be inserted into a table first?
Data generated from an R script interacts with the data the same way other data in a stored procedure does. It is not necessary to write the data to a table for the data to be processed.
Question: When would you use R Server versus R on SQL Server?
R on SQL server is an implementation of R server. When running R on SQL Server, R Server is running.
Question: What is the maximum file size I can load on R server?
R Server uses SCALR which provides the ability to not only use memory but also use disk storage, providing nearly limitless capability for file size processing.
Question: Is knitr integrated or can it be integrated into R for SQL Server?
While you can install the knitr library on SQL Server to implement the functionality provided, there is no direct integration path for incorporating the wide functionality knitr provides within SQL Server. For more information about knitr, please see this link.
Question: Does MS R Open have all the functionalities of CRAN R? Does it support all packages?
MS R Open is fully CRAN R compliant. All code which runs in CRAN R can run in MS R Open and all packages are supported. Since MS R Open was rewritten to use the Intel Math Kernel Libraries, it provides technical advantages to CRAN R. MS R Open is partially Multi-threaded, instead of single threaded like CRAN R, and is up to 38% faster
Question: Do you have a link for the Microsoft Machine learning cheat sheet?
Sure. The link can be found here.
Question: Why Use Microsoft R Open over other Analytics Tools such as Open R, Python, Matlab?
Microsoft R Open is fully CRAN R compliant, and also multi-threaded and faster, providing a clear benefit over CRAN R. Python is a great tool for data analytics, but unlike R, it is not designed solely for statistically analysis but has a wider functionality scope. R is focused solely on providing statistical data analysis. Matlab is a great tool, but given its complicated user interface and high licensing cost, many users may gravitate towards R, as it is free and there is a lot of good support for learning R available online as well.
Question: Can R be used with older versions of SQL Server or only 2016?
While it is possible to use R with other versions of SQL Server using tools such as RODBC, the R Server and full integration of R in SQL Server is only available in SQL Server 2016.
Question: Do I need all R server and R client and Microsoft R Open to be installed to run R from SQL Server?
To run R on SQL Server, R Server needs to be installed. For more information on how to install R on SQL Server please see this link. To connect to the server and use the SCALR functionality, the client machine will need to have R Client installed.
Question: What are the benefits of Microsoft Standalone R Server?
R Standalone Server is available to connect to data stored in HDFS and Teradata in addition to SQL Server. To incorporate data from other data sources and provide the ability to process more data than one has memory, it might make a lot of sense to deploy a standalone R Server.
Question: Are there any memory limitations in R when handling big data?
R is designed to run in memory, meaning if you have more data than memory you may run out of memory. When running using the SCALR functions, memory and disk are used to provide nearly limitless ability to process data.
Question: Can one call R from Python?
While it may be possible to create Python code which calls R, I am unaware of the syntax. This functionality would not be supported in the context of SQL Server.
Question: Can you use R charts in SSRS?
There are two different ways to incorporate R charts in SQL Server. Since a stored procedure can provide and R visualization, it is possible to use SSRS to call a stored procedure which would create a graphic to be rendered in SSRS. Also since Microsoft announced the incorporation of Power BI with SSRS, and Power BI provides the ability to create R graphics, incorporating Power BI which renders R would be another way to use charges in SSRS
Question: Any classes like “R for the nervous DBA” that you can recommend? I have a statistician who is running R on my SQL server.
I do not know of any R classes specifically designed for SQL Server DBAs, but I am working on a post for additional configuration information which you might find helpful. Please check back in a few weeks for updates.
Question: What is a Pirate’s favorite language?
RRRRRR Mad Props to Phillip for sending me a pirate joke. I love pirate jokes. I feel compelled to tell a few when I am presenting R and I have been told all of mine are really bad.
If you have any additional questions about SQL Server and R or any pirate jokes, please comment on this post or send me a message via twitter @desertislesql.
Data aficionado et SQL Raconteur