My apologies to Jack Johnson for lifting his song title for this blog, but I couldn’t help it. I thought that it might be useful to discuss the how-tos of data migration to the cloud, which reminded me of all the cloud songs I know. I always thought the cloud metaphor was sort of an odd name for a remote data center, but that is probably why I am not in the business of naming things. But I digress. Back to the topic at hand. Moving data with SSIS to databases hosted on Microsoft’s Azure cloud requires some steps beyond those needed to move data around on premises. Since things on the cloud change quite quickly, I thought I would demonstrate what this currently looks like using SSDT in Visual Studio 2012.
Breaking through the Wall
Before getting started with SSDT, there are some configuration steps one needs to complete on Azure first. The Windows Azure SQL database firewall must first be set up to allow connections from your IP, or your data will never get to the cloud. There is an option on the database screen to “Connect to your database”. If the firewall is not opened on the IP and port in use, you will get this message.
You need to select Yes to be able to connect to the Azure database via SSIS.
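Azure SQL Database firewall rules are defined as a start and end IP address, so a quick sanity check is whether your client address falls inside a rule's range. Here is a minimal sketch of that check in Python. The function name and the addresses are hypothetical (they come from the ranges reserved for documentation), and it only compares addresses; it does not talk to Azure.

```python
import ipaddress

def ip_in_firewall_rule(client_ip: str, start_ip: str, end_ip: str) -> bool:
    """Return True if client_ip lies inside the inclusive start/end range."""
    client = ipaddress.ip_address(client_ip)
    start = ipaddress.ip_address(start_ip)
    end = ipaddress.ip_address(end_ip)
    return start <= client <= end

# Hypothetical rule and client addresses, for illustration only.
print(ip_in_firewall_rule("203.0.113.25", "203.0.113.0", "203.0.113.255"))  # True
print(ip_in_firewall_rule("198.51.100.7", "203.0.113.0", "203.0.113.255"))  # False
```

If the check comes back False for the address your connection actually leaves from, the firewall prompt described above is what you should expect to see.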
You may also want to validate that the right drivers are installed for SSDT. If you are running Visual Studio 2012 or 2013, no worries, as the drivers are already there, but earlier versions may require new drivers. If you are planning on loading data to an Azure SQL database, ODBC and ADO.NET are the connection types to use. The old data connection standby, OLE DB, is going to be left in the toolbox like last year’s dress, as it won’t work for the cloud. Much like fashion, everything old is new again, so ODBC is once again the “It” connection. You can use ADO.NET too, but I won’t be doing so here.
The next step in the process is getting the connection information needed to connect to the Azure database. Microsoft made this step quite easy. Look on the Azure Database screen where I’ve pasted a pink arrow. Click there.
This makes it so easy as a screen pops up with all the connection information you need. You just need to copy the ODBC section, and remember what your password is as you will need to enter it.
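For readers who want to see the general shape of such a string, here is a small Python sketch that assembles one. Everything in it is an assumption for illustration: the function name, the driver name, the key names, and the server, database, and user values are placeholders, and the string the Azure screen gives you is the one to actually use.

```python
def build_azure_odbc_connection_string(
    server: str,
    database: str,
    user: str,
    password: str,
    driver: str = "SQL Server Native Client 11.0",  # assumed driver name
) -> str:
    """Assemble an ODBC-style connection string from its parts."""
    # ODBC wraps values containing special characters in braces;
    # a literal closing brace inside the value is doubled.
    escaped_password = password.replace("}", "}}")
    parts = [
        ("Driver", "{" + driver + "}"),
        ("Server", f"tcp:{server}.database.windows.net,1433"),
        ("Database", database),
        ("Uid", f"{user}@{server}"),
        ("Pwd", "{" + escaped_password + "}"),
        ("Encrypt", "yes"),
        ("Connection Timeout", "30"),
    ]
    return ";".join(f"{key}={value}" for key, value in parts) + ";"

# Hypothetical values; substitute what the Azure screen shows you.
print(build_azure_odbc_connection_string("myserver", "mydb", "loader", "secret"))
```

The password is the one piece the Azure screen leaves for you to fill in, which is why it is a separate argument here.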
While we are still in Azure, I thought it would be a good idea to display where the SSIS package we will be creating will be putting the data. Here’s the table structure where the data will be placed.
Here’s the query screen showing that right now the table is empty.
SSIS Package Transferring Data to Azure
After you have all of the information you need from Azure, it is relatively simple to create an SSIS package. Mine uses an OLE DB connection for my on-premises database and an ODBC connection to Azure, built from the information copied from the Azure database connection screen, to transfer data to my Azure database.
Going back to Azure, you can see 19,972 rows were added.
One word of caution: as you can see here in the progress log, adding data can be a very slow process.
I highlighted the Elapsed time in red so that it would be easy to see that a simple file transfer took over two minutes.
Location, Location
One important thing to consider is where you are going to be moving your data. I demonstrated what I think may be the more common scenario, where the data is not on the cloud and you want to move it to the cloud. Microsoft refers to this as Hybrid Data Movement. Of course, this may not be your situation. If you are running SQL Server on a virtual machine in the cloud, it may make a lot more sense to run SSIS on that virtual machine. If that is the case, for optimal performance, locate SSIS in a VM in the same data center as the database, because otherwise bandwidth limits and network latency will slow things down.
When transmitting data around the cloud, whether from on premises to the cloud or from one cloud server to another, you might want to consider compressing the data before sending it, if at all possible, to decrease the amount of data being transmitted. It may be faster to extract the data you want on premises and transmit a compressed file to be applied on the cloud server. This can get more complicated, as it requires setting up a Secure FTP server to transmit the files, which then have to be applied. SSIS doesn’t have a native compression tool, but there are third-party products, such as Task Factory, which will allow you not only to compress the output but also to send it to your VM via Secure FTP from within the SSIS package.
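To get a feel for how much compression can shrink an extract before it goes over the wire, here is a minimal Python sketch that writes rows to CSV in memory and gzips the result. It runs outside SSIS and is not how Task Factory works; the function name and the sample rows are made up for illustration, and real data will compress more or less than this repetitive sample.

```python
import csv
import gzip
import io

def compress_rows_to_gzip_csv(rows, header):
    """Return (raw_csv_bytes, gzipped_bytes) for the given rows."""
    text_buffer = io.StringIO()
    writer = csv.writer(text_buffer)
    writer.writerow(header)
    writer.writerows(rows)
    raw = text_buffer.getvalue().encode("utf-8")
    return raw, gzip.compress(raw)

# Hypothetical sample data with the same row count as the transfer above.
rows = [(i, f"Customer {i % 50}", "City A") for i in range(19972)]
raw, packed = compress_rows_to_gzip_csv(rows, ["Id", "Name", "City"])
print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes")

# The receiving side restores the original bytes before loading them.
assert gzip.decompress(packed) == raw
```

Comparing the two byte counts for a sample of your own extract is a cheap way to decide whether the extra Secure FTP step is worth the trouble.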
Yours Always
Ginger Grant
Data aficionado et SQL Raconteur