Articles

T-SQL Tuesday #80 – SSIS Projects, Packages, and Deployments

Recently I was talking to someone who is just getting started with SSIS and wanted to know about the package deployment model. I hadn’t thought about the package deployment model in a while. TL;DR: Do not use package deployment for SSIS. Why? Because SSIS code deployed in the project model is easier to maintain and migrate. Maintaining a large number of packages can be painful, something I know all too well.

Making Maintenance Difficult One Package at a Time

Prior to SQL Server 2012, there was no project deployment. SSIS code was all deployed as packages, which could be stored within MSDB or stored on and run from the file system. In disorganized places like the one where I worked, they were deployed both ways. Assuming nothing much has changed since I left, they have every version of SQL Server released prior to the day the new IT Director started in 2012: DTS on SQL Server 7.0 and 2000, and SSIS on 2005, 2008 and 2008 R2. No reason to upgrade anything which still worked was their motto. When space was a problem, one could always go build another server. I think the LAN administrator was happiest when he could justify building a new server, as he could spend hours shopping for parts on the internet and assembling the latest hardware.

I was given the task of supporting all of the SSIS code, which of course broke periodically. There were 300 packages on a myriad of different servers, all named package1, and deployed every way possible. As a bonus, one could not trust the open-source version control system to have the latest code unless it was a package I had worked on previously. Hunting down the code and the config file used to drive it, and getting access to wherever they were stored, was just part of the maintenance process. If one package called another package, then both packages needed to be found and reviewed. Each of those packages would have different ways of connecting to the same database too. My favorite was when one package called another package and they each used different IDs to access the exact same database. Ah, the joys of troubleshooting SSIS packages. It was in the top five list of reasons I was very happy when I quit that job.

SSIS Projects

If you write SSIS code and use the project deployment model, you can create one data connection for all the packages which need one. The code is deployed to one place, the Integration Services Catalog, and all the related code is deployed to one folder. If you need to change a connection which all the packages use, you can do it in one place. You want to pass some parameters for all of the packages to use? No problem. SSIS project deployment offers some great advantages. I cannot think of any reason to use package deployment for SSIS 2012 and beyond. If you are writing SSIS code, project deployment is how you want to deploy packages.
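Deployment to the catalog can even be scripted. Here is a minimal T-SQL sketch, assuming a compiled project file at a made-up path with made-up folder and project names; catalog.create_folder and catalog.deploy_project are the stored procedures SSISDB provides for this.

-- Minimal sketch: deploy a compiled project (.ispac) to the SSIS catalog
-- The file path, folder name and project name are illustrative placeholders
DECLARE @ProjectBinary varbinary(max) =
    (SELECT BulkColumn
     FROM OPENROWSET(BULK N'C:\Deploy\ETL.ispac', SINGLE_BLOB) AS ispac);
DECLARE @OperationId bigint;

EXEC SSISDB.catalog.create_folder @folder_name = N'ETLProjects';

EXEC SSISDB.catalog.deploy_project
     @folder_name    = N'ETLProjects',
     @project_name   = N'ETL',
     @project_stream = @ProjectBinary,
     @operation_id   = @OperationId OUTPUT;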


Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Creating New SSRS and SSIS Projects for SQL Server 2016

Now that SQL Server 2016 has been released, it is time to start creating new SSIS and SSRS projects for it. Since SQL Server 2014, SSIS development has been done in Visual Studio. The latest version, Visual Studio 2015, has a free Community edition which can be found here. If you have it installed and try to create a new Reporting or Integration Services project, you will notice that there are no templates listed which will allow you to create one of these projects.

Making SSIS and SSRS Projects for SQL Server 2016

To create SSIS projects in Visual Studio, you will need to click on this link to download the SQL Server Data Tools [SSDT] in the language of your choice. Visual Studio must not be running during the install. When the install completes, after about 5 minutes, you will have a new application installed, SQL Server Data Tools 2015. You will still have the Visual Studio 2015 application as well, providing two methods for creating new packages: you can click on the SSDT icon instead of opening up Visual Studio. SSDT also contains the templates for database projects, so you can now create those in Visual Studio as well.

Creating SSIS and SSRS Packages for Different Versions of SQL Server

In this version of SQL Server Data Tools, Microsoft has finally addressed the common problem of needing to maintain multiple versions of SSIS packages for different server versions. No longer do you need three different applications to maintain code for SQL Server 2012, 2014 and now 2016; all of these versions are supported with SSDT for Visual Studio 2015. SQL Server will detect which version the code was last saved in, so you don’t have to worry about accidentally migrating code. You also have the ability to create an SSIS package for 2012, 2014 or 2016. To select the version you want, right click on the project and select Properties. Under Configuration Properties->General, as shown in the picture, the TargetServerVersion, which defaults to SQL Server 2016, has a dropdown box making it possible to create a new package in Visual Studio 2015 for whatever version you need to support. Supporting the ability to write for different versions is a great new feature, and one which I am really happy is included in SSDT for Visual Studio 2015.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

SSIS: What to do when AcquireConnection Fails

The deadline required X number of SSIS packages to be done each day, a deadline which was tight but doable, as long as there were no technical surprises. The packages were written in SSDT for SQL Server 2012, and most of them were connecting to other SQL Server 2012 databases with no problem. The next package contained an OLE DB connection to a SQL Server 2008 R2 database with a SQL Server defined ID and password. I made the connection, which tested successfully, opened up the data task and was able to preview the data. Everything looked fine. I right clicked on the package to execute it, and it went along fine until it reached the data task to get data from SQL Server 2008 R2. Surprise! The dreaded red X appeared. Failure.

AcquireConnection method call to the connection manager failed with error code 0xC0202009/0xC020801C

Try as I might, the only thing I was able to do after an hour was periodically change the error code from 0xC0202009 to 0xC020801C. Nothing I did worked. I created a new connection, created a new OLE DB Source, changed Run64BitRuntime to False in the Debugging section of the project’s Configuration Properties, and set the Data Flow task’s DelayValidation from False to True. None of the suggestions I gathered from various websites worked at all. I thought about changing the SSIS service ID’s execution properties, but since I was running in Debug mode I determined that this would not make any difference, so I abandoned that idea. Nothing worked; the only thing I was able to do was change the error code, not eliminate it. I could log into SQL Server with the same ID and password in my package and run the simple query in the data flow task and return data. I could preview the data; what I couldn’t do is execute the SSIS package. Out of desperation I rebooted, which also did nothing.

EncryptSensitiveWithUserKey fixes error code 0xC0202009/0xC020801C

While I was thinking about what was left to try, I wondered why I was getting a failure to connect message, and started thinking about passwords. Because the project, and of course the package, were set to DontSaveSensitive, every time I opened the connection I had to enter the password again. News flash: DontSaveSensitive means what it says. It doesn’t save the password, so when you run the package, you have no password and therefore cannot acquire a connection. The fix was to change the package to EncryptSensitiveWithUserKey. Once I did that, the magical green check appeared.

Hopefully if you ever get this message you will find this post quickly and not waste the time I did figuring this out.

Yours Always,

Ginger Grant

Data aficionado et SQL Raconteur


SSIS – Resolving “Failed to Deploy the Project” Messages

Have you seen this screen when trying to deploy?
[screenshot: Deploy Project failure dialog]

Needless to say, this indicates the SSIS package didn’t deploy to the server. This is especially frustrating, as this particular package worked fine in Visual Studio. Believe it or not, this screen actually contains information which can be used to resolve the error.

Finding a Useful Error Message

How do you go about determining what to do? Click on the word Failed. This will bring up some useful information, although it may not appear that way at first. Here’s the error message I received:

[screenshot: deployment error message]

The message Failed to deploy project isn’t very useful, but the rest of the message is. The operation_messages view lives in SSISDB, and the operation identifier number is how to determine what the error is. Run this query, using the number provided in the error message, which in this case is 173:

SELECT * FROM catalog.operation_messages WHERE operation_id = 173

Here are the results from that query.

operation_message_id: 50719
operation_id: 173
message_time: 2016-02-29 15:02:08.2478928 -07:00
message_type: 120
message_source_type: 20
message: Failed to deploy the project. Fix the problems and try again later.:SqlDateTime overflow. Must be between 1/1/1753 12:00:00 AM and 12/31/9999 11:59:59 PM.
extended_info_id: NULL
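As a side note, operation_messages can return a lot of rows for a single operation on a busy server. A minimal sketch that narrows the view down to just the error rows (message_type 120 is the error message type):

SELECT message_time, message
FROM catalog.operation_messages
WHERE operation_id = 173
  AND message_type = 120  -- 120 = Error
ORDER BY message_time;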


Now this message is quite useful as it provides information that I can use to fix the issue. This SSIS Project contains a date parameter BackDate, which I had not set. Here’s the parameter.

[screenshot: project parameters window]

I set this parameter to a date between 1/1/1753 and 12/31/9999 and deployed the project again. This time, no error.

I hope that you have found this post helpful, especially if you haven’t deployed a package to SSIS in SQL Server 2012 or later.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur


Best Practices for the Care and Use of the SSISDB

Since the release of SQL Server 2012, SSIS package deployment has moved to the SSISDB database. If you are using SQL Server 2012 or beyond and are not using the SSISDB for deployment, my sincere condolences, as maintaining and deploying packages any other way is a maintenance headache which has thankfully been resolved. As the SSISDB has not been around very long, the appropriate care and feeding of this database is not well known. SSISDB is part of the Integration Services Catalog. It is not possible to create an SSISDB without first creating an SSISDB catalog, which does not happen when SQL Server is installed. Follow Microsoft’s instructions for creating an SSISDB catalog, which creates both the SSIS catalog and the SSISDB database. You cannot rename the database, as SSISDB is the name Integration Services uses internally.

Back Up the SSISDB

A client asked me recently why he should back up the SSISDB database. While you can recreate everything inside of the SSISDB, it will take time and you will have to remember exactly how all of your variables were set. Restoring a backup avoids this problem, and having a backup allows a server to be redeployed quickly. When you do back up the database, make sure that you remember to back up the database master key, which is created when the SSISDB is created, as you will need it to do a restore. By default, the recovery model of the SSISDB is set to Full. If the packages in SSISDB were changing minute by minute, Full would make sense, but given that the SSISDB contains packages which are run on a scheduled basis, most likely the changes made are infrequent. Change the recovery model to Simple.
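As a minimal sketch, the whole routine looks something like this; the file paths and password are placeholders, so substitute your own:

-- Minimal sketch: back up the SSISDB master key and the database itself
USE SSISDB;
BACKUP MASTER KEY TO FILE = N'C:\Backup\SSISDB_MasterKey'   -- placeholder path
    ENCRYPTION BY PASSWORD = N'UseAStrongPasswordHere!1';   -- store this password safely

BACKUP DATABASE SSISDB TO DISK = N'C:\Backup\SSISDB.bak'    -- placeholder path
    WITH INIT;

-- Switch the recovery model to Simple, as recommended above
ALTER DATABASE SSISDB SET RECOVERY SIMPLE;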

Managing SSISDB Growth Over Time

SSISDB contains all of the data used for the reports created when SSIS packages are run. Right click on the SSISDB icon underneath the Integration Services Catalog and take a look at the settings. The default settings are listed here, and to decrease the size of the SSISDB over time, you may wish to change them. The Retention Period is set to 365 days. Many environments don’t look at reports more than 90 days old, as information prior to that timeframe isn’t very meaningful. If that is the case, change the retention period to the number of days someone is actually going to look at the report, which will decrease the amount of data stored in the database. For the logs to be cleaned up at all, the setting Clean Logs Periodically needs to be set to True, so don’t change it.
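If you prefer T-SQL to the SSMS dialog, the same settings can be changed with catalog.configure_catalog; a minimal sketch using the 90 day retention discussed above:

-- Minimal sketch: shorten the retention window and keep log cleanup turned on
EXEC SSISDB.catalog.configure_catalog
     @property_name = N'RETENTION_WINDOW', @property_value = 90;          -- days of history kept
EXEC SSISDB.catalog.configure_catalog
     @property_name = N'OPERATION_CLEANUP_ENABLED', @property_value = 1;  -- Clean Logs Periodically = True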

The Server-wide Default Logging Level is set to Basic by default. The information provided at this level is generally what is needed to troubleshoot any issues. Don’t set the logging to None just to save space; if the SSIS code ever crashes, the person who did so may be cursed. If you have simple jobs though, you may find the Performance logging level adequate. Check out Microsoft’s documentation on logging to better understand the differences between the levels.
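The current values for all of these settings are visible in the catalog.catalog_properties view, and the server default logging level can be set the same way; a small sketch:

-- Check the current catalog settings
SELECT property_name, property_value
FROM SSISDB.catalog.catalog_properties
WHERE property_name IN (N'RETENTION_WINDOW', N'SERVER_LOGGING_LEVEL', N'MAX_PROJECT_VERSIONS');

-- Minimal sketch: keep the server default logging level at Basic (1); Performance is 2
EXEC SSISDB.catalog.configure_catalog
     @property_name = N'SERVER_LOGGING_LEVEL', @property_value = 1;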

Lastly, if you are using version control in another application, such as TFS, there may not be a need to keep the Maximum Number of Versions per Project at 10. Generally speaking, most people don’t look past the last 3 versions. The number should reflect what is practical, which is nearly always less than 10.
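This setting is exposed through the same procedure; a minimal sketch keeping the 3 versions mentioned above:

-- Minimal sketch: retain only the last 3 project versions
EXEC SSISDB.catalog.configure_catalog
     @property_name = N'MAX_PROJECT_VERSIONS', @property_value = 3;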

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur


Upcoming and Recent Events

The PASS organization is a professional organization which sponsors a number of different events in the technical community and provides a lot of great resources to improve knowledge of all things SQL Server and related technologies. Recently, I have been honored to be selected to speak at not one but two events hosted by PASS. The PASS Business Intelligence Virtual Chapter provides training on all things related to Business Intelligence via the web, and I was selected to talk at their last meeting in May. Thank you to all of the people who were able to attend my talk on Top 10 SSIS Tuning Tricks live. If you had to work, no problem: all of the talks hosted by the PASS Business Intelligence Virtual Chapter are recorded and available on www.YouTube.com. The recording of my Top 10 SSIS Tuning Tricks session is available here.

24 Hours of PASS

Periodically PASS provides a 24-hour training session on SQL related topics, delivered live to every time zone in the world. As this event is watched by people around the world, it is a real honor to be selected for it. This time the speakers were selected from people who had not yet spoken at the PASS Summit, as the theme was Growing Our Community. The theme is just another way the PASS organization is working to improve people’s skills. Not only does it provide the opportunity to learn all things data, it also provides professional development, growing speaking skills by offering many avenues to practice them.

Data Analytics with Azure Machine Learning

My abstract on Improving Data Analytics with Azure Machine Learning was selected for the 24 Hours of PASS. As readers of my blog are aware, I have been working on Azure Machine Learning [ML] this year and look forward to discussing how to integrate Azure ML into current environments. Data analytics with ML is yet another way to derive meaning from data being collected and stored. I find the application of data analytics fascinating, and hope to show you why if you are able to attend. There are a number of wonderful talks scheduled at this event, so I encourage you to check out the schedule and attend as many as you can. To be sure I’ll be signing up for a number of sessions as well.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Azure ML, SSIS and the Modern Data Warehouse

Recently I was afforded the opportunity to speak at several different events, all of which I thoroughly enjoyed. I was able to speak on Azure Machine Learning first at the Arizona SQL Server Users Group meeting. I really appreciate all who attended, as we had quite a crowd. Since the meeting is held practically on Arizona State University’s Tempe campus, it was great to see a number of students attending, most likely due to Ram’s continued marketing efforts on meetup.com. After talking to him about it, I was impressed at his success at improving attendance by promoting the event on Meetup, and wonder if many SQL Server User Groups have experienced the same benefits. If you have, please let me know. Thanks Joe for taking a picture of the event too.

Modern Data Warehousing Precon

The second event where I had the opportunity to talk about technology was the precon at SQL Saturday in Huntington Beach, where I spoke about Modern Data Warehousing. It was a real honor to be selected for this event, and I really enjoyed interacting with all of the attendees. Special thanks to Alan Faulkner for his assistance. We discussed the changing data environment, including cloud based storage, analytics, Hadoop, handling ever increasing amounts of data from different sources, and the increasing demands of users, and reviewed technology solutions that demonstrate ways to resolve these issues in attendees’ environments.

Talking and More Importantly Listening

The following day was SQL Saturday #389 in Huntington Beach. Thanks to Andrew, Laurie, Thomas and the rest of the volunteers for making this a great event, as I know a little bit about the work that goes into planning and pulling off an event like this. My sessions, Predicting the Future with Machine Learning and Top 10 SSIS Tuning Tricks, were both selected, and I had a great turnout at both. To follow up on a question I received during my SSIS session: Balanced Data Distributor was first released as a new SSIS transform for SQL Server 2008 and 2008 R2, so you can use it on versions prior to SQL Server 2012. I’ve posted more information about it here. I also got a chance to meet a real live data scientist, the first time that has happened.

Not only did I get a chance to speak, but also a chance to listen. I really enjoyed the sessions from Steve Hughes on Building a Modern Data Warehouse and Analytics Solution in Azure, from Kevin Kline, and from Julie Koesmarno on Interactive & Actionable Data Visualisation With Power View. As always, it’s wonderful to get a chance to visit in person with the people whose technical expertise I read. In addition to listening to technical jokes which people outside of the SQL community would not find humorous, it’s great to discuss technology with other practitioners. Thanks to Mr. Smith for giving me a question I didn’t know the answer to, which now I feel compelled to go find; I’ll be investigating the scalability of Azure ML and R so that I have an answer for him the next time I see him. I really enjoy the challenge of not only investigating and applying new technology but figuring out how to explain what I’ve learned. I look forward to the opportunity to present again, and when I do I’ll be sure to update this site, so hopefully I get a chance to meet the people who read this.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur


Tips on SSIS at SQL Saturday Albuquerque

On February 7, I was fortunate enough to be selected to speak at SQL Saturday in Albuquerque, New Mexico on Top 10 SSIS Tuning Tricks. Having worked with SSIS for a number of years, I’ve needed to research the best methods to employ to ensure my SSIS ETL was running optimally. I’ve compiled the most valuable items, with examples of course, into this presentation. I’m assuming that everyone attending has already been using SSIS for a while, so I will skip straight into more in-depth ways of tuning SSIS. One of the questions I have heard most often is “When should I do X in SQL or SSIS?” If you are able to attend this session, you will have the answer to that question.

I really enjoy the opportunity to speak on data related topics and to meet people who may have come upon my blog in the past. Having spoken at this event last year, I know what a good job Keith, Chris, Meredith and friends do organizing it. I want to take the time to say thank you for all of your hard work, as I really appreciate it. These events are a great place to learn and keep up with a lot of the changes going on in the industry. I anticipate there will be many lively discussions both before and after the event. That reminds me: there are two great precons scheduled on Friday, February 6th, Powershell Basics with Mike Fal and Query Tuning, Troubleshooting and Execution Plans with Jason Kassay. Having been fortunate enough to meet both of them, I know that they are both extremely knowledgeable in their respective topics, and if you are in Albuquerque I encourage you to sign up for either of them, as I am sure both will be excellent.

I hope that you will be able to attend as I know I will enjoy seeing you there.


Yours Always
Ginger Grant
Data aficionado et SQL Raconteur

Upcoming and Up and Coming Topics

It’s funny the different meanings words have when you put them in a different order, a point which anyone who has imitated the dialect of Yoda can tell you. I find words fascinating as they are not static but have meanings which change over time. For example, the Iron Maiden meant something totally different before there were electric guitars. Thinking of words and things changing, as one year closes and another begins, I start to evaluate past and future topics. Earlier this year, I held an informal poll on Twitter to find out how long people tend to talk on the same topic. The answers were quite varied. Some people keep talking about the same topic as long as there seems to be interest in hearing about it; that way you can get to be a really good speaker on that topic. Another feels obligated to create a new topic each time out to give himself a challenge. The answer that I personally related to was to keep talking about a topic until you are tired of hearing about it, which takes about a year.

SQL Saturday Albuquerque

My first upcoming engagement for 2015 will be at SQL Saturday Albuquerque, where I will be talking about SSIS. I generally talk about things I am interested in or presently working on, and having worked on a lot of ETL recently, I thought that it would be an interesting topic which most people would find helpful. As a consultant, I see a lot of code and wonder why parts of it were written that way. One big reason is someone thought the design was a good one. Since that is a subjective decision, I thought it might be helpful to clarify design decisions with facts, so that people would be able to employ good logic for their design decisions.

Technology Changes and Their Impact on Data Development

Another topic which really interests me is the impact that new technologies are having on the database world. With the increased implementation of Hadoop and the cloud, things are really changing in the way data is both stored and used. Predictive analytics, machine learning, cloud implementations and interactive data visualizations are changing what people expect from the way their data is stored and used. Expectations for data professionals are increasing as the business looks away from the HiPPO (the Highest Paid Person’s Opinion) and towards knowledge gathered from their own data or integrated from public sources.

Modern Data Warehouse

I have the pleasure of assisting in a day-long session on Architecting the Modern Data Warehouse. During this one-day session we will be showing how to use new technology such as HDInsight and Machine Learning to implement a modern data warehouse. Instead of just talking about new technologies, we will be putting them to use to show how they can be used today. I’m really looking forward to it.

If you are able to attend any of these or any upcoming sessions, please stop by and introduce yourself as I would love to meet readers of my blog in person.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur


Presentation Follow-up to Data Quality Issues? There is a Service for That

Thank you to all of you who were able to attend my latest presentation. I received a number of great questions during the presentation and wanted to make sure that I was able to provide answers. I also want to say thank you to the people who provided feedback in the question and answer section, as that was very thoughtful and most appreciated.

Questions and Answers

I assume DQS is included with the existing SQL Server license? Is DQS included with enterprise edition? Can you provide some idea of what the licensing costs are? What are the licensing cost for Master Data Services?

Because I don’t work for Microsoft, I hesitate to answer any but the most basic questions regarding licensing, but I can say DQS and MDS are included in the licensing costs for SQL Server. Neither is available in the Standard edition. For more information about what is included, see here http://technet.microsoft.com/en-us/library/cc645993#Other_Components or contact Microsoft directly.

Do you know of any testing of scalability that would provide some insight into the scalability of DQS?

The best place I can point you to for more information on DQS scalability is here http://www.microsoft.com/en-us/download/details.aspx?id=29075

It seems that DQS appears similar to Fuzzy Lookups and Fuzzy Grouping. Is that pretty much what DQS is using?

The logic employed in the SSIS components Fuzzy Lookup and Fuzzy Grouping is similar, but not functionally equivalent. DQS uses logic much closer to that of Fuzzy Grouping, as both apply a set of rules to data and set similarity thresholds to determine success. Where they differ is in DQS’s use and development of the knowledge base to continually improve accuracy.

How do you see DQS and MDS working together? What gets done where? How does DQS integrate with MDS? Do you use MDS to fix the data?

One of Microsoft’s definitions of Master Data Management is: “we define Master Data Management (MDM) as the technology, tools, and processes required to create and maintain consistent and accurate lists of master data.” MDS contains a framework for applying workflows, security, business rules and auditing to business data. DQS is a part of that process which can be used by the MDS product to accomplish these tasks. To best illustrate how they work together, I’d recommend downloading the DQS and MDS tutorial. At a high level, MDS enforces rules for cleaning the data, and DQS can be implemented as a part of that process. For example, before adding data to MDS, you can use DQS to clean the data prior to evaluation, but its use isn’t required.

In your SSIS demo, what additional steps would you recommend to have the data steward process the invalid data so that it can be reprocessed under SSIS?

Ideally, when data fails a quality process, it should be placed in a review table so that it can be examined by the data steward. Having a person resolve the issues and add them to the knowledge base will continually improve the knowledge base over time.
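As a minimal sketch of what such a review table might look like (the table and column names here are illustrative placeholders, not anything DQS creates for you):

-- Minimal sketch: a review table for rows that fail a data quality process
CREATE TABLE dbo.DataQuality_Review (
    ReviewID      int IDENTITY(1,1) PRIMARY KEY,
    SourceTable   nvarchar(128)  NOT NULL,       -- where the row came from
    SourceRowID   int            NOT NULL,       -- key back to the source row
    FailedColumn  nvarchar(128)  NOT NULL,
    FailedValue   nvarchar(4000) NULL,
    FailureReason nvarchar(4000) NULL,           -- why the value was rejected or flagged
    LoadDate      datetime2      NOT NULL DEFAULT SYSDATETIME(),
    ReviewedBy    nvarchar(128)  NULL,           -- filled in by the data steward
    ReviewedDate  datetime2      NULL
);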

When building a knowledgebase within DQS, can you include multiple columns for evaluation within one table?

DQS is not limited to evaluating one column at a time. You can use combinations of columns within a table to determine what the correct values are.


If you missed my session, a recording is available here, so please feel free to review the video when you have a chance.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur