Thank you to all of you who were able to attend my latest presentation. I received a number of great questions during the presentation and wanted to make sure that I provided answers. I also wanted to say thank you to the people who provided feedback in the question and answer section, as that was very thoughtful and most appreciated.
Questions and Answers
I assume DQS is included with the existing SQL Server license? Is DQS included with Enterprise edition? Can you provide some idea of what the licensing costs are? What are the licensing costs for Master Data Services?
Because I don’t work for Microsoft, I hesitate to answer any but the most basic questions regarding licensing, but I can say that DQS and MDS are included in the licensing costs for SQL Server. Neither is available in the Standard edition, however. For more information about what is included, see http://technet.microsoft.com/en-us/library/cc645993#Other_Components or contact Microsoft directly.
Do you know of any testing of scalability that would provide some insight into the scalability of DQS?
The best place I can point you to for more information on DQS scalability is here http://www.microsoft.com/en-us/download/details.aspx?id=29075
It seems that DQS appears similar to Fuzzy Lookups and Fuzzy Grouping. Is that pretty much what DQS is using?
The logic employed in the SSIS components Fuzzy Lookup and Fuzzy Grouping is similar to that of DQS, but the two are not functionally equivalent. DQS uses logic much closer to that of Fuzzy Grouping, as both apply rules to a set of data and use similarity thresholds to determine success. Where they differ is in DQS’ use and development of the knowledge base to continually improve accuracy.
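To make the idea of a similarity threshold concrete, here is a minimal Python sketch. It is not the algorithm DQS or the SSIS fuzzy components use; it swaps in the standard library's `difflib.SequenceMatcher` as a stand-in scoring function, and the function names and the 0.8 threshold are my own assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Score two strings between 0.0 and 1.0, ignoring case.

    SequenceMatcher is only a stand-in here, not what DQS uses internally.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_to_known(value, known_values, threshold=0.8):
    """Return the closest known value if it clears the threshold, else None."""
    best = max(known_values, key=lambda k: similarity(value, k), default=None)
    if best is not None and similarity(value, best) >= threshold:
        return best
    return None  # no candidate is similar enough to count as a match


if __name__ == "__main__":
    known = ["Microsoft", "Oracle"]
    print(match_to_known("Microsft", known))  # close misspelling, matches
    print(match_to_known("IBM", known))       # nothing similar, no match
```

Raising the threshold makes matching stricter and sends more values to manual review; lowering it accepts more matches at the risk of false positives.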
How do you see DQS and MDS working together? What gets done where? How does DQS integrate with MDS? Do you use MDS to fix the data?
One of Microsoft’s definitions of Master Data Management is “we define Master Data Management (MDM) as the technology, tools, and processes required to create and maintain consistent and accurate lists of master data.” MDS contains a framework for identifying and implementing workflows, security, business rules and auditing for business data. DQS is one part of that process, and the MDS product can use it to accomplish these tasks. To best illustrate how they work together, I’d recommend downloading Microsoft’s DQS and MDS tutorial. At a high level, MDS enforces rules for cleaning the data, and DQS can be implemented as a part of that process. For example, you can use DQS to clean the data before adding it to MDS, but its use isn’t required.
In your SSIS demo, what additional steps would you recommend to have the data steward process the invalid data so that it can be reprocessed under SSIS?
Ideally, when data fails a quality process, it should be placed in a review table so that the data steward can review it. Having a person resolve the issues and add the corrections to the knowledge base will continually improve the knowledge base over time.
When building a knowledge base within DQS, can you include multiple columns for evaluation within one table?
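The review loop described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not how DQS or SSIS implement it: the knowledge base is modeled as a plain dictionary, the review table as a list, and the function names are hypothetical.

```python
def cleanse(records, knowledge_base):
    """Split records into cleansed values and values needing steward review."""
    clean, review = [], []
    for value in records:
        key = value.strip().lower()
        if key in knowledge_base:
            clean.append(knowledge_base[key])  # known value: apply the correction
        else:
            review.append(value)               # unknown value: send to review
    return clean, review


def apply_steward_decisions(knowledge_base, decisions):
    """Fold the steward's corrections back into the knowledge base."""
    for raw, corrected in decisions.items():
        knowledge_base[raw.strip().lower()] = corrected


if __name__ == "__main__":
    kb = {"wa": "Washington"}
    clean, review = cleanse(["WA", "Wash."], kb)
    print(clean, review)

    # The steward resolves the failed value, and the failed rows are reprocessed.
    apply_steward_decisions(kb, {"Wash.": "Washington"})
    print(cleanse(review, kb))
```

The point of the second pass is that a value only has to be resolved by a person once; after that, the knowledge base handles it automatically.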
DQS is not limited to evaluating one column at a time. You can use combinations of columns within a table to determine what the correct values are.
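As a rough illustration of why evaluating columns together matters, the Python sketch below validates a city and state as a pair. The reference set and function name are made up for the example, and this is not DQS syntax; it only shows that a value can be valid on its own yet wrong in combination.

```python
# Hypothetical reference data: the (city, state) pairs considered valid.
VALID_CITY_STATE = {
    ("seattle", "WA"),
    ("portland", "OR"),
    ("portland", "ME"),
}


def is_valid_row(row):
    """Check the city and state columns together rather than one at a time."""
    pair = (row["city"].strip().lower(), row["state"].strip().upper())
    return pair in VALID_CITY_STATE


if __name__ == "__main__":
    print(is_valid_row({"city": "Portland", "state": "or"}))  # valid pair
    print(is_valid_row({"city": "Seattle", "state": "OR"}))   # each value is real, the pair is not
```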
If you missed my session, a recording is available here, so please feel free to review the video when you have a chance.
Yours Always
Ginger Grant
Data aficionado et SQL Raconteur