Software analysts love to define distinct markets in which enterprise software is sold, such as databases or security. In some cases, though, the lines between these markets are hazy, and adjacent ones start to overlap.
One area where this is becoming ever more apparent is the overlap between the markets for data quality and master data management. The two have always been connected. With data quality software, one of the main tasks is to eliminate duplicates and obtain a clean, complete record for, say, a customer's name and address. Master data management has a similar goal, but is more ambitious. Rather than just correcting a record at source, or avoiding a duplicate entry, master data seeks to define a "golden record" of customer, product, asset or location data (there are many more domains) for the entire enterprise. This consolidated, reliable master data can then be fed into other applications such as data warehouses, or used to enrich the less complete versions of records stored in transaction systems. One distinction is that data quality tools have mostly focused on customer name and address data, with a few dabbling in product data. Master data is broader: as well as these domains, companies need consistency across supplier data, personnel data, chart of account codes and much more.
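To make the "golden record" idea concrete, here is a minimal sketch of consolidating duplicate customer records into one. The field names and the survivorship rule (the longest non-empty value wins) are illustrative assumptions, not any particular product's logic; real tools use far more sophisticated rules, such as source trust scores and recency.

```python
def golden_record(duplicates):
    """Merge duplicate records into one consolidated 'golden' record."""
    # Collect every field seen across the duplicate records.
    fields = {key for rec in duplicates for key in rec}
    merged = {}
    for field in fields:
        candidates = [rec.get(field, "") for rec in duplicates]
        # Survivorship rule (an assumption for this sketch):
        # prefer the longest, i.e. most complete, value.
        merged[field] = max(candidates, key=len)
    return merged

dupes = [
    {"name": "J. Smith", "city": "Boston", "phone": ""},
    {"name": "John Smith", "city": "", "phone": "617-555-0100"},
]
rec = golden_record(dupes)
print(rec["name"], rec["city"], rec["phone"])
# John Smith Boston 617-555-0100
```

Neither partial record is complete on its own; the consolidated one is, which is exactly what makes it worth persisting and sharing across the enterprise.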
It is fair to say that you can run a data quality project without really impinging on the master data world, but the reverse is not true. Any master data project has a data quality component, and this element is larger than people realise. An Information Difference survey in 2010 asked companies that had implemented master data projects how much they had budgeted for data quality issues; the average was 10% of the project budget. The same respondents reported that the final costs were much higher: on average, 30% of their master data project costs were spent on sorting out data quality issues.
One easy way to spot the difference between a data quality and a master data management product has been to check whether the resulting data is "persistent": is there a hub containing the resulting data that is managed and maintained after the project finishes? If so, the product can be considered a master data tool, whereas data quality tools concern themselves with fixing records in situ or returning corrected or enriched records to other applications. However, it is only a small step for a data quality vendor to make its result set persistent, at which point it has the basics of a master data product. Certainly, there is other functionality expected of a master data tool beyond just a database of records: hierarchy management, workflow, search, dashboards, data stewardship support and so on. Yet data quality tools themselves often have considerable functionality beyond just checking a record's accuracy: as well as validating, merging and matching data records, they frequently have some form of suspense processing for records that need human attention (hence workflow) and data quality dashboards. It is hardly beyond the wit of man to imagine adding a few additional capabilities, and – hey presto! – a data quality tool becomes a master data product.
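The match-or-suspend logic mentioned above can be sketched briefly. Record pairs that match with high confidence are merged automatically; borderline cases go to a suspense queue for a data steward to review. The thresholds and the similarity measure (Python's standard-library `difflib` ratio) are illustrative assumptions; commercial tools use purpose-built name and address matching.

```python
from difflib import SequenceMatcher

# Thresholds are assumptions for this sketch, not product defaults.
AUTO_MERGE = 0.90  # at or above this score, merge without human review
SUSPEND = 0.70     # between the two thresholds, queue for a data steward

def route(record_a, record_b):
    """Decide what to do with a candidate duplicate pair."""
    score = SequenceMatcher(None, record_a, record_b).ratio()
    if score >= AUTO_MERGE:
        return "merge"
    if score >= SUSPEND:
        return "suspense"  # needs human attention
    return "distinct"

print(route("John Smith, Boston", "John Smith, Boston"))  # merge
print(route("John Smith, Boston", "J. Smith, Boston"))    # suspense
print(route("John Smith, Boston", "Mary Jones, Denver"))  # distinct
```

The suspense branch is where the workflow and stewardship features creep in: once a tool holds records awaiting human decisions, it is already managing data over time rather than just fixing it in passing.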
A number of vendors have made this transition. Ataccama started out as a data quality product but added a high-performance master data hub, sold and marketed separately. Pitney Bowes Software has recently launched a master data product based on a graph database, complementary to its established data quality technology. Master data vendors, for their part, have either built their own merge/matching algorithms or, more usually, partnered with data quality vendors or bought them outright.
For a data quality vendor there is some appeal in dabbling in master data: the master data management market is growing at perhaps twice the pace of the longer-established data quality tools market. Moreover, the perceived added value of master data and its trendier image mean that average deal prices are generally higher for master data products than for stand-alone data quality tools. Some data quality vendors have such limited scope that such a path would be a stretch – perhaps tools that just do name and address postcode validation, say. Yet data quality vendors have in recent years been adding more and more functionality that is useful for managing master data too, and few have restricted themselves to just checking postal codes. I was chatting recently to a start-up vendor who was pondering whether to market their new technology as a data quality tool or a master data product. Data quality has rarely been seen as a sexy subject, so the lure of a more fashionable, faster-growing market is easy to understand from a data quality vendor's perspective. I expect to see more such boundary-crossing as time passes, and the line between the master data and data quality markets to become even more blurred.