The world of data management has changed greatly in the last decade, with the appearance of master data management (MDM) and data governance as disciplines, the growth and maturing of data warehouses, and the emergence of 'Big Data'. We would like to think that our efforts are making a difference, but just how much better is the state of data management at the beginning of 2014 than it was five or 10 years ago? It is tempting to assume that things have moved on across the board, but does the data really support that?

Clearly some things will have changed more than others. The continued dramatic drop in storage costs and improvement in processor price/performance means that we would expect plenty of change in the scale of things being managed. Other aspects, such as the degree to which organisations insist on business cases or post-implementation reviews, are more about human behaviour than technology, and one would intuitively expect these to change less. As technologies mature one would hope that success rates improve, but have they?

Over the last five years the Information Difference has been conducting quarterly surveys across the spectrum of data management, from data warehousing to data quality, from MDM to Big Data. I thought it would be interesting to look back at this range of surveys and examine the hard data about just how much, if at all, data management has really changed.

Data governance, at least under that name, is quite a new discipline. In a 2008 adoption survey, 67% of respondents could not say who in their organisation was actually responsible for it, and just 43% of those with programmes reckoned them to be “reasonably successful”. Just two years later this success rate had risen to 79%, with 19% rating their data governance activities as “very successful”. However, even in the later survey only half had a formal business case for data governance, and only a third were making any attempt to measure its monetary impact on their organisations.

Fewer than a quarter of companies in 2008 reckoned to have a single enterprise data warehouse in place, and 10% said that they had at least 10. A survey almost three years later found that the proportion of respondents with a single warehouse had grown to a third, a significant increase, although this figure hardly portrays a world in which data warehousing has achieved its goals. A third of respondents did not feel that the data in their warehouse was reliable, despite an average of 20 staff dedicated to supporting it. This is less surprising when you consider that 40% admitted to making no attempt to measure the quality of their data warehouse.

Master data management has certainly become more accepted over time, with an almost doubling of those regarding MDM as “well established” between 2008 and 2013. Interestingly, the amount spent on MDM projects actually dropped from $5 million to $3 million on average in that period, suggesting that projects are either becoming more efficient or have been reduced in scope. Some 30% of an MDM project's costs are associated with data quality, three times what was budgeted. MDM projects also spend four times as much on people as they do on the software itself, suggesting that the rapid implementation times touted by vendors are just marketing smoke and mirrors.

Data quality is a more established discipline than the others, so one would expect somewhat less change over the years. One recurring theme has been a persistent failure to measure data quality and the cost of poor data. In mid-2009, 63% admitted to having no idea what poor data quality was costing them; this picture was actually slightly worse in 2011 before improving a little, to 57%, in 2013.

However, this is hardly something to trumpet, with over half of organisations having no idea what poor data is costing them. There has been an improvement in the proportion at least monitoring data quality in some form, rising from 58% in 2008 to 75% by the end of 2011. Amusingly, a 2012 survey found that 80% of respondents thought that data quality was “key” to the success of Big Data, a triumph of hope over experience given the state of data quality in “small data”. Indeed, a 2013 survey found that the overall perception of the quality of data in organisations was actually worse than in 2008, despite significantly more investment in data quality initiatives during that period.

With so many claims and counter-claims in the world of software, I find it revealing to look back at the actual numbers and see how they tally with perception. It is clear that, for all our investment in data management, we still have a long way to go to truly improve things.