If you have any doubt about the excitement level around Big Data, consider this. Cloudera, a provider of Hadoop-based software and services founded in 2008, recently completed a round of venture capital fund-raising. In its 2009 financing round it raised $5 million from Accel. This time it went for a tad more, raising a little matter of $900 million (that is not a typo) from Intel Capital and others, valuing the 500-employee company at $4.1 billion.

However, for all the hype and the clearly serious amounts of money being invested, practical case studies are thin on the ground at big data conferences, however well attended those conferences may be.

A December 2013 Information Difference survey of 178 companies found that around a quarter claimed to have a production Big Data application, though just 13% had identified new business opportunities from these initiatives, with “presenting a business case” the biggest reported obstacle. This suggests a market long on hype and expectation but distinctly short on practical experience and proven business benefits.

One topic that vendors often talk about in their PowerPoint slides is the notion of “sentiment analysis” of social media: the idea that large corporations can use big data technology to get a better idea of how their brands are perceived by customers. The notion is that you can plug some analytic tools into social media sites and magically discern how your brands are doing. This has some clear limitations. Most Facebook users have by now figured out the meaning of those pesky privacy settings, so much of the accessible data here will come through a company’s corporate Facebook page, which may not be very representative. A more democratic view may come from analysing Twitter feeds (which are by definition public), but there is always the nagging doubt about whether the people tweeting nasty things about your product or customer service are actually valued customers who really matter, or just a few whiners with too much time on their hands looking for freebies.

One interesting example I came across recently was a case study involving a hotel chain in the USA, which had engaged a start-up MDM vendor to try to dig rather deeper and come up with insights that might actually be of some real business use. The chain handed its customer loyalty data over to the software vendor, and then asked whether the vendor could match this data to the people tweeting about its hotels. This is quite a knotty problem: having someone’s name does not make it trivial to derive their Twitter handle. Joe Smith from Alabama on the corporate loyalty program may or may not be the @jsmith who just moaned about the long queues at check-in when arriving at a hotel in Idaho. Nor is it necessarily easy to discern the context of a particular tweet.

This start-up company has developed algorithms that aim to tackle this issue, and achieved a decent level of matching between the loyalty database and a set of Twitter feeds, assigning a level of confidence to each potential match. The idea is that as human experts assess the potential matches and either confirm or reject them, the software learns from the expert intervention and improves its matching. This is all at an early stage, but it did allow some interesting interventions by the hotel chain. Once it had developed a set of plausibly matched customers, it analysed their travel-related tweets. Clearly people stay at more than one hotel chain if they travel a lot, so what this particular chain did was to match up tweets about its rivals to its own location database, and then carry out targeted marketing. For example, if a customer in its loyalty program had mentioned multiple times that she was staying in (say) New York but was checking in at a rival hotel (discerned through her tweets), then an attractive special offer would be sent to her, encouraging her to stay at the chain’s own hotel on her next trip and so luring her away from the rival.
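To make the matching step a little more concrete, here is a minimal sketch in Python of pairing loyalty records with Twitter profiles and attaching a confidence level to each candidate. The vendor's actual algorithms are not described in the case study, so everything here is an assumption for illustration: the `LoyaltyMember` and `TwitterProfile` records, the 70/30 weighting of name against location, the review threshold, and the use of plain string similarity from the standard library in place of whatever the vendor really does.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class LoyaltyMember:
    # Hypothetical loyalty-program record.
    name: str
    home_state: str


@dataclass(frozen=True)
class TwitterProfile:
    # Hypothetical public profile fields.
    handle: str
    display_name: str
    location: str


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_confidence(member: LoyaltyMember, profile: TwitterProfile) -> float:
    """Blend name and location evidence into one confidence score.

    The 0.7 / 0.3 weights are illustrative assumptions, not tuned values.
    """
    # Compare the member's name with both the display name and the handle.
    name_score = max(
        similarity(member.name, profile.display_name),
        similarity(member.name.replace(" ", ""), profile.handle),
    )
    # Crude location evidence: is the home state mentioned in the profile?
    state = member.home_state.lower()
    location_score = 1.0 if state and state in profile.location.lower() else 0.0
    return 0.7 * name_score + 0.3 * location_score


def candidate_matches(members, profiles, review_threshold=0.6):
    """Return (member, profile, confidence) tuples worth a human look,
    highest confidence first."""
    scored = (
        (m, p, match_confidence(m, p)) for m in members for p in profiles
    )
    kept = [c for c in scored if c[2] >= review_threshold]
    return sorted(kept, key=lambda c: c[2], reverse=True)


if __name__ == "__main__":
    members = [LoyaltyMember("Joe Smith", "Alabama")]
    profiles = [
        TwitterProfile("jsmith", "Joe Smith", "Birmingham, Alabama"),
        TwitterProfile("hotelfan99", "Maria Lopez", "Boise, Idaho"),
    ]
    for member, profile, confidence in candidate_matches(members, profiles):
        print(f"{member.name} ~ @{profile.handle}: {confidence:.2f}")
```

In a setup like the one described, the candidates returned here would go to a human reviewer, and the confirmed or rejected pairs would feed back into the scoring. That learning loop is deliberately left out of this sketch.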

This is pretty interesting, and unlike a number of plausible-sounding “insights” that vendors think companies may get from big data, this case is actually real rather than a product of the fevered imagination of a software marketer. It neatly ties together master data management with social media in a way that is financially useful to a corporation. Although the example is isolated, it shows the potential that may exist for such software. As more and more companies dabble with Big Data, more and more real-life examples like this will emerge, and “making a business case” will cease to be the major obstacle to Big Data investments that it is now.