For several years I've been talking to Big Data companies trying to sell products and to IT executives trying to get their arms around the issues. Some interesting problems persist. It's clear we're still at the beginning of understanding this problem, and we're likely still a long way from realising the promise of using this information.
Companies such as Facebook and Google capture massive amounts of information. They generally get pounded for violating privacy, as neither they, nor we, can figure out what they are doing with this data. We assume they are using it against us, even though they very well may be trying to use it for our benefit.
Many of the issues that have historically revolved around large data repositories pertain to how you manage them. This chiefly means assuring that those who need access, for everything from management reports to compliance, can get the information they need when they need it. It also means assuring that data is stored safely. This has involved services from vendors such as Iron Mountain, where the data is often so safe that no one can figure out how to get it back out.
This speaks to the historic problem with managing data. We've treated it like pirate treasure, finding creative, inexpensive ways to bury it and coming up with equally creative excuses when we can't get to it in a timely manner, if at all.
The booty exists, we're sure of that - but we don't know exactly where, and the really old data is often so poorly indexed and stored that it seems like we'd have been better off if we hadn't stored it in the first place.
Emerging public cloud resources promise inexpensive storage with the higher likelihood of future accessibility. Haphazard piles of treasure have been stashed in neat little rows, and a friendly elf has replaced the fire-breathing dragon. The only trade-offs, of course, are security, governance and compliance.
As data grows and IT budgets shrink, though, these trade-offs seem less problematic - that is, until a hostile party gets access to data and publishes what it finds. We even came up with the risk manager job title, but this position collapsed with the financial market when it was discovered that the position was more about shifting blame than protecting assets.
Data analytics all about data access
Now we're realising that it isn't about Big Data at all. Instead, it's about analytics tied to that data and mobile access to the resulting reports. Executives increasingly find that, if they can get real information from the data they have collected, they can make better decisions, avoid painful repetitive mistakes and increase their stature in their company and industry.
Knowledge, as it turns out, is power. Today's successful executives therefore focus like a laser on knowing more about their customers, partners, employees and environment than their competitors. That's the way you win the corporate game.
This new-age executive uses tools tied to stronger data synchronisation. These assure both the accuracy and currency of the data being analysed. They offer mobile clients that can display the results on tablets and smartphones. They utilise cloud services that can address both the cost and security requirements placed on the enterprise.
Hadoop has emerged as the premier data analytics platform, and vendors are racing each other to provide the best tools for using it. However, these optimisations can fall short: vendors may spend more time developing collateral than optimising the entire solution, or they may unintentionally create bottlenecks when they rely on a partner.
Choose your data analytics solution wisely
In the end, as became clear when I heard from the CIO of President Obama's reelection analytics effort, the Big Data part of this isn't important. Providing the answers your executives need is what's important.
This may sound simple, but it requires a vendor that can meet the following criteria:
1. Has a great deal of experience with your company and industry.
2. Has a willingness to take on the entire solution.
3. Has a proven track record in meeting your expectations.
4. Has experience with public and private cloud resources.
5. Has the capacity to handle traditional data repositories and real-time data streaming.
In short, this isn't a do-it-yourself kind of problem. You need someone with experience, history, reliability and a reputation you can trust.
Only a handful of vendors will meet those criteria. Pick wisely.