CIOs have been fighting to keep the deluge of data at bay for years. But could they finally be losing the battle?

Analysts are predicting the worst. IDC says that there is already more data in the world than there is space to store it and that demand for storage capacity will, over the next three years, increase at a compound annual rate of nearly 50 per cent.

The figures are mind-boggling: according to a 2010 The Economist report, the US retail firm Wal-Mart “handles more than one million customer transactions every hour, feeding databases estimated at more than 2.5 petabytes — the equivalent of 167 times the number of books in America’s Library of Congress”.

CIOs themselves are concerned: three years ago Rick Chapman, CIO of Kindred Healthcare, told CIO magazine that the firm’s data volumes were increasing by 40 per cent a year, and that it was already dealing with 400 terabytes of data.
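
To put a 40 per cent annual growth rate in perspective, the short sketch below compounds a 400-terabyte starting point year on year. The three-year horizon and the function name project_capacity are illustrative assumptions, not figures supplied by Chapman.

```python
def project_capacity(start_tb: float, annual_growth: float, years: int) -> list[float]:
    """Return the projected data volume for year 0 through `years`,
    assuming the same compound growth rate applies every year."""
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]


# 400 TB growing at 40 per cent a year; the three-year horizon is an assumption.
projection = project_capacity(400, 0.40, 3)
for year, terabytes in enumerate(projection):
    print(f"year {year}: {terabytes:,.0f} TB")
```

On these assumptions the volume approaches 1,100 terabytes after three years, nearly triple the starting point.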

Only a few years ago these quantities would have been unimaginable. So what’s driving the change?

Much of it is down to the growth in enterprise data, including the customer transactions of the type stored by Wal-Mart. Such data provides organisations with rich information about customer behaviour.

Unstructured content is another culprit: emails, PDF files, Word documents, PowerPoint slides and voicemail messages have been around for a while, but now firms also have to deal with social media output, such as blogs, tweets and instant messages.

Aman Munglani, principal research analyst at Gartner, predicts that unstructured data will make up 80 per cent of the total available capacity by 2014.

The CIO Hitachi survey on the data pressures faced by CIOs confirms the growth trend: 87 per cent of respondents said that data storage capacity in their organisation had increased since last year, and that the biggest increase in data had come in enterprise applications, with email a close second.

Backup backlog
Finding storage space is a significant enough problem, but enterprises face three additional challenges. The first is that CIOs must back up enterprise application data and ensure it can be searched and retrieved.

Traditional tape backup has its limitations, however, when it comes to dealing with large quantities.

As Kindred Healthcare CIO Chapman explains, “It takes more time to back up and recover than we have time available in the datacentre overnight.”

Secondly, enterprise data must be kept private and secure. The damage to reputation for an organisation that fails to safeguard sensitive data can be significant, and there may also be financial penalties.

Zurich Insurance discovered the damage a poor storage strategy can do when it was fined £2,275,000 by the Financial Services Authority (FSA) for losing the personal details of 46,000 customers.

It’s not an issue confined to financial services organisations either: any organisation that does not safeguard data effectively can be subject to penalties under the Data Protection Act.

Keeping data secure is particularly hard when data is so easily copied onto laptops or memory sticks, and when systems are under attack from viruses and hackers.

Thirdly, the increasing pressure from regulation means that old enterprise data, whether structured or unstructured, must be retained for a fixed amount of time, and in a format that makes it easy to retrieve.

Requests for data could range from the FSA asking to see all of a bank’s emails between the CEO and CFO in a particular month five years ago to a member of the public asking a local authority to provide all documents relating to a particular planning application.

On the other hand, keeping every piece of data isn’t a realistic option, not only because of the space it consumes but also because the Data Protection Act requires that personal data is deleted after a certain amount of time.

Storage stresses
Between them, these three challenges (backing up data, keeping it secure and making it easily retrievable) add up to a headache that won’t go away.

As energy costs continue to rise, and as datacentre space becomes more expensive, CIOs are looking at new ways of addressing the storage problem.

Some are moving away from tape backups, for example, which are time-consuming to create, require physical security to transport and are relatively fragile.

Instead, they are using storage virtualisation technology, which compresses data more efficiently on disk and makes it easier to access.

Increasingly, CIOs are also turning to cloud solutions to manage data storage and archiving.

In the CIO Hitachi survey, 90 per cent of respondents were planning to move to a fully outsourced cloud service in future, while 62 per cent planned to opt for a part-hosted solution, archiving their data to a service provider. Only eight per cent of respondents said they would manage their data in-house in future.

Cloud solutions, or X-as-a-Service capabilities, allow service providers to capitalise on economies of scale to a greater degree than most enterprises can, and free those enterprises from having to invest in more and more storage.

Private clouds, which use technologies such as virtualisation to reduce costs and storage space, are another option for businesses that don’t yet feel ready for a full move to the public cloud.

The analyst firm Quocirca believes that an increasing number of businesses will opt for a hybrid cloud: a combination of private and public cloud services that will enable organisations to keep control of sensitive data while taking advantage of the scalability provided by the public cloud.

Deletion dilemmas
Although cloud and virtualisation technologies can ease some of the data management headaches CIOs face, they do not provide all the answers.

Businesses have to make difficult decisions about what data to retain and what to delete, based both on regulatory requirements and on business need.

Our survey suggests that not all organisations are fully confident of the effectiveness of their policies and procedures in this area.

One in five respondents said they either didn’t have, or were not sure whether they had, a documented procedure for the retention and disposal of documents and records.

Effective data management, however, requires businesses to understand which data has little business value and can be deleted and which needs to be retained, either for business or regulatory purposes or both.

Policies that can categorise different kinds of data and assign different strategies for dealing with them must play a crucial part in the decision about which technologies to use and how to use them.
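
As a rough illustration of what such a policy might look like once it is written down, the sketch below maps data categories to a retention period and a storage tier, then decides whether a given record should be kept or deleted. Every category name, retention period and tier shown here is a hypothetical placeholder; real values would have to come from an organisation’s own regulatory and business requirements.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    category: str       # kind of data the rule covers
    retain_days: int    # how long the data must be kept
    storage_tier: str   # where it lives while it is retained


# Hypothetical rules for illustration only; not regulatory guidance.
RULES = {
    "financial_record": RetentionRule("financial_record", 7 * 365, "archive"),
    "customer_personal": RetentionRule("customer_personal", 2 * 365, "secure_disk"),
    "general_email": RetentionRule("general_email", 365, "cloud_archive"),
}


def decide(category: str, created: date, today: date) -> str:
    """Return 'delete', 'retain:<tier>' or 'review' for one record."""
    rule = RULES.get(category)
    if rule is None:
        # Uncategorised data is flagged for a person to look at
        # rather than being silently kept or deleted.
        return "review"
    if today - created > timedelta(days=rule.retain_days):
        return "delete"
    return f"retain:{rule.storage_tier}"


print(decide("general_email", date(2010, 1, 1), date(2012, 1, 1)))
print(decide("financial_record", date(2010, 1, 1), date(2012, 1, 1)))
```

Sending uncategorised data for review, rather than defaulting to keep or delete, is a design choice made for this sketch: it reflects the point that retention decisions depend on understanding what each kind of data is worth.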