Is cloud storage the answer to preservation?
As the costs of long-term digital preservation climb, cloud storage doesn't look set to bring them down, reports Rebecca Pool
Read the latest blogs from David Rosenthal, and you could be left feeling quite uncomfortable, if not somewhat sick. The LOCKSS pioneer asserts that no-one has enough money to preserve even a fraction of the content worthy of preservation, and cloud storage, perceived by many as a cheap way out, isn't.
‘People have this casual assumption that if you keep something for a few years, you can afford to keep it forever,’ he told Research Information. ‘This is not a safe assumption going forward; hard drive costs have decreased rapidly but now this drop is slowing.’
And, as Rosenthal explained, the cost of cloud storage, a relatively new alternative, has barely changed since its inception. ‘[Cloud storage providers] are coining money out of storage services,’ he quipped.
So what's changed in the storage industry? Over the past 30 years, the cost of disk storage has dropped by around 30 to 40 per cent every year, a trend known as Kryder's Law, the hard disk drive analogue of Moore's Law. However, thanks to delays in the roll-out of heat-assisted magnetic recording - the successor to today's perpendicular magnetic recording - the hard disk drive industry dropped off the Kryder curve by mid-2011.
And, as Rosenthal highlighted, that was before the 2011 Thailand floods destroyed a massive chunk of the world's hard disk drive manufacturing capacity, increasing prices overnight. Today the market has bounced back, but the prices have not.
According to USA-based IHS iSuppli, hard disk drives remain the cheapest storage medium around, but prices won't dip below the pre-flood range until 2014. Time to visit the new wave of service providers, promising cheap cloud storage?
Glen Robinson, solutions architect at Amazon Web Services, believes businesses typically over-pay for data archiving, making expensive upfront payments for archives.
‘And since [storage providers] have to estimate capacity requirements, they understandably over-provision to make sure they have enough capacity for data redundancy and unexpected growth,’ he added. ‘This results in under-utilised capacity and wasted money.’
So now, several cloud storage businesses offer services, claiming customers will only pay for what they use. ‘This changes the game for data archiving and back-up. Customers pay nothing up front, pay a very low price for storage and can scale usage up and down as required,’ he said.
Given the promises, Rosenthal decided to run a LOCKSS box in Amazon Web Services' cloud, using the giant's S3 storage service. He recorded detailed costings and compared these with the costs of local disk storage.
‘Current cloud storage services are just not cost-competitive with local hardware for long-term storage, including LOCKSS boxes,’ Rosenthal concluded. ‘Over three years, running a median-sized LOCKSS box at Amazon would cost between six and 12 times the cost of buying the hardware. Yes, there are other costs such as power, cooling and storage but a factor of six to 12 is still a lot.’
What's more, after looking at overall price drops from several commercial providers, he noted that the organisations had, at the time, only reduced prices by up to 3 per cent every year, a fraction of the 30 to 40 per cent annual price drop seen in raw disk prices over the past 30 years.
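To see how far apart those two rates drift, the compounding can be sketched in a few lines of Python. This is an illustration only, not Rosenthal's model: the function name `price_after` and the 10-year horizon are assumptions made for the example, and the rates are the round figures quoted above.

```python
def price_after(years: int, annual_drop: float, start: float = 1.0) -> float:
    """Price left after `years` of compounding decline.

    `annual_drop` is the fractional price cut per year (0.03 means 3 per cent).
    `start` is the starting price; 1.0 gives the result as a fraction of it.
    """
    return start * (1.0 - annual_drop) ** years


if __name__ == "__main__":
    # Assumed horizon of 10 years, chosen only for illustration.
    horizon = 10
    scenarios = [
        ("cloud storage, 3 per cent a year", 0.03),
        ("raw disk, 30 per cent a year", 0.30),
        ("raw disk, 40 per cent a year", 0.40),
    ]
    for label, drop in scenarios:
        remaining = price_after(horizon, drop)
        print(f"{label}: {remaining:.1%} of the starting price after {horizon} years")
```

On these assumptions, a price cut by 3 per cent a year is still roughly three-quarters of its starting level after a decade, while one cut by 30 per cent a year has fallen to under 3 per cent of it.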
‘It's clear that the benefits of the decrease in raw storage prices are not going to cloud storage customers,’ asserted Rosenthal. ‘Amazon and its competitors should be riding the Kryder's Law curve like everyone else. But, pricing strategies have been to initially price products very attractively, capture the market and then not reduce the price.’
Still, Amazon's Robinson claimed that Amazon Web Services is ‘relentless’ about driving efficiencies and passing along the cost savings to the customer. ‘We've lowered our prices 24 times since launching our first service, with no competitive pressure to do so,’ he said.
What's more, the multi-national company recently introduced ‘Glacier’, a low-cost cloud storage service for the digital preservation market. ‘Use this if low-cost storage is paramount, your data is rarely retrieved and data retrieval times of several hours are acceptable,’ explained Robinson.
But while Rosenthal acknowledged that this stripped-down service has some excellent features – in return for accepting access latency the user pays only $0.01/GB/month – he asserted that the service is designed as a back-up for something you are storing elsewhere. The data may be very cheap to store, but in Rosenthal's words: ‘It can be very expensive to get at.’
‘The long-term competitiveness of any cloud storage services still depends on how closely the pricing tracks Kryder's Law, not on the initial pricing,’ he added. ‘Glacier has been priced aggressively... it isn't storage that's going to track Kryder's Law.’
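Rosenthal's point about initial pricing versus the long-run trend can be made concrete with a rough calculation. The sketch below is hypothetical: it takes Glacier's quoted $0.01/GB/month as the starting price, assumes 1TB is 1,000GB, assumes the price is cut once a year at a fixed rate, and ignores retrieval charges altogether. The function `cumulative_cost` and the 10-year horizon are invented for the example.

```python
def cumulative_cost(gb: float, price_per_gb_month: float,
                    annual_drop: float, years: int) -> float:
    """Total storage spend over `years`, with the price cut once a year.

    Assumes a fixed amount of data, no retrieval fees and a constant
    fractional price cut (`annual_drop`) applied at the start of each year
    after the first.
    """
    total = 0.0
    for year in range(years):
        # Price in force during this year, after `year` annual cuts.
        price = price_per_gb_month * (1.0 - annual_drop) ** year
        total += gb * price * 12
    return total


if __name__ == "__main__":
    one_tb = 1000  # GB, assuming decimal terabytes
    start_price = 0.01  # $/GB/month, the Glacier figure quoted in the article
    for label, drop in [("price never falls", 0.0),
                        ("3 per cent a year", 0.03),
                        ("30 per cent a year", 0.30)]:
        cost = cumulative_cost(one_tb, start_price, drop, years=10)
        print(f"{label}: ${cost:,.2f} to keep 1TB for 10 years")
```

Under these assumptions the same terabyte costs about $1,200 over a decade if the price never moves, but less than $400 if it follows a 30 per cent annual decline, which is why the trend matters more than the launch price.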
So, given that the disk industry is failing to maintain its historical price decreases and cloud storage costs don't look set to come down soon, does a viable business model even exist for long-term digital data preservation? As Rosenthal pointed out, the economics of digital preservation are very difficult, full stop; most institutions are funded on a yearly budget cycle, making long-term investment in storage problematic.
‘Even government institutions such as the National Archives and Records Administration, and the national libraries are struggling with the costs,’ he said. ‘Right now, I'm building a model so we can understand this... it turned out to be a much bigger problem than I thought it would be.’