Keeping data safe

Siân Harris checks out the latest OCLC archive service

‘It’s about as exciting as a safe deposit box in a bank but it’s as secure as one too,’ joked Greg Zick as he described the new archiving service from the non-profit library co-operative OCLC.

OCLC is best known for its WorldCat global catalogue and its library management systems. Zick, who is vice president of OCLC Digital Collection Services, sees Digital Archive as a logical addition to these activities.

‘Over the last few years there has been quite a lot of discussion about archiving. There is great interest and a growing need around the world for a long-term solution to digital archiving,’ he said. ‘All libraries are going to need some sort of archive. Our aim is to provide an option for libraries for the long haul.’

He believes that one of the compelling reasons for libraries to use Digital Archive is that it is at OCLC. ‘The archive is behind the same security as WorldCat,’ he pointed out. OCLC’s limited-access operations facility is secured by a badge reader system and monitored 24/7 by systems operators, security guards and CCTV cameras.

The same level of precaution is taken inside too. OCLC has a network security team. The archive uses multiple RAID disk drives locally, and tape backups are made regularly. At any point in time there are multiple copies of the content of the Digital Archive in offsite facilities and one copy onsite. The lifecycle of the disk storage devices is also monitored and the tape media are regularly replaced.

Behind all this security, Zick sees OCLC’s approach to dealing with the data as very straightforward. ‘The shared archive is similar to a safe deposit box in a bank. Users have their own accounts and directories,’ he said. ‘Our commitment is to return to you whatever you sent us.’ This does not mean ignoring what the data is, however. When a file arrives, OCLC checks it for viruses, confirms that it matches what is listed in the shipping manifest, and verifies that it is in the format indicated by its file extension. OCLC also regularly runs integrity reports on the stored files to check that there are no problems with them.
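The manifest and format checks described above can be illustrated with a minimal sketch. This is not OCLC's implementation: the article does not say how the shipping manifest is structured, so the sketch assumes a hypothetical manifest that maps each file name to a SHA-256 checksum, and it tests only the leading signature bytes of TIFF and PDF files. The names `MAGIC`, `sha256_of` and `check_deposit` are invented for illustration, and virus scanning is left out.

```python
import hashlib
from pathlib import Path

# Leading signature bytes for the two formats the article mentions.
# Other extensions are simply not format-checked in this sketch.
MAGIC = {
    ".pdf": (b"%PDF-",),
    ".tif": (b"II*\x00", b"MM\x00*"),
    ".tiff": (b"II*\x00", b"MM\x00*"),
}


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_deposit(directory, manifest):
    """Compare a deposit directory with a manifest.

    `manifest` is an assumed structure: {file name: expected SHA-256 hex digest}.
    Returns a list of (file name, problem) pairs; an empty list means
    every check passed.
    """
    directory = Path(directory)
    problems = []
    for name, expected in manifest.items():
        path = directory / name
        if not path.is_file():
            problems.append((name, "missing"))
            continue
        if sha256_of(path) != expected:
            problems.append((name, "checksum mismatch"))
        signatures = MAGIC.get(path.suffix.lower())
        if signatures is not None:
            with open(path, "rb") as fh:
                head = fh.read(8)
            if not head.startswith(signatures):
                problems.append((name, "format does not match extension"))
    # Files that arrived but were never listed in the manifest.
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.name not in manifest:
            problems.append((path.name, "not in manifest"))
    return problems
```

Running the same function again later against the stored checksums would give a simple integrity report of the kind the article mentions, since any file that has changed on disk would show up as a checksum mismatch.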

The content of the archive is primarily archival TIFF files and PDFs, generated when materials are scanned and digitised. However, Zick said that the archive is independent of what the documents are. ‘They could be AutoCAD files or PowerPoint, for example,’ he said. ‘We’ve seen some government agencies deposit websites.’

Keeping the format

However, OCLC’s Digital Archive service does not change any of the files or their formats. OCLC believes that it is best to keep content in the same format as it was sent to the archive, so that users are prepared for future developments in digital preservation. ‘Some archives suggest that they would migrate the file formats forward but that is very problematic,’ said Zick. ‘Our regular users’ reports remind libraries what they put in.’

However, Zick did suggest that OCLC will continue to work with the library community to develop best practices. As these are standardised, OCLC could also develop migration scripts for selected file formats to be shared by the community. This will enable users to have control over their files. They can develop their own update policies or choose to use the OCLC scripts in the future.

OCLC makes a clear distinction between its archive service and its access tools. However, the archive does integrate with other tools such as OCLC’s CONTENTdm. ‘We want to give users a full set of integrated solutions but we give them the flexibility to use other systems in their libraries too,’ said Zick.

This is not the first, or only, archiving initiative for the industry – but Zick believes it has some unique characteristics. ‘We are focused primarily on serving libraries. As the libraries digitise more primary source material, they begin to be a local publisher with a growing need for archiving this valuable and unique content,’ he pointed out. ‘As a library co-operative, our mission is to develop services for these new activities while reducing the costs for all participating libraries.’

It is still early days – the service was only launched in April this year – and archiving agreements tend to take a while as they are long-term decisions. However, Zick said that the response has been very positive, especially from government and academic libraries.