Saving for the future

Neil Grindley of JISC describes the importance of preserving digital information and some of the major projects that are helping with this

We all know how quickly modern technologies become obsolete. Think (with nostalgia) about the piles of abandoned floppy disks that new laptops no longer read, the films on Betamax video cassettes their owners are unable to watch, or the CD Walkman. Next, imagine the very real possibility that, unless strategies are put in place now, most of the digital data produced today may not be usefully accessed by the next generation due to technological obsolescence, data loss and other concerns.

Digital preservation (DP) is now ranked by the UK’s Office of Science and Technology as a top concern for the health of the nation’s e-infrastructure. Securing long-term and sustainable access to today’s data through preservation is the huge task that many international, national and institutional bodies are now addressing.

Threats and risks to digital data

The threats to digital data range from the human to the technical. On the societal side, the staff who act as guardians of an institution’s data are often not given sufficient motivation to think ahead. It is also difficult to document case studies concerning issues such as a serious data loss as organisations are disinclined to publicise their mistakes.

The withdrawal of funding and general institutional instability are also real concerns. Strategic priorities change as organisations merge and fold, and whole data sets can become ‘orphaned’ as the interest and incentive to deal with them move on. The UK’s Joint Information Systems Committee (JISC) has found that many organisations lack a formal policy on DP and data management. We believe that ownership of the issue should be taken more seriously within organisations.

On the technological side, storage media can simply decay or malfunction, a process known as bit rot, which can lead to serious losses of information. This can affect digital data to varying extents, from single character omission, to scrambling of information, to catastrophic loss of data.
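One common way to detect this kind of silent damage is a fixity check: record a checksum when a file is deposited and recompute it later. The following is a minimal sketch in Python, assuming SHA-256 as the checksum; the function names and the sample record are illustrative and are not taken from any JISC tool.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """Compare the current checksum with the one recorded at deposit."""
    return sha256_digest(data) == recorded_digest


# Hypothetical record, checksummed when it enters the archive.
original = b"Minutes of the 1998 board meeting"
recorded = sha256_digest(original)

# Simulate bit rot by flipping a single bit in the first byte.
damaged = bytearray(original)
damaged[0] ^= 0x01
damaged = bytes(damaged)

print(is_intact(original, recorded))
print(is_intact(damaged, recorded))
```

A check like this only reveals that something has changed; recovering the data still depends on having another good copy.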

Obsolescence of hardware or software is an obvious issue too. This can be tackled by moving data to newer formats or software versions while the data is still accessible (migration), or by attempting to replicate the data’s original environment (emulation). Both options come with inherent dangers, such as loss of quality during successive migrations across formats or between technology platforms. For example, engineers continuing long-term building projects are finding that updated software packages simply do not recognise original CAD design files from older versions of the same program.
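A very small-scale illustration of migration is moving text from a legacy character encoding to a current one and then confirming that the content survived. The sketch below assumes the legacy file is in Windows-1252 and the target is UTF-8; both the encodings and the function names are assumptions chosen for illustration, and real migrations (of CAD files, for instance) are far more involved.

```python
def migrate_text(raw: bytes, old_encoding: str = "cp1252") -> bytes:
    """Decode bytes from the legacy encoding and re-encode as UTF-8."""
    return raw.decode(old_encoding).encode("utf-8")


def content_preserved(raw: bytes, migrated: bytes,
                      old_encoding: str = "cp1252") -> bool:
    """Check that the text reads the same before and after migration."""
    return raw.decode(old_encoding) == migrated.decode("utf-8")


# Hypothetical legacy file contents containing an accented character.
legacy = "Caf\u00e9 plan, revision 3".encode("cp1252")
migrated = migrate_text(legacy)

# The bytes differ after migration, but the text should be identical.
print(migrated != legacy)
print(content_preserved(legacy, migrated))
```

The verification step is the important part: checking after each migration that the properties that matter are unchanged is one way to guard against gradual loss over successive conversions.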

To minimise such threats, JISC is working with government, private and education sectors, as well as other players on the international stage, to increase the take-up of DP strategy and policy, safeguarding long-term access for all.

What large-scale DP projects exist?

There are many innovative DP initiatives underway, several of which receive funding from JISC. The UK LOCKSS Alliance (Lots Of Copies Keep Stuff Safe) addresses electronic scholarly journals. As its name suggests, LOCKSS helps institutions retain access to their data assets – in this case archived e-journals – by storing copies on several external servers. This safeguard encourages educational institutions to engage with the issue and gives them the confidence to progress the move from print to more easily searchable e-journals. By being part of the collaborative UK LOCKSS Alliance, the risks to individual institutions of online storage are removed. In layman’s terms, LOCKSS is the closest that libraries can get to doing digitally what they used to do with actual journals; they get to keep a copy. With copyright remaining a worry for some parties, CLOCKSS (C for ‘Controlled’) is a similar venture between the world’s leading scholarly publishers and research libraries.
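The ‘lots of copies’ idea can be sketched in a few lines: if several independent copies exist, a damaged one can be spotted and repaired by comparing it with the others. The Python below is a simplified illustration only, assuming a plain majority vote over SHA-256 checksums; it is not the protocol used by the LOCKSS software, and the function name is hypothetical.

```python
import hashlib
from collections import Counter


def repair_by_majority(copies: list[bytes]) -> list[bytes]:
    """Replace every copy with the version held by a strict majority."""
    digests = [hashlib.sha256(c).hexdigest() for c in copies]
    winner, votes = Counter(digests).most_common(1)[0]
    if votes <= len(copies) // 2:
        # Without a strict majority we cannot tell which copy is right.
        raise ValueError("no majority among copies")
    good = copies[digests.index(winner)]
    return [good for _ in copies]


# Three hypothetical copies of an article; the third has been corrupted.
copies = [b"article text", b"article text", b"artic1e text"]
repaired = repair_by_majority(copies)
print(repaired)
```

The sketch also shows why the number of copies matters: with only two copies that disagree, there is no way to decide which one to trust.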

A second important JISC initiative is the LIFE2 (Lifecycle Information for e-Literature) project, which is refining a methodology for the analysis and costing of digital-object lifecycles. JISC also provides capital funding in other areas of DP. These include the continued access to digital data held in repositories, and developing a methodology for determining the significant properties of different classes of digital objects. Is content or context more important, for example?

Internationally, the European Commission funds several large collaborative DP projects including the PLANETS programme (Preservation and Long-term Access through NETworked Services) and CASPAR (Cultural, Artistic and Scientific Knowledge for Preservation, Access and Retrieval).

The provision of a shared-service infrastructure for preservation is also a hot topic. Two JISC projects are currently focussing on aspects of that, namely SHERPA DP2 and PRESERV2. But, before such infrastructure can be created, the building blocks of policy need to be laid. A new report on the subject of Digital Preservation Policies has just been published, the recommendations of which should be of interest to a variety of organisations with an interest in or a need to undertake DP.

The Digital Curation Centre (a consortial organisation led by the University of Edinburgh) is a very important JISC-funded initiative. It is tasked with raising awareness and building capacity across the academic sector to undertake DP (which it refers to as ‘digital curation’). With regular conferences, workshops and activities, the Digital Curation Centre is forging ahead in providing models and examples of digital data auditing, archiving and preservation. We hope that many institutions will follow its forward-thinking lead.

Digitisation and auditing

JISC is also funding a significant programme of digitisation work that must be done prior to DP if the data for preservation was not ‘born digital’. JISC is a founding member of the UK Web Archiving Consortium (UKWAC), and its PoWR project (Preservation of Web Resources) is promoting ways for institutions to identify and archive ‘web born’ digital materials that may prove to be of long-term use to those organisations.

Before an institution can structure a preservation programme for its data assets, an audit is essential. In partnership with the Digital Curation Centre, JISC has helped to develop two new tools for such auditing. These are the Data Audit Framework and the DRAMBORA (Digital Repository Audit Method Based On Risk Assessment) tool. By treating digital data like any physical asset, these tools aim to improve the management and preservation of such data within institutions, as well as to raise the profiles of those entrusted with its preservation.

The authenticity issue

‘What to preserve?’ is another challenge facing institutions. Who decides what to save, and can we predict now what might be needed in the future? This is where records management processes can help. Effective methods of appraisal for digital information allow strategic disposal of obsolete data to take place. This eases the burden in an age where the creation of digital information constantly threatens to outstrip the means to safely store and render it accessible.

Authenticity (whether data has been corrupted or not during its lifecycle) and the ‘significant properties’ of data should also be kept in mind. Sometimes the content of a given file might be important (perhaps the straightforward words that were spoken). In this case, the data can be migrated to newer technologies for preservation. In other instances, context might be paramount. If this is the case, the manner in which the original data was created and presented should be preserved for the digital object to remain accessible and meaningful.

JISC is funding InSPECT (Investigating the Significant Properties of Electronic Content over Time), a project run by The National Archives and the Arts and Humanities Data Service. Its purpose is to develop the concept of significant properties to support preservation activities. Preservation, for historic properties, tends to mean that things are left untouched, protected in their original state. But where digital data are concerned, unchanged quickly translates into inaccessible. This is our challenge!

International copyright

Some of the major challenges of DP will be most successfully tackled by national and international collaboration because many of its problems are common across institutions, disciplines and geographic regions. A classic example is copyright. The UK’s copyright law, for example, is more than 20 years old. JISC has recently funded an International Copyright Law Study, with partners such as the USA’s Library of Congress, to examine different approaches to legislation regarding creating and storing preservation versions of digital materials. This is with a view to prompting reform so that copyright law can ‘catch up’ with the digital age.

Two decades ago nobody could have foreseen how quickly technology would change, nor predicted the massive shifts from print to online materials and communication. JISC is attempting to scan tomorrow’s horizons today, developing policies and schemes so that digital preservation achieves its rightful importance globally, nationally and institutionally. Our aim, complementing that of our international peers, is that future generations may continue to enjoy unhindered access to the wealth of digital materials being created today. As Dame Lynne Brindley, CEO of the British Library, says, ‘Collaborative development is clearly the way forward. We don’t want to develop any more than we have to on our own. The future can’t be in too few hands.’

Neil Grindley coordinates JISC’s digital preservation programme.

JISC and its partners in preservation

The UK’s Joint Information Systems Committee (JISC) supports the UK education sector by researching, developing and providing time- and money-saving ICT resources. This sector is heavily dependent on digital resources from sources such as libraries, archives, publishers and government. In addition to creating and licensing electronic content for education institutions, one of JISC’s key concerns is affordable preservation. Its Digital Preservation and Records Management Programme funds research and manages projects so that today’s digital materials retain their value and accessibility for the future.

JISC’s work is often carried out in collaboration with key partners. These include the Digital Curation Centre (DCC), the British Library, the Digital Preservation Coalition (DPC), The National Archives and the national libraries of Scotland and Wales. Internationally, JISC also works collaboratively with the Library of Congress in the USA, the SURF Foundation in the Netherlands and key players in Australia and New Zealand.