Digitisation partners optimistic despite copyright issues

Tom Wilkie reports from London Book Fair 2011 on Google's book digitisation programme and the impact of the recent US Court decision

Last year, absence was one of the striking features of the London Book Fair – many attendees and exhibitors had difficulty getting there due to the Icelandic volcanic eruptions. This year visitors thronged the halls and the exhibition itself, in London from 11 to 13 April, seemed larger than in previous years. But there were still absences, albeit of a different kind.

During a seminar on ‘Big-picture digitisation initiatives’, there was no mention of the US Court decision, just a couple of weeks earlier, effectively halting Google’s digitisation project for books still in copyright. And, in response to a question from the floor, the speakers displayed considerable confidence and very little concern over the ‘future-proofing’ of the formats in which Google and other projects were digitising books.

In late March, US Federal Judge Denny Chin rejected a deal between Google and the US national trade organisations representing American authors and publishers, which would have allowed Google to digitise books still in copyright.

However, according to Ben Bunnell, from Google Books Library Partnerships, the project is already large and still growing. He pointed out that more than 15 million books, representing four billion pages and two trillion words, had already been digitised.

Within Europe, some 32 per cent of the digitised content has come from the Bavarian State Library while 28 per cent has come from the Bodleian in Oxford. The Austrian, Dutch, Italian, and Czech state libraries had recently joined the project, he added. However, in Europe, no 20th-century books are being digitised, for reasons of copyright.

Klingon – the entirely fictitious language of the entirely fictitious alien race featured in the Star Trek movies and television series – is among the 478 languages currently represented in the digitised content, which includes three books written in it. More than 51 per cent of the content is in languages other than English.

The digitisation project helps libraries, such as the Bodleian in Oxford, to widen access to their collections and thus fulfils the original aims of the library’s founder – to create a ‘Republic of Letters’ – according to Michael Popham, head of the Oxford Digital Library at the Bodleian. More than 60 per cent of the library’s users were not members of Oxford University, he said, and the figure rose to 95 per cent when counted by access from internet addresses other than ox.ac.uk.

For a significant proportion of the publications from the Bodleian, requests for digitisation arise not simply because of the text but also because of graphical factors. Thus maps and early manuscripts have been digitised and, in one case, the interest was in bindings and coverings, Popham said. Only 19th-century and earlier texts have been digitised, he continued, as these are out of copyright and generally out of print.

Partnering with Google had some aspects to which an academic library was relatively unaccustomed, he remarked. For example, the project was governed by a formal contract and non-disclosure agreement. Google undertook to host the archive for 20 years, which raised a few eyebrows amongst academics whose timescales were measured in centuries rather than decades – conditioned no doubt by the fact that the foundation of the Bodleian dates back to 1602.

According to Google’s Bunnell, long-term preservation of the digitised archive ‘isn’t something that has been a source of concern to us’. He pointed out that Google was using open standards, so he did not expect the project to suffer the kind of obsolescence problems that had befallen those who kept documents in proprietary formats – for example, the difficulty in recovering work stored in early versions of Microsoft Word. Popham, too, said that open standards were the key to the problem and meant that the library had to manage only a limited number of formats.