Web 2.0 ideas enrich Europe's digital heritage

Later this year, the first prototype of Europe's digital library network will launch. Jonathan Purday takes a look at what's going on behind the scenes.

The heritage organisations of Europe have digitised tens of millions of significant pictures, films, books, photographs, sounds, newspapers, manuscripts and archival records in recent years.

This is a potentially exciting development for researchers, but in practice it can be hard to find these resources. While they may feature in the catalogues and databases of individual institutions, they are deep web content and are not always individually identified by search engines.

It is particularly difficult to find related materials held in different countries, but a new Europe-wide project is setting out to change this. The Europeana portal will give direct access to the digitised resources of libraries, museums, archives and audio-visual (AV) collections across Europe. Viviane Reding, European Commissioner for Information Society and Media, will launch its first public prototype on 20 November 2008 at a meeting of the Council of Ministers in Brussels.

Once it has been launched, researchers will be able to find all the images of, for example, Rubens’ paintings in Paris, London, Madrid and Brussels on a single results page. His drawings, sketchbooks, accounts and personal papers will be there too, alongside books and films about his work.

People will be able to follow the ripples of influence as ideas such as the Reformation, the Enlightenment, Socialism or movements like Art Nouveau or Modernism spread across Europe. The growth of Europe’s modern cities – the trams and metros, electricity and sewerage – can be tracked in blueprints, adverts, magazines and newsreels held in the heritage collections across the continent.

Interoperable to the core

Interoperability is at the heart of what Europeana is doing: integrating format types across borders, across domains and between institutions. Museums, libraries, archives and AV collections have different histories, user-groups and purposes. These are reflected in their diverse approaches to cataloguing and the development of varying standards. The result is that delivering all content types in the same online space requires a commitment to working collaboratively and sharing knowledge across long-established professional boundaries.

Europeana’s conference in June 2008 in The Hague, the Netherlands, was titled ‘Users expect the interoperable’. The topic was the focus of keynotes and panel sessions. Speakers addressed contemporary users’ expectations, discussed the technologies that are enabling interoperability and gave practical examples, particularly in the archive and museum domains.

The main feature of the conference for the 160 delegates – archivists, librarians, curators, web developers, technical experts and policymakers – was a preview of prototype 1 of Europeana. The delegates were asked to give technical and usability feedback to guide us to prototype 2 in September and on to the public prototype launch in November.

European endorsement

November will be the culmination of two years of activity. The starting point of the Europeana project was a recommendation to create a direct access point to Europe’s cultural heritage made by the European Commission in August 2006. This was strongly endorsed by the Council of Ministers in November 2006, followed in September 2007 by a vote in the European Parliament which overwhelmingly adopted the Commission’s plan.

The original stimulus some years earlier had been Google’s programme to digitise the printed word, and its announcement of partnerships with a range of major US and UK libraries. To some extent it was felt in Europe that the project would be skewed towards Anglophone content and that a complementary approach should be taken to digitise the European intellectual tradition in its original languages. There was also concern that the large-scale digitisation plans of Google and Microsoft would transfer a significant amount of public domain intellectual resource into the private sector, and that equivalent European programmes therefore ought to be conceived of as broadly open access, open source and non-exclusive.

From these ideas came a broader ambition – to create a space in which all manifestations of Europe’s cultural and scientific heritage could be connected and integrated within a single portal, in a multilingual environment.

In part, the idea took shape because technology now enabled it. To a greater degree, however, it was born of a sudden leap in user expectations. Anybody who was using web 2.0 sites was used to being able to watch video, listen to audio, see images or read text in the same space.

Archive, library or museum?

In the cultural heritage field, this expectation was beginning to be met by the crossdomain sites that were being pioneered in a national context by the UK’s Museums, Libraries and Archives Council’s Discover site, the French Culture Ministry’s culture.fr site and the German BAM [bibliotheken, archiven, museen] portal.

Europeana.eu, in partnership with these portals, has set out to consolidate that work and extend it across national boundaries. Cross-domain portals meet evolving user expectations while at the same time removing their need to know or understand the arbitrary historic development of collections. They no longer have to make decisions about what type of institution would hold the material of interest. Is the Domesday Book (the national survey of England in 1086) in the British Library or the UK’s National Archives?

Which museum or art gallery has the best collection of a particular artist’s paintings? Should Chopin’s manuscripts be in Paris or Warsaw? The location of material will no longer be an impediment to access. There’s a sense that in order to remain relevant to new generations of users, heritage organisations need to position themselves strongly on the web, presenting their content in ways that people want. This is not just about creating portals that are cross-domain, cross-border and multi-format – it also means giving users the opportunity to customise their interaction in ways that are familiar from social networking sites. To this end, features include My Europeana and Communities, where people can work on projects together or create interest groups around subjects.

User-generated content

Europeana will give people the opportunity to tag books, paintings, films and sounds. There will also be specific fields in which people can add terms to the metadata that seem relevant to them. This social tagging complements the structured and standardised metadata. It creates new ways into collections and new ways of seeing them, based around natural language rather than structured vocabulary, and is more in tune with people’s perceptions, current interests and experiences.
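One way to picture how social tags can sit alongside the institution's structured metadata without altering it is sketched below in Python. This is an illustration only, not Europeana's data model: the class name `CatalogueItem`, its fields and the tag-cleaning rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueItem:
    """Hypothetical record: curated metadata plus free-text user tags."""

    item_id: str
    curated: dict  # structured metadata supplied by the holding institution
    user_tags: dict = field(default_factory=dict)  # tag -> set of user ids

    def add_tag(self, user_id: str, tag: str) -> None:
        """Store a natural-language tag without touching the curated fields."""
        # Lower-case and collapse whitespace so near-identical tags merge.
        cleaned = " ".join(tag.lower().split())
        if cleaned:
            self.user_tags.setdefault(cleaned, set()).add(user_id)

    def tag_counts(self) -> dict:
        """Number of distinct users who applied each tag."""
        return {tag: len(users) for tag, users in self.user_tags.items()}
```

Keeping the two kinds of description in separate fields means a search could draw on both, while the display can still show which terms came from the institution and which from users.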

Ultimately, beyond the Europeana prototype, we want to enable people to generate content themselves in the form of commentary on and responses to the material available. Wikis and blogs give us a model of engagement that is compelling. They have the potential to create interesting new relationships between those who experience the cultural content and those who have traditionally mediated it – the curators, archivists and librarians.

User-generated content offers a rich source of context and interpretation. This will be important because one of the drawbacks of providing access to millions of digitised items is that each will be delivered with bare metadata, devoid of context. This lack of interpretation can limit the use of an illustration, a museum object or a page of newsprint. This is a key difference between information which is ‘raw’ – i.e. offered without curatorial intervention – and ‘cooked’, i.e. that has been packaged and enriched with description.

Although at first the objects will be raw – and this may well inhibit early take-up of the site – the basic materials need to be there before the commentary can be added. There will be issues of accreditation, trust and authenticity with this approach, and we will need to balance our keenness to bring in user-generated content with the need to maintain a crucial value of the Europeana brand – trust. Because the core content comes from the most reliable sources, it is authenticated and utterly trustworthy. If the painter is given as Rembrandt, with title, place and date, or a book is described as ‘First Edition, printed London 1832’, we can be certain that scholarly accuracy has been brought to bear in making these attributions.

Richer metadata

One related issue that we are dealing with is that the metadata is sometimes minimal. In some cases, digitisation programmes have clearly put their resources into creating high-resolution files, but have been unable to devote sufficient expertise to creating substantial metadata. So author or maker, title, place and date of creation may be all the metadata available. Inevitably, the lack of information lowers an item’s chance of being found by a search.

However, there are authority files, thesauri and ontologies that we want to link to, thereby enriching the metadata. These include Iconclass and the Getty art thesauri. The Thesaurus of Geographic Names amplifies place names given in the metadata, giving the historic names as well as names in other languages. For example, Gdańsk in Polish is Danzig in German; Gutiskandja in Gothic; and in Latin, variously Gedania, Gedanum and Dantiscum. It also gives global coordinates, so that a place can be linked to a clickable map like Google Earth. Europeana will also trial systems such as Cliopatria, in order to provide a visualisation of semantic relationships – that is, to give users a diagram of how their results are linked to related ideas.
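The place-name enrichment described above can be sketched in a few lines of Python. The authority record below is hand-made from the name variants given in this article; a real record would come from a thesaurus, and the function names (`normalise`, `build_variant_index`, `enrich_place`) and the `place` field are assumptions for illustration, not part of any Europeana or Getty interface.

```python
import unicodedata

# Hand-made stand-in for a thesaurus entry (assumption for this sketch).
PLACE_AUTHORITY = [
    {
        "preferred": "Gdańsk",
        "variants": ["Gdańsk", "Danzig", "Gutiskandja",
                     "Gedania", "Gedanum", "Dantiscum"],
    },
]


def normalise(name: str) -> str:
    """Lower-case a name and strip diacritics, so 'Gdańsk' matches 'gdansk'."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch)).lower()


def build_variant_index(authority: list) -> dict:
    """Map every normalised variant to its authority record."""
    index = {}
    for record in authority:
        for variant in record["variants"]:
            index[normalise(variant)] = record
    return index


def enrich_place(metadata: dict, index: dict) -> dict:
    """Return a copy of the metadata with known place-name variants added."""
    enriched = dict(metadata)
    record = index.get(normalise(metadata.get("place", "")))
    if record:
        enriched["place_variants"] = list(record["variants"])
    return enriched
```

With the variants attached, an item catalogued under ‘Danzig’ could also be found by someone searching for ‘Gdańsk’, which is the practical gain of linking sparse metadata to an authority file.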

Our metadata standard builds on the Dublin Core standardisation efforts. The metadata that we are supplied with is created using different standards and local variants, and we are compiling guidelines to help contributors make their metadata Europeana-compliant.

This usually just means mapping the institution’s metadata to Europeana’s broad schema. A further step is to enhance the findability of objects within the portal – objects that may be distributed between collections and across Europe. We hope to establish relationships between objects in ways not possible before, using models such as OAI-ORE and CIDOC-CRM.
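As a rough illustration of what such a mapping involves, the Python sketch below renames an institution's local field names to Dublin Core element names. The local field names (`maker`, `object_title` and so on) are invented for the example, and the actual Europeana schema is not reproduced here; only the Dublin Core element names `creator`, `title`, `date` and `subject` are taken from the standard.

```python
# Hypothetical mapping from one institution's field names to Dublin Core
# elements. The left-hand names are assumptions for this sketch.
FIELD_MAP = {
    "maker": "creator",
    "author": "creator",
    "object_title": "title",
    "date_made": "date",
    "keywords": "subject",
}


def map_to_dublin_core(local_record: dict, field_map: dict = FIELD_MAP):
    """Split a local record into mapped Dublin Core values and leftovers."""
    mapped = {}
    unmapped = {}
    for field_name, value in local_record.items():
        target = field_map.get(field_name)
        if target is None:
            # Keep what could not be mapped so a cataloguer can review it.
            unmapped[field_name] = value
        else:
            # Dublin Core elements are repeatable, so collect values in lists.
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped
```

Returning the unmapped fields separately, rather than discarding them, makes it easier to see where a contributor's local variant needs a new rule.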

Mapping digital resources

Over the past few months we have been gathering information about the digital assets held by libraries, museums, archives and AV collections across Europe, to discover information such as collection types, formats and metadata standards. We have also been keeping in touch with the work that the Numeric project has been doing to quantify the scale of digitisation in Europe. A survey on our website invites organisations to give information on available digital collections.

The survey helps us select contributors to the first prototypes. We need to work with different formats and across borders and domains to ensure that the prototypes are representative.

We then harvest the contributor’s metadata using the OAI-PMH protocol. We also harvest a uniform resource identifier (URI). This is a link to the digital object, which must remain true – or persistent – over time. We require this URI in order to link to the object on the holding institution’s website.
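For readers unfamiliar with OAI-PMH, the sketch below shows the general shape of a harvest in Python: building a `ListRecords` request and pulling the title and URI out of an `oai_dc` response. It is not Europeana's harvester. The base URL and the sample response are made up for illustration, the code makes no network call, and it assumes the contributor exposes the link to the digital object in `dc:identifier`.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:  # deleted records carry a header but no metadata
            continue
        records.append({
            "oai_id": rec.findtext("oai:header/oai:identifier",
                                   default="", namespaces=NS),
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            # Assumption: the object's URI is given in dc:identifier.
            "uri": dc.findtext("dc:identifier", default="", namespaces=NS),
        })
    token = root.findtext("oai:ListRecords/oai:resumptionToken",
                          default="", namespaces=NS)
    return records, token or None


# Made-up response from a hypothetical contributor at example.org.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:creator>Example maker</dc:creator>
          <dc:identifier>http://example.org/objects/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

records, next_token = parse_records(SAMPLE_RESPONSE)
```

A full harvest would repeat the request with each returned resumption token until none is given, which is how the protocol delivers large collections in batches.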

The URI is needed because Europeana works by creating a central index of all harvested metadata. Users search the index and are shown results with a thumbnail image, sample, title page or other representation that gives them an idea of the full digital object. When users want to go to the full object, they click the URI and are taken to the full-size file on the website of the holding institution.

This approach means that Europeana does not have to take up vast server space duplicating millions of large, high-resolution files. It also means that no holding institution has to part with its hi-res files, so no issues of ownership come up. In addition, it means that the branding of the owning institution is visible at the point at which the user views the image or the movie, reads the digitised book or plays the audio file.
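The central-index idea can be illustrated with a toy in-memory index in Python. This is a simplified sketch, not Europeana's search technology: the class `CentralIndex`, its methods and the example.org addresses are assumptions. What it shows is that the index holds only the searchable text, a thumbnail address and the URI, while the full-size file stays with the holding institution.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Split text into lower-case word tokens."""
    return re.findall(r"\w+", text.lower())


class CentralIndex:
    """Toy index: searchable text plus links, never the full-size files."""

    def __init__(self):
        self.records = {}                 # record id -> display fields
        self.postings = defaultdict(set)  # token -> ids of matching records

    def add(self, record_id, title, thumbnail, uri):
        self.records[record_id] = {
            "title": title,
            "thumbnail": thumbnail,  # small preview shown in the results
            "uri": uri,              # link to the object at the institution
        }
        for token in tokenize(title):
            self.postings[token].add(record_id)

    def search(self, query):
        """Return records whose title contains every word of the query."""
        tokens = tokenize(query)
        if not tokens:
            return []
        ids = set.intersection(
            *(self.postings.get(token, set()) for token in tokens)
        )
        return [self.records[record_id] for record_id in sorted(ids)]


index = CentralIndex()
index.add("a", "Harbour view at dusk",
          "http://example.org/thumbs/a.jpg", "http://example.org/objects/a")
hits = index.search("harbour view")
```

Each hit carries the `uri` that takes the user on to the holding institution's own site, so the index itself stays small.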

[Image: what Europeana’s homepage might look like]

Access to two million items

When the Europeana public prototype launches in November, it will have direct links to at least two million digital objects from across the four cultural heritage domains, and representing as many European countries as possible. By 2010, we expect to launch a full-service version 1.0 of Europeana, with access to at least six million items.

The interface of the November prototype will be in English, French and German. We’ll add more interface languages in subsequent releases, and by the full-service version it will be possible to perform the actual searches in other languages. For multilingual searches, we hope to adapt technology developed by other European Commission-funded projects, such as the Multimatch multilingual search engine for cultural heritage.

Multilingual search and opportunities to add user-generated content will not be in the November prototype. However, they are incorporated into proposals which have been submitted to the European Commission under the last funding call for eContentplus. Other proposed developments include Application Programming Interfaces (APIs). These could take the form of mini Europeana search boxes that can easily be embedded in anybody’s website, which would greatly increase access to and ease of use of Europeana’s content. We should know the outcome of these submissions in autumn 2008.

Jonathan Purday is responsible for communications in the Europeana development team. If you would like to contribute content, become a network partner or know more about Europeana, visit www.Europeana.eu.

Further information

UK’s Museums, Libraries and Archives Council’s Discover site www.peoplesnetwork.gov.uk/discover

The French Culture Ministry’s culture.fr www.culture.fr

Germany’s BAM www.bam-portal.de

Iconclass www.iconclass.nl

Getty art thesauri www.getty.edu/research/conducting_research/vocabularies/aat

Thesaurus of Geographic Names www.getty.edu/research/conducting_research/vocabularies/tgn

Cliopatria e-culture.multimedian.nl/software/ClioPatria.shtml

OAI-ORE www.openarchives.org/ore/0.3/toc

CIDOC-CRM cidoc.ics.forth.gr

OAI-PMH www.openarchives.org/pmh

eContentplus ec.europa.eu/information_society/activities/econtentplus/index_en.htm

One hundred different network partners

Many of the technological solutions we are using are contributed by or adapted from work done by partner projects, which produce open-source software solutions suitable for a number of different applications. Several related projects that are also funded under the Commission’s Digital Libraries eContentplus initiative are among signatories to the Europeana partnership.

Our thematic network partnership now numbers 100 members. This includes representatives of every member state, together with many arts, culture and education ministries, and a wide range of cultural institutions. Among these are the British Library, the UK’s Science Museum, the Bundesarchiv in Germany, Amsterdam’s Rijksmuseum and the Institut national de l’audiovisuel in Paris.

The core team responsible for Europeana comprises six project management and web development staff based in the Koninklijke Bibliotheek – the national library of the Netherlands. We coordinate the participation of the technical contributors, experts from universities, research institutions and European heritage organisations.

The overall governance of the project, its strategy and policy are vested in the EDL Foundation, a legal entity incorporated under Dutch law, which allows it to employ people, bid for funding and endorse project proposals.

The Foundation’s Board includes:

ACE: Association Cinémathèques Européennes

CENL: Conference of European National Librarians

CERL: Consortium of European Research Libraries

EMF: European Museum Forum

EURBICA: European Regional Branch of the International Council on Archives

FIAT: International Federation of Television Archives

IASA: International Association of Sound Archives

ICOM-Europe: International Council of Museums – Europe

Engaging the international associations of each of the major domains at board level endorses the Europeana project and promotes the movement towards interoperability at the highest professional level. It also enables the four domains to communicate the objectives widely to their networks and members, and encourages widespread and diverse contributions of content.