Preparing for impact

Global use of institutional repositories is on the rise – but the trend is not without its challenges, writes Nadya Anscombe

‘You know that expression, “Build it and they will come”? Well, that’s not true for institutional repositories,’ said Tom Cramer, chief technology strategist and associate director for digital library systems and services at Stanford University Libraries in the United States.

Stanford’s institutional repository was set up in 2006 and the university launched its self-deposit user interface in 2013. Today its depositors number only in the hundreds, despite Stanford having thousands of academics. ‘It started as a preservation repository and still has a strong preservation focus,’ said Cramer. ‘We have not pushed it, but rather made it available to see where it resonates with researchers. But uptake has been disappointing.’

The Stanford Digital Repository (SDR) is a service supporting long-term management of scholarly information resources. Deposited scholarly content is preserved in a robust, reliable and secure environment and is available from persistent URLs (PURLs) with optional access controls.

To create the user interface for the SDR, Cramer and his colleagues worked with counterparts at the University of Hull in the UK and the University of Virginia in the US through the Hydra project – a large, multi-institutional collaboration that gives like-minded institutions a mechanism to combine their individual repository development efforts into a collective solution. In 2009, the repository was relaunched using the open-source software Fedora to provide ‘back-office’ preservation services such as data replication, auditing, media migration, and retrieval. The growing body of content deposited in the SDR includes scientific research data, digital humanities research, honours theses, images, audio and video, software and computer games, student projects, technical reports and archival collections.

The SDR team consults internally with librarians and archivists, as well as with researchers and faculty across the Stanford campus, providing technical guidance in creating, building, and depositing digital collections that will remain usable even as technology changes.

‘While uptake has been disappointing, researchers have been keen to use the repository to store theses and supplementary data,’ said Cramer. ‘Usage is increasing organically, as researchers realise the benefits of using the repository. We are also adding more functionality and funders are also starting to require the storage of data and research outputs.’

This added functionality is just one of the factors that are driving up usage. Attitudes are changing with respect to the long-term availability and re-use of data and information. Researchers are also increasingly having to show that their work has impact. While this is not currently a big driver in the US, Cramer believes it will be in the future. ‘Another feature that would drive up usage is if researchers could see how many people have looked at or downloaded the information stored in the repository,’ said Cramer. ‘Many repositories have great analytics, but we do not have that kind of functionality yet.’

One university has found analytics to be an important tool for persuading academics to use its repository. The first thing you see when you open the repository of the University of Wollongong in Australia is a world map that shows, in real time, the location of users who are downloading content.

‘It’s a great visual tool, but more importantly it has helped us engage with the academics,’ said Michael Organ, manager for repository services at the University of Wollongong in Australia. Like Stanford, Wollongong has found it difficult to encourage academics to use the repository. ‘It’s hard when their main role is teaching and research,’ said Organ. ‘It is mainly support staff who upload content to the repository, but the feedback the academics get from the analytics has proved very useful. Many have never received any feedback other than from funding bodies. They really appreciate seeing the statistics on how many times their work has been accessed. Some have even used these statistics to change the directions of their research.’

Wollongong’s repository, which currently uses Digital Commons software from Bepress, was first set up in 2002 when there was a national programme to digitise theses. Today, all theses from the university going back to the first thesis published in 1958 are being digitised and are available through the repository.

‘It’s been a huge challenge, especially when it’s impossible to contact the authors of the theses,’ said Organ. ‘Some universities are reluctant to put all content up in case of copyright issues, but our attitude is: “put it up – if there is a problem, take it down”.’

Around 2004, the open access movement started to gain momentum in Australia and Organ saw an increase in the number of academics wanting to use the repository because the researchers felt the large publishing companies were getting an increasing amount of control over their work.

‘In 2006 we started trying to get 100 per cent full text papers onto the repository, but we soon realised that was not possible,’ said Organ. ‘We now have the full text of around 50 per cent of all research papers published by academics at the university and we are proud of that figure.’

Like Stanford, Wollongong realised that relying on academics to put papers and other research outputs into the repository does not work. Wollongong’s repository therefore works on a semi-automatic basis, scouring major databases such as Web of Science and exporting the metadata for its papers. ‘Relying on academics to populate the repository is not going to work because the work is usually published months or years after it was first carried out and the academics have moved on to other work,’ said Organ. ‘About 80 to 90 per cent of our content is found by trawling through databases and the rest is input manually.’

Wollongong’s repository stores a wide variety of information such as research outputs (papers, theses, conference papers, creative arts, books, etc.); digitised material such as photographs and manuscripts; and conference outputs (papers, presentations, videos, etc.). The data in the repository is used in a variety of ways. Internally, because academic performance is now assessed more closely, it has become more important for deans and heads of academic departments to see what researchers have published. The information also feeds into league tables, is used for promotional purposes and increases visibility.

While usage of Wollongong’s repository is relatively high, Organ and his colleagues want to engage with academics even more. They plan to improve the integration with other systems such as Web of Science and also with social media. Funding bodies are also keen for researchers to use repositories more. ‘Various organisations such as funding councils are introducing mandates which force people to put their work into a repository,’ said Organ. ‘This will only increase the utilisation of the repository.’

Despite these increases, there are still challenges ahead. ‘Publishers see open access and repositories as the enemy,’ said Organ. ‘Publishers are fighting hard against us and coming up with schemes where they charge ridiculous amounts of money for an open access copy of a paper. I am not sure how this is going to be resolved. We are carrying on with what we are doing while we watch the models evolving. I’m an archivist and want to preserve content. Publishers will come and go.’

As in Australia, one of the factors in the UK that is driving the development and use of repositories is the assessment of research and the need to show that the research is having an impact.

When the University of Bristol wanted to set up an institutional repository in 2011, it had five main objectives: to replace the existing publications management system with something researchers could put their papers into; to harvest from existing sources; to support the Research Excellence Framework (REF) submission; to visualise other data (projects, students, data from corporate systems); and to allow the information to be re-used and re-purposed.

‘Back then, the market looked very different to how it does now,’ said Sophie Collet, head of research and enterprise policy at the University of Bristol. ‘We considered building our own system, but decided this would take too much resource. Then we looked at the best of breed for each individual function. At the time there was only one system that was compatible with our requirements and that was PURE.’

Today more than 20 UK universities use the PURE software. At the University of Bristol, the software is used to store research outputs such as papers, theses and books; visualise information on grants and funding; and capture activities such as awards, partnerships, public outreach or conference attendance. Data sets are stored in a separate data catalogue.

‘Unlike Australia, we have not had mandates or drivers to put everything in our repository until very recently,’ said Collet. ‘The mandates are still emerging, and many academics are not yet aware of them. The response from academics has been even more positive than we had hoped, and 18 months after our launch we had 12,000 new publications in the repository. More than half of these were harvested from other databases and the others were uploaded manually.’

However, research councils and other bodies are now starting to dictate that outputs from work they have funded should be freely available through institutional repositories and this will mean that more full-text papers will have to be available through the university’s repository.

‘Open access is a game-changer,’ said Collet. ‘We are working on getting the functionality in place that will help us track and fully support open access. We are not at this point yet, but the repository now has 135,000 entries with several thousand full-text papers available.’

The university has a mechanism in place to manage any potential copyright issues and also a team that manages intellectual property issues.

As with all institutional repositories, the University of Bristol’s repository is a work in progress. ‘The external environment is changing,’ said Collet. ‘There has been a major focus on impact and the REF, but now there are requirements for open access to both publications and research data. We are facing a tough challenge – prioritising the needs of the academic community as well as the compliance requirements of funding bodies. But it’s a process and won’t happen overnight.’

While persuading academics to use the repository has been a challenge, Collet has seen an increase in the number of deposits that academics make and puts this down to the ease of harvesting and automation, the user interface and communications. ‘PURE allows academic staff to manage information about all elements of their research portfolio, reducing the administrative burden by harvesting from external sources and thereby ensuring the university’s publication record is accurate, complete and increasingly visible,’ she said. ‘Without PURE, we would soon return to a position of inefficiency and duplication of effort. The visibility of information through PURE also facilitates the promotion of the range and breadth of research and researchers at Bristol, enhancing the communication of the university’s collective research strengths and achievements to external audiences.’

Organ agreed. Like Bristol and Stanford, the University of Wollongong has struggled to get academics engaged, but the message seems to be slowly getting through. What Organ would like academics to hear is: ‘The repository is a promotional tool for your research. It can lead to increased citation counts, improved academic reputation, additional grant funding and scholarly collaboration opportunities. The easier it is to access your work, the greater its impact.’