A large-scale investigation into systematic archiving in open-access repositories is coming to an end, but publishers seemed more interested in participating than the researchers themselves, writes Siân Harris
Later this month the PEER project will present its end-of-project findings on the challenges and effects of green open access (where versions of articles accepted for publication are made freely available in repositories). Since 2008, PEER, which stands for Publishing and the Ecology of European Research, has been investigating the systematic deposit of thousands of authors’ final peer-reviewed manuscripts into six European repositories and one long-term archive, where they are freely available to any researcher or member of the public with internet access.
The project was set up – with €4.2 million in funding, half from the European Union and half from the project partners – to find out what effect green open access (OA) might have on reader access, author visibility, and journal viability, as well as on the broader ecology of European research.
One of the interesting features of the project is the involvement of heavyweight scholarly publishers. Arguably, publishers have the least to gain and the most to lose from green OA being a success. If authors’ final, accepted manuscripts are comprehensively available in free repositories, and easy to find, then researchers have an obvious incentive to use them.
Nonetheless, the PEER project includes 12 major publishers, many of whom have faced criticism from proponents of green OA. These publishers collectively made available content from 241 journals in four broad subject areas.
While it might seem surprising that traditional publishers have invested so much time and effort in this project, the potential benefits to researchers seem clearer. After all, the main purpose of publishing research papers, leaving aside the issues of self-promotion and bolstering an institution’s research rating, must be to enable researchers to find out what other researchers are doing and to communicate their own research breakthroughs to others. To be able to do this without their institution paying either publication or subscription costs should have obvious appeal.
For this reason, the author response to the PEER project seems strange. The initial project plan was to populate the repositories with half the articles submitted by publishers and half through author self-archiving. As Julia Wallace, project manager of PEER, told delegates at the recent UKSG conference in Glasgow, 11,800 invitations to submit to the repositories via the PEER depot were sent to authors. Despite this large number of invitations, however, only 170 papers were self-archived by authors, equivalent to about 1.4 per cent of the invitations sent. Indeed, the author response was so low that halfway through the project the PEER team had to increase the proportion of papers deposited by publishers in order to have enough material available for the subsequent analysis. In total, more than 53,000 manuscripts were submitted to PEER.
The reasons for so few papers being deposited by authors need further investigation, according to Wallace, and some of the shortfall might be attributed to the project being an experiment. However, the PEER experiences seem to correspond with those of many repositories. Even where there are mandates to deposit, and where not complying could affect future funding, the percentages of authors self-archiving are still a long way from 100 per cent.
‘There is anecdotal evidence that some researchers consider making journal articles accessible via open access to be beyond their remit,’ observed the report on behaviour research, published late last year and carried out as part of the PEER project by a team at the Department of Information Science and the LISU group at Loughborough University, UK. The study also found that only a minority of researchers associated open access with self-archiving, and those researchers were mainly in physics and related areas, where there is a long-standing tradition of self-archiving in the arXiv repository.
The behaviour research also revealed some concerns that researchers have about the authority of article content in open-access repositories and the extent to which it can be cited when the version they have accessed is not the published final version. ‘These concerns are more prevalent where the purpose of reading is to produce a published journal article,’ said the study.
Some authors were also worried about the perception of the quality of their peer-reviewed, published journal articles if they were in an open-access repository with other content of variable quality.
But attitudes towards open-access repositories were not negative, simply guarded. As the study put it: ‘academic researchers have a conservative set of attitudes, perceptions and behaviours towards the scholarly communication system and do not desire fundamental changes in the way research is currently disseminated and published,’ an observation that must provide some reassurance to publishers of peer-reviewed, published journal articles.
Even without thousands of papers being deposited by individuals, the project still found plenty of challenges in populating the six repositories with content from 12 different publishers. As Wallace explained in her UKSG presentation, each publisher organises its data and resources differently and the structure of each repository is different too.
She noted that in gathering publisher content there was no standard extraction point. There was also a variety of file formats across publishers and a range of article types. Metadata was another challenge. ‘At the acceptance stage, some publishers don’t have a DOI assigned and none know the publication date. Some publishers therefore provided the data in two stages, while others held off until the article had been published,’ she said.
What’s more, different publishers use different metadata formats – including NLM 2.x, NLM 3.0, ScholarOne and proprietary formats. And there were repository challenges too, with metadata requirements and injection processes varying between repositories.
In addition to the studies of researcher behaviour, the economics of green open access were investigated by the ASK research centre at Bocconi University in Milan, Italy, as part of PEER. The economic study, which analysed 22 organisations involved with journal article publication and dissemination, revealed that significant costs are involved in managing the peer-review process, with no economies of scale, and that there are real costs associated with repositories.
The study concludes that open-access journals ‘will have to become more active in seeking multiple revenue streams and in improving services, while repositories will need to make a stronger case to guarantee the flow of funding.’
Meanwhile, the end-of-project conference will include analysis of the usage data from the PEER project, carried out by the CIBER group at the University of London. This analysis should say a great deal about how these papers are used and the role that open-access repositories could play in the future – presumably provided that somebody other than the authors is willing to populate them.