Getting the measure


The web and new forms of scholarly publication are providing a seemingly endless source of data about scholarly communication, which can be combined to produce a multitude of new indicators to inform existing library services and establish new ones.

Speaking to Phill Jones, head of publisher outreach at Digital Science (www.digital-science.com), it is clear that bibliometric indicators hold real potential and are progressing quickly, but even for the most fundamental of metrics, such as usage, there is still a long way to go.

The changing information ecosystem

Bibliometric indicators have long been recognised as having a wide range of applications for improving library services, from journal selection and user recommendations to the weeding of a library’s stock.

However, according to Jones, the changing information ecosystem means there is a need for increased granularity in the nature of these indicators:

‘The advent of the internet and e-publishing has opened the floodgates, and one of the biggest problems that librarians and a library’s patrons face is deciding which content is relevant and of high quality. It’s difficult to use the traditional impact factor metrics as a proxy for research quality, or journal quality, because there’s just too much content. The value of journal prestige as a metric for the quality and importance of research is eroding; more and more of the top-cited articles are coming from second- and third-tier journals, and if that trend continues, then article-level metrics are going to be increasingly helpful. Article-level citation metrics are the obvious choice, but you need something that’s more rapid, and which gives a broader view of the kinds of impact articles can have. That’s where I see the opportunities growing.’

Jones continues: ‘Attention, impact and mentions exist on a spectrum, from the most superficial, which may be a tweet, to the most important and impactful sort of mention, which may be an academic article, a mention in a government policy document, or even a news article that sways public opinion and results in a public policy change. Context is vitally important, and simply adding up the different metrics doesn’t give you the full picture; it’s important to understand why the article was mentioned, in what context and venue it was mentioned, and whether that mention was positive or negative.’
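
To make that point concrete, consider a minimal sketch in Python. The venue types, sentiment labels and records below are invented for illustration, not fields from any real altmetrics service; the point is to keep mentions grouped by context rather than collapsing them into a single sum:

```python
from collections import Counter

# Illustrative records only: venue and sentiment values are assumed,
# not taken from any real altmetrics data source.
mentions = [
    {"venue": "twitter", "sentiment": "neutral"},
    {"venue": "twitter", "sentiment": "positive"},
    {"venue": "news", "sentiment": "positive"},
    {"venue": "policy_document", "sentiment": "positive"},
    {"venue": "academic_article", "sentiment": "negative"},
]

# Rather than one summed score, keep the breakdown by venue and
# sentiment so the context of each mention stays visible.
by_context = Counter((m["venue"], m["sentiment"]) for m in mentions)
for (venue, sentiment), count in sorted(by_context.items()):
    print(f"{venue:18} {sentiment:9} {count}")
```

A single total of five mentions would treat the tweet and the policy-document citation as equivalent; the grouped view preserves the distinctions Jones describes.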

The changing library

Of course, while there may be a lot of talk about the potential of new library metrics, adoption on the ground is often slower.

For Jones, however, there is a real groundswell of interest from a wide range of stakeholders: ‘The more it becomes clear that altmetrics, usage metrics, and other kinds of article-level metrics solve real problems that researchers face, the more excited people get.

‘For example, Wiley polled users during their altmetric trial and found that 77 per cent of respondents thought the altmetrics data was valuable, and 50 per cent were more likely to submit to a journal that supported altmetrics. It shows the value this kind of data has for people.’

‘We’re also seeing an increasingly strong role for librarians in curating the research output of their institutions and measuring its impact. Various funding bodies now require evidence of broader societal impact, and many institutions are using not just citation and usage data but also web mentions and other indicators of broader reach in order to demonstrate the societal impact of their institution.

To make an analogy: publishers are investing more and more in author services, as scholarly communication becomes increasingly about helping those who create the content to maximise their impact. The library is mirroring what’s happening in the publishing industry, increasing author services and becoming a much more author-centric environment.’

Jones adds: ‘It’s something that’s really taking hold and it’ll increase over time. A lot of that is being driven by the research funders themselves, particularly in places like the UK, the Netherlands and Australia.

‘You now get universities with teams of bibliometricians, webometricians and metrics experts who compile reports and carry out analyses to support their researchers, help them maximise the impact of their work, and report back to funders to demonstrate that impact.’

Changing metrics

An increasing openness to the value of article-level metrics doesn’t mean, however, that there aren’t other challenges to overcome in realising their potential. Take, for example, usage statistics: although usage may be considered one of the most fundamental bibliometric units, the distributed nature of scholarly publishing means that, despite digital advances, accurate usage figures still elude measurement.

As Jones puts it: ‘This is an unsolved technology challenge. We have COUNTER stats, but they give you a report on the number of downloads or accesses of an article on the primary publisher’s website. This doesn’t include aggregators, institutional repositories, or subject-level repositories. You can get statistics from the aggregators and try to merge those together, and there are also projects looking at COUNTER-compliant usage within repositories, such as IRUS – Institutional Repository Usage Statistics (www.irus.mimas.ac.uk) – but there’s currently no way for a librarian counting usage to get a sense of use through PubMed Central or arXiv. There’s nothing in the market that would allow you to count all of the usage of a particular article.

‘One of the central ideas of the STM association’s working group on content sharing is that researchers should be able to share content in order to collaborate, but that sharing should be something publishers are able to take account of and record, so that they can show that their content is being used and valued and is contributing to the knowledge base (www.stm-assoc.org/stm-consultations/scn-consultation-2015/).

‘The future might lead to either of two scenarios: you could have the clearing-house approach that COUNTER is taking; alternatively, you could have some kind of metadata approach, where the journal article itself carries technology that allows librarians and information professionals to understand its usage. Both approaches have challenges: the clearing house requires you to form relationships with everyone who might possibly supply that content, while the metadata approach requires people to accept an article with usage-recording technology embedded in it.’
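
As a rough illustration of the clearing-house idea, the Python sketch below merges download counts for a single article, identified by DOI, across several reporting sources. The source names and figures are invented, and real COUNTER reports are far richer than a single integer:

```python
from collections import defaultdict

# Hypothetical, illustrative figures: each real source exposes its own
# reporting interface (COUNTER reports from publishers, IRUS for UK
# institutional repositories), and some sources expose none at all.
usage_reports = [
    {"source": "publisher (COUNTER)", "doi": "10.1234/example.5678", "downloads": 412},
    {"source": "aggregator", "doi": "10.1234/example.5678", "downloads": 97},
    {"source": "institutional repository (IRUS)", "doi": "10.1234/example.5678", "downloads": 58},
]

def merge_usage(reports):
    """Sum download counts per DOI across whatever sources report them."""
    totals = defaultdict(int)
    for report in reports:
        totals[report["doi"]] += report["downloads"]
    return dict(totals)

print(merge_usage(usage_reports))
# {'10.1234/example.5678': 567} -- a lower bound, not a true total,
# because unreported sources such as PubMed Central or arXiv are absent.
```

The merging itself is trivial; as Jones notes, the hard part is establishing the relationships and reporting needed to get every source into the list in the first place.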

There are also privacy challenges to overcome. Usage data is most powerful when it is not anonymised, when it can be combined with the information that libraries or their associated organisations hold about their users.
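
One common mitigation, sketched below in Python on the assumption that the library holds a secret key, is to pseudonymise user identifiers with a keyed hash before usage records are shared or combined. This is an illustration of the technique, not a complete privacy solution, since pseudonymised data can still be re-identified when joined with enough other information:

```python
import hashlib
import hmac

# Assumed secret held only by the library; in practice it would need
# to be protected and rotated.
SECRET = b"library-held-secret-key"

def pseudonymise(user_id: str) -> str:
    """Replace a raw user identifier with a keyed hash (HMAC-SHA256)
    before the usage record leaves the library's systems."""
    return hmac.new(SECRET, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical usage record: the user field no longer exposes the patron ID.
record = {"user": pseudonymise("patron-00042"), "doi": "10.1234/example.5678"}
print(record)
```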

Conclusion

There will always be limitations in the counting of any bibliometric indicator, and the difficulties faced in the calculation of new metrics, such as usage metrics, have parallels in traditional citation databases. New scholarly collaboration network sites and services will continue to emerge, and the same work will inevitably exist in multiple manifestations and expressions – from blog posts and working papers through to pre-prints and post-prints – but it is not necessary to capture every instance.

Traditionally, citation databases have not tried to index every journal published; rather, they relied on Garfield’s law of concentration, which stated that a core of 500 to 1,000 journals accounted for more than 80 per cent of references. Today the questions are: how many services or versions need to be captured to account for a similarly high proportion of usage statistics; how much of the web needs to be indexed to capture a sufficiently high proportion of web mentions; and how many social network sites are required to capture the bulk of the online conversation?
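
The arithmetic behind those questions is simple; what is unknown is the shape of the distribution. The Python sketch below, using a made-up long-tailed usage distribution, counts how many of the highest-usage sources are needed to reach a target share of the total:

```python
def sources_for_coverage(counts, target=0.80):
    """Return how many of the highest-usage sources are needed to
    account for at least `target` of total usage."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    running = 0
    for n, count in enumerate(ranked, start=1):
        running += count
        if running / total >= target:
            return n
    return len(ranked)

# Invented long-tailed distribution: a few sources dominate usage,
# echoing Garfield's observation about a core of journals.
usage_by_source = [5000, 2200, 900, 400, 150, 80, 40, 20, 10, 5]
print(sources_for_coverage(usage_by_source))  # -> 2
```

If usage, web mentions and social media conversation turn out to be concentrated in the same long-tailed way that references are, capturing a modest core of services may be enough.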

The fact that new metrics are not as accurate as they would be in an ideal world is not necessarily a problem in itself; after all, no metric should be taken as the whole truth.

Jones concludes: ‘Metrics are a screen that allows people to understand which resources are potentially of use. But this kind of analytics should never replace the individual judgement of the researcher, librarian, research administrator or grant reviewer; there’s always that human component in understanding and judging the quality of the research. It should augment that decision process, not replace it.’

As interest in library metrics increases, librarians should see it as a great opportunity rather than a threat.
