The measure of metrics

Change is constant in scholarly publishing; nowhere more than in the world of metrics. Here, four industry leaders offer Tim Gillett some predictions for the future.

What are the most important recent developments within the world of metrics?

Dr. Martijn Roelandse, head of publishing innovation, Springer Nature: Metrics for books. Books, like journals, are cited in scholarly literature, are mentioned more and more in social media and online platforms and are downloaded. We have been pioneering this area with Bookmetrix, a joint project from Altmetric and Springer, but we see some uptake from other publishers and A&I platforms as well. Scopus is expanding their indexing activities for books and Times Higher Education is now using book citations, in addition to article citations, for their university ranking.

In addition, funders and governments are keen to see a broader picture of researchers in research assessment. Only showing scholarly citations for your articles is in many cases not good enough anymore. They are keen to know the researcher’s ‘impact on society’. However, how to quantify this has yet to be determined. Altmetrics and metrics for content beyond articles, such as books, chapters, code and data, could play a crucial role.

Emmanuel Thiveaud, head of research analytics at Clarivate Analytics: There have been four main developments: concern over the misuse and/or overuse of metrics; altmetrics; citation context and sentiment analysis; and IDs for individuals and institutions. Several groups have attempted to describe the proper or balanced use of quantitative indicators in research performance evaluation.

Through movements like DORA, the Leiden Manifesto, and the ‘Metric Tide’ from HEFCE, the community has been urged towards responsible use of metrics. These developments reflect the extensive, indeed embedded, use of metrics in making appointments and promotions, in allocating resources, and in gauging institutional and national research performance. Part of the solution for adopting wise-use policies for metrics is for the research and bibliometrics communities to engage more with one another – for bibliometricians to advise the research community about the nature, uses, and misuses of metrics in different contexts, as well as bring them up to speed on the latest in metrics.

The study of altmetrics has begun to mature as a unique field that is complementary to more established citation-based scientometrics. One sign of this maturity comes from the publication of Outputs of the NISO Alternative Assessment Metrics Project – A Recommended Practice of the National Information Standards Organization (2016). In this report, the community agreed on a common definition that Altmetrics is a broad term that encapsulates the collection of multiple digital indicators related to scholarly work. These indicators are derived from activity and engagement among diverse stakeholders and scholarly outputs in the research ecosystem, including the public sphere.

Greater access to full-text data, such as in PubMed Central, enables the analysis of citing sentences, which allows for discovery of the context and sentiment of the citing occurrence. Finally, because reliable results from scientometric analysis depend on accurate data, variants in names of individuals and institutions have always been a problem, necessitating disambiguation and data cleaning and/or unification. The adoption of Researcher ID, ORCID, and other unique identifiers for individuals is welcome as are new efforts to establish institutional unique identifiers.

James Hardcastle, senior manager for product analytics at Taylor & Francis: The predominant trend in metrics has been the increasingly diverse nature of what can be measured, which has moved metrics beyond citations. This has meant a push away from journal-level metrics such as the Journal Impact Factor. Other important developments are the Impact Agenda in the REF being less dependent on Impact Factors, as well as the continuing growth of altmetrics.

Recently it has seemed as if altmetrics were going to overtake traditional metrics in terms of importance. What are your views on this?

Roelandse: To me this seems more complex. Altmetrics are a clear enrichment for articles and help the author to establish the reach and impact of his work. Only altmetrics, citations and downloads combined provide this insight.

Thiveaud: Much more research is required to understand different altmetrics indicators – their nature, meaning, and dynamics – and whether they are related in any way to research impact even as more broadly defined. For those that give insight to impact, there will be a need to normalise the indicators for age and field or topic, something that is only beginning. There is no prospect at the moment that altmetrics will ‘overtake’ traditional metrics. No one metric, either traditional or alternative, is going to provide a definitive answer about the value of a journal or paper. Metrics are best used in combination to obtain a complete picture – is a paper with a high download rate also highly cited? Some may supplement traditional metrics, but it is very early days still.

Hardcastle: Altmetrics as a term covers a wide range of attention and measures that should not be treated as a single homogeneous group. Measures such as references in policy documents and use in reading lists are very different to tweets or Facebook posts. It seems unlikely that the number of tweets a research article receives will be used to evaluate its ‘quality’. However, other altmetric indicators are closer to traditional citation metrics in what they measure, with Mendeley readers being a good leading indicator of future citations.

The Altmetric Manifesto focused heavily on using altmetric tools as a filter to help readers deal with the ever growing quantity of outputs. Instead tools such as and ImpactStory are more focused on authors, publishers and institutions gathering information about the attention content receives. There hasn’t yet been a big push to altmetric services for readers.

Does the Journal Impact Factor have a long-term future?

Henning Schoenenberger, director for product data and metadata management, Springer Nature: While the variety of alternative indicators measuring the impact of scientific research and researchers is continuously increasing, the Journal Impact Factor continues to be an important long-term metric for assessing journal performance, for all its known shortcomings and lack of granularity.

Thiveaud: Yes, it does. DORA was an important reminder to stop using this journal performance measure in a manner for which it was not intended, that is, the assessment of individual papers or researchers. Misuse of the Journal Impact Factor does not make the measure invalid, however. It has proven to be a reliable and authoritative guide to journal performance, influence, and stature for librarians and for the research community for more than 40 years. In fact, its use is growing, especially as publishers in rapidly developing regions seek to improve their journal offerings and aim to have their journals accepted in the Web of Science Core Collection for the sciences and social sciences.

Hardcastle: As long as journals exist, then metrics such as the Journal Impact Factor will have a place for evaluating them. As article level metrics become cheaper and easier to use, the Journal Impact Factor will be less frequently used as a tool for research assessment and once again return to its original purpose for collection management and journal evaluation.

Research Information noted a couple of years back that we are moving towards measures that are more difficult to count. Do you agree with this, and if so what are the implications?

Roelandse: Yes, I think this is correct. With the myriad of sources that are added to the altmetrics portfolio, it has become challenging to assess the weight of the various sources. If your content is cited in a policy document, or a member of parliament tweets about your article, one could assume your work has made a certain impact on society. However, how to assess this high level of granularity has yet to be determined. At the same time, it would be good to note that, especially for researchers, the highest score counts, whether the data source itself is open, reliable and deduplicates or not. It is remarkable to see that sources that lack these last qualifying criteria have become the gold standard for citations.

Thiveaud: As Ludo Waltman at CWTS in Leiden has advised (‘A review of the literature on citation impact indicators’, Journal of Informetrics, 10 [2]: 365-391, May 2016), there is little need for more metrics or more complicated metrics. Many duplicate existing indicators.

Many are over-engineered and tempt users into a fallacy of false precision. Other formulae and algorithms are several steps removed from the data and subject to very debatable assumptions. Composite indicators are particularly dubious since weightings of different indicators in a group have no scientific basis and their modifications yield very different portraits of performance.

Hardcastle: There is a rise in black-box metrics that can’t be replicated easily, such as Eigenfactors and Altmetric Attention Scores, which make it very difficult to work out the effect of a single citation or piece of attention on the final metric. As science outputs get more complex and the requirements of funders and institutions change, the metrics they choose will reflect the diversity of need.

Could publishing metrics be simplified further? Would that help the community? Or do measures actually need to be more complicated?

Roelandse: We have the tendency to provide researchers with a bag of metrics and add more scores on a regular basis. I doubt this is of use to most of the researchers and may even cause confusion. Which one is relevant, which one isn’t? This is where we could play a role as publishers. 

Thiveaud: Good practice dictates that one should use several measures that address different aspects or dimensions of the phenomenon being measured, that the indicators should be specific to the questions being asked (which is why there is not a standard or cookbook methodology), and that the measures should not be redundant (highly correlated with other indicators in use that measure the same thing). Generally, only a few, fairly simple indicators are adequate, usually a mix of size-dependent and size-independent indicators.

Hardcastle: There are two conflicting demands. Publishers, authors, institutions and other metric consumers need metrics that reflect the different types of research, subject differences and the increasing range of outputs. They also want simple metrics that are easy to understand and use. The mathematical complexity of a metric isn’t a problem as such; instead, it needs to be easy to understand the underlying drivers for a change in the metric and what is being measured.

Ultimately it isn’t necessarily the complexity or the number of metrics that causes problems, but using the wrong metric in the wrong place. The publishing and metric industry has not been good at articulating how different metrics should and shouldn’t be used.