From the Journal Impact Factor to the latest altmetrics, scholarly players are crying out for metrics to be used responsibly, reports Rebecca Pool
When the late Eugene Garfield, founder of the Institute for Scientific Information (now Clarivate Analytics), introduced the Journal Impact Factor, he probably didn’t realise he would spend the next few decades reiterating how it should be used.
While selecting journals for the ISI’s Science Citation Index, he was worried that small, yet important, review journals would be overlooked if he and colleagues simply relied on publication or citation counts. By 1972, he had devised the JIF as a measure of the short-term average citation rate of papers published in a journal. Its impact was profound.
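Stripped to its essentials, the classic two-year JIF for a given year is the number of citations received that year by items the journal published in the previous two years, divided by the number of citable items published in those two years. A minimal sketch of that arithmetic, using entirely hypothetical figures (an illustration of the calculation, not Clarivate’s implementation):

```python
def impact_factor(citations_in_year, citable_items, year):
    """Two-year Journal Impact Factor for `year`.

    citations_in_year: {publication_year: citations received in `year`}
    citable_items:     {publication_year: number of citable items published}
    """
    window = (year - 1, year - 2)
    # Citations in `year` to items from the two preceding years...
    cites = sum(citations_in_year.get(y, 0) for y in window)
    # ...divided by the citable items published in those years.
    items = sum(citable_items.get(y, 0) for y in window)
    return cites / items if items else 0.0

# Hypothetical journal: 200 citable items over 2021-22,
# drawing 460 citations in 2023.
print(impact_factor({2021: 300, 2022: 160},
                    {2021: 120, 2022: 80}, 2023))  # → 2.3
```

On these made-up numbers, 460 citations to 200 citable items gives an impact factor of 2.3. What counts as a ‘citable item’ is itself part of the long-running controversy.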
This powerful citation analysis was swiftly published in Science, Garfield and colleagues were soon fairly comparing small-scale journals and industry behemoths for ISI, and librarians were adopting the metric to help with journal purchase decisions.
Yet, amid success, controversy grew. Researchers, managers and administrators were increasingly using the metric for different purposes, including judging the scientific quality of individual research articles and journals. Garfield commented frequently.
For example, in 1979, he published an article in Scientometrics that asked: ‘Is citation analysis a legitimate evaluation tool?’ His answer was: ‘When properly used, citation analysis can introduce a useful measure of objectivity...’.
And two decades later, in the Canadian Medical Association Journal, he wrote: ‘Like nuclear energy, the impact factor has become a mixed blessing. I expected it would be used constructively, while recognising that in the wrong hands it might be abused.’
Today, controversy reigns and the mighty JIF remains a hotbed of intense discussion. While the metric helps researchers to decide where to submit a paper, it is also used to judge peers, influence career moves and sway grants. In recent years, scathing headlines such as ‘Do not resuscitate: the journal impact factor declared dead’ and ‘Hate journal impact factors? New study gives you one more reason’ reveal ongoing tension. And only last month, Nature Index published an article, ‘What’s wrong with JIF in five graphs’.
But for Marie McVeigh, product director of Clarivate’s Journal Citation Reports, which includes JIFs, and Jonathan Adams, director of ISI at Clarivate, some causes of what’s wrong with JIF are crystal clear. ‘Researchers use a lot of intelligence around the [use of metrics] and have deep awareness here,’ says Adams. ‘But research managers really should be taking a lot more responsibility for proper decision-making.
‘They use metrics such as the Journal Impact Factor and other indicators like the H-Index as a substitute for this, and also as a defence for their lack of competence in making effective decisions over the deployment of resources,’ he adds. ‘This needs to be dealt with at the institutional level and requires cultural change. It is not acceptable to use an indicator in place of a proper academic management decision.’
McVeigh agrees and adds: ‘Metrics is a complicated field... and anybody that wants to use one number for every purpose is never going to have the proper picture of value and will be missing a big part of the picture.’
The JCR product director also highlights how the impact factor has been widely critiqued for not being an article-level metric, even though it was designed as a journal-level one. ‘The journal, as a vehicle to communicate scholarly work, has very different properties to an individual article,’ she says.
And while she points out that parallels can be drawn between the article and its journal, she asserts that wise use is essential.
‘If you publish your article in a journal with a high impact factor, I can’t tell you if it will be cited, but I can say that it has been through the same process of peer review, editorial supervision and oversight from a responsible publisher that is consistently selecting materials that are cited within two years of publication,’ she points out.
‘A journal doesn’t get a high impact factor by publishing one blockbuster every other year, it is consistently publishing work that gets attention from the scholarly community,’ she says. ‘We have offered intelligent information around the journal since 1975, so now the question is how do we get users to be more intelligent in how they use this information and also provide more insight into the information that we can offer?’
Adams’ and McVeigh’s comments on metric misuse are not new, and several initiatives have emerged to encourage better practice. Back in 2012, the San Francisco Declaration on Research Assessment (DORA) was formed in a bid to eliminate the misuse of research metrics and encourage metrics users to think more broadly about assessment. The Leiden Manifesto followed in 2015 – set up by Paul Wouters, professor of scientometrics at Leiden University – offering a ‘distillation of best practice in metrics-based research assessment’.
Soon afterwards came The Metric Tide, the report of an independent review of metrics led by James Wilsdon, then professor of science and democracy at the University of Sussex. The Forum for Responsible Research Metrics has since formed, and Wilsdon has also chaired an expert panel on next-generation metrics for the European Commission.
Importantly, with each new declaration and review, issues around metrics use have been increasingly exposed. DORA chair Professor Stephen Curry is a proponent of JIF-free assessment for individual researchers and is calling for more action. In a Nature World View column he wrote: ‘Let’s move beyond the rhetoric: it’s time to change how we judge research.’
Mike Taylor, head of metrics development at Digital Science, is confident such change is coming soon. He believes that, even in the last nine months, the notion of ‘responsible metrics’ has been gaining ground. ‘I’ve seen an increase in the number of people talking about this, and it is a good way to characterise many conversations,’ he says.
According to Taylor, a few years ago most metrics users relied on the JIF and the H-Index alone; as he puts it: ‘We had this relative poverty of data back then.’
But this has changed. Metrics are being liberalised and scholarly players have access to more numbers than ever before. As Taylor points out: ‘This wealth of numbers doesn’t necessarily lead to a wealth of understanding. We are increasingly understanding that when we open up metrics, we need to open up this understanding and communicate the value, strengths and weaknesses of one metric over another.’
Clearly today’s metrics initiatives can only help, but how can commercial organisations assist? According to Taylor, entities such as Digital Science cannot advise researchers to join a metrics initiative, such as DORA, but can help to support understanding by simplifying the user experience.
As he points out: ‘I’m not going to say we have the answer but we have control over unique object identifiers and can influence whether, say, the Relative Citation Ratio, Citation Count or Collaboration Index are the first indicators to appear [in a search].
‘So if you are on a journey of discovery you can choose to see the appropriate metrics for that, while if you are performing institutional comparisons, you might want different metrics,’ he adds. ‘We have underlying data models to help people make intelligent decisions; we have some way to go but are engaged with this.’
Basket of metrics
In recent years, publishers across the board have been using more and more metrics. Elsevier, for one, provides several metrics calculated at different levels of aggregation, including journal- and article-level. According to Andrew Plume, Elsevier’s director of market intelligence, the organisation refers to this as its ‘basket of metrics’ and knows how it should be used.
‘[We have] two golden rules for research evaluation,’ he says. ‘To always use quantitative and qualitative indicators side-by-side, and when choosing quantitative indicators, to always use more than one.’
Today’s metrics include CiteScore metrics, as well as altmetrics from PlumX Metrics, although Plume asserts that Elsevier is ‘constantly scanning the horizon for new, useful metrics’. Yet, as the Elsevier director also points out, the JIF still ‘looms large in the life’ of today’s academics. ‘Academics rely on it but it remains as misunderstood as it is well-used,’ he says. ‘Article-level metrics do counter some of its flaws but journal metrics, such as JIF, will always be valuable where a shorthand is needed for the importance of a journal as a filter, and not the article as the object of citedness.’
Like McVeigh, Plume points to how journal-level metrics can be useful when a newly-published article has not had sufficient time to garner citations. But ultimately, he believes a mix of metrics is essential and should be used with care: ‘The task of all involved [in metrics] is to get the most appropriate metrics in the hands of informed parties,’ he says. ‘As long as a user has access to good information about the data source, calculation and appropriate use of each metric, we must rely on [him or her] to exercise good judgements for their own unique cases.’
In a similar vein, director of product data and metadata at Springer Nature, Henning Schoenenberger, is keen for researchers to better use and understand metrics. As he puts it: ‘The Journal Impact Factor is hugely used and hugely appreciated by researchers, but it really doesn’t say anything about the quality of the single articles within a journal.
‘Having spoken to many researchers in the last few years, this quantitative approach is no longer enough to really show what is the impact of scientific research... so we really want to make sure the researcher has the full picture to see this,’ he adds.
‘On both the journal-level and book-level, we’ve very much seen a lack of granular metric figures and this is what we have been working to fill.’
Indeed, Springer Nature partnered with Altmetric as early as 2014 to track activity around scholarly literature on social media, Mendeley, ResearchGate and more. Bookmetrix, with its book-level and chapter-level metrics, quickly followed, and the Collection Citation Performance (CCP) – which helps assess the quality of Springer Nature’s various e-book collections – launched in 2017.
Schoenenberger firmly believes that integrating metadata, such as persistent identifiers as well as topical annotations, including chemical- and geo-entities, increases content use and raises research quality. He also hopes that a metric that demonstrates research reproducibility will become available in the near future.
‘I just don’t believe in a scientific world with only quantitative measuring, so I think our strategy has been to partner the quantitative metrics with the qualitative metrics,’ he says. ‘Looking into the future, I’m pretty sure the Journal Impact Factor will not have vanished completely, but impact will be measured using a much more granular and richer set of figures, and this is what we are working towards.’
But not everyone is quite so confident about the future of the JIF. As Digital Science’s Taylor puts it: ‘Eugene Garfield was a genius... and Clarivate has since done an amazing job of curating the value of the JIF and establishing it as a North Star metric across the world.
‘But we are seeing an increasingly open world of metrics that will inevitably chip away at the JIF, and maybe in 10 years it will be gone,’ he asserts. ‘[Similarly] Scopus and Web of Science have such value... but with increasing democratisation and openness, I can easily see that over the course of ten years the importance of these databases will be much lower.’
Indeed, Digital Science’s research information system, Dimensions, was launched with a free app as well as a basic set of indicators, including the National Institutes of Health Relative Citation Ratio, Citation Count and Average Citation Rate, as well as an Altmetric Attention Score. The initial metrics were kept ‘close to the data’ for simplicity, but this is changing and right now the company is developing its own version of an impact factor with two-, three- and five-year analysis windows.
Without a doubt, with Dimensions, Digital Science is providing competition to Elsevier’s Scopus and Clarivate’s Web of Science, but what about ‘rigour’ and ‘quality’? While Dimensions is inclusive in terms of content coverage, Scopus and Web of Science’s content is curated.
And as Clarivate’s Marie McVeigh points out, citation data capture at Clarivate is based on an historic backfile of 1.4 billion references, which she describes as a ‘pretty big polling sample’.
Taylor isn’t fazed. ‘Someone once said to me “Dimensions is a cheap version of Scopus, but don’t forget you will always need Web of Science for the quality”,’ he recalls. ‘But if you talk to younger researchers, their relationship with data is much less about what’s being curated and more about what is available.
‘The reality is these people prefer choice and speed to quality, and as they progress through academia, they will take these attitudes with them,’ he adds. ‘I am sure this is why we see so much interest in open access and Sci-Hub.’
But be it choice and speed, or quality, metrics providers and users all agree that as more and more metrics emerge, correct use remains imperative. As McVeigh puts it: ‘It’s not about having more metrics, nor about having the “right” metrics; it’s about using the right metric in the right place and adding new [indicators] when they are fit for purpose.’ She wants to see metrics users using Journal Citation Reports, with its JIFs and other metrics, the way Eugene Garfield envisioned it: as a dataset that allows the examination of journals as ‘socio-scientific phenomena, as well as communications media’.
‘That is a fantastically modern way of thinking about journals, and it is baked into the JCR itself,’ she says. ‘We now need to make this aspect of [this service] more accessible, so it’s not a single number but a more rounded view of where each journal contributes to the research landscape.’