Research analytics: friend, foe and informant

Neil Jacobs considers analytics and its role in research management and assessment

Research can be measured. So argue those who seek to sell products that offer insight into patterns of research behaviour. These patterns are often based on the authors of articles and books, and on how those articles and books are used and cited. But, as a recent Jisc report suggests, ‘maybe this is just another fad driven by the benefits seen in areas such as online shopping and social media?’

Today, universities around the world are paying increasing attention to performance measures and patterns, some of which are presented in league tables. For example, the top 100 in the Shanghai Academic Ranking of World Universities are competing against each other globally for researchers, grants, students and marks of research quality. Looking at data doesn't just tell universities about their research performance; it can also help them to improve it.

The application of such data is often called ‘analytics’. Analytics allows industry and academia to seek meaningful patterns in data, in ways that are pervasive, automated and cost-effective, and in forms that are easily digestible.

The use of analytics is growing. Cornell University says that ‘the role of analytics in research will be a focus of our forthcoming corporate plan’, but goes on to flag some concerns. For example, analytical data will vary between disciplines, and international and interdisciplinary research will impose specific requirements on analysis if they are to be adequately represented.

These reservations are examples of the questions that universities and researchers should be asking if they are to get the best value from this opportunity – questions familiar from graduate research methods: validity, reliability, ease of use, and appropriateness to context and purpose. In the UK, they are neatly summarised in the National Audit Office guidance on performance indicators, ‘Choosing the Right FABRIC’.

Two trends particularly highlight the challenge for universities when using analytics.

One of these is the increasingly systematic management of research and its communication. Approaches such as STAR-Metrics in the USA, Lattes in South America and ‘Current Research Information Systems’ in Europe demonstrate that both universities and funders want to capture and use codified information about research for strategic and, in some cases, tactical gain. The key here is interoperability: ensuring, so far as is possible, that data are comparable. Is a research grant the ‘same’ thing in this system as it is in that system? The CASRAI data dictionary should help the research community both reach agreement on this type of question and own that process.
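To make that question concrete, here is a minimal sketch in Python of what reconciling grant records from two systems might involve. The field names and conversion step are invented for illustration; the real CASRAI dictionary defines its own terms.

    # Two systems describe 'a grant' differently; a shared data
    # dictionary lets both be mapped onto one comparable record.
    # Field names here are illustrative, not actual CASRAI terms.
    def from_system_a(record):
        # System A reports a single total award.
        return {
            "grant_id": record["id"],
            "funder": record["funding_body"],
            "amount_awarded": record["total_gbp"],
            "currency": "GBP",
        }

    def from_system_b(record):
        # System B reports an annual figure, so its totals are not
        # comparable with System A's without this conversion step.
        return {
            "grant_id": record["ref"],
            "funder": record["sponsor"],
            "amount_awarded": record["annual_eur"] * record["years"],
            "currency": "EUR",
        }

Only once both systems answer in the same terms can their grants be counted, compared or aggregated with any confidence.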

The other trend is the uses to which ‘transactional data’ are put, especially by large retail operations such as Wal-Mart and social network sites such as Facebook. Every digital action by a researcher leaves behind a ‘data exhaust’, a trace; aggregated, these traces provide a rich source of data that can be mined for meaningful patterns.
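As a rough illustration only – the event names and the simple per-researcher tally below are assumptions of mine, not any vendor's actual method – aggregating such traces in Python might look like this:

    from collections import Counter

    # Each digital action leaves a trace: (researcher, action) pairs.
    exhaust = [
        ("r1", "download_pdf"),
        ("r1", "save_reference"),
        ("r2", "download_pdf"),
        ("r1", "download_pdf"),
    ]

    # Aggregated, individual traces become activity profiles that
    # can then be mined for patterns across a whole population.
    profiles = Counter(exhaust)
    for (researcher, action), count in sorted(profiles.items()):
        print(researcher, action, count)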

Already, libraries in the UK are using this kind of data from their systems to anticipate risks to student retention and performance. How long before researchers are monitored in a similar way?

Well, as I’m sure you’ve guessed, they already are.

The recent furore over management decisions at Queen Mary, University of London, which were largely based on indicators derived from citation data, is the most extreme example of this in the UK. In other countries, financial rewards tied to the Thomson Reuters ‘impact factor’ are common. More sophisticated approaches are also available; these may use combinations of co-authorship, citation and co-citation, and sometimes usage and other transactional data, in the manner of social network sites. The obvious example is Mendeley, but others of this type include VIVO (Cornell, Florida), Catalyst Profiles (Harvard), Loki (University of Iowa), Academia.edu, and a variety of commercial products including COS, Index Copernicus Scientists, Research Crossroads, BiomedExperts and Epernicus.
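As a hedged sketch of the kind of calculation such profile systems might run – the toy data and the choice of degree centrality are illustrative assumptions, not how any of the named products actually works – consider a small co-authorship graph analysed with the networkx library:

    import networkx as nx

    # Toy co-authorship graph: an edge means two people have
    # published together at least once.
    G = nx.Graph()
    G.add_edges_from([
        ("Ada", "Ben"), ("Ada", "Cleo"), ("Ben", "Cleo"), ("Cleo", "Dev"),
    ])

    # Degree centrality is one simple indicator of how connected a
    # researcher is within the collaboration network; real systems
    # would combine it with citation, co-citation and usage data.
    for author, score in sorted(nx.degree_centrality(G).items(),
                                key=lambda item: -item[1]):
        print(f"{author}: {score:.2f}")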

So, is the answer to reject the use of quantitative data to inform the management of research? Even if a principled, positive answer to that question were proposed, it would be unlikely to affect practice on the ground. Instead, research communities might be better served by engaging actively in the discussions that are already happening about the use of these data. The range of such data grows daily as researchers use the web for their work, and the ‘altmetrics’ movement is only one international forum in which the application of data to research is being explored. Publishers such as the Public Library of Science (PLoS), for example, encourage a participatory approach. Many of those involved are experts: researchers and ex-researchers, publishers and librarians; and they are by no means uncritical enthusiasts for the automated gathering of data and the use of metrics.

In many ways, universities and researchers are fortunate. Their core business, research, involves collaboration, and the collection, analysis and interpretation of data. Therefore, they are uniquely placed to develop and deploy analytic techniques in an informed way. The question is: will they?

Neil Jacobs is programme director for digital infrastructure at Jisc