Briefing paper aims to clarify TDM for publishers

The Association of Learned and Professional Society Publishers (ALPSP) has released a briefing document explaining text and data mining (TDM) to its members. The document looks at the processes involved and what scholarly publishers need to consider to enable researchers to mine content effectively.

Audrey McCulloch, chief executive of ALPSP, explained the reasoning for producing the document: ‘Scholarly and professional publishers need a concise explanation of what TDM is in both academia and commercial settings. They also need straightforward information on the tools and services currently available to support them. This will help them make content more readily available, which is particularly important in light of the recent UK exception to copyright.’ 

The briefing paper lists ‘time, expertise, access to data and information in a compliant, aggregated and normalised format (usually XML), as well as text mining tools to support analysis’ as benefits to researchers.

Researchers in the pharmaceutical industry, for example, might mine published research for: literature and patent analysis; drug repurposing; competitive intelligence; biomarker discovery; drug safety; and sentiment analysis.

‘At the present time, one of the key benefits of using TDM (and one of the drivers for greater access to the literature) is to unlock clues to potential medical advances,’ noted McCulloch.

However, there are challenges for researchers, which include legal uncertainty; identifying copyright holders; lack of standards; lack of understanding; and lack of technical skills.

For publishers, there are challenges too. ‘As text and data mining is still a relatively new activity, there are no standard methods or techniques used by all researchers. For example, some researchers may request documents in XML format, while others may be happy with PDFs,’ notes the briefing paper. Other challenges include developing and managing systems for content access and delivery; handling requests; and ensuring that mining activities do not impact the stability and security of the publisher’s platform.

The paper is aimed at ALPSP members. ‘We felt there needed to be a simple reference guide, for those who are being approached about TDM now, and for those whose research community is currently not engaged in TDM, but may well be in the future,’ said McCulloch. 

‘Publishers face many demands on their attention and resources at the moment. There are an increasing number of open-access mandates around the world, evolution of technology and changing behaviour and requirements of authors and readers. Facilitating new research tools such as TDM are in addition to this. This needs careful management to provide access to those wishing to mine, while protecting content from piracy and ensuring continuity of use by other readers,’ she continued.

‘There is a legal obligation for providers of content to UK researchers to facilitate TDM for subscribers with non-commercial requirements. This paper demonstrates that the facilitation of TDM can happen for both non-commercial as well as commercial activities and outlines the support available for publishers.’

The briefing paper includes summaries of some of the systems already available that can help publishers enable text and data mining. These include services from Copyright Clearance Center, CrossRef, Infotrieve and Publishers Licensing Society.