Where is artificial intelligence taking publishing?

PARTNER CONTENT – EXPERT VIEWPOINT

10 November 2021

For advancements in scholarly output discovery to exist, systematic content structuring, clustering and categorisation must evolve, writes Sally Ekanayaka

Big names like Elsevier have long been aboard the AI bandwagon in search of new and improved ways to support research, discovery and scholarly publishing, yet AI has only begun to streamline research.

Granted, scholarly communication is ever-expanding, never-ceasing and increasingly openly available, with content building up across preprint servers, journals and books. But with a digitised, predominantly open access (OA) landscape come new norms and new expectations, all connected to a wave of novel publishing technologies and standards.

So, the question. Where is AI, the newest comrade in publishing, taking us?

MyScienceWork data scientist Maha Amami, Ph.D., shares some success stories about AI applications in scholarly communications as they relate to research artefacts.

Advances in Entity Recognition Systems

For advancements in scholarly output discovery to exist, systematic content structuring, clustering and categorisation must evolve. Let’s begin there.

Named entity recognition (NER) is an important, often first, step in information extraction pipelines, whether for question answering, information retrieval, coreference resolution, topic modelling or other tasks.

The Sirius tool Dcypher, a highly accurate, state-of-the-art NER tool, screens various entity types and is robust to variations in text genre and style. By applying ontologies, NLP and machine learning models, the solution extracts entities (and discovers new ones) from within articles, whether abstracts or full text. Publishers use these advanced NER systems across large collections of both structured and unstructured content to first recognise, then normalise and link various entities, even though the systems are trained with only small amounts of labelled data.
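
The recognise, normalise and link steps can be illustrated with a minimal Python sketch. It does not show how Dcypher works, since its internals are not described here. It assumes a tiny hand-made ontology with made-up identifiers such as `GENE:0001`, and it uses plain dictionary matching where a real system would use trained models.

```python
import re

# Hypothetical mini-ontology: concept ID -> (entity type, preferred name, synonyms).
# The IDs and entries are illustrative only, not taken from a real ontology.
ONTOLOGY = {
    "GENE:0001": ("gene", "TP53", ["tp53", "p53"]),
    "DIS:0001": ("disease", "COVID-19", ["covid-19", "covid 19"]),
    "CHEM:0001": ("chemical", "aspirin", ["aspirin", "acetylsalicylic acid"]),
}

# Lookup from each lower-cased surface form to its concept ID.
SURFACE_TO_ID = {
    synonym: concept_id
    for concept_id, (_, _, synonyms) in ONTOLOGY.items()
    for synonym in synonyms
}

# One regex matching any known surface form, longest first so that
# multi-word names win over shorter overlapping ones.
_PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(re.escape(s) for s in sorted(SURFACE_TO_ID, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def extract_entities(text):
    """Recognise mentions, normalise them, and link them to concept IDs."""
    entities = []
    for match in _PATTERN.finditer(text):
        concept_id = SURFACE_TO_ID[match.group(1).lower()]
        entity_type, preferred_name, _ = ONTOLOGY[concept_id]
        entities.append({
            "mention": match.group(1),     # text as it appears (recognise)
            "normalised": preferred_name,  # canonical name (normalise)
            "concept_id": concept_id,      # ontology identifier (link)
            "type": entity_type,
            "start": match.start(1),
        })
    return entities


if __name__ == "__main__":
    abstract = "Acetylsalicylic acid and p53 were studied in Covid-19 patients."
    for entity in extract_entities(abstract):
        print(entity)
```

In this sketch, "Acetylsalicylic acid" and "aspirin" resolve to the same concept, which is what lets later steps treat differently worded articles as being about the same thing.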

Advances in Semantic Search

Coming to the good stuff:

Of the avalanche of Covid-19 research publications, only one-fifth resulted from international collaboration, which raises questions about the extent of redundancy and the lack of added value in funded research. Also, following a year of experimentation with different OA publishing approaches, finding relevant papers, data and software has become an associated challenge.

For digital libraries to remain relevant, Sirius offers a comprehensive semantic search engine to navigate around search results that are too large or unspecific, and around the enormous ambiguity attached to, for example, biomedical objects (genes, chemicals, diseases and much more). Extending traditional search engines, a semantic search engine combs for entities, classifies documents, captures the semantic relations between and across different keywords, and identifies interactions by connecting the context of an article in a scientifically meaningful way. This is also the eventual catalyst for publishers to offer content through advanced recommendation systems that take users' navigation and topics of interest into account in depth.
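
To make the contrast with traditional search concrete, here is a minimal Python sketch that compares literal keyword matching with matching on concepts. It is an illustration under stated assumptions, not the Sirius engine: the synonym table, concept labels and sample documents are all invented, and substring matching stands in for proper entity recognition.

```python
# Hypothetical synonym table: surface form -> concept label (illustrative only).
CONCEPTS = {
    "aspirin": "chem:aspirin",
    "acetylsalicylic acid": "chem:aspirin",
    "heart attack": "disease:myocardial_infarction",
    "myocardial infarction": "disease:myocardial_infarction",
}

# Invented sample documents.
DOCUMENTS = {
    "doc1": "Acetylsalicylic acid lowers the risk of myocardial infarction.",
    "doc2": "Aspirin dosage in elderly patients.",
    "doc3": "Open access policies in European universities.",
}


def concepts_in(text):
    """Return the set of concepts whose surface forms occur in the text."""
    lowered = text.lower()
    return {concept for surface, concept in CONCEPTS.items() if surface in lowered}


def keyword_search(query, documents):
    """Baseline: return documents containing the literal query string."""
    return sorted(
        doc_id for doc_id, text in documents.items()
        if query.lower() in text.lower()
    )


def semantic_search(query, documents):
    """Rank documents by how many query concepts they share."""
    query_concepts = concepts_in(query)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(query_concepts & concepts_in(text))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document ID for stable output.
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


if __name__ == "__main__":
    print(keyword_search("aspirin", DOCUMENTS))
    print(semantic_search("aspirin", DOCUMENTS))
```

A keyword search for "aspirin" misses the document that only says "acetylsalicylic acid", while the concept-based search finds both, because the two names map to the same concept.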

Advances in Semantic Publishing

A growing number of scholarly institutions and publishers are implementing semantic publishing for the greater relevance, discoverability and understanding of scholarly output that it promises to deliver.

‘The benefits of an agile incremental approach to semantic publishing need no emphasising and the trend towards its adoption is certainly on the rise’, says Maha Amami. ‘Journals are well aware that those failing to provide the quality and depth of information that readers seek will become increasingly marginalised. It all boils down to early adopters, who certainly benefit from increased merit for their journals, more submissions, increased readership and, naturally, higher impact factors. Better understanding of citations thanks to sentiment analysis technologies is an added bonus.’

Dr. Amami and her teams’ project-based customisation of Sirius includes:

Improving search engines to provide more accurate, relevant, diverse and serendipitous content to end users;
Increasing readership by empowering semantic search (evaluation of content based on sentiment analysis of citations, products, organisations);
Offering relevant content (based on researcher topics of interest) through improved recommendation systems;
Building community networks to detect domain experts (existing or new) and proximity between them; and
Tracking new developments and detecting trends within research topics.
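
As one illustration of the list above, recommending content based on a researcher's topics of interest can be sketched with cosine similarity between topic weights. This is a minimal Python sketch under stated assumptions, not the Sirius recommender: the topic names, weights and article IDs are invented, and real systems would derive such profiles from navigation and content analysis.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse topic vectors stored as dicts."""
    dot = sum(weight * b.get(topic, 0.0) for topic, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile, articles, top_n=2):
    """Return up to top_n article IDs most similar to the reader's profile."""
    scored = [
        (cosine(profile, topics), article_id)
        for article_id, topics in articles.items()
    ]
    # Drop articles with no topic overlap, then sort by descending score.
    scored = [(score, article_id) for score, article_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for _, article_id in scored[:top_n]]


# Invented topic weights per article (hypothetical data).
ARTICLES = {
    "a1": {"genomics": 0.9, "oncology": 0.4},
    "a2": {"open access": 1.0},
    "a3": {"oncology": 0.8, "immunology": 0.6},
}

if __name__ == "__main__":
    reader_profile = {"oncology": 1.0, "genomics": 0.5}
    print(recommend(reader_profile, ARTICLES))
```

Articles that share no topic with the reader's profile are left out rather than padded into the results, which keeps the recommendations relevant at the cost of sometimes returning fewer than `top_n` items.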

‘Sirius opens up numerous possibilities in different fields, amongst which improving semantic search, evaluating the performance of institutions and researchers, and tracking research are popular choices,’ concludes Dr. Amami.

For more information on Sirius and data science solutions for publishing, contact Maha Amami, Ph.D., at maha.amami@mysciencework.com.