Turning scholarly content inside-out



Neil Blair Christensen discusses the evolution of enriched scholarly research

While organising some thoughts for this piece, I put the question below to Josh Nicholson, CEO and co-founder of Scite, which provides ‘smart citations’ that show how a scientific article has been cited and whether its findings have been supported or contrasted by others. To date, Scite has extracted more than 900 million citation statements from over 25 million full-text articles.

Neil: ‘Hi Josh, do you think of Scite in terms of PIDs [persistent identifiers]? I'm writing a brief piece about enrichment and I'm thinking of different angles. One of the angles is around PIDs that people may think of in simplistic and traditional terms, but I wonder if tools like Scite are a more applied/smarter generation of PIDs that generate outcomes and not just long-lasting references between objects. People may tend to think of PIDs as ORCID, ROR, GRID, DOI, etc. Is Scite actually a dynamic next-generation PID that takes it further by not just establishing the links, but by assessing and interpreting the context of the links in the process?’

Josh: ‘Hi Neil, I generally don't think of Scite as a PID because I also associate PIDs as more simplistic and traditional like you. With that said, I think the argument could be made that Scite is a PID. I generally introduce our work by saying we are introducing the next generation of citations, which are PIDs, right?’ (Christensen, NB, Nicholson, J. Personal communication, June 28, 2021)

Readers may rightfully challenge my casual interpretation of PIDs, but let’s use my question as a jumping-off point for thinking about objects that cite and link other objects against particular criteria; how the sophistication of linked objects evolves; how we make use of them; and where they originate.
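To make that contrast concrete, here is a minimal, purely illustrative sketch in Python. The class and field names are hypothetical and are not Scite’s actual data model; the point is simply the difference between a bare persistent link between two identifiers and a link that also carries an interpreted context.

```python
from dataclasses import dataclass

# A traditional PID link: nothing more than a persistent connection
# between two identifiers (here, two made-up DOIs).
@dataclass
class PidLink:
    source_doi: str   # the citing object
    target_doi: str   # the cited object

# A contextual citation keeps the same link but adds the citing
# statement and a classification of what that statement does.
@dataclass
class ContextualCitation(PidLink):
    statement: str        # the sentence in which the citation appears
    classification: str   # e.g. "supporting", "contrasting" or "mentioning"

plain = PidLink("10.1234/citing-article", "10.5678/cited-article")
smart = ContextualCitation(
    "10.1234/citing-article",
    "10.5678/cited-article",
    statement="Our results confirm the effect reported in the cited study.",
    classification="supporting",
)
```

The first record only tells us that a link exists; the second tells us what the link means, which is the kind of outcome-generating PID my question to Josh was reaching for.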

Some of us may tend to think of enriching objects, or of PIDs, as fairly rudimentary infrastructural add-ons, but what if we expand the framework? Are they the beginnings of what one could think of as ‘inside-out’ skeletons for more transparently assessing, certifying, and linking required aspects within research studies?

What if tomorrow’s reproducibility, recognition, and linking standards require transparent and open inside-out constructs of research content to validate the value of research studies, similar to Scite but for many more aspects? Whatever we use and design today has implications for tomorrow’s workflows, and more ‘inside-out’ dynamics may be brewing than we recognise in traditional PID and content enrichment conversations.

Let’s look at some examples. Scite has partnered across the scholarly research industry over the past couple of years, and recently also partnered with Research Square Company to provide smart citation badges for nearly 100,000 preprints on the Research Square preprint server, themselves the result of hundreds of linked journal partnerships.

Other types of badges already available on the same preprint server include Prescreen, Methods Reporting, Data Reporting, and third-party Dimensions badges. Additional third-party SciScore and Ripeta badges have also been piloted on the server, and behind the scenes, American Journal Experts offers language quality assessment badges to help authors optimise their writing before sharing preprints. Elsewhere, one can think of the emergence of Open Science badges, DataSeer, Altmetrics, the MDAR (Materials Design Analysis Reporting) framework, or capsules like Code Ocean, Jupyter Notebooks and R Markdown.

In other areas, consider F1000 Research or eLife’s ‘publish, then review’ models, the CRediT taxonomy, RAiD, or Rescognito’s recognition ledger as further signs of inside-out constructs. They are positioned as value-adding components that offer a diversity of ways to link, identify, certify, optimise, or reproduce critical aspects of the research endeavour. Rather than being foreign objects tagged onto research outputs, they turn studies inside-out, increase their utility, and make it more immediately apparent what the studies offer in terms of strengths, limitations, and reproducibility.

This direction may also relate to what Malcolm Macleod calls ‘Publomics’: an approach for systematically combining and integrating research claims and their provenance at a finer granularity, in order to enable more effective ‘research on research’ in answer to the replication crisis. So much information is generated and collected, yet never used or effectively extracted from inside traditional article capsules. Over the years, requirements for research articles have mushroomed, and author guidelines have grown as a result.

Twenty years ago, author guidelines would often fit on one or two printed pages; today they easily run to 10-15 printed pages, not counting the links to additional resources for in-depth requirements and forms. These requirements are compounded by growing research output and increase the workload of authors, editorial offices, editors, reviewers, publishers, and readers. However, these developments also make it attractive, and necessary, to smartly assess targeted aspects from within these complex studies. Several of those author-guideline requirements map directly onto targeted services that are already here or waiting to emerge.

Alice Meadows and Phill Jones recently shared a couple of great posts about PIDs in the Scholarly Kitchen. 

Those posts were possibly the catalyst for my wondering why fairly simple and traditional PIDs still require value-proposition pitching when they are foundational to the direction of scholarly communication. I suspect it is because the framing of what PIDs include is too narrow. Something similar may be true of content enrichment, which for many still conjures semantic enrichment and a decade’s worth of demos of text highlighted in different colours, keyword tagging, and related-article algorithms.

Why couldn’t something like a data reporting badge, which persistently identifies and certifies that a requirement has been met in a research article, be thought of as a PID or as content enrichment?

Similarly, why couldn’t open ‘publish, then review’ reports be thought of in terms of PIDs and content enrichment? Once a review is done, it is persistent and it links to a critical component of a research article. The list goes on, and it is consistent with the FAIR principles of making research data findable, accessible, interoperable, and reusable. I feel that more inclusive PID and enrichment frameworks may have more to offer.
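As a thought experiment, such a badge or review report could be expressed as a small, persistent, machine-readable assertion about a research object. The sketch below is purely illustrative: the class, fields, and identifiers are hypothetical and do not follow any existing badge or PID schema.

```python
from dataclasses import dataclass, field
from datetime import date

# A hypothetical 'enrichment assertion': a persistent record stating that
# a specific check was performed on a specific research object, linking
# both via identifiers. All names and values here are illustrative only.
@dataclass
class EnrichmentAssertion:
    assertion_id: str    # persistent identifier for the assertion itself
    target_doi: str      # the research object being enriched
    assertion_type: str  # e.g. "data-reporting-badge" or "open-review-report"
    outcome: str         # e.g. "requirements met", or a link to a full report
    issued_by: str       # the organisation or service making the assertion
    issued_on: date = field(default_factory=date.today)

badge = EnrichmentAssertion(
    assertion_id="https://example.org/assertions/abc123",
    target_doi="10.1234/example-preprint",
    assertion_type="data-reporting-badge",
    outcome="requirements met",
    issued_by="Example Badge Service",
)
```

Viewed this way, every badge, review report, or capsule becomes a findable, linkable, reusable object in its own right, which is exactly the spirit of FAIR.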

This opinion piece suggests that something is brewing that may currently appear disjointed and may not fit traditional frameworks, but that feels consistent in direction and likely to have staying power. I doubt that anyone truly knows which services will survive or emerge to make lasting impacts, but from multiple directions, it seems the scholarly research community is deconstructing, or modularising, scholarly workflows into multiple components.

This is an opportunity for the research community (including publishers and librarian readers of Research Information) to rethink how research is produced and shared. At Research Square, we envision a future in which our roles in the workflow, namely preprinting, manuscript preparation, and review support, are fulfilled in partnership with people such as yourselves and the organisations you represent.

Let’s make research communication faster, fairer, and more useful.

Neil Blair Christensen is director for partnership solutions at Research Square Company