Open science: up for a challenge?


Stephanie Dawson argues the case for organisations to upgrade to open science

Open Science can seem daunting to academic publishers, who may feel the goalposts are moving just as they embrace Open Access.

As a broad umbrella term that can include a range of practices from open notebooks, methods and data to open-source programming and citizen science, it may also seem that Open Science focusses more on how researchers conduct their studies before the results are ever written up for publication. The aims reach far beyond the academic journal. But these broad goals offer a wealth of opportunities for publishers to participate in and profit from the move towards Open Science.

The 2021 definition of Open Science from UNESCO describes “practices aiming to make multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation, evaluation and communication to societal actors beyond the traditional scientific community.” Because Open Science can be so broadly defined and implemented, it is essential to understand which aspects are most immediately relevant to the publishing community and to outline a strategy for the practices that will increase engagement and encourage participation.

Digital first

Open Science is digital. Its practices exploit the full range of digital knowledge creation and dissemination. If Open Access can be basically fulfilled by making an article PDF freely accessible to readers on a webpage, Open Science requires machine-readable licenses, persistent identifiers and rich metadata for fully interoperable digital systems.

Open Access is a prerequisite, but it is only a first step. To participate in a digital network of knowledge, further persistent identifiers, controlled vocabularies and shared tagging make research outputs – from data sets to preprints and published articles – more accessible to machines for analysis and aggregation into a networked knowledge base. By building on shared knowledge, the research and publishing community can create markers of trust for the public and systems that encourage participation beyond the academy.

Rich, interoperable metadata should be the first goal for publishers upgrading to Open Science. By using controlled vocabularies and persistent identifiers, academic research outputs can more fully participate in and support the goals of Open Science. Digital Object Identifiers (DOIs) are unique, actionable, interoperable persistent identifiers (PIDs) that enable stable sharing of information and should be applied to every publication. Crossref’s XML schema for the metadata associated with DOIs further makes it possible to establish the relationships between published articles and early versions shared online as preprints, pre- or post-publication open reviews, translations or retractions.
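In practice, these relationships are exposed as structured metadata that any downstream system can read. The sketch below shows how version links might be extracted from a record in the JSON shape returned by the Crossref REST API (api.crossref.org/works/{doi}); the field names follow that API, but the sample record and its DOIs are hand-made and hypothetical.

```python
# Illustrative sketch: reading version relationships from Crossref-style
# metadata. The record below is a hand-made example; field names follow
# the Crossref REST API, but the DOIs are hypothetical.

sample_record = {
    "DOI": "10.1234/example.article.1",  # hypothetical DOI
    "relation": {
        "has-preprint": [
            {"id": "10.1234/example.preprint.1", "id-type": "doi"}
        ],
        "has-review": [
            {"id": "10.1234/example.review.1", "id-type": "doi"}
        ],
    },
}

def related_dois(record: dict, relation_type: str) -> list:
    """Return DOIs linked to this record by the given relation type."""
    return [
        rel["id"]
        for rel in record.get("relation", {}).get(relation_type, [])
        if rel.get("id-type") == "doi"
    ]

print(related_dois(sample_record, "has-preprint"))
# prints ['10.1234/example.preprint.1']
```

Because every link is expressed against a persistent identifier rather than a URL or a free-text note, the same lookup works for preprints, reviews, translations and retractions alike.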

Open data sets should be cited with their own DataCite DOIs, rather than shared as unstructured supplemental materials in a PDF. Authorship verified via ORCID helps to establish transparent and reliable links between researchers, their contributions and their affiliations. The CRediT taxonomy provides a standardized vocabulary for contributor roles. Affiliations can be more powerfully aggregated to answer questions about institution-level outputs and usage with the new ROR persistent identifiers. The controlled vocabulary for research funders in Crossref’s Funder Registry provides a similar function. Research Resource Identifiers (RRIDs) for resources such as antibodies and cell lines increase the reproducibility of published research. Finally, standardized Open Access licenses from Creative Commons allow computers to distinguish research that is available for text and data mining from research that is open to build upon. And this is just a selection of the most popular PIDs available!
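A practical benefit of these identifiers is that their syntax can be checked automatically before metadata is deposited. The sketch below illustrates two such checks: a simplified structural test for DOIs (the authoritative test remains whether the DOI resolves at doi.org) and the published ISO 7064 mod 11-2 check-digit calculation that ORCID iDs carry; 0000-0002-1825-0097 is the sample iD used in ORCID’s own documentation.

```python
import re

def is_plausible_doi(doi: str) -> bool:
    """Simplified structural check: '10.' prefix, registrant code, suffix.
    Real validation means resolving the DOI at doi.org."""
    return re.match(r"^10\.\d{4,9}/\S+$", doi) is not None

def is_valid_orcid(orcid: str) -> bool:
    """Check an ORCID iD's format and its ISO 7064 mod 11-2 check digit."""
    if not re.match(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$", orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for d in digits[:15]:
        total = (total + int(d)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected

# 0000-0002-1825-0097 is the sample iD from ORCID's documentation.
print(is_valid_orcid("0000-0002-1825-0097"))          # True
print(is_plausible_doi("10.1186/s41073-019-0063-9"))  # True
```

Cheap checks like these, run at the point of metadata creation, prevent broken identifiers from ever entering the interoperable network described above.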

Unfolding impact

The more research outputs apply a shared controlled vocabulary via persistent identifiers, the more deeply they can unfold their impact within an interoperable network of knowledge. They are open not only to individual readers, but, due to unambiguously defined relations, they are also more sustainably connected to the larger network of knowledge. This opens up new possibilities for interoperability and is the foundation upon which Open Science is built.

That long list of PIDs may look like a lot of extra work for publishers, but embracing Open Science comes with important rewards. Upgrading to Open Science brings better discoverability in a whole range of big-data systems. At ScienceOpen, over the past 10 years we have built a discovery environment and citation index, now with over 87 million records for articles, books, chapters and conference proceedings, based on openly accessible data.

Our database depends on DOIs, structured XML and open citations. It interfaces with a number of other systems and products such as SciELO, ORCID, Altmetric, Scite.ai, SciScore and Reviewer Credits that similarly use DOIs as the main reference point. We see more usage of content with machine-readable open access licenses, with open abstracts and structured citation data, with affiliations and keywords. And ScienceOpen is not the only open discovery tool drawing on this data – Google Scholar, Dimensions, the Lens, Semantic Scholar and others offer the general public access to powerful discovery tools that were formerly reserved to the realm of academics working in university libraries. Better discoverability leads to more usage, a central currency in the academic information economy.

Increased usage as a reward for open practices should be visible to publishers, authors and readers. Support for alternative metrics is a core practice under the Open Science umbrella and is particularly highlighted in the EU Open Science strategy. Spearheaded by Open Access publishers over a decade ago to demonstrate the value of paying an APC for open publishing, article-level metrics have become a new standard within the researcher community. Authors expect to see the number of citations to their article, download numbers and online attention. An interoperable digital landscape of Open Science publishing makes it possible to track usage and impact at an ever more granular level. A key development in recent years has been the opening up of citation data: thanks to the advocacy of the Initiative for Open Citations, as of 2022 100% of references deposited at Crossref are treated as open metadata.

Open science and HSS

Do you feel, nevertheless, like Open “Science” is not for you as a book publisher in the humanities and social sciences? Wrong! While the use of Crossref DOIs as persistent identifiers for journal articles has become the industry standard, it is still relatively new territory for books and chapters. The metadata registered with most book DOIs is also significantly less rich, usually lacking abstracts and citations, thus providing less text and fewer nodes of connectivity for search engines to work with. This can result in lower discoverability for the main output types – books and chapters – of the humanities and social sciences. ScienceOpen began indexing books and chapters in 2019 and now has over two million records with validated DOIs. 

A quick search for books, however, immediately highlights how slim most of the book metadata available through Crossref is. To help smaller publishers enrich their book metadata, ScienceOpen created the free platform BookMetaHub, funded by the German Ministry of Education and Research, that can translate metadata from the XML format used for book sellers and libraries, ONIX, into a Crossref-ready format. This richer metadata can help book publishers to benefit from the advantages in discoverability.
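The core of such a conversion is a field-by-field mapping from one metadata vocabulary to another. The sketch below is a heavily simplified, hypothetical illustration of that idea, not BookMetaHub’s actual pipeline: it pulls a title, an ISBN and a contributor out of a hand-made, namespace-free fragment using ONIX 3.0 reference tag names (code 15 in the ONIX product identifier code list denotes ISBN-13), ready to be re-expressed as Crossref deposit fields.

```python
# Hypothetical sketch of an ONIX-to-Crossref field mapping. The ONIX
# snippet is a hand-made, namespace-free simplification of ONIX 3.0
# reference tags with a made-up ISBN; real ONIX files (and real
# conversion tools) are far richer.
import xml.etree.ElementTree as ET

onix_snippet = """
<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType><!-- 15 = ISBN-13 -->
    <IDValue>9781234567890</IDValue>
  </ProductIdentifier>
  <DescriptiveDetail>
    <TitleDetail>
      <TitleElement>
        <TitleText>An Example Monograph</TitleText>
      </TitleElement>
    </TitleDetail>
    <Contributor>
      <PersonName>Jane Example</PersonName>
    </Contributor>
  </DescriptiveDetail>
</Product>
"""

def onix_to_crossref_fields(onix_xml: str) -> dict:
    """Extract a few of the fields a Crossref book deposit needs."""
    product = ET.fromstring(onix_xml)
    return {
        "isbn": product.findtext("ProductIdentifier/IDValue"),
        "title": product.findtext(
            "DescriptiveDetail/TitleDetail/TitleElement/TitleText"),
        "contributor": product.findtext(
            "DescriptiveDetail/Contributor/PersonName"),
    }

print(onix_to_crossref_fields(onix_snippet))
```

Because publishers already produce ONIX for booksellers and libraries, reusing it as the source for richer Crossref deposits costs far less than re-keying metadata by hand.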

Another way to upgrade to Open Science is to explore open and transparent peer review. Just by committing to accept manuscripts that have been posted online as a preprint, publishers can support the research community. Add this information to your journal pages and instructions or write a blog to get the word out: researchers are still sometimes unsure what counts as “previously published.”

In a next step, you could agree to consider open peer reviews of preprints as part of your peer review process, or allow your reviewers to sign or post their review reports. ASAPbio offers a wealth of resources for authors and journals around preprint review. And if you decide on a fully open or transparent peer review model for your next journal, make sure to give reviews a DOI via the Crossref peer review XML schema to make them citable and discoverable in their own right. An increasing number of journals and publishers are embracing some form of open review, including BioMed Central, EMBO, Copernicus, eLife, JMIR, F1000Research, and ScienceOpen. Ross-Hellauer and Görögh (2019) provide thoughtful guidance for getting started.

Finally, one of the main goals of Open Science is to engage the general public and raise levels of trust in science and academia. Projects range from citizen science participation and plain-language summaries to multilingual publishing.

Academic publishers are already engaging with Open Science and driving it forward. But there is more work to be done before a seamless flow of information connects researchers “for the benefits of science and society” and speeds up the pace of innovation to address some of the big challenges that lie before us.

Stephanie Dawson is CEO at ScienceOpen

REFERENCES

UNESCO (2021) UNESCO Recommendation on Open Science. Paris. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000379949.locale=en (Accessed: August 24, 2023).

European Commission (2019) Open Science, Research and Innovation. Available at: https://research-and-innovation.ec.europa.eu/strategy/strategy-2020-2024/our-digital-future/open-science_en (Accessed: August 24, 2023).

Ross-Hellauer, T. and Görögh, E. (2019) Guidelines for open peer review implementation. Research Integrity and Peer Review, 4(4). DOI: 10.1186/s41073-019-0063-9.
