Open science and the new normal

While open science has gained from the pandemic, many challenges remain, writes David Stuart

Image: Sergey Nivens/shutterstock.com

There can be no doubt that Covid-19 gave a boost to open science.

There’s nothing quite like a global pandemic to focus the mind on the need for openness and collaboration, and publishers and researchers quickly took unprecedented steps to reduce the barriers to accessing research articles and data.

But as things begin to return to a ‘new normal’, and some of the barriers begin to reappear, it is important to consider what open science has actually gained from the pandemic, and some of the challenges that remain to be overcome in the face of other global challenges.

Open science in the pandemic

Any discussion of open science in the pandemic is necessarily an oversimplification. Researchers’ experiences will have differed considerably depending on their field and personal circumstances.

For those working in areas directly aligned with the pandemic, the focus will have been on ensuring research findings and data were shared as widely and quickly as possible. For those in other fields it may have been about trying to find new ways of working off campus and away from the lab.

It is also important to recognise that open science is much broader than just the open access and open data discussions that gain much of the attention. It can refer to a wide range of activities in the research lifecycle, from the way science is captured through open notebooks to the way it is measured with open bibliometrics, and while some areas of open science will have gained additional attention in the pandemic, others will have been neglected.

If there was one feature of open science that was universal in the pandemic, however, it was the speed with which things needed to change. Mark Hahnel, CEO and founder of the open access repository Figshare, and Hannah Heckner, director of product strategy at Silverchair, shared their thoughts about open science during and after the pandemic, and both emphasised the shift in speed as seen from their different perspectives.

For Heckner, working for a leading independent platform provider for scholarly and professional publishers, this was the speed of change to open access of the version of record: ‘We saw a real acceleration in a lot of the open access movement in the industry. We worked really closely with a lot of clients to lower paywalls when people who were reading their content were going off-campus, and we worked closely with Oxford University Press to deposit in PubMed Central all of their relevant research as part of the Public Health Emergency Covid-19 Initiative.

‘Once some of the dust was starting to clear with those paywalls down, and with those workflows set-up to get the content downstream, we started working really closely with clients to make sure we are continuing to connect with those previous subscribers, how are we getting them the metrics they need to underline the value of the subscription when we need to put the paywalls back up. It was a true time for partnership, that we realised we needed to support with quick pivots, with these shifting business models, and different thoughts about on-campus and off-campus access.’

At the same time there was an urgency to the sharing of preprints, and the data associated with the coronavirus, by researchers. As Hahnel noted: ‘There was suddenly this global urgency that we all need to be working together. People studying infectious diseases were already working together, but there was a massive gear switch in terms of making preprints available on bioRxiv, or platforms like Figshare. People also started using Figshare to store different countries’ national statistics because some countries didn’t have particularly solid web sites, they had web sites going up and down.’

The push for open data wasn’t limited to the sharing of Covid-19 data. As lockdowns were implemented and researchers had more time at their computers and away from the lab, they had the opportunity to make other data available, and to make use of other people’s data.

As Hahnel pointed out: ‘There’s two levels of usage of Figshare: there’s how many people are publishing content through Figshare, and how many people are downloading, and reusing content available on Figshare. There was a huge peak in the amount of submissions in March 2020, and a little peak again in September, but since then it’s just resumed its normal upward trend.

‘The bigger story was the idea that people were making use of content after being forced out of the lab. We do ‘The State of Open’ survey every year (it will be coming out by the end of the year), and one of the things we’ve seen is that a lot more people in this year’s survey have used other people’s data. I’d assume that is because they were forced to.

‘If your full-time job is a wet lab scientist, and you can’t do wet lab science, then your career progression is on hold unless you can publish more papers, and so you go and find lots of data. When asked where they get the data to pull together, the primary location was data repositories, to do a reanalysis or a cumulative bit of work across many different data sets.’

While open access and open data undoubtedly received a lot of attention, Heckner also noted that other parts of the push to greater openness have done less well, or been ‘paused’, such as support for openness and thinking about the incentive structure.

This is a point that ties into the theme of this year’s Open Access Week (www.openaccessweek.org): ‘It Matters How We Open Knowledge: Building Structural Equity’. While it is easy to focus on the number of papers or amount of data that is being made available, it is important that we don’t ignore the issue of equity during the pandemic. Equity is about ensuring fair and impartial access to the whole of the scientific process, and typically the pandemic had the effect of exacerbating existing inequalities.

Post-pandemic open science

Open science after the pandemic will inevitably be different from both open science during the pandemic and open science before it. Those researchers who turned to data-driven science as the wet labs closed, or had more time to publish their data online, will have returned to something like pre-pandemic normalcy, with more lab work and less time for data cleaning, publishing and reuse. But other aspects will change. While publishers will reintroduce paywalls, they will want to learn from the longer-term, ongoing trend towards open science.

As Heckner pointed out, this can bring new data needs: ‘They’re starting to put up some more of their paywalls in a more thoughtful approach, and we need to figure out how to get the data that those publishers need in order to make those decisions: to explore new access models, and to support off-campus access for that content that is no longer open access or perhaps was never opened up for access.’

The onwards march towards open science means even the most established of barriers are slowly being overcome. For example, both Heckner and Hahnel raised the ongoing issue of the culture of science, with the publication of a few articles in high impact factor journals having an excessively high impact on an individual’s career.

Heckner said: ‘We’ve been seeing a lot of really great movement in this area with open access journals being really valued in the research ecosystem, but there’s still some people who say “in order to get tenure I need to publish in X, Y, Z” and that may be behind a paywall.’ As Hahnel noted, however, while we still live in a world where your career is set with a couple of articles published in Nature, things have changed in recent years so that you can now publish preprints and still be published in Nature.

At the same time, we’re also seeing a wider range of content coming online, as Heckner explained: ‘We’re definitely seeing a diverse content set coming onto the platform. There’s a desire to get all the different artefacts from publishing, which might have been more ephemeral before, digitised so that they are durable moving forward. There’s an increased emphasis on digestible research. Incorporating more multimedia into your research. Getting videos of discussion between researchers, rather than having them take the time to perhaps write a preprint and post it. So getting these more digestible, snackable, and accessible pieces of science out to readers, that’s certainly something we see continuing on in the future.’

Associated with the issue of culture and new content is the issue of trust. This is an expanding issue as new forms of content are beginning to appear online, and cultural expectations change. For example, as Heckner pointed out, there is an increased expectation of transparency throughout the review process: ‘Trust and transparency between the technological provider, the publisher and reader is really important. There’s trust in the peer review system, and when peer review isn’t happening there should be some transparency and openness and saying where this research came from and what it relates to.’

There is also the issue of trust in the way others make use of the open data that is increasingly being shared, and as Hahnel pointed out, this is one of the areas where perceptions have actually been damaged by the pandemic: ‘It used to be the reason that people didn’t want to publish their data because they didn’t want to get scooped. The statistics are showing that that’s falling down the order of importance now, and now the top reason is misuse of their data. If I publish my data people are going to use it for their own twisting of words? I think this is a direct reaction to Covid, but people are also now seeing that if they publish their data they get more opportunities for collaboration. Some 35 per cent of respondents had been involved in a collaboration as a result of data they had previously shared.’

As Hahnel went on to say, the pandemic also brought out those with more questionable reasons for sharing data. ‘There’s always a trade-off with fast but good publishing. With preprints, with the large majority of open data publishing, there is no peer review. We have seen a lot of people who are making content openly available, but then it starts to be more questionable, and you have to ask ‘What are the motivations for sharing this data?’.

‘You always had the memory of water folks, who had interesting angles to their research and couldn’t get it published elsewhere, but then you’d start to get the fake news brigade, and the quality of the research, or the quality of the questions being asked probably needed to be dug into a little bit more. In 2020 all of the content that came out at the beginning was well-intended, and I’d say we’re now at a phase where public trust is changed and not all of it is well intended. We’ve definitely had to update our terms on what kind of content we’re accepting.’

At the same time as the culture around open science is changing, there are still ongoing technical challenges to be overcome, as Heckner pointed out: ‘There’s also a large amount of technological work and infrastructure that needs to exist in order to support open science. Open science will really be successful when there is a co-linking between the artefacts in the research system. There has been some really great work in this area with CrossRef, with organisation IDs like ROR (Research Organization Registry), connecting people with their research, and to the other researchers that are citing and using and contributing to that research. But that stuff isn’t free, so continued investment in that infrastructure, continued collaboration between technological providers and their publishers, is something that needs to exist and continues to need to be invested in, in order for this to be a successful movement.’

Hahnel raised the importance of metadata, and highlighted the potential of new technologies to help with some of those problems: ‘The idea of fast but good publishing is what I see as the next 10 years of open science.

‘We’re starting to see that you can build on top of the research that has gone before, getting the machines to do the work. AlphaFold, with the PDB (Protein Data Bank) and UniProt (the Universal Protein Resource), is the first great example of machines and information working together in the academic space to move further faster.

‘The next 10 years is going to be a large push towards that, but we need to have lots more homogeneity around diverse data sets, and to do that we need much more metadata. If we look at 2030 it’ll be an open access paper publishing world, you’ll be able to get your hands legally on every paper. The next big challenge is the curation of data sets to make them usable in the same way that AlphaFold made use of all the PDB and UniProt data.’

Progress towards a more open science relies on a mixture of technology and culture, and the interaction between the two, as Hahnel pointed out: ‘You can say that in the 1980s we had repositories where you could publish open access, green open access, but people didn’t do it. What has been amplified is the idea that the culture of open science is good, open research is good, as open as possible, as closed as necessary.

‘On the open access side of things we’re already at post-50 per cent open access publications this year, and the open access movement will just continue to grow. Researchers are more aware of the reasons why you should publish openly now, and that will continue to grow. Open data will continue to grow, we’ll continue to see a step change. Covid was one of the drivers, with the public trust of research and the need to see the numbers, but funder mandates are the other side of it with the National Institutes of Health mandating Open Data associated with publications by January 2023.

‘We’ll get more evolution in the technology, more experiments. Not all of them will work. What is the gap between a preprint and a publication? It’s just peer-review. Publishers will tell you that it’s also copy-editing and things like this, but you can use technology for a lot of those things now.

‘But you really can’t get away from the peer-review, so I think we’ll see a middle ground of how you rate and curate preprints, and with an open data world we’ll move from not needing to check data in any way to there needing to be some level of checking going on – some level of curation, particularly around metadata, in order to make it FAIR: Findable, Accessible, Interoperable, and Reusable.’

Building back a better open science

It would be hard not to consider open science as one of the successes of the pandemic, although the nature of that success is probably more subtle than many would like. Although there was a step change in open access and open data, this should be seen as part of the overall ongoing trend towards open science. There are other areas of openness that may have been paused during the pandemic, and aspects of open data that have raised new challenges around issues of trust. We are also left with the inevitable question: if Covid-19 was considered important enough to change our practices, however temporarily, what other issues are equally important?

The Covid-19 pandemic is only one of many big challenges facing society – and some of the most significant have been summed up in the UN’s sustainable development goals (SDGs). The 17 goals, ranging from health and poverty to tackling climate change, dwarf the problems of the pandemic, and often require cross-disciplinary, joined-up thinking. It is not a case of simply partitioning off those articles that may help with a particular problem – problems and solutions can spring from anywhere.

The increasing interest in the SDGs is demonstrated by the growing attention paid to the Times Higher Education Impact Rankings, and it seems inevitable that people will ask: if we could do it for Covid, why not for this SDG? As with a lot of society, the publishing sector finds itself facing the question of whether we will ‘build back better’ or get back to ‘business as usual’.

The challenges remain the same from the publishers’ perspective, but researchers’ and the public’s expectations will have undoubtedly risen. While open science is increasingly recognised as a common good, there is no one-size-fits-all solution – a diversity of disciplines requires different responses. Covid may have given open science a temporary nudge, but it is an evolving process and best practice will take time to emerge.