Finding meaning from chaos

Web technologies offer researchers new ways to find and use information but publishers need to innovate to help them, believes John Haynes, vice president of publishing at the American Institute of Physics (AIP)

What trends have you noticed?

In the last 10-15 years online has clearly come to dominate. We are printing as a side product and more than half our customers don’t get print journals at all. This trend has been a great boon for us all. It means that we can concentrate our innovation and investment online.

The web has revolutionised our communication with all our customers including authors, referees, editors and readers. However, in a fundamental way journals have not really changed. There is still the peer-review process, authors still write in linear text and articles have a similar style and format to those written 100 years ago.

The journal has probably lasted so long because it serves multiple functions, including peer review, archiving and preservation. Nobody has really found a better alternative that serves all the journal’s functions.

For readers, there is more information than ever before. Michael Mabe, CEO of STM, has observed a steady increase in published information of three per cent per year over many decades. This makes it a challenging world for researchers if they are trying to stay abreast of their field and be aware of developments in related fields. In my view, publishers have not really done a good job of addressing this information overload and this is one of the main areas AIP is focussing on.

How can publishers help?

Publishers have built vast PDF warehouses but getting to the meaning of the content is not easy in this format. Studies by Carol Tenopir (University of Tennessee) and Donald King (University of Pittsburgh) show that researchers read more than in the past but spend about the same total time reading. Researchers are looking for key observations, experimental data, conclusions and other bits of information but it’s not easy to extract meaning across a large number of articles when the content is locked inside a PDF.

AIP is investing in things like more structured and automated tagging of its XML content. On our hosting platform, Scitation, our investments in XML and Mark Logic enable us to make the content come alive so that navigation to formulae and maths within articles is quick. The notion of community is also important. A journal has always been a community of editors, authors and readers. We are trying to support this with related links and suggestions of what other people have viewed.

Our scientific social networking site AIP UniPHY has gone amazingly well since its launch in September. We recently passed the 15,000 registered user mark. People are using it in an intensive fashion and coming back. Its success is largely because it was pre-populated from high-quality bibliographic sources. It supports the idea of community, based on who people have written papers with, and allows research fields to be viewed in a different way. Learned societies were created to foster communities and in the 21st century this means online communities with global reach.

How can semantic technologies benefit physics?

We need to invest more in subject domain experts to decide what is useful to tag. We are also creating a thesaurus by developing our Physics and Astronomy Classification Scheme (PACS). We can then use human and automatically generated tags to add value and meaning to our content and make it easier for researchers to do their jobs.

What about data?

Physics is a data-rich field. We are looking at tagging data so that it can be extracted. Data mining is seen as a great opportunity but there is a lot of work to do. One great quote that I heard about this is: ‘my data’s mine and your data’s mine.’ Scientists don’t really want to let data go so there are many societal issues to address.

There is also considerable work to do on standardising how data is presented. Big labs like CERN have many users and standard procedures but that is not typical. Most researchers work in small groups, using different techniques and different instruments and it is a challenge to invest in large data projects if you aren’t a large organisation.

How you approach data sharing is another question, whether it is managed centrally or crowd sourced. There is talk of RDF triples, “publishing” small parcels of research results and data instead of the traditional journal approach. Getting to this point is a challenge though, especially as there is no academic credit for these small packets of information at the moment.
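As an editorial illustration of what a "small parcel" of results expressed as RDF triples might look like, here is a minimal Python sketch. It is not AIP's approach or any agreed community vocabulary: the example.org URIs, the property names, the helper functions and the measurement value are all hypothetical and chosen only to show the subject-predicate-object shape.

```python
from typing import List, NamedTuple

# Hypothetical namespace: example.org is a placeholder, not a real vocabulary.
BASE = "http://example.org/"


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def measurement_triples(result_id: str, quantity: str, value: float,
                        unit: str, author: str) -> List[Triple]:
    """Describe a single measured value as a handful of triples.

    All property names (quantity, value, unit, reportedBy) are assumptions
    made for this illustration.
    """
    subject = f"<{BASE}result/{result_id}>"
    return [
        Triple(subject, f"<{BASE}vocab/quantity>", f'"{quantity}"'),
        Triple(subject, f"<{BASE}vocab/value>", f'"{value}"'),
        Triple(subject, f"<{BASE}vocab/unit>", f'"{unit}"'),
        Triple(subject, f"<{BASE}vocab/reportedBy>", f"<{BASE}person/{author}>"),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise triples one per line in an N-Triples-like form."""
    return "\n".join(f"{t.subject} {t.predicate} {t.obj} ." for t in triples)


# Illustrative, made-up result: the numbers are not taken from any paper.
triples = measurement_triples("r1", "band gap", 1.12, "eV", "a-researcher")
ntriples = to_ntriples(triples)
print(ntriples)
```

Each line of the output is one self-contained statement, which is what would let such a parcel be shared and linked independently of a full article. The question of academic credit for such packets is not something a format can solve.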

What are AIP’s plans?

Mobile is a big part of our plans. We recently launched an iPhone application with very little marketing and have had several hundred downloads and several thousand PDFs delivered. For our newest title (Journal of Renewable and Sustainable Energy) we have a mobile edition that will work on any web-enabled smart phone. The ubiquity of web-enabled devices makes this a good way for publishers to engage with their communities. We’ve even started to have requests to develop a mobile version of our peer-review software.

It does come with a format challenge though. Delivering PDFs to mobile devices is not a happy experience. Formats like ePub are likely to offer better approaches and this is something that our Emerging Technology team is looking at.

Interview by Siân Harris