Users should be at the centre of Web 2.0 plans

For some, Web 2.0 is the next big thing; others are sceptical of yet more buzzword worship. But how will this new internet philosophy affect scholarly publishing? Charlie Rapple of Ingenta reports

There's been no hotter topic in 2006 than Web 2.0. Much has been made of its community engagement: putting research back into the hands of researchers and fulfilling the web's true potential for user interaction. Web 2.0 philosophy could be viewed as a threat to traditional publishers in that, amongst other things, it provides researchers with alternative channels for content dissemination. But there's no doubting the widespread nature of the attitudinal shift, and those embracing its technologies are well placed to expand, and thus protect, their role in the information chain.

Attempts to define the term “Web 2.0” engender heated debate amongst the technical community. It is broadly associated with several concepts and technology trends.

One of these is the semantic web: the sharing of structured data through interfaces (APIs) with which other web applications can interact. For publishers, whose revenue models often rely upon restricting access to their data, uptake is limited by issues of strategy rather than technology. And it's not just publishers who are cautious. Many librarians warn against the perils of social tagging, by which users of social software such as del.icio.us and flickr label data according to self-created taxonomies (known as folksonomies). Equally, many researchers are unconvinced of the accuracy, and thus the value, of user-created resources such as Wikipedia.
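
To illustrate the mechanics, the sketch below (Python, with invented bookmarks) shows how a folksonomy is nothing more than the aggregate of tags that individual users choose for themselves, with no controlled vocabulary imposed.

```python
# Sketch of social tagging: a 'folksonomy' emerges from whatever tags users
# choose, with no controlled vocabulary. All bookmark data here is invented.
from collections import Counter

bookmarks = [  # (user, url, tags), as a del.icio.us-style service might store
    ("alice", "https://www.nature.com/articles/x1", ["genomics", "toread"]),
    ("bob",   "https://www.nature.com/articles/x1", ["genomics", "dna"]),
    ("carol", "https://arxiv.org/abs/physics0601001", ["physics", "toread"]),
]

# The emergent, user-created taxonomy: tag frequencies across all users.
folksonomy = Counter(tag for _, _, tags in bookmarks for tag in tags)
print(folksonomy.most_common(3))
# [('genomics', 2), ('toread', 2), ('dna', 1)]
```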

Keeping up with the movement

But there's more than one way to keep abreast of a new wave. The Web 2.0 movement can simply be considered a way of engaging users: not just enabling them to interact, but helping them to access and manipulate the data they need. From this perspective, it is not difficult for publishers to see themselves as already doing things in a Web 2.0 way. Rather than trying to embrace the new order wholesale, publishers should build on the areas where it overlaps with their existing methodology. It is surprising how far publishers can engage without undermining their current revenue streams or strategies.

For example, publishers have long encouraged third-party use of open data, aiming to drive traffic to access-controlled content by making structured metadata openly available (to abstracting and indexing databases, or via Open Archives Initiative protocols) and by supporting predictable linking syntaxes.
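
As a concrete illustration, here is a minimal sketch of how a third party might harvest such openly available metadata over OAI-PMH. The repository base URL is hypothetical; the verb, parameters and namespaces are those defined by the protocol itself.

```python
# Minimal OAI-PMH harvest: fetch one page of Dublin Core records and print
# title/identifier pairs. The repository base URL is hypothetical; the verb,
# parameters and namespaces come from the OAI-PMH specification.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://oai.example-publisher.com/oai"  # hypothetical endpoint
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def list_records(metadata_prefix="oai_dc"):
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    )
    with urllib.request.urlopen(f"{BASE_URL}?{query}") as response:
        tree = ET.parse(response)
    for record in tree.iter(f"{OAI}record"):
        title = record.find(f".//{DC}title")
        identifier = record.find(f".//{DC}identifier")
        if title is not None and identifier is not None:
            yield title.text, identifier.text

for title, identifier in list_records():
    print(f"{title} -> {identifier}")
```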

Another example is remixing. More progressive implementations of RSS, such as 'recent content' or 'most viewed articles' feeds, are semantic web-friendly. This means that they can be retrieved and 'remixed' by another site (such as a library OPAC). One could further argue that our industry was an early adopter of remixing in its development of, and support for, federated searching. This espouses the seamless spirit of Web 2.0 by providing a single interface to multiple data sets.
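
The sketch below shows the remixing idea at its simplest: two publishers' RSS feeds (the URLs are invented) pulled apart and interleaved into a single, date-ordered list, much as a library OPAC might present them. It assumes well-formed RSS 2.0 with a pubDate on each item.

```python
# Remix two RSS 2.0 feeds into one date-ordered list, as an OPAC or portal
# might. Feed URLs are invented; assumes each <item> carries a pubDate.
import urllib.request
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

FEEDS = [
    "https://journals.example.com/rss/recent-content",  # hypothetical
    "https://otherpress.example.org/rss/most-viewed",   # hypothetical
]

def items(feed_url):
    with urllib.request.urlopen(feed_url) as response:
        channel = ET.parse(response).find("channel")
    for item in channel.iter("item"):
        yield {
            "title": item.findtext("title", ""),
            "link": item.findtext("link", ""),
            "date": parsedate_to_datetime(item.findtext("pubDate")),
        }

# Interleave everything by publication date, newest first.
remixed = sorted(
    (entry for url in FEEDS for entry in items(url)),
    key=lambda entry: entry["date"],
    reverse=True,
)
for entry in remixed[:10]:
    print(entry["date"].date(), entry["title"], entry["link"])
```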

Elsewhere, early adopters have created blogs (such as Ingenta's All My Eye) to complement or replace the role of traditional newsletters in publicising service developments and product announcements. The format lends itself well to syndication and thus increased use of the content. Blogs can also be tied in with specific journals as an extension to the discussion forums of the '90s. Enabling comments on postings can drum up debate and encourage usage of the journal articles to which they relate. This capitalises on the pre-existing status of a given journal as the centre of its community, and the freely-available content can serve to draw users into the paid-for papers.

The long tail

Another hot Web 2.0 concept in which publishers are already involved is the long tail: a business model that makes money from large numbers of low-demand products rather than from a small number of bestsellers. Before the internet, the economics were less favourable because of the high costs of producing and storing unpopular items. But the story is different when products are digitally created, stored, sold and distributed.
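
A toy calculation, with invented figures, shows why the model works once storage and distribution costs approach zero:

```python
# Toy long-tail arithmetic with invented figures: once items cost next to
# nothing to store and deliver, many tiny markets can outweigh the hits.
head = 100 * 1_000  # 100 bestsellers selling 1,000 copies each
tail = 50_000 * 5   # 50,000 niche titles selling 5 copies each

print(head)  # 100000 copies from the bestsellers
print(tail)  # 250000 copies from the long tail
```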

Specialist publishers were 'monetising the long tail' long before eBay, and have taken advantage of the e-journal revolution to reach further into niche markets. The idea can be taken further, though. Publishers could promote less-mainstream content within their sites by adding 'more like this' links from popular articles. They could also enable users to vote for articles, as happens on sites like Digg. Publishers could even, if they feel brave enough, post a 'least read' list to catch users' attention.
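
A 'more like this' feature need not be sophisticated to be useful. The hypothetical sketch below ranks articles by the overlap (Jaccard similarity) of their subject keywords; a production system would draw on richer signals such as citations or usage data.

```python
# Hypothetical 'more like this': rank other articles by Jaccard overlap of
# subject keywords. Titles and keywords are invented.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

articles = {
    "Protein folding dynamics":     {"proteins", "folding", "simulation"},
    "Simulating membrane proteins": {"proteins", "membranes", "simulation"},
    "The Victorian book trade":     {"history", "publishing"},
}

def more_like_this(title, top_n=2):
    keywords = articles[title]
    ranked = sorted(
        ((jaccard(keywords, other_kws), other)
         for other, other_kws in articles.items() if other != title),
        reverse=True,
    )
    return [other for score, other in ranked[:top_n] if score > 0]

print(more_like_this("Protein folding dynamics"))
# ['Simulating membrane proteins']
```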

Of course, there's no better way to maximise the visibility and use of all a publisher's content than enabling it to be indexed by Google. The technology giant is also credited with the rise of another Web 2.0 phenomenon: the mashup, in which publicly available data sets are combined to provide dynamic new services. The launch of the Google Maps application programming interface (API) in June 2005 encouraged a plethora of programmers to create applications that draw on Google Maps' data. An example is Map My Run, which also brings in data from the US Geological Survey to provide elevations of plotted routes.
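
In spirit, such a mashup is just a loop over one data set that calls out to another. The sketch below enriches plotted route points with elevations; the elevation endpoint and its response shape are entirely hypothetical, standing in for whichever public API a real mashup would call.

```python
# Mashup sketch in the Map My Run spirit: take route points plotted on a map
# and enrich them with elevations from a second service. The elevation
# endpoint and its JSON response shape are hypothetical stand-ins.
import json
import urllib.parse
import urllib.request

ELEVATION_API = "https://elevation.example.net/lookup"  # hypothetical

def elevation_m(lat: float, lon: float) -> float:
    query = urllib.parse.urlencode({"lat": lat, "lon": lon})
    with urllib.request.urlopen(f"{ELEVATION_API}?{query}") as response:
        return json.load(response)["elevation_m"]  # assumed field name

route = [(51.5074, -0.1278), (51.5080, -0.1260)]  # points plotted by the user
profile = [(lat, lon, elevation_m(lat, lon)) for lat, lon in route]
print(profile)  # each point now carries its elevation
```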

This idea can be extended to other types of data. At Ingenta we're piloting projects that use a variety of data sets to enrich the full text we host. Both OCLC and Talis have recently announced prizes for the best mashups.

Meanwhile, Google is not the only search player to board the Web 2.0 train. Other providers, such as Yahoo!, are developing social search tools that filter results based on folksonomies and user preferences. And new search engine Rollyo allows you to create a 'searchroll' to restrict results to sites you trust – a user-defined extension of the concept behind Elsevier's Scirus.
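
Stripped to its essentials, a searchroll is a whitelist applied to search results. The sketch below (with invented results and trusted sites) keeps only hits from domains the user has chosen to trust.

```python
# A 'searchroll' reduced to its essence: keep only results whose host is on
# the user's trusted list. The result set and trusted sites are invented.
from urllib.parse import urlparse

TRUSTED = {"www.nature.com", "arxiv.org", "www.ingenta.com"}  # user-defined

results = [  # in practice these would come from a search engine's API
    {"title": "Gene expression study", "url": "https://www.nature.com/articles/x1"},
    {"title": "Miracle cure!", "url": "https://spam.example.com/buy-now"},
    {"title": "Folding preprint", "url": "https://arxiv.org/abs/physics0601001"},
]

for result in results:
    if urlparse(result["url"]).hostname in TRUSTED:
        print(result["title"], "-", result["url"])
```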

In spite of this technolust, we should remember that we're a long way from critical mass. Unsophisticated users make up the majority, and they aren't interested in the more collaborative aspects of Web 2.0. What they do want from it is an information-rich user experience. This means more data to supplement the published literature, such as the additional details necessary to reproduce an experiment, or a means of feeding back responses to authors and so engendering discussion that could further the research. Informal communication media (technical presentations, conference papers, pre-prints, even email and phone discussions) can be harnessed to strengthen the message of formal communication channels and to counteract the slow pace of the formal process.

A range of technologies can be employed to support this. One example is community 'trust' mechanisms, which can verify the expertise of participants in the same way that eBay feedback attests to the reliability of buyers and sellers.
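
At its simplest, such a mechanism aggregates positive and negative feedback into a score, withholding judgement on participants with little history. A minimal sketch, with an invented threshold:

```python
# eBay-style trust score: aggregate feedback into a percentage, and decline
# to rate participants with too little history. The threshold is invented.
def trust_score(positive: int, negative: int, min_feedback: int = 5):
    total = positive + negative
    if total < min_feedback:
        return None  # not enough history to verify expertise
    return round(100 * positive / total, 1)

print(trust_score(48, 2))  # 96.0 -> an established, reliable participant
print(trust_score(2, 0))   # None -> too few interactions to judge
```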

Sophisticated data storage is another component in supporting this. Ingenta's new Metastore is a component-based data repository which allows us to store and deliver raw research data alongside the associated journal article. The technology behind it, RDF, is popular amongst Web 2.0 advocates because its common data model makes it readily extensible and remixable. Its flexibility allows us to extend the formats in which research results can be communicated, and to embrace the informal media that more traditional online publishing technologies preclude.
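
Metastore's internal schema isn't described here, but the flavour of RDF can be conveyed with a generic sketch using the open-source rdflib library: a journal article linked to its supplementary raw data as triples, extensible simply by adding more statements. All URIs are illustrative.

```python
# Generic RDF sketch (not Ingenta's actual Metastore schema): a journal
# article linked to its supplementary raw data, using the rdflib library
# and Dublin Core terms. All URIs below are illustrative.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, RDF

DCMITYPE = Namespace("http://purl.org/dc/dcmitype/")
EX = Namespace("https://repository.example.org/")  # hypothetical repository

g = Graph()
article = EX["article/10.9999/jabc.2006.001"]
dataset = EX["data/jabc.2006.001/results.csv"]

g.add((article, RDF.type, DCMITYPE["Text"]))
g.add((article, DCTERMS.title, Literal("A study of protein folding")))
g.add((dataset, RDF.type, DCMITYPE["Dataset"]))
g.add((article, DCTERMS.hasPart, dataset))  # raw data sits alongside the paper

# Because RDF is just triples, new media types extend the graph without
# any schema migration: simply add more statements.
print(g.serialize(format="turtle"))
```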

In the longer term we anticipate that authors themselves could add supplementary data directly to Metastore. Although author self-archiving of papers is currently sluggish, the take-up of collaborative enterprises such as Nature's Omics and Signaling gateways suggests it is not unreasonable to expect stronger support in time.

Fitting a business model

Key to the availability of such data is the business model under which it can be accessed. Whilst Web 2.0 is lauded for going hand-in-hand with open source, such generosity is not compulsory. Nonetheless, a flexible e-commerce framework is advantageous in encouraging maximum usage. Of equal value is granular addressability of content, whereby URLs are clean, compact, assigned at the lowest possible level and, preferably, predictable. Interoperability is clearly critical to the collaborative environment and, as elsewhere, work towards standardisation in this area will pave the way for further uptake.
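
What 'predictable' means in practice is that a link can be constructed rather than discovered. The template below is invented, but illustrates the kind of clean, article-level syntax in question:

```python
# Invented URL template illustrating granular, predictable addressability:
# the link can be built from citation data rather than looked up.
BASE = "https://content.example-publisher.com"  # hypothetical host

def article_url(issn: str, volume: int, issue: int, first_page: int) -> str:
    """Clean, compact URL assigned at article level, guessable by machines."""
    return f"{BASE}/content/{issn}/{volume}/{issue}/{first_page}"

print(article_url("1234-5678", 12, 3, 145))
# https://content.example-publisher.com/content/1234-5678/12/3/145
```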

In summary? Whilst early adopters are thriving on the additional functionality that Web 2.0-styled services can supply, the majority of researchers continue to have relatively simple requirements. Some publishers still need to focus on delivering the basics successfully before expending resources on the bells and whistles. And even when they do turn their attention to new technology developments, it is critical to serve our communities appropriately, with data and tools that add genuine value to their workflow.

Given the frenzied debate around Web 2.0, it seems inevitable that the usage of the term will decline as providers try to dissociate from the media hype. Many in the industry are predicting a dotcom-style bust, as the bubble bursts for operations trading heavily on their Web-2.0-ability. If it does, those who survive – like their dotcom-era predecessors – will be those who have taken steps to provide user-focused services whilst maintaining a strategy with substance, not hype, at its core.

Charlie Rapple is head of marketing at Ingenta

Further Information

The websites, companies and products mentioned in this article can be found at the following URLs:

del.icio.us: del.icio.us
flickr: www.flickr.com
Wikipedia: www.wikipedia.org
eBay: www.ebay.co.uk
All My Eye: allmyeye.blogspot.com
Digg: www.digg.com
Google: www.google.com
Google Maps: maps.google.co.uk
Map My Run: www.mapmyrun.com
Ingenta: www.ingenta.com
OCLC: www.oclc.org
Talis: www.talis.com
Yahoo!: www.yahoo.com
Rollyo: rollyo.com
Scirus: www.scirus.com
Omics: www.nature.com/omics
Signaling: www.signaling-gateway.org