Searching for science

John Murphy profiles Elsevier's free science search engine, Scirus

More mature users of electronic information remember the delights of Gophers and Veronicas. These tools worked mainly because so little information was published electronically. Once the whole world decided to publish online, they gave way to search engines. However, with so much information now online, general search engines tend to return more results than academic researchers can deal with, and their rankings do not always distinguish between genuine research and paranoid ravings.

This is why publishing giant Elsevier decided to create Scirus in 2001. The search engine is designed specifically for the academic scientific community and picks out scholarly material from general web results. Other companies, notably Google and Microsoft, have also produced their own versions. There is no business model to support such services, but Elsevier supports its free service as part of its attempts to broaden its relationships with the scientific community.

So how does such a tool help scientists? Joris van Rossum, head of Scirus, said the company began to realise that, with the rise of the web, there were many more places to find good scientific literature than simply in journals and databases. Many academics publish their work on their own websites, and institutions also have their own repositories. There is also a vast amount of ‘grey’ literature out there which, while not necessarily as authoritative as peer-reviewed journals, still has some value to scientists. The important thing, he explained, is to identify the quality of the information clearly and communicate this to the user, who can then use their own judgement.

Scirus has access to many journals, not just those available through Elsevier’s ScienceDirect platform. It also sorts through databases such as PubMed and free institutional repositories and marks them as ‘preferred web sources’. Searchers can also choose results from all web sources, and the results can be segmented into different categories. The search technology is designed to screen out all non-scientific sources so that, if a scientific word is also the name of a rock band, for example, the sites devoted to the rock band will be excluded. Van Rossum said: ‘Scientists derive a lot of benefit from the web but we wanted to make sure that they were able to find the information in the right way. We have other electronic products such as ScienceDirect, for full text, and Scopus, which is an abstracting and indexing service. Scirus captures the scientific web and delivers it to the scientific community.’

Since its launch, the service has been developed by forging relationships with institutional repositories and other publishers. A recent development has been the Library Partnership Programme, in which libraries share their subscription and resource information. This means that, when an article is located using Scirus, the searcher also sees a link, provided through a link resolver, to the full text available under their library’s existing subscription. About 300 major libraries have signed up to this service. If there is no subscription, there is always a ‘landing page’, which may offer a pay-per-view option.
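The choice between a subscription link and a landing page can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Scirus’s actual implementation: the library record, the resolver address, the ISSN, the DOI and the function name `full_text_link` are all hypothetical, and a real link resolver would expect its own query format.

```python
from urllib.parse import urlencode

# Hypothetical library record: a link resolver address plus the ISSNs
# of the journals the library subscribes to (all values are made up).
LIBRARY = {
    "resolver_base": "https://resolver.example.edu/openurl",
    "subscribed_issns": {"1234-5678"},
}


def full_text_link(article, library=None):
    """Return (kind, url) for an article found by the search engine."""
    if library and article["issn"] in library["subscribed_issns"]:
        # The searcher's library subscribes: point at its link resolver.
        query = urlencode({"issn": article["issn"], "doi": article["doi"]})
        return "subscription", f"{library['resolver_base']}?{query}"
    # No known subscription: fall back to the publisher's landing page,
    # which may offer a pay-per-view option.
    return "landing_page", article["landing_page"]


# Hypothetical article record.
article = {
    "issn": "1234-5678",
    "doi": "10.1000/example",
    "landing_page": "https://publisher.example.com/article/123",
}

print(full_text_link(article, LIBRARY))  # link via the library's resolver
print(full_text_link(article))           # landing page fallback
```

The point of the sketch is that the search engine itself holds no full text; it only needs to know which library a searcher belongs to and what that library subscribes to.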

Searches can be basic or advanced, and results can be filtered by keyword. Van Rossum said that, in future, it may be possible to create search engines that are specific to a field, such as chemistry or physics, with the engine returning results only from sources in that field.

Scirus publishes a considerable amount of information on how it finds and indexes sources, and on its ranking algorithm. It does so because it believes people in the scientific community are less likely to misuse this information, and because it recognises that scientists like to know everything about what they are doing.

‘In the scientific community the ranking algorithm is not the only consideration – it is also about our understanding of the scientific community. We give a structure to the unstructured web. We have a controlled seed list and we make sure that everything that goes into Scirus is genuinely scientific,’ said Van Rossum. ‘We are very candid about what we cover and the way we do it, and we make it very clear where the results have come from. It is very important to know if you are looking at a paper in The Lancet or a paper written by a first-year undergraduate.’
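As a rough illustration of what a controlled seed list makes possible, the Python sketch below keeps only results from approved hosts and groups them by source type, so the searcher can see where each result came from. It is a minimal sketch under stated assumptions, not a description of how Scirus works internally: the host names, the category labels and the function `group_by_source` are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical seed list: approved hosts mapped to a source category.
SEED_LIST = {
    "journals.example.org": "journal",
    "repository.example.edu": "institutional repository",
    "patents.example.gov": "patent",
}


def group_by_source(urls, seed_list=SEED_LIST):
    """Keep only URLs whose host is on the seed list, grouped by category."""
    grouped = defaultdict(list)
    for url in urls:
        host = urlparse(url).hostname
        category = seed_list.get(host)
        if category is not None:  # hosts not on the seed list are dropped
            grouped[category].append(url)
    return dict(grouped)


results = group_by_source([
    "https://journals.example.org/article/42",
    "https://fansite.example.com/band-tour-dates",  # not on the seed list
    "https://repository.example.edu/thesis/7",
])
print(results)
```

Filtering by host is the simplest possible reading of a seed list; screening out a rock band that shares its name with a scientific term would presumably also need some analysis of page content, which this sketch does not attempt.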

Scirus has access to about 450 million pages, about 25 million journal sources, 20 million patents and 127 databases including institutional repositories and thesis databases. It also has about one million regular users.

Van Rossum said that, as a free product, a search engine can be overlooked very easily. However, much of the technical effort that drives Scirus is also used within other electronic products from Elsevier. And of course, like all publishers, Elsevier benefits from its information being more visible and accessible. All of which means that the company is likely to carry on investing in Scirus for some time.