PATENT SEARCHING
The Web drags patents into 21st Century
Patent searching is a tricky business. Until recently the Web has done little to improve the situation - but that's all changing now. Vanessa Spedding reports.
Of all the sources of valuable research information, patents are among the most important and the least friendly. Properly surveyed they can give insights into the state of the art across any field. They can provide a window onto the activities of competitors, by disclosing the new products a company may be bringing to market (and the countries in which it's likely to market them) and by revealing which companies are involved in a given field of technology. They can also reveal the path that led to a particular invention, the science or logic behind the invention, and its intended application.
But the legal nature of patents lends them an uncompromisingly formal style: they are written in a language sometimes so abstruse that it does more to obscure the nature of the invention than to elucidate it. Not only that, the millions of patents already in existence are distributed across heterogeneous archives and databases, in each case coded and grouped according to one of several classification systems, and connected to related patents in other archives by another set of codes describing the family. Most recent patents - i.e. those filed since 1976 - have been digitised, but the records exist in different databases with different data formats and data structures.
A thorough search through these sources to locate current patents of interest requires an intimate knowledge of an array of software tools, search commands, searching techniques and classification systems, to say nothing of the structure and significance of the resulting patents themselves. Small wonder that patent searching is an expert's job. And it is a job that has never been more significant.
The financial business of profiting from ideas or proprietary knowledge alone has spawned the new 'knowledge economy,' which has accelerated to the point where entire businesses function on the basis of creating inventions, protecting them with patents or copyright, and either licensing them out or suing for infringement. Patents enable companies to seize markets, generate revenue, gain competitive advantage, and boost shareholder value. No surprise that the quickening pace of this new economy brings with it the high-price intellectual property (IP) lawsuits so frequently in the news in the high tech sector. For the IP manager or the inventor, being certain of ownership of the first patent on an invention, or hunting down others infringing that patent, is nothing less than mission-critical.
Until recently, whether the Web had made the job of patent searching any easier was a moot point. Many believed that the arrival of free access patent databases on the Web would change the nature of patent searching for good, for the first time making it widely accessible and eliminating the need to pay for subscriptions to commercial services or to employ searching professionals. The fact that these databases could be searched using simple keyword entry made them all the easier to use.
And indeed there are proponents of this approach who argue that it is entirely sufficient for 'patentability' searches where an overview of prior art is needed, if admittedly inadequate for the more rigorous infringement searches. US patent attorney Jerry Black, for example, has written a controversial book explaining how the free, full-text database of the USPTO, explored by means of keyword and fore- and back-citation linking, and combined with some basic knowledge of other free databases, can throw up everything you need. His claims are contentious - but can be scrutinised at www.usintellectualpropertyattorney.com.
Most patent professionals, however, agree that the main benefits of free-access Web databases are that they provide a low-cost means of doing initial background searches, and that's about it. The problem is that they suffer serious drawbacks for more crucial searches. For example: free databases generally come from the patent issuing authorities (usually national patent offices) so their content is restricted to those patents granted by that particular authority. There is no universal structure, so the same fields may not necessarily be searchable across different databases. There is no 'added value' - such as readable abstracts in plain English, which has given patent information-provider Thomson Derwent its enviable reputation. There are rarely any patent analysis technologies.
And they do not provide the option of sophisticated, command driven, Boolean searches as offered by powerful tools from host companies such as Dialog, Questel-Orbit and STN - which also allow parallel searches across several (commercial and free) databases at once.
More worryingly, a quick and easy search on a free site is extremely unlikely to uncover 'stealth patents' - one of the latest IP protection tricks. The authors of stealth patents deliberately choose obfuscating keywords and try to have their patents inappropriately classified in order that others' searches do not throw them up. The reason? A stealth patent can be a handy source of income if, once the technology has matured, others unwittingly infringe it.
Earlier this year Derwent conducted a survey to investigate the use of free and paid-for patent information, by information professionals, in 349 of the top 500 companies in Europe. The survey found that a two-tiered search process, with users exploiting free resources for a general overview before migrating to paid-for services for definitive information, is the preferred approach.
The paid-for method successfully used by the patent professionals has remained unchanged for years: sign up to one or more on-line hosts (see panel below); via them, take out subscriptions to the databases of interest and then dial up to conduct expert searches across relevant databases, using an exclusive, fast connection not hampered by Web traffic surges. These commercial providers offer access to patent collections throughout the world, as well as value-added patent information, analytical tools and other technologies. For completeness, they are usually used in conjunction with an inspection of the appropriate classes within the relevant classification system(s), which can lead to serendipitous discovery by browsing.
A number of these commercial providers have recently released innovative new functionalities alongside the search function.
For example, on-line host company Questel-Orbit has announced its 'PatCite' feature, which allows easy forward and backward citation searching. STN, the collaboration between FIZ Karlsruhe, the Japan Science and Technology Corporation and CAS, now offers access to its huge selection of databases via the Web, as well as via direct-dialup, and caters for different levels of expertise with products like STNEasy.
On a grander scale, patent information veteran Derwent has announced ambitious initiatives that exploit its growing assembly of information partners under the ever-expanding Thomson umbrella. It is rolling out a technology search tool that combines patent data with other sorts of research information.
Matt Brocklehurst, Global Marketing Manager, Engineering & Technology for Thomson Derwent, described his division's plans. 'We are moving from the patent space into the Internet space with the Derwent Web of Software and the forthcoming Web of Nanotechnology,' he explained. 'Because it has only been possible to patent software relatively recently, you must look at other sources to find the prior art. Our Web of Software provides a search engine that reveals the prior art from several sources, including journals, magazines, and selected Web sites. It is totally Web-based and we have been able to use the Web expertise of ISI [another Thomson company].'
Web of Nanotechnology will deploy the same concept on the same platform but with nanotechnology content, again because of the commercially embryonic state of the field. Brocklehurst is confident that the search engine behind these portals is one of the best and most comprehensive available, and is what the market needs. But what will it do for their more traditional patent data offerings? (To say nothing of on-line host Dialog - yet another Thomson company.)
'The Derwent Web of Software was not created to replace or compete with Derwent's existing products, such as the Derwent World Patent Index on the various online hosts,' he explained. 'The online hosts are best suited for situations where all searching is done by an information professional, and offer the most searchable fields. Although not quite as comprehensive in terms of searchable fields as the online hosts, the Derwent Web of Software still offers very detailed advanced searching, including field tags, combinable search sets, and saved searches.' The key thing about Derwent's 'Web of ...' concept, according to Brocklehurst, is that it offers an ideal solution for situations where the skill level of the searcher varies, from the end-user to the information professional.
It has inspired another product, the Derwent Innovation Index 3.0, which provides a means of searching value-added patent records from Derwent World Patents Index (DWPI) at the same time as patent citation information from Derwent Patents Citation Index (DPCI) - again, at all levels within an organisation, through a Web interface. Full-text documents are not available from Derwent, however; for that users must link to other outfits such as Delphion (see below) or MicroPatent.
Derwent is not the only company to recognise the Web's potential for augmenting the offerings of the commercial providers. The fact is, the arrival of the Web-based patent databases did achieve something profound for patent searching, which was to arouse the interest of researchers and managers who had not previously bothered themselves with the intricacies of the more archaic systems. The result is that people at all stages of the research process have started to expect easy Web access to patent archives.
Take Delphion. An IP management spin-out from IBM, Delphion offers an Internet-based service for researching and analysing worldwide patent and intellectual property information, called Delphion Research. Delphion Research has gained a reputation, not only for storing and providing Web access to full-text patent information from US, European, Japanese, and World Intellectual Property Organization (WIPO) patent authorities (and commercial databases such as Derwent's), but also for implementing productive tools to analyse and track market activities, and to view and download patent images. Earlier this year Delphion was selected for the 'Best of the Web' award by UK magazine PC World, and again, it seems it was access for all that did it. 'By breaking down the barriers to accessing and using patent information, Delphion has received tremendous response from companies and individuals worldwide,' said Woody Ritchey, its president and CEO.
Delphion claims to have the 'most complete patent record available anywhere' and indeed it offers access to more than 40 million records (compared with Derwent's 20 million+). But its focus is as much on the analysis as the data. It offers both simple and complex searches via the Web, on multiple collections, and also offers a range of productivity tools to help with the process of analysing results. So Delphion subscribers can 'drill down' to more detail about a patent or the text itself; they can export data from selected patent fields in mutually compatible formats for spreadsheet analysis; they can use the 'PatentLab' wizard to more easily construct 2D and 3D charts to help spot trends; they've had backward and forward citation linking for some months; they can cluster similar patents into groups and plot relationships between them, and so on.
Delphion sees itself as the pioneer of patent information on the Web. But it may just be about to be disabused of that belief. As this issue of Research Information goes to press, US IP management company Patent Café will be putting the finishing touches to its standardised, global patent database and an innovative new search engine, the combination of which it expects to create a genuine stir in the industry.
Patent Café has gone back to first principles to investigate the fundamental shortcomings of patent searching itself. Put aside, for a moment, the fancy analytics: if you're not finding the patent of key importance to your business, your search is probably worthless. To address the shortfalls of the combined keyword searching, classification browsing and citation/family linking approach, Patent Café has devised a search engine that recognises the concept of the patent, not just the words. And (no surprises here), it can be used via the Web, at various levels of sophistication and by people with different degrees of skill. Andy Gibbs, president and CEO of the company, explained.
'We index the patent database and create a "concept space" - which uses terabytes of storage space, as does the patent database. The software looks for mathematical relationships between words in and between patents. Each time it finds one, it plots it as a 3D mathematical vector in the concept space.'
It sounds confusing, but the results seem impressive. By means of these mathematical vectors, the engine can recognise the essence of a patent, even when the obvious keywords are missed out. For example, it will find a patent about a bicycle that deliberately does not use the word bicycle anywhere, if asked to look for 'bicycle' as the concept while 'NOT bicycle' is specified in the text search. It also finds patents using obscure non-English language keywords (and the results are not limited to the language of the keyword). Again, it's not absolutely infallible, but the USP, says Gibbs, is that Patent Café's new tool delivers a unique result set. He is confident that professional searchers will be keen to use it - probably in parallel with one or two other approaches or providers.
Patent Café clearly means business. Not only has it focused on producing a genuinely innovative search engine, it is also determined to offer the biggest and most universal patent database in the world. Its systems are currently uploading 300 patents a minute, 24 hours a day and at the time of writing they had passed the 40 million records mark. Gibbs wants the lot - all 60 million existing digital patent records - and the older ones that are still slowly being read in by OCR too, when they're ready. The system uses the new World Intellectual Property Organization (WIPO) format, which ensures that data fields are standardised, and this has been applied to the entire back file. All data is imported in XML format and 'normalised,' country by country, to comply with the new standard. The project has already been underway for two and a half years.
'And this,' said Gibbs, 'is only the precursor to where we're going'. Classification systems and Boolean searching may not have had their day yet, but there's no doubt that the nature of the digital patent information space is undergoing a radical shift. Signs are that what may now appear to the traditional patent researcher to be over-elaborate Web-based systems will one day represent the default interface to patent information.
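For readers who want a feel for the concept search Gibbs describes, the toy sketch below shows one way such a search could work in principle. It is not Patent Café's engine, whose internals are not described in any detail here. It assumes a simple word co-occurrence model, and the five patent texts, their IDs and all function names are invented for illustration. Each word is represented by the words it shares documents with, and a query for the concept 'bicycle' combined with 'NOT bicycle' in the text returns the record that describes a bicycle without naming it.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: IDs and texts are invented for illustration only.
PATENTS = {
    "P1": "bicycle with pedal crank chain and two wheels for a rider",
    "P2": "bicycle frame with saddle handlebar and pedal crank",
    "P3": "rider propelled vehicle with two wheels pedal crank chain and saddle",
    "P4": "pharmaceutical compound for treating infection in a patient",
    "P5": "bicycle wheels with spokes and a handlebar for the rider",
}
STOPWORDS = {"a", "and", "for", "in", "the", "with"}


def tokenize(text):
    """Lower-case the text, split on whitespace and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def build_concept_space(docs):
    """Map each word to a vector counting the words it shares documents with."""
    space = defaultdict(Counter)
    for text in docs.values():
        words = set(tokenize(text))
        for word in words:
            for other in words:
                if other != word:
                    space[word][other] += 1
    return space


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def doc_vector(text, space):
    """Represent a document as the sum of its words' concept vectors."""
    vec = Counter()
    for word in set(tokenize(text)):
        vec.update(space[word])
    return vec


def concept_search(concept, docs, space, exclude_word=None):
    """Rank documents by similarity to a concept, optionally skipping
    any document whose text contains exclude_word (the 'NOT' filter)."""
    query = space[concept]
    scored = []
    for pid, text in docs.items():
        if exclude_word and exclude_word in tokenize(text):
            continue
        score = cosine(query, doc_vector(text, space))
        if score > 0:
            scored.append((score, pid))
    return [pid for score, pid in sorted(scored, reverse=True)]


space = build_concept_space(PATENTS)
print(concept_search("bicycle", PATENTS, space, exclude_word="bicycle"))
# prints ['P3']
```

In this sketch the record P3 is found because its vocabulary (pedal, crank, chain, wheels, saddle) overlaps with the words that surround 'bicycle' elsewhere in the corpus, while the unrelated pharmaceutical record scores zero and is dropped. A production system would need far larger vocabularies, weighting schemes and dimensionality reduction, none of which is attempted here.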
Primary patent-issuing authorities and databases