From Aristotle to semantic analysis

The need to organise information efficiently and reliably is more important than ever, argues Allan Gajadhar

Libraries and academies have existed since ancient times to promote and order our understanding of the world. From earliest history, myths and legends arose that gave early humans some explanation of the natural world and its forces. Sages and thinkers in ancient cultures around the globe attempted to understand the natural and the supernatural through a variety of ontological concepts. The word ontology itself has its origins in ancient philosophy as the study of being. Early Western philosophers such as Aristotle produced some of the earliest attempts at categorising knowledge, as well as some of the earliest analyses of the basic notion of ontology.

Ontology in the modern world is very much related to the notion of categorisation – categorisation being the expression of structures that organise meaning. In today’s world, the fields of linguistics and information science have adopted these terms in service of the organisation of knowledge.  

With the explosion of information that began with the advent of publishing, organising information became a necessity. Librarians were among the first to define and use the notion of systematic categorisation of information. The notion of a taxonomy arose as a way to structure domain-specific knowledge effectively, making it accessible and useful. In today’s world of automation, big data and global connectivity, sensible methods of organising knowledge have become critical to the ability to find and make effective use of information in the vast universe of available data.

The need to find (and cite) relevant knowledge, and the concomitant difficulty in doing so, are fundamental concerns for modern researchers. This holds true in every discipline, including the studies of language, humans and society, and of economics, as well as in the ‘hard’ sciences. The task is even greater today, with the explosion of data in every academic, corporate and civic discipline that may have been digitised, but not linked into a broader universe of classification. Data is everywhere; the challenge is to find it, and to make sense of it by linking it to broader, well-known schemes of classification.

This is where modern notions of taxonomies and the resultant deep-linking of information provide the modern researcher with invaluable capabilities for finding knowledge.
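As a rough illustration of what linking a term to a broader scheme of classification can mean in practice, the sketch below models a tiny taxonomy as a mapping from each term to its broader term. The terms, the `BROADER` table and the function names are invented for this example and are not drawn from any real classification scheme.

```python
# Toy taxonomy: each term maps to its broader (parent) term.
# All terms here are illustrative assumptions, not a real scheme.
BROADER = {
    "aspirin": "analgesic",
    "ibuprofen": "analgesic",
    "analgesic": "drug",
    "statin": "drug",
    "drug": None,  # top of the hierarchy
}


def ancestors(term):
    """Return the chain of broader terms, from nearest to most general."""
    chain = []
    parent = BROADER.get(term)
    while parent is not None:
        chain.append(parent)
        parent = BROADER.get(parent)
    return chain


def narrower(term):
    """Return every term that falls under `term`, at any depth."""
    return sorted(t for t in BROADER if term in ancestors(t))
```

With a structure of this kind, a search for "analgesic" could be widened to documents that mention only "aspirin" or "ibuprofen", which is one simple sense in which a taxonomy makes scattered information findable.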

The life sciences industry offers an example of a field of knowledge where researchers need to make sense of highly diverse, internal and external, information flows stemming from sources such as biomedical literature, patents, clinical trial reports, healthcare records, specialised news outlets and, occasionally, even social media. Without the help of appropriate information management technologies, it has now become close to impossible for scientists and information professionals to innovate effectively and adhere to the demands of highly regulated, efficient information management.

Optimise scientific information management

Innovations rarely come out of a vacuum; it has now become essential for research and development organisations to be acutely aware of prior and ongoing work, both internally and in other teams across their industry. 

Given the massive amount of scientific literature available in proprietary and public content repositories, it is essential to be able to extract structured information and key insights from this content efficiently, whether to answer questions or to carry out tasks such as the following:

  • What are known targets and leads for a given indication?
  • What biomarkers might provide early indicators of drug effectiveness or disease prognosis?
  • For a given lead candidate, what are known adverse effects of similar compounds?
  • Obtain a reliable and comprehensive picture of the information available regarding negative events and other clinical risks.
  • Do similar compounds have side effects that might suggest repurposing for alternative therapeutic uses?
  • Rapidly identify areas of intervention and better understand causal factors, while monitoring the results of clinical events.

Semantic analysis discovers what content contains by interpreting its meaning and context, dramatically improving an organisation’s ability to use all the information available to it.

Life science and pharmaceutical companies can achieve a high level of precision and recall when responding quickly and accurately to FDA requirements, and can improve knowledge management across all their information assets. A solution based on semantic analysis can provide powerful automated disambiguation, classification, entity extraction and metadata generation to classify research content automatically, monitor feedback on drugs, gather physician comment and experience for future drug developments, and verify the strategies employed by sales and distribution channels.
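To give a concrete flavour of entity extraction and metadata, here is a deliberately minimal, dictionary-based sketch. It only matches known terms and maps synonyms to a preferred label; genuine semantic analysis relies on context to disambiguate and goes well beyond this. The vocabulary, the entity types and the `extract_entities` function are assumptions made for the example, not the interface of any particular product.

```python
import re

# Hypothetical vocabulary: surface form -> (entity type, preferred term).
VOCABULARY = {
    "aspirin": ("Drug", "aspirin"),
    "acetylsalicylic acid": ("Drug", "aspirin"),
    "headache": ("AdverseEvent", "headache"),
    "nausea": ("AdverseEvent", "nausea"),
}


def extract_entities(text):
    """Return (surface form, entity type, preferred term) for each hit, in order."""
    # Try longer terms first so multi-word names win over shorter overlaps.
    terms = sorted(VOCABULARY, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    results = []
    for match in pattern.finditer(text):
        surface = match.group(0)
        entity_type, preferred = VOCABULARY[surface.lower()]
        results.append((surface, entity_type, preferred))
    return results
```

Running this over a sentence such as "Patients on Acetylsalicylic acid reported nausea." would tag the drug under its preferred name "aspirin" and the adverse event "nausea", producing the kind of metadata that can then drive classification and search.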

Answers to questions such as those listed above may optimise the effectiveness of R&D organisations by accelerating innovation cycles and avoiding costly dead-ends.

Boost market intelligence

In the pharmaceutical industry – due to the combined onslaught of increasing regulatory pressure and sustained competition from generics – the traditional product development strategy rooted in fundamental research has become more cut-throat, and the corresponding pipelines are drying up. 

Similar trends are at play in other segments of the larger life sciences industry. Emphasis has now shifted to alternative strategies involving external growth and capitalisation on intellectual property. In this context, it is essential for organisations to:

  • Track competitors’ business development including partnerships and licensing;
  • Identify relevant acquisition targets with promising pipelines;
  • Gather competitive intelligence about business, communication, and research strategy; 
  • Leverage the FDA website to extract competitive information on approval process and phase, and understand which drugs are approved, which are not, and why, in order to identify virgin or underexploited areas of interest through white spot analyses;
  • Maintain real-time awareness of regulatory evolutions;
  • Minimise intellectual property infringement risks and associated legal costs; and
  • Detect sentiment relating to product or brand by understanding opinions.

Stay ahead of the competition

Each organisation’s view of the competition is specific to its own portfolio and business strategies. To identify trends, opportunities and threats, researchers and decision makers turn to a wide selection of information sources, including scientific publications, patents, news, clinical reports and user-generated content. Effective competitive intelligence (CI) requires sources that are semantically enriched, normalised, inter-connected and made available in a centralised location. The benefits of such an approach in the pharmaceutical industry include:

  • Up-to-date and relevant information on publications, patents, clinical trials, experts, news and pharmacovigilance for informed decision making;
  • Less time spent by CI experts looking for information, and more time spent analysing search results;
  • Access to hitherto untapped information, with the ability to chart the clinical trials landscape, including related news and results;
  • Identification of experts through their publication metrics and collaboration networks;
  • Mining of news and research for the latest scientific and biopharma business news (scientific discoveries, drug approvals, trials, regulatory news, conferences, and deals) and extraction of information from online publication repositories such as Medline, NIH Projects, and patents; and
  • Dynamic, timely information based on databases built around deep linking, revealing previously undiscovered connections, in real time.

The pharmaceutical and life sciences industries are a good example of the value that taxonomies and ontologies can generate in bringing order to the vast universe of available content. Semantic analysis can be of great value in understanding the meaning and context of information, and can dramatically improve its usability.


Allan Gajadhar is senior sales engineer at Expert System Enterprise