chemistry
A database approach to chemical reactions
There are almost too many resources for chemists hoping to unearth compound structures, organic reaction schemes, and even failed syntheses. David Bradley digs in to find out what can be mined.
When it comes to chemical facts, there is no better way to harness data than through a database. On the Web, on CD-ROM and in proprietary systems, there are countless resources available, ranging from selective, thematic databases, such as those offered by the likes of Accelrys, to the broad compilations from Beilstein and the Chemical Abstracts Service (CAS). Whatever the focus and relevance or the breadth of coverage required, there is a rifle or a shotgun to choose from.
ACCELRYS
Accelrys offers access to the likes of the Royal Society of Chemistry’s Methods in Organic Synthesis (MOS). This is a current awareness journal published monthly, which abstracts more than 100 internationally recognised organic chemistry journals. Accelrys’ electronic version of MOS is a highly selective designer database, essentially picking up where Springer’s ChemReact fails in that it focuses on novel synthetic methods from the literature. Hence, it picks up on functional group changes, carbon-carbon bond-forming reactions, new reagents and synthons, enzymatic and biotransformations, and ways of introducing protecting groups and important chiral, or handed, centres into a structure. The database adds about 3,300 reactions each year, is updated quarterly and currently stands at more than 33,000 indexed reactions going back to 1991. The Accelrys version is ISIS and Accord enabled.
Intriguingly, Accelrys also offers the chemist a glimpse of failure in the form of its recently launched ‘Failed Reactions’ database. This unique compilation lets chemists know about reactions that either reached a dead end with no product, or simply produced an entirely unexpected result. The database tells chemists where the reactions were published and thus helps them avoid other people’s mistakes. Coupled with MOS and used in conjunction with ChemReact, it lets you search for a particular target and dig out, from the many possible reaction schemes, the one with the best outcome and the fewest pitfalls. There are already thousands of Failed Reactions archived and Accelrys plans to add tens of thousands more over the next couple of years.
BEILSTEIN
Beilstein holds data on about eight million compounds, along with more than five million chemical reactions and 35 million associated chemical property and bioactivity records. These include data describing pharmacodynamics and environmental toxicology, transport, distribution and fate, all of which is essential information in drug discovery, for instance.
CAS
The ChemFinder search service is available as a standalone product for accessing the proprietary databases sold by Cambridge Soft, but it is also offered in web server format. This may not seem particularly novel as there are already countless search engines out there that can perform something similar. But Cambridge Soft reckons it has tightened up its chemical searching by working from a single master list of chemical compounds, so that users avoid the problems of mis-spelled ‘mehtyl’ groups and of reconciling ‘aluminium’ and ‘aluminum’ compounds, and get their results quickly to boot.
If you’re after detailed knowledge of chemical behaviour, then Springer’s ChemReact will provide several clues. It carries data on some 300,000 reactions abstracted from the chemical literature of 1974-1991, which obviously excludes many of the recent developments in chemical synthesis but nevertheless provides the stock in trade for the vast majority of reaction schemes that a chemist would employ. The database carries the reactant and product structures, necessary solvents, required reagents and catalysts and will also give you an idea of how high a yield you might expect and what side products may form.
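To make the shape of such a reaction record concrete, here is a minimal Python sketch. It is not ChemReact's actual schema or search interface: the class name, field names, the `best_route` helper and the placeholder compound names are all assumptions made for illustration. The fields simply mirror those listed above (reactants, products, solvents, reagents, catalysts, expected yield and side products).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReactionRecord:
    """Hypothetical record layout; not the real ChemReact schema."""
    reactants: List[str]
    products: List[str]
    solvents: List[str] = field(default_factory=list)
    reagents: List[str] = field(default_factory=list)
    catalysts: List[str] = field(default_factory=list)
    expected_yield: Optional[float] = None  # percent, when the source reports one
    side_products: List[str] = field(default_factory=list)


def best_route(records: List[ReactionRecord], target: str) -> Optional[ReactionRecord]:
    """Return the record producing `target` with the highest reported yield, or None."""
    candidates = [
        r for r in records
        if target in r.products and r.expected_yield is not None
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.expected_yield)


# Invented placeholder data, used only to exercise the sketch.
records = [
    ReactionRecord(["compound-A"], ["target-X"], solvents=["solvent-1"],
                   expected_yield=55.0, side_products=["byproduct-1"]),
    ReactionRecord(["compound-B"], ["target-X"], catalysts=["catalyst-1"],
                   expected_yield=82.0),
    ReactionRecord(["compound-C"], ["target-Y"], expected_yield=90.0),
]

best = best_route(records, "target-X")
```

A real reaction database would match on chemical structure rather than on name strings, so the string comparison here stands in for a much more involved substructure search.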
TRIPOS
Tripos produces a database of diverse and pure compounds that emerge from the recent advances in combinatorial chemistry and high-throughput screening. LeadQuest contains some 80,000 compounds and is still growing. The company uses its ChemSpace decision-making technology to design and select novel compounds at a rate of two trillion per hour, allowing it to sieve out biologically relevant molecules rapidly from the vast numbers of possible structures.
Tripos also makes available a special edition of the Chapman & Hall Databases in several configurations. The Dictionary of Pharmacological Agents, Dictionary of Organic Compounds, Dictionary of Natural Products, Dictionary of Inorganic and Organometallic Compounds, the US National Cancer Institute’s database of structures tested by NCI for carcinogenic activity and the Derwent World Drug Index (WDI) are all possible data-mining sources. Tripos offers three configurations: a 2D structure database that carries associated property data; a 3D coordinates version with no property data; and a complete version, probably the most useful but also the most expensive option, that includes 2D and 3D data and each compound’s properties. 3D coordinates are not, however, available for the Dictionary of Inorganic and Organometallic Compounds. The chemical structures within the databases are provided as pre-built and pre-indexed Unity format databases but are also in a format suitable for loading into Oracle Version 7.
NIST
CONCLUSION