What is Cheminformatics?
The size of the information problem in chemistry is staggering. It can be judged from the fact that Chemical Abstracts Service adds over 700,000 new compounds to its database annually. Massive amounts of physical and chemical property data are generated each year for new and existing chemical substances. Such an avalanche of data can bury a chemical research project unless ways can be found to cope with it. Fortunately, those trained in chemical informatics can provide tools to acquire, organize, and evaluate data, tools that yield new insights for further chemical research.
Chemical informatics companies combine molecular simulation and data analysis techniques with high-quality graphical visualization. Chemical informatics thus helps chemists investigate new problems and organize and analyze scientific data to develop novel compounds, materials, and processes through the application of information technology.
Following are the major aspects of cheminformatics:
Information Acquisition: Methods used for generating and collecting data empirically (experimentation) or from theory (molecular simulation).
Information Management: Storage and retrieval of information.
Information Use: Data analysis, correlation, and application to problems in the chemical and biochemical sciences.
Information Acquisition:
Information acquisition is highly dependent on the computer today. With the integration of modern sensors into chemical instrumentation, the volume of data that can be generated is enormous. Future instrumentation will incorporate information from existing chemical databases, employ modeling techniques, and analyze experimental data as they are generated. Such "smart instruments" will significantly improve the ability of the user to make intelligent decisions about the course of an experiment while the data are being collected and analyzed.
There now exist two complementary pathways for generating and collecting information in the chemical sciences: experimentation and computer simulation. Traditionally, the gathering of data from experiments was done manually, but with the development of computers small enough to be purchased by individual laboratories, the phrase "computers in chemistry" arose to describe their use. Several decades ago this expression meant interfacing a computer to an instrument such as a spectrometer or a chromatograph and collecting the data in real time for storage and later manipulation. While this is still being done with microprocessors built into the instruments themselves, a more encompassing label for the wide range of chemical activities involving computers is "computational chemistry."
Computational chemistry seeks to predict molecular and biomolecular structures, properties, and reactivity quantitatively by computational methods alone. It uses modern chemical theory to predict the rates of unknown reactions and the synthetic sequences by which complex new molecules can be made most efficiently. Computational chemistry allows chemists to explore how things work at the atomic and molecular levels and to draw conclusions that are impossible to reach by experimentation alone. Thus, computational chemistry supplements experimentally derived data.
One aspect of computational chemistry is molecular modeling. Molecular modeling involves the investigation of three-dimensional molecular structures using classical and quantum mechanical methods assisted by computer graphics. Other molecular modeling techniques include quantitative structure-property relationships, which find applications in structure-based drug design, similarity searching, and molecular shape prediction. Molecular modeling techniques are utilized extensively in pharmaceutical research, especially to predict pharmacophores (the structural features of molecules required for particular biological activities). Molecular modeling is now used routinely to generate data concerning energetics, dynamics, and other information at the molecular scale that is not amenable to experimentation.
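A quantitative structure-property relationship can be illustrated, in its simplest form, as a straight-line fit between one numerical descriptor of a molecule and one measured property. The sketch below is only an illustration under stated assumptions: the descriptor and property values are synthetic numbers made up for the example, the function names fit_line and predict are hypothetical, and real studies typically use many descriptors and more robust statistical methods.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of the descriptor from its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    # Sum of cross-deviations between descriptor and property.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Estimate the property of a new compound from its descriptor value."""
    slope, intercept = model
    return slope * x + intercept


# Synthetic (made-up) training data: one descriptor value and one
# property value per compound. These are NOT real measurements.
descriptor = [1.0, 2.0, 3.0, 4.0]
measured_property = [3.0, 5.0, 7.0, 9.0]

model = fit_line(descriptor, measured_property)
estimate = predict(model, 5.0)  # property estimate for an unmeasured compound
print(model, estimate)
```

The fitted line is then used to estimate the property of a compound that has not been measured, which is the basic idea behind using such relationships for prediction.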
Recent advances in combinatorial synthesis and high-throughput screening technologies now allow a single company to prepare and analyze hundreds of thousands of molecules each year. Combinatorial chemistry techniques grew out of several disciplines, including organic, medicinal, and physical chemistry, engineering and robotics, computational chemistry, informatics, and screening technology. Robotics as used in combinatorial chemistry provides the drug industry a powerful tool with which to screen millions of potential compounds in a fraction of the time it would have taken to evaluate even a few dozen compounds a decade ago. Now widely employed in the pharmaceutical area, combinatorial chemistry has begun to find applications in materials science. Because so much information is being generated and collected from combinatorial technologies, there is a concomitant problem associated with storing and retrieving those data. That problem is now being addressed by those skilled in chemical informatics.
Information Management:
Many of the applications for storing and retrieving chemical data have grown out of the rapid developments in chemical structure coding and searching. The advances in structure-based applications have led to integrated chemical information systems, more and more of which have Web interfaces, and to specialized applications such as Laboratory Information Management Systems (LIMS). The ability to search large secondary databases such as Chemical Abstracts or Medline easily and precisely and to move seamlessly back and forth between the original primary journal literature and the abstracting and indexing databases is one of the truly great achievements of modern chemical informatics research.
Chemists have developed their own communication system (chemical nomenclature and structure systems) that adds a unique dimension to informatics. There is a confluence of activities in chemical informatics that is centered on the chemical structure (both 2-D and 3-D depictions). Two-dimensional chemical structural databases have evolved from traditional chemical structure diagrams into structure searching and substructure searching systems. In the late 1980s, attention turned to 3-D structure searching and representations of chemical structures in three dimensions. More recently, techniques for the full description of the conformational space of flexible molecules and similarity searching techniques have been developed. These are now being incorporated into chemical information storage and retrieval systems.
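One common way to make similarity searching concrete is to describe each structure by a set of structural features (a "fingerprint") and to score two structures by the Tanimoto coefficient, the number of shared features divided by the number of features present in either. The sketch below assumes that approach, which the text above does not specify. The fingerprints are hypothetical sets of integer feature identifiers invented for the example, and the names tanimoto and rank_by_similarity are illustrative, not part of any particular retrieval system.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        # Assumption: two empty fingerprints are scored as 0.0.
        return 0.0
    return len(a & b) / union


def rank_by_similarity(query, library, threshold=0.0):
    """Return (name, score) pairs scoring at least threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural
# feature that is present in the molecule. Not derived from real data.
library = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {2, 3},
    "mol_c": {7, 8},
}
query = {1, 2, 3}

for name, score in rank_by_similarity(query, library, threshold=0.5):
    print(name, round(score, 2))
```

Structures sharing no features with the query score zero and fall below the threshold, so only the closest matches are returned, in ranked order.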
Information Use:
The computer has enabled chemists to analyze and correlate data from massive chemical and biochemical databanks, and when coupled with chemical visualization and modeling techniques, it is revolutionizing chemical research. Informatics techniques help create an integrated information environment in which all aspects of chemical research and development can be dealt with in a unified system. Not only can chemical structures be used as search keys in such systems, but also unknown properties and spectra can be predicted using chemical informatics tools and techniques that draw on the existing knowledge base of chemistry. Data mining has emerged as a significant factor in the reassessment of data collected over time in an organization. Chemists can now access decades of raw data stored in disparate formats and obtain useful results to build on the research that has taken place in past years.