E-mail: Uwe.Boehme@chemie.tu-freiberg.de
E-mail: Silke.Tesch@ub.tu-freiberg.de
Abstract: Information management via the computer is essential in the chemistry curriculum. Online databases, in-house databases and chemical information on the Internet deliver valuable information for chemists. The article describes the uses of information technology in a chemistry course. The course is a regular part of the chemistry curriculum at the Bergakademie Freiberg. A broad range of databases and information resources is presented throughout the course: online catalogues and electronic journals, chemical information on the Internet, Chemical Abstracts, the Beilstein and Gmelin databases, crystallographic databases, spectroscopic data from Specinfo, reaction databases, patent information and patent searching. A balance of different pedagogical components is essential for the success of the course. Incorporating these lessons into the curriculum allows students to search for chemical information and gives them the basic skills needed for a professional career in the chemical industry or at research institutions.
Access to chemical information is a cornerstone of chemistry. Therefore, information management via the computer is essential in the chemistry curriculum. Online databases, in-house databases and chemical information on the Internet deliver valuable information for chemists.
Specific problems arise when someone searches for chemical compounds. About 26 million compounds are known today. Every single chemical compound has a number of different chemical and physical properties, from melting point, solubility and spectral data to preparation and reactivity. These data add up to an enormous pile of information. How can we get specific information about our problem out of this pile of data? It is impossible to search the whole printed literature at once. Additional electronic media are now available, which also have to be considered.
Furthermore, key words are often not adequate for searching structures and chemical properties, so a search by structure or by substance data becomes necessary. For example, it can be important to search for certain substance properties when compounds with tailor-made properties are needed for applications in materials science. In pharmaceutical research or drug design, it may instead be necessary to search for compounds with certain functional groups.
We will demonstrate the uses of information technology in a chemistry course. The course is a regular part of the chemistry curriculum at our university. We are, of course, aware that there are other courses on this topic. But we have worked hard over the last few years to prepare this course and we believe we have created an interesting, modern and useful course for our students. We hope this report about our experience will inspire some colleagues to create similar courses.
Access to chemical information is possible via different methods:
The chemistry curriculum in Freiberg
Semester | Subject |
1. | Math and Physics | Inorganic Chem. |
2. | Physical Chem. | Analytical Chem. |
3. | Organic Chem. |
4. | |
5. | Biochemistry, Toxicology |
6. | Technical Chem. |
Examination for Baccalaureus |
7. | Computer Science | 3 Topics Advanced Courses |
8. | Chemical Information |
9. | |
10. | Diploma |
Examination for Diploma Thesis |
The following databases and information resources are presented throughout the course:
These topics cover the whole range of electronic information available today.
The lesson on chemical information on the Internet covers:
We discuss search strategies for accessing chemical information on the Internet and use examples to train these strategies. The Internet is extremely helpful for obtaining general information about a subject and for searching for publications by author.
Some of the training examples are designed to indicate the limitations of the Internet:
Example for Training Lessons: Search for "lectures in crystallography"
The search was performed with the general term "crystallography" and the specific term "lectures in crystallography". Table 2 shows the results of the search, obtained in June 2001.
Table 2. Example for search results with selected search engines.
Search Engine / Directory | General Term "crystallography" | Specific Term "lectures in crystallography" |
Yahoo | 2 categories, 36 sites | 3290 ("and") |
Altavista | 233 716 | 5 000 |
Excite | 12 740 | 50 |
| 148 000 | 27 |
Lycos | 214 524 | 6381 |
Metacrawler | 32 | 0 |
Chemie.de | 17 733 | 599 |
The meta search engine Metacrawler retrieves data from other search engines. It shows only 32 documents for "crystallography" because the number of results per source is restricted (10 hits per search engine). The chemistry portal Chemie.de finds a remarkable number of documents.
Fast and reasonable access to this subject is provided by the systematic directory and by searching for the specific term in some search engines. The results will differ for other search questions, so it is often better to try several search engines.
Online and in-house databases are important as comprehensive sources of information. They cover a specific part of chemical knowledge independent of single journals, publication years or the physical presence of the printed book or journal.
Electronic databases offer the following advantages:
The Scientific and Technical Information Network (STN International) offers information from over 200 databases covering a broad range of scientific fields, including chemistry, engineering, life sciences, pharmaceuticals, biotechnology, and patents. Various search questions in chemistry can be answered with these databases.
All databases at STN can be searched by using the Messenger command language.
Basic searches in Messenger need only a few very simple commands (see Commands in Table 3).
We discuss additional special commands and logical operators for constructing complex search profiles and solving specific problems (boolean operators and wild cards in Table 3).
Table 3. Short overview of the retrieval language Messenger.[1]
Commands | |
expand | View index |
search | Search |
display | Display results |
Use of Boolean operators | |
AND | Two or more search terms in the same record |
OR | Either search term in the same record |
NOT | Retrieves the first term but not the second term |
Wild Cards or Truncation | |
! | Adds one variable character to your term |
# | 0 or one character |
? | Any number of characters |
Search Combined Terms | |
(A) | Terms must be adjacent, in any order |
(W) | Terms must be adjacent, in the order given |
(L) | Terms must be in the same field |
search crystal? (L) structure# | |
Search in specific fields | |
search waymouth, r?/au | Search in author field |
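The effect of the truncation symbols in Table 3 can be illustrated outside of STN with ordinary regular expressions. The following is only a minimal sketch in Python, not part of Messenger itself: the function names are hypothetical, and it assumes that "?" also matches zero characters.

```python
import re

# Assumed mapping of the Table 3 truncation symbols to regex fragments.
WILDCARDS = {
    "!": ".",    # exactly one variable character
    "#": ".?",   # zero or one character
    "?": ".*",   # any number of characters (assumed to include none)
}


def messenger_term_to_regex(term: str) -> re.Pattern:
    """Translate one search term with truncation symbols into a regex."""
    parts = [WILDCARDS.get(ch, re.escape(ch)) for ch in term]
    return re.compile("".join(parts), re.IGNORECASE)


def matches(term: str, word: str) -> bool:
    """Return True if the whole word is covered by the truncated term."""
    return messenger_term_to_regex(term).fullmatch(word) is not None


if __name__ == "__main__":
    for term, word in [
        ("crystal?", "crystallography"),
        ("structure#", "structures"),
        ("structure#", "structureless"),
        ("wom!n", "women"),
    ]:
        print(f"{term!r} vs {word!r}: {matches(term, word)}")
```

With this mapping, crystal? covers "crystal", "crystals" and "crystallography", whereas structure# covers only "structure" and "structures".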
Ways to Search
Searching by chemical names or molecular formulas is often not suitable for retrieving a specific chemical compound or a chemical structure. Chemical names for large molecules can be very complicated, so a single error in the chemical name returns zero search hits. A search for a molecular formula, on the other hand, retrieves all isomers with the same composition.
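Both pitfalls can be shown with a toy example. The sketch below is written in Python and uses a small hand-made compound list; the list and the function names are illustrative assumptions and do not represent any database interface.

```python
# Toy compound list: name plus element counts (illustrative data only).
COMPOUNDS = [
    {"name": "ethanol", "atoms": {"C": 2, "H": 6, "O": 1}},
    {"name": "dimethyl ether", "atoms": {"C": 2, "H": 6, "O": 1}},
    {"name": "methanol", "atoms": {"C": 1, "H": 4, "O": 1}},
]


def hill_formula(atoms: dict) -> str:
    """Build a molecular formula in Hill order (C, then H, then alphabetical)."""
    if "C" in atoms:
        order = ["C"] + (["H"] if "H" in atoms else [])
        order += sorted(el for el in atoms if el not in ("C", "H"))
    else:
        order = sorted(atoms)
    return "".join(f"{el}{atoms[el] if atoms[el] > 1 else ''}" for el in order)


def search_by_formula(formula: str) -> list:
    """Return every compound whose composition matches the formula."""
    return [c["name"] for c in COMPOUNDS if hill_formula(c["atoms"]) == formula]


def search_by_name(name: str) -> list:
    """Exact name search: a single typing error gives no hit at all."""
    return [c["name"] for c in COMPOUNDS if c["name"] == name.lower()]


if __name__ == "__main__":
    print(search_by_formula("C2H6O"))  # both isomers are returned
    print(search_by_name("ethanl"))    # misspelled name: empty list
```

The formula search cannot distinguish ethanol from dimethyl ether because the formula carries no information about connectivity.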
The way out of this dilemma is to search for a chemical structure. This can be done through the graphical user-interface provided by STN-Express (see Figure 1). The building and retrieval of complex chemical structures is the next step in the work with online databases.
Chemical Abstracts (CA)
This database covers all areas of chemistry and chemical engineering. CA is a bibliographic database which contains secondary information. A bibliographic database describes the contents of a publication. It includes bibliographic data (author, title, journal), key words and a brief description of the content of the publication.
CA is the database with the broadest coverage of all chemistry databases. It gives an overview of the available chemical literature and provides access to the chemical literature for any given search problem. Sources for CA include more than 8,000 journals, patents, technical reports, books, conference proceedings, and dissertations.[2]
Chemical Abstracts consists of two databases: The Registry file and Chemical Abstracts itself (see Scheme 1).
Scheme 1. Use of Registry and Chemical Abstracts.
Registry | Chemical Abstracts |
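The interplay of the two files can be pictured as a two-step lookup: the substance is first identified in the Registry file, and its registry number is then used to find the literature in Chemical Abstracts. The Python sketch below illustrates only this idea. The document identifiers and titles are hypothetical, and the data structures are assumptions, not the actual file layout.

```python
# Step 1 data: substance name -> CAS Registry Number.
REGISTRY = {
    "ferrocene": "102-54-5",
    "benzene": "71-43-2",
}

# Step 2 data: hypothetical bibliographic records indexed with registry numbers.
CA_RECORDS = [
    {"id": "doc-1", "title": "Hypothetical paper on ferrocene synthesis",
     "rn": {"102-54-5"}},
    {"id": "doc-2", "title": "Hypothetical paper on ferrocene in benzene",
     "rn": {"102-54-5", "71-43-2"}},
    {"id": "doc-3", "title": "Hypothetical paper on benzene oxidation",
     "rn": {"71-43-2"}},
]


def find_literature(substance_name: str) -> list:
    """Identify the substance first, then collect all records that index it."""
    rn = REGISTRY.get(substance_name.lower())
    if rn is None:
        return []  # substance not identified, so no literature search is possible
    return [rec["id"] for rec in CA_RECORDS if rn in rec["rn"]]


if __name__ == "__main__":
    print(find_literature("ferrocene"))
    print(find_literature("unknown compound"))
```

The advantage of this design is that the literature search no longer depends on how an author happened to name the compound.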
Beilstein and Gmelin
Contents of Beilstein [5]
The Beilstein File includes information about:
Scheme 2. Examples Beilstein.
Which preparations are described for the synthesis of chloroacetylene (RN=593-63-5) starting from 1,1-dichloroethene (BRN=1733365)?
=> search 593-63-5/rn and 1733365/pre.sm
1 593-63-5/RN
4 1733365/PRE.SM
L10 1 593-63-5/RN AND 1733365/PRE.SM
=> display hit
Search compounds with a boiling point between 159 and 171°C at 750 to 760 torr!
=> search 159-171/bp (p) 750-760/bp.p
490 159 CEL - 171 CEL /BP
146 750 TORR - 760 TORR /BP.P
L9 9 159 CEL - 171 CEL /BP (P) 750 TORR - 760 TORR /BP.P
=> display hit
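The second example in Scheme 2 links the temperature range and the pressure range with the (P) operator. A plausible reading, assumed here, is that both values must belong to the same boiling point measurement. The following Python sketch with invented records shows why such a link matters; it does not reproduce the Beilstein file or its search engine.

```python
from dataclasses import dataclass


@dataclass
class BoilingPoint:
    temp_c: float         # boiling temperature in °C
    pressure_torr: float  # pressure at which it was measured


# Invented records: each compound may carry several measurements.
RECORDS = {
    "compound A": [BoilingPoint(165.0, 760.0)],
    "compound B": [BoilingPoint(165.0, 20.0), BoilingPoint(250.0, 755.0)],
    "compound C": [BoilingPoint(120.0, 760.0)],
}


def in_range(value: float, bounds: tuple) -> bool:
    low, high = bounds
    return low <= value <= high


def search_linked(t_range: tuple, p_range: tuple) -> list:
    """Both conditions must hold for one and the same measurement."""
    return [name for name, bps in RECORDS.items()
            if any(in_range(bp.temp_c, t_range)
                   and in_range(bp.pressure_torr, p_range) for bp in bps)]


def search_unlinked(t_range: tuple, p_range: tuple) -> list:
    """Conditions may be satisfied by different measurements (plain AND)."""
    return [name for name, bps in RECORDS.items()
            if any(in_range(bp.temp_c, t_range) for bp in bps)
            and any(in_range(bp.pressure_torr, p_range) for bp in bps)]


if __name__ == "__main__":
    print(search_linked((159, 171), (750, 760)))
    print(search_unlinked((159, 171), (750, 760)))
```

Compound B is a false hit in the unlinked search: its temperature of 165°C was measured at 20 torr, not near atmospheric pressure.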
Gmelin contains over 800 different chemical and physical property fields and a deep, detailed index of the original literature (for an example, see Figure 2). Broad categories of data found in the database include:
Because of its strength in materials, Gmelin is a logical choice for researchers in materials science and electronics. However, the database's detailed information on organometallics also makes it an invaluable tool in synthetic chemistry. Organometallics mediate many unique syntheses. With Gmelin, chemists can find out more about the chemical properties of particular catalysts and perhaps identify new and better catalysts for running certain reactions.
Gmelin provides a special search strategy for coordination compounds which is found in no other database: the ligand search system. This superior search method gives access to coordination compounds from a completely different point of view: it is possible to retrieve all coordination compounds with the same ligand environment, independent of the central atom or the sum formula of a compound (see Figure 3).
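The principle of a ligand-based search can be sketched in a few lines of Python. The example compares only the set of ligands and ignores the central atom. The small list of complexes and the function name are illustrative assumptions and say nothing about how Gmelin implements the search internally.

```python
from collections import Counter

# Small illustrative list of coordination compounds.
COMPLEXES = [
    {"formula": "[Fe(CN)6]4-", "center": "Fe", "ligands": ["CN"] * 6},
    {"formula": "[Co(CN)6]3-", "center": "Co", "ligands": ["CN"] * 6},
    {"formula": "[Co(NH3)6]3+", "center": "Co", "ligands": ["NH3"] * 6},
    {"formula": "[PtCl2(NH3)2]", "center": "Pt",
     "ligands": ["Cl", "Cl", "NH3", "NH3"]},
    {"formula": "[PdCl2(NH3)2]", "center": "Pd",
     "ligands": ["Cl", "Cl", "NH3", "NH3"]},
]


def ligand_search(ligands: list) -> list:
    """Return all complexes with exactly this ligand environment,
    regardless of the central atom."""
    target = Counter(ligands)
    return [c["formula"] for c in COMPLEXES if Counter(c["ligands"]) == target]


if __name__ == "__main__":
    print(ligand_search(["CN"] * 6))
    print(ligand_search(["NH3", "Cl", "NH3", "Cl"]))
```

Using a multiset (Counter) makes the comparison independent of the order in which the ligands are listed.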
Specinfo
Specinfo is a factual database for spectroscopic data. It is probably the world's largest spectra collection with more than 660,000 spectra.[7]
Specinfo includes the following features:
This database has some additional commands and search fields tailored to the specific requirements of retrieving spectroscopic data.
Specinfo has an additional tool for calculating NMR spectra. A number of such programs are available on the market, and this is not the place to discuss their advantages and disadvantages.
Only one short statement: the calculation of NMR spectra in Specinfo is based strictly on the data sets of real compounds in the database. This leads to very reliable and exact calculated spectral parameters for compound classes that are registered in the database. Spectra of new compounds or compounds with unusual bonding, however, can be predicted incorrectly.
Crystallographic databases
The Cambridge Structural Database (CSD) and the Inorganic Crystal Structure Database (ICSD) are introduced during the course.[8, 9] Both databases are available as in-house versions. CSD provides access to organic and organometallic structures (mainly X-ray structures, some structures from neutron diffraction). The ICSD contains inorganic structures. Together, these two databases provide access to all kinds of organic, organometallic and inorganic structures. Both databases have a graphical user interface.
The Cambridge Structural Database (CSD) contains crystal structure information for over 230,000 organic and organometallic compounds. All of these crystal structures have been analyzed using X-ray or neutron diffraction techniques.
For each crystallographic entry, the CSD stores several distinct types of information.[10] These are:
Scheme 3. Example for a database entry in the CSD.
GEJPEQ
Dichloro-bis(eta$5!-pentamethyl-cyclopentadienyl)-zirconium at 140deg.K, C20 H30 Cl2 Zr1
U.Bohme,B.Rittmeister, Private Communication,1998
*REFC=GEJPEQ
Authors' cell dimensions
a b c alpha beta gamma
14.7600 16.6110 8.0740 90.0 90.0 90.0
Zr1 .5000 0 .1012(2)
Cl1 .3766(2) 0 -.1040(5)
C1 .5000 .1508(8) .0775(23)
C2 .4251(8) .1316(7) .1797(15)
C3 .4543(7) .1027(7) .3257(18)
C4 .5000 .1884(13) -.0869(24)
...etc.
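The numerical data in such an entry can be used directly. As a small illustration, the Python sketch below computes the Zr1-Cl1 distance and the cell volume from the values in Scheme 3. It assumes that the cell lengths are given in Å and relies on the fact that all three cell angles are 90°, so fractional coordinates can simply be scaled by a, b and c. Symmetry-equivalent positions are not considered, and the function names are our own.

```python
import math

# Cell lengths a, b, c from Scheme 3 (assumed to be in Å); all angles are 90°.
CELL = (14.7600, 16.6110, 8.0740)

# Fractional coordinates from Scheme 3 (standard uncertainties omitted).
ZR1 = (0.5000, 0.0, 0.1012)
CL1 = (0.3766, 0.0, -0.1040)


def distance_orthogonal(p: tuple, q: tuple, cell: tuple) -> float:
    """Distance between two fractional positions in a cell with 90° angles."""
    return math.sqrt(sum(((pi - qi) * length) ** 2
                         for pi, qi, length in zip(p, q, cell)))


def cell_volume_orthogonal(cell: tuple) -> float:
    """Volume of a cell with 90° angles: simply a * b * c."""
    a, b, c = cell
    return a * b * c


if __name__ == "__main__":
    print(f"Zr1-Cl1 distance: {distance_orthogonal(ZR1, CL1, CELL):.3f}")
    print(f"Cell volume: {cell_volume_orthogonal(CELL):.1f}")
```

For cells with angles other than 90°, the full metric tensor would be required instead of this simple scaling.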
Patent databases
A number of different patent databases are available. All of them are searchable with the Messenger command language.
Some examples of patent databases [11]:
EUROPATFULL | European patent and application full texts |
INPADOC | International patent database covering over 60 countries |
JAPIO | Bibliographic data of Japanese patent applications |
PATDPA | Patents and utility models in Germany |
PATIPC | The international patent classification |
USPATFULL | Full-text database of US patents |
WPINDEX | Database of international patent publications |
CA | Chemical Abstracts also contains patents |
The Learn-database for World Patents Index (LWPI) provides an opportunity for training in specific retrieval strategies for patents. We work with the learn databases of the world patent index and CA during our course.
Scheme 4. Example for a patent search.
Search patents from Bayer on herbicides in LCA!
=> expand bayer/cs
=> search e3-e8 --> L1
=> search herbicid? --> L2
=> search L1 and L2 --> L3
=> search L3 and patent/dt --> L4
=> display L4 1-9 bib abs
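Each search in Scheme 4 produces a numbered answer set (L1, L2, ...) that later steps can combine with Boolean operators. This logic can be imitated with Python sets. The records below are invented for illustration, and the field names only loosely follow the search fields used in the scheme.

```python
# Invented records keyed by record number.
RECORDS = {
    1: {"cs": "Bayer AG", "text": "herbicidal composition", "dt": "patent"},
    2: {"cs": "Bayer AG", "text": "herbicide uptake in soil", "dt": "journal"},
    3: {"cs": "Bayer AG", "text": "polyurethane foam", "dt": "patent"},
    4: {"cs": "Other Company", "text": "herbicides for maize", "dt": "patent"},
}


def search(predicate) -> set:
    """Return the answer set (record numbers) for one search condition."""
    return {num for num, rec in RECORDS.items() if predicate(rec)}


# => search e3-e8 (company names starting with "bayer")
L1 = search(lambda rec: rec["cs"].lower().startswith("bayer"))

# => search herbicid? (right truncation on any word of the text)
L2 = search(lambda rec: any(word.startswith("herbicid")
                            for word in rec["text"].lower().split()))

# => search L1 and L2
L3 = L1 & L2

# => search L3 and patent/dt
L4 = L3 & search(lambda rec: rec["dt"] == "patent")

if __name__ == "__main__":
    for name, answer_set in [("L1", L1), ("L2", L2), ("L3", L3), ("L4", L4)]:
        print(name, sorted(answer_set))
```

Building the result step by step makes it easy to see at which stage the answer set becomes too small or too large.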
A balance of different pedagogical components is essential for the success of the course. These components are:
The individual computer-based training lessons are organized as follows:
The databases cover virtually all aspects of modern chemistry. We have not discussed databases for biological structures (such as the Brookhaven Protein Data Bank), since biochemistry is not part of the chemistry curriculum at our university. Of course, it would be possible to include biological databases as well. The idea of teaching and training with online databases is widely applicable and extremely helpful for the future careers of young chemists.
Incorporating these lessons into the curriculum allows students to search for chemical information and gives them the basic skills needed for a professional career in the chemical industry or at research institutions.
Remark concerning the databases:
If you want to create a course like ours, you will need the cooperation of one or more database providers. Luckily, STN Karlsruhe provides us with logins to learning databases and restricted access to some full databases. In return, we actively promote these databases, which increases the likelihood that our students will use them in their future jobs.
If you plan to do such a course, we recommend that you contact your local database provider and ask for logins for the databases you want to use for your lessons. There is a good chance that they will help you.
1. Further information about Messenger is available at http://www.fiz-karlsruhe.de/fiz/service/c_doc.html#Mess
2. Website of Chemical Abstracts: http://www.cas.org/
3. Database description of Beilstein: http://info.cas.org/ONLINE/DBSS/beilsteinss.html
4. Database description of Gmelin: http://www.cas.org/ONLINE/DBSS/gmelinss.html
5. http://www.beilstein.com/products/xfire/
6. http://www.beilstein.com/products/xfire/gmelin.shtml
7. Database description of Specinfo: http://www.cas.org/ONLINE/DBSS/specinfoss.html
8. Website of the Cambridge Crystallographic Data Centre: http://www.ccdc.cam.ac.uk/
9. Database description of ICSD: http://www.cas.org/ONLINE/DBSS/icsdss.html
10. http://www.ccdc.cam.ac.uk/prods/csd/csd.html
11. For the database descriptions, see http://www.cas.org/ONLINE/DBSS/dbsslist.html