Chemical Education Journal (CEJ), Vol. 5, No. 2 /Registration No. 5-27/Received September 25, 2001.
URL = http://www.juen.ac.jp/scien/cssj/cejrnlE.html

Teaching Chemistry in the Information Age: Internet, Online and In-house Databases

Uwe Boehme(1) and Silke Tesch(2)

Technische Universitaet Bergakademie Freiberg, (1) Institut fuer
Anorganische Chemie, Leipziger Str. 29; (2) Universitaetsbibliothek
"Georgius Agricola", Agricolastr. 10; D-09596 Freiberg (Germany)

               

E-mail: Uwe.Boehme@chemie.tu-freiberg.de
E-mail: Silke.Tesch@ub.tu-freiberg.de

               

Abstract: Information management via the computer is essential in the chemistry curriculum. Online databases, in-house databases and chemical information on the Internet deliver valuable information for chemists. The article describes the uses of information technology in a chemistry course. The course is a regular part of the chemistry curriculum at the Bergakademie Freiberg. A broad range of databases and information resources is presented throughout the course: online catalogues and electronic journals, chemical information on the Internet, Chemical Abstracts, the Beilstein and Gmelin databases, crystallographic databases, spectroscopic data from Specinfo, reaction databases, patent information and patent searching. A balance of different pedagogical components is essential for the success of the course. Incorporating these lessons into the curriculum allows students to search for chemical information and gives them the basic skills needed for a professional career in the chemical industry or at research institutions.



Introduction

Access to chemical information is a cornerstone of chemistry. Therefore, information management via the computer is essential in the chemistry curriculum. Online databases, in-house databases and chemical information on the Internet deliver valuable information for chemists.

Specific problems arise when searching for chemical compounds. About 26 million compounds are known today. Every single chemical compound has a number of different chemical and physical properties, from melting point, solubility and spectral data to preparation and reactivity. These data add up to an enormous pile of information. How can we extract the specific information relevant to our problem from this pile of data? It is impossible to search the whole printed literature at once. In addition, electronic media are now available, which also have to be considered.

Furthermore, key words are often not adequate to search for structures and chemical properties. Therefore, searching by structure and by substance data is necessary. For example, it can be important to search for certain substance properties in order to find compounds with tailor-made properties for applications in materials science. Alternatively, it might be necessary to search for compounds with certain functional groups for pharmaceuticals or drug design.

We will demonstrate the uses of information technology in a chemistry course. The course is a regular part of the chemistry curriculum at our university. We are, of course, aware that there are other courses on this topic. But we have worked hard over the last few years to prepare this course and we believe we have created an interesting, modern and useful course for our students. We hope this report about our experience will inspire some colleagues to create similar courses.

Access to chemical information is possible via different methods:

               
           
Scope of the course
               
We want to provide students with basic search skills so that they can retrieve information from a wide range of information resources. As we all know, this is an indispensable qualification for a career in science and business.

The chemistry curriculum in Freiberg

Methods of classic searching in the printed literature are not taught in a separate course; instead, the topic is included in the third, fourth and fifth semester practical courses. Students are introduced to basic search methods by the library staff and must then find out how to search in the printed Beilstein or Gmelin handbooks.
               
During the advanced laboratory courses starting in the sixth semester, it is sometimes necessary for students to get current information from literature or obtain a quick survey on data for a specific substance. For example, students may be asked to find the most effective way to synthesize a compound or to get information about substance data such as melting point, refractive index or boiling point.
               
Students in the seventh or eighth semester have acquired enough knowledge about chemistry to understand different types of databases. Since they actually need information from databases, they are able to ask well-constructed questions. Students take our course on chemical information and databases during the eighth semester (see Table 1).
               
Table 1. Chemistry Curriculum at the Bergakademie Freiberg

Semester Subject
1. Math and Physics Inorganic Chem.
2. Physical Chem. Analytical Chem.
3. Organic Chem.
4.
5. Biochemistry Toxicology
6. Technical Chem.
Examen for Baccalaureus
7. Computer Science 3 Topics Advanced Courses
8. Chemical Information
9.
10. Diploma
Examen for Diploma Thesis



               

Outline of the course

The following databases and information resources are presented throughout the course:

               

These topics cover the whole range of electronic information available today.

               
Online catalogues, electronic journals and E-books are derived from the classic printed library and will not be discussed further in this article.

Chemical information on the Internet

The lesson on chemical information on the Internet covers:

We discuss search strategies for accessing chemical information on the Internet and practise these strategies with examples. The Internet is extremely helpful for obtaining general information about a subject and for searching for publications by author.

Some of the training examples are designed to indicate the limitations of the Internet:

Example for Training Lessons: Search for "lectures in crystallography"

The search was performed with the general term "crystallography" and the specific term "lectures in crystallography". Table 2 shows the results of the search, obtained in June 2001.

Table 2. Example for search results with selected search engines.

Search Engine / Directory   General Term "crystallography"   Specific Term "lectures in crystallography"
Yahoo                       2 categories, 36 sites           3290 ("and")
Altavista                   233 716                          5 000
Excite                      12 740                           50
Google                      148 000                          27
Lycos                       214 524                          6381
Metacrawler                 32                               0
Chemie.de                   17 733                           599

 


               


The search for "crystallography" in the systematic directory Yahoo retrieves a limited number of hits, which gives general access to the subject. The search for the specific term with Yahoo gives many more answers, which do not come from the systematic directory but from a linked search engine. Full-text search engines like Altavista, Excite or Google retrieve up to about 230 000 hits for the general term, which is far too many. The search with the specific term gives a limited and reasonable number of answers with Excite and Google.

The meta search engine retrieves data from other search engines. Only 32 documents are shown for "crystallography" since only a restricted number of results per source is shown (10 hits per search engine). The chemistry portal Chemie.de finds a remarkable number of documents.
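The effect of the cap on hits per source can be sketched in a few lines of Python. This is only an illustration of the principle described above, not the actual Metacrawler algorithm: the function name `meta_search`, the removal of duplicate addresses and the example.org addresses are assumptions made for this sketch.

```python
def meta_search(results_by_engine, per_engine_limit=10):
    """Merge the top hits of several engines and drop duplicate URLs.

    results_by_engine maps an engine name to its ranked list of URLs.
    Only the first per_engine_limit hits of each engine are considered.
    """
    merged, seen = [], set()
    for engine, hits in results_by_engine.items():
        for url in hits[:per_engine_limit]:
            if url not in seen:          # keep each document only once
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical ranked result lists (placeholder addresses, not real hits).
toy_results = {
    "engine_a": [f"http://example.org/doc{i}" for i in range(25)],
    "engine_b": [f"http://example.org/doc{i}" for i in range(5, 20)],
}

if __name__ == "__main__":
    print(len(meta_search(toy_results)), "documents after merging")
```

However many hits each underlying engine finds, the merged list can never be longer than the number of sources multiplied by the cap.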

The systematic directory, and a search for the specific term in some search engines, give fast and reasonable access to this subject. The results will differ for any other search question, so it is often better to try several search engines.

 

Electronic databases

Online and in-house databases are important as comprehensive sources of information. They cover a specific part of chemical knowledge independent of single journals, publication years or the physical presence of the printed book or journal.

Electronic databases offer the following advantages:

The Scientific and Technical Information Network (STN International) offers information from over 200 databases covering a broad range of scientific fields, including chemistry, engineering, life sciences, pharmaceuticals, biotechnology, and patents. Various search questions in chemistry can be answered with these databases.

All databases at STN can be searched by using the Messenger command language.

 

Messenger

Basic searching in Messenger requires only very simple commands (see "Commands" in Table 3).

We discuss additional special commands and logical operators for constructing complex search profiles and solving specific problems (see Boolean operators and wild cards in Table 3).

Table 3. Short overview of the retrieval language Messenger.[1]

Commands
  expand      View index
  search      Search
  display     Display results

Use of Boolean operators
  AND         Two or more search terms in the same record
  OR          Either search term in the same record
  NOT         Retrieves the first term but not the second term

Wild Cards or Truncation
  !           Adds one variable character to your term
  #           0 or one character
  ?           Any number of characters

Search Combined Terms
  (A)         Terms must be next to each other
  (W)         Terms must be adjacent
  (L)         Terms must be in the same field
  Example:    search crystal? (L) structure#

Search in specific fields
  search waymouth, r?/au      Search in author field
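The truncation symbols in Table 3 behave much like a small pattern language. The following Python sketch translates them into regular expressions to show which words a truncated term would accept. It is an illustration only, not part of Messenger or STN: the mapping, the function names `messenger_to_regex` and `matches`, and the restriction to word characters are our assumptions.

```python
import re

# Assumed mapping of the Messenger truncation symbols from Table 3
# onto regular-expression fragments (an illustration, not STN code).
TRUNCATION = {
    "!": r"\w",    # exactly one variable character
    "#": r"\w?",   # zero or one character
    "?": r"\w*",   # any number of characters
}


def messenger_to_regex(term):
    """Translate one Messenger search term into a compiled regex."""
    parts = [TRUNCATION.get(ch, re.escape(ch)) for ch in term]
    return re.compile("".join(parts), re.IGNORECASE)


def matches(term, word):
    """Return True if the whole word satisfies the truncated term."""
    return messenger_to_regex(term).fullmatch(word) is not None


if __name__ == "__main__":
    for term, word in [("crystal?", "crystallography"),
                       ("structure#", "structures"),
                       ("structure#", "structural")]:
        print(term, word, matches(term, word))
```

In this sketch `crystal?` accepts "crystal", "crystals" and "crystallography", whereas `structure#` accepts only "structure" plus at most one further character.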

 

Ways to Search

Searching by chemical name or molecular formula is often not suitable for retrieving a specific chemical compound or a chemical structure. Chemical names for large molecules can be very complicated, so a single error in the name returns zero hits. A search for a molecular formula, on the other hand, retrieves all isomers with the same composition.

The way out of this dilemma is to search for a chemical structure. This can be done through the graphical user-interface provided by STN-Express (see Figure 1). The building and retrieval of complex chemical structures is the next step in the work with online databases.
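The ambiguity of a formula search can be illustrated with a small Python sketch. Ethanol and dimethyl ether are well-known isomers with the same composition; the dictionary `compounds` and the functions `hill_formula` and `search_by_formula` are hypothetical helpers written for this illustration and have nothing to do with the STN software.

```python
from collections import Counter


def hill_formula(atoms):
    """Build a molecular formula from a list of element symbols.

    Carbon comes first, then hydrogen, then all other elements in
    alphabetical order (Hill order for carbon-containing compounds).
    """
    counts = Counter(atoms)
    order = [el for el in ("C", "H") if el in counts]
    order += sorted(el for el in counts if el not in ("C", "H"))
    return "".join(el + (str(counts[el]) if counts[el] > 1 else "")
                   for el in order)


# Two isomers: identical composition, different connectivity.
compounds = {
    "ethanol":        {"atoms": ["C", "C", "O"] + ["H"] * 6, "smiles": "CCO"},
    "dimethyl ether": {"atoms": ["C", "O", "C"] + ["H"] * 6, "smiles": "COC"},
}


def search_by_formula(formula):
    """Return the names of all compounds with the given molecular formula."""
    return sorted(name for name, data in compounds.items()
                  if hill_formula(data["atoms"]) == formula)


if __name__ == "__main__":
    print(search_by_formula("C2H6O"))
```

Both compounds are returned for the formula C2H6O; only the connectivity, here written as a SMILES string, tells them apart.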

               
               

Figure 1. Screenshot of the structure drawing menu from STN Express.

Chemical Abstracts (CA)

This database covers all areas of chemistry and chemical engineering. CA is a bibliographic database which contains secondary information. A bibliographic database describes the contents of a publication. It includes bibliographic data (author, title, journal), key words and a brief description of the content of the publication.

CA is the database with the broadest coverage of all chemistry databases. It gives an overview of the available chemical literature and provides access to the chemical literature for any given search problem. Sources for CA include more than 8,000 journals, patents, technical reports, books, conference proceedings, and dissertations.[2]

Chemical Abstracts consists of two databases: The Registry file and Chemical Abstracts itself (see Scheme 1).

Scheme 1. Use of Registry and Chemical Abstracts.

 

 

Registry
33 million substances (September 2001)
Searching chemical substances:

  • Chemical names
  • Chemical name segments
  • Molecular formulas
  • Chemical structures

CAS Registry number

Chemical Abstracts
Bibliographic database
20 million document records
Searching subjects:

  • Key word searching
  • Names of authors, companies
  • Document types

Bibliographic information

 

 

Beilstein and Gmelin

Both are factual databases. Beilstein contains facts and structures relating to organic substances [3]; Gmelin has information on inorganic, coordination and organometallic compounds.[4] Facts on substances registered in the database can be retrieved, making it unnecessary to acquire the original text. Additional bibliographic information can also be retrieved. Both databases are available as in-house or online databases.

Contents of Beilstein [5]                                     

The Beilstein File includes information about:                

Scheme 2. Examples Beilstein.

Which preparations are described for the synthesis of
chloroacetylene (RN=593-63-5) starting from 1,1-dichloroethene (BRN=1733365)?

=> search 593-63-5/rn and 1733365/pre.sm
                        1 593-63-5/RN
                        4 1733365/PRE.SM
      L10               1 593-63-5/RN AND 1733365/PRE.SM
=> display hit

Search for compounds with a boiling point between 159 and 171 °C at 750 to 760 torr.
    
=> search 159-171/bp (p) 750-760/bp.p
                       490 159 CEL - 171 CEL /BP
                       146 750 TORR - 760 TORR /BP.P
      L9    9 159 CEL - 171 CEL /BP (P) 750 TORR - 760 TORR /BP.P
=> display hit
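The second search in Scheme 2 links a boiling point range to a pressure range. The Python sketch below shows why such a link matters, under the assumption that the (P) operator requires both conditions to be met by the same measurement. The three records are invented for illustration only, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BoilingPoint:
    celsius: float   # boiling point in degrees Celsius
    torr: float      # pressure at which it was measured


# Hypothetical records, invented for illustration only.
records = {
    "compound A": [BoilingPoint(165.0, 760.0)],
    "compound B": [BoilingPoint(165.0, 15.0), BoilingPoint(250.0, 755.0)],
    "compound C": [BoilingPoint(90.0, 760.0)],
}


def in_range(value, low, high):
    return low <= value <= high


def search_same_measurement(bp_range, p_range):
    """Both conditions must hold within one measurement (assumed role of (P))."""
    return sorted(
        name for name, measurements in records.items()
        if any(in_range(m.celsius, *bp_range) and in_range(m.torr, *p_range)
               for m in measurements))


def search_anywhere(bp_range, p_range):
    """Plain AND: the conditions may be met by different measurements."""
    return sorted(
        name for name, measurements in records.items()
        if any(in_range(m.celsius, *bp_range) for m in measurements)
        and any(in_range(m.torr, *p_range) for m in measurements))


if __name__ == "__main__":
    print(search_same_measurement((159, 171), (750, 760)))
    print(search_anywhere((159, 171), (750, 760)))
```

With a plain AND, compound B would be a false hit, because its boiling point in the requested range was measured at reduced pressure.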


Contents of Gmelin [6]

               

Gmelin is the only comprehensive, electronically searchable source of structures, and properties in inorganic and organometallic chemistry. Its coverage is unparalleled, with data from the Gmelin Handbook of Inorganic and Organometallic Chemistry (1772-1975) and from most well-respected inorganic, organometallic, and materials science journals abstracted since 1975. Gmelin currently contains 1.4 million compounds, including for instance:
                       

Gmelin contains over 800 different chemical and physical property fields and a deep, detailed index of the original literature (for an example, see Figure 2). Broad categories of data found in the database include:

Because of its strength in materials, Gmelin is a logical choice for researchers in materials science and electronics. However, the database's detailed information on organometallics also makes it an invaluable tool in synthetic chemistry. Organometallics mediate many unique syntheses. With Gmelin, chemists can find out more about the chemical properties of particular catalysts and perhaps identify new and better catalysts for running certain reactions.


               

               

Figure 2. Example for available property fields in Gmelin.

Gmelin provides a special search strategy for coordination compounds which is found in no other database: the ligand search system. This search method gives access to coordination compounds from a completely different point of view: it is possible to retrieve all coordination compounds with the same ligand environment, independent of the central atom or the molecular formula of a compound (see Figure 3).


               

               

Figure 3. Example for the ligand search system in Gmelin.

Specinfo

Specinfo is a factual database for spectroscopic data. It is probably the world's largest spectra collection with more than 660,000 spectra.[7]

Specinfo includes the following features:

                                       

               
               

Figure 4. Example of a printed spectrum in Specinfo.

This database has some additional commands and search fields tailored to the specific requirements of retrieving spectroscopic data.

Specinfo has an additional tool for calculating NMR spectra. A number of such programs are available on the market, and this is not the place to discuss their advantages and disadvantages.

Only one short statement: the calculation of NMR spectra in Specinfo is based strictly on the data sets of real compounds in the database. This leads to very reliable and exact calculated spectral parameters for compound classes which are registered in the database. Spectra of new or unusually bonded compounds can be predicted incorrectly.

 

Crystallographic databases

The Cambridge Structural Database (CSD) and the Inorganic Crystal Structure Database (ICSD) are introduced during the course.[8, 9] Both databases are available as in-house versions. The CSD provides access to organic and organometallic structures (mainly X-ray structures, some structures from neutron diffraction). The ICSD contains inorganic structures. Together, these two databases provide access to all kinds of organic, organometallic and inorganic structures. Both databases have a graphical user interface.

The Cambridge Structural Database (CSD) contains crystal structure information for over 230,000 organic and organometallic compounds. All of these crystal structures have been analyzed using X-ray or neutron diffraction techniques.

For each crystallographic entry, the CSD stores distinct types of information.[10] These are:

                       

Scheme 3. Example for a database entry in the CSD.

   
GEJPEQ
Dichloro-bis(eta$5!-pentamethyl-cyclopentadienyl)-zirconium at  140deg.K, C20 
H30 Cl2 Zr1
U.Bohme,B.Rittmeister, Private Communication,1998
*REFC=GEJPEQ

Authors' cell dimensions
a        b        c        alpha    beta    gamma
14.7600  16.6110  8.0740   90.0     90.0    90.0  
Zr1    .5000       0            .1012(2)  
Cl1    .3766(2)    0           -.1040(5) 
C1     .5000       .1508(8)     .0775(23) 
C2     .4251(8)    .1316(7)     .1797(15) 
C3     .4543(7)    .1027(7)     .3257(18) 
C4     .5000       .1884(13)   -.0869(24) 

 ...etc.
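The numerical part of such an entry can be processed directly. The following Python sketch parses three of the atom lines shown in Scheme 3 and calculates the Zr1-Cl1 distance. It relies on the usual crystallographic conventions: the coordinates are fractions of the cell edges, and a number in parentheses is the standard uncertainty of the last digits. It further assumes an orthogonal cell, which holds here because all three angles are 90 degrees, and it ignores symmetry operations. The function names are hypothetical and not part of the CSD software.

```python
import math
import re

# Cell lengths (in angstroms) and three atoms copied from Scheme 3.
CELL = (14.7600, 16.6110, 8.0740)   # a, b, c; all angles are 90 degrees
ATOM_LINES = """\
Zr1    .5000       0            .1012(2)
Cl1    .3766(2)    0           -.1040(5)
C1     .5000       .1508(8)     .0775(23)
"""

VALUE = re.compile(r"^(-?\d*\.?\d+)(?:\((\d+)\))?$")


def parse_value(text):
    """Split '.3766(2)' into (0.3766, 0.0002); the uncertainty is None if absent."""
    match = VALUE.match(text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    number, esd_digits = match.groups()
    value = float(number)
    if esd_digits is None:
        return value, None
    decimals = len(number.split(".")[1]) if "." in number else 0
    return value, int(esd_digits) * 10 ** -decimals


def parse_atoms(block):
    """Return {label: (x, y, z)} with fractional coordinates."""
    atoms = {}
    for line in block.splitlines():
        label, x, y, z = line.split()
        atoms[label] = tuple(parse_value(v)[0] for v in (x, y, z))
    return atoms


def distance(p, q, cell=CELL):
    """Interatomic distance in angstroms for an orthogonal cell."""
    return math.sqrt(sum(((pi - qi) * edge) ** 2
                         for pi, qi, edge in zip(p, q, cell)))


if __name__ == "__main__":
    atoms = parse_atoms(ATOM_LINES)
    print(f"Zr1-Cl1: {distance(atoms['Zr1'], atoms['Cl1']):.3f} angstroms")
```

For a non-orthogonal cell the distance calculation would have to include the cell angles; the parsing of the values would stay the same.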

 

 


               
Figure 5. User interface of the CSD.

 

Patent databases

A number of different patent databases are available. All of them are searchable with the Messenger command language.

Some examples of patent databases [11]:

 

EUROPATFULL European patent and application full texts
INPADOC International patent database covering over 60 countries
JAPIO Bibliographic data of Japanese patent applications
PATDPA Patents and utility models in Germany
PATIPC The international patent classification
USPATFULL Full-text database of US patents
WPINDEX Database of international patent publications
CA Chemical Abstracts also contains patents

 

The Learn-database for World Patents Index (LWPI) provides an opportunity for training in specific retrieval strategies for patents. We work with the learning databases of the World Patents Index and CA during our course.

 

Scheme 4. Example for a patent search.

 

     
Search for patents from Bayer on herbicides in LCA.

=> expand bayer/cs
=> search e3-e8	--> L1
=> search herbicid?	--> L2
=> search L1 and L2	--> L3
=> search L3 and patent/dt	--> L4
=> display L4 1-9 bib abs
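The stepwise combination of answer sets in Scheme 4 corresponds to ordinary set operations. The Python sketch below mimics the L-numbers with sets of document numbers. The document numbers are invented for illustration, and the helper functions are hypothetical and not part of Messenger.

```python
# Hypothetical answer sets: the document numbers are invented for illustration.
answer_sets = {
    "L1": {101, 102, 103, 104},   # records found for the company (search e3-e8)
    "L2": {103, 104, 105, 106},   # records matching herbicid?
    "PATENT": {102, 104, 106},    # records with document type "patent"
}


def combine_and(*names):
    """Boolean AND: records present in every named answer set."""
    return set.intersection(*(answer_sets[name] for name in names))


def combine_or(*names):
    """Boolean OR: records present in at least one named answer set."""
    return set.union(*(answer_sets[name] for name in names))


def combine_not(first, second):
    """Boolean NOT: records in the first set but not in the second."""
    return answer_sets[first] - answer_sets[second]


# The last two search steps of Scheme 4, expressed with sets.
answer_sets["L3"] = combine_and("L1", "L2")
answer_sets["L4"] = combine_and("L3", "PATENT")

if __name__ == "__main__":
    print("L3:", sorted(answer_sets["L3"]))
    print("L4:", sorted(answer_sets["L4"]))
```

Each additional AND step can only keep or reduce the number of records, which is why the answer set becomes smaller from L1 to L4.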

 

Pedagogical concepts

A balance of different pedagogical components is essential for the success of the course. These components are:

                       

The individual computer-based training lessons are organized as follows:

 

Present trends

 

Summary

The databases cover virtually all aspects of modern chemistry. We have not discussed databases for biological structures (like the Brookhaven Protein Data Bank), since biochemistry is not part of the chemistry curriculum at our university. Of course, it would be possible to include biological databases as well. The idea of teaching and training with online databases is widely applicable and extremely helpful for the future careers of young chemists.

Incorporating these lessons into the curriculum allows students to search for chemical information and gives them the basic skills needed for a professional career in the chemical industry or at research institutions.

 

Acknowledgement

Remark concerning the databases:

If you want to create a course like ours, you will need the cooperation of one or more database providers. Luckily, STN Karlsruhe provides us with logins to learning databases and restricted access to some full databases. In return, we actively promote these databases, which increases the likelihood that our students will use them in their future jobs.

If you plan to do such a course, we recommend that you contact your local database provider and ask for logins for the databases you want to use for your lessons. There is a good chance that they will help you.

 

References

 

1. Further information about Messenger is available at http://www.fiz-karlsruhe.de/fiz/service/c_doc.html#Mess

2. Website of Chemical Abstracts: http://www.cas.org/

3. Database description of Beilstein: http://info.cas.org/ONLINE/DBSS/beilsteinss.html

4. Database description of Gmelin: http://www.cas.org/ONLINE/DBSS/gmelinss.html

5. http://www.beilstein.com/products/xfire/

6. http://www.beilstein.com/products/xfire/gmelin.shtml

7. Database description of Specinfo: http://www.cas.org/ONLINE/DBSS/specinfoss.html

8. Website of the Cambridge Crystallographic Data Centre: http://www.ccdc.cam.ac.uk/

9. Database description of ICSD: http://www.cas.org/ONLINE/DBSS/icsdss.html

10. http://www.ccdc.cam.ac.uk/prods/csd/csd.html

11. For the database descriptions, see http://www.cas.org/ONLINE/DBSS/dbsslist.html

