The Biology Project > Biochemistry > The Chemistry of Amino Acids
The origin of the single-letter code for the amino acids
The origin of the single-letter code for the amino acids is of historical interest, and the story may help students learn the code. The reason for the code is simple: in the very early days of bioinformatics, even the fastest computers were slow and short on storage. Dr. Margaret Oakley Dayhoff, arguably the founder of the field of bioinformatics, shortened the three-letter designations to a single-letter code in order to reduce the size of the data files needed to describe the sequence of amino acids in a protein. The amino acids, their three-letter and single-letter codes, and the explanation for the choice of each single letter are listed below. Because 20 amino acids are commonly found in proteins and the alphabet has only 26 letters, most of the letters are used.
In developing a single-letter code for the amino acids, Dr. Dayhoff tried to make it as easy to remember as possible. If the name of each amino acid began with a different letter, the code would be simple indeed. For six of the amino acids, the first letter of the name is unique, so the assignment is straightforward. These are:
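Amino acid | Three-letter code | Single-letter code | Explanation |
Cysteine | Cys | C | First letter of the name |
Histidine | His | H | First letter of the name |
Isoleucine | Ile | I | First letter of the name |
Methionine | Met | M | First letter of the name |
Serine | Ser | S | First letter of the name |
Valine | Val | V | First letter of the name |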
For the other amino acids, the first letter of the name is not unique to a single amino acid, so Dr. Dayhoff assigned the letters A, G, L, P and T to the amino acids Alanine, Glycine, Leucine, Proline and Threonine, respectively, which occur more frequently in proteins than do the other amino acids having the same first letters.
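Amino acid | Three-letter code | Single-letter code | Explanation |
Alanine | Ala | A | First letter of the name |
Glycine | Gly | G | First letter of the name |
Leucine | Leu | L | First letter of the name |
Proline | Pro | P | First letter of the name |
Threonine | Thr | T | First letter of the name |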
For some of the other amino acids, the assigned letter is phonetically suggestive of the name.
Amino acid | Three-letter code | Single-letter code |
Arginine | Arg | R |
Phenylalanine | Phe | F |
Tyrosine | Tyr | Y |
Tryptophan | Trp | W |
For the remaining 5 amino acids, Dr. Dayhoff was reaching somewhat to find an easy-to-remember connection between the single letter and the amino acid. She assigned aspartic acid, asparagine, glutamic acid and glutamine the letters D, N, E and Q, respectively, noting that D and N are nearer the beginning of the alphabet than E and Q, and that Asp is smaller than Glu, while Asn is smaller than Gln.
Amino acid | Three-letter code | Single-letter code |
Aspartic Acid | Asp | D |
Asparagine | Asn | N |
Glutamic Acid | Glu | E |
Glutamine | Gln | Q |
By the time Dr. Dayhoff got to lysine, there were not too many letters left, so she used the letter K, explaining that K is at least near L in the alphabet.
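Amino acid | Three-letter code | Single-letter code | Explanation |
Lysine | Lys | K | K is near L in the alphabet |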
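The assignments described above can be collected into a simple lookup table. The following is a minimal Python sketch, not Dr. Dayhoff's original software: the names `THREE_TO_ONE` and `to_single_letter`, the hyphen-separated input format, and the example peptide are all assumptions made for illustration. It converts a sequence written in three-letter codes to single-letter codes and shows the threefold reduction in the number of letters stored.

```python
# Three-letter to single-letter codes for the 20 amino acids discussed above.
# The dictionary name and layout are illustrative, not part of any standard library.
THREE_TO_ONE = {
    # First letter of the name is unique
    "Cys": "C", "His": "H", "Ile": "I", "Met": "M", "Ser": "S", "Val": "V",
    # First letter given to the more frequently occurring amino acid
    "Ala": "A", "Gly": "G", "Leu": "L", "Pro": "P", "Thr": "T",
    # Phonetically suggestive letters
    "Arg": "R", "Phe": "F", "Tyr": "Y", "Trp": "W",
    # Acids and their amides
    "Asp": "D", "Asn": "N", "Glu": "E", "Gln": "Q",
    # K is near L in the alphabet
    "Lys": "K",
}


def to_single_letter(three_letter_sequence: str, separator: str = "-") -> str:
    """Convert a separator-delimited three-letter sequence to single letters."""
    residues = three_letter_sequence.split(separator)
    # capitalize() normalizes input such as "GLY" or "gly" to "Gly".
    return "".join(THREE_TO_ONE[r.strip().capitalize()] for r in residues)


# Hypothetical ten-residue peptide, chosen only for illustration.
peptide = "Gly-Ile-Val-Glu-Gln-Cys-Cys-Thr-Ser-Ile"
compact = to_single_letter(peptide)

print(compact)                                            # GIVEQCCTSI
print(len(peptide.replace("-", "")), "->", len(compact))  # 30 -> 10
```

An unrecognized three-letter code raises a `KeyError` in this sketch, which seems preferable to silently dropping a residue.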
Note about Dr. Margaret Oakley Dayhoff (1925-1983)
Dr. Margaret Oakley Dayhoff was a professor at Georgetown University Medical Center and a noted research biochemist at the National Biomedical Research Foundation, where she pioneered the application of mathematics and computational methods to the field of biochemistry. Dr. Dayhoff dedicated her career to applying evolving computational technologies to support advances in biology and medicine, most notably the creation of protein and nucleic acid databases and tools to interrogate them. She earned her PhD in the Department of Chemistry at Columbia University, where she devised computational methods to calculate molecular resonance energies of several organic compounds. She did postdoctoral studies at the Rockefeller Institute (now Rockefeller University) and the University of Maryland, and joined the newly established National Biomedical Research Foundation in 1959.
Dr. Dayhoff's work with proteins began in 1961 when she developed tools to aid protein chemists in determination of amino acid sequences by automatically overlapping the sequences of peptides. She went on to initiate the "Atlas of Protein Sequence and Structure", and to develop many of the tools used today in database design and utilization. In 1980, Dr. Dayhoff developed an on-line database system that could be accessed by telephone line, the first sequence database available for interrogation by remote computers. Dr. Margaret Oakley Dayhoff, the founder of the field of bioinformatics, died before the field was recognized as a distinct area for investigation. She was, indeed, a pioneer.
Dr. Dayhoff was extremely active in the Biophysical Society, and served the society as both its secretary and president. One of her interests was in enhancing the ability of women to successfully pursue careers in the sciences. She was well aware of the many challenges facing women in science, and worked hard to encourage and mentor women in scientific careers. It is therefore fitting that the Margaret Oakley Dayhoff award was established to encourage young women to enter careers in scientific research. This award is aimed at women of very high promise who have not yet reached a position of high recognition within the structure of academic society. It is administered through the Biophysical Society, and candidates are judged on achievement and promise in fields within the purview of the Biophysical Society.
http://biology.arizona.edu
All contents copyright © 2003. All rights reserved.