Cracking of the genetic code
The cracking of the genetic code represents a pivotal advance in molecular biology, revealing how genetic information in DNA is translated into proteins within living organisms. DNA is composed of four bases, adenine (A), thymine (T), guanine (G), and cytosine (C), whose sequence dictates the specific arrangement of amino acids, the building blocks of proteins. The process involves messenger RNA (mRNA), which is synthesized from DNA and carries the genetic instructions to ribosomes, the cellular machinery where proteins are synthesized.
The groundbreaking work of scientists such as Francis Crick and Marshall Nirenberg established that groups of three DNA or RNA bases, known as codons, correspond to specific amino acids. A triplet code allows 64 possible codons, more than enough to specify the 20 amino acids; three of them proved to be "stop" codons that signal the termination of protein synthesis. Transfer RNA (tRNA) serves as an intermediary, bringing the appropriate amino acid to the ribosome according to the codon sequence of the mRNA.
Deciphering the genetic code has had profound implications, enabling scientists to identify genes linked to diseases, develop gene therapy techniques, and enhance our understanding of genetic mutations. This monumental achievement in genetics parallels other significant discoveries, marking a transformative era in biological research and medicine.
SIGNIFICANCE: The deciphering of the genetic code was a significant accomplishment for molecular biologists. The identification of the “words” used in the code explained how the information carried in DNA can be interpreted, via an RNA intermediate, to direct the specific sequence of amino acids found in proteins.
The Nature of the Puzzle
Soon after DNA was discovered to be the genetic material, scientists began to examine the relationship between DNA and the proteins that are specified by the DNA. DNA is composed of four deoxyribonucleotides containing the bases adenine (A), thymine (T), guanine (G), and cytosine (C). Proteins are composed of twenty different building blocks known as amino acids. The dilemma that confronted scientists was to explain the mechanism by which the four bases in DNA could be responsible for the specific arrangement of the twenty amino acids during the synthesis of proteins.
![Notable mutations. Selection of notable mutations, ordered in a standard table of the genetic code of amino acids. As can be seen, clinically important missense mutations generally change the properties of the amino acid residue between being basic, acidic, polar, or nonpolar. By Mikael Häggström [Public domain], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/94416433-89143.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
![Amino acid sequence. Knowing the genetic code and the way it relates to proteins made by the body are tools to understand cancer cells. By Linda Bartlett (Photographer) [Public domain], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/94416433-89144.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
The solution to the problem arose as a result of both theoretical considerations and laboratory evidence. Experiments done in the laboratories of Charles Yanofsky and Sydney Brenner provided evidence that the order, or sequence, of the bases in DNA was important in determining the sequence of amino acids in proteins. Francis Crick proposed that the bases formed triplet “code words.” He reasoned that if a single base specified a single amino acid, it would be possible to have a protein made up of only four amino acids. If two bases at a time specified amino acids, it would be possible to code for only sixteen amino acids. If the four bases were used three at a time, Crick proposed, it would be possible to produce sixty-four combinations, more than enough to specify the twenty amino acids. Crick also proposed that since there would be more than twenty possible triplets, some of the amino acids might have more than one code word. The eventual assignment of multiple code words for individual amino acids was termed “degeneracy.” The triplet code words came to be known as codons.
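Crick's counting argument can be checked with a few lines of Python. This is only an illustrative sketch; the names `count_code_words` and `all_code_words` are hypothetical helpers introduced here, not anything from the historical work.

```python
from itertools import product

BASES = "ACGT"      # the four DNA bases named in the text
AMINO_ACIDS = 20    # number of amino acids that must be specified


def count_code_words(word_length):
    """Number of distinct code words of a given length over four bases."""
    return len(BASES) ** word_length


def all_code_words(word_length):
    """Explicitly enumerate every code word of the given length."""
    return ["".join(p) for p in product(BASES, repeat=word_length)]


for n in (1, 2, 3):
    total = count_code_words(n)
    # Columns: word length, number of code words, enough for 20 amino acids?
    print(n, total, total >= AMINO_ACIDS)
# prints:
# 1 4 False
# 2 16 False
# 3 64 True
```

Only at a word length of three does the number of combinations exceed twenty, and the surplus of 64 - 20 = 44 combinations is what leaves room for degeneracy.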
Identifying the Molecules Involved
Since DNA is found in the nuclei of most cells, there was much speculation as to how the codons of DNA could direct the synthesis of proteins, a process that was known to take place in another cellular compartment, the cytosol. A class of molecules related to DNA known as ribonucleic acids (RNAs) was shown to be involved in this process. These molecules consist of ribonucleotides containing the bases A, C, and G (as in DNA) but uracil (U) rather than thymine (T). One type of RNA, ribosomal RNA (rRNA), was found to be contained in structures known as ribosomes, the sites where protein synthesis occurs. Messenger RNA (mRNA) was shown to be another important intermediate. It is synthesized in the nucleus from a DNA template in a process known as transcription, and it carries an imprint of the information contained in DNA. For every A found in DNA, the mRNA carries the base U. For every T in DNA, the mRNA carries an A. The Gs in DNA become Cs in mRNA, and the Cs in DNA become Gs in mRNA. The information in mRNA is thus found in a form that is complementary to the sequence in DNA. The mRNA is transported to the ribosomes and takes the place of DNA in directing the synthesis of a protein.
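The base-pairing rules just described amount to a simple lookup table. The following Python sketch applies them base by base; the function name `transcribe` is a hypothetical label, and the sketch assumes the simplified picture given in the text, ignoring strand orientation and any processing of the mRNA.

```python
# DNA template base -> complementary mRNA base, as stated in the text.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna):
    """Return the mRNA complementary to a DNA template strand."""
    template_dna = template_dna.upper()
    for base in template_dna:
        if base not in DNA_TO_MRNA:
            raise ValueError("not a DNA base: " + repr(base))
    return "".join(DNA_TO_MRNA[base] for base in template_dna)


print(transcribe("AAA"))        # prints UUU
print(transcribe("TACAAAGGG"))  # prints AUGUUUCCC
```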
Deciphering the Code
The actual assignment of codons to specific amino acids resulted from a series of elegant experiments that began with the work of Marshall Nirenberg and Heinrich Matthaei in 1961. They obtained a synthetic mRNA consisting of polyuridylic acid, or poly (U), made up of a string of Us. They added poly (U) to a cell-free system that contained ribosomes and all other ingredients necessary to make proteins in vitro. When the twenty amino acids were added to the system, the protein that was produced contained a string of a single amino acid, phenylalanine. Since the only base in the synthetic mRNA was U, Nirenberg and Matthaei had discovered the code for phenylalanine: UUU. Because UUU in mRNA is complementary to AAA in DNA, the actual DNA bases that direct the synthesis of phenylalanine are AAA. By convention, the term “codon” is used to designate the mRNA bases that code for specific amino acids. Therefore UUU, the first code word to be discovered, was the codon for phenylalanine.
Using cell-free systems, other codons were soon discovered by employing other synthetic mRNAs. AAA was shown to code for lysine, and CCC was shown to code for proline. Scientists working in the laboratory of Severo Ochoa began to synthesize artificial mRNAs using more than one base. These artificial messengers produced proteins with various proportions of amino acids. Using this technique, it was shown that a synthetic codon with twice as many Us as Gs specified valine. It was not clear, however, if the codon was UUG, UGU, or GUU. Har Gobind Khorana and his colleagues began to synthesize artificial mRNA with predictable nucleotide sequences, and the use of this type of mRNA contributed to the assignment of additional codons to specific amino acids.
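The ambiguity in the mixed-base experiments is a matter of ordering: knowing only that a codon contains two Us and one G leaves several arrangements open. A short Python sketch lists them; the helper name `codons_with_composition` is hypothetical and serves only to illustrate the point.

```python
from itertools import permutations


def codons_with_composition(bases):
    """All distinct triplets that can be formed from the given three bases."""
    return sorted({"".join(p) for p in permutations(bases)})


# Two Us and one G: the composition linked to valine in the text.
print(codons_with_composition("UUG"))  # prints ['GUU', 'UGU', 'UUG']
```

Base composition alone narrows the candidates to three triplets, which is why messengers of known sequence were needed to settle the assignment.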
In 1964, Philip Leder and Nirenberg developed a cell-free protein-synthesizing system in which they could add triplet codons of known sequence. Using this new system, as well as Khorana’s synthetic messengers, scientists could assign GUU to valine and eventually were able to assign all but three of the possible codons to specific amino acids. These three codons, UAA, UAG, and UGA, were referred to as “nonsense” codons because they did not code for any of the twenty amino acids. The nonsense codons were later found to be a type of genetic punctuation mark; they act as stop signals to specify the end of a protein.
There is no direct interaction between the mRNA codon and the amino acid for which it codes. Yet another type of RNA molecule was found to act as a bridge or, in Crick’s terminology, an “adaptor” between the mRNA codon and the amino acid. This type of RNA is a small molecule known as transfer RNA (tRNA). Specific enzymes connect the amino acids to their corresponding tRNA; the tRNA then carries the amino acid to the appropriate protein assembly location specified by the codon. The tRNA molecules contain recognition triplets known as anticodons, which are complementary to the codons on the mRNA. Thus, the tRNA that carries phenylalanine and recognizes UUU contains an AAA anticodon.
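Codon-anticodon matching follows the same complementarity idea, this time between two RNA molecules. The Python sketch below pairs bases position by position, as the text does; the function name `anticodon` is hypothetical, and the sketch deliberately ignores strand direction and any pairing beyond the standard A-U and G-C pairs.

```python
# RNA base pairing: A pairs with U, G pairs with C.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Return the triplet complementary to an mRNA codon, base by base."""
    codon = codon.upper()
    if len(codon) != 3:
        raise ValueError("a codon has exactly three bases")
    return "".join(RNA_COMPLEMENT[base] for base in codon)


print(anticodon("UUU"))  # prints AAA, the phenylalanine example in the text
print(anticodon("AUG"))  # prints UAC
```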
By 1966, all the codons had been discovered. Since some codons had been identified as “stop” codons, scientists had begun searching for one or more possible “start” codons. Since all proteins were shown to begin with the amino acid methionine or a modified form of methionine (which is later removed), the methionine codon, AUG, was identified as the start codon for most proteins. It is interesting that AUG also codes for methionine when this amino acid occurs at other sites within the protein.
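The roles of start and stop codons can be illustrated with a small translation routine. The Python sketch below is a simplification under stated assumptions: `CODON_TABLE` holds only the assignments named in this article rather than the full 64-entry code, the function name `translate` is hypothetical, and translation is assumed to begin at the first AUG and proceed in non-overlapping triplets.

```python
# Partial table: only the codon assignments mentioned in this article.
CODON_TABLE = {
    "UUU": "Phe",  # phenylalanine
    "AAA": "Lys",  # lysine
    "CCC": "Pro",  # proline
    "GUU": "Val",  # valine
    "AUG": "Met",  # methionine, also the start codon
}
STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"


def translate(mrna):
    """Read triplets from the first AUG until a stop codon is reached."""
    mrna = mrna.upper()
    start = mrna.find(START_CODON)
    if start == -1:
        return []  # no start codon, so nothing is translated in this sketch
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # genetic "punctuation": end of the protein
        if codon not in CODON_TABLE:
            raise ValueError("codon not in the partial table: " + codon)
        protein.append(CODON_TABLE[codon])
    return protein


message = "GG" + "AUG" + "UUU" + "AAA" + "AUG" + "CCC" + "GUU" + "UAG" + "CCC"
print(translate(message))
# prints ['Met', 'Phe', 'Lys', 'Met', 'Pro', 'Val']
```

In the example, the first AUG sets the reading frame and the second AUG is read as an ordinary internal methionine, while the bases after UAG are ignored.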
The cracking of the genetic code gave scientists a valuable genetic tool. Once the amino acid sequence was known for a protein, or for even a small portion of a protein, knowledge of the genetic code allowed scientists to search for the gene that codes for the protein or, in some cases, to design and construct the gene itself. It also became possible to predict the sequence of amino acids in a protein if the sequence of nucleotide bases in a gene was known. Knowledge of the genetic code became invaluable in understanding the genetic basis of mutation and in attempts to correct these mutations by gene therapy. The discovery of the genetic code was therefore key to the development of genetics in the late twentieth and early twenty-first centuries, perhaps outshone only by the discovery of DNA’s double-helical structure in 1953 and the completion of the Human Genome Project in 2003.
Key terms
- anticodon: a sequence of three nucleotide bases on the transfer RNA (tRNA) that recognizes a codon
- codon: a sequence of three nucleotide bases on the messenger RNA (mRNA) that specifies a particular amino acid