The Code of Life: Little Words, Big Message
by Daniel Criswell, Ph.D. *
Most of us are impressed with the apparent intelligence of those who use big words in speeches or conversation. Even more impressive are those who actually know what these words mean, how to use them, and how to spell them! On the other hand, no one is ever going to accuse Huck Finn or Tom Sawyer of Ivy League intelligence based on their pronunciation and use of the "English" language. Any politician will confess that speeches frequently interspersed with "words" such as "yes'm," "an'um," and "duh" are not likely to capture the confidence of potential voters. Unless, of course, the speaker is someone as clever as Mark Twain could be with a pen.
By using slang and colloquial expressions, Twain demonstrated that the choice of words can communicate several levels of information about a character. Similarly, the "words" that comprise the code of life (the genetic code) also communicate several levels of information. The "words" of the genetic code form "sentences" called genes. The genes are poly-functional, able to produce more than one protein, depending on which direction the gene (sentence) is read, or where the gene starts and stops (Sanford 2005). In addition to coding for the correct protein, the letters that comprise the genetic code are organized in a way that minimizes errors in protein sequence and structure (Archetti 2004), helps to regulate the amount of protein produced by the cell (Archetti 2004; Ikemura 1985; Chamary and Hurst 2005), and possibly assists proteins in folding into the correct functional shape (Quinn 1975; Kimchi-Sarfaty et al. 2006, see endnote).
Many of us have watched enough television, or at least remember enough of our high school biology, to know that the substance with the information to form life has one of those big impressive names -- deoxyribonucleic acid, or DNA for those of us who prefer Tom Sawyer. DNA is the source of the three-letter words that determine what the life form will be and how it functions. The genetic code words are made from just four letters, A, C, G, and T, which correspond to the four nitrogenous bases, adenine, cytosine, guanine, and thymine. Because there are three letters in each code word and only four letters to choose from, the genetic code has just 64 (4³) words. Sixty-four words to spell out the information necessary to make all the forms of life on our planet!
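The count of 64 can be checked directly. The following is a minimal Python sketch that simply enumerates every ordered three-letter combination of the four bases; the variable names are illustrative only.

```python
from itertools import product

BASES = "ACGT"  # the four DNA letters named in the text

# Every ordered three-letter "word" that can be spelled with four letters.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))  # 4 * 4 * 4 = 64
print(codons[:4])   # ['AAA', 'AAC', 'AAG', 'AAT']
```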
How does this code work and how does this information indicate there must be a Creator responsible for it? Even an introductory investigation of the genetic code reveals several levels of information that must have come from an intelligent source. How the genetic code is translated into functional proteins that make life possible is similar to how an architect produces a blueprint of a house and then has someone deliver it to a contractor who builds the house. In a cell, DNA would be the blueprint; a similar nucleic acid, messenger RNA (mRNA) would be the messenger; and the cellular machinery for protein synthesis would be the contractor and his workers. In DNA, the four bases, A, C, G, and T, are arranged in a long chain or polymer to provide the blueprint for building a specific house, or make that protein. These letters are arranged in a chain with two strands forming a double-stranded molecule. One strand has the coding information and the complementary strand is used as a template to correct damage (mutations) to the coding strand.
DNA Coding Sequence GAGTAGCAGTCCCCACCTTGACGC
DNA Complementary Sequence CTCATCGTCAGGGGTGGAACTGCG
Notice that G pairs with C, and A pairs with T in the double-stranded DNA molecule. This complementary base pairing facilitates the transcription of a message from DNA to the cellular machinery through mRNA. To write a message to the protein synthesis machinery (the contractor) in the cell, the two DNA strands separate, and enzymes (proteins) construct a complementary mRNA strand, which differs from DNA by having a different base, U (uracil), in place of T (thymine).
DNA Coding Sequence GAG-TAG-CAG-TCC-CCA-CCT-TGA-CGC
mRNA Sequence CUC-AUC-GUC-AGG-GGU-GGA-ACU-GCG
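The pairing rules can be written out as a short Python sketch. It follows this article's convention, in which the mRNA is built as the base-by-base complement of the strand labeled "coding" here; strand direction is ignored for simplicity, and the function names are hypothetical.

```python
# Base-pairing rules from the text: G pairs with C, and A pairs with T.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_dna(strand):
    """Return the complementary DNA strand, base by base."""
    return "".join(DNA_PAIRS[base] for base in strand)

def transcribe(strand):
    """Return the complementary mRNA: same pairing, with U in place of T."""
    return complement_dna(strand).replace("T", "U")

def split_codons(sequence):
    """Cut a sequence into consecutive three-letter words."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]

coding = "GAGTAGCAGTCCCCACCTTGACGC"  # the example sequence from the article

print(complement_dna(coding))                      # CTCATCGTCAGGGGTGGAACTGCG
print("-".join(split_codons(transcribe(coding))))  # CUC-AUC-GUC-AGG-GGU-GGA-ACU-GCG
```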
The sequences are segmented in this example to show the three-letter "words" in the mRNA called codons that are responsible for taking the genetic code to the protein synthesis machinery in the cell. A protein is made from amino acids linked together in a chain. These chains can then be folded into filaments or globules depending on the particular function of the protein. If this were an actual protein, the first four amino acids would be leucine, isoleucine, valine, and arginine based on the four code words or codons, CUC, AUC, GUC, and AGG. There are just 20 amino acids typically found in living things and 64 codons. Because of this, most amino acids have more than one codon. Leucine and arginine have six codons each, while most of the other amino acids have two or four. For this reason the code has frequently been referred to as "redundant" and the third letter of each codon was once thought to be "junk" since this letter in many of the codons does not affect the amino acid chosen by the cellular machinery.
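To make the codon-to-amino-acid lookup concrete, here is a minimal Python sketch built on the standard genetic code table, written with one-letter amino acid symbols (L = leucine, I = isoleucine, V = valine, R = arginine, and `*` for a stop signal). The names `CODON_TABLE` and `translate` are illustrative, not taken from any library.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(codons):
    """Look up each mRNA codon and return the amino acids as one string."""
    return "".join(CODON_TABLE[codon] for codon in codons)

# The first four codons of the example mRNA.
print(translate(["CUC", "AUC", "GUC", "AGG"]))  # LIVR

# How many codons spell each amino acid (the "redundancy" of the code).
codons_per_amino_acid = Counter(CODON_TABLE.values())
print(codons_per_amino_acid["L"], codons_per_amino_acid["R"])  # 6 6
```

In the standard table, methionine (M) and tryptophan (W) are the exceptions with a single codon each, and three codons serve as stop signals.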
Does this mean the genetic code is redundant or is there additional information in these codons? Codons that are similar to each other correspond to amino acids with similar chemical properties. In fact, the most used codons are those that, when mutated, keep on coding for the same amino acid or an amino acid that has similar chemical properties (Woese 1965; Willie and Majewski 2004). Leucine, with six different codons, CUC, CUA, CUU, CUG, UUA, and UUG, provides a good example of how base substitutions might not affect the amino acid sequence in a protein. A mutation in the DNA sequence resulting in an mRNA change in the third letter for four of the leucine codons starting with cytosine (C) would not change the amino acid sequence. For example, from the sequence above CUC-AUC-GUC-AGG, a mutation that changes the codon CUC to CUA would still place leucine at the beginning of the amino acid sequence. This type of mutation is referred to as a synonymous or neutral mutation, causing no change in the protein sequence. A more interesting scenario would be if the first base in the second codon were changed from AUC to CUC. Leucine would substitute for isoleucine at the second position in this sequence. However, isoleucine, leucine, and valine all have very similar chemical properties and substituting these amino acids for each other might result in very minor changes in the structure and function of the affected protein. By contrast, arginine, an amino acid with quite different chemical properties from the other three in the example, also has a set of codons that are quite different. In most cases, it would take multiple mutations to change an arginine codon to a codon for one of the other three amino acids. The genetic code is arranged to minimize the effects of mistakes (mutations) in the synthesized protein and to reduce the occurrence of random changes in the organism.
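The leucine example can be worked through mechanically. This Python sketch, again assuming the standard genetic code table and using hypothetical function names, lists every single-letter change to a codon and picks out the ones that leave the amino acid unchanged.

```python
from itertools import product

# Standard genetic code (one-letter amino acid symbols, '*' = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def single_base_mutants(codon):
    """Yield the nine codons that differ from `codon` by exactly one letter."""
    for position in range(3):
        for base in BASES:
            if base != codon[position]:
                yield codon[:position] + base + codon[position + 1:]

def synonymous_mutants(codon):
    """Return the one-letter changes that still code for the same amino acid."""
    return sorted(
        mutant for mutant in single_base_mutants(codon)
        if CODON_TABLE[mutant] == CODON_TABLE[codon]
    )

print(synonymous_mutants("CUC"))               # ['CUA', 'CUG', 'CUU']
print(CODON_TABLE["AUC"], CODON_TABLE["CUC"])  # I L
```

For CUC, all three third-letter changes are synonymous, while the change from AUC to CUC swaps isoleucine (I) for leucine (L), as described above.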
The code also has information that determines the amount and rate of protein production. To assemble a protein, mRNA codons are "read" by another nucleic acid, transfer RNA (tRNA), which in turn correctly aligns specific amino acids in the newly forming protein. For the codon CUC, tRNA attaches leucine to the amino acid sequence. Each tRNA bonds to mRNA with a complementary anti-codon (GAG in this case). If the protein being synthesized has several leucine amino acids, synthesis will go faster if the mRNA codons are CUC and there is a large population of tRNA with a GAG anti-codon. The rate of protein synthesis will be much slower if there are many CUC codons for leucine and few tRNAs with a GAG anti-codon. This preference is called codon usage bias. Proteins that are produced in large quantities by the cell have mRNA codons that match the most common tRNA anti-codons available (Ikemura 1985). Proteins that are in low concentration in the cell do not utilize the codon bias towards the most common tRNA species available and consequently, are synthesized at slower rates (Archetti 2004; Ikemura 1985). The codon usage bias helps to regulate the amount of a particular protein produced in the cell. Synonymous mutations in DNA that change an mRNA codon, but do not change the amino acid sequence, potentially can cause changes in the amount of a specific protein in a cell by altering the speed that these proteins are produced, consequently altering cellular functions.
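As a rough illustration of the bookkeeping behind codon usage bias, the Python sketch below pairs a codon with its anti-codon and tallies how often each codon appears in a message. The short mRNA string is made up for the example, the function names are hypothetical, and the sketch does not attempt to model tRNA abundance or translation speed.

```python
from collections import Counter

# RNA base-pairing rules: A pairs with U, and G pairs with C.
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon):
    """Return the complementary anti-codon, written in the usual 5'-to-3' order."""
    return "".join(RNA_PAIRS[base] for base in reversed(codon))

def codon_usage(mrna):
    """Count how many times each complete three-letter codon occurs."""
    usable_length = len(mrna) - len(mrna) % 3  # ignore any trailing partial codon
    return Counter(mrna[i:i + 3] for i in range(0, usable_length, 3))

print(anticodon("CUC"))  # GAG

# Hypothetical message with four leucine codons: three CUC and one CUA.
usage = codon_usage("CUCCUCCUACUC")
print(usage["CUC"], usage["CUA"])  # 3 1
```

A message dominated by CUC, as in this made-up example, would lean heavily on tRNAs carrying the GAG anti-codon.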
Although the third base in many codons may not be important in determining the amino acid sequence, this position has information that affects the structure of mRNA (Shabalina, Ogurtsov, and Spiridonov 2006). Remember, the third letters in the leucine codons CUA, CUU, CUC, and CUG are synonymous sites, but each of these codons might produce different secondary structures. The mRNA secondary structure helps determine how long mRNA will last in the cell before being metabolized or degraded. The amount of protein a cell can make from mRNA is directly related to how long the mRNA persists in the cell. Synonymous mutations have been shown to affect the secondary structure and the decay rate of mRNA (Duan and Antezana 2003), which in turn affects how much of a specific protein is produced in the cell. Although the protein sequence is unaffected, altering the amount of a protein in the cell by changing mRNA secondary structure through "synonymous" mutations (e.g., CUA to CUU) is associated with diseases in humans (Duan et al. 2003; Capon et al. 2004). These disorders emphasize the importance of maintaining the sequence integrity of the "redundant" third letter in the codon, and how changing it affects normal cellular functions.
The current data indicate that all of the bases in the genetic code are important for producing the correct protein in the appropriate amounts in the cell, and these are just a few of the examples of the information contained in the DNA code. It may be that when all of the information is deciphered from the genetic code, terms such as "synonymous," "neutral," and "redundant" will be obsolete. Just as Twain's wit and humor, in written form, is evidence of intelligence, the words of the genetic code are evidence of an Intelligent Author, and this Author of Life has loaded the genetic code with much information using little three-letter words!
Quinn, a creationist, proposed a model of how a synonymous base substitution in mRNA (one that does not change the protein sequence) could alter the protein structure and consequently its function. Thirty-one years later, Kimchi-Sarfaty et al. provided evidence of this actually occurring in a cell.
- Archetti, M. 2004. Selection on codon usage for error minimization at the protein level. J Mol Evol 59 (3):400-15.
- Capon, F. et al. 2004. A synonymous SNP of the corneodesmosin gene leads to increased mRNA stability and demonstrates association with psoriasis across diverse ethnic groups. Hum Mol Genet 13 (20):2361-8.
- Chamary, J. V., and L. D. Hurst. 2005. Evidence for selection on synonymous mutations affecting stability of mRNA secondary structure in mammals. Genome Biol 6 (9):R75.
- Duan, J., and M. A. Antezana. 2003. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. J Mol Evol 57 (6):694-701.
- Duan, J. et al. 2003. Synonymous mutations in the human dopamine receptor D2 (DRD2) affect mRNA stability and synthesis of the receptor. Hum Mol Genet 12 (3):205-16.
- Ikemura, T. 1985. Codon usage and tRNA content in unicellular and multicellular organisms. Mol Biol Evol 2 (1):13-34.
- Kimchi-Sarfaty, C. et al. 2006. A "silent" polymorphism in the MDR1 gene changes substrate specificity. Science Express, December 21, 2006.
- Quinn, L. Y. 1975. Evidence for the existence of an intelligible genetic code. Creation Research Society Quarterly 11:188-198.
- Sanford, J. C. 2005. Genetic Entropy and the Mystery of the Genome. First ed. Lima, NY: Ivan Press.
- Shabalina, S. A. et al. 2006. A periodic pattern of mRNA secondary structure created by the genetic code. Nucleic Acids Res 34 (8):2428-37.
- Willie, E., and J. Majewski. 2004. Evidence for codon bias selection at the pre-mRNA level in eukaryotes. Trends Genet 20 (11):534-8.
- Woese, C. R. 1965. On the evolution of the genetic code. Proc Natl Acad Sci U S A 54 (6):1546-52.
*Dr. Daniel Criswell has a Ph.D. in molecular biology.
Cite this article: Criswell, D. 2007. The Code of Life: Little Words, Big Message. Acts & Facts. 36 (3).