Deciphering the sequence of the human genome is the molecular biology equivalent of landing a man on the moon. The project was discussed as early as 1985 and officially began as an international consortium of laboratories in 1990 with a projected completion date of 2005. Original countries participating in the Human Genome Project (HGP) included the United States, Great Britain, France, and Japan, with China and Germany joining the effort at a later date. Recently, a privately funded effort by Craig Venter's Celera Corporation promised to finish the genome and release it to the public in only three years. In February 2001, Venter kept his promise and Celera's draft sequence of the human genome was published in Science.1 The draft sequence of the publicly funded HGP was published in the same week in Nature.2 These sequences are a "rough draft" of the entire human genome and will require more work to close gaps and orient sequence fragments before a "finished" genome sequence can be made available.
By now some readers are probably wondering what exactly a genome is. Unfortunately, that question is not easily answered. At the simplest level, the genome is the entire DNA content of a given cell. DNA is a fibrous chemical made of long strings of nucleotides, which form the genes that code for the traits of the organism. Thus, the genome has come to be thought of as a "blueprint" or a "book" of life, in the sense that the genome is primarily responsible for the organism itself. Oversimplification of this idea has led to the notion that life is just the product of DNA, but as a creationist I reject this reductionistic concept. God creates organisms in Genesis 1 in a fully mature state; the genome therefore would seem to be a repository of information to ensure the continuation of the mature state of the organism. Based on the value God places on maturity, I would suggest that the nature and function of the adult human specifies the original content of the human genome.
The hyper-reductionistic view of genomes as the source of life is found largely in the popular press. Most biologists easily recognize the limitations of this perspective, and the U.S. funding agencies are already emphasizing applications of genomic information in the "post-genome era." In actuality, the "book" or "blueprint" analogy is useful in biology, for both books and blueprints are used to store information but are fairly inert and inactive without someone to read them and carry out their instructions. Thus, while the genome does not determine the organism, it is certainly an important source of information for the coding and operation of a human being.
The human genome is composed of 22 pairs of linear chromosomes called autosomes, and 2 special chromosomes (X and Y) called sex chromosomes. Each chromosome is a single piece of DNA, and human chromosomes range in size from approximately 45 million to 279 million nucleotides.3 In terms of mass, the finished sequence of the smallest human autosome (Chromosome 21)4 is 53.5 million times bigger than a single sugar molecule (glucose). DNA nucleotides come in four varieties: adenine, cytosine, guanine, and thymine, and it is the sequence of these nucleotides that contains the information that makes up our genes. The sequence may take the form of large-scale, tandemly repeated structures called heterochromatin, or it may consist of largely unique, complex sequences called euchromatin. Heterochromatic regions are found in the centromeres (interior "handles") and telomeres (end caps) of chromosomes, and in special structures called knobs. Genes are mostly found in the euchromatic regions of the genome. For technical reasons, heterochromatin is very difficult to sequence correctly, while euchromatin is relatively simple to sequence.
The Celera human genome sequence contains 2.91 billion nucleotides of human euchromatin and the HGP sequence contains 2.69 billion. The human genome is by far the largest genome ever completely sequenced. The next largest is from Drosophila, which weighs in at 120 million nucleotides of euchromatin, merely 4% of the size of the human genome.5 Celera and HGP scientists have mapped the sequence onto genetic and physical maps of the chromosomes, so that the physical location and orientation of each contiguous DNA sequence is known with high confidence. The size of the human genome is easily misinterpreted, however. One might think that a human has 25 times as much DNA as a fly because humans are so much larger and more complex. In fact, the amount of DNA in a genome appears to be uncorrelated with biological complexity. For example, the single-celled ciliate Paramecium caudatum possesses a genome of 8.6 billion nucleotides, more than twice as big as the human genome. One of the largest known genomes, 670 billion nucleotides, is found in the single-celled Amoeba dubia. Other complex multicellular organisms, such as the chicken, contain genomes that are substantially smaller than the human genome.6
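The size comparisons in this paragraph are simple ratios, and a short sketch makes them easy to check. The figures below are the ones quoted above; the dictionary and function names are illustrative only, and the comparison is rough because the human and Drosophila figures count euchromatin alone while the Paramecium and Amoeba figures are whole-genome sizes.

```python
# Genome sizes in nucleotides, as quoted in the text (not independent measurements).
HUMAN = "human (Celera euchromatin)"
GENOME_SIZES = {
    HUMAN: 2.91e9,
    "Drosophila (euchromatin)": 120e6,
    "Paramecium caudatum": 8.6e9,
    "Amoeba dubia": 670e9,
}


def size_relative_to_human(name, sizes=GENOME_SIZES):
    """Return the named genome's size as a multiple of the human figure."""
    return sizes[name] / sizes[HUMAN]


if __name__ == "__main__":
    for name in GENOME_SIZES:
        ratio = size_relative_to_human(name)
        print(f"{name}: {ratio:.3f} x human")
```

Run as a script, this prints one ratio per organism; the fly comes out at a few percent of the human figure, while the two single-celled organisms come out well above it, which is the point of the paragraph.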
One of the most impressive aspects of the human genome is our virtual ignorance of its contents. Using sophisticated laboratory and computational analyses, Celera scientists recognized 26,500 human genes, and predicted the locations of another 13,000. The HGP used a different methodology and predicted a total of 31,778 human genes. The good agreement of these two methods indicates that the human genome probably contains between 30,000 and 40,000 genes. Because of the complex structure of human genes, it is not possible at this time to give a definite gene count. The coding sequences of the 39,000 genes predicted by Celera occupy only 1.1% of the 2.91 billion nucleotides, and the HGP estimates that human genes occupy less than 5% of their draft sequence. The remainder of the genome is composed of genetic control regions, important chromosomal features (telomeres and centromeres), and a lot of DNA that we simply don't understand. Only 35% of the genome is composed of known repetitive sequences (such as Alu repeats and retroelements).7 These surprisingly small fractions of the genome that we understand merely highlight the vast amount of research that remains to be done before we truly comprehend the human genome. Knowing the sequence is just the beginning.
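To give a feel for how small the coding fraction is, the Celera figures quoted above can be combined in a few lines. This is only a back-of-the-envelope sketch: the constant and function names are illustrative, the gene total is the sum of the two rounded Celera counts (which the text rounds to 39,000), and the per-gene average is a derived number, not one reported in the publications.

```python
# Celera figures as quoted in the text.
CELERA_EUCHROMATIN_NT = 2.91e9    # nucleotides of euchromatin sequenced
CELERA_RECOGNIZED_GENES = 26_500  # genes recognized
CELERA_PREDICTED_GENES = 13_000   # additional predicted gene locations
CODING_FRACTION = 0.011           # 1.1% of the sequence is coding


def celera_gene_total():
    """Recognized plus predicted genes."""
    return CELERA_RECOGNIZED_GENES + CELERA_PREDICTED_GENES


def coding_nucleotides():
    """Approximate number of nucleotides occupied by coding sequence."""
    return CODING_FRACTION * CELERA_EUCHROMATIN_NT


def mean_coding_length_per_gene():
    """Rough average coding length per gene (a derived estimate)."""
    return coding_nucleotides() / celera_gene_total()


if __name__ == "__main__":
    print(f"genes: {celera_gene_total():,}")
    print(f"coding nucleotides: {coding_nucleotides():,.0f}")
    print(f"mean coding length per gene: {mean_coding_length_per_gene():,.0f}")
```

On these numbers the coding sequence amounts to roughly 32 million nucleotides out of 2.91 billion, which leaves the great bulk of the sequence in the categories described above.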
In addition to the philosophical considerations noted above, the human genome offers many exciting revelations for the creationist. First, the various human ethnic groups are more similar than scientists originally expected. By comparing the genome sequence with previously known human sequences already publicly available, Celera scientists discovered that humans differ by only one nucleotide for every 1250 in the genome. These differences are called Single Nucleotide Polymorphisms (SNPs). Some creationists will no doubt claim that this confirms the Genesis account of creation and use this statistic as a polemic against racism. What I find surprising about the SNP frequency is that the ethnic groups seem to be too similar. Based on what we know about modern mutation rates and human divergence times, the human genome should contain about twice as much diversity, assuming uniform mutation rates in the past. This stands as a fascinating challenge to creation scientists and no doubt holds an interesting clue to the divergence of organisms after the Flood.
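The SNP frequency translates directly into an approximate count of differences across the sequenced genome. The sketch below assumes, for illustration only, that the one-in-1250 rate applies uniformly across the sequence; the function name and the use of whole-number sizes are my own choices.

```python
# Sequence sizes in nucleotides, as quoted earlier in the article.
CELERA_EUCHROMATIN_NT = 2_910_000_000
HGP_DRAFT_NT = 2_690_000_000

# One differing nucleotide per this many nucleotides (Celera's figure).
NUCLEOTIDES_PER_SNP = 1250


def expected_snp_count(sequence_length, nucleotides_per_snp=NUCLEOTIDES_PER_SNP):
    """Approximate number of single-nucleotide differences in a sequence,
    assuming (as a simplification) that differences are evenly spread."""
    return sequence_length // nucleotides_per_snp


if __name__ == "__main__":
    print(f"Celera sequence: {expected_snp_count(CELERA_EUCHROMATIN_NT):,}")
    print(f"HGP sequence:    {expected_snp_count(HGP_DRAFT_NT):,}")
```

On this simplification the sequenced genome carries a little over two million single-nucleotide differences, which is small next to the nearly three billion nucleotides compared.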
Second, as one might expect from the low density of genes in the genome, only 1-4% of the SNPs occur in sequences that code for genes.8,9 Thus, the common creationist claim that point mutations almost always result in visibly harmful effects is not technically correct, since more than 95% of SNP mutations occur in non-coding regions of the genome, where they have little or no observable effect on the organism. This is certainly not good news for evolution, and SNPs are increasingly recognized as a poor method to generate biological diversity. New sources of mutation, such as chromosomal rearrangements (see below), gene transfer, and repeat sequences, are being explored as possible sources of mutation for the neo-Darwinian mechanism.
Third, Celera and HGP scientists noted a positional bias between Alu repeat sequences and human genes. The distribution of genes on the chromosomes is very similar to the distribution of Alu repeats. Wherever the genes are located, Alu repeats are not far away. Based on thirteen different lines of biological evidence, I recently proposed that repeat sequences and rapid, post-Flood diversification are causally linked.10 Based on this theory, I predicted that we should observe a positional bias between repeat sequences and important organismal genes. The human genome is therefore a confirmation of my theory of organismal diversification, and may yield important new insights into the mechanism of change in baramins (created kinds).
Fourth, the human genome shows a remarkable and complex internal structure. Celera and HGP scientists discovered that many genes in the human genome are duplicated, some more than once. While this result was expected, what was surprising was Celera's discovery that 1077 blocks of three or more genes are also duplicated as a group throughout the genome.11 These blocks of similar genes are found on every chromosome and create a very complex network of similarities. The existence of intra-genomic duplications is not unprecedented, since the Arabidopsis genome also showed internal duplications, though on a smaller scale.12 Whether these blocks are the products of God's creativity or historical changes in the genome remains to be seen, but the network of similarities certainly highlights the modular nature of the biological creation.
The human genome sequence is a remarkable accomplishment, and a fitting beginning to the science of the twenty-first century. We have certainly not heard the end of the Human Genome Project, as the secrets of the majority of the human genome have yet to be uncovered. With this genome sequence, we can look forward to a long and no doubt fruitful age of discovery, revealing the mysteries and wonders of God's creation.
- Venter, J.C., and 272 others, "The Sequence of the Human Genome," Science 291(2001):1304-1351.
- International Human Genome Sequencing Consortium, "Initial Sequencing and Analysis of the Human Genome," Nature 409(2001):860-921.
- International Human Genome Sequencing Consortium, ref. 2.
- Hattori, M., and 61 others, "The DNA Sequence of Human Chromosome 21," Nature 405(2000):311-319.
- Adams, M.D., and 194 others, "The Genome Sequence of Drosophila melanogaster," Science 287(2000):2185-2195.
- Li, W.-H. and D. Graur, Fundamentals of Molecular Evolution (Sunderland, MA: Sinauer Associates, Inc., 1991), p. 209.
- Venter et al., ref. 1.
- Venter et al., ref. 1.
- The International SNP Map Working Group, "A Map of Human Genome Sequence Variation Containing 1.42 Million Single Nucleotide Polymorphisms," Nature 409(2001):928-933.
- Venter et al., ref. 1.
- Wood, T.C., "The AGEing Process: Rapid Post-Flood Intrabaraminic Diversification Caused by Altruistic Genetic Elements (AGEs)," Origins (2001), submitted.
- Arabidopsis Genome Initiative, "Analysis of the Genome Sequence of the Flowering Plant Arabidopsis thaliana," Nature 408(2000):796-815.
* Dr. Wood is Assistant Professor at the Center for Origins Research, Bryan College, Tennessee.