Scientists announced the formal completion of the human genome Monday - a milestone marking the end of the first chapter of the genetics revolution and the dawn of a more arduous second chapter: figuring out the meaning of it all.
The next challenge will stretch far into the decades to come: determining the function of all 3 billion DNA letters, and understanding how those letters direct the growth, life, reproduction, disease and death of human beings.
To that end, government scientists Monday outlined their research road map for the future in a document to be published this week in the scientific journal Nature. Among the major initiatives is a project known as ENCODE, which will analyze 1% of the genome in detail to identify every functioning element it contains. Also highlighted was the HapMap project, which will catalog the minute genetic differences between people to help track down genes associated with complex traits, predispositions or diseases.
Members of the International Human Genome Sequencing Consortium paused to celebrate the completion of the genome Monday at a news conference in Bethesda, Md. The affair was more muted than the exuberant milestone of 2000, when the consortium declared that a working draft covering 90% of the genome was complete.
The new version is far more accurate and complete, covering 99% of the genome with an error rate of less than 1 in 10,000 nucleotides.
Eric Lander, director of the Whitehead Institute/MIT Center for Genome Research in Cambridge, Mass. - one of the leading genome-sequencing centers in the U.S. - said Friday that the genome's completion, following a frenzied month of tying up loose ends, felt "spectacular."
"Not to get soppy about this, but this is very much, I'm sure, what the people who went to the moon felt," Lander said. "You pinch yourself and say: It's really happened."
Yet ignorance about the genome still far outstrips knowledge, said Dr. Eric Green, scientific director for the National Human Genome Research Institute, the government body that coordinated the U.S. genome effort. Scientists know little about the estimated 30,000 human genes. [I see that estimate of 30,000 is holding firm, although the low figure shocked most geneticists when it was first announced.]
Many parts of the genome don't encode genes at all but may nevertheless be vitally important, playing roles in turning genes on or off, directing the replication of DNA or carrying instructions for making tiny strands of RNA that perform crucial jobs in the cell.
[A prediction of ID, that "junk DNA" would turn out to be functional, is borne out. I don't suppose that Darwinists will admit embarrassment.]
The ENCODE project aims to tease out in exquisite detail all these myriad functions for a tiny sliver of the genome. The 1% investigated by ENCODE, an acronym for Encyclopedia of DNA Elements, will include regions known to contain genes of interest and others selected randomly.
"We're throwing everything we can at it," Green said.
Ultimately, all of the genome will be probed in a process that could extend for decades.
The second major effort, the HapMap project, delves into a key question for medical genetics - what makes us different?
Although the structure and sequence of the human genome are basically the same for everyone, there exist many small variations in DNA from one person to the next. Such variations help explain why some people have greater propensities for heart disease or asthma or autism.
So far, genome researchers have identified 10 million genome sites housing tiny variations known as "SNPs." The HapMap project, announced in October, is intended to pinpoint the 100,000 variations deemed most useful for tracking down genetic traits and predispositions. The effort is expected to take three years.
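For a sense of scale, the figures quoted above can be run through some quick arithmetic. The sketch below is only illustrative: it assumes the roughly 3 billion DNA letters cited earlier in the article and treats the variation sites as if they were spread evenly along the genome, which real SNPs are not. The function names are hypothetical helpers, not part of any HapMap software.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the article.
GENOME_LETTERS = 3_000_000_000   # approximate genome size cited in the article
KNOWN_SNP_SITES = 10_000_000     # variation sites identified so far
HAPMAP_TARGET = 100_000          # variations HapMap intends to pinpoint


def letters_per_site(genome_letters: int, sites: int) -> int:
    """Average spacing between sites, assuming an even spread (a simplification)."""
    return genome_letters // sites


def percent_of(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percentage."""
    return 100 * part / whole


spacing = letters_per_site(GENOME_LETTERS, KNOWN_SNP_SITES)
share = percent_of(HAPMAP_TARGET, KNOWN_SNP_SITES)

print(f"About one known variation site per {spacing} letters")
print(f"HapMap targets about {share:.0f}% of the known sites")
```

On these numbers, a known variation turns up about once every 300 letters on average, and the HapMap target amounts to about 1% of the sites identified so far.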
Many other strands of study will be needed before the genome can offer its full promise to medicine and biological understanding. These include closely comparing the human genome with those of other creatures such as the mouse and chimpanzee, and solving the structure and function of the tens of thousands of proteins our genomes encode.
The human genome project was first seriously discussed in the 1980s - but at that time, many scientists believed it would be crazy to attempt such a huge and tedious-sounding task with the technology then available.
In those days, the art of "sequencing" - determining the precise order of the four chemical nucleotides along a string of DNA - was time-consuming and primitive. The notion that one could sequence an entire genome's worth of DNA and then have a prayer of assembling the information in the correct order was considered far-fetched.
But advancements in automated sequencing and computer technology allowed the effort to begin in 1990 and finish up more than two years ahead of schedule. Scientists in the United States, England, Germany, France, Japan and China contributed to the effort, which was largely conducted with public money. The results are available on the Internet to anyone who wishes to access them.
The consortium announced in 2000 the completion of a "working draft" of the genome. The announcement was preceded by a fierce race between the public consortium and a private sequencing effort by a Rockville, Md.-based biotech company, Celera Genomics.
The earlier draft had an error rate of 1 in 1,000 "letters," or nucleotides. It also had 150,000 "gaps": regions of the genome that had not been sequenced.
The completed version has only 400 gaps. Those gaps remain because, for unknown reasons, these portions of the genome have been very difficult to grow in large quantities in bacteria, a step that is required before a chunk of genome is sequenced. The pieces may be unusually shaped or toxic to the bacteria they are grown in.
Certain regions of densely packed, highly repetitive DNA that do not contain genes also have not been sequenced.
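The quality figures for the two versions are easier to compare side by side. The sketch below is a rough calculation, not an official consortium statistic. It assumes about 3 billion letters in total, the 90% and 99% coverage figures and the error rates quoted in the article, and errors spread uniformly across the sequenced portion. Because the finished version's rate is given as "less than 1 in 10,000," its result is an upper bound. The function name is a hypothetical helper.

```python
# Rough comparison of the 2000 working draft and the finished genome,
# using only numbers quoted in the article.
GENOME_LETTERS = 3_000_000_000  # approximate genome size cited in the article


def max_expected_errors(genome_letters: int, coverage_percent: int,
                        letters_per_error: int) -> int:
    """Letters expected to be wrong if errors are uniform over the covered part."""
    covered = genome_letters * coverage_percent // 100
    return covered // letters_per_error


draft_errors = max_expected_errors(GENOME_LETTERS, 90, 1_000)
finished_errors = max_expected_errors(GENOME_LETTERS, 99, 10_000)  # upper bound
gap_reduction = 150_000 / 400  # gaps in the draft vs. gaps in the finished version

print(f"Working draft: roughly {draft_errors:,} erroneous letters")
print(f"Finished version: fewer than about {finished_errors:,} erroneous letters")
print(f"Gaps reduced by a factor of {gap_reduction:.0f}")
```

Under these assumptions the draft would have carried roughly 2.7 million wrong letters and the finished version fewer than about 297,000, while the number of gaps fell by a factor of 375.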
Biologists say the effect of the genome on biology has already been huge.
In the pre-genome era, it was common for whole teams of graduate students to devote years to pinpointing and sequencing a gene of interest. Now that the genome is completed, the same information can often be had with the click of a computer mouse.
In the years since the genome project began, the identification of genes involved in human disease has exploded - from fewer than 100 in 1990 to more than 1,400 today.
The accomplishment is "stunning," said Dr. Leena Peltonen, professor of human genetics at UCLA. "It is some class of miracle.... When I was in medical school, there were even textbooks that stated the human genome could never be known in such detail because of its huge size and complexity."
However, she added, there are many basic discoveries still to be made, such as the precise number of genes that our genomes contain.
"We have a kind of toolbox," Peltonen said. "Now we have to transform that information into increased understanding."
File Date: 04.23.03