Post details: How to identify orphans in the human genome

01/17/08

04:17:04 pm, Categories: Literature - Articles

How to identify orphans in the human genome

When the working draft of the human genome was first published, the best estimate of the number of human genes was 35,000. With further analysis, this number fell to about 25,000, which stimulated a debate about why it was so much smaller than expected. A recent study has led to a further significant drop, to 20,500, after the researchers realised that some reported sequences might not be genes after all. ScienceDaily reports:

"To distinguish such misidentified genes from true ones, the research team, led by Clamp and Broad Institute director Eric Lander, developed a method that takes advantage of another hallmark of protein-coding genes: conservation by evolution. The researchers considered genes to be valid if and only if similar sequences could be found in other mammals - namely, mouse and dog. Applying this technique to nearly 22,000 genes in the Ensembl gene catalog, the analysis revealed 1,177 "orphan" DNA sequences. These orphans looked like proteins because of their open reading frames, but were not found in either the mouse or dog genomes."
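The filter described in that report can be pictured with a toy sketch. This is only an illustration of the idea, not the researchers' actual pipeline: the gene names and the table of hits are invented, and it assumes that a similar sequence in at least one of the two reference species is enough to count a gene as conserved.

```python
# Toy illustration of a conservation filter. The gene IDs and the hit table
# are hypothetical; the real study compared sequences across whole genomes.
REFERENCE_SPECIES = ("mouse", "dog")


def classify(gene_id, hits, reference_species=REFERENCE_SPECIES):
    """Return 'conserved' if a similar sequence was found in at least one
    reference species (an assumption of this sketch), otherwise 'orphan'."""
    found_in = hits.get(gene_id, set())
    if any(species in found_in for species in reference_species):
        return "conserved"
    return "orphan"


def find_orphans(gene_ids, hits):
    """Return the gene IDs with no similar sequence in any reference species."""
    return [g for g in gene_ids if classify(g, hits) == "orphan"]


# Hypothetical data: which species showed a similar sequence for each gene.
hits = {
    "geneA": {"mouse", "dog"},
    "geneB": {"dog"},
    "geneC": set(),
}
genes = ["geneA", "geneB", "geneC", "geneD"]

orphans = find_orphans(genes, hits)
print(orphans)
```

Note that a gene with no entry in the hit table at all is treated as an orphan, which is exactly the step the rest of this post questions: absence from other genomes is being used as the test of whether a sequence is a gene.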

The researchers write:

"The results above are consistent with the orphans being simply random ORFs, rather than valid human protein-coding genes. However, consistency does not constitute proof. Rather, we must rigorously reject the alternative hypothesis."

Two alternative explanations needed to be tested. The first was that the sequences were lost in the mouse and dog lineages yet retained in the lineage leading to humans. To explore this, the researchers "compared the orphan sequences to the DNA of two primate cousins, chimpanzees and macaques" and confirmed that they were absent there as well. This was taken as proof that the sequences have not been preserved in the primate lineage.

The second possibility is that the orphan sequences are novel to humans. The researchers write:

"If the orphans represent valid human protein-coding genes, we would have to conclude that the vast majority of the orphans were born after the divergence from chimpanzee. Such a model would require a prodigious rate of gene birth in mammalian lineages and a ferocious rate of gene death erasing the huge number of genes born before the divergence from chimpanzee. We reject such a model as wholly implausible. We thus conclude that the vast majority of orphans are simply randomly occurring ORFs that do not represent protein-coding genes."

The significance of this methodology lies in the insight it gives into how completely evolutionary theory dominates studies of the human genome. Common ancestry is adopted as the framework for interpretation. The "proof" that the orphans are not novel human protein-coding genes is that this would require a "prodigious rate of gene birth" after the human-chimpanzee split. This proof is justified by the theoretical framework (deduction) rather than by analysis of empirical data. Having adopted an evolutionary paradigm, the researchers sift the evidence to show that all the genes fit into a coherent pattern of birth and loss. Those that don't fit are simply deduced to be ORFs.

Erasing genes because they do not fit evolutionary theory

From an intelligent design (ID) perspective this logic is unimpressive, because the reasoning is circular, and that is a serious flaw in the research methodology. Empirical tests of whether these sequences are non-coding are complex, but they must involve experiments that explore whether the sequences are associated with gene products.
Grand comments like "There's no real creativity going on in the mammalian genome" are only valid if the research methodology is robust. Furthermore, 1,177 orphan sequences out of 20,500 genes represent over 5% of the human gene catalogue. It is worth bearing this figure in mind when the 1% difference figure is quoted for the human and chimpanzee genomes.
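For readers who want to check the proportion, here is a quick calculation using only the figures quoted in this post. The second denominator is an approximation, since the report says only "nearly 22,000" genes were screened.

```python
# Figures taken from the post; the 22,000 value is a rounded approximation
# of the "nearly 22,000" Ensembl genes that were screened.
orphans = 1177
revised_catalogue = 20500
ensembl_screened = 22000

share_of_revised = 100 * orphans / revised_catalogue
share_of_screened = 100 * orphans / ensembl_screened

print(f"{share_of_revised:.2f}% of the revised catalogue")
print(f"{share_of_screened:.2f}% of the screened Ensembl genes (approximate)")
```

Whichever denominator is used, the orphans come to more than 5% of the total.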
Finally, this research has stirred not a few ID scientists to predict that functionality will be found for some or all of these putative orphan genes. The situation is a variant of the "Junk DNA" story of 10 years ago, where evolutionary biologists were far too quick to infer functionlessness on the basis of their theory. Happily for science, the "Junk DNA" theory has had to be revised. Some of us are prepared to say that the same thing will happen in the case of human orphan genes.

Distinguishing protein-coding and noncoding genes in the human genome
Michele Clamp, Ben Fry, Mike Kamal, Xiaohui Xie, James Cuff, Michael F. Lin, Manolis Kellis, Kerstin Lindblad-Toh, and Eric S. Lander.
Proc. Natl. Acad. Sci. USA. 26 November 2007 | DOI: 10.1073/pnas.0709013104

Abstract: Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of ~24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation: the alternative hypothesis is that most of these ORFs are actually valid human genes that reflect gene innovation in the primate lineage or gene loss in the other lineages. Here, we reject this hypothesis by carefully analyzing the nonconserved ORFs - specifically, their properties in other primates. We show that the vast majority of these ORFs are random occurrences. The analysis yields, as a by-product, a major revision of the current human catalogs, cutting the number of protein-coding genes to ~20,500. Specifically, it suggests that nonconserved ORFs should be added to the human gene catalog only if there is clear evidence of an encoded protein. It also provides a principled methodology for evaluating future proposed additions to the human gene catalog. Finally, the results indicate that there has been relatively little true innovation in mammalian protein-coding genes.

See also:

Broad Institute of MIT and Harvard, Human Gene Count Tumbles Again, ScienceDaily (Jan. 15, 2008)

Image source - http://www.broad.mit.edu/news-images/erazegene.jpg
Image credit: Bang Wong, Broad Institute | iStockphoto
