Team Led By UCSC's Haussler to Reconstruct Whole Genome of Distant Mammalian Ancestor

COLD SPRING HARBOR, NY, Oct. 31 (GenomeWeb News) - It's technically possible to computationally reconstruct the genome of the ancestor of all placental mammals, according to David Haussler of the University of California, Santa Cruz, who is spearheading a collaborative effort to deliver the assembly of such a genome to the research community.


Haussler, a professor of biomolecular engineering at UCSC, said that an effort to "reconstruct the evolutionary history of each base in the human genome" from the time of the so-called Boreoeutherian ancestor, which lived around 75 million years ago, is "the grand challenge of human molecular evolution."


Speaking Saturday at the annual Genome Informatics conference here at Cold Spring Harbor Laboratory, Haussler outlined several pilot studies that he and his collaborators have conducted as proof of principle for such a project. The group published a paper in the December issue of Genome Research describing the use of comparative genomics to computationally reconstruct the CFTR locus, which encompasses more than 1 million base pairs and includes 10 genes, including the gene involved in cystic fibrosis.


Since then, Haussler said, he and his collaborators - including Webb Miller at Penn State University and researchers from the Broad Institute and the Baylor Human Genome Sequencing Center - have used similar methods to reconstruct a region of human chromosome 13q and are moving on to whole-genome reconstruction.


Work so far "indicates feasibility," Haussler said, with an overall accuracy of around 91.5 percent - a number he hopes to bring up to around 98 percent.


Haussler acknowledged that there is some skepticism about the accuracy of the reconstruction. But he said that he is confident in his team's method and validation process.


He and his colleagues have developed a software program to simulate the evolution of DNA over millions of years - statistically accounting for substitutions, insertions, deletions, and other polymorphisms that arise over time - and they test this program on a hypothetical ancestral DNA sequence, artificially evolving the DNA to create simulated "modern" DNA sequences for multiple species.


Then they use their computational reconstruction procedure, which is based on multiple alignments of many species, to work backwards and recreate the hypothetical ancestral sequence. They can then compare the two versions of the hypothetical ancestral genome to determine the accuracy of the method.
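
The simulate-then-reconstruct check described above can be illustrated with a toy sketch. This is not the team's software: it assumes a star-shaped phylogeny, models substitutions only (no insertions or deletions, so the sequences stay aligned), and replaces the real alignment-based reconstruction with a simple per-column majority vote. The function names `evolve`, `reconstruct`, `accuracy`, and `run_trial`, and all parameter values, are hypothetical.

```python
import random
from collections import Counter

BASES = "ACGT"


def evolve(seq, sub_rate, rng):
    """Return a copy of seq in which each base is substituted with probability sub_rate.

    Assumption: substitutions only, so every descendant stays aligned to the ancestor.
    """
    out = []
    for base in seq:
        if rng.random() < sub_rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def reconstruct(descendants):
    """Predict the ancestor by majority vote over each aligned column.

    A stand-in for a real phylogeny-aware method; ties go to the base seen first.
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*descendants))


def accuracy(true_seq, predicted_seq):
    """Fraction of positions at which the two equal-length sequences agree."""
    return sum(a == b for a, b in zip(true_seq, predicted_seq)) / len(true_seq)


def run_trial(length=2000, n_species=11, sub_rate=0.2, seed=0):
    """Evolve a random ancestor into n_species descendants, reconstruct it, and score it."""
    rng = random.Random(seed)
    ancestor = "".join(rng.choice(BASES) for _ in range(length))
    descendants = [evolve(ancestor, sub_rate, rng) for _ in range(n_species)]
    return accuracy(ancestor, reconstruct(descendants))


if __name__ == "__main__":
    for n in (1, 3, 11):
        print(n, "species:", round(run_trial(n_species=n), 3))
```

Because the hypothetical ancestor is known exactly, the reconstruction can be scored directly against it. Even in this simplified setting, the score tends to rise as more simulated species are added.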

After applying the reconstruction process to real genomic sequences, the team validates its predicted ancestral genome by simulating the evolution for organisms that are not included in the group from which the ancestral genome was derived. They can then compare the simulated evolved genome to the real one to gauge the accuracy of the predicted ancestral genome.  


Haussler said that scaling the project up to the entire human genome is a "captivating" prospect that would provide valuable insights into human evolution, but would ideally use the full genome sequences of around 20 mammals - far more than the National Human Genome Research Institute plans to sequence to completion.


"We have to keep sequencing genomes," he said.


Penn State's Miller told GenomeWeb News that the project is proceeding even with incomplete data, however. "We're not waiting," he said. "We're going as fast and furious as we can."


Miller said that the project is using the genomes of 11 organisms so far - human, chimp, rat, mouse, dog, macaque, rabbit, cow, armadillo, elephant, and tenrec. Even though most of these have only been sequenced to very low coverage, Miller said that the team has already used this data to reconstruct the entire ancestral genome "a few times," although the results are still "preliminary."


Estimates for the length of the project vary, with some involved saying it could take as long as two years for a completely assembled ancestral genome to reach the public domain. Other sources involved in the project said an initial draft assembly could be available within the next six months.
