To predict the function of a gene or protein, according to EraGen co-founder Steve Benner, you’ve got to understand its history and go beyond the notion that homology implies equivalent functions. Benner said EraGen’s Master Catalog 2.0, which began shipping last week, does just this.
The Master Catalog clusters sequences by “relatedness,” providing a picture of how function is conserved or diverges in the context of the evolutionary tree.
Using multiple sequence alignment, it fosters the analysis of “natural annotation” — proteins that interact with each other often display episodes of rapid sequence evolution at the same time in evolutionary history.
EraGen recently announced a collaboration to market the Master Catalog along with Genome Therapeutics’ Pathogenome, a comprehensive database of medically significant microbial and fungal genome sequences.
Benner said that biology in the post-genome era requires the fusion of two separate scientific traditions: that of physical scientists, who develop mathematical and physical models to describe phenomena such as the structure of molecules; and that of natural historians, who operate by collecting and classifying information. Natural history operates in arenas such as evolutionary biology, where systems are too complex to be usefully predicted by physical and computational models.
As an example of how these distinct approaches can be successfully combined, Benner cited structural biology. Initial attempts to predict protein folding using purely physical methods foundered on the sheer complexity of the molecules. But evolutionary biologists contributed the idea that the folds that contribute most to a protein’s “fitness” in terms of natural selection would tend to be conserved over time.
From the differences between how proteins actually evolved and how they would have changed if folds were not conserved, researchers could tease out a signal containing information about the protein structure. About 30 protein folds have been predicted in the past decade using this approach, Benner said.
However, in genomics, the insights of evolutionary biology have yet to be fully put to work, Benner contended. “Remarkably little of the NIH-funded biomedical research program considers the history.”
As a consequence, the NIH has not made good use of historical information that could have sped its research, Benner said.
“Genomics programs are captured by computational people, not the biologists who are natural historians,” he added. “They’re very focused on one atom or molecule, not taking the long view.”
Benner cited the excitement surrounding the discovery of leptin’s association with obesity in mice as an example of the pitfalls of ignoring the lessons of evolution. “You have to correlate events in the molecular record with the paleontological record,” he said.
When proteins change function, the historical record leaves a signature of that change. In the Master Catalog, this is detected as an increase in the ratio of non-synonymous to synonymous substitutions in the DNA sequence; that is, an acceleration in the rate at which the amino acid sequence of the encoded protein evolves.
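To make the distinction between the two kinds of substitution concrete, here is a minimal Python sketch that compares two aligned coding sequences codon by codon and counts how many differences leave the encoded amino acid unchanged (synonymous) and how many alter it (non-synonymous). This is an illustration only and is not drawn from the Master Catalog: the function name `count_substitutions` and the example sequences are hypothetical, and the sketch assumes uppercase, gap-free sequences of equal length read in frame under the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Count codon differences between two aligned coding sequences.

    Returns (synonymous, non_synonymous). Hypothetical helper: it assumes
    uppercase A/C/G/T only, no gaps, and both sequences in the same frame.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3]
        codon_b = seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # no substitution in this codon
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1       # DNA changed, amino acid did not
        else:
            non_synonymous += 1   # DNA change altered the amino acid
    return synonymous, non_synonymous


# Made-up example: GCT -> GCC keeps alanine; AAA -> AGA swaps lysine for arginine.
print(count_substitutions("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```

These raw counts are cruder than the ratio described above. A proper estimate would also normalize by the number of synonymous and non-synonymous sites available in each sequence and correct for multiple substitutions at the same site, and an increase would be judged along individual branches of the evolutionary tree rather than from a single pairwise comparison.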
One major obstacle to fruitful interaction between scientists taking the natural history and physical science approaches to their work, said Penn State evolutionary biologist Blair Hedges, has been that they usually come from different educational backgrounds and don’t generally run into each other.
“Molecular biologists usually have strong training in chemistry and biochemistry. Evolutionary biologists tend to come out of backgrounds in organismal biology,” he said. “They’re two separate camps. But that’s changing a bit nowadays; there’s more of an emphasis on interdisciplinary research.”
Hedges credits the NASA Astrobiology Institute, founded in 1998, with bringing a varied group of researchers together. Hedges and Benner both belong to NAI’s Evolutionary Genomics group, which stays in touch through workshops and videoconferences.
“The greatest value of the NAI may be to change the paradigm of biomedical research,” Benner agreed.