Each year, RECOMB’s stream of Bayesian networks, hidden Markov models, and Gibbs samplers is punctuated by a series of invited talks that offer a higher-level view of computational biology. Attendees perennially rank these keynote speakers among the highlights of the meeting: Not only do the talks help them view their work within a broader context, but they often inspire the audience to consider entirely new problem areas for future research.
Edward Trifonov, a pioneer in DNA sequence pattern analysis who now heads the Genome Diversity Center at the University of Haifa in Israel, kicked off this year’s meeting with the Stanislaw Ulam Memorial Computational Biology Address, entitled “Poetry and Prose of the Sequences.”
“The sequences are beautiful,” Trifonov said, explaining that just like human language, sequence information conveys messages that are much more subtle than a reader or listener may pick up from the order of the letters and words. As an example, he noted that one could detect “periodicities” in DNA sequence much like rhyming or rhythmical patterns in poetry. Far from being an esoteric exercise, he said, this approach led him to a biological hypothesis: Seeing a strong tendency for G to appear in the first position of a three-base periodicity, Trifonov concluded that the very first genes to evolve were probably GGC and GCC triplets repeated three to eight times, which would make the first proteins strings of glycine and alanine.
Moving from the sublime to the practical, Terry Speed of the University of California, Berkeley, provided the audience with a crash course in “low-level” analysis of microarray data — “not the sexiest game in town,” he admitted, but essential for precise and accurate analysis. Sporting a T-shirt that read, “Macro Statistics for Microarrays,” Speed stressed the importance of careful pre-processing for microarray data, including background subtraction and normalization. Of utmost importance, he said, is ensuring that the data used for the final analysis actually has biological relevance. “You can’t just use a fancy black-box machine-learning technique — many of which were invented by people in this room — to find biological meaning,” he chided.
Just two days before the final version of the human genome sequence was made available through the University of California Santa Cruz genome browser and other resources, UCSC’s David Haussler offered a glimpse of his vision for the future of computational genomics. Describing genome browsers as “computational microscopes,” he explained that the ultimate goal for UCSC and other genome browser providers is to offer “a continuously tuned engine for discovery” with “a new view of the genome every day.” Grand challenges following the completion of the Human Genome Project, according to Haussler, include “reconstructing the evolutionary history of each base in the human genome.”
Other invited speakers ventured into the realm of pure biology. Nobelist Christiane Nüsslein-Volhard of the Max Planck Institute for Developmental Biology delivered the Distinguished Biology Lecture on the use of the zebrafish as a model organism, while Dieter Oesterhelt, also of Max Planck, offered insight into the study of halobacteria, a type of archaea that thrives in salt ponds.