ATLANTA — Epigenomics, phylogenetics, and modeling were among the most popular themes at this year’s Georgia Tech-Oak Ridge National Laboratory International Conference on Bioinformatics, though the meeting also covered gene regulation, misalignments in whole-genome mapping, CpG islands, microRNA targets and other topics.
Such themes were among a handful trumpeted by Georgia Tech provost Gary Schuster, who said during an introductory talk that while bioinformatics was only a “fledgling field” a decade ago, “today, in a lot of ways, it has center stage. It was looking for definition and now, in some ways, it defines itself.”
Schuster said that in the recent past, chemists were leading modern biology, but now it’s the other way around. “It was once thought that eventually chemistry would solve everything, but now the question is, ‘Will chemistry eventually solve biology?’” Schuster asked.
He argued that while chemistry once pushed biologists, informatics is now the driver. Computer modeling and simulation are making “huge contributions” to the life sciences and “will accelerate the pace of discovery faster than anything you’ve seen,” said Schuster.
Several presenters and speakers were focused on computer modeling, one of the most popular themes in the Saturday sessions.
Jun Liu of Boston University, one of three speakers focusing solely on modeling, discussed how her team has used a software tool called Reduce to study predictive models of gene regulation. According to Liu, Reduce uses a simple regression model to “shuffle” gene names to find up to 300 motifs and to confirm results.
Reduce, developed by the Bussemaker lab at Columbia University, is designed to support calculations that are not possible to do by hand, among other highlights.
Reduce is available freely here.
Joel Bader, of the biomedical engineering department at Johns Hopkins University, discussed two research projects involving modeling, including one that uses graph diffusion models.
Bader’s team is developing graph kernels, which identify similarities between two objects, to look at vertices that “share a lot of neighbors.” Specifically, his group is using kernels for multi-gene searches in gene expression studies, the results of which he shared in his talk.
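The idea of scoring vertices by how many neighbors they share can be illustrated with a minimal sketch. It is not Bader’s method: it uses the simplest possible similarity, a count of common neighbors, which equals the off-diagonal entries of the squared adjacency matrix. The function name and the toy gene network below are hypothetical and are not data from the talk.

```python
from itertools import combinations


def shared_neighbor_counts(edges):
    """Count common neighbors for every pair of vertices in an undirected graph."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    counts = {}
    for a, b in combinations(sorted(neighbors), 2):
        # Similarity of a and b = number of vertices adjacent to both.
        counts[(a, b)] = len(neighbors[a] & neighbors[b])
    return counts


# Hypothetical toy network; gene names are placeholders.
edges = [("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g2", "g4"), ("g2", "g5")]
counts = shared_neighbor_counts(edges)

# Rank pairs from most to fewest shared neighbors.
for pair, score in sorted(counts.items(), key=lambda item: -item[1]):
    print(pair, score)
```

In this toy network, g1 and g2 are not directly connected, yet they score highly because both are linked to g3 and g4. A diffusion kernel extends the same intuition to longer paths through the graph.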
Peter Good, a program officer at the National Human Genome Research Institute, spoke about the various epigenomics funding projects currently underway at the National Institutes of Health.
One such project will use a $1.5 million NIH grant to build an epigenomics data-analysis and -coordination center (EDACC) for all reference epigenome mapping centers, and to import all data generated through the NIH Roadmap Epigenomics program.
Good said that the project intends to fund three to five mapping centers, which are expected to generate a great deal of data, though he declined to say how much. He did not have time to comment further prior to press time.
Good said the NIH is funding the EDACC because “there will be a lot of data [according to] the way the [NIH Roadmap] program is set up.”
The request for applications for the data center is available here, and additional funding opportunities for the Epigenomics Roadmap program are available here.
Phylogeny for Dummies?
Another area taking center stage at the conference – as evidenced by the number and variety of talks and posters that promoted it – was how biologists can better tap tools to understand the evolutionary divergence of species, and other aspects of phylogeny.
Jean-Michel Claverie of Université de la Méditerranée in Marseille, France, described his platform, called Phylogeny.fr, as “phylogeny for dummies.” He said the tool, which is available here, provides access to algorithms his team developed that allow biologists with little or no formal bioinformatics training to elucidate the functions and evolutionary patterns of many genes in popular databases.
With Phylogeny.fr, users can choose among three modes: a one-click mode for uploading FASTA files; an advanced mode, in which one manually sets parameters; and an à la carte mode, which includes a mix-and-match variety of applications such as visualization and alignment tools.
After the conference, one delegate told BioInform that there were numerous problems with Claverie’s simplification of phylogeny.
“There are a lot of arguments about how you construct the phylogenic trees because it’s how you understand evolution,” said one attendee who asked for anonymity because of confidentiality concerns at her company. “But it’s not as easy an approach” as Claverie presented.
“What he was suggesting is he would do a selective representation for the tree … for huge amounts of samples, … but it’s impossible to do an alignment for 100,000 sequences,” said this person. “If he takes all the available sequence data that is available in all available organisms, it’s a very difficult thing to do. One must calculate the distance between every pair of sequence[s].”
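The attendee’s point about pairwise distances can be made concrete with a little arithmetic. The sketch below only counts the comparisons a naive all-against-all distance step would need, n(n - 1)/2 for n sequences. It does not describe how Phylogeny.fr or any particular alignment program works, and the function name is illustrative.

```python
def pairwise_comparisons(n):
    """Number of distinct unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (100, 10_000, 100_000):
    print(f"{n:>7,} sequences -> {pairwise_comparisons(n):>13,} pairwise distances")
```

At 100,000 sequences, the count is just under five billion pairs, which is why the number of distances, not the number of sequences, dominates the cost of this step.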
There is also the question of competition, and the fact that other sites offer multiple-alignment algorithms, including ClustalW and MUSCLE (multiple sequence comparison by log-expectation), which “have been in use for years.”
In an interview with BioInform, Claverie called the criticism “appalling,” noting that a scientific conference is designed to promote free speech.
Also discussing phylogenetics was Steven Salzberg of the University of Maryland Center for Bioinformatics and Computational Biology at College Park. While Salzberg’s primary topic was genome organization in bacteria, he discussed how his work has included both transcription “terminators” and the study of overlapping genes via biological, rather than algorithmic, results.
Discussing transcription, Salzberg said that when genes are transcribed together “very often they have related functions.”
When discussing what he called “the evolutionary dynamics of overlapping genes,” he said that his phylogenetic approach starts with a phylogenetic tree of prokaryotes to facilitate the study.
On a related topic, a team from the National Center for Biotechnology Information presented a poster on an evolutionary model of the regulatory ACT domain superfamily with inferences derived from CDTree, a software application.
CDTree, available for free here, is designed to help classify protein sequences; supplies a web-based service to permit user interaction with pre-defined protein domain hierarchies; and offers an integrated software environment to assimilate large amounts of data via a suite of analysis methods.