Rice U Team to Use $1.1M NSF Grant for Cloud-compatible Bayesian Tools for Evolutionary Studies


NEW YORK (GenomeWeb) – Two research groups from the computer science department at Rice University will use a three-year, $1.1 million grant from the National Science Foundation to develop cloud-based statistical software for analyzing evolutionary patterns.

Specifically, Christopher Jermaine and Luay Nakhleh, who are both associate professors of computer science at Rice, will use the NSF funds to create open-source cloud software that uses Bayesian inference techniques to track how genes and genomes evolve across species, and to make the software broadly available to the research community.

In practice, being able to run analyses in parallel and to access thousands of computers quickly in the cloud will significantly shorten the time to results, according to the developers. "We're talking about potentially taking a years- or decades-long computation and making it feasible by changing the underlying algorithm and making it amenable to distributed computing," Jermaine said in a statement. Moreover, it would provide a potentially cost-effective alternative to purchasing and running large local clusters, they said. It could even appeal, they believe, to researchers who have mainframes in-house because of the potential for parallelized analysis.

An otherwise powerful technique for estimating evolutionary history in phylogenetics studies, Bayesian inference is computationally impractical for large datasets, according to Nakhleh. "Analyzing data sets with 10 or 20 gene sequences can easily take hundreds of hours," he said in a statement. "But the tree of life has millions of sequences and is built from millions of species. There's no way traditional Bayesian techniques are even going to get close to handling that." It's currently infeasible, for example, to use these techniques to build trees composed of thousands of taxa or species, Nakhleh told BioInform.
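
One general reason for this scaling problem, not specific to the Rice project, is that the number of candidate tree shapes grows super-exponentially with the number of taxa. The sketch below uses the standard count of unrooted, fully bifurcating topologies for n labeled taxa, (2n-5)!!; the function name is illustrative and is not part of any tool described in the article.

```python
def count_unrooted_topologies(n_taxa: int) -> int:
    """Number of unrooted binary tree topologies on n labeled taxa: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        print(f"{n} taxa: {count_unrooted_topologies(n):.3e} topologies")
```

Ten taxa already allow roughly two million topologies, and a Bayesian sampler has to explore this space while also estimating branch lengths and model parameters, which is why larger trees quickly become impractical on a single machine.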

Parallel and distributed computing infrastructure offers a solution to the intensive computation needs of phylogenetics researchers; however, very little research has explored the potential of this kind of infrastructure for these kinds of studies, Jermaine said. "There's a reasonably large amount of work on cloud-based Bayesian learning, but it's almost all for data analytics, not for biological applications," he told BioInform. For example, he and his colleagues have developed a system that lets users "write and execute codes for large-scale Bayesian models," he explained, adding, however, that on the whole "there are not many papers describing cloud-based phylogenetics tools, and I think it's safe to say that [nothing] has been targeted to Bayesian phylogenetics in particular."

The NSF grant will enable the Rice researchers to expand existing Bayesian methods and make them more amenable to parallel and distributed computing systems like the cloud. Over the next three years, they'll work on mathematical modeling and algorithm development, implementing and running the software on distributed systems, refining it to remove bottlenecks, and finally publishing the software.

"We want to deliver something that’s very easy to use," Jermaine said, so "that somebody can just boot up a machine instance on Amazon [for example]" and then with "a couple of key strokes, fire up a cluster under that machine's control and then run whatever they want to run."
