Cloud Computing Tool for RNA Sequence Analysis

By Matthew Dublin

Using a grant from Amazon Web Services and the National Institutes of Health, researchers at the Johns Hopkins Bloomberg School of Public Health have developed an RNA sequencing data analysis program for the cloud called Myrna. The new software calculates differential gene expression in large RNA-seq datasets by using Bowtie, an ultrafast, memory-efficient short read aligner, and R/Bioconductor for statistical calculations. These tools are combined in an automatic, parallel pipeline that runs either in the cloud using Elastic MapReduce or on a local Hadoop cluster.

"Cloud computing makes economic sense because cloud vendors are very efficient at running and maintaining huge collections of computers. Researchers struggling to keep pace with their sequencing instruments can use the cloud to scale up their analyses while avoiding the headaches associated with building and running their own computer center," says Myrna developer Ben Langmead, a research associate in the Bloomberg School's Department of Biostatistics. "With Myrna, we tried to make it easy for researchers doing RNA sequencing to reap these benefits."

Langmead and his colleagues used the software to process a large collection of publicly available RNA sequencing data. Using Amazon Web Services' cloud, Myrna calculated differential expression from 1.1 billion RNA sequencing reads in less than 2 hours at a cost of about $66.
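
To put those figures in perspective, the short Python sketch below works out the implied cost per million reads and the minimum throughput. It uses only the numbers reported above; the variable names are illustrative, and because the run took "less than 2 hours," the throughput is a lower bound rather than a measured rate.

```python
# Back-of-the-envelope arithmetic from the figures reported in the article.
# Assumption: the 2-hour figure is an upper bound on runtime, so the
# throughput computed here is a lower bound, not a measured rate.

total_reads = 1_100_000_000   # 1.1 billion RNA sequencing reads
max_hours = 2                 # run finished in less than 2 hours
total_cost_usd = 66.0         # approximate cost of the run

# Cost per million reads processed.
cost_per_million_reads = total_cost_usd / (total_reads / 1_000_000)

# Minimum number of reads processed per hour.
min_reads_per_hour = total_reads / max_hours

print(f"Cost per million reads: ${cost_per_million_reads:.2f}")
print(f"Throughput: at least {min_reads_per_hour:,.0f} reads per hour")
```

That comes to roughly six cents per million reads, at a rate of at least 550 million reads per hour.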