RHadoop Project = Big Data Analytics with R and Hadoop

In the video below, data scientist and RHadoop project lead Antonio Piccolboni introduces Hadoop and explains how to write map-reduce statements in the R language to drive a Hadoop cluster. The RHadoop project is an open-source initiative that aims to make it easier for researchers to extract data from Hadoop for analysis with R and to run R on the nodes of a Hadoop cluster.

Although it is roughly two years old, a paper by Ronald Taylor of Pacific Northwest National Laboratory provides a thorough roundup of Hadoop use in bioinformatics.

      Matthew Dublin is a senior writer at Genome Technology.