Dan Koboldt at MassGenomics says the sheer volume of data generated by next-generation sequencing has created problems for the bioinformatics community. Tools for analyzing sequencing data need to be efficient, flexible, and scalable, he says. A key development in this field was the Sequence Alignment/Map (SAM) format, and now researchers at the Broad Institute have released the latest advance: the Genome Analysis Toolkit (GATK). "Essentially, GATK is a foundation of code that takes advantage of the SAM/BAM input format to simplify many of the common requirements for data analysis tools," Koboldt says. It can accommodate reads from any sequencing platform, support most sequence aligners, and recognize public database formats, he adds. "It strikes me that frameworks such as this, coupled with the latest 4-core, 8-core, even 50-core CPUs, may finally be bioinformatics' answer to the challenge of massively parallel sequencing," Koboldt says.
A Look at GATK
Aug 12, 2010