Nature Biotech recently spoke to developers of some of the most widely used bioinformatics software packages, including BLAST, Bioconductor, Galaxy, MAQ, and IGV, to discuss what makes certain tools successful.
A commentary in the journal notes that the "secret sauce" for useful software includes five ingredients: "developers must possess sufficient proximity to and understanding of the research problem at hand; timing of the software release should correspond with the emergence of the problem in the research community that it addresses; software should have extensibility and interoperability; the algorithm implemented by the software should ideally be novel and indicative of profound insight; and, finally, a broad range of users should be able to run and operate the program."
Cole Trapnell, who helped develop the TopHat/Cufflinks suite of short-read sequence analysis tools, admits that the success of those tools was largely the result of being in the right place at the right time. "We were stepping into fields that were poised to explode, but which really had a vacuum in terms of usable tools," he says. "You get two things from being first. One is a startup user base. The second is the opportunity to learn directly from people what the right way, or one useful way, to do the analysis would be."
Heng Li, a developer of MAQ, BWA, and SAMtools, agrees that there is a first-mover advantage for software. "When MAQ came out, there was no other software that could do integrated mapping and SNP calling," he tells Nature Biotech. "BWA was among the first batch of Burrows-Wheeler–based aligners (BWA, Bowtie and SOAP2 were all developed at about the same time). Similarly, SAMtools was the first generic SNP caller that worked with any aligner, as long as the aligner output [was] SAM format."
The developers also discussed a recurrent issue in bioinformatics: the proliferation of software tools that all do essentially the same thing. James Taylor, a developer of the Galaxy sequence analysis package, notes that one problem here is the lack of incentives for improving existing software instead of inventing something new. "There is also a problem with discovering software that exists; often people reinvent the wheel just because they don't know any better," Taylor says. "Good repositories for software and best practice workflows, especially if citable, would be a start."