What do you get when you cram 20 biohackers in a stuffy room for three days? A new database spec, of course.
By Bernadette Toner
After three days of forsaking the countryside in favor of a windowless room, a whiteboard, and a bank of computers, the 20 participants of Part I of the first-ever biohackathon wrapped up with a new specification for accessing biological sequence databases across multiple languages and platforms.
The specification, Open Bio Database Access (OBDA), will give bioinformatics developers a library of modules to ensure seamless access to sequence databases. The spec is being incorporated into the four primary open-source bioinformatics projects: BioPerl, BioPython, BioJava, and BioRuby.
Organized and sponsored by Electric Genetics and O’Reilly and Associates, with help from a few commercial sponsors, the biohackathon kicked off the O’Reilly Bioinformatics Technology Conference in Tucson, Ariz., in late January. But while most early arrivals for the conference spent the weekend hiking in the nearby Santa Catalina Mountains or basking in the Arizona sun, the 20 biohackers spent three full days coding diligently under the fluorescent lights of a stuffy hotel conference room.
The consensus: it was all worth it.
“This is a brilliant example of how well open source works,” says Antoine Van Gelder, chief bioinformatics programmer at Electric Genetics. “The level of productivity here has eclipsed anything I’ve seen in my life.”
Other participants agreed that the results of the event far exceeded their expectations and that there was a surprising level of agreement among programmers, many of whom met face-to-face for the first time after months or even years of communicating electronically.
“It’s good to be able to just shout at people instead of waiting for them to wake up and answer their e-mail,” jokes Ewan Birney of the European Bioinformatics Institute, who served as technical leader of the event.
Participants made the most of their rare opportunity to work together, share ideas, and engage in the occasional debate. A whiteboard set up Saturday morning was soon filled with a “magic matrix” of projects and modules. Developers set to work with a degree of concentration and cooperation not typically available to them, checking off their assigned boxes on the whiteboard as they completed each component.
“It was great to not have to worry about work or school and just have time to focus,” says Andrew Dalke of Dalke Scientific Software.
By the end of the event, more than three-quarters of the boxes had been checked off, about twice as many as expected. The rest of the work, as well as a full performance test, was to be completed at the second part of the biohackathon in Cape Town, South Africa, at the end of February.
Birney says the OBDA standard should benefit system administrators, database operators, and application developers who currently have to draw from several in-house or third-party tools to access sequence data reliably from multiple sources.
“We have 12 different systems for this problem at the Sanger Centre,” Birney says. Once the library is fully tested, developers will have ready-made modules on hand that should greatly simplify the process.
Birney adds that a usable OBDA specification should be available this spring.
This article originally appeared on GenomeWeb.com.