Researchers affiliated with the University of California, San Diego, have received a three-year, $1.4 million grant from the National Science Foundation to develop bioinformatics tools for the Kepler open-source scientific workflow project.
The funds, awarded to the San Diego Supercomputer Center and the California Institute for Telecommunications and Information Technology, will be used to develop a next-generation sequence data-management module, called bioKepler, that will be able to perform an array of bioinformatics tasks using distributed execution techniques.
The module is an offshoot of the Kepler project — a platform for scientists, analysts, and computer programmers to create, execute, and share models and analyses across a range of scientific and engineering disciplines. Kepler is maintained by researchers at UC Davis, UC Santa Barbara, and UC San Diego.
The bioKepler module will be packaged for installation in various types of distributed execution environments, including as a web service and as virtual machines tuned for public and private clusters and clouds.
Within the bioKepler project, the researchers plan to develop applications for sequence database searches; mapping; sequence assembly; gene prediction; clustering; multiple sequence alignment, phylogeny, and taxonomy; protein annotation; and other tasks such as data format transformation and parsing.
All the resources, materials, and software products produced by the bioKepler project will be integrated with Calit2's Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis, or CAMERA, a data repository and a bioinformatics resource for metagenomic analysis.
The CAMERA project has already used the Kepler workflow system "comprehensively," according to project co-investigator Weizhong Li, a research scientist at Calit2 and bioinformatics group leader for CAMERA.
"With the proposed developments in bioKepler, the CAMERA project and its large user communities will benefit from a larger set of next-generation sequence analysis tools with much better scalability and flexibility," Li said in a statement, adding that other next-generation sequencing projects "can also take advantage of the bioKepler software."