If 2007 was the year of the next-gen sequencer, 2008 could turn out to be the year when bioinformaticists are forced to learn how to analyze and annotate all the data those machines are spitting out. And this will likely drive further trends in the field, such as increased hiring and a renewed focus on economical computing systems.
Ryan Koehler, staff scientist with Applied Biosystems, says that “the biggest thing” in bioinformatics in the coming year “will be the giant data sets … [from] next-gen sequencing [machines].”
Indeed, ABI has identified bioinformatics as a potential bottleneck for prospective adopters of its newly launched SOLiD sequencer, and last fall expanded its Software Community Program to encourage development of third-party software tools for the platform.
Michael Hadjisavas, director of commercial development at ABI, said at the time that the company decided to “reach out to the community and invite a dialogue in the area of software because we as a company cannot address all of the software requirements for the data interpretation of the myriad readouts that could be deployed.”
For companies like ABI, the lack of good support software for high-throughput sequence analysis could potentially slow adoption of the technology. “The amount of data that a researcher would have … could be very substantial and overwhelming, and unless there are companion software elements in place, the ability of customers to really enjoy the value of these instruments … can be somewhat challenged, and that’s a problem,” Hadjisavas said.
Next-gen sequencing vendors Roche/454 Life Sciences and Illumina, as well as a number of academic groups, are also developing sequence analysis tools.
Groups developing such methods include: the EBI; the Broad Institute; the British Columbia Cancer Agency’s Genome Sciences Centre; Stony Brook University; the University of North Carolina, Chapel Hill; and the Max Planck Institute for Molecular Genetics.
— Laurie Wiegler
Scientists at the Institute for Systems Biology and New York University have developed a cellular model to predict the molecular response of free-living cells to genetic and environmental changes. The model, called EGRIN, for Environmental Gene Regulatory Influence Network, uses data from genome-wide binding-location analyses, mass spectrometry, and computational analysis of genome structure, among other techniques.
A recent study led by researchers at Georgetown University Medical Center indicates that scientists often underestimate the complexity of genomics data. Data generated by proteomics and genomics technologies can be so complex that some researchers draw erroneous conclusions from it.
The National Institutes of Health officially launched its Human Microbiome Project, which will be funded with $115 million over five years through the NIH Roadmap initiative.
Assisting Bioinformatics Efforts at Minority Schools
Grantee: Nicholas Hugh, Carnegie Mellon University
Began: Sept. 1, 2000; Ends: Aug. 31, 2010
Hugh and his colleagues will continue their program of helping minority institutions build multidisciplinary bioinformatics training programs. These will be designed as modular courses in a curriculum that other minority-serving institutions can adapt. The training program will emphasize sequence-based bioinformatics and other areas of computational biology.
$1.2 million/FY 2007
Women in Bioinformatics Seminar Series
Grantee: Marcella McClure, Montana State University
Began: Sept. 15, 2007; Ends: Sept. 14, 2008
A recent report by the National Academy of Sciences and other groups found that although more women are majoring in the sciences, their numbers continue to dwindle as they move up the academic hierarchy. McClure and her colleagues will continue their seminar series, which showcases exceptional research by both established and junior women in computational biology, in order to inspire younger women to pursue careers in science.