NSF To Award $20M Annually for Bioinformatics Through 2010
The National Science Foundation plans to award $20 million per year for bioinformatics research projects over the next three years, according to a program solicitation that NSF released this week.
The NSF’s Advances in Biological Informatics program will award approximately $20 million per year to between 20 and 25 projects that “encourage new approaches to the analysis and dissemination of biological knowledge for the benefit of both the scientific community and the broader public,” the agency said in the solicitation.
The ABI program is “especially interested in the development of informatics tools and resources that have the potential to advance, or transform, research in biology supported by the Directorate for Biological Sciences at the National Science Foundation,” NSF said.
NSF said the program encourages research in the following areas:
- New data types, algorithms, and methods for understanding complexity in biological systems across multiple scales of organization from molecules to ecosystems;
- Algorithms, software, or ontologies “related to the retrieval, integration, and use of heterogeneous biological information;”
- Tools that can facilitate biological research workflows, analytic pathways, or integration between the field and the laboratory, or between observation, experiments and models;
- Software and methods that use new technologies for acquiring, communicating, or visualizing biological data;
- New methods and tools for developing and using biological databases, “including research into database architectures and infrastructures, data standards designed to be extendable to different biological domains, and data structures for new types of biological information;” and
- Informatics tools and approaches that bridge “interdisciplinary differences in concepts and data” between biology and other sciences.
Proposals are due annually on the second Tuesday in August through 2010.
Sanger's Durbin Says 1000 Genomes Project Has Surpassed 300 Gigabases
More than 300 gigabases of data have already been generated for 1000 Genomes pilot projects — more data than is currently housed in all of GenBank — and organizers plan to have a whopping 2 terabases of genetic information by the end of this year.
An international consortium launched the 1000 Genomes Project this February. Speaking at the Biology of Genomes meeting at Cold Spring Harbor Laboratory last night, Steering Committee Co-chair Richard Durbin, a principal investigator at the Wellcome Trust Sanger Institute, summed up the progress made on the project so far.
The first stage of the project involves three pilot projects, Durbin explained. The first pilot project is a low-coverage analysis of 60 samples from three different populations — including individuals of European, African, and East Asian descent. The second will involve families or trios of individuals of European and African descent analyzed at higher coverage, and the third will involve sequencing 1,000 genes in 1,000 people at high coverage. Several experiments suggest 20 or 30 times coverage will be necessary, Durbin noted.
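The 20- to 30-fold coverage requirement Durbin cites can be motivated with a simple model: if read depth on each allele of a heterozygous site is roughly Poisson-distributed at half the overall coverage, the chance of seeing both alleles well rises sharply between low and high coverage. A back-of-envelope sketch (the 3-read threshold and the Poisson model are illustrative assumptions, not the consortium's actual power calculation):

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(lam**i * exp(-lam) / factorial(i) for i in range(k + 1))

def p_het_detect(coverage, min_reads=3):
    """Probability that both alleles of a heterozygous site each receive
    at least `min_reads` reads, modeling per-allele depth as
    Poisson(coverage / 2). A simplified toy model."""
    p_one_allele = 1 - poisson_cdf(min_reads - 1, coverage / 2)
    return p_one_allele ** 2

# At 20x, both alleles are almost always well covered; at a low
# coverage like 4x, a large fraction of het sites would be missed.
```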
So far, the team has mainly generated data for the first two pilot projects. More than 300 gigabases of data have been generated so far, Durbin said. By comparison, a single human genome comprises 3 gigabases. The consortium has completed low-coverage sequencing on roughly 10 people from the first pilot project and has higher-coverage data on some trios — mainly European — for the second pilot project.
Preliminary analyses have also been done on most of this data. The team did a data freeze a couple of weeks ago, when it had about 240 gigabases of data: 32 gigabases on low-coverage samples, 185 gigabases from European trios, and 20 gigabases from African trios.
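The freeze figures add up as described: the three components sum to roughly the 240-gigabase total, and dividing by the haploid genome size gives a sense of how many genome-equivalents of raw sequence that represents (the 3-gigabase genome size is the usual rough estimate, not a number from the freeze itself):

```python
# Figures (in gigabases) from the data freeze described above.
freeze = {"low_coverage": 32, "european_trios": 185, "african_trios": 20}

total_gb = sum(freeze.values())      # 237 Gb, reported as "about 240"
genome_equivalents = total_gb / 3    # ~79 haploid human genomes' worth

print(total_gb, round(genome_equivalents))
```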
After these pilot projects, Durbin said, the consortium will generate high-coverage information from multiple populations for all 1,000 genomes of the 1000 Genomes Project. The exact design for that two-year main project has yet to be finalized. For example, the main stage of the project will likely involve collecting additional samples and the team must determine how — and where — these will be collected.
The goal right now, Durbin said, is to get sufficient coverage to call variants down to five percent in three populations and down to one percent in the 1,000 genes. Ultimately, they hope to be able to call variants down to one percent in all 1,000 genomes.
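The five-percent target can be illustrated with pooled low-coverage arithmetic: to a first approximation, a variant at population frequency f is supported by a fraction f of the reads pooled across samples, so aggregate depth determines how much read evidence a rare variant leaves. A sketch under that simplification (the 60-sample, 4x-per-sample figures are illustrative assumptions; the project's actual calling methods are more sophisticated):

```python
from math import comb

def expected_support(n_samples, depth_per_sample, freq):
    """Expected number of reads supporting a variant at population
    frequency `freq`, pooling reads across samples (crude approximation)."""
    return n_samples * depth_per_sample * freq

def p_seen_at_least(k, total_depth, freq):
    """Binomial probability of >= k supporting reads at aggregate depth."""
    p_fewer = sum(comb(total_depth, i) * freq**i * (1 - freq)**(total_depth - i)
                  for i in range(k))
    return 1 - p_fewer

# 60 samples at a hypothetical 4x each give 240x pooled depth: a 5%
# variant is expected in ~12 reads and is almost always seen at least
# twice, while a 1% variant is noticeably less secure at that depth.
```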
The team plans to have 2 terabases of data by the end of this year.
The 1000 Genomes data collected so far has been submitted to the NCBI’s Short Read Archive.
— Andrea Anderson (This article originally appeared in BioInform sister publication GenomeWeb Daily News)
HHMI Awards Colgate $1.2M Grant to Create Systems Bio Undergrad Program
Colgate University said this week that it will use a $1.2 million grant from the Howard Hughes Medical Institute to start an undergraduate program in systems biology research.
Colgate said it will use the funds to create a new major in mathematical biology beginning in the fall of 2009.
The university said it will hire a tenure-track systems biologist with a joint appointment in math and biology who will develop courses, lab modules, and new teaching tools for mathematical biology.
Kenneth Belanger, associate professor of biology, will direct the program, which will also include science outreach for K–12 teachers and pre-college students.
The program will include summer research opportunities at the university, a series of systems biology speakers, a 10-week summer research program at the National Institutes of Health, and participation in an NIH Study Group.
The launch of the program follows the opening of Colgate’s Robert H.N. Ho Interdisciplinary Science Center, which includes classrooms, research and teaching labs, and space for five different science departments, including a new visualization lab focused on 3D scientific animations.
GE Healthcare to Distribute GeneBio Software in Latin America
GE Healthcare will distribute Geneva Bioinformatics’ protein identification and characterization software and other products in Latin America, GeneBio said this week.
Under the agreement, GE Healthcare Life Sciences Latin America, based in São Paulo, Brazil, will distribute GeneBio’s Phenyx MS identification platform, its Aldente tool for peptide mass fingerprinting analysis, and its Daemon and other mass spectrometry-related products throughout the region.
GeneBio’s Phenyx software, which was developed through a collaboration with the Swiss Institute of Bioinformatics, was “designed to meet the concurrent demands of high-throughput MS data analysis and dynamic results assessment,” GeneBio said.
GE Healthcare has been distributing GeneBio’s Melanie imaging platform since 2004.
Compugen Posts $281K in Q1 Revenues
Compugen this week reported first-quarter revenues of $281,000 versus no revenues in the comparable period a year ago.
The company posted a net loss of $2.5 million, or $0.09 per share, down 19.4 percent from a net loss of $3.1 million, or $0.11 per share, for the first quarter of 2007.
Compugen’s R&D expenses decreased 10 percent to $1.8 million from $2 million, and its SG&A costs increased 9 percent to $1.2 million from $1.1 million.
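The reported percentage moves check out against the dollar figures; a quick verification of the arithmetic (figures in millions, as stated above):

```python
def pct_change(new, old):
    """Signed percent change from `old` to `new`."""
    return (new - old) / old * 100

# Net loss: $2.5M vs. $3.1M a year earlier -> down 19.4 percent
# R&D:      $1.8M vs. $2.0M                -> down 10 percent
# SG&A:     $1.2M vs. $1.1M                -> up ~9 percent
```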
The firm held cash, cash equivalents, deposits, and marketable securities of $14.8 million as of March 31.
New International Consortium Aims to Improve Reference Genome
A newly formed international consortium recently unveiled a resource for improving the human reference genome.
The Genome Reference Consortium represents a small group of centers and institutions that are actively working not only to take stock of the gaps and small errors in the human reference genome, but also to incorporate new information about the magnitude of normal variation in the genome.
The group launched a new web site last week to coincide with the Biology of Genomes meeting at Cold Spring Harbor Laboratory. The site lets users access information on individual chromosomes and report problems with specific regions of the reference sequence.
“Pretty much everyone, uniformly, thought it was a good idea,” NCBI staff scientist Deanna Church, who presented a poster introducing the GRC at the Biology of Genomes meeting, told GenomeWeb Daily News.
The GRC consists of members from the Wellcome Trust Sanger Institute, the Genome Center at Washington University, the European Bioinformatics Institute, and the National Center for Biotechnology Information. The project is being funded by the National Human Genome Research Institute and the Wellcome Trust.
The wet bench work will be done at the Sanger Institute and Washington University’s Genome Center, Church said, while the EBI and NCBI are offering “bioinformatics support to make curation for experimentalists simpler.”
“It is now apparent that some regions of the genome are sufficiently variable that they are best represented by multiple sequences in order to capture all of the sequence potentially available at these loci,” the GRC website states. “The goal of this group is to correct the small number of regions in the reference that are currently misrepresented, to close as many remaining gaps as possible, and to produce alternative assemblies of structurally variant loci when necessary.”
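The idea of representing a structurally variable locus with multiple sequences can be pictured as a mapping from a chromosomal interval to one primary sequence plus any number of alternates (the structure and names here are hypothetical illustrations, not GRC's actual data model):

```python
from dataclasses import dataclass, field

@dataclass
class VariantLocus:
    """A structurally variable region of the reference, carrying a
    primary sequence and alternate representations of the same locus."""
    chrom: str
    start: int
    end: int
    primary_seq: str
    alternates: list = field(default_factory=list)

    def add_alternate(self, seq):
        """Register an alternate assembly of this region."""
        self.alternates.append(seq)

# A toy locus with one alternate haplotype sequence.
locus = VariantLocus("chr17", 1000, 1020, "ACGT" * 5)
locus.add_alternate("ACGTTTGT" + "ACGT" * 3)
```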
The first seeds of the GRC were planted several years ago. Even as researchers were publishing papers on chromosomes sequenced during the human genome project, they realized that they weren’t capturing all the information necessary for a complete reference. “It was appreciated at the time that there were still gaps and things like this,” Tim Hubbard, head of informatics at the Sanger Institute, explained. “There was the question of what to do about that.”
Newer technology, sequencing of additional human genomes, and an increasing appreciation of normal human genetic variation underscore the need to re-assess the reference genome.
“Overall, [the human reference] is really quite a good assembly,” Church said. But, she added, that doesn’t mean the specific locus a researcher might be interested in is 100 percent correct.
“We have issues, I think, for pretty much every chromosome right now,” she said, noting that the consortium has heard reports of problems in all but one or two chromosomes. “We are very aware that there are certain regions in the genome associated with copy number variations and disease phenotypes.”
The GRC web site currently includes literature related to each of the human chromosomes along with potential problems or concerns that have been noted for each so far. For instance, the site includes a table summarizing the regions currently under review in the reference genome. That, and other features on the site, will be updated shortly, Church noted. It also houses similar information for the mouse reference genome.
In order to improve the human and mouse reference genomes, the GRC is asking researchers to report any issues they have discovered in particular regions of the reference genome. New data is also being collected by members of the collaboration and through projects such as the 1000 Genomes Project to inform future reference assemblies.
For instance, the GRC website currently contains information on builds 35 and 36 of the human reference genome. The collaboration hopes to release the next build in the spring of 2009 and update it annually after that, Hubbard said. Though he noted that it may inconvenience those who have to remap their data each time a build is released, Hubbard emphasized the need for an accurate and complete reference genome.
“Overall, it’s necessary. It’s really important to do this to make sure people aren’t being misled,” Hubbard said, adding, “We’re open for business in terms of collecting information.”
— Andrea Anderson (This article originally appeared in BioInform sister publication GenomeWeb Daily News)