Gates Foundation Awards Stanford $7.5M to Develop Integrated TB Genomics Database

The Bill & Melinda Gates Foundation has awarded Stanford University a four-year, $7.5 million grant to create a centralized genomics database to support tuberculosis drug and vaccine development.
The “driving force” for the award, according to project head Gary Schoolnik, is that three other TB research programs funded by the Gates Foundation require access to a comprehensive database on Mycobacterium tuberculosis: one project is developing new drugs to combat latent TB; another is developing vaccines; and the third is identifying biomarkers that are predictive of drug or vaccine efficacy.
As a result, Schoolnik said, the project plans to release a “fast track” version of the database by October, with more complete versions released over the next three years of the grant.
Schoolnik, a professor of medicine, microbiology, and immunology, is leading a team of researchers at Stanford, the Broad Institute, and the Harvard School of Public Health to compile and integrate a broad spectrum of genomic, proteomic, and structural data related to M. tuberculosis.
The primary goal — and challenge — for the effort will be integrating a vast array of disparate resources that already exist for the organism, Schoolnik said. While the M. tuberculosis genome was sequenced in 1998, and the annotated sequence is available via the TubercuList database, Schoolnik noted that there is currently no centralized resource that integrates this sequence data with other information that would be helpful to researchers looking to combat TB.
Schoolnik outlined a broad range of data that will eventually be integrated into the resource. His lab, for example, has already accumulated a large amount of TB gene-expression data that will serve as a solid foundation for the database. In addition, the Stanford team plans to conduct whole-genome RT-PCR-based gene-expression studies on tissue samples from TB-infected patients.
“It’s very difficult to obtain expression information from bacteria within host tissues [using microarrays] because the host tissues have so much more RNA than the microbe contributes,” Schoolnik said, “so you need an ultra-sensitive way to do that, and RT-PCR is the preferred way to do that.”
The Stanford researchers plan to build upon the existing Stanford Microarray Database infrastructure to house the TB microarray data, but integrating that with the RT-PCR experiments will require a bit of work because the RT-PCR data “doesn’t look a bit like the data from a microarray experiment,” Schoolnik said.
“Exactly how these two databases will be melded together so that one can navigate seamlessly between these two datasets is one of the challenges in building this database,” he said.
In addition to the gene-expression data, Schoolnik said the database will include “all the known annotated sequences of Mycobacterium tuberculosis plus related organisms.” The Broad Institute, which is currently sequencing and annotating eight strains of the pathogen, will contribute this component of the resource.
The database will also be integrated with predicted metabolic pathways from SRI International, which currently maintains pathway databases for M. tuberculosis CDC1551 and M. tuberculosis H37Rv as part of its BioCyc suite.

Other components include a database of gene knockouts and associated phenotypes, a database of essential genes, structural information for M. tuberculosis proteins, a database of antigenic M. tuberculosis proteins, and a set of protein-protein interaction data from yeast two-hybrid experiments.
Schoolnik said that some of this data already exists and some has yet to be generated. In addition, the Stanford team is still in discussions with external resources, such as SRI and TubercuList, regarding integration details. “We’re not sure at this point whether we’ll have links out or whether we’ll bring them in,” he said.
The goal, he said, is to create a resource that will allow researchers to navigate easily from a gene of interest to a metabolic pathway, expression pattern, protein structure, or other key information that will help provide insight into drug or vaccine development.
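As a rough illustration of that gene-centric navigation, the sketch below keys records from differently shaped sources (microarray, RT-PCR, pathway) on a shared gene identifier, so a single lookup returns everything known about a gene. It is a minimal sketch under stated assumptions, not the project's design: the class names, source labels, gene identifiers, and all values are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class GeneRecord:
    """All records attached to one gene, grouped by data source (hypothetical model)."""
    gene_id: str
    annotations: dict = field(default_factory=lambda: defaultdict(list))


class GeneIndex:
    """Gene-keyed index over heterogeneous records; each source keeps its own shape."""

    def __init__(self):
        self._genes = {}

    def add(self, gene_id, source, record):
        # Create the gene entry on first sight, then file the record under its source.
        gene = self._genes.setdefault(gene_id, GeneRecord(gene_id))
        gene.annotations[source].append(record)

    def lookup(self, gene_id, source=None):
        # Return every source for the gene, or only one source if requested.
        gene = self._genes.get(gene_id)
        if gene is None:
            return {} if source is None else []
        if source is None:
            return {name: list(records) for name, records in gene.annotations.items()}
        return list(gene.annotations.get(source, []))


# Placeholder data: identifiers, field names, and values are invented for illustration.
index = GeneIndex()
index.add("geneA", "microarray", {"condition": "hypoxia", "log2_ratio": 1.8})
index.add("geneA", "rt_pcr", {"tissue": "lung", "cycle_threshold": 24.5})
index.add("geneA", "pathway", {"name": "example pathway"})
index.add("geneB", "microarray", {"condition": "hypoxia", "log2_ratio": -0.4})

for source, records in sorted(index.lookup("geneA").items()):
    print(source, records)
```

Because the microarray and RT-PCR records never have to share a schema, only a gene key, this is one simple way two datasets that do not look alike could still be browsed together. Whether external resources are linked out or imported would change what is stored under each source, not the lookup itself.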
He added that a longer-term goal for the project is to add “druggability scores” to the proteins in the database that will enable researchers to predict whether a particular protein action can be inhibited. “That’s something that we’re keen to do, although exactly how is not clear right now,” he said.
The primary design goal for the database is to support “certain goals in human medicine,” Schoolnik said. “As far as we know, there is no other database like this for small genomes.”
