Protein Researchers Won't Mind Being SPAMmed with New Informatics Resource


Genome sequences are being churned out at an incredible rate these days, but functional annotations of genes and proteins lag behind. Likewise, structural genomics projects produce 3D protein structures en masse, but these do not always give a clue to their function. To fill this need, researchers at the San Diego Supercomputer Center, the Keck Graduate Institute, and the Burnham Institute received a five-year, $5.4 million grant from the National Institute of General Medical Sciences this month to build a public resource for “systematic protein annotation and modeling,” christened with the requisite — and rather unfortunate — acronym of SPAM.

Key to the project’s goal of providing functional annotation will be “improved algorithms for sequence comparison, sequence-structure comparison and structure-structure comparison,” said project head Philip Bourne, director for integrative biosciences at SDSC and professor of pharmacology at UCSD. The result will be a core resource of databases containing annotated sequences and predicted structures for proteins from many genomes, plus software and visualization tools. No other public effort is currently creating a resource on this scale, Bourne added.

In contrast to other databases that provide annotations for proteins, like SwissProt or PIR, SPAM will largely contain putative annotations based on comparisons, not experimental data. “We already have pipelines [of methods] that take open reading frames and do putative annotation on that data…and we are putting these pipelines together,” said Bourne.

About 10 people will work full-time on the SPAM resource. Gregory Dewey and David Wild at the KGI will focus on new methods for alignment using a statistical mechanics approach; Wild will also develop new methods for protein-fold and remote homolog recognition using a Bayesian network model. Adam Godzik at the Burnham Institute will concentrate on improving homology modeling tools for models with varying degrees of sequence similarity to known structures. Bourne and his colleague Ilya Shindyalov will improve database, query, and visualization tools, as well as the combinatorial extension algorithm for pairwise and multiple structure alignments.

Bourne and his colleagues, in collaboration with Ceres, a Los Angeles-based plant genomics company, have already created an Arabidopsis thaliana protein database, which they made available last month. Combining the results from Blast-Wu, Psi-Blast, 123D+, Coils, TmHMM, and SignalP, they modeled domain structures for more than 25,000 predicted Arabidopsis proteins. “The large-scale plan is to do that level of annotation and modeling on all known genomes,” Bourne said.

But Bourne’s long-term plans reach beyond SPAM, which is expected to come online within two months. Bourne is currently writing grant applications to build a resource called “Encyclopedia of Life,” which would integrate SPAM with other forms of data.

— JK
