Protein Researchers Won't Mind Being SPAMmed with New Informatics Resource


Genome sequences are being churned out at an incredible rate these days, but functional annotations of genes and proteins lag behind. Likewise, structural genomics projects produce 3D protein structures en masse, but these do not always give a clue to their function. To fill this need, researchers at the San Diego Supercomputer Center, the Keck Graduate Institute, and the Burnham Institute received a five-year, $5.4 million grant from the National Institute of General Medical Sciences this month to build a public resource for “systematic protein annotation and modeling,” christened with the requisite — and rather unfortunate — acronym of SPAM.

Key to the project’s goal of providing functional annotation will be “improved algorithms for sequence comparison, sequence-structure comparison and structure-structure comparison,” said project head Philip Bourne, director for integrative biosciences at SDSC and professor of pharmacology at UCSD. The result will be a core resource of databases containing annotated sequences and predicted structures for proteins from many genomes, plus software and visualization tools. No other public effort is currently creating a resource on this scale, Bourne added.

In contrast to other databases that provide annotations for proteins, like SwissProt or PIR, SPAM will largely contain putative annotations based on comparisons, not experimental data. “We already have pipelines [of methods] that take open reading frames and do putative annotation on that data…and we are putting these pipelines together,” said Bourne.

About 10 people will work full-time on the SPAM resource. Gregory Dewey and David Wild at the KGI will focus on new methods for alignment using a statistical mechanics approach; Wild will also develop new methods for protein-fold and remote homolog recognition using a Bayesian network model. Adam Godzik at the Burnham Institute will concentrate on improving homology modeling tools for models with varying degrees of sequence similarity to known structures. Bourne and his colleague Ilya Shindyalov will improve database, query, and visualization tools, as well as the combinatorial extension algorithm for pairwise and multiple structure alignments.

Bourne and his colleagues, in collaboration with Ceres, a Los Angeles-based plant genomics company, have already created an Arabidopsis thaliana protein database, which they made available online last month. Combining the results from Blast-Wu, Psi-Blast, 123D+, Coils, TmHMM, and SignalP, they modeled domain structures for more than 25,000 predicted Arabidopsis proteins. “The large-scale plan is to do that level of annotation and modeling on all known genomes,” Bourne said.

But Bourne’s long-term plans reach beyond SPAM, which is expected to come online within two months. Bourne is currently writing grant applications to build a resource called “Encyclopedia of Life,” which would integrate SPAM with other forms of data.

— JK
