The Gene Ontology is branching out in a number of new directions. Its developers are currently working with Robert Stevens’ lab at the University of Manchester to “recast GO as a description logic,” said Chris Mungall of the Berkeley Drosophila Genome Project. The proposed changes would modify GO’s current phrase-based approach, which often produces unwieldy process descriptions that complicate annotation. The new method would assign “slots,” or attributes, to certain terms to streamline management of the ontology. The term “protein binding,” for example, would contain a slot for “binds-to,” Mungall said, that would then be filled by terms from orthogonal ontologies that describe proteins, sequences, or other biological entities. Mungall was quick to acknowledge that any fundamental changes to GO would have a tremendous impact on systems currently using the ontology, and noted that the developers are working closely with the Manchester group to ensure a smooth transition to the more formal structure.
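The slot idea can be made concrete with a small sketch. The Python below is illustrative only: the `Entity` and `Term` classes, the `with_slot` helper, and the "example kinase" filler are hypothetical names invented here, not part of GO or of the Manchester group's description-logic work.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A term from an orthogonal ontology (hypothetical), e.g. a protein."""
    ontology: str
    name: str


@dataclass
class Term:
    """An ontology term whose meaning is refined by named slots."""
    name: str
    slots: dict = field(default_factory=dict)

    def with_slot(self, slot, filler):
        """Return a new term with one slot filled; the base term is unchanged."""
        return Term(self.name, {**self.slots, slot: filler})

    def label(self):
        """Build a readable label from the base name and its slot fillers."""
        parts = [f"{slot}={filler.ontology}:{filler.name}"
                 for slot, filler in sorted(self.slots.items())]
        return self.name if not parts else f"{self.name} [{', '.join(parts)}]"


# Phrase-based approach: each combination needs its own pre-composed phrase.
# Slot-based approach: one base term plus a filler from another ontology.
protein_binding = Term("protein binding")
example = protein_binding.with_slot("binds-to", Entity("protein", "example kinase"))
print(example.label())
```

In this sketch the base term stays small and reusable, while the specifics live in the filler drawn from a separate vocabulary, which is roughly the management benefit Mungall described.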
Midori Harris from the EBI reported that GO has been integrated with the National Library of Medicine’s Unified Medical Language System (UMLS). GO is the first set of molecular biology terms to serve as a “source vocabulary” for the computer-readable biomedical resource, she said.
Another ontology, eVOC, from the South African National Bioinformatics Institute, is garnering a great deal of interest, according to co-developer Winston Hide. The controlled vocabulary of terms is “intentionally simple” so that biologists can use it to annotate gene expression experiments, Hide said. Electric Genetics is currently working on a commercial tool, to be called Evoke, that uses the vocabulary.
A PowerPoint slide showing a T-shirt with the slogan “RNA: the other nucleic acid” was only one sign that bioinformaticists are giving RNA a bit of belated respect. Sam Griffiths-Jones of the Sanger Institute described one new RNA-friendly resource, called Rfam. Modeled after the Pfam database of protein families, Rfam (http://www.sanger.ac.uk/Software/Rfam/) uses the Infernal software (http://www.genetics.wustl.edu/eddy/infernal/) developed in Sean Eddy’s lab at Washington University to identify and classify RNAs into 114 distinct families.
Another software tool, called Rsearch, is the RNA version of Blast. Aligning non-coding RNA sequences is notoriously difficult, explained Washington University’s Robert Klein, because the molecules exhibit very little sequence specificity. By taking RNA’s secondary structure into account, Rsearch is able to detect RNA homologs that Blast or Ssearch would miss, Klein said. One drawback, he warned, is that the algorithm is very computationally intensive, and there appear to be no obvious heuristics to speed it up.
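A toy example shows why secondary structure can reveal a relationship that sequence comparison alone misses. The sketch below is not Rsearch's algorithm; the two hairpin sequences, the shared dot-bracket structure, and all function names are invented for illustration.

```python
# Base pairs treated as valid here: Watson-Crick pairs plus the G-U wobble.
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(dot_bracket):
    """Return (i, j) index pairs for matching parentheses in dot-bracket notation."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def sequence_identity(a, b):
    """Fraction of aligned positions with identical bases (equal-length inputs)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pairing_support(seq, pairs):
    """Fraction of structural pairs that the sequence can actually form."""
    return sum((seq[i], seq[j]) in VALID_PAIRS for i, j in pairs) / len(pairs)


structure = "((((....))))"   # hypothetical hairpin: 4-pair stem, 4-base loop
seq_a = "GGCAUUCGUGCC"       # invented sequence
seq_b = "CAGUUUCGACUG"       # invented sequence with compensatory stem changes

pairs = base_pairs(structure)
identity = sequence_identity(seq_a, seq_b)
support_a = pairing_support(seq_a, pairs)
support_b = pairing_support(seq_b, pairs)
print(identity, support_a, support_b)
```

Here the two sequences agree only in the four loop positions, yet both can form every pair in the stem, because each change on one side of the stem is matched by a compensating change on the other. A purely sequence-based comparison sees little similarity; a structure-aware one sees the same hairpin.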
TIGR Goes Open Source
The Institute for Genomic Research has recently made nearly all of its software available under an open source license, said John Quackenbush.
In September, TIGR agreed to release an open source version of the TM4 suite of microarray analysis tools developed by Quackenbush and others at TIGR [BioInform 09-02-02]. Now, TIGR’s software page (http://www.tigr.org/software/) bears a note that the majority of its packages are “OSI certified.” Quackenbush said a combined effort by Owen White, Steve Salzberg, and himself convinced the TIGR legal team to change the software licensing model.
Quackenbush said that since the TM4 package was released in September, more than 3,500 unique users have downloaded one or more of its components.
Proteomics Standards Initiative Progresses
EBI’s Henning Hermjakob told BioInform that the Proteomics Standards Initiative launched in October [BioInform 10-28-02] is progressing rapidly. A draft of an XML schema for protein-protein interactions is currently being circulated among its developers for further refinement, Hermjakob said, and a paper on the format will likely be submitted to a journal within a month. In addition, he said, “a number of instrument vendors” are on board to develop a mass spec data standard, and a session is planned as part of the upcoming ASMS meeting in June in Montreal.
Further information: http://psidev.sourceforge.net.
GMOD Gathers Steam
Another collaborative initiative, the Generic Model Organism Database (GMOD) project, held its second meeting at Cold Spring Harbor Lab last week. GMOD co-developer Lincoln Stein said that the project participants are evaluating available tools to include in the final GMOD package and working on “how to make the software work together.” An initial release of the generic database package is planned for September, Stein said.
Further information: http://sourceforge.net/projects/gmod.