Making a GO of it: Michael Ashburner Reflects on the Little Ontology Project that Could

The career of Michael Ashburner has taken a number of turns over the past few decades — from bench-based "fly pusher" to joint head of the European Bioinformatics Institute and back to the lab in the genetics department at Cambridge University. More recently, Ashburner took a slight detour into the literary world, with the publication of Won for All — a gossipy account of the sequencing and annotation of the Drosophila melanogaster genome in the late 1990s.

The book, categorized by its publisher, CSHL Press, as a "nonfiction novel," has been described by reviewers as "Bridget Jones for geeks."

Anyone who has ever experienced Ashburner's outspoken style will appreciate the reference as a fair assessment of the traits that have made him such an influential figure in the bioinformatics community — a delicate balance of chatty energy and scientific rigor, with a healthy dose of idealism. It is these qualities that helped establish the Gene Ontology — one of countless community projects he's been affiliated with over the years — as the foundation for the now-thriving field of biomedical ontology development.

The success of GO wasn't preordained, however. When Ashburner first introduced a prototype of a controlled vocabulary for the FlyBase Drosophila model organism database at the Bio-Ontologies meeting at ISMB 1998, "it went down like a lead balloon," he told BioInform. "No one was interested and no one could be bothered."

In an interview last week at the Bio-IT World Life Science Conference and Expo, where the non-profit Bioinformatics Organization presented him with the annual Benjamin Franklin Award for Open Access in the Life Sciences, Ashburner recounted his experiences in nurturing GO through those early years.

One turning point, he said, came soon after that 1998 meeting from an unlikely ally at the height of the private/public genome sequence scuffle: industry. Kenneth Fasman, formerly head of informatics at AstraZeneca (and now director of drug development strategy and performance at the company), offered to provide some modest funding. In fact, Ashburner said in his talk following the Franklin Award ceremony, the contract with AstraZeneca stipulated "that they would get their money back if the data wasn't made available in the public domain."

That funding, around $250,000, "kept us going until December 2000," when GO received its first NIH grant, Ashburner said.

Progress has been steady since then, albeit with a few bumpy patches. Ashburner acknowledged that GO has had its share of criticism from some in the computer science community, who initially dismissed it for lacking the logic of more rigorous ontologies. "We made a lot of mistakes early on," he said, adding that "there are many changes to the GO that need to be made."

Nevertheless, he said, the GO Consortium has always been willing to listen to its critics. As an example, he cited Barry Smith, a professor of philosophy and cognitive science at the University at Buffalo, who published several papers that were highly critical of GO. Ashburner said that he and other members of the GO Consortium were initially "extremely annoyed" at Smith's unsolicited feedback, but after a meeting between Smith and the GO Consortium that he described as "robust," both sides buried the hatchet and Smith is now an active contributor to the effort.

Smith is a core staff member for the recently launched National Center for Biomedical Ontology at Stanford University, where Suzanna Lewis, a key GO developer, is co-principal investigator.

The GO Consortium is currently updating the ontology to make it more "logically rigorous," Ashburner said. The effort will make the resource much more useful for the bioinformatics community, but the GO developers are still wrestling with the challenge of ensuring that it remains "backwards compatible," he said.

Despite its faults, there is no doubt that GO has become a key component of the bioinformatics infrastructure. The GO Consortium website lists around 80 software tools — both academic and commercial — that rely on GO in some way, and Ashburner estimated that there may be another 100 that are not listed. The site's usage page reports more than 6 million "accesses" since 1999, with an average of almost 20,000 per week.

The success of the project, according to Ashburner, stems from its foundations as a true grassroots effort. GO developers "are those who wish to use it for other purposes," he said. In contrast, Ashburner pointed to the NCI Thesaurus, which was developed by an external IT contractor. "Those guys have no vested interest in seeing that it works," he said. "Their reputations are not on the line."

Another factor that led to rapid adoption of GO as a de facto bioinformatics standard was the rise of microarray technology in the late 1990s. "Timing is everything," Ashburner said, noting that aside from genome annotation, microarray analysis remains the primary application area for GO.

Ashburner said that he is now working with developers of a number of other public domain ontologies, such as the Cell Ontology and ChEBI, an ontology for small molecules, in the hopes that they can emulate GO's success. He said that the need for a chemical ontology is particularly severe in light of new public domain resources for chemical structure such as PubChem. At present, "there is no public-domain ontology for chemicals," he said, and for-profit organizations have "a stranglehold" on the field.

Other efforts, such as the Sequence Ontology, an RNA ontology, and anatomy ontologies for model organisms, are also underway, he said, while there is demand for ontologies in emerging fields like metagenomics, as well as a "desperate" need for a human disease ontology. OMIM is an "incredible resource," he said, but is "frustrating" for bioinformatics because it lacks metadata. "It's essentially a very large flat file," he said.

"Clearly there is a proliferation," Ashburner said of the biomedical ontology field, but he noted that the current spate of development won't go on forever. "The biomedical domain is large," he said, "but limited." There are only so many vocabularies in the field, and Ashburner said that in a few years, he expects that researchers will have access to a suite of compatible ontologies to describe everything from the smallest molecule to human anatomy.

The Gene Ontology has received more than $15 million in NIH funding since 2001, according to the NIH database. Ashburner said that the GO Consortium is in the midst of renewing its grant, and that despite NIH budget cuts, he is hopeful that the project's funding will continue. "The NIH has shown that it is committed to developing and sustaining community resources," he said.

While GO's projected budget of just under $20 million over the next five years is "not a trivial amount of money," he said, "it's still pretty cheap compared to experimental science."

— Bernadette Toner
