FORTALEZA, Brazil — This year’s Intelligent Systems for Molecular Biology conference was the first in the conference’s 14-year history to be held in a developing country. Despite long travel times for most attendees (made even longer for many by the untimely bankruptcy of Brazil’s national airline), 850 delegates made their way to the northeast coast of Brazil for the annual bioinformatics meeting.
ISMB 2006 coincided with the 20th anniversary of SwissProt, which held its own conference here in celebration of that milestone prior to the main ISMB meeting. Many took full advantage of Fortaleza’s tropical locale to attend both conferences, and attendees described the SwissProt conference as a valuable community-building event and an opportunity to reflect on the history of the curated protein resource, which has had its share of ups and downs in the past two decades (see Q&A with SwissProt founder Amos Bairoch, this issue, for further information on SwissProt’s history and the meeting).
This retrospective atmosphere continued through ISMB as well. Many attendees cited a keynote by Michael Waterman of the University of Southern California, delivered before he was presented with the International Society for Computational Biology’s senior scientist award, as one of the highlights of the conference.
Waterman spent the first half of his talk discussing his early work with Temple Smith in developing the alignment algorithm that bears their names. Waterman shared several rejection letters that the pair received for their seminal paper on the method, including one from the journal Mathematical Biology, which described the algorithm as “neither mathematics nor biology,” and added that there “seems to be little in your manuscript that is new.”
Waterman provided that letter and several others as encouragement for young scientists in the audience, noting that it’s never been easy to publish truly novel approaches.
But it wasn’t all just Memory Lane. Waterman shifted gears to discuss his current work on developing analysis tools for whole-genome optical mapping. This method shows great promise for DNA analysis, according to Waterman, because it requires no cloning or amplification and enables up to 100X coverage, but is fraught with informatics challenges.
Data from the technique — which stretches out DNA molecules in microfluidic channels, digests them into fragments, and then captures the fragments via imaging — includes many sizing errors, missing cuts, false cuts, missing fragments, chimeras, and other artifacts that need to be accounted for in the analysis step.
Waterman said that his group’s approach to analyzing optical maps is based on traditional sequence assembly, but relies on a new probabilistic scoring scheme to compute the overlaps.
In a first for ISMB, this year’s conference featured a “New Frontiers” track in which five prominent scientists — Janet Thornton, Chris Sander, Amos Bairoch, Phil Bourne, and Søren Brunak — outlined their vision for bioinformatics over the next five to 10 years, and presented challenges and opportunities for the field in the years ahead. Each talk was followed by an interactive discussion period with the audience.
Thornton, director of the European Bioinformatics Institute, discussed the role of bioinformatics in medicine and agriculture. Currently, she said, biomolecular data resources and tools sit at the center of most biomedical disciplines, but little progress has been made in creating links to these other fields. Describing most bioinformaticists’ impressions of these surrounding disciplines, she said, “It’s like when people thought the earth was at the center of the universe.”
Thornton noted that a recent survey of EBI resources revealed that very few users currently come from the medical and agricultural communities, “but I think that will change radically over the next 10 years” as these fields continue to focus more and more on the molecular scale, she said. The challenge for bioinformatics, she said, will be ensuring that new tools in the field meet the needs of these other disciplines.
Sander, director of the Computational Biology Center at the Memorial Sloan Kettering Cancer Center, presented a laundry list of challenges in the field in an effort to get the grad students and postdocs in attendance thinking about “the hard problems and underrepresented new areas” in bioinformatics.
These challenges included making biological function computable, developing better quantitative models of evolution, building models of cellular decision processes, enabling synthetic biology, developing analytical tools for neurobiology, elucidating the phenotypic consequences of genetic and somatic variation, closing the gap between experimentation and computation, and increasing public support for scientific research.
Asked about the best way to bridge the gap between bioinformatics and experimental labs, Sander replied, “Just do it.” He noted that he’s been in many situations where experimental biologists viewed computational biology as a “service activity,” but said that perception is changing. “There is more and more recognition that computational biology isn’t a service, but a science, so just do it.”
Bairoch, leader of the SwissProt group at the Swiss Institute of Bioinformatics, addressed the challenges that bioinformatics resources have historically faced in securing long-term funding, and proposed one option for ensuring continued support for data repositories. Bairoch’s so-called “Fortaleza declaration” described a system in which each grant proposal for a large-scale experimental project would be required to set aside a portion of its funding to cover the costs of storing and managing the data from the project.
Many in the audience welcomed this as a promising approach, but many also noted that those in attendance held a biased view of the situation. The challenge, several said, would be convincing experimental biologists of the value in setting aside a portion of their grant money for data management.
Peter Covitz, director of bioinformatics core infrastructure at the National Cancer Institute Center for Bioinformatics, noted that the NCI is exploring a similar requirement to ensure that grant solicitations include language related to data sharing and data management, but he said that such a mechanism could not be “one size fits all.” Smaller labs with smaller data sets may not be able to afford to carve out a share of their funding, he said, especially if there are no tools in place to facilitate the deposition of data. Such an effort would require “low- or no-cost tools to ensure that data is submitted appropriately,” he said.
Rolf Apweiler of the EBI said that he liked the idea “because it would help us,” but added that “we need to convince people that it’s worth capturing and sharing data.” Many ISMB attendees come from the genomics and structural bioinformatics communities, where data sharing is accepted, but in some disciplines, such as proteomics, “it’s hard to convince people that they should share data,” he said.
Bourne, of the Department of Pharmacology at the University of California San Diego, discussed the future of scientific publishing, which in many ways is beginning to overlap with traditional bioinformatics databases. New technologies like the semantic web and the rise of open access publishing are leading to a point where concepts within journal articles can be marked up with metadata at the time of publication so that they are dynamically linked to other resources in the field, he said.
There are a few hurdles in the way of this vision, however. “The challenge is capturing the concepts that need this linking at the time the paper is written,” Bourne said. He noted that most papers are currently submitted in Microsoft Word, and “no one is going to want to do the additional work to add metadata.”
However, he added, Microsoft is currently developing new technologies to enable “rich semantic connections to be made” at the time of writing. In this scenario, he said, an author could type a gene name into Word, and much like the software currently checks spelling or grammar, it could check the gene name against an ontology and suggest an alternative term.
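The kind of check Bourne described can be sketched in a few lines of Python. This is only an illustration of the idea, not the Microsoft technology he referred to: the three-entry vocabulary, the function names, and the similarity cutoff are all assumptions made for the example, and a real tool would consult a curated ontology or nomenclature database rather than a hard-coded table.

```python
import difflib

# Hypothetical mini-vocabulary for illustration: preferred symbol -> aliases.
# A real tool would look names up in a curated ontology instead.
VOCABULARY = {
    "TP53": ["p53"],
    "ERBB2": ["HER2", "NEU"],
    "BRCA1": [],
}


def build_index(vocabulary):
    """Map every upper-cased symbol and alias to its preferred symbol."""
    index = {}
    for preferred, aliases in vocabulary.items():
        index[preferred.upper()] = preferred
        for alias in aliases:
            index[alias.upper()] = preferred
    return index


def check_gene_name(name, vocabulary=VOCABULARY):
    """Return (status, suggestions) for a gene name as typed by an author.

    status is "ok" for a preferred symbol, "suggest" for a known alias or
    casing variant, and "unknown" otherwise (with near matches, if any).
    """
    index = build_index(vocabulary)
    typed = name.strip()
    key = typed.upper()

    if key in index:
        preferred = index[key]
        if preferred == typed:
            return ("ok", [])
        return ("suggest", [preferred])

    # Fall back to fuzzy matching, much as a spell checker would.
    # The 0.7 cutoff is an arbitrary choice for this sketch.
    close = difflib.get_close_matches(key, list(index), n=3, cutoff=0.7)
    suggestions = []
    for match in close:
        if index[match] not in suggestions:
            suggestions.append(index[match])
    return ("unknown", suggestions)
```

In this sketch, typing “HER2” would prompt the preferred symbol “ERBB2,” while a likely typo such as “BRCA” would bring up “BRCA1” as a near match.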
Brunak, director of the Center for Biological Sequence Analysis at the Technical University of Denmark, discussed the future of computing in biology, and proposed the somewhat controversial idea that computational biology may not even exist as a separate field in 15 years, as computation becomes an essential tool throughout biology. “All biology will include some computing,” he said.
While some attendees understandably were resistant to the idea that their discipline may not exist in the not-so-distant future, others were open to the idea. Chris Sander noted that there may be several possible scenarios for the evolution of the discipline: computational scientists could take over the field, or computation could be completely absorbed into biology, “like the department of gel-casting disappeared.”
Another option, he noted, is for specialized computational biology centers to spring up to support high-end computation in the field, much like meteorology is conducted today.