The 2012 Intelligent Systems for Molecular Biology conference held this week in Long Beach, Calif., marked the 20th anniversary of what is considered the largest meeting in computational biology.
As part of the festivities at this year's meeting, two founding members of the International Society for Computational Biology, which plans and manages ISMB, presented an anniversary keynote.
Lawrence Hunter, who directs the computational bioscience program and the center for computational pharmacology at the University of Colorado School of Medicine, and Richard Lathrop, a professor in the department of computer science at the University of California, Irvine, delivered the keynote, which traced the meeting's evolution from its early days, when it focused on artificial intelligence, to its current focus on computational biology.
BioInform caught up with Hunter, who was the first president of ISCB, after his talk to discuss the history of the conference and possible future directions for the community. What follows is an edited version of the conversation.
It's been 20 years since the first ISMB. How has the meeting evolved over the years?
ISMB has gone through several stages. In the very beginning it was almost entirely computer scientists and there were really clear themes that emerged from the meeting. [For example, at] the third meeting … half of the papers were about hidden Markov models. As the field has grown and changed, there is a much less clear division between the computer scientist and the biologist. We've really become computational biologists and so the level of biological sophistication has gone up and the field has diversified so that there is really rarely a clear theme anymore; it's multifaceted and diverse.
Another thing that's changed is the orientation toward medicine. In the early days of the field, we were grappling with much more basic science problems and while there is still a lot of that, there is a much higher proportion of work that's translational or clinical. From drug repositioning, where I think there is real potential to change the pharmaceutical industry based on the kind of informatics work that's done here, to an increase in the use of clinical data in the techniques that are being proposed here (whether it's text mining or patient records or formalin-fixed, paraffin-embedded samples and the challenges in doing transcriptomics in those kinds of clinical samples), we are much more tightly connected to human health than we were 20 years ago.
Is that a good thing? Does the focus on health mean that bioinformatics tool development in other areas is being neglected?
I think it's a good thing. Everybody wants to be relevant. Scientists don't want to do things in the abstract; they want to do things that make a difference in people's lives. One of the biggest ways to make a difference in people's lives with bioinformatics is through medicine or pharmacology. There has never been a big contingent of folks working in agriculture but there are always a few … so far, the agricultural impacts have been smaller than the medical ones. And there are plenty of people doing basic science who are trying to understand how life works, [and] not so much trying to affect disease. I think there is a good balance and it will shift around from time to time. It would be great if there were more agricultural kinds of applications … [but] there is much more funding for things with medical applications than there is for ones with ag applications.
Following up on comments about funding, do you find that researchers have gotten better at including a budget for informatics in their grant proposals?
I think reviewers demand pretty sophisticated informatics in a lot of grants. For NIH grants, especially for the bigger, more prestigious ones — R01s or program projects or the [Clinical and Translational Science Awards] — all of those require a pretty good degree of informatics sophistication, I think, in order to do well. Looking over the last 20 years, one thing that has improved, although it could still use work, is study sections at [the National Institutes of Health], the review panels, becoming more sophisticated about computation. For a long time there was no standing study section at NIH that was specifically computational. Now there are two. There is also increasing sophistication on other study sections, so if you sit on an NIGMS panel, for example, there are going to be at least a couple of people who are pretty sophisticated about the informatics looking at those applications.
For the really large center proposals, and I am thinking now about the CTSA awards, there was such an emphasis on the informatics in the program announcement from the NIH that it changed institutions. Medical schools started adding divisions or departments of biomedical informatics in response to NIH requirements that the grant proposals be more sophisticated.
You mentioned earlier that ISMB has evolved since it first launched. Do you think that the meeting and ISCB in general have stayed true to their initial mandates?
It's evolved. When we first put it together, we were thinking about artificial intelligence and robotics in molecular biology. It was much narrower. There were already conferences on, say, biological databases and we didn't think that it was our topic. There was also the RECOMB [Conference on Research in Computational Molecular Biology] community, the algorithms community, and we separated from them too, so that original vision was much narrower. ISMB has turned into a much more inclusive conference and ISCB a more inclusive society.
ISCB and ISMB both start with 'IS' but the 'IS'es are different. ISMB, the conference, was about intelligent systems, that is, about AI. ISCB is the International Society for Computational Biology; it's a much broader mandate. It includes databases and algorithms and visualization and all kinds of things that aren't intelligent systems. That's been a big change from the initial vision and, I think, ultimately a good one. I think the boundary lines were not productive and while I am still very interested in the artificial intelligence question, the blending of people working from different areas of computer science all sort of pulling towards solving problems motivated by biology has really been productive and so I am glad we've changed a bit from the initial vision.
Is there still room for AI?
The AI stuff has never gone away. There is tons of machine learning, text mining, ontology, and knowledge representation here. One of the reasons I think this conference and this field and the original AI in molecular [biology] idea have been so successful is the technology works. It works in molecular biology almost better than it works in any other application area. So there is no shortage of intelligent systems at ISMB. It's just more than that now.
Are there any computational issues that the community was dealing with 20 years ago that are still being dealt with today?
We go in cycles. If you go back to the very early ISMBs there was a lot of sequence analysis and alignment questions and relatively little dynamics. Fast forward 10 years, everything was microarrays and time series and concentration levels and sequence analysis was a boring solved problem. Fast forward 10 more years and we've gone back in a circle. Right now, microarrays are kind of a boring solved problem and sequence analysis is really interesting and hot again. The technology changes and so the problems change; nothing ever seems to stay solved. Our ability to peer into the biology lets us know that we were naïve or oversimplistic about something that we now need to go back and look at much more carefully. For example, the assumption that only protein-coding bits of the genome were transcribed underlay a lot of science for a long time. Now it turns out that a huge portion of the genome is transcribed and there is a lot of action going on in RNA editing and microRNAs and long non-coding RNAs are starting to look interesting again. As you look deeper, more interesting problems come up that you didn't notice when you were making assumptions about how biology works.
It's rare in our field that we prove some technique optimal. The best we can do is prove that my way of doing it is better than X, Y, and Z, and so it's a step forward, but that always leaves the possibility that there is a still better way to do it. We still see people who are working on topics that have been well studied for a long time [such as] splice site identification, transcription start sites, structure prediction, function prediction; these are problems that have been studied for a long time, yet new methods that are generally better come out. Even after working on it for 20 years, there is still the potential to do better.
Looking ahead 20 years from now, what do you see as the future of bioinformatics?
Let me take [a prediction] from my keynote. I think that we will see computer programs as increasingly independent and individuated intellectual partners. Right now, everybody using, say, Cufflinks uses the same version and it does the same thing every time. Twenty years from now, I would expect that my computer program would be so customized to my way of thinking and what's going on in my lab that the same computer program would do something different in somebody else's lab. That doesn't mean it's not reproducible; we'll know what it did and why. But rather than having tens of thousands of copies that do the same thing, it'll be more like having a computational member of the lab. It will know what we are after and what our interests are and what my collaborators want and who my competitors are and be much more individualized. I am not going to say that we'll have a program that everyone thinks is a mind 20 years from now … but I think along the path to developing genuine artificial intelligence, all minds are unique, everybody is different, and that's going to be increasingly true of programs too.