BA in natural science from the University of Pennsylvania, PhD in computer science from Stanford University.
Was a postdoctoral fellow at the National Center for Biotechnology Information, spent two years as a vice president at Pangea Systems, and has spent eight years at the SRI International Artificial Intelligence Center.
Says he settled in California primarily to pursue outdoor interests like backpacking and skiing. Also enjoys woodworking and cooking.
Q: Where will bioinformatics be in two years? Five years?
A: I hope that in two years the standard bioinformatics failure modes will be an integral part of the culture of the field. Standard Failure Mode 1 is when a computer scientist new to the bioinformatics field builds a beautifully engineered software system that elegantly and rapidly solves the wrong problem. Standard Failure Mode 2 is when a biologist new to the bioinformatics field doesn’t know what he or she doesn’t know about software engineering and creates yet another unparsable flatfile database, or generates yet another unmaintainable and unscalable Perl program. It’s frustrating to see the same mistakes repeated over and over again because software people don’t realize how important it is to understand the science, and biologists don’t realize how difficult software engineering is.
Q: What are the biggest challenges the bioinformatics sector faces?
A: Educating people that bioinformatics is a distinct scientific discipline. The depth of knowledge that now exists in bioinformatics is such that often it’s not enough to have been trained in computer science and in biology. Someone trained in both fields still may not know enough about the methods and techniques in bioinformatics per se to have a big impact on this field. Or they may not have the ability to “think global.” A second challenge is educating people to not always build everything themselves. It’s ironic that despite the fact that so much of the biotech industry has developed as a research-service business, most biologists insist on building their own software (my hypothesis is that they consistently underestimate the cost of doing so). That laboratories should outsource the development of enzymes, primers, and other reagents for their research is incredibly obvious to biologists, but the same biologists who insist on outsourcing work for their labs insist on building software components themselves, even if building software is not their expertise.
Q: How are bioinformatics activities organized within the framework of SRI?
A: SRI is a non-profit research institute. Our bioinformatics research activities are located within a computer-science research division called Information and Computer Science, and include scientists from the Artificial Intelligence Center, Computer Science Laboratory, and Speech Research Group. We have active collaborations with life scientists at SRI.
Q: Who are your current customers?
A: Our main bioinformatics customer right now is the federal government. We hope to expand our government R&D business in areas such as database integration and sequence mining. We have some very talented speech researchers, for example, who have begun applying their hidden Markov model proficiency to bioinformatics problems. But we are also actively working to develop research collaborations with companies. SRI has some very powerful computer-science technologies to offer, such as information-extraction software developed by our natural-language group.
Q: What do you have in the development pipeline?
A: My two main projects now are continuing the development of the EcoCyc and MetaCyc pathway databases and disseminating the Pathway Tools software behind EcoCyc in the bioinformatics community. Pathway Tools is for building new model-organism databases. Several years ago we generalized the software so that we can apply it to many organisms, not just to E. coli, to build databases that combine pathway and genome information. The academics we are collaborating with are very excited by the software's capabilities, such as its ability to paint gene-expression data onto a metabolic-pathway map for the entire cell. We have just released a suite of databases constructed using the Pathway Tools on our WWW site at ecocyc.org.