AT A GLANCE: Received BA in biology from Yale College and PhD in evolutionary biology from Harvard University in 1988. Runs the 550-processor computer cluster that supports research at the museum’s Institute for Comparative Genomics. Also serves as an adjunct professor of biology at Columbia University, New York University, and the City University of New York. Coaches soccer, baseball, and basketball for his children.
Q: Where will bioinformatics be in two years? Five years?
A: There are a couple of issues facing us right away. The main concerns are database issues, the first of which will be databases that let you gather all the information on a particular creature. You might enter a taxon name, let’s say, and pull down all the genetic information as well as all the biologically relevant information. Another thing that I think will be very important is automated annotation of genome sequences. That’s a tool that really needs to be out there.
Q: What are the biggest challenges the field of bioinformatics faces?
A: Two things: One is being able to conveniently gather and manipulate huge amounts of data in whole genome sequences; and two, which is both a problem and a benefit, is integrating phylogenetic frameworks into bioinformatics. Annotation, efficient description, the way you probe databases, and the way you get information out and interpret it are best described in a phylogenetic framework. I think that integration is a big challenge, but it will be a big benefit when it happens.
Q: How large is your staff?
A: We have around five researchers involved in computing and biology and then around another 10 general IT people.
Q: Where does your funding come from?
A: All sorts of sources. We have some internal funds from the museum, we get funding from the city of New York, and we have competitive grants from federal agencies and private agencies.
Q: What hardware do you use?
A: I use mainly self-assembled PC boxes and the Linux operating system.
Q: What bioinformatics software do you use? Do you use in-house developed or third-party software?
A: We wrote all our own parallel software to do all our searches. You can download our source code from ftp.amnh.org. There’s sequence alignment software, there’s phylogenetic search software and all sorts of things.
Q: What databases do you use?
A: Mainly public databases like GenBank.
Q: How is your department organized within the framework of the museum?
A: Different people involved in individual research areas of the museum may come together in bioinformatics, but there’s no bioinformatics department.
Q: What non-existing technology do you most wish you had?
A: Reprogrammable hardware would be really good: FPGA (field-programmable gate array)-based stuff. You could download some of your algorithms into hardware, get hardware speedups, but still be able to modify the hardware as algorithms evolve and change.
FPGAs are out there, but they’re mainly developed for prototyping, so you can get hardware solutions relatively quickly compared to designing a chip and seeing how it works. They’re somewhat flexible, but not as flexible as they really could be. You can’t download completely different applications into these things, but someday you probably will be able to.
Q: What made you decide to enter a career in bioinformatics?
A: My real research interest is in the study of the phylogenetic patterns in arthropods. So I use all sorts of information, DNA foremost among them. I came to the point where I needed to implement certain methodological things that I’d come up with in DNA sequence analysis, so I wrote programs and built computers to do that.