AT A GLANCE: PhD in polymer statistical mechanics from Yale University. Oversees the Laboratory of Computational Genomics at the Donald Danforth Plant Science Center. Enjoys building furniture and fishing.
Q: What are the biggest challenges the field of bioinformatics faces?
A: The biggest challenge is to develop tools that can annotate the orphan protein sequences in a number of genomes, particularly the human genome. Such functional annotation has to range from the molecular or biochemical function to the physiological function of each protein sequence. Ultimately, one will have to be able to predict the temporal and spatial patterns of protein expression, as well as protein-protein interactions.
Another challenge is to deal with the problem of posttranslational modification of proteins. The prediction of the tertiary structure of membrane proteins is also an unsolved problem. This is partly due to the difficulty of experimentally determining membrane protein tertiary structure.
Finally, the ultimate challenge for bioinformatics is to understand enough about the biology that one could simulate a single cell.
Q: What particular problems do public bioinformatics projects face?
A: One problem is getting timely access to proprietary data given limitations in funding. There is also the problem of recruiting and retaining highly qualified bioinformaticists.
Q: How does the near completion of the human genome project affect funding for public projects?
A: It will probably shift funding from genome sequencing to protein annotation. Proteomics will play an even bigger role.
Q: How do you compete with companies to attract and retain qualified bioinformaticists?
A: Thus far, I have been successful in competing with companies. This is due in part to the freedom to choose research projects that an academic environment offers. I also try to provide the best resources possible so that the environment is very stimulating and competitive.
Q: Where does your funding come from? How much funding do you have?
A: Currently I am funded by the Division of General Medical Sciences of NIH and by the National Science Foundation. I also have institutional support from the Danforth Plant Science Center. All told, I have about $550,000 per year in direct costs.
Q: What hardware do you use?
A: We have a cluster of 520 dual-processor Pentium III 733 MHz nodes, each with 256-512 MB of memory and a 20 GB hard drive. It runs Linux. Our desktop workstations are a mixture of PCs running either Linux or Windows NT.
Q: What bioinformatics software do you use? Do you use in-house developed or third-party software?
A: We have developed our own software for threading (PROSPECTOR), ab initio protein folding (MONSSTER), and protein function annotation. A significant fraction of our efforts is in algorithm development. We also use FASTA, CLUSTALW, BLOCKS, and PSI-BLAST.
Q: How would you compare the quality of publicly available and commercially available bioinformatics products?
A: In our hands they seem to be comparable, but we have little experience with commercially available bioinformatics products.
Q: What nonexistent technology do you most wish you had? What’s missing from the bioinformatics toolbox?
A: An energy function that can discriminate the native state of a protein from misfolded states is missing. Tools that can predict quaternary structure and assign proteins to pathways are also needed.
Q: What made you decide to become a bioinformaticist?
A: It happened by accident rather than by design. I am a polymer statistical mechanic by training. I just found biological problems fascinating and found that I could apply tools developed elsewhere to address these questions. Then, with the age of the genome, I asked what I could do to participate. The answer was structure-based annotation of genomes, and so one day I woke up and I was a bioinformaticist.