AT A GLANCE: Completed PhD work in a Drosophila neurobiology laboratory at UC San Francisco. Joined the BDGP as a postdoc and currently designs the project’s computer infrastructure. Interests include skiing, horseback riding, sailing, snorkeling and musical theater.
Q: Where will bioinformatics be in two years? Five years?
A: In the near future I hope to see bioinformatics evolve to integrate data from various sources. Most bioinformatics programs analyze only one kind of data, such as the primary sequence, and only recently have a few programs started to combine more sources, such as gene array experiments and DNA promoter regions. Computer programs will have to start integrating as many data sources as possible to advance our understanding of complex biological systems.
Q: What are the biggest challenges the field of bioinformatics faces?
A: I see the biggest ongoing challenge in merging biology and computer science. It takes years to train qualified people to do independent research in either field and even then often only a highly specialized subtopic can be really well mastered. Contributing to biological research will require intimate knowledge of those formerly divergent disciplines or at least sufficient knowledge to bridge specializations. Additionally, access to specialized informatics tools for a less computer-savvy bench biologist is a major problem.
Q: What particular problems do public bioinformatics projects face?
A: A major problem of public efforts is the limited funding in a field that involves large amounts of money in commercial settings. Out of sheer economics, and frequently less job security in academia, many highly qualified people seriously consider leaving the academic setting in favor of an industry career path. In addition, public data are free to use for everyone, whereas companies can and often do limit access to their own data. They are able to use the public data and analysis to verify their own data and make those improved data available to their paying customers only.
Q: How does the near-completion of the human genome project affect funding for public projects?
A: Completing various genomes is like opening Pandora’s box. Now we’ve opened just the first box and found a huge number of interesting problems. Understanding sequence data will require an ongoing stream of various genomics projects. Genome projects required building a huge “well-oiled” support infrastructure and post-genomic projects will vastly benefit from having this infrastructure already in place.
Q: How do you compete with companies to attract and retain qualified bioinformaticists?
A: We have a small but very productive group of people working in the informatics group. That means that every member of the group works largely independently and everybody can claim a significant part of our endeavor as his or her own designs and work. This level of independence is not often seen in commercial settings and has proven to be a valuable asset in keeping people.
Q: What hardware do you use?
A: We have multiple high-end Sun Enterprise 450 systems and a substantial number of Sun workstations. Recently, we acquired a 40-processor Beowulf cluster from Linux Networx for genome annotation and to provide several public sequence analysis services, such as BLAST. We are slowly switching toward more Linux systems.
Q: What bioinformatics software do you use?
A: All of our software is from public sources, in-house, or often a combination of those two. Major endeavors currently under development are an annotation analysis pipeline combining many of the programs above and an annotation workbench.
Q: What’s missing from the bioinformatics toolbox?
A: Currently, I would most like to see a modular algorithms toolbox. Often similar algorithms are used with slight modifications and different combinations, but currently many parts have to be written from scratch. Having a flexible and extendible toolbox of current algorithms would provide an easy way to address novel biological problems.