AT A GLANCE
MS and PhD in computer science from Pennsylvania State University.
Before co-founding Integrated Genomics, worked in the Mathematics and Computer Science Division of Argonne National Laboratory, where he developed the PUMA database for phylogenetic analysis and the WIT platform for comparative genome analysis.
Hobbies include reading, cycling, and playing chess.
Q Where will bioinformatics be in two years? Five years?
A I suppose the real question is how much data there will be in two years, since the impact of bioinformatics is largely determined by the quality and quantity of data accessible to a user. First, I predict that by then we will have sequenced on the order of 250 more-or-less complete genomes, along with a wealth of microarray data and a limited amount of proteomic data. Within five years, we should have deciphered at least 1000 prokaryotic genomes, approximately 100 single-cell eukaryotes, and the gene-rich sections of at least 50 multicellular eukaryotes. Second, there will be processes of integration that allow users to rapidly extract relevant data and synthesize them meaningfully.
Q What are the biggest challenges the bioinformatics sector faces?
A Developing meaningful integrations that make sense of the rapidly growing pool of data from diverse sources, and connecting this wealth of data to central biological questions that affect quality of life.
Q What do you see as the most important task for bioinformatics beyond genome sequencing?
A Bioinformatics will play a huge role in accelerating discovery. One of its most challenging tasks is filling in missing pieces of knowledge. In some cases, researchers have the sequence data but no correlated functions. In other cases, the functions can be deduced by analyzing pathway information, but the genes for those functions have yet to be identified. To close these gaps, we need bioinformatic tools that characterize protein families and clarify which alterations in sequence produce significant alterations in function. It is also important to know to what extent alterations in protein function can be connected with structural changes.
Q Who are your current customers?
A Our customers have been looking for a way to accelerate their ability to work with their organism of choice. We provide them with annotated sequence information in the context of cellular pathways. These customers include Dow Chemical, Genencor International, BASF, Archer-Daniels-Midland, Roche Vitamin Division, Cargill, Maxygen, and several other companies. In the first two years, we concentrated on integrating and performing comparative analyses of genomes from a diverse set of microorganisms. These efforts attracted both academics and companies that use microorganisms for industrial processes. At present, we have begun analyzing genomes of plants and other eukaryotes using the knowledge we gained from our study of microorganisms. Our reconstruction of eukaryotic pathways will have a profound impact on pharmaceutical, agricultural, and chemical companies.
Q What bioinformatics software do you use?
A We use both proprietary and publicly available software. We have developed our own bioinformatic suite for comparative genomics: the ERGO Bioinformatic Suite integrates biological data from genomics, biochemistry, and high-throughput expression profiling to achieve a comprehensive analysis of genomes. The publicly available tools we use include BLAST, PSI-BLAST, ProDom, and Pfam.
Q How large is your bioinformatics staff?
A 40 scientists and growing.
Q What products do you have in the development pipeline?
A Tools for pathway modeling, analyzing regulatory sites, and analyzing microarray data in the context of gene-function networks. We are also developing a framework that will allow other bioinformatic systems to integrate with our ERGO suite.