AT A GLANCE: BS in population genetics, MS in computer science. Prior to joining GCI, began the pathoinformatics group at Massachusetts General Hospital and served as its director. Interests include mountaineering, rock climbing, and competitive sailing.
Q: Where will bioinformatics be in two years? Five years?
A: I approach this question more philosophically than as a practical forecast. Scientists in biology are working toward all-encompassing, rather “simplistic” discoveries. This approach is analogous to the physics community’s pursuit of fundamental “truths”: profound but simple discoveries. Biology, however, is not made up of “hard” truths but of statistical truths that are themselves becoming more complex. The more we learn about the processes of life, the more complexity we discover. Over the coming years, the truths will be found in that complexity. Life and its processes are defined by relationships: to loci, to mRNA, to proteins, to cells.
Bioinformaticists realize and embrace this reality. As we begin to simplify, or integrate, the data and relationships, the explanations that arise will begin to make this complexity easier to manage and understand.
Practically, in two years we will be finding the impact of genomic variation on phenotype, and in five years we will be able to find its explanation (the how and the why), at least simplistically.
Q: What are the biggest challenges bioinformatics must overcome?
A: Standards, or lack thereof. Different processes and methods will always require customization, but we will need to work out the communication standards between these systems.
Q: What hardware do you use?
A: A real mix, but the common thread is Unix. We use clustered Compaq Alpha servers and Sun servers in production. Development takes place primarily on Intel Linux boxes.
Q: What bioinformatics software do you use? Do you use in-house developed or third-party software?
A: GCI’s business is discovery oriented, not product oriented. As a result, I do not need, or want, to spend time developing software that already exists for a specific task or analysis; I would rather buy than build. In-house development, therefore, is focused on integrating the software into our technology, architecture, and processes. Marrying the software with our lab, experimental, and discovery processes, and making it distributable, is the focus of our integration team.
We also use data integration engines and systems to build custom tools and processes for internal and public data sources. Our data mining and analysis team evaluates, modifies, and implements public algorithms and builds custom, proprietary algorithms to mine and analyze our internally generated genotypic, clinical, and expression data together.
Q: How do you integrate your data?
A: We integrate our data at the level that leaves us the most flexibility. Data that is very dynamic, like most of the public domain data and some of our internal data, is integrated through a software layer using SRS, GeneticXchange’s software, and custom-written code within our J2EE framework. Data mining and analysis rely on static data, so that data is integrated at the database level.
Q: What non-existing technology do you most wish you had?
A: Very dense SNP arrays and very well qualified, standard, and available genetic ontologies.
Q: What projects are you working on now?
A: GCI is in the late-early stage of building its informatics capabilities. We are placing all the integration layers on-line, building NLP systems, and implementing algorithms for clinical genotypic, haplotypic, and expression analysis and mining.
Q: Why did you enter bioinformatics?
A: There was no plan, just a series of situations that presented themselves and the choices I made. The key, I think, for any career is to remain flexible so that one can make spontaneous decisions when opportunities present themselves.