George Lake, Institute for Systems Biology’s Chief Information Officer


AT A GLANCE: Formerly NASA’s project scientist for high-performance computing for earth, space, life, and microgravity sciences. Has two daughters, Caitlin and Astrid (both names are words for “star” in other languages). Hobbies include theoretical astrophysics, cycling, ping-pong, air hockey, and tennis.

Q Where will bioinformatics be in two years? Five years?

A The regard for information science in biology is high, and the goodwill toward computer and computational scientists is even higher. However, the field will go only so far without integrating the skill sets. If we don’t create an infrastructure that enables a reasonable curriculum to train all biologists, the field will stumble and saturate.

Q What are the biggest challenges the field of bioinformatics faces?

A Computing used to be about equipment. Now, it’s all about people. And very few people are being trained for biological computing, because students are learning how to pull down menus in a few packages and change parameters, but these aren’t the key skills needed in the field. We need people trained in data structures, algorithms, and their implementation in object-oriented systems. Think of how a wet lab works. I get phone calls from people setting up training programs asking what hardware and software to buy. It’s just not about that.

Q How do you compete with companies to attract and retain qualified bioinformaticists?

A We are fortunate because we have a high profile and a good track record of success. There is an enormous breadth of research at the institute and our senior investigators have spun off a dozen companies during their careers. So someone can be drawn to the institute seeing that there is tremendous potential to learn, grow, and build here, while at the same time they may choose to be a founder or very early employee of a new start-up.

Q Where does your funding come from?

A We view long-term flexible funding as key to our mission. We are actively seeking an endowment to achieve this. Increasingly, agencies are funding “Centers of Excellence” that permit groups to use more judgment to stop and start projects within their fixed funding envelope. A combination of these two sources will be critical to our success. For now, we have a $5 million start-up gift and a $5 million, five-year unrestricted grant from Merck. In addition, we have roughly $12 million a year in federal grants.

Q What hardware and software do you use?

A We were formed from existing groups that used a collection of hardware and software including Apple Macs, Microsoft Windows, SunOS, Linux, and even NeXT. Within groups and labs, we stress that homogeneity is your greatest ally. The NeXT machines are disappearing. Mac is transitioning to a desktop over Unix. Because we build new technology and perform qualitatively new experiments, we are forced to develop software in-house. We have a lot of software that is developed here and used by the community, including Jimmy Eng’s SEQUEST and Jeremy Buhler’s DAPPLE. We also use a lot of community software.

Q How would you compare the quality of publicly available and commercially available bioinformatics products?

A The interest in and perceived quality of commercially available bioinformatics software are proportional to how far you are removed from using it. I think of “proprietary” versus “open source” as being the key. We are involved in developing new information systems for new technologies, but we can’t build them by scripting inside some proprietary package that is immature and often neither robust nor reliable. If we can’t see the source, we can’t know where the problems are.

Q What non-existing technology do you most wish you had? What’s missing from the bioinformatics toolbox?

A I’ve been stunned by what’s missing. The main things are flexible data standards and open source software systems. For example, there is no standard for image data, something that would include headers for the history of acquisition and manipulation rather than just an image standard like TIFF.
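
As a rough illustration of the kind of header described above, the Python sketch below keeps an image's acquisition details and manipulation history together in one record. It is not an existing standard or anything proposed in the interview: the class names, field names, and sample values (`ImageHeader`, `ProcessingStep`, `example-scanner`, and so on) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessingStep:
    """One manipulation applied after acquisition (hypothetical fields)."""
    operation: str
    parameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class ImageHeader:
    """Hypothetical header recording how an image was acquired and changed."""
    instrument: str
    acquired_at: str  # ISO 8601 timestamp, kept as text for simplicity
    acquisition_settings: Dict[str, float] = field(default_factory=dict)
    history: List[ProcessingStep] = field(default_factory=list)

    def record(self, operation: str, **parameters: float) -> None:
        """Append a manipulation step so the history travels with the image."""
        self.history.append(ProcessingStep(operation, dict(parameters)))


# Made-up sample values, used only to show the shape of the record.
header = ImageHeader(
    instrument="example-scanner",
    acquired_at="2001-01-01T12:00:00",
    acquisition_settings={"exposure_s": 0.5},
)
header.record("background_subtract", radius_px=20)
header.record("normalize", target_mean=1000)

# Operation names in the order they were applied.
history_ops = [step.operation for step in header.history]
```

The point of the design is that each manipulation is appended in order, so anyone receiving the image can see what was done to it and with which settings, which a bare pixel format like TIFF does not require.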

