Late this summer, five veterans of large-scale biology gathered in Boston for the final installment of this year’s Genome Technology roundtable series. The topic of the year has been “labs of the future” — earlier conversations centered on automation and technology as well as data challenges. This discussion took on a more personal bent as our experts debated the people aspect of tomorrow’s laboratories: from skills to education to interaction, what will it take to keep people ahead of the curve in this rapidly evolving field?
Genome Technology was delighted to host this panel, which included two biopharma representatives, a traditional genomics institute leader, and two scientists who have migrated closer to the clinical side of things. What follows are excerpts of the conversation, edited solely for space.
Genome Technology: Let’s start with brief introductions.
Marcia Nizzari: I joined the Whitehead Institute about four and a half years ago [before] we became the Broad. I came there without a biology background at all — I have a big background in commercial software development. I then got recruited into this director position for the program in medical and population genetics, which is David Altshuler’s program. That’s just been fantastic in terms of the learning. I joined the sequencing center when they were first scaling up for the Human Genome Project — it was a great opportunity.
Isaac Kohane: I’m by training a pediatric endocrinologist. I dropped out of medical school to get a PhD in computer science [and later] started an informatics research group in 1992 with one person: myself. We’re now 60. A lot of them are PhDs in applied math, physics, mechanical engineering, chemical engineering, theoretical math, and MPHs in public health. The hats I wear: I’m codirector of the Harvard Medical School Center for Biomedical Informatics [and] director of the Countway Library of Medicine. I’m also director of the Children’s Hospital informatics program, and head of bioinformatics of the Harvard Partners Center for Genetics and Genomics. My opening gambit in this [is that] it’s an interesting challenge to have quantitatively oriented people nurtured and promoted in academic medical centers.
Christopher Bouton: I was trained as a molecular biologist. I went into Johns Hopkins for neurobiology [but] then we started playing with these things called microarrays that had just recently come out, and I had no idea what to do with all that data. So I started to play with ways of using this data — that got me into computational biology. Then I went into two biotechs in the Cambridge area and most recently took a position at Pfizer Research Technology Center as head of integrated data mining.
David Sedlock: I’m currently in the research informatics group at Millennium. My background is actually in bacteriology — that’s where I got my PhD — and biochemistry and spent a lot of time in infectious disease microbiology. I got involved in the informatics arena mainly through database development, curation, and eventually drug discovery. So now at Millennium I pretty much have responsibility for both the bioinformatics and the cheminformatics platform.
John Quackenbush: My PhD is actually in theoretical physics. I thought I would have a very different career than the one I have now — I was finishing my degree in the late ’80s, early ’90s, when the job market in physics was crashing because the Cold War had ended and I was trying to figure out what I was going to do with my life. At the time there was this thing called the Human Genome Project that was just starting, and the NIH realized they needed people with non-standard backgrounds so they had fellowships to attract people to work on the genome project. I was one of the first three people to apply. I started working in 1992 in genomics and went into the lab and ended up starting to write software and develop databases. Since then I and my group have always worked divided between the lab and software development. I worked at Stanford for a while and then TIGR and then about 17 months ago I went to Dana-Farber. I’ve got a group now of about 15 or 16 people. Our work is really focused largely on developing protocols in the laboratory … and then taking those software tools and making them widely available.
GT: We see people coming from so many different backgrounds into this field. What do people need in terms of basic skills and education?
Kohane: Let me start with an opening shot. Right off the bat, any seasoned recruiter will know that smart individuals trump everything — and quantitative thinking in this area is essential, whether it’s innate or acquired. By the time you get to someone in their 30s there will be a group of people who are not capable of quantitative thinking or are not inclined to do so. For those who are uninterested or unable to do quantitative thinking it’s probably the wrong tack. I teach a few courses [and see] individuals who are interested in this area, they want to become biologists who are able to do this kind of work, but they don’t really have the tendency or real interest in the quantitative aspects. It’s extremely difficult for them. At least for my group I look for individuals who are very motivated by the biology questions, the biomedical questions — but who also have ability in quantitative reasoning.
Bouton: I’ll add to that an ability to use information technology. So much of what we do now is based on pulling together disparate types of information and somehow synthesizing it in order to get at whatever biological question you’re interested in — people who are able to do that creatively, people who are able to know which tools are out there and which tools are right for which approach, that kind of ease or interest in using those tools is important.
Nizzari: That’s a really key word: synthesize. That’s what I look for very much. In my own group I have an MD, a couple of PhDs in neuroscience and other fields, and people who came from more traditional engineering backgrounds. The group being able to get together and effectively problem-solve in that way is incredibly important.
The type of course I’d like to see the more biology-oriented person take is something that I’ve seen all throughout my software career but I’ve seen it writ large in this area: the fact that we can’t get requirements specified. I’d love to see courses that really address that head on, almost in a business school model where you have case studies where they have to write a spec and do a prototype and find out where it falls down and where it succeeds. That’s one of the things we’ve introduced at my own lab in the past several years.
Quackenbush: One of the things I thought was odd about biology when I got involved was the sociology — this divide between the people who generate the data and the people who analyze it. The people that I’ve seen be most successful are not those with any particular background, but those who are really bright and who you can drag across that divide. I have people with lots of different backgrounds: biology, medical degrees, software. It’s people who you can get from one side or the other who can speak the language. When I get software developers who are trained in software development, I make them go spend time in the lab so they know what people are actually doing. When I get somebody with a biology background, even as a postdoc, I make them take a Perl programming class — not expecting they’re going to be a professional software developer, but so that they can talk to people effectively across that boundary.
It has to be more team-oriented and project-oriented than the way we think about traditional biology; you need to get people talking to each other. The genome project has taught that so well — the history of the genome project is littered with lots of really cool software that was written by people who didn’t talk to biologists, and so it was very nice and very elegant and didn’t do what the biologists needed done or in a way they could understand it. On the flip side, biologists end up doing a lot of their data analysis in Excel because that’s all they know.
Sedlock: The issue around information technology — definitely that’s a big issue within the industry itself, i.e., working with the users whether novice or mature and finding out what they have been exposed to or not exposed to. I do believe from what I see that in this world that’s critical. Whether it’s quantitative thinking or critical thinking, the more that is maintained the better because you never know when you’re going to have to rely on it. What we see are individuals who conceptually understand biology, or more appropriately, they understand molecular biology, and they understand the bench tools, they can develop an assay — but they have trouble thinking more holistically about adapting that to the rapidly evolving technologies, whether it’s instrumentation, informatics, computer science, etc.
Kohane: I often start my talks in this area by saying: If you look at the supposedly virtuous cycle of computational people working with biological people, those two sides of the virtuous cycle often view each other as the intellectual equivalent of peasants. Often the bioinformaticians view themselves as taking the raw materials from these third world workers called clinicians or biologists [to] extract the value and provide the real intellectual content. Conversely the biologists/clinicians view themselves as having asked the right question and then they send the data to a bioinformatician much like a monkey grinding an organ. I see this mutual lack of respect.
Quackenbush: I’ve had that experience so many times. People will come to me and say can you do this for me, can you assign a postdoc — it’s usually something for which the intellectual content is so small that it’s nothing any graduate student or postdoc in my lab would ever be able to get a publication for on their own. It’s usually taking 5,000 sequences and running Blast. The way I’ve started explaining it to them is that what they’re asking me to do is the intellectual equivalent of me going to them and saying, ‘Do you have a postdoc who could do 5,000 PCRs, and by the way I’m not going to pay you for the reagents or their time?’
Kohane: There are too few individuals at this interface. These individuals, who are in my mind at least as intellectually competent as any other investigators, are being asked to both be intellectual leaders and servants. Ultimately when we fill the ranks with sufficient labor force I think there’ll be a stratification of tech-equivalent people in this area and intellectual leaders. Right now we’re feeling a strain because leaders in this area — people who are able to do this integrative, quantitative, and biological thinking — are being asked to do a whole range of efforts. That’s what’s creating the unease and the dissonance between expectations, tasks, and rewards.
Sedlock: We see a tremendous disadvantage within the informatics world. We deal with systems and those systems are composed of various components; architectural development is critical. Graph theory is something that’s very basic, and I don’t think enough of the biologists and chemists understand networks or their laboratory environment workflow. Aspects within the mathematical world related to the exploitation, the evolution of graph theory as it has emerged in terms of network development — whether its use is in IT or in biology — are important and there’s just a handful of people that understand it.
Nizzari: What we have to figure out is how to ensure that those people at a malleable enough age understand [the benefit]. My management coach used to say you have to play radio station WIIFM for them, which means ‘what’s in it for me.’ So somehow they need to get indoctrinated for the greater good and [get an] understanding of the personal opportunities as well.
Bouton: I think it all comes back to this: people who are able to work in teams are going to be the people who succeed. It’s just not the case that everyone’s going to get to be able to understand graph theory the right way, or use information tools in the right way, or get the fundamental biological steps in the right way. It’s more and more the case that the people who succeed are those who can work in a team and not expect that the other people in that team are just going to do services for them, but also respect what they’re doing, not treat them like peasants, and actually at some point try to pick up some of the other skills.
Quackenbush: I think industry may be better at this than academia. I think the problem with academia is people who are like that get caught in a vise because they’re valued, but they’re not valued as intellectual equals. Getting them promoted as somebody who collaborates a lot is incredibly difficult. Quite frankly I think that whole model is outmoded because I think most of the biologists you talk about who are these great individual investigators can’t analyze these large-scale data sets — they’re turning to someone else even though they’re expected to be able to do everything. Even when we know that’s fiction, the fiction persists. The value of doing that kind of quantitative analysis is greatly downplayed.
The experience you have even trying to publish things — if you come up with a great method, the only way to get into Science or Nature is to have a new biological discovery where the method is three lines and a footnote. Your method gets published, if you’re lucky, in some second-tier journal.
Kohane: I’m still claiming that a common thread is quantitative thinking or systematics. Those approaches are not universally held. That skill set is not sufficient but it’s necessary and it’s probably what’s lacking. All of us when we try to recruit people in this area are having incredible trouble finding people. These people are rare; there’s only a small number of people around the world that I would like to recruit. Graduate students that I have trained are being recruited to be department chairs at other universities — there’s just an absolute lack of this integrative, quantitative skill set. I think we can be optimistic that it will be met.
Quackenbush: Coming back to the initial question, what training is right — there are a lot of people who created programs now to train people in bioinformatics, and most of the time the people I see coming out of those programs don’t have that quantitative skill set.
Nizzari: Exactly. It’s very disappointing.
GT: To wrap up, what practical things can readers do right now to improve themselves, or to get ahead in this field?
Nizzari: I have one to volunteer that we have a lot of road time with, which is to have weekly get-togethers that always happen where the lab users and the PIs who are interested get together and do the MIT media center demo-or-die type model. You take a look at early software that’s being created; the lab users get an opportunity to have some actual skin in the game, to get these defined. It’s incredible how much ownership they have felt over this stuff. You need to come back a week or two later and have them see in the software their ideas taking hold. We’ve just seen incredible benefit of getting things delivered that are actually on target [for users].
Kohane: Go to one of the industry groups or academic labs that is showing a successful track record of integrative genomic thinking and figure out how to get yourself in if you believe you have the right mindset. Go in by hook or crook. Take whatever pay cut it takes, and spend a year in that group. That is going to be more valuable in both training you and making clear to your future employer that you actually have the skill set than anything else.
Bouton: If you’re a biologist and you’re not so familiar with information technology, get comfortable with it. Do some Googling and find out what the databases are that are useful. I was a bench biologist who wanted to do computational biology — I bought more For Dummies books [than you can imagine]. But even more so, instead of going to a class and taking a course and just learning Perl, find an interesting problem and actually use whatever you’re trying to learn about in trying to solve that problem. For example, I wanted to work with microarray data, so I developed this Dragon database — through doing that I learned way more than I would’ve if I had gone to a course.
Nizzari: It’s the theme of apprenticeship instead of pure education, which is really excellent.
Sedlock: Another area outside the standard biology informatics — do you know how to manage a project? Do you actually know how to time box something, do you know anything about constraints?
Quackenbush: First, tear down the walls. The best thing that ever happened to my group at TIGR [was when] all my people [who] were spread out all got shoved into a trailer. They hated it. But more work got done because it put the postdocs who were working in the lab next to the software developers, in cubicles next to each other, so if somebody had a problem they stuck their head around the corner. If you create environments where there’s separation, progress slows.