At A Glance:
Name: Kevin Coombes
Position: Chief of bioinformatics section, University of Texas MD Anderson Cancer Center, since 2000. Associate professor of biostatistics and biomathematics.
Background: Associate professor, department of mathematics, University of Maryland, 1989-1999.
Assistant professor, department of mathematics, University of Michigan, 1985-1989.
Assistant professor, department of mathematics, University of Oklahoma, 1983-1985.
PhD, Spencer Bloch’s research group, department of mathematics, University of Chicago, 1982.
Last week, Kevin Coombes cautioned against using statistically unsound experimental procedures during a talk he gave at the Association of Biomolecular Resource Facilities conference in Savannah, Ga. ProteoMonitor caught up with Coombes this week to find out more about his background and work in biostatistics.
How did you get into the field of biostatistics? Did you study that during your early career?
For my PhD, I worked in pure mathematics under Spencer Bloch. It had nothing to do with biology. It was pure mathematics — a field called arithmetic algebraic geometry. It’s the field that Andrew Wiles worked in when he proved Fermat’s last theorem. And most mathematicians in the field were very proud of the fact that it had nothing to do with applications.
I got into biology officially some time in the mid-'90s. I knew what was going on because my wife’s a hematopathologist. What happened in the mid-'90s was she came home all excited about this brand new technology, which at the time was microarrays. And she wanted to do some projects to use microarrays to help classify leukemias and lymphomas. The realization that she had at the time was that the biology looked straightforward, but she wanted somebody to be able to analyze all the data that was going to come out.
She came to me largely because she knew there were statisticians in the department I was working in — at the time I was at the University of Maryland in College Park. So we put together a grant with her and a molecular biologist — she was at the University of Maryland in Baltimore — so there was a biologist on it, and I recruited a statistician for them. And then I was on the grant thinking that I would just translate, because the statistician had only worked on weather patterns for NASA or something. So I figured I would translate between the statistics and the biology, and I got interested in the problem.
We actually managed to get the grant. At the time we had run one microarray — it was early enough in the field that one microarray was enough to get an exploratory grant. And a little while after that, she got recruited down to MD Anderson, which was in the process of setting up its microarray core facility and had set aside slots for bioinformatics. And they decided that a pure mathematician — even if that’s what he did — who had a grant with microarrays in it qualified to do bioinformatics. So that’s when I started doing it for real. I came down to MD Anderson in 1999.
When did you get into proteomics?
I got interested in proteomics in response to the paper that Liotta and Petricoin published in the Lancet. That was February 2002. By that time we’d been doing microarrays here for some time. We knew what we were doing, I think. We had established lots of collaborations with people here at MD Anderson. And that paper came out, and within two weeks, people at MD Anderson said, ‘That’s a really good idea. We’d like to do that with our favorite cancer. Would you analyze the data?’
So Keith Baggerly, Jeff Morris, and I got into proteomics at that point in self-defense. We knew the data was coming, so we decided we would get some data and learn how to analyze it before they actually ran the experiments here.
Where did you get the data from?
The first data we got, we downloaded from what Liotta and Petricoin made available on their website. We reanalyzed that data, and in the process we also started to analyze some data here and to try to develop some methods. Some of that analysis has been published. We had a paper published in Bioinformatics basically saying that that original data from Petricoin and Liotta was wrong. We’ve gone around giving talks about that for a while.
Can you describe a little about what was wrong with the study — what the original conclusions were, and what your conclusions were?
Well, they had stated some strong conclusions — that they could achieve close to 100 percent sensitivity and specificity in detecting ovarian cancer, based on the patterns they’d found. We analyzed the data and were not at all convinced.
There were two main points. What they did is they made spectra available, and they said if you look at these five m/z values, the patterns of intensities at those five m/z values are what distinguish normal from cancer. We looked at their data and didn’t see the distinction. The problem was they had analyzed raw spectra. What they had posted on the web was baseline corrected. So I believe they found something in the raw spectra. However, it wasn’t robust enough to survive baseline subtraction. Therefore, I don’t think it’s real. That’s my bias anyway.
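The baseline-subtraction issue Coombes raises can be sketched in a few lines. The following is a generic moving-minimum baseline correction — not the specific algorithm applied to the posted ovarian-cancer spectra, which is not described here — and all the numbers are invented for illustration. The point is simply that subtracting a slowly varying baseline changes the spectrum a downstream classifier sees, so a pattern found in raw spectra may not survive the correction.

```python
import numpy as np

def subtract_baseline(intensities, window=50):
    """Estimate a slowly varying baseline as a moving minimum and subtract it.

    This is one simple, generic approach to baseline correction; real
    MALDI/SELDI pipelines use more sophisticated estimators.
    """
    n = len(intensities)
    baseline = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        baseline[i] = intensities[lo:hi].min()
    corrected = intensities - baseline
    return np.clip(corrected, 0.0, None)  # keep intensities non-negative

# Toy "raw spectrum": one sharp peak riding on a slow chemical-noise drift.
x = np.linspace(0.0, 1.0, 500)
drift = 100.0 * np.exp(-2.0 * x)                   # slowly decaying baseline
peak = 50.0 * np.exp(-((x - 0.5) ** 2) / 1e-4)     # sharp peak near x = 0.5
raw = drift + peak
corrected = subtract_baseline(raw)
# The sharp peak survives correction; the slow drift is largely removed.
```

Intensity features that depend on the drift rather than on a genuine peak would shrink or vanish after this step, which is the sense in which a raw-spectrum discriminator may not be "real."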
That was the first issue. The second issue is that their study included — in addition to serum from healthy women and serum from ovarian cancer patients — serum from women who had various kinds of benign disease, things like ovarian cysts. And one of the stunning things in their paper was the claim that they could tell the benign disease apart. They trained their algorithm to tell normal from cancer. And when they gave it a spectrum from a woman with benign disease, it said, ‘This doesn’t look like anything I’ve seen before. It’s something else.’ So this was presented as extra evidence of specificity — that the algorithm was correct.
The problem was the benign disease looked nothing at all like normal and cancer. The normal and cancer looked similar. The benign disease looked completely different. We think that something happened in the processing that made it different. The benign disease [samples] weren’t randomized in with everything else, and therefore it was technology that made it look different.
So those were the two main points. We also tried, because they’ve done several studies, to look at whether anything generalized from one study to another. And it didn’t appear to. It looked like there was this sort of recurring problem of not randomizing the data. There’s a note that we’ve written on that in response to a paper they published in Endocrine-Related Cancer. It looks there like they ran all the normals first, and then all the cancers. So the underlying problem, as far as we can tell, is that the design of the study wasn’t adequate. And they confounded technological factors with biological factors. So yes, there are differences, but the differences can be explained by technology instead of biology. And therefore there are still questions about whether they’re real.
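The run-order confounding Coombes describes is easy to demonstrate with a simulation. In this deliberately toy sketch (all numbers invented), there is no biological difference at all between the groups — only slow instrument drift over the run — yet running all normals first and all cancers second lets a naive threshold classifier "separate" the classes almost perfectly:

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 run slots with NO biological signal, just instrument drift plus noise.
n = 100
drift = np.linspace(0.0, 2.0, n)          # machine drifts slowly during the run
intensity = 10.0 + drift + rng.normal(0.0, 0.3, n)

# Unrandomized design: all 50 normals run first, then all 50 cancers.
label = np.array([0] * 50 + [1] * 50)
threshold = (intensity[:50].mean() + intensity[50:].mean()) / 2.0
acc_confounded = ((intensity > threshold).astype(int) == label).mean()

# Randomized design: same drift, same noise, but disease labels are
# scattered over the run slots, so drift no longer tracks disease status.
label_rand = rng.permutation(label)
acc_randomized = ((intensity > threshold).astype(int) == label_rand).mean()

# acc_confounded is near-perfect purely from drift;
# acc_randomized falls back toward chance (about 0.5).
```

This is why randomizing sample order across the run is essential: it breaks the correlation between run-time artifacts and the biological classes.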
Can you tell me a bit about your transition from math to biology — are these methods of analysis that you’re talking about new? Did you have to devise new statistical tests to analyze this data?
We have been working on new statistical tests, and there are a couple of issues here. If you look at what we’ve done at MD Anderson, a lot of the contributions that we’ve made are early on in the processing. We’ve been concerned with the quality of the data and making sure that you go from the raw data to something sensible, and this is true for microarrays and for proteomics.
A lot of statisticians who have gotten into the field want to start after all that processing has been done. They have basically a matrix that has samples in one direction, genes or proteins in the other direction, and it just contains intensity numbers, and they’re working on techniques at that point.
The computer science adage of ‘garbage in, garbage out’ applies very strongly here. If you haven’t gotten good data at that point, it doesn’t matter how fancy the methods are — you can’t get good results. And so we’ve worked on the microarray level and on the proteomics level to figure out how we get from the raw data to something we can trust. There’s not anything deep mathematically or statistically in that. It’s a matter of working very carefully through and making sure that what you’re doing at each step doesn’t mess things up too badly. And it’s hard work, and we occasionally have trouble when we send things to statistics journals getting them to think it’s important. But it’s tremendously important if you want to do applied science — if you actually want to get results that count and have biological meaning.
We have also worked at the higher end. Once we’ve gotten to the point where we have good data, we’re working on getting some new tests there. What we’re focused on at that level is how do you identify what’s a biomarker, or potential biomarker? It involves some at least somewhat new statistics because what we’re realizing is that it’s a different question. It’s not just saying, ‘What’s differentially expressed?’ It’s not saying, ‘Is the average value different?’ That’s what a t-test looks at, for example — it looks at the center of the data and says, ‘Is that different for cancer and normal?’
But that isn’t necessarily the property that gives you the best biomarker. The best biomarker may be something where a lot of the cancers look just like the normals, but there’s this big bulge out in the tail — there’s a small subset of cancer patients for which this gene is elevated, or this protein is elevated. And that requires different kinds of statistical tests to get at.
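The distinction Coombes draws can be made concrete with a small, fully deterministic toy example (all values invented): the two groups share the same background distribution, but in 5 of 50 cancer samples the marker is strongly elevated. A Welch t statistic on the means stays below the usual ~2.0 significance cutoff, while simply counting cancers that exceed the normal range finds the subset immediately:

```python
import numpy as np

# Toy data: identical backgrounds, plus a 10% cancer subset with a
# strongly elevated marker -- the "big bulge out in the tail."
normal = np.linspace(-1.0, 1.0, 50)
cancer = normal.copy()
cancer[-5:] += 5.0                      # 5 of 50 cancers elevated

# A t-test compares the centers of the two groups...
mean_diff = cancer.mean() - normal.mean()
se = np.sqrt(normal.var(ddof=1) / 50 + cancer.var(ddof=1) / 50)
t_stat = mean_diff / se                 # ~1.8: below the ~2.0 cutoff

# ...but a tail-oriented test asks how many cancers exceed the normal
# range, and sees the elevated subset directly.
outliers = int((cancer > normal.max()).sum())   # 5 cancers stand out, 0 normals
```

The mean shift is diluted by the 45 unelevated cancers, so the center-based test misses what a tail-based test catches; this is the motivation for statistics aimed at subset-specific markers rather than average differential expression.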
What we believe is that most biomarkers are going to identify a subset of patients, and you’re going to have to put together a whole panel of biomarkers to get there. Basically, I don’t believe that we’re going to find a single biomarker that picks out all responders from all non-responders. It’s going to be some complex of things. And that’s why we think we’re going to have to put together a complex of biomarkers, each of which gets a subset of [patients]. When we have enough of those biomarkers, then we’ll get everybody.
Statistically, the consequence is that, well, we can look for individual markers like this, but we’re forced into looking for combinations of markers. And this is a huge problem with overfitting the data, because once you start looking at combinations of markers — when you’re looking at 10,000 proteins, or 40,000 mRNA levels — the combinations blow up on you very quickly, and it’s very easy by chance to find the combination that appears to fit the data. So it’s really hard statistically to figure out what the right validation strategy is. That’s the problem that we most want to solve right now. That’s the one we intend to become famous for.
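The combinatorial overfitting Coombes warns about can also be simulated. In this toy sketch (invented numbers, pure noise by construction), the class labels carry no real signal, yet exhaustively searching pairs of "proteins" for a good threshold rule turns up a combination that appears to classify the samples very well:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(42)

# 100 "proteins" measured on 20 samples; labels are arbitrary,
# so there is NO real signal to find.
n_samples, n_proteins = 20, 100
X = rng.normal(size=(n_samples, n_proteins))
y = np.array([0] * 10 + [1] * 10)

def best_threshold_accuracy(score):
    """Best accuracy of the rule 'predict 1 if score > c' over all
    observed cutoffs c, allowing either direction."""
    best = 0.0
    for c in score:
        acc = ((score > c).astype(int) == y).mean()
        best = max(best, acc, 1.0 - acc)
    return best

# Single markers already overfit somewhat; pairs overfit far more,
# because there are ~5,000 pairs to search through.
best_single = max(best_threshold_accuracy(X[:, j]) for j in range(n_proteins))
best_pair = max(best_threshold_accuracy(X[:, a] + X[:, b])
                for a, b in combinations(range(n_proteins), 2))
# best_pair lands near 0.9 or above on pure noise -- apparent "biomarkers"
# found purely by chance, which is why independent validation is essential.
```

The number of candidate combinations grows quadratically (and beyond) with the feature count, so with 10,000 proteins the multiple-testing burden is enormous; any apparent combination must be confirmed on data that was never touched during the search.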
What kind of tools do you use to work on this problem?
I almost gave you the pure math answer — I need pencil and paper. That’s almost true here. We’re using a mixture of computer experiments and some actual experiments to try to determine how many samples we need and what kinds of strategies we can use.
What’s your general feeling about leaving pure math?
I’m enjoying what I’m doing immensely. Part of it is being convinced that it will have a real impact. Working in pure math, I probably knew personally all 150 people in the world who were likely to read the papers that I wrote. There are more than that many people just at MD Anderson who care about what we’re doing right now. So I think it’s just going to have a bigger impact, and I’m having a lot of fun doing it.