Assistant Professor of Pathology
University of Pittsburgh Cancer Institute
At A Glance
Name: James Lyons-Weiler
Position: Assistant professor of pathology, University of Pittsburgh Cancer Institute, since 2002.
Background: Assistant professor; co-director of Center for Bioinformatics and Computational Biology, University of Massachusetts, Lowell, 2000-2002.
Postdoc, Institute for Molecular Evolutionary Genetics, Pennsylvania State University, 1998-2000.
PhD in ecology, evolution and conservation biology, University of Nevada, Reno, 1998.
At the Cambridge Healthtech Institute's Biomarker Discovery Summit conference, held recently in Philadelphia (see ProteoMonitor 9/30/05), James Lyons-Weiler gave a talk on using better statistical methods for proteomics experiments. ProteoMonitor caught up with Lyons-Weiler after his talk to find out more about his statistical methods, how they can improve biomarker discovery, and how they can help conserve the patient samples and other resources needed to do studies.
How did you get into working with the statistics of proteomic experiments?
I was interested in the processes of speciation. As a graduate student, I was very interested in reconstructing the evolutionary history of closely related species to understand the processes of speciation. Then I was a postdoc at Pennsylvania State University in the Institute of Molecular Evolutionary Genetics. I wanted to study the molecular evolutionary processes. I was developing some statistical techniques there as a postdoc when I read my first microarray paper. I immediately recognized that studying patterns of differences and shifts and changes associated with disease would be an outstanding thing to do with microarrays. I gravitated immediately to the question of, 'How does one best analyze microarray data?' So I undertook a research area of comparative evaluation of methods of analysis for high-throughput genomic data. And it wasn't that long thereafter that I started learning about proteomic data sources: mass spec-based proteomics and targeted, directed protein antibody arrays.
It was a split-second decision to go from phylogenetics and molecular evolution to understanding the etiology of disease and creating classifiers for diagnosis and prognosis of disease, specifically in cancer. It was a split-second decision, and it was an easy one. My mother died of cancer, breast cancer, when I was very young, so I literally stopped that work in one day. I remember the very moment that I made the decision. I stopped work on phylogenetics and molecular evolution and I never really turned back. I focused specifically and exclusively on microarray data analysis.
I landed at the University of Pittsburgh Cancer Institute, where they happened to be producing large amounts of these kinds of data, and I have gravitated into a position where, in most of my collaborations, I have handled, analyzed, and evaluated most, if not all, of the microarray and proteomics data. The data come to me from three different facilities. I lead a team that is dedicated to finding the best performance and the best methods for each stage in the analysis. I'm very content doing what I'm doing. It's the best job in the world. I love collaborating with people and being able to help facilitate biomarker discovery, evaluation, and validation.
In your talk in Philadelphia, you said that with powerful statistics, you can reduce the number of clinical samples and other resources that you need to do a study. Can you tell us about what is needed in order to perform these 'powerful statistics'?
First off, the number one issue I would advise everyone to address is technical replicates: how many times do you have to run the same biological sample? Is it one time, is it four times, is it eight times, or more than that? You want to make sure that the average profile from those technical replicates would be the same if you ran those samples the same number of times again. We're really talking about measurement reproducibility. Taking multiple technical replicates and averaging them is a very old statistical technique, and it's a very powerful one. Instead, what I see happening far and wide in the discovery phase of proteomics is that researchers will do multiple replicates, but instead of averaging them, they will count the number of times they see a given peak appearing. That is an OK approach, but it's not as powerful as averaging. The power of averaging is to pull out features that you might not otherwise be able to detect, and those features can be fairly robust with averaging. That's a fairly easy, simple question, and I'm basically astounded that those questions are not being answered.
When a machine is packaged and shipped, vendors should provide the biomedical research community with data [showing] that you can achieve a particular amount of reproducibility with this number of replicates, and they should show the whole curve so that people can decide how many technical replicates they need.
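The averaging-versus-peak-counting point can be sketched numerically. The following is only a toy illustration, not Lyons-Weiler's actual pipeline; the signal level, noise, detection threshold, and replicate count are all invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical low-abundance peak measured over 8 technical replicates.
true_intensity = 1.0        # real signal, sitting below the threshold
noise_sd = 2.0              # instrument noise
detection_threshold = 3.0   # a peak is "seen" only above this intensity
n_replicates = 8

replicates = true_intensity + rng.normal(0.0, noise_sd, size=n_replicates)

# Peak counting: the feature is scored by how often it clears the
# threshold, which for a weak signal may be only a handful of times.
times_seen = int(np.sum(replicates > detection_threshold))

# Averaging: the mean's standard error shrinks by sqrt(n), so a weak
# but real feature can emerge from the noise even when rarely "seen".
mean_intensity = replicates.mean()
standard_error = replicates.std(ddof=1) / np.sqrt(n_replicates)

print(times_seen, round(mean_intensity, 2), round(standard_error, 2))
```

Running this with more replicates shrinks the standard error further, which is exactly the reproducibility curve the interview says vendors should publish.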
The other kinds of statistical analyses that we've been working on are somewhat orthogonal to classical statistics like the T-test, or even their non-parametric, permutation-based alternatives. In cancer in particular, or for any heterogeneous disease, the methods that we've been working on are more focused on the question of, 'How many patients is a particular biomarker unusually expressed in?' as opposed to saying, if you have a classical study design of two or four groups, 'Show me the genes that show the highest differences between the groups.'
The first class of methods that we've developed was recently published in BioMed Central. They're tests that allow us to identify genes or proteins or any type of marker that is differentially expressed in a significant number of patients. The distinction there is that we're mostly interested in genes that might be differentially expressed in only a subset of patients, in which case the classic T-test might fall very far afield in identifying biomarkers that are critical to understanding the etiology of disease in some patients.
So what we're striving for is a compromise between the population-based sciences and an individualized medicine-based perspective. For a given marker or growth factor or DNA repair mechanism in cancer, what are the solutions for subsets of patients? Cancer is a biological process that is not going to be neatly patterned whatsoever. Each tumor in an individual is unique. As cancer progresses, there are random gene amplifications or random gene losses that are highly unique to an individual. So the cancer gathers the ability to do everything that a cancer does by hook or by crook. So if you use the T-test, or other population-based measures for finding differential expression, you might be restricted to finding only markers that are informative on the largest number of patients. And while that sounds quite intuitive, and kind of the right thing to do, it only takes us so far, because what if there are not a set of biomarkers that in some combination give us sufficiently high specificity and sensitivity to be able to be rolled out as a cancer screen? If we adopt an ingrained approach to our statistical analysis, we're always only going to look at the T-test, and then we're really only looking at part of the spectrum of the biomarkers and what they're telling us.
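The flavor of a subset-oriented test can be sketched as a generic outlier-count test against a binomial null. This is offered only as an illustration of the idea, not as the published method; the 95th-percentile cutoff is an arbitrary choice:

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance-alone expectation."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def subset_marker_test(controls, cases, tail_frac=0.05):
    """Flag a marker unusually expressed in a *subset* of cases.

    Counts how many cases exceed the top tail_frac cutoff of the
    control distribution, then asks whether that count beats chance
    alone. A marker can score well here even when case and control
    means are similar, which is exactly where a T-test would miss it.
    """
    cutoff = sorted(controls)[int(len(controls) * (1 - tail_frac)) - 1]
    exceed = sum(1 for x in cases if x > cutoff)
    return exceed, binom_tail(exceed, len(cases), tail_frac)
```

For example, a marker elevated in only 10 of 50 cases still stands far beyond the roughly 2.5 exceedances expected by chance, so it is flagged even though it is irrelevant for the other 80 percent of patients.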
So the second class of methods that we're developing (we haven't published this yet, but I don't mind telling you about it) is based on basically inverting the problem. Instead of saying we expect to find a small number of markers that are informative on a large number of patients, let's use a large number of markers to be informative on all patients. How you use the information in rendering the classification is critical, and our early look at a few datasets is pretty exciting. With the traditional additive, intensity-based method for disease prediction modeling, we hit somewhat of a performance ceiling of about 80 percent sensitivity. With this other approach, which is a genetic algorithm-optimized prediction model, we are well positioned to make a prediction on most patients, at least in the datasets and patients that we have, because we don't assume that every marker is informative on every patient.
We have to meet cancer at its own level of complexity. We shouldn't use population statistics to understand similarities and differences among tumors because there's not a real, natural population, so the population type statistics are not really applicable.
So with this method, we look for proteins that are different in a significant number of patients. That number is going to be determined by what we expect by chance alone. If we see that a marker is informative for a remarkable number of patients, that tells me that the marker is more informative than chance. Going forward, we're not content to say that every marker in our panel has to be used on every patient. Instead, let's see if a patient exhibits a significant weighted score of informative markers for us to be concerned that they might have the disease.
This is distinctly different from a model where every marker on a panel weighs in on a patient.
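One way to picture the difference: in a weighted-vote model, only the markers that are informative for a given patient contribute to that patient's score, while the rest abstain. A toy sketch, with cutoffs, weights, and threshold all invented rather than taken from the group's actual model:

```python
def patient_score(values, cutoffs, weights):
    """Weighted vote over a patient's informative markers.

    A marker contributes its weight only when the patient's value
    crosses that marker's cutoff; markers uninformative for this
    patient abstain, rather than diluting the score as they would
    in a purely additive intensity-based model.
    """
    return sum(w for v, c, w in zip(values, cutoffs, weights) if v > c)

def classify(values, cutoffs, weights, threshold):
    """Call 'disease' when the weighted score of informative markers
    exceeds what we would expect by chance alone."""
    return patient_score(values, cutoffs, weights) >= threshold
```

A patient can be classified on the strength of two strongly informative markers even though the third marker says nothing about them, which is the point of not making every marker weigh in on every patient.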
Does this reduce the number of patients that you would have to enroll into a study?
Well, here's the thing: If we know that most biomarkers are likely to be informative on only a subset of patients, that means there are some very good markers for which, no matter how many samples you have, you will not see enough difference among the patients to reach a significant difference for inclusion in your biomarker panel. So step one: if you use the population-based biomarker approach, you're putting yourself in a situation where you require unreasonable amounts of power to detect what might appear to be small differences. But they're not small differences. They're very large differences for some patients, but it might be only 20 percent of patients for whom that marker is informative. For the majority of your patients, the marker is irrelevant.
I'm really quite concerned that the very large clinical trials that people are saying that we need might not directly address the most pressing problem. I mean, yes, we need large sample sizes to get accurate parameter measurements, but if you're estimating the wrong parameter in the first place, then we need to rethink the business of biomarker development.
How are you going about spreading this message?
We have created a new research journal. It's online and open access. It's called Cancer Informatics. All the papers are peer reviewed. I'm the founding editor-in-chief there. From my point of view, the best way we can use the new journal is to facilitate real discussions on the fundamentals of how we look at the data. We look at the data through technology filters, statistical filters, and statistical frameworks. The more we ask hard questions about how we can best interpret these data to create a knowledge base about the underlying mechanisms of these chronic diseases, and the more we actually try out these ideas in a collaborative way, the more I think we're going to see a major breakthrough in terms of real translation.
So I think the perspective that is kind of ingrained in the way we do biomarker science, inherited from other types of science, is probably a primary reason why we don't yet see massive amounts of success in translating all of our investments into improved healthcare.
You had mentioned in your talk a technique called multivariate balancing that helps increase statistical power. Can you describe what that is?
I call it multivariate balancing. It's also called adaptive accrual. It concerns everything you can measure in a clinical population that might be a source of variability in the types of biomarker measurements you're taking in the biomarker prediction model and framework. There's a paper written by Mikel Aiken (this paper really hasn't received the attention it deserves) that describes how, if one performs randomization in the assignment of people in clinical trials to either the treatment or control group, you achieve a certain amount of efficiency. And that efficiency is achieved by the randomization procedure. If you do randomization, it turns out that you're guaranteeing that something like five or 10 percent of the studies that are done will, by chance alone, be really problematic. But we basically kind of accept that.
An alternative to that is to do what I call multivariate balancing. In multivariate balancing you use a statistical model. The statistical model is designed to try to predict whether a patient is in the treatment group or the control group before you assign them to a group. And if you can predict, based on the factors that you have collected (smoking, age, gender, etc.), that person's placement in a group, then whichever group you've predicted them to fall in, you don't put them in that group. You put them in the other group. So what you end up doing is kind of homogenizing and balancing out whatever factors would have allowed you to recover the groups by chance. And it absolutely guarantees higher efficiency than randomization.
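A sketch of the anti-prediction idea, using a nearest-centroid "model" as a stand-in for whatever statistical model one actually fits; the covariates and the alternating seeding rule here are invented for illustration:

```python
def balanced_assign(patients):
    """Assign patients to treatment/control by anti-prediction.

    For each incoming patient (a tuple of numeric covariates),
    predict their group from the covariates of patients already
    assigned (nearest group centroid here), then put them in the
    OTHER group, homogenizing the covariates across groups.
    """
    groups = {"treatment": [], "control": []}
    for i, covs in enumerate(patients):
        if not groups["treatment"] or not groups["control"]:
            # Seed the two groups alternately until both are non-empty.
            target = "treatment" if i % 2 == 0 else "control"
        else:
            def dist(name):
                members = groups[name]
                centroid = [sum(col) / len(members) for col in zip(*members)]
                return sum((a - b) ** 2 for a, b in zip(covs, centroid))
            predicted = min(groups, key=dist)  # group the model would guess
            target = "control" if predicted == "treatment" else "treatment"
        groups[target].append(covs)
    return groups
```

With patients streaming in, each assignment actively cancels whatever covariate imbalance the model can detect, rather than hoping randomization averages it out.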
I think more people should really give this a hard look as an alternative to randomized clinical trials.
What kinds of things are you working on for the future?
For the future, I'd like to develop a directed, proteomic or onco-specific bioarray that assays all known cancer biomarkers. We need to think about whether or not we know enough about a particular cancer to do targeted arrays. I think it would be better to do a cancer-wide array, because the molecular etiologies of cancers in different organs might be similar, and so panels across different organs are very likely to overlap.