Name: Guy Tillinghast
Title: Director, Clinical Trials Program, Riverside Cancer Care Center
Professional background: 2001-present, director, clinical trials program, Riverside Cancer Care Center, Newport News, Va.; 1998-2001, research associate, National Cancer Institute, Bethesda, Md.
Education: 1998, medical oncology fellowship, National Cancer Institute, Bethesda, Md.; 1995, internal medicine residency, University of Massachusetts; 1992, MD, Brown University School of Medicine, Providence, RI
In a dozen papers published this month in Nature Biotechnology and The Pharmacogenomics Journal, members of the US Food and Drug Administration-hosted Microarray Quality Control consortium discussed their attempts to identify sources of bias in array-based studies, and provided recommendations for how best to analyze microarray data when predicting clinical outcomes.
While the first phase of the MAQC project established that findings could be reproduced across different microarray platforms and laboratories, this second phase of the project sought to provide guidance on how to best build accurate and reproducible multivariate gene expression-based prediction models, also referred to as classifiers (BAN 8/3/2010).
The researchers found that model performance depended largely on the endpoint and team proficiency, and that different approaches generated models of similar performance. Batch size, or the number of samples processed, can also impact the results of a study, the consortium found.
According to Guy Tillinghast, the author of a Nature Biotech "News and Views" piece on the MAQC-II papers and medical director of the clinical trials program at the Riverside Cancer Care Center, the consortium's results should be useful for clinicians who are looking to overcome uncertainty associated with microarrays in order to introduce the technology into medical practice.
Despite "analytic challenges and concerns about accuracy and reproducibility" of array-based results in the clinic, Tillinghast said he was "amazed" by the results of the study, which showed that microarray algorithms can be reliable enough to justify clinical application, at least within certain contexts.
According to Tillinghast, several findings from MAQC-II may help bring the technology closer to clinical use. For example, in his paper, he noted that microarray experiments should be designed to minimize batch effects. In addition, he said that quality control metrics should be used to distinguish variation in gene expression caused by laboratory artifact from variation caused by clinical phenotype. Finally, he noted that the findings of MAQC-II on microarray classifiers may be useful for analyzing data from other high-throughput assays, such as next-generation sequencing.
BioArray News spoke with Tillinghast this week about how the MAQC-II findings will impact the clinical use of arrays. Below is an edited transcript of that interview.
What is your background and how and why did you get involved in MAQC-II?
I am a medical oncologist trained at the NIH. I run the clinical trials program at the Riverside Cancer Care Center. I performed cDNA microarrays while a fellow at the NIH. I have been interested in the use of microarrays for the treatment of metastatic breast cancer. My ambition was to introduce arrays into community practice by performing cancer biopsies, and developing classifiers that would assist patients with treatment of their cancer. My experience has been that when you have one of these patients, there are a large number of [therapeutic] agents to try. You might try an agent for two months and very often the agent won't work: the cancer will grow, and the patient will experience side effects of the agent. My idea was to take the guesswork out of treatment, and use gene-based treatment selection.
As part of my research and background, I read about MAQC-I and communicated with Leming Shi [a researcher at the FDA's National Center for Toxicological Research and the leader of the MAQC effort]. He invited me to participate in the MAQC-II project in 2005. I became a co-author of several papers, and the author of a summary paper. I have drafted guidelines for clinical classifier development that are available at our wiki site upon request. I also am interested in classifier development using next-generation sequencing data. I have become involved with that through participating in the Bioconductor project, which has developed a lot of open-source software for not only microarray but also sequencing data analysis.
What were your objectives and the consortium’s main objectives in this second phase of the project?
My personal perspective was that when you do clinical trials it is important to require as much rigor as possible. My objective with MAQC-II was to come up with clear guidelines for how a project should be conceived, how array-based classifiers should be developed and validated, and particularly how bioinformaticians should interact with other members of the clinical trial team, and other aspects of building and validating classifiers. That level of detail has not been generally available. Many have written about use of array technology, but it is such a complex area with so many facets that there needs to be a very clear level of understanding. Participants of internal review boards that oversee clinical trials require more rigorous details and model standard operating procedures, so that they have a sense that the work being done is beneficial, and is not going to harm the patient. It is because patients may experience toxicity in a clinical trial that utmost rigor is required. Since the FDA oversees all research in this country, I was especially interested in participating with the FDA in this project.
From the FDA perspective, they are interested in the personalized medicine directive, meaning an interest in developing technologies that personalize therapy selection. Therapy for cancer is getting ever more specialized, for example: Herceptin for the Her2/Neu subtype of breast cancer, or Erbitux for EGFR mutation negative colon cancer. Future therapy advances will depend on small subsets of patients that are defined by technology. We need this technology to determine which subset a patient belongs to. It seems like an area where arrays or some other 'omics technology will be part of this future.
What are some of the reasons why arrays have not been adopted clinically?
By "clinically," you mean with cancer, although toxicology was also investigated in the MAQC-II project. There are two array-based tests that I am aware of: Agendia's MammaPrint and Pathwork Diagnostics' Tumor of Origin Test. I have not used MammaPrint and I probably won't, as that test is mainly focused on prognosis, and I have become accustomed to using a PCR-based test, OncotypeDX, and usually other clinically available parameters can help with that. I have not used the Tumor of Origin Test yet, but anticipate that I will use it. Cancer of Unknown Primary is a small subset of cancer cases. It is way down the list in terms of frequency, but still it is a useful test to have available. So the Pathwork test addresses an unmet need. The barrier there is mainly with infrastructure. It takes a while for the clinic to adjust and adapt to a technology like that. I predict that it will get better as cancer centers get more familiar with the technology.
There are multiple barriers that will slow the development of microarray-based clinical tests. Expense is an important barrier, as it is expensive to design and carry out a project leading to a microarray test. Hopefully, the experience of the MAQC-II with designing microarray experiments will make cost estimates more accurate. Another barrier is technology change. New microarray chips cause a need for re-validation, i.e., determining whether a microarray classifier works with the new chip. These re-validation efforts can be expensive. There are multiple clinical problems for which microarray-based tests can be developed. Cancer is many diseases, and not just one disease like HIV. While these multiple clinical situations create opportunities, they also make it expensive to develop all these tests.
An important barrier that I have personally been involved with addressing is the lack of clear good practice guidelines for conducting clinical trials with microarrays and for subsequent classifier development.
Lately, there has been a distraction from other technologies, especially next-generation sequencing. In the future, maybe PCR, NGS, or proteomics will provide better predictive tests. Alternatively, each clinical situation may be best served by a different 'omic. The current situation resembles development of computers and the software industry. Computer hardware was available in the 1950s, but it took until the 1980s for computers to gain widespread use. In a few decades, we may have available a battery of 'omics-based tests that significantly improve human health, analogous to what computers have done for the world economy.
During the study, you identified sources of variability that affect the accuracy of array-based predictions. What were they?
Sources of variability with microarray-based classifiers can come from the endpoint, the microarray platform, and the statistical analysis team. Some endpoints are harder to predict than others. Disease-free survival remains easier to predict than overall survival. The variability across the platforms was investigated with MAQC-I; the MAQC-II found that experience was an important source of the variability in classifier performance, when testing the classifiers on an additional set of samples that were not a part of the original training set. Never underestimate the potential of bias to cause a deterioration in classifier performance. Experience can overcome this bias.
What are some of the findings of this study that could improve the reliability of array-based predictions? How can they be best implemented?
[In the MAQC-II study,] the more experience you had, the better your model seemed to perform. Still, each of 33 teams selected what they thought was their best model, but the reference selected models did better. That was an important lesson. In order for this technology to work, experienced biostatisticians are an essential ingredient. You kind of have one chance to do things right. If you do things wrong, you'll have to spend a lot of money to recover. You might have to go back and fix a classifier and do the external validation all over again. Having experts up front is critical to saving money down the road.
Another finding was that some endpoints are better than others. Even in the best circumstances, there are going to be some limits to arrays in predicting overall survival, for instance. The moral is you need to have clinical variables to make a prediction. The clinical stuff can be done with a questionnaire – such as age, sex, economic status. Some variables can be very predictive. In lung cancer, for instance, women always do better than men. You have to be aware of that in a study. Combining the clinical information with array information would be done in ways like this.
Another thing: I was surprised by how well the results came out. I was amazed really. But it's not going to happen all the time. You can really fail badly. It is amazing how quickly bias can creep into what you are doing. There were very smart people who participated who still got hit by bias. How you do batches, how you conduct analysis – if someone can spot something that you are doing wrong, they can help you a lot. I would add that there are going to be more papers. The data will be publicly available so it will be possible for people to train themselves on how to be good statisticians. I think education is going to be the key to introducing this technology.
You hint at the end of your paper that this study may help shepherd other technologies, like sequencing, into clinical practice. How so?
Many of the principles developed for microarrays can be applied to other 'omics. Sequencing appears to allow determination of gene-expression levels with a wider dynamic range, and less bias with regard to individual genes. Even so, many of the problems encountered with microarrays remain, particularly multiplicity. This issue is rephrased in multiple ways, such as the 'high dimensionality problem' and the 'large P small N' problem, although each of these re-phrasings illustrates different perspectives on the same fundamental problem. The aspects of designing a study will always remain a critically important part. What we learned with MAQC-II can be applied with sequencing. People are reusing what they learned already. A lot of the people who are adopting sequencing were trained with microarrays. Their background is in microarrays, and that is where they are coming from.