It’s hard to find significant associations in multigenic studies of diseases influenced by complex genetic and environmental factors. That’s probably why Marylyn Ritchie became so popular after her talk this week at a pharmacogenomics meeting at Cold Spring Harbor Laboratory in New York.
Ritchie’s lab is working on a new version of its Multifactor Dimensionality Reduction (MDR) program, which analyzes the interactions of as many as 15 different genetic factors: SNPs located in the same gene or in separate genes. The goal is to associate them with binary endpoints, such as two disease states. The program has no sample-size limit.
If successful, the new version, due out next spring, will become a novel statistical tool for drug makers to use in analyzing reams of pharmacogenomics data to identify interactions that many other methods can miss.
The purpose of MDR is to find gene-gene or gene-environment interactions in the absence of main effects, according to Ritchie. The method “has more power” than most traditional statistical methods because, rather than leaving the data in its “full dimensionality,” MDR collapses it into a single dimension with two endpoints, she said. As a result, a smaller sample size is needed to detect interaction effects.
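The collapsing step Ritchie describes can be illustrated with a short Python sketch. It is not the Ritchie lab's code: the function names, the toy data, and the rule used here (a genotype combination is labeled "high" risk when its case-to-control ratio meets a threshold, otherwise "low") are assumptions made for illustration only.

```python
from collections import defaultdict


def mdr_collapse(genotypes, status, threshold=1.0):
    """Pool multilocus genotype cells into a two-level risk variable.

    genotypes: one tuple of genotype codes per subject (hypothetical coding)
    status:    1 for a case, 0 for a control, in the same order
    threshold: case:control ratio at or above which a cell is "high" risk
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for cell, s in zip(genotypes, status):
        counts[cell][s] += 1

    labels = {}
    for cell, (controls, cases) in counts.items():
        # An observed cell with no controls must contain only cases.
        ratio = cases / controls if controls else float("inf")
        labels[cell] = "high" if ratio >= threshold else "low"
    return labels


def classification_accuracy(genotypes, status, labels):
    """Fraction of subjects whose cell label matches their case status."""
    correct = sum(
        (labels[cell] == "high") == (s == 1)
        for cell, s in zip(genotypes, status)
    )
    return correct / len(status)


# Toy data: two SNPs with no main effect but a pure interaction.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
status = [0, 0, 1, 1, 1, 1, 0, 0]

labels = mdr_collapse(genotypes, status)
accuracy = classification_accuracy(genotypes, status, labels)
print(labels)
print(accuracy)
```

In the toy data neither SNP predicts disease on its own, since each genotype of each SNP is split evenly between cases and controls. The pooled high/low variable separates the two groups, which is the kind of interaction without main effects that the method targets.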
“A lot of the drug companies already have [data needing this kind of analysis],” said Ritchie, adding that she has spoken to several drug-company representatives looking for a way to meaningfully analyze that data. At the Cold Spring Harbor meeting, Ritchie spoke with officials from Pfizer, GlaxoSmithKline, and “smaller companies” based in Europe and in the United States about the functions of her software.
Ritchie’s group is also “in discussion with lots” of small companies that are interested in licensing the existing version of the software but cannot yet afford it, because the current pricing scheme is targeted to big pharma, she said. Her group has been revamping its prices and hopes to work with MDR users as consultants, helping to interpret data and operate the program, she said. The software is free for academics.
“Nearly every talk [at the meeting] said, ‘We still have to look for epistasis and gene-gene interactions—we haven’t been able to do that,’” said Ritchie.
The current version of MDR is available on the Ritchie laboratory website (http://chgr.mc.vanderbilt.edu/ritchielab), but it has a drawback that the new version will avoid: Independent variables are limited to six levels. For example, a polymorphism having three alleles has six possible genotypes. At the moment, MDR can accept no more.
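The six-genotype figure follows from counting unordered pairs of alleles: k alleles give k(k+1)/2 genotypes. A brief Python check, using hypothetical allele names A, B and C:

```python
from itertools import combinations_with_replacement


def unordered_genotypes(alleles):
    """List every unordered pair of alleles, homozygotes included."""
    return list(combinations_with_replacement(alleles, 2))


# Hypothetical three-allele polymorphism.
genos = unordered_genotypes(["A", "B", "C"])
print(genos)
print(len(genos))  # 3 * (3 + 1) / 2 = 6
```

By the same count, a four-allele polymorphism would have 10 genotypes, which exceeds the six-level limit described above.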
While Ritchie was explaining MDR to Pharmacogenomics Reporter, the interview was briefly interrupted by a conference attendee with questions about interactions that might be hiding in her data. The attendee found a main effect — “quite a high p-value of 2C19” — without finding evidence of interaction involving CYP2D6 and 2C19 using a traditional statistical method.
“But I wonder whether we could send you that,” said the attendee. “It’s a small sample, but the fact that we found a main effect on one gene, it might still be worth trying to find a gene-gene interaction.”
The analysis would “definitely be worth it,” responded Ritchie. “I hate trying to tell people the lower limit of sample size because it really depends on the effect.”
Following Ritchie’s talk, Rebecca Blanchard, an associate member of the Fox Chase Cancer Center Department of Pharmacology, wanted to know whether MDR would be appropriate for a case-series study design having only cases and no controls. In particular, she said she hopes to associate estrogen metabolism polymorphisms with age of onset, tumor size, and the grade of the tumor.
“Most pharmacologists are not savvy in statistics — at least not in the level of statistics she brings to the table,” Blanchard told Pharmacogenomics Reporter.
With standard regression methods, the frequency of some polymorphisms can require “thousands and thousands” of subjects to produce data that one can be “comfortable with,” said Blanchard. The MDR program might be able to identify gene-gene interactions in smaller population sizes, and identify those that are clinically meaningful, she said.
The MDR program first went public in a 2001 paper in the American Journal of Human Genetics, and Ritchie’s group began to distribute the software to academic users later that year, she said. Since 2002, the number of licensees has “really picked up,” with about 150 current users worldwide, including Jianfeng Xu at Wake Forest University; Vessela Kristensen at the NIH; and Benjamin Raby at Brigham and Women’s Hospital, said Ritchie. The program is free for academic users and licensed for a fee to industry. Possibly as a result, the first MDR version has no commercial users.
A few new versions of the MDR program are in development, in addition to the version that the group hopes to have ready by next spring. Ritchie, Jason Moore (her former principal investigator at Vanderbilt), and Eden Martin of Duke University are working on an MDR pedigree-disequilibrium test that uses a statistical method developed by Martin to detect interactions in extended pedigrees.
Another upcoming version features multiple endpoint levels, and classifies data into more than two groups — for example, four different tumor disease states, said Ritchie.
A parallel version of MDR that the group plans to release “next summer” will distribute processing among “several hundred nodes” of a computer, Ritchie said. The parallel program, called pMDR, is already working on the computer system in her lab, but it hasn’t yet been tested on another system. Ritchie and her colleagues plan to publish a scientific article featuring the program before distributing it.
The Ritchie lab is also in the “thought process phase” of a genome-wide-association MDR, to be called gaMDR. Along with pMDR, it is the version Ritchie expects to be most in demand by big pharma, since “they are already collecting genome-wide association data.”