Most random gene expression signatures are significantly associated with breast cancer outcome, according to a research team at Université Libre de Bruxelles.
The Belgian group, which presented its work in PLoS Computational Biology last year, compared 48 published breast cancer outcome signatures to those comprising random genes and found that 28 of the breast cancer signatures, or 60 percent, were "not significantly better outcome predictors than random signatures of identical size," while 11, or 23 percent, were actually poorer predictors than the median random signature.
Bertrand Jordan, author of the Bioessays review, called the paper "provocative and iconoclastic," as it "states that many published interpretations of expression profiling experiments are not really significant."
Jordan, who is an emeritus research director at the French National Center for Scientific Research, told BioArray News this week that the paper makes a "strong statement" that "seems to be backed by fairly solid data" and "deserved more exposure than it had received" initially.
In addition to the review and commentary, Vincent Detours, a computational biologist at ULB and corresponding author on the PLoS Computational Biology paper, said that he has received "encouragement" from other researchers since it first appeared.
"My impression is that many people were aware of the issue or felt uncomfortable about the proliferation of signatures, and were pleased to see their concerns addressed," Detours told BioArray News last week.
An 'Iconoclastic' Paper
As Detours noted in an opinion article published last December in The Scientist, the accumulation of signatures with all sorts of biological meaning, but nearly identical prognostic values, had "already looked suspicious" to him, coauthors David Venet and Jacques Dumont, and others as far back as 2007.
After collecting from the literature some signatures with as little connection to cancer as possible, the authors discovered that a signature from the blood cells of Japanese patients who were told jokes after lunch, and a signature derived from microarray analysis of the brains of mice that suffered social defeat, were both associated with breast cancer outcome by any statistical standard.
The authors then reviewed published cancer signatures and found that 60 percent were no more prognostic than signatures made by picking genes at random from among the 21,000 human genes.
While this problem occurred with single-gene markers, it became "dramatic" with multi-gene signatures, wrote Detours. And though a gene chosen at random already has roughly a one in five chance of being prognostic, for signatures made of more than 100 genes, the authors found that 90 percent are prognostic.
To explain this, the authors showed in the PLoS Computational Biology paper that in breast cancer, the "expression of a large fraction of the genome correlates with the proliferation rate, which is prognostic in this disease," Detours wrote.
As Detours noted in the opinion piece, it took the authors four years and six rejections to get the work published in a computational biology journal, which he acknowledged was "not the most efficient venue to reach the oncology community."
In his review, Jordan argued that "such a provocative finding should have been highlighted" and said that it was "difficult not to interpret this difficulty [in getting the paper published] as an expression of reluctance to question the significance of a widely practiced and widely published approach."
As for the Belgian group's findings, Jordan said that the data "speaks for itself." He specifically referenced Figure 2 in the paper, which shows how random sets of genes perform for each of 48 published signatures.
The authors tried 1,000 random sets of genes for each signature, and, as Jordan noted, in several cases the random sets performed "as well or better" in predicting overall survival in breast cancer than the set of genes "painstakingly chosen after many involved experiments" by the authors of the papers in which the signatures were published.
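For readers who want to see what such a random-signature control looks like in practice, the following is a minimal Python sketch and not the Belgian group's actual code. It assumes expression data held as a mapping from gene name to per-patient values, scores each gene set by the absolute Pearson correlation between its mean expression and a numeric outcome (a simplified stand-in for the survival analysis used in the paper), and reports the fraction of same-size random gene sets that do at least as well. The function names, the data layout, and the scoring rule are all illustrative assumptions.

```python
import math
import random
from statistics import fmean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def association_score(gene_set, expression, outcome):
    """Hypothetical scoring rule: |correlation| between the per-patient mean
    expression of the gene set and the outcome. This is an assumption made for
    illustration, not the survival model from the paper."""
    n_patients = len(outcome)
    per_patient = [
        fmean(expression[g][i] for g in gene_set) for i in range(n_patients)
    ]
    return abs(pearson(per_patient, outcome))


def random_signature_pvalue(signature, all_genes, expression, outcome,
                            n_random=1000, seed=0):
    """Empirical p-value: share of random gene sets of the same size whose
    score is at least as high as the signature's score."""
    rng = random.Random(seed)
    observed = association_score(signature, expression, outcome)
    hits = 0
    for _ in range(n_random):
        random_set = rng.sample(all_genes, len(signature))
        if association_score(random_set, expression, outcome) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)
```

Under these assumptions, a signature that returns a large value from `random_signature_pvalue` is not distinguishable from random gene sets of the same size, which is the situation the paper reports for most of the published signatures it tested.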
Among the 48 signatures Detours and his colleagues put to the test were three that diagnostics firms have commercialized as tests. These include a 16-gene signature that Genomic Health includes in its Oncotype DX breast cancer assay, the 70-gene signature that is the foundation for Agendia's MammaPrint test, and a signature that is used in Ipsogen's MapQuantDx offering.
Unlike most of the signatures tested, though, the three commercialized signatures performed "much better than almost all of the random gene sets of the same size," Jordan noted in his review.
Detours last week stressed that the paper's scope is "limited to the use of the signatures as research tools" and does not question the utility of clinical assays sold by Agendia, Genomic Health, Ipsogen, or any other firm. He added that the authors are not arguing that "all signatures are no better than random" but that 60 percent are, "so it should be no surprise that some published signatures are significantly better than random gene sets."
That being said, Detours suggested that the paper could be useful for companies that wish to use gene expression signatures as the basis for various oncology tests.
"Test developers should have the kind of negative controls, such as random signature tests, proposed in the paper," said Detours.
As far as breast cancer is concerned, he said that current prognostics are "far from perfect," but that it is "very unlikely that better predictors will be produced by mining existing expression data."
He offered that in the future a weaker immune component in ER-negative tumors may provide "prognostic power" in the treatment of breast cancer. Detours added that "no one knows what will come out of sequencing studies [that] measure different biological parameters" and that the "analysis of circulating tumor cells and circulating tumor DNA are certainly interesting."
Jordan said this week that for test developers, "all that really matters is the clinical validity of the test, ideally using as few genes as possible."
He said that the paper "leads one to suspect that many signatures are unnecessarily complex," and noted that another recent paper in the Journal of the National Cancer Institute shows that a "very good classification" can be achieved with three genes. "This is obviously very important for test developers," said Jordan.