NEW YORK (GenomeWeb News) — Data generated by genomic and proteomic technologies are overly complex, causing some cancer researchers to arrive at “simplistic” or “misleading” conclusions, according to a review article in the January issue of Nature Reviews Cancer.
The review, led by researchers at Georgetown University Medical Center, concluded that scientists “don’t appreciate how complex the data is that is being generated.”
High-throughput genomic and proteomic tools “have allowed us to see that nature is more complex than we thought, and while we don’t yet know what the overarching biological rules are — such as the interrelationship between multiple signaling pathways that can lead to cancer development — we are trying to play the game like we do,” lead author Robert Clarke, professor of oncology and physiology and biophysics at GUMC’s Lombardi Comprehensive Cancer Center, said in a statement released yesterday.
“The answers to our questions are probably there in the data, but the issue is whether we can get them using these complex tools and, also, how we will know they are right when we see them,” he added.
Clarke, who is also interim director of GUMC’s Biomedical Graduate Research Organization and co-director of the school’s Breast Cancer Program, led the analysis with six other scientists from Georgetown and from Virginia Polytechnic Institute.
In the statement, GUMC said researchers like Clarke are currently studying ways to “understand the theory and properties of the data” generated by genomic and proteomic tools and “how they may affect data analysis and interpretation.”
At the core of the challenge is the fact that, in the clinical evaluation of cancer, the “thousands of active molecules” that exist in a single excised tumor sample produce “very high-dimensional data spaces.” As a result, researchers face “10,000 or so dimensions, if you consider a molecule working along a pathway as a dimension.”
Clarke used the analogy of a box, which has a height, a width, and a length. But if you add color and fiber, you add two more dimensions, he said. “There are countless things going on in a cell that could describe it; this is the essence of multi-dimensionality and these tools tell you all of that.”
Not all of these data will be relevant to the research that yielded them. “Some cells in a tumor are dying, some are not. Some are growing, others are not. Some are trying to spread and the rest aren’t,” Clarke said. “Everything is going on in a tumor at once, and all of these activities require coordination of different genes. So it may not be accurate to analyze these molecules as if they are all focused on performing a single function.
“We need to discover what specific genes perform which function,” he said. “If we knew the rules” — which genes participate in which process, for instance — “we should be able to understand some of the questions we have, but we are not there yet.”