Combining cDNA microarrays and artificial neural networks (ANN), researchers at the National Human Genome Research Institute and Lund University in Sweden have developed a genetic fingerprinting method for diagnosing four difficult-to-distinguish childhood cancers.
The researchers, who published their findings in the June 2001 issue of Nature Medicine, analyzed gene expression patterns in four similar types of small, round blue-cell tumors (SRBCT): neuroblastoma, rhabdomyosarcoma, non-Hodgkin’s lymphoma, and the Ewing family of tumors.
“Tumors are currently diagnosed by histology and immunohistochemistry based on their morphology and protein expression. However, poorly differentiated cancers can be difficult to diagnose by routine histopathology,” wrote lead author Javed Khan, a principal investigator at the National Cancer Institute. “Here we developed a method of diagnostic classification of cancers from their gene-expression signatures and identified the genes that contributed to this classification.”
In their experiment, Khan and colleagues prepared cDNA microarrays using the NHGRI’s microarray protocol. The arrays included 3,789 sequence-verified genes and 2,778 ESTs from Research Genetics. Next, the researchers hybridized samples of tumor tissue and cell lines from the four different types of cancers to the arrays. They imaged the array expression patterns using DeArray software, which was developed at the NHGRI and licensed to Scanalytics, and removed genes with a red intensity of less than 20.
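The intensity filter described above can be sketched in a few lines of Python. This is only an illustration: the function name, the toy gene names, and the choice to require the minimum intensity in every sample are assumptions, since the article does not say how the cutoff was applied across samples.

```python
def filter_genes(red_intensity, threshold=20.0):
    """Keep genes whose red-channel intensity is at least `threshold` in every sample.

    Assumption: the cutoff is applied per sample, and a gene is dropped
    if any one sample falls below it.
    """
    return [
        gene
        for gene, values in red_intensity.items()
        if all(value >= threshold for value in values)
    ]


# Hypothetical toy data: gene name -> red intensities across three samples.
toy_intensities = {
    "gene_a": [150.0, 98.5, 210.0],
    "gene_b": [12.0, 45.0, 30.0],  # one sample below 20, so the gene is removed
    "gene_c": [20.0, 25.0, 64.0],  # exactly 20 is not "less than 20", so it is kept
}

kept = filter_genes(toy_intensities)
print(kept)
```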
To convert their expression results into a diagnostic classification method, the researchers employed a Linux-based artificial neural network, a parallel computational processing system set up to mimic the biological neural networks of the human brain.
The ANN, which like the brain can learn to discern certain patterns and place them into categories, is currently used by the FBI for fingerprint, voice, and handwriting recognition.
“The ANN is a method of error minimization,” said Khan. “It goes through a learning epoch, where it tries to learn the features of gene expression for a particular sample, then it goes back and feeds in these [diagnostic criteria] and tries to use them to diagnose a new sample.” The ANN gets feedback about whether the diagnosis is correct, then incorporates this information into its classification criteria and goes back to another learning iteration.
To teach the ANN to diagnose tumors based on gene expression patterns, the researchers first input the gene data from 63 “training samples” that had already been accurately typed based on other methods. They condensed the gene expression information from the nearly 4,000 expressed genes into 10 components using a method called principal component analysis.
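For readers unfamiliar with principal component analysis, the following is a minimal pure-Python sketch of the idea: center the data, find the directions of greatest variance, and describe each sample by its coordinates along those directions. The function name `pca_project`, the power-iteration approach, and the toy data are all assumptions made for illustration. The study's own implementation is not described here, and a dataset with thousands of genes would call for a numerical library rather than this toy code.

```python
import math
import random


def pca_project(samples, n_components, iterations=200, seed=0):
    """Project samples (rows) onto their top principal components.

    Illustrative method: power iteration with deflation on the covariance matrix.
    """
    rng = random.Random(seed)
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    components = []
    for _ in range(n_components):
        # Start from a random unit vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left to capture
            v = [x / norm for x in w]
        # Variance explained along v (the eigenvalue estimate).
        eigenvalue = sum(
            v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
        )
        components.append(v)
        # Deflate: remove the variance already explained by v.
        cov = [
            [cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
            for i in range(d)
        ]
    return [
        [sum(r[j] * c[j] for j in range(d)) for c in components] for r in centered
    ]


# Toy data: four samples measured on two "genes"; most variance lies on the first.
toy_samples = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
projected = pca_project(toy_samples, n_components=2)
```

The sign of each component is arbitrary, so only the magnitude of a projected coordinate is meaningful in this sketch.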
The researchers input these 10 components from each sample and asked the ANN to assign the samples to one of four types based on the component variation. By repeating these experiments through 100 learning epochs, the researchers were able to train the ANN to correctly diagnose the cancers 100 percent of the time.
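The train-by-repetition idea can be illustrated with a small stand-in model. The sketch below is not the researchers' network: it swaps in a single-layer softmax classifier trained by gradient descent, and the function names, learning rate, and two-feature toy data are assumptions. It does keep the 100 learning epochs and the four output classes mentioned above.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, x):
    scores = [
        sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)
    ]
    return softmax(scores)


def predict(weights, biases, x):
    probs = predict_proba(weights, biases, x)
    return max(range(len(probs)), key=probs.__getitem__)


def train(features, labels, n_classes, epochs=100, learning_rate=0.1):
    """Minimize classification error by repeated passes (epochs) over the data."""
    n_inputs = len(features[0])
    weights = [[0.0] * n_inputs for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = predict_proba(weights, biases, x)
            for k in range(n_classes):
                # Feedback: difference between predicted and true class membership.
                error = probs[k] - (1.0 if k == y else 0.0)
                for j in range(n_inputs):
                    weights[k][j] -= learning_rate * error * x[j]
                biases[k] -= learning_rate * error
    return weights, biases


# Hypothetical toy data: two component values per sample, four tumor classes.
toy_features = [
    [2.0, 2.0], [2.5, 1.5],      # class 0
    [-2.0, 2.0], [-1.5, 2.5],    # class 1
    [-2.0, -2.0], [-2.5, -1.5],  # class 2
    [2.0, -2.0], [1.5, -2.5],    # class 3
]
toy_labels = [0, 0, 1, 1, 2, 2, 3, 3]

weights, biases = train(toy_features, toy_labels, n_classes=4)
predictions = [predict(weights, biases, x) for x in toy_features]
```

In this toy setup the four classes are well separated, so the classifier reproduces every training label. Real expression data would need held-out samples to check that the model generalizes.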
The researchers then tested the ANN model using 25 blinded test samples, 20 of which came from the four cancers and five of which were from other tissues. The model correctly classified all of them, although three correctly classified SRBCT samples were discarded because the confidence in their classification was below 95 percent. The researchers also identified 41 genes not previously known to be related to these cancers.
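A confidence cutoff of this kind can be sketched as follows. How the study measured confidence is not described here, so this example simply treats the highest class probability as the confidence, which is an assumption, as are the function name and the example probabilities.

```python
TUMOR_TYPES = [
    "neuroblastoma",
    "rhabdomyosarcoma",
    "non-Hodgkin's lymphoma",
    "Ewing family of tumors",
]


def diagnose(probabilities, class_names=TUMOR_TYPES, min_confidence=0.95):
    """Return the most likely class, or None if its probability is below the cutoff.

    Assumption: "confidence" is taken to be the top class probability.
    """
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] < min_confidence:
        return None  # too uncertain to report a diagnosis
    return class_names[best]


# Hypothetical model outputs for two test samples.
confident = diagnose([0.97, 0.01, 0.01, 0.01])
uncertain = diagnose([0.60, 0.30, 0.05, 0.05])
```

Under such a rule, a sample can be set aside even when its top-ranked class turns out to be the correct one.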
The result, the researchers wrote, “supports the potential use of these methods as an adjunct to routine histological diagnosis.”
The analysis method, however, does not provide information on which genes have a causative effect on which cancers and lacks the sensitivity to detect some differential gene expression patterns.
With refinements including higher-density arrays, Khan believes this microarray-ANN diagnostic method could easily make the leap from research lab to clinical setting.
“The next step is to have national prospective trials with more patients, and to use the microarray not only to diagnose a [cancer] but to find the genes that correlate with a bad prognosis and [use them] to target these patients for treatments,” Khan said.