Ravi Kothapalli: Verification of Microarray Data Needed

AT A GLANCE: Ravi Kothapalli

Research Assistant Professor, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Fla.

Studies gene expression in LGL leukemia

PhD, University of Western Ontario, Canada

Recently published a paper in BMC Bioinformatics entitled “Microarray Results: How Accurate Are They?”

You published a paper with the title “Microarray Results: How Accurate Are They?” How did this study come about?

Approximately three and a half years ago, we performed a gene expression study on peripheral blood mononuclear cells from LGL leukemia patients using Incyte Genomics’ UniGEM-V chips. At that time, microarrays were very new. We got what we thought were good results, but we wanted to confirm them before publishing.

We selected about 20 highly upregulated genes. First, we tried to confirm the sequences of the cDNA probes on the array. Incyte, at that time, was selling the clones they were spotting on the glass slides, and we bought and sequenced these 20 clones. As it turned out, approximately 30 percent of them had completely different sequences from what we expected.

In a second step, we picked those genes with verified sequences and checked their expression with Northern blot analysis, but we could only confirm our results for about half of them.

What other problems did you come across?

Some of the probes were not specific. For example, Incyte changed the name of one of the genes on their array from granzyme H to granzyme B, so we looked at which of these is actually upregulated in our patients. Using Northern blot analysis, we only got one signal, but it turned out that the probe we used [that was derived from the cDNA array] can recognize both genes. When we performed an RNase protection assay, which has specific probes for granzyme H and granzyme B, we found equal amounts of both genes in our patients. Another problem was lack of probe specificity for different gene isoforms. Also, most of the probes on the microarray represent the 3’ ends of the genes.

What results did you get with Affymetrix arrays?

To confirm our results, we used Affymetrix HU6800 chips. We performed the experiment again, and some of the results agreed with [the] Incyte Genomics chips, but we noticed large discrepancies in the fold change. Perforin, for example, was 103-fold overexpressed with the Affymetrix system, and only 3.8-fold using the Incyte system. We performed Northern blot experiments, and the results fell in between those two extremes.

Why do you think you saw such high expression with the Affymetrix arrays?

I strongly believe that the mismatch probes on the Affymetrix arrays interfere with the calculation of the real fold change in expression. If you see more hybridization with the mismatch probes, you assume there is no specific hybridization with the real gene, but it is very difficult to interpret the results if the expression of a given gene is very low. This problem may be associated with hybridization conditions. In one instance, since there was more hybridization to the mismatch probes, the fold change was calculated as negative, even though the gene was clearly expressed, as evidenced by hybridization to perfect match probes.
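
[The effect described in this answer can be illustrated with a minimal sketch. It assumes the signal for a probe set is summarized as the mean of perfect-match minus mismatch intensities, in the style of the "average difference" value reported by early Affymetrix analysis software. The function name and all intensity values are hypothetical and are not taken from the study.]

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (PM - MM) across the probe pairs of one probe set.

    pm: perfect-match probe intensities
    mm: mismatch probe intensities, in the same order as pm
    """
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for a weakly expressed gene: the perfect-match
# probes show signal, but three of the four mismatch probes are brighter.
pm_low = [120, 135, 110, 140]
mm_low = [150, 160, 100, 170]

# Hypothetical intensities for a strongly expressed gene.
pm_high = [500, 520]
mm_high = [100, 120]

print(average_difference(pm_low, mm_low))    # negative summary value
print(average_difference(pm_high, mm_high))  # positive summary value
```

[With these made-up numbers the weakly expressed gene gets a negative summary value even though its perfect-match probes hybridized, so any fold change derived from that value is hard to interpret.]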

What is your advice to microarray users?

Whatever results you get from a microarray experiment, it is always better to confirm them with other methods like Northern blots, RT-PCR, or RNase protection assays. Microarrays are an excellent technique - there is no doubt about it - but we cannot use the results without verification. Especially if you are under financial constraints or if your samples are very precious, and you do only one microarray experiment, you have to use some other methods. Also, before you spot your cDNA microarray, you should check your sequences.

So could it be that most of the data in the literature is flawed?

No, I am not saying that. Microarray technology is definitely a great technology, but it is just the beginning. There may be some mistakes in the literature; for example, I could have published my data without Northern blot verification. Had I done so, I would have been wrong in about half the cases. If people don’t verify their results, I am not going to believe their data entirely.
