This article was originally published July 9.
In a comparison of RNA-seq technology with exon arrays, researchers from the National Institutes of Health found that arrays are more sensitive and less variable, but RNA-seq enables the discovery of novel transcription events and detects a broader range of expression.
However, because next-gen sequencing technology is changing so quickly, the sensitivity differences that the study authors found between the Affymetrix Human Exon 1.0 microarray and the Illumina Genome Analyzer may not hold true anymore, noted Nalini Raghavachari, from the DNA sequencing and genomics core facility at the National Heart, Lung, and Blood Institute.
The NHLBI-led study was published last month in BMC Medical Genomics.
Raghavachari told In Sequence that the NHLBI team decided to compare the two techniques because they were in the midst of an array study, which they started in 2010, comprising around 6,000 samples that were part of the Framingham Heart Study cohort. When they began their array study, RNA-seq was still five to six times more expensive.
"At the time, people questioned, 'Why chips as opposed to next-gen sequencing?'" she said. So, the comparison was done in part to "find out what we're missing," and also to compare the two techniques and quantify the advantages and disadvantages of each strategy.
The team compared RNA-seq and array data generated from samples from six patients with sickle cell disease and four healthy controls.
For the array-based experiments, they started with 50 nanograms of RNA, and amplified it using NuGen's RNA amplification system. They then ran the samples on Affymetrix's Human Exon 1.0 microarray and analyzed the data with ExonAnova, an exon array analysis tool developed by researchers at NIH.
For the sequencing experiment, the team started with 1.5 micrograms of total RNA. Library prep was done according to Illumina's protocols, and sequencing was done on the GA, generating around 15 million reads per sample. Reads were mapped to the reference using Bowtie/TopHat, and those that mapped to more than 10 locations were discarded.
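The multi-mapping filter described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' pipeline: the alignment tuples, the function name `drop_multimappers`, and the in-memory representation are all assumptions made for the example.

```python
from collections import Counter

def drop_multimappers(alignments, max_locations=10):
    """Discard every alignment of any read that maps to more than
    `max_locations` places.

    `alignments` is a hypothetical list of (read_id, chromosome, position)
    tuples, one per reported alignment; real aligner output would need
    to be parsed into this shape first.
    """
    # Count how many locations each read was aligned to.
    hits = Counter(read_id for read_id, _chrom, _pos in alignments)
    # Keep only alignments whose read is at or below the threshold.
    return [a for a in alignments if hits[a[0]] <= max_locations]

# Toy input: read "r1" maps to 2 locations, read "r2" maps to 11.
toy = [("r1", "chr1", 100), ("r1", "chr2", 500)]
toy += [("r2", "chr3", 1000 + i) for i in range(11)]
kept = drop_multimappers(toy)
```

In the toy input, only the two alignments of "r1" survive, because "r2" exceeds the threshold of 10 locations reported in the study.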
To compare the two technologies fairly, they counted reads that mapped to each probe set selection region within each exon. Areas of low transcript counts, defined as areas where six or fewer samples had more than six transcripts, were filtered out, leaving 11,562 transcripts for further analysis. Additionally, one of the control samples had to be discarded entirely because it was found to be an outlier after the data were normalized.
As with the microarray data, ExonAnova was used to analyze alternative splicing in the RNA-seq data.
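As a rough illustration of the low-count filter, the rule as reported (drop a region when six or fewer samples have more than six transcripts) could look like the following Python sketch. The count table, region names, and function names are hypothetical, and the study's actual implementation may differ.

```python
def keep_region(counts, min_count=6, min_samples=6):
    """Return True when more than `min_samples` samples have more than
    `min_count` reads in this region (thresholds as reported in the
    article; the function itself is an illustrative assumption)."""
    return sum(1 for c in counts if c > min_count) > min_samples

def filter_regions(count_table):
    """Keep only the regions that pass the low-count rule.

    `count_table` is a hypothetical dict mapping a region name to a
    list of per-sample read counts.
    """
    return {name: counts for name, counts in count_table.items()
            if keep_region(counts)}

# Toy table with 10 samples per region.
example = {
    "region_A": [12, 9, 7, 8, 15, 20, 11, 3, 0, 7],  # 8 samples above 6
    "region_B": [7, 8, 9, 10, 11, 12, 6, 6, 0, 1],   # 6 samples above 6
}
filtered = filter_regions(example)
```

Here "region_A" is retained and "region_B" is removed, since exactly six samples above the count threshold is not enough to pass.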
The authors observed a clear separation between the healthy controls and the sickle cell disease patients in both the RNA-seq data and the microarray data. The RNA-seq data, however, showed a larger dynamic range of expression compared to the microarray data, and also allowed for the identification of novel transcripts and alternative splicing events. In this experiment, RNA-seq identified 86 novel regions that showed greater than a two-fold change between the disease and control samples.
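To make the two-fold threshold concrete, here is a minimal Python sketch of a fold-change check between disease and control groups. The article does not describe how the authors computed fold change, so the group means, the pseudocount of 1, and the function names below are assumptions chosen for illustration.

```python
import math
from statistics import mean

def log2_fold_change(disease, control, pseudocount=1.0):
    """Log2 ratio of mean disease expression to mean control expression.

    The pseudocount (an assumption, not from the study) avoids
    division by zero when a region has no reads in one group.
    """
    return math.log2((mean(disease) + pseudocount) /
                     (mean(control) + pseudocount))

def exceeds_two_fold(disease, control):
    """True when expression differs by more than two-fold in either
    direction, i.e. the absolute log2 fold change is greater than 1."""
    return abs(log2_fold_change(disease, control)) > 1.0

# Toy values: roughly a three-fold increase in the disease group.
up = exceeds_two_fold([30, 34, 32], [10, 10, 10])
# Toy values: a change well under two-fold.
flat = exceeds_two_fold([15], [10])
```

Working on the log2 scale treats increases and decreases symmetrically, so a four-fold drop and a four-fold rise give the same magnitude with opposite signs.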
RNA-seq was also able to identify sequence variation in expressed transcripts that the microarrays did not detect. For instance, the team looked at the globin gene in both the disease and healthy samples for the known driver mutation. Like microarrays, RNA-seq identified these known mutations in all the disease samples, but it further found that one of the patients was a compound heterozygote for the mutation, which cannot be picked up by microarrays.
Additionally, when looking at differentially expressed genes, those from the RNA-seq data tended to show higher fold changes than those from the microarray data.
However, technical variation among experiments was lower with microarrays, which were also more sensitive than RNA-seq.
"Even with the usage of 30 times less starting material (50 ng vs 1.5 micrograms), exon arrays could detect as many transcripts above background as in RNA-seq," the authors wrote. "Although both platforms detect similar expression changes at the gene level, the exon array is more sensitive at the exon level and deeper sequencing is required to adequately cover low-abundance transcripts."
However, the authors noted that it's possible to improve sensitivity and the ability to adequately cover low-abundance transcripts by increasing sequencing coverage. For instance, while sequencing depth in this study was around 10 million reads, if an Illumina HiSeq instrument had been used, it would have been "easy to generate ~80 million reads," the authors wrote.
Raghavachari said the finding that arrays were more sensitive was somewhat surprising. A number of factors could contribute to RNA-seq being less sensitive than arrays, she said. For example, the sample prep steps of RNA-seq could affect sensitivity, particularly if not all the ribosomal RNA was removed. Or it could be that there were too few reads or that the reads were not long enough.
She added that with newer sequencing technology and library construction methods, these sensitivity differences would be reduced, if not eliminated.
Another advantage of microarrays, she said, is that they require much less starting material. In this experiment, the team started with only 50 nanograms of RNA for the microarray experiment, but needed 1.5 micrograms of RNA for the sequencing experiment. However, newer sequencing methods that require only nanograms of starting RNA are now available.
Raghavachari said that going forward, her team is starting to move into RNA-seq using the Illumina HiSeq instrument. That work includes running RNA-seq on many of the same Framingham Heart Study cohort samples that they analyzed with microarrays, both to validate their findings and to look for novel biomarkers.
However, microarrays still have their use, she said, especially in a clinical setting where sample input may be limited.
Choosing microarrays or RNA-seq will "depend on the biological question," she said. "If you already know a pathway and you are looking to see how that pathway is getting modulated because of mutations or drug changes," then a microarray will suffice.
"But if you want to discover novel transcripts or variants, then RNA-seq is an excellent technology," she said.