NEW YORK (GenomeWeb) – Gene expression in maize is more complex than previously imagined, according to researchers from Cold Spring Harbor Laboratory.
Using single-molecule RNA sequencing, CSHL's Doreen Ware and her colleagues examined the transcriptome of six corn tissues to find that more than half of the transcripts they captured were novel, as they reported in Nature Communications today. Some of these novel transcripts were isoforms of known genes, while others were fully novel. In addition, they determined that DNA methylation plays a role in the generation of these various isoforms and described novel long non-coding RNAs and fusion transcripts.
"Our new research establishes the amazing diversity of maize, even beyond what we already knew was there," Ware said in a statement. "This diversity is fascinating in its own right and at the same time has great import for agriculture."
Using the PacBio RS II platform, Ware and her colleagues sequenced the transcriptomes of six tissues — root, pollen, embryo, endosperm, immature ear, and immature tassel — from the maize inbred line B73.
From this, they identified 111,151 unique isoforms that corresponded to some 27,000 genes. This is nearly double the number of transcripts in the reference genome annotation, they noted. The maize draft genome was published in 2009.
After setting aside transposable elements, the researchers compared the remaining isoforms against the maize reference genome gene set. Based on this, they found that 3 percent of the isoforms represented novel transcripts from novel loci, but 57 percent were novel isoforms that share at least one splice site with the annotated reference gene set.
RNA from biological replicates was also sequenced on the Illumina HiSeq 2000 platform to both verify and quantify the PacBio-generated transcript isoforms. On average, 86 percent of the splice junctions in the PacBio data from the six tissues were supported by short-read mapping, the researchers noted.
Pollen had the highest number of tissue-specific isoforms, followed by embryo, while root had the lowest, Ware and her colleagues reported. The functions of these tissue-specific isoforms varied. According to a Gene Ontology analysis, endosperm-specific isoforms were enriched for nutrient reservoir functions, a finding consistent with the tissue's role in food storage.
Long-read transcriptome data could also help fix incorrect gene models. Ware and her colleagues focused in particular on two well-characterized maize genes that are improperly annotated in the reference dataset. RGH, which was misannotated due to assembly error, was found in their data to have four isoforms, one of which matched its known structure. Meanwhile, CSR1, which lacks annotation in the reference, has two isoforms, one in the root and one in the tassel, according to this dataset.
"Beyond these two examples, we expect that the new transcriptome data will greatly improve the gene annotation," Ware and her colleagues wrote in their paper.
Non-coding RNAs also make up a significant portion of the transcriptome. Of the 878 lncRNAs the researchers uncovered, 11 corresponded to known lncRNAs and the remaining 867 were novel.
DNA methylation appears to have a role in isoform generation, Ware and her colleagues found. They reported that alternative splicing is repressed by CHG methylation at acceptor sites but promoted by CG methylation at donor sites.
They likewise uncovered some 1,430 fusion transcripts, a portion of which they validated using an Illumina paired-end read approach. This suggests, they said, that transcript fusion is more common in maize than previously thought. Many of these fusions corresponded with splice junctions, suggesting that the splicing machinery is also involved in the generation of fusion transcripts.
These findings begin "to reveal new functional parts that we didn't know about before," Ware added in a statement. "By having insight into what those other parts are and what they do, we begin to realize new ways of breeding corn, adapting it, for example, to changes in climate as average annual temperatures in growing zones continues to climb."