NEW YORK (GenomeWeb) – Capping off a nearly eight-year effort, a team led by researchers from the University of Delaware's Delaware Biotechnology Institute has published a comprehensive atlas of small RNAs and microRNAs from three species of algae and more than 30 species of model and non-model vascular plants.
With the support of a $1.1 million National Science Foundation grant, in 2006 DBI's Blake Meyers and Pamela Green began the project, building off a previous effort to sequence sRNAs in rice and Arabidopsis. Specifically, they aimed to extend what is known about sRNA diversity and evolution beyond model plants and get a better idea of just how conserved miRNAs are as gene regulators.
According to their paper, which appeared last month in Nature Communications, the investigators developed and analyzed a set of sRNA libraries from 34 plant species including three green algae, one fern, three orders of gymnosperms, three basal angiosperms, nine monocots, and 14 eudicots. Among these are economically important crops, representatives of grasses and tree species that are key sources of wood and/or biofuel, and members of two plant families — Poaceae and Solanaceae — that have been a priority for the plant genomics community.
For each species, the team analyzed up to three samples. These generally included ones from leaves, reproductive organs, other organs such as roots or pods, tissues from plants under biotic or abiotic stress, and tissues of agronomic interest such as cotton fibers. For the green algae species, samples were obtained from three different growth conditions.
"As such, we expect this set of samples to contain the majority of miRNAs encoded in the genome of the corresponding species," the scientists wrote in Nature Communications.
The result was 99 sRNA sequencing libraries, which contained a median of 3.9 million reads and 1.2 million distinct sequences per library. Combined, the libraries represented 461 million sRNAs, with 132 million unique sequences.
From these libraries, the research group assembled a list of miRNA sequences that included sequences matching both reference miRNA and miRNA* sequences, as well as nucleotide substitution variants, also known as isomiRs. It then calculated the number of distinct miRNA sequences in each of the 34 plant species studied, finding 100,014 unique sequences.
An analysis of the sequence abundances revealed a "major divide" in miRNA evolution between algae and terrestrial plants, according to the paper. For example, "no higher-plant miRNAs were observed in C. corallina, a species that has been described as the green algae most closely related to land plants." Overall, no miRNA families common to both algae and terrestrial plants could be identified.
As for miRNA conservation across terrestrial plant species, a total of 82 known miRNA families were found to be present across several terrestrial plants. Some of these were ubiquitous and highly expressed across all terrestrial species, but the majority were either abundant in some species and present at low levels in most others, or specific to only certain species.
The investigators also observed a significant correlation between miRNA abundance and conservation in plants, with sequences belonging to 21 highly conserved families accounting for 54 to 98 percent of all miRNA sequences in almost all species.
"In addition, the low overlap between our identified miRNA sequences and the sequences present in miRBase … suggests that most miRNA sequences are species specific," the researchers wrote. Indeed, even the closely related species A. thaliana and A. lyrata each have specific miRNAs.
Notably, 90 percent of the identified miRNA sequences were isomiRs, and the number of sequence variants for a particular family was correlated with the number of reads for that same family. "It would thus appear that abundance leads to sequence diversity," the researchers wrote in their paper.
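To make that relationship concrete, the following is a minimal Python sketch of how one might tabulate reads and distinct variants per family and then compute a Pearson correlation between the two. It is not the authors' pipeline: the record layout, the family names, the sequences, the read counts, and the function names `family_summary` and `pearson` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import sqrt


def family_summary(records):
    """Map each family to (total reads, number of distinct sequences)."""
    reads = defaultdict(int)
    variants = defaultdict(set)
    for family, sequence, count in records:
        reads[family] += count
        variants[family].add(sequence)
    return {family: (reads[family], len(variants[family])) for family in reads}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical toy records: (family, sequence, read count). Not data from the paper.
records = [
    ("family_a", "AAACCCGGGUUUAAACCCGGG", 1000),
    ("family_a", "AAACCCGGGUUUAAACCCGGA", 200),
    ("family_a", "AAACCCGGGUUUAAACCCGG", 50),
    ("family_b", "UUUGGGCCCAAAUUUGGGCCC", 300),
    ("family_b", "UUUGGGCCCAAAUUUGGGCCA", 20),
    ("family_c", "GGGAAAUUUCCCGGGAAAUUU", 10),
]

summary = family_summary(records)
total_reads = [reads for reads, _ in summary.values()]
variant_counts = [n_variants for _, n_variants in summary.values()]
r = pearson(total_reads, variant_counts)
```

In this toy input the family with the most reads also has the most variants, so `r` comes out positive. A real analysis would need many more families and, most likely, a rank-based or log-scaled comparison, since read counts span several orders of magnitude.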
The significant presence of low-abundance sequences raises questions about their biological importance, they added. Studies examining the effects of miRNA depletion have been carried out for conserved families, demonstrating their various roles in plant development, but the same types of experiments have not been conducted for low-abundance, non-conserved miRNA families.
"It is therefore unclear whether the biological role of a miRNA is a function of its abundance, or whether a difference exists between the type of target genes regulated by high- versus low-abundance miRNAs from the same family," the researchers wrote.
The work presented in Nature Communications also brings up the issue of what constitutes a "canonical" sequence and the definition of an isomiR. The data show that the most abundant sequences can vary across species or across cell types in the same organ of a particular species, and can be absent from widely used databases.
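As an illustration of how the most abundant sequence can shift from one context to another, the minimal Python sketch below picks the top-read sequence for each combination of family, species, and tissue. The record layout, the placeholder names and sequences, the read counts, and the function name `canonical_by_context` are hypothetical and are not taken from the paper or its tools.

```python
from collections import defaultdict


def canonical_by_context(records):
    """Return the most abundant sequence for each (family, species, tissue)."""
    totals = defaultdict(lambda: defaultdict(int))
    for family, species, tissue, sequence, reads in records:
        totals[(family, species, tissue)][sequence] += reads
    # Highest read count wins; ties are broken alphabetically for determinism.
    return {
        context: min(counts, key=lambda seq: (-counts[seq], seq))
        for context, counts in totals.items()
    }


# Hypothetical toy records, not data from the paper.
SEQ_1 = "AAACCCGGGUUUAAACCCGGG"
SEQ_2 = "AAACCCGGGUUUAAACCCGGGA"
records = [
    ("family_1", "species_A", "leaf", SEQ_1, 900),
    ("family_1", "species_A", "leaf", SEQ_2, 150),
    ("family_1", "species_A", "root", SEQ_1, 40),
    ("family_1", "species_A", "root", SEQ_2, 75),
]

canonical = canonical_by_context(records)
```

With this toy input, the leaf and root entries for the same family resolve to different sequences, which is the situation the data describe: no single sequence is the top one in every context.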
"It would therefore appear that [a] canonical sequence will have to be defined on a case by case basis as the most abundant sequence for a particular family in a particular species/organ/cell type/developmental stage," the team stated.
Overall, the newly published atlas is expected to provide a key resource for studies tackling these and other questions. "By greatly enhancing the depth and breadth of plant sRNA sequencing and analysis available to date, this study provides new insights and a resource that will be beneficial well into the future," the scientists concluded.
Database resources and computational analysis tools developed during the project are available online.