NEW YORK (GenomeWeb) – Researchers from China, the US, and Sweden have sequenced and started analyzing a high-quality draft genome assembly for the sweet wormwood or Qinghao plant, Artemisia annua, a Chinese shrub producing the antimalarial sesquiterpene compound artemisinin.
"Currently, the supply of [artemisinin-based combination therapies] is reliant on the agricultural production of artemisinin," senior author Kexuan Tang, a metabolic and developmental science researcher at Shanghai Jiao Tong University, and his co-authors wrote. "However, plant-based production sometimes cannot meet the global demand due to the low amount of artemisinin produced in A. annua leaves."
As they reported in Molecular Plant today, Tang and his colleagues used a variety of sequencing technologies to produce a genome assembly exceeding 1.7 billion bases, sequenced to an average of 260-fold coverage. They found that sweet wormwood has a protein-coding repertoire that tops that of any other plant sequenced so far, a set that includes a vast collection of genes that appear to be specific to A. annua's clade within the angiosperms.
By digging into the plant's genome and transcriptome sequences, the team identified genes and regulatory sequences contributing to artemisinin production, and used these insights to establish transgenic plants with particularly high artemisinin content.
"Based on comprehensive genomic and transcriptomic analyses we generated transgenic A. annua lines producing high levels of artemisinin, which are now ready for large-scale production and thereby will help meet the challenge of increasing global demand of artemisinin," Tang and his co-authors wrote.
The researchers focused on the Huhao 1 cultivar, a heterozygous diploid plant known for its relatively robust artemisinin production. Using Illumina HiSeq 2500, Illumina MiSeq, Roche 454 GS FLX, and PacBio RSII instruments, they generated shotgun, mate-pair, paired-end, and long-read sequences that were assembled into a 1.74-billion-base genome spanning nearly 39,600 sequence scaffolds.
Together with transcriptome sequences from nine sweet wormwood tissues, the draft genome sequence made it possible to pin down 63,226 predicted protein-coding genes within the highly repetitive genome, the team reported, including more than 41,500 genes backed up by the RNA sequence data.
Through a series of comparative genomic, phylogenetic, gene content, and transcription factor analyses, the researchers examined A. annua's relationships to other plants, along with the gene family expansions and regulatory features that have enhanced the plant's capacity to produce terpene secondary metabolites such as artemisinin.
"[T]he expansion and functional diversification of genes encoding enzymes involved in terpene biosynthesis are consistent with the evolution of the artemisinin biosynthetic pathway," the authors wrote. "We further revealed by transcriptome profiling that A. annua has evolved the sophisticated transcriptional regulatory networks underlying artemisinin biosynthesis."