NEW YORK (GenomeWeb News) – Canadian researchers reported online today in Genome Biology that they have sequenced the draft genome and transcriptome of a marijuana-producing Cannabis sativa strain known as "Purple Kush."
Through comparative analyses of the C. sativa Purple Kush genome and transcriptome with sequences from two hemp-producing C. sativa strains known as Finola and USO-31, the team identified transcription patterns involved in the production of the psychoactive compound Δ9-tetrahydrocannabinol, or THC.
In particular, their results suggest that Purple Kush expresses a THC precursor-producing enzyme known as Δ9-tetrahydrocannabinolic acid synthase that's not expressed by the hemp plants. Instead, those plants contained transcripts for an enzyme called cannabidiolic acid synthase that does not produce psychoactive compounds.
"Biochemically and chemically, people had known that there's this bifurcation in the pathway in marijuana and hemp, but this was the first molecular evidence for what was occurring," co-corresponding author Jonathan Page, an adjunct professor at the University of Saskatchewan and researcher at the National Research Council of Canada's Plant Biotechnology Institute in Saskatoon, told GenomeWeb Daily News.
Though the psychoactive product THC is the most notorious natural product of Cannabis, the plants also produce more than 100 other cannabinoid compounds with potential pharmacological properties, including some cannabinoids that are not psychoactive.
Over thousands of years since its domestication in Central Asia, C. sativa has spread around the world, the researchers explained, where it has been grown not only for its psychoactive and potential medicinal properties, but also as a source of hemp fiber, oil, and protein-rich seeds.
This broad range of uses has led to the development of strains that vary dramatically in their features and natural product content. Plants cultivated for hemp fiber production typically contain low levels of THC, for example, while marijuana crop plants are high in THC. But the reasons for such differences are poorly understood.
"Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production," they wrote. "The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored."
To look at this in more detail, the team, led by Page and University of Toronto researcher Timothy Hughes, used a combination of paired-end and mate pair sequencing with the Illumina GAIIx, Illumina HiSeq, and Roche 454 GS FLX Titanium platforms to sequence DNA from young leaves of female C. sativa Purple Kush plants provided by a medical marijuana grower from Vancouver, British Columbia.
Overall, the sequence generated provided roughly 110 times coverage of the genome, estimated to be around 820 million base pairs.
In their de novo C. sativa Purple Kush haploid genome assembly, the team included 534 million base pairs of sequence across nearly 790 million bases of the genome overall. They then used this draft genome as a reference to re-sequence the genomes of two hemp strains, Finola and USO-31.
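As a rough sanity check on the figures quoted above, the sketch below works through the arithmetic: fold coverage multiplied by genome size gives the approximate amount of raw sequence generated, and the ratio of the two assembly figures gives the share of the assembly made up of sequenced bases. The function and constant names are illustrative only, and reading "534 million base pairs of sequence across nearly 790 million bases" as sequence content versus total assembly span is an assumption, not something the article states outright.

```python
# Back-of-the-envelope arithmetic using the rounded figures from the article.
# All names here are hypothetical; none come from the study itself.

GENOME_SIZE_BP = 820_000_000    # estimated genome size
FOLD_COVERAGE = 110             # approximate sequencing depth
ASSEMBLED_SEQ_BP = 534_000_000  # sequence included in the draft assembly
ASSEMBLY_SPAN_BP = 790_000_000  # assumed total span of the assembly


def total_sequenced_bases(fold_coverage: int, genome_size_bp: int) -> int:
    """Approximate raw bases sequenced: coverage times genome size."""
    return fold_coverage * genome_size_bp


def fraction(part_bp: int, whole_bp: int) -> float:
    """Share of `whole_bp` accounted for by `part_bp`."""
    return part_bp / whole_bp


raw_bases = total_sequenced_bases(FOLD_COVERAGE, GENOME_SIZE_BP)
seq_share_of_span = fraction(ASSEMBLED_SEQ_BP, ASSEMBLY_SPAN_BP)
seq_share_of_genome = fraction(ASSEMBLED_SEQ_BP, GENOME_SIZE_BP)

print(f"Raw sequence: about {raw_bases / 1e9:.1f} billion bases")
print(f"Sequence as share of assembly span: {seq_share_of_span:.0%}")
print(f"Sequence as share of estimated genome: {seq_share_of_genome:.0%}")
```

Because the inputs are rounded values from the article, the results are only order-of-magnitude estimates.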
In addition to genome sequencing, the researchers also used the Illumina platforms to do RNA sequencing on several Purple Kush tissues and on female flower tissue from the Finola hemp strain.
This transcriptome information proved useful for identifying some 30,000 gene transcripts for C. sativa and also for gauging the similarities and differences in gene expression between the marijuana and hemp strains.
While they did see some single nucleotide variant patterns in the genomes that may eventually help in classifying Cannabis strains, the researchers did not detect obvious differences in the genomes alone that would explain the distinct properties of the marijuana and hemp plants.
When they turned to the transcriptome data, though, the team found evidence for distinct enzyme expression in the Purple Kush cultivar compared to the Finola and USO-31 cultivars.
The marijuana-producing strain contained transcripts missing from the hemp-producing plants, namely for the enzyme THCAS, which helps make a THC precursor. On the other hand, tissue from the hemp plants contained transcripts for the CBDAS enzyme, which produces a non-psychoactive compound.
The team also found hints that the expression of other enzymes contributing to cannabinoid production is elevated in marijuana strains relative to hemp strains.
"We think this is sort of a molecular signature of the selection by humans over time of high-THC strains," Page noted. "We think this probably represents transcriptional control, where one or more transcription factors that control that pathway are turned on and then up-regulate those genes."
Nevertheless, he added, more research is needed to determine whether the transcript differences are a consequence of changes to gene regulation or to the actual genes coding for the enzymes themselves, since the hemp strain genomes were not re-sequenced completely enough to resolve this conclusively.
In the future, the researchers plan to generate a physical map of the genome and to re-sequence additional C. sativa varieties to learn more about plant genotype patterns. Page said some members of the team will likely do more in-depth biochemical analyses of cannabinoid pathways as well.
Those involved in the effort say the Cannabis genome and transcriptome information could have applications for a range of research efforts, from biomedical studies aimed at identifying new medicinal compounds to agriculturally focused studies.
"The Cannabis sativa genome enables the analysis of a multi-functional plant that occupies a unique role in human culture," the researchers concluded. "Its availability will further the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics."
Data from the C. sativa Purple Kush genome and transcriptome is available online through the Cannabis Genome Browser.
Publication of the paper follows an announcement in August by private company Medicinal Genomics that it had sequenced the genomes of two Cannabis strains. Investigators involved in that effort, which has not yet been published, said they are primarily focused on understanding the genes and pathways coding for non-psychoactive but medically relevant compounds produced by Cannabis plants.