NEW YORK (GenomeWeb News) – An international team led by Chinese investigators reported online yesterday in Nature Genetics that it has sequenced and started analyzing the draft genome of the wild South American cotton plant Gossypium raimondii, which is related to the commercially important cotton species G. hirsutum and G. barbadense.
One of the two sub-genomes present in tetraploid G. hirsutum and G. barbadense plants, known as sub-genome "D," is believed to have come from a G. raimondii-like ancestral plant. The researchers reasoned that sequencing the relatively straightforward diploid G. raimondii genome could lay the foundation for more extensive future genetic studies on the tetraploid plants that are regularly used for cotton fiber production.
"We believe that sequencing of the G. raimondii genome will not only provide a major source of candidate genes important for the genetic improvement of cotton quality and productivity, but it may also serve as a reference for the assembly of the tetraploid," co-corresponding author Shuxun Yu, a researcher with the Chinese Academy of Agricultural Sciences' Cotton Research Institute, and colleagues explained.
The team used paired-end Illumina HiSeq 2000 sequencing to take on the 880-million-base G. raimondii genome, generating sequence that covered around 88 percent of the plant's genome assembly to an average depth of nearly 104-fold.
Genomic DNA for the study came from a G. raimondii accession called CMD 10 that is thought to have been rendered more or less homozygous through several rounds of self-fertilization.
Almost three-quarters of the assembled sequence could be anchored to one of the plant's 13 chromosomes, the researchers reported, while analyses of the genome sequence uncovered an estimated 40,976 protein-coding genes. Of these, more than 92 percent were verified by sequencing RNA transcripts in G. raimondii tissue.
RNA-sequencing experiments on samples from G. raimondii and one of the commercial cotton species, G. hirsutum, also offered the team a chance to look at some of the transcriptional patterns contributing to cotton fiber development, since G. hirsutum produces these fibers, while G. raimondii does not. For instance, that analysis indicated that three of four sucrose synthase enzyme genes in the cotton genome are more highly expressed in G. hirsutum than in G. raimondii, while transcripts for some other biosynthetic genes appeared to be exclusively expressed by the fiber-producing species.
When they compared the new G. raimondii genome with those of several plants sequenced in the past, investigators saw an over-representation of transposable elements and fairly sparse gene density in the cotton plant's relatively puny genome.
Patterns in the plant genomes also pointed to at least two duplication events in the lineage leading to G. raimondii: one event that occurred in the ancestor of all eudicot plants, producing six copies of the genome, and another, more recent whole-genome duplication event that seems to have been specific to plants in the cotton genus.
Based on analyses of gene families that are shared between various plant species, the team determined that G. raimondii is part of a subclade that also includes the cocoa plant Theobroma cacao but is distinct from the subclade that houses plants such as Arabidopsis thaliana and papaya.
Consistent with this apparent relationship, they found, both the cotton and cocoa genomes contain a family of CDN1 genes that contribute to the production of a plant compound used to ward off some herbivores and potential pathogens.
Understanding these and other defense genes may eventually prove useful for improving commercial cotton plants, the study's authors noted, as will additional work aimed at untangling the genetics behind other cotton traits of interest.
"We suggest that sequencing of the G. raimondii genome is a major step toward fully deciphering and analyzing the genomes of the Gossypium family to improve cotton productivity and fiber quality," they concluded.
Earlier this year, a University of Georgia-led team participating in the US Department of Energy Joint Genome Institute Community Sequencing Program effort announced that it had sequenced G. raimondii using next-generation technologies and was making the genome sequence data publicly available.
Members of that group also produced a physical map for G. raimondii, published in the journal BMC Genomics in 2010.