NEW YORK (GenomeWeb News) – Researchers from the US, UK, and Germany have sequenced, assembled, and started to analyze an extensive set of genes for bread wheat, Triticum aestivum, en route to a full genome sequence for the plant.
"This work moves us one step closer to a comprehensive and highly detailed genome sequence for bread wheat," co-author Jan Dvorak, a plant sciences researcher at the University of California, Davis, said in a statement, "which, along with rice and maize, is one of the three pillars on which the global food supply rests."
But while Dvorak said the work, described online today in Nature, has "yielded important information that will accelerate wheat genetics and breeding and help us better understand wheat evolution," he cautioned that the effort represents "one step in the global effort to produce a high-quality draft of the bread wheat genome sequence."
For the study, the researchers performed shotgun sequencing to take on the massive hexaploid genome — an amalgam of three genomes representing the ancient grasses that came together to form the plant. Along with sequence data for bread wheat, they also folded in sequence information for three related grass plants, developing gene assemblies and finding SNPs within each of wheat's component genomes, called the A, B, and D genomes.
"Although the assemblies are fragmentary," the team noted, "they form a powerful framework for identifying genes, accelerating further genome sequencing and facilitating genome-scale analyses."
From this sequence data, investigators unearthed as many as 96,000 predicted protein-coding genes in wheat, identifying gene families that have been nipped, tucked, or expanded during the plant's evolution and domestication.
Bread wheat's origins are believed to stretch back some 8,000 years, the study's authors said. The plant, an apparent hybrid between the tetraploid emmer wheat T. dicoccoides and a diploid goat grass in the Aegilops tauschii species, has been domesticated and cultivated alongside the spread of agriculture. It is now produced on a massive scale, making up an estimated one-fifth of the calories consumed by humans.
Like other grasses, wheat has a dynamic genome replete with repeats. Past genetic studies hint that around 80 percent of wheat's estimated 17-billion-base genome consists of retroelements and other repeats. That repeat content, coupled with the genome's size and polyploid nature, has made the genome difficult to fully characterize.
For the current study, researchers used Roche 454 GS FLX Titanium and GS FLX+ platforms to sequence nuclear DNA from a Chinese Spring wheat variety known as CS42, generating enough sequence to cover the genome at an average depth of around five-fold. To that, they added SOLiD 3 and SOLiD 4 sequence data for CS42 and plants from three other commercial wheat varieties.
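As a rough illustration of what five-fold coverage of a genome this size implies, the arithmetic can be sketched as follows. The genome size and depth are the figures quoted above; the 400-base mean read length is a hypothetical value chosen only for illustration and is not taken from the study.

```python
# Back-of-the-envelope coverage arithmetic (a sketch, not the study's method).
# Average depth = total sequenced bases / genome size.

GENOME_SIZE_BP = 17_000_000_000  # estimated bread wheat genome size (from the article)
TARGET_DEPTH = 5                 # approximate average depth (from the article)
MEAN_READ_LENGTH_BP = 400        # ASSUMPTION: illustrative read length, not from the study


def bases_needed(genome_size_bp: int, depth: int) -> int:
    """Total bases required to cover the genome at the given average depth."""
    return genome_size_bp * depth


def reads_needed(genome_size_bp: int, depth: int, mean_read_length_bp: int) -> int:
    """Number of reads required, rounded up, for a given mean read length."""
    total = bases_needed(genome_size_bp, depth)
    return -(-total // mean_read_length_bp)  # integer ceiling division


if __name__ == "__main__":
    total = bases_needed(GENOME_SIZE_BP, TARGET_DEPTH)
    reads = reads_needed(GENOME_SIZE_BP, TARGET_DEPTH, MEAN_READ_LENGTH_BP)
    print(f"Bases needed: {total:,}")
    print(f"Reads needed at {MEAN_READ_LENGTH_BP} bp each: {reads:,}")
```

Under these assumptions, five-fold coverage works out to roughly 85 billion sequenced bases; the read count scales inversely with whatever read length is actually achieved.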
To help assess sequences associated with each wheat component genome, the team also relied on sequence data for related grass species assessed by Roche 454, Illumina, and/or SOLiD sequencing.
For instance, additional data on wheat's "D-genome" came from folding in SOLiD 4 and Roche 454 GS FLX Titanium sequence data for A. tauschii. Similarly, T. monococcum, a plant resembling the ancestral organism that contributed wheat's "A-genome," was sequenced with Illumina technology, while researchers tapped previous, unpublished data for information on a "B-genome"-related plant called A. speltoides.
In addition to comparisons with sequences from plants such as rice and maize, which helped in assembling bread wheat sequences into gene models, the researchers compared wheat sequences to those of A. tauschii to see how gene content has shifted in the polyploid plant relative to the diploid descendants of its ancestors.
Overall, the team estimated that the bread wheat genome houses somewhere on the order of 94,000 to 96,000 genes, along with a slew of pseudogenes and gene fragments that are believed to stem from transposon or retroelement activity in the genome.
The polyploid genome appears to have eliminated around 10,000 to 16,000 of the genes found in ancestral plants. But gene families involved in a few processes, such as plant growth and metabolism, seem to have expanded during wheat domestication.
The data described in the study are thought to represent a more or less complete set of wheat's genes, study authors said, though not all of the genes have been assigned to a specific component genome.