NEW YORK (GenomeWeb News) – Members of the Potato Genome Sequencing Consortium reported online yesterday in Nature that they have produced a high-quality draft sequence of the potato genome.
The international team sequenced a so-called doubled monoploid potato, a homozygous diploid, using a combination of Sanger, Illumina, and Roche 454 sequencing. They then used the draft genome to map sequence reads from a heterozygous diploid potato line that is more similar to commercially grown plants.
By analyzing the genome sequences, along with tissue-specific transcriptome sequences from the two lines, the team identified more than 39,000 protein-coding genes, including some that seem to contribute to tuber formation and disease resistance. The genomes, they say, will also serve as a resource for finding SNPs that can be used not only to study potato diversity, but also to breed potato crops with high-quality traits.
"Given the pivotal role of potato in world food production and security, the potato genome provides a new resource for use in breeding," co-principal investigator C. Robin Buell, a plant biology researcher at Michigan State University, and her co-authors wrote. "Many traits of interest to plant breeders are quantitative in nature and the genome sequence will simplify both their characterization and deployment in cultivars."
Potato plants belong to the same plant family as tomato, tobacco, and eggplant, within a larger angiosperm dicot clade known as the asterids, the researchers explained. But sequencing the potato genome has proven challenging because most potato cultivars are heterozygous, autotetraploid plants containing four distinct genome sequences.
Consequently, the team started by using a BAC-by-BAC method to try to sequence the genome of a simpler potato line called Solanum tuberosum group Tuberosum RH89-039-16, or "RH," a heterozygous diploid. The sequence data provided a glimpse at the extent of heterogeneity present in this line, Buell explained, but assembling the genome proved more difficult than expected, even with the BAC-by-BAC approach.
That prompted the researchers to turn their attention to a more genetically homogeneous potato plant, a doubled monoploid potato called S. tuberosum group Phureja DM1-3 516 R44, or "DM," developed through tissue culture-based methods.
"These two genotypes represent a sample of potato genomic diversity," the researchers explained. "DM with its fingerling (elongated) tubers was derived from a primitive South American cultivar whereas RH more closely resembles commercially cultivated tetraploid potato."
Because high-throughput sequencing platforms became available at around the same time, the team decided to use a combination of Illumina GA, Roche 454, and Sanger approaches to tackle the 844-million-base DM genome.
In the process, they generated a 727-million-base genome assembly, including 623 million bases that were anchored to a potato genome map using about 2,600 polymorphic markers.
With this sequence in hand, the team used the DM genome to help map whole genome shotgun reads that had been generated for the heterozygous diploid RH, Buell explained.
They then annotated the potato genome with the help of RNA sequence data from dozens of libraries made from DM and RH samples collected from different tissues, developmental stages, and stress conditions, identifying 39,031 predicted protein-coding genes. More than a quarter of these genes code for at least two isoforms, they reported.
Comparisons between the protein repertoire predicted for potato and those of 11 green plants provided hints about the potato's evolutionary history and relationships to other plants, revealing nearly 3,200 protein families that were shared amongst the plants and 2,642 genes that appear to be specific to the asterid clade.
Through their genome and tissue-specific transcript analyses, researchers also identified genes involved in tuber development, including storage genes and genes coding for components of starch biosynthesis pathways, as well as genes that seem to influence disease resistance.
And by comparing the homozygous DM diploid sequence to the heterozygous RH sequence, the team also got clues about some of the genetic mutations that contribute to inbreeding depression in the plants, which are mainly propagated vegetatively.
Together, the genomes are already offering insights into potato biology that may ultimately translate into potato crop improvements. Even so, Buell noted, there will likely need to be additional improvements to sequencing technology before it's feasible to sequence the genomes of tetraploid potato plants, since longer reads are needed to accurately resolve haplotypes.
"[T]he development of experimental and computational methods for routine and informative high-resolution genetic characterization of polyploids remains an important goal for the realization of many of the potential benefits of the potato genome sequence," she and her co-authors concluded.