SAN DIEGO (GenomeWeb News) – Members of the 150 Tomato Genome ReSequencing Project have completed the sequencing stage of that study on tomato genetic variation and are now starting to dive into data analysis, Wageningen University's Gabriel Sanchez-Perez said during a presentation here last night.
The 150 Tomato Genome ReSequencing Project, which kicked off last spring, was spearheaded by researchers at Wageningen University and involves an international group of collaborators from academia, the tomato industry, and government organizations.
These collaborators have been chasing down genome sequence data on around 150 cultivated and wild tomato plants from across the Solanum clade, with an eye to finding the genetic features that underpin valuable traits lost from cultivated tomato plants through processes such as diversification and domestication. That is especially interesting in light of the dramatic phenotypic diversity found in Solanum, Sanchez-Perez explained, noting that the clade includes everything from typical domestic tomato plants to flowering trees.
The group also is in the process of putting together new reference genomes from plants spread across the tomato tree to facilitate their analyses and understanding of the newly generated resequencing data.
The plants being sequenced for the study include representatives from 44 tomato landraces, 10 old tomato varieties, and around 30 additional exotic and/or wild tomato species. Each line was carefully selected based on plants' geographical range, fruit features, or phylogeny, Sanchez-Perez said, and the plant genomes were sequenced to an average depth of more than 40-fold using Illumina instruments.
To further round out the analysis, researchers did low-coverage sequencing on 60 individual plants generated by crossing domestic tomato plants from the S. lycopersicum cultivar Moneymaker with S. pimpinellifolium CGN 15528.
Because the tomato reference genome, published last year in Nature, is phylogenetically far removed from many of the plants included in the current study, he added, the resequencing team is also generating new reference genomes for three plants from other parts of the Solanum tree: S. habrochaites, S. pennellii, and S. arcanum.
Again, Illumina reads made up many of the sequences generated for the reference plants, which were each sequenced to more than 200-fold depth, on average. But the researchers are also folding in mate pair and Roche 454 reads to help in de novo assembly of each reference genome.
Sanchez-Perez said that two of the three assemblies are in pretty good shape so far, while additional work is still needed for the third.
Along with continued work on that reference genome, the team is in the process of cleaning up their resequencing data and working through the read mapping and SNP calling steps that precede more widespread analyses of the new sequence collection.
The group plans to look at a very old sample as well: bits of tomato from the 16th century that are currently housed in an Italian herbarium. Before they tackle that valuable and limited resource, though, the group is honing their tomato skills on another old tomato sample collected in Surinam in the 1800s, Sanchez-Perez noted.