Researchers are testing new high-throughput sequencing technologies on the sidelines of two recent large-scale sequencing projects that have been conducted by traditional sequencing methods, In Sequence has learned.
Last week, seven years after the landmark publication of the Drosophila melanogaster genome, a large international team of researchers called the Drosophila 12 Genomes Consortium published the genomes of 10 additional fruit fly species in Nature, bringing the total number of sequenced Drosophila species to 12. In a pilot project, scientists also tested 454’s sequencing technology for discovering new SNPs in several strains of two Drosophila species.
Separately, another group of scientists, working on the Tumor Sequencing Project, published their analysis of large-scale genomic changes in lung cancer samples earlier this month and is currently sequencing the exons of about 1,000 genes in almost 200 lung cancer samples, using standard Sanger sequencing of PCR products. That group is also currently testing new sequencing technologies.
New Drosophila Species Improve Annotation
All 10 new fly genomes were sequenced by traditional random shotgun Sanger sequencing. Beckman Coulter subsidiary Agencourt Bioscience generated the data for five of the new species, the Washington University Genome Sequencing Center and the Broad Institute each contributed two genomes, and the J. Craig Venter Institute was responsible for one. The genomes ranged in size from 138 megabases to 236 megabases.
Comparing the genomes allowed the scientists to discover almost 1,200 new genes in D. melanogaster, and to correct errors in hundreds of already annotated genes.
“It is clear that you get a lot of information by sequencing multiple species that are related to each other at the right distance from each other,” Doug Smith, director of Agencourt’s sequencing center, told In Sequence last week. “Each one informs the other, and you are able to discern things by looking at those together, which would be harder to see if you were looking at them one at a time.”
The analysis is also a good model for the parallel study of large numbers of mammalian genomes, Smith said, a project that the National Human Genome Research Institute has been working on for several years. One of the goals of the Drosophila project, he said, was to develop bioinformatics tools that can be used for other analyses.
New sequencing technologies got a trial in a pilot study led by Andy Clark, a professor of population genetics at Cornell University. For that study, which is not yet published, scientists at the Washington University Genome Sequencing Center sequenced 11 strains of D. melanogaster and 10 strains of D. mauritiana using 454’s platform, generating about 2.5x coverage from a single 454 run for each strain.
“This was a pilot project to assess the efficacy of 454 for SNP discovery in whole-genome shotgun runs aligned to a reference genome,” Clark told In Sequence by e-mail. The main reason the group chose 454’s technology for the project was that it “was the most readily available technology at the start,” he said. The scientists may use the results from the study to design genome-wide SNP arrays now, he added.
Lung Cancer Sequencing Supplements Array Data
Earlier this month, another group of scientists, part of the NHGRI-funded Tumor Sequencing Project, published a copy number analysis of almost 400 lung adenocarcinoma samples online in Nature. They discovered 57 frequent genomic changes, only 15 of which were linked to genes previously known to be involved in lung cancer.
In a second, ongoing phase, the three NHGRI-supported large-scale sequencing centers — Baylor College of Medicine, the Broad Institute, and the Washington University Genome Sequencing Center — are sequencing about 1,000 genes in almost 200 of the same lung cancer samples, coupling PCR with standard capillary electrophoresis sequencing.
“We worked with experts in lung adenocarcinoma and selected around 600 genes from a list of suspects,” which were combined with genes coming out of the array study, Elaine Mardis, co-director of the Wash U GSC, told In Sequence by e-mail. The work was split evenly among the three centers, which plan to publish their analysis next year.
The centers are also testing new sequencing technologies, including those from 454, Illumina, and Applied Biosystems, according to Rick Wilson, director of Wash U’s sequencing center and the sequencing project’s leader. They are coupling these with capture technologies that involve microarrays from NimbleGen and Agilent, he said, as well as “other approaches, too, that essentially accomplish the same desired effect of facilitating directed sequencing.”