Three research teams have independently developed methods that enable next-generation technologies to sequence portions of a human genome, a necessary step for large-scale human exon sequencing studies or candidate gene-sequencing projects.
The methods are noteworthy because until now, researchers have mostly used PCR to selectively amplify short stretches of DNA, but this approach cannot be multiplexed to a high degree.
Two of the three approaches use NimbleGen microarrays to capture genomic regions, followed by sequencing: one on 454’s Genome Sequencer and the other on Affymetrix and NimbleGen resequencing arrays. The third method uses a modified version of molecular inversion probes that targets exons with oligonucleotides that were released from an Agilent microarray, and sequences the amplified exons using Illumina’s Genome Analyzer. All three selection methods could be used, in principle, with any of the current next-gen sequencing technologies, according to the researchers who developed them.
The three approaches, which were published in separate articles in Nature Methods this week, need to be further refined, but researchers are confident that they will greatly increase the utility of next-generation sequencers.
“Any time you want to sequence just a portion of a genome, rather than the whole genome, these enrichment techniques are very valuable,” said Michael Zwick, an assistant professor at Emory University School of Medicine and an author of one of the papers.
“It’s pretty simple but it’s quite profound, and it will shortly have a huge impact,” predicted Richard Gibbs, director of Baylor College of Medicine’s Human Genome Sequencing Center and an author of one of the other reports.
NimbleGen & 454
Gibbs’ team collaborated with NimbleGen Systems and used the company’s custom high-density oligonucleotide microarrays to capture almost 7,000 exon sequences, as well as target areas around the BRCA1 gene locus that ranged in size from 200 kilobases to 5 megabases. The team analyzed the results on a 454 FLX sequencer.
According to Gibbs, what mainly distinguishes the methods is their upfront cost per assay, how evenly the targeted regions are represented after enrichment, how flexible the assay is, and how well they capture variant alleles.
Custom microarrays, he pointed out, allow users to add different probes to represent a particular region more strongly, “and I am not sure if the same flexibility is afforded by every method,” he said.
NimbleGen, which like 454 is owned by Roche, is currently developing the capture technology and optimizing it for 454 sequencing. “We are collaborating closely with 454,” Tom Albert, senior director of research and development at NimbleGen, wrote in an e-mail message to In Sequence. 454’s long reads have “synergies” with “the conditions required for optimum enrichment performance,” he said.
At last week’s Genomes, Medicine and the Environment conference in San Diego, Michael Egholm, 454’s vice president of research and development, said that NimbleGen can currently represent the entire human exome on seven of its arrays, and that the company plans to soon reduce this to two arrays with higher density. 454, for its part, is planning to improve the output of its sequencing platform (see related article in this issue).
According to Gibbs, it currently takes about seven runs on 454’s FLX instrument to sequence all the human exons captured on NimbleGen’s seven arrays at a total cost of approximately $100,000, most of which is associated with sequencing.
According to Albert, NimbleGen plans to provide “commercial products and services” for genome enrichment in early 2008.
Eventually, Gibbs wants to use the approach for large-scale human exon sequencing, he said.
“Our dream is to create knowledge of all the exonic variants in a range of human populations, which might require more than 1,000, perhaps 2,000 different individual samples to be processed,” he said.
“Everybody is of like mind; we want to get a lot more genomes sequenced,” Gibbs said, mentioning ongoing discussions on the best way to organize such a project. “Those discussions have not truly matured yet. But this technology is going to impact on those discussions, that’s for sure,” he said.
NimbleGen & Resequencing Arrays
Michael Zwick and his team from Emory University School of Medicine used an approach similar to Gibbs’ in capturing DNA on NimbleGen custom arrays, but sequenced a smaller region of the genome using Affymetrix and NimbleGen resequencing arrays.
However, they are planning to couple the enrichment method with next-generation sequencing, in particular Illumina’s Genome Analyzer, which Zwick’s lab received a few weeks ago.
Also, the two enrichment methods differ in “some of the early steps,” including the efficiency of the initial ligation, according to Zwick.
Zwick said that the cost for a commercial 384,000-feature NimbleGen array is on the order of $500. He and his colleagues use the arrays “at least twice,” stripping the bound DNA off after the first experiment.
Their method is not proprietary; “anyone can buy the arrays and do this experiment,” he pointed out, at least as long as NimbleGen continues to sell oligonucleotide custom arrays.
He and two colleagues are in the process of founding a company, called PeachTree Genetics, that plans to provide clinical diagnostic testing of disease genes that is currently not available, “largely because the genes are large and difficult to sequence,” Zwick said. Initially, the researchers have focused on developing high-density CGH and resequencing arrays to analyze dystrophin, which is mutated in Duchenne muscular dystrophy. The company will aim to “harness [microarray-based genomic selection] to make sequence-based diagnostics cheaper, faster, and more broadly available,” he said.
At Emory, Zwick and his colleagues plan to use their method and Illumina’s sequencing platform to sequence all the genes on the X chromosome. They just received a $3 million grant from the Simons Foundation for autism research, which includes this project (see In Sequence 10/9/2007).
“For the experiments we are trying to do, we think [Illumina] may be better right now,” Zwick said. “But we are certainly interested in pursuing sequencing with 454 also, because we think a mix of the two technologies may be the best way to go forward.”
Emory does not have a 454 sequencer at the moment, he said, but he is interested in building a sequencing core facility, starting with his Illumina sequencer. He said Emory recently purchased a 1,000-node high-performance computing cluster that his lab can access, “so a lot of the computing infrastructure is in place to support these next-generation tools.”
Agilent Oligos & Illumina
George Church at Harvard and his colleagues published the third method, showing that they can use it to capture and amplify approximately 10,000 exons in a single reaction and sequence them on Illumina’s Genome Analyzer.
Still, this level of multiplexing is “a fraction of where we want to be,” Jay Shendure, an assistant professor at the University of Washington and former postdoc in the Church lab, and a senior author of the study, told In Sequence last week.
He said that one strength of his “genome partitioning” approach, which targets exons with molecular inversion probe-like oligos that have been released from Agilent microarrays, is the fact that it precisely specifies the boundaries of the target.
Also, the different steps, which include hybridization, extension, and ligation, lead to a high degree of specificity.
The uniformity of the method, however, still needs to improve and is currently worse than that of the microarray-based approaches, Shendure said. In fact, only 10,000 of the 55,000 exons the researchers targeted in their report were detectable by sequencing.
Agilent, which Church chose for the high quality of its oligonucleotides, currently provides oligo libraries made on arrays only to early-access customers “who are actively engaged in application development,” Wilson Woo, director of strategic programs for Agilent’s genomics business, wrote in an e-mail message to In Sequence this week.
The price per library, which ranges from “a few hundred dollars” to over $20,000, depends on two factors: the number of unique oligos per library, ranging from 500 to 55,000; and the oligo length, which can range from 50 to 200 bases.
Such an oligo library represents “a renewable source” that can be used to amplify exons in a large number of samples, according to Shendure.
“The next-gen sequencing is an important market opportunity for Agilent Oligo Libraries,” Woo said, though the company has no timeline yet for commercializing them more broadly. Partnering with a specific next-gen sequencing instrument vendor is “one of the options we are considering,” he added. For now, Agilent is continuing to work with Shendure to improve the method.
For Shendure, the main goal is to improve the uniform representation of targets. To that end, he will also explore hybridization-based methods that use the same Agilent oligos and work in solution instead of on an array. Uniformity is critical, he said, because “if you need to sequence 10 times more to get the [targets] that are less abundant, that kind of defeats the purpose,” which is to reduce the amount of sequencing.
Both specificity and uniformity are important, he said, and “ultimately, anything you can do to improve either of those parameters is going to give you a reduction in the amount of sequencing that you have to do later.”
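Shendure's point about specificity and uniformity can be illustrated with a rough back-of-the-envelope calculation. The Python sketch below is not taken from any of the three papers; the function name, its parameters, and the example figures are hypothetical and chosen only for illustration. It assumes that specificity can be summarized as the fraction of reads that land on target, and uniformity as the abundance of the rarest target relative to the average target.

```python
def required_reads(n_targets, min_reads_per_target,
                   on_target_fraction, min_relative_abundance):
    """Rough estimate of the total reads needed so that even the
    least-represented target receives `min_reads_per_target` reads.

    All parameters are illustrative assumptions, not values from the papers:
      on_target_fraction      -- share of all reads that hit any target
                                 (a stand-in for specificity), in (0, 1]
      min_relative_abundance  -- abundance of the rarest target relative to
                                 the average target (a stand-in for
                                 uniformity), in (0, 1]; 1.0 is perfectly even
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    if not 0 < min_relative_abundance <= 1:
        raise ValueError("min_relative_abundance must be in (0, 1]")

    # The average target must be sequenced deeply enough that the rarest
    # target still reaches the desired minimum.
    mean_reads_per_target = min_reads_per_target / min_relative_abundance
    on_target_reads = mean_reads_per_target * n_targets
    # Off-target reads are wasted, so scale up by the on-target fraction.
    return on_target_reads / on_target_fraction


# Hypothetical scenario: 10,000 targets, at least 20 reads on each.
ideal = required_reads(10_000, 20, 1.0, 1.0)
uneven = required_reads(10_000, 20, 1.0, 0.1)   # rarest target at 1/10 of average
leaky = required_reads(10_000, 20, 0.5, 1.0)    # half of all reads off target

print(f"uneven / ideal: {uneven / ideal:.1f}x")
print(f"leaky / ideal:  {leaky / ideal:.1f}x")
```

Under these assumptions, a rarest target present at one-tenth the average abundance raises the sequencing requirement roughly tenfold, the scenario Shendure describes, while halving the on-target fraction doubles it.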
According to Shendure, Church eventually plans to use the method to sequence all exons in participants of his Personal Genome Project (see In Sequence 7/31/2007). In addition, Shendure wants to apply it to various biological projects, some “exon-focused,” others “regionally focused.”
But the three groups are not the only ones who are working on exon or genome-region amplification methods. Earlier this year, a team from Stanford University’s Genome Technology Center showed that using a method based on the so-called selector technology, they could amplify 170 exons in parallel (see In Sequence 5/22/2007).
RainDance Technologies recently won a grant to develop an exon sequencing sample-prep method that uses micron-size droplets (see In Sequence 7/17/2007).
German microarray platform maker Febit Biotech said this summer that it is working with several partners to develop its platform for selective sequencing on Illumina’s Genome Analyzer, and expects to launch its first products next year (see In Sequence 7/24/2007).
Earlier this month, Perlegen said it is using its proprietary sample-prep and -amplification technologies to select specific regions of the genome for sequencing in a collaboration with 454 (see In Sequence 10/9/2007).
Though these and other capture and enrichment methods clearly have their place now, they may lose their impact as the cost of sequencing drops. “Eventually, if sequencing a whole genome becomes sufficiently cheap, then you probably would try to sequence the whole genome, and you would not need enrichment,” Zwick said. “But we are still some years away from that being a reality.”