Following last year’s publication of several capture methods for selecting portions of the genome for sequencing, companies have started to commercialize some of these technologies, while academic researchers continue to develop and refine their approaches.
Last week, Roche NimbleGen became the first vendor to launch a sequence-capture service, and plans to start selling capture arrays and reagents this summer.
Meanwhile, research groups at the University of Washington and at the Broad Institute have developed different genome-selection methods using oligonucleotide libraries from Agilent Technologies, and Agilent plans to release several products for “genomic partitioning” later this year that will be based on these libraries.
Febit Biomed is currently offering a capture service, called HybSelect, to early-access customers and plans to launch the service widely later this year.
Under Roche NimbleGen’s new service, researchers can request up to 5 megabases of target DNA to be captured from the human or mouse genome. The target region can be either contiguous or split up into as many as 20,000 small “exon-like” regions, Xinmin Zhang, the company’s senior manager for marketing sequencing products, told In Sequence by e-mail last week. The targets must be from unique regions of the genome, which excludes about 30 percent of the genome that is repetitive.
Roche NimbleGen currently uses 385,000-feature arrays with “long oligo probes” for the service but this summer plans to start using its new high-density HD2 arrays, which carry 2.1 million features. Those arrays will allow researchers to request up to 30 megabases of DNA to be captured — for example, all human exons.
Zhang said that pricing for the service “varies depending on the particular details of the project” but is “dramatically lower” than it would cost to amplify the same regions by PCR.
Later this year, the company also plans to start selling capture arrays, both in 385K and HD2 formats, as well as requisite reagents and instrumentation to researchers for use in their own labs.
The company will offer a catalog array that has been optimized to capture 150,000 human exons, as well as custom arrays according to researchers’ specifications, according to Zhang.
Roche NimbleGen, in collaboration with researchers at Baylor College of Medicine, published its sequence capture method in Nature Methods last fall (see In Sequence 10/17/2007). In that project, approximately 70 percent of the sequence reads hit the target regions.
Since then, according to Zhang, the researchers have worked on several improvements, including adding quality-control probes on the arrays to measure the success of the enrichment. These were chosen “based on their consistent capture efficiency across the full range of captured samples,” Zhang said. Their enrichment, measured by quantitative PCR, serves “as a proxy for the target regions.”
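As a rough illustration of how a quantitative PCR readout can stand in for enrichment, the sketch below converts threshold-cycle (Ct) values measured before and after capture into a fold enrichment. It assumes the textbook exponential-amplification model with perfect doubling per cycle and equal DNA input in both reactions. The function name and the Ct values are hypothetical and do not describe Roche NimbleGen's actual quality-control procedure.

```python
from statistics import geometric_mean


def fold_enrichment(ct_before, ct_after, efficiency=2.0):
    """Estimate fold enrichment of one locus from qPCR Ct values.

    A lower Ct after capture means more template was present, so the
    exponent is ct_before - ct_after. efficiency=2.0 assumes the product
    doubles every cycle (an idealization).
    """
    return efficiency ** (ct_before - ct_after)


# Hypothetical Ct values for three control loci: (before capture, after capture)
control_loci = [(28.0, 19.0), (27.5, 18.5), (28.2, 19.6)]

per_locus = [fold_enrichment(before, after) for before, after in control_loci]

# Fold changes are multiplicative, so a geometric mean is a natural summary.
summary = geometric_mean(per_locus)
print(f"per-locus fold enrichment: {[round(f) for f in per_locus]}")
print(f"geometric mean: {summary:.0f}")
```

Under these assumptions, a nine-cycle shift corresponds to roughly 500-fold enrichment; real assays would also need to correct for primer efficiency.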
The scientists are also working on performance enhancements, for example to increase the uniformity of enrichment across the target regions and to decrease the amount of starting DNA required. “Most of these are currently in development and testing, but will be offered to researchers as soon as they pass our validation for commercial products,” Zhang said.
Uniformity currently falls within a 10-fold range, according to Zhang, but the company has developed “new ways” to make it more even. “This will benefit the downstream sequencing experiments because a reduced number of reads are required to achieve sufficient coverage at the low end.”
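Zhang's point about read counts can be made concrete with a back-of-the-envelope calculation. The sketch below is a simplified model, not the company's: it assumes the least-covered target region receives a fixed fraction of the mean coverage (the hypothetical parameter low_end_fraction), and asks how many reads are needed for that region to reach a minimum depth. The 5-megabase target and 250-base read length are taken from this article. The minimum depth and the low-end fractions are illustrative assumptions.

```python
import math


def reads_required(target_bp, read_len, min_depth,
                   low_end_fraction, on_target_fraction=1.0):
    """Reads needed so the least-covered region reaches min_depth.

    Mean depth = reads * read_len * on_target_fraction / target_bp.
    The least-covered region is assumed to sit at low_end_fraction
    of that mean (a simplifying assumption).
    """
    usable_bases_per_read = read_len * on_target_fraction * low_end_fraction
    return math.ceil(min_depth * target_bp / usable_bases_per_read)


TARGET_BP = 5_000_000   # 5-megabase capture target
READ_LEN = 250          # read length in bases
MIN_DEPTH = 10          # assumed minimum depth for variant calling

uneven = reads_required(TARGET_BP, READ_LEN, MIN_DEPTH, low_end_fraction=0.25)
more_even = reads_required(TARGET_BP, READ_LEN, MIN_DEPTH, low_end_fraction=0.5)
print(uneven, more_even)
```

In this toy model, doubling the relative coverage at the low end halves the number of reads needed, which is the effect Zhang describes.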
Scientists currently submit 21 micrograms of genomic DNA for the service, but company researchers have already reduced this to “a few micrograms, and we believe we can use even less,” Zhang said. The company has also used whole-genome amplified DNA on the arrays, and has “seen very little bias in the capture sample introduced by the WGA process.”
To capture genomic material, the researchers fragment the starting DNA and add linkers to both ends. After the DNA is hybridized to the array and washed, the target sequences are eluted and amplified “if needed.” Researchers receive back 10 micrograms of captured DNA.
Roche NimbleGen has worked closely with 454 Life Sciences to be able to sequence the captured fragments easily on 454’s platform. The 500-base-pair fragments match well with the 250-base-pair reads of the GS FLX, Zhang said, and the long reads can be used for haplotyping and identifying small indels, “which could be difficult for short-read technologies.”
But some early-access customers have developed protocols to use the capture arrays with other sequencing platforms, Zhang said, though Roche NimbleGen has not validated these. Last year, researchers at Cold Spring Harbor Laboratory published a method for using the capture arrays in conjunction with Illumina’s Genome Analyzer (see In Sequence 11/6/2007).
Agilent: Arrays and Oligos
Two research groups have developed and improved their own genome-selection methods using oligonucleotide libraries provided by Agilent Technologies.
At the Advances in Genome Biology and Technology meeting in February, Jay Shendure, an assistant professor at the University of Washington and former postdoc in George Church’s lab at Harvard, talked about improvements in the exon capture method he and his colleagues published in Nature Methods last fall (see In Sequence 10/17/2007).
That method uses modified molecular inversion probes to target exons, and employs Illumina’s Genome Analyzer to sequence the captured DNA.
In their publication, the researchers showed that they were able to capture, amplify, and sequence about 10,000 exons, or about 18 percent, of the 55,000 exons they targeted in a single reaction. Since then, they have increased that proportion to 91 percent, Shendure reported at the meeting. The specificity of the method is high: 98.6 percent of the reads mapped to the targeted regions, he said.
However, he said the uniformity of the method needs to be improved, for example by adjusting the concentrations of different probes.
Sequencing the exons of a HapMap sample on Illumina’s platform, he and his colleagues were able to call variant bases in about half of the 6.7 megabases of target sequence. They observed almost 100 percent concordance with HapMap genotypes at homozygous positions, and 96 percent at heterozygous positions. About 90 percent of the SNPs they identified were already annotated in dbSNP.
Since the meeting, Shendure and his colleagues have also developed an array-based method, using Agilent arrays with 244,000 features and 60-mer probes, which he recently presented at a user’s group meeting of the Department of Energy’s Joint Genome Institute.
Starting with 5 micrograms or less of DNA, they used one array to target all 150,000 exons, or 25 megabases, of the human genome. The tiling density is lower “than what others have done, but [it] still seems to work, given that it is a first pass,” Shendure told In Sequence last week.
They were able to capture and sequence about half the exons with sufficient depth to call SNPs. “The downside relative to [molecular inversion probes] is the lower specificity,” he said, with only about 41 percent of sequence reads mapping to the target region.
Also at the AGBT meeting, Carsten Russ, a research scientist in the genome sequencing and analysis program at the Broad Institute, presented an exon capture method, called Hybrid Selection, which also uses Agilent’s oligo libraries.
That method uses 170-mer biotinylated oligonucleotides, transcribed into RNA baits, to target exons in the genome in solution. After hybridization, the scientists capture the oligos on streptavidin-coated magnetic beads, PCR-amplify the target DNA, and sequence it.
The reason they use RNA oligos, Russ said, is that RNA is single-stranded, so it can be used in large molar excess over the target, enabling them to use less than 1 microgram of starting DNA.
And because they generate the RNA bait by transcription rather than PCR, they avoid PCR errors. Also, RNA hybridizes more strongly to DNA, and large amounts of RNA baits can be made and stored in advance.
In a pilot project, the researchers targeted approximately 15,000 exons with 22,000 baits, which they sequenced both on Illumina’s and 454’s platforms.
They were able to hit about 95 percent of the target bases at least once. Ninety percent of reads were “on or near” target, with 55 percent completely on target. The target coverage is even, with 75 percent of target bases having at least 50 percent of the mean coverage, he said.
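For readers unfamiliar with these metrics, the sketch below shows how two of them, the share of target bases hit at least once and the share with at least half the mean coverage, could be computed from per-base read depths. It is a minimal illustration on made-up numbers. The function name and the toy depth values are hypothetical, and this is not the Broad group's analysis pipeline.

```python
def coverage_summary(depths):
    """Summarize per-base read depth across a set of target bases."""
    n = len(depths)
    mean_depth = sum(depths) / n
    # Fraction of target bases covered by at least one read.
    covered = sum(1 for d in depths if d >= 1) / n
    # Evenness: fraction of bases with at least half the mean depth.
    at_half_mean = sum(1 for d in depths if d >= 0.5 * mean_depth) / n
    return {
        "mean_depth": mean_depth,
        "fraction_covered": covered,
        "fraction_at_half_mean": at_half_mean,
    }


# Toy read depths for ten target bases (illustrative only).
toy_depths = [0, 4, 10, 12, 8, 9, 11, 2, 10, 14]
stats = coverage_summary(toy_depths)
print(stats)
```

In a real experiment the depths would come from aligned reads over millions of target bases, but the definitions are the same.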
The researchers then went on to target approximately 3,100 exons in two tumor samples and matched controls from the National Human Genome Research Institute’s Tumor Sequencing Project, which they sequenced on Illumina’s platform. In this proof-of-principle project, they found that 95 percent of the reads hit their target, and about 82 percent were on or near a target. The researchers found about 170 SNPs per sample, 85 percent of which are in the dbSNP database.
According to their conference abstract, “capture and resequencing of a wider panel of exons in tumor/normal pairs is in progress.”
Russ estimated that the cost for capturing the exomes of 96 individuals using his method is about $25,000, though this is a rough estimate and “on the high side,” according to Chad Nusbaum, co-director of the Broad’s genome sequencing and analysis program.
Agilent, which is providing oligonucleotide libraries to a number of early-access customers, plans to launch “a portfolio of products” based on the libraries throughout this year that address “the growing need for genomic partitioning,” according to a company spokesman. These will include both catalog and custom products, he said, and might also comprise a service. “The program is still taking shape, and we’re not ruling anything out at this point,” the spokesman said.
The libraries are based on Agilent’s microarray-based SurePrint inkjet oligonucleotide manufacturing technology.
Other companies are preparing to enter the “genome partitioning” market as well. For example, Febit Biomed of Heidelberg, Germany, is preparing to launch a selection service, called HybSelect, based on its own microarrays.
Febit Chief Scientific Officer Peer Stähler told In Sequence’s sister publication BioArray News last month that the company is currently offering HybSelect to a number of undisclosed early-access customers and that it will fully launch the service later this year. He declined to reveal pricing information.