NEW YORK (GenomeWeb News) - Applied Biosystems last week disclosed details of its Agencourt Personal Genomics next-generation sequencing platform to the scientific community at a meeting in Cambridge, Mass.
ABI expects that by next summer the technology, called “Supported Oligonucleotide Ligation and Detection,” or SOLiD, will generate up to 500 million bases of sequence data per run, “but we can see the platform getting north of 10 gigabases” per run in the future, Kevin McKernan, senior director of scientific operations at ABI’s high-throughput discovery unit in Beverly, Mass., formerly APG, told GenomeWeb News.
ABI reiterated plans to place early-access systems by mid-2007, followed by a full release with “fully supported workflows,” according to Kevin Corcoran, the company’s vice president and general manager of genetic analysis, who also gave a presentation at the meeting.
The audience was all ears when Gina Costa, director of R&D projects in Beverly, explained how the technology works, how the company has developed it over the last eight months, and how it has performed in a cancer resequencing project.
Her talk, part of a well-attended next-gen sequencing session at last week’s Cancer Genomics and Emerging Technologies meeting, followed presentations by Helicos BioSciences and George Church from Harvard University, and preceded talks by 454 Life Sciences and Solexa.
ABI’s Agencourt technology differs from its rivals mainly in that it uses reversible terminating ligation rather than sequencing-by-synthesis to read the DNA, though it bears some similarities to its competitors.
Like 454’s Genome Sequencer 20, the technology uses emulsion PCR to amplify the sample. And like Solexa’s 1G Genome Analyzer, it performs the sequencing reactions on a glass slide array. At the moment, the method reads two bases per step and generates 20-base reads, but 25-base reads are “just around the corner,” according to McKernan. The commercial instrument, expected to launch next year, will run two arrays in parallel with two separate flow cells, microscopes, and cameras included.
To prepare genomic DNA for the instrument, ABI scientists create either a fragment library of sheared DNA or a mate-paired library for paired-end sequencing. The company holds intellectual property “related to making paired libraries,” according to McKernan, but “sees a lot of room for collaboration and open source methods” in that area.
The researchers then attach the DNA fragments to micrometer-sized beads and amplify them by emulsion PCR, a step similar to 454’s sample preparation method.
Since the majority of the beads do not carry PCR products, the researchers next use an enrichment step to select beads coated with DNA. Finally, they spread and immobilize the biotin-coated beads on a glass slide covered with streptavidin.
According to Costa, the company initially used gel-coated slides but switched to non-gel slides in the spring to remove gel-related background noise, and to allow users to easily spot multiple samples onto an array.
Each of the two arrays that will appear on the upcoming commercial version will be divided into 1,800 panels. Costa said the scientists have so far only used as many as 1,000 panels on one array. Solexa, by comparison, uses one array divided into eight channels.
The density of the beads could vary between 20,000 and 180,000 per panel, according to Costa. Assuming read lengths of up to 40 bases, for which the company has proof of principle, the platform could yield, in theory, more than ten gigabases of sequence data per run, according to McKernan.
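The arithmetic behind that theoretical ceiling can be sketched from the figures quoted above: two arrays, 1,800 panels per array, 20,000 to 180,000 beads per panel, and 40-base reads. The function name and the assumption that every panel is used and every bead yields one full-length read are illustrative, not ABI's own calculation.

```python
# Back-of-envelope throughput estimate from the figures quoted in the article.
ARRAYS_PER_RUN = 2        # the commercial instrument runs two arrays in parallel
PANELS_PER_ARRAY = 1_800  # panels per array, per Costa


def bases_per_run(beads_per_panel: int, read_length: int,
                  arrays: int = ARRAYS_PER_RUN,
                  panels: int = PANELS_PER_ARRAY) -> int:
    """Bases per run, assuming every panel is used and every bead
    yields exactly one full-length read (an idealized upper bound)."""
    return arrays * panels * beads_per_panel * read_length


low = bases_per_run(20_000, 40)    # sparsest bead density quoted
high = bases_per_run(180_000, 40)  # densest bead density quoted
print(f"{low / 1e9:.2f} to {high / 1e9:.2f} gigabases per run")
# prints "2.88 to 25.92 gigabases per run"
```

Under these idealized assumptions, only the denser end of the bead range clears ten gigabases, which is consistent with McKernan describing the figure as a theoretical one.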
However, he said, “that’s not happening next year. That’s happening sometime in the product’s continuum.” Rather, the early-access instruments, to be available next summer, will initially generate only 500 megabases per run. “Much like the [capillary electrophoresis] instrument line, ABI is designing several generations of the instruments, which will eventually be able to accomplish the gigabase throughput the technology is capable of,” he told GenomeWeb News.
By comparison, Solexa said its instrument will generate up to 1 billion bases of sequence data by the end of this year, based on up to 5 million 25-base reads in each of the 8 channels on its slide. 454’s instrument, meantime, currently generates 20 million bases of sequence information per run and the company plans to offer 100 million bases sometime next year.
ABI scientists next put the array into a flow cell and add a sequencing primer and a mix of 8-mer oligonucleotide probes labeled with four different fluorescent dyes. Each probe interrogates two bases, at positions 4 and 5. Only the correct probe is ligated to the sequencing primer and its signal, which identifies bases 4 and 5, is recorded by a camera.
Finally, the probe is cleaved between its 5th and 6th base at a phosphorothiolate linker, creating a site for the next round of probes to hybridize. These identify bases 9 and 10 of the template.
This cycle is performed several times, each time reading two bases with a gap of four in between. To sequence the bases in the gaps, all probes and primers are stripped off and a new round of sequencing is started using a sequencing primer that is one base shorter than the first one. After five rounds of sequencing with primers of different lengths, 25 bases in a row are identified.
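The interleaving of rounds is easier to see when the positions are listed explicitly. The following is a minimal sketch, assuming five ligation cycles per round (the article says only “several”) and numbering template positions from 1; position 0 in this numbering would be the base just before the template, presumably part of the known adapter. The function name is hypothetical.

```python
def interrogated_positions(primer_offset: int, cycles: int = 5) -> list[tuple[int, int]]:
    """Template positions read in one round of ligation.

    primer_offset=0 is the first primer, which reads positions 4 and 5
    in its first cycle. Each later round uses a primer one base shorter,
    shifting every read by one position toward the primer.
    """
    pairs = []
    for cycle in range(cycles):
        first = 4 + 5 * cycle - primer_offset  # each cycle advances five bases
        pairs.append((first, first + 1))
    return pairs


# Tally how often each position is read across five rounds.
coverage: dict[int, int] = {}
for offset in range(5):
    for pair in interrogated_positions(offset):
        for pos in pair:
            coverage[pos] = coverage.get(pos, 0) + 1

print(interrogated_positions(0))  # first round: (4, 5), (9, 10), ... (24, 25)
print(sorted(coverage))           # positions 0 through 25
```

In this sketch every position from 1 to 24 is read twice, once in each of two neighboring rounds, while the end positions 0 and 25 are read once.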
At the moment, the system can produce 20-base reads, but the researchers are working on improving this. According to Costa, read length is limited by an attrition of 10 percent to 20 percent per cycle. She said ABI is currently trying to increase the amount of template on the bead surface in order to reach 40-base reads.
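To see why attrition caps read length, consider a rough model in which the same fraction of templates drops out at every ligation cycle, so the losses compound. This is an assumption for illustration; the article does not say how ABI measures attrition. The cycle counts below (four cycles per round for 20-base reads, eight for 40-base reads) also assume five bases of progress per cycle.

```python
def signal_remaining(attrition: float, cycles: int) -> float:
    """Fraction of templates still extending after `cycles` ligation cycles,
    assuming a constant, compounding per-cycle loss (a simplification)."""
    return (1.0 - attrition) ** cycles


for attrition in (0.10, 0.20):
    for cycles in (4, 8):
        remaining = signal_remaining(attrition, cycles)
        print(f"attrition {attrition:.0%}, {cycles} cycles: {remaining:.1%} remaining")
```

Under this model, doubling the number of cycles squares the surviving fraction, which suggests why loading more template onto each bead would help longer reads stay above background.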
The technology has a built-in error-checking method that helps make it highly accurate, Costa claimed. “We have not found a systematic error associated with homopolymers or dinucleotide repeats,” she said. The error-checking method, termed “Two Base Encoding,” relies on the fact that each base is in fact read twice, McKernan explained. To distinguish a sequencing error from an actual SNP, the scientists look at a base’s immediate neighbors, or junctions. “In order for you to believe a base is in fact a SNP, you must see both junctions change,” McKernan said. “Most errors in these massively parallel sequencing systems are in fact single changes, and we can eliminate these [as] measurement errors [if] both junctions have not confirmed the change.”
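The junction rule McKernan describes can be illustrated with a small sketch. The article does not give the mapping from base pairs to the four dyes, so the code assumes one scheme that fits a four-color system: each base gets a two-bit value and each junction's color is the XOR of its two bases. All function names here are hypothetical.

```python
# Assumed two-bit values for the bases; the article does not specify the mapping.
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq: str) -> list[int]:
    """One color (0-3) per junction between adjacent bases."""
    return [BASE_BITS[a] ^ BASE_BITS[b] for a, b in zip(seq, seq[1:])]


def classify(ref_colors: list[int], read_colors: list[int]) -> str:
    """Apply the rule that a SNP must change both of its junctions."""
    diffs = [i for i, (r, s) in enumerate(zip(ref_colors, read_colors)) if r != s]
    if not diffs:
        return "match"
    if len(diffs) == 1:
        return "likely measurement error"  # only one junction changed
    if len(diffs) == 2 and diffs[1] == diffs[0] + 1:
        i, j = diffs
        # A single-base change leaves the XOR of its two junction colors intact.
        if ref_colors[i] ^ ref_colors[j] == read_colors[i] ^ read_colors[j]:
            return "candidate SNP"
    return "unresolved"


ref = encode("ACGTACGT")
snp_read = encode("ACGAACGT")          # a true T-to-A change at one base
miscall = ref[:4] + [2] + ref[5:]      # one junction color misread

print(classify(ref, snp_read))  # candidate SNP
print(classify(ref, miscall))   # likely measurement error
```

A real base change alters the two junctions that flank it, whereas a single misread color alters only one, which is the distinction the error check relies on.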
Since the beginning of the year, the system’s performance has steadily improved, Costa said. While it produced 129 megabases of E. coli sequence data from paired 15-base reads per run in February, it generated 57 megabases of sequence from a human X chromosome-derived BAC, containing highly repetitive regions, in a single run in April. And last month, it churned out 350 megabases of E. coli sequence with 20-base forward and 15-base reverse reads in a run.
Extrapolating from that, she said, the commercial instrument could theoretically generate up to 1.6 gigabases per run – if all panels from both arrays were used and if two 20-base reads, one from each end, were obtained from each target DNA. The early-access instruments, though, will deliver 500 megabases, according to McKernan.
The ABI scientists have also compared the APG technology to conventional Sanger sequencing in a cancer resequencing project in collaboration with Victor Velculescu’s group at Johns Hopkins University. In that project, Costa reported, they sequenced 1,149 amplicons from 124 genes and were able to detect 170 out of 180 SNPs that Sanger sequencing found. The APG technology also discovered a number of additional SNPs that Sanger sequencing missed but that still need to be confirmed by other methods.
When can other researchers get their hands on the technology? ABI will start collaborative experiments “in the very, very near future,” said Corcoran, the genetic analysis vice president, adding that the company plans to place early-access discovery systems by mid-2007 followed by a full release with “fully supported workflows.”
But the Agencourt technology is not ABI’s only answer to next-generation sequencing, he said. For example, the company has a stake in VisiGen Biotechnologies, as well as an internal research program on sequencing with very long reads.
“You will hear from AB additional announcements about nanopore sequencing and other ways of sequencing with long reads,” Corcoran said. These technologies, which will tackle the so-called “$1,000 genome,” are “probably seven to 10 years out,” he predicted.
In the meantime, he said, “We believe that Sanger capillary electrophoresis is a platform that has a bright future” and is “not being replaced overnight.”
Julia Karow covers the next-generation genome-sequencing market for GenomeWeb News.