Researchers at the Wellcome Trust Sanger Institute have developed an amplification-free library preparation protocol for the Illumina Genome Analyzer that reduces coverage bias and helps improve the mapping and assembly of certain GC-biased genomes.
The scientists described their new method in a paper published online on Sunday in Nature Methods. Although the authors note that the protocol particularly benefits the sequencing of AT-rich genomes, they say that it is also quicker than the standard method and "should be used routinely to prepare libraries for Genome Analyzer sequencing."
The standard Illumina library preparation protocol includes a PCR-amplification step before the sample is loaded onto the flowcell. This step enriches for DNA fragments that have adaptors ligated to both ends and generates sufficient DNA to be accurately quantified.
However, the PCR can lead to uneven representation of different parts of a genome, in particular in the case of an AT-rich genome, such as the malaria pathogen Plasmodium falciparum. Even when performed with a low number of cycles, the PCR step is "still a source of duplicate sequences, amplification bias, and struggles with (A+T)-rich base compositions," the authors note.
To get around this, the scientists modified the library protocol to eliminate the PCR-amplification step. They achieved this by ligating onto the DNA fragments adaptors that allow the fragments to hybridize to the flowcell and also allow sequencing primers to hybridize.
Only fragments with adaptors successfully ligated to both ends can grow into clusters on the flowcell, so no enrichment of such fragments by PCR is required, they note. Also, since they use quantitative PCR to quantify the adaptor-ligated fragments, they no longer need to amplify the template in order to quantify it accurately.
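As a rough illustration of how quantitative PCR can stand in for amplification-based quantification, the sketch below fits a generic standard curve (threshold cycle against the log of a known concentration) and reads an unknown library off it. This is a common textbook approach and not necessarily the one used in the paper; the function names, the dilution series, and all Ct values are hypothetical.

```python
import math

def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_concentration(ct, slope, intercept):
    """Invert the standard curve to get a concentration from a measured Ct."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical 10-fold dilution series of a control template (arbitrary units)
# with made-up Ct values; real standards would come from the lab's own assay.
standards_conc = [100.0, 10.0, 1.0, 0.1]
standards_ct = [10.0, 13.3, 16.6, 19.9]

slope, intercept = fit_standard_curve(standards_conc, standards_ct)

# Hypothetical Ct measured for an unamplified, adaptor-ligated library.
library_conc = estimate_concentration(15.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, library={library_conc:.2f}")
```

Because the qPCR primers in such an assay would target the adaptor sequences, only fragments carrying adaptors contribute to the signal, which is why the measurement can be made without first amplifying the library.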
The scientists then showed that their method leads to more even sequence coverage for an AT-rich malaria genome, facilitating both SNP calling and de novo assembly with a short-read assembler.
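One simple way to put a number on "more even" coverage is the coefficient of variation of read depth across fixed-size windows of the genome, optionally alongside each window's GC fraction. The sketch below is only an illustration of that idea: the metric is an assumption rather than the one reported in the paper, and the depth values are invented for the example, not taken from any dataset.

```python
from statistics import mean, pstdev

def gc_fraction(seq):
    """Fraction of G and C bases in a sequence window (0.0 for an empty window)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)

def coverage_cv(depths):
    """Coefficient of variation of per-window depth; lower means more even coverage."""
    m = mean(depths)
    return pstdev(depths) / m if m else float("inf")

# Invented per-window read depths for two hypothetical libraries
# sequenced to the same mean depth.
amplified_depths = [2, 5, 30, 40, 8, 35]
pcr_free_depths = [18, 20, 22, 21, 19, 20]

print("GC of an AT-rich window:", gc_fraction("ATATTAATGCAT"))
print("CV with PCR step:   ", round(coverage_cv(amplified_depths), 3))
print("CV without PCR step:", round(coverage_cv(pcr_free_depths), 3))
```

In a real analysis the depths would be computed from aligned reads, and windows with very low depth are the ones most likely to hamper SNP calling and leave gaps in a de novo assembly.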
"The main benefit is to genomes that have extremely AT-rich genomes, malaria species being the best-known examples," Dan Turner, head of sequencing development at the Sanger Institute and the senior author of the paper, told In Sequence by e-mail last week.
But the approach might be more generally useful. Unlike the existing library protocol, the no-PCR method "does not diminish the complexity of sequencing libraries," Turner said, and the Sanger Institute is now also using it "on genomes with more neutral base compositions."
Illumina has mentioned in presentations that it intends to support a PCR-free library prep (see In Sequence 2/10/2009), but the company did not provide information before deadline about when it will start doing so, and whether it plans to support the Sanger Institute's protocol or has developed its own version of it.
The Broad Institute is one genome center that is currently testing a version of the Sanger protocol. Based on preliminary results involving qPCR but not yet sequencing, "we are cautiously optimistic," Chad Nusbaum, co-director of the genome sequencing and analysis program at the Broad, told In Sequence by e-mail last week. The results suggest that the protocol indeed leads to more even representation of genomes with extreme GC content, and could thus help cover "regions that would either be underrepresented or missed entirely by the standard process," he said.
And if it turns out to "smooth out" coverage across the GC spectrum of any genome, it could also reduce the coverage needed in genome sequencing projects, and thus lower the cost of such projects.
Another Illumina GA customer who wants to adopt the new method is Vladimir Benes, head of the genomics core facility at the European Molecular Biology Laboratory in Heidelberg.
"I think it's a very important modification of the protocol," he said. "I see it as an important way to reduce the bias of coverage."
Benes told In Sequence last week that he has already ordered the adaptor sequences and plans to use the new protocol as soon as possible, both to sequence a bacterial genome with extreme GC content and for other genomic DNA preparations.
A "slight limitation" of the amplification-free method, he said, is the amount of starting DNA required, which might be higher than what is needed for the conventional protocol, a reservation that he shares with the Broad's Nusbaum. "At the moment, it's really for genomic DNA-sequencing of any kind, where you have enough material [available]," he said, adding that the protocol may be further improved.
According to Benes, another source of bias during library preparation is the shearing of DNA by nebulization. One potential remedy could be a different fragmentation method, for example a Covaris device, which uses ultrasound. That method "is much closer to truly random fragmentation," he said. Illumina mentions Covaris as an alternative method for fragmenting DNA in a protocol for preparing 2- to 5-kilobase mate pair libraries.
Benes said that the amplification-induced bias during the library prep is "well known" and "hotly debated" among users.
Last year, for example, researchers at the Max Planck Institute for Molecular Genetics in Berlin published a paper in Nucleic Acids Research that addresses different types of biases in Illumina sequencing data, including an unusually high read density in GC-rich regions.
Although the problem is "acknowledged by Illumina," the company has "not openly admitted" it, Benes said.
James Hadfield, head of the genomics core facility at the Cambridge Research Institute in the UK, which is about to install its third Genome Analyzer, said he believes the new method "could become the standard method [for library preparation] and replace what people are currently doing."
He has not yet tried the new protocol but plans to do so. It "not only removes bias, but could also simplify and speed up the protocols, ultimately leading to 96-well preps," he told In Sequence by e-mail last week.
Another improvement that would help automate the sample prep is eliminating gel purifications, he added.