By Monica Heger
Using next-generation sequencing to perform human leukocyte antigen typing can be cost-effective and accurate, and can enable sequencing of many individuals at once, according to researchers at the Broad Institute. Writing in BMC Genomics last week, they described two slightly different protocols for sequencing the class I loci of HLA on Roche's 454 GS FLX.
The protocols use molecular barcodes, which enabled the researchers to pool 95 samples on a 96-well plate (with one empty well as a control), bringing the cost to around $40 per sample, including the cost of reagents, labor, and equipment.
In addition to the two sequencing protocols, the researchers also designed an HLA-calling algorithm. Paul de Bakker, who led the research, told In Sequence that eventually the team plans to develop software that could call HLA types from any sequencing data in which the HLA genes are covered, such as whole-genome or whole-exome data.
The researchers are now using their protocol to sequence the HLA regions of 3,000 HIV-positive individuals in order to figure out why some individuals are naturally able to control the replication of the virus even without anti-retrovirals.
"From a medical standpoint, it's important to build the kinds of tools that enable us to study this part of the genome," said de Bakker. "This region has the highest number of associations to human disease."
Jian Han, whose company iRepertoire focuses on sequencing the T-cell and B-cell receptors, said that he was glad to see more immune-related sequencing applications. "It's the most polymorphic region of the genome, and is also related with the environment," he said. "It gives us so much information about disease."
Other firms, such as Adaptive TCR and Sequenta (formerly MLC Dx), are also looking to commercialize immune repertoire-sequencing and -analysis services (IS 8/31/2010).
Sequencing the HLA region is difficult, though, because it is extremely variable, so it's hard to design primers that are able to capture those genes. While the Broad researchers' protocol targets two exons — 2 and 3 — from HLA-A, HLA-B, and HLA-C, there are "other genes a stone's throw away from these genes, and it is even trickier to build primers for those," de Bakker said.
Both protocols make use of molecular barcodes so that samples can be pooled and multiplexed. The main difference between the two is the point in the process where the barcodes are added. The first protocol, termed "library construction-based barcoding," is good for a smaller number of samples, said Rachel Erlich, lead author of the paper. In that protocol, molecular barcodes are added to the 454 adapters, so only six specific primer pairs are needed to sequence exons 2 and 3 of HLA-A, HLA-B, and HLA-C.
"This is good for a lab with a smaller project because you're just buying six primer pairs, so it's very inexpensive, and [has] very little upfront cost," Erlich said.
For example, if a researcher had 10 different samples, instead of having to do 60 separate PCR reactions, he could pool by sample in order to do only 10, since each gene from the specific exon would be barcoded.
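Pooling of this kind only works if each read can later be traced back to its sample by its barcode. A minimal sketch of that demultiplexing step follows. It is not the Broad team's software: the barcode sequences, sample names, fixed barcode length, and the function name demultiplex are all placeholders chosen for illustration, and it assumes an exact-match barcode at the start of every read.

```python
from collections import defaultdict

# Arbitrary placeholder barcodes and sample names (assumptions, not from the paper).
BARCODE_TO_SAMPLE = {
    "AACCGGTTAC": "sample_01",
    "CCGGTTAACG": "sample_02",
    "GGTTAACCGA": "sample_03",
}
BARCODE_LENGTH = 10  # assumed fixed barcode length


def demultiplex(reads):
    """Group reads by the barcode at their start, trimming matched barcodes."""
    by_sample = defaultdict(list)
    for read in reads:
        barcode = read[:BARCODE_LENGTH]
        insert = read[BARCODE_LENGTH:]
        sample = BARCODE_TO_SAMPLE.get(barcode)
        if sample is None:
            # Keep unmatched reads whole so nothing is silently discarded.
            by_sample["unassigned"].append(read)
        else:
            by_sample[sample].append(insert)
    return dict(by_sample)


# Toy pooled run: two reads from sample_01, one from sample_02, one unknown.
pooled_reads = [
    "AACCGGTTAC" + "GCTCCCACTCCATGAGG",
    "CCGGTTAACG" + "GGGCCGGAGTATTGGGA",
    "AACCGGTTAC" + "GCTCCCACTCCATGAGG",
    "TTTTTTTTTT" + "GCTCCCACTCCATGAGG",
]
assignments = demultiplex(pooled_reads)
```

A production pipeline would also have to tolerate sequencing errors within the barcode itself, which this exact-match lookup does not attempt.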
In the second protocol, dubbed "PCR-based barcoding," molecular barcodes are added to both the forward and reverse exon-specific PCR primers.
So, for example, a researcher with 96 samples and six amplicons from each sample would have to do more than 500 library constructions without barcodes. But because of the barcodes, the amplicons can be pooled following the PCR step, and the researcher can do a single library construction, Erlich said.
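The arithmetic behind these comparisons can be sketched in a few lines. The scheme labels "none", "library", and "pcr" and the function name library_constructions are chosen here for illustration. The sketch assumes one library per amplicon when no barcodes are used, one library per sample when the barcodes sit in the adapters, and a single pooled library when the barcodes sit in the PCR primers.

```python
def library_constructions(n_samples, amplicons_per_sample, scheme):
    """Count library constructions needed under a given barcoding scheme."""
    if scheme == "none":
        # No barcodes: every amplicon needs its own library.
        return n_samples * amplicons_per_sample
    if scheme == "library":
        # Barcoded adapters: one library per sample.
        return n_samples
    if scheme == "pcr":
        # Barcoded PCR primers: all amplicons pooled into one library.
        return 1
    raise ValueError(f"unknown scheme: {scheme!r}")


for scheme in ("none", "library", "pcr"):
    print(scheme, library_constructions(96, 6, scheme))
```

For 96 samples with six amplicons each, this gives 576, 96, and 1, respectively.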
"The big advantage is you're putting molecular coding on at the PCR step, and you can pool all your amplicons at once," she said. "So you do a single library construction, whereas for the other, you'd have to do one library construction per sample."
The team first tested the library construction-based protocol on 270 HapMap samples from four populations with known HLA types, and achieved 96.4 percent accuracy at a 4-digit resolution, the standard resolution for resolving HLA types.
They generated 7.4 million base pairs of sequence data in 203,108 reads, with an average read length of 364 base pairs and an average coverage of 103 reads per exon. When they compared their results to the HapMap HLA types, which were generated by sequence-specific oligonucleotide hybridization, a standard method of HLA typing, they found that 93.9 percent of the calls were concordant. Of the 94 discordant cases, 38 were errors in the HapMap dataset, 49 were due to low coverage, and seven were due to sample contamination.
The accuracy was "quite high," said Erlich, noting that the method found mistakes in the HapMap database.
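One way to read the 93.9 percent concordance alongside the 96.4 percent accuracy is that discordances traced to HapMap errors do not count against the sequencing method. The sketch below works through that reading. It is an interpretation, not something stated in the article: the total number of calls is not reported here, so it is estimated from the concordance rate, and the variable names are illustrative.

```python
# Figures reported for the library construction-based protocol.
concordance = 0.939
discordant = 94
hapmap_errors = 38
low_coverage = 49
contamination = 7

# The three reported causes account for every discordant call.
assert hapmap_errors + low_coverage + contamination == discordant

# Assumption: the 94 discordant calls are the remaining 6.1 percent of all calls.
estimated_total_calls = discordant / (1 - concordance)

# Assumption: only discordances not caused by HapMap errors count against the method.
sequencing_errors = discordant - hapmap_errors
estimated_accuracy = 1 - sequencing_errors / estimated_total_calls

print(f"estimated total calls: {estimated_total_calls:.0f}")
print(f"estimated accuracy: {estimated_accuracy:.1%}")
```

Under these assumptions, the estimate comes to roughly 1,541 calls and an accuracy of about 96.4 percent, which matches the reported figure.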
They then tested their second method on 95 HapMap samples, which allowed them to process all the samples on one 96-well plate with one empty well as a control. They achieved 98.6 percent accuracy at a 4-digit resolution. The difference in accuracy between the two methods may stem from the more even coverage across the different amplicons that the team achieved with the PCR-based barcoding method, Erlich said. But, she said, the results were pretty similar.
The team chose to use the 454 platform because of its longer read lengths. "One of the advantages for doing HLA on 454 is that, conveniently, the exons are all just shorter than the lengths of a 454 read, so you can get phase across an entire exon," Erlich said. Recognizing the advantages of 454's long read lengths, Roche itself has been looking into the use of its platform for HLA typing for several years (IS 3/17/2009).
De Bakker agreed that the long read lengths of the 454 platform made it amenable to the approach. "Given the particulars of the task, I'm not sure it would have worked as well" on another platform, he said.
He added that so far they have gotten good results from the protocol on their HIV samples, and are about halfway through the sequencing of the 3,000 HIV-positive individuals. All of the individuals are enrolled in clinical trials, he said.
Aside from HIV, Erlich said the method could be useful for studying other types of infectious disease, autoimmune diseases, cancer, and even psychiatric diseases. "I foresee a big need for [HLA sequencing] in the future," she said.
"Being able to do these studies and understand how our immune system interacts with its environment, and being able to do it cheaper and more efficiently, is going to open up the possibility of doing HLA typing to more medical researchers and that will give us an additional piece of information in trying to figure out how to combat these diseases," she said.