By Monica Heger
This article has been modified to indicate that the sequencing method described was co-developed at the University of Utah and Washington University.
Formalin-fixed, paraffin-embedded tissue samples have proven to be a vexing source of clinical specimens for genetic analysis. While abundant samples exist, particularly from tumors, DNA from FFPE tissue tends to be degraded and can be tricky to sequence, compared to DNA from fresh tissue.
Recently, though, a number of labs and commercial vendors have been developing techniques to enable sequencing of these samples.
One of the latest techniques uses a hybrid capture target enrichment approach based on PCR-generated capture probes. It was co-developed by researchers in the department of pathology at the University of Utah and by researchers in the department of pathology and the Genome Institute at Washington University, and was published online last week in the Journal of Molecular Diagnostics. The researchers used the approach to identify the viral integration sites of the Merkel cell polyomavirus in four FFPE cases of Merkel cell carcinoma, including one primary tumor and its subsequent metastatic tumor.
The study marks the first time that a viral integration site has been identified with sequencing from an FFPE tumor. The method could have implications for cancer research in particular, since the majority of tumors, and especially rare tumors, are formalin fixed and paraffin embedded.
"The subject they are addressing is very important: The application of targeted next-generation sequencing to FFPE material," said Michal Schweiger, a researcher in the department of cancer genomics at the Max Planck Institute, who was not affiliated with this work, but who has also developed methods to sequence DNA from FFPE tissue (IS 6/2/2009).
Henry Wood, who sequences FFPE tumors to study copy number variation in Pamela Rabbitts' lab at the Leeds Institute of Molecular Medicine, agreed with Schweiger. Particularly in cancer research, some labs only have access to FFPE samples. "We have access to [FFPE] blocks from our hospital archives so we can study samples that we wouldn't be able to have if we waited for patients to walk in the door," he told In Sequence via e-mail.
Wood added that the main challenge with sequencing FFPE samples is the quality of DNA. "Most of the DNA is in fragments [shorter] than 300 base pairs, so a lot of PCR-based targeting needs to be redesigned." Because of the challenges, and because few groups have published methods for sequencing FFPE samples, "any development in this area is welcome."
The technique could be particularly useful for rare tumors, said Eric Duncavage, an instructor of pathology at the University of Utah who co-led the new study. "We had this rare tumor type — there are only fewer than 1,000 cases per year in the US … and there was no fresh tissue for testing," he said.
Duncavage and his team created capture probes by designing 23 overlapping PCR products across the Merkel cell polyomavirus genome with an average size of 275 base pairs. Next, the team de-paraffinized the tumor tissue with the chemical xylene, and then they extracted one microgram of DNA.
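As a rough illustration of the tiling idea described above, the following sketch lays out evenly spaced, overlapping products across a linear sequence. It is not the authors' primer-design procedure: the function name is hypothetical, the genome length of 5,387 bases is an assumed figure for Merkel cell polyomavirus that does not appear in the article, every product is given the reported average size of 275 base pairs, and the circular viral genome is treated as linear for simplicity.

```python
def tile_amplicons(genome_length, n_products, product_size):
    """Return (start, end) pairs, 0-based half-open, evenly tiling a linear sequence."""
    if n_products < 2:
        raise ValueError("need at least two products to tile")
    if n_products * product_size < genome_length:
        raise ValueError("products are too short to cover the sequence")
    span = genome_length - product_size  # start coordinate of the last product
    tiles = []
    for i in range(n_products):
        # Integer arithmetic keeps the starts evenly spaced without float rounding.
        start = (i * span) // (n_products - 1)
        tiles.append((start, start + product_size))
    return tiles


# Assumed genome length (about 5.4 kb); 23 products of 275 bp as in the article.
tiles = tile_amplicons(genome_length=5387, n_products=23, product_size=275)
overlaps = [tiles[i][1] - tiles[i + 1][0] for i in range(len(tiles) - 1)]
print(len(tiles), tiles[0], tiles[-1], min(overlaps), max(overlaps))
```

Under these assumptions, neighboring products share a few dozen bases, so every position in the sequence is covered by at least one probe. Real designs would also have to account for primer placement and melting temperature, which this sketch ignores.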
They then used their probes to capture the viral DNA and prepared a sequencing library for the Illumina Genome Analyzer. Three samples were run with 75-base paired-end reads and one with 50-base paired-end reads. Two libraries were constructed and sequenced for each sample.
Average viral coverage ranged from 4,700- to 37,000-fold, with 18 percent of the sequence mapping back to the viral genome. The size of the capture probes, which ranged from 222 base pairs to 353 base pairs, did not appear to play a role in hybridization efficiency.
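To give a sense of how fold coverage relates to sequencing output, here is a back-of-the-envelope sketch. It assumes a viral genome of 5,387 bases (a figure not given in the article) and applies the reported 18 percent on-target fraction uniformly to every sample, which the article does not state. The function names are hypothetical.

```python
def mean_fold_coverage(total_bases, on_target_fraction, target_length):
    """Average depth over the target: on-target bases divided by target length."""
    return total_bases * on_target_fraction / target_length


def total_bases_needed(fold_coverage, on_target_fraction, target_length):
    """Inverse of mean_fold_coverage: sequenced bases required for a given depth."""
    return fold_coverage * target_length / on_target_fraction


TARGET_LENGTH = 5387  # assumed viral genome length in bases
ON_TARGET = 0.18      # fraction of sequence mapping to the virus, per the article

low = total_bases_needed(4700, ON_TARGET, TARGET_LENGTH)
high = total_bases_needed(37000, ON_TARGET, TARGET_LENGTH)
print(f"{low / 1e6:.0f} to {high / 1e6:.0f} million bases sequenced")
```

Because the target is only a few kilobases long, even a modest on-target fraction translates into very deep coverage of the viral genome.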
The approach was able to identify viral insertion sites within the human tumor genome, as well as single-base changes and rearrangements within the viral DNA. Interestingly, the paired primary and metastatic samples yielded identical viral deletion patterns, conserved at the single-base level.
Capture hybridization is particularly suited for identifying viral integration sites because capture methods have an inherent "slop" or "off-target coverage" in them, said Duncavage.
So, despite the fact that the team designed viral-specific probes, the probes will also catch human DNA. "Just by chance, you'll end up with pieces that have both human and viral DNA," he said.
The method contrasts with PCR-based enrichment, where both the 3' and 5' ends of targeted DNA must be specified. Instead, the hybrid capture approach allows for the enrichment of sequences that contain only partial homology, enabling fragments with mixtures of viral and human DNA to be enriched.
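The idea of fishing out fragments that carry both viral and human DNA can be sketched in a few lines. This is an illustrative toy, not the study's analysis pipeline: it assumes each read mate has already been aligned and reduced to a (read ID, reference name, position) record, and the reference name "MCPyV", the function name, and the example data are all hypothetical.

```python
from collections import defaultdict

VIRAL_REF = "MCPyV"  # hypothetical name for the viral reference sequence


def find_chimeric_pairs(alignments, viral_ref=VIRAL_REF):
    """Return read pairs whose mates map to the virus and to another reference.

    alignments: iterable of (read_id, reference_name, position) records,
    one record per aligned mate.
    """
    by_read = defaultdict(list)
    for read_id, ref, pos in alignments:
        by_read[read_id].append((ref, pos))

    chimeric = {}
    for read_id, mates in by_read.items():
        refs = {ref for ref, _ in mates}
        # A candidate junction pair touches the virus and at least one other reference.
        if viral_ref in refs and len(refs) > 1:
            chimeric[read_id] = sorted(mates)
    return chimeric


# Made-up example records: only pair2 spans virus and human.
toy = [
    ("pair1", "MCPyV", 1200), ("pair1", "MCPyV", 1350),
    ("pair2", "MCPyV", 3010), ("pair2", "chr5", 48211),
    ("pair3", "chr2", 9000), ("pair3", "chr2", 9150),
]
print(find_chimeric_pairs(toy))
```

In practice, candidate pairs like these would only narrow the search; pinning down an insertion site to the base would require reads that cross the junction itself.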
The researchers identified the viral insertion sites in the three cases where they used 75-base paired-end reads, but were only able to identify the 5' insertion site in the sample that they sequenced with 50-base paired-end reads. They hypothesized that the 3' site occurred in a repetitive region of the genome. The paired primary and metastatic tumors had identical viral insertion sites, while all the other samples had unique insertion sites.
While there is not a lot of published data on commercial capture methods, Duncavage said he believes his team's approach is comparable to them, although a little less efficient. One advantage, he said, is that the team was able to design longer probes, which are useful when looking for viral insertion sites because there is a greater chance of finding the chimeric region of the genome that contains both viral and human DNA.
Duncavage said that he is planning to compare the probe-design method to Agilent's SureSelect kit to see what the differences are in terms of capture efficiency and the ability to identify viral insertion sites. He noted, however, that several factors have changed since the team began its work, which may shift the scales in favor of commercial methods.
For one thing, he said, the researchers developed their method at a time when commercial methods were a lot more expensive. When the team performed the study, probes from companies like Agilent cost around $1,200 per sample, compared to between $100 and $200 for self-designed probes such as those described in the paper. Now, said Duncavage, the commercial probes run around $500 each.
In addition, since publishing the study, the team has switched over to the Illumina HiSeq, which enables multiplexing and further reduces the cost of targeted sequencing with both the team's own method and commercial kits.
Duncavage noted that there are both advantages and disadvantages to using self-designed probes. Researchers can design longer probes, and the approach is still less expensive than using commercially designed probes, he said, but it can also be time-consuming and labor-intensive.
On the other hand, because researchers can make a limitless supply of probes with PCR reactions, the self-designed method would be a good option for large-scale experiments looking at a small region of interest, he added.
Wood said that while "commercially available kits are good and improving, there is still a need for researchers to push the limits of the technology….There will always be room for people to tailor methods to suit their own requirements."
He said he would consider using this latest method in his studies of head and neck cancer patients, in trying to understand the mechanism of HPV infection.
One drawback to the method, however, is that it requires knowing the sequence of the virus, said Schweiger, so "at the moment, the method is not useful for de novo discoveries or diagnostics." She added that it could potentially be modified for broader applications in the future by targeting multiple viral sequences.