Such adjustments will likely hone the sequencing approach, the researchers said. In the meantime, they are optimistic that the results they’ve obtained so far are the first of many. “It is early days, and I think it’s the first baby steps,” Futreal said. “But it’s very promising.”
Genetic Rearrangements Found in Cancer Cell Lines Using Massively Parallel Paired-End Sequencing
NEW YORK (GenomeWeb News) – A group of UK researchers has identified several types of somatically acquired and germline mutations, including rearrangements, insertions, deletions, and copy number variations, in two lung cancer cell lines using a single technique: massively parallel paired-end sequencing.
The paper, published yesterday in the advance online edition of Nature Genetics by investigators from the Wellcome Trust Sanger Institute and the University of Cambridge, suggests it may be possible to systematically catalogue the genome-wide genetic changes in cancerous cells.
“The basic aim is to obtain a genome-wide screen for [cancer] rearrangements,” Michael Stratton, co-head of the Sanger Institute’s Cancer Genome Project and an author on the paper, told GenomeWeb Daily News today, adding that this experiment demonstrates the team’s ability to “go from genome-wide screens to fusion transcripts.”
Traditionally, it has been much easier to detect chromosomal abnormalities in cancerous cells through karyotyping than it has been to pinpoint genome-wide genetic changes. In cancerous cells, “the whole genome is very, very, very disorganized,” senior author Andrew Futreal, co-head of the Cancer Genome Project at the Sanger Institute, told GWDN. Understanding the genetics of these rearrangements is important, Stratton added, so that researchers can target suspicious genes or regions of the genome.
But until now, there has been no efficient, genome-wide approach for getting a genetic view of cancer-associated rearrangements. For this study, Futreal, Stratton, and their colleagues did massively parallel paired-end sequencing on genomic DNA from two lung cancer cell lines (NCI-H2171, a small-cell lung cancer cell line, and NCI-H1770, a neuroendocrine cell lung cancer cell line) using an Illumina Genome Analyzer.
This involved chopping up the genome into millions of segments — 200 to 500 base pairs in length — and sequencing both ends of these individual inserts. By mapping the ends of each onto a reference genome, the researchers can see what sorts of mutations are present in different parts of the genome.
For instance, if the two ends of an insert map farther apart on the reference genome than expected, the sample likely carries a deletion between them. If the ends map closer together than expected, there is probably an insertion. Likewise, the team reported, because it is possible to count how many times a given sequence is present, copy number variations can be seen at a resolution similar to that obtained using currently available SNP arrays.
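The distance logic behind paired-end mapping can be sketched in a few lines of Python. This is a minimal illustration rather than the Sanger team's pipeline: the `MappedPair` record and the `classify_pair` function are hypothetical names, read orientation is ignored, and the 200 to 500 base pair window is simply the insert size range reported for this study. Ends that land farther apart on the reference than the library allows point to sequence missing from the sample (a deletion), while ends that land too close together point to extra sequence in the sample (an insertion).

```python
from dataclasses import dataclass

# Insert size range reported for this study (base pairs).
MIN_INSERT = 200
MAX_INSERT = 500


@dataclass(frozen=True)
class MappedPair:
    """Hypothetical record: where each end of one insert maps on the reference."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


def classify_pair(pair, min_insert=MIN_INSERT, max_insert=MAX_INSERT):
    """Label a read pair by comparing its mapped span with the expected insert size.

    Simplification: orientation of the two ends is not checked, so
    inversions and tandem duplications are not distinguished here.
    """
    if pair.chrom_a != pair.chrom_b:
        # Ends on different chromosomes suggest an interchromosomal rearrangement.
        return "interchromosomal"
    span = abs(pair.pos_b - pair.pos_a)
    if span > max_insert:
        # Reference holds sequence between the ends that the sample lacks.
        return "possible deletion"
    if span < min_insert:
        # Sample holds sequence between the ends that the reference lacks.
        return "possible insertion"
    return "concordant"


if __name__ == "__main__":
    examples = [
        MappedPair("chr1", 1000, "chr1", 1350),
        MappedPair("chr1", 1000, "chr1", 6000),
        MappedPair("chr1", 1000, "chr1", 1050),
        MappedPair("chr1", 1000, "chr8", 1350),
    ]
    for pair in examples:
        print(pair, "->", classify_pair(pair))
```

A real analysis would also require several independent pairs supporting the same event before calling it, since a single aberrant pair can be a mapping or library artifact.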
This sequencing effort, done to two- to threefold coverage, was specifically aimed at finding genetic rearrangements, Futreal said. Sequencing samples to much greater depth (some 20-fold or more) could also provide information about point mutations, he added.
To home in on the most informative data, Futreal explained, the team first threw out bad reads as well as those that mapped perfectly to the reference genome, focusing on aberrant inserts for which sequence data at each end was available.
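The filtering step Futreal describes can be sketched as a simple pass over read pairs. Everything specific here is an assumption made for illustration: the dictionary layout (`end_a`, `end_b`, `quality`), the function name `select_aberrant_pairs`, and the quality cutoff of 20 are hypothetical, and "mapped perfectly" is approximated as both ends landing on the same chromosome within the expected 200 to 500 base pair insert range.

```python
def select_aberrant_pairs(pairs, min_quality=20, min_insert=200, max_insert=500):
    """Keep only pairs worth inspecting for rearrangements.

    Each pair is a dict with hypothetical keys:
      "end_a", "end_b": (chromosome, position) tuples, or None if unmapped
      "quality": a single mapping-quality score for the pair
    """
    kept = []
    for pair in pairs:
        # Require sequence data mapped at both ends of the insert.
        if pair["end_a"] is None or pair["end_b"] is None:
            continue
        # Discard poor-quality reads (the cutoff is an assumed value).
        if pair["quality"] < min_quality:
            continue
        (chrom_a, pos_a), (chrom_b, pos_b) = pair["end_a"], pair["end_b"]
        concordant = (
            chrom_a == chrom_b
            and min_insert <= abs(pos_b - pos_a) <= max_insert
        )
        # Discard pairs that agree with the reference genome.
        if concordant:
            continue
        kept.append(pair)
    return kept


if __name__ == "__main__":
    demo = [
        {"end_a": ("chr1", 1000), "end_b": ("chr1", 1350), "quality": 40},
        {"end_a": ("chr1", 1000), "end_b": ("chr1", 9000), "quality": 40},
        {"end_a": ("chr1", 1000), "end_b": None, "quality": 40},
    ]
    for pair in select_aberrant_pairs(demo):
        print(pair)
```

Discarding concordant pairs early keeps the remaining data set small, which matters when the starting point is millions of inserts.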
All told, the researchers identified 306 germline structural variants and 103 somatic rearrangements. The germline and somatic or acquired mutations seem to result from different biological processes, Stratton noted. Whereas somatic mutations tended to result from double-stranded DNA breaks and acquired tandem duplications, germline variation was more often due to retrotransposons.
And while it’s too early, with just two samples, to find cancer-associated mutation patterns, Futreal and Stratton said this should become possible as they start extending the method to more samples and cancer types. Since the massively parallel paired-end approach is cheaper and more efficient than methods such as end-sequencing BAC libraries, Futreal said, “It’s getting to the point where one can conceive of applying this technique to a larger number of tumor types.”
In the future, he added, the researchers may combine data on short and longer inserts to increase the amount of information they can glean from each tumor sample. Using larger inserts between paired ends will allow researchers to cover the genome with much less sequence, Stratton said, but makes it more difficult to pinpoint exactly where rearrangements or breakpoints are. “We have to work out the balance of these benefits and costs,” he said.
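The trade-off Stratton describes can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes "covering the genome" means physical coverage, that is, the genome being spanned by inserts rather than by sequenced bases. The function names, the 3-billion-base genome size, the 35-base read length, and the 3,000 base pair "larger insert" are illustrative assumptions, not figures from the study.

```python
import math


def pairs_for_physical_coverage(coverage, insert_size, genome_size):
    """Number of inserts needed so their combined span equals coverage x genome."""
    return math.ceil(coverage * genome_size / insert_size)


def sequenced_bases(num_pairs, read_length):
    """Bases actually read when both ends of every insert are sequenced."""
    return num_pairs * 2 * read_length


if __name__ == "__main__":
    GENOME_SIZE = 3_000_000_000  # assumed approximate human genome size
    READ_LENGTH = 35             # assumed read length per end
    for insert_size in (500, 3000):
        pairs = pairs_for_physical_coverage(1.0, insert_size, GENOME_SIZE)
        bases = sequenced_bases(pairs, READ_LENGTH)
        print(f"{insert_size} bp inserts: {pairs:,} pairs, {bases:,} bases sequenced")
```

Under these assumptions, a sixfold increase in insert size cuts the required sequence sixfold, but each aberrant pair then localizes a breakpoint only to within a window that is roughly six times wider.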