By Julia Karow
This article, originally published Oct. 7, has been updated with comments from an author of the study.
A primary breast cancer acquired several new mutations in its protein-coding regions as it progressed toward metastasis, according to a new report by Canadian researchers, who argue that studying these mutations might help scientists understand why cancers become treatment resistant.
The research team, of the BC Cancer Agency in Vancouver, sequenced the genome and transcriptome of a metastatic breast cancer at high depth, using the Illumina Genome Analyzer platform, and analyzed the data for somatic coding mutations.
The scientists then determined how many of these mutations they could already detect in the primary tumor of the same patient, which was removed nine years earlier. Only a fraction of them could be found at all, and even fewer were present at a high frequency.
In addition, by analyzing not only genome but also transcriptome data, they uncovered two new RNA-editing events that change the amino acid sequence of two proteins.
The study "offered the opportunity, for the first time as far as we are aware, to look at the evolution of mutational burden across a very long period of time within the same patient," said Marco Marra, director of the BC Cancer Agency Genome Sciences Centre in Vancouver, and one of the authors of the report, which appeared online in Nature last week.
"Our results show the importance of sequencing samples of tumor cell populations early as well as late in the evolution of tumors, and of estimating allele frequency in tumor genomes," the authors noted.
Being able to study how a metastatic cancer's genome differs from that of a primary tumor "is important because it gets at the heart of how treatment shapes the genetic constitution of a malignancy," Marra said. "We need to know, with great precision, what the changes are that allow tumors to evade treatment."
The project, which was completed about six months ago, according to Marra, was just the beginning of a large-scale effort to study treatment-resistant cancer. "We are going to continue with using large-scale high-throughput approaches, such as DNA sequencing, to get at the mutational spectrum of treatment-resistant [cancers]," he said, including breast cancer, hematologic malignancies, childhood cancers, and lung cancer.
For their study, the Canadian researchers chose lobular breast cancer, an estrogen-receptor positive subtype that makes up about 15 percent of all breast cancers.
One of the reasons for focusing on breast cancer as the first example is that the Vancouver researchers plan to study more breast cancer samples as part of the Molecular Taxonomy of Breast Cancer International Consortium, or METABRIC, project, a collaboration between five hospitals and research centers in the UK and Canada, Marra said.
According to the BC Cancer Foundation, the Vancouver team is now sequencing the genomes of several hundred so-called triple-negative breast tumors as part of an effort to build "a comprehensive genomic map of breast cancer" from 2,000 samples.
For their published study, the researchers initially sequenced DNA from a metastatic lobular breast cancer sample, generating approximately 2.9 billion paired-end reads with a mean read length of 48 base pairs, or 141 gigabases of sequence data, on the Illumina Genome Analyzer. About 121 gigabases aligned to the human reference genome, equivalent to about 43-fold coverage.
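The yield and coverage figures above can be checked with simple arithmetic. The sketch below is a back-of-the-envelope illustration, not the authors' calculation: the read count, read length, and aligned-base total come from the article, while the genome size used as the denominator is an assumption, since the article does not say which value the researchers used.

```python
# Back-of-the-envelope check of sequencing yield and fold coverage.
# READS, MEAN_READ_LENGTH_BP and ALIGNED_BASES are the rounded figures
# reported in the article. ASSUMED_GENOME_SIZE_BP is a hypothetical
# denominator chosen for illustration only.

READS = 2.9e9                   # approximate number of reads
MEAN_READ_LENGTH_BP = 48        # mean read length in base pairs
ALIGNED_BASES = 121e9           # bases that aligned to the reference
ASSUMED_GENOME_SIZE_BP = 2.8e9  # assumption, not stated in the article


def total_bases(reads, mean_read_length_bp):
    """Total sequenced bases = number of reads x mean read length."""
    return reads * mean_read_length_bp


def fold_coverage(aligned_bases, genome_size_bp):
    """Average coverage = aligned bases / size of the genome covered."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return aligned_bases / genome_size_bp


if __name__ == "__main__":
    yield_gb = total_bases(READS, MEAN_READ_LENGTH_BP) / 1e9
    coverage = fold_coverage(ALIGNED_BASES, ASSUMED_GENOME_SIZE_BP)
    print(f"Estimated yield: {yield_gb:.1f} gigabases")
    print(f"Estimated coverage: {coverage:.1f}-fold")
```

With the rounded read count, the yield comes out at roughly 139 gigabases, close to the reported 141; the gap is presumably due to rounding. The coverage estimate is sensitive to the denominator: an assumed 2.8 billion bases gives about 43-fold, while 3.1 billion gives about 39-fold.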
In parallel, using RNA-seq, they sequenced the transcriptome of the same sample, generating 182 million reads, or 7.1 gigabases of data, of which 161 million reads aligned to the reference.
Marra declined to say how much it cost to sequence the metastatic breast tumor, citing the rapidly declining cost of sequencing that renders any number obsolete in a matter of months. "It wasn't free, but it was eminently feasible and the kind of thing we can imagine doing hundreds of," he said.
The researchers then analyzed the data for putative single-nucleotide variants, insertions and deletions, gene fusions, translocations, inversions, and copy number alterations.
In total, they predicted 1,456 new coding non-synonymous SNVs, of which 1,120 remained after they removed pseudogenes and HLA sequences.
Sanger sequencing of these 1,120 SNVs in the tumor DNA as well as in normal lymphocyte DNA from the same patient confirmed 437 of them to be non-synonymous coding variants. Of these, 405 were present in normal germline DNA, and 32 were only present in the tumor.
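To make the filtering steps easier to follow, the counts reported above can be laid out as a simple funnel. This is only a tally of the article's numbers; the labels and function names are illustrative and do not come from the study.

```python
# Tally of the variant-filtering funnel using the counts in the article.
# Step labels and function names are illustrative, not the authors' own.

FUNNEL = [
    ("predicted coding non-synonymous SNVs", 1456),
    ("after removing pseudogenes and HLA sequences", 1120),
    ("confirmed by Sanger sequencing", 437),
]
GERMLINE = 405  # confirmed variants also present in normal DNA
SOMATIC = 32    # confirmed variants present only in the tumor


def retention(funnel):
    """Return (label, count, fraction kept from the previous step)."""
    rows = []
    previous = None
    for label, count in funnel:
        fraction = 1.0 if previous is None else count / previous
        rows.append((label, count, fraction))
        previous = count
    return rows


def somatic_fraction(somatic, germline):
    """Share of confirmed variants that are tumor-specific."""
    return somatic / (somatic + germline)


if __name__ == "__main__":
    for label, count, fraction in retention(FUNNEL):
        print(f"{count:>5}  {label}  ({fraction:.0%} of previous step)")
    share = somatic_fraction(SOMATIC, GERMLINE)
    print(f"Somatic share of confirmed variants: {share:.1%}")
```

The germline and somatic counts add up to the 437 confirmed variants, so only about 7 percent of the confirmed variants were specific to the tumor.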
Of those 32 somatic point mutations in 32 genes, 30 occurred in both the tumor DNA and transcriptome, whereas two were present only in the transcriptome.
A study published two years ago by a group at Johns Hopkins University, which sequenced about 18,000 protein-coding genes in 11 breast tumors, had produced a list of candidate cancer genes, or CAN genes, but the Canadian researchers found that none of their 32 point mutations fell in those genes. Eleven of the 32 genes, though, were already listed in the Catalogue of Somatic Mutations in Cancer database, albeit with different mutations.
The researchers then tested six of the mutations in 192 additional breast cancer samples, a mix of lobular and ductal subtypes, and found that although none carried any of the same mutations, three contained different mutations in one of the six genes.
In order to determine how many of the 32 mutations were already present in the primary tumor, which was diagnosed nine years before the metastatic tumor, the team sequenced the positions of 30 of the mutations in the primary tumor, most of them by Illumina sequencing, using the read counts to estimate the frequency of the mutations in the tumor.
They found that five of the mutations were prevalent in the primary tumor, and six were present at frequencies between 1 percent and 13 percent, meaning they were probably only present in parts of the tumor. Nineteen of the mutations could not be detected in the primary tumor, "despite the fact that we sequenced to what we considered to be sufficient depth to detect even a one-percent change with confidence," Marra said.
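The reasoning behind estimating mutation frequency from read counts, and behind Marra's point about sequencing depth, can be illustrated with a short sketch. It is a simplified model under stated assumptions, not the team's method: it treats each read as an independent draw, ignores sequencing error (which matters in practice at frequencies near 1 percent), and uses hypothetical depths, since the article does not report the actual depth.

```python
from math import comb

# Simplified model: every read covering a position independently carries
# the variant with probability equal to its true frequency. Sequencing
# error is ignored, and the depths below are hypothetical.


def allele_frequency(variant_reads, total_reads):
    """Estimate mutation frequency as variant reads / total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def detection_probability(depth, freq, min_variant_reads=1):
    """Binomial probability of seeing at least `min_variant_reads`
    variant-bearing reads at a position sequenced to `depth`."""
    p_fewer = sum(
        comb(depth, k) * freq**k * (1 - freq) ** (depth - k)
        for k in range(min_variant_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Hypothetical depths, for illustration only.
    for depth in (100, 1000, 5000):
        p = detection_probability(depth, freq=0.01, min_variant_reads=5)
        print(f"depth {depth:>5}: P(>=5 variant reads at 1%) = {p:.3f}")
```

Under this model, a mutation present at 1 percent is easily missed at low depth but is very likely to appear in several reads once a position is covered thousands of times, which is why failing to detect a mutation at sufficient depth suggests it was absent or extremely rare in the primary tumor.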
"Thus, significant heterogeneity in tumor somatic mutation content existed in the primary tumor at diagnosis," the authors wrote, and "significant evolution of coding mutational content occurred between primary and metastasis," although it is unclear whether the 19 additional mutations in the metastasis resulted from radiation therapy or "innate tumor progression."
"Presumably, the things that have become enriched in the [metastatic tumor] are things that somehow confer the viability of the metastatic cells," according to Marra.
These results differ from a high-throughput sequencing analysis of an acute myeloid leukemia genome, published last year by researchers at Washington University School of Medicine (see In Sequence 11/11/2008), in which, the authors noted, the tumor did not evolve to such an extent.
Separately, the researchers examined how RNA editing might lead to changes in proteins expressed in the tumor. They found 3,122 candidate edits, of which 526 led to non-synonymous changes. After testing 75 of these editing events in 12 genes in the metastatic cancer by Sanger sequencing, they found that transcripts of two genes were indeed edited, resulting in altered proteins. "These observations emphasize the importance of integrating RNA-seq data with tumor genomes in assessing protein variations," the researchers wrote.