NEW YORK (GenomeWeb News) – In a paper appearing online last night in the Proceedings of the National Academy of Sciences, an international research team reported that they have come up with a strategy for determining allele-specific copy number in tumor samples.
The Norwegian and Belgian-led team developed the algorithm, dubbed allele-specific copy number analysis of tumors, or ASCAT, to glean allele-specific CNV and other information from tumor samples based on SNP data. When they tested the approach using genotype data on more than 100 breast cancer samples, they found that, in most cases, the algorithm could determine tumor ploidy, discern allele-specific copy number profiles in tumor cells, and distinguish aberrant from normal cells within tumors.
"By aggregation of ASCAT profiles across our series, we obtain genomic frequency distributions of gains and losses, as well as genome-wide views of [loss of heterozygosity] and copy number neutral events in breast cancer," co-senior author Vessela Kristensen, a researcher affiliated with Oslo University Hospital and the University of Oslo, and co-authors wrote.
When the team used the newly developed algorithm to compare five breast carcinoma sub-types, they found differences in ASCAT profiles, including the tumor cell fraction, from one sub-type to the next. And, they added, looking at so-called allelic skewness — in which one allele is lost and the other gained — provided clues about potential players in breast cancer development.
"[B]y evaluating the relative frequency of deletions and duplications of the two possible alleles at each SNP locus, we construct a genome-wide map of allelic skewness, pointing to candidate genes/loci that may drive breast cancer development," they wrote.
Although a number of cancer genome studies have been done in recent years, the researchers noted, analyzing genomic data generated from tumor samples can be tricky, since tumor samples typically contain both cancerous cells, which may or may not be diploid, and some normal cells.
"For these reasons, most studies have been limited to reporting gains and losses (array CGH), possibly supplemented by allelic imbalances (SNP arrays), and are unable to assign correct (allele-specific) copy numbers to all loci in the reference genome," the authors explained.
In an effort to glean more information from such cancer genome studies, the researchers developed and tested the ASCAT strategy using genotype data generated with Illumina 109K SNP arrays for 112 breast carcinoma samples — including samples from luminal A, luminal B, basal-like, normal-like, and ERBB2 carcinoma sub-types.
The researchers obtained ASCAT profiles for 91 of the 112 tumor samples tested, identifying losses and gains, estimating the fraction of aberrant cells within tumors, detecting loss of heterozygosity events, and more. The team subsequently used independent approaches such as DNA dilution experiments and fluorescence in situ hybridization to verify findings from the algorithm.
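To illustrate why a mix of aberrant and normal cells complicates the readout, the following Python sketch computes the signals one would expect at a single locus under a simple two-population mixture. It is not taken from the paper and should not be read as ASCAT's own formulation: the function name `expected_signals`, its parameters, and the omission of any array-specific scaling are all assumptions made for illustration.

```python
import math


def expected_signals(n_a, n_b, tumor_fraction, avg_ploidy=2.0):
    """Expected B-allele fraction and log2 copy ratio at one locus.

    Hypothetical helper (not from the paper). Assumes the sample is a
    mixture of normal cells (one copy of each allele) and tumor cells
    carrying n_a copies of allele A and n_b copies of allele B.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    normal_fraction = 1.0 - tumor_fraction

    # Average number of copies per cell at this locus, and genome-wide.
    copies_here = 2.0 * normal_fraction + tumor_fraction * (n_a + n_b)
    copies_avg = 2.0 * normal_fraction + tumor_fraction * avg_ploidy

    if copies_here == 0.0:
        # Homozygous deletion in a pure tumor sample: no DNA at the locus.
        return float("nan"), float("-inf")

    # Share of all copies at this locus that are allele B.
    baf = (normal_fraction + tumor_fraction * n_b) / copies_here
    # Copy number relative to the sample-wide average, on a log2 scale.
    log_ratio = math.log2(copies_here / copies_avg)
    return baf, log_ratio


# Loss of allele B in a sample that is half tumor, half normal cells.
print(expected_signals(n_a=1, n_b=0, tumor_fraction=0.5))
```

Under these assumptions, losing one allele in a half-tumor sample shifts the B-allele fraction only from one-half to about one-third, rather than to zero, which is why the aberrant cell fraction has to be estimated before allele-specific copy numbers can be assigned.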
Using the algorithm, the researchers identified specific alleles that tended to be lost or gained in the breast tumors. Based on their findings so far, they explained, it seems likely that certain alleles are under selection in — and may contribute to the development of — some tumors.
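As a rough illustration of how allelic skewness at one SNP might be scored, the Python sketch below counts, across samples, how often copy number changes favor one allele over the other. The article does not give the authors' actual statistic, so the function `allelic_skewness`, its input format, and the scoring from -1 to 1 are hypothetical choices for illustration only.

```python
from collections import Counter


def allelic_skewness(events):
    """Score how strongly copy number changes favor allele A over B.

    Hypothetical helper (not the authors' statistic). `events` is an
    iterable of (allele, change) pairs observed at one SNP across
    samples, with allele in {"A", "B"} and change in {"gain", "loss"}.
    Returns a value from -1 (always favors B) to 1 (always favors A).
    """
    favors = Counter()
    for allele, change in events:
        if allele not in ("A", "B") or change not in ("gain", "loss"):
            raise ValueError(f"unexpected event: {(allele, change)!r}")
        # Gaining A or losing B both tilt the balance toward allele A.
        toward_a = (allele == "A") == (change == "gain")
        favors["A" if toward_a else "B"] += 1

    total = favors["A"] + favors["B"]
    if total == 0:
        return 0.0  # no events observed, so no evidence of skew
    return (favors["A"] - favors["B"]) / total


# Made-up events at one SNP: three favor allele A, one favors allele B.
example = [("A", "gain"), ("B", "loss"), ("A", "gain"), ("B", "gain")]
print(allelic_skewness(example))
```

A score near zero would suggest that neither allele is preferentially retained, while a score near either extreme would flag the locus as a candidate for the kind of selection the researchers describe.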
Moreover, the team found differences in ASCAT profiles depending on breast cancer sub-type. For instance, they noted, in samples from the luminal A sub-group, the algorithm uncovered the highest proportion of aberrant tumor cells. On the other hand, the ERBB2 and normal-like sub-types appeared to have the lowest fraction of such cells.
And although the current study relied on array data, the team argued that a similar approach might be beneficial for assessing data from whole-genome sequencing studies of cancer as well.
"The dissection of cancer genomes is taken to the next step by the recent introduction of cancer genome sequencing," they wrote. "We believe ASCAT profiles could be useful tools for interpretation of these data, aiding in the assembly of the data and in the identification of changes varying in size from point mutations to complex rearrangements."