Analytical Method Measures Microsatellite Instability Status from Existing Next-Gen Sequence Data


NEW YORK (GenomeWeb) – A University of Washington-led team has developed a method for tapping existing tumor sequence data to detect the microsatellite instability (MSI) that can occur in tumors from many cancer types.

The researchers described the approach, called mSINGS, in a Clinical Chemistry study published online in late June. There, they showed that the analytical technique could reliably classify hundreds of previously sequenced tumor samples — assessed by exome or targeted gene panel sequencing — into MSI-positive or MSI-negative groups.

The work was motivated by an interest in identifying, improving, and streamlining methods for clinical MSI testing, senior author Colin Pritchard, laboratory medicine researcher at the University of Washington, told In Sequence.

The same mSINGS method of tallying up different repeat length alleles at dozens, hundreds, or thousands of microsatellite loci can be applied to a wide range of datasets, including sequences stemming from gene panel sequencing, exome sequencing, or whole-genome sequencing, he explained.

For example, he and his colleagues are starting to apply mSINGS in research efforts aimed at understanding MSI biology. They also plan to integrate the analysis into clinical sequencing pipelines at the University of Washington's CLIA-certified Clinical Molecular Genetics Laboratory, so that investigators there can get another level of information — MSI status — from samples sequenced clinically.

As gene panel and exome sequencing are increasingly finding their way into the clinic, Pritchard explained, it's helpful to start thinking of ways to glean more information from such sequence data to increase efficiency and decrease costs in the healthcare setting.

Along with the broad look it offers at MSI across the genome, the mSINGS approach can be scaled up to look at larger and larger sets of microsatellite markers, without the need for using up sometimes-scant patient samples in additional experiments.

"It really is, in some respects, like getting something for nothing," the study's first author Stephen Salipante, a laboratory medicine researcher at the University of Washington, told In Sequence. Both Salipante and Pritchard are involved in co-directing the center's CLIA-certified clinical lab.

The tendency for repetitive microsatellite tracts to spontaneously take on or shed small sequences is typically attributed to DNA mismatch repair glitches caused by inherited or sporadic mutations to mismatch repair genes or epigenetic regulators, the researchers noted.

These microsatellite changes — and the mutations underlying them — have become important when considering cancer risk, diagnosing and treating certain cancer types, and predicting their outcomes, they explained.

To date, the most widely used method for determining a sample's MSI status has been MSI-PCR, an approach that involves using fluorescently labeled primers to PCR amplify sequences at around five well-characterized microsatellite markers, followed by capillary electrophoresis to see shifts in microsatellite sizes.

High-throughput sequencing holds potential for rapidly finding mutations in mismatch repair genes that can lead to MSI.

But focusing on MSI causes may not catch all of the MSI-causing genetic or epigenetic glitches in a given sample, the study's authors explained, which prompted them to pursue a sequencing-based method to more directly measure MSI.

To that end, the researchers began by searching for variable microsatellite sites in sequence data generated through gene panel or exome sequencing studies on hundreds of tumor samples.

Clues from past studies hint that microsatellites may be a bit more common outside of protein-coding portions of the genome.

Nevertheless, the team determined that it could identify between a dozen and a few thousand microsatellite markers in samples from three existing sequence datasets: whole-exome sequencing from the Cancer Genome Atlas, and targeted gene sequencing with the ColoSeq and UW-OncoPlex assays.

The samples included 26 colorectal cancers classified as MSI-negative, MSI-positive, or MSI-low that had been exome sequenced by TCGA members. For those tumors, the researchers had access to around 44 million bases of sequence data apiece, spanning some 30,000 genes.

Another 103 colorectal, endometrial, ovarian, breast, or prostate cancer samples had been tested at the University of Washington using a ColoSeq assay, which targets the exons and introns of 50 genes, comprising about 1.4 million sequence bases. The remaining 195 tumors had been tested by targeted gene sequencing by University of Washington researchers using the 195-gene, 850,000-base UW-OncoPlex assay.

Starting from manual searches for instability-prone microsatellite tracts in a subset of the available samples, the researchers uncovered 146 instability-prone microsatellite sites and 15 such microsatellite markers across sequences targeted by ColoSeq and UW-OncoPlex, respectively.

"One of the capture designs we looked at, the ColoSeq capture, actually does capture introns," Pritchard said. "Because of that, there was a far greater number of microsatellites that we were able to identify in that capture design."

With their more automated, custom program, meanwhile, the researchers tracked down 2,957 mononucleotide microsatellite loci within exome sequences targeted for the TCGA analysis.
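The article does not describe how the team's custom program works. As a rough illustration of what scanning for mononucleotide microsatellites can involve, the Python sketch below finds homopolymer runs in a DNA string with a regular expression. The function name and the minimum run length of eight bases are assumptions made for this example, not details from the study.

```python
import re

def find_mononucleotide_runs(sequence, min_len=8):
    """Return (start, base, length) for each single-base run of at least min_len.

    Hypothetical helper: min_len=8 is an illustrative choice, not a value
    reported for mSINGS.
    """
    # One base followed by at least (min_len - 1) repeats of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    runs = []
    for match in pattern.finditer(sequence.upper()):
        runs.append((match.start(), match.group(1), len(match.group(0))))
    return runs

# Toy sequence: a 10-base A tract, a 5-base T tract (too short), an 8-base G tract.
toy = "ACGT" + "A" * 10 + "CG" + "T" * 5 + "G" * 8
runs = find_mononucleotide_runs(toy)
for start, base, length in runs:
    print(f"{base} x {length} at offset {start}")
```

A real pipeline would run this kind of scan over the captured target regions of a reference genome rather than a toy string, and would likely handle ambiguous bases and longer repeat units as well.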

Such findings indicate that microsatellite markers can be identified in retroactively compiled data and that they are abundant enough to be found in targeted capture sequences focused mainly on protein-coding exon sequences, Salipante noted. "We were able to do this kind of analysis on even small to moderate-sized gene capture panels without taking any design considerations into account," he said.

Once these microsatellites were identified, the team established baseline estimates of MSI by considering the number and distribution of repeat length alleles present at these sites using DNA from a few dozen matched normal blood or MSI-negative tumor samples.

Next, the researchers evaluated the full collection of ColoSeq, UW-OncoPlex, and TCGA exome samples, classifying each sample as either MSI-positive or MSI-negative based on the number of microsatellite alleles supported by 5 percent or more reads at each marker locus and comparisons to baseline. To be included in the team's analyses, a locus had to be covered by at least 30 reads.
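The counting scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the published mSINGS code: the 30-read coverage floor and the 5 percent allele cutoff come from the article, while the function names, the rule that a locus is unstable when its allele count exceeds the baseline mean by three standard deviations, and the 20 percent unstable-locus fraction used to call a sample are all assumptions made for the example. The sketch also measures allele support as a fraction of all reads at the locus, which is one possible reading of the 5 percent criterion.

```python
from statistics import mean, stdev

MIN_DEPTH = 30              # from the article: at least 30 reads per locus
MIN_ALLELE_FRACTION = 0.05  # from the article: allele needs >= 5% of reads

def count_alleles(length_counts):
    """Count repeat-length alleles with enough read support.

    length_counts maps repeat length -> number of reads showing that length.
    Returns None when the locus is covered by too few reads.
    """
    depth = sum(length_counts.values())
    if depth < MIN_DEPTH:
        return None
    return sum(1 for n in length_counts.values()
               if n / depth >= MIN_ALLELE_FRACTION)

def locus_is_unstable(n_alleles, baseline_counts, z_cutoff=3.0):
    # Assumed rule: more alleles than baseline mean + z_cutoff * SD.
    # baseline_counts needs at least two values for stdev().
    return n_alleles > mean(baseline_counts) + z_cutoff * stdev(baseline_counts)

def classify_sample(sample, baseline, unstable_fraction_cutoff=0.2):
    """Return (call, fraction of evaluated loci that look unstable)."""
    evaluated = unstable = 0
    for locus, length_counts in sample.items():
        n_alleles = count_alleles(length_counts)
        if n_alleles is None or locus not in baseline:
            continue  # skip under-covered loci and loci without a baseline
        evaluated += 1
        if locus_is_unstable(n_alleles, baseline[locus]):
            unstable += 1
    if evaluated == 0:
        return None, 0.0
    fraction = unstable / evaluated
    call = "MSI-positive" if fraction > unstable_fraction_cutoff else "MSI-negative"
    return call, fraction

# Toy data: allele counts seen at each locus in five control samples.
baseline = {"locus_A": [1, 1, 2, 1, 1], "locus_B": [1, 2, 2, 1, 2]}
tumor = {
    "locus_A": {10: 50, 9: 20, 8: 15, 11: 15},  # four well-supported lengths
    "locus_B": {12: 90, 11: 10},                # two lengths, within baseline
    "locus_C": {7: 20},                         # only 20 reads, skipped
}
call, fraction = classify_sample(tumor, baseline)
print(call, fraction)
```

Keeping the per-locus call separate from the per-sample call mirrors the two levels of comparison in the text: each marker is judged against its own baseline, and the sample is then judged on how many markers look unstable.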

"The criteria we employed for determining a sample's MSI phenotype are related to those currently used in interpreting MSI-PCR assays," the study's authors noted, "in that we compare the number of signals reflecting products of different lengths at an individual marker against those from a healthy sample to assess potential instability."

The researchers' results suggest that mSINGS assessments of exome, ColoSeq, and UW-OncoPlex panel sequence data can classify MSI status in tumors with high sensitivity and specificity — at least relative to the MSI status established by gold standard PCR-based testing.

For 108 samples (64 from the ColoSeq set, 18 samples sequenced with the UW-OncoPlex panel, and all 26 exome-sequenced TCGA samples), the group compared MSI classifications from mSINGS with those established using Promega's PCR-based MSI kit. For all but two of the samples — both sequenced with the ColoSeq panel — the mSINGS method accurately pegged tumors as having high- or low-MSI.
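The sample counts in that comparison translate into an overall agreement figure with simple arithmetic. The short Python sketch below works it out from the numbers given in the article; the variable names are made up for the example, and because the article does not break the comparison set down into MSI-positive and MSI-negative cases, it cannot reproduce the sensitivity and specificity values themselves.

```python
# Samples compared against MSI-PCR, and disagreements, per dataset (from the article).
compared = {"ColoSeq": 64, "UW-OncoPlex": 18, "TCGA exome": 26}
discordant = {"ColoSeq": 2, "UW-OncoPlex": 0, "TCGA exome": 0}

total = sum(compared.values())
agree = total - sum(discordant.values())
overall = agree / total
print(f"Overall: {agree}/{total} concordant ({overall:.1%})")

# Per-dataset agreement with the PCR-based calls.
per_dataset = {name: (compared[name] - discordant[name]) / compared[name]
               for name in compared}
for name, rate in per_dataset.items():
    print(f"{name}: {rate:.1%}")
```

Run as written, this reports 106 of 108 samples in agreement, or about 98.1 percent overall, with both discordant calls falling in the ColoSeq set.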

In one case, mSINGS identified instability in an MSI-PCR-negative sample, while a second sample deemed MSI-positive by MSI-PCR did not reach the cutoff for instability classification by mSINGS.

For five samples previously deemed "MSI-low" by MSI-PCR, the mSINGS method produced an MSI-negative classification. It's still unclear whether those results mark a limitation in the current analysis or whether they are due to false-positive measurements at one of the loci tested by PCR, Pritchard noted.

"The microsatellite-low cases we looked at were not significantly different, in terms of the fraction of unstable loci compared to true negatives, but we didn't look at enough cases to really comment on it," he said.

Additional research is needed to look at how well mSINGS classifies samples with intermediate MSI levels. But because it can offer a peek at hundreds or even thousands of microsatellite marker loci, Pritchard argued that the mSINGS approach may turn out to be a powerful tool for better defining MSI-low samples and getting a sense of what this classification means in a clinical context.

The team is also keen to explore the possibility of doing mSINGS classifications of MSI status in tumors without matched normal controls. Results from the current analysis suggest control-free MSI testing may be possible with mSINGS, by comparing MSI-positive and MSI-negative tumors across a population, though that remains to be seen in larger sample sets.

"This is early days," Pritchard said. "But what it looks like is that the larger you expand the number of loci you look at, the less critical it is to have a matched-normal [sample]."

"We think that's one of the really big advantages," he said. "In addition to … having informatics that can actually make the call of whether loci are stable or not, another huge advantage is that it probably abrogates the need for a patient-matched normal [sample]."

The mSINGS analyses done to date have all depended on sequence data generated using Illumina instruments.

The same general approach can likely be applied using sequences produced on other platforms, though Salipante noted that reads generated on instruments prone to homopolymer errors could theoretically influence the results, since mononucleotide microsatellites are themselves homopolymer runs.

"It's not something we've explicitly looked at," he said. "Whether or not that's something that can be overcome, or is something that needs to be overcome, is not clear to me."

Along with efforts to clinically validate mSINGS in different tumor types and incorporate it into somatic sequencing analyses done at the University of Washington's Clinical Molecular Genetics Laboratory, the team plans to bring together MSI classifications from mSINGS with gene sequence data to learn more about the biological basis of MSI.

"The scale of this type of assay is such that it is going to allow us to look at microsatellite instability in a way that has not been possible to look at it before," Salipante said. "That is an active area of research for us … to look into the biology of these tumors and try to learn something about the spectrum of microsatellite instability and also whether and how microsatellite instability is different in these different tumor types."

The study's authors are making software and other methods related to mSINGS freely available to other academic researchers and CLIA-certified laboratories.