In a guest post at the Genomes Unzipped blog, two researchers from the University of California, Berkeley, discuss how genotype imputation could help consumers get more information out of their 23andMe testing results by predicting or "imputing" variants that are not assayed by the genotyping chip.
Berkeley's Peter Cheng and Eliana Hechter say that the imputation process involves comparing 23andMe data, or data from another direct-to-consumer testing company, with a reference panel from the 1000 Genomes or HapMap projects. Because these reference datasets contain many more SNPs than DTC chips test for, and because humans share a common genetic ancestry, imputation algorithms can predict other variants that would likely be found in the user's genome if it were fully sequenced.
Basically, "the imputation algorithm models your genome as a mosaic of related genomes, and uses these related genomes to fill in all of your missing data," the post says.
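To make the "mosaic" idea concrete, here is a deliberately simplified toy sketch. It is not the method Impute2 uses, which is a far more sophisticated statistical model; it simply copies each missing allele from whichever reference haplotype best matches the nearest typed sites. The function name, the 0/1 allele coding, and the `window` parameter are all illustrative assumptions, not anything taken from the post.

```python
def impute_haplotype(observed, panel, window=2):
    """Toy nearest-match imputation (an illustration, not Impute2).

    observed: list with 0/1 at typed sites and None at untyped sites.
    panel:    list of fully typed reference haplotypes (lists of 0/1).
    window:   how many of the nearest typed sites to compare against.
    """
    typed = [i for i, allele in enumerate(observed) if allele is not None]
    imputed = list(observed)

    for site, allele in enumerate(observed):
        if allele is not None:
            continue  # this site was genotyped, nothing to fill in

        # Pick the typed sites closest to the missing one.
        nearby = sorted(typed, key=lambda i: abs(i - site))[:window]

        # Score a reference haplotype by how many nearby alleles it shares.
        def matches(ref):
            return sum(ref[i] == observed[i] for i in nearby)

        # Copy the missing allele from the best-matching reference.
        best = max(panel, key=matches)
        imputed[site] = best[site]

    return imputed


panel = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
filled = impute_haplotype([0, None, 0, 1, None, 1], panel)
```

In this toy input the left half of the result is copied from the first reference haplotype and the right half from the second, which is the mosaic picture in miniature. Real tools also report a probability for each imputed genotype rather than a single hard call.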
Several programs can be used to predict genotypes. Hechter and Cheng say they used one called Impute2, which was developed by a team from the University of Oxford. They've also written a handy script that transforms 23andMe raw data files into a format that is compatible with Impute2.
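The authors' script is not reproduced here, but a rough sketch of what such a conversion might involve looks like the following. It assumes the 23andMe raw file is tab-separated with the columns rsid, chromosome, position, and genotype, plus comment lines starting with "#", and it assumes an Oxford GEN-style output line of two identifier columns, a position, two alleles, and three genotype probabilities. The function names are hypothetical, and the two alleles at each SNP must be supplied by the caller (for example, from the reference panel), since the raw file alone does not list them.

```python
import io


def parse_23andme(handle):
    """Yield (rsid, chromosome, position, genotype) from a raw data file."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        rsid, chrom, pos, genotype = line.split("\t")
        yield rsid, chrom, int(pos), genotype


def to_gen_line(rsid, pos, genotype, allele_a, allele_b):
    """Format one SNP as a GEN-style line, or return None if it cannot be coded.

    allele_a and allele_b are assumed to come from the reference panel.
    """
    if genotype == "--":
        probs = "0 0 0"  # no-call: leave the genotype unknown
    elif len(genotype) == 2 and set(genotype) <= {allele_a, allele_b}:
        # Hard call: probability 1 for AA, AB, or BB by count of allele B.
        probs = ["1 0 0", "0 1 0", "0 0 1"][genotype.count(allele_b)]
    else:
        return None  # e.g. strand mismatch or a single-allele call
    return f"{rsid} {rsid} {pos} {allele_a} {allele_b} {probs}"


sample = io.StringIO("# rsid\tchromosome\tposition\tgenotype\nrs1\t1\t100\tAG\n")
records = list(parse_23andme(sample))
```

A real converter would also need to split the output by chromosome and deal with strand alignment against the reference panel, which this sketch leaves out.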
Hechter and Cheng note that 23andMe updates its genotyping chip from time to time and that customers can upgrade their data for a fee.
The imputed results appear to be quite accurate, according to the post. "To give a rough idea, the average confidence was 97.58 percent across all SNPs [Cheng] imputed in his own genome," they write.
Hechter also says she was able to impute BRCA2 variants that are not included on the 23andMe chip and that she quite likely carries.