Gregory Marcus, senior product manager of DNA analysis, Affymetrix.
Affymetrix this week said that it has reduced the price of its two-chip 500K SNP genotyping set to $250, and that it will offer the 500K set as a single array by the end of the year and will launch a million-SNP product by the first quarter of 2007.
The company credited these advances to a collaboration with the Broad Institute of MIT and Harvard that has resulted in improved genotyping algorithms "that allow researchers to extract more information from each array without any manufacturing or assay changes."
BioInform interviewed Gregory Marcus, senior product manager of DNA analysis at Affymetrix, via e-mail to learn more about these improved algorithms and their impact on the company's genotyping offering. An edited transcript of the e-mail interview follows.
Can you explain the methodology behind the new algorithm that enables it to extract more information from the chips?
The BRLMM [Bayesian Robust Linear Model with Mahalanobis distance classifier] algorithm was developed at Affymetrix, based on work done on the 100K Array by Terry Speed of the University of California at Berkeley. His results were published online in Bioinformatics in November 2005 [Bioinformatics. 2006 Jan 1;22(1):7-12].
Affymetrix offers an open software architecture, where we supply data sets and easy access to our raw data to encourage scientists to develop new algorithms and tools.
Affymetrix was able to eliminate the mismatch probes that were not used by the BRLMM algorithm. This freed up half of the chip real estate, which enabled us to load more [relevant] content on each array.
The improved algorithm is delivered in three forms: software with a user interface, a command-line tool, and an open-source tool. Almost half of our downloads have been the open-source or command-line tool. These allow our … customers to easily integrate our methods into their existing analysis pipelines. This is unique among genotyping companies.
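[Editor's illustration: The "Mahalanobis distance classifier" in the algorithm's name refers to a general technique: a data point is assigned to the cluster whose center is nearest once distances are scaled by that cluster's covariance. The sketch below shows only that generic idea for three genotype clusters in two dimensions. It is not Affymetrix's BRLMM code; the cluster centers, covariances, and function names are hypothetical values chosen for illustration.]

```python
def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a cluster.

    cov is a 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def call_genotype(point, clusters):
    """Return the label of the cluster with the smallest Mahalanobis distance."""
    return min(
        clusters,
        key=lambda label: mahalanobis_sq(point, clusters[label][0], clusters[label][1]),
    )


# Hypothetical cluster centers (allele A signal, allele B signal) and covariances.
CLUSTERS = {
    "AA": ((2.0, 0.0), ((0.1, 0.0), (0.0, 0.1))),
    "AB": ((1.0, 1.0), ((0.1, 0.0), (0.0, 0.1))),
    "BB": ((0.0, 2.0), ((0.1, 0.0), (0.0, 0.1))),
}

print(call_genotype((1.1, 0.9), CLUSTERS))  # AB
```

[A production genotype caller would also estimate the clusters from the data and decline to call ambiguous points; neither step is shown here.]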
What are the key differences between this new algorithm and what Affymetrix was using before?
The call rates for heterozygote genotypes are now the same as the call rates for homozygote genotypes. The previous algorithm tended to undercall heterozygotes, which could have introduced a bias in downstream analysis.
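[Editor's illustration: One way to check for the kind of bias described above is to compare call rates separately for heterozygous and homozygous samples whose reference genotypes are known. The sketch below assumes a hypothetical list of (reference genotype, call) pairs, with None marking a no-call; it is not taken from Affymetrix's software.]

```python
from collections import Counter


def call_rates_by_class(records):
    """Compute the fraction of called genotypes for 'het' and 'hom' samples.

    records: iterable of (reference_genotype, call) pairs, where
    reference_genotype is a two-letter string such as "AB" and call is
    None when the algorithm made no call.
    """
    total = Counter()
    called = Counter()
    for reference, call in records:
        genotype_class = "het" if reference[0] != reference[1] else "hom"
        total[genotype_class] += 1
        if call is not None:
            called[genotype_class] += 1
    return {cls: called[cls] / total[cls] for cls in total}


# Hypothetical example: one of two heterozygotes receives no call.
records = [("AA", "AA"), ("AB", None), ("AB", "AB"), ("BB", "BB")]
print(call_rates_by_class(records))  # {'hom': 1.0, 'het': 0.5}
```

[A markedly lower rate for the "het" class than for the "hom" class would indicate that heterozygotes are being undercalled.]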
What are the primary informatics challenges of moving toward a 1-million SNP GeneChip, and how do you anticipate that this new algorithm will help address those challenges?
Our software is designed to help customers manipulate this many SNPs, allowing them to export data in formats compatible with downstream tools (e.g. Haploview) as well as in subsets. For example, you can export the data by chromosome, giving smaller units of SNPs to manipulate.
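[Editor's illustration: Splitting a large SNP set by chromosome, as described above, amounts to grouping records on a chromosome field. The sketch below assumes a hypothetical record layout of (SNP identifier, chromosome, position); it does not reflect the file formats or export functions of Affymetrix's software.]

```python
from collections import defaultdict


def split_by_chromosome(snps):
    """Group (snp_id, chromosome, position) records by chromosome.

    Returns a dict mapping each chromosome to a list of (snp_id, position)
    pairs sorted by position.
    """
    groups = defaultdict(list)
    for snp_id, chromosome, position in snps:
        groups[chromosome].append((snp_id, position))
    for chromosome in groups:
        groups[chromosome].sort(key=lambda record: record[1])
    return dict(groups)


# Hypothetical SNP records for illustration.
snps = [
    ("SNP_3", "2", 500),
    ("SNP_1", "1", 900),
    ("SNP_2", "1", 100),
]
by_chromosome = split_by_chromosome(snps)
print(by_chromosome["1"])  # [('SNP_2', 100), ('SNP_1', 900)]
```

[Each group can then be written to its own file or passed to a downstream tool as a smaller unit.]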
What did the Broad researchers contribute to these improvements? Did they help develop the BRLMM genotype-calling algorithm, or were they involved in other aspects of the development?
The Broad Institute did a lot of work creating algorithms for downstream analysis, such as the multi-marker algorithm developed with 100K and 500K data [Evaluating and improving power in whole-genome association studies using fixed marker sets. Pe'er I, de Bakker PI, Maller J, Yelensky R, Altshuler D, Daly MJ. Nat Genet. 2006 Jun;38(6):663-667].
The Broad was part of the 500K product development, helping with SNP selection, for example. We will be working closely with them on our newly announced products.
How did the collaboration with the Broad researchers come about? Did Affymetrix license any specific technology from the institute as part of the collaboration?
The collaboration grew naturally out of a close partnership between Affymetrix and the researchers at the Broad that has existed through four generations of the product. The Broad researchers are excited by the opportunity to participate more extensively in pushing the technologies forward to move the field in a direction that allows the power of whole-genome association studies to be realized as soon as possible.
No technologies are being licensed as part of this collaboration, but the two groups will collaborate closely on SNP selection and the algorithms for reducing the features necessary to extract high-quality genetic information from the arrays.
Is Affymetrix collaborating with the Broad (or other research groups) on algorithm development for other GeneChip applications?
Yes, the Broad is one of many research groups that are working on algorithms. Nancy Cox at the University of Chicago is another example of someone who is working on new genotype-calling algorithms [Coverage and Characteristics of the Affymetrix GeneChip Human Mapping 100K SNP Set. Nicolae DL, Wen X, Voight BF, Cox NJ. PLoS Genet. 2006 May 5;2(5):e67].