This story was originally posted March 3.
Affymetrix last week released a data set of 5 million variants on its website. The genotyping data set, part of the firm's Axiom Genomic Database, is based on validation of genomic variants from the Single Nucleotide Polymorphism Database, 1000 Genomes Project, the National Human Genome Research Institute's Database of Published Associations, and various collaborations.
The data set also includes genotyping data for more than 2 million validated rare and common genomic variants that Affy recently contributed to the 1000 Genomes Project, many of which were not previously available from any source. The data will be incorporated into the 1000 Genomes Project’s public data repository. Affy joined the project last year (BAN 11/9/2010).
President and CEO Kevin King said in a statement that the data release will benefit the firm's customers. "With the availability of millions of novel markers in more diverse populations, along with the increasing capacity of our microarray technology, we are providing researchers with more resources to accelerate their disease association studies," King said. "By releasing this unparalleled data set, Affymetrix is giving clinical researchers better access to valuable content and making it easier to optimize their discovery and validation efforts."
All variants in the new release were tested against a "large, diverse sample set," Affy said, including 1,300 samples across 11 populations from the International HapMap Project. Using the data set, Affy customers can design Axiom myDesign Arrays with up to 2.6 million markers for candidate gene and genome-wide association studies by disease, pathway, and population, the firm said.
The new files, available on Affymetrix's website, contain data on 5.4 million SNPs, including approximately 1.8 million from HapMap and dbSNP, approximately 3 million from the 1000 Genomes Project 2009 release, and an additional 0.6 million from collaborative discovery projects. Affy plans to expand the Axiom Genomic Database and publicly release additional data sets this year.