Affymetrix and researchers at Harvard Medical School have developed a new SNP genotyping array for studying the relationships between human populations.
Scheduled to launch later this month, the Axiom Human Origins Array is one of a growing menu of population-focused chips created by the Santa Clara, Calif.-based array vendor.
The company designed the chip in cooperation with David Reich, an associate professor of genetics at Harvard Medical School.
Reich has coauthored a number of papers on the origins of and relationships between human populations, most recently publishing a study that used SNP analysis to trace gene flow from Denisova hominins into populations in Southeast Asia and Oceania.
The primary objective of the partnership was to develop an array that could be used specifically for population genetics and evolutionary biology studies. "Most of the projects to date have been conducted using [genome-wide association studies]-based arrays, which were designed for medical genomics projects and are not optimal for population genetics studies," said Candia Brown, director of strategic marketing for Affy's genotyping business.
Brown told BioArray News that whole-genome genotyping arrays currently on the market have two specific limitations. One is that they "utilize tagging SNPs that are based on population-specific haplotype patterns," and the second is that the "manner in which the SNPs were ascertained," including details such as depth of read, quality of data, number of genomes sequenced, and sample source, is "not well documented."
These shortcomings can "confound the data analysis and estimation of allele frequencies" when used for population genetic studies, she said.
Reich agreed that the existing crop of whole-genome genotyping chips has suffered from the "convoluted" process by which SNPs were selected, making them "useless" for those studying population genetics.
"I think this is even more of an exaggerated case in terms of when there is a need for a population-specific array versus the universal array," Reich told BioArray News. "Normally, the universal arrays capture perhaps not as well as the population-specific arrays, but gather a bunch of the same information," he said. "But here, the universal arrays are useless."
Reich argued that arrays designed for studying population history should be based on well-documented SNPs. "Even if you have 2.5 million SNPs, if the discovery is not perfectly documented, then you cannot use them for population genetics," he said. "You really need to start from scratch and identify variations in samples of known ancestry and document them," Reich added. "And that is what was done here."
Reich and fellow researchers have in the past recommended a "relatively straightforward SNP selection strategy" in which SNPs are discovered by "comparing two chromosomes from the same individual of known ancestry, and then genotyping them in a larger panel of samples from the same population," Brown noted, citing a 2007 Nature Genetics paper co-authored by the Harvard geneticist.
This was the path the collaborators pursued for the design of the Axiom Human Origins Array. The array contains approximately 629,000 SNPs, an aggregate of 13 population-specific panels, each containing between 10,000 and 150,000 markers per population.
The 13 populations represented on the array include San Bushmen, Yoruba, Mbuti Pygmies, French, Sardinian, Han, Cambodian, Mongolian, Karitiana, Papuan, Bougainville, Chimpanzee, and Denisova.
Reich said that the SNPs on the array were discovered in samples that had been sequenced previously. "SNPs were chosen entirely based on the sample in which the discovery was done, not using a database, because that would have introduced bias," he said.
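The selection strategy described above can be illustrated with a minimal sketch. It is not the collaborators' actual pipeline: the function names, the short toy sequences, and the representation of each chromosome as a plain string are all assumptions made for illustration. Sites are discovered only where the two chromosomes of one discovery individual differ, and allele frequencies are then estimated at those sites in a separate panel.

```python
from typing import Dict, List, Tuple


def discover_het_sites(chrom_a: str, chrom_b: str) -> List[int]:
    """Return positions where one individual's two chromosomes differ.

    Hypothetical helper: real discovery would work from sequencing reads,
    not from two perfectly known strings.
    """
    if len(chrom_a) != len(chrom_b):
        raise ValueError("chromosomes must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(chrom_a, chrom_b)) if a != b]


def panel_allele_frequencies(
    sites: List[int],
    chrom_b: str,
    panel: List[Tuple[str, str]],
) -> Dict[int, float]:
    """Estimate, per discovered site, the frequency of the chrom_b allele.

    `panel` is a list of diploid samples, each given as a pair of
    chromosome strings (an assumption for this toy example).
    """
    freqs: Dict[int, float] = {}
    total_chromosomes = 2 * len(panel)
    for site in sites:
        target = chrom_b[site]
        count = sum(
            (first[site] == target) + (second[site] == target)
            for first, second in panel
        )
        freqs[site] = count / total_chromosomes
    return freqs


# Toy discovery individual of known ancestry (made-up sequences).
discovery_a = "ACGTAC"
discovery_b = "ACTTAG"
sites = discover_het_sites(discovery_a, discovery_b)

# Toy panel of two diploid samples from the same population.
panel = [("ACGTAC", "ACTTAC"), ("ACTTAG", "ACGTAC")]
frequencies = panel_allele_frequencies(sites, discovery_b, panel)
print(sites, frequencies)
```

Because every site enters the set through the same documented rule, and no external database is consulted, the way each SNP was ascertained is known exactly, which is the property Reich describes as necessary for population genetics.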
The array also contains an additional 87,000 markers, comprising markers in mitochondrial DNA and on chromosome Y as well as markers from Affy's Axiom Genomic Database that overlap with the company's SNP Array 6.0 and Illumina's HapMap 650Y BeadChip, to allow researchers to compare their data with those from previous studies.
Brown added that Affy genotyped 934 biological samples from the Human Genome Diversity Panel developed by the Human Genome Diversity Project and the Centre d'Etude du Polymorphisme Humain and posted the genotype data set to the CEPH database. The data set is now available to the scientific community for unrestricted use.
Reich described it as the "world's best data set for population genetics" and argued that it is "better than sequencing datasets because sequencing sets have unreliability in allele calls, biases, and alignments with reference sequences."
The chip is priced at $250 per sample, which includes the "cost of the array, reagents and plasticware to process the sample," Brown said.
Population Focus
The new chip is in line with Affy's recent focus on making population-focused arrays available for genotyping studies. The company has to date launched four such arrays: the Axiom Genome-Wide CEU 1 Array for European populations; the Axiom Genome-Wide ASI 1 Array for studying East Asian genomes; the Axiom Genome-Wide CHB 1 Array for studying the Han Chinese genome; and the Axiom Genome-Wide PanAFR Array Plate Set, with coverage of West and East African populations.
The company also has plans to commercialize arrays covering disease-associated genes in several populations. These include an Axiom Genome-Wide EUR Array covering European ancestry, an Axiom Genome-Wide EAS Array for those with East Asian ancestry, an Axiom Genome-Wide AFR Array for studies of admixed populations of West African and European ancestry, and an Axiom Genome-Wide LAT Array for studying admixed populations of Native American, European, and West African ancestry.
The latter four chips were developed with researchers at the University of California, San Francisco, and Kaiser Permanente (BAN 4/26/2011).
According to Brown, the newest chip, the Human Origins Array, will be marketed to population geneticists and evolutionary biologists for basic research. It will also be "very useful" for consumer genomics firms that offer ancestry testing, she said.
Reich said he expects the Human Origins chip to be ideal for large international consortia trying to characterize variation in different populations, where "unbiased content" is a priority. "You don't need an array that has billions of SNPs. What you need is at least 100,000 SNPs, but SNPs that are well chosen," he said.
Reich credited Affy with making an array that "population geneticists wished we would have had five years ago." He said he is unaware of any competitive product on the market.
Of his relationship with the firm he said, "We would have been happy to partner with another company that makes arrays, but Affy was proactive and aggressive in identifying this as a product that they thought was unique."