Expecting 'Deluge' of Data, Vendors and Users Prepare for Launch of 1M SNP Chips

Over the past two weeks, both Affymetrix and Illumina have announced plans to launch 1-million SNP whole-genome genotyping products in the second quarter.
Affymetrix revealed its plans first on Jan. 8, pledging a 1-million SNP product in Q2 that will build on its newly launched SNP 5.0 Array that combines 500,000 SNPs with an additional 500,000 copy number measurements. Illumina followed two days later with plans to release its Human 1M BeadChip, which similarly offers SNPs and content for copy number variation analysis (see BAN 1/9/2007, BAN 1/16/2007).
However, while many customers have welcomed the higher-density genotyping chips, they are also preparing to handle the informatics challenges presented not only by 1-million SNP arrays but also by future generations of even higher-density whole-genome genotyping arrays.
“It’s a huge informatics challenge. No question about it,” said Jeff Gulcher, chief scientific officer at DeCode Genetics, which is collaborating with Illumina on several fronts and played a role in selecting content for the forthcoming Human 1M BeadChip.
“When we selected Illumina over Affymetrix, we were impressed with how they dealt with this deluge of data,” he told BioArray News this month. “We were able to keep track of the data through their laboratory information management system, but once you generate this deluge of data it becomes difficult to keep track of it all, and to even get it into a database and out of a database,” Gulcher said.
“It can take hours to get information just from a small subset of all your genotypes from a relational database. So we’ve had to make our own adjustments to that,” he added.
According to Gulcher, DeCode has Illumina’s LIMS on the front end and has “made alterations downstream to manage this analytical data” using existing software systems to “match phenotype with genotype in a very efficient way.”
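To give a rough sense of the scale behind Gulcher's remarks, the following back-of-envelope sketch counts the rows a naive one-row-per-genotype relational table would hold. The cohort size, the table layout, and the bytes-per-row figure are illustrative assumptions, not details reported by DeCode or Illumina.

```python
# Back-of-envelope sketch: how many rows would a naive
# "one row per (sample, SNP) genotype" table contain?
# All numbers below are hypothetical, chosen only for illustration.

def genotype_rows(n_samples: int, n_snps: int) -> int:
    """Rows in a table that stores one genotype call per row."""
    return n_samples * n_snps


def table_size_gib(n_rows: int, bytes_per_row: int) -> float:
    """Approximate raw table size in GiB for an assumed row width."""
    return n_rows * bytes_per_row / 2**30


N_SAMPLES = 1_000      # assumed cohort size
BYTES_PER_ROW = 16     # assumed: sample id + SNP id + call, no indexes

for n_snps in (500_000, 1_000_000):
    rows = genotype_rows(N_SAMPLES, n_snps)
    size = table_size_gib(rows, BYTES_PER_ROW)
    print(f"{n_snps:>9,} SNPs x {N_SAMPLES:,} samples = "
          f"{rows:,} rows (~{size:.1f} GiB before indexes)")
```

Under these assumptions, doubling the SNP count doubles the row count, so a query that touches even a small fraction of the table still has to work through a very large index.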
Illumina declined to comment on any changes it may make to its software or LIMS in advance of the release of the Human 1M BeadChip.
Making informatics adjustments in house, rather than seeking an off-the-shelf fix for challenges related to higher-density arrays, seems to be fairly commonplace in an area where companies and research institutes need informatics tailored to their needs.
Fanny Chagnon, client manager of the McGill University and Génome Québec Innovation Center, which offers both Illumina and Affymetrix genotyping platforms to Canadian researchers, said her organization already made internal adjustments to handle Illumina and Affy’s 500,000 SNP products last year that should hold when the 1-million SNP products are released.
“To use the Affy 500K [arrays], we’ve had to arrange our database so that it could handle such a body of data,” Chagnon told BioArray News last week. “But the way we’ve arranged it, we can handle much more data, so I think we’ll have no problem [with] a 1-million SNP product,” she said.
Chagnon also pointed out that the 1-million SNP chips will not be in high demand at Génome Québec, a situation that could lighten the data load at the resource. She said that researchers are still using Illumina’s HumanHap300 chip, which was launched in late 2005, more than the HumanHap550, which was launched last April.
“There has been quite a bit of demand for the 550, however the cost is slightly prohibitive. Researchers use the 300 more right now,” Chagnon said. She added that all of the informatics tools used to handle data from the chips were developed in house.
Challenge at Hand
According to Steve Lincoln, vice president of informatics at Affymetrix, customers such as Chagnon’s center that have already developed infrastructure to support Affy’s 500K genotyping product should be able to upgrade to the 1-million SNP 6.0 Array with relative ease.
“Our successive product updates are based on the same proven assay and same general informatics approach,” Lincoln wrote in an e-mail to BioArray News this week. “The new SNP 6.0 Array will be able to leverage the emerging best practices and most current systems for data management, data quality control, and data analysis that the 500K and SNP 5.0 arrays now use,” he wrote. 
Still, Lincoln acknowledged that 6.0 users will face “real-world issues of phenotyping consistency [and] population stratification” and that Affy is drawing on its relationships with its customers to provide the user community with the necessary tools to make sense of all the data they are getting from the company’s genotyping products.
“Affymetrix continues to collaborate with a number of leading academic groups in developing both the genotyping and the copy number functionality on our whole-genome arrays,” Lincoln wrote.
He cited Affy’s relationships with the Broad Institute and the University of California, Berkeley, which helped the firm develop an algorithm that is central to its new genotyping assays (see BAN 7/25/2006).


“Statisticians, laboratory staff, and software developers at the Broad have been working very closely with the Affymetrix group on the full spectrum of chip design, genotype calling, data quality control, and data-analysis issues,” Lincoln wrote. “We continue to leverage their practical experience in running a very high-throughput laboratory,” he added.
Lincoln also said that customers who need additional software tools can turn to academic and commercial sources to meet their challenges. “While Affymetrix provides the basic workflow ‘plumbing’ for such data, through the open systems approach I mentioned, literally dozens of tools from academic and commercial developers have already become available [that] support the Affymetrix arrays,” he said.
The need to provide those tools is not lost on software developers like Ocimum Biosolutions, a Hyderabad, India-based life sciences firm that sells its own GeneChip-compatible analytical software. Ocimum CEO Anuradha Acharya told BioArray News this week that the company doesn’t have software available for high-density SNP analysis yet but that Ocimum is in the “process of introducing analysis on high-density arrays” that call for “different and more complex algorithms.”
So far the only informatics company to announce high-density SNP capabilities for the Affy platform has been Partek, which said last week that its Genomics Suite software was deemed compatible by Affy for SNP association studies (see BAN 1/16/2007).
Partek CEO Tom Downey said that the impending launch of the 1-million SNP arrays from Illumina and Affy “doesn’t faze” the company, and that Partek is working now on Affy’s tiling arrays that have 50 million features.
“To go from 500,000 to a million is no big deal,” Downey said. “Now, that’s not true of other software and most of the freeware out there, but Partek has special designs in our software that make especially efficient use of memory and speed,” he added.
He said that Partek at this time is not planning any upgrades to Genomics Suite to handle the new products. “My understanding is that Partek will work out of the box on the 1-million SNP chip with no upgrades required,” he said.
