Perlegen Sciences, the $100 million Mountain View, Calif., spinoff of Affymetrix, is devouring thousands of its parent company’s GeneChips in its ambitious project to use array technology to capture the entire spectrum of human genetic variation.
Perlegen scientists published an article in the November 23, 2001, issue of Science describing their proof-of-principle study, which used GeneChip wafers to find single-nucleotide polymorphisms on chromosome 21. In this study alone, which had a sample size of 20 people, the researchers used 160 wafers, or a total of 7,840 GeneChips.
But now, said Perlegen CEO Brad Margus, “we have at least half of the chromosomes in our pipeline somewhere, and our plan is by August or September next year to complete the genome.”
The Perlegen researchers are looking at 50 human genomes. Margus estimates they go through about 250 wafers per genome, which means that the company will use a total of 12,500 wafers or 612,500 GeneChips by the time the study is complete.
At retail prices for GeneChips, this study could cost up to $600 million. But Affymetrix, which still has a significant equity position in Perlegen, is providing the chips at discounts that most GeneChip users could only dream of.
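The chip counts above follow from simple multiplication, sketched below. The figure of 49 GeneChips per wafer comes from the wafer description later in this article; the per-chip retail price is only what the $600 million upper bound would imply, not a published list price.

```python
# Back-of-the-envelope check of the wafer and chip counts quoted in the article.
CHIPS_PER_WAFER = 49  # one wafer holds 49 GeneChips, per the article


def chips_used(wafers: int, chips_per_wafer: int = CHIPS_PER_WAFER) -> int:
    """Return the number of GeneChips contained in a given number of wafers."""
    return wafers * chips_per_wafer


# Chromosome 21 proof-of-principle study: 160 wafers.
pilot_chips = chips_used(160)

# Whole-genome project: 50 genomes at roughly 250 wafers each.
total_wafers = 50 * 250
total_chips = chips_used(total_wafers)

# Assumption: the "up to $600 million" retail estimate spread evenly over all chips.
implied_price_per_chip = 600_000_000 / total_chips

print(f"Pilot study chips:    {pilot_chips:,}")
print(f"Project wafers:       {total_wafers:,}")
print(f"Project chips:        {total_chips:,}")
print(f"Implied retail price: ${implied_price_per_chip:,.0f} per chip")
```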
As a result, said Margus, this method of characterizing SNPs in the human genome is much cheaper and more efficient than using large sequencing machines.
“We’ve figured out that one of our technicians can prepare the sample and hybridize it to two or three wafers in an eight-hour day, and can read as many bases in a day as 100 ABI or dideoxy sequencers.”
Perlegen plans to use the SNPs it finds in whole-genome scans for disease-related genes. After finding the total number of SNPs in its sample of 50 genomes, said Margus, the company will narrow these down to the common SNPs. Then, by determining haplotype structure (the specific location of blocks of SNPs that co-vary), it will find the most relevant SNPs.
“Now, once you have these 300,000 or so SNPs, you look at those SNPs in people who have disease compared to people who don’t have a disease,” Margus said. “With those genes that differ, you know for sure they are involved in the cause of the disease. Some may be genes or proteins that you can do something about.” In other words, the company hopes to find a trove of drug or diagnostic targets.
How Is Perlegen Doing It?
Perlegen scientists detailed their method for attacking this mammoth task in the Science paper, “Blocks of Limited Haplotype Diversity Revealed by High-Resolution Scanning of Human Chromosome 21.”
First, the group had oligonucleotide arrays made from the sequence of chromosome 21. These arrays, which Affymetrix makes in whole “wafers” of 49 GeneChips using its proprietary photolithographic process, carry a 25-mer oligonucleotide probe on each feature of the chip. To detect a SNP, the chips group the features into squares of four. Each feature in a square holds an oligo probe that is identical to the others in its square except at the 13th base. This base varies, so that each of the four bases is represented once.
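The four-probe layout described above can be sketched in a few lines. This is a simplified illustration, not Affymetrix's actual chip design software: the function names and the intensity values are hypothetical, and the sketch ignores details such as probe strand and complementarity.

```python
# Simplified model of one "square of four" features on a resequencing array.
BASES = "ACGT"
PROBE_LENGTH = 25   # each probe is a 25-mer
CENTER_INDEX = 12   # the 13th base, counted from zero


def probe_quartet(reference_25mer: str) -> dict[str, str]:
    """Build four probes that differ only at the 13th base."""
    if len(reference_25mer) != PROBE_LENGTH:
        raise ValueError("expected a 25-base reference window")
    prefix = reference_25mer[:CENTER_INDEX]
    suffix = reference_25mer[CENTER_INDEX + 1:]
    return {base: prefix + base + suffix for base in BASES}


def tile(reference: str) -> list[dict[str, str]]:
    """Slide a 25-base window along the reference, one quartet per position."""
    windows = range(len(reference) - PROBE_LENGTH + 1)
    return [probe_quartet(reference[i:i + PROBE_LENGTH]) for i in windows]


def call_base(intensities: dict[str, float]) -> str:
    """Hypothetical base call: pick the feature with the brightest signal."""
    return max(intensities, key=intensities.get)


quartet = probe_quartet("A" * 12 + "G" + "T" * 12)
for base, probe in quartet.items():
    print(base, probe)

# Made-up hybridization intensities for the four features of one square.
print(call_base({"A": 0.1, "C": 0.2, "G": 0.9, "T": 0.1}))
```

A sample whose central base differs from the reference would light up a different feature in the square, which is how a candidate SNP shows up in this simplified picture.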
For the hybridization, the researchers extracted copies of chromosome 21 from the somatic cell samples of 24 different people taken from a range of ethnic groups, choosing one chromosome for each person. After they discarded unreliable samples, 20 chromosomes remained. They then hybridized the samples to the arrays.
Using this method, the Perlegen group was able to locate 35,989 SNPs. To winnow down this information, the Perlegen investigators kept only the SNPs with an allele frequency of ten percent or greater, a group that totaled over 24,000.
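With 20 chromosomes in the sample, a ten percent cutoff means the less common allele must appear on at least two of them. The sketch below shows that filter on toy data. It assumes the cutoff applies to the minor (less common) allele, which the article does not state explicitly, and the SNP names and base calls are invented.

```python
from collections import Counter
from fractions import Fraction

THRESHOLD = Fraction(1, 10)  # the ten percent cutoff quoted in the article


def minor_allele_frequency(calls: str) -> Fraction:
    """Frequency of the second most common allele among the base calls."""
    counts = Counter(calls).most_common(2)
    if len(counts) < 2:
        return Fraction(0)  # every chromosome carries the same base: not a SNP
    minor_count = counts[1][1]
    return Fraction(minor_count, len(calls))


def is_common(calls: str, threshold: Fraction = THRESHOLD) -> bool:
    """True when the minor allele meets or exceeds the threshold."""
    return minor_allele_frequency(calls) >= threshold


# Toy data: one base call per chromosome (20 chromosomes) at three sites.
sites = {
    "site_1": "A" * 18 + "G" * 2,   # minor allele on 2 of 20 chromosomes
    "site_2": "C" * 19 + "T",       # minor allele on 1 of 20 chromosomes
    "site_3": "G" * 12 + "A" * 8,   # minor allele on 8 of 20 chromosomes
}

common_sites = [name for name, calls in sites.items() if is_common(calls)]
print(common_sites)
```

Exact fractions are used instead of floating-point numbers so that a site sitting exactly on the ten percent boundary is not lost to rounding.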
Next, the investigators sought to isolate the haplotype blocks. Using an algorithm, they were able to find a single block that included 26 SNPs and spanned 19 kilobases. But they also found that other blocks contained only two or three SNPs. This variation in block size indicated that the earlier approach of studying genetic variation by chopping the chromosome into 10 kb blocks would not be a valid strategy.
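To make the idea of a haplotype block more concrete, the sketch below counts how many distinct allele patterns appear across a stretch of SNPs and how many chromosomes are covered by patterns seen more than once. It is an illustration only, not the algorithm from the Science paper, which this article does not describe. The function name and the toy chromosomes are hypothetical.

```python
from collections import Counter


def haplotype_summary(chromosomes: list[str], start: int, end: int) -> tuple[int, float]:
    """Summarize haplotype diversity over SNP positions start..end-1.

    Each chromosome is a string with one allele per SNP position.
    Returns the number of distinct haplotypes in the window and the
    fraction of chromosomes carrying a haplotype seen more than once.
    """
    haplotypes = Counter(chrom[start:end] for chrom in chromosomes)
    shared = sum(n for n in haplotypes.values() if n > 1)
    return len(haplotypes), shared / len(chromosomes)


# Toy data: five chromosomes typed at four SNP positions.
chromosomes = ["AAGT", "AAGT", "CCTA", "CCTA", "AAGA"]

# The first three SNPs co-vary: only two patterns account for every chromosome.
print(haplotype_summary(chromosomes, 0, 3))

# Adding the fourth SNP introduces a third pattern carried by one chromosome.
print(haplotype_summary(chromosomes, 0, 4))
```

In this toy case the first three SNPs behave like a block, since typing any one of them predicts the other two, while the fourth SNP breaks the pattern.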
“The great news is that these haplotypes are real, and really a way that the world can reduce the number of SNPs you have to look at to define people’s genomes,” said Margus. “The bad news is that, in order to find the haplotypes, you’re going to have to look through every base of many genomes.”
While the study shows that Perlegen has a viable SNP-finding strategy, it will still have to prove that this mass of data is relevant to drug development.
“The real question is not whether microarrays have potential uses for SNPs,” said SNP Consortium CEO Arthur Holden, “but how difficult it is going to be to use the variation in the genome to understand the progression of common diseases, and to use this information to find drug targets?”