The Southwest Foundation for Biomedical Research has recently installed its first next-generation sequencing instrument, an Illumina Genome Analyzer, and it plans to apply the technology to examine the genetic roots of complex disease in a large-scale population-based study.
Researchers at the San Antonio-based institute plan to sequence chromosome 19 in about 1,000 Mexican-Americans, and correlate the sequences with phenotypic traits they have already mapped to regions of that chromosome. The scientists have been studying this population of approximately 40 families for more than 15 years by a variety of methods, including brain scans, whole-genome SNP genotyping, and transcriptional profiling.
Eventually, the SFBR researchers would like to sequence the entire genomes of the study participants, but until the cost of sequencing comes down, they will start on a smaller scale. “It would be a good chromosome for us to go after as a proof of principle, in preparation for whole-genome resequencing, [to show] that we can handle such a large amount of data, be able to parse through it all, and rapidly prioritize most likely functional variants,” said John Blangero, director of the AT&T Genomics Computing Center at SFBR.
Blangero and his colleagues have already developed statistical methods to identify likely functional variants in DNA sequences, and have applied these to individual genes. “We reckon, if we can do it on the level of the gene, it’s just a scale thing,” he said. His center has recently scaled up its hardware: with a $1 million gift from the AT&T Foundation, it has doubled the number of processors working in parallel, to 3,000 CPUs. “Our statistical program of research is all oriented towards what happens if you have complete genomic information on everybody in your sample,” he said. “Are we going to be able to sort through and rapidly find the true things? That’s what we are focusing on.”
As a proof-of-concept study for the chromosome 19 project, the researchers are currently getting ready to sequence the chromosome in about a dozen samples, using an Illumina Genome Analyzer that the institute acquired in May with the help of a $300,000 gift from the Elizabeth Huth Coates Charitable Foundation. About 150 processors that are separate from the computing center’s main cluster are dedicated to sequence alignment and assembly. “It is a formidable program to do the piecing together of the sequence,” Blangero said. “We did not want that to be a bottleneck.”
The researchers already know of about 300 quantitative trait loci, or QTLs, on chromosome 19 that influence expression levels of genes correlated with various common diseases. They chose to sequence an entire chromosome, rather than genomic regions within it, because preparing a chromosome by flow-sorting appeared to be easier than amplifying 10 or 20 megabases of DNA by long-range PCR.
Chrombios, a German company, has already flow-sorted chromosomes 19 and 17 from several samples, but the yield, on the order of nanograms, is not high enough for sequencing, which requires about a microgram of starting material. At the moment, the SFBR researchers are working on optimizing the amplification of the chromosomal DNA, said Eric Moses, an associate scientist in the complex disease genetics laboratory at SFBR.
SFBR researchers have already studied the Mexican-American cohort extensively by other means, taking a variety of measurements related to diseases like diabetes, heart disease, obesity, and osteoporosis. They have also genotyped the participants with short tandem repeat markers, and they are currently conducting high-dimensional brain scans of them. “It’s a population that we have been following for years for, basically, normal human biological variation,” Blangero said.
Recently, the scientists used Illumina’s BeadExpress platform to complete a whole-genome transcriptional profiling analysis of lymphocytes collected from the participants 15 years ago. The study has been accepted for publication in Nature Genetics.
The researchers are also in the midst of a whole-genome association study using Illumina’s Infinium 500K and 1 million SNP chips, according to Moses.
Up until now, they have resequenced candidate genes in linkage regions in order to find variants that are responsible for the phenotype, identifying these candidates by SNP genotyping across the region. “The alternative approach is to sequence all the people that have contributed to that linkage signal across that region, and the functional variants should be there; it should be in the sequence data,” Moses explained. “The challenge is to analyze that data and find which of the variants that you have identified by resequencing is the right one.” That, he said, needs to be confirmed experimentally.
Resequencing entire genomes could also replace SNP genotyping, Blangero suggested. “We basically want to skip this intermediate step where you just type known SNPs because those SNPs tend to be common variation. It’s not likely to be where the real action is,” he said. “We really think that the heart of human variation is in rare variations.”
The SFBR researchers are not the only group planning to follow up on large-scale genotyping studies with next-gen sequencing. Earlier this year, three research consortia that had published genome-wide association studies told In Sequence that they are now planning sequencing projects (see In Sequence 5/1/2007).
In addition to genome resequencing, the SFBR scientists are planning to use the new Illumina sequencer for expression analysis of genes and microRNAs, especially in animals — for example, baboons from the SFBR’s primate center. “It’s a great way to do quantitative transcription, even of unknown products,” Blangero said.
One reason the SFBR researchers decided to acquire Illumina’s next-generation sequencer was their good experience with the company’s other platforms, both in terms of quality and customer support, Moses said. Although the researchers had already decided to get the platform when it was still sold by Solexa, “that decision was strengthened when we found out that Illumina was going to be the company [behind it],” Moses said.
However, Blangero said that the deciding factor was high throughput at low cost. In fact, the researchers were also interested in Applied Biosystems’ SOLiD sequencer, but when they were ready to buy earlier this year, ABI did not have the instrument available yet.
“That’s not to say that we won’t buy one of theirs in the future,” Moses said, adding that SFBR researchers “will be looking at [the SOLiD] very closely” as they gain experience with Illumina’s Genome Analyzer. SFBR owns two ABI 3730 capillary electrophoresis sequencers and is “very happy with ABI” as well, he said.
“If you want to be on the front edge, you have to hedge your bets and probably go for more than one technology,” Blangero said.