ANN ARBOR, Mich.--According to Francis Collins, director of the US National Human Genome Research Institute and the Human Genome Project, the Holy Grail of biology is not the complete DNA sequence of the human genome. Instead, he believes that goal--likely to be reached by 2003, if not sooner--is just a means to a greater end: understanding the genetics of common human diseases. "We would like to uncover the causes of all disease," Collins told an audience at the University of Michigan in mid-April. To help scientists begin, Collins said, "the genome project is developing power tools." He explained, "That's what we do. We build 'em and give 'em away."
Collins, who has been on official leave from his faculty post at the University of Michigan for
seven years, delivered the concluding seminar in a speaker series here, "Bioinformatics and the Genome Project." His lecture touched on topics as wide-ranging as the dangers of basic science patents, the pros and cons of new sequencing and genotyping technologies, and the future role of genetic testing. Above all, he stressed the genome project's effort to catalog human variation--the engine that he predicted will drive the world's medical research effort in the first decade of the next century.
Why bother cataloging variation? "We need a new strategy," Collins explained. Gene hunting using the traditional method of linkage analysis followed by positional cloning has worked well for rare, single-gene diseases and for the small percentage of common diseases where risk is passed down in a Mendelian, single-gene fashion, such as BRCA1 and BRCA2 in breast cancer. But it won't work for the vast majority of diseases where several genes contribute to risk. "What we really need are the kind of power tools that would help us understand not only the Mendelian 5 percent but the very non-Mendelian 95 percent," Collins said, asking, "Truly polygenic disease--how are you going to get at it?"
One solution is association studies, comparing "affecteds" versus "unaffecteds." This approach has yielded genes for Alzheimer's disease (ApoE), thrombosis (Factor V Leiden), and AIDS resistance (CCR5), among others. "But in every one of these [the gene] arose from a hunch," Collins noted. "This is not something that would have succeeded if you didn't already have a pretty good guess about the biology. And you know, that's not going to cut it, because if you're talking about diabetes, or hypertension, or coronary artery disease, or schizophrenia," he contended, "you just don't know enough."
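The statistical core of such an association study is a simple comparison of allele frequencies between the two groups. As a minimal sketch, with entirely hypothetical allele counts (a real study would type many markers and correct for multiple testing):

```python
# Minimal sketch of a case-control association test at a single SNP.
# The allele counts below are made up for illustration.

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]],
    using the shortcut formula for 2x2 tables."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Allele counts: risk allele vs. other allele, in affecteds vs. unaffecteds.
# 200 cases and 200 controls contribute 400 alleles each.
cases_risk, cases_other = 240, 160
controls_risk, controls_other = 180, 220

stat = chi_square_2x2(cases_risk, cases_other, controls_risk, controls_other)
print(f"chi-square = {stat:.2f}")  # compare to 3.84 (p = 0.05, 1 degree of freedom)
```

A statistic well above the critical value flags the marker as associated with disease status; the catch, as Collins notes, is knowing which gene to test in the first place.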
Collins's solution? Whole-genome association studies, using a map of single-nucleotide polymorphisms (SNPs). The Genome Project is now building a catalog of 100,000 SNPs, and the recently announced drug company/university SNP Consortium will add at least 150,000 more. Unfortunately, even with a high-density SNP map in place, checking individuals for these genetic variants will be tedious and expensive using current technology. A study of diabetes, Collins estimated, would require 100 million separate genotype reactions. "That's going to be pretty painful, unless you have a technology that scales awfully well," he noted.
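The arithmetic behind that estimate is a straightforward multiplication. A sketch, assuming (the article does not say) a study of about 1,000 affecteds and unaffecteds typed at every marker in a 100,000-SNP map:

```python
# Back-of-the-envelope scale of a whole-genome association study.
# The subject count is an illustrative assumption, not a figure from the talk.
snps = 100_000      # markers in the planned SNP catalog
subjects = 1_000    # hypothetical affecteds plus unaffecteds
reactions = snps * subjects
print(f"{reactions:,} genotype reactions")  # 100,000,000
```

Every extra subject adds another 100,000 reactions, which is why per-genotype cost dominates the feasibility of the whole approach.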
One obvious possibility is the DNA chip, commercialized by Affymetrix and other companies. The technology is fast and accurate, but still "there are some concerns," Collins said. "One is, the chip has to be made from a certain design. So if you want to change the design, it's very expensive, you have to start from scratch," he explained, adding, "It's not very amenable to fine-tuning."
Expense is another drawback to the chip technology, Collins asserted. Some can genotype up to 2,000 SNPs in one fell swoop, but cost about $2,000 per use, "well outside the range of most people's budgets," he remarked.
Collins described an entirely different genotyping method using mass spectrometry. The technique, developed by San Diego biotech firm Sequenom, distinguishes nucleotide bases according to mass as measured by a MALDI-TOF mass spectrometer, instead of relying on gel separations. "Will this turn out to be the method of choice?" Collins asked. "I don't know, but it's certainly come a long way in the last couple of years." He predicted that on cost and speed the Sequenom instrument would be a "significant player."
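The principle behind mass-based genotyping can be sketched simply: after a single-base extension at the SNP site, the two alleles produce extension products that differ by the mass of the incorporated nucleotide, and the spectrometer reads that gap directly. The sketch below uses standard average residue masses for DNA nucleotides; it illustrates the idea only, not Sequenom's actual assay chemistry.

```python
# Sketch of mass-based allele discrimination: the two alleles of a SNP
# differ by the mass of the incorporated base, which MALDI-TOF resolves.
# Average residue masses in daltons (standard values for DNA).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

def allele_mass_difference(base1, base2):
    """Mass gap the spectrometer must resolve for a base1/base2 SNP."""
    return abs(RESIDUE_MASS[base1] - RESIDUE_MASS[base2])

for pair in [("A", "G"), ("C", "T"), ("A", "T")]:
    print(pair, f"{allele_mass_difference(*pair):.2f} Da")
# The smallest gap, A vs. T at roughly 9 Da, sets the resolution the
# instrument needs -- well within reach of MALDI-TOF.
```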
Collins also spoke to the relative merits of the two DNA sequencing machines vying for dominance, now that the genome project's sequencing phase is roaring into full gear. Both devices--the much-publicized Perkin-Elmer ABI Prism 3700 and Molecular Dynamics' MegaBACE--employ capillary electrophoresis sequencing.
"It's really a horserace," Collins said, pointing to each instrument's pros and cons. The MegaBACE is capable of producing long reads, he said, but observed that "it seems to have some difficulties in terms of failure rate, and we don't quite understand what the problem is." The 3700, Collins continued, "appears also to be a wonderful instrument, although its read length is somewhat shorter."
Collins also lamented the lack of a large-scale sequencing technology cheap enough for the typical university research lab to use. Such work "can really only be done in a few places, because it's very expensive and requires a very large operation," he said. "Millions of dollars and hundreds of employees and oodles of square feet--I mean most places really don't have the stomach for that." Collins mentioned chip miniaturization technology developed by David Burke at the University of Michigan as one approach that might someday make large-scale sequencing affordable in academia. "Either we will see sequencing eventually merge completely into the private sector, and academics will no longer do it," Collins contended, "or, preferably, technology such as this will reduce the cost and make it more accessible to the average investigator. That's clearly where we need to go."