By Monica Heger
This story, originally published on March 10, has been updated to include outside comment.
Two groups of researchers, one from Baylor College of Medicine and one from the Institute for Systems Biology, demonstrated in separate studies the ability of whole-genome sequencing to analyze genetic inheritance in a family and identify causative genes of Mendelian diseases.
The two groups used different sequencing strategies. The Baylor team used Life Technologies' SOLiD platform to sequence the whole genome of just one patient and then genotyped the probable functional variants in the other affected family members, while the ISB group hired Complete Genomics to perform whole-genome sequencing on all four family members. The researchers from both studies say that their approaches will be useful for studying not only Mendelian diseases, but also more complex diseases.
"I think these are two landmark studies that inch the field forward towards diagnosing using genomic technology," said Stephan Zuchner, director of the Center for Human Molecular Genomics at the University of Miami.
"Given how the speed of whole-genome sequencing has increased dramatically and the costs have dropped, it has the potential to blow right by genome-wide association studies and exon-based sequencing" for identifying causative genes in genetic diseases, said John Porter, the program director at the National Institute for Neurological Disorders and Stroke.
In the Baylor study, published in the New England Journal of Medicine, the researchers studied a family of ten, in which four out of eight children were affected with Charcot-Marie-Tooth neuropathy, an inherited neuropathy that is characterized by the malfunction of peripheral nerves and muscle atrophy. They sequenced the entire genome of one of the affected individuals — James Lupski, professor of molecular human genetics at Baylor and lead author of the study — and did a genotype analysis of potential functional variants in the other affected members.
They obtained 89.6 gigabases of mappable sequence data, which included 8.3 gigabases of 35 base pair fragment reads, 30.3 gigabases of 25 base mate-paired reads, and 51 gigabases of 50 base mate-paired reads. They then analyzed 30 genes previously known to be involved with the disease and an additional 10 loci in the other affected members.
They identified around 3.4 million SNPs and 234 copy number variants. None of the copy number variants fell within genes thought to be involved with any neuropathies. To narrow the 3.4 million SNPs, they first eliminated all those found in public databases of known variants. Next they selected only variants found within the probable genes and loci, which narrowed the list to 650. Among those 650, they found one missense mutation and one nonsense mutation in the same gene. All affected individuals had both mutations, and the parents each had one copy of one of the mutations.
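The filtering steps described above can be sketched in a few lines of Python. This is only an illustration, not the Baylor team's pipeline: the `Variant` record, the `filter_funnel` function, the gene names, and the toy data are all hypothetical, and the final step (keeping only missense and nonsense changes) is an assumption about how the last two candidates might be singled out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    gene: Optional[str]  # None when the variant lies outside any annotated gene
    effect: str          # e.g. "missense", "nonsense", "synonymous"
    known: bool          # True if already listed in a public variant database


def filter_funnel(variants, candidate_genes):
    """Apply the narrowing steps in order and return each intermediate stage."""
    # Step 1: drop variants already present in public databases.
    novel = [v for v in variants if not v.known]
    # Step 2: keep only variants inside the candidate genes and loci.
    in_candidates = [v for v in novel if v.gene in candidate_genes]
    # Step 3 (assumed): keep only protein-altering changes.
    damaging = [v for v in in_candidates if v.effect in ("missense", "nonsense")]
    return novel, in_candidates, damaging


# Toy data: gene names and positions are made up for illustration.
variants = [
    Variant("chr1", 100, "GENE_A", "missense", False),
    Variant("chr1", 200, "GENE_A", "nonsense", False),
    Variant("chr1", 300, "GENE_A", "synonymous", False),
    Variant("chr2", 400, "GENE_B", "missense", True),
    Variant("chr3", 500, "GENE_X", "missense", False),
    Variant("chr4", 600, None, "intergenic", False),
]
candidate_genes = {"GENE_A", "GENE_B"}

novel, in_candidates, damaging = filter_funnel(variants, candidate_genes)
print(len(variants), len(novel), len(in_candidates), len(damaging))  # prints: 6 5 3 2
```

In the study itself the same kind of funnel ran from roughly 3.4 million SNPs to 650 and then to two mutations; the toy list only mirrors the order of the steps.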
However, what is of possibly greater interest, said Lupski, is that one of the parents and one of the grandparents both had features of neuropathy, but not the actual disease. And, when they went back and looked at their genomes, they found that they each had copies of the missense mutation. "It suggests that there is some gain of function," said Lupski.
Also, the parent with the nonsense mutation was more susceptible to carpal tunnel syndrome, he added. Lupski said this suggests that a gene that causes one disease may also play a role in other, more complex diseases.
While other studies have used exome sequencing to identify genes associated with Mendelian disorders, Lupski said that exome sequencing does not provide adequate information about copy number variation. He added that his team plans to use whole-genome sequencing to study other neuromuscular diseases. He said they will likely begin with Mendelian diseases, but will also study more complex diseases.
"We'll start with Mendelian diseases because we can use the inheritance patterns to help us dissect the causal variants. But I think this approach will also be important in multifactorial or complex traits," he said.
In the ISB study, published in Science, the researchers sequenced the whole genomes of four family members — two siblings and their parents — in which the two siblings both had Miller syndrome and primary ciliary dyskinesia, recessive Mendelian diseases. The sequencing was done by Complete Genomics to an average coverage of 40-fold.
The team found 4.5 million SNPs. To narrow them down, they hypothesized that the disease-causing variant was a rare SNP in the coding region of the genome, and that the parents were both heterozygous while the affected children were either homozygous or compound heterozygous, meaning they each carried two different mutations in the same gene, one inherited from each parent.
They compared the SNPs to known common SNPs, and eliminated all the common SNPs, which narrowed the pool down to about 250,000. Then they looked for identical regions in both the children, since they both inherited the disease. They found two SNPs that matched the recessive mode, while three genes fit the compound heterozygote mode. Of the four genes, one had already been identified as the causative gene in primary ciliary dyskinesia, and further testing identified the causative gene for Miller syndrome. That result was simultaneously confirmed by whole-exome sequencing of unrelated individuals by a group at the University of Washington (see In Sequence 9/29/2009).
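The two inheritance models the ISB team tested can be illustrated with a small sketch. It is not the group's actual analysis: the `Site` record, the function names, the gene labels, and the genotypes are hypothetical, genotypes are assumed to be coded as the number of copies of the rare allele (0, 1, or 2), and both affected siblings are assumed to share the same genotype at a causal site.

```python
from collections import defaultdict
from typing import NamedTuple


class Site(NamedTuple):
    gene: str
    pos: int
    mother: int  # copies of the rare allele: 0, 1, or 2
    father: int
    child1: int
    child2: int


def recessive_genes(sites):
    """Genes with a site where both parents are carriers and both children are homozygous."""
    return {
        s.gene
        for s in sites
        if s.mother == 1 and s.father == 1 and s.child1 == 2 and s.child2 == 2
    }


def compound_het_genes(sites):
    """Genes where both children carry one maternal-only and one paternal-only rare allele."""
    maternal = defaultdict(set)
    paternal = defaultdict(set)
    for s in sites:
        if s.child1 == 1 and s.child2 == 1:
            if s.mother == 1 and s.father == 0:
                maternal[s.gene].add(s.pos)
            elif s.father == 1 and s.mother == 0:
                paternal[s.gene].add(s.pos)
    # A gene qualifies only if it has at least one site from each parent.
    return {gene for gene in maternal if gene in paternal}


# Toy data: gene labels and positions are made up for illustration.
sites = [
    Site("GENE_R", 10, 1, 1, 2, 2),  # fits the simple recessive model
    Site("GENE_C", 20, 1, 0, 1, 1),  # maternal allele, both children carry it
    Site("GENE_C", 35, 0, 1, 1, 1),  # paternal allele in the same gene
    Site("GENE_N", 50, 1, 0, 1, 1),  # only one parental allele: not a candidate
    Site("GENE_M", 60, 1, 1, 2, 1),  # siblings differ: not a candidate
]

print(sorted(recessive_genes(sites)), sorted(compound_het_genes(sites)))
```

Having both parents' genotypes is what makes the compound heterozygote test possible here, since it shows which parent each rare allele came from.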
David Galas, senior vice president for strategic partnerships at the Institute for Systems Biology and senior author of the paper, said that while in this case exome sequencing was able to identify the causative mutation, for more complex diseases that are not single-gene recessive, exome sequencing may not be as effective. He also said that the ISB team was able to study family genetics in much more detail with whole-genome sequencing.
"We could see, for example, all of the private SNPs," Galas said. Also, had their approach not yielded the correct genes, he said their next step would have been to look in the promoter regions, which would not have been visible with exome sequencing.
Not only did the whole-genome family sequencing allow Galas's team to identify the causative genes in Miller syndrome, but it also allowed them to identify all the recombination points with "incredible accuracy."
They compared their findings to the HapMap study and found similar results — that recombination seems to occur only in "hotspots," meaning that certain areas of the chromosomes are more susceptible to recombination than others. What exactly that all means, though, is still unclear, Galas said.
Galas's team used a different sequencing strategy from that used in the NEJM study. That group sequenced the whole genome of only one affected individual and did a genotyping analysis on the other family members.
The University of Miami's Zuchner said that sequencing the entire family enabled the ISB group to find a novel gene. "If they had not sequenced the parents, they would have had many more candidate genes to sort through in the end," he said. "If you are looking for novel genes, sequencing just the individual patient is not enough." However, he added, since the Baylor group was looking for mutations in already known genes, its approach worked well for that task.
Galas said there are tradeoffs between the two approaches. "If [Baylor] hadn't checked the right genes [in the parents], they would have missed [the causative one]. Our [approach] is more comprehensive, but it's also more expensive and more work," he said.
He added that for this project, the sequencing cost about $25,000 per genome. However, he said the project was more of a collaboration with Complete Genomics, because his group helped the company with software improvements, and that future work would be more of a fee-for-service transaction. He also said that costs have come down since the study, and that the same amount of sequencing would now cost about half of what it did then.
The Baylor study cost around $50,000 total for sequencing reagents, but Lupski added that today the cost would probably be about one-third of that price.
Galas added that he thought whole-genome family sequencing would be even more useful for complex diseases. While exome sequencing is limited to simple diseases, there's "no limitation to the types of diseases you can use this technique for elucidating the genetic underpinnings."
Zuchner said that whole-genome family sequencing should eventually work well for more complex diseases. "For now, it will work well for Mendelian recessive diseases and rare syndromic conditions — those are the imminent targets," he said. "Eventually, it will work for complex, common genetic phenotypes like diabetes and Alzheimer's."
Galas said his team next plans to study Huntington's disease in families. While the causative gene for Huntington's has already been identified, his team will look for modifier genes, which influence factors such as the age of onset and severity of the disease. He said that they will continue to use Complete Genomics' sequencing services for the immediate future, and added that they will likely sequence between 100 and 200 genomes in the upcoming year.
Aside from the Huntington's project, Galas also plans to study the genetics of other neurodegenerative diseases such as Alzheimer's and Parkinson's, but these efforts are "still in early stages of lining up collaborators." He said that they will need large families with a history of those diseases because the diseases are so complex.