By Julia Karow
This article was originally published Sept. 25.
Scientists from the Institute for Systems Biology have analyzed the genomes of a four-member family that Complete Genomics sequenced on their behalf and have pinpointed three candidate disease genes for the two children, who suffer from rare Mendelian disorders.
Independently, a group at the University of Washington identified and validated two of the same candidate genes in an exome-sequencing study that involved the same two children as well as two unrelated individuals suffering from the same syndrome.
Both groups presented their results earlier this month at the Personal Genomes conference at Cold Spring Harbor Laboratory. For Complete Genomics, the ISB study provided the first independent validation of the company's sequencing service by an early-access customer.
The studies show the potential of whole-genome and exome sequencing to discover the underlying causes of rare monogenic diseases from a few samples, without the need for linkage studies.
At the conference, Lee Rowen, a senior research scientist at ISB, discussed early results from the institute's unpublished pilot project, the data for which Complete Genomics delivered in May.
Based on the results, ISB plans to send Complete Genomics another 100 samples from multi-generational families for sequencing, she said. Institute researchers are currently identifying the next sets of families to be analyzed.
For the study, Complete Genomics sequenced the genomes of two healthy parents and their two children, who both suffer from a rare craniofacial malformation syndrome as well as from severe lung disease. The DNA samples came from the University of Washington and arrived at the company in February.
The project had three main aims, Rowen said: to evaluate Complete's sequencing technology; to study how genes are passed on from one generation to another — including recombination and new variations; and to find candidate genes for the children's diseases.
At a cost of $20,000 per genome, Complete Genomics generated about 120 gigabases of data per sample, which it provided to ISB researchers in May, including reads mapped to the hg18 reference genome, coverage tables, and variation tables.
Base calls for both alleles of all four genomes added up to about three-quarters of the human genome. "We were happy with what they gave us," Rowen said.
The researchers also assessed the technology's error rate, which they currently estimate to be about 10 to 20 bases per megabase, depending on the portion of the genome analyzed. The error analysis is ongoing, she said, including a classification of the types of errors. Overall, they found the data quality to be "extremely high," she added.
Analyzing the genomes allowed the scientists to study which chromosomes each parent passed along to the children, and to localize chromosomal crossover sites during meiotic recombination at high resolution.
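As a rough illustration of how crossover sites can be localized, the sketch below assumes that each informative marker along a chromosome has already been labeled with the parental haplotype a child inherited there. The function name, the "A"/"B" labels, and the toy positions are hypothetical and are not taken from the ISB analysis; a crossover is simply placed in the interval between two neighboring markers whose labels differ.

```python
from typing import List, Tuple


def crossover_intervals(calls: List[Tuple[int, str]]) -> List[Tuple[int, int]]:
    """Return (left, right) marker positions between which the inherited
    parental haplotype label switches, i.e. where a crossover must lie."""
    ordered = sorted(calls)  # sort markers by chromosomal position
    intervals = []
    for (pos_a, hap_a), (pos_b, hap_b) in zip(ordered, ordered[1:]):
        if hap_a != hap_b:
            intervals.append((pos_a, pos_b))
    return intervals


# Toy input: hypothetical positions and labels, not data from the study.
toy_calls = [(1000, "A"), (5000, "A"), (9000, "B"), (12000, "B"), (20000, "A")]
print(crossover_intervals(toy_calls))
```

The denser the informative markers, the narrower each reported interval, which is why whole-genome data allows higher resolution than sparse marker panels.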
In order to pinpoint candidate disease genes, the researchers — assuming that the children suffer from rare Mendelian diseases — searched the data for novel non-synonymous SNPs in the children's exons and ended up with a short list of three genes.
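A filter of this kind might be sketched as follows. The record layout, field names, and gene labels are hypothetical, and the requirement that both children carry a variant is an assumption made for the illustration; the article does not describe ISB's actual pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    effect: str           # e.g. "non-synonymous" or "synonymous"
    novel: bool           # True if absent from known-variant catalogs
    carriers: frozenset   # family members carrying the variant


def candidate_genes(variants, affected=("child1", "child2")):
    """Keep novel, non-synonymous variants carried by every affected child
    (an assumed criterion) and return the sorted list of genes they hit."""
    hits = defaultdict(list)
    for v in variants:
        if v.effect == "non-synonymous" and v.novel and set(affected) <= v.carriers:
            hits[v.gene].append(v)
    return sorted(hits)


# Toy records with placeholder gene names, not results from the study.
toy_variants = [
    Variant("GENE_A", "non-synonymous", True, frozenset({"child1", "child2", "mother"})),
    Variant("GENE_B", "synonymous", True, frozenset({"child1", "child2"})),
    Variant("GENE_C", "non-synonymous", False, frozenset({"child1", "child2"})),
    Variant("GENE_D", "non-synonymous", True, frozenset({"child1"})),
]
print(candidate_genes(toy_variants))
```

Each condition removes a different class of variant: silent changes, variants already catalogued in the population, and variants not shared by the affected children.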
Several Paths Lead to Rome
Independently, a group of researchers led by Jay Shendure, an assistant professor of genome sciences at the University of Washington, sequenced the exomes of the two children from the same family, as well as the exomes of two additional individuals from two unrelated families who suffer from the same craniofacial disorder, which he identified as Miller's Syndrome.
To isolate the exonic target DNA, he and his colleagues used hybridization capture on Agilent arrays, followed by sequencing on the Illumina Genome Analyzer II with 76-base-pair reads.
Using a SNP-filtering approach similar to the one in their recently published proof-of-principle study on Freeman-Sheldon Syndrome, where the disease-causing gene was already known (see In Sequence 9/1/2009), they identified a single gene, involved in nucleotide metabolism, that was mutated in all four individuals.
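The core of such a cross-individual comparison can be pictured as a set intersection: a gene stays on the list only if every affected individual carries a candidate variant in it. The sketch below is a minimal illustration under that assumption; the function name, sample labels, and gene names are hypothetical and do not reflect the Washington group's actual code or findings.

```python
def shared_genes(genes_per_individual):
    """Return the genes that appear in every individual's candidate list."""
    gene_sets = [set(genes) for genes in genes_per_individual.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Toy candidate lists with placeholder names, not results from the study.
toy_candidates = {
    "sibling1": ["GENE_X", "GENE_Y", "GENE_P"],
    "sibling2": ["GENE_X", "GENE_Y", "GENE_Q"],
    "unrelated1": ["GENE_X", "GENE_R"],
    "unrelated2": ["GENE_X", "GENE_S"],
}
print(shared_genes(toy_candidates))
```

Adding unrelated individuals with the same disorder shrinks the intersection quickly, because unrelated people share few rare variants by chance.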
For the two siblings who were also analyzed by the ISB group, Shendure's team identified a mutation in a second gene that appears to cause their lung disease, suggesting that they suffer from two independent Mendelian diseases. Both genes are among the three candidates identified by the ISB group.
Although both teams' approaches led to essentially the same answer, Shendure said exome sequencing might currently be the more efficient way to study Mendelian diseases because only about one percent of the genome needs to be analyzed. He cautioned, however, that mutations outside of exons will be missed.