By Monica Heger
A host of exome sequencing projects are attempting to demonstrate the technology's utility in identifying the genetic causes of both rare Mendelian diseases and more complex diseases, such as heart, lung, and blood diseases. Proponents of the approach say that, for now, exome sequencing makes sense for these applications because the results are more straightforward to interpret and it is more cost-effective than whole-genome sequencing.
Recently, researchers from the University of Miami demonstrated exome sequencing in multiple generations of a family. Their study, published in PLoS ONE this month, showed that exome sequencing is a useful way to look for de novo variation and Mendelian inconsistencies throughout a family.
Separately, a team from Cold Spring Harbor Laboratory used exome sequencing to identify a causative mutation for Joubert's syndrome, a rare brain disease characterized by the underdevelopment of the area of the brain that controls balance and coordination. The results from that study, published in the American Journal of Human Genetics in January, are already being used in the clinic, Yaniv Erlich, an author of the study, told In Sequence.
Finally, three large-scale exome sequencing projects recently began: the National Heart, Lung, and Blood Institute's Large-Scale DNA Sequencing Project, which is using exome sequencing to study heart, lung, and blood diseases; the Exome Project, funded by NHLBI and the National Human Genome Research Institute, which is focused on developing exome sequencing technology (see In Sequence 9/1/2009); and a third, under the NHGRI-funded Medical Sequencing Discovery Projects, which is focused on Mendelian diseases as well as more complex diseases such as Alzheimer's disease.
Jay Shendure, assistant professor of genome sciences at the University of Washington, who is working on all three of the exome sequencing projects, said exome sequencing "represents a new and powerful way to solve things that have been intractable to other approaches."
CSHL's Erlich agreed, and said it is especially rewarding to see results so quickly. In his study, he sequenced the exomes of a healthy mother and her daughter, who had Joubert's syndrome, on the Illumina Genome Analyzer.
The study generated 35 million paired-end reads, with read lengths of 36 base pairs, per sample. Erlich and his colleagues then compared the two exomes, narrowing down the possible causative SNPs by first eliminating all that were not homozygous in the daughter and heterozygous in the mother and excluding those that were already documented. They then looked for SNPs that had an effect on a protein and were also in the suspected chromosomal region, and only one remained. These results were also confirmed by colleagues in Israel who examined an additional 13 patients with the same syndrome from eight different Ashkenazi Jewish families.
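The filtering cascade described above can be sketched in a few lines of Python. This is only an illustration of the logic as reported here, not the CSHL team's actual pipeline: the `Variant` record, its field names, the genotype labels, and the toy coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one SNP call in the mother-daughter pair."""
    chrom: str
    pos: int
    daughter_gt: str      # assumed labels: "hom_alt", "het", "hom_ref"
    mother_gt: str
    known: bool           # already documented in a public SNP catalog
    alters_protein: bool  # predicted to change the encoded protein


def candidate_variants(variants, region):
    """Apply the three filters in the order the article describes them."""
    chrom, start, end = region
    # 1. Keep SNPs homozygous in the affected daughter and heterozygous
    #    in the healthy mother (a recessive inheritance pattern).
    recessive = [v for v in variants
                 if v.daughter_gt == "hom_alt" and v.mother_gt == "het"]
    # 2. Drop SNPs that are already documented.
    novel = [v for v in recessive if not v.known]
    # 3. Keep SNPs that affect a protein and lie in the suspected region.
    return [v for v in novel
            if v.alters_protein and v.chrom == chrom and start <= v.pos <= end]


# Toy data with made-up coordinates; only the first record passes every filter.
toy_calls = [
    Variant("chr1", 1500, "hom_alt", "het", False, True),
    Variant("chr1", 1600, "het", "het", False, True),          # daughter not homozygous
    Variant("chr1", 1700, "hom_alt", "het", True, True),       # already documented
    Variant("chr1", 1800, "hom_alt", "het", False, False),     # no protein effect
    Variant("chr2", 1500, "hom_alt", "het", False, True),      # outside the region
    Variant("chr1", 1900, "hom_alt", "hom_alt", False, True),  # mother not heterozygous
]

hits = candidate_variants(toy_calls, ("chr1", 1000, 2000))
for v in hits:
    print(v.chrom, v.pos)
```

Each filter is cheap on its own; the power of the approach comes from stacking them, so that a list of many thousands of SNPs can shrink to a single candidate, as it did in this study.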
"It's a fantastic example of how fast you can find these mutations using next-gen sequencing," Erlich said, adding that the whole project took just seven weeks. And he said that Dor Yeshorim, a genetic testing clinic based in New York, has now started testing for that mutation. "To see something so quickly instated into a clinical setting is very satisfying," he added.
Erlich said his group is now working on a case study, trying to discover the genetic cause of a severe neurological disease in the child of married cousins. He added that discussions with his collaborators indicate that there is a lot of excitement about using exome sequencing to study genetic diseases.
Stephan Zuchner, director of the Center for Human Molecular Genomics at the University of Miami and senior author of the PLoS ONE study, agreed that exome sequencing could be particularly useful for studying Mendelian diseases. He added that exome sequencing would also be good for studying de novo variation in multigenerational family studies. Zuchner and his team used 454's GS FLX Titanium to sequence the exomes of eight individuals from three generations in the same family. Their average read lengths were 350 base pairs, and they obtained between 700 megabases and 1.3 gigabases of sequence for each individual.
While this study was mainly a proof of principle, showing that the technique worked, Zuchner said his team is planning to use the method to study neurodegenerative diseases such as hereditary spastic paraplegia, which is characterized by the degeneration of the long axon in the spinal cord and is similar to amyotrophic lateral sclerosis.
Beyond the work of individual laboratories, several large-scale exome sequencing projects are being funded by the National Human Genome Research Institute and the National Heart, Lung, and Blood Institute.
Under the Medical Sequencing Discovery Project, the University of Washington team will identify genes underlying Mendelian diseases, including rare, single-gene disorders, as well as more complex, common diseases like Alzheimer's and multiple sclerosis, where there is evidence of a strong genetic component and where family pedigrees exist, said Jay Shendure from the University of Washington.
The researchers are using either NimbleGen or Agilent capture arrays, and doing the sequencing on the Illumina GA, with paired-end reads of 76 base pairs. Shendure added that they haven't figured out exactly which Mendelian diseases or exactly how many they would sequence.
"We're selecting a set of Mendelian diseases based on the samples we have access to, but also on certain characteristics that we think make them likely to have success," Shendure said. For example, one key is the ability to follow up on initial findings, so having access to other samples will be important. Also, selecting for diseases with homogeneous phenotypes and ones that are suspected to have one or two causative genes will make it easier to find those genes.
But Shendure also said they will look at more complex diseases, like multiple sclerosis and Alzheimer's, for which they can obtain a good family pedigree. Alzheimer's disease, for example, is often passed down throughout family generations. And even though there are likely many genes involved, the hope is that by sequencing the whole exome of individuals throughout several generations, the researchers will be able to identify some of those genes. "We're taking a risk," Shendure said.
NHLBI's Large-Scale DNA Sequencing Project will use exome sequencing to study heart, blood, and lung diseases. The project, which is a collaboration between the sequencing centers at the University of Washington and the Broad Institute, as well as several large cohort studies that will be contributing the samples, began last fall, and will sequence the exomes of around 8,000 individuals over the next two years.
Deborah Nickerson, who heads the lab at the University of Washington that is being funded under this grant, said that the first 1,000 samples are already being sequenced at both her lab and the Broad Institute. They are using a sequencing strategy similar to that of the Medical Sequencing Discovery Project: Agilent or NimbleGen sequence-capture technologies coupled with sequencing on the Illumina GA with 76-base paired-end reads. The current samples are from individuals with early-onset heart attack and chronic obstructive pulmonary disease.
"We're sequencing exomes of individuals from the extremes of phenotypic distribution for traits known to be linked to the development of heart, lung, and blood diseases and individuals with early-onset disease, to enrich for genetic effects," said Nickerson. In the case of early-onset heart attack, for example, they are looking at individuals at the extreme ends of body mass index and of low-density cholesterol levels.
While these diseases likely have multiple genes involved in their development, the goal is to narrow the pool down to a handful that are associated with risk of these diseases. "These diseases are heterogeneous, but by sequencing large numbers of individuals, we hope to overcome those problems," Nickerson said.
Ultimately, she would like to identify novel pathways that could aid in drug development and new therapeutic strategies. "We expect we'll find a few [genes] that we already know about, but the goal, really, is to find some unexpected things. That will make it a really important project, if we can uncover things that we couldn't by other means," she added.
While whole-exome sequencing has proven a valuable tool for identifying the genetic causes of Mendelian disease, exactly how useful it will prove in more complex diseases remains to be seen. Some researchers think that exome sequencing will have a shelf life that lasts only until the costs of whole-genome sequencing come down. Erlich said that while it is difficult to give an exact figure, his gut feeling is that when it becomes possible to do whole-genome sequencing in two to four lanes, exome sequencing may no longer be necessary. But, he said, it will depend on the project and will also require that storage costs come down and that algorithms improve.
"Eventually, you will do whole-genome sequencing," said Miami's Zuchner. But until that time, "exome sequencing is a great transitional method," he said.
University of Washington's Shendure agreed that exome sequencing probably is a transitional technology, but he said it's impossible to say for sure whether it would be two years, five years, or more than ten years before the costs of whole-genome sequencing came down enough to displace it. "The timing of that moment is still a question mark," he said. "We don't think it makes sense to sit around and wait."