Since Jay Shendure's team at the University of Washington used exome sequencing to identify candidate genes for Freeman-Sheldon syndrome in 2009, the literature has filled with papers showing how targeted capture methods can help researchers find the mutations underlying a range of Mendelian disorders. "It's amazing. Early in 2010, you could very well publish a paper just because you used exome sequencing, but now, less than a year later, to find a gene using exome sequencing is not a selling point anymore," says Stephan Züchner, director of the Center for Human Molecular Genomics at the University of Miami Miller School of Medicine. Züchner recently led a team that used exome sequencing to study Charcot-Marie-Tooth disease. Because Charcot-Marie-Tooth presents a relatively uniform phenotype across most patients, clinicians are often unable to determine which of the more than 35 genes related to the disease is affected in a given individual. Züchner's team used whole-exome sequencing to identify a non-synonymous mutation in GJB1 in members of an undiagnosed family with Charcot-Marie-Tooth.
A number of conditions — including Joubert syndrome, Kabuki syndrome, Schinzel-Giedion syndrome, and Sensenbrenner syndrome — have been explored with whole-exome sequencing. Shortly after the Freeman-Sheldon paper was published, cancer researchers — including those in the International Cancer Genome Consortium — began to explore the potential of exome sequencing as a tractable and inexpensive tool for profiling cancers. At the same time, researchers at Johns Hopkins University published a study in Science using exome sequencing to characterize hereditary pancreatic cancer. The paper was the first in which researchers identified a gene responsible for a hereditary cancer with exome sequencing.
By the thousands
Several large-scale exome sequencing initiatives are currently underway. The National Human Genome Research Institute's Medical Sequencing Discovery Project supports investigators using exome sequencing to study Mendelian diseases, while the National Heart, Lung, and Blood Institute supports the Exome Project and the Grand Opportunity (GO) Exome Sequencing Project, the latter of which aims to sequence 7,000 exomes over two years. Both initiatives aim to further the development of exome sequencing technology to study heart, lung, and blood diseases. Indeed, during the technology development portion of the Exome Project, researchers refined several targeted sequencing methods, including molecular inversion probes, array-based hybrid capture, and in-solution capture, the last of which is widely regarded as the most effective technique for large-scale exome sequencing.
Weiniu Gan, director of NHLBI's Genetics, Genomics, and Advanced Technologies Program, says that when the Exome Project launched in 2008, he and his colleagues were not even sure the technique was worth all the effort. "Why not just wait a couple of years for whole-genome [sequencing]? Well, by now we already know that if we wait a couple of years, there still might not be whole-genome sequencing," Gan says. "And exome sequencing is not just for the exome." Some exome sequencing approaches can be adapted to include promoter regions and can be used to generate methylation profiles. "This is something that will stay as an option for targeted sequencing," he adds.
Washington's Shendure and his colleagues at the GO Exome Sequencing Project published a Genome Biology paper in December, describing how they extended a method for constructing shotgun fragment libraries using transposase that simultaneously catalyzes in vitro DNA fragmentation and adaptor incorporation. Shendure's team also reports protocols for exome capture using as little as 50 nanograms of DNA.
To the clinic
The rapid pace of exome sequencing technology refinement has many researchers in the community confident that a targeted approach could eventually be of widespread use in the clinic. But before it gets to that point, more improvements must be made. "One of our goals is certainly to reduce costs, but perhaps more importantly to improve the quality of the end product," Shendure says. He points out that exome capture approaches still miss about 5 percent of the coding sequence. In addition, researchers must develop ways to deal with a range of quality and quantity of input DNA, since they depend on stored frozen samples to study many conditions. "The [samples] were collected at various time points and various methods over the past several decades and have been stored in different ways, and people don't want to give up everything they have in case they need to do validation or whole genome sequencing or something else with it down the line," he says. As such, robust, low-input protocols will be necessary for expanding the repertoire of samples to which researchers can apply exome sequencing.
Cost-benefit analysis
A team of researchers at the National Institutes of Health's Undiagnosed Diseases Program — a pilot project that aims to diagnose patients with rare diseases that often baffle physicians — recently reported results that suggest exome sequencing may become a clinical standard in the near future. According to David Adams, an NHGRI senior staff clinician who works with pediatric patients and their families through the program, exome sequencing will become increasingly attractive to both clinicians and patients grappling with complicated diseases. "By the time people come and see us, generally they've had a thousand dollars of workup already. Quite a bit of that is spent seeing specialists and having procedures done, and people often have a number of molecular tests performed," Adams says. "At some point, the cost of doing all those clinical molecular tests — in other words, sequencing candidate genes — would be greater than ... looking at the whole exome screen. That's a utilitarian cost-benefit analysis argument you can make."
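Adams' cost-benefit argument can be sketched as simple arithmetic: keep a running total of candidate-gene tests and see when it passes the price of one exome. The snippet below is only an illustration. The function name `break_even_test_count` and every dollar figure are hypothetical placeholders, not prices from the program.

```python
from itertools import accumulate


def break_even_test_count(gene_test_costs, exome_cost):
    """Return how many sequential gene tests it takes for cumulative
    spending to first exceed the cost of one exome, or None if it never does."""
    for n, running_total in enumerate(accumulate(gene_test_costs), start=1):
        if running_total > exome_cost:
            return n
    return None


# Hypothetical per-test prices in dollars (assumed for illustration only).
candidate_gene_tests = [800, 1200, 950, 1500, 700]

# Hypothetical exome price in dollars (assumed for illustration only).
print(break_even_test_count(candidate_gene_tests, exome_cost=4000))
```

With these made-up figures, the fourth candidate-gene test pushes the running total past the exome price. As the exome price falls, that crossover arrives after fewer tests.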
So far, Adams and his collaborators have analyzed around 50 exomes from 11 patients and their relatives. In what has become common practice in exome sequencing studies of de novo coding mutations, the team took a triad-based approach and looked not only at exomes from the patients, but also those of their unaffected parents and siblings. So far, the researchers have identified disease-causing variants in two patients and candidate causative variants in two others.
"In the beginning, we are looking though a large number of variants and our tools are fairly cumbersome, so the further along we get, the smaller the number of high-quality variants we have to be taken to the lab, where they are ruled in or ruled out. Our ability to home in on what's important for a given family is a big thing," Adams says. "We hope to be able to tell people we have some experience using this as a screening technique [so that] we can tell them it's worth the extra effort to get family members up front, because that aids the analysis a lot." Adams says that as the technology becomes less expensive, clinicians can begin to apply it earlier on, and the cost-benefit argument becomes easier to make. Ultimately, he and his colleagues would like to share streamlined exome analysis tools with the community.
When it comes to the question of whether exome sequencing can be applied in the clinic, there are two significant hurdles that researchers must overcome. The first is accuracy. The main challenge for exome sequencing is that target coverage is not yet sufficient — simply because it's based on hybridization, and some regions of genes are more difficult to hybridize than others. It will be essential to determine exactly which gene tests work well with an exome sequencing approach and which should stand alone. A second problem is that exome sequencing often produces false-positive hits, compelling researchers to follow up on their results. "At minimum, most people would agree right now that if you detect a change in a Mendelian gene, you should go back and apply a different method — like Sanger sequencing or another way of genotyping — to confirm that this change really is present," Miami's Züchner says. "There are methodological issues ... surrounding exome sequencing." He adds that the cost of sequencing is coming down so rapidly that soon "it will be just straightforward whole-genome sequencing."
Despite reservations that relegate exome sequencing to the status of a provisional solution until affordable whole-genome sequencing is available, some researchers are already demonstrating its potential for bringing new solutions to the clinic today. Nikolas Papadopoulos, director of translational genetics at Johns Hopkins University's Sidney Kimmel Comprehensive Cancer Center, recently took an exome sequencing-based approach to explore the genetic basis of pancreatic neuroendocrine tumors. He and his colleagues found that mutations in DAXX and ATRX are markers for these tumors. However, the real significance of the team's paper is its suggestion that patients might benefit from drugs that are already available. "The point of the paper is the personalized medicine aspect, in that we found that some of the pancreatic tumors have mutations in genes where these patients would benefit from taking endocrine inhibitors which already exist," Papadopoulos says. "Exome sequencing technology has allowed us to make these discoveries."
In October, investigators at the University of Copenhagen and BGI in Shenzhen, China, published in Nature Genetics the largest exome sequencing study to date, which helps to solidify the role of rare variants in disease. The team sequenced the exomes of 200 individuals of Danish ancestry and found an excess of low-frequency, non-synonymous variants relative to synonymous ones. Joris Veltman, an associate professor of genetics at Radboud University Nijmegen Medical Centre, says that exome sequencing not only provides insights into disease-causing mutations in rare monogenic diseases, but also represents a new paradigm for researchers' understanding of common disorders. "It's always good to say that we got a lot of experience from using exome sequencing in rare monogenic syndromes, but of course the big challenge for all of us in genetics is tackling these common disorders," Veltman says.
He and his colleagues recently conducted a study to determine whether de novo mutations might compensate for allele loss due to reduced fecundity in common neurodevelopmental diseases. They tested their de novo mutation hypothesis in 10 patients with unexplained mental retardation using a family-based exome sequencing approach, and identified unique non-synonymous de novo mutations in nine genes. "We were able to show that a large proportion of this common disorder of mental retardation is actually caused by de novo mutations in the genome of a patient. ... Our data has suggested that this is actually a common cause of the reproductively lethal disorder," Veltman says. "As a geneticist, this also really changes your idea of setting up genetic studies, in that we used to look for families where a disease is occurring in order to start mapping that gene because that's what we could do using linkage analysis. But now we can go for disease that occurs sporadically where we can use DNA from the parents as a control set to filter out all of the variants that are inherited from unaffected parents and then look for mutations that arise de novo in the patients."
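The parental filtering Veltman describes can be sketched as a set difference: any variant in the patient that also appears in either unaffected parent is treated as inherited and dropped. This is a minimal illustration, not the team's actual pipeline. The `Variant` type, the function name, and the example coordinates are all hypothetical.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """A variant call identified by position and alleles (hypothetical schema)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_de_novo(child: Set[Variant],
                      mother: Set[Variant],
                      father: Set[Variant]) -> Set[Variant]:
    """Keep only child variants seen in neither parent."""
    inherited_pool = mother | father
    return child - inherited_pool


# Made-up example calls, for illustration only.
shared = Variant("chr1", 1000, "A", "G")
maternal = Variant("chr2", 2000, "C", "T")
novel = Variant("chr3", 3000, "G", "A")

child_calls = {shared, maternal, novel}
mother_calls = {shared, maternal}
father_calls = {shared}

print(candidate_de_novo(child_calls, mother_calls, father_calls))
```

Only the variant absent from both parents survives the filter. In practice each surviving candidate would still need confirmation, because a variant can also look de novo when a parental call was missed or the child's call is a false positive.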
Cracking complexity
With exome sequencing's impressive track record for Mendelian diseases, many researchers are curious about its potential to aid studies of more complex diseases. Last fall, researchers from the Broad Institute and Massachusetts General Hospital published a study in the New England Journal of Medicine in which they describe their use of a solution capture approach, coupled with exome sequencing on an Illumina Genome Analyzer platform, to identify the gene that causes familial hypolipidemia, a condition characterized by very low levels of LDL cholesterol. The team sequenced the exomes of two siblings with very low levels of LDL cholesterol as well as 60 exomes of unaffected, unrelated individuals of similar ancestry. Sifting through the variants, the researchers found that the two affected siblings are compound heterozygotes for two distinct nonsense mutations in ANGPTL3, which encodes the angiopoietin-like 3 protein that is known to inhibit both lipoprotein and endothelial lipase, thus pointing to the gene's role in LDL cholesterol metabolism.
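One way to picture the sifting step is as a filter for genes in which every affected sibling carries at least two distinct variants that are absent from the unaffected controls. The sketch below rests on several assumptions: the function name, the data layout, and all variant labels are hypothetical placeholders, and it is not the published analysis. It also does not check phase, so it cannot confirm that the two variants sit on different copies of the gene, which true compound heterozygosity requires.

```python
from collections import defaultdict


def genes_with_two_hits(variants_by_person, control_variants, min_hits=2):
    """Return genes in which every affected person has at least `min_hits`
    distinct variants that are not seen in the control set.

    variants_by_person: dict mapping person -> set of (gene, variant_id)
    control_variants:   set of (gene, variant_id) seen in unaffected controls
    """
    candidates = None
    for variants in variants_by_person.values():
        hits = defaultdict(set)
        for gene, variant_id in variants:
            if (gene, variant_id) not in control_variants:
                hits[gene].add(variant_id)
        genes = {g for g, ids in hits.items() if len(ids) >= min_hits}
        # Intersect across people so only genes shared by all remain.
        candidates = genes if candidates is None else candidates & genes
    return candidates or set()


# Placeholder labels only; these are not real variant names.
siblings = {
    "sibling_a": {("ANGPTL3", "nonsense_1"), ("ANGPTL3", "nonsense_2"),
                  ("GENE_X", "missense_1")},
    "sibling_b": {("ANGPTL3", "nonsense_1"), ("ANGPTL3", "nonsense_2"),
                  ("GENE_Y", "missense_2"), ("GENE_Y", "missense_3")},
}
controls = {("GENE_Y", "missense_2")}

print(genes_with_two_hits(siblings, controls))
```

In this toy input only ANGPTL3 passes, because it is the one gene where both siblings carry two variants missing from the controls.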
The underlying variants that play a role in the development of some mitochondrial disorders and celiac disease have also come into the purview of this targeted approach, with varying degrees of success. Some researchers, including Washington's Shendure, caution that it might be dangerous to generalize based on these examples, as it is difficult to know in advance which diseases are suited for exome sequencing.
"There is definitely lots of potential, but I think it would be wise not to set expectations too high. For example, we would probably be unwise to define success as having exomes to solve the missing heritability question as this is just setting ourselves up for disappointment," Shendure says. "My prediction is that there will be certain complex diseases where exome sequencing is extraordinarily successful for identifying new disease-relevant genes or understanding the architecture of the genetic basis for that disease, and others where the discoveries will be much more modest. But the only way to know is to ask the question."
Exome's exit?
While some say that exome sequencing is essentially a placeholder technology, others say that ultimately the debate on its inherent value compared to whole-genome sequencing misses the point. "First, early predictions that exome sequencing would be short-lived have already proven incorrect, so I think that we should be careful about predicting exactly when this transition will occur. The price point of whole-genome sequencing still has a ways to drop before the difference between it and exome sequencing becomes negligible," Shendure says. "It continues to be the case that our analysis of whole genome sequences, in most contexts, focuses almost exclusively on the part that we can reasonably interpret in a clinical context, the exome. So not much may change when things transition from exomes to genomes in terms of how we look at the data."