The battle between whole-genome and whole-exome sequencing has been waged for years. Researchers have discussed whether one approach or the other is better for different purposes: basic versus clinical research, complex versus Mendelian diseases, sequencing thousands of genomes versus just a few. Many researchers consider the debate both useful and practical, as it helps them clarify the uses of each method, how the two could be used together, and how best to develop pipelines to analyze the resulting data.
"I think each should be considered as the application and what it's being put towards. The two salient points are: What's the differential in terms of cost and what's the differential in terms of what you can get out of it?" says the University of Washington's Jay Shendure. "In terms of cost, the differential is dropping. As that equation continues to evolve, it'll shift the balance more towards whole-genome, which is where I think we're headed — but that's happening I think a little quicker than people have anticipated. That being said, I think exome sequencing will continue to be substantially cheaper than genome sequencing, at least in the near future."
Because exome sequencing covers only the coding portion of the genome, about 1.5 percent of the whole, it is less expensive than whole-genome sequencing while still yielding important information on mutations and other variations. However, exome sequencing misses non-coding variation and some structural variation. As the cost differential between the two methods shrinks, some researchers are using the more comprehensive whole-genome method or a combination of the two to get as much information out of their data as possible.
Cost is not the only consideration. Sequencing, whether of a genome or an exome, is only the first step in determining which variants are useful in the study of a particular disease, or in the clinic for the diagnosis and treatment of a patient. The bioinformatics portion of the process is just as important — and is usually more time consuming. Making sense of the millions of data points means trying to make sure that the right data points are being analyzed, as both extraneous and missing information can hamper a researcher's work.
Disease by disease
"You have to think about what you're doing and where it would be relevant," Shendure says. "For Mendelian disorders, my general view is that right now, it's hard enough trying to interpret data in the exome. So I question what value the genome adds when the first thing people do is focus their analysis on the exome."
However, he adds, there is value in whole-genome sequencing for cancer, particularly for those types where structural variation is both pervasive and highly relevant.
Elaine Mardis, co-director of the Genome Institute at Washington University in St. Louis, is a proponent of whole-genome sequencing in cancer research. "What I worry about fundamentally is: 'What are you missing?'" Mardis says. "People tend to think about the liability of finding things. I worry about the liability of missing things and what that means for a cancer patient when they find out that the test they took, or had their sample assayed by, was a little bit on the inadequate side in terms of the sensitivity and ability to pick up alterations that we know happen in the genome that are above and beyond simple point mutations."
Even for more common cancers that have been studied extensively, whole-genome sequencing can yield more useful information than whole-exome sequencing, Mardis says — especially since researchers are finding that no matter how much is known about a specific kind of cancer, each patient's disease is different.
"It's a really interesting debate right now because a lot of companies are coming out with very focused gene tests," Mardis says. But while there are obvious benefits to taking such an approach — lower cost, simplified analysis — and the tests are reasonably powered to find point mutations, there is value in the comprehensive nature of the whole-genome approach, she adds.
Researchers at The Cancer Genome Atlas, in which Mardis also participates, made a statistical calculation to determine how many tumors need to be sequenced before a comprehensive picture emerges of a certain kind of cancer. "This calculation got done and the magic number was 500, which is a nice round number," Mardis says. "But the more cancers we sequence of the same type, the more tumors we sequence, the more I think that number is infinity. Because every patient's tumor is different." Although it would be nice to think that it might be possible to design the perfect target-capture test for different kinds of cancer, she adds, the larger the sample size in the statistical calculation becomes, the more that possibility becomes "a pipe dream."
In the near term, there is useful information to be garnered from a targeted approach to cancer research, but "in the fullness of time, we'll realize how limited they actually were," Mardis says. "I'm not saying it's not worth doing, but the faster we can get to whole-genome methods, the more comprehensive we'll be at characterizing every patient and being able to assign the best strategy based on that."
For Stanford University's Michael Snyder, each approach has its own utility. "If it's a smaller number of samples or projects where we really want to look at copy-number, then we do whole-genome," he says. "Whole-genome is better for getting regulatory sequences and copy-number information than exome, but exome digs deeper into the exome itself and you get a better call on the variants."
Like many of his colleagues, Snyder assumed that once whole-genome sequencing started to get cheaper, exome sequencing would slowly disappear from use. He no longer believes that to be true, however, as exome sequencing can pick up on variants that whole-genome sequencing often misses. "That's because of the increased depth of sequencing," he says. However, Snyder adds, "when we have fewer genomes, we do whole-genome to get a better picture. Or if there's some reason for thinking intuitively that there may be copy-number changes, then we do whole-genome."
Time and money
The cost of the two approaches, whether in time or money, has also become part of the equation for researchers trying to decide which to perform. But even those gaps are narrowing. UW's Shendure says the difference in the time it takes to sequence one genome compared to one exome is shrinking. He argues that, in some cases, a whole genome could even take less time to do than an exome.
"If I wanted to sequence a genome today, let's say I wanted to do a 30x whole genome, I prep a library, I can do that in a couple of hours, and I throw it onto a HiSeq, let's say 30x coverage, so I throw it onto three lanes, that takes 10 days," Shendure says. "If I do an exome, I've got to account for my exome, which adds some time. But I've still got to put it on the same HiSeq, so it's still going to take 10 days."
The actual cost of doing the sequencing is also falling. Shendure says it is interesting that the cost difference between the two methods is now being discussed in terms of a few thousand dollars. "It used to be 10 to one. Now we're talking about three to one, or four to one — so it's definitely changed," he says.
However, enough of a gap remains that exome sequencing is still the more economical choice for researchers studying many genomes at once. For Snyder, studying common Mendelian diseases means trying to get as many participants in the study as possible, both affected and unaffected. And that, in turn, means doing exome sequencing, at least on the first pass, because of the reduced cost. The actual difference in cost between the two is a couple of thousand dollars. However, Snyder says, "the price difference is still big enough when you add it up over multiple individuals. Most people pay less than $4,000 a genome, and for exome it costs us somewhere on the order of $1,000 to $1,200. So that does add up."
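To make Snyder's point concrete, here is a rough sketch of how the per-sample gap scales with cohort size. The per-sample figures ($4,000 per genome, treated here as an upper bound, and $1,000 to $1,200 per exome) come from his quote; the cohort size of 500 and all of the names in the code are illustrative assumptions, and downstream analysis costs are left out.

```python
# Per-sample prices taken from Snyder's quote; the genome figure is an upper bound.
GENOME_COST = 4000
EXOME_COST_LOW = 1000
EXOME_COST_HIGH = 1200

# Hypothetical cohort size, chosen only for illustration.
COHORT_SIZE = 500


def sequencing_cost(n_samples, cost_per_sample):
    """Total sequencing cost for a cohort, ignoring analysis and interpretation."""
    return n_samples * cost_per_sample


genome_total = sequencing_cost(COHORT_SIZE, GENOME_COST)
exome_total_low = sequencing_cost(COHORT_SIZE, EXOME_COST_LOW)
exome_total_high = sequencing_cost(COHORT_SIZE, EXOME_COST_HIGH)

# Extra spend for choosing whole-genome over exome across the whole cohort.
extra_min = genome_total - exome_total_high
extra_max = genome_total - exome_total_low

print(f"Whole-genome total: ${genome_total:,}")
print(f"Exome total: ${exome_total_low:,} to ${exome_total_high:,}")
print(f"Extra cost of whole-genome: ${extra_min:,} to ${extra_max:,}")
```

Under these assumptions a difference of under $3,000 per sample grows to well over a million dollars across the cohort, which is why exome sequencing remains attractive for a first pass on large studies.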
This cost estimate does not include the time and money it takes to analyze and interpret the data once it has been sequenced. "These days we spend more money on the downstream analysis," Snyder says. "There are two issues with interpretation — one is calling all the variants, which we now have automated. And then there's looking at the variants to see what they might mean in terms of the biological effects, and that is not automated. It's very time consuming. That's clearly the bottleneck, whether it's exome or whole-genome."
WashU's Mardis adds it is easy to be overwhelmed with data if the analysis pipeline isn't rapid and well organized. "That only gets worse the more data you're producing," she says.
Another layer in the cake
Despite any disagreements researchers may have on when and for what purpose to use whole-genome or whole-exome sequencing, many agree that multi-pronged approaches have utility for basic and clinical research.
For some projects, where he and his team want to make sure they get every bit of information possible, Snyder says they often combine whole-genome and whole-exome sequencing in the same study. "There are some situations, especially in clinical research, where we'll do both exome and whole-genome because it gives us much better coverage of critically important regions as well as lets us find all the copy-number stuff," he says.
But it is not just the two different methods that Snyder combines. In addition, he likes to use two different platforms at once — Complete Genomics' and Illumina's, for example — because the technologies are different, and can produce different kinds of errors. "There are a lot of practicalities, a lot of advantages, to using two different technologies that are very different to capture different information," Snyder says. "If you just do one genome with just one platform, you spend a lot of time trying to verify if these variants are right or not. By adding an exome or a different technology, you really do reduce the time spent looking at variants a lot, and I think you wind up saving money in the long run that way. I'm a big advocate of multiple technologies, whether it's whole-genome plus exome, or two different whole-genome technologies."
For cancer research, Mardis and her colleagues have found that RNA-sequencing analysis is an important addition to whole-genome analysis, as it not only shows the researchers which genes are mutated, but also which of the mutated genes are actually expressed by the tumor. "I think you'd be a little surprised at how many genes are mutated, yet not expressed," Mardis says. "And that means either the gene is not expressed at all by the tumor, or the mutated allele is not expressed, maybe for reasons that we can't see from that genome sequencing or that RNA-seq."
This kind of multi-dimensional research approach can have direct ramifications for how patients are treated — namely, in determining if a target is druggable, or if it is a mutation that does not lead anywhere. "I worry a little bit about that missing piece, because most of the commercially available tests that I'm aware of look at the DNA level only, so that's just another piece of information that I'm becoming more convinced over time that you need to have," Mardis says.
Shendure says he believes there is value in combining whole-genome and whole-exome sequencing to create a kind of hybrid approach for certain kinds of research. "People here have talked about low-coverage genome supplemented with exome so you've got perhaps enough information to do significant copy-number analysis, supplemented by high-coverage sequencing in the coding region," he says, adding, "We'll probably need a new word for that."
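As a back-of-the-envelope sketch of why such a hybrid might be attractive, the snippet below compares the raw sequencing output needed for a 30x whole genome with a low-coverage genome plus a deeply sequenced exome. The 1.5 percent exome fraction and the 30x genome come from the article; the genome size of roughly 3.1 billion bases, the 5x and 100x coverage values, and the function name are assumptions chosen for illustration, and the calculation ignores capture inefficiency such as off-target reads.

```python
GENOME_SIZE_BP = 3.1e9   # approximate human genome size (assumption)
EXOME_FRACTION = 0.015   # "about 1.5 percent of the whole", per the article

# Coverage values: 30x is from Shendure's example; 5x and 100x are hypothetical.
FULL_GENOME_COVERAGE = 30
LOW_GENOME_COVERAGE = 5
DEEP_EXOME_COVERAGE = 100


def bases_needed(target_size_bp, mean_coverage):
    """Sequenced bases required to reach a mean coverage over a target region."""
    return target_size_bp * mean_coverage


exome_size_bp = GENOME_SIZE_BP * EXOME_FRACTION

full_genome = bases_needed(GENOME_SIZE_BP, FULL_GENOME_COVERAGE)
hybrid = (bases_needed(GENOME_SIZE_BP, LOW_GENOME_COVERAGE)
          + bases_needed(exome_size_bp, DEEP_EXOME_COVERAGE))

ratio = hybrid / full_genome
print(f"30x whole genome: {full_genome / 1e9:.1f} Gb")
print(f"Hybrid (5x genome + 100x exome): {hybrid / 1e9:.1f} Gb")
print(f"Hybrid needs {ratio:.0%} of the whole-genome output")
```

With these illustrative numbers the hybrid design needs only a fraction of the output of a full 30x genome while still giving deep coverage of the coding regions, which is the trade-off Shendure describes.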