For most researchers chipping away at deciphering disease susceptibility, ramping up the pace at which new knowledge is brought to the clinic is the primary goal.
Last year, Greg Cooper, an acting assistant professor in the Department of Genome Sciences at the University of Washington, and his colleagues contributed data to a study conducted by the International Warfarin Pharmacogenetics Consortium that used genotyping technologies to help clinicians improve how they determine the correct dosage of warfarin, an anti-coagulant. Two genes, CYP2C9 and VKORC1, affect warfarin dosage; collectively, they predict about a third to 40 percent of dose variation. "It's a very potent drug that prevents strokes and unwanted blood clots, but it can also cause severe bleeding side effects. You have to get the dose just right and part of that is genetic variation that controls what your dose should be," Cooper says. "There's a big clinical trial going on right now to determine if the genetic testing is worth it, in terms of balancing out the cost of genotyping relative to the 'Does it prevent bleeding consequences and does it get people the right dose quickly and effectively?' That's a bellwether; if it doesn't work there it's pretty disheartening because those are two of the largest effects of the common variant on a common trait."
Cooper also works closely with Evan Eichler's lab at the University of Washington on studies that attempt to elucidate the causes of childhood neurological disorders. While Cooper's personal focus is using SNP assays to infer copy number status using a mixture of both genome-wide SNP arrays and customized SNP panels, he says that CGH arrays are an invaluable part of the analysis pipeline. "A lot of the data that I do analysis on, people in the Eichler lab follow up on and do CGH array analysis — the advantage being that it's a lot cheaper to do custom chips. We can order a 2 million probe NimbleGen array or 1 million probe Agilent array, and get it in a month," Cooper says. "It's not ridiculous to customize, whereas with SNP arrays you basically get fixed content."
Cost and throughput level are two of the main challenges to studying the role of copy-number variations in disease. To detect rare CNVs, researchers need to round up large patient populations.
"There are a lot of CNVs in the human genome that are embedded in these more complex blocks of duplicated sequence where you have four, five, six or more copies, and distinguishing those kinds of discrepancies is a lot harder than the zero, one, two category," Cooper says. "That's one major limitation of a lot of SNP arrays in general, and even CGH arrays can struggle to differentiate that kind of copy number."
To address these issues, Cooper and Eichler recently used an adapted Illumina BeadXpress SNP genotyping assay along with an algorithm called SCOUT, for SNP-Conditional OUTlier detection, to quickly identify both rare and common CNVs in large cohorts.
They tested their approach using samples from more than 1,000 children suffering from unexplained intellectual disability. Using this approach, they were able to diagnose roughly 3 percent of the children in the study by identifying a CNV known to be pathogenic, as well as two other CNVs thought to be involved with disease. "The upshot is that we now have an assay that readily scales to large numbers that is affordable, and that is extremely accurate for rare variants," Cooper says. "The downside is that we don't analyze the whole genome, but because of our strategy behind selecting loci, we can target loci that are enriched for known or suspected CNVs and pick up many pathogenic events."
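The idea of outlier detection conditioned on SNP genotype can be illustrated with a minimal sketch. It is not the published SCOUT algorithm, whose details are not given here; the function name, the input layout, the median/MAD robust z-score, and the 3.5 cutoff are all assumptions chosen for illustration. Samples are grouped by genotype call at a single SNP, and any sample whose probe intensity sits far from the rest of its genotype group is flagged as a possible copy-number change.

```python
from statistics import median


def flag_intensity_outliers(samples, threshold=3.5):
    """Flag samples whose intensity is unusual for their genotype group.

    samples: list of (sample_id, genotype, intensity) tuples for ONE SNP.
    threshold: hypothetical cutoff on the robust z-score (an assumption).
    Returns a list of (sample_id, genotype, robust_z) for flagged samples.
    """
    # Group intensities by genotype call, so each sample is compared
    # only with samples that share its genotype.
    by_genotype = {}
    for sample_id, genotype, intensity in samples:
        by_genotype.setdefault(genotype, []).append((sample_id, intensity))

    flagged = []
    for genotype, members in by_genotype.items():
        values = [value for _, value in members]
        center = median(values)
        # Median absolute deviation: a spread estimate that a few
        # extreme samples cannot inflate.
        mad = median([abs(value - center) for value in values])
        if mad == 0:
            continue  # no spread in this group, nothing to score against
        for sample_id, value in members:
            robust_z = 0.6745 * (value - center) / mad
            if abs(robust_z) > threshold:
                flagged.append((sample_id, genotype, round(robust_z, 2)))
    return flagged


# Made-up intensities for illustration only; s6 is unusually low.
example = [
    ("s1", "AA", 1.00), ("s2", "AA", 1.10), ("s3", "AA", 0.90),
    ("s4", "AA", 1.05), ("s5", "AA", 0.95), ("s6", "AA", 0.40),
    ("s7", "AB", 1.00), ("s8", "AB", 1.02), ("s9", "AB", 0.98),
    ("s10", "AB", 1.01), ("s11", "AB", 0.99),
]
flagged = flag_intensity_outliers(example)
```

A real assay would combine evidence across neighboring SNPs before calling a CNV; a single-SNP flag like this one is only a starting point.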
Array primer extension
Scott Tebbutt, a principal investigator and assistant professor at the University of British Columbia's James Hogg iCapture Centre, has been hard at work on a medium-throughput, microarray-based genotyping method using array primer extension that he says greatly improves genotyping accuracy. The design uses multiple genetic probes for each SNP together with statistical analysis algorithms to convert the data into a high-quality genotype call. To further improve accuracy, Tebbutt and his team tweaked a PCR design to reduce genotyping error caused by sporadic allelic dropout. They are demonstrating the effectiveness of their approach in a study in progress on atherosclerosis using plasma samples that are several years old.
By extending the PCR multiplexing to around 90-plex, they were able to amplify hundreds of patient plasma samples that hadn't been pre-treated prior to storage. "That was interesting to my colleagues who study atherosclerosis because they have a whole bunch of plasma samples from years ago, before genetic association tests became commonplace, but they saved the plasma," Tebbutt says. "It's generally known now that there are fragments of DNA circulating in plasma, so they had all these samples and we showed them that we could do highly multiplexed PCR amplification of DNA from plasma samples."
He adds: "They decided to set up a genetic association study using about 1,000 DNA samples from atherosclerosis patients. … It looks at about 90 SNPs and at atherosclerosis pathways to see if there's any genetic variation there."
Ultimately, Tebbutt and his colleagues would like to transfer these experimental improvements into the healthcare setting to ramp up the speed of genetic testing. Speed is particularly important in the emergency room, where a patient might need to be genotyped immediately for not just one or two SNPs but hundreds of different SNPs across many different genes to find out what drug he or she should be given. "You need that info within an hour or less, and there doesn't seem to be anything out there that can do that because the technology is still lacking. If someone comes into the clinic and they want a genetic test or the clinic needs a genetic test for certain genes while the patient is in the emergency room, you need a very fast turnaround test in multiplex, so we're interested in trying to make the PCR even faster," Tebbutt says. "When you try and translate it to clinical applications, you're up against the clinical chemistry, the politics of medicine, plus the underlying science as well."
Next-generation sequencing is also enabling researchers to elucidate previously intractable diseases and make a serious impact that will hopefully aid clinicians someday. Physician-scientist Stephen Kingsmore, president and CEO of the National Center for Genome Resources, says that an area where next-gen sequencing has demonstrated a good proof-of-concept is with Mendelian diseases where there are only a few families and consequently not enough information to do linkage analysis to find gene targets. "If you had the families, it would be pretty simple, but you just don't have enough affected cases that are related," Kingsmore says. "Brute force genome sequencing has shown to be pretty effective at finding the causes of those Mendelian diseases, so that's going to be an exciting area over the next five years where we can look at 2,000 or 3,000 Mendelian diseases for which we don't yet know the gene."
While it is possible to get SNP genotypes out of genome sequence data with next-gen sequencing, it requires sufficient coverage and it is not always easy to call variations such as indels. There is also a bioinformatics lag when it comes to alignment algorithms for detecting those indels. "For SNP calls, I think as long as you have 30- to 40-fold coverage, you're in good shape," Kingsmore says. "A lot of the algorithms don't yet detect indels and with more difficult things like complex rearrangements, we are still looking for solutions. For clinical applications, you can't get it wrong; the gold standards are very well established. So there's quite a ways to go before the bioinformatics and the quality of the sequence data is mature, robust, and validated enough for the clinic."
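To put the 30- to 40-fold figure in concrete terms, mean coverage is simply the total number of sequenced bases divided by the genome size. The short sketch below works through that arithmetic. The genome size of roughly 3.1 billion bases and the 100-base read length are assumptions for illustration, not figures from Kingsmore, and the calculation ignores uneven coverage, duplicate reads, and reads that fail to align.

```python
import math


def mean_coverage(num_reads, read_length_bp, genome_size_bp):
    """Average number of times each base is sequenced."""
    return num_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage, read_length_bp, genome_size_bp):
    """Smallest whole number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# Assumed values for illustration only.
GENOME_SIZE_BP = 3_100_000_000  # approximate human genome size
READ_LENGTH_BP = 100            # hypothetical short-read length

reads_for_30x = reads_needed(30, READ_LENGTH_BP, GENOME_SIZE_BP)
reads_for_40x = reads_needed(40, READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"30-fold: about {reads_for_30x:,} reads")
print(f"40-fold: about {reads_for_40x:,} reads")
```

Under these assumptions, the target works out to roughly a billion reads per genome, which helps explain why coverage remains a practical constraint.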
When it comes to doing genotyping in the clinic, sometimes there can be too much of a good thing, Kingsmore says, especially when it comes to all the data generated by sequencing technology. "I think there is currently relatively little clinical utility in a whole-genome sequence because, for the vast majority of it, we don't know what to do with that information and from a clinician's standpoint, that is deleterious," he says. "You really want information that's actionable, you want to understand what the information means, and so throughout medical genetics, there's this phenomenon of variants of unknown significance, and they cause an awful lot of consternation. What do you do with these? Even if you are doing Sanger sequencing and you're sequencing the cystic fibrosis locus, and you come across what you think is a mutation that might be important but you're not sure, how do you report that to a patient? You don't want to raise fears."