SAN FRANCISCO — Despite the falling prices of next-generation sequencing, targeted panels still have clinical utility, according to Heidi Rehm, director of the Laboratory for Molecular Medicine at the Partners HealthCare Center for Personalized Genetic Medicine.
At the Clinical Genome Conference here this week, Rehm presented examples of cases where it made more sense to use a targeted panel as well as cases where going straight to whole-genome sequencing was appropriate. Additionally, in collaboration with the creators of PhenoDB, she is working on developing a matchmaking system for unsolved exome and whole-genome sequencing cases, to address interpretation challenges.
The LMM has been one of the early adopters of clinical sequencing. It is developing a diagnostic whole-genome sequencing pipeline (see CSN 6/15/2011), and offers an array of targeted panels including a 51-gene Pan Cardiomyopathy panel, a 71-gene hearing loss panel called OtoGenome, and a 57-gene Pulmogene Panel for respiratory disorders.
In her presentation, Rehm said that technology has not yet improved to the point where exome or whole-genome sequencing can supplant targeted panels. For instance, exome kits miss between 5 percent and 15 percent of coding sequences, she said. Additionally, variant calling algorithms still struggle with indels, copy number variants, structural variants, and repetitive regions.
Rehm said that the LMM is able to confirm only around 85 percent of called indels with Sanger sequencing. Because of this high false-positive rate and because the capture kits miss a significant number of bases, for LMM's panels it fills in "every base that's missed by next-gen sequencing with [Sanger] confirmation," which is "not feasible to do at the level of an exome," she said.
Rehm highlighted several cases that illustrate the advantages and limitations of panels and whole-genome sequencing. In one case, the LMM saw a child with nonsyndromic hearing loss. Testing for known causes came back negative, so the laboratory opted to try whole-genome sequencing. The patient and affected siblings were all sequenced as part of a research study. Sequencing identified a four-base insertion in the gene OTOP1, which is a gene involved in hearing loss. But closer evaluation found that the insertion was actually a false positive due to a misalignment of the reads.
Further analysis of the family's history with hearing loss found that the disorder followed an autosomal dominant model as opposed to recessive, as originally thought. That opened up the analysis and implicated 32 rare, novel variants across 31 genes, but "there was no strong evidence for any of them," Rehm said. The team decided to do a linkage analysis and identified a gene, STRC. The gene also has a pseudogene, which makes it difficult to analyze with exome or whole-genome sequencing, she said.
Re-analyzing the patient with the LMM's OtoGenome test, which includes copy number analysis, identified a 1-megabase deletion spanning the STRC gene that knocked out four genes. The deletion was present in all siblings. Aside from being causative for hearing loss, the deletion also knocked out the gene CATSPER, which is associated with infertility in men, a finding that the lab was able to report back to the family.
"In this case, we shouldn't have done whole-genome sequencing, but should have started with a targeted approach," Rehm said. "But, in other cases, you should start with whole-genome sequencing."
For example, she said the LMM saw a patient with distal arthrogryposis type 5, a disease known to be autosomal dominant and to frequently occur de novo. Neither of the affected child's parents had the disorder, providing evidence that the causative mutation had in fact arisen de novo. In this case, it made sense to sequence the trio and look for coding sequence variants present in the child, but not the parents, Rehm said.
Indeed, sequencing turned up two candidate genes. The gene ACSM4 was ruled out because the variant found in it turned out to be common in the population. The second gene, PIEZO2, had no known clinical association with any phenotype. "It remained a great candidate, but how do you prove causality in a novel gene when it's only in one case?" Rehm said.
Luckily, it turned out that another group had identified a patient with the same disorder and the same variant. "But that was serendipity," Rehm said. "We need to approach this in a more robust way."
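The trio analysis Rehm described, keeping variants that are present in the child but in neither parent and then discarding those that are common in the population, can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the LMM's pipeline: the `Variant` type, the `de_novo_candidates` function, the 1 percent frequency cutoff, and every gene name, coordinate, and frequency below are hypothetical placeholders.

```python
from typing import Dict, Iterable, List, NamedTuple


class Variant(NamedTuple):
    """A minimal variant record (hypothetical structure for illustration)."""
    gene: str
    chrom: str
    pos: int
    ref: str
    alt: str


def de_novo_candidates(
    child: Iterable[Variant],
    mother: Iterable[Variant],
    father: Iterable[Variant],
    population_freq: Dict[Variant, float],
    max_freq: float = 0.01,  # assumed cutoff, not a value from the talk
) -> List[Variant]:
    """Return child variants absent from both parents and rare in the population."""
    parental = set(mother) | set(father)
    return [
        v for v in child
        if v not in parental                          # not inherited
        and population_freq.get(v, 0.0) <= max_freq   # not a common polymorphism
    ]


# Placeholder data: none of these genes, positions, or frequencies are real.
inherited = Variant("GENE_A", "1", 1000, "A", "G")
common_new = Variant("GENE_B", "2", 2000, "C", "T")
rare_new = Variant("GENE_C", "3", 3000, "G", "A")

candidates = de_novo_candidates(
    child=[inherited, common_new, rare_new],
    mother=[inherited],
    father=[],
    population_freq={common_new: 0.20},
)
print([v.gene for v in candidates])
```

A filter like this only narrows the list. As the PIEZO2 example shows, a surviving candidate in a gene with no known disease association still needs independent cases before causality can be claimed.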
Exome Matchmaking
Rehm said the LMM is working to address the interpretation problems that arise when there is not enough data to declare a mutation pathogenic. One approach is a matchmaking system for unsolved exome and whole-genome sequencing cases, which Rehm's group is developing in collaboration with Ada Hamosh, professor at the Institute of Genetic Medicine at Johns Hopkins University, who developed the PhenoDB tool for collecting phenotype information.
The system is intended to help in cases where a diagnostic exome or whole-genome sequencing test has not yielded a result for the patient. Rather than leaving those cases to "sit stagnant in our lab with no ability to make a conclusive argument" for a diagnosis, the goal is to enable data sharing that would match them with other cases that have similar phenotypes, candidate genes, or variants.
The system will use an algorithm to "enable matches to cases submitted through the system," Rehm explained during her presentation. It will allow for structured phenotype information to be entered, and it will include candidate genes and variants and even VCF files, she said. If a match is found, both parties will be notified.
"This will be critical, given how rare certain cases are, that we have ways to match up these rare candidate associations," Rehm said.
Rehm said that the project is still in the early development phases, so she does not have a timeline for when it will launch. Additionally, she would like to be able to integrate other databases, to "interface whatever phenotype system laboratories are using."
Starting with PhenoDB makes sense because the National Institutes of Health's Centers for Mendelian Genomics laboratories — at the University of Washington, Baylor College of Medicine, Johns Hopkins, and Yale — are all depositing data into it, she said.
The system will scour the database for similar phenotypes and candidate gene information in order to find cases that are likely the same. In designing such an algorithm, Rehm said it is necessary to assign priority to different phenotypes and candidate genes. For example, nearly everyone has variants in the titin gene (TTN), so a shared candidate in that gene would be a lower priority. But a gene with very little sequence variation would score higher, she said. Similarly, with phenotypes, a common phenotype like intellectual disability would have a lower score than something much rarer.
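The prioritization Rehm outlined, in which overlap on a rare phenotype or a rarely varying gene counts for more than overlap on a common one, might look something like the following Python sketch. The matchmaking algorithm itself was not described in detail, so everything here is an assumption made for illustration: the `Case` structure, the negative-log weighting in `rarity_weight`, the default frequency, and all of the frequency values, which are invented placeholders rather than measured data.

```python
import math
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Case:
    """An unsolved case as submitted for matching (hypothetical structure)."""
    case_id: str
    phenotypes: FrozenSet[str]
    candidate_genes: FrozenSet[str]


def rarity_weight(frequency: float) -> float:
    """Weight a feature by -log10 of its assumed background frequency."""
    return -math.log10(max(frequency, 1e-6))


def match_score(
    a: Case,
    b: Case,
    phenotype_freq: Dict[str, float],
    gene_variant_freq: Dict[str, float],
    default_freq: float = 0.01,  # assumed fallback for unlisted features
) -> float:
    """Sum rarity weights over the phenotypes and candidate genes two cases share."""
    shared_phenotypes = a.phenotypes & b.phenotypes
    shared_genes = a.candidate_genes & b.candidate_genes
    score = sum(rarity_weight(phenotype_freq.get(p, default_freq))
                for p in shared_phenotypes)
    score += sum(rarity_weight(gene_variant_freq.get(g, default_freq))
                 for g in shared_genes)
    return score


# Invented placeholder frequencies, chosen only to show the ordering.
phenotype_freq = {"intellectual disability": 0.1, "distal arthrogryposis": 0.0001}
gene_variant_freq = {"TTN": 0.9, "PIEZO2": 0.001}

common_a = Case("A", frozenset({"intellectual disability"}), frozenset({"TTN"}))
common_b = Case("B", frozenset({"intellectual disability"}), frozenset({"TTN"}))
rare_a = Case("C", frozenset({"distal arthrogryposis"}), frozenset({"PIEZO2"}))
rare_b = Case("D", frozenset({"distal arthrogryposis"}), frozenset({"PIEZO2"}))

common_score = match_score(common_a, common_b, phenotype_freq, gene_variant_freq)
rare_score = match_score(rare_a, rare_b, phenotype_freq, gene_variant_freq)
print(f"common overlap: {common_score:.2f}, rare overlap: {rare_score:.2f}")
```

With weights of this kind, two cases that share only a highly variable gene and a common phenotype rank well below two cases that share a rare phenotype and a rarely varying gene. A real system would also need fuzzy phenotype matching and variant-level comparison, which this sketch leaves out.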
Thus far, in her own lab's experience with clinical next-gen sequencing, Rehm said interpretation has been the biggest hurdle.
Rehm said that of 3,000 hypertrophic cardiomyopathy cases, 66 percent of the variants that were identified as causative were found in only one case, and the lab is continuing to see about a 17 percent novel variant rate for the disorder.
The problem is even greater for hearing disorders, Rehm said. From 2,000 cases, 81 percent of the causative variants for hearing loss were found in only one family.
Interpretation of sequence variants is a major bottleneck, not only in terms of providing an accurate diagnosis, but also in terms of turnaround time. The LMM has been timing how long it takes to assess candidate causal variants. Variants for which there is no data in the public domain take around 22 minutes to assess. But if there are publications on the variant, it can take up to two hours to read through the literature to determine whether there is sufficient evidence for pathogenicity, and there is "little way to automate that process," she said.
During her presentation, Rehm highlighted several cases that illustrate the challenges with interpretation and problems with the current variant databases.
For instance, Rehm said that Sherri Bale from GeneDx previously presented on a case where a patient was diagnosed with Noonan syndrome. The report with the evidence for the variant pointed to a study from a reputable source. The patient contacted the author, who said that since publishing the study, he had found the variant in 7 percent of his controls and now thought that the variant was benign, but had no easy way to share that information.
Upon hearing this case, Rehm said she searched LMM's online database and found one case where that same variant was reported as pathogenic. It was the case of a woman whose prenatal nuchal translucency screen came back abnormal, prompting a test for Noonan syndrome. The test identified this variant and it was reported back to the patient as likely pathogenic. The patient ended up terminating the pregnancy. "We don't know all the reasons for the termination," Rehm said, "but it could have been our finding."
Others have found similar problems with incorrect variant classification. Stephen Kingsmore's group, previously from the National Center for Genome Resources, published a study in Science Translational Medicine of a next-gen sequencing-based carrier screening test it was developing and found that 27 percent of mutations reported in the literature as pathogenic were in fact common polymorphisms or misannotated.
Partially because of this problem, the group abandoned the test as a carrier-screening test and has since developed it as a newborn diagnostic test that screens for 600 inherited diseases, which it has launched out of Children's Mercy Hospital in Kansas City, Mo. (CSN 8/9/2011). The team is also developing a whole-genome sequencing protocol dubbed STAT-seq that will run on the HiSeq 2500 for newborns admitted to the neonatal intensive care unit (CSN 10/3/2012).
Rehm said that so far in LMM's whole-genome sequencing cases, which it is doing as part of the NIH-funded MedSeq project run by Robert Green's group at Brigham and Women's Hospital, the team finds an average of 25 putative pathogenic variants per case, but only 10 percent of them have enough data for pathogenic claims.
"It's a lot of work to get down to just a few variants," she said.