NEW YORK (GenomeWeb) – One of the talks at the American Association for Cancer Research annual meeting in April described a rules-based computational algorithm for ranking and prioritizing cancer somatic variants by clinical and biological relevance.
The so-called Precision Heuristics for Interpreting the Alteration Landscape (PHIAL) is one of several informatics applications being used to analyze data in the Broad's Clinical Research Sequencing Platform (CRSP), a clinical lab set up by the institute to provide CLIA-regulated genetic testing.
PHIAL's lead developer, Eliezer Van Allen, an instructor in medicine at Harvard Medical School and a clinician at Dana Farber Cancer Institute, presented the tool during a session at AACR centered on methods and necessary considerations for doing prospective cancer genotyping in clinical contexts. Details of the method and its application were previously published last year in a technical report in Nature Medicine, which laid out an approach for prospectively generating and interpreting whole exome sequence from formalin-fixed, paraffin-embedded tumor samples in clinical environments. The research study was conducted by investigators at Dana Farber, the Broad Institute, and elsewhere.
In his AACR talk, Van Allen discussed the challenges of interpreting data from whole exome sequence in clinical contexts where the emphasis is on a single patient rather than large cohorts. "What we need to do in this context is actually deeply look at one cancer patient's whole exome sequencing data at both the tumor and the germline [data] and figure out what [in these data] might be useful from a clinical perspective," he said.
That goes beyond simply identifying and linking variants to their respective genes to actually ranking lists of variants based on clinical actionability and then using that information to try to guide clinical care — exactly what PHIAL was designed to provide, Van Allen said.
As explained in the Nature Medicine paper, the tool uses a rules-based approach to rank alterations such as mutations, short insertions and deletions, and copy number alterations based on prior knowledge about clinical and biological significance gleaned from scientific literature, manually curated information, and expert opinion as well as information from commonly used repositories such as the Catalogue of Somatic Mutations in Cancer, Van Allen said.
This information is stored in the database of Tumor Alterations Relevant for Genomics-driven Therapy (TARGET), a public repository created for PHIAL that contains cancer-linked genes with potential therapeutic, prognostic, and diagnostic implications for cancer patients, the researchers wrote. PHIAL works by matching and ranking input mutations to the information contained in TARGET and returns information such as whether a variant has been reported to have an association with a specific therapy or what sorts of therapies might be more appropriate for one patient over another, he said. For example, a patient with a HER2 amplification might be treated with a HER2 inhibitor, while another patient with a HER2 deletion would receive a different treatment.
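The match-and-rank idea described above can be illustrated with a minimal Python sketch. This is not PHIAL's actual rule set or the TARGET schema: the `Variant` class, the `TOY_TARGET` table, and the three-level scoring are all assumptions made for illustration, and the only curated entry is the HER2 amplification example from the paragraph above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Variant:
    gene: str
    alteration: str  # e.g. "amplification", "deletion", "missense"


# Hypothetical stand-in for a curated knowledge base (NOT the real TARGET data).
# Maps gene -> alteration type -> free-text therapeutic note.
TOY_TARGET: Dict[str, Dict[str, str]] = {
    "HER2": {"amplification": "HER2 inhibitor"},
}


def score(variant: Variant, table: Dict[str, Dict[str, str]]) -> Tuple[int, Optional[str]]:
    """Assumed scoring rule: 2 = gene and alteration type both curated,
    1 = gene curated but this alteration type is not, 0 = gene not curated."""
    entry = table.get(variant.gene)
    if entry is None:
        return 0, None
    if variant.alteration in entry:
        return 2, entry[variant.alteration]
    return 1, None


def rank(variants: List[Variant],
         table: Dict[str, Dict[str, str]] = TOY_TARGET) -> List[Tuple[Variant, int, Optional[str]]]:
    """Return (variant, score, note) tuples, highest score first.
    sorted() is stable, so ties keep their input order."""
    scored = [(v, *score(v, table)) for v in variants]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    calls = [
        Variant("GENE_X", "missense"),      # placeholder gene, not curated
        Variant("HER2", "deletion"),        # gene curated, alteration is not
        Variant("HER2", "amplification"),   # full match
    ]
    for variant, points, note in rank(calls):
        print(variant.gene, variant.alteration, points, note)
```

In this toy version the HER2 amplification rises to the top with its associated note, the HER2 deletion is flagged as a known gene without a matching curated alteration, and the uncurated gene falls to the bottom. A real system would draw on far richer rules and evidence than a single lookup.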
Van Allen and his colleagues are currently working on improvements to PHIAL and hoping to get the tool adopted and used more broadly, he said. Planned improvements include enabling the algorithm to integrate exome and RNA-sequencing data, as well as doing a better job of integrating somatic and germline genetics. They are also working on keeping the TARGET database up to date. One of the ways they're trying to do that is through TumorPortal, a Broad-led effort to crowdsource variant annotations for cancer genes and mutations from the cancer research community, he said.
Informatics for the Broad's CRSP
PHIAL is one of several tools in the analysis pipelines that are used to explore and make sense of data generated by the Broad's CRSP laboratory. Niall Lennon, director of genomics R&D and clinical development at the Broad, told GenomeWeb that the CRSP informatics pipeline leverages many of the same tools, expertise, and best practices used by the Broad's research side that have now been validated for use in clinical contexts per CLIA requirements.
CRSP is somewhat unlike traditional clinical laboratories because it isn't attached to a hospital, which played a role in determining what sorts of services the lab intends to provide, Lennon said. From the beginning, "we decided that we were going to be a little bit different in that we were going to do a mainly technical product [meaning] that we do the sequencing and then we bring it through our variant calling pipelines and … deliver the variants from the sample, but we stop short of doing a clinical interpretation and giving a clinical report to a physician," he said.
There was also an element of wanting to leverage existing resources and not duplicate efforts. "One thing we were cognizant of was that there were [already] many groups who were building out the capabilities to do the interpretation part [and] a lot of those groups exist within our ecosystem," he said. Partner institutions like Massachusetts General Hospital and Dana Farber already have their own advanced molecular pathology and variant interpretation capabilities, for example. Moreover, since the Broad is not a hospital and does not have on staff the sort of experts needed to evaluate the clinical details of patient cases, it made sense to focus initially on providing services that utilize its existing strengths on the research side.
"As we mature ... there are smaller cases where we could see ourselves adding, if not the entire clinical interpretation, at least more information that gets closer to that," Lennon said.
As an example, the Broad is working with the Carlos Slim Foundation on a mass spectrometry-based test for a mutation that causes a rare kidney disease. Scientists at the Broad and MGH discovered the causal variant for the disease, a cytosine duplication, in a stretch of DNA that is very difficult to sequence. Based on this discovery, a test was developed that determines whether a patient has the duplication. The test is research-use only for now but is being validated for clinical use in CRSP. As such, CRSP is working on an automated report that would be delivered to requesting physicians and would provide more detailed information about the presence of the duplication in the sample; the doctor does the final evaluation and signs off on the report. "It allows us to provide more meaningful results but stops short of full clinical interpretation," Lennon said.
The lab currently offers exome sequencing using an Illumina HiSeq 2500 instrument. Turnaround time for one exome is three weeks, including delivery of the VCF files. One informatics tool used within CRSP is a custom-designed laboratory information management system that handles sample accessioning through loading the sample onto the sequencing instrument, and provides mechanisms through which clients place orders and retrieve VCF files post-analysis, Lennon said. On the actual analysis side there are two sets of pipelines depending on what sort of exome sequencing the customer wants done: whole-exome sequencing with germline analysis or whole-exome sequencing with somatic analysis for matched tumor-normal samples.
For the germline analysis, the lab uses a dedicated instance of Picard tools to manage sequencing metrics and tasks such as duplicate marking, and then the data goes into the Genome Analysis Toolkit HaplotypeCaller, which generates the final output, Lennon said. The pipeline for the somatic side uses elements of the Broad's Firehose suite, as well as tools such as MuTect for calling SNVs and Indelocator for insertions and deletions, he said. This pipeline also includes tools developed by collaborators at associate laboratories, such as Levi Garraway's lab at Dana Farber, where PHIAL was initially developed. Planned tools for this pipeline include one being developed in Gad Getz's lab at the Broad for identifying and filtering out tumor-normal contamination.
Getting its research tools ready for clinical use didn't require any fundamental changes to the algorithms themselves, Lennon said. Much of the work revolved around ensuring the robustness and reproducibility of pipelines as well as providing appropriate documentation including information about what versions of software are used to analyze data, how the tools run, sensitivity and specificity results, and so on. "That's where a lot of effort goes in — porting tools over from the research pipelines," he said.
In addition to its exome sequencing offering, CRSP is also involved in partnership projects with local pharmaceutical companies that want to access and use its pipelines and tools for their clinical trials. Last year, Lennon told GenomeWeb, CRSP was involved in a clinical trial with a local pharma company that is creating and validating a 700-gene cancer panel. The study involved data from a cohort of around 600 patients and about 1,200 samples.