NEW YORK (GenomeWeb) – The Global Alliance for Genomics & Health (GA4GH) has released the second tier of BRCA variant data, integrating information on more than 13,000 variants across public repositories around the world.
The data release is part of the BRCA Challenge, a demonstration project of the GA4GH, which is an international coalition of more than 350 organizations aiming to build common tools and approaches for securely sharing genomics and clinical data. The group last October released the first tier of publicly accessible BRCA variant data, giving healthcare providers and members of the public the ability to see whether variants are pathogenic, benign, or not yet classified. Classifications included in the first tier of the BRCA Exchange web portal have been reviewed by a consortium of experts, called ENIGMA.
The second-tier data GA4GH has now released publicly is intended for researchers, genetic counselors, and "people who are more comfortable with the nuances of variant classification," Rachel Liao, the clinical working group manager for the BRCA Challenge, told GenomeWeb.
Tier 2 includes information from all the public databases that contain BRCA variant data, such as NIH's ClinVar, the Leiden Open Variation Database (LOVD), the Exome Aggregation Consortium, and the 1000 Genomes Project, and presents it in a consistent format. Researchers can now call up a listing for a specific variant within BRCA Exchange and see all the different references on it across databases, instead of downloading the data from individual repositories, which use different formats and data definitions and include varying data elements.
"This will allow the merging of data for the purpose of research and ultimately improve variant classification," Liao said. "The entire goal of the BRCA Challenge is to improve variant classification."
If a variant is not in the first-tier consensus portal, that likely means that the ENIGMA consortium hasn't vetted it yet. It doesn't necessarily mean that a consensus classification of that variant isn't known. "We recognize that patients and clinicians that don't identify a variant in the consensus portal will go into our research space to see the evidence," Liao said. "Obviously, you still have to make clinical decisions, even if consensus is not available."
BRCA1 and BRCA2 are highly variable genes. Many mutations in these genes are already well known to increase the risk of hereditary breast and ovarian cancer in carriers, but some mutations are exceedingly rare, making it challenging to confirm their links to cancer. GA4GH is hoping that the BRCA Exchange portal will become the "one-stop shop" for getting information on BRCA variants at a time when hereditary cancer testing has become one of the most competitive and crowded diagnostic markets.
Following the Supreme Court's 2013 decision to invalidate several of Myriad Genetics' patent claims on BRCA1/2 isolated gene sequences, competition increased in a space the company previously monopolized. Labs began competing on a number of fronts, including test price, turnaround time, and the accuracy of variant classifications. In particular, diagnostic companies began racing to reduce the number of tests that identified BRCA variants of unknown significance (VUS), which are markers with unclear links to disease.
But because Myriad has a proprietary database containing information from approximately 2 million patients it has tested, the firm has boasted that it makes more definitive pathogenic or benign calls than its competitors. Based on 2013 data, Myriad had the lowest VUS rate in the business, with 0.6 percent of tests identifying a VUS in BRCA1 and 1.6 percent in BRCA2. More recently, the company presented data claiming that its Pheno history-weighting algorithm was more than 99.5 percent accurate for upgrading and downgrading VUS to more definitive classifications.
In response, researchers, labs, and patient groups began encouraging data sharing and use of public repositories, essentially to break data silos. And using these resources, other labs are competing with Myriad. For example, researchers from the diagnostic firm Invitae and Stanford University compared BRCA1/2 results from Invitae's NGS panel test for patients who had previously received testing from Myriad and found 99.8 percent concordance. Invitae relies on non-proprietary data resources for its variant classification.
There are now more shared data resources dedicated to BRCA1/2 than ever before. The University of Utah, its Huntsman Cancer Institute, and its non-profit lab Arup last year launched an open-source variant database dedicated to these genes. Quest launched BRCA Share, agreeing to fund curation and functional studies to enhance BRCA1/2 variant classifications within the Universal Mutation Database (UMD), housed at Inserm, France's national institute of health and medical research. Ambry Genetics last month funded the sequencing of 10,000 cancer patients' exomes and publicly shared aggregate allele-frequency data.
These labs investing in public variant resources all compete in the hereditary cancer testing space and market tests that gauge BRCA mutations. Many of these labs, including Myriad, are also working with top cancer centers around the country to collect patients' variant information in a registry, called PROMPT, and improve understanding of pathogenic variants and VUS in hereditary cancers.
However, Myriad executives have said that public variant databases, as they exist today, shouldn't be used for clinical decision making. In a study published last year, Myriad compared pathogenic classifications of 116 BRCA variants across five databases — the Breast Cancer Information Core; LOVD; ClinVar; Inserm's Universal Mutation Database; and the Human Gene Mutation Database — and found that all the databases agreed on the classification for only four variants.
Those supportive of data sharing and public repositories have acknowledged that public repositories need to be cleaned up and updated regularly — and the second-tier data in GA4GH's BRCA Exchange shows disagreements in variant classifications between databases.
"That is a way in which the research tier is not as user friendly from a clinical perspective, because it doesn't distinguish pathogenicity classification at all," Liao said. "It just shows everything that has ever been submitted for a variant. And we know some of those classifications are false and nobody thinks they are true."
To access research-level data, users have to acknowledge a pop-up that informs them that classifications in that section are from the original submitters and haven't been vetted by experts. However, the experts within ENIGMA, the consortium that evaluates variant classifications for the BRCA Challenge, are constantly working to move variants from the second-tier research portion of BRCA Exchange into tier one, the clinical-grade portion.
The approximately 1,000 variants currently in tier 1 of BRCA Exchange are mostly missense variants, which often have uncertain classification. For truncating variants, it's more straightforward whether or not the protein product is functioning normally. Liao noted that ENIGMA will soon move close to 1,500 truncating variants into the first tier.
The NIH's Clinical Genome Resource, which experts are building into an authoritative resource on the clinical relevance of genes and variants, has confirmed ENIGMA as an expert classification body. ENIGMA has its own variant classification pipeline, but GA4GH plans to make automated bioinformatics tools available to experts, so they can streamline and speed up their work.
Meanwhile, BRCA Challenge experts have tried to make the second tier of research-level variant data as user friendly as possible. They have come up with a list of synonyms for variants to ensure non-genetics experts can search for variants, even if they aren't inputting the standard terms used by genetics researchers. "This will be an improvement over other databases that currently exist," Liao noted.
The second-tier data release lays the groundwork for the third and final data tier GA4GH wants to launch later this year, which will include case-level evidence on variants. In this space, which will not be open to the public, credentialed researchers will gather the evidence on variants for the purpose of classification, while protecting the privacy of patients and families to whom the data belong.
"Sometimes, it's possible to classify variants as pathogenic or benign with public data," Liao said. "But oftentimes it requires family pedigrees, segregation analysis, and disease status of individuals, things that we would consider private or personally identifiable information."
GA4GH is also hoping to build a resource like BRCA Exchange in somatic cancers. The group has already launched Matchmaker Exchange, which allows researchers around the world to connect with each other about the genetics of rare diseases; and the Beacon project, which enables institutions to install servers that outside users can ping with simple queries about the genomic data.
With all its projects, GA4GH wants to emphasize the importance of sharing genomics information around the world, which is why the group unveiled the latest data release within BRCA Challenge at the International Congress of Human Genetics in Kyoto, Japan. "We would like to engage with global groups," Liao said, "and build a global community of researchers, advocates, and patients who are using the BRCA Exchange tool."
With that goal in mind, GA4GH will soon launch an online community space where people doing work on BRCA1/2 genes around the world can engage with one another.
This article has been corrected to note that Myriad's proprietary database now contains variant information on 2 million patients, not 1 million.