NEW YORK (GenomeWeb) – As clinical sequencing becomes more common and healthy individuals increasingly seek information about their own genomes, genome interpretation should shift from reporting pathogenic variants to reporting informative clinical genotypes, according to the New York Genome Center's Nathan Pearson.
Pearson, senior scientific director of scientific engagement and public outreach at the NYGC, gave a presentation at last month's Clinical Genome Conference in San Francisco about why reporting clinical genotypes — essentially, combinations of interacting variants that together are more informative than a list of single variants — would become important as sequencing becomes more widespread.
He also urged researchers to begin to develop the computational infrastructure for handling such information.
In a follow-up interview with GenomeWeb, Pearson elaborated on his vision, gave several examples of clinical genotypes that the community could begin reporting today, and discussed the challenges that will have to be overcome in developing a reporting system for genotypes.
Pearson offered an analogy: reporting variants instead of genotypes is like a drug label that lists elements rather than molecules. A drug label does not list carbon, hydrogen, and nitrogen as ingredients, but instead specifies the molecule that the combination of those elements forms, like caffeine or strychnine. Similarly, a clinical report would be more informative if, instead of listing specific variants, it described the genotypes formed from combinations of variants.
Can you elaborate on what you mean by a clinical genotype?
Conventionally, in taking genomic data and making it clinically useful, we lay computational groundwork … and classify things in a person's genome to report back to the physician. We've started by classifying variants, which is a sensible place to start because they are the simplest reliably heritable way two chromosomes can differ. And in the early era of clinical genomics, we were only looking at the genomes of people who were urgently sick and had a few differences that strongly explained why they were sick.
But now we have the boon of being able to look at many people's genomes much more comprehensively, including those of healthy people. Less dramatic health differences may not trace back to smoking gun differences.
Interactions between variants in our genome will matter, and sometimes they will be non-additive, meaning you can't predict them by tallying them up. For example, hormone-sensitive cancers such as breast, ovarian, or testicular cancer clearly happen much more often in one sex than in the other. That's so obvious to us that we forget it's actually a genetic interaction. Take the same BRCA1 variant carried by you or by me: because our sex chromosomes differ, our genotypes differ. That BRCA1 variant may influence our health in very different ways in the context of the broader genotype that differs between you and me. And that's a non-additive interaction.
Are there other examples of clinical genotypes that could be reported now?
There are a few different kinds of simple interactions that are non-additive that we already understand. For instance, we could build in sex chromosome dependence. Instead of reporting simply a BRCA1 variant, we could also report whether that person has an XX karyotype or an XY karyotype, and also, if that person is androgen insensitive. If a person is XY and androgen insensitive, the breast tissue may respond differently to the BRCA1 variant, maybe much more like a typical XX. That is something that can be built right into the clinical interpretation computation.
There are some emerging examples of fairly strong and clear non-additive interactions among variants at different sites in the genome. For example, you may carry two copies of Apoe4, an ancestral variant in APOE. But you may also carry a rare derived variant in the APP gene, one that we think arose in northern Europe, that effectively keeps you healthy despite what might be scored as pathogenic in a ClinVar report. You might say, wow, two copies of Apoe4, this person is very likely to get Alzheimer's disease fairly early in life. But that prediction would be strongly tempered by what we know about this rare variant in APP.
Likewise, there are non-additive interactions we can point to between variants in different genes that make you sick. A well-known classic example of this would be two nominally recessive variants — one in a gene called RDS and one in a gene called ROM1. In theory, if you have only one copy of each you shouldn't have retinitis pigmentosa. But, if you do have a copy of each — those two different variants in two different genes — then you are actually very likely to get retinitis pigmentosa.
So, the benefit of reporting genotypes is that you get more accurate predictions?
Yes. And also, if you sequence a typical healthy adult today, that person will have somewhere between zero and three nominally pathogenic variants. And they'll have a bunch more variants in intriguing genes, ones that have been tied to diseases, but the variants might be brand new or not fully characterized, so we can't say much about them. Nonetheless, a physician might be faced with interpreting some of those.
That long list faces a physician who may have a few minutes to deal with this data. But some of it might be readily compressed, and the interpretations made more precise, by reporting a genotype instead.
Imagine a hypothetical genome with five variants, each listed in ClinVar as pathogenic, for, respectively, high risk of breast cancer, long QT syndrome, high LDL cholesterol, hemolytic anemia, and beta-thalassemia. But the individual is male, which modifies the breast cancer variant to family risk. He also carries a variant known to be protective against long QT syndrome. And the variant for beta-thalassemia is actually known to be protective against high LDL.
If you incorporate what we know already, in terms of genotypes, that's a very different message for a physician to give. But that information today wouldn't be reported out as such in the clinical report.
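The hypothetical five-variant genome Pearson describes can be sketched as a small rule-based computation. The following Python is only an illustration, not a method described by Pearson or used by ClinVar: the function name, the rule table, and the wording of the revised calls are all assumptions made for this example, and the rules encode only the three modifiers mentioned in the interview.

```python
# Variant-level report for the hypothetical genome: every finding reads "pathogenic".
VARIANT_CALLS = {
    "breast cancer": "pathogenic",
    "long QT syndrome": "pathogenic",
    "high LDL cholesterol": "pathogenic",
    "hemolytic anemia": "pathogenic",
    "beta-thalassemia": "pathogenic",
}

# Hypothetical context rules: (context feature, affected condition, revised call).
# These names and labels are illustrative, not drawn from any real database.
CONTEXT_RULES = [
    ("sex:male", "breast cancer", "family risk"),
    ("carries:long QT protective variant", "long QT syndrome",
     "tempered by protective variant"),
    ("carries:beta-thalassemia variant", "high LDL cholesterol",
     "tempered by beta-thalassemia variant"),
]


def genotype_report(variant_calls, context):
    """Return a copy of the variant-level calls, revised by any rule
    whose context feature is present in this person's genome."""
    report = dict(variant_calls)
    for feature, condition, revised_call in CONTEXT_RULES:
        if feature in context and condition in report:
            report[condition] = revised_call
    return report


# Context for the hypothetical male patient from the interview.
context = {
    "sex:male",
    "carries:long QT protective variant",
    "carries:beta-thalassemia variant",
}
report = genotype_report(VARIANT_CALLS, context)
for condition, call in report.items():
    print(f"{condition}: {call}")
```

In this sketch, three of the five "pathogenic" calls are revised once context is taken into account, while hemolytic anemia and beta-thalassemia are reported unchanged. A real system would need curated, evidence-graded rules rather than a hard-coded table.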
How will researchers figure out the relevant genotypes? It seems that there are endless combinations of variants that would have to be studied. Is there a logical place to start looking, for instance in the 56 genes that the American College of Medical Genetics has identified as disease-causing?
Now that we're sequencing many genomes, we can start to stratify, pick out people we know have some component of a known risk genotype, like a BRCA variant or the Apoe4 variant, or another variant that is squarely implicated in disease, and look at who does and does not get sick.
Already, there have been important steps in that direction. For example, one aim of the [Collaborative Oncological Gene-Environment Study] is to look for modifiers of the nominally dominant BRCA1 and BRCA2 variants for breast cancer.
Another example is the Resilience Project, an effort by Mt. Sinai, which is looking at people who are healthy despite carrying variants that should be harmful.
A second way is through in vitro work in model organisms. A third approach is that as more people are sequenced and phenotyped, we'll start to notice clumps of people who get sick in a particular way, clumps that can be functionally characterized. In addition, we can leverage expression data to better understand genes. Studies that look at individuals who are sick but don't have a classic recessive genotype will help turn up more cases of compound heterozygosity. And finally, we can design studies that look for combinations of alleles where there is one in each parent but the combination is never seen together in a child. If you look across enough people, you can spot variants that are incompatible with each other.
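The last idea, looking for allele pairs that are present one in each parent but never together in a child, can be sketched as a simple count over parent-child trios. The Python below is a minimal illustration under stated assumptions: the function name, the `min_trios` parameter, and the input format (each person reduced to a set of allele labels) are hypothetical, and a real study would need a statistical test rather than a bare "never observed" filter. If each parent is heterozygous and the two sites are unlinked, roughly a quarter of such children would be expected to carry both alleles.

```python
from collections import Counter
from itertools import product


def never_co_inherited(trios, min_trios=1):
    """Find allele pairs that had the chance to combine but never did.

    trios: iterable of (parent_a, parent_b, child) sets of allele labels.
    A pair (x, y) counts as an opportunity when x is carried by one
    parent only and y by the other parent only. Returns the sorted pairs
    seen in at least `min_trios` such trios and never together in a child.
    """
    opportunities = Counter()
    seen_together = Counter()
    for parent_a, parent_b, child in trios:
        for x, y in product(parent_a - parent_b, parent_b - parent_a):
            pair = tuple(sorted((x, y)))
            opportunities[pair] += 1
            if x in child and y in child:
                seen_together[pair] += 1
    return sorted(
        pair
        for pair, count in opportunities.items()
        if count >= min_trios and seen_together[pair] == 0
    )


# Toy data with made-up allele labels "A", "B", "C".
trios = [
    ({"A"}, {"B"}, {"A"}),
    ({"A"}, {"B"}, {"B"}),
    ({"A"}, {"C"}, {"A", "C"}),
    ({"B"}, {"C"}, set()),
]
candidates = never_co_inherited(trios, min_trios=2)
print(candidates)
```

With this toy data, only the pair of "A" and "B" has two opportunities and no child carrying both, so it is the one candidate returned. Two trios are far too few to conclude anything, which is the point of Pearson's remark about looking across enough people.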