The Science Behind DTC Genetic Testing

In the last year, direct-to-consumer genetic testing has exploded onto the scene with companies offering their customers glimpses of what's in their genomes, from the inane, such as earwax type, to the serious, including risk of Parkinson's disease. The largest of the DTC companies — 23andMe, DecodeMe, and Navigenics — offer services steeped in the latest and greatest results from the scientific literature. Some scientists, though, wonder if it's too soon, if the research is too raw, to be offered as a commodity.

"To a large extent, not absolutely 100 percent, but to a large extent, I think that its application in the realm of the consumer or the individual patient is quite premature," says James Evans, a professor of genetics and medicine at the University of North Carolina at Chapel Hill.

The SNPs the companies test are pulled from genome-wide association studies, and the assays are run on robust arrays from Illumina or Affymetrix in CLIA-certified labs. But for all that solid foundation, what the companies tell their customers may be missing some of the puzzle pieces, scientifically speaking.

At each step of the process, from determining what kind of variants to look at, to reporting a customer's risk, to intervening in the clinic, the science is there, just not quite complete. Most variants reported in GWAS are common variants associated with a disease; however, not all researchers think they are the most important type of disease variants. In cases where multiple common variants, each of which confers a small amount of risk for a disease, are combined to give an overall view of risk, the statistical model assumes that the variants act independently, though that might not be the case. Once in the clinic, many physicians don't know what to make of a patient's increased genetic risk for a disease, particularly if it's a small increase.

From the silver-lining department, these holes in genome scanning provide an opportunity for researchers, both at the DTC companies and in academia, to fill them in. "It is an exciting, new source of information that may ultimately contribute to our ability to provide better care for patients, but, at the moment, there continues to be a lot of additional work that needs to be done," says David Herrington, a professor of internal medicine at Wake Forest University Medical School.

Common variants

The SNPs used in DTC genetic scans come straight from the journals. In general, the companies scour the literature for papers that meet their criteria to ensure the study was up to snuff. At Navigenics, co-founder Dietrich Stephan says the criteria include checking the power of the study, and how the cohorts were selected and characterized. "The most important thing is that it has to have been replicated a number of times in independent cohorts," Stephan says.

23andMe and DecodeMe follow similar protocols to make sure that the study doesn't diverge too far from the "standard recipe," as 23andMe scientist Brian Naughton puts it.

"[If] they start doing something weird, that's a good indication that they are trying a little too hard [to find an association]," adds Serge Saxonov, another founding scientist at 23andMe.

The vast majority of papers that the companies consider come from GWAS. "We are focused on genome-wide association studies primarily, and then look to see if there's wide replication or not," says Decode Genetics' chief scientific officer, Jeffrey Gulcher. Decode also conducts many of its own studies into disease associations.

There's no question that the SNPs identified and replicated in GWAS are associated with disease. "There certainly are SNPs that have unambiguously been shown to be associated with various conditions, like the 9p21 SNP for coronary diseases is unambiguous. It has been repeated numerous times," Wake Forest's Herrington says.

David Goldstein, director of the Center for Human Genome Variation at Duke University's Institute for Genome Sciences and Policy, agrees that "the association with disease is secure." He adds, "The only people that dispute the associations themselves … are basically cranks."

The variants included in the DTC tests focus on diseases that are prevalent in the population. "We focus on the most common diseases — [we've] prioritized cardiovascular [disease], cancer, and other diseases that are fairly common," Decode's Gulcher says. "We were fortunate enough to find the first genes for type 2 diabetes, and myocardial infarction, aneurysms, stroke, atrial fibrillation, of course breast cancer, glaucoma, restless leg syndrome, et cetera. That's formed the nucleus of what we offer."

So far, most SNPs identified for these complex diseases are common variants that typically affect about five percent of people. The common disease-common variant hypothesis says that the genetic risk for common diseases comes from one or more common disease alleles, though the risk from each variant is small.

"I think that [DTC companies], of course, would like to use common variants because, by definition, they are common," says Evans at UNC. "Many people will have these risk alleles and will pay money to be tested. But again, by very definition, these common risk alleles are almost all very subtle in their effects."

Common variants account for small changes to a person's risk for disease as compared to the general population, but when multiple common variants for the same disease are combined, they may account for more of that person's susceptibility for the disease. "In some cases the cumulative effect can be quite large," Naughton says.

But they may not. "It's still sort of an open question whether the common variants-common disease hypothesis is going to pan out. I think there's certainly some merit to that, but, technically, that's really what we have at hand now," says Michael Christman, the president and CEO of the Coriell Institute for Medical Research in New Jersey.

Goldstein goes a little further. "The basic problem is what most of the companies are offering right now is based on these gene chips that tag a bunch of common variants, and those common variants are associated with minuscule changes of risk for conditions," he says. "That is pretty meaningless information at the individual level."

Rare variants

Instead, Goldstein says that rare variants could confer a markedly higher risk of disease than common variants do. "We're playing games to a certain degree with common variants because they are not very important. At the same time, the research is going in a dramatically different direction," he says. "We are finding that rare genetic differences have a huge impact and those things are going to eventually be identifiable, too, by these direct-to-consumer companies." In particular, a large study that Goldstein participated in with researchers from Decode found 1 percent or 2 percent of schizophrenia patients have rare deletions at 1q21.1, 15q11.2, and 15q13.3. Those deletions are major risk factors for the disease.

Rare variants that turn up in 1 percent to 3 percent of the population, says Eric Topol, a professor of translational genomics at the Scripps Research Institute, "are likely to have much higher penetrance or probability of susceptibility. Those, unfortunately, couldn't be determined through these genome-wide association studies because there wasn't adequate statistical power." But he thinks that's about to change as the price of next-generation sequencing comes down. "Through sequencing, their identification will be possible," adds Topol, who is collaborating with Navigenics on a study of the value of genetic testing.

The major direct-to-consumer genetic testing companies don't yet offer to sequence their customers' genome, but they already are or will soon be testing rare variants. "We've started with common diseases and common variants and you'll see that we'll layer in rare variants, copy number variants, and epigenetics," Navigenics' Stephan says.

23andMe already includes a few rare variants, such as one for deep vein thrombosis and another for Parkinson's disease. "The Parkinson's disease mutation is fairly rare, but if you have it, [it has a] really, really strong effect. [It's] pretty much the most important thing you can learn about yourself," Saxonov says.

At Decode, Gulcher says that rare variants, alongside common variants, will give a better picture of disease risk, though it's not certain. "It's just difficult to know ahead of time how big a role they will play," he says.

Goldstein has a different theory for the role of common variants. Instead of them being associated with a common gene that has a small impact on disease risk, Goldstein says the SNP could mark off a genetic region where there are a lot of rare variants that have a high impact on disease. "The genetics is not what we think it is," he says. "It's totally then misleading somebody to think, 'Oh, I've got an allele that's a little tiny bit more worse or a tiny bit better.' I don't think it actually translates like that at that personal level."

Either way, the hunt is on for rare variants, now using sequencing platforms rather than genotyping chips. At Scripps, Topol says scientists are using the Illumina sequencer and will be getting an Applied Biosystems SOLiD to search for low-frequency variants.

Calculating risk

With common variants, the name of the game is small effects that could, when combined, have a significant impact on someone's health. First, though, the risk that each individual SNP confers must be determined.

In the literature, when an association between a SNP and a disease is found and replicated, the risk from these retrospective case-control studies is expressed as an odds ratio. To get an odds ratio, the odds of carrying the variant among cases are divided by the odds of carrying it among controls. But odds ratios are retrospective measures of risk, not prospective measures. "Usually papers from Nature Genetics or the American Journal of Human Genetics … will report what's called an odds ratio, which is something very similar to your increase in risk. A, let's say, 20 percent increase in risk might be similar to an odds ratio," 23andMe's Naughton says.
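As a minimal sketch of that arithmetic, assuming the standard case-control definition (the odds of carrying a variant among cases divided by the odds of carrying it among controls), an odds ratio can be computed from four counts. The function name and the counts are hypothetical, chosen only for illustration.

```python
def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Odds ratio from a 2x2 case-control table of variant-carrier counts."""
    if min(case_noncarriers, control_carriers, control_noncarriers) <= 0:
        raise ValueError("counts used as denominators must be positive")
    case_odds = case_carriers / case_noncarriers            # odds of carrying, cases
    control_odds = control_carriers / control_noncarriers   # odds of carrying, controls
    return case_odds / control_odds


# Hypothetical study: 300 of 1,000 cases and 200 of 1,000 controls carry the variant.
print(round(odds_ratio(300, 700, 200, 800), 2))  # 1.71
```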

Each SNP has its own odds ratio that needs to be converted to a prospective risk measure. "We had to come up with a conversion model between odds ratios and relative risk and then validate that model to make sure that it was robust," Navigenics' Stephan says. "That transformation depends on allele frequency and the prevalence of disease and those two things change. … It's really important to do that math correctly."
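The article does not say which conversion Navigenics uses. One widely cited formula (often attributed to Zhang and Yu) turns an odds ratio into a relative risk using the disease risk among non-carriers, which in turn depends on prevalence and allele frequency. The sketch below assumes that formula; the function name and the input values are hypothetical.

```python
def or_to_rr(odds_ratio, baseline_risk):
    """Convert an odds ratio to a relative risk.

    baseline_risk is the probability of disease among non-carriers.
    Formula assumed here: RR = OR / (1 - p0 + p0 * OR).
    """
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    if not 0.0 <= baseline_risk < 1.0:
        raise ValueError("baseline_risk must be in [0, 1)")
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


# For a rare disease the two measures nearly coincide; for a common one they diverge.
print(round(or_to_rr(1.5, 0.01), 3))  # 1.493
print(round(or_to_rr(1.5, 0.30), 3))  # 1.304
```

The gap between the two outputs illustrates why the same published odds ratio can imply different relative risks for diseases of different prevalence.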

Transforming an odds ratio to a relative risk changes the definition of that risk — now it is in relation to the general population's risk. "We're just simply saying you have this genotype, what's your risk compared to general population?" Decode's Gulcher says.
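One way to express a genotype's risk against the general population, rather than against non-carriers, is to divide by the frequency-weighted average risk across genotypes. The sketch below is an illustration only: it assumes Hardy-Weinberg genotype frequencies and a per-allele multiplicative relative risk, neither of which the article attributes to any company, and the numbers are hypothetical.

```python
def risk_vs_population(per_allele_rr, risk_allele_freq):
    """Map risk-allele count (0, 1, 2) to risk relative to the population average."""
    p = risk_allele_freq
    # Hardy-Weinberg genotype frequencies (an assumption of this sketch).
    genotype_freq = {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}
    # Risk relative to people with zero risk alleles.
    rr_vs_reference = {k: per_allele_rr ** k for k in genotype_freq}
    # Average risk in the population, on the same scale.
    population_average = sum(
        genotype_freq[k] * rr_vs_reference[k] for k in genotype_freq
    )
    return {k: rr_vs_reference[k] / population_average for k in genotype_freq}


# Hypothetical variant: per-allele relative risk 1.2, risk-allele frequency 0.3.
for copies, risk in risk_vs_population(1.2, 0.3).items():
    print(copies, round(risk, 2))  # roughly 0.89, 1.07, and 1.28
```

Because the average is weighted by genotype frequency, a person with no risk alleles comes out below 1.0: being "average" in this sense already includes some carriers.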

To get an overall view of risk, the effects of the individual SNPs have to be combined — the 15 percent to 20 percent from individual SNPs could cumulatively become much higher risk for a disease. "It seems as if in most cases, any individual SNP is not likely to contribute a large amount of absolute risk and there are some promising evidence to suggest that combining data from multiple SNPs in the aggregate may be a better strategy," Wake Forest's Herrington says.

There are different ways to combine that data. "You can play a counting game where you count up whether or not you are positive or negative for high risk variants at each of the loci, or you do it more precisely where you multiply the actual risks, and not assuming that this gene has the exact same risk as this other gene — which is never the case," Gulcher says.
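The two approaches Gulcher describes can be contrasted in a short sketch. The locus names and per-locus risks are hypothetical, and the code simplifies each locus to "carries the high-risk variant or not."

```python
from math import prod


def count_score(carried):
    """Counting game: number of loci at which the high-risk variant is carried."""
    return sum(1 for is_carrier in carried.values() if is_carrier)


def multiplied_risk(carried, locus_risk):
    """Multiply the locus-specific relative risks of the variants carried."""
    return prod(locus_risk[locus] for locus, is_carrier in carried.items() if is_carrier)


# Hypothetical per-locus relative risks; they are deliberately not all equal.
locus_risk = {"locus_A": 1.1, "locus_B": 1.5, "locus_C": 1.2}
person_1 = {"locus_A": True, "locus_B": False, "locus_C": True}
person_2 = {"locus_A": False, "locus_B": True, "locus_C": True}

for person in (person_1, person_2):
    print(count_score(person), round(multiplied_risk(person, locus_risk), 2))
# Both people score 2 by counting, but the products differ: 1.32 versus 1.8.
```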

The model commonly used to get a view of overall risk is multiplicative and assumes that each SNP acts independently on a person's risk of disease. "It is very, very similar to assuming that each SNP increases or decreases your odds independently and multiplying them together, so two increases and two decreases probably works out to something like average risk," Naughton says.
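Naughton's description can be sketched as follows: start from the population's average odds of disease, multiply by each SNP's odds ratio as if the SNPs were independent, and convert back to a probability. This is a simplified illustration, not any company's model; in particular it treats each odds ratio as relative to the population average, and all numbers are hypothetical.

```python
from math import prod


def combined_risk(population_risk, odds_ratios):
    """Combine per-SNP odds ratios multiplicatively, assuming independence."""
    if not 0.0 < population_risk < 1.0:
        raise ValueError("population_risk must be strictly between 0 and 1")
    baseline_odds = population_risk / (1.0 - population_risk)
    combined_odds = baseline_odds * prod(odds_ratios)
    return combined_odds / (1.0 + combined_odds)  # back to a probability


# Two increases and two matching decreases cancel out to about average risk.
print(round(combined_risk(0.10, [1.2, 1.2, 1 / 1.2, 1 / 1.2]), 3))  # 0.1
# Four modest increases compound into a noticeably higher risk.
print(round(combined_risk(0.10, [1.2, 1.2, 1.2, 1.2]), 3))  # 0.187
```

If the SNPs interact, which is the open question raised by Topol and Evans, the simple product would no longer hold.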

The SNPs, though, may or may not interact with each other. "One of the biggest holes so far is that the aggregate risk of these independent markers have not been studied very well, so we don't know if they just add or multiply," says Topol at Scripps. "Are they truly independent?"

UNC's Evans concurs: "We don't really know how to combine these genetic risk factors. We can do simple mathematical associations, assume they are all independent and come up with an aggregate risk. The reality, however, is that biologically that's a naïve way to approach it. … It may be that having two or three or four or five of those actually represents less risk for you."

Gulcher says that at Decode the team has justified the multiplicative model by studying it in 20,000 patients and 20,000 controls. "[It] appears to be just as good as any other model and in most cases, the best model, as opposed to a dominant or recessive or some sort of interactive model," he says. Decode has also worked with the Broad Institute to study whether common variants interact with one another. "We fail to find any of that and everybody else has failed as well, for the common variants," Gulcher says.

Actionability and utility

On the medical side, the worries are about whether the risk information provided by DTC companies is medically actionable and whether it has clinical utility. The diseases included on DTC scans fall along a spectrum of actionability. "If you have a risk for melanoma, you might want to protect yourself more from going out into the sun. That one's pretty easy, I think we all agree," says Topol. "If you're at risk for a heart attack, you might want to take a little bit more seriously things like being thin, the right type of nutrition, and your exercise plan. With Alzheimer's, there's nothing definitive… but there's lots of things that are being studied right now."

Navigenics' HealthCompass, which tests for 23 diseases from abdominal aneurysms to stomach cancer, focuses on conditions that can be prevented or delayed. "We made a conscious decision only to include actionable end prediction, so diseases like diabetes, heart disease, cancer, et cetera, where you can either avoid an exposure or go in for early screening or go on a medicine," Stephan says. "We did not think that it would be productive to include diseases like Lou Gehrig's disease for which there is nothing you can do."

Likewise, the Coriell Institute, which is conducting a large, prospective study of the clinical utility of genetic testing results, is reporting only medically actionable diseases, currently numbering 10, to its participants.

Others are more forthcoming. DecodeMe and 23andMe cover a larger swath of diseases, with Decode surveying 34 diseases and 23andMe more than 90 health conditions and other traits. "Our philosophy is that we want to be able to tell you as much as we can find out from your genome. We do try and be comprehensive," 23andMe's Naughton says.

There are studies under way, such as the one at Coriell, to find out what people and their physicians do when given genetic risk information. "The challenge at this point for these kinds of tests is … showing that knowing that information permits a change in clinical care that … translates into a real, measurable improvement in health outcomes," Wake Forest's Herrington says. "We don't know at the moment, for example, if you have the 9p21 variant [for cardiovascular disease], would you do better taking a statin than if you did not have the 9p21 variant."

Right now, however, most of the information about clinical utility is anecdotal. Decode's Gulcher is an example himself. From scanning his own genome, he learned that he had two-fold higher risk for prostate cancer. That result prompted his primary care physician to get a PSA screening for Gulcher, despite his being younger than 50. Gulcher's PSA level was on the high end of normal and he wound up seeing a urologist who diagnosed Gulcher with high-grade cancer on both sides of his prostate. The cancer was then surgically removed.

Decode supports clinical utility studies, Gulcher adds. "What do you do with this prostate cancer profile? Higher risk patients, do they benefit from earlier screening? Is the specificity of the current standard biomarkers like PSA enhanced by knowing patients are at higher or lower risk for prostate cancer?" he asks.

Studies are underway. Scripps Health has teamed up with Navigenics and Microsoft to work on a prospective study of people to see what impact genetic knowledge has on consumers' lifestyle, diet, exercise, and psychological well-being, as well as on medical screening and diagnosis. "We're going to be studying this question for years," Topol says. "If it was so obvious, [if] everybody knew the answer, then we of course wouldn't have to do this work. No one has done a study like this prospectively yet."

Coriell has a similar undertaking. In its study, the participants have their genetic profile done and Coriell also surveys them to collect demographic data, family histories, and medical information. "Through a secure Web portal, we will follow up with surveys of the participants to keep track, to monitor their behaviors and basically try to determine whether use of this information actually changes their lifestyle and then ultimately improves their health outcomes," Christman says.

Research

Much of what DTC companies offer is based on the fruit of academic researchers' labors, but how these companies will affect the research community has yet to be determined. "We know that none of what we are doing could exist without the efforts of a slew of researchers around the world, so our core goals honor those efforts and contribute to them," Navigenics' Stephan says.

Many of the companies have their own research arms — Decode has its diagnostic side and a number of collaborations with academic researchers; its founder, Kari Stefansson, continually publishes genome-wide association studies.

23andMe has 23andWe, which is an "effort to do surveys and to do research on the website with our customers. That is being done in collaboration with outside researchers at universities," Naughton says. "In some sense we are giving back to the research community in terms of access to our data and just allowing them an opportunity to validate or test their hypotheses in large samples that they otherwise might not be able to."

Not only is Navigenics working with Scripps, but the company is also getting informed consent from its customers to refine internal risk prediction models using those customers' clinical, demographic, and genetic information. Stephan says that Navigenics also envisions working with researchers, but first wants to ensure customer privacy. "We just want to be really careful as we [give] this data out that no one gets damaged," Stephan says.

Some scientists are skeptical that research from DTC companies will be informative. "The companies love to say, 'We're engaging in research, too, and we're going to create a new paradigm for research.' Anytime that I hear the word 'paradigm' I start to shudder," UNC's Evans says. "When you collect data in research, you have to collect data in as unbiased a way as possible. The problem is that you have a very select population signing up for these things. … I am very skeptical [the data] will be broadly applicable to the population at large."

One thing the companies do is show some of the pieces missing from the genomics puzzle. Scripps' Topol points out that SNPs are just markers in the genome and that functional genomics strategies need to be harnessed to find the actual causative gene or genes. "We have lots of statistical associations of a zip code of a genome with a trait or disease, [but] we don't know within those hundreds of thousands of base pairs what's really going on," he says. "No one has really identified the smoking gun variant of the sequence."

On the other hand, DTC companies could have little impact on the research in the academic realm. "I don't know about that," Topol says. "They are really just taking stuff from the research community, offering it to the public. I don't know that there's another way to close the loop on that."

DTC Genetic Testing and Ethnicity: More Research Needed

The vast majority of genome-wide association studies done to date, and from which direct-to-consumer companies pull their information, investigate disease associations in people of European descent. Some of the SNPs were then validated in other populations, such as the 9p21 marker for cardiovascular disease, which was validated in Chinese people. But some of the risk information may not be applicable to people of other backgrounds; the 9p21 marker, for instance, was not validated in African-Americans.

"Almost all our knowledge is based on European ancestry, so the other major ancestries have just been looked at, in some cases, as a replication. But the primary data doesn't come from them, so there are really holes in this understanding," the Scripps Institute's Eric Topol says.

At Decode Genetics, Jeffrey Gulcher says scientists there start out with the "quintessential Caucasian population" by studying Icelanders' genomes to find risk alleles and then replicate those findings in other populations through their collaborations in the United States and Europe. "I would say at least half the time it appears that the markers that we find in Caucasians do replicate in African-American populations, and I would say the rate is even higher for East Asian populations," Gulcher says.

That lack of knowledge is reflected in what DTC companies tell their customers who are not of Northern European ancestry. 23andMe says it tries to make it clear that the risks given to its customers have mainly been found in European cohorts and that "doesn't necessarily translate to other ethnicities," 23andMe's Brian Naughton says.

At Navigenics, Dietrich Stephan says that scientists try to give ethnicity-specific risk when they can. "[We] explain to them that, in general, we've seen that these risks transfer across ancestry populations but it may not be the case for this one and as we learn more, we'll update [them] with respect to our refined risk estimate," he says.

Wake Forest University's David Herrington says more research needs to be done. "Do we have enough information about these associations in other ethnicities? Absolutely not. Do we need to do more research in this area? Absolutely, we do. I do think that is a bit of a weakness in the currently available data that I hope will be addressed soon by the scientific community," he says.

It already is. At the Coriell Institute for Medical Research, researchers are trying to include people of non-Caucasian descent in the institute's study of the clinical utility of genetic testing. "We definitely tried to encourage people to understand that the usefulness of this cohort is to build a population of people within which scientists can do genome-wide association studies such that minority involvement can be an opportunity to help in the effort to find risk factors in non-Caucasian individuals," says Margaret Keller, an associate professor at Coriell.

Topol says Scripps is encouraging people from diverse ancestral backgrounds to take part in its studies. "We're trying to get representative Asian and African ancestry so it isn't European-centric work," he says. "We're doing quite well with that and, in this area of the country, we have pretty good diverse population. We're fortunate."
