NEW YORK (GenomeWeb) – While exome sequencing has proven to be a useful tool for solving undiagnosed diseases in individuals with genetic disease, its usefulness in healthy individuals to predict future disease risk is less well studied, despite predictions that eventually everyone will have his or her genome sequenced.
A researcher from The Genome Analysis Centre (TGAC) in Norwich, UK, has sought to answer a question: if an individual wanted to have his exome sequenced and interpreted today, would the results provide useful information that could help him better understand his health risks?
Manuel Corpas, community-driven data visualization project leader at TGAC, crowdsourced the exome sequencing of himself, his sister, and his parents. All four family members were also genotyped on a 23andMe SNP chip, as was an aunt on his mother's side.
The exome sequencing data from Corpas and his family were interpreted by four different analysis companies, and the results were reported recently in BMC Genomics. Corpas found that the results varied greatly between analysis companies and that there was essentially no overlap between the companies' interpretations of which variants were likely to be pathogenic and predictive of disease risk.
Corpas told GenomeWeb that the goal of the project was to try to figure out "how far we could go in terms of trying to understand our personal genetics information."
The project began in 2009 when Corpas first ordered a 23andMe SNP test and was shocked when it told him he had an increased risk for prostate cancer, which does not run in his family. "None of my relatives have had prostate cancer, so as a geneticist and scientist, I wanted to understand how that result came about," he said.
However, getting that type of information was nearly impossible, since the 23andMe data did not detail the inheritance pattern of the risk variants.
Figuring that out required his family's genetic information as well. Corpas convinced his sister, parents, and mother's sister to also get their 23andMe data and to make it freely available, the results of which he published in the open-access journal F1000Research.
That initial experience in DTC genomics piqued Corpas' interest even further.
He next convinced his family members to have their exomes sequenced and to make the data publicly available.
"Families in the near future will go through this process and will have to come to grips with how to handle this data," Corpas said. "When the sequencing of newborns happens routinely, which might happen in a few years time, there will be lots of issues," he said.
Corpas said that for his experience, he wanted to use available direct-to-consumer tools to try to understand his genome, rather than go through a medical process facilitated by his healthcare provider.
He raised $3,300 through a crowdsourcing campaign to sequence his own exome as well as those of his sister and parents, and covered the remaining costs out of pocket. Each family member's exome was sequenced by BGI. Four different analysis companies then offered to do the interpretation for free in exchange for co-authorship on the study.
Ingenuity, Diploid, GeneTalk, and BIOBASE all provided interpretation and annotations and returned reports. All of the data was made publicly available.
"One of the main observations from this combined analysis is that each platform provides a substantially different set of results," the authors wrote.
Each of the companies was asked to analyze the exome data and then return a report highlighting the most significant findings. Each company highlighted a different set of results.
For instance, the results from BIOBASE predicted that all family members were susceptible to preeclampsia, and also predicted a deleterious mutation in the Fanconi anemia gene in all family members except for the mother. By contrast, the Diploid analysis came back with two main findings for all family members: a variant associated with increased resistance to infection by the common HIV strain, and wet-type earwax. The GeneTalk analysis did not predict any family-wide variants, but did predict a greater risk of renal colic from kidney stones. Meanwhile, Ingenuity found variants associated with the neuropathy Charcot-Marie-Tooth disease in three family members.
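As a rough illustration of what "no overlap" means here, the headline findings described above can be compared pairwise as sets. This is only a sketch: the short labels are shorthand for the findings as summarized in this article, not identifiers taken from the companies' reports, and the study's own comparison method is not described here.

```python
from itertools import combinations

# Headline findings per company, as summarized in the article.
# The labels are informal shorthand, not terms from the actual reports.
reports = {
    "BIOBASE": {"preeclampsia susceptibility", "Fanconi anemia gene mutation"},
    "Diploid": {"HIV resistance", "wet-type earwax"},
    "GeneTalk": {"renal colic risk"},
    "Ingenuity": {"Charcot-Marie-Tooth variants"},
}


def pairwise_overlap(reports):
    """Return the set of shared findings for every pair of companies."""
    return {
        (a, b): reports[a] & reports[b]
        for a, b in combinations(sorted(reports), 2)
    }


overlaps = pairwise_overlap(reports)
for (a, b), shared in overlaps.items():
    print(f"{a} vs {b}: {len(shared)} shared finding(s)")
```

With four companies there are six possible pairs, and none of them shares a headline finding.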
"This is actually kind of alarming," study co-author Peter Robinson, head of computational biology at the Institute for Medical Genetics at Universitätsklinikum Charité in Berlin, Germany, told GenomeWeb. "The fact that four well-known companies have no overlap means that either one is right and three are wrong or none are right, and there is a need for standards for how to interpret exomes in a medical setting."
"This is a little bit worrying," Corpas added. "I would have expected a much greater concordance."
Corpas said that all the analysis companies had the same sequence data, and that the data was validated against the SNP chip data and found to be more than 95 percent concordant, so the problems did not seem to arise from the data. In addition, he said, the companies all used publicly available databases and scientific literature, so it may not really be a question of the analyses being right or wrong, "it's just that perhaps we still don't have enough annotations of variants."
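The article does not say how the concordance figure was calculated. One common approach, sketched below under that caveat, is to take the sites genotyped on both platforms and count the fraction where the two calls agree. The function name, site labels, and genotypes are all made up for illustration and are not data from the study.

```python
def genotype_concordance(exome_calls, chip_calls):
    """Fraction of sites called on both platforms where the genotypes agree."""
    shared_sites = exome_calls.keys() & chip_calls.keys()
    if not shared_sites:
        raise ValueError("no sites in common between the two call sets")
    # Sort the alleles so that unphased calls such as 'AG' and 'GA' match.
    matches = sum(
        sorted(exome_calls[site]) == sorted(chip_calls[site])
        for site in shared_sites
    )
    return matches / len(shared_sites)


# Hypothetical toy genotypes, keyed by an invented site label.
exome = {"site1": "AG", "site2": "CC", "site3": "TT", "site4": "GA"}
chip = {"site1": "GA", "site2": "CC", "site3": "CT", "site5": "AA"}

concordance = genotype_concordance(exome, chip)
print(f"Concordance over shared sites: {concordance:.1%}")
```

In this toy example only three sites appear in both call sets, and two of the three agree. Sites seen by just one platform are ignored rather than counted as mismatches.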
Robinson said another issue is that each of the companies uses different sets of algorithms. For instance, he said, Ingenuity's pipeline relies on a network algorithm that looks at pathways and protein interactions, while GeneTalk basically just uses filters to remove common variants. However, he said, for each bioinformatics pipeline, it is important to use the right set of parameters. "If you change the parameters you'll get different results," he said.
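Robinson's point about parameters can be illustrated with a simple frequency filter of the kind he describes. The sketch below is not GeneTalk's actual pipeline; the function, the variant names, the population frequencies, and both cutoffs are hypothetical.

```python
def filter_rare(variants, max_freq):
    """Keep only variants whose population frequency is at or below max_freq."""
    return [v["id"] for v in variants if v["pop_freq"] <= max_freq]


# Hypothetical variants with invented population frequencies.
variants = [
    {"id": "var_a", "pop_freq": 0.30},
    {"id": "var_b", "pop_freq": 0.04},
    {"id": "var_c", "pop_freq": 0.008},
    {"id": "var_d", "pop_freq": 0.0005},
]

# The same input with two different cutoffs yields two different shortlists.
strict = filter_rare(variants, max_freq=0.001)
lenient = filter_rare(variants, max_freq=0.05)
print("strict cutoff:", strict)
print("lenient cutoff:", lenient)
```

Here the strict cutoff keeps a single variant while the lenient one keeps three, even though the underlying data is identical.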
In addition, he noted that this study was done completely outside of the medical context, so the sequencing results were not incorporated with clinical data, which is extremely "important for making sense out of the data."
Nonetheless, Robinson said that exome and genome sequencing in cases where an individual already has a disease is very useful. "There's a big difference between explaining a disease that's already there, and you know the symptoms," he said, versus using exome sequencing to "predict whether a disease will occur later. That's very difficult and not very clinically useful right now."
Still, Corpas continues to be a believer in DTC genomics and in empowering people to understand their genomic information, and he wants to continue studying how genomic sequencing will impact people's lifetime choices and social interactions.
"I think that if you want to understand your genome you should have the ability to do so in a manner that is as accessible as possible," he said. "But I don't think that what you can currently say about a personal genome for a healthy person is more than anecdotal."