NEW YORK (GenomeWeb) – With a $1.76 million grant from the National Human Genome Research Institute, 23andMe will sequence the genomes of 925 existing African American customers to improve the accuracy of genetic research for that group.
By creating a panel of genome sequences representative of the African American population, consumer genomics firm 23andMe is hoping to enhance researchers' ability to make accurate links between diseases and genetic variants in research studies. "There's a feeling in the field that many of the existing panels out there are heavily biased toward European ancestries," Adam Auton, 23andMe senior scientist and statistical geneticist, told GenomeWeb.
A 2011 Nature study estimated that 96 percent of individuals used in genome-wide association studies are of European descent. This makes genotype imputation — a statistical method that uses data from SNP microarrays to infer some of the disease-associated variants that aren't detected directly — challenging for individuals of non-European ancestry.
"We wanted to address that problem by building an imputation panel for people of African American ancestry," Auton said.
According to Arjun Manrai, a research fellow in biomedical informatics at Harvard Medical School, researchers studying the genetics of non-European populations can use repositories like 1000 Genomes and the Broad Institute's Exome Aggregation Consortium (ExAC), which recently doubled in size, to 120,000, by including exome data from additional individuals. The NHGRI's GWAS Catalog, a collection of published SNP-trait associations, also has ethnicity information, Manrai told GenomeWeb, and using that, some researchers have developed ethnicity-specific risk calculations.
The fact that 23andMe's project will sequence the genomes of just under 1,000 people "is quite interesting and very relevant," Manrai said.
Manrai and Isaac Kohane, director of the informatics program at Boston Children's Hospital, led a study in which researchers analyzed publicly accessible exome data for variants that were thought to cause hypertrophic cardiomyopathy but were overrepresented in the general population, and then reanalyzed those variants' classifications. They found that multiple patients of African American or unspecified ancestry tested by a leading genetic testing lab received reports that misclassified benign variants as pathogenic for the condition.
The widely cited study, published this August in the New England Journal of Medicine, powerfully demonstrated the need to sequence genomes of diverse populations and incorporate that data into control cohorts for accurate variant classification. "Given the frequency rate of the variants we observed, which were much more common in African Americans, even if relatively small cohorts of African Americans had been included in the original studies [as controls], it's likely that those variants wouldn't have been misclassified as pathogenic," Manrai said.
"Our philosophy, as most folks' philosophies are in this space, is that filtering putative pathogenic variation against as large a control cohort as possible is maximally powered to eliminate the types of false positives that we saw in our study, but also to find the true positives as well," he added.
23andMe's project, in Manrai's view, would be a welcome addition to the resources currently available for genetic research into non-European populations, but it'll be important to make that data widely available to the research community. He noted that data within ExAC, for example, is entirely open access. "If it's widely available for clinical geneticists and researchers, then it'll be another great resource for us," he said.
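Manrai's point about small control cohorts can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not taken from the NEJM study. It assumes a simple model in which each of a control's two chromosomes independently carries the variant with probability equal to the allele frequency, and the frequencies and cohort sizes shown are hypothetical values chosen only for illustration.

```python
def prob_seen_in_controls(allele_freq: float, n_controls: int) -> float:
    """Chance that at least one control chromosome carries the variant.

    Assumes 2 chromosomes per person, each carrying the variant
    independently with probability `allele_freq` (a simplification).
    """
    chromosomes = 2 * n_controls
    return 1.0 - (1.0 - allele_freq) ** chromosomes


def expected_variant_chromosomes(allele_freq: float, n_controls: int) -> float:
    """Expected number of control chromosomes carrying the variant."""
    return 2 * n_controls * allele_freq


if __name__ == "__main__":
    # Hypothetical allele frequencies: a truly rare variant vs. one that is
    # common in a population missing from the original control cohort.
    for freq in (0.0001, 0.01):
        for n in (50, 200, 1000):
            p = prob_seen_in_controls(freq, n)
            e = expected_variant_chromosomes(freq, n)
            print(f"freq={freq:<7} controls={n:<5} "
                  f"P(seen at least once)={p:.3f} expected copies={e:.1f}")
```

Under these assumptions, a variant carried on 1 percent of chromosomes in a population would almost certainly turn up among a few hundred controls from that population, which would flag it as too common to cause a rare disease, while a variant at 0.01 percent would usually go unseen in a cohort of the same size.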
Auton said 23andMe aims to make the sequencing data a resource for researchers around the world. To date, the firm has largely shared data from its research efforts through the published literature. For this project, however, the company will deposit the de-identified sequences of consenting customers in an NIH-supported database, such as the Database of Genotypes and Phenotypes (dbGaP).
23andMe will reach out to its existing customers to recruit the nearly 1,000 participants for this project. Those who consent will not need to give an additional sample for whole-genome sequencing.
Auton couldn't provide an estimate of how many 23andMe customers had already agreed to take part, but noted that approximately 80 percent of the company's more than 1.2 million genotyped customers have consented to research in general. More than 200,000 of 23andMe's customers have non-European ancestry.
Still, it's not clear, he admitted, if the same proportion of customers will agree to contribute to this project, because 23andMe will ask if it can share their genomic data with other researchers. "We will be contacting our customers to explain the benefits and risks of doing so," Auton explained. "We hope that people think this is a worthwhile effort to increase diversity in genetic studies."
23andMe's decision to share this data with the broader research community comes at a time when the National Institutes of Health, the National Cancer Institute, and the US Food and Drug Administration are encouraging researchers and genetic testing companies to submit genetic variant information to public repositories as a way to improve the field's collective understanding of how genomics impacts health. Sharing data on genetic variants is also a priority within national projects, such as the Precision Medicine Initiative and Cancer Moonshot.
The Precision Medicine Initiative will attempt to collect the data — medical records, genetic and metabolomic profiles, microbiomes, environmental exposures, and lifestyle habits — of a million volunteers in order to fuel research and advance understanding of diseases. Experts organizing the effort have repeatedly emphasized that the 1 million participant cohort should reflect the diversity of the US.
These large-scale initiatives will also need to ensure patient privacy to the extent possible and educate people about the limits of that protection when it comes to genetic information. For example, within dbGaP, individual-level data submissions must be scrubbed of names and other identifiable information. "The genetic fingerprint, however, is embedded in an individual’s genotype data, which is not de-identifiable," according to dbGaP's website. As such, dbGaP employs an authorized access system to distribute individual-level data.
Auton further pointed out that within this effort, 23andMe will share only genetic sequence data with the research community, but not any associated phenotypic information.
The latest effort to sequence the DNA of African American customers is part of a growing focus at 23andMe on improving diversity in genomics. The fact that the majority of genetics studies have been done on European populations, Auton said, is not only unethical, it's also bad science. "We strongly believe that including a range of genetic diversity is good science," he said. "By not studying individuals of other ancestries, we'd be blinding ourselves to some of the discoveries that could be [in those] other populations."
Earlier this year, the company received $250,000 from the NIH to develop a genomic analysis pipeline for identifying disease-linked genetic variants using admixture mapping and use that approach to improve genetic research for people of African, Asian, and Latino ancestry.
Then in October, 23andMe announced the African Genetics Project, through which it is offering free spit kits and analysis to people whose grandparents were all born in the same African country, or belong to the same ethnic or tribal group, in one of these countries: Angola, Benin, Burkina Faso, Cameroon, Ethiopia, Gabon, Gambia, Guinea-Bissau, Guinea, Ivory Coast, Liberia, Republic of Congo, Senegal, Somalia, Sudan, and Togo. 23andMe said it is prioritizing West African countries in this project since most of the slaves brought to America came from these locations.
The African Genetics Project is part of 23andMe's Roots into the Future research initiative, which it launched in 2011 with the goal of accelerating genetic research within the African American community by enrolling 10,000 participants who self-identify as African American, Black, or African. 23andMe was able to reach its recruitment goal approximately a year after the project launched, a company spokesperson said.