NEW YORK (GenomeWeb) – Historically, most genetic research has been conducted on individuals of European descent, and although more recent genomic sequencing studies have attempted to include individuals from different ancestral backgrounds, researchers are starting to discover that a lack of diversity in sequencing studies is perpetuating healthcare disparities among ethnic minorities.
In a study published last month in JAMA Oncology, researchers from Memorial Sloan Kettering Cancer Center used statistical methods to determine whether a large-scale cancer sequencing project included enough individuals from various populations to be sufficiently powered to discover cancer-related mutations specific to those populations.
Looking across more than 5,000 US samples from 10 different tumor types that were sequenced as part of The Cancer Genome Atlas project, the researchers found that the study included only enough samples to discover mutations for white Americans across all 10 tumor types as well as for black patients with breast cancer.
"This is an important study to understand the potential sources of healthcare disparities in cancer," Arjun Manrai, a research fellow in biomedical informatics at Harvard Medical School who was not affiliated with the study, told GenomeWeb.
Researchers know that racial diversity exists within the mutational landscape of cancer. For instance, nearly half of all lung cancer patients of Asian descent harbor an EGFR mutation, but the mutation is present at only about a 20 percent frequency in white and African American patients. However, the extent of this diversity is unknown.
Senior author of the study, Joseph Osborne, a radiologist and nuclear medicine physician at Memorial Sloan Kettering Cancer Center, said that he and first author Daniel Spratt from the University of Michigan were initially interested in looking for prostate cancer-related mutations that might be more prevalent in African American populations.
African American men have higher rates of prostate cancer diagnosis, relapse, and death, and worse overall outcomes, than other populations, Osborne said. The researchers first looked through the TCGA data to see if they could identify any new variants that might be related to this disparity.
However, an analysis of the data showed that there was a "deficit in the number of samples needed" to discover new causal mutations, despite the fact that "the TCGA had actually done a fantastic job of reaching out and looking for participation of ethnic minorities," Osborne said.
The problem, Osborne said, was that the TCGA researchers had included the various ethnicities in proportional numbers, based in part on census data. But those sample sizes "did not provide adequate statistical power to discover population-specific mutations."
Osborne and his colleagues next went back through a subset of the TCGA data, looking at 5,729 samples from 10 different tumor types. Of those samples, more than three-fourths were from white patients, while 660 were from black patients, 173 from Asian patients, 149 from Hispanic patients, and fewer than 26 from patients of Native Hawaiian, Pacific Islander, Alaskan Native, or American Indian descent.
The researchers used statistical methods to calculate the number of samples that would be needed to detect new mutations present at 10 percent frequency and 5 percent frequency for each cancer type over the background noise of variation with 90 percent power in 90 percent of genes.
The number of samples needed was based on the overall rate of somatic mutations per megabase for the different cancers, which has been previously reported. That rate ranges from around 0.7 for prostate cancer to 9.9 for lung squamous cell cancer.
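To give a sense of how a calculation of this kind works, the following Python sketch estimates how many samples are needed for 90 percent power under a simple binomial model. It is an illustration only, not the method used in the JAMA Oncology study: the gene length of 1,500 coding bases, the 20,000-gene Bonferroni correction, the 0.1 significance level, and all function names are assumptions, and the sketch ignores the "90 percent of genes" criterion.

```python
import math


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p), each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def critical_count(n, p0, alpha):
    """Smallest count k of mutated patients with P(X >= k | background only) <= alpha."""
    for k in range(n + 2):
        if 1.0 - binom_cdf(k - 1, n, p0) <= alpha:
            return k


def detection_power(n, p0, freq, alpha):
    """Chance of reaching the critical count when a fraction `freq` of patients
    carries a mutation in the gene on top of the background rate p0."""
    p1 = p0 + freq * (1.0 - p0)
    return 1.0 - binom_cdf(critical_count(n, p0, alpha) - 1, n, p1)


def samples_needed(rate_per_mb, freq, gene_len_bp=1500, n_genes=20000,
                   significance=0.1, target_power=0.9, max_n=5000):
    """Smallest n reaching the target power, or None if max_n is not enough.

    gene_len_bp, n_genes and significance are illustrative assumptions.
    """
    # Probability that one patient has at least one background mutation in the gene.
    p0 = 1.0 - (1.0 - rate_per_mb * 1e-6) ** gene_len_bp
    alpha = significance / n_genes  # Bonferroni correction across genes
    for n in range(1, max_n + 1):
        if detection_power(n, p0, freq, alpha) >= target_power:
            return n
    return None


# Background rates quoted in the article, in mutations per megabase.
for label, rate in [("prostate", 0.7), ("lung squamous cell", 9.9)]:
    for freq in (0.10, 0.05):
        print(label, freq, samples_needed(rate, freq))
```

Under these assumptions, the required sample size rises as the background mutation rate goes up or the target mutation frequency goes down, which is the pattern the researchers describe. The absolute numbers the sketch prints should not be expected to match those in the study.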
The researchers found that although enough samples from white patients were included for each of the 10 tumor types to be able to detect somatic mutations present at 10 percent frequency or higher, the same held true only for black breast cancer patients. Meanwhile, at the 5 percent frequency level, no group other than white patients had enough samples to discover new mutations for any tumor type.
"All underrepresented groups remained so when their samples were queried," Osborne said.
Although the idea that genomic variation differs depending on one's ancestry is not new, only recently has technology enabled researchers to more fully understand population-specific variation, including identifying variants that are common in one population but rare in another, Manrai said.
"It's important to do these power calculations so that we have a sense of how many samples are needed to reliably detect mutations over the background rate," Manrai said.
Manrai was not involved with the JAMA Oncology study, but led a study published in the New England Journal of Medicine last month that found that patients of African or unspecified ancestry were often misdiagnosed with hypertrophic cardiomyopathy due to incorrectly classified variants. Because individuals of African descent were not well represented in databases, variants in disease-causing genes that were rare in white populations were labeled as likely disease-causing, despite the fact that they are common and benign in African American individuals.
Manrai said that the power calculation approach that Osborne and his team used in the JAMA Oncology study is a "tried and true" statistical approach. He said that that study and his NEJM study tackled different sides of the same problem. While his study looked to identify false positives due to underrepresentation of minority groups, the JAMA Oncology study looked to tackle the problem of false negatives, that is, not being able to call disease-causing mutations.
Both studies, though, come to the same conclusion: that a lack of diversity in sequencing studies may be perpetuating healthcare disparities among minorities.
Manrai added that the calculations done by Osborne's group for the TCGA cohort were also important because they shed light on the fact that even in studies that do include minority groups, it is not always enough to include a proportional representation of the various populations. Historically, "proportional representation has been accepted," he said, but now it's clear that even those studies may "lack absolute sample size to call a mutation causal, which may actually widen disparities."
Osborne said that he hopes the results of this study will be used to help inform researchers whether specific studies are statistically powered for different ethnic groups. For instance, he said, by applying the same statistical methods to other sequencing cohorts, researchers could pinpoint how many additional samples of a given ethnicity would be needed.
He noted that he has already collaborated with researchers at the Broad Institute and Weill Cornell Medical College on a sequencing study of African American males to identify new mutations in prostate cancer that could eventually help address disparities.
In addition, he said that colleagues from the National Institutes of Health's Center to Reduce Cancer Health Disparities will present results from this study to members of the TCGA and discuss the possibility of expanding it to include cohorts of patients from underrepresented groups. "We're looking to fill in the gaps," he said.