Experts Discuss Need for Diversity in Precision Medicine, Potential for Bias in AI


SAN FRANCISCO (GenomeWeb) – In order to realize the promise of precision medicine and ensure that it does not perpetuate healthcare disparities, diverse data sets will be needed, experts said at the BioData World West conference here this week.

Diversity is needed both in order to understand how population differences in genomic variation impact health and also to ensure that the bioinformatics and artificial intelligence tools that researchers are turning to for data analysis are not biased.

During panel discussions at the conference, experts discussed these issues, as well as potential solutions, including the role that regulators could play.

In recent years, there's been increased attention paid to ensuring that genomic studies, which have historically skewed towards analyzing individuals of European ancestry, include diverse populations. The National Institutes of Health's All of Us study, for instance, aims to enroll participants who reflect the country's diversity.

That's because researchers are increasingly finding that there are population-specific variants that could have clinical impacts. For instance, researchers at the University of Pennsylvania have found that there may be population differences in mutations that predispose to breast cancer. And, researchers who de novo sequenced and assembled a Korean reference genome have reported finding more than 11,000 novel structural variants, some of which are unique to Korean populations and related to disease, including a variant associated with rheumatoid arthritis in Korean populations that is not found in European populations.

At the conference, experts acknowledged that there is still a long way to go toward making sure that genomic data is more diverse.

Adam Berger, a senior fellow of personalized medicine at the US Food and Drug Administration who is Caucasian, acknowledged that "most clinical trials are based on people who look like me," but said that for projects such as All of Us, recruiting a more diverse and representative population was "part of its core, so that we can create a system that's useful for everyone and not just a subset of folks."

He said that would require efforts to engage minorities and other populations who have been underrepresented and may have historically mistrusted the medical establishment. "We need to make sure they're equal partners," he said.

Sandy Aronson, executive director of information technology at Partners HealthCare Personalized Medicine, agreed. "We need to do a better job of this so we can better interpret variants and better serve minority populations," he said. "And, we also need to do a better job in order to better understand genetics in general."

The lack of diversity in genomic databases has already been shown to have real clinical impacts. For instance, researchers in February published a study in the journal JAMA Cardiology that found that cardiomyopathy genetic tests were better able to identify pathogenic variants in white patients than in patients of other ethnicities.

And, researchers at last year's annual meeting of the American Association for Cancer Research discussed how a lack of diversity in large cancer genomic datasets can lead to poorer understanding of the biology of cancer in different populations and make it difficult to pinpoint the causes of outcome discrepancies.

There is concern that as genomic testing becomes more widely adopted, these differences will be exacerbated unless there is a real effort to address the knowledge gaps.

That concern is particularly heightened with the introduction of artificial intelligence into precision medicine and diagnostics. Companies are looking to use artificial intelligence to help diagnose rare genetic and neurological diseases, to detect cancer early, and more.

Alex Zhavoronkov, CEO of InSilico Medicine, explained how biases can be built into such systems if the data used to develop them is not diverse. He developed an age-prediction algorithm and showed that when such algorithms are trained on an ethnically homogeneous dataset, they perform worse when analyzing other ethnicities. For example, he described a study in which an algorithm used age-related biomarkers to predict a person's age. But there are population differences in those biomarkers, so if the algorithm is trained on one population, it does not perform well on other populations.
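The effect Zhavoronkov described can be illustrated with a toy simulation. The sketch below is not his algorithm or data: it assumes a single synthetic biomarker that rises with age at a different rate in two made-up populations, fits a simple least-squares line on one population, and compares the prediction error on both. All names and numbers are hypothetical.

```python
import random


def simulate(n, slope, intercept, rng):
    """Return (biomarker, age) pairs for a synthetic population.

    The slope and intercept are invented values standing in for a
    population-specific relationship between a biomarker and age.
    """
    data = []
    for _ in range(n):
        age = rng.uniform(20, 80)
        biomarker = intercept + slope * age + rng.gauss(0, 2.0)
        data.append((biomarker, age))
    return data


def fit_line(data):
    """Ordinary least squares for age = a + b * biomarker."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mean_abs_error(model, data):
    """Average absolute difference between predicted and true age."""
    a, b = model
    return sum(abs(a + b * x - y) for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
pop_a_train = simulate(500, slope=1.0, intercept=10.0, rng=rng)
pop_a_test = simulate(500, slope=1.0, intercept=10.0, rng=rng)
pop_b_test = simulate(500, slope=0.7, intercept=10.0, rng=rng)

model = fit_line(pop_a_train)  # trained on population A only
mae_a = mean_abs_error(model, pop_a_test)
mae_b = mean_abs_error(model, pop_b_test)
print(f"Error on population A (years): {mae_a:.1f}")
print(f"Error on population B (years): {mae_b:.1f}")
```

In this toy setup the model's error is much larger on the population it never saw, because the biomarker-to-age relationship it learned does not hold there. Real biomarker panels are far more complex, but the mechanism is the same.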

"Educating people about the value of data and inclusion is important," Zhavoronkov said.

Diversity in data is important, Greg Corrado, director of augmented intelligence research at Google, agreed. "We need broad data sharing in order to build systems that are fair and accurate," he said.

He noted that many researchers and organizations are reluctant to share their data, and he said that one potential solution would be to enable sharing without requiring organizations to give up control of the data. It's possible to design a system that would enable data owners to "move their data into the cloud but retain complete control over how the data is used," he said.

"This notion of biases in data is really interesting," Aronson added, and said that one way to identify biases early on would be to treat the algorithm like an FDA-regulated medical device. For medical devices to be cleared by the FDA, they have to pass a hazard analysis. That involves getting everyone who knows the device well to brainstorm every possible way it could go wrong and, for each risk, to assess the likelihood it will happen and the harm to patients.

"As these technologies evolve and move quickly into healthcare, this is often being missed," he said, "and we really need to focus on doing that."
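The kind of hazard analysis Aronson described, listing what could go wrong and weighing likelihood against harm, can be sketched as a simple ranked table. The example below is an illustration only, not an FDA procedure: the 1-to-5 scales, the likelihood-times-severity score, and the hazards themselves are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    description: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)
    severity: int    # hypothetical scale: 1 (negligible) to 5 (severe harm)

    @property
    def risk_score(self) -> int:
        # Assumed scoring rule: likelihood multiplied by severity.
        return self.likelihood * self.severity


def rank_hazards(hazards):
    """Return hazards ordered from highest to lowest risk score."""
    return sorted(hazards, key=lambda h: h.risk_score, reverse=True)


# Made-up hazards for an AI-based diagnostic tool.
hazards = [
    Hazard("Image upload fails and must be retried", 2, 2),
    Hazard("Training data lacks diversity; accuracy drops for some groups", 4, 4),
    Hazard("Pathogenic variant is reported as benign", 2, 5),
]

ranked = rank_hazards(hazards)
for h in ranked:
    print(f"{h.risk_score:>2}  {h.description}")
```

Under these assumed scores, the biased-training-data hazard ranks first, which is the kind of issue such a review is meant to surface before a tool reaches patients.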

Aronson added that as AI and genomic technologies move into a healthcare setting, the FDA would have a role to play in ensuring that the technologies are safe, effective, and not built on biased data. Although many companies developing the technologies may see the FDA as an obstacle, he said companies should welcome an engaged regulator.

"If you're competing in a market, you want a standard to exist such that everyone who is competing has to meet that standard," he said.

Chris Mansi, CEO of, a company that has developed technology to analyze medical images to recognize when someone was having a stroke, agreed. "The FDA is very forward thinking in how they're thinking about AI and technology," he said. The key is to start conversations with the FDA early, he said.
