Benchmarking Tool to Assess Medical Large Language Models

A team from Google Research, the US National Library of Medicine, and DeepMind has developed a benchmark to assess the clinical knowledge of medical large language models (LLMs) and applied it to two such LLMs. As they report in Nature, the researchers built the benchmark, dubbed MultiMedQA, by combining six existing medical knowledge datasets with a seventh dataset they developed of commonly searched health questions. Using MultiMedQA, the team assessed the Pathways Language Model (PaLM) and the related Flan-PaLM and found that Flan-PaLM had 67.6 percent accuracy on US medical licensing-style questions, which they noted was better than other approaches. However, Flan-PaLM struggled with answering consumer medical questions in long form. The team then made adjustments to the model to produce a version called Med-PaLM, and a panel of clinicians found its answers to be in line with the scientific consensus nearly 93 percent of the time, as compared to about 62 percent of the time for Flan-PaLM. "Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications," the researchers write.

The Scan

Positive Framing of Genetic Studies Can Spark Mistrust Among Underrepresented Groups

Researchers in Human Genetics and Genomics Advances report that how researchers describe genomic studies may alienate potential participants.

Small Study of Gene Editing to Treat Sickle Cell Disease

In a Novartis-sponsored study in the New England Journal of Medicine, researchers found that a CRISPR-Cas9-based treatment targeting promoters of genes encoding fetal hemoglobin could reduce disease symptoms.

Gut Microbiome Changes Appear in Infants Before They Develop Eczema, Study Finds

Researchers report in mSystems that infants experienced an enrichment in Clostridium sensu stricto 1 and Finegoldia and a depletion of Bacteroides before developing eczema.

Acute Myeloid Leukemia Treatment Specificity Enhanced With Stem Cell Editing

A study in Nature suggests epitope editing in donor stem cells prior to bone marrow transplants can stave off toxicity when targeting acute myeloid leukemia with immunotherapy.