Genome Sequencing, EHR Machine Learning Method Detects Unnoticed Hospital Outbreaks


NEW YORK – A team led by researchers at the University of Pittsburgh has developed a genome sequencing- and machine learning-based strategy for identifying and preventing healthcare-acquired infections that appears to outperform conventional infection prevention methods.

The method, called "Enhanced detection system for healthcare-associated transmission" (EDS-HAT), currently relies on whole-genome sequencing-based surveillance for a defined set of bacterial pathogens that are most often found in healthcare settings. It pairs that surveillance with machine learning-based analyses of electronic health records to home in on the procedures, exposures, or hospital locations linked to the infections that do arise.

"What we've shown is that [the] approach is very effective at identifying undetected outbreaks," said Lee Harrison, a researcher at the University of Pittsburgh and UPitt Graduate School of Public Health, who noted that "we use the machine learning part to figure out what is causing the outbreak. What's the transmission route for the outbreak that was detected by sequencing?"

Although genome sequencing has previously been used to characterize microbes involved in hospital outbreaks and their transmission patterns, this approach is often applied to understand the trajectory of a specific, recognized outbreak that has already concluded.

"The way we have historically detected outbreaks is by basically somebody in the hospital noticing something unusual — a cluster of a particular organism, say, on a nursing unit," Harrison noted. "When they notice an outbreak, they will ask our lab to do sequencing of the bacteria to confirm that an outbreak has occurred or to refute an outbreak, if there was no transmission."

While such reactive sequencing strategies can provide insights into prior outbreaks, they tend to be inefficient and slow to detect or halt outbreaks that are picking up momentum. In contrast, Harrison and his colleagues reasoned that more extensive sequencing informed by electronic health record data could help to find and react to potential outbreaks more quickly.

With EDS-HAT, he explained, "if you're admitted to the hospital and you have an infection with one of the EDS-HAT organisms, we sequence it, whether we think you're part of an outbreak or not." That genome sequencing is coupled with machine learning to home in on outbreak causes that may otherwise have gone unnoticed, from shared medical procedures to a contaminated surface to which multiple patients with related infections have been exposed.

"Both components of EDS-HAT are essential: [Whole-genome sequencing] surveillance is used to 'connect the dots' between seemingly unrelated patients to signal an outbreak and [machine learning], in combination with review by an [infection prevention] expert, then identifies the responsible transmission route," Harrison and his colleagues wrote in a related paper published in the journal Clinical Infectious Diseases last week.
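The two-step logic described above can be illustrated with a minimal Python sketch. It is not the EDS-HAT implementation, and everything in it is an assumption made for illustration: the pairwise SNP-distance input, the 15-SNP cutoff, the single-linkage grouping, the patient labels, and the toy exposure records are all hypothetical. A simple count of shared exposures also stands in for the machine learning step, which the article does not describe in detail.

```python
from collections import Counter
from itertools import combinations

# Hypothetical cutoff: isolates this close are treated as genetically related.
SNP_THRESHOLD = 15


def cluster_isolates(isolates, snp_distance, threshold=SNP_THRESHOLD):
    """Group isolates by single linkage on pairwise SNP distance (union-find)."""
    parent = {iso: iso for iso in isolates}

    def find(x):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if snp_distance[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for iso in isolates:
        groups.setdefault(find(iso), []).append(iso)
    # Only groups with at least two isolates count as candidate clusters.
    return [sorted(members) for members in groups.values() if len(members) >= 2]


def rank_shared_exposures(cluster, exposures):
    """Count exposures shared by two or more patients in a cluster.

    This plain tally is a stand-in for the machine learning step.
    """
    counts = Counter()
    for patient in cluster:
        counts.update(set(exposures.get(patient, ())))
    return [(name, n) for name, n in counts.most_common() if n >= 2]


# Toy data: one isolate per patient, labeled by patient.
isolates = ["P1", "P2", "P3", "P4"]
snp_distance = {
    frozenset(("P1", "P2")): 3,
    frozenset(("P1", "P3")): 8,
    frozenset(("P2", "P3")): 5,
    frozenset(("P1", "P4")): 120,
    frozenset(("P2", "P4")): 118,
    frozenset(("P3", "P4")): 125,
}
exposures = {
    "P1": ["interventional radiology", "unit 7"],
    "P2": ["interventional radiology", "unit 3"],
    "P3": ["interventional radiology"],
    "P4": ["unit 7"],
}

clusters = cluster_isolates(isolates, snp_distance)
ranked = rank_shared_exposures(clusters[0], exposures)
print(clusters)  # [['P1', 'P2', 'P3']]
print(ranked)    # [('interventional radiology', 3)]
```

In this toy case, sequencing links three patients on different units, and the exposure tally then points to the one procedure they have in common. In the workflow Harrison describes, an infection prevention expert would review such a lead before it is treated as a transmission route.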

For that proof-of-principle study, the team compared EDS-HAT with conventional infection prevention practices at the University of Pittsburgh Medical Center-Presbyterian Hospital between November 2016 and November 2018.

Whole-genome sequencing for EDS-HAT and other outbreak sequencing at the center is typically done with Illumina short-read technologies, Harrison said, although long-read Pacific Biosciences reads and hybrid assemblies are used for EDS-HAT in very select circumstances. The method can be applied to a range of sample types, depending on a patient's condition and the clinical workflow already being used.

Over that two-year period, he and his team sequenced nearly 3,200 clinical isolates from 2,752 different patients within the EDS-HAT framework, identifying 297 related microbes from 99 clusters or outbreaks involving several high-risk species and strains currently targeted by EDS-HAT.

"We picked the high-impact ones, because we wanted to focus on the ones that we had the resources to sequence," Harrison said, noting that a microbe-agnostic approach to sequencing would also be possible given sufficient resources.

Indeed, the team is in the early stages of a related EDS-HAT approach that applies to bacteria regardless of the species involved, though that method has not yet been validated. The group has also secured additional funding to continue developing EDS-HAT methods that can find not only bacterial infections but also infections involving respiratory viruses — an application that would involve direct sequencing from clinical specimens that have sufficient viral load.

"What we're going to find, I believe, is something similar to what we found with the bacterial side of EDS-HAT," Harrison suggested, "which is, we're going to find transmission routes that we did not know about."

When it came to the bacterial outbreaks found in the current study, the investigators saw dozens of related readmissions in the first week or month after patients were discharged. They also retraced apparent transmission routes for 65 of the 99 outbreaks.

Outbreaks ranged in size from just two patients in a smaller cluster to 14 patients, according to first author Alexander Sundermann, clinical research coordinator with the University of Pittsburgh Microbial Genomic Epidemiology Laboratory and a doctoral student in infectious diseases and epidemiology at the University of Pittsburgh's Graduate School of Public Health.

During the same time frame, the more traditional infection prevention methods picked up 15 suspected outbreaks affecting 133 patients, who were subsequently tested with sequencing, Sundermann said, though only a handful of transmission events involving genetically related bacteria were confirmed.

"It really shows that … traditional methods [are] pretty inaccurate, whereas we captured so [many] outbreaks that went undetected at that time," he noted.

Those outbreaks included a cluster of vancomycin-resistant Enterococcus faecium infections spread out across several hospital units and over many months that stemmed from an interventional radiology procedure, Sundermann recalled. Those infections appeared to be caused by a non-sterile preparation of a contrast compound that was injected into patients before their procedures.

"The procedures that were occurring were done [according to] the manufacturer's recommendations, although we didn't see hand hygiene performed," he noted. "Once we intervened and had some hand precautions, we didn't see any more related cases to that [outbreak]."

The team published findings from that outbreak in an earlier Clinical Infectious Diseases study. Along with educating healthcare workers about the source of the outbreak and recommending more sterile methods that are now being used at the hospital, the authors alerted the company involved to this potential infection issue.

"It shows the utility of using EDS-HAT, because it would have never been caught before and we don't know how long it was going on," Sundermann said, noting that "we haven't ever heard of an outbreak from interventional radiology, so we were happy to share and publish that, because we're pretty sure it was happening at other hospitals across the US and the world."

"The folks in interventional radiology were just following the training video, and the training video showed unsafe practices," Harrison added. "It's a global company … and I would be surprised if there weren't similar problems at other institutions."

At the University of Pittsburgh Medical Center-Presbyterian Hospital, where the EDS-HAT study took place, the method also appeared to offer cost savings over existing infection prevention methods, particularly because it can help more patients avoid infections that may require additional treatment or extend their time in the hospital.

Calculations for the study used some assumptions that were specific to hospitals or healthcare centers that have ties to academic sites with genome sequencing capacity, Harrison said. He noted that the team is working on a related analysis of the cost and other considerations for doing EDS-HAT-based infection prevention at hospitals that do not have that kind of sequencing capacity.

"We're going in a stepwise fashion," Harrison explained. "Now that we've shown the value of the system, we're changing our workflow in the lab to do real-time sequencing. So we're basically starting to implement EDS-HAT as a real-time tool."

Because many hospitals do not yet have the sequencing capacity and related resources to implement a system like EDS-HAT on their own, the investigators are also in the early stages of considering potential commercialization strategies to reach more healthcare centers. The model for that commercialization remains to be seen, and may depend somewhat on the genome sequencing capacity that a given healthcare setting already has in place or available to it, Harrison said.

At hospitals with whole-genome sequencing capacity or sequencing arrangements with a commercial sequencing lab, for example, the team may offer its analytical algorithms or data mining services, while other sites may be looking for sequencing as well as analytical services.

"There are all kinds of different models that might be appropriate for some hospitals over others," Harrison said.

Sundermann noted that the team is also working on a review of genome sequencing-based infectious disease surveillance at other healthcare centers, which has found that sequencing is primarily used as a retrospective tool for understanding a specific outbreak event involving a single organism.

Even so, "we do find that all of these hospitals that did [genome sequencing] are finding outbreaks that would have gone otherwise unnoticed," he said, noting that the machine learning side of EDS-HAT appears to set it apart further from approaches used in the past.