NEW YORK (GenomeWeb News) – Researchers studying the underlying genetic causes of human diseases can dramatically cut their costs and save time by mining data about real patients found in electronic medical records (EMRs), instead of recruiting and sorting participants as they look for common genetic variants, according to a new study led by Northwestern University.
Recruiting the thousands of patients needed to gather enough health data to find genetic clues for diseases and to identify disease phenotypes is "expensive and time consuming," the researchers said. But EMRs give researchers access to real patient data that have already been collected during doctors' visits, they noted.
In the study, published in the April 20 issue of Science Translational Medicine, researchers were able to cull patient information from EMRs at five different national sites to accurately identify patients with five kinds of diseases and health conditions.
"The hard part of doing genetic studies has been identifying enough people to get meaningful results," lead investigator Abel Kho, an assistant professor of medicine at the Northwestern University Feinberg School of Medicine and a physician at Northwestern Memorial Hospital, said in a statement. "Now we've shown you can do it using data that's already been collected in electronic medical records and can rapidly generate large groups of patients."
The research harnessed the resources of the National Human Genome Research Institute-led Electronic Medical Records in Genomics (eMERGE) Network, including eMERGE partners at Northwestern, Mayo Clinic, Vanderbilt University, Marshfield Clinic Research Foundation, and Group Health.
The study investigators searched EMRs from these sites for patients with type 2 diabetes, dementia, peripheral arterial disease, cataracts, and cardiac conduction defects.
Kho and his research team identified the diseases by using several search criteria including medications, diagnoses, and lab tests, and then they tested their results using physician review. The team found that EMRs enabled them to identify patients' diseases with 73 to 98 percent accuracy.
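As a rough illustration of how a rule-based record search of this kind might work, the Python sketch below flags candidate type 2 diabetes cases by combining diagnoses, medications, and lab tests, and then measures how often a reviewer agrees with the flags. It is not the study's algorithm: the record layout, the code and drug lists, the HbA1c cutoff, and the rule that combines them are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EMR data."""
    patient_id: str
    diagnosis_codes: set = field(default_factory=set)
    medications: set = field(default_factory=set)
    lab_results: dict = field(default_factory=dict)  # lab name -> latest value


# Illustrative criteria only; these are assumptions, not the study's definitions.
T2D_DIAGNOSIS_CODES = {"250.00", "250.02"}
T2D_MEDICATIONS = {"metformin", "glipizide"}
HBA1C_THRESHOLD = 6.5  # percent


def is_candidate_case(record: PatientRecord) -> bool:
    """Flag a record that has a diagnosis code plus a medication or abnormal lab."""
    has_diagnosis = bool(record.diagnosis_codes & T2D_DIAGNOSIS_CODES)
    has_medication = bool(record.medications & T2D_MEDICATIONS)
    hba1c = record.lab_results.get("hba1c")
    has_abnormal_lab = hba1c is not None and hba1c >= HBA1C_THRESHOLD
    return has_diagnosis and (has_medication or has_abnormal_lab)


def agreement_with_review(flagged_ids, reviewer_confirmed_ids) -> float:
    """Return the fraction of flagged patients that a reviewer confirmed."""
    flagged = set(flagged_ids)
    if not flagged:
        raise ValueError("no flagged patients to evaluate")
    return len(flagged & set(reviewer_confirmed_ids)) / len(flagged)


# Made-up records for demonstration; no real patient data.
records = [
    PatientRecord("p1", {"250.00"}, {"metformin"}, {}),
    PatientRecord("p2", {"250.00"}, set(), {"hba1c": 7.1}),
    PatientRecord("p3", set(), {"metformin"}, {}),
    PatientRecord("p4", {"250.02"}, set(), {"hba1c": 5.4}),
]

flagged = [r.patient_id for r in records if is_candidate_case(r)]
agreement = agreement_with_review(flagged, reviewer_confirmed_ids={"p1"})
print(flagged, agreement)
```

Requiring a diagnosis code plus at least one supporting signal is only one possible design; a looser rule would flag more patients but would likely lower agreement with the reviewer.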
The researchers also found that they were able to use the EMRs to reproduce genetic findings from earlier prospective studies.
As the cost of genome sequencing continues to fall, Kho noted, it should eventually be possible to include patients' genomes in their medical records, providing a bounty of information for disease researchers.
"With permission from patients, you could search electronic health records at not just five sites but 25 or 100 different sites and identify 10,000 or 100,000 patients with diabetes, for example," Kho explained.
The larger the studies, the better they could be at detecting rare effects of genes and providing more detail about the genetic sequences that lead to diseases, Northwestern said.
The study also found, however, that there were "across-the-board weaknesses" in the EMRs, all of which used different software. The EMRs also did a poor job of capturing factors such as race and ethnicity, smoking status, and family history.
"It shows we need to focus our efforts to use electronic medical records more meaningfully," Kho said.