eMERGE Network Launches Publicly Available Database of Phenotype Identification Algorithms

By Uduak Grace Thomas

SAN FRANCISCO, Calif. — The Electronic Medical Records and Genomics Network has launched an open resource, dubbed the Phenotype KnowledgeBase, which offers access to validated algorithms for identifying patients with specific disease phenotypes based on data in their electronic medical records.

Joshua Denny, an assistant professor in the biomedical informatics and medical departments at eMERGE participant Vanderbilt University, described the new resource at the American Medical Informatics Association's Summit on Translational Bioinformatics here this week.

PheKB currently includes 12 algorithms developed by members of the eMERGE consortium, though others are welcome to contribute their tools, Denny told BioInform.

The algorithms use natural language processing techniques to mine EMR data for patients with particular conditions of interest to researchers, such as cataracts, Alzheimer’s disease, low levels of high-density lipoprotein, and type II diabetes, among others.

These algorithms select patients using various search criteria, such as ICD9 codes, Current Procedural Terminology codes, laboratory test results, and medications, according to the website.
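To make the idea of criteria-based selection concrete, the following is a minimal sketch of how a rule of this kind might be expressed in Python. It is not taken from PheKB or from any eMERGE algorithm: the `PatientRecord` structure, the function names, the code prefix, the medication list, and the lab threshold are all assumptions chosen for illustration, and the sketch leaves out both CPT codes and the natural language processing step.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's structured EMR data."""
    patient_id: str
    icd9_codes: set = field(default_factory=set)
    medications: set = field(default_factory=set)
    labs: dict = field(default_factory=dict)  # lab name -> most recent value


# Illustrative criteria only; these are assumptions, not a published algorithm.
DIABETES_ICD9_PREFIX = "250"                 # ICD-9 diabetes mellitus codes
DIABETES_MEDICATIONS = {"metformin", "glipizide"}
HBA1C_THRESHOLD = 6.5                        # percent


def is_candidate_case(record: PatientRecord) -> bool:
    """Require a billing code plus supporting evidence from a drug or a lab."""
    has_code = any(code.startswith(DIABETES_ICD9_PREFIX)
                   for code in record.icd9_codes)
    on_medication = bool(record.medications & DIABETES_MEDICATIONS)
    abnormal_lab = record.labs.get("hba1c", 0.0) >= HBA1C_THRESHOLD
    return has_code and (on_medication or abnormal_lab)


def select_cases(records):
    """Return the IDs of records that satisfy the rule, in input order."""
    return [r.patient_id for r in records if is_candidate_case(r)]


if __name__ == "__main__":
    cohort = [
        PatientRecord("p1", icd9_codes={"250.00"}, medications={"metformin"}),
        PatientRecord("p2", icd9_codes={"401.9"}, medications={"metformin"}),
    ]
    print(select_cases(cohort))
```

Requiring a second line of evidence alongside the billing code is one common way to reduce false positives from coding errors; validated algorithms would likely combine many more criteria and exclusions than this sketch does.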

Scientists from the consortium have been using the algorithms for a number of projects, including a study published last April in which they mined data from five institutions to find patients in each of five disease groups (BI 4/22/2011).

Denny explained that the consortium developed the database so that its tools could be better disseminated to other research efforts, such as the Pharmacogenomics Research Network, that are also studying disease phenotypes.

Initially, the eMERGE algorithms were made available through the consortium’s Wikipedia page, but that method did not allow the kind of “interactivity” the researchers were looking for, Denny said.

Through PheKB, users can share their own tools as well as any updates that they make to existing algorithms on the website, he said.

Additionally, users can share tips on how they implemented the algorithms at their sites as well as the results of their research efforts if they like, Denny said.

Now in its second phase, the eMERGE project is preparing to run additional genome-wide association studies on several new disease phenotypes.

Last August, the National Human Genome Research Institute awarded $25 million in grants for the second phase of the project, which is expected to last four years. During this phase, the investigators plan to run genome-wide association studies across the entire eMERGE network to identify genetic variants associated with more than 40 disease characteristics and symptoms (BI 8/19/2011).

Denny told BioInform that the group has already begun working on 21 new phenotypes.

