
GeneSeeker Links Nine Resources to Speed the Search for Candidate Disease Genes


Once a critical region for a genetic disease is identified on a chromosome, the work has just begun for the researcher who is charged with finding the candidate genes. Manually searching across numerous rapidly changing expression and phenotype databases is error-prone and time-consuming, so Marc van Driel and his colleagues at the Center for Molecular and Biomolecular Informatics (CMBI) at the University of Nijmegen in the Netherlands decided to automate the process.

The result, a web-based software tool called GeneSeeker, can do in minutes what would otherwise take hours or even days, said van Driel. In addition, because the software searches across nine key bioinformatics databases in real time, the results are likely to be more up to date and accurate than they would be through a manual search, he said.

GeneSeeker, available at, “gives a quick overview of the candidate genes for the disorders in the region you’re interested in,” van Driel said. Users can enter genetic mapping information (a chromosome, a chromosome arm, or a range) along with an expression or phenotype term, such as a tissue type or body part. The software then searches a total of nine databases in two categories: genetic localization (MimMap, MGD, and GDB), and gene expression and phenotype (Medline, OMIM, SwissProt/Trembl, GxD, Tbase, and MLC). It returns any gene names that appear in the specified location and are also associated with the specified tissue or body part.
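The matching step described above can be pictured as a set intersection. The following is a minimal sketch, not GeneSeeker's actual code: the gene names, the per-database hit sets, and the function `candidate_genes` are all hypothetical, and pooling the hits within each category before intersecting is an assumption made for illustration.

```python
# Hypothetical toy data: these hits are invented and do not come from
# GeneSeeker or from any of the databases it queries.
location_hits = {
    "MimMap": {"GENE_A", "GENE_B", "GENE_C"},
    "GDB": {"GENE_B", "GENE_D"},
}
expression_hits = {
    "OMIM": {"GENE_B", "GENE_E"},
    "SwissProt/Trembl": {"GENE_C", "GENE_F"},
}


def candidate_genes(location_hits, expression_hits):
    """Return genes found in the region AND linked to the tissue.

    Assumption: hits are pooled (unioned) within each category, then
    the two pooled sets are intersected.
    """
    in_region = set().union(*location_hits.values())
    in_tissue = set().union(*expression_hits.values())
    return in_region & in_tissue


candidates = candidate_genes(location_hits, expression_hits)
print(sorted(candidates))  # ['GENE_B', 'GENE_C']
```

In this toy case, four genes map to the region but only two of them also turn up in an expression or phenotype source, so only those two are reported.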

To overcome the discrepancies between gene names in the different resources, van Driel and his colleagues created a list of synonyms that combines the gene name information in SwissProt and the GDB. This synonym list is updated weekly.
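To see why a synonym list matters, consider a sketch in which every name is mapped to one canonical symbol before results are compared. This is an illustration under stated assumptions, not the CMBI team's implementation: the synonym groups, the case-insensitive matching, and the helper names `build_synonym_index` and `normalize` are all hypothetical.

```python
def build_synonym_index(synonym_groups):
    """Map every known name (upper-cased) to the first name in its group."""
    index = {}
    for group in synonym_groups:
        canonical = group[0].upper()
        for name in group:
            index[name.upper()] = canonical
    return index


def normalize(names, index):
    """Replace each name with its canonical symbol; unknown names pass through."""
    return {index.get(name.upper(), name.upper()) for name in names}


# Hypothetical synonym groups; real ones would be derived from SwissProt and GDB.
synonym_groups = [["GENE_A", "GA1", "alpha-1"], ["GENE_B", "GB"]]
index = build_synonym_index(synonym_groups)

# The same two genes are reported under different names by different sources.
region = normalize({"GA1", "GB", "GENE_C"}, index)
tissue = normalize({"Alpha-1", "GENE_B"}, index)

print(sorted(region & tissue))  # ['GENE_A', 'GENE_B']
```

Without the normalization step, the raw name sets in this example share no members, and both genes would be missed.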

The CMBI team recently tested the software for 10 diseases with known localization regions and a range of 49-322 positional candidate genes (average of 165). The results of their evaluation, published in a recent issue of the European Journal of Human Genetics, indicate that the software is not only fast, but effective: The number of candidate genes that matched both location and expression or phenotype was reduced to an average of 22.

The software will work best for researchers looking for a quick overview of the current status of a specific region, “but if you’ve already studied a lot of genes in the region and know the region by heart, then it’s not that useful,” van Driel said. However, the CMBI team plans to keep updating and improving the system. Van Driel said his team is currently adding datasets to the software’s search route, including Unigene, EST databases, and SAGE (serial analysis of gene expression) data.

The team is also “experimenting” with some metabolic pathway databases in order to expand the capabilities of the system into metabolic diseases, van Driel said.

— BT
