REHOVOT, Israel--The Weizmann Institute's genome center and bioinformatics unit here last month released version 2.7 of GeneCards (http://bioinfo.weizmann.ac.il/cards/). The freely accessible product, under development since 1996, is an online database and software tool designed to provide researchers and physicians with fast and convenient access to genetic information. In 1998, the GeneCards site received 321,765 hits, 94 percent of which came from outside Israel.
GeneCards features a page, or card, for every human gene. Each card contains a particular gene's vital statistics, such as its official name, the protein it encodes, its function in the cell, its location on a chromosome, and information on the diseases caused by its mutations.
The GeneCards concept, scripts, and web interfaces were conceived by Jaime Prilusky, head of Weizmann's bioinformatics unit, with Michael Rebhan, postdoctoral fellows Liora Yaar and Vered Chalifa-Caspi, and engineer Marilyn Safran.
Prilusky said one of GeneCards' most striking features is its "intelligent interface," which helps searchers efficiently find the data they need. If a query produces no results, for instance, the database may suggest a way to reformulate the question, correct a spelling error (Altshimer is immediately linked to Alzheimer), or provide tips on where else to look for information.
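The article does not say how GeneCards matches a misspelled query to a known term. As a minimal sketch of the general idea only, the Python below uses the standard library's difflib to pick the closest entry from a small vocabulary. The function name suggest_term, the KNOWN_TERMS list, and the 0.6 similarity cutoff are all assumptions made for illustration, not details of the GeneCards software.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; a real system would draw terms from its own index.
KNOWN_TERMS = ["Alzheimer", "Parkinson", "Huntington", "cystic fibrosis"]


def suggest_term(query, vocabulary=KNOWN_TERMS, cutoff=0.6):
    """Return the vocabulary term closest to `query`, or None if nothing is close.

    Matching is case-insensitive; `cutoff` is an assumed similarity threshold
    between 0 and 1 (higher means stricter).
    """
    # Map lowercase forms back to the original spelling.
    lowered = {term.lower(): term for term in vocabulary}
    matches = get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


if __name__ == "__main__":
    print(suggest_term("Altshimer"))
    print(suggest_term("zzzz"))
```

A query with no sufficiently close term returns None, at which point an interface of this kind could fall back to reformulation tips or pointers to other resources.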
The product is updated on an ongoing basis by software that searches relevant genomic databases and websites, then organizes and presents the data in a concise and easy-to-read format. Prilusky said that the GeneCards Encyclopedia includes data extracted from several databases:
* From the Genome Database, GeneCards presents extracts from gene entries, namely synonyms for gene symbols, along with links to the gene entries.
* From the Mouse Genome Database, which is a comprehensive source of information on the experimental genetics of the laboratory mouse, GeneCards includes information on mouse markers, mammalian homologies, probes, and clones. GeneCards presents links to mammalian homology pages, mouse gene names and locations, and links to mouse gene entries.
* From the catalog of human genes and genetic disorders called Online Mendelian Inheritance in Man, GeneCards presents a directory of diseases listed as allelic variants in the respective entry for the gene, the locus of the gene, and a link to the OMIM entry.
* From SWISS-PROT, the database of protein information, GeneCards extracts data on protein names, cellular functions, similarities, involvement in diseases, and links to other databases related to specific proteins.
* From UniGene, an experimental system for automatically partitioning GenBank sequences into nonredundant sets of gene-oriented clusters, GeneCards extracts cluster numbers, gene symbols, and GenBank accession and library ID numbers.
* From the Human Gene Mutation Database, a source of information about disease-causing mutations in genes, GeneCards presents links to specific entries.
* From Genatlas, a catalog of genes, markers, and phenotypes with many links to major data sources, GeneCards offers short descriptions of genes that often contain concise information about cellular functions or role in diseases.
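The extraction process listed above amounts to collecting per-source records and grouping them into one card per gene. The Python below is a rough sketch of that grouping step under stated assumptions: the build_cards function, the (source, symbol, fields) record layout, and the placeholder symbols and values are hypothetical, and the choice of the gene symbol as the merge key is an assumption rather than a documented detail of GeneCards.

```python
def build_cards(records):
    """Group extracted records into one card per gene symbol.

    `records` is an iterable of (source, symbol, fields) tuples, where
    `fields` is a dict of values extracted from that source. The result maps
    each symbol to a dict of {source: merged fields}.
    """
    cards = {}
    for source, symbol, fields in records:
        card = cards.setdefault(symbol, {})
        # Merge repeated records from the same source into one section.
        card.setdefault(source, {}).update(fields)
    return cards


if __name__ == "__main__":
    # Placeholder data only; the symbols and values are not real entries.
    sample = [
        ("OMIM", "GENE1", {"locus": "placeholder-locus"}),
        ("SWISS-PROT", "GENE1", {"protein_name": "placeholder-protein"}),
        ("UniGene", "GENE2", {"cluster": "placeholder-cluster"}),
    ]
    for symbol, card in build_cards(sample).items():
        print(symbol, sorted(card))
```

Keeping each source in its own section of the card preserves where a value came from, which matters when the card links back to the original database entry.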
GeneCards also presents links and titles of articles from the Doctor's Guide to the Internet, a web service that provides news about biomedical research and its applications.
The new GeneCards also features: 7,853 genes, 7,682 of which have Human Genome Project-approved symbols; improved GeneCards correlation with UniGene clusters; new links to the Atlas of Genetics and Cytogenetics in Oncology and Haematology; changes in format; and a new GeneCards mirror in the Netherlands, in addition to those in the US and Turkey.
"A crucial aspect of the GeneCards strategy is to make use of standard nomenclature, especially the gene symbols approved by the Human Genome Project's Genome Database nomenclature committee," Prilusky told BioInform. "We want to [promote] widespread use of such a standard nomenclature by incorporating only those data that are connected to approved gene symbols."
Room for improvement
Prilusky said there is still plenty of room for improvement in the management of scientific information. "We are continuously trying to improve the performance of the user guidance system, using feedback from scientists and our own statistics of unsuccessful searches to find computational tools that will take the researcher as fast as possible to the requested information, wherever it may be located."
The Weizmann Institute plans to provide easy access to the literature related to subsets of data in GeneCards by developing intelligent tools for extracting knowledge from free text. Prilusky said, "We will try to integrate more data into the system, depending on the suggestions we get, and on the appearance of new resources."