NCGR's XGI Software Supports Web-Based, Automated Sequence Analysis and Annotation


“Very often people in the mouse community or human genetics community consider sequence annotation to be a solved problem,” said Callum Bell of the National Center for Genome Resources. “But we’ve found that there are many, many scientific communities who are underfunded in the area of bioinformatics and they’re not able to do it for themselves.”

In response, Bell and his team of five NCGR colleagues developed an automated system for storing, analyzing, and visualizing DNA sequences. The system is housed behind a firewall at NCGR in Santa Fe, NM, and users access it remotely.

The software, called XGI for the X Genome Initiative, sprang from two prototypes: the Phytophthora Genome Initiative, first released in 1998, and the Medicago Genome Initiative, which was released in 2000. The reengineered version allows users to analyze and compare data from a variety of species and also offers a higher degree of flexibility and portability. An additional feature of XGI is automated annotation of sequence data through the use of Gene Ontology terms.

“There are things that many of us take for granted as being relatively easy in the bioinformatics world, but for users producing sequence, quite often they lack the computer experience to make that a smooth process,” said Bell. XGI was designed to automate as much of the process as possible. The system comprises an automated sequence analysis pipeline, a supporting relational database schema running on Sybase, and a web-based user interface. The sequence analysis step clusters ESTs into non-redundant sets, deriving consensus sequences in order to reduce the complexity of subsequent analysis, Bell said.
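As a rough illustration of what clustering ESTs into non-redundant sets involves, the Python sketch below groups sequences that share short subsequences (k-mers) and then picks one sequence per group. It is not XGI's actual pipeline, which the article does not describe in detail. The function names, the k-mer threshold, and the choice of the longest member as a stand-in for a true consensus sequence are all assumptions made for illustration.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests, k=8, min_shared=3):
    """Group ESTs that share at least min_shared k-mers (single linkage).

    ests maps a sequence name to its sequence string. The thresholds are
    hypothetical defaults, not values taken from XGI.
    """
    names = list(ests)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    profiles = {n: kmers(ests[n], k) for n in names}
    for a, b in combinations(names, 2):
        if len(profiles[a] & profiles[b]) >= min_shared:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


def representative(ests, cluster):
    """Pick the longest member as a simple stand-in for a consensus sequence."""
    return max(cluster, key=lambda n: len(ests[n]))
```

Downstream analysis would then run once per cluster representative rather than once per raw EST, which is the reduction in complexity Bell describes. A production pipeline would typically use alignment-based assembly to build a real consensus rather than choosing one member.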

Users submit raw sequence data through the internet and are then able to access and search their processed data once it passes through the pipeline and is stored in the relational database. The data stream between the user and NCGR is encrypted for security.

Bell said the turnaround time for data processing depends on the volume of sequence, but NCGR provides a database mirroring system so that users can view existing data without interruption while new data is being processed.

While the goal of the XGI project was to create a remote bioinformatics system so that users require only a web browser, Bell said it could also be installed locally to enable complete control over the system. NCGR is currently working on its first installed XGI system for Plant Research International, a non-profit research group based in Wageningen, the Netherlands. Plant Research International is collaborating with NCGR on development of the system and the two groups will share distribution of any new features.

The first user of the web-based system is another NCGR collaborator, the Samuel Roberts Noble Foundation of Ardmore, Okla. NCGR is also developing the system with the Syngenta Phytophthora consortium and with a second consortium of scientists interested in Phytophthora.

The Noble Foundation, the Novartis Foundation, and the USDA provided the funding to develop XGI, but Bell said NCGR is currently working out the best way to support its processing, maintenance, and customization costs. The group is trying to arrive at a coherent set of costs for non-profit and commercial customers, as well as a pricing system for local installation.

Last week, the NCGR also announced the full users’ release of its ISYS software to integrate bioinformatics software and databases. Free evaluation copies of ISYS can be downloaded from

XGI will soon be publicly available at

— BT
