HUPO's Proteomics Standards Initiative to Merge mzData, mzXML Formats to Create New dataXML Format

SEATTLE, June 1 (GenomeWeb News) - The Human Proteome Organization's Proteomics Standards Initiative announced this week that it will combine the current HUPO-PSI format, mzData, with the mzXML format developed by the Institute for Systems Biology.
 
The new, combined format will be called "dataXML." PSI officials said they expect the dataXML project to be mostly completed by the end of the year. They made their announcement at the American Society for Mass Spectrometry conference, held here this week.
 
"This is a major undertaking for the proteomics informatics community and represents widespread agreement on the need to improve data interchange," PSI officials said.
 
The new format will incorporate features from both mzData and mzXML, including an interchange schema that has split data vectors compatible with other analytical interchange formats. It will also support both random access indexes and digital signatures via a wrapper schema.
 
The new format will also include tools to support developers and users, including a canonicalization program to format legal XML documents before binary indexes or signatures are computed; a validation program to ensure that the use of controlled vocabulary terms matches MIAPE requirements; an "Application Programming Interface" including language bindings for popular programming languages; and abstract data models and other documentation to help software developers who want to implement systems based on the interchange format.
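To illustrate why canonicalization would come before signing, the following minimal Python sketch hashes two XML fragments that carry the same information but differ in attribute order, quoting, and empty-element style. It is not the PSI tool: the element and attribute names are hypothetical, and the use of W3C Canonical XML 2.0 (via Python's standard library) and SHA-256 are assumptions made only for this example.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # C14N 2.0, Python 3.8+


def raw_digest(xml_text: str) -> str:
    """Hash the document bytes exactly as written."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def canonical_digest(xml_text: str) -> str:
    """Canonicalize first, then hash the canonical form."""
    canonical = canonicalize(xml_data=xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical fragments: same content, different serialization.
doc_a = '<spectrum id="1" msLevel="2"><peaks/></spectrum>'
doc_b = "<spectrum msLevel='2' id='1'><peaks></peaks></spectrum>"

# Raw hashes differ even though the documents are equivalent.
print(raw_digest(doc_a) == raw_digest(doc_b))

# Canonical hashes agree, so a signature over them stays stable.
print(canonical_digest(doc_a) == canonical_digest(doc_b))
```

Under these assumptions, a signature or index computed over the canonical form would not break when a file is rewritten by a different XML serializer.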
 
PSI officials said they expect to complete a data model and ontology models in August, while documentation, a draft schema specification, and language bindings will be done in September. In December, they expect to complete binary indexing and signature programs, a validation program, and reference implementations of converters.
