One Codex Launches First Version of Metagenomics Classification Platform


NEW YORK (GenomeWeb) – This week, One Codex released the first version of its bioinformatics platform for identifying and characterizing microbial genomes in the context of food safety, biosecurity, clinical markets, and other areas.

Nick Greenfield, company co-founder and CEO, told GenomeWeb that commercial and academic customers will be able to use the company's platform to analyze up to 100 private samples per year at no cost. In addition, academic users will have unlimited access to the system for analyzing and organizing any public samples, while One Codex foots the bill for storage space and compute on Amazon with funds from private revenue sources and investments.

Beyond that 100-private-sample limit, the company will charge a variable price per sample for further analysis using its system. The exact costs will depend on factors such as whether clients need additional data archiving capabilities or particular guarantees around personal health information, for instance in the case of clinical samples, Greenfield said. In addition, the company will provide customized services, for a variable fee, for clients running bespoke projects or assays, he said.

The launch follows a public beta that began last summer to put the company's metagenomics classification system through its paces. The system uses a kmer-based approach to search input FASTA or FASTQ files against two reference databases that together contain just under 40,000 bacterial, viral, protist, archaeal, and fungal genomes. The platform features a user-friendly interface that provides access to tools for organizing, comparing, and querying sequence samples and associated metadata. Following analysis, the system generates reports that summarize the organisms present in the given sample, including read-level information, and offer details about matches to individual kmers within reads. The company also offers an application programming interface and command-line access for large-volume data downloads and uploads.
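To illustrate the general idea behind kmer-based classification, here is a minimal Python sketch. It is not One Codex's implementation, which the company has not described in this level of detail. The function names (`parse_fasta`, `build_index`, `classify_read`), the toy sequences, and the kmer length of 5 are all assumptions made for illustration; production classifiers use much longer kmers and far larger, compressed indexes.

```python
from collections import Counter, defaultdict

K = 5  # toy kmer length chosen for this example only


def parse_fasta(text):
    """Parse FASTA-formatted text into {record_name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def kmers(seq, k=K):
    """Return every overlapping substring of length k in seq."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k=K):
    """Map each kmer to the set of reference names that contain it."""
    index = defaultdict(set)
    for name, genome in references.items():
        for km in kmers(genome, k):
            index[km].add(name)
    return index


def classify_read(read, index, k=K):
    """Assign a read to the reference with the most distinguishing kmers.

    Only kmers found in exactly one reference cast a vote. Returns
    (best_reference_or_None, per-reference vote counts).
    """
    votes = Counter()
    for km in kmers(read, k):
        hits = index.get(km, ())
        if len(hits) == 1:
            votes[next(iter(hits))] += 1
    if not votes:
        return None, votes
    best, _ = votes.most_common(1)[0]
    return best, votes


# Hypothetical two-genome reference set and a read drawn from org_a.
demo_refs = parse_fasta(
    ">org_a\nACGTACGTTAGCCGATAGGCTA\n>org_b\nTTGACCGGTATTCCAGGACTTA\n"
)
demo_index = build_index(demo_refs)
label, counts = classify_read("TAGCCGATAGG", demo_index)
print(label, dict(counts))
```

The per-reference vote counts returned alongside the label loosely mirror the kind of read-level and kmer-level detail the reports are said to provide, although the real report format is not specified in this article.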

Earlier this year, the company, which was incorporated under the name Reference Genomics, won a $200,000 award from the US Centers for Disease Control and Prevention for an application built on its platform for strain-typing Shiga toxin-producing Escherichia coli from complex clinical samples without using traditional culture-based methods. Last month, it published a preprint that benchmarks One Codex against a panel of similar algorithms, including multiple iterations of Kraken, a popular metagenomics classification tool that was developed by researchers from the University of Maryland and Johns Hopkins University, and two other classification approaches. For the comparison, the researchers used a dataset of 50 million reads gleaned from 10,639 genomes as well as sequences from six organisms that are not usually included in the reference datasets used by metagenomics classifiers. According to results provided in the paper, the One Codex platform had the highest sensitivity and specificity of the methods tested for detecting sequences from both the well-characterized species and the six additional species.

There wasn't a formal end date for the beta, but One Codex is launching its platform now, Greenfield said, because it believes that the system is stable and mature enough for the market. Over the course of the beta, the company ran more than 10,000 samples through its platform and performed tens of thousands of analyses, primarily in the context of public health epidemiology and food safety research, with some early exploratory projects also coming from users on the clinical side, he told GenomeWeb. As the platform goes to market, Greenfield expects that the customer makeup will largely mirror the beta, meaning that early adopters will likely come from the public health and food safety markets, with clinical clients coming in later.

Besides testing, One Codex progressively added several new capabilities to its system in response to beta users' requests. The list includes features intended to draw in more clinical users, such as better sample management and comparison functionalities and HIPAA-level security measures, including mechanisms for encrypting data in transit and at rest as well as data access controls, Greenfield told GenomeWeb. Other additions include new in silico gene and marker panels that let users get more fine-grained information about their samples, such as details on virulence factors and antimicrobial resistance. For example, if a user finds Escherichia coli in a sample, they might run a marker panel for Shiga toxin genes to see if the sample contains Shiga toxin-producing E. coli, he said.

The company has also expanded its databases from about 28,000 genomes earlier this year to just under 40,000 genomes across two reference databases, and the collection is still growing. According to results provided in the company's preprint, access to this wider pool of reference data is the primary reason for its system's improved sensitivity and specificity compared to other programs tested in the study. When users upload datasets to the platform, the datasets are first compared to some 8,000 RefSeq genomes and then to One Codex's expanded database, which offers access to roughly 29,000 additional genomes from other public sources, including some draft assemblies that aren't currently available in GenBank, as well as from some private sources, Greenfield told GenomeWeb. Datasets are checked for contaminants and cleaned using both manual and automated protocols before they are included in the One Codex database.

Moving forward, the company will continue to develop tools that help customers in its target markets design better assays and tests that quickly and reliably characterize microbial samples, Greenfield said. The company is also working on tools for combining and connecting datasets within the One Codex system to explore things like outbreak linkages and antibiotic resistance patterns, he added.

Beta users that GenomeWeb spoke to said that One Codex's platform offers many of the same capabilities available in existing tools like Kraken. However, it features a much sleeker, more user-friendly interface than command line-based tools like Kraken, which should make it more attractive, particularly for less bioinformatics-savvy users.

One tester, Bede Constantinides, a researcher at the University of Manchester, develops software for characterizing viral variation within species as well as for assembling viral genomes found in metagenomics samples. He told GenomeWeb that simplicity and ease of use are major benefits of the One Codex platform compared to some other programs. It's also very fast, he noted, returning results within minutes compared to some competing options, which could take weeks to return results. Furthermore, he added, One Codex's rapid response to customer questions and needs is a plus point for the company. It's not as sensitive as some traditional alignment-based approaches but is much faster, he said. "It will definitely be my go-to for quickly determining if I have contamination in a sample [for example]."

Another tester, Phelim Bradley, a doctoral student in genomic medicine and statistics at Oxford University, develops methods for analyzing bacterial genes and mutations to try to predict antibiotic resistance in species such as Staphylococcus aureus and Mycobacterium tuberculosis. For the beta, he used One Codex to analyze sequences from bacterial cultures to help clarify which species were actually present in the samples in question. Besides ease of use, the platform includes tools for easily exploring and visualizing results, which aren’t provided with programs like Kraken, he told GenomeWeb. Another advantage of the One Codex platform is its large curated reference database, which contains a much broader pool of data than is available to academic tools. Furthermore, the API makes it easy for users to build their own tools on top of the One Codex platform, he said.

The One Codex platform could compete with products from companies such as CosmosID, whose metagenomics platform also couples a kmer-based approach with curated databases of bacteria, viruses, molds, and fungi. CosmosID offers services and an appliance option, and it recently added a cloud option to the mix.
