As genome sequencing continues to make inroads into clinics, established vendors and small startups are honing their informatics portfolios to tap into the growing market for software in this setting.
Some of these companies, such as Knome, Omicia, Cypher Genomics, and Personalis, are providing tools and services to aid researchers with functional interpretation of genomic data. Others, like GenomeQuest, are venturing more aggressively into the clinical diagnostics arena where unclear regulatory requirements make for murky waters.
Speaking with BioInform this week, Richard Resnick, GenomeQuest's CEO, noted that regulatory agencies "still haven’t figured out what to do with software companies" like his own, which can’t be classified as labs and made subject to CLIA regulations.
For its part, GQ is "attempting to adhere to what we think the maximal suite of regulations will be" with HIPAA and 21 CFR as the "basic blocks to build on." The company intends to work with the appropriate regulatory bodies to ensure that its software remains compliant, Resnick said.
"Our road map has us there in the earlier part of 2012," he said. "Beyond that we are going to continue to evolve and, if we can, to some degree lead the thinking on ... regulatory and reimbursement requirements."
The company last year set its sights on penetrating the molecular diagnostics market and launched a consortium called GenomeQuest Personalized Medicine Research, or GQ-PMR, to improve its whole-genome analysis tools for personalized medicine as well as to develop new genome annotations (BI 9/17/2010).
GenomeQuest markets GQ-Dx, a clinical decision-support system for molecular diagnostics. It allows users to interpret data from whole-genome, whole-exome, and targeted sequencing studies for use in clinical applications.
Current customers of the platform include the University of Iowa's molecular otolaryngology and renal research lab, which is using the system to develop a sequencing-based molecular diagnostic test, dubbed OtoSCOPE, for individuals with hearing loss (see sister publication, Clinical Sequencing News, 6/21/2011).
In addition to addressing regulatory requirements, GQ-Dx was developed with an eye toward providing accurate results, incorporating relevant content, and scaling to meet research requirements, Resnick explained, adding that these factors set it apart from others in the market.
The tool provides "low level assembly code support" for different sequencing platforms; adopts a local alignment approach to finding genome-wide variations that helps researchers find large insertions and deletions in their genomic data; and includes a validation component that lets users check their results against Sanger sequencing data, he said.
Additionally, GQ-Dx pulls in information from databases such as dbSNP, COSMIC, and HGMD as well as labs' internal knowledge, which is used to "enrich" the variants, Resnick said.
The fact that GenomeQuest has a product in the market, "which is being used to run real diagnostics," should keep it a step ahead of other groups that have begun eyeing the molecular diagnostics space, Resnick said.
In November, the company launched an award program to provide bioinformatics support to clinical labs that are looking to transition to next-generation sequencing. The program will award $20,000 worth of software, annotation data, and compute infrastructure required to process and store patient sequence data, produce tailored diagnostic reports, and perform follow-up research on aggregated results (BI 11/4/2011).
The announcement of the results, which the company initially planned for Dec. 15, has been pushed back to January 2012 to allow the company time to complete its review process, Resnick said.
KnomeBase
In November, Cambridge, Mass.-based Knome spun out a new informatics software-as-a-service product called KnomeBase that offers tools and services to annotate, compare, and distill raw sequence data in large whole-genome studies. The company also launched a genome discovery kit, KnomeGDK — a suite of tools, scripts, and libraries for querying and visualizing gene interaction networks across multiple genomes.
KnomeBase, which can be deployed on Amazon's cloud or locally, includes tools that were initially offered through the company’s KnomeDiscovery service, which provides whole-genome sequencing, curation, informatics, and interpretation for $4,998.
With a $3,750 price tag, KnomeBase is tailored for clients that require only genome interpretation services. Its toolkit includes KnomeVariants, which is a query tool for finding candidate causal variants, and KnomePathways, a visualization tool that overlays genomic variants onto gene interaction and co-expression networks to help users identify functional interactions between variants.
Jorge Conde, Knome's co-founder, told BioInform that KnomeBase was released in response to a perceived market need.
"We found increasingly that ... there were two types of researchers: Those that needed a lot of help getting down to candidate variants and what they really knew how to do was focus on validation of candidate variants; and then there were researchers that had very deep domain expertise in the mechanisms and molecular biology of their disease they were studying and ... wanted to do the candidate short-listing work themselves," he explained, adding that KnomeBase would be most useful for the latter group.
Jonas Lee, Knome's chief marketing officer, added that by providing informatics tools for researchers who can handle the interpretation piece on their own, the company is able to save time and resources, thus allowing it to offer KnomeBase at a lower price than the company's full-service offering.
The tool is primarily used by geneticists and pharmaceutical researchers, but Knome is working on moving the tool into clinics, and has already worked with a few clients in the space, Conde said.
Currently, the company is tailoring KnomeBase's tools to meet the specific requirements of a number of clinical partners, Conde said, although he could not provide specific names.
"The biggest challenge in medical applications is implementing the medical guidelines for each of the diseases and the decision making that has to go from the findings that we actually come up with," he explained.
KnomeBase's underlying interpretation engine lets users "consume" and structure genomes in such a way that "we have all the information available as to what is known [about] relationships within variants down in that genome and the things that are clinically relevant or have been found in the scientific literature to be associated with disease," Conde explained.
Once the data is structured, researchers can query it, as well as "curate and edit what is important to them in terms of interpretation of the genome," he said.
Finally, users can generate reports that indicate whether any clinically actionable variants have been identified. These variants are provided in a format that physicians can use in their decision-making process, he said.
KnomeBase offers a number of capabilities that the company believes give it a "significant advantage" over competitors such as GenomeQuest, Conde said.
"We have been in the genome interpretation game for several years and so we have a very good sense of how hard it is to structure the data, how this data can be queried, and some of the associated complexities and pitfalls," he said.
He explained that KnomeBase's structuring capabilities let users compare genomes without regard to the sequencing platform used and at scale since the tool is available on the cloud.
This makes it possible for "a researcher doing the 1000 Genomes study, for example, [to] take our technology and compare subgroups of those 1,000 genomes to basically test hypothesis," he said.
Furthermore, "we haven’t seen anyone else out there who has taken a very careful and comprehensive view on integrating and reconciling all of the information that’s out there," Conde said.
"If you are going to be in the genome interpretation business, it's very important to be able to offer a user, in very clear and comprehensive context, what's currently known about the genome," he said. "That goes for everything from being able to define where the genes are to being able to say which variants have been associated in the clinical literature with the specific phenotype to being able to say which genes fall within a particular pathway or a particular set of gene families."
A final point, Conde said, is that Knome's tools were built for non-expert users. He said this sets the company's software apart from GenomeQuest, which offers "great workbenches and workflows" that are suitable for bioinformaticians, but not end users.
Cypher Genomics
A newcomer to the molecular diagnostics space and the software market in general is La Jolla, Calif.-based Cypher Genomics.
The five-person company, which spun out of the Scripps Translational Science Institute, Scripps Health, and the Scripps Research Institute, was tapped to clinically annotate 1,000 human genomes sequenced as part of the Wellderly study — a collaborative project between Complete Genomics and the Scripps Health System (BI 10/7/2011).
Nicholas Schork, director of STSI's bioinformatics and biostatistics division, told BioInform that the company was launched to commercialize an annotation pipeline that includes tools its founders developed to predict the impacts of variations in the genome.
Cypher's tools are used to interpret and analyze genomic data for disease risk assessment and forensic purposes. The pipeline includes computational tools to interrogate the molecular effects of sequence variation and to consolidate disease risk and phenotype expression information for individual genomic health profile reporting.
These tools can be used to predict whether a variant located in a transcription factor binding site would have an impact on its function, for example, Schork said. Additionally, users can explore how variants in specific genes might perturb the functions of the pathways they participate in. This information is then used to predict which mutations in the genome might be pathogenic, he said.
"Our business model is basically to sell annotations," he said, adding that the company plans to do this by forming partnerships with whole-genome sequencing service providers that work with clinical and research groups, where Cypher would provide variant annotation as part of the service for customers.
In line with this plan, he said the firm is currently holding discussions with a number of sequencing service providers as well as some "clinical entities" about their need for annotation services, although he declined to state who they are.
Additionally, the company is open to working directly with researchers who have the sequence data and are looking for help with the annotation, he said.
"Our tools go a little bit beyond what groups like 23andMe and Navigenics" provide, he said.
He explained that while these companies provide customers with reports containing variants that have been associated with disease susceptibility, "we thought if clinicians and researchers want to take advantage of whole-genome sequencing, they want to go beyond the few variants that have been documented ... and work with information about the likely effect of variants that haven’t been investigated in great deal."
Cypher focuses on "taking annotations both those in the public domain as well as the use of public domain tools and proprietary tools that have been developed to predict the likely functional effect of variants of all kinds in the genome," he said.
Furthermore, unlike companies such as Knome, Cypher doesn't rely solely on public domain resources and literature to annotate variants, Schork said.
"Knome is the one-stop shop ... where they identify a sequence provider, help with more upstream bioinformatics, and provide reports about associations between variants and whatever disease the individual customer may have sequenced," he said. Cypher, on the other hand, is focused on "annotation and making sense of the variations that are identified from the sequencing and not so much on the mechanics behind the sequencing or the mechanics behind calling the variants from the sequencing."
Additionally, "we take all the variants, even the ones that are not documented in the literature, and try to annotate those so that either the research or clinical community can take advantage of that knowledge in whatever way they want," he said.
The company expects that its annotations will be of value to the research community, and, when appropriate regulation to govern whole-genome sequencing has been put in place, in clinical settings as well.
"Our vision is that it will be used in clinical settings, with some refinements, for the identification of pathogenic mutations in idiopathic diseases as well as cancer," Schork said.
For example, researchers studying rare conditions that are assumed to have a genetic origin that may not be recorded in the literature could use the pipeline to discover which mutations are involved in the disease, he said.
Moving forward, "we need to really get our finger ... on the types of reporting that pathologists and clinicians might want," he said. "Just giving them a rank list of five to six million variants that exist in an individual's genome probably won't cut it."
Additionally, the firm intends to "build out the tools to allow use in more complex diseases and drug response," he said.
Other Players
Another company that hopes to target the whole-genome sequencing market is Omicia, which, like Cypher, offers annotation tools for genomic data.
In 2010, the company was selected as a partner in the Cancer Genomic Care Alliance, a cancer genome sequencing project spearheaded by Life Technologies, where it was expected to process, annotate, and interpret raw cancer genome data and convert it into a format that physicians can use (BI 6/25/2010).
This year, the company announced that it had worked with colleagues at the University of Utah to develop a software package, dubbed the Variant Annotation, Analysis, and Selection Tool, or VAAST, that's used for functional interpretation of whole-genome sequence data. The company planned to implement VAAST in its Genome Analysis System — a platform to clinically analyze human genomes (BI 7/1/2011).
The company had planned to launch the tool in the third quarter of this year, but BioInform learned this week that it now intends to do so in the early part of next year.
Another new entry to the market in 2011 is Personalis, founded by a team of researchers from Stanford University, including Russ Altman, chair of the bioengineering department; Euan Ashley, director of the Stanford Center for Inherited Cardiovascular Disease; Atul Butte, chief of the division of systems medicine at the department of pediatrics; and Michael Snyder, chair of the genetics department and director of the Stanford Center for Genomics and Personalized Medicine.
John West, the former CEO of Solexa, which was acquired by Illumina in 2007, is the firm's CEO.
The company plans to focus on the medical interpretation of human genomes for research and, eventually, clinical applications.
Personalis grew out of a collaboration between West, Altman, Ashley, Butte, Snyder, and others on the interpretation of West's genome and those of his wife and two children, which led to a publication in PLoS Genetics in September.
The company has licensed intellectual property from Stanford, including patent applications and databases in the area of interpretation of human genomes, according to West, who spoke to BioInform sister publication Clinical Sequencing News earlier this year (CSN 9/28/2011).
Non-Profit Efforts
These companies and others that are likely to crop up over the next few years will have to contend with infrastructure from academic bioinformatics groups and the broader research community.
The Medical College of Wisconsin's bioinformatics team, for example, has built a software system called CarpeNovo for identifying and annotating disease-causing variants in whole-genome sequence data. It is using the software for a number of clinical sequencing projects. Separately, a team at the University of Toronto has developed a similar system, dubbed MedSavant, that's currently used for clinical research into the genetic roots of autism (BI 11/18/2011).
Meanwhile, researchers at Harvard Medical School's Partners HealthCare Center for Personalized Genetic Medicine have developed a system, dubbed GeneInsight, that provides tools to assist genetic testing laboratories with storing and managing genetic variant information and creating interpretative reports. It also provides an electronic data transfer hub that transmits results between testing labs and clinicians (BI 8/26/2011).
A number of public efforts are also underway to develop comprehensive databases of human variation that can be used to improve the interpretation of clinical sequencing projects. Such projects include the Human Variome Project (BI 12/16/2011), ClinVar, and MutaDataBase (BI 11/5/2011).
Knome's Conde noted that most non-profit efforts are focused on establishing a "conceptual" framework for approaching clinical informatics and don’t address issues of "scale, validation or the robustness necessary for something that can really be deployed as part of regular course and practice."
GenomeQuest's Resnick pointed out that open source communities that have developed bioinformatics tools for research settings "simply aren't set up for the clinical industry, where you have got significant issues of security, compliance, of privacy and scale," which lab directors aren't likely to ask their software development team to take on.
Resnick further noted that most labs that are familiar with Sanger sequencing and other forms of testing may find themselves at a loss when they attempt to take on next-generation sequencing.
When these groups make the transition, they want their "experience and relationship to [NGS] technology to be effectively the same as it was to [Sanger] technology," he said.
"That’s really a commercial opportunity and it's rather urgent for these people to solve it," he said.