Omicia, a bioinformatics startup based in Emeryville, Calif., has taken a page from the height of the Human Genome Project in an effort to create an informatics platform for the post-genomic era.
The company recently hosted a so-called “annotation jamboree,” a term coined in 2000 by participants in a two-week public/private workshop to annotate the then newly sequenced Drosophila genome.
In Omicia’s case, 14 of the company’s collaborators and scientific advisors from around the globe gathered at the company’s headquarters from Sept. 25 to Oct. 4 to help map around 2,000 disease genes to the now-complete human genome.
The collaborative model behind the project isn’t surprising, given that Omicia founder Martin Reese was a participant in the original Drosophila jamboree, which was organized by Celera Genomics and the Berkeley Drosophila Genome Project.
The Drosophila jamboree was “unbelievably successful,” Reese told BioInform this week, “because there was this exchange of people who usually don’t talk to each other. You had a Drosophila taste receptor gene expert and the computer geek from the [European Bioinformatics Institute] sitting next to each other.”
One outcome of the Drosophila annotation jamboree that Reese hoped to replicate the second time around was an approach to informatics software development that grew out of biologists and bioinformaticists working side by side to solve a single problem.
“What was very cool was that all the computer people developed their tools in terms of how can they be useful to biologists and clinicians,” he said.
Omicia works with a number of consultants and advisors spread around the world, “so I said, ‘Hey, instead of working with every consultant individually, we can do a similar thing. We can all be face to face and work together for two weeks, bang our heads, and try to solve problems on the fly,’” Reese said.
The primary goal of the Omicia jamboree was to identify a key set of disease genes and to relate disease genotypes to phenotypes. The key task was mapping genetic mutations in the Online Mendelian Inheritance in Man database to the human genome.
While OMIM is the go-to database for disease genes, Reese noted that it doesn’t exactly square with the current version of the human genome. “These are variations that were annotated over the last 20 years in the literature, and people annotated them on the protein sequences that they had, [and] they were just the ones at the time that people thought were the protein sequences.
“Now, 20 years later, the human genome has come out, so there is a gap between these OMIM alleles and the human genome,” he said.
David Kulp, assistant professor of computer science at the University of Massachusetts, Amherst, and a participant in the Omicia jamboree, told BioInform this week that reconciling OMIM with the human genome is “a very laborious and difficult task that can’t be automated because OMIM is a text-based resource and the genome is essentially a digital one. It requires sort of a semi-automated process for doing that mapping and a lot of manual intervention to work through difficult entries.”
For example, Kulp noted, if an OMIM entry from decades ago identifies a SNP in a methionine in exon 2 of a particular gene, “exactly which exon are they talking about in the finished genome?”
Mark Yandell, associate professor in the department of human genetics at the University of Utah, Salt Lake City, and a member of Omicia’s scientific and technical advisory board, agreed. “What’s happened is that over time, the gene model has changed as our knowledge has improved regarding the structure of the gene, so a lot of this early data is sort of lost, if you will.”
The goal, he said, is to “effectively move that older data out of OMIM and systematically reattach it to the genes as they are currently annotated.”
Participants at the Omicia jamboree used a combination of computational methods and manual curation to map OMIM genes to the genome and to assess the reliability of each mapping. One tool used in the effort was the company’s prototype computational platform, the Disease Marker Genome Annotation System, which acts as a support system for curation as well as an automated mapping system for linking genes to disease phenotypes.
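As a rough illustration of what a semi-automated mapping step of this kind might look like, the sketch below checks whether a protein-level variant reported in the literature still lines up with a current protein sequence, and routes anything ambiguous to a human curator. It is not Omicia's system: the variant format (three-letter codes such as TYR5CYS), the function names, the status labels, and the toy sequence are all assumptions made for the example.

```python
import re

# Standard three-letter to one-letter amino acid codes.
AA3_TO_1 = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def parse_variant(text):
    """Parse a string such as 'TYR5CYS' into (ref, position, alt), or None."""
    m = re.fullmatch(r"([A-Z]{3})(\d+)([A-Z]{3})", text.strip().upper())
    if m is None or m.group(1) not in AA3_TO_1 or m.group(3) not in AA3_TO_1:
        return None
    return AA3_TO_1[m.group(1)], int(m.group(2)), AA3_TO_1[m.group(3)]


def classify_mapping(variant_text, protein, max_shift=30):
    """Return (status, position) for a literature variant against `protein`.

    Hypothetical statuses used in this sketch:
      auto_mapped        - reference residue found at the reported position
      needs_confirmation - exactly one nearby position carries that residue
      manual_review      - unparseable text, no match, or several candidates
    """
    parsed = parse_variant(variant_text)
    if parsed is None:
        return ("manual_review", None)
    ref, pos, _alt = parsed

    # Case 1: the old numbering still agrees with the current sequence.
    if 1 <= pos <= len(protein) and protein[pos - 1] == ref:
        return ("auto_mapped", pos)

    # Case 2: the numbering may have shifted (for example, a revised start
    # codon), so look for the reference residue within a small window.
    candidates = [
        pos + shift
        for shift in range(-max_shift, max_shift + 1)
        if shift != 0
        and 1 <= pos + shift <= len(protein)
        and protein[pos + shift - 1] == ref
    ]
    if len(candidates) == 1:
        return ("needs_confirmation", candidates[0])

    # Case 3: nothing, or too many possibilities; a curator has to decide.
    return ("manual_review", None)


# Toy sequence, invented for the example.
protein = "MKTAYIAKQR"
print(classify_mapping("TYR5CYS", protein))               # exact match
print(classify_mapping("TYR3CYS", protein, max_shift=3))  # shifted numbering
print(classify_mapping("exon 2 methionine", protein))     # free text
```

Free-text descriptions like the exon 2 example Kulp cites fall straight through to manual review in this sketch, which is consistent with his point that the mapping cannot be fully automated.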
Other tools were a bit more old-fashioned. George Miklos, a disease gene expert who is currently director of his own consulting firm, flew in from Australia with three 60-pound boxes of journal articles related to disease genes of specific interest to Omicia. “When he was here for those two weeks, I think he read through 500 papers,” Reese said.
Another project during the two-week gathering was an effort to develop a disease ontology in collaboration with researchers from the National Center for Biomedical Ontology and the Sequence Ontology Project. Yandell said that the company is using the MeSH disease classification system, the Disease Ontology based on ICD-9 and SNOMED, and an internally developed ontology.
Building a Foundation for Personalized Medicine
Reese, who was a co-founder of Neomorphic in 1996 and left the firm after Affymetrix acquired it for $70 million in 2000, founded Omicia in 2002 with the goal of “trying to get ready for personalized medicine.”
Reese believes that the time is right for building an informatics foundation for personalized medicine. “Seven years ago the hype was huge that the genome would solve all human diseases, and I’m really excited about the technologies that are coming out now and I think that now is really the starting point for that to all make sense,” he said.
Previously, Reese noted, experiments linking genes with disease “were hard to reproduce, and a lot of times the reason was that it was just too expensive and we didn’t have the technology, and it was just economically not doable.” Now, he noted, new methods like next-generation sequencing, genotyping, and biomarker discovery platforms should help drive adoption of personalized medicine.
“We’re basically saying, ‘These technologies are getting ready, so let’s try to be ready in terms of content, in terms of what we can do,’” Reese said. “So what we’re focused on at this company is bringing genomic discovery ultimately all the way into diagnostics and the clinical world, and what is required to do this, and we’re coming from an IT and information point of view.”
Omicia was recently awarded a Phase II Small Business Innovation Research grant worth $718,557 from the National Human Genome Research Institute to continue developing its software and database. That builds on a $100,000 Phase I SBIR awarded in 2003 to develop the prototype Disease Marker Genome Annotation System.
The company has been awarded several other NIH grants over the past few years, including a $100,000 SBIR in 2004 to develop a prototype system for genetic marker information delivery, a $100,000 SBIR in 2005 to develop a system for predicting novel genetic disease associations, and a $182,732 award this year titled “Comparative systematic genetics for cardiovascular gene identification.”
Reese described the jamboree as a way to “kick-start” the next phase of Omicia’s development. While he declined to provide details of the company’s business model, he noted that its growing knowledgebase of correlated genotypes and phenotypes is “basically our internal powerful engine to drive personalized medicine.”
Reese added that the database is primarily a “research tool” for the company for the time being while it finalizes its approach to the marketplace. “It allows us very nicely to look through and identify the real market opportunities and the real commercial opportunities for personalized medicine,” he said. “This is really our core knowledge base.”
Yandell said that one goal of the company is to “assemble as much data as possible towards producing a knowledgebase that will allow us to identify novel polymorphisms that may be indicative of disease in known genes and also identify new genes that may play a role in some well-known disease.”
Andreas Braun, president of diagnostic R&D firm Dx Innovations and a member of Omicia’s medical advisory board, noted in an e-mail to BioInform that he expects Omicia’s database to become “a prime resource for patients and in particular for physicians to interpret the increasingly complex information about genetics/genomics involved in common diseases.”
Omicia’s knowledgebase and software should be “usable” internally by the end of the year or early next year, Reese said. “Are we going to sell it as a software product early next year? I doubt that. But can it be used successfully by our collaborators? Definitely.”