Eric Neumann joined Foundation Medicine as vice president of knowledge informatics in November, bringing a background in semantic web technologies to the clinical genomics realm.
Neumann joined Foundation after holding a number of informatics management roles at Sanofi-Aventis, Beyond Genomics, Genstruct, and elsewhere. An early evangelist of the use of the semantic web for life science applications, he is also a founder and organizer of the annual Conference on Semantics in Healthcare and Life Sciences, or CSHALS.
At this year's CSHALS meeting in Cambridge, Mass., Neumann discussed the informatics behind Foundation's cancer genomic profiling assay, called FoundationOne. The test uses next-gen sequencing to identify genomic variants linked to targeted therapies — either on the market or in clinical trials — to which a cancer patient is likely to respond. The company delivers this information through an online reporting platform called the FoundationOne Interactive Cancer Explorer.
Underlying this process is a knowledgebase of thousands of cancer genomic alterations that the company can analyze to find combinations of mutations and cancers that respond best to specific drugs. Neumann discussed at the conference how the company is using the semantic web standard RDF to link published results, molecular cancer databases, and clinical trials within this knowledgebase.
BioInform spoke to Neumann at CSHALS about his new role. The following is an edited transcript of the interview.
Are you fully settled in at Foundation?
It’s been a very accelerated and fast three months. I’ve learned a lot. The culture there is very much about sharing information. It’s fascinating what they have already accomplished. I think they have a very good view of how to use this type of information so it’s flexible and can be applied to many very relevant, interesting questions in the medical and clinical applications of genomics.
What are your responsibilities as VP of knowledge informatics?
It is in stages developing a framework of capturing and exchanging knowledge [and] information that has a lot of associated utility. So not just the data [but] how it's relevant in treatments, how it may be associated to existing research, clinical studies, clinical trials, and how that may indicate opportunities for the different sets of therapies that would be appropriate to people who have particular kinds of genetic alterations in their tumor. Based on the information that we present them … doctors … eventually [will be] able to drill down to the knowledge system to make the right determination on how to treat their patients so that they have much better outcomes than is currently possible.
Can you go into some more detail about what you are developing for Foundation Medicine in terms of informatics?
My area is really on the capture of this information. The genomic analysis provides variant information. The full set of information is first of all going to be presented to the clinicians that are treating the patient. But we do a lot more than just hand it back. We are able to look into the literature. The knowledgebase [has] the ability to search and find what we already know and if there are any known interactions and then we can even say these kinds of effects are often associated with targeted therapies of this nature … sometimes it’s on-label and sometimes it’s been clinically tested but it's not on [the US Food and Drug Administration's] labeling system.
Events that lead to cancer are very unique and it really requires this diagnostic platform … [and] looking across many things in great detail and then that tells us maybe there are some different strategies that are possible here. But what will eventually become important is the results of those choices that the doctors make — that will start to tell us [whether] these models and [information] that has been presented to them are showing a difference in outcomes if they’ve tried this type of drug versus another. So using that information and putting that back into the knowledgebase allows us to really start to see not just how things are treated but how things respond. That, we think, is going to provide a lot of new insights, medical and therapeutic with existing drugs and potentially future drugs to be developed. It's going to accelerate that understanding that will have immediate impact for the doctor and the patient and long-term impact in terms of understanding in more depth a variety of different cancers.
Why did you decide to use a semantics-based approach to develop your system?
There are two things. One is the mechanism and paradigm for organizing and storing the information. Typically what people have done is [build a] database … not based on the science but based on the questions you want to ask. That means you need to know ahead of time what questions you want to ask. Then you can make a specialized warehouse for that and use an existing technology and it will grind through and give you those answers.
It gets trickier when you don’t know all the questions. Driving the design of the data system for the analysis on what are the mathematical questions you are going to ask … you can’t really do that with a relational database. It’s brittle. Everything we’ve seen over the years is that it takes a long time to develop a new data system that is going to hold clinical information and biomarkers, and then as they are using it, within a few months they realize, ‘We’ve got different kinds of data and this thing should but it doesn’t quite handle it and it’s not working.’ They either have to fix it or completely rebuild a new one. And the problem of course for many big companies is if they did this for a diabetes project that went to clinical completion, they are already starting to do stuff in another area, say inflammation, [and] those groups don’t usually coordinate. What happens is no one really sees the limitations of the previous [project]; they are always ready to start the next one.
In our space, we are trying to accumulate, over a long period of time, more and more insights. Today we have a system that is good for reporting the variations that we see and particular detailed definitions for the cancers and we have some other information that gets associated with it. But in six months there is a whole new set of things that we haven’t even considered and we’d like to talk about. Then what do we do with the system? [With] semantic web, I can add that in there as long as I can describe the binding relationships between what’s already in there and the new stuff. I put it in and just literally add those extra triples on the existing system and do a little validation and it’s working.
It cuts down the time for really managing sets of information into a matter of a few weeks or even less. It’s the speed of being able to adapt to the questions that I am asking [and] there may be new people who join who may even have cooler questions they want to ask and they can get it out of that system. They can ask the queries and if the things need a different bit of reasoning they can add that on top. It allows us to be adaptive. As we start to see more and more combinations of genes, we need to have more rigorous ways of not just adding a list but [figuring out how we] look at the combinations. The triple store we are using is being designed with the near future intent to do deep analytics ... and much more rigorous mining. That’s going to be an interesting system that we are going to be working on designing with partners.
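[Editor's illustration, not part of the interview: the idea of adding "extra triples on the existing system" can be sketched with a toy in-memory triple store. This is a minimal sketch in plain Python, not Foundation Medicine's or Entagen's software. The `TripleStore` class and every `ex:` identifier (alteration, gene, therapy, and trial names) are hypothetical placeholders, not real data.]

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """A toy in-memory set of subject-predicate-object triples."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self.triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Set[Triple]:
        """Return all triples matching the pattern; None is a wildcard."""
        return {
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        }


store = TripleStore()

# Initial content: what the store describes today (hypothetical names).
store.add("ex:alteration1", "ex:inGene", "ex:geneA")
store.add("ex:alteration1", "ex:observedIn", "ex:tumorTypeX")

# Months later, a new kind of fact arrives. No table redesign is needed:
# new predicates are simply linked to subjects that already exist.
store.add("ex:alteration1", "ex:associatedWithTherapy", "ex:drugY")
store.add("ex:drugY", "ex:studiedIn", "ex:trial123")

# The old question still works, and a new question becomes possible.
genes = {o for (_, _, o) in store.match(s="ex:alteration1", p="ex:inGene")}
therapies = {o for (_, _, o) in
             store.match(s="ex:alteration1", p="ex:associatedWithTherapy")}
print(genes, therapies)
```

[The point of the sketch is only that new relationship types are added as data rather than as schema changes. A production RDF store would add URIs, validation, and a query language such as SPARQL.]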
So you are developing two systems. There is a knowledgebase and also the reporting system. Can you tell me how they interrelate?
They are tied together. The reports are being collected and organized as they are coming through with the samples. That information is being collected in this knowledge framework. The framework also makes it easier to generate new reports, too, so we can see what is already known by our scientific readers. We have people who are going to be reading the literature and finding that this paper and this trial actually speak very well to this exact case. Then the knowledgebase will take what they’ve found, organize and store it, and then when a case comes up … that has this characteristic mutation for this cancer, the system is actually able to pull together the current understanding in the field. Before it is released to the physicians as a report, it’s already going through our doctors internally to validate. We are really harmonizing a knowledgebase that does the grunt work of organizing lots of facts and … give[s] you quick results but then those results go through a human pipeline of people reviewing [the reports] to make sure we say everything that we need to say and we don’t say something that is going to confuse the doctor. They are presented with a very focused [and] easy to interpret report.
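[Editor's illustration, not part of the interview: the flow described above, in which curated literature and trials are matched to a case and then pass through human review before release, might be sketched as follows. This is a minimal sketch under assumed names. `KNOWLEDGE`, `draft_report`, `release`, and all `ex:` identifiers are hypothetical and do not describe the company's actual pipeline.]

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts curated by scientific readers.
KNOWLEDGE: Set[Triple] = {
    ("ex:paper1", "ex:discusses", "ex:alterationA"),
    ("ex:paper1", "ex:concernsCancer", "ex:cancerX"),
    ("ex:trial7", "ex:discusses", "ex:alterationA"),
    ("ex:trial7", "ex:concernsCancer", "ex:cancerX"),
    ("ex:paper2", "ex:discusses", "ex:alterationA"),
    ("ex:paper2", "ex:concernsCancer", "ex:cancerY"),
}


def subjects(predicate: str, obj: str) -> Set[str]:
    """All subjects that have the given predicate pointing at obj."""
    return {s for (s, p, o) in KNOWLEDGE if p == predicate and o == obj}


@dataclass
class DraftReport:
    alteration: str
    cancer: str
    evidence: List[str]
    approved: bool = False  # stays False until a human reviewer signs off


def draft_report(alteration: str, cancer: str) -> DraftReport:
    """Collect evidence that speaks to this alteration in this cancer."""
    evidence = (subjects("ex:discusses", alteration)
                & subjects("ex:concernsCancer", cancer))
    return DraftReport(alteration, cancer, sorted(evidence))


def release(report: DraftReport, reviewer_approved: bool) -> DraftReport:
    """Only release a report that has passed internal review."""
    if not reviewer_approved:
        raise ValueError("report has not passed internal review")
    report.approved = True
    return report


report = draft_report("ex:alterationA", "ex:cancerX")
print(report.evidence)
final = release(report, reviewer_approved=True)
```

[Here the knowledgebase does the mechanical step of intersecting evidence for the alteration and the cancer type, while the `release` gate stands in for the internal medical review described in the answer.]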
[...] The development of the software to a large part is through our partner Entagen. Their product had enough of the framework available that we could then define within it the organization of the information and how people can get at it and find things. I think there are going to be other approaches and it’s clear that people start with standard approaches. When they see the level of complexity and how to show it in the right level to physicians, you are going to see the advantage of a semantic approach. It’s good to be ahead of the curve.
What else is going on informatics-wise at Foundation Medicine?
There are many things going on. Molecular information as a service is critical for the future of NGS diagnostics. We are keeping an eye on the changes that are going to start happening very rapidly.