Anastasia Christianson, senior director and global discipline lead for biomedical informatics for clinical development at AstraZeneca, leads a group of informatics researchers at the firm's R&D facility in Wilmington, Del., and also directs a global team of informaticians who perform computational analysis on early clinical development and patient safety data.
The researchers in these groups mine internal and external information and then integrate their findings into decision-making about compounds in the firm's drug discovery and development pipeline.
Christianson received her PhD in biological chemistry from the University of Pennsylvania and completed postdoctoral training at Harvard University. She joined AstraZeneca, then called Zeneca, in 1994, and has since held several appointments in translational medicine on both the drug discovery and clinical development sides of the company.
She has also been an adjunct professor at Johns Hopkins University, University of Pennsylvania, and Drexel University.
Christianson spoke with BioInform last week about her work in integrating preclinical and clinical data. What follows is an edited version of that conversation.
What have been some of your recent milestones?
The success we have achieved so far is in bringing down some of the barriers and figuring out effective ways to integrate information from different sources, internal and external. And we are making good progress on translational approaches: translating preclinical observations to clinical outcomes. We are anticipating what we will see in the clinic as best we can and [want to] continue to improve that.
How do you integrate data from external sources and partners?
Those challenges are both internal and external. We are in an information-rich environment, so one of the first steps is to take care of the information. … The first step was to manage the large datasets. We were generating lots of data, and we did a really good job of managing it, building databases and warehouses. The next step is to start to connect the data, and in doing that you learn what you need to do there, [that] there are different standards, different types of data.
It's not just the different standards for numerical data. The bigger part is combining numerical data and non-numerical data, the text, and figuring out how to relate one piece of information to another piece of information. That is the challenge we are facing right now.
Does that mean connecting a description of patient conditions with genomic data and image data?
Exactly. Image data start as numerical values, which we turn into an image; genomic data are letters that translate into a protein, and through gene expression analysis you get numbers. That gives you genome information, gene expression, and image information as numbers. You need to find a way to create relationships [between these data] in an automated fashion so you don't have to do this manually, one image at a time, one sequence at a time, one dataset at a time.
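[Editor's note: to make the automated linking Christianson describes concrete, the following is a minimal sketch in Python, assuming hypothetical CSV exports of expression values, image-derived features, and clinical annotations keyed by a shared sample identifier. The file and column names are illustrative placeholders, not AstraZeneca's actual systems.]

    # A minimal sketch of relating heterogeneous data types through a shared
    # sample identifier. File and column names are hypothetical placeholders.
    import pandas as pd

    # Gene expression: one row per sample, one column per gene (numbers).
    expression = pd.read_csv("expression_matrix.csv", index_col="sample_id")

    # Image-derived features: numerical summaries extracted from each image,
    # for example lesion volume or intensity statistics.
    image_features = pd.read_csv("image_features.csv", index_col="sample_id")

    # Clinical annotations: the text-based observations for each sample.
    clinical = pd.read_csv("clinical_annotations.csv", index_col="sample_id")

    # Align all three views on the shared sample identifier in one step,
    # rather than relating them manually one dataset at a time.
    integrated = pd.concat([expression, image_features, clinical],
                           axis=1, join="inner")

    print(integrated.head())

[In practice, the hard part Christianson alludes to is agreeing on that shared identifier and on the standards behind each data type.]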
How do you compare a pixel and a base pair?
You end up developing those tools. We are better off now; we can do things we couldn't do a couple of years ago. We need to get better at it so it can be more automated. You need to keep enhancing the tools. And as soon as you have developed one tool, you realize you want to do the next step, for which you need another tool.
Do you have an in-house development team that looks at instruments to understand their data output and create the right analytical tools for that data?
The scientists first need to understand what is possible technically now and what needs to be developed. Then you need technical experts to understand the science and the scientific questions to develop the best solutions. Now you are starting to tackle the problem together. Where you end up in trouble is when you have people who know the technology and people who know the science, and [they] do not completely understand each other's challenges.
Do high-throughput instruments generating terabytes of data change the dynamic?
The high-throughput era has been upon us for some time now. When we first started with those approaches, the first step was to manage the data. We now have a lot of data that we are managing and we are at the next step. Managing the data is not enough, and analyzing it by itself is not enough. We need to be analyzing it integrated with the rest of our data.
And with new technology, such as second-generation sequencing or more sophisticated imaging technologies, as people are developing tools to manage that data, we already know it needs to be integrated with other data. So we are already a step ahead of where we were a couple of years ago. We already know we need to do that next step. We're doing it better and more efficiently.
What do you think of Sage, the non-profit founded by Merck's Eric Schadt and Stephen Friend to offer open access to a wealth of data?
The effort of Sage is to share the data. We are generating a lot of data, and there is no good reason to be duplicating efforts as we do so. Our competitive advantage is how we use the data. So we can share some of the data, learn from one another, and put more effort into using the data.
We're involved in some consortia, in the pre-competitive spaces where pharmas share what they're doing, what they know, and talk about sharing data. There are public consortia and smaller collaborations with academia. … We are open to working with others and are working with others.
Sharing hasn't exactly been part of the drug discovery culture, but now there is also the cross-pharma group Pistoia Alliance [to support collaborations in non-competitive areas of drug discovery], of which AstraZeneca is also a member. Where has the impetus to share come from?
It reminds me of the book, "All I Really Need to Know I Learned in Kindergarten." [laughs]. It is surprising. But knowing what we know now, in a lot of ways it is a no-brainer.
There are a lot of pressures to contain costs. There is a lot of technology that is costly and we know we are generating a lot of information and we know the scientific challenges are still big.
There isn't any one small group, even taking pharma out of the mix, that is going to tackle it quickly enough. The best way to do it is to join forces and then do what you need to do separately for your competitive advantage. The data generation piece is costly and it is becoming routine, so why not share the data? You are not sharing compound information; you are sharing other data that help you understand the disease better, the patient population, and other things not associated with the compound.
Using that information and the compounds you have, the experiments that you then do, which are specifically targeted to the compound, are smaller.
Sometimes pharma companies need to cancel drugs in late-phase clinical trials. Will this data management approach stop that?
That is the goal. The more we understand diseases and the populations that have the disease, the more targeted our compounds can be, and, by default, we understand our compounds better.
If you understand the mechanism of the disease, you can target your compound better, so then you understand the mechanism of the compound better and the end goal is that you will have fewer surprises in the clinic.
Especially if you are using the translation between pre-clinical observations and clinical outcomes, you can better predict what you might see in the clinic and you should have fewer surprises. Maybe we won't have any surprises!
Is a computational feedback loop already in place, so you can feed -omics and imaging data back to the drug development team?
There's a technical and a cultural component there. Are we doing it? Yes. Are we doing it as well as we could? No. We're still working, and I think everyone is still working, to become more efficient at it. That is a huge goal of what we are doing.
The drug discovery pipeline has always been drawn as a straight line going from one end to the other. It's really circular: you can draw the whole thing as a circle and you can draw feedback loops from every step to every other step.
The technical piece is being worked on. The cultural piece you deal with by having the right teams working together, having the right communication at the right time. That will help with the technical piece of making it become more automated with the data feeds in both directions. It has to be in both directions.
At a session on wikis at the Bio-IT World conference, many participants from pharma said these tools were great but that scientists found it hard to find the time to use them. How helpful are wikis for AstraZeneca researchers?
Is it a good technical solution? Yes, absolutely. Is it the best technical solution? Possibly not. There are some discussions about this, not only at AstraZeneca. Why do people spend time on Facebook to put information in there? How do you find the time to do that? Is there a business environment for this [kind of tool]?
Wikis are not as fun as Facebook.
Why is that? Wikis are very effective. If you are looking for a definition, you might go to a dictionary or a wiki. It's very effective. Nothing by itself is the ultimate solution, at least not yet.
We have a combination of wikis and eRooms that we share and I think the combination is probably the best approach we have right now. In eRooms a group of people can share and work on a document and evolve it and then post it to something like a wiki.
Perhaps we'll end up moving to something like Facebook, or something better than that. They are effective, good tools, but they depend on individuals making the time to report information and get information.
The cultural piece is to make sure that people see that as a valuable approach and that the time they have to spend on it is worthwhile and a lot less than if they had to go digging through a document repository.
What are your thoughts on cloud computing?
I think it's fantastic. Our IT organization looks into this. It's not my area of expertise, but I am looking at it as a potential user, for myself and my group.
When we've got challenges with meeting costs, it's a fantastic way to avoid having to build the infrastructure and support it internally for intensive computing. Since we are talking about integrating data, you need the compute power to do that. Being able to do that without carrying the full cost, sharing the cost, is a fantastic opportunity.
The details about IP questions are, I think, being sorted out right now. I think we are in the process of piloting it. … I am not sure how far along this endeavor is. I know we are looking at how it works, how much it helps our scientists, whether it addresses our needs, and whether there are any questions about IP issues. I can't say that there are, but we're looking into that. … It may mean that you do certain work on the cloud and certain work internally first. I don't know how we will divide up the work, because that is where you might do analysis on your proprietary compound.