Q&A: GenoSpace's John Quackenbush on Making Data Accessible for Research, Clinical, and Patient Use

John Quackenbush, the founder and CEO of GenoSpace, a Cambridge, Mass.-based bioinformatics company, was one of 13 individuals recognized by the White House's Office of Science and Technology Policy as an Open Science Champion of Change, a program that honors efforts to promote and use open scientific data and publications.

His firm was recognized for its efforts to create tools that make clinical and genomic data available for biomedical research and personalized medicine, as well as for the work it is currently doing with the Multiple Myeloma Research Foundation, a non-profit organization that focuses on research and the development of better treatments for multiple myeloma. Kathy Giusti, MMRF's founder and CEO, was also one of the honorees at the White House ceremony this week.

GenoSpace and MMRF began working last year on an information ecosystem for researchers, clinicians, and patients living with multiple myeloma that uses data from MMRF's Relating Clinical Outcomes in Multiple Myeloma to Personal Assessment of Genetic Profiles, or CoMMpass, study.

CoMMpass, which was launched in 2011, is collecting clinical and molecular data from 1,000 newly diagnosed patients over a minimum of five years and using it to obtain insights into the disease's progression and treatment response. The study aims to support the discovery of disease biomarkers, new drug targets, and more personalized treatment approaches. MMRF has worked with GenoSpace within the last year to create the infrastructure used to store the data and to provide ready access for visualization and analysis.

Earlier this week, BioInform caught up with Quackenbush, who is also a professor of computational biology and bioinformatics at the Dana-Farber Cancer Institute and Harvard University's School of Public Health, to talk about the award and the MMRF project. What follows is an edited version of the conversation.

Congratulations on winning the award! Did you know you had been nominated?

I had a few people ask me if I was interested in being nominated. One was Kathy from the MMRF, the other was Brian Athey [the TranSmart Foundation's co-CEO and a professor in the University of Michigan's computational medicine and bioinformatics department]. I wasn’t sure I would have a chance but then last week, Kathy and I got a call from the White House. They wanted to ask us a couple of questions about CoMMpass and what it meant for data to be open. We had a short conference call on Wednesday of last week. Then we got a call telling us that we were being invited to the White House and that we were going to be recognized for the work we've been doing together.

That’s a pretty impressive accomplishment, especially when you consider that you began partnering with MMRF in the latter half of last year.

On the part of the MMRF, CoMMpass represents an extremely innovative approach to funding and driving biomedical research. I think from our side what we've managed to put together is an innovative data portal built on top of a secure repository that really allows you to ask very sophisticated questions of the data.

One of the questions that Michael Stebbins [OSTP's assistant director for biotechnology] from the White House was asking when he had us on the phone was 'why not just put the data into dbGaP?' While dbGaP is fantastic as the final repository for data, it doesn’t have the dynamic query capabilities that really were necessary to help CoMMpass deliver on its promises.

The goal at MMRF was to try and really drive research forward by making the data and information necessary to interpret it accessible. We felt that the research gateway that we'd developed, and that we deployed for them, was absolutely what was required to allow their pharma partners, and ultimately anyone, to explore the data and ask questions that involve understanding both the clinical and genomic characteristics of the patients enrolled in the study.

How did you end up partnering with MMRF?

In 2011 Mick Correll and I founded GenoSpace. We recognized early on that genomic data was fundamentally going to be different than most of the other data we'd been dealing with in biomedical research. Gene expression data isn't identifiable. But GWAS data … and ultimately genome sequencing data … represents identifiable information. We quickly understood that we would have to think differently about how we stored and managed that data. We realized that the whole idea of de-identification no longer made sense because as soon as you have genomic data, it is identifiable. [So] if genomic data is identifiable, why not forget de-identification altogether? Let's combine clinical and genomic data in a scalable internet data store that focuses on strong encryption to safeguard patient confidentiality. We built a large NoSQL database to capture and index all of this data … and then started building a host of tools that we could snap into place around it.

Then we heard that MMRF was starting CoMMpass. They had gone through a two-year process to develop specifications for a data resource to support the study and then to identify somebody who could build it. We heard about an [undisclosed] company that they had essentially settled on. We approached the MMRF and explained why we didn’t think their solution was going to work. We were convincing enough that they put us in a 12-week, head-to-head competition with the other vendor. But six weeks into the project, we had completed nearly everything they asked for and the other vendor hadn’t really started. So it was pretty clear that we were the group they needed to work with.

Since then, we've really been refining this portal, adding some extremely innovative features, and making it a tool to drive discovery. The MMRF Researcher Gateway allows you to combine clinical and genomic data to ask the kinds of questions you want to ask. We realized that many of the most important questions people wanted to ask were based on defining and comparing cohorts. Then [there] are other questions you can ask too about individual genes [and] therapies.

Most importantly, the portal was designed to be very interactive. It’s a discovery engine that’s designed to take advantage of the genomic data and longitudinal information that’s collected over the five-year CoMMpass study and to help suggest new ways to explore the study’s unprecedented information.

So as part of the MMRF partnership, you've developed two so-called gateways to the CoMMpass data: one for researchers and another for patients. Can you talk a little bit about what kinds of data and capabilities are available through each one?

The MMRF Researcher Gateway is focused on addressing the needs of research scientists. CoMMpass is funded through a consortium of pharmaceutical companies. In return for financial support, they have six months of early access to the underlying data, allowing them to mine the data resource and to look for associations between genomic variants and disease. They have agreed that after six months all of this data will be made public. The MMRF Researcher Gateway includes tools for dynamic data analysis and visualization. But we’ve also designed it to facilitate collaborative analysis. It allows people to build collaborative groups, share results, [and] track and archive their research.

Early in our work with MMRF, we started asking how one might go beyond CoMMpass and its first 1,000 patients. We both realized that one way to do this was to create a Community Gateway designed to allow patients to share information with other patients and scientists, and also to help match patients to clinical trials. We’re in a beta release now, but the plan is to open it more broadly in the fall. The Community Gateway is designed to collect self-reported information, but we could also upload data from patients' EMRs to capture health data. Ultimately, we hope to connect patients to trusted providers who can generate genome sequence and provide it.

Beyond the work we’ve done with the MMRF, we’ve also created GenoSpace for Clinical Care that helps provide detailed, up-to-date, actionable clinical information on mutational profiling data. We have another partner we hope to announce soon. But here, too, we see opportunities to use this to provide additional services to patients.

Following up on that last point, is there a plan to use the CoMMpass data to inform clinical care?

Not directly. It’s a bit of a leap to go from this research data to clinically actionable information. But ultimately, that is among the goals of CoMMpass, and we believe we have the tools to support it.

One last question on the Open Science Champion award: what does it mean for GenoSpace?

Open data in and of itself isn’t the solution. To make data useful, we need to place it into context, together with the information necessary to make it useful. And we need to create tools that let people get access to the data and information in meaningful ways — we need to democratize access.

At GenoSpace, we’ve built tools that make the data useful and also facilitate its broad use. This award recognizes the progress we’ve made and serves as an additional external validation, beyond that from our clients, of our strategy.

It is a tremendous honor to be recognized in this way, and it is a testament to the hard work and dedication the entire GenoSpace team has shown.
