NEW YORK (GenomeWeb) – Seattle, Wash.-based non-profit Sage Bionetworks has launched a public portal that holds multiple datasets related to Alzheimer's disease that were generated by the Alzheimer's Target Discovery and Preclinical Validation Project, one of two projects being conducted under the auspices of the Accelerating Medicines Partnership's (AMP) Alzheimer's initiative.
AMP is a National Institutes of Health-led multidisciplinary public-private venture involving the NIH, the US Food and Drug Administration, and multiple industry and academic scientists. The Alzheimer's Target Discovery and Preclinical Validation Project, part of the Alzheimer's disease initiative that is one of three under the AMP umbrella, seeks to identify and validate potential biological targets of the disease. The AMP-AD knowledge portal is built on Sage's Synapse infrastructure, which is used for hosting, tracking, and sharing data and results as part of collaborative projects. The same system supports the Dialogue for Reverse Engineering Assessment and Methods' computational challenges.
The portal provides broad access to large and complex biomedical datasets generated by the participants under the auspices of the initiative as well as preliminary datasets generated prior to the start of the AMP-AD project that have been used in the study, Lara Mangravite, Sage's director of systems biology, told GenomeWeb. This first release contains data from human brain samples. Specific data types include some RNA-sequencing, array and imputed genotype, clinical, methylation, and mass spectrometry data. Over the next several months, additional RNA-seq datasets will be added to the portal, as well as miRNA and ChIP-seq data, Mangravite said. In total, she said, Sage expects to host different datasets from about 4,500 brain samples. The platform will also host high-dimensional data from model systems such as mouse, Drosophila, stem cells, and so on. It currently includes data from some mouse models.
For now, Sage is focused on gathering the datasets from the project participants and making them publicly available, in whatever format they are submitted, within three months of submission. However, Sage plans to eventually go through the individual studies and standardize the datasets. So, for example, in the case of genotype data, "we'd like all of the genotype data to be imputed using the same software and the same reference panel so that the analysts can use it similarly," Mangravite explained. For RNA-seq data, "we are going to go back and reprocess using the same aligner and the same reference panel so that the bioinformatics processing of the data is done uniformly across all of the subjects." Sage is also open to including data from groups external to the AMP-AD partnership in the portal, she added.
The portal's contents can be accessed via a web interface that includes detailed information about the data sources and projects, access to data within the file and folder structure, and the ability to track analyses from raw data through analytical output. The data are also accessible programmatically through R, Python, Java, or command-line clients. These clients offer mechanisms for querying data, for bulk data downloads, and for tracking provenance.
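To make the idea of tracking an analysis "from raw data through analytical output" concrete, here is a minimal sketch in plain Python. It is not Synapse's API or data model. It only illustrates the general technique of recording, for each derived file, which inputs and which step produced it. The `Artifact` class, the `trace_to_raw` function, and every file name below are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A file plus a record of how it was produced (hypothetical model, not Synapse's)."""
    name: str
    produced_by: str = "raw upload"  # free-text description of the analysis step
    inputs: list[Artifact] = field(default_factory=list)


def trace_to_raw(artifact: Artifact) -> list[str]:
    """Return the names of all raw (input-less) artifacts that this one derives from."""
    if not artifact.inputs:
        return [artifact.name]
    raw: list[str] = []
    for parent in artifact.inputs:
        for name in trace_to_raw(parent):
            if name not in raw:  # keep first-seen order, drop duplicates
                raw.append(name)
    return raw


# Illustrative file names only; they do not correspond to real portal entries.
bam = Artifact("sample01.bam")
counts = Artifact("gene_counts.tsv", produced_by="read counting", inputs=[bam])
clinical = Artifact("clinical.csv")
model = Artifact(
    "expression_vs_diagnosis.tsv",
    produced_by="regression",
    inputs=[counts, clinical],
)

print(trace_to_raw(model))  # ['sample01.bam', 'clinical.csv']
```

Walking the recorded inputs backward is what lets a reader of an analytical result find the raw files it ultimately depends on. A real provenance system would also record software versions and parameters for each step.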
The resource is open, so any interested researchers can browse the site, Mangravite said, but in order to download the data they have to be registered Synapse users (registration is free) and must also sign documents agreeing to certain use terms. Downloading the datasets is also free; however, users who want to download the raw RNA-seq BAM files will have to foot the bill themselves.
Generally, the information can be incorporated into ongoing research studies as well as analyzed in combination with other data, among other uses, with minimal restriction, though some datasets have additional terms governing their use. These are datasets from samples, collected by the Mayo Clinic and the Broad Institute, that were obtained with consent from now-deceased individuals. Users are required to submit a signed Data Use Certificate to Synapse in order to obtain approval to access those data.
The AMP-AD project is supported and coordinated by the NIH's National Institute on Aging (NIA). Members of the AMP steering committee for the Alzheimer's disease project are drawn from NIA, the NIH's National Institute of Neurological Disorders and Stroke, the FDA, four pharmaceutical companies (AbbVie, Biogen Idec, GlaxoSmithKline, and Lilly), and four non-profit groups, namely the Alzheimer's Association, Alzheimer's Drug Discovery Foundation, Geoffrey Beene Foundation, and USAgainstAlzheimer's. Academic participants include the Icahn School of Medicine at Mount Sinai; the Broad Institute of MIT and Harvard; University of Florida, Gainesville; Emory University; Rush University; Mayo Clinic; the Institute for Systems Biology; the University of California, Los Angeles; and others.
AMP grew out of discussions between NIH Director Francis Collins, several heads of pharma companies, and other groups back in 2011 around the challenges of developing effective new medications in a timely and cost-effective way, and around how to improve the existing process, Neil Buckholtz, the director of NIA's neuroscience division, told GenomeWeb. Out of those talks came three AMP pilot projects focused on Alzheimer's disease, type II diabetes, and the autoimmune disorders rheumatoid arthritis and systemic lupus erythematosus.
Follow-up meetings and discussions around the Alzheimer's initiative resulted in two project recommendations. One of these, dubbed the biomarkers project, is a consortium of three NIA-supported Phase II/III secondary prevention trials that are currently testing several anti-amyloid therapies. The trials, which are being run at Harvard and Washington Universities and the Banner Alzheimer's Institute, are exploring the value of tau imaging and novel fluid biomarkers for tracking responsiveness to treatment and/or disease progression. Baseline data from these trials will eventually be made broadly available through the Alzheimer's Association's GAAIN platform.
The goal for this particular project, Buckholtz said, is to identify biomarkers that could potentially predict outcomes from these clinical trials, with the overall idea being to reduce the number of participants and the time required to evaluate and secure approval for therapeutics. "Through a lot of work that's been done over the past 10 to 15 years on biomarkers especially in Alzheimer's disease ... we've learned that the disease process in the brain starts 10, 15, even 20 years before the clinical symptoms," he told GenomeWeb. "If you could identify people who are at risk before they have symptoms, then this would be the place where you really want to schedule potential therapies to see if you could slow the course of the disease or slow the onset of the disease."
Currently, there are three trials funded or in the process of being funded as part of this project. The Dominantly Inherited Alzheimer's Network clinical trial is looking at therapies for individuals who have one of three rare mutations that result in early-onset, dominantly inherited Alzheimer's disease. A second trial focuses on developing anti-amyloid treatments for asymptomatic individuals who are at increased risk of developing Alzheimer's disease. A third trial focuses on treatments for asymptomatic individuals with multiple copies of the apolipoprotein E4 allele.
The Alzheimer's Target Discovery and Preclinical Validation Project, for its part, seeks to shorten the time between the discovery of drug targets and the development of new drugs for Alzheimer's. The project's current grants support efforts to integrate analysis of large-scale molecular data from human brain samples with network modeling approaches and experimental validation, and support broad sharing of data and analytical tools.