NEW YORK (GenomeWeb) – The developers of the Genomics of Drug Sensitivity in Cancer database have updated the resource to include several large-scale drug screening datasets, new pan-cancer and cancer-specific analyses, as well as new computational tools for analyzing and exploring the information contained in the freely available resource.
Mathew Garnett, group leader at the Wellcome Trust Sanger Institute and one of the developers of the resource, told GenomeWeb that he and his colleagues have more than doubled the size of the repository in terms of drugs and cell lines. Specifically, the website now includes data for 265 drugs, up from a previous 140 or so. This includes 48 drugs currently used in clinics, 76 drugs that are in clinical development, and 141 experimental compounds. The database also includes 224,519 IC50 values, up from just under 80,000 in an earlier iteration, and 1,074 cell lines, including ones from the NCI-60 cancer panel.
The developers have also added whole-exome sequencing data for the cell lines, more and higher-quality gene expression data, and genome-wide methylation data to the database, Garnett said. They have also included functionality for mining the available datasets and a custom tool for downloading data from the repository.
The GDSC, which is funded by a grant from the Wellcome Trust, was first launched in 2012 as a collaboration between researchers at the Sanger Institute and the cancer center at Massachusetts General Hospital. The researchers' aim was to identify molecular markers of drug sensitivity that could help researchers and clinicians plan more efficient clinical trials that target defined patient populations. They intend to identify these markers by performing high-throughput drug screens using a large panel of genetically annotated cancer cell lines.
"We know that patients respond differently to cancer drugs in the clinic and that the genetic changes in their cancer can actually underpin that difference," Garnett said. "We are really trying to understand the reason for that difference by looking at the genetic changes in cancer cells that confer sensitivity or resistance to a particular drug [using] this large-scale and comprehensive approach."
In addition to updating the repository, members of the development team have also published the results of a large-scale pharmacogenomics study in Cell that aimed to achieve four objectives for the GDSC. The developers sought to assess if their cancer cell lines faithfully represent the genetic changes that are observed in cancer patients as well as whether these changes influence cells' sensitivity or resistance to particular drugs, and whether combinations of mutations can better explain drug sensitivity or resistance. Furthermore, they wanted to assess whether integrating different kinds of molecular data improves drug-response predictions.
For their analysis, the researchers mapped 1,273 pan-cancer alterations from 11,289 primary tumors in 29 tissues to 1,001 molecularly annotated cancer cell lines. They then measured the response of the cell lines to 265 anti-cancer drugs to check if these alterations could serve as predictors of drug sensitivity. For the study, the researchers used whole-exome sequencing data from sources such as the Cancer Genome Atlas and the International Cancer Genome Consortium. This included 470 somatic variant calls from 48 studies of matched tumor-normal samples covering 6,815 samples and 28 cancer subtypes. They identified 851 copy number segments from 8,239 copy number arrays that cover 27 cancer types. They also assessed 378 CpG sites from 6,166 tumor samples covering 21 cancer types.
The researchers report that their analysis showed that the cell lines "faithfully recapitulate oncogenic alterations identified in tumors." According to results included in the Cell paper, of the 1,273 alterations assessed for the study, 84 percent occurred in at least one cell line and 79 percent were present in at least three cell lines. Furthermore, when the researchers analyzed mutation data from cell lines and primary tumor samples, they found a "high concordance" between the predominant mutations found in the primary tumors and cell lines of the same tissue type in 80 percent of the cancers assessed.
Next, to assess whether the mutations mapped to the cell lines were associated with drug sensitivity and resistance, the researchers screened the 265 GDSC drugs across 990 cancer cell lines, averaging about 878 cell lines per drug. They then used a combination of statistical techniques, machine-learning algorithms, and other computational approaches to assess how these mutations impact drug response as well as the contributions of the mutations to drug-sensitivity predictions.
Specifically, they looked for single mutations that could serve as markers of drug response as well as combinations of alterations that improve drug-response prediction, and they assessed the contributions of each cancer-linked alteration to the variation in drug response observed in patients.
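To make the idea of a single-mutation marker concrete, the following is a minimal sketch of one way such an association could be scored: comparing drug response between cell lines that carry an alteration and those that do not. It is not the method used in the Cell paper, whose statistical details are not described here. The Welch t statistic is a stand-in chosen for simplicity, and the function name and all log(IC50) values are hypothetical, invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(mutant, wild_type):
    """Welch's t statistic for two independent samples with unequal variances."""
    n_mut, n_wt = len(mutant), len(wild_type)
    standard_error = sqrt(variance(mutant) / n_mut + variance(wild_type) / n_wt)
    return (mean(mutant) - mean(wild_type)) / standard_error


# Hypothetical log(IC50) values for one drug; NOT real GDSC measurements.
mutant_log_ic50 = [-2.1, -1.8, -2.5, -1.9]        # lines carrying the alteration
wild_type_log_ic50 = [0.4, 0.9, 0.1, 0.7, 0.5]    # lines without it

t_stat = welch_t(mutant_log_ic50, wild_type_log_ic50)
# A strongly negative value means the mutant lines have lower IC50 values,
# i.e. they are more sensitive to the drug in this toy example.
print(f"Welch t statistic: {t_stat:.2f}")
```

In a real screen, a score like this would be computed for every alteration-drug pair and then corrected for multiple testing before any pair is called significant.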
Among other results, the researchers identified 688 statistically significant mutation-drug interactions, according to the paper. Furthermore, they identified "significant" mutation associations for 225, or 85 percent, of the 265 drugs assessed in the study, including several well-known clinically relevant associations as well as a few novel gene-drug associations.
The researchers were also able to create predictive models for 208, or 78 percent, of the 265 drugs screened in the study and to identify known and novel combinations of mutations that improve drug sensitivity predictions for clinically approved drugs. For example, "cell lines that have an EGFR amplification or a SMAD4 mutation account for 45 percent (10 out of 22) of cell lines sensitive to the ERBB2/EGFR inhibitor afatinib, whereas considering only the EGFR amplified cell lines accounts for only 32 percent (7 out of 22) of the sensitive cell lines," they wrote in the paper. The computational models also performed better with combinations of alterations rather than single alterations, according to the paper.
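The afatinib figures can be reproduced with simple set arithmetic. The sketch below assumes only the counts quoted from the paper (22 sensitive cell lines, 7 with an EGFR amplification, 10 with either alteration). The cell-line identifiers are placeholders, and the three SMAD4-mutant lines without an EGFR amplification are inferred from the difference between 10 and 7.

```python
def coverage(sensitive_lines, marker_positive):
    """Fraction of drug-sensitive cell lines that carry the marker(s)."""
    return len(sensitive_lines & marker_positive) / len(sensitive_lines)


# Placeholder IDs standing in for the 22 afatinib-sensitive cell lines.
sensitive = {f"line_{i}" for i in range(22)}

# 7 of the 22 carry an EGFR amplification (count taken from the quote).
egfr_amplified = {f"line_{i}" for i in range(7)}

# 10 - 7 = 3 further sensitive lines carry a SMAD4 mutation but no EGFR amplification.
smad4_without_egfr_amp = {f"line_{i}" for i in range(7, 10)}

single_marker = coverage(sensitive, egfr_amplified)
combined_markers = coverage(sensitive, egfr_amplified | smad4_without_egfr_amp)

print(f"EGFR amplification only: {single_marker:.0%}")      # 7 / 22
print(f"EGFR amp OR SMAD4 mut:   {combined_markers:.0%}")   # 10 / 22
```

Combining markers with a logical OR can only keep or increase the share of sensitive lines explained. This says nothing about specificity, which would also require the marker status of the non-sensitive lines.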
Lastly, the researchers reported that when looking across cancers, gene expression data is the most predictive data type for assessing variation in drug response, followed by tissue of origin, and that adding other data types such as methylation data improves the assessments. In contrast, within specific cancer types, genomic features such as copy number alterations contribute the most to assessing variations in drug response, and predictive models improve with the addition of methylation data.
Most of the datasets used in the study are now available within the GDSC, Garnett, who is also a co-author on the Cell paper, told GenomeWeb. So, for example, much of the drug sensitivity and genetic data as well as the interactions between different drugs and genetic alterations can be visualized and mined using the GDSC tools. Taken together, the paper and the updated repository offer "rich resources" for linking genotypes with phenotypes as well as for identifying therapeutic options for treating defined cancer sub-populations, the developers believe.
"A lot of our analysis has been focused around identifying genetic alterations in a cancer cell that will confer sensitivity to a particular drug. The ultimate aim would be to personalize cancer care by looking for [those] genetic changes in the cancers of patients and using this information to select what drug they should be given to really improve the way that we treat cancer patients in the future," according to Garnett.
The GDSC is at the beginning of that process. The next steps would be to assess the effectiveness of these drugs in more complex cellular models as well as in animal models before moving on to clinical trials in the future.
"Cancer is so diverse and so heterogeneous, and there are so many drugs, so trying to understand which patient should get which drug is really difficult," Garnett said. "We are pointing people in the right direction about where they might use some of these drugs and where those drugs might be effective, [but] what this doesn't say is that it will work for sure in the clinic. That still needs to be tested rigorously before we actually begin clinical trials."
The GDSC team also plan to continue updating the repository in the future, Garnett said. Specifically, they plan to add more single-agent drug sensitivity data for some of the more recently developed targeted therapies, he told GenomeWeb. They also plan to add information about how combinations of drugs might impact drug sensitivity.