NEW YORK (GenomeWeb) – A new cancer genomics project organized by the New York Genome Center plans to focus on cancer patients from ethnic minority groups, taking advantage of the rich diversity of New York City.
The collaborative effort, called Polyethnic-1000, is an initiative of the Genome Center Cancer Group (GCCG), which is chaired by Harold Varmus of Weill Cornell Medicine and Charles Sawyers of Memorial Sloan Kettering Cancer Center and includes cancer researchers and clinicians from most of the center's member institutions.
In July, the Mark Foundation for Cancer Research awarded the project a $1 million grant, and it has received some other philanthropic funding, though further fundraising will be needed to complete all three planned stages.
While cancer genomics has taken off in recent years, both in research and as part of diagnostics, most somatic variant data in public repositories comes from cancer patients of European ancestry, and ethnic minorities have been underrepresented in clinical trials. "We really have a huge deficit in knowledge about the landscape of somatic variants in other populations," said Nicolas Robine, a computational biologist at NYGC and one of the leaders of the Polyethnic-1000 project.
It is well known that the prevalence of some cancer variants differs between populations, and that this can have important implications for treatment. For example, only 5 to 15 percent of lung cancer patients with European ancestry but 40 to 50 percent of those with Asian ancestry have EGFR mutations. "That's a massive difference," Robine said, one that is well established but poorly understood.
Also, some groups are more prone to certain cancer types — for example, African American women have higher rates of triple-negative breast cancer than others — but it is unknown whether this might be due to underlying genomic differences, said Fieke Froeling, a medical oncologist in David Tuveson's group at Cold Spring Harbor Laboratory. Froeling and Tuveson are two other project leaders.
Furthermore, germline variants that increase cancer risk are known to differ between populations. For example, BRCA mutations, which increase the risk for breast and ovarian cancer, are prevalent in Ashkenazi Jewish women.
Polyethnic-1000 plans to focus specifically on ethnic minority populations to fill the gap in knowledge, and New York seemed the right place for such a project. "We know that virtually every population on Earth is represented in New York," Robine said.
The project currently involves the genome center's member institutions — which include Memorial Sloan Kettering Cancer Center, Columbia University, Weill Cornell Medicine, NYU School of Medicine, Albert Einstein College of Medicine, the Icahn School of Medicine at Mount Sinai, and others — as well as hospitals in the region, some of which are affiliated with the member institutions, such as SUNY Downstate Medical Center in Brooklyn, New York-Presbyterian Brooklyn Methodist Hospital, New York-Presbyterian Queens, and the James J Peters VA Medical Center in the Bronx.
The first stage of the project, which will kick off as soon as its protocol is approved by the institutional review board, is a retrospective study that will establish the infrastructure for collecting and analyzing samples from different places. The goal is to sequence the exomes and transcriptomes of a total of 100 cancer samples from at least five different institutions. "We really want to establish the flow of samples from the hospitals to our sequencers and the ability to sequence and share the data with those partner institutions," Robine explained.
The only exclusion criterion is self-identification as white or of European ancestry. The project will rely on patients' self-described ethnicity for the first stage and the beginning of the second stage, though it will in parallel use their genomic data to define their ancestry. "It is important to note that we may miss some ethnic groups because they self-identify as white even though genomically, they may not be white," Froeling said. "We will look at the concordance between how people self-identify and what their genomic ancestry is."
Most of the samples are expected to be formalin-fixed, and their nucleic acids will be extracted at NYGC's clinical lab, while the sequencing will be carried out in the center's research production lab. Clinical-grade sequencing would have increased the cost, Robine said. Returning genomic data is a future goal of the project and will require clinical sequencing, but for now the researchers would rather sequence as many samples as they can.
The researchers chose exome sequencing because the data will be comparable to that from projects like The Cancer Genome Atlas (TCGA). It will also allow them to make discoveries that they might not be able to make with a gene panel. And while the number of whole-genome sequencing cancer genomics projects is growing, there are not enough yet to allow for many comparisons, Robine said. Including RNA data will help discover gene fusions, he added, as well as find outliers in gene expression.
All data will be stored on servers at NYGC. Project participants will be able to access the data and study somatic variants through an interface such as cBioPortal, which has been used in many other projects. As most of the samples to be analyzed for the first stage have not been consented for broad data sharing, the results will not be available to the larger research community. The goal is to complete the retrospective study by July of next year, though there might be some overlap with the second stage, which could start in early 2019.
The second stage, a prospective study, will be a pilot project to test the infrastructure. Patients will be consented for both germline and somatic sequencing, and the goal is to sequence 1,000 samples, including all cancer types, with the same ethnic exclusion criteria as for the first stage.
During the third stage, the project will expand to even larger numbers of patients and allow researchers who are part of the consortium to submit proposals to tackle specific hypothesis-driven questions or to focus on certain tumor types. Also, new methods may be added at that time, such as whole-genome sequencing, microbiome sequencing, methylome analysis, or single-cell sequencing.
One project, Froeling said, might be to study pancreatic cancer in ethnic groups that are known to respond less well to therapy than others, and to define molecular signatures of therapy response in them. Another project under discussion is to study microsatellite instability and tumor mutational burden across populations and link them with immunotherapy response data from clinical trials.
The number of patients to be studied during the third stage will depend on the available funding and the research proposals submitted. For now, Polyethnic-1000 has sufficient funding for the first stage and the beginning of the second stage and is "very actively looking for funding for stage two and three," Robine said, which could come from private donors, foundations, or through federal grants.
There is no set budget for the project, though. "Potentially, we could sequence every single cancer patient in New York," Robine said.
Longer term, the project plans to return genomic results to doctors and their patients, though details have not been decided yet. "It's definitely what we are intending to do," Froeling said. "How and when is something we need to work out."
The researchers also plan to integrate their data with other consortia, for example the American Association for Cancer Research's Project GENIE and the International Cancer Genome Consortium (ICGC).
"At the very least, we will introduce more diversity to the public databases, which is a goal we will achieve that will be interesting and important," Robine said.