NEW YORK – Owkin expects to generate multimodal tumor microenvironment characterizations for thousands of patients by the end of the year in a bid to aid clinical decision-making and drug discovery, the company said at the annual meeting of the American Association for Cancer Research on Sunday.
The Paris-based company launched the Multi Omic Spatial Atlas in Cancer (MOSAIC) initiative last year aiming to build the world's largest spatial genomics database to help advance cancer biomarker and drug discovery through artificial intelligence-driven bioinformatics analysis. MOSAIC is a collaboration between Owkin and multiple European and American academic research centers. Owkin has committed $50 million to the 10-year project, after which it intends to release the datasets into the public domain. Of note, Owkin founded the MOSAIC consortium with NanoString Technologies, which is no longer involved in the project, according to an Owkin spokesperson.
In a webcast presentation at the AACR meeting, Jean-Philippe Vert, Owkin's chief R&D officer, said the company's goal is to create something like The Cancer Genome Atlas for spatial omics.
"Spatial omics is important and can shed light on new mechanisms, in particular in immuno-oncology, but if there's one thing we are missing today, it's to have lots of data in lots of patients," Vert said. While much can be learned by studying a single patient deeply, "if you want to be able to find correlations between what we see in patients and clinical outcomes, we need more than one, 10, or 100 patients," he noted. "We need much more."
Owkin is aiming to collect multimodal spatial omics data from 7,000 patients across seven cancer indications: non-small cell lung, ovarian, bladder, triple-negative breast, mesothelioma, glioblastoma, and diffuse large B cell lymphoma. The company is incorporating data from 10x Genomics' Visium spatial gene expression mapping technology, single-cell RNA sequencing, bulk RNA sequencing, whole-exome sequencing, digitized hematoxylin and eosin (H&E) staining, and clinical patient records into MOSAIC.
"AI systems trained on these data will be able to better understand the biology of cancer, in particular the tumor microenvironment, leading to discovery of new targets [and] biomarkers, and understanding patient heterogeneity, so that ultimately we can have a more rational approach to match drugs to patients," said Vert.
Owkin's partner hospitals have now processed more than 400 samples. "We are targeting up to 3,000 samples by the end of the year," Vert said, adding that the company intends to maintain that momentum in 2025.
In initial data from those first 400 samples, Owkin researchers saw clusters of cells that expressed different levels of therapeutic targets, including subclusters within malignant cells. As an example, Vert noted that in a sample from a patient with bladder cancer, there were two subclusters of malignant cells corresponding to high and low expression of CDH2, indicating that the tumor has two lines of subclones within it. The researchers then were able to map cells from the two subclones back onto the Visium slide of the tumor and show that the two populations of subclones were separate from each other.
Vert noted that the technology allows researchers to visualize other cell types, too, including cells that are significant in the tumor microenvironment. In the case of the bladder cancer patient, spatial imaging showed that neutrophils tended to localize with the CDH2-high population of cells, while the CDH2-low cells were infiltrated by T cells and natural killer cells.
"Not only do we have two subgroups of cancer but importantly, we have two different tumor microenvironments," Vert said. "And suddenly this raises the question of using this information to better treat this patient by taking into account not only the mutations and the tumor types, but as well what's around the tumors."
Vert said those results plus analysis of gene overexpression suggest that a combination therapy targeting both populations of cells in the tumor could be an effective treatment for the patient. He also explained how the spatial mapping capabilities of MOSAIC could show immunologically hot and cold regions within a tumor or visualize data over multiple time points to show the effects of treatment.
Owkin is also applying artificial intelligence methods to expand the capabilities of MOSAIC analysis, for example by enhancing signal quality in spatial analysis, increasing the resolution of digital images, or generating synthetic spatial omics profiles from H&E slides by predicting spatial gene expression.
Lastly, Vert said Owkin wants to use AI to create a multimodal foundation model. A foundation model is a type of generative AI that is pre-trained to perform a variety of tasks and that learns continuously from data inputs or prompts and can apply information from one use case to another. Vert noted that there are already examples in the scientific literature of foundation models in development for purposes such as modeling a cell or a type of image.
"Ultimately, what's missing as of today is not only to have different foundation models, but to build one foundation model that would capture information across modalities," Vert said.
To that end, Vert and other scientists from Owkin have partnered with former Google DeepMind researchers to launch a new company, Bioptimus, dedicated to applying AI foundation models at various scales to biological systems. The company debuted in March with $35 million in a seed funding round led by Sofinnova Partners, with participation from Bpifrance Large Venture and other investors.
Vert said that the ultimate goal of applying AI methods within MOSAIC and other projects is not just to find associations in disease processes, but to determine causation. "We want to move away from just finding correlation, to being able to predict the future [of an outcome] subject to intervention," Vert said, adding that primary applications of that capability would be predicting individual responses to treatments and identifying new targets.
Acknowledging that Owkin's ambitious goal of 7,000 patient samples is still a small number in the larger landscape of millions of patients with cancer, and that extending the data with AI enhancements or synthetic data generated from H&E slides could introduce error, Vert emphasized that a foundation model developed for MOSAIC would still be validated experimentally.
"I don't expect these models to perfectly recapitulate biology, but I think they can be useful," Vert said, adding that Owkin has already seen good performance from its digital pathology model, which is "not trained on hundreds of millions of slides."