NEW YORK – The Human Protein Atlas (HPA) project is expanding its analyses to specific diseases and plans to use Olink's Explore platform to generate blood-based proteomic profiles in samples from patients with a wide range of conditions.
Called the Human Disease Blood Atlas, the project aims to look at more than 100 different diseases and to profile roughly 10,000 individuals per year, said Mathias Uhlén, a professor at the KTH Royal Institute of Technology in Stockholm and the director of the HPA initiative.
Uhlén noted that this would amount to more than $5 million per year in Olink assay costs and said the project has secured funding for this work for the next three years and will likely secure funding for the next 10 years.
Long term, Uhlén said he and his colleagues would like to collect blood-based proteomic data on the entire Swedish population (roughly 10.4 million people) each year. To that end, they are piloting the use of a dried blood spot collection technology with Olink's platform.
This month, the HPA team published the first data from the effort in a Research Square preprint detailing their use of the Explore platform to measure 1,463 proteins in plasma samples from 1,400 cancer patients spanning 12 cancer types.
Launched in 2003 and run by several Swedish research institutes including KTH, Uppsala University, and the Science for Life Laboratories in Uppsala and Stockholm, the HPA aims to map all the human proteins in cells, tissues, and organs. The project uses a variety of techniques for its measurements but is primarily antibody-based. Currently, the HPA contains protein expression data from more than 40 human tissue types, covering more than 15,000 gene products, or around 80 percent of the predicted human proteome.
Uhlén said that the project's move into blood-based disease profiles had been driven in part by the development in recent years of highly sensitive, high-throughput technologies for measuring large numbers of proteins in blood.
"There are some absolutely astonishing new technologies for the analysis of proteins in blood," he said. "We are very excited because there are so many things that can be done with this new technology, which is, I think, a paradigm shift in blood profiling."
Uhlén specifically highlighted Olink and SomaLogic, whose Explore and SomaScan platforms, respectively, are the leading products in the space. The Explore platform uses Olink's antibody-based proximity extension assay and is currently capable of measuring around 3,000 protein targets, with an expansion to 4,500 protein targets planned by the end of the year. The SomaScan platform uses SomaLogic's SOMAmer reagents, a proprietary type of aptamer, and can measure roughly 7,000 protein targets.
Uhlén said that he was impressed with both firms' technologies but that the HPA researchers had chosen the Olink platform for the Human Disease Blood Atlas project largely due to concerns about data ownership. Under SomaLogic's previous business model, the company would provide researchers with the SomaScan platform only on the condition that it retained access to data generated on the system.
"Obviously, you need a lot of ethical approvals to do these kinds of [human] studies, and the requirement of the company that they own the data, I just couldn't cope with from a legal point of view," Uhlén said. He noted that SomaLogic has since reversed this policy and is a "very good alternative" but said that "we are now so encouraged by the Olink data that we will probably stay with that."
In their recent preprint, Uhlén and his coauthors used the Olink data they generated to build plasma protein-based classifiers for distinguishing between 12 cancer types: acute myeloid leukemia, chronic lymphocytic leukemia, diffuse large B cell lymphoma, myeloma, colorectal cancer, lung cancer, glioma, breast cancer, cervical cancer, endometrial cancer, ovarian cancer, and prostate cancer.
Using a random forest and a regularized generalized linear (glmnet) machine learning model, they identified panels of proteins capable of distinguishing one cancer type from another. The panels ranged in size from 18 proteins (for lung cancer) to as few as three proteins (for AML, glioma, myeloma, and ovarian cancer). They built an 83-protein panel that identified specific cancer types with areas under the curve (AUC) of between 0.93 and 1.
The researchers also looked at whether their plasma proteomic data could help detect early-stage cancer, finding that it could distinguish between stage I lung cancer patients and healthy individuals with an AUC of 0.79 and between stage I colorectal cancer patients and healthy controls with an AUC of 0.78. These results indicate that the protein markers could be useful for early cancer detection, though, as the authors noted, much additional validation is needed.
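For readers unfamiliar with the metric, an AUC can be read as the probability that a classifier assigns a higher score to a randomly chosen patient than to a randomly chosen healthy control, so 0.5 is chance and 1 is perfect separation. The short Python sketch below illustrates that pairwise definition. It is not the preprint's analysis code, the function name `auc` is chosen here for illustration, and the scores are made up rather than taken from the study.

```python
from itertools import product


def auc(case_scores, control_scores):
    """Fraction of (case, control) pairs in which the case scores higher.

    Ties count as half a win, matching the usual rank-based definition.
    """
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up classifier scores for illustration only (not data from the preprint).
patients = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.5, 0.3, 0.2]
print(auc(patients, controls))  # 13 of 16 pairs favor the patient: 0.8125
```

On this reading, an AUC of 0.79 means the model ranks a stage I lung cancer patient above a healthy individual in roughly four out of five such pairings.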
While genomics has largely dominated multi-cancer early detection research, proteomics researchers have also long pursued protein markers for cancer detection, albeit with relatively little success. Some genomics-based firms, Exact Sciences most prominently, have also explored adding protein data to their MCED products. In June, Exact purchased German plasma proteomics company OmicEra Diagnostics for $15 million with plans to use that company's mass spectrometry-based proteomics technologies for cancer biomarker discovery.
Lukas Reiter, chief technology officer of Swiss proteomics firm Biognosys, noted that because proteins are typically present in cells at much higher copy numbers than nucleic acids, targeting proteins could enable early detection of tumors that are not shedding detectable amounts of DNA, enhancing the sensitivity of pan-cancer assays. In May, Biognosys published a paper in the Journal of Proteome Research using its mass spectrometry workflow to identify candidate cancer protein markers in plasma.
This copy number advantage is countered, however, by the fact that, unlike for nucleic acids, no technology exists to amplify proteins, making their detection at low levels technically challenging.
Reiter, who is not involved in the HPA initiative, noted that while the Biognosys analysis used mass spec, its findings were similar in terms of the utility of plasma protein markers for distinguishing between cancer types and detecting early-stage disease.
He noted, though, that both projects were at a very early stage, with neither, for instance, comparing the performance of its models to existing clinical practice. "It's just a hint that this may be something that may work well in the future," he said.
Uhlén said that he and his colleagues are now collecting additional cancer cohorts to validate their initial results. Additionally, they are developing a workflow that would allow for running the Olink Explore assay in dried blood spots, which Uhlén said could make large-scale, home-based sample collection feasible.
For this work, the HPA team is working with Capitainer, a Solna, Sweden-based microsampling company that has developed blood sampling cards that allow for consistent collection of fixed volumes of blood from finger pricks, enabling quantitative analyses. Uhlén said that in a pilot project the researchers have managed to get the Olink assay "working quite nicely" with these samples.
He said that longer term he envisions this sampling technology potentially allowing the HPA researchers to make plasma proteomic measurements on the full Swedish population annually, though he noted that any such effort would likely survey a smaller number of proteins — perhaps on the order of a few hundred — given the expense of the Olink assay.
Uhlén said that the Human Disease Blood Atlas is currently in the process of moving to Olink's Explore 3072 assay, which measures just over 3,000 protein targets. Thus far, the project has analyzed around 10,000 patient samples across 88 different diseases including cancer, autoimmune conditions, and neurological, cardiovascular, and infectious disease. He said the researchers will begin making those datasets available in December and plan to publish the full sets in the project's open access resource in spring 2023. By the end of 2023, the team aims to have generated data on around 20,000 patients spanning 120 diseases.