NEW YORK – A team led by researchers at the Francis Crick Institute and Charité University Hospital Berlin has used a high-throughput proteomics workflow to identify protein profiles linked to disease severity in COVID-19 patients.
Detailed in a paper published last week in Cell Systems, the work could help clinicians identify COVID-19 patients at risk of severe infection. It also provided a test run for a newly developed proteomic platform that the researchers are offering for large-scale proteomics experiments running thousands to tens of thousands of samples, said Markus Ralser, director of biochemistry at Charité and senior author on the study.
The work builds on research published last year by Ralser and his colleagues in which they developed a software package called DIA-NN, which combines signal correction strategies to reduce interferences with neural networks that assign confidence to peak identifications, allowing high performance even with very short liquid chromatography gradients. Using the approach, the researchers were able to quantify several hundred proteins in five-minute DIA analyses of undepleted plasma.
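The published DIA-NN implementation is considerably more involved, but as a loose conceptual illustration of neural network-based confidence scoring, the sketch below trains a small classifier on hypothetical target and decoy peak-group features (fragment co-elution correlation, mass error, retention-time deviation) and treats its output probability as a confidence score. All features, distributions, and values here are invented for illustration and are not DIA-NN's actual model.

```python
# Conceptual sketch only: a small neural network scoring candidate peptide
# identifications from a handful of hypothetical features, loosely analogous
# to (but not) DIA-NN's target-decoy confidence assignment.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 2000

# Hypothetical features per candidate peak group:
# fragment co-elution correlation, mass error (ppm), retention-time deviation (min).
targets = np.column_stack([
    rng.normal(0.9, 0.05, n),   # true IDs: high fragment correlation
    rng.normal(1.0, 2.0, n),    # small mass error
    rng.normal(0.0, 0.5, n),    # small RT deviation
])
decoys = np.column_stack([
    rng.normal(0.5, 0.2, n),    # decoys: poorer correlation
    rng.normal(0.0, 8.0, n),    # larger mass error spread
    rng.normal(0.0, 2.0, n),    # larger RT deviation
])

X = np.vstack([targets, decoys])
y = np.concatenate([np.ones(n), np.zeros(n)])  # 1 = target, 0 = decoy

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
clf.fit(X, y)

# The classifier's probability for a new peak group serves as its confidence score.
new_candidate = [[0.85, 1.5, 0.3]]
print("confidence:", clf.predict_proba(new_candidate)[0, 1])
```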
In the Cell Systems study, they used a further optimized version of the DIA-NN software to analyze serum and plasma samples from 31 COVID-19 patients, identifying proteins that were differentially expressed based on the severity of the patient's disease. They then tested these profiles in a separate set of 32 subjects, 17 of them COVID-19 patients and 15 of them healthy volunteers.
In an analysis of 104 samples from the initial 31-patient cohort, Ralser and his team quantified 297 proteins, 229 of which were measured in at least 75 percent of the samples. Among these 229 proteins were 37 whose expression varied with disease severity. When the researchers tested these 37 proteins in the second cohort, 27 exhibited the same behavior.
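As a rough sketch of the filtering and severity-association steps described above, the function below assumes a hypothetical samples-by-proteins intensity table and a per-sample severity grade, keeps proteins quantified in at least 75 percent of samples, and flags those whose levels track severity. The rank-correlation test used here is an assumption for illustration; the paper's actual statistics may differ, and the data layout is invented.

```python
# Illustrative sketch (not the authors' code) of completeness filtering and
# severity association, assuming a hypothetical pandas DataFrame of protein
# intensities (rows = samples, columns = proteins) and a severity Series.
import pandas as pd
from scipy.stats import spearmanr

def severity_associated_proteins(intensities: pd.DataFrame,
                                 severity: pd.Series,
                                 min_completeness: float = 0.75,
                                 alpha: float = 0.05) -> set:
    hits = []
    for protein in intensities.columns:
        values = intensities[protein]
        # Keep only proteins quantified in at least 75 percent of samples.
        if values.notna().mean() < min_completeness:
            continue
        # Rank-correlate the protein's level with the per-sample severity grade.
        rho, p = spearmanr(values, severity, nan_policy="omit")
        if p < alpha:
            hits.append(protein)
    return set(hits)

# Hypothetical usage: proteins flagged in discovery and confirmed in validation.
# discovery_hits = severity_associated_proteins(discovery_matrix, discovery_severity)
# validated = discovery_hits & severity_associated_proteins(validation_matrix,
#                                                           validation_severity)
```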
Ralser said that while the work is still in the early stages, he was encouraged by the fact that patient proteome profiles correlated well with their disease severity.
"It means that in the long run you might be able to make proteomic assays that are predictive," he said.
Building such assays will require many more samples along with validations across different research centers, he noted, adding that the researchers are working to collect longitudinal samples from multiple patients that they can use to further develop the protein profiles.
"But long story short," he said, "these first results show that the disease is very strongly reflected at the level of the proteome."
Ralser said that since the submission and publication of the Cell Systems paper, he and his colleagues have made measurements in a much larger cohort where samples were collected longitudinally throughout the progression of patients' infections.
"The idea now is to develop predictive models," he said. "We want to be able to take a sample from a patient in the early stage of the disease and understand much better or make a prediction of what risk that individual has of developing a severe form of the disease. We need more data to see whether that is possible or not, but we are optimistic because the signatures [generated thus far] are quite strong and quite unique, but we haven't done it yet."
Ultimately, Ralser said, the researchers hope to translate their findings into more routine assays that clinical labs could use down the road to help manage future waves of the pandemic.
The Cell Systems work also served as a demonstration project of sorts for the high-throughput proteomics platform the Charité researchers have developed. In the paper, they detailed several recent improvements to the platform that Ralser said were key to enabling highly reproducible proteomic measurements across cohorts comprising thousands or tens of thousands of samples.
One seemingly minor but in fact significant optimization the researchers made was developing a system for using the same batch of reagents throughout the entire course of an experiment, Ralser said.
"There are quite a few automated mass spec sample preparation methods for proteomics out there, and they all have different strengths and disadvantages," he said.
However, he added, due to the traditional throughput limitations of mass spec-based proteomics, few of these workflows have been designed with massive sample numbers in mind. When Ralser and his colleagues began looking at the major contributors to imprecision across very large sample sets, they realized that one key factor was the use of different batches of reagents over the course of an experiment.
"This is what creates a large fraction of the batch effects, the need to freshly prepare [reagents] all the time because they are not stable," he said.
To address this problem, the researchers developed a system using pre-made plates of reagents all from the same batch that can be frozen and stored until needed.
"When a project starts, you estimate how big it will be," Ralser said. "Then you make a huge batch of the reagents, pipette them with the robot, and then you freeze them. And this assures that, it doesn't matter how long your experiment runs, how many samples you have, you are always treating every sample with exactly the same reagents. And this brings batch effects massively down."
The researchers also optimized the DIA-NN software for working with high-flow chromatography, which further boosts the platform's throughput.
Currently, the platform is capable of running 180 samples per day per mass spectrometer at a reagent cost of around €10 ($11.40) per sample, Ralser said.
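For scale, a back-of-the-envelope calculation from those figures (ignoring instrument costs, downtime, and labor, which the article does not quantify) suggests what a project of the size mentioned below would require:

```python
# Rough arithmetic from the quoted figures: 180 samples/day per mass spectrometer,
# ~€10 reagent cost per sample. Assumes a single instrument and no downtime.
samples = 20_000                  # e.g., the epidemiological project mentioned below
throughput_per_day = 180
reagent_cost_eur = 10

print(f"instrument-days: {samples / throughput_per_day:.0f}")   # ~111 days
print(f"reagent cost: €{samples * reagent_cost_eur:,}")         # €200,000
```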
He added that the system has been developed and documented with the goal of achieving ISO 13485 certification and that Charité last year launched a high-throughput proteomics core facility to make the system available to researchers both inside and outside the hospital. He said that the facility is in the middle of running samples for an epidemiological project that will total around 20,000 proteomes, though this work has been put on hold during the SARS-CoV-2 pandemic.
"The idea is that this is a facility for projects that start at the size of 5,000 to 10,000 samples," he said. "There are very few places worldwide where you can go and say, 'Can you measure for me 10,000 proteomes?' But clearly for quite a few applications that is what is needed."