Machine Learning Method Allows Highly Multiplexed Cell Surface Protein Analysis via Flow Cytometry


NEW YORK – A team led by researchers at the Fred Hutchinson Cancer Research Center has developed a workflow that enables highly multiplexed analysis of cell surface proteins using conventional flow cytometry.

Detailed in a paper published this week in Science Advances, the method uses machine learning combined with data from overlapping flow cytometry assay panels to determine the co-expression patterns of hundreds of cell surface markers across millions of single cells.

Because the approach, named Infinity Flow, uses traditional flow cytometry as opposed to newer technologies like mass cytometry, it offers a potentially more accessible and less expensive route for single-cell proteomic experiments, said Mark Headley, an assistant professor in immunology at Fred Hutch and senior author on the study.

The basic notion underpinning the method is that expression of cell surface markers is correlated, meaning that the presence of one particular cell surface protein on a cell indicates that certain other proteins are likely also present. Using a limited set of cell surface protein measurements, researchers can therefore impute the presence of a large set of proteins on the cells in their sample.

The method consists of two steps. First, the researchers stain all the cells in an experiment with antibodies to a small number of proteins. This is called the "backbone" panel and typically consists of around 20 analytes, a number that can be multiplexed on a conventional flow cytometer. They then subdivide the sample into a series of wells, each containing an antibody to one of the additional proteins they are interested in. For instance, if the researchers want to look at another 200 proteins beyond those covered in the backbone panel, they would subdivide their sample into 200 different wells, each of which would contain an antibody against one of those 200 proteins.

They then analyze each well using flow cytometry, allowing them to collect, for each cell, expression data on all of the "backbone" proteins along with data on the one additional protein designated for that well. With this data, the Infinity Flow software is able to predict, for every cell analyzed, the expression levels of all 200 proteins of interest.
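The two-step logic described above can be illustrated with a small sketch on simulated data. This is not the Infinity Flow software: the function name `impute_markers`, the simulated measurements, and the use of ordinary least-squares regression are all assumptions made for illustration, and the published method may well rely on a different, more flexible regression model. The sketch only shows the general idea of fitting one model per well (backbone measurements in, that well's extra marker out) and then applying every model to every cell.

```python
import numpy as np

def impute_markers(backbone_by_well, target_by_well):
    """Fit one linear model per well, then predict every well's marker for all cells.

    Hypothetical helper for illustration; ordinary least squares stands in
    for whatever regression model a real pipeline would use.
    """
    # Pool the backbone measurements of all cells and append an intercept column.
    all_backbone = np.vstack(backbone_by_well)
    design_all = np.column_stack([all_backbone, np.ones(len(all_backbone))])

    imputed = np.empty((len(all_backbone), len(backbone_by_well)))
    for j, (x, y) in enumerate(zip(backbone_by_well, target_by_well)):
        # Train only on the cells from well j, where marker j was actually measured.
        design = np.column_stack([x, np.ones(len(x))])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        # Predict marker j for every cell in the experiment.
        imputed[:, j] = design_all @ coef
    return imputed

# Simulated experiment (all numbers are made up): 5 wells, 400 cells each, 4 backbone markers.
rng = np.random.default_rng(0)
n_wells, cells_per_well, n_backbone = 5, 400, 4
true_weights = rng.uniform(0.5, 1.5, size=(n_backbone, n_wells))
backbone_by_well = [rng.normal(size=(cells_per_well, n_backbone)) for _ in range(n_wells)]
target_by_well = [
    x @ true_weights[:, j] + 0.1 * rng.normal(size=cells_per_well)
    for j, x in enumerate(backbone_by_well)
]

imputed = impute_markers(backbone_by_well, target_by_well)
print(imputed.shape)  # one row per cell, one column per well-specific marker
```

Each cell ends up with a predicted value for every well-specific marker, even though only one of those markers was measured on it directly.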

In the initial step, the size of the "backbone" panel is limited by the number of proteins that a lab's flow cytometer can measure simultaneously. Headley noted that conventional flow cytometry technology tops out at around 28 analytes, while newer spectral flow cytometers can measure 40 to 50 analytes.

In the second round of measurements, each protein is measured in a separate subdivision of the overall sample, which means multiplexing is not limited by questions of interference but only by the number of high quality antibodies that are available, Headley said, noting that existing commercial kits typically offer in the range of 250 to 300 markers.

"As many samples as you can split your big sample into, you can stain that many markers," he said. "You could do hundreds upon hundreds if you wanted."

While existing technologies like Fluidigm's mass cytometry instruments and sequencing-based techniques like CITE-Seq allow for highly multiplexed measurements of proteins across large numbers of single cells, the Fred Hutch researchers believe the Infinity Flow method offers advantages in terms of accessibility and cost.

Flow cytometry is a commonly used and widely distributed technology. The Infinity Flow technique can be run "on pretty much any cytometer, and most research institutes have access to some level of cytometry technology," Headley said.

It's also less expensive than sequencing-based approaches that use antibodies tagged with oligonucleotides for single-cell protein measurements, especially in the case of larger samples.

"Sequencing is still quite expensive," he said. "So if you are trying to get transcriptomic data on thousands of genes across tens of thousands or hundreds of thousands of cells, and then you are also … trying to detect hundreds of proteins, you run into sequencing overhead issues pretty quick. The costs just go up."

Infinity Flow costs scale more slowly. "We can measure 10 million, 20 million, 30 million cells at the same cost," Headley said. "It doesn't cost you more to do the assay the more cells you do."

This also means higher throughput, he said, noting that Infinity Flow experiments can analyze millions or tens of millions of cells while sequencing-based approaches more commonly look at tens of thousands of cells.

The performance of the approach varies with the protein marker being looked at and also depends on factors like the quality of the antibody staining and the composition of the backbone panel used. For instance, in the Science Advances study, the researchers found that when they used a backbone panel of proteins common to all T cells, they were unable to impute protein expression levels for particular T-cell types because, based on the backbone signature, all T cells looked the same. This, they noted, indicates that tailoring backbone panels to the samples and cell populations of interest can improve the assay's performance.
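The dependence on backbone composition can be illustrated with a toy simulation. Everything here is assumed for illustration: the two hypothetical cell subsets, the marker distributions, the helper `r_squared`, and the use of linear regression are not taken from the study. The sketch compares how much of a subset-specific marker's variation can be explained by a backbone of markers shared by all cells versus a backbone that also includes one marker that differs between subsets.

```python
import numpy as np

def r_squared(backbone, target):
    """In-sample R^2 of a least-squares fit of target on backbone (hypothetical helper)."""
    design = np.column_stack([backbone, np.ones(len(backbone))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coef
    return 1.0 - residual.var() / target.var()

rng = np.random.default_rng(1)
n_cells = 2000

# Hypothetical labels for two cell subsets; in a real experiment these are unknown.
subset = rng.integers(0, 2, size=n_cells)

# Marker of interest: expressed mainly in subset 1.
target = 3.0 * subset + rng.normal(scale=0.3, size=n_cells)

# Backbone markers with the same distribution in both subsets.
shared = rng.normal(loc=5.0, scale=0.3, size=(n_cells, 3))

# One extra backbone marker whose level differs between the subsets.
discriminating = (2.0 * subset + rng.normal(scale=0.3, size=n_cells))[:, None]

r2_shared = r_squared(shared, target)
r2_tailored = r_squared(np.hstack([shared, discriminating]), target)
print(f"shared-only backbone R^2: {r2_shared:.2f}")
print(f"tailored backbone R^2:    {r2_tailored:.2f}")
```

In this simulated setup, the shared-only backbone carries almost no information about the subset-specific marker, while adding a single discriminating marker lets the model recover most of its variation.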

The researchers also found that the method was best able to impute expression levels for proteins that exhibited multimodal expression patterns and were highly or commonly expressed. In an evaluation of an Infinity Flow assay detailed in the study, they determined that they were able to generate "meaningful imputed signal" for 155 of 252 markers measured.

Headley said that his lab is currently using the approach in its work on cancer metastasis in the lung "to get a really comprehensive look at which types of immune cells are influencing cancer [development] in either a pro-cancer or anti-cancer way."

His team is also involved in collaborations using the technique to study the different activation states of different immune cells in disease and to observe these cells in their various intermediate stages of differentiation.

In terms of future advances, Headley said he and his colleagues would like to apply the approach to intracellular proteins, as well, though he noted that the different fixation conditions required to stain different intracellular proteins presented a challenge to this effort.

He said he had no plans to commercialize the method but could envision antibody and immunology companies creating and selling pre-optimized backbone panels for researchers using the method.