In an effort to create a cheaper, faster, and animal-free method for determining pluripotency in stem cells, an international research team has developed a web-based test that helps investigators determine whether their cell lines are pluripotent based on their gene expression profiles.
In a paper published recently in Nature Methods, the investigators describe their method, christened PluriTest, as an alternative to the so-called teratoma assay, the current standard in the field. The assay determines whether human stem cells are pluripotent based on their ability to generate different types of tissues in tumors developed in immunodeficient mice.
"Many scientists are unhappy" with the teratoma assay, Jeanne Loring, a Scripps Research molecular biologist and one of the authors of the study, said in a statement. The method can take six to eight weeks to get results, she noted, and is also "technically challenging and difficult to standardize."
An added factor, the researchers note in the paper and argue in a prior publication, is that these teratoma assays "may have limited value as a criterion for pluripotency."
Building on previous work that used machine-learning methods to identify specific stem cell phenotypes based on gene expression microarray data, the authors surmised that they could use the same type of data to predict the presence or absence of pluripotent features in stem cells.
First, the authors developed a dataset dubbed "stem cell matrix 2," or SCM2, that included around 450 genome-wide transcriptional profiles from 223 human embryonic stem cell and 41 iPSC lines. They then applied an unbiased analytical approach called nonnegative matrix factorization to "identify unexpected patterns engrained in the datasets."
PluriTest uses two related classifiers: the "pluripotency score," which indicates whether a query sample contains a pluripotent signature, and the "novelty score," which measures how far the stem cells in a query sample deviate from the normal pluripotent stem cell lines contained in the SCM2 dataset.
The pluripotency score is obtained by identifying gene features that indicate pluripotency and ranking them based on the "largest distance between margins of known pluripotent and non-pluripotent samples in the training set," according to the paper.
The novelty score, meanwhile, measures the ability of a nonnegative matrix factorization (NMF) model to "approximate a given query gene expression profile."
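To make the idea of a reconstruction-based novelty score concrete, here is a minimal sketch in plain Python. It is not the PluriTest implementation: it assumes the score can be read as the residual left over when a query profile is approximated with a basis learned by nonnegative matrix factorization, and it uses the standard Lee-Seung multiplicative updates. The six-gene toy profiles, the rank of 2, and every function name are invented for illustration.

```python
import random

EPS = 1e-9  # guards against division by zero in the multiplicative updates


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def update_h(v, w, h):
    """One multiplicative update of the coefficients h with the basis w held fixed."""
    wt = transpose(w)
    num = matmul(wt, v)
    den = matmul(matmul(wt, w), h)
    return [[h[i][j] * num[i][j] / (den[i][j] + EPS) for j in range(len(h[0]))]
            for i in range(len(h))]


def fit_nmf(v, rank, iterations=500, seed=0):
    """Factor a nonnegative genes-by-samples matrix v into w (genes x rank) and h (rank x samples)."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(len(v))]
    h = [[rng.random() + 0.1 for _ in range(len(v[0]))] for _ in range(rank)]
    for _ in range(iterations):
        h = update_h(v, w, h)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + EPS) for j in range(rank)]
             for i in range(len(w))]
    return w, h


def novelty(query, w, iterations=500):
    """Residual norm after approximating one profile with the fixed basis w (assumed scoring rule)."""
    v = [[x] for x in query]  # treat the query as a single-column matrix
    h = [[1.0] for _ in range(len(w[0]))]
    for _ in range(iterations):
        h = update_h(v, w, h)
    approx = matmul(w, h)
    return sum((x[0] - a[0]) ** 2 for x, a in zip(v, approx)) ** 0.5


# Hypothetical training set: six samples that are mixtures of two expression
# patterns over six genes. The sixth gene is silent in every training sample.
pattern_a = [5.0, 4.0, 0.0, 0.0, 1.0, 0.0]
pattern_b = [0.0, 0.0, 3.0, 4.0, 1.0, 0.0]
mixes = [(1.0, 0.0), (0.8, 0.2), (0.6, 0.4), (0.4, 0.6), (0.2, 0.8), (0.0, 1.0)]
training = [[p * pattern_a[g] + q * pattern_b[g] for p, q in mixes] for g in range(6)]

basis, _ = fit_nmf(training, rank=2)

# A query that matches the training mixtures, and one that also expresses the silent gene.
typical = novelty([2.5, 2.0, 1.5, 2.0, 1.0, 0.0], basis)
aberrant = novelty([2.5, 2.0, 1.5, 2.0, 1.0, 8.0], basis)
```

In this toy setup the second query gets the larger score, because no combination of the learned basis vectors can account for expression of a gene that was silent in all of the training samples. A real analysis would use a numerical library and thousands of probes rather than list arithmetic on six genes.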
Simply put, "you upload raw data to the website and it tells you if your cell line is pluripotent or not and it's also got some information about ... interesting patterns in the gene expression profile," Franz-Josef Mueller, a researcher at the Center for Integrative Psychiatry in Kiel, Germany, and a co-author on the paper, told BioInform.
In addition, he said, "For every [pluripotent] cell type that you can isolate and define with a certain degree of confidence, you can create a model out of it."
For example, he said, a researcher who is differentiating stem cells into beta cells for treating patients with diabetes and wants to make sure that the cells are all the same or fit certain criteria could create a model for the stem-cell-derived beta cells using the system.
After they trained PluriTest on the SCM2 dataset, Mueller and colleagues tested their method on germ cell tumor lines, which resemble PSCs but include genetic and epigenetic abnormalities. "These cells had high pluripotency scores, as expected, but the novelty score indicated that they deviate from the normal PSCs in the SCM2 dataset," the authors wrote.
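The way the two scores complement each other in the germ cell tumor example can be sketched as a simple two-step check. This is an illustration only: the function name and the cutoff parameters are hypothetical, the article does not report the thresholds PluriTest uses, and the sketch assumes only that a higher pluripotency score means a stronger pluripotent signature and a higher novelty score means a larger deviation from the reference lines.

```python
def triage(pluripotency_score, novelty_score, pluri_cutoff, novelty_cutoff):
    """Combine the two scores into a coarse label (cutoffs are caller-supplied assumptions)."""
    if pluripotency_score < pluri_cutoff:
        return "no pluripotent signature"
    if novelty_score > novelty_cutoff:
        return "pluripotent signature, but deviates from reference lines"
    return "pluripotent signature, resembles reference lines"


# Made-up scores and cutoffs, chosen only to exercise each branch.
normal_line = triage(30.0, 1.2, pluri_cutoff=20.0, novelty_cutoff=1.6)
tumor_like_line = triage(30.0, 2.4, pluri_cutoff=20.0, novelty_cutoff=1.6)
differentiated = triage(-40.0, 2.9, pluri_cutoff=20.0, novelty_cutoff=1.6)
```

With these invented numbers, the tumor-like sample passes the first check and is flagged by the second, which mirrors the pattern the authors describe for the germ cell tumor lines.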
They also tested the method on datasets composed of both pluripotent and non-pluripotent lines: Illumina WG6v1, HT12v3, and HT12v4 datasets, and another dataset that used Affymetrix U133A arrays.
In one test, using data from an Illumina HT12v3 array, PluriTest was able to predict pluripotency with 98 percent sensitivity and 100 percent specificity, the researchers reported.
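For readers less familiar with the two metrics, sensitivity is the share of truly pluripotent samples that the test calls pluripotent, and specificity is the share of non-pluripotent samples that it correctly rejects. The short sketch below shows the arithmetic. The sample counts are hypothetical, picked only because they reproduce the reported percentages; the article does not give the actual numbers behind them.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of truly pluripotent samples that were called pluripotent."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-pluripotent samples that were called non-pluripotent."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: 50 pluripotent samples with one missed,
# and 30 non-pluripotent samples with none miscalled.
sens = sensitivity(true_positives=49, false_negatives=1)
spec = specificity(true_negatives=30, false_positives=0)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```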
The Scripps team has developed a web server for PluriTest that allows users to upload an unmodified microarray scanner file and then automatically performs all data extraction and preprocessing steps. A typical online analysis with 12 samples takes less than 10 minutes, including data upload, the team wrote.
Mueller added that because PluriTest is a "data-driven" model, it's more reliable than current methods for predicting pluripotency, which are "highly vulnerable against the unpredictable."
"We know now that when you culture pluripotent cells from somatic sources or embryos, a lot of things can go wrong and there is no way you can build a signature for something that goes wrong but still looks pretty much like a pluripotent stem cell," he said.
As an example, Mueller cited teratocarcinoma cells, which are malignantly transformed pluripotent cells. "If you assay them with a classical signature list of genes with qPCR, you won't be able to pick up a difference," he said.
Since there is "almost no prior knowledge of what can go wrong" with stem cells, "you need to have this open-ended assessment," which the novelty score provides.
As next steps, Mueller said, the team plans to make the method more "usable" while keeping it simple. In addition, they hope to explore new cell types that can be used for disease modeling and regenerative medicine research, and to integrate other data types such as epigenetic profiles.