NEW YORK (GenomeWeb) – Aspects of the healthy immune system vary by age, race, sex, and pregnancy state, according to early results from the 10,000 Immunomes Project (10KIP), an effort to put together a large set of standardized immune measurements for a diverse group of healthy individuals.
To assemble the 10KIP resource, the researchers, from the University of California, San Francisco, and Northrop Grumman's Information Systems Health IT, tapped into publicly available data in the National Institute of Allergy and Infectious Diseases database ImmPort, bringing together immunology measurements — ranging from secreted immune protein profiles and immune cell phenotypes to blood cell gene expression patterns — for more than 10,000 healthy individuals enrolled as unaffected controls for dozens of prior studies.
As they reported online today in Cell Reports, the 10KIP collection has already proven useful for characterizing features such as serum cytokine variability, immune cell-cytokine interactions, and pregnancy-related immune changes.
The investigators hope to continue bolstering the resource by adding immune data from participants in a variety of projects, possibly including ImmunoX — a large immunology program proposed at UCSF's Bakar Computational Health Sciences Institute. The current iteration of 10KIP is available online, allowing researchers to visualize and download the data.
"Since this data is already available, we want people to make use of it and help their science today," senior author Atul Butte, director of the UCSF Bakar Institute, said in a statement. "We expect that, as more scientists upload their data to NIAID's ImmPort database, the power of 10KIP will only grow in value, richness, and scale."
The researchers began with diverse data from more than 290,000 samples, representing 44,775 individuals who participated in 242 studies included in ImmPort data release 21. They whittled this data down to 10,344 participants from 83 studies, coming up with a standardized dataset including more than 42,000 samples available for these individuals.
In addition to manual data curation, the team built pipelines to standardize the 10 types of data available for the 10KIP individuals, testing them on simulated datasets and correcting for technical variation, batch effects, and other potential confounders.
"Through statistical testing and validations in simulated data, we demonstrate the ability to compensate for technical artifacts that invariably arise from collecting data on different days, across different platforms, or at distant institutions by repurposing algorithms developed in computational genetics," the authors wrote.
From there, the team used the newly assembled and standardized 10KIP dataset to start analyzing immune features. Its analyses revealed, for example, that blood serum levels for dozens of immune cell subsets varied with participants' age, sex, or ethnicity. And with immune cell and protein profiles in blood samples from a subset of 321 individuals, the group got a look at human immune cell-serum cytokine interactions.
Based on data for 56 pregnant women between the ages of 18 and 40 and for 94 age-matched controls, meanwhile, the researchers retraced immune cell features during and after pregnancy. That analysis uncovered shifts in immune cell proportions across pregnancy, including increased levels of certain cytokines in early pregnancy, a rise in CD4+ T cells across pregnancy, and a late pregnancy dip in B cells.
"Although we recognize that, to date, ImmPort does not contain dense data for all measurement types, this analysis demonstrates that the size and scope of even this initial version of the 10KIP are sufficient to generate age- and sex-matched control cohorts for two types of high-throughput immune measurements as a baseline or comparator to immune perturbation or disease," the authors wrote.