Human immunophenotyping via low-variance, low-bias, interpretive regression modeling of small, wide data sets: Application to aging and immune response to influenza vaccination

Tyson H. Holmes, Xiaosong He

Research output: Contribution to journal › Article › peer-review



Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1000. We offer a set of prescriptions that are designed to facilitate low-variance (i.e., stable), low-bias, interpretive regression modeling of small, wide data sets. These prescriptions are distinctive in their especially heavy emphasis on minimizing the use of out-of-sample information for conducting statistical inference. This allows the working immunologist to proceed without being encumbered by imposed and often untestable statistical assumptions. Problems of unmeasured confounders, confidence-interval coverage, feature selection, and shrinkage/denoising are defined clearly and treated in detail. We propose an extension of an existing nonparametric technique for improved small-sample confidence-interval tail coverage from the univariate case (single immune feature) to the multivariate (many, possibly correlated immune features). An important role for derived features in the immunological interpretation of regression analyses is stressed. Areas of further research are discussed. Presented principles and methods are illustrated through application to a small, wide data set of adults spanning a wide range in ages and multiple immunophenotypes that were assayed before and after immunization with inactivated influenza vaccine (IIV). Our regression modeling prescriptions identify some potentially important topics for future immunological research.
1) Immunologists may wish to distinguish age-related differences in immune features from changes in immune features caused by aging.
2) A form of the bootstrap that employs linear extrapolation may prove to be an invaluable analytic tool because it allows the working immunologist to obtain accurate estimates of the stability of immune parameter estimates with a bare minimum of imposed assumptions.
3) Liberal inclusion of immune features in phenotyping panels can facilitate accurate separation of biological signal of interest from noise.
In addition, through a combination of denoising and potentially improved confidence interval coverage, we identify some candidate immune correlates (frequency of cell subset and concentration of cytokine) with B cell response as measured by the quantity of IIV-specific IgA antibody-secreting cells and the quantity of IIV-specific IgG antibody-secreting cells.
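
As a rough illustration of the ideas above, the following Python sketch checks whether a data set falls in the small, wide regime as defined in the abstract (1 < n < 50 and n < p < 1000) and computes an ordinary percentile bootstrap confidence interval for a single feature. This is an assumption-laden stand-in: it uses the plain percentile bootstrap, not the linear-extrapolation form or the multivariate extension that the abstract refers to, and the function names, parameters, and sample values are hypothetical.

```python
import random
import statistics


def in_small_wide_regime(n, p):
    """Return True if n participants and p parameters satisfy
    1 < n < 50 and n < p < 1000, the definition given in the abstract."""
    return 1 < n < 50 and n < p < 1000


def percentile_bootstrap_ci(values, stat=statistics.mean,
                            n_boot=2000, alpha=0.05, seed=0):
    """Plain percentile bootstrap interval for one feature.

    NOTE: illustrative only. This is the ordinary percentile bootstrap,
    not the linear-extrapolation variant mentioned in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(values)
    # Resample n observations with replacement, n_boot times, and
    # recompute the statistic on each resample.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical measurements of one immune feature in 10 participants.
feature = [2.1, 3.4, 1.9, 4.2, 2.8, 3.1, 2.5, 3.9, 2.2, 3.0]
lower, upper = percentile_bootstrap_ci(feature)
print(in_small_wide_regime(n=len(feature), p=200), lower, upper)
```

With samples this small, the percentile interval is known to undercover in the tails, which is the kind of problem the small-sample coverage technique described in the abstract is intended to address.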

Original language: English (US)
Journal: Journal of Immunological Methods
State: Accepted/In press - Feb 15 2016
Externally published: Yes


Keywords

  • Confidence interval
  • Denoising
  • Influenza
  • Statistical assumptions
  • Statistical bias
  • Statistical regression modeling

ASJC Scopus subject areas

  • Immunology and Allergy
  • Immunology

