TY - JOUR
T1 - Human immunophenotyping via low-variance, low-bias, interpretive regression modeling of small, wide data sets
T2 - Application to aging and immune response to influenza vaccination
AU - Holmes, Tyson H.
AU - He, Xiaosong
PY - 2016/2/15
Y1 - 2016/2/15
N2 - Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1000. We offer a set of prescriptions that are designed to facilitate low-variance (i.e. stable), low-bias, interpretive regression modeling of small, wide data sets. These prescriptions are distinctive in their especially heavy emphasis on minimizing the use of out-of-sample information for conducting statistical inference. This allows the working immunologist to proceed without being encumbered by imposed and often untestable statistical assumptions. Problems of unmeasured confounders, confidence-interval coverage, feature selection, and shrinkage/denoising are defined clearly and treated in detail. We propose an extension of an existing nonparametric technique for improved small-sample confidence-interval tail coverage from the univariate case (single immune feature) to the multivariate (many, possibly correlated immune features). An important role for derived features in the immunological interpretation of regression analyses is stressed. Areas of further research are discussed. Presented principles and methods are illustrated through application to a small, wide data set of adults spanning a wide range in ages and multiple immunophenotypes that were assayed before and after immunization with inactivated influenza vaccine (IIV). Our regression modeling prescriptions identify some potentially important topics for future immunological research. 1) Immunologists may wish to distinguish age-related differences in immune features from changes in immune features caused by aging. 2) A form of the bootstrap that employs linear extrapolation may prove to be an invaluable analytic tool because it allows the working immunologist to obtain accurate estimates of the stability of immune parameter estimates with a bare minimum of imposed assumptions. 3) Liberal inclusion of immune features in phenotyping panels can facilitate accurate separation of biological signal of interest from noise. In addition, through a combination of denoising and potentially improved confidence interval coverage, we identify some candidate immune correlates (frequency of cell subset and concentration of cytokine) with B cell response as measured by the quantity of IIV-specific IgA antibody-secreting cells and the quantity of IIV-specific IgG antibody-secreting cells.
AB - Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1000. We offer a set of prescriptions that are designed to facilitate low-variance (i.e. stable), low-bias, interpretive regression modeling of small, wide data sets. These prescriptions are distinctive in their especially heavy emphasis on minimizing the use of out-of-sample information for conducting statistical inference. This allows the working immunologist to proceed without being encumbered by imposed and often untestable statistical assumptions. Problems of unmeasured confounders, confidence-interval coverage, feature selection, and shrinkage/denoising are defined clearly and treated in detail. We propose an extension of an existing nonparametric technique for improved small-sample confidence-interval tail coverage from the univariate case (single immune feature) to the multivariate (many, possibly correlated immune features). An important role for derived features in the immunological interpretation of regression analyses is stressed. Areas of further research are discussed. Presented principles and methods are illustrated through application to a small, wide data set of adults spanning a wide range in ages and multiple immunophenotypes that were assayed before and after immunization with inactivated influenza vaccine (IIV). Our regression modeling prescriptions identify some potentially important topics for future immunological research. 1) Immunologists may wish to distinguish age-related differences in immune features from changes in immune features caused by aging. 2) A form of the bootstrap that employs linear extrapolation may prove to be an invaluable analytic tool because it allows the working immunologist to obtain accurate estimates of the stability of immune parameter estimates with a bare minimum of imposed assumptions. 3) Liberal inclusion of immune features in phenotyping panels can facilitate accurate separation of biological signal of interest from noise. In addition, through a combination of denoising and potentially improved confidence interval coverage, we identify some candidate immune correlates (frequency of cell subset and concentration of cytokine) with B cell response as measured by the quantity of IIV-specific IgA antibody-secreting cells and the quantity of IIV-specific IgG antibody-secreting cells.
KW - Confidence interval
KW - Denoising
KW - Influenza
KW - Statistical assumptions
KW - Statistical bias
KW - Statistical regression modeling
UR - http://www.scopus.com/inward/record.url?scp=85002800514&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85002800514&partnerID=8YFLogxK
U2 - 10.1016/j.jim.2016.05.004
DO - 10.1016/j.jim.2016.05.004
M3 - Article
C2 - 27196789
AN - SCOPUS:85002800514
JO - Journal of Immunological Methods
JF - Journal of Immunological Methods
SN - 0022-1759
ER -