Estimation of a zero-inflated Poisson regression model with missing covariates via nonparametric multiple imputation methods

Shen Ming Lee, T. Martin Lukusa, Chin Shang Li

Research output: Contribution to journal, Article

Abstract

Zero-inflated Poisson (ZIP) regression is widely used to model the effects of covariates on a count outcome with excess zeros. In some applications, covariates in a ZIP regression model are only partially observed. Based on imputed data generated by the multiple imputation (MI) schemes developed by Wang and Chen (Ann Stat 37:490–517, 2009), two methods are proposed to estimate the parameters of a ZIP regression model with covariates missing at random. The first, proposed by Rubin (in: Proceedings of the survey research methods section of the American Statistical Association, 1978), obtains a unified estimate as the average of the estimates from all imputed data sets. The second, proposed by Fay (J Am Stat Assoc 91:490–498, 1996), averages the estimating scores from all imputed data sets and solves the resulting imputed estimating equation. The two estimation methods are shown to be asymptotically equivalent to the semiparametric inverse probability weighting method. A modified formula is proposed to estimate the variances of the MI estimators. An extensive simulation study investigates the performance of the estimation methods. The practicality of the methodology is illustrated with a data set from a motorcycle survey of traffic regulations.
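The two combining rules described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the ZIP probability function is the standard mixture form, but the estimating score below is a deliberately simplified one-parameter Poisson log-link score (no zero-inflation component), the imputed covariate values are made up for illustration, and the names `zip_pmf`, `score`, `solve`, `rubin_estimate`, and `fay_estimate` are hypothetical.

```python
import math


def zip_pmf(y, lam, p):
    """ZIP probability: a point mass at zero (weight p) mixed with Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return p * (y == 0) + (1.0 - p) * poisson


def score(beta, xs, ys):
    """Toy estimating score: Poisson model with mean exp(beta * x).

    Assumption: a stand-in for the ZIP score, chosen because it is
    strictly decreasing in beta and so has a unique root.
    """
    return sum(x * (y - math.exp(beta * x)) for x, y in zip(xs, ys))


def solve(f, lo=-5.0, hi=5.0, tol=1e-10):
    """Bisection root finder; assumes f is decreasing with f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Observed counts, and three imputed versions of a partially observed covariate.
# The last two covariate values play the role of imputed entries (made-up numbers).
ys = [0, 1, 3, 0, 2]
imputed_xs = [
    [0.5, 1.0, 1.5, 0.2, 1.1],
    [0.5, 1.0, 1.5, 0.6, 0.9],
    [0.5, 1.0, 1.5, 0.4, 1.3],
]
M = len(imputed_xs)

# Rule 1 (Rubin-type): estimate on each imputed data set, then average the estimates.
per_dataset = [solve(lambda b, xs=xs: score(b, xs, ys)) for xs in imputed_xs]
rubin_estimate = sum(per_dataset) / M


# Rule 2 (Fay-type): average the scores across imputed data sets, then solve once.
def averaged_score(beta):
    return sum(score(beta, xs, ys) for xs in imputed_xs) / M


fay_estimate = solve(averaged_score)

print("average of estimates:", rubin_estimate)
print("root of averaged score:", fay_estimate)
```

Because the score is nonlinear in the parameter, the two rules generally give slightly different numbers in finite samples; the abstract's claim concerns their asymptotic behavior, which this toy example does not demonstrate.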

Original language: English (US)
Journal: Computational Statistics
DOI: 10.1007/s00180-019-00930-x
State: Accepted/In press - Jan 1 2019

Keywords

  • Count data
  • Inverse probability weighting (IPW)
  • Missing at random
  • Nonparametric multiple imputation
  • Zero-inflated Poisson regression

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Computational Mathematics

Cite this

@article{31d0ea80e55945bfbd94e87d01b36dde,
title = "Estimation of a zero-inflated Poisson regression model with missing covariates via nonparametric multiple imputation methods",
abstract = "Zero-inflated Poisson (ZIP) regression is widely applied to model effects of covariates on an outcome count with excess zeros. In some applications, covariates in a ZIP regression model are partially observed. Based on the imputed data generated by applying the multiple imputation (MI) schemes developed by Wang and Chen (Ann Stat 37:490–517, 2009), two methods are proposed to estimate the parameters of a ZIP regression model with covariates missing at random. One, proposed by Rubin (in: Proceedings of the survey research methods section of the American Statistical Association, 1978), consists of obtaining a unified estimate as the average of estimates from all imputed datasets. The other, proposed by Fay (J Am Stat Assoc 91:490–498, 1996), consists of averaging the estimating scores from all imputed data sets to solve the imputed estimating equation. Moreover, it is shown that the two proposed estimation methods are asymptotically equivalent to the semiparametric inverse probability weighting method. A modified formula is proposed to estimate the variances of the MI estimators. An extensive simulation study is conducted to investigate the performance of the estimation methods. The practicality of the methodology is illustrated with a dataset of motorcycle survey of traffic regulations.",
keywords = "Count data, Inverse probability weighting (IPW), Missing at random, Nonparametric multiple imputation, Zero-inflated Poisson regression",
author = "Lee, {Shen Ming} and Lukusa, {T. Martin} and Li, {Chin Shang}",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s00180-019-00930-x",
language = "English (US)",
journal = "Computational Statistics",
issn = "0943-4062",
publisher = "Springer Verlag",
}

TY - JOUR

T1 - Estimation of a zero-inflated Poisson regression model with missing covariates via nonparametric multiple imputation methods

AU - Lee, Shen Ming

AU - Lukusa, T. Martin

AU - Li, Chin Shang

PY - 2019/1/1

Y1 - 2019/1/1

AB - Zero-inflated Poisson (ZIP) regression is widely applied to model effects of covariates on an outcome count with excess zeros. In some applications, covariates in a ZIP regression model are partially observed. Based on the imputed data generated by applying the multiple imputation (MI) schemes developed by Wang and Chen (Ann Stat 37:490–517, 2009), two methods are proposed to estimate the parameters of a ZIP regression model with covariates missing at random. One, proposed by Rubin (in: Proceedings of the survey research methods section of the American Statistical Association, 1978), consists of obtaining a unified estimate as the average of estimates from all imputed datasets. The other, proposed by Fay (J Am Stat Assoc 91:490–498, 1996), consists of averaging the estimating scores from all imputed data sets to solve the imputed estimating equation. Moreover, it is shown that the two proposed estimation methods are asymptotically equivalent to the semiparametric inverse probability weighting method. A modified formula is proposed to estimate the variances of the MI estimators. An extensive simulation study is conducted to investigate the performance of the estimation methods. The practicality of the methodology is illustrated with a dataset of motorcycle survey of traffic regulations.

KW - Count data

KW - Inverse probability weighting (IPW)

KW - Missing at random

KW - Nonparametric multiple imputation

KW - Zero-inflated Poisson regression

UR - http://www.scopus.com/inward/record.url?scp=85074526144&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074526144&partnerID=8YFLogxK

U2 - 10.1007/s00180-019-00930-x

DO - 10.1007/s00180-019-00930-x

M3 - Article

AN - SCOPUS:85074526144

JO - Computational Statistics

JF - Computational Statistics

SN - 0943-4062

ER -