Abstract
We consider a semiparametric method for estimating logistic regression models in which both the covariates and the outcome variable may be missing, and propose two new estimators. The first, which is based solely on the validation set, is an extension of the validation likelihood estimator of Breslow and Cain (Biometrika 75:11-20, 1988). The second is a joint conditional likelihood estimator based on both the validation and non-validation data sets. Both estimators are semiparametric in that they require neither a model for the missing-data mechanism nor a specification of the conditional distribution of the missing covariates given the observed covariates. The asymptotic distribution theory is developed under the assumption that all covariates are categorical. The finite-sample properties of the proposed estimators are investigated through simulation studies, which show that the joint conditional likelihood estimator is the most efficient. A cable TV survey data set from Taiwan is used to illustrate the practical use of the proposed methodology.
Original language | English (US) |
---|---|
Pages (from-to) | 621-653 |
Number of pages | 33 |
Journal | Metrika |
Volume | 75 |
Issue number | 5 |
State | Published - Jul 2012 |
Keywords
- Logistic regression model
- Missing covariates
- Missing outcome
- Missing value
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty