Weighted estimators for proportional hazards regression with missing covariates

Lihong Qi, C. Y. Wang, Ross L. Prentice

Research output: Contribution to journal (Article)

75 Citations (Scopus)

Abstract

Missing covariate data are common in epidemiologic studies and disease prevention trials. In this article regression parameter estimation in the Cox proportional hazards model is considered when certain covariates are observed for all study subjects and other covariate data are collected only for a subset. The article presents both simple weighted and kernel-assisted fully augmented weighted estimators that use the partially incomplete data nonparametrically. We use nonparametric methods to estimate selection probabilities in the simple weighted estimating functions. We also use nonparametric kernel smoothing techniques to estimate certain conditional expectations in fully augmented weighted estimating functions. The proposed methods are nonparametric in the sense that they require neither a model for the missing-data mechanism nor specification of the conditional distribution of missing covariates given observed covariates. These estimators allow the missing-data mechanism to depend on outcome variables and observed covariates, and they are applicable to various cohort sampling procedures, including case-cohort and nested case-control designs. We show that the simple and the kernel-assisted fully augmented weighted estimators are typically consistent and asymptotically normal. Moreover, the proposed estimators are more efficient than the simple weighted estimator with the inverse of true selection probability as weight. They also correct the bias of estimates from analysis of the complete data alone when the missing-data mechanism depends on outcome variables. In addition, when covariates are time-independent, certain simple weighted estimators are shown to be asymptotically equivalent to the kernel-assisted fully augmented weighted estimators. Moderate sample size performance of the estimators is examined via simulation and by application to two real datasets.
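The selection-probability step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the fully observed data (for example, case status crossed with a categorical covariate) fall into a small number of discrete strata, so the nonparametric estimate of each selection probability is just the proportion of complete cases in the stratum. A continuous observed covariate would need a smoother such as a kernel estimate instead. The function names are hypothetical.

```python
from collections import defaultdict


def estimate_selection_probabilities(strata, observed):
    """Estimate P(covariates fully observed | stratum) by the empirical
    proportion of complete cases in each stratum (assumes discrete strata)."""
    totals = defaultdict(int)
    complete = defaultdict(int)
    for stratum, is_observed in zip(strata, observed):
        totals[stratum] += 1
        complete[stratum] += int(is_observed)
    return {stratum: complete[stratum] / totals[stratum] for stratum in totals}


def inverse_probability_weights(strata, observed):
    """Weight each complete case by 1 / (estimated selection probability);
    subjects with missing covariates get weight 0."""
    pi_hat = estimate_selection_probabilities(strata, observed)
    return [
        1.0 / pi_hat[stratum] if is_observed else 0.0
        for stratum, is_observed in zip(strata, observed)
    ]


if __name__ == "__main__":
    # Toy data: all cases have complete covariates, half of the controls do.
    strata = ["case", "case", "control", "control", "control", "control"]
    observed = [True, True, True, False, False, True]
    print(estimate_selection_probabilities(strata, observed))
    print(inverse_probability_weights(strata, observed))
```

In a weighted Cox analysis, such weights would multiply each complete case's contribution to the partial-likelihood score. Within every stratum the weights sum to the stratum size, so the complete cases stand in for the full cohort.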

Original language: English (US)
Pages (from-to): 1250-1263
Number of pages: 14
Journal: Journal of the American Statistical Association
Volume: 100
Issue number: 472
DOI: 10.1198/016214505000000295
State: Published - Dec 2005
Externally published: Yes

Fingerprint

Proportional Hazards Regression
Missing Covariates
Estimator
Covariates
Missing Data Mechanism
Estimating Function
kernel
Nonparametric Smoothing
Estimate
Kernel Smoothing
Regression Estimation
Cox Proportional Hazards Model
Case-control
Smoothing Techniques
Asymptotically equivalent
Proportional hazards
Nonparametric Methods
Incomplete Data
Conditional Expectation
Conditional Distribution

Keywords

  • Case-cohort
  • Kernel smoother
  • Missing covariate data
  • Nested case-control
  • Nonparametric method
  • Weighted estimating equation

ASJC Scopus subject areas

  • Mathematics(all)
  • Statistics and Probability

Cite this

Weighted estimators for proportional hazards regression with missing covariates. / Qi, Lihong; Wang, C. Y.; Prentice, Ross L.

In: Journal of the American Statistical Association, Vol. 100, No. 472, 12.2005, p. 1250-1263.

@article{6c108f249014490ba18ab38cfe159ea4,
title = "Weighted estimators for proportional hazards regression with missing covariates",
abstract = "Missing covariate data are common in epidemiologic studies and disease prevention trials. In this article regression parameter estimation in the Cox proportional hazards model is considered when certain covariates are observed for all study subjects and other covariate data are collected only for a subset. The article presents both simple weighted and kernel-assisted fully augmented weighted estimators that use the partially incomplete data nonparametrically. We use nonparametric methods to estimate selection probabilities in the simple weighted estimating functions. We also use nonparametric kernel smoothing techniques to estimate certain conditional expectations in fully augmented weighted estimating functions. The proposed methods are nonparametric in the sense that they require neither a model for the missing-data mechanism nor specification of the conditional distribution of missing covariates given observed covariates. These estimators allow the missing-data mechanism to depend on outcome variables and observed covariates, and they are applicable to various cohort sampling procedures, including case-cohort and nested case-control designs. We show that the simple and the kernel-assisted fully augmented weighted estimators are typically consistent and asymptotically normal. Moreover, the proposed estimators are more efficient than the simple weighted estimator with the inverse of true selection probability as weight. They also correct the bias of estimates from analysis of the complete data alone when the missing-data mechanism depends on outcome variables. In addition, when covariates are time-independent, certain simple weighted estimators are shown to be asymptotically equivalent to the kernel-assisted fully augmented weighted estimators. Moderate sample size performance of the estimators is examined via simulation and by application to two real datasets.",
keywords = "Case-cohort, Kernel smoother, Missing covariate data, Nested case-control, Nonparametric method, Weighted estimating equation",
author = "Qi, Lihong and Wang, {C. Y.} and Prentice, {Ross L.}",
year = "2005",
month = "12",
doi = "10.1198/016214505000000295",
language = "English (US)",
volume = "100",
pages = "1250--1263",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "472"
}

TY - JOUR

T1 - Weighted estimators for proportional hazards regression with missing covariates

AU - Qi, Lihong

AU - Wang, C. Y.

AU - Prentice, Ross L.

PY - 2005/12

Y1 - 2005/12

AB - Missing covariate data are common in epidemiologic studies and disease prevention trials. In this article regression parameter estimation in the Cox proportional hazards model is considered when certain covariates are observed for all study subjects and other covariate data are collected only for a subset. The article presents both simple weighted and kernel-assisted fully augmented weighted estimators that use the partially incomplete data nonparametrically. We use nonparametric methods to estimate selection probabilities in the simple weighted estimating functions. We also use nonparametric kernel smoothing techniques to estimate certain conditional expectations in fully augmented weighted estimating functions. The proposed methods are nonparametric in the sense that they require neither a model for the missing-data mechanism nor specification of the conditional distribution of missing covariates given observed covariates. These estimators allow the missing-data mechanism to depend on outcome variables and observed covariates, and they are applicable to various cohort sampling procedures, including case-cohort and nested case-control designs. We show that the simple and the kernel-assisted fully augmented weighted estimators are typically consistent and asymptotically normal. Moreover, the proposed estimators are more efficient than the simple weighted estimator with the inverse of true selection probability as weight. They also correct the bias of estimates from analysis of the complete data alone when the missing-data mechanism depends on outcome variables. In addition, when covariates are time-independent, certain simple weighted estimators are shown to be asymptotically equivalent to the kernel-assisted fully augmented weighted estimators. Moderate sample size performance of the estimators is examined via simulation and by application to two real datasets.

KW - Case-cohort

KW - Kernel smoother

KW - Missing covariate data

KW - Nested case-control

KW - Nonparametric method

KW - Weighted estimating equation

UR - http://www.scopus.com/inward/record.url?scp=29144504459&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=29144504459&partnerID=8YFLogxK

U2 - 10.1198/016214505000000295

DO - 10.1198/016214505000000295

M3 - Article

VL - 100

SP - 1250

EP - 1263

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 472

ER -