A weighted estimating equation for linear regression with missing covariate data

Michael Parzen, Stuart R. Lipsitz, Joseph G. Ibrahim, Steven E. Lipshultz

Research output: Contribution to journal (Article)

11 Citations (Scopus)

Abstract

Linear regression is one of the most popular statistical techniques. In linear regression analysis, missing covariate data occur often. A recent approach to analysing such data is the weighted estimating equation. With weighted estimating equations, the contribution to the estimating equation from a complete observation is weighted by the inverse 'probability of being observed'. In this paper, we propose a weighted estimating equation in which we wrongly assume that the missing covariates are multivariate normal, but which still produces consistent estimates as long as the probability of being observed is correctly modelled. In simulations, these weighted estimating equations appear to be highly efficient when compared to the most efficient weighted estimating equation as proposed by Robins et al. and Lipsitz et al. Moreover, the proposed weighted estimating equations are much less computationally intensive than those given by Lipsitz et al. We compare the weighted estimating equations proposed in this paper to the efficient weighted estimating equations via an example and a simulation study. We only consider data that are missing at random; non-ignorably missing data are not addressed in this paper.
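
The inverse-probability weighting idea described in the abstract can be illustrated with a small simulation. The sketch below is not the estimator proposed in the paper (it does not use the multivariate normal working assumption); it only shows the basic complete-case weighted estimating equation, in which each complete observation is weighted by the inverse of its estimated probability of being observed. The data-generating model, the coefficient values, the logistic model for the observation probability, and the function names `fit_logistic` and `ipw_complete_case` are all assumptions made for illustration.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(Z, r, n_iter=25):
    """Fit a logistic regression of r on Z by Newton-Raphson (hypothetical helper)."""
    gamma = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = expit(Z @ gamma)
        score = Z.T @ (r - p)                      # gradient of the log-likelihood
        info = (Z * (p * (1 - p))[:, None]).T @ Z  # Fisher information
        gamma = gamma + np.linalg.solve(info, score)
    return gamma

def ipw_complete_case(X, y, observed, pi_hat):
    """Solve sum_i (R_i / pi_i) x_i (y_i - x_i' beta) = 0 over complete cases."""
    Xc, yc = X[observed], y[observed]
    w = 1.0 / pi_hat[observed]
    return np.linalg.solve(Xc.T @ (Xc * w[:, None]), Xc.T @ (w * yc))

# Simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n = 20000
x1 = rng.normal(size=n)                      # always observed
x2 = 0.5 * x1 + rng.normal(size=n)           # sometimes missing
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Missing at random: the chance of observing x2 depends only on observed y.
observed = rng.random(n) < expit(1.0 + 0.5 * y)
x2_seen = np.where(observed, x2, np.nan)

X = np.column_stack([np.ones(n), x1, x2_seen])
Z = np.column_stack([np.ones(n), y, x1])     # fully observed predictors of missingness

gamma_hat = fit_logistic(Z, observed.astype(float))
pi_hat = expit(Z @ gamma_hat)                # estimated probability of being observed

beta_ipw = ipw_complete_case(X, y, observed, pi_hat)
beta_cc = ipw_complete_case(X, y, observed, np.ones(n))  # unweighted complete-case fit

print("IPW estimate:          ", np.round(beta_ipw, 3))
print("Complete-case estimate:", np.round(beta_cc, 3))
```

Because missingness here depends on the outcome, the unweighted complete-case fit is generally biased, whereas the weighted fit targets the true coefficients provided the observation model is correctly specified, which mirrors the consistency condition stated in the abstract.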

Original language: English
Pages (from-to): 2421-2436
Number of pages: 16
Journal: Statistics in Medicine
Volume: 21
Issue number: 16
DOI: https://doi.org/10.1002/sim.1195
State: Published - Aug 30 2002
Externally published: Yes

Fingerprint

Weighted Estimating Equations
Missing Covariates
Linear regression
Linear Models
Regression Analysis
Observation
Multivariate Normal
Missing Data
Consistent Estimates
Missing at Random
Estimating Equation
Simulation Study

Keywords

  • Missing at random
  • Missing completely at random
  • Missing data mechanism

ASJC Scopus subject areas

  • Epidemiology

Cite this

Parzen, M., Lipsitz, S. R., Ibrahim, J. G., & Lipshultz, S. E. (2002). A weighted estimating equation for linear regression with missing covariate data. Statistics in Medicine, 21(16), 2421-2436. https://doi.org/10.1002/sim.1195

@article{44863838a8ba43a9926cf9422966f34f,
title = "A weighted estimating equation for linear regression with missing covariate data",
keywords = "Missing at random, Missing completely at random, Missing data mechanism",
author = "Parzen, Michael and Lipsitz, {Stuart R.} and Ibrahim, {Joseph G.} and Lipshultz, {Steven E.}",
year = "2002",
month = "8",
day = "30",
doi = "10.1002/sim.1195",
language = "English",
volume = "21",
pages = "2421--2436",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "16",

}

TY - JOUR

T1 - A weighted estimating equation for linear regression with missing covariate data

AU - Parzen, Michael

AU - Lipsitz, Stuart R.

AU - Ibrahim, Joseph G.

AU - Lipshultz, Steven E.

PY - 2002/8/30

Y1 - 2002/8/30

KW - Missing at random

KW - Missing completely at random

KW - Missing data mechanism

UR - http://www.scopus.com/inward/record.url?scp=0037199838&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037199838&partnerID=8YFLogxK

U2 - 10.1002/sim.1195

DO - 10.1002/sim.1195

M3 - Article

VL - 21

SP - 2421

EP - 2436

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 16

ER -