Model misspecification and bias for inverse probability weighting and doubly robust estimators
In the causal inference literature, a class of semi-parametric estimators is called robust if the estimator has desirable properties under the assumption that at least one of its working models is correctly specified. A standard example is a doubly robust estimator, which specifies parametric models for both the propensity score and the outcome regression. When estimating a causal parameter in an observational study, the role of parametric models is often not to be true representations of the data-generating process; instead, they are used to facilitate the adjustment for confounding, for example by reducing the dimension of the covariate vector. This makes the assumption that at least one model is correct unlikely to hold.
In this paper we propose a crude analytical approach to studying the large sample bias of estimators when all models are assumed to be approximations of the true data-generating process, i.e., all models are misspecified. We apply our approach to three prototypical estimators: two inverse probability weighting (IPW) estimators, which use a misspecified propensity score model, and a doubly robust (DR) estimator, which uses misspecified models for both the outcome regression and the propensity score. To compare the consequences of model misspecification across estimators, we derive conditions under which using normalized weights leads to a smaller bias than a simple IPW estimator. To address the question of when the use of two misspecified models is better than one, we derive necessary and sufficient conditions for when the DR estimator has a smaller bias than the simple IPW estimator, and for when it has a smaller bias than the IPW estimator with normalized weights. In most of these conditions, the covariance between the propensity score model error and the conditional outcomes plays an important role. The results are illustrated in a simulation study.
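For concreteness, the three prototypical estimators compared above can be sketched as follows for the mean potential outcome E[Y(1)]; this is a minimal illustration, not the paper's derivation, and the function names and the use of a fitted propensity score `ps` and outcome-regression prediction `mu1` as inputs are assumptions for the sketch. Under misspecification, these fitted quantities deviate from the truth, which is the source of the biases analyzed in the paper.

```python
import numpy as np

def ipw_simple(y, a, ps):
    """Simple (Horvitz-Thompson style) IPW estimate of E[Y(1)].

    y  : outcomes, a : binary treatment indicators,
    ps : fitted propensity scores P(A=1 | X), possibly misspecified.
    """
    return np.mean(a * y / ps)

def ipw_normalized(y, a, ps):
    """IPW with normalized (Hajek) weights: weights rescaled to sum to one."""
    w = a / ps
    return np.sum(w * y) / np.sum(w)

def dr_aipw(y, a, ps, mu1):
    """Doubly robust (augmented IPW) estimate of E[Y(1)].

    mu1 : fitted outcome-regression predictions E[Y | A=1, X],
          possibly misspecified. The estimator adds an inverse-probability
          weighted residual correction to the regression predictions.
    """
    return np.mean(mu1 + a * (y - mu1) / ps)
```

When both working models are correct, all three estimators are consistent; the paper's contribution concerns their relative large sample bias when `ps` and `mu1` are only approximations.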