Can model averaging improve propensity score-based estimation of average treatment effects?
To estimate a treatment effect from observational data, researchers often need to model the propensity score. In practice, several methods are typically used to estimate a range of competing propensity scores, from which the best candidate is chosen according to some criterion, usually relating to accuracy or balancing properties. Standard practice is then to estimate the treatment effect using the chosen candidate propensity score, as if the selection step carried no valuable information.
In the past decade, procedures that use model averaging to combine candidate models have been proposed for propensity score estimation. Such procedures tune the estimated scores to improve their accuracy or balancing properties. The purpose of this study is to apply model averaging when estimating propensity scores, and to investigate whether this results in improved estimates of the average treatment effect. Here, the treatment effect estimates are evaluated in terms of their bias.
In a Monte Carlo simulation study, the candidate propensity scores are estimated by reproducing kernel Hilbert space regressions and then used to estimate the average treatment effects in three different simulation designs. The results suggest that there is no improvement to be gained from combining several candidates, compared with using the best candidate alone.
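
The comparison described above, the best single candidate against a combination of candidates, can be illustrated with a minimal sketch. It is not the study's procedure: plain logistic regressions stand in for the reproducing kernel Hilbert space regressions, the candidates are combined with equal weights, the best candidate is picked by in-sample log-loss, and the treatment effect is estimated by normalized inverse probability weighting. The data-generating process (true effect of 2.0) and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_scores(features, t, steps=200, lr=0.5):
    """Fit a logistic regression by gradient ascent; return fitted propensity scores."""
    k, n = len(features[0]), len(t)
    beta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for x, ti in zip(features, t):
            resid = ti - sigmoid(sum(b * xj for b, xj in zip(beta, x)))
            for j in range(k):
                grad[j] += resid * x[j] / n
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return [sigmoid(sum(b * xj for b, xj in zip(beta, x))) for x in features]


def log_loss(t, e):
    """Mean negative Bernoulli log-likelihood of treatment t under scores e."""
    return -sum(ti * math.log(ei) + (1 - ti) * math.log(1 - ei)
                for ti, ei in zip(t, e)) / len(t)


def ipw_ate(y, t, e):
    """Normalized (Hajek) inverse probability weighting estimate of the ATE."""
    w1 = [ti / ei for ti, ei in zip(t, e)]
    w0 = [(1 - ti) / (1 - ei) for ti, ei in zip(t, e)]
    mu1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mu0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mu1 - mu0


def demo(seed=0, n=1000):
    """Simulate one data set; return (ATE from best candidate, ATE from average, averaged scores)."""
    rng = random.Random(seed)
    # Hypothetical design: one confounder x, true ATE equal to 2.0.
    x = [rng.uniform(-1, 1) for _ in range(n)]
    t = [1 if rng.random() < sigmoid(0.5 * xi + 0.5 * xi * xi) else 0 for xi in x]
    y = [2.0 * ti + xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

    # Three candidate propensity models of increasing flexibility.
    designs = [
        [[1.0] for _ in x],
        [[1.0, xi] for xi in x],
        [[1.0, xi, xi * xi] for xi in x],
    ]
    candidates = [fit_scores(d, t) for d in designs]

    # Selection step: keep the candidate with the lowest in-sample log-loss.
    losses = [log_loss(t, e) for e in candidates]
    best = candidates[losses.index(min(losses))]

    # Averaging step: equal-weight convex combination of all candidates.
    averaged = [sum(es) / len(es) for es in zip(*candidates)]

    return ipw_ate(y, t, best), ipw_ate(y, t, averaged), averaged


if __name__ == "__main__":
    ate_best, ate_avg, _ = demo()
    print(f"best candidate: {ate_best:.3f}, averaged: {ate_avg:.3f}")
```

Repeating `demo` over many seeds and averaging the deviation of each estimate from 2.0 would give a Monte Carlo estimate of bias for the two approaches. Equal weights are the simplest choice here; weights tuned for accuracy or balance, as in the procedures mentioned above, would replace the averaging line.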