Biometrics, 2017-02-23. doi: 10.1111/biom.12670
The instrumental variable (IV) design is a well-known approach for unbiased evaluation of causal effects in the presence of unobserved confounding. In this article, we study the IV approach to account for selection bias in regression analysis with an outcome missing not at random. In such a setting, a valid IV is a variable that (i) predicts the nonresponse process, and (ii) is independent of the outcome in the underlying population. We show that under the additional assumption (iii) that the IV is independent of the magnitude of selection bias due to nonresponse, the population regression in view is nonparametrically identified. For point estimation under (i)-(iii), we propose a simple complete-case analysis which modifies the regression of primary interest by carefully incorporating the IV to account for selection bias. The approach is developed for the identity, log and logit link functions. For inferences about the marginal mean of a binary outcome assuming (i) and (ii) only, we describe novel and approximately sharp bounds which, unlike Robins-Manski bounds, are smooth in model parameters and therefore allow for a straightforward approach to account for uncertainty due to sampling variability. These bounds provide a more honest account of uncertainty and allow one to assess the extent to which a violation of the key identifying condition (iii) might affect inferences. For illustration, the methods are used to account for selection bias induced by HIV testing nonparticipation in the evaluation of HIV prevalence in the Zambian Demographic and Health Surveys.
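As a rough illustration of the bounding idea mentioned above, the following sketch computes the classical worst-case (Robins-Manski type) bounds for the mean of a binary outcome with nonresponse, and then tightens them by intersecting the bounds across levels of a discrete IV, which is justified only if the IV is independent of the outcome as in condition (ii). This is the standard non-smooth construction, not the smooth bounds proposed in the article. The function names and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def worst_case_bounds(responded, outcome):
    """Worst-case bounds on the mean of a binary outcome.

    responded: list of 0/1 response indicators.
    outcome:   list of 0/1 outcomes, with None where responded == 0.
    The lower bound sets every missing outcome to 0, the upper bound to 1.
    """
    n = len(responded)
    observed_ones = sum(y for r, y in zip(responded, outcome) if r == 1)
    n_missing = sum(1 for r in responded if r == 0)
    return observed_ones / n, (observed_ones + n_missing) / n


def iv_intersection_bounds(instrument, responded, outcome):
    """Intersect worst-case bounds across levels of a discrete IV.

    Assumes the IV is independent of the outcome, so the outcome mean is the
    same at every IV level and must lie inside every level-specific interval.
    """
    groups = defaultdict(lambda: ([], []))
    for z, r, y in zip(instrument, responded, outcome):
        groups[z][0].append(r)
        groups[z][1].append(y)
    per_level = {z: worst_case_bounds(rs, ys) for z, (rs, ys) in groups.items()}
    lower = max(lo for lo, _ in per_level.values())
    upper = min(hi for _, hi in per_level.values())
    return lower, upper


# Hypothetical toy data: the IV shifts the response rate (8/10 vs 5/10).
instrument = [0] * 10 + [1] * 10
responded = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 5
outcome = [1, 1, 0, 0, 0, 0, 0, 0] + [None] * 2 + [1, 0, 0, 0, 0] + [None] * 5

pooled = worst_case_bounds(responded, outcome)
with_iv = iv_intersection_bounds(instrument, responded, outcome)
print("pooled worst-case bounds:", pooled)
print("IV intersection bounds:  ", with_iv)
```

On this toy data the pooled interval is [0.15, 0.5], while intersecting across the two IV levels narrows it to [0.2, 0.4]. Because the intersection relies on a max and a min over IV levels, the resulting bounds are not smooth in the underlying proportions, which is the feature that complicates accounting for sampling variability. An empty intersection (lower above upper) would indicate that condition (ii) is incompatible with the data.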