Individualized treatment rules (ITRs) tailor medical treatments according to patient-specific characteristics in order to optimize patient outcomes. Data from randomized controlled trials (RCTs) are used to infer valid ITRs using statistical and machine learning methods. However, RCTs are usually conducted under specific inclusion/exclusion criteria, thus limiting their generalizability to a broader patient population in real-world practice settings. Because electronic health records (EHRs) document treatment prescriptions in the real world, transferring information in EHRs to RCTs, if done appropriately, could potentially improve the performance of ITRs, in terms of precision and generalizability. In this work, we propose a new domain adaptation method to learn ITRs by incorporating information from EHRs. Unless we assume that there is no unmeasured confounding in EHRs, we cannot directly learn the optimal ITR from the combined EHR and RCT data. Instead, we first pre-train “super” features from EHRs that summarize physician treatment decisions and patient observed benefits in the real world, as these are likely to be informative of the optimal ITRs. We then augment the feature space of the RCT and learn the optimal ITRs by stratifying by super features using subjects enrolled in RCT. We adopt Q-learning and a modified matched-learning algorithm for estimation. We present heuristic justification of our method and conduct simulation studies to demonstrate the performance of super features. Finally, we apply our method to transfer information learned from EHRs of patients with type 2 diabetes to learn individualized insulin therapies from RCT data. This article is protected by copyright. All rights reserved.
This article is protected by copyright. All rights reserved.