A machine learning algorithm used to analyse electronic health records (EHRs) identified high-risk individuals who could potentially benefit from HIV pre-exposure prophylaxis (PrEP), according to a report presented this week at IDWeek 2016 in New Orleans. Out of 800,000 patients in a large EHR database, more than 8000 were found to be potential PrEP candidates.

Background

HIV PrEP decreases HIV transmission but uptake of PrEP has been limited. We developed automated algorithms to identify persons at increased risk for acquiring HIV using routine structured EHR data.

Methods

We extracted potentially relevant demographics (e.g. age, sex, race), diagnoses (e.g. anal dysplasia, substance use disorders), prescriptions (e.g. suboxone, sildenafil), laboratory tests (e.g. syphilis, gonorrhea, chlamydia, hepatitis C), and procedures (e.g. anal pap smear) from the EHR repository of Atrius Health, a large ambulatory practice group in Massachusetts with ~800,000 patients. We matched each patient with incident HIV during 2006-2015 to 100 HIV-uninfected controls with similar gender and duration of affiliation with Atrius Health as of the year of HIV diagnosis. We developed logistic regression models and machine learning algorithms to predict incident HIV infection in cases vs controls. Machine learning methods included LASSO, Ridge, SVM, and super learner. We assessed area-under-the-curve (AUC), sensitivity, and positive predictive value (PPV) of each algorithm and compared prediction scores among 45 patients independently started on PrEP by Atrius clinicians versus the general population.
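The three evaluation measures named above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names, the toy scores, and the choice to flag the top 20% of scores (a top quintile) are assumptions made here for illustration only.

```python
def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_fraction_metrics(scores, labels, fraction=0.2):
    """Flag the highest-scoring `fraction` of patients and return
    (sensitivity, PPV) for that flagged group."""
    n_flagged = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    flagged = ranked[:n_flagged]
    true_pos = sum(y for _, y in flagged)
    sensitivity = true_pos / sum(labels)  # share of all cases that were flagged
    ppv = true_pos / n_flagged            # share of flagged patients who are cases
    return sensitivity, ppv


# Hypothetical toy data: 10 patients, 2 of whom are cases (label 1).
scores = [0.90, 0.10, 0.80, 0.30, 0.20, 0.15, 0.05, 0.40, 0.25, 0.35]
labels = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]

print(auc(scores, labels))                   # 0.9375
print(top_fraction_metrics(scores, labels))  # (0.5, 0.5)
```

When cases are rare, PPV can stay low even if AUC and sensitivity are reasonable, because most flagged patients are still non-cases.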

Results

There were 138 incident HIV infections in the population. Ten-fold cross-validated AUC was highest with the Ridge method (AUC 0.77, sensitivity 59%, PPV 0.34% for patients with risk scores in the top quintile). Strong predictors included a prior diagnosis of anorectal ulcer, total number of positive gonorrhea tests, and acute HIV testing in the past two years. HIV prediction scores for patients independently started on PrEP by Atrius clinicians were significantly higher than those of the general population (median 5.43×10⁻⁴ vs 9.05×10⁻⁵, P<.0001). There were 3,229 patients in the general population (0.37%) with risk scores above the median score of patients started on PrEP by Atrius clinicians. These patients are promising candidates to evaluate for interest in and suitability for PrEP.
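The candidate count described above comes down to a threshold comparison: take the median score of the clinician-started PrEP group and count how many general-population patients score above it. The sketch below illustrates that step only. The function name and all scores are hypothetical, and the abstract does not say how the authors implemented it.

```python
from statistics import median


def count_above_reference_median(population_scores, reference_scores):
    """Return (threshold, count, share) for population scores that exceed
    the median score of the reference group."""
    threshold = median(reference_scores)
    n_above = sum(1 for s in population_scores if s > threshold)
    return threshold, n_above, n_above / len(population_scores)


# Hypothetical risk scores, not taken from the study.
prep_scores = [2.0e-4, 5.0e-4, 6.0e-4, 9.0e-4, 3.0e-3]
population_scores = [5.0e-5, 9.0e-5, 1.0e-4, 7.0e-4, 4.0e-5, 8.0e-5, 2.0e-3, 6.0e-5]

threshold, n_above, share = count_above_reference_median(population_scores, prep_scores)
print(threshold, n_above, share)
```

Using the median of the PrEP group as the cut-off means a flagged patient scores higher than half of the patients whom clinicians had already judged suitable for PrEP.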

Conclusion

Automated analysis of data routinely stored in EHRs can identify patients at increased risk for incident HIV who are potential candidates for PrEP.

Douglas Krakower, MD (1); Susan Gruber, PhD (2); John T. Menchaca, BA (3); Judith C. Maro, PhD, MS (2); Noelle Cocoros, DSc, MPH (4); Benjamin Kruskal, MD, PhD (5); Ira B. Wilson, MD, MSc (6); Kenneth Mayer, MD (7); and Michael Klompas, MD, MPH, FRCPC, FIDSA (3)

(1) Infectious Disease Division, Beth Israel Deaconess Medical Center, Boston, MA; (2) Department of Population Medicine, Harvard Medical School, Boston, MA; (3) Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA; (4) Department of Population Medicine, Harvard Medical School & Harvard Pilgrim Health Care Institute, Boston, MA; (5) Atrius Health, Somerville, MA; (6) Health Services, Policy and Practice, Brown University, Providence, RI; (7) The Fenway Institute, Fenway Health, Boston, MA