There is a growing need to identify which pieces of information are most valuable to healthcare providers. The aim of this study was to identify the highest-impact variables for predicting postsurgery length of stay (LOS) in patients undergoing coronary artery bypass grafting (CABG).
Using a single institution’s Society of Thoracic Surgeons (STS) Registry data, 2121 patients who underwent elective or urgent isolated CABG were analyzed across 116 variables. Two machine learning techniques, random forest and artificial neural networks (ANNs), were used to identify the highest-impact variables for predicting LOS, and the results were compared against multiple linear regression. Out-of-sample validation of the models was performed on 105 patients.
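The variable-screening step described above can be illustrated with a minimal sketch. It assumes scikit-learn and uses purely synthetic data, not STS Registry data; the sample size, the number of variables, the choice of `top_k`, and all variable names are hypothetical, and the study's actual preprocessing, ANN architecture, and tuning are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for registry data: 500 "patients", 20 "variables".
# Assumption for illustration only: columns 0, 1 and 2 drive the outcome.
n_patients, n_variables = 500, 20
X = rng.normal(size=(n_patients, n_variables))
y = (6.0 + 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5 * X[:, 2]
     + rng.normal(scale=1.0, size=n_patients))

# Hold out a testing sample that the models never see during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Rank variables by random forest impurity-based importance.
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
ranked = np.argsort(forest.feature_importances_)[::-1]

# Keep only the highest-ranked variables (top_k is a hypothetical choice).
top_k = 3
top_variables = ranked[:top_k]

# Compare linear regression on all variables against the reduced set.
full_model = LinearRegression().fit(X_train, y_train)
reduced_model = LinearRegression().fit(X_train[:, top_variables], y_train)

mae_full = mean_absolute_error(y_test, full_model.predict(X_test))
mae_reduced = mean_absolute_error(
    y_test, reduced_model.predict(X_test[:, top_variables])
)
print("top variables:", top_variables.tolist())
print("MAE, all variables:", round(mae_full, 3))
print("MAE, top variables:", round(mae_reduced, 3))
```

In this toy setting the reduced model uses far fewer inputs than the full model, which mirrors the idea of retaining only the variables that carry most of the predictive value.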
Among the 10 highest-impact variables identified for predicting LOS, four of the most impactful were duration intubated, last preoperative creatinine, age, and number of intraoperative packed red blood cell transfusions. The best-performing model was an ANN using the 10 highest-impact variables (testing sample mean absolute error [MAE] = 1.685 d, R = 0.232), which performed consistently in the out-of-sample validation (MAE = 1.612 d, R = 0.150).
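For readers unfamiliar with the reported metrics, the following sketch shows how MAE and a correlation coefficient can be computed from observed and predicted LOS values. It assumes that R denotes the Pearson correlation between observed and predicted LOS, which the abstract does not state explicitly; the function names and the five example values are hypothetical and are not study data.

```python
import math


def mae(observed, predicted):
    """Mean absolute error: average of |observed - predicted|, in days."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted LOS values (days) for five patients.
observed_los = [4.0, 5.0, 6.0, 7.0, 10.0]
predicted_los = [5.0, 5.0, 6.0, 6.0, 7.0]

example_mae = mae(observed_los, predicted_los)
example_r = pearson_r(observed_los, predicted_los)
print("MAE (days):", example_mae)
print("Pearson R:", round(example_r, 3))
```

The two metrics answer different questions: MAE measures the typical size of the prediction error in days, whereas a correlation coefficient measures how well predictions track the ordering and spread of observed values.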
Using machine learning, this study identified several novel predictors of postsurgery LOS and reinforced certain known risk factors. Of all the variables in the STS database, only a few carry most of the predictive value for LOS in this population. Based on these findings, a simpler linear regression model has been shared and could be used elsewhere after further validation.

Copyright © 2021. Published by Elsevier Inc.