The following is a summary of “The Impact of Multi-Institution Datasets on the Generalizability of Machine Learning Prediction Models in the ICU,” published in the July 2024 issue of Critical Care Medicine by Rockenschaub et al.
Researchers conducted a retrospective study assessing how well machine learning (ML) models for early detection of adverse events (AEs) transfer to new hospital environments.
They used ICU data from 4 public datasets across Europe and the U.S., including adult patients admitted to the ICU for at least 6 hours.
Using data from 334,812 ICU stays, the researchers systematically evaluated the transferability of deep learning (DL) models for predicting common AEs such as death, acute kidney injury (AKI), and sepsis. The study examined whether incorporating multiple data sources and optimizing for generalizability during training could enhance model performance in new hospitals. At the training hospital, models achieved high area under the receiver operating characteristic curve (AUROC) scores for mortality (0.838–0.869), AKI (0.823–0.866), and sepsis (0.749–0.824). When the models were tested at different hospitals, AUROC declined, in some cases by as much as 0.200. Training with multiple datasets reduced the performance drop, with multicenter models achieving performance comparable to the best single-center model. Methods specifically designed to enhance generalizability did not significantly improve performance.
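To make the reported performance drop concrete, the following is a minimal Python sketch of how the gap between AUROC at the training hospital and AUROC at a new hospital could be computed. It is an illustration only, not the authors' code: the function names auroc and transfer_drop are hypothetical, and the labels and risk scores are invented toy values, not study data. AUROC is computed here through its standard pairwise (Mann-Whitney) interpretation.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher risk score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def transfer_drop(internal, external):
    """Change in AUROC when moving from the training hospital (internal)
    to a new hospital (external); negative means performance got worse."""
    return auroc(*external) - auroc(*internal)


# Toy data (assumed for illustration): 1 = adverse event, 0 = no event.
internal = ([0, 0, 1, 1, 0, 1], [0.1, 0.3, 0.8, 0.7, 0.2, 0.9])
external = ([0, 1, 0, 1, 0, 1], [0.2, 0.4, 0.5, 0.9, 0.1, 0.3])

print(f"internal AUROC: {auroc(*internal):.3f}")
print(f"external AUROC: {auroc(*external):.3f}")
print(f"transfer drop:  {transfer_drop(internal, external):+.3f}")
```

In this toy case the model separates cases perfectly at the training hospital but ranks some patients incorrectly at the new one, so the drop is negative. The pairwise loop is quadratic in the number of stays, so on a dataset of realistic size a rank-based implementation would be the more practical choice.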
Investigators concluded that diverse training data were crucial for DL risk prediction, as broader data inclusion can enhance model generalizability, though optimal performance at new hospitals still depends on compatible training data.
Source: journals.lww.com/ccmjournal/fulltext/9900/the_impact_of_multi_institution_datasets_on_the.357.aspx