Although early detection of structural heart disease improves patient outcomes, many patients remain undiagnosed. Echocardiography is impractical for population screening, but electrocardiogram (ECG)-based prediction algorithms can help identify high-risk patients. For a study, researchers developed a novel ECG-based machine learning approach to detect multiple structural heart diseases, hypothesizing that a composite model would yield higher prevalence and positive predictive values (PPVs), enabling meaningful recommendations for echocardiography.

They trained machine learning models to predict the presence or absence of any of seven echocardiography-confirmed diseases within one year, using 2,232,130 ECGs linked to electronic health data and echocardiography reports from 484,765 individuals between 1984 and 2021. The composite label comprised moderate to severe valvular disease (aortic or mitral stenosis or regurgitation, or tricuspid regurgitation), an ejection fraction <50%, or an interventricular septal thickness >15 mm. They evaluated model performance using 5-fold cross-validation, multi-site validation (trained on one site and tested on 10 separate sites), and a simulated retrospective deployment (trained on pre-2010 data and deployed on 2010 data).
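As a minimal sketch of how the composite label described above could be assembled from an echocardiography report, the Python snippet below flags a patient as positive if any of the seven conditions is met. The `EchoFindings` class, its field names, and the 0 to 3 severity grading are hypothetical and chosen only for illustration; the study's actual data schema is not given here.

```python
from dataclasses import dataclass


@dataclass
class EchoFindings:
    # Hypothetical schema. Valve severity is graded
    # 0 = none, 1 = mild, 2 = moderate, 3 = severe (an assumed encoding).
    aortic_stenosis: int = 0
    aortic_regurgitation: int = 0
    mitral_stenosis: int = 0
    mitral_regurgitation: int = 0
    tricuspid_regurgitation: int = 0
    ejection_fraction_pct: float = 60.0
    septal_thickness_mm: float = 10.0


MODERATE = 2  # threshold for "moderate to severe" under the assumed grading


def composite_label(echo: EchoFindings) -> bool:
    """Return True if any of the seven component conditions is present."""
    valve_grades = (
        echo.aortic_stenosis,
        echo.aortic_regurgitation,
        echo.mitral_stenosis,
        echo.mitral_regurgitation,
        echo.tricuspid_regurgitation,
    )
    has_valve_disease = any(grade >= MODERATE for grade in valve_grades)
    has_low_ef = echo.ejection_fraction_pct < 50      # ejection fraction <50%
    has_thick_septum = echo.septal_thickness_mm > 15  # septal thickness >15 mm
    return has_valve_disease or has_low_ef or has_thick_septum


if __name__ == "__main__":
    print(composite_label(EchoFindings()))                            # no findings
    print(composite_label(EchoFindings(mitral_regurgitation=2)))      # moderate MR
    print(composite_label(EchoFindings(ejection_fraction_pct=45.0)))  # reduced EF
```

Because the label is a logical OR over its components, its prevalence is at least as high as that of any single component, which is the property the composite approach relies on.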

The composite ‘rECHOmmend’ model, which used age, gender, and ECG traces as inputs, achieved an AUROC of 0.91 and a PPV of 42% at 90% sensitivity, with a composite label prevalence of 17.9%. Individual disease models achieved AUROCs ranging from 0.86 to 0.93 and PPVs ranging from 1% to 31%. AUROCs for models with different input features ranged from 0.80 to 0.93, increasing as features were added. Multi-site validation, with training on one site, produced results similar to cross-validation, with an aggregate AUROC of 0.91 across the independent test set of 10 clinical sites. In the simulated retrospective deployment, 11% of ECGs obtained in 2010 from patients without pre-existing structural heart disease were classified as high-risk, of which 41% (4.5% of total patients) developed true echocardiography-confirmed disease within one year.
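The link between prevalence and PPV in the figures above follows from Bayes' rule. The Python sketch below is illustrative only: it back-calculates the specificity implied by the reported sensitivity, PPV, and prevalence (a derived value, not one reported in the text, and valid only if all three figures refer to the same population), shows how PPV would fall for a hypothetical condition with 1% prevalence at the same operating point, and checks the deployment arithmetic under the assumption that the 11% and 41% figures are fractions of the same population.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_specificity(sensitivity: float, ppv_value: float, prevalence: float) -> float:
    """Solve the PPV formula above for specificity."""
    false_pos_rate = (
        sensitivity * prevalence * (1.0 - ppv_value)
        / (ppv_value * (1.0 - prevalence))
    )
    return 1.0 - false_pos_rate


# Figures reported in the text for the composite model.
SENSITIVITY = 0.90
PPV_REPORTED = 0.42
PREVALENCE = 0.179

# Derived here, not reported in the text: roughly 0.73.
spec = implied_specificity(SENSITIVITY, PPV_REPORTED, PREVALENCE)

# Hypothetical rarer condition (1% prevalence, an assumed value) at the
# same sensitivity and specificity: the PPV drops to a few percent.
ppv_rare = ppv(SENSITIVITY, spec, 0.01)

# Simulated deployment: 11% flagged as high-risk, 41% of those confirmed.
flagged_fraction = 0.11
confirmed_among_flagged = 0.41
confirmed_overall = flagged_fraction * confirmed_among_flagged  # about 4.5%

if __name__ == "__main__":
    print(f"implied specificity: {spec:.3f}")
    print(f"PPV at 1% prevalence: {ppv_rare:.3f}")
    print(f"confirmed overall: {confirmed_overall:.3%}")
```

This is why a composite label with higher prevalence can support a more useful PPV than a model for a single rare disease, even when the AUROCs are similar.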

A composite-endpoint, ECG-based machine learning model may identify a group at high risk of undiagnosed, clinically relevant structural heart disease, outperforming single-disease models and improving practical utility through higher PPVs. The approach might enable targeted echocardiography screening to reduce under-diagnosis of structural heart disease.