To evaluate the performance of machine learning (ML) in detecting glaucoma using fundus and retinal optical coherence tomography (OCT) images.
PubMed and EMBASE were searched on August 11, 2021. A bivariate random-effects model was used to pool ML's diagnostic sensitivity, specificity, and area under the curve (AUC). Subgroup analyses were performed based on ML classifier categories and dataset types.
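The random-effects pooling above can be illustrated with a simplified sketch. The example below pools hypothetical per-study sensitivities on the logit scale using DerSimonian-Laird weights; this is a univariate approximation for illustration only, not the bivariate model of the analysis (which jointly pools sensitivity and specificity), and all study counts shown are invented.

```python
import numpy as np

def dersimonian_laird_pool(logits, variances):
    """Pool logit-transformed proportions with DerSimonian-Laird
    random-effects weights (univariate simplification, for illustration)."""
    w = 1.0 / variances                        # fixed-effect (inverse-variance) weights
    mu_fe = np.sum(w * logits) / np.sum(w)     # fixed-effect pooled estimate
    q = np.sum(w * (logits - mu_fe) ** 2)      # Cochran's Q heterogeneity statistic
    df = len(logits) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)              # between-study variance estimate
    w_re = 1.0 / (variances + tau2)            # random-effects weights
    return np.sum(w_re * logits) / np.sum(w_re)

def pool_sensitivity(tp, fn):
    """Pool per-study sensitivities tp/(tp+fn) on the logit scale,
    then back-transform the pooled logit to a proportion."""
    tp, fn = np.asarray(tp, float), np.asarray(fn, float)
    sens = tp / (tp + fn)
    logits = np.log(sens / (1.0 - sens))
    var = 1.0 / tp + 1.0 / fn                  # approximate variance of the logit
    pooled_logit = dersimonian_laird_pool(logits, var)
    return 1.0 / (1.0 + np.exp(-pooled_logit))

# Hypothetical per-study true-positive and false-negative counts
tp = [90, 85, 95, 80]
fn = [10, 15, 5, 20]
print(round(pool_sensitivity(tp, fn), 3))
```

The logit transform keeps pooled proportions inside (0, 1); the between-study variance tau-squared inflates each study's weight denominator so heterogeneous studies are down-weighted less aggressively than under a fixed-effect model.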
Of the records screened, 105 (3.3%) studies were included. 73 (69.5%), 30 (28.6%), and 2 (1.9%) studies tested ML using fundus, OCT, and both image types, respectively. The total testing data size was 197,174 images for fundus and 16,039 for OCT. Overall, ML showed excellent performance for both fundus (pooled [95%CI] sensitivity, specificity, AUC = 0.92 [0.91-0.93], 0.93 [0.91-0.94], 0.97 [0.95-0.98]) and OCT (pooled [95%CI] sensitivity, specificity, AUC = 0.90 [0.86-0.92], 0.91 [0.89-0.92], 0.96 [0.93-0.97]). For fundus images, ML performed similarly on all data and on external data, whereas the external test result for OCT was less robust (AUC = 0.87). When comparing classifier categories, although support vector machines showed the highest performance (pooled sensitivity, specificity, AUC ranges: 0.92-0.96, 0.95-0.97, 0.96-0.99), results from neural networks and other classifiers were still good (pooled sensitivity, specificity, AUC ranges: 0.88-0.93, 0.90-0.93, 0.95-0.97). When analyzed by dataset type, ML demonstrated consistent performance on clinical datasets (pooled [95%CI] AUC = fundus: 0.98 [0.97-0.99]; OCT: 0.95 [0.93-0.97]).
The performance of ML in detecting glaucoma compares favorably with that of human experts and is promising for clinical application. Future prospective studies are needed to better evaluate its real-world utility.

Copyright © 2021. Published by Elsevier Inc.