Automatic screening systems based on neural networks have been used to detect diabetic retinopathy (DR). However, there has been no quantitative synthesis of the performance of these methods. We aimed to estimate the sensitivity and specificity of neural networks in DR grading.
Medline, Embase, IEEE Xplore, and the Cochrane Library were searched up to 23 July 2019. Studies were included if they evaluated the performance of neural networks in detecting moderate or worse DR or diabetic macular edema from retinal fundus images, with ophthalmologists’ judgment as the reference standard. Two reviewers extracted data independently. The risk of bias of eligible studies was assessed using the QUADAS-2 tool.
Twenty-four studies involving 235 235 subjects were included. Quantitative random-effects meta-analysis using the Rutter and Gatsonis hierarchical summary receiver operating characteristic (HSROC) model revealed a pooled sensitivity of 91.9% (95% CI: 89.6% to 94.3%) and specificity of 91.3% (95% CI: 89.0% to 93.5%). Subgroup analyses and meta-regression did not identify any statistically significant source of the heterogeneity in diagnostic accuracy among studies with different image resolutions, sample sizes of training sets, architectures of convolutional neural networks, or diagnostic criteria.
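As an illustration of the per-study quantities that such a meta-analysis pools, the following minimal Python sketch computes sensitivity and specificity from a single 2x2 table, with Wilson score 95% confidence intervals. The counts are hypothetical and are not taken from any included study, the function names are illustrative, and the sketch does not implement the HSROC model itself, which additionally models between-study variation.

```python
import math


def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)  # diseased eyes correctly flagged
    specificity = tn / (tn + fp)  # non-diseased eyes correctly cleared
    return sensitivity, specificity


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts for one study (assumed for illustration only).
tp, fn, fp, tn = 90, 10, 20, 180

sens, spec = sensitivity_specificity(tp, fp, fn, tn)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity = {sens:.3f}, 95% CI {sens_ci[0]:.3f} to {sens_ci[1]:.3f}")
print(f"specificity = {spec:.3f}, 95% CI {spec_ci[0]:.3f} to {spec_ci[1]:.3f}")
```

In a hierarchical model, pairs of estimates like these from every study are analysed jointly rather than averaged separately, so that the correlation between sensitivity and specificity across studies is taken into account.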
State-of-the-art neural networks could effectively detect clinically significant DR. To further improve the diagnostic accuracy of neural networks, researchers might need to develop new algorithms rather than simply enlarge the sample sizes of training sets or optimize image quality.

© 2020 European Society of Endocrinology.