Deep learning system allows for accurate screening for papilledema

A deep learning system that uses fundus photographs taken through pharmacologically dilated pupils can distinguish disks with papilledema, normal disks, and disks with nonpapilledema abnormalities, according to a recent study published in The New England Journal of Medicine.

“Our main finding was that an artificial-intelligence algorithm using deep-learning neural networks could discriminate among normal optic disks, disks with papilledema, and disks with other abnormalities,” wrote researchers from the BONSAI Group, led by Dan Milea, MD, PhD, of the Singapore National Eye Center, Singapore.

Accurate detection of papilledema — optic nerve edema caused by intracranial hypertension — is invaluable in patients with neurologic symptoms, including headache, stressed Milea and colleagues.

“Examination of the optic nerves is a fundamental component of the clinical examination, but direct ophthalmoscopy is usually avoided or poorly performed by general physicians and nonophthalmic specialists,” they noted. “The findings on ophthalmoscopy influence diagnostic strategy and treatment options. Failure to detect papilledema may result in visual loss and neurologic complications.”

For this study, Milea and colleagues trained, validated, and externally tested a deep learning system designed to classify optic disks. The system was based on 15,846 ocular fundus photographs of pharmacologically dilated eyes from participants with optic-nerve disorders, drawn from a large, international, multiethnic population and collected from 24 sites by the Brain and Optic Nerve Study with Artificial Intelligence (BONSAI). The fundus images were centered on either the macula or the optic disk and always included a view of the optic disk.

Fundus photographs were collected by neuro-ophthalmologists, who set the reference standard with a specific diagnosis for each photograph at the time of clinical evaluation based on the appearance of the optic-nerve head, clinical evaluation, ancillary testing, and follow-up visits. They classified these as normal optic disks, disks with papilledema due to proven intracranial hypertension, and disks with other abnormalities including anterior ischemic and inflammatory optic neuropathies, optic-disk drusen, optic atrophy, and congenital optic-nerve abnormalities.

Patients with normal optic nerves were included only if they had no other ocular conditions.

For the training and validation data sets, Milea and fellow researchers used 14,341 fundus photographs from 6,779 patients. Of these, 2,148 were of disks with papilledema, 3,037 demonstrated other abnormalities, and 9,156 were normal. The percentage classified as having papilledema ranged from 0% to 59.5%, while the percentage classified as normal ranged from 9.8% to 100%.

To assess the system’s performance at classifying the appearance of the optic disks, researchers calculated the area under the receiver-operating-characteristic curve (AUC), sensitivity, and specificity, using the neuro-ophthalmologists’ clinical diagnoses as the reference standard.
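For readers less familiar with these metrics, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed for one class versus the rest (for example, papilledema versus everything else). It is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    tp = fn = tn = fp = 0
    for truth, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if truth and predicted_positive:
            tp += 1
        elif truth and not predicted_positive:
            fn += 1
        elif not truth and predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t]
    negatives = [s for t, s in zip(y_true, scores) if not t]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = papilledema per the reference standard, 0 = anything else.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, model_scores)
area = auc(labels, model_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why studies of this kind typically report all three.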

In the validation set, their system differentiated disks with papilledema from normal disks and disks with nonpapilledema abnormalities with an AUC of 0.99 (95% CI: 0.98-0.99), normal from abnormal disks with an AUC of 0.99 (95% CI: 0.99-0.99), a sensitivity of 93.2% (95% CI: 91.8-94.5), and a specificity of 95.1% (95% CI: 94.7-95.6).

Milea and colleagues also assessed an external-testing data set of 1,505 photographs. For detecting papilledema, the system had an AUC of 0.96 (95% CI: 0.95-0.97), a sensitivity of 96.4% (95% CI: 93.9-98.3), and a specificity of 84.7% (95% CI: 82.3-87.1).

In the external-testing data sets, the overall accuracy of the deep-learning system was 91.8% (95% CI: 90.3-93.3) for detecting normal disks, 87.5% (95% CI: 85.5-89.3) for detecting disks with papilledema, and 81.1% (95% CI: 78.8-83.3) for detecting disks with other abnormalities. For detecting papilledema in these data sets, the trained system had an overall sensitivity of 96.4% (95% CI: 93.9-98.3) and an overall specificity of 84.7% (95% CI: 82.3-87.1).

In all data sets, the mean estimated prevalence of papilledema was 9.5%, for an overall positive predictive value of 39.8% (95% CI: 36.6-43.2) and a negative predictive value of 99.6% (95% CI: 99.2-99.7).
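The link between prevalence and predictive values can be made concrete with a short calculation. The sketch below applies the standard Bayes' theorem relation to the sensitivity, specificity, and prevalence quoted above; the article does not say how the study authors derived their figures, so the function name and approach here are assumptions, although the result matches the reported point estimates.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Figures quoted in the article: sensitivity 96.4%, specificity 84.7%, prevalence 9.5%.
ppv, npv = predictive_values(0.964, 0.847, 0.095)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # prints "PPV = 39.8%, NPV = 99.6%"
```

At a prevalence of 9.5%, false positives from the much larger nonpapilledema group outnumber true positives, which is why the positive predictive value stays near 40% despite high sensitivity, while the negative predictive value approaches 100%.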

In a post-hoc analysis of the external-testing data sets in which findings from the reference standard differed from the deep-learning system’s classification (177 photographs), 4.2% (n=15) of the 360 disks with papilledema were misclassified by the system as disks with other abnormalities. None of them, however, were classified as normal. The system missed only one patient with papilledema.

Study limitations included its retrospective design and its reliance on a diagnostic reference standard set by expert neuro-ophthalmologists, who incorrectly labeled 10 photographs (corrected by the deep-learning system). Other limitations included the use of pharmacologically dilated pupils, which may not reflect general practice, and the low threshold for diagnosing papilledema within the network, chosen to avoid false negatives.

In an accompanying editorial, Isaac Kohane, MD, PhD, of Harvard Medical School, Boston, MA, wrote: “Accurate assessment of the optic-nerve head, the optic disk, by funduscopy is an important, cost-effective, and noninvasive diagnostic tool for a variety of ocular, neurologic, and inflammatory conditions. Unfortunately, reliable funduscopic assessment is challenging for clinicians working with undilated pupils; even after mydriatic dilation, nonophthalmologists have considerably lower accuracy than neuro-ophthalmologists in detecting optic-disk disorders, including papilledema.”

Takeaways:

  1. An artificial intelligence algorithm can discriminate normal optic disks from those with papilledema and other abnormalities with high sensitivity and specificity.

  2. For nonophthalmologist clinicians, a deep-learning artificial intelligence algorithm can aid accurate examination of the optic nerves.


E.C. Meszaros, Contributing Writer, BreakingMED™

Milea reported grants from National Medical Research Council of Singapore and SingHealth Foundation.

Kohane reported other support from Pulsedata, and personal fees and other support from Inovalon, outside the submitted work.

