Convolutional neural networks (CNNs) have shown dermatologist-level performance in the classification of skin lesions. We aimed to deliver a head-to-head comparison of a conventional image analyser (CIA), which depends on segmentation and weighting of handcrafted features, with a CNN trained by deep learning.
We conducted a cross-sectional study using a real-world, prospectively acquired dermoscopic dataset of 1981 skin lesions to compare the diagnostic performance of a market-approved CNN (Moleanalyzer-Pro™, developed in 2018) with that of a CIA (Moleanalyzer-3™/Dynamole™, developed in 2004; all FotoFinder Systems Inc, Germany). As a reference standard, we used histopathological diagnoses (n = 785) or, in non-excised benign lesions (n = 1196), expert consensus plus an uneventful follow-up by sequential digital dermoscopy for at least 2 years.
A total of 281 malignant lesions and 1700 benign lesions from 435 patients (62.2% male, mean age: 52 years) were prospectively imaged. The CNN showed a sensitivity of 77.6% (95% confidence interval [CI]: [72.4%-82.1%]), specificity of 95.3% (95% CI: [94.2%-96.2%]), and receiver operating characteristic (ROC)-area under the curve (AUC) of 0.945 (95% CI: [0.930-0.961]). In contrast, the CIA achieved a sensitivity of 53.4% (95% CI: [47.5%-59.1%]), specificity of 86.6% (95% CI: [84.9%-88.1%]), and ROC-AUC of 0.738 (95% CI: [0.701-0.774]). The dataset included melanomas originally diagnosed by dynamic changes during sequential digital dermoscopy (52 of 201, 20.6%), which reduced the sensitivities of both classifiers. Pairwise comparisons of sensitivities, specificities, and ROC-AUCs indicated that the CNN clearly outperformed the CIA (all p < 0.001).
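As a rough illustration of how the reported sensitivity, specificity, and confidence intervals relate to lesion counts, the following Python sketch computes a Wilson score interval for a proportion. It rests on two assumptions that are not stated in the source: the true-positive and true-negative counts are back-calculated from the reported percentages (77.6% of 281 and 95.3% of 1700), and the Wilson method is only one common choice of interval, so the authors may have used a different one.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Return the (lower, upper) Wilson score interval for a proportion."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts, back-calculated from the reported percentages;
# the source gives only the totals (281 malignant, 1700 benign).
cnn_true_positives, malignant = 218, 281
cnn_true_negatives, benign = 1620, 1700

sensitivity = cnn_true_positives / malignant
specificity = cnn_true_negatives / benign
sens_ci = wilson_interval(cnn_true_positives, malignant)
spec_ci = wilson_interval(cnn_true_negatives, benign)

print(f"sensitivity {sensitivity:.1%}, 95% CI [{sens_ci[0]:.1%}-{sens_ci[1]:.1%}]")
print(f"specificity {specificity:.1%}, 95% CI [{spec_ci[0]:.1%}-{spec_ci[1]:.1%}]")
```

The same function can be applied to the CIA by substituting its back-calculated counts. The pairwise comparisons would additionally require per-lesion paired predictions, which the abstract does not provide.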
The superior diagnostic performance of the CNN argues against the continued use of former CIAs as an aid to physicians’ clinical management decisions.

Copyright © 2020 Elsevier Ltd. All rights reserved.