The following is a summary of “Evaluating the strengths and limitations of multimodal ChatGPT-4 in detecting glaucoma using fundus images,” published in the June 2024 issue of Frontiers in Ophthalmology by Musleh et al.
ChatGPT is a large language model (LLM) capable of answering ophthalmological questions accurately. However, its ability to detect glaucoma in color fundus photographs (CFPs) from a standard dataset, without prior training or tuning, had not yet been tested.
Researchers conducted a prospective study evaluating the diagnostic accuracy of ChatGPT-4 in identifying glaucoma cases among CFPs, focusing on binary classifications: ‘Likely Glaucomatous’ or ‘Likely Non-Glaucomatous.’
They used the publicly available Retinal Fundus Glaucoma Challenge (REFUGE) dataset, which contains 400 images for testing. ChatGPT-4 classified the images without prior training or fine-tuning. The analysis centered on a confusion matrix, which was used to assess the accuracy of the binary classifications.
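As a rough illustration of the confusion-matrix step, the sketch below tallies the four cells from paired reference and predicted labels. The function name `confusion_counts` and the five toy labels are hypothetical and invented for this example; they are not the study's data or code.

```python
from collections import Counter

# The two output classes named in the study's binary classification task.
POSITIVE = "Likely Glaucomatous"
NEGATIVE = "Likely Non-Glaucomatous"


def confusion_counts(truth, predicted):
    """Tally true/false positives and negatives from paired labels."""
    pairs = Counter(zip(truth, predicted))
    return {
        "tp": pairs[(POSITIVE, POSITIVE)],  # glaucoma, flagged as glaucoma
        "fn": pairs[(POSITIVE, NEGATIVE)],  # glaucoma, missed
        "fp": pairs[(NEGATIVE, POSITIVE)],  # healthy, flagged as glaucoma
        "tn": pairs[(NEGATIVE, NEGATIVE)],  # healthy, correctly cleared
    }


# Toy labels, made up purely for illustration (not from REFUGE).
truth = [POSITIVE, POSITIVE, NEGATIVE, NEGATIVE, NEGATIVE]
predicted = [POSITIVE, NEGATIVE, NEGATIVE, POSITIVE, NEGATIVE]

counts = confusion_counts(truth, predicted)
print(counts)
```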
Despite no fine-tuning, ChatGPT-4 achieved a notably high accuracy of 90% (95% CI: 87.06%-92.94%). The sensitivity was 50% (95% CI: 34.51%-65.49%), the specificity was 94.44% (95% CI: 92.08%-96.81%), the precision was 50% (95% CI: 34.51%-65.49%), and the F1 score was 0.50.
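To make the relationship between these metrics concrete, here is a minimal sketch that derives them from a single confusion matrix. The four cell counts are an assumption: the summary does not report them, and they were back-calculated so that they sum to 400 images and match the stated percentages. The normal-approximation (Wald) interval is likewise an assumption about how the confidence intervals were obtained.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = NormalDist().inv_cdf(0.975)


def wald_ci(successes, total, z=Z_95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width


# ASSUMED counts, back-calculated from the reported percentages;
# the summary itself does not state them.
tp, fn, fp, tn = 20, 20, 20, 340
total = tp + fn + fp + tn

accuracy = (tp + tn) / total
sensitivity = tp / (tp + fn)   # recall on glaucomatous eyes
specificity = tn / (tn + fp)   # recall on non-glaucomatous eyes
precision = tp / (tp + fp)     # share of glaucoma calls that were correct
f1 = 2 * precision * sensitivity / (precision + sensitivity)

acc_ci = wald_ci(tp + tn, total)
sens_ci = wald_ci(tp, tp + fn)
spec_ci = wald_ci(tn, tn + fp)

print(f"accuracy    {accuracy:.2%}  CI {acc_ci[0]:.2%} to {acc_ci[1]:.2%}")
print(f"sensitivity {sensitivity:.2%}  CI {sens_ci[0]:.2%} to {sens_ci[1]:.2%}")
print(f"specificity {specificity:.2%}  CI {spec_ci[0]:.2%} to {spec_ci[1]:.2%}")
print(f"precision   {precision:.2%}  F1 {f1:.2f}")
```

Under these assumed counts, sensitivity and precision coincide because the number of false negatives equals the number of false positives, which is also why the F1 score equals both.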
Investigators concluded that advanced AI models like LLMs might require less data for training in specialized medical fields like ophthalmology. Such techniques could lead to the development of valuable medical care tools, especially in resource-limited settings.
Source: frontiersin.org/journals/ophthalmology/articles/10.3389/fopht.2024.1387190/abstract