Human eyes normally move in “scan paths”: sequences of fixations connected by saccades. The study presented DeepGaze III, a novel model that predicts the spatial positions of sequential fixations in free-viewing scan paths over static images. DeepGaze III is a deep learning-based model that predicts where a participant will fixate next by combining image information with information about the previous fixation history. As a high-capacity and flexible model, DeepGaze III captures many meaningful patterns in human scan path data, setting a new state of the art on the MIT300 dataset and offering insight into how much information is present in scan paths across observers in the first place. This insight was then used to assess the importance of mechanisms implemented in simpler, interpretable models of fixation selection.
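
The idea of combining image information with fixation history can be made concrete with a small toy example. The sketch below is not DeepGaze III's architecture: it assumes a hand-written saliency grid in place of learned image features and a Gaussian preference for locations near the last fixation in place of a learned scan path network. The function name `next_fixation_distribution` and the parameters `sigma` and `history_weight` are hypothetical and chosen only for illustration.

```python
import math

def next_fixation_distribution(saliency, history, sigma=1.5, history_weight=1.0):
    """Toy next-fixation model (illustrative assumption, not DeepGaze III).

    saliency: 2D list of positive image-based scores (rows x cols)
    history:  list of (row, col) previous fixations, most recent last
    Returns a 2D list of probabilities that sums to 1.
    """
    rows, cols = len(saliency), len(saliency[0])
    last_r, last_c = history[-1]
    logits = []
    for r in range(rows):
        for c in range(cols):
            # Image term: log of the saliency score at this location.
            image_term = math.log(saliency[r][c] + 1e-9)
            # History term: prefer locations close to the last fixation.
            dist2 = (r - last_r) ** 2 + (c - last_c) ** 2
            history_term = -dist2 / (2 * sigma ** 2)
            logits.append(image_term + history_weight * history_term)
    # Softmax over all locations (shifted by the max for numerical stability).
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return [probs[r * cols:(r + 1) * cols] for r in range(rows)]

# Made-up 3x4 saliency grid and a two-fixation history.
saliency = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.8, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.6],
]
probs = next_fixation_distribution(saliency, history=[(0, 0), (2, 2)])
image_only = next_fixation_distribution(saliency, history=[(0, 0)], history_weight=0.0)
```

Setting `history_weight=0.0` reduces the toy model to a purely image-based prediction, so the output is proportional to the saliency grid; a positive weight shifts probability toward the neighbourhood of the most recent fixation.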

DeepGaze III’s design enabled the researchers to separate several factors that influence fixation choices, such as the interaction of scene content and scan path history. Because DeepGaze III has a modular architecture, they were able to conduct ablation studies, which showed that scene content had a greater influence on fixation selection than prior scan path history in their primary dataset. They also used the model to identify situations in which the relative importance of the different sources of information varies the most. These data-driven insights would be difficult to obtain with simpler models that lack the capacity to capture such patterns, which highlights how advances in deep learning can contribute to scientific understanding.
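
The logic of such an ablation study can be sketched in a few lines: score the observed fixations with the full model and with variants that have one information source switched off, then compare average log-likelihoods. Everything below is an illustrative assumption, including the toy scoring rule, the made-up saliency grid, and the three made-up fixation samples; the helper names `log_likelihood` and `ablation_report` are hypothetical, and the numbers this produces say nothing about the results reported in the study.

```python
import math

def log_likelihood(saliency, history, target, use_image=True, use_history=True, sigma=1.5):
    """Log-probability of the observed next fixation under a toy model.

    The image and history terms can be switched off independently,
    which is what makes the ablation comparison possible.
    """
    rows, cols = len(saliency), len(saliency[0])
    last_r, last_c = history[-1]
    logits = {}
    for r in range(rows):
        for c in range(cols):
            score = 0.0
            if use_image:
                score += math.log(saliency[r][c] + 1e-9)
            if use_history:
                score += -((r - last_r) ** 2 + (c - last_c) ** 2) / (2 * sigma ** 2)
            logits[(r, c)] = score
    # Log of the softmax normaliser, computed stably.
    m = max(logits.values())
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return logits[target] - log_norm

def ablation_report(saliency, samples):
    """Average log-likelihood per model variant over (history, target) samples."""
    variants = {
        "full": (True, True),
        "image_only": (True, False),
        "history_only": (False, True),
        "uniform": (False, False),
    }
    report = {}
    for name, (use_image, use_history) in variants.items():
        scores = [log_likelihood(saliency, h, t, use_image, use_history) for h, t in samples]
        report[name] = sum(scores) / len(scores)
    return report

# Made-up data: a 3x4 saliency grid and three (history, next fixation) pairs.
saliency = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.8, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.6],
]
samples = [
    ([(0, 0)], (1, 1)),
    ([(1, 1)], (2, 3)),
    ([(2, 3)], (1, 1)),
]
report = ablation_report(saliency, samples)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

The gap between a variant and the `uniform` baseline indicates how much that information source helps on the given samples, and the gap between `full` and a single-source variant indicates what the other source adds on top.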