- 1Department of Ophthalmology and Visual Sciences, University of British Columbia, Vancouver, BC, Canada
- 2Department of Medicine (Neurology), University of British Columbia, Vancouver, BC, Canada
- 3Department of Psychology, University of British Columbia, Vancouver, BC, Canada
Background: Healthy individuals show fixation biases when scanning faces, likely toward the regions that are most informative for identifying faces. Some reports suggest that subjects with prosopagnosia, an impairment in face recognition, have anomalous face scanning.
Objective: Our goal was to determine whether an approach using artificial intelligence could identify key scanning markers of prosopagnosia.
Methods: We used an image-classification technique based on deep learning to study the fixations of subjects with and without prosopagnosia during a face recognition task. We identified the number of fixations that maximizes classification performance and developed two methods of displaying scanpaths as images, each used to train a convolutional neural network.
Results: Optimal classification of acquired prosopagnosic from control trials required four fixations, with an AUC of 80%. The model showed a greater tendency to fixate the lower face and the right eye in acquired prosopagnosia. Optimal classification of developmental prosopagnosic from control trials required 16 fixations, with an AUC of 69%. Fixations on developmental prosopagnosic trials were shifted more toward peripheral regions. When the classifier trained to discriminate acquired prosopagnosia from controls was asked to analyze the developmental prosopagnosic trials, the latter were classified as being more like control scanpaths.
Conclusion: Only a few fixations during face scanning are required to differentiate controls from acquired prosopagnosia, with the latter showing anomalous biases. Developmental prosopagnosic scanpaths resemble degraded control scanpaths rather than anomalous biases. This study shows the potential of deep learning to identify abnormal behavioral markers in a disorder of complex visual processing.
Introduction
When subjects look at scenes or objects, they often scan them with a series of fixations directed at different locations. These scanpaths reflect an active process in which saccades serve to direct both attention and the high spatial and color discrimination of the fovea to regions of greater interest (1). Many factors contribute to what makes a region ‘interesting’. This includes both bottom-up stimulus properties, whose distribution in a scene can be plotted in a salience map (2), as well as top-down cognitive processes such as memory (3), schematic knowledge (4, 5), and task-related processes (6). This complex interplay likely converges to maximize the rate of information accrual for perceptual decisions relevant to the current goals of the observer. This complexity also results in a high degree of variability in scanpaths, even for repeated views of the same image by the same viewer (7, 8).
The fact that a subject’s cognitive processes contribute to shaping scanpaths suggests that the latter may be a useful tool for exploring perceptual or cognitive dysfunction. Schematic knowledge refers to what the subject knows about the organization of certain types of scenes and where useful information is spatially distributed—for example, when looking for a cup, knowing that these are more often located on tabletops than on couches. Thus a subject’s fixations are guided more in a ‘top-down’ manner by a region’s relevance to a perceptual task than in a ‘bottom-up’ manner by its salience, which refers to the properties or features that make it noticeable (9). Subjects with general visual agnosia may lose this schematic information, so that their scanpaths are guided more by salient cues than by scene knowledge (10).
Schematic knowledge is relevant to objects as well as scenes. With faces, studies show that the most informative regions for identifying people are the eyes, while the mouth region may be more useful for recognizing certain expressions (11–13). Accordingly, fixations are directed more to the eyes when healthy subjects process identity and more to the mouth when they switch to judging expressions in the same faces (6).
Subjects with prosopagnosia are impaired in recognizing the identity of faces. This can be an acquired or developmental problem (14, 15). There are functional subtypes, with apperceptive, associative and amnestic variants that differ in the degree to which the primary impairment lies in the perceptual processing of faces or in the access to facial memories (16). One intriguing question is whether any or all of these types of prosopagnosia are associated with degraded facial schematics. That is, do prosopagnosic subjects still possess a generic face representation that shows where useful information is likely to be located in a face? Face representations are often conceived of as situated in a multi-dimensional ‘face space’ (17), perhaps referenced to a normative ‘average face’ (18), with individual faces defined by their distances along those various facial dimensions. These distances likely determine which dimensions or facial properties are most useful in distinguishing individuals, generating a facial schematic for identity processing. If prosopagnosia results in a loss of this facial schema, this may be reflected in anomalous scanpaths during a face recognition task.
Some studies have indeed suggested that at least some prosopagnosic subjects have anomalous scanpaths when looking at faces (19–26). Most of these examined small numbers of subjects, though, sometimes single cases, which is problematic given that facial scanpaths vary considerably between healthy subjects (27, 28), and even differ for different faces scanned by the same subject (7, 29). Also, the prosopagnosic reports differ in the abnormalities they find, with some even noting normal scanning patterns (30, 31). The situation is complicated further by the fact that scanpaths have not just a spatial distribution, but also a temporal sequence, which is less often assessed—e.g. (19)—though it can reveal what regions are prioritized.
To improve on this situation, it would be desirable to assess larger numbers of prosopagnosic subjects making a large number of face scanning trials, and to use an analytic technique that can detect predicted and unanticipated differences in complex spatiotemporal sequences in large datasets with substantial between-trial and between-subject variability. Deep learning methods are one such analytic technique. Advances in deep learning algorithms have led to their recent deployment in clinical studies (32), including the fields of ophthalmology and neuro-ophthalmology, where they have been used to analyze ocular fundus images for disease (33–40) and eye movements for cerebellar disorders (41). Most relevant to us, one study used a machine learning approach to explore the scanning of individuals with autism spectrum disorders while they viewed faces during an emotion recognition task, finding that these subjects fixated less on the eyes (42), while another used neural network models trained on scanpaths made by children watching short video clips to discriminate subjects with autism spectrum disorders (43).
Similar to that last study (43), we trained convolutional neural networks on images of scanpaths collected from healthy individuals and prosopagnosic subjects, but during a face recognition task. Our questions were first, whether the scanpath classifier could reliably discriminate prosopagnosic from control trials, and second, what scanpath properties the classifiers used to make that discrimination. One valuable feature of this study is that we used ocular motor data that had been previously evaluated with hypothesis-driven analyses (26, 44), allowing us to compare directly a deep-learning approach with traditional statistical methods. In particular, those prior studies tested the hypothesis that, compared to controls, prosopagnosic subjects would show more scanning of the lower than the upper face. However, the results showed little if any difference in the ratio of upper versus lower face scanning. One aim of this study was to explore whether a data-driven approach with deep learning might reveal differences either not evident or not hypothesized a priori in our prior reports.
Our study includes three sections. In the first section, we trained models to distinguish subjects with acquired prosopagnosia from healthy controls. The goals of this section were first, to determine the optimum number of fixations in a scanpath for classification performance; second, to discover the group features learned by the models; and third, to determine how the model labeled individual prosopagnosic subjects—that is, at a single-subject level. In the second section, we trained models to distinguish subjects with developmental prosopagnosia from healthy controls, with similar goals. Thus, these two sections generated two separate binary classifications, rather than a single three-class classification. By examining acquired and developmental prosopagnosia cohorts separately but with the same method, we asked whether these two variants showed similar anomalous scanning that would suggest degradation of facial schemata.
Finally, in the third section, we asked how the model that had been trained to distinguish acquired prosopagnosia from controls would classify subjects with developmental prosopagnosia. There is debate on whether developmental prosopagnosia represents a distinct pathologic entity or merely the low end of the normal spectrum of face recognition (15). If the scanpaths of developmental prosopagnosia resembled the anomalous scanning seen in acquired prosopagnosia, this might be more consistent with pathology disrupting facial representations in a similar way, despite the lack of visible structural damage. If they resembled those of healthy subjects more, this might fit more with the normative view.
Materials and methods
Subjects
Prosopagnosic patients were recruited through www.faceblind.org. The cohort with acquired prosopagnosia consisted of 8 subjects (3 female), four with occipitotemporal lesions (three bilateral), and four with anterior temporal lesions only (two bilateral). The group with developmental prosopagnosia had 10 subjects (7 female). As controls we had 20 age-matched healthy subjects (10 female). All subjects had corrected visual acuity of at least 20/30 in their better eye, with no history of psychiatric or neurodegenerative conditions. All subjects were of white ancestry [prior studies have noted that East Asian and black subjects in England scan faces differently, with more emphasis on the center of the face (45, 46)].
Prosopagnosic subjects underwent a neuro-ophthalmological examination, Goldmann perimetry, and a battery of neuropsychological tests of intelligence, memory, attention, visual perception, and language skills (47), to exclude more general deficits that could account for face recognition difficulties. Subjects with developmental prosopagnosia also had to score less than 32 on the Autism Questionnaire (48), given that subjects with autism spectrum disorders can have impaired face recognition (49, 50), for reasons that may differ from developmental prosopagnosia.
Inclusion criteria for acquired prosopagnosia were performance worse than 2 standard deviations from the mean of control subjects on at least one of two tests of familiarity for recently viewed faces, the Cambridge Face Memory Test (51) or the face component of the Warrington Recognition Memory Test (52), while performing normally on the word component of the latter (Table 1).
Subjects with developmental prosopagnosia reported life-long difficulty in face recognition, corroborated by a high score on the 20-item Prosopagnosia Index (53). This was supplemented by objective confirmation of impaired face recognition with at least two of the following: (i) a discordance between preserved word memory and impaired face memory on the Warrington Recognition Memory Test that was in the bottom 5th percentile, (ii) a score at least two standard deviations below the control mean on the Cambridge Face Memory Task, (iii) a score at least two standard deviations below the control mean on an Old/New faces test (54) or (iv) a score at least two standard deviations below the control mean on a Famous Faces Test.
The demographic details of our acquired and developmental prosopagnosic subjects have been reported previously (26) and are summarized in Table 1. The location of lesions in the acquired prosopagnosic subjects is indicated by their identifiers. ‘R’ indicates a right-sided lesion, ‘L’ a left-sided lesion, and ‘B’ bilateral lesions, while ‘IOT’ indicates an inferior occipitotemporal lesion, ‘AT’ an anterior temporal lesion, and ‘ATOT’ a combination of the two.
Data acquisition
An EyeLink 1000 (SR Research Ltd., Mississauga, Canada) tracked the eye movements of subjects while their head was stabilized by a chin rest in a dimly lit room. Subjects were seated 34 cm from a computer screen with a resolution of 1024 × 768 pixels and a refresh rate of 140 Hz.
Protocol
The research ethics boards at the University of British Columbia and Vancouver General Hospital approved the research protocols and all subjects gave written consent in accordance with the Declaration of Helsinki. As this work involves data collected several years ago (26), no part of the study procedures or analyses had been pre-registered prior to the research being conducted.
In the learning phase subjects saw 10 facial images, of five people each with two expressions. This was followed by a recognition phase with 35 trials, 10 showing the learned faces and 25 showing distractor faces of different people. All learning and recognition trials began with the subject fixating a cross 7° above where the face would appear. In both learning and recognition phases, subjects were allowed to scan the faces for as long as they wished before pressing the spacebar to move to the next trial. For analysis we collected fixations from the period beginning with the appearance of the face stimulus and ending with the pressing of the spacebar, resulting in a wide range in the number of fixations across trials. Since deep learning methods benefit from larger sample sizes, we entered scanning data from both the learning phase and the recognition phase into our models. This was justified because the prior study had shown that there were no main effects of phase or group-phase interactions in the distribution of fixations, as measured by upper/lower, eye/mouth, or central/peripheral indices (26).
Stimuli
Faces were taken from the KDEF Face Database (76). For the five male target people whose faces were being memorized, we chose images with neutral, sad, or happy expressions. For the distractor faces, 25 male facial identities were chosen at random, also with varying expressions. All stimuli were presented on a white background, with the tip of the nose centered on the screen.
Data preprocessing
The X and Y coordinates and the duration of each fixation were recorded. Each fixation was classified as being directed to one of 10 facial regions of interest, namely the left or right eyes, eyebrows, or cheeks, as well as the forehead, nose, mouth, or chin. We imported the data into a Python environment, where trials were stored in individual pandas DataFrames. For each trial, fixations were treated as separate dimensions, resulting in a vector whose dimensionality equaled the number of fixations in that trial. Each dimension stored a categorical value between 1 and 10 indicating the facial region to which the fixation was directed.
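As a minimal sketch of this step (the ROI names and the ‘roi’ column label here are illustrative assumptions, not the study’s actual code), each trial’s fixation list can be mapped to a vector of categorical ROI codes:

```python
import pandas as pd

# Hypothetical ROI-to-code mapping; the actual labels may differ, but
# there are 10 regions coded 1-10 as described in the text.
ROI_CODES = {
    "left_eye": 1, "right_eye": 2, "left_brow": 3, "right_brow": 4,
    "left_cheek": 5, "right_cheek": 6, "forehead": 7, "nose": 8,
    "mouth": 9, "chin": 10,
}

def trial_to_vector(trial: pd.DataFrame) -> list:
    """Map each fixation of a trial (rows in temporal order, with a
    hypothetical 'roi' column) to its categorical code, yielding one
    dimension per fixation."""
    return [ROI_CODES[r] for r in trial["roi"]]
```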
Machine learning methods
Section 1: deep learning classification of acquired prosopagnosia vs. controls
We first designed a Baseline Model to classify acquired prosopagnosic and control trials based on scanpaths. The aim was to represent each trial by the spatiotemporal characteristics of the fixations made by the subject. Importantly, this did not include all the fixations made in a trial, but only the first m fixations. We transformed each trial into an n-dimensional vector. The first m dimensions represented the trial’s first m fixations as numerical variables indicating to which of the 10 facial regions each fixation had been directed. Dimensions m + 1 through n − 1 represented engineered features (Appendix). The last dimension n represented the dependent variable, which was subject group (acquired prosopagnosia vs. control).
To determine the optimal number of fixations to use for image classification, we systematically varied the number of fixations m included for each trial, where 1 < m ≤ 20, to train a logistic regression classifier (the Baseline Model). This allowed us to examine the area under the curve (AUC) for discriminating prosopagnosic from control trials as a function of the number of fixations used.
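A minimal sketch of this sweep, assuming `trials` is a list of (ROI-code vector, group label) pairs and omitting the engineered features for brevity; the cross-validation here is simplified relative to the subject-wise folds described below:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def baseline_auc(trials, m):
    """AUC of a logistic regression trained on the first m fixations.
    Trials with fewer than m fixations are discarded."""
    kept = [(vec[:m], label) for vec, label in trials if len(vec) >= m]
    X = np.array([vec for vec, _ in kept], dtype=float)
    y = np.array([label for _, label in kept])
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, y, scoring="roc_auc").mean()

# Sweep m over 2..20 and select the value that maximizes AUC:
# aucs = {m: baseline_auc(trials, m) for m in range(2, 21)}
```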
Scanpaths have both spatial and temporal aspects, which can be represented in a single static image in a number of ways. These will differ in the information they emphasize. Since the best means of discriminating prosopagnosic from control scanpaths was not known a priori, we generated two types of image representations of the scanpath of each trial (Figure 1), using the optimal number of fixations for classifying acquired prosopagnosic subjects versus controls that had been identified by the Baseline Model.

Figure 1. On the left, an example scanpath is shown overlaid on the face (KDEF image ID AM02NES) being viewed. This is transformed into two image representations on the right, the Facial Scanpath (top) and the ROI Sequence (bottom). Reproduced with permission from Lundqvist et al. (76).
Facial Scanpath (Figure 1, top): This plotted fixations as discs centered on the original X and Y coordinates of each fixation. The radius of each disc was proportional to the square of the duration of fixation (to amplify duration differences) divided by a constant (to ensure that most discs still lay within the face boundaries). The order of fixations in the sequence was indicated by the brightness of the disc, with earlier fixations being brighter. This method thus represents both fixation order and duration, and shows the spatial location of fixations as a continuous variable.
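A minimal sketch of how such an image could be rendered (the scaling constant k and the screen geometry are assumptions):

```python
import matplotlib.pyplot as plt

def draw_facial_scanpath(fixations, ax, k=50000.0):
    """fixations: list of (x, y, duration_ms) tuples in temporal order.
    Disc radius grows with squared duration divided by a constant k;
    brightness falls with position in the sequence."""
    n = len(fixations)
    for i, (x, y, dur) in enumerate(fixations):
        brightness = 1.0 - i / n          # earlier fixations are brighter
        radius = dur ** 2 / k             # duration^2 / constant
        ax.add_patch(plt.Circle((x, y), radius, color=(brightness,) * 3))
    ax.set_xlim(0, 1024)
    ax.set_ylim(768, 0)                   # screen pixel coordinates, y down
    ax.set_aspect("equal")
```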
ROI Sequence (Figure 1, bottom): The x-axis showed the facial region of interest to which the fixation was directed, while the y-axis showed the fixation’s position in the temporal sequence, with the first fixation at the top and the last at the bottom. This type of representation emphasizes the temporal order more strongly than the Facial Scanpath does, and represents location categorically by feature rather than continuously, while omitting fixation duration.
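The ROI Sequence image can likewise be sketched as a one-hot matrix, one row per fixation and one column per region (a sketch, not the study’s actual code):

```python
import numpy as np

def roi_sequence_image(roi_codes, n_rois=10):
    """roi_codes: temporally ordered ROI codes in 1..n_rois. Returns a
    (len(roi_codes), n_rois) matrix; time runs downward, with the first
    fixation in the top row."""
    img = np.zeros((len(roi_codes), n_rois), dtype=np.float32)
    for row, code in enumerate(roi_codes):
        img[row, code - 1] = 1.0          # mark the fixated region
    return img
```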
We then trained two deep convolutional neural network models, one on each of these two image sets. The model architecture was identical in all respects (besides the input size) for both. Our convolutional neural networks used filters with a 3 × 3 kernel and a stride length of 2, and a 3 × 3 maxpool size. There were three convolution layers, the first using 32 feature maps, the second 64, and the last 128. The learning rate was set at 0.0001. Code for these analyses is available in the GitHub repository for this project: https://github.com/akazemian/Scanning-Faces-and-Prosopagnosia
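A minimal sketch of the described architecture, assuming a Keras/TensorFlow implementation; the placement of the pooling layers, the input size, activations, and optimizer are our assumptions, and the repository above holds the authoritative code:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(256, 256, 1)):
    """Three convolution layers with 32, 64, and 128 feature maps,
    3 x 3 kernels with stride 2, 3 x 3 max-pooling, and a sigmoid
    output for the binary group label."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=3, strides=2, activation="relu"),
        layers.MaxPooling2D(pool_size=3),
        layers.Conv2D(64, kernel_size=3, strides=2, activation="relu"),
        layers.MaxPooling2D(pool_size=3),
        layers.Conv2D(128, kernel_size=3, strides=2, activation="relu"),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC()])
    return model
```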
The training step for each model involved an 8-fold cross-validation procedure to ensure that the model was tested on the trials from each of the 8 prosopagnosic subjects once. The k-fold validation technique splits the data into k subsets, and training/testing is repeated k times, with each of the k subsets used once as the test set and the other k − 1 subsets used for training. In our study, a subset is the set of data from one prosopagnosic subject. For the acquired prosopagnosia section, there are 8 folds because there are 8 subjects. Each time, we train on 6 subjects, validate on 1 of the omitted subjects, and test on the other omitted subject. We repeat until all subjects have been used once for validation and once for testing. For each fold, the process was repeated 10 times, each time with a new random order of trials and different randomly initialized weights in the learning phase, to assess the reproducibility of the results, and the performance of the model was averaged across the 10 repetitions in each fold. Since the dataset involved only 2 groups, a binary loss function was used in the model architecture, measuring performance as a probability value. The cross-entropy loss increased as the predicted probability diverged from the actual probability, and the function was calculated as
$$\mathrm{Loss} = -\frac{1}{n}\sum_{i=1}^{n}\left[\, y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \,\right]$$

where n is the number of scalar values in the model output and $y_i$ and $\hat{y}_i$ are the ith actual value and predicted value in the model output, respectively. The more confident the model is about a trial’s label, the closer the predicted value is to 0 (Control) or 1 (Prosopagnosic).
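To illustrate the fold structure described above (the pairing of each test subject with the next subject as validator is our assumption, and the subject IDs are hypothetical):

```python
# Each prosopagnosic subject is held out once for testing and once for
# validation; the remaining six (plus controls) are used for training.
subject_ids = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]

folds = []
for k, test_subj in enumerate(subject_ids):
    val_subj = subject_ids[(k + 1) % len(subject_ids)]   # rotating validator
    train_subjs = [s for s in subject_ids if s not in (test_subj, val_subj)]
    folds.append({"train": train_subjs, "val": val_subj, "test": test_subj})
```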
Having trained two models on the two sets of images (Facial Scanpath and ROI Sequence) for each fold, we next used a weighted average of the predicted values from each model to maximize classification performance for each fold. This we called the Hybrid model. The optimal weights for each of the two models were found using a non-negative coefficient linear regression model. In each cross-validation fold, the linear regression model was trained on the predicted values from the two models and evaluated using the true labels. The positive coefficients were enforced to disregard inverse correlations between the predictions and the true labels. The linear regression coefficients that resulted in the highest AUC score were selected as the optimal weights of the 2 models. In this way, the weights of the two models varied in each fold based on the models’ performance in that fold.
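A minimal sketch of this weighting step, assuming per-fold arrays of trial predictions from the two CNNs; scikit-learn’s `positive=True` option enforces the non-negative coefficients:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def hybrid_weights(pred_scanpath, pred_roi, labels):
    """Fit non-negative weights over the two models' predicted
    probabilities against the true labels for one fold."""
    X = np.column_stack([pred_scanpath, pred_roi])
    reg = LinearRegression(positive=True).fit(X, labels)
    return reg.coef_

# The Hybrid prediction for a trial is then the weighted combination of
# the two models' outputs, with weights refit in each fold.
```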
Section 2: deep learning classification of developmental prosopagnosia vs. controls
The same procedure as in Section 1 was used, now with developmental prosopagnosia versus controls. We started by using a Baseline Model with the same vector dimensions to find the optimal number of fixations for classification, and used this optimal number to generate the same two types of scanpath images. We then trained two convolutional neural networks and developed the Hybrid model. One difference from Section 1 is that there were 10 developmental prosopagnosic subjects, whereas there were only 8 acquired prosopagnosic subjects. This led to a 10-fold cross-validation design for training the convolutional neural networks.
Section 3: classification of the scanpaths in developmental prosopagnosia by the hybrid model trained to distinguish acquired prosopagnosia from control subjects
Here the Hybrid model from section one was asked to classify the trials from developmental prosopagnosia subjects as being either more like acquired prosopagnosia trials or more like control trials. The two image representations of a trial were fed to all eight Facial Scanpath and ROI Sequence models, and the final prediction was calculated as the average of the 8 outputs. The more similar a developmental prosopagnosic trial was to a control trial, the closer the final prediction was to 0, and the more similar it was to an acquired prosopagnosic trial, the closer the prediction was to 1.
Additional statistical analysis
To confirm the statistical significance of the results obtained from our models, we used a permutation test to compare the Hybrid model’s AUC score with chance performance in sections one and two. This was done by training the Facial Scanpath and ROI Sequence models on shuffled labels, followed by testing them on true labels and recording the AUC using the same cross-validation design. This process was repeated 100 times to obtain a distribution of chance AUC scores. A p-value was then determined based on the number of permutations whose score was better than the true Hybrid AUC.
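A minimal sketch of the permutation logic; `cross_validated_auc` is a hypothetical stand-in for the full train-and-test pipeline described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_validated_auc(labels):
    """Hypothetical stand-in for retraining the models on the given
    labels and returning a cross-validated AUC; here it returns
    chance-level values for illustration."""
    return rng.uniform(0.4, 0.6)

def permutation_p(true_auc, labels, n_perm=100):
    # Null distribution: AUCs of models trained on shuffled labels.
    null = np.array([cross_validated_auc(rng.permutation(labels))
                     for _ in range(n_perm)])
    # p-value: fraction of chance models beating the true Hybrid AUC.
    return float(np.mean(null >= true_auc))
```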
We also performed two post-hoc analyses in Section 1 to confirm the results suggested by the outputs of the convolutional neural networks. First, for the analysis of acquired prosopagnosia, we constructed left: right eye indices for each subject as before (26, 44), by subtracting the number of fixations to the eye on the right side of the image from the number to the eye on the left, and dividing by the sum of the two. A positive index indicates more fixations on the eye on the left side of the image, which is the photographed person’s right eye. We analyzed this index with a one-tailed t-test for the post-hoc hypothesis that prosopagnosic subjects fixate the eye on the right of the image more than controls do. Second, we constructed and analyzed an eye: mouth index, where a positive index indicates more fixation on the eyes, and subjected this to a one-tailed t-test for the post-hoc hypothesis that acquired prosopagnosic subjects scan the mouth more than the eyes, compared to controls.
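In formula form, with $f_{\text{left}}$, $f_{\text{right}}$, $f_{\text{eyes}}$, and $f_{\text{mouth}}$ the fixation counts on each region (the analogous difference-over-sum construction of the eye: mouth index is our reading of the text):

$$\text{left:right index} = \frac{f_{\text{left}} - f_{\text{right}}}{f_{\text{left}} + f_{\text{right}}}, \qquad \text{eye:mouth index} = \frac{f_{\text{eyes}} - f_{\text{mouth}}}{f_{\text{eyes}} + f_{\text{mouth}}}$$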
In Section 2, the analysis for developmental prosopagnosia, we constructed a dispersion index (26). This measured the distance of each fixation from the center of the face, and calculated the standard deviation of these distances for each trial, finally obtaining an average of this measure of variability across all the trials for each subject. A large average standard deviation indicates that subjects are scattering fixations broadly, while a low one indicates that they are fixating a few areas repeatedly. We analyzed this dispersion index with a one-tailed t-test for the post-hoc hypothesis that developmental prosopagnosic subjects had more dispersed scanning.
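A minimal sketch of this index, assuming fixation coordinates in screen pixels and the face center at mid-screen (the nose tip was centered on the screen, per the Stimuli section):

```python
import numpy as np

def dispersion_index(trials, center=(512.0, 384.0)):
    """trials: list of (n_fixations, 2) arrays of x, y coordinates.
    Returns the mean across trials of the per-trial standard deviation
    of fixation distances from the face center."""
    per_trial_sd = []
    for fix in trials:
        dists = np.linalg.norm(np.asarray(fix, dtype=float) - center, axis=1)
        per_trial_sd.append(dists.std())
    return float(np.mean(per_trial_sd))
```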
Results
Behavioral results
Mean discrimination performance (d’) was 1.82 (s.d. 0.99) for controls, 1.63 (s.d. 0.76) for developmental prosopagnosia, and 0.52 (s.d. 0.63) for acquired prosopagnosia (Table 2; Supplementary Figure 1). An ANOVA showed a main effect of group [F(2, 35) = 6.58, p < 0.004]. The acquired prosopagnosia group performed worse than either the controls (p < 0.003) or the developmental prosopagnosia group (p < 0.034), while there was no difference between the latter two. As noted (26), this short-term familiarity protocol is easier than other tests of face recognition, and was not designed to maximize diagnostic accuracy for prosopagnosia. Nevertheless, the fact that subjects with acquired prosopagnosia performed worse than those with the developmental variant is consistent with our prior observations with other tests (55, 56).

Table 2. Recognition performance of subjects during the task, d’ being discriminative sensitivity, and c being criterion bias.
Section 1: deep learning classification of acquired prosopagnosia vs. controls
We trained our Baseline Model on a dataset generated using the first m fixations of all trials, for 1 < m ≤ 20 and plotted AUC as a function of m. We expected two factors to contribute to the shape of the obtained graph. First, with more fixations per trial, the amount of scanpath information increases, which could improve the ability of the model to distinguish between the two subject groups. On the other hand, since we discard trials with fewer than m fixations, a higher m means a smaller dataset, which could cause a drop in classification performance. Hence, we expected an inverted-U shape to the AUC function, with classification performance peaking at a certain number of fixations. This was indeed the case, with the first four fixations being the optimal number.
We trained two convolutional neural networks, one using the Facial Scanpath and the other the ROI Sequence image sets. The Facial Scanpath model performed better than the ROI Sequence model, with one of the folds even achieving an almost perfect AUC score of 1 (Figure 2A). The Hybrid model’s AUC was recorded and averaged across the 8 folds, with an overall AUC of 80% (Figure 2B).

Figure 2. Acquired prosopagnosia. (a) Classification results for the two models in each fold, as well as the Hybrid model. Error bars represent one standard deviation from the mean score obtained by repeating the train-test process 10 times for each fold, evaluating the reproducibility of the results. The Hybrid model was generated using the average predictions from the two models in each fold. (b) Mean AUC score for models in (a) and the Baseline using the first 4 fixations.
To discover the features learned by the models we examined the trials that the models classified with high confidence. Each trial’s label was predicted as a probability: the more confident the model’s prediction, the closer the probability to either 0 (control) or 1 (prosopagnosia). We looked at the trials with predictions either below the 10th percentile (high confidence for control) or above the 90th percentile (high confidence for prosopagnosia) of each fold. We overlaid these trials (obtained from all subjects regardless of their group membership) to see what the model thought constituted a prosopagnosic and a control scanpath. The Facial Scanpath model showed that control-label scanpaths had more fixations on the eyes, particularly the eye on the left side of the image, while prosopagnosic-label scanpaths fixated the nose, mouth and right upper face more (Figure 3A). The ROI Sequence model showed that both groups fixated the nose first; control-label scanpaths then shifted to the left eye and then to both eyes, while prosopagnosia-label scanpaths stayed on the nose and mouth more (Figure 3B).
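A minimal sketch of this selection-and-overlay step (array shapes are illustrative):

```python
import numpy as np

def confidence_overlays(preds, images):
    """preds: (n_trials,) predicted probabilities; images: (n_trials, H, W)
    scanpath images. Returns summed overlays of the trials the model
    labels with high confidence as control or prosopagnosic."""
    lo, hi = np.percentile(preds, [10, 90])
    control_overlay = images[preds < lo].sum(axis=0)  # confident control
    prosop_overlay = images[preds > hi].sum(axis=0)   # confident prosopagnosic
    return control_overlay, prosop_overlay
```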

Figure 3. (a) Left image shows the superimposition of all trials that the Facial Scanpath model classifies as control (i.e., label 0) with high confidence (<10th percentile of all predicted values), while the right shows that of trials that the model classifies as acquired prosopagnosic (i.e., label 1) with high confidence (>90th percentile of all predicted values). (b) Superimposition of all trials that the ROI Sequence model classifies as control (label 0) with high confidence and trials that it classifies as prosopagnosic (label 1) with high confidence. The brighter a cell, the higher the number of trials with a fixation on a particular ROI for that position in the fixation sequence. The faces next to each plot (KDEF Image ID AM02NES) illustrate this by showing the relative density in space for each of fixations 1–4, overlaid on a sample face.
One potential confound is that, since prosopagnosic subjects make more errors than controls, the model’s predictions might actually be discriminating between correct and incorrect trials rather than between control and prosopagnosic trials. To address this we looked at the scanning on correct and incorrect trials separately. If the model were actually discriminating trial accuracy instead of group membership, scanning on error trials would differ from scanning on correct trials, rather than differing between prosopagnosic and control subjects. This was not the case (Figure 4). Rather, the scanning of controls was similar regardless of whether they gave the right or wrong answer, and the same was true for prosopagnosic subjects.

Figure 4. Overlay images of the scanpaths of the first four fixations of (A) control subjects and (B) subjects with acquired prosopagnosia, during correct trials (left graphs) and error trials (right graphs). For (A,B), the top shows the Facial Scanpath overlay, the bottom shows that of the ROI Sequence. Scanpaths during correct and error trials of subjects with acquired prosopagnosia resemble each other more than they do the scanpaths of controls on error trials, which again do not appear much different than how control subjects scan faces on correct trials.
To assess the significance of the Hybrid model’s performance, we used a permutation test to compare the Hybrid AUC score with chance performance. Our model’s performance of 80% was significantly better than the performance of 100 chance models (p < 0.01): i.e., none of the models trained on shuffled labels performed better than the Hybrid model.
As a post-hoc application of traditional statistics for comparison, we analyzed how subjects distributed fixations between the right and left eyes, and between the eyes and mouth, limited to the first four fixations of each trial. (This differs from the prior study (26), which used all fixations of a trial.) The mean left: right index was −0.18 (s.d. 0.64) for subjects with acquired prosopagnosia, and 0.28 (s.d. 0.48) for controls, confirming that subjects with acquired prosopagnosia fixated the eye on the right side of the image more (p = 0.027). The mean eye: mouth index was 0.10 (s.d. 0.63) for subjects with acquired prosopagnosia, and 0.46 (s.d. 0.40) for control subjects, indicating a bias to the eyes for controls but more even scanning between the eyes and mouth in prosopagnosic subjects (p = 0.042).
How did the Hybrid model classify individual prosopagnosic subjects? We examined the distribution of outputs for all eligible trials (those with 4 or more fixations) made by each individual subject (Figure 5). For any given subject, a median prediction closer to 1 than to 0 would indicate that most of that subject’s trials were similar to the model’s understanding of a prosopagnosic scanpath. There was substantial variation across subjects, with the scanpaths of a few (RIOT-4, B-AT1, B-AT2) appearing on average more like control scanpaths. Two points are worth mentioning: the likelihood that a subject had more prosopagnosic-like or more control-like scanpaths was related neither to the presence of a field defect (Table 1) nor to whether they had anterior temporal or occipitotemporal lesions.

Figure 5. These images superimpose the first 4 fixations from all trials for each acquired prosopagnosic subject. The box plots represent the distribution of trial predictions from the Hybrid model for each subject. The ends of the box represent the lower and upper quartiles, while the median is marked by a line inside the box and the whiskers represent the minimum and maximum values, with outliers beyond these plotted as dots. Subjects are sorted by median prediction, which is the value m written at the top of each box. Thus, subjects on the left have fixation patterns that on average are closer to the model’s understanding of how a control subject fixates faces, while those to the right have characteristics that match the model’s understanding of prosopagnosic scanning behavior.
Section 2: deep learning classification of developmental prosopagnosia vs. controls
In this section, we performed a similar process for the developmental prosopagnosia cohort. We started by finding the optimal number of fixations to use, which was 16. Next, we generated the same two types of image representations for developmental prosopagnosic subjects and controls using each trial’s first 16 fixations. This resulted in 16 fixation discs for each Facial Scanpath image, and a 16×10 matrix for each ROI Sequence image. Figure 6 shows the classification results for the 10 folds, including the Hybrid model, which achieved an average AUC of 69% (Figure 6B).

Figure 6. Developmental prosopagnosia. (a) Classification results for the two models in each fold, as well as the Hybrid model. Error bars represent one standard deviation from the mean score obtained by repeating the train-test process 10 times for each fold, evaluating the reproducibility of the results. The Hybrid model was generated using the average predictions from the two models in each fold. (b) Mean AUC score for models in (a) and the Baseline using the first 16 fixations.
Next, we visualized the scanpaths in the same way as in Section 1. The Facial Scanpath Model confidently classified trials with a higher density of fixations on the central features as control trials, and those with more peripherally scattered fixations as developmental prosopagnosia trials (Figure 7A). The ROI Sequence model suggested that the two groups have a similar scanning behavior during the initial fixations, with a high density of fixations on the nose followed by fixations on the left eye. This model classified trials with a higher concentration on the central regions and almost no fixations on the peripheral regions as control trials, and those with more evenly distributed fixations as developmental prosopagnosia trials (Figure 7B).

Figure 7. (a) Left image shows the superimposition of all trials that the Facial Scanpath model classifies as developmental prosopagnosic (i.e., label 1) with high confidence (>90th percentile of all predicted values), while the middle shows all trials labeled as control (i.e., label 0) with high confidence (<10th percentile of all predicted values). (b) Superimposition for the ROI Sequence model, with conventions similar to Figure 3B. The face (KDEF Image ID AM02NES) heat maps here represent the average fixation density for three consecutive fixations at a time. (c) The line plot represents the mean contrast value for each fixation number in the 10 train-test repetitions, with the error bars representing 1 standard error.
For a post-hoc statistical confirmation of this impression, we analyzed the contrast between the 10 ROIs for each fixation row in the sequence from 1 to 16 (Figure 7C). A high contrast value indicates a tendency to fixate some regions more than others, while a low contrast value indicates that all regions are fixated to a similar degree. Contrast decreased as the fixation sequence progressed, meaning that fixations became more widely distributed across the face, but at most points in the fixation sequence the prosopagnosic-label trials had more evenly distributed fixations than the control-label trials. A two-way repeated-measures ANOVA with contrast as the dependent variable and group (prosopagnosic-like versus control-like) and fixation number as independent variables showed a significant main effect for group (F = 22.8, p = 0.001) and fixation number (F = 95.0, p < 0.0001). There was a significant interaction between the two (F = 2.57, p = 0.0021), due to a greater difference between groups for earlier fixations.
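The text does not spell out the contrast formula; one plausible reading, sketched here purely as an assumption, is a coefficient-of-variation measure across the 10 ROI columns for each fixation row:

```python
import numpy as np

def row_contrast(overlay, eps=1e-9):
    """overlay: (16, 10) matrix of fixation counts per sequence position
    (rows) and ROI (columns). High values mean a few ROIs dominate a
    row; low values mean fixations are spread evenly across ROIs."""
    means = overlay.mean(axis=1)
    return overlay.std(axis=1) / (means + eps)
```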
For a post-hoc application of traditional statistics for comparison, we analyzed the dispersion index. This differs from the prior study (26), which used all fixations of a trial, by including only the first 16 fixations of each trial, and only trials with at least 16 fixations. The mean dispersion index was 35.3 (s.d. 4.87) for subjects with developmental prosopagnosia, and 31.8 (s.d. 4.64) for control subjects, confirming that subjects with developmental prosopagnosia had more dispersed scanning [t(24) = 1.82, p < 0.041].
As with acquired prosopagnosia, we compared the predictions to the overlay images for correct and incorrect trials. Again, the difference in the dispersal of scanning tracked with subject group rather than with the accuracy of the trial response. The greater dispersal of scanning in developmental prosopagnosia is particularly evident in the ROI sequence plots (Figure 8).

Figure 8. Overlay images of the scanpaths containing the first 16 fixations of (A) control subjects and (B) subjects with developmental prosopagnosia, during correct trials (left graphs) and error trials (right graphs). The top graphs of each half show the Facial Scanpath overlay, the bottom graphs that of the ROI Sequence.
Section 3: classification of developmental prosopagnosic scanpaths by the acquired prosopagnosia-control hybrid model
We asked how the Hybrid model that was trained to distinguish between acquired prosopagnosia and control subjects would classify the data of the developmental prosopagnosia group. We generated images using the first 4 fixations of each trial of the developmental prosopagnosic subjects and used them as input to the trained Hybrid model developed in Section 1. We first performed a group analysis. Again, the key output of the classifier is the predicted probability of group membership of a trial, ranging from 0 (control-like) to 1 (acquired prosopagnosic-like). For each group we plotted the distribution of these trial probability values along that 0–1 continuum, as a density plot. We then compared these distributions between the three groups, asking whether the trials of the developmental prosopagnosic group resembled more those of the acquired prosopagnosic group or the trials of the control group. This showed that the developmental prosopagnosia data more closely resembled that of the control group than that of the acquired prosopagnosia group, being only slightly shifted to the right of the control distribution (Figure 9).

Figure 9. Section 3, group results. Each trial is classified by the Hybrid model (trained to discriminate between acquired prosopagnosia and controls) as having a probability of being a prosopagnosic-like scanpath, running from 0 (definitely control-like) to 1 (definitely prosopagnosic-like), represented on the x-axis. For each of the three groups, all the trials of all member subjects are included in a group plot, showing how these predicted probabilities are distributed for that group, depicted as a density line—i.e., the frequency of a particular predicted probability. As expected, the scanpaths of the control group have a mean probability of 0.31 and a peak close to the lower end, around 0.2, indicating that most of their trials show control-like properties. Acquired prosopagnosic subjects show a mean probability of 0.49, with a shallower peak around 0.66, indicating a greater tendency for their trials to be classified as prosopagnosic-like. The key finding is that developmental prosopagnosic trials have a distribution whose shape and position more closely resemble those of controls, with a mean probability of 0.37.
For individual subjects, we examined how the model characterized the scanpaths of each of the 10 subjects with developmental prosopagnosia (Figure 10). This showed a range of median predictions, with only one subject (DP016) appearing slightly more like acquired prosopagnosia (i.e., median prediction ≥ 0.5) and the rest having median predictions below 0.5, and therefore biased more toward controls.

Figure 10. Section 3, individual results for developmental prosopagnosia. The images superimpose the first 4 fixations from all trials for each of the 10 subjects, as done for acquired prosopagnosia in Figure 5. The box plots show the distribution of predictions from the Hybrid Model for each subject’s trials using their first 4 fixations (plotting conventions as explained in Figure 5). Subjects are sorted by median prediction from the Hybrid Model. As in Figure 5, subjects on the left have fixation patterns that are closer to the model’s understanding of a control subject, and those to the right have characteristics that match the model’s understanding of the scanning of acquired prosopagnosia. All but one subject had a median predicted value lower than 0.5 (i.e., closer to controls than to acquired prosopagnosia).
How did this single-subject analysis contrast between the cohorts? For the control group, the mean single-subject prediction was 0.31 (s.d. 0.11), whereas it was 0.49 (s.d. 0.28) for acquired prosopagnosia and 0.37 (s.d. 0.10) for developmental prosopagnosia. The difference between acquired prosopagnosia and controls was significant [t(22) = 2.75, p < 0.012], whereas the difference between developmental prosopagnosia and controls was not [t(24) = 1.23, p = 0.23].
Relation to individual behavioral performance
We asked two questions. First, from the results of sections 1 and 3, are those with more severe deficits more likely to show scanpaths typical of acquired prosopagnosia? If so, there may be a negative correlation between face recognition accuracy and a scanpath predicting a higher probability—i.e., more typical of acquired prosopagnosia. We analyzed all 18 acquired and developmental prosopagnosic subjects together, using the probabilities of the classifier trained to discriminate acquired prosopagnosia from controls. Scanpath probability showed a trend to a negative correlation with their discriminative accuracy (d’) during the experiment [r = −0.42, F (1,17) = 3.55, p = 0.076], as well as a similar trend with their Cambridge Face Memory Test scores [r = −0.44, F (1,17) = 4.02, p = 0.061].
Second, we asked whether the dispersal of fixations seen in developmental prosopagnosia in section 2 was related to recognition performance. Rather than inverse correlations, we found a trend to a positive correlation with d’ in developmental prosopagnosia (r = 0.57, p = 0.064). Thus, while the developmental prosopagnosic group had more dispersed scanpaths than controls, within this group those with more dispersed scanpaths tended to recognize faces better.
Scanpath correlations with performance in control subjects
Since the face perception of healthy subjects falls on a continuum, does their recognition accuracy correlate with any of the scanning variables that distinguish prosopagnosic from control subjects? The dispersion index measured over the first 16 fixations was not related to d’ in the control group (r = 0.26, p = 0.25). We also assessed whether recognition accuracy was related to the left: right and eye: mouth indices for the first 4 fixations, which were useful features in distinguishing acquired prosopagnosia from control subjects. While there was no relation between the eye: mouth index and d’ (r = 0.007, p = 0.97), there was a trend to an inverse correlation between the left: right index and d’ (r = −0.41, p = 0.066). However, we caution that these are post-hoc analyses on a small sample and can only be regarded as preliminary explorations.
Discussion
We explored a deep learning approach for studying the face scanning patterns of prosopagnosic subjects. For acquired prosopagnosia in Section 1, the highest accuracy (80%) in classifying trials was achieved with the first four fixations, with acquired prosopagnosic trials anomalously biased toward the mouth and the eye in right space. In Section 2, optimal classification accuracy for developmental prosopagnosia was lower (69%) and required 16 fixations, with a more peripheral scattering of fixations typifying developmental prosopagnosia. Our third section showed that the scanpaths of developmental prosopagnosic subjects resembled those of controls more than those of subjects with acquired prosopagnosia.
The fixations in scanpaths likely target the most useful areas of a visual stimulus for a perceptual determination (57, 58). For faces, these may be guided by an internal schema that maps the discriminative utility of various regions of the face, creating a hierarchy for facial features (59), though this may vary with the perceptual task (12). For identity processing, the upper face and the eyes play a key role (60). Accordingly, the scanpaths of healthy subjects show a preference for the upper face and eyes when they are identifying faces (3, 6, 61). In addition, healthy subjects fixate the eye in their left hemifield preferentially (62, 63), whether the face is upright or inverted (3). Like the left visual-field superiority for face processing in tachistoscopic studies (64), a left fixation bias has been attributed to a right hemispheric dominance for face processing.
However, scanpaths can be highly variable. Facial scanpaths can show substantial and durable between-subject idiosyncrasies (27, 28), while the scanpaths used by a given subject can differ between different faces, or between different trials using the same face (7, 29), the latter sometimes as a function of task (6). Hence the facial scanpath biases seen in healthy subjects emerge primarily as tendencies in data amassed from many trials in many subjects. Likewise the differences we observed related to prosopagnosia emerged from a deep learning approach using relatively large numbers of trials from groups of subjects, rather than single cases. Our results showed that both eye/mouth and right/left biases distinguish control scanpaths from the scanpaths of acquired prosopagnosia within the first few fixations. In contrast, these biases do not discriminate between control and developmental prosopagnosic scanpaths. Rather, the centrality of fixations is the differentiating factor in this latter comparison. This recalls another finding, that an emphasis on central facial regions is a characteristic fixation pattern of healthy individuals (65).
There are few studies of face scanning in acquired prosopagnosia, reflecting the rarity of this condition. The first two found no difference between three patients and controls (30, 31). A third examined scanning while two patients identified ambiguous morphed faces (19). One patient showed a normal feature hierarchy but lacked the typical left-side bias, while the second fixated external features and the lower face more, and even showed a right-side bias. Of two other subjects making familiarity judgments, PS fixated first the mouth then the eye in left space (20), while SC looked more at external features (21). These heterogeneous results accord with our analysis of individual prosopagnosic subjects, which showed that scanning was more anomalous in some subjects than in others (Figure 5).
Of note, the presence of a hemifield defect did not determine the frequency with which a subject showed this anomalous acquired prosopagnosic pattern, nor did the location of the lesion. Hence the abnormal pattern identified by the classifier is a marker for acquired prosopagnosia in general, rather than some associated field loss or anatomic feature, and, when combined with our developmental prosopagnosic data, may show some correlation with the severity of the recognition deficit.
For developmental prosopagnosia, others found that fixations in the first 7 s were more dispersed in four subjects making familiarity judgments (22). Reduced fixation on the eyes was reported in two family members (25) and in subject K (66), who also fixated the external features more. A study of 10 subjects did not find a scanning difference between internal and external features, though reduced scanning of the internal features did correlate with the severity of the face recognition deficit (23). This study also found fewer fixations on the eyes and more on the mouth at a group level. A heatmap analysis of 12 subjects showed a preserved emphasis on the eyes but with more fixations of the nose and mouth than controls (24). Our results agree most with the two studies (22, 66) that identified a peripherally dispersed fixation pattern as characteristic of developmental prosopagnosia (Figure 7). Interestingly, this pattern may be associated with difficult perceptual conditions in healthy subjects too, who show more dispersed fixations when contrast-reversed images are used to make visual processing more challenging (67).
One of the strengths of our study was the ability to compare a data-driven approach with hypothesis-driven approaches, given that the same data were previously analyzed using the latter (26, 44). Pancaroglu et al. (44) found that most of the patients with acquired prosopagnosia retained the normal bias for the eyes, with the exceptions of BAT1 and BIOT2, though at the group level there was a lower eye/mouth fixation ratio. Lee et al. (26) found no difference in eye/mouth ratios in either acquired or developmental prosopagnosic cohorts, but a bias to more peripheral facial regions for acquired prosopagnosia subjects with occipitotemporal lesions. Notably, this peripheral shift was not found in developmental prosopagnosia, though the current study shows that it is present in the early fixations of this group. Neither of these two prior studies searched for left/right differences, as these were not hypothesized. Thus, the current approach (a) revealed biases that had been sought but difficult to find with the prior methods, (b) allowed non-predicted findings to emerge, and (c) clarified differences between acquired and developmental prosopagnosia.
One methodologic factor that may have played a key role in the current study was the identification of the optimal number of fixations to use. If fixations on an image are directed and sequenced to maximize the rate of accrual of diagnostic information, then differences between those with normal and anomalous recognition skills may be more apparent in earlier fixations. For faces, healthy subjects only require 2–5 fixations to make familiarity judgments (68, 69). These early fixations already show the characteristic normal biases favoring the left hemifield (70) and the eyes (68), as well as influences of familiarity (62). Some case reports have even asked whether prosopagnosic subjects differ from controls in just the first fixation. For acquired prosopagnosia, SB did not (30), but PS looked at the mouth the most (20). For developmental prosopagnosia, the father and son pair of LG and RG rarely fixated the eyes with their first glance (25).
Our third section asked whether developmental prosopagnosic subjects scanned faces more like control or acquired prosopagnosic subjects. At a group level the distribution of their trials resembled a slightly right-shifted version of the distribution of trials from controls (Figure 9). At an individual level, most developmental subjects had scanpaths that on average resembled more those of controls (Figure 10). However, the individual results fell along a spectrum and we cannot exclude the possibility of heterogeneity within the cohort with developmental prosopagnosia. Heterogeneity is a potential issue given that the diagnosis of developmental prosopagnosia is currently based on behavioral and statistical criteria, without definite imaging or genetic biomarkers (15). On the whole, though, the results suggest that the internal representations of faces—facial schemata—that guide scanning behavior are less disrupted in developmental prosopagnosia than in acquired prosopagnosia. Such results question how closely developmental prosopagnosia parallels the acquired form. These results are relevant to the debate as to whether developmental prosopagnosia represents simply the low end of a normal spectrum of perceptual ability, or a pathologic entity resulting from aberrant development (15). While our results do not prove the point, the fact that the facial scanpaths of developmental prosopagnosic subjects resemble control scanpaths more may be more consistent with the normal-spectrum concept.
To summarize, the current deep learning approach advanced upon the prior studies (26, 44) by showing, first, that optimal differentiation of acquired prosopagnosia from controls required only four fixations, while 16 fixations were needed for discriminating the scanpaths of developmental prosopagnosic subjects from controls. Hence acquired prosopagnosic scanpaths show anomalies that are apparent very early, with good discriminative power (AUC 80%), whereas differences in developmental prosopagnosia are more subtle, requiring more fixations to discern, with less discriminative power (AUC 69%). Second, within these early samples of fixations, the classifier detected a tendency for acquired prosopagnosic scanpaths to focus less on the eyes and more on the mouth compared to controls, a difference that had been hypothesized but difficult to show using the entire sample of fixations (26, 44). Third, it revealed a bias toward the eye on the right side of the image in acquired prosopagnosia, which had not been examined on an a priori basis in those prior studies. Fourth, it showed that these two biases are not characteristic of developmental prosopagnosia, where scanning resembled a more dispersed version of control scanpaths. In contrast, the prior study had not been able to show any difference between developmental prosopagnosic subjects and controls (26). Fifth and finally, the last section showed that developmental prosopagnosic scanpaths are more similar to those of controls than to those of subjects with acquired prosopagnosia.
There are three implications of the current results. First, the anomalous scanpaths of acquired prosopagnosia suggest disruption of a facial schema that shows where the most useful information about facial identity is located, the result being that they scan the mouth and the eyes nearly equally while controls scan the eyes more. Second, given that the right and left eyes have similar utility for identity judgments (12), the shift away from the eye on the left side of the image may be related less to distorted information processing and more to loss of the right hemispheric dominance for face identification, which is consistent with the fact that all of our acquired prosopagnosic subjects had structural lesions in the right hemisphere. Third, and consistent with the prior conclusion (26), the fact that these anomalous biases are not seen in developmental prosopagnosic subjects suggests that the latter still possess a relatively normal facial schema.
The dispersion seen in developmental prosopagnosic scanpaths has several possible explanations. One is that the facial schema is degraded and less precise, or that its use in directing attention and fixations is less accurate. Alternatively, given the trend for more dispersed scanpaths to be associated with better face discrimination in these subjects, dispersion may represent a useful strategic response, widening the field of sampling when a perceptual system faces diagnostic uncertainty. Relevant to this point, visual search tasks analyzed with linear ballistic accumulator models show that developmental prosopagnosia is characterized by normal rates of information accumulation but higher evidence thresholds for decisions about face identity (71). Difficulty in reaching that threshold may spur broader stimulus sampling, still guided by an intact facial schema, and the additional evidence obtained by this change in sampling may support more accurate face recognition in this group.
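A toy numerical illustration of the threshold argument follows; the parameter values are invented for exposition and are not fitted estimates from the study cited as reference 71.

```python
# In a ballistic accumulator, time to decision is roughly
# (threshold - expected start point) / drift rate, so raising the threshold
# while holding the drift rate fixed lengthens evidence accrual.
def mean_decision_time(drift, threshold, start_range=1.0):
    # A uniform start point on [0, start_range] has expectation start_range / 2.
    return (threshold - start_range / 2) / drift

t_control = mean_decision_time(drift=2.0, threshold=3.0)  # 1.25 time units
t_dp = mean_decision_time(drift=2.0, threshold=5.0)       # 2.25 time units
# Same accumulation rate, higher threshold: decisions take ~80% longer,
# leaving time for broader sampling of the face.
```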
The main limitation of our study is the small dataset. The performance of deep learning models improves with more training data; however, acquired prosopagnosia is a rare condition. The size of the current cohort was achieved only by recruiting across the North American continent, and it comprises most of the largest such group examined in recent times. Developmental prosopagnosia may be more common (72), and cohorts of similar size to ours have been reported in prior work (23, 24). It would be of interest to replicate our results in additional groups of developmental prosopagnosic subjects, particularly given the possibility of heterogeneity. A second caveat is that our results may be culturally specific. Our prosopagnosic and control subjects were all of white ancestry. Eye movement studies have found that, while such subjects fixate the eyes and mouth, East Asian subjects fixate the nose at the center of the face (45), and the initial fixations of Black subjects also favor the nasal region (46).
To our knowledge, this is the first use of a machine learning approach to study face scanning in prosopagnosia. Our approach identified the optimal segments of the scanpath for classification, rather than using all fixations. The end-to-end nature of deep learning eliminated the need for hand-crafted features, which is useful when group characteristics are not fully understood. The advantages of this approach became apparent when its results were compared directly with prior analyses of the same data using a traditional hypothesis-driven approach (26, 44). Whereas the prior work suggested only a modest preference for the lower face in acquired prosopagnosia, the current approach confirmed this preference, showed an additional anomalous bias toward the right side of facial images, and revealed that both biases were already apparent in the first four fixations. Furthermore, this method showed that the scanpaths of developmental prosopagnosia did not mirror those of acquired prosopagnosia but instead showed dispersal to more peripheral regions, overall resembling control scanpaths more than the anomalous scanpaths of acquired prosopagnosia.
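As a minimal sketch of this end-to-end property, the network below consumes raw scanpath images and learns its own features, so no fixation statistics need to be specified in advance. The architecture is illustrative only and is not the network used in the study.

```python
import torch
import torch.nn as nn

class ScanpathCNN(nn.Module):
    """Tiny CNN classifier over rendered scanpath images (illustrative only)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)  # assumes 32x32 input

    def forward(self, x):  # x: (batch, 1, 32, 32) rendered scanpath images
        return self.classifier(self.features(x).flatten(1))

logits = ScanpathCNN()(torch.zeros(4, 1, 32, 32))  # smoke test: shape (4, 2)
```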
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation. Requests to access these datasets should be directed to jasonbarton@shaw.ca.
Ethics statement
The studies involving humans were approved by Institutional Review Boards of the University of British Columbia and Vancouver Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
AK: Software, Investigation, Writing – review & editing, Data curation, Writing – original draft, Methodology. IO: Methodology, Validation, Writing – review & editing, Supervision, Formal analysis, Visualization. JB: Data curation, Resources, Project administration, Writing – review & editing, Conceptualization, Writing – original draft, Supervision.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by grant MOP-102567 from the Canadian Institutes of Health Research, a Natural Sciences and Engineering Research Council of Canada Discovery Grant RGPIN-2019-05554 (IO) and an Accelerator Supplement RGPAS-2019-00026 (IO), and a Canada Foundation for Innovation, John R. Evans Leaders Fund (IO). JB is supported by Canada Research Chair 950-232752 and the Marianne Koerner Chair in Brain Diseases.
Acknowledgments
We thank Raika Pancoroglu, Sherryse Corrow, and Dongho Lee for their work in obtaining the data, and Brad Duchaine for assistance with recruitment.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The author(s) declared that they were an editorial board member of Frontiers at the time of submission. This had no impact on the peer review process and the final decision.
Generative AI statement
The authors declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fneur.2025.1616509/full#supplementary-material
SUPPLEMENTARY Figure 1 | This scatterplot shows the sensitivity (d’) and bias (c) for each participant in the behavioral face recognition protocol. Black filled circles denote individuals with acquired prosopagnosia, open circles denote individuals with developmental prosopagnosia, and light gray circles denote controls.
Footnotes
2. ^Similar concerns raised for other reasons have led some to propose the unwieldy name of “prosopdysgnosia” for the developmental form (75).
References
1. Henderson, J. Human gaze control during real-world scene perception. Trends Cogn Sci. (2003) 7:498–504. doi: 10.1016/j.tics.2003.09.006
2. Itti, L, and Koch, C. A saliency-based search mechanism for overt and covert shifts of visual attention. Vis Res. (2000) 40:1489–506. doi: 10.1016/s0042-6989(99)00163-7
3. Barton, JJ, Radcliffe, N, Cherkasova, MV, Edelman, J, and Intriligator, JM. Information processing during face recognition: the effects of familiarity, inversion, and morphing on scanning fixations. Perception. (2006) 35:1089–105. doi: 10.1068/p5547
4. Friedman, A. Framing pictures: the role of knowledge in automatized encoding and memory for gist. J Exp Psychol Gen. (1979) 108:316–55. doi: 10.1037/0096-3445.108.3.316
5. Biederman, I, Mezzanotte, RJ, and Rabinowitz, JC. Scene perception: detecting and judging objects undergoing relational violations. Cognit Psychol. (1982) 14:143–77. doi: 10.1016/0010-0285(82)90007-X
6. Malcolm, GL, Lanyon, LJ, Fugard, AJ, and Barton, JJ. Scan patterns during the processing of facial expression versus identity: an exploration of task-driven and stimulus-driven effects. J Vis. (2008) 8:2. doi: 10.1167/8.8.2
7. Groner, R, Walder, F, and Groner, M. Looking at faces: local and global aspects of scanpaths In: A Gale and F Johnson, editors. Theoretical and applied aspects of eye movement research. Amsterdam: Elsevier (1984). 523–33.
8. Mannan, SK, Ruddock, KH, and Wooding, DS. Fixation sequences made during visual examination of briefly presented 2D images. Spatial Vis. (1997) 11:157–78.
9. Foulsham, T, and Underwood, G. How does the purpose of inspection influence the potency of visual saliency in scene perception? Perception. (2007) 36:1123–38. doi: 10.1068/p5659
10. Foulsham, T, Barton, JJ, Kingstone, A, Dewhurst, R, and Underwood, G. Fixation and saliency during search of natural scenes: the case of visual agnosia. Neuropsychologia. (2009) 47:1994–2003. doi: 10.1016/j.neuropsychologia.2009.03.013
11. Gosselin, F, and Schyns, P. Bubbles: a technique to reveal the use of information in recognition tasks. Vis Res. (2001) 41:2261–71. doi: 10.1016/S0042-6989(01)00097-9
12. Schyns, P, Bonnar, L, and Gosselin, F. Show me the features! Understanding recognition from the use of visual information. Psychol Sci. (2002) 13:402–9. doi: 10.1111/1467-9280.00472
13. Smith, ML, Cottrell, GW, Gosselin, F, and Schyns, PG. Transmitting and decoding facial expressions. Psychol Sci. (2005) 16:184–9. doi: 10.1111/j.0956-7976.2005.00801.x
14. Susilo, T, and Duchaine, B. Advances in developmental prosopagnosia research. Curr Opin Neurobiol. (2013) 23:423–9. doi: 10.1016/j.conb.2012.12.011
15. Barton, JJ, and Corrow, SL. The problem of being bad at faces. Neuropsychologia. (2016) 89:119–24. doi: 10.1016/j.neuropsychologia.2016.06.008
16. Davies-Thompson, J, Pancaroglu, R, and Barton, J. Acquired prosopagnosia: structural basis and processing impairments. Front Biosci. (2014) 6:159–74. doi: 10.2741/e699
17. Valentine, T. A unified account of the effects of distinctiveness, inversion and race in face recognition. Q J Exp Psychol. (1991) 43A:161–204.
18. Webster, MA, and MacLeod, DI. Visual adaptation and face perception. Philos Trans R Soc Lond Ser B Biol Sci. (2011) 366:1702–25. doi: 10.1098/rstb.2010.0360
19. Barton, J, Radcliffe, N, Cherkasova, M, and Edelman, J. Scan patterns during the processing of facial identity in prosopagnosia. Exp Brain Res. (2007) 181:199–211. doi: 10.1007/s00221-007-0923-2
20. de Xivry, J-JO, Ramon, M, Lefèvre, P, and Rossion, B. Reduced fixation on the upper area of personally familiar faces following acquired prosopagnosia. J Neuropsychol. (2008) 2:245–68. doi: 10.1348/174866407X260199
21. Stephan, BCM, and Caine, D. Aberrant pattern of scanning in prosopagnosia reflects impaired face processing. Brain Cogn. (2009) 69:262–8. doi: 10.1016/j.bandc.2008.07.015
22. Schwarzer, G, Huber, S, Gruter, M, Gruter, T, Grob, C, Hipfel, M, et al. Gaze behaviour in hereditary prosopagnosia. Psychol Res. (2007) 71:583–90. doi: 10.1007/s00426-006-0068-0
23. Bobak, AK, Parris, BA, Gregory, NJ, Bennetts, RJ, and Bate, S. Eye-movement strategies in developmental prosopagnosia and “super” face recognition. Q J Exp Psychol. (2017) 70:201–17. doi: 10.1080/17470218.2016.1161059
24. Malaspina, M, Albonico, A, Toneatto, C, and Daini, R. What do eye movements tell us about the visual perception of individuals with congenital prosopagnosia? Neuropsychology. (2017) 31:546–63. doi: 10.1037/neu0000362
25. De Luca, M, Pizzamiglio, MR, Di Vita, A, Palermo, L, Tanzilli, A, Dacquino, C, et al. First the nose, last the eyes in congenital prosopagnosia: look like your father looks. Neuropsychology. (2019) 33:855–61. doi: 10.1037/neu0000556
26. Lee, D, Corrow, SL, and Barton, JJS. The scanpaths of subjects with developmental prosopagnosia during a face memory task. Brain Sci. (2019) 9:188. doi: 10.3390/brainsci9080188
27. Mehoudar, E, Arizpe, J, Baker, CI, and Yovel, G. Faces in the eye of the beholder: unique and stable eye scanning patterns of individual observers. J Vis. (2014) 14:1–11. doi: 10.1167/14.7.6
28. Peterson, MF, and Eckstein, MP. Individual differences in eye movements during face identification reflect observer-specific optimal points of fixation. Psychol Sci. (2013) 24:1216–25. doi: 10.1177/0956797612471684
29. Miellet, S, Caldara, R, and Schyns, PG. Local Jekyll and global Hyde. Psychol Sci. (2011) 22:1518–26. doi: 10.1177/0956797611424290
30. Le, S, Raufaste, E, and Demonet, JF. Processing of normal, inverted, and scrambled faces in a patient with prosopagnosia: behavioural and eye tracking data. Brain Res Cogn Brain Res. (2003) 17:26–35. doi: 10.1016/S0926-6410(03)00077-6
31. Rizzo, M, Hurtig, R, and Damasio, A. The role of scanpaths in facial recognition and learning. Ann Neurol. (1987) 22:41–5. doi: 10.1002/ana.410220111
32. Amisha, Malik, P, Pathania, M, and Rathaur, VK. Overview of artificial intelligence in medicine. J Family Med Prim Care. (2019) 8:2328–31. doi: 10.4103/jfmpc.jfmpc_440_19
33. Abràmoff, MD, Lavin, PT, Birch, M, Shah, N, and Folk, JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit Med. (2018) 1:39. doi: 10.1038/s41746-018-0040-6
34. Gulshan, V, Peng, L, Coram, M, Stumpe, MC, Wu, D, Narayanaswamy, A, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. (2016) 316:2402–10. doi: 10.1001/jama.2016.17216
35. Bhaskaranand, M, Ramachandra, C, Bhat, S, Cuadros, J, Nittala, MG, Sadda, SR, et al. The value of automated diabetic retinopathy screening with the EyeArt system: a study of more than 100,000 consecutive encounters from people with diabetes. Diabetes Technol Therapeut. (2019) 21:635–43. doi: 10.1089/dia.2019.0164
36. Ting, DSW, Cheung, CYL, Lim, G, Tan, GSW, Quang, ND, Gan, A, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. (2017) 318:2211–23. doi: 10.1001/jama.2017.18152
37. Liu, H, Li, L, Wormstone, IM, Qiao, C, Zhang, C, Liu, P, et al. Development and validation of a deep learning system to detect glaucomatous optic neuropathy using fundus photographs. JAMA Ophthalmol. (2019) 137:1353–60. doi: 10.1001/jamaophthalmol.2019.3501
38. Li, Z, He, Y, Keel, S, Meng, W, Chang, RT, and He, M. Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology. (2018) 125:1199–206. doi: 10.1016/j.ophtha.2018.01.023
39. Milea, D, Najjar, RP, Jiang, Z, Ting, D, Vasseneix, C, Xu, X, et al. Artificial intelligence to detect papilledema from ocular fundus photographs. N Engl J Med. (2020) 382:1687–95. doi: 10.1056/NEJMoa1917130
40. Delavari, P, Ozturan, G, Yuan, L, Yilmaz, O, and Oruc, I. Artificial intelligence, explainability, and the scientific method: a proof-of-concept study on novel retinal biomarker discovery. PNAS Nexus. (2023) 2:pgad290. doi: 10.1093/pnasnexus/pgad290
41. Chang, Z, Chen, Z, Stephen, CD, Schmahmann, JD, Wu, HT, Sapiro, G, et al. Accurate detection of cerebellar smooth pursuit eye movement abnormalities via mobile phone video and machine learning. Sci Rep. (2020) 10:18641. doi: 10.1038/s41598-020-75661-x
42. Król, ME, and Król, M. A novel machine learning analysis of eye-tracking data reveals suboptimal visual information extraction from facial stimuli in individuals with autism. Neuropsychologia. (2019) 129:397–406. doi: 10.1016/j.neuropsychologia.2019.04.022
43. Carette, R, Elbattah, M, Cilia, F, Dequen, G, Guérin, J, and Bosche, J. Learning to predict autism spectrum disorder based on the visual patterns of eye-tracking scanpaths. In: Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019). (2019). p. 103–111.
44. Pancaroglu, R, Hills, CS, Sekunova, A, Viswanathan, J, Duchaine, B, and Barton, JJ. Seeing the eyes in acquired prosopagnosia. Cortex. (2016) 81:251–65. doi: 10.1016/j.cortex.2016.04.024
45. Blais, C, Jack, RE, Scheepers, C, Fiset, D, and Caldara, R. Culture shapes how we look at faces. PLoS One. (2008) 3:1–8. doi: 10.1371/journal.pone.0003022
46. Hills, PJ, and Pake, JM. Eye-tracking the own-race bias in face recognition: revealing the perceptual and socio-cognitive mechanisms. Cognition. (2013) 129:586–97. doi: 10.1016/j.cognition.2013.08.012
47. Hills, CS, Pancaroglu, R, Duchaine, B, and Barton, JJ. Word and text processing in acquired prosopagnosia. Ann Neurol. (2015) 78:258–71. doi: 10.1002/ana.24437
48. Baron-Cohen, S, Wheelwright, S, Skinner, R, Martin, J, and Clubley, E. The autism-spectrum quotient (AQ): evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. J Autism Dev Disord. (2001) 31:5–17. doi: 10.1023/a:1005653411471
49. Barton, JJ, Cherkasova, MV, Hefter, R, Cox, TA, O'Connor, M, and Manoach, DS. Are patients with social developmental disorders prosopagnosic? Perceptual heterogeneity in the Asperger and socio-emotional processing disorders. Brain. (2004) 127:1706–16. doi: 10.1093/brain/awh194
50. Kamensek, T, Susilo, T, Iarocci, G, and Oruc, I. Are people with autism prosopagnosic? Autism Res. (2023) 16:2100–9.
51. Duchaine, B, and Nakayama, K. The Cambridge face memory test: results for neurologically intact individuals and an investigation of its validity using inverted face stimuli and prosopagnosic participants. Neuropsychologia. (2006) 44:576–85. doi: 10.1016/j.neuropsychologia.2005.07.001
52. Warrington, E. Warrington recognition memory test. Los Angeles: Western Psychological Services (1984).
53. Shah, P, Gaule, A, Sowden, S, Bird, G, and Cook, R. The 20-item prosopagnosia index (PI20): a self-report instrument for identifying developmental prosopagnosia. R Soc Open Sci. (2015) 2:140343. doi: 10.1098/rsos.140343
54. Duchaine, B, and Nakayama, K. Dissociations of face and object recognition in developmental prosopagnosia. J Cogn Neurosci. (2005) 17:249–61. doi: 10.1162/0898929053124857
55. Davies-Thompson, J, Fletcher, K, Hills, C, Pancaroglu, R, Corrow, SL, and Barton, JJ. Perceptual learning of faces: a rehabilitative study of acquired prosopagnosia. J Cogn Neurosci. (2017) 29:573–91. doi: 10.1162/jocn_a_01063
56. Corrow, SL, Davies-Thompson, J, Fletcher, K, Hills, C, Corrow, JC, and Barton, JJS. Training face perception in developmental prosopagnosia through perceptual learning. Neuropsychologia. (2019) 134:107196. doi: 10.1016/j.neuropsychologia.2019.107196
57. Deco, G, and Schurmann, B. A neuro-cognitive visual system for object recognition based on testing of interactive attentional top-down hypotheses. Perception. (2000) 29:1249–64. doi: 10.1068/p3010
58. Rybak, I, Gusakova, V, Golovan, A, Podladchikova, L, and Shevtsova, N. A model of attention-guided visual perception and recognition. Vis Res. (1998) 38:2387–400. doi: 10.1016/s0042-6989(98)00020-0
59. Shepherd, J, Davies, G, and Ellis, H. Studies of cue saliency In: G Davies, H Ellis, and J Shepherd, editors. Perceiving and remembering faces. London: Academic Press (1981). 105–31.
60. Issa, EB, and DiCarlo, JJ. Precedence of the eye region in neural processing of faces. J Neurosci. (2012) 32:16666–82. doi: 10.1523/JNEUROSCI.2391-12.2012
61. Henderson, JM, Williams, CC, and Falk, RJ. Eye movements are functional during face learning. Mem Cogn. (2005) 33:98–106. doi: 10.3758/BF03195300
62. Althoff, RR, and Cohen, NJ. Eye-movement-based memory effect: a reprocessing effect in face perception. J Exp Psychol Learn Mem Cogn. (1999) 25:997–1010. doi: 10.1037/0278-7393.25.4.997
63. Butler, S, Gilchrist, ID, Burt, DM, Perrett, DI, Jones, E, and Harvey, M. Are the perceptual biases found in chimeric face processing reflected in eye-movement patterns? Neuropsychologia. (2005) 43:52–9. doi: 10.1016/j.neuropsychologia.2004.06.005
64. Gilbert, C, and Bakan, P. Visual asymmetry in perception of faces. Neuropsychologia. (1973) 11:355–62. doi: 10.1016/0028-3932(73)90049-3
65. Winograd, E, Davies, G, Ellis, H, and Shepherd, J. Perceiving and remembering faces. Am J Psychol. (1983) 96:151–4. doi: 10.2307/1422222
66. Schmalzl, L, Palermo, R, Green, M, Brunsdon, R, and Coltheart, M. Training of familiar face recognition and visual scan paths for faces in a child with congenital prosopagnosia. Cogn Neuropsychol. (2008) 25:704–29. doi: 10.1080/02643290802299350
67. Luria, S, and Strauss, M. Comparison of eye movements over faces in photographic positives and negatives. Perception. (1978) 7:349–58. doi: 10.1068/p070349
68. Cook, M. Eye movements during recognition of faces In: MM Gruneberg, PE Morris, and RN Sykes, editors. Practical aspects of memory. New York: Academic Press (1978). 286–92.
69. Hsiao, JH-W, and Cottrell, G. Two fixations suffice in face recognition. Psychol Sci. (2008) 19:998–1006. doi: 10.1111/j.1467-9280.2008.02191.x
70. Mertens, I, Siegmund, H, and Grusser, OJ. Gaze motor asymmetries in the perception of faces during a memory task. Neuropsychologia. (1993) 31:989–98. doi: 10.1016/0028-3932(93)90154-r
71. Djouab, S, Albonico, A, Yeung, SC, Malaspina, M, Mogard, A, Wahlberg, R, et al. Search for face identity or expression: set-size effects in developmental prosopagnosia. J Cogn Neurosci. (2020) 32:889–905. doi: 10.1162/jocn_a_01519
72. Bowles, DC, McKone, E, Dawel, A, Duchaine, B, Palermo, R, Schmalzl, L, et al. Diagnosing prosopagnosia: effects of ageing, sex, and participant-stimulus ethnic match on the Cambridge face memory test and Cambridge face perception test. Cogn Neuropsychol. (2009) 26:423–55. doi: 10.1080/02643290903343149
73. Caldara, R, Schyns, P, Mayer, E, Smith, M, Gosselin, F, and Rossion, B. Does prosopagnosia take the eyes out of face representations? Evidence for a defect in representing diagnostic facial information following brain damage. J Cogn Neurosci. (2005) 17:1652–66. doi: 10.1162/089892905774597254
74. Heisz, JJ, and Shore, DI. More efficient scanning for familiar faces. J Vis. (2008) 8:9. doi: 10.1167/8.1.9
75. Rossion, B. Prosopdysgnosia? What could it tell us about the neural organization of face and object recognition? Cogn Neuropsychol. (2018) 35:98–101. doi: 10.1080/02643294.2017.1414778
76. Lundqvist, D, Flykt, A, and Öhman, A. The Karolinska Directed Emotional Faces - KDEF, CD ROM from Department of Clinical Neuroscience, Psychology section, Karolinska Institutet. (1998).
Appendix. Features used in the baseline model
Features added based on previous literature
Features added based on statistical analysis
Features added based on our assumptions
Keywords: face recognition, scanpath, artificial intelligence, neural network, developmental
Citation: Kazemian A, Oruc I and Barton JJS (2025) Scanning faces: a deep learning approach to studying eye movements in prosopagnosia. Front. Neurol. 16:1616509. doi: 10.3389/fneur.2025.1616509
Edited by:
Panitha Jindahra, Mahidol University, Thailand
Reviewed by:
Alex P. Leff, University College London, United Kingdom
Gordon Plant, University College London, United Kingdom
Copyright © 2025 Kazemian, Oruc and Barton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jason J. S. Barton, jasonbarton@shaw.ca