Holistic Face Categorization in Higher Order Visual Areas of the Normal and Prosopagnosic Brain: Toward a Non-Hierarchical View of Face Perception

Rossion, Bruno; Dricot, Laurence; Goebel, Rainer; Busigny, Thomas

doi:10.3389/fnhum.2010.00225

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 10 January 2011

Sec. Sensory Neuroscience

Volume 4 - 2010 | https://doi.org/10.3389/fnhum.2010.00225

Bruno Rossion¹,²*

Laurence Dricot²

Rainer Goebel³

Thomas Busigny¹

¹ Institute of Research in Psychology, University of Louvain, Louvain-la-Neuve, Belgium
² Institute of Neuroscience, University of Louvain, Louvain-la-Neuve, Belgium
³ Maastricht University and Brain Imaging Center, University of Maastricht, Maastricht, Netherlands

How a visual stimulus is initially categorized as a face in a network of human brain areas remains largely unclear. Hierarchical neuro-computational models of face perception assume that the visual stimulus is first decomposed into local parts in lower order visual areas. These parts would then be combined into a global representation in higher order face-sensitive areas of the occipito-temporal cortex. Here we tested this view in fMRI with visual stimuli that are categorized as faces based on their global configuration rather than their local parts (two-tone Mooney figures and Arcimboldo’s facelike paintings). Compared to the same inverted visual stimuli that are not categorized as faces, these stimuli activated the right middle fusiform gyrus (“Fusiform face area”) and superior temporal sulcus (pSTS), with no significant activation in the posteriorly located inferior occipital gyrus (i.e., no “occipital face area”). This observation is strengthened by behavioral and neural evidence for normal face categorization of these stimuli in a brain-damaged prosopagnosic patient whose intact right middle fusiform gyrus and superior temporal sulcus are devoid of any potential face-sensitive inputs from the lesioned right inferior occipital cortex. Together, these observations indicate that face-preferential activation may emerge in higher order visual areas of the right hemisphere without any face-preferential inputs from lower order visual areas, supporting a non-hierarchical view of face perception in the visual cortex.

Introduction

The human brain can detect a face in a visual scene in a fraction of a second (e.g., Lewis and Edmonds, 2003; Rousselet et al., 2003; Fei-Fei et al., 2007; Crouzet et al., 2010), yet the neural mechanisms underlying the initial categorization of a visual stimulus as a face remain largely unclear. Neuroimaging studies have identified a set of areas in the human visual cortex that respond significantly more to pictures of faces than to other object shapes (Sergent et al., 1992), thus potentially playing an important role in the categorization of a visual stimulus as a face. These areas, as identified in functional magnetic resonance imaging (fMRI), are a few square millimeters in size and are located outside of well-defined retinotopic visual cortex (Halgren et al., 1999), in the lateral part of the inferior occipital lobe (“Occipital Face Area,” OFA, e.g., Gauthier et al., 2000), more anteriorly in the middle fusiform gyrus (the “Fusiform Face Area,” FFA, e.g., Kanwisher et al., 1997) and in the posterior part of the superior temporal sulcus (pSTS, e.g., Puce et al., 1998). They are bilateral but with a much stronger face-sensitive response in the right than in the left hemisphere (e.g., Sergent et al., 1992). These three areas are considered to form the core section of an extensive network of cortical areas that are particularly sensitive to faces (Haxby et al., 2000; Ishai, 2008; Fox et al., 2009; Weiner and Grill-Spector, 2010), and which can also be identified in the non-human primate brain (Tsao et al., 2008).

An important question to clarify is how the processes carried out within this set of face-sensitive areas lead to the initial perception of a visual stimulus as a face.

According to one view, face stimuli are processed through feedforward hierarchical stages in the visual system, starting with the extraction of simple facial parts (eyes, mouth, nose, …) in lower order visual areas of the occipital cortex. These parts would then be combined to form a more global face representation in higher order face-sensitive areas of the occipito-temporal pathway. This “local-to-global” view is inspired by the feedforward hierarchical view of the visual system (Hubel and Wiesel, 1962; Felleman and Van Essen, 1991), and influential computational models of object recognition (Marr and Vaina, 1982; Biederman, 1987; Riesenhuber and Poggio, 1999; Ullman, 2007) postulating an initial decomposition of the visual stimulus into parts and the subsequent combination of these parts through several stages of increasing complexity. This view is hierarchical and feedforward in the sense that the response properties of populations of neurons in a higher order area are supposed to be constructed by the ordered arrangement of feedforward inputs from lower order areas. A similar “local-to-global” view has been endorsed by computational and theoretical accounts of face perception in the human brain (Burton, 1994; Jiang et al., 2006). More specifically, neurofunctional models of face perception derived from neuroimaging studies postulate that face-sensitive processes are initiated in the inferior occipital cortex (OFA), feed information forward to the anteriorly located middle fusiform gyrus (FFA) and pSTS, and then to the anterior section of the temporal and prefrontal cortices (Haxby et al., 2000; Fairhall and Ishai, 2007; Ishai, 2008; see also Lerner et al., 2001; Pitcher et al., 2007).

However, according to an alternative view, inspired by Gestalt Psychology (e.g., Köhler, 1947; Flavell and Draguns, 1957; see Spillman and Ehrenstein, 2004), a face stimulus could be initially represented at a global level (i.e., a whole face). This initial representation would be coarse, and would be progressively refined into a more fine-grained face representation allowing subordinate face categorization (i.e., individualization, see Sergent, 1986). At the neural level, it has been postulated that such an initial coarse categorization of the face stimulus could take place in a higher order visual area such as the FFA, particularly in the dominant right hemisphere for faces. Perceptual processes and representations that involve high-resolution details might then involve lower order visual areas such as the right OFA through reentrant connections (Rossion et al., 2003; Rossion, 2008).

This latter neural view is largely inspired from relatively recent theoretical and experimental work carried out on object recognition, in particular on how response properties in early visual areas can be shaped by neural activity in higher order visual areas (Mumford, 1992; Hupé et al., 1998; Lee et al., 1998; Lamme and Roelfsema, 2000; Bullier et al., 2001; Galuske et al., 2002; Hochstein and Ahissar, 2002; Murray et al., 2002; see also Bar, 2003).

With respect to face perception specifically, this view is supported by at least two observations about the categorization of a stimulus as a face by the human brain. First, a visual stimulus can be readily categorized as a face even if it does not contain clear elementary facial parts, its faceness being defined solely or primarily by the global organization of the elements. A classical example is provided by two-tone (thresholded, black and white) images of faces introduced in the 1950s (Mooney, 1956, 1957) to test the ability of children to form a coherent percept of shape on the basis of very little detail. These “Mooney” faces (Figure 1A) have been of great interest to psychologists and neuroscientists throughout the past half a century (e.g., Mooney, 1956, 1957; Perrett et al., 1984; Parkin and Williamson, 1987; Jeffreys, 1989; Dolan et al., 1997; George et al., 1997; Kanwisher et al., 1998; Moore and Cavanagh, 1998; Ramachandran et al., 1998; Rodriguez et al., 1999; Jemel et al., 2003; McKone, 2004; McKeeff and Tong, 2007) because of their ambiguous nature, specificity (two-tone faces seem more readily identifiable than other objects; Moore and Cavanagh, 1998) and their sudden interpretability.

FIGURE 1

Figure 1. Above. Examples of stimuli used in experiment 1, Mooney faces (http://www.princeton.edu/artofscience/gallery): (A) Upright stimuli (response = face) and (B) Inverted stimuli (response = non-face). Below. Examples of stimuli used in experiment 2, Arcimboldo faces (http://www.princeton.edu/artofscience/gallery): (C) Upright stimuli (response = face) and (D) Inverted stimuli (response = non-face). (E) The three Mooney face stimuli above have been divided into four square fragments, spatially displaced. Contrary to fragments of a face photograph (e.g., Ullman, 2007), fragments of such two-tone faces cannot be seen as facelike: the stimulus needs to be perceived as a whole to see the face.

In a Mooney image, the local parts are too ambiguous to be recognized as facelike individually, as illustrated in Figure 1E. Rather, these local parts must be disambiguated based on their context within a global configuration. Consequently, Mooney faces are said to require holistic/configural processing for successful perception (e.g., Newcombe, 1974; Parkin and Williamson, 1987; McKone, 2004): the stimulus needs to be processed as an integrated whole rather than as a collection of independent parts. Moreover, since two-tone images of novel objects do not lend themselves to volumetric interpretations, the correct perception of a Mooney stimulus appears to depend on previously stored representations in memory, or a top-down application of a 2D global face template (Cavanagh, 1991; Moore and Cavanagh, 1998; Hegdé et al., 2007; Kemelmacher-Shlizerman et al., 2008). Indeed, when a Mooney picture is presented upside-down, the face is usually not perceived (e.g., Figure 1B), presumably because the visual input cannot be disambiguated with the help of internal 2D global representations (i.e., top-down processes).

Yet another example of face perception based on global configuration rather than local parts is illustrated by the famous paintings of Giuseppe Arcimboldo (sixteenth century; Hulten, 1987), in which a face is constituted of non-face (usually organic) elements such as fruits and vegetables, animals, flowers, etc. (Figure 1C). Here, the parts can be identified relatively easily, but they correspond to non-face objects, not to elementary facial parts. Like a Mooney stimulus, an Arcimboldo painting can be categorized as a face due to the global face configuration formed by these non-face elements rather than through the identification of the elements themselves. As a matter of fact, a visual agnosic patient who cannot identify the constituent object parts may still perceive the face in these Arcimboldo paintings (Moscovitch et al., 1997), indicating that the face is perceived independently of the nature of the parts per se. Again, the face is usually not perceived when the painting is presented upside-down, an aspect that was used by Arcimboldo to make his paintings reversible (Figure 1D; Hulten, 1987).

If a face stimulus can be readily perceived despite the absence, or the reduced diagnosticity, of local facial parts in Mooney and “Arcimboldo” stimuli, this poses an important challenge to hierarchical neurofunctional and computational models of face perception. Indeed, it indicates that under certain circumstances at least, a whole face configuration can be seen without identifying the parts as being facelike.

A second observation casting doubt on a strict hierarchical scheme of face perception in the human brain is that face-sensitivity can be observed in higher order areas of the right middle fusiform gyrus (right FFA) and pSTS despite structural damage to the territory of the posteriorly located (i.e., lower order) right OFA. This observation was first made on the patient PS (Rossion et al., 2003), who presents with a severe inability to recognize and discriminate individual faces (acquired prosopagnosia; Quaglino et al., 1867/2003; Bodamer, 1947), but can nevertheless categorize a stimulus as a face (Schiltz et al., 2006). This observation of an FFA without an OFA has been replicated in several fMRI studies of the same patient (e.g., Schiltz et al., 2006; Dricot et al., 2008), which also disclosed face-preferential activation in her right pSTS (Sorger et al., 2007). A preferential activation to faces of the FFA despite a bilateral lesion of the territory of the OFA in another (prosop)agnosic patient has strengthened this finding (Steeves et al., 2006). Thus, higher order visual areas such as the FFA may show a preference for faces despite the absence of any face-sensitive response in the structurally damaged inferior occipital cortex (i.e., no OFA). These observations suggest that, in the normal brain, a preferential activation to faces in higher order visual areas such as the FFA may arise independently of putative face-sensitive inputs from the inferior occipital cortex (OFA), perhaps through direct connections from early (non-face-sensitive) visual areas (Rossion, 2008).

In the present study, we aimed at testing further this latter view of the microgenesis of face perception by combining the two sets of evidence reviewed above. That is, we recorded the behavioral and neural response(s) of normal observers and of the brain-damaged prosopagnosic patient (PS) to the presentation of Mooney and Arcimboldo face stimuli, in order to test three predictions drawn from this hypothesis.

(1) First, the response to Mooney and Arcimboldo stimuli – that are mainly or exclusively perceived as faces by means of holistic processing – should be much larger in the right FFA than the OFA (compared to the response of these two areas in a classical face localizer in fMRI).

(2) Second, provided that low-level vision is well preserved, cases of prosopagnosia who have a right FFA, such as the patient PS described above, should still be able to categorize readily a Mooney or an Arcimboldo stimulus as a face. Previous neuropsychological investigations have not clarified this issue, and it is often claimed that many prosopagnosics (if not all) have particular trouble perceiving a face in binary-tone (Mooney) images (e.g., Levine and Calvanio, 1989; Laeng and Caviness, 2001) or in Arcimboldo paintings (Harris and Aguirre, 2007). In truth, only a few exemplars of Mooney faces have been presented to acquired prosopagnosics as part of clinical neuropsychological examinations, with various recognition success rates (e.g., Sergent and Villemure, 1989; Davidoff and Landis, 1990; Young et al., 1990; Sergent and Signoret, 1992b; Steeves et al., 2006; Rivest et al., 2009), and only recently was a prosopagnosic patient tested on, and successful at, detecting a face in a few Arcimboldo paintings (Rivest et al., 2009). Hence, this issue is still largely open, and deserves to be tested more formally with a prosopagnosic patient who has well preserved low-level vision, does not present general visual integrative agnosia, and shows face-sensitive responses in high-level visual areas. The interest of testing the patient PS’ ability to perceive faces in such stimuli requiring holistic processing is also increased by recent evidence showing that this patient cannot rely on holistic processes to individualize (recognize or match/discriminate) faces (Busigny and Rossion, 2010; Ramon et al., 2010; Van Belle et al., 2010). While impairment in holistic processing for individualizing faces has also been shown in other cases of acquired prosopagnosia (e.g., Levine and Calvanio, 1989; Sergent and Villemure, 1989; Boutsen and Humphreys, 2002), a dissociation between (intact) holistic processing for face categorization and (impaired) face individualization has never been reported to our knowledge.

(3) Third, provided that the patient PS is able to detect a face in a substantial number of Mooney/Arcimboldo stimuli, we tested the hypothesis that she recruits primarily, as normal observers would, her right FFA to perform this function. This prediction follows from (1) and (2) and would provide further evidence for a non-hierarchical view of face perception in the human brain.

Materials and Methods

Participants

Patient PS

PS’ behavioral and neural profiles have been described in detail in several previous studies (e.g., Rossion et al., 2003). Briefly, PS was born in 1950 and sustained a closed head injury in 1992 that left her with extensive lesions of the left mid-ventral (mainly fusiform gyrus) and the right inferior occipital cortex. Minor damage to the left posterior cerebellum and the right middle temporal gyrus was also detected (see Sorger et al., 2007, for full information about the patient’s lesions). PS’ only continuing complaint is a profound difficulty in recognizing familiar faces, including those of her family when they are presented out of context (see Table 1 in Rossion et al., 2003 for the neuropsychological profile of the patient). This impairment in face recognition and individual face discrimination has been formally established in several behavioral studies with classical neuropsychological tests (Benton and Van Allen, 1972; Warrington Recognition Memory Test, Warrington, 1984) as well as individual face matching and recognition computer tasks (see Rossion et al., 2003; Caldara et al., 2005; Schiltz et al., 2006; Orban de Xivry et al., 2008; Busigny and Rossion, 2010). Importantly, PS does not present with any difficulty in recognizing and discriminating non-face objects, even at the subordinate level and when response times are considered (Rossion et al., 2003; Schiltz et al., 2006; Busigny et al., 2010). Her visual field is almost full (small left paracentral scotoma, see Sorger et al., 2007) and her visual acuity is good (0.8 for both eyes as tested in August 2003).

Behavioral Study

Ten healthy control participants took part in the behavioral study. They were matched to PS for gender, age (PS: 56 at time of testing; controls’ mean: 53.4, SD: 3.6), and education. None of them had a history of neurological or vascular disease, head injury, or alcohol abuse, nor did they display cognitive complaints. Two experiments (Mooney faces experiment 1 and Arcimboldo faces experiment 2) were conducted with all the participants.

fMRI Study

Besides PS, a group of seven healthy controls (S1–S7, age range 20–26, all females) performed the face localizer and the Mooney faces experiment 1. Three of these participants (S1, S4, and S5) and three additional participants (S8–S10, age range 20–26, five females) performed the Arcimboldo faces experiment 2. None of them had a history of neurological or vascular disease, head injury, or alcohol abuse, nor did they display cognitive complaints. In addition, one age-matched participant (AM, 53 years, female) performed the face localizers and experiments 1 and 2. We tested only one age-matched control to PS in fMRI for practical reasons, but also because the profile of activation in the right FFA has been shown to remain stable across decades (Brodtmann et al., 2003), as also confirmed by our previous studies (Schiltz et al., 2006; Sorger et al., 2007). Also, the participant AM’s data did not differ from those of the young control participants in this study. Both PS and the control participants gave their informed written consent prior to the behavioral and fMRI experiments. The study conformed to the Declaration of Helsinki and was approved by the Ethics Committee of the Medical Department of the University of Louvain. All participants and PS proved to be strongly right-handed according to the Edinburgh Inventory (Oldfield, 1971).

Stimuli

The stimuli used in the behavioral and fMRI Mooney faces experiments were taken from the dataset originally created by Aaron Schurger and colleagues (Art of Science Competition, Princeton University¹). These stimuli were created following the same procedure that Craig Mooney (Mooney, 1957) used in his study to explore perceptual closure – that is, the ability to form a global and coherent perceptual representation on the basis of few details. For our experiment, we selected 80 Mooney faces from Schurger’s set (Figure 1A).

The stimuli used in the behavioral and fMRI Arcimboldo experiments are inspired by the paintings of the sixteenth-century artist Giuseppe Arcimboldo (see Hulten, 1987; or for example²) and by the creations of the contemporary mosaic portrait artist Jason Mecier³. Both created works of art consisting of faces composed of non-facial elements (vegetables, fruits, animals, candies, stationery, pebbles, etc.). The pictures were downloaded from the websites and were cropped so that only the area of the face was present. Next, the pictures were homogenized to have roughly the same size and resolution. In total, 40 Arcimboldo face stimuli were created (Figure 1C).

Four categories of stimuli were used in the fMRI localizer experiment: faces, cars, and their phase-scrambled versions (scrambled faces and scrambled cars). The Face condition consisted of 43 pictures of faces (22 females) cropped so that no external parts (hair, etc.) were revealed. All the faces were shown in frontal view. They were inserted in a gray rectangle to form a rectangular image (Figure 2).

FIGURE 2

Figure 2. Functional brain areas disclosed in the face localizer experiment with faces, scrambled faces, pictures of cars and scrambled cars. The different areas (bilateral FFA, OFA, and right pSTS) are displayed below for a normal participant, and the intact part of this network is illustrated for the patient PS on top (right FFA, left OFA, right pSTS). Note the asymmetrical lesions, damaging the cortical territory of the right OFA and left FFA (for a full description, see Sorger et al., 2007).

Similarly, the Car condition consisted of 43 pictures of different cars in a full-front view, also embedded in a gray rectangle. Faces and cars were presented in color and equalized in luminance. The scrambled stimuli were made using a Fourier phase randomization procedure (see e.g., Sadr and Sinha, 2004) that yields images preserving the global low-level properties of the original image (i.e., luminance, contrast, spectral energy, etc.), while completely degrading any category-related information (Figure 2). Pictures of faces/cars and the phase-scrambled face/car pictures had equal shape, size, and contrast against the background.

Procedure

Behavioral Mooney and Arcimboldo faces experiment. The stimuli were presented using E-prime 1.1 (Schneider et al., 2002) on a 15-inch laptop display (resolution: 1024 × 768; refresh rate: 60 Hz), and subtended approximately 5.4° in height and 3.8° in width. Participants indicated a response by pressing designated keys on a keyboard. The screen background was gray (128, 128, 128). Percentages of correct responses and response times on correct trials were calculated.

The 80 selected items were presented upright (Figure 1A) and upside-down (Figure 1B) and were displayed in random order in two blocks of 80 trials. The pictures appeared on the screen one at a time, and the observers had to decide whether they could see a face in the stimulus or not by pressing one of two response keys. They were instructed that the face had to be presented in an upright orientation, but could be of any viewpoint, sex, age, etc. Each stimulus remained on the screen until the participant responded, and was followed by a central cross (300 ms) and a gray screen (300 ms).

To compare PS’ behavioral performance with that of normal participants, we used the modified t-test of Crawford and Howell (1998) for single-case studies, with a significance threshold of p < 0.05 under a one-tailed (unilateral) hypothesis. The analysis was performed with a computerized version of Crawford and Howell’s method, SINGLIMS.EXE (Crawford and Garthwaite, 2002).
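As an illustration of this single-case test, the minimal Python sketch below computes the Crawford and Howell statistic, t = (x − M) / (SD · √((n + 1)/n)) with n − 1 degrees of freedom, together with a one-tailed p-value obtained by numerical integration of the Student t density. It is not the SINGLIMS.EXE program, and the function names are hypothetical. The input values are the upright Mooney detection rates reported in the Results (PS: 73.8%; controls: mean 83.1%, SD 17.9, n = 10).

```python
import math


def crawford_howell_t(case_score, control_mean, control_sd, n_controls):
    """Return (t, df) for one case compared with a small control sample."""
    # The control SD is inflated by sqrt((n + 1) / n) because the control
    # mean and SD are sample estimates, not population parameters.
    standard_error = control_sd * math.sqrt((n_controls + 1) / n_controls)
    return (case_score - control_mean) / standard_error, n_controls - 1


def student_t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """P(T >= |t|), using Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = student_t_pdf(0.0, df) + student_t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Values taken from the behavioral Mooney results (upright stimuli).
t, df = crawford_howell_t(73.8, 83.1, 17.9, 10)
p = one_tailed_p(t, df)
print(f"t({df}) = {t:.3f}, one-tailed p = {p:.3f}")
```

The p-value is integrated numerically only to keep the sketch free of external dependencies; a statistics library would normally be used instead.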

fMRI localizer experiment. PS and normal participants performed one block-design localizer fMRI experiment aimed at defining the areas responding preferentially to faces. They viewed 24 blocks per run (18 s per block, 2 runs of 11 min) of alternating pictures of faces, scrambled faces, cars, and scrambled cars (six blocks per condition), with 9 s fixation cross epochs between the blocks. They performed a one-back identity task (two or three positives per block). Within each block, 18 stimuli were each presented for 750 ms, followed by a 250 ms black screen. All images subtended roughly 5.4° in height and 3.8° in width of visual angle and varied slightly in location in X (10%) and in Y (13%) on each trial.

fMRI Mooney faces experiment. PS and normal participants viewed three runs (10 min 13 s per run) of 40 Mooney faces in each orientation, displayed in random order. Each picture subtended roughly 5.2° in height and 3.6° in width of visual angle and varied in location in X (10%) and in Y (13%) on a gray screen (128, 128, 128). The stimuli appeared on the screen for 1750 ms, followed by a cross of 4250, 5500, or 6750 ms and a gray screen of 250 ms. This timing ensured that the onsets of distinct events were separated by at least 6–8 repetition times (TRs) to avoid the overlapping and saturation of the hemodynamic responses. Full randomization of trial order and of ISI duration further reduced any potential top-down effects of anticipation of the stimuli.

Participants were asked to press the right response key when they saw a face and the left response key (with the same hand) when they could not see a face in the stimulus (the exact instructions were the same as in the behavioral experiment above). The stimuli were displayed with a PC running E-prime 1.1 (PST Inc.) through a projector surface located over the head of the subject and viewed with an angled mirror.

fMRI Arcimboldo faces experiment. The exact same procedure as in the fMRI Mooney faces experiment was used with the 40 Arcimboldo faces (three identical runs of 10 min 13 s). Each picture subtended roughly 6.6° in height and 5.4° in width of visual angle.

Imaging parameters. Magnetic resonance images of brain activity were collected from PS and normal controls using a 3T head scanner (Siemens Allegra, Siemens AG, Erlangen, Germany), with repeated single-shot echo-planar imaging: echo time (TE) = 50 ms, flip angle (FA) = 90°, matrix size = 64 × 64, field of view (FOV) = 224 mm × 224 mm, slice order descending and interleaved, slice thickness = 3.5 mm. The other scan parameters varied over the different experiments: repetition time (TR) = 2250 ms, 36 slices for the face localizer (or TR = 1500 ms, 24 slices for four control participants); TR = 1250 ms, 20 slices for the event-related Mooney and Arcimboldo faces experiments (all participants). A three-dimensional (3D) T1-weighted data set encompassing the whole brain was acquired to provide detailed anatomy (1 mm³) using an ADNI sequence (TR = 2250 ms, TE = 2.6 ms, FA = 9°, matrix size = 256 × 256, FOV = 256 mm × 256 mm, 192 slices, slice thickness = 1 mm, no gap, total scan time = 8 min 5 s).

Data analysis of the imaging experiments. The fMRI signal in the different conditions was compared using BrainVoyager QX (Version 1.9.10, Brain Innovation, Maastricht, The Netherlands). Prior to analysis, the functional data sets were subjected to a series of preprocessing operations: linear trend removal for excluding scanner-related signal, a temporal high-pass filtering applied to remove temporal frequencies lower than three cycles per run, and a correction for small interscan head movements by a rigid body algorithm rotating and translating each functional volume in 3D space. The data were corrected for the difference between the scan times of the different slices. Data were not smoothed in the spatial domain for any of the experiments. In order to compare the locations of activated brain regions across participants, all anatomical as well as functional volumes were spatially normalized (Talairach and Tournoux, 1988), and the computed statistical maps were overlaid on the 3D T1-weighted scans in order to calculate Talairach coordinates for all relevant activation clusters. Subsequently, the functional data were analyzed using one multiple regression model (general linear model, GLM) consisting of predictors that corresponded to the particular experimental conditions of each experiment. The predictor time courses were computed on the basis of a linear model of the relation between neural activity and hemodynamic response, assuming a rectangular neural response during phases of visual stimulation (Boynton et al., 1996).
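To make the construction of such a predictor concrete, the following Python sketch convolves a rectangular (boxcar) stimulus function with a single-gamma impulse response of the kind described by Boynton et al. (1996) and samples the result once per TR. It is a simplified illustration, not BrainVoyager's implementation: the function names are hypothetical, and the HRF parameters (delay 2.5 s, time constant 1.25 s, n = 3) are assumed values that the text does not report. The stimulus duration (1.75 s) and TR (1.25 s) are those of the event-related experiments.

```python
import math


def gamma_hrf(t, delay=2.5, tau=1.25, n=3):
    """Single-gamma impulse response; delay, tau and n are assumed values."""
    s = t - delay
    if s <= 0:
        return 0.0
    return (s / tau) ** (n - 1) * math.exp(-s / tau) / (tau * math.factorial(n - 1))


def predictor(onsets_s, duration_s, n_volumes, tr_s, dt=0.05):
    """Boxcar (1 during stimulation, 0 elsewhere) convolved with the HRF.

    The convolution is approximated by a Riemann sum with step dt and
    evaluated at the acquisition time of each volume.
    """
    steps = int(round(duration_s / dt))
    values = []
    for volume in range(n_volumes):
        t_volume = volume * tr_s
        total = 0.0
        for onset in onsets_s:
            for k in range(steps):
                total += gamma_hrf(t_volume - (onset + k * dt)) * dt
        values.append(total)
    return values


# One hypothetical trial starting 5 s into a run, shown for 1.75 s.
values = predictor(onsets_s=[5.0], duration_s=1.75, n_volumes=24, tr_s=1.25)
peak_volume = values.index(max(values))
print(f"predictor peaks at volume {peak_volume} ({peak_volume * 1.25:.2f} s)")
```

In a GLM, one such time course per condition would form a column of the design matrix, and the fitted beta weight for that column estimates the response amplitude for the condition.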

Statistical analyses were carried out in several steps. First, the areas responding preferentially to faces were defined independently for PS and each individual participant in the face localizer experiment. The conjunction of the contrast [(Faces–Scrambled faces) and (Faces–Objects)] between the two face localizer runs was computed. This conservative procedure ensured that the larger activations to faces than objects identified were those consistent across the two runs, and that the voxels identified were not responding preferentially to faces because of low-level properties. The FFA was defined as all the contiguous voxels (i.e., forming a cluster) in the middle fusiform gyrus significant at [q(False discovery rate) < 0.05]. The FFA was defined separately in each hemisphere (right and left FFAs). The set of all contiguous significant voxels in the inferior occipital cortex defined the OFA, in each hemisphere separately. The same procedure was applied in the pSTS. If these areas were not found for some participants at this statistical threshold, the threshold was adjusted to less conservative values [p(uncorrected) < 0.005; then p(uncorrected) < 0.05] in order to be able to test all the areas in all participants (see Dricot et al., 2008). All information about the regions of interest and statistical thresholds used for each individual participant is provided in Table A1 of the Appendix.

Second, for each participant, including PS and AM, and every ROI, a single-subject GLM analysis was run (df = N of regressors – 1, N of volumes over the 3 runs – 1: 2, 1457) and mean beta-values were extracted for the contrast (upright Mooney faces–inverted Mooney faces). The contrast was also tested for normal control participants in a multi-subject random effect analysis (multi-subject GLM with predictors separated for each included control participant; df = 6). The same procedure was used for analyzing Experiment 2, with the exception that Arcimboldo stimuli replaced the Mooney stimuli (df = 5 in the multi-subject GLM analysis).

Third, we directly compared PS and the normal participants for the level of activation to Mooney faces in the two face areas showing a significant response for PS: right FFA and pSTS. Average percent signal change for every ROI was computed using the baseline epochs as reference for each condition. An index of the strength of the response in each ROI was computed by means of the beta weights of the GLM analysis as follows: [(upright Mooney faces − inverted Mooney faces) divided by (upright Mooney faces + inverted Mooney faces)]. The exact same index was also computed for the “correct trials” only (“face” response for upright stimuli; “non-face” response for inverted stimuli). This index was defined in each of the two ROIs, for PS and for every participant. PS’ index in each ROI was then compared to that of normal participants using the modified t-test of Crawford and Howell (1998), as described above. The same contrasts were used for analyzing Experiment 2, with the exception that Arcimboldo stimuli replaced the Mooney stimuli.
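The response index defined above is a normalized difference of two beta weights. A minimal Python sketch follows; the function name and the example beta values are hypothetical and serve only to show the arithmetic.

```python
def orientation_index(beta_upright, beta_inverted):
    """(upright - inverted) / (upright + inverted), computed on GLM beta weights."""
    denominator = beta_upright + beta_inverted
    if denominator == 0:
        # The ratio is undefined when the two betas cancel out.
        raise ValueError("index undefined: beta weights sum to zero")
    return (beta_upright - beta_inverted) / denominator


# Hypothetical beta weights for one region of interest.
index = orientation_index(beta_upright=1.5, beta_inverted=0.5)
print(index)  # 0.5: the upright response is three times the inverted response
```

When both beta weights are positive the index lies between −1 and 1, with 0 indicating no difference between orientations. If one beta is negative the ratio can leave that range, so such cases would need to be inspected separately.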

Finally, a whole-brain GLM analysis across control participants was conducted in order to determine, without a priori functional localization, the regions that were primarily involved in the categorization of the stimulus as a face (separately for the Mooney and Arcimboldo experiments). This whole-brain analysis was done also for PS, and for AM.

Results

Experiment 1. Mooney Faces

Behavioral Mooney faces experiment

PS detected 73.8% of the upright stimuli as faces, a performance in the normal range (control mean = 83.1%; SD = 17.9; t₉ = –0.495; p = 0.316; Crawford and Garthwaite, 2002). She made very few false alarms (3.8%), comparable to the normal controls (6%; SD = 9.4; t₉ = 0.223; p = 0.414; Figure 3A). PS was also as fast as the normal controls, independently of the presence of a face in the picture or not (respectively t₉ = –0.451; p = 0.331 and t₉ = –0.474; p = 0.324; Figure A1A of Appendix).
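The single-case comparisons reported here can be sketched in Python. This is a minimal sketch, assuming the standard Crawford and Howell (1998) formula, t = (x − M) / (SD × sqrt((n + 1)/n)) with n − 1 degrees of freedom, and a control sample of n = 10 (inferred from the reported t₉). The function names are ours, and the one-tailed p-value is obtained by numerical integration of the Student density rather than from a statistics library.

```python
import math


def crawford_howell_t(case_score, control_mean, control_sd, n_controls):
    """Modified t statistic comparing a single case to a small control sample."""
    standard_error = control_sd * math.sqrt((n_controls + 1) / n_controls)
    return (case_score - control_mean) / standard_error, n_controls - 1


def student_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """P(T >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    area = student_pdf(0.0, df) + student_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * student_pdf(i * h, df)
    return 0.5 - area * h / 3


# Hit rates reported for upright Mooney stimuli (PS vs. controls).
t_value, df = crawford_howell_t(73.8, 83.1, 17.9, 10)
p_value = one_tailed_p(t_value, df)
```

With the hit rates reported above, this gives t of about −0.495 with 9 degrees of freedom and a one-tailed p of about 0.316, in line with the reported statistics.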

FIGURE 3

Figure 3. (A) Behavioral results of the prosopagnosic patient PS and control participants in experiment 1 (detecting Mooney faces). (B) Results of experiment 2 (Arcimboldo faces). For both experiments, the average response times are calculated on correct trials, separately for “Face” and “Non-face” stimuli. PS’ accuracy rates are in the normal range, both for the “Face” and for the “Non-face” items. If anything, PS is even faster than control participants. Bars in the graph represent the standard errors of the mean.

Thus, the patient PS was able to detect a Mooney face stimulus efficiently and readily, a visual categorization task that is assumed to rely on the global organization of the facial elements rather than a detailed analysis of these elements. Although many other cases of acquired prosopagnosia have been tested with Mooney face stimuli, most of these tests were part of a clinical neuropsychological preliminary report (a test of visual closure) rather than a systematic experiment. Patients have been reported to either be impaired at processing Mooney face stimuli (e.g., Young et al., 1990; one case in Davidoff and Landis, 1990; Steeves et al., 2006; Laeng and Caviness, 2001) or to perform in the normal range (two cases in Sergent and Villemure, 1989; Davidoff and Landis, 1990; Sergent and Signoret, 1992a; Rivest et al., 2009). However, these investigations were never systematic. Specifically, the patient data were never compared to appropriate control data, the patients suffered from general visual impairments, they were tested in variants of the face/non-face decision task (e.g., categorizing the Mooney faces according to gender), and response times were never considered. To the best of our knowledge, the present study thus provides the first solid evidence of the preserved ability to perceive a face stimulus in ambiguous Mooney stimuli in acquired prosopagnosia.

fMRI experiment: Mooney face perception

Functional localizer approach. In the localizer scan, normal participants showed activation (see Table A1 of Appendix for details) in the right and left FFA (mean Talairach coordinates: 36.7, −46.7, −19.1 and −40.9, −46.3, −19.6 respectively), in the right and left OFA (38, −78.4, −11.8 and −34.6, −73.6, −15.6, respectively) and in the right pSTS (43.1, −44.6, 9.2)⁴. For PS, consistent with previous observations, there were significant activations at [q(False discovery rate) < 0.05] to faces in the FFA and pSTS of the right hemisphere (36, −51, −21, and 45, −56, 10 respectively). These areas were normal in size and anatomical location as previously reported for normal observers and for PS (Rossion et al., 2003; Sorger et al., 2007; Dricot et al., 2008). There was also a small significant activation in the left inferior occipital cortex for PS [left OFA, at p(uncorrected) < 0.05, −40, −73, −19], as found in some previous studies of the patient (Sorger et al., 2007). In agreement with all these previous studies of the patient PS, even at the least conservative statistical threshold (p < 0.05 uncorrected) at which these two ROIs could be identified in every normal participant, there was no evidence of significant activation around PS’ lesions which could have been considered as a right OFA and left FFA (Table 1).

TABLE 1

Table 1. Summary of the statistical significance (t-values; all ps < 0.05 in yellow) in the regions of interest defined in the functional localizer experiment for the patient PS and the control participants (random analysis).

In summary, the missing components of the network of face-preferential activation in PS’ brain (left FFA and right OFA) were areas located in structurally damaged tissue (Figure 2). As shown previously (Rossion et al., 2003), the average localization of the right OFA in normal participants falls within the right inferior occipital lesion of the patient PS (Figure A2 of Appendix). Moreover, 9 out of the 11 individually localized right OFAs in the present study fall completely or largely within the territory of PS’ right inferior occipital lesion.

In the fMRI experiment with Mooney faces, PS was again as accurate as control participants (72.8%; control group average is 69.1%, SD = 8.5%; t₆ = 0.41; p = 0.35; AM: 83%; Crawford and Garthwaite, 2002) and as fast (1384 ms, controls’ average = 1216 ms, SD = 335 ms; t₆ = 0.89; p = 0.21; AM: 1000 ms). In normal participants (random effect group analysis, Mooney upright − Mooney inverted; df = 6), we found a significant effect in the two right hemisphere higher order “face areas” (FFA, pSTS), but not in the left OFA (FFA, t = 3.33, p < 0.016; pSTS, t = 3.59, p < 0.012; left OFA, t = 1.15, p = 0.30; Table 1; Figures 4A and 5 and Figure A3A of Appendix).

FIGURE 4

Figure 4. (A) Response to upright (“face”) and inverted (“non-face”) Mooney stimuli in the right FFA and OFA for the group of normal participants in the experiment. Note that while the right FFA showed a significantly larger response to Mooney faces, as shown previously (e.g. Kanwisher et al., 1998), there was no such effect in the right OFA. (B) The same observations were made for Arcimboldo stimuli.

FIGURE 5

Figure 5. Among the face-sensitive areas that are intact in the prosopagnosic patient’s brain, only the right FFA, and the pSTS to a lesser extent, present a larger response to Mooney stimuli perceived as faces as compared to the same pictures presented upside-down (“non-faces”). Similar findings are made for control participants (S1 illustrated), PS and the age-matched control (AM). The left OFA does not show any enhanced response to the Mooney face stimuli.

The exact same results were found in the individual analysis of PS: right FFA (t = 3.873, p < 0.0001), pSTS (t = 2.751, p < 0.0060), left OFA (t = −1.201, p = 0.23), and for the age-matched participant (AM: FFA, t = 2.73, p < 0.0064; pSTS, t = 3.69, p < 0.0002; left OFA, t = −1.93, p = 0.05; Figure 4).

In the right OFA, located in a region of cortex structurally damaged in PS’ brain, control participants showed no significant advantage for upright compared to inverted Mooney faces (t = −0.28, p = 0.79; AM: t = −2.21, p = 0.03; Figure 4A). However, the contrast was also significant in the left FFA (also lesioned in PS’ brain) in the group of normal participants (t = 3.17, p < 0.019; Figure A3A of Appendix), but it was not significant for the AM participant (t = −1.17, p = 0.24).

PS and controls: upright Mooney faces recognized compared with inverted Mooney faces not recognized. The same analysis as above was performed only on trials with correct responses (contrast: upright Mooney faces recognized–inverted Mooney faces not recognized; Figures A4 and A5 of Appendix). Identical results were found, except in the left FFA of the control group and the age-matched participant, where the contrast no longer reached the significance level (left FFA: t = 1.86, p = 0.11; AM: t = 0.85, p = 0.40). Note that this observation cannot be accounted for by the reduced number of trials in the conditions of interest, since the response to upright Mooney faces is even increased in the right FFA of PS and AM in the same analysis relative to when all trials are considered (FFA in PS, t = 5.07, p < 0.000001; FFA in AM, t = 4.39, p < 0.00001; df of the individual subject GLM: 4, 1457).

Direct comparison of PS and normal participants: upright vs. inverted Mooney faces. In order to statistically compare PS’ level of activation to these stimuli in face-sensitive areas to the normal population, we calculated an index of activation for upright Mooney faces for each participant [(upright Mooney faces − inverted Mooney faces)/(upright Mooney faces + inverted Mooney faces)].

In the right FFA, the larger activation for upright Mooney faces did not differ between PS and normal participants, including AM, whether the activation index was computed on all trials (PS: 0.107) or on correct trials only (PS: 0.152) (all trials: mean = 0.069, SD = 0.023; t₇ = 1.54, p = 0.17; correct trials: mean = 0.145, SD = 0.066; t₇ = 0.10, p = 0.92; Crawford and Garthwaite, 2002; AM’s indexes: 0.049 and 0.095 respectively; Figure 6A). Similarly, in the right pSTS, there was no difference between PS (all trials: 0.134 and correct trials: 0.227) and the controls (all trials: mean = 1.021, SD = 2.703; t = −0.31, p = 0.38; correct trials: mean = 0.286, SD = 0.758; t = −0.08, p = 0.47; AM’s indexes: 1.35; 1.92).

FIGURE 6

Figure 6. (A) Indexes of the level of differential activation [(upright Mooney faces – inverted Mooney faces) divided by (Mooney faces + Mooney faces inverted)] in the right FFA for face and non-face stimuli in experiment 1 (Mooney stimuli), reported for the group of controls (right) and for each individual participant, including PS and the age-matched control. Note that the difference was larger when only the trials that were correctly categorized as faces or non-faces were considered in the analysis. (B) Indexes of the level of differential activation for face and non-face stimuli in the experiment 2 with Arcimboldo stimuli.

In summary (Table 1), independently of whether the analysis was computed on correct trials or on all trials, we found significant activations to the perception of Mooney faces primarily in the right FFA and right pSTS in the normal population. Strikingly, identical observations were made for PS. In the left inferior occipital cortex, structurally intact in PS’ brain and where a face-sensitive response (left OFA) was found in the localizer, there was no evidence for sensitivity to faces presented as Mooney stimuli, either for the normal controls or for PS. Finally, in the two regions that are structurally damaged in PS’ brain, there was little (left FFA) or no (right OFA) evidence for sensitivity to the categorization of a Mooney stimulus as a face in the normal population.

Whole-brain analysis. To strengthen our results, a whole-brain analysis was performed on PS’ data with the conjunction of the three contrasts (one per fMRI run: upright Mooney faces–inverted Mooney faces). Two clusters were found at p(uncorrected) < 0.05 in PS’ brain (Figure 7). The first one corresponds roughly to the FFA (31, −55, −17, 250 voxels). It shows a preferential response to faces over cars and scrambled faces in PS’s face localizer (F–O: t = 8.34, p < 0.000001, F–SF: t = 8.36, p < 0.000001). The other one was located near the middle temporal sulcus (46, −49, −7, 122 voxels), and does not show a preferential response to faces in the localizer (larger for both objects and scrambled faces than faces: F–O: t = −1.45, p < 0.15 and F–SF: t = −2.11, p = 0.04). The same analysis was performed with AM (Figure 8), for which we also found two clusters: 34, −52, −13 (67 voxels; face-sensitivity: F–O: t = 15.25, p < 0.000001, F–SF: t = 26.31, p < 0.000001) corresponding exactly to her right FFA, and 52 −50 −5 (60 voxels) in the middle temporal sulcus (face-sensitivity: F–O: t = 10.27, p < 0.000001, F–SF: t = 11.35, p < 0.000001; note the proximity with the pSTS).

FIGURE 7

Figure 7. Whole-brain analysis performed on the patient PS in the Mooney face categorization experiment. Two areas were found to be significant, one corresponding roughly to the patient’s right FFA, presenting face-preferential response to faces in the localizer experiment, and the other one in the middle temporal sulcus, which did not show a preferential response to faces in the localizer experiment.

FIGURE 8

Figure 8. Whole-brain analysis performed on the age-matched control (AM) in the Mooney face categorization experiment. As for the patient PS, two areas were found to be significant, one corresponding roughly to the right FFA, presenting face-preferential response to faces in the localizer experiment, and the other one in the middle temporal sulcus, which did not show a preferential response to faces in the localizer experiment.

With our seven participants considered as a group, we found only one significant cluster at p(uncorrected) < 0.05, corresponding to the Talairach coordinates of the right FFA (33, −42, −19, 344 voxels). This region showed a preferential response to faces as identified in the face localizer with the seven participants (F–O: t = 5.44, p < 0.000001, F–SF: t = 30.11, p < 0.000001). There was no significant activation in the right inferior occipital cortex, or the left hemisphere in this whole-brain analysis.

Importantly, we note that the whole-brain analysis of the face localizer in the same seven participants at the same statistical threshold [p(uncorrected) < 0.05] gave rise to a rFFA of 1124 voxels, and a rOFA of 359 voxels. The size of the rOFA was thus 32% of the size of the rFFA in the face localizer experiment. Considering this proportion, one could have expected a rOFA of about 110 voxels in the Mooney experiment (32% of the 344-voxel rFFA). Instead, there were no significant voxels in the right inferior occipital cortex above statistical threshold. The same reasoning can be made for the left FFA (494 voxels at p < 0.05; 151 voxels expected in the Mooney experiment, 0 found) and the left OFA (268 voxels at p < 0.05; 82 voxels expected, 0 found).
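The proportional reasoning in this paragraph can be checked with a few lines of Python. The sketch below uses the voxel counts reported above; the function and variable names are ours, and rounding to the nearest voxel is an assumption.

```python
def expected_cluster_size(localizer_roi, localizer_rffa, task_rffa):
    """Scale a localizer cluster by the task/localizer size ratio of the right FFA."""
    return round(localizer_roi / localizer_rffa * task_rffa)


LOCALIZER_RFFA = 1124  # voxels, face localizer, p(uncorrected) < 0.05
MOONEY_RFFA = 344      # voxels, Mooney whole-brain group analysis

# Localizer cluster sizes (voxels) for the three other face-sensitive areas.
expected = {
    "right OFA": expected_cluster_size(359, LOCALIZER_RFFA, MOONEY_RFFA),
    "left FFA": expected_cluster_size(494, LOCALIZER_RFFA, MOONEY_RFFA),
    "left OFA": expected_cluster_size(268, LOCALIZER_RFFA, MOONEY_RFFA),
}
```

The resulting values (110, 151, and 82 voxels) match the expected cluster sizes quoted in the text, against which the observed count of zero significant voxels is contrasted.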

In summary, both the functional localizer approach and the whole-brain analysis indicate that the categorization of the stimulus as a face based on the Mooney stimuli takes place primarily in face-sensitive populations of neurons in the middle fusiform gyrus (FFA) of the right hemisphere, and to a lesser extent in the right posterior part of the STS. These areas are structurally intact in PS’ brain and they show the exact same response profile in fMRI in this task as for normal observers. Overall, these observations indicate that the categorization of the visual Mooney stimulus as a face does not rely on lower order visual areas that are sensitive to segmented face photographs in the localizer (OFAs), but is supported almost exclusively by higher order areas in the right hemisphere.

Experiment 2. Arcimboldo Stimuli

Behavioral Arcimboldo faces experiment

Here PS obtained an accuracy rate of 70% on the stimuli in which a face is detectable (Figure 3B). In comparison to control participants (mean = 84.5%; SD = 8.3), this score is slightly, but not significantly, below average (t₉ = –1.662; p = 0.07; Crawford and Garthwaite, 2002). Even though the patient cannot be considered as impaired based on these results (Figure A1B of Appendix), one might argue that this indicates abnormal perception of these stimuli as compared to normal controls. However, a qualitative analysis of the errors, comparing the items failed by PS and the normal controls, indicates that the nature of the responses for PS and controls is similar: each of the stimuli that she classified incorrectly was also classified incorrectly by at least one, and often several, control participants (Figure A1C of Appendix). In fact, if we consider only the 17 items that were always detected as faces by the controls, PS obtained a score of 100% correct responses. This suggests that PS does not process these stimuli qualitatively differently than the control participants. Moreover, when considering her performance for non-face items, she is as accurate as the controls (92.5%; average = 92%; SD = 8.8%; t₉ = 0.05; p = 0.48). As for response times on correct trials, PS was again comparable to the control participants, for both face and non-face items (respectively t₉ = 0.69; p = 0.25 and t₉ = –0.21; p = 0.42). Thus, overall, PS (irrespective of the presence of a face) presents with a profile of response that is similar to the control participants (Figure A1 of Appendix).

To our knowledge, this is the first demonstration of accurate and fast perception of faces made of non-face elements such as the paintings of Arcimboldo by a patient suffering from acquired prosopagnosia. It contradicts the view that such patients are unable to see a face in an Arcimboldo painting (Harris and Aguirre, 2007). However, this is perfectly in agreement with the ability of patients with acquired prosopagnosia to categorize a stimulus as a face, as opposed to their impairment in individualizing faces. Moreover, the patient PS does not present with object perception impairments and her low-level vision is well preserved, contrary to most cases of acquired prosopagnosia with extensive lesions (e.g., Barton et al., 2004), or cases of apperceptive agnosia like DF who may indeed be unable to see the face in a painting of Arcimboldo (e.g., Steeves et al., 2006).

fMRI Arcimboldo faces experiment

Performing the same task in the scanner, PS was as accurate as the controls (60.4%, mean = 62%, SD = 8.3%; t₅ = −0.19; p = 0.43; AM: 63%; Crawford and Garthwaite, 2002). For response times on correct trials, PS (1384 ms) was also within the normal range (controls’ mean = 1271 ms, SD = 319 ms; t₅ = 0.33; p = 0.37; AM: 1028 ms).

Functional localizer analysis. PS’ data for face-sensitive areas were the same as used for the Mooney face experiment: significant activation of the right FFA and pSTS, as well as the left OFA. Mean Talairach coordinates of the ROIs for the control participants of the second experiment (S1, S4, S5, S8, S9, S10) are: 37.5, −46.0, −19.5 for the right FFA, −39.7, −45.3, −17.8 for the left FFA, 30.8, −79.7, −7.7 for the right OFA, −36.7, −70.2, −14.3 for the left OFA and −44, −50.3, 9.8 for the right pSTS (see Table A1 of Appendix for details about individual coordinates; Table 1).

PS and control participants: upright Arcimboldo faces compared with inverted Arcimboldo faces. In normal participants, we found significant effects in the two higher order “face areas” (right FFA, random effect analysis: t = 4.18, p < 0.0087; right pSTS: t = 2.60, p < 0.0484; AM: FFA: t = 4.25, p < 0.00002; pSTS: t = 3.84, p < 0.0001) but the effect was not significant in the left OFA: t = 0.99, p = 0.37 (AM: t = 0.51, p = 0.61; Figures 9 and 5B, and Figure A3B of Appendix). Identical results were found for PS, with significant effects in the right FFA (t = 5.60, p < 0.000001), in the pSTS (t = 6.02, p < 0.000001) but not in the left OFA (t = 1.31, p = 0.19; Figure 9).

FIGURE 9

Figure 9. Among the face-sensitive areas that are intact in the prosopagnosic patient’s brain, the right FFA, as well as the pSTS showed a larger response to Arcimboldo stimuli perceived as faces as compared to the same pictures presented upside-down (“non-faces”). Similar findings are made for control participants (here A1 illustrated), PS and the age-matched control (AM).

In regions structurally damaged in PS’ brain, the group of control participants showed a difference between the two conditions in the left FFA (smaller than the right FFA) but not in the right OFA (left FFA: t = 3.15, p < 0.025, Right OFA: t = 1.70, p = 0.15; for AM: left FFA: t = 2.45, p = 0.014; right OFA: t = 0.36, p = 0.72; Figure A3B of Appendix).

PS and control participants: upright Arcimboldo faces recognized compared with inverted Arcimboldo faces not recognized. For correct trials only, the results were the same for PS [significant in the FFA (t = 5.02, p < 0.000001), in the pSTS (t = 4.92, p < 0.00001) but not in the left OFA (t = 1.03, p = 0.30)] and for the group of control participants (right FFA, t = 2.99, p < 0.030; pSTS, t = 3.37, p < 0.020; left FFA, t = 2.24, p < 0.076; right OFA, t = 1.82, p = 0.13, left OFA, t = 1.03, p = 0.35). For AM, the right and left OFA also reached significance in this comparison (AM: right FFA, t = 5.54, p < 0.000001; pSTS, t = 3.09, p < 0.002061, left FFA, t = 5.568, p < 0.00001, right OFA, t = 2.77, p < 0.0057, left OFA, t = 2.83, p = 0.0046), although the BOLD signal rather showed a deactivation for the condition “inverted Arcimboldo faces not recognized” in the OFA (Figures A6 and A7 of Appendix).

In summary (Table 1), we found significant activations to the perception of Arcimboldo faces primarily in the right FFA and right pSTS in the normal population and PS. In the left OFA, structurally intact in PS’ brain, there was no evidence for sensitivity to faces presented as Arcimboldo stimuli, neither for the normal controls nor for PS. Finally, in the two regions that are structurally damaged in PS’ brain, there was little (left FFA) or no (right OFA) evidence for sensitivity to the categorization of an Arcimboldo stimulus as a face in the normal population.

Direct comparison of PS and normal participants: upright vs. inverted Arcimboldo faces indexes. In the right FFA, the larger activation for upright Arcimboldo faces did not differ between PS and normal participants, whether the index (see Materials and Methods) was computed for all trials or only for correct trials (PS: 0.114 and 0.250, respectively; controls: all trials, mean = 0.114, SD = 0.097; t₅ = 0.26, p = 0.81; correct trials, mean = 0.203, SD = 0.149; t₅ = 0.29, p = 0.78; AM: 0.088 and 0.162; Crawford and Garthwaite, 2002; Figure 6B). The same findings were made for the right pSTS (PS: all trials: 0.167 and correct trials: 0.378; controls: all trials: mean = 0.150, SD = 0.368; t₅ = 0.04, p = 0.48; correct trials: mean = 0.101, SD = 0.246; t₅ = 1.04, p = 0.17; AM: 0.78, 0.80).

Thus, the direct comparison between PS and normal observers did not reveal any significant difference in the level of activation in the right FFA or other face-sensitive area for Arcimboldo stimuli perceived as faces.

Whole-brain analysis. As for the Mooney faces experiment, a whole-brain analysis was performed on PS’ brain with the conjunction of the three contrasts (one per run: Arcimboldo faces upright − Arcimboldo faces upside-down). Four clusters were found in PS at p(uncorrected) < 0.05 (Figure A8 of Appendix): one corresponding to the Talairach coordinates of the right FFA (34, −56, −18, 507 voxels; significant in PS’ face localizer: F–O: t = 7.71, p < 0.000001 and F–SF: t = 9.63, p < 0.000001); two in the right dorsolateral prefrontal cortex (41, 31, 29, 136 voxels and 41, 18, 29, 161 voxels), not responding preferentially to faces (F–O not significant: t = 0.51, p = 0.61 and t = 0.90, p = 0.37, respectively). Finally, there was also one cluster in the left lingual gyrus (−20, −62, −15, 192 voxels), close to the V4/V8 region found previously (−23, −70, −15, Sorger et al., 2007) and responding more to faces and scrambled faces than objects and scrambled objects (t = 5.20, p < 0.000001 and F–SF not significant: t = −0.34, p < 0.73) in PS’ face localizer. The same analysis was performed with AM (Figure A9 of Appendix), in whom we also found four clusters: one large cluster, corresponding to the Talairach coordinates of the right FFA (34, −50, −15, 276 voxels, significant in AM’s face localizer: F–O: t = 11.53, p < 0.000001 and F–SF: t = 22.61, p < 0.000001), one small cluster located close to the left FFA (−41, −44, −13, 76 voxels, face-sensitivity in the localizer: F–O: t = 6.36, p < 0.000001 and F–SF: t = 10.88, p < 0.000001), and one probably in the lateral occipital complex but overlapping the right OFA (LO, e.g., Malach et al., 1995; 35, −77, −6, 53 voxels, object-sensitivity: t = 15.04, p < 0.000001 and face-sensitivity: F–O, t = 4.12, p < 0.00004). There was also significant activation in the postcentral gyrus (−55, −25, 38, 69 voxels, no face-sensitivity: F–O: t = 0.87, p = 0.38).
When considering our six control participants as a group, we found only three significant clusters, one corresponding to the Talairach coordinates of the right FFA (32, −58, −14, 120 voxels; 6 participants’ face localizer, F–O: t = 2.38, p < 0.017230, F–SF: t = 15.717, p < 0.000001) and two belonging most probably to the lateral occipital complex [LO, e.g., Malach et al., 1995: −30, −78, −16, with 65 voxels, corresponding to the left ventral posterior part of LO (six participants’ face localizer, F–O: t = 1.60, p = 0.11, O-SO: t = 7.43, p < 0.000001) and one 28, −41, −18 (52 voxels) to the right ventral anterior part of LO (six participants’ face localizer, F–O: t = 1.07, p = 0.28, O-SO: t = 17.09, p < 0.000001)].

To summarize, the whole-brain analysis confirmed the dominant role of the right FFA in categorizing Arcimboldo stimuli as faces. This region was the most significant in all comparisons, independently of whether all trials or correct trials only were considered, both for the normal control participants and for the patient PS. It was also the only face-sensitive region that responded to Arcimboldo faces in all participants. Other areas, such as the right pSTS, were involved, but to a lesser extent. This observation is strengthened by the comparison of the whole-brain analysis of the face localizer in the same six participants at the same statistical threshold [p(uncorrected) < 0.05], which gave rise to a rFFA of 373 voxels, with a rOFA of 28 voxels. The size of the rOFA was thus only 8% of the size of the rFFA in the face localizer experiment with these participants. Considering that the size of the rFFA was even larger in the Arcimboldo experiment at this threshold (507 voxels), one could have expected to disclose a rOFA of at least 38 voxels. Instead, there was no voxel in the right inferior occipital cortex above statistical threshold. The same reasoning can be made for the left FFA (115 voxels in the localizer at p < 0.05, 156 voxels expected, 0 found) and the left OFA (in the localizer at p < 0.05, 35 voxels expected, 0 found).

Considering experiments 1 and 2 altogether, it appears that holistic face perception is subtended first and foremost by higher order face-sensitive areas of the right hemisphere, in particular the right FFA, rather than by face-sensitive lower-level visual areas such as the rOFA, both in the normal brain and for the patient PS. To fully support this claim, we also tested directly for the interaction between the two main areas of interest identified in all normal brains (rFFA, rOFA) and the conditions (upright, inverted Mooney or Arcimboldo faces), taking the individual beta weights of the GLM analysis in a 2 × 2 ANOVA model for repeated measures. For Mooney faces, we found a significant interaction [F(1, 6) = 10.80, p = 0.017] between the two factors, reflecting the significant difference in the FFA (post hoc t-test: p = 0.016) but not in the OFA (p = 0.8). For Arcimboldo faces, there was also a significant interaction [F(1, 5) = 10.99, p = 0.021] between the two factors, reflecting the significant difference in the FFA (post hoc t-test: p = 0.008) which failed to reach significance in the OFA (p = 0.16).
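For a 2 × 2 repeated-measures design, the interaction F with (1, n − 1) degrees of freedom equals the square of a one-sample t statistic computed on each participant's difference of differences. The Python sketch below illustrates this equivalence; the beta weights are invented for illustration and the function name is ours, so it does not reproduce the reported analysis.

```python
import math
import statistics


def interaction_f(ffa_up, ffa_inv, ofa_up, ofa_inv):
    """Region x orientation interaction F(1, n - 1) for a 2 x 2 within-subject design."""
    # Per-participant contrast: (FFA upright - inverted) - (OFA upright - inverted).
    contrast = [(fu - fi) - (ou - oi)
                for fu, fi, ou, oi in zip(ffa_up, ffa_inv, ofa_up, ofa_inv)]
    n = len(contrast)
    t = statistics.mean(contrast) / (statistics.stdev(contrast) / math.sqrt(n))
    return t * t, (1, n - 1)


# Invented beta weights for three participants (illustration only).
f_value, dfs = interaction_f(
    ffa_up=[3.0, 4.0, 5.0], ffa_inv=[1.0, 1.0, 1.0],
    ofa_up=[2.0, 2.0, 2.0], ofa_inv=[1.0, 1.0, 1.0],
)
```

With seven control participants this construction gives the (1, 6) degrees of freedom reported for the Mooney experiment, and with six participants the (1, 5) reported for the Arcimboldo experiment.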

Finally, across all face-sensitive regions, we also observed a lateralization index (see Materials and Methods) of 58% of the significant face-sensitive voxels in the right hemisphere in the functional face localizer, which increased up to 100% for the Mooney face stimuli (experiment 1) and 73% for the Arcimboldo face stimuli (experiment 2).

Discussion

Holistic Coarse-to-Fine Perception of Faces in the Normal Brain?

Several neuroimaging studies of the normal brain have shown that the middle fusiform gyrus, in particular the pre-localized right FFA, responds more to two-tone Mooney stimuli when they are perceived as faces (Dolan et al., 1997; Kanwisher et al., 1998; Andrews and Schluppeck, 2004; McKeeff and Tong, 2007). This observation has generally been taken as evidence for a role of the FFA in perceptual awareness about the faceness of the stimulus. Here we showed that compared to full photographs of faces, such Mooney stimuli increase right lateralization in the normal brain, and do not elicit face-preferential responses in lower order visual areas of the inferior occipital cortex, including a functionally localized OFA. Remarkably similar observations are made with another kind of stimuli perceived as faces through their global configuration: Arcimboldo’s faces, which were used for the first time to our knowledge in neuroimaging here.

These observations are difficult to reconcile with the conventional view that the processing of faces must go through hierarchical stages, with the extraction of facial parts in the OFA, leading to the subsequent perception of whole faces in the FFA (Haxby et al., 2000; Jiang et al., 2006; Fairhall and Ishai, 2007; Ishai, 2008; see also Lerner et al., 2001 and Pitcher et al., 2007⁵). Rather, they indicate that face-preference can arise in higher order visual areas, leading to an FFA, independently of putative face-preferential inputs from the OFA. One possibility would be that there are direct connections from retinotopic visual areas to anterior visual areas such as the middle fusiform gyrus, perhaps through the inferior longitudinal fasciculus (ILF, see Catani et al., 2003; Thomas et al., 2009). This view is in agreement with the evidence of direct projections from V1 to V4 and from V2 to the posterior part of the infero-temporal cortex in the monkey brain (TEO; Nakamura et al., 1993). In humans, diffusion tensor imaging (DTI) studies have not yet reported direct anatomical connectivity between V1 and the FFA in the majority of brain connectivity patterns tested, but direct connections between early visual areas such as V3 and V3a and the FFA have been reported (Kim et al., 2006), suggesting that early visual areas can send non-category-related visual information that may be interpreted as facelike in higher order visual areas. Another possibility would be that in the normal brain, there is a first pass of information in the inferior occipital cortex that does not elicit face-preferential responses (no OFA) before reaching the middle fusiform gyrus. However, FFA activation in PS’ brain does not favor this interpretation (see below).

What would then be the function of the OFA when it is activated by normal face stimuli? One hypothesis is that it contributes to face perception following – rather than preceding – the initial categorization of the stimulus as a face in higher order visual areas (right FFA and pSTS; Rossion et al., 2003; Rossion, 2008). According to this view, neurons in the OFA, presenting a smaller receptive field, might be useful to extract finer-grained information from the visual stimulus, for instance to identify the particular face components (e.g., eyes, nose, mouth, etc.). Several arguments support this view.

First, an initial global representation of the face might depend on neurons that are sensitive to the entire visual stimulus, i.e., neurons located quite high in the visual hierarchy and with a large receptive field (Desimone et al., 1984; Tsunoda et al., 2001). Previous neuroimaging studies have shown that the FFA (also called mFus or pFus in some studies) is more sensitive to progressive image scrambling than the OFA (also referred to as the face-sensitive responses in the LOC), suggesting that the FFA represents faces at a more global level than the OFA (Lerner et al., 2001; see also Grill-Spector et al., 1998). Moreover, a larger response for central as opposed to peripheral visual stimuli is found in both the OFA and FFA, but this difference is much smaller in the FFA (Levy et al., 2001).

Second, according to a number of authors, processing in the visual system is thought to follow a coarse-to-fine sequence, with the coarse structure of the stimulus, carried by low spatial frequency (LSF) channels, being processed before the fine local details transmitted by high SF (e.g., Flavell and Draguns, 1957; Ginsburg, 1978; Watt, 1987; Schyns and Oliva, 1994; Hugues et al., 1996; Parker and Costen, 1999; Loftus and Harley, 2004). Because of the initial availability of LSF, the early visual representation would be that of the global structure of the face stimulus, this coarse frame being refined over time with the slower accumulation of higher spatial frequencies (Sergent, 1986). A recent fMRI study supports this view, showing that the rFFA, and to a lesser extent the right pSTS, respond preferentially to LSF faces in early stages of face processing (i.e., until 75 ms of exposure duration) as compared to higher SFs (Goffaux et al., in press). Moreover, in that study, the response to finer-grained face information, i.e., high SF, became more significant over time in the bilateral FFAs and in the right OFA, providing further support for the view advocated here.

Third, many authors have suggested that, rather than initiating the process of object/face categorization, lower order visual areas (such as the OFA) may instead be contacted through feedback from higher order visual areas (such as the FFA) to perform processes that involve high-resolution details, fine geometry and spatial precision (Mumford, 1992; Hupé et al., 1998; Lee et al., 1998; Lamme and Roelfsema, 2000; Bullier et al., 2001; Galuske et al., 2002; Bar, 2003). This view is consistent with the presence of massive cortical bi-directional connections (Felleman and Van Essen, 1991) and the hypothesis of reentrant phasic signaling between areas of the visual cortex (Edelman, 1978, 1993). It is also in line with an influential theoretical framework of visual perception, the reverse hierarchy theory (RHT; Hochstein and Ahissar, 2002), according to which explicit perception begins at high areas of the visual cortex, representing “the gist of the scene,” or an object at the basic level⁶. The details are not represented at this stage, and the representation is then refined by recruiting, through feedback connections, lower order visual areas whose neurons have smaller receptive fields.

Finally, we have to acknowledge that the inferences made here about the neurofunctional organization of face categorization are based only on the magnitude of signal in the areas of interest and in the whole brain. That is, we found both the FFA and OFA (bilaterally) in the face localizer in our normal observers’ brains, while only the right FFA was found for Mooney and Arcimboldo stimuli. However, one cannot exclude that face-sensitivity to such stimuli in the inferior occipital cortex is present at a smaller spatial scale than a whole region of interest. More precisely, the pattern of response across voxels in the inferior occipital cortex, rather than the absolute magnitude of signal, may differ between perceived and non-perceived Mooney faces (see Hsieh et al., 2010). This hypothesis could be explored in future studies by multivariate pattern analysis methods (e.g., Haxby et al., 2001; O’Toole et al., 2007; Mur et al., 2009; Peelen et al., 2009)⁷. Nevertheless, no effect of upright vs. inverted Mooney faces (which in most cases translates to the perception of either a face or a non-face) was found in the traditionally defined OFA, or in its vicinity (in the full brain analysis), suggesting that it is not a necessary stage of face perception in general.
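The distinction drawn above between the magnitude of signal in a region and the pattern of response across its voxels can be illustrated with a minimal simulation. The sketch below is not an analysis of the present data: the voxel templates, the noise level, and the split-half correlation comparison (in the spirit of the correlation-based approach of Haxby et al., 2001) are all assumptions chosen for illustration. It simulates two conditions whose mean response across voxels is identical, so that a region-of-interest contrast shows no difference, while the spatial patterns still separate the conditions.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def simulate_run(template, noise_sd, rng):
    """One noisy measurement of a voxel pattern (hypothetical data)."""
    return [v + rng.gauss(0.0, noise_sd) for v in template]

rng = random.Random(0)
n_voxels = 200

# Hypothetical templates: both have a mean of zero across voxels,
# but their spatial patterns are different (here, opposite in sign).
face_template = [1.0 if i % 2 == 0 else -1.0 for i in range(n_voxels)]
nonface_template = [-v for v in face_template]

# Two independent halves of the data for each condition.
face_a = simulate_run(face_template, 1.0, rng)
face_b = simulate_run(face_template, 1.0, rng)
nonface_b = simulate_run(nonface_template, 1.0, rng)
nonface_a = simulate_run(nonface_template, 1.0, rng)

# Univariate view: mean signal over the whole region, per condition.
univariate_diff = mean(face_a + face_b) - mean(nonface_a + nonface_b)

# Multivariate view: is a pattern more similar to the same condition
# (within) than to the other condition (between) in the other half?
within_r = pearson(face_a, face_b)
between_r = pearson(face_a, nonface_b)

print("mean signal difference:", round(univariate_diff, 3))
print("within-condition r:", round(within_r, 3))
print("between-condition r:", round(between_r, 3))
```

In this toy case the mean signal difference stays close to zero while the within-condition correlation is clearly higher than the between-condition correlation, which is the kind of dissociation that a multivariate pattern analysis could detect and a magnitude-based contrast could not.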

A Neural Locus for the Right Hemisphere Dominance in Holistic Face Perception

The right hemispheric dominance for unfamiliar face perception is now a well-established fact. Acquired prosopagnosia follows either bilateral or right unilateral occipito-temporal lesions (Hecaen and Anguelergues, 1962; Landis et al., 1988; Bouvier and Engel, 2006; Sorger et al., 2007), and multiple sources of evidence, ranging from divided visual field studies (Levy et al., 1972; Parkin and Williamson, 1987; Hillger and Koenig, 1991), neuroimaging (e.g., Sergent et al., 1992; Kanwisher et al., 1997; McCarthy et al., 1997), and event-related potentials (ERPs, e.g., N170; Bentin et al., 1996) to single-cell recordings in the non-human primate brain (Perrett et al., 1988), have supported the dominant role of the right posterior visual areas in processing face stimuli (see also Zangenehpour and Chaudhuri, 2005). Here we observed enhanced right lateralization when perceiving faces defined on the basis of their global configuration. This observation adds further support, and a neural locus, to classical evidence of a right hemispheric advantage at perceiving Mooney face stimuli from lesion studies (Lansdell, 1970; Newcombe, 1974) and divided visual field experiments (Parkin and Williamson, 1987). More generally, it offers strong support to the long-standing view that the right hemisphere dominance in processing faces is directly related to global/holistic perception (Levy et al., 1972; Ellis, 1983; Sergent, 1988; de Schonen and Mathivet, 1989; Sergent and Villemure, 1989; Hillger and Koenig, 1991).

Normal Holistic Perception of a Generic Face in Acquired Prosopagnosia

We found that acquired prosopagnosia does not necessarily prevent holistic face perception. The patient PS was not only able to see the faces adequately in Mooney and Arcimboldo stimuli, but her pattern of responses looked perfectly normal, even when considering response speed.

To the best of our knowledge, the present study provides the first solid evidence of the preserved ability to perceive a face stimulus in ambiguous Mooney and Arcimboldo figures in acquired prosopagnosia. We believe that this observation is particularly interesting because there is now wide evidence that the same patient PS is impaired at holistic processing of individual faces. For instance, she does not show any advantage at matching upright over inverted faces (Busigny and Rossion, 2010). Also, when she has to match specific parts of individual faces, her performance is not influenced by the identity of the other face parts (lack of whole-part advantage and composite face effect, Ramon et al., 2010). Eye gaze fixation and performance during gaze-contingency also demonstrate a lack of holistic perception when the patient has to individualize faces (Orban de Xivry et al., 2008; Van Belle et al., 2010). Taken together with the present evidence for normal holistic (generic) face perception, these observations indicate that there may be two stages of holistic face perception: one that allows categorizing a face stimulus at the basic level (“it’s a face”), a necessary stage of processing that is preserved for PS, and a second stage that allows the extraction of a more fine-grained representation of the whole individual face, a process that is impaired for PS.

Holistic Face Perception in Normal and Damaged Neural Circuits

The prosopagnosic patient showed a normal level of activation to Mooney/Arcimboldo faces in the exact same brain areas as normal observers, primarily the right FFA. This observation reinforces the view that the patient perceives a generic face holistically, just like normal viewers. Moreover, the fact that she performed as well as normal participants in this study, without any possible contribution of a right OFA and a left OFA, reinforces the view that these areas are not critically involved in holistic (generic) face perception.

We note that this last observation can be related to a previous study in which a normal FFA response was found in a case of prosopagnosia during perception of a face in a biased vase/face illusion (Hasson et al., 2001, 2003). However, that study differs from the present one in two respects. First, the patient tested by Hasson et al. (2003) had no brain damage, and his OFA was intact. Second, the FFA and OFA were activated by a single stimulus (modified vase-face illusion) that is fundamentally different from the many Mooney faces used here. The vase/face stimulus can be defined as two profile faces based only on the contour (a line) defining the two faces. It is segmented, with all the parts being well separated from the background by a different texture/color and identified as forming the face stimulus precisely because they share the same surface properties. In contrast, Mooney face stimuli are two-tone images for which some blobs making the face are in black and others are in white, confounded by two-tone background blobs. The visual system thus has to rely on an internal 2D face template to segment the Mooney stimulus and see it as a face (Cavanagh, 1991; Moore and Cavanagh, 1998).

The right FFA was the most strongly and most consistently activated region in the Mooney/Arcimboldo experiments, suggesting that this area is necessary for the initial holistic perception of a face as a face. This claim may appear at odds with pattern-based classification studies of fMRI data indicating that even without the contribution of the (bilateral) FFA, faces can be reliably distinguished from other object categories, suggesting rather a distributed representation of categories in the ventral temporal cortex (Haxby et al., 2001; O’Toole et al., 2005). Similarly, prosopagnosia can arise from different bilateral and right hemispheric lesion sites, including the right FFA (Barton et al., 2002), yet these patients are generally still able to categorize a face as opposed to a non-face object, supporting a distribution of resources to perform the simple categorization of the stimulus as a face. However, face classification in prosopagnosic patients, or pattern-based classification in fMRI studies of the normal brain, is generally performed with well-segmented visual stimuli, which can be categorized based on multiple local and global cues. In the present study, we used stimuli that had to be perceived holistically in order to be categorized as faces. Hence, the present data suggest that insofar as a holistic representation of the stimulus is required for face perception, the right FFA may be a fundamental region.

Finally, we note that a critical role of the right FFA in holistic generic face perception does not confine this area to this basic, yet important, function. There is wide evidence that representations of individual faces are coded in a network of visual areas, including the right FFA and OFA (e.g., Gauthier et al., 2000; Grill-Spector and Malach, 2001; Yovel and Kanwisher, 2005; Gilaie-Dotan and Malach, 2007), as well as the anterior section of the inferior temporal cortex (AiT; Kriegeskorte et al., 2007; Nestor et al., 2008). Interestingly, the right FFA is also the area which shows the strongest sensitivity to holistic perception of the (finer-grained) individual face, as demonstrated by the composite face effect (Schiltz and Rossion, 2006; Schiltz et al., 2010; see also Harris and Aguirre, 2008; Andrews et al., 2010). That is, face detection and individualization appear to be subtended by overlapping neural circuits in the human brain, and the right FFA seems to play a dominant role in coding faces holistically, both at coarse and fine-grained representational levels. According to a non-hierarchical scheme as suggested here and previously (Rossion, 2008), the first face representation emerging in the rFFA following feedforward processing could be holistic and coarse (Goffaux et al., in press; see also Sugase et al., 1999; Sripati and Olson, 2009 for electrophysiological evidence of coarse-to-fine processing in the monkey infero-temporal cortex). This holistic representation could then be refined by a second wave of inputs and/or reentrant functional interactions with lower visual areas such as the OFA.

Summary and Conclusions

We provided evidence from neuroimaging in normal viewers that when holistic perception is required to categorize a visual stimulus as a face, the stimulus can activate the right FFA without a contribution of face-sensitive inputs in inferior occipital cortex (no right OFA). This finding is strengthened by normal behavioral and neural responses during holistic perception of faces in a prosopagnosic patient with brain damage to the cortical territory of the right OFA. Altogether, these observations suggest that even during the presentation of clear face pictures, face-preferential responses might emerge in higher order visual areas (FFA) independently from, and perhaps before, face-preferential responses in lower order areas in the inferior occipital cortex (OFA), supporting a non-hierarchical view of face perception in the visual cortex.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We are grateful to PS for her patience during the time-consuming behavioral and fMRI experiments and to all our control participants. We also thank Aaron Schurger for providing us with the Mooney face stimuli, Henryk Bukowski for help in data analysis, as well as Dana Kuefner, Leon Deouell, and two reviewers for their helpful comments on a previous version of this manuscript. This research was supported by a grant ARC 07/12-007, and a Mandat d’impulsion scientifique (FNRS) 2008–2011 to Bruno Rossion. Bruno Rossion and Thomas Busigny are supported by the Belgian National Fund for Scientific Research (FNRS).

Footnotes

¹ http://www.princeton.edu/ artofscience/gallery
² http://www.artyst.net/A/Arcimboldo16/ Arcimboldo.htm
³ http://www.jasonmecier.com
⁴ There were other areas activated by the conjunction contrast in some participants, but unlike the FFA, OFA, and pSTS they were not consistent across participants, and were not the focus of the present study.
⁵ In that latter study, transcranial magnetic stimulation (TMS) applied over an average coordinate of the right OFA at a very early stage in time (60–100 ms following visual stimulation) disrupted individualization of faces differing by facial parts (Pitcher et al., 2007). This finding has been taken as evidence that the OFA is the first face-sensitive relay in the human brain, coding for facial parts. However, while Pitcher et al.’s (2007) findings may indeed be taken in favor of an early role of the OFA in face processing, they concern the specific case of face individualization. In order to contradict the present findings and their interpretation, one would rather have to show that TMS applied to the (right) OFA leads to impairment in holistic face detection, and to a reduced or abolished early response to faces in the FFA if TMS is combined with fMRI.
⁶ Note however that according to the RHT framework, the first approximation of the visual stimulus in high level visual areas would be the result of a feedforward hierarchical process (see Rossion, 2009, p. 193, for discussion of differences between RHT and the present proposal).
⁷ Note that such observations are unlikely, for two reasons at least. First, the whole-brain analysis performed at the voxel level, with unsmoothed fMRI data, did not reveal any face-preferential response in the inferior-occipital cortex in the Mooney/Arcimboldo experiments. Hence, if consistent face-preferential responses were present, they would have had to be fully distributed (i.e., peppered) at the voxel level so that no contiguous voxels would show a larger response to faces over non-face stimuli. Second, the patient PS has a lesion in the right inferior occipital cortex, ruling out any form of face-preferential response in this part of the brain. Nevertheless, she shows rFFA activation to Mooney/Arcimboldo stimuli.

References

Andrews, T. J., Davies-Thompson, J., Kingstone, A., and Young, A. W. (2010). Internal and external features of the face are represented holistically in face-selective regions of visual cortex. J. Neurosci. 30, 3544–3552.

Andrews, T. J., and Schluppeck, D. (2004). Neural responses to Mooney images reveal a modular representation of faces in human visual cortex. Neuroimage 21, 91–98.

Bar, M. (2003). A cortical mechanism for triggering top-down facilitation in visual object recognition. J. Cogn. Neurosci. 15, 600–609.