The brain on art: intense aesthetic experience activates the default mode network

Aesthetic responses to visual art comprise multiple types of experiences, from sensation and perception to emotion and self-reflection. Moreover, aesthetic experience is highly individual, with observers varying significantly in their responses to the same artwork. Combining fMRI and behavioral analysis of individual differences in aesthetic response, we identify two distinct patterns of neural activity exhibited by different sub-networks. Activity increased linearly with observers' ratings (4-level scale) in sensory (occipito-temporal) regions. Activity in the striatum (STR) also varied linearly with ratings, with below-baseline activations for low-rated artworks. In contrast, a network of frontal regions showed a step-like increase only for the most moving artworks (“4” ratings) and non-differential activity for all others. This included several regions belonging to the “default mode network” (DMN) previously associated with self-referential mentation. Our results suggest that aesthetic experience involves the integration of sensory and emotional reactions in a manner linked with their personal relevance.


INTRODUCTION
Human beings in every culture seek out a variety of experiences which are classified as "aesthetic": activities linked to the perception of external objects, but not to any apparent functional use these objects might have. Looking at paintings, listening to music, or reading poems: these are hedonic experiences in which humans consistently choose to engage. And although the relevant objects in and of themselves have no immediate or direct value for survival or for the satisfaction of basic needs (food, shelter, reproduction), they nevertheless accrue great value within human culture. What are the neural underpinnings of aesthetically moving experience?
Although the foundation of aesthetic inquiry as a formal scholarly discipline is relatively recent (the philosopher Alexander Baumgarten introduced the modern use of the term in 1739), musings about the nature of "beauty" date back at least to Plato (Plato, 1989) and Confucius, and evidence exists of well-developed artistic traditions in most of the world's ancient cultures (e.g., China, India, Egypt, Mesopotamia, Persia). But it is only recently that it has become possible to investigate the physiological bases of aesthetic experience. Recent neuroimaging studies have identified several brain regions whose activation correlates with a variety of aesthetic experiences, namely locations in the anterior medial prefrontal cortex (aMPFC) and the caudate/striatum, with several additional regions detected in some studies but not others (Blood and Zatorre, 2001; Cela-Conde et al., 2004; Kawabata and Zeki, 2004; Vartanian and Goel, 2004; Jacobsen et al., 2006; Di Dio and Gallese, 2009; Kirk et al., 2009; Ishizu and Zeki, 2011; Lacey et al., 2011; Salimpoor et al., 2011). These findings form the initial basis for the field of neuroaesthetics, but key questions remain. In this study we examined more closely issues surrounding the intensity and diversity of aesthetic responses.
A major theme in philosophical inquiry into aesthetic experience is a tension between universality and subjectivity. On one hand, many authors have argued that aesthetic evaluations rely on universal principles. On the other, philosophical inquiry also emphasized the importance of understanding aesthetic responses as strongly subjective. These two views are not, in principle, mutually exclusive: subjective judgment may lead to aesthetic evaluations that are so consistent across individuals as to be termed universal. Indeed, the notion of universal aesthetics relies on the observation of wide agreement among people about the aesthetic value of certain objects or classes of objects (e.g., flowers; Scarry, 1999). Yet aesthetic judgments are not only subjective but also highly susceptible to cultural norms, education, and exposure. Thus, while there may be certain items that command consensus in their evaluations, for the majority of artifacts judgments can vary widely.
This variation in aesthetic judgments can be used to isolate the neural dimensions of aesthetic responses as opposed to reactions to particular features of a given work of art (e.g., Kawabata and Zeki, 2004; Salimpoor et al., 2011). To date, most studies have used stimuli that generated wide agreement. Putative subjective aspects of an experience were potentially confounded with differences in the stimuli themselves. Another fundamental problem is that using stimuli on whose aesthetic value people tend to agree necessarily gives more weight to common internal factors, be they driven by culture or by evolution, and leaves little room for truly individual aspects of subjective aesthetic experience to emerge. We addressed this problem by using stimuli for which people expressed strongly individual preferences. These large individual differences enable us to use the diversity of visual artwork to parse out the different components of aesthetic experience.
To allow for these individual preferences to emerge, an important guiding principle in the choice of our stimulus set was that it should span a variety of styles and periods (see Figure 1). One way in which diverse stimuli may lead to individual differences is that they invoke a variety of emotions: an aesthetic response includes evaluations that can vary in valence and degree of arousal, from "preference" and "pleasure" to "beauty," "sadness," "awe," or "sublimity" (Frijda and Sundararajan, 2007; Zentner et al., 2008). Therefore, our instructions to observers explicitly acknowledged that strongly moving aesthetic experiences may come in a variety of forms, not merely beauty and preference. With this paradigm, we find large individual differences in which of the artworks observers find aesthetically moving: on average, each image that was highly recommended by one observer was given a low recommendation by another. Therefore, any BOLD effects found in a contrast of high vs. low recommendation reflect differences in aesthetic reaction, not stimulus features.
Differences in subjective experience may arise not only from differences in the emotions that a given artwork evokes but also from how different individuals weigh these emotions. To examine this, observers also responded to a nine-item questionnaire addressing evaluative and emotional components of their aesthetic experience for each artwork.
We find that brain regions differentially activated by artworks given high and low aesthetic recommendations can be classified into two distinct sets by virtue of the pattern of their response. BOLD activation varied linearly with observers' ratings in several sensory (occipito-temporal) regions. Activity in the striatum (STR) and pontine reticular formation (PRF) also varied linearly with ratings but straddled their resting baseline, exhibiting below-baseline activations for low-rated artworks. In contrast, a separate network of frontal and subcortical regions showed a step-like increase only for the most moving artworks ("4" ratings) and non-differential activity for all others. This included several regions belonging to the "default mode network" (DMN) previously associated with self-referential mentation, such as the aMPFC. Within these networks, we observed sensitivity to positive and negative emotional aspects of aesthetic experience, and evidence for individual differences correlated with personal differences in aesthetic evaluation.

OBSERVERS
Sixteen observers were recruited at New York University (11 male; 13 right-handed; 27.6 ± 7.7 years) and paid for their participation. All had normal or corrected to normal vision. Informed consent was obtained from all participants, in accordance with the New York University Committee on Activities Involving Human Subjects.
FIGURE 1 | Examples of the artworks used in this experiment. All images were obtained from the Catalog of Art Museum Images Online (CAMIO) database (http://www.oclc.org/camio). See List of artworks for image credits and the full list of artworks used in the experiment.

Frontiers in Human Neuroscience
www.frontiersin.org

STIMULI
One hundred and nine images were selected from the Catalog of Art Museum Images Online database (CAMIO: http://www.oclc.org/camio; Figure 1 and List of Artworks). CAMIO contains more than 90,000 images of textiles, paintings, architecture, and sculpture from museum collections around the world. The works of art came from a variety of cultural traditions (American, European, Indian, and Japanese) and from a variety of historical periods (from the 15th century to the recent past). Images were representational and abstract, and could be roughly classified as either female figure(s) (33), male figure(s) (23), a mixed group (20), still life (11), landscape (14), or abstract painting (8). These classifications did not show significant effects on responses.
Commonly reproduced images were not used, in order to minimize recognition. Most observers recognized no images, and no observer recognized more than a very few (3-5) stimulus images as reported by survey responses.
Images were scaled such that the largest dimension did not exceed 20° of visual angle, and the area did not exceed 75% of a 20° box. Stimulus presentation and response collection were controlled using a Macintosh G4 running Matlab 6.5 and the Psychophysics Toolbox (Brainard, 1997).
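The two size limits can be combined into a single scale factor. The following is a minimal sketch under stated assumptions: it works directly in degrees of visual angle, and the function name `scale_to_limits` and its parameters are hypothetical. The conversion from degrees to screen pixels depends on viewing distance and display geometry, which are not given here.

```python
import math

def scale_to_limits(width, height, max_dim_deg=20.0, max_area_frac=0.75):
    """Return (width_deg, height_deg) for an image of the given pixel size.

    The longer side is capped at max_dim_deg, and the area is capped at
    max_area_frac of a max_dim_deg x max_dim_deg box.
    """
    # Scale that makes the longer side exactly max_dim_deg.
    s_dim = max_dim_deg / max(width, height)
    # Scale that makes the area exactly the allowed fraction of the box.
    s_area = math.sqrt(max_area_frac * max_dim_deg ** 2 / (width * height))
    s = min(s_dim, s_area)  # the tighter of the two constraints wins
    return width * s, height * s

# A square image is limited by the area rule, a 2:1 image by the side rule.
print(scale_to_limits(100, 100))
print(scale_to_limits(200, 100))
```

Under these assumptions, near-square images end up smaller than 20° on a side because the area limit applies first, while elongated images reach the full 20° along their longer dimension.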

PROCEDURE
Observers were told they would be viewing a set of artworks while lying in the scanner. They were to use a scale of 1-4 by pressing a button on a hand-held response box to answer the question "how strongly does this painting move you?" according to the following instructions: Imagine that the images you see are of paintings that may be acquired by a museum of fine art. The curator needs to know which paintings are the most aesthetically pleasing based on how strongly you as an individual respond to them. Your job is to give your gut-level response, based on how much you find the painting beautiful, compelling, or powerful. Note: The paintings may cover the entire range from "beautiful" to "strange" or even "ugly." Respond on the basis of how much this image "moves" you. What is most important is for you to indicate what works you find powerful, pleasing, or profound.
Each observer viewed all 109 artworks; the order was counterbalanced across observers to control for possible serial order effects. Observers were instructed prior to entering the magnet and given practice trials using artworks not in the stimulus set.

Nine-item evaluative questionnaire
After the fMRI session, observers were given a short break, and were then taken to a behavioral lab where they sat in front of a computer screen to complete a nine-item questionnaire. They were shown the same set of paintings in the same order as in the scanner. Each painting was shown for 6 s. Observers were asked to rate the intensity with which each artwork evoked the following evaluative/emotional responses: joy, pleasure, sadness, confusion, awe, fear, disgust, beauty, and the sublime. Responses to this nine-item questionnaire were given using mouse clicks on a visual seven-point scale for each item. These items were presented in random order on each trial. Observers could respond to the nine items in any order, but could not change ratings.
Observers ranged from those with novice-level experience of art and art history to several having completed some undergraduate study in the history of art (evaluated using a survey at the time of the experiment). Before entering the scanner, observers were also administered the Positive and Negative Affect Schedule (PANAS; Watson et al., 1988). PANAS is a highly stable and internally consistent metric for dispositional affect (mood), used to determine how frequently an observer experiences positive and negative affect in a defined time period. Observers in this study were asked to answer questions with regard to the immediately preceding few days.

fMRI SCANNING PROCEDURES
fMRI scans were carried out at New York University's Center for Brain Imaging, using a 3-T Siemens Allegra scanner and a Nova Medical Head coil (NM011 head transmit coil). Artworks were projected onto a screen in the bore of the magnet and viewed through a mirror mounted on the head coil.
The 109 artworks were divided into four sets (different subsets per observer, depending on order) and shown over the course of four functional scans using a slow event-related design. During these functional scans, the blood oxygen level dependent (BOLD) signal was measured from the entire brain using thirty-six 3 mm slices aligned approximately parallel to the AC-PC plane (in plane resolution 3 × 3 mm, TR = 2 s, TE = 30 ms, FA = 80°). Each trial began with a 1 s blank period then a blinking fixation point for 1 s, followed by the artwork for 6 s, and a blank screen for 4 s, during which the observer pressed a key corresponding to recommendation. An additional 0, 2, or 4 s blank interval was inserted pseudorandomly between trials to jitter trial timing, with an average trial length of 13.14 s.
Observers were also run in a localizer scan containing blocks of objects, scrambled objects, faces, and places. This 320 s scan consisted of four 18 s blocks of each stimulus type, during which the observer performed a "1-back" task (where observers monitor for exact repeats of an image). Each block contained 16 stimulus images plus two repeats, each presented for 800 ms with a 200 ms inter-stimulus-interval. The full-color images were placed on top of phase-scrambled versions of the same stimuli filling a 500 × 500 pixel square to control for differences in size across stimulus categories.
A high resolution (1 mm³) anatomical volume (MPRAGE sequence) was obtained after the functional scans for registration and spatial normalization.

BEHAVIORAL DATA ANALYSIS
For the observers' recommendations collected during the scanning session, a measure of agreement across individuals was computed by taking the set of 109 recommendations for every pair of observers and computing the Pearson correlation coefficient. Images with any missing recommendation values were excluded from the correlations in a pairwise manner. One observer gave no "4" recommendations, and was, therefore, excluded from subsequent analyses relying on the contrast of "4" vs. "1" responses. Similarly, a measure of across-observer agreement was computed for each item of the nine-item questionnaire collected after the scanning session. For each item, the Pearson correlation coefficient was computed for each pair of observers.
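The pairwise agreement measure can be sketched as follows. This is an illustrative reconstruction, not the analysis code used in the study: the function names and the toy ratings are hypothetical, missing values are represented as `None`, and the sketch does not guard against an observer whose ratings have zero variance.

```python
from itertools import combinations
from math import sqrt

def pearson_pairwise(x, y):
    """Pearson r between two rating lists, skipping images where either is None."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy)

def agreement(ratings):
    """One correlation per pair of observers; ratings is a list of rating lists."""
    return [pearson_pairwise(ratings[i], ratings[j])
            for i, j in combinations(range(len(ratings)), 2)]

# Toy data: three hypothetical observers rating five images on the 1-4 scale.
toy = [
    [1, 2, 3, 4, 2],
    [4, 3, 2, 1, 3],      # the reverse of the first observer
    [1, 2, None, 4, 2],   # one missing recommendation
]
print(agreement(toy))
```

With 16 observers this procedure yields 16 × 15 / 2 = 120 pairwise coefficients, whose distribution summarizes agreement.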

Factor analysis of evaluative questionnaire
The responses on the nine-item questionnaire produced by each observer to each artwork (16 × 109 = 1744 trials total) were then converted to z-scores within observers and concatenated into a single large matrix of scores. Principal components extraction was used to identify factors with eigenvalues greater than one. Two emotional/evaluative factors survived and were rotated using the "direct oblimin" method, which does not require that the factors be orthogonal. Scores on these two factors were computed for each of the 1744 trials using regression (see Figure 7).

fMRI DATA ANALYSIS
The scans were pre-processed using the FMRIB Software Library (FSL; Oxford, UK) to correct for slice-timing and motion, and were high-pass filtered at 0.0125 Hz. Subsequent analyses were performed using BrainVoyager QX (Brain Innovation, Maastricht, Netherlands). After alignment to observer-specific high-resolution anatomical images, the scans were normalized to Talairach space (Talairach and Tournoux, 1988), blurred with an 8 mm Gaussian kernel, and converted to z-scores.

4-vs.-1 whole brain analysis
To identify regions sensitive to observer recommendation, a whole-brain random effects group-level general linear model (GLM) analysis was computed with the responses of each observer on each of the four possible recommendation levels coded as separate regressors (as a 6 s "on" period for each image convolved with a standard two-gamma hemodynamic response function, HRF). A contrast of the "4" regressors vs. the "1" regressors was computed and the resulting statistical map was corrected for multiple comparisons at a false discovery rate (FDR) of q < 0.05 (Benjamini and Hochberg, 1995; Genovese et al., 2002) and a cluster threshold of five 3 mm³ voxels. This contrast will be referred to as the 4-vs.-1 whole-brain analysis (see Appendix Table A1 and Figures 3-5).
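The FDR step can be illustrated with the Benjamini-Hochberg step-up rule. This is a minimal sketch, assuming a flat list of voxelwise p-values; the function name `bh_threshold` and the example p-values are hypothetical, the cluster-extent threshold is not modeled, and the exact procedure implemented in the analysis software may differ in detail.

```python
def bh_threshold(p_values, q=0.05):
    """Largest p-value declared significant by Benjamini-Hochberg, or None."""
    m = len(p_values)
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        # Step-up rule: compare the rank-th smallest p-value with q * rank / m.
        if p <= q * rank / m:
            threshold = p  # keep the largest p-value that meets its bound
    return threshold

# Ten hypothetical p-values; only the two smallest survive at q = 0.05.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(bh_threshold(example))
```

All tests with p-values at or below the returned threshold are declared significant; a return value of `None` means nothing survives correction.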

ROI analysis
In order to compare BOLD activation for all four recommendation levels across these regions, the group-level clusters from the 4-vs.-1 analysis were used to draw regions-of-interest (ROIs) from which we extracted timeseries for each observer. Using the average (over voxels in the ROI) of non-blurred, z-scored timeseries for each scan, individual observer parameter estimates for each of the four recommendation levels were obtained using a GLM with a standard two-gamma HRF convolved with a 6 s "on" period for each image (see Figures 3-5). Standard errors were computed across observers.
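A regressor of the kind described here can be sketched as a boxcar convolved with a two-gamma HRF and sampled once per TR. The HRF parameters below (gamma shapes of 6 and 16, undershoot ratio of 6) are commonly used defaults and are assumptions; the text does not state the values used, and all function names are hypothetical.

```python
import math

def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

def two_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=6.0):
    """Positive response minus a scaled, later undershoot (assumed parameters)."""
    return gamma_pdf(t, peak_shape) - gamma_pdf(t, under_shape) / ratio

def make_regressor(onsets_s, n_vols, tr=2.0, duration=6.0):
    """Boxcar that is 'on' for `duration` s at each onset, convolved with the HRF."""
    boxcar = [0.0] * n_vols
    n_on = int(round(duration / tr))
    for onset in onsets_s:
        first = int(round(onset / tr))
        for k in range(first, min(n_vols, first + n_on)):
            boxcar[k] = 1.0
    kernel = [two_gamma_hrf(i * tr) for i in range(int(32 / tr))]  # 32 s of HRF
    # Discrete convolution, truncated to the length of the scan.
    return [sum(boxcar[n - k] * kernel[k]
                for k in range(len(kernel)) if n - k >= 0)
            for n in range(n_vols)]

# One image shown 4 s into a 60 s run (30 volumes at TR = 2 s).
reg = make_regressor([4.0], 30)
print(reg.index(max(reg)))  # volume index at which the predicted response peaks
```

One such regressor per recommendation level, fitted to an ROI timeseries by least squares, gives the four parameter estimates compared across levels.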

4-vs.-321 whole brain analysis
To further isolate processes particular to aesthetic response, we computed a second whole-brain contrast relying on the same whole-brain GLM as above, but with a new contrast of only the "4" recommendations vs. the average of all the other recommendation levels, balanced to add to zero [e.g., a linear contrast of (−1 −1 −1 3) for the 1, 2, 3, and 4 regressors]. The same statistical threshold was used to correct for multiple comparisons (FDR of q < 0.05 and a cluster threshold of five 3 mm³ voxels). This contrast will be referred to as the 4-vs.-321 whole-brain analysis (see Figure 6). Note that this contrast may lead to the discovery of new activations not found in the original 4-vs.-1 analysis. Given the widely extended and interconnected nature of the resulting whole-brain map, we do not report the full set of activation coordinates; most of the peak activations were coincident with regions reported for the 4-vs.-1 contrast. Group-level ROIs were isolated for four prominent activations not found in the 4-vs.-1 contrast: the anterior medial pre-frontal cortex (aMPFC), the left hippocampus (HC), left substantia nigra (SN), and the left posterior cingulate cortex (PCC). It was not possible to draw an isolated ROI for the aMPFC from this contrast given the large swath of activation; we therefore drew a more restricted ROI for the aMPFC based on the 4-vs.-1 whole-brain contrast, but with a statistical threshold of p < 0.001.
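The difference between the two contrasts can be seen by applying each weight vector to a set of per-level parameter estimates. The sketch below uses made-up beta values purely for illustration; the names `apply_contrast`, `FOUR_VS_ONE`, and `FOUR_VS_321` are hypothetical.

```python
def apply_contrast(betas, weights):
    """Weighted sum of the parameter estimates for ratings 1, 2, 3, and 4."""
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights should sum to zero")
    return sum(b * w for b, w in zip(betas, weights))

FOUR_VS_ONE = (-1, 0, 0, 1)
FOUR_VS_321 = (-1, -1, -1, 3)

# Hypothetical profiles: a response only to "4", and a response to anything above "1".
step_profile = [0.0, 0.0, 0.0, 1.0]
above_one_profile = [0.0, 1.0, 1.0, 1.0]

for name, betas in [("step", step_profile), ("above-1", above_one_profile)]:
    print(name, apply_contrast(betas, FOUR_VS_ONE), apply_contrast(betas, FOUR_VS_321))
```

Both profiles give the same 4-vs.-1 value, but the 4-vs.-321 value is three times larger for the step-like profile, which is why this contrast is better suited to isolating responses specific to the highest rating.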

ROI analysis of evaluative factors
The trial-by-trial scores for the two factors extracted from the principal components factor analysis of the nine-item evaluative questionnaire were used to create BOLD predictors by convolving with a standard two-gamma HRF with a length of 1 TR (2 s) and a delay of 1 TR relative to image onset. This middle TR was chosen as a compromise given our uncertainty about when, during a 6 s viewing, an observer was able to integrate enough information across successive fixations of an artwork to generate an affective response. The resulting timecourses were combined with an "Image On" predictor and orthonormalized using the Gram-Schmidt process before being entered into a GLM predicting BOLD activation in each of the ROIs identified in the whole brain analysis (see Figure 7).
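The orthonormalization step can be sketched with a plain-Python Gram-Schmidt routine (written in the numerically more stable "modified" form). This is an illustration only: the function name and the toy predictors are hypothetical, and the order in which the predictors were entered, which determines which predictor keeps its original shape, is an assumption not specified in the text.

```python
from math import sqrt

def gram_schmidt(columns):
    """Orthonormalize predictors in the order given; each column is a list of samples."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            proj = sum(a * b for a, b in zip(v, q))
            v = [a - proj * b for a, b in zip(v, q)]  # remove the part explained by q
        norm = sqrt(sum(a * a for a in v))
        if norm < 1e-12:
            raise ValueError("predictor is linearly dependent on earlier ones")
        basis.append([a / norm for a in v])
    return basis

# Toy predictors: a constant "image on" term and two hypothetical factor timecourses.
image_on = [1.0, 1.0, 1.0, 1.0]
factor_1 = [1.0, 2.0, 3.0, 4.0]
factor_2 = [1.0, 0.0, 2.0, 5.0]
ortho = gram_schmidt([image_on, factor_1, factor_2])
```

After this step every pair of predictors has zero inner product, so variance shared between predictors is credited to whichever one was entered earlier.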

Individual differences analysis of evaluative questionnaire
We performed an analysis of individual differences in responses to the nine-item evaluative questionnaire and their relationship to BOLD activation. Each observer's recommendations and their subsequent responses on the nine items were converted to z-scores, and then concatenated into a single large matrix (16 observers × 109 images = 1744 rows). We performed a stepwise regression analysis in SPSS (IBM, Somers, NY) of observers' recommendations against their responses to the nine items to eliminate redundant terms or terms which had no significant predictive power for recommendations. Individual standardized beta weights were then computed for how well each of the items surviving this procedure predicted recommendations, entered in order from most-to-least predictive at the group level (see Appendix Table A2). The resulting beta weights, which can be conceptualized as reflecting the weight an observer places on a particular emotion/evaluation when making recommendations, were used to predict the size (across observers) of the 4-vs.-1 BOLD effect in the set of ROIs identified in the whole-brain recommendation-based analysis. This yielded an overall R² for each ROI and beta weights for each of the items with associated confidence intervals. A significant effect in this analysis would indicate that variability across observers in the size of the BOLD effect in an ROI is related to variability in how much individual observers weigh a particular emotion/evaluation when making recommendations (see Figure 8).

RESULTS
There was very low agreement in recommendations across observers, as assessed by computing the correlations between observers' recommendations taken in pairs (Figure 2). The average agreement (0.13 ± 0.17) indicates quite low agreement for visual art compared to other kinds of stimuli (e.g., Vessel and Rubin, 2010). (The mean of this distribution is significantly different from zero by a t-test, t[119] = 8.72, p < 10⁻¹³, but Cronbach's alpha, a measure of inter-rater reliability, confirms the very low agreement, α = 0.709; Cronbach, 1951.) This finding has an important methodological consequence: on average, each image highly recommended by one observer was given a low recommendation by another. Therefore, any BOLD effects found in a contrast of high vs. low recommendation reflect differences in aesthetic reaction, not features of the images. A whole-brain group contrast of trials in which an observer gave an image the highest recommendation ("4") vs. trials in which the image was given the lowest recommendation ("1") revealed a set of posterior, anterior, and subcortical brain regions that were correlated with observers' aesthetic recommendations (Appendix Table A1; see "Materials and Methods, 4-vs.-1 Whole brain analysis"). Below, we describe further the responses of these regions, grouped by the nature of the response. The groupings were based on an analysis beyond that which produced Table A1 (4-vs.-1), specifically the pattern of responses across all four recommendation levels (see below). To examine those patterns, individual regions of interest (ROIs) were created based on the 4-vs.-1 whole-brain contrast, and the average timecourses were analyzed to estimate the response to each of the four response levels (see "Materials and Methods, ROI analysis").
In posterior (occipito-temporal) ROIs, there was a linear relationship between recommendation level and BOLD response (Figure 3; left inferior temporal sulcus, ITS: −49, −61, −2; left parahippocampal cortex, PHC: −31, −32, −15; right superior temporal gyrus, STG: 52, −10, 7). In left ITS and left PHC, BOLD response increased in an approximately linear fashion above resting baseline for increasing recommendations. Similarly, BOLD signal in right STG decreased in an approximately linear fashion below resting baseline for decreasing aesthetic reactions.
FIGURE 2 | The distribution of pairwise correlations across observers' recommendations, illustrating highly individual responses. Each observer's recommendations were correlated with every other observer's recommendations, taken in pairs. This histogram shows the distribution of all the correlation coefficients.
In two subcortical regions, the left striatum (STR) and the pontine reticular formation (PRF), there was also a linear relationship between recommendation and BOLD activation. But in contrast to occipito-temporal ROIs, BOLD response levels straddled the resting baseline (Figure 4; STR: −12, 10, 6; PRF: 0, −28, −17). Thus, highly-rated images led to activation greater than baseline and low-rated images led to decreases from the resting baseline.
In contrast with the linear relation between recommendation and BOLD response observed in the occipito-temporal and subcortical regions above, frontal ROIs identified in the 4-vs.-1 contrast (Appendix Table A1) revealed a markedly different pattern of responses. In the left inferior frontal gyrus, pars triangularis (IFGt), left lateral orbitofrontal cortex (LOFC), and left superior frontal gyrus (SFG) there was a non-linear, "step-like" pattern relating aesthetic recommendation and BOLD response (Figure 5). Activation in left IFGt (−50, 32, 12) and left LOFC (−35, 24, −4) was near baseline for artworks given a 1, 2, or 3 recommendation, but was strikingly higher for artworks given a 4, the highest recommendation (Figure 5; right-middle panels). The left SFG (−5, 19, 62) also showed this non-linear, step-like pattern, though shifted downward such that artworks rated 1, 2, or 3 were significantly below baseline and only artworks rated 4 were at baseline (Figure 5, top panel). Similarly, activation in the left mediodorsal thalamus (mdThal: −6, −18, 12), which is heavily bidirectionally connected to the prefrontal cortex (Tobias, 1975; Tanaka, 1976; Behrens et al., 2003), showed a non-linear pattern of BOLD response with little differentiation for artworks given recommendations of 1, 2, or 3, but a much higher response for artworks given a 4 (Figure 5, bottom right).

HIGHLY MOVING IMAGES ENGAGE THE DEFAULT-MODE NETWORK AND RECRUIT ADDITIONAL NEURAL SYSTEMS
The strikingly higher response of frontal regions for artworks rated as the most aesthetically pleasing over all other artworks lends initial support to the hypothesis that a "4" recommendation was fundamentally different from a 1, 2, or 3, and that these trials were not just revealing "more" activation in a general network sub-serving preferences, but that they reflected the engagement of an additional process. To test this hypothesis further, we calculated a second whole-brain contrast between just the trials resulting in a rating of 4 and the average of all other trials (ratings of 1, 2, or 3; see "Materials and Methods, 4-vs.-321 Whole brain analysis"). This new analysis gave us more power to detect regions showing a difference for trials rated as 4 but that may not have been detected in the 4-vs.-1 contrast. This 4-vs.-321 contrast revealed a large swath of activation on the medial surface of the left hemisphere, extending from the anterior medial prefrontal cortex (aMPFC: −6, 38, 4) to the SFG activation seen in the 4-vs.-1 contrast (Figure 6, top left). The aMPFC is known to be a core region of the DMN (Shulman et al., 1997; Mazoyer et al., 2001; Raichle et al., 2001), and, as expected, inspection of the response to all four recommendation levels in this region shows a decrease in activation below baseline for presentation of most images (those rated a 1, 2, or 3). In contrast, those artworks rated as the most aesthetically moving (recommendation of 4) lead to BOLD activation at aMPFC's resting baseline (Figure 6, top right). In other words, activation in the aMPFC for highly moving artworks is not suppressed, as it is for most artworks and most other types of external stimuli. The left posterior cingulate cortex (PCC: −9, −49, 18), another core region of the DMN, showed a similar, though less striking, pattern of activation (Figure 6, middle right).
FIGURE 5 | Anterior frontal regions and an associated region of the thalamus show non-linear, "step-like" responses to increasing recommendation. The whole-brain images illustrate the t-statistic for the 4-vs.-1 contrast. Panels on the right illustrate the average beta weight (as a z-score) for each recommendation level, averaged across 15 observers (lSFG = left superior frontal gyrus; lIFGt = left inferior frontal gyrus, pars triangularis; lLOFC = left lateral orbitofrontal cortex; lmdThal = left mediodorsal thalamus). Error bars are standard errors of the mean across observers.
In addition to the aMPFC and PCC, the 4-vs.-321 contrast also revealed several subcortical regions showing significantly higher activation for only the highest rated artworks. The left substantia nigra (SN: −8, −12, −6) and the left hippocampus (HC: −30, −21, −10; Figure 6, bottom panel) were not differentially activated by trials rated as 1, 2, or 3, but did show significantly greater activation for trials that resulted in recommendations of 4.

It is important to note that the differential response across the 4 recommendation levels cannot simply reflect response selection, as observers are selecting a response on every trial. It is also unlikely that the BOLD effects reflect an implicit mapping of a "4" response to a "yes" response, and not to aesthetic experience per se. If this were the case, one might expect to see faster response times on those trials. However, when we analyzed observers' mean response times for trials of each recommendation level separately, we saw no such effect [one-way ANOVA with subjects as a random effect; F(3, 56) = 0.44, p = 0.73].

SEPARABLE BOLD RESPONSES TO POSITIVE AND NEGATIVE ASPECTS OF AESTHETIC EVALUATION
Aesthetic experiences can invoke a wide variety of evaluative and emotional responses. Following the fMRI session, observers saw each artwork a second time and rated the degree to which it brought about a specific response on a nine-item questionnaire of evaluative terms (see "Materials and Methods, Nine-item evaluative questionnaire"): pleasure, fear, disgust, sadness, confusion, awe, joy, sublime, and beauty.
Evaluative reactions to individual paintings were not consistent across individual observers (average across-observer correlations of 0.13, 0.49, 0.29, 0.38, 0.32, 0.30, 0.16, 0.17, and 0.17 for each term, respectively; standard deviations ranging from 0.10 to 0.20). The range of agreement on these items illustrates that the variability in recommendations across observers was partly driven by different feelings being evoked by each painting (e.g., low agreement for ratings of pleasure), and partly by different people placing different weights on those feelings (such as fear).
This variability at two stages-both in the mapping between artworks and feelings they evoke, and in mapping between evoked feelings and aesthetic recommendation-precludes any meaningful direct relationship (at the group level) between ratings of these nine items and activation in the set of brain regions revealed by the 4-vs.-1 and 4-vs.-321 whole-brain group analyses. One approach to understanding these subjective evaluative responses is to test whether there exists a reduced set of latent factors that are common across observers and can explain a significant proportion of the variance in responses.
A principal components factor analysis identified two group-level factors that together accounted for 59% of the variance in observers' ratings on the evaluative questionnaire (Figure 7A). Factor 1 had an eigenvalue of 3.045, accounting for 33.8% of variance; Factor 2 had an eigenvalue of 2.269, accounting for 25.2% of variance (see "Materials and Methods, Factor analysis of evaluative questionnaire"). Factor 1 loaded very highly on pleasure, beauty, and other positive questionnaire items, while Factor 2 loaded very highly on fear, disgust, and sadness (Figure 7B). Scores on these factors were computed for each observer looking at each image and used to re-analyze the BOLD timeseries from the previously identified set of ROIs (see "Materials and Methods, ROI analysis of evaluative factors").
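As a quick consistency check on these figures, and assuming the extraction was run on the correlation matrix of the nine z-scored items (so that total variance equals the number of items, which the text implies but does not state), the proportion of variance for each factor is its eigenvalue divided by nine:

```latex
\[
\frac{3.045}{9} \approx 0.338, \qquad
\frac{2.269}{9} \approx 0.252, \qquad
0.338 + 0.252 = 0.590 \approx 59\%.
\]
```

The same reasoning explains the retention criterion: an eigenvalue greater than one marks a factor that accounts for more variance than any single standardized item.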

FIGURE 7 | Factor analysis of the nine-item evaluative questionnaire reveals two major group-level factors that are reflected in the activation of frontal and subcortical regions. (A) Loadings for the two factors on each of the nine items. (B) A plot of the nine items in the two-factor solution reveals a cluster that groups high on Factor 1 and a second cluster that groups high on Factor 2, with Awe and Sublime being partway between the two clusters. (C) BOLD predictors constructed from factor scores reveal ROIs that respond to image onset or either factor. Responsivity to Factor 1 is significant in lSN and approaches significance in lSTR and lSFG. Responsivity to Factor 2 is significant in lSTR and lIFG and approaches significance in laMPFC. Error bars are standard errors of the mean, computed across observers.

BOLD EFFECTS IN THE PRF AND LEFT ITS REFLECT INDIVIDUAL WEIGHTS ON EVALUATIVE RESPONSES
Evaluative responses across observers were highly individual (see above). Individuals may rely on different evaluative and emotional responses when making their aesthetic recommendations. A regression analysis on each individual's set of responses was used to determine what weights would need to be assigned to each of these items in order to predict each observer's recommendation for each artwork (see "Materials and Methods, Individual differences analysis of evaluative questionnaire"). Three of the items could be removed without significantly affecting the predictability of the set: joy, confusion, and the sublime. Across observers, different subsets of the remaining evaluative terms were effective in predicting individual recommendations (Appendix Table A2). For example, some observers tended to recommend images that they reported as awe inspiring, while other observers did not show a significant relationship between awe and recommendation, but did show a relationship between images that evoked fear and their recommendations of those images.
These individual profiles of evaluative weightings were correlated with the magnitude of observed 4-vs.-1 BOLD effects in two ROIs, the PRF and left ITS (Figure 8). Individualized weights on the remaining six evaluative terms were able to account for a large proportion of across-observer variability in the PRF and left ITS 4-vs.-1 BOLD effect sizes (R² = 0.70 and 0.62, respectively).
Observers who tended to recommend images they found to be awe-inspiring showed a larger effect of recommendation in the PRF, a part of the reticular activating system [Figure 8A; beta = 1.22 ± 0.98, t(8) = 2.88, p = 0.021]. No other evaluative term reached significance in the PRF.
In the left ITS, observers' weights for pleasure were significantly related to the BOLD effect [Figure 8B; beta = 1.72 ± 1.34, t(8) = 2.96, p = 0.018]. This relationship suggests that left ITS may at least partially mediate the relationship between rated pleasure for an artwork and aesthetic recommendation. No other evaluative term reached significance in the left ITS.
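The first step of this individual-differences analysis, estimating one observer's weights on the evaluative items, might be sketched as follows. This is a minimal illustration that assumes an ordinary least-squares fit; the Methods section specifies the procedure actually used. The ratings are made-up values for a hypothetical observer, only two of the evaluative items are included, and all function names are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular.
    """
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def fit_weights(ratings, recommendations):
    """Least-squares weights (intercept first) predicting recommendation."""
    design = [[1.0] + [float(v) for v in row] for row in ratings]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, recommendations))
           for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observer: (awe, fear) ratings for five artworks.
EXAMPLE_RATINGS = [(1, 2), (3, 1), (5, 4), (2, 6), (4, 3)]
# Made-up recommendations that depend on awe only: 1 + 0.5 * awe.
EXAMPLE_RECOMMENDATIONS = [1.5, 2.5, 3.5, 2.0, 3.0]

if __name__ == "__main__":
    intercept, w_awe, w_fear = fit_weights(EXAMPLE_RATINGS,
                                           EXAMPLE_RECOMMENDATIONS)
    print(f"intercept={intercept:.2f} awe={w_awe:.2f} fear={w_fear:.2f}")
```

In this toy case the fitted weight on awe is close to 0.5 and the weight on fear is close to zero, the kind of observer-specific profile that the across-observer analysis then relates to BOLD effect size.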

CONTROL ANALYSIS
Regions that respond to specific stimulus types (faces or places) showed no effect of recommendation [One-Way ANOVA, left FFA . These regions were identified using an independent localizer scan. We were able to identify a face-responsive region in the posterior fusiform gyrus (FFA) in 12 of the observers (Puce et al., 1995;Kanwisher et al., 1997;McCarthy et al., 1997) and a place-responsive region in the collateral sulcus (CoS) in 14 of the observers (Epstein and Kanwisher, 1998;Epstein et al., 1999). This finding rules out the possibility that the linear effects of recommendation observed in PHC, STG, or ITS depend on stimulus differences.

DISCUSSION
Aesthetic judgments for paintings are highly individual, in that the paintings experienced as moving differ widely across people. The neural systems supporting aesthetic reactions, however, are largely conserved from person to person, with the most moving artworks leading to a selective activation of central nodes of the DMN (namely, the aMPFC, but also the PCC and HC) thought to support personally relevant mentation (see below). The most moving artworks also activate a number of other frontal and subcortical regions, including several which reflect the evaluative and emotional dimensions of aesthetic experiences. A separate network of posterior and subcortical regions shows a graded (linear) response signature to all artworks in proportion to an observer's aesthetic judgment. Finally, two regions (PRF and left ITS) show differences in activation level across individuals that are correlated with whether the individual finds certain aspects of a painting (e.g., awe) appealing.

ENGAGEMENT OF THE DEFAULT MODE NETWORK DURING THE MOST AESTHETICALLY MOVING EXPERIENCES
The aMPFC shows decreases in activation from its resting baseline for all images except those rated as most aesthetically moving. Previous studies have reported that activation in this region is positively correlated with aesthetic evaluation (Kawabata and Zeki, 2004; Vartanian and Goel, 2004; Jacobsen et al., 2006; Di Dio and Gallese, 2009; Ishizu and Zeki, 2011). However, none of these studies has clearly shown the relationship of aesthetically driven activations to this region's resting baseline. The DMN is a network of brain areas associated with inward contemplation and self-assessment (Raichle et al., 2001; Kelley et al., 2002; Wicker et al., 2003; D'Argembeau et al., 2005, 2009; Andrews-Hanna et al., 2010). As with other areas in the DMN (such as the PCC, where we also see differential activity for only the most aesthetically pleasing images), aMPFC typically shows below-baseline activity in response to external stimulation, and this was indeed what we found in observers' responses to many of the art stimuli to which they were exposed. However, for those few stimuli that each observer judged as creating a strong aesthetic experience, the suppression of aMPFC was alleviated, a pattern typically seen when observers perform tasks related to self-reflection or during periods of self-monitoring. Such activation in the aMPFC at or above its resting baseline in response to an external stimulus is rare.
Importantly, our results show that only the most aesthetically moving artworks lead to differential, and widespread, activation in the aMPFC, contrary to the claim (Kawabata and Zeki, 2004; Ishizu and Zeki, 2011) that activation in this region is related to beauty in a linear fashion. This difference may be a consequence of the lower number of response levels used in their studies (three vs. four), the inclusion of paintings deemed "ugly" by their observers, the fact that the paintings were not being seen for the first time, or differences in instructions.
Several studies of self-reflective processes have shown that aMPFC does not deactivate during tasks in which observers assign to themselves personally relevant traits of varying valence (e.g., happiness, honesty, cruelty; Kelley et al., 2002; D'Argembeau et al., 2005; Amodio and Frith, 2006; Moran et al., 2006). Trait studies may reflect a set of processes whereby observers do not simply think about themselves, but, more specifically, match traits with self-inspection, as a part of broader social cognition. In a similar manner, release from deactivation during aesthetic experience may reflect observers' matching self-inspection with their perception of an object.
Strong emotions that are salient to observers also attenuate the depression of aMPFC activation associated with task performance (Simpson et al., 2001a,b), while emotion processing that is not personally relevant (e.g., viewing pictures of unknown persons in empathy-producing situations) has no effect on decreased activation of aMPFC during task performance (Geday and Gjedde, 2009). Highly moving aesthetic experiences appear to represent an analogous situation in which an external stimulus brings about a strong emotional response.
During such intense aesthetic experiences, the aMPFC may function as a gateway into the DMN, signaling personal relevance and allowing for a heightened integration of external (sensory/semantic) sensations related to an art object and internal (evaluative/emotional) states. How such integration is neurally instantiated and how it is related to reward circuits (e.g., whether it is caused by or creates activity in reward-related brain areas) are important questions for further research.

UNIQUE RESPONSE SIGNATURES FOR SENSORY AND EVALUATIVE NETWORKS
This is the first report of unique response signatures separating cortical activations to artwork into a posterior occipito-temporal network and an anterior frontal network. In addition to the frontal activation in aMPFC, the SFG, IFGt, and LOFC also show a step-like response, the latter two regions increasing above baseline for only the most moving images. Within this set of frontal regions, the factor analysis of evaluative responses further distinguishes the ROIs from one another: the LOFC shows no sensitivity to either Factor 1 or Factor 2, while IFGt is sensitive to Factor 2, and SFG and aMPFC show weak sensitivity to Factors 1 and 2, respectively. Subcortically, activations in the SN, mediodorsal thalamus, and hippocampus also show a step-like pattern of response, suggesting that these regions interact with the frontal network.
This network of frontal regions, which we refer to as an "evaluative" network, likely supports an analysis of emotional response and personal relevance. We suggest that the step-like pattern is a signature of an aesthetic response, where the most moving images produce a clearly differentiable pattern of signal, going beyond mere liking, to something more intense and personally profound.
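To make the distinction between the two response signatures concrete, the sketch below compares a four-point response profile (mean BOLD signal at ratings 1 to 4) against a linear template and a step template and reports which one it correlates with more strongly. This is an illustration only, not the analysis used in this study: the template weights, the function names, and the example profiles are assumptions chosen for clarity.

```python
import math

# Contrast-style templates over the four rating levels (1, 2, 3, 4).
LINEAR_TEMPLATE = [-3.0, -1.0, 1.0, 3.0]  # graded increase with rating
STEP_TEMPLATE = [-1.0, -1.0, -1.0, 3.0]   # only the "4" rating differs


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_signature(profile):
    """Return 'step' or 'linear', whichever template correlates better."""
    r_linear = pearson(profile, LINEAR_TEMPLATE)
    r_step = pearson(profile, STEP_TEMPLATE)
    return "step" if r_step > r_linear else "linear"


# Hypothetical profiles (arbitrary units, relative to baseline).
FRONTAL_LIKE = [-0.2, -0.2, -0.2, 0.3]    # suppressed except for "4"
POSTERIOR_LIKE = [-0.3, -0.1, 0.1, 0.3]   # graded with rating

if __name__ == "__main__":
    print("frontal-like profile:", best_signature(FRONTAL_LIKE))
    print("posterior-like profile:", best_signature(POSTERIOR_LIKE))
```

A profile that stays flat for ratings 1 to 3 and rises only at 4 matches the step template, whereas one that rises evenly across ratings matches the linear template.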

Additional support for this interpretation comes from a recent study in which observers were instructed to view artworks in terms of semantic or visual detail ("pragmatically"), as opposed to in terms of color, composition, shapes, mood, and evoked emotion ("aesthetically"). The authors found an activation in left lateral prefrontal cortex (BA 10), corresponding to what we term left IFGt, which was selectively engaged in the "aesthetic" condition (Cupchik et al., 2009).
The second signature we observe, a linear response to observer recommendation, is found in more posterior cortical regions (PHC, ITS, and STS). In all of these areas, BOLD signal responds to the onset of any image and linearly tracks observers' aesthetic reactions. Several previous reports have also found activations in occipito-temporal areas for preference judgments of a variety of stimuli, including artwork, abstract geometric shapes, scenes, and faces (e.g., Vartanian and Goel, 2004; Jacobsen et al., 2006; Kim et al., 2007; Yue et al., 2007).
These activations likely reflect a stimulus-bound sensory and semantic analysis of preference that is relatively automatic. Supporting this interpretation is the finding that observers whose recommendations were well predicted by ratings of image-induced "pleasure" tended to show a larger BOLD effect in the ITS (suggesting that observers differ in the degree to which they value a sensory/semantic analysis performed by posterior areas versus emotional evocativeness when reacting to aesthetic experiences). It is important to note that the linear effect of aesthetic recommendation that we observed in these areas is not due to systematic differences in the type of stimuli preferred by the observers, as neither the CoS nor FFA, defined using an independent localizer task (for places and faces, respectively), showed any effect of recommendation.
Subcortical regions STR and PRF, which also show a linear relationship to observers' recommendations, increased above baseline for recommended images and decreased below baseline for non-recommended images. Given the involvement of a column of areas in the midbrain with arousal functions (Kinomura et al., 1996; Steriade, 1996), these activations may reflect "reward" valence in STR and arousal level in PRF, two often theorized axes of emotional responsivity (Lang et al., 1990; Low et al., 2008). Although we did not explicitly measure physiological arousal, the fact that the BOLD effect size in PRF was larger for observers who tended to recommend images they found awe-inspiring suggests a potential association between aesthetic awe and arousal.

INTEGRATION IN THE STRIATUM
Not only is STR activity linearly related to aesthetic recommendation, it is also sensitive to both emotional/evaluative factors. This suggests that STR may integrate perceptual, evaluative, and reward components of aesthetic response for the purpose of outcome selection (the choice of recommendation level). This pattern, along with the detection of a related response pattern in the mdThal, is in accord with the established existence of cortico-striato-pallido-thalamic loops (Alexander et al., 1986; Steriade and Llinás, 1988; Alexander and Crutcher, 1990; Middleton and Strick, 2002; Kelly and Strick, 2004). Further research will be needed to elucidate the temporal dynamics of the flow of information between these regions in aesthetic responses.
The location of the observed striatal activation straddles the anatomical division between dorsal and ventral STR, and is similar to that reported by Vartanian and Goel (2004), though other studies of preference have reported more ventral effects (Kim et al., 2007; Lacey et al., 2011). Intriguingly, we did find significantly greater activation in the right ventral STR for the most highly recommended images (4-vs.-321 contrast, results not shown). The literature on reward posits that the dorsal STR represents the "actor" function of learning and implements habits or decisions (Maia, 2009), as well as the expectation of reward and punishment (Delgado et al., 2000, 2003), whereas the ventral STR (along with the amygdala, VTA, and OFC) carries out "critic" functions of representing actual reward and reward-prediction error (Schultz et al., 1992; Schoenbaum et al., 1998; Hikosaka and Watanabe, 2000; Schultz, 2000; Tremblay and Schultz, 2000; Setlow et al., 2003; Paton et al., 2006; Wan and Peoples, 2006; Simmons et al., 2007). While the locus of our activation in STR does not clearly fall in either the ventral or dorsal STR, the fact that STR responds regardless of emotional valence is in agreement with findings in monetary reward (Delgado et al., 2000, 2003). Findings in regard to aesthetic reward have suggested a schism between desired and achieved reward that maps onto dorsal and ventral STR, respectively. Based on a PET study of pleasurable resolution of musical expectation, Salimpoor et al. (2011) have suggested that the caudate ("dorsal" STR) responds primarily to expecting a desired reward ("wanting"), while the nucleus accumbens (ventral STR) is active while experiencing the peak emotional response ("liking") associated with the resolution of a musical theme, line, or phrase.
Unlike the novel, static images used in our study, their musical stimuli are temporally extended experiences, enabling listeners to predict the resolution of a musical phrase (and subsequent pleasure) based on familiarity with musical structure or particular songs. This may partially explain the difference in the locus of striatal effects following the hypothesized moment of aesthetic reward, given the known involvement of basal ganglia structures in a variety of temporally sequenced behaviors (Harrington et al., 1998). However, our task and results argue against a strict interpretation of striatal activation as reflecting anticipatory "wanting" of a predicted reward, as there was no possibility of differential anticipatory responses for any of our images.

AREAS FOR FURTHER RESEARCH
Our experiment is the first to find activation in the SN in visual aesthetic response, though it has been reported for music (Suzuki et al., 2008). Activation in the left SN for the most highly rated images raises the possibility that the efferent dopaminergic connections from the SN to the STR offer a mechanism by which hedonic responses to the most highly moving images might be modulated. This might be tested in further research.
In this set of observers, recommendation-related BOLD response appears primarily as increases in activation in the left hemisphere. However, it is unclear at this time whether this represents a real difference in the lateralization of aesthetic processes or merely reflects variation in the sensitivity of observing these effects at the whole-brain level.
Finally, it remains to be seen to what degree these systems are perturbed by depression or other mood disorders. Intriguingly, we found that the size of the BOLD effect in PHC, reflecting semantic/sensory processing, was larger for observers reporting positive mood (r = 0.68, p < 0.004 using the r-to-z transform), suggesting that mood may act as a gateway to getting pleasure from sensory/aesthetic experiences.
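For readers unfamiliar with the r-to-z transform mentioned here, the sketch below shows the standard Fisher procedure for testing a correlation against zero: z = atanh(r) × sqrt(n − 3), evaluated against the standard normal distribution. The number of observers entering this correlation is not stated in this section, so n is left as a parameter; the value of 16 in the example is a placeholder assumption, not a figure taken from the study, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def fisher_z_test(r, n):
    """Two-tailed test of a correlation against zero via Fisher's transform.

    Returns (z, p), where z = atanh(r) * sqrt(n - 3) and p is the
    two-tailed probability under the standard normal distribution.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r) * math.sqrt(n - 3)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


if __name__ == "__main__":
    # r is the reported correlation; n = 16 is a placeholder assumption.
    z, p = fisher_z_test(0.68, 16)
    print(f"z = {z:.3f}, two-tailed p = {p:.4f}")
```

Because the p-value depends on n, the output for the placeholder sample size should be read only as an illustration of the calculation.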

CONCLUSIONS
The nature of aesthetic experience presents an apparent paradox. Observers have strong aesthetic reactions to very different sets of images, and are moved by particular images for very different reasons. Yet the ability to be aesthetically moved appears to be universal. The emerging picture of brain networks underlying aesthetic experience presents a potential solution to this paradox. Aesthetic experience involves the integration of neurally separable sensory and emotional reactions in a manner linked with their personal relevance. Such experiences are universal in that the brain areas activated by aesthetically moving experiences are largely conserved across individuals. However, this network includes central nodes of the DMN that mediate the intensely subjective and personal nature of aesthetic experiences, along with regions reflecting the wide variety of emotional states (both positive and negative) that can be experienced as aesthetically moving.
The linking of intense aesthetic experience and personal relevance may have implications for artists and educators alike: further research could explore whether increasing the personal relevance of aesthetic experiences increases their intensity and the resulting associations.