Comparison of Objective Measures for Predicting Perceptual Balance and Visual Aesthetic Preference

Hübner, Ronald; Fillinger, Martin G.

doi:10.3389/fpsyg.2016.00335

ORIGINAL RESEARCH article

Front. Psychol., 11 March 2016

Sec. Perception Science

Volume 7 - 2016 | https://doi.org/10.3389/fpsyg.2016.00335

Ronald Hübner^*

Martin G. Fillinger

Department of Psychology, Universität Konstanz, Konstanz, Germany

The aesthetic appreciation of a picture largely depends on the perceptual balance of its elements. The underlying mental mechanisms of this relation, however, are still poorly understood. For investigating these mechanisms, objective measures of balance have been constructed, such as the Assessment of Preference for Balance (APB) score of Wilson and Chatterjee (2005). In the present study we examined the APB measure and compared it to an alternative measure (DCM; Deviation of the Center of “Mass”) that represents the center of perceptual “mass” in a picture and its deviation from the geometric center. Additionally, we applied measures of homogeneity and of mirror symmetry. In a first experiment participants had to rate the balance and symmetry of simple pictures, whereas in a second experiment different participants rated their preference (liking) for these pictures. In a third experiment participants rated the balance as well as the preference of new pictures. Altogether, the results show that DCM scores accounted better for balance ratings than APB scores, whereas the opposite held with respect to preference. Detailed analyses revealed that these results were due to the fact that aesthetic preference does not only depend on balance but also on homogeneity, and that the APB measure takes this feature into account.

Introduction

The perceptual mechanisms involved in visual aesthetics and preference judgments have long been a matter of debate (for an overview see Palmer et al., 2013). Since the seminal work of Gustav Theodor Fechner (Fechner, 1871, 1876) one approach of corresponding experimental studies has been to find aesthetic primitives, i.e., relatively simple perceptual features that determine the attraction of a stimulus (Latto, 1995; Munar et al., 2014). A prominent candidate in this respect is perceptual balance, i.e., how well the elements in a picture are arranged. There is wide consensus among aestheticians that balance has a great effect on the appreciation of a picture (Poore, 1903; Arnheim, 1954). Nevertheless, the mechanisms of balance perception are still largely unknown. Whereas most researchers agree on a global level what perceptual balance is, they disagree on the details of how balance is determined. In the present article we consider currently applied measures and examine how they are related to subjective balance, symmetry, and aesthetic preference.

Similar to early ideas of the artist and writer Henry Poore (1859-1940, Poore, 1903), and, as revealed by McManus et al. (2011), based on the work of Denman Ross (1853-1935, Ross, 1907), the Gestalt psychologist Rudolf Arnheim (1904-2007) hypothesized in his book Art and Visual Perception (Arnheim, 1954) that each rectangular frame has a hidden structure or field of invisible forces (analogous to a magnetic field in physics). The center of the frame has the strongest attraction, followed by the corners, the two main axes, and the diagonals. If an element is placed in the frame, then it is pulled by all the forces of the hidden structure, which produces an inner tension or psychological force of that element in relation to the frame. For instance, if a single element is placed at the center, then all forces compensate each other and the picture is perfectly balanced. In contrast, if the element is placed off-center, then there is a pull toward the center, which results in imbalance. The situation is obviously more complex if several elements are placed in a frame. In this case each element has a relative perceptual weight resulting not only from the hidden forces of the frame, but also from forces originating from the other elements. A picture is perceived as balanced if these weights compensate each other. Furthermore, as a proponent of Gestalt psychology, Arnheim (1954) also assumed that perceptual grouping (by similarity of form, color, etc.) modulates the forces between the elements.

An alternative characterization of perceptual balance is to consider the subjective equilibrium of a picture. According to Arnheim (1954), every visual pattern has a center of perceptual “mass,” which depends on the perceptual weight of the elements. If this center coincides with the geometric center of the frame, then the picture is balanced. It is assumed that the perceptual weight of an element increases proportionally to the element's distance from the center of “mass” (lever principle in physics). However, the weight also depends on factors such as element size (larger elements are perceptually heavier than smaller ones), color (e.g., red is perceptually heavier than blue), and regularity (regular shapes are perceptually heavier than irregular ones). Arnheim conceded that most of these factors have to be verified, which is still valid today.

Some of Arnheim's (1954) main assumptions have already been tested. McManus et al. (1985), for instance, presented reproductions of art work as well as plain stimuli, and had their participants place a fulcrum beneath each picture so that it looked balanced (horizontally). For the reproductions of art work they found that the adjusted position of the fulcrum varied considerably, suggesting that art work is not generally well balanced. Moreover, when participants had to locate the perceptual center for unchanged pictures and for pictures where a portion was removed, the locations were rather similar. From these results McManus et al. (1985) concluded that the balance of a picture depends more “…upon a global integration of the picture as a whole, than of any individual element of it” (p. 314f).

Even for their plain stimuli McManus et al. (1985) found no simple relation. Whereas element position was crucial for positioning the fulcrum, size and color of the elements were less important. Furthermore, although the distance of an element from the frame's geometric center and its size led to a larger shift of the fulcrum, these two factors were not correctly integrated for the judgment of balance.

In a later study, Locher et al. (1996) used reproductions of twentieth-century art paintings and a manipulated less-balanced version of each. Art experts and non-experts had to rate the balance of each picture and to determine the (two-dimensional) center of perceptual “mass.” As a result, both groups moved the center for the disrupted version, but only the experts judged this version as less balanced. Locher et al. (1996) concluded that the center of perceptual “mass” and the overall judgment of balance are not as close as thought.

Because these results hardly support Arnheim's theory, Cupchik (2007) speculated that the terms of the theory were only meant metaphorically. McManus et al. (2011), however, believed that Arnheim wanted his theory to be taken literally, i.e., in a physical sense. To test their conjecture, they even went a step further and, instead of asking participants to indicate the fulcrum of pictures, calculated the center of “mass” by assuming that the “mass” of each pixel in a (gray-level) picture corresponds to the inverse of the pixel's gray level. They then examined whether the center was closer to an axis for art photographs than for control images, which was indeed the case.

Other tests, however, failed. For instance, in one experiment where McManus et al. (2011) presented simple pictures with only two discs but of a different gray level, performance was incompatible with a physical interpretation of balance. In view of these results, McManus et al. (2011) also came to the conclusion that the terms in Arnheim's theory cannot be taken literally.

The considered studies suggest that perceptual balance is a complex feature of pictures that depends on several factors, whose details are still largely unknown. However, the studies also demonstrate that computing objective measures for predicting subjective balance and preference is a promising approach for investigating these matters. As we have seen, McManus et al.'s (2011) physical interpretation of perceptual “mass” was not successful in this respect. However, there are other measures. Wilson and Chatterjee (2005), for instance, developed a test for the Assessment of Preference for Balance (APB). In connection with this test they introduced a measure, which we will call “APB,” that correlated highly with perceptual balance and preference (liking), at least for simple pictures such as those shown in Figure 1.

FIGURE 1

Figure 1. Example stimuli used in the experiments. Left panel: Pictures from the APB (Wilson and Chatterjee, 2005). The first and second numbers below each picture indicate the APB score (computed with our algorithm) and the DCM score, respectively. Note: the lower the value the higher the balance. In the top left picture, the different axes are additionally shown to demonstrate how the APB score is computed (see text for details). Right panel: Examples of the new stimuli. The intersections of the long lines in each picture indicate the respective center of “mass” (see text for details). The short lines imply the corresponding geometric center.

That the applicability of the APB measure might indeed be restricted to simple pictures is suggested by results of Gershoni and Hochstein (2011), who found only small correlations between this measure and ratings for Japanese calligraphy. Nevertheless, even if the measure predicts only preference between simple stimuli, it could be the starting point for the development of more sophisticated measures that also apply to complex pictures. Unfortunately, it is not even certain that the APB measure is valid for simple images. For instance, in a study by Silvia and Barona (2009), who used a subset of Wilson and Chatterjee's (2005) stimuli, no substantial correlation between APB scores and liking was observed.

One aim of the present study was to replicate Wilson and Chatterjee's (2005) results by applying complete sets of their original images. Furthermore, because the APB measure is the average of eight components, it was possible to use multiple regression analyses to examine the extent to which the components are related to perceptual balance and aesthetic preference. Such analyses have not been done before. A second aim of our study was to compare the APB measure to three other objective measures that have also been proposed for measuring balance: a measure of balance that is based on the physical interpretation of perceptual “mass,” a measure of mirror symmetry, and a measure of homogeneity. Finally, we wanted to examine to what extent the results can be generalized. Therefore, we also applied new sets of stimuli.

For replicating Wilson and Chatterjee's (2005) results and for comparing the APB measure with alternative measures, we conducted two experiments. In the first one we collected balance and symmetry ratings for pictures from the APB and examined how well the different measures can account for the judgments. In the second experiment different participants rated the same pictures with respect to aesthetic preference (liking). The ratings were then correlated with the judgments from Experiment 1, and with the different measures. As we will show, our results were similar to those of Wilson and Chatterjee (2005). However, some of the alternative measures were also highly correlated with balance or preference ratings. A third experiment, where participants had to rate the balance as well as the liking of new stimuli, revealed that the specific selection of stimuli has some effects on the results. Before we report our results in detail, however, we introduce the applied measures.

Assessment of Preference for Balance (APB)

Wilson and Chatterjee's (2005) test for the APB consists of images containing seven black elements of varying sizes that are scattered on a white square background (750 × 750 pixels). There are 65 images each with circles, hexagons, or squares. All elements within each image have the same shape (for examples see Figure 1). To also have an objective measure of balance for each picture, they created a specific score, defined by the mean of eight partial measures that are more or less related to symmetry. Relying on symmetry seems to be reasonable, because this feature is strongly related to balance and preference. Mirror symmetry, for instance, is the simplest form of balance. Accordingly, symmetric patterns can not only be processed and remembered more easily than asymmetric ones (Garner and Clement, 1963), they are also judged as more “beautiful” (Jacobsen and Höfel, 2002). On the other hand, balance can be understood as a more complex form of symmetry (Locher and Nodine, 1989).

For obtaining the APB score, two symmetry measures are computed around the vertical and the horizontal axes, and around the two diagonal axes, respectively. Assume that a picture is divided along the horizontal dimension into four vertical, equally sized rectangles (see upper left picture in Figure 1), denoted by A₁, A₂, A₃, and A₄, from left to right, respectively. If f denotes a function that counts the number of black pixels in a given area, then the number N of all such pixels in a picture is f (A₁) + f (A₂) + f (A₃) + f (A₄). The first partial symmetry measure for the horizontal dimension (around the vertical axis) is defined by h = (|[f (A₁) + f (A₂)] – [f (A₃) + f (A₄)]|/N)·100, i.e., the absolute difference between the number of black pixels in the left half and that in the right half of the picture in percent. The second measure for this dimension reflects the so-called horizontal inner-outer relation and is defined by h_io = (|[f (A₁) + f (A₄)] − [f (A₂) + f (A₃)]|/N)·100. Analogous partial measures are computed for each of the remaining three axes (the corresponding divisions of the picture area are shown in the upper left picture in Figure 1). The corresponding measures for the vertical dimension are denoted by v and v_io, those for the main diagonal (top left to bottom right) by md and md_io, and those for the anti-diagonal by ad and ad_io. Finally, the mean of the eight partial measures defines the APB score. Note that a low score (percentage) means high balance, whereas a high score reflects poor balance.
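As a minimal sketch of the two horizontal partial measures defined above, the following Python code computes h and h_io for a binary image stored as a list of rows (1 = black pixel, 0 = white pixel). The function names, the image representation, and the requirement that the width be divisible by four are assumptions made for illustration only. The diagonal measures are left out because their area divisions are given only in Figure 1; the vertical measures v and v_io are obtained here by transposing the image.

```python
def count_black(image, col_start, col_stop):
    """Count black pixels (value 1) in columns col_start .. col_stop - 1."""
    return sum(sum(row[col_start:col_stop]) for row in image)


def horizontal_partials(image):
    """Return (h, h_io) in percent for a binary image given as rows of 0/1."""
    width = len(image[0])
    q = width // 4  # width of each strip A1..A4 (assumes width divisible by 4)
    a1, a2, a3, a4 = (count_black(image, k * q, (k + 1) * q) for k in range(4))
    n = a1 + a2 + a3 + a4  # total number N of black pixels
    h = abs((a1 + a2) - (a3 + a4)) / n * 100     # left half vs. right half
    h_io = abs((a1 + a4) - (a2 + a3)) / n * 100  # outer strips vs. inner strips
    return h, h_io


def vertical_partials(image):
    """Return (v, v_io) by applying the same rule to the transposed image."""
    transposed = [list(col) for col in zip(*image)]
    return horizontal_partials(transposed)


# Hypothetical 8 x 8 toy picture: 4, 2, 1, and 1 black pixels in A1..A4.
toy = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
print(horizontal_partials(toy))  # (50.0, 25.0)
```

For the toy picture, h = |6 − 2|/8 · 100 = 50 and h_io = |5 − 3|/8 · 100 = 25. A full APB score would additionally need the four diagonal partial measures before averaging all eight values.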

Deviation of the Center of “Mass” (DCM)

Because the APB score is only loosely related to physics, we also applied a measure that is more strongly related to a physical interpretation of balance in the sense of Arnheim (1954). For this objective we computed a measure that represents the deviation of the center of “mass” (DCM) from the picture's geometrical center. Assume two elements with visual “masses” m₁ and m₂, respectively, arranged on a beam. A point located between these objects at distances d₁ and d₂, respectively, is the center of “mass” (balance point, fulcrum) if m₁d₁ = m₂d₂. A practical way to calculate the center is to calculate the distances r₁ and r₂ of the “masses” from an arbitrary reference point (see McManus et al., 2011). The balance center is then located at distance r = (m₁r₁ + m₂r₂)/(m₁ + m₂).

For the black-and-white pictures used in this study, we assumed that the “mass” of a black pixel is one, whereas that of a white pixel is zero. If we choose position x = 0 as reference point, then the center of “mass” b_x on the horizontal dimension is located at position:

\begin{array}{l} b_{x} = \frac{\sum_{i = 1}^{w} m_{i} r_{i}}{\sum_{i = 1}^{w} m_{i}}, \end{array}

where w is the picture width, m_i the number of black pixels in column i, and r_i the distance of column i from the reference point. The center for the vertical dimension is calculated analogously. In Figure 1, the line intersections in the two upper right pictures indicate the respective locations of the center of “mass.” The geometric centers are implied by the short lines.

In the present study we used the normalized location b′_x = b_x/w, which can vary from zero to one. For these coordinates the geometrical center is at 0.5, and the horizontal distance to the center of “mass” is d_x = 0.5 − b′_x. An analogous distance d_y was calculated for the vertical dimension. The DCM measure of balance is then defined by the Euclidean distance of the two-dimensional center of visual “mass” to the geometrical center of the image. Specifically, we used the relative deviation in percent:

\begin{array}{l} D C M = (\frac{\sqrt{d_{x}^{2} + d_{y}^{2}}}{0.5}) 100 . \end{array}
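The DCM computation can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the image is assumed to be a list of rows of 0/1 values (1 = black), the function names are hypothetical, and the distance r_i of column i is taken as the pixel center (i − 0.5 for 1-based i), an assumption chosen so that a perfectly mirror-symmetric picture yields a normalized center of exactly 0.5.

```python
import math


def normalized_center(counts):
    """Normalized center of 'mass' b' = b / w for black-pixel counts per column (or row)."""
    total = sum(counts)
    # Assumption: the distance r_i of line i from position 0 is its pixel center, i + 0.5 (0-based i).
    b = sum(m * (i + 0.5) for i, m in enumerate(counts)) / total
    return b / len(counts)


def dcm(image):
    """Deviation of the center of 'mass' from the geometric center, in percent."""
    width = len(image[0])
    col_counts = [sum(row[j] for row in image) for j in range(width)]
    row_counts = [sum(row) for row in image]
    d_x = 0.5 - normalized_center(col_counts)
    d_y = 0.5 - normalized_center(row_counts)
    return math.sqrt(d_x ** 2 + d_y ** 2) / 0.5 * 100


# Hypothetical 4 x 4 toy pictures.
balanced = [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]]
left_heavy = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(dcm(balanced))    # 0.0
print(dcm(left_heavy))  # 75.0
```

In the second toy picture all "mass" lies in the leftmost column, so b′_x = 0.125, d_x = 0.375, d_y = 0, and DCM = 0.375/0.5 · 100 = 75.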

Mirror Symmetry (MS)

As shown, the APB score is the mean of different measures, most of which are based on the symmetry around some axis of the picture. Symmetry, however, is reflected only coarsely by these measures. Therefore, we also considered a measure of mirror symmetry (MS) that is defined by the mean of mirror-symmetry measures around different axes. The partial score for a given axis was computed by a formula suggested by Bauerly and Liu (2006). Assume that the vertical axis is the axis of reflection and that m and w denote the height and width of the image in pixels, respectively. The required number of comparisons n for each row is w/2 if w is even, and (w-1)/2 if w is odd. Assume further a binary variable X_ij that is 1 if there is a match between pixels and 0 otherwise. Finally, there is a factor that reduces the weight of the match the farther away from the axis of reflection it is. The symmetry s for the vertical axis is then:

\begin{array}{l} s = \frac{2}{3 m n} \sum_{i = 1}^{m} \sum_{j = 1}^{n} X_{i j} (1 + \frac{j - 1}{n - 1}) . \end{array}

Analogous measures were computed for the horizontal axis and for each of the two diagonals. At the end, the four measures were multiplied by 100 and averaged. The resulting MS score is the mean symmetry in percent. The higher the value the more symmetric the picture.
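The partial symmetry score for the vertical axis can be sketched as follows. Two points are assumptions that the text does not settle: the index j is taken to count pixels from the outer edge toward the axis of reflection (so that the weight 1 + (j − 1)/(n − 1) grows from 1 at the edge to 2 next to the axis), and a "match" is taken to mean that both mirrored pixels are black. The function name and the list-of-rows image format (1 = black) are likewise hypothetical.

```python
def vertical_axis_symmetry(image):
    """Weighted mirror symmetry s around the vertical axis, between 0 and 1."""
    m = len(image)     # image height in pixels
    w = len(image[0])  # image width in pixels
    n = w // 2         # comparisons per row: w/2 if w is even, (w-1)/2 if odd
    total = 0.0
    for row in image:
        for j in range(1, n + 1):
            left = row[j - 1]   # j-th pixel counted from the left edge
            right = row[w - j]  # its mirror image on the right side
            # Assumption: only black-black pairs count as a match.
            x_ij = 1 if (left == 1 and right == 1) else 0
            total += x_ij * (1 + (j - 1) / (n - 1))  # requires n >= 2
    return 2 / (3 * m * n) * total


all_black = [[1, 1, 1, 1], [1, 1, 1, 1]]
one_pair = [[1, 0, 0, 1], [0, 1, 0, 0]]
print(vertical_axis_symmetry(all_black))
print(vertical_axis_symmetry(one_pair))
```

The weights sum to 1.5·n per row when every comparison matches, which is why the factor 2/(3mn) normalizes s to a maximum of 1. In the second toy picture only the outermost pair of the first row matches (weight 1), giving s = 2/12 ≈ 0.167.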

Homogeneity (HG)

If we consider the pictures of the APB (for examples see the left panel in Figure 1), then it is obvious that balance is confounded to some extent with homogeneity. For many pictures it holds that, the less scattered the elements in the picture, the less balanced the picture. To investigate this relation in detail, we also wanted to include a measure of homogeneity. A measure that reflects this feature and that has widely been applied, among others for evaluating the design of user interfaces (e.g., Ngo et al., 2002), is information entropy (Shannon, 1948). Assume that we divide the picture area into M equally sized regions (bins). The entropy E is then defined by:

\begin{array}{l} E = - \sum_{i = 1}^{M} p_{i} \ln p_{i}, \end{array}

where p_i is the probability of black pixels in bin i, which is usually estimated by the corresponding relative frequency. For a given number of bins, the maximum entropy is reached if all bins contain the same number of black pixels. The value of this maximum is ln(M). Thus, a proper score of picture homogeneity can be obtained by calculating the relative entropy:

\begin{array}{l} E_{r} = \frac{E}{\ln M} . \end{array}

For the present study we computed separate values E_rx and E_ry for the horizontal and vertical dimension, respectively. For each dimension we divided the picture into 10 bins along the corresponding axis. The score HG, which reflects homogeneity in percentage, is then:

\begin{array}{l} H G = (\frac{E_{r x} + E_{r y}}{2}) 100 . \end{array}
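A minimal sketch of the homogeneity score, averaging the relative entropies of the horizontal and the vertical dimension as described in the text, might look as follows. The function names and the list-of-rows image format (1 = black) are assumptions, as is the requirement that width and height be divisible by the number of bins; empty bins contribute zero to the entropy, following the usual convention that 0 · ln 0 = 0.

```python
import math


def relative_entropy(bin_totals):
    """Relative entropy E / ln(M) of black-pixel counts over M bins."""
    total = sum(bin_totals)
    entropy = 0.0
    for count in bin_totals:
        if count > 0:  # convention: an empty bin contributes 0
            p = count / total
            entropy -= p * math.log(p)
    return entropy / math.log(len(bin_totals))


def to_bins(line_counts, n_bins):
    """Sum per-column (or per-row) counts into n_bins equally sized bins."""
    size = len(line_counts) // n_bins  # assumes length divisible by n_bins
    return [sum(line_counts[k * size:(k + 1) * size]) for k in range(n_bins)]


def hg(image, n_bins=10):
    """Homogeneity in percent: mean of horizontal and vertical relative entropy."""
    width = len(image[0])
    col_counts = [sum(row[j] for row in image) for j in range(width)]
    row_counts = [sum(row) for row in image]
    e_rx = relative_entropy(to_bins(col_counts, n_bins))
    e_ry = relative_entropy(to_bins(row_counts, n_bins))
    return (e_rx + e_ry) / 2 * 100


# Hypothetical 10 x 10 toy pictures.
uniform = [[1] * 10 for _ in range(10)]
one_column = [[1] + [0] * 9 for _ in range(10)]
print(hg(uniform))     # close to 100
print(hg(one_column))  # close to 50
```

The single black column is maximally concentrated horizontally (E_rx = 0) but evenly spread vertically (E_ry = 1), so the two dimensions average to about 50.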

Experiment 1

In our first experiment we collected balance and symmetry ratings for two sets of pictures (constructed from circles or from hexagons) taken from the APB (Wilson and Chatterjee, 2005) and examined to what extent these ratings correlate with the objective measures of balance, symmetry, and homogeneity.

Method

Participants were 18 students from the University of Konstanz. They were recruited via an online system (ORSEE, Greiner, 2015) for participating in the experiment. The data of two participants were excluded from data analysis, because one of them produced many extreme values (0 and 100), and the other misunderstood the rating scales. The remaining 16 participants (3 males) had an average age of 23 years (SD = 1.77). All had normal or corrected-to-normal vision and were paid 8 € for their participation. The experiment was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. In agreement with the ethics and safety guidelines at the Universität Konstanz, we obtained a verbal informed consent statement from all individuals prior to their participation in the study. Potential participants were informed of their right to abstain from participation in the study or to withdraw consent to participate at any time without reprisal.

Apparatus and Stimuli

The stimuli were presented on a 19″ LCD-monitor with a resolution of 1280 × 1024 pixels. A personal computer (PC) served for controlling stimulus presentation and response registration. The stimuli were all 65 pictures with circles and all 65 pictures with hexagons from Wilson and Chatterjee's (2005) APB. The pictures consisted of seven elements, which had the same shape, but varied in size. The APB score for each stimulus was calculated by our own algorithm, which produced values quite close (r > 0.999) to those provided by Wilson and Chatterjee (2005). Across pictures, the APB scores ranged from 3.49 to 65.9 (M = 35.3, SD = 17.4). DCM scores ranged from 1.51 to 79.5 (M = 35.4, SD = 25.1), homogeneity (entropy) from 64.1 to 94.8 (M = 82.9, SD = 7.71), and mirror symmetry from 1.13 to 10.9 (M = 3.91, SD = 1.65). The stimuli were presented at the center of the monitor on a black background. Each picture had an extension of 750 × 750 pixels, which approximately corresponded to a visual angle of 21° horizontally and vertically.

Procedure

After the participants had read the instructions and considered 6 example stimuli (3 with circles and 3 with hexagons), which were not used for the main task, they rated each picture with respect to balance and symmetry. Instead of a 1-to-5 rating scale, as in Wilson and Chatterjee (2005), we applied a continuous scale (1-to-100 slider bar) to reduce information loss (cf. Treiblmaier and Filzmoser, 2009). The scale went from “not balanced” to “balanced” for the balance rating, and from “not symmetrical” to “symmetrical” for the symmetry rating. The participants saw a horizontal slider located below the stimulus and had to move a computer mouse to adjust the slider to the position that corresponded to their subjective estimation of balance or of symmetry, respectively. The corresponding numeric value (not visible for the participants) of the chosen position was then entered by clicking the left mouse button. There was no time limit. Immediately after the value was entered, the next stimulus was displayed.

Balance and symmetry were assessed in alternating blocks of 130 trials. Half of the participants started with rating balance, the other half with rating symmetry. There were two blocks each for the balance and the symmetry rating. The 130 pictures (65 with circles and 65 with hexagons) were randomized within each block. The experiment lasted approximately 50 min.
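The block structure described above can be sketched as a small trial-plan generator. This is purely illustrative: the text states only that half of the participants started with each task, so assigning the starting task by the parity of a participant index is an assumption, as are the function name and the use of integer picture identifiers.

```python
import random


def session_plan(participant_index, picture_ids, rng):
    """Return four (task, picture order) blocks with alternating rating tasks."""
    # Assumption: even-numbered participants start with balance, odd with symmetry.
    if participant_index % 2 == 0:
        first, second = "balance", "symmetry"
    else:
        first, second = "symmetry", "balance"
    plan = []
    for task in (first, second, first, second):  # two blocks per task
        order = list(picture_ids)
        rng.shuffle(order)  # pictures are randomized anew within each block
        plan.append((task, order))
    return plan


pictures = list(range(130))  # 65 circle pictures and 65 hexagon pictures
plan = session_plan(0, pictures, random.Random(1))
print([task for task, _ in plan])  # ['balance', 'symmetry', 'balance', 'symmetry']
```

Each of the four blocks contains all 130 pictures once, which gives 520 trials per participant.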

Results

Balance Ratings

The mean balance ratings ranged from 16.2 to 79.3 (M = 43.2, SD = 15.3). They were subjected to a one-way within-participant ANOVA with the factor stimulus type (circles or hexagons). There was no significant difference (circles: 45.5, hexagons: 41.0) between the stimulus types, F_{(1, 15)} = 2.88, p = 0.113, $η_{p}^{2} = 0.159$ .

APB

The mean APB scores for pictures with circles and with hexagons were 34.9 and 35.7, respectively. In a first step we computed for each participant the correlation between the balance ratings and the scores across the 65 pictures with circles and across the 65 pictures with hexagons. For the pictures with circles the correlations ranged from −0.075 to −0.860, and for those with hexagons from −0.132 to −0.855. There was only one participant with non-significant (p > 0.05) correlations for both stimulus types. Three participants had a non-significant correlation for one of the stimulus types. The mean correlations are listed in Table 1.

TABLE 1

Table 1. Means (across participants) of the individual correlations (across pictures) between the balance and symmetry ratings and the different scores in Experiment 1.

Next, we computed the mean balance ratings across participants for each picture and correlated the obtained values across all 130 pictures with the different scores. The correlations and corresponding R²-values are shown in Table 2.

TABLE 2

Table 2. Correlations between the mean ratings and the different scores across both stimulus types in Experiment 1.
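The two correlation analyses reported in Tables 1 and 2 differ in the order of averaging and correlating. The following sketch, with entirely hypothetical toy data and function names, contrasts the mean of the individual correlations (one correlation per participant, then averaged) with the correlation of the mean ratings (ratings averaged per picture, then correlated with the scores).

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_individual_correlation(ratings, scores):
    """ratings[p][k] is the rating of participant p for picture k."""
    rs = [pearson(participant, scores) for participant in ratings]
    return sum(rs) / len(rs)


def correlation_of_mean_ratings(ratings, scores):
    """Average the ratings per picture first, then correlate with the scores."""
    n_participants = len(ratings)
    means = [sum(column) / n_participants for column in zip(*ratings)]
    return pearson(means, scores)


# Hypothetical data: 2 participants, 4 pictures, one objective score per picture.
scores = [10, 20, 30, 40]
ratings = [
    [80, 60, 40, 20],  # follows the scores perfectly (r = -1)
    [50, 70, 30, 40],  # noisier participant
]
print(mean_individual_correlation(ratings, scores))
print(correlation_of_mean_ratings(ratings, scores))
```

In this toy case the correlation of the mean ratings is stronger than the mean of the individual correlations, because averaging across participants removes part of the individual noise before the correlation is computed.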

As mentioned, the APB scores are the mean of eight different measures. This implies that each component has the same weight. To examine whether this is appropriate, we also computed a multiple linear regression for each of the two stimulus types and for both types together. The results are shown in Table 3. If we consider the regression across both stimulus types, then we see that R² increased for the APB score (Table 2) from 0.615 to 0.751, which demonstrates that different weights for the components can improve the predictive power of the score. The obtained individual coefficients indicate that the horizontal component (symmetry over the vertical axis) has by far the largest weight. In contrast, the inner-outer components hardly explained variance.

TABLE 3

Table 3. Regressions of balance ratings on the components of the APB scores in Experiment 1.
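Why differential weights cannot do worse than the equally weighted mean can be illustrated with a small ordinary least-squares sketch. The code below is not the analysis software used in the study; it solves the normal equations with plain Gaussian elimination, uses only two hypothetical components per picture instead of eight, and all data and names are invented for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        rest = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def r_squared(predictors, y):
    """R^2 of a least-squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1 - ss_res / ss_tot


# Hypothetical component values (two per picture) and mean balance ratings.
components = [(10, 40), (20, 10), (30, 50), (40, 20), (50, 60), (60, 30)]
ratings = [78, 72, 55, 49, 33, 30]

equal_weights = [[sum(c) / len(c)] for c in components]  # score = mean of components
r2_equal = r_squared(equal_weights, ratings)
r2_free = r_squared(components, ratings)
print(r2_equal, r2_free)
```

Because the mean of the components is itself a linear combination of them, the equally weighted model is nested in the freely weighted one, so R² of the multiple regression can only stay the same or increase.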

DCM

The mean DCM scores for circles and hexagons were 33.3 and 37.3, respectively. Correlations of the DCM scores with the balance ratings for individual participants ranged from −0.008 to −0.786 for circles and from −0.153 to −0.861 for hexagons. The mean correlations are listed in Table 1. As can be seen, the correlations for the DCM scores were somewhat higher than those for the APB scores. However, a comparison across both stimulus types revealed no significant difference (−0.467 vs. −0.486), F_{(1, 15)} = 2.11, p = 0.167, $η_{p}^{2} = 0.123$ . If we consider the mean balance ratings (see Table 2), then their correlation with the DCM scores was also numerically larger than that with the APB scores.

Because the DCM score represents the Euclidean distance to the image center, it is interesting to examine whether a linear combination of the horizontal and the vertical distance would have been a better measure. Therefore, we computed a multiple linear regression with these two components. As a result, there was a strong contribution of the horizontal deviation (see Table 4). However, R² was smaller than the corresponding value for the DCM score (0.605 vs. 0.675). Thus, the Euclidean distance of the center of “mass” from the image center is a better measure than the linear combination of the horizontal and the vertical deviation.

TABLE 4

Table 4. Regressions of balance ratings on the components of the DCM score in Experiment 1.

MS

The mean scores of mirror symmetry for circles and hexagons were 3.21 and 3.36, respectively. Correlations between the symmetry ratings and the MS scores for the individual participants varied between 0.058 and 0.333 for circles and between −0.043 and 0.317 for hexagons. The mean values, which are shown in Table 1, were relatively small, as was the correlation between the mean balance ratings and the MS scores (see Table 2). A regression of the balance ratings on the four components of the MS score improved R² only slightly from 0.175 to 0.196 (see Table 5). Although the diagonals significantly accounted for the balance ratings, the horizontal dimension (vertical axis of reflection) had the strongest effect.

TABLE 5

Table 5. Regressions of balance ratings on the components of the MS measure in Experiment 1.

HG

The mean HG scores for circles and hexagons were 83.8 and 82.0, respectively. Correlations of the HG scores with the balance ratings ranged for individual participants from 0.037 to 0.801 for circles and from −0.008 to 0.770 for hexagons. Mean correlations are shown in Table 1. The correlations for the HG measure are smaller in magnitude than those for the APB and DCM scores. Such a pattern also occurred for the correlations with mean balance ratings (see Table 2). A statistical test revealed that the mean correlation for the HG scores was significantly smaller in magnitude than that for the APB scores (0.411 vs. −0.467), F_{(1, 15)} = 25.6, p < 0.001, $η_{p}^{2} = 0.631$ .

To examine how the two dimensions of the HG measure are related to the balance ratings, we computed a multiple linear regression with the corresponding two components. It revealed that homogeneity along the vertical dimension was as important as that along the horizontal dimension. Accordingly, the regression did not increase R².

Symmetry Ratings

The symmetry ratings differed significantly between the two stimulus types, $F_{(1, 15)} = 14.0, p < 0.01, η_{p}^{2} = 0.484$ , indicating that the pictures with circles were perceived as more symmetric than those with hexagons (47.7 vs. 42.5). The mean correlations between the symmetry ratings and the different measures are shown in Table 1. Obviously, the symmetry ratings were rather weakly correlated with the MS scores. Six participants had at least one non-significant correlation (p > 0.05). Interestingly, the APB scores correlated higher with the symmetry ratings than with the balance ratings (−0.691 vs. −0.467), $F_{(1, 15)} = 12.9, p < 0.01, η_{p}^{2} = 0.462$ , which was also the case for the DCM scores (−0.670 vs. −0.486), $F_{(1, 15)} = 12.8, p < 0.01, η_{p}^{2} = 0.461$ , and for the HG scores (0.627 vs. 0.411), $F_{(1, 15)} = 13.9, p < 0.01, η_{p}^{2} = 0.481$ .

The mean correlations were also somewhat larger for the DCM scores than for the APB scores (−0.691 vs. −0.700), which, however, was not significant, F_{(1, 15)} = 0.610, p = 0.447, $η_{p}^{2} = 0.039$ . A similar pattern of correlations occurred for the mean scores. Table 2 shows that the symmetry ratings correlated highly with the balance ratings (shared variance was 86%). In view of this correspondence we did not further analyze the symmetry ratings and their relation with the different measures.
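For readers who want to relate the reported effect sizes to the F ratios, partial eta squared can be recovered from F and its degrees of freedom as η²_p = (F · df_effect)/(F · df_effect + df_error). The short sketch below applies this standard identity to two of the tests reported above; the function name is hypothetical, and small deviations for other tests can arise because the published F values are rounded.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F(1, 15) = 0.610: DCM vs. APB correlations with the symmetry ratings
print(round(partial_eta_squared(0.610, 1, 15), 3))  # 0.039
# F(1, 15) = 12.9: APB correlations with symmetry vs. balance ratings
print(round(partial_eta_squared(12.9, 1, 15), 3))   # 0.462
```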

Discussion

Our results show that the mean ratings for balance and symmetry were highly correlated, which suggests that it was difficult for the participants to operationalize the two concepts differently. That the two ratings were nevertheless not identical is indicated by the fact that the symmetry ratings were significantly higher for pictures with circles than for those with hexagons, which was not the case for the balance ratings. Moreover, the APB, DCM, and HG scores correlated higher with the symmetry ratings than with the balance ratings.

With respect to the APB scores, we replicated the result of Wilson and Chatterjee (2005). The mean scores correlated highly with the mean ratings of balance. However, it also became clear that the individual correlations were much smaller and varied considerably across participants. Furthermore, a regression analysis of the mean balance ratings on the components of the APB scores revealed that the components accounted differently for the balance ratings. The horizontal dimension had the largest effect, followed by the vertical one. Whereas the diagonal components also contributed to a small but significant extent, the inner-outer components had a negligible effect. In all, a differential weighting of the individual components increased the percentage of explained variance, compared to the original score with equal weights (averaging).

The DCM scores correlated surprisingly highly with the ratings. The correlations were numerically even higher than those for the APB scores, which shows that it was actually not necessary to invent a new score for measuring balance. The HG scores also correlated substantially with subjective balance and symmetry, although not as highly as the APB and the DCM scores. Interestingly, in contrast to the other measures, the vertical dimension was as important for this correlation as the horizontal one. The MS scores had the weakest relation to the ratings, suggesting that perception does not take mirror symmetry into account, at least not for the current type of pictures.

Taken together, the results show that objective measures can be constructed that reflect perceptual balance (and symmetry), at least for the relatively simple pictures used here. A straightforward method is simply to compute how much the center of “mass” deviates from the geometric center of the picture. The larger the deviation the less balanced the picture. Another method would be to compute APB scores. However, although this measure also correlated highly with the ratings, a closer look at the pictures reveals an inconsistency. Table 6 includes three pictures from the APB whose APB scores increase from left to right. Obviously, Picture #45 is less balanced than Picture #27. However, it is hard to believe that Picture #46 should be less balanced than Picture #45. That this is inconsistent with one's impression is also confirmed by our balance ratings. Picture #46 received a rating that was even higher than that of Picture #27. In contrast to the APB score, the DCM measure reflects this order. This strongly favors the DCM score as a measure of perceptual balance.

TABLE 6

Table 6. Example pictures from the APB and corresponding ratings and objective scores.

An analysis of the components of the APB measure revealed that the reason for this inconsistency is the inner-outer components, which represent the difference in black pixels between the inner and the outer areas. Consequently, if elements are present only in the center, as in Picture #46, then these components have a high value, indicating imbalance, which, however, does not reflect subjective balance. If we consider the different measures in Table 6, then it is obvious that the inner-outer components correspond to homogeneity. Indeed, homogeneity is highest for Picture #27. Thus, the APB measure is not very well suited for representing balance.

Experiment 2

In our second experiment we wanted to collect preference ratings for the pictures shown in the first experiment. To avoid any influence from a second task, the participants had merely to indicate how much they liked each picture. The main goal was to examine to what extent the different ratings from Experiment 1 and the introduced measures can account for preference judgments.

In this context we also wanted to replicate the results of Wilson and Chatterjee (2005), who found a high correlation between their APB score and liking. In a subsequent study, Silvia and Barona (2009) could not replicate this result. However, their main goal was to test the hypothesis that pictures with curved elements are preferred to those with angular elements (Bar and Neta, 2006). Therefore, they applied only a selection of 9 pictures with circles and one of 9 pictures with hexagons from the APB to construct three different levels of balance. Whereas pictures with circles were indeed preferred to those with hexagons, the correlation between APB score and liking was rather low.