Similar Personality Patterns Are Associated with Empathy in Four Different Countries

Empathy is an important human ability associated with successful social interaction. It is currently unclear how to optimally measure individual differences in empathic processing. Although the Big Five model of personality is an effective model to explain individual differences in human experience and behavior, its relation to measures of empathy is currently not well understood. Therefore, the present study was designed to investigate the relationship between the Big Five personality concept and two commonly used measures for empathy [Empathy Quotient (EQ), Interpersonal Reactivity Index (IRI)] in four samples from China, Germany, Spain, and the United States of America. This approach was designed to advance the way the Big Five personality model can be used to measure empathy. We found evidence of medium effect sizes for associations between personality and empathy, with agreeableness and conscientiousness as the most important predictors of affective and cognitive empathy (measured by the respective IRI subscales) as well as for a one-dimensional empathy score (measured by the EQ). Empathy in a fictional context was most closely related to openness to experience while personal distress was first of all related to neuroticism. In terms of culture, we did not observe any distinct pattern concerning cultural differences. These results support the cross-cultural applicability of the EQ and the IRI and indicate structurally similar associations between personality and empathy across cultures.


INTRODUCTION
Empathy represents an important human social construct enabling successful social interactions (e.g., Ford, 1982). In the broadest sense, empathy is defined as the reactions of one individual to the observed experiences of another (Davis, 1983), although this is only one of many definitions discussed in the literature (e.g., Walter, 2012). Despite a large body of research designed to understand empathy, the way empathy is conceptualized is currently not well understood. For example, there are ongoing discussions about disentangling (sub)components of empathy (e.g., Batson, 2009), its measurement (e.g., Melchers et al., 2015), and possible causes of individual differences (e.g., Moore, 1990; Knafo et al., 2008). This study was designed to help resolve these issues, in part, by investigating the association between the five-factor model of personality and trait empathy across cultures.
Personality is one effective way to measure individual differences in cognitive thinking patterns or emotional tendencies. The Big Five personality model has developed as one of the most representative models of personality structure (Deary and Matthews, 1993; McCrae and Costa, 2003). Although the Big Five are associated with a variety of processes and concepts related to empathy, such as attachment styles and romantic relationship outcomes (Shaver and Brennan, 1992), emotional intelligence (Van der Zee et al., 2002), or impulsivity (Whiteside and Lynam, 2001), there are, to our knowledge, only three studies dealing with the relationship of empathy with the Big Five. Wakabayashi and Kawashima (2015) investigated the relationship between the NEO-FFI (Costa and McCrae, 1992) and empathy as measured by the Empathy Quotient (EQ; Baron-Cohen and Wheelwright, 2004) in a sample of Japanese university students. The authors found nearly no relationship (only 2-3% explained variance in terms of overlap), and associations were significant only for agreeableness and extraversion as predictors. The authors attributed their results in part to cultural factors. In addition, they argued that the EQ was designed to measure cognitive empathy (i.e., the ability to take the perspective of another person), which (according to the authors) is not closely related to more affective personality traits such as agreeableness. In contrast, Nettle (2007) found strong associations between the Big Five and the EQ. In his study, data were collected with an online questionnaire in the USA, and the Big Five were assessed using the International Personality Item Pool (IPIP) five-factor personality scales (Goldberg, 1999). Nettle (2007) observed that agreeableness (r = 0.75) and extraversion (r = 0.37) were significantly associated with higher EQ values.
Nettle raised the question of whether agreeableness and empathy may be the same general concept, although one must keep in mind that a correlation of this size still leaves more than 40% (1 − 0.75² ≈ 0.44) of the variance unshared between the two questionnaires. Finally, del Barrio et al. (2004a) investigated a sample of Spanish adolescents. Empathy was assessed with the Spanish version of the Bryant Empathy Index for Children and Adolescents (Bryant, 1982; del Barrio et al., 2004b), and the Big Five were measured with the Spanish version of the Big Five Questionnaire (Caprara et al., 1993), in which some dimensions are named differently compared to those in the classic inventories NEO-FFI or NEO-PI-R. The authors found medium-sized relations between empathy and the Big Five, especially friendliness (agreeableness), openness, and conscientiousness. Overall, these results do not provide a complete understanding of the association between personality and empathy, which may, in part, be influenced by cultural differences as well as by inconsistent measurement methods.
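The shared-variance argument above is simple arithmetic on the squared correlation. The following minimal sketch reproduces it for the two correlations reported by Nettle (2007); the function name `shared_variance` is a hypothetical helper introduced here for illustration only.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two measures share when they correlate at r (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    return r * r


# Correlations between the EQ and two Big Five scales as quoted in the text.
for label, r in [("agreeableness", 0.75), ("extraversion", 0.37)]:
    shared = shared_variance(r)
    print(f"{label}: shared = {shared:.4f}, not shared = {1 - shared:.4f}")
```

For r = 0.75 the unshared proportion is 1 − 0.5625 = 0.4375, which is the "more than 40%" referred to in the text.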
An important aspect to consider when comparing samples from different cultures is the cross-cultural applicability of the Big Five. The Big Five have been translated and used in numerous languages under the assumption of cross-cultural applicability (Schmitt et al., 2007). Despite this frequent use, some researchers are skeptical whether it is possible to use the same questionnaire within different cultural contexts (e.g., Poortinga and Van Hemert, 2001). Indeed, a number of concerns need to be kept in mind when using Big Five measures within different cultural contexts. For example, it has been discussed whether the personality trait taxonomy, which is the basis for the Big Five structure, can be applied to different languages, because studies provided evidence for structural inhomogeneities depending on language (e.g., De Raad, 1998) and some authors assume language-specific idiosyncrasies (e.g., Juni, 1996). On the other hand, studies testing the generalizability of the Big Five factor solution have replicated the five-factor structure quite well for many languages (Church and Lonner, 1998; Jolijn Hendriks et al., 2003). Another problem concerns whether data collected in different cultural contexts possess scalar equivalence (e.g., Byrne and Campbell, 1999) or whether encountered differences may be the result of matching translations, uneven sampling, or culture-specific styles of responding (Schmitt et al., 2007). In addition, different cultural groups may respond with different reference groups in mind, which in turn may influence their response patterns (e.g., Heine et al., 2002).
Despite these and many other difficulties (for a more detailed account see, for example, Van de Vijver and Leung, 2001; McCrae and Allik, 2002), new research methods and strategies for data analysis allow meaningful cultural comparisons. These safeguards (see also Schmitt et al., 2007) include, for example, specific guidelines for the translation of questionnaires (e.g., Beaton et al., 2000), the use of structural equation models (e.g., Cheung and Rensvold, 2000), or the implementation of new scales or items to improve the comparability of questionnaires (e.g., van de Vijver, 2000).
Initial research on the cross-cultural applicability of the Big Five (e.g., John and Srivastava, 1999) has shown replicability for the Germanic languages and has drawn a more diverse picture for non-Western cultures and languages, especially questioning the universality of the openness dimension. In addition, research on the NEO-PI-R has demonstrated the cultural sensitivity of agreeableness and extraversion, because some of their facets load on other dimensions depending on culture (McCrae and Allik, 2002). More recent research found configural but no factorial or scalar invariance for the Big Five (Nye et al., 2008) and noticeable differences in item functioning (Church et al., 2011). Some researchers also found evidence for additional indigenous factors such as positive and negative evaluation (Almagor et al., 1995; Benet and Waller, 1995) or Chinese tradition (Cheung and Leung, 1998). In sum (see also McCrae and Allik, 2002), the cross-cultural validity of the dimensions extraversion, agreeableness, and conscientiousness seems clearly established, although extraversion and agreeableness are culture-sensitive. For neuroticism, intercultural applicability is also qualified, although in this case results also depend on the methodological approach (etic vs. emic, see also Davidson et al., 1976). The cross-cultural validity of the openness dimension is questionable to an extent that leads some authors to argue for dropping the dimension. Although there are problems and concerns when using the Big Five model in cross-cultural research, the model still is "the best working hypothesis of an omnipresent trait structure".
Based on the previous studies concerning the empathy-personality relationship, the current study follows two major aims. The first is to address the question whether (and how) the Big Five personality model is associated with empathy. Second, because previous studies have utilized inconsistent measurement approaches across different cultures, we investigated the effect of culture on the association between personality and empathy by using the same measurement tools (set of questionnaires) across samples (China, Germany, Spain, and the United States of America). The design of this approach may help to explain the inconsistent results reported in previous studies, and may also shed light on general differences concerning questionnaire responses across cultures. For the measurement of empathy, we selected two of the most commonly used measures, the Interpersonal Reactivity Index (IRI; Davis, 1983) and the EQ. The IRI measures cognitive (i.e., the ability to put oneself in the shoes of another person) and affective (i.e., the ability to feel with another person) aspects of empathy on separate scales [as well as empathy in a fictional context (fantasy) and distress triggered by social interaction (personal distress)]. The EQ measures all empathy components in one score. We measured the Big Five by using the NEO-FFI.
Based on prior studies demonstrating associations between the Big Five and empathy, we predicted positive associations between empathy measures (EQ as well as the cognitive and affective subscales of the IRI) and agreeableness, conscientiousness, openness, and extraversion. In terms of culture, we did not predict major differences across cultures. Due to the heterogeneity of previous mono-cultural results and because of the lack of studies comparing the empathy-personality relationship between cultures, we refrained from any specific predictions concerning culture.

Participants
We obtained data in China (N = 438; n = 167 women; M (age) = 19.61), Germany (N = 304; n = 232 women; M (age) = 21.17), Spain (N = 62; n = 35 women; M (age) = 28.02), and the United States of America (N = 92; n = 56 women; M (age) = 19.58) by use of a self-constructed online questionnaire based on the Adobe Go Live software. The questionnaire was designed in such a way that participants could not send back incomplete data. All data were collected within a university environment. Participants (mainly undergraduates) were invited to participate via online advertisement and in classes. Participants received no monetary compensation for their participation but received credits (as part of their study regulations). Written informed consent to participate was obtained prior to testing, and the study was approved by the local ethics committees at the Universities of Georgia, USA and Bonn, Germany.

Measures
In this study, self-reported empathy is measured with the IRI (Davis, 1983) and the EQ (Baron-Cohen and Wheelwright, 2004), because both are among the most commonly used questionnaires in studies on empathy.
The IRI consists of the four subscales empathic concern, perspective taking, fantasy and personal distress. According to Davis (1983), these exhibit orthogonality, which is why they are not aggregated into a total score. The first two subscales measure affective (empathic concern; EmC) and cognitive (perspective taking; PeT) aspects of empathy. Fantasy (Fan) grasps subjects' ability to transpose themselves into feelings, thoughts and actions of fictional characters, and personal distress (PeD) has been defined as the "self-oriented" feelings of personal anxiety and unease in tense interpersonal settings (Davis, 1983). Translated versions of the questionnaire were used [Siu and Shek (2005) for the Chinese version, Enzmann (1996) for the German version and Pérez-Albéniz et al. (2003) for the Spanish version]. Final questionnaire versions were inspected by native speakers and retranslated into English as an additional quality check.
The EQ (Baron-Cohen and Wheelwright, 2004, p. 166) was developed in the context of research on autism as a screening device to measure empathic abilities. It yields a general empathy score, which does not differentiate between facets of empathy (as done in the IRI), because the authors argue that although a cognitive and an affective side of empathy exist, both co-occur in most instances and therefore cannot be easily disentangled. The EQ does, however, exhibit a three-dimensional structure with underlying factors called cognitive empathy, emotional reactivity, and social skills (Lawrence et al., 2004; Muncer and Ling, 2006), which can be measured separately. Translated versions of the questionnaire were taken from www.autismresearchcentre.com/arc_tests. Final questionnaire versions were inspected by native speakers and retranslated into English as an additional quality check. For more details concerning the empathy measures, please see also Melchers et al. (2015).
The NEO-FFI questionnaire (Costa and McCrae, 1992) is one of the most common multidimensional inventories assessing the five most important personality domains extraversion, neuroticism, openness to experience, conscientiousness and agreeableness according to the lexical approach to personality. These domains are measured on the level of interval scaling. The NEO-FFI consists of 60 items and has proven good reliability and validity. In case of the Spanish and the USA sample, values for the NEO-FFI were calculated from the longer NEO-PI-R version. Translated versions of the questionnaire were used [McCrae et al. (1996) for the Chinese version, Borkenau and Ostendorf (1993) for the German version and Aluja et al. (2002) for the Spanish version]. Final questionnaire versions were inspected by native speakers and retranslated into English as an additional quality check.

Statistical Methods
Initially, we checked for associations of empathy with age and gender in all samples by use of correlations and analyses of variance (ANOVA), respectively. These analyses were conducted because gender (e.g., Schmitt et al., 2008; Derntl et al., 2010) as well as age (e.g., Lennon and Eisenberg, 1987; Roberts et al., 2006) have been shown to affect personality and empathy. This is particularly important because our samples were not equal concerning these variables (compare results section). Next, we performed sensitivity analyses for the samples by use of the G*Power 3.1 software (Faul et al., 2009). We did so because of the varying sample sizes, which led to differences in sensitivity between samples. For these analyses, we used t-tests with α = 0.01, power = 0.80, and the respective sample size as input parameters. We then investigated the relationship between empathy and personality in the total sample by use of Pearson's correlations, and by use of stepwise hierarchical regressions to predict EQ values or one of the IRI dimensions, respectively, by use of the Big Five. The latter analyses were performed to identify the most important predictors for empathy and to account for multicollinearity between the five dimensions of the NEO-FFI. Subsequently, we investigated whether there were mean differences in questionnaire responses between samples by use of ANCOVA, controlling for age and including gender as a second factor. Finally, we analyzed and compared the relationship between personality and empathy for each sample in the same manner as we analyzed the data across samples. For the correlations, we further analyzed whether there were significant differences in the size of correlations between cultures by use of z statistics. In addition, we once again performed sensitivity analyses by use of the G*Power 3.1 software.
For these analyses, we used z-tests (Cohen's q; Rosnow and Rosenthal, 2003) with α = 0.01, power = 0.80 and the respective sample sizes as input parameters. In case of the regression models, we used a stepwise forward selection approach with gender and age as a first step and the Big Five dimensions as a second step. For all analyses, a significance threshold of p = 0.01 was defined to account for multiple testing of the five personality dimensions (p = 0.05 divided by five tests).
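The comparison of correlations between two independent samples can be sketched as follows. This is a minimal illustration that assumes the standard Fisher r-to-z approach, in which Cohen's q is the difference between the transformed coefficients; the text does not spell out the exact computation, and the function name `compare_correlations` and the example coefficients are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

# Corrected threshold as described in the text: 0.05 divided by five tests.
ALPHA = 0.05 / 5


def compare_correlations(r1: float, n1: int, r2: float, n2: int):
    """Compare two independent Pearson correlations via Fisher's r-to-z transform.

    Returns Cohen's q (difference of the transformed coefficients),
    the z statistic, and the two-sided p value.
    """
    q = atanh(r1) - atanh(r2)
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = q / standard_error
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return q, z, p


# Hypothetical coefficients; only the sample sizes (China, Germany) come from the text.
q, z, p = compare_correlations(0.40, 438, 0.20, 304)
print(f"q = {q:.3f}, z = {z:.2f}, p = {p:.4f}, significant: {p < ALPHA}")
```

With unequal sample sizes, the standard error is dominated by the smaller sample, which is why comparisons involving the Spanish and US samples are less sensitive.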

Age, Gender, and Sensitivity
Descriptive data concerning gender distribution and mean age are presented in Table 1. In both cases, there were highly significant differences between our four samples. In case of age, all post hoc comparisons except the China/USA comparison delivered significant results. In case of gender, especially the distribution in the Chinese sample (more male than female participants) differed from the other nations.
Table 1 note: Index numbers indicate significant post hoc tests; values with the same index number (per line) differ significantly from each other (Bonferroni-corrected).
Concerning gender, there was only one significant difference in the Chinese sample: Chinese women report more stress in social interaction than men [F (1,436) = 16.282, p < 0.001].
In contrast, we found many differences depending on gender in the German sample: women showed higher values in all empathy dimensions except perspective taking [Fan: Women scored higher in empathy and agreeableness while men scored higher in openness. Finally, in the sample from the USA we observed no significant gender differences.
Analyses concerning our samples' sensitivity revealed the following results: Given the respective sample sizes, α = 0.01, and an expected power of 0.80, we could detect correlation coefficients down to r = 0.161 for the Chinese sample (t = 2.587, df = 436), r = 0.193 for the German sample (t = 2.592, df = 303), r = 0.408 for the Spanish sample (t = 2.660, df = 60), and r = 0.341 for the US-American sample (t = 2.632, df = 90).
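The order of magnitude of these sensitivity values can be checked with a short sketch. Note the assumption: the values above were obtained with t-tests in G*Power, whereas the code below uses the simpler Fisher z normal approximation, so it gives slightly larger values, especially for the small samples. The function name `min_detectable_r` is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n: int, alpha: float = 0.01, power: float = 0.80) -> float:
    """Smallest correlation detectable with the given two-sided alpha and power.

    Uses the Fisher z approximation (an assumption made here for simplicity,
    not the exact t-based routine of G*Power).
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = normal.inv_cdf(power)
    return tanh((z_alpha + z_power) / sqrt(n - 3))


# Sample sizes as reported in the Participants section.
samples = {"China": 438, "Germany": 304, "Spain": 62, "USA": 92}
for country, n in samples.items():
    print(f"{country}: minimum detectable r = {min_detectable_r(n):.3f}")
```

The approximation preserves the ordering reported above: the Chinese sample is the most sensitive and the Spanish sample the least.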

Association Between Personality and Empathy in the Total Sample
Mean questionnaire responses and standard deviations across all four subsamples are presented in Table 1. Overall, these are comparable to the responses found in other studies administering the IRI, the EQ, or the NEO. Table 2 shows the partial correlations between empathy and personality (corrected for age) in the total sample. Here, agreeableness is the personality dimension demonstrating the highest correlation with the EQ as well as with the affective and the cognitive empathy subscales of the IRI. For fantasy, openness is the personality dimension with the highest correlation; for personal distress, it is neuroticism. Analyses of gender differences in these correlations revealed no significant differences. Table 3 shows the hierarchical regression models predicting either the EQ or one of the IRI subscales by gender and age (first step) and personality (second step) in a forward selection approach. Results once again show that agreeableness (and, in case of the EQ, to a lesser extent conscientiousness) is the most important personality dimension to predict the classical empathy dimensions. The relationships between openness and fantasy as well as between neuroticism and personal distress are still highly significant and keep about the same effect size although the regression model controls for potential effects of multicollinearity. Overall, the EQ and personal distress each share about twice as much variance with personality as perspective taking or empathic concern. Fantasy, in comparison, is predicted only poorly.
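Correcting a correlation for age, as done for Table 2, amounts to a first-order partial correlation. The sketch below shows the textbook formula for one covariate; the function name `partial_correlation` and all input values are hypothetical and are not taken from the study's data.

```python
from math import sqrt


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z.

    r_xy, r_xz and r_yz are the zero-order Pearson correlations.
    """
    denominator = sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("partial correlation undefined when |r_xz| or |r_yz| equals 1")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values: x = agreeableness, y = EQ score, z = age.
r_partial = partial_correlation(r_xy=0.40, r_xz=-0.10, r_yz=-0.05)
print(f"age-corrected correlation = {r_partial:.3f}")
```

When the covariate is only weakly related to both variables, as in this made-up example, the partial correlation stays close to the zero-order correlation.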

Comparison of Self-Report Measures Between Samples
Mean questionnaire responses and standard deviations of the four samples are also presented in Table 1. With the exception of perspective taking, fantasy, and personal distress, there were significant differences between countries for all administered questionnaires (corrected for age and with gender as a second factor). The largest number of significant differences was found between the USA and Germany (5), followed by China and Germany (4) and Spain and China/the USA (3 each). Results have been corrected for age. To be able to detect gender differences, coefficients were calculated separately for female and male participants. The comparison of the relationship of personality to empathy between samples revealed only three significant differences: for the EQ and extraversion (China/Germany), for perspective taking and conscientiousness (China/Germany), and for fantasy and openness (China/Spain/USA). It should be noted, however, that the power to detect differences between samples varies depending on the respective sample combination due to the differences in sample size (compare methods), as can be derived from the results of our additional sensitivity analysis. Given α = 0.01, power = 0.80, and the respective sample sizes, we could detect the following differences between correlation coefficients: China/Germany: q = 0.256; China/Spain: q = 0.474; China/USA: q = 0.398; Germany/Spain: q = 0.486; Germany/USA: q = 0.412; Spain/USA: q = 0.574. Table 5 shows the results of the hierarchical regressions, which account for multicollinearity between the Big Five dimensions. Overall, the Big Five explained 14-46% of the variance in EQ responses, depending on the country subsample. For the IRI, results are more mixed. With respect to the affective and cognitive components of empathy and the fantasy scale, the Big Five explain between 5 and 22% of variance; for personal distress, they explain much more (24-36%).
There is substantial variation between cultures in the predictive power of the models trying to predict the same empathy dimension. However, when comparing this variation across the empathy dimensions, there is no uniform pattern (i.e., there is no culture in which personality generally explains more or less variation in empathy than in another culture). Here, we once again must keep in mind the differences in sample sizes and their impact on power. From a content perspective, the "classical" empathy dimensions (EQ, IRI perspective taking, empathic concern) seem primarily related to agreeableness (and in addition conscientiousness in case of the EQ), while fantasy is mainly associated with openness and personal distress mainly with neuroticism. Despite differences in the variance explained, the order of predictors is remarkably consistent across samples.
Table note: Coefficients were also calculated for the female and the male subsample (compare results). There were no significant gender differences in any of the correlations. A significance threshold of p = 0.01 was defined to account for multiple testing of the five personality dimensions.
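The pairwise sensitivity values for Cohen's q reported above can be approximated with a few lines of code. This is a sketch assuming the usual normal-theory formula for the difference between two Fisher-transformed correlations; under that assumption it agrees with the reported values to within about 0.001. The function name `min_detectable_q` is hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def min_detectable_q(n1: int, n2: int, alpha: float = 0.01, power: float = 0.80) -> float:
    """Smallest difference in Fisher-transformed correlations (Cohen's q)
    detectable between two independent samples at two-sided alpha and given power."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = normal.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))


# Sample sizes as reported in the Participants section.
samples = {"China": 438, "Germany": 304, "Spain": 62, "USA": 92}
for (name_a, n_a), (name_b, n_b) in combinations(samples.items(), 2):
    print(f"{name_a}/{name_b}: q = {min_detectable_q(n_a, n_b):.3f}")
```

The output makes the point in the text concrete: only fairly large differences in correlation could be detected for any pairing that involves the Spanish or US sample.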

DISCUSSION
In this study, we investigated the relationship between the Big Five model of personality and several commonly used measures of empathic processing across samples from four different cultures. Based on previous studies, we hypothesized positive correlations between central measures of empathy and agreeableness, conscientiousness, openness, and extraversion. For the culture-dependent analyses, we expected no major differences between cultures. The observed associations between empathy and personality for the total sample match our expectations. In case of the EQ, but also in case of IRI perspective taking and empathic concern, agreeableness turned out to be by far the most important predictor, explaining between 11 and 18% of variance in empathy responding. For the EQ, conscientiousness was also important (8% of variance explained), while in case of the two IRI subscales no other personality dimension explained substantial additional amounts of variance. Therefore, our results suggest medium-sized associations between empathy and agreeableness, indicating a substantial role of personality, in contrast to Wakabayashi and Kawashima (2015), but not a complete overlap of both constructs, in contrast to Nettle (2007). In addition, we did not observe differences in the prediction of cognitive versus affective empathy, indicating that both are associated with the same personality dimensions.
It is plausible that agreeableness turned out to be the best predictor of empathy, because it is primarily a dimension of interpersonal behavior and represents the quality of social interaction (Costa et al., 2001). Furthermore, agreeableness can predict prosocial as well as aggressive behavior (Graziano and Eisenberg, 1997). Graziano et al. (2007) even offered a mechanism explaining the association. According to them, humans low in agreeableness do not report less empathy because they lack empathic affect or prosocial motivation, but because they lack skills in shifting the focus of these reactions to others.
From a cross-cultural perspective, research has shown differences in agreeableness between cultures: for example, East Asian cultures have been shown to score lowest in agreeableness (see below for a more detailed discussion), and Southern Europeans have shown higher scores in agreeableness than Eastern and Western Europeans (Schmitt et al., 2007). To our knowledge, there is no comparably detailed study dealing with cultural differences in responding to empathy questionnaires. Therefore, future research with a large number of cultures/countries will need to show whether the predictive value of agreeableness for empathy on an individual level is also reflected in differences in empathy scores (i.e., lower empathy scores in East Asian samples, too). In our own data, such a difference is already hinted at in the mean score comparison: both significant differences in empathy involve lower scores in China compared to the other samples. Next to agreeableness, conscientiousness is the second personality dimension with a larger predictive value for empathy. Here, the association may be explained by the negative relation of conscientiousness to psychoticism (McCrae and Costa, 1985), which is defined by a lack of empathy (Richendoller and Weaver, 1994). This relationship between psychoticism and empathy in turn can be explained by the lack of concern for others as well as egocentricity, which are both among the significant features of psychoticism (Richendoller and Weaver, 1994).
Table note: Index numbers represent significant differences in correlation; values with the same index number (per field) differ significantly from each other. Coefficients were also calculated for the female and the male subsamples (compare results). A significance threshold of p = 0.01 was defined to account for multiple testing of the five personality dimensions.
Concerning the other subscales of the IRI, we found neuroticism to be important to explain personal distress and openness to be important to predict fantasy. These results fit in with the IRI's theoretical background: For both subscales, it has been a matter of discussion whether they are facets of empathy or not (fantasy lacks the personal social interaction, personal distress has been defined as a distinct affective state apart from empathy or as a reaction to a specific empathic makeup, see also Melchers et al., 2015). Therefore, it is plausible that these subscales' relation to the Big Five differs from the other subscales. For fantasy, a relation to openness is the most plausible, because feeling empathy in a fictional context presupposes openness to the respective fictional world of a book, a theater play etc. Furthermore, openness includes creativity, which also plays an important role in generation and sensitivity to fictional environments. For personal distress, neuroticism fits perfectly as a candidate for an association, because personal distress measures the attitude toward and feelings evoked by negative social interactions. These feelings, in turn, should be strongly influenced by the fundamental attitude to social interactions, which again is strongly related to neuroticism.
Results concerning differences in empathy responses and personality depending on country/culture do not offer a clear cultural trend. Therefore, we refrain from an extensive discussion of these differences, also because they overall suggest cultural similarity as most of the observed differences are small. But what should still be considered is the following: three out of four subscales of the IRI questionnaire show no cultural dependence and even in case of empathic concern the differences are small. Therefore, we expect that the questionnaire can measure cognitive and affective empathy regardless of the culture under investigation.
Analyses concerning the association between the empathy measures and the Big Five indicate homogeneous associations across cultures. Overall, only three correlations with significant differences between cultures were found. These data suggest no cultural dependence of the association between empathy and personality. A more nuanced picture emerges when taking into account the results of the regressions, which consider multicollinearity. Here, we see differences in the predictive power of the Big Five for empathy. However, once again there are no clear associations with a specific culture; rather, independent patterns are detectable for each subscale.
Overall, the following picture emerges: We found no big differences between the investigated samples, neither for mean responses (which need to be interpreted very carefully) nor for the relationship between empathy and the Big Five. The differences we found do not follow a uniform pattern related to a putative cultural continuum. Agreeableness and conscientiousness are the most important personality factors to explain empathy responding, openness predicts fantasy and neuroticism personal distress. We found that the Big Five can explain up to 46% of variance in empathy questionnaire responses.
Our results may have important implications for our understanding of empathy as well as for intercultural interaction. From an evolutionary perspective, empathy originally developed due to its proximal fitness advantages: it allows non-conflicting or conflict-reduced cooperation of individuals in groups by synchronizing or explaining affective and cognitive states, which, for example, brings significant advantages when hunting or defending one's own group, but also reduces intragroup conflicts (de Waal, 2008). This development is most likely associated with the increased investment in one's own offspring and the associated need for synchronization of affective but also cognitive states between parents and offspring (MacLean, 1985). The need to develop such skills is also reflected in their gradual emergence, both ontogenetically and phylogenetically (de Waal, 2008; Walter, 2012). An interesting question is how cultural differences, which developed later, may have changed the function or the usefulness of empathy for successful social interaction. Here, one should consider the classic dimensions that have been proposed for the classification of cultures in the literature. Through a large number of cross-cultural studies, Hofstede (1979, 1980, 1982, 1983) was able to identify a total of four dimensions [Power Distance, Uncertainty Avoidance, Individualism vs. Collectivism (InCo), and Masculinity vs. Femininity (MaFe)], according to which cultures can differ from each other. Power Distance has been defined as "the extent to which the less powerful members of institutions and organizations accept that power is distributed unequally" (Hofstede and Bond, 1984). Here, one could expect a negative relationship between empathy and high power distance, because the more social interactions or their sequences depend on power status, the less empathy is needed to synchronize.
Functionally, empathy could be an ability that especially helps individuals low in power to adapt to the demands of those higher in power. Uncertainty Avoidance is "the extent to which people feel threatened by ambiguous situations, and have created beliefs and institutions that try to avoid these" (Hofstede and Bond, 1984). For this dimension, one could expect a positive relation to empathy, because empathy informs a person of the goals and emotions of an interaction partner, thereby reducing ambiguity and creating positive interaction conditions. In other words, the more uncertainty avoidant a culture is, the more important empathy (or empathic abilities) should be for it. InCo has been defined as "a situation in which people are supposed to look after themselves and their immediate family only" versus "a situation in which people belong to in-groups or collectivities which are supposed to look after them in exchange for loyalty" (Hofstede and Bond, 1984). Here, expectations could go in both directions: on the one hand, one could expect higher empathy in collectivistic cultures because empathy could serve to adapt individuals to their in-group, which is much more important in a collectivistic culture. On the other hand, in an individualistic culture there should in general be more variance in behavior than in a more norm-driven, collectivistic culture, which makes a certain amount of adaptation necessary to allow for successful social interaction. Empirical research has given more support to the former hypothesis (e.g., Realo and Luik, 2002; Duan et al., 2008). Studies demonstrating lower self-reported empathy in collectivistic than in individualistic cultures (which technically support the latter hypothesis), on the other hand, are not easy to interpret, because these mean differences might also be driven by differences in response patterns (see below).
Finally, MaFe has been described as "a situation in which the dominant values in society are success, money, and things" versus "a situation in which the dominant values in a society are caring for others and the quality of life" (Hofstede and Bond, 1984). Here, the expected association is rather clear, as the definition of femininity contains a facet of empathy (caring for others). Therefore, one would expect positive correlations of empathy with femininity and negative associations with masculinity. On the other hand, feminine cultures could report more empathy because it is an important value, even though there are no differences in empathic abilities compared to more masculine cultures. A similar difference between self-report and behavior/ability has been reported for general gender differences in empathy (Derntl et al., 2010).
Knowledge of differences that depend on cultural properties may be very helpful for our understanding of empathy as well as for practical applications. For the former, the illustration above shows that definitions and functions of empathy may vary depending on the cultural context. This may have implications for, among other things, the choice of empathy measures or the interpretation of their results, for example the above-mentioned differences depending on gender. For the latter, knowledge of culture-specific empathy is important, for example, for the design of negotiations (e.g., Buttery and Leung, 1998), the design of customer-service provider relationships (e.g., Dash et al., 2009) or the training of psychotherapists in cross-cultural empathy (e.g., Dyche and Zayas, 2001).
When interpreting our results, we have to keep in mind that there are some limitations to our findings. One of them concerns the sizes of the different samples, which range from 62 to 438 participants. This difference emerged due to differing willingness to participate in the countries under investigation. A consequence is varying power for detecting associations between personality and empathy. This can also be derived from the sensitivity analyses: while we could detect correlations slightly above 0.2 in the Chinese and German samples, the US-American sample requires a minimum correlation of 0.34 and the Spanish sample an even larger one (0.4). Especially in the case of these latter two samples, we therefore might not have detected the impact of the less strongly associated personality variables. For example, most of the small associations with extraversion were found in the Chinese sample, which is the largest sample. On the other hand, one could argue that this shortcoming is not a big problem for our study, because such small associations do not play an important role in predicting empathy from personality. For example, the largest correlation between extraversion and the IRI (empathic concern, in the Chinese sample) explains less than 6% of the variance. After correction for gender, age, and multicollinearity, this effect is even smaller (4.4%).
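The link between sample size and the smallest detectable correlation can be illustrated with a minimal sketch based on the Fisher z approximation. The significance level (0.05, two-sided) and power (0.80) used here are assumptions chosen for illustration only; they are not taken from our sensitivity analyses, so the resulting values need not match the thresholds reported above. The function names `min_detectable_r` and `variance_explained` are hypothetical helpers, and 62 and 438 are simply the smallest and largest sample sizes mentioned above.

```python
import math
from statistics import NormalDist


def min_detectable_r(n, alpha=0.05, power=0.80):
    """Approximate smallest |r| detectable with a two-sided test.

    Uses the Fisher z approximation. alpha and power are illustrative
    assumptions, not the settings of the original sensitivity analyses.
    """
    if n <= 3:
        raise ValueError("n must be larger than 3")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    # Required effect on the Fisher z scale, transformed back to r.
    return math.tanh((z_alpha + z_power) / math.sqrt(n - 3))


def variance_explained(r):
    """Share of variance explained by a correlation r (r squared)."""
    return r ** 2


if __name__ == "__main__":
    for n in (62, 438):  # smallest and largest sample sizes mentioned in the text
        print(n, round(min_detectable_r(n), 3))
    # A correlation just below 0.245 explains less than 6% of the variance.
    print(round(variance_explained(0.24), 4))
```

Under these assumed settings, the smaller sample can only detect substantially larger correlations than the larger one, which is the pattern described above; the squared correlation shows why an association explaining less than 6% of the variance corresponds to a correlation below roughly 0.245.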
A second, related limitation concerns the gender distribution in the samples. Whilst the Chinese sample contains more male than female participants, the other samples comprise more females than males. For the analyses of mean questionnaire responses, we therefore controlled for age and added gender as a second between-subject factor in addition to culture. We do not think that these differences distorted the results concerning the relationship between personality and empathy, because we found no significant gender-related differences in the correlations of empathy and the Big Five (see Results). In addition, our data were collected in a university setting, which restricts the range of age and educational level.
Another aspect that should be kept in mind in cross-cultural research is the comparability of questionnaire responses. Do we really measure the same agreeableness in China and in Germany? Does the IRI measure the same empathy traits in Spain as in the USA? As in our own data, previous research has shown mean differences in questionnaire responses between cultures (e.g., Allik and McCrae, 2004; Schmitt et al., 2007). These studies even suggested patterns in participants' responses that could be related to the respective geographic region, thereby creating a kind of map of personality. The interpretation of such differences is difficult: do they represent trait differences, are they caused by differences in response style or social desirability, or do they mirror different evaluations of the trait under investigation? And what is the origin of such differences: do they have a cultural basis, a genetic basis, or both? For example, Schmitt et al. (2007) found that East Asian participants score lower than the other groups of participants on all Big Five dimensions except neuroticism (on which they score higher than all other participants). The uniformity of this deviation might indicate a response pattern rather than a difference in personality. Allik and McCrae (2004) found smaller standard deviations in Asian and Black African cultures compared to European cultures. This again might be indicative of differences in response patterns, which may affect the test score. From a content perspective, the Big Five dimensions might also serve different functions in different cultures. For example, the openness factor has not consistently been replicated in China (Cheung et al., 2001), which has been labeled a collectivistic country (Hofstede, 2001).
Openness has been shown to be differentially valued across cultures, and one reason for that might be differences in sociability between collectivistic and individualistic countries (Triandis, 1995; Ward et al., 2004). Another, quite simple, argument could be that openness serves less of a function the more uniformly an individual is expected to behave according to the given norms of his or her culture. A second example of a functional difference is the factor agreeableness. Here, relations to the concept of idiocentrism (big differences between self and others) versus allocentrism (big differences between own group and other groups) have been shown (Triandis, 1989), and allocentric individuals/cultures tend to be more agreeable than idiocentric individuals/cultures (Realo et al., 1997). Once again, this might be explained by differences in the function of agreeableness. For example, agreeableness should be more important for sustaining the homogeneity and harmony of a group in an allocentric context than in an idiocentric context.
As with personality, cultural differences in questionnaire responses on empathy can also be interpreted in different ways. Although there are not many studies in this field, some differences have already been shown. For example, Trommsdorff (1995) reported more personal distress and less empathic concern in Japanese (East Asian) children compared to those from Germany (Western culture). Cassels et al. (2010) compared empathy (measured by the IRI) in university undergraduates with an Asian background to those with a Caucasian or a bicultural background. They found greater personal distress and less empathic concern in East Asian participants compared to their Caucasian counterparts, and the scores of the bicultural participants fell in between. Again, these differences might be the result of different functions of empathy facets depending on culture. One could speculate, for example, whether personal distress evoked by negative social interaction serves a more important function in sustaining harmony and social peace in Asians than in Westerners/Caucasians, while empathic concern in turn serves the more important function for the same goals in Westerners/Caucasians compared to Asians.
For future studies, it would be useful to collect samples from other countries that include an even more representative assortment of participants and that represent the four descriptive dimensions of cultural differences proposed by Hofstede. Furthermore, it would make sense to consider other measures of empathy and their interaction with the cultures under investigation, as in our case the results seem to depend in part on the measure used.