# GENDER ROLES IN THE FUTURE? THEORETICAL FOUNDATIONS AND FUTURE RESEARCH DIRECTIONS

EDITED BY: Alice H. Eagly and Sabine Sczesny

PUBLISHED IN: Frontiers in Psychology

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714

ISBN 978-2-88963-140-7

DOI 10.3389/978-2-88963-140-7

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

## Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# GENDER ROLES IN THE FUTURE? THEORETICAL FOUNDATIONS AND FUTURE RESEARCH DIRECTIONS

Topic Editors:

Alice H. Eagly, Northwestern University, United States

Sabine Sczesny, University of Bern, Switzerland

Image: Alice H. Eagly

The study of gender is deservedly a major focus of research in the discipline of psychology in general and social psychology in particular. Interest in the topic increased sharply in the 1970s with the flowering of the feminist movement, and research has continued to advance since that time. In 1987, Alice Eagly formulated Social Role Theory to explain the behavior of women and men as well as the stereotypes, attitudes, and ideologies that are relevant to sex and gender. Enhanced by several extensions over the intervening years, this theory became one of the pre-eminent, if not the central, theories of gender in social psychology. Also, over the last decades, social psychologists have developed a variety of related approaches to understanding gender, including, for instance, theories devoted to stereotyping, leadership, status, backlash, lack of fit to occupational roles, social identity, and categorization. Reflecting these elements, this e-Book includes articles that encompass a wide range of themes pertaining to sex and gender. In these papers, the concept of social roles often appears as a central integrative concept that links individuals with their social environment. These articles thereby complement social role theory as the authors reach out to build an extended theoretical foundation for gender research of the future.

Citation: Eagly, A. H., Sczesny, S., eds. (2019). Gender Roles in the Future? Theoretical Foundations and Future Research Directions. Lausanne: Frontiers Media. doi: 10.3389/978-2-88963-140-7

# Table of Contents

*Editorial: Gender Roles in the Future? Theoretical Foundations and Future Research Directions*

Alice H. Eagly and Sabine Sczesny


Maria Olsson and Sarah E. Martiny

*Unnecessary Frills: Communality as a Nice (But Expendable) Trait in Leaders*

Andrea C. Vial and Jaime L. Napier


Claire R. Gravelin, Monica Biernat and Caroline E. Bucher

*Religiosity, Religious Fundamentalism, and Ambivalent Sexism Toward Girls and Women Among Adolescents and Young Adults Living in Germany*

Bettina Hannover, John Gubernath, Martin Schultze and Lysann Zander


# Editorial: Gender Roles in the Future? Theoretical Foundations and Future Research Directions

Alice H. Eagly <sup>1</sup> \* and Sabine Sczesny <sup>2</sup>

*<sup>1</sup> Department of Psychology, Northwestern University, Evanston, IL, United States, <sup>2</sup> Department of Psychology, University of Bern, Bern, Switzerland*

Keywords: gender prejudice, social role theory, communion, agency, gender stereotypes, gender roles

#### **Editorial on the Research Topic**

#### **Gender Roles in the Future? Theoretical Foundations and Future Research Directions**

The study of gender has become a major focus of research in psychology and in social psychology in particular. Among early contributors to this study, Eagly (1987) formulated social role theory to explain the behavior of women and men as well as the stereotypes, attitudes, and ideologies that are relevant to sex and gender. Enhanced by several extensions over the intervening years, this theory became a pre-eminent theory of gender in social psychology (Eagly and Wood, 2012). Also, over the last decades, social psychologists have developed a variety of related approaches to understanding gender, including, for instance, theories devoted to stereotype threat, status, backlash, lack of fit to occupational roles, social identity, and categorization. The conference that preceded this Research Topic, sponsored by the European Association of Social Psychology and the Society for Personality and Social Psychology, featured work that fit within the broad umbrella of social role theory and related approaches.

The contemporary interest in the psychology of gender reflects its centrality in the understanding of social behavior. Gender continues to be a driving force in world politics and economics, as evident in the struggles of women to attain parity in political and economic institutions, the transformative impact of the #me-too movement, and the falling birthrates in many nations as women opt for careers instead of large families. In addition, binary gender itself is facing challenge as the two primary sex categories of female and male yield to accommodate multiple gender and sexual identities, including non-binary identities and transgender status.

One of the central topics of the social psychology of gender is gender stereotypes, understood as consensual beliefs about the attributes of women and men. Although describing the content of gender stereotypes might seem to be a task already accomplished many decades ago (e.g., Broverman et al., 1972), research on this matter has continually expanded. Not only has recent research described change in gender stereotypes over time (Eagly et al., 2019), but this Research Topic also includes the Hentschel et al. article that identifies facets underlying these stereotypes' two primary dimensions of agency and communion. Their analysis of agency reveals the facets of independence, instrumental competence, and leadership competence, and their analysis of communion yields the facets of concern for others, sociability, and emotional sensitivity. Other advances in stereotype research consider intersectionalities between gender and other social attributes as well as the prescriptive aspect of gender stereotypes by which they define what members of each sex should and should not do. Illustrating these advances, Koenig's research explores prescriptive stereotypes for the intersections of gender with age from toddlerhood to old age. Among her findings is a weakening of these gender stereotypes in relation to elderly women and men.

#### Edited and reviewed by:

*Eva G. Krumhuber, University College London, United Kingdom*

> \*Correspondence: *Alice H. Eagly eagly@northwestern.edu*

#### Specialty section:

*This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology*

Received: *07 August 2019* Accepted: *09 August 2019* Published: *04 September 2019*

#### Citation:

*Eagly AH and Sczesny S (2019) Editorial: Gender Roles in the Future? Theoretical Foundations and Future Research Directions. Front. Psychol. 10:1965. doi: 10.3389/fpsyg.2019.01965*

Gender stereotypes exert influence in daily life even when they compete with the influences of other social roles. In particular, occupational roles have demands that may be more or less consistent with gender roles. In extending social role theory to account for such circumstances, Eagly and Karau (2002) argued that the female gender stereotype is generally inconsistent with leader roles because of the expectations that women are communal and that leaders, like men, are agentic. Consequently, women can suffer discrimination in relation to leadership roles because many people believe that they are insufficiently agentic to perform effectively as leaders. Manzi raises the issue of whether parallel discriminatory processes exist for men who occupy or seek to occupy roles with primarily communal demands. The article by Block et al. further addresses men's occupancy of communal roles by analyzing the low representation of men in healthcare, early education, and domestic (HEED) roles. Their research shows that, consistent with gender stereotypes, men tend to have agentic values that focus on status, competition, and wealth and thus are not attracted to careers with a focus on caring for others. However, as Van Grootel et al. demonstrate, men tend to underestimate the extent to which other men approve of men's communal traits and behaviors. Correction of this pluralistic ignorance fosters men's greater endorsement of communal values and support for progressive gender-related social change. In a different demonstration of how to reduce the power of existing gender stereotypes, Olsson and Martiny review research on exposure to counterstereotypical role models. They conclude that such exposures do hold promise for promoting counterstereotypical goals and aspirations, especially in girls and women.

For leadership, gender makes a difference, given the definition of leadership primarily in culturally masculine terms that disfavor women. Vial and Napier offer clever demonstrations that people do view agentic traits as more important than communal traits for successful leaders, thus confirming women's disadvantage for attaining leader roles. Communal traits appear to be a nice, but inessential add-on for leaders. Another disadvantage for women, as shown by Player et al., is that male candidates for leadership are valued more highly for their perceived potential to be a good leader rather than their past performance. Female candidates, in contrast, are valued more for their past performance and given relatively little credit for their potential. Consistent with the female stereotype of low agency, women thus have the burden of proving their leadership competence rather than merely being trusted to have potential for the future. As shown by Gruber et al., some women do emerge as leaders, and greater facial attractiveness facilitates their emergence by fostering the ascription of social competence to them. These researchers have yet to investigate the importance of facial attractiveness to male leaders.

Increasing gender diversity in organizations is surely an important social goal for advocates of gender equality. Yet, organizational processes are not so simple that merely adding women catalyzes gains for other women. In fact, women in leadership roles do not necessarily work to change organizational norms to ensure equal opportunity for other women, as Sterk et al. argue. Instead, senior women may accept negative stereotypes about women's lesser capacity for leadership. Such "queen bee" senior women may distance themselves from junior women and thus exert negative effects on them. Moreover, as van Dijk and van Engen explain, despite the presence of gender-diverse work groups, organizational behaviors are often constrained by self-reinforcing gender role expectations that perpetuate traditional gender-unfair practices.

Gender stereotypes exert influence in other situations as well. One such setting is high-stakes aptitude tests whose outcomes affect the opportunities of women and men. As shown by the Leiner et al. research on Austrian medical school aptitude tests, there are intriguing sex differences in the ways that female and male test takers perceive the test situation. In particular, the women experienced greater test anxiety than men and perceived the test as less fair. Another realm of social behavior that is fraught with gender issues is sexual coercion and rape. Gravelin et al. provide a thorough review of what is now a large research literature on tendencies to blame the victim of acquaintance rape. Also related to sexual violence is an incident in Germany of mass sexual assault on New Year's Eve of 2015. The discourse that ensued receives careful analysis by Hannover et al. One question that Germans faced is whether the largely Muslim perpetrators of these assaults were motivated by particularly sexist attitudes toward girls and women that emanated from their religion. The findings of this research instead implicated, not a particular religion, but high levels of religiosity and fundamentalism as precursors of the sexist beliefs that fostered violence against women.

In a world in which gender is always in flux, the future of gender relations is uncertain. To help understand this future, Gustafsson Sendén et al. asked Swedes to indicate what they think that the traits of Swedish women and men were in the past, are in the present, and will be in the future. Replicating earlier research by Diekman and Eagly (2000), respondents perceived women to increase in agentic traits over time but remain more communal than men. Such beliefs, derived from the abstract belief that gender equality is increasing, may not reflect actual changes in stereotype content over time (Eagly et al., 2019).

The contemporary challenges to the binary view of sex, gender, and sexuality receive important exploration in the essay by Morgenroth and Ryan. They review earlier writing by the philosopher Judith Butler, who advocated "gender trouble" that would disrupt the binary view of gender. As these authors suggest, Butler's ideas can guide understanding of some of the ways that performance socially constructs gender in society. Butler's writings on performativity and related themes can provide intriguing hypotheses for systematic empirical exploration by social psychologists. In the meantime, other social psychologists argue that the way forward in gender theory entails exploring how gender is and is not socially constructed by producing research that also considers the biological grounding of some patterns of male and female behavior (Eagly and Wood, 2013). From this interactionist perspective, nature and nurture are intertwined in producing the phenomena of gender.

The articles included in this Research Topic are broadly positioned across the field of social psychology, which encompasses a wide range of themes pertaining to sex and gender. Some of these themes link social psychology to other areas of psychological specialization, such as personality, developmental, cultural, industrial-organizational, and biological psychology as well as to the other social science disciplines of sociology, political science, and economics. In invoking other disciplines and psychology subfields, many of the authors whose work appears in this Research Topic recognize the importance of social roles as a central integrative concept in theories of gender. These articles thereby complement social role theory by reaching out to build an extended theoretical foundation for gender research of the future.

# AUTHOR CONTRIBUTIONS

Both authors listed have made a substantial, direct and intellectual contribution to the work and approved it for publication.

# REFERENCES

(Thousand Oaks, CA: Sage Publications), 458–476. doi: 10.4135/9781446249222.n49

Eagly, A. H., and Wood, W. (2013). The nature–nurture debates: 25 years of challenges in understanding the psychology of gender. Perspect. Psychol. Sci. 8, 340–357. doi: 10.1177/1745691613484767

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Eagly and Sczesny. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Multiple Dimensions of Gender Stereotypes: A Current Look at Men's and Women's Characterizations of Others and Themselves

Tanja Hentschel<sup>1,2</sup> \*, Madeline E. Heilman<sup>3</sup> and Claudia V. Peus<sup>1</sup>

<sup>1</sup> TUM School of Management, Technische Universität München, Munich, Germany, <sup>2</sup> Amsterdam Business School, University of Amsterdam, Amsterdam, Netherlands, <sup>3</sup> Department of Psychology, New York University, New York, NY, United States

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Andrea Elisabeth Abele, University of Erlangen-Nürnberg, Germany

Elizabeth Haines, William Paterson University, United States

> \*Correspondence: Tanja Hentschel t.hentschel@uva.nl

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

> Received: 20 March 2018 Accepted: 04 January 2019 Published: 30 January 2019

#### Citation:

Hentschel T, Heilman ME and Peus CV (2019) The Multiple Dimensions of Gender Stereotypes: A Current Look at Men's and Women's Characterizations of Others and Themselves. Front. Psychol. 10:11. doi: 10.3389/fpsyg.2019.00011

We used a multi-dimensional framework to assess current stereotypes of men and women. Specifically, we sought to determine (1) how men and women are characterized by male and female raters, (2) how men and women characterize themselves, and (3) the degree of convergence between self-characterizations and characterizations of one's gender group. In an experimental study, 628 U.S. male and female raters described men, women, or themselves on scales representing multiple dimensions of the two defining features of gender stereotypes, agency and communality: assertiveness, independence, instrumental competence, leadership competence (agency dimensions), and concern for others, sociability and emotional sensitivity (communality dimensions). Results indicated that stereotypes about communality persist and were equally prevalent for male and female raters, but agency characterizations were more complex. Male raters generally described women as being less agentic than men and as less agentic than female raters described them. However, female raters differentiated among agency dimensions and described women as less assertive than men but as equally independent and leadership competent. Both male and female raters rated men and women equally high on instrumental competence. Gender stereotypes were also evident in self-characterizations, with female raters rating themselves as less agentic than male raters and male raters rating themselves as less communal than female raters, although there were exceptions (no differences in instrumental competence, independence, and sociability self-ratings for men and women). Comparisons of self-ratings and ratings of men and women in general indicated that women tended to characterize themselves in more stereotypic terms – as less assertive and less competent in leadership – than they characterized others in their gender group. Men, in contrast, characterized themselves in less stereotypic terms – as more communal. Overall, our results show that a focus on facets of agency and communality can provide deeper insights about stereotype content than a focus on overall agency and communality.

Keywords: gender stereotypes, self-stereotyping, communality, communion, agency, men, women, gender identity

# INTRODUCTION


There is no question that a great deal of progress has been made toward gender equality, and this progress is particularly evident in the workplace. There also is no question that the goal of full gender equality has not yet been achieved – not in pay (AAUW, 2016) or position level (Catalyst, 2016). In a recent interview study with female managers the majority of barriers for women's advancement that were identified were consequences of gender stereotypes (Peus et al., 2015). There is a long history of research in psychology that corroborates this finding (for reviews see Eagly and Sczesny, 2009; Heilman, 2012). These investigations support the idea that gender stereotypes can be impediments to women's career advancement, promoting both gender bias in employment decisions and women's self-limiting behavior (Heilman, 1983).

This study is designed to investigate the current state of gender stereotypes about men and women using a multi-dimensional framework. Much of the original research on the content of gender stereotypes was conducted several decades ago (e.g., Rosenkrantz et al., 1968), and more recent research findings are inconsistent, some suggesting that there has been a change in traditional gender stereotypes (e.g., Duehr and Bono, 2006) and others suggesting there has not (e.g., Haines et al., 2016). Measures of stereotyping in these studies tend to differ, all operationalizing the constructs of agency and communality, the two defining features of gender stereotypes (Abele et al., 2008), but in different ways. We propose that the conflict in findings may derive in part from the focus on different facets of these constructs in different studies. Thus, we seek to obtain a more complete picture of the specific content of today's gender stereotypes by treating agency and communality as multidimensional constructs.

Gender stereotypes often are internalized by men and women, and we therefore focus both on how men and women are seen by others and how they see themselves with respect to stereotyped attributes. We also plan to compare and contrast characterizations of men or women as a group with characterizations of self, something not typically possible because these two types of characterizations are rarely measured in the same study. In sum, we have multiple objectives: We aim to develop a multidimensional framework for assessing current conceptions of men's and women's characteristics and then use it to consider how men and women are seen by male and female others, how men and women see themselves, and how these perceptions of self and others in their gender group coincide or differ. In doing so, we hope to demonstrate the benefits of viewing agency and communality as multidimensional constructs in the study of gender stereotypes.

# Gender Stereotypes

Gender stereotypes are generalizations about what men and women are like, and there typically is a great deal of consensus about them. According to social role theory, gender stereotypes derive from the discrepant distribution of men and women into social roles both in the home and at work (Eagly, 1987, 1997; Koenig and Eagly, 2014). There has long been a gendered division of labor, and it has existed both in foraging societies and in more socioeconomically complex societies (Wood and Eagly, 2012). In the domestic sphere women have performed the majority of routine domestic work and played the major caretaker role. In the workplace, women have tended to be employed in people-oriented, service occupations rather than things-oriented, competitive occupations, which have traditionally been occupied by men (e.g., Lippa et al., 2014). This contrasting distribution of men and women into social roles, and the inferences it prompts about what women and men are like, give rise to gender stereotypical conceptions (Koenig and Eagly, 2014).

Accordingly, men are characterized as more agentic than women, taking charge and being in control, and women are characterized as more communal than men, being attuned to others and building relationships (e.g., Broverman et al., 1972; Eagly and Steffen, 1984). These two concepts were first introduced by Bakan (1966) as fundamental motivators of human behavior. During the last decades, agency (also referred to as "masculinity," "instrumentality" or "competence") and communality (also referred to as "communion," "femininity," "expressiveness," or "warmth") have consistently been the focus of research (e.g., Spence and Buckner, 2000; Fiske et al., 2007; Cuddy et al., 2008; Abele and Wojciszke, 2014). These dual tenets of social perception have been considered fundamental to gender stereotypes.

Stereotypes can serve an adaptive function allowing people to categorize and simplify what they observe and to make predictions about others (e.g., Devine and Sharp, 2009; Fiske and Taylor, 2013). However, stereotypes also can induce faulty assessments of people – i.e., assessments based on generalization from beliefs about a group that do not correspond to a person's unique qualities. These faulty assessments can negatively or positively affect expectations about performance, and bias consequent decisions that impact opportunities and work outcomes for both men and women (e.g., Heilman, 2012; Heilman et al., 2015; Hentschel et al., 2018). Stereotypes about gender are especially influential because gender is an aspect of a person that is readily noticed and remembered (Fiske et al., 1991). In other words, gender is a commonly occurring cue for stereotypic thinking (Blair and Banaji, 1996).

Gender stereotypes are used not only to characterize others but also to characterize oneself (Bem, 1974). The process of self-stereotyping can influence people's identities in stereotype-congruent directions. Stereotyped characteristics can thereby be internalized and become part of a person's gender identity – a critical aspect of the self-concept (Ruble and Martin, 1998; Wood and Eagly, 2015). Young boys and girls learn about gender stereotypes from their immediate environment and the media, and they learn how to behave in gender-appropriate ways (Deaux and LaFrance, 1998). These socialization experiences no doubt continue to exert influence later in life and, indeed, research has shown that men's and women's self-characterizations differ in ways that are stereotype-consistent (Bem, 1974; Spence and Buckner, 2000).

#### Measurement of Gender Stereotypes

Gender stereotypes, and their defining features of agency and communality, have been measured in a variety of ways (Kite et al., 2008). Researchers have investigated people's stereotypical assumptions about how men and women differ in terms of, for example, ascribed traits (e.g., Williams and Best, 1990), role behaviors (e.g., Haines et al., 2016), occupations (e.g., Deaux and Lewis, 1984), or emotions (e.g., Plant et al., 2000). Researchers also have distinguished personality, physical, and cognitive components of gender stereotypes (Diekman and Eagly, 2000). In addition, they have investigated how men's and women's self-characterizations differ in stereotype-consistent ways (Spence and Buckner, 2000).

Today, the most common measures of gender stereotypes involve traits and attributes. Explicit measures of stereotyping entail responses to questionnaires asking for descriptions of men or women using Likert or bi-polar adjective scales (e.g., Kite et al., 2008; Haines et al., 2016), or asking for beliefs about the percentage of men and women possessing certain traits and attributes (e.g., McCauley and Stitt, 1978). Gender stereotypes have also been studied using implicit measures, using reaction time to measure associations between a gender group and a stereotyped trait or attribute (e.g., Greenwald and Banaji, 1995). Although implicit measures are used widely in some areas of research, our focus in the research reported here builds on the longstanding tradition of measuring gender stereotypes directly through the use of explicit measures.

# Contemporary Gender Stereotypes

Researchers often argue that stereotypes are tenacious; they tend to have a self-perpetuating quality that is sustained by cognitive distortion (Hilton and von Hippel, 1996; Heilman, 2012). However, stereotype maintenance is not only a product of the inflexibility of people's beliefs but also a consequence of the societal roles women and men enact (Eagly and Steffen, 1984; Koenig and Eagly, 2014). Therefore, the persistence of traditional gender stereotypes is fueled by skewed gender distribution into social roles. If there have been recent advances toward gender equality in workforce participation and the rigid representation of women and men in long-established gender roles has eased, then might the content of gender stereotypes have evolved to reflect this change?

The answer to this question is not straightforward; the degree to which there has been a true shift in social roles is unclear. On the one hand, there are more women in the workforce than ever before. In 1967, 36% of U.S. households with married couples were made up of a male provider working outside the home and a female caregiver working inside the home, but now only 19% of U.S. households conform to this division (Bureau of Labor Statistics, 2017). Moreover, women increasingly pursue traditionally male careers, and there are more women in roles of power and authority. For example, today women hold almost 40% of management positions in the United States (Bureau of Labor Statistics, 2017). In addition, more men are taking on a family's main caretaker role (Ladge et al., 2015). Though families with only the mother working are still rare (5% in 2016 compared to 2% in 1970), the average number of hours fathers spent on child care per week increased from 2.5 to 8 h in the last 40 years (Pew Research Center, 2018). In addition, the majority of fathers perceive parenting as extremely important to their identity (Pew Research Center, 2018).

On the other hand, role segregation, while somewhat abated, has by no means been eliminated. Despite their increased numbers in the labor force, women still are concentrated in occupations that are perceived to require communal, but not agentic attributes. For example, the three most common occupations for women in the U.S. involve care for others (elementary and middle school teacher, registered nurse, and secretary and administrative assistant; U.S. Department of Labor, 2015), while men more than women tend to work in occupations requiring agentic attributes (e.g., senior management positions, construction, or engineering; Bureau of Labor Statistics, 2016b). Sociological research shows that women are underrepresented in occupations that are highly competitive, inflexible, and require high levels of physical skill, while they are overrepresented in occupations that place emphasis on social contributions and require interpersonal skills (Cortes and Pan, 2017). Moreover, though men's home and family responsibilities have increased, women continue to perform a disproportionate amount of domestic work (Bureau of Labor Statistics, 2016a), have greater childcare responsibilities (Craig and Mullan, 2010; Kan et al., 2011), and continue to be expected to do so (Park et al., 2008).

Thus, there is reason both to expect traditional gender stereotypes to dominate current conceptions of women and men, and to expect them to not. Relevant research findings are conflicting. For example, a large investigation found that over time managers have come to perceive women as more agentic (Duehr and Bono, 2006). However, other investigations have found gender stereotypes to have changed little over time (Heilman et al., 1989) or even to have intensified (Lueptow et al., 2001). A recent study replicating work done more than 30 years ago found minimal change, with men and women still described very differently from one another and in line with traditional stereotyped conceptions (Haines et al., 2016).

There also have been conflicting findings concerning self-characterizations, especially in women's self-views of their agency. Findings by Abele (2003) suggest that self-perceived agency increases with career success. Indeed, there has been indication that women's self-perceived deficit in agency has abated over time (Twenge, 1997) or that it has abated in some respects but not others (Spence and Buckner, 2000). However, a recent meta-analysis has found that whereas women's self-perceptions of communality have decreased over time, their self-perceptions of agency have remained stable since the 1990s (Donnelly and Twenge, 2017). Yet another study found almost no change in men's and women's self-characterizations of their agency and communality since the 1970s (Powell and Butterfield, 2015).

There are many possible explanations for these conflicting results. A compelling one concerns the conceptualization of the agency and communality constructs and the resulting difference in the traits and behaviors used to measure them. In much of the gender stereotypes literature, agency and communality have been loosely used to denote a set of varied attributes, and different studies have operationalized agency and communality in different ways. We propose that agency and communality are not unitary constructs but rather are composed of multiple dimensions, each distinguishable from one another. We also propose that considering these dimensions separately will enhance the clarity of our understanding of current differences in the characterization of women and men, and provide a more definitive picture of gender stereotypes today.

# Dimensions of Communality and Agency

There has been great variety in how the agency construct has been operationalized, and the specific terms used to measure agency often differ from study to study (e.g., McAdams et al., 1996; Rudman and Glick, 2001; Abele et al., 2008; Schaumberg and Flynn, 2017). Furthermore, distinctions between elements of agency have been identified: In a number of studies competence has been shown to be distinct from agency as a separate factor (Carrier et al., 2014; Koenig and Eagly, 2014; Abele et al., 2016; Rosette et al., 2016), and in others, the agency construct has been subdivided into self-reliance and dominance (Schaumberg and Flynn, 2017). There also has been great variety in how the communality construct has been operationalized (Hoffman and Hurst, 1990; Fiske et al., 2007; Abele et al., 2008; Brosi et al., 2016; Hentschel et al., 2018). Although there have been few efforts to pinpoint specific components of communality, recent work focused on self-judgments in cross-cultural contexts has subdivided it into facets of warmth and morality (Abele et al., 2016).

The multiplicity of items used to represent agency and communality in research studies involving stereotyping is highly suggestive that agentic and communal content can be decomposed into different facets. In this research we seek to distinguish dimensions underlying both the agency and the communality constructs. Our aim is to lend further credence to the idea that the fundamental constructs of agency and communality are multifaceted, and to supply researchers with dimensions of each that may be useful for study of stereotype evaluation and change.

While we are proposing that agency and communality can be broken down into components, we are not claiming that the use of these overarching constructs in earlier research has been an error. In the vast majority of studies in which communality or agency has been measured the scale reliabilities have been high and the items highly correlated. However, internal consistency does not necessarily indicate that the individual items included are unidimensional (Schmitt, 1996; Sijtsma, 2008), or that the entirety of the construct is being captured in a particular measure. Moreover, there are multiple meanings included in these constructs as they have been discussed and operationalized in gender research. Therefore, we propose that breaking them down into separate dimensions will provide finer distinctions about contemporary characterizations of men and women.

# Perceiver Sex

Findings often demonstrate that male and female raters are equally likely to characterize women and men in stereotypic terms (Heilman, 2001, 2012). This suggests that stereotypes outweigh the effects of evaluators' gender identities and, because men and women live in the same world, they see the world similarly. However, the steady shift of women's societal roles and its different implications for men and women may affect the degree to which men and women adhere to traditional gender stereotypes.

On the face of it, one would expect women to hold traditional gender stereotypes less than men. The increase of women in the workforce generally, and particularly in domains typically reserved for men, is likely to be very salient to women. Such changes have distinct implications for them – implications that can impact their expectations, aspirations, and actual experiences. As a result, women may be more attentive than men to shifts in workplace and domestic roles, and more accepting of these roles as the new status quo. They consequently may be more amenable to incorporating updated gender roles into their understanding of the world, diminishing stereotypic beliefs.

Unlike women, who may be likely to embrace recent societal changes, men may be prone to reject or dismiss them. The same societal changes that present new opportunities for women can present threats to men, who may see themselves as losing their rightful place in the social order (see also Sidanius and Pratto, 1999; Knowles and Lowery, 2012). Thus, men may be less willing to accept modern-day changes in social roles or to see these changes as definitive. There may be little impetus for them to relinquish stereotypic beliefs and much impetus for them to retain these beliefs. If this is the case, then men would be expected to adhere more vigorously to traditional gender stereotypes than women.

# Self-Stereotyping Versus Stereotyping of One's Gender Group

Although gender stereotypes impact characterizations of both self and others, there may be a difference in the degree to which stereotypes dominate in self- and other-characterizations. That is, women may see themselves differently than they see women in general and men may see themselves differently than they see men in general; although they hold stereotypes about their gender groups, they may not apply them to themselves. Indeed, attribution theory (Jones and Nisbett, 1987), which suggests that people are more prone to attribute behavior to stable personality traits when viewing someone else than when viewing oneself, gives reason to argue that stereotypes are more likely to be used when characterizing others in one's gender group than when characterizing oneself. A similar case can be made for construal level theory (Trope and Liberman, 2010), which suggests that psychological distance promotes abstraction rather than attention to individuating information. Moreover, the impact of societal changes that affect adherence to gender stereotypes is apt to have greater immediacy and personal impact for self, and therefore be more reflected in self-characterizations than in characterizations of others.

Some studies have compared the use of stereotypes in characterizing self and others. In an early study (Rosenkrantz et al., 1968), each participating student was asked to rate men, women, and self on a number of characteristics. The researchers found that self-characterizations of men and women showed less evidence of stereotypes than characterizations of others. Similar results were found in studies on accuracy of stereotyping (Martin, 1987; Allen, 1995). Using instrumental (i.e., agentic) and expressive (i.e., communal) attributes from the BSRI and PAQ scales, Spence and Buckner (2000) found very little relation between stereotypes about others and self-characterizations.

There is reason to think that some dimensions of gender stereotypes are more likely than others to be differentially subscribed to when characterizing self than when characterizing others. For example, there is a tendency to boost self-esteem and adopt descriptors that are self-enhancing when describing oneself (Swann, 1990), and this may have consequences whether these descriptors are consistent or inconsistent with gender stereotypes. If this is so, gender may be an important factor; there are likely particular aspects of gender stereotypes that are more (or less) acceptable to women and men, affecting the degree to which they are reflected in men's and women's self-descriptions as compared to their description of their gender group. However, there also is reason to believe that individuals will embrace positive stereotypes and reject negative stereotypes as descriptive not only of themselves but also of their close in-groups (Biernat et al., 1996), suggesting that there will be little difference between characterizations of oneself and one's gender group. Therefore, to obtain a full picture of the current state of gender stereotypes and their impact on perceptions, we believe it important to compare self-characterizations and characterizations of one's gender group on specific dimensions of gender stereotypes.

# Overview of the Research

In this study, we develop a multidimensional framework for measuring different elements of agency and communality to provide an assessment of contemporary gender stereotypes and their impact on characterizations of others and self. Using the multidimensional framework, we sought to determine (1) if men and women differ in their gender stereotypes; (2) if men and women differ in their self-characterizations; and (3) if men's and women's self-characterizations differ from their characterizations of their gender groups. In each instance we compare the results using the traditional unidimensional framework for measuring agency and communality with the results using the newly formulated multidimensional framework.

# MATERIALS AND METHODS

# Participants

Six hundred and twenty-nine participants (61% female, all U.S. residents) were recruited online via Amazon Mechanical Turk (MTurk), providing a more representative sample of the U.S. population than student samples. MTurk samples tend to be slightly more diverse than and similarly reliable as other types of internet samples used in psychological research (Paolacci et al., 2010; Buhrmester et al., 2011), but nonetheless are convenience samples rather than true representative samples based on demographic data (see e.g., Pew Research Center, 2017). In our sample, ages ranged from 19 to 83, with a mean age of 34.5 years (SD = 13.1). In addition, education ranged from those who had not attended college (17%), had some college education (33%), had graduated from college (37%), to those who had graduate degrees (13%). 77.6% self-identified as White, 8.4% Asian, 7.0% African American, 4.8% Hispanic, and 2.2% other.<sup>1</sup> The survey link was visible only to U.S. residents who had a greater than 95% acceptance rate of previous MTurk work, an indication that their earlier work had been handled responsibly. In addition, we included a question asking participants to indicate whether they filled out the questionnaire honestly (we assured them that their answer on this question would not have any consequences for their payment). One person indicated that he had not filled out the survey honestly and was excluded from the analyses.

# Design

We conducted an experiment with two independent variables: rater gender (male or female) and target group (men in general, women in general, or self). The target group manipulation was randomly assigned to male and female raters. Subsets of this overall design were used to address our specific research questions.

# Procedure

Participants were told that we were interested in people perception, and they were asked to rate either men in general (N = 215), women in general (N = 208), or themselves (N = 205) on an attribute inventory representing various dimensions of agency and communality.<sup>2</sup> The attributes were presented in differing orders to participants, randomized by the survey tool we used. Ratings were made using a 7-point scale with responses ranging from 1 ("not at all") to 7 ("very much").

# Scale Construction

Using an inductive procedure, scale development proceeded in four steps. In the first step, we identified a set of 74 attributes, representative of how agency and communality have been measured by researchers in the past (consisting of adjectives, traits, and descriptors; see **Appendix Tables A, B** for the full list). The attributes were chosen from earlier investigations of gender stereotypes, including those of Broverman et al. (1972), Schein (1973), Spence and Helmreich (1978), Heilman et al. (1995), Fiske et al. (1999), Diekman and Eagly (2000), and Oswald and Lindstedt (2006). They were selected to represent a broad array of agentic and communal attributes with a minimal amount of redundancy.

In the second step, three judges (the first two authors and another independent researcher) sorted the descriptive attributes into categories based on their conceptual similarity. The total set of attributes measured was included in the sorting task, and there was no limit placed on the number of categories to be created and no requirements for the number of attributes to be included within each created category. Specifically, the instructions were to use as many categories as needed to sort the attributes into conceptually distinct groupings. The sorting results were then discussed by the judges and two additional researchers. During the discussion, agreement was reached about the number of categories necessary to best capture the distinct dimensions of the sorted attributes. Attributes for which no consensus was reached about category placement were omitted. Then decisions were made about how each of the categories should be labeled. Seven categories were identified, four of which represented dimensions of agency – instrumental competence, leadership competence, assertiveness, independence – and three of which represented dimensions of communality – concern for others, sociability, emotional sensitivity.

<sup>1</sup>The median age of the U.S. population is 37.9 years (United States Census Bureau, 2017c); Levels of education of the U.S. population 25 years and older in 2017: 39.2% did not attend college, 16.3% had some college, 31.6% had graduated college, 12.9% have graduate degrees (United States Census Bureau, 2017a); Race/ethnicity percentages in the general U.S. population are as follows: 60.7% White, 18.1% Hispanic, 13.4% African American, 5.8% Asian, 2% other (United States Census Bureau, 2017b).

<sup>2</sup>The attributes in the inventory included the communal and agentic attributes of interest as well as a group of attributes measuring other constructs that were included for exploratory purposes but not used in this study.

In the third step, we had a different set of three independent judges (all graduate students in a psychology program) do a sorting of the retained attributes into the labeled categories. This was done to make sure that their sorting conformed to the identified categories; items that were misclassified by any of the judges were eliminated from the item set.
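The elimination rule used in this third step can be sketched in Python as follows. This is only an illustration under stated assumptions: the attribute names, category assignments, and judge sortings are hypothetical placeholders, not the study's data, and the function name `retain_unanimous` is introduced here for convenience.

```python
# Hypothetical illustration of the step-three rule: an attribute is retained
# only if every judge sorted it into its intended (labeled) category.

def retain_unanimous(intended, judge_sortings):
    """Return the attributes that all judges placed in the intended category.

    intended: dict mapping attribute -> intended category label
    judge_sortings: list of dicts (one per judge), attribute -> chosen category
    """
    retained = []
    for attribute, category in intended.items():
        # A single misclassification by any judge eliminates the attribute.
        if all(sorting.get(attribute) == category for sorting in judge_sortings):
            retained.append(attribute)
    return retained


# Made-up example data (not taken from the study).
intended = {
    "attribute_a": "assertiveness",
    "attribute_b": "independence",
    "attribute_c": "sociability",
}
judges = [
    {"attribute_a": "assertiveness", "attribute_b": "independence", "attribute_c": "sociability"},
    {"attribute_a": "assertiveness", "attribute_b": "assertiveness", "attribute_c": "sociability"},
    {"attribute_a": "assertiveness", "attribute_b": "independence", "attribute_c": "sociability"},
]

kept = retain_unanimous(intended, judges)
print(kept)
```

In this made-up case the second judge places `attribute_b` in a different category, so that attribute would be dropped while the other two are kept.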

Finally, in a fourth step, we used confirmatory factor analysis procedures to further hone our categories. Following standard procedures on increasing model fit (e.g., Byrne, 2010), we eliminated all items that showed a low fit to the created categories. We later conducted a conclusive confirmatory factor analysis, for which the results are reported in the next section.

As a result of these steps, we created seven scales, each composed of the attributes remaining in one of the seven designated categories. The scales ranged from 3 to 4 items, the coefficient alphas all surpassed 0.75, and all corrected item-scale correlations surpassed 0.40 (Field, 2006). **Table 1** presents the attributes comprising each of the scales as well as the Cronbach alphas and corrected-item-scale correlations.
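For readers who want to see how the two reliability criteria mentioned above are typically computed, the following is a minimal Python sketch of coefficient alpha and the corrected item-scale correlation (each item correlated with the sum of the remaining items). The ratings are invented for illustration and are not the study's data; the function names are likewise our own.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Coefficient alpha; items is a list of k lists (one list of ratings per item)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # scale score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(items, index):
    """Correlate one item with the sum of all *other* items in the scale."""
    rest = [sum(scores) - scores[index] for scores in zip(*items)]
    return pearson(items[index], rest)


# Invented 7-point ratings from six respondents on a three-item scale.
scale = [
    [5, 6, 3, 4, 7, 2],
    [4, 6, 3, 5, 6, 2],
    [5, 7, 2, 4, 6, 3],
]

alpha = cronbach_alpha(scale)
item_correlations = [corrected_item_total(scale, i) for i in range(len(scale))]
print(alpha, item_correlations)
```

A scale would meet the criteria described above if `alpha` exceeds 0.75 and every value in `item_correlations` exceeds 0.40.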

The four scales composed of agentic attributes and denoting dimensions of agency were: instrumental competence, leadership competence, assertiveness, and independence. Thus, the sorting process not only distinguished between competence and other elements of agency (as has been suggested by others like Carrier et al., 2014), but further decomposed the non-competence elements of agency into dimensions of assertiveness and independence. Assertiveness concerns acting on the world and taking charge. Independence connotes self-reliance and acting on one's own, free of the influence of others. Furthermore, competence was subdivided into two separate dimensions – one focused on performance execution (instrumental competence), and the other focused on capability to perform as a leader (leadership competence). Both leadership competence and assertiveness imply high social power whereas instrumental competence and independence are not typically associated with power relations.

The three scales composed of communal attributes and denoting dimensions of communality were: concern for others, sociability, and emotional sensitivity. Concern for others and sociability both entail a focus on others, but the former involves a one-way relationship of giving and nurturance while the latter involves a transactional relationship focused on relationship building. Emotional sensitivity implies an orientation that focuses on feelings as an antecedent or consequence of interactions with others.

#### Confirmatory Factor Analysis

We conducted a confirmatory factor analysis using the R package lavaan (Rosseel, 2012) to test the factor structure of the four final agency scales and the three final communality scales. Results revealed that for agency, the theoretically assumed four-factor model (i.e., instrumental competence, leadership competence, assertiveness, and independence as first-order factors) provided adequate fit (χ<sup>2</sup> = 370.224, df = 84, p < 0.001, χ<sup>2</sup>/df = 4.41, CFI = 0.947, RMSEA = 0.076, SRMR = 0.045) and also was more suitable than a one-factor model in which all agency items loaded on a single factor (χ<sup>2</sup> = 813.318, df = 90, p < 0.001, χ<sup>2</sup>/df = 9.04, CFI = 0.866, RMSEA = 0.116, SRMR = 0.068). A comparison of the two models showed that the four-factor agency model differed significantly from the one-factor model and was thus preferable (Δχ<sup>2</sup> = 443.09, df = 6, p < 0.001). Similarly, for communality the theoretically posited three-factor model (i.e., concern for others, sociability, and emotional sensitivity as first-order factors) provided acceptable fit (χ<sup>2</sup> = 326.000, df = 41, p < 0.001, χ<sup>2</sup>/df = 7.95, CFI = 0.931, RMSEA = 0.108, SRMR = 0.048)<sup>3</sup> and was more suitable than the one-factor model in which all communality items loaded on a single factor (χ<sup>2</sup> = 359.803, df = 44, p < 0.001, χ<sup>2</sup>/df = 8.18, CFI = 0.924, RMSEA = 0.110, SRMR = 0.048). A comparison of the two models showed that the three-factor communality model differed significantly from the one-factor model and was therefore preferable (Δχ<sup>2</sup> = 33.80, df = 3, p < 0.001). Overall, these results indicated that even though there were high correlations among the agency scales and also among the communality scales (as we would expect given our idea that in each case the multiple scales are part of the same construct; see **Table 2**), the four scales for agency and the three scales for communality represent different dimensions of these constructs.
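The nested-model comparisons reported above can be checked from the published fit statistics alone. The sketch below is an illustration rather than the authors' lavaan code: it assumes an ordinary (unscaled) chi-square difference test, and it computes the upper-tail probability with a closed-form expression for integer degrees of freedom so that no third-party library is needed. The helper names `chi2_sf` and `chi2_difference` are our own.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(root / math.sqrt(2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def chi2_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Chi-square difference test for two nested models."""
    delta = chi2_restricted - chi2_full
    delta_df = df_restricted - df_full
    return delta, delta_df, chi2_sf(delta, delta_df)


# Values as reported in the text: one-factor versus multi-factor models.
agency = chi2_difference(813.318, 90, 370.224, 84)
communality = chi2_difference(359.803, 44, 326.000, 41)
print(agency, communality)
```

With the reported values, the differences come out at about 443.09 (df = 6) for agency and 33.80 (df = 3) for communality, and both p-values fall well below 0.001.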

# Overall Measures

To provide a point of comparison for our multi-dimensional framework, we also determined scales for overall agency and overall communality. In other words, the 15 agency items were combined into one overall agency scale (α = 0.93) and the 11 communality items were combined into one overall communality scale (α = 0.93).

# RESULTS

# Preliminary Analyses: Rater Age and Education Level

Because of potential consequences of raters' age and education level on the use of gender stereotypes (younger and more educated people might be less likely to adhere to them), we conducted initial analyses to identify their independent and interactive effects. We did not have the opportunity to do the same for race because our subsamples of Asian, African American, and Hispanic participants were not large enough. To determine whether there were differences in the pattern of responses depending upon the age of the rater, we chose the age of 40 as a midlife indicator, divided our sample into two age groups (39 years and younger, 40 years and older), and included age as an additional independent variable in our analyses. Results indicated no main effects or interactions involving age in the ANOVAs conducted. We also divided our sample into two education level groups (those who had graduated from college or had advanced degrees and those who had not graduated from college), and included educational level as an additional independent variable in our analyses. We found no main effects or interactions involving educational level in the ANOVAs. As a consequence we combined data from both younger and older participants and from those who were and were not college educated in the analyses reported below.
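The two grouping rules described above amount to simple recodings. The following Python sketch shows one way they might be expressed; the education labels and function names are hypothetical codings chosen for illustration and are not the study's variable names.

```python
# Hypothetical education labels; the study's actual coding is not reported.
COLLEGE_GRADUATE_LEVELS = {"college graduate", "graduate degree"}


def age_group(age_in_years):
    """Split at the midlife indicator: 39 and younger versus 40 and older."""
    return "39 and younger" if age_in_years <= 39 else "40 and older"


def education_group(level):
    """College graduates (including advanced degrees) versus everyone else."""
    if level in COLLEGE_GRADUATE_LEVELS:
        return "college graduate"
    return "not college graduate"


print(age_group(34), education_group("some college"))
```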

# Main Analyses

To address our research questions, we conducted a series of ANOVAs on subsets of our participant sample. For each question, we first conducted ANOVAs on the overall agency scale and the overall communality scale. Then, to determine whether the results differed for different agency and communality dimensions, we conducted mixed-model ANOVAs that included either agency dimension (instrumental competence, leadership competence, assertiveness, independence) or communality dimension as a within-subjects factor (concern for others, sociability, and emotional sensitivity). Fisher's least significant difference (LSD) method was used to test the question-relevant planned comparisons.



<sup>3</sup>The relatively large RMSEA is likely due to violation of multivariate normality assumptions (joint multivariate kurtosis = 76.55 with a critical ratio of 55.30). The most important implication of non-normality is that chi-square values are inflated, whereas parameter estimates are still fairly accurate (Kline, 2011).



Results are displayed for a 2 (rater gender: male, female) × 2 (target group: men in general, women in general) × 4 (agency dimension: instrumental competence, leadership competence, assertiveness, independence) ANOVA and a 2 (rater gender: male, female) × 2 (target group: men in general, women in general) × 3 (communality dimension: concern for others, sociability, emotional sensitivity) ANOVA.

# Do Men and Women Differ in Their Gender Stereotypes?

We used a 2 × 2 ANOVA, with rater gender (male, female) and target group (men in general, women in general) as factors, to assess differences in men's and women's gender stereotypes. We first analyzed the overall agency and communality ratings, and then conducted a 2 × 2 × 4 mixed-model ANOVA including the agency dimensions, and a 2 × 2 × 3 mixed-model ANOVA including the communality dimensions. The mixed-model ANOVA results are presented in **Table 3**. We followed up with LSD comparisons (see **Table 4**).

#### Agency

The 2 × 2 ANOVA results for the overall agency ratings indicated a main effect for both rater gender, F(1,418) = 15.10, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.04, and target group, F(1,418) = 5.52, p = 0.019, η<sub>p</sub><sup>2</sup> = 0.01. The results of the 2 × 2 × 4 mixed-model ANOVA, including the four agency dimensions as a within-subject factor, repeated the main effects for rater gender and target group and also indicated a main effect for agency dimension and an interaction between agency dimension and target group (see **Table 3**), suggesting that there were differences in ratings depending on the agency dimension.

#### **Differences in ratings of men in general and women in general**

LSD comparisons (see **Table 4**) of the overall agency ratings indicated that male raters rated women in general as lower in overall agency than men in general. They further indicated that female raters rated women in general and men in general as equally agentic. LSD comparisons of the individual agency scales indicated that this result held true for most of the agency dimensions. With the exception of the instrumental competence dimension (on which there were no differences in ratings of women and men in general whether the rater was male or female), male raters rated women in general lower than men in general on the agency dimensions (leadership competence, assertiveness, and independence). In contrast to the ratings of male raters but in line with the overall agency result, female raters rated women in general no differently than they rated men in general in leadership competence and independence. Yet, in contrast to the results of the overall agency ratings, female raters differentiated between women and men in ratings of assertiveness. That is, much like male raters, female raters rated women in general as less assertive than men in general. **Figure 1** displays the results for the agency dimensions.

TABLE 4 | Means, standard deviations, and LSD results of stereotype ratings.

Means (and standard deviations) of male and female raters rating men in general and women in general for all scales. Ratings were given on a 7-point scale from 1 "not at all" to 7 "very much". LSD comparisons are presented for (1) rating differences of men in general versus women in general for male raters and for female raters, and (2) rater gender differences in characterizations of men in general and women in general.

Results are displayed for a 2 (self-rater gender: male, female) × 4 (agency dimensions: instrumental competence, leadership competence, assertiveness, independence) ANOVA and a 2 (self-rater gender: male, female) × 3 (communality dimensions: concern for others, sociability, emotional sensitivity) ANOVA.

#### **Rater gender differences in target group characterizations**

Additional LSD comparisons (again see **Table 4**) lent further insight into the source of the gender discrepancy in the comparative ratings of women and men in general. Comparisons of the overall agency ratings indicated that ratings of men in general did not differ as a result of rater gender, but women in general were rated lower by male as compared to female raters.



Means (and standard deviations) of male and female self-raters for all scales as well as LSD comparisons. Ratings were given on a 7-point scale from 1 "not at all" to 7 "very much".

LSD comparisons of the agency dimensions were in line with the overall agency result in ratings of women in general – they were rated lower by male raters as compared to female raters on all four agency dimensions. However, comparisons of the agency dimensions in ratings of men in general were not uniform and deviated from the overall agency results. Although men in general were rated no differently by male and female raters on the instrumental competence, assertiveness, or independence dimensions, female as compared to male raters rated men in general higher in leadership competence (again see **Figure 1**).

#### Communality

A 2 (rater gender: male, female) × 2 (target group: men in general, women in general) ANOVA of the overall communality ratings indicated only a main effect for target group, F(1,418) = 88.68, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.18. The 2 × 2 × 3 mixed-model ANOVA (see **Table 3**), including the three communality dimensions as a within-subject factor, indicated main effects for target group, rater gender, and communality dimension as well as significant interactions between target group and rater gender, between communality dimension and target group, between communality dimension and rater gender, and a three-way interaction.
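As a reading aid, the partial eta squared effect size for a between-subjects effect can be recovered from the reported F value and its degrees of freedom, because partial eta squared equals SS<sub>effect</sub>/(SS<sub>effect</sub> + SS<sub>error</sub>), which simplifies to F·df<sub>effect</sub>/(F·df<sub>effect</sub> + df<sub>error</sub>). The short Python sketch below applies this identity to the target group effect reported above; the function name is our own, and the calculation is a check on the published numbers rather than a reproduction of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Target group main effect on overall communality: F(1,418) = 88.68.
effect_size = partial_eta_squared(88.68, 1, 418)
print(round(effect_size, 2))
```

The result is approximately 0.175, which rounds to the reported value of 0.18.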

#### **Differences in ratings of men in general and women in general**

LSD comparisons (see **Table 4**) for overall communality indicated that men in general were rated lower in communality than women in general by both male and female raters. In line with this overall finding, results of the LSD comparisons indicated that both female and male raters rated men in general as lower than women in general on all three communality dimensions: concern for others, sociability, and emotional sensitivity. Thus, using the overall measure yielded the same information as did the multidimensional measure.

#### **Rater gender differences in target group characterizations**

Additional LSD comparisons (again see **Table 4**) of the communality ratings indicated that both male and female raters rated men in general similarly in communality, but female raters rated women in general higher in communality than male raters did. LSD comparisons of male and female raters rating men in general using the three communality dimensions were aligned with the overall communality result: male and female raters did not differ in ratings of concern for others, sociability, or emotional sensitivity. However, when rating women in general, results of the LSD comparisons of male and female raters were aligned with the overall measure result for only two of the communality dimensions: Female raters rated women in general higher in concern for others and emotional sensitivity than male raters did. On the dimension of sociability, male and female raters did not differ in their ratings of women in general.

# Do Men and Women Differ in Their Self-Characterizations?

We used a one-way ANOVA to assess differences in men's and women's self-characterizations. We first analyzed the overall agency and communality ratings, and then conducted a mixed-model 2 × 4 ANOVA including the agency dimensions, and a 2 × 3 mixed-model ANOVA including the communality dimensions as a within-subject variable (see **Table 5**). We again followed up with LSD comparisons (see **Table 6**).
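For readers less familiar with the procedure, the following minimal sketch shows how the F statistic of a one-way between-subjects ANOVA is computed from raw ratings. The ratings are hypothetical values on the 7-point scale, not the study data, and the function name is an assumption introduced for illustration.

```python
from statistics import mean


def one_way_anova(groups: list[list[float]]) -> tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    group_means = [mean(group) for group in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((score - m) ** 2 for g, m in zip(groups, group_means) for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical self-ratings on the 7-point scale (NOT the study data).
male_self = [5, 6, 7]
female_self = [3, 4, 5]
f_value, df_between, df_within = one_way_anova([male_self, female_self])
print(f_value, df_between, df_within)  # prints 6.0 1 4
```

With only two groups, as in the rater gender comparison, this F equals the square of the independent-samples t statistic. The sketch does not cover the within-subject factor of the mixed-model analyses.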

#### Agency

ANOVA results of the self-ratings of male and female raters on the overall measure of agency indicated no significant effect for rater gender, F(1,204) = 1.67, p = 0.198, η<sub>p</sub><sup>2</sup> = 0.01. However, results of the 2 × 4 mixed-model ANOVA, with agency dimensions as the within-subject factor, indicated a main effect for agency dimension and an interaction between agency dimension and rater gender, suggesting that self-ratings differed depending on the agency dimension in question (see **Table 5**). LSD comparisons (see **Table 6**) of overall agency showed that, as was indicated by the non-significant gender main effects, women rated themselves as equally agentic as men. Yet, the results for the analyses including the four agency dimensions indicated that only findings for instrumental competence and independence were consistent with the pattern of results for the overall agency ratings (there were no differences in the self-ratings of female and male raters). There were, however, significant differences in ratings of leadership competence and in ratings of assertiveness. For both of these dimensions of agency, women rated themselves lower than men did (see **Figure 2**).

## Communality

Results of the ANOVA of the self-ratings of male and female raters indicated a rater gender main effect, F(1,204) = 5.42, p = 0.021, η<sub>p</sub><sup>2</sup> = 0.03. Results of a 2 × 3 mixed-model ANOVA (again see **Table 5**) with communality dimension as the within-subjects factor, indicated significant main effects for rater gender and communality dimensions. LSD comparisons (again see **Table 6**), in line with the main effect for rater gender, indicated that men rated themselves lower on overall communality than women. LSD comparisons on the dimension scales indicated that, consistent with the overall communality results, men rated themselves as less concerned for others and less emotionally sensitive than women. However, in contrast to the results for overall communality, there was no difference in how men and women characterized themselves in terms of sociability (see **Figure 3**).

# Do Men's and Women's Self-Characterizations Differ From Their Characterizations of Their Gender Groups?

We used a 2 × 2 ANOVA, with rater gender (male, female) and target group (self, men in general when rater was male or women in general when rater was female) to assess differences in men's and women's self-characterizations and same-sex others' characterizations of their gender groups. We first analyzed the overall agency and communality ratings, and then again conducted a 2 × 2 × 4 mixed-model ANOVA including our agency dimensions, and a 2 × 2 × 3 mixed-model ANOVA including our communality dimensions (see **Table 7**) and once more followed up with LSD comparisons (see **Table 8**).

#### TABLE 7 | 2 × 2 × 4 Agency ANOVA and 2 × 2 × 3 Communality ANOVA for self-ratings versus target group ratings.

Results are displayed for a 2 (rater gender: male, female) × 2 (target group: self, men in general when the rater was male or women in general when the rater was female) × 4 (agency dimensions: instrumental competence, leadership competence, assertiveness, independence) ANOVA and a 2 (rater gender: male, female) × 2 (target group: self, men in general when the rater was male or women in general when the rater was female) × 3 (communality dimensions: concern for others, sociability, emotional sensitivity) ANOVA.

TABLE 8 | LSD comparisons of self-ratings versus target group ratings.

Rating comparisons for male raters: Male self-raters versus male raters' ratings of men in general; Rating comparisons for female raters: Female self-raters versus female raters' ratings of women in general.

#### Agency

The 2 × 2 ANOVA results for the overall agency measure indicated no significant main effect for rater gender, F(1,397) = 2.19, p = 0.139, η<sub>p</sub><sup>2</sup> = 0.00, or target group, F(1,397) = 0.013, p = 0.909, η<sub>p</sub><sup>2</sup> = 0.00, but a marginally significant interaction between them, F(1,397) = 2.77, p = 0.097, η<sub>p</sub><sup>2</sup> = 0.01. The 2 × 2 × 4 mixed-model ANOVA including the agency dimensions as a within-subjects factor also indicated no significant main effects for rater gender or for target group and again a marginally significant interaction between them. It also indicated a significant main effect for agency dimension and significant interactions of dimension with both rater gender and target group, as well as a three-way interaction between rater gender, target group, and agency dimension (see **Table 7**).

#### **Men's self-ratings versus ratings of men in general**

LSD comparisons (see **Table 8**, means and standard deviations are displayed in **Tables 4**, **6**) of overall agency indicated that male raters rated themselves as more agentic than male raters rated men in general. Results for the agency dimensions were more varied: For the independence and instrumental competence dimensions results were in line with the overall agency result, but male raters rated themselves no differently in leadership competence or assertiveness than male raters rated men in general (see **Figure 4**).

#### **Women's self-ratings versus ratings of women in general**

LSD comparisons (see **Table 8**, means and standard deviations are displayed in **Tables 4**, **6**) of the overall agency ratings indicated that female raters rated themselves no differently than female raters rated women in general. However, comparisons of the four agency dimensions depicted a different pattern. Although ratings of independence were in line with the overall agency result, female raters rated themselves higher in instrumental competence than female raters rated women in general. Most striking, however, were the differences in ratings on the leadership competence and assertiveness dimensions. In contrast to the findings for overall agency, in each of these cases female raters' ratings of themselves were significantly lower than female raters' ratings of women in general (see **Figure 5**). The differences in self-ratings of assertiveness and leadership competence marked the only instance in which there was a more negative characterization of self than of one's gender group.

#### Communality

The 2 × 2 ANOVA results for the overall communality measure indicated a main effect for rater gender, F(1,397) = 19.03, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.01, and target group, F(1,397) = 42.92, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.10, as well as a significant interaction, F(1,397) = 10.51, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.03. The 2 × 2 × 3 mixed-model ANOVA including the communality dimensions as a within-subjects factor indicated significant main effects for rater gender, for target group, and communality dimension as well as a significant interaction between rater gender and target group, between rater gender and communality dimension, and between target group and communality dimension (see **Table 7**).

#### **Men's self-ratings versus ratings of men in general**

LSD comparisons (see **Table 8**, means and standard deviations are displayed in **Tables 4**, **6**) of overall communality indicated that male raters rated themselves as more communal than male raters rated men in general. LSD comparisons of the three communality dimension scales were consistent with the finding for overall communality. Male raters rated themselves significantly higher than male raters rated men in general in concern for others, sociability and emotional sensitivity (see **Figure 6**).

#### **Women's self-ratings versus ratings of women in general**

LSD comparisons (see **Table 8**, means and standard deviations are displayed in **Tables 4**, **6**) of the overall communality ratings indicated that there was no difference in how female raters rated themselves and how female raters rated women in general. LSD comparisons for sociability and emotional sensitivity were consistent with this finding. However, female raters rated themselves higher in concern for others than they rated women in general (see **Figure 7**).

# DISCUSSION

It was the objective of this research to investigate gender stereotyping of others and self. To do so, we aimed to take into account multiple dimensions of the agency and communality constructs. It was our contention that perceptions on some of these dimensions of agency and communality would differ from one another, and that there would be a benefit in viewing them separately. Our results support this idea. While there were overall findings for agency and communality, analyses of individual aspects of them were not always consistent with these findings. What often appeared to be a general effect when using the overall measures of agency and communality in fact proved to be more textured and differentiated when the multidimensional framework was used. These results support the idea that distinguishing between different agency and communality facets can offer a deeper, more nuanced understanding of gender stereotypes today. Indeed, some important information appears to get lost by only focusing on the overall constructs.
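The masking described above can be illustrated with a toy calculation. The scale means below are hypothetical values on the 7-point scale, chosen only to mimic the direction of the pattern reported for women's self-ratings versus their ratings of women in general. They are not the study data. The sketch also assumes that the overall score is the unweighted mean of the four dimension scores, which may differ from how the overall measure was actually formed.

```python
from statistics import mean

DIMENSIONS = (
    "instrumental competence",
    "leadership competence",
    "assertiveness",
    "independence",
)

# Hypothetical scale means (illustrative only, NOT the study data).
self_ratings = {
    "instrumental competence": 6.0,
    "leadership competence": 4.5,
    "assertiveness": 4.5,
    "independence": 5.5,
}
group_ratings = {
    "instrumental competence": 5.5,
    "leadership competence": 4.75,
    "assertiveness": 4.75,
    "independence": 5.5,
}

# Assumption: overall agency is the unweighted mean of the dimension scores.
overall_gap = mean(list(self_ratings.values())) - mean(list(group_ratings.values()))

# Dimension-level gaps (self minus gender group).
dimension_gaps = {d: self_ratings[d] - group_ratings[d] for d in DIMENSIONS}

print(overall_gap)
for dimension, gap in dimension_gaps.items():
    print(dimension, gap)
```

In this constructed case the overall gap is zero, although one dimension is higher for the self and two are lower. This is the kind of cancellation that an overall measure cannot reveal.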

# Answers to Our Research Questions

#### Current Stereotypes

Our results clearly indicate that gender stereotypes persist. They also indicate that stereotypes about agency were more prevalent for male than for female raters. Specifically, male raters described women in general as lower in most aspects of agency than men in general, and also rated women in general lower on each of the agency dimensions than female raters did. Nonetheless, female raters were not stereotype-free with respect to agency: they described women in general as less assertive than men in general and rated men in general as more leadership competent than male raters did. These findings were masked by the overall measure of agency, which indicated no differences in agency ratings.

Stereotypes about communality also were strongly indicated by our data, but their strength did not tend to differ greatly between male and female raters. All participants rated women higher than men on the three communality dimensions.

#### Self-Stereotyping

Our results showed that men's and women's self-characterizations differed in line with gender stereotypes. Despite the overall agency measure indicating no difference in self-ratings of agency, the analyses incorporating dimensions of agency painted a different picture. Whereas there was no difference in the self-characterizations of men and women in instrumental competence or independence, women rated themselves lower than men in leadership competence and assertiveness. There also were differences in communality self-ratings. Though men tended to rate themselves as generally less communal than women did (as less concerned for others and less emotionally sensitive), their ratings of sociability did not differ from women's.

#### Self-Characterizations Versus Characterizations of One's Gender Group

Self-characterizations were often found to differ from characterizations of one's gender group. Male raters rated themselves as higher in independence and instrumental competence, but no different in assertiveness or leadership competence than they rated men in general. Female raters rated themselves higher in instrumental competence but lower in assertiveness and leadership competence than they rated women in general. These findings are at odds with the results of the overall agency ratings, which imply that male raters consistently rated themselves higher in agency, and that female raters consistently rated themselves no differently than they rated their gender group.

There also were differences between self-ratings and characterizations of one's gender group on the communality dimensions. While female raters only rated themselves higher than they rated women in general in concern for others, male raters rated themselves as higher than they rated men in general on all three dimensions of communality.

# Implications

What does our analysis of current stereotypes tell us? On the one hand, our results indicate that despite dramatic societal changes many aspects of traditional gender stereotypes endure. Both male and female respondents viewed men in general as being more assertive than women in general, and also viewed women in general as more concerned about others, sociable and emotionally sensitive than men in general. On the other hand, our results indicate important departures from traditional views. This can be seen in the findings that unlike male respondents, female respondents indicated no gender deficit in how independent or how competent in leadership they perceived other women to be.

Self-descriptions also tended to conform to traditional gender stereotypes, with men describing themselves as more assertive and more competent in leadership than women did, and women describing themselves as more concerned about others and more emotional than men did. However, there were aspects of agency and communality for which self-characterizations of men and women did not differ. Women's self-ratings of independence and instrumental competence were as high as men's self-ratings, and men's self-ratings of sociability were as high as women's self-ratings. Together with the findings about characterizations of men and women in general, these results attest not only to the possible changing face of stereotypes, but also highlight the importance of considering specific dimensions of both agency and communality in stereotype assessment.

It should be noted that our results suggest a greater differentiation between the multidimensional results for agency characterizations than for communality characterizations. That is, the multidimensional results more often aligned with the results of the overall measure when the focus of measurement was communality than when it was agency. It is not clear at this point whether this is because of the particular items included in our scales or because communality is a more coherent construct. But, based on our results, it would appear that the use of a multidimensional framework is of particular value when the measurement of agency is the focus – something that should be noted by those involved in studying stereotype assessment and change.

# Competence Perceptions

The lack of similarity in the pattern of results for the two competence dimensions (instrumental competence and leadership competence) is interesting. Although there were differences in ratings on the leadership competence dimension, ratings on the instrumental competence dimension did not differ when comparing ratings of men and women in general or when comparing male and female raters' self-characterizations. It thus appears that there is an aspect of competence on which women are rated as highly as men – the wherewithal to get the work done. However, caution is urged in interpreting this finding. The attributes comprising the instrumental competence scale can be seen as indicative of conscientiousness and willingness to work hard, attributes often associated with women as well as men. Thus there is a question about whether instrumental competence is really a component of the agency construct, a question also prompted by its pattern of correlations with the other dependent measure scales (see also Carrier et al., 2014).

The leadership competence ratings paint a different picture. The consistent perception by men that leadership competence was more prevalent in men than in women suggests that, at least as far as men are concerned, women still are not seen as "having what it takes" to adequately handle traditionally male roles and positions. Whatever the interpretation, however, the different pattern of results found for these two scales indicates that we as researchers have to be very precise in designating what we are measuring and how we are measuring it. It also indicates that we have to keep close to the construct we actually have measured when drawing conclusions from our data.

# Women and Contemporary Gender Stereotypes

Our results show that women do not entirely embrace the stereotypic view of women as less agentic than men. They did not make distinctions between men and women in general when rating their independence and instrumental competence, nor were their self-ratings on the independence and instrumental competence scales lower than the self-ratings made by men. These findings are noteworthy: one of the key aspects of agency is independence, and it appears that women do not see themselves or other women to be lacking it more than men. Women also did not make distinctions between men and women in general when rating their leadership competence, another key component of agency. These findings suggest that, for modern day women, some important aspects of the agency stereotype no longer apply.

However, our results suggest that women have not moved as far along as one would hope in separating themselves from gender stereotypic constraints. In particular, their self-perceptions of assertiveness and leadership competence – dimensions of agency associated with social power – do not seem to deviate from traditional gender conceptions. Our findings indicate that women not only characterized themselves as less assertive and less competent in leadership than men characterized themselves, but they also described themselves significantly more negatively on these two scales than they described women in general. This means that women rated themselves as more deficient in several central aspects of agency than they rated women as a group, adhering more strongly to traditional gender stereotypes when describing themselves than when describing others. These results seem inconsistent with attribution theory (Jones and Nisbett, 1987) and construal level theory (Trope and Liberman, 2010), and challenge the idea that because people differentiate more when viewing themselves as compared to others they are less apt to use stereotypes in self-description. They also raise questions about differences in aspects of agency that do and do not involve power relations. These findings are in need of further exploration.

# Men and Contemporary Gender Stereotypes

Our results indicate that men continue to accept the stereotyped conception of men lacking communal qualities. They, along with women, rated men in general lower than women in general on all three communality dimensions. It therefore is particularly interesting that in their self-ratings on one dimension of communality – sociability – they did not differ from women. This finding suggests that men conceive of sociability differently when they characterize themselves than when they characterize others. Other research suggests that whereas women are more social than men in close relationships, men are more social than women in group contexts (Baumeister and Sommer, 1997; Gabriel and Gardner, 1999). Thus, men might have rated themselves as equally sociable as women rated themselves, but for a different reason: because they conceptualized sociability with regard to their groups (rather than close relationships). If so, then clarification is needed about why this potentially different conception of sociability takes hold for men only when they characterize themselves.

Furthermore, it is of note that when comparing themselves with men in general, men's ratings of themselves were significantly higher on all communal dimensions. This finding suggests that although they strongly adhere to traditional stereotypes in their characterizations of men as a group, there is a tendency for men to be less stereotype-bound when they characterize themselves. It also suggests that they are more self-aggrandizing when rating themselves than when rating other men – ascribing to themselves more of the "wonderful" traits traditionally associated with women (Eagly and Mladinic, 1989). This result contrasts with that found for women, for whom traditional gender stereotypes often appeared to exert more influence in self-characterizations than in characterizations of others, even when the result was self-deprecating rather than self-enhancing. Why there are differences in discrepancies in self-ratings versus other-ratings of women and men raises interesting questions for future research – questions about whether these differential effects are due to the gender of the rater or to the nature of the particular descriptors involved.

# Limitations

Our results indicate that breaking down agency and communality into dimensions was often of benefit when assessing stereotyped perceptions. Though many of our scales were highly correlated, the confirmatory factor analyses provided support that they were distinct facets. Our choice to analyze the scales separately despite high correlations is in line with other researchers, who argue that doing so can enhance results interpretation (Luthar, 1996; Tabachnik and Fidell, 2007). However, we do not claim that the dimensions we derived are the only way to differentiate among the elements of communality and agency, nor do we claim that our scales are the best way to measure them. Indeed, we chose a top–down procedure, using expert judges to derive our scales. This had the advantage that the judges knew about gender research and could effectively represent the literature on gender stereotypes. Nevertheless, if non-experts had done the initial sorting, they may have come to different conclusions about the number or content of items in the different scales or may have generated different scales altogether, ones that perhaps would have been more representative of everyday categories that are consensual in our culture.

Furthermore, our scale construction may have been constrained because our initial pool of items relied exclusively on existing items from past scales, which, although broadly selected, may have been limited by particular ways of thinking about stereotypes. Recent findings by Abele et al. (2016), for example, included a morality facet in their breakdown of communality, and found it to be a robust facet of communality in ratings within and between a large number of countries in both Eastern and Western cultures. We, however, did not include many items that measured morality in our original list of attributes. Whereas we scoured the gender stereotyping literature focused on social perception to compile the most frequently used items for our initial item pool, Abele and colleagues went through a similar process, but with literature focused primarily on self-perception. Items focusing on the morality component of communality should no doubt be incorporated in future research. In addition, there might also be additional items relating to other facets of agency, such as a cognitive agency facet (e.g., being rational). Moreover, and more generally, a process by which the attributes comprising the scales are generated in a free-form manner and the categorization tasks are performed by a broad-ranging set of judges would serve as a check on our measures and provide guidance about how to modify and improve them.

There are other methodological limitations that are suggestive of follow-up research. We found no differences as a result of the rater's age and education, attesting to the generality of the effects we uncovered, but there no doubt are other possible moderating factors to be explored, such as race and socioeconomic level. Moreover, although we were able to tap into a wide-ranging population, it is important to replicate our study with a more representative U.S. sample to assess the full scope of our findings. In addition, our study was restricted to a sample of U.S. citizens, and it would be interesting to replicate this research with samples that are not exclusively from the U.S. Such cross-cultural replications would help not only to assess generalizability to other cultures, but also to assess the extent to which the nature and degree of change in social roles influences the way people currently conceive of men and women, and men and women conceive of themselves. Finally, it would be useful to conduct research using our measure to describe more differentiated targets to determine whether our results would be similar or different when intersectionality is taken into account and when particular subtypes of women and men are the focus.

# Going Forward

Our findings stimulate several questions for future research. Not only would it be useful to further investigate the competence component of agency, clarifying what it does and does not entail, but also to consider another aspect of competence that has recently been identified as being strongly male gender-typed – intellectual brilliance (Leslie et al., 2015). Exploring the effects of the apparently contradictory view women have of themselves in terms of agency (self-views of their independence and instrumental competence versus self-views of their assertiveness and leadership competence) on women's attitudes and behavior in a variety of spheres also would be valuable. In addition, it would be advantageous to determine whether the greater communality men ascribe to themselves than to other men reflects actual beliefs or is merely self-enhancing, and if it has implications for men's approach to traditionally female roles and positions.

Finally, it is important that in future research attempts are made to demonstrate the usefulness of distinguishing among the dimensions of agency and communality we have identified, and to do so for both self and other characterizations. While for some research questions an overall agency and overall communality measure will likely be sufficient, there no doubt are instances in which finer distinctions will be beneficial. It is possible, for example, that different dimensions of gender stereotypes are more strongly associated with selection decisions, performance evaluations, or reward distributions. Indeed, other researchers have already begun to demonstrate the value of considering distinct facets of agency in assessing gender differences in leader evaluations, but with a less differentiated set of dimensions including only self-reliance and dominance (Schaumberg and Flynn, 2017). It also is possible that different dimensions of self-stereotypes are more strongly associated with career aspirations and choices, or support for gender-related organizational policies. Demonstrating that different dimensions of agency and communality predict different outcomes would add support to our multidimensional framework. In addition to increasing our understanding, such discoveries could provide valuable information about leverage points for intervention to ease the negative consequences of gender stereotyping and the bias it promotes.

# CONCLUSION

In this study we have demonstrated the value of subdividing the agency and communality constructs in the study of gender stereotypes, and shown that making global statements about agency and communality runs the risk of distorting rather than clarifying our understanding.

Our goal with this paper was to further the conversation in the field about different aspects of both agency and communality and their potentially different effects on self and other characterizations. An underlying theme is that we may be losing information by generalizing to two super constructs and not attending to their components. Our findings demonstrate the complexity of the agency and communality constructs and the potential benefits of thinking about them with greater specificity. This can have consequences not only for understanding stereotypes and gender bias, but also for intervention and change efforts.

What are the implications of our findings for understanding the persistence of gender inequality? Although the results signal easing in some dimensions of traditional gender stereotypes, they make clear that in many ways they persist. Of particular importance is men's unrelenting image of women as deficient in attributes considered to be essential for success in many traditionally male fields – an image that forms the basis of gender bias in many evaluative decisions. But women are not exempt from the influence of gender stereotypes; even though they view women as equal to men in several key agentic qualities, they see themselves as more deficient than men do in both leadership competence and assertiveness, and more deficient in these agency dimensions than women in general. These findings, which result from consideration of multiple aspects of the agency construct, augur ill for the tempering of women's tendency to limit their opportunities. Evidently we still have a way to go before all the components of traditional gender stereotypes fully dissipate and recede, allowing men and women to be judged, and to judge themselves, on the basis of their merits, not their gender.

# ETHICS STATEMENT


This study was carried out in accordance with the recommendations of the Institutional Review Board, University Committee on Activities Involving Human Subjects, New York University. The protocol was approved by the University Committee on Activities Involving Human Subjects, New York University.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

This project was supported in part by an ADVANCE Diversity Science Research Grant awarded to the second author funded

# REFERENCES


by the National Science Foundation ADVANCE-PAID award (HRD-0820202). This project was further supported by the Research Grant "Selection and Evaluation of Leaders in Business and Academia" awarded to the third author and funded by the German Federal Ministry of Education and Research (BMBF) and the European Social Fund (ESF) (FKZ 01FP1070/71). This publication was supported by the German Research Foundation (DFG) and the Technical University of Munich (TUM) in the framework of the Open Access Publishing Program.

# ACKNOWLEDGMENTS

We thank Suzette Caleo, Francesca Manzi, Susanne Braun, and Jennifer Ray for their insights and feedback in the development of this study. We also thank Armin Pircher Verdorfer for his support in calculting the CFA. Portions of this study were presented at the Annual Meeting of the Society for Personality and Social Psychology.






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hentschel, Heilman and Peus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX

TABLE A | List of agentic attributes measured.


#### Agentic Attributes


TABLE B | List of communal attributes measured.


# Comparing Prescriptive and Descriptive Gender Stereotypes About Children, Adults, and the Elderly

#### Anne M. Koenig\*

Department of Psychological Sciences, University of San Diego, San Diego, CA, United States

Gender stereotypes have descriptive components, or beliefs about how males and females typically act, as well as prescriptive components, or beliefs about how males and females should act. For example, women are supposed to be nurturing and avoid dominance, and men are supposed to be agentic and avoid weakness. However, it is not clear whether people hold prescriptive gender stereotypes about children of different age groups. In addition, research has not addressed prescriptive gender stereotypes for the elderly. The current research measured prescriptive gender stereotypes for children, adults, and elderly men and women in 3 studies to (a) compare how prescriptive gender stereotypes change across age groups and (b) address whether stereotypes of males are more restrictive than stereotypes of females. Students (Studies 1 and 2) and community members (Study 3), all from U.S., majority-White samples, rated how desirable it was for different target groups to possess a list of characteristics from 1 (very undesirable) to 9 (very desirable). The target age groups included toddlers, elementary-aged, adolescent, young adult, adult, and elderly males and females. The list of 21 characteristics was created to encompass traits and behaviors relevant across a wide age range. In a meta-analysis across studies, prescriptive stereotypes were defined as characteristics displaying a sex difference of d > 0.40 and an average rating as desirable for positive prescriptive stereotypes (PPS) or undesirable for negative proscriptive stereotypes (NPS) for males or females of each age group. Results replicated previous research on prescriptive stereotypes for adults: Women should be communal and avoid being dominant. Men should be agentic, independent, masculine in appearance, and interested in science and technology, but avoid being weak, emotional, shy, and feminine in appearance.
Stereotypes of boys and girls from elementary-aged to young adults still included these components, but stereotypes of toddlers involved mainly physical appearance and play behaviors. Prescriptive stereotypes of elderly men and women were weaker. Overall, boys and men had more restrictive prescriptive stereotypes than girls and women in terms of strength and number. These findings demonstrate the applicability of prescriptive stereotypes to different age groups.

Keywords: gender, stereotypes, prescriptions, children, adults, elderly, age

#### Edited by:

Sabine Sczesny, Universität Bern, Switzerland

#### Reviewed by:

Rebecca Neel, University of Iowa, United States

Monica Biernat, University of Kansas, United States

> \*Correspondence: Anne M. Koenig akoenig@sandiego.edu

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

> Received: 01 April 2018 Accepted: 07 June 2018 Published: 26 June 2018

#### Citation:

Koenig AM (2018) Comparing Prescriptive and Descriptive Gender Stereotypes About Children, Adults, and the Elderly. Front. Psychol. 9:1086. doi: 10.3389/fpsyg.2018.01086

# INTRODUCTION

Gender stereotypes are both descriptive and prescriptive in nature. That is, gender stereotypes have descriptive components, which are beliefs about what men and women typically do. They also contain strong prescriptive components, or beliefs about what men and women should do (Fiske and Stevens, 1993; Cialdini and Trost, 1998). This prescriptive nature is assumed to stem from the high level of contact and interdependence between men and women (e.g., Fiske and Stevens, 1993), which not only allows perceivers to create estimates of how men and women actually act but also creates expectations for how they should act.

Prescriptive stereotypes can have positive and negative components: (a) positive prescriptive stereotypes (PPS) designate desirable behaviors that one sex is encouraged to display more than the other and (b) negative proscriptive stereotypes (NPS) designate undesirable behaviors that one sex should avoid more than the other. These proscriptive stereotypes often involve characteristics that are undesirable in either sex, but are permitted in one sex, while being proscribed for the other. For example, according to past research (Prentice and Carranza, 2002; Rudman et al., 2012b), women are supposed to be communal (warm, sensitive, cooperative; PPS for women) and avoid dominance (e.g., aggressive, intimidating, arrogant; NPS for women), and men are supposed to be agentic (assertive, competitive, independent; PPS for men) and avoid weakness (e.g., weak, insecure, emotional; NPS for men). Yet dominance and weakness, which are undesirable, negative traits, are tolerated in men or women, respectively.

The current research measures both prescriptive and descriptive gender stereotypes to answer several questions about their content and magnitude. A first basic question is whether gender stereotypes have prescriptive components not only for adult men and women, but for males and females across different age groups, from toddlers to the elderly. Assuming prescriptive stereotypes exist across these age groups, the current research addresses how both the content and magnitude of prescriptive gender stereotypes change across age groups. In addition, the current research compares the magnitude of PPS and NPS for males and females within each age group.

## Adult Prescriptive Stereotypes

The fact that gender stereotypes are prescriptive is important to our perceptions of men and women because prescriptive stereotypes indicate approved (or disapproved) behavior. Violations of these prescriptions create strong reactions in perceivers. Whereas violations of descriptive stereotypes often cause surprise, given the person is not acting how the perceiver thought most men or women act, violations of prescriptive stereotypes create reactions of anger and moral outrage, because the person is not acting as they are supposed to act (Rudman and Glick, 2010).

Thus, descriptive gender stereotypes can lead to prejudice and discrimination based on a perceived incongruency between gender stereotypes and role requirements, and prescriptive stereotypes can also produce prejudice if individuals violate gender norms (e.g., Burgess and Borgida, 1999; Heilman, 2001; Eagly and Karau, 2002). Specifically, the angry, moral outrage created by the violation of prescriptive stereotypes can lead to backlash, or social or economic penalties for the stereotype violator (e.g., dislike or not being hired for a position). Rudman et al. (2012a,b) posit that backlash against both female and male targets works to maintain the status hierarchy and keep men in high status positions, but limits agentic women's access to these same positions. For example, women who violate prescriptive stereotypes by acting dominant are disliked and therefore less likely to be hired even though they are seen as competent (Rudman et al., 2012a). Men can also be the recipients of backlash when they violate prescriptive stereotypes by lacking agency and showing weakness (Moss-Racusin et al., 2010; see summary by Rudman et al., 2012a).

Because of this backlash effect, prescriptive stereotypes can predict prejudice, even when descriptive stereotypes do not. For example, when male and female targets had equivalent resumes, participants' descriptive stereotypes did not predict evaluations of the targets, but prescriptive stereotypes did predict prejudice toward women pursuing masculine roles (Gill, 2004). Prescriptive stereotypes also create pressures on women and men to act in certain ways, and thus men and women avoid violating stereotypes or hide their non-conforming behavior to avoid penalties, which increases the rate of stereotypical behavior and perpetuates perceivers' stereotypes (Prentice and Carranza, 2004; Rudman and Glick, 2010; Rudman et al., 2012a). Thus, prescriptive stereotypes have important ramifications for behavior.

Whether these prescriptive stereotypes are more restrictive for adult men or women is unclear. Much research has investigated backlash toward women, perhaps because women are often held back from high status positions, which is seen as an important discriminatory outcome in society. However, there are several forms of evidence that suggest men's behaviors may be more restricted than women's in adulthood. For example, although they did not have a direct measure of prescriptive stereotypes, Hort et al. (1990) demonstrated that men were described in more stereotypical terms than women. Other evidence for a restrictive male stereotype stems from looking at the outcomes of stereotype violation. According to the status incongruity hypothesis, there are two prescriptive stereotypes that could create backlash for men (lacking agency and displaying weakness) and only one for women (displaying dominance; Rudman et al., 2012a). This argument suggests that men are viewed more negatively than women for violating gender norms because men lose status (while women gain status) with the violation (Feinman, 1984; Sirin et al., 2004), and status is seen as a positive, desirable outcome. In addition, theories about precarious manhood also suggest that men have to publicly and repeatedly prove their strength to be called men because manhood is an uncertain, tenuous social status (Vandello and Bosson, 2013). Even a single feminine or unmanly act could discount a man's status as a man, resulting in avoidance of feminine behaviors. According to this logic, these pressures may create strong prescriptive stereotypes for men to act agentically and avoid weakness to be considered a man, a pressure that is not as strong for women. Lastly, a sexual orientation perspective also indicates that men would be judged more harshly for feminine behavior than women are for masculine behavior because (a) men who display feminine behaviors are more likely to be perceived as gay than women who display masculine behavior (e.g., Deaux and Lewis, 1984; Herek, 1984; McCreary, 1994; Sirin et al., 2004), and (b) gay men are perceived more negatively than lesbians (e.g., Kite and Whitley, 1996). Given all of these ideas, prescriptive stereotypes may be stronger for men as a way to avoid these negative outcomes of a loss of status, manhood, and perceptions of homosexuality. The current research quantifies prescriptive stereotypes for males and females to assess their content and magnitude and attempts to make comparisons across the stereotypes for males and females.

## Prescriptive Stereotypes About Children

Penalties for stereotype violations also occur for children who act in counterstereotypical ways. Several studies show that both child (e.g., Smetana, 1986; Levy et al., 1995) and adult (e.g., Feinman, 1981; Martin, 1990; Sandnabba and Ahlberg, 1999) respondents react more negatively (e.g., in approval and evaluations) to counterstereotypical behavior from boys than from girls ranging from ages 3 to 8 years old. This negative reaction toward boys is often stronger in men than women (e.g., Martin, 1990). Parents give little latitude for boys' behaviors but encourage feminine behavior as well as masculine occupations and interests for girls, even complaining that their daughters can be "too girly" with pink, princess paraphernalia (Kane, 2012). Boys who are "sissies" are especially negatively perceived, whereas girls who are "tomboys" have both feminine and masculine interests and traits and therefore do not violate gender stereotypes as strongly (Martin, 1990, 1995; Martin and Dinella, 2012). Boys also elicit negative reactions for shy behavior, presumably because this behavior violates the male gender role (Doey et al., 2014). As with adults, boys' behavior may be more restricted because of links between feminine behavior and homosexuality (e.g., Sandnabba and Ahlberg, 1999; Sirin et al., 2004). Thus, the consequences for violating stereotypes appear to be especially harsh for boys, and boys tend to be bounded by stricter rules of gender conformity and are subject to stronger "gender policing" than girls. These penalties, similar to backlash in the adult literature, suggest that violations of prescriptive stereotypes are at play. However, the research on children's norm violations does not frame the negative outcomes for counterstereotypical behavior in terms of violations of prescriptive stereotypes. In fact, it is not clear whether people even hold strong prescriptive gender stereotypes about children.

In one study that did address prescriptive stereotypes in children, Martin (1995) measured both descriptive and prescriptive gender stereotypes by asking adults how typical (measuring descriptive stereotypes) and how desirable (measuring prescriptive stereotypes) a list of 25 traits were for 4–7 year old boys or girls. As Martin (1995) predicted, the typicality ratings differed more often than the desirability ratings: The descriptive stereotypes indicated that boys and girls differed on 24 of 25 of the traits, which were selected to contain some masculine, feminine, and neutral items. Yet only 16 of the 25 traits showed sex differences in desirability: Martin (1995) found that boys should enjoy mechanical objects, be dominant, be independent, be competitive, like rough play, and be aggressive but avoid crying/getting upset or being frustrated (compared to girls). Girls should be gentle, neat/clean, sympathetic, eager to soothe hurt feelings, well-mannered, helpful around the house, and soft-spoken and avoid being noisy. Although there were fewer prescriptive than descriptive stereotypes about children in this research, these findings also show that prescriptive gender stereotypes exist for children of elementary-school age in ways that are consistent with adult prescriptive stereotypes.

Although prescriptive stereotypes may exist for younger ages, one could argue that younger people may not be held to as high a standard for their behavior because they are considered to be more malleable than older targets (see Neel and Lassetter, 2015). To the extent that children are seen as still learning their gender roles and associated appropriate behaviors, people may be more lenient and prescriptive stereotypes might be weaker. On the other hand, adults' descriptive gender stereotypes of children were stronger than their descriptive stereotypes of adults (Powlishta, 2000), and the same effect may apply to prescriptive stereotypes, resulting in stronger stereotypes of children. Thus, the magnitude of prescriptive gender stereotypes for children of different ages and how they compare to adult prescriptive gender stereotypes is unclear.

## Prescriptive Stereotypes About Other Age Groups

Once males and females are old enough to understand their gender roles, perceivers may be less lax about what is desirable behavior. Not only may older teens be seen as more in charge of their own behavior, but adolescence and young adulthood highlight differences between males and females in ways that were not relevant to children, given the advent of puberty and the initiation of dating scripts. Thus, stereotypical self-perceptions and peer pressure for conformity to gender roles may intensify during adolescence for both males and females (Massad, 1981; Hill and Lynch, 1983; Galambos et al., 1990). This "gender intensification hypothesis" states that there is an acceleration of gender-differential socialization and increased pressure to conform during adolescence. However, it is unclear if these self-beliefs would transfer to adults' stereotypes of male and female teens. Based on these ideas, one could predict that prescriptive stereotypes adults hold are stronger for adolescents. Whether males' behaviors would still be more restricted is unclear. Some researchers argue that gender role pressures intensify at this age mostly for boys (Massad, 1981; Galambos et al., 1990), which is in line with ideas about precarious manhood, where boys have to continue to strive to become men through their public behavior whereas girls become women through the natural process of menstruation and other biological changes that occur in adolescence (Vandello and Bosson, 2013). However, other researchers suggest a confluence of factors increases pressures on girls' behavior in adolescence compared to childhood, with the leniency given to girls to be tomboys replaced with stricter gender norms and a pressure to exhibit feminine behaviors and interests within a heterosexual dating environment (Hill and Lynch, 1983).
Thus, it is unclear whether boys would still be more restricted in their behavior than girls and generally how prescriptive stereotypes may change or emerge for adolescents and young adults.

On the other side of the age range, research has not focused on prescriptive gender stereotypes in the elderly. There is some evidence that descriptive gender stereotypes become more similar for elderly targets, in part because men's attributes become less masculine (Kite et al., 1991; DeArmond et al., 2006; Thompson, 2006). Conversely, other evidence shows that when compared to older women, older men are still seen as more competent, higher in autonomy, and less dependent (Canetto et al., 1995), demonstrating the continued existence of gender stereotypes. However, most of the research on aging stereotypes measures the negativity of the stereotypes (e.g., Hummert et al., 1995; Laditka et al., 2004) and not whether they are gendered. Thus, researchers have not addressed prescriptive stereotypes in the elderly or compared these to stereotypes of young adult or middle-aged men and women. Perhaps elderly men have less pressure to demonstrate their manhood and provide for a family, and thus their restrictions lessen, making violations of gender roles less severe than for younger individuals.

## Current Research

In 3 studies, the current research measured prescriptive and descriptive gender stereotypes for various age groups, including children, adults, and the elderly. In all studies, participants rated how desirable and typical it was for different target groups to possess a list of characteristics. The list of characteristics included a variety of traits and behaviors, many of which have not been used in past research on adult stereotypes, to cover the types of behaviors that may be more relevant to childhood. For example, research on the parental treatment of boys vs. girls demonstrated higher levels of pressure for gendered interests and activities rather than traits (e.g., Lytton and Romney, 1991).

Through this method, the current research attempts to measure prescriptive gender stereotypes of toddlers, elementary-aged children, adolescents, young adults, adults, and the elderly to compare the content and strength of these stereotypes and answer several questions. In particular, assuming that gender stereotypes toward children and the elderly are also prescriptive in nature, the current research addresses how both the content and magnitude of prescriptive gender stereotypes change across age groups. Specifically, based on the emphasis on policing boys' behavior in childhood, one might expect that prescriptive stereotypes would be stronger for boys than adult men. Alternatively, these stereotypes may remain strong across age groups. Conversely, prescriptive feminine stereotypes may start weaker for girls and increase with age. Because descriptive stereotypes were also measured, prescriptive stereotypes can be compared to the typicality of each characteristic in males and females. Secondly, the research compares the number and magnitude of PPS and NPS for males and females within each age group to answer the question of whether males are more restricted than females in their behavior. Participants also answered a direct question comparing the desirability of stereotype-violating behavior in males vs. females. Research suggests greater restrictions for males are likely for children, but the difference in strength and magnitude of prescriptive gender stereotypes has not been directly tested for specific age groups of children or for adult or elderly stereotypes.

# METHOD

## Participants

Student participants in Studies 1 and 2 took part in a laboratory setting for course credit. In Study 1 (n = 137), participants were 64.2% women; the mean age was 18.73 years (SD = 1.07); 72.3% were White/Caucasian, 16.8% Hispanic/Latino, 11.7% Asian, 5.1% Black/African American, and 6.6% other or unreported (in all studies participants could select as many racial groups as apply). In Study 2 (n = 91), participants were 65.9% women; the mean age was 19.10 years (SD = 1.97); 76.9% were White/Caucasian, 15.4% Asian, 12.1% Hispanic/Latino, 2.2% African American, and 8.8% other or unreported.

In Study 3 (n = 120), participants were recruited through Amazon's Mechanical Turk (MTurk; see Buhrmester et al., 2011; Mason and Suri, 2012) and were paid \$0.30 for a 15-min survey. Participants were 59.3% women; the mean age was 38.17 years (SD = 13.67); 70.8% were White/Caucasian, 7.5% Hispanic/Latino, 6.7% Black/African American, 5.0% Asian, and 4.1% other or unreported.

## Procedure and Designs

All procedures were approved by the USD Institutional Review Board and all materials are available upon request. Participants in Studies 1 and 2 gave written informed consent, but participants in Study 3 indicated their informed consent online as a waiver of written consent was obtained from the IRB. Participants in all three studies rated the prescriptive and/or descriptive stereotypes of 3–6 groups of boys/men and/or girls/women. In Study 1, each participant rated 3 target groups of either males or females of different ages in a 3 (target age: elementary school, adults, elderly) × 2 (target sex: male, female) × 2 (stereotype rating: prescriptive, descriptive) mixed-model design, with target age and stereotype rating as within-subjects. In Study 2, targets were expanded to more age groups and participants rated 2 target groups of males and females of the same age in a 5 (target age: toddlers, elementary-aged, adolescent, young adult, adult) × 2 (target sex: male, female) × 2 (stereotype rating: prescriptive, descriptive) mixed-model design, with target sex and stereotype rating as within-subjects. In Study 3, the sample was broadened to community participants, who rated 6 groups of males or females of various ages in a 6 (target age: toddlers, elementary-aged, adolescent, young adult, adult, elderly) × 2 (target sex: male, female) × 2 (stereotype rating: prescriptive, descriptive) mixed-model design, with target age as within-subjects. In all studies, the levels of the within-subject variable were presented in a random order. Target age was designated with a label and a corresponding age group: toddlers (∼2–5 years old), elementary-aged children (∼5–12 years old), adolescents (∼12–18 years old), young adults (∼18–30 years old), adults (∼30–50 years old), and the elderly (over ∼65 years old). See **Table 1** for a comparison of study designs.

The instructions stated that the survey asked about the desirability of characteristics for males and females of different age groups. In Studies 1 and 2, prescriptive stereotype ratings were presented first, then the comparison of prescriptive stereotypes, and finally the descriptive ratings. To circumvent social desirability pressures, the instructions pointed out that the researchers were not interested in personal opinions but judgments of how society evaluates these characteristics for males and females of different age groups. Participants were then thanked for their time and debriefed about the purpose of the study.

A sensitivity analysis in G\*Power (Faul et al., 2007) demonstrated that this research was able to detect with 80% power a between-subjects target sex effect of d = 0.37 in Study 1, a within-subjects target sex effect of d between 0.53 and 0.50 (with n between 17 and 19 per target age condition) in Study 2, and a between-subjects target sex effect of d = 0.55 for prescriptive stereotypes and d = 0.56 for descriptive stereotypes in Study 3. Thus, with a cut-off of d = 0.40 to define a prescriptive stereotype, these studies had acceptable power to detect effects of larger magnitudes, although results from near the cutoff should be taken with caution.

## Measures

#### Prescriptive Stereotypes

In Studies 1 and 2 participants rated the characteristics of target groups in response to the question, "How DESIRABLE it is in American society for [elementary school boys (∼5–12 years old)] to possess the following characteristics? That is, we want to know how [boys] SHOULD act" [emphasis in original]. In Study 3 the second sentence read, "That is, regardless of how boys actually act, we want to know how society thinks [elementary school boys] SHOULD act." The scale ranged from 1 (very undesirable) to 9 (very desirable). This question is similar to the prescriptive stereotype question and response options from Prentice and Carranza (2002), who also used a bi-polar scale.


#### Descriptive Stereotypes

In Studies 1 and 2 participants also rated the characteristics of target groups in response to the question, "Indicate how COMMON or TYPICAL each of the following characteristics is in [elementary school boys (∼5–12 years old)] in American society. That is, we want to know how adult females USUALLY act" [emphasis in original]. In Study 3, the question asking about descriptive stereotypes read "How COMMON or TYPICAL is it in American society for [elementary school boys (∼5–12 years old)] to possess the following characteristics? That is, we want to know how society thinks [boys] USUALLY act." In all studies the scale ranged from 1 (very atypical) to 9 (very typical).

#### Characteristics

Both types of stereotypes were rated on 19–21 characteristics, created by grouping the traits from previous research (Martin, 1995; Prentice and Carranza, 2002; Rudman et al., 2012b) based on similarity, and adding some additional characteristics to cover a larger variety of traits and behaviors and include characteristics more applicable to children (e.g., shy, noisy, interests, play, and dress style). The full list of characteristics is given in **Table 2**.

TABLE 2 | Characteristics rated for prescriptive and descriptive stereotypes.


<sup>a</sup>Used only in Studies 2 and 3.

The trait groupings are the items used in the stereotype ratings and the characteristic represents the label for the overarching concept being measured. The list was displayed in a different order for each study.

To make it easier for participants to rate groups of characteristics (instead of individual traits), participants were instructed to note that not all traits would apply equally across age groups, but within each list of characteristics some may apply more to some age groups than others. Participants were asked to think about the meaning of the overall list as they rated each group, instead of focusing only on 1 or 2 traits in the list. One benefit of grouping traits this way is that it allowed the characteristics to be more applicable across age groups. Participants may have focused on slightly different traits, but all of the traits on a list represented the overall concept being measured, allowing for a comparison of that concept across ages even though it might manifest as different behaviors in different age groups. Thus, participants could apply that concept to a certain age group, instead of attempting to rate an individual trait that may or may not seem relevant to each age group.

#### Prescriptive Comparisons

In Studies 1 and 2, participants were also asked to compare the desirability of behavior of males and females who are likely violating their prescriptive stereotypes. Specifically, in two questions, participants compared (a) males (of a certain age) acting communal to females (of the same age) acting agentic (PPS of the other sex) and (b) males (of a certain age) acting weak to females (of the same age) acting dominant (NPS for that sex). Communion, agency, weakness, and dominance were defined using the same lists of characteristics given in **Table 2**. The scale ranged from 1 (considerably less desirable for males to act nurturing/weak) to 7 (considerably less desirable for females to act assertive/dominant).

# RESULTS AND DISCUSSION

The raw data supporting the conclusions of this article can be requested from the author. Effect sizes for both prescriptive and descriptive stereotypes are the standardized difference between the relevant conditions, or Cohen's d. I corrected the small-sample bias in estimates of d using the conversion to Hedges' g, but refer to the effect sizes as d. In Studies 1 and 3, effect sizes were calculated by dividing the difference in ratings for male and female targets at each of the different age groups by the pooled standard deviation. In Study 2, where target sex was within-subjects, effect sizes were calculated by dividing the difference in ratings by the average standard deviation, in order to facilitate the meta-analysis across studies (see Lakens, 2013). These effect sizes were then meta-analyzed using fixed-effects across the three studies, when the same age group was rated. A fixed-effects rather than random-effects meta-analysis was more appropriate because the studies had nearly identical measures and the sample of studies was too small to yield a reliable estimate of the between-study variability needed in random-effects computations (see Borenstein et al., 2009).
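The effect-size steps described above can be illustrated with a minimal sketch. It assumes the common approximation J = 1 − 3/(4df − 1) for the Hedges correction, uses df = n − 1 for the within-subjects case, and takes sampling variances as given, none of which is specified in the text. All function names and numbers are hypothetical and are not the study's data.

```python
import math


def hedges_j(df):
    """Approximate small-sample correction factor, J = 1 - 3 / (4*df - 1)."""
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def d_between(m_male, sd_male, n_male, m_female, sd_female, n_female):
    """Bias-corrected d for independent groups (pooled SD), as in Studies 1 and 3."""
    df = n_male + n_female - 2
    pooled = math.sqrt(
        ((n_male - 1) * sd_male**2 + (n_female - 1) * sd_female**2) / df
    )
    return hedges_j(df) * (m_male - m_female) / pooled


def d_within(m_male, sd_male, m_female, sd_female, n_pairs):
    """Bias-corrected d using the average SD, as in Study 2.

    Using df = n_pairs - 1 for the correction is an assumption of this sketch.
    """
    avg_sd = (sd_male + sd_female) / 2.0
    return hedges_j(n_pairs - 1) * (m_male - m_female) / avg_sd


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean effect and its standard error."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    return mean, math.sqrt(1.0 / sum(weights))


# Hypothetical ratings for one characteristic and one age group.
d1 = d_between(6.8, 1.4, 68, 5.9, 1.5, 69)
d2 = d_within(6.5, 1.6, 5.8, 1.5, 18)
d3 = d_between(6.6, 1.7, 60, 6.0, 1.6, 60)
# Sampling variances are placeholders; supply an estimator suited to each design.
combined, se = fixed_effect([d1, d2, d3], [0.030, 0.060, 0.034])
print(round(combined, 2), round(se, 2))
```

Positive values here mean the characteristic was rated higher for male than for female targets; that sign convention is a choice of this sketch.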

## Prescriptive Stereotypes

**Table 3** provides the effect sizes in the meta-analysis of prescriptive stereotypes (see the Supplementary Tables for effects for each study separately). Following Rudman et al. (2012b), prescriptive stereotypes were defined as traits displaying a sex difference of d > 0.40 and an average rating as desirable (>6 for PPS) or undesirable (<4 for NPS) for males or females. These two criteria mean that a large difference between the desirability of the characteristic between males and females does not necessarily classify as a stereotype if it is not also highly desirable or undesirable for one sex. Based on these criteria, PPS and NPS for males and females are designated in **Table 3**. To facilitate comparisons across age groups, the bottom rows of **Table 3** report the number of characteristics that meet the criteria to be considered as PPS and NPS and the average effect size for these PPS and NPS.
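The two-part criterion can be read as a simple decision rule. The sketch below is one possible reading, assuming d is coded as male minus female ratings (a convention not stated in the text). The function name, default thresholds as arguments, and example values are hypothetical.

```python
def classify(d, mean_male, mean_female, d_cutoff=0.40, high=6.0, low=4.0):
    """Label a characteristic as PPS and/or NPS for one age group.

    d is assumed to be coded male minus female, so d > 0 means the
    characteristic was rated as more desirable for males.
    """
    labels = []
    if d > d_cutoff:
        if mean_male > high:
            labels.append("PPS for males")
        if mean_female < low:
            labels.append("NPS for females")
    elif d < -d_cutoff:
        if mean_female > high:
            labels.append("PPS for females")
        if mean_male < low:
            labels.append("NPS for males")
    return labels


# Hypothetical values: a large sex difference alone is not enough.
print(classify(0.9, 7.2, 5.5))
print(classify(0.9, 5.5, 5.0))
```

The second call illustrates the point made in the text: a sizable sex difference does not count as a prescriptive stereotype unless the characteristic is also rated as clearly desirable or undesirable for one sex.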

It is clear from these data that prescriptive gender stereotypes exist across age groups, satisfying the assumption that prescriptive stereotypes are relevant for each age group. Thus, the data are described in relation to two questions: (a) comparing the content and magnitude of prescriptive gender stereotypes across age groups and (b) comparing the magnitude of PPS and NPS for males and females within each age group.

#### Comparisons Across Target Age

Toddlers had very few prescriptive stereotypes, and with the exception of being communal for girls, their stereotypes were not about traits but physical appearance and toys. Toddler boys had both strong PPS to have a masculine appearance and play with masculine toys and NPS to avoid having a feminine appearance or playing with feminine toys. Girls had strong PPS to have a feminine appearance and play with feminine toys as well as a weaker PPS to be communal. Although these prescriptive stereotypes were strong, other trait-based stereotypes were much weaker, suggesting that people do not have gendered expectations of toddlers' traits, perhaps because their personalities are perceived as not yet formed and more malleable (e.g., Neel and Lassetter, 2015). People do, however, have strong prescriptions about how toddlers should look and what they should play with, contradicting Campenni's (1999) research showing that ratings of the gender-appropriateness of toys for toddlers were less stereotypical than ratings for older children.

As early as elementary school, prescriptive gender stereotypes similar to those for adults emerged. The strongest stereotypes for school-aged children were again for physical appearance and behavior, with the same pattern as for toddlers. At this age, sex-typed interests also appeared as prescriptive stereotypes: it was seen as desirable for boys to be interested in math and science and for girls to be interested in language and arts. It is important to note, however, that opposite sex-typed interests did not meet the criteria for proscriptive stereotypes. Trait stereotypes also met the criterion for elementary school-aged children: It was desirable for boys to be agentic and active and avoid being shy, weak, or emotional. Girls, on the other hand, should be communal as well as wholesome and avoid being dominant or noisy. These prescriptive stereotypes are very similar to those found by Martin (1995) for 4- to 7-year-old children, including agency, interest in mechanical objects, rough play and avoiding weakness for boys and communal traits and avoiding noise for girls. The proscription of shyness for boys of this age group is also consistent with Doey et al.'s (2014) analysis of the social (in)acceptability of shyness for school-aged boys. Martin (1995) did label independence as a desirable trait for boys (which did not meet the criteria for a prescriptive stereotype until adolescence in these data) and being neat, well-mannered, and helpful around the house for girls, which were not directly measured in the current data.

Note to Table 3: Average PPS and Average NPS are the average effect size of the characteristics that meet the criteria for that age group.

Stronger prescriptive gender stereotypes may emerge in elementary school-aged children, compared to toddlers, because by this age people believe that counterstereotypical behavior is predictive of adult counterstereotypical behaviors (Sandnabba and Ahlberg, 1999), and so prescriptive stereotypes become relevant in order to pressure normative behavior. Thus, people appear to believe that elementary-aged children are no longer as malleable in their personality as toddlers. Conversely, there was no evidence for the idea that stereotypes for children would be stronger than stereotypes of adults; if anything, they were slightly weaker, although not by much.

Trait prescriptive stereotypes of male and female adolescents were intensified slightly compared to younger children, but not to a high degree and the average prescriptive stereotypes were not different in magnitude from younger children. These stereotypes were also not much different than adult stereotypes. Thus, there is not a lot of support for the idea that adolescence highlights gender differences and intensifies prescriptions based on the magnitude of the stereotypes.

There were some changes in the content of the stereotypes in adolescence and young adulthood, however. Starting in adolescence, PPS for toy/play behavior fell away for both males and females, although NPS to avoid opposite sex-typed toys remained, with females picking up the admonition to avoid masculine toys. Stereotypes for physical appearance also remained, at about the same magnitude as for children. PPS for males to be agentic and independent as well as be interested in math and science increased from adolescence into adulthood, but the stereotype for males to be active peaked in adolescence. These PPS were now similar in magnitude to NPS for males to avoid being shy, weak, or emotional. Young adulthood brought a new PPS for males to be intelligent, which remained with age.

For females, adolescence brought a PPS to be likeable and an NPS against being sexually active, and young adulthood an NPS against rebelliousness, but none of these stereotypes met the criteria for a stereotype in any other age group. PPS for girls and women to be communal grew with age and peaked in young adulthood, and NPS to avoid dominance grew into adulthood as well. The strongest prescriptive stereotypes for adolescent girls through adult women were to have a feminine appearance and be communal and avoid dominance and masculine toys.

These results replicated previous research on prescriptive stereotypes for adults (Prentice and Carranza, 2002; Rudman et al., 2012b), showing that women should be communal and avoid being dominant and men should be agentic and independent but avoid being weak and emotional. Adult prescriptive stereotypes were expanded in the current study by including more characteristics: Women should also have a feminine appearance and be interested in languages/arts, and avoid having a masculine appearance and being sexually active or noisy. Men should also have a masculine appearance, be interested in science/math/technology/mechanical objects, and be sexually active, but avoid being shy and appearing feminine. Adult men were also supposed to be sexually active, compared to women.

Stereotypes for the elderly were weaker for both men and women. Men were still supposed to have masculine interests, be agentic, and be intelligent as well as avoid feminine toys, appearing feminine, and weakness, but these stereotypes were weaker than those for adults from 30 to 50 years old. For elderly women, all stereotypes fell away except for a PPS to be communal, which was also weaker than for other age groups (excepting toddlers). These results are consistent with the findings that descriptive gender stereotypes weaken for elderly targets (e.g., DeArmond et al., 2006; Thompson, 2006). These stereotypes were also inconsistent across studies (see Supplementary Tables), suggesting that prescriptive gender stereotypes may be less relevant to older age groups.

Overall, these results demonstrated that the content and magnitude of prescriptive stereotypes do change for different age groups, focusing on activities and appearance at the youngest ages studied here, with trait stereotypes increasing for elementary-aged children and continuing through adulthood. There was not much evidence for an intensification of prescriptive gender stereotypes for adolescents, as these stereotypes were similar to both the elementary and young adult age groups. Stereotypes then waned for elderly targets, supporting the notion that prescriptive gender stereotypes also weaken with age.

#### Comparison of Male vs. Female Stereotypes

One test of the question of whether males' behavior is more restricted than females' behavior depends on the number and magnitude of the PPS and NPS in each age group. Based on the data counting and averaging prescriptive stereotypes of males and females of each age group presented in **Table 3**, the stereotypes were more restrictive for males than females at nearly every age group. Although toddlers had few prescriptive stereotypes, the ones that did exist demonstrated that toddler boys had both strong PPS and NPS, whereas girls had only strong PPS but no strong NPS to avoid masculine things. From elementary-aged through adults, females gained weak NPS and the magnitude of male PPS and NPS decreased slightly, but overall the same pattern held. Even though stereotypes for the elderly are weaker for both males and females, the prescriptive stereotypes were still more numerous and stronger for men than women.

In nearly every age group (except the elderly), the average NPS were larger than PPS for males, suggesting that males are directed more based on what they should not do rather than what they should do. Conversely, female PPS were stronger than female NPS and male PPS; thus, females are directed more based on what they should do rather than what they should not do. The stronger pressure on males to conform to gender stereotypes therefore focuses on telling boys and men which behaviors to avoid. This idea is interesting in relation to precarious manhood, which suggests that a man's status as a man is easily lost, especially if he displays feminine behaviors (Vandello and Bosson, 2013), which in this research made up NPS for males.

A second test of this question of greater restrictions for males involves the prescriptive comparison bi-polar questions that directly asked participants whether it was less desirable for males or females to violate stereotypes. These questions were identical in Studies 1 and 2 (but omitted in Study 3), and the means are presented in **Table 4**. It is worth noting that in the current study agency did not meet the criterion for a NPS for females and communion did not meet the criterion for a NPS for males. However, these characteristics were PPS for the other sex, and this question is labeled as positive violations because it describes males and females acting in ways prescribed to the other sex. Weakness and dominance were proscribed behaviors for males and females, respectively, and thus these are labeled negative violations because for males to act weak and females to act dominant violates NPS.

Most of the means were different from the midpoint of the scale (4), except for positive violations for adults and negative violations for elementary-aged, elderly (in Study 1), and toddlers (in Study 2). Repeated measures analysis of variance (ANOVA) on the positive and negative violations demonstrated that ratings varied by target age for positive violations in Study 1, F(2, 256) = 21.34, p < 0.001, partial η<sup>2</sup> = 0.14, and Study 2, F(4, 360) = 14.09, p < 0.001, partial η<sup>2</sup> = 0.14, and for negative violations in Study 1, F(2, 258) = 36.73, p < 0.001, partial η<sup>2</sup> = 0.22, and Study 2, F(4, 360) = 22.09, p < 0.001, partial η<sup>2</sup> = 0.20. Contrasts showed that for positive violations, it was less desirable for males to be communal than females to be agentic for adolescents, elementary-aged, and young adults but less desirable for females to be agentic than males to be communal in toddlers and the elderly. For negative violations, it was less desirable for males to be weak than females to be dominant for adolescents, young adults, and adults, and in no cases was it less desirable for females to be dominant than for males to be weak.
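As a consistency check on the reported ANOVA results, partial eta squared can be recovered from each F ratio and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only; the function name is hypothetical, and the inputs are the four F tests reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (label, F, df_effect, df_error) taken from the reported tests.
reported = [
    ("Positive violations, Study 1", 21.34, 2, 256),
    ("Positive violations, Study 2", 14.09, 4, 360),
    ("Negative violations, Study 1", 36.73, 2, 258),
    ("Negative violations, Study 2", 22.09, 4, 360),
]

for label, f_value, df_effect, df_error in reported:
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 2))
```

Rounded to two decimals, the recovered values match the effect sizes reported in the text (0.14, 0.14, 0.22, and 0.20).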

These results support the notion that males' behavior is more restricted than females' even when asking people directly to compare the behaviors of males and females. Although toddlers and the elderly were exempt from these restrictions, there was greater concern, compared to females being agentic or dominant, that (a) elementary-aged boys should not be communal, (b) adolescent boys and young adult men should not be communal or weak, and (c) adult men should not be weak. A greater emphasis on males' than females' prescriptive violations in these questions was strongest for adolescents, supporting the idea that these concerns more strongly emerge at puberty, even though the overall magnitude of prescriptive stereotypes was not strongest for adolescents. Interestingly, concerns for the positive violations of the elderly reverse, such that it was more concerning if females behave agentically than if males behave communally, consistent with the idea that male stereotypes evolve to include more communal elements in the elderly. Thus, these data that required participants to directly compare the violation of stereotypes for males and females supported the conclusion that males are more restricted in their behavior from elementary school to adulthood.

#### Prescriptive Stereotype Summary

In sum, these findings demonstrated the applicability of prescriptive stereotypes to different age groups, but also their variation depending on the age of the target group. The largest stereotypes for toddlers and elementary-aged youth were for girls to have and for boys to avoid a feminine appearance and playing with feminine toys. Prescriptive stereotypes for very young boys and girls were focused on appearance and play behaviors, and were especially proscriptive for boys—telling them more what not to do than what to do. Trait stereotypes appeared for elementary school-aged children, and the prescriptions for the usual suspects of communion, agency, dominance, and weakness remained into adulthood. Stereotypes for the elderly were then again minimized, demonstrating that people hold elderly men and women to few standards of gendered behavior, although elderly men still had more prescriptive stereotypes than elderly women. Overall, it does appear that males received more pressure in the form of prescriptive stereotypes, especially NPS about what not to do, across all age groups and especially for toddlers.

## Descriptive Stereotypes

**Table 5** displays the average effect size across the three studies in the meta-analysis of descriptive stereotypes. The Supplementary Tables show the effect sizes for each study separately. Similar to Martin (1995), the effect sizes were often larger for descriptive than prescriptive stereotypes not only for children but for most age groups. Using a criterion of d > 0.40 (similar to the prescriptive stereotype criterion) to qualify as a descriptive stereotype, 98 out of 126 (77.8%) effects over all age groups qualify as descriptive stereotypes. Thus, males and females were often rated as typically different even when the behavior was not prescribed for one sex over the other. However, descriptive stereotypes were highly correlated with prescriptive stereotypes for toddlers, r(19) = 0.95, p < 0.001, elementary-aged, r(19) = 0.97, p < 0.001, adolescents, r(19) = 0.94, p < 0.001, young adults, r(19) = 0.94, p < 0.001, adults, r(19) = 0.95, p < 0.001, and the elderly, r(19) = 0.77, p < 0.001. Thus, prescriptive and descriptive stereotypes aligned, although these high correlations may be an outcome of having the same participants rate both desirable and typical behaviors in Studies 1 and 2.
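The counts and correlations in this paragraph can be cross-checked with a few lines of arithmetic. The sketch assumes that the reported degrees of freedom follow the usual df = n - 2 rule for a Pearson correlation, which implies 21 characteristics per age group and 21 × 6 = 126 effects in total; that reading is an inference from the reported numbers. The critical value 3.883 is the two-tailed t for df = 19 at p = 0.001 from standard t tables, and the function name is hypothetical.

```python
import math

def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(df / (1 - r ** 2))

df = 19
n_characteristics = df + 2           # assumes df = n - 2
n_age_groups = 6
total_effects = n_characteristics * n_age_groups
share_descriptive = 100 * 98 / total_effects

print(total_effects, round(share_descriptive, 1))

# Weakest reported correlation (the elderly) against the p = 0.001 critical value.
T_CRITICAL_001 = 3.883               # two-tailed, df = 19, from standard t tables
t_elderly = t_from_r(0.77, df)
print(round(t_elderly, 2), t_elderly > T_CRITICAL_001)
```

Since even the weakest correlation exceeds the critical value, the stronger correlations for the other age groups necessarily do as well.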

## Limitations and Future Research

It is important to note that this research was conducted with majority White samples from the United States. The predominately White samples likely used White targets as their reference group, since target race was not specified. Thus, caution should be used when extrapolating the results to participants or targets of other racial groups. Previous research has demonstrated that descriptive stereotypes of men and women are more similar to stereotypes of White men and White women than to gender stereotypes of other racial groups (Ghavami and Peplau, 2013) and that Blacks are seen as more masculine and Asians as more feminine than Whites (Galinsky et al., 2013). There is also reason to suspect that prescriptive gender stereotypes may vary by race, as Black female leaders do not experience backlash for being dominant (Livingston et al., 2012). Thus, it is important to acknowledge that the current results describe stereotypes of Whites for Whites, but more research will be needed to know if other racial groups show similar prescriptive gender stereotypes for different age groups and if men of other racial groups are more restricted in their behavior than women.

TABLE 4 | Means and standard deviations for comparisons for desirability of violating prescriptive stereotypes by target age.

Means with different subscripts differed by p < 0.05. Means with \* were significantly different from the midpoint of the scale (4) at p < 0.05. Means lower than 4 indicate it was less desirable for males than females to violate the stereotype; means above 4 indicate it was less desirable for females than males to violate the stereotype.

TABLE 5 | Meta-analyzed descriptive stereotypes (d) by target age.

Positive d-values reflect males were rated higher on that characteristic and negative d-values reflect females were rated higher on that characteristic.

In addition, the current ratings were all perceptions of adults (college students or older) of various age groups, from toddlers to the elderly. Missing are ratings of each age group of its own stereotypes (e.g., toddlers of toddlers; adolescents of adolescents; the elderly of the elderly). Suggesting similarity in prescriptive stereotypes across participant age groups, previous research demonstrated that children's reactions to norm violators (e.g., Smetana, 1986; Levy et al., 1995) show the same pattern of greater disapproval of counterstereotypical behavior from boys than girls that adults demonstrate in other studies. In addition, Powlishta (2000) found that children's and adults' descriptive stereotypes of child and adult targets were quite similar, although the difference between ratings of males and females on femininity was weaker for child than adult participants. Descriptive stereotypes of the elderly were also weaker for elderly respondents than middle-aged or young respondents (Hummert et al., 1995). It is unknown whether similar effects of participant age would occur for prescriptive stereotypes, which might be conceptually more difficult for children to understand as they designate desirable behavior rather than actual behavior. Stereotypes of one's own age group would be interesting to study, but with the current data I was interested in whether adults view different age groups differently. The stereotypes adults hold about children impact how children behave through gender role socialization, modeling, and direct tutelage (Witt, 1997; Bussey and Bandura, 2004). Adults' beliefs about adolescents can also be important, as parents' stereotypical beliefs about adolescents' focus on peers and social concerns impacted parents' perceptions and their child's behavior (Jacobs et al., 2005). Thus, parental beliefs about gender stereotypes can influence their children's gender role behavior, so understanding adults' views of children is important. Future research could assess whether parental status matters to these views, to see if greater familiarity with children or adolescents changes adults' views of prescriptive gender stereotypes.

The current research also did not assess possible reasons for the differences in prescriptive stereotypes across age groups. For example, the research did not attempt to measure the impact of stereotype violations on status, manhood, or perceived sexual orientation, which are all possible mechanisms for the policing of boys and men in terms of what they are not supposed to do. It may be the case that these mechanisms vary across age groups. The smaller prescriptive stereotypes in toddlers may be due to greater perceived malleability in personality and trait characteristics, and behaviors of younger children may not speak as directly to sexual orientation (see McCreary, 1994). In addition, if these concerns are reduced or removed for the elderly, this may help to explain the reduced size of prescriptive gender stereotypes in this age group. Future research should continue to address these issues across a wide variety of age groups.

The meta-analytic results presented here average across three studies with different research designs. However, it is important to note that Study 2 had larger effect sizes (see Supplementary Tables), most likely because target sex was within-subjects, encouraging participants to draw sharper distinctions between the male and female groups. These target contrast effects have occurred in other research. For example, Thompson (2006) found that old men were rated as more masculine and less feminine when compared to old women than when compared to young men. Participants in the current research rated the targets in a random order by age, minimizing any one specific age comparison when averaging across participants, but stereotypes may also differ depending on the presentation order of age groups. Thus, the size of the stereotypes may depend on the research design used to capture them.

## Implications

Because prescriptive stereotypes exist across age groups, the mechanism causing the negative reactions and backlash to counterstereotypical behavior may be the same for both children and adults—a violation of prescriptive stereotypes. However, different types of behavior would violate prescriptive stereotypes in adults and children, based on the specific content and magnitude of these stereotypes. For example, negative reactions to children might focus more on violations of physical appearance or play behaviors, rather than traits, whereas reactions to adolescents and adults could result from violations of both trait and appearance prescriptive stereotypes. Future research should address prescriptive stereotypes as a mechanism for negative reactions to children, adults, and the elderly who display counterstereotypical behaviors. Backlash could also vary with perceiver's ideology—non-traditional participants might see stereotype violations as a positive rather than a negative event (see Gaunt, 2013).

# CONCLUSIONS

The current findings demonstrated the applicability of prescriptive stereotypes to different age groups, from toddlers to the elderly, and presented their content and magnitude. All age groups had prescriptive stereotypes, although the content and magnitude of those stereotypes varied across age groups. Prescriptive stereotypes for toddlers contained elements of play and appearance, whereas trait stereotypes first appeared for elementary-aged children. Prescriptive stereotypes for the elderly were minimized, suggesting less pressure to conform to expectations. Prescriptions for males focused on NPS that admonish what not to do, whereas females' stronger PPS focused on what girls and women are supposed to do. Thus, overall, males' behavior was more restricted based on these stereotypes. The current research describes the current state of prescriptive gender stereotypes for a variety of age groups, and the consequences of these stereotypes for socialization and backlash as well as how the stereotypes might differ across racial groups deserve further study.

# ETHICS STATEMENT

These studies were carried out in accordance with the recommendations of ethical standards of the American Psychological Association. The protocols were approved by the University of San Diego's Institutional Review Board (IRB). Participants in Studies 1 and 2 gave written informed consent in accordance with the Declaration of Helsinki, but in Study 3 participants did not because a waiver of written consent was granted by the IRB. Instead, participants consented online before participating in the study.

# AUTHOR CONTRIBUTIONS

AK conceived, planned, and carried out the experiments, analyzed the data, interpreted the results, and wrote the manuscript.

# ACKNOWLEDGMENTS

I would like to thank my research assistants Rita Taylor and Brooke Miller for their help with data collection. Publication is made possible by a grant from the College of Arts and Sciences, University of San Diego.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01086/full#supplementary-material

# REFERENCES




**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Koenig. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Are the Processes Underlying Discrimination the Same for Women and Men? A Critical Review of Congruity Models of Gender Discrimination

#### Francesca Manzi\*

Department of Psychology, New York University, New York, NY, United States

Although classic congruity models of gender discrimination (e.g., role congruity theory, lack of fit) predict negative outcomes for both women and men in gender-incongruent domains, the literature has focused almost exclusively on discrimination against women. A number of recent studies have begun to address the question of whether and under what circumstances men can also be the targets of gender discrimination. However, the results of these studies have so far been mixed. Therefore, the question of whether men, like women, also suffer discrimination when in gender incongruent roles and domains remains unclear. The goal of the present paper is to integrate and critically examine the burgeoning literature on gender discrimination against men in order to assess whether the symmetrical predictions of congruity models are supported. Through this close analysis and integration of the literature, I aim to identify remaining gaps in the research on gender discrimination. In particular, I propose that researchers of gender discrimination would benefit from expanding their scope beyond that of paid work.

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Donald A. Saucier, Kansas State University, United States Toni Schmader, University of British Columbia, Canada

#### \*Correspondence:

Francesca Manzi fm908@nyu.edu

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 28 April 2018 Accepted: 18 February 2019 Published: 06 March 2019

#### Citation:

Manzi F (2019) Are the Processes Underlying Discrimination the Same for Women and Men? A Critical Review of Congruity Models of Gender Discrimination. Front. Psychol. 10:469. doi: 10.3389/fpsyg.2019.00469

Keywords: gender stereotypes, role congruity theory, lack of fit, gender discrimination, male targets

# INTRODUCTION

At first glance, research in the social sciences appears to have provided a thorough account of the dynamics underlying gender-based discrimination. Social psychology in particular has produced a large literature that has sought to uncover the cognitive and motivational mechanisms behind gender discrimination, as well as to track changes in the nature of gender discrimination over time. However, the majority of research in gender discrimination has focused almost exclusively on discrimination against women in traditionally male roles and occupations (Jetten et al., 2013).

This focus on women has not been arbitrary: discrimination on the basis of gender has been a particular problem for women, especially in employment settings. Further, even though women now comprise nearly half of the workforce in most developed nations (Pew Research Center, 2017c; United States Bureau of Labor Statistics, 2017), there are still important domain-specific gender imbalances, such that women remain dramatically underrepresented in occupations that have been traditionally dominated by men. This imbalance puts women at an important social and economic disadvantage, as these positions tend to hold the highest prestige and status, as well as higher monetary and social rewards (Cejka and Eagly, 1999; Hegewisch and Hartmann, 2014; Levanon and Grusky, 2016). Because gender-based discrimination has historically interfered with women's professional success and continues to hinder their social mobility, gender bias against women is an obvious and central impediment to gender equality. Thus, the focus on gender discrimination against women (and not men) makes sense from a historical, cultural, and political point of view.

The fact that discrimination continues to affect women more than men, however, does not necessarily mean that men cannot be the targets of gender bias in evaluation. Although empirical research has focused almost exclusively on women, most psychological theories of the antecedents and consequences of gender discrimination are not meant to be gender-specific. Rather, many of these theories are posited as explanations of gender bias more generally and therefore should also be able to account for patterns of discrimination against men, should they exist. Though these social psychological theories about gender discrimination have shown themselves to be useful in explaining why, when, and how women encounter barriers in traditionally male roles and occupations, whether they can also explain the potential limitations men encounter when seeking entry into traditionally female domains remains to be seen. Thus, examining whether and under which circumstances men are discriminated against on the basis of their gender has important theoretical implications.

The goal of the present paper is to critically examine classic models of gender discrimination by expanding their scope beyond women in traditionally male settings to also integrate research on the evaluation of men in traditionally female roles and occupations. The primary focus of this review is on congruity models of discrimination (hereafter, "CMDs") such as "role congruity theory" (Eagly and Karau, 2002), "lack of fit" (Heilman, 1983, 2012), and "think manager, think male" (Schein, 1973, 2001), which are among the most well-examined and empirically supported theories of gender bias in the psychology literature. These theoretical explanations argue that there can be a mismatch between what men and women are perceived to be like (i.e., gender stereotypes) and what is thought to predict success in specific occupations (i.e., job stereotypes). This perceived mismatch or incongruity between gender stereotypes and job stereotypes leads to negative performance expectations for both women and men in gender-incongruent domains and, in turn, gives rise to gender discrimination.

The predictions made by CMDs have been consistently supported in research on bias against women in stereotypically masculine (i.e., "male-typed") settings. However, the accuracy of these theories in predicting whether men face similar biases in stereotypically feminine (i.e., "female-typed") occupations and roles is less well established. Accordingly, the primary goal of this paper is to review the existing literature in order to examine whether the processes affecting discrimination against men and women are symmetrical (i.e., whether being in a gender-incongruent role has similar negative effects for both men and women). In doing so, this review will assess the core tenets of CMDs and the psychological mechanisms that they contend are responsible for giving rise to gender discrimination.

Exploring whether men can be the targets of gender-based bias is important not only from a theoretical perspective, but also from a practical one. While women's entry and participation in traditionally male domains have increased dramatically in the past decades, men's participation in traditionally female domains has remained stubbornly stagnant (Blau et al., 2013). Given that occupations in which women outnumber men are typically devalued (Cohen and Huffman, 2003; Hegewisch and Hartmann, 2014), increasing male participation in these areas may help decrease gender segregation and, in turn, help balance the prestige and economic rewards that are allocated to both male- and female-dominated occupations. Importantly, if men's under-representation in feminine roles can be explained, even in part, by traditional models of gender discrimination, then the knowledge we have gained from decades of research on women in traditionally male settings should be helpful in identifying strategies to combat anti-male bias. If, on the other hand, men's lack of participation in female roles and occupations is not due to gender discrimination, or if the processes underlying bias are not analogous for women and men, then there may be a need for both theoretical revision and new ways to address the persistent gender imbalance in the workplace.

In the following sections, I will review the extant literature focusing on the evaluation of men in female-dominated occupations and interpret these results in light of the predictions made by CMDs. In keeping with the dominant approach of research on CMDs, this review will focus primarily on the processes underlying gender discrimination from the evaluator's perspective (rather than that of the "target" or person being evaluated). That is, the focus will be on people's judgments and evaluations of other men's and women's occupational competence. Because the predictions of CMDs center on evaluations of men in female-typed domains, this review will also be limited to perceptions of men in female-typed roles and occupations.

# CAN MEN BE THE TARGETS OF GENDER DISCRIMINATION?

In recent years, there has been a rise in perceptions of anti-male discrimination. Over 40% of adults believe that men face a little or a moderate amount of discrimination in the United States (American National Election Studies, 2016). While the percentage of men alleging that they have suffered some form of discrimination on account of their gender is still far below that of women (22% vs. 42%, respectively, Pew Research Center, 2017b), many men believe that anti-male discrimination is on the rise, and that it is more prevalent today than in past decades (Bosson et al., 2012; Kehn and Ruthig, 2013).

What explains these growing perceptions of anti-male discrimination? They may in part be a consequence of women's recent social advancements and the appearance of gender-related initiatives focused on women. For example, some see the increase in academic diversity programs aimed at girls, but not boys, as discriminatory, especially given that women are now more highly educated than men (Coston and Kimmel, 2012; Okahana and Zhou, 2018). Diversity policies such as affirmative action or gender quotas may also be seen as discriminatory because they are thought to violate meritocracy (Eberhardt and Fiske, 1994).

However, perceptions of discrimination against men can also be motivated in nature. For example, believing that diversity policies are based on unjust processes has been found to protect men's self-esteem when confronted with negative performance feedback (Unzueta et al., 2008). Furthermore, men's perception that they too are victims of discrimination may be a form of competitive victimization. According to this perspective, claiming victimhood is a reaction to men's dominance being threatened and/or to feelings of guilt about men's higher social standing (Kobrynowicz and Branscombe, 1997; Sullivan et al., 2012; Jetten et al., 2013; Dover et al., 2016; Young and Sullivan, 2016).

Although the belief that men experience discrimination has been on the rise among the general public, this idea has been far more contentious in academic research and theory. Some argue that, because of their social standing, men are less threatened than women by gender-based bias because gender discrimination does not impede men's upward mobility (Jetten et al., 2013). However, although men may suffer fewer negative outcomes as a result of discrimination, this does not mean that discrimination against men cannot occur. Gender-based discrimination is generally defined as any behavior or action that results in the unfavorable treatment of a person because of their sex or gender (Heilman and Manzi, 2016), and past work has suggested that, under certain circumstances, men too can be subject to negative treatment because of the gender group to which they belong (e.g., Heilman and Wallen, 2010; Vandello and Bosson, 2013). Thus, although the nature and consequences of discrimination may be very different for women and men, men can also be the targets of gender discrimination, at least by this definition.

Nevertheless, other theoretical perspectives contend that definitions of discrimination should also incorporate the notion of legitimacy. Such perspectives are reflected in many mainstream psychological definitions of prejudice, which stipulate that the negative treatment of group members must be "unfair" or "unjustified" in order to constitute discrimination (Major et al., 2002). One potential problem with this definition lies in the fact that what is perceived to be justified or unjustified can vary greatly as a function of many factors, such as changing societal norms. It is likely that actions and circumstances that most people now unequivocally categorize as discrimination against women were not always perceived as such. Under contemporary standards, it is difficult to imagine that, less than 100 years ago, women were not allowed to vote in most countries, or that up until the 1970s, United States companies could legally terminate pregnant women if they saw them as a liability for their business (United States Equal Employment Opportunity Commission, 1978). Even until recently, non-consensual sex within marriage was not considered rape in most countries (and remains decriminalized in many; World Bank, 2015).

In the modern day, the subjective nature of perceptions of what is "just" is illustrated by other societal practices that appear to go unquestioned. For example, it is still common for wives to take their husband's last name after marriage in many Western countries (e.g., Gooding and Kreider, 2010). The reverse practice – the husband adopting the wife's surname – is rare, and the legal procedures that a newlywed couple must go through to make this arrangement are often more cumbersome than taking the traditional route (Rosensaft, 2002; Weisberg and Appleton, 2015). Despite this imbalance, most US men and women consider this practice to be perfectly acceptable and more than half believe that women should be required to adopt their husband's last name (Hamilton et al., 2011).

Cultural norms that continue to proscribe traditionally feminine behavior for men may also play a role in why discrimination against men is often not labeled as such. For example, people may be more accepting of behaviors that serve to reinforce these norms, such as challenging the masculinity of male nurses, questioning the competence of a male nanny, or derogating men who actively seek out family-friendly work opportunities (see Funk and Werhun, 2011; Vandello et al., 2013). In this way, social norms may grant a degree of legitimacy to actions that would otherwise be judged as unjust and harmful to men, preventing people from viewing them as discriminatory.

But detecting discrimination is not only a function of an action's perceived legitimacy. Whether people judge an action to be discriminatory also depends on the actors involved – particularly who the perpetrator and the victim are. When the perpetrator is atypical, detecting discrimination becomes more difficult. For example, behavior is less likely to be perceived as discriminatory when it involves a less powerful group acting against a more powerful group (Baron et al., 1991; Inman et al., 1998; Barreto and Ellemers, 2005; Barreto et al., 2010). The perceived typicality of the target also shapes people's judgments of whether discrimination has occurred. Victims of discrimination are less likely to be perceived as such if they do not belong to a group that is commonly discriminated against – that is, when they are not prototypical victims (Inman and Baron, 1996). As members of a high-status group, men are atypical targets of discrimination. Thus, even if there are circumstances in which men are treated negatively because of their gender, they may be less likely to be perceived as victims of gender-based bias.

# CAN DISCRIMINATION OCCUR IN FEMALE-TYPED ROLES AND OCCUPATIONS?

In addition to a focus on women as the targets of discrimination, gender research has also overwhelmingly focused on discrimination in male-typed contexts – that is, occupations that have been historically dominated by men and/or are thought to require stereotypically masculine characteristics. Male-typed occupations typically hold more power and prestige, so it is not surprising that researchers would direct their efforts toward identifying the barriers to women's access and advancement within this domain. However, this narrow focus has left social psychology little insight into the forces at work in female-typed domains such as early childhood education, health care, and domestic labor.

The lack of research on discrimination in female-typed domains may also reflect the fact that being restricted from entering these domains may often not be categorized as discrimination. Compared to male-typed occupations, traditionally female occupations are generally devalued, tending to carry less status and monetary rewards (England, 2010; Blau and Kahn, 2017). As a result, being excluded from these occupations on the basis of gender may not be readily seen as discrimination, as the social and economic consequences of this exclusion may be less evident.

Importantly, the interplay between the status of the target of discrimination and the status of the occupation may also have implications for whether people perceive an event as discriminatory. Specifically, given the lower status of women (relative to men) and the higher status of male-typed occupations (relative to female-typed occupations), women's entry into male-typed fields is seen to represent an upward movement. Therefore, biases that limit women's full access to these occupations often result in women's relegation to a lower-status position, serving to curb their progress and upward mobility. Conversely, the consequences for men in female-typed occupations are less straightforward. Participation in areas that have been historically dominated by women may be seen to represent a downward movement for men. Thus, actions that result in men's exclusion from female-typed occupations may appear to others as less egregious. Research shows that an event is more likely to be perceived as discriminatory when it is thought to cause significant harm to the victim (Swim et al., 2003). Therefore, even if men are excluded from female domains because of their gender, and even if the processes underlying this exclusion are similar to those suffered by women in traditionally male domains, such an event may not be deemed discriminatory because it is not seen as particularly harmful for men. Furthermore, because the consequences of discrimination against men in these settings are also less prototypical, people may be less prone to recognizing gender-based discrimination within settings historically dominated by women. In this way, the differential social and economic value assigned to traditionally female versus male activities may be playing an important role in shaping whether, when, and where discrimination is perceived to take place.

In sum, the consequences of discrimination against women and men likely differ in many ways, and men's historical advantage and higher social status may to some degree shield them from some of the negative outcomes that women often experience. Moreover, the domains in which gender discrimination against men should be most likely to occur – female-typed roles and occupations, according to CMDs – are typically devalued. Likely as a result of this non-prototypical scenario, discrimination against men in traditionally female domains has often not been labeled as such. Nevertheless, men, like women, can suffer negative outcomes as a function of the specific gender group to which they belong. When and why men experience these negative outcomes has important theoretical implications for our understanding of gender-based bias.

# CONGRUITY MODELS OF GENDER DISCRIMINATION

A large body of work exploring the mechanisms underlying gender-based discrimination has shown that women's and men's participation in the workplace is affected by gender bias in evaluation – a bias that has its origins in gender stereotypes (Burgess and Borgida, 1999; Cejka and Eagly, 1999; Eagly and Karau, 2002; Heilman, 2012). Gender stereotypes are shared beliefs about the attributes, personality traits, and abilities of women and men. Regardless of their (real or perceived) accuracy, gender stereotypes affect how we perceive and evaluate others (Bussey and Bandura, 1999; Eagly and Wood, 2013; Ellemers, 2018). In this way, gender stereotypes often lead to discrimination by guiding decision-making processes in the direction of stereotype-consistency.

Gender stereotypes are developed and perpetuated through the differential distribution of roles and occupations in society. Men's overrepresentation in breadwinning roles and high-power occupations has led to stereotypes portraying men as particularly agentic. Similarly, women's overrepresentation in domestic roles and caregiving occupations has aligned female stereotypes with communality (Eagly et al., 2000; Koenig and Eagly, 2014). Agency comprises attributes such as achievement orientation (e.g., able, successful), assertiveness (e.g., dominant, forceful), and autonomy (e.g., independent, self-reliant), while communality denotes consideration for others (e.g., caring, helpful), affiliation with others (e.g., sociable, likable), and emotional sensitivity (e.g., tender, sensitive). Continuous exposure to this gendered division of labor also gives rise to the belief that men and women are fundamentally different. That is, men are thought to be more agentic than communal, and women are thought to be more communal than agentic (Broverman et al., 1972; Kite et al., 2008; Wood and Eagly, 2010).

Importantly, gender stereotypes are both descriptive and prescriptive. That is, they depict what men and women are like as well as what men and women should be like. The descriptive component of gender stereotypes comprises beliefs about the characteristics of each gender group (e.g., women are emotional, men are rational), while the prescriptive component establishes norms about the appropriate behavior of men and women (e.g., women should be caring, men should be strong) (Burgess and Borgida, 1999; Prentice and Carranza, 2002). Both descriptive and prescriptive components of gender stereotypes have implications for the differential recruitment, selection, and promotion of men and women into different occupations. However, the processes by which descriptive and prescriptive stereotypes give rise to gender-based discrimination vary. Namely, descriptive stereotypes lead to discrimination through differential perceptions of male and female competence in specific roles and occupations, and prescriptive stereotypes lead to discrimination through the derogation and social penalization of male and female norm-violators (Heilman, 2012; Rudman et al., 2012). Although prescriptive gender stereotypes undoubtedly contribute to the underrepresentation of both women and men in gender-incongruent domains, the focus of the review that follows will primarily be on descriptive gender stereotypes, as these have been the main focus of CMDs.

# The Effect of Descriptive Stereotypes: Discrimination as Perceived Incompetence

According to CMDs, descriptive stereotypes depicting women as communal and men as agentic do not always lead to negative outcomes. Rather, gender discrimination arises when these stereotypes conflict with what is thought to predict success in specific roles and occupations. Though different jobs certainly require different competencies for successful performance (e.g., being a good nurse requires more biology knowledge than being a good journalist), their perceived requirements and the relative importance ascribed to each are also informed by gender stereotypes. For example, jobs in which women are heavily overrepresented (e.g., bank tellers, dental hygienists) tend to be seen as requiring more communal characteristics than occupations in which men are the majority (e.g., financial advisers, civil engineers), which are seen as requiring more agentic characteristics (Glick et al., 1995; Cejka and Eagly, 1999). In this way, occupations themselves are gendered, with the workplace largely being divided into "women's work" and "men's work" (Reskin and Hartmann, 1986; Ridgeway, 2011). Beliefs about the gender-type of different roles and occupations emerge very early (Liben et al., 2001; Martin and Ruble, 2004). Moreover, the associations between men, women, and specific occupations (e.g., woman-nurse, man-surgeon) are automatic and difficult to suppress (Oakhill et al., 2005).

Congruity models of discrimination focus on this interplay between descriptive gender stereotypes and the gender-type of particular roles and occupations, arguing that gender-based discrimination is the result of a perceived mismatch between what men and women are thought to be like (i.e., agentic and communal, respectively) and the traits deemed necessary for job success. This perceived mismatch, in turn, gives rise to negative expectations about the potential for success of an individual in a gender-incongruent domain. That is, descriptive gender stereotypes lead to the belief that women and men are not well-equipped to perform effectively in occupations that have been historically dominated by the opposite sex and that they will therefore be less competent in these roles.

Theoretically, CMDs are "gender-blind." They predict that discrimination occurs because of a perceived incongruity between female or male stereotypes and occupational stereotypes. Thus, CMDs predict a symmetrical effect: women will be deemed less competent than men in traditionally male domains, and men will be deemed less competent than women in traditionally female domains. In both cases, the outcome of these stereotype-based expectations should be gender discrimination.

# GENDER STEREOTYPES AND BIAS IN THE EVALUATION OF WOMEN

Congruity models of discrimination have been widely and successfully used to describe and predict anti-female bias in such disparate male-typed settings as the military (e.g., Boldry et al., 2001), upper-level management (e.g., Eagly and Carli, 2007), academia (e.g., Schmader et al., 2007), and sports (e.g., Koivula, 2001). Moreover, a large body of research has provided support for the predictions made by CMDs regarding the psychological mechanisms behind gender bias, with numerous studies demonstrating that stereotype-based expectations lead to discrimination at various stages of women's lives and careers.

Anti-female bias in traditionally male domains begins early on. Female students are perceived as less intelligent and capable than their male peers in domains such as technology and science (Cheryan et al., 2017). This bias is also seen in parents, who often encourage their daughters to pursue more gender-congruent activities, thereby reinforcing beliefs about their lesser competence in male-typed domains (Leaper and Gleason, 1996; Tenenbaum and Leaper, 2003). Even when actively exposing their children to science (an area generally perceived as male in gender-type), parents dedicate more time and effort explaining scientific processes to their sons than to their daughters (Crowley et al., 2001). Gender-based bias continues in higher education, where women are perceived to be less talented than men in academic fields such as engineering, science and philosophy (Nosek et al., 2009; Moss-Racusin et al., 2012; Leslie et al., 2015).

For women who nonetheless choose to pursue traditionally male jobs, the mismatch between occupational stereotypes and female stereotypes gives rise to negative outcomes throughout their careers. Anti-female bias has been observed in job recruitment (e.g., Gaucher et al., 2011), in screening of application materials (e.g., Schmader et al., 2007), in selection decisions (e.g., Bosak and Sczesny, 2011), and in promotion opportunities (e.g., Lyness and Heilman, 2006; Hoobler et al., 2009). The existence of bias against women in male-typed jobs has received further support from several meta-analyses. A recent meta-analysis by Koch et al. (2015) provided strong support for the predictions made by CMDs for women in male-typed domains. In their analysis of 136 experimental studies, the authors found that women were evaluated less positively than men when the job was male in gender-type. In contrast with previous meta-analyses (e.g., Davison and Burke, 2000), the authors found that these effects were driven by male, but not female evaluators. Given that decision-makers in these occupations are likely to be men, these findings suggest that women in male-typed jobs continue to be highly vulnerable to gender-based discrimination.

In keeping with the predictions of CMDs, there is also evidence that the degree of bias against women in a specific male-typed occupation can change if the stereotypes regarding what is necessary for that occupation change – supporting the contention that gender bias stems from a perceived mismatch between occupational stereotypes and stereotypes about women. Research suggests that such a change may be occurring in the domain of leadership. Early research found that stereotypes about leaders generally resembled stereotypes about men, creating the perception that men are more naturally equipped to fulfill these roles and leading to subsequent discrimination against women in a variety of leadership contexts (see Eagly et al., 1995). In the intervening decades, however, stereotypes about leaders appear to have incorporated more communal characteristics, and discrimination against women in leadership roles seems to be decreasing (e.g., Sczesny et al., 2004; Koenig et al., 2011). In line with this change, a recent meta-analysis by Paustian-Underdahl et al. (2014) found no evidence of gender bias in people's evaluations of female leaders in male-typed settings.

Providing further support for CMDs, other information that reduces incongruity perceptions has also been shown to reduce gender bias. Such an effect has been documented for individual women who are depicted as clearly counterstereotypical. For example, presenting an individual woman as unequivocally or exceptionally competent reliably reduces the gender bias against that woman in male-typed settings (Koch et al., 2015). Under certain circumstances, these strongly counterstereotypical women can even be preferred over men as they are often perceived to be extraordinarily competent (Correll and Ridgeway, 2006). Indeed, recent studies suggest that unambiguously successful women are favored over equally qualified men, even in highly male-typed domains (Williams and Ceci, 2015; Leslie et al., 2017). Thus, presenting an individual woman as a clear "exception to the rule" can reduce her perceived incongruity for a given role and, as a result, discrimination is greatly attenuated (or may even be reversed in her favor).

# GENDER STEREOTYPES AND BIAS IN THE EVALUATION OF MEN

In much the same way that success in male-dominated jobs is associated with agency, communality is perceived to be a requisite for success in traditionally female roles and occupations. Research supports this idea, demonstrating that female-dominated occupations and fields are more strongly associated with traditionally female than male traits (Cejka and Eagly, 1999; Gilbert et al., 2015). CMDs predict that the mismatch between people's perceptions of female-typed occupations and male stereotypes will lead to the belief that men will be less competent than women in these settings. These negative competence expectations should, in turn, lead to anti-male bias and discrimination against men in traditionally female domains.

Although there is general consensus among gender researchers regarding women's lower perceived competence in male-typed roles and occupations, there seems to be much less agreement about the consequences that men face when they find themselves in female-typed occupations. While some studies have provided support for CMDs by documenting anti-male bias in female-typed domains, others suggest that, far from suffering discrimination, men are actually favored over women in traditionally female occupations.

As discussed above, the extant research on evaluations of men in female-typed occupations is both scant and fairly recent. However, when considered together, many of these studies offer indirect support for CMDs. For example, men desert female-dominated college majors and occupations at significantly higher rates than women (Addi-Raccah, 2005; Stott, 2007; McLaughlin et al., 2010; Riegle-Crumb et al., 2016). This phenomenon appears to be analogous to what has been termed the "leaky pipeline," referring to the comparatively higher rate of female attrition in male-typed domains (e.g., Cheryan et al., 2017; Department of Commerce of the United States of America, 2017). Thus, research suggests that similar trajectories and career development outcomes might exist for both men and women who choose to pursue gender-incongruent careers.

One possible explanation for these patterns is that the perceived incongruity between gender and occupation leads to higher attrition rates, as CMDs would predict. Supporting these predictions, there is evidence that men who leave female-typed domains are more likely to move into gender-balanced and male-dominated careers, even when this move results in a pay cut (Barnett et al., 2000; Addi-Raccah, 2005; Riegle-Crumb et al., 2016; Torre, 2018). It has been argued that this leaky pipeline may be due, at least in part, to a general culture within these domains that signals to men that they do not fit (O'Lynn, 2004; Simpson, 2004; Kermode, 2006; Bartfay et al., 2010; Isacco and Morse, 2015). Congruity beliefs can affect self-perceptions, which, in turn, may lead to negative outcomes for men in female-typed jobs. For example, perceiving greater conflict between their gender and their job has been linked to higher rates of depression and anxiety, as well as lower job satisfaction and commitment, among male nurses, early childhood educators, and flight attendants (Young and James, 2001; Wolfram et al., 2009; Wallen et al., 2014).

Taken together, these studies suggest that some form of gender bias against men may exist in traditionally female fields. However, it is unclear whether and to what degree discrimination per se contributes to these negative outcomes, or whether they are due entirely to men's own perceptions that they do not fit (see Schmader and Sedikides, 2018). That is, men may be deemed competent by others but still choose to leave female-dominated environments because they do not feel like they fully belong. Nevertheless, such perceptions are rarely formed "in a vacuum," and it seems likely that there may be structural or interpersonal factors that contribute to men's feelings of lack of fit.

Other research has provided more direct evidence in support of CMDs, suggesting that the mismatch between male stereotypes and the perceived requirements for success in female-typed domains leads to the expectation that men will not perform as well as women. Several qualitative studies suggest that men are seen as lacking the female skills considered necessary to be a good nurse, early educator, or caregiver (Hochschild, 1983; Yang et al., 2004; Bartfay et al., 2010; Hedlin and Åberg, 2013; Warming, 2013). Providing support for the role of gender stereotypes in this process, there is evidence that these expectations of male incompetence can give rise to stereotype threat among men in female fields. Mirroring the findings from a large body of research demonstrating that stereotype threat can affect women's performance in male-typed tasks and occupations (Steele and Aronson, 1995; for a meta-analysis see Nguyen and Ryan, 2008), men's performance in female-typed jobs and tasks is also impaired when stereotypes about women's greater ability are made salient (Leyens et al., 2000; Koenig and Eagly, 2005; Kalokerinos et al., 2017).

Still, a central contention of CMDs is that the mismatch between gender stereotypes and the perceived requirements for success in gendered occupations should lead not only to expectations of incompetence, but also to discrimination. Thus, if men are deemed less competent in female-typed roles and occupations, there should be evidence of anti-male bias in selection processes, performance evaluations, and promotions. Some research provides support for this possibility, documenting more negative ratings of men than women applying for a traditionally female job (e.g., Cohen and Bunker, 1975; Gerdes and Kelman, 1981; Etaugh and Riley, 1983; Kim and Weseley, 2017). A recent audit study also revealed that, compared to equally qualified female applicants, male applicants received significantly fewer call-backs from employers in female-typed domains (Yavorsky, 2017). Moreover, a meta-analysis by Paustian-Underdahl et al. (2014) found a tendency for male leaders to be evaluated as less effective than female leaders in educational settings.

Taken together, this research provides some support for the idea that men too can be the targets of discrimination in female-dominated occupations. It also lends some support for the psychological mechanism posited by CMDs – that gender bias stems from presumptions of lesser male competence. However, the existing evidence is far from conclusive. Most of the studies reviewed above present rather indirect evidence for the symmetry predicted by CMDs, and very few show that there is a direct relationship between incongruity perceptions and anti-male discrimination in female-typed settings. These empirical gaps leave open questions regarding the processes underlying discrimination against men.

However, the greater challenge to CMDs may lie in the fact that there is a separate body of literature that appears to directly challenge the findings described above, suggesting that the exact opposite pattern of results can also occur. This research stems primarily from the work of Williams (1992, 1995b), who argued that not only do men not face discrimination in traditionally female jobs, they are actually preferred over women when applying for these jobs and tend to climb the organizational ladder more quickly. This male advantage in female-dominated jobs has been called the "glass escalator." Williams (1992) argues that, unlike women in male-dominated settings, the gender of men in female-typed occupations is construed as a positive difference. As a result, male stereotypes work in men's favor, helping rather than hindering their evaluations and upward mobility.

In line with this perspective, other research has suggested that the experience and consequences of underrepresentation are different for men than for women. For example, early research on "tokenism" contended that being an occupational minority (i.e., a "token") heightens the visibility of one's group membership. For women in male-typed jobs, this visibility leads to negative outcomes, as it makes gender stereotypes about women's lesser competence in these fields salient to perceivers (e.g., Kanter, 1977; Crocker and McGraw, 1984). However, later work has shown that men actually benefit from their token status – the same visibility that leads to greater scrutiny of token women's performance allows token men to showcase and exploit their skills (Williams, 1995a; Yoder and Kahn, 2003). In her interviews of nearly 100 men working in traditionally female jobs, Williams (1995b) found that token men's achievements were often highlighted, and that their mistakes were rarely attributed to their gender. As a result, these men received preferential treatment in hiring decisions and greater incentives to remain in their jobs, as they were more often channeled into specialties with higher chances of upward mobility, or simply directly promoted.

This research suggests that evaluations of men are not subject to negative stereotype-based expectations, even in female-dominated occupations. Rather, it is argued that men's perceived competence benefits from deeply embedded gendered beliefs within organizations, whereby stereotypically masculine qualities are equated with success, and stereotypically feminine qualities are devalued (Williams, 1995b). According to this view, the historical preference for agency over communality in the workplace overrides the effects of numerical dominance. As a result, men always have an advantage over women, even in female-dominated occupations (Williams, 1995b; Evans, 1997; Mahony et al., 2004).

Perhaps the most well-known (and well-documented) consequence of the "glass escalator" is the increased upward mobility of men in traditionally female fields. Several studies have provided support for this phenomenon. For example, token men have been found to receive more promotion recommendations and salary increases than token women and non-token men (Floge and Merrill, 1986; Heikes, 1991; Yoder, 1994; Barnett et al., 2000). Similarly, longitudinal studies using archival data found that men are more likely than women to move into managerial positions as the proportion of women in an occupation increases (Maume, 1999; Hultin, 2003). Other studies have shown that, in female-dominated occupations, men (White men in particular) are more likely than women to be promoted and to receive organizational benefits that enhance career opportunities (Baron and Newman, 1990; Cameron, 2001; Cognard-Black, 2004; Wingfield, 2009; Smith, 2012; Woodhams et al., 2015).

Beyond promotion opportunities, there is other evidence of male advantage in female-dominated contexts. In an experimental study, Fuegen and Biernat (2002) found that token men were positively evaluated by their teammates. Further, qualitative studies suggest that men are often aware of their advantage, describing how being a man in a female-dominated field can help to secure jobs and often leads to greater job stability (Yang et al., 2004; Lupton, 2006). Research in early education, a domain that is perceived to be highly female-typed (Croft et al., 2015; Tellhed et al., 2017), has shown that even when controlling for actual performance and job experience, male teachers are more likely to be hired over female teachers (McKenna and Johnson, 1981). Moreover, a meta-analysis of early educators (Borman and Dowling, 2008) found significantly less attrition among male than female teachers, providing indirect support for the idea that men in female fields may be given more incentives than women to remain in their jobs. Supporting the idea that these positive outcomes stem from masculine organizational cultures, it has been argued that recent efforts to "professionalize" early education perpetuate beliefs linking competence to stereotypically masculine characteristics, even in this highly female-dominated field (Mahony et al., 2004). As a result, male teachers are often advantaged in selection, performance evaluations, and subsequent promotion opportunities. Interestingly, the "leaky pipeline" for men in female-dominated fields described above seems to disappear when men occupy higher-status positions (Torre, 2018).

Along with increasing men's chances of selection, upward mobility, and salaries, gender stereotypes may benefit men in female-typed occupations in less tangible ways. Compared to women in male-typed roles and occupations, men in female-typed domains often describe higher perceptions of workplace support and lower perceptions of workplace mistreatment (Ott, 1989; Taylor, 2010). It has been argued that this may be due to differential task requirements and expectations for men and women in female-typed occupations (Williams, 1995b; Yang et al., 2004; Snyder and Green, 2008). For example, men in traditionally female jobs are not expected to engage in emotional labor to the same extent as their female peers (Cottingham et al., 2015). Thus, the perceived mismatch between men and the communal aspects of female-typed jobs may protect them from some of the psychological stressors (e.g., emotional demands, abusive emotional treatment) that are oftentimes inherent to care-related work (Hochschild, 1983; Evans, 1997).

Contrary to CMDs, the research summarized above suggests that men may in fact benefit from gender stereotypes, even when the setting is heavily female-dominated. However, the conclusions that can be drawn from this work come with their own set of limitations. Some of the findings outlined here have been called into question by other researchers. For example, several large-scale studies have found little evidence that men in female-dominated fields benefit from their token status (e.g., Budig, 2002) or that they are promoted more frequently than women (e.g., Snyder and Green, 2008; Price-Glynn and Rakovski, 2012).

Further, even if a male advantage exists for certain female-typed roles and settings, the specific processes underlying such advantage remain largely unclear. One possibility is that men's opportunities are indeed enhanced (and women's opportunities limited) by an overarching organizational culture that places more value on agency than on communality. It is also possible that men's advantage is a consequence of stereotypes of male competence being more impervious to contextual forces than female stereotypes. However, this observed male advantage may also reflect a different process altogether. Specifically, it is possible that some aspects of the "glass escalator" phenomenon could themselves be explained via CMDs. In particular, enhanced promotion opportunities may be fueled by stereotype-based perceptions of incongruity between men and lower-level positions, and perceptions of congruity between men and higher-level positions, even in settings that have been traditionally dominated by women. Furthermore, within female-typed settings, the specific occupations to which men (but not women) are often channeled tend to be ones that are more aligned with masculine stereotypes (Yang et al., 2004; Levanon and Grusky, 2016). It may be that these positions offer more expedited paths to promotion. In this case, the perceived mismatch predicted by CMDs may in fact be symmetrical for men and women, but the consequences of such perceptions may not be equivalent.

On the other hand, stereotypes may play no role whatsoever in these effects. It is also possible that men's advantage in female-typed occupations is merely a product of ingroup favoritism, fostered by the higher proportion of men in evaluative and decision-making positions. Thus, without more carefully controlled experimental studies that could directly explore the mechanisms underlying these effects, it is difficult to elucidate the causes of male advantage and to determine the role, if any, of gender stereotypes and congruity perceptions in this process.

# REEXAMINING CONGRUITY MODELS: DO MEN FACE DISCRIMINATION IN FEMALE-DOMINATED OCCUPATIONS?

The previous section described two broad lines of research examining the evaluations of men in counterstereotypical domains, each reaching a different conclusion. While the first body of work supports the predictions made by CMDs by presenting evidence of anti-male bias in female-typed settings, the second challenges these predictions and suggests that men may in fact have an advantage over women in traditionally female fields. However, both lines of research agree on one point: gender bias in evaluations exists. Notably absent from this review (and the literature in general) are studies that have failed to find evidence of bias – that is, research that has yielded no differences in evaluations of women and men in gender-incongruent settings. Such studies are likely to be greatly underrepresented both in previous analyses and in the present review due to a long history of publication pressures favoring significant over null results (Song et al., 2000; Dwan et al., 2008). This may have led to a general overestimation of the effects of gender stereotypes on the evaluations of both women and men. However, publication bias may be particularly problematic in the case of men in traditionally female roles and occupations, given the generally sparse amount of research in this area. For example, it is possible that the scarcity of published work is not the result of an actual lack of empirical studies, but of a "file drawer problem" (Iyengar and Greenhouse, 1988). That is, researchers may have indeed examined evaluations of men in female-typed domains but found no evidence of bias. Recent shifts in publication guidelines and increased openness to publishing null findings may therefore have the potential to improve our understanding of the power and scope of CMDs and to test their implications more rigorously.

Despite these shortcomings, the growing number of published studies examining evaluations of both women and men in gender-balanced and female-dominated fields, and the recent development of statistical tools to test and correct for publication bias (e.g., Duval and Tweedie, 2000), have greatly strengthened the conclusions that can be drawn from meta-analytical efforts. Interestingly, the two most recent meta-analyses comparing evaluations of women and men in female-typed occupations have reached a different conclusion from those of the dominant theoretical perspectives in the literature. In contrast with both congruity model and male-advantage predictions, these recent analyses suggest that men are neither disfavored nor favored in female-typed jobs. Specifically, the meta-analysis by Koch et al. (2015) found strong evidence of anti-female bias in male-typed roles and occupations but did not reveal symmetrical effects for men in female-typed positions. For men, the overall gender-based bias in female-dominated jobs was non-significant. Similarly, the meta-analysis by Paustian-Underdahl et al. (2014) found that, on average, evaluations of male leaders did not differ significantly from evaluations of female leaders in female-typed organizations.

Thus, the most recent and comprehensive analyses suggest that gender-based bias is not fully symmetrical and that different processes might be at play for evaluations of women and men in gender-incongruent roles and occupations. Nevertheless, there is reason to believe that this conclusion too should be interpreted with some caution. Though the number of studies examining female fields included in recent meta-analyses has certainly increased from earlier endeavors (e.g., Eagly et al., 1995; Davison and Burke, 2000), the imbalance in the number of studies focusing on female versus male fields remains substantial. In addition, the variety of female-typed fields included in these analyses is rather limited, often being restricted to one or two settings (particularly education). Additionally, a moderate portion of the research conducted in traditionally female domains (much of which was included in this review) is qualitative, which precludes it from being included in most meta-analyses. Thus, although these meta-analyses likely constitute the most systematic and reliable test of the symmetry hypothesis of CMDs yet, their results nonetheless do not reflect the full body of empirical findings on this topic.

In sum, the literature to date yields conflicting findings regarding whether gender discrimination truly is symmetrical, as is proposed by CMDs. Identifying where these discrepancies lie appears to be an important first step toward shaping the direction of future research that can further refine our theoretical understanding of gender bias. Indeed, the inconsistencies revealed in this review suggest that CMDs would greatly benefit from systematic research that directly tests their premises for women and men alike.

# Reexamining Congruity Models: The Domestic Sphere

Although women spend less time on domestic work than they did in the past, they continue to contribute significantly more than men to childcare and most household tasks, a disparity that negatively affects women's career progress (Schoppe-Sullivan et al., 2013; Pew Research Center, 2014; Sullivan et al., 2018). Recently, several researchers have argued that to achieve true gender equality, a more balanced distribution of domestic labor is just as important as women's full participation in the workplace. To this end, an emerging body of research in social psychology has begun to examine the reasons behind men's lack of engagement in traditionally female work, including domestic labor (e.g., Vandello et al., 2013; Croft et al., 2015; Gutsell and Remedios, 2016; Meeussen et al., 2016; Tellhed et al., 2017).

Although CMDs have primarily been used to explain gender-based discrimination in the workplace, these models also have the potential to offer important insight into the processes involved in men's lack of participation in domestic labor. Indeed, the domestic sphere may in fact be the domain in which we are most likely to observe the anti-male bias that is predicted by these models. Thus, broadening the purview of CMDs to include unpaid domestic work may in fact prove to be essential to decisively testing their theoretical predictions.

Just as paid labor has been historically dominated by men, unpaid domestic labor has traditionally been the domain of women. It has been argued that very few paid occupations are as female dominated as household work (Cohen, 2004). As a result, people continue to hold strong associations between women and the domestic sphere (Miller and Borgida, 2016), as well as the roles and behaviors that domestic labor entails (e.g., parenting, caretaking; Park et al., 2010). I contend that being a successful homemaker is likely to be perceived as requiring significantly more communality than agency. If so, the domestic sphere would appear to be the most direct analog to the male-typed roles and occupations in which CMDs have so frequently been tested. As such, unpaid domestic work may be the most appropriate setting for testing the symmetry of CMDs – the same perceived incongruity that gives rise to presumptions of lesser female competence in traditionally male occupations should also lead to the belief that men are not equipped to perform well in the household. The strong female-typing of domestic labor may render it one of the few domains in which women should have a clear advantage and be evaluated as significantly more competent than men.

The extant literature does not offer much evidence regarding whether men are indeed presumed to be less competent in the domestic sphere, nor whether these perceptions (to the degree that they exist) lead to discrimination against men in this domain. Perhaps the same reasons behind the dearth of research on evaluations of men and women in female-dominated occupations are also responsible for the scarcity of empirical studies on people's perceptions of male and female homemakers' ability. Because of its low status and unrecognized economic value, domestic labor is often assumed to be undesirable, especially for men. Indeed, the core components of this work (e.g., the care of children and the elderly, household chores) receive little to no monetary reward (United Nations Women, 2018). It is perhaps unsurprising, then, that the lack of male participation in household endeavors has rarely been interpreted as a possible product of gender-based discrimination. After all, domestic work has not been greatly sought after by men.

Nevertheless, the question of whether gender stereotypes about the domestic sphere play a role in men's underrepresentation in domestic labor is an important one. Men's reported interest in sharing domestic work is larger than ever before, and women are increasingly demanding more involvement from their male partners (Pew Research Center, 2013; Livingston, 2014; Dotti Sani and Treas, 2016). Thus, delving into the reasons behind men's persistent lack of involvement – despite their increasing expressions of interest – is both timely and practically relevant. Examining whether there is evidence of anti-male bias in the domestic sphere also has important theoretical implications for CMDs. If men are thought to be less competent in childrearing and household tasks, and if these perceptions lead to the exclusion of men from domestic labor, this would provide strong evidence in support of the symmetry predicted by CMDs. Further, such a finding would suggest that incongruity perceptions may represent an additional barrier to the equal participation of women and men in domestic labor.

Though limited, there is some evidence to suggest that the domestic sphere is an area in which women's competence is assumed. Arguing for a dynamic view of gender stereotypes, Mendoza-Denton et al. (2008) suggested that perceptions of male and female competence actually reverse in the context of domestic work. Specifically, when the context is framed as domestic rather than employment, women are described as more agentic than men (Mendoza-Denton et al., 2008). This is consistent with other research showing that, unlike in paid labor, women often hold a position of authority in the household, directly managing and planning most domestic tasks. This role includes taking charge of the majority of physical and psychological labor, as well as making most of the decisions related to childcare, family healthcare, household purchases, and beyond (Pew Research Center, 2008, 2015; Williams and Chen, 2014; Ciciolla and Luthar, 2019).

There is also some evidence that men's domestic competence is viewed negatively, particularly in the case of childrearing. Poll data show that only 1% of people believe that fathers do a better job caring for a baby (vs. 53% favoring mothers). Further, among those who believe children are better off having at least one parent at home, only 2% think that parent should be the father (Pew Research Center, 2016, 2017a). Other research shows that, when asked to choose who should have custody of a child, most people (including judges) favor mothers over fathers, even when controlling for the characteristics of the parents (Miller, 2018). Beliefs about men's lesser childrearing competence are also reflected in the media. For example, portrayals of "inept fathers" in advertising were recently found to be pervasive enough to prompt regulation in the United Kingdom (Advertising Standards Authority, 2018).

Family psychologists have described a phenomenon called "maternal gatekeeping" that further supports the idea that men's competence may be put into question within domestic contexts. Maternal gatekeeping refers to the belief (observed mostly among mothers in the context of childrearing) that men are not as qualified as women to handle important domestic tasks and should therefore be prevented from performing them (Allen and Hawkins, 1999; Schoppe-Sullivan et al., 2004). While the literature offers a comprehensive description of the behaviors involved in maternal gatekeeping, the exact mechanisms underlying such behaviors remain unclear. For example, some research suggests that maternal gatekeeping hinders men's childrearing abilities by limiting fathers' involvement with their children (e.g., Allen and Hawkins, 1999; Altenburger et al., 2018). Other research argues for the reverse causal direction, suggesting that maternal gatekeeping is a protective strategy used to shield children from already incompetent fathers (e.g., Waller and Swisher, 2006; Austin et al., 2013). Further research is necessary to determine the extent to which maternal gatekeeping actually occurs and whether it is a product of stereotype-based expectations. If the psychological processes behind maternal gatekeeping indeed arise from culturally shared beliefs regarding what men and women are like and what it takes to be a good homemaker, then this phenomenon may provide support for the symmetry of CMDs.

In addition, the perceived mismatch between male stereotypes and the communal requirements for success in domestic labor might also impact men's own perceptions of competence in this domain. A large body of work has demonstrated that women tend to internalize gender stereotypes and come to believe that they are less efficacious than men in traditionally male fields (e.g., Eccles, 1994; Correll, 2004). Decreased self-efficacy has been associated with a lower sense of belonging among women than men in traditionally male settings (Good et al., 2012), as well as lower motivation to participate and engage in these areas (Cheryan et al., 2015). Similarly, men too may internalize stereotype-based expectations about their own lack of proficiency in child-rearing and household chores and conclude that they do not have what it takes to perform well in domestic roles, an idea that has found some support in qualitative research (e.g., Miller, 2011; Ives, 2014). These beliefs, in turn, may lead men to avoid parental responsibilities and to exclude themselves from domestic labor altogether, deferring to women as the domestic "experts."

It is important to note that, like paid labor, unpaid domestic labor is itself divided into roles and tasks that are likely to be differentially gender-typed. Though women spend significantly more time on domestic work than men (even when both partners are employed), there is variation among different forms of domestic work. For example, women report spending more time cooking and cleaning than men do, but men report spending more time on garden maintenance and repairs than women do (American Time Use Survey, 2017). The relative distribution of men and women in these different roles may influence their perceived gender-type, and future research examining evaluations of men's (vs. women's) perceived domestic competence should consider these distinctions.

Certainly, the perceived mismatch between male stereotypes and domestic labor is not the sole explanation for men's lack of domestic engagement. Other psychological mechanisms, including motivational processes, are likely to contribute to the belief that men are not good homemakers and that they should not participate in domestic labor. For example, both men's and women's motivation to uphold the status-quo has been associated with the endorsement of gender stereotypes (Glick and Fiske, 2001; Jost and Kay, 2005) and therefore may also explain men's lower involvement in the household. Furthermore, even though the imbalance in domestic work can have negative consequences for women's career advancement, some women may be particularly motivated to maintain ownership over the domestic sphere. As a domain in which female competence is likely acknowledged and unquestioned, domestic labor may be one of the few areas that is primarily reserved for women. Indeed, research suggests that women derive a sense of control from the relative power that gender stereotypes garner them in the household (Williams and Chen, 2014). Moreover, many women may have a strong sense of pride deriving from their (real or perceived) domestic superiority and may strongly identify as mothers and homemakers. For these women, greater male involvement in child-rearing and domestic work may be perceived as a threat to their status within the household and to their identity.

In sum, the dearth of research examining perceptions of domestic competence does not yet allow for a rigorous test of CMDs' predictions in this domain. Nevertheless, there is indirect evidence to suggest that, just as incongruity beliefs give rise to the expectation that women will be less competent in male-typed jobs, men too may be deemed less competent in the female-typed household. It is possible, then, that the failure to find clear support for the symmetry of CMDs lies, in part, in the general omission of the domestic sphere as a relevant setting in which to test their assumptions. Examining whether symmetrical incongruity beliefs exist about men in the household, and whether these beliefs lead to perceptions of female superiority and male incompetence, would provide a more thorough and decisive test of CMDs. Importantly, this research would allow us to better understand the processes underlying men's continued lack of domestic participation, a phenomenon that continues to hinder the attainment of gender equality.

# Prescriptive Gender Stereotypes and the Evaluation of Men in Female-Typed Settings

The main goal of this paper was to critically examine congruity models of gender discrimination in light of the extant literature on evaluations of men in female-typed fields. To this end, I focused on descriptive gender stereotypes and their consequences for the competence perceptions of men in traditionally female roles and occupations. However, as mentioned earlier, gender discrimination is not only the product of a mismatch between the perceived requirements of a position and descriptive gender stereotypes ("what men and women are like"); it also results from violations of prescriptive gender stereotypes ("what men and women should be like"). Specifically, prescriptive gender stereotypes may lead to discrimination through social penalties and backlash. Much research has shown that women who are thought to violate these stereotypes (e.g., by behaving in a dominant way or displaying competence in male-typed roles) are disliked and seen as less hireable (Rudman et al., 2012; Williams and Tiedens, 2016). Though the particular social penalties incurred by men who choose to do "women's work," be it paid female-typed labor or unpaid domestic labor, were beyond the scope of this paper, they are surely crucial to fully understanding men's lack of participation in traditionally female domains. Examining whether these penalties and their downstream consequences are equivalent for male and female gender transgressors is an important question that this paper did not address.

Several authors have argued that prescriptive gender stereotypes and femininity injunctions for men play an important role in men's underrepresentation in communal roles and occupations (Thompson et al., 1985; Croft et al., 2015; Meeussen et al., 2016; Tellhed et al., 2017). A growing body of research has shown that, like women, men too are punished for violating gender norms by behaving in gender-incongruent ways. For example, men who demonstrate proficiency in female-dominated occupations are seen as weak and undeserving of respect (Heilman and Wallen, 2010) and often encounter social backlash (Rudman and Fairchild, 2004). Similarly, modest and self-effacing men are frequently derogated by others (Rudman, 1998; Moss-Racusin et al., 2010). Some have argued that the penalties for gender norm violations are not equivalent for men and women, and that men actually incur greater social costs due to stricter masculinity prescriptions (Pleck, 1995; Vandello and Bosson, 2013). These increased penalties may stem from the fact that stereotypes prescribing agentic behavior for men are often compounded by the strong association between feminine men and homosexuality (Kite and Deaux, 1987), an association that appears to be less strong in the case of masculine women. It has been argued that the fear of being perceived as homosexual may be enough to lead many men to actively avoid communal behaviors and activities (Bosson et al., 2013).

Although the penalties for violating gender norms are mostly informal (e.g., dislike, derogation, avoidance), they may still result in discrimination against men in communal roles and occupations by promoting the exclusion – and self-exclusion – of men from these domains. Thus, even if men and women are selected at equal rates, or even if men climb the organizational ladder more quickly, men may still be deterred from pursuing a career in female-typed areas because of the harsh social penalties that such a decision might entail. Field research supports this idea, describing how men in traditionally female occupations often express fear about how they will be perceived by others. For example, male nurses and early childhood educators report being afraid of having their masculinity questioned and, in particular, of being seen as socially and sexually deviant (Williams, 1995a; Cameron, 2001; Harding, 2007). These fears are likely to play an important role in men's job pursuits and aspirations, including their decision to enter and remain in female-typed occupations.

Prescriptive gender stereotypes can also result in penalties for men engaging in unpaid domestic labor. Research has shown that actively taking time off work to fulfill family responsibilities leads to negative consequences not only for women, but for men as well (Wayne and Cordeiro, 2003; Butler and Skattebo, 2004; Coltrane et al., 2013; Rudman and Mescher, 2013). For example, men who spend a significant amount of time dedicated to their family report experiencing more workplace harassment and mistreatment than women in similar caregiving roles (Berdahl and Moon, 2013). This may be due to the fact that employees seeking family-related work flexibility are often described in feminine terms, a perception that results in social penalties for men, but not women, who actively pursue greater domestic participation (Vandello et al., 2013).

In sum, prescriptive stereotypes may contribute to men's lack of participation in female-typed roles and occupations by fostering a hostile environment for men who choose to engage in this type of labor. Future research should explore whether the processes that lead to social penalties are similar for men and women, and whether the consequences for prescriptive violations are comparable. It is possible, for example, that the strong association between male stereotypes and the provider role might shield norm-violating men from the economic costs of backlash (e.g., decreased hireability and promotion), but that men are more likely than women to lose their social standing as a result of their transgression, given the lower status assigned to female roles and behaviors.

# CONCLUSION

The present review examined the literature on evaluations of men in female-typed settings with the goal of elucidating whether discrimination processes for men and women are truly symmetrical, as congruity models of discrimination predict. The results were mixed. While some research provides support for the idea that men, like women, are presumed to be less competent in gender-incongruent occupations, other research suggests that men may have an advantage over women in female-dominated occupations.

However, these findings do not necessarily imply that CMDs are only useful when explaining discrimination against women. Expanding the paradigms used to test CMDs to also include unpaid domestic work has the potential to deepen and refine our understanding of gender discrimination, as well as to provide further support for the psychological processes underlying these models. Future research should explore whether the mismatch between male stereotypes and domestic stereotypes gives rise to perceptions that men are less competent in the domestic sphere. Doing so may help to identify important predictors of men's lack of engagement and participation in the household and can shed light on potential pathways to balance the distribution of women and men, both in the workplace and the household.

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and has approved it for publication.





**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Manzi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Worth Less?: Why Men (and Women) Devalue Care-Oriented Careers

Katharina Block<sup>1</sup> \*, Alyssa Croft<sup>2</sup> and Toni Schmader<sup>1</sup>

<sup>1</sup> Department of Psychology, The University of British Columbia, Vancouver, BC, Canada, <sup>2</sup> Department of Psychology, The University of Arizona, Tucson, AZ, United States

In the present research, we applied a goal-congruity perspective – the proposition that men and women seek out roles that afford their internalized values (Diekman et al., 2017) – to better understand the degree to which careers in healthcare, early education, and domestic roles (HEED; Croft et al., 2015) are devalued in society. Our first goal was to test the hypothesis that men, relative to women, are less interested in pursuing HEED careers in part because they are less likely than women to endorse communal values. A second, more novel goal was to extend goal congruity theory to examine whether gender differences in communal values also predict the belief that HEED careers add worth to society and are deserving of higher salaries. In three studies of undergraduate students (total N = 979), we tested the predictive role of communal values (i.e., a focus on caring for others), as distinct from agentic values (i.e., a focus on status, competition, and wealth; Bakan, 1966). Consistent with goal congruity theory, Studies 1 and 2 revealed that men's lower interest in adopting HEED careers, such as nursing and elementary education, was partially mediated by men's (compared to women's) lower communal values. Extending the theory, all three studies also documented a general tendency to see HEED as having relatively lower worth to society compared to STEM careers. As expected, communal values predicted perceiving higher societal worth in HEED careers, as well as supporting increases in HEED salaries. Thus, gender differences in communal values accounted for men's (compared to women's) tendency to perceive HEED careers as having less societal worth and as less deserving of salary increases. In turn, gender differences in perceived societal worth of HEED itself predicted men's relatively lower interest in pursuing HEED careers. In no instance did agentic values better explain the gender difference in HEED interest or perceived worth. These findings have important implications for how we understand the value that society places on occupations typically occupied by women versus men.

Keywords: gender differences, agentic values, communal values, career evaluations, career choice, career status, occupational interest

# INTRODUCTION

"If we're going to get to real equality between men and women, we have to focus less on women and more on elevating the value of care."


Try, for a moment, to imagine a world without teachers and nurses. Not only is this difficult to do, but it also paints an unpleasant picture. Workers in healthcare and education play vital roles in the functioning of civil societies (Bordieu and Passeron, 1990; Holmes and Gastaldo, 2002). Yet, as political scientist and policy analyst Slaughter (2015, October 1) suggests, these positions are often devalued. On the one hand, men devalue care-oriented occupations (e.g., teaching or nursing) as personal career paths (Croft et al., 2015). But in addition, those men and women who do choose healthcare, early education, and domestic roles (HEED; Croft et al., 2015) are afforded both lower status and lower salaries in many societies (Cross and Bagilhole, 2002; England et al., 2002). Many HEED professionals feel this broader devaluation. In the United States, public funding for education has been cut by as much as 37% since 2008 – prompting teacher strikes in several states to protest low salaries (Turner, 2018, April 11). Given the important role of these care-oriented professions in personal (Le et al., 2018) and societal well-being (Bordieu and Passeron, 1990; Holmes and Gastaldo, 2002), why are HEED careers not highly valued, both as occupational choices for men and for society as a whole? In the current research, we apply a goal congruity perspective (Diekman et al., 2017) to test whether men's and women's endorsement of communal values predicts their personal interest in and perceptions of the broader societal worth of HEED careers.

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Una Tellhed, Lund University, Sweden

Amanda Diekman, Miami University, United States

> \*Correspondence: Katharina Block kblock@psych.ubc.ca

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

> Received: 30 April 2018 Accepted: 13 July 2018 Published: 10 August 2018

#### Citation:

Block K, Croft A and Schmader T (2018) Worth Less?: Why Men (and Women) Devalue Care-Oriented Careers. Front. Psychol. 9:1353. doi: 10.3389/fpsyg.2018.01353

# Gendered Career Perceptions

Despite several waves of feminism and active efforts by governments, men and women continue to be disproportionately represented in different types of occupations. To date, women remain underrepresented in science, technology, engineering, and math (STEM) fields, where they make up only 9–16% of engineers and 21% of computer programmers (Bureau of Labor Statistics, 2017). Whereas an active literature seeks to understand and rectify this underrepresentation of women in STEM, much less attention has been paid to the equally sizable gender imbalance in communally oriented careers (see Croft et al., 2015). In many HEED careers, men are markedly underrepresented, making up only 10% of nurses and 4% of preschool and kindergarten teachers in the United States (Bureau of Labor Statistics, 2017). Men's self-selection out of care-oriented roles might have negative consequences both for men themselves and those served by HEED professionals (Croft et al., 2015). Thus, the first goal of the current research was to better understand why men are relatively less interested in personally pursuing HEED careers.

As suggested by Slaughter, HEED occupations are not simply unpopular career choices among men; they are also generally devalued in society. HEED careers are assigned lower status and paid lower salaries than traditionally male-dominated STEM careers (e.g., England et al., 2001). In the United States, where teachers stage walk-outs to protest their low salaries, teaching is among the lowest paid occupations given training requirements (Alegretto and Mishel, 2016). Similarly, in other Western countries such as the United Kingdom and Germany, hourly pay rates in education and healthcare are considerably lower than those in scientific sectors (ILOSTAT, 2015). Such data suggest that HEED careers are, quite literally, perceived as worth less money than are STEM careers. And because women tend to be overrepresented in these low-paying HEED careers, sociologists have suggested that the tendency to undervalue care-oriented roles perpetuates the persistent gender wage gap women continue to face in modern societies (Kilbourne et al., 1994; England et al., 2001). Despite such broad implications for important social issues, to the best of our knowledge, there has been no empirical social-psychological investigation of the perceived societal worth of HEED (or STEM) careers. Thus, in addition to better understanding men's disinterest in HEED careers, our second and perhaps more important aim was to document whether people do in fact see HEED careers as having less worth than STEM careers and, if so, to identify factors that predict this perception. We examine these questions through the lens of social role theory, goal congruity theory, and status-value theory.

# Social Role Theory and Goal Congruity

Social role theory (Eagly, 1987; Eagly and Wood, 2012) provides a broad framework for understanding how gender segregation into different roles eventually leads new generations of men and women to internalize distinct traits and values. The theory suggests that the historical overrepresentation of women in care-oriented (e.g., HEED) roles results in societal gender stereotypes of women as inherently more communal (i.e., oriented toward care for others, Bakan, 1966) than men. In turn, such stereotypical expectations lead new generations of women to internalize communal values more than do men (Eagly, 1987; Ridgeway and Correll, 2004; Abele, 2003; Eagly and Wood, 2012). In line with this theory, a wealth of evidence shows that men endorse communal values and traits relatively less than do women (Bem, 1974; Spence et al., 1974; Spence and Helmreich, 1978; Costa et al., 2001; Donnelly and Twenge, 2017). Moreover, such gender differences in communal values are evident early in development, with boys reporting lower communal value endorsement than girls as early as age 6 (Block et al., 2018). In contrast, although women are viewed as less agentic (i.e., focused on self-promotion, Bakan, 1966) than men (Bem, 1974; Spence et al., 1974; Spence and Helmreich, 1978), women have become somewhat more agentic as they have entered the workforce (Donnelly and Twenge, 2017).

Once men and women have internalized communal (and other) values to different extents, these values should, in turn, color their perceptions of careers. As an extension of social role theory, the goal congruity perspective (Diekman et al., 2017) suggests that both men and women seek careers that match their own internalized values for communion and agency. Female-stereotypic (e.g., HEED) and male-stereotypic (e.g., STEM) careers differ in the extent to which they are perceived to afford these values. Specifically, HEED-related careers, such as nursing, are perceived as highly communal but lower in agency, whereas STEM careers are perceived as relatively lower in communion but higher in agency (Diekman et al., 2010; Tellhed et al., 2018). As a consequence, the goal congruity perspective offers an explanation for patterns of horizontal gender segregation by occupation. Past findings show that women's relatively higher communal value endorsement predicts a reduced interest in taking on STEM and other male-dominated careers (Evans and Diekman, 2009; Diekman et al., 2010; Diekman and Steinberg, 2013). In addition, reframing STEM careers as more communal increases women's interest in these careers (Diekman et al., 2011). Though not exclusively focused on HEED careers, past research also suggests that endorsing communal goals predicts favoring female-stereotypic careers among undergraduate (Evans and Diekman, 2009) and high school students (Tellhed et al., 2018). Thus, the first goal of the present research was to test the straightforward prediction that men's lower interest in HEED careers is partly explained by their lower endorsement of communal values.

The current research also extends the goal congruity perspective beyond merely understanding men's and women's career choices, to examine the broader worth that men (and women) perceive in HEED and STEM fields. In line with the introductory quote by Anne-Marie Slaughter, we begin by hypothesizing that, although both men and women might see HEED careers as having less worth compared to STEM, men might particularly devalue the importance of HEED to society. Previous work on status-value asymmetries suggests that high-status group members tend to devalue domains in which their group is underrepresented, whereas low-status group members find it difficult to devalue domains inhabited by higher-status outgroups (Schmader et al., 2001). Given men's higher status in society (Conway et al., 1996; Correll, 2004; Ridgeway and Correll, 2004), we expect them to see less value in female-dominated HEED careers than do women, whereas women might not similarly devalue the broader societal worth of male-dominated STEM careers.

In addition to the importance granted to the roles occupied by higher-status groups, we propose that goal congruity processes also shape the perceived worth of various careers. Specifically, we theorized that internalized values guide not only men's and women's personal career choices, but also their broader perceptions of careers as adding (or not adding) significant worth to society. Because HEED careers are seen as supporting communal goals (Diekman et al., 2010), we expected that those who feel that communion is broadly important (who tend to be women) will see greater worth in HEED careers' contributions to society and will want to see HEED workers compensated well. And because men tend to endorse communal values less strongly, we predicted that men will perceive relatively less societal worth in HEED careers than will women – a difference that will be mediated by men's lower endorsement of communal values.

An additional, more exploratory goal of these studies was to examine whether men's less-favorable perceptions of the societal worth of HEED careers would subsequently predict their reduced interest in actually pursuing HEED careers. Generally speaking, people seek careers that they perceive as making meaningful contributions to society (Hirschi, 2012). What is seen as meaningful, however, could vary based on one's personal values. Thus, we also explored whether men's tendency to perceive relatively less worth in HEED roles (as predicted by their relatively lower communal value endorsement) would mediate gender differences in interest in HEED careers.

Although our primary focus was on communal value endorsement as a predictor of HEED evaluations, we also examined other values that might be seen as incompatible with HEED. Perhaps men are relatively less interested (or perceive less worth) in HEED careers not because they place less value on communion, but instead because they care more about agency, competition, and/or money. These other values might feel incompatible with careers that seem to emphasize putting others' needs above one's own and putting others' well-being above profit. Evidence of gender differences in agency is mixed. Some contemporary studies no longer show gender differences in agentic values (Diekman et al., 2010), but in other research men rate themselves higher on agentic traits than do women (Donnelly and Twenge, 2017). In addition, teenage boys are more likely than girls to prioritize agentic over communal goals (Tellhed et al., 2018), and men are more likely to emphasize competition as a means to gain status (Gneezy and Rustichini, 2004; Croson and Gneezy, 2009) and to focus on salary when evaluating careers (Fortin, 2017). If HEED careers are perceived as not affording competition and wealth, these professions could represent a mismatch to men's values, providing an alternative or independent reason for their devaluation of HEED.

# Overview of Research

In three samples of young adults, we examined the relationship between gender, personal values, and evaluations of HEED careers as both: (a) personally interesting, and (b) as having broader worth to society. In Studies 1 and 2, we applied the goal congruity perspective to men's HEED interest, and tested the hypothesis that men are less interested in HEED careers to the extent that they hold less communal values than do women. We also hypothesized that HEED careers would be seen as having less worth to society compared to STEM. More importantly, we expected men, as compared to women, to perceive HEED careers as having less societal worth (Studies 1–3) and less deserving of pay increases (Study 3), two effects that would be partly explained by men's lower communal values. A more exploratory prediction was that men's relatively low interest in HEED careers as a function of their lower communal values would itself be partially explained by the lower societal worth men grant these occupations (Studies 1 and 2). Whereas we focused on communal values across hypotheses, we also tested the additional explanations that agentic values (Studies 1–3), trait competitiveness (Study 2), and/or material values (Study 3) would predict negative HEED evaluations instead of, or in addition to, communal values. Lastly, we also assessed men's and women's interest in, and perceived worth of STEM careers for comparison.

# STUDY 1

One goal of Study 1 was to extend previous work on goal congruity theory (Diekman et al., 2017) to formally test the hypothesis that men's, compared to women's, relatively lower interest in HEED careers can be explained by their lower communal values. The second and more novel goal was to examine whether communal values also predict perceptions of the broader societal worth of HEED careers. Finally, we tested whether there is a gender difference in the perceived societal worth of HEED careers that might be mediated by the gender gap in communal (and/or agentic) values.

# Method

#### Participants and Procedure


Trained research assistants (male and female) recruited 380 (184 male/196 female) participants in public areas of a large Canadian university to complete a brief paper-and-pencil survey about "values, opinions, and preferences" in exchange for candy. Although no a priori power analysis was conducted, a sensitivity analysis suggested that the study was powered to detect a small-to-medium-sized interaction in an analysis of variance ($\eta_p^2 = 0.02$) at 80% power. Participants had a mean age of 19.91 years (SD = 2.02) and were predominantly East Asian (46.3%) and Caucasian (22.9%), with some South East Asian (13.4%) participants.<sup>1</sup>
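As a rough illustration of what such a sensitivity figure implies, the following sketch approximates power for a single-degree-of-freedom effect. It is not the computation reported above: it assumes the noncentrality parameter is λ = f²·N (the convention for a between-subjects effect, whereas a mixed-design interaction would need further adjustments), and it replaces the noncentral F distribution with a large-sample normal approximation. The function name `approx_power_1df` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_1df(eta_p2, n, alpha=0.05):
    """Approximate power of a 1-df F test (large-sample normal approximation)."""
    f2 = eta_p2 / (1.0 - eta_p2)   # convert partial eta squared to Cohen's f^2
    ncp = f2 * n                   # ASSUMPTION: lambda = f^2 * N
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = sqrt(ncp)              # mean of the test statistic under H1
    # Probability that |Z + shift| exceeds the two-sided critical value
    upper = 1.0 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


# Values taken from the text: eta_p^2 = 0.02, N = 380
power_study1 = approx_power_1df(eta_p2=0.02, n=380)
print(f"Approximate power: {power_study1:.2f}")
```

Under these simplifying assumptions the result lands close to the 80% figure quoted in the text; dedicated power software should be used for any real design decision.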

# Measures

#### **Personal values**

Participants rated "how important" each of seven communal values (helping others, serving humanity, working with people, connection with others, attending to others, caring for others, intimacy; α = 0.79) and seven agentic values (power, recognition, achievement, self-promotion, independence, status, competition; α = 0.69) was "to them personally." This list of values was adapted from Diekman et al. (2010). Participants rated each value by placing an X on a 10 cm long scale anchored by "Not at all important" (0) and "Extremely important" (100). Responses were measured with a millimeter ruler, so that each millimeter corresponded to one scale point.
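The reliability coefficients (α) reported for these scales are Cronbach's alpha values. A minimal sketch of how such a coefficient can be computed from item-level ratings is given below; the ratings are invented for illustration and are not study data, and the function name `cronbach_alpha` is hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; items[i][p] is participant p's rating on item i."""
    k = len(items)
    # Total score per participant (sum across items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_var_sum / variance(totals))


# Hypothetical 0-100 ratings from four participants on three items
ratings = [
    [80.0, 55.0, 30.0, 90.0],
    [75.0, 60.0, 35.0, 85.0],
    [70.0, 50.0, 40.0, 95.0],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency.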

#### **Career interest**

To assess personal interest, participants rated the "degree to which [they] can imagine being at all interested in" five HEED (social worker, human resources manager, preschool/kindergarten teacher, educational administrator, registered nurse; α = 0.76) and five STEM careers (engineer, computer scientist, environmental scientist, architect, dentist; α = 0.68) adapted from Diekman et al. (2010).<sup>2</sup> Ratings were made on the same 10 cm response scale used for values, with the anchors "Not at all interested" and "Extremely interested."

#### **Perceived worth to society**

Participants rated the same HEED and STEM careers (αHEED = 0.90; αSTEM = 0.85) for their perceived worth to society. Specifically, participants estimated the ideal pay they would assign to reflect a given career's worth to society. We emphasized that "we are NOT asking [them] to estimate the actual pay these roles currently get on the job market, but rather the VALUE you want to assign to them." Ratings were made by placing an X on a visual continuous scale with the anchors of "\$0 per hour" to "\$400 per hour".<sup>3</sup>

#### **Exploratory variables and demographics**

The survey included exploratory measures of future breadwinner and caregiver roles and career- vs. family prioritization. Gender differences in these "domestic" variables, and their relationships to personal values can be found in the **Supplementary Online Materials** (SOM) but will not be discussed in this paper. At the end of the study, participants completed a standard demographic questionnaire including gender, age, year standing, major, ethnicity, sexual orientation and dating status.

The full datasets for all studies in this manuscript can be located at osf.io/ejz78.

# Results

#### Gender Differences

#### **Personal values**

Based on previous findings, we expected men to score lower than women on communal values but expected no clear gender difference in agency (Diekman et al., 2017). In line with this prediction, a 2 (participant gender: male vs. female) × 2 (value-type: communal vs. agentic) mixed analysis of variance (ANOVA) yielded a significant interaction between participant gender and value-type, F(1,376) = 11.62, p = 0.001, $\eta_p^2 = 0.03$. Pairwise comparisons revealed that men valued communion less than did women, p < 0.001, but men and women valued agency to a similar extent, p = 0.970. It is notable, however, that both men and women reported valuing communion more than agency, all ps < 0.001. Descriptive statistics, d-scores for gender differences, and correlations for key variables are reported in **Table 1**.
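For readers who want to relate the reported F statistics to the effect sizes, partial eta squared can be recovered from F and its degrees of freedom using the standard identity η²ₚ = (F · df_effect) / (F · df_effect + df_error). The short sketch below applies it to the interaction reported above; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Gender x value-type interaction from the text: F(1,376) = 11.62
eta_interaction = partial_eta_squared(11.62, 1, 376)
print(f"partial eta squared = {eta_interaction:.2f}")
```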

#### **Career interest**

Reflecting past evidence of gender differences in occupational interest (Evans and Diekman, 2009; Su et al., 2009; Diekman et al., 2017), we expected men to be more interested in male-stereotypic STEM careers, and women to be more interested in female-stereotypic HEED careers. A 2 (participant gender) × 2 (career-type: STEM vs. HEED) mixed ANOVA yielded the expected gender × career-type interaction, F(1,378) = 2.65, p < 0.001, $\eta_p^2 = 0.16$. As expected, men reported significantly less interest in HEED careers than did women, p < 0.001. Women, in turn, reported less interest in STEM careers than did men, p = 0.003. Pairwise comparisons showed that women favored HEED over STEM careers, p < 0.001, whereas men favored STEM over HEED careers, p < 0.001.

<sup>1</sup> Study 1 collapses across data from five versions of the same survey measures collected at the same time by the same research assistants. These versions simply varied item-order in an attempt to subtly prime a communal mindset, but initial analyses revealed no differences by order.

<sup>2</sup>An additional item, "lawyer," was excluded from the STEM scale given a low item-total correlation, r = 0.11, compared to at least 0.20 for all other items, and the lack of gender imbalance in law careers (Bureau of Labor Statistics, 2017). An item "homemaker" was excluded from the HEED scale given our focus on paid careers.

<sup>3</sup>An exploratory maximum likelihood factor analysis with direct oblimin rotation confirmed that participants' ratings of societal worth formed two factors that, although positively correlated, r = 0.55, included ratings of HEED careers on one factor (56.11% of variance) and ratings of STEM careers on the second factor (13.40% of variance).


<sup>∗</sup>p < 0.05. Superscripts on d-scores indicate the significance level of gender differences from tests reported in text. All scales had a range of 0–100 except the value measures, in which participants expressed value in an hourly pay amount that could vary between \$0–\$400/hr.

#### **Perceived worth to society**

We expected that men, as compared to women, would see less societal worth in HEED careers (as indicated by a lower ideal salary). A 2 (participant gender) × 2 (career-type) mixed ANOVA on the perceived societal worth assigned to these careers revealed significant main effects of career-type, F(1,378) = 156.81, p < 0.001, $\eta_p^2 = 0.29$, and of gender, F(1,378) = 6.29, p = 0.013, $\eta_p^2 = 0.02$, that were qualified by a significant gender by career-type interaction, F(1,378) = 4.78, p = 0.029, $\eta_p^2 = 0.01$. Consistent with key hypotheses, and as shown in **Figure 1**, men assigned significantly less societal worth to HEED careers than did women, p = 0.003. However, women and men assigned similarly high levels of societal worth to STEM careers, p = 0.100. Although both men and women assigned higher ideal salaries to STEM than to HEED, ps < 0.001, that difference was significantly smaller for women, d = 0.38, compared to men, d = 0.52.

#### Mediation of Occupational Perceptions by Communal Values

Given the observed gender differences in career interest and perceived societal worth of HEED careers, we next tested our hypotheses that communal values would partially account for these gender gaps in HEED perceptions. Using Preacher and Hayes' (2012) PROCESS Macro in SPSS (Hayes, 2012; Model 4), we regressed each outcome variable (personal interest and societal worth of HEED in separate models) onto gender as the predictor variable and communal values and agentic values as simultaneous mediator variables. To focus specifically on the relationship of communal values with HEED perceptions, analyses always controlled for parallel ratings of STEM occupations.<sup>4</sup> All variables were standardized in these and other models. Indirect effects of gender on career perceptions via values, and their confidence intervals, were estimated using 10,000 bootstrapped re-samples. Models are visualized in **Figure 2**.
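To make the logic of a bootstrapped indirect effect concrete, the sketch below estimates a single-mediator indirect effect (the product of the predictor-to-mediator path and the mediator-to-outcome path) with a percentile bootstrap interval. It is a simplified illustration, not a reimplementation of the PROCESS macro: it uses one mediator and no covariates, synthetic data generated in the script, and fewer resamples than the 10,000 used in the reported analyses. All function and variable names are hypothetical.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance (n - 1 denominator)."""
    mu, mv = _mean(u), _mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """a*b for X -> M -> Y: a from M ~ X, b from Y ~ X + M (OLS)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=1000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    low = estimates[int((1 - level) / 2 * n_boot)]
    high = estimates[int((1 + level) / 2 * n_boot) - 1]
    return low, high


# Synthetic data only: a binary predictor, a mediator, and an outcome
data_rng = random.Random(42)
gender = [data_rng.choice([0.0, 1.0]) for _ in range(300)]
communal = [0.5 * g + data_rng.gauss(0, 1) for g in gender]
interest = [0.4 * c + 0.3 * g + data_rng.gauss(0, 1)
            for g, c in zip(gender, communal)]

point = indirect_effect(gender, communal, interest)
ci_low, ci_high = bootstrap_ci(gender, communal, interest)
print(f"indirect effect = {point:.3f}, 95% CI ({ci_low:.3f}, {ci_high:.3f})")
```

An indirect effect is conventionally treated as significant when the bootstrapped interval excludes zero, which is the criterion behind the "p < 0.05" annotations attached to the confidence intervals in the results.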

In addition to the already described gender differences in communal values, β = 0.23, SE = 0.05, t(376) = 4.58, p < 0.001, communal values predicted significantly higher personal interest in HEED careers, β = 0.22, SE = 0.05, t(373) = 4.76, p < 0.001, as well as higher societal worth assigned to HEED careers, β = 0.11, SE = 0.04, t(373) = 3.13, p = 0.002. Importantly, there was a significant indirect effect of gender on HEED interest as mediated by communal values, $a \ast b = 0.05$, SE = 0.02, bootstrapped $\mathrm{CI}_{0.95}$ (0.02, 0.09), p < 0.05, and of gender on perceived societal worth of HEED careers as mediated by communal values, $a \ast b = 0.02$, SE = 0.01, bootstrapped $\mathrm{CI}_{0.95}$ (0.01, 0.05), p < 0.05. In contrast, endorsing agentic values did not predict personal interest in HEED, β = −0.003, SE = 0.05, t(373) = −0.06, p = 0.956, but did predict lower perceived societal worth of HEED in this sample, β = −0.07, SE = 0.03, t(373) = −1.98, p = 0.049. Yet, the lack of gender difference in agentic values precludes this variable from mediating effects, both $a \ast b$ values < 0.001.

<sup>4</sup>Communal and agentic values were positively correlated, r(378) = 0.15, p = 0.003. Both for HEED interest and societal worth of HEED, communal values remain a significant mediator of this relationship when agentic values and STEM ratings are NOT included. See **Supplementary Online Materials** (SOM) for these analyses.

These mediation models provide support for the hypothesis that men show less personal interest in and assign lower societal worth to HEED careers, in part, because communal values are less important to them than they are to women (13% of the total gender difference in career interest and 33% of the gender difference in societal worth was explained by communal values). After entering communal and agentic values (alongside STEM perceptions as control) into these models, gender was still a significant predictor of HEED interest, β = 0.33, SE = 0.05, t(373) = 7.10, p < 0.001, but not of societal worth of HEED, β = 0.06, SE = 0.03, t(373) = 1.90, p = 0.058.<sup>5</sup>
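The "percentage explained" figures can be read as the proportion of the total effect carried by the indirect path, i.e., a·b / (a·b + c′), where c′ is the direct effect of gender that remains after the mediators are entered. The sketch below applies this to the rounded coefficients reported for HEED interest. It assumes the total effect is the sum of the communal indirect effect and the direct effect (the agentic indirect effect being near zero), and because the published coefficients are rounded to two decimals, recomputed proportions will only approximate the reported percentages. The function name `proportion_mediated` is hypothetical.

```python
def proportion_mediated(a, b, direct):
    """Share of the total effect (indirect + direct) carried by the a*b path."""
    indirect = a * b
    return indirect / (indirect + direct)


# Rounded coefficients from the text for HEED interest:
# gender -> communal values (a = 0.23), communal values -> interest (b = 0.22),
# remaining direct effect of gender (c' = 0.33)
share_interest = proportion_mediated(a=0.23, b=0.22, direct=0.33)
print(f"proportion mediated = {share_interest:.0%}")
```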

#### **Are gender differences in HEED interest mediated by communal values and societal worth?**

Given that gender differences in communal values related to both gender differences in HEED interest and societal worth, we further asked whether men's, compared to women's, relatively lower interest in HEED careers is partly explained by the lower perceived worth of these careers. We tested this serial mediation with the PROCESS macro (Hayes, 2012; model 6) entering gender as the main predictor, communal values as mediator 1, and societal worth of HEED as mediator 2 of a model predicting HEED interest as the outcome. All paths controlled for agentic values, societal worth of STEM, and interest in STEM.

Results, summarized in **Figure 3**, yielded evidence of a significant serial mediation effect, $a_1 \ast a_2 \ast b = 0.01$, SE = 0.003, bootstrapped $\mathrm{CI}_{0.95}$ (0.002, 0.02). Gender was a significant predictor of communal values, β = 0.22, SE = 0.05, t(373) = 4.31, p < 0.001, which in turn predicted greater perceived worth of HEED careers, β = 0.11, SE = 0.03, t(372) = 3.12, p = 0.002. Perceiving higher societal worth in HEED, in turn, predicted higher personal interest in adopting HEED careers, β = 0.29, SE = 0.07, t(371) = 4.19, p < 0.001. Results from serial mediation analyses thus suggest that men's (vs. women's) relatively lower interest in HEED careers is partially explained through communal values – both through communal values' relationship to perceived lower societal worth of HEED and through communal values' direct relationship to lower interest in HEED careers.
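The serial indirect effect is the product of the three consecutive paths: gender to communal values, communal values to perceived societal worth, and perceived societal worth to HEED interest. As a quick arithmetic check using the rounded coefficients reported above (the function name `serial_indirect` is hypothetical):

```python
def serial_indirect(a1, a2, b):
    """Product of three consecutive standardized paths in a serial mediation."""
    return a1 * a2 * b


# Rounded path coefficients from the text: 0.22, 0.11, and 0.29
serial_effect = serial_indirect(0.22, 0.11, 0.29)
print(f"serial indirect effect = {serial_effect:.3f}")
```

Because each path is well below 1 in standardized units, the product is necessarily small, which is why the interval around it is reported to three decimals.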

# Discussion

Results from Study 1 suggest that strongly endorsing communal values not only relates to greater interest in adopting HEED careers and less interest in STEM careers (as shown previously by Diekman et al., 2010, 2011), but also predicts perceiving more societal worth in HEED occupations. As expected, men, compared to women, were less interested in pursuing HEED careers themselves and also tended to perceive lower societal value in HEED careers. In turn, gender differences in communal values partially accounted for these gender differences in devaluing HEED on both a personal and societal level. Seeing less societal worth in HEED also partially explained men's lower interest in HEED. These results suggest that those who care less about nurturing and connection (who are more likely to be men) tend to place less value on roles in society that provide care to others (i.e., HEED). Moreover, given that those who strongly endorsed agentic values tended to assign lower societal worth to HEED (and more to STEM) careers, agency appears to play some role in these evaluations. The lack of gender difference in agency in this large sample, however, made this variable an unlikely potential explanation for gender differences in how these occupations are evaluated.

<sup>5</sup>Though not our focus, we conducted parallel analyses on evaluations of STEM roles. These mediational models (summarized in SOM) revealed that communal values predicted lower interest in STEM careers and significantly mediated gender differences in STEM interest. Communal values also predicted marginally lower societal worth perceived in STEM careers (in the absence of a gender difference in perceived societal worth of STEM).

# STUDY 2

Although Study 1 provided initial evidence that communion plays a larger role than agency in understanding men's underrepresentation in (and devaluation of) HEED roles, we were concerned that our abstract measure of agentic values might have obscured relevant facets of the construct. Despite the fact that the gender gap in overall agency is no longer found by all contemporary studies (Diekman et al., 2010), men in most societies remain markedly more competitive than women (Croson and Gneezy, 2009). This gender difference in competitiveness is evident early in development (Gneezy and Rustichini, 2004) and across cultures (Gneezy et al., 2009). In addition, evidence suggests that a competitive mindset leads to less prosocial behavior (Liberman et al., 2004). It is plausible that gender differences in this particular facet of agency offer an alternative explanation for men's relatively lower interest in and perceived societal worth of HEED careers, because men's striving for competitiveness could be perceived as incompatible with care-oriented HEED roles (consistent with the goal congruity perspective; Diekman et al., 2017).

Study 2 tested whether possible gender differences in trait competitiveness – as a key component of agency – account for gender differences in HEED role interest and perceived societal worth, over and above the mediational effect of communal values (documented in Study 1). Study 2 was originally designed to test an experimental manipulation of competitiveness, in which participants were randomly assigned to play either a competitively or a cooperatively framed game that has been used in the past to prime competitive vs. cooperative mindsets (Liberman et al., 2004). Because this manipulation failed to show effects on competitive behavior or self-reported competitiveness, we collapsed across conditions and analyzed the dataset correlationally. Controlling for condition does not change results (see SOM). The strengths and limitations of this approach will be addressed in the Section "General Discussion."

# Method

#### Participants

We recruited 308 (152 men/156 women) undergraduates from a large Canadian university who participated either for research credit or \$10 (M<sub>age</sub> = 20.0, SD = 2.23). Participants reported a variety of majors (39.3% from Psychology, 10.7% from other Arts majors, 22.7% from Science majors, 12.7% from Business, and the rest from other majors) and were predominantly East Asian (52.9%), Caucasian (22.7%), or East Indian (14.0%). Study 2 was run in 2014 with a goal of recruiting a minimum of 75 participants per condition and gender. Sensitivity analyses with G*Power suggested that this sample was powered to detect a small to medium interaction effect in an ANOVA (η<sub>p</sub><sup>2</sup> = 0.025) with 80% power (alpha = 0.05).
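To relate the detectable effect size to Cohen's f, the metric commonly used in power software, partial eta squared can be converted with f = sqrt(η² / (1 − η²)). The following is a minimal illustrative sketch, not part of the original analysis; the function name is hypothetical.

```python
import math


def cohens_f_from_partial_eta_sq(eta_p_sq: float) -> float:
    """Convert partial eta squared to Cohen's f: f = sqrt(eta / (1 - eta))."""
    if not 0.0 <= eta_p_sq < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p_sq / (1.0 - eta_p_sq))


# Detectable effect size reported in the sensitivity analysis.
f_detectable = cohens_f_from_partial_eta_sq(0.025)
print(round(f_detectable, 3))
```

Under Cohen's conventional benchmarks (f = 0.10 for a small and f = 0.25 for a medium effect), the resulting value of roughly 0.16 falls between small and medium, in line with the description above.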

#### Procedure

Participants were brought into the lab in pairs, ostensibly for a study examining individual differences in playing games. They completed the study in individual cubicles, thinking that they were playing with a partner in another cubicle. Based on random assignment, they either heard a description of the task as a "cooperation game" played "with a partner" (cooperation condition), or as a "competition game" played "against an opponent" (competition condition). After learning the rules of the prisoner's dilemma game (PDG, Liberman et al., 2004), all participants played only a single trial of the game before completing the same measures completed by participants in Study 1 (but on the computer). Because initial analyses revealed that participants were not more likely to choose the competitive option in the PDG as a result of the task description and the manipulation had no effects on other measures, analyses collapsed across this experimental manipulation to instead test our correlational hypotheses parallel to Study 1. More details can be found in the SOM.

#### Measures

As in Study 1, we assessed (in the described order) participants' interest in HEED (0–100 scale; α = 0.73) and STEM careers (0–100 scale; α = 0.70), participants' perceptions of societal worth of HEED (\$0–\$400 per hour scale; α = 0.93) and STEM careers (\$0–\$400 per hour scale; α = 0.92), and their communal (0–100 scale; α = 0.83) and agentic values (0–100 scale; α = 0.80).<sup>6</sup>
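The α values reported for each composite are presumably Cronbach's alpha, an index of internal consistency. As a hedged illustration of how such a coefficient is computed, the sketch below uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the ratings are hypothetical and are not the study data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: one list of scores per item, aligned by participant.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("all items need one score per participant")
    # Total score per participant across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 0-100 ratings from four participants on three items.
ratings = [
    [60, 70, 80, 90],
    [55, 75, 85, 95],
    [65, 65, 75, 85],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```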

#### **Trait competitiveness**

Participants self-reported their trait competitiveness after the above described measures on a 9-item measure (α = 0.94; Houston et al., 2002) before completing demographics. Items included positively worded statements (e.g., "I am a competitive individual.") and negatively worded statements (e.g., "I don't like competing against other people.") and were rated on a scale of 0 = "Strongly Disagree" to 100 = "Strongly Agree."
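Because the scale mixes positively and negatively worded statements, a composite score is conventionally formed by reverse-scoring the negatively worded items before averaging. The sketch below assumes this convention was followed; the function name, the item positions, and the example responses are hypothetical.

```python
def score_scale(responses, reverse_keyed, scale_min=0, scale_max=100):
    """Average item responses after reverse-scoring negatively worded items.

    responses: ratings for one participant, in item order.
    reverse_keyed: set of zero-based positions of negatively worded items.
    """
    adjusted = [
        (scale_min + scale_max - r) if i in reverse_keyed else r
        for i, r in enumerate(responses)
    ]
    return sum(adjusted) / len(adjusted)


# Hypothetical participant: the second item is negatively worded,
# so a rating of 20 is flipped to 80 before averaging.
example_score = score_scale([80, 20, 90], reverse_keyed={1})
print(round(example_score, 1))
```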

# Results and Discussion

#### Gender Differences in Outcomes

#### **Personal values and traits**

A 2 (participant gender) × 3 (value-type: communal, agentic, competitiveness) mixed ANOVA showed the anticipated participant gender by value-type interaction, F(1,305) = 32.52, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.18. Replicating gender differences found by others (Croson and Gneezy, 2009), Bonferroni-corrected pairwise comparisons showed that men scored significantly higher on competitiveness, p < 0.001, but significantly lower on communal values, p = 0.009, than did women. As in Study 1, men and women did not differ in the extent to which they felt agentic values were important to them, p = 0.414. Means, d-scores for gender differences, and correlations for all variables can be found in **Table 2**.

<sup>6</sup>We removed the item "competition" from the agentic values composite because it was highly correlated with trait competitiveness, r = 0.70, p < 0.001, and we aimed to disentangle these constructs. Results are unchanged when including this item.

#### TABLE 2 | Study 2 descriptive statistics and bivariate correlations.


<sup>∗</sup>p < 0.05. Superscripts on d-scores indicate the significance level of gender differences from tests reported in text. All scales had a range of 0–100 except the value measures, in which participants expressed value in an hourly pay amount that could vary between \$0–\$400/hr.

#### **Career interest**

As in Study 1, a 2 (participant gender) × 2 (career-type: HEED vs. STEM) mixed ANOVA yielded the predicted interaction, F(1,306) = 77.14, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.20. As expected, simple pairwise comparisons revealed that men reported less interest in HEED careers than did women, p < 0.001. Women, in turn, reported less interest in STEM careers than did men, p = 0.006. In addition, women favored HEED careers over STEM, p < 0.001, whereas men in this sample reported non-significantly lower interest in HEED than in STEM, p = 0.145. Perhaps because Study 2 was dominated by students from a HEED-related field (i.e., the psychology participant pool), there was also a general tendency of participants to report more interest in HEED than in STEM careers, F(1,306) = 46.47, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.13.
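Readers who wish to check the reported effect sizes can recover partial eta squared from an F statistic and its degrees of freedom using the standard identity η² = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Values taken from the career-interest ANOVA reported above.
interaction = partial_eta_squared(77.14, 1, 306)
career_main_effect = partial_eta_squared(46.47, 1, 306)
print(f"{interaction:.2f} {career_main_effect:.2f}")
```

Both results round to the values given in the text (0.20 and 0.13).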

#### **Perceived worth to society**

In addition, a 2 (participant gender) × 2 (career-type) mixed ANOVA on perceived societal worth of careers revealed a main effect of career-type, F(1,306) = 52.99, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.15, that was qualified by a significant gender by career interaction, F(1,306) = 5.04, p = 0.025, η<sub>p</sub><sup>2</sup> = 0.02. In this sample, there was no main effect of gender, F(1,306) = 1.21, p = 0.272, η<sub>p</sub><sup>2</sup> = 0.004. As visualized in **Figure 4**, in support of our hypothesis, men assigned marginally less societal worth to HEED careers than did women, p = 0.081. Men and women, however, assigned similar levels of societal worth to STEM careers, p = 0.716. Parsed differently, both men and women assigned more societal worth to STEM than to HEED, ps < 0.001, but the gap was significantly larger for men, d = 0.35, than for women, d = 0.17.

#### Mediation of Career Attitudes by Personal Values

As in Study 1, we next tested the extent to which communal values, agentic values, and now also trait competitiveness predicted evaluations of HEED careers (controlling for gender) and, in turn, mediated gender differences in HEED perceptions. As before, all possible mediators (communal values, agentic values, trait competitiveness) were entered into the mediational regression model simultaneously to better estimate unique effects, and models also controlled for STEM perceptions.<sup>7</sup> Results from these analyses are visualized in **Figure 5**.

As documented above, we found gender differences in communal values, β = 0.15, SE = 0.06, t(306) = 2.61, p = 0.010, and trait competitiveness, β = −0.36, SE = 0.05, t(306) = −6.79, p < 0.001, but not agentic values, β = 0.05, SE = 0.06, t(306) = 0.82, p = 0.414. Consistent with the findings from Study 1, endorsement of communal values significantly predicted greater interest in HEED careers (controlling for interest in STEM careers), β = 0.28, SE = 0.05, t(302) = 5.80, p < 0.001, as well as the tendency to assign higher societal worth to HEED careers (controlling for societal worth of STEM careers), β = 0.09, SE = 0.03, t(302) = 2.60, p = 0.010. Over and above communal values and trait competitiveness, agentic values predicted both less interest in HEED, β = −0.18, SE = 0.05, t(302) = −3.66, p < 0.001, and a tendency to perceive lower worth to society in HEED, β = −0.08, SE = 0.03, t(302) = −2.18, p = 0.030. In contrast, despite the previously described gender differences, trait competitiveness did not significantly relate to interest in, β = 0.04, SE = 0.05, t(302) = 0.66, p = 0.512, or societal worth assigned to HEED careers, β = −0.01, SE = 0.04, t(302) = −0.25, p = 0.803.

<sup>7</sup>Both communal values, r(308) = 0.15, p = 0.009, and competitiveness, r(308) = 0.41, p < 0.001, correlated positively with agency. As documented in the SOM, all indirect effects through communal values remain significant when adding experimental condition as control variable, and when removing agentic values, trait competitiveness, and control variables from all models.

Finally, bootstrapping analyses to estimate indirect effect sizes yielded significant indirect effects of gender via communal values on both interest in HEED-related careers, a × b = 0.04, SE = 0.02, bootstrapped CI<sub>0.95</sub> (0.01, 0.08), and perceptions of societal worth of HEED careers, a × b = 0.01, SE = 0.01, bootstrapped CI<sub>0.95</sub> (0.003, 0.03). Given that there was no relationship between competitiveness and these outcomes, analyses yielded no evidence that trait competitiveness mediated either gender differences in HEED interest, a × b = −0.01, SE = 0.02, bootstrapped CI<sub>0.95</sub> (−0.05, 0.03), or societal worth assigned to HEED careers, a × b = 0.003, SE = 0.01, bootstrapped CI<sub>0.95</sub> (−0.03, 0.03). Similarly, given the lack of gender differences on agentic values, analyses yielded no evidence that agentic values mediated either gender differences in interest in HEED careers, a × b = −0.01, SE = 0.01, bootstrapped CI<sub>0.95</sub> (−0.03, 0.01), or societal worth assigned to HEED careers, a × b = −0.004, SE = 0.01, bootstrapped CI<sub>0.95</sub> (−0.02, 0.004). After entering communal and agentic values, and trait competitiveness (alongside STEM perceptions as control) into these models, gender remained a significant predictor of HEED interest, β = 0.44, SE = 0.05, t(302) = 8.37, p < 0.001, but not of perceived societal worth of HEED, β = 0.07, SE = 0.04, t(302) = 1.96, p = 0.053. These findings further support our hypothesis that relatively lower communal values predict the extent to which individuals in general, and to some extent men in particular, find HEED roles less personally interesting and perceive them as having less worth to society.<sup>8</sup>
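The logic of a bootstrapped indirect effect can be sketched in a few lines. The example below is a deliberately simplified illustration, not the authors' analysis: it assumes a single mediator and no covariates (the reported models include several mediators and controls), uses a percentile bootstrap, and runs on simulated data. All function names and parameter values are hypothetical.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = _mean(u), _mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Return the product of path a (slope of M on X) and path b
    (coefficient of M when Y is regressed on both X and M)."""
    var_x, var_m = _cov(x, x), _cov(m, m)
    cov_xm, cov_xy, cov_my = _cov(x, m), _cov(x, y), _cov(m, y)
    a = cov_xm / var_x
    b = (var_x * cov_my - cov_xm * cov_xy) / (var_x * var_m - cov_xm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample participants with replacement, keeping rows intact.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Simulated (not real) data: binary predictor, one mediator, one outcome.
sim = random.Random(42)
x = [i % 2 for i in range(200)]
m = [1.0 * xi + sim.gauss(0, 1) for xi in x]
y = [0.5 * mi + 0.2 * xi + sim.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
low, high = bootstrap_ci(x, m, y)
print(f"indirect effect = {point:.3f}, 95% CI ({low:.3f}, {high:.3f})")
```

An indirect effect is typically treated as significant when the bootstrapped interval excludes zero, which is how the intervals above are read.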

#### **Are gender differences in HEED interest mediated by communal values and societal worth?**

Lastly, as in Study 1, we conducted serial mediation analyses with the PROCESS macro (Hayes, 2012; model 6) entering gender as the main predictor, communal values as mediator 1, and societal worth of HEED as mediator 2 of a model predicting personal HEED interest as the outcome. Again, all paths controlled for agentic values, trait competitiveness, societal worth of STEM, and interest in STEM. Results, summarized in **Figure 6**, yielded a significant serial mediation effect, a<sub>1</sub> × a<sub>2</sub> × b = 0.004, SE = 0.003, bootstrapped CI<sub>0.95</sub> (0.001, 0.01). There were gender differences in communal values, β = 0.18, SE = 0.06, t(302) = 2.72, p = 0.007, which were predictive of higher societal worth perceived in HEED careers, β = 0.09, SE = 0.03, t(301) = 2.56, p = 0.011. Perceiving higher societal worth in HEED, in turn, predicted higher personal interest in taking on HEED careers, β = 0.31, SE = 0.08, t(300) = 3.72, p < 0.001.

<sup>8</sup>Consistent with Study 1 and past research (Diekman et al., 2017), analyses summarized in the SOM revealed that communal values predicted less interest in STEM-related careers, thereby mediating gender differences in STEM interest. Gender differences were absent for perceived societal worth of STEM, and communal values were only marginally related to lower societal worth of STEM.

### Discussion

Taken together, results of Study 2 replicate findings from Study 1, providing further support for a goal congruity perspective of men's (and women's) devaluation of HEED roles. Compared to women, men were less personally interested, and perceived somewhat less societal worth, in HEED careers to the extent they were less likely to have internalized communal values. Consistent with findings from Study 1, further analyses suggest that men's (vs. women's) relatively lower interest in HEED careers is partially explained by their lower communal values' predicting lower societal worth assigned to HEED careers. Results from Study 2 also failed to find any support for the alternative hypothesis that high agency, in general, or high competitiveness, more specifically, can provide better explanations for men and women's different evaluations of HEED occupations. Irrespective of gender, however, we observed that stronger endorsement of agentic values, over and above gender and trait competitiveness, consistently predicted perceiving HEED careers as contributing less worth to society in Studies 1 and 2. Although we replicated a frequently observed gender difference in competitiveness, we found no evidence that more competitive people tend to devalue HEED careers. Together, patterns from the first two studies are in line with our assertion that one factor underlying men's relatively lower personal interest in and perceived societal worth of careers such as nursing and teaching, is that men are less likely than women to internalize communal values.

# STUDY 3

Studies 1 and 2, to our knowledge, provide the first evidence for the novel hypothesis that communal values not only predict personal interest in careers but also play an important role in the broader societal worth people assign to different occupations. There were gender differences in evaluations of HEED careers as having worth to society, but personal communal values consistently predicted these evaluations over and above gender. Given the under-examined nature of this topic, Study 3 was designed to focus more specifically on the extent to which personal values predict both perceptions of the societal worth of HEED careers, and support for efforts to increase HEED salaries (in order to match STEM salaries).

Our first aim was to replicate the relationship between communal values and perceived societal worth of HEED careers using a more rigorous methodology. In Studies 1 and 2, participants expressed the societal worth they perceived in HEED (and STEM) careers as an ideal hourly pay. Whereas this method does provide a meaningful ratio scale, participants' ratings could easily be skewed by their knowledge of the realistic discrepancies in income or work hours between the different career-types in North America. Because workers in HEED professions (e.g., teaching and nursing) earn lower salaries (Cross and Bagilhole, 2002) and work fewer hours (Statistics Canada, 2017) than workers in comparable male-dominated STEM careers, participants' ratings of societal worth could be biased by their knowledge of these differences. To address this concern, participants in Study 3 initially rated their perceptions of the actual pay and work hours of careers, which allowed us to partial out these ratings from their assessments of ideal pay as a measure of perceived worth. In addition, we improved our measures of societal worth by rephrasing the items more clearly, and also by narrowing the focus to careers that clearly require caregiving (i.e., we replaced "human resources manager" and "educational administrator" with "occupational therapist" and "special education teacher").

The second aim of Study 3 was to examine the relationship between (and gender differences in) communal values and people's support for policies aimed at increasing HEED salaries to match STEM salaries. Similar to our findings on societal worth in the previous studies, we predicted that those with lower communal values (who also tend to be men) would be less supportive of policies designed to increase HEED salaries. Moreover, we predicted that men would be less likely to support increases in HEED salaries than would women, a gender difference that should be partially accounted for by men's relatively lower endorsement of communal values, as well as by their tendency to see HEED careers as worth relatively less to society. In testing our hypotheses using a novel operationalization of our key outcome, we also increased the external validity and potential generalizability of our findings.

Our key hypotheses are based on the theoretical assumption that goal congruity processes lead people with stronger communal values to perceive greater societal worth in communal HEED careers, and therefore want HEED careers to be compensated accordingly. Because communal values reflect a more general endorsement of social equality as a prized goal (Schwartz and Bilsky, 1987), one would expect that individuals who are more communal should also be more supportive of increasing gender balance (i.e., a form of equality) in any field (HEED or STEM). We thus tested whether communal values (and gender differences in them) would predict participants' support for social action aiming to increase the gender balance in general (i.e., average support for increasing gender balance for both HEED and STEM). However, we also explored whether communal values uniquely predicted support for increasing gender balance in HEED over and above support for gender balance in STEM.

Lastly, in Studies 1 and 2, we found little evidence that men's evaluations of HEED careers are explained by the value they placed on agency or a desire to be competitive. A final aim of Study 3 was to test a new alternative hypothesis that men's relatively lower worth placed on HEED careers is instead (or additionally) predicted by their valuation of material wealth. If men value money more than do women, then this prioritization of money could reasonably predict their more positive judgment of STEM careers, which drive economic growth (Cooke, 2002), over HEED careers which are traditionally publicly funded and pay lower salaries (Cross and Bagilhole, 2002; Bagilhole and Cross, 2006).

# Method

#### Participants

A total of 307 undergraduate students completed the study in individual cubicles in the lab (run in 2016). This number was higher than our a priori target of 280 because we oversampled to account for exclusions due to failed attention checks. Our target sample was calculated by estimating the sample size needed to obtain 85% power to find an indirect effect equal to the average effect size we found in Studies 1 and 2. We excluded 15 participants who failed basic attention checks indicating that they randomly chose answers (e.g., "If you are paying attention, please select option two.") and one participant who did not identify as either male or female, leaving a final sample of 291 (146 men/145 women). Participants were on average 20.06 years old (SD = 2.34) and were 1st (27.5%), 2nd (32%) or 3rd (25.1%) year students in Psychology (38.1%), other Arts majors (20.3%) and other Science majors (15.1%). Participants were predominantly East Asian (47.80%) or White (26.80%).

### Materials and Procedure

#### **Personal values**

As in the previous studies, participants began by rating the extent to which seven communal values (α = 0.85) and seven agentic values (α = 0.79) were personally important to them. Embedded with these values, participants in Study 3 also rated two items ("Money"; "Wealth," r = 0.85, p < 0.001) which were combined to assess participants' endorsement of material values. All ratings were made on a scale of 1 (Not at all important) to 9 (Extremely Important)<sup>9</sup> and all values were presented in randomized order.

#### **Perceived career attributes**

Before rating the perceived societal worth of each career, participants were asked to estimate the real salary, rated on a scale of "\$0 per hour" to "\$150 per hour," and then the weekly work hours, rated on a scale of "0 h a week" to "90 h a week," for each HEED career (α<sub>salary</sub> = 0.88, α<sub>hours</sub> = 0.79) and STEM career (α<sub>salary</sub> = 0.93, α<sub>hours</sub> = 0.87).

#### **Worth to society**

We updated the phrasing of this item to increase clarity. Participants in Study 3 were asked to "assign a dollar amount to represent what you think each of the following careers should be paid based on their worth TO THE FUNCTIONING OF SOCIETY" (added text in all caps). In this way, participants rated the worth to society of five HEED careers (nurse, social worker, special education teacher, occupational therapist, and elementary school teacher; α = 0.94) and five STEM careers (computer systems architect, industrial engineer, mechanical engineer, architect and software developer; α = 0.94).<sup>10</sup> All careers were presented in a randomized order. Ratings were made on a continuous slider scale with the anchors \$0 to \$150 per hour. This range was updated to more closely match the actual average pay of all the occupations used according to data from the Canadian government (Government Canada, 2015) with 20% added to the highest average hourly pay.

#### **Support for change**

After making their ratings of specific careers, participants completed three measures of support for social change in regards to HEED and STEM that served as three novel outcome variables: (1) support for HEED salary increases (to match salaries in STEM), (2) support for increasing the gender balance in HEED, and (3) support for increasing the gender balance in STEM. Given that we were most interested in HEED perceptions, scales were always presented in this order. All ratings were made on a scale of 1 (strongly disagree) to 7 (strongly agree) and items were randomized within each outcome variable.

<sup>9</sup>We changed the response format given that we switched our survey program to Qualtrics for the last study, and found this response format more visually intuitive.

<sup>10</sup>Interspersed between these careers were two domestic roles (homemaker and stay-at-home parent; r = 0.70, p < 0.001) and five careers which tend to be gender balanced (Bureau of Labor Statistics, 2017; retail manager, accountant, business account manager, financial analyst and marketing manager; α = 0.94). These items were not central to our hypotheses and will not be discussed further in this paper.

#### **Support for salary increases**

Participants first read a paragraph describing that employees in HEED careers are typically paid less than those in STEM careers, despite requiring similar amounts of education and work hours. Participants then rated their agreement with seven statements on the value of policies and governmental action aimed at increasing HEED salaries to match salaries in STEM (e.g., "It would be fair to increase salaries for occupations such as nursing, teaching, and social work until they become similar to salaries in engineering and technology related occupations." and "We do NOT need to try to increase the pay of nurses, teachers and social workers to match those of engineers and computer scientists."; α = 0.91). The full measure is provided in **Appendix 1**.

#### **Support for increasing gender balance**

Next, participants completed two measures that assessed the extent to which they support making efforts toward equal gender representation in HEED careers and STEM careers. First, participants read about gender imbalances in HEED and STEM occupations before rating the extent to which they agree with 10 statements about support for gender balance in HEED (α = 0.93; e.g., "Professions such as nursing, teaching, and social work would be enhanced with a more equal distribution of men and women" and "Policies should be enacted to encourage hiring more men in jobs where they are fewer in number, such as nursing, teaching, and social work"). Next, participants rated 10 parallel statements about support of gender balance in STEM (α = 0.94). To create an index of general support for gender balance, we first z-scored all items for support of gender balance in HEED and in STEM and then averaged these 20 items into the overall index (α = 0.95).
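The index construction described above (standardize each item, then average across items for each participant) can be sketched as follows. This is an illustrative sketch only: the function names and ratings are hypothetical, only three items are shown instead of 20, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize one item across participants (sample SD is an assumption)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def composite_index(items):
    """items: one list of ratings per item, aligned by participant.

    Returns each participant's mean across the z-scored items.
    """
    z_items = [zscore(item) for item in items]
    n_participants = len(items[0])
    return [mean(z[i] for z in z_items) for i in range(n_participants)]


# Hypothetical 1-7 agreement ratings from four participants on three items.
ratings = [
    [2, 4, 6, 7],
    [1, 5, 5, 6],
    [3, 3, 6, 7],
]
index = composite_index(ratings)
print([round(v, 2) for v in index])
```

Because every item is centered before averaging, the resulting index has a mean of zero across participants, so positive scores indicate above-average support.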

#### **Demographics and exploratory variables**

Along with several exploratory variables assessing participants' perceptions of the ideal priorities of a society and compatibility of communal and agentic values, participants completed a standard demographic questionnaire including age, gender, year in school, major, ethnicity, SES, marital status and political orientation. In addition, participants answered two open-ended questions designed to assess what they thought the study was about and whether they had any idea about our specific hypothesis. All measures are listed in the SOM.

# Results and Discussion

#### Gender Differences

#### **Personal values**

Descriptive statistics, correlations, and gender differences for all variables in Study 3 are summarized in **Table 3**. As in the previous studies, a 2 (participant gender) × 3 (value-type: communal, agentic, material) mixed ANOVA yielded the expected interaction, F(1,288) = 5.08, p = 0.007, η<sub>p</sub><sup>2</sup> = 0.03. Bonferroni-corrected pairwise comparisons suggested that, once again, men reported lower communal values than did women, p = 0.033, but men and women showed comparable levels of both agentic values, p = 0.152, and material values, p = 0.911. Comparisons within gender (Bonferroni-corrected) suggested that men endorsed material and agentic values at similar levels, p = 0.556, whereas women endorsed material values significantly more than broad agentic values, p = 0.006. Both men and women, however, reported valuing communion more than either of the two other values, ps < 0.003.

#### **Perceived career attributes**

A goal in this study was to better disentangle perceptions of societal worth of HEED from participants' estimates of the real salary and work hours of HEED and STEM careers in the current labor market. A 2 (participant gender) × 2 (career-type) mixed ANOVA on estimated real salary revealed only main effects of career-type, F(1,289) = 400.88, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.58, and of gender, F(1,289) = 11.77, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.04, but no gender by career-type interaction, F(1,289) = 1.37, p = 0.244, η<sub>p</sub><sup>2</sup> = 0.01. These effects suggested that participants correctly perceived that STEM careers pay higher wages than HEED careers, but also that women generally reported higher salary estimates for both career-types than did men.

A 2 (participant gender) × 2 (career-type) mixed ANOVA on perceived work hours revealed a main effect of career-type, F(1,289) = 6.28, p = 0.013, η<sub>p</sub><sup>2</sup> = 0.02, but no effect of gender, F(1,289) = 0.01, p = 0.928, η<sub>p</sub><sup>2</sup> < 0.001. Importantly, these effects were qualified by a significant interaction, F(1,289) = 7.38, p = 0.007, η<sub>p</sub><sup>2</sup> = 0.03. Simple pairwise comparisons showed that there were no significant gender differences in perceived work hours for either HEED, p = 0.220, or STEM careers, p = 0.212. However, whereas women estimated similar work hours for STEM and HEED, p = 0.878, men thought that employees in STEM careers worked significantly longer hours than those in HEED careers, p < 0.001.

#### **Perceived worth to society**

In line with hypotheses, people's estimates of HEED careers' actual salary, r = 0.63, p < 0.001, and work hours, r = 0.38, p < 0.001, were both positively related to greater perceived societal worth in HEED careers. To test whether gender differences in the worth of HEED careers were robust to these estimates of the actual labor market, we analyzed participants' societal worth ratings in a 2 (participant gender) × 2 (career-type) mixed analysis of covariance (ANCOVA) controlling for estimates of real salary and work hours for both STEM and HEED careers. Adjusted mean estimates from these analyses are displayed in **Figure 7**. Consistent with hypotheses, there was a marginal gender × career-type interaction, F(1,285) = 3.77, p = 0.053, η<sub>p</sub><sup>2</sup> = 0.01. The main effects of career-type, F(1,285) = 0.88, p = 0.348, η<sub>p</sub><sup>2</sup> = 0.003, and gender, F(1,285) = 1.92, p = 0.167, η<sub>p</sub><sup>2</sup> = 0.01, were not significant. As in Study 2, simple pairwise comparisons showed that although women perceived STEM and HEED careers to have similar societal worth, p = 0.591, men perceived STEM to have greater societal worth than HEED, p = 0.024. In addition, men tended to undervalue HEED careers compared to women, p = 0.053, whereas men and women assigned similar societal worth to STEM roles, p = 0.846. The fact that the size of these gender differences was reduced by controlling for participants' estimates of current salary and work hours suggests that the perceived worth ratings were, as we had suspected in the previous studies, somewhat contaminated with perceptions of the real labor market. However, even after accounting for the extent to which perceived worth is tied to perceiving HEED careers as involving fewer work hours and lower salaries, men perceive significantly less worth in HEED compared to STEM careers.<sup>11</sup>

#### **Support for social change**

In addition to their perceptions of the societal worth of HEED and STEM, participants also rated their support for pay increases in HEED (to match those of STEM) and support for gender balance in both HEED and STEM careers. For support for HEED salary increases, we conducted a one-way ANCOVA comparing participants' support for that type of social change, controlling for estimated real salary and work hours in both HEED and STEM careers. Results for HEED salary increase suggested that, as we expected, men tended to support increases in HEED salaries significantly less than did women, F(1,285) = 35.17, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.11.<sup>12</sup> Estimated work hours in HEED, F(1,285) = 3.79, p = 0.055, η<sub>p</sub><sup>2</sup> = 0.01, and in STEM, F(1,285) = 9.87, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.03, were marginal and significant covariates in the model, respectively, whereas pay perceptions in HEED and STEM were not, Fs < 2.40, ps > 0.120. These results suggest that gender differences in support for HEED salary increases are robust to controlling for labor market perceptions.

Furthermore, a 2 (gender) × 2 (career-type) mixed ANCOVA on support for attaining gender balance within each career-type (again controlling for career perceptions) yielded a main effect of gender, F(1,285) = 65, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.19, that was qualified by a significant participant gender × career-type interaction, F(1,285) = 2.08, p = 0.024, η<sub>p</sub><sup>2</sup> = 0.02. Women were more supportive than men of promoting gender balance in HEED as well as in STEM careers, all ps < 0.001. Furthermore, both men and women supported increasing gender balance more in STEM than in HEED, all ps < 0.001, although this difference was significantly larger for women, d = 0.56, than for men, d = 0.24.

#### Do Gender Differences in Values Predict Support for Social Change?

Our primary goal in Study 3 was to test our hypotheses that communal values would predict both societal worth of HEED and support for HEED salary increases. We tested these relationships controlling for participants' perceptions of salary and work hours in the real labor market. We also examined material values as an alternative predictor of these outcomes. In mediational analyses using the PROCESS macro (Hayes, 2012; Model 4), we first regressed societal worth of HEED and support for HEED salary increases (in separate models) onto communal values, agentic values, and material values as simultaneous mediators of the observed gender difference on each variable. As before, analyses with perceived societal worth of HEED as an outcome also controlled for the perceived societal worth of STEM careers, but all analyses also controlled for perceived real salary and work hours of the outcome career-type.<sup>13</sup> Results are summarized in **Figure 8**.

Replicating the results of the prior two studies, communal values predicted a tendency to assign significantly greater societal worth to HEED careers, β = 0.09, SE = 0.04, t(281) = 2.27, p = 0.034, as well as stronger support for HEED salary increases, β = 0.19, SE = 0.05, t(282) = 3.45, p < 0.001. With material values now in the model, agentic values did not uniquely predict perceived societal worth of HEED or support of salary increases, βs < 0.06, ts < 0.74, ps > 0.457. However, the endorsement of material values did significantly predict both lower ratings of societal worth in HEED careers, β = −0.12, SE = 0.05, t(282) = −2.40, p = 0.017, and less support for increasing HEED salaries, β = −0.23, SE = 0.07, t(281) = −3.19, p = 0.002.

Given men's tendency to report lower communal values than did women, bootstrapping analyses revealed significant indirect effects of gender through communal values on societal worth of HEED, a\*b = 0.01, SE = 0.01, bootstrapped CI<sub>0.95</sub> (0.001, 0.03), as well as support for HEED salary increases, a\*b = 0.02, SE = 0.01, bootstrapped CI<sub>0.95</sub> (0.004, 0.06). Given the absence of any gender differences in agentic and material values, indirect effects through these variables were non-significant, all a\*bs < 0.01, ps > 0.05. These effects provide evidence that men's relatively lower communal value endorsement can partly account not only for their different evaluations of HEED roles, but might also explain why men, compared to women, are less concerned about efforts to promote higher salaries paid to HEED careers. Moreover, these results address concerns that the previously observed effects might be biased by participants' awareness of the actual salary and work hours of these careers on the labor market.
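To make the logic of a bootstrapped indirect effect concrete, the following is a minimal sketch of the general technique, not a reproduction of the PROCESS analyses reported here. It assumes a single mediator, no covariates, and a percentile confidence interval; the data are simulated, and the variable names (`gender`, `communal`, `outcome`), the 0/1 gender coding, and the path sizes used to generate the data are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """a*b, with a from M ~ X and b from Y ~ M + X (ordinary least squares)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a*b, resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    return estimates[int(alpha / 2 * n_boot)], estimates[int((1 - alpha / 2) * n_boot) - 1]


# Simulated data only: hypothetical coding 0 = men, 1 = women, arbitrary path sizes.
rng = random.Random(42)
n = 300
gender = [rng.choice([0, 1]) for _ in range(n)]
communal = [0.5 * g + rng.gauss(0, 1) for g in gender]
outcome = [0.4 * c + 0.2 * g + rng.gauss(0, 1) for g, c in zip(gender, communal)]

point = indirect_effect(gender, communal, outcome)
lower, upper = bootstrap_ci(gender, communal, outcome)
print(f"a*b = {point:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

An indirect effect is conventionally treated as significant when the bootstrapped interval excludes zero, which is the criterion applied to the intervals reported above.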

In additional secondary analyses, we tested communal, agentic, and material values as simultaneous mediators of (a) the gender difference in general support for gender balance (averaged responses to both STEM and HEED questions), as well as (b) the gender difference in supporting increased gender balance specifically in HEED (controlling for support of gender balance in STEM). As before, all analyses also control for the estimated real salary and work hours of both HEED and STEM in the outcome. Analyses on general support for gender balance revealed that communal values did significantly relate to greater support for increasing overall gender balance in careers, β = 0.16, SE = 0.04, t(282) = 2.85, p = 0.005, and the previously described gender difference in communal values thus accounted for a significant proportion of the gender difference in support of gender balance, a\*b = 0.02, SE = 0.01, bootstrapped CI<sub>0.95</sub> (0.002, 0.06). After accounting for gender differences in communal values, female gender still predicted higher support of general gender balance, β = 0.43, SE = 0.04, t(282) = 7.82, p < 0.001. In contrast, analyses revealed that none of the three personal values significantly predicted support for gender balance in HEED specifically, after controlling for gender balance in STEM, βs < 0.08, ts < 1.35, ps > 0.180, and thus none of the indirect effects were significant, a\*bs < 0.006.<sup>14</sup> Thus, those who are more communal support reducing occupational segregation in both male and female-dominated roles, not only in HEED.

<sup>11</sup>Without the covariates entered, gender differences show very similar patterns to Studies 1 and 2. The notable difference is that, without covariates, the gender × career-type interaction is significant, p = 0.010, and the gender difference in societal worth of HEED is also significant, p = 0.044.

<sup>12</sup>This gender difference is similar, η<sub>p</sub><sup>2</sup> = 0.13, without covariates in the model.

<sup>13</sup>The indirect effect suggesting that communal values mediate gender differences in societal worth of HEED is robust to removing all covariates except perceived societal worth of STEM. When societal worth of STEM is removed, the relationship between communal values and HEED worth is of similar magnitude but non-significant, β = 0.07, SE = 0.06, t(289) = 1.16, p = 0.247.

<sup>14</sup>Parallel analyses with societal worth of STEM as outcome suggested that communal values did not predict societal worth of STEM careers after accounting for covariates. Despite the absence of gender difference in material values, having stronger material values did, however, predict greater perceived societal worth of STEM. Parallel analyses on gender equality support for STEM also showed no significant relationships or indirect effects through any of the value variables. Details in the SOM.

#### Does Societal Worth Mediate Gender Difference in Support for Salary Increases in HEED?

Given that communal values predicted both the perceived societal worth of HEED as well as support of salary increases in HEED, a final analysis examined whether the gender difference in support for salary increases was mediated by the perceived societal worth of HEED. To test this, we conducted serial mediation analyses with the PROCESS macro (Hayes, 2012; Model 6), entering gender as the independent variable, communal values as the first mediator, and societal worth of HEED as the second mediator in predicting support for salary increases in HEED careers as the outcome. Again, all paths controlled for agentic values, material values, and perceived pay and work hours for HEED and STEM, as well as societal worth of STEM. Results of bootstrapping analyses, summarized in **Figure 9**, revealed a significant serial mediation (gender → communal values → societal worth of HEED → support for salary increases), a<sub>1</sub>\*a<sub>2</sub>\*b = 0.005, SE = 0.003, CI<sub>0.95</sub> (0.001, 0.01). In addition, results suggested that both simple indirect effects also remained significant: (1) gender → communal values → support for salary increases, a<sub>1</sub>\*b = 0.02, SE = 0.01, CI<sub>0.95</sub> (0.01, 0.05), and (2) gender → societal worth of HEED → support for salary increases, a<sub>2</sub>\*b = 0.03, SE = 0.01, CI<sub>0.95</sub> (0.002, 0.06). These results suggest that gender differences in communal values and perceived societal worth of HEED, combined, explain 15% of the variance in men's tendency to be less supportive of increases in HEED salaries than are women.
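The three indirect pathways in a two-mediator serial model can be illustrated with a small sketch. This is a generic illustration of the technique under stated assumptions, not the analysis reported here: it omits all covariates and the bootstrap step, uses toy data constructed so that every path coefficient is known exactly, and labels the paths `a1`, `a2`, `d21`, `b1`, and `b2`, which is a common convention for this kind of model but an assumption relative to the notation in the text. The helpers `ols` and `serial_indirect_effects` are hypothetical names.

```python
from itertools import product


def ols(y, *predictors):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def serial_indirect_effects(x, m1, m2, y):
    """Indirect effects of X on Y through M1 only, M2 only, and M1 then M2."""
    a1 = ols(m1, x)[1]                  # X -> M1
    _, a2, d21 = ols(m2, x, m1)         # X -> M2 and M1 -> M2
    _, _, b1, b2 = ols(y, x, m1, m2)    # M1 -> Y and M2 -> Y, controlling for X
    return {"via_m1": a1 * b1, "via_m2": a2 * b2, "serial": a1 * d21 * b2}


# Toy data: a full factorial of X and two residual terms, so the residuals are
# exactly uncorrelated with the predictors and the paths are recovered exactly.
design = list(product([0.0, 1.0], [-1.0, 1.0], [-1.0, 1.0]))
x = [d[0] for d in design]
m1 = [1.0 * d[0] + d[1] for d in design]                                 # a1 = 1
m2 = [0.5 * xi + 2.0 * mi + d[2] for xi, mi, d in zip(x, m1, design)]    # a2 = 0.5, d21 = 2
y = [0.3 * xi + 1.0 * mi + 3.0 * ni for xi, mi, ni in zip(x, m1, m2)]    # b1 = 1, b2 = 3

effects = serial_indirect_effects(x, m1, m2, y)
print(effects)
```

In an applied analysis, each of these products would be bootstrapped and the covariates entered in every regression; the sketch only shows how the three products are assembled from the separate regressions.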

# GENERAL DISCUSSION

Despite their importance to the well-being of societies, HEED careers are devalued both on a personal and a societal level, perhaps especially by men. The first aim of the current research was to apply the goal congruity perspective – the idea that we evaluate careers based on how they fit our personal values (Diekman et al., 2017) – to understand men's relative lack of personal interest in adopting careers in healthcare and education. Studies 1 and 2 provided support for our prediction that men's relatively lower communal values partially accounted for men's (compared to women's) lack of interest in HEED careers. Just as past research suggests that women are deterred from STEM careers due to perceiving them as incompatible with their strong communal values (see Diekman et al., 2017), these findings lend support to the assertion that men's relatively lower internalization of communal values leads them to see communal careers in healthcare and teaching as less attractive career options.

A more novel contribution of the current research was to extend the tenets of the goal congruity perspective to understand men's, but also women's, tendency to devalue HEED careers. In all three studies, we found that HEED (compared to STEM) careers are seen as providing less worth to society, in line with predictions derived from status-value theory. As expected, men devalue HEED careers more than do women – they perceive HEED as having somewhat less societal worth (all studies) and are significantly less supportive of increasing HEED salaries (Study 3). In addition, evidence suggests that these gender differences can be explained by goal congruity processes. Men's, compared to women's, relatively lower communal values partially accounted for their tendency to perceive lower societal worth and to be less supportive of salary increases for HEED. In turn, these perceptions of societal worth (as predicted by their lower communal values) also predict men's relatively low interest in taking on HEED careers in the future.

In addition to explaining gender differences in HEED perceptions, our results have implications for the broader way that goal congruity processes shape people's perceptions of what roles have worth. Whereas actual HEED and STEM salaries are realistically shaped by structural factors – such as their disproportionate representation in the public vs. private sector, respectively – our evidence suggests that men's and women's desire to afford certain careers with higher salaries is predicted, at least in part, by the basic values they internalize. Even when controlling for perceptions of current labor market characteristics such as actual salary and work hours (Study 3), individual differences in communal values consistently predict perceptions of the societal worth and support for salary increases in HEED careers – over and above perceiver gender. These novel findings suggest that the abstract values we espouse can directly account for our willingness to take on certain careers ourselves and even predict which careers we perceive as worthwhile to society in general.

Although we focused our investigation on the role of communal values in the gendered perception of HEED careers, we also assessed whether other dimensions of individual differences – broad agentic values, or trait competitiveness and material values – might relate to men's and women's tendency to devalue HEED careers at a personal and/or societal level. Our findings suggest that none of these additional value dimensions can account for gender differences in perceptions of HEED careers. Yet, we find some evidence that, over and above gender, individuals who value agency more highly, and specifically those who value material gains, tend to perceive HEED careers as having lower societal worth. This is especially meaningful since historical data trends show a general increase in agentic self-evaluations (i.e., achievement motivations) among both men and women in America in the last 40 years (Twenge et al., 2012). Whereas future research should aim to replicate these effects, our work provides preliminary evidence that valuing independence, status, and especially wealth is linked to the perception that communally oriented HEED careers provide less worth to society than STEM careers.

# Limitations and Future Directions

Whereas the current research is, to our knowledge, the first to apply a goal congruity lens to men's broad evaluations of HEED careers, our research methodology has important limitations. First, the correlational nature of our analyses prevents strong conclusions that communal values cause evaluations of HEED careers. However, given the conceptualization of values as relatively stable (Trapnell and Paulhus, 2012), it is somewhat less likely that evaluations of specific HEED roles cause the broader communal values one endorses. In addition, one possible limitation of the current research is that Study 2 was initially designed as an experimental test of the effects of competitiveness on career evaluations. We adapted a manipulation that, in past studies (Liberman et al., 2004), had successfully primed competitive vs. cooperative mindsets in a prisoner's dilemma game. As detailed in the SOM, this manipulation failed to show any effects on participants' choice of how to play the game. It is unclear why we failed to find effects of this manipulation on provoking a competitive mindset or behavior. Yet, Study 2 is well-powered, like all other studies in the paper, and closely replicates results from Study 1 with almost identical measures. In addition, results remain unchanged when controlling for condition (see SOM), further assuaging any potential concerns that this failed manipulation eroded our ability to test correlational hypotheses.

Our conclusions are further limited by the nature of our measures. Whereas we took care to design measures that were face-valid and intuitive to our participants, our measures ask participants to make relatively explicit judgments about careers which may or may not predict their actual behavior or decision making. First, we asked participants to assign an ideal salary based on a career's value to society, but the construct value or worth can be construed in a number of ways (e.g., value to the survival vs. the productivity of society). Future research should consider different operationalizations of perceived societal worth. Second, future research might also use behavioral measures of career evaluations (e.g., actual donations to career-training programs) to assess the realistic consequences of people's evaluations. Third, given that people tend to have poor introspective insight for their motivations (Nisbett and Wilson, 1977), and that reporting high levels of communal values is socially desirable (Fiske et al., 2007; Fiske, 2018), future researchers might consider measuring communal value endorsement with more indirect or implicit measures.

Moreover, despite our attempts to rule out possible alternative explanations for our findings, such as current labor market conditions biasing perceptions of HEED careers, the correlational nature of our analyses prevents us from conclusively ruling out other forces that might play into the devaluation of HEED roles. For example, both social role theory (Eagly, 1987; Eagly and Wood, 2012) and the status-value asymmetry perspective (Schmader et al., 2001) would suggest that the mere fact that women are overrepresented in HEED can itself influence how these careers are perceived. Future research should aim to disentangle the effects of gender representation in a given career from the effects that a career's value-affordances have on its perceived societal worth, perhaps using novel or ambiguous occupational descriptions.

On a related note, given that we only provide correlational evidence, future research should also consider experimental tests of the relationship between personal values and HEED evaluations. Even if the relationship between individual differences in communal values and perceptions of HEED is not spuriously caused by a third variable, it is unclear whether or not increasing men's communal values could directly increase perceived worth of HEED careers. Men, in most societies, face rigid masculine gender role norms and, consequently, are wary of transgressing such norms (Vandello and Bosson, 2013). Thus, theorists have suggested that gender role norms (Croft et al., 2015) and especially the expectation to become the primary breadwinner (Diekman et al., 2017) might constrain men's career aspirations and evaluations, even if a given career would match their personal values. Future research might explore different avenues for creating a better match between HEED roles and men's internal values – e.g., by increasing men's communal values directly, or reframing the value-affordances of HEED – in conjunction with efforts to remove external normative pressures for men to devalue HEED careers.

Given our restricted sample of North American undergraduates, the generalizability of our results also remains an open question. Our findings could potentially provide a framework for understanding cultural differences in the status and pay of careers, because not all cultures undervalue their healthcare workers and teachers. In Finland, for example, teaching ranks among the most highly respected and desirable occupations (Ahonen and Rantala, 2001). Past research suggests that in collectivistic cultures, both men and women see themselves as more communal (Cuddy et al., 2015). In light of our findings, future research should sample more diverse populations, and examine whether cultural differences in men's communal values might explain the status and pay of HEED careers differently by country or cultural backdrop.

# Implications

Our findings lead to new directions for understanding how we evaluate male- vs. female-stereotypic careers. In the interview quoted at the beginning of this article, Anne-Marie Slaughter suggests that true gender equality will only become feasible if we can encourage both men and women to perceive communal roles as more worthwhile. Our findings highlight that men's and women's basic communal value endorsement is related to such perceptions of HEED as worthwhile. Because previous research suggests that especially men can confer status onto careers (Reskin, 1988; Schmader et al., 2001; Major et al., 2002) and are seen as the standard for societal ideals (Cuddy et al., 2015), elevating communal activities in the eyes of men might be the first step toward increasing the status of vital HEED careers.

# ETHICS STATEMENT


All studies were conducted after review and approval from the Behavioural Research Ethics Board of the University of British Columbia and in line with current guidelines of the Canadian Tri-Council Policy Statement. Studies were run under approved applications H10-03173 and H15-00087. All subjects completed an informed consent and were informed of any deception after the study.

# AUTHOR CONTRIBUTIONS

KB and TS worked together to conceptualize the hypotheses and study design. KB spearheaded data collection and analyzed data under the supervision of TS. AC helped conceptualize Study 3 and provided critical feedback and edits throughout the data analysis and writing process.

# REFERENCES


# FUNDING

This research was supported by a grant from the Social Sciences and Humanities Research Council of Canada, awarded to TS (895-2017-1025).

# ACKNOWLEDGMENTS

We sincerely thank the research assistants of the Social Identity Lab, especially Gaylean Davies, Puneet Sandhu, Sheila Wee, Jason Proulx, and Ryan Villamin, without whom this research would not have been possible. We also thank Audrey Aday, Lucy De Souza, Eisha Sharda, and Antonya Gonzalez for comments on an earlier draft of the manuscript.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01353/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Block, Croft and Schmader. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX 1


# Support for Salary Increases

Not all careers in society are paid the same, even if they require very similar levels of education and work hours. Careers in healthcare, teaching and social work currently pay less than careers in technology and engineering that require similar levels of education. Please indicate the degree to which YOU agree or disagree with the following statements?

(1 = I strongly disagree, to 7 = I strongly agree)


# Support for Increasing Gender Balance

Men and women are currently unevenly distributed in different occupations. While there are more women in healthcare, teaching and social service professions, there are more men in engineering, technology and upper management professions. Please indicate the degree to which YOU agree or disagree with the following statements?


# Uncovering Pluralistic Ignorance to Change Men's Communal Self-descriptions, Attitudes, and Behavioral Intentions

Sanne Van Grootel<sup>1</sup>\*, Colette Van Laar<sup>1</sup>, Loes Meeussen<sup>1,2</sup>, Toni Schmader<sup>3</sup> and Sabine Sczesny<sup>4</sup>

<sup>1</sup> Center for Social and Cultural Psychology, University of Leuven, Leuven, Belgium, <sup>2</sup> Fonds Wetenschappelijk Onderzoek, Brussels, Belgium, <sup>3</sup> Department of Psychology, University of British Columbia, Vancouver, BC, Canada, <sup>4</sup> Institute of Psychology, University of Bern, Bern, Switzerland

#### Edited by:

Leigh Ann Vaughn, Ithaca College, United States

#### Reviewed by:

Frank Zenker, Lund University, Sweden

Eric Mayor, University of Neuchâtel, Switzerland

> \*Correspondence: Sanne Van Grootel sanne.vangrootel@kuleuven.be

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

> Received: 27 April 2018 Accepted: 13 July 2018 Published: 10 August 2018

#### Citation:

Van Grootel S, Van Laar C, Meeussen L, Schmader T and Sczesny S (2018) Uncovering Pluralistic Ignorance to Change Men's Communal Self-descriptions, Attitudes, and Behavioral Intentions. Front. Psychol. 9:1344. doi: 10.3389/fpsyg.2018.01344

Gender norms can lead men to shy away from traditionally female roles and occupations in communal HEED domains (Healthcare, Early Education, Domestic sphere) that do not fit within the social construct of masculinity. But to what extent do men underestimate the degree to which other men are accepting of men in these domains? Building on research related to social norms and pluralistic ignorance, the current work investigated whether men exhibit increased communal orientations when presented with the true norms regarding men's communal traits and behaviors vs. their perceived faulty norms. Study 1 (N = 64) revealed that young Belgian men indeed perceive their peers to hold more traditional norms regarding communal and agentic traits than their peers actually hold. Study 2 (N = 319) presented young Belgian men with altered norms to manipulate exposure to men's actual normative beliefs (i.e., what men truly think), their perceived norms (i.e., what men believe other men think), or a no information control. When men were presented with actual rather than perceived norms, they altered their own self-descriptions, future behavioral intentions, and broader gender-related social attitudes in a more communal direction. In particular, men who were presented with information about men's actual beliefs regarding the compatibility between communal and agentic traits exhibited the strongest movement toward a more communal orientation.
The findings show that participants in conditions that uncover pluralistic ignorance adapted their attitudes and behaviors to be more in line with the actual norm: adopting a more communal self-concept, having lower intentions to hide future communal engagement, and supporting more progressive gender-related social change. The results are discussed in terms of influences of norms on men's communal orientations and broader attitudes toward gender-related social change, and the downstream implications for increased gender-equality in HEED domains where men remain highly underrepresented.

Keywords: pluralistic ignorance, changing norms, men in HEED, communal attitudes, stereotypes, gender segregation

# INTRODUCTION


Gender continues to be a driving force behind men's and women's self-selection into some careers and not others. Although real and perceived biases can create obstacles to entry, gender stereotypes can also constrain the interests that men and women have. Moreover, much of the social psychological work on occupational segregation predominantly focuses on women and their underrepresentation in fields often dominated by men, such as science, technology, engineering, and mathematics (i.e., STEM). However, a limited amount of research has focused on the other side of the coin: men's underrepresentation in fields dominated by women, for example in health care, elementary education, and roles in the domestic sphere (i.e., HEED; Croft et al., 2015). Although the percentage of women in traditionally male-dominated roles has risen somewhat over the last half-century, men's entry into communal HEED fields traditionally dominated by women has remained fairly low (Croft et al., 2015; Levanon and Grusky, 2016). In HEED fields, in particular, communal qualities are required that embrace the typical female stereotype, focusing on emotional sensitivity and concern for others, such as being kind and considerate, and being understanding and perceptive. On the other hand, in STEM fields, in particular, agentic qualities are required that embrace the typical male stereotype, focusing on autonomy and achievement, such as being independent, competent, and results-oriented (Heilman, 2012). Gender differences in the degree to which boys and girls value communion and agency have been found starting already in childhood (Block et al., 2018).

The lack of men in communal fields and domestic roles is concerning. As we will discuss below, when men do engage in communal roles, men, women, children, as well as society as a whole benefit from their active involvement (e.g., Croft et al., 2014). Despite these personal and relational benefits to being communal, those men that have a strong interest in engaging in communal roles may experience societal pressures that keep them out of these roles. Thus, it is of high importance to examine the barriers that men face engaging in communal roles. The current work focuses on how social norms can influence men's communal attitudes. More specifically, we aim to understand what norms young men have about communal roles, and how these norms can influence young men's self-descriptions and attitudes toward their own communal engagement.

As noted, despite their underrepresentation in communal roles and behaviors, there are many benefits to men when they do engage in these roles. When engaging in communal roles, men report increased psychological health, higher marital satisfaction (as do their partners; Pleck and Masciadrelli, 2004; Knoester et al., 2007; Duckworth and Buzzanell, 2009; Fischer and Anderson, 2012), and higher happiness and overall life satisfaction (e.g., Fleeson et al., 2002; Sheldon and Cooper, 2008; Le et al., 2013, 2018).

Men's communal engagement is paired with benefits not only for the men themselves, but also for those in their surroundings. Women in dual earner households often face what is called the second shift whereby they engage in more household chores and childcare than their male partner (Milkie et al., 2009; Hochschild and Machung, 2012; Croft et al., 2014). But women who have male partners who are more domestically involved have more flexibility to pursue career ambitions, decreasing the second shift for women. Increased male engagement in domestic roles can thus lift some of the burdens that women face and in turn provide flexibility for women to pursue their career ambitions, closing the gender career achievement gap.

Not only women, but children too experience benefits when men take on communal roles, especially in the domestic sphere. Children show increased cognitive and social development when their fathers engage more in childcare (Marsiglio et al., 2000). Also, girls benefit from their fathers' involvement in their upbringing by reporting less traditional occupational aspirations and less traditional self-stereotyping (Croft et al., 2014). On a larger societal scale, increasing men's representation in communal occupations might also provide young boys with salient role models in HEED (e.g., Cochran and Brassard, 1979). For example, having a male elementary school teacher increases the salience of men in that role and may in turn weaken children's stereotypes (Carrington et al., 2008; Croft et al., 2015). Similar processes are likely to work in other HEED fields, such as in nursing. The shortage of elementary teachers and nurses in many western nations presents an important opportunity to meet these labor shortages by boosting men's interest in these fields.

Despite these many benefits, men have only increased their engagement in communal roles and behaviors slightly (Bianchi, 2011). Gender norms and roles play an important role in maintaining this inequality for men, as they provide strong ideas about what men are and should be like. Social role theory posits that the roles people enact are influential in shaping the traits they are believed to possess. When biological and historical forces lead men and women to self-segregate into different roles, this role segregation then shapes the stereotypes believed to define gender differences (e.g., Eagly, 1987; Eagly et al., 2000). In this way, men's historical roles as leaders, protectors, and defenders leads to a stereotype that men relative to women are more competitive, aggressive, strong, and status-seeking. Traits less associated with the male identity are communal traits, such as being compassionate, warm, understanding, etc. (Burgess and Borgida, 1999; Prentice and Carranza, 2002; Rudman and Fairchild, 2004; Diekman and Goodfriend, 2006).

Although stereotypes can be merely descriptive (i.e., this is what men are like), they often become prescriptive norms that play an important role in maintaining traditional male identity by dictating how men ought to be. When men adhere to such norms, their masculine identity is affirmed (e.g., Vandello et al., 2008) and they are socially validated (i.e., role congruity theory, Eagly and Diekman, 2005). Conversely, when men behave in a way that is not in accordance with these norms – for example by portraying more communal and less agentic traits or behaviors – they may experience economic and social penalties (e.g., Rudman and Fairchild, 2004; Moss-Racusin et al., 2010). In order to avoid such penalties, men may seek to adhere to masculine expectations and roles that society imposes, and continuously (re)assert their male identity by engaging in behaviors that conform to the perceived norm of how men should behave (see the social identity approach; Tajfel and Turner, 1979; Turner et al., 1987). This may lead men to refrain from communal behaviors and roles and engage in behaviors that endorse the masculine norm.

Thus far, we have argued that men might avoid communal roles and careers because communal behaviors are incongruent with gender norms, and men may thus expect others to see communal behaviors as "unmanly." In response, men may avoid or hide communal behaviors and seek to confirm their masculine identity by behaving in ways they think other men in the group behave. Adhering to masculine norms can be done in many positive ways such as working hard, being a good leader, and engaging in sports. Yet research shows that adhering to these norms is also done through risky behaviors such as excessive use of alcohol and drugs (e.g., Locke and Mahalik, 2005; Mahalik et al., 2007; European Union, 2011; SAMHSA, 2015) and risky financial behaviors (Weaver et al., 2013). However, what if men's perceptions of other men's beliefs are wrong and men are thus unnecessarily refraining from communal roles and engaging in possibly risky behaviors? What if these behaviors are the result of pluralistic ignorance? Pluralistic ignorance is the (incorrect) belief that one's personal attitudes are different from the majority's attitudes, and thus one goes along with what one thinks others think (Miller and McFarland, 1991). Pluralistic ignorance thus occurs when people do (not) engage in certain behaviors because they think others would (not) engage in those behaviors (e.g., Miller and McFarland, 1991; Stangor et al., 2001; Sechrist and Stangor, 2005). For example, people's saving decisions may be influenced by what they think others do or do not save (and may even overshadow their own preference) regardless of whether this is the best financial decision or not. Specifically, people may not think it is important to invest in a 401(k) pension plan but, when hearing that others are doing so, may increase their engagement in those behaviors (Sunstein and Thaler, 2003).

The effects of pluralistic ignorance on behavior have been investigated extensively pertaining to alcohol consumption (e.g., Prentice and Miller, 1993; Schroeder and Prentice, 1998; Suls and Green, 2003). Findings indicate that college students often overestimate the social norm related to drinking behavior, and this leads students to engage in excessive drinking with the goal of fitting in, without necessarily having the goal of excessive consumption (Prentice and Miller, 1996). Related to the current topic, research has shown that there may also be pluralistic ignorance in masculinity norms: men tend to overestimate how aggressive their peers are, overinvest in aggression themselves, and overestimate the extent to which their peers would approve of their aggressive behavior (Bosson et al., 2009; Vandello et al., 2009). We extend this past research by hypothesizing: (a) that men might underestimate other men's acceptance of communion, and (b) that this underestimation inhibits their engagement in traditionally female communal roles and behaviors.

In the current research, we first examined in Study 1 whether men underestimate the degree to which other men around them value communal behaviors, and to what extent this potentially faulty norm (mis)fits the way they see themselves. By altering these faulty norms in Study 2, we examine whether exposure to different norms about what traits are valued by their peers (i.e., other students at their university) influences men's own communal self-descriptions, intentions to hide future communal engagement, and broader attitudes toward gender-related social change.

# STUDY 1

The goals of Study 1 were to establish whether there is pluralistic ignorance regarding what personality traits and characteristics are normative for men and whether such faulty norms do or do not reflect the way men see themselves. Firstly, we expected pluralistic ignorance in communal traits as evidenced by a discrepancy between men's own communal descriptions of the ideal man and how they think others in their cohort would describe the ideal man. We hypothesized that the ratings of men's own ideal man would be higher in communion than their peers' perceptions of the ideal man, i.e., ratings by others in their student and age cohort (Hypothesis 1). We did not have a clear hypothesis for agentic traits. On the one hand, there could be pluralistic ignorance in agentic traits such that men's own ideal man would be lower in agency than their perception of others' ideal man (in line with research showing that men tend to overestimate the extent to which their peers approve of aggressive behavior; Vandello et al., 2009). On the other hand, there might not be pluralistic ignorance regarding agentic traits since masculine norms are most often communicated in terms of agency, and thus may be more accurately known. Secondly, we expected that this (incorrect) perception of what others expect of a man would provide an unattainable norm for men, as evidenced by a discrepancy between how men describe themselves and how men think their peers describe the ideal man. We hypothesized that men describe themselves as more communal and less agentic than how they think others in their cohort describe the ideal man, suggesting the perception of an unattainable norm (Hypothesis 2).

# Methods

## Participants

Study 1 was completed by 71 Belgian male university students. We excluded 7 participants who self-identified as not exclusively heterosexual (because they might be subject to different norms; see also Vandello et al., 2008) or who were born before 1990 (and thus did not match the student age cohort). The resulting 64 participants (Mage = 21.28, SD = 2.08) were enrolled in different majors, with most enrolled in engineering (32%) and psychology (32%).

## Procedure

The protocol was approved by the University of Leuven's University Social and Societal Ethics Committee. Belgian male university students participated for the chance to win a gift card to a local store popular amongst students. Participants were recruited via social media and through flyers, and were invited to participate in an online study that took approximately 5 min. After providing informed consent as was specified in the ethics application, participants completed the questionnaire which included both demographic questions and the key trait-description measures. Finally, participants were debriefed.

#### Measures

Participants were asked to rate themselves and the ideal man (both from their own and their perception of their peers' perspective) on a list of 12 agentic traits (e.g., dominant, competent) and 14 communal traits (e.g., warm, dependent) (based on Abele, 2003; Cuddy et al., 2004; see **Appendix 1** for the complete measures). The order of the 26 traits was randomized between participants within each of the three sections.

#### **Self-Description**

Participants first indicated to what extent the 12 agentic and 14 communal traits described themselves on a scale from 1 – not at all to 7 – very much (αagentic = 0.77 and αcommunal = 0.81).

#### **Own Ideal Man**

Participants then were asked to indicate to what extent they thought the same agentic and communal traits described the ideal man on a scale from 1 – not at all to 7 – very much (αagentic = 0.79, αcommunal = 0.81).

#### **Other Ideal Man**

Lastly, participants were asked to indicate to what extent they thought these communal and agentic traits described what their peers (i.e., others in their student and age cohort) thought was the ideal man on a scale from 1 – not at all to 7 – very much (αagentic = 0.84, αcommunal = 0.83).
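The reliability coefficients (α) reported for these scales are Cronbach's alpha. As a minimal sketch of how such a coefficient can be computed, assuming the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the summed scores), the following Python function takes one row of item ratings per participant. The function name and the ratings are hypothetical and are not data from the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants, each a list of k item ratings."""
    k = len(rows[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each participant's summed score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 1-7 ratings from five participants on three items (illustration only).
ratings = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Items that rise and fall together across participants push the coefficient toward 1, which is why values around 0.8, as reported here, are usually read as adequate internal consistency.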

#### Analyses

The data were analyzed with paired-samples t-tests examining the difference between participants' perception of the ideal man and how they thought their peers would describe the ideal man in terms of communion and agency (Hypothesis 1). A second set of t-tests compared participants' self-descriptions with how they thought their peers would describe the ideal man in terms of communion and agency (Hypothesis 2). A post hoc power analysis conducted with G∗Power (Faul et al., 2007) indicated that this sample size (N = 64) is sufficient to capture a moderate effect size of r = 0.30 with power of 76.7%. Power for each separate effect can be found in **Appendix 2**. Results fully replicated when controlling for age, ethnicity, and study major.

In order to make adjustments for multiple comparisons, we applied the Bonferroni correction, in which the critical value of significance was lowered from p = 0.05 to p = 0.0125 (α/m, m being the number of tests conducted, in this case four tests).
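The correction can be illustrated with a short Python sketch. The labels are hypothetical shorthand for the four tests, and p-values that the Results report only as "p < 0.001" are represented here by the bound 0.001, which is an assumption made purely for illustration.

```python
# Bonferroni correction as described in the text: alpha / m.
alpha = 0.05
m = 4  # number of tests conducted
threshold = alpha / m

# p-values as reported in the Results; values given as "p < 0.001"
# are represented by their upper bound 0.001 (assumption for illustration).
p_values = {
    "H1 communion": 0.001,
    "H1 agency": 0.29,
    "H2 communion": 0.052,
    "H2 agency": 0.001,
}

# A test counts as significant only if its p-value falls below the corrected threshold.
significant = {name: p < threshold for name, p in p_values.items()}
for name, flag in significant.items():
    verdict = "significant" if flag else "not significant"
    print(f"{name}: {verdict} at p < {threshold}")
```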

# Results

First, we compared participants' own descriptions of the ideal man with their perceptions of their peers' descriptions of the ideal man to investigate whether there was indeed pluralistic ignorance. Results (as presented in **Figure 1** and **Table 1**) showed that participants described the ideal man as more communal than they thought their peers would describe the ideal man, paired samples t(63) = 3.88, p < 0.001, d = 0.49 (significant at the p < 0.0125 level as required by the Bonferroni correction). Thus, the male participants as a group indicated a more communal ideal than they thought their peers would report. Interestingly, men did not describe the ideal man as less agentic than what they believed their peers would report, paired samples t(63) = –1.07, p = 0.29, d = –0.13. This result is consistent with Hypothesis 1, postulating that there is indeed pluralistic ignorance with regard to masculinity norms, and that this pluralistic ignorance is specific to communal traits.

TABLE 1 | Means and standard deviations for Study 1 trait descriptions.

Second, we compared participants' self-descriptions with their perception of their peers' descriptions of the ideal man to investigate whether this perceived norm would be experienced as unattainable. Results (as presented in **Figure 2**) showed a trend such that participants thought that their peers would describe the ideal man as less communal than they on average actually described themselves, paired samples t(63) = –1.98, p = 0.052, d = –0.25, yet this effect did not reach significance. Also, participants thought that their peers would describe the ideal man as more agentic than they on average described themselves, paired samples t(63) = –6.32, p < 0.001, d = –0.79 (significant at the p < 0.0125 level as required by the Bonferroni correction). These results suggest that, in line with Hypothesis 2, men perceive that the ideal man is an unattainable norm, especially in terms of agency.
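The effect sizes reported for Study 1 appear consistent with the common paired-samples convention d = t/√N. This is an assumption, since the text does not state how d was computed. A short Python check using the four reported t values and N = 64 (the labels are hypothetical shorthand):

```python
from math import sqrt

N = 64  # participants in Study 1

# (label, reported t, reported d) taken from the Study 1 Results.
reported = [
    ("H1 communion", 3.88, 0.49),
    ("H1 agency", -1.07, -0.13),
    ("H2 communion", -1.98, -0.25),
    ("H2 agency", -6.32, -0.79),
]

def d_from_t(t, n):
    """Paired-samples effect size d_z = t / sqrt(n) (assumed convention)."""
    return t / sqrt(n)

for label, t, d in reported:
    print(f"{label}: d_z = {d_from_t(t, N):.3f} (reported d = {d})")

# Largest absolute gap between recomputed and reported values.
max_gap = max(abs(d_from_t(t, N) - d) for _, t, d in reported)
print(f"largest gap = {max_gap:.3f}")
```

Small gaps are expected because the t values themselves are rounded to two decimals.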

## Discussion

The goal of Study 1 was to establish that men experience pluralistic ignorance and perceive an unattainable norm regarding what traits are deemed desirable and normative for men. Results of this study indicated that indeed there is pluralistic ignorance regarding communal traits as men described the ideal man as more communal than they thought their peers would describe the ideal man. There was no pluralistic ignorance with regard to agentic traits: men's own perception of the ideal man was not more or less agentic than the perceptions they believed are held by their peers. Conversely, it was mainly agentic traits that provided an unattainable ideal for men (in line with research on precarious manhood and masculinity threat, e.g., Vandello et al., 2008; Bosson and Vandello, 2011), since men described themselves as less agentic than how they believed their peers would describe the ideal man.

Experiencing pluralistic ignorance regarding certain norms reinforces those norms (e.g., Schroeder and Prentice, 1998; Stangor et al., 2001; Sechrist and Milford, 2007). In this case, experiencing pluralistic ignorance regarding what traits are deemed desirable for men is likely to reinforce traditional gender roles and norms of men as needing to be high in agency and low in communion (e.g., Eagly and Steffen, 1984). The findings of Study 1 imply that men may engage in certain behaviors that are not necessarily representative of how they describe the self in order to behave in what they perceive to be a socially desirable or normative manner, even though this may in fact be based on inaccurate information. Adopting traits and behaviors that match a perceived norm but perhaps not the real norm, may thereby actually be reinforcing these (inaccurate) norms, lowering engagement in communal traits and behaviors, and maintaining traditional gender roles and inequalities.

In sum, this study provides the first evidence that men perceive a norm that may not be the actual norm, since men as a group are interested in being more communal than they think their peers expect men to be, and describe the self as less agentic than they think others in their cohort expect men to be. Study 2 sets out to examine what happens when we alter these perceived norms.

# STUDY 2

In Study 2, we set out to examine whether men's communal attitudes are affected when we alter the perceived norms. Previous research has established the link between normative perceptions and outcomes influenced by pluralistic ignorance (e.g., Stangor et al., 2001; Sechrist and Stangor, 2005). For example, when university students thought the alcohol consumption norm was higher than it actually was, they also tended to drink more. Making explicit this inaccurate perception led participants to moderate their alcohol consumption (Prentice and Miller, 1996). Thus, the goal of Study 2 was to examine the effects of presenting altered norms on men's attitudes toward communal and agentic self-descriptions, intentions to hide communal engagement, and broader gender-related social change.

Specifically, we constructed five conditions (four experimental conditions and a control condition) in which participants received a norm that was said to be held by their peers. In line with general masculinity norms, the traditional norm condition highlighted that agentic traits are deemed to be most desirable for men to have. The communal norm condition presented the opposite of this, highlighting that communal traits are deemed to be most desirable for men to have. Two further conditions were designed to break the veil of pluralistic ignorance found in Study 1. Specifically, the discrepancy condition highlighted explicitly that while people believe others value especially agency in men, others actually do value communion in men as well. In a fourth compatibility condition, both agentic and communal traits were framed as being important for men to have and compatible with one another. Lastly, in the control condition, no norm was manipulated and thus this functioned as a comparison group reflecting the actual guiding norm as participants perceive it.

The effect of these conditions was investigated on men's communal and agentic self-descriptions, on their intentions to hide future communal task engagement, and on their broader attitudes toward gender-related social change. This allowed us to examine whether norms reflecting different levels of communion affect how men describe themselves and whether they increase progressive attitudes toward gender-related social change. Hiding future communal task engagement is an important outcome given the evidence that hiding a stigmatized identity can have taxing effects on well-being and social belonging (e.g., Swim and Thomas, 2006; Pachankis, 2007; Newheiser and Barreto, 2014). Also, it is important to investigate under what condition men not only engage more in communal roles but also refrain from hiding such engagement, since hiding maintains the inaccurate norm that men are not communal even when some men actually do engage in communal roles.

We hypothesized that in the two conditions that break the veil of pluralistic ignorance (the discrepancy and compatibility conditions), men will describe themselves in more communal ways without it affecting their agency, report fewer intentions to hide communal behaviors, and hold more progressive attitudes toward gender-related social change compared to the control condition. We did not expect differences between the traditional norm condition and the control condition, since the traditional norm condition confirms masculinity norms as present in society. We did not have specific hypotheses about the communal norm condition, but added this condition to compare the effect of merely stressing communal norms to uncovering pluralistic ignorance on men's self-descriptions, hiding communal engagement, and attitudes toward gender-related social change.

# Methods

#### Participants

In Study 2, participants were 379 Belgian undergraduate men. As in Study 1, 60 participants were excluded as they were born before 1990 or did not self-identify as heterosexual (and are thus potentially subject to different norms, see Vandello et al., 2008), or did not correctly summarize the experimental condition they were in. The resulting 319 participants (Mage = 21.37, SD = 1.95) were enrolled in different majors, with the largest groups enrolled in engineering (32%) and law (12%).

#### Procedure

The protocol was approved by the University of Leuven's University Social and Societal Ethics Committee. Participants were invited to complete an online questionnaire on their perceptions of their surroundings and were compensated either with course credit or the chance to win a coupon to a popular store. After agreeing to the informed consent as was specified in the ethics application, participants reported demographics and were randomly assigned to one of the five conditions as described above (please see **Appendix 3** for a more elaborate description of the manipulations): the traditional masculinity norm condition (n = 62), the discrepancy condition (n = 60), the compatibility condition (n = 79), the communal norm condition (n = 57), or the control condition (n = 61).

In each of the four experimental conditions, participants received an article describing the results of a fictitious study ostensibly conducted at the participants' university with students of their cohort. Specifically, the study reported students' beliefs about what traits are valued for an ideal man. Each participant thus received a similar article, but within each article, the traits that were said to be valued differed by condition (as described above). Participants then completed manipulation checks and the dependent variables. Participants in the control condition received no article and instead moved straight to the dependent variables. Lastly, participants moved on to the debriefing, in which they were informed of the research design, including the misleading information, and we explained why this was necessary to test the core hypotheses. Participants were given the contact information of the researcher and of the ethical commission that had approved the research.

#### Measures

A complete overview of all measurement items of this study can be found in **Appendix 4**.

#### **Manipulation checks**

Participants indicated to what extent the article asserted that communal traits (e.g., vulnerable, dependent, caring, 11 items, α = 0.92) and agentic traits (e.g., ambitious and competent, 7 items, α = 0.90; presented in random order), were valued by their peers on a scale from 1– not at all to 7 – very much (based on Abele, 2003; Cuddy et al., 2004).

#### **Communal and agentic self-descriptions**

Participants completed scales measuring how they described the self in terms of the same 11 communal (α = 0.82) and 8 agentic traits (α = 0.82; again presented in random order) on a scale ranging from 1 – not at all to 7 – very much (based on Abele, 2003; Cuddy et al., 2004).

#### **Hiding of future communal task engagement**

This scale assessed to what extent participants thought they would hide their future communal engagement regarding: (a) childcare and (b) household chores from people other than family and friends, specifically: (i) from their future colleagues, (ii) their future boss, and (iii) from strangers (α = 0.90, 6 items), on a scale from 1 – emphasize to 7 – hide. A higher score on this scale is thus indicative of more intent to hide behavior.

#### **Attitudes toward gender-related social change**

Attitudes toward gender-related social change were measured using an 8-item scale that assessed attitudes regarding changes in society toward gender equality (α = 0.77). Example items include "It is inevitable that men and women will be equal in their work in the future" and "The interests of a typical man will always differ from those of a typical woman, and this will be reflected in the work they choose to do" (reversed). The scale ranged from 1 – strongly disagree to 7 – strongly agree, with a higher score on this scale indicating more progressive attitudes regarding social change toward gender equality.

#### Analyses

The data were analyzed using one-way ANOVAs which examined the main effect of condition. Planned pairwise comparisons were conducted with LSD tests. A post hoc power analysis conducted with G∗Power (Faul et al., 2007) indicated that this sample size was sufficient to capture a moderate effect size of r = 0.30 with power of 99.5%. Power for each separate main effect can be found in **Appendix 5**. Results replicated when controlling for age, ethnicity, and study major, with the exception of one effect, as specified below.
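As a minimal sketch of the omnibus test, the function below computes the one-way ANOVA F ratio and its degrees of freedom from raw group scores. The function name and the small data set are hypothetical illustrations, not the authors' analysis script or data. The final lines derive the degrees of freedom implied by the five condition sizes reported in the Procedure.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three small groups (illustration only).
f_stat, df1, df2 = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df1}, {df2}) = {f_stat:.2f}")

# Degrees of freedom implied by the five condition sizes reported in the Procedure.
sizes = [62, 60, 79, 57, 61]
df_between = len(sizes) - 1
df_within = sum(sizes) - len(sizes)
print(f"df = ({df_between}, {df_within})")
```

With five conditions and 319 participants, the sketch yields the (4, 314) degrees of freedom that accompany the F tests on the dependent variables.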

# Results

#### Manipulation Checks

Analyses showed that the manipulations were perceived as intended. First, the degree to which participants indicated communal traits had been discussed as valued traits for men in the article differed across the four experimental conditions, F(3, 252) = 32.01, p < 0.001, $\eta_p^2$ = 0.28. Specifically, planned comparisons showed that those in the traditional norm condition indicated that the article described their peers as valuing communal traits significantly less (M = 3.92, SD = 1.36) than those in the discrepancy condition (M = 5.51, SD = 0.89), p < 0.001, d = –1.29, [–1.96; –1.23]; the compatibility condition (M = 4.91, SD = 0.96), p < 0.001, d = –0.87, [–1.33; –0.65]; and the communal norm condition (M = 5.49, SD = 0.78), p < 0.001, d = –1.70, [–1.94; –1.20]. Those in the communal norm condition (M = 5.49, SD = 0.78) and discrepancy condition did not report different levels of communal traits, ns, but reported communal traits as being more valued by those in their cohort than those in the compatibility condition (M = 4.91, SD = 0.96), p < 0.001, d = 0.65, [0.26; 0.95].

Participants also correctly reported the valued agentic traits for their respective article, F(3, 252) = 33.81, p < 0.001, $\eta_p^2$ = 0.29. Planned comparisons showed that those in the traditional norm condition reported agentic traits to be more valued by their peers (M = 5.39, SD = 0.89) compared to those in the discrepancy condition (M = 4.13, SD = 1.29), p < 0.001, d = 1.05, [0.85; 1.68]; the compatibility condition (M = 5.00, SD = 1.09), p = 0.046, d = 0.39, [0.01; 0.78]; and the communal norm condition (M = 3.48, SD = 1.35), p < 0.001, d = 1.70, [1.49; 2.33]. Those in the communal norm condition indicated agentic traits as being less valued by their peers (M = 3.48, SD = 1.35) compared to the discrepancy condition, p = 0.002, d = 0.50, [0.23; 1.07] and the compatibility condition, p < 0.001, d = 1.27, [1.12; 1.91].

#### Communal and Agentic Self-Descriptions

As hypothesized, there was a significant effect of condition on participants' communal self-descriptions, F(4, 314) = 2.63, p = 0.034, $\eta_p^2$ = 0.032 (see **Figure 3**). Planned comparisons show that, as expected, men in the compatibility condition described themselves as more communal (M = 5.24, SD = 0.75) than those in the control condition (M = 4.85, SD = 0.76), p = 0.001, d = 0.86, [0.15; 0.63]. There were no significant differences between the other conditions.

There was a marginal effect of condition on agentic self-descriptions, F(4, 314) = 2.05, p = 0.09, $\eta_p^2$ = 0.025. Planned comparisons indicated that men in the communal norm condition tended to describe themselves as less agentic (M = 4.59, SD = 0.81) than those in the traditional norm condition (M = 4.99, SD = 0.81), p = 0.01, d = –0.50, [–0.70; –0.10]; and marginally less agentic than those in the control condition (M = 4.89, SD = 0.92), p = 0.056, d = –0.35, [–0.60; 0.01]. There were no significant differences between the other conditions. However, the effect of condition on agentic self-descriptions disappeared when controlling for study major and the initial effect was only marginal. Therefore, we cannot draw the conclusion that conditions differed in terms of agentic self-descriptions.

#### Hiding Communal Task Engagement

Next, the extent to which participants expected to hide their future communal engagement from others was investigated. Results show an effect of condition on hiding future communal behaviors from others, F(4, 314) = 2.71, p = 0.030, $\eta_p^2$ = 0.033 (see **Figure 4**). Planned comparisons revealed that participants in the compatibility condition intended to hide communal engagement less (M = 4.17, SD = 1.14) than those in the control condition (M = 4.56, SD = 1.05), p = 0.048, d = –0.33, [–0.78; 0.00], and also less than those in the communal norm condition (M = 4.80, SD = 1.36), p = 0.002, d = –0.51, [–1.03; –0.24]. Unexpectedly, those in the traditional norm condition expected to hide future communal engagement less (M = 4.37, SD = 1.20) than those in the communal norm condition, p = 0.041, d = –0.34, [–0.85; –0.02]. There were no significant differences between the discrepancy condition and the other conditions.

#### Attitudes Toward Gender-Related Social Change

Finally, there was a main effect of condition on the attitudes toward gender-related social change, F(4, 314) = 3.35, p = 0.010, $\eta_p^2$ = 0.041 (see **Figure 5**). Specifically, planned comparisons showed that those in the compatibility condition had more progressive attitudes toward gender-related social change (M = 4.97, SD = 1.00) than those in the control condition (M = 4.48, SD = 1.03), p = 0.004, d = 0.49, [–0.82; –0.16], and also than those in the communal norm condition (M = 4.51, SD = 1.02), p = 0.008, d = 0.46, [–0.79; –0.12]. Also as expected, those in the discrepancy condition had more progressive attitudes toward gender-related social change (M = 4.84, SD = 0.90), than those in the control condition (M = 4.48, SD = 1.03), p = 0.045, d = –0.35, [0.01; 0.71]. Unexpectedly, those in the traditional norm condition had more progressive attitudes toward gender-related social change (M = 4.87, SD = 0.93) than those in the control condition (M = 4.48, SD = 1.03), p = 0.027, d = –0.37, [0.04; 0.74].
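For a one-way design, partial eta squared can be recovered from an F ratio and its degrees of freedom as F·df1 / (F·df1 + df2). The Python sketch below applies this standard identity to the four omnibus tests on the dependent variables reported above. The function name and labels are hypothetical, and the check is offered only as a reading aid, not as the authors' computation.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return f_stat * df_effect / (f_stat * df_effect + df_error)

# (label, F, df_effect, df_error, reported partial eta squared) from the Study 2 Results.
reported = [
    ("communal self-descriptions", 2.63, 4, 314, 0.032),
    ("agentic self-descriptions", 2.05, 4, 314, 0.025),
    ("hiding communal engagement", 2.71, 4, 314, 0.033),
    ("attitudes toward social change", 3.35, 4, 314, 0.041),
]

for label, f_stat, df1, df2, eta in reported:
    recomputed = partial_eta_squared(f_stat, df1, df2)
    print(f"{label}: recomputed = {recomputed:.4f} (reported = {eta})")

# Largest absolute gap between recomputed and reported values.
max_gap = max(
    abs(partial_eta_squared(f_stat, df1, df2) - eta)
    for _, f_stat, df1, df2, eta in reported
)
print(f"largest gap = {max_gap:.4f}")
```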

# Discussion

The goal of Study 2 was to examine whether breaking the veil of pluralistic ignorance with regard to norms for men would increase men's communal self-description, decrease their hiding of future communal task engagement, and make their broader attitudes toward gender-related social change more progressive.

Our findings show that the discrepancy condition (which indicated that while people believe others especially value agency in men, others actually value communion as well) made participants' attitudes toward gender-related social change more progressive, but it did not affect participants' self-descriptions or hiding intentions. It could be that the beginning of this manipulation, which highlighted a strong agency prescription for men in society (before uncovering that this was part of pluralistic ignorance amongst their peers), actually made salient a societal masculine norm, decreasing the effectiveness of this condition. The compatibility manipulation (which indicated that both communal and agentic traits were valued in men) had the strongest effects. As expected, in this condition participants reported more communal self-descriptions without any change in their agentic self-descriptions, fewer intentions to hide future communal tasks, and more progressive attitudes toward gender-related social change. It thus appears that making salient the actual norm through emphasis on the higher than expected compatibility between agentic and communal traits may be more effective than highlighting the discrepancy between expected and actual norms. This emphasis on the compatibility of communion and agency may allow men to be communal but not at the cost of agency, which is also important for men (e.g., Vandello et al., 2008; Vandello and Bosson, 2012).

Our results also suggest that merely highlighting that men value communal traits may not be sufficient: in the communal norm condition participants did not report more communal self-descriptions and showed more hiding intentions than in the traditional norm condition. This suggests that when norms stress the value of communion and not agency, men might seek out ways to protect their male identity by hiding communal engagement.

Participants in the traditional norm condition did not differ from those in the control condition regarding their self-descriptions and hiding intentions, suggesting that this traditional norm is similar to their default perception of what the norm is. Unexpectedly, however, those presented with the traditional norm showed more progressive attitudes toward gender-related social change than those in the control condition and fewer intentions to hide communal engagement than those in the communal norm condition. Perhaps learning of research that confirms the traditional norm provides men with a masculinity affirmation and a sense of certainty as to what the norm is, thus allowing them to report attitudes that are somewhat more progressive (Ridgeway, 2011). This finding may also be caused by the mechanism of paradoxical thinking (Hameiri et al., 2014): when people are presented with opinions they believe but that are phrased more extremely, they tend to show a decrease in their own beliefs. Thus, a presentation of strong masculinity norms may have triggered a counter reaction to such norms in participants.

# GENERAL DISCUSSION

Traditional masculine norms are still present even in more progressive societies. Perhaps as a result, men are still highly underrepresented in communal HEED domains such as health care, elementary education, and roles in the domestic sphere (Croft et al., 2015). The very low engagement of men in communal roles and behaviors persists despite increasing insight into the many benefits these careers and roles might have for men's own well-being (e.g., Fleeson et al., 2002; Sheldon and Cooper, 2008; Le et al., 2013, 2018), but also for their female partner's upward mobility, children's aspirations, and for society as a whole (Croft et al., 2014).

To adhere to gender norms, men engage in certain behaviors and roles while avoiding others – in line with what they believe the norm prescribes. Yet, previous research has shown that people may not always have a correct estimate of what the general norm prescribes, which leads them to behave in line with an inaccurate norm; this has been coined "pluralistic ignorance" (Miller and McFarland, 1991; Vandello et al., 2009). The current work aimed to gain more insight into pluralistic ignorance with regard to masculinity norms on communal and agentic traits.

Study 1 established that there is indeed pluralistic ignorance amongst the young men in this sample regarding what traits actually describe the ideal man. Specifically, Study 1 highlighted a difference between these young men's own perception of the ideal man compared to how they think their peers describe the ideal man. Moreover, this study showed that the perceived norms also prescribe very high agency, higher than the agency men ascribe to themselves. Together, our studies provide a preliminary discovery (see Witte and Zenker, 2017) of pluralistic ignorance in gender norms for men and the potential to increase men's communal engagement by revealing these erroneous beliefs.

In order to examine the effect of these faulty ideas and the possible correction thereof, Study 2 introduced different norms to test their causal effect on men's self-description, hiding intentions of communal engagement and attitudes toward gender-related social change. Providing participants with these more accurate depictions of the actual norm indeed had an effect: Highlighting the compatibility between agentic and communal traits seemed especially effective as men exposed to this norm self-described as more communal, showed lower intentions to hide communal engagement, and reported more progressive and broader attitudes toward gender-related social change. This compatibility norm might be powerful because it can allow men to value communion and at the same time maintains the positive value for agentic traits consistent with traditional notions of male identity. In this sense, valuing both agentic and communal traits serves as an affirmation of that identity at the same time that it broadens the identity (Sherman and Cohen, 2002; Derks et al., 2009; Glasford et al., 2009; Spencer-Rodgers et al., 2016). Existing work has also shown that reaffirming important aspects of identity allows exploration of newer aspects of identity traditionally associated with the outgroup (Derks et al., 2006, 2007; Van Laar et al., 2010, 2013). This work thus suggests that valuing agentic in addition to communal aspects may allow men more exploration on the communal side, in that it may decrease possible masculinity threat that is linked to engaging in roles and behaviors that are traditionally female (i.e., precarious manhood, Vandello et al., 2008).

# Limitations and Future Directions

One potential limitation of this work is that participants' answers in Study 2 may have been affected by demand characteristics: men could simply have been repeating what they had just been told. Although the effects on self-descriptions of traits in Study 2 might be explained in this way (given that the articles mentioned these traits explicitly), it is more difficult to explain the full set of results, including changes in hiding intentions and in broader attitudes toward gender-related social change, as demand characteristics. Moreover, demand characteristics cannot explain the finding that participants in the traditional norm condition seemed to show counter reactions to this norm, reporting lower intentions to hide their communal engagement and higher support for gender-related social change. In addition, the current work is consistent with other studies in which norms were manipulated, revealing similar effects (e.g., Schroeder and Prentice, 1998; Stangor et al., 2001; Sechrist and Milford, 2007; Diekman et al., 2013). Further research should investigate whether manipulated norms indeed change the actual perception of norms and lower pluralistic ignorance, and how long these effects persist.

A second possible limitation is the within-participant and cross-sectional nature of Study 1. Such a design was necessary to uncover discrepancies between participants' own trait descriptions for self or ideal man and these same participants' perceptions of their peers' prescriptions for ideal men. Yet, our methods could have given participants insight into the goals of the study. In this case, however, consecutive scales of the same traits would more likely lead to more similar answers on these scales. This would provide a conservative test of Study 1, since it would lead to an underestimation of the expected discrepancies. Also, this concern does not extend to Study 2, which used an experimental manipulation to show that men are affected by varying these norms.

An additional limitation is the relatively small sample size of Study 1. The G*Power analysis for Study 1 indicated that the chance of a Type II error was slightly elevated: β = 0.233 (i.e., statistical power of 1 − β = 0.767) instead of the suggested acceptable probability of β = 0.20 (Cohen, 1992). It is thus important to replicate these findings in studies with sufficiently large samples.

As this is the first work of its kind, these results are a first step and can thus be considered a preliminary discovery (see Witte and Zenker, 2017) of pluralistic ignorance in gender norms for men and of the potential to increase men's communal engagement by uncovering such inaccurate norms. Further research is needed to investigate the psychological processes at play and to extend these findings. It would be interesting to investigate to what extent the current findings hold in different contexts and samples. The current studies were carried out with male university students, a sample that is generally associated with more progressive attitudes (e.g., Hoffman and Kloska, 1995). Also, the studies were conducted in Belgium, a cultural context that scores relatively low on gender inequality (UNDP, 2015). Future research could test whether our results generalize to less educated men and to other cultural contexts. While there is no reason to suppose the effects will not generalize, it will be important to replicate these effects in these samples and to consider important moderators. In less progressive samples, it is possible that there is less pluralistic ignorance when men themselves also hold traditional ideals of masculinity (thus showing less of a contrast with perceived ideals held by others). However, it could also be that in these samples, men both hold more traditional ideals of masculinity and perceive stronger ideals held by others, so that there is still a relative difference between own and others' ideals for men, resulting in pluralistic ignorance. Also, different cultures could prescribe different traits that are deemed acceptable or essential for men to hold (for instance, honor is highly valued in some cultures). We would expect that while the content of masculine ideals may differ across cultures, there could still be similar degrees of pluralistic ignorance regarding own and others' ideals.

Future research could also seek to replicate our findings across age groups. Research shows that as people age, they describe themselves as more communal (Diehl et al., 2004; Roberts et al., 2006). It would be interesting to investigate whether increases in men's communion as they age are due to the decrease of pluralistic ignorance such that they get a more accurate perception of gender norms over time; or rather that pluralistic ignorance remains, but that with age, people may find it less important to follow gender norms and more important to follow their personal preferences and ideals.

The present research investigated male undergraduates' peers as an important reference group for normative influence. It would be interesting for future research to also investigate the importance of other groups in setting the norm and influencing men's communal engagement. For instance, older men, such as the young men's fathers or senior men in the workplace, may also be important reference groups. Also, women may be an important driving force in setting normative expectations in terms of communal orientations for men, as women benefit from men's communal investments in the family context (Meeussen et al., 2018).

Also, a field intervention study would be needed to test whether our Study 2 manipulation of creating awareness of pluralistic ignorance may allow men to feel less coerced toward adopting traditional gender roles in real-life contexts. There are already some notable projects that aim to increase male engagement in communal roles. For example, through a series of programs and workshops across the world, NGO PROMUNDO (2018) promotes gender equality and encourages gender-related social change, both in educational sessions and campaigns. Based on our findings, it may be interesting to include a component that uncovers pluralistic ignorance in such projects. We would encourage a scientific examination of the effectiveness of these programs and their different components so as to inform governmental organizations wishing to promote men's uptake of paternity leave and to increase male representation among elementary school teachers and in nursing.

# CONCLUSION

The current studies offer the first data consistent with the hypothesis that there exists pluralistic ignorance among men regarding what traits are desirable for an ideal man, and show that uncovering inaccurate beliefs may alter self-descriptions, intentions to hide communal engagement, and broader gender-related social attitudes to better fit with the actual norm. Theoretically, these findings offer initial insights into the underlying normative processes at play in the underrepresentation of men in communal roles. Research such as that presented in this paper can be used to help find more effective ways to address pluralistic ignorance and promote positive gender-related social change.

# AUTHOR CONTRIBUTIONS

SVG, CVL, LM, TS, and SS contributed to the development of the hypotheses. SVG, CVL, and LM contributed to the data collection. SVG conducted the statistical analyses. All authors contributed to the interpretation of results and the writing of the manuscript.

# FUNDING

This research was supported by an Odysseus grant to CVL from the Research Foundation of Flanders (FWO) grant number G.O.E66.14N.

# ACKNOWLEDGMENTS

The authors would like to thank the University of Leuven's Center for Social and Cultural Psychology and Katharina Block for useful comments, and Elisabeth Leroy and Josje Tooten for their help in data collection.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01344/full#supplementary-material

# REFERENCES

extraverted as "good" as being extraverted? J. Pers. Soc. Psychol. 83, 1409–1422. doi: 10.1037//0022-3514.83.6.1409

Witte, E. H., and Zenker, F. (2017). From discovery to justification: outline of an ideal research program in empirical psychology. Front. Psychol. 8:1847. doi: 10.3389/fpsyg.2017.01847

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Van Grootel, Van Laar, Meeussen, Schmader and Sczesny. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Does Exposure to Counterstereotypical Role Models Influence Girls' and Women's Gender Stereotypes and Career Choices? A Review of Social Psychological Research

#### Maria Olsson\* and Sarah E. Martiny

Department of Psychology, UiT – The Arctic University of Norway, Tromsø, Norway

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Georgina Randsley de Moura, University of Kent, United Kingdom

Thekla Morgenroth, University of Exeter, United Kingdom

> \*Correspondence: Maria Olsson maria.olsson@uit.no

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 02 May 2018 Accepted: 31 October 2018 Published: 07 December 2018

#### Citation:

Olsson M and Martiny SE (2018) Does Exposure to Counterstereotypical Role Models Influence Girls' and Women's Gender Stereotypes and Career Choices? A Review of Social Psychological Research. Front. Psychol. 9:2264. doi: 10.3389/fpsyg.2018.02264

Gender roles are formed in early childhood and continue to influence behavior through adolescence and adulthood, including the choice of academic majors and careers. In many countries, men are underrepresented in communal roles in health care, elementary education, and domestic functions (HEED fields, Croft et al., 2015), whereas women are underrepresented in the science, technology, engineering, and mathematical (STEM) fields (Beede et al., 2011) and top leadership positions (Leopold et al., 2016). Theories focusing on the development of gender roles suggest that across the lifespan people perceive certain roles to be more or less appropriate for their gender (e.g., Gender Schema Theory, Martin and Halverson, 1981; Social Role Theory, Eagly and Wood, 2011). Specifically, researchers have postulated that observing same-sex role models triggers learning processes whereby observers internalize gender-stereotypical knowledge of roles and act accordingly, which results in gender-congruent aspirations and behavior. It seems reasonable that if observing men and women in gender-congruent roles fosters gender-congruent aspirations and behavior, then frequently observing gender-incongruent role models (e.g., male kindergarten teachers or female scientists and leaders) should reduce gender stereotyping and promote gender-counterstereotypical aspirations and behavior. In many countries, governments and societal decision-makers have formed initiatives based on the idea that exposure to gender-counterstereotypical role models influences aspirations and career choices among children, adolescents, and young adults. The present review gives an overview of research-based interventions involving observing or interacting with counterstereotypical role models, particularly focusing on outcomes for girls and women. Extending earlier reviews, we summarize laboratory-based and field-based studies and then critically discuss and integrate the findings in order to provide an overall picture of how counterstereotypical role models shape observers' occupational aspirations and academic choices in childhood, adolescence, and young adulthood. We conclude by outlining suggestions for future research and briefly discussing implications for future interventions.

Keywords: role models, stereotypes, STEM, leadership, women, girls, counterstereotypical

# INTRODUCTION

> . . . relatable [female] role models will bring important future [female] scientists, mathematicians, technologists, engineers, innovators, and leaders into the career pipeline.
>
> 1000 Girls, 1000 Futures

Gender roles concern the expectation of what conduct is appropriate for men and women based on the distribution of men and women in different roles (Eagly et al., 2000). Children from every walk of life are exposed to gender roles from an early age. First and foremost, children are exposed to gender roles in their immediate environment through their parents, siblings, relatives, neighbors, peers, and teachers, but also through educational resources, media, and popular culture. The social environment and media often depict traditional gender roles (Lauzen et al., 2008; Kahlenberg and Hein, 2010; Kan et al., 2011; Steyer, 2014; Koss, 2015; Murnen et al., 2016; Reich et al., 2018). For example, in many Western countries, men spend more time in paid work whereas women spend more time in unpaid work (Kan et al., 2011). In addition, analyses of prime-time television programs show that men are typically represented in agentic (i.e., work-related) roles, whereas women are typically represented in communal (i.e., family-related) roles (Lauzen et al., 2008). Given this widespread exposure to traditional gender roles, it does not seem surprising that children themselves report gender stereotypes as well as gender-stereotypical ability beliefs, play preferences, peer preferences, and career aspirations from a very young age (Freedman-Doan et al., 2000; Levy et al., 2000; Serbin et al., 2002; Sebanc et al., 2003; Wilbourn and Kee, 2010; Baker et al., 2016; Bian et al., 2017; Golden and Jacoby, 2018). Specifically, research has shown that girls in 1st and 4th grade think the subjects they are worst at are computers and science, whereas boys think they are worst at reading (Freedman-Doan et al., 2000). Children's gender-stereotypical beliefs about their current ability may shape their behavior later in life as they select activities they believe they are good at (Wigfield and Eccles, 2000).

One way that gender-stereotypical ability beliefs may become visible later on is in career choices. In many Western countries, men are underrepresented in communal roles in health care, elementary education, and domestic functions (HEED), whereas women are underrepresented in agentic and high-status roles such as leadership positions (Croft et al., 2015; Leopold et al., 2016), and in the science, technology, engineering, and mathematical (STEM) fields (Beede et al., 2011). There are several reasons why it is important to promote an equal representation of men and women in different occupational fields. First, gender equality provides benefits to both men's and women's welfare and health (Seedat et al., 2009; Read and Grundy, 2011; Holter, 2014). Second, increasing the number of women interested in STEM can meet the demands of an ever-expanding labor market and reduce the gender wage gap (Beede et al., 2011). Likewise, promoting men's interest in HEED roles is important for overcoming labor shortages and promoting gender equality (Croft et al., 2015). Numerous initiatives and interventions have been implemented in several countries to encourage boys and girls to consider non-traditional occupational choices (e.g., Discover!; Little Miss Geek; 1000 girls, 1000 futures; Mind the Gap!; The Norwegian Government's gender equality action plan; the WISE Campaign). These initiatives and interventions are often based on the rationale that observing or interacting with men and women in non-traditional domains, providing a so-called gender-counterstereotypical role model, will promote non-traditional behavior.

A gender-counterstereotypical role model is an individual who engages in a role that is antithetical to gender stereotypes (e.g., a female CEO, a female scientist, or a male preschool teacher). Role models have been defined in various ways in the literature (for an overview, see Morgenroth et al., 2015). We follow the lead of other researchers and consider role models as "individuals who influence [children's, adolescents', and young adults'] achievements, motivation, and goals by acting as behavioral models, representations of the possible, and/or inspirations" (Morgenroth et al., 2015, p. 468). The present review focuses on interventions that utilize counterstereotypical role models to influence women's aspirations to enter fields where they are underrepresented and negatively stereotyped. Role model interventions have been implemented with different goals in mind, such as promoting women's interest and confidence in pursuing a career in STEM or other high-status roles such as top leadership and politics.

The underrepresentation of women in certain academic or high-status fields cannot be solely attributed to essential differences between men and women. First, mean gender differences in ability tend to be influenced by extreme cases at the end of the distribution (Hyde, 2005), and sometimes gender differences in aspirations and abilities only appear when gender stereotypes have been made salient (Spencer et al., 1999; Quinn and Spencer, 2001; Davies et al., 2005). Second, research suggests that at least part of the reason women do not enter certain academic or high-status fields originates in psychological barriers created by stereotypes. For example, a lack of females in STEM and top leadership positions may signal to women that members of their gender lack the skills necessary to be successful in these domains (Eagly et al., 2000). Thus, in order to encourage women to enter STEM and high-status positions where they are underrepresented and negatively stereotyped, it is important to expose women to female role models (Lockwood, 2006; Plant et al., 2009; Stout et al., 2011; but see Bagès and Martinot, 2011).

We will present literature on whether counterstereotypical role models have the potential to turn observers into role aspirants. Role aspirants are individuals who emulate and are inspired by role models (Morgenroth et al., 2015). Although the underrepresentation of men in certain educational and occupational domains certainly warrants empirical attention, we focus our review on girls and women because the vast majority of research has focused on women's underrepresentation in male-dominated fields (for a discussion of the dearth of research on men in female-dominated HEED fields, see Croft et al., 2015). We will discuss wide-ranging studies exploring the effects of observing or interacting with gender-counterstereotypical role models from childhood to young adulthood including experimental research, correlational data, and evaluations of real-life interventions. Thus, extending earlier work, we will build a bridge between interventions conducted in the laboratory and interventions conducted in the field. We will also highlight factors that ought to be considered when developing future role model interventions. Role model interventions can encompass many different goals but are here defined as explicit attempts to change children's, adolescents', and young adults' aspirations toward a gender-counterstereotypical occupational role by presenting them with a gender-counterstereotypical role model. In the following, we briefly summarize the main underlying theoretical assumptions about the effects of role models and then review the success of role model interventions in childhood, adolescence, and adulthood.

# THEORETICAL UNDERPINNINGS OF INTERVENTIONS

Although there is some disagreement amongst scholars regarding the underlying processes in the development of gender-congruent behavior, many theories have identified the observation of models, particularly same-sex models, as a major factor (e.g., Gender Schema Theory, Bem, 1981; Developmental Intergroup Theory, Bigler and Liben, 2006; Social Cognitive Theory, Bussey and Bandura, 1999; Social Role Theory, Eagly and Wood, 2011). It is not surprising then that many interventions that aim to target the underrepresentation of women in certain occupations and academic fields have involved exposure to stereotype-incongruent role models. It has been theorized that gender-stereotypical beliefs (which are widespread beliefs about the attributes of men and women, Heilman, 2001) are one of multiple factors that determine females' achievement-related aspirations and choices (Wigfield and Eccles, 2000). While not all scholars agree that stereotypes play a major role in guiding gender-congruent behavior (e.g., Bussey and Bandura, 1999), some scholars argue that observational learning gives rise to stereotypical beliefs, which then foster stereotypical behavior through various mediating processes (Martin et al., 2002; Wood and Eagly, 2012).

Theories concerning the development of gender stereotypes and stereotype-congruent behavior in childhood are very rarely applied to gender development in adulthood or vice versa (exceptions include Bigler and Liben, 2006; Wilbourn and Kee, 2010). Theories also differ in their terminology and emphasis on different cognitive processes. Nevertheless, some theories of gender development in childhood versus adulthood share the assumption that observational learning gives rise to stereotypical beliefs, which subsequently guide behavior (Gender Schema Theory, Bem, 1981; Social Role Theory, Eagly and Wood, 2011). For example, the assumption that children learn to associate men and women with certain attributes through observing their environment is a central tenet of Gender Schema Theory (Bem, 1981). This gender knowledge forms cognitive schemas, which give rise to stereotypical beliefs and influence behavior (Martin et al., 2002). According to Gender Schema Theory, a girl who chooses to play with a doll has engaged in the following thought process: dolls are "for girls" and "I am a girl" which means that "dolls are for me" (Martin and Halverson, 1981, p. 1120). If a gender-stereotypical environment fosters stereotypical knowledge, which in turn fosters stereotype-congruent behavior, interventions involving exposure to gender-counterstereotypical role models should reduce gender stereotypes and enhance gender-counterstereotypical aspirations.

The assumption that adults' stereotypes stem from observational learning is a key tenet of Social Role Theory (Eagly and Wood, 2011). According to Social Role Theory, people attribute the underlying cause of the unequal distribution of men and women in various roles to inherent gendered characteristics. Thus, because people mostly observe women in communal domains (where they are concerned with others, Abele and Wojciszke, 2007), people associate women with being socially skilled, nurturing, and caring. Likewise, because people mostly observe men in agentic domains (where they are concerned with pursuing their goals, Abele and Wojciszke, 2007), people associate men with being assertive and dominant. Men and women may subsequently internalize stereotypes about their gender, which guide their behavior (Hogg, 2000; Greenwald et al., 2002; Eagly and Wood, 2011). According to Social Role Theory, stereotypes are dynamic: when people perceive a non-traditional division of labor, they associate men and women with counterstereotypic characteristics (e.g., Diekman and Eagly, 2000; Wilde and Diekman, 2005). From this perspective, if the gender distribution of roles changes, men's and women's gender stereotypes, self-concepts, and behavior should change accordingly. Thus, exposing men and women to counterstereotypical role models has the potential to change men's and women's aspirations and career choices.

Observational learning may operate differently at different stages of development. Notwithstanding this factor, it is possible to infer from theories applied in both childhood and adulthood that modeling is a precursor to the development of gender stereotypes (Gender Schema Theory, Bem, 1981; Social Role Theory, Eagly and Steffen, 1984). That being said, gender-developmental theorists and role-model theorists alike assert that role aspirants are far from passive learners (Martin et al., 2002; Bigler and Liben, 2006; Morgenroth et al., 2015). The effect of the role model on the role aspirant is instead moderated by the role aspirant's previous experience, knowledge, and perceptions of the role model. The extent to which role models influence men's and women's aspirations and career choices may also interact with other factors such as direct instruction (Bussey and Bandura, 1999), parents' differing perceptions of their sons and daughters (Furnham et al., 2002; Tenenbaum and Leaper, 2003), parents' tendency to attribute their daughters' success to hard work and their sons' success to innate talent (Yee and Eccles, 1988; Räty et al., 2002), and biological sex differences (Eagly and Wood, 2013).

Because these theories propose that counterstereotypical role models influence child and adult role aspirants through the same processes, we review role model interventions that have been implemented from early childhood through early adulthood. Role model interventions have focused on a range of outcomes. Some interventions have targeted gender stereotypes, some have strived to promote self-efficacy and counterstereotypical behavior, and some have tried to enhance women's aspirations toward fields where they are underrepresented. Role model research in childhood, adolescence, and adulthood has emphasized different outcomes, which means that we are not able to compare exactly the same variables at different developmental stages. For the childhood literature, we review studies that test the success of exposure to gender-counterstereotypical role models on girls' gender stereotypes, aspirations, and behavior. For the adolescence and adulthood literature, we review studies that test the success of exposure to gender-counterstereotypical role models on girls' and women's gender stereotypes, self-concept, efficacy beliefs (i.e., confidence in one's abilities, Bandura, 1977), career aspirations, and academic choices.

# A LITERATURE OVERVIEW OF THE EFFECTS OF ROLE MODELS IN EARLY CHILDHOOD, ADOLESCENCE AND EARLY ADULTHOOD

In the following, we provide a comprehensive, but not exhaustive, overview of whether exposure to counterstereotypical role models influences children's, adolescents', and young adults' gender stereotyping. In line with gender theories (Gender Schema Theory, Bem, 1981; Social Role Theory, Eagly and Wood, 2011), we argue that learning about gender is a process that takes place throughout a person's lifespan. Exposure to or interaction with counterstereotypical role models may therefore influence role aspirants at every stage of development. Whereas research on exposure to counterstereotypical role models in adulthood has received considerable empirical attention in recent years, there has been a paucity of research on counterstereotypical role models in early childhood. In this review, we chose to include research spanning from early childhood into early adulthood, not because the literature easily lends itself to comparisons (in fact, quite the contrary), but because we think that researchers and students interested in this topic would benefit from an overview. Previous research has tended to separate the study of gender in childhood from adulthood, which has resulted in different research foci in the two fields. Different research foci in childhood and adulthood literature can give the impression that learning about gender is vastly different across the lifespan. However, although adults and children may not be equally affected by observing or interacting with role models, the processes by which an adult learns are a continuation of the processes by which a child learns. An overview can help to highlight both similarities and differences across the lifespan and potentially promote further research on role model processes in childhood.

An overview can also shed light on whether role model interventions are more effective in childhood or adulthood. Important and far-reaching decisions such as which classes to take in upper secondary school or at university are made during adolescence or early adulthood. Female participation in STEM subjects tends to diminish drastically at the secondary educational level and again at university (Cronin and Roger, 1999). This decrease suggests that psychological barriers at these educational stages may demotivate adolescent girls and young women from pursuing careers in these fields. Role model interventions may thus be particularly critical during secondary and higher education. However, some scholars have argued that interventions aimed at changing stereotypes should take place in early childhood, preferably before children have developed a firm understanding of gender roles (e.g., Bigler and Liben, 2006). Early gender-stereotypical beliefs may shape children's interests and have a cumulative effect on their skill acquisition and aspirations. Thus, interventions that occur later in development may be less effective or may have to be more comprehensive to counteract established interests and skills. Interventions may also be less successful once cognitive schemas are established, as schemas influence subsequent information processing (e.g., causing counterstereotypical information to be forgotten or distorted; Bigler and Liben, 1990; Frawley, 2008). However, interventions that take place too early may not be as effective, as young children may not be able to generalize counterstereotypical information from one domain to another. This is because young children are more knowledgeable about stereotypical behavior among their own sex than they are about stereotypical behavior among the opposite sex. For example, although a young girl assumes that a child who plays with dolls also plays with a make-up kit, she may not assume that a child who plays with cars also plays with airplanes (Martin et al., 1990). Considering young children's limited abilities in making logical inferences, interventions in early childhood may have to be more comprehensive than in adulthood as they have to model counterstereotypical behavior in many domains. These developmental factors support the need for an overview of how effective interventions have been at different stages in development.

# EFFECTS OF EXPOSURE TO COUNTERSTEREOTYPICAL ROLE MODELS IN CHILDHOOD AND PREADOLESCENCE

As children observe men and women in different roles, they learn what it means to be a man or a woman within their cultural context. Put differently, children form gender stereotypes based on their observation of role models. Role models that influence observers in one way or another have exerted a 'role model effect.' The majority of research-based interventions in childhood and preadolescence have focused quite broadly on promoting a broader repertoire of behaviors by exposing children and preadolescents to counterstereotypical role models. We will first review indirect evidence for the role model effect by summarizing studies that assess whether the stereotypicality of parents' occupational roles correlates with the stereotypicality of their children's occupational aspirations or behavior. We then turn toward direct evidence by summarizing experimental and non-experimental between-subjects design interventions.

# Correlational Evidence

Parents are the role models young children are exposed to most (Bandura and Bussey, 2004). In line with this, researchers have argued that parents' occupations have a notable influence on their offspring's gender stereotypes and career aspirations (e.g., Eagly et al., 2000). Numerous studies that have correlated mothers' occupational roles with their daughters' aspirations have found indirect evidence for the role model effect. For example, the stereotypicality of mothers' work is associated with the stereotypicality of daughters' occupational aspirations in both preschool and preadolescence (Marantz and Mansfield, 1977; Barak et al., 1991). In addition, daughters of mothers who work either full time or in counterstereotypical occupations also report more gender role flexibility in childhood, more counterstereotypical career plans in adolescence, more counterstereotypical behavior in adulthood, and fewer concerns about marriage-career conflict (Levy, 1989; Barnett et al., 2003; Fulcher and Coyle, 2011; Greene et al., 2013).

When interpreting these results, we have to keep several things in mind. First, all of the studies reported above have used a correlational design and therefore do not provide causal evidence for the role of observational learning in early childhood. Second, correlational relationships between parental occupational roles and children's aspirations may, in some cases, be confounded with third variables such as instructional learning or how parents engage differently with their sons and daughters (Bussey and Bandura, 1999; Moon and Hoffman, 2008). Third, parental roles only account for a small amount of variance in adults' gender role attitudes (Barnett et al., 2003), and sometimes no significant relationship is found between mothers' roles and daughters' aspirations and behavior (Moen et al., 1997; Cunningham, 2001). Nevertheless, the findings reported above are important because they suggest that variations in gender roles within girls' social reality may affect their aspirations and behavior. It is not surprising that the relationship between parents' occupations and daughters' gender-related aspirations and behavior is mixed, as many factors such as the mothers' specific occupation and attitude toward work may influence daughters' gender-related aspirations and behavior (Helms-Erikson et al., 2000). Taken together, the results of empirical studies investigating the relationship between parents' occupational roles and daughters' gender-related aspirations and behavior are mixed.

# Evidence From Interventions

In order to address the limitations of correlational designs and infer more conclusively the potential impact of role model interventions, it is important to review experimental research. Experimental interventions typically involve exposing children to counterstereotypical occupational role models for a relatively short period of time. Sometimes, interventions involve brief exposure that is repeated over several consecutive days. Occasionally, interventions involve exposure to counterstereotypical role models that spans several weeks or months. Studies that assess the effects of brief exposure to counterstereotypical role models are generally designed to assess the processes of observational learning, not the efficacy of role model interventions per se. Nevertheless, these studies provide useful information as many real-life interventions with counterstereotypical role models similarly involve only a brief exposure time. Following exposure to a counterstereotypical role model, children's gender stereotypes and sometimes their aspirations or actual behavior are assessed. The majority of brief experimental interventions were conducted in or prior to the 1990s and not many recent studies in this area have been published. Much of the early research has already been summarized in several reviews (e.g., Katz, 1986; Liben and Bigler, 1987; Bigler, 1999). For this reason, we merely give a brief overview of this earlier work and integrate these findings with more recent findings in the subsequent section. We conclude by outlining the potential of role model interventions, and making suggestions for future interventions and research.

## Do Children's Gender Stereotypes Change Following Exposure to Counterstereotypical Role Models?

The methods used in role model interventions have typically consisted of exposing children to literature or commercials depicting men and women in counterstereotypical roles. In general, the literature shows that exposure to counterstereotypical role models influences girls' gender-related beliefs. Among girls from preschool-age to 4th grade, exposure to counterstereotypical female exemplars reduced their occupational gender stereotypes and traditional attitudes toward women (Flerx et al., 1976; Ashby and Wittmaier, 1978; Pingree, 1978; Scott and Feldman-Summers, 1979; Trepanier-Street and Romatowski, 1999; but see Karniol and Gal-Disegni, 2009; Pike and Jennings, 2005). For example, Pingree (1978) presented 3rd graders with commercials that either depicted traditional women (e.g., a housewife) or non-traditional women (e.g., a female physician). Girls who had been exposed to non-traditional women reported less traditional attitudes toward women than girls who had been exposed to traditional women. Meeting counterstereotypical role models in real life also appears to reduce gender-stereotypical beliefs among children. Third graders reported fewer gender stereotypes after listening to men and women in counterstereotypical occupations talking about their careers (Tozzo and Golub, 1990). In addition, preadolescent girls were less likely to picture a scientist as male after interacting with female scientists during a 10-day long science camp (Leblebicioglu et al., 2011). Taken together, evidence shows that exposure to or interaction with counterstereotypical role models can reduce gender stereotyping.

## Do Children Internalize Gender Stereotypes Following Exposure to Counterstereotypical Role Models?

Even though interventions involving exposure to counterstereotypical role models appear to change girls' gender stereotypes, the overarching aim of role model interventions is not only to change specific stereotype beliefs but also to influence children's subsequent behavior. It is therefore surprising that several of these studies have failed to include a measure of children's aspirations or behavior (e.g., Tozzo and Golub, 1990; Trepanier-Street and Romatowski, 1999; Karniol and Gal-Disegni, 2009). The failure to include a measure of children's aspirations or behavior may be due to a tendency among researchers to assume that boys and girls use gender stereotypes as a compass for behavior (Martin and Halverson, 1981). However, the assumption that stereotypes determine behavior is problematic. Research has repeatedly shown that changes in stereotypes do not reliably predict change in behavior (see Bigler, 1999). Specifically, studies have failed to find a significant change in girls' aspirations for counterstereotypical occupations (Ashby and Wittmaier, 1978; Bailey and Nihlen, 1990; Bigler and Liben, 1990; Liben et al., 2001; Coyle and Liben, 2016) or preferences for counterstereotypical toys following a brief exposure to gender-counterstereotypical role models (Spinner et al., 2018, but see Ashton, 1983). Thus, the lack of correspondence between girls' knowledge of what other women do and what they subsequently do suggests that stereotypes may not become internalized following short-term experimental interventions.

One factor that contributes to the lack of role model effects may be the extent to which the child perceives herself as similar to the role model. Anderson and Many (1992) analyzed 8- and 10-year-old children's spontaneous thoughts on reading material that depicted children in non-traditional roles and found that the children sometimes struggled to relate to the counterstereotypical role models. Since role model effects are partly driven by role aspirants' desire to become similar to the role model (Morgenroth et al., 2015), it seems crucial that the child identifies common ground with the counterstereotypical role model. Interventions that involve brief exposure to counterstereotypical exemplars may therefore benefit from explicitly highlighting similarities between the role model and the role aspirant to promote behavior change. Another factor that contributes to a lack of role model effects may be that children forget or distort counterstereotypical information, particularly if they are only briefly exposed to a counterstereotypical role model (Bigler and Liben, 1990; Frawley, 2008). Indeed, research has indicated that longitudinal interventions are more effective at eliciting changes. For example, Nhundu (2007) found that female primary school students who had been exposed to non-traditional educational material depicting females in non-traditional careers over a 3-year period expressed greater aspirations to pursue a non-traditional career than girls who had been exposed to traditional educational material. The educational material explicitly encouraged young girls by including information such as: 'Anybody can do any job they like as long as they get trained for it and become skillful.' Thus, although this intervention was "successful," it is not possible to establish whether the girls' counterstereotypical aspirations were influenced by the repeated observation of counterstereotypical women, the direct encouragement, or a combination of these two factors.

## Is the Role Model Effect Sustained and Does it Generalize to Other Domains?

Although children sometimes appear to internalize counterstereotypical information following exposure to counterstereotypical role models (e.g., Ashton, 1983), one must not assume that role model effects observed immediately after a brief exposure will be sustained. First, observations of behavior at one time point are not reliable indicators of permanent behavioral change in young children (Green et al., 2004). Second, stereotype change recorded immediately after an intervention is not always observed at a 1-week follow-up (Flerx et al., 1976; Savenye, 1990). This might be the case because children are exposed to traditional gender role information in their everyday life, which might overwhelm the effect of the intervention. The majority of studies, however, have failed to assess whether stereotype change following brief exposure to counterstereotypical role models is sustained. Thus, in order to draw firm conclusions regarding the longevity of role model effects following brief exposure to counterstereotypical exemplars, more research that assesses children's gender stereotyping, aspirations, and behavior at several time points following the intervention is needed.

Moreover, it is questionable whether brief exposure to counterstereotypical role models in one domain will influence what is considered gender-appropriate in another domain. Research suggests that if change in stereotyping is observed at all, it is limited to the specific domains modeled in the intervention. For example, 3rd and 4th grade students read eight stories over a 4-week period either depicting a majority of males or a majority of females engaging in traditionally masculine roles. Children who had read about counterstereotypical women reported less stereotypical beliefs about women, but only for the roles that were portrayed by the characters in the stories (Scott and Feldman-Summers, 1979). The limited potential for counterstereotypical role models to eradicate traditional gender role beliefs may be determined by cognitive abilities, which preclude young children from making generalizations to other domains (Bigler and Liben, 1992). However, Trepanier-Street and Romatowski (1999) found stereotype change for occupations that were not included in the intervention. Children from three different preschools read six books over the course of 2 months that depicted both children and adults in counterstereotypical occupational roles. After listening to the stories, children engaged in several activities (e.g., children participated in a group discussion or listened to an adult talking about their career). It is thus possible that children reported fewer gender stereotypes for domains that were not included in the reading because they had also engaged in discussions about other occupational gender roles. Liben and Bigler (1987) also point out that although the abovementioned intervention was successful, the activities varied for each preschool and it therefore remains difficult to evaluate exactly which factor caused the effects and how to replicate them.

Evaluations of studies involving longitudinal exposure to counterstereotypical exemplars suggest that interventions focusing solely on targeting gender roles in one domain may not cause children to alter their gendered behavior in other domains. For example, Nhundu (2007) found that although girls' stereotypes about occupations and their occupational aspirations appeared less gender-traditional following exposure to counterstereotypical occupations, girls still embraced gender roles relating to domestic work and emphasized the importance of women prioritizing family over career. Thus, despite a positive effect on girls' career aspirations, girls' sense of the priority of domestic work for women may counteract these effects. Interventions must therefore be comprehensive and must target gender stereotyping more broadly than the occupational domain. Moreover, it may also be important for interventions to influence not only the role aspirant, but also her family and peers (Adler et al., 1992). Research on an affirmative action program promoting females into leadership positions in local communities showed that counterstereotypical role models who are observable by the entire community influence not only the behavior of the role aspirant but also that of the wider community (Beaman et al., 2012). Specifically, in communities where there had been more than one period with a female leader, girls reported higher educational aspirations, better educational outcomes, and less responsibility for domestic tasks, and parents reported higher career expectations for their daughters. Thus, when the entire community is exposed to female role models, it may make it easier for girls to choose non-traditional paths.

To summarize, brief exposure to counterstereotypical role models appears to change children's gender stereotypes on a short-term basis. However, the changes in stereotypes are not always sustained and do not necessarily affect children's aspirations and behavior. These modest role model effects are not surprising given that the exposures to counterstereotypical exemplars in experimental interventions are brief and might stand in sharp contrast to what the children experience and observe in their everyday life when observing their parents or consuming media. Having said that, we conclude that based on the current literature it would be premature to dismiss the potential of brief exposure to counterstereotypical role models on children's aspirations and behavior. More research is needed to assess not only if, when, and why changes in stereotyping are sustained and internalized, but also whether changes in stereotyping have 'spill over effects' to other domains not present in the interventions. To our knowledge, no research to date has assessed how early exposure to counterstereotypical role models influences girls' later career choices. However, women sometimes attribute their motivation to pursue academic studies to a female role model they were exposed to early in life (Lockwood, 2006). It thus seems reasonable that small changes in interests in early childhood can set the child on a different trajectory that may accumulate into counterstereotypical behavior later on. While it appears that longitudinal exposure to counterstereotypical role models may change children's aspirations, the extent to which changes in aspirations in childhood are realized later on in adulthood is not clear. This is because there is a tendency for role model interventions to focus on gender stereotypes in one domain (e.g., the occupational domain) and not address gender expectations in other domains (e.g., the domestic domain). This may be problematic as some girls may see the home domain and the work domain as mutually exclusive. Due to greater exposure to female role models in the domestic domain than in the occupational domain, expectations to engage in the domestic role (e.g., to look after children at home) may be greater than expectations to engage in the agentic role (e.g., to pursue a high-status career). This means that even though girls may express counterstereotypical occupational aspirations following exposure to counterstereotypical exemplars, these aspirations may clash with gender expectations in the domestic domain later in life, which may preclude girls from pursuing high-status careers. In order for role model interventions to have the predicted effect in adulthood, interventions ought to confront the expectation that women will serve as the primary caregiver by also exposing girls to males engaging in the domestic domain.

# Future Research on Interventions in Childhood

The aim of reviewing interventions in early childhood was not only to evaluate these interventions, but also to identify potential for new research. One implication of this review is that it is not clear whether role model effects are driven by children's propensity to emulate same-sex role models (Bussey and Bandura, 1999), or because counterstereotypical role models lead children to change the way they see themselves (Martin et al., 2002). Thus, future research on interventions should assess gender stereotypes, self-stereotyping, and subsequent behavior to determine whether a change in stereotypes is internalized and acted upon. This could potentially be assessed by observing children's behavior over a long period of time and using child-friendly implicit measures to assess stereotypes (e.g., Green et al., 2004; Most et al., 2007; Banse et al., 2010). Implicit measures may sometimes be preferred over explicit measures as implicit measures are less dependent on young children's ability to report their inner beliefs accurately and less susceptible to social desirability bias. A second future direction derives from the finding that children as young as 3 years old hold stereotypes about communal behavior (Baker et al., 2016). Thus, future research should assess whether children are able to infer communal and agentic traits from counterstereotypical role models, whether they internalize them, and whether this influences a range of behaviors and preferences that were not necessarily targeted in the intervention. In addition, although it has been found that self-efficacy beliefs predict preadolescents' career choices (Bandura et al., 2001), there is to our knowledge no research on whether exposure to counterstereotypical role models influences young children's self-efficacy beliefs. Finally, more research should evaluate existing field-based interventions.

Based on theoretical reasoning, we proposed that observing or interacting with counterstereotypical role models would change children's gender stereotypes and their sense of self. The research reviewed above only partially supports this claim. More research is needed to draw firm conclusions about the impact of counterstereotypical role models on role aspirants, and to integrate other processes that shape girls' aspirations and behavior.

# EFFECTS OF EXPOSURE TO ROLE MODELS IN ADOLESCENCE AND EARLY ADULTHOOD

We now move our focus from childhood and preadolescence to adolescence and early adulthood. Many role model interventions in adolescence and early adulthood are based on the same underlying principle as in early childhood and preadolescence, namely that observers internalize gender-stereotypical knowledge of roles and act accordingly, which results in gender-congruent aspirations and behavior. Interventions in adolescence and young adulthood are typically more focused on a specific domain than in childhood and preadolescence. The ultimate goals of interventions in this age group are to influence girls' and women's academic aspirations and career-related choices, especially focusing on domains where women are underrepresented and negatively stereotyped. To provide a justification for role model interventions, we first review correlations between the number of female role models in non-traditional fields and non-traditional role aspirants. We then turn to direct evidence by summarizing interventions that involve brief exposure to a counterstereotypical role model in the laboratory, and brief or prolonged interactions with a counterstereotypical role model in real life. We finish by outlining recommendations for future research.

# Correlational Evidence

If the proportion of female role models corresponds to the proportion of female role aspirants in non-traditional fields, then it provides prima facie evidence that the role models have influenced observers' achievements, motivation, or goals. There is correlational evidence for the role model effect in several domains where women are underrepresented, including politics, science, and engineering (Sonnert et al., 2007; Wolbrecht and Campbell, 2007). For example, adolescent girls talk more about politics and report more future intentions to engage politically in countries where there is a greater number of female politicians (Wolbrecht and Campbell, 2007). Moreover, research that has looked at the relationship between the number of counterstereotypical role models and the number of counterstereotypical role aspirants at United States universities over time has found that if the percentage of female faculty members in a science and engineering department increases by 10%, the percentage of female majors in biological sciences, physical sciences, and engineering can be expected to increase by 1.2% (Sonnert et al., 2007). The small effect sizes reported may seem to suggest that having more same-sex role models has little relevance to achieving overall gender equality. However, considering the cumulative impact small effects can have in real life over the course of time, these results should not be overlooked (Eagly, 1996). In addition, although the role model effect appears to be small, the effect is more pronounced in the presence of more than one gender-incongruent role model (Nixon and Robinson, 1999; Campbell and Wolbrecht, 2006; Sonnert et al., 2007; but see Canes and Rosen, 1995).

However, it is not possible to infer causal relationships from cross-sectional findings. It could be that a stronger presence of female role models encourages the participation of female role aspirants due to a role model effect or it could be that the corresponding increase in both female role aspirants and female role models is caused by a third unknown variable. Thus, despite promising evidence from correlational studies, experimental or between-subjects design studies are needed to make causal inferences about the impact of gender-incongruent role models on role aspirants.

# Evidence From Interventions

The role model literature in adolescence and adulthood has gained attention in recent years. Experimental laboratory studies have typically involved providing female university students with information about women who are successful in fields where women are underrepresented and negatively stereotyped. Field-based between-subjects design studies have typically assessed the effect of interacting with female counterstereotypical role models. Following exposure to counterstereotypical role models, the extent to which girls or women have internalized the characteristics, behavior, or goals of the role model is assessed. In the following, we review interventions that involve exposure to or interaction with counterstereotypical role models from a broad range of academic or career-related settings. We focus exclusively on interventions in domains where women are underrepresented and negatively stereotyped. We propose that counterstereotypical female role models modify existing knowledge about women, which becomes internalized by the role aspirant, and this internalized knowledge then enhances self-efficacy beliefs, aspirations, and performance.

## Do Adolescents' and Adults' Gender Stereotypes Change Following Exposure to Counterstereotypical Role Models?

One aim of role model interventions using counterstereotypical role models is to change girls' and women's perceptions of what they themselves can or should do by changing perceptions of what women in general can do. Studies have shown that students presented with descriptions or portrayals of non-traditional women changed their stereotypes about women, at least temporarily (Savenye, 1990; Dasgupta and Asgari, 2004; Rosenberg-Kima et al., 2008). For example, Dasgupta and Asgari (2004) presented female students with pictures and descriptions of several famous women in leadership positions in counterstereotypic fields such as science, business, law, and politics. Female students subsequently took part in an Implicit Association Test (Greenwald et al., 1998), which assessed the strength with which they associated women and men with being leaders and supporters. The results showed that female students were quicker to associate women with leadership following exposure to counterstereotypical women. This effect was replicated in a longitudinal design that took advantage of the pre-existing differences in the proportion of female faculty at two universities. These findings suggest that exposure to counterstereotypical exemplars can reduce gender stereotypes.

## Do Adolescents and Adults Internalize Gender Stereotypes Following Exposure to Counterstereotypical Role Models?

Brief exposure to just one counterstereotypical female role model in STEM can also enhance, at least temporarily, female role aspirants' self-efficacy beliefs, determination to succeed, and performance in domains where women are underrepresented and negatively stereotyped (Marx and Roman, 2002; McIntyre et al., 2003; Rosenberg-Kima et al., 2008; Plant et al., 2009; Stout et al., 2011; Shin et al., 2016). The theoretical reasoning that underlies many role model interventions is that women see themselves in line with prevailing stereotypes (Guimond et al., 2006). From this it follows that if a woman starts to perceive women in general as more agentic, she should also view herself as more agentic. In other words, following exposure to gender-counterstereotypical information, role aspirants should see themselves in less stereotypical ways. However, only a handful of studies have assessed the extent to which brief exposure to counterstereotypical role models causes women to internalize counterstereotypical information (also known as self-stereotyping, Guimond et al., 2006).

Several studies show that the way adult women see themselves changes following brief and long-term exposure to counterstereotypical female role models (e.g., Lockwood, 2006; Asgari et al., 2010; Stout et al., 2011; Shin et al., 2016). However, not all role model interventions include a measure of gender stereotypes (e.g., Marx and Roman, 2002), and those that do sometimes fail to find a role model effect on gender stereotypes (Plant et al., 2009; Stout et al., 2011; Shin et al., 2016). For example, Plant et al. (2009) found that although middle-school girls reported greater self-efficacy and greater interest in engineering-related careers after being exposed to female engineers, they still endorsed traditional gender stereotypes related to engineering-related fields. Thus, the evidence as to whether the role model effects reported above were facilitated through a change in gender stereotypes and corresponding self-stereotyping remains inconclusive.

## Is the Role Model Effect Sustained and Does it Generalize to Other Domains?

Adolescents and adults appear to internalize counterstereotypical information immediately following brief exposure to counterstereotypical exemplars. However, since the majority of laboratory-based studies have failed to use a follow-up design, it is not possible to affirm whether brief exposure to counterstereotypical role models has an enduring effect on role aspirants' academic performance and career choices (but see Herrmann et al., 2016). It seems likely that interactions over a long period of time with a counterstereotypical role model have more substantial role model effects than a brief exposure. To address the decreasing proportion of women in advanced STEM courses, several field-based interventions have been implemented during foundational STEM courses. They have found that female students exposed to female role models are more likely to set high-achieving goals and take intermediate courses in their respective fields than those exposed to only male role models (Asgari et al., 2010; Carrell et al., 2010; Porter and Serra, 2017). This role model effect is only observed in subjects where females are underrepresented, which indicates that female professors, rather than being better teachers than male professors, help to break down some of the psychological barriers preventing women from pursuing certain fields (see also Carrell et al., 2010). Thus, it seems that longitudinal exposure to counterstereotypical role models has the potential to enhance the effects reported by studies on short-term exposure. However, we cannot conclude from these studies that female professors affected role aspirants by challenging gender stereotypes. For example, it could be that the female professors facilitated a climate in which female students felt more comfortable actively participating, which had an effect on their performance, and ultimately their aspirations.

For role models to change how role aspirants see themselves, it may not be enough for female role aspirants to become aware that other women have achieved success in a given domain. It may also be critical that the role aspirant see themselves as similar to the role model (e.g., Rosenberg-Kima et al., 2008; Cheryan et al., 2011; Stout et al., 2011; Asgari et al., 2012; Hoyt et al., 2012). For example, Rosenberg-Kima et al. (2008) exposed undergraduate students to either a relevant role model (young and cool) or an irrelevant role model (old and uncool). Female students reported more self-efficacy if they had been exposed to a relevant role model than if they had been exposed to an irrelevant role model. Feelings of similarity are important because they convey the "if she can, so can I" idea to the role aspirant, which facilitates gender-counterstereotypical self-stereotyping. Interventions that fail to facilitate identification with the role model may not result in a role model effect. Studies that have assessed interventions in which adolescent girls engaged in science tasks and interacted with female scientists revealed that girls did not immediately and spontaneously view the female scientists as potential role models (Buck et al., 2008; O'Brien et al., 2017). Specifically, girls only began to view the female scientists as role models after establishing personal connections with them (Buck et al., 2008). Thus, it may be necessary for interventions to allow girls to establish personal bonds with the role model to facilitate aspirations toward a domain, particularly among younger girls who are not already invested in STEM. To highlight similarities between role aspirants and role models, some initiatives have tried to make female counterstereotypical role models more relevant by feminizing them. One example of this is the Science Cheerleaders initiative. In this initiative, girls who pursue science also do cheerleading at public events. The goal of this initiative is to reduce negative stereotyping about female scientists. To our knowledge, there has been no scientific evaluation of the Science Cheerleaders initiative. However, research suggests that employing highly feminine role models may be unsuccessful and even backfire. For example, Betz and Sekaquaptewa (2012) found that 6th and 7th grade girls who did not strongly identify with STEM reported less self-efficacy, less current interest in math, and lower aspirations to pursue math after being exposed to a highly feminine role model in STEM. The feminine role model failed to produce a role model effect because the observers viewed the combination of femininity and success in STEM to be unachievable.

Taken together, brief exposure may inadvertently deter role aspirants from fields where they are underrepresented and negatively stereotyped for two reasons. First, role aspirants see very successful women as exceptions to the rule and therefore not representative of their group (Kunda and Oleson, 1995). Second, role aspirants fail to see themselves in the role model (Rudman and Phelan, 2010; Hoyt and Simon, 2011). For example, Hoyt and Simon (2011) found that after reading about successful female leaders, female undergraduate students not only gave themselves worse evaluations on a leadership task but they also perceived the task as more difficult. This is because observing a counterstereotypical role model may result in a contrast-effect whereby the role aspirants think they cannot achieve the same level of success as the role model (also known as upward comparison threat, Rudman and Phelan, 2010). This is contrary to an assimilation-effect where observers' performance improves following exposure to a successful gender-incongruent role model (Latu et al., 2013). Firm conclusions on why brief exposure to counterstereotypical role models appears to sometimes cause contrast-effects and sometimes cause assimilation-effects cannot be drawn by comparing the design of existing studies. However, it seems that a role model effect is less likely to occur when the role aspirants perceive themselves as unable to achieve what the role model has achieved (Lockwood and Kunda, 1997). For example, when undergraduate women had made an incremental attribution, i.e., when they believed that successful women had achieved success through hard work, discipline, and persistence, they were more likely to associate themselves with leadership traits than when they had made an entity attribution, i.e., when they believed successful women had achieved success because of their talent (Hoyt et al., 2012). This suggests that in order for female counterstereotypical role models to be effective role models and reduce stereotypical beliefs about women's capabilities, it is important that female counterstereotypical role models are seen as representative of women in general.

The research reviewed above suggests that brief and longitudinal exposure to counterstereotypical role models can change women's gender stereotypes and self-stereotyping. Moreover, exposure to or interaction with counterstereotypical role models can enhance role aspirants' immediate self-efficacy beliefs and performance, and even influence role aspirants on a long-term basis by affecting their academic choices. While exposure to counterstereotypical role models appears to break down some of the psychological barriers to women's participation in, or aspirations toward, fields where they are underrepresented, it is not always possible to determine whether changes in self-stereotyping are responsible for these role model effects. Thus, more research is needed to identify when and to what extent changes in self-stereotyping underlie role model effects. The cause of role model effects is interesting from both a theoretical and practical point of view. If the presence of female role models facilitates active participation in class, for example, then active participation may be important for enhancing feelings of self-efficacy and spurring interest toward domains where women are underrepresented and negatively stereotyped (but see Weisgram and Bigler, 2007). If stereotypes drive role model effects, then interventions should focus more actively on challenging stereotypical beliefs about women. Such interventions may benefit from carefully selected role models as similarity between role aspirants and role models seems crucial to facilitate self-stereotyping (McCrea et al., 2012).

# Future Research on Interventions in Adolescence and Adulthood

One of the goals of this review was to identify challenges and limitations in the role model literature for future research to address. Although numerous studies involving counterstereotypical role models have been conducted, they have been conducted with different goals in mind, with samples that are either partly invested or not invested in the role models' field of expertise, and within different academic fields (for an exception, see Shin et al., 2016). This provides a number of questions for future research. First, research should address whether exposure to counterstereotypical role models promotes the same degree of counterstereotypical aspirations in all fields where women are underrepresented and negatively stereotyped. Second, research is needed to explore in greater detail what psychological processes drive these effects. Third, research must systematically assess how interventions are affected by role aspirants' current interest or investment in the field. Fourth, future research must take a more holistic view to incorporate the role of the wider community (e.g., family, peers, or romantic partners) in depressing role model effects. Lastly, empirical research is needed to assess the efficacy of addressing gender roles in domains that seem incompatible with pursuing a career in a high-status field (e.g., marriage-career conflicts, childrearing) for longitudinal success.

Based on theoretical reasoning, we examined empirical support for the notion that observing or interacting with counterstereotypical role models would change adolescents' and adults' self-stereotyping. The research reviewed above only partially supports this claim. More research is required to establish the role of self-stereotyping in role model effects.

# DISCUSSION

The current unequal distribution of women in various occupational roles acts as a psychological barrier to women's entry into certain academic and high-status professional fields. In other words, occupational gender roles are both an antecedent to, and a consequence of, gender congruent behavior. Many initiatives that aim to promote women's entry into fields where they are underrepresented and negatively stereotyped are based on the notion that this can be achieved through exposure to counterstereotypical female role models. The main aim of this review was to infer from correlational, laboratory-based, and field-based studies the potential of counterstereotypical role models to promote girls' and women's aspiration toward counterstereotypical occupational roles by counteracting the endless stream of gender-stereotypical information children, adolescents, and young adults are faced with on a daily basis.

First, we established that long-term exposure to counterstereotypical role models (e.g., mothers in non-traditional work, female politicians, and female faculty) in role aspirants' natural environment positively correlated with their aspiration toward, and engagement with, counterstereotypical roles. Second, we assessed whether these role model effects could be simulated by time-limited role model interventions and, if so, what processes drive these role model effects. Our review of the role model literature showed that brief exposure to counterstereotypical role models in both childhood and adulthood is sometimes able to change stereotypical beliefs about women, at least temporarily. Despite this, we found that role aspirants, particularly young children, did not always internalize characteristics of the role models. On the one hand, it is possible that brief exposure to counterstereotypical role models in early childhood is not sufficient to shift the way young girls perceive themselves. On the other hand, it is possible that the lack of reported role model effects in early childhood is attributable to the limited number of times internalization has been assessed. We initially set out to provide an overview of interventions in childhood, adolescence, and adulthood in order to draw conclusions about what kinds of role model interventions are more effective in early childhood or later in development. However, the limited number of studies on how role models influence children's aspirations and behavior means it would be premature to draw firm conclusions at this point. Third, we assessed whether long-term exposure to counterstereotypical role models generated more pronounced role model effects. We identified that longitudinal interventions, particularly those that involved the community, follow-up activities, or explicit encouragement, appeared to have an effect on children's and preadolescents' aspirations and behavior.
Similarly, longitudinal exposure that facilitated active engagement appeared to enhance role model effects among young adults, particularly among highly motivated students. In comparison to role model research in adolescence and adulthood, role model research in early childhood and preadolescence has not assessed whether factors such as perceived dissimilarity suppress role model effects. In adolescence and adulthood, it is clear that gender-counterstereotypical role models must challenge existing gender stereotypes, but at the same time not be seen as too atypical. Taken together, the reviewed literature suggests that interventions that aim to promote counterstereotypical behavior can be effective at any point in a person's lifespan but should be designed with the role aspirants in mind, considering their current interests and motivations to engage in that behavior.

# POTENTIAL FOR FUTURE ROLE MODEL INTERVENTIONS

The underlying reason why some role model interventions are "successful" is not always clear. Most field-based studies in childhood, adolescence, and adulthood have involved observational learning, active engagement, and sometimes instructional learning (e.g., Jayaratne et al., 2003). The question as to whether role model effects are reliant on both exposure to and interactions with counterstereotypical role models, or whether role model effects can be facilitated by observational learning alone warrants attention. This is important to assess since interventions that utilize mere observations of role models are potentially more cost-effective than interventions that require interactions with counterstereotypical role models over a long period of time (Herrmann et al., 2016). Moreover, there is no evidence to support the hypothesis that children's self-stereotypes change following exposure to counterstereotypical role models. As such, the role model effect observed in childhood may be driven by imitation processes (Social Cognitive Theory, Bussey and Bandura, 1999) rather than by self-stereotyping processes (Gender Schema Theory, Martin et al., 2002). Future research should thus address through what pathway role model effects in childhood occur so this can be directly addressed in interventions.

Although research has not established that mere exposure to counterstereotypical role models promotes counterstereotypical behavior and aspirations in early childhood, several large-scale initiatives have been developed based on this idea. For example, Norway is seeking to recruit more male preschool teachers under the assumption that exposure to men in communal roles will reduce gender stereotyping and promote non-traditional occupational choices among children (see Norwegian Government's Gender Equality Action Plan, 2014). While this initiative has not yet been empirically evaluated, qualitative analyses of children's perceptions of male preschool teachers have found no evidence that daily exposure to counterstereotypical role models (i.e., male preschool teachers) challenges or changes children's stereotypes. First, gender does not appear to be a notable factor in preschool children's descriptions of their male teacher (Sumsion, 2005), meaning that children may not learn to associate men with communal behavior. Second, analyses have suggested that children observe their male preschool teacher as someone who typically engages in stereotypical behavior (e.g., Sumsion, 2005; Harris and Barnes, 2009). For example, Sumsion (2005) found that children never depicted their male preschool teacher engaging in traditional 'female' play but frequently depicted him as heroic and resourceful, as someone engaging in traditional 'male' play. Thus, based on the findings from these qualitative studies, one might conclude that exposure to counterstereotypical role models (although intended to reduce stereotyping) may sometimes inadvertently reinforce traditional gender roles. However, in our opinion, these conclusions should be treated with caution. It might be the case that specific conditions need to be met in order to ensure that male preschool teachers are perceived as role models. 
For example, preschoolers might need to be exposed to more than one counterstereotypical role model in order to generalize the communal behavior they observe in their male teachers to men in general.

More assessments of real-world interventions are needed. First, it is important to consider how the change in stereotypes is measured. Interventions are sometimes deemed successful based on a change in explicit stereotypes (e.g., Leblebicioglu et al., 2011). This could be problematic as research has shown that exposure to counterstereotypical role models enhances women's self-concept and performance through implicit rather than explicit stereotypes (Dasgupta and Asgari, 2004). Second, it is important to consider changes in a range of domains, even those that were not directly targeted in the intervention. Interventions that focus primarily on stereotypes in the occupational domain may not be comprehensive enough to facilitate real change in girls' future career choices because they do not also target gender roles in the domestic domain. Domestic expectations are present early on and may conflict with counterstereotypical aspirations. Thus, in order to demonstrate to girls that pursuing a career and raising children are not mutually exclusive, future interventions may benefit from portraying a female role model who has both a successful career and children. The risk of this approach is that female role models who manage to excel in both occupational and domestic roles may be seen as achieving unattainable success. Future interventions thus need to take care to present relatable role models whose success appears attainable. In order to reduce expectations that women will take on the bulk of domestic work, it may also be important to conduct interventions with boys. Without a corresponding shift in boys' attitudes toward communal roles (Sinno and Killen, 2009), girls may be unlikely to pursue high-status or demanding careers due to difficulties with pursuing a career while simultaneously being primarily responsible for domestic work (Hochschild and Machung, 2012).

# LIMITATIONS AND FUTURE DIRECTIONS

This review includes a selection of articles that are relevant to our specific hypothesis that exposure to or interaction with counterstereotypical role models reduces gender stereotyping and promotes counterstereotypical aspirations and behavior. We conducted a thorough literature review, but not a systematic search, because counterstereotypical role models are variably defined in the literature. We selected literature that both confirmed and challenged our hypothesis, with the aim of producing a balanced narrative review. We encourage researchers to conduct a meta-analysis on the studies reviewed above to integrate role model effects more systematically. More research is also needed on whether exposure to counterstereotypical male role models influences boys' and men's gender stereotyping and career choices. Men are underrepresented in communal occupations and roles (Croft et al., 2015). However, very few field-based role model interventions have been implemented to promote communal behavior in boys and men. Whilst we assume that the same processes that underlie role model effects would apply for boys and girls, experimental research has produced inconsistent findings. Sometimes studies have found a role model effect for girls but not boys, and sometimes studies have found a role model effect for boys but not girls (Katz, 1986; Buren et al., 1993; Green et al., 2004; Pike and Jennings, 2005). Future research should investigate the reason for these mixed findings. On a final note, gender roles have changed over the last few decades. Thus, moving forward, more carefully designed research on the impact of counterstereotypical role models in early childhood and scientific evaluations of initiatives and interventions in adolescence are warranted in order to see whether previous findings replicate across time and contexts.

# AUTHOR CONTRIBUTIONS

SM and MO conceived of the presented idea. MO reviewed the literature. SM supervised the findings of this work. Both authors discussed the results and contributed to the final manuscript.

# FUNDING

The publication charges for this article have been funded by a grant from the publication fund of UiT The Arctic University of Norway.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Olsson and Martiny. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Unnecessary Frills: Communality as a Nice (But Expendable) Trait in Leaders

Andrea C. Vial<sup>1</sup> \* † and Jaime L. Napier<sup>2</sup>†

<sup>1</sup> Department of Psychology, Yale University, New Haven, CT, United States, <sup>2</sup> Department of Psychology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

Although leader role expectations appear to have become relatively more compatible with stereotypically feminine attributes like empathy, women continue to be highly underrepresented in leadership roles. We posit that one reason for this disparity is that, whereas stereotypically feminine traits are appreciated as nice "add-ons" for leaders, it is stereotypically masculine attributes that are valued as the defining qualities of the leader role, especially by men (who are often the gatekeepers to these roles). We assessed men's and women's idea of a great leader with a focus on gendered attributes in two studies using different methodologies. In Study 1, we employed a novel paradigm in which participants were asked to design their "ideal leader" to examine the potential trade-off between leadership characteristics that were more stereotypically masculine (i.e., agency) and feminine (i.e., communality). Results showed that communality was valued in leaders only after meeting the more stereotypically masculine requirements of the role (i.e., competence and assertiveness), and that men in particular preferred leaders who were more competent (vs. communal), whereas women desired leaders who kept negative stereotypically masculine traits in check (e.g., arrogance). In Study 2, we conducted an experiment to examine men's and women's beliefs about the traits that would be important to help them personally succeed in a randomly assigned leader (vs. assistant) role, allowing us to draw a causal link between roles and trait importance. We found that both men and women viewed agentic traits as more important than communal traits to be a successful leader. Together, both studies make a valuable contribution to the social psychological literature on gender stereotyping and bias against female leaders and may illuminate the continued scarcity of women at the very top of organizations, broadly construed.

Keywords: gender roles, gender stereotypes, leader-role expectations, agency, communality

#### Edited by:

Sabine Sczesny, Universität Bern, Switzerland

#### Reviewed by:

Crystal L. Hoyt, University of Richmond, United States

Ioana Latu, Queen's University Belfast, United Kingdom

> \*Correspondence: Andrea C. Vial andrea.vial@yale.edu
>
> †These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 01 April 2018 Accepted: 12 September 2018 Published: 15 October 2018

#### Citation:

Vial AC and Napier JL (2018) Unnecessary Frills: Communality as a Nice (But Expendable) Trait in Leaders. Front. Psychol. 9:1866. doi: 10.3389/fpsyg.2018.01866

# INTRODUCTION

It has been argued that stereotypically feminine traits like communality will define 21st century leaders, and women and men with these attributes will rule the future (Gerzema and D'Antonio, 2013). However, despite the embracing of so-called feminine management, women continue to be highly underrepresented in top executive roles (Catalyst, 2018), and bias against female leaders persists (Eagly and Heilman, 2016; Gupta et al., 2018). We posit that one reason for this disparity is that, whereas communality is appreciated as a nice "add-on" for leaders, it is stereotypically masculine attributes related to agency, such as competence and assertiveness, that are valued as the defining qualities of the leader role, especially by men (who are often the gatekeepers to these roles). We examined this premise in two studies in which we assessed men's and women's idea of a great leader with a focus on gendered attributes.

Although leadership is associated with masculine stereotypes (Schein, 1973; Koenig et al., 2011), this association appears to have weakened somewhat over time (Duehr and Bono, 2006). For example, a meta-analysis that examined the extent to which stereotypes of leaders aligned with stereotypes of men revealed that the masculine construal of leadership decreased significantly between the early 1970s and the late 2000s, as people increasingly associated leadership with more feminine relational qualities (Koenig et al., 2011). One reason for this change is the slow but noticeable surge during this period in the number of management roles occupied by women. It is possible that the rising presence of women in management roles may have reduced the tendency to associate leadership with men, given that women tend to lead differently than men (Eagly et al., 2003), and exposure to counterstereotypic individuals tends to reduce implicit biases (Dasgupta and Asgari, 2004; Beaman et al., 2009). Another reason why leadership perceptions may over time have become more androgynous (i.e., involving more stereotypically feminine in addition to stereotypically masculine qualities) is that the organizational hierarchy has flattened over time (Bass, 1999) and has come to require less directive, top-down approaches to leadership (Eagly, 2007; Gerzema and D'Antonio, 2013).

Effective leadership, which can be highly contextual (Bass, 1999), is thought to be generally participative and transformational (Bass and Riggio, 2006). Transformational leadership styles, which involve motivating, stimulating, and inspiring followers (Burns, 1978; Mhatre and Riggio, 2014), are associated with increased morale and performance at various organizational levels (Wang et al., 2011). They are also associated with female leaders somewhat more so than with male leaders (Eagly et al., 2003; Dezso and Ross, 2011; Vinkenburg et al., 2011) and tend to be viewed as relatively more feminine than autocratic or transactional styles (Stempel et al., 2015). Indeed, there is evidence that transformational leaders tend to blend masculinity and femininity and are overall more androgynous (Kark et al., 2012). Management scholars thus recognize that effective leadership combines both agency-related and communal behaviors and traits (Bass, 1999; Judge and Piccolo, 2004), which are consistently associated with men and women, respectively (Burgess and Borgida, 1999; Prentice and Carranza, 2002). Given a trend toward ever more collaborative work environments in the digital age (Bersin et al., 2017), traits and behaviors typically associated with women such as cooperation and sensitivity to others' needs (Prentice and Carranza, 2002) are sometimes praised as the future of leadership (Gerzema and D'Antonio, 2013).

However, even though leader role expectations may be relatively more feminine today than 40 or 50 years ago (Koenig et al., 2011), women who aspire to top leadership positions continue to be at a considerable disadvantage. For example, women tend to be overrepresented in support and administrative roles (Blau et al., 2013; Hegewisch and Hartmann, 2014), but continue to occupy fewer than half of management positions: they comprised about 34.1% of general and operations managers in 2017 according to the Current Population Survey (U.S. Department of Labor Bureau of Labor Statistics, 2018). The proportion of women is lower in executive positions that confer major decision-making power: They occupied only 28% of chief executive roles in 2017 (U.S. Department of Labor Bureau of Labor Statistics, 2018), and a mere 5% when considering S&P 500 companies, the largest, most profitable firms in the United States (Catalyst, 2018). Although these patterns are likely to result from a variety of factors, including gender differences in interests, goals, and aspirations (Reskin et al., 1999; Diekman and Eagly, 2008; Schneider et al., 2016), there is substantial evidence that at least some of this disparity is due to gender bias (Rudman, 1998; Heilman et al., 2004; Heilman and Okimoto, 2007; Phelan et al., 2008; Rudman et al., 2012).

Bias against female leaders is likely multiply determined. On one hand, it may reflect social conservatism and antifeminist attitudes (Forsyth et al., 1997; Rudman and Kilianski, 2000; Hoyt and Simon, 2016) and a tendency to maintain the traditional status quo where women serve primarily as caretakers (Rudman et al., 2012). For example, different attitudes toward the role of women in society predict liberals' and conservatives' disparate levels of support for female job candidates (Hoyt, 2012). On the other hand, bias against female leaders has also been connected to the perceived relative incongruity (Eagly and Karau, 2002) or lack of fit (Heilman, 1983, 2001, 2012) between the traits typically associated with women and the traditional female gender role and the traits ascribed to the leader role. This low perceived correspondence between feminine stereotypes and leader roles makes women appear unsuitable for authority positions. Moreover, when women demonstrate the kinds of attributes that are deemed requisite for effective leadership (e.g., agency) they sometimes elicit penalties for violating gender role expectations (Heilman and Okimoto, 2007; Rudman et al., 2012; Williams and Tiedens, 2016). The effect of gender stereotypes can make it difficult for women to thrive in leadership roles (Vial et al., 2016), and can compound over time and slow women's advancement in organizational hierarchies (Agars, 2004).

The persistence of bias against female leaders (Eagly and Heilman, 2016) appears in direct conflict with the increased valorization of more androgynous leadership styles that draw from communal, traditionally feminine traits and behaviors (Eagly, 2007; Judge and Piccolo, 2004; Judge et al., 2004; Gerzema and D'Antonio, 2013). This apparent contradiction is the focus of the current investigation, in which we test the idea that communal traits are appreciated in leaders primarily as an accessory or complement to other, more agentic qualities that tend to be viewed as more essential and defining of the leader role. We examined the trade-off that people make when thinking about agency and communality in relation to the leader role, testing the prediction that communal traits are valued in leaders only after reaching sufficient levels of agentic (i.e., more stereotypically masculine) traits. As such, even when leader role expectations may also comprise communal traits (Koenig et al., 2011), agentic traits might still be considered the hallmark of leadership, necessary and sufficient to lead. Communal attributes, in contrast, may be appreciated as nice but relatively more superfluous complements for leaders.

Moreover, even when more communal leadership styles may be increasingly appreciated (Eagly, 2007; Judge and Piccolo, 2004; Judge et al., 2004; Gerzema and D'Antonio, 2013), we propose that the people who most value them happen to be women, who are typically not the gatekeepers to top organizational positions of prestige and authority. There is meta-analytic evidence that the masculine leadership construal tends to be stronger for male versus female participants (Boyce and Herd, 2003; Koenig et al., 2011). Furthermore, compared to women, men evaluate female leaders as less ambitious, competent, intelligent, etc. (Deal and Stevenson, 1998; Vial et al., 2018), and are less likely to select female job candidates (Gorman, 2005; Bosak and Sczesny, 2011; Koch et al., 2015). Thus, the concentration of men in top decision-making roles such as corporate boards and chief executive offices (Catalyst, 2018) may be self-sustaining because men in particular tend to devalue more communal styles of leadership (Eagly et al., 1992; Ayman et al., 2009). In contrast, given that communal traits are more strongly associated with their gender in-group (Burgess and Borgida, 1999; Prentice and Carranza, 2002), women may show more of an appreciation for these traits compared to men (e.g., Dovidio and Gaertner, 1993). In the current studies, we compared men's and women's preferences for communality and agency in leaders.

As stated earlier, the underrepresentation of women in top leadership roles is likely to stem not only from bias against female leaders (Heilman and Okimoto, 2007) but also from women's relatively low interest in pursuing these roles in comparison to men (Diekman and Eagly, 2008; Lawless and Fox, 2010; Schneider et al., 2016). Stereotypes linking leadership with men and communal roles with women might have a negative impact on women's sense of belongingness and self-efficacy in leadership roles (Hoyt and Blascovich, 2010; Hoyt and Simon, 2011). For example, women report lower desire to pursue leadership roles after being exposed to stereotypic media images (Simon and Hoyt, 2013). If communal traits are overall seen as "unnecessary frills" in leaders, as we propose, and if women place higher importance than men on being communal when they occupy a leadership role, such a mismatch might discourage women from pursuing top leadership positions (Heilman, 2001). Thus, in addition to investigating whether men and women value agency and communality differently in leaders, we also considered how much they would personally value such traits if they were to occupy a leadership role.

# OVERVIEW OF RESEARCH

We conducted two studies to assess men's and women's idea of a great leader with a focus on gendered attributes. In Study 1, we examined the attributes that men and women viewed as requisite (vs. superfluous) for ideal leaders. In Study 2, we conducted an experiment to examine men's and women's beliefs about the traits that would be important to help them personally succeed in a randomly assigned leader (vs. assistant) role. In both studies, we measured trait dimensions related to gender roles and leadership including competence and assertiveness (i.e., agency) as well as communality. Agency and communality represent two basic dimensions of person perception and judgments of the self, others, and groups (Fiske et al., 2007; Abele et al., 2016). Agency is typically perceived as more self-profitable than communality, which is more often viewed as benefitting others and, as a result, communality tends to be more valued in others versus the self, whereas the reverse is true for agency (Abele and Wojciszke, 2007). Thus, it is possible that people value communality relatively more when evaluating others (vs. the self) in leadership roles. Here, we investigated how much men and women valued agency and communality when thinking about another in a leader role (Study 1) and when thinking of the self as a leader (Study 2).

Study 1 examined the notion that communal attributes are viewed as highly desirable in leaders—but only after more basic requirements have been met, which map strongly onto stereotypical masculinity (i.e., agency). Past research has examined the extent to which various attributes were seen as relevant to the leader role—either generally characteristic of leaders or typical of successful leaders (e.g., Schein, 1973; Powell and Butterfield, 1979; Brenner et al., 1989; Boyce and Herd, 2003; Sczesny, 2003; Sczesny et al., 2004; Fischbach et al., 2015). However, in those studies, participants rated traits one at a time and in absolute terms (e.g., "please rate each word or phrase in terms of how characteristic it is," on a 5-point scale; Brenner et al., 1989, p. 664). These absolute ratings may mask the potential trade-offs between different traits when evaluating a specific person, whose traits come in bundles (Li et al., 2002).

Specifically, the importance of communal characteristics for leaders may depend on levels of other traits (Li et al., 2002, 2011; Li and Kenrick, 2006), and participants considering such traits in isolation might assume acceptable levels on other desirable attributes (e.g., agency). For example, although communality might make someone desirable as a leader, communality might be considered irrelevant if a leader is insufficiently agentic. We investigated these potential trade-offs in Study 1.

In addition to agency and communality, we included traits that were negative in valence and stereotypically masculine (e.g., arrogant) and feminine (e.g., emotional) in content. Past investigations suggest that negative masculine stereotypes, which map onto a "dominance" dimension and are related to status attainment (Cheng et al., 2013), are strongly proscribed for women (Prentice and Carranza, 2002; Hess et al., 2005). Moreover, a number of investigations have revealed that dominance perceptions play a crucial role in bias against female leaders, who are often viewed as domineering and controlling (Rudman and Glick, 1999; see also Williams and Tiedens, 2016). Similarly, a recent review suggests that negative feminine stereotypes about the presumed greater emotionality of women relative to men (Shields, 2013) are closely linked to bias against female leaders (Brescoll, 2016). For example, men in general tend to be described as more similar to successful managers in emotion expression than are women in general (Fischbach et al., 2015). Thus, in Study 1, we examined participants' interest in minimizing these negative traits when designing their ideal leader.

In Study 2, we examined whether people's leader role expectations differ when they think of themselves occupying that position. Many past investigations have compared perceptions of men and women in general with perceptions of successful managers (Schein, 1973; Powell and Butterfield, 1979; Heilman et al., 1995; Schein et al., 1996; Powell et al., 2002; Boyce and Herd, 2003; Duehr and Bono, 2006; Fischbach et al., 2015). Other studies have documented perceptions of successful male and female managers (Dodge et al., 1995; Heilman et al., 1995; Deal and Stevenson, 1998). We extend this prior work by directly assigning men and women to a leader role (versus an assistant role) and testing which kinds of attributes they view as important for them to be personally successful in that role. The random assignment of men and women to a leader role allowed us to draw a causal link between occupying a leadership role and differentially valuing communality and agency.

In both studies, we compared the responses of men and women, seeking to better understand how their leader-role expectations differ (Koenig et al., 2011). Past work suggests that individuals may generally prefer the kinds of attributes that are viewed as characteristic of their gender in-groups (Dovidio and Gaertner, 1993), and women compared to men have been found to possess less masculine leader-role expectations (Boyce and Herd, 2003; Koenig et al., 2011) and to value female leaders more (Kwon and Milgrom, 2010; Vial et al., 2018). Thus, we were interested in testing whether women might show higher appreciation for communal attributes in leaders in comparison to men.

# STUDY 1: REQUISITE AND SUPERFLUOUS TRAITS FOR IDEAL LEADERS

We tested the notion that communal traits are viewed as desirable in leaders—but only after more basic requirements have been met, namely, agency. We examined participants' preferences for the kinds of traits that would characterize the ideal leader by using a methodology that was originally developed to study mate preferences (Li et al., 2002). This method essentially compares the extent to which different traits are desirable as choices become increasingly constrained, helping distinguish the attributes that are considered truly essential or fundamental in a mate (or in our case, a leader), from traits that are considered luxuries. "Luxury" traits might ultimately be superfluous if the essential attributes (or "necessities") are not met. Conceptually, traits that are viewed as necessities tend to be favored when choices are constrained. As constraints are lifted, fewer resources are devoted to traits that are considered necessities, and more resources are allocated to luxuries.

This approach is apt to reveal the perceived trade-offs between more stereotypically feminine (i.e., communal) and masculine (i.e., agentic) leadership characteristics. By directly examining these trade-offs and identifying necessities and luxuries, we hope to clarify the seeming conflict between the increased valorization of more androgynous leadership styles that draw from traits and behaviors traditionally associated with women (Judge and Piccolo, 2004; Gerzema and D'Antonio, 2013) and the persistence of male bias (Eagly and Heilman, 2016).

We predicted that compared to communal traits, agentic traits would be rated as more of a necessity for an ideal leader, or, in other words, that communality would be treated as more of a luxury than agency. We measured two facets of agency separately, namely competence and assertiveness (Abele et al., 2016). Following Li et al. (2002), we assigned participants increasingly smaller budgets that they were instructed to use to "purchase" different traits to design their ideal leader. Participants made tradeoffs first between traits denoting competence and communality, and then between traits denoting assertiveness and communality. We expected that as people's budgets got smaller, they would prioritize competence and assertiveness over communality.

Finally, to examine the kinds of attributes that people may find intolerable in leaders, we also included negative traits, which map onto relaxed proscriptions (Prentice and Carranza, 2002) for men (e.g., arrogant, stubborn) and women (e.g., emotional, weak). We anticipated that participants might be especially interested in minimizing negative traits that people more commonly associate with men than with women (such as arrogant) as these traits align with the culturally prevalent idea that "power corrupts" (Kipnis, 1972; Keltner et al., 2003; Inesi et al., 2012). In contrast, negative feminine stereotypes, while generally undesirable (Prentice and Carranza, 2002), are not seen as typical of those in top positions, and thus people may be less concerned with curbing these attributes when thinking about an ideal leader. Therefore, we expected to find that participants' responses would reflect a priority to minimize negative traits more stereotypically associated with men over negative traits stereotypically associated with women.

We also considered whether participants would show more of an appreciation for positive traits that are stereotypically seen as characteristic of their gender in-group than positive stereotypes of a gender out-group (e.g., Dovidio and Gaertner, 1993). Thus, we expected female participants to rate communal traits as more necessary than male participants, whereas male participants were expected to see agentic traits (competence and assertiveness) as more necessary than female participants. These predictions also align with past research suggesting that women endorse less masculine leader stereotypes than men (Boyce and Herd, 2003; Koenig et al., 2011) and are more supportive of female leaders (Kwon and Milgrom, 2010; Vial et al., 2018). Additionally, participants were expected to show less of an aversion for negative traits that are stereotypical of their gender in-group than negative stereotypes of a gender out-group; that is, we expected female participants to see it as more of a priority to reduce negative traits commonly associated with men than male participants, whereas male participants were expected to prioritize minimizing negative feminine stereotypes more so than female participants.

# Method

#### Participants

Power analysis performed with G*Power 3.1 (Faul et al., 2007) indicated the need for at least 162 participants to have adequate power (1−β = 0.80) to detect small to medium effect sizes (f = 0.175) for the main effects of budget, participant gender, and their interaction for each of three lists of traits. In total, 281 participants took part in the study via Amazon Mechanical Turk (MTurk). The study was described to potential MTurk participants (i.e., those with at least 85% approval rates) as a short survey on work-related attitudes and impressions of other people, in which participants would be asked to read some materials and answer some questions about their experiences, beliefs, and attitudes. The study took approximately 5 minutes and participants were compensated \$0.55. Eight participants (2.8%) indicated that some of their answers were meant as jokes or were random. We report analyses excluding these 8 participants (n = 273; mean age = 35.94, SD = 11.73; 57.5% female; 76.2% White). One participant (0.4%) did not indicate gender.

#### Procedure

Participants were asked to think about the attributes that would make someone an ideal leader. We asked them to design their ideal leader by purchasing traits from three different lists, and we gave participants a set budget of "leader dollars" that they could spend at their discretion. Each of the three lists contained 10 traits in random order, and participants could spend up to 10 dollars on each trait. For each list of traits, participants were first asked to allocate 60 leader dollars between the 10 traits. Then, participants were asked to do this exercise again two more times, first with a budget of 40 leader dollars, and then with a budget of 20 leader dollars. All stimuli are reported in full in **Appendix A**.

The first list of traits included five agentic/competence traits (capable, competent, confident, common sense, intelligent) and five communal traits (good-natured, sincere, tolerant, happy, trustworthy). The second list included five agentic/assertive traits (ambitious, assertive, competitive, decisive, self-reliant) and an additional five communal traits (cooperative, patient, polite, sensitive, cheerful). The third list included five negative masculine stereotypes (arrogant, controlling, rebellious, cynical, stubborn) and five negative feminine stereotypes (emotional, naïve, shy, weak, yielding), as classified by Prentice and Carranza (2002). The instructions for the third list were slightly different from the first two lists, as participants were asked to indicate how much they would pay so that their ideal leader would not possess each of the 10 negative traits. At the end of the study, all participants were asked basic demographic questions (e.g., age, race), and received a debriefing letter. In both studies, prior to debriefing, we asked participants to indicate whether any of their answers were random or meant as jokes ("yes" or "no"). We reassured participants that they would receive full compensation regardless of their answers to encourage honest responding.
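The constraints of the budget task can be made concrete with a small check. The following is a minimal sketch, not the authors' survey logic: the function name, the response values, and the assumption that each budget had to be spent in full are all illustrative.

```python
# Hypothetical validator for one budget round; all names are illustrative only.
MAX_PER_TRAIT = 10        # "participants could spend up to 10 dollars on each trait"
BUDGETS = (60, 40, 20)    # high, medium, and low budgets, in the order presented


def is_valid_allocation(allocation, budget, n_traits=10):
    """Return True if a {trait: dollars} mapping satisfies the task constraints."""
    if len(allocation) != n_traits:
        return False
    if any(d < 0 or d > MAX_PER_TRAIT for d in allocation.values()):
        return False
    # Assumption: the whole budget must be spent (the text says only "allocate").
    return sum(allocation.values()) == budget


# First list: five competence traits followed by five communal traits.
LIST_1 = ["capable", "competent", "confident", "common sense", "intelligent",
          "good-natured", "sincere", "tolerant", "happy", "trustworthy"]

# Made-up response for the low budget (20 leader dollars).
low = dict(zip(LIST_1, [4, 4, 2, 1, 2, 1, 2, 1, 0, 3]))
print(is_valid_allocation(low, 20))
```

The same check would apply to each of the three lists; for the third list the dollars represent what a participant would pay for the leader not to possess a trait.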

# Analytic Strategy

We first computed the proportion of each overall budget that was allocated to agency/competence versus communality, agency/assertiveness versus communality, and negative masculine versus feminine stereotypes. For the first two, we combined the amounts allocated to agentic traits (competence or assertiveness) for each budget and computed the total proportion such that higher scores indicated a larger proportion of the budget was allocated to agency (competence or assertiveness) versus communality. We followed the same procedure for the negative traits, where higher scores indicated a larger proportion of the budget allocated to eliminate negative traits stereotypically associated with men over those associated with women.

As the budget expands, people allocate an increasingly smaller proportion of their extra income to necessities and spend a larger proportion of income on luxuries. In order to investigate which trait categories were seen as necessities and which were seen as luxuries, we followed Li et al.'s (2002) analytic strategy and compared participant allocations in the low budget (i.e., 20 leader dollars) with how they allocated their last 20 leader dollars. We computed the allocation of the last 20 dollars by subtracting the amount purchased in the medium budget (40 dollars) from that of the high budget (60 dollars), and then divided by 20. This strategy is similar to asking participants how they would allocate an additional 20 leader dollars after they have already spent 40. We submitted the proportion scores for the first 20 and the last 20 leader dollars as repeated measures in three separate Analysis of Variance (ANOVA) tests, one for each trait category (i.e., competence/communality, assertiveness/communality, and negative masculine/feminine stereotypes), with participant gender as between-subjects factor.
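The first-20 versus last-20 comparison described above can be sketched as follows. This is an illustration under stated assumptions: the allocations are made up, the function and variable names are hypothetical, and the authors' exact scoring may differ in detail.

```python
# Sketch of the first-20 / last-20 scores; data and names are illustrative.
AGENTIC = ["capable", "competent", "confident", "common sense", "intelligent"]
COMMUNAL = ["good-natured", "sincere", "tolerant", "happy", "trustworthy"]
TRAITS = AGENTIC + COMMUNAL


def spent_on(allocation, traits):
    """Total leader dollars spent on the given traits."""
    return sum(allocation[t] for t in traits)


def first_and_last_20(low, medium, high, target_traits):
    """Share of the first 20 and of the last 20 dollars given to target_traits.

    first: spending on the target traits in the 20-dollar budget, divided by 20.
    last:  (spending in the 60-dollar budget minus spending in the 40-dollar
           budget), divided by 20.
    """
    first = spent_on(low, target_traits) / 20
    last = (spent_on(high, target_traits) - spent_on(medium, target_traits)) / 20
    return first, last


# Made-up allocations for one participant (each sums to its budget).
low = dict(zip(TRAITS, [4, 4, 2, 1, 2, 1, 2, 1, 0, 3]))     # 20 dollars
medium = dict(zip(TRAITS, [6, 6, 4, 3, 5, 3, 4, 3, 2, 4]))  # 40 dollars
high = dict(zip(TRAITS, [8, 8, 6, 4, 7, 5, 6, 5, 4, 7]))    # 60 dollars

first, last = first_and_last_20(low, medium, high, AGENTIC)
print(first, last)
```

In this made-up case the agentic share of the first 20 dollars exceeds its share of the last 20, which is the pattern that would mark agency as a necessity and communality as a luxury.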

# Results

We examined the bivariate associations between the proportion of budgets allocated to the different sets of traits at the three budget levels. Across budgets, the proportion spent to gain competence (vs. communality) was significantly positively associated with the proportion spent to gain assertiveness (vs. communality) (correlations ranging from r = 0.39 to r = 0.47, all ps < 0.001, depending on budget). Additionally, across budgets, the proportion spent to gain competence (vs. communality) was significantly negatively associated with the proportion spent to minimize negative traits that are more stereotypically masculine (vs. feminine) (correlations ranging from r = −0.33 to r = −0.15, all ps < 0.001, depending on budget). The same pattern emerged even more strongly for the association between the proportions spent to gain assertiveness (vs. communality) and the proportions spent to minimize negative traits that are more stereotypically masculine (vs. feminine) (correlations ranging from r = −0.47 to r = −0.41, all ps < 0.001, depending on budget). In other words, these bivariate correlations suggest that a stronger preference for agency (competence or assertiveness) over communality was associated with a weaker desire to reduce negative masculine traits over negative feminine traits. (Partial correlations controlling for participant gender revealed the same patterns.)
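For readers unfamiliar with partial correlations, the standard first-order formula can be sketched in a few lines. The correlations with participant gender used below are hypothetical placeholders, since the source does not report them; only the value −0.41 is taken from the text.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# r_xy: assertiveness share vs. negative-masculine share (one reported value).
# r_xz, r_yz: HYPOTHETICAL correlations of each score with participant gender.
r_partial = partial_corr(r_xy=-0.41, r_xz=0.10, r_yz=-0.15)
print(round(r_partial, 3))
```

When the control variable is only weakly related to both scores, as assumed here, the partial correlation stays close to the bivariate one.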

#### Competence Versus Communality

There was a significant effect of budget for the competence/communality traits list, F(1,270) = 2780.21, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.911, such that the difference in the proportion allocated to competence relative to communality was higher for the first 20 dollars (M = 0.59, SD = 0.18) compared to the last 20 dollars (M = −0.001, SD = 0.004), M<sub>D</sub> = 0.60, SE = 0.011, 95% CI [0.573, 0.618]. This pattern is consistent with participants viewing competence as more of a necessity and communality as more of a luxury. There was also a significant main effect of participant gender, F(1,270) = 5.50, p = 0.020, η<sub>p</sub><sup>2</sup> = 0.020, and a significant interaction between participant gender and budget, F(1,270) = 5.51, p = 0.020, η<sub>p</sub><sup>2</sup> = 0.020. Men and women differed in their allocation of their first 20 dollars, such that men prioritized competence over communality (M = 0.62, SD = 0.19) to a significantly higher extent than women (M = 0.57, SD = 0.17), M<sub>D</sub> = 0.05, SE = 0.022, 95% CI [0.008, 0.097], p = 0.020, η<sub>p</sub><sup>2</sup> = 0.020. However, men and women allocated the last 20 dollars in a similar way, M<sub>D</sub> = −0.001, SE = 0.001, 95% CI [0.000, 0.002], p = 0.290, η<sub>p</sub><sup>2</sup> = 0.004.

The ideal proportions of competence/communality as a function of budget are presented in **Figure 1A**. As can be seen in the figure, for all three budgets, male as well as female participants spent more on competence traits than on communal traits, and this difference became larger as options became more constrained (i.e., as the budget became smaller). While men's and women's allocations were more similar for the high and medium budgets, when the budget became smaller, men's preference for competence over communality (62% vs. 38% of the budget) was stronger than women's (57% vs. 43% of the budget). In other words, the tendency to view competence as more of a necessity than communality was apparent in both men and women, and men valued competence over communality more strongly than women when choices were constrained.
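The partial eta-squared values reported for the competence/communality list are consistent with the standard conversion from an F ratio and its degrees of freedom. The sketch below applies that general formula to the reported statistics; it is not taken from the authors' analysis scripts, and the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Budget effect for the competence/communality list: F(1,270) = 2780.21.
print(round(partial_eta_squared(2780.21, 1, 270), 3))  # 0.911

# Main effect of participant gender for the same list: F(1,270) = 5.50.
print(round(partial_eta_squared(5.50, 1, 270), 3))
```

Both results agree with the reported effect sizes to three decimals.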

#### Assertiveness Versus Communality

There was also a significant effect of budget for the assertiveness/communality traits list, F(1,270) = 1428.82, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.841, such that the difference in the proportion allocated to assertive over communal traits was significantly higher for the first 20 dollars (M = 0.51, SD = 0.22) compared to the last 20 dollars (M = −0.0005, SD = 0.005), M<sub>D</sub> = 0.51, SE = 0.014, 95% CI [0.488, 0.542]. There was no significant main effect of gender, F(1,270) = 1.49, p = 0.223, η<sub>p</sub><sup>2</sup> = 0.005, and, contrary to predictions, the interaction between budget and participant gender was not significant, F(1,270) = 1.45, p = 0.229, η<sub>p</sub><sup>2</sup> = 0.005. The ideal proportions of assertive versus communal traits as a function of budget are presented in **Figure 1B**. As can be seen in the figure, as the budget became smaller, participants spent slightly but reliably more on assertive traits than on communal traits. This pattern is consistent with participants viewing assertiveness as more of a necessity and communality as more of a luxury.

#### Negative Masculine/Feminine Stereotypes

Finally, there was a significant effect of budget for the last list of traits focused on negative masculine/feminine stereotypes, F(1,270) = 1760.12, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.867. The difference in the proportion allocated to minimizing negative masculine stereotypes relative to negative feminine stereotypes was higher for the first 20 dollars (M = 0.62, SD = 0.24) compared to the last 20 dollars (M = −0.002, SD = 0.005), M<sub>D</sub> = 0.62, SE = 0.015, 95% CI [0.586, 0.646]. This pattern is consistent with participants viewing the minimization of negative masculine stereotypes as more of a necessity and the minimization of negative feminine stereotypes as more of a luxury. There was a significant main effect of participant gender, F(1,270) = 9.22, p = 0.003, η<sub>p</sub><sup>2</sup> = 0.033, and a significant interaction between participant gender and budget, F(1,270) = 8.74, p = 0.003, η<sub>p</sub><sup>2</sup> = 0.031. Men and women differed in their allocation of the first 20 dollars, such that women prioritized the minimization of negative masculine over feminine stereotypes (M = 0.66, SD = 0.24) to a significantly higher extent than men (M = 0.57, SD = 0.24), M<sub>D</sub> = 0.09, SE = 0.030, 95% CI [0.031, 0.148], p = 0.003, η<sub>p</sub><sup>2</sup> = 0.032. Men and women allocated the last 20 dollars in a similar way, M<sub>D</sub> = 0.001, SE = 0.001, 95% CI [−0.001, 0.001], p = 0.770, η<sub>p</sub><sup>2</sup> < 0.001.

The ideal proportions of negative masculine/negative feminine stereotypes as a function of budget are presented in **Figure 1C**. As can be seen in the figure, for all three budgets, male as well as female participants spent higher proportions of their budgets to minimize negative masculine stereotypes than to minimize negative feminine stereotypes, and this difference became larger as options became more constrained (i.e., as the budget became smaller). While women's and men's allocations were more similar for the high and medium budgets, when the budget became smaller, women's interest in minimizing negative masculine stereotypes relative to negative feminine stereotypes (66% vs. 34% of the budget) was stronger than men's (57% vs. 43% of the budget). In other words, the tendency to see it as a necessity to curb negative masculine (vs. feminine) stereotypes was apparent in both men and women, and women devalued negative masculine (vs. feminine) stereotypes more strongly than men when choices were constrained.

# Discussion

The goal of Study 1 was to examine the attributes that men and women view as requisite (vs. superfluous) for ideal leaders. As predicted, leader agency was seen as more of a necessity relative to leader communality, which was viewed as more of a luxury. We found that when people's budgets were constrained, both men and women were more likely to give up communality in favor of both competence and assertiveness.

It is worth noting that, when participant choices were only minimally constrained (i.e., in the high budget condition), the relative preference for assertiveness over communality appeared to reverse. In other words, when they could choose rather freely, participants in this study favored a communal leader over an assertive one. Such reversal is in line with the increased valorization of more androgynous leadership styles that draw from traditionally feminine traits and behaviors (Judge and Piccolo, 2004; Judge et al., 2004; Eagly, 2007; Gerzema and D'Antonio, 2013). However, the methodology employed clearly indicates that communal traits do not hold the same value as assertiveness in relation to idealized leadership, as communal traits were only valuable once agentic attributes had been sufficiently met.

[Figure 1, caption fragment: … assertiveness versus communality. (C) Ideal percentages of the budget allocated to minimizing negative masculine versus feminine stereotypes.]

We found that participants devoted a larger proportion of their budgets to minimizing negative masculine stereotypes, such as arrogant and controlling, than negative feminine stereotypes, such as emotional. This preoccupation with negative masculine stereotypes in particular may reflect a general view that power corrupts (Kipnis, 1972; Keltner et al., 2003; Inesi et al., 2012), as well as an attempt to keep those deleterious effects of power at bay in ideal leaders. In contrast, minimizing negative feminine stereotypes became of interest only after negative masculine stereotypes were sufficiently reduced.

Although both men and women ultimately preferred agency to communality, the results suggest that, compared to men, women prefer leaders who show more of a balance between competence and communality (whereas men more strongly favor competence), and who can keep traits like arrogance or stubbornness in check. In line with our expectation that participants would be more tolerant of negative stereotypes of their gender in-group than negative stereotypes of a gender out-group, we found that women in particular prioritized minimizing masculine negative stereotypes when thinking about an ideal leader. Men seemed more tolerant of these negative traits, which are generally seen as more typical in their gender in-group than the gender out-group (Prentice and Carranza, 2002). Instead, men spent relatively more of their budgets to curb negative feminine stereotypes in leaders.

A potential limitation in Study 1 is that, in the absence of a qualifier, participants might have thought primarily about a male individual when designing their ideal leader—given that these roles historically have been (and continue to be) disproportionally occupied by men (Blau et al., 2013), and given a general tendency to think of men as category exemplars, as reviewed recently (Bailey et al., 2018). Rather than asking participants to design their ideal "female" or "male" leader, which may arouse socially desirable responses, we again examined which traits people think are necessary for leadership in Study 2 by having male and female participants imagine themselves in a leadership (or assistant) role, and then asking them to rate what traits they believe are important to succeed in that role.

# STUDY 2: IMPORTANT TRAITS TO SUCCEED IN LEADER VS. ASSISTANT ROLES

In Study 2, we had participants imagine themselves in either a leadership or assistantship role and examined the extent to which they believed they would need to act in agentic and communal ways in order to be successful in that role. To our knowledge, the present study was the first one to examine adult men's and women's beliefs about the traits they would need to be successful in a randomly assigned leader role. As such, this study is particularly well suited to establish a direct causal link between occupying a leadership role and differentially valuing agentic and communal traits.

We expected that agentic traits, including competence and assertiveness, would be rated as more important to succeed in a leader role, but as less crucial for assistant roles. In contrast, we expected participants to see communal traits, such as patient and polite, as more important to be a successful assistant than a successful leader. Moreover, although previous research has shown that agency is more desirable than communality in the self (as compared to in others) (Abele and Wojciszke, 2007), we predicted that the role would influence the extent to which people find agentic traits desirable in the self. Specifically, whereas we expected that agency would take precedence over communality for participants in the leader role, we expected to find the reverse for those in the assistant role, for whom communality would take precedence over agency.

We anticipated that both male and female participants would rate agentic traits (like competence and assertiveness) as more important to succeed as a leader than communality, similar to past investigations (Koenig et al., 2011). However, we also anticipated an interaction between role and participant gender, such that women compared to men would rate communal traits as more important to succeed as a leader. This is because people tend to favor traits and attributes that are characteristic of their in-groups (versus attributes that are not, or that characterize an outgroup) (Dovidio and Gaertner, 1993), and because women compared to men have been found to possess less masculine leader-role expectations (Boyce and Herd, 2003; Koenig et al., 2011) and to value female leaders more (Kwon and Milgrom, 2010; Vial et al., 2018).

# Method

#### Participants

The study employed a 2 × 2 × 3 mixed design with participant gender (male vs. female) and role condition (leader vs. assistant) as between-subjects factors and trait category (competence, assertiveness, and communality) as a within-subjects factor. We enrolled 252 MTurk participants with a HIT completion rate of 95% or higher, who were compensated \$0.55. The study took approximately 10 minutes and was described to potential participants as a research study about personal experiences, feelings, and attitudes. Three participants (1.2%) indicated that some of their answers were meant as jokes or were random. We report analyses excluding these 3 participants (n = 249; mean age = 32.55, SD = 11.88; 42.6% female; 71.9% White). One participant (0.4%) did not indicate gender. A sensitivity power analysis using G*Power 3.1 (Faul et al., 2007) showed a sample of this size (n = 249) is sufficient to detect a small interaction effect between within- and between-factors, i.e., f(U) = 0.169 with power = 0.80 and f(U) = 0.208 with power = 0.95 (assuming α = 0.05, four groups, and 3 repeated measures).

#### Procedure

Participants first read a short vignette asking them to imagine that they were part of a team working on an important project. The full text of the vignette is presented in **Appendix B**. Half of participants were randomly assigned to a role condition in which they imagined being the team leader, and the other half were assigned to a role condition in which they imagined being the assistant to the leader. All participants were asked to indicate how important each of a series of attributes was to be successful in their role. Specifically, for each trait, they read "As [a leader/an assistant] it is important to be [trait]," and indicated their answer from 1 (not at all) to 7 (extremely so). The list of traits, all of which were used in Study 1, included eight agentic traits, three of which measured competence (i.e., competent, confident, capable; α = 0.75), and five of which measured assertiveness (i.e., ambitious, assertive, competitive, decisive, self-reliant; α = 0.78), and eight communal traits (i.e., cheerful, cooperative, patient, polite, sensitive, tolerant, good-natured, sincere; α = 0.83).<sup>1</sup> Finally, all participants were asked basic demographic questions (e.g., age, race), and received a debriefing letter.
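The α values reported for each trait scale are reliability coefficients (Cronbach's alpha). As a minimal sketch of how such a coefficient is computed, the example below uses the standard formula with made-up 1 to 7 ratings from five hypothetical participants; it does not reproduce the study's data or its reported values.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list per item, each holding one rating per participant
    (same participant order in every list).
    """
    k = len(item_scores)
    sum_item_vars = sum(variance(item) for item in item_scores)
    totals = [sum(ratings) for ratings in zip(*item_scores)]  # per-participant sums
    return (k / (k - 1)) * (1 - sum_item_vars / variance(totals))


# Made-up ratings for the three competence items.
competent = [7, 6, 5, 7, 4]
confident = [6, 6, 4, 7, 3]
capable = [7, 5, 5, 6, 4]

alpha = cronbach_alpha([competent, confident, capable])
print(round(alpha, 2))
```

Scale scores for the analyses would then typically be the mean of the items in each category, although the source does not state how the composites were formed.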

# Results

We conducted a mixed-model ANOVA with participant gender and experimental role condition as between-subjects factors, and trait category (competence, assertiveness, and communality) as a repeated measure. As expected, we found a significant interaction between role and trait category, F(2,243) = 32.31, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.210. The interaction between participant gender and trait category was not significant, F(2,243) = 1.85, p = 0.159, nor was the 3-way interaction between trait category, role, and participant gender, F(2,243) = 1.19, p = 0.306.

<sup>1</sup>The following traits were added for exploratory purposes but were not included in Study 1 or in the analyses for Study 2: excitable, sophisticated, refined, immoral, self-interested, cut-off, cynical, visionary, inspiring, dominant, powerful, independent. After rating all traits, participants indicated how much they identified with their gender in-group. Gender identification did not vary with condition or participant gender and was excluded from all analyses.

All means are represented in **Figure 2**.

Pairwise comparisons revealed that participants in the leader role rated both competence, M<sub>D</sub> = 0.242, SE = 0.09, 95% CI [0.056, 0.428], p = 0.011, and assertiveness, M<sub>D</sub> = 0.839, SE = 0.12, 95% CI [0.599, 1.078], p < 0.001, as significantly more important to succeed compared to participants in the assistant role. In contrast, communality was rated as significantly more important to succeed as an assistant than as a leader, M<sub>D</sub> = −0.218, SE = 0.10, 95% CI [−0.422, −0.013], p = 0.037.

Looking at it another way, participants in both the leader and assistant roles rated competence as the most important set of traits, higher than assertiveness (M<sub>D</sub> = 0.794, SE = 0.09, 95% CI [0.627, 0.961], p < 0.001 in leader role; and M<sub>D</sub> = 1.391, SE = 0.09, 95% CI [1.223, 1.559], p < 0.001 in assistant role) and communality (M<sub>D</sub> = 1.085, SE = 0.07, 95% CI [0.945, 1.226], p < 0.001 in leader role; and M<sub>D</sub> = 0.626, SE = 0.07, 95% CI [0.485, 0.766], p < 0.001 in assistant role). Those in the leader condition rated assertiveness as more important than communality, M<sub>D</sub> = 0.291, SE = 0.09, 95% CI [0.108, 0.474], p = 0.002, whereas those in the assistant condition did the reverse, rating communal traits as more desirable than assertive ones, M<sub>D</sub> = −0.765, SE = 0.09, 95% CI [−0.949, −0.581], p < 0.001.

# Discussion

The goal of Study 2 was to examine men's and women's beliefs about the traits that would be important to help them personally succeed in a randomly assigned leader (vs. assistant) role. As expected, results supported our general predictions. In line with past work (Koenig et al., 2011), people rated competence and assertiveness as more necessary for success as a leader (vs. assistant), and communality as more necessary for success as an assistant (vs. leader). Although competence was seen as relatively more important for leaders than for assistants (as would be expected for a high-status professional role; e.g., Magee and Galinsky, 2008; Anderson and Kilduff, 2009), competence emerged as the most important trait to succeed in both types of roles. Moreover, as we had anticipated, even though people tend to value agency over communality when thinking of the self (Abele and Wojciszke, 2007), role assignment had the effect of reversing this pattern for participants in the assistant role (at least in terms of assertiveness, which assistants rated as less important for them to succeed than communality).

Even though we had expected to find that women (vs. men) would value communal traits to a higher extent (Boyce and Herd, 2003; Koenig et al., 2011), women were just as likely as men to see these traits as relatively unimportant for them personally to be successful in leader roles, and we failed to find any participant gender effects either in the leader or assistant role. This null interaction effect, which stands in contrast to the gender differences we observed in Study 1, might reflect the power of role demands to change self-views (Richeson and Ambady, 2001) and to override the influence of other factors such as category group memberships (LaFrance et al., 2003). Moreover, it is possible that, even if women valued communality more so than men when thinking about other leaders, they may nevertheless feel as though acting in a stereotypically feminine way and behaving less dominantly than a traditional male leader would place them at a disadvantage relative to men (Forsyth et al., 1997; Bongiorno et al., 2014). Such a self-versus-other discrepancy might explain why the expected gender difference in the appreciation of communality relative to agency-assertiveness emerged in Study 1, when participants were thinking of ideal leaders, but was not apparent in Study 2, when participants were asked to think about themselves in a leader role.

## GENERAL DISCUSSION

The main goal of this investigation was to examine people's beliefs about what makes a great leader with a focus on gendered attributes, given that more stereotypically feminine leader traits (i.e., communality) appear to have become more desirable over time (Koenig et al., 2011), and that some have claimed that these attributes will define the leaders of the future (Gerzema and D'Antonio, 2013). The results of the two studies reported here were generally in line with our predictions that men's and women's idea of what it takes to be successful in leadership roles is essentially agency, which is a stereotypically masculine attribute. Communality is appreciated in leaders, but only as a non-vital complement to the fundamentally masculine core of the leader role. Whereas past investigations have reached similar conclusions (e.g., Koenig et al., 2011), the current studies contribute to this body of work in important ways.

This investigation was the first that we know of to examine the potential trade-off between agentic and communal traits in leaders. The results of Study 1 supported the proposed view that communality is valued in leaders only after meeting the more stereotypically masculine requirements of being competent and assertive. Importantly, the methods in Study 1 revealed that communal traits are indeed valued in leaders when choices are unconstrained. These results indicate that when participants rate traits independently from one another, as in past studies (e.g., Schein, 1973; Powell and Butterfield, 1979; Brenner et al., 1989; Boyce and Herd, 2003; Sczesny, 2003; Sczesny et al., 2004; Fischbach et al., 2015), their responses might unduly inflate their true appreciation for communal leader attributes. When choices were constrained, participants in Study 1 showed a clear preference for agentic leader traits (i.e., competence and assertiveness). Other investigations have similarly revealed how subtle differences in the measurement of group stereotypes may change the overall conclusions (Biernat and Manis, 1994). We hope that the methods in Study 1 may be adapted in future investigations to further examine gender leader-role expectations and preferences.

Moreover, the random assignment of men and women to a leader (vs. assistant) role in Study 2 allowed us to establish a direct causal link between occupying a leadership position and differentially valuing agentic and communal traits, extending past investigations (e.g., Heilman et al., 1995; Boyce and Herd, 2003; Duehr and Bono, 2006; Fischbach et al., 2015). We found that men and women were largely in agreement; both indicated that it would be more important for them to possess agentic rather than communal traits in order to be a good leader. These results underscore women's internalization of stereotypically masculine leader role expectations, which could discourage women from pursuing leadership roles (Bosak and Sczesny, 2008; Latu et al., 2013; Hoyt and Murphy, 2016). Furthermore, if women tend to internalize a stereotypically masculine view of leadership, it follows that women who have an interest in and attain leadership roles might have a strong tendency to behave in line with those role expectations—for example, by displaying assertiveness, which could elicit backlash and penalties for violating gender prescriptions (Rudman and Glick, 1999; Phelan et al., 2008).

Alternatively, it is possible that, even though women may value communality in leaders more so than men, as Study 1 revealed, they may nevertheless feel as though enacting these characteristics would make them appear less effective as leaders or place them at a disadvantage relative to male leaders (Forsyth et al., 1997; Bongiorno et al., 2014). For example, past investigations suggest that female leaders who behave in relatively less agentic ways are perceived to be less likable and less influential than similar male leaders (Bongiorno et al., 2014). This differentiation between the traits that women value in leaders and the traits they feel as though they must exhibit to be successful in that role (perhaps to be taken seriously by others in that role; Yoder, 2001; Chen and Moons, 2015) may explain why we did not find the predicted interaction with participant gender in Study 2.

# LIMITATIONS AND REMAINING QUESTIONS

Although the random assignment of men and women to a leader (vs. assistant) role in Study 2 allowed us to extend past investigations by drawing causal links between roles and trait desirability, a potential limitation in our approach is that the role manipulation may also conceivably lead to a difference in psychological feelings of power across conditions (Anderson and Berdahl, 2002; Schmid Mast et al., 2009). Given the large conceptual overlap between leadership and "power" (commonly defined as asymmetric control over resources; Keltner et al., 2003), it is possible that the results of Study 2 reflect at least in part the way men and women feel when they are in a position of power, independently from their role as leaders or assistants. Future investigations may address this issue by measuring felt power (Anderson et al., 2012) to examine whether participants value similar traits as they did in Study 2 over and above felt power. For example, it is conceivable that individuals in leadership roles that foster stronger (vs. weaker) feelings of power might value communality to a lower extent, and behave more dominantly overall (e.g., Tost et al., 2013).

Another potential limitation in Study 2 is that participants assigned to the assistant role condition might have assumed that the team leader was male—consistent with the notion that people think "male" when they think "manager" (Schein, 1973). Therefore, it is unclear whether the traits that they thought would help them be a successful assistant would be contingent on the assumption that they would be assisting a male-led team. Future investigations may probe whether people believe that it takes different attributes to successfully work for a female versus a male leader, and how those beliefs impact their support for male and female supervisors. For example, if men think that a female leader would expect more cooperation from subordinates than a male leader, this expectation may partly explain their reluctance to work for women.

It is also worth noting that, in both studies, we did not specify the context under which leadership (and, in Study 2, assistantship) was taking place. It seems likely that participants were thinking of some traditionally male-dominated domain (as businesses typically are). However, one important next step for future work is to examine whether the leadership domain affects which traits people value in leaders, and which traits they would find valuable for them, personally, to be a successful leader. Leaders tend to be considered particularly effective in industries and domains in which the gender composition is congruent with the gender of the leader (Ko et al., 2015; see also Eagly et al., 1995). It is conceivable that being the leader of a team that is working in a traditionally feminine domain (e.g., childcare, nursing, or even a business that caters primarily to women, such as maternity-wear or cosmetics) might change people's perception of which traits are most important.

Whereas our investigation was focused on the general dimensions of agency and communality (Abele et al., 2016), future research might adapt the methodology of Study 1 to examine the potential tradeoffs between other kinds of leader attributes. For instance, past research has examined task-oriented versus person-oriented trait dimensions (Sczesny et al., 2004), traits related to activity/potency (e.g., forceful, passive; Heilman et al., 1995), "structuring" versus "consideration" behaviors (Cann and Siegfried, 1990; Sczesny, 2003), and transformational leader traits (Duehr and Bono, 2006), to name a few. In particular, given that transformational leadership styles tend to be quite favorable in contemporary organizations (Wang et al., 2011), and are more closely associated with femininity (Kark et al., 2012; Stempel et al., 2015), it would be especially interesting to examine whether such transformational leader attributes are also considered "unnecessary frills" (much like communal attributes in Study 1). As mentioned earlier, the context of leadership (more male- vs. more female-dominated) may be an important moderating factor worthy of consideration (Ko et al., 2015). For example, male followers appear to react more negatively to transformational leadership styles compared to female followers (Ayman et al., 2009). Thus, it is possible that the tradeoff between more and less transformational leadership attributes may partly depend on the specific industry or domain.

Similarly, whereas we examined two sub-dimensions of agency (i.e., competence and assertiveness) following Abele et al. (2016), we did not distinguish different facets within the dimension of communality. Specifically, research suggests that communality may be broken into sub-dimensions of warmth or sociability (e.g., friendly, empathetic) and morality (e.g., fair, honest) (Abele et al., 2016), a distinction that may be meaningful and consequential in the evaluation of leaders. It has been argued that morality in particular, more so than warmth/sociability, plays a primary role in social judgment (Brambilla et al., 2011; Brambilla and Leach, 2014; Leach et al., 2017), and moral emotions are implicated in bias against agentic female leaders (Brescoll et al., 2018). Thus, future investigations may examine how the tradeoff between agency and communality explored in our research might change when the morality facet of communality is considered separately from the warmth/sociability facet.

Additional research may extend the current investigations by adapting the methodology we employed in Study 1 (which we, in turn, adapted from Li et al., 2002) in various ways to further examine leader-role expectations and preferences for communality and agency in leaders (both in others and in the self). Whereas we did this in the current investigation by testing the potential tradeoffs between ideal levels of communal and agentic traits (Study 1) and the extent to which men and women viewed those traits as personally important to succeed in a leader (vs. assistant) role (Study 2), it would be worthwhile to merge these two paradigms in the future. For example, men and women in leadership roles might be asked to think about the traits they would need to be successful and then to "purchase" various amounts of those traits for themselves. Similarly, participants could be asked to purchase traits to design the ideal leader versus the ideal subordinate (e.g., the perfect assistant).

# IMPLICATIONS AND CONCLUSION

The findings from this investigation may illuminate the continued scarcity of women at the very top of organizations, broadly construed (Eagly and Heilman, 2016; Catalyst, 2018). Overall, across studies, both women and men saw communality as relatively unimportant for successful leadership. These traits, however, make women particularly well suited to occupy low status positions (Study 2), which may contribute to gender segregation (Blau et al., 2013) via women's self-selection into low status roles (Diekman and Eagly, 2008; Schneider et al., 2016).

On a more positive note, our results also suggest that women may be more supportive than men of leaders who exhibit more feminine leadership styles. We found, as we had expected, that women showed higher appreciation for communal attributes in leaders in comparison to men (Study 1). Furthermore, in Study 1 we also examined participants' interest in minimizing negative traits stereotypically associated with men and women when designing their ideal leader. Rather than desiring leaders to possess lower amounts of negative traits that are more stereotypically feminine (such as being emotional; Shields, 2013), participants desired leaders to lack negative traits more commonly associated with men (like arrogance; Prentice and Carranza, 2002), and this preference was stronger among women compared to men.

Whereas many studies have assumed to some extent that descriptive gender and leader stereotypes are similarly shared by men and women (see review by Rudman and Phelan, 2008), our results suggest that this assumption needs to be reconsidered, particularly with respect to gender traits that are relevant to leadership. Even when men and women agreed on the attributes they would personally need to be successful leaders (Study 2), Study 1 showed that women ideally prefer leaders who are more communal relative to men, and that they feel more negative than men about certain aspects believed to characterize both men and leaders (arrogance). These subtle gender differences in leader-role expectations dovetail with past investigations showing patterns consistent with gender in-group favoritism effects (Tajfel et al., 1971; Greenwald and Pettigrew, 2014) on evaluation of female and male authorities (Eagly et al., 1992; Norris and Wylie, 1995; Deal and Stevenson, 1998; Ayman et al., 2009; Kwon and Milgrom, 2010; Bosak and Sczesny, 2011; Paustian-Underdahl et al., 2014; Vial et al., 2018). For example, past studies have revealed that women have more positive attitudes toward female authorities compared to men, whether implicit (Richeson and Ambady, 2001) or explicit (Rudman and Kilianski, 2000). Similarly, a recent investigation revealed that female employees working for female supervisors tend to respect those supervisors more than male employees do, and engage in positive work behaviors more frequently than male employees when working for a woman (Vial et al., 2018).

Overall, the two studies reported here further suggest that women might be relatively more supportive of leaders with more communal leadership styles compared to men. Thus, while it may be too soon to tell whether these stereotypically feminine traits will indeed define the leaders of the 21st century (Gerzema and D'Antonio, 2013), our investigation suggests that women might be more willing than men to embrace this trend.

# ETHICS STATEMENT


This study was carried out in accordance with the recommendations of the American Psychological Association's Ethical Principles in the Conduct of Research with Human Participants. The protocol was approved by the Institutional Review Board at Yale University. All subjects gave written informed consent in accordance with the Declaration of Helsinki.


# AUTHOR CONTRIBUTIONS

AV wrote the first draft of the manuscript. JN provided feedback and edits. Both authors worked collaboratively on study design and data collection and analysis.

# ACKNOWLEDGMENTS

The authors wish to acknowledge Isabel Bilotta for her assistance with Study 1, and Marianne LaFrance, April Bailey, Natalie Wittlin, Alex Noyes, Brian Earp, and Lucy Armentano for their helpful feedback.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Vial and Napier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX A

# STUDY 1 STIMULI


Participants first read the following preliminary instructions:

Please take a moment to think about the characteristics that would make someone your ideal leader. By "leader" we mean someone within a group who:

- Controls group resources.
- Hires, promotes, and fires group members.
- Determines what needs to be done in order to achieve the group's goals.
- Assigns tasks to group members.
- Evaluates group members' performance.

Ultimately, a leader is responsible for the group's outcomes. In this study, we will ask you to "design" your IDEAL LEADER by purchasing traits from a predetermined list. We will give you a budget of "leader dollars" which you can spend at your discretion.

Participants then saw three lists of traits, one at a time. The first two lists were prefaced by the following instructions:

Please design your ideal leader using the traits listed below. How many leader dollars would you spend for your ideal leader to possess each of these traits? For each trait, drag the bars to indicate how many leader dollars you would be willing to spend for your ideal leader to possess the trait in varying amounts. For example, if your ideal leader would be highly creative, you may want to spend \$9-10 leader dollars on that trait. In contrast, if your ideal leader would be only a little extroverted, you may want to spend \$0-1 leader dollars on that trait.

Participants then saw the list of traits, including a budget specification (e.g., "Your total budget is \$60. You may not exceed this budget when designing your ideal leader.") After rating all traits on a given list, participants were prompted to do this again with a different budget:

Now we would like you to try this again, only this time you have fewer leader dollars to spend on your ideal leader. For each trait, drag the bars to indicate how many leader dollars you would be willing to spend for your ideal leader to possess the trait in varying amounts.

These instructions were accompanied by a new budget specification (e.g., "Your total budget is \$40. You may not exceed this budget when designing your ideal leader.") The task instructions were the same for the two lists containing positive traits (e.g., competence/communality and assertiveness/communality). Finally, the instructions for the third list, which contained negative masculine and feminine stereotypes, read as follows:

Now we are interested in which characteristics you would not want your ideal leader to possess. How many leader dollars would you spend for your ideal leader not to possess each of these traits? For each trait, drag the bars to indicate how many leader dollars you would be willing to spend for your ideal leader not to possess the trait in varying amounts. For example, if you would strongly prefer that your ideal leader not be lazy, you may want to spend \$9-10 leader dollars to avoid that trait. In contrast, if you have only a modest preference that your ideal leader not be forgetful, you may want to spend \$0-1 leader dollars to avoid that trait.

These instructions were followed by budget specifications.
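The budget task described above can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes a 0-10 slider per trait (inferred from the "\$9-10" example in the instructions), uses the example budgets of \$60 and \$40, and relies on made-up trait names and dollar amounts. The function names `validate_allocation`, `budget_shares`, and `share_shift` are hypothetical. The sketch checks that an allocation respects the budget and then compares the share of the budget given to each trait under the larger and the smaller budget; a trait whose share grows when money is tight is being protected, which is one simple way to express the idea that some traits are treated as necessities and others as extras.

```python
from typing import Dict

MAX_PER_TRAIT = 10  # assumed slider ceiling, inferred from the "$9-10" example


def validate_allocation(spend: Dict[str, int], budget: int) -> None:
    """Raise ValueError if an allocation breaks the task rules."""
    for trait, dollars in spend.items():
        if not 0 <= dollars <= MAX_PER_TRAIT:
            raise ValueError(f"{trait}: {dollars} is outside 0-{MAX_PER_TRAIT}")
    total = sum(spend.values())
    if total > budget:
        raise ValueError(f"total spend {total} exceeds budget {budget}")


def budget_shares(spend: Dict[str, int], budget: int) -> Dict[str, float]:
    """Return each trait's spend as a fraction of the available budget."""
    validate_allocation(spend, budget)
    return {trait: dollars / budget for trait, dollars in spend.items()}


def share_shift(
    high_spend: Dict[str, int],
    high_budget: int,
    low_spend: Dict[str, int],
    low_budget: int,
) -> Dict[str, float]:
    """Change in budget share per trait when moving to the smaller budget."""
    if high_spend.keys() != low_spend.keys():
        raise ValueError("both allocations must rate the same traits")
    high = budget_shares(high_spend, high_budget)
    low = budget_shares(low_spend, low_budget)
    return {trait: low[trait] - high[trait] for trait in high}


# Hypothetical allocations from a single participant (illustrative only).
high = {"competent": 10, "capable": 10, "assertive": 9,
        "decisive": 9, "warm": 8, "caring": 8}
low = {"competent": 10, "capable": 9, "assertive": 7,
       "decisive": 7, "warm": 4, "caring": 3}

shift = share_shift(high, 60, low, 40)
for trait, delta in shift.items():
    label = "protected" if delta > 0 else "cut back"
    print(f"{trait:10s} {delta:+.3f}  {label}")
```

Dividing by the budget rather than by the dollars actually spent is a design choice made here for simplicity; because participants may spend less than the full budget, an analysis could reasonably normalize by total spend instead.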

# APPENDIX B

# STUDY 2 STIMULI

Participants first read the following instructions, customized to condition. In the leader role condition, the text read:

Imagine you are leading a team on a special and important new project. As the leader, you are in charge of putting together a team of people to assist you in completing the project. You also determine what needs to be done in order to achieve your goals, and you assign tasks to your team members as you consider appropriate. As the leader, you also make sure team members follow your instructions and deliver in a timely manner, without missing any important deadlines. Ultimately, you are responsible for the final product, and it is your job to lead the team effort to realize your vision and complete the project successfully.

In the assistant role condition, participants read the following:

Imagine you are assisting a leader on a special and important new project. As an assistant, your job is to provide support to the team leader in completing the project. The team leader determines what needs to be done in order to achieve the team's goals, and assigns tasks to you as appropriate. As an assistant, you follow the leader's instructions, and you must deliver in a timely manner, without missing any important deadlines. Ultimately, the leader is responsible for the final product, and it is your job to help realize the leader's vision and support and assist the leader to complete the project successfully.

After reading these role instructions, all participants read the following instructions prior to rating a series of traits:

Below is a list of traits and attributes. Please indicate how important each of them is to be successful in your role as (team leader / team assistant). In other words, consider how much each of these traits would help you fulfill your role as (team leader / team assistant).

# Overlooked Leadership Potential: The Preference for Leadership Potential in Job Candidates Who Are Men vs. Women

Abigail Player, Georgina Randsley de Moura\*, Ana C. Leite, Dominic Abrams and Fatima Tresh

Centre for the Study of Group Processes, School of Psychology, University of Kent, Canterbury, United Kingdom

Two experiments tested the value people attach to the leadership potential and leadership performance of female and male candidates for leadership positions in an organizational hiring simulation. In both experiments, participants (Total N = 297) valued leadership potential more highly than leadership performance, but only for male candidates. By contrast, female candidates were preferred when they demonstrated leadership performance over leadership potential. The findings reveal an overlooked potential effect that exclusively benefits men and hinders women who pursue leadership positions that require leadership potential. Implications for the representation of women in leadership positions and directions for future research are discussed.

#### Edited by:

Sabine Sczesny, University of Bern, Switzerland

#### Reviewed by:

Susan Murphy, University of Edinburgh, United Kingdom Marcel Zentner, University of Innsbruck, Austria

#### \*Correspondence:

Georgina Randsley de Moura G.R.de-Moura@kent.ac.uk

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 30 April 2018; Accepted: 19 March 2019; Published: 16 April 2019

#### Citation:

Player A, Randsley de Moura G, Leite AC, Abrams D and Tresh F (2019) Overlooked Leadership Potential: The Preference for Leadership Potential in Job Candidates Who Are Men vs. Women. Front. Psychol. 10:755. doi: 10.3389/fpsyg.2019.00755

Keywords: leadership, potential, gender, women, talent management, hiring decision

# INTRODUCTION

"Women hold up more than half the sky and represent much of the world's unrealized potential." Ki Moon (2011)

The unbalanced representation of women in leadership is a significant social, cultural, and organizational issue. Given that women now represent 40% of the global working population (The World Bank, 2017), it would be reasonable to expect a comparable gender ratio in leadership roles. However, women hold only 34% of managerial positions around the world (World Economic Forum [WEF], 2018), and even fewer of the top roles. For example, in the United States fewer than 5% of Fortune 500 CEOs are women (Zarya, 2018). Thus, the persistent underrepresentation of female CEOs across different countries suggests that women face significant gender bias in the processes involved in the hiring and promotion of leaders. It may be that women's different career trajectories render them less likely to occupy management positions than men (e.g., Eagly and Karau, 1991, 2002; Ryan and Branscombe, 2012; Hoobler et al., 2014). Moreover, some research indicates that there are exceptions to the preferential selection of male leaders; for example, women are more likely to be appointed to risky or precarious positions (the glass cliff; see Ryan and Haslam, 2005). Nonetheless, the evidence overall indicates that women are less likely than men to be appointed to top leadership roles (Moss-Racusin et al., 2012; Chartered Management Institute [CMI], 2016; Glass and Cook, 2016).

# Leadership Potential

Identifying talent for the future is key for organizations, and confers a competitive advantage (Silzer and Dowell, 2010). Talent management systems and leadership potential programs are designed to identify those individuals who will be leaders in the future and occupy senior positions (Church et al., 2015). Leadership potential specifically refers to exhibiting the qualities that signal future leadership effectiveness (e.g., Silzer and Borman, 2017). There are several frameworks that identify key characteristics of leadership potential, one of the most prominent being analytical capability (e.g., strategic insight, Dries and Pepermans, 2012). However, most research on leadership potential has confounded it with current and past performance rather than focusing on distinct indicators of leadership potential (Silzer and Church, 2009). Specifically, leadership potential and leadership performance are highly conflated in practice, because indicators of high performance often provide the only source of information about potential. The use of high-performance indicators to measure potential has been criticized because performance is limited to the requirements of an individual's current role, and may not extend to success at the next level (Robinson et al., 2009). Indeed, performance indicators can create a "halo effect" that may overinflate perceptions of leadership potential (Balzer and Sulsky, 1992; Konczak and Foster, 2009).

An operational distinction between potential and performance was provided by Tormala et al. (2012). Participants were presented with competing candidates who were either higher in potential or higher in performance. Future potential overshadowed previous performance with respect to participants' evaluations of impressiveness and endorsement across a range of domains (e.g., art, sport, graduate school entry, and job recruitment). For example, participants judged two candidates with equivalent educational and professional backgrounds for a managerial position at a large company (Tormala et al., 2012, Experiment 2). One of the candidates had purportedly scored higher on a leadership achievement inventory, whereas the other scored higher on an assessment of leadership potential. Participants recognized that the candidate with higher leadership achievement had a more impressive résumé, but they expected the candidate with higher leadership potential to perform better in the future. Therefore, in this research we operationalize leadership potential and leadership performance as distinct leadership characteristics.

Assessments of leadership performance involve judgments of a number of different leadership traits or characteristics (e.g., vision, interpersonal, task-orientated). Previous research has found that assessments of women were higher than those of men on leadership performance but lower than those of men on vision and strategy (e.g., Ibarra and Obodaru, 2009; Roth et al., 2012). Differential ratings on vision and strategy might have consequences for leadership selection given that strategic insight, and analytical skills in general, are acknowledged as key indicators of leadership potential (e.g., Marshall-Mies et al., 2000; Silzer and Church, 2009; Dries and Pepermans, 2012). For example, Ibarra and Obodaru (2009) studied 2,816 female and male executives across 149 countries, analyzing 22,244 evaluations, and found that women were rated better than or equal to men across a range of measures but that men were rated significantly higher than women on "visioning", the ability to put forward a compelling vision and strategy. Moreover, a meta-analysis of field studies (N = 45,733) revealed that women were evaluated more favorably than men on overall job performance ratings. Yet women were rated lower than men on the measure of future performance and promotability (Roth et al., 2012). Such differences might arise partly because women are more likely to take on tasks which require competence, but do not improve chances of promotion (e.g., committee service; Babcock et al., 2017). Nonetheless, research on the power of gender stereotypes and decisions about leadership is conclusive: all else being equal, women are judged more harshly than men (e.g., Rudman and Glick, 2001; Lyness and Heilman, 2006; Blau and DeVaro, 2007).

# Gender Bias in Leadership Selection

Social roles include both descriptive beliefs that define what men and women are like, and also prescriptive norms that define how individuals should be and how they should not be (Eagly et al., 2000; Eagly and Wood, 2012). According to social role theory (Eagly and Wood, 1999, 2012), typical gender roles (e.g., women overpopulating communally demanding roles and men overpopulating agentically demanding roles) are likely to persist because people consistently witness typically female and male behavior and conclude that these characteristics are representative of the sexes. Indeed, because people are frequently exposed to typical sex-typed behavior, women are typically perceived as, and expected to be, communal (e.g., caring, sensitive), whereas men are expected to be agentic (e.g., determined, competitive; Eagly and Karau, 1991; Eagly et al., 1995; Heilman, 2001; Eagly and Sczesny, 2009; Rosette and Tost, 2010; Koenig et al., 2011). In those workplaces where agency instead of communality is expected, stereotypes produce distinctive penalties for women (Caleo and Heilman, 2013). In particular, meta-analysis shows that leadership roles are still typically viewed as being agentic (Koenig et al., 2011), and therefore men will be perceived as more capable leaders (Levinson and Young, 2010).

When women demonstrate success in leadership roles, they can be penalized because they violate gender-prescriptive norms (Heilman et al., 2004) or contextual expectations (e.g., Randsley de Moura et al., 2018). Ultimately, when people interrupt gender stereotypes, they can suffer consequences that undermine and devalue their social and economic status (Rudman and Phelan, 2008). Women who put themselves forward for positions of leadership can therefore face backlashes that undermine their status (Rudman and Phelan, 2008). In support of this idea, the devaluation of women leaders is more pronounced when they occupy male-dominated roles (Eagly et al., 1992). Meta-analysis has also highlighted that women who display explicitly dominant behaviors (e.g., direct demands) are perceived as less hirable – because they are rated lower in likeability rather than any reduction in perceived competence (Williams and Tiedens, 2016).

In this paper, we test the hypothesis that women's leadership potential is more likely to be dismissed than men's leadership potential. This is consistent with the "think manager-think male" phenomenon (e.g., Schein et al., 1996). Substantial evidence suggests that the stereotype of a typical leader is highly congruent with masculine traits (Eagly and Karau, 2002; Koenig et al., 2011). The incongruence between the stereotype of a typical leader and feminine traits may explain why women face more challenging thresholds for promotion. For example, Lyness and Heilman (2006) found that women who occupied management positions that were typically characterized by organizational power and influence (i.e., gender role incongruous) also received lower performance ratings than their male counterparts. In summary, we expect an overlooked potential effect such that women's but not men's leadership potential is likely to be overlooked when people judge and select candidates for leadership.

Research indicates that evaluations of leaders and promotion to leadership positions are likely to be biased in favor of men. For instance, a meta-analysis (Koch et al., 2015; N = 22,348) revealed a bias for men in male-dominated roles (e.g., in a leader position). However, that role congruity bias was attenuated when information clearly highlighted a candidate's high competence. We hypothesized that a female candidate's leadership potential may only be acknowledged if she is unambiguously a high performer (i.e., when her leadership achievements cannot be dismissed).

# OVERVIEW OF STUDIES

Previous studies have found that gender role incongruity (see Heilman and Eagly, 2008) contributes to gender inequality in leadership positions, but to date there is no explicit experimental evidence on gender biases in the recognition of leadership potential. Given the importance of recognizing and effectively managing talent for businesses (Church, 2014), it is essential to investigate gender as a boundary condition to perceptions of leadership potential. Holding constant the actual traits and performance of candidates, two experimental studies used simulated hiring decisions to investigate whether leadership potential is overlooked in women, but not in men.

We used a simulation of organizational hiring of candidates applying for leadership positions. This experimental vignette methodology was used as it is regarded as a reliable and accurate method that allows greater control of the research process (Handley et al., 2007; Doz, 2011). In addition, we recruited participants through online crowdsourcing portals to provide relevant samples (e.g., Buhrmester et al., 2011; Holden et al., 2013).

Experiment 1 tested the effects of candidate gender on the recognition of leadership potential. Specifically, we tested whether there is a preference for potential in both male and female candidates, or whether people overlook leadership potential in female candidates. We also explored whether the decision makers' gender moderated the preference for potential in each gender of candidate. Experiment 2 investigated the evaluation of leadership potential and leadership performance, candidate gender, and decision makers' gender when leaders were being hired for a senior management position. Taken together, these studies examined whether leadership potential is overlooked in women who seek progression into leadership positions, relative to men with identical résumés.

Specifically, we hypothesized that participants would prefer leadership potential over leadership performance (Hypothesis 1). We expected that participants would prefer leadership potential more in male candidates than in female candidates (Hypothesis 2). More importantly, when it comes to candidate choice, we hypothesized that participants would prefer leadership potential over leadership performance in male candidates (Hypothesis 3); but leadership performance over leadership potential in female candidates (Hypothesis 4). In addition, we hypothesized that high leadership potential male candidates would be selected more than high potential female candidates (Hypothesis 5).

All experiments were carried out in accordance with the recommendations of the School of Psychology Ethics Committee at the University of Kent, United Kingdom. The protocol was approved using the School of Psychology Ethics system. All participants gave written informed consent in accordance with the Declaration of Helsinki. The research was conducted in accordance with guidelines from the University of Kent Research Ethics (Human Participants) Committee, the Economic and Social Research Council (ESRC) Research Ethics Framework, and the ethical guidelines from the British Psychological Society (BPS).

# EXPERIMENT 1

# Materials and Methods

#### Participants and Design

We recruited 98 participants (59 males and 39 females, Mage = 36.38, 79.6% employed) via Amazon MTurk. The quasi-experimental design was a 2 (Leadership Characteristic: leadership potential, leadership performance) × 2 (Candidate Gender: female, male) × 2 (Participant Gender: female, male) mixed design, with leadership characteristic and candidate gender as within-participant factors. All additional candidate information (e.g., age, qualifications, work experience, GPA) was counterbalanced.

#### Procedure and Materials

Participants were presented simultaneously with four candidates (male candidate with leadership potential, male candidate with leadership performance, female candidate with leadership potential, female candidate with leadership performance; see **Appendix**), shown in random order from left to right. Participants were asked to imagine they worked for a hypothetical organization "ALPHATech" and that they were involved in the recruitment and selection of a new employee:

"ALPHATech is a successful business providing financial and economic advice (e.g., tax, investments, account management, and pensions) to a number of different industries. Imagine that you work in a human resources role and you are part of the team responsible for recruiting and hiring new employees. ALPHATech are currently expanding their business and as part of this are recruiting for a number of positions within the company. Imagine that you are part of the hiring panel and you have been given some candidates to evaluate."

Candidate potential and performance were manipulated by adjusting the score on two assessments: leadership achievement and leadership potential. Specifically, as in Tormala et al. (2012, Experiment 2), the Leadership Achievement Inventory manipulated a moderate or high performer by varying the score (83/100 or 96/100) and the accompanying paragraphs as follows:

"The LAI gauges leadership achievement, defined as an individual's observed (i.e., actual) leadership performance at the current stage in his or her career. An achievement score of 83 places this applicant in the top 17% of people who have been assessed [An achievement score of 96 places this applicant in the top 4% of people who have been assessed]."

The Assessment of Leadership Potential score was accompanied by the following paragraph,<sup>1</sup> which varied depending on the condition (high or moderate leadership potential):

"The ALP gauges leadership potential, defined as the employee's predicted leadership performance in the near future. A score of 96 indicates that this applicant predicted future leadership performance is estimated to be in the top 4% of people who have been assessed [A score of 83 indicates that this applicant predicted future leadership performance is estimated to be in the top 17% of people who have been assessed]."

Thus, in the leadership potential condition, the applicant had received a higher score on potential (top 4%) and a more moderate score (top 17%) on leadership achievement, whereas in the leadership performance condition, the applicant had received a moderate leadership potential score (top 17%) and a high performance score (top 4%). High and moderate scores were used rather than high and low scores, in order to focus attention on the dimension at which the candidate excelled rather than suggesting any weakness (see **Appendix**). The focus on leadership potential or leadership performance was reinforced through comments ostensibly taken from a panel review, for example:

"This candidate has great prospects. She has some exciting new ideas for the future of the team and the organization, which could offer the opportunity to increase sales and performance in the future." [Leadership Potential]

"The applicant is highly capable, and has consistently performed above his own objectives and that of the organizations. The performance in his current role has exceeded expectations." [Leadership Performance]

# Measures

#### Candidate Hiring

Candidate hiring was measured using two items on a 9-point rating scale (α = 0.78): "How interested would you be in hiring each applicant?," "To what extent do you think hiring each applicant would be a good decision or a bad one?" Lower values indicate less hiring intention.
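As an illustration of how a reliability coefficient such as the one reported for the two hiring items can be computed, the following minimal Python sketch implements Cronbach's alpha. The ratings are hypothetical values invented for the example, not data from this study, and the function name `cronbach_alpha` is an assumption introduced here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item score lists of equal length."""
    k = len(items)
    n = len(items[0])
    if k < 2 or any(len(item) != n for item in items):
        raise ValueError("need at least two items of equal length")
    # Composite score for each rating occasion = sum across the items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical 9-point ratings on the two hiring items (illustrative only).
interest = [7, 8, 5, 6, 9, 4]
good_decision = [6, 8, 5, 7, 9, 5]
alpha = cronbach_alpha([interest, good_decision])
print(f"alpha = {alpha:.2f}")
```

With only two items, alpha is driven entirely by how strongly the two ratings covary, so a value near 1 simply indicates that the items order the candidates in nearly the same way.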

#### Expected Success

Expected success was measured using one item asking participants "How successful do you think each applicant will be in their career?" (1 – not at all successful, 9 – very successful).

#### Résumé Evaluation

Résumé evaluation was measured by asking participants to compare all four applicants and decide "in your opinion, which applicant has the most impressive résumé?" They were required to rank candidates from first (most impressive) to fourth (least impressive).<sup>2</sup>

#### Future Performance

Future performance was measured with an order of preference based on performance, "which applicant do you think will perform better by the fifth year at ALPHATech?" Candidates were ranked from best future performance (first) to worst future performance (fourth).

# Results

We conducted a Leadership Characteristic (leadership potential, leadership performance) × Candidate Gender (female, male) × Participant Gender (female, male) mixed ANOVA to analyze the evaluation items of candidate hiring and expected success. We hypothesized that participants would be more willing to hire candidates with leadership potential and would expect those candidates to be more successful than candidates with leadership performance (Hypothesis 1). Furthermore, we expected these effects to be stronger for male candidates (Hypothesis 2 and Hypothesis 3). We did not hypothesize participant gender effects but included this factor as exploratory.

Friedman tests and Wilcoxon Signed Ranks tests were used to analyze whether there were differences in the choice-based rankings of each candidate's résumé and expected future performance. We expected participants to rank the male candidate with leadership potential higher than the male candidate with leadership performance on the evaluation of résumés and expected future performance. We expected the opposite pattern for female candidates (Hypothesis 4). Finally, we expected participants to rank the male candidate with leadership potential higher than the female candidate with leadership potential in both the evaluation of résumés and expected future performance (Hypothesis 5).
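To make the ranking analysis concrete, the sketch below computes mean ranks and the Friedman chi-square statistic for complete rankings without ties, which is the situation when each participant orders four candidates from first to fourth. The rankings are made-up values for four hypothetical participants, the function name `friedman_from_ranks` is an assumption, and this is not the software or data used in the study.

```python
def friedman_from_ranks(rankings):
    """Friedman test statistic for untied rankings.

    rankings: one list per participant holding the ranks 1..k given to k candidates.
    Returns (chi_square, degrees_of_freedom, mean_ranks).
    """
    n = len(rankings)
    k = len(rankings[0])
    for row in rankings:
        if sorted(row) != list(range(1, k + 1)):
            raise ValueError("each participant must use ranks 1..k exactly once")
    rank_sums = [sum(row[j] for row in rankings) for j in range(k)]
    mean_ranks = [s / n for s in rank_sums]
    # Classical formula: 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1)
    chi_square = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return chi_square, k - 1, mean_ranks


# Column order (hypothetical): male potential, male performance,
# female potential, female performance. Rank 1 = most impressive.
example = [
    [1, 3, 4, 2],
    [1, 4, 3, 2],
    [2, 3, 4, 1],
    [1, 3, 4, 2],
]
chi_square, df, mean_ranks = friedman_from_ranks(example)
print(chi_square, df, mean_ranks)
```

A lower mean rank indicates a more favorable evaluation. The statistic is referred to a chi-square distribution with k − 1 degrees of freedom, and a significant omnibus result is then followed up with pairwise Wilcoxon signed rank tests.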

#### Candidate Hiring

There was a significant main effect of candidate gender, F(1,96) = 5.15, p = 0.025, η <sup>2</sup> = 0.05, with female candidates rated as more likely to be a good hire than male candidates, see **Table 1**. The main effect of leadership characteristic was nonsignificant, F(1,96) = 1.40, p = 0.240, η <sup>2</sup> = 0.01, as was the main effect of participant gender, F(1,96) = 0.42, p = 0.838, η <sup>2</sup> < 0.001.

<sup>1</sup>This was identical to Tormala et al. (2012, Experiment 2) including the typographical error "the applicant predicted future" rather than "the applicant's predicted future."

<sup>2</sup>We also asked participants to make a choice of résumé based on the following item "at present, which applicant had a more objectively impressive résumé?" Results were the same for this measure, and given that the items are very similar and that ranking data cannot be aggregated into an average score, we opted to report the results for the first measure. Results for the second item are available from the corresponding author on request.

There was a significant Candidate Gender × Participant Gender interaction, F(1,96) = 9.77, p = 0.002, η <sup>2</sup> = 0.09. Simple main effects of candidate gender within levels of participant gender were analyzed. There was a significant difference for female participants' evaluation of male and female candidates, F(1,96) = 12.09, p = 0.001, η <sup>2</sup> = 0.11, who expressed a preference for female candidates over male candidates overall (**Table 1**). This was not hypothesized, but demonstrates ingroup bias for female participants. There was no significant difference among male participants, F(1,96) = 0.46, p = 0.499, η <sup>2</sup> = 0.01. There were no simple main effects of participant gender within level of candidate gender. Female and male participants did not make significantly different hiring evaluations of female candidates, F(1,96) = 3.59, p = 0.06, η <sup>2</sup> = 0.04, or of male candidates, F(1,96) = 1.88, p = 0.174, η <sup>2</sup> = 0.02. There was no significant Leadership Characteristic × Candidate Gender × Participant Gender interaction effect, F(1,96) = 1.69, p = 0.196, η <sup>2</sup> = 0.017.
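For readers who wish to relate the reported effect sizes to the F ratios, the following sketch recovers an effect size from F and its degrees of freedom. It assumes that the η<sup>2</sup> values are partial eta squared, which the text does not state explicitly. The function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and effect sizes taken from the candidate hiring results above.
reported = [(5.15, 0.05), (9.77, 0.09), (12.09, 0.11)]
for f_value, eta in reported:
    recomputed = partial_eta_squared(f_value, 1, 96)
    print(f"F(1,96) = {f_value}: recomputed {recomputed:.3f}, reported {eta}")
```

Under this assumption the recomputed values round to the reported ones for these three effects.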

#### Expected Success

There were no significant main effects of candidate gender, F(1,96) = 1.27, p = 0.263, η <sup>2</sup> = 0.01, participant gender F(1,96) = 2.34, p = 0.129, η <sup>2</sup> = 0.02, or leadership characteristic F(1,96) = 2.57, p = 0.112, η <sup>2</sup> = 0.03 (which does not support Hypothesis 1). The Candidate Gender × Participant Gender interaction was not significant, F(1,96) = 3.10, p = 0.082, η <sup>2</sup> = 0.03, nor was the Leadership Characteristic × Participant Gender interaction, F(1,96) < 0.001, p = 0.995, η <sup>2</sup> < 0.01. There was a significant Leadership Characteristic × Candidate Gender interaction, F(1,96) = 4.28, p = 0.041, η <sup>2</sup> = 0.04. Consistent with Hypothesis 2, there was a preference for leadership potential over leadership performance for male candidates only, F(1,96) = 5.12, p = 0.026, η <sup>2</sup> = 0.05, see **Table 1**, but not for female candidates, F(1,96) = 0.001, p = 0.981, η <sup>2</sup> < 0.001. There was no significant differentiation between male and female leadership potential candidates, F(1,96) = 0.52, p = 0.473, η <sup>2</sup> = 0.05, or between male and female leadership performance candidates, F(1,96) = 3.56, p = 0.06, η <sup>2</sup> = 0.04.

Our exploratory analysis for participant gender revealed a significant Leadership Characteristic × Candidate Gender × Participant Gender interaction, F(1,96) = 5.85, p = 0.017, η <sup>2</sup> = 0.06. We decomposed the three-way interaction by participant gender. Simple interactions showed that the Leadership Characteristic × Candidate Gender interaction was only significant among female participants, F(1,96) = 8.37, p = 0.005, η <sup>2</sup> = 0.08, not among male participants, F(1,96) = 0.08, p = 0.783, η <sup>2</sup> < 0.01.

The second order simple effect was significant among female participants who differentiated between candidates with leadership performance, F(1,96) = 7.94, p = 0.006, η <sup>2</sup> = 0.08. **Table 1** shows that female participants expected the female candidate with leadership performance to be more successful than the male candidate with leadership performance. Moreover, female participants expected the male candidate with leadership potential to be more successful than the male candidate with leadership performance, see **Table 1**, F(1,96) = 5.32, p = 0.023, η <sup>2</sup> = 0.05. Female participants did not differentiate significantly between female candidates based on leadership characteristic, F(1,96) = 1.15, p = 0.287, η <sup>2</sup> = 0.01, or between male and female candidates with leadership potential, F(1,96) = 0.54, p = 0.465, η <sup>2</sup> < 0.01.

#### Résumé Evaluation

A Friedman test showed that the ranking evaluations of each candidate résumé were different, χ<sup>2</sup>(3) = 88.51, p < 0.001, see **Table 2** for mean ranks. Wilcoxon signed rank tests provided support for our hypotheses. Specifically, male candidates with leadership potential were ranked higher than male candidates with leadership performance, Z = −6.36, p < 0.001 (Hypothesis 3). In contrast, female candidates with leadership performance were ranked higher than female candidates with leadership potential, Z = −4.70, p < 0.001 (Hypothesis 4). Furthermore, male candidates with leadership potential were ranked higher than female candidates with leadership potential, Z = −6.27, p < 0.001 (Hypothesis 5). Moreover, female candidates with leadership performance were ranked higher than male candidates with leadership performance, Z = −5.92, p < 0.001. In brief, in support of our hypotheses, male candidates with leadership potential were ranked as more impressive than male candidates with leadership performance. In contrast, female candidates with leadership performance were ranked as more impressive than female candidates with leadership potential.<sup>3</sup>

TABLE 1 | Means and standard errors by candidate gender, participant gender, and leadership characteristic for candidate hiring and expected success (Experiment 1).

TABLE 2 | Mean rank for each candidate for résumé evaluation and future performance (Experiment 1).

#### Future Performance

A Friedman test showed that the rankings reflecting expectations of each candidate's future performance were different, χ<sup>2</sup>(3) = 78.59, p < 0.001, see **Table 2** for mean ranks. Wilcoxon signed rank tests revealed that male candidates with leadership potential were ranked higher than those candidates with leadership performance, Z = −6.12, p < 0.001 (Hypothesis 3). In contrast, female candidates with leadership performance were ranked higher than those with leadership potential, Z = −4.65, p < 0.001 (Hypothesis 4). Furthermore, male candidates with leadership potential were ranked higher than female candidates with leadership potential, Z = −6.00, p < 0.001 (Hypothesis 5). Finally, female candidates with leadership performance were ranked higher than male candidates with leadership performance, Z = −5.93, p < 0.001. In brief, results supported our hypotheses: male candidates with leadership potential were ranked more highly than those with leadership performance, but this was not the case for female candidates. Indeed, female candidates with leadership performance were ranked higher than female candidates with leadership potential.

# Discussion

Experiment 1 provides the first experimental evidence that female and male candidates' leadership potential and leadership performance are evaluated differently. We did not find evidence for Hypothesis 1, an overall preference for potential. In line with an overlooked potential pattern, we found that participants expected male candidates with leadership potential to be more successful than male candidates with leadership performance (Hypothesis 3), although this was not the case for the candidate hiring measure. When participants ranked female candidates, they preferred leadership performance over leadership potential consistently across measures (support for Hypothesis 4). Interestingly, when participants were asked to rank candidates in evaluation of résumés and on future performance, female candidates' leadership performance was preferred over that of male candidates. This type of ranking decision closely matches actual hiring processes, where final choices rely on rule-based selection criteria (e.g., ranking based on résumé evaluation).

We did not hypothesize effects of participant gender, but exploratory analysis revealed some differences. Specifically, a three-way interaction on candidates' expected success showed that the two-way interaction was only significant among female participants. When judging candidates' expected success, female participants rated female candidates with leadership performance as likely to be more successful than male candidates with leadership performance. Female participants also expected male candidates with leadership potential to be more successful than male candidates with leadership performance.

In this study, female candidates were rated as more hirable than male candidates. This unexpected finding is in line with a recent meta-analysis which showed that women are rated more effective than men in senior levels (Paustian-Underdahl et al., 2014). The stimulus materials presented to participants in Experiment 1 did not specify the level of leadership being recruited for. The information implied that the role was a relatively junior leadership position. This scenario had reasonable face validity because many fast-track programs are specifically designed to develop the potential of emerging talent (Singh et al., 2009; Thomas, 2009; Dries and Pepermans, 2012; Guan et al., 2014). Moreover, the principal motivation behind identifying leadership potential is to generate a pipeline of future leaders, which has major benefits (e.g., Williams-Lee, 2008; Poehlman and Newman, 2014). Nonetheless, the use of leadership potential as a selection criterion may be more common in the case of explicitly senior positions because many of the assessment tools used for selecting senior executives are related closely to those used to gauge high potential (Grabner and Moers, 2013). In Experiment 2, as well as retesting the overlooked potential effect, we therefore modified materials to highlight that the candidates were being considered for senior leadership positions. We also bolstered the measurement of the evaluation of expected success by using a more reliable multi-item measure. We also recruited a larger sample of participants. Finally, to provide a more direct test of Hypotheses 3−5, we asked participants to explicitly rank whom they would hire for the job.

# EXPERIMENT 2

# Materials and Methods

#### Participants and Design

Participants (N = 199; 126 females, 73 males, Mage = 35.02, 78.4% in full or part-time employment) were recruited via an international online database, Amazon MTurk. The quasi-experiment was a Leadership Characteristic × Candidate Gender × Participant Gender mixed design, with leadership characteristic and candidate gender as within-participant factors. All participants were exposed to a total of four résumés manipulating leadership characteristic (leadership potential and leadership performance) and candidate gender (male and female). To ensure consistency in other relevant résumé information, participants randomly received counterbalanced combinations of additional background information for each candidate.

<sup>3</sup>Given the ranking nature of the data we were not able to test for interactions with participant gender. However, we conducted Friedman tests, and Wilcoxon signed rank tests, separately for female and male participants for the measures of evaluation of résumé and future performance. The pattern of results was identical for each participant gender group. Please refer to **Supplemental Materials** for the results of these exploratory analyses.

#### Procedure and Materials


Individuals were invited, via an online platform, Qualtrics, to take part in a study on organizational decision-making. The experiment consisted of two phases. Participants were presented with an imitation Business News article describing the announcement of the retirement, and subsequent search for replacement, of the Director of Financial Affairs of a fictitious company, Tell Inc. The article provided background information about the organization, and a brief description that described Tell Inc.'s role as a growing and successful telecommunications company:

"In an open letter to Tell Inc. employees the CEO, Robin Metcalfe, announced the resignation of the company's Vice President of Financial Affairs, Alex Hepburn, adding 'Alex has been a great asset to this company having immeasurably contributed to our progress over recent years.'

Tell Inc. is a highly successful United States based telecommunications company, consistently performing well on the global markets, with particular growth and expansion in Eastern Europe and China over the last year. Tell Inc. is well known for its dynamic and innovative approach to communication technology, having developed some of the most well-known products on the market today.

This is a very important role for Tell Inc. to fill and there will be significant interest in the technology community about who will be appointed and which direction they will look to take the company in.

CEO Robin Metcalfe, said that they are looking to find 'the best possible candidate to help lead and shape the bright future of Tell Inc.'

All eyes are on the CEO and Board of Directors to see who they choose."

Next, participants were presented with each résumé (male leadership potential, female leadership potential, male leadership performance, female leadership performance). The background information and leadership scores (future leadership potential and previous leadership achievement) were the same as shown in Experiment 1. In Experiment 2, the résumés were made relevant to the hiring of a more senior candidate by changing candidates' previous work experience to include at least one well known tech or communications company and by providing reviews from other people (previous employer and Tell Inc. CEO) and self-descriptions by the candidate. These comments reinforced either the candidates' future leadership potential or previous leadership performance. The following examples show quotes from a CEO about a female candidate with leadership performance and about a male candidate with leadership potential, respectively:

"Christine is clearly a candidate who has performed very highly throughout her career. She has shown from her past achievements and accomplishments that she is highly capable of performing to the highest standard. Christine is certainly at the top of her group in her professional achievements."

"Rupert is clearly a candidate who has shown excellent potential throughout his career. You can see from his budding talent and promise that he is highly capable of being one of the best in his field. Rupert is absolutely at the top of his vocation in terms of his professional potential."

Participants then completed the evaluative rating measures (candidate hiring, expected success), immediately after reviewing each candidate. Next, all four résumés were presented simultaneously, so that participants could refresh their memory, and to minimize availability bias toward the most recently reviewed résumé. Participants then completed the dependent measures.

# Measures

#### Candidate Hiring

Candidate hiring was measured using two items (α = 0.85): "I would hire this candidate" and "this candidate would be a good appointment." Items were measured on a rating scale and ranged from 1 (strongly disagree) to 9 (strongly agree).

#### Expected Success

Expected success was measured using six items to examine career and job success on a rating scale, from 1 (strongly disagree) to 9 (strongly agree) (α = 0.94; adapted from Ironson et al., 1989; Kossek et al., 2001). Items included: "How successful do you think each applicant will be in their career?"; "How successful do you think each applicant will be in their career, compared to other people?"; and "How successful do you think each applicant will be in their career, compared to the applicants' significant others?"

#### Résumé Evaluation

Résumé evaluation was indicated by a choice of candidates; participants were asked "in your opinion, which applicant has the most impressive résumé?" (first, second, third, fourth), with first the most impressive.<sup>4</sup>

#### Future Performance

Future performance was assessed with the rank of candidates in response to the question "which candidate do you think will perform better by the fifth year?" (first, second, third, fourth), with first most likely to perform best.

<sup>4</sup> Similar to Experiment 1, we also asked participants to compare résumés based on the following item "at present, which applicant had a more objectively impressive résumé?" Results were the same for this measure, and given that the items are very similar and ranking data cannot be aggregated into an average score, we opted to report the results for the first measure. Results for the second item are available from the corresponding author on request.


#### Hire Choice


Hire choice was measured by participants' ranked choice in response to "which applicant would you hire?," from first to fourth, with first being the choice of hire.

# Results

A Leadership Characteristic (leadership potential and leadership performance) × Candidate Gender (female and male) × Participant Gender (female and male) mixed ANOVA was used to analyze the measures of candidate hiring and expected success. As in Experiment 1, we hypothesized that participants would be more likely to hire candidates with leadership potential and would expect those candidates to be more successful than candidates with leadership performance (Hypothesis 1). Furthermore, we anticipated that these effects should be stronger for male candidates (Hypothesis 2). We did not hypothesize participant gender effects but included this factor as exploratory.

Friedman tests and Wilcoxon Signed Ranks tests were used to analyze whether there were differences in the choice-based rankings reflecting evaluations of each candidate's résumé, future performance, and hire choice. Specifically, we expected participants to rank the male candidate with leadership potential higher than the male candidate with leadership performance on the evaluation of their résumés, future performance, and hire choice (Hypothesis 3). We predicted the opposite pattern for female candidates (Hypothesis 4). Finally, we expected participants to rank the male candidate with leadership potential higher than the female candidate with leadership potential in all ranking measures (Hypothesis 5).
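The pairwise comparisons can be illustrated with a minimal sketch of the Wilcoxon signed rank test using the normal approximation with a tie correction and no continuity correction. The paired ranks below are invented for four hypothetical participants and are not the study's data. The function name `wilcoxon_signed_rank_z` is an assumption, and statistical packages may differ in sign convention and in the corrections they apply.

```python
from math import sqrt


def wilcoxon_signed_rank_z(x, y):
    """Normal-approximation Z for paired samples x and y (zero differences dropped).

    Returns (z, w_plus, n), where w_plus is the sum of ranks of positive differences.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    return (w_plus - mean) / sqrt(var), w_plus, n


# Hypothetical ranks (1 = first choice) given to two candidates by four participants.
male_potential = [1, 1, 2, 1]
male_performance = [3, 4, 3, 3]
z, w_plus, n = wilcoxon_signed_rank_z(male_potential, male_performance)
print(f"Z = {z:.2f}, W+ = {w_plus}, n = {n}")
```

Because a lower rank number means a more favorable position, a negative Z in this sketch indicates that the first candidate in the pair was ranked more favorably than the second.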

#### Candidate Hiring

There was a significant effect of leadership characteristic, F(1,197) = 15.05, p < 0.001, η <sup>2</sup> = 0.07. Participants rated candidates who exhibited leadership performance on their résumé more favorably than candidates who displayed leadership potential (see **Table 3**). This does not support Hypothesis 1. There was a near significant effect of participant gender, F(1,197) = 3.80, p = 0.053, η <sup>2</sup> = 0.02. **Table 3** shows that female participants rated candidates more highly than male participants. Contrary to Hypothesis 2, the Leadership Characteristic × Candidate Gender interaction was not significant, F(1,197) = 3.14, p = 0.078, η <sup>2</sup> = 0.02. All remaining effects were not significant (see **Table 4**).

#### Expected Success

There was a significant effect of leadership characteristic, F(1,197) = 17.72, p < 0.001, η <sup>2</sup> = 0.08. Candidates with leadership performance were rated as more likely to be successful than those with leadership potential (**Table 3**); this does not support Hypothesis 1. All other main effects and two-way interactions were not significant (see **Table 4**).

There was a near significant Leadership Characteristic × Candidate Gender × Participant Gender interaction, F(1,197) = 3.79, p = 0.053, η <sup>2</sup> = 0.02. We decomposed the three-way interaction by participant gender. Simple interaction effects showed that the Leadership Characteristic × Candidate Gender interaction was only significant for female participants, F(1,197) = 6.08, p = 0.015, η <sup>2</sup> = 0.03, and not for male participants, F(1,197) = 0.32, p = 0.571, η <sup>2</sup> = 0.002. Second order simple effects show that female participants expected the female candidate with leadership performance to be more successful than the female candidate with leadership potential, F(1,197) = 12.15, p = 0.001, η <sup>2</sup> = 0.06. In addition, **Table 3** shows that the female participants expected the male candidate with leadership potential to be more successful than the female candidate with leadership potential, F(1,197) = 9.12, p = 0.003, η <sup>2</sup> = 0.04. Female participants did not differentiate significantly between the male candidates based on leadership characteristic, F(1,197) = 0.04, p = 0.842, η <sup>2</sup> < 0.001, or between male and female candidates with leadership performance, F(1,197) = 0.15, p = 0.703, η <sup>2</sup> = 0.001.

TABLE 3 | Means and standard errors by candidate gender, participant gender, and leadership characteristic for candidate hiring and expected success (Experiment 2).

TABLE 5 | Mean rank for each candidate for résumé evaluation, future performance, and hire choice (Experiment 2).

#### Résumé Evaluation

A Friedman test showed that the rankings of the résumés differed, χ<sup>2</sup>(3) = 185.25, p < 0.001. Wilcoxon signed rank tests supported our hypotheses, and **Table 5** shows the mean rank per candidate. Male candidates with leadership potential were ranked more highly than male candidates with leadership performance, Z = −9.79, p < 0.001 (Hypothesis 3). In contrast, female candidates with leadership performance were ranked more highly than female candidates with leadership potential, Z = −6.19, p < 0.001 (Hypothesis 4). Furthermore, male candidates with leadership potential were ranked more highly than female candidates with leadership potential, Z = −9.76, p < 0.001 (Hypothesis 5). Finally, female candidates with leadership performance were ranked more highly than male candidates with leadership performance, Z = −7.61, p < 0.001.<sup>5</sup>

#### Future Performance

A Friedman test showed that the four candidates' future performances were ranked differently, χ<sup>2</sup>(3) = 133.85, p < 0.001. Wilcoxon signed rank tests supported our hypotheses, and **Table 5** shows the mean rank per candidate. The future performance of male candidates with leadership potential was ranked more highly than that of male candidates with leadership performance, Z = −8.71, p < 0.001 (Hypothesis 3). In contrast, the future performance of female candidates with leadership performance was ranked more highly than that of female candidates with leadership potential, Z = −3.80, p < 0.001 (Hypothesis 4). Furthermore, male candidates with leadership potential were ranked more highly than female candidates with leadership potential, Z = −7.65, p < 0.001 (Hypothesis 5). Finally, female candidates with leadership performance were ranked more highly than male candidates with leadership performance, Z = −8.05, p < 0.001.
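The pairwise comparisons rest on Wilcoxon signed rank tests. A minimal sketch of the large-sample Z statistic follows. It assumes the common normal approximation with zero differences dropped, average ranks for tied absolute differences, and a tie-corrected variance. The source does not state which variant or software was used, and the function name and inputs are hypothetical.

```python
import math


def wilcoxon_signed_rank_z(x, y):
    """Normal-approximation Z for the Wilcoxon signed rank test on paired data.

    Zero differences are dropped; tied absolute differences share their
    average rank, and the variance is corrected for those ties.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    variance = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    return (w_plus - mean) / math.sqrt(variance)


# Hypothetical ranks given by five participants to two candidates.
candidate_a = [1, 1, 1, 1, 1]
candidate_b = [2, 3, 2, 4, 3]
print(wilcoxon_signed_rank_z(candidate_a, candidate_b))
```

In this sketch the sign of Z depends only on the order of the two arguments, so a negative value by itself does not indicate which candidate was preferred; the mean ranks carry that information.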

#### Hire Choice

A Friedman test showed that hiring preference differed among the four candidates, χ<sup>2</sup>(3) = 164.84, p < 0.001. Wilcoxon signed rank tests supported our hypotheses, and **Table 5** shows the mean rank per candidate. Specifically, male candidates with leadership potential were more likely to be the hire than those with leadership performance, Z = −9.56, p < 0.001 (Hypothesis 3). In contrast, female candidates with leadership performance were more likely to be the hire than those with leadership potential, Z = −4.36, p < 0.001 (Hypothesis 4). Furthermore, male candidates with leadership potential were more likely to be the hire than female candidates with leadership potential, Z = −8.44, p < 0.001 (Hypothesis 5). Finally, female candidates with leadership performance were more likely to be the hire than male candidates with leadership performance, Z = −8.42, p < 0.001.

# Discussion

Experiment 2 provides evidence that candidates' gender moderates evaluations of their leadership characteristics. Consistent findings across the ranking measures provide clear evidence regarding the overlooked potential effect. We found that when participants ranked male candidates there was a preference for potential (Hypothesis 3), whereas leadership potential was overlooked when they ranked female candidates (Hypotheses 4 and 5). Indeed, consistent with Experiment 1, when participants judged female candidates, leadership performance was preferred over leadership potential. Moreover, the finding that leadership potential led to an upgrading of (otherwise equivalent) male candidates relative to female candidates, and that leadership performance led to a downgrading of male relative to female candidates seems highly consistent with the interpretation that gender role expectations moderated judgments of the candidates.

In our exploratory analysis we also found some evidence that participant gender affected these judgments. Specifically, an interaction between candidate gender and leadership characteristic on expectations about candidates' success was significant among female participants but not male participants. Female participants rated the male candidate with leadership potential higher than the female candidate with leadership potential. Additionally, female participants expected the female candidate with leadership performance to be more successful than the female candidate with leadership potential.

<sup>5</sup> Similarly to Experiment 1, we conducted Friedman tests and Wilcoxon signed rank tests separately for female and male participants for all ranking measures. The pattern of results was identical for each participant gender group, with two exceptions. Specifically, for the measure of future performance and hire choice, male participants ranked the female candidate with leadership potential similarly to the female candidate with leadership performance. However, as hypothesized, both female and male participants ranked the male candidate with leadership potential higher than the male candidate with leadership performance. Please refer to **Supplemental Materials** for the results of these exploratory analyses.

# GENERAL DISCUSSION


Our findings provide several new empirical and theoretical contributions. Overall, these studies provide the first experimental evidence that a candidate's gender can affect evaluators' assessment of the value of their leadership potential and leadership performance. In both experiments, consistent with our Hypotheses 3, 4, and 5, leadership potential was preferred when participants ranked male candidates, whereas potential was overlooked when participants ranked female candidates. Male candidates who demonstrated higher potential were perceived to have a more impressive résumé and were expected to perform better in the future than male candidates who demonstrated higher performance. In contrast, female candidates who demonstrated higher performance were perceived to have a more impressive résumé and were expected to perform better than female candidates who demonstrated higher potential. If these findings were extrapolated to real hiring situations, they would mean that whilst women's past performance would have to be at least as good as men's, women would be held to higher standards in selection processes because their leadership potential would be less likely to be recognized than men's. The findings emerged most clearly when participants ranked rather than rated candidates. The ranking data are likely to have higher ecological validity given that most recruitment procedures conclude with a ranking process in order to decide which candidate to hire.

Why might men with future leadership potential have a distinctive advantage? One explanation can be drawn from role incongruity theory, which highlights that people have a powerful implicit association between leadership and agentic traits (Eagly and Karau, 2002; Heilman and Eagly, 2008). Female candidates who foreground their leadership potential may challenge people's expectations about how women in leadership positions should behave, thereby highlighting role incongruence. Therefore, they may be subjected to greater discrimination than women who primarily emphasize their past leadership performance. The current data do not allow us to test this possibility directly, and further work will be needed to explore it.

We explored the impact of gender on preference for leadership potential and/or leadership performance. On candidate choice rankings an unexpected but consistent finding was that participants prioritized female candidates' leadership performance over that of male candidates. It could be that women are implicitly required to show greater evidence of competence to overcome stereotypically negative performance expectations, particularly in male gender-typed job domains (Lyness and Heilman, 2006). Therefore, women are more likely to have to demonstrate a successful background in order to show congruence between their skills and the leadership position, and to overcome role-congruity bias (Koch et al., 2015).

Despite generally consistent findings, a few inconsistencies merit discussion. These may reflect that the two studies assessed judgments relating to different levels of seniority (higher in Experiment 2). In Experiment 1, but not in Experiment 2, participants perceived female candidates to be a better hire overall. This unexpected result might have been driven by participants' reactions to encountering counterstereotypical high-performing women, an advantage that may be worth exploring in case it is limited to judgments about junior leadership roles.

There was also some evidence of gender ingroup bias in both experiments, but it was not ubiquitous. Although ingroup bias is a robust social psychological phenomenon (Hewstone et al., 2002), particularly amongst members of more dominant and socially valued groups (Rudman et al., 2002), gender is sometimes an exception to this pattern. This exception arises because members of the more dominant group (i.e., men) are less likely to show direct ingroup bias (Rudman and Goodwin, 2004), perhaps owing to more subtle forms of prejudice (Glick and Fiske, 2001). In Experiment 1, female participants showed ingroup bias in their evaluations of the candidates. In Experiment 2, only female participants demonstrated differences in evaluations of expected success for female candidates based on their potential or performance. This finding suggests further nuanced differences between leadership potential and leadership performance which are likely intrinsically linked to perceptions of gender and leadership. These differences warrant further attention in follow-up research, as they suggest that the demonstration of leadership potential (vs. performance) could also be based in gender role expectations, like ambition.

Going beyond previous research, these studies demonstrated that when faced with a choice, judges consistently ranked male candidates with leadership potential over their female counterparts. Our ranking findings are of particular significance as they mirror the majority of selection and recruitment decisions, whereby only one candidate can be offered the job. Moreover, processes that identify and fast-track leadership potential are already in place in many organizations (e.g., McDonnell et al., 2010). Understanding how gender might influence the perception, promotion, and development of leadership potential over time and career is vital in promoting equality. This research illustrates, for the first time, a subtle but powerful way in which women are discriminated against in the workplace as a direct result of their gender.

# Limitations and Future Directions

We have provided evidence that leadership potential and leadership performance can yield different hiring and evaluation outcomes for men and women. Various limitations need to be considered before making strong conclusions. First, the extant literature lacks a well-developed empirical foundation for the theoretical distinction between leadership potential and leadership performance. We therefore relied on a general definition of leadership potential, which might not fully encompass the entire array of traits linked to leadership potential or their relationship with leadership performance. We chose to focus distinctly on either past performance or future potential to avoid confusion. We gave the manipulations context and reinforcement in Experiment 2 by providing a richer view of the candidate (e.g., using assessments from previous and prospective employers). Overall, even if a more comprehensive basis for the distinction could improve the design of descriptions in vignettes, the results do show that participants responded differently to the leadership potential and leadership performance depictions of candidates.

We used a crowdsourcing platform, which had the advantage of using a real-world sample of employed people across a range of occupations. Nonetheless, it is possible that this approach also introduced more unexplained variability in the sample (e.g., variability in organizational culture) than might be attained with a more homogeneous sample (e.g., based in a single organization). Future research could investigate the generality of the overlooked potential effect in single organizational contexts, or compare different organizational contexts that are more typically male- or female-dominated. Moreover, it is conceivable that differences in participants' own occupancy of leadership positions may have influenced their responses. Future research should investigate potential moderating effects of participant leadership experience. A further way to pursue future studies would be to test the effect using samples of hiring managers and members of promotion panels.

Additional directions for research include investigating boundary conditions for the effect such as different leadership goals (e.g., more task-oriented or socio-emotional), or culture. For example, high potential women are regarded as having higher diversity value (Leslie et al., 2017). It would be interesting to test the overlooked potential effect in contexts where diversity goals are salient or not. Using diversity as a boundary condition could also open potential avenues to future interventions.

The degree of role incongruity could also be pursued as a moderating factor. A further subtlety may be that the linguistic framing of the role positions may affect whether or not potential is overlooked. For example, Horvath and Sczesny (2015) found that female and male candidates for a high status leadership position were perceived as fitting equally well to the job when the job advert used feminine-masculine word pairs (instead of solely masculine forms). Linguistic framing might also be relevant for the overlooked potential effect.

Real hiring decisions are based on choices, which our ranking measures simulated. However, the decision to use ranking measures imposed limits to our capacity to investigate moderation effects. Moreover, hiring decisions are often based not only on résumé evaluations but also subsequent rounds of interviews. The present research only speaks to the first stage of this selection process. It may be that these interviews either ameliorate or exacerbate the overlooked potential effect, which also warrants investigation in future research.

# CONCLUSION AND IMPLICATIONS

The present research has practical implications for organizations, and possibly even beyond. For example, if preference for leadership potential in men is a generic phenomenon, it may well confer unfair advantages well beyond commercial and business contexts (e.g., in education, politics, journalism, the legal system). For any organization, ensuring that hiring processes are fair and offer equal opportunities is fundamental for attaining gender equity in leadership positions. Given that employers typically regard leadership potential as a desirable trait (Church, 2014), raising awareness that potential is likely to be undervalued when people judge women may offer a method to improve diversity and equality in leadership. Although previous evidence shows that there can be a preference for leadership potential (Tormala et al., 2012), our research highlights that this may be an advantage from which men alone benefit. Our research suggests that women's prospects seem to rest more exclusively on their demonstration of leadership performance over potential. Potential implies that an individual has the quality to perform in wider or different roles in the organization at a later stage (Silzer and Church, 2009). If higher potential among women is not recognized, women may find they are trapped in particular silos (such as administration or human resources), and are at a disadvantage when it comes to more overarching roles and positions. By not fully recognizing leadership potential in female candidates, organizations are inhibiting the prospects of half of their talent. This inhibition ironically means organizations may be less likely to achieve their own full potential.

# ETHICS STATEMENT

All experiments were carried out in accordance with the recommendations of the School of Psychology Ethics Committee at the University of Kent, United Kingdom. The protocol was approved by the School of Psychology Ethics process. All participants gave written informed consent in accordance with the Declaration of Helsinki. The research was conducted in accordance with guidelines from the University of Kent Research Ethics (Human Participants) Committee, the Economic and Social Research Council (ESRC) Research Ethics Framework, and the ethical guidelines from the British Psychological Society (BPS).

# AUTHOR CONTRIBUTIONS

AP and GR conceived of the presented idea and took the joint lead in writing the manuscript. AL, DA, and FT contributed to theory development. AL and DA bolstered analytical methods. GR and DA encouraged AP to pursue the application of leadership potential and gender to leadership selection and supervised the findings of this research. DA worked on revisions for the final version. All authors provided critical feedback and helped to shape the overall research.

# FUNDING

This work was supported by a grant awarded to GR by the United Kingdom Higher Education Academy (Grant No. HEA-DP057), and by a grant awarded to FT by the United Kingdom Economic and Social Research Council (Grant No. ES/J500148/1).

# REFERENCES


# ACKNOWLEDGMENTS

We thank the colleagues and students in "Grouplab," the Centre for the Study of Group Processes laboratory group, for discussion of ideas, feedback, and comments. Special thanks to Andre Marques and Lazaros Gonidis for their support with statistical analyses.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00755/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Player, Randsley de Moura, Leite, Abrams and Tresh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX: MANIPULATIONS EXPERIMENT 1


• The Leadership Achievement Inventory (LAI) gauges leadership achievement, defined as an individual's observed (i.e., actual) leadership performance at the current stage in his or her career. An achievement score of 96 places this applicant in the top 4% of people who have been assessed.

• The Assessment of Leadership Potential (ALP) gauges leadership potential, defined as the employee's predicted leadership performance in the near future. A score of 83 indicates that this applicant's predicted future leadership performance is estimated to be in the top 17% of people who have been assessed.
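The two scores in these manipulations map onto the stated "top X%" figures by simple subtraction. The sketch below assumes that both inventories report percentile ranks on a 0 to 100 scale, which the appendix implies but does not state; the function name is hypothetical.

```python
def top_percent(percentile_score):
    """Share of assessed people at or above a percentile score (assumed 0-100 scale)."""
    if not 0 <= percentile_score <= 100:
        raise ValueError("percentile score must lie between 0 and 100")
    return 100 - percentile_score


print(top_percent(96))  # LAI achievement score from the manipulation
print(top_percent(83))  # ALP potential score from the manipulation
```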

# Women Who Emerge as Leaders in Temporarily Assigned Work Groups: Attractive and Socially Competent but Not Babyfaced or Naïve?

#### Freya M. Gruber, Carina Veidt and Tuulia M. Ortner\*

Department of Psychology, University of Salzburg, Salzburg, Austria

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Leslie Zebrowitz, Brandeis University, United States Minna Paunova, Copenhagen Business School, Denmark

\*Correspondence: Tuulia M. Ortner tuulia.ortner@sbg.ac.at

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 18 May 2018 Accepted: 28 November 2018 Published: 19 December 2018

#### Citation:

Gruber FM, Veidt C and Ortner TM (2018) Women Who Emerge as Leaders in Temporarily Assigned Work Groups: Attractive and Socially Competent but Not Babyfaced or Naïve? Front. Psychol. 9:2553. doi: 10.3389/fpsyg.2018.02553

The underrepresentation of women in top positions has been in the spotlight of research for decades. Prejudice toward female leaders, which decreases women's chances of emerging as leaders, has been discussed as a potential reason. Aiming to investigate the underlying mechanisms of this prejudice, we focused on the question of how facial characteristics might influence women's leadership emergence. Because other research has related ascribed social competence and ascribed naïveté to attractiveness and babyfacedness, respectively, we hypothesized that ascribed social competence would mediate the impact of ascribed attractiveness on leadership emergence and that ascribed naïveté would mediate the impact of ascribed babyfacedness on leadership emergence. In a pilot study, we analyzed data from 101 participants of a women's leadership contest held in 2015 in Germany. We then confirmed these results in a methodologically improved main study on other women who participated in the contest in one of two other years: 2016 and 2017 (N = 195). Women applied to participate in the contest by recording their answers to several questions in a video interview. In the contest, they were assigned to teams of about ten women each and worked on several assessment-center-like tasks. After each task, each member of each team nominated the three women they believed showed the best leadership potential in their group. We operationalized women's leadership emergence as the number of nominations received. We measured participants' facial attractiveness, babyfacedness, social competence, and naïveté by having raters follow a specifically developed rating manual to rate the answers the women gave in the video interviews. In both studies, the results indicated that women with higher ascribed facial attractiveness had higher ascribed social competence, which significantly predicted leadership emergence in the contest. Likewise, women with higher ascribed babyfacedness had higher ascribed naïveté, which significantly, albeit only slightly, negatively predicted leadership emergence. We discuss the implications of the results for personnel selection.

Keywords: leadership emergence, babyfacedness, attractiveness, social competence, naïveté

# INTRODUCTION


In modern Western societies, sex-related barriers to occupational success still exist and are reflected in economic data. Women hold only 28% of the CEO positions in the United States (U.S. Bureau of Labor Statistics, 2017) and 16% of the 1,500 board seats of the S&P (Ernst and Young LLP, 2015). Women occupy fewer top executive positions than men in the European Union (Bourgeais, 2017) and fewer full professorships worldwide (Johnson, 2017). Because a lack of qualified women can no longer explain these circumstances (European Commission, 2012; Johnson, 2017), it is important to ask why women rarely hold these top leadership positions. One possible explanation for this inequity can be found in differences in leadership emergence. Taggar et al. (1999) defined leadership emergence as the process by which one team member becomes the leader of an initially leaderless group without holding the position by title or by being previously assigned to it. According to Paunova (2015), one or more leaders may emerge within a group when they are perceived and acknowledged as a leader, for instance, due to the positive effect of their leader behavior on the group during a task (Bass and Bass, 2008). Systematic gender differences in leadership emergence may create obstacles for women because being perceived as not able to emerge as a leader may prevent them from being hired or promoted (e.g., Eagly and Karau, 1991, 2002; Finkelstein et al., 2018). Therefore, it is important to further investigate and understand the underlying mechanisms that can help women emerge as leaders or prevent them from doing so.

Scientists have formulated several theories to explain differences in perceptions of the leadership potential of women and men and differences in their leadership emergence: As a basic principle, social role theory (Eagly, 1987) suggests that differences in the behavior of women and men and also differences in how women and men are perceived in a society stem primarily from the different distributions of men's and women's social roles. In this theory, different roles occupied by men and women have an impact on sex-related stereotypes and ascribed aspects of personality because people tend to make inferences from these roles to associated characteristics. In fact, various studies of different fields have demonstrated relations between social roles and the ascribed characteristics of people (e.g., Diekman and Schneider, 2010; Ortner et al., 2011; Koenig and Eagly, 2014). Following this reasoning, Eagly and Karau (2002) formulated role congruity theory to further explain women's greater difficulties in reaching leadership positions. According to this theory, individuals receive positive evaluations by others when their characteristics are estimated as confirming the group's typical social roles (Eagly and Diekman, 2005). In fact, "many people perceive [incongruity] between the characteristics of women and the requirements of leader roles" (Eagly and Karau, 2002, p. 574). This perceived incompatibility between stereotypes of a social group and the requirements for fulfilling a specific role is supposed to result in prejudice, backlash effects (Rudman, 1998), and low evaluations of women as actual or potential leaders (see Eagly and Karau, 2002).

Another related, albeit more general, theory that has contributed to explaining inequities in society on the basis of social group processes is expectation states theory (Wagner and Berger, 1997; Weyer, 2007). This theory addresses small, task-oriented work groups and highlights the relevance of status characteristics (e.g., gender, age, ethnicity) but also physical attributes for the ascription of task-relevant characteristics. Carli and Eagly (2001) employed this approach to explain differences in the leadership emergence of men and women, proposing that there are biases in evaluations of female leaders. Ridgeway (2001) described this approach as applying expectation states theory in a new context. Accordingly, gender roles are rooted in the social hierarchy and in leadership because the core of gender stereotypes consists of cross-cultural schemas about women's status positions, also called status beliefs. These status beliefs have been proposed to be a basic reason for lower leadership emergence in women than in men because men, rather than women, are associated with general competence and status worthiness. However, a recent analysis integrating 16 different opinion polls revealed an increase in women's ascribed competence over the past years and concluded that women and men are now ascribed equal competence in the United States (see Eagly et al., 2018).

Nevertheless, several studies have provided empirical support for the validity of these theoretical frameworks in explaining differences in the performances of women and men with reference to leadership: For example, Garcia-Retamero and López-Zafra (2006) conducted an experiment to test how prejudice against female leaders in different work environments stems from people's expectations. To do so, they asked non-leader participants to rate vignettes of hypothetical candidates for a leadership position. The data indicated prejudice against women except when the woman worked "in an industry congruent with her gender role" (p. 58). Further results indicated effects of social attribution by illustrating that incongruence between masculine task demands and the female gender role reduced leadership emergence even in dominant women (see Ritter and Yoder, 2004).

A more complex and new model combining aspects of personality traits with behavioral mechanisms was recently proposed along with a meta-analysis on leadership emergence: Badura et al. (2018) addressed the question of why men more frequently emerge as leaders than women. For this purpose, the authors proposed gender, individual characteristics, behavioral aspects, as well as situational factors as determinants of the relation between gender and leadership emergence. In order to test their gender-agency/communion-participation (GAP) model, they coded 1,632 effect sizes. Results indicated a beneficial effect of people's gender on leadership emergence through agency (e.g., assertiveness and dominance), but a detrimental effect through communion (e.g., kindness and nurturance). Furthermore, the relations between these traits and leadership emergence were mediated through participative behavior in group discussions. Moreover, they hypothesized that gender differences in leadership emergence depended on situational factors such as the study's setting (business settings, lab settings, classroom settings), gender egalitarianism in the nation where the data were collected, the length of interaction, and the degree of social complexity of the task. Their results indicated that the gender gap in leadership emergence was significantly lower in business settings compared with lab settings and significantly larger when the interaction time in the task was shorter.

Besides the influence of agency, communion, and behavioral aspects, physical appearance may also contribute to leadership emergence. Much research has revealed that a person's physical appearance is a relevant characteristic in various areas of life, for example, partner selection (Langlois et al., 2000; for a review, see Rhodes, 2006), political elections (Olivola and Todorov, 2010), judicial decisions (Efran, 1974; Zebrowitz and McDonald, 1991), decisions regarding personnel selection (Zebrowitz et al., 1991; Shannon and Stark, 2003), and also ascribed leadership competences (Sczesny and Kühnen, 2004) or leadership selection processes (Stoker et al., 2016). Attractiveness represents one of the most salient aspects of physical appearance and, therefore, affects the ways in which individuals perceive and evaluate others (Little et al., 2011). Several theoretical explanations have also been proposed to account for the relevance of physical attractiveness in the field of leadership: Implicit personality theory by Bruner and Tagiuri (1954) postulated that people unconsciously build hypothetical constructs of trait elements and inferential relations between these attributes when assessing other people (see Schneider, 1973). On the basis of their meta-analysis of 76 studies, Eagly et al. (1991) identified the association between "beauty" and "goodness" as a physical attractiveness stereotype. They reported a significantly higher attribution of favorable characteristics to attractive individuals in comparison with unattractive individuals. Highly relevant for the domain of leadership emergence, individuals rated as attractive received higher evaluations with reference to social competence, adjustment, potency, and intellectual competence. 
This finding is in line with a study that reported findings that individuals perceived targets whose faces they had rated as attractive with respect to facial symmetry and proportions as more successful, satisfied, and likable than people whose faces they had rated as unattractive (Braun et al., 2001). In fact, studies have revealed positive effects of ascribed attractiveness on life and business outcomes such as career success and management level (Dietl, 2013), income (Judge et al., 2009), and also ascribed leadership status (Cherulnik et al., 1990). According to Cherulnik (1995), the positive impact of facial attractiveness on leader emergence even accumulates over a lifetime because others' attributions create a favorable distinctive social environment that provides special experiences and opportunities. Individuals judged as attractive were also found to be more confident in their social skills than individuals judged as unattractive (Cherulnik, 1995). Moreover, Sobieraj and Krämer (2014) demonstrated a strong relation between ascribed social competence and ascribed physical attractiveness in a study with experimentally varied virtual avatars. On the basis of these findings, we hypothesized that ascribed facial attractiveness would positively predict leadership emergence (Hypothesis 1). Moreover, we hypothesized that social competence would positively mediate the impact of facial attractiveness on leadership emergence (Hypothesis 2) because social competence represents a determinant of leadership skills (Riggio et al., 2003; Groves, 2005; Riggio and Reichard, 2008).

However, not only facial attractiveness affects the ways in which individuals perceive and describe others. The anatomy literature indicates sex differences in facial features (e.g., Enlow, 1982; Gray and Standring, 2008) with female faces having more in common with infantile, babyface-like facets compared with male faces (Guthrie, 1976). Lorenz (1943) defined this set of immature facial characteristics (e.g., big round eyes, a big head, a round face, or a small nose) as the so-called baby schema (German: "Kindchenschema") or babyfacedness. In an evolutionary approach, Perrett et al. (1998) attributed the preference of humans for feminine facial features to the correlations of estrogen-dependent traits with aspects of health and reproductive fitness. Nevertheless, the costs of this form of attractiveness were addressed by investigating whether the more babyish faces of women compared with men further contribute to the attribution of sex-related characteristics and stereotypes. Presenting equally mature-faced male and female faces weakened stereotyped ascriptions and led to the conclusion that women's average facial features promote ascriptions of stereotyped characteristics (Friedman and Zebrowitz, 1992).

In line with these findings, Eagly and Karau's (2002) role congruity theory provides a framework that can be applied to formulate hypotheses on the effects of babyfacedness on leadership emergence: Whereas the leader role is predominantly characterized by agentic traits, such as being competitive, self-confident, aggressive, objective, and ambitious (see Koenig et al., 2011), people possessing a babyfaced physiognomy were described as less physically strong, less socially dominant, less astute (Zebrowitz McArthur and Apatow, 1983/1984), more naïve, and weaker compared with people possessing more mature faces (Keating et al., 2003; for a review, see Montepare and Zebrowitz, 1998). According to these findings, possessing a babyface is inconsistent with the characteristics required by a leadership role. A growing number of studies focusing on babyfacedness have supported this assumption: For instance, studies have indicated a negative relation between babyfacedness and inferred competence among politicians (Poutvaara et al., 2009), and a negative relation between babyfacedness and leadership status in a high school senior class (Cherulnik et al., 1990). Furthermore, Cherulnik (1995) found in an experiment that facial maturity was a significant predictor of leadership emergence. Investigations of hiring preferences have revealed similar findings (Zebrowitz et al., 1991): Undergraduate participants preferred male and mature-faced applicant targets for leadership positions over female and babyfaced applicants, especially for higher-status positions.

It seems plausible that a person with a high level of ascribed naïveté (defined as a collective term that summarizes how naïve, inexperienced, credulous, and generally intelligent an individual is perceived to be) due to his or her facial appearance would find it difficult to meet leader role expectations. Women may be particularly likely to suffer from this misperception because they are more likely to encounter prejudice due to role incongruence in the first place (Eagly and Karau, 2002). In addition, women are more likely to be judged as babyfaced (Friedman and Zebrowitz, 1992; Chiao et al., 2008; Olivola and Todorov, 2010). In the second part of this study, we therefore hypothesized that ascribed babyfacedness would negatively predict leadership emergence (Hypothesis 3). Moreover, and analogous to Hypothesis 2, we investigated whether these babyface-specific attributions could explain this negative association: We hypothesized that ascribed naïveté would negatively mediate the impact of ascribed babyfacedness on leadership emergence (Hypothesis 4).

To summarize the aims of the present study, we intended to investigate the impact of facial characteristics on women's chances of emerging as a leader, and further, whether the ascription of other attributes that are based on facial characteristics would mediate this effect in the setting of a women's leadership contest. On the one hand, we hypothesized that ascribed attractiveness would be a positive predictor of leadership emergence and that ascribed social competence would mediate this relation. On the other hand, we hypothesized that ascribed babyfacedness would be a negative predictor of leadership emergence and that the ascription of naïveté would mediate this relation. Data collected during contests that were held exclusively for women in three consecutive years (2015–2017) provided the basis for the analyses. The predictor variables (ascribed attractiveness and ascribed babyfacedness) and the mediator variables (ascribed social competence and ascribed naïveté) comprised raters' evaluations of the women's appearance in self-recorded video interviews from the contest application phase. Peer nominations provided by group members after participating in assessment-center-like tasks at the contests served as the dependent variable, leadership emergence. We first present a pilot study, which uses data from the 2015 contest, and then the main study, which contains a larger sample (data from the 2016 and 2017 contests) and a revised method.

# GENERAL METHODS

The analyses were based on data collected across three consecutive years of a women's leadership contest conducted in Germany.<sup>1</sup> Data collection took place in two phases: In the application phase, women applied to participate in the contests by submitting their curriculum vitae and letters of recommendation as well as by using a professional software platform to record a video interview consisting of their answers to five or six questions. For each question, applicants could prepare their answer within 2 min. Moreover, participants were instructed to keep each recorded response to a maximum of 3 min. The women who passed this application phase on the basis of the quality and persuasive power of all of their application documents were invited to participate in the actual leadership contest. At the leadership contest, between attendance at keynote presentations, workshops, and networking slots, contest participants took part in 2 to 3 assessment-center-like tasks in which they engaged in group tasks or group discussions or prepared speeches in groups of 7 to 12 women each. After each task, the women nominated the three group members who had been most convincing as leaders. Contest participants who received the most nominations by their peers were rewarded with vouchers, for example, for travel tickets or career coaching. Participants' data were included in our study only if they had agreed to the anonymous analysis of their data at the time when they applied to participate in the contest.

<sup>1</sup>Visit www.we-are-panda.com for more information about the contests.

# Measures and Variables – General Information

#### Leadership Emergence

Contest participants competed for peer nominations in each group task in short-term work groups. After each task, women completed a questionnaire to (a) nominate and (b) describe the three group members who had convinced them the most with respect to the group member's potential to act as a leader. Leadership emergence was operationalized as the total number of nominations each woman received from her team members across the entire contest. The more nominations a woman received in the contest, the higher her leadership emergence index. However, we controlled for group size, that is, peer nominations possible in the contest, in the main study.

#### Ascribed Babyfacedness and Ascribed Facial Attractiveness

Raters (three in the pilot study, nine in the main study) evaluated women's faces by watching their application videos. Raters watched a 25-s cut of one interview question without sound in order to focus solely on the participants' outward appearance and to avoid any undue influence of other aspects (e.g., spoken content or voice). The length of the video sequence was determined on the basis of pretests as well as literature suggesting that a rating of appearance was possible after this amount of time (see Ambady and Rosenthal, 1993).

After the video sequence had finished, raters assessed the women's facial attractiveness and babyfacedness. For each babyfacedness item, a rating manual depicted example pictures for the extreme values in advance (see **Figure 1** for an example of the eyes facet). The means of the attractiveness items (three in the pilot study: attractive, good looking, and pleasing; two in the main study: attractive, good looking; all ranging from 1 = low value to 10 = high value) averaged across all raters served as the predictor variable ascribed facial attractiveness. The predictor variable ascribed babyfacedness resulted from the mean ratings of the four babyfacedness items (nose, eyes, face form, overall impression, ranging from 1 = not babyfaced to 7 = very babyfaced) averaged across all raters.
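The aggregation described above (item means averaged across raters) can be illustrated with a minimal Python sketch. The function name `composite_score` and the example ratings are hypothetical and are not taken from the authors' OSF materials, which use R; the sketch assumes that every rater rated every target on every item.

```python
import numpy as np

def composite_score(ratings):
    """Average item ratings within each rater, then across raters.

    ratings: array-like of shape (n_raters, n_targets, n_items).
    Returns one composite score per target.
    """
    ratings = np.asarray(ratings, dtype=float)
    per_rater = ratings.mean(axis=2)   # mean of the items for each rater and target
    return per_rater.mean(axis=0)      # mean across raters for each target

# Hypothetical data: 3 raters, 2 targets, 2 attractiveness items (scale 1 to 10)
example = [
    [[6, 7], [3, 4]],
    [[8, 8], [5, 4]],
    [[7, 6], [4, 4]],
]
scores = composite_score(example)
```

With complete data, averaging items first and raters second gives the same result as averaging over all ratings at once; the two-step form simply mirrors the wording of the procedure.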

Before raters gave their evaluations, they were trained on the basis of a rating manual. In the pilot study, this manual contained black-and-white pictures for the extreme values of the babyfacedness items (for an example, see **Figure 1A**). Before we conducted the main study, the manual was revised in accordance with information obtained from the pilot study. Changing the instructions from "How would you evaluate the women shown in the videos. . ." to "How would most people evaluate the women shown in the videos compared with women of the same age?" was intended to reduce subjectivity in the raters' impressions. The revised manual further contained higher quality pictures for the visual attributes of the babyfacedness items, created with the freeware program MakeHuman (version 1.1.0, MakeHuman team, 2016), which replaced the lower quality pictures from the former manual version. This procedure allowed detailed face avatars to represent the extreme and middle values of each item (for an example, see **Figure 1B**). We further revised or removed single items (see below; see the revised manual here<sup>2</sup>).

FIGURE 1 | Example excerpt from the two versions of the rating manual for the babyfacedness subfacet eyes. (A) Displays the extreme values for "not babyfaced" (i.e., small and narrow eyes) in contrast to "babyfaced" (i.e., big and round eyes) from the first version of the manual. (B) Displays extreme values from the revised manual, including an example of mid-ranged babyfaced eyes.

## Ascribed Social Competence and Ascribed Naïveté

After having evaluated the women's facial characteristics, the raters watched the videos (see section "Ascribed Babyfacedness and Ascribed Facial Attractiveness") a second time, uncut and with the sound on, and estimated the women's social competence and naïveté. The mean of all items per scale averaged across all raters served as variables further used to test for mediation effects. Ascribed social competence consisted of six items in the pilot study (sociable, confident, warm-hearted, popular, empathic, and socially competent) and of five items in the main study (verbally skilled, sociable, confident, anxious [negatively coded], socially competent; all ranging from 1 = low value to 7 = high value). Ascribed naïveté consisted of three items in the pilot study (naïve, mature [negatively coded], critical thinker [negatively coded]), and of four items in the main study (naïve, inexperienced, gullible, generally intelligent [negatively coded]; all ranging from 1 = do not agree at all to 7 = strongly agree).

<sup>2</sup>https://osf.io/sf72x/
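To illustrate how negatively coded items could enter a scale mean, the following Python sketch reverses such items on the 1 to 7 scale before averaging. The reversal rule (lowest plus highest scale point minus the rating) is a common convention and is an assumption here; the text does not state how the recoding was carried out. The helper names and the example ratings are hypothetical.

```python
def reverse_code(value, low=1, high=7):
    """Mirror a rating on its scale, so that 1 becomes 7, 2 becomes 6, and so on."""
    return low + high - value

def scale_mean(item_ratings, negatively_coded):
    """Mean of all items after reversing the negatively coded ones."""
    total = 0.0
    for name, value in item_ratings.items():
        total += reverse_code(value) if name in negatively_coded else value
    return total / len(item_ratings)

# Hypothetical ratings of one target by one rater on the main-study naivete items
ratings = {"naive": 2, "inexperienced": 3, "gullible": 2, "generally_intelligent": 6}
naivete = scale_mean(ratings, negatively_coded={"generally_intelligent"})
```

In this example, a high rating on "generally intelligent" lowers the naïveté score, which is the intended direction of a negatively coded item.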

#### Demographic Data

Age, professional experience, and leadership experience in years were obtained from the CVs the women submitted with their applications.

# PILOT STUDY

# Methods

#### Participants

In sum, 109 women who participated in the contest in 2015 (out of a total of 187 women) consented to the use of information from their CVs and interviews for research purposes and were included. Videos of eight contest participants were excluded due to insufficient picture or sound quality. The ages of the final sample of 101 women ranged from 22 to 53 years (M = 31.96, SD = 6.68). On average, these women had 9.73 years (SD = 6.04) of professional experience and 6.22 years (SD = 4.54) of leadership experience. Furthermore, 79.2% were German citizens, 5.0% other, and 15.8% did not provide information about their nationality.

#### Materials

In the pilot study, three raters evaluated facial features and characteristics using the interview sequences from the contest application phase in 2015.<sup>3</sup>

#### Procedure

#### **Leadership emergence**

In the leadership contest, the women worked on three assessment-center-like tasks. For each task, the participants were randomly assigned to a different group of 10 to 12 women. The challenges in the tasks were (a) to engage in a 50-min group discussion about the skills needed for successful leadership, (b) to take 80 min to build a wooden construction, including several predefined intermediate goals, and (c) to take 50 min to prepare a political speech for a female executive politician. Each participant gave written nominations of the three most convincing group peers after each task (see section "Leadership Emergence"). Leadership emergence was calculated as described in the general information (see section "Leadership Emergence").

## **Facial characteristics and ascribed traits**

The ascribed facial attractiveness scale (α = 0.98), the ascribed babyfacedness scale (α = 0.78), the ascribed social competence scale (α = 0.85), and the ascribed naïveté scale (α = 0.88) all demonstrated sufficient internal consistency. The ICC values were acceptable, ranging from 0.63 to 0.82 (see section "Measures and Variables – General Information" for all items and procedure).
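For readers unfamiliar with the internal consistency coefficient reported above, the standard formula for Cronbach's α can be sketched in Python as follows. This is an illustration of the general formula only; the text does not specify whether α was computed on rater-averaged item scores, so the input layout (one row per target, one column per item) is an assumption, and the function name is hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (n_targets, n_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)        # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # sample variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Two items that agree perfectly yield the maximum value of alpha
alpha_perfect = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```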

#### Statistical Analyses

To test Hypotheses 1–4, we conducted mediation analyses using the "lavaan" package (version 0.6-1, Rosseel, 2018) in R (version 3.5.0). To account for non-normality in our data, we used robust ULS estimators. Data and R scripts are published in the Open Science Framework (OSF<sup>4</sup> ).

# Results

Descriptive statistics for the dependent and independent variables are presented in **Table 1**, and the correlations between the variables are presented in **Table 2**. Results of the mediation analysis computed to test Hypotheses 1 and 2, displayed in **Figure 2A**, revealed that ascribed facial attractiveness was a significant predictor of ascribed social competence (β = 0.52, SE = 0.08, p < 0.001) and that ascribed social competence was a significant predictor of leadership emergence (β = 0.28, SE = 0.12, p = 0.018). A bootstrapping estimation approach with 1,000 bootstrapped samples indicated a significant indirect effect (βindirect = 0.15, 95% CI [0.02, 0.27], p = 0.025). Yet, ascribed facial attractiveness did not significantly predict leadership emergence before (βtotal = 0.04, SE = 0.10, p = 0.706) or after (βdirect = −0.11, SE = 0.11, p = 0.351) adding the mediator, ascribed social competence. Because there was a significant indirect effect (i.e., the higher a contest participant's ascribed facial attractiveness, the higher her ascribed social competence; and the higher her ascribed social competence, the higher her leadership emergence), we concluded that ascribed social competence acted as a mediator between ascribed facial attractiveness and leadership emergence, even though the analysis did not reveal a significant total effect.<sup>5</sup>

The results of the mediation analysis for Hypotheses 3 and 4 indicated that ascribed babyfacedness was a significant predictor of ascribed naïveté (β = 0.26, SE = 0.10, p = 0.008) and that ascribed naïveté was a significant predictor of leadership emergence (β = −0.33, SE = 0.09, p < 0.001). The bootstrapping estimation approach with 1,000 bootstrapped samples indicated a significant indirect effect (βindirect = −0.09, 95% CI [−0.17, −0.02], p = 0.035). Ascribed babyfacedness did not significantly predict leadership emergence before (βtotal = −0.10, SE = 0.10, p = 0.336) or after adding the mediator, ascribed naïveté (βdirect = −0.01, SE = 0.10, p = 0.920). The results indicated a mediating effect of ascribed naïveté, which means that the higher a participant's ascribed babyfacedness, the higher her ascribed naïveté; and the higher her ascribed naïveté, the lower her leadership emergence, as **Figure 2B** illustrates.
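The logic of a bootstrapped indirect effect, as used in both mediation analyses above, can be sketched in Python. This is not the authors' procedure: it swaps in ordinary least squares regressions and a percentile bootstrap in place of the lavaan model with robust ULS estimation, and it runs on synthetic data generated with arbitrary, hypothetical path values. The coefficients it produces are unstandardized unless the variables are z-scored first.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product of the a path (m on x) and the b path (y on m, controlling for x)."""
    a = np.polyfit(x, m, 1)[0]                          # slope of m regressed on x
    design = np.column_stack([np.ones_like(x), x, m])   # intercept, predictor, mediator
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    b = coef[2]                                         # coefficient of the mediator
    return a * b

def bootstrap_ci(x, m, y, n_boot=1000, seed=0):
    """95% percentile interval of the indirect effect over resampled cases."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)                # draw cases with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    return np.percentile(estimates, [2.5, 97.5])

# Synthetic data with the pilot-study sample size; path values are hypothetical
rng = np.random.default_rng(42)
n = 101
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * m + rng.normal(size=n)

point = indirect_effect(x, m, y)
low, high = bootstrap_ci(x, m, y)
```

An interval that excludes zero corresponds to what the text describes as a significant indirect effect, irrespective of whether the total effect is significant.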

# MAIN STUDY

# Methods

## Participants

We included data from a total of 195 contest participants from 2016 (111 of the 178 participants gave consent) and 2017 (84 of the 114 participants gave consent) in our main study. The ages of the contest participants ranged from 21 to 52 (M = 32.81, SD = 6.69). On average, these women had 9.61 years (SD = 6.12) of professional experience and 4.69 years (SD = 4.33) of leadership experience (for information separated by year, see **Table 3**). In 2016, 75.6% were German citizens, 2.4% other, and 22% did not provide information about their nationality. In 2017, 18.2% were German citizens, 9.1% other, and 71.7% did not provide information about their nationality.

#### Materials

For the 2016 and 2017 samples used in the main study, nine raters analyzed the contest participants' self-recorded video interviews.<sup>6</sup>

#### Procedure

#### **Leadership emergence**

In the leadership contests in 2016 and 2017, in groups that were each comprised of 7 to 10 women, the women worked on three assessment-center-like group tasks in 2016 and on only two tasks in 2017. We excluded one of the tasks used in the 2016 contest from our analysis to enhance comparability between the

<sup>3</sup>Detailed interview questions can be obtained from the first author. <sup>4</sup>https://osf.io/kxm4w/

<sup>5</sup>Counter to previous standards (see Baron and Kenny, 1986), researchers have suggested that it is acceptable to interpret a mediation effect on the basis of a significant indirect effect even in the absence of a significant direct or total effect (e.g., MacKinnon et al., 2007; Hayes, 2018).

<sup>6</sup>Detailed interview questions can be obtained from the first author.

TABLE 1 | Descriptive statistics for the variables for both the pilot study and the main study, presented separately by contest year.


The more nominations (leadership emergence) a participant received in the contest the higher her leadership emergence. Ascription ratings were on rating scales [anchor in brackets] on which higher scores indicate a higher degree of each ascribed characteristic. N, sample size; M, mean; SD, standard deviation.

TABLE 2 | Correlations between leadership emergence (dependent variable), ascribed babyfacedness and ascribed attractiveness (predictors), and ascribed social competence and ascribed naïveté (mediators) for both the pilot study and the main study.


Higher variable scores indicate a higher degree of each ascribed characteristic. r = Pearson correlation coefficients. The values from the pilot study are presented above the diagonal, and the values from the main study (using centered variables) are presented below the diagonal. ∗∗∗p < 0.001, ∗∗p < 0.01, <sup>∗</sup>p < 0.05.

two samples. In 2016, the challenges in the tasks were (a) to engage in a 45-min group discussion and (b) to spend 40 min preparing an industrial lobby presentation. In 2017, the women were asked (c) to spend 60 min preparing a huge event and (d) to spend 60 min preparing for a public debate on the gender issue. Each participant provided written nominations of the three most convincing group peers after each task (see section "Leadership Emergence"). We summed the number of nominations for each participant. For the analyses, we divided this sum of nominations by the number of peer nominations possible (i.e., group size per task minus one) in order to control for contaminations due to effects of group size.
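The group-size correction described above can be written out as a short Python sketch. The text leaves open whether the denominator was formed per task or pooled across tasks; the sketch assumes pooling (the sum over tasks of group size minus one), and the function name and example numbers are hypothetical.

```python
def leadership_emergence(nominations_per_task, group_sizes):
    """Nominations received, divided by the number of nominations possible.

    Assumption: the number possible is the sum over tasks of (group size - 1),
    that is, every other member of each group could have nominated the participant.
    """
    if len(nominations_per_task) != len(group_sizes):
        raise ValueError("one group size is needed for each task")
    received = sum(nominations_per_task)
    possible = sum(size - 1 for size in group_sizes)
    return received / possible

# Hypothetical participant: 4 nominations in a group of 9, 2 in a group of 7
index = leadership_emergence([4, 2], [9, 7])   # 6 received out of 8 + 6 possible
```

Under this scaling, a participant in a small group is not penalized for having had fewer peers who could nominate her.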

#### **Facial characteristics and ascribed traits**

In the main study, the ascribed facial attractiveness scale (α = 0.98), the ascribed babyfacedness scale (α = 0.82), the ascribed social competence scale (α = 0.93), and the ascribed naïveté scale (α = 0.90) all demonstrated sufficient internal consistency. ICC values for all ratings in the main study ranged from 0.88 to 0.89 for the ascribed facial attractiveness items, from 0.81 to 0.89 for the ascribed babyfacedness items, from 0.71 to 0.84 for the ascribed social competence items, and from 0.68 to 0.83 for the ascribed naïveté items (see section "Measures and Variables – General Information" for all items and the procedure).

#### Statistical Analyses

Before we calculated the mediation analyses in a manner that was similar to the analyses used in the pilot study, we centered the predictor, mediator, and dependent variables on the mean of the respective contest year in order to enhance comparability for the total sample in the main study. For example, a participant's ascribed facial attractiveness may have been high in comparison with the entire female population but not in comparison with her competitors in the contest, if all contest participants were also rated as highly attractive. By centering the variables on the respective sample means, we corrected for such potential group characteristics.
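Centering on the contest-year mean amounts to subtracting each year's mean from the scores of that year's participants. A minimal Python sketch with hypothetical values:

```python
from collections import defaultdict

def center_by_year(values, years):
    """Subtract the mean of each contest year from that year's values."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, year in zip(values, years):
        sums[year] += value
        counts[year] += 1
    means = {year: sums[year] / counts[year] for year in sums}
    return [value - means[year] for value, year in zip(values, years)]

# Hypothetical attractiveness scores for two participants in each contest year
centered = center_by_year([6.0, 7.0, 4.0, 5.0], [2016, 2016, 2017, 2017])
```

After centering, a score expresses how a participant compares with her competitors from the same year, so differences in the overall rating level between the 2016 and 2017 samples no longer enter the pooled analysis.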

As we did in the pilot study, we ran mediation analyses using the "lavaan" package in R. Seven participants had to be excluded from these analyses due to missing information on group size. We further controlled for possible issues of non-normality by, again, using robust ULS estimators. To control for the high correlations between the predictor variables ascribed facial attractiveness and ascribed babyfacedness as well as between the mediator variables, ascribed social competence and ascribed naïveté, revealed by the a priori analyses, we ran further analyses in which we included all variables simultaneously in one path model. As postulated by Preacher and Hayes (2008), this procedure allowed us to identify the unique capacity of each mediator to account for the impact of the respective predictor on leadership emergence beyond the shared contribution of the two mediators. Data and R scripts are published in the OSF.<sup>7</sup>

#### Results

Contest participants in the main study received an average of 5.58 nominations (SD = 3.62) across the two tasks. Before the variables were centered, the mean of ascribed facial attractiveness was 5.96 (SD = 1.31), and the mean of ascribed social competence was 4.80 (SD = 0.61). The mean of ascribed babyfacedness was 3.97 (SD = 0.85), and the mean of ascribed naïveté was 3.41 (SD = 0.63). See **Table 1** for values separated by contest year, and **Table 2** for intercorrelations between the variables.

<sup>7</sup>https://osf.io/kxm4w/

TABLE 3 | Age and years of professional and leadership experience of women in the main study sample, presented separately by contest years and overall.

N, sample size; M, mean; SD, standard deviation.

As hypothesized, the data analyses indicated that ascribed social competence mediated the influence of ascribed facial attractiveness on leadership emergence. Confirming the results of the pilot study, the higher a contest participant's ascribed facial attractiveness, the higher her ascribed social competence (β = 0.19, SE = 0.07, p = 0.005), and the higher her ascribed social competence, the higher her leadership emergence (β = 0.26, SE = 0.07, p < 0.001). The bootstrapping estimation approach with 1,000 bootstrapped samples indicated a significant indirect effect (βindirect = 0.05, 95% CI [0.01, 0.10], p = 0.030). However, ascribed facial attractiveness did not significantly predict leadership emergence either before (βtotal = 0.12, SE = 0.07, p = 0.084) or after adding the mediator, ascribed social competence (βdirect = 0.07, SE = 0.07, p = 0.344; see **Figure 2A**).

The second mediation analysis revealed that ascribed naïveté functioned as a mediator of the influence of ascribed babyfacedness on leadership emergence. According to our results, ascribed babyfacedness significantly predicted ascribed naïveté (β = 0.18, SE = 0.07, p = 0.009). Further, ascribed naïveté significantly predicted leadership emergence (β = −0.26, SE = 0.07, p < 0.001), which also supports the results of the pilot study. In other words, the higher a participant's ascribed babyfacedness, the higher her ascribed naïveté, and the higher her ascribed naïveté, the lower her leadership emergence. The bootstrapping estimation approach with 1,000 bootstrapped samples indicated a significant indirect effect (βindirect = −0.05, 95% CI [−0.10, −0.01], p = 0.031). Further, in line with the results of the pilot study, ascribed babyfacedness did not significantly predict leadership emergence either before (βtotal = −0.02, SE = 0.08, p = 0.759) or after adding the mediator, ascribed naïveté (βdirect = 0.02, SE = 0.08, p = 0.767; see **Figure 2B**).

We further tested the unique ability of each mediator to mediate the corresponding relation between the predictor and leadership emergence in one joint path model. In this model, we controlled for the correlations between the predictor variables and the mediator variables in order to identify the unique impact of each mediator. Overall, the model showed a very good fit, χ²(2, N = 188) = 0.43, CFI = 1.00, RMSEA < 0.001, SRMR = 0.012, and is displayed in **Figure 3** with all individual regression coefficients. The unique indirect effects were all nonsignificant. More specifically, neither the effect of ascribed facial attractiveness on leadership emergence through ascribed social competence (βindirect = 0.03, 95% CI [−0.01, 0.08], p = 0.271), nor the effect of ascribed babyfacedness on leadership emergence through ascribed naïveté (βindirect = −0.03, 95% CI [−0.09, 0.03], p = 0.334) reached the conventional thresholds of significance. Furthermore, there were no unique direct effects of ascribed facial attractiveness (βdirect = 0.14, SE = 0.08, p = 0.053), ascribed babyfacedness (βdirect = 0.09, SE = 0.10, p = 0.371), ascribed social competence (β = 0.16, SE = 0.12, p = 0.209), or ascribed naïveté (β = −0.14, SE = 0.12, p = 0.264) on leadership emergence. However, when we combined the unique direct and indirect effects of ascribed facial attractiveness, we found a significant unique total effect on leadership emergence (βtotal = 0.17, SE = 0.07, p = 0.018), suggesting that ascribed facial attractiveness had an effect on leadership emergence above and beyond ascribed babyfacedness and ascribed naïveté.
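The reported main-study coefficients can be read against two standard identities of mediation models: the indirect effect is the product of the two paths, and the total effect is the sum of the direct and indirect effects. The following Python sketch checks the reported values against these identities. The tolerance of 0.015 is an assumption that allows for three values each rounded to two decimals; the dictionary keys are hypothetical labels.

```python
TOLERANCE = 0.015  # assumed allowance for rounding of the reported values

# Reported standardized coefficients of the two simple mediation models (main study)
simple_models = {
    "attractiveness_via_social_competence": {
        "a": 0.19, "b": 0.26, "direct": 0.07, "indirect": 0.05, "total": 0.12,
    },
    "babyfacedness_via_naivete": {
        "a": 0.18, "b": -0.26, "direct": 0.02, "indirect": -0.05, "total": -0.02,
    },
}

def consistent(model, tol=TOLERANCE):
    """Check indirect = a * b and total = direct + indirect within rounding error."""
    product_ok = abs(model["a"] * model["b"] - model["indirect"]) <= tol
    sum_ok = abs(model["direct"] + model["indirect"] - model["total"]) <= tol
    return product_ok and sum_ok

checks = {name: consistent(model) for name, model in simple_models.items()}

# Joint model: unique direct (0.14) plus unique indirect (0.03) effect of attractiveness
joint_total_ok = abs(0.14 + 0.03 - 0.17) <= TOLERANCE
```

The same decomposition shows why a total effect can be significant in the joint model although neither of its two components is significant on its own.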

# GENERAL DISCUSSION

The aim of the present study was to increase the understanding of the effects of facial appearance and ascribed characteristics on leadership emergence by employing data collected at a women's leadership contest. We captured characteristics of facial appearance by rating participants' application videos as they responded to standardized interview questions. Peer nominations of the women from two exercises that were part of the contest served as a measure of leadership emergence. Analyses of data from both a pilot study and a main study revealed a significant indirect effect of ascribed facial attractiveness and babyfacedness ratings made by independent raters on leadership emergence. Ascribed social competence mediated the relation between ascribed facial attractiveness and leadership emergence, and ascribed naïveté mediated the relation between babyfacedness and leadership emergence. However, the results did not support Hypotheses 1 and 3 because neither ascribed facial attractiveness nor ascribed babyfacedness significantly predicted leadership emergence. When we tested the hypotheses separately, these findings held regardless of whether the mediators were included. Therefore, the analyses in this study revealed that neither the women's perceived attractiveness nor their babyfacedness was directly related to their leadership emergence; instead, both had an impact through ascribed aspects that were explicitly or implicitly related to these characteristics.

Following the recommendations of a reviewer, we also tested the unique effects of our two mediators and predictors on leadership emergence. Analyses revealed a significant unique effect on leadership emergence only for ascribed facial attractiveness after controlling for ascribed babyfacedness and ascribed naïveté. This finding suggests that the previously reported indirect effects of ascribed social competence and ascribed naïveté showed a large overlap of variance and that the shared and not the unique portion of ascribed personality traits variance mediated the effect of ascribed facial appearance on leadership emergence. Therefore, this finding could signify that an underlying latent variable, namely, ascribed leadership personality, is mediating the effect between ascribed facial appearance and leadership emergence. However, our study was not equipped to handle such a latent variable model due to an insufficient sample size and number of indicators. Future research should investigate the unique and common portions of indirect effects of ascribed facial appearance on leadership emergence through ascribed personality traits by including more indicators for ascribed facial appearance and ascribed personality traits.

These findings also suggest a higher relevance of facial attractiveness for leadership emergence because babyfaced women were perceived as more attractive. However, the literature indicates that babyfacedness is related to attractiveness, especially in women (Zebrowitz McArthur and Apatow, 1983/1984). The model employed cannot be used to determine such a halo effect (i.e., possessing a babyface causing higher attractiveness ratings in women) because we had defined attractiveness and babyfacedness as equal predictors. In the rating procedure, raters judged facial attractiveness directly before assessing babyfacedness; thus, the latter could not have influenced the attractiveness ratings directly. However, the results of the joint model could further indicate a compensation for the negative influence of babyfacedness on leadership emergence at the same time: In the joint model, partialling out the variance of babyfacedness may explain the significant effect of facial attractiveness on leadership emergence. Babyfacedness has been described as being detrimental to leadership emergence (Cherulnik et al., 1990), but also as positively correlated with facial attractiveness in women (Zebrowitz McArthur and Apatow, 1983/1984). After controlling for babyfacedness in facial attractiveness, the negative effect of babyfacedness via facial attractiveness was no longer relevant. Thus, the 'pure' facial attractiveness revealed an effect on leadership emergence (i.e., without the confounding effect of babyfacedness). Facial attractiveness was thus no longer influenced by babyfacedness but by other characteristics such as symmetry, healthy appearance, and other aspects that we did not explicitly assess.

Further bearing in mind the relation between babyfaced features and femininity (Friedman and Zebrowitz, 1992), the question arises as to whether higher babyfacedness (and thus higher attractiveness) accents the feminine gender role in women and, hence, increases discrimination against them (Zebrowitz et al., 1991). Concluding from our results revealing that babyfacedness weakened the positive effect of attractiveness, it seems that the part of women's attractiveness associated with femininity is not the one promoting but rather preventing their leadership emergence. Further, would babyfaced women therefore have to increase their agentic behaviors to compensate for their even more accentuated gender role in order to emerge as leaders, as indicated by role congruity theory (see Eagly and Karau, 2002; Eagly and Carli, 2007)? Because features of appearance and accordingly ascribed traits depict only some relevant aspects that predict leadership emergence, future research should also include measures of behaviors and other personality traits, for instance, perceived agency and communion (Johnson et al., 2008; Schock et al., 2018) or task-specific aspects (e.g., Eagly and Karau, 1991; Badura et al., 2018).

Ascribed social competence, related to facial attractiveness, indicated an advantage for obtaining a large number of nominations in the women's leadership contest. On the other hand, ascribed naïveté, related to babyfacedness, indicated a disadvantage. The socially enhancing effects of aspects of attractiveness were well in line with the literature (e.g., Eagly et al., 1991; Langlois et al., 2000). Moreover, a childlike appearance and ascribed naïveté represent characteristics that are counter to the characteristics typically ascribed to leaders (e.g., Zebrowitz McArthur and Apatow, 1983/1984; Zebrowitz et al., 1991; Friedman and Zebrowitz, 1992; Masip et al., 2004; Riggio and Riggio, 2010).

These results are particularly remarkable because the predictor and criterion variables were assessed with different methods. Although both kinds of variables were assessed with ratings, the ratings were made by different samples: The raters of the videos were students, whereas the raters in the contest were competing peers. Furthermore, the predictor variables consisted of ratings of recorded video interviews, whereas the criterion variables were based upon real-life interactions in a competitive situation. As the two kinds of raters may have perceived the same targets of rating in different contexts, somewhat different processes – or at least differing on the level of awareness – may have taken place. For instance, the participants of the contest may have focused more on the competition and their own performance, whereas the raters in the lab explicitly concentrated on the features of facial appearance and personality they had to assess. The finding that ratings of video recordings were related to ratings from a real-life situation may hint at two possible mechanisms: First, the impressions that the raters got from the video recordings were similar to the impressions that the peers had during the contest, the latter of which may have shaped the interactions and ascriptions to a certain extent (see Antonakis and Eubanks, 2017). Therefore, the results may suggest that the ascribed social competence as well as the ascribed naïveté of the women may reflect the important and consensual information that was relevant for raters in both situations (i.e., the application and the contest).

Nevertheless, our results are well in line with previous research on the relevance of appearance for election outcome predictions in political leadership (e.g., Todorov et al., 2005; Antonakis and Dalgas, 2009). In other words, ratings of aspects of physical appearance made by individuals who are not voting in the actual leader election (or as in our study, in the leadership contest) could in part predict the outcome of a leader election (or as in our study, a leader competition).

Second and alternatively, these results may also indicate that consistency in first impressions based on facial appearance may lead people to treat other individuals in certain ways that subsequently shape these individuals' outcomes and behaviors (see Rule and Ambady, 2010; Lukaszewski and Roney, 2011). Further, such impressions and associations based on facial appearance may be extended to the interactional mechanism that underlies a self-fulfilling prophecy (Snyder et al., 1977; further, see Zebrowitz and Montepare, 2008; Todorov et al., 2015). Acting on the assumption that the nominations made by the peers in the contest as well as the video ratings made by external raters reveal real aspects of the participants, according to the self-fulfilling prophecy mechanism, the participants may have developed their characteristics (e.g., social competence or naïveté) on the basis of how they were treated due to their facial appearance, attractiveness, and babyfacedness, respectively. Analogously, at the contest, women may have treated an attractive peer who they therefore perceived as socially competent in a way that increased the peer's confidence in acting like a leader. This, in turn, may have led to more actual leadership behavior, thus convincing the other contestants to nominate her as a convincing leader. For example, the fact that attractiveness may have served as a leadership cue may have led participants to pay more attention to particular women in the contest, which in turn may have increased the abilities of these particular women to excel in the group task (see Gerpott et al., 2017).

Another advantage of this study is that the data collection was part of a real contest, and the sample consisted of real women who participated in the study to win the contest and enhance their reputations. This competitive setting could be interpreted as comparable to an actual personnel selection situation (e.g., an assessment center) where performance is required within a short period of time. However, the rather unconventional study setting somewhat restricts the interpretation with regard to generalizability: Peers from group tasks usually do not make hiring decisions in personnel selection, but rather, external observers or assessment experts do. Nevertheless, we aimed to contribute to knowledge about which variables are relevant for leadership emergence. Conclusions cannot be drawn with respect to leadership ascription or leadership performance in the medium or long term, especially in comparison with day-to-day teamwork without a competitive character, even though the contest participants had to balance competitiveness with cooperation in working toward a common goal during the tasks. However, our research revealed that appearance is especially important when no information or only a little other information is available (Olivola et al., 2012). In line with previous research, we would expect the impact of facial appearance on leader selection to decrease when information about competence is available because competence-related information should have a greater impact on selection decisions (Kaufmann et al., 2017). Although facial appearance had an impact on women's chances of getting more nominations in the leadership contest in this study, we would expect this effect to diminish in the medium and long terms after women have entered the workplace.

Although the setting allowed obtaining an unusual sample of women, future research should include male participants as well as other ethnicities (e.g., Blacks) to investigate interactions between sex and other personal attributes. For instance, previous research indicated that attractiveness judgments were better predictors of election success for women than for men (Berggren et al., 2010). Furthermore, whereas the current study indicated a disadvantage for individuals possessing facial physiognomy reflecting babyfacedness, Livingston and Pearce (2009), for example, found that a larger number of black male CEOs had babyfaced characteristics in their study compared with white male CEOs. A newer study addressing political voting decisions in an Asian culture revealed that babyfacedness was the strongest predictor of percentages of votes beyond competence, attractiveness, warmth, and background characteristics (Chang et al., 2017).

Our findings provide support for earlier studies (e.g., Sczesny and Kühnen, 2004; Olivola and Todorov, 2010; Re and Perrett, 2014) that found that aspects of physical appearance such as facial attractiveness and babyfacedness affect the attribution of job-relevant characteristics in women. This effect may also play an important role in personnel selection. Future studies should investigate the particular relevance of physical appearance with regard to job profiles and job domains. Furthermore, future studies should investigate how other aspects of physical appearance (e.g., body size) impact this effect and should further investigate strategies for overcoming possible disadvantages related to facial appearance, for example, via clothing or hairstyle.

Visual appearance often influences the first impression a person makes, especially in personnel selection (e.g., based on photographs in CVs) and therefore affects the attributions made to an individual. From a more practical perspective, we encourage organizations and hiring experts to enhance their awareness of such ascription- and stereotype-inducing mechanisms, helping women to overcome these gender-related obstacles to achieve leadership positions equally to their male counterparts. The use of standardized assessment criteria that are determined in advance (from the side of human resources) and the decision not to include a photo in a job application (from the side of an applicant) may be first steps to diminish such effects, at least with regard to aspects of (facial) appearance.

# DATA AVAILABILITY


Datasets are in a publicly accessible repository: the datasets analyzed for this study can be found in the Open Science Framework (https://osf.io/kxm4w/).

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the statutes of the University of Salzburg (part XI), Ethic Committee of the Paris Lodron University of Salzburg. The protocol was approved by the Ethic Committee of the Paris Lodron University of Salzburg. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

FG collected the data for the pilot study, co-revised the rating manual, and supervised and organized the data collection and analyses for the main study. She mainly contributed to writing the Method, Results, and Discussion sections of the manuscript and revised earlier drafts of the entire manuscript. CV conceptualized the research, data collection and analyses of both the pilot study and the main study. She mainly contributed to writing the first versions of the Introduction and Discussion sections as well as the Results section of the pilot study. TO supervised the conceptualization of this research regarding the operationalization and methods. She revised earlier drafts of the manuscript, in particular providing theoretical input for the Introduction and Discussion sections.

# ACKNOWLEDGMENTS

We thank Anne-Kathrin Schock and Thomas Scherndl for their supervisory support regarding conceptualization and analyses. We further thank the student assistants Moritz Baumgärtner, Lennart Dialer, Julian Fuchs, Verena Eberle, Lena Kastner, Anna Krämer, Dominik Mengin, Corinna Pfannenstein, Martina Rogalla, and David Schmid for their help with coding and ratings. We thank Manfred Schmitt and Anna Baumert for great theoretical advice, Tobias Koch for methodological input, and Jane Zagorski for helpful suggestions. We further thank the editors Alice H. Eagly and Sabine Sczesny, as well as Leslie Zebrowitz and Minna Paunova for thoughtful and supportive reviews. We thank the University of Salzburg for financial funding of the Open Access Publication fee. Finally, we thank Isabelle Hoyer and Stuart Cameron for allowing us to conduct research in the context of their leadership contest for female professionals, and the participants of the 2015, 2016, and 2017 contests who agreed to share their data for research purposes.

# REFERENCES

of human facial attractiveness]. Available at: http://psydok.psycharchives.de/jspui/handle/20.500.11780/45

Enlow, H. D. (1982). Handbook of Facial Growth. Philadelphia, PA: W B Saunders.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Gruber, Veidt and Ortner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Perpetuating Inequality: Junior Women Do Not See Queen Bee Behavior as Negative but Are Nonetheless Negatively Affected by It

Naomi Sterk<sup>1</sup>\*, Loes Meeussen<sup>1,2</sup> and Colette Van Laar<sup>1</sup>

<sup>1</sup> Center for Social and Cultural Psychology, KU Leuven, Leuven, Belgium, <sup>2</sup> Postdoctoral Fellowship, Research Foundation–Flanders, Brussels, Belgium

Previous research has revealed that women may attempt to avoid negative gender stereotypes in organizations through self-group distancing, or "queen bee", behaviors: emphasizing masculine qualities, distancing themselves from other women, and legitimizing organizational inequality. Factors that increase self-group distancing have been identified (e.g., existing discrimination and low group identification), but it is unknown how self-group distancing by an ingroup leader is perceived by and affects subordinates of the negatively stereotyped group. In the current study, female participants received ambiguous negative feedback from a male versus female leader displaying queen bee-type versus neutral behavior. As expected, a male leader displaying queen bee-type behavior was seen as having less positive intent than a male leader displaying neutral behavior, which in turn increased how sexist he was perceived to be. A female leader displaying queen bee (vs. neutral) behavior was not seen as having less positive intent, which thus did not indirectly influence perceived sexism. Behavior of both male and female leaders did affect junior women: participants exposed to a leader displaying queen bee-type behavior reported more anger, sadness, and anxiety than participants exposed to a leader displaying neutral behavior. These data provide further evidence that simply adding more women or minorities in key senior positions is insufficient to change inequality if bias in the organization is not tackled. Specifically, exposure to gender inequality can steer female leaders to endorse–rather than change–stereotypes about women, and this behavior is particularly consequential because it (a) might not be recognized as bias and (b) exerts negative effects.

Keywords: self-group distancing, sexism, queen bee effects, negative affect, ambiguity, bias

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Jolien A. van Breen, University of Exeter, United Kingdom

Teri Kirby, University of Exeter, United Kingdom

> \*Correspondence: Naomi Sterk naomi.sterk@kuleuven.be

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 27 March 2018 Accepted: 22 August 2018 Published: 20 September 2018

#### Citation:

Sterk N, Meeussen L and Van Laar C (2018) Perpetuating Inequality: Junior Women Do Not See Queen Bee Behavior as Negative but Are Nonetheless Negatively Affected by It. Front. Psychol. 9:1690. doi: 10.3389/fpsyg.2018.01690

# INTRODUCTION

Despite significant changes in social equality policies and legislation, women remain underrepresented in various fields and in higher positions in society. Within the largest companies in the European Union, women comprise only 5% of CEOs and 23% of board members (European Commission, 2016). These numbers are comparable to those in the United States, in which women in the largest companies comprise 5% of CEOs and 21% of board members (Catalyst, 2017). In Europe as well as in the United States, however, these numbers do mark a slight increase in the proportion of women as CEOs and in boards. For example, in 2012, 14% of board members in the largest EU companies and 17% of board members in the largest United States companies were women (Catalyst, 2012; European Commission, 2012). It is often believed that such an increase in female leadership will undoubtedly lead to increased gender equality, as women climbing the organizational ladder are presumed to lower organizational bias, actively resist structural inequalities for example in selection procedures, mentor other women (Stout et al., 2011), and create a more identity safe organizational climate for other women (Purdie-Vaughns et al., 2008). Moreover, the mere presence of women in leadership roles should–by providing real life exemplars of female leaders–initiate change in how the leadership role is perceived (Eagly and Karau, 2002; Koenig and Eagly, 2014). Also, in theory, an increasing number of women in leadership positions should attenuate the traditional 'masculine' stereotype of leadership that exists (Sczesny, 2003).

Drawing on research on the Queen Bee (QB) phenomenon, we maintain, however, that it is not certain that other women will automatically benefit from female leaders in top positions. In the current paper we aim to show that when female leaders display QB behavior, this in reality negatively affects other (junior) women. Such negative effects occur because QB behavior can look similar to sexism and exert similar effects, but might be less likely to be recognized as sexism due to the source being female, hereby impairing effective regulation by the receiver. We here outline these ideas.

Women who advance into higher positions often experience barriers due to their gender. As the figures above show, they often find themselves as one of only a few women in (the top of) male-dominated organizations. Female leaders also often have to walk a tightrope between meeting the demands placed on them due to their leader role (e.g., evidencing agentic qualities) and meeting the demands placed on them due to their gender role (e.g., evidencing high communal qualities) (Eagly and Karau, 2002; Eagly and Sczesny, 2009). Moreover, women advancing into higher positions still face bias and gender stereotypes, which can induce social identity threat. Some women navigate this by engaging in self-group distancing: a process whereby members of stigmatized groups cope with inequality by disengaging with the stigmatized group and assimilating into the non-stigmatized group. Groups for which this has been found include women, the elderly, ethnic minorities, and sexual minorities (Eguchi, 2009; Derks et al., 2011a,b, 2015; Weiss and Lang, 2012; see Derks et al., 2016 for a discussion). Self-group distancing can thus be seen as an individual strategy to resolve social identity threat and to restore a devalued identity (Branscombe et al., 1999a). Female self-group distancing has been coined "Queen Bee" (QB) behavior (Staines et al., 1974), and is characterized by a masculine self-presentation, legitimizing the status quo (e.g., denying gender inequality in the workplace and opposing measures aimed at reducing gender inequality), and underlining dissimilarities to other women (Derks et al., 2016).

Although in recent years more insight has been gathered into factors linked to the development of QB behavior (experiencing gender inequality, low gender identification), it is unknown how QB behavior is interpreted by and affects those who are exposed to it. QB behavior can be seen as ambiguously negative behavior, showing similarities to modern forms of sexism. Unlike traditional forms of sexism, which are more overt and hostile, modern sexism is more subtle and ambiguous. Like QB behavior, modern sexism is characterized by a denial of continued gender discrimination and opposition to women's demands, such as a lack of support for measures aimed at reducing gender inequality (Swim et al., 1995; Tougas et al., 1995). Moreover, QB behavior can extend beyond a passive rejection of the ingroup (i.e., absence of acceptance) to actively devaluing ingroup members (e.g., claiming women have lower career commitment than men; Ellemers et al., 2004; Kaiser and Spalding, 2015). Thus, while QB behavior and sexism are distinct in their underlying concerns and motivations (with QB behavior being a self-regulation strategy, driven by identity devaluation concerns that are a response to social inequality such as sexism), QB behavior and modern sexism do show strong outward similarities. However, no research has experimentally examined these similarities and related these similarities to outcomes for those exposed to QB behavior. We thus set out to examine how junior women perceive and are affected by QB behavior. Specifically, we expected to show: (1) that QB behavior is less likely to be perceived as bias because the source of this behavior is a member of the negatively stereotyped group, (2) that despite not being seen as intentionally negative or as sexist, QB behavior negatively affects women confronted by it, and (3) that the ambiguity accompanying perception of this behavior impairs regulatory strategies.

# How Do Women Perceive Male and Female Leaders Displaying QB(-Type) Behavior?

The first aim of this research is to show that QB behavior is less likely to be perceived as bias due to the source of this behavior being female. Although QB behavior appears similar to modern forms of sexism, it is unlikely that QB behavior will be perceived as being driven by negative or sexist intentions. As QB behavior describes a phenomenon occurring in women, for a female subordinate the source of this behavior will be an ingroup member. Ingroup leaders are generally viewed in a more positive light than outgroup leaders, even if they express a preference for members of the outgroup (Duck and Fielding, 2003). This ingroup-outgroup difference can be explained by a phenomenon known as the intergroup sensitivity effect: the tendency for people to be less skeptical of criticism or the source of this criticism if it is coming from an ingroup member than from an outgroup member (Hornsey et al., 2002). This lower skepticism follows from people tending to ascribe positive intent to someone expressing themselves negatively about their ingroup: They believe that the speaker intends to be constructive and means well (Hornsey and Imani, 2004). Furthermore, an ingroup member would be a non-prototypical source of bias, and bias from a non-prototypical source (e.g., sexism from a female, racism from an ethnic minority) is less likely to be recognized as such (Baron et al., 1991; Inman and Baron, 1996; Inman et al., 1998; Cunningham et al., 2009). Because of this non-prototypicality, behavior that would otherwise be perceived as negative might not necessarily be seen as such when the source of this behavior is an ingroup member. We thus expect that perceptions of a leader displaying QB behavior are influenced by the fact that the source of this behavior is female, and expect the same kind of behavior to be perceived differently if the source of this behavior is male. [When we talk about QB-type behavior from a male leader, we are talking about the same behaviors but enacted by a male leader. Hence by definition this behavior cannot be labeled as queen bee behavior, as queen bee behavior describes self-group distancing in women and can therefore only be displayed by women–a man cannot distance himself from the female ingroup because this is not his ingroup. Therefore, for comprehension purposes, we refer to QB behavior enacted by a male leader as 'QB-type' behavior. When referring to a male and female leader simultaneously we use the term 'QB(-type)' behavior.] Our first hypothesis is thus that a male leader displaying QB-type (vs. neutral) behavior will be perceived as having less positive intent toward women and will therefore be seen as sexist, whereas this will not be the case for a female leader. Put differently, QB behavior by a female leader will go less recognized as a possible instance of sexism because the female source will be seen as having positive intent toward women.
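The first hypothesis has the structure of a moderated mediation: leader behavior is expected to influence perceived sexism through perceived intent, with the first link depending on leader gender. One conventional way to write this structure down is sketched below. The symbols and the dummy coding are illustrative assumptions and are not taken from the study's own analyses: $X$ codes leader behavior (1 = QB[-type], 0 = neutral), $W$ codes leader gender (1 = male, 0 = female), $M$ is perceived positive intent, and $Y$ is perceived sexism.

```latex
\begin{align}
M &= i_M + a_1 X + a_2 W + a_3 XW + e_M \\
Y &= i_Y + c' X + b M + e_Y \\
\text{indirect effect}(W) &= (a_1 + a_3 W)\, b
\end{align}
```

Under this assumed coding, the hypothesis corresponds to a positive conditional indirect effect for a male leader ($a_1 + a_3 < 0$ and $b < 0$, so that $(a_1 + a_3)\,b > 0$) and an indirect effect near zero for a female leader ($a_1 b \approx 0$).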

# How Are Women Affected by QB(-Type) Behavior?

The second aim of this research is to demonstrate that despite not being seen as bias, QB behavior is likely to negatively affect women confronted by it. The "why" of what is being said might be perceived as positive (constructive), but the "what" of what is being said is still similar to modern gender bias. Exposure to bias in its traditional, more blatant, sense has negative consequences such as psychological distress (Klonoff et al., 2000; Szymanski et al., 2009) and negative affect (Wang et al., 2012). Bias does not have to be traditional or obvious, however, to exert negative effects (Barreto and Ellemers, 2005; Salvatore and Shelton, 2007; Murphy et al., 2013). For example, Barreto and Ellemers (2005) exposed participants to statements reflecting either modern sexism (e.g., "discrimination against women is no longer a problem") or traditional blatant sexism (e.g., "women are generally not as smart as men"). The results showed that although participants were not as likely to perceive modern (vs. traditional) sexist statements as sexist, they did show increased anxiety compared to the traditional sexism condition. Our second hypothesis, therefore, is that participants who are exposed to QB(-type) behavior (vs. neutral behavior) will suffer negative consequences (measured through increased negative affect), both when the source of this behavior is male and when the source of this behavior is female (Hypothesis 2).

The third and last aim of this research is to demonstrate that the regulation of negative affect following exposure to QB(-type) behavior will be impaired when the source of this behavior is an ingroup member (female) rather than an outgroup member (male). One way people can regulate negative affect is through identification with the stigmatized group. The rejection-identification model (Branscombe et al., 1999b) posits that although attributions of experiences to bias negatively affect wellbeing through feelings of rejection, such attributions can also protect well-being through identification with the stigmatized group. In other words, group identification can attenuate feelings of rejection caused by bias because one can still feel included in the stigmatized group. Group identification can also be protective because high group identifiers are more attentive to bias and more likely to recognize bias when it occurs, making it more possible to regulate its negative effects (Operario and Fiske, 2001; Major et al., 2003). As we argue below, both of these mechanisms through which group identification can protect well-being in the face of possible bias are likely to be impaired in the context of QB behavior. Firstly, QB behavior involves rejection stemming from a fellow member of the stigmatized group, so protective effects of gender identification with the stigmatized group are less likely to occur. Secondly, we argue that QB behavior is ambiguous regarding attributions to bias because the source of this behavior is an ingroup member. When bias is ambiguous or unclear, higher identification with the stigmatized group does not protect against negative effects (Major et al., 2003; Dardenne et al., 2007). When the source of QB-type behavior is male and thus an outgroup member, both of these protective functions of group identification are presumably not impaired. Accordingly, our third hypothesis is that women who are more highly identified with their gender group will be protected from negative effects of QB(-type) behavior, but only when the source of this behavior is male. Put differently, we expect that exposure to QB(-type) behavior will not increase negative affect in women who are highly identified with their gender group when the source of this behavior is male–while this will be the case when the source is female (Hypothesis 3).

In sum, we predict that QB behavior is less likely to be perceived as bias because the source of this behavior is female, that despite not being seen as intentionally negative or as sexist QB behavior negatively affects women confronted by it, and that the ingroup source of this behavior–a female–impairs effective regulation of these negative effects.

# MATERIALS AND METHODS

# Participants

Participants were 1st-year female Psychology students in Belgium, who participated in a study about 'their perspective on the university and the future' for course credits. Three of the 171 participants who completed the study were excluded from analyses: One participant was excluded for answering with only the most extreme values on each scale and two participants were excluded because they themselves indicated not having participated seriously<sup>1</sup>. The final sample consisted of 168 participants with a mean age of 18.4 years old (SD = 1.17, range 17–29). Most participants (92.3%) self-identified as Belgian and 17.4% (also) identified with another group (such as Dutch, Turkish, Moroccan). We performed post hoc power analyses using G\*Power (Faul et al., 2007) for each of our effects to test whether our sample provided sufficient power. An overview of all power estimates can be found in **Table 1**. Power was sufficient for all effects of interest unless specified in the results and discussion sections.

<sup>1</sup>The interpretation of the results is by and large the same with and without exclusions: without exclusions, the interaction effect between leader gender and leader behavior on perceived sexism approaches significance more strongly (p = 0.072 without exclusions versus p = 0.126 with exclusions) and has more power (0.436 without exclusions versus 0.333 with exclusions). All results pertaining to the rest of H1 (mediation through intent), H2 (negative effects) and H3 (regulation of effects) remain unchanged.
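The post hoc power analyses reported here were run in G\*Power. Purely to illustrate the underlying idea, the sketch below approximates the power of a two-sided comparison between two equally sized groups using a normal approximation. The function name, the effect size of d = 0.5, and the group size of 84 (half of the 168 participants) are illustrative assumptions; this is neither the procedure nor the set of values used in the study.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-group comparison.

    Uses a normal approximation to the test statistic, so it only
    approximates the noncentral-distribution calculations performed
    by dedicated power software.
    d           -- assumed standardized mean difference (Cohen's d)
    n_per_group -- assumed number of participants in each group
    """
    z = NormalDist()                   # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = d * sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability of landing in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical inputs: medium effect, 168 participants split into two groups
power = approx_power_two_groups(d=0.5, n_per_group=84)
print(f"Approximate power: {power:.2f}")
```

Under these assumed inputs the approximation gives a power of about 0.90. With an effect size of zero the function returns the alpha level, which is a quick sanity check on the formula.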

# Design and Procedure


The study had a 2 (leader gender: male/female) × 2 (leader behavior: QB[-type]/control) between-participants design. Data were collected online during collective testing sessions in computer rooms at the university. The study was approved by the Ethics Committee of the University. Following informed consent, a baseline measure of gender identification (moderator variable) was taken at the start of the collective testing session, ostensibly as part of a first (unrelated) study. After this unrelated study, which assessed students' attitudes toward the university and which took about 15 min, participants were asked to imagine they had been working at a company for a short time (type of business not specified) and were presented with the manipulation of leader behavior (QB[-type] vs. control). Subsequently, participants answered the manipulation checks and answered dependent measures and control variables: perceived sexism and perceived intent of the leader, negative affect, and demographics. The study took approximately 25 min.

<sup>a</sup>The current study achieved at least 80% power for the conditional indirect effect of QB-type behavior on perceived sexism in the male leader condition, as demonstrated by more than sufficient sample size (Fritz and MacKinnon, 2007).
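To make the structure of the 2 × 2 between-participants design concrete, the following minimal sketch assigns participants to the four cells. The text does not describe how participants were allocated, so the balanced random assignment, the function name, the seed, and the participant IDs are all illustrative assumptions.

```python
import random
from collections import Counter
from itertools import product

# The two manipulated factors described in the design
LEADER_GENDER = ("male", "female")
LEADER_BEHAVIOR = ("QB(-type)", "control")


def assign_conditions(participant_ids, seed=None):
    """Randomly assign each participant to one cell of the 2 x 2 design.

    Hypothetical helper: shuffles the participants and deals them
    round-robin into the four cells, so cell sizes differ by at most one.
    """
    rng = random.Random(seed)
    cells = list(product(LEADER_GENDER, LEADER_BEHAVIOR))
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}


# Hypothetical IDs 1..168, matching the size of the final sample
assignment = assign_conditions(range(1, 169), seed=42)
cell_sizes = Counter(assignment.values())
for cell, n in sorted(cell_sizes.items()):
    print(cell, n)
```

Each participant ends up in exactly one combination of leader gender and leader behavior, which is what distinguishes a between-participants design from one in which every participant sees all four versions.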

# Leader Gender and Leader Behavior Manipulation

The manipulations of leader gender and leader behavior (QB[-type] behavior vs. control) were situated within a (contrived) company magazine presented to participants, which included an introduction from the CEO and the manipulations in the form of a column. The purpose of this introduction was to provide participants with implicit information about the organizational context, namely a male-dominated organization (photo of male-only board of directors, statement that the company "is now 324 man strong")–the context in which QB behavior is most likely to arise and in which junior women are most likely to be exposed to QB behavior (Derks et al., 2011a,b). Following this foreword, participants read a column designed to manipulate leader gender and leader behavior (see **Supplementary Materials** for full manipulation). In the column, ostensibly written by their leader Luc or Marie (leader gender manipulation), the leader discussed the organization and his/her motivation for working there. As outlined below, QB(-type) behavior (vs. control) was manipulated by incorporating the following three general indicators of the QB phenomenon (Derks et al., 2016): (1) masculine self-description, (2) endorsement of gender stereotypes, and (3) denial of gender discrimination in the organization. All three were included together because together they have been defined as QB behavior and because we wanted to create a manipulation that was strong enough for a student sample (who are as yet less attuned to the workplace) imagining a situation on the basis of a single vignette.

Firstly, in all conditions the leader claimed that three characteristics were important for achieving success and emphasized that he/she had these characteristics. These characteristics were selected on the basis of a pretest to be similar in positive valence, but to be either highly masculine or neutral: highly masculine in the QB(-type) conditions (willingness to take risks, focus on results, and being strong) and neutral in the control conditions (being responsible, flexible, and sincere). Secondly, in the QB(-type) behavior conditions the leader (subtly) endorsed gender stereotypes by linking a masculine work environment with a no-nonsense work environment, the implication being that a feminine work environment is not a no-nonsense work environment. Thirdly, the leader in the QB(-type) behavior conditions denied gender discrimination, implying that individual merit (competence) and not structural disadvantage is the reason there are hardly any women in the organization.

The combination of these three indicators read as follows in the QB(-type) behavior conditions: "I sometimes get asked: 'Doesn't it bother you, almost only male employees?' Why would that bother me? Because of the masculine work environment? I stayed here because I like a no nonsense work environment. Because it might be unfair? What's unfair about selecting employees based on competence?"

In the control conditions these sentences read: "I sometimes get asked: 'Doesn't it bother you, working for one company for such a long time?' Why would that bother me? Because the work environment stays the same? I stayed here because I like this work environment. Because it's hard work? What dream doesn't require hard work?"

After reading this company magazine, all participants received ambiguous negative feedback from their leader. The content of this feedback was identical across conditions and was added in order to make the situation more self-relevant for participants. The participant was told that the position in which her manager had started his or her career at the company was opening up soon, that this higher position was a good fit for the participant, and that the participant had expressed her interest in this position to her manager a few days ago. Participants were then shown the following ambiguous response from their leader: "Thank you for your email and your interest in the function as assistant project leader, it is indeed a nice position. I have to tell you though that I'm not sure you will be accepted. So check if you want to put your time into that, or maybe think about it some more."

# Measures

Unless otherwise indicated, items were answered on a 7-point scale from (1) strongly disagree to (7) strongly agree. Measures are scored such that higher scores indicate stronger scores on the concept.

#### Perceived Positive Intent of the Leader

Two items measured perceived positive intent of the leader: "Luc/Marie has my best interests at heart" and "Luc/Marie has women's best interests at heart" (r = 0.62, p < 0.001). The correct name (Luc for a male leader and Marie for a female leader) was inserted by the Lime Survey program depending on whether the participant's leader was male or female.

#### Perceived Sexism of the Leader

To measure perceived sexism of the leader, participants were asked to indicate the extent to which they agreed with two items (correct name inserted by the Lime Survey program): "Luc/Marie is sexist" and "Luc/Marie acts belittling toward women" (r = 0.43, p < 0.001).

#### Negative Affect

We assessed three types of negative affect using items adapted from the PANAS scales (Watson et al., 1988). Participants were asked to imagine how they would feel in the presented situation, using a 7-point Likert scale from (1) not at all to (7) very much. Three items measured anger (angry, annoyed, and hostile, α = 0.87), four items measured sadness (down, sad, dissatisfied, and unhappy, α = 0.88) and four items measured anxiety (anxious, tense, nervous, and afraid, α = 0.85).
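The α values reported for these scales are Cronbach's alpha coefficients. As an illustration of how such a coefficient is obtained from item scores, the sketch below computes alpha from the item variances and the variance of the summed scale. The function name `cronbach_alpha` and the item scores are hypothetical and are not the study data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item; position i in every
    list belongs to the same participant.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(column) != n for column in items):
        raise ValueError("all items need the same number of scores")

    # Sum score per participant across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the (sample) variances of the individual items.
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Made-up scores for three items (e.g., angry, annoyed, hostile)
# from five hypothetical participants on a 7-point scale.
example_items = [
    [2, 4, 5, 7, 3],
    [3, 4, 6, 6, 2],
    [2, 5, 5, 7, 4],
]
example_alpha = cronbach_alpha(example_items)
```

Alpha approaches 1 when the items rise and fall together across participants, which is why it is used as an index of internal consistency.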

#### Gender Identification

To examine how gender identification altered responses we assessed gender identification using the 'identity centrality' subscale of the hierarchical model of ingroup identification (Leach et al., 2008). This subscale consists of the following three items: "Being a woman is an important part of how I see myself"; "The fact that I am a woman is an important part of my identity"; and "I often think about the fact that I am a woman" (α = 0.76). Gender identification was assessed at the beginning of the collective session as part of an ostensibly separate study.

Means and standard deviations of all measures per condition, as well as Cohen's d, are provided in **Table 2**.<sup>2</sup>

TABLE 2 | Descriptive statistics by leader gender and leader behavior.

Statistics presented are mean scores; standard deviations are presented between brackets. Cohen's d is given for each comparison in the cell below (means for the male and female leader within each level of the leader behavior condition) or in the cell to the right-hand side (means for the control and QB[-type] behavior conditions within each level of the leader gender condition) of the corresponding means.

<sup>2</sup>One additional measure, attributional ambiguity, examined the extent to which participants were (un)sure about their judgment of sexism of their leader. The results did not substantially add to the story while decreasing the coherence thereof, which is why these results are not included in the main text. The measure as well as the results of analyses with this measure are provided in the **Supplementary Materials**. At the end of the study, other measures were administered for exploratory purposes for future research (participants' masculine and feminine self-description, interest in individual mobility, and distancing from the group). These measures were unrelated to the research questions or hypotheses described in this manuscript, and were administered after the measures relevant to the present study had been administered. Thus, these measures did not exert any influence on the present results and can be seen as separate from the present results.

# RESULTS

# Initial Checks

Initial checks showed that the variance for perceived sexism of the leader was not equal across the conditions. An adjusted rank transformation test (ART) was performed in order to see if this heterogeneity of variance for perceived sexism affected the results. The ART is a non-parametric test suitable for analyzing interactions (Leys and Schumann, 2010). Data are adjusted and rank transformed, after which the adjusted data are analyzed with factorial ANOVA. The results obtained using ART did not differ from the results obtained using ANOVA, and thus for ease of interpretation we report the results obtained using ANOVA. The statistics for the ART analyses are available in the **Supplementary Materials**. There were no differences between conditions on gender identification, F(3,164) = 0.69, p = 0.561, demonstrating that randomization was successful.
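The two preparatory ART steps mentioned above (adjusting the data, then rank transforming them) can be sketched as follows. This is an illustration only, under the assumption that "adjusting" means subtracting the marginal means of both factors from each score, so that only the interaction (plus error) remains in the data. The function names and the toy scores are hypothetical and are not study data, and the factorial ANOVA on the resulting ranks is left to a statistics package.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = tied_rank
        start = end + 1
    return ranks


def adjust_and_rank(scores, factor_a, factor_b):
    """Remove both main effects from the scores, then rank them."""
    a_means = {
        level: mean([s for s, a in zip(scores, factor_a) if a == level])
        for level in set(factor_a)
    }
    b_means = {
        level: mean([s for s, b in zip(scores, factor_b) if b == level])
        for level in set(factor_b)
    }
    adjusted = [
        s - a_means[a] - b_means[b]
        for s, a, b in zip(scores, factor_a, factor_b)
    ]
    return average_ranks(adjusted)


# Toy 2 x 2 layout with one hypothetical score per cell.
leader_gender = ["male", "male", "female", "female"]
leader_behavior = ["control", "qb", "control", "qb"]

# Purely additive scores: no interaction remains after adjustment.
additive_ranks = adjust_and_rank([1, 5, 3, 7], leader_gender, leader_behavior)
# Scores with an interaction: the ranks now differ between cells.
interaction_ranks = adjust_and_rank([1, 5, 3, 11], leader_gender, leader_behavior)
```

In the additive toy case every adjusted score is identical, so all observations share one tied rank; only when an interaction is present do the ranks separate, which is the property that makes the ranked data suitable for testing interactions.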

# How Do Participants Perceive Male and Female Leaders Displaying QB(-Type) Behavior?

#### Perceived Positive Intent

We first examined to what extent participants saw their leader as having positive intent toward women. Consistent with expectations, participants in the QB(-type) conditions saw their leader as having less positive intent (M = 3.08, SD = 1.24) than did participants in the control conditions (M = 3.91, SD = 1.09), F(1,166) = 21.09, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.11. The interaction effect between leader behavior and leader gender was also significant, F(1,164) = 8.76, p = 0.004, η<sub>p</sub><sup>2</sup> = 0.05. As expected, the male leader in the QB-type condition was perceived as having less positive intent (M = 2.61, SD = 1.05) than the male leader in the control condition (M = 3.84, SD = 1.06), F(1,164) = 26.87, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.14. The female leader was perceived as having equally positive intent whether she evidenced QB behavior or not, F(1,164) = 0.72, p = 0.399, η<sub>p</sub><sup>2</sup> = 0.004.
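The partial eta squared values reported with these F tests can be recovered from each F statistic and its degrees of freedom, using the standard identity: partial eta squared = (F × df effect) / (F × df effect + df error). The sketch below applies this identity to the statistics reported in this paragraph; the function name is our own.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F test."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# F statistics reported above for perceived positive intent.
main_effect_behavior = partial_eta_squared(21.09, 1, 166)
interaction = partial_eta_squared(8.76, 1, 164)
simple_effect_male = partial_eta_squared(26.87, 1, 164)
simple_effect_female = partial_eta_squared(0.72, 1, 164)

print(round(main_effect_behavior, 2))   # 0.11
print(round(interaction, 2))            # 0.05
print(round(simple_effect_male, 2))     # 0.14
print(round(simple_effect_female, 3))   # 0.004
```

The rounded results match the effect sizes reported in the text.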

#### Perceived Sexism: Direct Effects

Next we examined to what extent participants saw their leader as sexist. In general, participants saw the male leader as more sexist (M = 3.74, SD = 1.47) than the female leader (M = 2.85, SD = 1.11), F(1,166) = 19.51, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.11. There was also a main effect of QB(-type) behavior on perceived sexism: participants in the QB(-type) conditions saw their leader as more sexist (M = 3.74, SD = 1.54) than did participants in the control conditions (M = 2.91, SD = 1.07), F(1,166) = 16.90, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.09. Contrary to expectations, the overall interaction effect between leader behavior and leader gender was not significant, F(1,164) = 2.36, p = 0.126, η<sub>p</sub><sup>2</sup> = 0.01. An examination of the predicted slopes showed that the predicted simple main effect of QB-type behavior was significant in the male leader condition, F(1,164) = 14.19, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.08, with the male leader being seen as more sexist in the QB-type condition (M = 4.20, SD = 1.55) than in the control condition (M = 3.19, SD = 1.16). The simple main effect of QB behavior was not significant in the female leader condition, F(1,164) = 2.14, p = 0.146, η<sub>p</sub><sup>2</sup> = 0.01 (respective means M = 3.09, SD = 1.29 and M = 2.68, SD = 0.94). An alternative breakdown of this interaction showed that the simple main effect of leader gender was marginally significant in the control condition, F(1,164) = 3.63, p = 0.059, η<sub>p</sub><sup>2</sup> = 0.02, and significant in the QB(-type) condition, F(1,164) = 15.27, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.09. Yet, the lack of a significant overall interaction and the low power for this interaction (0.33) show that these differences were not strong enough to conclude that differences in attributions to sexism between the QB(-type) and control condition depended on leader gender.

#### Perceived Sexism: Indirect Effects

Moderated-mediation analyses using the PROCESS macro for SPSS (Hayes, 2018, model 7) did, however, support the prediction that the lower perceptions of positive intent explained increased perceptions of sexism for the male leader displaying QB-type behavior: perceived sexism was entered as the dependent variable, leader behavior was entered as the predictor variable, perceived positive intent as the proposed mediator, and leader gender was added as the proposed moderator for the a′ – b′ relationship. The results of these analyses are summarized in **Table 3**. As expected, the moderated mediation was significant (index = −0.48, SE = 0.18, 95% CI [−0.84, −0.15]). Perceived positive intent fully mediated the effect of QB(-type) behavior on perceived sexism for the male leader (indirect effect = 0.58, SE = 0.14, 95% CI [0.32, 0.85]), but not for the female leader (indirect effect = 0.10, SE = 0.12, 95% CI [−0.15, 0.35]). Specifically, as can be seen in **Figure 1**, participants who saw a male leader display QB-type behavior saw him as having less positive intent (a = −1.23), which in turn related to increased attributions to sexism (b = −0.47). There was no effect of QB-type behavior on perceived sexism independent of its effect on perceived positive intent (c′ = 0.44, p = 0.180). Thus, consistent with our expectations, participants perceived a male leader displaying QB-type behavior as more sexist (relative to control) because they perceived him as lacking positive intent.<sup>3</sup>
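The point estimates in this analysis are linked by simple arithmetic, which the sketch below illustrates. It assumes the usual product-of-coefficients definition of an indirect effect (a × b) and a dichotomous moderator whose two levels differ by one unit, in which case the index of moderated mediation equals the difference between the two conditional indirect effects. The coding of leader gender (male = 0, female = 1) is our assumption, chosen because it is consistent with the sign of the reported index; the function names are hypothetical. The bootstrapped standard errors and confidence intervals are not reproduced here.

```python
def indirect_effect(a_path, b_path):
    """Indirect effect as the product of the a and b paths."""
    return a_path * b_path


def moderated_mediation_index(indirect_at_0, indirect_at_1):
    """Index of moderated mediation for a 0/1 coded moderator.

    With a one-unit difference between moderator levels, the index
    equals the difference between the conditional indirect effects.
    """
    return indirect_at_1 - indirect_at_0


# Reported paths for the male leader: a = -1.23, b = -0.47.
male_indirect = indirect_effect(-1.23, -0.47)

# Reported conditional indirect effects: 0.58 (male), 0.10 (female).
# Assumed coding: male leader = 0, female leader = 1.
index = moderated_mediation_index(0.58, 0.10)

print(round(male_indirect, 2))  # 0.58
print(round(index, 2))          # -0.48
```

Both values agree with the reported indirect effect for the male leader and the reported index of moderated mediation.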

The data thus partially supported Hypothesis 1. As expected, a male (but not a female) leader displaying QB(-type) behavior was seen as having less positive intent toward women, and although we did not find a significant difference in the direct effect of QB(-type) behavior on perceived sexism for the male vs. for the female leader, there was a significantly different indirect effect: differences in perceived positive intent indirectly led to differences in perceived sexism.

TABLE 3 | Conditional indirect effects of QB(-type) behavior on perceived sexism through perceived positive intent.

<sup>3</sup>We also tested the alternative reversed mediation, where QB-type behavior through the mediator of perceived sexism would lead to decreased positive intent. This moderated mediation model was not significant, index = −0.22, SE = 0.16, 95% CI [−0.06, 0.56]. Moreover, the (albeit significant) indirect effect of QB-type behavior on perceived positive intent through the mediator perceived sexism in the male leader condition (indirect effect = −0.38, SE = 0.13, 95% CI [−0.66, −0.15]) did not fully mediate the effect of QB-type behavior on perceived sexism (c′ = −0.96, p < 0.001).

# How Are Participants Affected by QB(-Type) Behavior?

First, we examined whether exposure to QB(-type) behavior was related to higher negative affect. As expected, there were significant main effects of QB(-type) behavior on anxiety, F(1,166) = 5.20, p = 0.024, η<sub>p</sub><sup>2</sup> = 0.03, anger, F(1,166) = 20.92, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.11, and sadness, F(1,166) = 18.19, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.10. Participants in the QB(-type) conditions reported being more anxious (M = 3.80, SD = 1.17), more angry (M = 3.48, SD = 1.36), and more sad (M = 3.34, SD = 1.03) than did participants in the control conditions (M = 3.38, SD = 1.18; M = 2.57, SD = 1.24; M = 2.64, SD = 1.08, respectively), though power for the main effect on anxiety was rather low (0.62). There were no interactions between leader behavior and leader gender on anxiety, F(1,164) = 0.64, p = 0.427, η<sub>p</sub><sup>2</sup> = 0.004, anger, F(1,164) = 0.04, p = 0.846, η<sub>p</sub><sup>2</sup> = 0.0002, or sadness, F(1,164) = 0.02, p = 0.887, η<sub>p</sub><sup>2</sup> = 0.0001. Thus, supporting Hypothesis 2, QB(-type) behavior related to negative outcomes both when this behavior came from a male leader and when this behavior came from a female leader.

# Is Regulation of Negative Effects Impaired Under a Female Leader?

Next, we examined whether gender identification acts as a buffer against the effect of QB(-type) behavior on negative emotions. Using the PROCESS macro for SPSS (Hayes, 2018, model 3), we examined the degree to which gender identification (as a continuous moderator) moderated the effects of leader gender and leader behavior on negative emotions. We expected to find a three-way interaction such that for participants with a male leader, gender identification would serve as a buffer of the effect of QB(-type) behavior on negative emotions, while a similar effect would not occur for participants with a female leader. Results showed that the three-way interaction between leader behavior, leader gender, and gender identification was marginally significant for anxiety, F(1,160) = 3.85, p = 0.052, η<sub>p</sub><sup>2</sup> = 0.02. Further examination of this interaction showed that among participants who had seen a male leader, the main effect of QB-type behavior on anxiety was moderated by gender identification, b = −0.75, F(1,160) = 9.22, p = 0.004. In line with expectations, simple slope analyses looking at participants with lower and higher gender identification (−1 SD and +1 SD) showed that participants lower in gender identification reported more anxiety when their male leader evidenced QB-type behavior than when he evidenced neutral behavior, b = 0.82, p = 0.016, while participants higher in gender identification reported equal anxiety regardless of whether their male leader evidenced QB-type or control behavior, b = −0.43, p = 0.189 (see **Figure 2**).<sup>4</sup> Meanwhile, among participants who had seen a female leader, the effect of QB behavior on anxiety was not moderated by gender identification, b = −0.07, F(1,160) = 0.08, p = 0.773.
Contrary to expectations, gender identification and leader gender did not interact with leader behavior to produce significant three-way interactions on anger, F(1,160) = 1.82, p = 0.179, η<sub>p</sub><sup>2</sup> = 0.01, or sadness, F(1,160) = 1.40, p = 0.239, η<sub>p</sub><sup>2</sup> = 0.01. The data thus partially supported Hypothesis 3, cautiously suggesting impaired regulation of negative effects on anxiety (but not anger or sadness) following exposure to QB(-type) behavior from a female source, but not from a male source. However, as this effect was underpowered (0.45), these results should be interpreted with due caution. We further reflect on the issue of power in the discussion section.

<sup>4</sup>See **Supplementary Materials** for figures displaying the spread of datapoints around the slopes (**Supplementary Figure A**).

# DISCUSSION

While previous research has investigated the occurrence and antecedents of self-group distancing in women (also known as "Queen Bee" behavior; Derks et al., 2011a,b, 2015, 2016), the current study shifted focus from antecedents to subsequent effects of this behavior on junior women. Results showed that women perceived a male but not a female leader displaying QB(-type) (vs. neutral) behavior as having less positive intent toward women, which in turn related to stronger attributions of sexism. This finding is consistent with research demonstrating that possible displays of sexism directed toward women are less likely to be noticed when the source of this behavior is a woman (Baron et al., 1991; Barreto and Ellemers, 2005; Cunningham et al., 2009). This is the first research, however, to empirically demonstrate a similarity between QB(-type) behavior and sexism (which, as outlined before, are similar in behaviors but conceptually very different given their different underlying concerns or antecedents). Our results show that, like sexism, QB(-type) behavior negatively affects women (Klonoff et al., 2000; Wang et al., 2012). This study is also the first to examine the impact of possibly biased comments from an ingroup leader to an ingroup subordinate. Moreover, we add to research on effects of possible sexism by male and female sources (Barreto and Ellemers, 2005) by including a male and a female control condition rather than only comparing a male source to a female source. With this condition, we were able to eliminate the alternative explanation that a man displaying QB-type behavior was seen as sexist only because of his gender. The results show that a man displaying QB-type behavior was seen as more sexist than a male leader displaying neutral behavior. Thus, beyond a main effect of gender (male leader perceived to be more sexist than a female leader), the act of displaying QB-type behavior uniquely contributed to perceived sexism.

In line with research on the intergroup sensitivity effect (Hornsey et al., 2002; Hornsey and Imani, 2004), participants attributed QB-type behavior coming from a member of the outgroup (a man) to a lack of positive intent toward the ingroup (women), which is why he was seen as sexist. Coming from an ingroup source, however, QB behavior was not attributed to a lack of positive intent. These findings provide further insight into when and why people attribute behavior to bias. Put differently, these findings illustrate circumstances under which people may not attribute behavior to bias, that is, when behavior is presented in the context of perceived positive intent.

Although an ingroup leader displaying QB(-type) behavior was less likely than an outgroup leader to be viewed in a negative light, participants nonetheless experienced negative consequences of this behavior. In both the male leader and in the female leader conditions, exposure to QB(-type) behavior increased negative emotions. So while participants did not explicitly perceive QB behavior coming from a female leader as having negative intent, participants' affective responses were as negative as when they had been exposed to a male leader displaying QB-type behavior. Specifically, participants who had been exposed to QB(-type) behavior were more angry, more sad, and more anxious than participants who had not been exposed to this behavior, regardless of leader gender. Notably, as all women in the current study (including those in the control conditions) received ambiguous negative feedback, we can rule out the alternative explanation that it was the act of feeling rejected rather than exposure to QB(-type) behavior which increased negative emotions. As far as we know, this is the first research to show that QB behavior negatively affects junior women's well-being. The finding that QB behavior does not have to be perceived as negative (i.e., negative intentions or sexist) to exert a negative influence is consistent with research showing that bias does not have to be identified as such to exert negative effects (e.g., Barreto and Ellemers, 2005). These findings suggest that QB behavior affects junior women in a way that may go unnoticed: increasing negative emotions but being less likely to be identified as potentially harmful, thus lowering the opportunity to defend against its effects.

The results indeed suggested that high identifiers (those usually protected from some of the negative consequences of bias, as they are highly vigilant and more confident as to when bias occurs) may not be protected in the usual way against this type of bias. That is, both low and high identifiers showed higher anxiety when exposed to QB behavior by a female leader. Among participants with a male leader, however, higher gender identification buffered against the negative effects of QB-type behavior on anxiety. Yet, this three-way interaction was only marginally significant and underpowered; thus, replication research with larger samples is needed to draw more confident conclusions on the regulation of negative effects of QB(-type) behavior by male and female leaders.

Combining the current results with existing research on self-group distancing suggests that self-group distancing behavior in organizations may have a number of negative consequences. These consequences are relevant not only for gender groups, but also for other negatively stereotyped and underrepresented groups. Having an ingroup leader who distances him or herself from the ingroup can have pernicious effects for members of that group. Leaders who show possible bias toward underrepresented groups create a negative work environment for members of these groups, even when they themselves are members of these traditionally underrepresented groups and even when they are not perceived as being biased.

Our results highlight the key importance of the organizational climate in any effort to target underrepresentation of groups in the workplace. Only placing a few more minorities and women in the higher echelons of organizations is not sufficient without also targeting the organizational diversity climate, or at least not if this increase in women and/or minorities does not lead to a critical mass (Kanter, 1977; Torchia et al., 2011; see also Burkinshaw, 2015). Rather than changing stereotypes or improving diversity, select representation of only a few minorities or women without achieving a critical mass may even increase stereotyping and preferential treatment by the majority group (Wright, 2001; Bagues et al., 2017). Without achieving a critical mass, women or ethnic minorities may continue to adapt to threatening organizational climates by distancing themselves from the stigmatized ingroup, which could have negative effects on future career perspectives for members of these groups. Other than achieving a critical mass, options to break the chain of self-group distancing are to create a more inclusive or otherwise less threatening organizational climate (Purdie-Vaughns et al., 2008) and to ensure that members of stigmatized groups have access to successful role models who they feel similar to and who do not distance themselves from the ingroup (Cheryan et al., 2011).

# Limitations and Future Research

A limitation of the current study is that the three-way interactions between leader gender, leader behavior, and gender identification on negative affect were underpowered. Future research should further investigate whether gender identification is indeed an effective regulation strategy for QB(-type) behavior by a male but not by a female leader. Moreover, since our results suggested that regulation may be different for different negative emotions (marginal effect for anxiety and no effect for anger or sadness), research could examine whether some emotions are more difficult to regulate in reactions to self-group distancing behaviors. Regulation through directing emotions toward others instead of the self is also an interesting route for further research. For instance, it could be that women with higher gender identification regulate anger not by decreasing this emotion, but by directing it toward the leader, while women with lower gender identification may experience anger toward themselves. This interpretation is consistent with research showing that unambiguous and ambiguous bias both increase negative emotions, but that these emotions are more likely to be directed toward the other when bias is unambiguous, and toward the self when bias is ambiguous (Crocker et al., 1991; Vorauer and Kumhyr, 2001; Ellemers and Barreto, 2009; Barreto et al., 2010).

The present study manipulated QB(-type) behavior with its three main components shown in previous research (masculine self-description, endorsing gender stereotypes and denying gender discrimination). Future research could investigate whether some of these components are more influential than others. It would also be interesting to study whether self-group distancing not only affects subordinates' negative emotions, but perhaps also harms organizational outcomes such as employee satisfaction, organizational commitment, or productivity, especially for members of the negatively stereotyped group. Additionally, future research can examine long-term consequences of leader self-group distancing for subordinates of negatively stereotyped groups. These consequences may include subordinates switching to other careers where they might feel more belonging (Drury et al., 2011; Veldman et al., 2017) or adjusting to the organizational climate by engaging in self-group distancing themselves.

Another avenue for future research could be to examine the processes through which QB(-type) behavior induces negative affect, as it is possible that these processes are different for male and for female leaders, or that for female leaders additional processes are at play. For instance, junior women exposed to QB-type behavior by a male leader may experience more negative affect because they suspect they are a victim of discrimination. Junior women exposed to QB behavior by a female leader may experience negative affect through other or additional processes, for instance because they do not see this female leader as a role model and may fear that success is attainable only for women who are dissimilar to them. Indeed, research has shown that for members of underrepresented groups, a role model who embodies qualities stereotypical of a particular field (i.e., masculine in a male-dominated field) may even be less desirable than not having a role model at all (Cheryan et al., 2011). Insight into these processes would strengthen the present research by revealing the underlying mechanisms behind negative effects of QB behavior, and may provide ways to protect junior women from such effects.

It would be interesting to examine self-group distancing and its consequent negative effects in different groups. For instance, would similar effects be found among men employed in traditionally female-dominated work environments? Men in these fields may be underrepresented but not necessarily negatively stereotyped, however, and any negative gender stereotyping there might be is likely to affect men less (Schmitt et al., 2002). As such, men may suffer less from identity threat in these contexts and may be less likely to have to resort to a strategy such as self-group distancing. Moreover, men who do distance themselves from other men may suffer a loss of status in the eyes of other men and may therefore be less influential (thus exerting less negative effects). The same results might thus not be found among men. We would, however, certainly expect similar results to be found among other negatively stereotyped groups, such as ethnic minority groups. Here, too, we expect that self-group distancing from an ingroup source may not be identified as bias and may have similar negative consequences, including impaired regulation of these consequences.

# CONCLUSION

Existing work shows that an organizational climate that is not identity safe can trigger self-group distancing in members of negatively stereotyped groups. The current work adds that behavior associated with self-group distancing might not be recognized as bias when coming from a member of the ingroup, but nevertheless negatively affects members of that group, making it potentially more likely that members of negatively stereotyped groups will feel lower belonging and motivation. To put it a different way, gradual advancement of members of underrepresented groups will not necessarily lead to increased equality for these groups as long as the environment these individuals advance in leads them to distance themselves from their group. Importantly, these findings do not mean that women and other members of disadvantaged groups should not advance into higher organizational positions. Rather, it is key that efforts also be directed toward removing the structural barriers and the lack of positive climate that members of disadvantaged groups in these positions can face, thus alleviating the need for members of negatively stereotyped and underrepresented groups to cope by engaging in self-group distancing.

# REFERENCES


# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the General Guidelines, Social and Societal Ethics Committee of the University of Leuven. The protocol was approved by the Social and Societal Ethics Committee of the University of Leuven. All subjects gave written informed consent in accordance with the Declaration of Helsinki. Before starting the questionnaire, participants agreed to an informed consent that provided the general research aims. They were informed that participation was voluntary and could be stopped at any moment during the study; and that their responses are anonymous and treated confidentially. Moreover, they were provided with room for questions and comments as well as all contact information of the researchers and the ethical committee.

# AUTHOR CONTRIBUTIONS

NS, CVL, and LM contributed to the development of hypotheses, data collection, interpretation of results, and the writing of the manuscript. NS conducted the statistical analyses.

# FUNDING

This research was supported by an Odysseus grant to CVL from the Research Foundation of Flanders (FWO) grant number G.O.E66.14N.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01690/full#supplementary-material

and well-being. J. Pers. Soc. Psychol. 77, 135–149. doi: 10.1037/0022-3514.77.1.135



gender identification. Eur. J. Soc. Psychol. 45, 599–608. doi: 10.1002/ejsp.2113



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Sterk, Meeussen and Van Laar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Flywheel Effect of Gender Role Expectations in Diverse Work Groups

*Hans van Dijk1 \* and Marloes L. van Engen2*

*1 Department of Organization Studies, Tilburg University, Tilburg, Netherlands, 2 Department of Human Resource Studies, Tilburg University, Tilburg, Netherlands*

Popular press suggests that gender diversity benefits the performance of work groups. However, decades of research indicate that such performance benefits of gender diversity are anything but a given. To account for this incongruity, in this conceptual paper we argue that the performance of gender-diverse work groups is often inhibited by self-reinforcing gender role expectations. We use the analogy of a flywheel to illustrate how gender role expectations tend to reinforce themselves *via* three mechanisms. Specifically, we argue that gender role expectations shape (1) the allocation of jobs, tasks, and responsibilities, (2) the behavior of perceivers, and (3) the behavior of target women and men. In turn, these three consequences of gender role expectations tend to confirm the initial gender role expectations, thus creating an automatic, self-reinforcing flywheel effect. Such self-reinforcing gender role expectations provide superficial impressions of individual women's and men's actual knowledge and abilities at best. We therefore further propose that each of the three mechanisms of the flywheel of gender role expectations negatively affects group performance to the extent that gender role expectations inaccurately capture group members' actual knowledge and abilities. Because the extent to which work group members rely on gender role expectations depends on how they form impressions of others, we propose that individuals' motivation to form accurate impressions is crucial for inhibiting the flywheel of gender role expectations. We close by advancing an agenda for future research on each of the three areas of interest in our conceptual analysis: the flywheel effect of gender role expectations, the consequences of this flywheel effect for group functioning, and ways to motivate group members to form accurate impressions.

#### *Edited by:*

*Sabine Sczesny, University of Bern, Switzerland*

#### *Reviewed by:*

*Colette Van Laar, KU Leuven, Belgium*

*Wendy Van Ginkel, Drexel University, United States*

#### *\*Correspondence:*

*Hans van Dijk j.vandijk1@tilburguniversity.edu*

#### *Specialty section:*

*This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology*

> *Received: 07 May 2018 Accepted: 12 April 2019 Published: 07 May 2019*

#### *Citation:*

*van Dijk H and van Engen ML (2019) The Flywheel Effect of Gender Role Expectations in Diverse Work Groups. Front. Psychol. 10:976. doi: 10.3389/fpsyg.2019.00976*

Keywords: gender role expectations, impression formation motivation, team performance, diverse teams, stereotypes

Although popular press proclaims that gender diversity benefits the performance of work groups (e.g., teams, departments, and organizations; see Catalyst, 2004), these statements seem based more on wishes than reality (Eagly, 2016). A meta-analysis of 56 studies that in total represent 7,141 gender-diverse teams (the most proximal unit to assess the consequences of gender diversity) showed a non-significant relationship between gender diversity and team performance (*r* = −0.01; van Dijk et al., 2012). There are, however, a number of plausible arguments why gender diversity should benefit work group performance.

First, in most organizations, individuals are selected based on knowledge and abilities. As gender is often not indicative of individual performance, the optimal work group composition should be a mix with the women and men highest in knowledge and abilities (cf. Lindberg et al., 2010). An underrepresentation of women or men in a certain work group, hence, often reflects a certain amount of "false positive error" (selecting a candidate that is not the best for the job) and "false negative error" (not selecting the best candidate for the job) in selection.

Second, given that men and women tend to be socialized differently (Eagly, 1987), they are likely to hold different knowledge, perspectives, and ideas (cf. May et al., 2018). If gender-diverse work groups are able to pool and use the corresponding richness and variety in information, they should be able to make better decisions than gender-homogeneous work groups (cf. van Knippenberg et al., 2004).

Third, most work groups target male as well as female clients (i.e., customers, consumers). In harboring men as well as women, gender-diverse work groups should be better able to understand and cater to the needs of their clients (cf. Ely and Thomas, 2001).

The lack of support for positive effects of gender diversity on work group performance therefore raises the question of why the potential of gender diversity is not realized. In this article, we address this question and offer a way forward for researchers and practitioners to better understand what is needed for unlocking the potential performance benefits of gender diversity in work groups.

Specifically, we contend that the main obstacle for the performance of gender-diverse work groups is the self-reinforcing nature of gender role expectations. Ample research in the past decades has shown that gender stereotypes create role expectations in workplaces regarding the behavior of men and women on tasks and positions (Heilman, 1983; Eagly, 1987; Ridgeway, 1991; Eagly and Karau, 2002; Biernat et al., 2010). We argue that these role expectations reinforce themselves by behaving like a flywheel (i.e., a heavy wheel that keeps rotating with little effort after it has gained momentum, e.g., a potter's wheel): *via* a series of bigger and smaller pushes, momentum is created and attained, such that gender role expectations (1) operate autonomously and (2) sustain and reinforce themselves.

We identify three mechanisms *via* which gender role expectations tend to reinforce themselves in gender-diverse work groups. The first is the influence of gender role expectations in the *allocation of jobs, tasks, and responsibilities* (cf. social role theory, Eagly, 1987; role congruity theory, Eagly and Karau, 2002; status construction theory, Ridgeway, 1991); the second is the influence of gender role expectations in *the behavior of perceivers* (cf. expectation states theory, Berger et al., 1974; stereotype content model, Fiske et al., 2002; backlash, Rudman et al., 2012); and the third is the influence of gender role expectations in *the behavior of women and men* (cf. stereotype threat, Hoyt and Murphy, 2016; fear of backlash, Akinola et al., 2018).

Because each mechanism is grounded in generalized impressions of the knowledge and abilities of women and men based on their gender, the mechanisms are always affected by a certain degree of inaccuracy regarding the actual knowledge and abilities of target women and men. Higher degrees of inaccuracy are likely to exacerbate the extent to which jobs, tasks, and responsibilities are allocated to less-knowledgeable group members, and the extent to which the behaviors of perceivers and of women and men disrupt performance. As a consequence, we propose that gender role expectations harm work group performance to the extent that gender role expectations inaccurately capture target women's and men's knowledge and abilities. To decrease the likelihood that perceivers let gender role expectations influence their impressions of women's and men's knowledge and abilities, and to help them form more accurate impressions, we argue that it is crucial that perceivers are motivated to form accurate impressions of each other.

Our conceptual analysis provides three main contributions to the literature. First, whereas gender role expectations are known to negatively affect the position and performance of women and men in stereotype-incongruent roles, we extend these insights by applying them to gender-diverse work groups and argue that gender role expectations in gender-diverse work groups operate like a flywheel. Second, by building theory on this flywheel effect of gender role expectations in gender-diverse work groups, we assert that it is the inaccuracy of gender role expectations that causes gender-diverse work groups to fail in realizing their full potential. Third, in building theory and setting a future research agenda on how to inhibit or alter self-reinforcing gender role expectations, we provide theoretically as well as practically novel suggestions for how to improve the functioning of gender-diverse work groups.

# THE FLYWHEEL OF GENDER ROLE EXPECTATIONS

Research on the performance of (gender-)diverse work groups has commonly adopted a bi-theoretical approach to explain why and how gender diversity may positively or negatively affect group performance (van Knippenberg et al., 2004; van Knippenberg and Schippers, 2007). The information/decision-making perspective suggests that diverse work groups hold a richer variety in knowledge and information. When members are able to pool and combine the variety in knowledge and information, diverse work groups should be able to make better decisions and hence outperform homogeneous work groups. By contrast, the social categorization perspective suggests that differences between group members increase the likelihood that group members perceive each other as different, which can lead to the emergence of subgroups, and subsequently increase subgroup conflicts and decrease cohesion as well as the pooling and integration of knowledge and information.

Although this bi-theoretical approach enables accounting for positive as well as negative outcomes, it has omitted how stereotypes and corresponding role expectations shape behaviors, dynamics, and outcomes of diverse work groups (van Dijk et al., 2017). Role expectations represent societally crafted associations and beliefs that enable perceivers to navigate through a world of infinite complexity based on people's characteristics. As such, gender role expectations help perceivers reduce complexity by making inferences about women and men regarding their attitudes, behaviors, skills, etc. based on their gender (Eagly, 1987; Eagly et al., 2000; Haines et al., 2016). By letting perceivers focus on a person's gender to form an impression of a target person, gender role expectations reduce the amount of time and effort that perceivers would otherwise need to spend on individuation (van Dijk et al., 2017). In work groups, gender role expectations can therefore benefit perceivers by allowing them to infer female and male group members' knowledge and abilities, and to use those inferences to determine whom to ask for advice and whose input to ignore (cf. van Dijk et al., 2018).

However, forming impressions based on gender role expectations also comes at a cost. Although gender stereotypes tend to be accurate in predicting overall differences between women and men at the societal level (Jussim et al., 2015), at the individual level, stereotype-based impressions are at best superficial generalizations and at worst sexist and highly inaccurate. For example, whereas men overall may be more assertive compared to women, one cannot assume that all male members of a gender-diverse work group are more assertive than all female group members. Despite these potential costs, perceivers do tend to rely on gender role expectations in forming impressions of individual women and men because gender role expectations consume few cognitive resources, and because individuating information is not always available. Insight into how gender role expectations shape group behavior and dynamics is therefore crucial for understanding how gender diversity shapes work group performance.

Many consequences of gender role expectations are well understood and documented in the form of meta-analyses, reviews, and books (e.g., Eagly et al., 2000; Wood and Eagly, 2012). However, studies that focus on the organizational context mainly look at the consequences of gender role expectations for individuals (e.g., obtaining a leadership position, Eagly and Karau, 2002; individual performance, Chatman et al., 2008) and remain relatively silent on the role of gender role expectations in processes and outcomes at the work group level (van Dijk et al., 2017).

Furthermore, studies that focus on the consequences of gender role expectations tend to adopt a static approach by assessing how gender role expectations shape certain behaviors and outcomes related to gender inequality. Although there is an occasional reference to potential vicious cycles or downward spirals (e.g., Martell et al., 1996), such dynamic relationships remain under-theorized and are insufficiently explored.

In this conceptual contribution, we argue that the self-reinforcing nature of gender role expectations demands more attention, since it provides insight into why gender role expectations are so pervasive and may cause so many gender-diverse groups to fail to reach their potential. We use the analogy of a flywheel to explain the self-reinforcing nature of gender role expectations. The heavier a flywheel, the more effort is needed to make it spin, but also the harder it is to slow it down once it rotates. Once a flywheel has gained momentum, the flywheel only requires an occasional reinforcement to keep rotating. A flywheel effect thus refers to the continuation of rotations even after the original stimulus has been removed, such that the flywheel (1) operates autonomously and (2) reinforces itself (cf. Collins, 2001). It is because of these two aspects that we deem this a more appropriate and fitting analogy to illustrate how gender role expectations tend to reinforce themselves compared to the hollower terms of vicious cycles and downward spirals. Specifically, we assert that these two aspects of a flywheel capture the tendency of gender role expectations to (1) automatically (i.e., sub-consciously) evoke decisions, behaviors, and interactions that, in turn, (2) confirm and thereby reinforce the very same gender role expectations.

**Figure 1** shows our conceptual model. In the following, we first discuss the self-reinforcing nature of gender role expectations, and subsequently discuss how the flywheel of gender role expectations shapes group performance.

# WAYS IN WHICH GENDER ROLE EXPECTATIONS ARE SELF-REINFORCING

We propose that there are three mechanisms *via* which gender role expectations tend to behave like a flywheel by reinforcing themselves in gender-diverse groups. These mechanisms are as follows: (1) the allocation of jobs, tasks, and responsibilities, (2) the behavior of perceivers, and (3) the behavior of target women and men. As is recommended for building theory in order to understand a phenomenon (Sparrowe and Mayer, 2011), we base our arguments on different theories that each shed light on the self-reinforcing nature of gender role expectations from a different angle.

# The Allocation of Jobs, Tasks, and Responsibilities

Each group and each organization usually aims to recruit the best (i.e., most knowledgeable, skilled, able) person for a job or task, and likewise allocate responsibilities based on people's competencies and expertise. However, in a focus on finding the best person, there is a caveat, because a perceiver's judgment and evaluation of a target person to a large extent tends to be based on the perceiver's own bias and beliefs (Scullen et al., 2000). Gender role expectations form a prominent source of such biases and beliefs. For example, meta-analytical evidence shows that men are preferred over equally able women for male-typed jobs (but not for female-typed or integrated jobs) (Koch et al., 2015). These findings are in line with the lack-of-fit model (Heilman, 1983) and the role congruity theory (Eagly and Karau, 2002), both of which indicate that men are more likely to be recruited and selected for, or promoted into, a leadership position because the male role fits better or is more congruent with the leadership role in the eyes of perceivers.

Ironically, it is the subsequent underrepresentation of women in leadership positions that maintains and reinforces the gender role expectations that men are more suitable for leadership positions, if only because women are not granted the opportunity to prove their worth. Indeed, social role theory (Eagly, 1987) as well as status construction theory (Ridgeway, 1991) suggest that the mere observation of men dominating leadership positions and women being overrepresented in supportive (e.g., administration) or nurturing (e.g., caretaker) roles created, reinforced, and continues to uphold the belief or expectation that men are more suited for agentic and leadership roles and that women fit better in supportive and nurturing roles.

Such a flywheel effect of gender role expectations is likely to occur not only in the allocation of positions but also in many other allocation and decision-making processes in organizations. Consider, for example, performance evaluations (e.g., Lyness and Heilman, 2006; Bosquet et al., 2018), reward allocations (e.g., Castilla, 2008; Abraham, 2017), and promotion decisions (e.g., Roth et al., 2012). It is no coincidence that such evaluations and decisions also tend to be affected by gender role expectations, given that higher performance evaluations are likely to yield higher reward allocations, a greater chance of promotion, and a greater chance of being allocated a prominent job, task, or responsibility. Gender role expectations can thus shape the allocation of jobs, tasks, and responsibilities by affecting performance evaluations in an earlier stage that, over time, may be crucial in determining who gets the job.

When looking at the effects of gender role expectations on the allocation of jobs, tasks, and responsibilities in a static way (i.e., at a fixed point in time), such effects may appear small or even nonexistent. However, because of the self-reinforcing nature of allocation and decision-making processes, the resulting cumulative effect over time may very well explain why the proportion of women tends to be lower the more one ascends the hierarchical ladder in organizations (Martell et al., 1996; Agars, 2004; Ridgeway, 2011).
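The compounding logic of this argument can be illustrated with a minimal sketch. It assumes a hypothetical hierarchy in which, at every promotion step, women advance at a slightly lower rate than equally able men. The function name, the entry-level share, and the rate ratio are illustrative assumptions, not values taken from Martell et al. (1996) or any other cited study.

```python
def share_of_women(start_share, relative_rate, levels):
    """Return the share of women at each level of a hypothetical hierarchy.

    start_share   -- assumed share of women at the entry level (between 0 and 1)
    relative_rate -- assumed ratio of women's to men's promotion rate
                     (a value below 1 represents a disadvantage per step)
    levels        -- number of promotion steps above the entry level
    """
    shares = []
    women, men = start_share, 1.0 - start_share
    for _ in range(levels + 1):
        shares.append(women / (women + men))
        # Only the ratio of the two rates matters, so men advance at rate 1
        # and women at relative_rate.
        women *= relative_rate
    return shares


if __name__ == "__main__":
    for level, share in enumerate(share_of_women(0.5, 0.9, 7)):
        print(f"level {level}: {share:.2f}")
```

In this sketch, a disadvantage that looks small at any single step still lowers the share of women at every successive level, which is the cumulative pattern the paragraph describes.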

In sum, we propose that gender role expectations shape decisions regarding the allocation of jobs, tasks, and responsibilities, such that gender role expectations tend to maintain and reinforce themselves. Men are more likely to be selected for jobs, tasks, and responsibilities that are congruent with the male gender role, whereas women are more likely to be selected for jobs, tasks, and responsibilities congruent with the female gender role. In subsequently observing the gender-confirming allocation of men and women, gender role beliefs and expectations are likely to be sustained and reinforced. The following flywheel effect is thereby created:

*Proposition 1: Gender role expectations tend to reinforce themselves via the allocation of jobs, tasks, and/or responsibilities: women and men are less likely to be appointed to a job, task, and/or responsibility that is incongruent with their gender role, and the consequent underrepresentation of persons in gender-incongruent roles maintains and reinforces gender role expectations.*

# The Behavior of Perceivers

Our first proposition suggests that it can already be difficult for women and men to obtain a job, task, or position that does not correspond with gender role expectations. But if women and men do obtain such a gender role-incongruent position, we argue that there is a second, complementary mechanism in the flywheel that makes it difficult for them to sustain such a position. This mechanism consists of a collection of behaviors of perceivers that tend to confirm and reinforce gender role expectations.

Specifically, expectation states theory (Berger et al., 1974) suggests that gender role expectations cause perceivers to display supportive or more critical behavior toward a person, depending on the extent to which gender role expectations suggest that the person holds task-relevant knowledge and abilities. The more these gender role expectations suggest that a target person has the knowledge and abilities for a task (e.g., men on male-typed tasks), the more the perceiver will support the person by granting the person opportunities to act, evaluating the person more positively, and being more influenced by the person (Correll and Ridgeway, 2003; Wittenbaum and Bowman, 2005; cf. Cuddy et al., 2007). If, however, gender role expectations suggest that a person does not hold task-relevant knowledge and abilities (e.g., men on female-typed tasks), such a person tends to be the victim of various unsupportive behaviors of perceivers. Perceivers may, for example, ignore or interrupt the person, evaluate her or him more negatively, and/or discredit the person (Foschi, 2000). Women and men in gender role-incongruent positions thus are more likely to be the recipients of unsupportive behaviors by perceivers. In turn, such unsupportive behaviors make it more likely that women and men in gender role-incongruent positions fail or quit.

Furthermore, research on backlash suggests that unsupportive behaviors toward people in gender-incongruent positions are not only grounded in gender-based inferences of knowledge and abilities in relation to the task context, but also in more general gender role beliefs. Backlash refers to social and economic reprisals for behaving counter-stereotypically, which can range from the unsupportive behaviors mentioned earlier to discrimination and sabotage (Rudman and Phelan, 2008). Meta-analytical evidence showed that women who explicitly display dominance in male-typed task contexts (i.e., where the majority of workers tend to be men) tend to experience backlash (Williams and Tiedens, 2016). Other research suggests that especially women in high-status male-typed task contexts are likely to suffer from backlash, because their counter-stereotypical presence in such task contexts threatens men's high-status position in society (Rudman et al., 2012). Based on a series of experiments, Rudman and colleagues concluded that "defending the gender hierarchy is a primary motive for backlash" and that, for example, "prejudice against female leaders stems from perceived status violations" (p. 175). There is less research on backlash for men in counter-stereotypical roles, but in line with the argument that backlash is motivated by a defense of the gender hierarchy, those studies overall show that men experience backlash when displaying communal behavior in female-typed task contexts (Moss-Racusin, 2015).

Taken together, expectation states theory and research on backlash suggest that women and men in gender role-incongruent positions are more likely to be subject to unsupportive behaviors from perceivers compared to women and men in gender role-congruent positions. Such unsupportive behaviors increase the chance that women and men in gender role-incongruent positions fail and/or drop out of their position. Moreover, women and men in gender role-incongruent positions tend to be penalized for displaying behavior that is required for the job or task because that behavior is gender role-incongruent, and therefore become subject to more unsupportive behaviors. We therefore propose that perceivers tend to be supportive toward women and men in gender role-congruent positions, which enables such women and men to do well and remain in their position. By contrast, perceivers tend to be unsupportive of women and men in gender role-incongruent positions, which makes it more likely that such women and men fail and drop out of their positions. The successes of women and men in gender role-congruent positions and the failures of women and men in gender role-incongruent positions, in turn, confirm and reinforce the initial gender role expectations. The behavior of perceivers thus contributes to a flywheel effect that maintains and reinforces gender role expectations:

*Proposition 2: Gender role expectations tend to reinforce themselves via the behavior of perceivers: perceivers tend to be less supportive toward women and men in gender role-incongruent positions compared to women and men in gender role-congruent positions, which makes women and men in gender role-incongruent positions more likely to fail and thus maintains and reinforces the gender role expectations.*

# The Behavior of Individuals

Because individuals are exposed to gender role expectations from their cradle onward, they are often unaware of them and may frequently display behaviors that confirm gender role expectations. For example, women [men] may have been raised to be more modest [assertive] and submissive [dominant], and in showing such behavior reinforce gender role expectations. Social role theory (Wood and Eagly, 2012) suggests that men and women have also internalized gender role expectations and therefore may even prefer to display gender role-confirming behavior.

Even if persons have achieved a gender role-incongruent position, they often remain affected by gender role expectations. The aim of backlash against women and men in gender role-incongruent positions who display counter-stereotypical behavior is to make them behave according to gender norms. Many studies show that the mere fear of backlash already tends to cause women and men to adjust their behavior, up to the point where they may display gender conformity (Rudman and Fairchild, 2004). For example, studies have shown that a fear of backlash caused women to avoid behaving assertively in negotiations on behalf of themselves (Amanatullah and Morris, 2010), limit power displays in political and organizational settings (Brescoll, 2011), distance themselves from supporting subordinate women (Derks et al., 2016), and delegate less compared to men, which hampered performance (Akinola et al., 2018).

Another reason why women and men may display gender role-confirming behavior is stereotype threat. Stereotype threat refers to "the psychological experience of a person who, while engaged in a task, is aware of a stereotype about his or her identity group suggesting that he or she will not perform well in that task" (Roberson and Kulik, 2007, p. 26). Research on stereotype threat (Steele and Aronson, 1995) suggests that aiming to disprove stereotypes can paradoxically also lead to their confirmation. Specifically, several meta-analyses (Wheeler and Petty, 2001; Walton and Spencer, 2009) indicate that stereotype threat negatively affects women's and men's performance on more complex gender role-incongruent tasks. There are different explanations for why stereotype threat hampers the performance of women and men on such gender role-incongruent tasks. One explanation suggests that gender role expectations create an awareness among women and men in gender role-incongruent positions that they are expected to perform less well compared to women and men in gender role-congruent positions. This awareness is experienced as a threat that taxes the working memory of women and men in gender role-incongruent positions, and thereby inhibits their ability to perform well (Schmader et al., 2008). Another, potentially complementary explanation is that the awareness of gender role expectations has a demotivating effect. Being demotivated may not just hamper performance, but can even cause women and men to disengage and/or avoid gender role-incongruent positions (Hoyt and Murphy, 2016).

Regardless of whether target women and men tend to display stereotype-confirming behavior because they have been socialized that way, because they fear backlash, or because of stereotype threat, in each case the outcome is that a target person's own behavior is reinforced to be congruent with gender role expectations. In turn, such gender role-congruent behaviors maintain and reinforce the initial gender role expectations, thus contributing to the flywheel effect of gender role expectations. We therefore propose:

*Proposition 3: Gender role expectations tend to reinforce themselves via the behavior of individuals: gender role expectations tend to cause women and men in gender role-incongruent positions to display gender role-congruent behavior, which maintains and reinforces the gender role expectations.*

# THE CONSEQUENCES OF THE FLYWHEEL OF GENDER ROLE EXPECTATIONS FOR GENDER-DIVERSE GROUPS

Although studies on the consequences of gender role expectations tend to focus almost exclusively on how gender role expectations affect (outcomes of) target women and men and occasionally the perceiver, there are good reasons to expect that gender role expectations will also affect group performance. Specifically, we argue that each of the three mechanisms *via* which gender role expectations reinforce themselves can shape group performance, such that group performance suffers to the extent to which gender role expectations inaccurately capture the division of expertise between men and women in gender-diverse work groups.

With regard to the allocation of jobs, tasks, and responsibilities, gender role expectations are likely to function as a heuristic that facilitates a task division among team members. However, as mentioned earlier, despite the general accuracy of gender role stereotypes regarding overall differences between women and men at the societal level (Jussim et al., 2015), gender role expectations will always carry a degree of inaccuracy in predicting the distribution of women and men's knowledge and abilities in a specific gender-diverse work group for any given job, task, or responsibility. The more that gender role expectations inaccurately capture group members' knowledge and abilities, the more likely it is that gender role expectations lead to a suboptimal task division. Because the performance of work groups tends to depend on the extent to which its members are allocated tasks that align with their expertise (Aime et al., 2014), we argue that the performance of a work group decreases the more that the allocation of jobs, tasks, and responsibilities is based on inaccurate gender role expectations.
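The allocation part of this argument can be made concrete with a small sketch. It is an illustration under stated assumptions, not a model proposed in this article: the function name, the member labels, and all ability scores are hypothetical, and the sketch assumes that a group simply assigns a task to the member expected to be most able.

```python
def allocation_loss(actual, expected):
    """Performance lost when a task goes to the member expected to be best.

    actual   -- dict mapping member -> actual ability on the task (hypothetical)
    expected -- dict mapping member -> ability inferred from gender role
                expectations (hypothetical)
    Returns (chosen member, best member, ability gap between them).
    """
    chosen = max(expected, key=expected.get)  # member the group selects
    best = max(actual, key=actual.get)        # member with the highest actual ability
    return chosen, best, actual[best] - actual[chosen]


if __name__ == "__main__":
    actual = {"member_a": 8, "member_b": 5, "member_c": 6}
    # Expectations that preserve the actual ranking of abilities.
    accurate = {"member_a": 7, "member_b": 5, "member_c": 6}
    # Expectations that misrank the most able member.
    inaccurate = {"member_a": 4, "member_b": 7, "member_c": 6}
    print(allocation_loss(actual, accurate))
    print(allocation_loss(actual, inaccurate))
```

When expectations preserve the actual ranking, the gap is zero; when they misrank the most able member, the task goes to someone less able and the gap is positive. The sketch covers only the allocation mechanism and leaves the behavior of perceivers and of target women and men aside.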

Regarding the behavior of perceivers, the more inaccurate gender role expectations are, the more likely it is that perceivers in gender-diverse work groups will turn to the wrong persons for help, follow the wrong advice, and put their trust in those who cannot be trusted, all of which inhibit performance. Furthermore, in being more influential, the less capable women and men in gender role-congruent positions are likely to yield an increase in errors and suboptimal decisions, also inhibiting performance. Indeed, given that groups tend to perform best when expertise is recognized (Bunderson, 2003; Joshi, 2014), we argue that the performance of a work group decreases the more the behavior of perceivers is based on inaccurate gender role expectations.

Finally, from the side of target women and men, the various ways in which women and men are pressured to display gender role-confirming behavior (i.e., by socialization, fear of backlash, or stereotype threat) diminish the influence of women and men in gender role-incongruent positions on group processes and outcomes. If such women and men in reality are the most competent group members, we argue that their limited influence in the group is likely to harm the group's performance. In line with this argument, a recent study showed that gender-diverse groups tended to perform worse to the extent that less-competent members were more influential (van Dijk et al., 2018). We thus argue that the performance of a work group decreases the more the behavior of target women and men is based on inaccurate gender role expectations.

In combination, we propose that gender role expectations harm group performance to the extent that gender role expectations inaccurately capture differences between male and female group members' level of knowledge and abilities:

*Proposition 4: The more inaccurately gender role expectations capture male and female group members' knowledge and abilities, the more gender role expectations-based allocations of jobs, tasks, and responsibilities, behaviors of perceivers, and behaviors of target women and men inhibit group performance.*

# IMPRESSION FORMATION MOTIVATION AS KEY TO INHIBIT THE FLYWHEEL

Gender role expectations may at first glance appear a useful heuristic to assess one's knowledge and abilities for a job, task, or responsibility, yet they remain uninformed guesses at best. Meta-analytic studies show that differences between women and men in most work-related knowledge and abilities tend to be small, heterogeneous, and converging (e.g., Eagly et al., 2003). More importantly, population differences say next to nothing about specific individuals.

Rather than relying on the flywheel of gender role expectations to form an impression of target persons, we therefore contend that individuals as well as work groups will benefit when group members use other means to discern knowledge and abilities. Based on the literature on how perceivers form impressions of target persons, we argue that group members' impression formation *motivation* is crucial in changing perceivers' reliance on gender role expectations in forming impressions of target persons.

Research on impression formation examines the process *via* which perceivers form an impression of a target. There are a number of slightly different models and theories on the process of impression formation (cf. Brewer, 1988; Fiske and Neuberg, 1990; Thagard and Kunda, 1996), but they all suggest that there are essentially two systems in a human brain that are responsible for forming an impression (Swencionis and Fiske, 2014). The first is the automatic or reflexive system that tends to form impressions automatically and often subconsciously by tapping into stereotypes in forming impressions of others. The second is the rational or reflective system that tends to form impressions based on deliberate attention to and the processing of individuating information.

Because the rational system consumes cognitive effort, perceivers tend to rely primarily on the automatic system in making inferences (Macrae et al., 1994). Accordingly, the general rule of impression formation is that impressions of others are mainly formed based on the automatic system, *unless* perceivers are sufficiently motivated to direct their attention to individuating information (Fiske and Neuberg, 1990; Nelson et al., 1996). The more that perceivers are motivated to form accurate impressions of others, the more they are willing to invest time and energy in looking beyond stereotype-based associations and pay attention to individuating information.

Gender role expectations are grounded in stereotypes. When perceivers rely on gender role expectations to make inferences of men and women, they thus tap into the automatic system. We therefore argue that the key to diverting work group members' reliance on gender role expectations is to influence their impression formation motivation. The more that work group members are motivated to form accurate impressions of their fellow group members, the more they will rely on individuating information rather than gender role expectations in forming impressions of men and women.

Specifically, we expect that a motivation to form accurate impressions will inhibit the extent to which gender role expectations shape the allocation of jobs, tasks, and responsibilities, the behavior of perceivers, and the behavior of target men and women. In paying more attention to individuating information, the allocation of jobs, tasks, and responsibilities will be more based on who is the right person for the job in terms of actual knowledge and abilities, rather than inferred knowledge and abilities based on gender. In addition, perceivers will be more supportive of group members with actual knowledge and abilities and critical toward those with less knowledge and abilities, regardless of the gender role incongruity of such members (cf. Correll and Ridgeway, 2003; Wittenbaum and Bowman, 2005). We further expect that target women and men will feel less pressured to conform to gender role expectations and instead will feel free to display gender role-incongruent behavior when they experience the need to do so (e.g., when they are the most capable member of the group).

We thus argue that the motivation to form accurate impressions increases perceivers' attention to individuating information and reduces their reliance on gender role expectations. The result is that (1) the allocation of group members to jobs, tasks, and responsibilities is based more on members' knowledge and abilities, (2) the recognition of knowledge and abilities in work groups is improved, and (3) the most capable and experienced group members become more influential, all of which positively affect group performance. We therefore propose:

*Proposition 5: The more that perceivers are motivated to form accurate impressions of their work group members, the less gender role expectations will affect the allocation of jobs, tasks, and responsibilities, the behavior of perceivers, and the behavior of target women and men, which will, in turn, enhance group performance.*

# AN AGENDA FOR FUTURE RESEARCH

In this conceptual analysis, we have argued that gender role expectations in work groups tend to behave like a flywheel. They automatically reinforce and maintain themselves *via* three mechanisms: the allocation of jobs, tasks, and responsibilities, the behavior of perceivers, and the behavior of target men and women. We have argued that this flywheel of gender role expectations will positively [negatively] affect group performance to the extent that gender role expectations accurately [inaccurately] capture differences in knowledge and abilities between men and women group members. In addition, we have argued that the performance of gender-diverse work groups benefits most when group members' impression formation relies less on the flywheel of gender role expectations, and is instead grounded in individuating information. To make perceivers focus more on individuating information in forming impressions, we have argued that it is key to motivate them to focus on forming accurate impressions.

In combination, these propositions advance theory on gender role expectations and gender diversity in three ways. The first is in pointing out how gender role expectations in gender-diverse work groups tend to be self-reinforcing and operate like a flywheel. Second, we built theory regarding how gender role expectations shape the performance of diverse work groups. The third theoretical contribution pertains to how the motivation to form accurate impressions can reduce the influence of gender role expectations and enhance the performance of gender-diverse work groups. In the following, we present a research agenda for future research, which is structured along these three contributions.

# ADVANCING RESEARCH ON THE FLYWHEEL OF GENDER ROLE EXPECTATIONS

Years of research have shown how gender role expectations shape the allocation of jobs, tasks, and responsibilities, the behavior of perceivers, and the behavior of target men and women (e.g., Eagly and Karau, 2002; Correll and Ridgeway, 2003). These consequences of gender role expectations have been documented in a variety of domains (e.g., recruitment and selection, backlash, and stereotype threat). In clustering the findings of those studies on the consequences of gender role expectations into the three mechanisms of the flywheel of gender role expectations, we hope to have provided researchers with a useful categorization of the different consequences of gender role expectations.

However, we hope that future research will not only focus on these mechanisms as consequences of gender role expectations. The main reason why we introduced the analogy of a flywheel is because of the self-reinforcing nature of gender role expectations. We therefore put a premium on studies that move from a static way of studying the consequences of gender role expectations in isolation to approaches that enable an assessment of the dynamics of gender role expectations within work groups.

Such research requires designs that track the interaction of group members' behavior in organizations over time. Researchers would need to measure gender role expectations longitudinally by using specifically designed indicators (e.g., specific behavioral expectations of group members for certain tasks, or indicators of automatic associations using instruments such as the Implicit Association Test; Greenwald et al., 1998) *via* repeated measures over time, and take stock of what happened in between that may account for changes in gender role expectations. For example, a male group leader may have been replaced by a female group leader, or group members may display more gender role-incongruent behavior. By complementing such findings with experiments in which the causality of the assumed underlying mechanisms is tested, researchers can assess the self-reinforcing nature of gender role expectations.

Although we presented and discussed each mechanism of the flywheel of gender role expectations independently, we expect that the three flywheel mechanisms also affect each other. First of all, the tendency to assign women and men to gender role-congruent jobs, tasks, and responsibilities prevents perceivers from being exposed to women and men in gender role-incongruent positions, and thus reinforces the gender role expectations of perceivers. Second, the gendered allocation of jobs, tasks, and responsibilities limits the extent to which individuals gain experience in gender role-incongruent positions. Third, the reciprocity in the interaction between perceivers and target men and women reinstates gender role expectations and their corresponding behaviors.

Preliminary evidence of such relationships among the mechanisms comes from a recent experimental study on task allocations, which showed that in gender-diverse groups, women, compared to men, more often tend to volunteer, are asked to volunteer, and accept requests to volunteer for low-status tasks (Babcock et al., 2017). In gender-homogeneous groups, no such gender differences in the willingness to volunteer, the request to volunteer, or the acceptance of requests to volunteer existed. Findings also showed that gender role expectations, rather than individual preferences, were responsible for the gender differences in the behavior of the group members toward each other. Whereas we consider the three mechanisms to meaningfully distinguish between different ways in which gender role expectations maintain and reinforce themselves in work groups, we recommend that researchers also examine relationships among the three mechanisms.

# ADVANCING RESEARCH ON THE GROUP-LEVEL CONSEQUENCES OF THE FLYWHEEL

Because almost all studies on the consequences of gender role expectations in organizational settings have focused on individual-level behavior and outcomes (e.g., Hall et al., 2018), research on how gender role expectations shape group-level behaviors and outcomes is still in its infancy. However, we contend that such research is important, given that the interest of many practitioners in diversity tends to focus primarily on how diversity shapes organizational performance (cf. Catalyst, 2004; Eagly, 2016). Two related studies show what research on the relationship between gender role expectations and work group behavior and performance can look like – and how it can advance our knowledge about the consequences of gender role expectations in organizations.

Chatman et al. (2008) showed that the behavior of gender-diverse groups depends on the gender distribution in relation to the nature of the task. Group members who were the only representative of their gender were assumed to be the most competent group member on gender role-congruent tasks (cf. Kanter, 1977; van Knippenberg et al., 2004), and were therefore more often deferred to (cf. Sekaquaptewa and Thompson, 2003). In a similar experiment, van Dijk et al. (2018) showed that group members on gender role-congruent tasks in gender-diverse groups were more influential (measured by speaking time) compared to group members on gender role-incongruent tasks during discussions. In the work groups where gender role expectations did not match the actual competence of the group members (e.g., the male group member was lower in math ability than the female group members), group members followed the wrong lead (e.g., not using the correct math resolutions offered by competent women, but following men's suggestions in the group), and group performance decreased.

The findings of Chatman et al. (2008) and van Dijk et al. (2018) provide preliminary evidence that gender role expectations shape interactions and performance at the group level. Moreover, they challenge the long-standing proposition in diversity research that diverse groups should be able to make better decisions compared to homogeneous groups when they discuss and share the richness and variety in knowledge, information, and perspectives present in their group (van Knippenberg et al., 2004): in deciding which information to ignore and whose advice to heed, group members tend to rely on biases and heuristics such as gender role expectations rather than being able to objectively assess the value and merit of each member's contribution.

Controlled experiments can build on the studies by Chatman et al. (2008) and van Dijk et al. (2018) to further establish the causal mechanisms of gender role expectations in the functioning and performance of groups. The paucity of research in this area provides numerous opportunities for future research. However, given their importance for team performance, we consider it especially important for future research in this area to further examine the processes and conditions that cause group members to weigh contributions based on gender role expectations – and what may make them forsake doing that.

Furthermore, field research in which work groups in organizations are followed over time would be necessary to examine the extent to which laboratory studies translate to organizational contexts. For instance, work group meetings could be observed to capture verbal and non-verbal expressions of gender role expectations among perceivers as well as target men and women. In relating such behaviors to meeting outcomes and work group performance over time, researchers can assess how gender role expectations may shape work group performance in organizational work groups.

# ADVANCING RESEARCH ON WAYS TO MOTIVATE PERCEIVERS TO FORM ACCURATE IMPRESSIONS

We have argued that motivating perceivers to form accurate impressions will reduce their reliance on gender role expectations and inhibit the flywheel effect of those expectations. Theory suggests that perceivers' impression formation motivation depends on (1) what the perceiver wants, (2) who controls what the perceiver wants, and (3) what the criteria are for attaining the desired outcome (Fiske and Neuberg, 1990; van Dijk et al., 2017). For example, if a group member desires to be promoted and her or his manager is in charge of making that call, then it is likely that the group member will follow the criteria that the manager has set for promotion. If those criteria include work group elements (e.g., group performance, getting along well with the other group members), then it is more likely that the group member will invest in getting to know the other group members compared to when the criteria only focus on the individual performance of the group member (cf. Overbeck and Park, 2001). Because there is hardly any research in organizations that has looked at how perceivers' motivation to form accurate impressions and reliance on individuating information can be enhanced, we argue that these theoretical guidelines provide a good start for future research.

However, given that there is a large variety in organizational contexts that can relate to differences in what perceivers want (e.g., public versus private sector), who controls what the perceiver wants (e.g., manager, other team members, client), and which contextual factors are known to shape perceivers' impression formation (e.g., task complexity, level of interaction, accountability), many studies will be needed to gather conclusive empirical evidence regarding the criteria that stimulate the motivation to form accurate impressions across task contexts. We therefore recommend that researchers adopt a collaborative approach in studying how perceivers' motivation to form accurate impressions can be enhanced in gender-diverse work groups. An inspirational example of this kind of research is a comparative study by Lai et al. (2016), which reports on a research contest in which research teams were invited to test interventions to reduce implicit racial bias (as measured by the IAT). Extending such a research design to examine the formation of accurate impressions as a function of manipulations of impression formation motivation would provide rich data on possible criteria that may drive the formation of accurate impressions in work groups and inhibit the flywheel of gender role expectations.

Furthermore, research on diversity in organizations suggests that the performance of (gender-)diverse work groups is facilitated by fostering a diversity climate (e.g., Shore et al., 2011; Nishii, 2013), which refers to "employees' perceptions about the extent to which their organization values diversity as evident in the organization's formal structure, informal values, and social integration of underrepresented employees" (Dwertmann et al., 2016, p. 1137). The exact reasons why diversity climates enhance the performance of diverse work groups are still a subject of debate and study, but it could very well be that diversity climates in gender-diverse work groups enhance perceivers' motivation to form accurate impressions.

Specifically, Dwertmann et al. (2016) suggested that a diversity climate consists of two components. The *fairness and discrimination* component is defined as "shared perceptions about the extent to which the organization and/or workgroup successfully promotes fairness and the elimination of discrimination through the fair implementation of personnel practices, the adoption of diversity-specific practices aimed at improving employment outcomes for underrepresented employees, and/or strong norms for fair interpersonal treatment" (p. 1151). The *synergy* component of a diversity climate refers to "the extent to which employees jointly perceive their organization and/or workgroup to promote the expression of, listening to, active valuing of, and integration of diverse perspectives for the purpose of enhancing collective learning and performance" (p. 1151). Although each component thus has a different focus and purpose, they both require the organization to establish strong norms that it actively promotes and reinforces. To institutionalize such strong norms, criteria involving adherence to such norms and accountability are essential – factors that have been suggested to enhance perceivers' motivation to form accurate impressions (Tetlock, 1983; Fiske and Neuberg, 1990).

Interestingly, the fairness and discrimination component is likely to inhibit the extent to which gender role expectations shape the allocation of jobs, tasks, and responsibilities, whereas the synergy component is likely to inhibit the extent to which gender role expectations shape the behaviors of perceivers and of target men and women. As such, the establishment of a diversity climate may provide an integral solution to motivate perceivers to form accurate impressions, inhibit the flywheel of gender role expectations, and enhance the performance of (gender-)diverse work groups. We therefore recommend that researchers tap into this potentially fruitful avenue for future research.

# CONCLUSION

In using a flywheel as an analogy to illustrate the self-reinforcing nature of gender role expectations in gender-diverse work groups, we hope to create awareness about the pervasiveness of gender role expectations. Moreover, in pointing out that individuals as well as work groups can suffer from gender role expectations, we hope to establish a sense of urgency about the importance of addressing ways to inhibit the flywheel of gender role expectations. We call for researchers as well as practitioners to work together in assessing which interventions are effective in helping members of gender-diverse work groups to rely less on the flywheel of gender role expectations and motivate them to form accurate impressions instead.

# REFERENCES


# AUTHOR CONTRIBUTIONS

HD conceptualized the flywheel analogy. HD and ME co-authored the manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 van Dijk and van Engen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# How Do Men and Women Perceive a High-Stakes Test Situation?

Julia E. M. Leiner\*, Thomas Scherndl and Tuulia M. Ortner

*Department of Psychology, University of Salzburg, Salzburg, Austria*

The results of some high-stakes aptitude tests in Austria have revealed sex differences. We suggest that such discrepancies are mediated not principally by differences in aptitudes, skills, and knowledge but by sex differences in test takers' perceptions of the test situation. Furthermore, previous research has indicated that candidates' evaluations of the fairness of the testing tool are of great importance from an institutional point of view because such perceptions are known to influence an organization's attractiveness. In this study, we aimed to investigate how women and men perceive and evaluate certain aspects of a high-stakes test situation by using the results and evaluations of an actual medical school aptitude test (747 applicants; 59% women). Test takers voluntarily evaluated the test situation, rated specific aspects of it (e.g., the fairness of the selection tool), and provided information regarding their test anxiety immediately after they completed the 4-h test. Data analyses indicated small, albeit significant sex differences in participants' perceptions of the test. Men described the test situation as giving slightly more opportunity to socialize and possessing more opportunity to deceive than women did. Furthermore, the perception of the test situation did not directly predict the test results, but it served as a moderator for the indirect effect of test anxiety on test results. By contrast, there were significant direct effects but no indirect effects of situation perception on evaluations of the fairness of the selection tool: The more the test situation was perceived as a high-pressure situation, the lower the fairness ratings of the testing tool. Results were discussed with reference to gender roles and test fairness.

#### Edited by:

*Alice H. Eagly, Northwestern University, United States*

#### Reviewed by:

*David Reilly, Griffith University, Australia Isabelle Cherney, Merrimack College, United States*

\*Correspondence: *Julia E. M. Leiner julia.leiner@sbg.ac.at*

#### Specialty section:

*This article was submitted to Gender, Sex and Sexuality Studies, a section of the journal Frontiers in Psychology*

Received: *19 May 2018* Accepted: *26 October 2018* Published: *04 December 2018*

#### Citation:

*Leiner JEM, Scherndl T and Ortner TM (2018) How Do Men and Women Perceive a High-Stakes Test Situation? Front. Psychol. 9:2216. doi: 10.3389/fpsyg.2018.02216*

Keywords: test situation perception, test anxiety, sex differences, fairness perception, test performance

# INTRODUCTION

Imagine an important assessment situation, for example, a high-stakes test situation with 100 test takers. When viewed only superficially, there is just one situation, which appears to be very much the same for every test taker. However, there may be up to 100 different impressions because each individual may perceive the same situation in a different way. In this study, we aim to address the issue of differences in perceptions of a competitive standardized high-stakes test situation. The focus lies on sex-related differences in people's perceptions of the situation and the possible effects of these perceptions on test performance. The goal is to initiate a new approach for assessing sex differences in competitive environments.

In many competitive areas, for example, in academic science, professional and managerial senior positions, and assessments, women tend to be outperformed by men. In the European Union, women are underrepresented in senior academic positions (EU, 2012), and men hold the majority of board seats in European and U.S. companies (Backus et al., 2016). Although women and men do not differ considerably in their skills and abilities (see Hyde et al., 1990), aptitude tests have painted a different picture with respect to test performance (see Mau and Lynn, 2001): Analyses have revealed cross-national sex differences in performances on college and aptitude tests (Else-Quest et al., 2010; Salchegger and Suchan, 2018). Whereas in general differences in verbal ability and writing tests favor girls (Reilly et al., 2018), differences in math tests favor boys (Reilly et al., 2015). These differences also apply to high-stakes tests, such as the Graduate Record Exam in the U.S. (see e.g., https://www.prepscholar.com/gre/blog/average-gre-scores/). However, with reference to cognitive performance, research has revealed that sex differences that favor male test takers tend to occur particularly in competitive situations, indicated by an increase in the performance of men and basically no change in the performance of women, even when women's performances are similar to men's in non-competitive environments (Gneezy et al., 2003; Niederle and Vesterlund, 2007, 2010). In Austria, test scores on public medical high-stakes aptitude tests have been under public scrutiny for years because of sex differences in test scores (Pfarrhofer, 2017). Although a larger percentage of women (60%) compared with men (40%) took the test in 2017, women represented only 53% of the test takers who were accepted to a university, thus indicating that they scored lower than men (see Pfarrhofer, 2017).

Because the relevance of test scores for decisions in the educational system has increased in Europe in recent years (e.g., see the growing number of subjects with entrance exams at Cambridge University and Oxford University; Turner et al., 2017; or the establishment of new entrance tests at German universities after a decision made by the German Constitutional Court in 2017, see Konegen-Grenier, 2018), attention has also been directed toward the topic of test bias and fairness (e.g., Kaufmann, 2010; Fischer et al., 2013; Aguinis et al., 2016). If test scores at the group level are systematically affected by factors that are not intended to be measured by the test, the test provides inaccurate and unfair scoring. According to Helms' (2006) quantitative model, differential performance between groups may stem from individuals' interpretations of test situations that are based on differential past experiences. Interpretations and experiences in test situations and their impact on women's and men's test scores have been insufficiently investigated so far. Therefore, the present study aimed to explore the perceptions of women and men in a high-stakes test situation.
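The 2017 admission figures cited above (women were 60% of test takers but 53% of those accepted) can be turned into a relative acceptance rate. The following is a minimal sketch, not an analysis from the cited report: the function name is hypothetical, the percentages are treated as exact, and only two groups are assumed. Because the overall acceptance rate cancels out of the ratio, it does not need to be known.

```python
def relative_selection_rate(share_of_applicants: float, share_of_admitted: float) -> float:
    """Acceptance rate of one group divided by that of the complementary group.

    Both arguments are proportions between 0 and 1: the group's share among
    all test takers and its share among those accepted. (Hypothetical helper
    for illustration; two groups only.)
    """
    if not (0 < share_of_applicants < 1 and 0 < share_of_admitted < 1):
        raise ValueError("shares must lie strictly between 0 and 1")
    # Each group's acceptance rate is proportional to admitted share / applicant share.
    group_rate = share_of_admitted / share_of_applicants
    other_rate = (1 - share_of_admitted) / (1 - share_of_applicants)
    return group_rate / other_rate


# Figures from the text: women were 60% of test takers and 53% of those accepted.
ratio = relative_selection_rate(0.60, 0.53)
print(f"Women's acceptance rate relative to men's: {ratio:.2f}")  # prints 0.75
```

Under these assumptions, a woman who took the test was about three quarters as likely to be accepted as a man who took it, which is one way to express the size of the gap described in the text.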

# Systematic Measurement Error: Construct-Irrelevant Variance

When it comes to the assessment of achievement-related variables, the test design as well as the situational circumstances surrounding the assessment situation should allow test takers to show their maximal performance (e.g., Willingham and Cole, 1997). Codes of conduct and standards for test fairness (e.g., the Standards for Educational and Psychological Testing; American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 2014) state that the test situation should further aim to provide comparable opportunities for all test takers to apply the skills, abilities, and knowledge they possess. From a psychometric perspective, the part of the overall variability of the scores that can be attributed to construct-relevant variance should be maximized, whereas the influence of factors that are irrelevant to the construct should be minimized (Stone and Cook, 2016). With respect to measurement error, the literature has distinguished random error from systematic error (see, e.g., Cote and Buckley, 1987). Systematic measurement errors are caused by factors that affect measurement outcomes systematically, resulting in a systematic decrease in test scores for an individual test taker or a group of test takers.
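The distinction between construct-relevant variance, systematic error, and random error can be written compactly. The following is only an illustrative sketch in classical test theory style. The symbols are assumptions introduced here, not notation used by the cited sources, and the three components are assumed to be mutually uncorrelated. Let an observed score $X$ consist of a construct-relevant part $T$, a systematic error $E_{\text{sys}}$, and a random error $E_{\text{rand}}$:

$$
X = T + E_{\text{sys}} + E_{\text{rand}}, \qquad
\sigma^2_X = \sigma^2_T + \sigma^2_{E_{\text{sys}}} + \sigma^2_{E_{\text{rand}}}
$$

Under these assumptions, the psychometric goal described above amounts to making the share $\sigma^2_T / \sigma^2_X$ as large as possible. Construct-irrelevant variance corresponds to $\sigma^2_{E_{\text{sys}}}$, which is nonzero whenever the systematic error differs across test takers or groups of test takers.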

Haladyna and Downing (2004) presented a taxonomy for the study of systematic errors associated with construct-irrelevant variance threatening test score interpretation and addressed test anxiety as one of the most common sources. Test anxiety as a trait characteristic, defined as "the tendency to view with alarm the consequences of inadequate performance in an evaluative situation" (Sarason, 1978, p. 213) has been investigated for decades, with women reporting higher occurrences of test anxiety than men (Hembree, 1988; Zeidner, 1990). Research has revealed that test anxiety can impair those who are affected in different ways: Highly test-anxious people are more sensitive to environments that emphasize competition (Hancock, 2001) and tend to view test situations in particular as personally threatening (Sarason and Sarason, 1990). Test anxiety was found to be associated with academic self-concept (Zeidner and Schleyer, 1998) and was identified as affecting academic performance (Chapell et al., 2005). With respect to the underlying mechanism that causes performance to decrease, test anxiety was revealed to impair working memory capacity (Ashcraft and Kirk, 2001) because highly anxious individuals are believed to use more processing resources by worrying than individuals low on anxiety (Eysenck and Calvo, 1992). Furthermore, anxiety was found to lead individuals to show a more self-focusing strategy instead of a task-focusing tendency (Hancock, 2001). These mechanisms could serve as explanations for the underperformance of women on achievement tests.

Based on qualitative and quantitative data, Bonaccio and Reeve (2010) developed a framework of perceived sources of test anxiety: Besides students' perceptions of the test as well as their perceptions of themselves, the test-taking situation was revealed to be an important source of test anxiety. With respect to reactions to test situations, Steele (1997) was the first to introduce stereotype threat as a source of bias on standardized tests. Negative stereotypes were identified as a core characteristic of this phenomenon because self-threats were revealed to interfere with the targets' test performance. Experiments have shown, for example, that women performed worse than men when both groups were explicitly told that this test should show sex differences. In contrast, these differences in women's and men's test performance vanished when the same test was presented stereotype-free (see Spencer et al., 1999). Schmader and Johns (2003) reported that stereotype threat reduced cognitive capacity, which led to lower performance for the stereotyped group. Steele (1997) stressed performance differences caused by stereotype threat as an effect of the situation: Extra situational pressure sets up the frame for attributions of gender-related ability limitations. Research indicated that stereotype threat led to higher numbers of negative thoughts (Cadinu et al., 2005), whereas negative thoughts were identified as related to the cognitive component of test anxiety (Cassady and Johnson, 2002).

# Situation Perception

According to an early statement made by Lewin (1946), people and their environments are interwoven and cannot be separated or studied independently. Situations provide information that is distinctively processed by each individual (Sarason, 1978), thus influencing people's perceptions (e.g., how to encode the situation, expected outcomes, and their subjective value) and thereby affecting the way individuals think and act under such conditions (Mischel, 1977). Considering the interaction of persons and situations, Mischel (1977) shifted the focus to draw attention to the issue of "When are situations most likely to exert powerful effects [. . . ]?" (p. 346), thus addressing their potential influence on individual behavior. His claim refers to so-called strong situations, which provide clear incentives and normative expectations of behavior—criteria that are met in a test situation because of their high standardization and rules of conduct. At the other end are weak situations, which lack environmental cues for performance. Nevertheless, Cooper and Withey (2009) extended this theory by more recently postulating a continuum between strong situations (resulting in main effects of only the situation on behavior) and weak situations (resulting in main effects of only personality on behavior) by proposing that an individual's personality also affects perceptions and reactions in strong situations.

In his Situation Perception Components Model, Rauthmann (2012) proposed that people's unique impressions lead to three components of variance in ratings of situations: perceiver variance, situation variance, and perceiver × situation variance. With reference to the terminology employed in current approaches in research on situations, cues are defined as objectively quantifiable stimuli that need to be processed by a perceptual system to be interpreted with reference to their content. Each situation is made up of several cues (see Rauthmann et al., 2015), which can be associated with psychological meanings (e.g., pleasant or negative); characteristics (also referred to as qualities or features) determine the psychological meaning of detected cues, embracing the situation's psychological "power" (Edwards and Templeton, 2005; Rauthmann et al., 2014). Situations containing similar cues and/or similar combinations of characteristics and sharing important aspects of their psychological meanings can be summarized as classes of situations. With reference to these different classes, current approaches aim to establish empirically based "class taxonomies" as a system of categories that integrates all possible situations.
Recently, analyses of a large and multinational set of data from a questionnaire for assessing situational characteristics (Situational Q-Sort; Wagerman and Funder, 2009) led to a model represented by a structure of eight psychological characteristics relevant for describing situations (Rauthmann et al., 2014): The widely recognized Situational Eight DIAMONDS model (e.g., Rauthmann and Sherman, 2016, 2017; Horstmann and Ziegler, 2018; Rauthmann et al., 2018) comprises the following dimensions with original sample questions (see Rauthmann et al., 2014): Duty (Does something need to be done?), Intellect (Is deep cognitive processing required?), Adversity (Is someone threatened by external forces?), Mating (Is there an opportunity to attract potential mates?), pOsitivity (Is the situation pleasant?), Negativity (Can the situation arouse negative feelings?), Deception (Can others be trusted?), and Sociality (Is social interaction possible or expected?). Research on undergraduate students by Sherman et al. (2013) revealed that individuals' personality and gender play a role in how individuals perceive daily life situations: Men estimated situations as holding more potential for blame, more potential for undermining or sabotage, and more potential for others to be "under threat." Women were more likely to view situations with reference to their potential to evoke a need for support, to give rise to "warmth or compassion", or to allow for emotional expression.

Taking into account the trend that contemporary approaches in the research on situation perception mainly focus on daily life situations (e.g., Sherman et al., 2010, 2013; Rauthmann, 2012; Rauthmann et al., 2015; Horstmann and Ziegler, 2018), psychologists have thus far learned little about the perception of high-stakes test situations. Bringing current findings on test bias (e.g., test anxiety) and contemporary research on situation perception together, this study aimed to shed light on a new viewpoint on testing focusing on the applicant's subjective perception of the situation as a previously unconsidered source of construct-irrelevant variance.

# The Present Study

In this study, we investigated situation perception in a high-stakes test situation and its relations to sex differences in test performance and fairness evaluations. We addressed situation perception and further included test anxiety (as a personality trait) as sources of systematic construct-irrelevant score variance. Test takers completed a short paper-and-pencil form after taking a medical school entrance examination. On the basis of previous research (Sherman et al., 2013), we expected sex differences in the perceived characteristics of the test situation (Hypothesis 1: There will be differences in women's and men's perceptions of a high-stakes test situation). Furthermore, we included test takers' test anxiety (see, e.g., Chapell et al., 2005) and analyzed its unique and moderated effect (by situation perception) on (1) overall test performance and (2) evaluations of the fairness of the selection tool. Given that a university entrance examination serves different interests, we considered possible outcomes of the test on the test taker's side as well as the institution's side: Whereas test takers aim for admission, and past experiences may result in future expectations with reference to similar situations (see Helms, 2006), the perceived fairness of the testing tool is known to influence an organization's attractiveness (Chapman et al., 2005). We expected both variables, overall test performance as well as the evaluation of the fairness of the selection tool, to be influenced by test anxiety and therefore anticipated test anxiety to function as a suppressor variable in two ways: First, we expected general test anxiety to serve as a mediating variable between test takers' sex and test performance (Hypothesis 2: There will be an indirect effect of sex on performance through test anxiety, which will be moderated by the perception of the situation).
Second, and in a similar manner, we expected that test anxiety would serve as a mediating variable between test takers' sex and their evaluation of whether the testing procedure was fair (Hypothesis 3: There will be an indirect effect of sex on evaluations of fairness through test anxiety, which will be moderated by the perception of the situation). Because the influence of situation perception has yet to be investigated in the context of high-stakes tests, we did not formulate directional predictions. However, we expected that aspects that prove relevant for situation perception in the context of high-stakes tests may serve as moderating variables, as presented in **Figure 1**.

# METHODS

# Participants

In total, 777 applicants took the entrance test at a private medical school in Austria. In a specially prepared lecture hall, every test taker was provided a workspace with a laptop and a computer mouse as well as a closed white envelope, which contained the evaluation form. After the test, the last screen informed the applicants that the test was over and invited them to open the envelope and voluntarily fill out the items. There were 25 test takers who did not return the evaluation form and five who answered <50% of all items and were therefore not included in further analyses. The resulting sample consisted of 747 participants (442 women between the ages of 16 and 44, M = 20.64, SD = 2.66, and 305 men between the ages of 17 and 35, M = 21.10, SD = 2.56). The largest group of participants were German citizens (60%), followed by Austrian citizens (35%); the remaining 5% were citizens of other countries. The number of cases serving as a base for particular analyses was sometimes slightly smaller because some data were missing on specific scales.

# Procedure

The examination took place during 6 days in April 2017, with a maximum of two test sessions per day, one starting at 08:00 a.m. and one starting at 01:00 p.m. The computerized 4-h aptitude test consisted of 11 different subtests. After the test takers had completed the computerized aptitude battery, they were invited to fill out a short evaluation form. The evaluation form informed the test takers that the aim was to obtain test takers' evaluations of the test situation and test takers' experiences in order to enhance the test and the test situation in the future. The evaluation form included (1) items for assessing test takers' evaluations of the fairness of the testing tool, (2) items for assessing general test anxiety and situation perception as well as (3) an opportunity to provide feedback in a free-response format. Test takers were informed that the information they provided on the evaluation form would not have any impact on the admission decision and that there would not be a risk of harm due to their participation in the survey. Furthermore, test takers were informed that participation was voluntary and refusing had no consequences. On average, it took about 5 min to fill out the form.

# Materials

The evaluation form included short forms of existing scales for assessing the perceived fairness of the selection tool, test situation perception, and general test anxiety (see **Table 1** for an overview of all items). It also included free-response evaluation items. Short scales of original questionnaires were administered to keep the form brief and to ensure that as many test takers as possible would fill it out voluntarily. Test takers rated each item on a 7-point rating scale (0 = not at all, 6 = absolutely), except one item concerning their overall evaluation of the fairness of the testing tool, to which they assigned a grade (A–E). To estimate the psychometric properties of the scales, we ran exploratory factor analyses (see section Statistical Analyses). The psychometric properties of the resulting scales are presented in **Table 4**.

## Evaluation of the Fairness of the Selection Tool

We implemented a short version of the AKZEPT!-L survey (Kersting, 2008) in order to obtain test takers' subjective evaluations of the fairness of the selection tool. These comprised three items in total including the following aspects: Measurement Quality, Face Validity (both rated from 0 = not at all to 6 = absolutely), and an Overall Evaluation of the selection tool (graded from A-E; see **Table 1**).

## Test Situation Perception

In order to obtain an individual score describing the subjective psychological quality of the test situation, we employed an adapted version of the S8<sup>∗</sup> questionnaire published by Rauthmann and Sherman (2016). The original S8<sup>∗</sup> questionnaire consists of 24 items (three items per each DIAMONDS dimension). For this study, we chose two items each from five of the eight original dimensions and adapted them to a test situation: Duty, Adversity, pOsitivity, Deception, and Sociality. A comparison of the original and adapted wording is presented in **Table 2**. This questionnaire had originally been developed for assessing perceptions in daily life situations.

A high-stakes test situation differs from daily life situations, for example, in terms of its standardized structure and test taker's expected behavior, both of which are criteria for strong situations (Mischel, 1977). Therefore, we developed the items in accordance with Mischel's (1977) four criteria for strong situations, "leading everyone to construe the particular event in the same way," "inducing uniform expectancies regarding the most appropriate response pattern," "providing adequate incentives for the performance of the adequate response pattern," and "requirement of skills that everyone has to the same extent" (rated from 0 = not at all to 6 = absolutely; see **Table 1**).

## Test Anxiety

Test anxiety as a personality trait was assessed with four items from the short form of the German version of the Test Anxiety Inventory (TAI-G; Hodapp, 1991; rated from 0 = not at all to 6 = absolutely; see **Table 1**). Test takers were asked which statements were generally true for them when it comes to test situations. The original TAI-G questionnaire consists of 15 items (Wacker et al., 2008) and has been shown to assess more trait-related stable individual differences than situational effects (Keith et al., 2003).

TABLE 1 | Overview of items included in the evaluation form.

*Bold items were included in the analyses. Situation perception scales included original items from the S8\* questionnaire (Rauthmann and Sherman, 2016) and adapted items, see Table 2.*

TABLE 2 | Original and adapted items from the S8\* questionnaire (Rauthmann and Sherman, 2016).

*The adapted items were translated into English. The German originals can be obtained upon request.*

## Overall Test Performance on the Admission Test

Overall test performance was calculated for each test taker as a weighted average of the z-standardized scores on all subtests of the admission test. This overall test score comprised results from 13 tests for assessing knowledge (e.g., basic knowledge in natural sciences, English), skills, or abilities (e.g., spatial ability, memory, reasoning). The overall score also included aspects of personality<sup>1</sup> assessed with objective personality tests in computerized miniature situations (see Ortner and Proyer, 2015) and questionnaire items (see Ortner et al., 2017).

<sup>1</sup>Based on an empirical analysis of experts' evaluations, high conscientiousness as well as high agreeableness emerged as important requirements.
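As a rough illustration of the overall test score described in this section, the sketch below z-standardizes each subtest across test takers and then forms a weighted average per person. The subtest weights, the example scores, and the function name `weighted_z_composite` are hypothetical; the actual weights used for the admission test are not reported here.

```python
import numpy as np

def weighted_z_composite(scores, weights):
    """Weighted average of z-standardized subtest scores per test taker.

    scores  : array of shape (n_test_takers, n_subtests) with raw scores
    weights : array of shape (n_subtests,) with hypothetical subtest weights
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # z-standardize each subtest across test takers (sample SD, ddof=1)
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    # weighted average across subtests for each test taker
    return z @ weights / weights.sum()

# Made-up raw scores for four test takers on three subtests
raw = np.array([[12.0, 30.0, 7.0],
                [15.0, 25.0, 9.0],
                [9.0, 35.0, 5.0],
                [18.0, 28.0, 8.0]])
w = np.array([2.0, 1.0, 1.0])  # hypothetical weights
overall = weighted_z_composite(raw, w)
print(overall)
```

Because every subtest is standardized before weighting, the composite has a mean of zero across test takers, and a test taker's score expresses their weighted standing relative to the other applicants rather than a raw number of points.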

TABLE 3 | Factor loadings based on a principal components analysis with oblimin rotation for all items.

# Statistical Analyses

In order to estimate the psychometric properties of the adapted version of the S8<sup>∗</sup> questionnaire and Mischel's (1977) criteria for strong situations, we ran an exploratory factor analysis (principal axis factoring using oblimin rotation) to evaluate the factor structure. The results of parallel analysis as well as the scree plot suggested a four-factor solution. We therefore fixed the number of factors to four after dropping items due to low variance (one item) and low communality (<0.30; two items). Furthermore, we dropped items with factor loadings below 0.40 or substantial cross-loadings on several factors (three items). The final four-factor solution explained 64% of the variance and included nine items. The four resulting factors were labeled Feeling stimulated, Opportunity to socialize, Feeling pressured, and Opportunity to deceive (the items comprising each factor and their factor loadings are presented in **Table 3**). The factor scores for these four factors were used for all further analyses (see the descriptive statistics and correlation coefficients for the resulting variables in **Tables 4**, **5**).
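The logic of a parallel analysis can be sketched as follows: eigenvalues of the observed correlation matrix are compared with eigenvalues obtained from random data of the same size, and leading factors are retained only as long as the observed eigenvalue exceeds the random-data threshold. This is a minimal sketch under stated assumptions, not the procedure used in the study: it works on the unreduced correlation matrix (a components-based variant rather than principal axis factoring), uses the 95th percentile as the threshold, and runs on simulated data with a known two-factor structure. The function name and all parameters are hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_sim=200, percentile=95, seed=0):
    """Horn-style parallel analysis on the unreduced correlation matrix.

    Returns the suggested number of factors, the observed eigenvalues,
    and the random-data eigenvalue thresholds.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    # Eigenvalues from uncorrelated normal data of the same shape
    simulated = np.empty((n_sim, p))
    for i in range(n_sim):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    # Count leading eigenvalues that exceed the threshold, stopping at the first failure
    exceeds = observed > threshold
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold

# Simulated items: two independent factors with four items each (illustrative only)
rng = np.random.default_rng(1)
n = 500
factors = rng.standard_normal((n, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8
loadings[1, 4:] = 0.8
items = factors @ loadings + 0.6 * rng.standard_normal((n, 8))

n_factors, observed, threshold = parallel_analysis(items)
print(n_factors)
```

In practice the suggested number of factors is read alongside the scree plot, as described above, because the two criteria do not always agree.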

To address Hypothesis 1, whether men and women differ in their perceptions of a high-stakes test situation, we calculated simple t-tests with sex as the independent variable and four situation perception factors, which we obtained from the factor analysis, as the dependent variables.

*Final factor solution; factors were content-based and labeled as indicated in the table headings. Bold values indicate main factor loadings. N = 652.*

We further analyzed whether the difference between men and women in the test results and in the evaluations of the fairness of the selection tool could be partly explained by different levels of general test anxiety (Hypotheses 2 and 3, respectively). Furthermore, we calculated whether this mediation would hold regardless of the extent to which test takers perceived the situation as a high-pressure situation during the test. For this purpose, we ran a moderated mediation analysis in accordance with Hayes's (2013) guidelines (Model 8) using PROCESS 2.16.3 for SPSS with sex as the independent variable, general test anxiety as the mediating variable, situation perception (Feeling pressured) as the moderating variable<sup>2</sup>, and the overall test result (Hypothesis 2) and fairness of the selection tool (Hypothesis 3) as the dependent variables (for an overview, see **Figure 1**). For all models, we centered the products of our variables and computed bias-corrected confidence intervals based on 5,000 bootstrapped samples.

<sup>2</sup>We exclusively report the Feeling pressured factor because other situation perception factors did not function as significant moderators of the postulated model. However, results of the moderated mediation models with the other three situation perception factors Opportunity to socialize, Feeling stimulated, and Opportunity to deceive are available upon request.
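To make the analysis strategy more concrete, the sketch below estimates a moderated mediation model of this general form with ordinary least squares and bootstraps the index of moderated mediation. Several points are assumptions made for illustration: the data are simulated, the variable names `x`, `w`, `m`, and `y` are hypothetical stand-ins for sex, Feeling pressured, test anxiety, and the outcome, and the interval is a simple percentile bootstrap rather than the bias-corrected interval reported in this study. It is not a reimplementation of PROCESS.

```python
import numpy as np

def ols(y, *predictors):
    """Least-squares coefficients for y on an intercept plus the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def index_of_moderated_mediation(x, w, m, y):
    """Product a3 * b: moderation of the x -> m path times the m -> y path."""
    xw = x * w
    a = ols(m, x, w, xw)      # mediator model: m ~ x + w + x*w
    b = ols(y, x, w, xw, m)   # outcome model:  y ~ x + w + x*w + m
    return a[3] * b[4]

def percentile_bootstrap_ci(x, w, m, y, n_boot=1000, level=95, seed=0):
    """Percentile bootstrap interval (a simplification of the bias-corrected one)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        estimates[i] = index_of_moderated_mediation(x[idx], w[idx], m[idx], y[idx])
    tail = (100 - level) / 2
    lo, hi = np.percentile(estimates, [tail, 100 - tail])
    return lo, hi

# Simulated data with a known index of (-0.3) * (-0.4) = 0.12 (illustrative only)
rng = np.random.default_rng(42)
n = 2000
x = rng.integers(0, 2, size=n).astype(float)   # binary predictor
w = rng.standard_normal(n)
w = w - w.mean()                               # mean-centered moderator
m = 0.5 * x + 0.2 * w - 0.3 * x * w + rng.standard_normal(n)
y = -0.15 * x - 0.4 * m + rng.standard_normal(n)

point = index_of_moderated_mediation(x, w, m, y)
lo, hi = percentile_bootstrap_ci(x, w, m, y)
print(point, lo, hi)
```

An interval for the index that excludes zero is usually read as evidence that the indirect effect depends on the moderator. The demonstration uses 1,000 resamples only to keep the run time short; the study used 5,000.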


# RESULTS

An overview of the descriptive statistics for all scales is presented in **Table 4**. An overview of all correlations is presented in **Table 5**. Effect sizes are interpreted according to Cohen's (1988) classifications.

*#, Number of items; α, Internal consistency.*

# Sex Differences in Test Situation Perception

For Hypothesis 1, addressing sex differences in the perception of the test situation, the analyses revealed significant albeit small differences in scores on the factor Opportunity to deceive, indicating slightly higher scores for men (M = 2.36, SD = 1.37) compared with women (M = 2.11, SD = 1.39), t(739) = 2.45, p = 0.014, d = 0.18. This result indicates that men were slightly more likely to perceive the situation as an opportunity to engage in deception compared with women. The analyses further revealed differences with reference to the scores on the dimension Opportunity to socialize. The scores were higher for men (M = 1.26, SD = 1.40) than for women (M = 1.00, SD = 1.29), t(726) = 2.62, p = 0.009, with a small effect size, d = 0.20. This result indicates that men reported viewing the test situation as more social than women did. Analyses showed a very small and non-significant difference in the situation perception dimension Feeling pressured (men: M = 3.31, SD = 1.30; women: M = 3.16, SD = 1.42; ΔM = 0.15), t(741) = 1.50, p = 0.135, d = 0.11. The analyses revealed no significant differences in situation perception with reference to the dimension Feeling stimulated (men: M = 4.87, SD = 0.86; women: M = 4.87, SD = 0.86), t(745) = −0.43, p = 0.966, d = 0.00.
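The effect sizes above can be approximately reproduced from the reported means and standard deviations. The sketch below does this for Opportunity to deceive using a pooled standard deviation. The group sizes are an assumption: the full-sample counts (305 men, 442 women) are used because the per-scale group sizes are not reported, so the resulting t value is close to, but need not exactly match, the reported one.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference based on the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def student_t(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance t statistic computed from summary statistics."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se

# Reported summary statistics for Opportunity to deceive;
# group sizes are ASSUMED to equal the full-sample counts.
men = dict(m=2.36, sd=1.37, n=305)
women = dict(m=2.11, sd=1.39, n=442)

d = cohens_d(men["m"], men["sd"], men["n"], women["m"], women["sd"], women["n"])
t = student_t(men["m"], men["sd"], men["n"], women["m"], women["sd"], women["n"])
print(round(d, 2), round(t, 2))
```

By Cohen's (1988) conventions, values of d around 0.20 count as small, which matches the interpretation given above.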

# Effects of Test Situation Perception and General Test Anxiety on Test Performance

With reference to Hypothesis 2, we tested a moderated mediation model to assess the indirect effect of sex via test anxiety on overall test performance and to determine whether this indirect effect was influenced by the perception that the situation was a high-pressure situation (see also **Figure 2A**). In a first step, we reported the results (unstandardized regression coefficients including 95% bias-corrected bootstrapped confidence intervals with 5,000 samples) for the simple mediation model. Then we continued to check whether this indirect effect changed in accordance with test takers' perceptions of the situation (Feeling pressured). The results for the complete model are also presented in **Table 6**.

The sex differences in overall test performance were significantly mediated by test anxiety (as indicated by a significant index of moderated mediation: b = 0.01, 95% CI [0.00; 0.02]). However, there was still a significant direct effect of sex on overall test performance (b = −0.15, SE = 0.04, p < 0.001), indicating that men scored higher than women on the test even after self-reported general test anxiety was entered as a mediator. Sex also had an effect on test anxiety: Women reported higher general test anxiety than men (b = 0.47, SE = 0.07, p < 0.001), and higher general test anxiety in turn led to lower overall performance (b = −0.04, SE = 0.02, p = 0.022).

\**p* < 0.05, \*\**p* < 0.01.

TABLE 6 | Moderated mediation of sex predicting test anxiety and overall performance via feeling pressured.

*N* = 730.

Differences in perceptions of the test situation concerning Feeling pressured did not affect performance (b = 0.08, SE = 0.02, p = 0.621) or the sex difference in performance (b = 0.01, SE = 0.03, p = 0.783). However, analyses revealed a significant positive relation between situation perception and test anxiety (b = 0.17, SE = 0.03, p < 0.001) and an effect of situation perception on the size of the sex difference in test anxiety (b = −0.18, SE = 0.07, p = 0.053): For people who reported low pressure (1 SD below the mean), the sex difference in test anxiety was higher (b = −0.03, SE = 0.01, 95% CI [−0.06; −0.01]) than for test takers who perceived the situation as a high-pressure situation (1 SD above the mean; b = −0.01, SE = 0.01, 95% CI [−0.03; −0.00]; see also **Figure 3**).
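For readers who prefer equations, one common way to write a moderated mediation model of this kind is shown below. The symbols are assumptions introduced here for illustration and are not notation from the study: $X$ stands for sex, $M$ for general test anxiety, $W$ for Feeling pressured, and $Y$ for the outcome.

$$
\begin{aligned}
M &= i_M + a_1 X + a_2 W + a_3 XW + e_M \\
Y &= i_Y + c'_1 X + c'_2 W + c'_3 XW + b M + e_Y \\
\omega(W) &= (a_1 + a_3 W)\, b
\end{aligned}
$$

Under this notation, the conditional indirect effects at 1 SD below and 1 SD above the mean correspond to $\omega(W)$ evaluated at those two values of $W$, and the index of moderated mediation is the product $a_3 b$, that is, the change in the indirect effect per unit change in $W$.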

# Effects of Test Situation Perception and General Test Anxiety on Fairness Evaluations

Parallel to the analyses used to address Hypothesis 2, we again tested a moderated mediation model to assess the extent of the indirect effect of sex via test anxiety on evaluations of fairness of the selection tool (Hypothesis 3). We also tested whether this indirect effect was influenced by the perception of the situation as a high-pressure situation (see **Figure 2B**). Again, we first reported the results for the simple mediation model. Then we continued to check whether this indirect effect changed in accordance with the perception of the situation.

The sex differences in evaluations of fairness of the selection tool were not significantly mediated by general test anxiety (indirect effect: b = −0.00, 95% CI [−0.01; 0.08]). However, there was still a significant direct effect of sex on evaluations of test fairness (b = −0.13, SE = 0.06, p = 0.038), indicating that men reported higher fairness ratings than women after self-reported test anxiety was entered as a mediator. Sex had an effect on test anxiety, with women reporting higher test anxiety than men (b = 0.49, SE = 0.07, p < 0.001), but higher test anxiety in turn led to no change in the extent to which the selection tool was perceived to be fair (b = 0.01, SE = 0.03, p = 0.848).

TABLE 7 | Moderated mediation of sex predicting test anxiety and evaluations of the fairness of the selection tool via feeling pressured.

*N* = 730.

Test takers' perceptions of the test situation concerning Feeling pressured did not affect the sex difference in the extent to which the selection tool was perceived to be fair (b = −0.03, SE = 0.07, p = 0.919) but had an effect on the size of the sex difference in test anxiety (b = −0.12, SE = 0.06, p = 0.025): The more the situation was perceived to be a high-pressure situation, the smaller the difference in self-reported test anxiety between men and women became. However, the perception of the situation as a high-pressure situation did not constitute a moderation of the indirect effect because the already mentioned effect of self-reported test anxiety on test takers' evaluation of the test was so small. Nevertheless, analyses revealed an effect of Feeling pressured on the evaluations of the test situation: The more the situation was perceived to be a high-pressure situation, the lower the ratings of the test fairness were (b = −0.13, SE = 0.02, p < 0.001). The results for the model are also presented in **Table 7** and in **Figure 2B**.

# DISCUSSION

This study was the first to investigate sex differences in the perception of a real high-stakes test situation and to address the question of whether observed differences between men and women in test performance and evaluations of the fairness of the test can be explained by taking into account a thus far disregarded source (i.e., situation perception) and a well-investigated source (i.e., test anxiety) of construct-irrelevant variance. To implement this new approach, we analyzed data from a real university aptitude test while also considering the test takers' evaluations of the test.

First, we hypothesized sex differences in test takers' perceptions of the test situation with respect to their perceptions of the characteristics of the situation, whether they felt pressured, whether they felt stimulated, and their perceived opportunities to socialize and to deceive. Analyses partly supported our expectations and revealed sex differences on the dimensions Opportunity to deceive and Opportunity to socialize: More than women, men seemed to view the test situation as an opportunity to be dishonest ("I could present myself as different from how I really am"; "It was possible to be dishonest with someone") and as a situation that allowed social contact ("Communication with other people was important or desired"; "Close personal relationships were important or could develop"). This finding of higher scores for men with reference to deception is in line with research on differences in dishonest behavior in men and women (Ward and Beck, 1990) and with research on men's greater readiness to show social desirability in responding in personnel selection (Ones and Viswesvaran, 1998). However, the opportunities to cheat on this entrance examination were reduced to a minimum given the highly standardized test scenario, accompanied by several trained supervisors and the computerized test. The finding that men reported higher ratings with reference to social aspects goes against Sherman et al.'s (2013) results, which revealed higher scores for women on the social dimension. Thus, the different results may be explained by the different connotations of the items that were employed on the one hand, but they may also be a result of the different types of situations that were investigated: Whereas Sherman et al. (2013) investigated situations in the context of daily life, we focused on an atypical situation: a high-stakes test situation.
Taking the items into consideration in our study, it seems that men may have seen this high-stakes test more as an opportunity to interact, network, and compete against others (see e.g., Niederle and Vesterlund, 2007). However, the analyses did not reveal any sex differences on the dimensions Feeling pressured and Feeling stimulated: Men and women seemed to similarly perceive the test as a high-pressure situation and as stimulating. Evaluating the effect sizes of the sex differences, it appears that they are small, with a maximum of d = 0.20. However, it is important to note that there was no kind of experimental manipulation, and test takers responded to a real situation (see Sherman et al., 2013; section Size of Effects), a situation that was supposed to be the same for every person taking the test.

With respect to the second hypothesis, we expected an indirect effect of sex through test anxiety on overall test performance, influenced by the perception of the test situation as a high-pressure situation. Analyses revealed that men received higher overall test performance scores than women and that this finding could be attributed at least in part to an indirect negative causal effect of test anxiety. The lower overall test performance exhibited by women was partly explained by their higher general test anxiety, a finding that is in line with previous research (e.g., Osborne, 2001; Chapell et al., 2005) and indicates that construct-irrelevant variance was present to some extent in the test takers' results. Situation perception had no effect on overall test performance or on the connection between sex and test performance. However, there was a significant positive relation between the perception of the test situation as a high-pressure situation and general test anxiety: Higher scores of Feeling pressured were connected with higher test anxiety in men, which eventually led to the result that the difference in test anxiety between women and men was lowest in the group of test takers who particularly perceived the situation as a high-pressure situation (one standard deviation above the mean). This positive association between test anxiety and feelings of pressure supports Sherman et al.'s (2013) findings, which indicated that personality (in this case test anxiety) is a central and reliable component when it comes to differential situational construal.

Finally, with respect to the third hypothesis, we expected an indirect effect of sex on evaluations of fairness through test anxiety, influenced by the perception of the situation as a high-pressure situation. Results showed that women evaluated the selection tool as less fair in comparison with men. Whereas test anxiety did not affect the connection between test takers' sex and their fairness ratings, the data revealed that higher levels of Feeling pressured led to lower evaluations of the fairness of the selection tool. This negative relation is indeed not surprising because feeling pressured and inconvenienced during a test are important aspects of the overall evaluation of the test (e.g., Kersting, 2008). Therefore, the rather negative perceptions of the situation as high-pressure could in this sense also have reflected test takers' negative affective states, an interpretation that would be in line with Horstmann and Ziegler's (2018) results concerning the considerable overlap between the effects and perceptions of situations. However, negative attributions toward medical aptitude tests in Austria seem plausible, especially for women, given the annual reporting that casts doubt on the fairness of such proceedings (see e.g., online articles in kurier.at: Medizin-Aufnahmetest: Gender Gap bei Ergebnissen [Medical entrance test: Gender gap in results], 2015 and derstandard.at: Medizin-Aufnahmetest: Gender-Gap heuer wieder etwas größer [Medical entrance test: Gender gap this year slightly bigger again], 2017). Nevertheless, additional data with further independent measures of test situation construal and situational effects are needed in order to support or refute this argument.

When it comes to a competitive scenario, women face a different situation than men, as Gneezy et al. (2003) noted: "If women believe (even if incorrectly) that men are somewhat more skilled [. . . ] and they take the gender of their competitors as a signal of their ability (and maybe even take gender as a signal of their own ability), then a man and a woman face a different situation in the tournament" (p. 1058). These considerations are in line with gender roles, which classify women as highly qualified in communal scenarios and men as highly qualified in situations that call for assertiveness and mastery (Eagly and Miller, 2016). The results of several large studies (e.g., Colom et al., 2000; Colom and García-López, 2002; see also a review by Halpern and LaMay, 2000) have demonstrated that men and women are equivalent with reference to their general intelligence. However, men have been found to rate their own numerical IQ and their overall IQ higher than women do when it comes to self-estimated intelligence (Furnham et al., 2001; Ortner et al., 2011; see also a meta-analysis by Syzmanowicz and Furnham, 2011). Furnham et al. (2001) discussed the sex differences in self-estimations as influenced by lay conceptions about general intelligence and mathematical and spatial abilities, which are male normative. Such widely known stereotypes are supposed to impair the targets of these stereotypes, in this case women, and can be a driver of sex disparities when it comes to a high-stakes test situation.

Different reasons for the ongoing underrepresentation of women in STEM fields have been discussed (see e.g., Blickenstaff, 2005, for an overview), especially the effects of stereotype threat (Shapiro and Williams, 2012). Given the findings of this study, it seems reasonable to establish perceptions of the test situation as another approach in this context because test situations are an important part of a student's life, and they may have an important impact on career decisions. For example, research has revealed that higher grades in science, technology, engineering, and math (STEM) courses increase a student's probability of continuing with a STEM major (Griffith, 2010). Therefore, we advocate for more empirical research in this area to better understand the interplay between the situational characteristics of high-stakes situations, personality traits such as general test anxiety, and performance differences in men and women, especially in the light of consequences concerning further career implications.

# LIMITATIONS AND OUTLOOK

Several limitations need to be discussed in order to evaluate the given results. Due to the novelty of our research approach, the questions for assessing test takers' perceptions of the situation were based on an already existing form that was developed for investigations of daily life situations. We adapted the items, but we still think there is room for improvement in the formulation of the items so that they will better fit the special requirements of test-taking in a high-stakes situation. Future studies could thus seek to further develop this approach and, on the basis of the gained knowledge, use more selective items that can capture relevant aspects of the test situation.

The procedure of presenting the evaluation form directly after the 4-h aptitude test may have resulted in an insufficiently differentiated picture of test takers' perceptions because this procedure provided only an overall impression of the individuals' perceptions of the test situation. Nevertheless, it was not possible to evaluate the test situation in parts (e.g., after each task on the test) and to further examine whether the different tasks on the aptitude test induced different outcomes in test takers' perceptions. In addition, administering the evaluation form after the test might suffer from the disadvantage that test takers were fatigued, and asking participants about their perceptions of the test situation as well as their general test anxiety immediately after the admissions test may have led to ratings that were biased by expectations of success or frustration. Nevertheless, this limitation could not have been avoided because there was no opportunity to contact all of the participants before and after the admission procedure. In this regard, a reviewer raised the question of whether test anxiety may reflect a different type of anxiety that is related to the anticipated outcome of the high-stakes test. The assessment of test anxiety employed in this study (TAI-G; Wacker et al., 2008) was intended as an assessment of general test anxiety in order to avoid contaminations by a test taker's beliefs about his or her own performance. To make this purpose as clear as possible, participants were explicitly asked to respond to these items by stating what was generally true for them in test situations. Further, as referred to in section Test Anxiety, the TAI-G has been shown to assess more trait-related stable individual differences than situational effects (Keith et al., 2003). However, future research may include test takers' performance expectations as a covariate in order to avoid a possible impact of low performance expectations on the assessment of general test anxiety.

Finally, although the information the test takers provided was anonymous, and we made sure to emphasize that it would have no influence on the evaluations of the test takers' performance, we cannot be certain that the test takers' answers were free from social desirability. There have been discussions in the literature, for example, about the idea that even if men and women experience a condition similarly, women express their emotions differently (for an overview, see Vigil, 2009). Due to differences in gender roles, which prescribe appropriate behaviors for men and women (Eagly, 1987; Eagly and Wood, 1991), reporting negative cognitions such as anxiety may be less appropriate for men than for women (see e.g., Feingold, 1994). However, a qualitative analysis after a real-life testing scenario in which test takers are encouraged to answer the question of why a high-stakes test could generally, for women and men, be perceived as fear-triggering and unfair may be able to shed more light on this question. In this context, future research could further investigate the effect of the perceptions of a test situation on test performance in a controlled stereotype-free condition vs. a stereotype-threat condition. The perception of the test situation as positive and challenging, for example, could enhance women's motivation in a stereotype-free condition and serve as a buffer in a stereotype-threat condition.

# CONCLUSION

It is a practitioner's duty to provide every person who takes a test the same chance to show his or her knowledge, skills and abilities and to thereby follow the standards for test fairness (e.g., the Standards for Educational and Psychological Testing). However, the results of this study raise the question of the comparability of test situations for women and men. The present research contribution aimed to take a first step toward highlighting the importance of analyzing aspects of women's and men's different perceptions of an important test situation as a possible source of construct-irrelevant score variance, resulting in a contribution to sex differences in test performance that can have major impact on further career developments. Increasing knowledge of relevant influences may provide the chance to develop test situations or methods that minimize these effects and allow women to excel.

# ETHICS STATEMENT

This data collection was carried out in accordance with the recommendations of the American Psychological Association's Ethical Principles in the Conduct of Research with Human Participants. The protocol was approved by the Institutional Review Board at Salzburg University. All subjects filled out the evaluation form voluntarily and could withdraw at any time without any consequences.

# AUTHOR CONTRIBUTIONS

All three authors developed the study concept. JL carried out the data collection and performed the data analysis under supervision of TS and TO. JL drafted an initial version of the manuscript that was refined and revisited successively by TO and TS.

# ACKNOWLEDGMENTS

We thank Sandra Augart and Sebastian Auer for help with data collection and Alexandra Schoor and Julian Fuchs for data entry. Many thanks go to Jane Zagorski for proofreading the article and Freya Gruber and Carina Gargitter for helpful suggestions. We would like to thank the editors Alice H. Eagly and Sabine Sczesny and the reviewers David Reilly and Isabelle Cherney for valuable comments on an earlier version of this article. We also thank the participants for providing their data by filling out the evaluation form.

#### REFERENCES


Zeidner, M., and Schleyer, E. J. (1998). The big-fish–little-pond effect for academic self-concept, test anxiety, and school grades in gifted children. Contemp. Educ. Psychol. 24, 305–329. doi: 10.1006/ceps.1998.0985

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Leiner, Scherndl and Ortner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Blaming the Victim of Acquaintance Rape: Individual, Situational, and Sociocultural Factors

#### Claire R. Gravelin<sup>1</sup> \*, Monica Biernat<sup>2</sup> and Caroline E. Bucher<sup>3</sup>

<sup>1</sup> Division of Social and Behavioral Sciences, Franklin Pierce University, Rindge, NH, United States, <sup>2</sup> Department of Psychology, The University of Kansas, Lawrence, KS, United States, <sup>3</sup> Department of Psychology, SUNY Geneseo, Geneseo, NY, United States

Victims of rape are uniquely vulnerable for being blamed for their assault relative to victims of other interpersonal crimes and thus much research has been conducted to understand why this is the case. But the study of victim blaming in acquaintance rape cases is hindered by contradictory empirical results. Early investigations in victim blaming often treated acquaintance rapes and stranger rapes as synonymous and thus much of these data are suspect for drawing conclusions particular to acquaintance rape. This paper provides a comprehensive review of the research literature on victim blame in acquaintance rape cases, highlighting inconsistencies and drawing particular attention to areas of research in need of further exploration. Specifically, we review the commonly studied individual (perceiver) factors that influence victim blaming, as well as common situational (target) factors included or manipulated within sexual assault scenarios. Our review reveals many inconsistent findings and interactions between perceiver and scenario factors. In an effort to make sense of these complex interactions and inconsistent findings, we suggest a need for more transparency in describing the scenarios used in research on victim blaming in sexual assault cases and greater empirical attention to sociocultural factors that may influence blaming tendencies.

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Barbara Krahé, Universität Potsdam, Germany Guillermo B. Willis, University of Granada, Spain

\*Correspondence: Claire R. Gravelin gravelinc@franklinpierce.edu

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

> Received: 04 April 2018 Accepted: 19 November 2018 Published: 21 January 2019

#### Citation:

Gravelin CR, Biernat M and Bucher CE (2019) Blaming the Victim of Acquaintance Rape: Individual, Situational, and Sociocultural Factors. Front. Psychol. 9:2422. doi: 10.3389/fpsyg.2018.02422

Keywords: acquaintance rape, blame, responsibility, sexual assault, sexual violence, victim blame

# INTRODUCTION

For anybody whose once normal everyday life was suddenly shattered by an act of sexual violence– the trauma, the terror, can shatter you long after one horrible attack. It lingers. You don't know where to go or who to turn to. . .and people are more suspicious of what you were wearing or what you were drinking, as if it's your fault, not the fault of the person who assaulted you. . .We still don't condemn sexual assault as loudly as we should. We make excuses, we look the other way. . .[Laws] won't be enough unless we change the culture that allows assault to happen in the first place.


Sexual assault is a pressing and prevalent concern in our society with estimates that nearly 1 in 5 women in the United States will be sexually assaulted in her lifetime. Of those women who have been sexually assaulted, 41% have been assaulted by an acquaintance (Black et al., 2011). These numbers likely underestimate prevalence, as sexual assaults are one of the most under-reported crimes (Fisher et al., 2000, 2003; Rennison, 2002). In the unveiling of the "It's On Us" campaign to end sexual assault on college campuses, President Barack Obama highlighted not only the trauma experienced by rape victims due to their assault, but also the secondary victimization many victims experience due to the negative reactions of those around them (see also Williams, 1984; Ulman, 1996). Of these negative reactions, perhaps the most harmful is the frequent tendency to blame the victim for their assault.

Unlike many other interpersonal crimes such as robberies or muggings, victims of sexual assault are particularly vulnerable to being blamed for their attack (Bieneck and Krahé, 2011; Gordon and Riger, 2011), and thus victim blaming in sexual assault cases has been the focus of many empirical investigations. However, despite the extensive amount of research performed on this topic, there is little consensus of when victim blaming will or will not occur in sexual assault cases (see Grubb and Harrower, 2008 and Grubb and Turner, 2012, for a review).

Adding to the confusion, existing reviews on victim blaming often combine the findings across various types of sexual assault (Langley et al., 1991; Pollard, 1992; Whatley, 1996; Grubb and Harrower, 2008; Grubb and Turner, 2012). For instance, Grubb and Harrower (2008) reviewed differences in victim blaming between stranger and acquaintance rape, but then combined these types of sexual assault when discussing the influence of gender and perceived similarity on victim blame. As different factors may matter for victim blaming, combining findings across sexual assault types may be problematic. The goal of this paper is to highlight what we know (and do not know) about victim blaming in acquaintance rape.

The opening statement by President Obama also highlights another important and often ignored element that contributes to the continued tendency to blame victims of sexual assault – the role of cultural structures, beliefs, and practices. Research on sexual assault and victim blame typically focuses on one of two perspectives. The first considers features of the observer as they influence victim blaming tendencies, which we refer to as individual factors. Often discussed as the "rape perception framework," the second perspective focuses on aspects of the victim, perpetrator, or characteristics of the assault as they influence victim blame (Pollard, 1992). We refer to these elements as situational factors. Neither of these perspectives, however, addresses a third critical factor affecting victim blame: societal and institutional factors. Institutional and societal level factors refer to broader cultural influences such as gender roles, media, and rhetoric surrounding sexual assault that contribute to an overall environment promoting victim blame. The current review will consider both individual-level and situational-level variables as they affect victim blaming in acquaintance rape cases but will also discuss the role of institutional and societal-level factors. Further, we consider how all three elements may influence one another (see **Figure 1**).

This paper is intended to provide a comprehensive review of the research literature on victim blaming in acquaintance rape and the conditions under which victim blaming is influenced by individual and situational factors. We begin by briefly defining what we mean by sexual assault, acquaintance rape, and victim blaming. We then review the research literature and propose a broader framework that includes attention to societal and institutional factors as important contributors to victim blame.

# SEXUAL ASSAULT

Current conceptions of rape and sexual assault typically include penetration, whether it be genital, oral, or anal, by a part of the perpetrator's body or an object through the use of force or without the victim's consent. While not discounting the victimization of men, sexual assault is a gendered crime, with women much more likely to be victimized than men (Brownmiller, 1975; Koss et al., 1987; Koss and Harvey, 1991; Hayes et al., 2013). Indeed, compared to one in five American women, one in 71 men will be assaulted in his lifetime (Black et al., 2011). Thus, while male victimization is indeed problematic, given the highly gendered nature of this crime, the current work focuses exclusively on female victims.

Researchers investigating the prevalence and consequences of sexual assault typically distinguish among three types of sexual assault: stranger rape, date/acquaintance rape, and marital rape. Stranger rape refers to a sexual assault in which the victim and assailant have no prior relationship or acquaintance with one another. When an individual has been sexually assaulted by someone she knows – for instance a friend, classmate, or someone she has gone on a few dates with – it is classified as an acquaintance or date rape (Calhoun et al., 1976; Check and Malamuth, 1983; Estrich, 1987; Johnson and Jackson, 1988; Quackenbush, 1989), but "date rape" is also used to describe assaults that occur in established relationships (Shultz et al., 2000). Finally, sexual assault that occurs within a marriage has been deemed a legal form of rape, with the first successful marital rape conviction occurring in the United States in 1979 (Pagelow, 1988). These distinctions may not provide as much clarity as desired. For example, assault by one's unmarried romantic partner may have more in common with marital rape than acquaintance rape; assault while on a first date may differ considerably from assault by a classmate to whom one has never spoken. The current review will focus on sexual assaults classified as acquaintance rape, and we will note distinctions between dating-related and non-dating related acquaintance rape where relevant. Gaining a greater understanding of victim blaming in acquaintance rape is particularly important given that the majority of rapes are perpetrated by someone known to the victim (Russell, 1984; Koss et al., 1988; Pfeiffer, 1990), and that acquaintance rape cases have a lower probability of conviction in the courts than those that fit with a stranger rape script (Estrich, 1987; Larcombe, 2002).

# BLAMING THE VICTIM

Blaming the victim refers to the tendency to hold victims of negative events responsible for those outcomes (Ryan, 1971; Eigenberg and Garland, 2008). While victim blaming can occur in a variety of situations, it appears to be particularly likely in cases of sexual assault (Bieneck and Krahé, 2011). Assailants do tend to be found as more culpable for sexual assault than victims (see Grubb and Harrower, 2008), but victims are blamed as well, to a degree that varies substantially depending on features of the assault, the victim, and the perceiver.

There is currently little consensus about the predictors of victim blaming (see Grubb and Harrower, 2008; Grubb and Turner, 2012). In fact, the sexual assault literature appears to offer only one clear conclusion: Victims of stranger rape are the least likely to be blamed for their assault; victims of marital rape are much more likely to be found culpable (Ewoldt et al., 2000; Monson et al., 2000). Direct comparisons between stranger rape and acquaintance rape typically find less blame in the former case (Amir, 1971; Calhoun et al., 1976; Donnerstein and Berkowitz, 1981; L'Armand and Pepitone, 1982; Janoff-Bulman et al., 1985; Tetreault and Barnett, 1987; Muehlenhard and Hollabaugh, 1988; Bridges and McGrail, 1989; Quackenbush, 1989; Pollard, 1992; Hammock and Richardson, 1997; Sinclair and Bourne, 1998; Krahé et al., 2007; Grubb and Harrower, 2008; Bieneck and Krahé, 2011; Droogendyk and Wright, 2014; McKimmie et al., 2014; Ayala et al., 2015; Stuart et al., 2016, but see Persson et al., 2018). Further, acquaintance rape victims are blamed less than marital rape victims (Ferro et al., 2008). In short, as the victim and assailant become increasingly familiar and romantically involved, victim blame increases (Bridges, 1991; Simonson and Subich, 1999; Krahé et al., 2007; Bieneck and Krahé, 2011; Pederson and Strömwall, 2013, but see McCaul et al., 1990, and Klippenstine et al., 2007).

# MEASURING BLAME

The measurement of "blaming the victim" may seem straightforward, but it varies substantially in the literature. Researchers typically present participants with a scenario of a sexual assault case, then some researchers assess blame, others assess perceived responsibility, others utilize a combination of both blame and responsibility, and still others assess related constructs. Blame is typically defined as a value judgment of the extent to which one should be held accountable for (and perhaps suffer from) a negative event (Bradbury and Fincham, 1990; Calhoun and Townsley, 1991; Stormo et al., 1997) and is typically measured using a rating scale (e.g., How much is the victim to blame for her assault?). Responsibility, defined as the extent to which victims' choices or actions contributed to their assault (Stormo et al., 1997), is typically assessed by asking participants to assign a percentage of responsibility to the involved parties. Thus, blame may be a harsher assessment than responsibility and perceivers may therefore be more comfortable in attributing responsibility than blame.

Some researchers have argued that blame and responsibility measures can be used interchangeably (Bradbury and Fincham, 1990; Calhoun and Townsley, 1991); others argue that they are distinct constructs and should be treated as such (Richardson and Campbell, 1980, 1982; Critchlow, 1985; Shaver and Drown, 1986; Richardson and Hammock, 1991). The data are inconsistent on these points. For example, Stormo et al. (1997) found their measures of responsibility and blame to be highly positively correlated (see also Krulewitz and Nash, 1979; McCaul et al., 1990), and the two measures were similarly responsive to variations of victim intoxication in sexual assault scenarios. In contrast, Richardson and Campbell (1982) found that victim blaming was unaffected by level of victim intoxication, but drunk victims were judged more responsible for assault than sober victims. Relatedly, in assessing how dating scripts influence victim culpability, Basow and Minieri (2011) found men were more likely to blame victims for their assault than women, while no differences emerged in their separate measure of victim responsibility. Of course, non-significant effects on either measure could be due to floor/ceiling effects, particularly given the high degree of correlation between the constructs (Krulewitz and Nash, 1979; McCaul et al., 1990; Stormo et al., 1997).

Victim blame has also been assessed using other related constructs, including assessments of "fault" (Jones and Aronson, 1973; Kahn et al., 1977; Ford et al., 1998) and the extent to which the victim is perceived to have "enjoyed" the experience (Simonson and Subich, 1999). Others claim that simply failing to label a rape as a rape is a form of victim blaming (Lonsway and Fitzgerald, 1994), although labeling is more commonly used as a manipulation check to ensure that participants perceive scenarios as assaults (see Maurer and Robinson, 2008). Other more general markers of victim blame that are not answered in response to a specific case include rape myth endorsement (the extent to which participants endorse "prejudicial, stereotyped, or false beliefs" about sexual assault, victims, and assailants; Burt, 1980, p. 217) and the Attitudes Toward Rape Victims Scale (Ward, 1988). However, these assessments often reflect beliefs surrounding stranger rapes (e.g., "Rapes only occur in dark alleys," Payne et al., 1999; Dupuis and Clay, 2013) and thus should not be used as a measure of victim blame in acquaintance or marital rape situations. We view rape myth endorsement as a potential predictor of blame in acquaintance rape, but not as an appropriate measure of victim blame itself.

This review will consider the most common conceptualizations of victim blame (blame, responsibility, and fault) that are specific to a particular victim depicted in a scenario rather than rape myth acceptance, perceived enjoyment, or labeling of an event as rape (see **Table 1** for a comprehensive listing of measures used in the reviewed studies).

# METHODS

To identify the extant research literature, we searched combinations of the keywords rape or sexual assault with victim blame, limited to date or acquaintance rape, in electronic databases including PsycINFO and ProQuest Dissertations and Theses, covering work published through December 2017 (inclusive). Additional articles were found by conducting forward and backward searches utilizing reference sections of retrieved articles and earlier reviews through Google Scholar. This approach yielded 137 articles, which were then assessed for fit according to our inclusion criteria. The review was restricted to studies of lay observers (e.g., studies of therapists' tendency to victim blame and personal accounts by victims and perpetrators were excluded). In addition, only studies of victim blame in cases involving a female victim and male assailant, most often depicted via a written or visual scenario, were included. The typical study exposed participants to a vignette/scenario/description of an acquaintance rape, then assessed victim blaming. Following these exclusions, 102 empirical studies on acquaintance rape that used at least one measure of victim blame, as defined above, were located. Our goal was to identify key factors that have been considered as predictors of victim blaming in these studies, to review what has been learned about each, and to highlight inconsistencies and gaps in the literature. These factors generally fall into two categories: features of the perceiver (individual level factors) and features of the acquaintance rape itself (situational factors). We offer a narrative rather than empirical review: Meta-analysis was not appropriate to our goals given the large number of disparate predictors we considered, the often small number of cases of each type, and the myriad moderators (often unique to particular subsets of studies) that point to nuanced patterns rather than main effects.

# RESULTS

# Individual Level Factors as Predictors of Victim Blaming Gender

Given the gendered nature of sexual assault, it is unsurprising that many studies have examined how participant gender may influence evaluations of blame in sexual assault (see Grubb and Harrower, 2008 for a review). There are two contradictory hypotheses one might have about how gender affects victim blaming. On the one hand, because rape is mainly a concern of women, they might be expected to blame less as a function of ingroup solidarity. On the other hand, "just world" ideology (Lerner, 1970, 1980; Hafer, 2000) might suggest they might blame more: Precisely because of the greater threat that sexual assault poses to women, victim blaming may help women distance themselves from the reality that they could be victimized themselves.

#### TABLE 1 | Measurement type used to assess victim blame and situational components featured in studies included in review.

<sup>∗</sup> = full scenarios obtained. Blame is typically measured as how much the victim is to blame for her assault (rated on a scale with endpoints such as "not at all" to "to a great extent/completely"); responsibility is typically measured as a percentage of responsibility assigned to the involved parties; fault is typically measured as how much the victim is at fault for what happened (rated on a scale with endpoints such as "not at all" to "very much/extremely"). Abbreviations used to depict which of the following components were present and/or manipulated in scenario(s) for each study: A, Alcohol; D, Drugs; AP, Appearance; SH, Sexual history/actions; F, Force; R, Resistance; RA, victim and/or assailant race/ethnicity; AB, Scenario details absent.

Many studies have found that women are less likely to blame victims of acquaintance rape than men (Basow and Minieri, 2011, although gender differences only emerged in their assessment of victim blame, and not in their separate measure of victim responsibility; Calhoun et al., 1976; Selby et al., 1977; Gerdes et al., 1988; Johnson and Jackson, 1988; Kanekar and Nazareth, 1988; Johnson et al., 1989; Bell et al., 1994; Schuller and Wall, 1998; Varelas and Foley, 1998; Lambert and Raichle, 2000; Geiger et al., 2004; Klippenstine et al., 2007; Krahé et al., 2007; Yamawaki et al., 2007; Black and Gold, 2008; Hammond et al., 2011; Casarella-Espinoza, 2015; Ferrão et al., 2016). A number of other studies, however, have produced null effects of gender on victim blaming (Gilmartin-Zena, 1983; Howells et al., 1984; Krahé, 1988; McCaul et al., 1990; Kanekar et al., 1991; Kanekar and Seksaria, 1993; Branscombe et al., 1996; Nario-Redmond and Branscombe, 1996; Hammock and Richardson, 1997; Abrams et al., 2003; Frese et al., 2004; Girard and Senn, 2008; Bieneck and Krahé, 2011; Romero-Sánchez et al., 2012; Loughnan et al., 2013; Pederson and Strömwall, 2013; Bongiorno et al., 2016; Landström et al., 2016; Qi et al., 2016; Persson et al., 2018; although these studies assessed victim culpability for being "sexually touched" at a bar and thus it is unclear if a rape has occurred; Scronce and Corcoran, 1995; Stormo et al., 1997; Sims et al., 2007; Strömwall et al., 2013). No studies have found that women engaged in greater victim blaming than men. Thus, the just world prediction currently does not receive support.

A meta-analysis conducted by Whatley (1996) on victim blaming failed to find significant moderation of blame by participant gender. It is problematic to draw any conclusions from this meta-analysis, however, as it combined studies of acquaintance rape with stranger rape. Meta-analyses on rape myth endorsement do indicate men are more accepting of rape myths than women (Anderson et al., 1997; Suarez and Gadalla, 2010). However, as previously noted, rape myths are a problematic marker of victim blaming in acquaintance rape since rape myths more closely reflect stranger rape situations.

We suspect that the inconsistent findings regarding the role of participant gender on blame are likely due to varying components of the scenarios used in victim blaming studies. For instance, Bell et al. (1994) failed to find gender differences, but the scenarios used were brief and vague (only two sentences long). Hammond et al. (2011) exposed participants to a lengthy scenario, several paragraphs long and rich in detail including background information about both the victim and assailant and information about behavior prior to the assault (heavy drinking and flirting). In this study, women were found to blame the victim significantly less than men.

#### Race/Ethnicity/Nationality

Very little research has examined the effect of participant race or ethnicity on victim blaming in acquaintance (or stranger) rape. Of those studies that have done so, the findings are inconsistent. Bell et al.'s (1994) study of undergraduates' reactions to a "typical" date rape scenario (victim assaulted after a date) found no effect of participant race (African American, Asian, and Caucasian participants) on victim blaming. Casarella-Espinoza (2015), however, found greater victim blaming among Hispanic participants compared to their Caucasian counterparts. While both of these studies examined blame within a scenario that was likely interpreted as involving a White victim and White assailant, Varelas and Foley (1998) examined how participant race might interact with race of perpetrator or victim. White and Black participants were randomly assigned to read an acquaintance rape scenario that depicted a Black or White female victim and a Black or White male assailant. In general, White participants were less likely to blame victims than Black participants. This main effect, however, was qualified by a significant three-way interaction with victim and assailant race: White participants blamed victims the least when the victim was White and the assailant was Black, while Black participants blamed victims the most when the victim was Black and the assailant was White.

The discrepancies between these two studies could be due to the differing scenarios used (assault after a date versus assault after accepting a ride home from a customer), or to the differing ways in which blame was evaluated (blame versus responsibility), or to the important moderating feature of assailant and victim ethnicity. In any case, related literature provides some support for the argument that, at least in the United States, minority group members may blame sexual assault victims more than ethnic majority members (Caucasians; cf. Feild, 1978; see also Lonsway and Fitzgerald, 1994). Several studies assessing general attitudes toward rape victims and endorsement of rape myths have found less favorable reactions and greater endorsement of rape myths among African-American samples (Williams and Holmes, 1981; Giacopassi and Dull, 1986; Dull and Giacopassi, 1987), Asian-American samples (Mori et al., 1995), and Hispanic-American samples (Fischer, 1987; Jimenez, 2002; Jimenez and Abreu, 2003) in comparison to their Caucasian counterparts (see Suarez and Gadalla, 2010, for a review). Future research should continue to explore the effect of participant race and ethnicity on victim blaming in acquaintance rape cases, especially in combination with race/ethnicity of victim and assailant.

Relatively few studies have compared participants from differing racial/ethnic groups outside of a North American context. Exceptions include Pederson and Strömwall (2013), who compared British and Swedish non-student participants' victim blaming in an acquaintance rape scenario and found no differences. Yamawaki and Tschanz (2005) compared Japanese and American undergraduate students and found higher victim blaming by Japanese than American students (this was true for stranger, acquaintance, and marital rape depictions). In an Australian sample, Bongiorno et al. (2016) found that a non-resisting victim was seen as more blameworthy when her perpetrator was characterized as being culturally similar (Western) to the participant, but cultural similarity had no effect when the victim physically resisted the assault.

#### Rape Myth Endorsement (RME)

As previously stated, some researchers have used RME as an indicator of victim blame. This is problematic because rape myth scales focus on stranger rape and assess beliefs about rape at a general rather than specific level. Nonetheless, RME may matter for assessing blame in particular acquaintance rape cases. Those high in RME tend to believe that only stranger rape is "real rape." Given that acquaintance rapes deviate from stranger rape both in recognition as rape as well as perceived severity (e.g., L'Armand and Pepitone, 1982; Tetreault and Barnett, 1987; Gerdes et al., 1988; Bridges, 1991), endorsement of rape myths may predict even greater victim blaming in acquaintance rape as these do not fit typical conceptualizations of a "real" rape.

Research clearly supports a positive relationship between endorsement of rape myths and victim blaming in acquaintance rape cases (Lonsway and Fitzgerald, 1994; Stormo et al., 1997; Schuller and Wall, 1998; Varelas and Foley, 1998; Frese et al., 2004; Hayes-Smith and Levett, 2010; Masser et al., 2010; Basow and Minieri, 2011; Hammond et al., 2011; Romero-Sánchez et al., 2012; McKimmie et al., 2014; Starfelt et al., 2015; Qi et al., 2016; Persson et al., 2018). Additionally, the relationship between rape myth endorsement and greater victim blame tends to be strongest among men (Lonsway and Fitzgerald, 1994; Hayes-Smith and Levett, 2010; Hammond et al., 2011). Using a related construct, the Perceived Causes of Rape scale (Cowan and Quinton, 1997), Krahé et al. (2007) found that some subscales of this instrument showed the strongest positive associations with victim blaming: beliefs that rape is due to female teasing and to male pathology, and that men lack control over their sexual urges.

#### Gender Role Attitudes and Identity

Rape Myth Endorsement is significantly correlated with restrictive beliefs about women's roles and rights (see Suarez and Gadalla, 2010). Studies of victim blame in acquaintance rape have also documented a positive relationship between blame and endorsement of traditional gender roles (Howells et al., 1984; Stormo et al., 1997; Simonson and Subich, 1999; Yamawaki and Tschanz, 2005; Sims et al., 2007, but see Hammond et al., 2011). In fact, Simonson and Subich (1999) found that after controlling for gender role endorsement, their finding that men blamed the victim more than women was eliminated; gender role attitudes may be a stronger predictor of blame than participant gender. In one study that manipulated the gender traditionality of the date that preceded an acquaintance rape, victim responsibility and perceived justifiability of the assault were highest in the traditional case (when the man exclusively paid for an expensive date) compared to other scenarios (shared payment, inexpensive date; Basow and Minieri, 2011).

Others have examined the effects of hostile and benevolent sexism (Glick and Fiske, 2001) on victim blame. Several researchers have documented a positive relationship between benevolent sexism and victim blame (Abrams et al., 2003; Masser et al., 2010, although this effect was only present among victims perceived to violate victim and gender stereotypes; Viki and Abrams, 2002; Yamawaki et al., 2007; Pederson and Strömwall, 2013; Persson et al., 2018). The relationship between hostile sexism and victim blaming, however, is more complex. For instance, Pederson and Strömwall (2013) found no relationship, while others have found hostile sexism to predict greater victim blame (Yamawaki et al., 2007; Masser et al., 2010; Persson et al., 2018).

Both benevolent and hostile sexism reflect concerns about maintaining an unequal power differential between men and women: Benevolently sexist attitudes suggest women are lower in status and in need of men's protection, and hostile sexist attitudes suggest that women are trying to usurp men's greater power. Feminist perspectives point to power as a motivation for committing sexual assault (Brownmiller, 1975; Burt, 1980; Lonsway and Fitzgerald, 1994; Ward, 1995) and the effects of these attitudes on victim blame may be construed as legitimizing the current power hierarchy and maintaining gender differentiation. Indeed, research on "precarious manhood" demonstrates that masculinity, unlike femininity, is tenuous and requires continual social validation and defense (Vandello et al., 2008; Vandello and Bosson, 2013). The importance of power dynamics for victim blaming points to the need to consider the societal power structure. For example, victim blaming may increase in settings in which men perceive power threats by women (e.g., in patriarchal versus egalitarian settings).

Gender identification and threats to masculinity/femininity have also been shown to influence victim blaming. In one study, for example, participants received bogus feedback on a "gender identity survey" which either confirmed or threatened their gender identity and then were asked to evaluate a case of acquaintance rape (Munsch and Willer, 2012). Men whose masculinity was threatened blamed the victim more than those whose masculinity was confirmed. Conversely, women whose femininity was threatened blamed the victim less than nonthreatened women. Thus, threats to one's gender identity may heighten the dominant response among men and women resulting in greater blame among men and lesser blame among women, especially among men who derive a large component of their self-concept from their masculinity.

#### Political Attitudes

People who endorse more politically conservative views are also more likely to blame victims of sexual assault (see Anderson et al., 1997 for a review). For example, Lambert and Raichle (2000) found this relationship using three distinct measures of conservatism [self-rating of conservatism, social dominance orientation (Sidanius et al., 1996) and Protestant work ethic beliefs (Katz and Hass, 1988)]. Across all three measures, the more politically conservative the participants were, the more they blamed the victim.

#### Belief in a Just World (BJW)

It is commonly thought that individuals blame victims in order to restore their belief that "good things happen to good people, and bad things happen to bad people" (Lerner, 1970, 1980; Hafer, 2000). The theory of BJW describes victim blaming as a bias that enables people to maintain their beliefs in a predictable and stable environment (Lerner, 1970, 1980; Rubin and Peplau, 1973; Lerner and Miller, 1978) and therefore victim blame should increase to the extent that situations threaten BJW (Hafer, 2000).

But there is little empirical support for the association between just world beliefs and victim blaming in acquaintance rape cases (see Lambert and Raichle, 2000; Hammond et al., 2011; Pederson and Strömwall, 2013; Strömwall et al., 2013; for an exception, see Landström et al., 2016). One study did find that endorsement of BJW predicted blame for victims of sexual assault, but this was only the case among participants placed in a rationalistic mindset (defined as deliberate and effortful processing; Van Den Bos and Maas, 2009). This finding points to a potential reason behind the relative lack of effects of BJW beliefs on victim blaming: Researchers who stress that participants respond with their first, "gut-level," reaction may be bypassing more effortful thought which allows the effect of BJW to influence victim evaluations. It is also possible that BJW more strongly impacts assessments of stranger rape, with high BJW endorsers more likely to blame the victim (Kleinke and Meyer, 1990; Strömwall et al., 2013). In their assessment of belief in a just world on victim assessment across varying relationship types, Strömwall et al. (2013) found BJW to be meaningfully related only to assessments of stranger rape. Specifically, women high in belief in a just world were significantly more likely to blame the victim than women low in belief in a just world, while BJW had no impact on male evaluations of victims.

#### Perceived Similarity and Prior Victimization

The degree to which individuals identify with a victim, either at a superficial level such as similar occupation or attitudes, or at a personal level due to their own experience with victimization, may play a role in evaluations of victim culpability. Perceived similarity to a victim may increase empathy for her experience, resulting in lesser blame (Krebs, 1975). However, it is also possible that greater feelings of similarity, particularly among female observers, heighten feelings of personal threat and distancing through victim blaming. Unfortunately, we found only three studies which assess the role of similarity on victim blaming in acquaintance rape; their findings are inconsistent. Johnson (1995) found no effect of similarity (measured as the extent to which participants felt the victim was "like them") on victim blaming, while Bell et al. (1994) and Harbottle (2015) found that the more similar participants felt to the victim (measured with "how similar do you feel to the woman in this scenario?"), the less they blamed her. In studies of stranger rape, there is also no clear indication of the role that similarity plays on victim evaluation (see Fulero and DeLara, 1976; Kahn et al., 1977; Thornton, 1984).

Participants' prior sexual victimization may also serve as an important contributor to perceived similarity to victims. There is little evidence that prior victimization influences victim blame in acquaintance rape (Coller and Resick, 1987; Bieneck and Krahé, 2011; Harbottle, 2015; Gravelin et al., 2017). Unfortunately, the one study located which did mention a difference in blame assessments between victims and non-victims failed to disclose how they differed (Johnson, 1995).

# Summary of the Effects of Individual Factors on Victim Blame

Myriad individual factors have been examined in studies of victim blame in acquaintance rape, but only a few of these factors have produced consistent findings. Developing a demographic profile of what "type" of participant is most likely to blame victims is limited by a lack of research examining racial/ethnic and national differences, and a focus on college-aged students in Western settings. It does seem to be the case that men endorse rape myths more than women (Lonsway and Fitzgerald, 1994; Hayes-Smith and Levett, 2010; Suarez and Gadalla, 2010; Hammond et al., 2011), but the effect of gender on blame in specific cases of acquaintance rape is less clear-cut.

Furthermore, any effects of participant gender may be due more to endorsement of gender roles and identification with one's gender identity than participant gender itself. Those who endorse traditional gender roles tend to blame victims more, and controlling for gender role endorsement may eliminate effects of gender (Simonson and Subich, 1999). Further, threats to one's masculinity/femininity appear to heighten the prototypical gendered response to victim blaming; men blame victims more and women blame victims less when their gender identity is threatened. Also interacting with gender is RME: Men generally endorse rape myths more than women, and individuals who endorse rape myths engage in more victim blaming. Similarly, men tend to be more politically conservative than women (Pratto et al., 1997; Eagly et al., 2004), and political conservatism predicts victim blaming (Anderson et al., 1997, though only one study has examined this relationship in acquaintance rape scenarios; Lambert and Raichle, 2000).

Some findings also hint at the role of social power in evaluations of victim blame. Both benevolent sexism and the power relations subcomponent of the hostile sexism scale are concerned with maintaining an unequal power differential between men and women. Endorsement of these attitudes predicts greater victim blaming (Viki and Abrams, 2002; Abrams et al., 2003; Yamawaki et al., 2007; Pederson and Strömwall, 2013). Though not described above, one set of studies in which participants' feelings of power and powerlessness were manipulated suggests that powerless men blame victims less than men in a control condition and powerful women tend to blame the victim more than those in a control condition (Gravelin et al., 2017). These findings suggest a need to further consider patriarchal power differentials, a topic we discuss later in the paper.

Despite its direct relevance to issues of victim blame, few studies have examined the association between BJW and victim blaming, and little supportive evidence has been found. Relatedly, examinations of the effects of perceived similarity to the victim have found some, though limited, evidence that those who feel more similar to the victim blame her less for her assault (Bell et al., 1994; Harbottle, 2015). No research establishes a link between prior victimization and subsequent blame of a victim in an acquaintance rape scenario.

As discussed when we reviewed each factor, some of the inconsistencies in the literature may be due to the large variety of scenarios that have been used in the victim blaming literature. Much about victim blaming may have to do with the specifics of the scenario itself, as we know, for example, from the finding that blame is greater in acquaintance rape than stranger rape cases overall. And rather than main effects of demographic and attitudinal factors, these factors may differentially matter depending on the specifics of the scenarios or cases participants are asked to consider. In the section that follows we will detail the different aspects of acquaintance rape vignettes that have been implemented or manipulated in the set of studies under review and will highlight instances in which these situational factors interact with individual factors to influence victim blame.

# Situation Level Factors as Predictors of Victim Blaming

Studies of victim blaming in acquaintance rape cases typically assess participant responses to a provided vignette. These vignettes typically consist of a third-person written account of a sexual assault (but see Janoff-Bulman et al., 1985; Tetreault and Barnett, 1987; Willis, 1992; Dupuis and Clay, 2013), in which various components of the case, the victim, and/or the assailant are manipulated. Below we review the most common elements included and/or manipulated in acquaintance rape scenarios and corresponding findings for these elements. However, of the 102 studies evaluated, only 50 included the full scenarios in their published accounts. After attempting to contact all of the authors with missing vignettes, we were able to obtain the full scenarios of an additional 2 studies, resulting in a total of 52 full scenarios for evaluation. The remaining studies were coded and evaluated based on the available information described by the authors (see **Table 1** for a comprehensive list of components found within scenarios).

#### Presence of Drugs/Alcohol

Drugs and alcohol are common elements of acquaintance rape cases, particularly those that occur on college campuses (Abbey et al., 1996; Benson et al., 2007; Kilpatrick et al., 2007; Krebs et al., 2009). Much research has established a link between alcohol consumption and sexually aggressive behavior (Muehlenhard and Linton, 1987; Koss and Gaines, 1993; Ullman et al., 1999; Locke and Mahalik, 2005). As seen in **Table 1**, 34 of the 102 acquaintance rape vignettes in the identified literature mention alcohol. This does not include the widely used Abrams et al. (2003) scenario, which does not explicitly mention alcohol but implies it by describing the victim as flirting and dancing all night at a party, then inviting the perpetrator home for coffee. Only sixteen of these studies experimentally manipulated the presence/absence of alcohol or varying degrees of intoxication; the remaining vignettes simply indicated alcohol use as a stable characteristic in the scenario.

Eleven of the sixteen studies that manipulated intoxication level found that intoxicated victims were blamed more for an acquaintance rape than sober victims (Richardson and Campbell, 1982; Stormo et al., 1997; Wall and Schuller, 2002; Cameron and Stritzke, 2003; Krahé et al., 2007; Sims et al., 2007; Bieneck and Krahé, 2011; Romero-Sánchez et al., 2012; Landström et al., 2016; Qi et al., 2016), and another found a linear increase in victim blame with level of victim intoxication (Stormo et al., 1997). The opposite effect of intoxication emerged for evaluations of the perpetrator: the more drunk the perpetrator, the more participants excused his behavior (see also Richardson and Campbell, 1982; Cameron and Stritzke, 2003; Johnson et al., 2016; Qi et al., 2016). Using adapted versions of the Stormo et al. (1997) vignettes, Girard and Senn (2008) found that only when the victim was depicted as having received drinks that were stronger than those of her date without her knowledge was she seen as less responsible. Research by Scronce and Corcoran (1995) suggests that women may be more critical of intoxicated victims; female participants found victims of completed or attempted stranger or acquaintance rape as more responsible for their assault if they had been drinking. Examining the combined effects of assailant and victim intoxication further complicates assessments of culpability; research by Klippenstine et al. (2007) found the typical gender effect on victim blame was nullified when both parties were depicted as intoxicated. Further, women blamed victims more than men when the victim was depicted as sober and the assailant as intoxicated.

These studies indicate both that alcohol use is a common feature of acquaintance rape scenarios used in research and that it matters for victim blaming. We suspect that many of the studies for which we could not identify precise scenario content also included alcohol use, a reflection of the common image of acquaintance rape. A more comprehensive understanding of alcohol's role will require that researchers provide complete details about their case scenarios and that this feature be systematically manipulated.

In addition to alcohol use, there is increased societal concern about the use of "date rape drugs" in sexual assaults. Despite this concern, only one study has investigated the effect of date rape drugs on victim blame in acquaintance rape (Girard and Senn, 2008), and these researchers found that the voluntariness of drug use was crucial: Only when the victim voluntarily consumed gamma-hydroxybutyric acid (GHB), an intoxicating sedative, prior to an assault was she seen as more blameworthy than a sober victim. Interestingly, a victim who was slipped GHB unknowingly was not seen as less blameworthy than a sober victim assaulted by a sober perpetrator. Marijuana use was examined in one study, with results mirroring the common trend found with alcohol consumption. Victims intoxicated by marijuana or alcohol are perceived as more blameworthy for their assault, while perpetrators intoxicated by the same substances are perceived as less blameworthy (Qi et al., 2016).

#### Appearance and Sexual History

Factors related to a victim's appearance (physical attractiveness, style of dress) and sexual history (sexual orientation, previous sexual partners) are often described or manipulated in research using acquaintance rape scenarios, though less so than in studies of stranger rape. As can be seen in **Table 1**, 32 studies included some mention of victim attractiveness, appearance, or sexual history, and 15 of these studies manipulated some component of this information. Understanding how these elements may influence victim blaming tendencies is important given their ties to many rape myths (e.g., "It is usually only women that are dressed suggestively that are raped," "A lot of women lead men on and then cry rape"; Payne et al., 1999).

A common misconception is that the act of rape is based on sexual desire and therefore attractive victims "ask for it" by being desirable. In domains outside of sexual assault, however, researchers often find that attractive individuals are seen as more responsible for good outcomes than for bad, while unattractive individuals are seen as more responsible for bad outcomes (Dion et al., 1972; Seligman et al., 1974; Stephan and Tully, 1977). Using a manipulation of victim attractiveness through accompanying photographs, two studies on acquaintance rape supported this pattern (Gerdes et al., 1988; Ferrão et al., 2016), though Gerdes et al. (1988) found no effect of assailant attractiveness. However, in both of these studies, the scenarios used were markedly different from the traditional account of an acquaintance rape: in one, the victim was accosted in a dark stairwell (Gerdes et al., 1988) and in the other, the victim was a married mother of two children (Ferrão et al., 2016). These features make it difficult to draw definitive conclusions about the effect of victim attractiveness on victim blame. While it is unclear whether the assault was a stranger or acquaintance rape, research conducted by Kanekar and Nazareth (1988) also failed to find a main effect of victim attractiveness on attribution of victim fault for their assault. Attractiveness was found to produce more blame only among female participants when the victim was also described as physically unharmed from the assault and not emotionally disturbed as a result of the rape. Further, female participants also judged unattractive victims as more blameworthy compared to their male counterparts when the victim was also described as unharmed and emotionally disturbed from the assault.

A more frequently studied aspect of appearance is the clothing "revealingness" or provocativeness of the victim. A scenario used by Muehlenhard and MacNaughton (1988; see also Workman and Orr, 1996) described the victim as either dressing and acting provocatively (low-cut blouse, mini skirt, heels, kissing the assailant) or conservatively (high necked blouse, pleated woolen skirt, keeping a distance). Perhaps unsurprisingly, the more revealing or suggestively dressed the victim, the more the victim was blamed for her assault (Gilmartin-Zena, 1983; Cassidy and Hurrell, 1995, although these results are confounded as the conservatively dressed victim was attacked by a stranger; Muehlenhard and MacNaughton, 1988; Kanekar and Seksaria, 1993; Workman and Orr, 1996; Loughnan et al., 2013). Similarly, a victim described as wearing a body-hugging dress and high heels, compared to a more conservatively dressed victim, was viewed as having "led the perpetrator on," leading to less perpetrator blame (but no effect on victim responsibility; Johnson et al., 2016). Using similar scenarios, Whatley and Riggio (1992) found a significant interaction between clothing style and participant gender: Men, but not women, attributed less responsibility to a conservatively dressed victim than a provocatively dressed victim.

Two other studies found null effects of provocativeness on victim blame, but the scenarios and measures used in these studies were quite different from those described above. Smith et al. (1976) manipulated provocativeness via the victim's occupation (she was either a topless/bottomless dancer, social worker, or nun) and the scenario itself was prototypical of a stranger rape: the assault occurred while the victim was walking alone at night and a knife was used. Johnson (1995) also used an occupational manipulation of victim provocativeness, but asked only if the victim was more responsible than the perpetrator. This measure of victim blame is problematic given that participants generally indicate greater blame to the perpetrator than the victim (see Pollard, 1992; Grubb and Harrower, 2008; Landström et al., 2016).

The sexual history and experience of victims have also been considered as important contributors to victim blame (Whatley, 1996). This information is often manipulated via scenario descriptions of previous relationships or relationship status. Pugh (1983) manipulated the victim's past sexual history via the victim's testimony that she had or had not met other men in a bar and had sex with some of them prior to her alleged assault. When the victim was portrayed as more sexually promiscuous, she was blamed more for her assault (see also Kanekar and Seksaria, 1993; Idsis and Edoute, 2017). A "married mother of three" who went to a party and met a man who subsequently raped her (compared to a woman about whom no relationship information was given) was found more to blame for her assault (Viki and Abrams, 2002). However, this manipulation may have less to do with sexual experience than with the perceived immorality of a married mother being at a party and flirting with a strange man.

Howells et al. (1984) found no differences in victim blame across their two levels of victim relationship status (single or engaged). However, in this scenario the victim was a family's regular babysitter who was assaulted by her employer as he gave her a ride home. Participants' schemas for babysitters as relatively young may have reduced overall victim blame in this case, as she was accosted by an older man in a position of power, which may have over-ridden any influence of relationship status on victim blaming tendencies.

Finally, only one study has manipulated the extent to which the victim's sexual orientation influenced victim blaming (Ford et al., 1998). In this research a heterosexual female victim was found to be more at fault than a lesbian victim when assaulted by a male. This finding may speak to the "rape as sexual desire" myth mentioned previously, in that a heterosexual female may be seen as sexually enticing to the heterosexual male assailant (more likely to have "asked for it") and therefore seen as more blameworthy than the lesbian victim.

#### Force and Resistance

The legal definition of rape includes mention of force, and thus, it is unsurprising that a majority of studies on acquaintance rape often include mention of force and/or victim resistance (80 of 102, see **Table 1**). For example, Shotland and Goodstein (1983) manipulated the amount of force the perpetrator used (verbal, or verbal and physical), the degree of victim resistance (verbal, or verbal and physical), and the onset timing of the victim's resistance (immediately after a French kiss, after he begins to caress her below the waist, or after they are undressed). The type of resistance by the victim did not influence perceptions of victim blame (see also Sims et al., 2007), but perpetrator force in combination with onset of protest mattered. When low force was used, the victim was blamed regardless of when she began to protest. When the assailant was depicted as using both verbal and physical force, the victim was only blamed when she delayed protest until the point of undress. Similarly, Idsis and Edoute (2017) found victims were judged less responsible when they physically, rather than verbally, resisted, and when their resistance was depicted as strong, rather than weak. Other research indicates that victims are blamed less when the perpetrator uses physical force (e.g., see Bieneck and Krahé, 2011, although this study combined results across stranger, acquaintance, and ex-partner assaults). 
Victim resistance also appears to decrease victim blaming (Gilmartin-Zena, 1983; Black and Gold, 2008; Bongiorno et al., 2016, although these results are confounded as the more resistant victim was attacked by a stranger compared to a non-resisting victim of acquaintance rape; Kanekar and Seksaria, 1993; Masser et al., 2010, although this effect only occurred among those high in benevolent sexism who read about a victim who left her children unattended at home; McKimmie et al., 2014), especially when resistance occurs earlier in the encounter (Kopper, 1996). Perpetrator use of physical force also results in less victim blame than a case in which the victim is unable to resist due to intoxication (Krahé et al., 2007). While lacking an assessment of victim culpability, Branscombe and Weir (1992) found that assailant blame was highest when the victim strongly resisted physically (kicking him in the shin and fighting during the entire encounter compared to simply attempting to stand up). Degree of verbal resistance of the victim did not influence perceptions of perpetrator blame.

But some researchers have found different patterns. Wooten (1980) found greater victim blame when the perpetrator was depicted as using moderate force, compared to low or high force. However, this manipulation of force was confounded with victim resistance: In both the low force and high force conditions, victim resistance was both verbal and physical, but the moderate force condition depicted only verbal resistance. Still, this research suggests that victim resistance in combination with degree of force used by the assailant may be important for understanding blame (see also Shotland and Goodstein, 1983). The role of victim resistance may be particularly important among those who believe that rape is a sexually motivated crime (Ong and Ward, 1999). Compared to those who believed rape is motivated by power, participants who endorsed the belief that rape is sexually motivated blamed the victim more when she was described as not resisting the attack. When the victim did resist, however, these beliefs had no effect on victim blaming.

A meta-analysis on attribution of responsibility for accidents (not related to sexual assault) found that accident severity increased the tendency to blame the perpetrator (Burger, 1981). Therefore, an important component of force and resistance that should be assessed in future work is the severity of the assault in terms of both physical and emotional outcomes experienced by the victim. We located one study that manipulated whether the victim was physically hurt following the assault. There was no difference in blame, but participants did recommend longer prison sentences for the perpetrator when the victim was physically hurt (Kanekar et al., 1991; note: the impact of victim injury on sentencing was examined in two studies and was only significant in one). While the above findings on assault severity on blame appear to support the conclusions of the meta-analysis on blame for accidents, more research manipulating the severity of injury to the victim is needed.

### Victim and Perpetrator Race

As noted in the discussion of race in the "individual factors" section, very little research has examined the role of victim and assailant race or ethnicity on blame in acquaintance rape cases: only five studies have done so (see **Table 1**). This omission is problematic, given that more non-White women are victimized compared to White women (Black et al., 2011). Further, many myths surrounding sexual assault depict a Black male assailant and a White female victim (Davis, 1983; Epstein and Langenbahn, 1994). While lacking an assessment of victim blame, prior research manipulating both victim and perpetrator race in an acquaintance rape found that White victims were more likely than Black victims to prompt beliefs that the assailant should be held legally responsible and that his actions could be defined as criminal (Foley et al., 1995). Counter to the myth of the Black rapist, however, this research did not find any significant differences in blame based on assailant race. Willis (1992) found that regardless of race, victims were rated as less truthful in their reports of a sexual assault if they were depicted as being in a prior or current relationship with their Black assailant, compared to when he was depicted as White or as a Black stranger. Some evidence suggests greater victim blame in intraracial compared to interracial rapes (George and Martinez, 2002), but this study collapsed across stranger and acquaintance rape scenarios (despite a main effect in which victims of acquaintance rape were blamed more than victims of stranger rape).

As discussed previously, cultural similarity to the perpetrator increased victim blaming among an Australian sample, but only if the victim was depicted as not resisting the attack (Bongiorno et al., 2016). Dupuis and Clay (2013) found that victim blame was a function of both the victim's race and her perceived respectability, manipulated via the defendant's testimony that the victim was either a "party girl" who often picked up men at bars or a "sweet girl" who didn't date much or go to bars. While respectability did not matter for blame of White victims, it affected blame of Black victims: Respectable Black victims were blamed less than "party girl" Black victims. Furthermore, respectable Black victims were blamed less than respectable White victims, while "party girl" Black victims were blamed more than comparable White victims. Perpetrator race mattered in only one case: the non-respectable victim was seen as more blameworthy than the respectable victim when the perpetrator was Black.

These patterns may be complicated further by consideration of participant race. As described earlier, Varelas and Foley (1998) found that White participants blamed victims less than Black participants and that less blame was attributed to victims when the assailant was Black. White participants also blamed White victims assaulted by Black men less than Black victims assaulted by Black men, while Black participants attributed the most blame to a Black woman assaulted by a White man. One other study manipulated victim and assailant race (Willis, 1992) but did not report relevant comparisons.

Research on race effects has been limited by the singular focus on Black and White victims and perpetrators (but see Bell et al., 1994, for an exception). More research is needed on how other victim/assailant races (e.g., Asian, Hispanic) may influence blame, as well as potential interactions with participant demographics.

## Socioeconomic Status

Sexual assault may be motivated by need for power (Brownmiller, 1975; Burt, 1980; Lonsway and Fitzgerald, 1994; Ward, 1995) and therefore power differentials within a rape scenario, defined by socioeconomic status, may influence evaluations of blame. Black and Gold (2008) manipulated socioeconomic status of the perpetrator by describing him as either a bus driver or doctor. Women, but not men, held the victim more responsible when she was assaulted by the bus driver than the doctor. In another study in which the victim was portrayed as either a cashier or accountant, both male and female participants rated the cashier as more promiscuous and more blameworthy (Spencer, 2016).

Blame may be more affected by the relative status of the perpetrator compared to the female victim. Using a sample of students at the University of Bombay, Kanekar et al. (1991) manipulated the assailant's occupational status to be higher than, the same as, or lower than the status of the female victim, along with an additional manipulation of whether the victim filed a complaint or not against her aggressor. These researchers found a greater tendency for men to blame the victim when the assailant had higher status (or comparable status) relative to the victim, but only if she did not file a complaint. Relatedly, Yamawaki et al. (2007) manipulated whether the victim or assailant held a high status position or not (well-respected CEO versus student from a local university). When the assailant was in the more powerful position, those who believe women use sex to gain power from men blamed the victim more.

Drawing definitive conclusions from these studies about the effect of socioeconomic status on victim blame is difficult. Black and Gold (2008) did not provide information about the victim's occupation and thus it is unclear whether participants assumed she held a better job than the bus driver, thus changing the power dynamic between the two. Yamawaki et al. (2007) did not include a control condition whereby the victim and assailant held equal power status. Finally, while Kanekar et al. (1991) found gender differences in blame due to relative status of the assailant, this only emerged in the conditions in which the victim chose not to file a complaint. Clearly more research is needed on socioeconomic status and other power differential cues to better determine their effects on victim blame.

# Summary of Situation Level Factors

Alcohol use is common in sexual assault cases and not surprisingly, a large number of sexual assault scenarios used in research include this feature. However, few studies have examined how changes in intoxication and alcohol use levels impact victim blame. Among those that have, the evidence largely suggests that alcohol use by the victim increases victim blaming, while alcohol use by the defendant reduces his level of blame (Richardson and Campbell, 1982; Stormo et al., 1997; Cameron and Stritzke, 2003; Bieneck and Krahé, 2011; but see Girard and Senn, 2008).

Research considering victim physical characteristics clearly indicates that the more revealing the clothing worn by the victim and the more suggestive her behavior or occupation, the more likely the victim is to be blamed for her assault (Muehlenhard and MacNaughton, 1988; Kanekar and Seksaria, 1993; Cassidy and Hurrell, 1995; Workman and Orr, 1996; Loughnan et al., 2013). Victims with an apparently promiscuous sexual history are also found to be more blameworthy (Pugh, 1983). Provocativeness may also interact with participant gender, such that men, but not women, blame provocatively dressed victims more than conservatively dressed victims (Whatley and Riggio, 1992). In one study on victim sexual orientation, heterosexual victims were blamed more than lesbian victims (Ford et al., 1998). Many of these findings are consistent with the belief that physical enticement, based on dress, history, or sexual orientation, triggers assault, but one exception to this pattern is the finding that unattractive victims are blamed more than attractive victims (Gerdes et al., 1988). The latter finding may have more to do with a general halo effect favoring attractive individuals (e.g., Dion et al., 1972).

Another common factor considered in sexual assault vignettes is the degree of force and resistance used by the perpetrator and victim. These appear to play an important role in perceptions of victim culpability. Victims who resist their attackers are seen as less blameworthy than those who do not (particularly when they resist early in the interaction; Shotland and Goodstein, 1983; Kanekar and Seksaria, 1993; Kopper, 1996; Black and Gold, 2008). Less victim blaming also occurs when the perpetrator is depicted as using a great degree of force (Bieneck and Krahé, 2011) and when the victim is portrayed as having been injured from the attack (Kanekar et al., 1991).

Despite evidence that non-White women are more likely to be victimized (Black et al., 2011), there is currently relatively little research that manipulates victim and perpetrator race. The work that has been done, however, indicates a more complex interaction with other individual and situational factors. For instance, White participants blamed White victims assaulted by Black men less than Black victims assaulted by Black men, while Black participants attributed the most blame to a Black woman assaulted by a White man (Varelas and Foley, 1998). Further, respectability mattered for blame of Black victims, but not White victims.

Finally, research on the impact of socioeconomic status and power differences between victim and assailant is currently too limited and inconsistent to draw definitive conclusions. However, some research points to the importance of power differentials in influencing blame (Kanekar et al., 1991), and of participants' beliefs that women use sex to gain power from men (Yamawaki et al., 2007).

One difficulty in assessing the impact of situational factors on victim blame is that many published studies do not include full descriptions of the scenarios used. For instance, the sexual assault scenario used by Janoff-Bulman et al. (1985) is simply described as a "first person account of a rape and the events preceding it" (p. 164). After having received the full scenario from Dr. Janoff-Bulman, however, it is clear that alcohol intoxication played a central role in this scenario ("I had more than I could handle. Bob got drunk too. . .I had a lot to drink. . .I insisted we stay until we had. . .something to get more sober"). Given the role alcohol plays in evaluations of sexual assault, it is important to be aware that this sexual assault scenario centers around a night of heavy drinking. Thus, before we can draw firm conclusions about the effects of various situational factors on victim blame, access to the full scenarios used in research is necessary.

# DISCUSSION

We have reviewed a variety of individual and situational factors that influence victim blaming, but in order to fully understand victim blame we must take into account broader institutional and societal factors that may dictate how perceivers view any given sexual assault scenario. Indeed, it has been suggested that the only way to truly prevent rape is to address the problem of rape at the societal level (Allison and Wrightsman, 1993), considering broader cultural factors that both contribute to sexual assaults and promote rape myths and victim blaming.

As depicted in **Figure 1**, we view individual, situational, and institutional factors as influencing one another. Interactions within individual level factors (e.g., participant gender and rape myth endorsement), situational level factors (e.g., perpetrator force and victim resistance) and the interaction between individual and situation level factors (e.g., participant race and victim/assailant race) have received some consideration in the research literature. What has yet to be accounted for, however, is how these elements may also be influenced by the cultural context in which they are studied. In the following sections we identify institutional-level factors that may contribute to both sexual assault and victim blaming and then discuss how these factors may interact with individual and situation level factors.

# Institutional/Societal Level Factors

#### Gender Dynamics

Patriarchy is widespread across many cultures (Pratto, 1996; Eagly and Wood, 1999; World Economic Forum, 2017) and feminist scholars have long proposed that sexual assault is motivated by power, with violence against women a function of gendered sex roles that support male domination and female exploitation (Brownmiller, 1975; Burt, 1980; Lonsway and Fitzgerald, 1994; Ward, 1995). Societies that have more egalitarian gender roles tend to have lower rates of sexual assault (Sanday, 1981; White et al., 1997). Interestingly, recent work has found that priming men to feel lower in power increases their ability to take others' perspective, thereby decreasing their tendency to blame victims of acquaintance rape (Gravelin et al., 2017).

Socialization into gender roles may make women more prone to the dangers of sexual assault, but also communicates victim blaming as normative. For instance, Warshaw (1994) argues that communal roles teach women from a young age to avoid embarrassing a man by rejecting his advances and to not resist a physically aggressive man. Male gender roles may also justify and promote sexually aggressive behavior among men (Griffin, 1971; Sanday, 1981; Beneke, 1982; Warshaw, 1994; O'Toole, 2007) and legitimize victim blaming (Griffin, 1971; Check and Malamuth, 1983; Margolin et al., 1989; Feltey et al., 1991). For example, men may be taught to dissociate themselves from responsibility for their sexual actions, thereby reinforcing myths that once a man is sexually aroused he cannot stop himself (Warshaw, 1994).

Stereotypes and sexual scripts communicated to men and women further complicate sexual relations. Considerable research documents a sexual double standard, whereby men are more free than women to express their sexual desires (Sprecher et al., 1987; Muehlenhard and Hollabaugh, 1988; Muehlenhard and McCoy, 1991; Muehlenhard and Quackenbush, 1998). This pattern reinforces a common belief in token resistance, whereby it is thought that many women say no to sex even when they would like to say yes since it is "unladylike" to desire sex (Gagnon and Simon, 1973; Check and Malamuth, 1983; Schur, 1983; Muehlenhard and Hollabaugh, 1988; Warshaw, 1994). This belief appears to influence approaches to sexual behavior; Muehlenhard and Hollabaugh (1988) found that over 39% of their sample of undergraduate women reported engaging in token resistance at least once, and those who had were more likely to endorse traditional gender roles than women who were sexually active but did not engage in token resistance.

Men are socialized to be the sexual initiators, and, given the belief in, and practice of, token resistance, may be encouraged not to take a woman's reluctance seriously. Thus, sex is often viewed as a challenge, and women become sexualized objects to conquer (Warshaw, 1994). These sexual scripts dictating token resistance from women and persistence by men ambiguate what is viewed as sexual foreplay and what is sexual assault. Acceptance of such scripts may also influence perceivers' evaluation of acquaintance rape victims who resist sexual advances from the assailant. Indeed, research has established that endorsement of gender inequality and traditional gender roles (which includes the practice of token resistance, Muehlenhard and Hollabaugh, 1988) is associated with greater RME and victim blaming (Brownmiller, 1975; Burt, 1980; Deitz et al., 1982; Whatley, 2005; Edwards et al., 2011).

Another strong cultural force that dictates what is considered proper gender role and sexual behavior is religion. A variety of religions, such as Christian evangelism and Islam, promote a gender hierarchy that values female submission (see Flood and Pease, 2009); other religious affiliations may convey more or less conservatism regarding appropriate sexual behavior. Using 20 years of data from the General Social Survey, Hoffman and Miller (1997) found that more conservative religions promote traditional female roles while liberal religions promote egalitarianism. Further, the strength of gender norms in a given culture may interact with individual level factors to influence evaluations of victim blame. For example, the extent to which sexually promiscuous victims are blamed may be exacerbated in conservative religious cultures. More generally, to the extent that institutions promote a gendered hierarchy, men possessing lower social power than women are likely to feel threatened, which in turn may lead to more victim blame (e.g., Munsch and Willer, 2012).

#### Media and Sexual Objectification

The hypersexualization and sexual objectification of women in society also leads to greater acceptance of violence against women and victim blame (Malamuth and Check, 1981; Ohbuchi et al., 1994; Lanis and Covell, 1995; MacKay and Covell, 1997; Kalof, 1999). Hypersexualization and sexual objectification refer to the extreme sexuality ascribed to women, often depicting them as purely sexual objects for men's desires. This sexualized representation of women exists in a variety of domains, including pornography, non-pornographic film and television, and print advertising (see Stankiewicz and Rosselli, 2008).

Not only do media outlets often depict women as sexualized objects, but sexual aggression is portrayed as normative behavior in pornography (Longino, 1980; MacKinnon, 1985), films (Donnerstein and Linz, 1986), and music (Schur, 1988; hooks, 1994). While victims of non-sexual aggression are often shown as having suffered from their assault, sexual assault victims are often depicted as initially refusing a man's sexual advances and then becoming aroused as he ignores their resistance (Smith, 1976; Zilbergeld, 1978; Malamuth and Check, 1981). Eroticizing sexual dominance in the media legitimizes violence against women and may contribute to victim blaming (see Schur, 1988).

In the context of sexual assault, the media also tend to focus on stranger rape (Soothill, 1991), thus influencing how perceivers determine what constitutes a "real rape," and to portray rapists as strangers with solely sexual motivations to assault attractive young females (Allison and Wrightsman, 1993). Deviations from this image to one depicting an acquaintance rape may be less likely to be seen as a sexual assault, resulting in increased victim blaming. Soothill (1991) documented changes in reporting on sexual assaults in major newspapers from 1973 to 1985. Despite an increase in the number of single assailant-single victim sexual assault crimes in the courts across this period, reporting on these types of crimes decreased, with a shift in focus to multiple offender gang rapes instead. This shift may have increased readers' beliefs that gang rapes and stranger rapes are more prevalent and concerning than acquaintance rape. A more recent review of two major newspapers' reporting on sexual assault indicates that gang and stranger rapes are still over-reported relative to acquaintance rapes and to actual prevalence data (Gravelin, 2017).

When media outlets do discuss acquaintance rape, how it is discussed can also contribute to victim blaming. Highlighting rape myths or focusing on ways that acquaintance rapes may resemble prototypical stranger rapes may have negative consequences for victims of assaults that do not include these prototypical features. For example, Franiuk et al. (2008) exposed participants to headlines about an acquaintance rape case against basketball star Kobe Bryant. These headlines were modeled after actual headlines used in newspaper accounts of Bryant's case and either contained rape myths (e.g., "Defense attorneys in sexual assault case say accuser had motive to lie") or not ("Hearing set for man accused of sexual assault"). Participants tended to see Bryant as less guilty after reading headlines containing rape myths than neutral headlines, and this was particularly true among men. Men exposed to the rape myth headlines also endorsed rape-supportive attitudes more so than men in the control condition. In short, the media may exacerbate endorsement of rape myths, which in turn promotes greater victim blaming.

#### Legal and Empirical Rhetoric

The definition of rape has changed throughout American history and therefore what constitutes rape is dependent on the time and state in which the assault has occurred (Freedman, 2013). It was not until 2012 that the FBI broadened the definition of rape to include non-forcible rape of women and men. In 2014, both California and New York altered their definitions of sexual assault such that rape is not defined by the victim saying "no," but by failing to say "yes." Such a definition acknowledges the role of the assailant in obtaining affirmative consent, rather than the victim in saying no. Branscombe et al. (1996; see also Nario-Redmond and Branscombe, 1996) found that focusing participants on how the victim's behavior could have altered a rape outcome produced the greatest amount of victim blame, while focusing on how the assailant's behavior could have prevented an assault generally increased the relative blame assigned to him. Others have found that defining sexual assault as an act of intergroup violence (a "hate crime"), rather than interpersonal violence (a personal assault), reduced victim blaming in both stranger and acquaintance rape cases (Droogendyk and Wright, 2014).

Despite these recent efforts to broaden the definition of rape and incorporate definitions more closely aligned with non-stranger rape, earlier constructions of rape promoted through rape myths remain deeply embedded in our culture. These myths make it difficult for individuals to recognize rape, particularly non-stranger rape. This difficulty may encourage perceivers to look to situational factors such as the victim's attractiveness and promiscuity to explain the assault in acquaintance rape cases (Weis and Borges, 1973). Given that the working definition of what constitutes a rape varies as a function of time and location, comparing studies conducted in different settings at different times may not be appropriate.

#### Rape Culture

Much research on acquaintance rape asserts that certain settings foster beliefs conducive to rape, often referred to as "rape culture" (Buchwald et al., 1993). Some have suggested that individuals within the United States as a whole view rape as normative and a condoned behavior (Rozée, 1993; Koss et al., 1994), but rape culture is most often associated with college campuses, particularly athletic groups and fraternities. Rape cultures exist outside of the college environment, as well; both high school and professional-level athletics and the military have been studied as rape cultures (see O'Toole, 2007).

Researchers suggest that male-dominated environments such as those mentioned above are particularly likely to promote sexist attitudes and behaviors and may facilitate greater risk of sexual assault as well as victim-blaming myths (Sanday, 1990; Melnick, 1992; Koss and Gaines, 1993; Boeringer, 1996, 1999; Boswell and Spade, 1996; Bleecker and Murnen, 2005; McCray, 2014). Rape cultures are typically defined as hypermasculinized environments that glorify coercive sexual behavior as central to their group identity (O'Toole, 2007). For example, all-male housing units such as fraternities have a higher risk of sexual assaults than co-ed housing (Hinch and Thomas, 1999). Sexual aggression is also particularly likely among the newest members of an all-male group: Fraternity pledges are the most likely of all college males to commit a sexual assault on campus (see Bohmer and Parrot, 1993). Individual level factors such as threats to power or status may be particularly problematic within all-male groups, increasing the likelihood of sexual assault, rape myth endorsement, and victim blaming.

Rape culture is maintained by the norm of silencing victims of rape (Burnett et al., 2009). Particularly in cultures where rape myths are promoted and accepted, victims may question their behavior and be uncertain whether to label their experience as a rape or not (Adams-Curtis and Forbes, 2004; Harned, 2004). Failure to report rape not only protects perpetrators from punishment but also communicates a tolerance for sexual assault that delegitimizes victims' experiences and perpetuates victim blaming.

Rape culture frameworks tend to focus on localized settings that contribute to sexual assault and victim blaming, but broader cultural contexts, including national and regional contexts, have differing historical experience with violence and differing flexibility or rigidity of gender roles, which may contribute to differing levels of victim blame (Sanchez-Hucles and Dutton, 1999). A qualitative study on community norms and expectations concerning intimate violence by Sorenson (1996) found that compared to Asian American participants, Mexican American participants described a greater cultural value on male sexual prowess. Victims of sexual assault in many Middle Eastern communities are punished, even outcast by their families, or must marry their rapists in order to restore honor to their families (Ruggi, 1998). Conversely, many African cultures promote flexible gender roles and pride in having strong, independent women, thus potentially reducing blame ascribed to female victims who deviate from traditional gender roles (Hill, 1972; Young, 1986; Boyd-Franklin, 1989; see also Sanchez-Hucles and Dutton, 1999). Finally, Ho (1990; see also Sorenson, 1996) noted that Asian values of harmony and close family ties may not promote lesser sexual violence, but may support minimizing or concealing violence.

These cultural differences may contribute both to differences in sexual assault rates and differing levels of victim blaming. A report by the World Health Organization [WHO] (2005) compiled cross-national data from surveys on female victimization from 1992 through 1997 and found considerable variability in reported victimization. For instance, Asian countries (China, India, Indonesia, and the Philippines) had the lowest rate of reported sexual assault as well as the lowest variability within-continent, with incidence of sexual assault ranging from 0.3% in the Philippines to 2.7% in Indonesia, while the data surveyed from countries in Latin America (Argentina, Bolivia, Brazil, Colombia, Costa Rica, and Paraguay) had the largest variability, with incidences ranging from 1.4% in Bolivia to 8.0% in Brazil. It is important to note that, while informative, these data do not distinguish between types of sexual assault, and sample sizes varied considerably across studies. Respondents were only asked about sexual assaults that had occurred within the last 5 years, and thus the data do not account for incidents outside of this window. There is no national database on victim blaming, but differing cultural tendencies to minimize or silence sexual assault may communicate greater victim blame by way of trivializing experiences of sexual assault.

Another element varying across regional/ethnic cultures that may contribute to differential evaluations of victims of sexual assault is religiosity; cultures vary in the extent to which they are influenced by religious doctrines. While limited to a sample of undergraduates living in the United States, a study on the role of cultural and religious influences on endorsement of traditional gender roles found more conservative sexual attitudes among Asians (South and East Asians) compared to their Hispanic (South American, Central American, and Mexican) and European American (Caucasian) counterparts (Ahrold and Meston, 2010). Across all three groups, greater intrinsic religiosity and religious fundamentalism predicted more conservative sexual attitudes (endorsement of traditional gender roles). Thus, religiosity and traditional gender role endorsement attitudes may interact with situational elements to contribute to differential degrees of victim blaming. For example, a victim who deviates from a traditional submissive role by behaving promiscuously or fighting her attacker may be seen as more blameworthy by more religious and conservative observers.

# FINAL REMARKS

Research on sexual assault and victim blaming is burgeoning, yet much more needs to be done to understand the individual, situational, and cultural factors that contribute to victim blaming, particularly in the case of acquaintance rape. The current paper identified the most commonly studied aspects of victim blaming in acquaintance rape within the two primary approaches: individual level factors and situation level factors. A review of this literature reveals many inconsistent findings and interactions across both levels. In an effort to make sense of these complex interactions and inconsistent findings, we suggest greater consideration be given to the role of institutional factors on evaluations of victim blame. The final sections of this paper then outlined various institutional factors that we believe should be given greater attention in future research on victim blaming in acquaintance rape and provided evidence to support why these factors may interact with the more commonly studied individual and situational factors.

Acquaintance rapes differ in many ways and therefore researchers cannot use a "standardized" single vignette to study victim blame. However, knowing which details are present or absent in the scenarios used by researchers will help in drawing more accurate and appropriate comparisons and conclusions. Further, despite obvious differences between acquaintance and stranger rape, many researchers still use findings gathered from one type of assault interchangeably with the other when discussing patterns in sexual assault research (Whatley, 1996; Grubb and Harrower, 2008; Grubb and Turner, 2012). As previously highlighted, a substantial number of the papers considered in this review failed to provide full details of the scenarios used in their research. As elements such as the presence/absence of alcohol, victim's clothing and promiscuity, and prior relationship with the assailant all influence how perceivers evaluate cases of sexual assault, it is important to be aware of the full characterization of the sexual assault before drawing conclusions across studies. Therefore, in addition to accounting for institutional factors in future examinations of victim blaming, greater transparency about and open sharing of the scenarios used is needed.

Our narrative review allowed for a wide-ranging overview of research on victim blame in acquaintance rape cases but was limited by a reliance on study significance levels, without taking into account study power (i.e., low N and high N studies received equal weight in our review). This limitation can be redressed by the use of meta-analysis to better quantify the effects of individual, situational, and cultural factors on victim blaming. We hope this review motivates such meta-analytic consideration, as well as additional original research in these areas. The #MeToo movement has brought recent heightened public attention to the problem of sexual assault; this cultural focus may further spur social scientific efforts toward understanding perceptions and treatment of victims of sexual assault.

# AUTHOR CONTRIBUTIONS

CG served as primary investigator, responsible for conception of analysis and review, literature collection, synthesis, critique and write up. MB provided substantial contributions to synthesis and critique and assisted in multiple revisions of manuscript document. CB assisted in literature collection and synthesis/coding and aided in final revisions of manuscript.

# ACKNOWLEDGMENTS

Some of the content within this manuscript originally appeared in the first author's dissertation, which is archived online. All content from the dissertation is cited throughout the manuscript and referenced accordingly in the reference list.

# REFERENCES






Ryan, W. (1971). Blaming the Victim. New York, NY: Pantheon.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gravelin, Biernat and Bucher. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Religiosity, Religious Fundamentalism, and Ambivalent Sexism Toward Girls and Women Among Adolescents and Young Adults Living in Germany

Bettina Hannover<sup>1</sup>\*, John Gubernath<sup>1</sup>, Martin Schultze<sup>1</sup> and Lysann Zander<sup>1,2</sup>

<sup>1</sup> Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany, <sup>2</sup> Department of Educational Science, Leibniz University Hannover, Hanover, Germany

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Gosia Mikołajczak, La Trobe University, Australia; Deborah Hall, Arizona State University, United States

> \*Correspondence: Bettina Hannover bettina.hannover@fu-berlin.de

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 30 March 2018 Accepted: 14 November 2018 Published: 03 December 2018

#### Citation:

Hannover B, Gubernath J, Schultze M and Zander L (2018) Religiosity, Religious Fundamentalism, and Ambivalent Sexism Toward Girls and Women Among Adolescents and Young Adults Living in Germany. Front. Psychol. 9:2399. doi: 10.3389/fpsyg.2018.02399

The New Year's Eve 2015 mass sexual assaults in Germany led to a broader debate about whether the perpetrators, most of them self-identifying as Muslims, were encouraged to such acts by particularly sexist attitudes toward girls and women. Here, we argue that it is not the specific religious affiliation of individuals per se that predicts sexism. Rather it should be the extent to which they are involved in their religion, i.e., their religiosity and their endorsement of religious fundamentalism. In line with the theory of ambivalent sexism, we distinguish hostile and benevolent sexism, while controlling for right-wing authoritarianism and social dominance orientation. In two Pilot Studies, we explored differences in ambivalent sexism (a) between male and female individuals of Muslim faith, Christian faith, and no religious affiliation residing in Germany, while at the same time (b) differentiating between sexism directed toward girls and sexism directed toward women. In our Main Study, we tested the interrelations between religiosity, religious fundamentalism, and ambivalent sexism in our religious subsamples of male Christians, female Christians, male Muslims, and female Muslims using a multigroup multivariate moderated mediation analysis. In all three studies, Muslims were more religious, endorsed religious fundamentalism more strongly, and held stronger benevolent sexist beliefs toward girls and women as well as stronger hostile sexist beliefs toward women than Christians and non-religious participants. In our Main Study, with female Christians as the reference group, male Muslims' stronger benevolent and hostile sexist beliefs toward girls were mediated by religiosity and fundamentalism. Female Muslims' stronger endorsement of benevolent sexism toward girls could be explained by their higher level of fundamentalism. While our findings show that differences in ambivalent sexism between religious groups were partly due to different levels of religiosity and fundamentalism, they also suggest that there are factors other than those investigated in our studies responsible for male Muslims' particularly strong sexism. We discuss specific contents of Islamic religious teachings and honor beliefs as possible causes to be investigated further in future research.

Keywords: ambivalent sexism toward girls, ambivalent sexism toward women, religiosity, religious fundamentalism, right-wing authoritarianism

# INTRODUCTION


On the last day of the year 2015 in Cologne and several other German cities (Bielefeld, Hamburg, Dortmund, Düsseldorf, Stuttgart), more than 1,000 girls and women who attended the public New Year's Eve celebrations were sexually assaulted, mostly by groups of men suddenly surrounding and then attacking them on the street. Even adolescent girls, accompanied by their mothers or peers, were harassed. According to official estimates, at least 2,000 men were involved. Most suspects were asylum seekers and illegal immigrants from Muslim-majority countries in North Africa who had only recently arrived in Germany (Noack, 2016; see Schwarzer, 2016, for a more detailed account). A significant side effect of the New Year's Eve mass sexual assaults was a noticeable increase of anti-Muslim sentiments in Germany (Bayrakli and Hafez, 2018).

Previous research has found sexist behavior (Begany and Milburn, 2002; Diehl et al., 2016) as well as the acceptance and occurrence of male violence against women to be predicted by sexist attitudes (Abrams et al., 2003; Chapleau et al., 2007; Koepke et al., 2014). When women deviate from traditional gender norms, they are particularly likely to become the target of hostile sexism (see Sibley and Wilson, 2004; Gaunt, 2013; Glick et al., 2015). One of the focal points of the revitalized debates on women's rights has been the question whether the perpetrators, most of whom were practicing adherents of the religion of Islam, were encouraged in their actions by particularly sexist attitudes toward girls and women (Schwarzer, 2016).

In this article, comparing sample groups from the two monotheistic religions Christianity and Islam, we argue that it is not the specific religious affiliation but rather the extent of religious involvement, i.e., the strength of a person's religiosity, and the adherence to tenets of religious fundamentalism that predict sexism toward girls and women. Religiosity here refers to the importance individuals assign to their religious beliefs (e.g., Huber and Huber, 2012), while fundamentalism describes the view that a set of religious teachings is infallible and the sole repository of fundamental truths. Fundamentalist believers must rigorously obey the rules of their religion in the manner that tradition has established, and those who do so are promised a special relationship with the respective deity (Kirkpatrick et al., 1991; Altemeyer and Hunsberger, 1992; Schnell, 2010). Hence, fundamentalism is not limited to any one religion but describes certain traits that can potentially be found in any religion.

As we show in more detail below, there is evidence that Muslims living in Germany describe themselves as more religious and endorse fundamentalism to a higher extent than German citizens who do not self-identify as Muslims. While most religions teach their believers that they should love and trust their fellow human beings, evidence suggests that sexism and other forms of prejudice can paradoxically be exacerbated through religion, especially in conjunction with high levels of religiosity and fundamentalism (see Hunsberger and Jackson, 2005, for a review).

We investigated this apparent paradox by inquiring into the correlation between sexism toward girls and women and the extent to which people describe themselves as more religious and/or endorse fundamentalist beliefs. We further explored whether differences in these variables could in fact be mediators of variations in the level of sexist attitudes between people of different religious affiliations by (a) investigating non-religious individuals and members of the two largest religious groups within Germany, Muslims and Christians, and (b) measuring openly negative hostile sexist attitudes and seemingly positive benevolent sexist attitudes, the two subcomponents of ambivalent sexism, as well as religiosity and fundamentalism. We then explored possible differences in ambivalent sexism (a) depending on girls versus women being the targets and (b) depending on participants' religious affiliation and gender. In predicting sexist attitudes from religiosity and fundamentalism separately for people of different religious affiliations, we further controlled for the impact of two potential confounds present in the research literature: right-wing authoritarianism and social dominance orientation.

# Religiosity and Fundamentalism in Germany

As in many other Western countries, the percentage of people in Germany who are denominationally bound to one of the two main Christian churches is constantly declining. While in 1970 only 6.4% of the West German population had no religious affiliation, in 2011 it was 30.9% (Sachverständigenrat Deutscher Stiftungen für Integration und Migration, 2016). During the same time, due to worldwide migration, the number of people affiliated with the Muslim religion has constantly risen. The number of Muslims living in Germany rose from 500,000 in 1972 to 3 million in 2000. In 2015 it stood at 4.5 million, which corresponds to about 5% of the German population (ddp-Nachrichtenticker, 2009; REMID, 2017).

Muslim immigrants living in Germany report being more religious than German citizens who do not self-identify as Muslims (Brettfeld and Wetzels, 2007; de Hoon and van Tubergen, 2014 [investigating adolescents]; Gille, 2016 [investigating adolescents]; Huber and Huber, 2012). Several studies also found that Muslims living in Germany follow the rules and traditions of their religion more strictly and more frequently engage in the requisite rituals and practices than the non-Muslim population (Albert et al., 2015 [investigating adolescents]; de Hoon and van Tubergen, 2014 [investigating adolescents]; Nyiri, 2007; Diehl et al., 2009; Fleischmann and Phalet, 2011; Diehl and Koenig, 2013).

There is also evidence that Muslims living in Germany hold more fundamentalist religious beliefs than non-Muslim German citizens. Several studies found them to more strongly endorse views that (a) only the religion of Islam (vs. Christianity for Christians) contains fundamental truth, (b) religious rules can never be changed and should be considered more important than secular law, and (c) those who do not obey them will be punished (Heitmeyer et al., 1997; Brettfeld and Wetzels, 2007; Frindte et al., 2011; Koopmans, 2015). For our own research, we therefore predicted that Muslim participants would describe themselves as more religious and endorse fundamentalist positions more strongly than Christian participants, who, in turn, would describe themselves as more religious and endorse more fundamentalist positions than participants without religious affiliation.

# Ambivalent Sexism Toward Women and Girls

The term sexism describes attitudes linked to the social category of gender which are used to preserve differences or inequality between men and women (cf. Spence, 1999; Leaper and Brown, 2017). According to ambivalent sexism theory (Glick and Fiske, 1996, 2001), sexism has both a negative and an ostensibly positive component. Hostile sexism reflects overtly negative attitudes toward girls or women marked by beliefs that they are inferior, incompetent, or trying to control men by using sex. Benevolent sexism consists of beliefs about the genders that may appear positive but are actually counterproductive to gender equity: it reflects an affectionate but patronizing attitude toward girls and women (Glick and Fiske, 1996, 2001). An instance of this can be found in the idealization of women as in need of or deserving male protection.

The concept of ambivalent sexism is useful for explaining girls' and women's complicity in their own subordination. Girls and women may feel privileged by being cared for and protected by men, or feel flattered by being put on a pedestal as "wonderful, pure creatures whose love is required to make a man whole" (Glick et al., 2000, p. 764). Such seeming advantages can be viewed as compensation for the disadvantages associated with hostile sexism, deceiving girls and women into perceiving the status quo gender hierarchy as fair and just and even endorsing hostile sexist beliefs (cf. Jost and Kay, 2005).

Widening the definition of sexism to include not only hostile attitudes but also ostensibly benevolent ones resolves the apparent paradox in the notion that religiosity can foster sexism. By assigning markedly different roles to men and women and justifying them as "divinely mandated," many religions propagate ostensibly benign sexist attitudes (Glick et al., 2016, p. 547). Benevolent sexist beliefs can serve to maintain and reproduce gender inequality without making the explicit expression of negative attitudes toward girls and women a part of the religious teachings. Hence, the concept of ambivalent sexism is particularly well suited to explain the link between religiosity and sexism. In a multi-nation study, Glick et al. (2000) found that while women consistently rejected hostile sexism, the average scores of men and women on both ambivalent sexism subscales correlated quite strongly within the samples from different cultures. It seems like women are made to feel that their group is inferior to the extent to which men in their community endorse sexist beliefs. This could entail that women contribute to the maintenance of their own group's disadvantaged status by accepting ambivalent sexism (cf. Jost and Kay, 2005). In our studies, we therefore considered it important to investigate ambivalent sexism not only in boys and men but also in girls and women.

Several studies have also applied the concept of ambivalent sexism to adolescent girls (de Lemus et al., 2008, 2010; Garaigordobil and Aliri, 2012; Ferragut et al., 2013; Montañés et al., 2013; Rau, 2013). It is in adolescence, namely when heterosexual boys typically start to anticipate or engage in intimate relations with girls, that the hostile sexist attitudes toward girls which largely prevail among boys during childhood are gradually supplemented by benevolent sexist attitudes (Glick and Hilt, 2000). Evidence for this process is provided by de Lemus et al. (2010), who found that benevolent, but not hostile, sexism toward girls increased the more experienced adolescent boys were with heterosexual romantic relationships. Aside from this study, we are not aware of research that has investigated the impact that attitudes, religious beliefs, or societal factors may have on ambivalent sexism toward girls. Since both women and girls were victimized in the events that sparked this research, we investigated each group as a potential target of ambivalent sexism to explore possible relations of ambivalent sexism toward girls with religiosity and fundamentalism.

Previous studies consistently found men to score higher than women on hostile sexism toward women. This was true irrespective of the country under investigation, as evidenced by the multi-nation study of Glick et al. (2000). Also, the gender difference in hostile sexism has been observed irrespective of whether participants identified as Jews (Gaunt, 2012), as Muslims (Taşdemir and Sakallı-Uğurlu, 2010; Glick et al., 2016), or as Christians (Glick et al., 2002; Mikołajczak and Pietrzak, 2014). In the same way, boys have consistently been found to score higher than girls on hostile sexism toward girls (de Lemus et al., 2010; Garaigordobil and Aliri, 2012; Ferragut et al., 2013; Rau, 2013). Some studies also found male participants to more strongly endorse benevolent sexist beliefs toward women or girls than female participants (Glick et al., 2002; Ferragut et al., 2013; Mikołajczak and Pietrzak, 2014), while others did not find a gender difference (de Lemus et al., 2010; Taşdemir and Sakallı-Uğurlu, 2010; Garaigordobil and Aliri, 2012; Gaunt, 2012; Rau, 2013; Glick et al., 2016). For our own studies, we therefore expected male participants to endorse hostile sexist beliefs toward girls and women more strongly but did not specify a directional hypothesis about gender differences in benevolent sexism.

# Religiosity and Ambivalent Sexism in Different Religious Groups

Research shows that religiosity is associated with gender inequality (e.g., Klingorová, 2015), sexism, and negative attitudes toward gender equality (e.g., Diehl et al., 2009; Seguino, 2011; Adamczyk, 2013). Using data from the World Values Survey, for example, Adamczyk (2013) found that the more religious participants described themselves to be, the more they endorsed gender inequality. Surveying 4,000 Turks living in Germany (90% of them identifying as Muslims) and 10,000 Germans with no migration background (70% self-identifying as Christians), Diehl et al. (2009) found that high religiosity was negatively related to the approval of gender equality in both groups, even after controlling for education and employment status.

Empirical findings regarding the relation between religiosity and ambivalent sexism are less clear-cut. While higher levels of religiosity have been consistently related to a stronger prevalence of benevolent sexism, evidence has been mixed for the association between religiosity and hostile sexism.

In a convenience sample of Jewish participants from Israel, Gaunt (2012) found positive relations between religiosity and benevolent, but not hostile, sexism toward women. In a sample of more than 1,000 men and women from Spain, Glick et al. (2002) found that Catholic religiosity predicted stronger benevolent, but not hostile, sexism toward women. Similarly, Mikołajczak and Pietrzak (2014), investigating a convenience sample from Poland, found Catholic religiosity to covary with benevolent, but not hostile, sexism toward women. In a sample of Evangelical Christian undergraduate students from the United States, Maltby et al. (2010) found Christian orthodoxy to correlate with one of the three subfactors of benevolent sexism toward women, protective paternalism, in men but not in women. In contrast, no relation was found between Christian orthodox beliefs and hostile sexism.

The two studies we are aware of which did find interrelations between religiosity and hostile sexism investigated undergraduate students from Turkey. Taşdemir and Sakallı-Uğurlu (2010) and Glick et al. (2016) found positive correlations between Muslim religiosity and both subtypes of ambivalent sexism toward women. We are not aware of any study investigating religiosity and ambivalent sexism toward girls.

Taken together, this pattern of findings is consistent with the view that benevolent sexism toward women is tolerated or even encouraged by various religions, while hostile sexism seems to be absent from all the religions investigated aside from Islam (cf. Hunsberger and Jackson, 2005; Whitley, 2009). However, none of the cited studies have accounted for the potential influence of fundamentalism and other ideologies favoring outgroup-derogation, such as right-wing authoritarianism or social dominance orientation. For our own research, we therefore hypothesized that interrelations between religiosity and ambivalent sexism would be attenuated if these confounders were accounted for, and thus included them in our investigations. Since, to our knowledge, no research has yet investigated religiosity and fundamentalism as predictors of ambivalent sexism toward girls, we refrained from formulating a directional hypothesis specifying differences based on whether girls or women are the targets of sexism.

When comparing previous studies investigating people of varying religious affiliations in different countries, the mean values obtained for benevolent and hostile sexism toward women were higher in samples of Muslims (Taşdemir and Sakallı-Uğurlu, 2010; Glick et al., 2016) than in samples of Christians (Glick et al., 2002; Maltby et al., 2010; Mikołajczak and Pietrzak, 2014). We did not find any studies comparing the levels of ambivalent sexism toward girls in different religious groups. Also, no previous study has investigated a religious group that represents a minority in the respective country. According to traditional acculturation theories, religious minority groups can be expected over time to become increasingly similar in their beliefs to the religious majority (cf. Alba and Nee, 1997). Yet, minority status can trigger reactivity as well, i.e., a contrasting of personal beliefs from the ones shared by the majority (e.g., Diehl and Koenig, 2013). Hence, it is plausible for Muslims residing in Germany to be either less sexist or more sexist than fellow believers living in countries with a Muslim majority. In order to analyze whether potential differences are mediated by differences in religiosity and fundamentalism in our Main Study, we ran two Pilot Studies exploring possible differences in ambivalent sexism between religious groups.

# Religiosity and Fundamentalism as Predictors of Sexism

Many studies have identified a link between fundamentalism and negative attitudes, or open hostility, toward outgroups. While most studies examining the fundamentalism-prejudice link have investigated negative attitudes toward minority groups, such as homosexuals (Whitley, 2009, for a review), transgender individuals (e.g., Nagoshi et al., 2008), or racial minorities (Hall et al., 2010, for a review), only a few have also looked at gender-related prejudice (attitudes toward women: McFarland, 1989; Hunsberger et al., 1999; endorsement of rape myth: Sheldon and Parent, 2002; ambivalent sexism: Hill et al., 2010).

A closer look at the interrelatedness of fundamentalism, religiosity, and negative attitudes toward outgroups suggests that the religiosity-sexism link can be at least partly explained by fundamentalism. For instance, in a sample of undergraduates from the United States, Johnson et al. (2011) found that fundamentalism strongly covaried with religiosity and, together with right-wing authoritarianism, mediated the relation between religiosity and negative prejudice against homosexuals or African Americans. Similarly, in a European multi-national study, Koopmans (2015) found that fundamentalism was strongly related to out-group hostility, while religiosity, controlling for the impact of fundamentalism, was not. Further, Kirkpatrick et al. (1991) and Altemeyer and Hunsberger (1992), investigating college students from the United States and Canada, found that religiosity was unrelated to discriminatory attitudes toward various minority groups once fundamentalism had been controlled for.

Fundamentalism has been found (Banyasz et al., 2016; Harnish et al., 2017) to be strongly correlated with both social dominance orientation (SDO; Pratto et al., 1994) and right-wing authoritarianism (RWA; Altemeyer, 1996). One plausible explanation is that all three ideologies are associated with cognitively rigid thinking (cf. Hill et al., 2010; Brandt and Reyna, 2014). SDO is based on the belief that some groups are superior to others, a belief that coincides with endorsing the suppression of outgroups and a preference for hierarchy within any social system. RWA is a social ideology favoring traditional values and obedience to authority figures, composed of three attitudinal clusters: authoritarian submission, authoritarian aggression, and conventionalism. Research using the Religious Fundamentalism Scale developed by Altemeyer and Hunsberger (1992, 2004; we used the German version by Schnell, 2010), for example, has found strong associations between fundamentalism and RWA (for a review see Altemeyer and Hunsberger, 2004: correlations between 0.62 and 0.82). Also, Sibley et al. (2007) found that RWA and SDO correlated with both benevolent and hostile sexism toward women. We therefore included measures of RWA and SDO to account for these potential confounding variables. To avoid suppression effects and statistical artifacts (Mavor et al., 2009), we treated them as controls in the regression analyses of our Main Study.

# The Present Research


In light of relevant findings by previous research, we expected (a) that male participants would show more hostile sexism (but not necessarily more benevolent sexism) toward girls and women than female participants would, and (b) that Muslims would score highest on religiosity and fundamentalism, followed by Christians and, lastly, non-religious individuals. To test our research instruments and determine whether we would need to take differences in ambivalent sexism between religious groups into account, we ran two Pilot Studies.

The core assumption of our research was tested in our Main Study with a multigroup multivariate moderated mediation analysis. We expected that religiosity and fundamentalism would be associated with ambivalent sexism irrespective of religious affiliation, but that potential group differences in ambivalent sexism would, at least partly, be mediated by differences in levels of religiosity, fundamentalism, RWA, and SDO.

# MATERIALS AND METHODS

All surveys were conducted with the informed consent of each participant. More specifically, participants were informed that (1) this research was being conducted by researchers from Freie Universität Berlin, (2) the purpose of the research was to investigate adolescents' and adults' values and attitudes toward life, (3) the expected duration would be about 5 min, (4) they had the right to withdraw from the research at any point after participation had begun, (5) there was no financial inducement for participation, and (6) no information relating to the person's identity, such as their name, email or home address, would be collected. They were further informed whom to contact for questions about the research (Pilot Study 1, Main Study) or were given the opportunity to ask questions and receive answers from the interviewers (Pilot Study 2, Main Study).

# Research Instruments

Religiosity (Pilot Studies 1, 2, Main Study) was measured via the German version of the Centrality of Religiosity Scale (CRS, Huber and Huber, 2012), which is suitable for at least the Abrahamic religions (Judaism, Christianity, Islam). With 15 items, the scale measures the centrality or importance the participant attaches to religious beliefs (e.g., "How often do you take part in religious services?", "How often do you experience situations in which you have the feeling that God or something divine intervenes in your life?"). All answering scales provided five options that referred either to frequency (1 = never, 2 = seldom, 3 = sometimes, 4 = often, 5 = very often) or intensity (1 = not at all, 2 = rather not, 3 = somewhat, 4 = rather yes, 5 = very much so), depending on the content of the item.

Fundamentalism (Pilot Studies 1, 2, Main Study) was measured with the Innsbrucker Religiöser-Fundamentalismus-Skala (IRFS, Schnell, 2010), a shortened and adapted German version of the Religious Fundamentalism Scale by Altemeyer and Hunsberger (1992; revised 2004). With eight items and five-point answering scales (1 = strongly disagree, 5 = strongly agree), the one-dimensional scale captures the extent to which individuals believe that the traditions of their religion are inerrant (e.g., "The traditions and scripts of my religion are without error."), binding and beyond question (e.g., "Someone who compromises the traditions of religion cannot be a true follower of God."), and lead to a special relation with God for those who adhere to the rules they establish (e.g., "Only those who fully comply with the rules of my religion will experience happiness and salvation").

Ambivalent sexism toward girls (Pilot Study 1, Main Study) was measured with the Ambivalent Sexism toward Girls in Adolescents Inventory (Rau, 2013), a German version of the Ambivalent Sexism Inventory (ASI, Glick and Fiske, 1996) adapted for adolescents. The inventory uses five-point response scales (1 = strongly disagree, 5 = strongly agree), with 12 items relating to hostile sexism (e.g., "In a group, a boy is the better leader," "Girls are difficult to predict: they constantly change their minds.") and 13 items relating to benevolent sexism (e.g., "If a girl feels cold, the boy should give her his sweater even if he feels cold himself," "Girls care more about others than boys do").

Ambivalent sexism toward women (Pilot Study 2) was measured with six items from the German version of the ASI by Eckes and Six-Materna (1999) pertaining to hostile sexism (e.g., "Most women fail to appreciate fully all that men do for them"), and six items pertaining to benevolent sexism (e.g., "In a disaster, women ought to be rescued before men", response scales: 1 = strongly disagree – 6 = strongly agree).

Right-wing authoritarianism (Pilot Study 2, Main Study) was measured with six items taken from the German short version of the scale by Altemeyer (1996) developed by Beierlein et al. (2014; sample item: "What we really need are strong, determined leaders, to live securely in our society," answering scales: 1 = strongly disagree to 5 = strongly agree).

Social dominance orientation (Pilot Study 2, Main Study) was measured with eight items (e.g., "We should do what we can to equalize conditions for different groups," answering scales: 1 = strongly disagree to 5 = strongly agree) taken from the scale by Carvacho et al. (in preparation), a short version of the scale by Ho et al. (2015) translated into German.

# Statistical Analyses

To investigate possible differences between genders and religious groups concerning levels of religiosity, fundamentalism, and ambivalent sexism, we conducted, whenever admissible and unless otherwise stated, two-factorial (religious group, gender) ANOVAs (Pilot Studies 1, 2, Main Study). Since heteroscedasticity was plausible, for example, for religiosity between non-religious and religious participants, the HC3 approach described by MacKinnon and White (1985) implemented in the car package (Fox and Weisberg, 2011) for R (R Core Team, 2017) was applied in accordance with the recommendations of Long and Ervin (2000). Accordingly, post hoc group comparisons were performed using t-tests with Welch-corrected degrees of freedom. When Shapiro–Wilk tests indicated deviations from the assumption of normality for any of the investigated groups after Bonferroni–Holm adjustment, median-based tests as described and recommended by Wilcox (2012) and implemented in the WRS2 package (Mair et al., 2017) for R were used as a robust alternative to classical ANOVAs. Multiple comparisons and inference regarding correlations were corrected using the Bonferroni–Holm adjustment (Holm, 1979). In our Main Study, we investigated our core hypothesis regarding the predictability of ambivalent sexism from religiosity and fundamentalism by estimating a multigroup moderated mediation analysis.
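The Bonferroni–Holm adjustment named above can be illustrated with a minimal sketch. It is written in Python for illustration only (the analyses in this article were run in R), the function name `holm_adjust` is our own, and the raw p-values are hypothetical rather than taken from the studies.

```python
def holm_adjust(p_values):
    """Return Holm (1979) step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value never falls below
        # the adjusted value of a smaller raw p-value.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four post hoc comparisons.
raw = [0.010, 0.040, 0.030, 0.005]
adj = holm_adjust(raw)
print([round(p, 3) for p in adj])
```

A comparison counts as significant at level α when its adjusted p-value is below α, which is equivalent to the sequential rejection procedure described by Holm (1979).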

# Pilot Study 1

Our first goal was to examine ambivalent sexism toward girls and identify differences according to gender and religious affiliation. To do so, we conducted an online survey using QuestBack GmbH's online surveying platform Unipark. Since we targeted adolescents and young adults, the survey was primarily shared on the social network platform Facebook and in online forums for religious adolescents<sup>1,2,3</sup>. Additional adolescents were recruited via e-mail distribution lists of religious youth clubs.

#### Research Participants

We reached 132 adolescents and young adults (50 male, 60 female, 22 missing) between 12 and 32 years of age (Mage = 19.36, SD = 3.82). Fifty-six participants self-identified as Christians, 15 as Muslims, 28 as not having any religious affiliation, and 11 as having a religious affiliation other than Christian or Muslim (22 missing). Only participants of Christian or Muslim faith, as well as non-religious participants, were included in the subsequent analyses, reducing the sample size to 99 (43 male, 56 female).

#### Research Instruments

The following reliabilities were obtained for the scales administered in Pilot Study 1: Centrality of Religiosity Scale (α = 0.97), Innsbrucker Religiöser-Fundamentalismus-Scale (α = 0.93), and the Ambivalent Sexism toward Girls in Adolescents Inventory (hostile sexism: α = 0.90; benevolent sexism: α = 0.86).

<sup>1</sup>Shia-Forum

<sup>2</sup>religionsforum.de

<sup>3</sup>youthweb.net

#### Results

**Table 1** depicts the results of ANOVAs or, where deviations from the assumption of normality had been observed, ANOVAs for medians, conducted on religiosity, fundamentalism, and sexism toward girls. Means and standard deviations are reported in the **Supplementary Table S1** in the **Appendix**. In the following, we only report statistically significant findings.

Participants who identified as Muslims were more religious (M = 4.54, SD = 0.28) than Christians (M = 3.44, SD = 1.01) and non-religious participants (M = 1.95, SD = 0.87). Muslim respondents endorsed fundamentalism to a stronger extent (M = 4.08, SD = 0.54) than Christians (M = 2.40, SD = 1.12) and non-religious participants (M = 1.71, SD = 0.84).

Regarding benevolent sexist beliefs toward girls, Muslim participants endorsed them more strongly (M = 3.57, SD = 0.66) than Christian participants (M = 2.89, SD = 0.76) and non-religious participants (M = 2.87, SD = 0.87). Male participants showed higher levels of benevolent sexism than female participants (Mmale = 3.41, SDmale = 0.69; Mfemale = 2.67, SDfemale = 0.75). No significant effects were observed for hostile sexism toward girls.

**Table 2** depicts correlations between all measured variables. Participants held more benevolent sexist attitudes toward girls the more religious they described themselves to be and the more they reported accepting religious fundamentalist beliefs. Hostile sexism covaried with fundamentalism, while the association with religiosity was not statistically significant. As our subsample of Muslim participants was extremely small (n = 15), we refrained from calculating separate correlation coefficients according to religious affiliation.

# Pilot Study 2

Our next goal was to investigate ambivalent sexism toward women. Again, we explored differences according to gender and religious affiliation.

TABLE 1 | Main effects and interaction effects from ANOVAs (F-values)/ANOVAs for medians (V-values) on religiosity, fundamentalism, benevolent and hostile sexism toward girls in Pilot Study 1.


Superscripted letters indicate which of the three post hoc group comparisons for the main effect for religious affiliation are significant at p < 0.05 after Bonferroni–Holm correction: a: Non-religious – Christian, b: Non-religious – Muslim, c: Christian – Muslim.

TABLE 2 | Correlations among religiosity, fundamentalism, benevolent and hostile sexism toward girls in Pilot Study 1.


N = 101. <sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01, <sup>∗∗∗</sup>p < 0.001.

fpsyg-09-02399 November 29, 2018 Time: 16:52 # 7

#### Research Participants

In four different neighborhoods of a large German city, teams of one female and one male psychology student approached passersby in public places (e.g., shopping areas, children's playgrounds) and asked them to fill out our questionnaire. In doing so, we reached a sample of 146 adolescents and adults (71 women, 73 men, 2 indicated a different gender) between 13 and 77 years (M = 34.43, SD = 13.84). Muslim participants were significantly younger (Mdn = 26) than Christian (Mdn = 34, H = 13.67, p < 0.001) and non-religious participants (Mdn = 32, H = 12.47, p < 0.001). Fifty-three participants self-identified as non-religious, 34 as Christians, and 48 as Muslims (7 other religious affiliations, 4 missing). Only Christians, Muslims, and non-religious participants who indicated their gender were included in subsequent analyses (N = 134).

#### Research Instruments

Reliability for the ambivalent sexism toward women scale was very good (hostile sexism: α = 0.87; benevolent sexism: α = 0.87). As we had asked our research participants to fill out our questionnaire on the street, it was important that it could be completed within a few minutes. To ensure this, we shortened the scale measuring religiosity to six items (α = 0.94) and the scale measuring fundamentalism to five items (α = 0.93). Pilot Study 2 additionally included the construal variables RWA (showing an acceptable reliability: α = 0.80) and SDO (displaying a mediocre, but still acceptable reliability: α = 0.67).
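The reliabilities reported for the scales are Cronbach's α coefficients. As a minimal sketch of how such a coefficient is computed, the following Python function applies the standard formula to a respondents-by-items score matrix. The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items in the scale
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: five hypothetical respondents answering a three-item scale.
toy = [
    [1, 2, 1],
    [3, 3, 4],
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
]
print(round(cronbach_alpha(toy), 2))
```

The coefficient approaches 1 when the items covary strongly relative to their individual variances, which is why shortening a scale, as was done for the street survey, can lower α unless the retained items remain highly intercorrelated.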

#### Results

**Table 3** displays the results of ANOVAs or, where deviations from the assumption of normality had been observed, ANOVAs for medians, conducted on religiosity, fundamentalism, sexism toward women, RWA, and SDO. Means and standard deviations are reported in the **Supplementary Table S2** in the **Appendix**. Only significant effects will be described in the following.

Participants of Muslim faith described themselves as significantly more religious (M = 3.72, SD = 0.92) than Christians (M = 2.51, SD = 1.23) and non-religious participants (M = 1.57, SD = 0.71). Muslims endorsed fundamentalism (M = 3.50, SD = 1.25) to a stronger extent than Christians (M = 1.65, SD = 0.95), and Christians endorsed it more strongly than non-religious participants (M = 1.21, SD = 0.39).

Muslims endorsed benevolent sexist beliefs toward women more strongly (M = 4.36, SD = 1.20) than the other two groups (MChristians = 2.85, SD = 1.21; Mnon−religious = 2.54, SD = 1.12). Also, Muslim participants (M = 3.38, SD = 1.12) endorsed more hostile sexist positions toward women than the other groups (MChristians = 2.01, SD = 0.96; Mnon−religious = 1.91, SD = 0.91). Further, Muslim participants indicated higher levels of support for RWA (M = 3.09, SD = 0.90) and SDO (M = 2.49, SD = 0.75) than the other groups (Christians: MRWA = 2.09, SD = 0.74; MSDO = 1.83, SD = 0.66; non-religious participants: MRWA = 1.94, SD = 0.70; MSDO = 1.78, SD = 0.59).

**Table 4** depicts correlations between all measured variables for the entire sample, separated by religious affiliation. Participants held more benevolent sexist attitudes the more religious they described themselves as being, and the more they accepted fundamentalist tenets. For hostile sexism, the pattern and magnitude of correlations were similar. Calculated within the religious groups of Christians and Muslims, as shown in **Table 4**, all correlations between our religion- and sexism-related variables were positive. However, they varied in strength and many of them did not reach statistical significance.

TABLE 3 | Main effects and interaction effects from ANOVAs (F-values)/ANOVAs for medians (V-values) on religiosity, fundamentalism, benevolent and hostile sexism toward women, right-wing authoritarianism, and social dominance orientation in Pilot Study 2.

Superscripted letters indicate which of the three post hoc group comparisons for the main effect for religious affiliation are significant at p < 0.05 after Bonferroni–Holm correction: a: Non-religious – Christian, b: Non-religious – Muslim, c: Christian – Muslim.

TABLE 4 | Correlations among religiosity, fundamentalism, benevolent and hostile sexism toward women, right-wing authoritarianism, and social dominance orientation in Pilot Study 2.

Correlations for each religious group are shown in parentheses in the order non-religious / Christian / Muslim. Due to occasional missing data, Ns range as follows: 128 – 136 (51 – 54 / 33 – 36 / 48 – 50). p-values were Bonferroni–Holm corrected within each group but not across groups. <sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01, <sup>∗∗∗</sup>p < 0.001.

# Discussion of Pilot Studies 1 and 2

Due to the small sample sizes, we refrained from conducting more complex analyses which would have allowed us to control for potential confounders. While in both Pilot Studies Muslims described themselves as more religious and fundamentalist than the two other groups, floor effects were observed in the statistical distribution of religiosity among non-religious participants and of fundamentalism among Christians and non-religious participants. We therefore oversampled religious participants, particularly of Muslim but also of Christian faith in our Main Study.

In Pilot Study 1, we found higher levels of benevolent sexism toward girls among Muslims than in the other two groups. There were no differences in hostile sexism toward girls. In Pilot Study 2, Muslims endorsed both benevolent and hostile sexist beliefs toward women more strongly than Christians and non-religious participants, while the latter two groups did not differ from one another.

With respect to gender, both Pilot Studies showed, contrary to our expectation, that male and female participants did not differ in their levels of hostile sexism toward girls or women. While male participants showed higher levels of benevolent sexism toward girls than female participants in Pilot Study 1, there was no such difference in benevolent sexism toward women in Pilot Study 2. We aimed to clarify these partly unexpected findings in the investigation of sexism toward girls in our Main Study.

In our Pilot Studies, we found small- to medium-sized correlation coefficients between the religiosity-related variables and ambivalent sexism when investigating the overall samples. These correlations may, however, be at least partly due to mean differences between religious groups for both types of variables. When considered within the religious groups in Pilot Study 2, correlation coefficients varied in strength and were statistically non-significant in many cases.

Our Main Study therefore aimed at investigating whether the differences between genders and religious groups that we uncovered in ambivalent sexism were (at least partly) due to group differences in religiosity and fundamentalism. Additionally, because interrelations between the variables varied considerably across the groups, we conducted a moderated mediation analysis to investigate the links in each of our religious groups of Muslims and Christians independently.

# Main Study

#### Research Participants

For our Main Study, we intensified our efforts to reach Muslims and Christians, not only by launching an online survey via Unipark but also by systematically approaching potential participants in places where we expected to find younger religious people (e.g., youth clubs in particular districts of a large German city). As it turned out to be very difficult to recruit religious participants, particularly boys and young men, we loosened the age-related criterion we had applied in Pilot Study 1 and also approached people of middle age. As in Pilot Study 2, the face-to-face interviews were conducted by teams of one male and one female student.

Altogether, 350 people between 13 and 48 years (M = 21.31, SD = 4.92) participated (127 male, 221 female, 2 missing). Of those, 166 were assessed via an online questionnaire and 184 via interview. Forty-three participants were non-religious, 106 Christians, and 191 Muslims. Ten participants were of a different religion and were excluded. Muslim participants (Mdn = 21) were younger than non-religious participants (Mdn = 24, H = 6.50, p = 0.022) and Christian participants (Mdn = 22, H = 7.98, p = 0.014). Non-religious participants did not differ in age from Christians (H = 1.55, p = 0.214). While gender was relatively balanced within the Muslim group (97 female, 94 male), the sample of non-religious participants was somewhat skewed toward female participants (30 female, 13 male), and the sample of Christians was highly skewed toward female participants (91 female, 15 male).

#### Research Instruments


Our Main Study used exactly the same scales as in Pilot Study 1, supplemented by the measures of RWA and SDO already applied in Pilot Study 2. The scales reached the following reliabilities in our Main Study: the complete version of the religiosity scale (α = 0.96), the complete version of the fundamentalism scale (α = 0.95), the Ambivalent Sexism toward Girls Inventory (benevolent sexism: α = 0.87; hostile sexism: α = 0.96), the RWA-scale (α = 0.82), and the SDO-scale (α = 0.79).

#### Statistical Analyses

We again conducted ANOVAs and ANOVAs for medians to detect group differences according to religious affiliation (Muslims, Christians, non-religious participants) and gender regarding ambivalent sexism, religiosity, fundamentalism, SDO, and RWA.

To investigate our main hypotheses regarding the predictability of benevolent and hostile sexism from the religion-related variables, only participants reporting to be either of Christian or Muslim faith were included in the following analyses, resulting in four gender-religion combination groups. In a first step, we estimated a multigroup multivariate regression using lavaan (Version 0.5-23.1097; Rosseel, 2012), with both forms of sexism simultaneously included as outcomes. Since we found correlations between the variables to differ according to participants' religious affiliation in Pilot Study 2, we estimated the regression weights freely, meaning that they were allowed to differ across the four groups.

This procedure resulted in a model in which the influences of the five predictors (religiosity, fundamentalism, RWA, SDO, and age) on the two forms of sexism (hostile, benevolent) are assumed to be moderated by the grouping variable (i.e., the religious affiliation-gender combination). To determine whether the group differences in religiosity and fundamentalism identified in Pilot Studies 1 and 2 were associated with the group differences we identified for ambivalent sexism, we also tested the pathways for mediation. In line with Muthén and Asparouhov (2015), this allows for the identification of three separate effects.

(1) The total natural indirect effect (TNIE) represents the overall influence of the difference between groups in the outcome that is mediated via the intermediate variables. Hence, the TNIE depicts differences between, for example, female Christians and male Muslims in ambivalent sexism that can be explained by the differences between these two groups in religiosity, religious fundamentalism, RWA, SDO, and age.

(2) The pure natural direct effect (PNDE) represents the group differences in ambivalent sexism that go beyond the mediated components, meaning, for example, that female Christians and male Muslims differ in ambivalent sexism due to pathways not captured in the variables assessed in this study.

(3) The total effect (TE) constitutes the sum of the former two, thus representing the overall influence of group differences on the outcomes, that is the overall difference in ambivalent sexism between, for example, female Christians and male Muslims.
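The three effects listed above can be written compactly in counterfactual notation. The following is a sketch under the standard definitions of natural direct and indirect effects; the notation is introduced here for illustration and does not appear in the original text: x = 1 denotes the comparison group (e.g., male Muslims), x = 0 the reference group, and Y(x, M(x')) the outcome expected for group x if its mediators took the values typical of group x'.

```latex
\begin{aligned}
\mathrm{TNIE} &= E[Y(1, M(1))] - E[Y(1, M(0))] \\
\mathrm{PNDE} &= E[Y(1, M(0))] - E[Y(0, M(0))] \\
\mathrm{TE}   &= \mathrm{TNIE} + \mathrm{PNDE} = E[Y(1, M(1))] - E[Y(0, M(0))]
\end{aligned}
```

The middle term E[Y(1, M(0))] cancels when the first two lines are added, which is why the total effect equals the plain group difference in the outcome.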

To test these effects, we applied the Monte Carlo resampling methods described by Tofighi and MacKinnon (2016) and implemented in the R package RMediation<sup>4</sup> (Tofighi and MacKinnon, 2011). We did so because the bootstrap resampling methods which are often applied in these situations have performed poorly in small samples (Koopman et al., 2015). As Christians are the majority religious group in Germany, they are suitable as a reference in the analyses. Since the share of male Christians was very small in our sample, we selected female Christians as the reference group. We tested the bivariate normality of both types of sexism using the Mardia Test as implemented in the psych package for R (Revelle, 2018) and found significant skew in all four groups included in the model. To accommodate the non-normality of the variables, we chose robust standard errors via the MLR estimator implemented in lavaan.
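The logic of a Monte Carlo confidence interval for an indirect effect can be sketched in a few lines of Python. This is a simplified illustration, not the RMediation implementation: it assumes a single mediator, treats the two path estimates as independent normal variables, and uses hypothetical values for the coefficients (`a`, `b`) and their standard errors. In line with the footnoted modification, the median of the simulated products is used as the point estimate.

```python
import random
from statistics import median


def monte_carlo_indirect(a, se_a, b, se_b, draws=20000, level=0.95, seed=1):
    """Monte Carlo interval for the product a*b of two path coefficients.

    a: group difference in the mediator; b: effect of the mediator on the
    outcome. Both are assumed to be independent and normally distributed
    (a simplifying assumption made for this sketch).
    """
    rng = random.Random(seed)
    # Simulate the sampling distribution of the product and sort it.
    products = sorted(
        rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(draws)
    )
    # Percentile bounds of the simulated distribution.
    lower_idx = int((1 - level) / 2 * draws)
    upper_idx = int((1 + level) / 2 * draws) - 1
    return median(products), products[lower_idx], products[upper_idx]


# Hypothetical path estimates (not taken from the study).
point, lower, upper = monte_carlo_indirect(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"indirect effect: {point:.3f}, 95% interval: [{lower:.3f}, {upper:.3f}]")
```

Because the product of two normal variables is skewed, the resulting interval is asymmetric around the point estimate, which is the main reason this approach is preferred over a symmetric normal-theory interval for indirect effects.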

#### Results

Means and standard deviations for the following ANOVAs are depicted in **Table 5**. **Table 6** displays the results of ANOVAs or, where deviations from the assumption of normality had been observed, of ANOVAs for medians, conducted on religiosity, fundamentalism, sexism toward girls, RWA, and SDO in our Main Study.

An ANOVA conducted on religiosity revealed a main effect of religious affiliation but neither a main effect of gender nor an interaction effect. Non-religious participants showed the lowest levels of religiosity, Christians higher levels, and Muslims the highest (all pairwise comparisons p < 0.001).

Regarding fundamentalism, an ANOVA for medians revealed a main effect of religion but no effect of gender. The interaction was also significant. Post hoc analyses showed no difference between the genders among non-religious (H = 0.79, p = 0.375) and Christian participants (H = 2.24, p = 0.269), while male Muslims reported stronger fundamentalism than female Muslims (H = 20.31, p < 0.001). Non-religious participants reported lower fundamentalism than either Christians (H = 7.56, p = 0.006) or Muslims (H = 75.99, p < 0.001). The comparison between Christians and Muslims also revealed significant differences (H = 139.77, p < 0.001), with Muslims reporting stronger fundamentalism.

An ANOVA for benevolent sexism showed main effects for religious group and gender but no interaction effect. Post hoc analyses showed no difference between non-religious participants and Christians (t[73.15] = −1.43, p = 0.155), while both differed significantly from Muslims, who showed more benevolent sexism (compared to non-religious participants: t[67.93] = −6.27, p < 0.001; compared to Christians: t[246.49] = −6.67, p < 0.001). The gender effect was due to male participants reporting stronger benevolent sexism than female participants (Mfemale = 2.54, SDfemale = 0.76; Mmale = 2.99, SDmale = 0.72). Further, pairwise comparisons revealed that female Christians (whom we used as the reference group in our moderated mediation analysis) showed less benevolent sexism than female Muslims (t[166.79] = −4.42, p < 0.001) and male Muslims (t[172.65] = −7.86, p < 0.001), but did not differ from male Christian (t[17.93] = −1.95, p = 0.200), female non-religious (t[45.99] = 1.05, p = 0.602), and male non-religious (t[13.89] = 0.26, p = 0.797) participants.

<sup>4</sup>We altered the implementation of the RMediation-package to provide medians rather than means as the point estimates for effects.

An ANOVA for medians for hostile sexism revealed main effects for religion and gender as well as an interaction effect. Pairwise comparisons revealed that male Muslims were more hostile toward girls than all remaining groups (all p < 0.001), while none of the other five groups differed significantly from each other.

Analyzing RWA in an ANOVA, we found a main effect of religion but neither an effect of gender nor of an interaction between gender and religion. Those without religious affiliation (M = 2.14, SD = 0.93) and Christians (M = 2.10, SD = 0.70) did not differ from each other (t[62.23] = 0.28, p = 0.779) while Muslims (M = 2.74, SD = 0.75) more strongly endorsed RWA beliefs (compared to non-religious participants: t[55.42] = −3.93, p < 0.001; compared to Christians: t[229.31] = −7.33, p < 0.001).
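The fractional degrees of freedom in brackets result from the Welch correction. As a rough illustration, the following Python sketch computes a Welch t-test from summary statistics, using the RWA means and standard deviations reported above for non-religious and Christian participants. The group sizes (43 and 106) are an assumption: they are the full group sizes, whereas the effective sizes may have been smaller because of missing data. The function name `welch_from_summary` is hypothetical.

```python
from math import sqrt


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each group.
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# RWA: non-religious (M = 2.14, SD = 0.93) vs. Christians (M = 2.10, SD = 0.70).
# Group sizes are assumed to equal the full group sizes of 43 and 106.
t, df = welch_from_summary(2.14, 0.93, 43, 2.10, 0.70, 106)
print(f"t[{df:.2f}] = {t:.2f}")
```

Under these assumptions the sketch yields roughly 62.2 degrees of freedom and a t value of about 0.25, close to the reported t[62.23] = 0.28; the small discrepancy is plausibly due to rounding of the published means and standard deviations.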

Regarding SDO, there were no effects of religion, gender, or their interaction.

We then estimated the multigroup multivariate regression (see **Figure 1**), only taking participants of Muslim or Christian faith into account. The bivariate correlation coefficients are reported in **Supplementary Table S3** in the **Appendix**. **Figure 1** illustrates the results for both benevolent and hostile sexism toward girls. Within the group of male Muslims, a higher degree of self-reported religiosity was significantly associated with higher degrees of benevolent sexism. Additionally, for female Muslims higher levels of fundamentalism were associated with higher levels of benevolent sexism. RWA predicted benevolent sexism significantly in all groups, except for male Christians. Only among male Muslims was SDO an additional positive predictor and age an additional negative predictor of benevolent sexist attitudes toward girls.

For hostile sexism, religiosity was predictive only among female Muslims. At the same time, female Muslims were less hostile toward girls the stronger they expressed fundamentalist beliefs. In the remaining three groups, hostile sexism increased with fundamentalism, albeit not significantly so for male Christians. RWA was a strong predictor of hostility in all four groups but, again, not significantly so for male Christians. SDO emerged as an additional predictor for female Christians. As was the case for benevolent sexism, the older Muslim participants were, the less they endorsed hostile sexism, whereas age did not have an effect in any of the remaining groups.

We then conducted the moderated mediation analysis specifying female Christians as the reference group. Results are depicted in **Table 7**. For benevolent sexism toward girls, in the analysis for male Christians TE was statistically significant but the indirect and direct effects were not. While falling short of the significance threshold may have been due to the extremely small sample size, results seem to suggest that the stronger benevolent sexism of male Christians as compared to female Christians, indicated by the significant TE, cannot be explained by any of the mediating variables. When comparing male Christians to female Christians in their levels of hostile sexism, a similar picture emerged. In this case even TE was not statistically significant.

TABLE 5 | Means and standard deviations (in parentheses) for religiosity, fundamentalism, benevolent and hostile sexism toward girls, right-wing authoritarianism, and social dominance orientation separated by

TABLE 6 | Main effects and interaction effects from ANOVAs (F-values)/ANOVAs for medians (V-values) on religiosity, fundamentalism, benevolent and hostile sexism toward girls, right-wing authoritarianism, and social dominance orientation in our Main Study.

Superscripted letters indicate which of the three post hoc group comparisons for the main effect for religious affiliation are significant at p < 0.05 after Bonferroni–Holm correction: a: Non-religious – Christian, b: Non-religious – Muslim, c: Christian – Muslim.

For female Muslims, in contrast, the stronger benevolent sexism we found, as compared to female Christians, was mediated by fundamentalism: female Muslims were more in favor of fundamentalist religious beliefs, and this was accompanied by stronger levels of benevolent sexism. In this analysis, both TNIE and TE were statistically significant. The higher level of religiosity found for female Muslims, as compared to female Christians, was associated with higher levels in hostile sexism. The opposite was the case for fundamentalism, where the higher levels were accompanied by less hostile sexism toward girls. Overall, this resulted in a non-significant TE, as female Muslims did not differ from female Christians in their hostile sexism toward girls (**Table 5**).


TABLE 7 | Results of the moderated mediation analysis predicting benevolent and hostile sexism toward girls: medians and 97% Confidence Intervals (in parentheses) generated by the Monte-Carlo resampling approach in Main Study.

<sup>a</sup>Female Christians are used as the reference group in this analysis. <sup>b</sup>This is the combination of the indirect effects of religiosity, fundamentalism, SDO, and RWA. <sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01, <sup>∗∗∗</sup>p < 0.001.

For male Muslims, their more pronounced benevolent sexism, as compared to female Christians, was partly mediated by their stronger religiosity, accompanied by a significant PNDE. The significant PNDE suggests that male Muslims approved more strongly of benevolent sexism toward girls than female Christians did to an extent beyond what can be explained by our mediator variables. A slightly different pattern emerged for hostile sexism, where male Muslims' stronger hostility, as compared to female Christians, was partly mediated by their more pronounced fundamentalism. As for benevolent sexism, a significant PNDE emerged even after inclusion of the two religion-related variables, RWA, SDO, and age, suggesting that male Muslims differed from female Christians in their hostility toward girls due to factors not covered by our analysis.

To summarize, the results of our moderated mediation analysis show that the stronger benevolent and hostile sexism we observed in male Muslims (as compared to female Christians) can be partly, but not completely, explained by religiosity and fundamentalism. However, religiosity and fundamentalism mediated the differences in ambivalent sexism we found between female Muslims and female Christians. In this case, once our mediating variables were taken into account, female Muslims no longer showed PNDEs in their ambivalent sexism when compared to female Christians. The comparison between male Christians and female Christians, in contrast, did not indicate mediation by religiosity or fundamentalism. Thus, in this case the stronger endorsement of benevolent sexism we observed in TE for male participants cannot be explained by religiosity or fundamentalism.

# GENERAL DISCUSSION

In this research, we sought to investigate whether religiosity and fundamentalism as such, rather than specific religious affiliation, would be predictors of ambivalent sexism toward girls and women.

We further aimed to disentangle the interrelations between religion- and sexism-related variables, while taking into account two variables that previous research suggested as potential confounders but had not been included in prior investigations: right-wing authoritarianism and social dominance orientation.

# Religiosity and Fundamentalism

In all three studies, Muslim participants described themselves as more religious than Christian participants, who, in turn, described themselves as more religious than participants without religious affiliation. The same pattern was observed for fundamentalism. Muslims held more fundamentalist religious beliefs than Christians, who held more such beliefs than non-religious participants. These differences between religious groups were independent of participants' gender and replicated the findings of previous research (Heitmeyer et al., 1997; Brettfeld and Wetzels, 2007; Frindte et al., 2011; Huber and Huber, 2012; de Hoon and van Tubergen, 2014; Koopmans, 2015; Gille, 2016).

# Interrelations Between Religious Affiliation, Religion, Fundamentalism, and Ambivalent Sexism

To the best of our knowledge, no previous research has compared the relation between religious affiliation and sexist attitudes on either a national or international level. In both Pilot Studies as well as our Main Study, Muslim participants approved of benevolent sexism toward girls and women more than Christian and non-religious participants, with the latter two groups showing no difference from one another. Regarding hostile sexism, our findings were somewhat inconsistent. While in our first Pilot Study investigating attitudes toward girls the three religious and non-religious groups did not differ from each other in their levels of hostile sexism, in our second Pilot Study Muslims approved of hostile attitudes toward women more strongly than the other two groups, and in our Main Study male Muslims endorsed hostile attitudes toward girls more strongly than female Muslims, male Christians and female Christians, who did not differ from one another. We do not know whether the findings of our Main Study would have been replicated if we had reached a larger sample of male Muslims and/or if we had included the potential confounders RWA and SDO in our first Pilot Study.

The stronger ambivalent sexism indicated by our Muslim participants corresponds to the higher levels of benevolent and hostile sexism that previous studies have found in Muslims living in Muslim-majority countries (Taşdemir and Sakallı-Uğurlu, 2010; Glick et al., 2016) compared to Christians living in Christian-majority countries (Glick et al., 2002; Maltby et al., 2010; Mikołajczak and Pietrzak, 2014). Our findings show that, on average, Muslims living in Germany endorse higher levels of ambivalent sexism than the Christian majority group, despite many of them being third- or fourth-generation residents of Germany (for similar findings regarding other dependent measures see for instance Diehl and Koenig, 2009; Stanat et al., 2010; Jacob and Kalter, 2013; Walter, 2014). These findings could possibly indicate that members of the Muslim minority in Germany feel discriminated against, thereby fueling reactive ethnicity and the adoption of acculturation strategies of separation rather than assimilation (cf. Phinney et al., 2006; Verkuyten and Yildiz, 2007).

Results from our moderated mediation analysis suggest that differences in ambivalent sexism between the two religious groups were partly due to religiosity, fundamentalism, RWA, and SDO. More specifically, the religiosity-sexism link reported by previous research was replicated in all of our studies in medium-sized bivariate correlations, with religious participants showing stronger benevolent and hostile sexism toward girls and women. However, when differentiating participants according to religious affiliation and gender in the multigroup multivariate regression analysis in our Main Study, a more complex picture emerged.

As previous research has found fundamentalism (e.g., Banyasz et al., 2016; Harnish et al., 2017) and ambivalent sexism (Sibley et al., 2007) to be correlated with RWA and/or SDO, we included both ideologies in our analyses. While not significant in our small sample of male Christians, we found that RWA was strongly associated with hostile sexism in all four groups, i.e., irrespective of participants' gender and religious affiliation. The strong correlation between RWA and hostile sexism is in line with previous research that has shown RWA to predict prejudice and hostility in a wide range of intergroup relations (e.g., McFarland, 1989; Hunsberger et al., 1999; Nagoshi et al., 2008; Whitley, 2009; Hall et al., 2010; Hill et al., 2010). In addition, participants (with the exception of female Muslims) showed more hostile sexism toward girls the more fundamentalist their religious beliefs were (this prediction was, again, not statistically significant in our small sample of male Christians). Interestingly, once fundamentalism was accounted for, religiosity did not contribute to the prediction of hostile sexism (except for the group of female Muslims who reported more hostility toward girls the more religious they described themselves to be<sup>5</sup>). While there were some differences between subgroups, these findings seem to suggest that fundamentalism was more important for the prediction of hostile sexism than religiosity.

A quite different picture appeared for benevolent sexism toward girls, where RWA proved to be a significant predictor in all groups but male Christians. In our Christian subsamples, no variable aside from RWA contributed to the prediction of benevolent sexism. In contrast, the religion-related variables explained variance in benevolent sexism among our Muslim participants: benevolent sexism increased with religiosity in male Muslims and with fundamentalism in female Muslims.

These findings suggest that approval of traditional values and obedience toward authority figures, as measured by participants' endorsement of RWA (Altemeyer, 1996), predict hostile and benevolent sexist attitudes toward girls irrespective of religious affiliation. There may be a relation between allegiance to traditional values and the approval of the status-quo gender hierarchy as well as between avowal of obedience toward authority figures and approval of female submission to male family members. Interestingly, in our Christian subsamples variations in benevolent sexism were only dependent on RWA, whereas in our Muslim subsamples religiosity and fundamentalism mattered as well. This finding suggests that there are specific contents of the religious teachings of Islam which encourage benevolent sexism and are not fully captured by the approval of conventionalism and authoritarian submission (as measured by RWA).

Results of the moderated mediation analysis conducted in our Main Study suggest that the differing degrees of ambivalent sexism between female Muslims and female Christians were explained by our mediating variables, in particular by female Muslims' stronger fundamentalism. The stronger (as compared to female Christians) benevolent and hostile sexist attitudes that male Muslims indicated having toward girls were partly mediated by the religion-related variables. More specifically, highly religious boys and men approved more strongly of benevolent sexist propositions. Benevolent sexism is supposed to reward girls and women who adhere to their traditional role. It is possible that highly religious boys and men hold particularly traditional views on the female role and are thus also more inclined to see girls and women as "wonderful" and in need of male protection, as stipulated in the conceptualization of benevolent sexism (cf. Glick and Fiske, 1996, 2001). The difference in hostile sexism between male Muslims and female Christians was attributable to male Muslims' stronger fundamentalism. By including RWA and SDO when predicting hostile sexism, we accounted for the potential influence of cognitively rigid thinking (Hill et al., 2010), traditionalism and authoritarianism (Brandt and Reyna, 2014), strivings for dominance, and negative attitudes toward individuals violating in-group norms (Sibley et al., 2007). Our finding that fundamentalism in male Muslims additionally contributed to the prediction of hostile sexism toward girls suggests that fundamentalism captures features other than those attributed to RWA and SDO, features, moreover, that are unique to religion-related ideology. Our mediation analyses for male Muslims indicated a direct effect for both benevolent and hostile sexism, even after religiosity, fundamentalism, RWA, and SDO had been accounted for. Hence, there are factors other than those covered by our model that are responsible for their stronger ambivalent sexism.

<sup>5</sup>As indicated by the bivariate correlation coefficients, hostile sexism was positively related to fundamentalism, religiosity, and RWA in all three studies as well as across all examined subsamples. Once we controlled for all these interrelated variables, the effect of fundamentalism on hostile sexism became negative and the effect of religiosity on hostile sexism became positive in one of our subsamples: female Muslims. We do not wish to interpret this deviation from the pattern of our findings until it has been replicated in future research.

This complex pattern of findings calls for future research examining the association of religion and ambivalent sexism in larger samples from different religious affiliations. In particular, larger samples of male Christians need to be investigated as in both our studies that included data from online surveys (Pilot Study 1, Main Study) this group was clearly underrepresented as compared to Christian girls and women. Possibly, this asymmetry was abetted by the fact that girls and women are overrepresented among Christians in Germany (55% of church members are female), in particular among active church members who, for instance, volunteer in church work (74% girls and women), perform official duties in their local church (1.7% of the female and 0.4% of the male church members), or are employed by the church (in positions other than priests 80% of the employees are women; all statistics from Studienzentrum der Ekd für Genderfragen, 2015). Christian girls and women being more committed to their religion than Christian boys and men could imply that male Christians show a lower willingness to participate in surveys about their values and religion on a voluntary basis than female Christians do. Future studies should also include controls for immigrants' ethnic or cultural background, the number of generations their families have been living in the host country, and their highest completed level of education.

We started by citing anecdotal evidence linked to claims made by the general public that Muslim men were particularly sexist toward girls and women. We then tested whether the higher religious self-identification and stronger endorsement of fundamentalism among Muslim participants in comparison to non-religious and Christian participants offered a more precise explanation of differences in ambivalent sexism than simply belonging to a specific religion, namely Islam. While our studies have provided initial evidence that stronger religiosity and fundamentalism explain some of the variance in ambivalent sexism, these varying levels of religious involvement cannot entirely explain the particularly strong hostile and benevolent sexism we found in male Muslims. Hence, there are factors responsible for their stronger ambivalent sexism other than those investigated in our studies. Glick et al. (2016) have suggested that these factors may include specific contents of Islamic religious teachings. "The Qur'an," they write, "includes verses that seem to offer both subjectively hostile and benevolent justifications for gender hierarchy. On the hostile side, the Qur'an calls for women to submit to men as their inferiors. . . On the subjectively benevolent side, the Qur'an instructs men to protect and provide for women" (p. 546).

Honor beliefs are an additional factor that might explain the stronger ambivalent sexism we found in our Muslim male participants. Muslim-majority nations with the largest numbers of Muslim immigrants living in Germany (Turkey and member states of the Arab League) have been described as "cultures of honor" (Nisbett and Cohen, 1996). This term refers to collectivistic cultures that emphasize the value of social reputation, which is frequently associated with prescribed gender-specific behaviors supportive of male power and female subordination. While men gain honor through strength and aggression, women are recognized for sexual purity and obedience toward male family members (Vandello and Cohen, 2003). Men are expected to defend the honor of their family even if it involves using force or punishing women for disobedient behavior, but they are also expected to provide for women and behave chivalrously toward them. Accordingly, Glick et al. (2016) found honor beliefs in men to correlate particularly strongly with hostile sexism and moderately strongly with benevolent sexism. It is therefore likewise possible that the stronger hostile and benevolent sexism we found in male Muslims after we had controlled for religiosity, fundamentalism, RWA, SDO, and age was due to their stronger honor beliefs.

What can be learned from our findings with respect to the prevention of sexism toward girls and women? Right-wing authoritarianism and religious fundamentalism proved to be strongly correlated with ambivalent sexism irrespective of our participants' religious orientation and contributed to an explanation of the particularly strong hostile sexism toward girls that we have found among our sample of Muslim boys and men in our Main Study. Democratic institutions, such as schools or universities, as well as religious institutions, should strengthen their efforts to diminish the influence of fundamentalist beliefs by teaching the right to freedom of expression and the right to dissent. By promoting tolerance and reasonableness, we can counter the misuse of religion to discriminate against girls and women and promote gender equality in multireligious societies.

# ETHICS STATEMENT

According to our institution's guidelines and national regulations, no ethics approval was required, since our research used anonymous or no-risk tests, surveys, interviews, or observations. In this study no private information (such as name, address, IP address or email address) was collected. Participants' consent was obtained by virtue of survey completion.

# DATASETS ARE AVAILABLE ON REQUEST

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

# AUTHOR CONTRIBUTIONS

BH provided the initial idea for the study. BH, LZ, and JG contributed to the conception and design of the study. JG and LZ organized, conducted the data collection, and organized the database. MS performed the statistical analysis. BH wrote the first draft of the manuscript. BH and MS wrote the manuscript's method section. All authors contributed to manuscript revision, read and approved the submitted version.

# FUNDING

We acknowledge support by the German Research Foundation (HA2381-11-1; HA2381-11-2) and the Open Access Publication Fund of the Freie Universität Berlin.

# ACKNOWLEDGMENTS

We would like to thank Derya Akyol, Monica Albornoz, Marty Altmann, Sandro Andric, Alina Gomez de Löhn, Jannika Haase, Elisabeth Höhne, Camille Larkins, Vivien Prüter, and Olivia Sidoti for their help with data collection and organization.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02399/full#supplementary-material

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Hannover, Gubernath, Schultze and Zander. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The (Not So) Changing Man: Dynamic Gender Stereotypes in Sweden

Marie Gustafsson Sendén<sup>1,2</sup>\*, Amanda Klysing<sup>3</sup>, Anna Lindqvist<sup>3</sup> and Emma Aurora Renström<sup>4</sup>

<sup>1</sup> Department of Social Sciences, Södertörn University, Huddinge, Sweden, <sup>2</sup> Department of Psychology, Stockholm University, Stockholm, Sweden, <sup>3</sup> Department of Psychology, Lund University, Lund, Sweden, <sup>4</sup> Department of Psychology, Gothenburg University, Gothenburg, Sweden

According to Social Role Theory, gender stereotypes are dynamic constructs influenced by actual and perceived changes in what roles women and men occupy (Wood and Eagly, 2011). Sweden is ranked as one of the most egalitarian countries in the world, with a strong national equality discourse and a relatively high number of men engaging in traditionally communal roles such as parenting and domestic tasks. This would imply a perceived change toward higher communion among men. Therefore, we investigated the dynamics of gender stereotype content in Sweden with a primary interest in the male stereotype and perceptions of gender equality. In Study 1, participants (N = 323) estimated descriptive stereotype content of women and men in Sweden in the past, present, or future. They also estimated gender distribution in occupations and domestic roles for each time-point. Results showed that the female stereotype increased in agentic traits from the past to the present, whereas the male stereotype showed no change in either agentic or communal traits. Furthermore, participants estimated no change in gender stereotypes for the future, and they overestimated how often women and men occupy gender non-traditional roles at present. In Study 2, we controlled for participants' actual knowledge about role change by either describing women's increased responsibilities on the job market, or men's increased responsibility at home (or provided no description). Participants (N = 648) were randomized to the three different conditions. Overall, women were perceived to increase in agentic traits, and this change was mediated by perceptions of social role occupation. Men were not perceived to increase in communion but decreased in agency when change focused on women's increased participation in the labor market. These results indicate that role change among women also influences perceptions of the male stereotype. Altogether, the results indicate that social roles might have stronger influence on perceptions of agency than perceptions of communion, and that communion could be harder to incorporate in the male stereotype.

Keywords: social role theory, gender stereotypes, femininity, masculinity, agency, communion, division of labor

# INTRODUCTION

'Signs of gender equality are evident everywhere, from men taking their toddlers to preschool in pushchairs every morning to women rising the ranks in traditionally male-dominated industries' (The Local, 2018).

This quote describes Sweden as an egalitarian country where men are seen in caretaking roles whereas women are seen in typically agentic roles. In fact, Sweden's national representation and national brand include gender equality as a fundamental part (Towns, 2002; Jezierska and Towns, 2018). Sweden frequently positions itself and is positioned both nationally and internationally as world-leading when it comes to gender equality (Towns, 2002). Following social role theory (Eagly and Steffen, 1984; Wood and Eagly, 2011), a result of such perceptions of gender equality in labor division should be that differences in gender stereotype content would decrease.

#### Edited by:

Mario Weick, University of Kent, United Kingdom

#### Reviewed by:

Amanda Diekman, Indiana University Bloomington, United States; Linda Carli, Wellesley College, United States

#### \*Correspondence:

Marie Gustafsson Sendén marie.gustafsson@sh.se; mgu@psychology.su.se

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

Received: 30 April 2018 Accepted: 08 January 2019 Published: 30 January 2019

#### Citation:

Gustafsson Sendén M, Klysing A, Lindqvist A and Renström EA (2019) The (Not So) Changing Man: Dynamic Gender Stereotypes in Sweden. Front. Psychol. 10:37. doi: 10.3389/fpsyg.2019.00037

Agency and communion represent core dimensions of gender stereotype content, where agency is associated with masculine characteristics and communion with feminine characteristics (Abele and Wojciszke, 2014). Agency refers to traits such as independent, assertive and dominant, whereas communion refers to traits such as relationship-oriented, empathic and caring. Social role theory posits that this division in gender stereotype content is based on observations (in media or in daily life) of women and men in different roles; a division of labor stemming from women's and men's differing physical capabilities for child rearing versus labor requiring physical strength (Koenig and Eagly, 2014). When women and men occupy and perform tasks in work and family life, personality traits are derived from behaviors, as described by correspondent inference theory (Gilbert and Malone, 1995). Thus, women are perceived as nurturing and kind because they occupy the majority of caretaking roles (both at home and in the labor market) whereas men are perceived as independent and assertive because they occupy the majority of managerial positions and jobs with higher status (Cejka and Eagly, 1999; Eagly and Wood, 2011). Inspections of job characteristics based on O\*NET research by the United States Bureau of Labor Statistics also found positive relationships between communal traits and roles primarily occupied by women, as well as between agentic traits and roles primarily occupied by men (Levanon and Grusky, 2016; Cortes and Pan, 2017). When groups of women or men enter non-traditional roles (i.e., roles requiring characteristics which are not stereotypically associated with that specific gender), social perceivers infer a corresponding shift in personality characteristics to accommodate the new role demands. Evidence of this is that gender stereotypes have been shown to be influenced by perceptions about past, present, and future divisions of labor (Diekman et al., 2005).

So far, the literature on dynamic stereotypes has consistently shown that participants perceive the typical woman of today as more agentic, i.e., having more characteristics associated with masculinity, than the typical woman of previous times (Diekman and Eagly, 2000; Diekman et al., 2005; Wilde and Diekman, 2005; Diekman and Goodfriend, 2006; Garcia-Retamero et al., 2011; Bosak et al., 2017). The perceived change in agency has been quite linear in that masculine characteristics were both perceived to be lower in the past and higher in the future. The shift toward higher perceived agency has been explained by women's increased participation in the labor market in agentically demanding roles. Accordingly, perceived distribution of women and men in nontraditional roles has been identified as a mediator for perceived changes in gender stereotype content in several studies (Diekman and Eagly, 2000; Diekman et al., 2005; Bosak et al., 2017). Evidence for change in perception of men, in contrast, is not as conclusive. In studies from the United States (Diekman and Eagly, 2000; Diekman et al., 2005) and Germany (Wilde and Diekman, 2005), the perception of men showed no change; in Chile and Brazil (Diekman et al., 2005), masculinity was perceived to increase also in men; and in Ghana (Bosak et al., 2017) and Spain (Garcia-Retamero et al., 2011), men were perceived to increase in communality. When results indicated a shift in the perception of men, this was less often mediated by perceived distributions of women and men in non-traditional roles (Bosak et al., 2017).

Furthermore, self-reported data among women and men documented stronger shifts related to social roles in agency than in communion (Moskowitz et al., 1994). Diekman and Schneider (2010) consider the interactions between broad gender roles and specific roles, and how they might explain change. For example, if women still do more of the household work that is associated with caregiving, or if they perform more communal tasks at work, they should not be perceived to decrease in communion. Similarly, if men do not work in professions which require communal skills, or enact family roles that are less associated with caregiving, men might not be perceived as acquiring communion only by taking more parental leave.

To our knowledge, past research on dynamic stereotypes has not discussed whether there might be differences in how malleable agentic and communal traits are. For example, perceived gender differences in nurturing are to a greater extent attributed to biological causes than gender differences in math ability (Cole et al., 2007), and motherhood is more strongly related to biology than fatherhood (McPherson et al., 2018). It is therefore possible that communion is seen as a part of a female "essence", meaning that communal traits may be harder to gain for those not belonging to the category "woman." However, a recent United States study on the substereotypes of mothers and fathers did find that social perceivers estimated an increase of stereotypical maternal traits in fathers over time, due to fathers being perceived as taking on more maternal tasks (Banchefsky and Park, 2016), meaning that communal traits can be included in the stereotype of at least fathers. In comparison to the United States, parental leave is longer in Sweden, and there are special benefits resulting from policies directed toward the non-birth parent. Since these policies have been marketed as an effort to increase parental leave among fathers, paternal roles may be more salient and have higher status in Swedish society than in other countries. The question addressed in the current study is whether changes in parental care among men extend to the general stereotype of men in Sweden, leading to increased perceived communion among Swedish men. Such a shift would occur especially if people see men as more involved in parental care, and if they enact parental roles in the same way as women do (Diekman and Schneider, 2010).

Masculinity, in contrast to femininity, has been described as transient, precarious and something that men continuously need to perform (Bosson et al., 2013). Masculinity is also associated with higher status than femininity (Connell and Messerschmidt, 2005; Rudman et al., 2012), indicating that women might benefit from displaying agency, which comprises some of the trait characteristics of masculinity. Women's self-ratings of agency (Twenge, 1997) and ratings of women in general (Diekman et al., 2005) have indeed gained in agency over time. However, although women with agentic traits are perceived to be equally competent as men, they may still face social penalties such as being less likeable or hirable compared to men with similar agentic traits and behaviors (Brescoll and Uhlmann, 2008; Rudman et al., 2012; Williams and Tiedens, 2016). Thus, descriptive stereotypes about women might include more agency today than in the past, but prescriptive stereotypes would still require women to avoid excessive agency (Rudman et al., 2012).

Social role theory also acknowledges that contextual factors, such as cultural values, impact inferences from observed role occupation to stereotype content. Cross-cultural research has shown that the male stereotype aligns with the core values of a culture, such that individuals from collectivist cultures rated men as more communal than women, whereas individuals from individualistic cultures rated men as more individualistic than women (Cuddy et al., 2015). Furthermore, research on cultural values has shown that Sweden is rated as individualistic rather than collectivistic (Hofstede, 2001), suggesting that the male stereotype in Sweden would be viewed as containing fewer communal qualities than the female stereotype. However, Sweden is also rated as one of the most feminine countries in the world, meaning that values such as relationships and quality of life are more important than money, objects and work. This indicates that communal roles among men, such as child-rearing, should be valued more highly and have higher status in Sweden than in many other countries. In sum, Sweden represents an interesting country in which to investigate whether changes in social role occupation can influence the content of both the female and male gender stereotype, because of its strong identification as a gender egalitarian nation coupled with the presence of individualistic cultural values.

# Sweden and Gender Equality

Sweden does not only have a self-image of being gender equal. In international comparisons Sweden is a highly egalitarian country, being ranked as number five on the Global Gender Gap Index (World Economic Forum, 2017) and as number eight in the Global Leadership and Organizational Behavior Effectiveness (GLOBE) study (Warner, 2012). An international comparison of the parental leave system (Ray et al., 2010) showed that Sweden ranked among the most egalitarian countries when it comes to parental leave among fathers. This ranking can be explained by continuous changes in the social insurance system for parents. The first insurance related to parental leave was introduced in 1954 and referred to as "motherhood insurance." The implementation of the insurance was intended to motivate families to have more children. At this time, about one third of the women had entered the labor market. The motherhood insurance, together with other family laws, made their position on the labor market less vulnerable (Central Bureau of Statistics [SCB], 1953). In 1974, the term for the insurance changed from "motherhood insurance" to "parental insurance." At that time, only 0.5% of fathers took any parental leave. Later reforms partly individualized the parental insurance (Warner, 2012), and the first individual month (often called "daddy month" because it was aimed toward fathers taking on more parental leave) was introduced in 1995. The second "daddy month" was introduced in 2002, and the third in 2016 (Statistics Sweden, 2016). Since 1995, fathers have steadily increased their uptake of parental leave. In 2012, fathers took an average of 56 days of leave per child (the average for mothers was 284 days per child), and 23% stayed at home for more than 3 months (96% of mothers stayed at home for more than 3 months; Inspektionen för socialförsäkringen [ISF], 2012).

Equality in the labor market also has a long history of government interventions. In 1950, 23% of women were active in the labor market, in comparison with 65% of men (Central Bureau of Statistics [SCB], 1953). From the 1950s, women's activity in the labor market increased. In 1980, an office of equal opportunities was established as an independent government authority under the Ministry of Labor. The main purpose was to prevent and act against gender discrimination in the labor market and an anti-discrimination law focused on gender equality was enacted at the same time (Law, 1979:1118). Since then, the anti-discrimination law has been expanded and now includes seven grounds for discrimination: gender, transgender identity or expression, ethnicity, religion or other belief, disability, sexual orientation, and age (DA, 2008:567). Even though Sweden has a high and gender-balanced work force participation from an international perspective, there is still a high degree of gender segregation in terms of actual occupations. In 2016, the work force participation was 84% among women and 89% among men (Statistics Sweden, 2016). In families with children, 82% of the women and 92% of the men worked. More women (29%) than men (11%) worked part time, although this gender difference has decreased over time. In 2005, 45% of the women worked part time, whereas only 6% of the men did. Concerning gender division of labor, only 15–20% of employees work in jobs or industries with an equal gender distribution. In 2010, the Duncan's D index for occupational segregation (Duncan and Duncan, 1955) indicated that 54% of the Swedish workforce would have to exchange occupations for gender parity to be reached (Halldén, 2014). Among women, 70% work in female-dominated occupations (e.g., nurses, teachers, and receptionists) and among men, 67% work in male-dominated occupations (e.g., drivers, construction workers, managers; Warner, 2012). Furthermore, the vertical segregation between women and men is larger in Sweden than in many other European countries (Ellingsæter, 2014). Women leaders are common in the public sector and in politics (50%), fewer in the private sector (30%), and very few among stock-listed companies (CEOs = 5%; Statistics Sweden, 2016).
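
To make the segregation index mentioned above concrete, the following Python sketch computes the index of dissimilarity as half the sum of the absolute differences between women's and men's occupational shares. The function name `duncan_d` and the occupation counts are hypothetical and chosen only for illustration; they are not the Swedish data behind the figure reported by Halldén (2014).

```python
def duncan_d(women, men):
    """Index of dissimilarity: D = 0.5 * sum(|w_i / W - m_i / M|).

    `women` and `men` are counts per occupation, in the same order.
    """
    if len(women) != len(men):
        raise ValueError("both lists must cover the same occupations")
    total_w, total_m = sum(women), sum(men)
    if total_w <= 0 or total_m <= 0:
        raise ValueError("each group needs at least one person")
    return 0.5 * sum(abs(w / total_w - m / total_m) for w, m in zip(women, men))


# Hypothetical counts for three occupations (illustration only).
women_counts = [80, 15, 5]   # e.g., care work, administration, construction
men_counts = [10, 20, 70]

d = duncan_d(women_counts, men_counts)
print(round(d, 2))
```

In this made-up example D is about 0.70. A value of 0 would mean that women and men are distributed identically across occupations, and a value of 1 would mean complete segregation; D is commonly read as the share of one gender that would have to change occupation for the two distributions to match.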

Applying social role theory to a Swedish context makes a few issues visible. For example, although the labor force participation of women is high, possibly leading to higher ratings of agency of women, women are still primarily working in occupations which require a high degree of communion (Cejka and Eagly, 1999). Furthermore, although Swedish men nowadays take more parental leave than men elsewhere (Ray et al., 2010), they do not take as much leave as Swedish women do, nor have they entered into communally demanding occupations (Statistics Sweden, 2016).

In Sweden, compared to other countries, another complicating factor might be a mismatch between the high ranking on gender equality scales (Warner, 2012), the discourse reported in the media (Towns, 2002), and actual gender labor division (Statistics Sweden, 2016). The discourse and gender equality rankings might lead to the notion that sufficient gender equality has been reached, making future change both unnecessary and impossible and therefore not expected. It might also lead to overestimations of women's and men's non-traditional role performance. If mismatches in perceptions from different sources influence stereotype content and gender distributions in social roles, this indicates a missing piece of the puzzle between social perceivers' observations and gender stereotype content. It is therefore important to determine whether stereotype content derives from estimates of women and men in different roles based on actual observation of role occupation, which would indicate that there is still room for improvement in the future, or from general perceptions of gender equality, which would indicate that gender equality is seen as already reached.

# Overview of the Current Research

To investigate how social change in Sweden influences perceptions of women and men of the past, present, and future, we asked participants to rate an average Swedish woman or man of these three time points. This design aligns with the social role theory paradigm previously used to examine dynamic stereotypes (Diekman et al., 2005; Diekman and Goodfriend, 2006). In Study 1, stereotype content was measured for women and men at all three time points. Because of the strong gender equality discourse in Sweden, we expected that participants would indicate a change in traits from the past to the present but not from the present to the future. We expected a change in agentic traits for women and a change in communal traits for men.

We also tested whether participants' estimates of labor distribution align with official statistics at present time, and if these estimates can explain the changes in stereotype contents. As in previous studies (e.g., Diekman and Eagly, 2000; Wilde and Diekman, 2005; Bosak et al., 2017), perceived non-traditionalism, i.e., the number of individuals in gender counterstereotypical social roles, is tested as a mediator for change in gender stereotype content over time. More specifically, the perception of women's higher agency should be mediated by non-traditionalism in male-dominated roles, whereas the perception of men's higher communion should be mediated by non-traditionalism in female-dominated roles. This moderated mediation is suggested because an increase in counterstereotypical roles should be associated with an increase of the characteristics associated with those roles among the gender that is perceived to change; but not among the gender that is not perceived to change. As shown in past studies, mediation effects might be stronger for agency than communion (Bosak et al., 2017). This division into agentic and communal non-traditionalism provides a direct test of the social role theory hypothesis that characteristics associated with specific roles increase corresponding characteristics in those performing the roles.

In Study 2, we investigated if controlling for the participants' knowledge of objective change in women's or men's roles influenced perceptions of non-traditionalism in occupational and domestic roles as well as stereotype content. We presented participants with information regarding the actual change from the past to the present in social role occupation, either focusing on role change for women or men. By this design, we directly compare if changes in communal and agentic tasks lead to similar perceived changes in stereotype content from the past to the present.

Both studies were carried out in accordance with national guidelines on ethical research (Swedish Research Council, 2017). This means that participants were informed about their voluntary and anonymous contribution, and that they could quit the survey whenever they wanted without giving any reasons for quitting. They were also informed that results would be presented on aggregated levels with no possibility to extract any personal information. After this information, participants gave their informed consent and were electronically forwarded to the questionnaire. After answering the questionnaire, participants actively submitted their responses. A formal ethical approval is not mandatory for this type of research because it did not include any biodata nor did it intend to affect the participants physically or psychologically. It also did not entail any handling of sensitive data as described in the Swedish law about personal data.

# STUDY 1

The main purpose of Study 1 was to provide results directly comparable to previous research on dynamic gender stereotype content. Hence, Study 1 used the same design as has been used in previous research from other countries (e.g., Diekman and Eagly, 2000; Wilde and Diekman, 2005; Bosak et al., 2017) to establish the content of gender stereotypes of women and men of the past, present, and the future.

# Materials and Methods

#### Participants and Design

Participants (N = 399, Mage = 48.87, SDage = 18.01) were recruited from an existing web panel consisting of 67,000 individuals. Stratification was performed on the web panel participants based on gender, age (in 10-year intervals) and geographic region and participants were randomly selected for participation within quotas that were representative of the Swedish population. Of the 399 participants starting the survey, 323 participants completed it (response rate = 80.95%). Participants indicated their gender with a free text response (women = 51.39%, men = 45.55%, non-binary = 2.17%, did not indicate gender = 0.93%).

The design was a 2 (target gender) × 3 (time) between-subjects factorial design with personality, cognitive and physical characteristics as outcome measures. Participants were randomized to conditions in which they evaluated either a woman or a man in the past (year 1950), the present (year 2017), or the future (year 2090). Because the current analysis plan includes a somewhat large amount of multiple testing, we calculated the false discovery rate (FDR) for Study 1 ad hoc. We made the decision to use FDR instead of a more conservative alpha correction in order to retain statistical power and give an intuitively informative coefficient (Benjamini and Hochberg, 1995). The FDR was calculated with the Benjamini–Hochberg method using the sgof package version 2.3 (Castro-Conde and de Uña-Álvarez, 2014) in R version 3.5.1. The total FDR for Study 1 was 1.92%, which suggests that the overall risk of falsely rejecting the null hypothesis was under 5%.
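The Benjamini–Hochberg step-up rule referred to above can be illustrated with a minimal Python sketch. This is not the sgof implementation used by the authors and it does not reproduce the reported FDR estimate; the function name `benjamini_hochberg` and the p-values are hypothetical and serve only to show how the rule sorts p-values and compares each to a rank-dependent threshold.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean array marking hypotheses rejected at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices that sort p ascending
    ranked = p[order]
    thresholds = q * np.arange(1, m + 1) / m    # q * rank / m for each rank
    below = ranked <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])        # largest rank meeting the criterion
        rejected[order[: k + 1]] = True         # reject all hypotheses up to that rank
    return rejected

# Hypothetical p-values, not taken from the study
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(example_p, q=0.05))
```

With these illustrative values only the two smallest p-values fall below their thresholds, so only those two hypotheses are rejected.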

# Measures

#### Perceived Role Non-traditionalism in Agentic and Communal Roles

Participants estimated the percentage of the counterstereotypic gender within either traditionally female- or male-dominated occupational and domestic roles (e.g., Diekman and Eagly, 2000; Steinmetz et al., 2014). The occupations were selected from official Swedish labor statistics (Statistics Sweden, 2016), had a minimum of 75% gender homogeneity, and were expected to be well known to the public. The domestic roles were based on official statistics regarding time spent on household tasks in Sweden (Statistics Sweden, 2012), and on items used in previous studies on social role theory (e.g., Diekman and Eagly, 2000). Agentic non-traditionalism (α = 0.87) included estimates of women in male-dominated occupations (car mechanic, pilot, civil engineer, and stock broker) and domestic tasks typically performed by men (car repairs, paying household bills, changing light bulbs, solving technology problems, and doing home repairs). Communal non-traditionalism (α = 0.85) included estimates of men in female-dominated occupations (pre-school teacher, receptionist, and nurse) and domestic tasks typically performed by women (doing the laundry, cooking, cleaning, playing with children, assisting children with homework, taking care of sick children, and caring for children's appearance)<sup>1</sup>.

#### Gender Stereotype Dimensions<sup>2</sup>

Participants evaluated 30 characteristics representing traits that are typically associated with femininity or masculinity (Cejka and Eagly, 1999). Both positive and negative items were used (Diekman and Eagly, 2000). Because these characteristics were chosen to include both positive and negative characteristics associated more strongly with either women or men, rather than communion and agency more broadly, we will use the terms femininity and masculinity, even though the positive femininity and masculinity subscales do correspond to the constructs of communion and agency, respectively. Each characteristic was evaluated on a scale from 1 (not at all likely) to 7 (very likely). Internal reliabilities for final scales were<sup>3</sup>: positive masculinity (α = 0.76), negative masculinity (α = 0.91), positive femininity (α = 0.89), and negative femininity (α = 0.74). See **Appendix A**, **Table A1** for all items used in the scales and Swedish wording.
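Assuming that α denotes Cronbach's alpha, as is conventional for internal reliability, the coefficient can be computed from a respondents-by-items matrix as sketched below. The function name `cronbach_alpha` and the ratings are hypothetical; the sketch only illustrates the formula α = k/(k − 1) · (1 − Σ item variances / variance of the summed scale).

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a 2-D array with rows = respondents, columns = items."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]                               # number of items in the scale
    item_variances = x.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)   # variance of the summed scale score
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Hypothetical 1-7 likelihood ratings from five respondents on four items
ratings = [
    [5, 6, 5, 6],
    [3, 3, 4, 2],
    [6, 7, 6, 6],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(ratings), 2))
```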

# Results

Because past research has shown strongest results for personality characteristics, and because cognitive and physical characteristics showed very few significant differences, we chose to streamline this paper and focus on personality characteristics. Results for the cognitive and physical dimensions can be found in **Appendix B**, **Tables B1**, **B2**. Mediation analyses for these dimensions can be found in **Appendix C**.

Analyses of variance (ANOVAs) are reported for each dependent variable separately. The presence of moderated mediation was determined using an index of moderated mediation (Hayes, 2015). Throughout this article, p-values of 0.05 or less are considered as significant. Because participants' gender did not interact with stereotype content in any consistent pattern, these analyses are omitted.

#### Perceived Role Non-traditionalism

To test participants' perceptions of agentic and communal non-traditionalism over time, we conducted a 3 (year) × 2 (agentic/communal non-traditionalism) mixed ANOVA with agentic and communal non-traditionalism as within-subjects factors and year as between-subjects factor. A significant main effect of time, F(2,313) = 194.34, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.55, revealed that non-traditionalism increased from the past to the future (p < 0.001), whereas the present did not differ from the future (p = 0.80; see **Table 1**). There was neither a significant effect of type of non-traditionalism, F(1,313) = 0.83, p = 0.36, η<sub>p</sub><sup>2</sup> < 0.01, nor a significant interaction with time, F(2,313) = 1.32, p = 0.27, η<sub>p</sub><sup>2</sup> = 0.01. Thus, Swedish participants believed that the past was more traditional in terms of gendered division of labor than the present time. They also estimated similar changes for communal and agentic non-traditionalism from the past to the present.
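For readers who want to relate the reported effect sizes to the test statistics, partial eta squared can be recovered from an F value and its degrees of freedom through the standard identity η<sub>p</sub><sup>2</sup> = (F · df<sub>effect</sub>) / (F · df<sub>effect</sub> + df<sub>error</sub>). The short Python sketch below applies it to the main effect of time reported above; the function name is illustrative and not part of the authors' analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Main effect of time on non-traditionalism: F(2,313) = 194.34
print(round(partial_eta_squared(194.34, 2, 313), 2))  # 0.55
```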

However, participants did not expect any further change in the future. This could be explained by an overestimation of non-traditionalism at present times. Participants estimated higher non-traditionalism than actual distributions in all gender-typical occupations (see **Table 2**). In **Table 3**, we also present percentages on estimated division of domestic duties, although these data cannot be compared to any official statistics.

Ratings of role non-traditionalism were made through estimating percentages (0–100) of women and men occupying social roles. Means with different subscripts across time points differ significantly at p < 0.05.

<sup>1</sup>Four professions with an equal gender distribution were added as filler items: physician, retail salesperson, journalist, and university teacher.

<sup>2</sup>Participants rated the characteristics regarding both how likely the target was to possess them (descriptive beliefs) and how beneficial/harmful the characteristics would be for the target to possess (prescriptive beliefs) (Diekman and Goodfriend, 2006). Results for the two were aligned and therefore only results for descriptive stereotype content are presented; results for the prescriptive analysis can be found in **Appendix D**.

<sup>3</sup>The following items were changed: The item "aggressive" was moved to the masculine negative personality scale due to higher inter-item correlation, r<sub>positive</sub> = 0.39 and r<sub>negative</sub> = 0.73. "Unprincipled" was removed from the masculine scale and "spineless" and "subordinates self to others" were removed from the feminine scale to increase internal reliability.

TABLE 2 | Study 1: Mean estimates of percentages of women and men working in different occupations compared to official labor statistics (Statistics Sweden, 2016).

#### Gender Stereotype Content

Descriptive data for gender stereotypical personality dimensions are presented in **Tables 4**, **5**. Four 2 (Target gender) × 3 (Year) between-subjects ANOVAs were computed to test the effect of time and target gender on gender stereotypical characteristics; p-values for all pairwise comparisons were corrected using Tukey's HSD with a family-wise error rate of 0.05. The personality dimensions were (1) positive femininity, (2) negative femininity, (3) positive masculinity, and (4) negative masculinity.

TABLE 4 | Study 1: Means and standard deviations for masculine personality, over time and target gender.


Within each target gender, means with different subscripts (a,b) differ significantly at p < 0.05 between time points. Within each time point, means with different subscripts (1,2) differ significantly at p < 0.05 between women and men. Ratings were on a 7-point scale on which higher numbers indicate greater likelihood of possessing the characteristics.

There was a difference between women and men in stereotype congruent directions for three personality dimensions: positive femininity and positive and negative masculinity (p's < 0.05); whereas women and men did not differ in negative femininity (p = 0.54). Three personality dimensions (negative femininity, positive and negative masculinity) were believed to increase over time (p's < 0.05), whereas positive femininity did not differ across time points (p = 0.25). The main effects of target gender were qualified by interactions with time for masculinity, but not for positive femininity.

#### Masculinity

The interaction of Target Gender × Year was significant for positive masculinity, F(2,317) = 4.41, p = 0.01, η<sub>p</sub><sup>2</sup> = 0.03, and marginally significant for negative masculinity, F(2,317) = 2.90, p = 0.06, η<sub>p</sub><sup>2</sup> = 0.02. For positive masculinity, pairwise comparisons showed an increase for women between the past and the present (p < 0.001), but not between the present and the future (p = 0.86). There was no perceived change for men between the past and the present (p = 0.55) or between the present and the future (p = 1.00). In addition, women and men were rated equally in the present (p = 1.00) and the future (p = 0.98), whereas they were rated as differing in the past (p < 0.01). For negative masculinity, both women and men increased from the past to the present (p's < 0.05) whereas there was no perceived change from the present to the future (p's > 0.81). A simple effects analysis showed a larger increase for women, F(2,317) = 21.42, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.19, than men, F(2,317) = 6.10, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.04. In addition, women and men were rated equally on negative masculinity in the present time (p = 0.70), and the future (p = 0.07), but differing in the past (p < 0.001).

TABLE 3 | Study 1: Estimated percentage of household tasks performed by the woman in a heterosexual household with children by year.

Ratings of performance of household tasks were made through estimating percentage of task performed by either the woman or the man in a heterosexual relationship: 0 = task only performed by man, 100 = task only performed by woman.

TABLE 5 | Study 1: Means and standard deviations for feminine personality, over time and target gender.

Within each target gender, means with different subscripts (a,b) differ significantly at p < 0.05 between time points. Within each time point, means with different subscripts (1,2) differ significantly at p < 0.05 between women and men. Ratings were on a 7-point scale on which higher numbers indicate greater likelihood of possessing the characteristics.

#### Femininity

We expected a perceived change in feminine traits among men but not women, and although a significant interaction for positive femininity was found, F(2,317) = 3.71, p = 0.03, η<sub>p</sub><sup>2</sup> = 0.02, pairwise comparisons showed no significant perceived change among men from the past to the present (p = 0.99) or from the present to the future (p = 1.00), nor among women (p = 0.09 and 1.00, respectively). Across all time points, women were rated higher on femininity than men (p < 0.01). For negative femininity, there was no significant interaction, F(2,317) = 0.98, p = 0.38, η<sub>p</sub><sup>2</sup> = 0.01.

Thus, for masculinity the ratings aligned with expectations, but not for femininity. Swedish participants also believed that women and men have equal degrees of masculinity in present time, and that women are still more feminine than men. See **Figure 1** for a visualization of perceived stereotype content change over time.

#### Correspondence Between Roles and Gender Stereotypes

We used moderated mediation analyses to test whether perceived changes in gender stereotype content over time were mediated by increased non-traditionalism in corresponding social roles, and whether this mediation was moderated by gender. More specifically, an increase of women in male-dominated roles (agentic non-traditionalism) should lead to an increase in women's perceived masculinity but not a decrease in men's perceived masculinity, whereas an increase of men in female-dominated roles (communal non-traditionalism) would lead to an increase in men's perceived femininity but not a decrease in women's perceived femininity. To control for an effect of perceived general change over time, the direct effect of year moderated by target gender was also included in the statistical model.

The moderated mediation models were tested for all stereotype dimensions independent of direct effects of time. The decision to conduct mediation analysis on estimates of femininity despite an absence of a total effect of time was made due to the possibility of a completely indirect effect (for a discussion on completely indirect effects, see Hayes, 2009). The SPSS macro PROCESS, v. 3.00, model 15 (Hayes, 2018) was used to perform the moderated mediation analyses with 95% confidence intervals calculated using a percentile bootstrap approach with 10 000 bootstrap samples. Percentile bootstrapping was chosen because it has been shown to retain the increased power for testing mediation that bootstrapping methods provide, while having only a slightly elevated Type I error rate compared to the inflation of the Type I error rate that comparable bootstrapping methods entail (Fritz et al., 2012). The presence of a moderated mediation effect was determined using an index of moderated mediation (Hayes, 2015). Target gender was dummy coded (0 = "woman," 1 = "man"), and time was contrast coded such that a one-unit increase represents a perceived change from past (−1) to present (0), and from present to future (1), since the time distance between the conditions was ordered and roughly equivalent. See **Figure 2** for a visualization of the statistical model.
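The logic of this analysis can be sketched in Python under stated assumptions. The sketch is not the PROCESS macro: it assumes the structure commonly described for model 15, in which target gender moderates both the mediator-to-outcome path and the direct effect, so that the conditional indirect effect is a(b1 + b3·gender) and the index of moderated mediation is a·b3. All function names and the simulated data are hypothetical, and 2,000 bootstrap samples are used instead of 10,000 to keep the example fast.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X already includes an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def conditional_indirect(time, gender, mediator, outcome):
    """Return (indirect effect for women, indirect effect for men, index of moderated mediation)."""
    ones = np.ones_like(time, dtype=float)
    # a path: mediator regressed on time
    a = ols(np.column_stack([ones, time]), mediator)[1]
    # Outcome model: time, mediator, gender, mediator x gender, time x gender
    X = np.column_stack([ones, time, mediator, gender, mediator * gender, time * gender])
    coef = ols(X, outcome)
    b1, b3 = coef[2], coef[4]
    # gender is dummy coded 0 = woman, 1 = man
    return a * b1, a * (b1 + b3), a * b3

def percentile_bootstrap(time, gender, mediator, outcome, n_boot=2000, seed=0):
    """95% percentile bootstrap limits (rows: lower, upper) for the three quantities."""
    rng = np.random.default_rng(seed)
    n = len(time)
    draws = np.empty((n_boot, 3))
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        draws[i] = conditional_indirect(time[idx], gender[idx], mediator[idx], outcome[idx])
    return np.percentile(draws, [2.5, 97.5], axis=0)

# Simulated data (purely illustrative): the mediator affects the outcome for women only
rng = np.random.default_rng(1)
n = 300
time = rng.integers(-1, 2, size=n).astype(float)    # -1 = past, 0 = present, 1 = future
gender = rng.integers(0, 2, size=n).astype(float)   # 0 = woman, 1 = man
mediator = 30 + 10 * time + rng.normal(0, 5, n)     # perceived non-traditionalism (%)
outcome = 4 + 0.06 * mediator * (1 - gender) + rng.normal(0, 0.5, n)

ci = percentile_bootstrap(time, gender, mediator, outcome)
print("lower limits (women, men, index):", ci[0].round(2))
print("upper limits (women, men, index):", ci[1].round(2))
```

An interval that excludes zero for the index would be read as evidence that the indirect effect differs between target genders, which parallels how the LLCI and ULCI values are interpreted in the text.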

For positive masculinity, the index of moderated mediation showed an indirect effect of time through agentic role non-traditionalism moderated by target gender, indicating that the effect of time was mediated for women targets, b = 0.24, SE = 0.08, LLCI = 0.11, ULCI = 0.42, but not for men targets, b = 0.02, SE = 0.05, LLCI = −0.06, ULCI = 0.12. There was no significant direct effect of time, nor was there a direct effect moderated by target gender. Women's increase in positive masculinity over time was thus completely qualified by their increased numbers in agentically demanding roles over time.

For negative masculinity, the index of moderated mediation showed an indirect effect of time through agentic role non-traditionalism moderated by target gender. The effect of time was mediated for perception of women targets, b = 0.16, SE = 0.07, LLCI = 0.04, ULCI = 0.30, but not for perception of men targets, b = −0.02, SE = 0.06, LLCI = −0.16, ULCI = 0.08. There was also a significant direct effect of time which was not moderated by target gender, b = 0.32, SE = 0.12, LLCI = 0.09, ULCI = 0.55. Women's increase in negative masculinity, but not men's, was partially an effect of increased occupancy of agentically demanding roles; there was also a general increase in these characteristics over time for both women and men independently of social role occupation. See **Table 6** for path coefficients and indexes of moderated mediation for both models.

Furthermore, to test that communal non-traditionalism did not affect masculinity, identical moderated mediation models for non-traditionalism in communal roles were tested on masculinity (see **Appendix C**, **Table C1**). Results showed that perceived changes in communal roles did not have any effect on perceived masculinity, neither direct nor indirect. Consequently, the increase in perceived masculinity among women over time was a result of higher perceived role non-traditionalism in agentic roles.

For femininity, we found no significant mediation of communal nor agentic non-traditionalism, nor an indirect effect conditional on gender (see **Appendix C**, **Tables C2**, **C3** for analysis details); meaning that no support was found for a completely indirect effect of time on femininity through communal non-traditionalism conditional on gender.

FIGURE 1 | … characteristics over time, (D) change in negative feminine characteristics over time. Error bars represent standard errors of the means.

# Discussion

Study 1 showed that Swedish participants perceived gender stereotypes as dynamic constructs from the past to the present but not from the present to the future, at least with regard to the female stereotype. Women were perceived as more agentic today than in the past, whereas perception of men did not differ based on time. Furthermore, Swedish participants did not expect any change in division of labor in the future. The lack of expected future change differs from past studies on dynamic stereotypes and could indicate an opinion among Swedish people that gender equality has already been reached and that no further change is expected or considered necessary. This interpretation is supported by the strong overestimation of gender balance in occupations which are actually strongly gender segregated. That Swedish participants have a "mental image" of Sweden as a more egalitarian country than it is was indicated both by the estimates of non-traditionalism and by the non-existent change in the future. Participants also rated that women and men had converged on positive and negative masculinity, and that they never differed on negative femininity. The only difference in 2017 was on positive femininity, where women were perceived as more communal than men, which could indicate that communal traits are more difficult to gain for men, that changes in men's parental care are too small, or that men enact parental care with less communion than women do.

From the past to the present, Swedish participants believed that both women and men increased their participation in non-traditional roles. However, this role-change only mediated perceptions of women, meaning that the increase in masculinity was explained by an increase of women in agentically demanding social roles. The perception of men, in contrast, did not change from the past to the present, despite a perceived increase of men in social roles requiring communal behavior. Interestingly, participants strongly, but falsely, believed that men have entered female-dominated roles, which would imply a perceived change also in traits. However, no such relationship between men's entry into communal roles and a perceived increase in femininity in the male stereotype was found. This indicates that the mechanisms behind perceived stereotype change might operate differently for femininity/communion and masculinity/agency or for perceptions of women and men. One explanation might be that the communal traits are seen as more essential, whereas agentic traits are seen as more strongly related to behavior. Theories about precarious manhood (Bosson et al., 2013) have shown that masculinity is something that men need to perform and establish over time, whereas femininity is seen as a natural consequence of being born as a woman. Femininity is hence not seen as something that women need to perform to the same degree, but rather as an essential aspect of being a woman. Women can also gain status through increased displays of traits associated with masculinity, whereas avoidance of femininity is important for men's maintenance of masculinity.

By using moderated mediation analyses, we showed that women's perceived increase in masculine traits was specifically associated with a perceived change in women's agentic roles and not associated with any perceived change among men or in communal roles, which is a strong test supporting social role theory. However, the related analyses to test the mediation of men's feminine characteristics by change in men's communal roles were not significant, which indicates that a different mechanism than correspondence inference may be responsible for determining male stereotype content. Similar patterns of full mediation for women, but lack of mediation for men, have been found before (see for example Bosak et al., 2017). Because of the strong "gender equality" discourse in Sweden, we suspect that participants' estimates of division of labor were based more on an "egalitarian bias" than an actual reflection of role change. To control for such effects, we performed a second study where participants were presented with information about the actual changes in the gendered division of labor roles over time.

# STUDY 2

In Study 2, participants were presented with factual descriptions of how gender equality in social role occupation increased in Sweden from the 1970s until today. We framed the role change to focus on either women or men to test whether a focus on women's increase in agentic roles or men's increase in communal roles influenced perceptions of femininity and masculinity, respectively. Following the results in Study 1 showing that femininity might be more difficult to associate with men than masculinity with women, we believed that explicitly presenting how men's participation in domestic and parental tasks has increased over time would lead to an increase in femininity but that a control condition or a condition that describes women's increased participation in the labor market would not.

TABLE 6 | Study 1: Unstandardized regression coefficients (standard errors in parentheses) with confidence intervals for estimating the indirect conditional effect of time on masculine personality through agentic non-traditionalism, moderated by target gender.

<sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01, <sup>∗∗∗</sup>p < 0.001.

# Materials and Methods

#### Participants and Design

Participants were recruited from web forums on social media pages (student forums for recruiting participants to psychological research) and from a student participant pool hosted by Gothenburg University. A total of 676 participants completed the survey; 28 participants were removed from the experimental conditions for failing to answer control questions correctly. The final sample consisted of 648 participants (women = 74.23%, men = 23.92%, non-binary = 1.08%; Mage = 25.58, SDage = 9.72).

We used a 3 (Framing of Role Change: women's increase in agentic roles/men's increase in communal roles/control group) × 2 (Target Gender: women/men) × 2 (Year: 1950/2017) between-subjects factorial design. Participants were randomized to one of the conditions where they read either about women's change in agentic roles, men's change in communal roles or to a control condition, and rated either a typical woman or a typical man of the past (1950) or the present (2017). Since we found no change from the present to the future in Study 1, only the past and the present were included in this study. The false discovery rate (FDR) for Study 2 was calculated in the same way as for Study 1. The total FDR for Study 2 was 2.20% which suggests that the overall risk of falsely rejecting the null hypothesis was under 5%.

# Measurement Instruments

#### Framing of Role Change

Two texts were created which described an actual change in division of labor for women or men and were titled "Women take more responsibility in the labor market" and "Men take more responsibility in the home." The text about women focused on changes in women's participation in the labor market since the mid-1900s (e.g., increasing participation in paid labor and entry into professions previously dominated by men). A graph illustrated the change in employment rate of women and men from 1970 to 2018. The text about men focused on changes in men's participation in unpaid labor since the mid-1900s (e.g., men's increase in parental leave and increased time spent on domestic tasks in heterosexual households). A graph illustrated the percentage of parental leave taken by men and women from 1974 to the present (see Figure A1 in **Appendix A**).

#### Role Non-traditionalism

Perceived role non-traditionalism was estimated as in Study 1<sup>4</sup> : communal non-traditionalism included men's participation in communal occupations and household tasks (α = 0.89), whereas agentic non-traditionalism included women's participation in agentic occupations and household tasks (α = 0.90).

#### Gender Stereotypic Characteristics

The gender stereotypic characteristics scales used in Study 1 were abbreviated in order to avoid participant fatigue that was deemed to be of greater concern in this study, due to the presence of a text for the participants to read. The scales were first constructed to be divided along valence to create a positive and negative scale for both femininity and masculinity. However, the scale for positive masculinity showed very poor reliability; α = 0.58 after trimming of an item with low inter-item correlation. Considering that negative characteristics were included in previous studies on dynamic stereotypes to avoid the risk of confusing stereotype change with social desirability (Diekman and Eagly, 2000), we chose to use measures of combined positive and negative femininity/masculinity; given that regardless of valence the items should be correlated within each gender stereotype. The new, combined scales were made up of eight items for each scale (four positive and four negative items)<sup>5</sup> . Reliability was good for both the femininity scale (α = 0.71) and for the masculinity scale (α = 0.81). Participants responded in terms of how likely on a scale from 1 (not at all likely) to 7 (very likely) a woman/man in 1950/2018 would be to possess these characteristics.

<sup>4</sup>Midwife and social welfare secretary were added to female-dominated professions, and taking care of sick children and paying bills were removed from domestic tasks because they were not associated strongly with either gender.

<sup>5</sup>For scales see **Appendix A**.

# Results

#### Perceived Role Non-traditionalism

To test if the framing of role change influenced perceived role non-traditionalism between times, we performed a 3 (Framing of Role Change) × 2 (Target Gender) × 2 (Year) × 2 (Type of Non-traditionalism) mixed ANOVA with type of role non-traditionalism as a within-subjects factor and framing of role change, year, and target gender as between-subjects factors. Non-traditionalism increased over time, F(1,641) = 310.00, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.33, but none of the other expected effects were significant (p's > 0.05), indicating that participants rated agentic and communal non-traditionalism similarly independent of conditions (see **Table 7** for descriptive data). As in Study 1, participants estimated higher non-traditionalism than actual distributions in all gender-typical occupations (see **Table 8** for descriptive data on estimated gender distribution of occupations compared to official statistics and **Table 9** for descriptive data on estimated gender division of household tasks).

#### Gender Stereotype Content

To test whether the framing of role change influenced perceived characteristics of women and men, two 3 (Framing of Role Change) × 2 (Target Gender) × 2 (Year) ANOVAs were computed.

For femininity, there was a significant main effect of target gender, F(1,633) = 285.89, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.31: Women were perceived as more likely to possess feminine personality characteristics than men. There was no significant main effect of time, F(1,633) = 0.06, p = 0.81, η<sub>p</sub><sup>2</sup> < 0.01, or framing of role change, F(2,633) = 0.21, p = 0.81, η<sub>p</sub><sup>2</sup> < 0.01. A significant interaction effect between time and gender, F(1,633) = 13.18, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.02, indicated that women in 2018 were perceived to have lower levels of femininity than in 1950 (p = 0.04), whereas the perceived change for men was not significant (p = 0.90). Women were still seen as more feminine than men in 2018 (p < 0.001), but the gender gap was smaller than for 1950 (p < 0.001; see **Table 10** for mean values). Finally, different framings of role change did not influence perceptions of women and men differently, since the framing of role change did not interact with time and gender, F(2,633) = 0.76, p = 0.47, η<sub>p</sub><sup>2</sup> = 0.002, meaning that focusing on women's or men's actual change did not differentially influence perceptions of women and men over time with regards to feminine personality.

TABLE 7 | Study 2: Means and standard deviations by year and framing of role change for role non-traditionalism.

Ratings of role non-traditionalism were made through estimating percentages (0–100) of women and men occupying social roles. Means with different subscripts across time points differ significantly at p < 0.05.

TABLE 8 | Study 2: Mean estimates of percentages of women working in different occupations compared to official labor statistics (Statistics Sweden, 2016).

0 = only men in the profession, 100 = only women in the profession.

TABLE 9 | Study 2: Percentage of household tasks performed by the woman in a heterosexual household with children by year.

Ratings of performance of household tasks were made through estimating percentage of task performed by either the woman or the man in a heterosexual relationship: 0 = task only performed by man, 100 = task only performed by woman.

For masculinity, there was a significant main effect of target gender, F(1,633) = 121.98, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.16, a significant main effect of year, F(1,633) = 7.04, p = 0.01, η<sub>p</sub><sup>2</sup> = 0.01, and a significant main effect of framing of role change, F(2,633) = 3.61, p = 0.03, η<sub>p</sub><sup>2</sup> = 0.01. However, framing of role change did not significantly interact with target gender, F(2,633) = 1.80, p = 0.17, η<sub>p</sub><sup>2</sup> = 0.01, or time, F(2,633) = 2.49, p = 0.08, η<sub>p</sub><sup>2</sup> = 0.01. Instead, reading about men's increase in communal roles decreased ratings of masculine characteristics in comparison to the control condition (p < 0.01), but not in comparison to reading about agentic role change (p = 0.27). See **Table 11** for mean values and standard deviations. There was a significant interaction between time and target gender, F(1,633) = 78.49, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.11. Pairwise comparisons showed that women were seen as increasing in masculine characteristics from 1950 to 2018 (p < 0.001), but also that men were seen as decreasing in masculine characteristics from 1950 to 2018 (p < 0.001), leading to the gender gap disappearing for 2018 (p = 0.24; in 1950, men were seen as more masculine than women).

Within each target gender, means with different subscripts (a, b) differ significantly at p < 0.05 between time points. Within each time point, means with different subscripts (1, 2) differ significantly at p < 0.05 between women and men. Ratings were on a 7-point scale on which higher numbers indicate greater likelihood of possessing the characteristics.

TABLE 11 | Study 2: Means and standard deviations for masculine personality by framing of role change, year, and target gender.


Within each target gender, means with different subscripts (a, b) differ significantly at p < 0.05 between time points. Within each time point, means with different subscripts (1, 2) differ significantly at p < 0.05 between women and men. Ratings were on a 7-point scale on which higher numbers indicate greater likelihood of possessing the characteristics.

Finally, we found limited support for the idea that different framings of role change influenced perceptions of women and men differently: there was no significant omnibus interaction between framing of role change, time, and gender, F(2,633) = 0.11, p = 0.90, η<sub>p</sub><sup>2</sup> < 0.001, but pairwise comparisons did show that men's decrease in masculinity was significant only within the condition which described women's increasing occupancy of agentic roles (p = 0.02). See **Figure 3** for a visualization of perceived stereotype content change over time by framing of role change.
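The partial eta squared values reported for masculine personality can be cross-checked from the F statistics and their degrees of freedom. The following is a minimal sketch, assuming the standard identity that partial eta squared equals (F × df for the effect) divided by (F × df for the effect + df for the error); the function name `partial_eta_squared` is ours and is not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# F statistics as reported in the text for masculine personality (error df = 633).
reported = {
    "target gender": (121.98, 1),
    "year": (7.04, 1),
    "time x target gender": (78.49, 1),
}

for effect, (f_value, df_effect) in reported.items():
    print(f"{effect}: {partial_eta_squared(f_value, df_effect, 633):.2f}")
```

Rounded to two decimals, the recomputed values match the reported effect sizes of 0.16, 0.01, and 0.11, which suggests the reported statistics are internally consistent.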

#### Correspondence Between Roles and Gender Stereotypes

Mediation models and analysis methods similar to those in Study 1 were used to test whether role distribution influenced gender stereotype content, with framing of role change added as a covariate. Framing of role change was dummy coded, with the control group as reference, and added as a covariate rather than as a possible moderator since it had not displayed any interaction effects with other variables in previous analyses.

For masculinity, the index of moderated mediation showed an indirect effect of time through agentic role non-traditionalism moderated by target gender, indicating that the indirect effect of time through agentic non-traditionalism differed based on target gender. The effect of time was mediated for perception of women targets, b = 0.28, SE = 0.09, LLCI = 0.14, ULCI = 0.48, but not for perception of men targets, b = −0.10, SE = 0.10, LLCI = −0.31, ULCI = 0.07. There was also a significant direct effect of time on masculinity, which was moderated by target gender: for women targets the effect of time was positive, b = 0.52, SE = 0.11, LLCI = 0.30, ULCI = 0.73, and for men targets the effect of time was negative, b = −0.34, SE = 0.11, LLCI = −0.56, ULCI = −0.12. Women's increase in masculinity was thus partially an effect of increased occupancy of agentically demanding roles, whereas men's decrease in masculinity was not a result of decreased occupancy of agentically demanding roles. There was also a general increase in these characteristics over time for women which was unrelated to social role occupancy. See **Table 12** for path coefficients and index of moderated mediation.

In contrast to Study 1, there was also an indirect effect of time on masculinity through communal non-traditionalism, moderated by target gender. The effect of time was mediated for perception of both women, b = 0.23, SE = 0.10, LLCI = 0.09, ULCI = 0.46, and men targets, b = −0.20, SE = 0.10, LLCI = −0.43, ULCI = −0.03. There was also a significant direct effect of time on masculinity, which was moderated by target gender: for women targets the effect of time was positive, b = 0.57, SE = 0.12, LLCI = 0.34, ULCI = 0.80, and for men targets the effect of time was negative, b = −0.23, SE = 0.12, LLCI = −0.46, ULCI = −0.01. This analysis shows that women's increase in masculinity was partially an effect of decreased occupancy in communally demanding roles, whereas men's decrease in masculinity was partially a result of increased occupancy in communally demanding roles. There was also a general increase in these characteristics over time for women, as well as a decrease for men, which was unrelated to social role occupancy. See **Table 13** for path coefficients and index of moderated mediation.

Taken together, these analyses show that women are seen as increasing in masculinity both when they are increasingly found in agentically demanding roles, and when they decrease their participation in communally demanding roles. On the other hand, perception of men's masculinity is not affected by their participation in agentically demanding roles, but decreases when men are seen as increasingly occupying communal roles.
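The logic of a conditional indirect effect can be made concrete with a small calculation. The following is a minimal sketch, assuming a model in which target gender moderates the path from time to the mediator (role non-traditionalism) while the path from the mediator to the trait rating is constant; the text does not spell out the model equations, and the coefficient values below are hypothetical illustrations, not values from Table 12 or Table 13. Under this assumption, the indirect effect for each target gender is the product of the gender-specific first path and the second path, and with a moderator coded 0/1 the index of moderated mediation equals the difference between the two conditional indirect effects.

```python
from dataclasses import dataclass


@dataclass
class Paths:
    """Hypothetical unstandardized path coefficients (not taken from the paper)."""
    a_time: float          # effect of time on the mediator when gender = 0
    a_interaction: float   # change in that effect when gender = 1
    b_mediator: float      # effect of the mediator on the trait rating


def conditional_indirect_effect(paths: Paths, gender: int) -> float:
    """Indirect effect of time through the mediator for gender coded 0 or 1."""
    return (paths.a_time + paths.a_interaction * gender) * paths.b_mediator


def index_of_moderated_mediation(paths: Paths) -> float:
    """Change in the indirect effect per unit change in the moderator."""
    return paths.a_interaction * paths.b_mediator


# Illustrative values only: 0 = women targets, 1 = men targets (assumed coding).
example = Paths(a_time=0.60, a_interaction=-0.80, b_mediator=0.50)
effect_women = conditional_indirect_effect(example, gender=0)
effect_men = conditional_indirect_effect(example, gender=1)
index = index_of_moderated_mediation(example)

print(f"women: {effect_women:.2f}, men: {effect_men:.2f}, index: {index:.2f}")
```

In the reported analyses, a conditional indirect effect is treated as present when its confidence interval excludes zero, as for women targets (LLCI = 0.14, ULCI = 0.48) but not men targets (LLCI = −0.31, ULCI = 0.07) in the agentic model.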

Contrary to expectations, we found neither mediation by communal or agentic non-traditionalism for femininity nor an indirect effect conditional on gender (see **Appendix D**, **Tables D4**, **D5** for analysis details). Therefore, as in Study 1, time was not found to have an indirect effect on femininity through communal non-traditionalism conditional on gender.

# Discussion

Study 2 presented actual changes in gendered division of labor from the past to the present alongside a control condition to test if participants' estimates of role non-traditionalism and gender stereotype content were affected by a Swedish equality bias. Contrary to expectations, presenting participants with information about actual change regarding social role occupancy did not diminish overestimations of role non-traditionalism compared to the control condition. Instead, participants in Study 2, including the control condition, overestimated the prevalence of women and men in non-traditional roles.

Furthermore, framing of role change had only a limited effect on gender stereotype content. The framing which described

TABLE 12 | Study 2: Unstandardized regression coefficients (standard errors in parentheses) with confidence intervals for estimating the indirect conditional effect of time on masculine personality through agentic role non-traditionalism, moderated by target gender.


<sup>∗</sup>p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.


TABLE 13 | Study 2: Unstandardized regression coefficients (standard errors in parentheses) with confidence intervals for estimating the indirect conditional effect of time on masculine personality through communal non-traditionalism, moderated by target gender.

<sup>∗</sup>p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

men's increase in communal roles did not affect the perceived femininity of men – instead, such a framing decreased overall perceptions of masculinity. However, men's decrease in masculinity from the past to the present was only significant in the condition which described women's increased occupancy of agentic roles, indicating that framing women's increased occupancy of agentic roles led to decreased perceptions of masculinity in men of the present.

Regardless of how we framed role change, there was support specifically for the female stereotype being seen as dynamic. Women were perceived as both more masculine and less feminine today compared to the past. Mediation analyses indicated that the perceived increase for women in masculinity was partially explained by their increased participation in agentically demanding roles, along with a decrease in communal role occupation. However, women's perceived decrease in femininity from the past to the present was not mediated by social role occupancy.

The male stereotype was not subject to an increase in gender-atypical characteristics as the female stereotype was, but men were seen as having fewer masculine characteristics today than in the past. Mediation analyses indicated that men's perceived decrease in masculinity was partly explained by an increased degree of participation in communally demanding roles. This indicates that when men engage in communal roles this may in fact contribute to a loss of masculinity rather than a gain in femininity. Such results are in line with the idea of precarious manhood (Vandello et al., 2008), because there may be differences in the malleability of masculinity and femininity, where masculinity is easier to gain and lose, whereas femininity for men is more difficult to attain but can be lost by women through some mechanism other than social role occupancy.

# GENERAL DISCUSSION

Social role theory explains the origins of gender stereotypes by the observed division of labor (Eagly, 1987). Even though people often consider gender stereotypes as relatively stable characteristics, previous research shows that traits associated with women and men are dynamic and subject to change (Diekman and Eagly, 2000; Diekman et al., 2005; Wilde and Diekman, 2005). Such dynamics are expected from social role theory, as changes in the division of labor should be accompanied by corresponding changes in perceived traits of women and men. Changes in division of labor could occur due to several factors. For example, they might stem from economic factors or from ideological factors, such as national gender equality goals. One example of the latter is campaigns about paternity leave in Sweden, which are grounded in an active political striving to increase men's participation in child-care.

In line with such changes, past research (Diekman et al., 2005; Wilde and Diekman, 2005; Bosak et al., 2017) has consistently shown that the female stereotype is seen as more agentic over time, whereas results about perceived change in the male stereotype are mixed. This may be due to contextual variations regarding to what extent men have entered communal roles. In order to test the assumption that men's engagement in communal roles affects the male stereotype, the present research tested whether the male stereotype has changed in line with predictions from social role theory in one of the world's most egalitarian countries – Sweden. Based on several indicators of high gender equality and one of the most beneficial parental leave systems for fathers, we assumed that the male stereotype should include more communion in a Swedish sample as compared to most other countries.

In two studies, we tested if Swedish participants believed that characteristics of women and men changed from the present to the future (Study 1), and from the past to the present (Studies 1 and 2). The results of both studies showed that the content of the female stereotype increased in masculinity. Furthermore, the female and male stereotype converged on masculinity for evaluation of a target in the present. Thus, at the present time, Swedish women and men are seen as equally masculine. This result supports the notion of Sweden as being one of the most egalitarian countries in the world. Moreover, the result for

stereotypes in the present somewhat aligns with how masculinity is estimated in women and men in the future in similar studies in other countries (see, for example, Garcia-Retamero et al., 2011; Bosak et al., 2017). We did not see any future change on either of the personality dimensions, which could be explained by this convergence occurring for the present. If women and men are currently perceived as equally masculine, and if this convergence reflects the notion of being gender equal, there is no need for further change in the future. Hence, the lack of future change is in line with the idea and discourse of Sweden as already having arrived at an end point of gender equality; in other words, where equality has already been reached. However, given that large gender differences still exist in Sweden's gender segregated labor market, e.g., women take longer parental leave than men, gender equality is still to be reached. But, as indicated by our results, the movement toward gender equality might proceed at a much slower pace in the future. If people believe that equality has already been reached and perceive women and men as being alike, the existence of gender segregation might be attributed to individual preferences regarding interest in specific occupations rather than to gender stereotypes, which under the current neoliberal framework might be seen as less important to change. Future research should therefore study the origins and consequences of dissonance between actual and perceived gender segregation. For example, the media often strive to present counter-stereotypical representatives. Although these aims are well intended and such representatives probably serve as important role models, such strategies might also backfire by fostering false understandings of the actual gender distribution of the labor market.

At the same time, we found no increase in the perceived communality of men. Because Sweden is highly egalitarian, and because Swedish fathers take more parental leave than any other fathers in the world, we expected that there would be a perceived increase in feminine traits among men from past to present time. This expectation was not confirmed in any of the studies, even though participants perceived increased numbers of men in communal roles in both studies. In Study 1, participants overestimated the number of men in communal roles, and in Study 2, they were presented with statistical facts showing the increase in fathers' parental leave from 1974 (the introduction of parental leave rather than maternal leave) to 2017. These facts showed a trend over time which clearly indicated that men have become increasingly engaged in the communally demanding task of child-rearing. Still, there was no corresponding increase in perceived femininity of men. Even though participants did not perceive a change in men's feminine traits, the Swedish male stereotype is quite balanced on agency and communion, as revealed when more closely examining the means for men's femininity and masculinity scores.

Several explanations for the difference in malleability of traits associated with masculinity or femininity are possible, and should be more closely studied in future research. One supposition is that the perceived traits associated with masculinity are more malleable than traits associated with femininity. Past studies have shown that social roles were more strongly correlated with agentic than communal behavior (Moskowitz et al., 1994). Masculine traits may be easier to gain because they are viewed as performative rather than essential. Hence, as women engage in agentic roles, they are perceived to gain agentic traits. Feminine traits, especially those that relate to nurturing and care-taking, on the other hand, might be seen as more essential to the category 'woman' and more difficult to gain through role occupation (McPherson et al., 2018). Following this, even if men engage in communal roles, they may not be perceived to gain communal traits. In support of this idea, the results from Study 2 showed that when men entered communal roles, they were not perceived as having gained communal/feminine traits. Presenting participants with facts about men's increased participation in communal roles did not affect ratings of femininity, and perception of a higher degree of men in communal roles instead mediated the decrease in men's perceived masculinity over time. Some scholars argue that biology might be one cue that fosters essentialism, and if women's caregiving is perceived as more related to biology than men's (McPherson et al., 2018) this could be one factor explaining why men did not increase in communal traits. Yet another reason might be that the number of men in caregiving roles is still too small to cause a change in perceived communality. Finally, mothers and fathers may enact caregiving in a range of different ways – that is, simply because a man is home for parental leave does not mean that he is engaging in caregiving behaviors in the same way that women do. In comparison to roles in the labor market, specific family roles might be easier to adjust to broad gender stereotypes.

Moreover, in Study 1, the female stereotype had higher values on positive traits associated with femininity than the male stereotype across all three time points, indicating a stability in gender differences. This could be explained by the fact that even though Swedish fathers take comparatively more parental leave than fathers in other countries, Swedish mothers still take the bulk of the parental leave, and they also work in more communal sectors of the labor market. Furthermore, men have not entered communal occupations, as shown by official statistics (Statistics Sweden, 2016), meaning that the Swedish labor market is still gender segregated along the lines of communion/agency. Hence, even though Sweden is highly ranked on national indices of gender equality, it is actually not gender equal, even though many of its citizens seem to believe that it is. The strong overestimation of gender equality in occupations supports this interpretation.

A limitation of these studies is that we have not controlled for participants' awareness of the current gender equality situation in Sweden. In Study 1, we found strong overestimations of the extent to which women and men have entered into non-stereotypical occupations. In Study 2, we controlled for participants' knowledge of actual changes in division of labor; however, participants still underestimated the degree of gender segregation present in the Swedish labor market. Future studies should more explicitly test whether Swedish people think that equality has been reached and whether such beliefs also influence perceptions of division of labor and gender stereotype content.

Another possible explanation of the convergence between women and men on three of the four personality dimensions

might be that traits associated with femininity and masculinity differ in Sweden compared to other nations in which these stereotype dimension scales have been tested. Future research should investigate if this decrease in stereotypicality of classically gendered traits has led to a decrease in gender stereotyping, or if other traits than those used in previous research have become gendered, thus contributing to updated knowledge regarding gender differences in stereotype content.

# CONCLUSION

In support of social role theory, we directly showed that the perceived change in women's agentic traits was specifically associated with a perceived change in the roles occupied by women. However, men were not perceived to change as a result of changing roles. Instead, when men were seen in non-traditional roles, their communal characteristics did not increase. Thus, seeing men taking their children to pre-school as described in the first quote is so far not enough to also perceive men as communal.

# ETHICS STATEMENT

These studies were carried out in accordance with the national guidelines on ethical research established by the Swedish Research Council retrievable at https://publikationer.vr.se/en/product/good-research-practice/. All participants gave their informed consent before participating in the survey.

# AUTHOR CONTRIBUTIONS

MGS conceptualized the idea and wrote the manuscript. AK collected data, performed statistical analyses, and wrote the result section. All authors participated in the planning of the studies, interpretation and discussion of results, and in writing the manuscript.

# FUNDING

This research was funded by FORTE (Swedish Research Council for Health, Working Life and Welfare, 2014-00418).

# ACKNOWLEDGMENTS

We thank Hellen Vergoossen for valuable comments on an earlier draft of this paper.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00037/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gustafsson Sendén, Klysing, Lindqvist and Renström. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Gender Trouble in Social Psychology: How Can Butler's Work Inform Experimental Social Psychologists' Conceptualization of Gender?

Thekla Morgenroth<sup>1</sup> \* and Michelle K. Ryan<sup>1,2</sup>

<sup>1</sup> Department of Psychology, University of Exeter, Exeter, United Kingdom, <sup>2</sup> Faculty of Economics and Business, University of Groningen, Groningen, Netherlands

A quarter of a century ago, philosopher Judith Butler (1990) called upon society to create "gender trouble" by disrupting the binary view of sex, gender, and sexuality. She argued that gender, rather than being an essential quality following from biological sex, or an inherent identity, is an act which grows out of, reinforces, and is reinforced by, societal norms and creates the illusion of binary sex. Despite the fact that Butler's philosophical approach to understanding gender has many resonances with a large body of gender research being conducted by social psychologists, little theorizing and research within experimental social psychology has drawn directly on Butler's ideas. In this paper, we will discuss how Butler's ideas can add to experimental social psychologists' understanding of gender. We describe Butler's ideas from Gender Trouble and discuss the ways in which they fit with current conceptualizations of gender in experimental social psychology. We then propose a series of new research questions that arise from this integration of Butler's work and the social psychological literature. Finally, we suggest a number of concrete ways in which experimental social psychologists can incorporate notions of gender performativity and gender trouble into the ways in which they research gender.

Keywords: gender trouble, gender, gender performativity, social psychology, non-binary gender, genderqueer, Judith Butler

> "We're born naked, and the rest is drag." (RuPaul, 1996)

# INTRODUCTION

A quarter of a century ago, philosopher Judith Butler (1990) called upon society to create "gender trouble" by disrupting the binary view of sex, gender, and sexuality. Key to her argument is that gender is not an essential, biologically determined quality or an inherent identity, but is repeatedly performed, based on, and reinforced by, societal norms. This repeated performance of gender is also performative, that is, it creates the idea of gender itself, as well as the illusion of two natural, essential sexes. In other words, rather than being women or men, individuals act as women and men, thereby creating the categories of women and men. Moreover, they face clear negative consequences if they fail to do their gender right.

#### Edited by:

Alice H. Eagly, Northwestern University, United States

#### Reviewed by:

Peter Hegarty, University of Surrey, United Kingdom Marianne LaFrance, Yale University, United States

#### \*Correspondence:

Thekla Morgenroth T.Morgenroth@exeter.ac.uk

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

> Received: 28 March 2018 Accepted: 09 July 2018 Published: 27 July 2018

#### Citation:

Morgenroth T and Ryan MK (2018) Gender Trouble in Social Psychology: How Can Butler's Work Inform Experimental Social Psychologists' Conceptualization of Gender? Front. Psychol. 9:1320. doi: 10.3389/fpsyg.2018.01320


We argue that Butler's philosophical approach to understanding gender has many resonances with, and implications for, a large body of gender research being conducted by social psychologists. Indeed, Butler's notion of performativity echoes a range of social psychological approaches to gender and gender difference. What we social psychologists might call gender norms and stereotypes (e.g., Eagly, 1987; Fiske and Stevens, 1993), or gender schemas (Bem, 1981), provide the "scripts" for what Butler describes as the performance of gender.

We are not the first to point out the relevance of Butler's work to social psychology. Bem (1995), drawing on Butler's work, argued that as gender researchers we should create gender trouble by making genders that fall outside of the binary visible, in order to disrupt binary, heteronormative views of gender within and outside of psychology. Minton (1997) argued that queer theory more broadly, which challenges the binary, heteronormative system of sex and gender, should inform psychological theory and practice. Similarly, Hegarty (1997) used Butler's arguments regarding performativity to criticize neuropsychological research that essentializes sexual orientation, pointing out the ways in which it ignores historical and cultural variation in sexuality and excludes women and other minorities. However, despite these calls for gender trouble over 20 years ago, we believe that social psychology, and experimental social psychology in particular, has yet to truly step up and answer the call.

Despite past acknowledgments of the importance of Butler's work by social psychologists, in particular by qualitative psychologists, to our knowledge, little theorizing and research within experimental (and quantitative) social psychology has directly drawn on Butler's ideas. This is despite the fact that there are identifiable similarities between the broad theoretical ideas espoused by many social psychologists with an interest in gender and Butler's ideas. Thus, we argue that there is great value in (again) promoting the ideas Butler puts forward in Gender Trouble to social psychologists. While experimental social psychological perspectives on gender have been concerned primarily with the origin and perpetuation of gender stereotypes, Butler's work is more political in its explicit call to create gender trouble. The political nature of the work is perhaps one reason why experimental social psychologists have been reluctant to build on and integrate Butler's ideas in their work – but, we would argue, it is indeed one of the reasons they should. Combining these two perspectives seems potentially fruitful, bringing together Butler's theorizing and her call for social and political change with established experimental social psychological theory and empirically testable hypotheses.

In this paper we will first describe Butler's work in more detail. We will then discuss the extent to which her work fits with different conceptualizations of gender in the social psychological literature, with a focus on experimental social psychology. We will then propose new avenues of research that could potentially grow out of an integration of Butler's work into social psychology. Finally, we will discuss the different ways in which Butler's work can inform and challenge the ways in which we, as experimental social psychologists, study and operationalize gender.

# BUTLER'S VIEW ON GENDER

In her book Gender Trouble, Butler (1990) argues that within Western culture, sex, gender, and sexual orientation are viewed as closely linked, essential qualities. The prevalent view is that biological sex is binary (male vs. female), essential, and natural, and that it forms the basis for binary gender, which is viewed as the cultural interpretation of sex, and sexual desire. In other words, there is a belief that a baby born with a penis will grow up to identify and act as a man – whatever that means in a specific culture – and, as part of this gender role, be sexually attracted to women. Similarly, there is a belief that a baby born with a vagina will grow up to identify and act as a woman and, as part of this gender role, be sexually attracted to men. Butler argues that these configurations of sex, gender, and sexual desire are the only "intelligible" genders in our culture.

This societal view of gender is also reflected in the works of many feminist writers, who define sex as biological and gender as cultural (see Gould, 1977, for a review and critical discussion). Butler criticizes this distinction between sex – as natural, essential, and pre-discursive (i.e., existing before culture and before interpretation) – and gender as its cultural interpretation. She argues that it is not just gender that is culturally constructed and has prescriptive and proscriptive qualities, but that this also applies to sex as a binary category. Through this, Butler (1990) argues that the distinction between sex and gender is meaningless, noting that "perhaps this construct called 'sex' is as culturally constructed as gender; indeed, perhaps it was always already gender with the consequence that the distinction between sex and gender turns out to be no distinction at all" (p. 9).

Butler cites evidence for the considerable variability in chromosomes, genitalia, and hormones, which do not always align in the expected, binary manner. Indeed, even biologists, who traditionally view the body as natural and pre-discursive, increasingly argue that a binary view of human sex is overly simplistic and that sex should be viewed as a spectrum rather than a dichotomy, in terms of anatomical, hormonal, and even cellular sex (see Fausto-Sterling, 2000; Ainsworth, 2015; see also Fausto-Sterling, 1993). This variability can include ambiguous genitalia, a "mismatch" between chromosomes and genitalia, or a body that is comprised of a mix of "male" (XY) and "female" (XX) cells.<sup>1</sup> Some research suggests that up to 10% of children are born with sex characteristics that do not clearly fall into the category of female or male (e.g., Arboleda et al., 2014), although these numbers are debated and some argue the number is much lower. For example, Sax (2002) argues that only very specific "conditions" should qualify as intersex and that only about 0.018% of people should be considered intersex. We would argue, however, that exact numbers or specific definitions of what constitutes "intersex" are irrelevant here and that debates about exact numbers are indeed illustrative of the very process Butler discusses – that there is no "objective" or natural sex, but that it is performatively constructed.

<sup>1</sup>Please note that these terms are based on the common view of naturally binary sex under which most researchers operate. We do not mean to imply that Butler herself would use these terms or, indeed, would be convinced by the idea that these bodies – or any bodies – exist "naturally" prior to interpretation.

Regardless of exact numbers, Butler argues that any individual who does not fall clearly into one of the two sex categories is labeled as abnormal and pathological (see Sax's usage of the term "condition"), and steps are taken to "rectify" this abnormality. For example, the majority of babies born with intersex characteristics undergo surgery and are raised as either male or female (Human Rights Watch, 2017), protecting and maintaining the binary construction of sex.

To be clear, Butler does not argue that biological processes do not exist or do not affect differences in hormones or anatomy. Rather, she argues that bodies do not exist outside of cultural interpretation and that this interpretation results in over-simplified, binary views of sex. In other words, biological processes do not themselves result in two "natural," distinct, and meaningful, categories of people. The two sexes only appear natural, obvious, and important to us because of the gendered world in which we live. More specifically, the repeated performance of two polar, opposite genders makes the existence of two natural, inherent, pre-discursive sexes seem plausible. In other words, Butler views gender as a performance in which we repeatedly engage and which creates the illusion of binary sex. She argues:

"Because there is neither an 'essence' that gender expresses or externalizes nor an objective ideal to which gender aspires; because gender is not a fact, the various acts of gender create the idea of gender, and without those acts, there would be no gender at all. Gender is, thus, a construction that regularly conceals its genesis. The tacit collective agreement to perform, produce, and sustain discrete and polar genders as cultural fictions is obscured by the credibility of its own production. The authors of gender become entranced by their own fictions whereby the construction compels one's belief in its necessity and naturalness." (p. 522)

Thus, for Butler, gender is neither essential nor biologically determined, but rather it is created by its own performance and hence it is performative. The term performativity, originating in Austin's (1962) work on performative utterances, refers to speech acts or behaviors which create the very thing they describe. For example, the sentence "I now pronounce you man and wife" not only describes what the person is doing (i.e., pronouncing something) but also creates the marriage (i.e., the thing it is pronouncing) through the pronouncement. Butler builds on this work by exploring how gender works in a similar way – gender is created by its own performance.

However, as this binary performance of gender is almost ubiquitous, its performative nature is concealed. The binary performance of gender is further reinforced by the reactions of others to those who fail to adhere to gender norms. Butler argues that "Discrete genders are part of what 'humanizes' individuals within contemporary culture; indeed, those who fail to do their gender right are regularly punished" (p. 522). This punishment includes the oppression of women and the stigmatization and marginalization of those who violate the gender binary, either by disrupting the presumed link between sex and gender (e.g., transgender individuals) or between sex and sexuality (e.g., lesbian and gay individuals) or by challenging the binary system in itself (e.g., intersex, bisexual, or genderqueer individuals). This stigma is clearly evidenced by the high rate of violence against transgender women, particularly those of color (Adams, 2017); surgeries performed on intersex babies to achieve "normal" sex characteristics (Human Rights Watch, 2017); and the stigmatization of sexual minorities (Lick et al., 2013).

These negative reactions and the binary performance of gender, Butler argues, do not exist by chance. Instead, they serve as tools of a system of power structures which is trying to reproduce and sustain itself – namely a patriarchal system of compulsory heterosexuality in which women serve as a means of reproduction to men, as their mothers and wives. These power structures are both prohibitive (i.e., proscriptive), repressing deviating gender performance, as well as generative (i.e., prescriptive), creating binary, heteronormative gender performance.

Butler's work is a call to action to overthrow these structures and end the problematic practices that they engender. However, she criticizes feminist voices who emphasize a shared identity ("women") to motivate collective action on behalf of the group in order to achieve societal changes. By arguing that gender is not something one is, but rather something one does or performs, Butler argues that gender identity is not based on some inner truth, but instead a by-product of repeated gender performance. Framing gender identity as an inherent part of the self, as many feminist writers did at the time (and indeed still do), she argues, reinforces the gender binary and in turn plays into the hands of the patriarchy and compulsory heterosexuality. Feminists should instead seek to understand how the category of "women" is produced and restrained by the means through which social change is sought (such as language or the political system).

This argument has particular relevance to the notion of gender identity. As such, it has been criticized as invalidating transgender individuals, whose experience of a true inner gender identity that is not in line with the sex they were assigned at birth is often questioned. This is despite the fact that from a young age transgender individuals view themselves in terms of their expressed gender, both explicitly and implicitly, mirroring self-views of cis-gender<sup>2</sup> children (Olson et al., 2015). Butler has responded to these criticisms repeatedly. For example, answering a question about what is most often misunderstood about her theory in an interview in 2015, she replies:

"I do know that some people believe that I see gender as a "choice" rather than as an essential and firmly fixed sense of self. My view is actually not that. No matter whether one feels one's gendered and sexed reality to be firmly fixed or less so, every person should have the right to determine the legal and linguistic terms of their embodied lives. So whether one wants to be free to live out a "hard-wired" sense of sex or a more fluid sense of gender, is less important than the right to be free to live it out, without discrimination, harassment, injury, pathologization or criminalization – and with full institutional and community support." (The Conversation Project, 2015)

Thus, Butler does not question people's sense of self, but instead criticizes a shared gender identity as the necessary basis for political action. She points out that abandoning the idea of gender as an identity does not take away the potential of agency on behalf of women. Instead, it opens up the possibility of agency, which other approaches that view identity as fixed and stable do not enable. The fact that identity is constructed means that it is neither completely arbitrary and free, nor completely determined, leaving room for re-structuring, subversion, and for disrupting the status quo. Thus, the common identity "we, women" is not necessary for collective action on behalf of the feminist movement, as anyone can engage in subversion and the disruption of the gender binary. Indeed, we would argue that feminism becomes more powerful as an inclusive movement for gender equality more broadly defined, not just equality between women and men.

<sup>2</sup> "Cis" refers to individuals for whom the sex they are assigned at birth and their gender identity align.

In conclusion, Butler argues that we, as a society, need to create gender trouble by disrupting the gender binary to dismantle the oppressive system of patriarchy and compulsory heterosexuality. While some of Butler's ideas seem very different from how gender is generally viewed in the experimental social psychological literature, others resonate well with social psychological theorizing and empirical research. In the next section, we will discuss ways in which Butler's view is compatible – and incompatible – with some of the most prominent conceptualizations of gender in experimental social psychology.

# IS BUTLER'S VIEW COMPATIBLE WITH CONCEPTUALIZATIONS OF GENDER IN SOCIAL PSYCHOLOGY?

Gender has been an increasingly important focus within psychology more generally, and in social psychology in particular (e.g., Eagly et al., 2012). While there is considerable variation in how psychologists view and treat gender, we argue that many of these approaches fall into one of three traditions: (1) evolutionary approaches which view binary, biological sex as the determinant of gender and gender differences; (2) social structural approaches which view societal forces such as status and social roles as the determinant of gender stereotypes and, in turn, gender differences; and (3) social identity approaches, which are not mutually exclusive with social structural approaches and which view gender as one out of many social categories with which individuals identify to varying degrees. In addition, integrative approaches draw on more than one of these traditions, as well as developmental, social cognitive, and sociological models of gender, and integrate them to explain gendered behavior. While none of these approaches is entirely compatible with the argument that binary sex is constructed through the repeated binary performance of gender with gender identity as a by-product of this performance, there are great differences in the extent to which they are in line with, and can speak to, Butler's ideas.

Evolutionary psychology is, we would argue, the least compatible with Butler's view on sex and gender. Evolutionary approaches to the psychology of gender maintain that gender differences are, for the most part, genetic – resulting from the different adaptive problems faced by women and men in their evolutionary past (see Byrd-Craven and Geary, 2013), particularly due to reproductive differences such as paternal uncertainty for men and higher parental investment for women. These differences, it is argued, then shaped our genes – and gender differences – through sexual selection (i.e., gender differences in the factors predicting successful reproduction; Darwin, 1871). These approaches can be described as essentializing gender, that is, promoting the belief that men and women share an important but unobservable "essence." Essentialism includes a range of factors such as the degree to which individuals perceive social categories to be fixed and natural (Roberts et al., 2017) and has been shown to be associated with greater levels of stereotyping and prejudice (Brescoll and LaFrance, 2004; Bastian and Haslam, 2006). Evidence further suggests that people who hold highly essentialist beliefs about gender are more supportive of what the authors call "boundary-enhancing initiatives" such as gender-segregated classrooms and legislation forcing transgender individuals to use the bathroom associated with the sex they were assigned at birth (Roberts et al., 2017). Thereby, essentialism, and the resultant stereotypes and prejudice, contribute to the reinforcement of the status quo.

Evolutionary psychology's approach to gender exemplifies many points Butler (1990) criticizes in Gender Trouble. First, it treats sex as a pre-discursive binary fact rather than a cultural construct. In other words, it ignores variability in chromosomes, genitals, and hormones (Fausto-Sterling, 1993; Ainsworth, 2015) and views binary sex – and gender – as an inherent, essential quality. Moreover, evolutionary approaches argue that gender follows from sex and thus portray binary sex as an explanation for, rather than a result of, gender differences (i.e., gender performance). In addition to ignoring the existence of intersex individuals, these approaches also often ignore homosexuality, focusing exclusively on heterosexual desires and reproduction. Thus, we would argue, such evolutionary approaches play into the patriarchal system of compulsory heterosexuality in which women function primarily as mothers and wives.

Social structural approaches to gender such as early conceptions of social role theory (Eagly, 1987) and the stereotype content model (Fiske and Stevens, 1993) are more compatible with Butler's views. Such approaches argue that societal structures such as social roles and differences in power and status determine gender stereotypes, which affect both gendered behavior and reactions to those who deviate from gender stereotypes. In other words, gender stereotypes provide the "script" for the performance of gender, with negative consequences for those who fail to "learn their lines" or "stick to the script."

The social psychological literature provides many empirical examples of these negative consequences. For example, Rudman and colleagues describe how those who deviate from their scripts often encounter backlash in the form of economic and social penalties (for a review see Rudman et al., 2012). This backlash discourages individuals from engaging in stereotype-incongruent behavior as they avoid negative consequences in the future, reducing their potential to act as deviating role models for others. Moreover, witnessing the backlash gender troublemakers encounter may also vicariously discourage others from breaking gender stereotypes to avoid negative consequences for themselves. The literature on precarious manhood further suggests that these issues might be particularly pronounced for men (Bosson et al., 2013). Research demonstrates that men must continuously prove their masculinity by avoiding anything deemed feminine to avoid negative consequences such as loss of status. Each of these lines of research is very much in line with Butler's arguments, both with the idea that those who "fail to do their gender right" are punished and with the idea that the gender binary is a tool to uphold the patriarchy.

However, in other respects, social structural approaches are less compatible with Butler's arguments. First, they tend not to take non-binary gender into account, and the empirical research tends to operationalize men and women as disjunct categories. Although research showing that within-gender variability is often much larger than between-gender variability (e.g., Hyde, 2005) is a good first step, it still ultimately relies on dividing people into the binary categories of female and male. Moreover, these approaches also rarely take issues of intersectionality into account (see Shields, 2008) and focus on stereotypes of white, heterosexual, middle-class, cis women and men, although there are some notable exceptions (e.g., Fingerhut and Peplau, 2006; Brambilla et al., 2011).

Approaches from the social identity and self-categorization tradition (Tajfel and Turner, 1979; Turner et al., 1987) view gender as a social identity (e.g., Skevington and Baker, 1989). This tradition argues that in addition to one's personal identity, different social groups are integrated into the self-concept, forming social identities. These social identities can be based on meaningful social categories such as gender or occupation, but can also form in response to random allocation to seemingly meaningless groups. The strength of the identification with one's gender as well as the salience of this identity in any given context determine the extent to which the self-concept is affected by gender stereotypes – and in turn the extent to which gendered patterns of behavior are displayed (e.g., Lorenzi-Cioldi, 1991; Ryan and David, 2003; Ryan et al., 2004; Cadinu and Galdi, 2012).

While the idea of gender as an identity – rather than a result of gendered behavior – may be seen as being inconsistent with Butler's argument, results from minimal group studies (e.g., Tajfel et al., 1971) are very much in line with her reasoning. These studies demonstrate that identities can form on the basis of completely irrelevant, artificial categories and are thus by no means inherent or inevitable. Thus, while in our given society, these identities are considered to be largely binary, this is not inevitable and is likely the result of social forces. Moreover, the evidence from a social identity perspective that changes in context can affect gender salience, levels of identification, and thus the extent of gendered behaviors, is also very much in line with Butler's arguments.

Lastly, integrative approaches draw on more than one of these traditions as well as developmental, social cognitive, and sociological models of gender. For example, social role theory has developed over time, integrating biological as well as social identity aspects into its framework, resulting in a biosocial approach (Eagly and Wood, 2012). More specifically, more recent versions of the theory argue that the division of labor leads to gendered behavior via three different mechanisms: (1) social regulation (as described above), (2) identity-based regulation, similar to the processes outlined by social identity theory, and (3) biological regulation through hormonal processes such as changes in testosterone and oxytocin. Importantly, these processes interact with one another, that is, hormonal responses are dependent on expectations from others and gender identity. While the social regulation of gender is very much in line with Butler's arguments, the integration of biological – and particularly evolutionary – perspectives fits less with her idea that gender performance is what creates gender.

Another influential integrative approach is the interactive model of gender-related behavior (Deaux and Major, 1987). Rather than focusing on distal factors which affect gender stereotypes, this model focuses on the situational and contextual factors which result in gendered behavior. The model assumes that the performance of gender primarily takes place in social interactions and serves specific social purposes. Gendered behavior thus emerges based on the expectations held by the perceiver, such as stereotypes, schemata, and knowledge about the specific target; the target themselves (e.g., their self-schema, their desire to confirm or disprove the perceiver's expectations), and the situation. For example, large gender differences in behavior are likely to emerge when the perceiver believes men and women are very different and thus expects stereotypical behavior, changing the way they treat and communicate with male and female targets; when male and female targets hold very gendered self-schemata and are motivated to confirm the perceiver's expectations; and when the situation makes stereotypes salient and allows for different behaviors to emerge.

This model is perhaps the most in line with Butler's perspectives on gender. Similar to Butler, it focuses on the doing of gender, that is, on gendered behavior and its emergence in social interactions. Moreover, the model takes a more social cognitive approach, referring to gendered self-schemata rather than gender identities. Thus, while retaining the context dependence of gendered behavior inherent in social identity approaches, this model does not necessarily presume gender as a social identity in terms of men and women. In contrast to all other models discussed above, this model allows for a less binary, more fluid understanding of gender.

While these approaches thus vary considerably in how compatible they are with Butler's argument, all of them treat gender as a given, pre-existing fact, which is in stark contrast to Butler's core argument of gender being a performative act, coming into existence only through its own performance. The work of social psychologists operating outside of the experimental framework is more compatible in this regard. More specifically, discourse analysts argue that the self, including the gendered self, is created through language (e.g., Kurz and Donaghue, 2013) and focus on the production of gender in interactions rather than on gender as a predictor of behavior. For example, researchers conducting feminist conversation analysis have examined how patterns in the delivery of naturally occurring speech reproduce heteronormative gender (e.g., Kitzinger, 2005) and research from the ethnomethodology-discursive tradition examines how people acquire a gendered character through speech (e.g., Wetherell and Edley, 1999).

# FUTURE RESEARCH DIRECTIONS

In the previous section, we have outlined how some of the issues raised by Butler, such as the negative reactions to those who fail to do their gender right, have already received considerable attention in the social psychological literature. Other aspects of her argument, however, have received very little attention and hold the potential for interesting future research. We identify two broad ways in which Butler's work can inform and shape future social psychological research: (a) engendering new research questions which have not yet been investigated empirically, and (b) challenging our way of studying gender itself.

# New Research Questions

Butler's work is purely theoretical and thus many of her ideas have not been tested empirically, particularly using an experimental approach. Perhaps the most central question that can be examined by social psychologists is whether creating "gender trouble" by subverting ideas about sex, gender, and sexual desire, can indeed lead to changes in binary views of sex and gender and the proscriptive and prescriptive stereotypes that come with these views. Based on predictions derived from social role theory (Eagly, 1987), we would indeed expect that a decrease in the performance of gender as binary (i.e., less gendered social roles) would lead to decreases in gender stereotyping and the reliance on gender as a social category. In other words, if genders are not tied to specific social roles (or vice versa), they lose their ability to be informative, both in terms of self-relevant information ("what should I be like?") and in terms of expectations of others ("what is this person like?").

On the other hand, as gender identity is very central to the self-image of many people (Ryan and David, 2003), challenging ideas about gender may be perceived as threatening. Social identity theory and self-categorization theory (Tajfel and Turner, 1979; Turner et al., 1987) argue that members of groups – including men and women – have a need to see their own group as distinct from the outgroup. If this distinctiveness is threatened, highly identified men and women are likely to enhance the contrast between their ingroup and the outgroup, for example by presenting themselves in a more gender stereotypical way and applying stereotypes to the other group (Branscombe et al., 1999) or by constructing gender differences as essential and biological (Falomir-Pichastor and Hegarty, 2014). These identity processes may thus reinforce a system of two distinct genders with opposing traits, and further punish and alienate those who fail to conform to gender norms and stereotypes. Future research needs to investigate the circumstances under which gender trouble can indeed lead to less binary views of gender, and the circumstances under which it does not. This needs to include identifying the psychological mechanisms and barriers involved in such change.

Importantly, this investigation should go beyond examining reactions to women and men who behave in counterstereotypical ways, such as women in leadership positions or stay-at-home fathers, and include a focus on more radical challenges to the gender binary such as non-binary and trans individuals or drag performers. Butler discusses drag as an example of gender trouble in detail, quoting the anthropologist Newton (1968) in her observations of how drag subverts notions of gender. Discussing "layers" of appearance, Newton remarks that on the one hand, the outside appearance of drag queens is feminine, but the inside (i.e., the body) is male. At the same time, however, it appears that the outside appearance (i.e., body) is male, but the inside (the "essence") is feminine, making it hard to uphold consistent, essentialist ideas about sex and gender. Butler further argues that the exaggeration of femininity (in the case of drag queens) and masculinity (in the case of drag kings) in drag performances highlights the performative nature of gendered behaviors, that is, how gender is created through gendered performance. On the other hand, we would argue that because drag performances often draw heavily on gender stereotypes, they may also reinforce the idea of what it means to be a man or a woman. To our knowledge, there is no psychological research on how drag affects perceptions of gender, but as drag becomes more and more accessible to a wider, and more mainstream, audience (e.g., due to popular TV shows such as RuPaul's Drag Race) it might be an enlightening line of research to pursue. Does drag indeed highlight the performative nature of gender or does it simply reinforce stereotypes? Are reactions to appearance-based disruptions of the gender binary different to behavior-based ones such as reactions to assertive women or submissive men?

Another potential line of research to pursue would be to build on the discursive literature by examining the performative nature of gender from an experimental social psychological perspective, testing how gender is created through speech and behavior. Drawing on some of the findings from qualitative psychological research discussed in the previous section might be helpful in developing predictions and quantitatively testable hypotheses.

Finally, if gender trouble is indeed effective in challenging binary, essentialist views of sex and gender, it is worth investigating how disruptive gender performance can be encouraged and used as a means of collective action. The literature on collective action to achieve gender equality has often drawn on (gender) identity-based ideas of mobilization (e.g., Kelly and Breinlinger, 1995; Burn et al., 2000). As outlined above, Butler criticizes these approaches and argues that group-based identities ("we, women") are not necessary to achieve change. How then can we inclusively mobilize others to engage in collective action without drawing on gender identities and inadvertently reinforcing the gender binary – and with it the patriarchal system of compulsory heterosexuality it supports?

More recently, psychologists have argued that it might be more effective to focus on "feminist" (rather than gender) ideologies which acknowledge, rather than ignore, issues of intersectionality (see Radke et al., 2016), and to encourage men to engage in collective action to achieve gender equality (e.g., Subašić et al., 2018). We agree with these arguments but further suggest that collective action research should examine how individuals of any gender can (a) be motivated to engage in collective action to achieve gender equality generally, and (b) be motivated to engage in gender trouble and disrupt binary notions of gender as a form of collective action.

# Studying Gender From a Performative Perspective

In addition to new research questions, Butler's work also highlights the need for different methodological approaches to gender in experimental social psychology, and indeed there is much that could be learnt from those who work in the discursive tradition. There is also the potential for gender researchers to engage in gender trouble themselves by changing the way in which they treat gender.

For the most part, experimental psychologists have tended to examine gender as a predictor or independent variable – examining gender differences in all manner of social, cognitive, and clinical measures (e.g., Maccoby and Jacklin, 1974; Hyde, 2005). Indeed, as researchers, we (the authors) are guilty of publishing many papers using this methodology (e.g., Haslam and Ryan, 2008; Morgenroth et al., 2017). Similar to performative speech acts, we would argue that this can be seen as a performative research practice. The way in which we conduct our research and the choices we make in relation to gender create the very construct that is studied, namely gender and gender differences. Our assumptions of gender as binary, pre-discursive, and natural produce research that focuses on binary, categorical gender as a predictor of gendered attitudes and behavior.

However, to our knowledge, there is very little quantitative or experimental research that looks at the psychological processes implicated in the performance of gender, that is, research treating gender as an outcome or dependent variable. If experimental social psychologists are to contribute to gender trouble, we should shift our views away from sex and gender as causes for behavior and psychological outcomes (i.e., as independent or predictor variables). Instead, we should treat gender – whether measured as an identity, in terms of self-stereotyping, or as simple self-categorization – as a result of societal and psychological forces. Rather than asking what sex and gender can explain, we need to look at what explains sex and gender.

Moreover, while the literature acknowledges that gender salience and gender self-stereotyping vary depending on context (e.g., Lorenzi-Cioldi, 1991; Ryan and David, 2003), gender itself, regardless of how it is measured, is treated as a stable and discrete construct. One is a man or a woman and remains so over the course of one's life. If, however, we view gender as a performance, then we must also view gender as an act, a behavior, which changes depending on context and audience. Asking participants to tick a box to indicate their gender – as many of us often do in our research practices – is an overly simplistic measure and cannot capture the nuances of doing gender. It is neither informative nor, we would argue, terribly interesting. Instead, one could measure gender identity salience and importance or gender performance – for example measuring gender stereotypical behavior or other types of gendered self-stereotyping (e.g., using measures similar to the Bem Sex-Role Inventory; Bem, 1974).
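As a purely illustrative sketch of what a context-sensitive, continuous measure might look like in practice, the following Python snippet scores masculine and feminine self-ratings separately for each context in which a participant is measured. The item keys (`m1`, `f1`, and so on), the 1-7 rating scale, and the context labels are hypothetical placeholders chosen for this example; they are not items or procedures from any published inventory.

```python
from statistics import mean

# Hypothetical item keys (assumption): placeholders, not real inventory items.
MASCULINE_ITEMS = ["m1", "m2", "m3"]
FEMININE_ITEMS = ["f1", "f2", "f3"]


def scale_scores(responses):
    """Return mean masculine and feminine self-ratings for one measurement.

    `responses` maps item keys to ratings on an assumed 1-7 scale.
    """
    return {
        "masculine": mean(responses[item] for item in MASCULINE_ITEMS),
        "feminine": mean(responses[item] for item in FEMININE_ITEMS),
    }


def scores_by_context(measurements):
    """Map each context label to the participant's scale scores in that context."""
    return {context: scale_scores(r) for context, r in measurements.items()}


# One hypothetical participant measured in two contexts.
participant = {
    "work meeting": {"m1": 6, "m2": 5, "m3": 7, "f1": 3, "f2": 2, "f3": 4},
    "family dinner": {"m1": 3, "m2": 4, "m3": 2, "f1": 6, "f2": 5, "f3": 7},
}
result = scores_by_context(participant)
```

Keeping the two scale scores separate, and keeping one pair of scores per context, means the resulting values can serve as outcome variables that are allowed to vary within a person rather than as a single fixed category.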

Similarly, we, as researchers, need to stop treating gender as a binary variable. This includes our research practices as well as our theory development and research communications. For example, the demographic sections of most questionnaires should not restrict gender to two options. Instead, they should either provide a range of different options (e.g., non-binary, genderqueer, genderfluid, and agender) or allow open responses. We would also suggest not using the option "other" in addition to "male" and "female" as it can be perceived as stigmatizing. Similarly, if asking about sex rather than gender, at least a third option (i.e., intersex) should be provided (see Fonesca, 2017, for examples).
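To illustrate how open responses to a gender question might be summarized without collapsing them into a residual "other" category, consider the following minimal Python sketch. The function names, the normalization rule (trimming whitespace and lowercasing), and the example answers are assumptions made for illustration only, and a real study would need a more careful coding scheme.

```python
from collections import Counter


def normalize(response):
    """Collapse whitespace and lowercase a free-text answer; blank answers become None."""
    cleaned = " ".join(response.split()).lower()
    return cleaned or None


def tally_gender_responses(responses):
    """Count self-descriptions as given, without merging any of them into 'other'."""
    counts = Counter()
    for raw in responses:
        label = normalize(raw)
        counts[label if label is not None else "(no answer)"] += 1
    return counts


# Hypothetical open responses from a demographic question.
answers = ["Woman", "non-binary", " woman ", "Genderfluid", "man", "", "Non-Binary"]
counts = tally_gender_responses(answers)
```

Every self-description remains visible in the tally, so decisions about grouping categories for analysis are made explicitly by the researcher rather than being built into the response format.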

However, we need to go beyond that. At the moment, even when gender is measured in a non-binary way, those who fall outside of the gender binary are usually excluded from analysis. This is equally true for sexual minorities. Unless sexual orientation is central to the research question, those who don't identify as heterosexual are often excluded by gender researchers as stereotypes and norms of gay, lesbian, bisexual, or asexual individuals often differ from general gender stereotypes. While these decisions often make sense for each individual case (and we, the authors, have in fact engaged in them as well), this overall produces a picture that erases variation and reinforces the idea that there are two opposing genders with clear boundaries. As experimental social psychologists with an interest in gender, we need to do better. Similarly, our theories themselves should allow for a fluid understanding of gender which also takes issues of intersectionality – with sexual orientation, but also with race, class, and other social categories – into account.

Finally, when we talk about gender, we should do so in a way that makes gender diversity visible rather than in a way that marginalizes non-binary gender further. For example, we can replace binary phrases such as "he or she" with gender-neutral ones such as "they" or ones that highlight non-binary gender such as "he, she, or they" or "he, she, or ze"<sup>3</sup>. While the use of the gender-neutral singular "they" is often frowned upon and deemed grammatically incorrect (American Psychological Association, 2010; University of Chicago, 2010), it has in fact been part of the English language for centuries and was widespread before being proscribed by grammarians advocating for the use of the generic masculine in the 19th century (Bodine, 1975). Despite these efforts, the singular "they" has remained part of spoken language, where it is used to refer to individuals whose sex is unknown or unspecified (e.g., "Somebody left their unicorn in my stable") and to members of mixed-gender groups (e.g., "Anybody would feed their unicorn glitter if they could").

The use of new pronouns such as "ze," specifically developed to refer to people outside of the binary, might be more effortful and equally controversial. However, evidence from Sweden, where the gender-neutral pronoun "hen" has become more widely used since the publication of a children's book using only "hen" instead of "han" (he) and "hon" (she) in 2012, indicates that attitudes toward its use have shifted dramatically from predominantly negative to predominantly positive in a very short amount of time (Gustafsson Sendén et al., 2015). As gender researchers, we should be at the forefront of such issues and promote and advance gender equality – and gender diversity – not only through our research but also by communicating our research in a gender-inclusive way, especially in light of Butler's (and others') arguments that language is a crucial mechanism in creating gender and reinforcing the gender binary.

<sup>3</sup>The exact origins of the non-binary pronouns ze/hir or ze/zir are unknown, but ze/hir is often credited to Bornstein (1996). There are no clear conventions around non-binary pronoun use and many different alternatives have been proposed.

# CONCLUSION

In this paper, we put forward suggestions for ways in which Judith Butler's (1990) notions of gender trouble could be integrated into experimental social psychology's understanding of gender, gender difference, and gender inequality. We have outlined her work and discussed the extent to which prominent views of gender within psychology are compatible with this work. Moreover, we suggested potential avenues of future research and changes in the way that we, as researchers, treat gender.

We believe that, as experimental social psychologists, we should be aware that we may inadvertently and performatively reinforce the gender binary in the way in which we do research – in the theories we develop, in the measures that we use, and in the research practices we undertake. By bringing Butler's ideas into social psychology, we can broaden our research agenda – raising and answering questions of how social change can be achieved. We can provide a greater understanding of the psychological processes involved in creating gender trouble, and in resisting gender trouble – but above all, we are in a position to create our own gender trouble.

# REFERENCES

Adams, N. (2017). GLAAD Calls for Increased and Accurate Media Coverage of Transgender Murders. Available at: https://www.glaad.org/blog/glaad-calls-increased-and-accurate-media-coverage-transgender-murders

Ainsworth, C. (2015). Sex redefined. Nature 518, 288–291. doi: 10.1038/518288a


# NOTES

The first author of this paper uses they/them/their pronouns; the second author uses she/her/hers pronouns.

# AUTHOR CONTRIBUTIONS

TM and MR jointly developed the ideas in the paper. TM wrote the paper. MR read the paper and provided feedback on several drafts of the paper.

# FUNDING

This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant Agreement No. 725128). This article reflects only the authors' views. The European Research Council and the Commission are not responsible for any use that may be made of the information it contains.

# ACKNOWLEDGMENTS

The authors would like to thank Thomas Morton, Teri Kirby, Christopher Begeny, and Renata Bongiorno for their helpful comments on a previous version of the manuscript and Peter Hegarty for his contribution as an engaged reviewer.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Morgenroth and Ryan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.