# BRIDGING GAPS BETWEEN SEX AND GENDER IN NEUROSCIENCES

EDITED BY: Annie Duchesne, Meng-Chuan Lai, Gillian Einstein, Belinda Pletzer and Marina A. Pavlova

PUBLISHED IN: Frontiers in Neuroscience, Frontiers in Behavioral Neuroscience, Frontiers in Human Neuroscience, Frontiers in Endocrinology and Frontiers in Psychology

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-865-9 DOI 10.3389/978-2-88963-865-9

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org


Topic Editors:

Annie Duchesne, University of Northern British Columbia, Canada; Meng-Chuan Lai, University of Toronto, Canada; Gillian Einstein, University of Toronto, Canada; Belinda Pletzer, University of Salzburg, Austria; Marina A. Pavlova, University Hospital Tübingen, Germany

Citation: Duchesne, A., Lai, M.-C., Einstein, G., Pletzer, B., Pavlova, M. A., eds. (2020). Bridging Gaps Between Sex and Gender in Neurosciences. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-865-9

# Table of Contents


Zhiguo Luo, Chenping Hou, Lubin Wang and Dewen Hu


Corinna M. Perchtold, Ilona Papousek, Andreas Fink, Hannelore Weber, Christian Rominger and Elisabeth M. Weiss


Yu H. Zhong, Hong Y. Wu, Ren H. He, Bi E. Zheng and Jian Z. Fan

*Bridging Sex and Gender in Neuroscience by Shedding* a priori *Assumptions of Causality*

Melissa M. Holmes and D. Ashley Monks


B. Derntl, J. Hornung, Z. D. Sen, L. Colic, M. Li and M. Walter

*Beyond Biological Sex: Interactive Effects of Gender Role and Sex Hormones on Spatial Abilities*

Belinda Pletzer, Julia Steinbeisser, Lara van Laak and TiAnni Harris

*Aromatization is Not Required for the Facilitation of Appetitive Sexual Behaviors in Ovariectomized Rats Treated With Estradiol and Testosterone*

Sherri Lee Jones, Stephanie Rosenbaum, James Gardner Gregory and James G. Pfaus

*The Verbal Interaction Social Threat Task: A New Paradigm Investigating the Effects of Social Rejection in Men and Women*

Sanne Tops, Ute Habel, Ted Abel, Birgit Derntl and Sina Radke


Irene Meester, Gerardo Francisco Rivera-Silva and Francisco González-Salazar

*Untangling the Ties Between Social Cognition and Body Motion: Gender Impact*

Sara Isernia, Alexander N. Sokolov, Andreas J. Fallgatter and Marina A. Pavlova

## Editorial: Bridging Gaps Between Sex and Gender in Neurosciences

Annie Duchesne<sup>1\*</sup>, Belinda Pletzer<sup>2,3</sup>, Marina A. Pavlova<sup>4</sup>, Meng-Chuan Lai<sup>5,6,7,8,9,10</sup> and Gillian Einstein<sup>10,11,12,13,14</sup>

<sup>1</sup> Department of Psychology, University of Northern British Columbia, Prince George, BC, Canada, <sup>2</sup> Department of Psychology, University of Salzburg, Salzburg, Austria, <sup>3</sup> Centre for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria, <sup>4</sup> Department of Psychiatry and Psychotherapy, Medical School and University Hospital, Eberhard Karls University of Tübingen, Tübingen, Germany, <sup>5</sup> Margaret and Wallace McCain Centre for Child, Youth & Family Mental Health, Azrieli Adult Neurodevelopmental Centre, and Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada, <sup>6</sup> Department of Psychiatry and Autism Research Unit, The Hospital for Sick Children, Toronto, ON, Canada, <sup>7</sup> Department of Psychiatry, Faculty of Medicine, University of Toronto, Toronto, ON, Canada, <sup>8</sup> Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom, <sup>9</sup> Department of Psychiatry, National Taiwan University Hospital and College of Medicine, Taipei, Taiwan, <sup>10</sup> Department of Psychology, University of Toronto, Toronto, ON, Canada, <sup>11</sup> Rotman Research Institute, Baycrest Hospital, Toronto, ON, Canada, <sup>12</sup> Department of Gender Studies, Linköping University, Linköping, Sweden, <sup>13</sup> Canadian Consortium on Neurodegeneration and Aging, Toronto, ON, Canada, <sup>14</sup> Wilfred and Joyce Posluns Chair in Women's Brain Health and Aging, Toronto, ON, Canada

Keywords: sex, gender, gonadal hormones, epigenetic, cognition, stress, pain, brain injury

#### **The Editorial on the Research Topic**

#### Edited by:

Hubert Vaudry, Université de Rouen, France

#### Reviewed by:

Giancarlo Panzica, University of Turin, Italy Oliver J. Bosch, University of Regensburg, Germany

> \*Correspondence: Annie Duchesne annie.duchesne@unbc.ca

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 24 April 2020 Accepted: 06 May 2020 Published: 11 June 2020

#### Citation:

Duchesne A, Pletzer B, Pavlova MA, Lai M-C and Einstein G (2020) Editorial: Bridging Gaps Between Sex and Gender in Neurosciences. Front. Neurosci. 14:561. doi: 10.3389/fnins.2020.00561

**Bridging Gaps Between Sex and Gender in Neurosciences**

Individual differences are shaped by a myriad of interrelated factors depending on how the nervous system develops, adapts, reacts to, and interacts with the outside world. Sex-related variables represent a set of sexually-defining biological characteristics including chromosomes, patterns of gene expression, hormone levels, and reproductive/sexual anatomy. These variables have been extensively linked to the development and functioning of the nervous system (Pletzer, 2015; Prager, 2017). The sex of an organism has been associated with a plethora of well-established (Jonasson, 2005; Mogil, 2012; Stevens and Hamann, 2012; Yagi and Galea, 2019) and controversial (Eliot, 2011) nervous system differences. However, sex-related biological variables rarely fully explain nervous system differences between male and female individuals (Eliot and Richardson, 2016), particularly in humans (Pavlova, 2017a; Rippon et al., 2017). In concert with biological differences, women and men differ in their experiences of the social world. Gender-related variables, including gendered behaviors, relations, expectations, beliefs, and attitudes that are experienced throughout the lifespan, have also been associated with differences in brain function and behavior across individuals (Einstein, 2007; Rippon et al., 2014; Jordan-Young et al., 2019). Sex- and gender-related variables dynamically influence our biology and the environment such that these variables are continuously shaping and being shaped in a reciprocal relationship with the world. As scientific and clinical communities recognize the need for neuroscientific inquiry that interrogates the combined contributions of sex- and gender-related variables, the call to action multiplies (Kimerling et al., 2018; Nebel et al., 2018; Grissom and Reyes, 2019; Tannenbaum et al., 2019).
This special topics issue of Frontiers assembles a collection of research, reviews and commentaries that propose new approaches to the integration of sex- and gender-related variables in neuroscience.

The constructs of sex and gender as defined in this text are often not dissociable in the literature, especially when individuals are categorized as women or men. Categorizing individuals as "women and men" or "male and female" challenges our capacity to interrogate the relative contributions of gender-related factors to understanding individual differences. That said, even when using the broad category of "women and men," novel analytical approaches can improve our characterization of where and when sex- and gender-related differences occur. Four articles within this Issue reveal how the binary category, "women and men," continues to moderate brain and psychological processes. Luo et al. demonstrate that multivariate classification approaches with high-dimensional data (e.g., tens of thousands of features per subject/observation) from cortical brain morphology can categorize adult individuals as women or men, replicating previous findings conducted with high-dimensional, large sample size, and multivariate approaches (Chekroud et al., 2016; Rosenblatt, 2016; Anderson et al., 2019). In another study, Stam et al. demonstrate opposing associations between personality traits and gray matter brain volume in individuals grouped as women or men. Two articles provide novel insights into the relevance of "women and men" as categories for understanding psychological processes. In a study exploring how individuals infer social signals from the bodies and eyes of others, Isernia et al. report that categorizing participants as "women and men" reveals that individuals within each group may use different sources of information and perceptual strategies to achieve similar levels of performance accuracy in social cognition tasks. Perchtold et al. first demonstrate that individuals categorized as women or men are equally able to generate cognitive reappraisals for anxiety-inducing situations, but that reappraisal ability is only predictive of reduced depressive symptoms in those categorized as men. These studies expand the knowledge foundation upon which elements of sex and gender can be further interrogated.

Improving our neuroscientific understanding of sex and gender can be achieved through direct pharmacological and physiological manipulation. For instance, Derntl et al. demonstrate that the dissociative experience (e.g., following the administration of a subclinical dose of ketamine) differs between individuals categorized as women or men. Similarly, Wang et al. show that the effects of mPFC transcranial direct current stimulation (tDCS) on implicit gender stereotype bias differ between individuals categorized as women or men. Addressing the larger issue of how the "women and men" category moderates the effects of pharmacological and physiological manipulations enhances the ability to make nuanced behavioral predictions and provides critical information for future clinical trial design.

Three studies in this Issue investigate the unique and relative contributions of both sex- and gender-related variables to brain and psychological processes. Hornung et al.'s preliminary findings demonstrate that the recruitment of brain regions during the processing of gendered self-attributes varies according to circulating levels of sex hormones in individuals categorized as women or men. Pletzer et al. re-examine the previously reported finding that the "women and men" category is a reliable predictor of spatial ability and reveal that this association disappears when accounting for the interactive effects of circulating levels of gonadal hormones and self-reported endorsement of stereotypical attitudes and activities. Adopting a similar analytical approach in another study, Pletzer interrogates the interaction between gonadal hormones and gendered attitudes and activities in predicting gray matter volume (Pletzer). These studies highlight how important it is to go beyond "women and men" as categories within the realm of neuroscientific inquiry, as they may be obscuring relationships that can be better explained through more nuanced biosocial interactionist approaches.

The prevalence of a number of clinical conditions differs as a function of the "women and men" categories. While an increasing number of theoretical models explore these differences by integrating dimensions of sex- and gender-related variables (Lai et al., 2015; Becker et al., 2017; Nebel et al., 2018), most studies tend to restrict causal explanations to either sex- or gender-related variables (Li and Graham, 2017; Hillerer et al., 2019; Slavich and Sacher, 2019). Thus, examining interactions between sex- and gender-related variables in clinical conditions may have theoretical and therapeutic benefits. In a critical review of the literature on fibromyalgia, Meester et al. propose a model integrating sex- and gender-related variables. Investigating physiological correlates of consciousness in patients with traumatic brain injury, Zhong et al. demonstrate that high circulating levels of testosterone within a week following the trauma predicted regaining of consciousness only in individuals identified as men. Recognizing and integrating sex- and gender-related variables is central for furthering our understanding of the brain and moving toward the development of personalized precision medicine.

Measurements, tasks, tests, and experimental manipulations are developed and validated under a number of assumptions that often do not account for the possible roles of sex- and gender-related variables. Re-examining and validating methodologies across sexes and genders is a crucial step; when validation has not been conducted with sex- and gender-related differences in mind, discriminating between true differences and methodological artifacts is simply impossible (McCarthy et al., 2017; Rich-Edwards et al., 2018). For instance, in a critical review of individual differences in placebo/nocebo effects, Enck and Klosterhalfen report that differences observed between women and men are more commonly reported in experimental studies than in randomized clinical trials, suggesting that methodological bias may contribute to apparent systemic group-level differences. Building on past stress paradigms, Tops et al. developed and validated a new neuroimaging virtual social rejection stress paradigm reproducing peer exclusion commonly experienced on social media platforms and allowing for a more specific investigation of possible sex- and gender-related differences in the neurophysiological processes of peer social rejection. Finally, Jones et al. reveal independent contributions of combined estradiol and testosterone to sexual behavior in female rats, demonstrating the empirical value of examining the role of multiple sex steroid hormones within all individuals. Revisiting and developing new methodologies that account for the possible contributions of sex- and gender-related variables is essential to provide a valid foundation of neuroscientific inquiry.

Ultimately, the research on sex and gender in neuroscience is constrained by issues with operationalizing definitions of sex and gender. Two articles in this Research Topic reconsider the stability of sex and gender as separate, uniform constructs, and how sex and gender relate to one another in the pursuit of understanding individual rather than category-based differences in neuroscience. Holmes and Monks argue that the very categories of sex and gender are problematic in attempting to bridge these constructs with neuroscientific questions. Similarly, Cortes et al. propose that, rather than treating sex and gender as discrete boxes, researchers should focus on understanding an individual's experiences of sex and gender as products of interactive, dynamic and multifaceted epigenetic processes. By focusing on individual-level variables rather than broad categories, these new conceptual frameworks facilitate the understanding of individual differences in neuroscientific processes.

Challenges remain for the bridging of sex and gender dimensions in neuroscience, and, in particular, in our understanding of the social brain (e.g., Pavlova, 2017a,b); some of these are apparent from the studies in this Issue. For example, the varied terminology employed in describing the often category-based sex- and gender-related differences across the different papers within this special issue highlights the need for researchers and clinicians to more consistently and explicitly operationalize their usage of these terms (Clayton and Tannenbaum, 2016). As well, most of the work in this Issue operationalizes "women and men" as binary categories (Hyde et al., 2019), which is understandable considering how individuals are often categorized in research, but may in fact be of questionable validity considering our current understanding of the multi-dimensional nature of sex- and gender-related variables (Johnson et al., 2009). Ultimately, the field will benefit most from going beyond the dichotomous categories of sex and gender and embracing interactionist models, as underscored by some of the papers in this special issue on Sex and Gender in Neuroscience.

## AUTHOR CONTRIBUTIONS

AD conceptualized and drafted the manuscript and implemented the suggestions of co-authors. BP provided significant practical and conceptual suggestions to the manuscript. MP provided multiple practical and conceptual suggestions to the manuscript and proofread the final version. M-CL provided practical and conceptual suggestions to the manuscript. GE provided practical and conceptual suggestions to the manuscript. All co-authors contributed to the writing of the paper.

## ACKNOWLEDGMENTS

The authors acknowledge the contribution of Jan Van Den Stock and Alfonso Abizaid as additional topic editors of this Research Topic. We thank the contributors to this Research Topic for their participation and all the reviewers for their insightful comments, recommendations to and lively discussions in the review forum.


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Duchesne, Pletzer, Pavlova, Lai and Einstein. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Gender Identification of Human Cortical 3-D Morphology Using Hierarchical Sparsity

Zhiguo Luo<sup>1</sup> , Chenping Hou<sup>2</sup> , Lubin Wang<sup>3</sup> and Dewen Hu<sup>1</sup> \*

*<sup>1</sup> College of Mechatronics and Automation, National University of Defense Technology, Changsha, China, <sup>2</sup> College of Science, National University of Defense Technology, Changsha, China, <sup>3</sup> Cognitive and Mental Health Research Center, Beijing Institute of Basic Medical Science, Beijing, China*

Differences between males and females are widespread in cognition, behavior and psychopathology, yet the underlying neurobiology remains unclear. Because brain structure is the foundation of brain function, insight into brain structure may help us better understand the functional mechanisms of gender difference. Previous structural Magnetic Resonance Imaging (MRI) studies of gender difference usually focused on gray matter (GM) concentration and structural connectivity (SC), leaving cortical morphology insufficiently characterized. In this study, a large dataset is used to explore whether cortical three-dimensional (3-D) morphology offers enough discriminative morphological features to identify gender effectively. Data from all available healthy controls (*N* = 1113) of the Human Connectome Project (HCP) were used. We propose a multivariate pattern analysis method called the Hierarchical Sparse Representation Classifier (HSRC), which achieved an accuracy of 96.77% for gender identification. Permutation tests were used to verify the reliability of gender discrimination (*p* < 0.001). Cortical 3-D morphological features within the frontal lobe were found to be the most important contributors to gender difference in human brain morphology. Moreover, we investigated the gender-discriminative ability of cortical 3-D morphology in predefined Automated Anatomical Labeling (AAL) and Resting-State Networks (RSN) templates, and found the superior frontal gyrus to be the most discriminative region in AAL and the default mode network the most discriminative in RSN. Gender difference in surface-based morphology is also discussed. Gender differences in the frontal lobe and the default mode network have been widely reported in previous structural and functional MRI studies, which suggests that morphology indeed affects human brain function. Our study indicates that gender can be identified at the individual level using cortical 3-D morphology, offers a new approach for structural MRI research, and highlights the importance of gender balance in brain imaging studies.

Keywords: cortical three-dimensional morphology, gender difference, hierarchical sparse representation classifier, Magnetic Resonance Imaging, multivariate pattern analysis

#### Edited by:

*Meng-Chuan Lai, University of Toronto, Canada*

#### Reviewed by:

*Miao Cao, Fudan University, China; Erin W. Dickie, Centre for Addiction and Mental Health (CAMH), Canada*

> \*Correspondence: *Dewen Hu dwhu@nudt.edu.cn*

Received: *02 August 2018* Accepted: *21 January 2019* Published: *07 February 2019*

#### Citation:

*Luo Z, Hou C, Wang L and Hu D (2019) Gender Identification of Human Cortical 3-D Morphology Using Hierarchical Sparsity. Front. Hum. Neurosci. 13:29. doi: 10.3389/fnhum.2019.00029*

## 1. INTRODUCTION

Gender difference has been widely reported in psychiatric and neurological diseases (Piccinelli and Wilkinson, 2000; Baron-Cohen et al., 2005; Shulman, 2007; Eranti et al., 2013; Lai et al., 2015), cognitive functions (Ren et al., 2009; Ohla and Lundström, 2013; Yin et al., 2017; Chen et al., 2018) and behaviors (Christov-Moore et al., 2014), while its neurobiological mechanism remains unclear (Giudice, 2009). As neural function has a structural basis, studying brain neuroanatomy may provide new insights into gender difference.

Previous reports tend to explain gender difference in the view of GM concentration, SC and Functional Connectivity (FC). Wang et al. (2012) applied multivariate pattern analysis on GM concentration and resting state fMRI from healthy young adults and got an accuracy of 89%, and they found the occipital lobe and the cerebellum the most discriminative regions of gender difference; Yuan et al. (2018a) proposed a three-dimensional weighted histogram of gradient orientation to describe the complex spatial structure of human brain image, and they got an over 90% accuracy of gender classification on 527 healthy adults from four research sites; Ruigrok et al. (2014) reported gender difference in the amygdala, hippocampus, and insula after meta-analysis in human brain structure; Goldstein et al. (2001) found females had higher percentage of GM than males, while Gur et al. (1999) got a converse result in white matter; Feis et al. (2013) used multimodal gender classification of T1-weighted, T2-weighted and fractional anisotropy images and indicated the frontal lobe the most discriminative lobe. Gong et al. (2009) found greater overall cortical connectivity and more efficient cortical network organizations in women; Ingalhalikar et al. (2013) reported that males had stronger intra-hemispheric SC while females had stronger inter-hemispheric SC using diffusion tensor imaging. Zhang et al. (2018) used 4 fMRI runs of 820 healthy controls from the HCP and got the accuracy of 87% using FC features for gender prediction, and they suggested that FC within the default, fronto-parietal and sensorimotor networks had the greatest gender prediction abilities while the right fusiform gyrus and the right ventromedial prefrontal cortex contributed the most in the default mode network.

Recently, gender difference in surface-based morphology such as cortical thickness, surface area, cortical curvature and cortical volume has attracted much attention. Im et al. (2006) indicated that women showed more significant localized cortical thickening in the frontal, parietal and occipital lobes, which were also reported of significant gender-related difference by Lv et al. (2010) using graph theoretical approaches; Sowell et al. (2007) found women had thicker cortices in posterior temporal and right inferior parietal regions, while men showed larger brain in all locations, especially in the frontal and occipital poles of both hemispheres; Sepehrband et al. (2018) developed a multivariate statistical learning model to predict gender from regional neuroanatomical features on different brain atlases, and they got an 83% cross-validated prediction accuracy and found the middle occipital lobes and the angular gyri the major predictors of gender.

Despite studies of gender difference in surface-based morphology, few paid attention to the original cortical 3-D morphology, which is defined as the voxel-based morphology of the cerebral cortex without gray matter concentration in the standard MNI space. Clearly the original cortical 3-D morphology contains more abundant and complete morphological information, and most surface-based morphology such as cortical thickness and curvature are measured on the cortical 3-D morphology (cortical volume and surface area are measured in the subject's undistorted native volume space). Moreover, most previous morphology studies focused on finding gender difference using statistical analysis while few of them have effectively discriminated males from females with high classification accuracy using those morphological features to support their conclusions.

In this study, we aimed to find gender difference of cortical 3-D morphology and focused on two questions: (a) Can gender be discriminated with a high accuracy using cortical 3-D morphology? (b) What is the most discriminative region of gender in cortical 3-D morphology?

## 2. MATERIALS AND METHODS

## 2.1. Data Acquisition and Preprocessing

Structural MRI was acquired from the HCP S1200 release, and details about the HCP can be seen in Essen et al. (2012). Subjects were scanned on a customized 3T Siemens scanner (Connectome Skyra) with a standard 32-channel head coil and a body transmission coil and scan parameters were as follows: TR = 2400 ms, TE = 2.14 ms, Voxel Size = 0.7 mm isotropic. All 1113 available subjects (age: 22–37 years, gender: 507 males and 606 females) were selected for our gender difference study.

Data were initially preprocessed by the HCP structural pipelines. A highlight of these pipelines is that they use T2-weighted structural images for registration, which yields more precise registration and segmentation results. The main preprocessing steps include gradient distortion correction, brain extraction, readout distortion correction, boundary-based cross-modal registration, bias field correction, the recon-all pipeline in FreeSurfer, and native-to-MNI nonlinear volume transformation; detailed preprocessing steps can be found in Glasser et al. (2013). One of the outputs, the wmparc, is an accurate subject-specific human brain mask of the gray matter and white matter in the MNI space. In the file "MNINonLinear/wmparc.nii" of each HCP subject, the scattered integers between 251 and 2035 stand for different subregions of the cerebral cortex; when these were set to 1 and all others to 0, the original 3-D morphology of the cerebral cortex was obtained (**Figure 1A**). We also analyzed the discriminative abilities of both anatomical and functional subregions, so atlas-based morphology analysis (Meyer et al., 2017) was conducted with two predefined atlases: the AAL template (Tzourio-Mazoyer et al., 2002) was used as the structural atlas and the 7 RSN template (Thomas Yeo et al., 2011) was used as the functional atlas (https://surfer.nmr.mgh.harvard.edu/fswiki/CorticalParcellation\_Yeo2011, "Yeo2011\_7Networks\_MNI152\_FreeSurfer-Conformed1mm\_LiberalMask.nii," downsampled to 1.4 mm isotropic). All the MRI files and templates were in the standard MNI space for comparisons across subjects.
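The binarization of the label volume described above can be illustrated with a minimal sketch. It uses plain nested lists rather than a real NIfTI file, and the helper name `binarize_cortex` is hypothetical. Reading "between 251 and 2035" as `251 < label <= 2035` is an assumption, since the text does not state whether the bounds are inclusive.

```python
# Hypothetical helper, not the authors' code: turn a label volume into a
# binary cortex mask. Assumption: "between 251 and 2035" is read as
# 251 < label <= 2035; the text does not say whether bounds are inclusive.

def binarize_cortex(labels, low=251, high=2035):
    """Return a nested list with 1 where low < label <= high, else 0."""
    return [
        [[1 if low < v <= high else 0 for v in row] for row in plane]
        for plane in labels
    ]

# Toy 2 x 2 x 2 "volume"; the integers are illustrative only and are not
# taken from a real wmparc file.
toy = [
    [[0, 1003], [2035, 3001]],
    [[251, 1028], [17, 2001]],
]
mask = binarize_cortex(toy)
```

In practice the volume would be loaded with a neuroimaging I/O library and the comparison applied to the whole array at once; the nested-list form is used here only to keep the sketch dependency-free.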

As surface-based morphology was discussed in this study, we obtained 4 surface-based morphological features (thickness, curvature, sulc and myelinmap) in the HCP for gender difference analysis. They were all spatially downsampled to a ∼32k mesh of each hemisphere (average vertex spacing of ∼2 mm).

FIGURE 1 | Framework of gender identification of cortical 3-D morphology via HSRC. (A) Process of cortical 3-D morphology extraction. For each subject, T1w and T2w images were used in the HCP structural pipelines to generate a normalized volume parcellation, the wmparc, which is an accurate subject-specific human brain mask of the gray matter and white matter in the MNI space. We set the value of the gray matter voxels to 1 and all others to 0 to obtain the original cortical 3-D morphology. (B) Gender classification with cortical 3-D morphology using HSRC. The original cortical 3-D morphology (0.7 mm) of each subject was first downsampled to 1.4 and 2.8 mm. Gender classification was then conducted on the 2.8 mm 3-D morphology with 10-fold cross validation, with RFS applied to the training data to select voxels in each fold. We computed the overall classification accuracy as a function of the number of selected voxels in each fold, and took the union of the voxels selected in each fold at the highest accuracy as the discriminative voxels. The corresponding voxels in the 1.4 mm morphology were used as the initial input for the next 10-fold cross validation. The same operation was then conducted on the 0.7 mm data.

## 2.2. Hierarchical Sparsity Feature Selection

Considering the scale of the dataset in this study, 10-fold cross validation was conducted for gender classification. Given the very large number of features in the MRI data (dimensionality = 1,113 × 4,352,560 for the 0.7 mm data matrix after discarding all-0 and all-1 columns), dimensionality reduction is essential to alleviate or avoid the curse of dimensionality (Liu and Motoda, 1998).
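Discarding the all-0 and all-1 columns mentioned above amounts to removing voxels that take the same value in every subject. A minimal sketch, assuming the data are held as a subjects-by-voxels 0/1 NumPy array (the helper name `drop_constant_columns` is hypothetical):

```python
import numpy as np

def drop_constant_columns(X):
    """Drop columns of a 0/1 subjects-by-voxels matrix that are all 0 or all 1."""
    keep = X.min(axis=0) != X.max(axis=0)   # a column varies iff its min differs from its max
    return X[:, keep], np.flatnonzero(keep)

# Toy matrix: 3 subjects, 4 voxels. Column 0 is all 0 and column 1 is all 1.
X = np.array([[0, 1, 1, 0],
              [0, 1, 0, 1],
              [0, 1, 1, 1]])
X_reduced, kept = drop_constant_columns(X)
```

Returning the indices of the retained columns makes it possible to map selected features back to voxel positions later.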

Feature extraction algorithms like Principal Component Analysis (PCA) combine all features to create new, lower-dimensional features in a new feature space, and general statistical tests like the t-test are unsuitable for filtering 0-1 distributed features. By comparison, sparse representations select typical features directly from the original feature space, so that the original physical meaning of the cortical morphological features is maintained and the results are easier to interpret.

Since sparse representation does not cope well with data of very large dimensionality (Su et al., 2012), we proposed a Hierarchical Sparse Representation Classifier (HSRC) algorithm for informative feature selection and classification (**Figure 1B**). MRI data were downsampled to voxel size = 1.4 mm isotropic (feature dimensionality = 544,069 after discarding all-0 and all-1 columns) and voxel size = 2.8 mm isotropic (feature dimensionality = 67,994 after discarding all-0 and all-1 columns). The 10-fold cross-validation classification was first conducted on the 2.8 mm data. In each fold, we ranked all 67,994 features of the training set using sparse representation and empirically selected the first 10,000 features in steps of 200, which gave 50 (10,000/200) classification results per fold. The overall classification accuracy was the average, across folds, of the accuracies obtained with the same number of training features. At the highest overall classification accuracy, the union of the features selected in each fold was regarded as the set of most discriminative features of the 2.8 mm data. The 1.4 mm features corresponding to all selected 2.8 mm features (8 times the dimensionality of the selected 2.8 mm features) were defined as the initial features for the next sparse representation operation. The same operation was then conducted from the 1.4 mm data to the 0.7 mm data.
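The coarse-to-fine step can be illustrated with simple index arithmetic. The sketch below assumes that the 2.8 mm and 1.4 mm grids are aligned and differ by a factor of 2 along each axis, so that every coarse voxel covers 2 × 2 × 2 = 8 fine voxels; the function name `children` and the voxel coordinates are hypothetical.

```python
from itertools import product

def children(coarse_voxels, factor=2):
    """Map coarse-grid voxel indices (i, j, k) to the finer-grid voxels they cover."""
    fine = set()
    for i, j, k in coarse_voxels:
        for a, b, c in product(range(factor), repeat=3):
            fine.add((factor * i + a, factor * j + b, factor * k + c))
    return sorted(fine)

# Two voxels selected at 2.8 mm give 16 candidate voxels at 1.4 mm.
selected_2p8 = [(0, 0, 0), (3, 1, 2)]
candidates_1p4 = children(selected_2p8)
```

Applying the same mapping again to the voxels selected at 1.4 mm gives the candidate voxels at 0.7 mm, so sparse selection at each level only sees a small fraction of the full-resolution features.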

Given training data **X** = [**x**<sub>1</sub>, **x**<sub>2</sub>, · · · , **x**<sub>n</sub>] ∈ ℝ<sup>d×n</sup> and the associated class labels **y** ∈ ℝ<sup>n</sup>, the sparse representation algorithm can be modeled as follows:

$$\mathbf{y} = \mathbf{X}^T \mathbf{w},\tag{1}$$

where **w** ∈ ℝ<sup>d</sup> is the weight vector to be solved, which should be as sparse as possible. This can be described as the following optimization problem:

$$\begin{aligned} \min & \|\mathbf{w}\|\_{0} \\ \text{s.t.} & \mathbf{X}^{T}\mathbf{w} = \mathbf{y}, \end{aligned} \tag{2}$$

This is an ℓ0-norm problem. Its solution is the most desirable one for Equation 1, but it is difficult to obtain.

Under practical conditions, the ℓ0-norm problem is equivalent or approximately equivalent to the ℓ1-norm problem, which is convex and can thus be optimized easily. In addition, using the ℓ1-norm makes **w** less sensitive to noise. Consequently, we can obtain **w** by solving the following problem:

$$\begin{aligned} \min & \|\mathbf{w}\|\_1 \\ \text{s.t.} & \mathbf{X}^T \mathbf{w} = \mathbf{y}, \end{aligned} \tag{3}$$

Considering that the equality constraint **X**<sup>T</sup>**w** = **y** makes **w** sensitive to outliers of **X**, we suggested a new formulation:

$$\min\_{\mathbf{w}} f(\mathbf{w}) = \left\| \mathbf{X}^T \mathbf{w} - \mathbf{y} \right\|\_1 + \gamma \left\| \mathbf{w} \right\|\_1,\tag{4}$$

where γ is a regularization parameter. In this way we obtain an approximate solution of Equation 1 and make the sparse representation more robust.

Equation 4 is a specific form of the Robust Feature Selection (RFS) algorithm proposed by Nie et al. (2010). RFS is based on regression and ℓ2,1-norm sparsity regularization. Unlike traditional least squares regression, which uses the squared ℓ2-norm loss, RFS emphasizes joint ℓ2,1-norm minimization on both the loss function and the regularization. Before introducing the RFS method, we first present the definition of the ℓ2,1-norm of a matrix.

For a matrix **M** ∈ ℝ<sup>n×m</sup>, its ℓ2,1-norm is defined as:

$$\|\mathbf{M}\|\_{2,1} = \sum\_{i=1}^{n} \sqrt{\sum\_{j=1}^{m} m\_{ij}^{2}} = \sum\_{i=1}^{n} \left\|\mathbf{m}^{i}\right\|\_{2},\tag{5}$$

where **m**<sup>i</sup> is the i-th row of **M**.
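As a quick numerical check of Equation 5, the ℓ2,1-norm can be computed by summing the Euclidean norms of the rows. The helper name `l21_norm` and the example matrix are illustrative only.

```python
import math

def l21_norm(M):
    """Sum of the Euclidean (l2) norms of the rows of M, given as a list of rows."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in M)

M = [[3.0, 4.0],     # row norm 5
     [0.0, 0.0],     # row norm 0
     [5.0, 12.0]]    # row norm 13
value = l21_norm(M)  # 5 + 0 + 13
```

A row that is entirely zero contributes nothing to the sum, which is why penalizing this norm pushes whole rows toward zero rather than individual entries.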

Given training data {**x**<sub>1</sub>, **x**<sub>2</sub>, · · · , **x**<sub>n</sub>} ⊂ ℝ<sup>d</sup>, the RFS algorithm employs the one-vs-rest binary coding scheme to encode the class labels. Denote the total number of classes as c. The label vector of training sample **x**<sub>i</sub> is represented by **y**<sub>i</sub> ∈ {0, 1}<sup>c×1</sup>, such that y<sub>i</sub>(j) = 1 if **x**<sub>i</sub> belongs to the j-th category and y<sub>i</sub>(j) = 0 otherwise. The associated class labels of all data points are {**y**<sub>1</sub>, **y**<sub>2</sub>, · · · , **y**<sub>n</sub>} ⊂ ℝ<sup>c</sup>. RFS optimizes the following robust loss function:

$$\min\_{\mathbf{W}} \sum\_{i=1}^{n} \left\| \mathbf{W}^T \mathbf{x}\_i + \mathbf{b} - \mathbf{y}\_i \right\|\_2,\tag{6}$$

where **W** ∈ ℝ<sup>d×c</sup> is the projection matrix and **b** ∈ ℝ<sup>c</sup> is the bias vector.

For simplicity, the bias **b** can be absorbed into **W** when the constant value 1 is added as an additional dimension to each data point **x**<sub>i</sub> (1 ≤ i ≤ n). Thus, the problem becomes:

$$\min\_{\mathbf{W}} \sum\_{i=1}^{n} \left\| \mathbf{W}^T \mathbf{x}\_i - \mathbf{y}\_i \right\|\_2. \tag{7}$$

For the sake of feature selection, we add a sparse regularizer. Essentially, the i-th row vector of **W** corresponds to the transformation vector of the i-th feature in the regression, and can also be regarded as a vector that measures the importance of the i-th feature. For feature selection, we expect the transformation matrix to be row-sparse, that is, only a small number of row vectors of **W** should be non-zero. The corresponding features are then selected, since they are sufficient to regress each data point **x**<sub>i</sub> to its label vector **y**<sub>i</sub>. When the ℓ2-norm of each row vector is used as a metric of its contribution to the regression, this row-sparsity requirement leads to the following RFS objective function:

$$\min\_{\mathbf{W}} \sum\_{i=1}^{n} \left\| \mathbf{W}^T \mathbf{x}\_i - \mathbf{y}\_i \right\|\_2 + \gamma \sum\_{i=1}^{d} \left\| \mathbf{w}^i \right\|\_2,\tag{8}$$

where **w**<sup>i</sup> denotes the i-th row of **W**. The parameter γ balances the regression loss against the sparse regularizer; it was set to the default value of 0.01, which Nie et al. (2010) suggested on the basis of a series of empirical studies.

Denoting the data matrix as **X** = [**x**<sub>1</sub>, **x**<sub>2</sub>, · · · , **x**<sub>n</sub>] ∈ ℝ<sup>d×n</sup> and the label matrix as **Y** = [**y**<sub>1</sub>, **y**<sub>2</sub>, · · · , **y**<sub>n</sub>]<sup>T</sup> ∈ ℝ<sup>n×c</sup>, the objective function becomes:

$$\begin{aligned} \min\_{\mathbf{W}} J(\mathbf{W}) &= \sum\_{i=1}^{n} \left\| \mathbf{W}^T \mathbf{x}\_i - \mathbf{y}\_i \right\|\_2 + \gamma \sum\_{i=1}^{d} \left\| \mathbf{w}^i \right\|\_2 \\ &= \left\| \mathbf{X}^T \mathbf{W} - \mathbf{Y} \right\|\_{2,1} + \gamma \left\| \mathbf{W} \right\|\_{2,1} . \end{aligned} \tag{9}$$

The ℓ2,1-norm based loss function makes RFS robust to outliers in the data points, and the ℓ2,1-norm regularization enables RFS to select features across all data points with joint sparsity. Although both terms of the objective function are non-smooth, the problem can be solved efficiently with the reweighted method, which has been proven to converge. More details about the RFS algorithm can be found in Nie et al. (2010).
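To see why Equation 4 is a specific form of the RFS objective in Equation 9, consider the case of a single label column (c = 1), in which each subject's label is encoded as one scalar; this encoding is assumed here purely for illustration. Then **W** reduces to a vector **w** ∈ ℝ<sup>d</sup>, every row of **X**<sup>T</sup>**W** − **Y** and of **W** is a scalar, and each ℓ2-norm becomes an absolute value:

$$\left\| \mathbf{X}^T \mathbf{W} - \mathbf{Y} \right\|\_{2,1} = \sum\_{i=1}^{n} \left| \mathbf{x}\_i^T \mathbf{w} - y\_i \right| = \left\| \mathbf{X}^T \mathbf{w} - \mathbf{y} \right\|\_1, \qquad \left\| \mathbf{W} \right\|\_{2,1} = \sum\_{j=1}^{d} \left| w\_j \right| = \left\| \mathbf{w} \right\|\_1 .$$

Substituting both identities into Equation 9 gives an ℓ1 loss plus γ times an ℓ1 penalty on **w**, which is the form of Equation 4.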

After obtaining the solution **W**, features are ranked according to the value of ‖**w**<sup>i</sup>‖<sub>2</sub>. In other words, a larger value of ‖**w**<sup>i</sup>‖<sub>2</sub> indicates that the i-th feature is more important. The features with less importance are then discarded.
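The ranking step can be sketched in a few lines. The weight matrix below is a made-up example rather than an RFS solution, and the names `rank_features` and `subset_sizes` are hypothetical; the subset sizes follow the "first 10,000 features in steps of 200" scheme described in Section 2.2.

```python
import math

def rank_features(W):
    """Return feature indices sorted by decreasing l2-norm of the rows of W, plus the norms."""
    scores = [math.sqrt(sum(v * v for v in row)) for row in W]
    order = sorted(range(len(W)), key=lambda i: scores[i], reverse=True)
    return order, scores

W = [[0.1, -0.1],   # feature 0: small weights
     [2.0,  1.0],   # feature 1: largest row norm
     [0.0,  0.0],   # feature 2: unused
     [-1.0, 0.5]]   # feature 3
order, scores = rank_features(W)
top2 = order[:2]

# Nested feature subsets evaluated per fold: 200, 400, ..., 10,000 features.
subset_sizes = list(range(200, 10_001, 200))
```

Because the subsets are nested prefixes of one ranking, a single RFS solve per fold is enough to evaluate all 50 subset sizes.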

## 2.3. Classification and Cross Validation

In each fold of the 10-fold cross validation, 90% of the samples were used as the training set and the remaining 10% as the testing set. The classifier used in this study was a linear support vector machine (SVM), whose goal is to find a decision function:

$$y = \mathbf{h}^{T}\mathbf{x} + b,\tag{10}$$

by solving the following optimization problem:

$$\begin{aligned} \min\_{\mathbf{h}, \xi} \ &\frac{1}{2}\left\|\mathbf{h}\right\|^{2} + C \sum\_{i=1}^{N} \xi\_i \\ \text{s.t. } &y\_i \left(\mathbf{h}^{T} \mathbf{x}\_i + b\right) \ge 1 - \xi\_i, \end{aligned} \tag{11}$$

where **h** denotes the normal of the hyperplane, **x**<sub>i</sub> denotes the i-th training vector and y<sub>i</sub> is its corresponding label, ξ<sub>i</sub> is the misclassification error for non-separable cases, and C is the parameter that balances empirical risk against model complexity, which was set to 1 in this study. Females were labeled as -1 and males as 1, so the classification threshold was 0. The classification accuracy and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve were used as the classification performance indices, and 1,000 permutation tests and 1,000 bootstrap tests were conducted to assess the overall statistical significance of the classification results. In the permutation test of each fold, gender labels were randomly permuted while the features were kept unchanged, and the 1,000 resulting AUC values were used to construct a null distribution against which the AUC obtained with the true gender labels was compared. In each bootstrap test, 90% of the training set was randomly chosen as a new training set, and inspired by the back projection stage of Wang et al. (2012),

TABLE 1 | AUC and accuracy for gender classification.


the weight of voxels was defined as the absolute value of **h**, and the detailed equation was as follows:

$$\mathbf{g} = \text{abs}\left(\mathbf{h}\right) = \text{abs}\left(\sum\_{i=1}^{N} \alpha\_{i} y\_{i} \mathbf{x}\_{i}\right),\tag{12}$$

where **g** denotes the weight vector of voxels, α<sub>i</sub> is the i-th value of the alpha coefficient vector α in the SVM, and N is the number of subjects in the training set. The mean of **g** over the 1,000 bootstrap tests was taken as the final weight vector **g**.
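Equation 12 and the bootstrap averaging can be sketched as follows. The dual coefficients, labels and feature vectors are toy values, not SVM output, and the function names `voxel_weights` and `mean_weights` are hypothetical; in practice α would come from the trained linear SVM of each bootstrap run.

```python
def voxel_weights(alpha, y, X):
    """g = abs(sum_i alpha_i * y_i * x_i) for one trained linear SVM.

    alpha: dual coefficients; y: labels in {-1, +1}; X: one feature list per subject.
    """
    n_features = len(X[0])
    h = [sum(a * yi * xi[j] for a, yi, xi in zip(alpha, y, X))
         for j in range(n_features)]
    return [abs(v) for v in h]

def mean_weights(runs):
    """Average the weight vectors obtained from several bootstrap runs."""
    return [sum(col) / len(runs) for col in zip(*runs)]

X_toy = [[1, 0, 1], [0, 1, 1]]   # two subjects, three voxels
g1 = voxel_weights(alpha=[0.5, 0.5], y=[1, -1], X=X_toy)
g2 = voxel_weights(alpha=[1.0, 0.0], y=[1, -1], X=X_toy)
g = mean_weights([g1, g2])
```

Taking the absolute value before averaging means that a voxel counts as important whichever class it pushes the decision toward.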

## 3. RESULTS

## 3.1. Gender Classification Results: AUC and Accuracy

Results of gender classification using HSRC at the three resolutions are provided in the top two rows of **Table 1**. The highest AUC and accuracy, both obtained from the 0.7 mm data, are 0.9925 and 96.77%, respectively. The relationship between classification accuracy and the number of selected features in each fold is shown in **Figure 2B**. For all three resolutions, the classification accuracy rises rapidly to about 0.9 with only a few voxels, and for the same number of voxels, the higher-resolution data always yield higher classification accuracies. HSRC also demands much less computation time and storage than direct sparsity (platform: Linux server with 2 Intel(R) Xeon(R) CPU @ 2.10 GHz, 28 cores, 260 GiB memory, CentOS 6.7, MATLAB R2015b; 1 fold RFS: 151.3 (0.7 mm) + 158.8 (1.4 mm) + 64.3 (2.8 mm) = 374.4 s for HSRC; 5682.6 s (0.7 mm) for direct sparsity, i.e., roughly a 15-fold reduction). In contrast, when direct sparsity is applied to data of different resolutions, the overall classification performance does not improve at higher resolution, which suggests that the hierarchical procedure of HSRC contributes to the improvement. The outcomes of direct sparsity at the different resolutions are given in the middle two rows of **Table 1** and in **Figure 2A**.

Gender classification using PCA was also conducted for comparison, and the results are provided in the bottom two rows of **Table 1**. The classification performance with PCA is comparable to that with direct sparsity, but poorer than that with HSRC.

We conducted 1,000 permutation tests to test the statistical significance of the overall gender classification performance, and detailed results for all three resolutions are shown in **Figure 3**. As expected, the null distributions of the AUCs were scattered around 0.5, which implied that the

FIGURE 2 | Classification results of the sparse representation, with classification accuracy as a function of the number of voxels selected in each fold. In HSRC, the higher-resolution data always have the higher classification accuracy, while in direct sparsity the classification accuracies of the three resolutions are roughly the same. The highest accuracy, 96.77%, is obtained from the 0.7 mm data using HSRC.

TABLE 2 | The main locations of the voxels that were selected by HSRC in 0.7 mm.


performance of the classifier on the permuted data sets, whose subjects were randomly labeled, was no better than chance, as in a fair coin toss. All of the AUC values for permuted labels were lower than the AUCs for real labels, which demonstrated the high statistical significance of gender classification (p < 0.001) for all three resolutions.
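One common way to turn such a null distribution into a p-value is sketched below. The formula with the +1 correction is an assumption, since the text does not state how p was computed, and the null AUC values are synthetic numbers spread around 0.5 for illustration, not study data.

```python
def permutation_p_value(observed, null_values):
    """One-sided p-value: share of permuted statistics at least as large as the
    observed one, with the usual +1 correction so that p is never exactly 0."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

# 1,000 synthetic null AUCs between 0.40 and 0.60 (illustrative only).
null_aucs = [0.40 + 0.0002 * k for k in range(1000)]
p = permutation_p_value(0.9925, null_aucs)
```

With 1,000 permutations and no permuted AUC reaching the observed value, this estimate is 1/1001, which is consistent with the reported p < 0.001.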

## 3.2. Important 3-D Morphological Features in Gender Discrimination

As the best classification performance was obtained from the 0.7 mm data, and the other resolutions were downsampled from them, we conducted the 1,000 bootstrap tests on the 0.7 mm data. The outcome is shown in **Figure 4**, and detailed information on the main clusters is given in **Table 2**.

The main gender differences in morphology are located in the frontal lobe and the limbic lobe, with others scattered in the parietal lobe, the temporal lobe, the corpus callosum and the precuneus. Considering the high relevance of cortical 3-D morphology to GM, we compared our study with previous studies of gender difference in GM concentration. Our results accord well with studies of gender difference using T1w, T2w, and FA (Feis et al., 2013), using GM concentration and fMRI (Wang et al., 2012), and using cortical thickness (Im et al., 2006; Sowell et al., 2007; Lv et al., 2010), all of which report the main gender differences in the frontal lobe, the limbic lobe, the parietal lobe and the temporal lobe. Moreover, there are reports of gender difference in the precuneus (Kaiser et al., 2008; Taki et al., 2011; Semrud-Clikeman et al., 2012) and the corpus callosum (Witelson, 1989; Allen et al., 1991; Bishop and Wahlsten, 1997).

## 3.3. Discriminative Ability of Brain Subregions

The accuracy of each AAL brain subregion for gender classification is shown in **Figure 5**, and the top and bottom 5 discriminative subregions and their classification accuracies are listed in **Table 3**. The most discriminative regions for gender lie in the front of the brain, and the least discriminative regions are in the temporal gyrus. **Figure 5** shows that the accuracy distribution of the two hemispheres is roughly bilaterally symmetrical, which means that corresponding brain areas of the two hemispheres have approximately equal discriminative abilities for gender difference.

An interesting phenomenon worth noting is that the discriminative ability of brain subregions for gender increases from posterior to anterior in the brain. This phenomenon accords well with the evolutionary pattern of the human brain: the brain areas located in the anterior of the brain evolved first, while the posterior brain areas evolved later (Buckner and Krienen, 2013). A possible explanation is that the brain areas that evolved earlier and more fully in human evolutionary history have more abundant and complex functions, so they should develop first in the individual brain to ensure basic function; with evolution, the functional difference between genders grew, and thus the structural difference grew, too. By contrast, the brain areas that evolved less fully have fewer functions, and those functions are common among human beings.

The accuracies and AUCs of the 7 RSNs for gender classification are given in **Table 4**. Considering the dimensionality of the data, the classification of the 7 RSNs was conducted on the 1.4 mm data. The most discriminative brain areas for gender difference are mainly distributed in the default mode network, which is also indicated in Zhang et al. (2018), while a majority of the least discriminative regions belong to the visual network and the dorsal attention network. This outcome offers new evidence of the accordance between brain structure and function.

Surface-based gender differences are shown in **Figure 6**, which indicates that, of the 4 surface-based morphological features, gender difference is most obvious in the myelinmap. The average gender classification accuracies over 10 repetitions of 10-fold cross-validation for thickness, curvature, sulc and myelinmap are 0.8740, 0.8022, 0.8431, and 0.8820, respectively. The most discriminative areas are as follows: isthmuscingulate, left superiortemporal, and right insula for cortical thickness; posteriorcingulate and insula for sulc; inferiorparietal, isthmuscingulate and left posteriorcingulate for curvature; precuneus, rostralmiddlefrontal and superiorfrontal for myelinmap. Interestingly, the myelinmap showed the greatest gender difference, and its discriminative areas accord well with the areas we found in cortical 3-D morphology, especially in the frontal lobe and the precuneus; the discriminative areas in the other 3 surface-based features are mainly in the insula, which is also found in cortical 3-D morphology.

## 4. DISCUSSION

In this study, we investigated gender difference in cortical 3-D morphology by proposing an HSRC approach, and obtained an accuracy of 96.77% in 10-fold cross-validation. The robustness of the classification was verified by permutation tests, and the frontal lobe was found to be the most discriminative region for gender difference in the cortical 3-D morphology selected by HSRC. The superior frontal gyrus in AAL and the default mode network in RSN achieved the highest accuracy in template-based classification. Moreover, the advantages of our proposed HSRC method were described. These points are discussed below.

TABLE 3 | The top and bottom 5 discriminative regions of the AAL template and their accuracy for gender classification. The highest gender classification accuracies are found in the Frontal Lobe, while the lowest are found in the Temporal Lobe.




*(1) Visual network; (2) Somatomotor network; (3) Dorsal attention network; (4) Ventral attention network; (5) Limbic network; (6) Frontoparietal control network; (7) Default mode network.*

There are reports of gender difference in cortical morphology (Im et al., 2006; Sowell et al., 2007; Lv et al., 2010; Sepehrband et al., 2018), and of brain morphology changes in aging (Resnick et al., 2000; Bigler et al., 2002; Rusinek et al., 2003; Fjell et al., 2009) and in multiple inherent brain disorders (Lieberman et al., 2001; Ashburner et al., 2003; Thompson et al., 2004; Jouvent et al., 2008; Aylward et al., 2010). Our proposed method may therefore have potential for the auxiliary diagnosis of those disorders in combination with other modalities. Theoretically, brain morphology is less sensitive to scan variables than GM concentration, which may help the fusion of sMRI data from different datasets, and thus our finding may also offer a new approach to dealing with multisite MRI data (Ma et al., 2018; Yuan et al., 2018b; Zeng et al., 2018).

As far as we know, this work is the first to classify gender with the original cortical 3-D morphology and to obtain an accuracy of over 95% in gender classification using morphological features. This encouraged us to conclude that genders can be distinguished at the individual level by cortical 3-D morphological features. It supports, from the perspective of brain morphology, the view that males and females can be effectively classified (Chekroud et al., 2016; Rosenblatt, 2016; Anderson et al., 2018), and challenges the suggestion that brains are essentially indistinguishable by gender (Joel et al., 2015).

The results of the bootstrap tests showed that the discriminative regions for gender difference found with cortical 3-D morphology accord well with those found with GM concentration and surface-based morphology in previous studies, especially in the frontal lobe, the limbic lobe and the parietal lobe. We hypothesize that the gender differences in GM concentration may, to some extent, be the result of morphological differences.

Atlas-based morphology analysis indicated different discriminative abilities among brain areas. That is to say, some brain areas contributed much to the gender difference, some exerted a smaller influence, and some made no contribution at all, which may correspond to the so-called mosaic areas (Rippon et al., 2014; Joel et al., 2015). According to the classification results for brain areas, the areas with complex functions and with functions related to gender achieve high accuracy in gender classification. The bootstrap results also show that the highly discriminative voxels are located in the highly discriminative brain areas, which is understandable and consistent with the classification results. Moreover, we found good symmetry in the AAL-based morphology analysis, which is rarely mentioned in previous studies of gender difference. RSN-based morphology analysis suggested that the default mode network is the most discriminative network, and the same result was also reported in a study of gender difference using fMRI (Zhang et al., 2018).

Considering that sample size has been emphasized in recent studies (Ritchie et al., 2018), we specifically compared our findings with those of studies using more than 1,000 samples (Chekroud et al., 2016; Gur and Gur, 2016; Anderson et al., 2018; Ritchie et al., 2018), and found considerable accordance. First, the reported classification accuracies were above 90%, supporting the view of sexual dimorphism across different MRI modalities. Second, the most discriminative areas/networks for gender difference were found to be the frontal lobe (Gur and Gur, 2016; Anderson et al., 2018; Ritchie et al., 2018) and the default mode network (Gur and Gur, 2016; Ritchie et al., 2018), further indicating the high relevance among cortical morphology, GM concentration and fMRI in large samples.

The proposed HSRC algorithm was shown to improve classification accuracy while reducing the computation and storage resources required for high-dimensional MRI data. It also selects features directly, which makes the discriminative voxels in MRI data easier to interpret and may help to accurately locate lesions in a diseased brain (Antel et al., 2003; Lladó et al., 2012).

We noticed several possible limitations of this work. First, some papers suggest that important gender differences also exist in subcortical structures such as the cerebellum, amygdala and hippocampus (Giedd et al., 2012; Ruigrok et al., 2014). As the cortical thickness of these subcortical structures is much less than that of the cerebral cortex, they cannot currently be segmented automatically by the pipelines offered by the HCP (Glasser et al., 2013). Since the morphology data provided by the HCP do not yet include these subcortical structures, the influence of subcortical morphology on gender difference was not studied. Second, the effect of aging on brain morphology was not discussed because of the narrow age range of the adults (22–37 years old) in our study. Third, because of the lack of T2w images, we have not yet conducted a multi-site experiment to test the robustness of brain morphology. Moreover, although we used dimension reduction, a linear SVM and cross-validation to reduce the risk of overfitting in the classification methodology as far as possible, an independent dataset is still required to validate the generalizability of our proposed model, which should be done as soon as possible in future work.

## ETHICS STATEMENT

This study was carried out in accordance with the recommendations of "name of guidelines, name of committee" with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the "name of committee."

## AUTHOR CONTRIBUTIONS

DH designed the study. ZL conducted the experiment. ZL, CH, and LW wrote the article.

## ACKNOWLEDGMENTS

We thank the HCP for data collection and sharing. This work was supported by the National Natural Science Foundation of China (61420106001).

## REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Luo, Hou, Wang and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Exploring Sex Differences in the Neural Correlates of Self-and Other-Referential Gender Stereotyping

Jonas Hornung<sup>1</sup>\*, Elke Smith<sup>2</sup>, Jessica Junger<sup>2,3</sup>, Katharina Pauly<sup>2,3</sup>, Ute Habel<sup>2,3</sup> and Birgit Derntl<sup>1,4</sup>

<sup>1</sup> Department of Psychiatry and Psychotherapy, University of Tübingen, Tübingen, Germany, <sup>2</sup> Department of Psychiatry, Psychotherapy and Psychosomatics, Medical School, RWTH Aachen University, Aachen, Germany, <sup>3</sup> JARA-BRAIN Institute Brain Structure-Function Relationships, Forschungszentrum Jülich GmbH and RWTH Aachen University, Aachen, Germany, <sup>4</sup> LEAD Graduate School and Research Network, University of Tübingen, Tübingen, Germany

#### Edited by:

Belinda Pletzer, University of Salzburg, Austria

#### Reviewed by:

Jennifer Strafford Stevens, Emory University School of Medicine, United States

Alexander Nikolaevich Savostyanov, State Scientific-Research Institute of Physiology and Basic Medicine, Russia

#### \*Correspondence:

Jonas Hornung jonas.hornung@med.uni-tuebingen.de

Received: 05 July 2018 Accepted: 04 February 2019 Published: 18 February 2019

#### Citation:

Hornung J, Smith E, Junger J, Pauly K, Habel U and Derntl B (2019) Exploring Sex Differences in the Neural Correlates of Self-and Other-Referential Gender Stereotyping. Front. Behav. Neurosci. 13:31. doi: 10.3389/fnbeh.2019.00031

While general self-referential processes and their neural underpinnings have been extensively investigated with neuroimaging tools, limited data are available on sex differences regarding self- and other-referential processing. To fill this gap, we measured 17 healthy women and men who performed a self- vs. other-appraisal task during functional magnetic resonance imaging (fMRI) using gender-stereotypical adjectives. During the self-appraisal task, typical male (e.g., "dominant," "competitive") and female adjectives (e.g., "communicative," "sensitive") were presented and participants were asked whether these adjectives applied to themselves. During the other-appraisal task, a prototypical male (Brad Pitt) and female actor (Julia Roberts) were presented and participants were again asked to judge whether typical male and female adjectives applied to these actors. Regarding self-referential processes, women ascribed significantly more female than male traits to themselves. At the same time, both women and men indicated a stronger desire to exhibit male over female traits. While fMRI did not detect general sex differences in the self- and other-conditions, some subtle differences between the sexes were revealed: in both the right putamen and the bilateral amygdala, stronger gender-congruent activation was found, which was, however, not associated with behavioral measures such as the number of self-ascribed female or male attributes. Furthermore, sex hormone levels showed some associations with brain activation, pointing to a different pattern in women and men. Finally, the self- vs. other-condition in general led to stronger activation of the anterior cingulate cortex, while the other- vs. self-condition activated the right precuneus more strongly, which is in line with previous findings. To conclude, our data lend support to subtle sex differences during the processing of stereotypical gender attributes. However, it remains unclear whether such differences have behavioral relevance. We also point to several limitations of this study, including the small sample size and the lack of control for potentially different hormonal states in women.

Keywords: sex differences, self-other appraisal, gender, gender stereotyping, fMRI

## INTRODUCTION


## Gender Stereotypes

Human beings have self-concepts, i.e., ideas about who they are and expectations about how they should behave in a given situation. These self-concepts also encompass gender stereotypes, which are common societal expectations of the qualities a woman or man should possess (Cattaneo et al., 2011). Such stereotypes take many forms. Some relate to general cognitive skills. For example, the belief that men possess superior mathematical skills seems to exist from early school education onward (Cvencek et al., 2011). The mere existence of such a belief creates a situation of stereotype threat. This term refers to the threat of confirming mostly negative stereotypes that exist toward a certain group of individuals and that may impair the functioning of these individuals in a way that confirms the stereotype (Steele and Aronson, 1995).

For example, women show worse performance when told that men are superior in a mathematical task (Cadinu et al., 2005; Dar-Nimrod and Heine, 2006; Good et al., 2008) or a mental rotation task (Moè and Pazzaglia, 2006; Wraga et al., 2007; Sanchis-Segura et al., 2018). Another kind of stereotype refers to more general psychological qualities. In this regard, women are, for example, more easily associated with low authority, whereas the opposite is true for men (Rudman and Kilianski, 2000; Schmid Mast, 2004). Furthermore, women who do not meet such expectations and appear more agentic, i.e., more independent and competitive, are discriminated against (Rudman and Glick, 2001) and judged as less feminine (Rudman and Glick, 1999). Thus, the existence of such gender stereotypes is likely to have an impact on the developmental trajectories of women and men, biasing their behavior and attitudes and leading to different educational (Nosek et al., 2009) and occupational outcomes (Moss-Racusin et al., 2012). Within the larger framework of gender stereotyping, the present study aimed at exploring how gender stereotypes are represented at the level of brain activation, both in reflection about oneself and in reflection about a prototypical woman and man.

## Neural Underpinnings of Self-Referential Processing

Self-appraisal processes and their neurobiological underpinnings have been studied for over a decade. Typically, neuroimaging studies ask adults to respond whether trait words or phrases describe themselves and whether these stimuli can also be attributed to others (for a review see Lieberman, 2007). These self-related processes are especially associated with stronger activation in the medial prefrontal cortex (MPFC) when judging about oneself compared to either close (D'Argembeau et al., 2007; Jenkins et al., 2008; Modinos et al., 2009; Feyers et al., 2010) or famous others (D'Argembeau et al., 2005; Jenkins and Mitchell, 2011). More specifically, studies reported activation of the ventral (van der Meer et al., 2010), dorsal (Fossati et al., 2003), and orbital (Pauly et al., 2013) part of the MPFC depending on the task applied (Northoff et al., 2006). Also, the parahippocampal gyrus and precuneus (Feyers et al., 2010), anterior (Modinos et al., 2009), and posterior cingulate cortex (Johnson et al., 2002) as well as the basal ganglia (Benoit et al., 2010) have been shown to be involved in self-referential processes. Furthermore, Veroude et al. (2013) used an appraisal paradigm, in which participants were instructed to indicate whether a phrase described themselves (self), a friend from college (other), or whether the phrases were positive or negative (control). The authors reported stronger activation in men compared to women in the medial posterior parietal cortex (MPPC) and the bilateral temporo-parietal junction (TPJ) across all appraisal conditions suggesting that sex differences during appraisal of self and others exist. However, up to now it is rather unexplored whether men and women recruit similar or different brain regions during processing of stereotypical female and male attributes.

## Neural Correlates of Other-Referential Processes and Gender Stereotyping

People commonly reflect not only about themselves but also about the characteristics other people possess. In particular, stereotypic gender judgments of others have been shown to recruit the ventromedial prefrontal cortex (VMPFC), the middle temporal gyrus (MTG), the precuneus and the supramarginal gyrus (Quadflieg et al., 2009). Other studies found an increase in activation with stronger gender stereotyping in the amygdala (Knutson et al., 2007) and a part of the right frontal cortex (Mitchell et al., 2009). However, these studies did not address whether women and men recruit the same or different brain regions when judging other persons or ascribing gender-stereotypical adjectives to them.

## Aims and Expectations

In the present study we aimed at investigating the neural correlates of attributing gender stereotypes to oneself or to a prototypical female and male actor. The main question of interest was whether participant sex had an impact on stereotype processing while evaluating oneself or a famous other. Specifically, we aimed to explore whether women and men recruit similar brain areas during self- and other-reflection. To our knowledge this question has received almost no attention (but see Veroude et al., 2013) and we therefore refrain from postulating a directional hypothesis regarding such differences. Furthermore, we consider exploratory the impact of female and male sex hormone levels on brain activation during self- and other-reflection. This is because to our knowledge no conclusive model exists regarding the action of sex hormones on human cognition in general, let alone on gender stereotyping (Sundström-Poromaa and Gingnell, 2014; Toffoletto et al., 2014).

## MATERIALS AND METHODS

## Participants

Originally, 20 right-handed healthy Caucasian women and 21 right-handed healthy Caucasian men participated in the study. Participants were recruited via advertisements posted at the RWTH Aachen University, Germany. This study was carried out in accordance with the recommendations of the local Institutional Review Board (EK 088/09) of the Medical School RWTH Aachen University, which approved the protocol. All subjects gave written informed consent in accordance with the Declaration of Helsinki and were paid for their participation.

The presence of mental disorders was excluded on the basis of the German version of the Structured Clinical Interview for DSM-IV (SCID, Wittchen et al., 1997), which was conducted by experienced psychologists. The usual exclusion criteria for MRI (e.g., metal implants, claustrophobia, and epilepsy) were applied. Independent t-tests revealed that women and men were of comparable age (p = 0.96) and had similar years of education (p = 0.75). Handedness was assessed by means of the Edinburgh Handedness Inventory (Oldfield, 1971), showing that all participants were right-handed apart from one left-handed woman and one man. Crystallized verbal intelligence, as assessed with the Mehrfachwahl-Wortschatz-Test Version B (MWT-B, Lehrl, 1996), did not differ between women and men (p = 0.67). Moreover, all participants completed the Bem Sex-Role Inventory (Bem, 1981), a standard questionnaire for measuring femininity-masculinity and gender roles.

On the day of testing, a blood sample was taken to assess the sex hormones estradiol, progesterone, and testosterone. Two men did not provide blood samples. Assays were analyzed by the Central Laboratory of the Medical School, RWTH Aachen University, using an electrochemiluminescence-immunoassay (ECLIA, Johnson et al., 1993). The intra-assay accuracy was over 90% (i.e., coefficient of variation was 4–8%) and the sensitivity of each assay was 10 pg/ml (estradiol), 0.2 ng/l (progesterone), and 0.2 ng/ml (testosterone).

### Exclusion of Participants

Two women and three men were excluded due to faulty logfiles, resulting in 18 available behavioral datasets each for women and men. Additionally, for fMRI analysis one further woman and one further man had to be excluded due to strong movements inside the MR scanner (>2 mm in any direction), leading to 17 available datasets per sex for fMRI analysis.

Regarding analysis of hormone levels, extreme values were identified as being larger than 2.5 SDs from the mean of each hormone, separately for each sex. This led to exclusion of two progesterone values for women. In addition to two missing blood samples for men, this resulted in 19 and 20 (men and women) available testosterone values, 19 and 20 (men and women) estradiol values and 19 and 18 (men and women) progesterone values.
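The exclusion rule for extreme hormone values can be illustrated with a minimal sketch. It assumes the sample standard deviation (the text does not state which SD was used) and is applied to one hormone within one sex at a time; the function name and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def drop_extreme_values(values, n_sd=2.5):
    """Keep values within n_sd standard deviations of the group mean."""
    group_mean = mean(values)
    group_sd = stdev(values)  # sample SD (assumption; the text does not specify)
    return [v for v in values if abs(v - group_mean) <= n_sd * group_sd]


# Hypothetical hormone values for one sex (not data from the study).
hypothetical_levels = [0.5, 0.8, 1.1, 0.9, 0.7, 1.0, 0.6, 0.9, 0.8, 1.2, 0.7, 14.0]
kept = drop_extreme_values(hypothetical_levels)
print(len(hypothetical_levels) - len(kept), "value(s) excluded")
```

Because the mean and SD are computed with the extreme value included, a single outlier in a small sample inflates the SD considerably, so this rule is fairly conservative.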

Demographic, neuropsychological, and hormonal characteristics of the total sample are shown in **Table 1**.

TABLE 1 | Information on sociodemographic parameters, neuropsychological performance and hormone concentrations in women and men.

Independent t-tests compared women and men. P-values of these tests are indicated. <sup>∗</sup>Values are given for 15 men as two men did not provide blood samples.

## Stimuli and Procedure

During the task we presented 240 personality traits, half of which had been evaluated as typically male and the other half as typically female attributes. The gender typicality of the stimuli was verified in a pre-study in 30 healthy participants (15 women), during which participants rated a total of 240 German adjectives according to whether they were more a prototypical male or female adjective on a continuous scale from −2 (= very masculine) to +2 (= very feminine). Adjectives with an average rating below 0 were labeled as typically masculine whereas adjectives with an average rating above 0 were labeled as typically female. Thus, participants did not necessarily have to agree on whether an adjective was more stereotypically female or male, which is reflected by a non-zero standard deviation of ratings. On average, male attributes received a rating of −0.64 (SD = 0.33) whereas female attributes were rated on average with 0.72 (SD = 0.33). As intended, scores of female and male attributes differed significantly (p < 0.001) and to a high degree as indicated by Cohen's d (d = 4.15).

All 240 adjectives were used for the main experiment. Sixty of these adjectives (balanced for femininity/masculinity) were presented during the self-condition, in which participants were asked to judge via button press (left = yes; right = no) whether the traits applied to themselves. Another 120 gender adjectives (60 in each condition) had to be assigned to either a typical male (Brad Pitt) or typical female (Julia Roberts) celebrity. Both had been selected to represent a well-known stereotypical prototype of a man and a woman. Indeed, all participants reported knowing who both actors were. In the other-appraisal task participants had to indicate whether the female or male attribute fitted the famous person or not. In a final lexical control condition a further 60 female and male adjectives were presented and participants were asked whether the displayed words contained the letter "r" or not. Consequently, the task consisted of 8 experimental conditions in a 2 × 4 event-related design with the factors Attribute (male, female) and Appraisal Condition (self, typical other man, typical other woman, and lexical). This resulted in the following eight conditions: (1) self male attributes, (2) self female attributes, (3) prototypical man male attributes, (4) prototypical man female attributes, (5) prototypical woman male attributes, (6) prototypical woman female attributes, (7) lexical male attributes, (8) lexical female attributes. Each condition was presented ten times in mini-blocks of three attributes with the same female and male attributes presented in each condition across participants. As intended, conditions did not differ regarding the mean rating for femininity/masculinity of attributes (p = 0.76).

A total of 240 stimuli were presented in a pseudo-randomized order. Stimulus presentation was accomplished with Presentation software (Version 14.2, http://www.neurobs.com), whereby each condition was announced by a brief instruction (5 s). Attributes were presented for 2.1 s followed by a fixation cross jittered between 1.1 and 3.1 s. Each last word of a mini-block was followed by a fixation cross jittered between 5.6 and 10.6 s. This resulted in a total task length of about 27 min with no breaks in between (see **Figure 1**). The order of conditions was permuted so that each condition preceded and followed every other condition approximately equally often, to avoid any systematic effects of the order of presentation.
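The pre-study effect size and the trial count described in this section can be checked with a short sketch. It assumes two equally sized attribute sets (120 adjectives each) and works only from the rounded summary statistics given in the text; the function name is illustrative.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d from summary statistics, pooling two equally sized groups."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Mean ratings and SDs as reported for the pre-study (rounded to two decimals).
d = cohens_d(0.72, 0.33, -0.64, 0.33)

# 8 conditions x 10 mini-blocks x 3 attributes per mini-block.
n_trials = 8 * 10 * 3

print(f"d = {d:.2f}, trials = {n_trials}")
```

From the rounded values this yields a d of roughly 4.1, close to the reported 4.15; the small gap plausibly reflects rounding of the published means and SDs. The trial count matches the 240 stimuli.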

## Analysis of the Behavioral Data

Statistical testing was performed with the Statistical Package for the Social Sciences (SPSS 24, IBM Corp., Armonk, NY, United States). For all analyses, the significance level was set to p = 0.05.

Independent t-tests were used to compare sociodemographic, hormonal, neuropsychological and questionnaire data between women and men. For BSRI, masculinity and femininity scores were calculated. Additionally, as the BSRI also assesses desired femininity and masculinity, we also compared these scores between women and men via independent t-tests.

### Number of Accepted Female and Male Attributes

Analyses were separated for the self- and other condition. Only adjectives that were endorsed ("yes" answers) were included because, given the dichotomous response format, "no" answers would provide no additional information. The number of yes-answers was then subjected to a 2 × 2 ANOVA with the factors Participant Sex (women, men) and Attribute (female, male) for the self-condition and a 2 × 2 × 2 ANOVA with the additional factor Actor Sex (prototypical male, prototypical female) for the other condition. Greenhouse-Geisser corrected p-values are reported in cases of sphericity violation and partial eta squared (η<sup>2</sup>) is listed as an indication of effect size.

### **Behavioral correlation analyses**

Behavioral correlation analyses were performed separately for women and men between sex hormones (estradiol, progesterone, testosterone) and behavioral performance (number of self-attributed male and female adjectives).

## fMRI Data Acquisition and Pre-processing

Functional imaging data were obtained on a 3 Tesla Siemens MR Scanner (Siemens Medical Systems, Erlangen, Germany) at the Department of Psychiatry, Psychotherapy and Psychosomatics of the RWTH Aachen University. Echo-planar imaging (EPI) was applied (T2<sup>∗</sup>, voxel size: 3.1 mm × 3.1 mm × 3.1 mm, distance factor 15%, gap 0.5 mm, 64 × 64 matrix, FoV: 200 mm × 200 mm, TR = 2 s, TE = 30 ms, α = 76°). Thereby 36 slices in ascending order covering the whole brain were acquired. Image acquisition was preceded by 5 dummy scans, which were discarded before preprocessing. The resulting 815 volumes per subject were analyzed using SPM12 (Statistical Parametric Mapping; Wellcome Trust Centre for Neuroimaging, London, United Kingdom<sup>1</sup>). For preprocessing, functional images were first slice-time corrected, realigned to the first functional image, coregistered with the acquired anatomical image, spatially normalized to the standard template of the Montreal Neurological Institute (MNI, Canada) and finally smoothed with an 8 mm FWHM isotropic Gaussian kernel. To remove effects of low frequency noise, a 128 s high pass filter was used.
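A few quantities follow directly from the acquisition parameters above and serve as a quick plausibility check. The sketch below is plain arithmetic on the stated values; the variable names are illustrative.

```python
TR_S = 2.0           # repetition time in seconds
N_VOLUMES = 815      # volumes per subject after discarding dummy scans
FOV_MM = 200.0       # field of view along one in-plane axis
MATRIX = 64          # acquisition matrix size along that axis
HIGH_PASS_S = 128.0  # high-pass filter period in seconds

run_minutes = N_VOLUMES * TR_S / 60  # duration of the functional run
in_plane_mm = FOV_MM / MATRIX        # in-plane voxel edge length
cutoff_hz = 1 / HIGH_PASS_S          # high-pass cutoff frequency

print(f"{run_minutes:.1f} min, {in_plane_mm} mm, {cutoff_hz:.4f} Hz")
```

The run length of roughly 27 min agrees with the task duration given for the paradigm, and the in-plane resolution agrees with the stated 3.1 mm voxel size.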

<sup>1</sup>http://www.fil.ion.ucl.ac.uk/spm

## Analysis of the fMRI Data


## Whole Brain Analyses

On the first level, regressors were modeled for each of the eight experimental conditions and for each subject and subsequently entered into a second level analysis. Here, a flexible factorial design was calculated for the group analyses within a general linear model (GLM). Movement parameters were included as nuisance covariates. Based on this model, (1) main effects of Participant Sex (women, men), Appraisal Condition (self, prototypical male, prototypical female other, lexical), Attribute (male, female) and Acceptance (yes, no) were analyzed; (2) interactions of these factors were modeled to investigate, in particular, whether sex differences existed during the attribution of male and female attributes (interaction Attribute × Participant Sex) separately in the self- and other conditions; and (3) sex differences in the comparison between self- and other-conditions were investigated by modeling the interaction Condition × Participant Sex, comparing each other-condition separately to the self-condition. To adjust for the inflation of α-errors, whole brain analyses were thresholded at p < 0.001 (cluster-forming threshold) and family-wise-error corrected (FWE) for multiple comparisons at the cluster level to a threshold of p = 0.05. Thus, only clusters with a minimal extent of 70 voxels were detected as significant. The resulting voxel coordinates of significant activation peaks (in MNI-space) were located anatomically with the help of an anatomy toolbox (Eickhoff et al., 2005) implemented in SPM12.

### Regression Analyses

To detect clusters on the whole brain level that significantly covaried with (1) sex hormone levels (testosterone, estradiol, progesterone) and (2) the ratio of the number of self-ascribed female to male attributes, separate whole brain regression analyses were conducted for women and men. Specifically, the contrast images during the self-condition for female and male attributes from the first level analysis of each participant were covaried with sex hormone levels and the ratio of the number of self-ascribed female to male attributes. To compute this ratio, we divided the number of self-ascribed female attributes by the number of self-ascribed male attributes. Scores larger than 1 thus indicate stronger agreement toward female adjectives, scores smaller than 1 indicate stronger agreement toward male adjectives, and a score of 1 indicates equal agreement between female and male attributes. Again, a cluster-forming threshold of p < 0.001 and a FWE-correction at cluster level to a threshold of p = 0.05 were applied. Thus, only clusters with a minimal extent of 45 voxels were detected as significant.
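The behavioral covariate described above is a simple ratio and can be sketched as follows. The function names and the example counts are hypothetical, and the handling of a zero denominator is an assumption, since the text does not say how such a case would have been treated.

```python
def female_to_male_ratio(n_female_yes, n_male_yes):
    """Ratio of self-ascribed female to self-ascribed male attributes."""
    if n_male_yes == 0:
        # Assumption: the text does not say how a zero denominator was handled.
        raise ValueError("ratio is undefined when no male attributes were accepted")
    return n_female_yes / n_male_yes


def interpret_ratio(ratio):
    """Map the ratio onto the three cases distinguished in the text."""
    if ratio > 1:
        return "stronger agreement with female attributes"
    if ratio < 1:
        return "stronger agreement with male attributes"
    return "equal agreement"


# Hypothetical yes-counts for one participant in the self-condition.
ratio = female_to_male_ratio(20, 16)
print(ratio, interpret_ratio(ratio))
```

A ratio is asymmetric around 1 (values for a male preference are compressed between 0 and 1), which is worth keeping in mind when it enters a linear regression.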

### ROI Analyses

Based on previous studies investigating stereotypical or self- vs. other-processing (Quadflieg et al., 2009; Veroude et al., 2013), we performed several region of interest analyses. These regions included the MPFC, precuneus and bilateral amygdala (Quadflieg et al., 2009) as well as the bilateral TPJ and the MPPC (Veroude et al., 2013). ROIs were defined as 10 mm spheres around center coordinates (in MNI space) taken from these two publications. Only the bilateral amygdala was defined anatomically with the help of an anatomy toolbox (Eickhoff et al., 2005) to allow for better spatial definition of these ROIs. Mean parameter estimates were extracted and subjected to a mixed model 2 × 3 × 2 ANOVA with the factors Participant Sex (men, women), Appraisal Condition (self, prototypical male, female), and Attribute (male, female). Within each ROI, post hoc comparisons were Bonferroni-corrected for multiple comparisons. See **Table 2** for all ROIs and their spatial extent.
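To make the spherical ROI definition concrete, the sketch below enumerates voxel centres within a sphere around a centre coordinate. It is an illustration under stated assumptions, not the toolbox procedure used in the study: the 10 mm is treated as a radius (the text does not say whether it is a radius or a diameter), the grid spacing of 2 mm is hypothetical, and the centre coordinate is invented for the example.

```python
from itertools import product


def sphere_roi(center_mm, radius_mm=10.0, voxel_mm=2.0):
    """List voxel centres (in mm) lying within radius_mm of center_mm."""
    steps = int(radius_mm // voxel_mm)
    offsets = [i * voxel_mm for i in range(-steps, steps + 1)]
    cx, cy, cz = center_mm
    return [
        (cx + dx, cy + dy, cz + dz)
        for dx, dy, dz in product(offsets, repeat=3)
        if dx ** 2 + dy ** 2 + dz ** 2 <= radius_mm ** 2
    ]


# Hypothetical centre coordinate; the study's centres are listed in its Table 2.
roi = sphere_roi((0.0, 52.0, 6.0))
print(len(roi), "voxels in the spherical ROI")
```

Mean parameter estimates for an ROI would then be the average of the beta values at these voxel positions.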

### **Neural correlation analyses**

Neural correlation analyses were separately performed for women and men between sex hormones (estradiol, progesterone, testosterone) and the beta estimates during self-processing of female and male attributes in all ROIs.

## RESULTS

## Bem Sex-Role Inventory (BSRI)

Comparing masculinity and femininity scores of women and men via t-tests revealed that men described themselves as more masculine (p < 0.001) while no significant sex difference emerged for femininity (p = 0.88). For the desired self, we did not observe significant sex differences (ts < 0.67, ps > 0.52). Within-group analyses revealed that both sexes expressed a desire to reveal more masculine compared to feminine traits (both ps < 0.001). See **Table 3** for statistics.

TABLE 2 | All brain regions selected for ROI analyses.


Center coordinates reflect MNI-space. MPFC, medial prefrontal cortex; MPPC, medial posterior parietal cortex; TPJ, temporo-parietal junction.

TABLE 3 | Mean scores for self-attributed and desired masculinity and femininity according to the BSRI with the standard deviation in brackets.


T- and p-values refer to sex differences. BSRI, Bem Sex-Role Inventory.

## Behavioral Performance

### Self-Condition

Neither the main effect of Participant Sex, F(1,35) = 3.94, p = 0.06, nor the main effect of Attribute, F(1,35) = 3.55, p = 0.07, reached significance. Only the interaction Participant Sex × Attribute was significant, F(1,35) = 23.46, p < 0.001, η<sup>2</sup> = 0.40, indicating that women and men agreed to significantly more gender-congruent items than gender-incongruent items. See also **Figure 2** and **Table 4**.
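As a consistency check, a partial eta squared can be recovered from an F value and its degrees of freedom. The sketch below uses the standard identity for this conversion and applies it to the interaction reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Participant Sex x Attribute interaction in the self-condition, as reported above.
eta_sq = partial_eta_squared(23.46, 1, 35)
print(round(eta_sq, 2))
```

The result rounds to the reported effect size of 0.40.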

### Other Condition

A main effect of Attribute was found, F(1,35) = 7.10, p = 0.01, η<sup>2</sup> = 0.17, showing that overall more male attributes were accepted. Furthermore, an interaction Participant Sex × Actor Sex, F(1,35) = 6.99, p = 0.01, η<sup>2</sup> = 0.11, and an interaction Actor Sex × Attribute, F(1,35) = 25.07, p < 0.001, η<sup>2</sup> = 0.42, were found. The Participant Sex × Actor Sex interaction indicates that women accepted overall more adjectives for the prototypical female compared to male actor (p = 0.02) while this was not the case for men (p = 0.34). The Actor Sex × Attribute interaction shows that gender attributes were assigned in an actor-specific manner with more male compared to female attributes (p = 0.009) being attributed to the prototypical male actor and more female compared to male attributes (p < 0.001) being attributed to the prototypical female actor. No main effect of Actor Sex, F(1,35) = 3.69, p = 0.06, η<sup>2</sup> = 0.10, was detected. See also **Figure 3** and **Table 4**.

TABLE 4 | Mean number of assigned attributes across the self- and other-conditions (yes-answers) with the standard deviation in brackets.

T- and p-values refer to sex differences.

## fMRI Results

### Whole Brain Analyses

#### **Main effects of self- and other-condition**

Across all participants, the self-condition (self) compared to the letter judgment condition (lexical) led to strong activation in the left superior frontal gyrus and several smaller clusters including the right temporal gyrus and bilateral cerebellum (see **Table 5**). The other-condition (other) compared to lexical also led to stronger activation in the left superior frontal gyrus and also involved regions like left inferior frontal gyrus and posterior cingulate cortex (see **Table 5**). Directly comparing self and other showed stronger activation during self including parts of the left anterior cingulate cortex and supramarginal gyrus. The inverse contrast (other > self) detected stronger activation in the right precuneus and bilateral superior temporal gyrus (see **Figure 4** and **Table 5**).

#### **Sex differences during the self-condition**

The general effect of Participant Sex pertaining to the contrast women vs. men did not yield any significant clusters. However, the interaction of Attribute × Participant Sex resulted in one significant cluster located in the right putamen (k = 137; MNI: x = 31 y = −6 z = −6). To analyse this interaction in more detail, we performed an additional region of interest analysis (see "A Posteriori Region of Interest Analysis").

#### **Sex differences during the other-condition**

Again, no general Sex effect was detected comparing women vs. men. Also the interaction Attribute × Participant Sex did not lead to significantly activated clusters. Therefore no further post hoc t-tests were performed.

#### **Sex differences comparing the self- to the other-condition**

Finally, self- and other-conditions were compared by modeling the interaction Condition × Sex for both female and male actor separately. In neither case were significant clusters detected pointing to no differential activation between women and men when comparing self- to other-processing of a typical woman or man.

## Whole Brain Regression

## **Sex hormones on whole brain activation**

Women. During presentation of female adjectives no correlation with either hormone was detected whereas during presentation of male adjectives significant correlations were found for both progesterone and testosterone but not estradiol. For progesterone, a cluster (k = 90) including left insula and superior temporal gyrus was positively associated with hormone values while for testosterone, a cluster (k = 70) in the left postcentral gyrus extending to the rolandic operculum was positively associated with hormone values (see **Table 6**).

Men. During presentation of female adjectives a cluster (k = 49) in right angular gyrus was positively associated with estradiol values. Furthermore, during presentation of male adjectives a negative association was found with progesterone values in the superior medial gyrus (k = 47). No further significant correlation emerged (see **Table 6**).

Self-ascribed female-to-male ratio on whole brain activation. In neither women nor men was the ratio of the number of self-ascribed female to male adjectives significantly related to whole brain activation.

## A Posteriori Region of Interest Analysis

As the whole brain interaction Attribute × Participant Sex during self-processing yielded one significant cluster in the right putamen, we extracted mean beta estimates from this cluster for a more detailed analysis. This analysis revealed that not only was the interaction Attribute × Participant Sex significant, F(1,32) = 8.66, p = 0.006, but that this interaction additionally depended on the experimental condition, as indicated by a significant three-way interaction Attribute × Participant Sex × Condition, F(2,64) = 6.01, p = 0.004, η<sup>2</sup> = 0.16. To disentangle this three-way interaction, we first computed separate interactions of Attribute × Sex for each condition, showing that this interaction was significant only for the self-condition, F(1,32) = 28.89, p < 0.001, η<sup>2</sup> = 0.47, but not for the two other conditions (Fs < 0.17, ps > 0.69). This indicates that during self-processing women had higher activation in the right putamen for female compared to male attributes (p = 0.001) whereas men showed stronger activation for male compared to female attributes (p = 0.002). See also **Figure 5**.

Comparisons of self, other and lexical condition. Coordinates reflect MNI space. ACC, anterior cingulate cortex; IFG, inferior frontal gyrus; k, cluster extent; L, left; PCC, posterior cingulate cortex; R, right.

#### A Priori Region of Interest Analyses

#### **Main effects of participant sex**

Only in the MPFC a main effect of Participant Sex was detected pointing to higher overall activation in men compared to women (p = 0.05).

#### **Interactions with the factor participant sex**

An interaction Attribute × Participant Sex was detected in both left and right amygdala, indicating that across all conditions men showed a trend toward lower activation for female compared to male attributes (left p = 0.06; right p = 0.08) whereas women did not differ between female and male attributes (left p = 0.35, right p = 0.32). No further interactions including the factor Participant Sex were detected (ps > 0.11, Fs < 2.27).

#### **Main effects of condition and attribute**

Please refer to the **Supplementary Material** and **Table 7** for reports of the main effects of the factors Condition and Attribute.

### Correlation Analyses

### **Sex hormones** × **behavioral data**

Sex hormones were correlated separately for women and men with the number of self-ascribed female and male adjectives. However, no significant correlations were detected in either women or men (rs < 0.37, ps > 0.14).

### **Sex hormones** × **neural data**

For progesterone, in men, a positive association was found with the right (r = 0.64, p = 0.006) and left (r = 0.49, p = 0.047) amygdala and left TPJ activation (r = 0.52, p = 0.032) during presentation of male attributes. All other ROIs were not significantly linked to progesterone values in men and women (rs < 0.48; ps > 0.05). For estradiol (rs < 0.33, ps > 0.20) and testosterone (rs < 0.33, ps > 0.21) no significant correlations were found in men and women. Of note, significant correlations are uncorrected for multiple comparisons as the mere number of comparisons (each hormone was compared separately in women and men with female and male attributes in eight ROIs resulting in 16 comparisons for each sex) would have required almost perfect correlations. We nevertheless report these values but urge caution in interpreting them.
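To illustrate why a correction for multiple comparisons would be demanding here, the sketch below computes a Bonferroni-adjusted threshold for the 16 comparisons per hormone and sex and compares it with the two amygdala p-values reported above. Bonferroni is used purely as an example of a correction; the text does not state which correction the authors had in mind.

```python
ALPHA = 0.05
N_ROIS = 8             # regions of interest
N_ATTRIBUTE_TYPES = 2  # female and male attributes

n_comparisons = N_ROIS * N_ATTRIBUTE_TYPES  # 16 per hormone and sex
bonferroni_alpha = ALPHA / n_comparisons

# Uncorrected p-values reported above for progesterone in men (male attributes).
reported_p = {"right amygdala": 0.006, "left amygdala": 0.047}
survivors = [roi for roi, p in reported_p.items() if p < bonferroni_alpha]

print(bonferroni_alpha, survivors)
```

Under this illustrative correction neither amygdala correlation would remain significant, in line with the caution the authors request.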

## DISCUSSION

The current fMRI study investigated gender-related self- and other-appraisals in adult women and men. The main focus of this study was to investigate whether sex differences on a behavioral and neuronal level exist during such processes.

Notably, women and men self-ascribed more gender-stereotyped traits, i.e., women agreed to have more stereotypical female attributes. At the same time both women and men reported the desire to exhibit more masculine traits. On the level of brain activation, women and men recruited similar brain regions during self- and other-appraisal. Only during self-referential processes was one significant cluster in the right putamen more strongly active, pointing to higher gender-congruent activation in women and men, respectively. Furthermore, specific region of interest analyses also revealed a similar pattern of gender-congruent activation in bilateral amygdala, showing that men had stronger activation for male compared to female attributes, albeit during both the self- and other-conditions. All other region of interest analyses did not reveal sex differences. Finally, whole brain regressions with sex hormone levels were conducted separately for women and men for the self-condition. The outcome of these analyses yielded different brain regions for women and men including clusters in the left insula and rolandic operculum, right angular gyrus and superior medial gyrus.

## Self-Appraisal of Gender Stereotypes

In our study, we confronted women and men with traits that had been rated as typically female or male in a pre-study. As expected, women and men self-ascribed more gender-congruent attributes. However, only in women did this difference reach significance, i.e., women agreed more often to female rather than male traits when referring to themselves. At the same time both women and men reported the desire to exhibit more masculine attributes. A tentative explanation for this observed pattern may consider the occupational situation for women, who are underrepresented in academic leadership positions and earn less than men in most Western societies (Carnes et al., 2015; Salinas and Bagni, 2017). Such a discrepancy between the sexes seems to be in part due to conscious or unconscious discrimination against women already at the level of applications. For example, data from Moss-Racusin et al. (2012) demonstrate that for identical applications of a bogus female and male student for a position as laboratory manager, men were rated higher on competency and were more likely to be hired and mentored by female and male faculty members. Similarly, Steinpreis et al. (1999) found that for identical applications of female and male scientists both female and male reviewers were more likely to hire the male applicants. Interestingly, not only women but also men are discriminated against when applying for jobs that appear not suitable for them (Davison and Burke, 2000), such as communal roles including working as a nurse or social worker (Croft et al., 2015). Also in politics more masculine traits appear beneficial for election success. For example, studies by Klofstad et al. (2012) and Anderson and Klofstad (2012) showed that participants listening to differently pitched female and male voices voted more often for persons with deeper, more masculine voices, which was true for both female and male candidates. Thus, at least in the above-mentioned domains it can be beneficial to exhibit masculine attributes to increase success, and therefore the observed desire to exhibit more masculine traits could make sense. However, we want to point out that this narrative is only speculative and cannot explain why male attributes should be preferred in other non-professional contexts. We therefore encourage further studies to conduct more detailed and domain-specific investigations to back or refute our speculations about the desire to exhibit more masculine traits, which we found for both women and men.

TABLE 6 | Results of whole brain regression of sex hormone values on brain activation.

<sup>∗</sup>Association with male attributes. ∗∗Association with female attributes.

FIGURE 5 | Display of the interaction Attribute × Participant Sex during ROI-analysis in the right putamen. Men showed higher activation for male attributes whereas women had higher activation for female adjectives. <sup>∗</sup>p < 0.05.

TABLE 7 | Region of interest analysis with indication of main effects and interactions with the factor participant sex.


Amy, amygdala; L, left; MPFC, medial prefrontal cortex; MPPC, medial posterior parietal cortex; R, right; TPJ, temporo-parietal junction. ∗∗Condition × Participant Sex × Attribute; <sup>∗</sup>Attribute × Participant Sex.

## Sex Differences in Neural Networks of Self- and Other-Appraisal

Previous studies have pointed to differences during the appraisal of self- and other-related attributes with parts of the MPFC being more active during self-referential processing (D'Argembeau et al., 2007; Pfeifer et al., 2007; Veroude et al., 2013). In contrast to this, the precuneus has been most consistently recruited during the retrieval of other-related information (Pfeifer et al., 2007; Quadflieg et al., 2009). Results about sex differences during such self- and other-processing have been more tentative. In this regard Veroude et al. (2013) pointed to higher activation of bilateral TPJ and MPPC in men compared to women during both self- and other-processing.

## Small Evidence for Sex Differences

Here we showed that women and men had higher gender-congruent activation in bilateral amygdala across all conditions and, specific to the self-condition, in the right putamen. The putamen forms part of the basal ganglia, which are involved in movement and reward processing by means of dopaminergic signaling (Schultz, 2016). This finding could point to a greater reward value of same-sex attributes in women and men, but this interpretation is limited by the lack of correlations between neural activations and behavior that could help to inform the meaning of the brain activation. The gender-congruent activation in bilateral amygdala can be viewed in a similar vein. The amygdala is known for its involvement in the processing of emotional information (Lindquist et al., 2012; Dricu and Frühholz, 2016) and is also generally considered a salience detector (Sander et al., 2003). Thus, another tentative interpretation of our results could be an increased salience of gender-congruent items in women and men, leading to gender-congruent activation in the amygdala. Of further note, regressions of sex hormone levels on whole-brain activation revealed stronger activation in left insula and left postcentral gyrus with rising progesterone and testosterone levels, respectively, in women during self-processing of male attributes. Neither of these regions is located in the vicinity of the anterior cingulate cortex, which was identified to be most strongly active during self- compared to other-processing in general (see "General Effects of Self- and Other-Appraisal"). Only the insula has been repeatedly implicated in self-referential processes (Enzi et al., 2009; Modinos et al., 2009), which could thus speak for a further accentuation of self-related processes in association with progesterone levels. However, the separate correlations of sex hormone levels with the activations in several regions of interest yielded no conclusive pattern, indicating only higher bilateral amygdala activation with higher estradiol levels during presentation of male attributes in men. Human research still lacks a clear understanding of the cognitive effects of changes in sex hormone levels, which have been most consistently investigated in women (e.g., Sundström-Poromaa and Gingnell, 2014; Toffoletto et al., 2014), however with no clear conclusions. In our experiment, a multitude of statistical comparisons was performed, as we analyzed women and men separately for female and male attributes in several regions. Therefore, our data can only be considered preliminary and need further experimental support to corroborate and further specify them before strong conclusions can be drawn.
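The caution about the number of statistical comparisons can be illustrated with a minimal sketch. It applies the Holm-Bonferroni step-down procedure, which is one common way to control the family-wise error rate and is not necessarily what was used in this study. The p-values are hypothetical and are not taken from the reported analyses; the function name is likewise illustrative.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (0-based rank k) is compared with alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all tests with larger p-values fail too
    return reject


# Hypothetical p-values for six region-by-condition tests (NOT from the study).
hypothetical_p = [0.004, 0.030, 0.020, 0.045, 0.300, 0.012]

uncorrected = [p <= 0.05 for p in hypothetical_p]
corrected = holm_bonferroni(hypothetical_p)

print("significant without correction:", sum(uncorrected))
print("significant after Holm correction:", sum(corrected))
```

With these made-up values, five of six tests pass the uncorrected threshold but only one survives correction, which is why findings from many separate comparisons are best treated as preliminary.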

## General Effects of Self- and Other-Appraisal

However, based on our study, we were able to give substantial evidence for a general neural difference between self- and other processing which we therefore want to explain in a bit more detail here.

## The Self

Our results show that a cluster in the left anterior cingulate cortex extending to the insula was more strongly active during the self- compared to the other-conditions, which is in line with previous reports showing stronger insular activation for self-processing compared to familiarity judgments (Qin et al., 2012). The anterior insula in particular has been repeatedly related to the awareness of internal body states (Craig, 2004, 2009) and was suggested to code emotional salience (Northoff et al., 2011). Thus, it does not come as a surprise that self-referential processing involves the anterior insula (Enzi et al., 2009; Modinos et al., 2009), suggesting that stronger personal involvement during self-reflection shares part of the neural substrates important for coding of emotional salience.

## Others

Our results furthermore show that a cluster in the precuneus was more strongly activated during the other- compared to the self-condition. The precuneus is classically involved in a variety of cognitive and emotional functions, such as mental and motor imagery (Cavanna and Trimble, 2006) but also social cognition, self-agency and self-activation (e.g., Vogeley and Fink, 2003). Interestingly, the precuneus is also an important part of the default mode network (Utevsky et al., 2014), but its activation seems to be more relevant for processing of other-related information. For example, stronger activation of the precuneus has been reported in participants deciding whether a sentence applied to another person or not (Veroude et al., 2013). Also, Qin et al. (2012) report that the precuneus preferentially responds to stimuli related to (personally) familiar people in contrast to self-specific stimuli. This fits with our findings, as all participants were familiar with both actors, and suggests that for ascribing the female or male traits to a prototypical woman (Julia Roberts) or man (Brad Pitt) their choices were based on classical gender stereotypes.

## Limitations

It has been shown that menstrual cycle phase influences self-ratings of one's own bodily attractiveness (Durante et al., 2008), which might also translate to self-appraisals. In the current study we did not assess menstrual cycle phase or oral contraceptive intake and thus cannot rule out that such hormonal influences played a confounding role (e.g., Pletzer et al., 2015). A further limitation is the small number of female and male participants, which potentially did not allow us to detect more subtle sex differences. Furthermore, the number of different female and male items we used made it impossible to balance each experimental condition for the same items. However, we point out that the mean ratings of female and male items did not differ between conditions, and therefore this aspect is an unlikely confound in our experimental design. Finally, our participants were mainly students. At present in Germany, there is still a divide between the number of female and male students in different fields of academia, with 70–90% male students in engineering subjects and around 80% women in educational science. Both the subject of study and the gender ratio have been shown to impact stereotype processing, e.g., leading to a stronger stereotype threat when the gender ratio is unbalanced (Murphy et al., 2007) or for students facing tasks that are outside their subject of study (Sanchis-Segura et al., 2018). For this reason, our study may not allow general claims, either within our sample of students or beyond academia. Other factors that may influence gender stereotyping are personality traits such as the Big Five of personality research: openness to experience, conscientiousness, extraversion, agreeableness and neuroticism (Asendorpf, 2005). Unfortunately, we did not collect such information and await future studies to analyze how these traits might affect gender stereotyping.

## CONCLUSION

Measuring self- versus other-appraisals to explore behavioral and neural differences between healthy women and men revealed that both sexes self-ascribed more gender-congruent than -incongruent traits while also expressing a higher desire to exhibit more masculine traits. While fMRI did not detect general sex differences in the self- and other-conditions, some subtle differences were revealed between the sexes: both in right putamen and bilateral amygdala stronger gender-congruent activation was found which was however not associated with behavioral measures like the number of self-ascribed female or male attributes.

## AUTHOR CONTRIBUTIONS

BD and UH contributed to conception and design of the study. JJ and KP collected the data. The majority of statistical analyses was performed by JH. ES contributed to statistical analyses. BD and JH wrote the first draft of the manuscript. All authors contributed remarks to improve the manuscript and approved the final version of the manuscript.

## FUNDING

This study was supported by the German Research Foundation (DFG: HA 3202/7-1, DE2319/2-3, and IRTG 1328) and the Brain Imaging Facility of the Interdisciplinary Centre for Clinical Research of the Faculty of Medicine at the RWTH Aachen University, Germany.

## ACKNOWLEDGMENTS

The authors thank Sabine Bröhr, Cordula Kemper, Maria Peters, and Thilo Kellermann for their assistance and support.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnbeh.2019.00031/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hornung, Smith, Junger, Pauly, Habel and Derntl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Does Gender Leave an Epigenetic Imprint on the Brain?

Laura R. Cortes†, Carla D. Cisternas† and Nancy G. Forger\*

Neuroscience Institute, Georgia State University, Atlanta, GA, United States

The words "sex" and "gender" are often used interchangeably in common usage. In fact, the Merriam-Webster dictionary offers "sex" as the definition of gender. The authors of this review are neuroscientists, and the words "sex" and "gender" mean very different things to us: sex is based on biological factors such as sex chromosomes and gonads, whereas gender has a social component and involves differential expectations or treatment by conspecifics, based on an individual's perceived sex. While we are accustomed to thinking about "sex" and differences between males and females in epigenetic marks in the brain, we are much less used to thinking about the biological implications of gender. Nonetheless, careful consideration of the field of epigenetics leads us to conclude that gender must also leave an epigenetic imprint on the brain. Indeed, it would be strange if this were not the case, because all environmental influences of any import can epigenetically change the brain. In the following pages, we explain why there is now sufficient evidence to suggest that an epigenetic imprint for gender is a logical conclusion. We define our terms for sex, gender, and epigenetics, and describe research demonstrating sex differences in epigenetic mechanisms in the brain which, to date, is mainly based on work in non-human animals. We then give several examples of how gender, rather than sex, may cause the brain epigenome to differ in males and females, and finally consider the myriad of ways that sex and gender interact to shape gene expression in the brain.

Keywords: sex, gender, epigenetics, stress, cosmetics, alcohol

## SEX AND GENDER

Most animals on earth come in two sexes. From a biological perspective, sex is defined by gamete size within a species: animals with large gametes (i.e., eggs) are female and those with small gametes (i.e., sperm) are male (Maynard Smith, 1978). In mammals, eggs are made in ovaries and sperm in testes, so gonad type is often used as a shorthand for defining sex. Intersex gonads (part testis-part ovary) are very rare, so biological sex in mammals is a largely dichotomous variable.

Which gonad develops is determined by chromosomal sex (XX versus XY). If a Y chromosome is present, a gene cascade is initiated that causes the previously undifferentiated gonads to become testes; in the absence of a Y chromosome, an alternate cascade leads to the differentiation of ovaries (Brennan and Capel, 2004; Bowles and Koopman, 2013). The testes produce an androgenic steroid hormone, testosterone, for a brief perinatal period, and this hormonal exposure is responsible for masculinization of the external genitalia, internal duct systems, and other somatic differences (Jost, 1978). Testosterone also enters the developing brain and acts via androgen receptors or, after aromatization to an estrogen, via estrogen receptors to cause many of the known neural sex differences in animals (Morris et al., 2004; Forger et al., 2016; McCarthy et al., 2017).

#### Edited by:

Annie Duchesne, University of Northern British Columbia, Canada

#### Reviewed by:

Katherine L. Bryant, Radboud University Nijmegen, Netherlands

Sarah Richardson, Harvard University, United States

#### \*Correspondence:

Nancy G. Forger nforger@gsu.edu

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 30 November 2018; Accepted: 13 February 2019; Published: 27 February 2019

#### Citation:

Cortes LR, Cisternas CD and Forger NG (2019) Does Gender Leave an Epigenetic Imprint on the Brain? Front. Neurosci. 13:173. doi: 10.3389/fnins.2019.00173

Thus, biologists define "sex" based on what gonad is present and, in most cases, the chromosomal, gonadal, hormonal, and anatomical sex are all in accord. In individuals with Differences of Sexual Development, however, this is not the case, e.g., chromosomal males who have testes, but do not make the receptors to respond to testosterone, or chromosomal females exposed to excess androgens early in development (Lee et al., 2016).

In contrast to the relatively well-accepted delineation of sex, suggested definitions of "gender" are more varied. The Canadian Institutes of Health Research defines gender as, "socially constructed roles, behaviors, expressions and identities of girls, women, boys, men, and gender-diverse people. It influences how people perceive themselves and each other, how they act and interact, and the distribution of power and resources in society" (CIHR, 2015). Most of the work on epigenetics in the brain has been performed on experimental animals, which complicates the job of this essay because it is debatable whether non-human animals have gender, based on this definition. If gender requires socially constructed norms and requires that an individual identify as one sex or the other, it is hard to demonstrate gender in non-human animals. On the other hand, to the extent that gender is based on how you are treated by conspecifics, or on the "power and resources" you are likely to accrue, there are many examples of gender in the animal world. The biologist Joan Roughgarden has suggested defining gender simply as, "the appearance, behavior, and life history of a sexed body" (Roughgarden, 2009). Most social scientists embrace a definition of gender as a "system that restricts and encourages patterned behavior" (Risman and Davis, 2013). In other words, the emphasis is not on the individual (i.e., gender identity) but on social interactions that steer the individual's behavior in different ways, based on their biological sex.

Given the latter two definitions, it may be argued that animals have gender, and this is how we define gender for the purposes of this review. Biological sex and gender often interact in complicated ways. However, we will refer to something as a "sex difference" when the difference appears to be due to factors such as sex chromosomes or gonadal hormones, and as a "gender difference" when the difference is likely due to social factors, i.e., when an individual is treated differently by conspecifics due to the individual's perceived sex.

## EPIGENETICS

Epigenetic modifications determine what genes are expressed and represent mechanisms by which the genome can respond to environmental stimuli. The word "epigenetic" (literally, above genetics) was coined by C.H. Waddington in the 1950s to explain how different phenotypes can emerge from the same genotype. In other words, individuals (or cells) with the same genes may wind up with very different observable characteristics (phenotypes) based on environmental interventions at key developmental stages (Waddington, 1957). What controlled those changes was mysterious at the time, but many of the molecular mechanisms underlying the phenomena envisioned by Waddington have now been identified.

The DNA in every cell nucleus is packaged into chromatin by winding around histone proteins. The two best-understood types of epigenetic modifications are (1) post-translational modifications to histones, such as acetylation or methylation, and (2) covalent modifications to the DNA strand itself, e.g., by the addition of methyl or hydroxymethyl groups (Stricker et al., 2017). These epigenetic modifications are controlled by enzymes (e.g., histone acetyltransferases or DNA methyltransferases) and, once placed, they influence the likelihood that a given gene is expressed. For example, DNA methylation is often associated with gene repression, whereas DNA hydroxymethylation may facilitate transcription (Spruijt et al., 2013; Mendonca et al., 2014).

## EPIGENETICS AND SEXUAL DIFFERENTIATION OF THE BRAIN

A transient perinatal exposure to testosterone or its metabolite, estradiol, causes many of the best-studied sex differences in rodent brains, and recent evidence suggests that epigenetic mechanisms underlie many of these hormonal effects (McCarthy et al., 2009; McCarthy and Nugent, 2015; Forger, 2016, 2018). For example, sex differences in the preoptic area of the hypothalamus are disrupted by injecting a DNA methyltransferase inhibitor directly into the brains of newborn rats or mice during the critical period for sexual differentiation (Nugent et al., 2015; Mosley et al., 2017). Similarly, a neonatal disruption of histone acetylation (again, by inhibiting the enzymes that place these marks) prevents the development of sex differences in male rat copulatory behavior (Matsuda et al., 2011), and in size of the bed nucleus of the stria terminalis in mice, a brain region linked to male sexual behavior (Murray et al., 2009). These findings suggest that sexual differentiation of the brain requires orchestrated changes in DNA methylation and histone acetylation.

In another approach, epigenetic marks have been compared between males and females. Based on whole-genome surveys, both histone methylation and DNA methylation patterns differ by sex in the mouse preoptic area (Ghahramani et al., 2014; Shen et al., 2015). Treating newborn female mice with testosterone partially masculinizes the DNA methylation pattern present in adulthood (Ghahramani et al., 2014), and sex differences in the methylation of specific genes also are reversed by neonatal treatment with gonadal steroids in rats (Schwarz et al., 2010). Steroid hormones alter the expression or activity of enzymes that place epigenetic marks (Kolodkin and Auger, 2011; Nugent et al., 2015; Bramble et al., 2016), which may be the mechanism whereby hormones affect the epigenome.

One study in rodents hints at a role for gender in brain epigenetics. Mother rats lick their male neonates more than females (Moore and Morelli, 1979), and the amount of maternal care a rat pup receives affects DNA methylation of the estrogen receptor alpha gene in the brain (Champagne et al., 2006; Kurian et al., 2010). Edelmann and Auger (2011) randomly assigned some newborn females to receive the extra attention normally given to males by simulating maternal licking using a paintbrush. This did, in fact, masculinize the DNA methylation pattern and expression of the estrogen receptor alpha gene in the amygdala of the treated females (Edelmann and Auger, 2011). Being treated differently by your parents based on your perceived sex is an aspect of gender. In this case, however, the differential treatment is based on the odor of the neonate's urine (Moore, 1985), which in turn is due to differences in circulating testosterone (i.e., sex).

Some sex differences in the brain are independent of gonadal hormones, and are instead due to sex chromosome complement (Arnold et al., 2003; Cisternas et al., 2018). Similarly, sex chromosomes influence the expression of epigenetic enzymes and cause sex differences in the epigenome of rodents and flies (Xu et al., 2008a,b; Jiang et al., 2010; Lemos et al., 2010; Arnold, 2012). Thus, based on animal studies, both major determinants of biological sex (sex chromosomes and gonadal steroids) contribute to differences in the epigenome.

Information on sex differences in the human brain epigenome is very limited. During some stages of human fetal development, the brains of males and females differ in both DNA methylation and hydroxymethylation (Spiers et al., 2015, 2017). Because these differences are seen before birth, and presumably prior to social influences, these are "sex differences." There are also differences in epigenetic marks in the prefrontal cortex of men and women (Lister et al., 2013; Xu et al., 2014; Gross et al., 2015). Adults have had plenty of gendered experiences, however, so whether these differences are due to sex or gender is not clear. In the next section, we will consider how gender could – and probably does – leave an epigenetic imprint on the brain. We present three specific gendered experiences/exposures occurring at different periods of human development, and for which there are data demonstrating epigenetic effects of those experiences/exposures in animal or human studies.

## GENDERED EXPERIENCES AND EXPOSURES

## Early Life Stress

A growing literature demonstrates that early life stress leaves an epigenetic signature (Roth et al., 2009; Lutz et al., 2018). For example, rodents separated from their mothers throughout early life have reduced DNA methylation and altered gene expression in adulthood within a stress-regulatory brain region (Murgatroyd et al., 2009). Early life maltreatment – being stepped on and ignored by the mother – also alters DNA methylation in genes associated with learning and cell growth, as well as expression levels of epigenetic enzymes in the rat prefrontal cortex (Roth et al., 2009; Blaze and Roth, 2013; Blaze and Roth, 2017).

Similar observations have been made in humans. Compared to children raised by their biological parents, children raised in orphanages have higher DNA methylation of genes associated with immune response, mood, and social behaviors (Naumova et al., 2012). These findings are based on analyses of blood lymphocytes, however, which are often used for this kind of work in humans given the difficulty of obtaining brain samples. In another approach, DNA methylation was compared in the brains of adults who died by suicide, with or without a history of childhood abuse. Those who experienced childhood abuse had decreased hydroxymethylation and expression of the kappa opioid receptor gene in the cortex, suggesting epigenetic programming by a history of early life maltreatment (Lutz et al., 2018).

This work is relevant to the question of whether gender leaves an epigenetic imprint on the brain because the sex of a baby may significantly affect the likelihood that it will face early life stress (Jeffery et al., 1984; van Balen and Inhorn, 2003; Puri et al., 2011). In recent history, for example, China's "one child policy" resulted in the abandonment of many girls and sharply skewed sex ratios within orphanages (Johnson et al., 1998; Chen et al., 2015). Similarly, during the Great Chinese Famine, families preferred to spend their limited resources on boys, leading to disparities in disability and illiteracy between men and women a generation later (Mu and Zhang, 2011). Treating children differently based on their biological sex is an important part of our definition of gender. Thus, exposure to early life stress changes the neural epigenome, and early life stress can be a gendered experience.

## Environmental Endocrine Disruptors

It is nearly impossible in industrialized societies to avoid exposure to environmental endocrine disruptors such as bisphenol A, phthalates, and parabens. In rodents, developmental exposure to bisphenol A alters DNA methylation in the brain, and changes the expression of DNA methyltransferases in a brain region-specific manner (Yaoi et al., 2008; Kundakovic et al., 2013; Zhou et al., 2013; Walker and Gore, 2017). Moreover, phthalate exposure during adolescence reduces levels of the epigenetic regulatory protein, methyl CpG binding protein 2, and alters social and fear behaviors in rats (Betz et al., 2013). Environmental endocrine disruptors therefore are clearly capable of altering the brain's epigenome and, to the extent that exposure to these chemicals is gender-based, epigenetic changes may also be gendered.

Interestingly, bisphenol A, phthalates, and parabens are commonly found in cosmetics, scented lotions, nail polish, and feminine care products. There is a vast difference in the use of personal care products between women and men in many parts of the world, and women do, in fact, have higher urinary levels of phthalates and parabens than men (Calafat et al., 2010; Biesterbos et al., 2013; Saravanabhavan et al., 2013). The application of lotions and cosmetics acutely increases levels of urinary paraben concentrations (Meeker et al., 2013), and the difference in urinary levels between males and females emerges in adolescence – the age at which many girls start experimenting with cosmetics and skin care products (Calafat et al., 2010; Dewalque et al., 2014).

The elevated phthalates and parabens in women is likely related to their greater cosmetic use, but is this due to sex or gender? We would say "sex" if, for example, sex chromosomes or gonadal hormones control the desire to use cosmetics, or alter the metabolism or storage of these chemicals in the body. On the other hand, gender is at play if the difference is primarily based on social expectations. Evidence strongly suggests a role for gender because societal norms for cosmetic use vary over time and geography: cosmetics were used by men in ancient Egypt, at the French court in the 17th and 18th centuries, and by British military officers (Carter, 1998; Tapsoba et al., 2010; Ribechini et al., 2011). Very recently, cosmetic use has again become acceptable among men in Western societies (Souiden and Diagne, 2009). Thus, societal gender norms influence cosmetic use. Although no human studies have directly addressed this question, there may well be epigenetic consequences of gendered exposure to cosmetics and other environmental chemicals.

## Alcohol Consumption


Throughout the world, men are more likely to consume alcohol than are women (Wilsnack et al., 2009). A recent meta-analysis found that 39% of men and 25% of women globally are drinkers; moreover, men are more likely to drink excessively, and the increase in disease burden due to alcohol consumption is three times higher in men than in women (GBD 2016 Alcohol Collaborators, 2018). This could reflect sex differences: rodents and non-human primates show sex differences in voluntary alcohol consumption, and gonadal hormones influence preference for an alcohol solution in rodents (Forger and Morin, 1982; Morin and Forger, 1982; Juarez et al., 1993; Ford et al., 2004). Critically, however, the difference in drinking rate between men and women varies enormously by location. In Nepal, for example, men are 14 times as likely as women to be drinkers, whereas in Sweden, the prevalence of drinking is nearly equal between men and women (GBD 2016 Alcohol Collaborators, 2018). Societal factors therefore play a large role, and alcohol consumption can safely be categorized as a gendered behavior in many human societies.
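To make the contrast concrete, the gap in drinking can be expressed as a men-to-women prevalence ratio. The sketch below is illustrative only: the global percentages and the Nepal ratio are the figures quoted above, the value of 1.0 for Sweden is an assumed stand-in for "nearly equal", and the function and variable names are hypothetical.

```python
def prevalence_ratio(men_pct: float, women_pct: float) -> float:
    """Share of men who drink divided by the share of women who drink."""
    return men_pct / women_pct


# Global figures quoted in the text: 39% of men and 25% of women are drinkers.
global_ratio = prevalence_ratio(39.0, 25.0)

# Ratios as described in the text. Sweden's 1.0 is an assumption standing in
# for "nearly equal"; it is not a reported number.
ratios = {
    "Global": global_ratio,
    "Nepal": 14.0,
    "Sweden (assumed)": 1.0,
}

for place, ratio in ratios.items():
    print(f"{place}: men are about {ratio:.2f} times as likely as women to drink")
```

The global ratio works out to roughly 1.56, far below the 14-fold difference described for Nepal. The wide spread of ratios across countries is the pattern that the text reads as evidence for a large societal, and hence gendered, component.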

The link to epigenetic changes in the brain in this case is relatively strong. Several studies have reported changes in DNA methylation and histone modifications in the postmortem human brain in association with chronic alcohol consumption (Ponomarev, 2013; Tulisiak et al., 2017). As in most human studies, these are correlations, so it remains possible that alcohol consumption does not cause epigenetic changes in the human brain, but that existing epigenetic differences predispose some people to drink. This is where animal studies are again very helpful: many rodent studies in which animals are randomly assigned to ethanol exposure demonstrate a causal relationship between acute or chronic ethanol consumption and epigenetic changes in the brain (Pandey et al., 2008; Kyzar et al., 2016).

## CONNECTING THE DOTS

The argument we are making is that boys and girls, and men and women, have different exposures and experiences based on societal expectations or perceived expectations (i.e., gender), and that some of these exposures/experiences are known to cause epigenetic changes in the brain based on carefully controlled animal studies. In a few cases, the gendered exposures/experiences have also been associated with epigenetic changes in humans, although most studies are correlational. We have presented just three examples above, but countless experiences/exposures will differ based on gender over a lifetime, and they will interact in complex ways with one another and with the epigenetic consequences of biological sex (**Figure 1**).

A logical extension of this argument is that variations in gender within a sex will also affect the epigenome. For example, cosmetic use among Western women varies from zero to many products a day and correlates with gender expression and sexuality (Loretz et al., 2005; Moore, 2006). If cosmetics cause epigenetic changes, those changes will vary not just between sexes, but also within sex, across cultures, and over the lifespan. Indeed, any differences in the brain between men and women – including those in the epigenome – must be viewed within a social, historical, and developmental context (Springer et al., 2012; Rippon et al., 2014).

FIGURE 1 | Hypothetical depiction of the complex interplay of sex and gender on the brain epigenome throughout the lifespan. Chromosomal sex is determined at conception and can have effects on the epigenome throughout life (red). The gonads differentiate after the first 10 weeks of fetal life in humans; thereafter, sex differences in gonadal hormones can have acute or lasting effects on the epigenome (gold). The gendered experiences described in this review start as early as birth (early life stress based on gender; green) and continue into adolescence and adulthood (cosmetic use, alcohol consumption; light blue, purple). Many other gendered experiences not explicitly addressed in this review will also impact the neuroepigenome (dark blue). The relative contribution of various factors and how they may change throughout development are not known, but the effects of biological sex and gender will interact in myriad ways throughout life. In some cases, gender may amplify epigenetic differences due to sex, whereas in other cases, gendered experiences may counteract differences in the epigenome based on biological sex. Not shown here is the fact that with our current ability to know the sex of an unborn child, gender can start before birth (Al-Akour, 2008).

Our three examples given above emphasize exposures that differ by gender, because these are more likely to have been modeled in animal studies (and therefore to have applicable epigenetic data). However, gender is multi-dimensional, and any aspect (gender roles, identities, beliefs, etc.) may affect the epigenome. Epigenetic modifications are a way for experience to alter gene expression and, taken together, it seems inescapable that gender will leave an epigenetic imprint on the brain.

That said, few studies have directly examined differences in epigenetic marks in the brains of men and women, and none have attempted to separate the contributions of sex and gender. Demonstrating a causal relationship between gender and human brain epigenetics will be very challenging, because this will require not only an experimental design, but also brain samples collected at the relevant time point(s). Several authors have proposed methods or best practices for studying effects of gender on biological outcomes, and inroads have been made in separating the effects of sex and gender on disease risk (e.g., Krieger, 2003; Rippon et al., 2014; Pelletier et al., 2015). Given our lifetimes of layered gendered experiences, and their inevitable, iterative interactions with sex, it may never be possible to completely disentangle the effects of sex and gender on the human brain epigenome. We can start, however, by including gender in our thinking any time a difference between the epigenome of men and women is reported.

## AUTHOR CONTRIBUTIONS

LRC, CC, and NF conceptualized and wrote the manuscript.

## FUNDING

This work was supported by an NSF Graduate Research Fellowship (to LRC), NSF IOS 1557451 (to NF), and a Georgia State University Brains & Behavior Seed Grant.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Cortes, Cisternas and Forger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Does Sex/Gender Play a Role in Placebo and Nocebo Effects? Conflicting Evidence From Clinical Trials and Experimental Studies

#### Paul Enck\* and Sibylle Klosterhalfen

Department of Internal Medicine VI: Psychosomatic Medicine and Psychotherapy, University Hospital Tübingen, Tübingen, Germany

#### Edited by:

Marina A. Pavlova, University Hospital Tübingen, Germany

#### Reviewed by:

Magne Arve Flaten, Norwegian University of Science and Technology, Norway György Bárdos, Institute of Health Promotion and Sport Sciences, Eötvös Loránd University, Hungary

> \*Correspondence: Paul Enck paul.enck@uni-tuebingen.de

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 26 July 2018 Accepted: 12 February 2019 Published: 04 March 2019

#### Citation:

Enck P and Klosterhalfen S (2019) Does Sex/Gender Play a Role in Placebo and Nocebo Effects? Conflicting Evidence From Clinical Trials and Experimental Studies. Front. Neurosci. 13:160. doi: 10.3389/fnins.2019.00160

Sex has been speculated to be a predictor of the placebo and nocebo effect for many years, but whether this holds true has rarely been investigated. We utilized a placebo literature database on various aspects of the genuine placebo/nocebo response. In 2015, we had extracted 75 systematic reviews, meta-analyses, and meta-regressions performed in major medical areas (neurology, psychiatry, internal medicine). These meta-analyses were screened for whether sex/gender differences had been noted to contribute to the placebo/nocebo effect: in only 3 such analyses was female sex associated with a higher placebo effect, indicating poor evidence for a contribution of sex to it in RCTs. This was updated with another set of meta-analyses for the current review, which did not change the overall conclusion. The same holds true for 18 meta-analyses investigating adverse event (nocebo) reporting in the placebo arm of RCT. We also screened our database for papers referring to sex/gender and the placebo effect in experimental studies, and identified 28 papers reporting 29 experiments. Their results can be summarized as follows: (a) Despite higher pain sensitivity in females, placebo analgesia is easier to elicit in males; (b) It appears that conditioning is specifically effective in eliciting nocebo effects; (c) Conditioning works specifically well to elicit placebo and nocebo effects in females and with nausea; (d) Verbal suggestions are not sufficient to induce analgesia in women, but work in men. These results will be discussed with respect to the question of why nausea and pain may be prone to sex/gender differences, while other symptoms are less so. Lastly, we will discuss the apparent discrepancy between RCT, with low relevance of sex, and the higher relevance of sex in specific experimental settings. We argue that the placebo response is predominantly the result of a conditioning (learning) response in females, while in males it may predominantly be generated via (verbal) manipulation of expectancies. In RCT, therefore, the net outcome of the intervention may be the same despite different mechanisms generating the placebo effect between the sexes, while in experimental work, when both pathways are separated and explicitly explored, such differences may surface.

Keywords: sex, gender, placebo, nocebo, randomized controlled trials, experimental studies

## TERMINOLOGY

The terms sex and gender refer to the biological and psychosocial origins, respectively, of differences between women and men. For the purpose of this review these terms will be used interchangeably to describe any difference observed between men and women as it may impact aspects of experimental medicine and clinical therapeutics, similar to Franconi et al. (2012).

The debate over the terms placebo effect and placebo response has also filled many pages, but will be ignored here for the sake of simplicity. Both terms describe the results of a manipulation of treatment by providing an inert drug (in randomized controlled trials, RCT) or manipulating an experimental intervention, either for a whole group or for an individual. It has to be kept in mind, however, that some of these results of the RCT/experiment may be due to factors other than the placebo effect, specifically response biases, the Hawthorne effect, regression to the mean, spontaneous variation of symptoms, and other influences that need to be controlled for, if at all possible (Enck et al., 2013; Schedlowski et al., 2015). The same simplification is applied to the differential use of the terms nocebo effect and nocebo response, for which the same limitations are valid (Bingel and Placebo Competence Team, 2014).

## THE SHORT HISTORY OF PLACEBO RESEARCH

Historically, the term "placebo" referred only to the use of inert, pill-like medicines for control of unspecific (not drug-associated) effects in RCT (Kaptchuk, 1998), and to the occasional use of similar remedies in everyday medicine (Fassler et al., 2010). It was later extended to bring other, specifically non-medicinal, therapies into the arena of evidence-based medicine. Placebo-controlled trials in surgery and other "instrumental" and manual therapies (acupuncture, stimulation techniques such as TENS, TMS, physical therapies, and the like) (Enck, 2018) often use the term "sham" instead, to denote that the provision of placebos is not "inert" any longer: sham surgery, for instance, can be associated with significant violation of the body's integrity. The application of the concept of placebos to psychotherapy and similar therapies has received very little and rather late attention, and raises substantial controversy nowadays (Blease, 2018) over the question of whether psychotherapy is to a large extent only placebo therapy (Gaab et al., 2016), or whether the placebo concept should not be applied to psychotherapy at all (Kirsch, 2005).

The term "nocebo" is much younger. It initially described side effects (adverse events, AE) reported in the placebo arm of RCT, where these AE can only occur as the consequence of mis-attributing symptoms to the ingested (pill) placebo, or as the consequence of having read and signed AE patient information (Bingel and Placebo Competence Team, 2014). Nocebo effects largely follow the rules for placebo effects both in clinical studies and in experimental settings, as we will describe below, but we will not discuss in more detail the psychobiological and neurophysiological mechanisms behind placebo and nocebo effects; these have been extensively reported by us and others in many reviews in recent years (see for instance Enck et al., 2008, 2013; Elsenbruch and Enck, 2015; Schedlowski et al., 2015).

According to Franconi et al. (2012), female patients are traditionally underrepresented in clinical studies, for various reasons not elaborated here (e.g., Pinnow et al., 2009). On the other hand, specifically in the area of pain, sex differences are well established, both in clinical and in experimental settings (Paller et al., 2009), but have also been found to vary with sexual orientation and identity (Vigil et al., 2014). In the following sections we will review advances in research over the last decade with respect to pain and placebo analgesia.

## SEX-EFFECTS ON THE PLACEBO EFFECT IN RCT

While age has been shown to consistently affect placebo response rates in a number of clinical conditions investigated during RCT, sex of the patients has rarely been reported to contribute to it. Before 2010, there were only a few papers reporting stronger placebo analgesic responses in male patients (Berkley et al., 2006; Fillingim et al., 2009). Others failed to find sex differences in placebo analgesia, e.g., with tooth extraction (Averbuch and Katzper, 2001), transcutaneous electrical nerve stimulation (Robinson et al., 1998), and an experimental pain test (Olofsen et al., 2005). In a benzodiazepine withdrawal study, female patients had higher placebo responses than males (Saxon et al., 2001). However, sex differences in individual studies (e.g., Kelley et al., 2009, for irritable bowel syndrome), clinical or experimental, can neither sufficiently support nor refute the assumption that sex differences exist.

In a 2013 systematic review (Weimer et al., 2015) of meta-analyses and systematic reviews of RCT across most medical subspecialties, based on our JIPS literature database (Enck et al., 2018), we identified only three out of 75 meta-analyses that reported higher placebo response rates in female patients than in males, and in neurological and psychiatric diseases only, namely restless leg syndrome (Ondo et al., 2013), bipolar mania (Yildiz et al., 2011), and schizophrenia (Mallinckrodt et al., 2010). These findings, however, were contradicted by other meta-analyses of the same clinical conditions and of similar size (Woods et al., 2005; Fulda and Wetter, 2008; Chen et al., 2010) (see **Table 1**).

This is surprising, given that these 75 analyses – with more than 1,500 RCT included, in more than 40 different diseases and with more than 150,000 patients – covered neurological diseases (Parkinson's disease, restless leg syndrome, epilepsy), pain syndromes (migraine, neuropathic pain, fibromyalgia), psychiatric diseases (depression, schizophrenia, mania, psychosis, attention-deficit hyperactivity disorder, addiction), gastrointestinal disorders (visceral pain syndromes, inflammatory bowel diseases), and other disorders (asthma, overactive bladder, hypertension, allergy, chronic fatigue, sleep problems).



In a further 8 meta-analyses comprising 287 studies of various gastrointestinal disorders, no gender differences were found, nor were any found in 3 other meta-analyses comprising 64 studies of different medical conditions (from Weimer et al., 2015; for references not in our listing, we refer to this paper). No S, Number of RCT included; Response, Placebo (Pla), nocebo (Noc); Sex, M = Males, F = Females; RSL, Restless leg syndrome; MDD, Major depressive disorder; OCD, Obsessive-compulsive disorder; ADHD, Attention-deficit hyperactivity disorder; BED, Binge eating disorder; GAD, Generalized anxiety disorder. <sup>∗</sup> indicates availability of individualized data.

Adding a few more meta-analyses and conditions published since 2015 (Vase et al., 2015; Ciccozzi et al., 2016; Chen et al., 2017; Imanaka et al., 2017; Razza et al., 2018; Yeung et al., 2018) did not reveal additional evidence for a higher placebo response in either sex in any of the diseases. As a consequence of this rather clear result, we are forced to conclude that in RCT the direction and size of the placebo response are not related to the sex of the patients (Weimer et al., 2015).

## SEX EFFECTS ON THE NOCEBO RESPONSE IN RCT

The number of all papers including the term "nocebo" in our database is 431, of which only 12 (2.8%) refer to sex or gender, implying that much less attention is paid to sex differences in the discussion of nocebo effects. The database contains 18 meta-analyses on nocebo effects in RCT, covering more than 500 RCT with more than 25,000 patients, but excluding meta-analyses with children and adolescents, papers comparing two or more treatment modes for one condition or one treatment mode for more than one disease, and all experimental studies. All of these relate to neurological and psychiatric disorders (see **Table 2**).

As with the placebo effect, in only three papers was an association noted between sex and the reporting of AE or study termination due to AE: in two meta-analyses the nocebo effect was higher in women (Zis and Mitsikostas, 2015; Meister et al., 2017), in one it was higher in men (Papadopoulos and Mitsikostas, 2012). This leaves us with a similar conclusion as above, that in RCT the direction and size of the nocebo response may not be related to the sex of the patients. Nor does it seem to be related to the age of the patients, as two analyses showed higher AE reports in younger patients (Mitsikostas et al., 2012; Dodd et al., 2018), whereas another two noted higher responses in the elderly (Papadopoulos and Mitsikostas, 2010; Zis and Mitsikostas, 2015). Unfortunately, however, most studies reported neither sex nor age as determining factors of the nocebo effect in RCT, either because numbers were too small for meta-regressions, or because it was not of interest to the authors.

## SEX DIFFERENCES IN EXPERIMENTAL PLACEBO AND NOCEBO STUDIES

The situation is entirely different when placebo experiments are planned to evaluate the mechanisms behind the placebo/nocebo effects seen in RCT. Here, recruitment of patients or volunteers can be planned based on a balanced sex distribution, and eventually even matched for other social and biological variables, e.g., age, BMI, status etc., depending on the underlying hypotheses. Unfortunately, sex-balanced studies have one disadvantage that is often either ignored or has led to the exclusion of female test persons altogether, namely the need to assess and adjust for the menstrual cycle of female participants, e.g., in pain studies (Iacovides et al., 2015). In animal work, not only in placebo research, this has largely led to the exclusion of female animals from many studies (Couzin-Frankel, 2014). Surprisingly, even in experiments with patients the sex of patients is sometimes not reported (e.g., Petersen et al., 2012).

A recent systematic review (Vambheim and Flaten, 2017) identified 18 experiments in 17 papers (among more than 500 experiments, according to our JIPS database) reported in healthy volunteers in which sex as a contributing factor was either investigated purposely, or emerged incidentally during data evaluation. To these 18 experiments we added 9 further experiments in healthy volunteers and 2 in different patient groups (Liccardi et al., 2004; Skyt et al., 2018).

TABLE 2 | Nocebo effects (adverse events) in meta-analyses of randomized controlled trials, with respect to sex influences.

No S, Number of RCT included; Response, Placebo (Pla), nocebo (Noc); Sex, M = Males, F = Females; MDD, Major Depression Disorder; CIDP, Chronic inflammatory demyelinating polyneuropathy; MS, Multiple Sclerosis; ST, Symptomatic treatment; DMT, Disease modifying treatment.

## EXPERIMENTAL PLACEBO STUDIES

A total of 18 experiments were performed on placebo responses in healthy volunteers (**Table 3**); nearly equal numbers showed stronger responses in males (N = 7) or in females (N = 6), and 5 showed no sex differences, leaving the question unanswered. Of the three experimental studies in patients, two showed stronger placebo responses in females while one was inconclusive.

It should be noted, however, that 12 of the 18 studies in this group are from three laboratories only: 4 from the Flaten lab in Tromsö, Norway, 3 from the Elsenbruch lab in Essen, Germany, and 5 from our Düsseldorf/Tübingen labs. The remaining six are from six different labs around the world, indicating that outside these three labs, sex effects were probably an incidental finding rather than the focus of research. Among the three laboratories providing more than one study, one group showed a male predominance, one a female predominance, and one consistently found no sex differences. It seems from the distribution in **Table 3** that there is a trend for placebo analgesia to be more effective in males, while with experimentally induced nausea females report higher placebo responses. Whether this is due to a laboratory-specific bias, is specific to the clinical condition (pain or visceral pain versus nausea, for instance), or is due to different methods of placebo induction (verbal instruction versus conditioning) cannot be answered because of the small number of studies.

TABLE 3 | Placebo experiments reporting sex in healthy volunteers either by verbal induction or conditioning of the response (data in part from Vambheim and Flaten, 2017, supplemented by further studies).

No (Fem), Total number of volunteers included (number of females); Intervention, Placebo (Pla), nocebo (Noc); Sex, M = males, F = Females.

It is noteworthy, though, that placebo conditioning experiments have never worked for visceral pain (Sigrid Elsenbruch, personal communication); none is reported in the literature so far, despite our own and others' attempts. Taking visceral pain out of the equation, it appears that verbal induction of analgesia works better in men than in women.

It is also important to note that the Colloca et al. (2015) study used oxytocin to support the placebo effect; oxytocin is known to work specifically well in females, which may explain this paradoxical finding compared to all other placebo analgesia studies, which reported higher responses in males. The Krummenacher et al. (2014) study was performed in children, so its data are not easily transferable to adults (Weimer et al., 2013).

## EXPERIMENTAL NOCEBO STUDIES

**Table 4** lists the 8 experiments performed to induce a nocebo reaction in healthy volunteers; here the distribution seems clearer: five of the eight studies, and in addition the only patient study, report higher nocebo responses in females than in males, and only one shows a male predominance; two remain inconclusive.

It is noteworthy that among the six with stronger responses in females, four are conditioning studies, as are two of the placebo studies (see **Table 3**). This supports our assumption that conditioning may work specifically well in females. When we combine the experimental placebo and nocebo studies in healthy volunteers and cross-tabulate conditioning versus expectancy against female predominance versus female non-dominance (F = M and M > F), the association approaches, but does not reach, conventional significance (Fisher's exact test, p = 0.06, one-sided).
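The one-sided Fisher's exact test named above can be illustrated with a short sketch. The counts used here are hypothetical placeholders chosen for illustration only; they are not the study counts from Tables 3 and 4. The function name `fisher_exact_greater` is likewise an illustrative choice. The sketch assumes a 2 × 2 table with rows for the induction method (conditioning, expectancy) and columns for the outcome (F > M, not F > M).

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the null hypothesis of no association
    and with all margins fixed, of observing a top-left cell count of at
    least `a` (upper tail of the hypergeometric distribution).
    """
    row1 = a + b              # e.g., number of conditioning experiments
    col1 = a + c              # e.g., number of experiments with F > M
    total = a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)   # largest possible top-left count
    return sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    ) / denom


# HYPOTHETICAL counts for illustration only (not taken from Tables 3 and 4):
#                 F > M   not F > M
# conditioning      3         1
# expectancy        1         3
p_one_sided = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided p = {p_one_sided:.3f}")  # 17/70, about 0.243
```

With so few experiments per cell, an exact test of this kind is generally preferable to a chi-square approximation, which relies on sufficiently large expected counts.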

## BEHAVIORAL VERSUS PHYSIOLOGICAL RESPONSES

Of specific note is that none of the four studies on visceral placebo analgesia ever produced sex differences at the behavioral (pain report) level, but one showed sex-dependent brain correlates of a placebo intervention despite equal subjective pain reports (Theysohn et al., 2014): in contrast to men, women exhibited stronger responses in some brain regions (insular, prefrontal cortex) in anticipation of pain, but lower downregulation of activation in the same areas during pain; this may be indicative of the known higher pain sensitivity of females. An early PET study had demonstrated that females, when exposed to placebo, show significantly greater brain activation in the prefrontal cortex than males (Paulson et al., 1998). Further imaging studies showed that the (blinded) application of i.v. glucose induced dopamine and increased glucose binding in the striatum in men but not in women (Haltia et al., 2008) and differentially affected blood pressure between sexes (Haltia et al., 2007), pointing to a mechanism at the CNS level similar to that in the Theysohn et al. study. Sex differences have also been shown to exist for the opioid system (Niesters et al., 2010), further supporting and explaining these differential effects against the background of the proven involvement of the opioid (Sauro and Greenberg, 2005) and dopamine systems (Scott et al., 2007, 2008) in placebo analgesia.

TABLE 4 | Nocebo experiments reporting sex in healthy volunteers either by verbal induction or conditioning of the response (data in part from Vambheim and Flaten, 2017, supplemented by further studies).

No (Fem), Total number of volunteers included (number of females); Intervention, Placebo (Pla), nocebo (Noc); Sex, M = males, F = Females; AE, Adverse events.

Other neuro-endocrine mediators have been proposed to contribute to the placebo response; among the first were nitric oxide (NO) (Stefano et al., 2001; Fricchione and Stefano, 2005), oxytocin (OXT) (Enck and Klosterhalfen, 2009), the endocannabinoid system (Benedetti et al., 2011), and CCK (Benedetti et al., 2006). While for the first (NO) empirical proof has never been presented, the involvement of OXT has been shown (Kessner et al., 2013; Colloca et al., 2015; Tracy et al., 2017), though not without contradictory data: while OXT enhanced placebo analgesia, especially in women (Colloca et al., 2015; Tracy et al., 2017), it did not do so in dermal itch (Skvortsova et al., 2018). Its greater action in women supports the behavioral finding of smaller effects in women in pain challenges, as compared to men: mere verbal suggestion of beneficial effects of presumed analgesics (in fact, placebos) is not sufficient to induce analgesia in women, but requires additional trust, mediated by OXT.

For CCK, the involvement in nocebo hyperalgesia has only been shown in one study so far (Benedetti et al., 2006), and for the endocannabinoid system, supporting evidence has been provided by Pecina et al. (2014). Specifically for placebo and nocebo effects on symptoms of hypobaric pressure (high altitude) sickness, the involvement of prostaglandins has been shown (Benedetti et al., 2014), but none of these neuroendocrine mediators produced differential effects between men and women.

## SEX EFFECTS ON EXPERIMENTER–VOLUNTEER INTERACTIONS

In a sham-acupuncture trial with one male and one female therapist, the female acupuncturist induced greater trust than the male in having received true acupuncture (White et al., 2003). In the re-evaluation of an RCT in 120 IBS patients, the female physician produced greater symptom improvements in both the drug and the placebo arm of the trial than her two male colleagues, and more female than male patients responded to placebo (Enck et al., 2005). Both studies point toward a potential role of the sex of doctors in placebo responses, but cannot prove it.

In an experimental pain study by Kallai et al. (2004), a significant interaction between the sex of the experimenters (4 male, 4 female) and the sex of the volunteers (80 male, 80 female) on pain tolerance (cold pressor test) indicated that subjects tolerated pain longer when investigated by an experimenter of the opposite sex. A significant main effect was found for sex of the experimenter: higher pain intensities and higher pain tolerance were found with female experimenters.

The first experiment in a placebo research setting (Flaten et al., 2006) noted higher placebo analgesia in males than in females following verbal manipulation of expectancies; in this experiment, five female nurses served as experimenters. To further explore sex differences in pain perception, the group included experimenters of both sexes (n = 3 each) in another experiment (Aslaksen et al., 2007) and found a significant interaction between both factors, in that female experimenters produced higher placebo analgesia in male volunteers than in female volunteers, while male experimenters did not produce similar responses in either male or female participants. This was not reflected in physiological data (heart rate), indicating, according to the authors, that the sex effect seen is probably due to psychosocial factors. In a third placebo analgesia experiment, this time with 8 experimenters (4 females) and 64 volunteers (32 females), distributed in a balanced fashion, the dominant male response to female experimenters was not replicated. Instead, a significant sex (experimenter) × sex (volunteer) interaction was found, with a larger placebo analgesic response in males reporting to male experimenters than in males reporting to female experimenters. With respect to pain reports (but not to placebo analgesia), however, the influence of experimenter sex persisted: male participants reported lower pain to female experimenters than to male experimenters, in line with previous studies, and there was a significant main effect of experimenter sex, with lower pain reports to female experimenters than to male experimenters.

Further evidence for a sex-by-sex interaction comes from two other placebo experiments; however, except for Flaten et al. (2006), neither study systematically varied the number and sex of the experimenters, and it may well be that the effects seen are therefore not sex- but personality-linked. Stumpf et al. (2016) noted no sex difference in the placebo response for itch, but a difference between the one male and the one female investigator with respect to the exaggerated verbal suggestion and the respective control conditions, with the female experimenter producing larger flare sizes in the histamine condition. In a nausea study by Weimer et al. (2012) that provided verbal information on the anti-emetic effect of ginger (placebo), men who received placebo responded more strongly to placebo information when it was provided by the male experimenter, and to ginger information when it was provided by the female experimenter; no such effect was seen in females. One explanation provided by the authors is that women's behavior is more strongly connected to their symptoms (and to information provided) than men's behavior.

## WHY DO PAIN AND NAUSEA APPEAR PRONE TO (OPPOSITE) SEX DIFFERENCES IN PLACEBO/NOCEBO RESPONSE?

Placebo and nocebo effects, as has been shown by many experimental investigations, can reliably be elicited in healthy volunteers, with many experimental paradigms, verbally induced or conditioned, but specifically with pain and nausea. At the same time, only pain and nausea have been shown to be reliably affected by sex, and two opposite conclusions can be drawn from the data discussed above:


For both conclusions, a rational concept is needed, despite the fact that they are based on only a few experiments from only a few placebo research groups, not necessarily interested in sex and gender differences per se.

For one, the distribution of research paradigms displayed above (**Tables 3**, **4**) may be biased by arbitrary or rational selection processes: the choice to investigate placebo analgesia (instead of placebo responses in other areas of medicine) is determined, among other things, by the simplicity of testing pain under laboratory conditions through a variety of techniques, all (or many) of which can also be transferred into brain scanners and other advanced research technology. As we have elucidated before (Enck et al., 2018), our own decision to focus on nausea and a rotation paradigm was made before this was labeled placebo research (in 2004), as was our interest in sex differences, e.g., of nausea susceptibility (Stockhorst et al., 1998; Klosterhalfen et al., 2000).

Both pain and nausea were among the earliest clinical conditions that gained interest for their strong placebo responsiveness, as early reports from Beecher (1955) and Wolf (1959) indicate. At the same time, pain and nausea are among the most frequent symptoms reported in medicine, be it in clinical practice as subjective symptoms in many somatic and functional diseases (Enck et al., 2016, 2017), or as adverse events or patient-reported outcomes in RCT of drugs and other interventions, including in the placebo arms of trials (Rief et al., 2006). Moreover, both symptoms lack a biological correlate (biomarker) that can be used reliably to measure them, so that medicine still relies on subjective assessment of their nature (threshold, tolerance, intensity) (Weimer et al., 2014; Saltychev et al., 2016).

Neither symptom is a disease in its own right; rather, both are indicative of an underlying process that requires medical attention and explanation, and only as a chronic condition (without such a process) do they become markers of a disease, as in chronic pain or recurrent nausea and vomiting. Nausea has been called a maladaptation symptom, e.g., in the context of motion sickness (Lackner, 2014). For women especially, nausea has an additional health relevance not apparent for men: nausea may be indicative of pregnancy at an early stage, and may serve as an evolutionarily conserved biological warning signal in the interest of the safety of the unborn life.

The apparent difference between men and women with regard to nausea on the one hand, and to verbally induced or conditioned responses on the other, is best illustrated by the Klosterhalfen et al. (2009) experiment, where we showed that women respond to conditioning of nausea symptoms much better than men, while men were more susceptible to verbally induced symptom provocation. The obvious interpretation of these differences is that for women, learning mechanisms dominate, and previously learned content remains relevant, while in men, acutely provided information is of higher relevance than past experiences. This may also explain the higher susceptibility of men to verbally induced placebo analgesia, despite their lower overall pain sensitivity.

Three more experiments from our pre-placebo research tradition may further illustrate the importance of sex for nausea experience. In a study using a circular-vection drum to induce nausea (Klosterhalfen et al., 2008), we found that women responded more strongly to the stimulus while sitting, while in men, the lying position was much more aversive. Significant differences between sexes were also found for habituation to repetitive rotation exposure: both endocrine and inflammatory markers habituated differently between men and women with multiple (five) rotations on the same day, with increases in men and decreases in women in the first session versus increases in both men and women in the last session (Rohleder et al., 2006). With rotations repeated over (five) consecutive days (Meissner et al., 2009), males responded more strongly on day 1 and showed reduced responses on days 2 and 3, while women responded more strongly on day 3 than on days 1 and 2. For days 4 and 5, these trends reversed, again differentially between sexes.

All these data have led us to believe that both psychological and biological factors contribute to nausea reports in these experimental situations and interact in rather complex ways, presumably involving other factors that our experiments did not completely control for (Klosterhalfen et al., 2005, 2006).

## THE APPARENT DISCREPANCY BETWEEN RCT AND EXPERIMENTS REQUIRES AN EXPLANATION

In 2012, Franconi et al. (2012) stated that the available data were too preliminary to reach a definitive conclusion, but that a sex effect on placebo responses is conceivable. In 2013, Weimer et al. (2015) found that sex effects on placebo responses in RCT across medicine and its subspecialties are not visible and can therefore be ignored. A few years later, the evidence for sex effects in experimental work on placebo and nocebo effects has substantially strengthened, as we show above, but still remains poor for clinical RCT data. This apparent discrepancy between RCT and experimental data also needs an explanation.

The best explanation that we can provide today refers to the different nature of experiments on the one hand and RCT on the other. In a well-planned experiment, the separation of expectancy manipulation and learning/conditioning, the two main underlying mechanisms of the placebo response, can be achieved, and the relative contribution of either can be explored. For instance, this allowed Colloca and Benedetti (2009), and others, to directly compare the relative potency of a novel learning mechanism for placebo analgesia (social observation) with the other two (expectation and conditioning).

In a randomized placebo-controlled trial, in contrast, the amount and degree of factors referring to patients' learning (medical history, previous therapies and their success and/or failure, duration of knowing the treating doctor, etc.) and to expectancies delivered and associated with the treatment (informed consent and AE reports, symptom diaries, number and intensity of doctor-patient contacts etc.) is neither known nor balanced, and may vary from patient to patient as well, e.g., in relation to his/her social environment and the "placebo by proxy" influences (Grelotti and Kaptchuk, 2011). Under these circumstances it is conceivable that any existing differences in placebo responsiveness between the sexes are averaged out in RCT, and result in equally sized placebo effects in men and women, as we have seen.

## AUTHOR CONTRIBUTIONS

PE and SK had the idea for the paper and wrote the manuscript. PE extracted the literature.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Enck and Klosterhalfen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Gender Differences in Generating Cognitive Reappraisals for Threatening Situations: Reappraisal Capacity Shields Against Depressive Symptoms in Men, but Not Women

Corinna M. Perchtold<sup>1</sup>\*, Ilona Papousek<sup>1</sup>, Andreas Fink<sup>1</sup>, Hannelore Weber<sup>2</sup>, Christian Rominger<sup>1</sup> and Elisabeth M. Weiss<sup>1</sup>

<sup>1</sup> Department of Psychology, University of Graz, Graz, Austria, <sup>2</sup> Department of Psychology, University of Greifswald, Greifswald, Germany

#### Edited by:

Annie Duchesne, University of Northern British Columbia, Canada

#### Reviewed by:

Gaurav Suri, San Francisco State University, United States; Eric S. Allard, Cleveland State University, United States

\*Correspondence: Corinna M. Perchtold corinna.perchtold@uni-graz.at

#### Specialty section:

This article was submitted to Emotion Science, a section of the journal Frontiers in Psychology

Received: 12 October 2018 Accepted: 27 February 2019 Published: 15 March 2019

#### Citation:

Perchtold CM, Papousek I, Fink A, Weber H, Rominger C and Weiss EM (2019) Gender Differences in Generating Cognitive Reappraisals for Threatening Situations: Reappraisal Capacity Shields Against Depressive Symptoms in Men, but Not Women. Front. Psychol. 10:553. doi: 10.3389/fpsyg.2019.00553

Despite major research interest regarding gender differences in emotion regulation, it is still not clear whether men and women differ in their basic capacity to implement specific emotion regulation strategies, as opposed to indications of the habitual use of these strategies in self-reports. Similarly, little is known on how such basic capacities relate to indices of well-being in both sexes. This study took a novel approach by investigating gender differences in the capacity for generating cognitive reappraisals in adverse situations in a sample of 67 female and 59 male students, using a maximum performance test of the inventiveness in generating reappraisals. Participants' self-perceived efficacy in emotion regulation was additionally assessed. Analyses showed that men and women did not differ in their basic capacity to generate alternative appraisals for anxiety-eliciting scenarios, suggesting similar functional cognitive mechanisms in the implementation of this strategy. Yet, higher cognitive reappraisal capacity predicted fewer depressive daily-life experiences in men only. These findings suggest that in the case of cognitive reappraisal, benefits for well-being in women might depend on a more complex combination of basic ability, habits, and efficacy-beliefs, along with the use of other emotion regulation strategies. The results of this study may have useful implications for psychotherapy research and practice.

Keywords: cognitive reappraisal, emotion regulation, gender differences, depression, maximum performance

## INTRODUCTION

Among the most pervasive differences between men and women in the realm of emotion is women's heightened vulnerability toward the development of affective disorders, in particular depression and anxiety (e.g., Nolen-Hoeksema, 2001; Kessler et al., 2007; Steel et al., 2014). Over the years, this female proneness to depressive symptoms has been attributed to heightened emotional reactivity toward negative stimuli (Bradley et al., 2001; Kessler, 2003; Kelly et al., 2008) as well as potentially maladaptive emotion regulation (e.g., Garnefski et al., 2004; Nolen-Hoeksema, 2012), both behaviorally and on the level of the brain (e.g., Domes et al., 2010; Whittle et al., 2011; Stevens and Hamann, 2012). However, consistent empirical support for sex differences especially in emotion regulation that may in turn elucidate gender differences in several types of psychopathology is sparse (see Nolen-Hoeksema and Aldao, 2011; Whittle et al., 2011; Zimmermann and Iwanski, 2014). This, along with increasing recognition that deficient emotion regulation is at the core of various disorders (Martin and Dahlen, 2005; Aldao and Nolen-Hoeksema, 2010; Hofmann et al., 2012; Berking et al., 2014; Joormann and Stanton, 2016), highlights the need for more in-depth investigations on gender differences<sup>1</sup> in the proficiency of implementing certain emotion regulation strategies.

One emotion regulation strategy that merits special attention in this case is cognitive reappraisal. Cognitive reappraisal aims at changing the emotional impact of a situation by deliberately viewing it from a different perspective by using alternative situational interpretations (e.g., Lazarus and Alfert, 1964; Lazarus and Folkman, 1984; Gross and John, 2003). Converging evidence from multiple studies has shown that cognitive reappraisal is particularly powerful in dealing with adverse events, sustainably regulating negative affect and decreasing depressive symptoms (e.g., Martin and Dahlen, 2005; Augustine and Hemenover, 2009; Troy et al., 2010; Webb et al., 2012). In this respect, Martin and Dahlen (2005) found that independent of gender, higher self-reported positive reappraisal predicted lower depressive symptoms, while Troy et al. (2010) showed that cognitive reappraisal protected against depressive symptoms during stressful life events. Meta-analyses corroborated these findings, with Augustine and Hemenover (2009) demonstrating links between cognitive reappraisal and large hedonic shifts in affect (defined as decreases in negative or increases in positive emotions and indexed by self-report). These findings were supported by the meta-analysis of Webb et al. (2012), who reported cognitive reappraisal to be highly effective in modifying emotional outcomes on behavioral and physiological levels as well. While this invites assumptions that a higher prevalence of depression in women may partly originate from less frequent or less effective use of cognitive reappraisal, available data are mixed. According to some studies, women employ cognitive reappraisal on a more frequent basis than men do (e.g., Tamres et al., 2002; Spaapen et al., 2014; also see Nolen-Hoeksema, 2012), though in the meta-analysis of Tamres et al. (2002), this effect was reported for most emotion regulation strategies. 
These findings are, however, challenged by others that report no gender differences in the habitual use of cognitive reappraisal (Gross and John, 2003; Haga et al., 2009; Zlomke and Hahn, 2010), or even endorse more positive re-interpretations in men (Öngen, 2010). Research on gender-specific effects of cognitive reappraisal use on depressive symptoms during adolescence yielded disparate results as well, either denoting cognitive reappraisal equally effective in attenuating depressive symptoms in both men and women (Shapero et al., 2018) or suggesting that greater habitual use of cognitive reappraisal more strongly decreases depressive symptoms in adolescent girls than boys (Duarte et al., 2015).

One possible explanation for these inconclusive results is that, while having provided vital evidence, these approaches mainly focused on self-reported tendencies to use cognitive reappraisal, thereby neglecting potential gender differences in actual capacity to adequately implement cognitive reappraisals in critical situations (e.g., Perchtold et al., 2018). Several researchers pointed out that individuals' typical reappraisal use in daily life cannot be equated with their actual capacity to use this strategy when confronted with adverse scenarios, given the absence of or only weak correlations between the two (McRae et al., 2008; Troy et al., 2010; Weber et al., 2014). However, despite numerous appeals for more objective performance measures of individuals' actual emotion regulation capacity (Demaree et al., 2006; McRae et al., 2008; Whittle et al., 2011; Opitz et al., 2015), few efforts have been made in that direction. Thus, assumptions that men and women may differ in their basic capacity for cognitive reappraisal remain rather speculative to date. In an attempt to add some clarity to the picture, two brain imaging studies (McRae et al., 2008; Domes et al., 2010) specifically investigated sex differences in neural correlates of instructed cognitive reappraisal, albeit with different outcomes. McRae et al. (2008) reported lower increases in prefrontal activity and greater decreases in amygdala activity during reappraisal efforts in men compared to women, despite similar attenuations of self-reported negative emotions in both sexes. Domes et al. (2010) found quite the opposite activation pattern, indicating greater prefrontal activity in men compared to women during cognitive reappraisal implementation, with no notable sex differences in amygdala activity or self-report regulation success. 
Intriguingly, both studies interpreted their results in terms of a more efficient reappraisal process in men, suggesting less effortful cognitive control (McRae et al., 2008) and more appropriate recruiting of regulatory areas (Domes et al., 2010) in men compared to women. Although this argument critically implicates executive control processes in effective reappraisal (Joormann and Gotlib, 2010; Malooly et al., 2013; Pe et al., 2013; Rominger et al., 2018), neither study used objective behavioral indicators of reappraisal capacity, making it difficult to put their findings into perspective. Altogether, the question whether men and women differ in their basic capability for implementing alternative appraisals in critical situations is thus still unanswered.

<sup>1</sup>We adopted the current definitions of sex and gender, according to which sex is considered a biological component, which is defined via the genetic complement of chromosomes, whereas gender refers to the social, environmental, cultural, and behavioral factors and choices that influence a person's self-identity and health (Clayton and Tannenbaum, 2016; National Institute of Health Office of Research on Women's Health, 2019). Since it cannot be determined that any of the effects discussed in this study are caused by biological factors alone, differences between men and women are referred to as "gender differences." This does not, however, exclude the possibility that biological and social factors may interact in explaining the present results. If cited literature addressed sex or gender differences, their wording was adopted.

The present study aims to address this gap in the literature by investigating gender differences in the basic capacity for generating cognitive reappraisals. Moreover, it was examined how this capacity relates to individuals' depressive daily-life experiences. More precisely, we sought to determine whether cognitive reappraisal capacity may serve as a predictor of depressive experiences in daily life also over and above individuals' self-efficacy in the regulation of emotions, and whether this holds for both genders in a similar way. In this study, we used the Reappraisal Inventiveness Test (RIT; Weber et al., 2014), which confronts individuals with self-relevant, threatening situations and instructs them to produce as many different cognitive reinterpretations as possible in order to downregulate their experienced stress and anxiety. Importantly, by using the RIT, our focus was on gender differences in reappraisal capacity in the psychometric sense, that is, to what degree men and women are theoretically capable of implementing cognitive reappraisal in aversive situations (maximum performance, Cronbach, 1970). Objective coding of participants' reappraisal ideas in terms of appropriateness (see Demaree et al., 2006) then results in an index of reappraisal capacity. This capacity can be referred to as basic or fundamental, as it delineates an individual's basic cognitive potential to construct different interpretations for given situations in the first place (i.e., a construction competence), allowing for more flexibility in coping with everyday challenges (Weber et al., 2014). In this regard, studies have linked higher cognitive reappraisal capacity to more appropriate recruitment of the lateral prefrontal cortex during emotion regulation efforts (Papousek et al., 2017), which also predicted self-perceived chronic stress levels (Perchtold et al., 2018). This corroborates the notion that this brain-based cognitive reappraisal capacity may affect more distal emotional outcomes like stress perception and by implication, possibly depressive experiences. Thus, cognitive reappraisal capacity likely constitutes a necessary prerequisite for effective reappraisal implementation in daily life (Weber et al., 2014; de Assuncao et al., 2015; Papousek et al., 2017). However, in this regard, two things need to be considered.
Firstly, in daily life, it might occasionally seem more relevant to produce one high-quality reappraisal than a variety of different reappraisals to effectively diminish the emotional impact of aversive situations. Yet, it can be argued that the capacity to generate a large pool of potential reappraisals for a given situation makes it more likely to select reappraisals individuals can effectively implement in this specific context (also see Wisco and Nolen-Hoeksema, 2010). Having a broad repertoire of potential reappraisals readily available may be especially relevant when individuals face new situations, in which they cannot rely upon their routine strategies (Weber et al., 2014). Secondly, though considered a vital prerequisite for effective cognitive reappraisal implementation, reappraisal capacity only covers a certain aspect of the reappraisal process, as individuals not only need to be principally capable of constructing various situational appraisals, they also need to make use of this ability in daily life (Perchtold et al., 2018). Conversely, however, if individuals' basic capacity for cognitive reappraisal generation is impaired, habitual use of cognitive reappraisal in daily life may not yield any benefits, and reappraisal training, e.g., in cognitive behavioral therapy, may not be sufficiently effective.

To the best of our knowledge, no study to date has tested gender differences in the explicit ability to ad hoc generate cognitive reappraisals for adverse situations. Moreover, given equivocal evidence from literature as to sex differences in executive control processes relevant to emotion regulation (McRae et al., 2008; Domes et al., 2010; Franklin et al., 2018), we did not have strong a priori predictions regarding which gender would show better cognitive reappraisal capacity and how this capacity would relate to depressive symptoms in men and women. In line with available literature, however, we did hypothesize that women would report more depressive experiences than men (Nolen-Hoeksema, 2001; Van de Velde et al., 2010; Salk et al., 2017) and conversely, less self-efficacy in emotion regulation (e.g., Freudenthaler and Papousek, 2013). A relationship between cognitive reappraisal capacity and self-efficacy beliefs seems likely, with self-efficacy potentially acting as the decisive variable for daily-life experience of depression. In this regard, previous research reported substantial correlations between perceived self-efficacy in emotion regulation and various indexes of well-being (see Baudry et al., 2018). Additionally, in light of recent findings that some cognitive reappraisal strategies (e.g., positive reinterpretations) might be more adaptive than others as regards implications for well-being (Kalisch et al., 2015; Willroth and Hilimire, 2016; Perchtold et al., 2018), we tested for gender differences in the quality of generated reappraisals (positive re-interpretation, de-emphasizing, problem-orientation, symptom re-interpretation).

## MATERIALS AND METHODS

## Participants

The sample comprised 126 participants (67 women, 59 men), aged between 18 and 35 (M = 22.42, SD = 3.15). All participants were university students enrolled in various fields. No participant reported using drugs or psychoactive medication and none had participated in an experiment using the RIT before. Thirty women reported the use of hormonal contraceptives, with n = 25 using the contraceptive pill (duration of use: M = 3.86 years; SD = 2.59), and n = 5 using intrauterine devices (duration of use: M = 2.04 years; SD = 1.16). The study was approved by the authorized ethics committee. Participants gave their written consent to participate in the study. After receiving general instructions, participants completed the RIT and questionnaires.

## Reappraisal Inventiveness Test (RIT)

The RIT (Weber et al., 2014) is a maximum performance test for cognitive reappraisal ability that confronts individuals with adverse emotional situations likely to occur in their everyday lives. Participants are instructed to imagine the situation happening to them and to generate and write down as many different ways as possible to think about the situation in a way that diminishes their negative emotions. In the present study, four vignettes depicting anxiety-eliciting situations (de Assuncao et al., 2015) were presented one at a time on separate pages and were supplemented by a picture in order to make them more vivid. For each vignette, participants were given 20 s to imagine the situation happening to them and then turned to the next page at the signal of the experimenter. Subsequently, participants wrote down as many different ways as possible to reappraise the situation, with the goal of diminishing anxiety, until the allotted time of 3 min per situation had elapsed. In the night item of the RIT (situation 1), for instance, participants face the following situation: "At night, you lie alone in bed and are about to fall asleep, when you suddenly hear a loud noise from the living room. You get up, go into the living room and realize that the window is open." In the other situations, individuals are confronted with walking home alone at night (2), a root canal appointment (3), and a smoke alarm going off at the neighbors' (4). For the assessment of behavioral measures of their reappraisal inventiveness, participants' responses to the RIT items were independently rated by two experienced experimenters, who received extensive training beforehand.

## Cognitive Reappraisal Capacity

Following the scoring procedure of the RIT and previous relevant research (Weber et al., 2014; Fink et al., 2017; Papousek et al., 2017; Perchtold et al., 2018; Rominger et al., 2018), RIT-fluency was used as an index of cognitive reappraisal capacity, calculated as the total number of generated non-identical reappraisals (α = 0.93). On average, participants generated M = 22.12 (SD = 5.23) valid reappraisals. The number of reappraisal ideas generated for each of the four situations differed slightly, with significantly fewer ideas generated for situation 4 (M = 5.12) than for the rest (situation 1: M = 5.75, p < 0.001; situation 2: M = 5.74, p < 0.001; situation 3: M = 5.51, p = 0.060). The inter-rater reliability with two-way random, single measure ICC (95% confidence intervals, consistency) was ICC = 0.99 for overall RIT-fluency. Reappraisals were additionally categorized according to the category scheme of the RIT (Weber et al., 2014), which allows for a more profound categorization of reappraisal ideas according to content. The four reappraisal categories in the RIT are: positive re-interpretation (generating positive aspects; M = 8.64, SD = 4.46; e.g., "Now that I am awake, I get to do some stargazing"), de-emphasizing (trivializing the impact of the situation; M = 9.54, SD = 4.03; e.g., "Why would someone break into my apartment, I do not own anything valuable"), problem-orientation (finding ways to reduce harm; M = 3.27, SD = 3.47; e.g., "I have my phone, I can call for help anytime"), and symptom re-interpretation (reappraising physical arousal; M = 0.35, SD = 0.62; e.g., "My heart is just beating rapidly because I got out of bed so fast"). For more example answers matched to their respective category, please see **Supplementary Appendix**. Other reappraisal ideas not matching these four categories were excluded due to lack of respective answers generated by the participants.
Inter-rater reliabilities were ICC = 0.96, ICC = 0.95, ICC = 0.97, and ICC = 0.89 for positive re-interpretation, de-emphasizing, problem-orientation, and symptom re-interpretation, respectively. After completion of all vignettes, participants rated the extent of anxiety they would experience when confronted with the depicted situations (7-point scales ranging from 0 "not anxious at all" to 6 "very anxious"). Ratings were M = 3.56 (SD = 1.78), M = 3.40 (SD = 1.68), M = 2.72 (SD = 1.78), and M = 3.18 (SD = 1.47). In one-sample t-tests, ratings for all vignettes differed significantly from zero (t-values ranging from 17.14 to 24.27, all p-values <0.001), indicating that all situations were indeed perceived as anxiety evoking. Situation 3 (M = 2.72) was perceived as significantly less anxiety evoking than situation 1 (p = 0.003) and situation 2 (p = 0.017).
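The reliability coefficients above are described as two-way, single-measure, consistency ICCs. As a rough illustration of what such a coefficient computes, the following sketch implements the usual mean-squares form, ICC(C,1) = (MS_rows − MS_error) / (MS_rows + (k − 1) · MS_error), for a participants × raters table. The function name and the toy ratings are hypothetical and are not taken from the study, which presumably relied on a statistics package.

```python
def icc_consistency_single(ratings):
    """ICC(C,1): two-way model, single measure, consistency definition.

    `ratings` is a list of rows: one row per participant, one column per rater.
    """
    n = len(ratings)      # number of participants
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into participants, raters, and error.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Toy fluency counts from two hypothetical raters (not study data).
toy = [[22, 22], [18, 19], [25, 25], [30, 29], [15, 15]]
icc = icc_consistency_single(toy)
print(f"ICC(C,1) = {icc:.3f}")
```

Because the consistency definition removes the rater main effect, two raters who differ only by a constant offset would still yield a coefficient of 1.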

## Self-Report Measures

## Depression

The Center for Epidemiologic Studies Depression Scale (CES-D, German version; Hautzinger and Bailer, 1993) is comprised of 20 items, rated from 0 (rarely or none of the time – less than 1 day) to 4 (most or all the time – 5 to 7 days; α = 0.90). It refers to mood and attributions over the past week and is designed for measuring sub-clinical depressive daily-life experiences in the general population (Wood et al., 2010). Scores ranged from 0 to 37 (M = 12.05, SD = 7.0).

## Perceived Efficacy in Managing Negative Emotions

The emotion regulation subscale of the Self-report Emotional Ability Scale (SEAS; Freudenthaler and Neubauer, 2005) was used to assess how able individuals feel to regulate negative affect in their everyday life (e.g., "It is easy for me to change my bad mood"). The 6 items are rated on 6-point Likert scales ranging from 1 to 6 (α = 0.75). Scores ranged from 9 to 34 (M = 22.51, SD = 5.17).

## Statistical Analysis

In order to investigate basic gender differences in the central variables of interest (cognitive reappraisal capacity, self-efficacy in managing negative emotions), two independent sample t-tests were computed. Subsequently, a three-step hierarchical multiple regression analysis was employed with depression as the dependent variable. In the first step, gender was entered as a predictor of depressive daily-life experiences. The second step added reappraisal capacity and self-efficacy in emotion regulation as predictors, with the third step additionally considering interactions of gender and reappraisal capacity, as well as of gender and perceived self-efficacy in managing negative emotions. The applied hierarchical regression approach allowed us to examine, firstly, whether men and women differ in the amount of depressive experiences in their everyday lives (first step). Secondly, it examined whether gender differences in depressive experiences are explained by individual differences in reappraisal capacity and/or self-efficacy in managing negative emotions, and whether these variables as such are related to depression (i.e., explain unique variance in the amount of depressive experiences beyond that afforded by gender differences; second step). Thirdly, it allowed us to examine whether potential relationships of reappraisal capacity and self-efficacy in managing negative emotions with depression are differently expressed for men and women (third step of the hierarchical regression). The statistical assumptions for the model (i.e., ratio of cases to independent variables, normality, independence of errors, homoscedasticity, linearity, and absence of multicollinearity) were met. A significance level of p < 0.05 (two-tailed) was used. Additionally, a multivariate analysis of variance was computed to test for potential gender differences in the patterns of used reappraisal categories (number of reappraisals qualifying as positive re-interpretation, de-emphasizing, problem-orientation, and symptom re-interpretation).
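The three-step structure described above can be sketched as a sequence of nested least-squares models whose R² is compared from step to step. This is only a minimal illustration on synthetic placeholder data: the variable names, the 0/1 gender coding, and the use of NumPy are assumptions made here, not details reported by the authors.

```python
import numpy as np


def r_squared(predictors, outcome):
    """R^2 of an ordinary least-squares fit (an intercept column is added here)."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((outcome - outcome.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic placeholder data, NOT the study's data.
rng = np.random.default_rng(0)
n = 126
gender = rng.integers(0, 2, n).astype(float)  # hypothetical coding: 0 = men, 1 = women
capacity = rng.normal(0.0, 1.0, n)            # standardized reappraisal capacity
efficacy = rng.normal(0.0, 1.0, n)            # standardized self-efficacy
depression = rng.normal(0.0, 1.0, n)          # arbitrary outcome values

# Step 1: gender only. Step 2: add main effects. Step 3: add interactions.
step1 = np.column_stack([gender])
step2 = np.column_stack([gender, capacity, efficacy])
step3 = np.column_stack([gender, capacity, efficacy,
                         gender * capacity, gender * efficacy])

r2 = [r_squared(x, depression) for x in (step1, step2, step3)]
delta_r2 = [r2[0], r2[1] - r2[0], r2[2] - r2[1]]
for step, (total, gain) in enumerate(zip(r2, delta_r2), start=1):
    print(f"step {step}: R^2 = {total:.3f}, change = {gain:.3f}")
```

Because each step only adds predictors to the previous one, R² cannot decrease from step to step; the increment at each step is the quantity the hierarchical approach interprets.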

## RESULTS

## Basic Gender Differences in Cognitive Reappraisal Capacity and Self-Efficacy in Managing Negative Emotions

In terms of perceived self-efficacy in managing negative emotions, men reported significantly higher self-efficacy than women [men: M = 24.46, SD = 4.93; women: M = 20.79, SD = 4.77; t(124) = 4.24, p < 0.001]. However, men and women did not differ in their basic capacity to generate cognitive reappraisals for anxiety-eliciting events [men: M = 5.55, SD = 1.24; women: M = 5.52, SD = 1.37; t(124) = 0.118, p = 0.906]. Moreover, while women reported feeling greater anxiety elicited by the presented scenarios [men: M = 2.81, SD = 0.12; women: M = 3.58, SD = 0.69; t(124) = −5.27; p < 0.001], this self-reported anxiety was uncorrelated with performance on the reappraisal test (r = −0.07, p = 0.468). No significant differences in any variables of interest were observed between women who did and those who did not report using hormonal contraceptives (all p's > 0.140).
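As a plausibility check, the self-efficacy comparison can be recomputed from the reported summary statistics alone. The sketch below assumes a pooled-variance Student's t-test, which the reported 124 degrees of freedom (59 + 67 − 2) suggest; the function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent samples, using the pooled variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Self-efficacy: men (n = 59) versus women (n = 67), values as reported above.
t_value, df = pooled_t(24.46, 4.93, 59, 20.79, 4.77, 67)
print(f"t({df}) = {t_value:.2f}")  # prints "t(124) = 4.24"
```

Recomputations of this kind work from rounded means and standard deviations, so small deviations from a reported test statistic are to be expected.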

## Relationships Between Cognitive Reappraisal Capacity and Self-Efficacy in Managing Negative Emotions With Depressive Experiences in Men and Women

**Table 1** summarizes the findings of the hierarchical regression analysis. At step one, gender significantly correlated with the amount of depressive experiences [r = 0.21; F(1,124) = 5.57, p = 0.020], indicating that, overall, women reported more depressive experiences than men (men: M = 10.51, SD = 6.35; women: M = 13.40, SD = 7.30). In addition to gender, reappraisal capacity and self-efficacy explained an additional 21% of the variance in depressive experiences [F(3,122) = 13.67, p < 0.001]. While both of these variables explained unique portions of variance in depression (reappraisal capacity: sr = −0.18, p = 0.027; self-efficacy: sr = −0.43, p < 0.001), the contribution of gender became non-significant (sr = 0.04, p = 0.621) once reappraisal capacity and self-efficacy were included in the model. Together, this suggests that the observed gender differences in reported depressive experiences are to a large part attributable to differences in self-efficacy in emotion regulation. Overall, higher scores in self-efficacy as well as in cognitive reappraisal capacity were associated with fewer depressive experiences. Entering the interaction terms reappraisal capacity by gender and self-efficacy by gender in the model additionally increased the explained amount of variance in the experience of depression by 4% [F(5,120) = 9.82, p < 0.001]. Of the two interactions, only the contribution of the interaction reappraisal capacity by gender was significant (sr = 0.18, p = 0.020; self-efficacy by gender: sr = −0.07, p = 0.357). The significant interaction indicates that while a higher basic capacity for cognitive reappraisal generation for anxiety-eliciting situations was associated with lower self-reported depressive experiences in men, the capacity for reappraisal generation was unrelated to the experience of depression in women (men: r = −0.42, p < 0.001; women: r = 0.03, p = 0.820). See **Figure 1** for an illustration of the significant interaction effect.

Table 1 note. Dependent variable: amount of depressive daily-life experiences (CES-D). Gender was scored such that the positive beta weight indicates that women reported more depressive experiences than men. For an illustration of the significant interaction effect see Figure 1. Full model: F(5,120) = 9.82, p < 0.001.
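The reported variance increments can be cross-checked against the F statistics given above. The sketch assumes that each reported F is the overall model test at its step (which the degrees of freedom suggest) and uses the identity R² = F · df_model / (F · df_model + df_error); the function and dictionary names are hypothetical.

```python
def r2_from_f(f_value, df_model, df_error):
    """Recover R^2 from an overall-model F statistic and its degrees of freedom."""
    return f_value * df_model / (f_value * df_model + df_error)


# (F, df_model, df_error) as reported for the three regression steps.
reported = {
    "step 1": (5.57, 1, 124),
    "step 2": (13.67, 3, 122),
    "step 3": (9.82, 5, 120),
}
r2_by_step = {name: r2_from_f(*stats) for name, stats in reported.items()}

gain_step2 = r2_by_step["step 2"] - r2_by_step["step 1"]
gain_step3 = r2_by_step["step 3"] - r2_by_step["step 2"]
print(f"R^2 gain at step 2: {gain_step2:.2f}")
print(f"R^2 gain at step 3: {gain_step3:.2f}")
```

Under this assumption the recovered increments come out at roughly 0.21 and 0.04, in line with the 21% and 4% reported in the text.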

In light of evidence that the difficulty of cognitive reappraisal increases with the intensity of emotional situations (e.g., Sheppes et al., 2014), we additionally ran two separate hierarchical regression analyses for the lower and higher anxiety eliciting items. In both analyses, the previously observed interaction reappraisal capacity by gender remained significant (lower anxiety eliciting: sr = 0.19, p = 0.019; higher anxiety eliciting: sr = 0.16; p = 0.042), indicating that differences in anxiety ratings for the RIT vignettes did not influence the main findings of this study.

## Gender Differences in Use of Reappraisal Sub-Strategies

Men and women did not differ in their employment of different reappraisal strategies [F(4,121) = 1.04, p = 0.387]. See **Table 2** for a descriptive summary of the rates of generated reappraisal categories. On an exploratory basis, it was additionally examined which reappraisal strategies contributed most to the reporting of depressive daily-life experiences (standard multiple regression analysis). The generation of relatively more reappraisals categorized as de-emphasizing (sr = −0.16, p = 0.038) and positive reinterpretation (sr = −0.15, p = 0.064) was associated with fewer depressive experiences, whereas the use of problem orientation (sr = 0.03, p = 0.724) and symptom re-interpretation (sr = −0.10, p = 0.214) did not seem to play an important role on their own [F(6,119) = 7.68, p < 0.001]. This result was independent of variance explained by gender and self-efficacy in managing negative emotions.

TABLE 2 | Use of reappraisal strategies, expressed as percentage of total generated cognitive reappraisals.

## DISCUSSION

This study examined gender differences in the fundamental capacity to spontaneously generate alternative cognitive reappraisals for anxiety-eliciting scenarios as well as their potential relevance to depressive experiences in everyday life. In line with indications of greater emotional reactivity to negative information and stressful events in women than men as well as women's greater proneness to clinical depression (Bradley et al., 2001; Nolen-Hoeksema, 2001; Kessler et al., 2007; Kelly et al., 2008; Steel et al., 2014), women reported more depressive symptoms than men in the current study. Nevertheless, these differences were not reflected in basic reappraisal skills, as men and women demonstrated a similar capacity to generate meaningful alternative interpretations for adverse anxious events. This constitutes a novel finding in literature, as potential gender differences in emotion regulation capacity have never been scrutinized with a maximum performance test of reappraisal ability before. Despite previous studies hinting at a more efficient reappraisal process in men based on their prefrontal cortex engagement and related stronger executive functioning (e.g., McRae et al., 2008; Domes et al., 2010; also see Masumoto et al., 2016), this study yielded no evidence suggesting a potential advantage of men in the behavioral test for reappraisal inventiveness. Note that while greater reappraisal inventiveness does not automatically translate to efficacy in cognitive reappraisal, it may inform about vital cognitive prerequisites of efficient reappraisal implementation. Accordingly, based on their performance in this study, men and women presumably recruit similar functional executive processes during reappraisal generation, of which set-shifting, memory updating, and inhibition of dominant yet irrelevant responses are proposed as crucial building blocks for cognitive reappraisal (Joormann and Gotlib, 2010; Malooly et al., 2013; Pe et al., 2013). 
Since the importance of executive functions has also been endorsed by specific research on reappraisal inventiveness (Weber et al., 2014; Papousek et al., 2017; Perchtold et al., 2018; Rominger et al., 2018), our findings suggest equivalent executive functioning in both genders as regards cognitive reappraisal.

Interestingly, however, a higher capacity for reappraisal generation predicted fewer depressive symptoms in men only, while this effect was absent in women. Hence, our results indicate that while the genders do not differ in their basic reappraisal capacity, this capacity appears to be a protective buffer against depression in men only. Although it is premature to draw any firm conclusions from this novel observation, the trends in this study prompt us to speculate on some noncompeting explanations for this result. A possible explanation for the observed null effects of reappraisal capacity on depression in women could be linked with the finding of lower self-efficacy in managing negative emotions in women than in men. Substantial positive effects of emotion regulation self-efficacy on well-being are abundant in the literature (see Baudry et al., 2018 for review). Further, it was suggested that individuals with higher self-efficacy in emotion regulation put more effort into actively modifying their emotions and, hence, are prone to use effortful regulation strategies such as cognitive reappraisal more consistently (Tamir et al., 2007). On that note, findings showed that individuals regarding themselves as more capable of controlling their emotions were more prone to use cognitive reappraisal in their daily lives. Furthermore, those individuals who more persistently used cognitive reappraisal and scored higher on emotion regulation self-efficacy were more successful in downregulating negative emotions (Gutentag et al., 2017).

Thus, for the present study, the following tentative interpretation is suggested: Men, due to higher confidence in their emotion regulation skills, could generally show greater attempts to actively cope with adverse events, and use effortful active regulation strategies such as cognitive reappraisal with greater determination than women do. Thereby, they may benefit from good reappraisal capacity in terms of fewer depressive daily-life experiences. In contrast, good reappraisal capacity might be less significant for the experience of depression in women because based on lower self-perceived regulation skills, they show reduced emotion regulation attempts from the start. It is thus assumed that effort or motivation in using cognitive reappraisal may be more important than a more frequent employment of cognitive reappraisal alone, as suggested by several indications that men tend to report less habitual use of reappraisal than women (Tamres et al., 2002; Nolen-Hoeksema and Aldao, 2011; Spaapen et al., 2014), although this assumption is not corroborated by all studies (Gross and John, 2003; Haga et al., 2009; Zlomke and Hahn, 2010). Specifically for anxiety-eliciting situations, it is possible that women are less motivated to downregulate anxiety by means of cognitive reappraisal, since they are more prone to feelings of anxiety (e.g., McLean and Anderson, 2009) and are thus more likely to accept these feelings as part of their everyday lives. Complementing this assumption, despite good reappraisal capacity, women might also be less convinced of the effectivity of cognitive reappraisal in reducing their anxious feelings, which adds beliefs about consequences of cognitive reappraisal as another potential influencing factor (e.g., Ortner et al., 2017). 
Our data, however, can only partly support all these arguments, because we did not assess efforts put in the reappraisal task, beliefs in reappraisal effectiveness, and the preferred use of cognitive reappraisal as a trait (e.g., Gross and John, 2003).

Additionally, it can be derived from literature that women tend to report using both, adaptive and maladaptive emotion regulation strategies more than men (Thoits, 1991; Tamres et al., 2002; Nolen-Hoeksema and Aldao, 2011). While this at first underlines a supposedly more flexible repertoire of regulation strategies in women, there are also studies suggesting that maladaptive emotion regulation strategies (e.g., rumination, suppression) are more strongly linked to depression than are adaptive ones (e.g., cognitive reappraisal, acceptance; Aldao et al., 2010; Nolen-Hoeksema and Aldao, 2011; Joormann and Stanton, 2016). As a consequence, if women endorse more maladaptive regulation strategies than men, and if these strategies were eminently detrimental to mental well-being (e.g., Nolen-Hoeksema et al., 2008, also see Krause et al., 2017), good cognitive reappraisal capacity alone may not suffice to guard against the experience of depression in women, as the impact of concomitantly employed maladaptive strategies prevails. It is hence possible that in women, interactions between adaptive and maladaptive emotion regulation strategies have a more pronounced impact on depressive experiences than the capability to effectively implement one adaptive strategy per se.

In line with recent indications that some reappraisal strategies might be more adaptive than others in the long run (Kalisch et al., 2015; Perchtold et al., 2018, 2019), this study also examined gender differences in four reappraisal categories scored in the cognitive reappraisal test (Weber et al., 2014; de Assuncao et al., 2015). No differences emerged, however, despite some evidence that men more often employ problem-oriented coping strategies (Ptacek et al., 1994; Baker and Berenbaum, 2007), whereas women favor emotion-focused tactics (Lazarus and Folkman, 1984; Eaton and Bradley, 2008). It appears that these allegedly basic preferences are not reflected in reappraisal categories. Yet, further research is warranted to look more closely into potential gender differences among the myriad of available strategies that occur in cognitive reappraisal of aversive events (e.g., McRae et al., 2012; Perchtold et al., 2019). Independent of gender and other strategies, the generation of relatively more de-emphasizing reappraisals and positive re-interpretations was associated with fewer depressive experiences. This result supports previous studies that find both self-focused (de-emphasizing) and situation-focused (positive) reappraisal effective in reducing negative emotional reactivity (Shiota and Levenson, 2012; Ranney et al., 2017), although more long-term benefits are suggested for positive reappraisal (e.g., Kalisch et al., 2015).

Importantly, in the present study, the obtained differences in reappraisal capacity effects on depressive experiences between men and women cannot be definitively interpreted in terms of sex or gender. Cognitive reappraisal capacity reflects individuals' capability to recruit appropriate brain activation when faced with the demand of reappraising an aversive event (Papousek et al., 2017; Perchtold et al., 2018). Since no differences in this basic capacity were observed, this potentially also points to the absence of sex differences in recruitment of adequate brain circuits, as far as the inventiveness in generating alternative reappraisals is concerned. This inventiveness, however, is a necessary, but not a sufficient prerequisite for effective emotion regulation, since individuals not only need to be theoretically capable of generating suitable reappraisals for critical situations, they also need to do so when faced with these situations in daily life. Here, how men and women actually make use of their capabilities might critically depend on gender roles, which likely entail different beliefs in emotion regulation self-efficacy, reappraisal effectiveness, or controllability of stressors. However, these notions remain speculative until further investigation.

This study presents a novel approach for investigating gender differences in cognitive reappraisal by explicitly testing performance in generating alternative cognitive reinterpretations for anxiety-evoking situations. By drawing on an actual behavioral performance measure instead of self-reported data, our measure of reappraisal capacity is independent from the participants' ability or willingness to accurately report on their abilities. A post hoc power analysis confirmed that, at a power of 0.989, our results for women are unlikely to be skewed by a type II error. Some limitations of this study must be noted. Naturally, the capacity to generate multiple cognitive reappraisals as assessed in this study only covers a certain aspect of an individual's ability to effectively implement cognitive reappraisal for negative affect regulation. While specifically for situations that exceed routines, it can be assumed that the likelihood of effective reappraisal implementation increases with the pool of generated ideas, for recurrent negative events in daily life, the ability to repeatedly implement just one reappraisal in a successful manner may be equally or even more important. Yet, since recurrent anxiety-eliciting situations (e.g., walking home alone at night) are not always exactly alike, a high capacity to generate manifold reappraisals may still prove vital. Secondly, it may be questioned why depression and not anxiety was used as an outcome variable when testing gender-specific effects of cognitive reappraisal capacity for anxiety-eliciting situations. Depression and anxiety greatly overlap; they share a great proportion of their symptomatology, as well as common genetic and environmental contributors (e.g., Preisig et al., 2001; Kessler et al., 2005; Burton et al., 2015).
Yet, compared to anxiety, markedly more literature indicated correlations between depression and emotion regulation strategies, particularly cognitive reappraisal (e.g., Martin and Dahlen, 2005; Aldao et al., 2010; Joormann and Gotlib, 2010; Troy et al., 2010; Everaert et al., 2017). Thirdly, our claim that men and women possess similar cognitive reappraisal capacity and related executive functioning is based on experimentally instructed reappraisal within a limited time span. That is not to say that gender differences might not emerge when reappraisal time increases, perhaps as a function of cognitive effort, as was proposed by others (McRae et al., 2008; Domes et al., 2010). Thus, more fine-grained investigations into gender differences at specific stages of the cognitive reappraisal process are warranted that go beyond the presumably very early stage of generating multiple potential reappraisals scrutinized in this study (e.g., selection of a suitable reappraisal, implementation of that reappraisal, etc.). In this respect, scrutinizing the time-course of cognitive reappraisal by means of EEG may be particularly informative as regards (neural) efficacy of the reappraisal process in men and women. Next, this study's results are based on cross-sectional data, which do not allow causal interpretations of the relations. While the research background denotes cognitive reappraisal capacity as the cause and depressive experiences as the effect (e.g., Hofmann et al., 2012; Berking et al., 2014), circular mechanisms may also be at work. In this respect, other studies suggested that deficits in implementing effective emotion regulation strategies might also arise as a consequence of depressive episodes (e.g., Troy et al., 2010; Liu and Thompson, 2017). Additionally, sex hormones and phases of the menstrual cycle are known to affect emotional responding, including emotion regulation strategy choice (Toffoletto et al., 2014; Graham et al., 2018). Although in the present study, women with and without use of hormonal contraceptives did not differ in any variables of interest, we did not control for menstrual cycle data in our analyses, which constitutes an important direction for future research. Moreover, although we attempted a comprehensive interpretation of our findings based on available literature, our propositions regarding potential influences of other variables on the relationship of reappraisal capacity and depressive symptoms (e.g., regulation effort, impact of other strategies) should be considered as preliminary until further studies demonstrate they significantly moderate the discussed effect. Also, note that our findings are restricted to reappraisal capacity in dealing with anxiety-eliciting events only. While reappraisal inventiveness can be regarded as a trans-emotional capacity that is not specific to certain emotions (de Assuncao et al., 2015), gender differences might nonetheless emerge for the downregulation of anger, disgust, or sadness. Thus, a vital goal for future research is to identify whether the relationships identified in this study also hold for other versions of the RIT (e.g., anger, Weber et al., 2014). Lastly, this study used a sample of young students without severe mental health problems.
Findings may not generalize to more serious depressive symptoms.

Taken together, the present study demonstrated that while men and women do not differ in their basic cognitive capacity to implement cognitive reappraisals in threatening situations, higher reappraisal capacity seemingly reduces depressive daily-life experiences in men only. This possibly implies a more complex link between cognitive reappraisal and depressive experiences in women, suggesting their benefits for well-being more strongly depend on several aspects of their emotion regulation efforts, through reappraisal and beyond, working in concert. Though preliminary, these findings may have useful implications for psychotherapy research and practice. For instance, whereas men might benefit from ability-based reappraisal trainings alone, women may also need concomitant interventions that focus on reducing the use of maladaptive emotion regulation strategies as well as enhancing self-efficacy and determinedness in the context of cognitive reappraisal.

## DATA AVAILABILITY

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

## ETHICS STATEMENT


This study was carried out in accordance with the recommendations of the guidelines by the ethics committee of the University of Graz, Austria with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the ethics committee of the University of Graz, Austria.

## AUTHOR CONTRIBUTIONS

EW, IP, and AF conceptualized the study. CP, IP, and CR collected, analyzed, and interpreted the data. CP drafted the manuscript. EW, IP, CR, AF, and HW critically reviewed the manuscript. All authors gave their final approval of the manuscript.

## FUNDING

This work was supported by the Austrian Science Fund (FWF) (Grant Number P30362).

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00553/full#supplementary-material






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Perchtold, Papousek, Fink, Weber, Rominger and Weiss. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Non-overlapping and Inverse Associations Between the Sexes in Structural Brain-Trait Associations

Daphne Stam<sup>1</sup>, Yun-An Huang<sup>1</sup> and Jan Van den Stock<sup>1,2,3</sup>\*

<sup>1</sup> Laboratory for Translational Neuropsychiatry, Department of Neurosciences, KU Leuven, Leuven, Belgium, <sup>2</sup> Geriatric Psychiatry, University Psychiatric Center KU Leuven, Leuven, Belgium, <sup>3</sup> Brain and Emotion Laboratory, Maastricht University, Maastricht, Netherlands

Personality reflects the set of psychological traits and mechanisms characteristic for an individual. The brain-trait association between personality and gray matter volume (GMv) has been well studied. However, a recent study has shown that brain structure-personality relationships are highly dependent on sex. Building on this, the present study investigates the role of sex in the association between temperaments and regional GMv. Sixty-six participants (33 male) completed the Temperament and Character Inventory (TCI) and underwent structural magnetic resonance brain imaging. Mann-Whitney U tests showed significantly higher scores on Novelty Seeking (NS) and Reward Dependence (RD) for females, but no significant group effects were found for Harm Avoidance (HA) and Persistence (P) scores. Full factor model analyses were performed to investigate sex-temperament interaction effects on GMv. This revealed increased GMv for females in the superior temporal gyrus when linked to NS, in the middle temporal gyrus for HA, and in the insula for RD. Males displayed increased GMv compared to females relating to P in the posterior cingulate gyrus, the medial superior frontal gyrus, and the middle cingulate gyrus. Multiple regression analysis showed clear differences between the brain regions that correlate with temperaments in female subjects and those that correlate with temperaments in male subjects. No overlap was observed between sex-specific brain-trait associations. These results increase the knowledge of the role of sex in the structural neurobiology of personality and indicate that sex differences reflect structural differences observed in the normal brain. Furthermore, sex hormones seem an important underlying factor for the observed sex differences in brain-trait associations. The present study indicates an important role for sex in these brain structure-personality relationships, and implies that sex should not just be added as a covariate of no interest.

Keywords: sex, temperaments, voxel-based morphometry, brain-trait association, full factor model

#### Edited by:

Marina A. Pavlova, Tübingen University Hospital, Germany

#### Reviewed by:

Nicole Anderson, Brigham Young University, United States

Robert Kelvin Perkins, Norfolk State University, United States

\*Correspondence: Jan Van den Stock jan.vandenstock@med.kuleuven.be

#### Specialty section:

This article was submitted to Gender, Sex and Sexuality Studies, a section of the journal Frontiers in Psychology

Received: 30 November 2018 Accepted: 04 April 2019 Published: 24 April 2019

#### Citation:

Stam D, Huang Y-A and Van den Stock J (2019) Non-overlapping and Inverse Associations Between the Sexes in Structural Brain-Trait Associations. Front. Psychol. 10:904. doi: 10.3389/fpsyg.2019.00904

## INTRODUCTION

Some people are almost constantly looking for new challenges, while others choose to stick to old habits. There is a large diversity in the way people behave and how they think. This diversity can be explained by personality, a set of psychological traits and mechanisms characteristic for an individual (Larsen and Buss, 2010). It is well known that personality traits are subject to sex differences. For instance, females typically show higher agreeableness and neuroticism compared to males (Chapman et al., 2007; Weisberg et al., 2011). Little is known about the neurobiology that is associated with sex differences in personality traits. However, there are striking differences between the sexes in the neural basis of emotional processes (Kret and De Gelder, 2012) and in the relationship between narcissistic personality and regional grey matter volume (GMv; Yang et al., 2015), and findings implicate structural differences as a partial explanation for sex differences in antisocial personality (Raine et al., 2011). A recent study by Nostro et al. (2016) showed that brain structure-personality associations are highly dependent on sex. They used the NEO Five Factor Inventory (NEO FFI) to measure personality and found no significant associations between the NEO FFI (Costa and McCrae, 1992b) and regional GMv for the combined (males and females) sample. However, they did find sex-specific associations. Interestingly, significant associations with GMv were detected only in males. For neuroticism, negative correlations were found for GMv of parieto-occipital sulcus/cuneus, left fusiform gyrus/cerebellum, and right fusiform gyrus. Positive correlations were found between conscientiousness and GMv of left precuneus and parieto-occipital sulcus. A positive correlation was also found between extraversion and GMv of precuneus/parieto-occipital sulcus, thalamus, left fusiform gyrus/cerebellum, and right cerebellum.

The present study addresses sex differences in the neurobiology of temperaments, which are based on the psychobiological personality account of Cloninger et al. (1993). Temperaments are regarded as heritable and homogeneous, stable over time, emerging early in life, and independent of each other (Cloninger, 1986; Cloninger et al., 1993; Heath et al., 1994; Stallings et al., 1996; Comings et al., 2000; Larsen and Buss, 2010). The Temperament and Character Inventory (TCI) assesses these temperaments (Cloninger et al., 1993). The TCI contains four temperament scales: (1) novelty seeking (NS); (2) harm avoidance (HA); (3) reward dependence (RD); and (4) persistence (P) (Cloninger et al., 1993). These temperament scales can be further subdivided into different subscales: NS can be divided into exploratory excitability (NS1), impulsiveness (NS2), extravagance (NS3), and disorderliness (NS4); HA is composed of anticipatory worry (HA1), fear of uncertainty (HA2), shyness (HA3), and fatigability (HA4); RD can be divided into sentimentality (RD1), social attachment (RD2), and dependency (RD3); the temperament P is not further divided (Cloninger et al., 1993).

Several studies have investigated the associations between temperaments and regional GMv (Iidaka et al., 2006; Gardini et al., 2009; Picerni et al., 2013; Laricchiuta et al., 2014; Stam et al., 2018). However, sex is a variable that is typically statistically controlled for, and little is known about its effect on temperament-brain associations.

In the current study, we try to answer the question "how does sex affect the association between temperaments and regional GMv?" In order to answer this question, we investigate the interaction between sex and temperaments (NS, HA, RD, and P) on regional GMv. Not much is known about the relation between sex and the association between temperaments and regional GMv. The study of Nostro et al. (2016) showed a positive correlation between extraversion and GMv of precuneus/parieto-occipital sulcus, thalamus, left fusiform gyrus/cerebellum, and right cerebellum, in males. As extraversion is a trait known to be linked with NS (Gocłowska et al., 2018), we expect to find comparable results. The aim of the current study is to reveal distinct and common effects between the sexes in associations between regional GMv and temperaments. Furthermore, sex-specific associations between personality and risk for neuropsychiatric disorders have been reported, including eating disorders (Gual et al., 2002; Krug et al., 2009) and mood disorders (Costa and McCrae, 1992a; Afifi, 2007). A better understanding of the neurobiology of sex differences in personality temperaments may hold benefits for development of sex-specific treatments of those disorders.

## MATERIALS AND METHODS

This study was approved by the Ethical Committee of University Hospitals Leuven. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

## Participants

Sixty-six healthy subjects participated, 33 males (mean age ± SD = 38 ± 13 years, range 21–75) and 33 females (mean age ± SD = 36 ± 11 years, range 21–65)<sup>1</sup>. Mann-Whitney U tests showed that no significant sex differences were detected for age (P = 0.568). The sample was a non-clinical population composed of three subgroups to increase the variability of the loadings on personality scales: (1) fourteen participants with premanifest Huntington's disease [21% (50% male)], (2) eighteen gene-negative controls from Huntington's disease families [27% (50% male)], (3) thirty-four healthy controls [52% (50% male)]. Premanifest Huntington's disease refers to the absence of motor symptoms in combination with a positive mutation status. In addition, a radiologist evaluated the structural scans and there were no abnormalities at the individual level. As the main question of the current article focusses on sex differences, observing possible group differences of the three subgroups falls outside the scope of this article. The different subgroups are merely added for methodological reasons (increasing the variability in the dataset).

## Temperament and Character Inventory

The Temperament and Character Inventory is a questionnaire for measuring seven domains of personality and consists of 240 dichotomous items. The seven domains are divided into three character scales (self-directedness, cooperativeness, and self-transcendence) and four temperament scales (NS, HA, RD, and P). A validated Dutch translation was used with reasonable to good psychometric internal consistency (Cronbach's α range, 0.64–0.87). It was validated in a representative sample of Dutch individuals (n = 1034) (version 1.3; Datec Psychological Tests, Leiderdorp, Netherlands). NS reflects enthusiasm, impulsivity, and reward-sensitivity (e.g., "I like to explore new ways to do things"); HA is related to acting with caution and passive avoidance behavior (e.g., "I often feel tense and worried in unfamiliar situations, even when others feel there is little to worry about"); RD is associated with responsiveness to signals of reward (e.g., "I like to please other people as much as I can") and P indicates motivation without direct external reward (e.g., "I am more of a perfectionist than most people") (Cloninger, 1986; Cloninger et al., 1993; Larsen and Buss, 2010; Laricchiuta et al., 2014).

<sup>1</sup>All data is available through the corresponding author (jan.vandenstock@med.kuleuven.be).
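For readers unfamiliar with the internal-consistency statistic quoted for the Dutch TCI translation, the following minimal sketch shows how Cronbach's α is commonly computed from item-level responses. It is an illustration only: the function name `cronbach_alpha` and the toy answers are hypothetical and are not taken from the study or from the validation sample.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical dichotomous answers (1 = true, 0 = false) from five respondents.
toy_responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
print(round(cronbach_alpha(toy_responses), 3))
```

Values closer to 1 indicate that the items of a scale vary together, which is the sense in which the reported range of 0.64–0.87 is described as reasonable to good.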

## MRI Acquisition

Neuroimaging was performed on a single 3T Philips Achieva MRI system equipped with a 32-channel head coil. A high-resolution T1-weighted anatomical image (voxel size: 0.98 mm × 0.98 mm × 1.20 mm) was acquired using a 3D turbo field echo sequence (TR: 9.6 ms; TE: 4.6 ms; matrix size: 256 × 256; 182 slices).

## Structural Data Analysis

Data was analyzed using CAT12, a Computational Anatomy Toolbox<sup>2</sup> (Gaser and Kurth, 2017) running under SPM12<sup>3</sup> and MATLAB (R2016b). In order to investigate the role of sex in the association between regional GMv and temperaments, we performed voxel-based morphometry (VBM). Preprocessing consisted of normalization to MNI space, tissue classification (segmentation) into GM, white matter (WM), and cerebrospinal fluid (CSF), and bias correction of intensity non-uniformities. The segmentations were scaled by the amount of volume change due to spatial registration, in order to retain the original local volumes (modulating the segmentations). The modulated images were smoothed using a 12 mm × 12 mm × 12 mm full-width at half-maximum Gaussian kernel.
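As a side note on the smoothing step, a kernel specified by its full width at half maximum (FWHM) can be related to the standard deviation of the underlying Gaussian by σ = FWHM / (2√(2 ln 2)). The sketch below applies this standard relation to the 12 mm kernel; the helper names are illustrative and do not correspond to CAT12 or SPM12 functions.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation (mm) of a Gaussian with the given full width at half maximum."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_weight(distance_mm, fwhm_mm):
    """Unnormalised Gaussian weight at a given distance from the kernel centre."""
    sigma = fwhm_to_sigma(fwhm_mm)
    return math.exp(-(distance_mm ** 2) / (2.0 * sigma ** 2))


sigma_mm = fwhm_to_sigma(12.0)
print(f"sigma = {sigma_mm:.2f} mm")
# By definition the weight falls to one half at a distance of FWHM / 2.
print(f"weight at 6 mm = {gaussian_weight(6.0, 12.0):.2f}")
```

A 12 mm FWHM therefore corresponds to a standard deviation of roughly 5.1 mm along each axis.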

## Statistical Analysis

In order to investigate sex differences in both the main scales (NS, HA, RD, and P) and the subscales (NS1, NS2, NS3, NS4, HA1, HA2, HA3, HA4, RD1, RD2, RD3, and P), statistical tests were preceded by a normality check on the distributions of the respective residuals by means of the Shapiro-Wilk test. Where normality could not be assumed, non-parametric tests were performed. For the purpose of uniformity of analyses, we performed either parametric tests or non-parametric tests on all behavioral data.
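To make the non-parametric group comparison concrete, the following is a minimal sketch of a two-sided Mann-Whitney U test. It rests on several assumptions that are not from the source: the function name is hypothetical, the scores are invented, and the p-value uses the normal approximation with tie correction (without continuity correction) rather than whatever exact or software-specific procedure the authors applied.

```python
import math
from collections import Counter


def mann_whitney_u(group_a, group_b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for group_a, z, two-sided p). Assumes both groups are
    non-empty and that the pooled observations are not all identical.
    """
    n_a, n_b = len(group_a), len(group_b)
    # U counts pairs in which the group_a value exceeds the group_b value;
    # tied pairs count one half.
    u_a = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    n = n_a + n_b
    # Tie correction: t is the number of observations sharing each value.
    tie_term = sum(t ** 3 - t for t in Counter(list(group_a) + list(group_b)).values())
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_a - n_a * n_b / 2.0) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, z, p_two_sided


# Hypothetical temperament scores, not data from the study.
female_scores = [22, 25, 19, 30, 27]
male_scores = [18, 21, 17, 24, 20]
u, z, p = mann_whitney_u(female_scores, male_scores)
print(u, round(z, 2), round(p, 3))
```

With samples as small as this toy example an exact test would normally be preferred; the normal approximation is more appropriate for group sizes such as the 33 per sex reported here.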

To investigate sex differences in brain-temperament associations, we first performed a full factorial model analysis on the voxel-wise GMv of the total sample, to observe the interaction effect between sex and the temperament scores (NS, HA, RD, and P). Sex was included as a factor and the temperament scores as interactions with this factor. Age and total intracranial volume (TIV) were included as variables of no interest.

Eight contrasts were performed: for every temperament scale, we investigated the male and female association. In every analysis we added the remaining temperament scores as variables of no interest, in order to maximize the specificity of the results of a single temperament (as it controls for the association that is contained by the other temperaments).

<sup>2</sup>http://www.neuro.uni-jena.de/cat/
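The interaction design described above can be pictured as a design matrix in which each sex has its own intercept and its own temperament slope, with a contrast comparing the two slopes. The sketch below is a simplified illustration under stated assumptions: the column order, the function names, and the example coefficients are hypothetical, covariates are not mean-centred, and the remaining temperament scores that the authors added as variables of no interest are omitted for brevity. It is not the SPM12 batch configuration used in the study.

```python
def design_row(is_female, temperament, age, tiv):
    """One row of a sex-by-temperament design matrix.

    Column order (an assumption): male mean, female mean,
    male slope, female slope, age, TIV.
    """
    male = 0.0 if is_female else 1.0
    female = 1.0 - male
    return [male, female, male * temperament, female * temperament, age, tiv]


# Contrast vectors over the six columns above.
FEMALE_GT_MALE_SLOPE = [0, 0, -1, 1, 0, 0]
MALE_GT_FEMALE_SLOPE = [0, 0, 1, -1, 0, 0]


def contrast_value(betas, contrast):
    """Weighted sum of fitted coefficients for one contrast."""
    return sum(b * c for b, c in zip(betas, contrast))


# Hypothetical participant: female, temperament score 20, age 36, TIV 1450 ml.
print(design_row(True, 20.0, 36.0, 1450.0))
# Hypothetical fitted coefficients at one voxel, in the column order above.
example_betas = [0.50, 0.52, -0.004, 0.006, -0.001, 0.0002]
print(contrast_value(example_betas, FEMALE_GT_MALE_SLOPE))
```

Under this reading, two directional slope contrasts for each of the four temperaments would give the eight contrasts mentioned in the text, although the source does not spell out the contrast vectors.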

Secondly, multiple regression analyses were performed on the smoothed GM-images for males and females separately. The four temperament scores (NS, HA, RD, and P) were entered as regressors in a single model, in addition to age and TIV, which were included as variables of no interest. In total, eight contrasts were performed on the male data and eight contrasts on the female data.

To investigate any overlap between the sex-selective results, we inclusively masked the results of the regression analysis of both groups. The statistical threshold was set at P<sub>height</sub> < 0.001 (k = 10) in combination with P<sub>height</sub> < 0.05 FWE-corrected following small volume correction (SVC).
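Conceptually, inclusive masking amounts to keeping only those voxels that are suprathreshold in both sex-specific maps. The sketch below illustrates this as a set intersection under simplifying assumptions: the maps are hypothetical dictionaries of uncorrected p-values keyed by MNI coordinates, and neither the cluster-extent criterion (k = 10) nor the FWE small volume correction is implemented.

```python
def suprathreshold_voxels(p_map, p_threshold=0.001):
    """Coordinates whose uncorrected p-value is below the height threshold."""
    return {coord for coord, p in p_map.items() if p < p_threshold}


def overlap(male_map, female_map, p_threshold=0.001):
    """Voxels that survive the height threshold in both sex-specific maps."""
    male_voxels = suprathreshold_voxels(male_map, p_threshold)
    female_voxels = suprathreshold_voxels(female_map, p_threshold)
    return male_voxels & female_voxels


# Hypothetical p-values keyed by MNI coordinates (mm); not results from the study.
male_p = {(2, -40, 30): 0.0004, (-44, 10, -8): 0.2000, (50, -20, 4): 0.0300}
female_p = {(2, -40, 30): 0.4100, (-44, 10, -8): 0.0002, (50, -20, 4): 0.0008}

print(overlap(male_p, female_p))
```

In this toy example no coordinate is suprathreshold in both maps, so the intersection is empty, which is the kind of outcome an inclusive mask with no surviving voxels represents.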

Anatomic labeling of significant clusters was performed using xjView<sup>4</sup> and clusters were visualized using MRICron<sup>5</sup>.

## RESULTS

Shapiro-Wilk test showed that residuals of NS, HA, RD, and P were normally distributed (P > 0.108). The residuals of the different subscales, however, were not normally distributed. For the purpose of uniformity of analyses, we performed nonparametric tests on all behavioral data. Mann-Whitney U tests showed a significantly higher score on NS (p = 0.02) and RD (p < 0.001) for females, but no significant group effects were found for the HA and P score (P > 0.445). Furthermore, Mann-Whitney U tests showed sex differences between the subscales (**Figure 1**).

## Interaction Between Sex and Temperaments in Voxel-Wise GMv

To investigate the role of sex in the association between regional GMv and temperaments, we ran a full factorial model on the smoothed GM images. This revealed interactions between sex and temperaments. The results are presented in **Table 1**. Males and females show opposite associations between NS and GMv in the superior temporal gyrus (females show positive association; t = 4.96, p = 0.001), for HA in the middle temporal gyrus (females show positive association; t = 4.56, p = 0.003), for RD in the insula (females show positive association; t = 4.42, p = 0.003 and t = 3.66, p = 0.001), for P in the posterior cingulate gyrus (females show a negative association; t = 4.72, p = 0.006), the medial superior frontal gyrus (females show a negative association; t = 4.62, p = 0.006), and the middle cingulate gyrus (females show a negative association; t = 4.56, p = 0.001). See **Figure 2**.

## Within-Sex Correlation Between Temperaments and Voxel-Wise GMv

To investigate the correlation between temperaments and voxel-wise GMv for males and females separately, we performed two separate multiple regression analyses on the smoothed GM-images. The results are shown in **Table 2**. To investigate any overlap between the sex-selective results, we inclusively masked the results of the regression analysis of both groups. No overlap was observed.

<sup>3</sup>http://www.fil.ion.ucl.ac.uk/spm/software/spm12/

<sup>4</sup>http://www.alivelearn.net/xjview

<sup>5</sup>http://www.mccauslandcenter.sc.edu/mricro/mricron

FIGURE 1 | Mean score of the four temperament scores [novelty seeking (NS), harm avoidance (HA), reward dependence (RD), and persistence (P)] and each subscale (NS1, NS2, NS3, NS4, HA1, HA2, HA3, HA4, RD1, RD2, RD3, and P) for males and females separately. Error bars represent standard error of the total score (NS, HA, RD, and P). The results show a significantly higher score on NS and RD for females, driven by NS2, NS3, RD1, and RD2. <sup>∗</sup>Marks significance at p < 0.05.

TABLE 1 | Sex-temperament interaction effects in voxel-wise GMv.

Overview of the results of the full factorial model [P<sub>height</sub> < 0.001 (k = 10), combined with SVC, FWE-corrected at cluster level], observing the role of sex for GMv associations with temperament traits of the TCI: novelty seeking (NS), harm avoidance (HA), reward dependence (RD), and persistence (P). Coordinates refer to MNI-space.

## DISCUSSION

In the current study, we investigated sex differences in temperament-brain associations, as well as shared temperament-brain associations between the sexes.

## Sex Differences in Temperament Traits

We observed significant sex differences in NS and RD between males and females. For both temperaments, females had a significantly higher score than males.

Extraversion is a Big Five trait that is known to be linked with NS (Gocłowska et al., 2018). Weisberg et al. (2011) observed a significantly higher overall extraversion score in females. Furthermore, individuals scoring high on NS tend to be enthusiastic and impulsive, and NS is known to be linked to the neurotransmitter dopamine (Cloninger, 1987; Larsen and Buss, 2010). Previous research has shown that females score higher in enthusiasm (Costa et al., 2001; Weisberg et al., 2011) and that estradiol, the female sex hormone, modulates mesolimbic dopamine systems and so affects motivated behaviors (Yoest et al., 2014). On the other hand, no associations between the total NS score and total testosterone, the male sex hormone, have been found (Tsuchimine et al., 2015). Previous studies have looked at sex differences in a previous version of the TCI, the Tridimensional Personality Questionnaire (TPQ) (Cloninger et al., 1991). However, these studies reported conflicting findings on NS score and impulsivity when looking at sex differences (Reynolds et al., 2006; Mitchell and Potenza, 2015). A possible explanation for these conflicting findings on impulsivity and NS may be that females show fluctuating levels of impulsivity due to the menstrual cycle and changing estrogen levels (Weafer and de Wit, 2014). Furthermore, when looking at the different subscales of NS, we found that the impulsiveness (NS2) (p = 0.029) and the extravagance (NS3) (p = 0.005) dimensions of NS specifically drive the significant sex differences in NS. A study using the TPQ (Cloninger, 1987) also found a significantly higher score for females on NS3 (Zohar et al., 2001). That study also found a positive correlation between NS3 and RD, the second temperament for which we found a significantly higher score for females than for males.

FIGURE 2 | Full factorial model results obtained at a statistical threshold of Pheight < 0.001 (k = 10), combined with SVC, FWE-corrected at cluster level. The results are shown at a significance level of P = 0.05 and are overlaid on a canonical 3-dimensional–rendered MRI brain template with a cut-out. (A) In the middle a statistical map displaying the sex-temperament interaction effects on GMv for reward dependence. Left and right the partial correlation (Female: r = 0.963, Male: r = –0.939) between GMv in the insula as a function of reward dependence. (B) In the middle a statistical map displaying the sex-temperament interaction effects on GMv for novelty seeking. Right a scatterplot showing the partial correlation (Female: r = 0.943, Male: r = –0.925) between GMv in the superior temporal gyrus as a function of novelty seeking. (C) In the middle a statistical map displaying the sex-temperament interaction effects on GMv for harm avoidance. Left a scatterplot showing the partial correlation (Female: r = 0.989, Male: r = –0.973) between GMv in the middle temporal gyrus as a function of harm avoidance. (D) In the middle a statistical map displaying the sex-temperament interaction effects on GMv for persistence. Left a scatterplot showing the partial correlation (Female: r = –0.938, Male: r = 0.924) between GMv in the middle cingulate gyrus as a function of persistence. Right a scatterplot showing the partial correlation (Female: r = –0.938, Male: r = 0.924) between GMv in the medial superior frontal gyrus as a function of persistence and the partial correlation (Female: r = –0.938, Male: r = 0.924) between GMv in the posterior cingulate gyrus as a function of persistence.

Overview of the multiple regression results [Pheight < 0.001 (k = 10), combined with SVC, FWE-corrected at cluster level], investigating GMv associations with the temperaments of the TCI: novelty seeking (NS), harm avoidance (HA), reward dependence (RD), and persistence (P) for males and females separately. Coordinates refer to MNI-space. L9, lobe 9; L7b, lobe 7b.

Reward dependence has been shown to be linked with norepinephrine, and previous research has shown that estrogen can modulate noradrenergic neurotransmission through the central nervous system (Vega-Rivera et al., 2013). Furthermore, our findings on RD are in line with previous research (Cloninger et al., 1991; Zohar et al., 2001) and showed that for RD the significant sex difference was driven by sentimentality (RD1) (p < 0.001) and attachment (RD2) (p = 0.010). RD is often described as interpersonal sensitivity and sociability. Generally, females focus more on interpersonal relationships, score higher on attachment, warmth, and empathy (Zohar et al., 2001; Weisberg et al., 2011; Weafer and de Wit, 2014), and are more concerned with the opinion of others in social tasks than males (Cloninger et al., 1991; Vega-Rivera et al., 2013). Males tend to focus more on individuality and achievement (Sato and McCann, 1998). These results provide support for the higher score for females on RD.

In summary, we found significantly higher scores for females on NS and RD. These results indicate that females score significantly higher mainly on the temperaments linked to sociability and attachment.

## Interaction Between Sex and Temperaments on Voxel-Wise GMv

We found opposite associations in the two groups between GMv in the superior temporal gyrus and NS, with a positive association for females. Previous research found a significant positive correlation between NS and glucose metabolism in the superior temporal gyrus (Hakamata et al., 2006), a region that is linked with impulsivity and social cognition (Horn et al., 2003; Grecucci et al., 2013). Furthermore, a previous study showed that females generally have a higher GM percentage in the superior temporal gyrus than males (Schlaepfer, 1995). A study comparing pre- and post-menopausal females showed a decrease in GMv in postmenopausal females; GMv was also found to be positively correlated with estradiol, the major female sex hormone (Kim et al., 2018). In contrast with our hypothesis, we did not find any results for NS that were comparable to those of Nostro et al. (2016). A possible explanation for this discrepancy may relate to the methods (multiple regression analysis vs. full factorial model analysis). Alternatively, the similarity between NS and extraversion may be limited.

Secondly, we observed inverse associations in the two groups between GMv in the insula and RD, with a positive association in females. RD is associated with responsiveness to signals of reward, and the insula is known to be one of the additional reward-sensitive brain areas (O'Doherty et al., 2002; Kirsch et al., 2003). Research shows that females have more GMv in the right insula (Ruigrok et al., 2014), and estrogen has been found to excite neurons in the insula (Saleh et al., 2004). The insula is part of the limbic system, and previous research has shown that females have a larger limbic volume (Goldstein et al., 2001; Zaidi, 2010). It has been proposed that, owing to a larger limbic brain, females are better in touch with their emotions and can better connect to others (Zaidi, 2010).

Thirdly, we found opposite associations between HA and GMv in the middle temporal gyrus. Findings about the link between HA and the middle temporal gyrus are contradictory, as previous studies have shown both positive (Iidaka et al., 2006) and negative (Hakamata et al., 2006) correlations between HA and this region. The middle temporal gyrus is also known to be linked to social cognition (Grecucci et al., 2013). Furthermore, we did not find a significant sex difference in HA score.

P is the only temperament with opposite associations for which we found a positive association for males. We did not find a sex difference for score in P. P indicates motivation without direct external reward (Cloninger et al., 1993). The medial superior frontal gyrus and the cingulate gyri are areas known to be involved in cognitive control and motivation (Heilbronner et al., 2011; Bahlmann et al., 2015).

In summary, the results show that GMv of the superior temporal gyrus, middle temporal gyrus, and insula show a positive association between temperaments and regional GMv in females and a negative association in males, while a positive association in males and a negative association in females was observed between P and regional GMv in the posterior cingulate gyrus, medial superior frontal gyrus, and middle cingulate gyrus.

## Within-Sex Correlation Between Temperaments and Voxel-Wise GMv

We found non-overlapping sex-specific topographic patterns in temperament-brain associations. These results suggest that the structural neurobiology underlying personality is to a high degree sex-specific. They are in line with the study of Nostro et al. (2016), who found only sex-specific associations. Our results support their hypothesis that brain structure-personality associations are highly dependent on sex and that this may be attributable to hormonal interplay.

## Clinical Implications

Different temperaments from the TCI have been linked to several neuropsychiatric disorders; NS is known to be correlated with drug addiction (Bardo et al., 1996; Lin et al., 2015; Vanhille et al., 2015), tobacco abuse (Palmer et al., 2013), and depression (Duclot and Kabbaj, 2013). NS and HA are both linked to pathological gambling (Kim and Grant, 2001; Nordin and Nylander, 2007) and alcohol abuse (Palmer et al., 2013; Wennberg et al., 2014). However, the risk for these neuropsychiatric symptoms differs between males and females. For example, studies show that males have a higher risk for developing an alcohol or gambling addiction (Engwall et al., 2004; Nolen-Hoeksema and Hilt, 2006). Our results support the importance of sex in the neurobiology of these disorders.

## Limitations

A limitation of the current study is the small sample size. However, the different subgroups that constituted the sample presumably increased the variability of the dataset and benefited the statistical power. Furthermore, we used a statistical threshold of Pheight < 0.001 (k = 10) in combination with Pheight < 0.05 FWE-corrected following Small Volume Correction. As the current study is exploratory, further studies with larger sample sizes are needed. It is important to keep in mind that females generally have smaller brains than males (Ruigrok et al., 2014) and that this can affect the volume of specific brain regions (Pintzka et al., 2015). To control for this, TIV was entered as a variable of no interest in all analyses. Furthermore, as the study is correlational in nature, any causal interpretations are unjustified.

## CONCLUSION

The present study documents opposing temperament-brain associations in males and females. The results reveal that sex-specific associations outweigh sex-general associations in the neurobiology of personality.

## ETHICS STATEMENT

This study was approved by the Ethical Committee of University Hospitals Leuven. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

## AUTHOR CONTRIBUTIONS

DS and JVdS contributed conception and design of the study and organized the database. DS performed the statistical analysis and wrote the first draft of the manuscript. Y-AH analyzed the data. All authors contributed to manuscript revision, read, and approved the submitted version.

## FUNDING

JVdS was supported by a KU Leuven Starting Grant.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Stam, Huang and Van den Stock. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sex Differences in Sex Hormone Profiles and Prediction of Consciousness Recovery After Severe Traumatic Brain Injury

Yu H. Zhong, Hong Y. Wu, Ren H. He, Bi E. Zheng and Jian Z. Fan\*

Department of Rehabilitation Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, China

Objective: The clinical course of unconsciousness after traumatic brain injury (TBI) is commonly unpredictable and it remains a challenge with limited therapeutic options. The aim of this study was to evaluate the early changes in serum sex hormone levels after severe TBI (sTBI) and the use of these hormones to predict recovery from unconsciousness with regard to sex.

Methods: We performed a retrospective study including patients with sTBI. A statistical analysis of serum sex hormone levels and recovery of consciousness at 6 months was performed to identify effective prognostic indicators.

#### Edited by:

Annie Duchesne, University of Northern British Columbia, Canada

#### Reviewed by:

Claudia Brigitte Späni, Northwestern University, United States Adel Helmy, University of Cambridge, United Kingdom

> \*Correspondence: Jian Z. Fan fjz@smu.edu.cn

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 02 September 2018 Accepted: 08 April 2019 Published: 26 April 2019

#### Citation:

Zhong YH, Wu HY, He RH, Zheng BE and Fan JZ (2019) Sex Differences in Sex Hormone Profiles and Prediction of Consciousness Recovery After Severe Traumatic Brain Injury. Front. Endocrinol. 10:261. doi: 10.3389/fendo.2019.00261

Results: Fifty-five male patients regained consciousness, and 37 did not. Of the female patients, 22 out of 32 regained consciousness. Male patients (n = 92) with sTBI, compared with healthy subjects (n = 60), had significantly lower levels of follicular stimulating hormone (FSH), testosterone, and progesterone and higher levels of prolactin. Female patients (n = 32) with sTBI, compared with controls (n = 60), had significantly lower levels of estradiol, progesterone, and testosterone and significantly higher levels of FSH and prolactin. Testosterone significantly predicted consciousness recovery in male patients. Normal or elevated testosterone levels in the serum were associated with a reduced risk of the unconscious state in male patients with sTBI. For female patients with sTBI, sex hormone levels did not contribute to the prediction of consciousness recovery.

Conclusion: These findings indicate that TBI differentially affects the levels of sex-steroid hormones in men and women patients. Plasma levels of testosterone could be a good candidate blood marker to predict recovery from unconsciousness after sTBI for male patients.

Keywords: traumatic brain injury, sex hormones, consciousness, sex, differences

## INTRODUCTION

Traumatic brain injury (TBI) is a major cause of death and disability worldwide and is increasing in incidence (1). Patients with acute severe TBI (sTBI) often develop severe disorders of consciousness, i.e., coma, minimally conscious state, or vegetative state. Although many patients may regain consciousness during the 1-month post-TBI period, the minimally conscious state may also develop into a chronic and even permanent state (2). Early detection of consciousness in patients with TBI could predict subsequent recovery of neurological function, since early recovery of consciousness is closely related to better long-term functional outcomes (3). However, there is an ongoing debate about the clinical assessment of consciousness, which relies on inferences obtained from observed responses to external stimuli. This clinical evaluation of consciousness may be erroneous in 40% of patients, since the responses of patients with severe brain damage may be very limited (4, 5). In addition, because there is no tool for predicting consciousness recovery, access to rehabilitative care will be limited for those who are inaccurately identified as having a poor prognosis (6). Hence, it is crucial to find a biomarker to predict the recovery of consciousness for patients suffering from TBI.

Hormone dysfunction, also known as post-TBI hormonal deficiency syndrome, is very common in the post-acute phase of sTBI. It has been reported that up to 80% of patients with sTBI suffer from some type of acute hypopituitarism and related hypogonadism (7, 8). The literature suggests that sex hormones can affect damage after TBI and are associated with the stress response occurring in the acute phase of the disease. Furthermore, there is evidence that estrogen and progesterone have neuroprotective effects, suggesting that inadequate levels may have both acute and long-term consequences for the recovering brain (9). Decades of studies show that testosterone levels are low in 36.5–100% of patients with sTBI; however, the prognostic significance of testosterone levels remains controversial (10). Although hormone insufficiency after TBI has become increasingly recognized, there are limited data on the role of sex hormones in predicting consciousness in TBI survivors.

There is increasing evidence demonstrating significant sex differences in the nervous system response to traumatic injury (11). A growing number of studies in experimental TBI report that female brains consistently exhibit less damage in comparison to their male counterparts because of effects of gonadal steroid hormones at time of injury (8). However, studies regarding the influence of sex on outcomes and recovery of TBI are still scarce. To the best of our knowledge, there is no previous study investigating the association between serum hormone levels during the acute TBI phase and the recovery of consciousness in patients with TBI. The goals of this study were to assess sex differences in alterations of serum sex hormones after sTBI and determine whether sex hormones can effectively predict recovery of consciousness with regard to sex.

## METHODS

## Patients and Definitions

We retrospectively screened all patients with TBI admitted to the neurosurgery, emergency, or rehabilitation department of our institution from 2007 to 2017. The inclusion criteria were as follows: (1) age of 18–75 years; (2) head trauma with a Glasgow Coma Scale (GCS) score of 3–8 based on the first score registered after resuscitation, with no eye opening for at least 24 h; (3) absence of previous neurologic disorders; (4) no history of breast cancer requiring chemotherapy or tamoxifen, pituitary or hypothalamic tumor, prostate cancer treated with orchiectomy or hormone suppression agents, or untreated thyroid disease; (5) serum sex hormone measurement obtained within 1 week after trauma. Ninety-two male patients and 32 female patients with sTBI were enrolled in this study according to the above-mentioned criteria. Healthy subjects were separately enrolled as controls for serum sex hormone measurement. Healthy subjects had no history of neurological, psychiatric, cardiovascular, pulmonary, renal, or endocrinological disease, and had not received hormone replacement therapy or contraception. In addition, control women were interviewed about their menopausal status and reproductive history. If this information was not available, subjects >50 years of age were defined as post-menopausal. Sixty age- and sex-matched healthy controls were included for each of the male and female groups.

All patients were given both oral and written information about the study and a written informed consent was obtained.

## Parameters

A standardized case collection form was used to record the cause of trauma, age, sex, injury severity score (ISS), GCS score, and neuroradiological data at baseline. The severity of the trauma was evaluated by ISS. The lowest GCS score recorded before sedation and intubation, either in the emergency department or at the scene of the accident, was used in this study. The type of injury was obtained from the initial head computed tomography (CT) report.

Serum sex hormone measurements for all patients were performed within 1 week after sTBI. Additionally, serum samples for premenopausal females were collected either in the follicular phase (days 5–10) or the luteal phase (days 18–23) of their cycle. Blood for enrolled patients was primarily collected in the morning (∼7:00 a.m.) for analysis of estradiol, follicular stimulating hormone (FSH), luteinizing hormone (LH), progesterone, prolactin, and testosterone. All sex hormones were analyzed at the accredited clinical chemistry laboratory at Nanfang Hospital, Southern Medical University. Serum estradiol, progesterone, and testosterone were analyzed using radioimmunoassay with the Coat-A-Count in-vitro diagnostic test kit (Siemens Healthcare Diagnostics Inc., Los Angeles, CA). Serum FSH, LH, and prolactin were measured by electrochemiluminescence immunoassay (ECLIA; Modular Analytics E170, Roche, GmbH, Mannheim, Germany). Male patients were divided into two subgroups according to the normal range (1.80–8.82 ng/ml) of male testosterone provided by the accredited clinical chemistry laboratory. Patients with testosterone levels <1.80 ng/ml were classified as the low testosterone level group, and patients with testosterone levels ≥1.80 ng/ml were classified as the normal or elevated testosterone level group.
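The grouping rule for male patients can be written as a small function. This is only an illustrative sketch, not the authors' procedure: the function and constant names are hypothetical, and assigning a value of exactly 1.80 ng/ml to the normal group is an assumption based on the stated normal range (1.80–8.82 ng/ml).

```python
# Lower bound of the laboratory's normal range for male testosterone (ng/ml),
# as quoted in the text. The constant name is hypothetical.
LOW_TESTOSTERONE_CUTOFF_NG_ML = 1.80


def classify_testosterone(level_ng_ml: float) -> str:
    """Assign a male patient to a testosterone subgroup.

    Assumption: a level exactly at the cutoff counts as normal, because
    1.80 ng/ml is the lower bound of the stated normal range.
    """
    if level_ng_ml < 0:
        raise ValueError("testosterone level cannot be negative")
    if level_ng_ml < LOW_TESTOSTERONE_CUTOFF_NG_ML:
        return "low"
    return "normal_or_elevated"
```

For example, `classify_testosterone(0.93)` falls in the low group, while `classify_testosterone(2.69)` falls in the normal or elevated group.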

## Study Outcome

The primary outcome was consciousness recovery. All enrolled patients were classified into two groups according to their final coma recovery result: recovery of consciousness (RC) and no recovery of consciousness (NRC). Patients were assigned to the RC group if they demonstrated at least one of the following: (1) functional use of one or more objects, (2) functional interactive communication, or (3) clearly discernible behavioral manifestation of a sense of self. The unconscious state during the follow-up period was assessed with the Coma Recovery Scale–Revised (CRS-R) (12). The patients were followed for at least 6 months.

## Statistical Analysis

Normally distributed data are presented as the mean ± standard deviation (SD) and compared using Student's t test. Non-normally distributed continuous data are presented as median (interquartile range) and compared by the Mann-Whitney U test. Chi-square or Fisher's exact tests were performed to compare categorical data. Independent variables were screened using single-factor analysis to select those with statistically significant differences between the RC and NRC groups. Logistic regression analysis was used to determine which variables independently predicted recovery of consciousness. The logistic regression model contained sex hormones and clinical predictors including age, pupil reactivity, GCS score, ISS, and computed tomography (CT) characteristics (Rotterdam CT classification). The times to recovery of consciousness for patients with sTBI were illustrated with Kaplan–Meier curves and compared using the Cox proportional hazards regression model, reported as hazard ratios (HR), with adjustment for baseline characteristics. The prediction of recovery of consciousness was analyzed using the receiver operating characteristic (ROC) curve method. A p-value of < 0.05 was considered statistically significant. All analyses were two-sided and performed using SPSS software version 21.0 (SPSS Inc., Chicago, IL, USA).
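Two of the methods listed above are closely related: the area under the ROC curve equals the Mann-Whitney U statistic divided by the number of case-control pairs. The following minimal sketch illustrates that relationship in plain Python. It is not the authors' SPSS analysis; the function names and the toy values are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x: the number of (x_i, y_j) pairs with
    x_i > y_j, where tied pairs count as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def auc_from_scores(positives: Sequence[float], negatives: Sequence[float]) -> float:
    """Area under the ROC curve, computed as U / (n_pos * n_neg)."""
    if not positives or not negatives:
        raise ValueError("both groups must contain at least one value")
    return mann_whitney_u(positives, negatives) / (len(positives) * len(negatives))


# Hypothetical toy marker values, NOT data from the study.
recovered = [2.9, 1.4, 3.5, 2.1]
not_recovered = [0.8, 1.6, 0.5]

# 11 of the 12 possible pairs are ordered correctly, so the AUC is 11/12.
auc = auc_from_scores(recovered, not_recovered)
```

An AUC of 0.5 would mean the marker orders a randomly chosen pair no better than chance, and an AUC of 1.0 would mean every recovered patient has a higher value than every non-recovered patient.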

## RESULTS

## Enrollment and Characteristics of the Patients

Of 3,411 patients with TBI screened for eligibility, 124 patients with sTBI met all inclusion criteria and were enrolled in this study. Thirty-two of 124 patients were women. The primary mechanism of injury was motor vehicle collisions in both men and women. The median GCS at admission for men and women was 4 and 5, respectively. The median ISS score was 36 in both men and women. No sex differences were found in the types of injury observed by head CT, demographic, and injury variables including age, GCS score at admission, and ISS. The baseline characteristics of the patients are shown in **Table 1**.

## Serum Sex Hormone Levels by Sex After sTBI

**Table 2** summarizes serum sex hormones by sex for estradiol, FSH, LH, progesterone, prolactin, and testosterone for patients and healthy controls. Serum estradiol levels were significantly lower in women with sTBI than in matched healthy subjects (58.50 ± 43.79 vs. 94.69 ± 73.66 pg/ml; p = 0.013), whereas levels in men were similar to control values. The mean FSH levels for men with sTBI were lower than those for their controls (3.56 ± 3.50 vs. 5.27 ± 2.89 mIU/L; p = 0.002). In contrast, FSH levels in women with sTBI were higher than those in their controls (17.76 ± 12.78 vs. 8.87 ± 5.93 mIU/L; p < 0.001). Mean prolactin levels for both men (26.91 ± 14.35 vs. 10.00 ± 4.69 ng/ml; p < 0.001) and women (52.77 ± 23.26 vs. 18.89 ± 10.26 ng/ml; p < 0.001) were significantly higher than those in matched healthy controls. Testosterone levels were significantly lower than control values for both men (1.98 ± 1.79 vs. 5.28 ± 1.82 ng/ml; p < 0.001) and women (0.19 ± 0.15 vs. 0.26 ± 0.11 ng/ml; p = 0.008). Similar trends were noted for progesterone (both p < 0.001). No significant difference in LH levels was found between patients with sTBI and healthy controls for either men or women.

TABLE 1 | Characteristics of patients at baseline.

GCS, Glasgow Coma Scale; ISS, Injury Severity Score; IQR, interquartile range. Results are given as median (IQR) or n (%). Overall scores on the GCS range from 3 to 15, with lower scores indicating a lower level of consciousness. The ISS ranges from 0 to 75, with higher scores indicating greater severity of injury.

## Recovery of Consciousness and Associated Hormone Levels by Sex After sTBI

Of the 92 male patients with sTBI, consciousness was regained in 55 (59.78%) patients. Among these patients, the time to recovery of consciousness after sTBI was <1 month for 32 patients, 1–3 months for 15 patients, 3–6 months for 6 patients, and more than 6 months for 2 patients. Of the 32 female patients with sTBI, 22 (68.75%) patients regained consciousness. The time to recovery of consciousness after sTBI was <1 month for 13 patients, 1–3 months for 5 patients, and 3–6 months for 4 patients. There was no statistically significant difference in the percentage of patients regaining consciousness between the male and female groups.
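The comparison of recovery proportions between the sexes can be checked from the counts given above (55 of 92 men and 22 of 32 women regained consciousness). The sketch below applies a Pearson chi-square test without continuity correction to the resulting 2 × 2 table. This is an assumption for illustration only: the text does not state which categorical test was applied to this particular comparison, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: men, women. Columns: regained consciousness, did not.
chi2, p_value = chi_square_2x2(55, 37, 22, 10)
```

With these counts the statistic is small and the p-value is well above 0.05, which agrees with the reported absence of a significant sex difference.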

**Table 3** summarizes the results of single-factor analysis of variables for the RC and NRC groups by sex. There were no statistically significant differences between the two groups in terms of age, GCS, and ISS at baseline for both men and women with sTBI, yet there were statistically significant differences for serum levels of estradiol, FSH, progesterone, prolactin, and testosterone. For male patients with sTBI, the RC group had higher levels of FSH (3.40 ± 2.26 vs. 2.13 ± 1.62 mIU/L; p = 0.007), higher levels of testosterone (2.69 ± 1.95 vs. 0.93 ± 0.72 ng/ml; p < 0.001), lower levels of progesterone (0.22 ± 0.17 vs. 0.31 ± 0.22 ng/ml; p = 0.029), and lower levels of prolactin (23.90 ± 11.22 vs. 31.44 ± 17.27 ng/ml; p = 0.014) than the NRC group. For the female patients with sTBI, the levels of estradiol (76.79 ± 41.12 vs. 18.26 ± 6.50 pg/ml; p < 0.001), progesterone (1.66 ± 1.09 vs. 0.20 ± 0.07 ng/ml; p < 0.001), and prolactin (61.63 ± 22.26 vs. 33.28 ± 9.68 ng/ml; p < 0.001) were significantly higher in the RC group than in the NRC group.

FSH, follicular stimulating hormone; LH, luteinizing hormone. Results are given as mean ± standard deviation or median (IQR). \*p < 0.05. \*\*p < 0.001.

GCS, Glasgow Coma Scale; ISS, Injury Severity Score; FSH, follicular stimulating hormone; LH, luteinizing hormone. Results are given as mean ± standard deviation or median (IQR). \*p < 0.05. \*\*p < 0.001.

#### TABLE 4 | Logistic regression analysis of variables to predict recovery of consciousness.


FSH, follicular stimulating hormone. \*\*p < 0.001.

## Outcome Predictors

We then evaluated whether the hormone levels that differed significantly between the RC and NRC groups could predict recovery of consciousness by sex, as shown in **Table 4**. For male patients, a logistic regression model with recovery of consciousness/no recovery of consciousness as the dependent variable, and the clinical predictors together with FSH, progesterone, prolactin, and testosterone as independent variables, showed that testosterone significantly predicted consciousness recovery (OR, 3.495; 95% CI, 1.792–6.815; p < 0.001). For the female patients with sTBI, however, sex hormone levels did not contribute to the prediction of consciousness recovery when these hormones were examined together with the clinical predictors.
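The reported odds ratio and confidence interval for testosterone can be checked for internal consistency. The sketch below recovers the regression coefficient, its standard error, and the Wald p-value from the published numbers. It assumes the interval is a symmetric Wald interval on the log-odds scale, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def wald_from_odds_ratio(odds_ratio: float, ci_low: float, ci_high: float,
                         z_crit: float = 1.959964) -> tuple[float, float, float, float]:
    """Recover (beta, se, z, p) from an odds ratio and its 95% CI.

    Assumption: the CI was computed as exp(beta +/- z_crit * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Values reported for testosterone in male patients.
beta, se, z, p = wald_from_odds_ratio(3.495, 1.792, 6.815)
```

Under this assumption the recovered p-value is below 0.001, in agreement with the reported result, and the geometric mean of the interval limits is close to the reported odds ratio.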

Frontiers in Endocrinology | www.frontiersin.org

patients with normal or elevated testosterone levels.

Furthermore, the times to recovery from coma for male patients with normal or elevated testosterone levels and for those with low testosterone levels were compared using Kaplan–Meier survival curves, which show the proportion of male patients regaining consciousness (**Figure 1**). The analysis showed that normal or elevated serum testosterone levels significantly reduced the risk of remaining unconscious in male patients with sTBI (log-rank test, p = 0.011) (HR, 2.12; 95% CI, 1.22–3.66, p = 0.007).

In addition, to evaluate the power of testosterone to predict recovery of consciousness/no recovery of consciousness in men, an ROC curve was drawn, as shown in **Figure 2**. The ROC analysis showed that the area under the curve (AUC) was 0.736 (p < 0.001).

FIGURE 2 | Probability of recovery of consciousness at 6 months related to serum testosterone levels observed 1 week after sTBI in male patients. The probability results are from the ROC curve, where larger test results indicate a more positive test. The AUC for testosterone is 0.736 (p < 0.001).

## DISCUSSION

The literature suggests that pituitary hormone abnormalities occur early and with high frequency after TBI (7, 8). However, most of these findings come from samples that were not differentiated by sex. Our study investigated alterations in sex hormones and specifically focused on the effects of these hormones on consciousness after sTBI by sex, which has not been well studied. In the current study, sex-specific alterations in serum sex hormone levels were identified in the acute phase of sTBI. Importantly, our data suggested that serum testosterone was a significant predictor of consciousness recovery in male patients with sTBI, whereas serum sex hormones did not contribute to consciousness recovery in female patients with sTBI.

Hypopituitarism is highly prevalent during the acute phase of TBI. Thus far, the exact mechanisms underlying hypopituitarism have not been clarified. The most widely accepted theory attributes it to ischemic insult to the pituitary gland. Raised intracranial pressure and edema around the hypothalamic–pituitary region may also contribute to hormonal abnormalities (13). Therefore, it is conceivable that surgical treatment during the acute phase of TBI, such as decompressive surgery, could alleviate hormonal abnormalities by reducing intracranial pressure. There is increasing evidence that hypopituitarism may be seriously under-recognized in patients with TBI because of the lack of routine follow-up of hormone levels (14). In addition, the majority of clinical research on pituitary abnormalities in TBI to date has either been conducted in men, because men have a higher incidence of TBI than women, or has pooled patients regardless of sex (15, 16). The results of this study extend previous work by examining hormone profiles after sTBI by sex. Our results showed statistically significant changes in FSH, progesterone, prolactin, and testosterone in male patients, whereas in female patients, changes were observed in estradiol, FSH, progesterone, prolactin, and testosterone. Interestingly, the direction of the changes in FSH and prolactin was opposite in the two sexes. These findings indicate that sTBI differentially affects the levels of sex-steroid hormones in men and women.

Sex differences in post-TBI outcomes are frequently observed (17–23). Results from experimental models show that female rats are less susceptible to post-TBI damage, whereas male rats develop more severe cerebral edema, which can contribute substantially to secondary brain injury (11). Clinical studies have noted that women are more likely to survive their injuries and less likely to suffer posttraumatic complications than men (17–20). However, other researchers have reported the opposite, namely that women have worse outcomes and are more likely to die from their injuries than men (18). Sex differences in the extent of brain damage have also been reported among survivors post-TBI, with female brains sustaining less damage than male brains (24). These studies suggest that pathophysiological variables may underlie these differences. Numerous clinical and laboratory studies support the essential role of sex hormones in the injured brain (20–23). Hence, sex-specific changes in steroid hormones may contribute to innate sex-based differences in the physiology and pathobiology of TBI.

Unconsciousness resulting from TBI is frustrating for clinicians and distressing for patients' families, since the mechanisms behind recovery from unconsciousness are largely unknown and prognosis is especially challenging (17). Consciousness is considered an emergent property of cortical activity (25). The ascending reticular activating system (ARAS) regulates consciousness (24). It has been proposed that an impaired level of consciousness post-TBI may be due to damage to part of the ARAS, including the brainstem and thalamus, to extensive injury to the cortex, or to the disconnection of white matter between the thalamus and cerebral cortex (26). In addition, the hypothalamus plays an important role in maintaining self-awareness, since it is involved in the regulation of sleep and awakening as the primary timekeeper of consciousness (27, 28). Hypothalamic-pituitary dysfunction resulting from TBI is mainly caused by damage to the hypothalamus, including hypoxic insult, direct mechanical injury, and vascular injury (7, 14). Taken together, these studies led us to speculate that hormone alterations after sTBI may have some predictive value for recovery of responsiveness.

In the current study, we examined whether serum sex hormones may be useful for predicting the outcome of unconsciousness. Our results showed that testosterone, only in male patients, was an effective predictor of recovery of consciousness. Notably, normal or elevated testosterone levels were significantly associated with a reduced risk of remaining unconscious. Although the exact mechanism by which testosterone promotes the recovery of consciousness is unknown, several previous studies support the results of this study. The decline in testosterone has been reported to depend on the severity of TBI. In males, there is a positive correlation between plasma testosterone level and GCS score (29, 30). Moreover, testosterone level has also been reported to be associated with mortality or morbidity in patients with sTBI (31). Clinical studies suggest that male TBI patients could benefit from restoring serum testosterone levels (10, 32). Beneficial effects of testosterone after brain injury have also been reported in animal experiments. An experiment conducted by Lopez-Rodriguez and coworkers showed that brain testosterone levels correlate inversely with the severity of TBI and edema formation, but positively with GCS scores. They also suggest that animals with lower brain testosterone levels had greater neurological deficits (33). Furthermore, brain testosterone exerts a neuroprotective effect against oxidative damage in an experimental model (34). Other research suggests that intrinsic androgen may affect the capacity of neural stem/progenitor cells to produce neural progenitors under oxidative stress conditions (35). There is evidence that steroid hormones may modulate adult subventricular zone neurogenesis by affecting the synthesis of brain-derived neurotrophic factors (36). It has also been reported that testosterone could improve working memory in aged rats by aiding the transport of nerve growth factor from the hippocampus to the cortex (37).
Therefore, it is not surprising that testosterone level has predictive value for consciousness recovery. Although the male RC group had significantly higher levels of testosterone and FSH and lower levels of prolactin and progesterone, FSH, prolactin, and progesterone were not included in the logistic regression equation. In females, no sex hormone was associated with consciousness, although the RC group presented with higher levels of estradiol, prolactin, and progesterone, which may be explained by sex-specific responses to sex hormones. It has also been previously demonstrated that loss of testosterone in men could change the brain's hormonal landscape, because alteration of testosterone is gradual in healthy men and can be clinically subtle, whereas change in sex hormones in healthy women is rapid and overt (38). There are notable sex differences in neurochemistry, brain morphology, and functional outcomes, in addition to similarities between female and male brains (39). Marked sex-specific responses to traumatic injury of the nervous system have also been reported (16–19). These studies may provide evidence for the difference in the association between sex hormones and consciousness in male and female patients post-TBI.
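The association noted above between testosterone level and the risk of remaining unconscious comes from a Kaplan–Meier comparison of times to recovery. As a rough illustration of how such a curve is built, the sketch below implements the product-limit estimator in plain Python. The follow-up times, the censoring pattern, and the function name `kaplan_meier` are hypothetical and are not taken from the study; patients who had not regained consciousness by their last follow-up are assumed to be treated as censored.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the proportion still unconscious over time.

    times  : follow-up time for each patient (hypothetical values, in days)
    events : True if the patient regained consciousness at that time,
             False if follow-up ended without recovery (censored)
    Returns a list of (time, proportion_still_unconscious) at each event time.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    still_unconscious = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        recovered_at_t = 0   # recoveries observed exactly at time t
        leaving_at_t = 0     # recoveries plus censored patients at time t
        while i < len(order) and order[i][0] == t:
            recovered_at_t += 1 if order[i][1] else 0
            leaving_at_t += 1
            i += 1
        if recovered_at_t:
            still_unconscious *= 1.0 - recovered_at_t / at_risk
            curve.append((t, still_unconscious))
        at_risk -= leaving_at_t
    return curve


# Hypothetical follow-up data for five patients (illustrative only).
days = [10, 20, 20, 35, 50]
recovered = [True, True, False, True, False]
for t, s in kaplan_meier(days, recovered):
    print(f"day {t}: estimated proportion still unconscious = {s:.2f}")
```

Fitting one such curve per testosterone group and comparing them with a log-rank test would mirror the kind of analysis reported in the Results. The test itself is omitted here to keep the sketch short.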

In addition to using testosterone to distinguish whether patients were likely to have RC vs. NRC, it was of interest to analyze how well testosterone levels predict the probability of recovery of consciousness. In the current study, the ROC analysis of testosterone levels in male patients with sTBI yielded an AUC of 0.736. The probability of consciousness recovery increased with increasing levels of testosterone, which provides a rationale for why male TBI patients could benefit from restoring their serum testosterone levels, as previously suggested in a clinical study (32). There is an increasing belief that unconsciousness following TBI may be the consequence of traumatic axonal injury to the brainstem reticular activating system and thalamus, or of extensive damage to the cortex (2, 40). Androgens have been shown to be an important promoting factor in axonal regeneration in males (41). The potential mechanism by which testosterone could enhance consciousness recovery post-TBI may therefore be the facilitation of axonal regeneration. However, systemic administration of testosterone to female animals elicited a lesser extent of axonal regeneration, which could have been due to conversion of testosterone to estradiol by aromatase and the subsequent inability to bind to androgen receptors within neurons (42). In addition, the effects of sex hormones on brain and behavior can be moderated by factors such as menopausal status, age, and parity (38, 43). These factors could contribute to the absence of associations between gonadal hormones and TBI outcomes in women.
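The AUC from this ROC analysis has a direct probabilistic reading: it is the probability that a randomly chosen patient who recovered consciousness has a higher testosterone level than a randomly chosen patient who did not, with ties counted as one half. A minimal sketch of that calculation follows. The testosterone values and the function name `roc_auc` are hypothetical and serve only to illustrate the computation; they are not data from the study.

```python
def roc_auc(scores_recovered, scores_not_recovered):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Counts the fraction of (recovered, not recovered) pairs in which the
    recovered patient has the higher score; ties contribute one half.
    """
    wins = 0.0
    for r in scores_recovered:
        for n in scores_not_recovered:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(scores_recovered) * len(scores_not_recovered))


# Hypothetical serum testosterone values (arbitrary units, illustrative only).
recovered = [4.1, 2.9, 3.6, 5.0]
not_recovered = [1.2, 3.0, 2.2]
print(f"AUC = {roc_auc(recovered, not_recovered):.3f}")
```

An AUC of 0.5 corresponds to a marker with no discriminative value and 1.0 to perfect separation of the two groups, which gives a scale for reading the reported value of 0.736.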

In the present study, progesterone and estrogen were not associated with consciousness, although progesterone was significantly lower in both sexes relative to controls, and estrogen was lower only in female patients compared with healthy subjects. Decades of research demonstrate that progesterone can suppress neuroinflammation; reduce edema, oxidative injury, and blood-brain barrier damage; enhance dendritic arborization and synaptogenesis; and limit cellular necrosis after brain trauma (8, 44, 45). Experimental literature also suggests that estrogen can increase cerebral blood flow, reduce inflammation, prevent lipid peroxidation, and promote cell survival post-TBI (11, 46, 47). Despite a growing body of evidence from laboratory studies supporting the influential role of progesterone and estrogen in TBI, there is an alarming paucity of clinical data. Large clinical trials have shown no clinical benefit of progesterone and estrogen in patients with severe TBI (48, 49). There is currently no recommendation for treatment with estrogen or progesterone to afford neuroprotection in TBI (8, 48). These results may explain why progesterone and estrogen were not associated with consciousness in either sex in the current study.

This study was limited by the fact that sex hormone levels were assessed only once and were not evaluated for their dynamic changes. Our data do not allow us to determine what proportion of the hormone alteration is caused by the TBI itself and how much is caused by extracranial injuries and critical illness. Additionally, the range of indicators observed was limited. Functional magnetic resonance imaging and electroencephalography responses, known to provide useful prognostic information, were not included in this study. However, this study was mainly designed to find a biomarker for predicting consciousness recovery at an early stage post-TBI. Our sample size was also relatively small. In the future, larger samples are needed to overcome possible bias and to improve the generalizability of the data.

## CONCLUSION

The results of this work indicate that serum sex hormone profiles differ between male and female patients in the acute phase of sTBI. Serum testosterone concentration is an effective prognostic indicator of recovery of consciousness in male patients with sTBI. Hence, these patients should be considered for, and referred to, neuroendocrine evaluation early after the traumatic event. However, progesterone and estrogen are not significantly associated with the outcome of unconsciousness, so early treatment with progesterone and estrogen may not aid the recovery of consciousness. Further work is needed to investigate the exact mechanism by which testosterone promotes the recovery of consciousness in male patients with TBI.

## ETHICS STATEMENT

This study was carried out in accordance with the recommendations of Nanfang Hospital, Southern Medical University. The protocol was approved by our Institutional Review Board. All patient data were analyzed and reported anonymously.

## AUTHOR CONTRIBUTIONS

YZ and JF designed the protocol. HW and BZ recruited subjects and collected data. RH analyzed the data. YZ drafted the manuscript. JF reviewed and edited the manuscript.

## FUNDING

This study was supported by the National Natural Science Foundation of China (Grant No. 81802250) and the Presidential Foundation of Nanfang Hospital (Grant No. 2017C031).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zhong, Wu, He, Zheng and Fan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Bridging Sex and Gender in Neuroscience by Shedding a priori Assumptions of Causality

Melissa M. Holmes 1,2,3 \* and D. Ashley Monks 1,2

*<sup>1</sup> Department of Psychology, University of Toronto Mississauga, Mississauga, ON, Canada, <sup>2</sup> Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada, <sup>3</sup> Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada*

Keywords: behavioral ecology, definition, gender, polymorphism, sex

"The first task of every science is the clear definition of the object it has to investigate. In no science, however, is this preliminary task so difficult as in psychology; and this circumstance is the more remarkable since logic, the science of defining, is itself a part of psychology. When we compare all that has been said by the most distinguished philosophers and scientists of all ages on the fundamental idea of psychology, we find ourselves in a perfect chaos of contradictory notions." From Haeckel (1905)

We agree with Haeckel that definitional issues are paramount in science and that we should be mindful of our human biases, especially when thinking about subjects such as sex and gender that are so central to our identity. It is our view that sex and gender can both be incorporated into neuroscience research in a meaningful, rigorous way but not with the current dichotomous approach: sex and gender are not distinct variables with mutually exclusive causes and cannot be considered as such. Our use of the terms sex and gender herein will reflect current conventional definitions, including those associated with this special issue, with "sex" referring to biological attributes and "gender" referring to social structure and socially constructed roles, behaviors, and identities. Here we argue that the most significant restriction for the successful widespread inclusion of sex and gender into neuroscience research is definitional. We believe a comparative and interdisciplinary approach to this question will help desegregate current definitions and drive sex and gender science forward in an efficient, integrative, and conceptually accurate manner.

The definitional issues surrounding sex and gender originate in part from a conflation of observable traits with inferences as to their causality. That is, "biological" and "social" are attached to "sex" and "gender," respectively, in a familiar dialectic that echoes notions of nature and nurture and, by extension, determinism and free will. This conflation is problematic not only because it leads to conceptual ambiguity and fuels unnecessary disagreements (see Griffiths, 2002, for relevant discussion of "innateness"), but also because defining traits based on presumed causality introduces a major obstacle to scientific investigation. Our thinking and experimentation should not be constrained by definitions that on the one hand are difficult to apply to observations (i.e., to categorize a trait as a manifestation of sex or gender requires knowledge of its cause) and on the other hand preclude causal investigation (traits thus defined are then canonized). A case in point is the assumption or assertion that gender is a non-biological, social construction, or that sex has a purely biological basis. This dichotomous causal inference implies orthogonality and is dubious even when only considering traditional laboratory rodents and humans, but all the more so when we take a truly comparative perspective.

#### Edited by:

*Annie Duchesne, University of Northern British Columbia, Canada*

#### Reviewed by:

*Elena Choleris, University of Guelph, Canada*

\*Correspondence: *Melissa M. Holmes melissa.holmes@utoronto.ca*

#### Specialty section:

*This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience*

Received: *15 February 2019* Accepted: *26 April 2019* Published: *09 May 2019*

#### Citation:

*Holmes MM and Monks DA (2019) Bridging Sex and Gender in Neuroscience by Shedding a priori Assumptions of Causality. Front. Neurosci. 13:475. doi: 10.3389/fnins.2019.00475*

As a species, we seem to cherish a belief that humans are fundamentally unique among animals, beyond the obvious fact that, by definition, all species are unique from each other. It is common even among academics to hold the view that humans have categorically unique cognitive and social abilities, such as language, self-awareness, technology, and culture. These beliefs persist despite evidence for at least rudimentary forms of these abilities in other species. Notable for the current discussion, there is compelling evidence for at least limited theory of mind (i.e., the ability to recognize mental states in others such as their goals, intentions, perceptions, knowledge, and/or beliefs) in diverse non-human animals. For example, chimpanzees (Pan troglodytes) are able to "pass" tests of theory of mind that are applied to human infants. These include: adaptively modifying food begging according to whether an experimenter is unable or unwilling to give food, correctly producing an action that was unsuccessfully attempted by an experimenter, using stealth adaptively to disguise food retrieval from a competitor, and selectively retrieving food that is unknown to a competitor (reviewed in Call and Tomasello, 2008). As a further example, western scrub-jays (Aphelocoma californica) not only have effective food caching strategies to minimize thieving by competitors, but also move the location of their caches when they are observed if they have prior experience of thieving from others (reviewed in Clayton et al., 2007). Several rodent species also show evidence of self-other awareness in the form of empathy or consolation behavior (Burkett et al., 2016; Mogil, 2019). Although it is important to acknowledge doubt about the extent to which the adaptive performance of other animals, or even human infants, reflects true understanding of the mental states of others rather than simpler behavioral rules (see Penn and Povinelli, 2007), we believe it is fair to say that diverse species have remarkably sophisticated social behavior, enabling behavioral responses that adapt to the mental states of conspecifics.

If other animals can, for lack of a better word, understand the goals, intentions, perceptions, and knowledge of their conspecifics, why would we assume that they are incapable of having awareness of their sex or sexuality, or that these are separate from their social environment? To the contrary, it seems to us that it would be remarkable if this did not occur to varying extents in non-human animals, and there is considerable evidence to support this conclusion. Other species, too, appear to make at least rudimentary assumptions about how conspecifics will behave based on biological and social cues and shift their sociosexual phenotype based on social environment. For example, in bluehead wrasse (Thalassoma bifasciatum), adult females can undergo complete sex change based on social cues, becoming sperm-producing males if the existing dominant male is removed from their habitat (Warner and Swearer, 1991). Furthermore, male Astatotilapia burtoni, a species of cichlid, exist in two morphs: territorial, aggressive males have striking coloration while non-territorial, subordinate males do not. Importantly, these morphs are plastic. Males can shift between phenotypes, showing changes in behavior, morphology, and neuroendocrinology, depending on their social environment (reviewed in Maruska and Fernald, 2018). Are these male morphs equivalent to different genders? We believe they could be considered as such; however, gender is, by most current definitions, a manifestation of human sociocultural factors and is therefore exclusively applied to humans.

These comparative examples highlight that the intersection of numerous physiological and behavioral traits can manifest in predictable morphs beyond "male" vs. "female". In behavioral ecology, polymorphism is defined as the occurrence of two or more forms/morphs/phenotypes at the same ontogenetic stage within a population. These morphs must be discontinuous and occur at a frequency higher than can be explained by the rate of mutation. Importantly, while the morphs themselves are discontinuous, trait expression can be either categorical or continuously distributed. For example, male plainfin midshipman fish (Porichthys notatus) exist in two morphs (reviewed in Forlano et al., 2015). Type I males are larger, establish and defend territories, and produce sonic vocalizations to attract females. Type II males are smaller, do not maintain territories, and do not produce the same vocal repertoire. Rather, they mate cryptically by ejaculating when females are laying eggs in the territory of a Type I male. In this species, the presence or absence of testes is categorical between males and females, but relative testis size varies continuously within males, with Type II males having a higher gonadosomatic index than Type I males, on average. Furthermore, polymorphisms can be strictly genetic or environmentally-cued. In sexually reproducing species, the most obvious example of a polymorphism (and this is not a coincidence) is the differentiation of an embryo along male or female lines. In mammals, this is a classic example of genetic polymorphism whereby the mechanism of sexual differentiation is provided by polymorphic sex chromosome genes (reviewed in Arnold et al., 2012). In several species, however, sex determination is environmentally controlled. For example, in leopard geckos (Eublepharis macularius), the gonadal sex of the individual is attributable to the temperature at which the egg incubates (Viets et al., 1993). This is not to say that environmentally-cued polymorphism is independent of genetics.
Rather, while genetic polymorphisms result from discontinuously distributed but continuously active genetic material, environmentally-cued polymorphisms stem from universally distributed but differentially active genetic material (Clark, 1976). Key to the current debate, because polymorphism means "many forms," it is an appropriate term for observable differences in form between members of a population regardless of how many forms and whether or not the mechanism of morph determination is known (Clark, 1976). Importantly, this concept inherently acknowledges the intersectionality of biological and environmental mechanisms.

Discussion about the intersections between genes and environment in the evolution of human sex and gender differences is ongoing (e.g., Smuts, 1995; Eagly and Wood, 2013; Liesen, 2013; Neuberg and Sng, 2013; Barker, 2015) and it has been argued that the social environment and/or culture are not entirely distinct from genetic and epigenetic mechanisms (see for example Jablonka and Lamb, 2014; Fine et al., 2017). Indeed, others have advocated for a redefinition and expansion of sex and gender categories (e.g., Fausto-Sterling, 2000; Fine, 2010; Jordan-Young, 2010; Hyde et al., 2019) or suggested methodological approaches to better integrate sex and gender in neuroscience research (e.g., Rippon et al., 2014; Joel and McCarthy, 2017; Hyde et al., 2019). However, some of this discourse is inherently based on dichotomous definitions of sex and gender whereas we further the call for an empirical, theoretically agnostic approach to the re-examination of sex and gender categories on the basis of observable traits in the absence of causal assumptions. Using polymorphism as a theoretical framework for all sex and gender research (both human and non-human animal), we can statistically determine if a given phenotype (behavioral, morphological, or otherwise) is a continuous or categorical variable and how different variables cluster together (or not). We can then analyze sex and gender variables using multilayer network analysis, which is specifically designed to explore multifaceted systems (e.g., Finn et al., 2019). We can incorporate chromosomal, hormonal, and morphological variables with key environmental variables including social and sexual experience, social rank, and current and evolutionary social/sociocultural milieu. We can disrupt networks in silico to identify putative causal relationships and generate testable hypotheses concerning orthogonality of the variables that define morphs. 
We have little doubt that some of these variables will cluster together and influence each other in clear and predictable ways, particularly given well-established links between chromosomes, gonads, and morphology. However, exactly how this happens will differ according to species and, importantly, a broad comparative approach will allow us to identify opportunities for modeling specific target mechanisms that might better align with the human condition.
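One way to make the first step of this proposal concrete, deciding whether a measured trait is better described as one continuous distribution or as two discontinuous morphs, is to compare a single-Gaussian model with a two-component Gaussian mixture using the Bayesian information criterion (BIC). The sketch below is only an illustration under stated assumptions: the choice of a Gaussian mixture and of BIC is ours and is not a method named in the text, the trait values are synthetic, and all function names (`normal_grid`, `prefers_two_morphs`, and so on) are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pvariance


def gaussian_loglik(xs, mu, var):
    """Log-likelihood of the data under a single Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var) for x in xs)


def mixture_loglik(xs, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns the log-likelihood evaluated in the final E-step.
    """
    xs = sorted(xs)
    n = len(xs)
    lower, upper = xs[: n // 2], xs[n // 2:]
    floor = 1e-3 * pvariance(xs)          # variance floor to avoid collapse
    w = [0.5, 0.5]
    mu = [fmean(lower), fmean(upper)]     # initialise from a median split
    var = [max(pvariance(lower), floor), max(pvariance(upper), floor)]
    loglik = -math.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        resp = []
        loglik = 0.0
        for x in xs:
            dens = [
                w[k] * math.exp(-0.5 * (x - mu[k]) ** 2 / var[k])
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = max(dens[0] + dens[1], 1e-300)
            loglik += math.log(total)
            resp.append((dens[0] / total, dens[1] / total))
        # M-step: update weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, floor)
    return loglik


def bic(loglik, n_params, n):
    """Bayesian information criterion; lower values indicate a preferred model."""
    return n_params * math.log(n) - 2.0 * loglik


def prefers_two_morphs(xs):
    """True if the two-component mixture has a lower BIC than one Gaussian."""
    n = len(xs)
    one = bic(gaussian_loglik(xs, fmean(xs), pvariance(xs)), 2, n)
    two = bic(mixture_loglik(xs), 5, n)
    return two < one


def normal_grid(n, mu, sd):
    """Deterministic, evenly spaced quantiles of a normal distribution."""
    dist = NormalDist(mu, sd)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


# Synthetic traits (illustrative only): one continuous, one with two morphs.
continuous_trait = normal_grid(400, 10.0, 2.0)
two_morph_trait = normal_grid(200, 5.0, 1.0) + normal_grid(200, 15.0, 1.0)
print("continuous trait prefers two morphs:", prefers_two_morphs(continuous_trait))
print("two-morph trait prefers two morphs:", prefers_two_morphs(two_morph_trait))
```

The same comparison extends to more than two components and to several traits at once, which is where questions about how variables cluster together would be addressed. Real traits are often skewed or bounded, so a Gaussian mixture should be read as a first-pass screen rather than a definitive test of discontinuity.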

In sum, we agree with the idea that neuroscientists should theoretically be both "sex-informed" and "gender-informed" but we do not think the current definitions of sex and gender facilitate this goal. We argue for a reevaluation of the current consensus definitions that primarily serve to dichotomize the biological and the social when these are inextricably intertwined in any social animal. As a result, these definitions serve to inhibit investigations of mechanism, broadly defined. Furthermore, it is our opinion that applying "sex" to non-human animals but both "sex" and "gender" to humans is fundamentally inaccurate and imposes further bias on the study of mechanism. To correct this, we either need to redefine gender to focus exclusively on those features that are truly unique to humans, which will require significant introspection and debate, or we need to more broadly apply gender concepts to non-human social animals. In pursuit of a desirable social goal (i.e., inclusion and equal opportunity for individuals) we should not ignore or deny the biological variability that exists and the mechanistic determinants that cause the variability. That is, variability is not solely caused by disadvantage, suppression, and prejudice. Conversely, in pursuit of a standardized, reductionist translational approach, we cannot ignore or deny species-specific social adaptations and the importance of social interactions for physiology. We fully acknowledge the complexity of studying/modeling sex and gender (e.g., Jordan-Young and Rumiati, 2012; Eliot and Richardson, 2016) but we believe this should be a source of scientific inspiration. We need to keep asking the questions; we just need to reframe how we do so. By taking a step back, shedding our biases about causation, and appreciating variability within and across species, we can revisit the consensus definitions of sex and gender in an unbiased, data-driven way. We believe this will reframe how we study sex and gender and ultimately better reveal the interplay between an organism's brain, body, behavior, and environment.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## FUNDING

The authors are supported by NSERC grants RGPIN 312458 (DM), RGPIN 2018-04780 (MH), and RGPAS 2018-522465 (MH) and an Ontario Early Researcher Award (MH).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Holmes and Monks. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sex Hormones and Gender Role Relate to Gray Matter Volumes in Sexually Dimorphic Brain Areas

#### Belinda Pletzer1,2 \*

<sup>1</sup> Department of Psychology, University of Salzburg, Salzburg, Austria, <sup>2</sup> Centre for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria

The present study investigates the relationship of circulating sex hormone levels and gender role to gray matter volumes in sexually dimorphic brain areas and explores whether these relationships are modulated by biological sex (as assigned at birth based on sexual anatomy) or oral contraceptive (OC) use. It was hypothesized that testosterone and masculinity relate positively to gray matter volumes in areas that are typically larger in men, like the hippocampus or cerebellum, while estradiol/progesterone and femininity relate positively to gray matter volumes in the frontal cortex. To that end, high resolution structural MRI scans, sex hormone levels and gender role self-assessments were obtained in a large sample of 89 men, 89 naturally cycling (NC) women, and 60 OC users. Men showed larger regional gray matter volumes than women in the cerebellum and bilateral clusters spanning the putamen and parts of the hippocampi/parahippocampi and fusiform gyri. In accordance with our hypotheses, a significant positive association of testosterone with hippocampal volumes was observed in women irrespective of OC use. Participants' self-reported femininity was significantly positively associated with gray matter volumes in the left middle frontal gyrus (MFG) in men. In addition, several differences between OC users and NC women were identified.

#### Edited by:

Alfonso Abizaid, Carleton University, Canada

#### Reviewed by:

Ashlyn Swift-Gallant, Memorial University of Newfoundland, Canada Wendy Portillo, National Autonomous University of Mexico, Mexico Melissa Holmes, University of Toronto Mississauga, Canada

\*Correspondence: Belinda Pletzer Belinda.Pletzer@sbg.ac.at

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 14 February 2019 Accepted: 23 May 2019 Published: 18 June 2019

#### Citation:

Pletzer B (2019) Sex Hormones and Gender Role Relate to Gray Matter Volumes in Sexually Dimorphic Brain Areas. Front. Neurosci. 13:592. doi: 10.3389/fnins.2019.00592

Keywords: sex hormones, gender role, brain structure, oral hormonal contraceptives, sex differences

## INTRODUCTION

Sex differences in brain structure and function have long been a matter of debate and have attracted considerable research interest, as they are assumed to underlie sex differences in behavior (e.g., Cosgrove et al., 2007; Andreano and Cahill, 2009). For instance, sex differences in the brain are thought to explain sex differences in cognition, or the differential vulnerability for neurodevelopmental and psychiatric disorders in men and women.

Sex differences in brain structure have repeatedly been reported, with consistencies in some areas, but inconsistencies in others (see Ruigrok et al., 2014 for a meta-analysis). Regional gray matter volumes change with age at a different rate in males and females, both during development (Gur and Gur, 2016) and during aging (Jäncke et al., 2015). These age-related changes account for some of the variability between studies. Recent meta-analyses (Ruigrok et al., 2014) and more large-scale studies (Ritchie et al., 2018) arrive at similar conclusions. In adults, larger regional volumes in males compared to females are consistently reported in subcortical areas, including the hippocampus, amygdala, basal ganglia and nucleus accumbens, in parts of the parahippocampal gyrus, in the cerebellum and the posterior cingulate cortex (PCC). Larger regional volumes in females compared to males are consistently reported in frontal areas, including the anterior cingulate cortex (ACC).

It can be speculated whether males' larger hippocampal/parahippocampal volumes play a role in their frequently reported advantage in spatial tasks (Andreano and Cahill, 2009; Levine et al., 2016) or whether females' larger volumes in frontal areas play a role in their frequently reported advantage in verbal tasks (Andreano and Cahill, 2009) or self-control (e.g., Chapple et al., 2010; Hosseini-Kamkar and Morton, 2014). However, the relationship between structure and function is not always clear-cut, and a larger volume of a brain area does not necessarily lead to better performance in tasks involving this area. As a counter-example, the hippocampus has also been implicated in verbal memory tasks, and larger hippocampal volumes relate to better verbal memory performance in females (Protopopescu et al., 2008). Also in females, larger volumes of the fusiform face area relate to better performance in a face-recognition task (Pletzer et al., 2015a). Nevertheless, these areas show regionally larger volumes in men (e.g., Pletzer et al., 2010), while women outperform men in verbal memory and face-recognition tasks (see Andreano and Cahill, 2009 for a review).

Sex differences in brain morphology are thought to result from organizational effects of sex hormones on the brain during development, both prenatally and during adolescence (Kelly et al., 1999; Cosgrove et al., 2007). At a smaller scale, sex hormones also appear to exert activational effects on the brain throughout the adult life span, the most prominent example being menstrual cycle-dependent changes (Protopopescu et al., 2008; Lisofsky et al., 2015; Barth et al., 2016; Pletzer et al., 2018). The hippocampus has consistently been reported to increase in gray matter volume during the high-estradiol pre-ovulatory phase (Protopopescu et al., 2008; Lisofsky et al., 2015; Barth et al., 2016; Pletzer et al., 2018), while an increase in right basal ganglia volumes has been observed in the high-progesterone luteal phase (Protopopescu et al., 2008; Pletzer et al., 2018). In good accordance, animal studies also report increases in hippocampal spine density in response to estradiol (Woolley and McEwen, 1993, 1994). Furthermore, animal studies have found similar estradiol-dependent changes in spine density in the frontal cortex (e.g., Hao et al., 2006). These changes are more subtle and short-lived, but suggest that sex hormones continuously reshape our brain, particularly in areas that are rich in sex hormone receptors (Barth et al., 2015). Apart from brain areas involved in the regulation of neuroendocrine axes (i.e., hypothalamus), areas with a particularly high density of sex hormone receptors include the hippocampus, the frontal cortex and the cerebellum (Barth et al., 2015). These are the same areas that show the strongest sexual dimorphism (Cosgrove et al., 2007; Ruigrok et al., 2014). Nevertheless, only a few studies have addressed whether circulating levels of sex hormones relate to gray matter volumes in these areas across participants. For instance, it has been demonstrated that cross-sex hormone treatment in male-to-female and female-to-male transsexuals alters their brain structure toward the proportions of the aspired sex (Hulshoff Pol et al., 2006; Guillamon et al., 2016). However, it has not been addressed whether subjects with higher circulating testosterone levels also display larger volumes in brain areas known to be larger in men, and whether, vice versa, subjects with higher circulating estradiol levels display larger volumes in brain areas known to be larger in women. Furthermore, it is unclear how this association may be modulated by the biological sex of participants.

Due to the accumulating knowledge of sex hormone actions on the human brain and age-related changes therein, sex hormones have been implicated as one potential cause for sex differences in cognitive functions (e.g., Kelly et al., 1999). Particularly regarding spatial abilities, a role of testosterone has been repeatedly discussed (Hooven et al., 2004; Driscoll et al., 2005; Hausmann et al., 2009; Courvoisier et al., 2013). However, most contemporary models of sex differences in cognitive functions follow a psychobiosocial approach and consider not only biological factors, like sex hormones, but also socialization (e.g., Hausmann et al., 2009; Levine et al., 2016). One social factor that has been implicated in sex differences is gender role. Gender role refers to the predominant views of what is typically male or typically female in a given society (e.g., Eagly and Koenig, 2006). The extent to which a person identifies with typically male characteristics is referred to as masculinity. The extent to which a person identifies with typically female characteristics is referred to as femininity. Unlike biological sex, gender roles are neither dichotomous nor mutually exclusive. Rather, masculinity and femininity are assessed on two different continuous scales (e.g., Bem, 1974). While typical males score high on masculinity and low on femininity and typical females vice versa, a substantial proportion of individuals show no such gender-typical identification (Bem, 1974). About 30% of individuals score high on both masculinity and femininity, and are considered androgynous (e.g., Vafaei et al., 2016).

According to the gender role mediation hypothesis (Nash, 1979), the extent to which a person identifies with the societal expectations toward their biological sex transfers to their behavior, and may explain sex differences in a variety of domains, including cognitive abilities. In line with this hypothesis, a recent meta-analysis found, for instance, that higher masculinity relates to better spatial performance in both men and women (Reilly and Neumann, 2013). However, while the actions of sex hormones on cognitive functions are thought to result from their actions on the brain, the relationship of gender role to brain structure and function has hardly been addressed. Only a handful of studies have assessed brain structure in untreated transgendered individuals (see Guillamon et al., 2016 for a review) and homosexuals (Ponseti et al., 2007; Abé et al., 2014; Manzouri and Savic, 2018), both groups that usually show low gender role conformity. In general, these studies suggest few differences between untreated transsexual or homosexual individuals and non-transsexual heterosexual individuals. However, these studies are characterized by small sample sizes and a certain variability in the inclusion criteria for the transsexual or homosexual groups. Thus, it remains unclear whether participants with higher masculinity show a more male-typical brain morphology, i.e., larger gray matter volumes in brain areas that are typically larger in men. Vice versa, it has not been assessed whether participants with high femininity show a more female-typical brain morphology, i.e., larger gray matter volumes in brain areas that are typically larger in women. However, a few findings do stand out. For instance, Luders et al. (2012) report larger cortical thickness in the left MFG of untreated male-to-female transsexuals compared to non-transsexual men. Abé et al. (2014) report smaller hippocampal volumes in homosexual men compared to heterosexual men. It can thus be speculated that, at least in men, gender identity is reflected to some extent in brain morphology. For instance, brain structure has been related to personality (e.g., Riccelli et al., 2017), and our understanding of what is masculine and what is feminine relies to a great extent on personality traits (Bem, 1974; Gruber et al., in press). Accordingly, most measures assessing masculinity and femininity include personality dimensions, like expressivity on the femininity scale and assertiveness on the masculinity scale (e.g., Eagly and Koenig, 2006). It is thus possible that a person's perception of how masculine or feminine they are depends in part on their brain morphology and chemistry. A recent fMRI study assessed brain activation in men and women during the processing of gender-related attributes (Hornung et al., 2019) of the kind used to assess gender role (Gruber et al., in press). It found stronger activation for gender-congruent attributes in the amygdala and putamen (Hornung et al., 2019). The present study focuses on brain morphology.

The aim of the present study is to assess whether circulating levels of sex hormones and/or participants' masculinity and femininity relate to gray matter volumes in brain areas that show a high density of sex hormone receptors and have been described as sexually dimorphic. To address these questions, a hypothesis-driven region-of-interest (ROI) based approach is combined with more exploratory whole-brain analyses. The hippocampus was selected as a brain area with high density of sex hormone receptors that is typically larger in men. The middle frontal gyrus (MFG) was selected as a brain area with high density of sex hormone receptors that is typically larger in women. To that end, high-resolution structural MRIs, saliva samples and gender role ratings were obtained from a large sample of 89 men and 149 women. It was hypothesized that testosterone and masculinity relate positively to gray matter volumes in areas that are typically larger in men, like the hippocampus or cerebellum, while estradiol/progesterone and femininity relate positively to gray matter volumes in the frontal cortex. It was also explored whether these relationships are modulated by biological sex.

An important factor to consider when addressing these questions is hormonal contraceptive use in women. Oral hormonal contraceptives (OC) contain synthetic estrogens and progestins that influence a woman's endogenous hormonal milieu (e.g., Wiegratz et al., 2003). Previous work has outlined potential OC-dependent effects on gray matter volumes in sexually dimorphic brain areas (Pletzer et al., 2010, 2015a; De Bondt et al., 2013; Petersen et al., 2015) and on gender role (Pletzer et al., 2015b). Across different cultures, OC-users describe themselves as more feminine compared to non-users (Pletzer et al., 2015b), even though several studies indicate that their behaviors and brain activation patterns may in fact be more comparable to men (e.g., Nielsen et al., 2011; Pletzer et al., 2014). Accordingly, effects of OC-use will also be assessed in all analyses.

## MATERIALS AND METHODS

## Participants

As an add-on to three different neuroimaging studies, 89 men (mean age: 24.18 ± 4.44 years), 89 naturally cycling women (mean age: 24.02 ± 3.94 years), and 60 women using oral hormonal contraceptives (OC; mean age: 21.42 ± 2.46 years) completed self-ratings for their masculinity/femininity.

In all three studies, participants were right-handed, Caucasian, aged between 18 and 35 years, heterosexual, had no diagnosis of psychological, neurological or endocrinological disorders and no brain tissue abnormalities on the structural MRI. The majority of participants were university students who had completed general qualification for university entrance. All naturally cycling women had a regular menstrual cycle of 21 to 35 days length (mean duration: 29.36 ± 2.91 days).

Among them a subsample of 54 men (mean age: 24.33 ± 4.37 years) and 51 naturally cycling women (mean age: 24.12 ± 4.26 years) also completed a standardized gender role questionnaire. For those 51 women, mean cycle duration was 29.11 days (SD = 3.05).

Naturally cycling women and men did not differ in age (both |t| < 0.26; both p > 0.79), but hormonal contraceptive users were significantly younger than the other two groups (both t > 4.87, both p < 0.001).

## Ethics Statement

The studies were approved by the University of Salzburg's ethics committee and conform to the Code of Ethics of the World Medical Association (Declaration of Helsinki). Informed written consent was obtained from all participants.

## Procedure

Study 1 investigated brain responses to different risk-taking tasks (Pletzer and Ortner, 2016). Study 2 investigated sex differences in brain responses to numerical and attention tasks (Pletzer, 2016; Pletzer and Harris, 2018) and included only naturally cycling women. Study 3 was a resting state study (currently unpublished).

In all three studies, participants gave one saliva sample before entering the scanner and one saliva sample after scanning, both via the passive drool method. Questionnaires were completed on site immediately after the scanning session so as not to interfere with the main research question of the studies. Self-ratings were included in all three studies; the GERAS was completed by participants of Study 2 and some participants of Study 1.

Among the 89 naturally cycling women in the whole sample, 58 were scanned in their luteal cycle phase (11-3 days before the onset of the next menses, counted backwards; mean cycle day: 22.03 ± 3.98). The remaining 31 women were unavailable during their luteal cycle phase and scanning sessions were scheduled during or shortly after the next menses (mean cycle day: 7.25 ± 4.07). Among the 60 hormonal contraceptive users in the whole sample, 39 used contraceptives containing androgenic progestins (Levonorgestrel, Desogestrel, Dienogest, Gestoden), while 16 used contraceptives containing anti-androgenic progestins (Drospirenon, Cyproteronacetat, Chlormadinonacetat). Five women were unable to provide information about the hormonal contraceptives they were using.

As expected, progesterone [t(84) = 6.78, p < 0.001] was significantly higher during the luteal cycle phase compared to the early follicular cycle phase, while testosterone and estradiol did not differ between cycle phases (both t < 1.29, both p > 0.20). However, in accordance with our previous studies, hippocampal volumes did not differ significantly between menses and luteal cycle phase (both t < 0.50, both p > 0.61; compare Pletzer et al., 2018), MFG volumes were only by trend higher in the luteal cycle phase (both t < 1.91, both p = 0.06; compare Pletzer et al., 2018) and masculinity/femininity ratings did not differ significantly between menses and luteal cycle phase (all |t| < 1.54, all p > 0.13; compare Pletzer et al., 2015a). Furthermore, there were no differences between pill-types in masculinity/femininity self-ratings, sex hormones or GM-volumes (all |t| < 1.58, all p > 0.13). Accordingly, NC-women and OC-users were not split into sub-groups for the analyses.

Among the 51 naturally cycling women in the subsample, 41 were scanned in their luteal cycle phase (mean cycle day: 21.78 ± 3.85), while 10 were scanned in their menses (mean cycle day: 8.80 ± 4.60). Again, the NC group was not split by cycle phase due to the small number of participants in their menses.

## Hormone Analysis

Prior to analysis, saliva samples were stored frozen at −20°C and centrifuged twice at 3000 rpm for 15 min and 10 min, respectively. As recommended by the ELISA kit instructions, aliquots from both samples were then pooled to account for fluctuation in hormone release and saliva production and to obtain a more stable measure of hormone levels throughout the scanning session. Estradiol, progesterone and testosterone levels were assessed using DeMediTec<sup>1</sup> salivary ELISA kits (DES6644, DES6633, and DES6622). All samples were assessed in duplicate and assessment was repeated for samples showing coefficients of variation (CV) above 25%. For estradiol, sensitivity is 1.4 pg/ml, intra-assay CV is 8.5%, inter-assay CV is 7%. For progesterone, sensitivity is 5 pg/ml, intra-assay CV is 7%, inter-assay CV is 9%. For testosterone, sensitivity is 2.2 pg/ml, intra-assay CV is 7.5%, inter-assay CV is 9%. For three participants (two men, one OC), hormone levels were not assessed due to visible blood contamination. Hormone levels of more than three SD above the group mean were discarded prior to analyses (E: two men, one OC, two NC; P: two OC, two NC).
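The two quality-control rules described above (re-assaying duplicates with a CV above 25%, and discarding values more than three SD above the group mean) can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function names are hypothetical, and it assumes the sample standard deviation and a group mean computed with the candidate outlier still included, neither of which is specified in the text.

```python
from statistics import mean, stdev

def duplicate_cv(a, b):
    """Coefficient of variation (in %) of two duplicate measurements."""
    return 100.0 * stdev([a, b]) / mean([a, b])

def needs_rerun(a, b, max_cv=25.0):
    """True if the duplicates disagree by more than max_cv percent."""
    return duplicate_cv(a, b) > max_cv

def drop_high_outliers(values, n_sd=3.0):
    """Keep values no more than n_sd SDs above the group mean.

    Assumption: mean and SD are computed on the full group,
    including the values that end up being discarded.
    """
    m, sd = mean(values), stdev(values)
    return [v for v in values if v <= m + n_sd * sd]
```

Only high values are removed, mirroring the one-sided rule stated in the text; a symmetric rule would need an additional lower bound.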

## Questionnaires

#### Gender Role Self-Assessment

On a nine-point Likert-Scale, participants were asked to rate how masculine or feminine they perceived themselves in comparison to (other) men, (other) women, or the general population. The same scale was already employed by Pletzer et al. (2015b). The three comparisons were performed to take into account the fact that women tend to compare themselves to other women, while men tend to compare themselves to other men (Pletzer et al., 2015b). These ratings represent subjective measures of masculinity and femininity and depend on the participant's personal understanding of these concepts. As outlined by Pletzer et al. (2015b), the concepts of masculinity and femininity vary between cultures and possibly also subcultures, e.g., depending on education or generation.

<sup>1</sup>http://www.demeditec.com

#### Gender-Related Attributes Scale (GERAS)

To additionally obtain a more objective measure of masculinity and femininity, a subsample of participants also completed the gender-related attributes scale (GERAS). The GERAS was developed by Gruber et al. (in press) as a standardized measure to assess gender role via attributes that are typically perceived as masculine or feminine in middle European cultures. It has been well-validated and shows excellent internal consistency and reliability (Gruber et al., in press). It extends previous sex/gender role inventories (e.g., Bem Sex Role Inventory, Bem, 1981; Personal Attributes Questionnaire, Spence et al., 1975) by including not only personality traits, but also cognitive abilities and interests that are typically associated with the male or female gender on three subscales: (i) personality subscale, 20 items (10 masculine, 10 feminine), (ii) cognition subscale, 14 items (7 masculine, 7 feminine), and (iii) interests subscale, 16 items (8 masculine, 8 feminine). The personality subscale consists of 20 traits (both positive and negative), 10 of which are typically associated with the male (e.g., dominant, bold) and 10 with the female gender (e.g., warm-hearted, sensitive). Participants are asked to rate how often they think these traits apply to them. The cognition subscale consists of 14 cognitive skills, for which previous studies have demonstrated sex differences favoring men (e.g., find a way) or women (e.g., find the right words). Participants are asked to rate how well they think they are able to perform these tasks. The interests subscale consists of 16 activities, 8 of which are stereotypically preferred by men (e.g., boxing, drinking), while the other 8 are stereotypically preferred by women (e.g., dancing, talking). Participants are asked to rate how interested they would be to engage in these activities. All ratings are performed on a seven-point Likert-scale. For each subscale, separate masculinity and femininity scores are obtained by averaging the ratings for masculine and feminine items, respectively. The overall masculinity and femininity scores are obtained by averaging the masculinity and femininity scores of the three subscales.
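The scoring rule just described (item means per subscale, then the mean of the three subscale scores) can be illustrated with a short sketch. The function names and the example ratings are hypothetical; only the averaging scheme and the example item labels are taken from the description above.

```python
from statistics import mean

def subscale_scores(ratings, masculine_items):
    """Return (masculinity, femininity) for one subscale.

    ratings: dict mapping item name -> rating on the 1-7 scale.
    masculine_items: set of item names keyed as masculine;
    all remaining items are treated as feminine.
    """
    masc = [r for item, r in ratings.items() if item in masculine_items]
    fem = [r for item, r in ratings.items() if item not in masculine_items]
    return mean(masc), mean(fem)

def overall_scores(subscales):
    """Average the (masculinity, femininity) pairs of the subscales."""
    masc = mean(m for m, _ in subscales)
    fem = mean(f for _, f in subscales)
    return masc, fem

# Hypothetical ratings for four personality items.
personality = {"dominant": 6, "bold": 4, "warm-hearted": 3, "sensitive": 5}
print(subscale_scores(personality, {"dominant", "bold"}))
```

Because the overall score averages subscale means rather than pooling all items, each subscale contributes equally even though the subscales differ in length (20, 14 and 16 items).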

## MRI Data Acquisition and Analysis

All three studies were performed on the same scanner (Siemens Magnetom Trio Tim 3 Tesla) at the Christian Doppler Klinik (Salzburg, Austria). All studies included the same scanning sequence to obtain a high resolution structural scan using a T1-weighted sagittal 3D MPRAGE sequence (TR = 2300 ms, TE = 2.91 ms, TI delay of 900 ms, FOV 256 mm, slice thickness = 1.00 mm, flip angle 9°, voxel size 1.0 × 1.0 × 1.0 mm, 160 sagittal slices). Images were segmented into gray matter, white matter and CSF partitions using CAT12 standard procedures and templates. SPM12 tissue probability maps and European brain templates for affine regularization were used during the initial SPM12 affine registration, light affine preprocessing and moderate (0.5) strength of local adaptive segmentation, skull stripping and final clean-up for CAT12 segmentation. Images were spatially normalized to the same stereotactic space (MNI template) and voxel size for normalized images was set to 1.5 mm. To control for individual differences in brain size, brain segments were modulated using non-linear normalization parameters.

For ROI-based analyses, gray matter volumes were extracted from the left and right hippocampus, as well as the left and right MFG using the get\_totals script by G. Ridgeway<sup>2</sup>. Masks were constructed via the wfu-pickatlas toolbox, using AAL masks for the hippocampus and 10 mm spheres around the coordinates that showed the strongest sex difference favoring women for the MFG. The extracted gray-matter volumes were analyzed using JASP 0.8.1.1 (see section "Statistical Analysis").
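Conceptually, an ROI volume of this kind is the sum of the modulated gray matter values inside the mask multiplied by the voxel volume. The sketch below illustrates that idea together with a spherical mask; it is not the code of the script or toolbox named above. The function names are hypothetical, the image is represented as a plain dictionary from voxel index to value, and the sphere centre is given as a voxel index rather than an MNI coordinate.

```python
import itertools

def sphere_mask(shape, center, radius_mm, voxel_mm):
    """Set of voxel indices whose centres lie within radius_mm of center.

    shape and center are in voxel units; voxel_mm is the isotropic
    voxel edge length (1.5 mm for the normalized images in the text).
    """
    mask = set()
    for idx in itertools.product(*(range(n) for n in shape)):
        dist2 = sum(((i - c) * voxel_mm) ** 2 for i, c in zip(idx, center))
        if dist2 <= radius_mm ** 2:
            mask.add(idx)
    return mask

def roi_volume_ml(gm, mask, voxel_mm):
    """Sum modulated GM values inside the mask, converted to millilitres."""
    voxel_ml = voxel_mm ** 3 / 1000.0  # mm^3 -> ml
    return sum(gm[idx] for idx in mask) * voxel_ml
```

With the parameters reported above, a call would look like `sphere_mask(shape, center, radius_mm=10, voxel_mm=1.5)`, where `shape` and `center` depend on the image grid and the chosen peak coordinate.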

For the more exploratory, whole-brain analyses, gray matter partitions were smoothed using a 12 mm Gaussian kernel. The smoothed images were entered into SPM12 second level analyses. Total intracranial volume (TIV) and age were entered as covariates in all analyses. In a first step, men, naturally cycling women and OC users were compared using a one-way ANOVA design. Sexually dimorphic brain areas were identified by defining contrasts comparing men to both female groups. In addition, contrasts comparing OC-users to NC women were also defined. In a second step, whole brain multiple regression designs were used to identify areas sensitive to sex hormone levels or gender role. These whole brain regression analyses were performed separately for each group. A primary uncorrected threshold of p < 0.001 and a secondary cluster-level family wise error (FWE) corrected threshold of pFWE < 0.05 were used.

## Statistical Analysis

Statistical analysis was performed using JASP 0.8.1.1. Since age differed significantly between NC and OC women, age was controlled in all analyses. For analyses of brain volumes, TIV was entered as an additional covariate. Accordingly, ANCOVAs were used to compare endocrine measures, behavioral measures and brain volumes between groups, while multiple regression analyses were used to relate endocrine and behavioral measures to gray matter volumes. For group comparisons in the whole sample, the omnibus test comparing all three groups (men, NC, OC) is reported in the text and pairwise comparisons are listed in **Table 1**. For pairwise comparisons, an FDR-correction of p-values was used.
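The text does not state which FDR procedure was applied to the pairwise comparisons. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg procedure and shows how adjusted p-values could be obtained for the three pairwise tests (men vs. NC, men vs. OC, NC vs. OC). The function name and the example p-values are hypothetical.

```python
def fdr_bh(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for three pairwise comparisons.
print(fdr_bh([0.01, 0.04, 0.03]))
```

A comparison would then be called significant if its adjusted p-value falls below the chosen FDR level.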

Since previous work has outlined OC-dependent effects not only on gray matter volumes (Pletzer et al., 2010, 2014; Petersen et al., 2015), but also on sex hormone levels (Wiegratz et al., 2003) and on gender role (Pletzer et al., 2015b), the following analysis approach was chosen for ROI-based multiple regression analyses. In a first step, it was assessed how men differed from naturally cycling women by accounting for biological sex in the analyses, but excluding OC-users. In a second step, naturally cycling women were compared to OC-users by accounting for OC-use in the analyses, but excluding men. Multiple regression analyses modeled age and TIV, sex hormones/gender role, and biological sex/OC-use, as well as their interactions as independent variables. If significant interactions were observed, separate partial correlations controlling for age and TIV were performed for each group to clarify the direction of the effect.
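A partial correlation controlling for age and TIV, as used in the follow-up analyses, can be computed from ordinary Pearson correlations by removing one covariate at a time. The sketch below uses this standard recursive formula; the function names are hypothetical and the analyses reported here were run in JASP, not with this code.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, covariates):
    """Correlation of x and y after removing the linear effects of
    the covariates (a list of sequences, e.g., [age, tiv])."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    # Partial out the remaining covariates first, then z.
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For a given group, a call such as `partial_corr(testosterone, hippocampus_volume, [age, tiv])` would yield the kind of coefficient reported in the Results, with the variable names here being placeholders.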

## RESULTS

## Endocrine Results

In the whole sample (**Table 1**), testosterone, progesterone and estradiol levels differed significantly between groups [T: F(2,232) = 78.05, p < 0.001, η<sup>2</sup> = 0.40; P: F(2,227) = 19.26, p < 0.001, η<sup>2</sup> = 0.15; E: F(2,220) = 5.99, p = 0.005, η<sup>2</sup> = 0.05]. Post hoc comparisons revealed that testosterone levels were significantly higher in men compared to women irrespective of their hormonal status. Progesterone and estradiol levels were significantly higher in NC women compared to men. Testosterone and progesterone levels were significantly higher in NC women compared to OC users.

## Behavioral Results

#### Gender Role Self-Assessment

In the whole sample (**Table 1**), significant group differences were observed in both self-rated masculinity [F(2,234) = 119.44, p < 0.001, η<sup>2</sup> = 0.51] and self-rated femininity [F(2,234) = 108.23, p < 0.001, η<sup>2</sup> = 0.48]. Men rated themselves as significantly more masculine and significantly less feminine than women irrespective of their hormonal status. Women on hormonal contraceptives rated themselves as significantly more feminine and significantly less masculine than naturally cycling women.

In men, masculinity and femininity self-ratings showed a highly significant negative interrelation (r = −0.55, p < 0.001). Similarly, in NC women a weaker negative association was observed between masculinity and femininity self-ratings (r = −0.23, p = 0.03). In OC women, no significant association between masculinity and femininity self-ratings was observed (r = −0.19, p = 0.15). The correlation in men was significantly stronger than in the female groups (both Z > 2.52, both p < 0.012). Correlation coefficients did not differ significantly between NC and OC women (Z = 0.25, p = 0.80).
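The test used to compare correlation coefficients between groups is not named in the text. A common choice for independent samples is Fisher's r-to-z transformation, sketched below under that assumption. The function name is hypothetical, and the group sizes in the usage example (89 men, 89 NC women) assume that all participants entered the correlation.

```python
from math import atanh, erfc, sqrt

def compare_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent samples.

    Returns the Z statistic and the two-sided p-value.
    """
    z1, z2 = atanh(r1), atanh(r2)
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return z, p

# Men (r = -0.55) versus NC women (r = -0.23), assuming n = 89 each.
z, p = compare_correlations(-0.55, 89, -0.23, 89)
print(round(abs(z), 2), round(p, 3))
```

Under these assumptions the statistic has a magnitude of about 2.5, in line with the values reported above; if the reported correlations were partial correlations, the degrees of freedom would differ slightly.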

#### GERAS

Also in the GERAS, men reached significantly higher masculinity and significantly lower femininity scores compared to NC women. Differences were strongest for the interests subscale and weakest for the cognition subscale (**Table 2**). The masculinity and femininity subscales of the GERAS were not significantly interrelated in either men or NC women (both |r| < 0.15, both p > 0.30).

In men, self-rated masculinity correlated significantly with masculinity scores as assessed by the GERAS (r = 0.43, p = 0.001), while self-rated femininity did not correlate with femininity as assessed by the GERAS (r = −0.14, p = 0.33). In NC women, self-rated masculinity did not correlate with masculinity as assessed by the GERAS (r = 0.09, p = 0.54), while self-rated femininity correlated significantly with femininity as assessed by the GERAS (r = 0.34, p = 0.01). Taking into account GERAS subscales, the best predictor of men's self-rated masculinity and women's self-rated femininity was the personality subscale (**Table 3**).

<sup>2</sup>http://www0.cs.ucl.ac.uk/staff/gridgway/vbm/get\_totals.m


TABLE 1 | Average hormone levels and masculinity/femininity self-assessment for men and women.

Age was controlled in all comparisons. P-values were FDR-corrected. NC, naturally cycling. OC, oral contraceptive user. MFG, middle frontal gyrus. L, left; R, right. Significant effects are highlighted in bold font.

Sex hormone levels were not correlated with masculinity or femininity scores (either self-rated or assessed with the GERAS) in either men, NC-women or OC-women (all |r| < 0.17, all p > 0.20).

## Neuroimaging Results

#### Group Differences

In the whole sample (**Table 1**), significant group differences in TIV were observed [F(2,234) = 80.53, p < 0.001, η<sup>2</sup> = 0.41]. Post hoc comparisons revealed that TIV was significantly larger in men compared to women, but did not differ between naturally cycling women and OC users. Controlling for age and TIV, the three groups did not differ significantly in overall WM volumes [F(2,233) = 0.89, p = 0.41, η<sup>2</sup> = 0.002], but group differences were identified in GM volumes [F(2,233) = 6.25, p = 0.002, η<sup>2</sup> = 0.02]. Men and NC women did not differ in GM volumes, after age and TIV were accounted for. However, OC users had significantly smaller overall GM volumes than NC women.

Age was controlled in all comparisons.

In the ROI analyses, significant group differences were observed in the left hippocampus [F(2,233) = 4.00, p = 0.02, η<sup>2</sup> = 0.03] and the right MFG [F(2,233) = 11.74, p < 0.001, η<sup>2</sup> = 0.07]. Pairwise comparisons revealed that OC users showed significantly smaller gray matter volumes in the left hippocampus than NC women. Men showed significantly smaller volume in the right MFG than women irrespective of their hormonal status. No significant group differences were observed in the right hippocampus [F(2,233) = 2.66, p = 0.07, η<sup>2</sup> = 0.02] and the left MFG [F(2,233) = 1.28, p = 0.28, η<sup>2</sup> = 0.01].

At the whole-brain level, regional volume differences between men and women (both groups) are depicted in **Figure 1**. Controlling for age and TIV, men showed larger GM-volumes than women in the cerebellum [(34, −78, −20), 16405 voxels, T = 6.86, pFWE < 0.001] and a large cluster spanning the bilateral putamen, hippocampi, parahippocampi and amygdalae [(26, −3, −9), 7319 voxels, T = 5.00, pFWE = 0.005]. Women showed larger GM volumes than men in the frontal pole [(−18, 68, −4), 4241 voxels, T = 5.01, pFWE = 0.005], right MFG [(36, 18, 27), 627 voxels, T = 5.01, pFWE = 0.004; **Table 1**] and right IFG [(46, 50, 10), 2491 voxels, T = 4.95, pFWE = 0.006].

Controlling for age and TIV, OC users showed significantly smaller regional GM-volumes than naturally cycling women in the right parahippocampal/fusiform gyrus [(28, −14, −42), 1882 voxels, T = 5.29, pFWE = 0.001; **Figure 2**].

#### Sex Hormones and GM-Volumes

In men and NC women, sex hormones were not related to TIV, overall GM or WM volumes (all |r| < 0.17, all p > 0.11). In OC users, estradiol levels were significantly related to TIV and overall GM and WM volumes (all r > 0.25, all p < 0.05; results not shown), but there was no association between testosterone or progesterone levels and TIV/GM/WM (all |r| < 0.20, all p > 0.13). The higher the estradiol levels of OC users, the larger were their brains.

TABLE 3 | Interrelation between sex-role self-assessment and GERAS-scores.

Significant effects are highlighted in bold font.

FIGURE 1 | Sex differences in regional gray matter volumes. Areas with larger regional volumes in men are depicted in blue (Cerebellum, Hippocampus/Parahippocampus, Amygdala, Putamen). Areas with larger regional volumes in women are depicted in red (frontal pole, middle/inferior frontal gyrus).

women and OC users. Areas with smaller regional volumes in OC users are depicted in green.

In the ROI-based analyses of sex hormones, significant sex × testosterone interactions were identified in the hippocampi, which are attributable to the fact that testosterone related more positively to hippocampal volumes in women (left: r = 0.26, p = 0.02; right: r = 0.16, p = 0.15) than in men (left: r = 0.07, p = 0.53; right: r = −0.14, p = 0.19; **Figure 3** and **Table 4**). This association did not differ between OC-users and NC women (**Table 5**).

For the left MFG, a significant interaction between OC-use and testosterone was observed (**Table 5**). This interaction resulted from a negative association with testosterone in OC women (r = −0.27, p = 0.04), but a non-significant association in NC women (r = 0.13, p = 0.25).

No additional associations to sex hormones were observed in whole-brain analyses.

#### Gender Role and GM-Volumes

In men and NC women, neither self-rated nor GERAS masculinity or femininity were related to TIV, GM or WM volumes (all |r| < 0.15, all p > 0.17). In OC users, self-rated femininity was negatively related to TIV (r = −0.25, p = 0.05; results not shown).

Neither self-rated nor GERAS masculinity or femininity were related to GM-volumes in any ROI and there were no differences in these associations depending on biological sex or OC-use (all |b| < 0.26, all |t| < 1.98, all p > 0.05).

Whole-brain analyses revealed no associations between gender role as assessed by the GERAS and GM volumes in any brain area. In men, larger GM volumes in the left MFG were significantly related to higher femininity self-ratings [(−32, 36, 22), k = 721 voxels, T = 5.01, pFWE = 0.015; **Figure 4**]. The more GM in the left MFG, the more feminine men considered themselves. Masculinity self-ratings were not related to GM volumes in any area. In NC-women and OC-users, masculinity and femininity self-ratings were not related to GM volumes in any area.

## DISCUSSION

The present study set out to (i) investigate the relationship of circulating sex hormone levels and gender role to gray matter volumes in sexually dimorphic brain areas and explore whether

FIGURE 3 | Relationship of testosterone to hippocampal gray matter volumes. A positive relationship was observed in women (left: r = 0.26, p = 0.02; right: r = 0.16, p = 0.15) but not in men (left: r = 0.07, p = 0.53; right: r = –0.14, p = 0.19). Women with higher testosterone levels showed larger hippocampal gray matter volumes. This association was irrespective of oral contraceptive (OC) use, i.e., both naturally cycling women and OC users are included in the female data depicted. For illustrative purposes, the x-scale was cut at 150 pg/ml of testosterone. Note that male values reached up to 350 pg/ml.

TABLE 4 | Relationship of sex hormones to gray matter volumes in the hippocampus and middle frontal gyrus (MFG) while controlling for biological sex.


MFG = middle frontal gyrus. L, left; R, right. <sup>∗</sup>p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001. Significant effects are highlighted in bold font.


TABLE 5 | Relationship of sex hormones to gray matter volumes in women, while controlling for OC-use.

MFG = middle frontal gyrus. L = left, R = right. <sup>∗</sup>p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001. Significant effects are highlighted in bold font.

these relationships are modulated by (ii) biological sex and (iii) OC-use. Indeed, a variety of associations of sex hormones and gender role with gray matter volumes were observed that depended on either biological sex or OC-use. The fact that gender role and sex hormones showed no significant interrelation in this sample underlines the view of gender role as a social construct and provides the opportunity to study their influence on gray matter volumes independently.

In accordance with our hypothesis, testosterone related positively to gray matter volumes in the hippocampi (**Figure 3**), i.e., brain areas showing a sexual dimorphism favoring men. These findings are in good accordance with studies in transsexuals, demonstrating that cross-sex hormone treatment alters brain structure toward the proportions of the aspired sex (Hulshoff Pol et al., 2006; see Guillamon et al., 2016 for a review). Furthermore, they are in good accordance with animal studies, demonstrating a testosterone-dependent increase in synaptic spine density in the hippocampus (e.g., Leranth et al., 2003). The association was significantly stronger in women compared to men, but did not differ significantly between NC women and OC-users, even though the strongest associations were observed in NC women. This observation is consistent with the animal literature showing testosterone actions on hippocampal spine density via local conversion to estradiol also only in females (see Atwi et al., 2016 for a review).

Furthermore, in line with our hypothesis, a positive association between self-rated femininity and gray matter volumes in the left MFG, i.e., a brain area typically larger in women, was observed in men. Men who perceive themselves as more feminine show larger left MFG volumes. This is in line with a previous study reporting larger cortical thickness in the left MFG of untreated male-to-female transsexuals compared to men (Luders et al., 2012). This association probably reflects an important role of the MFG in personality traits and other characteristics typically considered feminine. Results of the present study support the assumption that gender role self-concepts are largely driven by personality (compare **Table 3**). As an example, conscientiousness (Schmitt et al., 2008), which is typically higher in women, was shown to relate to gray matter volumes in the lateral prefrontal cortex (DeYoung et al., 2010).

It is an interesting observation that female brain structure relates more strongly to sex hormone levels than male brain structure, while male brain structure relates more strongly to gender roles than female brain structure. While several associations between gray matter volumes and testosterone were observed in women, no significant associations with sex hormones were observed in men. Based on this observation, it can be speculated that the female brain is more susceptible to sex hormone influences, which is plausible given the continuous plasticity required to respond to hormonal fluctuations along the menstrual cycle (e.g., Pletzer et al., 2018; see Sundström Poromaa and Gingnell, 2014 for a review) or during other hormonal transition periods (e.g., pregnancy, menopause, see Barth et al., 2015 for a review). Notably, however, no associations with estradiol or progesterone were observed in women, even though such influences have been demonstrated in within-subjects designs in women (Barth et al., 2016; Pletzer et al., 2018). It is possible that gray matter volumes do not depend so much on the absolute circulating hormone level, as measured in the present study, but respond to sudden changes in hormone levels, which can only be assessed in within-subject designs.

Vice versa, no association between gender role and brain structure was observed in women. This finding is in line with previous research on transsexual and homosexual participants. Altered regional brain morphology was only observed in untreated male-to-female transsexuals and male homosexual participants (Luders et al., 2012; Abé et al., 2014), not in female-to-male transsexuals or female homosexual participants (Guillamon et al., 2016; Manzouri and Savic, 2018). Furthermore, a stronger association of personality traits with brain structure in men compared to women has also been previously reported (Nostro et al., 2016). One research question that arises from this observation is whether the stability of personality traits or gender role constructs differs between men and women. If women's gender role self-concept is more flexible over time, a relationship to brain structure is not to be expected. While this has hardly been assessed for personality or gender identity, a higher flexibility in women has been reported regarding sexual orientation (Kinnish et al., 2005), which may explain the lack of brain structural differences between homosexual and heterosexual women (Manzouri and Savic, 2018).

However, apart from the relationship of self-reported femininity to the left MFG volumes in men, no association between masculinity or femininity (whether self-rated or questionnaire-based) and brain structure was observed. Again, this is in line with previous literature on transsexual individuals reporting that before sex hormone treatment their brain morphology largely corresponds to their natal sex (Guillamon et al., 2016). While personality traits have been successfully related to brain morphology, it is important to keep in mind that the gender role self-concept develops at the interplay between an individual's traits, abilities and interests on the one hand and social norms on the other hand. While some traits may relate to different brain structures, the same traits and abilities may result in different perceptions of masculinity/femininity in different cultural contexts. Furthermore, as the factorial structure of the GERAS shows, gender role is a multifaceted construct spanning a variety of traits, abilities and interests, which cannot all be pinpointed to the same brain area. It is, however, possible that several of the traits contributing to femininity intersect in the left MFG. The fact that men tend to show stronger lateralization of brain functions may also have contributed to this finding (e.g., Hausmann and Güntürkün, 1999, 2000).

Finally, some important differences between naturally cycling women and OC-users have been identified. In interpreting these differences, it is important to keep in mind that the results reported here represent between-group comparisons. It is thus possible that the groups of OC-users and NC-women differ for reasons other than their OC-use. First, OC users show significantly lower testosterone and progesterone levels than NC women, which is probably a result of the downregulation of the HPG-axis by synthetic steroids (Wiegratz et al., 2003). Estradiol levels did not differ significantly between OC-users and NC women. This may be due to the fact that none of the NC-women tested in the present study were in the preovulatory phase, when estradiol levels peak, but may also be the result of some cross-reactivity between the synthetic ethinylestradiol and the antibodies used for estradiol assessment. Second, the finding that OC-users rate themselves as more feminine and significantly less masculine compared to naturally cycling women was replicated (Pletzer et al., 2015b). This observation does not necessarily imply a hormonal modulation of gender role. There are various non-hormonal reasons why women on OCs may perceive themselves as more feminine. On the one hand, the daily intake of a pill controlling one's reproductive functions may act as a constant reminder of one's own femininity. On the other hand, it is possible that the heightened femininity is not a result of the OC-use, but that women who consider themselves more feminine are more likely to choose OCs as a contraceptive method. This is probably related to the fact that a majority of women start OC-use when entering a long-term relationship. Accordingly, the increased femininity may be a result or a prerequisite of OC-users' different relationship status. Note, however, that relationship status was not assessed in the present study.

Furthermore, several differences in brain volume results between OC-users and NC women were observed. The fact that OC-users show larger TIV is likely attributable to chance in sampling, which represents another issue in between-group comparisons. More importantly, OC-users show smaller regional GM volumes than NC women in the hippocampi and parahippocampal gyri. These results are in contrast to previous studies demonstrating larger gray matter volumes of OC-users compared to non-users in the hippocampus, parahippocampus and fusiform gyri (Pletzer et al., 2010, 2015a; De Bondt et al., 2013). These inconsistencies between studies highlight the importance of longitudinal study designs to disentangle effects of OC-use from other variables that might differentiate OC-users and NC-women in cross-sectional designs. The inconsistencies may be a result of different actions of the various progestin compounds contained in different OCs. The majority of OC-users in the present study used OCs containing androgenic progestins, i.e., progestins that are derived from 19-nortestosterone and thus able to bind to testosterone receptors. Previous findings of increased parahippocampal/fusiform volumes were observed in users of anti-androgenic progestins (Pletzer et al., 2015a). A reduction of GM-volumes has previously been reported for androgenic progestins, albeit in the MFG (Pletzer et al., 2015a). Comparably, in the present study, OC-users show smaller left MFG-volumes with higher testosterone levels.

In summary, our study corroborates findings of activational effects of sex hormones on brain morphology in adults, demonstrating that, at least in women, testosterone promotes a more male-like brain morphology and estradiol a more female-like brain morphology. In addition, our study also demonstrates for the first time an association between a more feminine gender role and a more female-like brain morphology in men. Finally, our study identifies differences in gender role and gray matter volumes between OC-users and naturally cycling women.

## DATA AVAILABILITY

The datasets generated for this study are available on request to the corresponding author.

## AUTHOR CONTRIBUTIONS

BP designed the study, analyzed the data, and wrote the manuscript.

## FUNDING

This study was in part funded by FWF-projects P28261 and DK W1233-G17.

## ACKNOWLEDGMENTS

The author thanks TiAnni Harris, Sara Fernandez, Matthias Schurz, and Manuel Schabus for their help with data acquisition and all participants for their time and willingness to contribute to this study.

## REFERENCES



volume and face recognition performance. Brain Res. 1596, 108–115. doi: 10.1016/j.brainres.2014.11.025


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Pletzer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Interaction of Sex and Age on the Dissociative Effects of Ketamine Action in Young Healthy Participants

B. Derntl<sup>1,2,3†</sup>, J. Hornung<sup>1†</sup>, Z. D. Sen<sup>1,4†</sup>, L. Colic<sup>5</sup>, M. Li<sup>6</sup> and M. Walter<sup>1,4,5,6,7,8</sup>\*

<sup>1</sup> Department of Psychiatry and Psychotherapy, Eberhard Karls University of Tübingen, Tübingen, Germany, <sup>2</sup> Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany, <sup>3</sup> LEAD Research School & Network, University of Tübingen, Tübingen, Germany, <sup>4</sup> Clinical Affective Neuroimaging Laboratory, Magdeburg, Germany, <sup>5</sup> Department for Behavioral Neurology, Leibniz Institute for Neurobiology, Magdeburg, Germany, <sup>6</sup> Max Planck Institute for Biological Cybernetics, Tübingen, Germany, <sup>7</sup> Center for Behavioral Brain Sciences, Magdeburg, Germany, <sup>8</sup> Department of Psychiatry and Psychotherapy, Otto von Guericke University Magdeburg, Magdeburg, Germany

Ketamine is a drug that reduces depressive symptoms and elicits schizophrenia-like symptoms in humans. However, it is largely unexplored whether women and men differ with respect to ketamine-action and whether age contributes to drug-effects. In this study we assessed dissociative symptoms via the Clinician Administered Dissociative States Scale (CADSS) in a total of 69 healthy subjects aged between 18 and 30 years (early adulthood) after ketamine or placebo infusion. Dissociative symptoms were generally increased only in the ketamine group post-infusion. Specifically, within the ketamine group, men reported significantly more depersonalization and amnestic symptoms than women. Furthermore, with rising age only men were less affected overall with respect to dissociative symptoms. This suggests a sex-specific protective effect of higher age, which may be due to delayed brain maturation in men compared to women. We conclude that it is crucial to include sex and age in studies of drug effects in general and of ketamine-action in particular to tailor more efficient psychiatric treatments.

Clinical Trial Registration: EU Clinical Trials Register (EudraCT), trial number: 2010-023414-31.

Keywords: ketamine, dissociation, depersonalization, sex, age, brain maturation

## INTRODUCTION

Ketamine is an N-methyl-D-aspartate (NMDA) receptor antagonist and has been shown to decrease depressive symptoms in humans (Murrough et al., 2013), even at low doses (Xu et al., 2016), leading to rapid-acting and long-lasting effects. Furthermore, ketamine leads to schizophrenia-like symptoms including positive and negative symptoms and has been used as a psychosis model in both human and animal studies for decades (Krystal et al., 1994; Adler et al., 1998; Newcommer et al., 1999). Additionally, acute ketamine administration induces transient dissociative symptoms, i.e., an experience of detachment from surroundings, body and time (Sleigh et al., 2014). Importantly, ketamine-induced dissociative symptoms, especially the degree of depersonalization, can predict the antidepressant response 24 h after ketamine infusion in major depression patients, whereas neither other acute psychotomimetic nor physiological effects can (Luckenbaugh et al., 2014; Niciu et al., 2018).

Despite the long history of ketamine's use in experimental and clinical medicine, only a few studies have addressed the question of whether modulatory factors like sex and age may contribute to the effects of ketamine. In animal studies sex-specific effects of repeated ketamine administration

#### Edited by:

Annie Duchesne, University of Northern British Columbia, Canada

#### Reviewed by:

Jennifer L. Vande Voort, Mayo Clinic, United States Siegfried Kasper, Medical University of Vienna, Austria

#### \*Correspondence:

M. Walter martin.walter@uni-tuebingen.de; martin.walter@med.ovgu.de

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 20 February 2019 Accepted: 29 May 2019 Published: 18 June 2019

#### Citation:

Derntl B, Hornung J, Sen ZD, Colic L, Li M and Walter M (2019) Interaction of Sex and Age on the Dissociative Effects of Ketamine Action in Young Healthy Participants. Front. Neurosci. 13:616. doi: 10.3389/fnins.2019.00616

have been shown, leading to antidepressant effects and enhanced hippocampal synapsin levels in male mice but increased depressive-like symptoms and attenuated glutamate and aspartate levels in female mice (Thelen et al., 2016). Other studies reported faster antidepressant effects in female but longer lasting effects in male mice (Franceschelli et al., 2015) and higher sensitivity of female rats to low doses of ketamine (Carrier and Kabbaj, 2013). Furthermore, juvenile males were reported to be less sensitive to antidepressant effects of ketamine in comparison to adult male rats (Parise et al., 2013). Regarding sex-specific effects of ketamine in humans, initial research provides evidence that after drug infusion men show a larger decline of verbal memory than women (Morgan et al., 2006). However, most of the previously published studies did not investigate an effect of sex or the interaction of sex and age on any of the ketamine-induced symptoms and physiological alterations, despite its great relevance for uncovering ketamine's therapeutic potential (Wright and Kabbaj, 2018).

Among many targets, ketamine is primarily an NMDA receptor antagonist, and its consequent enhancing effect on the function of another glutamatergic ionotropic receptor, the AMPA receptor, is well known (for review see Aleksandrova et al., 2017). The glutamatergic system displays prominent sex differences from the DNA level to physiological behaviors of neurons, potentially contributing to the well-known gap in prevalence rates, symptomatology and treatment success in women and men suffering from mental disorders (for review see Wickens et al., 2018). The little information we have from human studies indicates that women show higher levels of glutamate compared to men, in particular in the striatum and the cerebellum (Zahr et al., 2013) as well as the sensorimotor cortex and the anterior cingulate cortex (Grachev and Apkarian, 2000). Moreover, changes in cerebral glutamate levels across the lifespan have been predominantly reported in adult men, exhibiting a steep decline with age (Sailasuta et al., 2008). Interestingly, serum levels of glutamate increase with older age in adult women, while this is not observed in men (Kouchiwa et al., 2012).

Sex and age effects have been more extensively explored in animal studies, critically pointing to the differential reorganization of the glutamatergic system in the developmental period in females and males, especially in the prefrontal cortex (PFC; Spear, 2000). The NR2 subunit of the NMDA receptor displays a developmental switch, as the NR2B subunits in the PFC play important roles in regulating the maturation of PFC circuits in the transition phase between puberty and early adulthood (Flores-Barrera et al., 2014). Given that maturation of the PFC is not completed until the mid-twenties due to gradual synaptic pruning throughout adolescence and early adulthood (Tsujimoto, 2008; Elston et al., 2010; Kolb et al., 2012), these developmental stages can be called critical periods (Bale and Epperson, 2017). These critical periods also correspond to the developmental stages in which female and male brains become more and more distinct from each other at many levels (Bale and Epperson, 2017). Men and women differ with respect to brain maturation, leading to a 1–2 years earlier peak of gray maturation (Lenroot et al., 2007) as well as reduced cortical gray matter loss during adolescence/early adulthood in women (Sandu et al., 2014). Among other brain regions like the amygdala, hippocampus or hypothalamus, orbital and medial PFC show sexual dimorphisms (Goldstein et al., 2001) and differing maturation processes, e.g., gray matter in frontal cortices becomes thinner earlier in females than males (McEwen and Morrison, 2013). Interestingly, Deakin et al. (2008) showed that dissociative effects of ketamine are associated with activity in ventromedial regions of the PFC.

In view of these findings, we expected sex differences with respect to dissociative symptoms after a single ketamine infusion in women and men during early adulthood, when brain maturation is still ongoing. This study was designed to compare effects of a ketamine infusion in healthy young women and men using the Clinician-Administered Dissociative States Scale (CADSS) assessing dissociative symptoms, i.e., derealization, depersonalization, and amnestic effects. To do so, we matched women and men for age and restricted the age range to 18 to 30 years.

## MATERIALS AND METHODS

## Participants

The study was part of a randomized, double-blind, placebo-controlled trial (EudraCT number: 2010-023414-31). Participants were recruited by public advertisement. Participants were screened for MR compatibility and completed an extensive medical examination to assure healthy physical status. The German version of the Mini-International Neuropsychiatric Interview (MINI; Sheehan et al., 1998) was used to exclude DSM–IV psychiatric disorders. Participants were additionally screened for general psychiatric (BPRS; Overall and Gorham, 1960), depressive (HAM-D; Hamilton, 1960) and anxiety-related symptoms (HAM-A; Hamilton, 1959). Moreover, participants were free of current substance use or abuse (excluding smoking) and did not take any medication (excluding contraception pills). Seventeen subjects were excluded during the screening process. The age range was set to 18–30 years. Thus, 29 healthy female and 40 male participants were recruited and randomly assigned to receive either a racemic ketamine or a placebo (saline) infusion. Study investigators, research coordinators, attending care teams and subjects were blind to treatment allocation. Fourteen women (mean age = 23.43 years; SD = 2.47) and 21 men (mean age = 24.57; SD = 2.51) received ketamine, whereas 15 women (mean age = 24.33 years; SD = 2.66) and 19 men (mean age = 24.00 years; SD = 1.97) received a placebo infusion. The Ethics Committee of the Medical Faculty of the University of Magdeburg approved the experimental protocol of the study and the study was conducted in accordance with the Declaration of Helsinki (World Medical Association, 2002). Participants provided written informed consent prior to participation and received financial compensation for their participation.

## Procedure

First, participants completed a baseline assessment of the CADSS (Bremner et al., 1998), which assesses dissociative symptoms divided into depersonalization, derealization, and amnestic symptoms. Afterward, 50 ml of either 0.9% saline (NaCl 0.9%; Berlin-Chemie AG, Berlin, Germany) or 0.5 mg/kg body weight of ketamine +/− racemate (Ketamine-ratiopharm<sup>®</sup> 500 mg/10 ml; Ratiopharm GmbH, Ulm, Germany) were infused continuously over 40 min via an infusion pump (Injectomat 2000; Fresenius Kabi GmbH, Langenhagen, Germany). Immediately after the end of infusion, participants completed the CADSS a second time and again 20–40 min after the end of infusion. During infusion, participants were monitored for cardiovascular response every 5 min, and again 20 as well as 60 min after infusion (Liebe et al., 2017). To ensure the safety of participants, they were additionally asked about their general condition after the end of infusion.
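The dosing scheme described above can be restated as simple arithmetic. The following is an illustrative sketch only: the function name and the example body weight of 70 kg are hypothetical, and it assumes that the ketamine dose was diluted to the same 50 ml volume as the placebo. It shows that the pump rate is identical for every participant, whereas the total dose scales with body weight.

```python
def infusion_parameters(weight_kg, dose_mg_per_kg=0.5,
                        volume_ml=50.0, duration_min=40.0):
    """Restate the infusion protocol as numbers (illustrative only).

    Assumption: the weight-adjusted ketamine dose is diluted to the
    full 50 ml that is infused over 40 min.
    """
    total_dose_mg = dose_mg_per_kg * weight_kg          # scales with weight
    pump_rate_ml_per_h = volume_ml / duration_min * 60  # same for everyone
    concentration_mg_per_ml = total_dose_mg / volume_ml
    return {
        "total_dose_mg": total_dose_mg,
        "pump_rate_ml_per_h": pump_rate_ml_per_h,
        "concentration_mg_per_ml": concentration_mg_per_ml,
    }


# Hypothetical 70 kg participant (not a value from the study).
example = infusion_parameters(70.0)
print(example)
```

Because heavier participants receive a larger absolute dose under this scheme, body weight is a natural candidate covariate in the analyses of post-infusion scores.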

## Statistical Analysis

Statistical analyses were conducted via SPSS 23 (IBM). First, independent samples t- or U-tests compared the placebo and ketamine group with respect to demographical variables. Further, women and men in the ketamine group were also compared for the same variables.

To assess whether placebo and ketamine group differed for baseline or post-infusion dissociative symptoms, a 2 × 2 independent samples analysis of variance (ANOVA) was conducted including the within-subject-factor Time (baseline, post-infusion) and the between-subject-factor Treatment (placebo, ketamine).

To investigate dissociative symptoms in the ketamine group, a multivariate analysis of covariance (MANCOVA) with sex (women, men) as fixed factor and the three CADSS-subscales (scores after ketamine infusion) as dependent variables was conducted to assess sex differences across all subscales. Test statistics are reported according to Pillai's Trace. Furthermore, a univariate ANCOVA was computed to assess sex differences for the total score. To adjust for potential differences in the total dose depending on body weight, this variable was added as a covariate to both tests. In each case, the statistical threshold was set to α = 0.05. P-values between 0.05 and 0.09 were labeled trend-significant.

## Correlation Analyses

To assess whether age was associated with symptom manifestation for the total score or one of the three subscales within the ketamine group, partial correlations controlling for weight were conducted separately for women and men. To address the question whether correlation coefficients were different for women and men, Fisher's z tests were computed. Differences in correlation scores were corrected for multiple comparisons, with an effective threshold of p < 0.0125.
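A comparison of two independent correlation coefficients via Fisher's z transformation can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function name and the example correlations are hypothetical, the reduction of the degrees of freedom by the number of partialled-out covariates is a common convention that is assumed here, and reading the threshold of 0.0125 as a Bonferroni correction over four tests (0.05 / 4) is likewise an assumption.

```python
import math


def fisher_z_difference(r_a, n_a, r_b, n_b, n_covariates=0):
    """Two-sided test for the difference of two independent correlations.

    For partial correlations, each group's effective sample size is
    reduced by the number of covariates (assumed convention).
    """
    z_a = math.atanh(r_a)  # Fisher r-to-z transformation
    z_b = math.atanh(r_b)
    se = math.sqrt(1.0 / (n_a - 3 - n_covariates)
                   + 1.0 / (n_b - 3 - n_covariates))
    z = (z_a - z_b) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


# Assumed Bonferroni threshold: total score plus three subscales.
ALPHA_CORRECTED = 0.05 / 4

# Group sizes are those of the ketamine group (21 men, 14 women);
# the correlation values are hypothetical placeholders.
z, p = fisher_z_difference(-0.60, 21, 0.30, 14, n_covariates=1)
print(f"z = {z:.2f}, p = {p:.4f}, significant: {p < ALPHA_CORRECTED}")
```

With groups this small, the standard error of the difference is large, so only substantial differences between the two correlations reach the corrected threshold.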

TABLE 1 | Demographic details of the ketamine and placebo group for female and male participants.


The two groups did not differ in age or psychiatric symptoms. Only in the ketamine group did general CADSS scores show an increase post-infusion that differed between men and women. Variables are presented as mean (SD). Hamilton Anxiety Rating Scale (HAM-A); Hamilton Depression Scale (HAM-D), Clinician Administered Dissociative Symptoms Scale (CADSS), F (females), M (males).


## RESULTS

## Demographics


Independent samples t-tests confirmed that the placebo and ketamine group did not differ significantly in demographic variables or psychiatric symptoms. Also, women and men in both groups did not differ in any parameter except weight (see **Table 1**).

## General Effects of Ketamine Infusion

First, a 2 × 2 ANOVA detected main effects of Treatment, F(1,67) = 39.65, p < 0.001, η<sup>2</sup> = 0.37, and Time, F(1,67) = 39.65, p < 0.001, η<sup>2</sup> = 0.37, and a significant interaction of Treatment × Time, F(1,67) = 39.65, p < 0.001, η<sup>2</sup> = 0.37, indicating enhanced dissociative symptoms only in the ketamine group immediately post-infusion.

## Sex-Specific Effects in the Ketamine Group Depending on CADSS Subscale

Second, sex- and subscale-specific effects of ketamine were investigated. The MANCOVA including weight as covariate showed a significant effect of sex [F(3,30) = 3.60, p = 0.025, η<sup>2</sup> = 0.27]. Looking at subscales individually, depersonalization [F(1,32) = 7.38, p = 0.011, η<sup>2</sup> = 0.19] and amnesia [F(1,32) = 5.09, p = 0.031, η<sup>2</sup> = 0.14] showed significantly higher scores in men, which was not the case for derealization [F(1,32) = 0.93, p = 0.34, η<sup>2</sup> = 0.03]. Univariate ANCOVA showed a marginal effect on the total score [F(1,32) = 2.99, p = 0.09, η<sup>2</sup> = 0.09] (see **Figure 1**).
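The reported univariate effect sizes can be related to their F statistics. The sketch below assumes that the reported η<sup>2</sup> values are partial eta squared (the effect size SPSS reports by default for these models), which the text does not state explicitly; the function name is hypothetical. Under that assumption, partial η<sup>2</sup> = F · df<sub>effect</sub> / (F · df<sub>effect</sub> + df<sub>error</sub>).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported eta squared) from the text.
REPORTED = [
    ("Treatment x Time (2 x 2 ANOVA)", 39.65, 1, 67, 0.37),
    ("Depersonalization", 7.38, 1, 32, 0.19),
    ("Amnesia", 5.09, 1, 32, 0.14),
    ("Derealization", 0.93, 1, 32, 0.03),
    ("CADSS total score", 2.99, 1, 32, 0.09),
]

for label, f_value, df1, df2, reported in REPORTED:
    computed = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: computed {computed:.3f}, reported {reported:.2f}")
```

For these univariate tests, the recomputed values round to the reported ones, which is consistent with the partial eta squared reading.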

## Sex- and Subscale-Specific Effects in Connection With Participant Age

Next, we assessed whether participant age was associated with symptom manifestation in CADSS total and sub-scores, separately for men and women. In men, significant negative correlations between age and CADSS scores were observed (for depersonalization on a trend-level), which was not the case for women (see **Figure 2** and **Table 2**).

To further validate whether men and women differed for any of the above correlations, Fisher's Z-tests were conducted. For derealization, z = −3.08, p = 0.002, and the CADSS total score, z = −2.99, p = 0.003, significant sex-differences were detected, indicating that in men symptom manifestation decreased more strongly with age than in women. Correlation coefficients of men and women did not differ significantly for depersonalization, z = −1.59, p = 0.11, and amnesia, z = −0.98, p = 0.33 (see **Table 2**).
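The reported p-values follow from the z statistics under a two-sided standard normal test. The short sketch below recomputes them and compares each against the corrected threshold of 0.0125 given in the methods; the variable and function names are illustrative.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# z statistics as reported in the text.
REPORTED_Z = {
    "derealization": -3.08,
    "CADSS total": -2.99,
    "depersonalization": -1.59,
    "amnesia": -0.98,
}
ALPHA_CORRECTED = 0.0125  # corrected threshold stated in the methods

for scale, z in REPORTED_Z.items():
    p = two_sided_p(z)
    print(f"{scale}: z = {z:.2f}, p = {p:.3f}, "
          f"below threshold: {p < ALPHA_CORRECTED}")
```

Only derealization and the total score fall below the corrected threshold, matching the pattern described in the text.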

## DISCUSSION

The present study investigated whether dissociative symptoms induced by the antidepressant drug ketamine differ as a function of sex and age. In general, a subanesthetic dose of ketamine led to profound dissociative symptoms affecting women and men, though men showed significantly stronger symptom manifestation regarding depersonalization and amnesia than women. Furthermore, taking into account our participants' age, in men dissociative symptoms in total and derealization in particular decreased with rising age, while this association was not observed in women.

TABLE 2 | Correlation analyses testing sex-specific associations between Age and CADSS scores immediately after infusion.


Women and men differed with respect to the CADSS total score and the subscale derealization. Clinician Administered Dissociative Symptoms Scale (CADSS). Data are presented as r coefficient (p-value). Bold values and <sup>∗</sup> indicate p < 0.05.

Surprisingly, the effects of sex and age on ketamine's actions have not been broadly examined in humans, although prevalence rates and symptomatology of mental disorders associated with the glutamate system and ketamine-action, e.g., depression, significantly differ between women and men (Whiteford et al., 2013; Strong and Kabbaj, 2018; Wickens et al., 2018; Wright and Kabbaj, 2018). Although animal studies point to a variation in sensitivity to antidepressant and addictive effects of ketamine depending on age and sex (Carrier and Kabbaj, 2013; Parise et al., 2013; Franceschelli et al., 2015; Zanos et al., 2016; Schoepfer et al., 2017; Strong et al., 2017), human studies rarely report sex or age effects (Cho et al., 2005; Niciu et al., 2014). In humans, the reported differences between women and men focused on metabolites and hepatic clearance of ketamine (Saland et al., 2017), biomarkers (Colic et al., 2019) or side effects (Liebe et al., 2017). Sigtermans et al. (2009) reported that S-Ketamine is metabolized faster in female subjects and that the effect of ketamine is greater on cardiac output and heat pain. Another study, which used racemic ketamine as in the current study, indicated a sex-specific metabolism of ketamine in depressed and bipolar patients (Zarate et al., 2012). Additionally, a previous meta-analysis reported a significant association between effect sizes of ketamine response at later time points, i.e., 7 days post-infusion, and percentage of males, but the number of included studies that contributed data was quite low (Coyle and Laws, 2015). Reviewing the relevant literature, Wright and Kabbaj (2018) stressed that most clinical studies lack information about sensitivity to the effects of ketamine because generally one particular dose of ketamine is administered instead of a dose-response regime as in animal studies. Indeed, Morgan et al. (2006) showed that men are more sensitive to verbal and subjective memory disturbances induced by intravenous ketamine infusions, with doses ranging between 0.5 and 1.3 mg/kg. Likewise, in the current study male participants reported higher subjective memory disturbances measured by CADSS, supporting earlier findings by Morgan et al. The single dose regime that was applied in the current study might have hindered detection of a clear sex effect in sensitivity to the amnesic effect of ketamine. More studies using a wider range of doses would be beneficial to understand the role of both sex and age in the effects of ketamine.

Concerning age effects, dissociative effects of ketamine were negatively associated with age only in male participants. Early adulthood is a critical development stage that engenders vulnerability for a variety of mental disorders for women and men (Paksarian et al., 2016). Plenty of previously reported findings indicate that the effects of ketamine show variation across participants according to the basal status of associated circuits (Lahti et al., 2001; Corlett et al., 2006; Morgan et al., 2008). Regarding clinical populations, reports on geriatric patients are scarce but seem to be similar to generally observed effects (although see Szymkowicz et al., 2014), but the samples were very small or case-control studies (Iglewicz et al., 2015; George et al., 2017; Medeiros da Frota Ribeiro and Riva-Posse, 2017). Regarding depressed adolescents and young adults, studies investigating antidepressant effects of ketamine are virtually non-existent.

This study is limited in that we specifically focused on young adults and thus included only data from participants younger than 30 years. To fully test our assumption of an age-specific decline of dissociative symptoms in men, future studies should include a broader age range, informing about age- and sex-specific effects across different developmental stages (e.g., from puberty to menopause and beyond), as changes across the lifespan have been reported for cerebral glutamate levels in men (Sailasuta et al., 2008) and serum levels in women (Kouchiwa et al., 2012). Moreover, a modulatory role of estradiol has been shown in animal studies addressing glutamate transmission (Smejkalova and Woolley, 2010). Regarding ketamine, sex differences in ketamine pharmacokinetics in rats have been reported; however, the impact of circulating hormone levels was negligible (Saland and Kabbaj, 2018). Another limitation is the lack of measurements of ketamine and its active metabolites in the blood. Evidence indicates that the metabolism of both racemic and S-ketamine differs between men and women (Sigtermans et al., 2009; Zarate et al., 2012). A common limitation of placebo-controlled ketamine studies is the reliability of blinding. Ketamine induces symptoms that are evident mostly to the participants and also to the involved scientists. For this reason, studies are conducted with active placebos like midazolam (Wilkinson et al., 2019), which come with their own caveats.

In summary, male participants in our study reported stronger depersonalization and amnestic symptoms following ketamine infusion. Interestingly, this effect was modulated by age, i.e., the younger the participant, the stronger the symptoms. Thus, our findings suggest a sex-specific protective effect of age, which may be due to progressed brain maturation in women compared to men. We conclude that it is crucial to include sex and age in studies of drug effects in general and of ketamine action in particular in order to tailor more efficient psychiatric treatment strategies.

## DATA AVAILABILITY

The datasets generated for this study are available on request to the corresponding author.

## ETHICS STATEMENT

Participants provided written informed consent prior to participation, received financial compensation for their participation, and the Ethics Committee of the Medical Faculty of the University of Magdeburg approved the experimental protocol of the study.

## AUTHOR CONTRIBUTIONS

BD, JH, and ZS analyzed parts of the data and wrote the first draft. LC and ML collected the data, contributed to the data analyses, and corrected the manuscript. MW designed the study, supervised the data collection and analyses, and corrected the manuscript.

## FUNDING

BD was supported by the DFG (DE2319/6-1) and the CIN (EXC307). This study was supported by the German Research Foundation (SFB 779/A06 for MW and DFG WA 2673/4-1 for MW), the Center for Behavioral Brain Sciences (CBBS NN05 for MW), and Leibniz Association (Pakt für Forschung und Innovation for MW). LC received a scholarship from the German Research Foundation (SFB 779, 2013–2016).

## ACKNOWLEDGMENTS

We thank Dr. Claus Tempelmann and Renate Blobel (Department of Neurology) for data acquisition; Dr. Melanie Weigel (Department of Ophthalmology) and Dr. Conrad Friedrich Genz (Department of Cardiology) for clinical screening; and Dr. Julia Noack and Linda Frolik Endrulat (Clinical Affective Neuroimaging Laboratory) for trial management. We would also like to acknowledge the help of all the participants in this study. The article processing charge was funded by the DFG and the Eberhard Karls University Tübingen in the funding program Open Access Publishing.

## REFERENCES


**Conflict of Interest Statement:** MW received research support from HEEL and Janssen Pharmaceutical Research for a clinical trial on ketamine in patients with major depression which were not investigated in this manuscript. Other authors declare no conflict of interest.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Derntl, Hornung, Sen, Colic, Li and Walter. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Beyond Biological Sex: Interactive Effects of Gender Role and Sex Hormones on Spatial Abilities

Belinda Pletzer<sup>1,2</sup>\*, Julia Steinbeisser<sup>1</sup>, Lara van Laak<sup>1</sup> and TiAnni Harris<sup>1,2</sup>

<sup>1</sup> Department of Psychology, University of Salzburg, Salzburg, Austria, <sup>2</sup> Centre for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria

Sex differences in spatial abilities are well documented, even though their underlying causes are poorly understood. Some studies assume a biological basis of these differences and study the relationship of sex hormone levels to spatial abilities. Other studies assume social influences and study the relationship of gender role (masculinity/femininity) to spatial abilities. Contemporary theories postulate a psychobiosocial model of sex differences in spatial abilities, in which both biological (e.g., hormonal) and psychosocial (e.g., gender role) variables interactively modulate spatial abilities. However, few studies have addressed both aspects simultaneously. Accordingly, the present study explores potential interactive effects between gender role and sex hormones on spatial performance. Forty-one men and 41 women completed a mental rotation and a virtual navigation task. Sex hormone levels and gender role were assessed in all participants. Sex differences favoring men were observed in both tasks. Neither sex hormones nor gender role alone emerged as mediators of these sex differences. However, several interactive effects between gender role and sex hormones were identified. Combined effects of masculinity and testosterone were observed for those variables that displayed sex differences. Participants with both high masculinity and high testosterone showed the best performance. However, this association was further modulated by biological sex and progesterone levels. Furthermore, we observed an interactive effect of femininity, estradiol and testosterone on response times in both tasks. Consistent across both tasks and irrespective of biological sex, testosterone related to response times in participants with low estradiol levels, depending on their femininity. In participants with low femininity, testosterone was related to slower reaction times, while in participants with higher femininity, testosterone was related to faster reaction times.

Keywords: gender role, sex hormones, mental rotation, navigation, sex differences

#### Edited by:

Alfonso Abizaid, Carleton University, Canada

#### Reviewed by:

Donna Toufexis, The University of Vermont, United States

Melissa J. Glenn, Colby College, United States

#### \*Correspondence:

Belinda Pletzer Belinda.Pletzer@sbg.ac.at

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 15 February 2019 Accepted: 12 June 2019 Published: 09 July 2019

#### Citation:

Pletzer B, Steinbeisser J, van Laak L and Harris T (2019) Beyond Biological Sex: Interactive Effects of Gender Role and Sex Hormones on Spatial Abilities. Front. Neurosci. 13:675. doi: 10.3389/fnins.2019.00675

## INTRODUCTION

Sex differences have attracted considerable research interest over the past decade, but their underlying mechanisms have yet to be uncovered. Some researchers see sex differences in adults as the direct result of organizational or activational effects of sex hormones, i.e., effects of sex hormones on the brain that occur either during fetal development or later in life (e.g., Kelly et al., 1999). Other researchers see sex differences in adults as a result of socialization and experience (e.g., Nash, 1979; Eagly and Koenig, 2006). With increasing age, children are more and more exposed to societal views of what's acceptable or desirable for their biological sex (gender role/sex role). They develop their gender identity and a sense of how much they conform to these gender roles (e.g., Eagly and Koenig, 2006). The extent to which individuals conform to male gender roles is referred to as masculinity. The extent to which they conform to female gender roles is referred to as femininity.

Sex differences have been described for various domains including spatial, verbal and memory abilities (see Andreano and Cahill, 2009 for a review). While sex differences in some areas are more disputed than others, the predominant view is that some abilities are better developed in women, while other abilities are better developed in men (e.g., Halpern, 2000). The most robust sex differences have been described in the spatial domain, in which men on average outperform women (e.g., Levine et al., 2016). However, an important observation in sex difference research is the task specificity of these differences, since different, seemingly similar tasks may involve a variety of different cognitive processes (Andreano and Cahill, 2009). Accordingly, the male superiority in spatial tasks is by no means universal. For instance, women outperform men in tasks of object location memory (see Voyer et al., 2007 for a meta-analysis). However, men appear to have a robust advantage in tasks of spatial visualization, like mental rotation tasks and navigation tasks (Andreano and Cahill, 2009). Sex differences in mental rotation and navigation have been described across different cultures (Silverman and Eals, 1992) and emerge with moderate to large effect sizes (0.5–1.3) in meta-analyses (e.g., Linn and Petersen, 1985; Voyer et al., 1995).

Accordingly, most research on the question of whether sex differences are of biological or societal origin has focused on sex differences in these tasks (compare Levine et al., 2016). An important question in that regard concerns the development of sex differences in spatial visualization (Linn and Petersen, 1985; Levine et al., 2016). If sex differences in tasks of spatial rotation or navigation are already observed in early childhood, they may be the result of organizational effects of sex hormones. If they arise with the onset of puberty, they may be the result of activational effects of sex hormones. If they emerge at school age or after puberty, this may provide stronger support for societal influences. While children are first introduced to societal gender roles at home, the extent to which gender roles are enforced by parents varies greatly depending on their own views. With school age, however, gender role expectations are enforced by classmates and teachers, leading to a stronger and more homogeneous exposure to societal gender roles at that age. However, a recent study places the onset of strongest enforcement of gender roles around the age of ten, linking it to an increased concern about girls' sexuality and safety during puberty (Blum et al., 2017). Thus, endocrinological and social aspects of puberty are invariably confounded, making it hard to disentangle activational effects of sex hormones and societal influences during that time period.

Nevertheless, a variety of studies have focused on sex differences in spatial visualization in children and adolescents of various age groups. However, different studies arrive at different conclusions. While some studies do indeed report sex differences in mental rotation already in infants (Moore and Johnson, 2008, 2011; Quinn and Liben, 2008; Lauer et al., 2015) and preschoolers (Levine et al., 1999; Frick et al., 2013a), other studies are unable to find sex differences in spatial abilities in young children (Leplow et al., 2003; Frick et al., 2013b). Studies in children require the use of age-appropriate tasks, and the comparability of the tasks used in children to the tasks used in adults has been criticized (Andreano and Cahill, 2009; Levine et al., 2016). For instance, infant studies rely on different looking times between figures and their mirror images, and it is unclear whether such looking patterns reflect spatial abilities. Furthermore, mental rotation tasks used in children are mostly two-dimensional, while those used in adults are three-dimensional. Most studies locate the emergence of sex differences in spatial abilities around puberty (Voyer et al., 1995; Andreano and Cahill, 2009; Titze et al., 2010). However, Linn and Petersen (1985) argue in their meta-analytic review that sex differences in spatial abilities likely emerge later, around the age of 18, which they link to a stronger age-related increase in spatial abilities in boys compared to girls. Since this point is far past the onset of puberty, the increase in boys' spatial abilities cannot be explained by activational effects of sex hormones. It is, however, possible that young adults experience an increased exposure to gender role expectations with the transition to independence at the age of 18. This is usually the time during which they choose a career and, in many countries, boys undergo military training. These factors may not only contribute to enforcing the societal views of what's male or female, but also lead to increased training of young men in spatial abilities. Numerous studies indicate that spatial skills are highly trainable and that training reduces sex differences in spatial tasks (see Levine et al., 2016 for a review).

Furthermore, findings of hormone-related changes in cognitive functions, including spatial functions, along the female menstrual cycle (e.g., Hausmann et al., 2000), during pregnancy (Workman et al., 2012) or menopause (e.g., Halbreich et al., 1995), suggest that sex hormones continuously reshape our brain throughout our adult life. Accordingly, pinpointing the onset of sex differences in cognitive abilities to a certain age may not be sufficient. If sex differences in adults are the result of different hormonal milieus between men and women, the actual level of circulating sex hormones at the time of testing should modulate sex differences in spatial abilities. However, studies relating circulating testosterone levels to spatial abilities in adults arrive at mixed results. While some studies observe linear or u-shaped relationships in men or women (e.g., Hooven et al., 2004; Driscoll et al., 2005; Hausmann et al., 2009; Courvoisier et al., 2013), other studies find no relationship of circulating testosterone levels to mental rotation performance (Halari et al., 2005; Falter et al., 2006; Puts et al., 2010). It is possible that these conflicting results arise from complex interactions between organizational and activational effects of sex hormones, i.e., activational effects of hormones may differ in differently organized neural structures. For instance, it was recently observed that testosterone relates to hippocampal volumes in women, but not in men (Pletzer, 2019), and findings of testosterone showing different relationships to spatial abilities in men and women are not uncommon (Hooven et al., 2004; Driscoll et al., 2005; Hausmann et al., 2009; Courvoisier et al., 2013). However, not only biological sex, but also interactive effects between different sex hormones may play a role. Testosterone is converted to estradiol via the enzyme aromatase and into the physiologically more active dihydro-testosterone via the enzyme 5α-reductase. The latter enzyme does, however, show a higher affinity to progesterone, such that in the presence of high progesterone levels, less testosterone gets converted into dihydro-testosterone. Accordingly, testosterone effects may be attenuated in the presence of high progesterone.

Furthermore, if sex differences in adults are indeed the result of socialization by which individuals learn to adapt their behavior according to the societal views of what's typical for a certain gender, the extent to which individuals have incorporated these roles into their self-image should explain sex differences in spatial abilities. Indeed, masculinity was found to relate positively to mental rotation performance in a recent meta-analysis (Reilly and Neumann, 2013), while femininity showed no such association. However, the majority of studies included in this meta-analysis used the Bem Sex Role Inventory (BSRI; Bem, 1974) to assess gender role. This measure has, however, been criticized for its poor factorial validity, for the exclusion of relevant dimensions of gender role, such as activities and interests, and because the item pool, collected in the 1970s, appears outdated with respect to a more modern understanding of gender roles (e.g., Choi and Fuqua, 2003).

Contemporary theories assume that sex differences develop according to a psychobiosocial model, i.e., as a result of interactions between biological, e.g., hormonal, influences, societal influences and individual characteristics (e.g., Levine et al., 2016). Individuals differ in the extent to which they conform to societal expectations not only because of the way they are brought up, but also because of their personality. Furthermore, personality factors have also been known to affect how susceptible subjects are to fluctuations in their hormonal milieu (e.g., Gingnell et al., 2010; Stenbæk et al., 2019). Particularly higher scores of neuroticism have been related to a higher susceptibility to hormonal fluctuations in women, albeit this has mostly been studied with respect to mood changes. It is unclear whether the personality trait neuroticism reflects a certain brain organization that responds more strongly to hormonal changes or whether vice versa concurrent mood changes in response to hormonal fluctuations are perceived as more neurotic by participants. Nevertheless, these findings suggest that like activational effects of sex hormones, social influences act differently on differentially organized neural structures. In line with this assumption, it was recently observed that femininity relates to frontal gray matter volumes in men, but not in women (Pletzer, 2019). Particularly since conceptually there is some overlap between scales assessing neuroticism and scales assessing femininity, it is also plausible that social influences (e.g., gender role) amplify or diminish hormonal influences (both organizational and activational) on behavior.

However, only a few studies have addressed both biological and psychosocial factors in the same study. To the best of our knowledge, only one study has done so with respect to spatial abilities (Hausmann et al., 2009). That study found interactive effects of sex hormones and stereotypes on spatial performance in the sense that testosterone mediated the effects of gender stereotypes on spatial abilities. In the present study, we address whether sex differences in mental rotation and spatial navigation are mediated via masculinity, femininity or sex hormones. We hypothesize that particularly testosterone levels and masculinity relate positively to spatial abilities and act as mediators for sex differences therein. In an integrative approach, we additionally seek to identify interactive effects of biological sex, gender role, and sex hormones on spatial abilities. More specifically, we expect the best spatial abilities in participants with both high masculinity and high testosterone levels, i.e., we expect masculinity to facilitate testosterone actions on spatial abilities. Furthermore, we expect stronger testosterone effects in participants with low progesterone levels.

## MATERIALS AND METHODS

## Participants and Procedure

A total of 41 healthy young men and 45 healthy young women were recruited for the present study. All participants were between 18 and 35 years of age, had passed the general qualification for university entrance and had no psychiatric, neurological or endocrinological disorders. Women did not take hormonal contraceptives, had no current or prior diagnosis of premenstrual dysphoric disorder and had a regular menstrual cycle of 21–35 days with no more than 7 days of variation between individual cycles (Fehring et al., 2006). Cycle length and cycle regularity were established from participants' self-reports of their last three onsets of menses. Test sessions for women were scheduled in the mid-luteal cycle phase, since some previous studies suggest that women behave most "female-like" during this cycle phase and the largest sex differences in spatial abilities have been reported for this phase (e.g., Hampson, 1990; Hausmann et al., 2000). The mid-luteal cycle phase spanned from 3 days post-ovulation up to 3 days before the expected onset of the participant's next menses. Ovulation was calculated by subtracting 14 days from the expected onset of next menses according to the participant's last onset of menses and cycle length as based on the past three cycles. Onset of next menses was evaluated in follow-up.
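The scheduling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the expected cycle length is taken as the rounded mean of the intervals between the reported onsets, and the example dates are invented. The 21–35 day range, the 7-day variation limit, the 14-day subtraction and the 3-day margins come from the text.

```python
from datetime import date, timedelta


def cycle_lengths(onsets):
    """Days between consecutive onsets of menses."""
    ordered = sorted(onsets)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def is_regular(lengths, min_len=21, max_len=35, max_variation=7):
    """Regularity criterion from the text: 21-35 days, at most 7 days of variation."""
    in_range = all(min_len <= n <= max_len for n in lengths)
    return in_range and (max(lengths) - min(lengths) <= max_variation)


def mid_luteal_window(onsets, luteal_days=14, margin_days=3):
    """Estimate ovulation and the mid-luteal test window from reported onsets."""
    lengths = cycle_lengths(onsets)
    # Assumption: expected cycle length = rounded mean of the observed intervals.
    mean_length = round(sum(lengths) / len(lengths))
    expected_next = max(onsets) + timedelta(days=mean_length)
    ovulation = expected_next - timedelta(days=luteal_days)
    return {
        "expected_next_onset": expected_next,
        "estimated_ovulation": ovulation,
        "window_start": ovulation + timedelta(days=margin_days),
        "window_end": expected_next - timedelta(days=margin_days),
    }


# Invented example: three reported onsets, 28 days apart.
onsets = [date(2019, 1, 1), date(2019, 1, 29), date(2019, 2, 26)]
window = mid_luteal_window(onsets)
print(window["window_start"], window["window_end"])
```

With a 28-day cycle the window spans only about nine days, which is one reason the actual onset of the next menses is checked in follow-up.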

Upon arrival at the lab, participants were asked to rinse their mouth, signed the written informed consent for the study and completed a general health-related screening questionnaire. They then gave the first saliva sample. Afterward, they completed the computerized mental rotation task (MRT). After the MRT, participants gave their second saliva sample. Then they completed the virtual navigation task (VNT). Upon completion of the navigation task, participants gave a third saliva sample and completed questionnaires regarding their videogaming experience, the gender-related attributes questionnaire, the masculinity and femininity self-report scales, as well as the screening version of Raven's Advanced Progressive Matrices (APM; Raven et al., 1962) to obtain an estimate of IQ. Gender role scales were scheduled after the spatial tasks and after the last saliva sample in order to avoid any stereotype-threat-like influences from briefing participants about gender role. After debriefing, participants received either course credits or monetary remuneration.

## Ethics Statement

The study was approved by the University of Salzburg's ethics committee and conforms to the Code of Ethics of the World Medical Association (Declaration of Helsinki). Informed written consent was obtained from all participants.

## Assessment of Spatial Abilities

### Mental Rotation Task (MRT)

For the mental rotation task, 30 items were selected from the Ganis and Kievit (2015) stimulus library. Participants were presented with two three-dimensional figures. Their task was to decide whether the two figures were the same, but rotated, or whether the two figures were different, as fast and accurately as possible within a pre-specified time limit of 7 s. Fifteen items required a "same" response (left mouse button), 15 items a "different" response (right mouse button). Same figures were rotated by 50° (5 items), 100° (5 items) or 150° (5 items). Different figures were mirror images of the same figures and rotated by the same degree. The order of stimuli was randomized for each participant. Reaction time (RT) and accuracy were recorded for each item.
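The item design described above (two response types, three rotation angles, five items per cell, order randomized per participant) can be laid out as a small Python sketch. The field names and the use of a per-participant seed are assumptions for illustration. The stimuli themselves come from the Ganis and Kievit (2015) library and are not reproduced here.

```python
import random

ANGLES = (50, 100, 150)   # rotation angles in degrees, from the text
ITEMS_PER_CELL = 5        # five items per angle and response type
TIME_LIMIT_S = 7.0        # response deadline in seconds, from the text


def build_mrt_trials(seed=None):
    """Return the 30 trials in a randomized order (hypothetical data layout)."""
    rng = random.Random(seed)  # assumption: one seed per participant
    trials = []
    for correct, button in (("same", "left"), ("different", "right")):
        for angle in ANGLES:
            for index in range(ITEMS_PER_CELL):
                trials.append(
                    {"correct": correct, "button": button, "angle": angle, "index": index}
                )
    rng.shuffle(trials)
    return trials


trials = build_mrt_trials(seed=1)
print(len(trials), "trials; first trial:", trials[0])
```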

### Virtual Navigation Task (VNT)

The navigation task used in the present study was a virtual reality (VR) adaptation of the task used in Harris et al. (2019). Ten items were selected from the task developed by Harris et al. (2019) using Unreal Engine 4.18.3 and presented to participants via an HTC Vive virtual reality system. Each item represented a new environment consisting of a 10 × 10 matrix with different landmarks placed on each field. Participants were given three lines of directions to a target location in the environment and their task was to reach the target location as quickly as possible. There was no pre-specified time limit to complete each item. Participants could only move on to the next item once they had found the target location. All directions used allocentric terms ("north," "south," "east," and "west") to guide participants through the environment, and participants learned which direction they were facing at the beginning of each item. Furthermore, half of the items used landmark terms to guide participants (e.g., "go to the tree"), while the other half used Euclidean terms (e.g., "go for four blocks"). For each item, the time participants needed to reach the target location (navigation time) was recorded.

## Assessment of Gender Role

Two measures were used to assess gender role: (i) self-ratings of masculinity and femininity, and (ii) the gender-related attributes questionnaire as an objective measure of personality traits, cognitive abilities and interests that are (stereo-)typically associated with men or women.

### Gender Role Self-Assessment

On a nine-point Likert scale, participants were asked to rate how masculine or feminine they perceived themselves to be. Since men tend to compare themselves to other men, while women tend to compare themselves to other women (Pletzer et al., 2015), each rating was performed three times: (i) in comparison to (other) men, (ii) in comparison to (other) women, and (iii) in comparison to the general population. The same scale was already employed by Pletzer et al. (2015). These ratings represent subjective measures of masculinity and femininity and depend on the participant's personal understanding of these concepts. As outlined by Pletzer et al. (2015), the concepts of masculinity and femininity vary between cultures and possibly also subcultures, e.g., depending on education or generation.

### Gender-Related Attributes Scale (GERAS)

As a more objective measure of masculinity and femininity, we recently developed the gender-related attributes scale (GERAS; Gruber et al., 2019). The GERAS assesses gender role via attributes that are typically perceived as masculine or feminine in middle European cultures. It extends previous sex role inventories (e.g., BSRI; Bem, 1974) by including not only personality traits, but also cognitive abilities and interests typically associated with the male or female gender, and thus spans multiple aspects of gender roles. Accordingly, it consists of three subscales: (i) a personality subscale with 20 items (10 masculine and 10 feminine), (ii) a cognitions subscale with 14 items (7 masculine and 7 feminine), and (iii) an interests subscale with 16 items (8 masculine and 8 feminine). In the personality subscale, participants are asked to rate how often, in their opinion, positive and negative traits typically associated with the male (e.g., dominant and bold) or female (e.g., warm-hearted and sensitive) gender apply to them. In the cognitions subscale, participants are asked to rate how well they think they are able to perform certain tasks for which previous studies have demonstrated sex differences favoring men (e.g., find a way) or women (e.g., find the right words). In the interests subscale, participants are asked to rate how interested they would be in engaging in activities that are stereotypically preferred by men (e.g., boxing and drinking) or women (e.g., dancing and talking). All ratings are performed on a seven-point Likert scale. Separate masculinity and femininity scores can be obtained for each subscale by averaging the ratings for masculine and feminine items, respectively. Overall masculinity and femininity scores are obtained by averaging the masculinity and femininity scores of the three subscales. The GERAS has been well validated and shows excellent internal consistency and reliability (Gruber et al., 2019). In particular, masculinity and femininity scores obtained with the GERAS are highly correlated with participants' masculinity and femininity self-assessments (Gruber et al., 2019).
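The scoring rule just described (item ratings averaged within each subscale, then the three subscale scores averaged) can be sketched as follows. The dictionary layout and function name are hypothetical, and the ratings are invented. Only the item counts, the seven-point range and the two averaging steps are taken from the text.

```python
from statistics import mean

# Items per dimension (masculine or feminine) in each subscale, from the text.
ITEMS_PER_DIMENSION = {"personality": 10, "cognition": 7, "interests": 8}


def geras_scores(ratings):
    """Subscale and overall scores from seven-point ratings (hypothetical layout)."""
    scores = {}
    for dimension in ("masculine", "feminine"):
        subscale_means = {}
        for subscale, n_items in ITEMS_PER_DIMENSION.items():
            items = ratings[subscale][dimension]
            if len(items) != n_items:
                raise ValueError(f"{subscale}/{dimension}: expected {n_items} items")
            if any(not 1 <= r <= 7 for r in items):
                raise ValueError("ratings must lie on the seven-point scale")
            subscale_means[subscale] = mean(items)
        # Overall score: unweighted mean of the three subscale scores.
        scores[dimension] = {
            "subscales": subscale_means,
            "overall": mean(list(subscale_means.values())),
        }
    return scores


# Invented ratings for one participant.
example = {
    "personality": {"masculine": [6] * 10, "feminine": [2] * 10},
    "cognition": {"masculine": [4] * 7, "feminine": [5] * 7},
    "interests": {"masculine": [2] * 8, "feminine": [5] * 8},
}
result = geras_scores(example)
print(result["masculine"]["overall"], result["feminine"]["overall"])
```

Because the overall score averages subscale means rather than all items, each subscale contributes equally despite the different item counts.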

Accordingly, composite measures of masculinity and femininity were obtained by averaging GERAS and self-assessment scores after recoding self-assessment scores to a seven-point scale by collapsing the two extreme categories at both ends of the scale. The composite score not only reflects how much participants identify with predominant societal views of what's male and what's female, but also takes into account their self-perceived masculinity and femininity based on their own views of what's male and what's female. To address how much participants conform to the typical male vs. typical female dichotomy, a masculinity-to-femininity ratio was also calculated. The more typically male participants were, the higher the ratio; the more typically female, the lower.
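A short sketch may make the recoding and the composite concrete. It assumes that "collapsing the two extreme categories at both ends" maps ratings 1 and 2 to 1 and ratings 8 and 9 to 7, with the middle categories shifted down by one, and that the three self-ratings (relative to men, to women and to the general population) are averaged before being combined with the GERAS score. The text does not spell out either detail, and all function names are hypothetical.

```python
from statistics import mean


def recode_9_to_7(rating):
    """Collapse a nine-point rating to seven points (assumed: 1,2 -> 1 and 8,9 -> 7)."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must lie on the nine-point scale")
    return min(max(rating - 1, 1), 7)


def composite_score(geras_overall, self_ratings_9pt):
    """Average of the GERAS score and the mean recoded self-rating (assumed weighting)."""
    recoded = [recode_9_to_7(r) for r in self_ratings_9pt]
    return (geras_overall + mean(recoded)) / 2


def masculinity_to_femininity_ratio(masculinity, femininity):
    """Higher values indicate a more typically male profile."""
    return masculinity / femininity


# Invented example values.
masc = composite_score(5.0, [9, 9, 9])   # self-ratings recode to 7, 7, 7
fem = composite_score(3.0, [1, 2, 3])    # self-ratings recode to 1, 1, 2
print(masc, fem, masculinity_to_femininity_ratio(masc, fem))
```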

## Assessment of Sex Hormones

As outlined in the procedures, three saliva samples were obtained throughout the study: one at the beginning of the experiment, one after the MRT and one after the VNT. All saliva samples were acquired before masculinity and femininity were assessed in order to avoid any effects of priming participants about gender role on sex hormones. Saliva samples were immediately frozen at −20 °C after the experiment and centrifuged twice for 15 and 10 min at 3000 rpm, respectively. As recommended by the ELISA kit instructions, the three samples of each participant were pooled to account for fluctuations in hormone and saliva production. Estradiol, progesterone and testosterone were assessed from the pooled samples using DeMediTec salivary ELISA kits. While pooling samples has the advantage of providing a more stable measure of the average hormone concentrations throughout the experiment, hormonal variations in response to experimental manipulations are not taken into account. While effects of priming participants about gender roles were avoided by the order of measures (see procedure), it cannot be completely ruled out that the spatial tasks themselves elicited a hormonal response.

To reflect the activity of the enzyme aromatase, which converts testosterone to estradiol, an estradiol-to-testosterone ratio was calculated. Furthermore, testosterone is more physiologically active in its reduced form (dihydro-testosterone), produced by conversion via the enzyme 5α-reductase. Since progesterone has a higher affinity to that enzyme than testosterone, testosterone is less physiologically active in the presence of high progesterone levels (e.g., Sitruk-Ware, 2006). Accordingly, to assess testosterone's access to the enzyme 5α-reductase and obtain a measure of its physiological activity, a testosterone-to-progesterone ratio was calculated. Similarly, in women, estradiol actions are often counteracted by progesterone, possibly due to their opposite effects on a variety of neurotransmitter systems (Barth et al., 2015). Accordingly, to obtain a measure of free estrogenic activity, an estradiol-to-progesterone ratio was calculated.
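The three derived measures are plain quotients of the pooled concentrations. The sketch below assumes that all three hormones are expressed in the same unit and that ratios are formed from raw rather than transformed values. The class and function names are hypothetical and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HormoneLevels:
    """Pooled salivary concentrations (assumed to share one unit, e.g., pg/ml)."""
    estradiol: float
    progesterone: float
    testosterone: float


def hormone_ratios(levels):
    """Ratios described in the text; raw values are assumed, no transformation."""
    if min(levels.estradiol, levels.progesterone, levels.testosterone) <= 0:
        raise ValueError("concentrations must be positive")
    return {
        # Proxy for aromatase activity (testosterone -> estradiol).
        "estradiol_to_testosterone": levels.estradiol / levels.testosterone,
        # Proxy for testosterone's access to 5-alpha-reductase.
        "testosterone_to_progesterone": levels.testosterone / levels.progesterone,
        # Proxy for free estrogenic activity.
        "estradiol_to_progesterone": levels.estradiol / levels.progesterone,
    }


ratios = hormone_ratios(HormoneLevels(estradiol=2.0, progesterone=100.0, testosterone=50.0))
print(ratios)
```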

## Statistical Analyses

Statistical analyses were performed in SPSS 22 and R 3.5.1. As a manipulation check, gender role, sex hormones, and spatial ability scores were compared between men and women using independent samples t-tests. To identify potential candidates for mediation analyses, interrelations between gender role, sex hormones and spatial abilities were assessed using Pearson correlations. In addition, partial correlations controlling for biological sex were performed to assess which variables related to spatial abilities irrespective of biological sex. To assess whether gender role or sex hormones were able to explain the sex difference in spatial abilities, mediation analyses were performed using the mediate function of the mediation package (Tingley et al., 2017). To assess whether biological sex, gender role and sex hormones interactively modulated spatial abilities, multiple regression analyses were performed. Details are described in the results section. To illustrate the combined or interactive effects of multiple variables in 3D space (**Figures 1**, **3**, **4**), we used the gridfit function in MATLAB 2016.
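To make the idea of a partial correlation controlling for biological sex concrete, the following Python sketch uses the standard first-order formula r_xy.z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)). It is not the SPSS or R code used in the study; the function names and the toy data are assumptions chosen so that two variables correlate only because both depend on sex.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (not study data): both variables depend on sex plus unrelated noise
sex = [0, 0, 0, 0, 1, 1, 1, 1]
noise_a = [1, -1, 1, -1, 1, -1, 1, -1]
noise_b = [1, 1, -1, -1, 1, 1, -1, -1]
hormone = [2 * s + a for s, a in zip(sex, noise_a)]
score = [2 * s + b for s, b in zip(sex, noise_b)]

zero_order = pearson(hormone, score)          # inflated by the shared sex effect
controlled = partial_corr(hormone, score, sex)  # association beyond sex
```

In this constructed example the zero-order correlation is clearly positive, whereas the partial correlation vanishes, which is the pattern the text describes for variables that are related only through their confound with biological sex.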

## RESULTS

Four women had to be excluded because the onset of their next menses after the study was too early or too late and their progesterone values were below the acceptable range for the luteal cycle phase (<43 pg/ml; compare Harris et al., 2019). Hormone levels of one male participant were excluded from analysis as outliers, since they exceeded the group mean by more than three standard deviations. Accordingly, data of 40 men and 41 women were used for analysis. Women's average cycle length was 29 days (SD = 3 days). On average, they were tested on day 22 of their cycle (SD = 4 days). An additional 5 participants (1 man and 4 women) did not complete the VNT task due to severe motion sickness.
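The three-standard-deviation criterion mentioned above can be illustrated with a short Python sketch. The function name `flag_high_outliers`, the use of the sample standard deviation, and the example values are assumptions for illustration; the text does not specify how the criterion was computed.

```python
import statistics


def flag_high_outliers(values, k=3.0):
    """Return indices of values exceeding the group mean by more than k SDs."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [i for i, v in enumerate(values) if v - mean > k * sd]


# Made-up hormone values for one group (not study data)
testosterone = [10.0] * 19 + [100.0]
flagged = flag_high_outliers(testosterone)
```

Note that an extreme value also inflates the mean and standard deviation it is compared against, so in small groups this rule can fail to flag even very large values.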

## Sex Differences

**Table 1** summarizes the descriptive statistics and gender comparisons for demographic data, sex hormones, gender role and spatial abilities. Men and women did not differ in age and IQ. Furthermore, there were no differences in estradiol levels between men and women (compare **Table 1**). As expected, men showed higher testosterone levels and masculinity ratings than women, while women showed higher progesterone levels and femininity ratings than men. Regarding spatial abilities, sex differences were observed for MRT accuracy and VNT RT, but not for MRT RT.

## Relationship Between Gender Role, Sex Hormones, and Spatial Abilities

To identify potential mediators of sex differences in spatial abilities, Pearson correlations between gender role, sex hormones and their respective ratios and spatial abilities were calculated. **Table 2** summarizes zero-order correlations between gender role, sex hormones and spatial abilities below the diagonal. Masculinity and testosterone levels were highly positively interrelated and both related to mental rotation accuracy and navigation time, but not mental rotation RT. Progesterone was positively related to femininity and both related negatively to MRT accuracy, but not RT or navigation time. Estradiol was not related to gender role, MRT accuracy or navigation time, but was positively related to MRT RT.

To assess the interrelation between gender role and sex hormones and their relationship to spatial abilities irrespective of biological sex, partial correlations controlling for biological sex were performed, since gender role and sex hormones are inherently confounded with biological sex. Partial correlations are summarized above the diagonal in **Table 2**. After controlling for biological sex, the masculinity and femininity total scores were significantly negatively interrelated, while sex hormones were significantly positively interrelated. There was no significant association between gender role and sex hormones. Gender role and sex hormones were not related to MRT accuracy or navigation time. Testosterone and estradiol both related positively to MRT RT. MRT RT and navigation time were significantly positively interrelated. Separate analyses by biological sex confirmed that these correlations were observed in both men and women, although the associations of sex hormones to MRT RT were significant only in men (**Figure 1** and **Table 3**). Furthermore, a positive correlation between progesterone and MRT RT was observed only in men (**Figure 1**).

TABLE 1 | Descriptives and biological sex differences for demographic variables, gender role, sex hormones, and spatial abilities.

GERAQ, gender-related attributes questionnaire; MRT, mental rotation task; VNT, virtual navigation task; RT, reaction time; MD, mean difference; d, Cohen's d.

responded slower, and participants with higher masculinity responded faster.

## Mediation Analyses

To address whether gender role or any of the sex hormones mediated the sex differences in spatial performance, mediation analyses were performed. In all except two analyses, the direct effect of sex remained significant, while the indirect effect never reached significance (**Table 4**). The exceptions were masculinity and testosterone in the navigation task: when controlling for masculinity or testosterone, the direct effect of sex was no longer significant. However, the indirect effect also did not reach significance. Accordingly, neither gender roles nor sex hormones mediated the sex difference in MRT accuracy and navigation time.
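The distinction between direct and indirect effects can be sketched with the classical product-of-coefficients decomposition for linear models. This is a simplified stand-in, not the simulation-based estimation performed by the mediate function of the R mediation package, and it yields point estimates only (no significance tests). The function names and data below are hypothetical.

```python
def _cross(u, v):
    """Sum of products of deviations from the means (unnormalized covariance)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """Least-squares estimates of total, direct and indirect effects of x on y via m."""
    s_xx, s_mm, s_xm = _cross(x, x), _cross(m, m), _cross(x, m)
    s_xy, s_my = _cross(x, y), _cross(m, y)
    det = s_xx * s_mm - s_xm ** 2
    a = s_xm / s_xx                            # path x -> m
    b = (s_my * s_xx - s_xy * s_xm) / det      # path m -> y, controlling for x
    direct = (s_xy * s_mm - s_my * s_xm) / det  # path x -> y, controlling for m
    total = s_xy / s_xx                        # x -> y without the mediator
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Made-up data (not study data): sex coded 0/1, a candidate mediator, an outcome
sex = [0, 0, 0, 0, 1, 1, 1, 1]
mediator = [1, 2, 2, 3, 2, 3, 4, 4]
outcome = [5, 6, 7, 7, 8, 8, 9, 11]
paths = mediation_paths(sex, mediator, outcome)
```

For ordinary least squares the total effect equals the direct effect plus the indirect effect (a × b), so a mediator can only account for a sex difference to the extent that both the a and b paths are non-zero.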

## Beyond Biological Sex Differences: Interactive Effects of Gender Role and Sex Hormones

TABLE 2 | Pearson and partial correlations between gender role, sex hormones, and spatial abilities.

Pearson correlations are displayed below the diagonal in light gray. Partial correlations after controlling for biological sex are displayed above the diagonal in white. MRT, mental rotation task; VNT, virtual navigation task; RT, reaction time. <sup>∗</sup>p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001.

TABLE 3 | Pearson correlations between gender role, sex hormones, and spatial abilities for men and women.

Pearson correlations for men are displayed below the diagonal in light gray. Pearson correlations for women are displayed above the diagonal in white. MRT, mental rotation task; VNT, virtual navigation task; RT, reaction time. <sup>∗</sup>p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001.

TABLE 4 | Direct and mediated effects of sex on spatial abilities.

Sex differences were not mediated by demographic variables, gender role or sex hormones. MRT, mental rotation task; VNT, virtual navigation task; RT, reaction time; Masc, masculinity; fem, femininity; T, testosterone; E, estradiol; P, progesterone. <sup>∼</sup>p < 0.10, <sup>∗</sup>p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001.

To explore interactive effects of gender role and sex hormones on spatial abilities, we used a data-driven approach. For masculinity and femininity, respectively, models including all interactions with biological sex and sex hormones were defined, and non-significant interactions were backward eliminated from higher to lower orders, such that only significant interactions and their lower-order terms remained in the model beyond the main effects. **Table 5** summarizes the results for masculinity, while **Table 6** summarizes the results for femininity.
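The retention rule described above (main effects, significant interactions, and all lower-order terms contained in those interactions) can be illustrated with a small Python sketch. It covers only the bookkeeping of which terms stay in the model, not the model fitting or significance testing itself; the function name `retained_terms` and the variable labels are hypothetical.

```python
from itertools import combinations


def retained_terms(main_effects, significant_interactions):
    """List model terms kept after backward elimination under the hierarchy rule.

    All main effects are kept, together with every significant interaction
    and each lower-order interaction nested within it.
    """
    keep = {frozenset([m]) for m in main_effects}
    for term in significant_interactions:
        factors = sorted(set(term))
        for order in range(2, len(factors) + 1):
            for subset in combinations(factors, order):
                keep.add(frozenset(subset))
    labels = [":".join(sorted(t)) for t in keep]
    # Sort by interaction order first, then alphabetically
    return sorted(labels, key=lambda s: (s.count(":"), s))


# Hypothetical example: one significant three-way interaction
mains = ["sex", "masculinity", "testosterone", "estradiol", "progesterone"]
terms = retained_terms(mains, [("sex", "masculinity", "testosterone")])
```

With a single significant three-way interaction, the retained model therefore contains the five main effects, the three two-way interactions nested in it, and the three-way term itself.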

For mental rotation accuracy, a significant interaction between sex, masculinity and testosterone was observed (b = 1.00, SE<sup>b</sup> = 0.39, t = 2.55, p = 0.01). Women with both high masculinity and high testosterone levels had the highest mental rotation accuracy, while in men only a trend toward higher accuracy with higher testosterone levels was visible (**Figure 2**).

For mental rotation reaction times (MRT RT), no significant interaction between masculinity and testosterone was observed, but masculinity and testosterone remained significant predictors in the model. While masculinity was negatively related to mental rotation reaction times (b = −0.28, SE<sup>b</sup> = 0.13, t = −2.18, p = 0.03), testosterone was positively related to mental rotation reaction times (b = 0.39, SE<sup>b</sup> = 0.13, t = 3.06, p = 0.003). Irrespective of their biological sex, participants with higher masculinity but lower testosterone levels solved mental rotation items faster (**Figure 1**). Estradiol and progesterone did not survive as significant predictors in the model, highlighting testosterone as the hormone with the strongest effect on MRT RT.

For navigation times, the interaction between masculinity and testosterone was further qualified by progesterone (b = −0.74, SE<sup>b</sup> = 0.29, t = −2.51, p = 0.01). This interaction arose because testosterone was most negatively related to navigation times in participants with high progesterone levels and high masculinity (**Figure 3**). In participants with low progesterone levels and high masculinity, the opposite pattern was observed, i.e., a positive association between testosterone and navigation times. Accordingly, at low progesterone levels, testosterone improved navigation performance for participants with low masculinity, but impaired navigation performance for participants with high masculinity.

There was no interaction between femininity and sex hormones in the prediction of mental rotation accuracy. However, for both mental rotation reaction times and navigation times, significant three-way interactions between femininity, testosterone and estradiol were observed (MRT: b = 0.68, SE<sup>b</sup> = 0.22, t = 3.07, p = 0.003; VNT: b = 0.77, SE<sup>b</sup> = 0.23, t = 3.33, p = 0.001). These three-way interactions were accompanied by a two-way interaction between femininity and testosterone and a main effect of estradiol in the case of the MRT. For the MRT, but not for the VNT, estradiol was related to longer (slower) RT. The interactions are plotted in **Figure 4**. In both tasks, associations between testosterone and reaction time were observed for participants with low estradiol levels, depending on their femininity. In participants with low femininity, testosterone showed a positive relationship to RT, i.e., the higher the testosterone levels, the slower the reactions. In participants with high femininity, testosterone showed a negative relationship to RT, i.e., the higher the testosterone levels, the faster the reactions. In addition, for navigation times the same interaction was observed for progesterone, i.e., interactive effects of femininity and testosterone on navigation times in participants with low progesterone levels.

TABLE 5 | Results of exploratory multiple regression models including interactions between biological sex, masculinity, and sex hormones.

MRT, mental rotation task; VNT, virtual navigation task; RT, reaction time. <sup>∼</sup>p < 0.10, <sup>∗</sup>p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001.

TABLE 6 | Results of exploratory multiple regression models including interactions between biological sex, femininity, and sex hormones.

MRT, mental rotation task; VNT, virtual navigation task; RT, reaction time. <sup>∼</sup>p < 0.10, <sup>∗</sup>p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001.

FIGURE 2 | Relationship of masculinity and testosterone to mental rotation accuracy (MRT Accuracy). A combined effect of masculinity and testosterone was identified in women. Women with both high masculinity and high testosterone levels had the highest MRT accuracy. In men, a small but non-significant positive association to testosterone was observed, but no effect of masculinity, probably due to a ceiling effect. Data were smoothed in 3D space using the MATLAB function gridfit.

high progesterone levels, high masculinity and high testosterone levels related to the fastest navigation times. In participants with low progesterone levels, the opposite pattern was observed. Data were smoothed in 3D space using the MATLAB function gridfit.

## DISCUSSION

The present study set out to investigate whether gender role or sex hormones mediate sex differences in spatial performance. In addition, we sought to explore potential interactive effects between gender role and sex hormones on spatial performance. We hypothesized that spatial abilities would be highest in participants with both high masculinity and high testosterone levels. Furthermore, we hypothesized stronger testosterone influences in participants with lower progesterone levels.

We found that neither gender role nor sex hormones or their ratios alone explained sex differences in spatial tasks. This was observed in both the mental rotation and the virtual navigation task, even though performance in both tasks was unrelated. This is in line with results of a previous study, demonstrating no effect of sex hormones on a similar navigation task (Harris et al., 2019). While sex differences in the mental rotation task emerged in accuracy, but not reaction times, sex differences in the navigation task emerged in response times. These differences may be due to the fact that a time limit was posed for the mental rotation task, but not for the navigation task, while all items had to be solved correctly in the navigation task, but not in the mental rotation task.

However, while none of the variables tested were able to explain the sex difference alone, sex did not remain a significant predictor in any of the multiple regression models assessing the interactive effects of gender role and sex hormones. This suggests that it is their interactive effects that contribute to the differences observed between men and women, which is in line with psycho-biosocial models. Specifically, we observed the expected interaction between masculinity and testosterone for both performance measures that showed a sex difference, i.e., mental rotation accuracy and navigation times. While the interaction was qualified by biological sex for the MRT, it was qualified by progesterone in the VNT. In the MRT, a combined effect of masculinity and testosterone on accuracy was only observed in women. As expected, women with both high masculinity scores and high testosterone levels showed the best performance. For navigation times, the same effect was observed in participants with high progesterone levels, while the opposite effect was observed in participants with low progesterone levels. Since women have higher progesterone levels than men, both observations are in line with the assumption that progesterone modulates testosterone influences. They are, however, in the opposite direction to that hypothesized. While we assumed that testosterone effects would be stronger in participants with lower progesterone levels due to progesterone's higher affinity for the enzyme 5α-reductase, we found that testosterone and masculinity enhance spatial performance in participants with high progesterone levels (**Figures 2**, **3**), but impair spatial performance in participants with low progesterone levels (**Figure 3**). This suggests different mechanisms of testosterone action in the presence or absence of progesterone. It can be assumed that in the absence of progesterone, testosterone mainly acts as dihydrotestosterone, while in the presence of high progesterone, testosterone does not get converted to dihydrotestosterone. In this case, testosterone can either act directly on androgen receptors or act as estradiol on estrogen receptors after conversion via the enzyme aromatase. This suggests different effects of dihydrotestosterone, testosterone and estradiol on spatial abilities and emphasizes dihydrotestosterone as an important sex hormone to consider in future studies. The fact that progesterone only emerged as a predictor for navigation times, as well as the fact that the combined effect of masculinity and testosterone on mental rotation accuracy was only observed in women, may be attributable to a ceiling effect in men (compare **Figure 2**). Almost all men reached an accuracy of over 90 percent, leaving little room for variation due to gender role or sex hormones.

Interestingly, associations of gender role and sex hormones to spatial performance also emerged for mental rotation reaction times, the one measure that did not show sex differences in spatial abilities. Irrespective of biological sex, masculinity, but not femininity, emerged as a predictor of mental rotation reaction times, such that more masculine individuals of either sex showed faster reactions. This finding is in line with previous reports summarized in the meta-analysis by Reilly and Neumann (2013), and the effect size is in the range they reported. Note, however, that this effect only reached significance in the multiple regression model, when testosterone was also controlled for. Testosterone showed the opposite effect, i.e., individuals with higher testosterone levels were slower. While the correlation to testosterone was only significant in men, the effect survived across participants and did not interact significantly with biological sex. This is probably attributable to the overall lower testosterone levels in women. Likewise, in the correlation analyses, estradiol was related to slower reaction times across participants and progesterone was related to slower reaction times in men. Note, however, that the multiple regression analysis clearly identified testosterone as the hormone with the strongest influence, since neither estradiol nor progesterone survived as a predictor in that model. The fact that masculinity and testosterone had opposite effects on mental rotation reaction times may explain why no sex difference was found in this measure. Since masculinity includes personality traits like risk taking and competitiveness, its relationship to faster reactions seems plausible. The finding regarding testosterone levels, however, suggests that testosterone may play a role in regulating the speed-accuracy trade-off participants are faced with during a timed task. This finding hints at the possibility of testosterone improving spatial performance by slowing reaction times, leading to more deliberate decision making in the MRT. This idea is somewhat unexpected, as testosterone has previously been shown to increase impulsive behavior (e.g., Agrawal et al., 2018) and to improve spatial performance (e.g., Hooven et al., 2004; Hausmann et al., 2009). However, u-shaped relationships and negative activational influences of testosterone on spatial performance have also been reported (e.g., Hromatko and Tadinac, 2006). In contrast, estradiol has been discussed to decrease impulsive behavior (Howard et al., 1988; Diekhof, 2015; Roberts et al., 2018), which is in line with its relation to increased response times observed in the present study. Furthermore, the estradiol finding is in line with other studies suggesting a negative effect of estradiol on spatial performance in humans (e.g., Courvoisier et al., 2013; Hampson et al., 2014), but contrasts with findings from animal studies suggesting a positive effect of estradiol on spatial working memory (e.g., Williams et al., 1990; Healy et al., 1999; Workman et al., 2012). Note, however, that negative findings and null effects regarding estradiol actions on spatial performance have also been described in animal studies, and it has been discussed that estradiol actions may differ between different types of spatial abilities and depending on spatial strategy (Williams et al., 1990; Chesler and Juraska, 2000; Snihur et al., 2008; Lipatova and Toufexis, 2013).

However, the present study identified not only masculinity as interacting with sex hormone actions, but also femininity, in a three-way interaction of femininity, testosterone and estradiol on response times in both tasks (compare **Figure 4**). The fact that the association of testosterone to spatial response times is modulated by both estradiol and femininity may contribute to the mixed findings reported in the literature, where both positive and negative associations, as well as u-shaped relationships, have been reported (e.g., Hooven et al., 2004; Driscoll et al., 2005; Halari et al., 2005; Falter et al., 2006; Hausmann et al., 2009; Puts et al., 2010; Courvoisier et al., 2013). Furthermore, this three-way interaction may help to shed light on the seemingly contradictory findings discussed in the previous paragraph. Including femininity in the model revealed that testosterone related to response times in spatial tasks in participants with low estradiol levels, but depending on their femininity. These findings show that testosterone exerts its actions on response times (i) only in the absence of estradiol and (ii) in different directions for high and low femininity. In participants with low femininity, testosterone was related to slower reaction times, while in participants with higher femininity, testosterone was related to faster reaction times. Most importantly, this finding was consistent across the two spatial tasks employed in this study, even though response times revealed sex differences in the navigation task, but not in the mental rotation task.

Regarding (i), the fact that the interaction between femininity and testosterone only emerges for individuals with low estradiol levels may reflect the underlying biochemistry of these hormones. Testosterone is converted to estradiol via the enzyme aromatase. Accordingly, across comparable testosterone levels, high estradiol may reflect higher aromatase activity, while low estradiol may reflect lower aromatase activity. Thus, in individuals with higher aromatase activity, estradiol may be the more relevant hormone for modulating performance, while in individuals with low aromatase activity, testosterone may be the more relevant hormone for modulating performance.

Regarding (ii), the interaction between femininity and testosterone highlights, for the first time, an important role for femininity in spatial abilities. The fact that this role is clearly modulatory may explain why no associations between femininity and spatial performance were observed in previous studies (Reilly and Neumann, 2013). Linking this finding to the discussion in the previous paragraph regarding testosterone's relationship to more impulsive decision making, it appears that personality traits associated with high femininity (e.g., expressivity, neuroticism) fuel this association, while low femininity reverses it. This finding adds to the discussion that sex hormones may have different effects on differently organized neural structures. However, it appears that gender role is a better proxy for how a brain is organized, since femininity survived as a predictor in the model, while biological sex did not.

While of course most participants with high femininity were female and most participants with low femininity were male, we specifically identified 14 participants whose gender role did not correspond to their biological sex. Using a median cut-off, three men showed low masculinity but high femininity, i.e., a typically female gender role, while four women showed high masculinity but low femininity, i.e., a typically male gender role. Furthermore, several participants showed an indifferent (low masculinity and low femininity, 2 men) or androgynous (high masculinity and high femininity, 2 men, 3 women) gender role pattern.
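A median cut-off classification of this kind can be sketched as follows in Python. The function name `classify_gender_role`, the category labels, the treatment of scores exactly at the median as "low", and the example scores are all assumptions for illustration; the text does not specify these details.

```python
import statistics


def classify_gender_role(masculinity, femininity):
    """Classify each participant by median split on masculinity and femininity."""
    med_m = statistics.median(masculinity)
    med_f = statistics.median(femininity)
    labels = []
    for m, f in zip(masculinity, femininity):
        high_m, high_f = m > med_m, f > med_f  # ties count as "low" (assumption)
        if high_m and high_f:
            labels.append("androgynous")
        elif high_m:
            labels.append("masculine")
        elif high_f:
            labels.append("feminine")
        else:
            labels.append("indifferent")
    return labels


# Made-up questionnaire scores (not study data)
masc_scores = [1.0, 4.0, 4.5, 2.0]
fem_scores = [4.0, 1.5, 4.5, 1.0]
roles = classify_gender_role(masc_scores, fem_scores)
```

Comparing the resulting label with each participant's biological sex would then identify cases in which gender role and sex do not correspond.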

In summary, results of the present study suggest that neither gender role nor sex hormones alone mediate sex differences in spatial performance. Rather, it seems that their contributions to spatial performance are mostly combinatory and interactive. While masculinity seems to boost testosterone effects in those tasks that show significant sex differences, femininity modulates testosterone effects on response times. The combined effect of masculinity and testosterone was modulated by progesterone, while the interactive effect of femininity and testosterone was modulated by estradiol levels.

## DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the supplementary files.

## REFERENCES


## ETHICS STATEMENT

The study was approved by the University of Salzburg's ethics committee and conforms to the Code of Ethics of the World Medical Association (Declaration of Helsinki). Informed written consent was obtained from all participants.

## AUTHOR CONTRIBUTIONS

BP designed the study, analyzed the data, and wrote the manuscript. JS and LvL collected the data and assisted in the literature research. TH developed the navigation task, provided the VR equipment, and assisted in the data collection.

## FUNDING

This study was funded by the Austrian Science Fund (P28261).

## ACKNOWLEDGMENTS

We thank Johanna Hagg for her help in the data acquisition and Thomas Scherndl for the statistical advice. We also thank all the participants for their time and willingness to contribute to this study.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Pletzer, Steinbeisser, van Laak and Harris. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Aromatization Is Not Required for the Facilitation of Appetitive Sexual Behaviors in Ovariectomized Rats Treated With Estradiol and Testosterone

#### Sherri Lee Jones\*†‡, Stephanie Rosenbaum‡, James Gardner Gregory and James G. Pfaus†

Department of Psychology, Center for Studies in Behavioral Neurobiology, Concordia University, Montreal, QC, Canada

Testosterone can be safely and effectively administered to estrogen-treated postmenopausal women experiencing hypoactive sexual desire. However, in the United States and Canada, although it is often administered off-label, testosterone co-administered with estradiol is not a federally approved treatment for sexual arousal/desire disorder, partly because its mechanism is poorly understood. One possible mechanism involves the aromatization of testosterone to estradiol. In an animal model, the administration of testosterone propionate (TP) given in combination with estradiol benzoate (EB) significantly increases sexually appetitive behaviors (i.e., solicitations and hops/darts) in ovariectomized (OVX) Long-Evans rats, compared to those treated with EB-alone. The goal of the current study was to test whether blocking aromatization of testosterone to estradiol would disrupt the facilitation of sexual behaviors in OVX Long-Evans rats, and to determine group differences in Fos immunoreactivity within brain regions involved in sexual motivation and reward. Groups of sexually experienced OVX Long-Evans rats were treated with EB alone, EB+TP, or EB+TP and the aromatase inhibitor Fadrozole (EB+TP+FAD). Females treated with EB+TP+FAD displayed significantly more hops and darts, solicitations and lordosis magnitudes when compared to EB-alone females. Furthermore, TP, administered with or without FAD, induced the activation of Fos-immunoreactivity in brain areas implicated in sexual motivation and reward including the medial preoptic area, ventrolateral division of the ventromedial nucleus of the hypothalamus, the nucleus accumbens core, and the prefrontal cortex. These results suggest that aromatization may not be necessary for TP to enhance female sexual behavior and that EB+TP may act via androgenic pathways to increase the sensitivity of response to male-related cues, to induce female sexual desire.

Keywords: sexual desire, testosterone, estradiol, preclinical model, aromatase, fadrozole

Edited by: Belinda Pletzer, University of Salzburg, Austria

#### Reviewed by:

Charlotte A. Cornil, University of Liège, Belgium

Arturo Ortega, Center for Research and Advanced Studies (CINVESTAV), Mexico

#### \*Correspondence:

Sherri Lee Jones sherri.jones@mail.mcgill.ca

#### †Present address:

Sherri Lee Jones, Department of Psychiatry, McGill University, Montréal, QC, Canada

James G. Pfaus, Centro de Investigacion Cerebrales, Universidad Veracruzana, Xalapa, Mexico

‡These authors have contributed equally to this work and share first authorship

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 30 November 2018 Accepted: 17 July 2019 Published: 06 August 2019

#### Citation:

Jones SL, Rosenbaum S, Gardner Gregory J and Pfaus JG (2019) Aromatization Is Not Required for the Facilitation of Appetitive Sexual Behaviors in Ovariectomized Rats Treated With Estradiol and Testosterone. Front. Neurosci. 13:798. doi: 10.3389/fnins.2019.00798

## INTRODUCTION

The role of androgens and estrogens in male sexual behavior in rodent models has been well characterized (Hull et al., 1997; Sato et al., 2005; Hull and Dominguez, 2007), but the role of androgens given in combination with estradiol has not been well studied in female sexual behavior. This is particularly true for female sexually appetitive behaviors and the associated neural mechanisms, despite human data suggesting that testosterone plays a key role in female sexual desire. Testosterone is an effective treatment for estrogen-treated post-menopausal women experiencing hypoactive sexual desire (Sherwin et al., 1985; Burger et al., 1987; Shifren et al., 2000; Sherwin, 2002; Braunstein et al., 2005; Davis and Braunstein, 2012). However, one of the reasons that testosterone is not FDA-approved is a lack of understanding of the neural mechanisms through which it facilitates sexual desire in these women, and human studies cannot identify where in the brain testosterone is acting, nor the neural mechanisms through which it exerts its effects. Testosterone is an aromatizable androgen that can exert its effects through androgenic or estrogenic pathways and can activate androgen receptors directly or indirectly following conversion by 5α-reductase into dihydrotestosterone (DHT). Testosterone can also activate estrogen receptors following aromatization to estradiol (E2) (Simpson, 2002), or by increasing bioavailable E2 through the displacement of E2 from sex-steroid binding globulins, which bind androgens with higher affinity than estrogens (Burke and Anderson, 1972; Selby, 1990). Some early work has shown that both aromatizable and non-aromatizable androgens are involved in female rat sexual preference tests (de Jonge et al., 1986b), although comprehensive mechanistic studies of where in the brain and through which mechanisms testosterone facilitates sexual motivation when administered on an EB-baseline have not been conducted. The animal literature has identified hypothalamic and limbic regions and the prefrontal cortex as key brain regions involved in the activation of female sexually appetitive behaviors, making them candidate regions where testosterone may exert its actions. Thus, a better understanding of the mechanisms through which testosterone facilitates sexual desire, in candidate brain regions, can be addressed using preclinical rodent models.

Animal studies have demonstrated that testosterone propionate (TP) can facilitate sexual behaviors in ovariectomized (OVX) as well as gonadally intact reproductively senescent female rats. Administration of TP to OVX rats treated with estradiol benzoate (EB) increases scent-marking frequency, proceptive behaviors and partner preference for sexually active males over EB administration alone (de Jonge et al., 1986a; Van de Poll et al., 1988). Administration of TP also synergistically increases proceptive (i.e., appetitive) sexual behaviors in OVX females treated with EB and progesterone (Fernández-Guasti et al., 1991), and to levels equivalent to treatment with EB and progesterone (Jones et al., 2017). It has also been shown that in the aged gonadally-intact female rat, TP capsules implanted subcutaneously acutely increase both appetitive and consummatory sexual behaviors (Jones et al., 2012). Recently, it was reported that TP administered to sexually experienced, EB-treated OVX Long-Evans rats 4 h prior to testing facilitates appetitive sexual behaviors beyond the effect of EB alone (Jones et al., 2017). Thus, this rodent model can be useful for increasing our understanding of the mechanisms involved in TP-induced facilitation in an animal model of hypoactive sexual desire.

One potential mechanism through which testosterone can facilitate sexual desire is an androgenic pathway. Testosterone binds to androgen receptors directly, and indirectly following reduction to DHT, and numerous reports suggest this as a possible mechanism. Firstly, whereas estrogen replacement therapy alone does not restore decreased sexual function, desire and arousal in many postmenopausal women (Utian, 1975; Nathorst-Böös et al., 1993; Shifren et al., 1998), studies have shown that testosterone, even in the absence of E2, yields a modest, yet significant increase in sexual episodes and desire in post-menopausal women (Davis et al., 2008). Secondly, human studies have found a limited role of aromatization in testosterone's ability to reinstate female sexual behavior. In one clinical study, post-menopausal women who were unresponsive to estrogen therapy received transdermal testosterone in combination with either the aromatase inhibitor letrozole or placebo. Blocking aromatization with letrozole did not affect the enhancement in sexual satisfaction, general well-being and overall mood (Davis et al., 2006). In addition, Shifren et al. (2000) demonstrated that while a transdermal testosterone patch improved sexual function and well-being in postmenopausal women over placebo alone, serum free estradiol concentrations between these groups did not significantly differ, suggesting minimal aromatization. These results indicate that aromatization may not be necessary for testosterone to exert its facilitative role on female sexual desire in women treated with estrogens and suggest that facilitation may occur via an androgenic mechanism.

The first goal of the current study was to determine whether administration of the aromatase inhibitor fadrozole (FAD) would block the facilitation of female sexual behavior by TP in EB-treated females. Androgen receptors, estrogen receptors and the aromatase enzyme are widely distributed in the female brain, including the medial preoptic area (mPOA), ventromedial hypothalamus (VMH), and amygdala (Roselli et al., 1985, 1987; Handa et al., 1986, 1987; Wu and Gore, 2009; Wu et al., 2009; Feng et al., 2010; Stanić et al., 2014), and these are some key regions implicated in sexually appetitive behaviors. Thus, a second goal was to begin to address the activation of neural regions by testosterone's facilitation of female sexual desire. To this end, we examined the number of Fos-immunoreactive (Fos-IR) cells within brain regions associated with sexual behavior (Pfaus and Heeb, 1997).

## MATERIALS AND METHODS

## Animals

Sexually naive Long-Evans female rats (150–200 g) were obtained from Charles River (St-Constant, Quebec). Female rats were housed in pairs in shoebox cages in a reversed lighting schedule (12/12 h light-dark, with lights off at 8 p.m.). Food and water were given ad libitum. Male Long-Evans rats (200–250 g) obtained from the same supplier were used as stimulus animals (n = 33). These males were sexually experienced in the bi-level chambers with a group of OVX sexually experienced Long-Evans stimulus females primed with EB (10 µg/0.1 mL sesame oil) and progesterone (500 µg/0.1 mL sesame oil) administered 48 and 4 h prior to sexual training, respectively. Males were housed in groups of 3 or 4 in large plexiglass chambers lined with betachip. All other housing conditions were identical to those described for females.

All animal procedures were conducted in accordance with the standards established by the Canadian Council on Animal Care (CCAC) and approved by the Concordia University Animal Ethics Committee.

## Surgery

One week after arrival, experimental female rats were bilaterally ovariectomized (OVX) through lumbar incisions under a mixture of 4 parts ketamine hydrochloride to 3 parts xylazine hydrochloride administered by intraperitoneal injection (1 mL/kg of body weight). Females were treated post-operatively with subcutaneous injections of 3cc physiological saline, 0.03 mL Banamine and 0.1 mL Penicillin G.

## Hormone and Drug Preparation

All steroid compounds were obtained from Steraloids (Newport, RI). EB (10 µg), progesterone (500 µg), and TP (200 µg) were dissolved in 0.1 mL sesame oil under low heat for approximately 30 min, and stored at room temperature. Fadrozole hydrochloride (FAD; 1.25 mg/kg, Novartis Pharma and Sigma Aldrich) was dissolved in 0.1 mL of 0.9% physiological saline containing 20% 2-hydroxypropyl-β-cyclodextrin and administered via subcutaneous injection twice a day (12 h apart). This dose was selected based on work showing that E2 was reduced in hypothalamic and amygdaloid nuclear pellets in FAD-treated males compared to controls (Bonsall et al., 1992).

## Experimental Procedure

All sexual behavior training and testing occurred in bi-level chambers (Mendelson and Pfaus, 1989), during the middle third of the dark cycle. These chambers are designed to facilitate the experimenter's view of the full repertoire of sexual behaviors (Mendelson and Pfaus, 1989; Pfaus et al., 1999). Males were placed in the chamber alone for a 5 min habituation period. Next, females were introduced to the chamber for a 30 min training session.

After a 7 day post-operative recovery period, experimental females were primed with subcutaneous injections of EB 48 h before, and progesterone 4 h prior to, each of four sex-training sessions with sexually vigorous males (Jones et al., 2013). The purpose of the sexual training sessions is to ensure that all females have sexual experience and to reduce variability in sexual responding (Gerall and Dunlap, 1973; and as in Jones et al., 2013). Following these 4 training sessions, females were given a 2 week hormone wash-out period before being randomly assigned to one of three experimental groups (n = 11/group). During this 2 week hormone wash-out, males were given 30 min training sessions with a different subset of sexually-experienced, hormonally-primed females every 4 days, to keep them sexually active.

EB was administered to experimental females by subcutaneous injection 48 h, and TP (or an equal volume of the oil control) 4 h before testing. FAD (or an equal volume of the vehicle control) was administered by subcutaneous injection at 8 a.m. and 8 p.m. every day for 3 days including the test day (**Figure 1**). For the experimental session, females were given 30 min to copulate with a sexually vigorous male.

All training and test sessions were video-recorded with a Sony Handycam. Digital files were transferred to a personal computer, and sexual behaviors were scored blind to group condition using the Behavioral Observation Program (Cabilio, 1996) customized for rat sexual behavior.

## Behavioral Measures

Solicitations, defined as head-wise orientation toward the male followed by a run-away to the same or a different level, and hops and darts were used as measures of appetitive sexual behaviors (Pfaus et al., 1999; Jones et al., 2017). The consummatory measure, lordosis, was measured on a 4-point scale according to Hardy and Debold (1971) such that no lordosis was coded as a zero and increasing lordosis magnitudes (LM) from low to high were coded from 1 to 3. A lordosis quotient (LQ) was calculated by taking the ratio of total LMs to the number of mounts, intromissions and ejaculations received from the male. Mounts, intromissions and ejaculations received from the male were also coded (Pfaus et al., 1999).
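The lordosis scoring described above can be sketched as follows. This is only an illustrative reading, not the scoring software used in the study: it assumes that LQ is the proportion of male stimulations (mounts, intromissions, ejaculations) that elicited any lordosis (score of 1 or more), and that the lordosis rating (LR) is the mean magnitude across stimulations. The function name `lordosis_measures` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def lordosis_measures(scores: Sequence[int]) -> Optional[Tuple[float, float]]:
    """Return (LQ, LR) from one 0-3 lordosis score per stimulation received.

    Assumptions (not stated in this form in the text):
      - LQ = stimulations eliciting a lordosis (score >= 1) / all stimulations
      - LR = mean lordosis magnitude across all stimulations
    Returns None when the female received no mounts, because neither
    measure can be computed in that case.
    """
    if not scores:
        return None
    if any(s not in (0, 1, 2, 3) for s in scores):
        raise ValueError("lordosis magnitudes must be coded 0, 1, 2 or 3")
    n = len(scores)
    lq = sum(1 for s in scores if s > 0) / n
    lr = sum(scores) / n
    return lq, lr


# Hypothetical female that received four stimulations.
print(lordosis_measures([3, 2, 0, 3]))
```

Returning `None` for unmounted females mirrors the exclusion rule applied to the lordosis analyses, where only females that received a mount could be included.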

## c-Fos Immunoreactivity

Two weeks following the test day, a subset of females that had been behaviorally responsive on the test day (n = 5/group) were given their respective treatments of EB, EB+TP, or EB+TP+FAD, and were exposed to a sexually vigorous male behind a metal grid divider for 1 h prior to sacrifice. This was done so that females received only visual, auditory and olfactory cues from the males, since the goal was to investigate activation of regions involved in sexually appetitive behaviors without the confound of activation induced by receipt of sexual stimulation from the male, which is also known to differentially activate brain regions (Pfaus et al., 1993, 1994, 1996).

## Immunocytohistochemistry

Females were deeply anesthetized with an intraperitoneal injection of sodium pentobarbital (120 mg/kg/mL), and perfused intracardially with ice-cold phosphate-buffered saline (300 mL) followed by ice-cold 4% paraformaldehyde in 0.1 M phosphate buffer (300 mL). Brains were then removed, postfixed in 4% paraformaldehyde for 4 h, and stored overnight in 30% sucrose at 4 °C.

## Histology

Frozen coronal brain sections were sliced using a cryostat from the olfactory bulb until the beginning of the cerebellum. All sections were rinsed in cold 0.9% 50 mM Tris-buffered saline (TBS) and put into a 30% hydrogen peroxide TBS solution and left for 30 min at room temperature. The sections were incubated for 2 h at room temperature in a 3% Normal Goat Serum (NGS) solution mixed in 0.2% triton TBS. Following the preblocking phase, sections were incubated for 72 h at 4 °C in a solution containing: 3% NGS, primary rabbit polyclonal c-Fos antibody (Fos ab5, Calbiochem, Mississauga, ON; diluted 1:10,000) in a 0.05% triton TBS solution. Sections were transferred into a solution containing: 3% NGS, secondary antibody (Vector Laboratories Canada, Burlington, ON; 1:200) in a 0.2% triton TBS solution for 1 h at 4 °C. Sections were then incubated for 2 h at 4 °C in the avidin-biotinylated-peroxidase complex (Vectastain Elite, ABC kit, Vector Laboratories, diluted 1:55). Sections were washed in TBS (3 × 5 min) between each incubation. Sections were then washed for 10 min in a 50 mM Tris buffer solution (pH = 7.6) before transferring to 3,3′-diaminobenzidine (DAB) in 50 mM Tris (0.1 mL of DAB/Tris buffer, pH 7.6) for another 10 min. Finally, sections were incubated in an 8% NiCl<sub>2</sub> (0.08 g) solution (400 µL per 100 mL of DAB/H<sub>2</sub>O<sub>2</sub> solution). The DAB reaction was stopped by transferring the sections to cold TBS (3 × 10 min washes) at room temperature. Sections were then mounted on gel-coated slides and allowed at least 24 h to dry. Sections were then dehydrated for 10 min each in 70, 90, and 100% ethanols, and immersed in xylene for 2 h. The sections were then coverslipped using Permount and allowed to dry for 48 h before examination under a light microscope. Confirmation of successful Fos-IR was made when dark staining was detected within cell nuclei, as in Pfaus et al. (1993).

FIGURE 1 | Experimental timeline. Females were ovariectomized 1 week after arrival into the colony, and given 1 week of recovery. All females were primed with estradiol benzoate (EB) 48 h before, and progesterone (P) 4 h prior to each of four sex-training sessions with males. After a 2 week hormone washout period females were randomly assigned to one of three experimental groups (n = 11/group): EB+Oil+Saline, EB+TP+Saline, or EB+TP+FAD (fadrozole). Estradiol benzoate (EB) was administered to experimental females by subcutaneous injection 48 h, and testosterone propionate (TP), or an equal volume of the oil control, 4 h before testing. FAD, or an equal volume of the vehicle control, was administered by subcutaneous injection at 8 a.m. and 8 p.m. every day for 3 days including the test day. For the experimental session, females were given 30 min to copulate with a sexually vigorous male.

Tissue sections were examined at 40×, and average numbers of Fos-IR cells were counted bilaterally using the five sections for each region/rat that appeared to contain the largest number of Fos-IR cells (Pfaus et al., 1993, 1996; Coria-Avila and Pfaus, 2007; Parada et al., 2010). Using the Paxinos and Watson (1986) rat brain atlas, regions of interest were identified using standard visible anatomical landmarks (Pfaus et al., 1993, 1996; Smith et al., 1997; Coria-Avila and Pfaus, 2007; Parada et al., 2010). Fos-IR cells were counted in the infralimbic prefrontal cortex (IL; Plates 8–10), medial amygdala (MeA; Plates 27–29), medial preoptic area (mPOA; Plates 20–22), ventromedial hypothalamic nucleus (VMH; Plates 27–29), ventral tegmental area (VTA; Plates 39–43), and nucleus accumbens (NAc) core and shell (Plates 11–15). The methodology applied for taking pictures, selecting the region of interest, and counting Fos-IR cells followed previous papers from our group, specifically Coria-Avila and Pfaus (2007), Parada et al. (2010), Pfaus et al. (1993, 1996) and Smith et al. (1997). All pictures were taken by a researcher (JGG) blind to experimental group. The researcher identified and captured all sections containing the region of interest that could be identified with the visible landmarks (as described in **Figure 2**).

Images of each section were captured on a desktop computer under the same light intensity using Q Capture Pro (version 5.1) connected to a Leitz microscope (40×) and saved in TIFF format before importing into ImageJ. ImageJ software was used to count the number of Fos-IR cells in each region by a researcher blind to experimental group (SR). For each brain region, the region of interest was identified according to standard anatomical landmarks (**Figure 2**), then manually outlined on the sections containing the largest number of Fos-IR positive cells. It should be noted that this can lead to some minimal degree of intersubject variability in the exact location of Fos-IR counts within the region of interest. The region of interest was identified and outlined as described in previous publications (Pfaus et al., 1993, 1996; Smith et al., 1997; Coria-Avila and Pfaus, 2007; Parada et al., 2010). Our methodology for counting Fos-IR cells consisted, first, of adjusting the brightness and contrast on the first section counted using ImageJ and noting that contrast value to apply it to all subsequent images for that region. Next, the threshold tool was used to manually capture all cells that were subjectively identified as immunopositive, blind to experimental group. For all images, circularity was set to 0.3–1, and pixel size was set to 2–40.
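To make the size and circularity criteria above concrete, the following minimal sketch filters candidate particles the way a particle-analysis step might. It assumes the usual circularity definition, 4π × area / perimeter², capped at 1, and treats "pixel size" as particle area in pixels; both are assumptions about how the settings were applied, and the function names are hypothetical rather than part of ImageJ.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Circularity = 4*pi*area / perimeter**2, capped at 1.0.

    A perfect circle scores 1.0; elongated shapes approach 0.
    """
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return min(1.0, 4.0 * math.pi * area / perimeter ** 2)


def keep_particle(area: float, perimeter: float,
                  size_range=(2.0, 40.0), circ_range=(0.3, 1.0)) -> bool:
    """True if a thresholded particle passes both the size and shape criteria."""
    size_ok = size_range[0] <= area <= size_range[1]
    circ_ok = circ_range[0] <= circularity(area, perimeter) <= circ_range[1]
    return size_ok and circ_ok


# Hypothetical particles: a round nucleus-like blob and a thin streak.
round_blob = (9 * math.pi, 6 * math.pi)   # circle of radius 3 pixels
thin_streak = (30.0, 62.0)                # 1 x 30 pixel rectangle
print(keep_particle(*round_blob), keep_particle(*thin_streak))
```

Under these assumptions the round blob is counted and the streak is rejected on shape alone, even though its area falls inside the size window.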

## Statistical Analyses

Data were analyzed with Statistical Package for the Social Sciences (SPSS) software (Version 18). Due to violation of homogeneity of variance, the Kruskal–Wallis test was used to analyze behavioral differences between groups. Post hoc analyses were conducted using the Mann–Whitney U test, and a Bonferroni correction was applied for the three group comparisons (padj); unadjusted p-values are also reported for transparency and interpreted as trends. Effect sizes were computed on the Mann–Whitney tests using the formula r = Z/√n. The level of significance was set to 0.05 for all tests.
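The effect-size formula and the Bonferroni adjustment described above amount to two short calculations, sketched here. The sketch assumes that n is the combined number of animals in the two groups being compared (11 + 11 = 22) and that adjusted p-values are capped at 1; the function names are illustrative only, and the analyses themselves were run in SPSS.

```python
import math


def effect_size_r(z: float, n_total: int) -> float:
    """Effect size for a Mann-Whitney comparison: r = |Z| / sqrt(n).

    n_total is assumed to be the combined size of the two groups compared.
    """
    return abs(z) / math.sqrt(n_total)


def bonferroni(p: float, n_comparisons: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1.0."""
    return min(1.0, p * n_comparisons)


# Example with a Z of 2.486 and an unadjusted p of 0.013 across three comparisons.
print(round(effect_size_r(2.486, 22), 2))   # 0.53
print(round(bonferroni(0.013, 3), 3))       # 0.039
```

With three pairwise comparisons, an unadjusted p-value must fall below roughly 0.0167 for the adjusted value to stay under the 0.05 threshold.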

Data are presented using boxplots, and outliers were defined as generated by SPSS (outliers are defined as values 1.5–3 times the interquartile range, and extreme outliers as values 3 or more times the interquartile range). The number of animals that displayed at least one occurrence of the behavior was calculated. All animals were included in all analyses, except for lordosis measures, where only females that received a mount from a male were included, because the calculation of LQ and LM depends on mounts received.
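The outlier rule above can be written as a small classifier. This sketch assumes that distances are measured from the nearest edge of the box (the first or third quartile), which is how boxplot outliers are conventionally defined; the quartiles are passed in directly because statistical packages differ in how they compute them. The function name and labels are hypothetical.

```python
def classify_point(value: float, q1: float, q3: float) -> str:
    """Label a value as 'typical', 'outlier', or 'extreme' for a boxplot.

    Distance is measured from the nearest box edge in units of the
    interquartile range (IQR): 1.5 to <3 IQR is an outlier, and
    3 IQR or more is an extreme outlier.
    """
    if q3 < q1:
        raise ValueError("q3 must be greater than or equal to q1")
    if q1 <= value <= q3:
        return "typical"
    iqr = q3 - q1
    distance = (q1 - value) if value < q1 else (value - q3)
    if distance >= 3 * iqr:
        return "extreme"
    if distance >= 1.5 * iqr:
        return "outlier"
    return "typical"


# Hypothetical box with quartiles at 2 and 4 (IQR = 2).
for v in (5, 8, 10):
    print(v, classify_point(v, q1=2, q3=4))
```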

Brain data were analyzed using a one-way analysis of variance (ANOVA) to test for differences between EB-alone, EB+TP and EB+TP+FAD groups, and significant ANOVAs were followed up with Fisher's Least Significant Difference post hoc analysis. The level of significance was set at 0.05 for all comparisons. Eta squared is reported as a measure of effect size for ANOVAs and Hedges' g for between-group comparisons.
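For readers unfamiliar with these two effect-size measures, the sketch below computes them from raw group data. It assumes the standard definitions: eta squared as the between-group sum of squares divided by the total sum of squares, and Hedges' g as the pooled-SD standardized mean difference multiplied by the small-sample correction 1 − 3/(4(n₁ + n₂) − 9). The text does not state which variant of g was used, and the cell counts in the example are invented for illustration.

```python
import math
from statistics import mean, variance
from typing import Sequence


def eta_squared(groups: Sequence[Sequence[float]]) -> float:
    """Eta squared for a one-way design: SS_between / SS_total."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total


def hedges_g(a: Sequence[float], b: Sequence[float]) -> float:
    """Hedges' g: Cohen's d (pooled SD) times a small-sample correction."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (na + nb) - 9)
    return d * correction


# Invented Fos-IR counts for three groups (not data from the study).
eb = [1.0, 2.0, 3.0]
eb_tp = [2.0, 3.0, 4.0]
eb_tp_fad = [5.0, 6.0, 7.0]
print(round(eta_squared([eb, eb_tp, eb_tp_fad]), 4))
print(round(hedges_g(eb_tp, eb), 2))
```

The correction factor matters at the group sizes used for the brain data: with five animals per group it shrinks d by roughly 10%.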

## RESULTS

The percentage of females displaying each behavior within each treatment group is shown in **Table 1**.

## Appetitive Sexual Behaviors

The non-parametric Kruskal–Wallis test was conducted to test for behavioral differences between groups. Females treated with EB+TP+FAD displayed more hops/darts (**Figure 3A**) compared to EB-alone (U = 24, z = 2.486, p = 0.013, padj = 0.039, r = 0.53), and to levels equivalent to EB+TP (U = 44, z = 1.091, p = 0.275, padj = 0.825, r = 0.23; main effect, X<sup>2</sup> (2) = 7.530, p = 0.023), whereas EB+TP tended to increase the number of hops/darts compared to EB-alone (U = 30.5, z = 2.04, p = 0.041, padj = 0.123, r = 0.43). Sexual solicitations (**Figure 3B**) did not differ between females treated with EB+TP compared to EB-alone (U = 44, p = 0.069, padj = 0.207, z = 1.817, r = 0.39), whereas females administered EB+TP+FAD displayed significantly more solicitations compared to EB-alone (U = 16.4, p = 0.001, padj = 0.003, z = −3.354, r = 0.715) and tended to display more than females treated with EB+TP (U = 30, p = 0.032, padj = 0.096, z = −2.144, r = 0.46); main effect, X<sup>2</sup> (2) = 13.009, p = 0.001.



<sup>a</sup>For lordosis quotient (LQ) and lordosis rating (LR), only those females that were mounted could be included in the analyses. The number of females mounted (with or without intromission) per group are EB+O n = 6; EB+TP n = 8, EB+TP+FAD n = 11.

FIGURE 3 | Median frequency of hops/darts (A), solicitations (B), level changes (C), defensive behaviors (D), lordosis rating (E), and lordosis quotient (F) of ovariectomized Long-Evans rats (n = 11/group) treated with estradiol benzoate (EB) with or without testosterone propionate (TP) and the aromatase inhibitor fadrozole (FAD). Data were analyzed using Kruskal–Wallis to detect differences between groups, and significant effects were followed up using Mann–Whitney U, and p-values were adjusted using a Bonferroni correction. Boxes represent interquartile range, and whiskers each represent the top and bottom 25% of scores. <sup>o</sup>Outlier. <sup>+</sup>Extreme outlier. <sup>∗</sup>Different from EB-alone, padj < 0.05. #Tendency to differ from EB-alone, p < 0.05, or padj < 0.10. <sup>a</sup>Tendency to differ from EB+TP, p < 0.05.

## Level Changes and Defensive Behaviors

More level changes (**Figure 3C**) were observed in females treated with EB+TP+FAD (U = 19, p = 0.006, padj = 0.018, Z = −2.727, r = 0.58) compared to those treated with EB-alone, whereas there was a tendency for EB+TP to increase level changes compared to EB-alone (U = 28, p = 0.033, padj = 0.099, Z = −2.137, r = 0.46); EB+TP and EB+TP+FAD did not differ (U = 47, p = 0.375, padj = 1.00, Z = −0.887, r = 0.19) [main effect, X<sup>2</sup> (2) = 8.625, p = 0.013]. Defensive behaviors (**Figure 3D**) did not differ between groups, X<sup>2</sup> (2) = 2.761, p = 0.251.

## Lordosis

Lordosis rating (LR; **Figure 3E**) was higher in females treated with EB+TP+FAD compared to EB-alone (U = 6.0, p = 0.005, padj = 0.015; Z = −2.781, r = 0.70) and tended to be higher in females treated with EB+TP compared to EB-alone (U = 9.0, p = 0.036, padj = 0.108, Z = −2.094, r = 0.45), whereas LR did not differ between females treated with EB+TP+FAD and EB+TP [U = 18.0, p = 0.093, padj = 0.279, Z = −1.680, r = 0.41; main effect, X<sup>2</sup> (2) = 9.455, p = 0.009]. LQ tended to be higher in females treated with EB+TP compared to EB-alone (U = 12, p = 0.052, padj = 0.156, Z = −1.940, r = 0.52, **Figure 3F**), and was significantly higher in females treated with EB+TP+FAD (U = 9.0, p = 0.009, padj = 0.027, Z = −2.612, r = 0.63) compared to EB-alone, whereas LQ did not differ between females treated with EB+TP and those treated with EB+TP+FAD [U = 28.0, p = 0.206, padj = 0.618, Z = −1.355, r = 0.31; main effect, X<sup>2</sup> (2) = 7.802, p = 0.020].

FIGURE 4 | Mounts (A), intromissions (B), and ejaculations (C) that males made toward ovariectomized Long-Evans rats (n = 11/group) treated with estradiol benzoate (EB) with or without testosterone propionate (TP) and the aromatase inhibitor fadrozole (FAD). Data were analyzed using Kruskal–Wallis to detect differences between groups, and significant effects were followed up using Mann–Whitney U, and p-values were adjusted using a Bonferroni correction. Boxes represent interquartile range, and whiskers each represent the top and bottom 25% of scores. <sup>o</sup>Outlier. <sup>+</sup>Extreme outlier. <sup>∗</sup>Different from EB-alone, padj < 0.05. #Tendency to differ from EB+TP, p < 0.05; ∗∗Different from EB-alone and EB+TP, both padj < 0.05.

## Male Stimulations

Females treated with EB+TP+FAD received significantly more mounts (**Figure 4A**) than females treated with EB-alone (U = 21, p = 0.008, padj = 0.024, Z = −2.626, r = 0.56), whereas females treated with EB+TP did not differ from EB-alone (U = 41.5, p = 0.200, padj = 0.600, Z = −1.281, r = 0.27), or from EB+TP+FAD [U = 38.5, p = 0.151, padj = 0.453, Z = −1.452, r = 0.31; main effect, X<sup>2</sup> (2) = 7.173, p = 0.028].

Whereas females treated with EB+TP did not differ from EB-alone in the number of intromissions received (U = 49.5, p = 0.148, padj = 0.444, Z = −1.447, r = 0.31), females treated with EB+TP+FAD received significantly more intromissions than females treated with EB-alone (U = 16.5, p = 0.001, padj = 0.003, Z = −3.353, r = 0.71) and tended to receive more than females treated with EB+TP [U = 28.0, p = 0.020, padj = 0.060, Z = −2.332, r = 0.50; **Figure 4B**; main effect, X<sup>2</sup> (2) = 13.729, p = 0.001].

Similarly, whereas females treated with EB+TP did not differ from EB-alone in the number of ejaculations received (U = 55.0, p = 0.317, padj = 0.951, Z = −1.000, r = 0.21), females treated with EB+TP+FAD received significantly more ejaculations than females treated with EB-alone (U = 22, p = 0.002, padj = 0.006, Z = −3.067, r = 0.65), and compared to females treated with EB+TP [U = 28.5, p = 0.014, padj = 0.042, Z = −2.451, r = 0.52; **Figure 4C**; main effect, X<sup>2</sup> (2) = 13.136, p = 0.001].

## Fos-IR

Descriptive data of all Fos-IR counts for each brain region by group are shown in **Table 2**, and representative pictures are shown in **Figure 2**. One-way ANOVAs were used to determine if there were significant differences between treatment groups, followed by an LSD post hoc analysis. EB+TP and EB+TP+FAD had higher Fos-IR counts than EB-alone in the mPOA [p = 0.023, g = 1.66; p = 0.01, g = 2.71, respectively; main effect of group, F(2, 11) = 5.432, p = 0.023, R<sup>2</sup> = 0.497], the NAc core [p = 0.02, g = 2.36; p = 0.01, g = 2.70, respectively; main effect of group, F(2, 11) = 6.008, p = 0.022, R<sup>2</sup> = 0.572], the IL [p = 0.005, g = 2.52; p = 0.0048, g = 2.62, respectively; main effect of group, F(2, 11) = 6.912, p = 0.015, R<sup>2</sup> = 0.606], and the vlVMH [p = 0.024, g = 2.46; p = 0.022, g = 1.74, respectively; main effect of group, F(2, 12) = 14.705, p = 0.036, R<sup>2</sup> = 0.485], but EB+TP and EB+TP+FAD did not differ from each other (mPOA, p = 0.624; NAc core, p = 0.654; IL, p = 0.180; vlVMH, p = 0.850). In the VTA, EB+TP+FAD females had higher Fos-IR counts than EB-alone (p = 0.007, g = 2.95), whereas EB+TP tended to increase the number of Fos-IR counts compared to EB-alone (p = 0.08, g = 1.57), but EB+TP and EB+TP+FAD did not differ (p = 0.122) [main effect of group, F(2, 10) = 6.409, p = 0.022, R<sup>2</sup> = 0.616].

No differences between groups were found in the dmVMH [F(2, 12) = 1.874, p = 0.204, R<sup>2</sup> = 0.273], the NAc shell [F(2, 11) = 2.041, p = 0.186, R<sup>2</sup> = 0.312], or the MeA [F(2, 12) = 0.455, p = 0.647, R<sup>2</sup> = 0.083].


TABLE 2 | Average ± SEM numbers of Fos-immunoreactive cells in different hypothalamic and limbic structures in ovariectomized female rats treated with estradiol benzoate (EB) alone, or in combination with testosterone propionate (TP), or TP and the aromatase inhibitor fadrozole (FAD).


n = 5/group. Some tissue sections were damaged or had poor staining, resulting in 3 to 5 sections per region. EB, Estradiol Benzoate; TP, Testosterone Propionate; FAD, Fadrozole Hydrochloride; MeA, Medial Amygdala; mPOA, Medial Preoptic Area; NAc, Nucleus Accumbens; IL, Infralimbic Prefrontal Cortex; VMH, Ventromedial nucleus of the hypothalamus; VTA, Ventral Tegmental Area. <sup>∗</sup>Different from EB, p < 0.05.

## DISCUSSION

The purpose of this study was to determine whether administration of the aromatase inhibitor FAD would disrupt the facilitation of female sexually appetitive behaviors that occurs with TP treatment in EB-treated OVX rats, and to determine whether Fos-IR differed between groups in brain regions known to be involved in sexual motivation and reward. The present results illustrate that blocking aromatization using FAD in females treated with EB+TP increased hops/darts, solicitations, level changes and lordosis measures compared to those treated with EB-alone. These findings suggest that aromatization of TP to estradiol is not necessary for the display of female sexual behaviors in OVX rats treated with EB and TP. The Fos-IR data suggest that TP may act within the mPOA, NAc core, IL, and vlVMH to elicit its effects, as well as within the VTA, which specifically had higher numbers of Fos-IR cells in the EB+TP+FAD group compared to the EB-alone group.

In the current study, the behavioral levels induced by EB+TP were less pronounced than levels reported in Jones et al. (2017), particularly for LQ. However, this is not surprising given previous reports that a number of factors can influence behavioral sensitivity to estradiol, such as sexual experience (Gerall and Dunlap, 1973; Pfaus et al., 1999), EB dose (Pfaus et al., 1999; Jones et al., 2013), strain (Jones et al., 2013), bedding type (Jones et al., 2015), and exposure to male cues (Jones et al., 2015). Important individual differences exist in behavioral sensitivity to hormone treatments on sexual behavior. Thus, to ensure that females were all behaviorally sensitive to sex steroid hormones, we examined behaviors induced by EB+P priming on the fourth day of behavioral training, and for all groups LQ and LR were near maximal (range LQ = 0.93–0.98; range LR = 2.43–2.69), and no differences were detected between groups on any behavioral measure (**Supplementary Table 1**) suggesting that on average, the groups were equally as responsive to sex steroid hormones under equivalent and optimal hormone priming conditions. The variability in sensitivity to EB and TP is reminiscent of reports in the human literature, showing that some women's low sexual desire responds rather well to estrogens administered alone, and that testosterone can be particularly beneficial to improving sexual desire in women who are unresponsive to estradiol alone, as originally reported by Burger et al. (1987). In addition to the environmental and experiential factors outlined above, hormone sensitivity can be dependent on differences in biological mechanisms, such as steroid hormone receptor density, enzymes, and hormone binding globulins, among other factors. 
Although the effectiveness of surgical ovariectomy and hormone administration was not formally tested, the high and normal levels of behavioral responding during the training phase, as well as the low level of responding in the control groups, suggest that those manipulations were effective. Additional research will be needed to increase our understanding of individual differences in hormone sensitivity, and to determine who responds best to which treatments. Such considerations are already being taken into account for women presenting with differing etiologies (i.e., top-down or bottom-up sexual inhibition) of hypoactive sexual desire (e.g., Sarin et al., 2013; Poels et al., 2014).

The facilitation by EB+TP compared to EB-alone did not attain the strict statistical cut-offs in the current study, in contrast to the statistically significant increase in appetitive behaviors reported in Jones et al. (2017). We note, however, that the pattern of results in the current study mimics that reported in Jones et al. (2017), and moreover, the effect sizes on appetitive behaviors between EB-alone and EB+TP treated animals are similar in magnitude in the current study (hops/darts r = 0.43; solicitations r = 0.39; level changes r = 0.46) and Jones et al. (2017) (hops/darts r = 0.68, solicitations r = 0.68, level changes r = 0.60). The effect sizes range from moderate to large, suggesting a reliable and moderate ability for TP to facilitate appetitive sexual behaviors in EB-treated OVX female rats.

In the present study, blocking aromatase in OVX EB+TP treated rats enhanced appetitive sexual behaviors beyond that of EB-alone. The administration of TP tended to increase hops/darts and level changes beyond that of EB-alone, with moderate effect sizes on hops/darts, level changes, as well as solicitations (with r ranging from 0.39 to 0.46). The administration of FAD to females treated with EB+TP enhanced appetitive measures of sexual behaviors, such that EB+TP+FAD females displayed significantly more hops/darts than EB-alone, and tended to display more sexual solicitations than females treated with EB+TP. The effect sizes between EB and EB+TP+FAD were moderate to large, with r = 0.53 for hops/darts and r = 0.715 for solicitations, and small to moderate between EB+TP and EB+TP+FAD, with r = 0.23 for hops/darts and r = 0.46 for solicitations. These findings suggest that aromatization to estradiol is not necessary for the facilitation of appetitive sexual behaviors by TP when administered to EB-treated females.

One strict interpretation of the present data is that FAD had no statistically significant facilitative effect on appetitive sexual behaviors beyond treatment with EB+TP (i.e., only a statistical trend for FAD to increase solicitations beyond EB+TP was detected). This interpretation could suggest that FAD may release an inhibitory effect induced by EB+TP (given that EB+TP+FAD facilitated sexually appetitive behaviors beyond EB-alone), which could involve, for example, extragonadal estradiol synthesis. However, an inhibitory action of extragonadal estradiol in the context of these data may not be a likely explanation for two reasons. First, estradiol is not inhibitory to sexual behavior in OVX oil-treated animals, and is necessary for the display of sexual behaviors (Pfaff, 1980). One mechanism through which EB+TP is thought to exert its effects is by indirectly increasing bioavailable estradiol, following its displacement from steroid hormone binding globulins by testosterone (Burke and Anderson, 1972). However, in our OVX females, endogenous levels of estradiol are probably too low, even given the multiple sites of extra-gonadal synthesis of estradiol (Barakat et al., 2016), particularly because FAD was administered twice a day for the duration of the experimental phase, and has previously been shown to effectively reduce E2 in hypothalamic nuclear pellets (Bonsall et al., 1992). As such, a more likely explanation is that more free androgen was available to act on androgen receptors to facilitate sexual behaviors. Second, when considered with previous publications using similar methods, EB+TP facilitates appetitive sexual behaviors beyond EB-alone, as discussed above. Nonetheless, we cannot rule out the interpretation that TP+FAD releases inhibition in EB-alone treated OVX females, particularly given that FAD was administered systemically, and that we did not measure circulating E2, nor did we confirm that FAD effectively reduced neural estradiol in our animals.

A more plausible and parsimonious interpretation of these data, particularly when considered in the context of the data presented in Jones et al. (2017) is that TP given in combination with EB facilitates appetitive sexual behaviors, at least in part, through androgenic mechanisms (Cappelletti and Wallen, 2016). As discussed above, TP induces a reliable and moderate increase in appetitive behaviors in EB-treated females, which was not blocked by FAD administration. This is consistent with previous results indicating the importance of androgen receptor activation in female sexual behavior (Jones et al., 2010; Kudwa et al., 2010). Testosterone has been shown to require the presence of estradiol to exert its modulatory role on female sexual behavior (Sherwin and Gelfand, 1987; Buster et al., 2005), thus it is possible that EB administration 48 h before testing upregulates androgen receptors, thereby facilitating the ability of testosterone to act on androgen receptors in areas of sexual behavior as it does with progesterone (Rubin and Barfield, 1983). It is also interesting that EB+TP+FAD tended to enhance the expression of sexual solicitations beyond that of EB+TP, and the only brain region that revealed increased Fos-IR specifically in the EB+TP+FAD group was the VTA. The VTA, a core component of the mesocorticolimbic reward pathway, contains androgen receptors (Kritzer, 1997; Kritzer and Creutz, 2008) and therefore this is a key region of interest for future mechanistic studies.

Some earlier animal studies have shown the importance of AR in the facilitation of female sexual behavior. For example, Yahr and Gerling (1978) demonstrated that administration of 6-alpha-fluorotestosterone, a non-aromatizable androgen, could induce sexual receptivity in female rats comparable to that of TP. In addition, recent studies using selective androgen receptor modulators (SARM) have revealed an important role of ARs in female sexual behavior. Administration of a non-aromatizable SARM that does not interact with estrogen receptors, to OVX rats primed with sub-optimal levels of EB (2.0 µg) + progesterone (100 µg) increased both proceptive and receptive sexual behavior in sexually-experienced females (Kudwa et al., 2010). Moreover, TP given in combination with R-bicalutamide, an anti-androgen, reduced sexual preference of a female for an intact male compared to TP-alone (Jones et al., 2010). Together these data highlight the importance of ARs and contribute to a more mechanistic approach underlying testosterone's role in female sexual behavior.

Additionally, treatment with FAD appears to have increased the female's attractivity. EB+TP+FAD-treated females received more mounts and intromissions than EB-alone treated females, tended to receive more intromissions than EB+TP-treated females, and received more ejaculations than both the EB-alone and EB+TP treated females, all with correspondingly moderate effect sizes. We suspect that the behavior of the males was influenced by the appetitive behaviors and receptivity of their female partners, which is also reflected in the percentage of females that were mounted (i.e., about half the EB-treated females, and 73% of the EB+TP, and 91% of the EB+TP+FAD females). Pfaus and Pinel (1989) demonstrated that when a male is trained with a non-receptive female, he quickly learns that she is not receptive, and his rate of mounting decreases drastically over trials. In the present study, the male's mounts, intromissions and ejaculations on the final training day, occurring 2 weeks prior to testing, were normally distributed, and 100% of females in each group were mounted (see **Supplementary Table 1**). Therefore, the males' sub-par sexual behaviors toward females receiving EB-alone and EB+TP could be explained by the low appetitive and receptive behaviors displayed by these females, a behavioral pattern consistent with our previous reports of OVX Long-Evans rats treated acutely with EB-alone (Jones et al., 2013, 2017).

As a first step to investigating potential brain regions where TP may be exerting its effects to facilitate sexual motivation, Fos-IR was examined within mesocorticolimbic regions known to be involved in sexual motivation (Pfaus, 2009) following EB treatment and exposure to a male behind a screen. Fos-IR was investigated within the mPOA, MeA, IL, VMH, VTA, and NAc core and shell. TP administration to OVX EB-treated females induced Fos-IR in the mPOA, NAc core, IL and the vlVMH, whereas activation within the VTA occurred with the addition of FAD. These regions have a moderate to high density of ARs (Handa et al., 1986, 1987; Fernández-Guasti et al., 2000; Wu et al., 2009; Feng et al., 2010), making them potential candidate regions where TP may exert its effects.

The mPOA is a critical component in mediating female proceptive behaviors such as hops, darts and solicitations (Erskine, 1989b; Hoshina et al., 1994), and is important for the integration and interpretation of olfactory and auditory sensory cues (Hull et al., 1997). In the current study we found that compared to EB-alone, Fos-IR was expressed in more cells in females treated with EB+TP regardless of whether FAD was administered. These Fos-IR data parallel the behavioral data, namely the higher appetitive measures compared to EB-alone.

Activity in the mPOA is sensitive to changes in hormonal milieu, thus one possible mechanism is that TP is working in the female mPOA as it does in the male, to modulate the mPOA's neural responsiveness to olfactory cues (Pfaff and Pfaffmann, 1969). TP has also been shown to upregulate nitric oxide synthase, which increases levels of nitric oxide, thereby increasing dopamine release in the mPOA of male rats (Lorrain et al., 1996; Hull et al., 1997; Hull and Dominguez, 2006). This relationship has not been examined in the female brain, and we acknowledge that the mechanisms may be different between the sexes. Nonetheless, it is a likely candidate mechanism given that dopamine and the activation of its distinct receptors (D1 and D2) in the mPOA has been shown to mediate female sexual behavior (Matuszewich et al., 2000; Graham and Pfaus, 2010, 2012). Therefore, within the mPOA, TP may act through androgenic mechanisms. Future mechanistic studies are needed to determine the combined effects of estrogens and androgens on female sexual motivation within the female mPOA.

Upstream of the mPOA, the amygdala is important for integrating sensory information from the environment. The MeA itself is involved in female sexual motivation, via dopaminergic and progesterone signaling (Holder et al., 2015). The lack of difference in Fos-IR expression within the MeA between groups suggests that testosterone does not act within this region to facilitate appetitive sexual behaviors, and further suggest that all the females were detecting similar sensory input in response to male cues.

The vlVMH is well-known as a critical region for the expression of lordosis via estradiol signaling (Pfaff, 1968; Pfaff and Sakuma, 1979; Pfaff et al., 2000, 2011). In the current study, the vlVMH had significantly more Fos-IR nuclei in females given either EB+TP or EB+TP+FAD, when compared with EB alone. Consistent with this, EB+TP females displayed higher LR and LQ compared to EB-alone. There is evidence that certain androgens, such as DHT and 5α-androstane-3α,17β-diol, inhibit EB-induced lordosis in female rats (Baum and Vreeburg, 1976; Erskine, 1989a), and as such it is somewhat surprising that FAD led to a significant increase in lordosis measures beyond EB-alone, given that FAD is an aromatase inhibitor, which suggests that TP acted via an androgenic pathway. In summary, it is unclear through what mechanism within the vlVMH EB+TP+FAD might facilitate lordosis, although downstream midbrain mechanisms cannot be ruled out (Pfaff, 1980).

The dopaminergic output from the mPOA to the VTA is essential for sexual behavior (Brackett and Edwards, 1984). Females receiving EB+TP+FAD had significantly more Fos-IR in the VTA compared to females receiving EB alone. Downstream of the VTA, EB+TP+FAD, and EB+TP had significantly higher Fos-IR expression in the NAc core, although not in the shell, when compared to EB alone. The NAc has been implicated in the motivation to engage in sexual behavior, as well as in the rewarding properties of sexual behavior such as paced mating in the female rat (Jenkins and Becker, 2001, 2003; Guarraci et al., 2002, 2004). Specifically, the NAc shell has been shown to be involved in processing of rewarding stimuli, while the core is involved in motor function related to reward and reinforcement (Ito and Hayen, 2011). Infusion of the testosterone metabolite 3a-diol into the NAc shell selectively increased appetitive sexual behaviors (hops darts and ear wiggles) (Sánchez Montoya et al., 2010). Because in the current study the administration of TP to OVX EB-treated females upregulated Fos-IR in the NAc core but not shell, and that occurred regardless of whether FAD was also administered, it is likely that the effect of TP within the NAc is associated with the rewarding properties of sexual stimuli, or with the rewarding properties of TP itself (Nyby, 2008). It should be noted, however, that the dose of FAD used in this study was selected based on work showing that estradiol was reduced in hypothalamic and amygdaloid nuclear pellets in FAD-treated male rats compared to controls (Bonsall et al., 1992). Thus, because we did not measure aromatase activity in our female animals following FAD administration, we cannot be certain that the dose had the same level of effectiveness as reported by Bonsall et al. (1992).

## CONCLUSION

In conclusion, administration of FAD enhanced the facilitation of appetitive and consummatory sexual behaviors in OVX female rats treated with EB and TP, showing that aromatization of testosterone to estradiol is not required for TP-induced facilitation of sexual desire in our preclinical model. Moreover, TP-induced activation of Fos-IR expression in brain areas implicated in sexual motivation, behavior and reward suggests that TP may increase the sensitivity to male-related cues and may enhance the female's attractivity to the male. Future mechanistic studies should investigate whether the facilitation by TP can be blocked by androgen receptor inhibitors, and should measure circulating levels of estradiol, testosterone, and SHBG to better inform the mechanisms.

## ETHICS STATEMENT

All animal procedures were conducted in accordance with the standards established by the Canadian Council on Animal Care (CCAC) and approved by the Concordia University Animal Ethics Committee.

## AUTHOR CONTRIBUTIONS

SJ designed the experiments in consultation with JP, trained SR and JG on the methodological details of the experiments, performed the surgeries, oversaw all aspects of the experiments, ran the final statistical analyses, contributed to the intellectual content of the manuscript, and prepared the manuscript for publication. SR conducted the experiments, including preparation of solutions, injections, perfusions, histology, scored the behavior and performed the Fos counts, managed the data sets, conducted the analyses in consultation with SJ and JP, and wrote the first draft of the manuscript. JG assisted in all phases of data collection and in particular the brain staining and picture taking, provided intellectual contributions to the manuscript, and data interpretation. JP conceived and designed the experiments in collaboration with SJ, oversaw all aspects of the experiments, and provided intellectual contributions to the manuscript. All authors approved all the contents of the manuscript for publication.

## FUNDING

This study was supported by a project grant from the Canadian Institutes of Health Research to JP (GH 162264) and by an infrastructure grant from the Fonds de la Recherche Québec en Santé (to the Center for Studies in Behavioral Neurobiology at Concordia University).

## ACKNOWLEDGMENTS

The authors thank Novartis Pharma for the free sample of fadrozole hydrochloride. The authors would like to acknowledge the helpful and constructive feedback provided by the reviewers during the review process.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2019.00798/full#supplementary-material




**Conflict of Interest Statement:** JP is a member of the scientific advisory boards for AMAG Pharmaceuticals, Emotional Brain, LLB., IVIX Corp, and Palatin Technologies, Inc.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Jones, Rosenbaum, Gardner Gregory and Pfaus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Verbal Interaction Social Threat Task: A New Paradigm Investigating the Effects of Social Rejection in Men and Women

Sanne Tops<sup>1</sup>, Ute Habel<sup>1,2</sup>, Ted Abel<sup>3</sup>, Birgit Derntl<sup>4</sup> and Sina Radke<sup>1,2</sup>\*

<sup>1</sup> Department of Psychiatry, Psychotherapy and Psychosomatics, Faculty of Medicine, RWTH Aachen University, Aachen, Germany, <sup>2</sup> Jülich Aachen Research Alliance – BRAIN Institute I: Brain Structure–Function Relationships: Decoding the Human Brain at Systemic Levels, Research Center Jülich GmbH and RWTH Aachen University, Aachen, Germany, <sup>3</sup> Department of Molecular Physiology and Biophysics, Iowa Neuroscience Institute, Carver College of Medicine, The University of Iowa, Iowa City, IA, United States, <sup>4</sup> Department of Psychiatry and Psychotherapy, Medical School, University of Tübingen, Tübingen, Germany

#### Edited by:

Annie Duchesne, University of Northern British Columbia, Canada

#### Reviewed by:

Gábor B. Makara, Hungarian Academy of Sciences (MTA), Hungary Katrin Preckel, Max Planck Institute for Human Cognitive and Brain Sciences, Germany

> \*Correspondence: Sina Radke sradke@ukaachen.de

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 08 October 2018 Accepted: 25 July 2019 Published: 07 August 2019

#### Citation:

Tops S, Habel U, Abel T, Derntl B and Radke S (2019) The Verbal Interaction Social Threat Task: A New Paradigm Investigating the Effects of Social Rejection in Men and Women. Front. Neurosci. 13:830. doi: 10.3389/fnins.2019.00830

In recent years, digital communication and social media have taken an indispensable role in human society. Social interactions are no longer bound to real-life encounters, but more often happen from behind a screen. Mimicking an online communication platform, we developed a new, fMRI-compatible, social threat paradigm to investigate sex differences in reactions to social rejection. During the Verbal Interaction Social Threat Task (VISTTA), participants initiate 30 short conversations by selecting one of four predefined opening sentences. Two computerized interlocutors respond to the opening sentence mostly with negative comments and rejections toward the participant, which should induce social-evaluative threat. Physiological and subjective responses were measured before, during, and after the VISTTA in 61 (29 male and 32 female) first-year students who received either mostly negative (n = 31; threat group) or neutral comments (n = 30; control group). Two-level behavioral validation included social threat-induced mood changes in participants, and interlocutor evaluation. The latter consisted of multiple variables such as "willingness to cooperate" after every conversation, an overall fairness evaluation of interlocutors, and evaluations per reaction indicating how positive or negative it was received. We acquired additional physiological measures including cortisol assays via saliva samples, heart rate, and blood pressure. Confirming our hypotheses, peer rejection and exclusion during the VISTTA led to less willingness to cooperate and lower fairness evaluation of interlocutors. It also induced feelings of anger and surprise and lower happiness in the social-threat group. Women showed overall higher emotion ratings compared to men. Contrary to our a priori hypothesis, the VISTTA did not induce cortisol and heart rate increases. However, the stable cortisol response in women in the threat group does not follow the circadian decline and might reflect an endocrinological response. The decline in cortisol response in men in both the threat and control group could indicate faster habituation to the VISTTA. Taken together, these findings indicate effects of social-evaluative threat on a behavioral level, and more moderate effects on the emotional and physiological level. Sex differences in affective and cortisol responses may indicate that women are more susceptible to the social-evaluative threat than men. With a realistic implementation of verbal, interactive, and social components, the VISTTA is designed as an fMRI paradigm that can be applied to elucidate the neural representation of social-evaluative threat.

Keywords: social-evaluative threat, rejection, cortisol, social stress, verbal communication, VISTTA

## INTRODUCTION

With the increasing influence of social media and online communication platforms, digital communication has taken a vital role in current society. With this development, social interaction more often happens from behind a screen, rather than in real life. Interactions that involve rejection, exclusion, and negative evaluation can lead to lower self-esteem and acceptance (Dickerson, 2008). This can also lead to a set of physiological responses including activation of one of the main biological responses to stress: the hypothalamus–pituitary–adrenal gland (HPA) axis (Mason, 1968), leading to an increased production of cortisol by the adrenal glands (Lupien et al., 2009). In the initial definition of stress by Selye (1950), "mere emotional stimuli" were considered negligible in comparison to physical variables such as physical trauma, heat, and fasting. Emotional stimuli, that is, conditions involving novelty, uncertainty, unpredictability, and anticipation of something previously experienced as unpleasant, however, may challenge one's capacity to cope with the situation, which will be experienced as a burden and distress.

As proposed by Mason (1968), emotional stimuli such as social evaluation and exclusion can also trigger the stress response. This idea has been confirmed by more recent studies (Williamson et al., 2018) investigating the effect of social exclusion on cardiovascular and affective responses in response to a social evaluative stressor. Excluded participants showed increased cardiovascular and anxiety responses to the stressor. Included participants reported similar increases in anxiety, but cardiovascular responses did not change. Social evaluation functions as a stressor through the salience of negative judgment, and the threat that it poses to maintaining self-esteem and social status. Uncontrollable and social-evaluative elements of a psychological stressor have been shown to increase cortisol and blood pressure (Dickerson and Kemeny, 2004). The threat is specific but common; several studies have indicated that cortisol rises after social evaluation in various settings such as public speaking, paced auditory serial addition test, and mental arithmetic under time pressure (Kirschbaum et al., 1993; Bibbey et al., 2015; Smith and Jordan, 2015; Dahm et al., 2017).

The stress response is not universal. Differences between men and women responding to various stressors have been well documented. Multiple underlying factors have been identified, often divided into biological and social factors. The menstrual cycle and oral contraceptives (OCs) have been found to affect the stress response. During the follicular phase of the menstrual cycle, the cortisol response is attenuated compared to the luteal phase (Villada et al., 2017), possibly explained by higher levels of progesterone and estrogen in the luteal phase (Gordon and Girdler, 2014). OC use has a dampening effect on the stress response. A meta-analysis based on 34 studies by Liu et al. (2017) reported lower salivary cortisol in women on OC compared to women not on OC, both at baseline and at peak following the Trier Social Stress Test (TSST), but not during the recovery phase. When comparing men with women on OC and men with women not on OC, they found no differences in salivary cortisol at baseline between the sexes, but reported higher cortisol levels in men during peak and recovery compared to women on OC. In addition to biological factors, gender and socialization seem to affect the stress response in men and women differently (Pruessner, 2018). When being subjected to psychosocial stress, social support from a partner dampens the cortisol response in men, whereas women respond more strongly with their partner around (Kirschbaum et al., 1995). Comparing an achievement stressor with a social rejection stressor, Stroud et al. (2002) showed increases in cortisol in men for the former and in women for the latter task. This suggests that men are more sensitive to competitive and achievement aspects of a situation, and that women are more affected by social components that can affect their social standing within a group. They did not differentiate between sex and gender and only included women who were not on OC. 
There is, however, empirical evidence showing increased cortisol and testosterone levels in women in anticipation of a rugby match, whereby postgame levels of these hormones were higher than pregame levels (Bateup et al., 2002). The testosterone rise was associated with team bonding and aggressiveness and the cortisol change was positively related to the level of challenge of the opponent. These findings provide evidence that not only men, but women too are sensitive to competitive aspects of a situation and that it is reflected in their endocrinological response. A study applying an adjusted TSST, whereby the audience during the 5 min speech was behind a one-way mirror so participants could not see them, yielded sex-specific results. Men showed comparable cortisol levels, whereas women showed no response when they could not see the audience (Andrews et al., 2007; Wadiwalla et al., 2010). This sex-based difference paved the way for follow-ups investigating the influence of gender identity on the stress response. Sex refers to physiological differences in the gonads, sex hormones, external genitalia, and internal reproductive organs. Gender, on the other hand, refers to social, environmental, cultural, and behavioral factors that affect someone's self-identity (Clayton and Tannenbaum, 2016). To differentiate between the effects of sex and gender identity, four groups were subjected to the adjusted TSST: male gender identity with male sex, female gender with female sex, male gender with female sex, and female gender with male sex. The cis-gendered groups replicated previous results. However, subjects with female gender identity combined with male sex did not respond to the Task, whereas subjects with male gender identity and female sex responded like cis-gendered males with an increased cortisol response (Pruessner, 2018). Future studies examining the effect of OC on women with female vs. male gender identity could give more insight into which factor dominates the stress response. These results emphasize the importance of gender identity in explaining differences between men and women.

Inducing social-evaluative threat commonly involves evaluation and judgment by others. Numerous stress paradigms comprise both social evaluation and performance, such as the TSST, Montreal Imaging Stress Task (MIST), and ScanStress (Kirschbaum et al., 1993; Dedovic et al., 2005; Dahm et al., 2017, respectively). Evaluation through rejection/exclusion has often been examined using the Cyberball paradigm (Williams et al., 2000). A modified version of the Cyberball Task, with exclusion based on negative performance evaluation, proved to increase subjective stress (Wagels et al., 2017). Due to the mild nature of exclusion in the original Cyberball, however, cortisol increases are not consistently found (Zöller et al., 2010; Seidel et al., 2013; Gaffey and Wirth, 2014; Radke et al., 2018). Similarly, the Yale Interpersonal Stressor (YIPS) is a well-established way to induce social-evaluative threat, whereby participants are excluded during the course of a real-life conversation with two confederates (Stroud et al., 2002; Zwolinski, 2008). While Stroud et al. (2000, 2002) reported a cortisol increase following the YIPS, others also relying on the YIPS failed to elicit a cortisol response (Linnen et al., 2012). Following recent developments in computer-mediated communication, a novel, exclusion-based paradigm, "Ostracism Online," mimics a social media environment to induce social exclusion (Wolf et al., 2014). Here, participants can receive "likes" from others on a short introduction they wrote about themselves; in the exclusion condition, however, they receive only one "like" from 11 other group members (Wolf et al., 2014). Ostracism Online has been validated using the Need-Threat Scale (van Beest and Williams, 2006) and has been reported to induce increased self-ratings in the extent to which participants felt bad, unfriendly, angry, and sad following exclusion compared to inclusion conditions. Mimicking more realistic online communication, Donate et al. (2017) developed a chatroom Task whereby participants can ask questions to and answer questions from two confederates in a yes-or-no format. Participants in the inclusion condition are asked a question in 33% of the rounds (equal to the confederates), compared to only 15% in the exclusion condition. The results revealed that exclusion led to increased anger and higher levels of self-pain feelings, namely feeling tortured and hurt. It is important to note that both paradigms have been validated using self-ratings only.

Overall, responses to social-evaluative threat can be assessed on various levels. Performance oriented paradigms, like the MIST, ScanStress, and TSST, confirmed their validity with physiological measures using cortisol assays. Exclusion-related paradigms such as Cyberball and Ostracism Online mainly focused on subjective ratings of stress and mood to indicate an emotional effect.

There is, however, no fMRI-compatible paradigm available yet that combines social-evaluative threat with social media or online communication. We have therefore designed the "Verbal Interaction Social Threat Task" (VISTTA), suitable for investigating the direct neural representation of social-evaluative situations and responses. This study is the first to investigate the possibilities and implications of the VISTTA, aiming to validate it as a social threat induction method. Considering the scope of the current study, additional research will have to be done to have a broader understanding of the domains affected by the VISTTA. Verbal communication is central to this new paradigm that bears a strong resemblance to online chatting. The increasing influence of social media and online communication platforms comes with an increase in the number of cases of cyberbullying. The Cyberbullying Research Center in the United States reported that on average 28% of all middle and high school students, who participated in different studies between 2007 and 2016, have been the victim of cyberbullying (Patchin, 2016). The VISTTA mimics an online communication environment with two interlocutors. It has a realistic implementation of verbal, interactive, and social components and can be deployed to gain valuable insights into the above-described social interactions. Participants are told they will do a cooperation Task with the interlocutors at the end of the VISTTA and are asked after every conversation to rate how much they would like to cooperate with them. We expect the VISTTA to elicit a behavioral, an emotional, and a physiological response. Lower subjective ratings with regard to the cooperation Task and a more negative mood indicate a behavioral and emotional effect, respectively. We also expect to find elevated cortisol levels and increased heart rate over the course of the paradigm. Based on the above-described sex differences, we hypothesized that effects would be larger in females than in males.

## MATERIALS AND METHODS

## Ethics Statement

The local ethics committee at the Medical Faculty of RWTH Aachen University approved the current study. The experimental protocol was carried out in accordance with the provisions of the World Medical Association Declaration of Helsinki.

## Participants

Sixty-one healthy first-year students (29 males, M<sub>age</sub> = 19.9, SD = 1.6; 32 females, M<sub>age</sub> = 19.85, SD = 1.2; sex was defined by self-report), who were all fluent in German, participated in this experiment. When discussing males/females, we refer to the sex that is reflected physiologically by the gonads, sex hormones, external genitalia, and internal reproductive organs (Clayton and Tannenbaum, 2016). To ensure that all participants were in a new social environment without an established social network, we only included students who had recently moved to Aachen and did not switch studies. Further inclusion criteria were: age of 18–30 years, right-handedness, no metabolic illnesses (hypertension; lung-, brain-, and kidney diseases; diabetes mellitus; and drug dependence), no neurological or psychiatric disorders (determined with Structured Clinical Interview for DSM-IV; Fydrich et al., 1997), no medication use, and no pregnancy in women. Only women who took hormonal contraceptives via the pill were included. We made this choice to control for hormonal fluctuations and interpersonal differences in the menstrual cycle (Liu et al., 2017). Participants were told to refrain from alcohol for 24 h, from eating for 2 h, and from coffee for 3 h prior to the experiment. All experimental sessions took approximately 2 h and took place between 1.30 pm and 6.30 pm to control for the circadian rhythm in cortisol. All participants gave written informed consent and received €20 as monetary compensation.

## Paradigm – Social-Evaluative Threat

The VISTTA is partially based on the YIPS (Stroud et al., 2000). The YIPS uses real confederates that are trained to exclude the participant from the conversation, whereas the VISTTA is a variation that employs a digital communication platform. The Task simulates an online communication environment in which the participant is led to believe he/she is communicating with two peers, one male (Daniel) and one female (Julia).

During 30 short conversations, which the participant initiates by selecting one of the four presented opening sentences, the two computerized interlocutors respond mostly with negative comments and rejections toward the participant. This approach should induce feelings of social rejection, exclusion, and social stress in the participants. The participants had 40 s to select an opening sentence by pressing button 1, 2, 3, or 4 on the keyboard. If they did not select an option, the first sentence was automatically selected. We created unique reactions per opening sentence, such that they did not contradict each other over the course of the experiment. After selecting an opening sentence, a chat box was shown in which ". . . is typing" appeared while the interlocutors were supposedly typing their response. Depending on the length of the response, the duration of ". . . is typing" varied, with longer presentation times for longer responses (ranging from 4 to 8 s) (see **Figure 1** for an overview of the VISTTA). In order to create a within-subject control condition and to make it more credible that participants were chatting with two actual people, they also received neutral to positive reactions in 10 out of 30 conversations. The reactions of the two interlocutors were either both positive or both negative, so that acceptance and rejection would take place in the same set of topics for all participants. Two examples are presented in **Figure 2**. This experimental group, from here on referred to as "threat group," was compared to a control group that only received neutral/positive reactions. The 20 topics with dismissive comments were changed so that the interlocutors replied with agreement and consent. A pilot study among Ph.D. and master's students (n = 15 for threat condition, n = 11 for control condition) showed that all negative responses were rated significantly more negative than all positive responses (p ≤ 0.009), except one that showed only a trend toward significance (p = 0.052) (see **Supplementary Table 1** for all opening sentences and corresponding reactions). We created two pseudorandom orders of topics that were randomly assigned to the participants to rule out any possible confounding effects of order. Each block of 10 conversations contained the same set of topics in both versions. All responses were created to match the opening sentences. The experiment started with a practice round to familiarize the participants with the structure of the paradigm.
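
The trial logic described above can be illustrated with a minimal Python sketch. It is not the software used in the study. The function names, the representation of topics as integers, and in particular the linear mapping from response length to the duration of the typing indicator are assumptions made for illustration only; the text states only that longer responses were shown with longer durations within the 4 to 8 s range.

```python
from dataclasses import dataclass

N_CONVERSATIONS = 30                     # conversations per participant (from the text)
TYPING_MIN_S, TYPING_MAX_S = 4.0, 8.0    # range of the ". . . is typing" display


@dataclass
class Conversation:
    topic: int      # hypothetical integer topic id
    valence: str    # "negative" or "positive" (neutral-to-positive)


def build_schedule(group, accepting_topics, order):
    """Assign a reaction valence to every topic in a given topic order.

    In the threat group, topics in `accepting_topics` (10 of 30 in the study)
    receive neutral-to-positive reactions and the rest negative ones.
    In the control group, every topic receives a neutral/positive reaction.
    """
    if sorted(order) != list(range(N_CONVERSATIONS)):
        raise ValueError("order must contain each of the 30 topics exactly once")
    schedule = []
    for topic in order:
        positive = group == "control" or topic in accepting_topics
        schedule.append(Conversation(topic, "positive" if positive else "negative"))
    return schedule


def chosen_sentence(key_pressed):
    """Return the selected opening sentence (1-4).

    `None` stands for no key press within the 40 s window, in which case
    the first sentence is selected automatically.
    """
    return key_pressed if key_pressed in (1, 2, 3, 4) else 1


def typing_duration(response, shortest, longest):
    """Map response length onto 4-8 s (linear mapping is an assumption).

    `shortest` and `longest` are hypothetical character counts of the
    shortest and longest responses in the stimulus set.
    """
    if longest <= shortest:
        return TYPING_MIN_S
    frac = (len(response) - shortest) / (longest - shortest)
    frac = min(1.0, max(0.0, frac))      # clamp to the documented range
    return TYPING_MIN_S + frac * (TYPING_MAX_S - TYPING_MIN_S)
```

With `accepting_topics` holding 10 topic ids, `build_schedule("threat", ...)` yields 20 negative and 10 positive conversations, whereas `build_schedule("control", ...)` yields 30 positive ones, matching the group definitions given above.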

## Procedure

As part of the cover story, participants were told that the experiment was about initiating social interactions between students via online communication and that the two interlocutors were each in a separate room nearby. They were also led to believe that they had to do a cooperation task with the two others after the VISTTA and that the size of their monetary reward depended on how well they cooperated. It was therefore preferable for them to maintain a good bond with the interlocutors. To reinforce the cover story, participants were told to be punctual, because the experiment was conducted together with two other students. They could not meet the "others," as the stated goal of the study was to investigate online communication. During the experiment, the investigator left the participant three times to check whether the "others" were ready to start and whether everything was going as planned: first before the start of the VISTTA (T2), then in the short break after 20 conversations (T3), and finally directly after finishing the VISTTA (T4). Mood and physiological measures were also acquired at these time points. During the debriefing at the end of the experiment, all participants were asked about their experience with regard to the confederates, and whether they believed they were communicating with two real people. Five participants reported they did not believe the two interlocutors were real. Exploratory analyses in which non-believers were excluded yielded a similar pattern of results. Final analyses were therefore conducted on the whole sample. As the study was conducted in Aachen, Germany, all opening sentences and responses were in German (see **Supplementary Table 1** for the English translation).

## Social-Evaluative Threat Measures

## Trait Measures

We acquired a set of personality questionnaires covering stress coping mechanisms [Coping Inventory for Stressful Situations (CISS); Endler and Parker, 1990], anxiety {State-Trait Anxiety Inventory [STAI(T)]; Spielberger et al., 1983, Liebowitz Social Anxiety Scale; Liebowitz, 1987}, primary appraisal secondary appraisal (PASA; Gaab, 2009), rejection sensitivity [Rejection Sensitivity Questionnaire (RSQ); Berenson et al., 2009], social network questionnaire (Linden et al., 2007), stress processing (Stressverarbeitungsfragebogen; Janke and Erdmann, 1997), and intelligence [Wortschatztest (WST); Schmidt and Metzler, 1992].

## Subjective Ratings

The subjective experience of social-evaluative threat was assessed on three distinct levels. First, at the end of every conversation, participants were asked to rate the extent to which they wanted to cooperate with the two interlocutors on a scale from 1 to 5, with 1 being "not at all" and 5 "very much." Second, after finishing the VISTTA, participants answered open questions about how they experienced the interaction and how they felt about not having met their interlocutors. They also rated on a scale from 1 to 8 how fair they thought the "others" were (1 being "not fair at all," 8 being "very fair"). Third, in an additional reaction rating task, participants were presented with all reactions (2 reactions per topic, total of 60 reactions) they had received during the VISTTA. For each reaction separately, they indicated to what extent they experienced that reaction as positive or negative with regard to their opening sentence (on a scale from 1 to 5, with 1 being "very negative" and 5 "very positive").

## Emotional and Physiological Responses

Mood was measured repeatedly using the Emotional Self-Rating (ESR; Weiss et al., 1999) and the Positive and Negative Affect Scale (PANAS) at T2, T3, and T4 (Watson et al., 1988).

Salivary cortisol levels, heart rate, and blood pressure were repeatedly measured throughout the experiment (see **Figure 1** for overview). Saliva samples for cortisol measurement were taken using SaliCaps (IBL International, Hamburg, Germany) at the start of the VISTTA (T2), in the short break after 20 conversations (T3), and directly after finishing the VISTTA (T4). Sampling time varied among participants, but was not timed. The samples were stored at −30 °C until they were analyzed by the Dresden LabService (Germany). Samples were analyzed in duplicate and the average was used in subsequent analyses. Cortisol concentrations were measured using Luminescence Immunoassays with high sensitivity (Immuno-Biological Laboratories GmbH, Hamburg, Germany), with intra-assay and inter-assay coefficients of variation of <8%. Heart rate and blood pressure were acquired via an automatic blood pressure monitor with arm cuff (Intellisense, OMRON, Germany) at six time points (minutes after onset) throughout the experiment (T1 = 15, T2 = 30, T3 = 50, T4 = 60, T5 = 75, T6 = 120).

## Statistical Analyses

All analyses were performed using SPSS 25 (IBM Corp., Armonk, NY, United States). The alpha level was set to 0.05 and Greenhouse–Geisser correction was applied when necessary. Post hoc pairwise comparisons were Bonferroni corrected.

## Trait Measures

All scores except PASA and Social Network were normally distributed. PASA and Social Network were logarithmically transformed to meet the criterion of normal distribution. Separate 2 × 2 ANOVAs were conducted for each of the personality questionnaires, with Group (threat, control) and Sex (male, female) as between-subject factors.

## Subjective Ratings

Ratings on the willingness to cooperate were averaged for positive/neutral and negative reactions separately. Subsequently, a 2 × 2 × 2 ANOVA was conducted, with Valence (positive or negative reactions) as within-subject factor and Group and Sex as between-subjects factors. A similar analysis was used for the fairness rating, without Valence as a within-subject factor (i.e., a 2 × 2 ANOVA).

Ratings for the individual reactions all deviated from normal distribution (Kolmogorov–Smirnov was significant). Moreover, the subset of reactions presented to the participant depended on the choice of opening sentence, which led to a unique combination of reactions for every participant. Hence, individual reactions could not be directly compared between conversations. The ratings per reaction (one from each interlocutor) were combined into a mean score indicating the overall positivity/negativity of the reaction-pair per conversation. For these reasons, these data were analyzed using generalized estimating equations (GEEs). This mean rating was entered as dependent variable in the full model of the GEE analysis with Topic Valence (two levels: positive, negative) as within-subject factor and Group (threat, control) and Sex (male, female) as between-subject factors. Subjects were modeled as random effects and all factors as fixed effects.
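
The data preparation step described above, in which the two ratings per conversation are combined into one mean score, could look roughly as follows. This is a minimal sketch: the tuple layout and the name `pair_means` are hypothetical, and the GEE model itself (fitted in SPSS by the authors) is not reproduced here.

```python
from statistics import mean


def pair_means(ratings):
    """Combine the two reaction ratings of each conversation into one mean score.

    ratings: iterable of (conversation_id, rating_interlocutor_1, rating_interlocutor_2),
    each rating on the 1-5 scale (1 = very negative, 5 = very positive).
    The record layout is an assumption made for this illustration.
    """
    return {conv: mean([r1, r2]) for conv, r1, r2 in ratings}


# Hypothetical ratings for three conversations of one participant
example = [(1, 2, 1), (2, 4, 5), (3, 3, 3)]
scores = pair_means(example)
```

The resulting per-conversation scores are what would enter the model as the dependent variable, one row per participant and conversation.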

## Emotional Responses

Repeated measures ANOVAs with post hoc pairwise comparisons were conducted for the PANAS subscales (positive and negative mood), with Time (T2, T3, T4) as within-subjects factor and Group (threat, control) and Sex (male, female) as between-subjects factors.

A similar analysis as for the reaction ratings was performed for the ESR Scales (anger, disgust, happiness, fear, sadness, surprise) as they deviated from normal distribution. The GEE analysis was designed with Emotion (six levels: anger, disgust, happiness, fear, sadness, surprise) and Time (T2, T3, T4) as within-subject factors and Group (threat, control) and Sex (male, female) as between-subject factors. Subjects were modeled as random effects, and all factors were included as fixed effects. To test for differential effects of the two VISTTA versions, only interactions involving the factor Group were entered in the model (as fixed effects).

## Physiological Responses


Cortisol was acquired three times and heart rate and blood pressure six times. Cortisol values were not normally distributed and hence logarithmically transformed. All analyses were computed based on the transformed data. Repeated measures ANOVAs were conducted for cortisol, heart rate, systolic and diastolic blood pressure, with Time as within-subjects factor and Group and Sex as between-subjects factors.

## RESULTS

## Trait Measures

There were no differences between groups or sexes on the personality questionnaires after correcting for multiple comparisons. Before correction, scores on trait anxiety (STAI-T; p = 0.041) and task-oriented coping (CISS\_Task; p = 0.013) were significantly higher for men than for women, whereas scores on avoidance-oriented coping (CISS\_avoidance; p = 0.009) and social anxiety (Liebowitz\_anxiety; p = 0.014) were higher for women than for men. None of these differences, however, survived the corrected alpha level of 0.0029. All other comparisons were not significant. The descriptive statistics of all questionnaires are included in **Supplementary Table 2**.
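
To make the correction explicit, the following sketch checks the four uncorrected p-values from this paragraph against a Bonferroni-style threshold. The number of tests (17) is an assumption inferred from 0.05 / 0.0029 and is not stated in the text; the dictionary keys simply reuse the questionnaire labels given above.

```python
ALPHA = 0.05
N_TESTS = 17  # assumption: inferred from 0.05 / 0.0029, not reported in the source

# Bonferroni correction divides the alpha level by the number of tests
corrected_alpha = ALPHA / N_TESTS

# Uncorrected p-values for the sex differences reported in the text
uncorrected_p = {
    "STAI-T": 0.041,
    "CISS_Task": 0.013,
    "CISS_avoidance": 0.009,
    "Liebowitz_anxiety": 0.014,
}

# Effects that would remain significant after correction
survivors = [name for name, p in uncorrected_p.items() if p < corrected_alpha]
```

Under this assumption the threshold rounds to 0.0029 and `survivors` is empty, in line with the statement that none of the differences survived correction.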

## Subjective Ratings

### Willingness to Cooperate and Perceived Fairness

As expected, participants in the threat group were less willing to cooperate with the interlocutors than participants who received only neutral/positive reactions during the task. This was shown by a main effect of Group [F(1,57) = 37.254, p ≤ 0.001, $\eta_p^2$ = 0.395], whereby the overall "willingness to cooperate" was lower in the threat group (M = 3.21, SD = 0.53) than in the control group (M = 3.94, SD = 0.37) (p ≤ 0.001). We also found a main effect of Valence [F(1,57) = 49.033, p ≤ 0.001, $\eta_p^2$ = 0.462], showing higher "willingness to cooperate" after positive (M = 3.88, SD = 0.55) than after negative (M = 3.25, SD = 0.87) reactions. A Valence × Group interaction [F(1,57) = 35.082, p ≤ 0.001, $\eta_p^2$ = 0.381] revealed a group difference for negative reactions, with lower cooperation ratings in the threat group compared to the control group (p ≤ 0.001). No group difference was found for positive reactions (p = 0.453). "Willingness to cooperate" also differed significantly within subjects in the threat group, with higher ratings after positive comments than after negative comments (p ≤ 0.001) (**Figure 3A**). No Valence effect was present in the control group (p = 0.453), as all reactions were neutral/positive. "Willingness to cooperate" did not differ between sexes, regardless of valence (p = 0.438) (see **Table 1** for means per group and emotion).
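
The partial eta squared values in this paragraph can be recovered from the reported F statistics and degrees of freedom with the standard identity: partial eta squared equals F times the effect degrees of freedom, divided by that product plus the error degrees of freedom. The sketch below applies it to the three effects above; the function name is chosen for illustration only.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Effects reported for "willingness to cooperate", all with F(1,57)
group_effect = partial_eta_squared(37.254, 1, 57)
valence_effect = partial_eta_squared(49.033, 1, 57)
interaction_effect = partial_eta_squared(35.082, 1, 57)
```

Rounded to three decimals, these reproduce the reported effect sizes of 0.395, 0.462, and 0.381.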

Interlocutors in the threat group (M = 4.22, SD = 1.32) were rated significantly less fair than interlocutors in the control group (M = 7.43, SD = 1.14) [F(156.34,1.56) = 1319.08, p ≤ 0.001, $\eta_p^2$ = 0.637]. There was no main effect of Sex on the willingness to cooperate and no significant interaction including Sex emerged (all p ≥ 0.579).

## Comment Ratings

The GEE analysis for the comment ratings showed a main effect of Group [Wald $\chi^2$(1) = 87.0, p ≤ 0.001], whereby reactions were rated lower (more negative) in the threat group (M = 2.77, SD = 1.19) than in the control group (M = 3.80, SD = 0.77). We also found a main effect of Topic Valence [Wald $\chi^2$(1) = 221.5, p ≤ 0.001], with lower ratings for negative comments (M = 2.94, SD = 1.14) compared to neutral/positive comments (M = 3.95, SD = 0.75). Again, there was no main effect of Sex (p = 0.931). Two interactions were significant, i.e., Group × Sex [Wald $\chi^2$(1) = 47.0, p = 0.031] and Group × Topic Valence [Wald $\chi^2$(1) = 137.9, p ≤ 0.001]. No other interactions were significant (p ≥ 0.221). Post hoc analyses for the Group × Sex interaction showed that both men and women in the threat group rated the comments overall as more negative than the participants in the control group (p ≤ 0.001). Within the threat group, men tended toward lower ratings (more negative) than women (p = 0.093). There was no difference in ratings between men and women in the control group (p = 0.152). Decomposing the Group × Topic Valence interaction showed that reactions in the 20 negative topics in the threat group were rated more negative than neutral/positive reactions to the same topics in the control group (p ≤ 0.001). The reactions to the 10 neutral/positive topics that were the same for both groups were rated equally (p = 0.905) (**Figure 3B**).

## Emotional Responses

### Positive and Negative Mood

The VISTTA led to a significant decrease in positive mood over time, that is, there was a main effect of Time for positive mood [F(1.793,98.613) = 14.651, p ≤ 0.001, $\eta_p^2$ = 0.210]. Pairwise comparisons showed a general decrease between T2−T3 and T2−T4 (p ≤ 0.001). We did not find a main effect of Time for negative mood (p = 0.545). A Time × Group interaction did occur for negative mood [F(1.558,98.613) = 6.012, p = 0.007, $\eta_p^2$ = 0.099], but not for positive mood (p = 0.677). Post hoc analyses showed that negative mood decreased only in the control group between T2−T3, that is, from the start of the task until the break. Negative mood did not change over time in the threat group (p ≥ 0.272). No other effects or interactions were significant (p ≥ 0.197).

### Emotional Self-Rating

The GEE analysis for the ESR revealed significant main effects of Emotion [Wald $\chi^2$(5) = 532.1, p ≤ 0.001], Sex [Wald $\chi^2$(1) = 4.63, p = 0.031], and Time [Wald $\chi^2$(2) = 7.66, p = 0.022].

There were significant interactions of Group × Time [Wald $\chi^2$(2) = 6.8, p = 0.033] and Group × Emotion [Wald $\chi^2$(5) = 15.98, p = 0.007]. In addition, there were significant three-way interactions of Group × Emotion × Time [Wald $\chi^2$(17) = 43.4, p ≤ 0.001] and Group × Sex × Emotion [Wald $\chi^2$(10) = 19.01, p = 0.040].
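
For readers who want to relate the Wald chi-square statistics to their p-values, the sketch below computes the upper-tail probability of a chi-square distribution using closed forms that exist for 1 and 2 degrees of freedom. It uses only the Python standard library; the function name `chi2_upper_tail` is illustrative, and other degrees of freedom would require a statistics package.

```python
import math


def chi2_upper_tail(statistic: float, df: int) -> float:
    """Upper-tail p-value of a chi-square statistic for df = 1 or df = 2."""
    if df == 1:
        # A chi-square(1) variable is a squared standard normal variable
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2
        return math.exp(-statistic / 2.0)
    raise ValueError("closed form implemented only for df = 1 or df = 2")


# Statistics reported for the ESR analysis
p_sex = chi2_upper_tail(4.63, 1)
p_time = chi2_upper_tail(7.66, 2)
p_group_time = chi2_upper_tail(6.8, 2)
```

Rounded to three decimals, these give 0.031, 0.022, and 0.033, matching the p-values reported for Sex, Time, and Group × Time.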

The main effect of Emotion was due to significantly higher ratings for happiness (M = 2.97, SD = 0.95) than for all other emotions, followed by surprise (M = 2.17, SD = 1.02), which also differed significantly from all other emotions. In addition, anger (M = 1.20, SD = 0.56) differed significantly from disgust (M = 1.02, SD = 0.13). The main effect of Sex was due to higher emotional ratings in females (M = 1.62, SD = 1.02) than in males (M = 1.52, SD = 0.85) (see **Table 2** for means per group and emotion). The main effect of Time was due to overall higher ratings, that is, more intense emotions, at T3 (M = 1.58, SD = 0.95) than at T4 (M = 1.52, SD = 0.93).

group. (B) Negative reactions in the threat version of the VISTTA were rated significantly lower than reactions to the same topics in the control version. Also within-subject they were rated lower than the positive reactions in 10 topics. (C) Cortisol stayed stable in women in the threat group, but decreased in the control group. Men showed decreasing cortisol levels in both groups. For illustrative purposes only significant differences between women in both groups and the decay in the women control group are marked with an asterisk. (D) Anger increased from T2 to T3 in the threat group, compared to unchanged anger scores in the control group. (E) The threat version led to an increased score of surprise, whereas this score stayed stable in the control group. (F) Participants in the threat group reported decreased levels of happiness. The control version did not elicit such a decrease. Raw values were used to create the graphs, although some analyses used transformed scores. Error bars represent standard deviations. Asterisks indicate significant differences with p < 0.05.

Decomposing the two significant two-way interactions revealed higher ratings for anger (p ≤ 0.001) and surprise (p = 0.012) in the threat group than in the control group, along with overall higher ratings in the threat group than in the control group at T3 (p = 0.015), that is, after the first block of the VISTTA. These effects need to be viewed within the context of the Group × Emotion × Time interaction: For anger and surprise, ratings differed between threat and control only at T3 and T4 (p ≤ 0.002), not at T2 (p ≥ 0.068) (before the VISTTA). Moreover, ratings for happiness differed at T4, with lower ratings in the threat group than in the control group (p = 0.0026). Crucially, temporal changes of emotional ratings were limited to the threat group: Here, ratings for anger increased from T2 to T3 (p ≤ 0.001), and from T2 to T4 (p = 0.005). Similarly, ratings for surprise increased from T2 to T3 (p = 0.022). Analogously, happiness decreased from T2 to T4 (p ≤ 0.001), and T3 to T4 (p = 0.030), while no such changes over time were evident in the control group (p ≥ 0.0164) (**Figures 3D–F**).

The Group × Sex × Emotion interaction was due to sex-specific responses: In males, higher ratings in the threat than in the control group were evident for anger and surprise. In females, differences in emotional ratings between threat and control emerged for anger and happiness, with higher and lower values in the threat group, respectively. Comparing males and females directly indicated differences in the rating of happiness only for the control group, that is, females rated themselves as happier than males.

## Physiological Responses

### Cortisol

The repeated measures ANOVA with Time as within-subject factor and Sex and Group as between-subjects factors showed a main effect of Time [F(1.332,75.930) = 24.031, p ≤ 0.001, $\eta_p^2$ = 0.297] with T2 ≥ T3 (p = 0.013), T3 ≥ T4 (p ≤ 0.001), and T2 ≥ T4 (p ≤ 0.001). We also found a main effect of Group [F(1,57) = 6.193, p = 0.016, $\eta_p^2$ = 0.098] with higher cortisol levels in the threat group (M = 5.25 nmol/L, SD = 1.88) than the control group (M = 4.31 nmol/L, SD = 2.58) at all three time points (p ≤ 0.036), and a main effect of Sex [F(1,57) = 9.051, p = 0.004, $\eta_p^2$ = 0.137], whereby men (M = 5.75, SD = 2.57) had higher cortisol levels than women (M = 3.92, SD = 1.58) at all three time points (p ≤ 0.034).

Cooperation positive and – negative (willingness to cooperate after positive and negative reactions, respectively), comment positive and – negative (how positive or negative both reactions from interlocutors were experienced by participants).

A significant three-way Time × Sex × Group interaction emerged [F(1.332,75.930) = 3.653, p = 0.048, $\eta_p^2$ = 0.060]. Post hoc comparisons for men and women separately showed a main effect of Group among women [F(1,30) = 14.233, p = 0.001] with higher cortisol levels at all three time points in the threat group (p ≤ 0.027) (**Figure 3C**). Cortisol levels in men did not differ between groups (p ≥ 0.212). Comparing men and women in both groups showed a main effect of Sex in the control group [F(1,28) = 7.989, p = 0.009] with higher cortisol levels in men than women. Cortisol levels in the threat group did not differ between men and women (p ≥ 0.137) (see **Table 2** for an overview of means and standard deviations). Time point comparisons to investigate the course of cortisol levels showed that the cortisol level did not change over time for women in the threat group (p ≥ 0.281). Women in the control group showed a significant decrease between T2−T3 and T2−T4 (p ≤ 0.040). Men in the control group showed a decrease between T3 and T4 (p ≤ 0.001). In men in the threat group, cortisol levels decreased significantly between T2−T4 and T3−T4 (p ≤ 0.001). All other time point comparisons did not reach significance (p ≥ 0.116) (**Figure 3C**).

HR, heart rate; SBP, systolic blood pressure; DBP, diastolic blood pressure.

### Heart Rate and Blood Pressure

The repeated measures ANOVA with Time as within-subject factor and Sex and Group as between-subjects factors showed a main effect of Time [F(3.919,207.718) = 18.127, p ≤ 0.001, $\eta_p^2$ = 0.255]. No main effects of or interactions with Group and Sex were found (p ≥ 0.224). The Time × Sex × Group interaction showed a trend toward significance (p = 0.053). The overall pattern showed that heart rate decreased in both groups, with more fluctuation in the threat group.

We found a main effect of Time for both systolic and diastolic blood pressure (SBP and DBP, respectively) (p ≤ 0.17), with a general decay over time. For SBP, there was a main effect of Sex, showing that men had higher SBP than women (p ≤ 0.022). No difference between sexes was found for DBP (p ≥ 0.795). Blood pressure did not differ between groups (p ≥ 0.239).

## DISCUSSION


The aim of the current study was to develop and validate a new, fMRI-compatible social threat paradigm that implements a realistic representation of today's digital communication environments. As a second objective, we were interested in whether and how social-evaluative threat affects men and women differently. The opening sentences were created such that they stated something about the participant's personality or interests, so that the reactions that followed from the interlocutors would directly target the participant. Reactions in the control group were of an agreeing and accepting nature so that those participants did not experience any social-evaluative threat. Our results indicate that the VISTTA elicits subjective, emotional, and physiological responses, as apparent from lower willingness to cooperate after negative reactions, more negative ratings of reactions in the threat group, increased feelings of anger and surprise, decreased feelings of happiness, negative mood that decreased in the control group but stayed stable in the threat group, and stable cortisol levels throughout the experiment in women in the threat group. However, increased physiological measures at the start of the testing session could reflect pre-experimental arousal, as the experiment started approximately 30 min after arrival.

## Relevance of VISTTA as New Social-Evaluative Threat Paradigm

Although the VISTTA contains obvious similarities with the Chat-room Task (Donate et al., 2017), these paradigms target different concepts. The Chat-room Task ostracizes participants by not asking them the same number of questions as the confederates. This "lack of interest" in the participant is the driving force behind the ostracism induction. The content of the questions and answers does not play a role. The VISTTA is structured differently, in that participants are continuously involved in the conversations. Our goal was to use personally directed rejection to drive the experience of social-evaluative threat. Participants might nevertheless have felt ostracized during the VISTTA when the two confederates repeatedly agreed with each other and together disagreed with or insulted the participant's perspective.

## Subjective Responses

Lower willingness to cooperate after negative reactions than after positive ones shows that rejection negatively affected the motivation for social, cooperative interactions. We did not find sex differences for this measure despite different characteristic coping strategies between men and women. Nickels and Kubicki (2017) reported that performance stress, induced via the TSST, led to less prosocial behavior in men and more cooperative behavior in women. The VISTTA is not performance based, which could explain why this sex difference is not reflected in our findings. As an additional validation of social rejection, participants rated how positive or negative they experienced all reactions they received. As this was also rated on a 1–5 scale, just like "willingness to cooperate," the findings showed an almost identical pattern, with more negative ratings for negative comments and more positive ratings for positive comments. Men and women showed a similar rating pattern. Although these measures were very similar, we tried to target different concepts. "Willingness to cooperate" was hypothesized to reflect a motivation for facing the two individuals who rejected the participant, whereas "comment ratings" were hypothesized to reflect the level of positivity or negativity of each individual reaction. We wanted to indirectly measure social-evaluative threat using these measures.

## Affective Responses

Also, no sex difference emerged for the subjective mood ratings. We found that positive mood decreased over time for both men and women. This decrease, however, was seen in both the threat and control group, suggesting the negative comments during the VISTTA did not affect participants' positive emotional state. Negative mood did differ between groups. Participants in the control group reported a decreased negative mood in the first half of the VISTTA. The threat group showed an increase, although that did not reach significance. These findings suggest that the inclusive interactions in the control condition positively affected the negative mood. This underlines that positive and negative mood are not bipolar, but rather change independently from one another. We also demonstrated effects on multiple emotions such as anger, surprise, and happiness. Over the course of the threat version of the VISTTA, participants reported increased feelings of anger and surprise, and decreased feelings of happiness. Similar results have been reported using other exclusion paradigms. Unfair exclusion, compared to fair exclusion in a modified Cyberball task, was linked to increased anger (Chow et al., 2008). Exclusion from participation in the Chat-room Task also led to higher anger ratings (Donate et al., 2017). Our finding regarding surprise indicates that the negative interaction was unexpected, since the control group did not report any changes for this emotion. Decreased happiness in the threat group, contrary to stable happiness levels in the control group, indicates that the VISTTA negatively affected the positive state. Two factors might contribute to these findings. The most evident is the content of the personally directed negative comments, which affects the emotional state of the participants. Second, the anticipation of having to face the interlocutors after the chat task and to cooperate with them on a separate task for additional monetary reward could contribute to a less positive mood.

## Physiological Responses

Contrary to our a priori hypotheses, the VISTTA did not induce cortisol and heart rate increases. A possible explanation is that there is no direct social evaluation, but only indirect evaluation via computerized communication. Other fMRI-compatible paradigms, such as Cyberball (Williams et al., 2000), have also been found not to elicit a cortisol increase in either men or women (Zöller et al., 2010; Zwolinski, 2012; Seidel et al., 2013; Gaffey and Wirth, 2014; Radke et al., 2018). Cortisol increases are generally found after a stressor that includes direct personal interaction. Using a modified version of the TSST, Woody et al. (2018) investigated the effect of social-evaluative threat, and added cognitive load as an additional factor of interest. They reported increased cortisol and blood pressure
in response to social-evaluative threat, but a flat line for the non-social-evaluative threat group. Following the circadian rhythm, in healthy individuals, cortisol levels peak early in the morning and, without a stressor, decline throughout the day (Krieger et al., 1971; Weitzman et al., 1971; Debono et al., 2009; Chan and Debono, 2010). The fact that we do not see this decline in women in the threat group during the experiment could be an indication that social-evaluative threat by the VISTTA elicits an endocrinological response in women. For men, the decline in cortisol, as well as in heart rate, might indicate that they habituate to the social evaluation. As the social evaluation occurs through a computer without face-to-face interaction, the physiological responses we found could be dampened due to a more indirect threat. During the YIPS, female, but not male, participants, who are excluded during the course of a conversation with two confederates, show a cortisol increase (Stroud et al., 2000, 2002; Zwolinski, 2008). However, this cortisol response is not replicated in all studies (Linnen et al., 2012). Women appear to respond more strongly to social rejection, whereas men show increased cortisol responses to achievement challenges (Stroud et al., 2002; Kogler et al., 2017). In the study of Blackhart et al. (2007) participants were told that no one wanted to be paired with them to complete a Task after a group interaction session. Although the rejection did not come directly from the confederates, cortisol levels were significantly higher following social rejection compared to acceptance. It seems that direct personal interaction whereby investigators/jury/peers judge or reject participants is an important factor to elicit a cortisol increase, and that it is particularly effective in women. Meeting the two confederates could help reinforce the cover story and elicit a stronger response to rejection. 
Looking back at the factors influencing the stress response that we discussed in the section "Introduction," an important note here is that the abovementioned studies either included women not using OC or did not report on contraceptive use. Also, to this day, the majority of research papers focuses on sex difference whereas gender identity has been shown to differently affect the stress response. It would be a valuable addition to future research to assess not only sex but also gender, and include it as a factor of interest or at least as confounding factor.

At the start of the experiment, heart rate was significantly higher for both men and women compared to when the VISTTA was finished; however, this was seen in both experimental groups (threat vs. control). This decline opposes our previous hypothesis that HR increases as an effect of the social threat. Throughout the entire experiment, SBP was significantly higher for men compared to women, with no effect of experimental group. Men have, in general, a higher SBP than women (Reckelhoff, 2001). Overall, the VISTTA did not elicit a significant response in HR or blood pressure.

## Limitations

We were unfortunately not able to measure a continuous heart rate signal or skin conductance. The blood pressure monitor we used had to be attached separately for each time point and hence only enabled us to acquire heart rate and blood pressure for T1–T6. It was therefore not possible to directly compare physiological responses to negative and positive feedback, only between-subjects. During an acute stressor, the release of catecholamines increases heart rate and blood pressure (McEwen, 2007). Heart rate is therefore a suitable measure to investigate the immediate response to a stressful situation. To directly compare the effects of negative and positive feedback, a continuous heart rate signal would shed more light on the physiological responses to social-evaluative threat and be a suitable indicator for stress response. Although physiological measures could not be acquired continuously, we did have a behavioral measure after every conversation, allowing us to directly compare subjective effects of positive vs. negative feedback.

For this study, we chose to test a group with a specific age and social background to restrict possible confounding factors. This might result in lower generalizability of the results. Also, given that this is the initial study investigating the effects of the VISTTA, future studies including larger and different samples should shed more light on the wider applicability of this new paradigm. Since all conversations were in German, the VISTTA should be adapted for studies with non-German speakers. All opening sentences and responses are also available in English. In addition, two topics should be changed, as they were specific to the region where the study took place.

We made the choice to only include women on OCs to control for hormonal fluctuations and interpersonal differences in the menstrual cycle. It should be noted that OCs heighten estrogen levels and consequently dampen the stress response in women (Pruessner, 2018).

Although we included a variety of affective measurements, additional measurements such as feelings of embarrassment and changes in self-esteem could have served as extra validation for experiencing social-evaluative threat. Validation of the VISTTA by means of psychological responses was aimed at differences in positive and negative mood, as well as changes in a range of emotions including anger, fear, happiness, and surprise, among others. We did not find changes in fear and sadness, which may suggest that participants experienced surprise rather than social-evaluative threat from the interactions. Increases in anger, however, are comparable to other studies investigating social exclusion using the Cyberball and Chat-room Task (Chow et al., 2008; Donate et al., 2017).

## CONCLUSION AND FUTURE DIRECTIONS

Implementing personally directed, verbal negative feedback, we applied the VISTTA to induce social-evaluative threat. Men and women in the threat group responded similarly on the subjective level, that is, with increased anger and surprise and a lower willingness to cooperate in comparison to the control group. However, physiological measures differed between both groups and sexes. We demonstrated an overall higher endocrinological response in the threat group. Regardless of group, a cortisol decay over time was reported for men, whereas women showed a stable cortisol level over time in the threat group and a decay in the control group. These findings might indicate stronger habituation in men than in women and underline the importance of multi-level assessment of responses to social-evaluative threat, even in computer-mediated communication. Further replication and validation during fMRI will be crucial to determine its effects in different experimental settings. It will be of interest to determine to what extent meeting the interlocutors affects the perception of social threat. Also, rating how much the participants identify with their chosen opening statement could give more insight into how threatening the negative reactions are perceived to be. Social media environments, as used in the VISTTA, can lower the threshold for negative interaction, which, in turn, can elicit feelings of stress and rejection. Given the increasing influence of online communication platforms, the VISTTA is a useful addition for research on social-evaluative threat and psychosocial stress. In reaction to performance and evaluative stressors, sex and gender have been shown to affect the stress response differently (Pruessner, 2018). As most research has focused on the role of sex to differentiate between males and females, gender has not been given the same level of investigation despite having a demonstrated effect. The discussion on sex and gender has gained momentum over the last few years, both in society and in the scientific community. The VISTTA enables multi-level assessment of social-evaluative threat; hence, using samples with varying compositions of sex and gender identity, this paradigm could help bridge the gap between sex and gender in this particular field.

## ETHICS STATEMENT

The experimental protocol was carried out in accordance with the provisions of the World Medical Association Declaration of Helsinki. All subjects gave written informed consent. The protocol was approved by the local ethics committee at the Medical Faculty of RWTH Aachen University.

## AUTHOR CONTRIBUTIONS

ST created the first draft, finalized the manuscript, and performed the data collection and analyses. UH and TA took part in interpreting the data and revision of the manuscript. BD and SR conceived and designed the project, and took part in interpretation of the data and revision of the manuscript.

## FUNDING

This research project was supported by the START-program of the Faculty of Medicine, RWTH Aachen (691505) and by the International Research Training Group (IRTG 2150) of the German Research Foundation (DFG).

## ACKNOWLEDGMENTS

The authors thank Paul Mols from the Brain Imaging Facility at the RWTH Aachen University for his assistance with the development of the VISTTA and Grace Lee for her assistance in editing the manuscript.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2019.00830/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Tops, Habel, Abel, Derntl and Radke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Gender Difference in Gender Bias: Transcranial Direct Current Stimulation Reduces Male's Gender Stereotypes

Siqi Wang<sup>1,4</sup>, Jinjin Wang<sup>1,4</sup>, Wenmin Guo<sup>1,4</sup>, Hang Ye<sup>2,3,4</sup>, Xinbo Lu<sup>2,3</sup>, Jun Luo<sup>2,3,4</sup> and Haoli Zheng<sup>2,3,4</sup>\*

<sup>1</sup>School of Economics, Zhejiang University, Hangzhou, China, <sup>2</sup>School of Economics, Zhejiang University of Finance and Economics, Hangzhou, China, <sup>3</sup>Center for Economic Behavior and Decision-making (CEBD), Neuro & Behavior EconLab (NBEL), Zhejiang University of Finance and Economics, Hangzhou, China, <sup>4</sup> Interdisciplinary Center for Social Sciences (ICSS), Zhejiang University, Hangzhou, China

#### Edited by:

Marina A. Pavlova, University Hospital Tübingen, Germany

#### Reviewed by:

Giulia Prete, Università degli Studi G. d'Annunzio Chieti e Pescara, Italy

Jan Van den Stock, KU Leuven, Belgium

> \*Correspondence: Haoli Zheng haolizheng@zufe.edu.cn

#### Specialty section:

This article was submitted to Brain Imaging and Stimulation, a section of the journal Frontiers in Human Neuroscience

> Received: 29 November 2018 Accepted: 29 October 2019 Published: 22 November 2019

#### Citation:

Wang S, Wang J, Guo W, Ye H, Lu X, Luo J and Zheng H (2019) Gender Difference in Gender Bias: Transcranial Direct Current Stimulation Reduces Male's Gender Stereotypes. Front. Hum. Neurosci. 13:403. doi: 10.3389/fnhum.2019.00403

Stereotypes exist in the interactions between different social groups, and gender stereotypes are particularly prevalent. Previous studies have suggested that the medial prefrontal cortex (mPFC) is involved in the social cognition that plays an important role in gender stereotypes, but the specific causal effect of the mPFC remains controversial. In this study, we aimed to use transcranial direct current stimulation (tDCS) to identify a direct link between the mPFC and gender bias. Implicit stereotypes were measured by the gender implicit association test (IAT), and explicit prejudice was measured by the Ambivalent Sexism Inventory (ASI). We found that male and female participants had different behavioral and neural correlates of gender stereotypes. Anodal tDCS significantly reduced male participants' gender D-IAT scores compared with cathodal and sham stimulation, while the stimulation had an insignificant effect in female participants. The reduction in male participants' gender bias mainly resulted from a decrease in the difference in reaction time (RT) between congruent and incongruent blocks. Regarding the explicit bias measurement, male and female participants had distinct attitudes, but tDCS had no effect on ASI. Our results revealed that the mPFC played a causal role in controlling implicit gender stereotypes, which is consistent with previous observations and complements past lesion, neuroimaging, and transcranial magnetic stimulation (TMS) studies and suggests that males and females have different neural bases for gender stereotypes.

Keywords: gender stereotypes, medial prefrontal cortex, transcranial direct current stimulation, implicit associations test, gender difference

## INTRODUCTION

Stereotypes refer to socially shared conceptual attributes associated with members of a social category that describe their traits and characteristics (Greenwald and Banaji, 1995; Amodio, 2014). On the one hand, this automatic association process strengthens the distinction of different groups through overgeneralized social categorization, which is efficient as a cognitive heuristic for simplifying the complexity of the physical and social world (Abrams and Hogg, 1988); on the other hand, it influences people's social attitudes and behavior, which leads to prejudices, discrimination, and more severe social conflicts (Amodio, 2014). Gender stereotypes have appeared in the mass media and the general public, have been described and discussed in the research literature (Gray, 1992; Rudman et al., 2001), have attracted the attention of both males and females, and have contributed to the foundation of beliefs and behaviors in terms of gender (Becker and Sibley, 2009). In part, gender stereotypes reflect the different characteristics of genders; however, the broad generalization of such a large group of people can never be true and accurate. For example, although social gender stereotypes accentuate gender differences, males and females are more similar than different on most but not all psychological variables (Hyde, 2005). In addition, the intensity of gender stereotypes and the perception of similarities and differences in characteristics between males and females vary across cultures (Guimond, 2008).

Explicit measures are commonly used for assessing an individual's stereotypes and bias towards a particular group, and these measures require participants to report their own attitudes (Olson and Zabel, 2009). Studies using explicit measures have shown that levels of stereotyping and sexism have reduced in the past few years, but these specious conclusions were drawn from women more than from men (Spence and Buckner, 2000) and cannot reflect unconscious bias when controlled and regulated by social norms and political correctness (Rudman et al., 2001). Moreover, old-fashioned sexist beliefs have gradually evolved from the appearance of discriminatory behavior and negative beliefs towards women to modern sexism (Swim et al., 1995) and neosexism (Tougas et al., 1995) and have been expressed under subtle guises, such as ambivalence and chivalry (Glick and Fiske, 1996; Barreto and Ellemers, 2005).

Implicit measures on gender stereotypes have developed during the past few decades (Rudman and Kilianski, 2000; Rudman et al., 2001); however, previous studies have also revealed that the correlations between results from explicit and implicit methods vary across studies (Greenwald and Banaji, 1995; Rudman and Kilianski, 2000), which has led to a further discussion of the power of implicit measures. The implicit association test (IAT) is one of the most popular methods consistently used for measuring the automatic concept–attribute associations that underlie implicit social biases and stereotypes (Greenwald et al., 1998). In gender stereotypes, this method assesses the association of a target name (either a male name or a female name) with respective attribute categories (e.g., strong vs. weak) that represent the social stereotypes towards these different groups of people. This task requires participants to categorize the target names and attribute words by pressing two corresponding response keys as quickly as possible when they see the words appear on the computer screen. In congruent blocks, participants are instructed to categorize male names and strong attributes using one response key, while female names and weak attributes are categorized by pressing another key. In incongruent blocks, the response mapping is reversed, so male names and weak attributes share one key, and female names and strong attributes share the other key. Since the response time and accuracy rate in the congruent blocks are different from what is obtained in the incongruent blocks, the IAT scores can be calculated following a standard procedure (Greenwald et al., 2003), which represents an individual's personal implicit social bias towards these two genders.

Gender stereotypes were discovered in past behavioral research, and males and females tend to have different patterns of evaluative gender stereotypes (Rudman et al., 2001). Recently, Pavlova et al. (2014) conducted a series of experiments manipulating implicit and explicit gender stereotyping information and identified the susceptibility to these attitudes. Messages delivered in explicit positive (implicit negative) terms and explicit negative (implicit positive) terms can elicit significant gender differences in cognitive performance on a task with no initial gender gap, and this gender effect is more pronounced in females. However, these studies still lacked direct neural evidence underlying the fluctuation in gender bias.

In accordance with behavioral research, recent neurocognitive studies have investigated the neural basis of prejudice and stereotypes and have found that these psychological phenomena primarily rely on the function of a specific brain region, the medial prefrontal cortex (mPFC; Amodio, 2014). In the social cognition context, the mPFC is associated with the ability to "mentalize," which underlies theory of mind, and furthermore, with the formation of impressions about other people (Frith and Frith, 1999; Amodio and Frith, 2006; Amodio, 2014), and the mPFC is more activated during the judgment of people than in the judgment of inanimate objects (Mitchell et al., 2002). Neural activity within the mPFC also predicted empathy and an altruistic motivation towards ingroup members (Mathur et al., 2010; Cikara et al., 2011a), while the absence of activity of the mPFC was observed in the "dehumanization" process towards outgroup members (Harris and Fiske, 2006), which leads to biased attitudes and discrimination. In a gender prejudice study, mPFC activation in men who had stronger hostile sexist attitudes when viewing sexualized images of female bodies was lower than that in men who had weaker attitudes (Cikara et al., 2011b), which demonstrates the neural function of the mPFC in sexual objectification. These studies revealed a correlation of prejudice and mPFC activity in the gender field but did not provide direct evidence that proved a causal relationship.

Compared with the role of the mPFC in prejudice, it is more directly involved in stereotyping (Amodio, 2014). According to previous studies, the mPFC participated in brain functions related to stereotypes, for example, cognitive control (Amodio and Frith, 2006), automatic associations, storing social knowledge (Mitchell et al., 2002; Krueger et al., 2009), and integrating information to coordinate social behavior (Contreras et al., 2012; Gilbert et al., 2012). However, the function of the mPFC in gender stereotypes is still not clear. Neuroimaging studies have demonstrated the critical role of the mPFC in gender stereotyping using the behavioral task IAT. Using functional magnetic resonance imaging (fMRI), the anteromedial PFC was significantly activated in congruent blocks of the gender IAT, where the association between gender and social attributes was consistent with the stereotypes (Knutson et al., 2007). Quadflieg et al. (2009) also found that the ventromedial prefrontal cortex (VMPFC) shows stronger activation for stereotypic judgments than for non-stereotypical judgments, which further confirms the indispensable role of the mPFC. However, the results of lesion studies on the function of the mPFC are not all consistent. In earlier clinical observations, male patients with VMPFC lesions had a lower level of gender stereotypes than patients with dorsolateral prefrontal cortex (DLPFC) lesions (Milne and Grafman, 2001). However, a subsequent clinical study found that large lesions in the VMPFC increased stereotypical attitudes (Gozzi et al., 2009). The divergent conclusions result from the different classifications of the brain-damaged region: in Milne and Grafman (2001), patients had damage to both lateral and medial sectors of the ventral PFC, but in Gozzi et al. (2009), the researchers distinguished participants who had lesions in these regions and suggested a differential function of these two sectors of the mPFC.

Although neuroimaging and lesion studies demonstrated associations between the mPFC and gender stereotypes, the direct causal relationship remained unclear. Brain stimulation technologies such as transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS) can modulate the activity of target brain regions and establish causal connections between the brain and decisions. In one TMS study, Cattaneo et al. (2011) found that applying TMS over the right anterior dorsomedial prefrontal cortex (aDMPFC) of male participants led to increased gender stereotypes as assessed by the IAT. In the present study, we aimed to investigate the effect of tDCS over the mPFC in subjects performing a gender stereotyping task. We chose this noninvasive brain stimulation technique because of its features and advantages compared with TMS (Nitsche and Paulus, 2001; Fecteau et al., 2007). tDCS is safe and easy to use and has a reliable modulatory effect (Nitsche and Paulus, 2001). Moreover, tDCS causes neither noise interference nor muscle twitching during stimulation, which makes it a good choice for the IAT, a task that requires rapid responses (Sellaro et al., 2016). In addition, tDCS allows reliable sham stimulation, which produces a similar skin sensation but does not modulate the excitability of the brain region (Gandiga et al., 2006; Sellaro et al., 2016). More importantly, tDCS can both enhance and suppress the excitability of local brain activity (Nitsche and Paulus, 2001; Ardolino et al., 2005), so we could determine whether tDCS applied over the mPFC changed participants' gender stereotypes and thereby clarify the causal role of the mPFC in this process, which would also provide complementary evidence for the TMS study. Furthermore, to test whether participants' explicit prejudice was influenced by tDCS, we investigated whether there were any divergent results between conscious and unconscious attitudes.

## MATERIALS AND METHODS

## Participants

A total of 192 right-handed healthy students (96 males, mean age = 20.53, SD = 2.00; 96 females, mean age = 20.21, SD = 1.58) participated in our experiments. All of the participants declared no history of psychiatric illness or psychiatric problems, had normal or corrected-to-normal vision, and were naïve to tDCS, our decision-making task, and the IAT. Before starting the tasks, all participants gave written informed consent approved by the Zhejiang University ethics committee. The experiment lasted approximately one and a half hours, and each participant received an average payment of 30 RMB yuan (approximately 4.35 United States dollars) after the experiment. No participants reported any adverse side effects regarding pain in the scalp or headaches after the experiment.

## tDCS

tDCS applies a weak direct current to the scalp via two saline-soaked surface sponge electrodes (5 cm × 7 cm; 35 cm<sup>2</sup>). The current was constant and was delivered by a battery-driven stimulator (NeuroConn, Ilmenau, Germany). It was adjusted to induce changes in cortical excitability of the target area without any physiological damage to the participants. The polarity of the current determines its effect on cortical excitability: in general, anodal stimulation enhances cortical excitability, whereas cathodal stimulation restrains it (Nitsche and Paulus, 2000).

Participants were randomly assigned to one of three tDCS treatments, and the target areas were localized according to the International electroencephalography (EEG) 10–20 System. For anodal stimulation over mPFC (n = 64, 32 males and 32 females), the anodal electrode was placed horizontally over the Fpz position, whereas the return electrode was placed horizontally over Oz (Sellaro et al., 2015). For the cathodal stimulation (n = 64, 32 males and 32 females), the polarity was reversed, where the cathodal electrode was placed over Fpz, whereas the anodal electrode was placed over Oz (**Figure 1**). The current was constant for 20 min and was 1.5 mA in intensity, with a 30 s ramp up and down; the safety and efficiency of this stimulation have been demonstrated in previous studies (Riva et al., 2015). For sham stimulation (n = 64, 32 males and 32 females), the procedures were the same as in the active tDCS, but the stimulation was automatically turned off after 30 s without the participant's knowledge. The participants may have felt the initial itching, but there was no current for the rest of the stimulation. This method of sham stimulation has been shown to be reliable (Gandiga et al., 2006). Before the decision-making tasks, the laboratory assistant put a tDCS device on the participant's head for stimulation. After 20 min of stimulation, the tDCS device was taken off, and the participant was then asked to complete several tasks.
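The stimulation dose implied by these parameters can be checked with a short calculation. The sketch below uses only the values reported above (1.5 mA, 5 cm × 7 cm electrodes, 20 min, 30 s ramps) to derive the nominal current density and the charge delivered at constant current. The function and variable names are illustrative and not part of the original protocol. The text does not say whether the ramps fall inside or outside the 20 min, so the charge of a single linear ramp is computed separately.

```python
# Nominal tDCS dose from the reported parameters (illustrative names only).
CURRENT_MA = 1.5          # stimulation intensity in mA
ELECTRODE_CM2 = 5 * 7     # sponge electrode area in cm^2
PLATEAU_S = 20 * 60       # constant-current period in seconds
RAMP_S = 30               # duration of each ramp in seconds


def current_density(current_ma, area_cm2):
    """Average current density under the electrode, in mA/cm^2."""
    return current_ma / area_cm2


def plateau_charge(current_ma, duration_s):
    """Charge delivered at constant current, in coulombs."""
    return current_ma / 1000.0 * duration_s


def ramp_charge(current_ma, ramp_s):
    """Charge of one linear ramp (triangle area), in coulombs."""
    return 0.5 * current_ma / 1000.0 * ramp_s


density = current_density(CURRENT_MA, ELECTRODE_CM2)
charge = plateau_charge(CURRENT_MA, PLATEAU_S)
per_ramp = ramp_charge(CURRENT_MA, RAMP_S)
print(f"current density: {density:.4f} mA/cm^2")
print(f"plateau charge:  {charge:.2f} C")
print(f"charge per ramp: {per_ramp:.4f} C")
```

The density is an average over the electrode surface; the actual current distribution in the cortex depends on head anatomy and is not captured by this arithmetic.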

## Task and Procedure

All of the participants received a single-blinded stimulation session (either anodal, cathodal, or sham stimulation), with tDCS applied on the mPFC for 20 min, and then completed IAT tasks programmed by Inquisit 4 (Millisecond Software, Seattle, WA, USA). After the IAT task, they were asked to complete a questionnaire including an explicit test and personal information.

In the IAT task, 20 words were used as stimuli: 10 common typical Chinese names and 10 attributes. Five of the names were Chinese male names, and five were Chinese female names. The attribute words consisted of five strong words and five weak words, which were selected from a previous gender stereotyping study (Rudman et al., 2001); the stimuli from that study have been applied several times in other neurological studies using gender IAT since then (Knutson et al., 2007; Gozzi et al., 2009; Cattaneo et al., 2011).

FIGURE 1 | … and Oz based on the international electroencephalography (EEG) 10–20 system of the human brain. The shading represents the range of input voltage from −19.379 V to 18.948 V.

The task used the procedure designed by Greenwald et al. (1998), consisting of five blocks. Blocks 1, 2, and 4 were practice blocks, and the others (Blocks 3 and 5) were test blocks. In (practice) Block 1, participants were asked to classify male and female names by pressing the left (E) and right (I) keys. In (practice) Block 2, they were asked to categorize the strong words and weak words using the same two keys. In (practice) Block 4, participants were asked to categorize male and female names again, but the key assignments were reversed compared with Block 1. In (test) Blocks 3 and 5, name and attribute words were combined. One of these blocks was in the congruent condition, where participants were required to press key "E" with the left hand for male names and strong words and key "I" with the right hand for female names and weak words. The other block was in the incongruent condition, in which the association was switched such that female names shared key "E" with strong words, and male names shared key "I" with weak words. The order of the congruent and incongruent blocks was counterbalanced; accordingly, the order of the name practice blocks corresponded with the order of the test blocks (the position of Block 1 was swapped with Block 4 when Block 3 was incongruent and Block 5 was congruent).

Stimuli were presented in the center of the computer screen in white text on a black background using Inquisit 4 (Millisecond Software, Seattle, WA, USA). The category labels ("men" and "women," "strong" and "weak") were displayed on the top left and top right of the screen. Practice Blocks 1, 2, and 4 had 20 trials, and test Blocks 3 and 5 had 40 trials. To complete the task, participants needed to classify name and attribute words by pressing the keys "E" and "I" on the computer keyboard according to the label's position. Each stimulus remained on the screen until the participant had given the correct response, followed by a 500 ms blank screen. Participants were asked to respond as quickly and accurately as possible when stimuli appeared on the screen.
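The block layout and counterbalancing described in this section can be summarized in a short sketch. It is an illustration only, not the Inquisit script used in the study: `build_blocks` and the dictionary fields are hypothetical names, and the trial counts are assumed to apply per block.

```python
# Sketch of the five-block gender IAT layout (hypothetical helper names).

def build_blocks(congruent_first):
    """Return the five blocks in presentation order with their key mappings."""
    male_left = {"E": ("male names",), "I": ("female names",)}
    female_left = {"E": ("female names",), "I": ("male names",)}
    attributes = {"E": ("strong words",), "I": ("weak words",)}

    def combine(names):
        # A test block pairs each name category with the attribute on the same key.
        return {key: names[key] + attributes[key] for key in ("E", "I")}

    if congruent_first:
        first, second = male_left, female_left
        labels = ("congruent", "incongruent")
    else:
        # Name practice blocks swap places so each precedes its matching test block.
        first, second = female_left, male_left
        labels = ("incongruent", "congruent")

    return [
        {"block": 1, "kind": "practice", "trials": 20, "keys": first},
        {"block": 2, "kind": "practice", "trials": 20, "keys": attributes},
        {"block": 3, "kind": labels[0], "trials": 40, "keys": combine(first)},
        {"block": 4, "kind": "practice", "trials": 20, "keys": second},
        {"block": 5, "kind": labels[1], "trials": 40, "keys": combine(second)},
    ]


for block in build_blocks(congruent_first=True):
    print(block["block"], block["kind"], block["trials"], block["keys"])
```

Keeping the attribute mapping fixed (strong on "E", weak on "I") and reversing only the name mapping mirrors the description above, in which strong words share key "E" with male names in the congruent block and with female names in the incongruent block.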

## Analysis

The critical variables are mean reaction time (RT) and percentage of errors (PE), which reflect the subjects' direct responses to the different types of associations. According to Greenwald et al. (2003), an improved algorithm performed better in measuring implicit association strength, so we calculated D-IAT scores for the three stimulation conditions following this procedure. All trials except the extremely long trials (latencies >10,000 ms) were included, and error latencies were replaced with block mean latencies plus 600 ms. The RTs and PEs were then synthesized into D-IAT scores: the difference between the adjusted mean latencies of the incongruent and congruent blocks divided by the pooled standard deviation of all trials. In general, these three variables together indicate how strong the stereotypes are, with higher IAT scores and larger differences in RTs and PEs between the congruent block and the incongruent block representing a stronger implicit bias towards males or females.
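The scoring steps above can be sketched in a few lines of code. This is a minimal illustration rather than the authors' analysis script. It assumes that the block mean used for the error penalty is taken over correct trials only and that a single standard deviation is computed over the penalized latencies of both test blocks; these details follow one common reading of Greenwald et al. (2003) and are not spelled out in the text. The function names and the example trials are hypothetical.

```python
from statistics import mean, stdev

MAX_LATENCY_MS = 10_000   # trials slower than this are excluded
ERROR_PENALTY_MS = 600    # added to the block mean for error trials


def adjusted_latencies(trials):
    """trials: list of (latency_ms, correct) pairs from one test block."""
    kept = [(rt, ok) for rt, ok in trials if rt <= MAX_LATENCY_MS]
    # Assumption: the block mean is computed from correct trials only.
    block_mean = mean(rt for rt, ok in kept if ok)
    return [rt if ok else block_mean + ERROR_PENALTY_MS for rt, ok in kept]


def d_iat(congruent, incongruent):
    """D score: latency difference scaled by the SD of all retained trials."""
    con = adjusted_latencies(congruent)
    inc = adjusted_latencies(incongruent)
    # Assumption: one SD over the pooled, penalized trials of both blocks.
    pooled_sd = stdev(con + inc)
    return (mean(inc) - mean(con)) / pooled_sd


# Hypothetical trials: (latency in ms, response correct?)
congruent_trials = [(600, True), (650, True), (700, True), (12000, True)]
incongruent_trials = [(800, True), (900, True), (700, False)]
example_d = d_iat(congruent_trials, incongruent_trials)
print(f"D-IAT = {example_d:.3f}")
```

A positive score means slower responses in the incongruent block, that is, a stronger association of male names with strong attributes and female names with weak attributes.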

To evaluate explicit stereotypes, we used the Ambivalent Sexism Inventory (ASI; Glick and Fiske, 1996) and calculated the scores according to the standard method (**Supplementary Table S1**). Because some of the reversed-worded items did not perform well when translated into other languages in cross-cultural studies (Glick et al., 2000), only valid items were retained in the test. In general, the ASI scores represent ambivalent attitudes towards women. HS represents hostile sexism, expressing negative stereotypes and attitudes towards women, while BS represents benevolent sexism expressing positive stereotypes and attitudes, both of which complementarily generate gender inequity in various cross-cultural ideologies (Glick and Fiske, 2001). In addition, there are three subfactors of benevolent sexism: protective paternalism, complementary gender differentiation, and heterosexual intimacy, which correspond with three types of questions (BP, BG, and BI) in BS.

## RESULTS

The data were statistically evaluated using SPSS software (version 22, SPSS Inc., Chicago, IL, USA). The significance level was set at 0.05 for all analyses.

## Implicit Measures

The Shapiro–Wilk test showed that the residuals of the D-IAT scores were normally distributed (p = 0.129). To test whether both male and female participants had gender stereotypes, we used one-sample t-tests to compare D-IAT scores against zero. **Figure 2** shows both male and female participants' D-IAT scores in the different stimulation conditions. Across all male data, the D-IAT scores differed significantly from zero (t(95) = 29.35, p < 0.001, Mean = 0.87, SD = 0.29). Analyses also showed that the D-IAT scores from each of the three stimulation conditions were significantly different from zero (anodal: t(31) = 13.47, p < 0.001, Mean = 0.72, SD = 0.30; cathodal: t(31) = 19.30, p < 0.001, Mean = 0.89, SD = 0.26; sham: t(31) = 23.73, p < 0.001, Mean = 1.01, SD = 0.24), which indicated that male subjects had strong associations of male names with strong attributes and female names with weak attributes regardless of the stimulation conditions. The female subjects also showed gender stereotypes in all three tDCS groups. t-tests revealed that their D-IAT scores were significantly different from zero overall and in each stimulation condition (all: t(95) = 9.86, p < 0.001, Mean = 0.40, SD = 0.40; anodal: t(31) = 7.03, p < 0.001, Mean = 0.49, SD = 0.40; cathodal: t(31) = 4.81, p < 0.001, Mean = 0.37, SD = 0.44; sham: t(31) = 5.36, p < 0.001, Mean = 0.35, SD = 0.37).
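As a plausibility check, a one-sample t statistic can be recomputed from the summary statistics reported above. The sketch below is not the SPSS analysis; `one_sample_t` is an illustrative helper. Because the reported means and SDs are rounded to two decimals, the result only approximates the published t values.

```python
from math import sqrt


def one_sample_t(mean_value, sd, n, mu0=0.0):
    """t statistic and degrees of freedom for H0: population mean equals mu0."""
    standard_error = sd / sqrt(n)
    return (mean_value - mu0) / standard_error, n - 1


# All male participants: Mean = 0.87, SD = 0.29, n = 96 (values from the text).
t_male, df_male = one_sample_t(0.87, 0.29, 96)
print(f"t({df_male}) = {t_male:.2f}")
```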

A two-way ANOVA performed on the D-IAT scores of all subjects using gender and tDCS type as factors showed that the main effect of gender (F(1,186) = 90.34, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.32) and the interaction of gender and stimulation condition (F(2,186) = 6.22, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.06) were significant, which indicated that male and female participants' D-IAT scores have different patterns. The main effect of tDCS type was not significant (F(2,186) = 0.85, p = 0.429, η<sub>p</sub><sup>2</sup> < 0.01). Post hoc analysis using Bonferroni corrections revealed that male participants' D-IAT scores significantly decreased when subjects underwent anodal stimulation compared with sham stimulation (p = 0.003), while the D-IAT scores of those who underwent cathodal stimulation did not differ significantly from those of the sham group (p = 0.476) or the anodal group (p = 0.163). However, the female participants' D-IAT scores were not significantly changed by the stimulation. Bonferroni post hoc tests revealed no significant result in the pairwise comparisons between the three stimulation conditions (p > 0.1). Post hoc analysis using Bonferroni corrections also found that male participants' D-IAT scores were higher than those of female participants in all stimulation conditions (anodal: p = 0.006, cathodal: p < 0.001, and sham: p < 0.001).
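For a fixed-effects between-subjects design, partial eta squared can be recovered from an F value and its degrees of freedom, since it equals SS<sub>effect</sub> / (SS<sub>effect</sub> + SS<sub>error</sub>). The sketch below applies this identity to the F values reported above. The helper name is illustrative, and small deviations from the published effect sizes may reflect rounding of the reported statistics.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
effects = {
    "gender": (90.34, 1, 186),
    "gender x stimulation": (6.22, 2, 186),
    "stimulation": (0.85, 2, 186),
}
for name, (f_value, df_effect, df_error) in effects.items():
    print(f"{name}: {partial_eta_squared(f_value, df_effect, df_error):.3f}")
```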

D-IAT scores depend on RTs and PEs: lower D-IAT scores mean higher RTs and PEs in the congruent blocks or lower RTs and PEs in the incongruent blocks. Therefore, we further decomposed the D-IAT effect and analyzed RTs and PEs. **Table 1** shows all the means and SDs for RTs and PEs across genders, blocks, and stimulation conditions. The Shapiro–Wilk test showed that the residuals of RTs and PEs in congruent and incongruent blocks were not normally distributed (p < 0.05), so we performed non-parametric tests to analyze them. First, we tested RTs; **Figure 3** shows the results. The Wilcoxon signed-rank test showed a significant difference between RTs in congruent blocks and incongruent blocks (p < 0.001), indicating that RTs were overall higher in incongruent blocks than in congruent blocks. This difference in RTs remained present for both male and female participants in the three stimulation conditions (p < 0.001).

TABLE 1 | Mean and SD D-IAT scores, reaction times, percent of error, and rate correct scores across genders, blocks, and stimulations.

Asterisks indicate statistically significant differences between congruent and incongruent blocks. Daggers indicate statistically significant differences between male and female genders.

More importantly, we analyzed the factors of block conditions, gender, and stimulation conditions. First, we applied these tests on the data from male participants separately. With a Kruskal–Wallis test, we found no significant difference in RTs either in congruent blocks or in incongruent blocks (p > 0.1), indicating that tDCS did not change the latencies in either block considered on its own. However, the difference in RT between congruent blocks and incongruent blocks was significantly modulated by tDCS (p = 0.028). Post hoc analysis using Dunn–Bonferroni corrections revealed that the difference in RTs under anodal stimulation was significantly smaller than that under sham stimulation (p = 0.023), while cathodal stimulation had no significant effect compared with the anodal and sham groups (p > 0.1). These results indicated that the effect of tDCS on male participants' gender stereotypes stemmed from the relative association between congruent blocks and incongruent blocks rather than from either block independently. As for female participants, the effect of tDCS disappeared. Kruskal–Wallis tests on RTs in congruent blocks, in incongruent blocks, and on the difference between them were all non-significant (p > 0.1).

We also tested PEs using the same non-parametric tests. **Figure 4** shows the results. The Wilcoxon signed-rank test showed a significant difference between PEs in congruent blocks and incongruent blocks (p < 0.001), indicating that participants made more mistakes overall in incongruent blocks than in congruent blocks. For male participants in the three stimulation conditions, this difference in PEs between block conditions still existed (cathodal, sham: p < 0.001, anodal: p = 0.002), while differences in female participants' error rates between block conditions were not significant. Meanwhile, for both males and females, Kruskal–Wallis tests on PEs in congruent blocks, in incongruent blocks, and on the difference between them were all non-significant (p > 0.1).

We further focused on gender differences in the RTs and PEs of both congruent and incongruent blocks. The Mann–Whitney test showed that differences in RTs and PEs between genders existed in both block conditions, but in opposite directions. For RTs, male participants reacted significantly more slowly than females in incongruent blocks (p < 0.001) but significantly more quickly in congruent blocks (p = 0.014). For PEs, males made significantly more mistakes than females in incongruent blocks (p = 0.010) but significantly fewer mistakes in congruent blocks (p = 0.049).
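The between-group comparison of block differences described above can be illustrated with a minimal pure-Python sketch. It computes a tie-corrected Kruskal–Wallis H statistic for exactly three independent groups (as with anodal, cathodal, and sham) and uses the closed-form chi-square tail for two degrees of freedom. The function names and the difference scores are hypothetical and are not the study's code or data; the Dunn–Bonferroni post hoc step is not implemented here.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..N; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_3(groups: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Tie-corrected H and asymptotic p-value for exactly three groups."""
    if len(groups) != 3 or any(len(g) == 0 for g in groups):
        raise ValueError("expected three non-empty groups")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)

    # Standard correction for tied observations.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction

    # With three groups, df = 2 and the chi-square tail is exp(-H / 2).
    p = min(1.0, math.exp(-h / 2.0))
    return h, p


# Hypothetical per-participant differences (incongruent minus congruent RT, ms).
anodal = [85.0, 110.0, 92.0, 130.0, 101.0]
cathodal = [140.0, 155.0, 120.0, 171.0, 149.0]
sham = [160.0, 182.0, 150.0, 195.0, 176.0]
h_stat, p_value = kruskal_wallis_3([anodal, cathodal, sham])
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

The chi-square approximation is asymptotic, so with very small groups an exact or permutation-based p-value would be a more cautious choice.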

## Explicit Measures

The Shapiro–Wilk test showed that the residuals of ASI were normally distributed (p = 0.116). To investigate whether tDCS directly changes explicit prejudice, we tested the effect of tDCS on the explicit test (**Table 2**). A two-way ANOVA with stimulation condition (anodal, cathodal, and sham) and gender (male and female) as between-subjects factors showed no significant main effect of stimulation condition (F(2,186) = 2.46, p = 0.088, η<sub>p</sub><sup>2</sup> = 0.03) or gender (F(1,186) = 3.46, p = 0.064, η<sub>p</sub><sup>2</sup> = 0.02), and no significant interaction between stimulation condition and gender (F(2,186) = 0.20, p = 0.818, η<sub>p</sub><sup>2</sup> < 0.01), which indicated that tDCS did not modulate the explicit gender stereotypes of either male or female participants.
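As a consistency check on the effect sizes reported above, partial eta squared can be recovered from each F statistic and its degrees of freedom. The identity below is the standard textbook definition and is assumed here, since the text does not state how the values were computed:

$$
\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}} = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}
$$

Applied to the reported values:

$$
\begin{aligned}
\text{stimulation:}\quad & \frac{2.46 \times 2}{2.46 \times 2 + 186} = \frac{4.92}{190.92} \approx 0.026 \\
\text{gender:}\quad & \frac{3.46 \times 1}{3.46 \times 1 + 186} = \frac{3.46}{189.46} \approx 0.018 \\
\text{interaction:}\quad & \frac{0.20 \times 2}{0.20 \times 2 + 186} = \frac{0.40}{186.40} \approx 0.002
\end{aligned}
$$

These round to 0.03, 0.02, and < 0.01, in agreement with the reported effect sizes.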

Since the ASI has four subscales (HS, BI, BG, and BP), and only the residuals of HS were normally distributed according to the Shapiro–Wilk test (HS: p = 0.152; BI, BG, BP: p < 0.05), we performed non-parametric tests for analysis. The Kruskal–Wallis test was used to test the tDCS effect on these subscales in male and female participants separately. Overall, tDCS stimulation did not change the subscales (p > 0.1). The only explicit attitude influenced by tDCS was BG in female participants (p = 0.045). Post hoc analysis using Dunn–Bonferroni corrections revealed that anodal stimulation reduced the intensity of females' attitudes on gender differentiation compared with sham stimulation (p = 0.041).

The Mann–Whitney test was applied to analyze the gender difference in the subscales further. There were some distinctions between the attitudes of male and female participants. On the whole, males and females had a similar degree of hostile sexism towards women (p = 0.083). However, there were significant differences in benevolent sexism between male and female participants. Males had higher BI factor values than females (p < 0.001), and they also had higher BP factor values (p = 0.027). Nevertheless, the male participants' sexism was weaker than that of females in terms of the BG factor (p < 0.001). These results suggested that males and females differed in the components of their benevolent sexism: males were more sexist in terms of protective paternalism and heterosexual intimacy, while females focused more on complementary gender differentiation.

## Correlation Between Implicit and Explicit Measures

Finally, we tested whether the explicit attitudes, that is, the ASI scores, were correlated with the implicit gender stereotypes. The ASI scores were positively correlated with the D-IAT score in the sham group at a level close to significance (ρ = 0.24, p = 0.053 in a Pearson correlation test). In our study, the HS scores of participants in the sham condition had a significant relationship with the D-IAT score according to the Pearson correlation test (ρ = 0.39, p = 0.002 for HS; ρ = 0.03, p = 0.83 for BS), which implied that hostile sexism was the only explicit attitude correlated with the implicit gender stereotypes. When we tested the correlations in all six combinations (three stimulation conditions × two genders), HS was positively correlated with the D-IAT scores only in the sham stimulation, for both males (ρ = 0.34, p = 0.057) and females (ρ = 0.34, p = 0.061), a trend close to significance.

## Robustness Analysis: RCS

In the implicit measures, we first calculated D-IAT scores, and then analyzed the RTs and PEs separately. As a robustness analysis, we further applied another measure, the rate correct score (RCS), which combines speed and accuracy. The RCS is the number of correct responses divided by the sum of all RTs, computed separately for the congruent and incongruent conditions (Woltz and Was, 2006; Vandierendonck, 2017). **Table 1** shows all of the mean and SD values for RCS across genders, blocks, and stimulation conditions. **Figure 5** also shows the results.
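The RCS definition above can be written as a short function. This is a minimal sketch, assuming RTs are recorded in milliseconds and that the RTs of all trials in a block (correct or not) enter the denominator; the function name and the trial data are hypothetical.

```python
from typing import Sequence


def rate_correct_score(rts_ms: Sequence[float], correct: Sequence[bool]) -> float:
    """Return correct responses per second for one block of trials.

    rts_ms  -- reaction time of every trial in the block, in milliseconds
               (the millisecond unit is an assumption of this sketch)
    correct -- whether each corresponding trial was answered correctly
    """
    if len(rts_ms) != len(correct):
        raise ValueError("rts_ms and correct must have the same length")
    total_seconds = sum(rts_ms) / 1000.0
    if total_seconds <= 0:
        raise ValueError("total reaction time must be positive")
    n_correct = sum(1 for c in correct if c)
    return n_correct / total_seconds


# Hypothetical trials for one participant.
congruent_rcs = rate_correct_score([600, 700, 650, 550], [True, True, True, True])
incongruent_rcs = rate_correct_score([900, 1000, 850, 1250], [True, True, False, True])

# The block difference is the quantity compared across stimulation conditions.
rcs_difference = congruent_rcs - incongruent_rcs
print(congruent_rcs, incongruent_rcs, rcs_difference)
```

Because slower responses and errors both lower the score, a single RCS per block captures the two effects that the separate RT and PE analyses examine.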

The Shapiro–Wilk test showed that the residuals of RCS in incongruent blocks were normally distributed (p = 0.315), while those in congruent blocks were not (p = 0.027), so we performed non-parametric tests to analyze them. The Wilcoxon signed-rank test showed a significant difference between RCS in congruent blocks and incongruent blocks (p < 0.001), indicating that RCS were higher overall in congruent blocks than in incongruent blocks, that is, participants made more correct responses per second in congruent blocks. For both male and female participants in the three stimulation conditions, these differences in RCS between block conditions still existed (p < 0.001).

Kruskal–Wallis testing on RCS in both the congruent blocks and incongruent blocks from male and female participants showed that tDCS did not change the RCS in either block type taken on its own (p > 0.1). However, for male participants, the difference in RCS between congruent blocks and incongruent blocks was significantly modulated by tDCS (p = 0.003). Post hoc analysis using Dunn–Bonferroni corrections revealed that the difference in RCS under anodal stimulation was significantly smaller than that under sham stimulation (p = 0.002), while cathodal stimulation had no significant effect compared with the anodal and sham groups (p > 0.1). These results demonstrated that tDCS modulated male participants' relative correct responses per second between congruent blocks and incongruent blocks, which was consistent with the results of the RT analysis. For female participants, these effects were all non-significant.

We also checked the gender difference in RCS in congruent and incongruent blocks by using the Mann–Whitney test. In congruent blocks, male participants' number of correct responses per second was higher than females' (p = 0.006), but in incongruent blocks, male participants made significantly fewer correct responses per second than females (p < 0.001). This result is consistent with our findings for RTs and PEs, in that differences in RCS between genders existed in both block conditions but were in opposite directions in the two block conditions.

## DISCUSSION

This article investigated the contribution of the mPFC to stereotypes, specifically within the domain of gender stereotypes, and this effect was found to be limited to male participants. Previous lesion studies (Milne and Grafman, 2001; Gozzi et al., 2009), neuroimaging studies (Knutson et al., 2007; Quadflieg et al., 2009), and a TMS study (Cattaneo et al., 2011) suggested that the mPFC was involved in prejudice and stereotyping (Amodio, 2014), especially in the gender stereotyping assessed by the IAT (Greenwald et al., 1998, 2003). Nevertheless, the mechanistic role of the mPFC in this test remained vague, and the conclusions have not been convergent.

Because of the inconsistency of previous results and the lack of a test of the causal relationship between the mPFC and gender stereotypes, in this study, we applied tDCS over the mPFC in our participants to directly modulate this brain region and reveal the precise effect on gender stereotypes. We found that, when enhancing the activity of the mPFC, the implicit gender stereotyping attitudes of male participants, as indicated by the D-IAT scores measured by the gender IAT (Rudman et al., 2001), were reduced compared to the sham group. This observation demonstrated the causal relationship between mPFC activation and gender-stereotyped attitudes. The reduction in the D-IAT scores mainly stemmed from a decrease in the difference in RTs between the incongruent and congruent blocks when participants underwent anodal stimulation over the mPFC, which seems to conflict with previous research in which the effect of modulating the activity of the mPFC resulted from altered performance only in the incongruent blocks (Sellaro et al., 2015). In that study, the researchers found that enhancing the activation of the mPFC reduced the negative bias towards social outgroups. Thus, the interpretation was that the mPFC was an essential region in self-regulatory and cognitive control in the context of ethnic stereotyping. Cattaneo et al. (2011) suggested that the inhibition of the aDMPFC by TMS led to an increase in gender bias based on an increased error rate in the incongruent blocks. Our findings were not inconsistent with that result because the aDMPFC is involved in the network mediating cognitive control in the DLPFC. In the present study, the target brain region was the mPFC, or the VMPFC specifically, which proved to have a different function than the DMPFC based on a lesion study (Gozzi et al., 2009). This result can also be compared to the outcome from Gladwin et al. (2012), where anodal stimulation of the L-DLPFC only improved the RT in congruent blocks using the IAT about insects and flowers. They found that the function of the L-DLPFC was to influence working memory, which meant that the activation of the L-DLPFC increased the associations in congruent blocks and led to faster RTs but that, in incongruent blocks, this activation of the brain region affected congruent and incongruent associations at the same time.

In this study, several factors contributed to the changes observed in the congruent blocks compared with the incongruent blocks. First, the effect of mPFC activation on congruent blocks is consistent with the role of the mPFC in memory and decision making. According to Euston et al. (2012), when confronted with different contexts, locations, and events, the mPFC takes part in the process of learning and using the associations between these targets to provide the corresponding response. This function in both long-term and short-term memory provides the possible explanation that the activation of the mPFC reduced the association intensity in congruent blocks but had an effect on both congruent and incongruent associations in incongruent blocks, which finally led to a reduction in the bias. Another reason is that gender stereotypes are culture-sensitive. In Western culture and Chinese culture, the history and current situation of social gender stereotypes are not entirely the same. In addition, there is a gap in the intensity of the gender stereotypes between these two societies. For example, the D-IAT scores from our experiment were higher than those from the previous study (Cattaneo et al., 2011). The similar, but not identical, cultural background influences an individual's neural activity, which underlies cognitive functions such as emotional processing, mental attribution, self-representation, and self-awareness (Han and Northoff, 2008), and this possibly causes the distinct change in the congruent blocks during anodal stimulation. Together, the reasons above can account for the effect of tDCS on the differences in RTs and RCS between incongruent and congruent blocks. A more precise role of the mPFC in the neural circuit of prejudice and stereotypes could be established by further combining fMRI and tDCS techniques.

We also revealed that cathodal tDCS had no significant effect on the behavior of either the male or female participants compared with the sham group, which was consistent with Sellaro et al. (2015). This result may be because the mPFC is insensitive to cathodal stimulation, which was also investigated in studies of cathodal stimulation of the somatosensory cortex, while anodal stimulation influenced the activity of this brain area (Matsunaga et al., 2004). Another possible explanation also mentioned in previous studies is that the low background level of activity in the mPFC and the high prejudice baseline have a ceiling effect, which limits the influence of cathodal stimulation (Matsunaga et al., 2004; Sellaro et al., 2015).

This study also investigated gender differences in the view of gender bias. Regarding implicit stereotypes, the female participants' bias was significantly lower than that of the male participants, and only the male participants' gender stereotypes were significantly affected by the tDCS, which has also been observed in several studies (Rudman et al., 2001; Knutson et al., 2007; Cattaneo et al., 2011). Additionally, the female participants showed a significant gender bias in the IAT, which resulted from a different gender culture baseline between the societies. One possible reason is that the different behaviors of the male and female participants stemmed from different neural stereotype substrates or different sensitivities to activation of the mPFC during tDCS. The female participants had a relatively higher activation level of the mPFC and lower gender bias so that stimulation power was limited, and stimulation could not further reduce the bias. Because explicit attitudes correlated with D-IAT scores for some of the subscales, different neural activities in the male and female participants may both consciously and unconsciously influence the bias.

In terms of the neural substrates of gender difference, Stam et al. (2019) demonstrated that brain structure-personality associations are dependent on sex. Specifically, in some brain regions, there were inverse associations between temperament and regional gray matter volume (GMv) in males and females, and the brain regions related to gender and temperament were non-overlapping. So, the difference in personality between genders has a sex-specific neural basis. What we found in our study is consistent with Stam et al. (2019): the difference in implicit gender stereotypic attitudes between males and females has a sex-specific association with the target region, the mPFC. We demonstrated the causal relationship between the mPFC and gender stereotypes by modulating the activity of the mPFC. Although personal characteristics, temperaments, and stereotypic attitudes are distinct from each other (for example, temperaments are heritable, homogeneous, and stable, while stereotypic attitudes can be influenced by culture and evolve over the lifetime) (Comings et al., 2000; Stam et al., 2019), sex-specific associations between brain regions and personal traits and attitudes still exist. In summary, our results extend the evidence for a neural basis of gender difference from personality to stereotypes.

In this study, the participants' explicit prejudicial beliefs were measured by the ASI, which has been widely used in previous research (Glick and Fiske, 1996; Milne and Grafman, 2001; Rudman et al., 2001; Knutson et al., 2007; Cattaneo et al., 2011). The ASI scores correlated with the D-IAT scores in the sham group, which indicated that the explicit attitudes of gender stereotypes and the automatic association process were closely related. However, our study demonstrated that tDCS had no effect on the majority of the explicit test. Therefore, we interpret these findings to show that the explicit gender bias in the ASI was consciously controlled according to social norms and discipline, and this bias expression can be controlled by external influences, such as culture and education (Crandall et al., 2002; Cunningham et al., 2004). This cultural background may explain the different results in our research from previous research. For example, Rudman et al. (2001) revealed that both BS and HS significantly correlated with gender potency stereotypes measured by a similar IAT to that used in our experiment, and these findings demonstrated that participants who held both hostile and benevolent sexist attitudes had the same automatic associations between males and potency. However, in our study, only the HS scores in our participants in the sham stimulation condition showed trends close to a significant relationship with the D-IAT scores through the Pearson correlation test, which revealed that hostile sexism was the only explicit behavior related to the implicit gender stereotypes here.

## Limitations

One limitation of this study is that although our findings confirmed that modulating the excitability of the mPFC reduced male participants' gender stereotypes, the neural circuitry underlying this process cannot be demonstrated by a single experiment. Future studies may focus on other brain regions and discuss the functions of the mPFC within the neural circuit. Moreover, with this bipolar tDCS montage, it is still unclear whether only the mPFC influences gender stereotypes or whether both the target and return electrodes, and the interaction between them, jointly influence participants' behavior. These issues should be considered seriously in further studies. In addition, this study applied a between-subject design to avoid the learning effect, which can also be improved upon in the future.

## CONCLUSION

Our study revealed that male and female participants had different behavioral performance and neural substrates regarding gender stereotypes. Males had a relatively higher level of gender stereotyping than females, and the mPFC plays a causal role in controlling male participants' implicit gender stereotypes. Male participants' implicit bias was significantly restrained by tDCS, but female participants were not significantly influenced. The stimulation did not directly influence the ability to make automatic associations in congruent blocks or to overcome automatically activated gender-biased associations in incongruent blocks, but it affected the difference between the two blocks. We also found differences in explicit prejudice between males and females, which have both neural and cultural underpinnings.

## ETHICS STATEMENT

This study was carried out in accordance with the recommendations of Zhejiang University ethics committee with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Zhejiang University ethics committee.

## AUTHOR CONTRIBUTIONS

SW and HZ designed the experiment, analyzed the data and wrote the manuscript. SW, JW, WG, HY, XL, JL, and HZ performed the experiment, revised the manuscript and finally approved the version to be published. SW drew figures.

## FUNDING

This work was supported by the Zhejiang Provincial Natural Science Foundation of China (LY19G030019), Zhejiang Philosophy and Social Sciences Planning Project (17NDJC166 YB), National Social Science Fund of China (Grant Numbers: 13AZD061 and 15ZDB134), National Natural Science Foundation of China (Grant Numbers: 71703145 and 71903169) and MOE (Ministry of Education in China) Project of Humanities and Social Sciences (Project No. 17YJCZH120).

## REFERENCES

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2019.00403/full#supplementary-material.

brain stimulation. Clin. Neurophysiol. 117, 845–850. doi: 10.1016/j.clinph.2005.12.003


**Conflict of Interest**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Wang, Wang, Guo, Ye, Lu, Luo and Zheng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Immune System Sex Differences May Bridge the Gap Between Sex and Gender in Fibromyalgia

Irene Meester<sup>1</sup> \*, Gerardo Francisco Rivera-Silva<sup>1</sup> \* and Francisco González-Salazar<sup>1,2</sup>

<sup>1</sup> Laboratory of Tissue Engineering and Regenerative Medicine, Basic Sciences Department, University of Monterrey, San Pedro Garza García, Mexico, <sup>2</sup> Laboratory of Cellular Physiology, Northeast Center of Research, Mexican Institute of Social Security, Monterrey, Mexico

The fibromyalgia syndrome (FMS) is characterized by chronic widespread pain, sleep disturbances, fatigue, and cognitive alterations. The limited efficacy of targeted treatment and a high FMS prevalence (2–5% of the adult population) add up to high morbidity. Although altered nociception has been explained with the central sensitization hypothesis, which may occur after neuropathy, its molecular mechanism is not understood. The marked female predominance among FMS patients is often attributed to a psychosocial predisposition of the female gender, but here we will focus on sex differences in neurobiological processes, specifically those of the immune system, as various immunological biomarkers are altered in FMS. The activation of innate immune sensors is compatible with a neuropathy or virus-induced autoimmune diseases. Considering sex differences in the immune system and the clustering of FMS with autoimmune diseases, we hypothesize that the female predominance in FMS is due to a neuropathy-induced autoimmune pathophysiology. We invite the scientific community to verify the autoimmune hypothesis for FMS.

Keywords: autoimmune disease, central nervous system sensitization, fibromyalgia, pathophysiology, sex differences, widespread chronic pain

## INTRODUCTION

As long as the pathophysiology of the FMS is not elucidated, the diagnosis (Wolfe et al., 2016; Arnold et al., 2019) and the treatment (Macfarlane et al., 2017) will remain inadequate. Many consider FMS to be psychosomatic (Lami et al., 2018) and there are still physicians who do not recognize the disorder. Although the name indicates a fibromuscular affection and the syndrome is classified as a rheumatic disorder, FMS is treated as a neurological problem, in accordance with the currently most accepted hypothesis: central sensitization (Staud et al., 2009). The history of FMS not only reveals the confusion (Inanici and Yunus, 2004) but also the importance of (1) inflammation, (2) a neuropathic type of pain, (3) referred pain after irritation or damage of the paraspinal ligaments, (4) increased substance P levels in cerebrospinal fluid (CSF), and (5) an etiology of trauma and/or infection accompanied by mental stress, which are all consistent with neuroinflammation. Female predominance and clustering with autoimmune diseases were recognized in the historical review (Inanici and Yunus, 2004), but suggestions of autoimmunity markers were omitted (Jacobsen et al., 1990; Klein et al., 1992), despite being current at the time of the review. Still, it seems that autoimmune susceptibility accompanies FMS. We propose that FMS is a neuropathy-induced autoimmune disease directed to nervous tissue. As autoimmunity is sex biased (Beeson, 1994), the autoimmune hypothesis may explain the female prevalence observed in FMS.

#### Edited by:

Annie Duchesne, University of Northern British Columbia, Canada

#### Reviewed by:

Gillian Einstein, University of Toronto, Canada; Ilke Coskun Benlidayi, Cukurova University, Turkey; Larry Culpepper, Boston University, United States; Reinhild Klein, University of Tübingen, Germany

#### \*Correspondence:

Irene Meester, meesterirene@hotmail.com; elisabethd.meester@udem.edu

Gerardo Francisco Rivera-Silva, gerardo.rivera@udem.edu

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 29 November 2018; Accepted: 16 December 2019; Published: 17 January 2020

#### Citation:

Meester I, Rivera-Silva GF and González-Salazar F (2020) Immune System Sex Differences May Bridge the Gap Between Sex and Gender in Fibromyalgia. Front. Neurosci. 13:1414. doi: 10.3389/fnins.2019.01414

**Abbreviations:** AIRE, autoimmune regulator; CNS, central nervous system; CSF, cerebrospinal fluid; FMS, fibromyalgia syndrome; HLA, human leukocyte antigen; MS, multiple sclerosis; NK, natural killer cell; SLE, systemic lupus erythematosus; TCR, T cell receptor; TLR, Toll-like receptor; Th, helper T lymphocyte; Treg, suppressor T lymphocyte.

The focus of the paper is to present the biological data from which this hypothesis emerges, followed by how it may explain the central sensitization and the sleep alterations that characterize FMS. Next, we describe the mechanisms of immunological self-tolerance and how it can be breached, as well as the well-known sex differences in the immune system, which explain why women are more susceptible to developing certain autoimmune disorders. We reflect on the complexities of proving the hypothesis and offer suggestions for verifying it.

## FIBROMYALGIA: INTRODUCING THE AUTOIMMUNE HYPOTHESIS

Fibromyalgia syndrome is characterized by unexplained chronic (>3 months) widespread pain accompanied by moderate to severe sleep problems and/or fatigue (Arnold et al., 2019). Fatigue upon awakening has been associated with altered sleep wave patterns, especially a lack of slow-wave sleep (Roizenblatt et al., 2001). A myriad of additional symptoms tends to accompany the disease, among them cognitive difficulties, depression, irritable bowel, irritable bladder, restless legs, dry mouth and eyes, and altered sense perception (Arnold et al., 2019). Primary FMS is not accompanied by another chronic pain disorder, whereas secondary FMS develops as a co-morbidity of another dominant chronic disease, commonly an autoimmune disease (Häuser et al., 2015). FMS prevalence among the adult population ranges from 0.8 to 5% worldwide, depending on the geographical area, case definition, and assessment method (Johnston et al., 2013; Jackson et al., 2015). FMS occurs in the pediatric population, generally beginning with the onset of puberty (Gedalia et al., 2000), but the highest prevalence is among middle-aged women. The female-to-male ratio ranges from 1:1 to 30:1, but the worldwide average is about 3:1 in both the pediatric (Gedalia et al., 2000) and adult populations (Queiroz, 2013).

Although FMS is classified as a musculoskeletal disease, the currently most accepted hypothesis of pathogenesis, central sensitization (Staud et al., 2009), is neurobiology-based and supported by empirical and impartial evidence (Maestu et al., 2013; Sluka and Clauw, 2016). As the etiology and pathophysiology of FMS remain elusive, FMS treatment is directed to symptom management, which includes inhibition of an overreacting CNS (Macfarlane et al., 2017). In general, 30% of the patients report a 30% improvement because of treatment (Häuser et al., 2015). This modest efficacy suggests that the pharmacological treatment does not target the cause.

We hypothesize that FMS is a neuropathy-induced autoimmunity directed against nervous tissue. The autoimmune hypothesis provides a mechanistic explanation for the central sensitization hypothesis and thus the two hypotheses are compatible. Considering sex differences in the immune system, the autoimmune hypothesis may explain female predominance among FMS patients.

The autoimmune hypothesis emerged from the following observations. First, the epidemiological profile of FMS is similar to that of autoimmunity, as both peak among middle-aged women (Beeson, 1994). Second, FMS co-occurs with a cluster of autoimmune diseases, e.g., sicca syndrome, SLE, rheumatoid arthritis, irritable bowel syndrome, thyroiditis, interstitial cystitis/painful bladder syndrome, and restless legs syndrome. Autoantibodies 'specific' for the aforementioned autoimmune diseases tend to be shared rather than unique. When they are detected in FMS, a corresponding autoimmunity is diagnosed and FMS is redefined as secondary FMS (Hervier et al., 2009). Still, secondary FMS reveals autoimmune susceptibility (Buskila and Sarzi-Puttini, 2008; Giacomelli et al., 2013; Haliloglu et al., 2017). Specific antibodies for FMS have been reported (**Supplementary Table 1**), but they are neither consolidated nor generally accepted (Werle et al., 2001; Giacomelli et al., 2013). Third, there is overlap in the clinical profile of FMS and certain autoimmune diseases, with respect to complex genetic and environmental risk factors. The latter include infections (Smatti et al., 2019) and stress due to traumatic experiences (Sharif et al., 2018). Stress and certain personality characteristics associate positively with autoimmune diseases, FMS, and other chronic diseases in retrospective studies with selected controls (Martin et al., 1996; Lami et al., 2018). But whereas stress and personality are considered precipitating factors or consequences in autoimmune diseases (Mitsonis et al., 2009; Hassett and Clauw, 2010), they are interpreted as the cause in FMS (Lami et al., 2018) despite a lack of convincing evidence demonstrating a causal relation (Häuser and Henningsen, 2014). Similarly, there is no convincing evidence that a certain personality causes pain (Naylor et al., 2017).
Fourth, although routinely screened inflammatory biomarkers in FMS samples (such as C-reactive protein levels and erythrocyte sedimentation rate) tend to be within the normal clinical range, these and other immune markers differ significantly from those of healthy controls in a research setting (Kadetoff et al., 2012; Xiao et al., 2013; Pernambuco et al., 2015; Mendieta et al., 2016; Bäckryd et al., 2017; Ciregia et al., 2018). Although data on inflammatory biomarkers are not consistent among studies, correlate only weakly with clinical variables (Ernberg et al., 2018), or can be explained by comorbid conditions, the general impression is that chronic inflammation occurs in FMS (Ernberg et al., 2018). Similarly, routine leukocyte counts are within the normal range, but specific lymphocyte subgroups, not screened in clinical routine studies, are altered in FMS patients. Compared to age-matched healthy controls, female FMS patients had higher proportions of CD57+ natural killer cells (NK) (17.1% vs. 11.3%) and CD5+ B cells (6.46% vs. 2.5%) (Russell et al., 1999) but lower proportions of CD56+ NK (Landis et al., 2004). Case-control observational whole-genome expression studies among women revealed altered expression of immune pathways and markers of tissue destruction (Lukkahatai et al., 2015; Jones et al., 2016). These expression studies did not confirm gene polymorphisms
that had been identified in genome association studies (Park and Lee, 2017). The gene association studies often had a selection bias and did not clarify why the genetic susceptibility would only lead to a disorder later in life. In this respect, the HLA alleles (Branco et al., 1996; Yunus et al., 1999) form an exception, as their impact depends on an interaction between genetic and environmental factors. HLA genes have an essential role in the immune system. Thus, the emerging picture is that a combination of genetic predisposition, a precipitating event (infections, trauma, autoimmune diseases, or other causes of necrosis) (Jiao et al., 2015), and immune dysregulation due to psychological stress (Takahashi et al., 2018) may convert autotolerance or pre-existing occult autoimmunity into overt autoimmunity, but the autoreactive component remains elusive (**Figure 1**). Importantly, though the precipitating event may be transitory, autoimmunity is a response of the adaptive immune system and is chronic. The sex bias in the immune system may explain the female preponderance of FMS.

## WHY DOES CENTRAL SENSITIZATION OCCUR?

Pain perception depends not only on the pain stimulus, but also on the emotional and psychosocial state at a certain moment (Rhudy et al., 2010; Finnern et al., 2018). Both human and animal studies reveal greater pain sensitivity among females than males for most pain modalities (Rhudy et al., 2010; Kisler et al., 2016; Melchior et al., 2016; Aufiero et al., 2017; Kosek et al., 2018). The gender/sex bias in pain perception in various animal species denotes the importance of biological processes and thus sex differences therein. Unfortunately, pain studies aimed at aspects other than a sex or gender bias seldom report outcome variables according to sex or gender and only mention the proportion of males or females among study participants (**Supplementary Table 1**). As a consequence, although the sex-neutral neurophysiology of the pain pathway is well-described in several reviews (Basbaum et al., 2009; Zeilhofer et al., 2012; Peirs et al., 2015; Kendroud and Hanna, 2019) (**Figure 2**), little is known about sex differences in pain processing, except for modulation by sex-related hormones (Taleghany et al., 1999; Vincent and Tracey, 2008; Artero-Morales et al., 2018). These biological aspects are at least as important as the psychological aspect (Foo et al., 2017). Though sex differences in functional pain processing are interesting in themselves, for FMS the focus is on pathological pain processing, which has been explained with the central sensitization hypothesis.

Central sensitization was originally described as an increased electrophysiological activity in the dorsal horn in both a polyarthritic (Menétrey and Besson, 1982) and a post-injury male rat model (Woolf, 1983). Importantly, in both models a peripheral tissue injury triggered alterations in dorsal horn neurons so that they amplified pain signaling in response to normal input, even from low-threshold Aβ mechanoreceptors (Woolf, 1991). Moreover, in animal models, sex differences in pain processing, i.e., at a biological level, could only be detected after neuropathy (Sorge et al., 2015) and involved the immune system.

Central sensitization has become the most accepted hypothesis to explain FMS, mainly because peripheral sensitization due to autoimmunity has been discarded on the grounds that FMS does not comply with the following criteria of inflammation: (a) the presence of blood inflammatory biomarkers according to common clinical criteria, and (b) responsiveness to nonsteroidal anti-inflammatory cyclooxygenase inhibitors. The former is debatable once research findings on immunological biomarkers (as mentioned in the previous section) are considered. The latter only indicates that FMS is not cyclooxygenase dependent; it neither discredits the involvement of other inflammatory and immune response pathways nor nullifies the possibility of neurogenic pain. Failure to comply with inflammatory criteria has not prevented the recognition of Graves' disease (Baruah and Bhattacharya, 2012) and MS as autoimmune diseases (Luzzio and Dangond, 2018). Furthermore, the need for inflammatory biomarkers does not apply to secondary FMS because the accompanying autoimmune disease will generate not only inflammatory biomarkers but also a continuous nociceptive input. In secondary FMS, the CNS changes appear to improve when nociceptive input is removed (Sluka and Clauw, 2016). The point is that the nociceptive input may also be present in primary FMS, but we are unaware of it. FMS pain is said to be idiopathic, but the burning, nagging, excruciating pain that is characteristic of FMS (Inanici and Yunus, 2004) is consistent with neuropathy. Histopathological studies have been performed on muscular and connective tissues (Inanici and Yunus, 2004), but not on dorsal root ganglia and central nervous tissue along the pain pathway. History teaches that myasthenia gravis was once considered an idiopathic paralysis. Only after the involvement of the immune system had been established were pathological immune infiltrates observed in muscle tissue (Hughes, 2005). 
There are indications that neuropathy is present in FMS patients (Üçeyler et al., 2013; Ramírez et al., 2015; Krumina et al., 2019). FMS pain may reflect a neuropathy of a currently unknown etiology, which is compatible with the central sensitization hypothesis. Latremoliere and Woolf (2009) reviewed neuroplasticity at the molecular and cellular level to explain central sensitization in pain hypersensitivity. They underscore the fundamental contribution of an inflammatory or neuropathic event to initiate the central sensitization.

projections from 2nd order neurons toward thalamus (Thal) and cortical areas; yellow projections, projections for 3rd order neurons to cortical areas for awareness; green projections, descending projections that modulate the pain pathway. I-X, Rexed laminae within the gray matter of the spinal cord; ↔, Integration of modulatory ascending and descending information in the dorsal horn Rexed laminae I-V (DH LI-V). 5-HT, serotonin; Aα, Aβ, Aδ, and C, incoming nerves with decreasing levels of myelination; Amyg, amygdala; CC, cingulate cortex; CCKBR, cholecystokinin B receptor; CGRP, calcitonin gene-related peptide; DRG, dorsal root ganglia; Glu, glutamate; Ins, insula; LC, locus ceruleus; Nav, voltage-gated sodium channels; NE, norepinephrine; NGF, nerve growth factor; PAG, periaqueductal gray; RN, raphe nucleus; RVM, rostroventral medulla; S1, S2, somatosensory areas 1 and 2; SP, substance P; TRP, transient receptor potential sensitive to nociceptive stimulus; µR, µ-opioid receptor with high affinity for enkephalins and beta-endorphin. Image based on (Tracey and Mantyh, 2007; Allen Human Brain Atlas, 2010; Ossipov et al., 2010).

Rat spinal cord slices exposed to pro-inflammatory cytokines display patch-clamp recordings that are congruent with the central sensitization hypothesis (Kawasaki et al., 2008), suggesting that the immune system plays a role; but, as is too often the case, proper controls were missing. A mouse study revealed the essential role of the immune sensor TLR8 in the maintenance of neuropathic pain. After nerve injury, TLR8 levels increased in the small IB4+ neurons in the dorsal root ganglia and in the ipsilateral dorsal horn. Subsequent intrathecal or intradermal injection of TLR8 agonists (VTX-2337 and miR-21) induced mechanical allodynia and increased excitability of neurons in the dorsal root ganglia, accompanied by the expression of inflammatory mediators like interleukin 1 beta, interleukin 6, and tumor necrosis factor alpha. These effects were absent or reduced in TLR8 knock-out mice (Zhang et al., 2018). The aforementioned
study did not report on sex differences, but in humans, TLR8 is an X-linked gene that may escape X-inactivation, leading to a dosage difference between men and women (Umiker et al., 2014). TLR8 is especially important because of its pro-autoimmunity potential (Peng et al., 2005). A recent study with BALB/c and C57BL/6 mice suggests that circulating immunoglobulin G (IgG)-type immune complexes may directly mediate hyperalgesia at the level of dorsal root ganglia, where macrophages and neurons have receptors for IgG1 and IgG2b (Bersellini Farinotti et al., 2019). If this mechanism were to be confirmed in humans, it would make women more vulnerable, because women are more inclined to humoral adaptive immune responses than men (see Section Sex Differences in the Immune System). Another mouse study revealed sex differences in pain processing. Intrathecal stimulation of the immune sensor of danger TLR4 induced mechanical allodynia in male, but not in female mice. At a cellular level, microglia in the spinal cord proliferated in both sexes after peripheral nerve injury, but only male microglia upregulated the immune sensor of danger, P2RX4 (Mapplebeck et al., 2016). P2RX4 detects nucleotides, mainly ATP, released after CNS stress or injury (Di Virgilio and Sarti, 2018). Activation of P2RX4 receptors leads to release of pro-inflammatory interleukin 1β and brain-derived neurotrophic factor, which promote pain hypersensitivity. The inhibition of microglia in the spinal cord reversed allodynia only in male rodents. The female pain process requires more investigation (Sorge et al., 2015). As for humans, sex-biased pain in knee osteoarthritis could be explained by differences in immune signaling molecules, interleukin 8 and monocyte chemoattractant protein-1 (Kosek et al., 2018). Interleukin 8 is one of the immunological biomarkers most consistently associated with FMS (Kosek et al., 2015). 
Thus, sex differences in pathological pain processing mainly involve immune sensors and immune cells. As we will see in Section "Sex Differences in the Immune System," the immune system itself presents sex differences.

Besides, the sex bias of central sensitization remains to be elucidated. We hypothesize that autoimmunity directed to the CNS, either toxic or stimulatory, explains not only central sensitization, but also the female predominance of FMS and the lack of peripheral inflammatory biomarkers.

## WHY IS SLOW-WAVE SLEEP MISSING?

The neurobiology of sleep and the regulation of the daily sleep-wake cycle have been reviewed with a clinical perspective by España and Scammell (2011). Gender and sex differences in sleep health have been recognized, but major gaps continue to exist in the areas of sleep regulation, the epidemiology of sleep problems, diagnosis, and treatment (Mallampalli and Carter, 2014). Two sexually dimorphic areas, the preoptic area and suprachiasmatic nucleus (Hofman and Swaab, 1989; Hofman et al., 1996), have been associated with sleep problems (España and Scammell, 2011). However, it seems that the sexually dimorphic nucleus of the preoptic area is dedicated to sexual and parental behavior (Rosenblatt et al., 1996) rather than to sleep regulation. More recently, another brain area, the anterior cingulate cortex, has been implicated in both primary insomnia (Yan et al., 2018) and FMS (Jensen et al., 2009), but neither study analyzed sex differences. Still, sleep differs between men and women, and this difference may contribute to a sex-biased risk for sleep disorders (Mallampalli and Carter, 2014) and, consequently, for FMS.

Polysomnography studies have revealed a variety of sleep disturbances in FMS patients (Choy, 2015). Especially, the phase of deep sleep is reduced and otherwise affected. Instead of the synchronized and therefore high-amplitude slow waves characteristic of the δ rhythm, desynchronized low-amplitude high-frequency waves characteristic of the α rhythm interfere, generating a pattern known as α-δ sleep (Choy, 2015). Deep sleep is considered to be important for memory consolidation and restoration processes. Abnormal deep sleep and other sleep disturbances may contribute to the development of chronic pain (Finan et al., 2013) and FMS (Mork and Nilsen, 2012).

Taken together, the difficulty in maintaining slow-wave sleep and the peak symptomatology upon awakening suggest that the regulation of the different sleep phases may be involved. The well-known somnogenic adenosine seems to have a special role in the regulation of the slow-wave sleep phase, as revealed by studies with male C57BL/6 mice (Oishi et al., 2017). Extracellular adenosine accumulation activates the adenosine A<sub>1</sub> receptor, which inhibits arousal and induces slow-wave sleep. Another adenosine receptor, the A<sub>2A</sub> receptor, can induce slow-wave sleep, but can be overruled by a motivational stimulus, like hunger or stress (Lazarus et al., 2019). These receptors are found in the nucleus accumbens. Thus, the nucleus accumbens, already known for being part of the reward circuit, has a role in the control of the slow-wave sleep phase via adenosine receptors. A variety of enzymes and adenosine and nucleotide transporters in both neurons and astroglia are important for extracellular adenosine levels in the micro-environment of the nucleus accumbens. Interestingly, the A<sub>2A</sub> receptor also regulates naive T cell development in the thymus and the maintenance of these cells in the periphery (Cekic et al., 2013). In summary, adenosine, associated metabolites, and the enzymes and transporters involved may be important in slow-wave sleep and FMS, but further research is necessary.

## AUTOIMMUNE DISEASES: A CONFUSED IMMUNE SYSTEM

## Activation and Tolerance in the Immune System

The definition of autoimmune disease is relatively straightforward: a "disease that results when the immune system [..] mistakenly attacks the body's own tissues" (Wein, 2013). In practice it is more complicated, as non-symptomatic healthy persons tend to have autoantibodies (Tan et al., 1997), which are eliminated before doing harm (Nagele et al., 2013). For a disease to be classified as autoimmune, there must be detectable autoantibodies or autoreactive T-cells in amounts sufficiently higher than in non-symptomatic
controls and they must explain the symptoms or present a strong epidemiological association with the symptoms. These requirements are fulfilled for all recognized autoimmune diseases, most often because of autoantibodies whether or not in combination with immune infiltrates (Dornmair et al., 2003). In autoimmune diseases that cluster with FMS, such as rheumatoid arthritis, Sjögren's syndrome, and SLE, specific and diagnostic autoantibodies have been identified. Autoantibodies against cryptic nuclear, cytoplasmic and proteolipid protein antigens are shared in various autoimmune diseases (Suurmond and Diamond, 2015; Fayyaz et al., 2016).

Instead of being due to one specific cause, autoimmune diseases develop when risk factors accumulate. Central to autoimmune diseases is the loss of tolerance to autoantigens. Tolerance of the immune system is the non-activation of the immune response. Upon contact with a substance, particle or pathogen our defense system must decide to attack or to be tolerant. Essential for this decision is recognition, which is different for the three levels of protection of our defense system.

The first line of defense is a biophysicochemical barrier that needs neither activation nor recognition. The second level of protection is provided by the innate immune system, a fast-reacting system without memory. Memory is not required, as similar cell types or humoral factors share the same receptors for dangers and will attack the same patterns of danger. This is in contrast with the third level of protection, provided by T lymphocytes and B lymphocytes. Lymphocytes have unique receptors, so that few will react upon a pathogenic invasion. T lymphocytes are activated by antigen presenting cells of the innate immune system, which provide instructions according to the danger patterns that were encountered. When selected and activated, lymphocytes generate a clone for future memory and effector cells that are either cytotoxic (Tc) or differentiate into a variety of mediators (helper T cells, Th) that potentiate different components of the innate and adaptive immune systems, while effector B lymphocytes (B) become plasma cells (plasma) that produce antibodies. If a tolerance response is erroneously converted into an active immune response, the memory of the adaptive immune response will respond with immune hypersensitivity upon future challenges.

### Activation and Tolerance in the Innate Immune System

The tolerance mechanism of the innate immune system is passive; i.e., the innate immune system is only activated upon the detection of a limited number of non-self molecular patterns associated with pathogens (PAMPs) or endogenous damage (DAMPs). To this end, innate immune cells share a limited set of pattern recognition receptors (PRRs). Depending on the pattern recognized, an appropriate action is initiated (Hoebe et al., 2004). Everything that is not a PAMP or DAMP is automatically tolerated by the innate immune system. In general, loss of auto-tolerance is not due to breaches of tolerance of the innate immune system.

## Activation and Tolerance in the Adaptive Immune System

Both recognition/activation and tolerance are more complex in the adaptive immune system (Schwartz, 2012). The adaptive immune system is able to recognize a huge number of structures (known as antigenic determinants or epitopes) because of an enormous variety of specific receptors, T-cell receptors (TCRs) and B-cell receptors, that differ among lymphocytes. As a consequence, when a noxious antigen invades the body, only few specific lymphocytes will react. Clonal proliferation of a triggered lymphocyte generates a specific 'army' composed of effector lymphocytes (to eliminate current danger) and memory lymphocytes to provide a faster and stronger immune response for a future challenge with the same antigen.

As the specificity of the TCR and B-cell receptors is generated at random, these receptors may recognize harmless xenoantigens and autoantigens (Klein et al., 2014). To avoid allergies and autoimmune diseases, the adaptive immune system has three control mechanisms: (a) central tolerance, (b) peripheral tolerance, and (c) major histocompatibility complex-restricted activation of T cells, i.e., a T cell can only be activated when its cognate antigen is presented by a major histocompatibility complex molecule.

## HLA Presents Protein-Associated Epitopes to the Adaptive Immune System

In humans, the major histocompatibility complex is known as HLA. The HLA genes of class I (A, B, and C) and II (DP, DQ, and DR) are the most polymorphic genes of the human genome and provide a molecular identity to an individual. All nucleated cells of the human body express HLA-I, although leukocytes express it in larger amounts, while HLA-II is expressed by specialized antigen-presenting cells. HLA-I presents antigens to CD8+ cytotoxic T cells, while HLA-II presents antigens to CD4+ helper T cells (Th) (Hoebe et al., 2004). As antigen presentation to T cells is HLA-restricted, only the body's own cells will be protected. The polymorphism of the HLA molecules has an impact on the quality of the immune response. Certain HLA-antigen complexes facilitate the immune response of an individual and others do not. Thus, the individuals of a population differ in their protective capacity and their susceptibility to autoimmune diseases. The antigen presentation process is designed for large proteins, preferentially in a particulate presentation. Therefore, electrolytes, sugars, lipids (e.g., sex hormones), nucleic acids, peptides (e.g., neuropeptides), and other small molecules (e.g., neurotransmitters) are passively tolerated. On the other hand, tolerance to protein autoantigens is an active, energy-consuming, and highly controlled process with central and peripheral mechanisms.

## Mechanisms of Central Tolerance

Central tolerance is established in the primary lymphatic organs, most importantly in the thymus, where developing T cells, or thymocytes, rearrange their genome at random to express a unique TCR, either xenoreactive or autoreactive. The thymic epithelial cells function as 'teachers' of the thymocytes. To this end, the thymic epithelial cells express the autoimmune regulator (AIRE), a transcription factor
that facilitates the expression of organ-specific autoantigens in the thymus. The autoantigens are presented in combination with HLA to the thymocytes. To survive, thymocytes should recognize HLA (positive selection) but not recognize autoantigens (negative selection). Simplified, autoreactive thymocytes have two fates (Klein et al., 2014): (1) they die by apoptosis, in case of a strong and long antigen-TCR interaction, or (2) they become regulatory T cells (nTreg), when the antigen-TCR interaction is of intermediate strength and length (Azzam et al., 2001; Ohkura et al., 2013). Thymocytes that do not react with autoantigens, or react only briefly and with low affinity, are liberated as naïve Th or cytotoxic T cells to protect against danger in the periphery.

As not all autoantigens are expressed in the thymus in sufficient amounts, autoreactive naive T cells may circulate in the periphery, a phenomenon known as ignorance.

## Mechanisms of Peripheral Tolerance

Peripheral tolerance complements central tolerance by any of the following mechanisms: (1) clonal deletion of autoreactive T cells by apoptosis, (2) the peripheral induction of Treg (iTreg) under the influence of transforming growth factor beta, (3) anergy, i.e., a reversible inactive state of the T cell when the antigen presenting cells do not provide a costimulatory signal (Mueller, 2010), and (4) ignorance, i.e., the amount of auto-antigen is insufficient to induce either tolerance or an immune response. The choice among these tolerance options depends on the abundance, strength, and duration of the TCR-antigen interactions, but it is conditional on the absence of a danger signal. As the T cell has no information about the type of danger recognized by its at-random-generated TCR, this information is provided by the antigen presenting cell. Depending on the type of pattern recognition receptor activated, the antigen presenting cell provides a costimulatory signal and instructions about the desired type of response (Th1, Th2, or Th17 response) via cytokines (Kaiko et al., 2008). Apart from these major tolerance processes, immune responses are fine-tuned by many stimulatory, inhibitory, and modulatory molecules (e.g., CD5) (Sigal, 2012) and cells, e.g., NKT cells (Dasgupta and Kumar, 2016) within the immune system, as well as neuroendocrine peptides and hormones beyond the immune system (Carniglia et al., 2017). In summary, the adaptive immune system mainly attacks large protein (>10 kDa) antigens, and tolerance to large self-proteins is a complex and highly regulated process.

## Loss of Tolerance and Development of Autoimmune Diseases

Tolerance may be breached for any of the following reasons or a combination of them. First of all, molecular mimicry between an auto-antigen and a pathogenic antigen may confuse the immune system. A well-known example is the Guillain-Barré syndrome, which is due to autoantibodies directed to peripheral nerves because of cross-reaction between nerve autoantigens and certain pathogen antigens, especially those of Campylobacter jejuni, Epstein-Barr virus, and cytomegalovirus (Jacobs et al., 1998).

Secondly, bystander autoantigens that are presented in combination with danger signals due to concurrent infections or physical trauma may overcome tolerance, which may play a role in the development of autoimmune diseases (Gestermann et al., 2018). The bystander mechanism may also explain how a 'founder' or primary autoimmune disease may lead to secondary autoimmune diseases. Cell lysis due to inflammation of the primary autoimmune disease liberates multiple cryptic autoantigens (mitochondrial antigens, phospholipid antigens, ribonucleoproteins, and other cytoplasmic autoantigens) that provide DAMPs. And the cycle repeats itself: novel autoantigens in combination with danger signals may lead to multiple autoreactive clones of lymphocytes. Cryptic autoantigens liberated during necrosis are shared by different autoimmune diseases (Suurmond and Diamond, 2015). Although the etiology is not proven, latent viral infections associate with MS (Virtanen et al., 2014). A reactivating latent viral infection may cause not only minor damage to nervous tissue but also a breach in tolerance, either by molecular mimicry or by the bystander effect. Multiple cryptic autoantigens are liberated due to repeated tissue damage, so that polyautoimmunity develops. In MS, oligoclonal antibodies are directed against a mixture of autoantigens of cellular debris (Brändle et al., 2016). Initially, MS can be more or less controlled by interferon treatment (Multiple Sclerosis Therapy Consensus Group, 2008), which has antiviral activity, until the polyautoimmune disease has become autosustainable and the relapsing-remitting MS patient turns into a secondary-progressive MS patient. In MS, just as in FMS, inflammatory biomarkers of the blood tend to be within the normal range (Luzzio and Dangond, 2018), as the inflammatory process is localized, i.e., behind the blood-brain barrier. The aforementioned process probably plays a role in many autoimmune diseases but has been recognized for only a few. 
The clinical importance is high because, although antimicrobial treatment may resolve the initial infection, it may be too late once the autoimmune disease has developed (Mukherjee et al., 2018). Alternatively, it explains why steroid anti-inflammatory treatment worsens (Clark et al., 1985) rather than improves the symptoms.

Thirdly, there is a genetic predisposition for most autoimmune diseases, especially in the HLA genes, probably as a consequence of affinity issues between the HLA molecule and the presented antigen (Bodis et al., 2018). HLA-DR2, DR3, DR4, DQ6, and DQ8 are associated with SLE, sicca syndrome, and rheumatoid arthritis (Mangalam et al., 2013). Furthermore, gene polymorphisms in complement factors may diminish autoantibody clearance (Macedo and Isaac, 2016).

Fourthly, CD5 expression fine-tunes the adaptive immune response in a complex way (Sigal, 2012). CD5 on T or B lymphocytes may either facilitate or inhibit an adaptive immune response depending on avidity issues (Domingues et al., 2016). In general, CD5+ B cells have been related with increased susceptibility for autoimmune disease, while the opposite applies to T cells (Tarakhovsky et al., 1995; Pers et al., 1999; Hawiger et al., 2004).

Fifthly, stress dysregulates the immune system (Sharif et al., 2018). Acute stress enhances catecholamines and circulating leukocytes, which facilitates a stronger immune response. Chronic stress, on the other hand, is immunosuppressive. The stress hormone cortisol diminishes leukocyte numbers and suppresses leukocyte function (Dhabhar, 2008). The stress hormones epinephrine and cortisol induce a rapid
leukocyte redistribution. Although the exact mechanisms of this phenomenon remain to be elucidated, clinical and epidemiological data convincingly and consistently reveal an association between chronic stress and vulnerability to, and resurgence of, infections and autoimmune diseases (Reiche et al., 2004).

Sixthly, the immune response is age dependent (Elisia et al., 2017) and age is a risk factor for loss of central tolerance. In over 60% of autoimmune cases, the onset of symptoms was in the fourth and fifth decade of life, with a median onset at 37.5 years of age (Euesden et al., 2017). A detection bias may play a role in the age effect. Autoimmune disease may not be apparent at the onset, because its progression is slow and its biochemical, physiological, or visual detection is not obvious. For example, Sjögren syndrome, an autoimmune disease affecting the glands, becomes apparent in older individuals, because destruction of the glands is slow and symptoms of dryness are only experienced when most of the glands are destroyed (Hsu and Mountz, 2003). Importantly, the most essential organ for central tolerance, the thymus, undergoes profound age-associated atrophy (Lynch et al., 2009). Thymic decline is clearly associated with the presence of sex steroids (Heng et al., 2005).

And finally, being female is a risk factor for many autoimmune diseases. A sex bias in the immune system is well-documented and will be described in more detail in the next section.

The unfortunate co-occurrence of the aforementioned risk factors elicits an autoimmune disease. It is important to highlight that the initial trigger may be transitory, but the induced autoimmune response is chronic due to the memory of the adaptive immune response. The aforementioned risk factors all apply to FMS (see Section Fibromyalgia: Introducing the Autoimmune Hypothesis and **Figure 1**).

## SEX DIFFERENCES IN THE IMMUNE SYSTEM

The burden of infectious diseases and the incidence of cancer, allergies, and autoimmune disease differ between men and women. Though this phenomenon can be partially explained by a differential exposure of men and women to environmental risk factors, sex differences in the immune system are evident and well-known. The expression of many immune system-associated proteins and the activation of immune cells differ between the sexes, as has been reviewed elsewhere (Klein and Flanagan, 2016; Rainville et al., 2018). Well-known is a Th1 bias in males and a Th2 bias in females (Klein and Flanagan, 2016). The latter facilitates the natural passive transfer of immunity from mother to fetus, as immunoglobulins G are able to cross the placenta to protect the fetus. Furthermore, as of adulthood, the number of innate and adaptive leukocytes is higher in females than in age-matched males (Urquhart et al., 2008), except for Treg and innate lymphoid cells, including NK, which are more abundant in males (Klein and Flanagan, 2016). NK have functions beyond cytotoxicity, including an important role in the regulation of immune homeostasis and inflammation (Vitale et al., 2005). During aging, the diverse NK population changes gradually; the proportion of immunoregulatory CD56hi NK diminishes in favor of highly differentiated cytotoxic CD57<sup>+</sup> NK. This redistribution may explain functional changes in NK cells with healthy aging (Gayoso et al., 2011). The dysregulation of NK and NKT cells is associated with allergies and autoimmune diseases (Nielsen et al., 2013; Tahrali et al., 2018). Here, we highlight that the cellular immune response tends to predominate in men and the humoral immune response in women. Furthermore, adult men tend to have more immunoregulatory cells than women.

## Differential Effect of Sex Hormones on Leukocyte Behavior

Sex hormones differentially influence the behavior of leukocytes. There is a differential distribution of sex hormone receptors among leukocytes. For example, CD4+ T cells have high levels of estrogen receptor α but low levels of estrogen receptor β, whereas the opposite applies for Treg. CD8+ Tc have low expression levels for both types of estrogen receptors, whereas NK express both receptors highly. In mice, high estrogen levels signal via estrogen receptor α and induce antiviral type 1 interferon and NK cells, as well as Treg. Signaling via estrogen receptor β has the opposing effect and results in diminished Treg activation. Furthermore, the transmembrane G protein-coupled estrogen receptor, which induces rapid signaling, is highly expressed by certain leukocytes (CD4+ Th, Treg, B cells, and macrophages) (Koenig et al., 2017). Estrogen diminishes neutrophil infiltrates and protects against the harmful effects of the innate response (Shimizu et al., 2008; Ritzel et al., 2013). Progesterone and testosterone promote monocyte recruitment, and an androgen receptor antagonist reduced monocyte recruitment (Toyoda et al., 2012; Sutti and Tacke, 2018). On the other hand, testosterone treatment reduced immunoglobulin M and immunoglobulin G production and as such diminishes peripheral humoral adaptive immunity (Kanda et al., 1996). The effects of sex hormones are complex because of opposing effects depending on concentration (Hughes, 2012), the variety in metabolic forms that modulate the immune response, and the relative concentrations of various sex hormones. Our intention is not to unravel the intricate interactions between the sex hormones and the immune system, but to highlight their existence. The area requires more investigation, but it seems that testosterone promotes central tolerance. 
Taken together, testosterone and progesterone downregulate peripheral (humoral) adaptive immunity, but facilitate peripheral innate immunity (Hughes and Clark, 2007; Lai et al., 2012), whereas the opposite applies for estrogen as it is associated with peripheral innate immune suppression, stronger humoral responses, and weaker central tolerance (Kovats, 2015).

## Sex Differences in Immune Tolerance

As mentioned above, the thymus has an essential role in central tolerance. Thymic involution is different between the sexes, especially after the onset of puberty, with a more prominent decline in males than in females, so that the adult female thymus contains more thymocytes and has a higher thymic output than the male thymus (Gui et al., 2012). However, the T cells liberated from the female thymus may have been less well 'educated' as the female thymic epitheliocytes express less AIRE and fewer autoantigens (mRNA and protein) than the male ones (Dragin et al., 2016). Low expression levels of AIRE associate with autoimmune disease (Sato et al., 2002; Liu et al., 2014).

The sex hormones, estrogen and dihydrotestosterone (the main active metabolite of testosterone), regulate AIRE expression in opposite directions. At physiological doses, estrogen diminishes AIRE expression, whereas dihydrotestosterone increases AIRE expression (Dragin et al., 2016). Dihydrotestosterone treatment in an experimental animal model of MS upregulated AIRE and tissue-specific antigen expression in the thymus, improved negative selection of autoreactive T cells, and diminished the severity of autoimmune disease (Zhu et al., 2016). A cross-sectional population-based study revealed that a higher estradiol/testosterone ratio was significantly associated with autoimmune thyroid disease among Chinese men (Chen et al., 2017). Interestingly, a clinical phase I/II pilot study among 12 female FMS patients revealed that a 28-day treatment with testosterone gel significantly decreased pain, stiffness, and fatigue (White et al., 2015). Altogether, the efficacy of the thymus depends strongly on age and sex hormones. Male sex hormones seem to compensate for thymic involution with an increased AIRE expression, whereas female sex hormones contribute to diminished tolerance and increased vulnerability to autoimmune diseases.

## Sex Chromosomes and Immune-Associated Genes

The major genetic difference between men and women lies in the sex chromosomes. Men are XY, whereas women are XX. To enable the pairing of the X and Y chromosomes during male meiosis, small pseudo-autosomal regions are present at the ends of both the X and Y chromosomes. In the pseudo-autosomal regions, the X and Y chromosomes encode the same genes (Mangs and Morris, 2007). For genes outside the pseudo-autosomal regions, males express the genes of their single X chromosome, whereas female cells undergo random X inactivation as a dosage compensation mechanism. X inactivation is clonally maintained and generates a functional mosaic organism for X chromosome-encoded genes (Rubtsov et al., 2010). Importantly, X-chromosome inactivation is not an all-or-none phenomenon; about 10–15% of the X chromosome-encoded genes escape X-inactivation in humans, and a mouse model of accelerated aging revealed increased reactivation with age (Berletch et al., 2011).

Although it has been stated that the X chromosome encodes a disproportionally large number of immune-associated genes (Bianchi et al., 2012), so far no scientifically sound supporting evidence has been published. Still, the increased susceptibility to immune hypersensitivities of men with Klinefelter's syndrome (XXY) or women with Turner syndrome (X-) reveals the importance of the dosage of X-linked immune system-associated genes (Mortensen et al., 2009; Sawalha et al., 2009). An elegant study that used the four genome model (XXSry+, XYSry+, XX, XYSry<sup>−</sup>) in two gonadectomized mouse models for autoimmune disease demonstrated a dosage effect of the X chromosome in SJL mice. Interestingly, the dosage effect was not observed in C57BL/6 mice (Smith-Bouvier et al., 2008). The genetic background of C57BL/6 mice contrasts with that of SJL mice in terms of susceptibility to murine cytomegalovirus and autoimmune disease. In contrast to SJL, C57BL/6 has a bias toward a Th1 response and high NK activity (Sellers et al., 2012; Song and Hwang, 2017). This cellular immune response protects against viral infections.

Two X-linked genes are especially associated with humoral autoimmune disease to cryptic autoantigens. TLR7 and TLR8 (both in band Xp22.2) encode endosomal immune sensors that sense microbial and endogenous RNA (Umiker et al., 2014). TLR7 expression displays a dosage disequilibrium in biallelic B lymphocytes of women and men with Klinefelter's syndrome. These biallelic B lymphocytes switch more easily to IgG (Souyris et al., 2018), which is consistent with SLE symptom severity in TLR7+ as compared to TLR7-deficient C57BL/6 mice (Desnues et al., 2014). In a 564Igi mouse model of SLE, a dosage difference in TLR8 determined the sex bias in anti-RNA IgG antibodies, which were higher in female than in male mice (Umiker et al., 2014). 564Igi mice are especially susceptible to autoimmunity because of diminished somatic hypermutation (McDonald et al., 2017). The release of miR-21 due to neuropathy stimulates TLR8 signaling in the dorsal root ganglia, which leads to hyperexcitability and pain (Zhang et al., 2018). Importantly, TLR8 activation can reverse tolerant Treg into aggressive forms (Peng et al., 2005). Thus, X-linked RNA immune sensors may be activated in neuropathy and favor autoimmunity rather than immune tolerance.

## AUTOIMMUNITY TO THE NERVOUS SYSTEM. WHAT TRIGGERS IT OFF?

The CNS used to be considered an immunoprivileged organ and therefore hardly susceptible to autoimmune issues. This viewpoint has changed over the last two decades upon the detection of autoantibodies against a variety of nervous system autoantigens. As expected, most autoantibodies are directed against large, complex protein autoantigens (**Table 1**) rather than against small non-protein molecules. Currently, among patients with mental illnesses, the serum prevalence of autoantibodies against nervous tissue antigens is 11–17%, which may be an underestimation (Graus and Santamaría, 2017).

After neuropathy (due to infection or a lesion), intracellular macromolecules that previously were encrypted autoantigens may be exposed and targeted by autoantibodies (Totsch and Sorge, 2017). The latter probably occurs in MS, where oligoclonals target ubiquitous intracellular proteins (Brändle et al., 2016). Oligoclonal bands are interpreted as immunoglobulins that are produced intrathecally, i.e., inside the CNS (Luzzio and Dangond, 2018). Oligoclonal bands occur in 95% of MS patients (Halbgebauer et al., 2016). Other targets of MS autoantibodies are myelin-associated autoantigens (Link et al., 1990) and viral antigens (Virtanen et al., 2014). Although no specific virus is considered to be the causative agent of MS, viruses may be direct or indirect risk factors, the latter via molecular mimicry and/or bystander activation (Virtanen and Jacobsen, 2012), as described in Section "Loss of Tolerance and Development of Autoimmune Diseases." Various psychiatric diseases are considered to be caused by either an autoimmune process or an infection (Singh and Trevick, 2016; Dubey et al., 2018). We propose that in many cases both occur; the infection would be the initiating event and autoimmunity a consequence. Still, they may occur simultaneously, especially when opportunistic pathogens are involved. A variety of herpes viruses are opportunistic, pandemic, and neurotropic. Depending on the geographic location, 40–100% of the adult population is infected. A primary infection establishes a lifelong latent infection, which reactivates intermittently without obvious disease symptoms, except in immunocompromised persons. Stress may be a trigger for 'asymptomatic' reactivation. Among herpes viruses, cytomegalovirus seems especially apt to alter the immune response into autoimmunity (Andersen and Andersen, 1978; Varani and Landini, 2011; Halenius and Hengel, 2014), while Epstein-Barr virus-transformed lymphocytes tend to produce autoantibodies (Garzelli et al., 1984). If FMS etiology involves neuropathy caused by reactivating latent pathogens, the unresponsiveness to corticosteroid treatment would be understandable. Considering the heterogeneity of FMS patients, other infections or neuropathic events should not be ruled out as possible triggers of autoimmunity.

TABLE 1 | Autoantibodies against nervous tissue autoantigens.

1, Koga et al., 1998; 2, Link et al., 1990; Raddassi et al., 2011; Virtanen et al., 2014; Brändle et al., 2016; 3, Kruse et al., 2015; 4, Noridomi et al., 2017; 5, Fukata et al., 2018; 6, Levite and Ganor, 2008; 7, Hiyama et al., 2010; 8, Jarius and Wildemann, 2013; 9, Sabater et al., 2014; 10, Pang et al., 2017; 11, Caturegli et al., 2005.

## THE MISSING PIECE: EVIDENCE OF AUTOIMMUNE COMPONENTS SPECIFIC FOR FMS

Autoantibodies against intracellular antigens and against nervous and muscle tissue have been reported in FMS patients (**Supplementary Table 1**), but their role in FMS pathogenesis is controversial. We suggest screening nervous tissue involved in the pain pathway, both CNS (including the pituitary and pineal glands) and peripheral nervous tissue (dorsal root ganglia), with patient samples (blood and CSF) to complete the missing evidence (**Figure 1**). A variety of conceptual and technical issues may complicate the detection of autoantibodies. The lack of tissue lesions should not be interpreted as the absence of autoimmunity, as autoantibodies may be stimulatory, as in Graves' disease (Yeung and Habra, 2018). Screening for autoantibodies should not be limited to blood, as autoantibodies or oligoclonals may be limited to CSF when neuropathic symptoms predominate (Luzzio and Dangond, 2018). Furthermore, pleocytosis of leukocytes in CSF should be evaluated. Autoantibody screening on certain tissues may yield false negatives when the autoantibodies are directed against tissues other than the ones that are screened. Screening on animal tissues may yield false negatives when the human autoantigens are sufficiently different from the animal forms. Screening on fixed tissues may yield false negatives because the appropriate antigen retrieval method was not applied. Also, the autoimmune response may be cellular rather than humoral. A conceptual or interpretive issue is posed by prodromal autoantibodies; tissue destruction mediated by prodromal autoantibodies remains asymptomatic until the overcapacity of the targeted organ has been lost (Arbuckle et al., 2003; Hayashi et al., 2008; Haller-Kikkatalo et al., 2017). During the prodromal period, autoantibodies run the risk of being interpreted as false positives. Longitudinal follow-up studies of patients with prodromal autoantibodies would be interesting.
And finally, because of the heterogeneity among FMS patients, a certain etiology or pathogenesis may be limited to a subgroup of FMS patients (Jacobsen et al., 1990; Purnamawati et al., 2018). The worst scenario would be that the detected pathogenesis is discarded because it does not apply to a sufficiently high proportion of FMS patients. To avoid this situation, stratification or clustering of FMS patients is recommended. Despite these difficulties, the challenge is not insurmountable; the detection of anti-IgLON5 is exemplary (Sabater et al., 2014). We recommend a similar screening technique to verify whether an autoimmune process is involved in the pathogenesis of FMS.

## CONCLUSION

The clinical profile of FMS displays a strong overlap with certain autoimmune diseases. In fibromyalgia, physical or mental stress may constitute a precipitating factor or a consequence rather than a cause, similar to the situation in autoimmune diseases. Stress may debilitate the immune system and allow for reactivation of a latent (viral) infection, which may cause neuroinflammation or neuropathy and facilitate autoimmune phenomena. However, different from most autoimmune diseases, common clinical serum markers of inflammation are within the normal range in FMS. Still, altered immunological biomarkers, especially CD57 and IL-8 levels, are compatible with a viral infection or autoimmune mechanism. Sex differences in the immune system would explain a sex bias in FMS prevalence. If convincing evidence for an autoimmune process were detected for FMS, diagnostic tests and effective therapies could be developed. Blood and CSF should be screened for autoantibodies and/or autoreactive lymphocytes. Peripheral nervous tissues and the CNS, including dorsal root ganglia, the spinal cord, the pituitary gland, and the pineal gland, should be screened as possible targets for autoantibodies and autoreactive lymphocytes.

## AUTHOR CONTRIBUTIONS

All authors complied with the ICMJE criteria for authorship of contribution in conception or data acquisition, drafting or revising the manuscript, approval of the final document, and agreement with all aspects of the document. The contributions of GR-S were mainly in the neuroscience and pain part, the ones of FG-S in the immune system section, and IM in the FMS and autoimmune aspects, as well as the coordination of the publication aspects.

## FUNDING

This work was supported by the University of Monterrey (UDEM); grant number UIN17536.

## ACKNOWLEDGMENTS

The authors thank Dr. David Hafner for reviewing the document and the University of Monterrey (UDEM) for grant UIN17536.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2019.01414/full#supplementary-material

## REFERENCES






**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Meester, Rivera-Silva and González-Salazar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Untangling the Ties Between Social Cognition and Body Motion: Gender Impact

*Sara Isernia1,2,3 , Alexander N. Sokolov1 , Andreas J. Fallgatter1 and Marina A. Pavlova1 \**

*1Department of Psychiatry and Psychotherapy, Medical School and University Hospital, Eberhard Karls University of Tübingen, Tübingen, Germany, 2Department of Psychology, Università Cattolica del Sacro Cuore, Milan, Italy, 3CADITeR, IRCCS Fondazione Don Carlo Gnocchi ONLUS, Milan, Italy*

We tested the viability of the general hypothesis that biological motion (BM) processing serves as a hallmark of social cognition. We assumed that BM processing and inferring emotions through BM (body language reading) are firmly linked and examined whether this tie is gender-specific. Healthy females and males completed two tasks with the same set of point-light BM displays portraying angry and neutral locomotion of female and male actors. For one task, perceivers had to indicate actor gender, while for the other, they had to infer the emotional content of locomotion. Thus, with identical visual input, we directed task demands either to BM processing or to inferring emotion. This design allows direct comparison between sensitivity to BM and recognition of emotions conveyed by the same BM. In addition, perceivers were administered a set of photographs from the Reading the Mind in the Eyes Test (RMET), with which they identified either emotional state or actor gender. Although there were no gender differences in performance on BM tasks, a tight link occurred between recognition accuracy of emotions and gender through BM in males. In females only, body language reading (both accuracy and response time) was associated with performance on the RMET. The outcome underscores gender-specific modes in visual social cognition and triggers investigation of body language reading in a wide range of neuropsychiatric disorders.

Keywords: biological motion, visual social cognition, gender, emotion, body language reading

## INTRODUCTION

Body language reading is an essential ability for efficient daily interpersonal exchanges and adaptive behavior. Another benefit of body language reading is that, although verbal information flow is believed to be easily kept under control, body movement often reveals our true feelings and dispositions. Typically developing (TD) individuals are proficient in inferring emotions and intentions of others represented by biological motion (BM) in *point-light* displays minimizing the availability of other cues (such as body shape or outfit) and, thereby, isolating the information conveyed solely by BM (**Figure 1**) (e.g., Dittrich et al., 1996; Pollick et al., 2001; Atkinson et al., 2004; Heberlein et al., 2004; Clarke et al., 2005; Manera et al., 2010; Alaerts et al., 2011; Sokolov et al., 2011; Krüger et al., 2013; Actis-Grosso et al., 2015; Vaskinn et al., 2016). Perceivers can judge emotional content of dance, represented by a few moving dots located on a dancer's body, with anger being the most reliably identified emotion (Dittrich et al., 1996). Inferring emotions from BM is fairly robust across cultures (Parkinson et al., 2017), and it remains rather accurate with age: only the recognition of sadness (but not angry or happy displays that are more exaggerated) at short durations is lower in the elderly (Spencer et al., 2016).

FIGURE 1 | Illustration of point-light biological motion. Three consecutive static frames exemplifying human walking as a set of dots placed on the main joints and head of an invisible actor body. A walker is seen facing left in intermediate position between the frontal and sagittal view.

#### *Edited by:*

*Jan Van den Stock, Katholieke Universiteit Leuven, Belgium*

#### *Reviewed by:*

*Luca Francesco Ticini, University of Manchester, United Kingdom Giorgio Vallortigara, University of Trento, Italy*

#### *\*Correspondence:*

*Marina A. Pavlova marina.pavlova@uni-tuebingen.de*

#### *Specialty section:*

*This article was submitted to Emotion Science, a section of the journal Frontiers in Psychology*

*Received: 10 January 2020 Accepted: 16 January 2020 Published: 06 February 2020*

#### *Citation:*

*Isernia S, Sokolov AN, Fallgatter AJ and Pavlova MA (2020) Untangling the Ties Between Social Cognition and Body Motion: Gender Impact. Front. Psychol. 11:128. doi: 10.3389/fpsyg.2020.00128*

It has been argued that social cognitive abilities (i.e., abilities to perceive and understand emotional states, drives, and intentions of others) and BM processing are tightly linked and, therefore, performance on *socially neutral* tasks (such as detection of camouflaged BM, facing detection, or discrimination between canonical and scrambled BM) may serve as a hallmark of social cognition (Pavlova, 2012): individuals with neurodevelopmental and psychiatric conditions (such as autism spectrum disorders (ASD), Williams-Beuren syndrome, and Down syndrome) and survivors of premature birth who exhibit aberrant BM processing, have compromised daily social perception and possess lower social competence. In agreement with this assumption, newborn human infants (and newly hatched chicks) appear to be predisposed to BM, though such predispositions are impaired in newborns at high risk of autism (Bidet-Ildei et al., 2014; Di Giorgio et al., 2016, 2017).

In the non-clinical adult population, a possible intrinsic link between the ability to perceive BM and a person's social capabilities appears to be in line with visual psychophysics. Emotional valence of BM affects the sensitivity to point-light gait masked by an additional set of dots taken from the same walker, with highest sensitivity (but also greatest response bias) to angry walking and lowest sensitivity to neutral walking (Chouchourelou et al., 2006). The sensitivity to slightly camouflaged BM is related to both anger and happiness (Ikeda and Watanabe, 2009). A happiness superiority effect in BM processing is also affirmed: BM detection within noise is not only facilitated by an actor's happiness, but happiness is easier to recognize than angry and neutral BM (Lee and Kim, 2017). The ability to reveal the identity of point-light dancers and expression intensity correlates with self-reported empathy (Sevdalis and Keller, 2011). Emotion recognition through BM is related not only to the more basic capability for discrimination between canonical and scrambled BM, but also to performance on the Reading the Mind in the Eyes Test, RMET (Alaerts et al., 2011). Empathy, performance on both the RMET and Cambridge Face Memory Test, and autism quotient are all positively linked in TD individuals with efficient BM processing (such as facing detection) (Miller and Saygin, 2013). In children aged 7–12 years, BM facing detection is already associated with mindreading in eyes (Rice et al., 2016). Alexithymia (the inability to identify and describe emotions in the self) scores in TD individuals correlate with confidence in rating the emotion valence of point-light BM displays (Lorey et al., 2012). BM processing (decoding of gender) is affected by gender stereotyping elicited by depicted emotion: angry throwing of a ball is often judged to be performed by men, whereas sad throwing is judged to be performed by women (Johnson et al., 2011).
Inferring emotions through BM is modulated by administration of the neuropeptide oxytocin, known to facilitate social cognition (Bernaerts et al., 2016; Wynn et al., 2019). Moreover, smelling steroids (either androstadienone or estratetraenol) makes observers estimate the emotional state of a point-light walker of the opposite sex as happier and more relaxed (Ye et al., 2019).

Some aspects of BM processing and/or body language reading are aberrant in psychiatric, neurological, psychosomatic, and neurodevelopmental disorders (for reviews, see Pavlova, 2012; Pavlova, 2017a,b; Okruszek et al., 2018). Most importantly, the visual sensitivity to BM is inversely linked to the severity of these disorders, e.g., as measured by the Autism Diagnostic Observation Schedule (ADOS) in ASD (Blake et al., 2003), or to autism traits in TD (Koldewyn et al., 2010). For ASD and TD individuals pooled together, both BM processing and emotion recognition are related to social responsiveness scores (Nackaerts et al., 2012). In schizophrenia (SZ), deficient BM processing is connected to aberrant social cognition (Okruszek and Pilecka, 2017; Okruszek, 2018). For instance, deficient BM detection (discrimination between such actions as climbing a stair and scrambled displays) is linked to lower social competence (Kim et al., 2005). In SZ, a positive correlation is reported between BM processing (detection of facing direction of masked walkers) and the empathy index (Matsumoto et al., 2015). Poorer emotion recognition is associated with impaired self-reported social functional capacity, community outcome (such as lifetime relationship status and independent living) and, in particular, in individuals who committed homicide, with a tendency to under-mentalize (Engelstad et al., 2017, 2018a,b; Egeland et al., 2019).

In a number of tasks examining BM processing, gender/sex differences are reported (Pavlova, 2017a,b). TD adult females are more accurate in BM recognition of actions (such as jumping on the spot) and faster in discriminating between emotional and neutral BM (Alaerts et al., 2011). Yet gender differences in body language reading appear to be modulated by the type of portrayed emotion and actor gender (Sokolov et al., 2011; Krüger et al., 2013). TD females are reported to be more accurate in body language reading through full-light body motion (Strauss et al., 2015). Brain imaging points to sex differences in neural circuits underpinning BM processing (Anderson et al., 2013; Pavlova et al., 2015). Sex differences in BM processing are also reported in other species (Regolin et al., 2000; Brown et al., 2010). This points to their fundamental character. Gender (socio-cultural aspects) and sex (neurobiological aspects) impacts can be of substantial value, not only for better conceptualization of social cognition but also for understanding neuropsychiatric conditions, most of which are gender/sex specific (Pavlova, 2012, 2017a,b).

Some previous studies on social cognition through BM possess methodological limitations: (1) BM tests are often based on videotapes of only one (either female or male) or two (female and male) performers. For example, many studies on emotion recognition from BM in psychiatric populations use videos of only one actor [e.g., *EmoBio test* first introduced by Heberlein et al. (2004); see also Okruszek et al., 2018]. (2) Socially neutral BM processing and inferring social information from point-light displays are often assessed with different sets (or types) of displays and experimental procedures. (3) An unbalanced design is used, with samples of TD individuals and patients that are not properly matched with respect to gender (e.g., patients of one gender are compared with TD individuals of both genders) and/or differ in sample size (the sample of TD individuals is twice as large as the patient sample or even larger). If samples contain many more individuals of one gender and/or more TD participants than patients, comparisons between groups may lead to paradoxical statistical outcomes. These issues can preclude proper generalization of findings.

Here we tested the viability of the assumption that BM processing is firmly linked with expressive body language reading. Bearing in mind the occurrence of gender-specific modes in both BM processing and social cognition, we examined whether this bond is gender-specific. For this purpose, TD females and males completed two tasks with the same set of point-light BM displays portraying angry and neutral locomotion of female and male actors. On one task, perceivers had to indicate an actor's gender, whereas on the other, they had to indicate the emotional content of locomotion. Thus, with identical visual input, we directed task demands either to BM processing or to emotion recognition. The primary benefit of this design is that it allows comparison between BM processing and inferring emotions conveyed by the same BM. In addition, in a separate session, perceivers were administered a set of photographs from the RMET for identifying either an emotional state or actor gender.

## MATERIALS AND METHODS

## Participants

Forty participants (20 females and 20 males, aged 19–39 years; students of the University of Tübingen Medical School) were enrolled in the study. No age difference occurred between them: males were aged 26.5 years [median (Mdn), 95% confidence interval, CI from 24.43 to 30.67], and females were aged 25 years [Mdn, 95% CI from 23.23 to 28.27 (Mann-Whitney test, *U* = 171.5, *p* = 0.439, n.s.)]. As performance on the RMET (German version, for details, see section below) requires language command of high proficiency, German as a native language served as an inclusion criterion. All observers had normal or corrected-to-normal vision. None had head injuries or a history of neuropsychiatric disorders (including ASD, SZ, and depression), or regular drug intake (medication). They were run individually and were naïve as to the purpose of the study. None had previous experience with such displays and tasks. The study was conducted in line with the Declaration of Helsinki and was approved by the local Ethics Committee at the University of Tübingen Medical School. Informed written consent was obtained from all participants. Participation was voluntary, and the data were processed anonymously.

## Biological Motion: Stimuli, Tasks, and Procedure

Participants were presented with a set of point-light black-and-white animations portraying human locomotion. Display production is described in detail elsewhere (Krüger et al., 2013). The displays were built up by using the Motion Capture Library. In brief, recording was performed using a 3D position measurement system at a rate of 60 Hz (Optotrak, Northern Digital Inc., Waterloo, ON, Canada). The matrix data for each frame were processed with MATLAB (The Mathworks Inc., Natick, MA, USA) into a video sequence. Each display consisted of 15 white dots visible against a black background (**Figure 1**). The dots were placed on the shoulder, elbow, and wrist of each arm; on the hip, knee, and ankle of each leg; and on the head, neck, and pelvis of a human body. As we intended to make tasks demanding and supposed more pronounced effects with brief stimulus duration, each movie lasted for 2 s, which corresponded to one walking cycle consisting of two steps. During locomotion, a walker was seen facing right in an intermediate position of 45° between the frontal and sagittal view. As the sagittal view is often considered neutral with respect to possible social interactions and the frontal view is reported to elicit ambiguous (facing backward or toward an observer) and often gender-dependent impressions of locomotion direction (Pollick et al., 2005; Brooks et al., 2008; Schouten et al., 2010, 2011), the intermediate trajectory of locomotion was used. To create left-facing stimuli, we rotated the videos 90° horizontally. The walking figure was positioned with the pelvis fixed to the middle of the screen. Female and male actors walked either with angry or neutral affective expression. To avoid variability in emotion portrayal, sets of neutral and angry stimuli were created from the same actors. The videos of six (three female/three male) actors facing either right or left were presented in three separate runs with a short break between them.
In total, each experimental session consisted of a set of 144 trials [6 actors (3 female/3 male) × 2 emotions (neutral/angry) × 2 facing directions (left/right) × 6 presentations (2 repetitions of each stimulus in each run × 3 runs)]. During an inter-stimulus interval (after stimulus offset and until onset of the next stimulus right after the participant's response), a white fixation cross was displayed in the center of the screen for 6 s. If a participant failed to respond within this period, the next trial automatically started. Participants were asked to respond after each stimulus offset.
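The trial count follows from fully crossing the design factors. As a minimal sketch (the actor labels and the `build_trials` helper are hypothetical and are not taken from the study's software; presentation order within a run is not modeled), the enumeration can be written as:

```python
from itertools import product

# Design factors as described in the text; actor labels are placeholders.
ACTORS = ["F1", "F2", "F3", "M1", "M2", "M3"]  # 3 female / 3 male actors
EMOTIONS = ["neutral", "angry"]
FACINGS = ["left", "right"]
REPETITIONS_PER_RUN = 2
RUNS = 3


def build_trials():
    """Enumerate every trial of one experimental session (order not randomized)."""
    trials = []
    for run in range(1, RUNS + 1):
        for actor, emotion, facing in product(ACTORS, EMOTIONS, FACINGS):
            for repetition in range(1, REPETITIONS_PER_RUN + 1):
                trials.append(
                    {
                        "run": run,
                        "actor": actor,
                        "emotion": emotion,
                        "facing": facing,
                        "repetition": repetition,
                    }
                )
    return trials


trials = build_trials()
print(len(trials))  # 6 x 2 x 2 x 2 x 3 = 144
```

Under these assumptions, each run contains 48 trials (24 unique stimuli shown twice), and three runs give the 144 trials per session.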

With the same set of stimuli, in a two-alternative forced choice (2AFC) paradigm, participants performed two different tasks, indicating by pressing one of two respective keys either actor gender (female/male) or emotion (neutral/angry). In contrast to the emotion task, performance on the gender task is based on revealing biomechanical characteristics of locomotion (Kozlowski and Cutting, 1977; Barclay et al., 1978; Cutting et al., 1978; Pollick et al., 2005). The order of tasks (gender/emotion recognition) was counterbalanced between participants. Using identical visual input (the same set of displays) in the same sample of participants, we varied task demands, re-directing the task either to BM processing or to bodily emotion recognition. The whole experimental session (consisting of two tasks) took about 20–25 min per participant. No immediate feedback was given regarding performance. The main advantage of this experimental design is that it allows comparison between sensitivity to BM and recognition of emotions conveyed by the same BM.

## Reading Mind in the Eyes Test and Gender Recognition Task

After completion of both BM tasks, a computer version of the RMET was additionally administered to participants. This test is described in detail elsewhere (Baron-Cohen et al., 2001). In brief, participants were shown a set of 36 black-and-white photographs of female and male eyes along with a corresponding face part expressing a certain emotional or affective state. On each trial, they had to choose among four alternative descriptions (adjectives) simultaneously presented on the screen, including the correct one corresponding with the picture. Participants were instructed to be as fast as possible. Each correct response was scored 1 for a total score range of 0–36. This standardized test is considered one of the most commonly used tasks assessing affective theory of mind (Baglio and Marchetti, 2016). The test was administered in a computerized form using Qualtrics® (2019; https://www.qualtrics.com/de/; Qualtrics International Inc., Provo, Utah, U.S.A.). In the other session, with the same set of 36 photographs, participants completed a gender recognition task from the RMET (RMET\_G). The RMET was used primarily for testing whether affect recognition through body motion and through eye expressions are linked to each other. Gender recognition on the RMET\_G task was used as a control.
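The scoring rule described above (one point per correct response, total range 0–36) can be sketched as follows. The function name, the coding of the four response options as 0–3, and the toy answer key are assumptions for illustration only; they do not reproduce the actual RMET items or the Qualtrics implementation.

```python
N_ITEMS = 36  # number of RMET photographs, as stated in the text


def score_rmet(responses, answer_key):
    """Return the number of correct responses (an integer from 0 to N_ITEMS)."""
    if len(responses) != N_ITEMS or len(answer_key) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses and {N_ITEMS} key entries")
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


# Toy data: the four alternatives per item are coded 0-3 (hypothetical key).
toy_key = [i % 4 for i in range(N_ITEMS)]
toy_responses = list(toy_key)
toy_responses[0] = (toy_key[0] + 1) % 4  # one wrong answer
toy_responses[5] = (toy_key[5] + 1) % 4  # a second wrong answer

print(score_rmet(toy_responses, toy_key))  # 34
```

The same function could score the gender recognition control task (RMET\_G) if it were given a key with two coded options instead of four.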

## Data Analysis

Data analysis was performed using the Statistical Package for the Social Sciences (SPSS version 24, IBM Corporation; Armonk, New York, U.S.A.) and JMP Software (version 13; SAS Institute; Cary, North Carolina, U.S.A.). All data were tested for normality of distribution with the Shapiro-Wilk test, followed by either parametric statistics (for normally distributed data sets) or, otherwise, non-parametric statistics.

## RESULTS

## Biological Motion Tasks: Emotion and Gender Recognition

In accordance with our assumption that BM processing is firmly linked to expressive body language reading, our data analysis was primarily focused on associations between performance on the emotion recognition task (BME) and the gender recognition task (BMG); the outcome of the analysis of variance (ANOVA) is reported for completeness.

Individual rates of correct responses on both BM tasks were submitted to a mixed model 2 × 2 × 2 × 2 repeated-measures omnibus ANOVA with within-subject factors Task (gender/emotion recognition), Actor Gender (female/male), and Emotion (angry/neutral), and a between-subject factor Observer Gender (female/male). The outcome revealed main effects of Task [*F*(1,38) = 24.46, *p* < 0.001] with higher accuracy in recognizing emotions than gender, Actor Gender [*F*(1,38) = 37.62, *p* < 0.001] with higher accuracy for movies of male than of female actors, and Emotion [*F*(1,38) = 64.71, *p* < 0.001] with better performance for neutral than angry displays across both tasks. The main effect of Observer Gender only tended to be significant [*F*(1,38) = 3.41, *p* = 0.066], with a non-significant interaction between Observer Gender and Task [*F*(1,38) = 0.99, *p* = 0.321, n.s.]. All interactions are summarized in **Supplementary Table S1**. *Post hoc* analysis indicated a lack of gender differences in accuracy on both BM tasks [BME: *t*(38) = 0.92, *p* = 0.365, n.s., and BMG: *t*(38) = 1.39, *p* = 0.173, n.s., two-tailed]. Similarly, no gender differences in response time were found [BME: *U* = 197, *p* = 0.935, n.s.; BMG: *t*(38) = 1.46, *p* = 0.153, n.s., two-tailed].
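For readers less familiar with the Mann-Whitney *U* statistic reported for non-normally distributed measures, the following minimal sketch shows how *U* is obtained by counting pairwise comparisons between two groups. The response times are invented for illustration and are not data from the study; which of the two *U* values a statistics package reports is also an assumption that may differ between programs.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); a tie between two observations counts as half a win."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical response times in seconds, not data from the study.
females = [1.2, 1.5, 1.1, 1.9]
males = [1.3, 1.4, 1.0, 2.0]
u_f, u_m = mann_whitney_u(females, males)
print(u_f, u_m)  # both equal 8.0, the value expected when groups do not differ
```

If each group comprises 20 observers, as the reported degrees of freedom suggest, *U* can range from 0 to 400 with 200 expected under the null hypothesis, so a value such as *U* = 197 lies very close to that expectation.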

As we expected to find a positive link between recognition of emotions (BME task) and gender (BMG task) through BM, correlation analysis was conducted on performance accuracy and response time separately for females and males. In males, accuracy of emotion and gender recognition through BM was positively linked with each other [Pearson product-moment correlation, *r*(18) = 0.38, *p* = 0.049; **Figure 2A**], whereas no such association was found in females [*r*(18) = 0.032, *p* = 0.447, n.s., both one-tailed]. Response time of correct responses between the BME and BMG tasks positively correlated with each other in both gender groups [males: Spearman's *ρ*(18) = 0.633, *p* = 0.002; females: *ρ*(18) = 0.568, *p* = 0.005, both one-tailed; **Figure 3A**].
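As a plausibility check on the reported one-tailed *p* values, a Pearson correlation can be converted to a *t* statistic with the same degrees of freedom. This is the standard textbook conversion and is offered as an illustration; the source does not state how the software computed the *p* values.

$$
t = \frac{r\sqrt{df}}{\sqrt{1 - r^{2}}}, \qquad
t_{\text{males}} = \frac{0.38\sqrt{18}}{\sqrt{1 - 0.38^{2}}} \approx 1.74
$$

The one-tailed critical value of *t* with 18 degrees of freedom at α = 0.05 is about 1.734, so *t* ≈ 1.74 falls just beyond it, consistent with the reported *p* = 0.049. For females, *r* = 0.032 gives *t* ≈ 0.14, far below the critical value.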

FIGURE 2 | Relationship between accuracy of emotion and gender recognition through biological motion, and performance on the Reading Mind in the Eyes Test for female and male observers. (A) Correlation matrices between accuracy of performance (correct response rate) on emotion (BME) and gender (BMG) recognition through biological motion (BM), and the Reading Mind in the Eyes Test (RMET). Significant correlations (Pearson product-moment correlation; *p* < 0.05) are color-coded by green, non-significant correlations by violet. (B) Correlations between BMG and BME accuracy in males (left panel, diamonds), and between the RMET and BME accuracy in females (right panel, circles) were significant.

## Relation Between Inferring Emotion Through Biological Motion and Reading the Mind in the Eyes Test

We administered a set of stimuli from the RMET primarily to address whether performance on two visual social cognition tasks, namely, reading eye expressions (RMET) and emotion recognition through BM (BME), is connected. Gender recognition with the RMET set of stimuli (RMET\_G task) served as a control.

As expected from previous work (Kirkland et al., 2013), females were more proficient on the RMET with greater recognition accuracy [*t*(38) = 1.73, *p* = 0.046, one-tailed, effect size Cohen's *d* = 0.56]. Notably, we found that females tended to surpass males in recognition of female images [*t*(38) = 1.97, *p* = 0.056, two-tailed], with no gender difference in recognition of male images [*U* = 185, *p* = 0.677, n.s., two-tailed]. No gender difference on the RMET task was found in response time (*U* = 183, *p* = 0.323, n.s.). No gender difference in recognition accuracy occurred on the RMET\_G task (*U* = 187.5, *p* = 0.724, n.s.), presumably because the task turned out to be far too easy to perform. There was also no gender difference in response time on this task (*U* = 199, *p* = 0.978, n.s.). No correlation occurred between recognition accuracy on the RMET and RMET\_G tasks [males: Spearman's *ρ*(18) = −0.161, *p* = 0.497, n.s.; females: *ρ*(18) = 0.17, *p* = 0.473, n.s.].
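The reported effect size can be related to the *t* statistic with a short sketch. It assumes the common approximation *d* = 2*t*/√*df* for two independent groups of equal size; the source does not state which formula was used, but this one reproduces the reported value.

```python
import math


def cohens_d_from_t(t_value, df):
    """Approximate Cohen's d for two equal-sized independent groups: 2t / sqrt(df)."""
    if df <= 0:
        raise ValueError("df must be positive")
    return 2.0 * t_value / math.sqrt(df)


# Values reported in the text for RMET accuracy: t(38) = 1.73.
d = cohens_d_from_t(1.73, 38)
print(round(d, 2))  # 0.56
```

By conventional benchmarks, a value of this size is usually read as a medium effect.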

Based on earlier work (Alaerts et al., 2011; Miller and Saygin, 2013), we expected to find a positive link between accuracy on the BME and RMET tasks. Yet, in males, the correlation between recognition accuracy on these tasks turned out to be non-significant [*r*(18) = 0.186, *p* = 0.216, n.s.], whereas in females accuracy of BME and RMET positively correlated with each other [*r*(18) = 0.445, *p* = 0.025, one-tailed; **Figure 2B**]. Similarly, response time on the BME task and RMET correlated with each other in females [*ρ*(18) = 0.483, *p* = 0.016; **Figure 3B**, right panel], but not in males [*ρ*(18) = 0.287, *p* = 0.11, n.s., one-tailed; **Figure 3B**, left panel].
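Spearman's *ρ*, used here for response times, is the Pearson correlation computed on ranks. The sketch below illustrates this with standard-library Python only; the response times are invented for illustration, and the helper names (`average_ranks`, `pearson`, `spearman_rho`) are hypothetical rather than taken from the software used in the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical response times in seconds, not data from the study.
bme_rt = [1.10, 1.45, 0.98, 1.72, 1.30]
rmet_rt = [5.2, 6.9, 5.6, 7.4, 6.1]
print(round(spearman_rho(bme_rt, rmet_rt), 2))  # 0.9
```

Because only the rank order enters the computation, the coefficient is insensitive to the skew typical of response time distributions.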

## DISCUSSION

The present study aimed to provide a proof of concept that body motion perception and visual social cognition are intimately tied (Pavlova, 2012). Keeping in mind experimental evidence for gender-specific modes in both visual social cognition and BM processing, we focused on the gender specificity of this link. The findings revealed that: (1) A tight link occurred between the accuracy of gender and emotion recognition through BM in males, though there were no gender differences in performance on either BM task. Independent of observers' gender, response times on emotion and gender recognition through BM correlated with each other. (2) In females only, body language reading (both accuracy and response time) was associated with mindreading through eyes.

The outcome provides further support for the general concept that BM processing serves as a hallmark of social cognition (Pavlova, 2012). Previous research has already pointed to the link between BM processing and social cognition: individuals with aberrant BM processing are also compromised in daily-life social cognition, showing lower social competence, empathy, and face recognition capabilities (Sevdalis and Keller, 2011; Miller and Saygin, 2013). In this study, we tried to untangle the ties between BM processing and body language reading by using identical visual input and re-directing task demands either to BM processing [gender decoding that is based on revealing biomechanical characteristics of locomotion (Kozlowski and Cutting, 1977; Barclay et al., 1978; Cutting et al., 1978; Pollick et al., 2005)] or to emotion recognition. For the first time, we uncovered the gender specificity of these ties. It appears that males heavily rely upon common mechanisms underpinning gender and emotion recognition through BM, whereas in females, this tie is not so pronounced: only response time but not accuracy of gender and emotion recognition are positively linked to each other. This outcome appears to dovetail with recent reports indicating that females and males tend to use different types of information during BM processing and gender recognition in point-light displays: females rely on form and motion cues together, whereas males use motion cues solely (Hiris et al., 2018). This is also in line with recent findings on gender recognition in human infants aged 4–18 months: in a habituation paradigm, boys more easily differentiate the gender of a point-light walker, presumably possessing higher sensitivity to motion parameters (Murray et al., 2018; Tsang et al., 2018). Yet adaptation effects in point-light BM gender recognition indicate that this process is rather unlikely to be based on extracting low-level perceptual features (Jordan et al., 2006). In accord with this, in individuals with schizophrenia (SZ), both emotion and gender recognition of avatars correlate with social functioning: emotion recognition correlates with the level of social engagement and interpersonal communication, whereas gender recognition is linked with independence in daily life (Peterman et al., 2014). Future brain imaging research will help to clarify where and how gender and emotion recognition through BM talk to each other in the brain.

FIGURE 3 | [...] other. (B) In males (left panel), response time on the BME task and RMET were not associated with each other, whereas in females (right panel) this association was significant. Significant correlations (Spearman's *ρ*; *p* < 0.05) are color-coded by green, non-significant correlations by violet.

By contrast, females likely bank on tightly interconnected general mechanisms of social cognition for emotion recognition through BM and mindreading through eyes. In males, the link in performance between these tasks is absent. At first glance, bearing in mind previous reports (Alaerts et al., 2011; Miller and Saygin, 2013) on the association between emotion recognition through point-light BM and eye expressions on the RMET, the gender specificity of this linkage (occurrence of this link in females only) in the present study appears rather startling. Yet in these earlier studies, the samples of participants were predominantly female.

In agreement with previous work (Kirkland et al., 2013) that points to female superiority on the RMET (independent of cultural differences), females tended to outperform males at judging mental states expressed by eyes. Yet there were no gender differences on the emotion through BM task. Brain imaging work on BM processing and inferring social interaction through Heider-and-Simmel-like animations suggests the existence of gender-specific modes in processing of socially relevant information even in the absence of behavioral differences: gender-related dimorphism in the neural circuits may prevent behavioral differences if they are maladaptive, and thereby promote proper behavioral response (Pavlova et al., 2010, 2015). Similarly, implementing different behavioral strategies by females and males may have contributed to the lack of gender differences in performance on BM tasks in the present study.

The present study was conducted in a student sample, which affords group homogeneity. Although such a population is commonly used in the field, this may limit the generalizability of the outcome. However, since the study focused on the association between performances on the tests, one would expect that, in the general population as well, perceivers who are proficient on one task are more proficient on the other and vice versa.

Gender specificity of the link between BM processing and visual social cognition may be of value for better understanding a wide range of psychiatric, neurologic, neurodevelopmental, and psychosomatic conditions. Some aspects of BM processing are atypical in ASD (e.g., Klin et al., 2009; Nackaerts et al., 2012; Jack et al., 2017), schizophrenia (e.g., Kim et al., 2011; Hastings et al., 2013; Spencer et al., 2013; Hashimoto et al., 2014; Vaskinn et al., 2016, 2018; Engelstad et al., 2017, 2018a,b; Okruszek et al., 2018) and schizotypal personality disorder (Hur et al., 2016), bipolar disorders (Vaskinn et al., 2017), attention deficit hyperactivity disorder (ADHD) (Kröger et al., 2014), anxiety disorders and in individuals with elevated anxiety (van de Cruys et al., 2013; Heenan and Troje, 2015), obsessive compulsive disorders (Kim et al., 2008), and unipolar depression (Loi et al., 2013; Kaletsch et al., 2014). Deficits are also reported in individuals who were born preterm and suffer congenital brain lesions (Pavlova and Krägeloh-Mann, 2013), Alzheimer's (Henry et al., 2012; Insch et al., 2015) and Parkinson's diseases (Cao et al., 2015; Jaywant et al., 2016a,b; Kloeters et al., 2017), epilepsy (Bala et al., 2018), and eating disorders such as anorexia nervosa and bulimia (Zucker et al., 2013; Lang et al., 2015; Dapelo et al., 2017). Most of these disorders that are characterized by aberrant social cognition display a skewed sex ratio: females and males are affected differently in terms of clinical picture, prevalence, and severity (Pavlova, 2012, 2017a,b).

BM processing relies on a large-scale neural network (Grosbras et al., 2012; Engell and McCarthy, 2013; Pavlova et al., 2017). For understanding proper functioning of this network and especially its pathology, one has to consider dynamic changes in brain activation unfolding over time (Pavlova, 2017a,b). Recently, whole-head ultrahigh field 9.4 T functional magnetic resonance imaging (fMRI), along with temporal analysis of blood-oxygen-level-dependent (BOLD) responses, revealed distinct large-scale ensembles of regions playing in unison during different stages of BM processing (Pavlova et al., 2017). An integrative analysis of structural and effective brain connectivity sheds light on architecture and functional principles of the BM circuitry, which is organized in a parallel rather than hierarchical manner (Sokolov et al., 2018). The hub of this circuitry lies in the right posterior superior temporal sulcus (STS) (Grossman and Blake, 2002; Beauchamp et al., 2003; Gobbini et al., 2007; Kaiser et al., 2010; Herrington et al., 2011; Dasgupta et al., 2017), where this network likely communicates with the social brain, the neural circuits underwriting our ability for perception and understanding of drives, intentions, and emotions of others. The visual sensitivity to BM is best predicted by functional communication (effective connectivity) and presence of white-matter pathways between the right STS and fusiform gyrus (Sokolov et al., 2018).

Research on the brain networks dedicated to affective body language reading in normalcy and pathology is extremely sparse (Heberlein et al., 2004; Atkinson et al., 2012; Jastorff et al., 2015; Mazzoni et al., 2017; He et al., 2018). This work emphasizes the key role of the STS and fusiform face area in inferring emotions of point-light agents and avatars (e.g., Goldberg et al., 2015; Vonck et al., 2015). In a nutshell, it appears that BM processing engages a specialized neural network with hubs in several areas of the brain, including the right temporal cortex and fusiform gyrus, where this circuitry topographically overlaps and communicates with the social brain. Specifically tailored brain imaging is required to clarify to what extent visual processing of BM and expressive body language reading share topographically and dynamically overlapping neural networks. This work will contribute to a better understanding of neurodevelopmental, psychiatric, neurological, and psychosomatic disorders related to social cognition.

## RESUME

The present study was aimed at providing a proof of concept that BM perception and visual social cognition are intimately tied (Pavlova, 2012). Here, we focused on the gender specificity of this bond. By using identical visual input and re-directing task demands either to BM processing or emotion recognition, we cautiously untangled the ties between BM processing and body language reading. The findings revealed that (1) although there were no gender differences in performance on either BM task, a tight link occurred between accuracy of gender and emotion recognition through BM in males, and (2) in females only, body language reading is linked with mindreading through eyes. The outcome points to gender-specific modes in visual social cognition and fosters investigation of body language reading in a wide range of neuropsychiatric disorders.

## DATA AVAILABILITY STATEMENT

The raw data will be made available by the authors, without undue reservation, to any qualified researcher.

## ETHICS STATEMENT

The studies involving human participants were reviewed and approved by the Ethical Committee of the University of Tübingen Medical School. The participants provided their written informed consent to participate in this study.

## AUTHOR CONTRIBUTIONS

MP, SI, and AF conceived and designed the study experiments. SI performed the experiments. MP, SI, and AS analyzed the data. MP and AF contributed reagents, materials, and analysis tools. SI and MP wrote the paper. MP supervised the whole project. All co-authors contributed to the writing of the manuscript.

## FUNDING

This work was supported by the German Research Foundation (DFG), grant PA847/22-1 to MP, and by the Reinhold Beitlich Foundation. MP appreciates donations made by Professor Regine Leibinger. AS was funded by the Reinhold Beitlich Foundation and Doris Leibinger Foundation. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The research training of SI in MP's lab was funded by the Italian Ministry of Education, University and Research, within the doctoral program in "Sciences of the Person and Education," Università Cattolica del Sacro Cuore, XXXII cycle, Milano, Italia.

## ACKNOWLEDGMENTS

We are thankful to participants enrolled in the study, Bernd Kardatzky for technical help, and members of MP's lab at the Department of Psychiatry and Psychotherapy, University of Tübingen Medical School for daily support, in particular, to Samuel Krüger for assistance in stimuli creation. SI acknowledges support of Davide Massaro, Francesca Baglio, and Antonella Marchetti.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2020.00128/full#supplementary-material

## REFERENCES

Grossman, E. D., and Blake, R. (2002). Brain areas active during visual perception of biological motion. *Neuron* 35, 1167–1175. doi: 10.1016/S0896-6273(02)00897-8


Zucker, N., Moskovich, A., Bulik, C. M., Merwin, R., Gaddis, K., and Losh, M. (2013). Perception of affect in biological motion cues in anorexia nervosa. *Int. J. Eat. Disord.* 46, 12–22. doi: 10.1002/eat.22062

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2020 Isernia, Sokolov, Fallgatter and Pavlova. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
