Unbiased Decisions Among Women’s Basketball Referees

Decisions often reflect implicit biases. Ethnic, racial, and gender traits are associated with stereotypes that may influence the decision-making process. Previous research shows that referees’ decisions in men’s professional sports are often biased in favor of racial and nationalistic in-groups. This study examined if similar biases exist in women’s professional sports. Additionally, this study analyzed the potential influence of the gender composition of referee teams on rapid decisions. We gathered data on referee foul calls in women’s professional basketball in Spain, 2014–2019 and defined important decisions (fifth fouls) and stressful situations (one-possession matches). The main finding is that out-groups based on racial (i.e., Black players) and nationalistic (i.e., foreign players) criteria did not differ in number of foul calls received. In stressful situations, foreign players actually received fewer fouls than Spanish players. Similarly, there was no evidence of bias due to the gender composition of referee teams: foul calls did not differ between all-male and mixed teams. Implications for race and nationality as dynamic social constructs within ethnocentric and social identity theories are discussed.


INTRODUCTION
Research demonstrates that decisions and evaluations often hide an implicit bias, even if we do not intend it or realize it (Staats, 2014). The Implicit Association Test has been used to capture the extent to which social groups are implicitly associated with good/bad traits (Greenwald et al., 1998). This type of association is subconscious and probably a result of early exposure to cultural beliefs and stereotypes. Banaji and Greenwald (2016) call it the "blindspot." Implicit biases tend to result in discriminatory behaviors that favor in-group members (Staats, 2014).
Implicit bias is, therefore, an alternative explanation to traditional taste-based and statistical discrimination (Bertrand et al., 2005). Multiple studies find evidence that implicit racial/ethnic biases influence the evaluations made by police officers and judges (Banks et al., 2006), doctors and health care professionals (FitzGerald and Hurst, 2017), employment recruiters (Dipboye and Colella, 2013), policy makers (Glaser et al., 2014), students and voters (Jost et al., 2009), and sport referees (Price and Wolfers, 2010). Most studies use data from the United States. In sports, most studies on biases among referees use data from men's competitions. Still, ethnicity, nationality, and race are dynamic social constructs that are likely to change over time and develop differently across societies. More research is needed to understand racial and nationalistic biases in different countries and alternative settings, such as women's sports.
Research has also shown that the widespread barriers and biases against racial and ethnic minorities can be implicit (unconscious) or explicit (conscious) (Blair et al., 2015). The analysis of referee decisions in sports is especially relevant to the literature on biases (Dohmen and Sauermann, 2016), because calling a foul is a split-second decision, and usually made under pressure, when implicit association biases should matter the most. Any unconscious association of racial/ethnic groups with negative traits is likely to surface in this situation and bias the decisions of referees.
Another factor that may influence these decisions is the gender composition of referee teams. The representation of women in sports is limited, especially in leadership positions (Kane and Stangl, 1991; Walker and Bopp, 2011). Women are also underrepresented on referee teams. In our sample of women's basketball games in Spain, we found only 25 female referees (approximately 9% of the total pool of referees). Despite this underrepresentation, we can test whether the gender composition of referee teams has an influence on the number of fouls, because approximately 18% of the matches have at least one woman as a referee. The relationship between gender diversity and team performance is widely analyzed in sports (Lee and Cunningham, 2019) and other domains (Azmat, 2019). However, the decision-making of referee teams as a function of gender composition has been overlooked to date.
In this paper, we analyze professional women's basketball in Spain; the main focus is on referee decisions that may reflect an implicit bias against out-group players based on race/ethnicity and nationality. The sample includes 53,398 player-match observations. In addition, we examine the influence of gender composition of referee teams on foul calls. The dataset allows us to control for a number of factors regarding player and team characteristics that may moderate the role of racial/ethnic and nationalistic biases. Moreover, we extend the analysis to test for biases in important calls (i.e., the fifth foul, which disqualifies a player for the rest of the match) and stressful situations (i.e., one-possession matches, where the last possession decides the winner).
The article is organized as follows: the second section reviews the related literature, shapes the theoretical contribution, and provides the hypotheses. The third section describes the data and explains the empirical strategy, and the fourth section presents the results and discusses the implications for research. Conclusions are drawn in the fifth section.

Biases in Referee Decisions
Referees are an ideal group of decision makers for research purposes. Their decisions are observable and important for the outcome of the match. They frequently make match-changing decisions: calling or not calling a foul may change the outcome of a match and may even decide championships, which, in turn, have significant economic consequences. Therefore, not only the performance of athletes but also the decisions of referees are constantly under scrutiny.
Additionally, referees often have to make decisions in a split second. Calling or not calling a foul in a fast-paced sport like ice hockey or basketball is a decision made within a tenth of a second. Therefore, such decisions cannot be driven by strategic contemplation. A foul must be called immediately, leaving no time for elaborate conscious deliberation.
The ideal referee makes unbiased decisions and treats all athletes equally regardless of their race, nationality, gender, sexual orientation, or religion. In fact, referees receive specific education and training to apply the laws of the game consistently and fairly. A sport loses its integrity when its referees consciously favor one type of athlete over another. But even if referees do not discriminate against certain kinds of athletes consciously, they may still do so sub- or unconsciously; that is, referee decisions may be subject to implicit biases (cf. Greenwald and Krieger, 2006).
Findings on racial, ethnic, and nationalistic biases are present across fields, occupations, and cultures (Jost et al., 2009). Research has extensively examined the determinants of ethnic, racial, and national identity that are used to create in-group and out-group distinctions (Tajfel, 1982). In-group favoritism often occurs, even when there is no personal advantage. The process of social categorization explains some of the mechanisms behind in-group preferences (van Knippenberg et al., 2004). The creation of social groups and the categorization of their members lead to stereotypes that may result in racial and nationalistic implicit biases.
Race represents a powerful social construct of interest to research (Gannon, 2016). The social meaning of race has historically allowed cultures to create in-group and out-group labels, determining relationships, attitudes, and social status (Park, 1950). Similarly, nationality is an important source of social identity that determines several aspects of human experience (Rad and Ginges, 2018) and interactions (Fox and Miller-Idriss, 2008).
In sports, bias among referees and judges has attracted research attention (Dohmen and Sauermann, 2016). The seminal paper of Price and Wolfers (2010) on biases among National Basketball Association (NBA) referees set the ground for several contributions that followed. The authors analyzed the racial composition of referee teams and the number of fouls called on opposite-identity players. They found a significant bias: players received up to 4% fewer foul calls when referees belonged to the same racial group. Similarly, Parsons et al. (2011) found that when Major League Baseball umpires and batters shared ethnicity/race, the probability of a called strike was lower. Based on the existing literature on racial biases among sport referees, we make the following prediction: Hypothesis 1: Black players are called for more fouls than their white counterparts.
Beyond racial biases in the United States, there is also evidence of nationalistic preferences of referees and judges in different sports and countries. Gallo et al. (2013) found that referees in the English Premier League who were born and spent their entire lives in the United Kingdom were 15% more likely to punish foreign, non-white players with yellow cards. Pope and Pope (2015) found that Union of European Football Associations (UEFA) Champions League football referees tend to favor players from the same country.
Other researchers report nationalistic biases when referees award free kicks in Australian football (Mohr and Larsen, 1998) and penalties in international rugby competitions (Page and Page, 2010). Research has also found nationalistic preferences in the evaluations of ski jumpers, figure skaters (Zitzewitz, 2006, 2014; Krumer et al., 2020), and dressage riders (Sandberg, 2018). Based on the existing literature on nationalistic biases among referees, we make the following prediction: Hypothesis 2: Foreign players are called for more fouls than their national counterparts.

Gender Composition of Referee Teams
Personality traits and risk preferences of men and women are often used to explain differences in sports outcomes at the individual level. For example, in judo, men benefit from psychological momentum, but women do not (Cohen-Zada et al., 2017b). Also, whereas Cohen-Zada et al. (2017a) found that in tennis men are more likely to choke under pressure, Lindner (2017), observing biathlon, found no gender differences.
Research also examines the influence of gender diversity on group performance (e.g., Wicker et al., 2012). Lee and Cunningham (2019) review the relationship between gender diversity and sport outcomes, and find an overall positive effect. However, several studies provide mixed evidence, and a number of moderating factors have been identified; namely, the type of setting, outcome, and sport role (e.g., administrators, coaches, or players). Regarding sport role, referees have been largely overlooked. Souchon et al. (2004) examined penalties called by male referees on male and female handball players. The authors found that female players were more likely to receive sanctions than their male counterparts. In a study analyzing videos of referees' decisions, Souchon et al. (2013) explored the influence of gender on the evaluation of penalties against male and female players and did not find significant differences. To the best of our knowledge, no previous research has considered implicit biases in professional sports as a function of the gender composition of referee teams.
Documented differences in decisions lead to disparate outcomes for women and men in the workplace (Chang and Milkman, 2020). Both men and women exhibit gender biases that tend to punish women, especially for displaying stereotypical male behaviors. However, there is scant evidence on the possible influence of group gender composition on decisions, despite its relevance to the decision-making process.
Regarding risk preferences, evidence from laboratory experiments is mixed. Booth et al. (2014) found that women were more likely to take risks in all-female groups than in mixed groups, whereas Castillo et al. (2019) found that women become more risk tolerant as the proportion of men increases. In natural settings, studies also find inconsistent results. Bagues et al. (2017) reported that the presence of women evaluators on a committee does not alter the quantity or quality of selected candidates to professorships in Italy and Spain. De Paola and Scoppa (2015), however, found that all-male science committees, unlike mixed-gender ones, are biased against women candidates.
These studies theorize that the presence of women could induce a "licensing effect" (Monin and Miller, 2001), such that with women present, men may feel licensed to behave differently, or male identities may be strengthened, which in turn might alter the outcome (Akerlof and Kranton, 2000). In basketball, toughness and aggressiveness are stereotypically male behaviors, which, according to the licensing effect, may lead mixed referee teams to behave differently than all-male teams. Based on the existing literature on the licensing effect, we make the following prediction: Hypothesis 3: The number of foul calls differs between all-male and mixed referee teams.

Data Description
We gathered data on players, coaches, and referees for the women's first division (765 matches) and second division (1,790 matches) in Spain. The database covers five seasons (2014-2015 to 2018-2019) and 58 teams.
In total, the database has 53,398 observations at the player-match level. We used the official websites of the leagues to gather match statistics, as well as information on teams, coaches, players, and referees. The race of players was inferred by looking at pictures in newspapers and on league and team websites. Table 1 displays the complete list of summary statistics.
Players' statistics included the number of fouls per match, minutes played, points, and shooting efficiency. Players' age and status (drafted and/or international) were also gathered. Drafted and international players might have not only better skills, but also a "superstar" reputation that may bias the decisions of referees. "Drafted" refers to players selected by the Women's National Basketball Association (WNBA), the top women's basketball league in the United States. "International" refers to players who compete with their respective national teams. Finally, we included the player's position, which is expected to influence the number of fouls (Sampaio et al., 2006).
Several teams in the Spanish league are located in the same city or close to each other. We controlled for a potential derby effect with a binary variable that classified matches as a derby if the teams operate within a 50-km radius of each other (cf. Buraimo et al., 2012). At the end of the season, the best six teams qualified for the playoffs. The top two teams qualified directly for the semifinals. The other four teams competed in best-of-three quarterfinals. We controlled for a potential playoff effect with a binary variable that classified matches as regular season vs. playoff (Price et al., 2012). Referee teams in the dataset were composed of two referees, with one exception: in the 2018-2019 season, Division 1 matches had three referees. We dropped this season from the sample, so that the variable that captures gender composition of referee teams is homogeneous. (The results remain unaltered if this season is included.) In the sample, less than 2% of the referee pairs consist of two women. Thus, a binary variable was created that takes the value 1 if the referee pair does not include a woman, and 0 if the team includes at least one woman. The vast majority of referees are white and Spanish.
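The two binary codings described above can be illustrated with a minimal sketch. It assumes that the 50-km radius is measured as great-circle distance between home venues and that the threshold is inclusive; the text does not specify either point. The function names and the "M"/"F" gender labels are hypothetical and serve only as illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_derby(home_coords, away_coords, radius_km=50.0):
    """1 if the two venues lie within radius_km of each other, else 0.

    Assumption: distance is great-circle and the threshold is inclusive.
    """
    return int(haversine_km(*home_coords, *away_coords) <= radius_km)


def all_male_team(referee_genders):
    """1 if the referee pair includes no woman, 0 if it includes at least one."""
    return int(all(g == "M" for g in referee_genders))
```

For example, `all_male_team(["M", "F"])` and `all_male_team(["F", "F"])` both return 0, which mirrors the decision to pool mixed and all-female pairs into a single category.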

Empirical Strategy
We followed previous research on biases among sport referees and used different models to include the main independent variables (nationality/race and gender composition of referee teams) and the rest of control variables (U, V, and W) sequentially (Garicano et al., 2005; Gallo et al., 2013; Pope and Pope, 2015). This entry method allows us to capture any influence of the main independent variables on the number of fouls, and identify the control variables that could moderate this relationship.
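To examine the influence of player nationality and gender composition of referee teams, we used the following empirical model:

$$Y_{itg} = \beta_0 + \beta_1 \mathit{Foreign}_{itg} + \beta_2 RT_{itg} + \beta_3 (RT_{itg} \times \mathit{Foreign}_{itg}) + \beta_4 U_{itg} + \beta_5 V_{itg} + \beta_6 W_{itg} + \varepsilon_{itg}$$

Similarly, to examine the influence of player race and gender composition of referee teams, we used the following model:

$$Y_{itg} = \beta_0 + \beta_1 \mathit{Black}_{itg} + \beta_2 RT_{itg} + \beta_3 (RT_{itg} \times \mathit{Black}_{itg}) + \beta_4 U_{itg} + \beta_5 V_{itg} + \beta_6 W_{itg} + \varepsilon_{itg}$$

where i is a player in season t and match g, and ε is a random error term.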
In both regressions, Y is the dependent variable: foul rate (Tables 2 and 3) or whether a player received a fifth foul (0/1) (Table 4). RT is a binary variable that captures whether the referee team is all-male (value 1) or mixed (0). In the nationality regression, Foreign is a binary variable that takes value 1 if a player is foreign and 0 if she comes from Spain. In the race regression, Black is a binary variable that takes value 1 if a player is Black and 0 otherwise. Additionally, the interaction terms between referee team and player nationality ($RT_{itg} \times \mathit{Foreign}_{itg}$) and race ($RT_{itg} \times \mathit{Black}_{itg}$) identify whether all-male and mixed referee teams differ in the fouls they call on foreign or Black players. We used an additional analysis to further examine the influence of race on fouls among foreign players only (the results of this additional analysis do not differ from those described here, and are provided as Supplementary Material).
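To make the role of the interaction term concrete, the following sketch fits the nationality specification without controls (the equivalent of Model 1) by ordinary least squares on a handful of made-up observations. The foul rates are invented for illustration and are not taken from the study; the helper names are hypothetical, and the sketch omits the control vectors and the clustering of standard errors.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(foreign, rt):
    """Regressors: intercept, Foreign, RT, and the RT x Foreign interaction."""
    return [1.0, float(foreign), float(rt), float(foreign * rt)]


# Made-up foul rates (fouls per 40 minutes), NOT taken from the study.
# Each tuple is (foreign, all_male_referee_team, foul_rate).
observations = [
    (0, 0, 3.0), (0, 0, 3.2),
    (1, 0, 2.9), (1, 0, 2.7),
    (0, 1, 3.1), (0, 1, 3.3),
    (1, 1, 3.0), (1, 1, 2.8),
]
X = [design_row(f, rt) for f, rt, _ in observations]
y = [rate for _, _, rate in observations]

# b_interaction is the difference between the foreign/Spanish gap under
# all-male teams and the same gap under mixed teams.
b0, b_foreign, b_rt, b_interaction = ols(X, y)
```

In this toy data the gap between foreign and Spanish players is the same under both referee-team types, so the interaction coefficient is zero up to rounding error. That is the pattern a non-significant interaction term would correspond to.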
The remaining factors are identical in both regressions. All variables are listed in Table 1. The vector U (Model 2) controls for players' characteristics. We expect players with better records of points and assists to commit fewer fouls, as teams tend to protect these players from fouling out (Gomez et al., 2016). Different aspects of players might influence the chances of being called for a foul. For example, referees may hesitate to call a foul on a very good player. If a player is a "superstar," the effect may be even greater (e.g., Deutscher, 2015). We considered as superstars those players who were drafted by the WNBA and/or play for a national team (further categorizing national teams as top 20, 40, 60, and 80 in the world). We expect fewer foul calls against drafted players and international players from top-ranked national teams. Additionally, we included the position and age of players. Based on previous results, we expect centers and forwards, who play closer to the basket and secure rebounds, to have a higher number of fouls (Sampaio et al., 2006).

[Table 3. Influence of nationality and race in one-possession matches. Dependent variable: Fouls × 40 / minutes played. (a) Robust standard errors in parentheses; *** p < 0.01, ** p < 0.05, * p < 0.1. (b) All models are clustered at the player level.]

[Table 4. Influence of race and nationality on fifth foul calls. Dependent variable: Fifth foul. (a) Robust standard errors in parentheses; *** p < 0.01, ** p < 0.05, * p < 0.1. (b) All models are clustered at the player level.]
The vector V (Model 3) controls for characteristics of matches and teams. Binary variables control for derbies, playoff matches, and home court. We expect referees to call more fouls in derbies and playoff matches where rivalries are stronger, outcome uncertainty higher, and prizes at stake bigger (Buraimo et al., 2012; Price et al., 2012). Due to the home bias effect, we also expect referees to call fewer fouls against home players (Dohmen and Sauermann, 2016). Additionally, the vector includes season fixed effects to identify changes over time, and team fixed effects to account for variability in playing style not captured by individual player statistics.
The vector W (Model 4) includes individual referee fixed effects to capture unobserved heterogeneity. All models are clustered at the player level to control for individual differences of players. The analysis used linear regression models with robust standard errors (logit and probit models lead to similar results).
In the following, we organize the empirical analysis into three sections; each describes the analysis of different match and foul types:
(1) The first section includes the full sample in a general analysis of fouls in all types of matches. The dependent variable is players' fouls per match, weighted, following Price and Wolfers (2010), by minutes played. Other types of statistical normalization used in basketball (e.g., per-minute or per-36-min statistics) yield similar results.
(2) The second section focuses on one-possession matches. These matches end with a tight result, which increases the stress on both players and referees. For example, Garicano et al. (2005) found a home-team bias in referee decisions in close matches that disappears when the matches are uneven. We define one-possession matches as those in which the final point difference is three points or less. The dependent variable is again fouls per match (weighted as described above).
(3) The third section focuses on fifth foul calls. The fifth foul is especially important in basketball, because after it the player has to leave the match. Therefore, a subsample was derived, consisting of players who finished the match with either four or five fouls. A binary variable was created that took the value 1 if the player ended the match with five fouls and 0 if she ended with four.
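The three outcome definitions above can be written down compactly. The sketch below follows the scaling reported with the regression tables (fouls × 40 / minutes played), the three-point threshold for one-possession matches, and the four-versus-five coding of the fifth-foul subsample. The function names are hypothetical, and returning `None` for players with no minutes, or outside the subsample, is an assumption about how such cases would be excluded.

```python
MATCH_MINUTES = 40  # scaling constant from "Fouls * 40 / min played"


def foul_rate(fouls, minutes_played):
    """Fouls scaled to 40 minutes; None if the player logged no minutes."""
    if minutes_played <= 0:
        return None  # assumption: such observations are excluded
    return fouls * MATCH_MINUTES / minutes_played


def is_one_possession(home_points, away_points):
    """1 if the final margin is three points or less, else 0."""
    return int(abs(home_points - away_points) <= 3)


def fifth_foul_outcome(fouls):
    """1 for five fouls, 0 for four, None if outside the subsample."""
    if fouls == 5:
        return 1
    if fouls == 4:
        return 0
    return None
```

A player with 3 fouls in 20 minutes therefore has a foul rate of 6.0, and a match that ends 70-67 is classified as a one-possession match while one that ends 70-66 is not.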

RESULTS
The first graphical analysis provides an overview of the number of fouls called. Figure 1 displays a histogram that shows the percentage of fouls called on players by both types of referee teams in different types of matches. Except for small differences, the distribution for mixed and all-male referee teams is very similar. At an aggregate level, mixed referee teams called an average of 17.54 fouls per match, and all-male referee teams called an average of 17.45. The difference is not significant, which suggests that the gender composition of the referee teams does not have an impact on the number of fouls in a match. In one-possession matches, we observe a higher number of fouls: on average, both all-male and mixed referee teams called 18.34 fouls per match.
In all regression tables, the models show only the coefficients for the main variables (race, nationality, and gender composition of referee teams), omitting the whole set of control variables. Supplementary Table S1 (Supplementary Material) provides the complete results for the analysis of fouls in all types of matches. The control variables show the expected results, which are consistent and similar throughout the regression models and different analyses (the complete set of results for fouls in one-possession matches and fifth foul calls are available from the authors upon request).
Regarding player characteristics, referees call more fouls on centers. Centers tend to be taller and stronger players who are expected to play near the basket and use their size to the benefit of the team in defensive rebounding and blocking (Sampaio et al., 2006). Players scoring more points and providing more assists receive significantly fewer fouls. The opposite occurs with respect to player turnovers. The number of fouls is also lower for older players and players drafted by the WNBA. However, international players as a whole were not called for more or fewer fouls. Players at home received significantly more fouls than away players, which contradicts the home-team bias effect often found in men's sports (Dohmen and Sauermann, 2016). Deutscher (2015) found a similar result in the NBA. Regarding the type of match, the results show that players receive more foul calls in playoff matches.

Fouls in All Types of Matches
Table 2 reports the influence of nationality and race on the number of fouls per match at the player level. In Model 1 of the regression, we find a significant influence of nationality. The negative sign means that fewer fouls are called on foreign players. However, this result is not robust: it disappears when we include the variables that control for the player characteristics. Supplementary Table S1 (Supplementary Material) shows how this effect is mainly driven by the influence of older players and players with better records of points and assists. Table 2 provides no evidence that race/ethnicity has an influence on the number of fouls called by the referees.
The gender composition of referee teams does not have a significant effect on the number of fouls. Additionally, the interaction term that tests for biases against foreign or Black players does not yield significant results.

Fouls in One-Possession Matches
Table 3 reports the regression results for fouls called on players in more stressful matches (i.e., matches in which the difference in the final score is three points or less). A consistent finding throughout the models is that foreign players are called for significantly (p < 0.05) fewer fouls in one-possession matches. This finding is in line with previous evidence on referees and stress (Garicano et al., 2005), but does not support theories of in-group discrimination (Tajfel, 1982). Race has no significant influence on the number of fouls in this type of match.
The results also show that the gender composition of referee teams does not play a significant role in the number of fouls called on foreign or Black players in one-possession matches. The interaction term in the regression does not yield significant results either. This means that there is no evidence that all-male and mixed referee teams behave differently toward foreign players.

Fifth Foul Calls in All Types of Matches
Table 4 includes the regressions where the dependent variable is a special type of foul: the fifth, which brings disqualification. This analysis used only players who ended matches with four or five fouls. The results show that neither nationality nor race has a significant influence on fifth fouls. Neither foreign nor Black players were more likely to be disqualified than their counterparts.
Similar to previous analyses, the results provide no evidence that the gender composition of referee teams influences the number of fifth fouls called. The result is consistent across all models that included different types of fouls and matches.

DISCUSSION
This paper tests hypotheses about implicit biases. The analysis uses split-second decisions of referees in professional women's basketball in Spain to examine whether foreign and Black players face a significant bias. The analysis also tests whether the gender composition of referee teams (i.e., all-male vs. mixed teams) has an influence on foul calls. We found two main results: (1) foreign and Black players do not suffer discrimination; and (2) the gender composition of referee teams has no impact on the total number of calls or those called on foreign and Black players. We did, however, observe a bias that favors foreign (but not Black) players in stressful situations, challenging previous evidence on implicit biases and sport referees.
The main results from Table 2 do not support Hypotheses 1 or 2. We found no evidence of the racial/ethnic and nationalistic discrimination found by other researchers among sport referees (Dohmen and Sauermann, 2016) and professionals in other fields (e.g., Jost et al., 2009).
Race is a social construct that has evolved over time (Gannon, 2016) and has helped societies create social groups that often result in stereotypes and negative attitudes toward outside members (Park, 1950). In the United States, the historical discrimination against the Black population is rooted in the society and still affects the opportunities of this minority (Deruy, 2016). Such discrimination can be used to explain the implicit own-race bias that research finds among referees in men's professional sports (Price and Wolfers, 2010; Parsons et al., 2011).
The results from our study suggest that discrimination against players from minorities depends on the context and the country of analysis. Implicit racial associations in sport depend heavily on the historical and cultural evolution of the meaning of race (Kahn, 2013; Suzuki and Von Vacano, 2018). The label "foreigner" also has a strong constructivist component and follows a similar pattern, determining the relationships between individuals and the access to social domains (Fox and Miller-Idriss, 2008; Rad and Ginges, 2018).
Previous studies have found biases among referees and judges that favor competitors from the same country at both the individual (Zitzewitz, 2006, 2014; Sandberg, 2018; Krumer et al., 2020) and team level (Mohr and Larsen, 1998; Page and Page, 2010). However, our results show that foreign players do not have more fouls called on them. In fact, we observe the opposite: in stressful situations, foreign players are called for significantly fewer fouls.
This result is in line with the assumption that implicit biases should play a more prominent role when decisions are made under pressure. Previous research finds support for this assumption (e.g., Garicano et al., 2005). The direction of the bias, however, contradicts our initial hypothesis, as implicit bias should favor in-group players (Staats, 2014). One possible explanation relies again on the constructivist process of providing traits with meaning (Suzuki and Von Vacano, 2018), which is likely to differ depending on the context. Consumer ethnocentrism theories illustrate this process.
Nationality and ethnicity have been used to make in-group/out-group distinctions (Tajfel, 1982). In the context of our study, national (Spanish) and white players represent the in-group, and foreign and Black players the out-group. Following the logic of social identity theory, referees will tend to evaluate in-group players unreasonably favorably compared to their out-group counterparts. However, social identity and ethnocentric theories may work in opposite directions. In situations where foreign traits are regarded as better than the national ones, consumers will be motivated to favor national brands, but simultaneously, also comply with the in-group norm that foreign (read American) is better (Supphellen and Rittenburg, 2001).
This situation perfectly defines the current status of the Spanish women's basketball league with respect to other foreign leagues, especially the WNBA in the United States. Thus, in the Spanish context, while showing sympathy for national players, referees acknowledge the quality of foreign players, whose higher status can even bias the decisions. This condition also applies to men's sports. In basketball, previous research provides evidence of the consequences of ethnocentrism in situations of (un)equal perceived quality of foreign and national brands. Berri et al. (2015) found that players born in the United States receive preferential treatment from coaches, who allow them additional time on the court both in the NBA in the United States and Liga Asociación de Clubs de Baloncesto (ACB) in Spain. Our research is the first to show this relationship in women's sports, although it is significant only in more stressful situations.
In our study, the operational definition of stressful matches (where the difference in the final score is three points or less) may be subject to criticism. Not only does it reduce sample size, but there are other ways to define stress, including crowd size. Several studies explore the relationship between attendance and biases among referees and find significant results (e.g., Garicano et al., 2005; Dohmen, 2008). This is a limitation because we do not have access to attendance data. The basketball federation should consider making this information publicly available not only for research purposes but also for league development plans that encourage participation. Still, we know that women's basketball has lower attendance than comparable men's competitions, so that we can expect a smaller influence of the crowd. This could be a reason for the absence of an implicit bias effect in this league, as documented in other professional sports (Dohmen and Sauermann, 2016).
Another limitation of the present study is that we examined only differences between two groups: (1) white vs. Black players, and (2) foreign vs. national players. Potential discrimination that results from the combination of race and nationality is thereby overlooked (Borland and Bruening, 2010). We did perform an additional analysis to rule out this possibility (see Supplementary Material, Supplementary Table S2). We find no evidence that foreign Black players are treated differently than foreign white players. This result suggests that nationality and race as social constructs follow a similar pattern: both lack an association with negative stereotypes in this Spanish sport context.
The third hypothesis does not find support in any type of match and foul call. We found no evidence that the gender composition of referee teams affects foul calls. Neither the total number of fouls in a match nor the fouls called on foreign and Black players differ between all-male and mixed referee teams. This finding is novel in the literature on referee biases in sports. In professional sports, no previous contribution analyzed the relationship between the gender composition of referee teams and foul-call decisions. Women's sports provide an opportunity to extend knowledge in this line of research.
On the one hand, our results differ from the studies in other fields that find changes in women's risk behavior when participating in single- vs. mixed-gender groups in experimental settings (Booth et al., 2014; Castillo et al., 2019) and all-male committees with biases against women (De Paola and Scoppa, 2015). On the other hand, our results are consistent with Bagues et al. (2017), who found that the presence of women evaluators in a scientific committee does not have an influence on the quality of selected candidates.
Our findings also support the seminal ideas of Fine (2010), who argued that gender differences in psychological traits will not necessarily have an influence on measurable outcomes. The setting of referees in sports differs from the ones described above. Referees receive specific education and training to be consistent and impartial, but the decisions are time-constrained and do not allow for strategic contemplation. These factors could determine the influence of psychological traits leading to different outcomes.
The study also has limitations in terms of how much light it can shed on the referee gender composition question. Granted, we found no difference in the foul calls made by all-male vs. mixed referee teams. However, the limited number of teams exclusively composed of women is a drawback. We cannot conduct a more informative analysis regarding differences in behavior with presence/absence of individuals from the opposite gender. Similar to previous studies, we found no evidence that the behavior of male referees is affected by the gender of who else is on the team (Castillo et al., 2019), but we cannot rule out the possibility that women referees may behave differently in the absence of men on the team.
We performed a robustness check with the three groups of referees (all-male, mixed, and all-female) and found no significant differences; again, however, the low number of matches with all-female referee teams prevents further analysis. We urge researchers to find suitable settings to perform this analysis. The underrepresentation of women in different roles in women's sport is a recurrent problem. The governing bodies, federations, and referee committees need to consider strategies to increase the representation of women referees.
Another limitation of this setting is the lack of information on the referees. Unfortunately, we know only the gender and nationality of the referees working in this league. Additional information (such as age, experience, and education) would allow us to identify possible moderating factors. For example, Gallo et al. (2013) found that referees who were born and spent their entire life in the United Kingdom were more likely to penalize foreign, non-white players; however, these authors did not find any significant differences when using purely racial, national, or linguistic criteria. Moreover, it is not possible for us to identify which member of the referee team made the call. That information would be valuable for assessing behavior in the presence/absence of the opposite gender.
Beyond the need for more complete statistics in women's basketball, the results from this study have other implications for sports organizations. First, the results give league organizers no reason to restrict the gender composition of referee teams: referees can be assigned freely, without fear that players might be treated differently. Second, teams have no reason to exploit referee composition by, for instance, specifically lining up foreign (white or Black) or national players. Third, our findings provide no reason to avoid mixed-gender referee teams in men's basketball. The theoretical implications of mixed referee teams calling fouls on male players are different. The empirical setting, however, is almost identical. To advance knowledge of potential cross-gender biases, future research would benefit from the inclusion of female referees in men's sports. Additionally, such appointments would improve the visibility of women referees in a setting where they are severely underrepresented. Greater visibility could result in greater participation by girls and young women.
The lack of observable biases in women's basketball in Spain has positive implications, and the good performance of the referees should be acknowledged. Referees undergo specialized education and training to do the job in a consistent and impartial way. Although this is a constant in all sports, referees and judges in some disciplines need to deal with a stronger subjective component (e.g., in dressage, Sandberg, 2018; and figure skating, Zitzewitz, 2006). This is a potential explanation of differences in biases across disciplines.

CONCLUSION
Calling a foul on a basketball player is a split-second decision that might be susceptible to bias against members of minority groups. However, we found no evidence of biased decisions with respect to racial/ethnic or national out-groups. In other contexts, negative stereotypes associated with race and foreignness are presumably responsible for the biased decision-making of referees. By contrast, in women's basketball in Spain, there was no sign of negative bias, and therefore, by extension, no reason to assume negative stereotypes. Indeed, if anything, the results suggest positive stereotypes for foreign players in one-possession matches.
This study also found no evidence that the gender composition of referee teams affects split-second decisions. Across all types of matches, nationalities, and racial/ethnic groups, all-male and mixed referee teams did not differ in number of fouls called. Finally, with respect to all-female referee teams, sports organizations are encouraged to use them more widely, and researchers are encouraged to study them more extensively.

DATA AVAILABILITY STATEMENT
The datasets generated in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://doi.org/10.7910/DVN/XGDFKI.