VALID: A perceptually validated Virtual Avatar Library for Inclusion and Diversity

As consumer adoption of immersive technologies grows, virtual avatars will play a prominent role in the future of social computing. However, as people begin to interact more frequently through virtual avatars, it is important to ensure that the research community has validated tools to evaluate the effects and consequences of such technologies. We present the first iteration of a new, freely available 3D avatar library called the Virtual Avatar Library for Inclusion and Diversity (VALID), which includes 210 fully rigged avatars with a focus on advancing racial diversity and inclusion. We present a detailed process for creating, iterating, and validating avatars of diversity. Through a large online study (n=132) with participants from 33 countries, we provide statistically validated labels for each avatar's perceived race and gender. Through our validation study, we also advance knowledge pertaining to the perception of an avatar's race. In particular, we found that avatars of some races were more accurately identified by participants of the same race.


INTRODUCTION
The study of human behavior in virtual worlds is becoming increasingly important as more and more people spend time in these digital spaces. In particular, understanding how behaviors in virtual environments compare to behaviors in real life can provide insight into the ways in which humans interact with technology and each other. Furthermore, it can also help inform the design of virtual environments, making them more realistic and socially acceptable.
One area of significant interest is virtual avatars, which are 3D representations of virtual humans often used in virtual worlds and simulations [59]. Virtual avatars have a prominent role in social computing and immersive environments [18,28,74]. For example, previous studies have utilized virtual avatars to express emotions [11], recreate social psychology phenomena [17,25], and influence feelings of presence or embodiment [52,55]. Moreover, virtual avatars are expected to play ever more important roles in the future. For instance, virtual avatars are at the center of every social virtual reality (VR) and augmented reality (AR) interaction, as they are key to representing remote participants [60] and facilitating collaboration [69].
Open virtual avatar libraries facilitate research by providing freely available resources to the community. For instance, Microsoft's Rocketbox avatar library [24] has been extensively used in multiple studies (e.g., [13,29,32]), with over 400 stars and 100 forks on GitHub since its release in 2020. However, the representation of races in currently available avatar libraries is limited and has not been validated through user perception studies. As examples, over 40% of the Rocketbox avatars are white males, and MakeHuman, another popular resource for 3D avatars, only includes options for African, Asian, and Caucasian avatars. Due to such limitations, many researchers are unable to properly investigate the effects of diversity for virtual avatars. Furthermore, existing libraries not only have limited diversity but are also biased in their lack of inclusivity with regard to professions for minority avatars. For instance, Rocketbox lacks Asian avatars dressed for medical scenarios.
The limited representation of currently available resources may have an inherently detrimental impact by introducing biases into research investigations from the outset, such as restricting the development of scenarios with Asian medical professionals. Furthermore, as immersive technologies grow in popularity worldwide, Taylor et al. [70] have called for improved racial/ethnic representations in VR to mitigate potential negative effects of racial bias. This can be especially important since virtual avatar race can have a strong influence on human behavior. For example, research has found that embodying a virtual avatar of a different race can affect implicit racial bias, as demonstrated by several prior studies [26,42,57,63]. As such, it is critical to provide more inclusive avatar resources.
The goal of this work was twofold. First, we sought to develop an open library of validated virtual avatars, which we refer to as the Virtual Avatar Library for Inclusion and Diversity (VALID). This library has been designed to inclusively represent a range of races across various professions. Second, we wanted to better understand how humans perceive avatars, particularly with regard to race. In this first iteration of VALID, we followed the recommendations of the 2015 National Content Test Race and Ethnicity Analysis Report from the U.S. Census Bureau [45] to ensure that a wide range of people was represented in our library. For example, this report found that members of Middle Eastern and North African (MENA) communities use the MENA category when it is available, but have trouble identifying their race when it is not available. Similarly, it also found that Hispanics can better identify their ethnicity when the conventional race and Hispanic origin questions are combined into one. Therefore, to overcome those traditional problems in racial and ethnic categorization, VALID includes seven races as recommended by the U.S. Census Bureau report [45] (which differs from the 2020 U.S. Census): American Indian or Native Alaskan (AIAN), Asian, Black or African American (Black), Hispanic, Latino, or Spanish (Hispanic), Middle Eastern or North African (MENA), Native Hawaiian or Pacific Islander (NHPI), and White.
VALID includes 210 fully rigged virtual avatars designed to advance diversity and inclusion. We iteratively created 42 base avatars (7 target races × 2 genders × 3 individuals) using a process that combined data-driven average facial features with extensive collaboration with representative stakeholders from each racial group. To address the longstanding issue of the lack of diversity in virtual designers and to empower diverse voices [9], we adopted a participatory design method. This approach actively involved individuals (n = 22) from diverse backgrounds, particularly different racial and ethnic identities, in the design process. By including these individuals as active participants, we aimed to ensure that their perspectives, experiences, and needs were considered and incorporated into the design of the avatars.
Once the avatars were created, we sought to evaluate their perception on a global scale. We conducted a large online study (n = 132) with participants from 33 countries, self-identifying as one of the seven represented races, to determine whether the race and gender of each avatar are recognizable, and therefore validated. We found that all Asian, Black, and White avatars were universally identified as their modeled race by all participants, while our AIAN, Hispanic, and MENA avatars were typically only identified by participants of the same race, indicating that participant race can bias perceptions of a virtual avatar's race. We have since modeled the 42 base avatars in five different outfits (casual, business, medical, military, and utility), yielding a total of 210 fully rigged avatars.
To foster diversity and inclusivity in virtual avatar research, we are making all of the avatars in our library freely available to the community as open source models. In addition to the avatars, we are also providing statistically validated labels for the race and gender of all 42 base avatars. Our models are available in FBX format, are compatible with previous libraries like Rocketbox [24], and can be easily integrated into most game engines such as Unity and Unreal. Additionally, the avatars come equipped with facial blend shapes to enable researchers and developers to easily create dynamic facial expressions and lip-sync animations. All avatars, labels, and metadata can be found at our GitHub repository: [LINK REDACTED FOR REVIEW].
This paper makes three primary contributions: (1) We provide 210 openly available, fully rigged, and perceptually validated avatars for the research community, with a focus on advancing diversity and inclusion. (2) Our diversity-represented user study sheds new light on the ways in which people's own racial identity can affect their perceptions of a virtual avatar's race. In our repository, we also include the agreement rates of all avatars, disaggregated by every participant race, which offers valuable insights into how individuals from different racial backgrounds perceive our avatars.
(3) We describe a comprehensive process for creating, iterating, and validating a library of diverse virtual avatars. Our approach involved close collaboration with stakeholders and a commitment to transparency and rigor. This could serve as a model for other researchers seeking to create more inclusive and representative virtual experiences.

RELATED WORK
In this section, we describe how virtual avatars are used within current research in order to highlight the need for diverse avatars.
We conclude the section with a discussion on currently available resources used for virtual avatars and virtual agents.

Effect of Avatar Race
Virtual avatars are widely used in research simulations such as training, education, and social psychology. The race of a virtual avatar is a crucial factor that can affect the outcomes of these studies.
For example, research has shown that underrepresented students often prefer virtual instructors who share their ethnicity [3,48].
Similarly, studies have suggested that designing a virtual teacher of the same race as inner-city youth can have a positive influence on them [4], while a culturally relevant virtual instructor, such as an African-American instructor for African-American children, can improve academic achievement [21].
The design of virtual avatars is especially important for minority or marginalized participants. Kim and Lim [37] reported that minority students who feel unsupported in traditional classrooms develop more positive attitudes towards avatar-based learning. In addition, children with autism spectrum disorder treat virtual avatars as real social partners [1,38]. Therefore, to better meet the needs of all individuals participating in such studies, it is important for researchers to have access to diverse avatars that participants can comfortably interact with. Diversity in virtual avatars is important not only for improving representation, but also for enhancing the effectiveness of simulations. Halan et al. [27] found that medical students who trained with virtual patients of a particular race demonstrated increased empathy towards real patients of that race. Similarly, Bickmore et al. [6] showed that interacting with a minority virtual avatar reduced racial biases in job hiring simulations.
These findings highlight the importance of diverse and inclusive virtual avatars in research simulations and emphasize the need for more comprehensive representation of different races. Access to a wide range of validated avatars through VALID will help to create more inclusive and representative simulations, and enable researchers to investigate the impact of avatar race or gender on participants' experiences. This will help improve the inclusivity of simulations and contribute towards addressing issues of bias.

Implicit Racial Bias and Virtual Avatars
Avatars are becoming increasingly important in immersive applications, particularly in the realm of VR, where they are becoming ubiquitous [16]. Research suggests that the degree of similarity between a user's real body and their virtual avatar can influence embodiment, presence, and cognition [34,35,46,53]. Maister et al. [42] proposed that non-matching self-avatars can affect a user's self-association and influence social cognition. Moreover, studies have demonstrated that embodying a darker-skinned avatar in front of a virtual mirror can reduce implicit racial biases [26,42,57,63], which are unconscious biases that can lead to discriminatory behavior [12]. For instance, Salmanowitz et al. [63] found that a VR participant's implicit racial bias affects their willingness to convict a darker-skinned suspect based on inconclusive evidence. Similarly, Peck et al. [56] found that each participant's implicit racial bias was related to their nuanced head and hand motions in a firearm simulation. These foundational studies provide compelling evidence that embodying an avatar of a different race can affect implicit biases and further emphasize the need for diverse avatar resources.
Our study examines how participants perceive the race of diverse virtual avatars. While some studies have explored how a virtual avatar's race affects user interactions (e.g., [3,19,75]), little research has been conducted on how individuals actively perceive the race of virtual avatars. Setoh et al. [66] note that racial identification can predict implicit racial bias, making it crucial to understand how people perceive the race of virtual avatars to further investigate these effects.

Own-Race Bias
Own-race bias, also known as the "other-race effect," refers to the phenomenon in which individuals process the faces of their own race differently from those of other races [14,20,47,62]. Studies have suggested that this bias can influence the way individuals categorize race. For example, MacLin and Malpass [41] found that Hispanic participants were more likely to categorize Hispanic faces as fitting their racial category than Black faces, and Blascovich et al. [7] observed that participants who strongly identify with their in-group are more accurate in identifying in-group members.
Although own-race bias has not yet been studied in the context of 3D virtual avatars, Saneyoshi et al. [64] recently discovered that it extends to the uncanny valley effect [49] for 2D computer-generated faces. Specifically, they found that Asian and European participants rated distorted faces of their own race as more unpleasant than those of other races. Building on this research, we extended the study of own-race bias to 3D virtual avatars and focused on race categorization rather than perceived pleasantness. Our study included avatars and participants from seven different races, providing insights into how a diverse user population may interact within equally diverse virtual worlds.

Virtual Avatar Resources
There are numerous resources for creating virtual avatars. Artists can use 3D modeling tools, such as Autodesk 3ds Max, Autodesk Maya, Blender, or ZBrush, to manually model, texture, and rig virtual avatars. However, such work requires expertise in 3D modeling and character design, and is often a tedious process [24]. On the other hand, parametric models, including freely available tools like MakeHuman and Autodesk Character Generator, as well as commercially available ones such as Daz3D, Poser, and Reallusion Character Creator, enable users to generate virtual avatars from predefined parameters, thereby significantly expediting the avatar generation process. Nonetheless, using these tools still requires learning a new program and spending time customizing each model, even though they do not demand the artistic expertise needed for manual tools.
Another alternative to traditional modeling is to use scanning technologies, which can capture 3D models of real people. For instance, Shapiro et al. [67] and Waltemate et al. [73] used 3D cameras and photogrammetry, respectively, to capture 3D models of their users. Singular Inversions FaceGen Modeller has also been employed to generate 3D faces from user photos and then apply them to a general 3D avatar body [8,22]. However, scanning approaches require the ability to physically scan the user, limiting their use for certain applications, particularly remote ones.
Most closely related to our goal of providing a free and open library of ready-to-use avatars are the Microsoft Rocketbox library [24] and its accompanying HeadBox [72] and MoveBox [23] toolkits. Rocketbox provides a free set of 111 fully rigged adult avatars of various races and outfits. However, it falls short in terms of representation by not including any avatars of AIAN or NHPI descent. Additionally, the library offers only a limited number of Asian, Hispanic, and MENA avatars, excluding minority representations for some professions (e.g., Rocketbox does not include any Asian medical avatars). Furthermore, none of the available avatar libraries have been validated by user perception studies to ensure their efficacy and inclusivity. Therefore, our VALID project aims to fill this gap by providing a free and validated library of diverse avatars.

AVATAR CREATION PROCEDURE
This section outlines our iterative process for developing the VALID library, which includes 42 base avatars. We began by using data-driven averaged facial features to create our initial models. We then conducted interviews with representative volunteers to iteratively refine and modify the avatars based on their feedback.

Initial Modeling
To ensure a broad diversity of people were represented in our library, we initially created 42 base avatars (7 target races × 2 genders × 3 individuals) modeled after the seven racial groups recommended by the 2015 National Content Test Race and Ethnicity Analysis Report [45]: AIAN, Asian, Black, Hispanic, MENA, NHPI, and White. We created 3 male and 3 female individuals for each race, resulting in a total of 6 individuals per race.
Preliminary models were based on averaged facial features of multiple photos selected from the 10k US Adult Faces Database [2] and stock photos from Google for races missing from the database (e.g., AIAN, MENA, and NHPI). These photos were used as input to a face-averaging algorithm [15], which extracted average facial features for each race and gender pair. Using these averages as a reference, a 3D artist recreated the average faces for each race and gender pair using Autodesk Character Generator (due to its generous licensing and right to freely edit and distribute generated models) and Blender to make modifications not supported by Autodesk Character Generator (see Figure 1).

Figure 1: An example of the creation of a 3D avatar using our methodology. 1) We select 4-7 faces from a database [2] or stock photos. 2) We calculate the average face using WebMorph [15]. 3) A 3D artist recreates the average face using modeling software. 4) The models are improved iteratively through recurrent consultation with representative volunteers.

Iterative Improvements through Representative Interviews
After the preliminary avatars were created based on the facial averages, we worked closely with 2 to 4 volunteers of each represented race (see Table 1) to adjust the avatars through a series of Zoom meetings. This process ensured that all avatars were respectful and reduced the likelihood of harmful or stereotypical representations. Volunteers self-identified their race and were recruited from university cultural clubs (e.g., Asian Student Association, Latinx Student Association), community organizations (e.g., Pacific Islanders Center), and email lists. We iteratively asked these volunteers for feedback on all avatars representing their race, showing them the model from three perspectives (see Figure 2). Volunteers were specifically asked to identify accurate features and suggest changes to be made. Once the changes were completed based on the feedback, we presented the updated avatars to the volunteers. This process was repeated until they approved the appearance of the avatars. For example, volunteers requested changes to facial features, and we additionally modified hairstyles according to their feedback.
Once the avatars were approved by their corresponding volunteer representatives, we conducted an online study to validate the race and gender of each avatar based on user perceptions.

AVATAR VALIDATION STUDY
We conducted an online, worldwide user study to determine whether the target race and gender of each avatar is recognizable and, therefore, validated. Participants were recruited from the online Prolific marketplace, which is similar to Amazon Mechanical Turk. Prior research shows that Prolific has a pool of more diverse and honest participants [58] and has more transparency than Mechanical Turk [54]. Since diversity was a core theme of our research, we chose Prolific to ensure that our participants would be diverse.

Procedure
The following procedure was reviewed and approved by our university Institutional Review Board (IRB). The study consisted of one online Qualtrics survey that lasted an average of 14 minutes. Each participant first completed a background survey that captured their self-identified demographics, including race, gender, and education. Afterwards, they were asked to familiarize themselves with the racial terms as defined by the U.S. Census Bureau research [45].
Participants were then asked to categorize the 42 avatars by their perceived race and gender. Participants were shown only one avatar at a time and the order was randomized.
For each of the avatars, participants were shown three perspectives: a 45° left headshot, a direct (0°) headshot, and a 45° right headshot (see Figure 2). Avatars were shown from the shoulders up and were dressed in a plain gray shirt. The images were rendered in Unity using the standard diffuse shader and renderer. The avatars were illuminated by a soft white (#FFFEF5) directional light with an intensity of 1.0, and light gray (#7F7F7F) was used for the background. Participants were asked to select all races that each avatar could represent: "American Indian or Alaskan Native", "Asian", "Black or African American", "Hispanic, Latino, or Spanish", "Middle Eastern or North African", "Native Hawaiian or Pacific Islander", "White", or "Other". "Other" included an optional textbox if a participant wanted to be specific. We allowed participants to select multiple categories according to the U.S. Census Bureau's recommendations for surveying race [45]. For gender, participants were able to select "Male", "Female", or "Non-binary". Participants were paid $5.00 via Prolific for completing the study.

Participants
A total of 132 participants (65 male, 63 female, 4 non-binary) from 33 different countries were recruited to take part in the study. We aimed to ensure a diverse representation of perspectives by balancing participants by race and gender. Table 2 provides a breakdown of our participants by race, gender, and country. Despite multiple recruitment attempts, including targeted solicitations via Prolific, we had difficulty recruiting NHPI participants. It is important to note that we excluded volunteers who had previously assisted with modeling the avatars from participating in the validation study to avoid potentially overfitting to their own biases.

Data Analysis and Labeling Approach
To validate the racial identification of our virtual avatars, we used Cochran's Q test [68], which allowed us to analyze any significant differences among the selected race categories. This approach was necessary since our survey format allowed participants to select more than one race category for each avatar, following the U.S. Census Bureau's research recommendations [45]. Since the Chi-squared goodness of fit test requires mutually exclusive categories, we were unable to use it in our analysis. Furthermore, since our data were dichotomous, a repeated-measures analysis of variance (ANOVA) was not appropriate. Therefore, Cochran's Q test was the most appropriate statistical analysis method for our survey data. We used a rigorous statistical approach to assign race and gender labels to each avatar. First, we conducted Cochran's Q test across all participants (n = 132) at a 95% confidence level to identify significant differences in the participants' responses. If the test indicated significant differences, we performed pairwise comparisons between each race using Dunn's test to determine which races were significantly different.
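As a rough illustration of the omnibus test described above, the sketch below computes the Cochran's Q statistic for a binary participant-by-category matrix using the standard textbook formula. The function name `cochrans_q` and the toy data are illustrative assumptions and are not taken from the study's analysis scripts; the follow-up Dunn's test is not shown.

```python
from typing import List, Tuple


def cochrans_q(responses: List[List[int]]) -> Tuple[float, int]:
    """Return (Q statistic, degrees of freedom) for a binary matrix.

    responses[i][j] is 1 if participant i selected category j, else 0.
    Hypothetical helper for illustration only.
    """
    k = len(responses[0])                      # number of categories
    col_totals = [sum(row[j] for row in responses) for j in range(k)]
    row_totals = [sum(row) for row in responses]
    grand_total = sum(row_totals)

    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Happens when every participant selected all or none of the categories.
        raise ValueError("Q is undefined for this response matrix")
    return numerator / denominator, k - 1


# Toy data (made up): 4 participants, 3 categories, multiple selections allowed.
toy = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]
q_stat, dof = cochrans_q(toy)
print(q_stat, dof)  # 4.5 2
```

Under the null hypothesis, Q is approximately chi-squared distributed with k - 1 degrees of freedom, so a p-value could be obtained from a chi-squared survival function (for example, `scipy.stats.chi2.sf(q_stat, dof)`), and compared against 0.05 for the 95% confidence level mentioned above.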
For each avatar, we assigned a race label if the race was selected by the majority of participants (i.e., over 50% of participants selected it) and if the race was selected significantly more than other race choices and not significantly less than any other race. This approach resulted in a single race label for most avatars, but some avatars were assigned multiple race labels due to multiple races being selected significantly more than all other races. If no race was selected significantly more than the majority, then we categorized the avatar as "Ambiguous". We followed a similar procedure for assigning gender labels.
To account for the possibility that the race of the participant might influence their perception of virtual race, we also assigned labels based on same-race participants. This involved using the same procedure for assigning labels as described above, except based only on the selections of participants who identified as the same race as the avatar. This also allows future researchers to have the flexibility to use the labels from all study participants for studies focused on individuals from diverse racial backgrounds or to use the labels from participants of the same race for studies targeting specific racial groups.
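The labeling rule above can be read in more than one way, so the following is only one possible interpretation, sketched under stated assumptions: a race is kept as a label if more than 50% of participants selected it, it is not significantly less selected than any other race, and it is significantly more selected than every race that is not itself a label candidate. The function `assign_labels`, its inputs, and the example values are hypothetical; in particular, the set of significantly different pairs is assumed to come from the pairwise post-hoc comparisons.

```python
from typing import Dict, FrozenSet, List, Set


def assign_labels(rates: Dict[str, float],
                  significant_pairs: Set[FrozenSet[str]]) -> List[str]:
    """Assign race labels from agreement rates and pairwise test outcomes.

    rates: fraction of participants who selected each race for one avatar.
    significant_pairs: unordered pairs of races whose selection rates
        differed significantly (assumed output of the post-hoc tests).
    """
    def sig_less(a: str, b: str) -> bool:
        # True if a was selected significantly less often than b.
        return frozenset((a, b)) in significant_pairs and rates[a] < rates[b]

    # Majority-selected races that are not significantly below any other race.
    candidates = [
        r for r in rates
        if rates[r] > 0.5 and not any(sig_less(r, o) for o in rates if o != r)
    ]
    # Keep candidates that significantly exceed every non-candidate race.
    labels = [
        r for r in candidates
        if all(sig_less(o, r) for o in rates if o not in candidates)
    ]
    return sorted(labels) or ["Ambiguous"]


# Made-up example: two races both dominate the remaining option.
rates = {"Hispanic": 0.70, "White": 0.65, "Other": 0.05}
pairs = {frozenset(("Hispanic", "Other")), frozenset(("White", "Other"))}
print(assign_labels(rates, pairs))  # ['Hispanic', 'White']
```

With this reading, an avatar for which no race clears the majority threshold falls through to "Ambiguous", and two races that are statistically indistinguishable from each other but both above the rest yield a multi-race label.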

Validated Avatar Labels
Table 3 summarizes our results and labels for all 42 base avatars across all participants and for same-race participants.
5.1.1 Race and Gender Labels. Asian, Black, and White avatars were correctly identified as their intended race across all participants, while most of the remaining avatars were accurately identified by same-race participants (see Table 3 for all and same-race agreement rates). Therefore, we observed some differences in identification rates based on the race of the participants, highlighting the potential impact of own-race bias on the perception of virtual avatars. Notably, there were no significant differences in gender identification rates based on participant race, indicating that all avatars were correctly perceived as their intended gender by all participants, regardless of their racial background.

Naming Convention.
If an avatar was identified as its intended race by corresponding same-race participants, we named it after that race. For instance, the avatar Hispanic_M_2 was labeled as White by all participants. However, our Hispanic participants perceived it as solely Hispanic. Hence, we left the original name. However, if an avatar was labeled as "Ambiguous" or as a different race by same-race participants, we added an X at the beginning of its name to indicate that it was not validated. Avatars were also labeled by their identified gender ("M" or "F").
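To make the naming scheme concrete, the sketch below enumerates base avatar names in the `Race_Gender_Index` pattern seen in names such as Hispanic_M_2. The exact form of the "X" prefix (written here as `X_`), the helper `avatar_name`, and the ordering of races are assumptions for illustration; the repository may use a different convention.

```python
from itertools import product

RACES = ["AIAN", "Asian", "Black", "Hispanic", "MENA", "NHPI", "White"]
GENDERS = ["M", "F"]
INDIVIDUALS = [1, 2, 3]
OUTFITS = ["casual", "business", "medical", "military", "utility"]


def avatar_name(race: str, gender: str, index: int, validated: bool = True) -> str:
    """Build a base avatar name; the 'X_' prefix format is an assumption."""
    base = f"{race}_{gender}_{index}"
    return base if validated else f"X_{base}"


base_names = [avatar_name(r, g, i) for r, g, i in product(RACES, GENDERS, INDIVIDUALS)]

print(len(base_names))                 # 42 base avatars
print(len(base_names) * len(OUTFITS))  # 210 avatars across the five outfits
print(avatar_name("NHPI", "F", 1, validated=False))  # X_NHPI_F_1 (hypothetical)
```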

Other-Race vs. Same-Race Perception
To further examine how participant race affected perception of virtual avatar race, we additionally analyzed the data by separating same-race and other-race agreement rates. In effect, we separated the selections of the participants who were the same race as the avatar modeled and those who were not.
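The split described above can be sketched as follows, assuming each response is stored as a pair of the participant's self-identified race and the set of races they selected for a given avatar. The data layout, the function name `agreement_rates`, and the example responses are illustrative assumptions rather than the study's actual data format.

```python
from typing import List, Set, Tuple


def agreement_rates(avatar_race: str,
                    responses: List[Tuple[str, Set[str]]]) -> Tuple[float, float]:
    """Return (same-race, other-race) agreement rates for one avatar.

    A response "agrees" if the avatar's modeled race is among the races
    the participant selected (multiple selections are allowed).
    """
    same = [avatar_race in chosen for race, chosen in responses if race == avatar_race]
    other = [avatar_race in chosen for race, chosen in responses if race != avatar_race]

    def rate(hits: List[bool]) -> float:
        return sum(hits) / len(hits) if hits else float("nan")

    return rate(same), rate(other)


# Made-up responses for a single Hispanic avatar.
toy_responses = [
    ("Hispanic", {"Hispanic"}),
    ("Hispanic", {"Hispanic", "White"}),
    ("Hispanic", {"White"}),
    ("Hispanic", {"Hispanic"}),
    ("Asian", {"White"}),
    ("Black", {"Hispanic"}),
]
print(agreement_rates("Hispanic", toy_responses))  # (0.75, 0.5)
```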

Difference in Agreement Rates.
Figure 3 displays the difference in agreement rates between same-race and other-race participants. It shows that several avatars were strongly identified by both other-race and same-race participants. In particular, all Asian, Black, and White avatars were perceived as their intended race with high agreement rates by both same-race and other-race participants (over 90% agreement for all but one). However, some avatars were only identified by participants of the same race as the avatar. For instance, non-Hispanic participants had an average agreement rate of 54.5% for Hispanic avatars, while Hispanic participants had a much higher average agreement of 75.0%. Similar patterns were observed for AIAN (57.8% other-race, 75.0% same-race) and MENA (40.4% other-race, 68.0% same-race) avatars.

Perceived Race Clusters.
To gain deeper insights into how participants perceived the avatars' races, we employed Principal Component Analysis (PCA) to reduce the agreement rates of each of the 42 base avatars down to two dimensions. Next, we performed K-means clustering [39] on the resulting two-dimensional data to group the avatars based on their perceived race. We optimized the number of clusters using the elbow method and distortion scores [5]. We applied this technique to both other-race and same-race agreement rates to determine whether there were any differences in the clustering based on participant race. By visualizing the clusters, we aimed to better understand the differences in how participants perceived the avatars' races.
Figure 4 shows that Asian, Black, and White avatars were perceived consistently by all participants, with clearly defined clusters. However, there was more confusion in perceiving AIAN, Hispanic, MENA, and NHPI avatars, which clustered closer together. Same-race participants had less overlap and more accurately perceived these avatars, with more separation between them. For example, the Hispanic and MENA avatars were in separate clusters for same-race participants, except for one avatar (Hispanic_F_2). On the other hand, the Hispanic and MENA avatars were entirely clustered together for other-race participants.
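The clustering step can be illustrated with a small, dependency-free sketch. It assumes the PCA projection has already been done and starts from made-up two-dimensional points; it then runs a plain Lloyd-style K-means for several values of k and reports the distortion (sum of squared distances to the nearest center) that the elbow method inspects. The farthest-point initialization and the function name `kmeans` are assumptions chosen for determinism, not details reported in the paper; in practice, a library such as scikit-learn would typically be used for both PCA and K-means.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def kmeans(points: Sequence[Point], k: int, iters: int = 50) -> Tuple[List[Point], float]:
    """Run K-means on 2D points and return (centers, distortion)."""
    # Deterministic farthest-point initialization (an illustrative choice).
    centers: List[Point] = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))

    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers

    distortion = sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)
    return centers, distortion


# Made-up 2D coordinates standing in for PCA-projected agreement rates.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
distortions = {k: kmeans(toy_points, k)[1] for k in range(1, 5)}
for k, d in distortions.items():
    print(k, round(d, 2))
```

On this toy data the distortion drops sharply up to k = 3 and only marginally afterwards, which is the kind of "elbow" used to choose the number of clusters.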

DISCUSSION
In this section, we discuss the validation of our avatars. Specifically, we examine the extent to which each avatar was correctly identified as its intended race and the variability in identification across different participant groups. Additionally, we discuss the implications of our results for virtual avatar research, highlighting the importance of considering the potential impact of own-race bias on avatar race perception. Finally, we describe the potential future impact of our avatar library in the community, including how it can be used to promote diversity and inclusion.

Race Identification
6.1.1 Universally Identified Avatars. We found that our Asian, Black, and White avatars were recognized by all participants with high agreement rates. This suggests that these avatars can be a valuable tool for researchers seeking to create virtual humans that can be easily identified by individuals from different racial backgrounds.
Our results may be due to perceptual expertise or familiarity with other-race faces, as proposed by Civile et al. [14]. We hypothesize that this familiarity could be explained by the prevalence of these racial groups in global media and pop culture. For example, White cast members were the most represented in popular Hollywood movies over the last decade, followed by Black cast members [44]. Since Hollywood movies have a dominant share in the global film industry [43], people may be more familiar with characters that are prevalent in these films. Additionally, East Asian media culture has become widely popular worldwide over the past few decades [31,33]. Phenomena like "The Hallyu Wave" and "Cool Japan" [40] have enabled East Asian films, dramas, and pop music to gain a global following. As people may often encounter these racial groups in media, this familiarity may have facilitated their recognition of these avatars.
6.1.2 Same-Race Identified Avatars. As expected, some avatars were only identified by participants of the same race as the avatar, consistent with the own-race bias effect. For example, as seen in Table 3, the Hispanic avatars received mixed ratings of White and Hispanic across all participants, but most were perceived as solely Hispanic by Hispanic-only participants. Similarly, only one MENA avatar was perceived as MENA by all participants, while five were perceived as MENA by MENA-only participants. These results suggest that participants' own-race bias, a well-known phenomenon in psychology, may also affect their perception of virtual avatars. The findings point to the importance of considering participants' race when using virtual avatars in research or applications that require accurate representation of different racial groups.

Table 3: Assigned labels for all 42 base avatars. "All" indicates that the label was identified by all 132 participants, while "Same-Race" only includes the data of participants who identify as the race that the avatar was modeled for. Agreement labels were calculated as the percentage of participants who perceived an avatar to represent a race or gender.
6.1.3 Ambiguous Avatars. Several avatars in our library were perceived ambiguously both by all participants and by same-race participants alone, and were therefore labeled as ambiguous (see Table 3 for details).
Identifying the reason for these avatars' lack of clear identification is not straightforward, and multiple factors could be at play. For instance, the two ambiguous AIAN avatars were the only ones with short hairstyles, which may have impacted their identification as AIAN. Long hair carries cultural and spiritual significance in many AIAN tribes [71], and some participants, including AIAN participants, may have perceived the avatars as non-AIAN as a result. The validation of our NHPI avatars was limited, possibly due to the low number of NHPI participants (n = 12) in our study, despite our targeted recruitment efforts. As a consequence, most of the NHPI avatars were not validated by NHPI participants, and none of the female NHPI avatars were validated. Another potential reason for this lack of validation is that the majority of our NHPI participants identified themselves as New Zealand Maori, whereas our avatars were developed with the help of Samoan and Native Hawaiian volunteer representatives. Therefore, it is possible that our NHPI avatars are representative of some NHPI cultures, but not New Zealand Maori. In future studies, expanding recruitment efforts for both interview volunteers and study participants will be crucial, despite the challenges involved in doing so. For example, future studies may need to compensate NHPI participants more than participants of other races.

Implications for Virtual Avatars
Our study provides valuable insights for virtual avatar applications and research. Our findings indicate that human behavior in race categorization can apply to virtual avatars, which has notable implications for interactions in virtual experiences. Kawakami et al. [36] suggest that in-group and out-group categorization can lead to stereotyping, social judgments, and group-based evaluations. Therefore, designers and developers should be aware of this and take necessary steps to mitigate unintended consequences in virtual experiences. For example, regulating codes of conduct [70] can help to improve interracial interactions in VR. Interestingly, our study also replicated a nuanced finding from more recent psychology research on the perception of ambiguous avatars [51]. As seen in Table 3, most of the misidentified avatars were identified as Hispanic by all participants. Similarly, Nicolas et al. [51] recently found that participants classify racially ambiguous photos as Hispanic or MENA, regardless of their parent ethnicities. We believe that this effect extended to our virtual avatars.

An Open Library of Validated Avatars
As a contribution to the research community, we are providing open access to our virtual avatar library, which includes all 210 fully rigged avatars, along with validated labels for each avatar's race and gender. Our library features avatars of seven different races, providing a diverse selection for researchers to use in their studies. The validated labels can facilitate research on the impact of avatar race: researchers can use the "All" labels for studies aimed at individuals from different racial backgrounds, or the "Same-Race" labels for specific study populations.
The Virtual Avatar Library for Inclusion and Diversity (VALID) provides researchers and developers with a diverse set of fully rigged avatars suitable for various scenarios such as casual, business, medical, military, and utility. Each avatar comes with 65 facial blend shapes, enabling dynamic facial expressions (see Figure 5). The library is readily available for download and can be used in popular game engines like Unity or Unreal. Although this is the first iteration of the library, we plan to update it by adding more professions and outfits soon. In addition, the library can be used for a wide range of research purposes, including social psychology simulations and educational applications.

Limitations and Future Work
We recognize that our VALID avatar library is only a small step towards achieving greater diversity and inclusion in avatar resources. We acknowledge that the representation of each demographic is limited and plan to expand the diversity within each group by creating new avatars. For example, our Asian avatars are modeled after East Asians, but we plan to expand VALID to include South Asian and Southeast Asian avatars as well. Our Hispanic representatives have pointed out the need for more diverse Hispanic avatars, including varying skin tones to represent different Latin American populations, such as Mexican and Cuban. Additionally, our NHPI representatives have suggested that including tattoos, which hold cultural significance for some NHPI communities, could improve the identifiability of our NHPI avatars, alongside improvements to our NHPI recruitment methods. Any future updates to the library will undergo the same rigorous creation, iteration, and validation process as the current avatars.
While our first iteration of the library focused on diversity in terms of race, we realize that the avatars mostly represent young and fit adults, which does not reflect all types of people. In the future, we plan to update the library with a diversity of body types that include different body mass index (BMI) representations and ages. Including avatars with different BMI representations is not only more inclusive, but can also be useful for studies targeting physical activity, food regulation, and therapy [65]. Likewise, we plan to include shaders and bump maps [30] that can age any given avatar by creating realistic wrinkles and skin folds, further improving the diversity and inclusivity of VALID. Another limitation of the current work is that our library includes only male and female representations. In future updates, we plan to include non-binary and androgynous avatars. Currently, few androgynous models are freely available, yet they are an important area of study. For example, previous studies found that androgynous avatars reduce gender bias and stereotypical assumptions in virtual agents [50] and improve student attitudes [10]. Thus, we plan to include these avatars in a future update by following Nag et al.'s [50] guidelines for creating androgynous virtual humans.
Our study, while diverse in terms of race and country, is not representative of everyone. We recruited participants through the online platform Prolific, which is known for its increased diversity compared to other crowdsourcing platforms such as Mechanical Turk. However, due to the online nature of the platform, we primarily recruited younger adults. It is possible that perceptions of our avatars may differ among other age groups, such as children or older adults. Therefore, it is important to broaden recruitment efforts by exploring alternative platforms and recruitment strategies that may be more effective in reaching a wider range of participants. Future studies could also consider conducting in-person studies or focus groups to gather additional insights into avatar perception.

CONCLUSION
We have introduced a new virtual avatar library comprised of 210 fully rigged avatars with diverse professions and outfits, available for free. Our library aims to promote diversity and inclusion by creating equitable representation of seven races across various professions. We designed 42 base avatars using data-driven facial averages and collaborated with volunteer representatives of each ethnicity. A large validation study involving participants from around the world was conducted to obtain validated labels and metadata for the perceived race and gender of each avatar. Additionally, we offer a comprehensive process for creating, iterating, and validating diverse avatars to aid other researchers in creating similarly validated avatars.
Our validation study revealed that the majority of avatars were accurately perceived as the race they were modeled for. However, we observed that some avatars, such as the Hispanic and MENA avatars, were only validated as such by participants who identified as Hispanic or MENA, respectively. This finding suggests that the perception of virtual avatars may be influenced by own-race bias or the other-race effect, as described in the psychology literature. Moving forward, we plan to expand the library to include additional races, professions, body types, age ranges, and gender representations to further improve diversity and inclusion.
• "[These hairstyles] look straighter and more Eurocentric. So I would choose [these facial features] and then do a natural [hair] texture." - Black Volunteer 1
• "Usually the men have curly hair or their hair is cut short on the sides with the top showing." - NHPI Volunteer 1

Figure 2: An example of how each avatar was presented to participants during our validation study.

Figure 3: Confusion matrix heatmap of agreement rates for the 42 base avatars, separated by other-race participants and same-race participants (i.e., participants of a different or same race as the avatar). Agreement rates were calculated as the percentage of participants who perceived an avatar to represent a race or gender.

Figure 4: Clustered scatterplots of each avatar's relation to one another based on Principal Component Analysis and K-means clustering for other-race and same-race participant identifications. The Voronoi analysis shows the borders of the clusters where each category was assigned. Each avatar is color coded by its validated label. (a) Other-race clustering. (b) Same-race clustering.
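As a rough illustration of the kind of analysis summarized in Figure 4, the following Python sketch projects per-avatar agreement-rate vectors onto two principal components and then clusters them with K-means. It is a sketch under stated assumptions, not the analysis pipeline used in the study: the agreement rates are invented, the functions `pca_project` and `kmeans` are hypothetical helpers, the principal axes are found by simple power iteration, and the initial centroids are chosen by a deterministic farthest-point rule so that the result is repeatable. Assigning each point to its nearest centroid is what produces the Voronoi-style cell borders.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def pca_project(rows, n_components=2, iters=500):
    """Project rows onto their leading principal axes (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    axes = []
    for _ in range(n_components):
        v = [float(j + 1) for j in range(d)]  # fixed start keeps runs repeatable
        for _ in range(iters):
            v = [dot(cov_row, v) for cov_row in cov]
            for a in axes:  # stay orthogonal to the axes already found
                proj = dot(v, a)
                v = [x - proj * y for x, y in zip(v, a)]
            norm = math.sqrt(dot(v, v))
            if norm == 0.0:
                raise ValueError("not enough independent directions in the data")
            v = [x / norm for x in v]
        axes.append(v)
    return [[dot(r, a) for a in axes] for r in centered]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        # Nearest-centroid assignment; the implied boundaries are Voronoi cells.
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels, centroids

# Hypothetical agreement rates (%) per avatar for three labels: White, Hispanic, MENA.
rates = [
    [90, 5, 5], [85, 10, 5],    # two avatars mostly perceived as White
    [10, 80, 10], [5, 90, 5],   # two avatars mostly perceived as Hispanic
    [5, 10, 85], [15, 5, 80],   # two avatars mostly perceived as MENA
]
coords = pca_project(rates)
labels, centroids = kmeans(coords, k=3)
```

With this toy input, avatars with similar rating profiles should land in the same cluster. On real data, the number of clusters and the initialization strategy would need to be chosen and reported explicitly.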

Figure 5: Images of the skeleton and facial blend shapes included with our avatars.

Table 1: Breakdown of our volunteer representatives by race, gender (male, female, or non-binary), and country.