Is my body better than yours? Validation of the German version of the Upward and Downward Physical Appearance Comparison Scales in individuals with and without eating disorders

Introduction This study examines the psychometric properties of a German version of the Upward and Downward Physical Appearance Comparison Scales (UPACS and DACS). Methods A total of 2,114 participants, consisting of 1,360 women without eating disorders (Mage = 25.73, SDage = 6.84), 304 men without eating disorders (Mage = 24.48, SDage = 6.34), and 450 women with eating disorders (Mage = 27.11, SDage = 7.21), completed the UPACS and DACS as well as further questionnaires on appearance comparisons, eating disorder pathology, and self-esteem. Results Structural equation modeling confirmed the proposed one-factor structure of the original English-language version of the DACS but not of the UPACS. Both scales showed good internal consistency and test–retest reliability. The UPACS and DACS showed the expected correlations with related constructs, indicating acceptable construct validity, with some limitations for women with eating disorders. Discussion Overall, this study indicates that the German versions of the UPACS and DACS are psychometrically suitable for assessing upward and downward physical appearance comparisons in women and men without eating disorders and women with eating disorders in research and clinical practice.


Introduction
Schönhals et al. 10.3389/fpsyg.2024.1390063. Frontiers in Psychology, frontiersin.org
Body dissatisfaction is a risk factor for the development of eating disorders (EDs; Grabe et al., 2008; Rohde et al., 2015) and is strongly associated with appearance comparisons: The tripartite influence model of body image and eating disturbance proposes that comparing one's appearance to that of others mediates the influence of peers, parents, and media on body dissatisfaction (Thompson et al., 1999). In line with this, girls and women with ED symptoms have been found to engage more frequently in appearance comparisons than women without ED symptoms (Corning et al., 2006; Hamel et al., 2012). Moreover, the association between appearance comparisons and body dissatisfaction is assumed to be stronger in women than in men (Myers and Crowther, 2009), and research shows that women are more likely to engage in such comparisons (Strahan et al., 2006). Following Festinger (1954), appearance comparisons can take the form of upward comparisons, in which others are perceived as more attractive than oneself, or downward comparisons, in which others are perceived as less attractive. Upward appearance comparisons have frequently been related to lower self-esteem (Schmuck et al., 2019; Rüther et al., 2023), disordered eating (Blechert et al., 2009; Arigo et al., 2014), negative mood, and body dissatisfaction (Leahey et al., 2007, 2011; Ridolfi et al., 2011; Myers et al., 2012). The associations and effects of downward appearance comparisons, by contrast, are less clear. While many studies have reported associations of downward appearance comparisons with increased self-esteem (Pan and Peña, 2020), positive mood, and body satisfaction (van den Berg and Thompson, 2007; Bailey and Ricciardelli, 2010), others suggested that downward appearance comparisons do not have protective effects against body dissatisfaction and eating pathology (Fitzsimmons-Craft, 2017; Rogers et al., 2017), or even found associations with higher levels of body dissatisfaction (Vartanian and Dey, 2013), disordered eating (Drutschinin et al., 2018), drive for thinness, and restrained eating among women (Lin and Soby, 2016). The latter findings might be explained by the observation that downward comparisons are generally more commonly used as a coping strategy by individuals with low self-esteem (Wills, 1981). Additionally, the tendency to make upward appearance comparisons and the tendency to make downward appearance comparisons are associated with one another (O'Brien et al., 2009), indicating that people who engage in appearance comparisons do so in both directions, although upward appearance comparisons are often more prevalent (Ridolfi et al., 2011; McCarthy et al., 2023).
To assess general appearance comparisons in German-speaking individuals, the German version of the Physical Appearance Comparison Scale (PACS; original version: Thompson et al., 1991) is available and has been validated for women and men without EDs and women with anorexia nervosa (Mölbert et al., 2017). However, the PACS does not differentiate between upward and downward appearance comparisons. To overcome this limitation, O'Brien et al. (2009) developed the English-language Upward and Downward Physical Appearance Comparison Scales (UPACS and DACS). The UPACS and DACS are short and therefore economical scales consisting of 10 and eight items, respectively, with good internal reliability and construct validity. Principal component analysis revealed a one-factor solution for each scale. To the best of our knowledge, to date, there is no validated German-language questionnaire for the assessment of upward and downward appearance comparisons.
Therefore, the aim of the present study was to translate the UPACS and DACS into German and to examine the psychometric properties of the translated versions of both scales in a sample of women and men without EDs and women with EDs. We expected that women without EDs would show higher scores on the UPACS and DACS than men without EDs, as women tend to engage in appearance comparisons to a greater extent than men (Davison and McCabe, 2005; Strahan et al., 2006; O'Brien et al., 2009). Furthermore, we assumed that women with EDs would show higher scores on both scales than women without EDs, as appearance comparisons play a crucial role in the development of body dissatisfaction, which is in turn associated with EDs (Thompson et al., 1999; Leahey et al., 2011). Consistent with the original versions of the scales, we hypothesized a one-factor structure for both the UPACS and the DACS for all examined subsamples (O'Brien et al., 2009). In terms of construct validity, when examining the correlations of the UPACS and DACS with related measures, we assumed (1) positive correlations with established questionnaires regarding general physical appearance comparisons and eating pathology, in line with the original research (O'Brien et al., 2009), and (2) negative correlations with self-esteem, given that upward appearance comparisons have been found to be harmful for self-esteem (Schmuck et al., 2019), while downward comparisons are more frequently used as a coping mechanism by individuals with low self-esteem (Wills, 1981). Overall, from a descriptive perspective, we postulated stronger effects for the UPACS than for the DACS, as upward appearance comparisons have been more conclusively related to body image disturbances.

Participants
The sample for the present study consisted of N = 2,114 participants aged between 18 and 78 years (Mage = 25.84, SDage = 6.89) who completed the UPACS and DACS and related measures across nine studies. The sample comprised n = 1,360 women without eating disorders (Mage = 25.73, SDage = 6.84), n = 304 men without eating disorders (Mage = 24.48, SDage = 6.34), and n = 450 women with eating disorders (Mage = 27.11, SDage = 7.21; n = 191 anorexia nervosa, n = 132 bulimia nervosa, n = 127 binge eating disorder). Across the studies, n = 338 diagnoses were self-reported and n = 112 diagnoses were assessed using a structured clinical interview within the respective study (Structured Clinical Interview for DSM-IV, Wittchen et al., 1997; or Diagnostic Interview for Mental Disorders, Margraf et al., 2017). Some participants who took part in one of the nine studies were not included in the final sample, e.g., one man, who was the only man to self-report an ED, so that a subsample of men with EDs could not be analyzed; 117 participants with other or unclear EDs; nine participants with implausible or missing data (e.g., values outside the range of the respective scales); and seven participants under the age of 18 years. Participants' body mass index (BMI) ranged from 12.89 kg/m² to 65.31 kg/m² (MBMI = 23.68, SDBMI = 6.37). Across all participants, 9.46% were underweight (BMI < 18.50), 66.65% were of normal weight (BMI 18.50-24.99), 12.16% were overweight (BMI 25.00-29.99), and 11.73% were obese (BMI ≥ 30.00). The characteristics of the subsamples of the different studies are displayed in Table 1.
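The weight categories reported above can be reproduced from raw BMI values with a few lines of code. The following is only a minimal sketch: the function names and the example values are hypothetical, and the cut-offs are simply those stated in the text, with 30.00 counted as obese.

```python
from collections import Counter

CATEGORIES = ["underweight", "normal weight", "overweight", "obese"]

def bmi_category(bmi: float) -> str:
    """Classify a BMI value (kg/m^2) using the cut-offs given in the text."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"

def category_shares(bmis):
    """Return the percentage of participants in each weight category."""
    counts = Counter(bmi_category(b) for b in bmis)
    n = len(bmis)
    return {cat: round(100 * counts[cat] / n, 2) for cat in CATEGORIES}

# Hypothetical BMI values, not study data
example = [17.0, 22.0, 22.5, 27.0]
shares = category_shares(example)
```

Applied to the full sample, such a tabulation should yield shares that sum to 100%, as the four reported percentages do.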

Upward and Downward Physical Appearance Comparison Scales (UPACS and DACS)
The UPACS consists of 10 items assessing upward physical appearance comparisons and the DACS consists of eight items assessing downward physical appearance comparisons (O'Brien et al., 2009). For both scales, items are rated on a 5-point Likert scale from 1 = strongly disagree to 5 = strongly agree, with higher scores

Physical Appearance Comparison Scale
The PACS (Thompson et al., 1991) assesses the frequency of general appearance comparisons. The German version of the scale consists of five items that are rated on a 5-point Likert scale (1 = never to 5 = always; Mölbert et al., 2017), with higher scores indicating a higher tendency for general physical appearance comparisons. The internal consistency for the current sample was acceptable, at McDonald's ωt = 0.84.

Eating Disorder Examination Questionnaire
The EDE-Q assesses the psychopathology of EDs (Fairburn and Beglin, 1994). The German version used in the present study encompasses 22 items across the four subscales Restraint, Eating concern, Weight concern, and Shape concern. Items are rated on a 7-point Likert scale based on the frequency of certain behaviors in the past 28 days (0 = no day to 6 = every day) or the severity of the behaviors (0 = not at all to 6 = very much; Hilbert and Tuschen-Caffier, 2016). Higher scores indicate higher ED pathology. The internal consistency for the current sample was acceptable, at McDonald's ωt = 0.98 for the global score across the four subscales.

Eating Disorder Inventory-2
The EDI-2 assesses ED-related characteristics (Garner, 1991). The two subscales Drive for thinness and Body dissatisfaction of the German version used in the present study contain 16 items combined, rated on a 6-point Likert scale (1 = never to 6 = always; Paul and Thiel, 2005), with higher scores indicating a higher level of drive for thinness and body dissatisfaction, respectively. The internal consistencies for the current sample were acceptable, at McDonald's ωt = 0.95 for the Drive for thinness subscale and McDonald's ωt = 0.95 for the Body dissatisfaction subscale.

Rosenberg Self-Esteem Scale
The RSES assesses general self-esteem (Rosenberg, 1965). The German version contains 10 items rated on a 4-point Likert scale (0 = strongly disagree to 3 = strongly agree; Ferring and Filipp, 1996), with higher scores indicating higher self-esteem. The internal consistency for the current sample was acceptable, at McDonald's ωt = 0.95.

Holtmann et al., 2024 (manuscript in preparation)]. Study 9 was performed in order to obtain additional data, particularly regarding convergent validity with the PACS, to ensure an appropriate sample size for men, and to examine the test-retest reliability. The ninth study was the only study in which men with eating disorders or non-binary people could have participated. Unfortunately, only one man with an eating disorder and no non-binary people participated; therefore, no analyses for these groups could be conducted. In the nine studies, data were collected using either an online survey presented in Unipark or LimeSurvey, or through paper-and-pencil questionnaires. Participants were required to be at least 18 years old and were primarily recruited through email distribution lists, flyers, social media posts (e.g., on Instagram), and cooperation with clinics.

Procedure
In all nine studies, participants provided sociodemographic information and completed the UPACS and DACS, EDE-Q, EDI-2, and RSES. The PACS was administered in Studies 8 and 9 only. Participants who completed the questionnaires in Study 9 received an email asking them to complete the UPACS and DACS again 2 weeks later, and were reminded to take part in the retest a further week later. In total, n = 232 participants (n = 142 women without EDs, n = 50 men without EDs, n = 40 women with EDs) completed both assessments.
For the sample of women with EDs, EDs were either self-reported by participants or assessed using a structured clinical interview. While Study 1 used the Structured Clinical Interview for DSM-IV (SCID-I; Wittchen et al., 1997), Study 6 examined EDs using the Diagnostic Interview for Mental Disorders (DIPS; Margraf et al., 2017).

Data analysis
Descriptive data, comparisons between groups, and correlation analyses were calculated using IBM SPSS (version 28.0.1.1). McDonald's ωt was calculated using the psych package in R (version 4.3.2). Structural equation modeling (SEM) was performed using the lavaan package in R. As the assumption of normality was violated (Kolmogorov-Smirnov tests with Lilliefors correction, p < 0.001 for each respective scale in every subsample), Mann-Whitney U tests were performed to compare (a) the scores on the EDI-2 subscales and the EDE-Q global score between women with self-reported EDs and women with EDs diagnosed within a respective study, and (b) the UPACS and DACS scores between women and men without EDs and between women with and without EDs. All further analyses were conducted separately for (1) women without EDs, (2) men without EDs, and (3) women with EDs. The internal consistency was considered acceptable at McDonald's ωt > 0.65 (Kalkbrenner, 2023).
SEM with maximum likelihood estimation was used to examine whether the one-factor structure of the original scales can be transferred to the German versions. The fit/misfit indices χ², CFI, NNFI, RMSEA, and SRMR were determined to assess the goodness of fit of the one-factor models. The thresholds for good fit were CFI ≥ 0.97, NNFI ≥ 0.97, RMSEA ≤ 0.05, and SRMR ≤ 0.05, and the thresholds for acceptable fit were CFI ≥ 0.95, NNFI ≥ 0.95, RMSEA ≤ 0.08, and SRMR ≤ 0.10 (adapted from West et al., 2012). In the case of a poor model fit, possible re-specifications were examined through modification indices, solely for exploratory purposes in order to discuss possible future adaptations of the scales. Furthermore, measurement invariance across all subsamples was examined. Following Putnick and Bornstein (2016), measurement invariance was tested in four steps: (1) configural invariance, (2) metric invariance (loading invariance), (3) scalar invariance (intercept invariance), and (4) residual invariance. Invariance was defined based on the recommendations by Chen (2007). Thresholds that indicated non-invariance were ΔCFI ≤ −0.005, ΔRMSEA ≥ 0.010, and ΔSRMR ≥ 0.025 for metric invariance and ΔSRMR ≥ 0.005 for scalar and residual invariance.
As tests of correlations against 0 are robust to non-normality (Fowler, 1987), Pearson correlations were used to assess test-retest reliability and correlations with related constructs. The test-retest reliability was calculated by correlating the first and second assessments of the UPACS and DACS. To examine construct validity, the PACS, the subscales of the EDI-2, the EDE-Q, and the RSES were correlated with the UPACS and DACS. Furthermore, the UPACS and DACS were correlated with each other. Bonferroni correction was applied for the correlations regarding construct validity within each subsample, thus correcting for 11 significance tests in each case.
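To make the decision rules for model fit and multiple testing concrete, the following sketch encodes the good/acceptable thresholds named above and the Bonferroni adjustment for 11 tests. It is an illustration only: the function names are hypothetical, the fit values passed in are not results of this study, and the nominal α of 0.05 is an assumption, as the significance level is not stated in the text.

```python
import operator

# (comparison, threshold for good fit, threshold for acceptable fit)
THRESHOLDS = {
    "CFI": (operator.ge, 0.97, 0.95),
    "NNFI": (operator.ge, 0.97, 0.95),
    "RMSEA": (operator.le, 0.05, 0.08),
    "SRMR": (operator.le, 0.05, 0.10),
}

def classify_fit(index: str, value: float) -> str:
    """Label a fit index value as 'good', 'acceptable', or 'poor'."""
    compare, good, acceptable = THRESHOLDS[index]
    if compare(value, good):
        return "good"
    if compare(value, acceptable):
        return "acceptable"
    return "poor"

def bonferroni_alpha(alpha: float = 0.05, n_tests: int = 11) -> float:
    """Per-test significance level after Bonferroni correction.

    alpha = 0.05 is an assumed nominal level, not taken from the text.
    """
    return alpha / n_tests

# Hypothetical fit values for illustration
labels = {name: classify_fit(name, value)
          for name, value in {"CFI": 0.96, "RMSEA": 0.09, "SRMR": 0.04}.items()}
```

Under the assumed α of 0.05, each of the 11 correlations per subsample would be tested at roughly 0.0045.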

Preliminary analyses
When comparing women with self-reported EDs and women with EDs diagnosed within one of the studies, no significant differences emerged regarding the Body dissatisfaction subscale of the EDI-2, z = 0.48, p = 0.64. However, the two groups differed significantly on the Drive for thinness subscale of the EDI-2, z = 3.34, p < 0.001, and on the EDE-Q, z = 3.69, p < 0.001, with women with self-reported EDs showing significantly higher levels of ED pathology.
In order to include a greater range and variability of ED pathology, the further analyses include both women with diagnosed and self-reported EDs.
As expected, women without EDs showed significantly higher scores on the UPACS than men without EDs, z = 6.75, p < 0.001, and significantly lower scores than women with EDs, z = 12.21, p < 0.001. Likewise, women without EDs showed significantly higher scores on the DACS than men without EDs, z = 4.56, p < 0.001, and significantly lower scores than women with EDs, z = 6.66, p < 0.001.
The corrected item-total correlations (all ≥ 0.50) for the UPACS and DACS items were good in every subsample, indicating that all items correlate sufficiently with each respective scale (see Tables 2, 3). None of the items were normally distributed, as indicated by Kolmogorov-Smirnov tests with Lilliefors correction (all p < 0.001). Skewness and kurtosis of all items as well as box plots for the UPACS and DACS can be derived from the Supplementary material.
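The z values reported for these group comparisons stem from Mann-Whitney U tests. As a rough sketch of how such a z statistic can be obtained, the code below uses the normal approximation with a correction for ties and without a continuity correction. Whether this matches the exact computation in SPSS is an assumption, the function name is hypothetical, and the example scores are not study data.

```python
from collections import Counter
from math import sqrt

def mann_whitney_z(x, y):
    """z statistic of the Mann-Whitney U test (normal approximation, tie-corrected)."""
    combined = sorted(list(x) + list(y))
    # Assign average ranks to tied values
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    sd_u = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    return (u_x - mean_u) / sd_u

# Hypothetical scale scores for two small groups
z = mann_whitney_z([1, 2, 3], [4, 5, 6])
```

The sign of z depends only on which group is entered first; swapping the groups flips the sign but leaves the magnitude unchanged.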
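A corrected item-total correlation relates each item to the sum of the remaining items of the scale, so that the item does not correlate with itself. The sketch below illustrates this computation; the function names and the small response matrix are hypothetical and do not reproduce the values in Tables 2 and 3.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def corrected_item_total(responses):
    """Correlate each item with the sum of all other items.

    responses: list of rows, one per participant, each a list of item ratings.
    """
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total without item j
        correlations.append(pearson(item, rest))
    return correlations

# Hypothetical ratings of four participants on three items (1-5 Likert scale)
ratings = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [5, 5, 5]]
r_values = corrected_item_total(ratings)
```

In this artificial example all items move in lockstep, so every corrected item-total correlation equals 1; real questionnaire data would yield lower values.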

Confirmatory factor analyses
The results of the SEM analyses for the different subsamples are displayed in Table 4. Regarding the UPACS, the one-factor structure showed a poor fit on all examined indices, with the exception of the SRMR for all subsamples, which indicated an acceptable fit. Across all subsamples, the standardized loadings of the items were 0.46 ≤ λ* ≤ 0.92 (see Table 5). Invariance testing did not support configural invariance in the CFI and RMSEA; only the SRMR indicated an acceptable fit. Therefore, the further steps are not reported, but can be derived from Table 6. In sum, these results indicate that the proposed one-factor structure of the German UPACS is not adequate for all subsamples. For exploratory purposes, we examined re-specifications based on modification indices. In all subsamples, modification indices indicated a substantial improvement of the model fit when allowing for correlations of the errors of the two items "I find myself thinking about whether my own appearance compares well with models and movie stars" and "I tend to compare my own physical attractiveness to that of magazine models" (see Table 4, all MI > 90), indicating a redundancy of the item contents (Byrne, 2016). Despite the poor fit, the further analyses were nevertheless conducted for the entire 10-item UPACS in order to examine the psychometric properties in parallel to the original version, and because the good item-total correlations still indicate sufficient relations of each item with the rest of the scale.
Concerning the DACS, for all subsamples, the one-factor structure showed a good fit according to the CFI and SRMR, an acceptable fit according to the NNFI, and a mediocre fit according to the RMSEA. Across all subsamples, the standardized loadings of the items lay at 0.55 ≤ λ* ≤ 0.91 (see Table 7). Tests of measurement invariance supported configural and metric invariance (see Table 6). For scalar and residual invariance, the changes in the CFI lay above the threshold, indicating non-invariance, but the changes in the RMSEA and SRMR lay below the respective thresholds. Therefore, scalar and residual invariance were only partially supported. Taken together, these results indicate that the proposed one-factor structure of the German DACS is acceptable for all subsamples.
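The invariance decisions described here compare the change in each fit index between nested models with the cut-offs named in the data analysis section. The following sketch shows one possible reading of those cut-offs, in which the ΔCFI and ΔRMSEA thresholds apply to every step and only the ΔSRMR threshold differs between metric and later steps; this reading, the function name, and the example deltas are assumptions and not values from Table 6.

```python
def flag_noninvariance(step: str, d_cfi: float, d_rmsea: float, d_srmr: float) -> dict:
    """Flag which fit-index changes point to non-invariance for a given step.

    step: 'metric', 'scalar', or 'residual'.
    Deltas are computed as (more constrained model) minus (less constrained model).
    """
    if step not in ("metric", "scalar", "residual"):
        raise ValueError(f"unknown invariance step: {step}")
    srmr_cutoff = 0.025 if step == "metric" else 0.005
    return {
        "CFI": d_cfi <= -0.005,
        "RMSEA": d_rmsea >= 0.010,
        "SRMR": d_srmr >= srmr_cutoff,
    }

# Hypothetical deltas: CFI worsens beyond the cut-off, RMSEA and SRMR do not
flags = flag_noninvariance("scalar", d_cfi=-0.010, d_rmsea=0.004, d_srmr=0.002)
partially_supported = any(flags.values()) and not all(flags.values())
```

A mixed pattern of flags, as in this example, corresponds to the situation in which invariance is described as only partially supported.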

Internal consistency
For women without EDs (n = 1,360), the internal consistencies of the two scales were acceptable, at McDonald's ωt = 0.95 for the UPACS and McDonald's ωt = 0.94 for the DACS. Similarly, for men without EDs (n = 304), the internal consistencies were acceptable, at McDonald's ωt = 0.96 for the UPACS and McDonald's ωt = 0.95 for the DACS. For women with EDs (n = 450), internal consistency was

Test-retest reliability
The test-retest reliability could only be determined for participants who took part in the retest in Study 9 (n = 232). On average, the interval between the first and second assessment of the UPACS and DACS was M = 16.47 days (SD = 4.65). For women without EDs (n = 142), the test-retest reliability was rtt = 0.86, p < 0.001 for the UPACS and rtt = 0.70, p < 0.001 for the DACS. For men without EDs (n = 50), the test-retest reliability was rtt = 0.80, p < 0.001 for the UPACS and rtt = 0.64, p < 0.001 for the DACS. For women with EDs (n = 40), the test-retest reliability was rtt = 0.84, p < 0.001 for the UPACS and rtt = 0.71, p < 0.001 for the DACS.

Construct validity
In women and men without EDs, the UPACS and DACS showed significant positive correlations with each other, indicating that a higher tendency for upward appearance comparisons is associated with a higher tendency for downward comparisons, and vice versa (see Tables 8, 9). For women with EDs, the correlation between the UPACS and DACS was not significant (see Table 10).
As hypothesized, in women without EDs (see Table 8), the UPACS and DACS scores showed significant positive correlations with the PACS, the two subscales of the EDI-2, and the EDE-Q. The UPACS and DACS scores showed significant negative correlations with the RSES scores. All effect sizes were descriptively higher for the correlations with the UPACS than for the correlations with the DACS.
In men without EDs, the correlations followed a similar pattern as in women without EDs (see Table 9). From a descriptive perspective, in men, the effect sizes for the correlations of the EDE-Q with the UPACS and DACS differed only marginally. Furthermore, the effect size of the correlation of the EDI-2 subscale Drive for thinness with the DACS was slightly higher than that with the UPACS. Overall, all effect sizes were descriptively smaller in men than in women without EDs, indicating stronger associations of upward and downward appearance comparisons with related constructs in women than in men without EDs.
A different pattern emerged in women with EDs (see Table 10). The correlations of the PACS, the EDI-2 subscale Drive for thinness, and the EDE-Q with the UPACS and DACS were in the expected direction, with stronger effect sizes for the UPACS. The RSES showed a significant negative correlation with the UPACS but not with the DACS. The EDI-2 subscale Body dissatisfaction was not significantly correlated with either the UPACS or the DACS.

Discussion
The aim of the present study was to investigate the German-language versions of the UPACS and DACS in different samples of women and men without EDs and women with EDs. Using data collected over nine studies, we examined the psychometric properties and factor structure of both scales.
Regarding the factor structure of the scales, the results of the SEM and tests of measurement invariance indicated that the one-factor structure of the original English-language version is adequate for the DACS but not for the UPACS in all subsamples. Exploratory re-specifications based on modification indices indicated that the poor fit could be due to two similar items, concerning upward appearance comparisons with "magazine models" and with "models and movie stars." When allowing for correlation of the residuals of these two items, the fit of the one-factor model improved noticeably, suggesting that these items are closely interrelated, possibly due to the similar wording of the items. As both items include comparisons with models, with one specifying comparisons with magazine models, it is evident that the two items heavily overlap in content and in item wording. Moreover, it is questionable whether comparisons with models and celebrities like movie stars have the same qualities as they did back in 2009, when the original scale was developed, as the rise of social media since that time has led to a new type of comparisons, namely with so-called influencers (Ye et al., 2021). Nowadays, people tend to compare themselves with celebrities and influencers on social media rather than with models seen in traditional media, especially magazines or billboard advertisements (Fardouly et al., 2017), and may therefore identify themselves more with influencers than with other celebrities (Schouten et al., 2020). To account for this change, Roberts et al. (2022) adapted the tripartite influence model (Thompson et al., 1999) and identified social media as a separate factor influencing body dissatisfaction apart from traditional media. Additionally, the standardized loadings and item-total correlations of the two items in question were descriptively the smallest in each subsample, indicating that they are not associated as strongly with the underlying factor or the rest of the scale, respectively. Therefore, future studies should consider adapting the wording to the modern context or removing these two items. Until then, the complete version of the UPACS should only be used while taking into account the possible limitations regarding the factor structure and the possibly outdated item wording.
The internal consistencies were acceptable for both scales in all examined subsamples. The item-total correlations indicate that each item correlates sufficiently with each respective scale. Furthermore, the test-retest reliability was good in all examined subsamples. Taken together, the reliability of the UPACS and DACS can be considered adequate.
Additionally, women without EDs showed significantly higher scores on the UPACS and DACS than did men without EDs, which was expected due to women's greater tendency to engage in appearance comparisons (Davison and McCabe, 2005; Strahan et al., 2006). Moreover, in line with expectations, women with EDs scored significantly higher on both scales than did women without EDs, which confirms the higher tendency for appearance comparisons in women with EDs (Blechert et al., 2009; Arigo et al., 2014). The findings indicate that the UPACS and DACS are able to detect differences in the tendency for upward and downward appearance comparisons.
Regarding the construct validity, the correlation patterns differed between the examined subsamples. It is important to note that the following discussion is based on descriptive effect sizes, as the correlations were not tested against each other. A further study evaluating the construct validity and including further inferential statistical data analyses is needed. Nevertheless, the correlation patterns provide a first overview indicating the construct validity of the scales. Most importantly, in each subsample, the UPACS and DACS correlated with the PACS. This is in line with expectation, as the PACS assesses general appearance comparisons and is therefore closely connected to the constructs of upward and downward physical appearance comparisons. In accordance with this, the effect sizes of the correlations with the PACS were descriptively higher than the effect sizes of the correlations with other related measures in each subsample. As expected, the effect size for the correlation between the UPACS and PACS was descriptively higher than that between the DACS and PACS, as upward appearance comparisons are generally more prevalent than downward appearance comparisons (Ridolfi et al., 2011; McCarthy et al., 2023) and should therefore be more closely related to the general tendency of making appearance comparisons. Accordingly, these results indicate that the scales do, in fact, assess appearance comparisons.
For women without EDs, the correlation patterns followed the overall expectations regarding positive associations with eating pathology, body dissatisfaction, and drive for thinness and negative associations with self-esteem. Furthermore, all effect sizes were descriptively higher for the correlations of the UPACS compared to those of the DACS, supporting the more conclusive and closer relationship of upward appearance comparisons with body image disturbance (Leahey et al., 2007). Findings regarding the influence of downward appearance comparisons on body image, by contrast, are inconsistent, with some studies reporting a positive influence (van den Berg and Thompson, 2007; Bailey and Ricciardelli, 2010) and others showing no positive effects, or even detrimental effects (Lin and Soby, 2016; Fitzsimmons-Craft, 2017; Drutschinin et al., 2018). The present study tends to support the latter findings.
For men without EDs, the UPACS and DACS showed positive correlations with the measures of eating pathology, body dissatisfaction, and drive for thinness and negative correlations with self-esteem. The pattern of effect sizes was not as conclusive as for women without EDs, as descriptively, the effect sizes for the correlations with the EDE-Q only differed marginally between the UPACS and DACS, and the effect size for the correlation of the DACS with the EDI-2 subscale Drive for thinness was descriptively slightly higher than the correlation of the UPACS with this subscale. The similar effect sizes for the correlations with the UPACS and DACS might be due to the overall low scores and the limited variability on the EDE-Q and EDI-2 subscale Drive for thinness in men without EDs, because men generally show lower eating disorder pathology than women (Smink et al., 2012) and because thinness might be less important for evaluating one's own appearance in men compared to women (Anderson and Bulik, 2004).
For women with EDs, the lack of association of both the UPACS and DACS with the EDI-2 subscale Body dissatisfaction might be due to a lack of variance in the subscale. The distribution of scores on the EDI-2 is left-skewed in this subsample, since most women with EDs show high body dissatisfaction per se, given that body dissatisfaction is a prevalent risk factor for EDs (Grabe et al., 2008; Rohde et al., 2015). The RSES only showed a negative correlation with the UPACS and not with the DACS, which might indicate that comparisons with people who are perceived to look better have a greater impact on self-esteem in women with EDs. This finding might be attributable to negative cognitive biases in women with EDs, which lead to more body checking behavior (Williamson et al., 2004) that possibly encompasses more upward appearance comparisons with women perceived to be more attractive than oneself as compared to downward comparisons.
Several limitations of the present study should be mentioned. First, the initial translation process did not include a back-translation by a native English speaker; thus, complete correspondence between the original version and the German version of the UPACS and DACS cannot be fully ensured. However, a bilingual translator did subsequently compare the two versions and found only three minor expressions that could have been altered but would not have meaningfully changed the item contents. This suggests that the translation can be deemed acceptable. Second, the data are derived from nine different studies with varying study aims, designs, and target populations, conducted at different time points between 2016 and 2023. While all studies included the same German-language version of the UPACS and DACS, the studies differed in length and the questionnaires were administered at different positions within the studies. Therefore, it cannot be ruled out that certain preceding questionnaires or contents in the respective studies influenced participants' responses to the UPACS and DACS. Third, in three of the studies, EDs were not diagnosed with a structured clinical interview, but rather relied on participants' self-reports. It is possible that some participants may have provided a false or only presumed diagnosis. However, compared to participants with EDs diagnosed as part of a study, those with self-reported EDs had comparable or even higher scores on relevant questionnaires regarding ED pathology, suggesting that the majority of self-reported diagnoses were appropriate for the category of EDs. It might be the case that individuals with higher levels of ED pathology were more likely to participate in studies in which self-reported diagnoses were sufficient because the procedure was less stressful and more anonymous, and the studies were easier to access. Fourth, the psychometric properties and factor structure of the UPACS and DACS were examined for women with EDs in general rather than separately for women with anorexia nervosa, bulimia nervosa, and binge eating disorder. Despite some overlap in the characteristics of the different EDs, e.g., possible binge-eating episodes, the EDs differ, for instance, regarding weight and compensatory behavior (American Psychiatric Association, 2013). We chose to analyze all EDs within one group to enable sufficient sample sizes, especially for the SEM analyses and the assessment of test-retest reliability. However, future studies should examine the validity of the scales for each form of ED separately. Fifth, due to the lack of men with eating disorders and non-binary people in the final sample, no statement can be made regarding the suitability of the UPACS and DACS for assessing upward and downward appearance comparisons in these two groups. Future evaluations of the scales should therefore endeavor to include these groups.
Finally, we only examined questionnaires that were related to the concept of appearance comparisons in some way, thus focusing mostly on convergent validity. To establish divergent validity, future studies should include more measures of variables that are only weakly related or unrelated to appearance comparisons. Despite these limitations, the present study is the first to evaluate German-language versions of the UPACS and DACS, and might thus constitute a first step towards enabling the use of both scales in future research. Overall, the reliability and validity of the German-language UPACS and DACS are satisfactory, with limitations especially regarding the factor structure of the UPACS. Moreover, as the data were collected across nine studies, it was possible to examine the properties of the UPACS and DACS in women and men without EDs as well as women with EDs with sufficient sample sizes.
Future studies should seek to refine the scales, especially the UPACS, to account for modern-day developments in comparison processes or to generate a more general assessment of appearance comparisons that can be used outside the context of the rapidly changing media environment. The proposed re-specifications of the UPACS were only examined for exploratory purposes and need to be confirmed in independent samples. As the data were derived from several different studies, a future study should be designed and conducted solely for the purpose of validating the UPACS and the DACS.
Overall, the German-language versions of the UPACS and DACS seem to be useful scales to assess upward and downward appearance comparisons in women and men without EDs and women with EDs. Differentiating the direction of appearance comparisons allows for a more specific examination of the influence of appearance comparisons on body image and related constructs and of their role as a risk factor for EDs. In sum, this study is the first to indicate that the German-language UPACS and DACS might be suitable for use in research and clinical practice.
As the German-language versions of the UPACS and DACS were part of several different studies conducted at the Department of Clinical Psychology and Psychotherapy of Osnabrück University, data were available from eight studies conducted between 2016 and 2023 [Study 1: Voges et al., 2018; Study 2: Voges et al., 2019; Study 3: Voges et al., 2020; Study 4: Voges et al., 2022; Study 5: unpublished data; Study 6: Quittkat et al., 2023; Study 7: Ladwig et al., 2024 (manuscript submitted for publication); Study 8:

TABLE 1
Sample characteristics within the different studies.

TABLE 2
Means, standard deviations, and corrected item-total correlations for the UPACS in different subsamples.

TABLE 3
Means, standard deviations, and corrected item-total correlations for the DACS in different subsamples.

TABLE 4
Structural equation modeling for the UPACS and DACS in different subsamples.
Structural equation modeling with maximum likelihood estimation, including re-specification analysis for the UPACS, in which suggestions are made to allow for correlation of errors between items. The suggestion with the highest MI in each subsample is reported. UPACS, Upward Physical Appearance Comparison Scale; UPACS mod., re-specification analysis for the UPACS; DACS, Downward Physical Appearance Comparison Scale; CFI, comparative fit index; NNFI, non-normed fit index; RMSEA, root mean square error of approximation; CI, confidence interval; SRMR, standardized root mean square residual; ED, eating disorder. The specific items can be derived from Tables 2 and 3. *p < 0.05, **p < 0.01, ***p < 0.001.

TABLE 5
Standardized loadings for structural equation modeling for the UPACS in different subsamples.
Structural equation modeling with maximum likelihood estimation. UPACS, Upward Physical Appearance Comparison Scale; ED, eating disorder. The specific items can be derived from Table 2.

TABLE 6
Measurement invariance across all subsamples.
Test of measurement invariance via structural equation modeling with maximum likelihood estimation. Differences Δ indicate the change in the respective index between two consecutive steps. UPACS, Upward Physical Appearance Comparison Scale; DACS, Downward Physical Appearance Comparison Scale; CFI, comparative fit index; NNFI, non-normed fit index; RMSEA, root mean square error of approximation; CI, confidence interval; SRMR, standardized root mean square residual; ED, eating disorder. **p < 0.01, ***p < 0.001.

TABLE 7
Standardized loadings for structural equation modeling for the DACS in different subsamples.

TABLE 8
Means, standard deviations, and correlations of the UPACS and DACS and related measures in women without eating disorders.

TABLE 9
Means, standard deviations, and correlations of the UPACS and DACS and related measures in men without eating disorders.

TABLE 10
Means, standard deviations, and correlations of the UPACS and DACS and related measures in women with eating disorders.