Detecting Authoritarianism Efficiently: Psychometric Properties of the Screening Instrument Authoritarianism – Ultra Short (A-US) in a German Representative Sample

With right-wing-extremist and -populist parties and movements on the rise throughout the world, the concept of authoritarianism has proven to be particularly valuable to explain the psychological underpinnings of these tendencies. Even though many scales to measure the different dimensions of authoritarianism exist, no short screening instrument has been tested and validated on a large scale so far. The present study examines the psychometric properties of the screening instrument Authoritarianism – Ultrashort (A-US) in three representative German samples (n = 2,524, n = 2,478, and n = 2,495). Using exploratory and confirmatory factor analysis, the A-US demonstrated acceptable internal consistency. Model fit was good and correlations with related constructs indicated convergent validity in both samples. Construct validity was demonstrated using the original version of the scale. The instrument proved to be invariant across sex, employment status, and education, but not across different age groups. Finally, the analyses showed that differences in the A-US are associated with sociodemographic variables. Potential causes and effects of these findings are discussed. Based on these results, the A-US proved to be a valuable and highly efficient tool to screen for authoritarian tendencies.

it as a character trait consisting of nine distinct dimensions. Following psychodynamic theory, they claimed that authoritarian character traits were formed mainly in early childhood and were largely dependent on the parents' overly strict and harsh child-rearing behavior. Even though the studies in general and the F-Scale in particular were criticized for various reasons, the idea of social and political attitudes being "ideologically organized along a single dimension that was a direct expression of personality" (Duckitt, 2015, p. 256) remained and so did the aim of finding adequate measurements.
In social science today, there is no homogeneous concept of authoritarianism. The phenomenon is still defined as a personality trait (e.g., Oesterreich, 2005) that mirrors social authoritarian dynamics (Decker, 2019). While the empirical findings evaluate the correlation between authoritarianism and prejudice, the concept was also adopted in social cognition. The dual process model by Duckitt defines authoritarianism as a set of "social attitudinal or ideological expressions of basic social values or motivational goals that represent different, though related, strategies for attaining collective security at the expense of individual autonomy" (Duckitt and Bizumic, 2013). This definition focuses on the attitudinal and behavioral aspects as well as its effect on group processes rather than its etiology. Furthermore, it abandons a social theory approach to understanding the social origins of authoritarian dynamics. In his notion of right-wing authoritarianism (RWA), Altemeyer (1981, 1988, 1996) reduces the original nine dimensions of the F-Scale to three, i.e., authoritarian aggression, authoritarian submission, and authoritarian conventionalism. Individuals with a high score in authoritarianism are thus expected to act aggressively toward an out-group or individuals showing socially deviant behavior, to prefer following the rule of a leader, and to be drawn to traditional values that are not to be scrutinized. In the present study, we rely on this definition to investigate the properties of our three-item, ultrashort screening scale Authoritarianism – Ultra Short (A-US). Using the well-validated Short Scale for Authoritarianism (Kurzskala Autoritarismus; KSA-3; Beierlein et al., 2014) as a basis, the A-US aims to measure the full range of authoritarianism, covering all three dimensions as defined by Altemeyer.
Authoritarianism can predict right-wing political attitudes as well as voting behavior (Decker and Brähler, 2006; Decker et al., 2016, 2018; Dunwoody and Plane, 2019). The concept shows overlap with the idea of conservatism as used, e.g., in a meta-analysis by Jost et al. (2003). Furthermore, when compared to the Big Five and Social Dominance Orientation, it has been shown to be one of the best predictors of generalized prejudices, especially when the out-group is perceived as threatening toward the social order and/or showing dissident behavior (Ekehammar et al., 2004; Duckitt and Sibley, 2007, 2009). It is thus associated with racism and sexism, as well as prejudice toward homosexuals and mentally disabled people (Ekehammar et al., 2004). Moreover, there is a correlation between authoritarianism and the acceptance of corporal punishment and violent educational methods (Clemens et al., 2019). Authoritarianism fuels the cycle of violence by approving child abuse and physical violence by parents and transmitting violence to the next generation (Clemens et al., 2020). Authoritarian attitudes are known to increase when the perceived threat to social and individual security is high (Asbrock et al., 2010; Asbrock and Fritsche, 2013; Dunwoody and Plane, 2019), making authoritarianism an individual variable that is sensitive to changes in a given social situation. With anti-democratic parties and movements on the rise throughout the world and increasing violence against migrants and minorities, understanding and monitoring authoritarianism has become an issue of great political relevance. A reliable and efficient way of assessment lays the necessary foundation to work against these tendencies.
Altemeyer's original scale, the Right Wing Authoritarianism (RWA), was designed to measure authoritarianism as a one-dimensional construct with three aspects. There is an ongoing debate about the dimensionality of authoritarianism, though; Funke (2005) developed a three-dimensional, balanced scale that is among the most frequently used in German populations. Its items were criticized for their content, as they involve questions about related concepts like prejudice, religiousness, and conservatism.
The same holds true for the recently published Very Short Authoritarianism Scale (VSA) by Bizumic and Duckitt (2018). Their attempt to provide a short alternative to established measures builds on the well-validated, 18-item ACT scale. It is made up of six items to capture the three aforementioned dimensions of authoritarianism using balanced two-item sets. While the ACT was developed to rid the RWA of its content overlap with criterion variables, the items operationalizing traditionalism or conventionalism are still likely to be culturally sensitive and show large overlap with religiousness. While religiousness generally shows high correlations with authoritarian attitudes, it is plausible that in certain subgroups or countries, there may be a different connection or no connection at all to authoritarian attitudes (e.g., in former socialist countries). In fact, Lee et al. (2018) found that the correlations of religiousness and political orientation vary largely across countries. Mixing the two constructs, authoritarianism and religiousness, in a single questionnaire may thus obscure the relationship between them. Moreover, with its six items, the VSA may still be unfit for some large-scale purposes.
Another widely used method of assessing authoritarianism efficiently applies questions regarding child-rearing values. The most prominent scale in this realm is the four-item Authoritarian Child Rearing Values (ACRV) and its adaptation, the ACRV-2, that has been used in the American National Election Studies (ANES). Participants are asked to choose between two item pairs of desirable qualities when raising a child, one representing authoritarian, the other non-authoritarian values. Even though correlations with the RWA and ACT can be considered acceptable, findings regarding reliability have been inconsistent (according to Bizumic and Duckitt, 2018, reported alphas range between 0.54 and 0.66 while they report an α = 0.71 themselves). Most importantly, it is doubtful that the ACRV-2 is capable of capturing all facets of authoritarianism as conceptualized by Altemeyer. Bizumic and Duckitt (2018) argue that while it might be used to operationalize authoritarian submission, authoritarian aggression may not be captured at all. Moreover, MacWilliams (2016) points out that there is an unsettled issue regarding cross-racial validity of the scale, as African-Americans might interpret the questions differently. Another substantial flaw regards the forced-choice answering format. Opposition in meaning as well as equal social desirability of paired items in these formats is only assumed (Ray, 1990). Beierlein et al. (2014) tried to eliminate some of these shortcomings by developing an unbalanced, nine-item short scale to measure authoritarianism in its three dimensions, the KSA-3. Unlike other short scales (e.g., Schmidt et al., 1995; Aichholzer and Zeglovits, 2015) its psychometric properties proved to be more than satisfactory. An ultrashort screening scale that covers the full spectrum of authoritarianism and is tested and validated using a representative sample has yet to be developed. It is needed in order to provide a more efficient way to screen for authoritarian tendencies within a society.
In the present study, we evaluate the three-item, ultrashort version of the authoritarianism scale, based on the concept of Altemeyer (1988), and compare it to the original short scale by Beierlein et al. (2014). After an item analysis, an exploratory factor analysis (EFA) is used to analyze the dimensionality. It is then followed by a confirmatory factor analysis (CFA). As differences in authoritarianism and the support of right-wing extremist positions are often reported between certain groups (e.g., sex and age groups) and factors like employment status and educational background are used to explain mean differences, it is important to inspect measurement invariance as a prerequisite for comparing mean scores. To this end, measurement invariance is tested for these sociodemographic factors and their influence on mean and factor scores is evaluated. Finally, construct validity is assessed using the original version of the scale and convergent validity is demonstrated using measures of right-wing attitudes, self-assessment of left/right positioning, as well as generalized and group-specific prejudices.

Participants
The present study was part of a regular national representative survey of the general population of Germany. Three samples were analyzed using data collected in 2016 (Sample 1), 2017 (Sample 3), and 2018 (Sample 2), by an independent institute for opinion and social research (USUMA, Berlin). The criteria for inclusion were an age of ≥14 years and sufficient ability to understand the written German language. All adult participants provided their informed consent. In case of minors enrolled in the present study, informed consent was also obtained from the next of kin, caretakers, or guardians. After a sociodemographic interview, participants completed self-report questionnaires regarding political attitudes as well as physical and psychological symptoms in the presence (but without any interference) of the interviewer.
A random-route sampling procedure with 258 sample points identified 4,902 (Sample 1), 5,418 (Sample 2), and 5,160 (Sample 3) households to be contacted as part of the study. Of these, 4,830 households of Sample 1, 5,316 of Sample 2, and 5,093 of Sample 3 were eligible to participate (i.e., were not vacant or without individuals who met the inclusion criteria). The selection of the target persons within the households was carried out according to the Kish selection grid. In total, there were 2,524 participants in Sample 1, 2,516 in Sample 2, and 2,531 in Sample 3 (participation rate 52.7, 47.5, and 49.7% respectively). Due to the shortness of the scale, only participants who completed all three items of the A-US were included, leading to an exclusion of n = 79 (Sample 1) and n = 38 (Sample 2). As Sample 3 was used for construct validation, all participants with missing values in the nine-item version of the scale were excluded (n = 36). Thus, the final samples consisted of 2,465 (Sample 1), 2,478 (Sample 2) and 2,495 subjects (Sample 3). Sociodemographic characteristics of the study samples are presented in Table 1. While the three samples did not show notable differences from one another, a comparison of the sex and age groups with data provided by the Federal Statistical Office of Germany (2019) showed a slight overrepresentation of female participants as well as an underrepresentation of younger age groups. As these were minor deviations, the data can be assumed to be representative of the German population.

Measures
For the present study, we used a three-item version of the Short Scale for Authoritarianism (Kurzskala Autoritarismus; KSA-3; Beierlein et al., 2014) that is designed to measure authoritarianism on a five-point scale, with 1 indicating strong opposition and 5 indicating strong agreement. The original scale consists of nine items on three dimensions (i.e., aggression, submission, and conventionalism). The items with the highest factor loadings on each dimension were selected for the ultrashort, three-item version, the Authoritarianism – Ultra Short (A-US). This type of item selection ensured that the three original dimensions were best represented in the short scale. An overall score was computed by adding the individual scores of each of the three selected items of the ultrashort scale. Original item wording as well as an English translation are provided in Table A1 in the Appendix. Additionally, for construct validation, a shortened six-item score of the original scale was calculated by adding up the scores of the remaining items not selected for the A-US.
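The scoring rule described above can be illustrated with a minimal sketch. The item keys and the choice of "item 1" as the top-loading item of each dimension are hypothetical placeholders, not the actual KSA-3 item labels; only the sum-score logic and the 1 to 5 response range are taken from the text.

```python
from typing import Mapping, Optional, Sequence

DIMENSIONS = ("aggression", "submission", "conventionalism")

# Hypothetical item keys: three KSA-3 items per dimension, numbered 1-3.
ALL_ITEMS = tuple(f"{dim}_{i}" for dim in DIMENSIONS for i in (1, 2, 3))
# Assumption for illustration only: item 1 of each dimension is the top-loading one.
AUS_ITEMS = tuple(f"{dim}_1" for dim in DIMENSIONS)
REMAINDER_ITEMS = tuple(key for key in ALL_ITEMS if key not in AUS_ITEMS)


def sum_score(responses: Mapping[str, Optional[int]],
              items: Sequence[str]) -> Optional[int]:
    """Add up the ratings for `items`; None if any rating is missing or outside 1..5."""
    values = [responses.get(key) for key in items]
    if any(v is None or not 1 <= v <= 5 for v in values):
        return None
    return sum(values)


# A made-up respondent who answers 3 everywhere except one aggression item.
respondent = {key: 3 for key in ALL_ITEMS}
respondent["aggression_1"] = 5

aus_total = sum_score(respondent, AUS_ITEMS)              # three-item score, range 3..15
remainder_total = sum_score(respondent, REMAINDER_ITEMS)  # six-item score, range 6..30
```

Returning None for incomplete answers mirrors the listwise exclusion of participants with missing A-US items described in the Participants section.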
The Leipzig Scale on Right-Wing Extremist Attitudes (Fragebogen zur Rechtsextremen Einstellung; FR-LF; Decker et al., 2013) assesses right-wing attitudes using six dimensions. Each dimension consists of three items that are to be rated on a five-point scale ranging from 1 = I fully disagree to 5 = I fully agree. Decker et al. (2013) found the questionnaire showed a very good internal consistency of α = 0.94. For this study, the total score was computed by adding up all item scores. Political orientation was measured using a single-item left-right self-assessment scale ("Thinking about your own political views, how would you rate them on the following scale?") ranging from 1 = left to 10 = right.
Generalized and group-specific prejudices were analyzed using parts of the questionnaire developed by the research group around Heitmeyer (2012). It assesses several forms of group-related hostility (Gruppenbezogene Menschenfeindlichkeit; GMF) on a four-point scale ranging from 1 = I fully agree to 4 = I fully disagree. To make the results more accessible, all necessary items were recoded so that high scores indicated high values of GMF. In the present study, we took items measuring prejudices against Muslims (two items; ω₁ = 0.84; ω₂ = 0.83), and Sinti and Roma (three items; ω₁ = 0.90; ω₂ = 0.91) from both samples.
Items regarding homophobic attitudes (three items, one inverted; ω₁ = 0.83) as well as sexism (two items, ω₁ = 0.86) were included using additional data from Sample 1. An overall score to account for generalized prejudices was also calculated by adding all used items (ten in Sample 1 and five in Sample 2; ω₁ = 0.86; ω₂ = 0.89).

Statistical Analyses
On Sample 1, an EFA was conducted to determine the number of factors of the A-US. We then used Sample 2 to confirm the findings using confirmatory factor analysis (CFA). The two subsamples did not differ significantly with regard to A-US mean scores, sex, and age (see Tables 1, 2).
For the EFA, principal axis factoring was applied using SPSS. A total of three different indicators were used to identify the factor structure of the A-US: the Kaiser-Guttman criterion, the scree plot, and Horn's parallel analysis (PA; Horn, 1965). PA focuses on extracting Eigenvalues from random data sets that have the same number of cases and variables as the original raw data. This procedure is based on the idea that factors of real data should have larger Eigenvalues than those extracted from random data. Consequently, only those factors were retained in the real data that showed Eigenvalues greater than those of the random data (O'Connor, 2000). The parallel analysis engine provided by Patil et al. (2017) was used to create random data. The method was based on PCA factor extraction and used the 95th percentile of the Eigenvalues as a threshold instead of the mean to avoid overextraction of factors.
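The logic of parallel analysis can be sketched in a few lines of standard-library Python. This is an illustrative reimplementation under stated assumptions, not the engine by Patil et al. (2017): random data are drawn from a standard normal distribution, Eigenvalues are obtained from the correlation matrix with a simple Jacobi rotation routine, and the function names, the number of simulations, and the seed are all hypothetical choices.

```python
import math
import random


def correlation_matrix(data):
    """Pearson correlation matrix for a list of rows (cases x variables)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    dev = [[row[j] - means[j] for j in range(p)] for row in data]
    cov = [[sum(d[a] * d[b] for d in dev) / (n - 1) for b in range(p)]
           for a in range(p)]
    return [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(p)]
            for a in range(p)]


def eigenvalues_symmetric(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a small symmetric matrix (cyclic Jacobi), in descending order."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if abs(a[i][j]) < 1e-15:
                    continue
                # Rotation angle that sets the (i, j) element to zero.
                theta = 0.5 * math.atan2(2 * a[i][j], a[j][j] - a[i][i])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki - s * akj, s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik - s * ajk, s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Return observed Eigenvalues, random-data thresholds, and factors to retain."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    observed = eigenvalues_symmetric(correlation_matrix(data))
    simulated = [[] for _ in range(p)]
    for _ in range(n_sims):
        noise = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        for rank, value in enumerate(eigenvalues_symmetric(correlation_matrix(noise))):
            simulated[rank].append(value)
    thresholds = []
    for values in simulated:
        values.sort()
        index = min(len(values) - 1, math.ceil(percentile / 100 * len(values)) - 1)
        thresholds.append(values[index])
    n_factors = 0
    for obs, thr in zip(observed, thresholds):
        if obs <= thr:  # stop at the first factor that does not beat random data
            break
        n_factors += 1
    return observed, thresholds, n_factors
```

Calling `parallel_analysis(rows)` on a cases-by-items list of ratings returns the number of leading Eigenvalues that exceed their 95th-percentile random counterparts, which is the retention rule described above.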
Additionally, a CFA was conducted on Sample 2 to confirm the factorial structure of the A-US. To this end, we used R and the packages lavaan and semTools (Rosseel, 2012; semTools Contributors, 2016), and each model was estimated with robust maximum likelihood estimation (Satorra and Bentler, 2001). Due to the shortness of the A-US with only three items, model fit indices could not be calculated, as a model with three indicators of a latent variable is just-identified. As we did not want to impose additional constraints on the model, only factor loadings and a measure of internal consistency, McDonald's (1999) ω (Trizano-Hermosilla and Alvarado, 2016), are reported.
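Why a three-indicator model leaves nothing to test can be seen by counting degrees of freedom. The sketch below assumes a single-factor model for the covariance structure only, with the factor variance fixed to 1 for identification; the function name is hypothetical.

```python
def one_factor_df(n_indicators: int) -> int:
    """Degrees of freedom of a one-factor model (covariance structure only).

    Assumes the factor variance is fixed to 1, so the free parameters are
    one loading and one residual variance per indicator.
    """
    moments = n_indicators * (n_indicators + 1) // 2  # unique variances and covariances
    free_parameters = 2 * n_indicators                # loadings + residual variances
    return moments - free_parameters


df_three_items = one_factor_df(3)  # 6 moments - 6 parameters = 0, i.e., just-identified
df_four_items = one_factor_df(4)   # 10 moments - 8 parameters = 2, fit becomes testable
```

With zero degrees of freedom the model reproduces the observed covariances exactly, so fit indices carry no information unless further constraints are added.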
Additional analyses were conducted using Sample 1 and Sample 2 to test the invariance of the model across sex and age as well as education and employment status using multi-group CFA in R (Meredith, 1993). After testing the factorial structure in each subgroup, measurement invariance was tested stepwise, starting with the configural model (without constraints), followed by a metric invariant model (with factor loadings constrained to be equal across groups), a scalar invariant model (with factor loadings and item intercepts simultaneously constrained to be equal across groups), and a strict invariant model (with factor loadings, item intercepts, and residuals constrained to be equal across groups). Because these models are nested and increasingly restrictive, they could then be compared. Due to the large sample size, the χ² significance test was capable of detecting even the smallest model differences (e.g., Putnick and Bornstein, 2016). A non-significant χ² test result could thus be seen as a very strong indicator that invariance holds. For the cases of significant χ² results we reported differences in CFI and Gamma Hat (GH; Steiger, 1989) as alternative measures. Differences equal to or smaller than 0.01 indicated the invariance of the model (Cheung and Rensvold, 2002; Milfont and Fischer, 2010). Whenever full scalar invariance could not be assumed, partial invariance was tested by consecutively constraining only two of the three item intercepts to be equal across groups while one was estimated freely. Even though stepwise selection processes like this have been heavily criticized (see Marsh et al., 2018), Gregorich (2006) argues that partial invariance allows for valid comparisons in mean scores as long as two loadings and intercepts are constrained to be equal across groups. Nevertheless, the assumption of partial invariance should always be considered inferior to full scalar invariance.
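The decision rule for each invariance step can be written down as a small sketch. The class and function names, the significance level of 0.05, and the fit values in the example are hypothetical; only the 0.01 cut-off for the differences in CFI and GH comes from the text, and the sketch reads the rule as requiring both differences to stay at or below it.

```python
from dataclasses import dataclass


@dataclass
class ModelFit:
    name: str
    cfi: float
    gamma_hat: float


def step_is_invariant(less_constrained: ModelFit, more_constrained: ModelFit,
                      p_value: float, alpha: float = 0.05,
                      cutoff: float = 0.01) -> bool:
    """Decide whether the more constrained model can be accepted.

    A non-significant chi-square difference test is taken as sufficient.
    Otherwise the drops in CFI and Gamma Hat must both be <= cutoff.
    """
    if p_value >= alpha:
        return True
    # Round to three decimals so a drop that falls exactly on the cut-off passes.
    delta_cfi = round(less_constrained.cfi - more_constrained.cfi, 3)
    delta_gh = round(less_constrained.gamma_hat - more_constrained.gamma_hat, 3)
    return delta_cfi <= cutoff and delta_gh <= cutoff


# Made-up fit values, for illustration only.
configural = ModelFit("configural", cfi=0.995, gamma_hat=0.998)
metric = ModelFit("metric", cfi=0.993, gamma_hat=0.997)
scalar = ModelFit("scalar", cfi=0.975, gamma_hat=0.990)

metric_ok = step_is_invariant(configural, metric, p_value=0.03)  # small drops: accepted
scalar_ok = step_is_invariant(metric, scalar, p_value=0.001)     # CFI drop 0.018: rejected
```

When a step is rejected, as in the scalar example, the procedure described above moves on to partial invariance by freeing one item intercept at a time.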
The combined sample was then used to identify possible influences of sociodemographic factors on A-US scores within the SEM framework. For this, latent means were fixed to be equal across groups and model fit was analyzed once again. A significant decline in model fit compared to the strict invariance model was seen as an indicator that differences were present. Latent means were then compared in the strict invariance model between the groups. Subsequently, R² was calculated to show the extent of the differences found by comparing between-group variance in intercepts and latent means to the pooled total variance. Finally, Sample 3 was used for construct validation by calculating Pearson correlations with a reduced version of the original scale containing only those six items not included in the A-US. Correlations were analyzed in all relevant subgroups. Furthermore, Pearson correlations were used to explore convergent validity of the A-US with related constructs, i.e., right-wing extremist attitudes, left-right self-assessment, and different measures of prejudice.
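One plausible reading of the R² computation is the share of total variance that lies between groups. The sketch below is an assumption about the procedure rather than the authors' exact computation: it weights group means by group size and treats the total variance as the sum of between-group variance and a pooled within-group variance. The function name and the input values are hypothetical.

```python
def between_group_r2(group_sizes, group_means, pooled_within_variance):
    """Share of total variance explained by group membership.

    Assumption: total variance = size-weighted between-group variance of the
    group means + pooled within-group variance.
    """
    n_total = sum(group_sizes)
    grand_mean = sum(n * m for n, m in zip(group_sizes, group_means)) / n_total
    between = sum(n * (m - grand_mean) ** 2
                  for n, m in zip(group_sizes, group_means)) / n_total
    return between / (between + pooled_within_variance)


# Made-up example: two equally sized groups whose latent means differ by one unit.
r2_example = between_group_r2([100, 100], [-0.5, 0.5], pooled_within_variance=1.0)
```

In this made-up example the between-group variance is 0.25, so group membership accounts for 0.25 / 1.25 = 20% of the total variance.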

Descriptive Statistics
Descriptive statistics for each item were reported separately for Sample 1 and Sample 2 in Table 2. While skewness and kurtosis lay within the commonly agreed upon cut-offs of <2 (Pituch and Stevens, 2016), histograms showed clear deviations from normality on an item level, especially in item 1 (data not presented). Due to these findings, we assumed a non-normal distribution of data on the item level but not for the scale as a whole. We thus used robust estimation and fit indices for the EFA and CFA whenever available (Brosseau-Liard et al., 2012;Brosseau-Liard and Savalei, 2014), as these methods are based on an item level, but parametric measures when scale scores were involved (e.g., for mean comparisons and correlations). Difficulty indices for the A-US items ranged between 0.42 (Item 2) and 0.72 (Item 1), indicating a medium to low item difficulty within the accepted range of 0.20 to 0.80 (Moosbrugger and Kelava, 2012). Furthermore, the corrected item-total correlations all scored above the cut-off of 0.40 (Moosbrugger and Kelava, 2012). No notable differences between the two samples were observed.
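The two item statistics used here can be sketched as follows. The difficulty index is assumed to be the item mean rescaled to the 0 to 1 range of the five-point format, which is consistent with the reported values but not spelled out in the text; the corrected item-total correlation correlates each item with the sum of the remaining items. Function names and the toy data are hypothetical.

```python
import math


def difficulty_index(ratings, low=1, high=5):
    """Item mean rescaled to 0..1 (assumed definition for a rating-scale item)."""
    mean = sum(ratings) / len(ratings)
    return (mean - low) / (high - low)


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(data, item):
    """Correlation of one item with the sum of all other items."""
    item_scores = [row[item] for row in data]
    rest_scores = [sum(row) - row[item] for row in data]
    return pearson(item_scores, rest_scores)


# Toy ratings (rows = respondents, columns = the three items), not study data.
toy = [[1, 1, 2], [2, 3, 2], [3, 3, 4], [4, 5, 4], [5, 4, 5]]
p_item1 = difficulty_index([row[0] for row in toy])
r_item1 = corrected_item_total(toy, 0)
```

Excluding the item from the total avoids inflating the correlation, which matters particularly in a three-item scale where each item makes up a third of the sum score.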

EFA, CFA, and Reliability
Results of the EFA indicated a unidimensional solution using the Kaiser-Guttman criterion with an Eigenvalue of 1.29, accounting for 42.92% of the variance. Factor loadings for the three items were 0.50, 0.65, and 0.78 respectively. The visual evaluation method of the scree plot indicated one factor as well (data not shown). Additionally, PA (Horn, 1965) indicated a unidimensional scale structure with only the Eigenvalue of the first factor exceeding the value of the random data (see Table 3).
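The reported Eigenvalue and variance share appear to correspond to the sum of squared loadings of the extracted factor. As a rough plausibility check under that assumption, the rounded loadings can be plugged in directly; the function name is hypothetical and small deviations from the reported figures are expected because the loadings are rounded to two decimals.

```python
def explained_variance(loadings):
    """Sum of squared loadings and its share of the total standardized variance."""
    ssl = sum(loading ** 2 for loading in loadings)
    return ssl, ssl / len(loadings)


efa_loadings = [0.50, 0.65, 0.78]  # rounded EFA loadings as reported for Sample 1
ssl, share = explained_variance(efa_loadings)
# ssl is about 1.28 and share about 0.427, close to the reported 1.29 and 42.92%.
```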
According to the results of the EFA, the unidimensional model was tested by CFA in Sample 2. As the model was just-identified, fit indices could not be calculated. Factor loadings were 0.53, 0.59, and 0.81 for the three items. Reliability analysis led to an ω₁ = 0.68 in Sample 1, ω₂ = 0.69 in Sample 2 and ω₃ = 0.71 in Sample 3.
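For a one-factor model with standardized loadings and uncorrelated residuals, McDonald's ω can be approximated directly from the loadings. The sketch below uses the rounded loadings reported above, so it only approximately reproduces the published coefficients; the function name is hypothetical and the paper's own values were obtained in R.

```python
def omega_from_loadings(loadings):
    """McDonald's omega for one factor, standardized loadings, uncorrelated residuals."""
    common = sum(loadings) ** 2                               # variance due to the factor
    unique = sum(1 - loading ** 2 for loading in loadings)    # residual variances
    return common / (common + unique)


omega_efa = omega_from_loadings([0.50, 0.65, 0.78])  # EFA loadings, Sample 1
omega_cfa = omega_from_loadings([0.53, 0.59, 0.81])  # CFA loadings, Sample 2
# Both come out at roughly 0.68, in line with the reported 0.68 and 0.69.
```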

Measurement Invariance
Measurement invariance of the A-US was tested regarding the following groups using both Sample 1 and Sample 2: sex, age, employment status, and education. Results are shown in Table 4.
Given the non-significant χ² test as well as differences in CFI and GH below 0.01, the A-US could be assumed strictly invariant across sex. The results for different age groups were less clear. While both the configural and metric model proved to be sufficient using the relevant indices, complete scalar invariance could not be assumed. Therefore, all variants of partial invariance were explored by consecutively freeing the intercepts of each item across the groups while fixing the other two item intercepts to be equal across groups. While there was no significant effect of allowing for variation of intercepts across groups for item 1, freely estimating the intercepts across groups of either item 2 or item 3 led to partial invariance. The intercepts showed a linear age trend with older age groups showing higher values in the intercepts (see Table 5). As the free estimation of intercepts across groups for item 2 led to the best model fit, we then tested for strict invariance while keeping this intercept freely estimated. All relevant indices showed a significant decline in model fit. Therefore, strict invariance could not be assumed.
Regarding employment status and education, strict invariance could be assumed based on the analyses. Even though employment status showed significant χ² test results for the scalar model, the differences in both CFI and GH remained below the cut-off of 0.01. As χ² tests are capable of detecting even the slightest effects in larger samples, CFI and GH are the more reliable indices in this case (Putnick and Bornstein, 2016). Intercepts as well as latent means for the different employment statuses can be found in Table 5. Compared to the group working full time, the group of people in training as well as those working less than 15 h per week showed lower A-US intercepts and latent mean scores, while homemakers as well as retired persons showed higher intercepts and latent mean scores. The analysis of measurement invariance for education status led to similar results. The strict model showed a significant χ² test result and the difference in CFI fell on the cut-off of 0.01. As the difference in GH was still far below the cut-off score, strict invariance could be assumed. A linear trend regarding the years of education was found in the intercepts as well as latent mean scores (see Table 5).

Influence of Sociodemographic Parameters
As Table 4 shows, a significant decline in model fit could be observed in all sociodemographic parameters except sex when fixing latent mean scores across groups. Therefore, age group, education and employment status all have an influence on A-US mean scores. No mean differences were found between men and women.
Even though full scalar invariance was not given in the case of age groups, meaningful comparisons of latent means are still possible (Gregorich, 2006). Descriptively, the results revealed a linear trend with older participants showing higher authoritarianism (see Table 5). Differences in latent means between the age groups accounted for 4.53% of the total variance. Regarding employment status, group membership accounted for 3.95% of the total variance. The group of retired people showed the highest A-US latent mean scores while those in education or training showed the lowest. This may at least partially be explained by an overlap with the reported age effect. Interestingly, the group of unemployed and working < 15 h per week showed lower A-US scores than the group of those working full time. Finally, education had the largest influence on authoritarianism, explaining 9.36% of the total variance. Once again, a linear trend could be observed. Participants with a higher level of education showed lower scores in authoritarianism. Overall, a considerable influence of most of the sociodemographic parameters was revealed. This is in line with previous studies on authoritarianism and right-wing extremist attitudes and needs to be taken into consideration for future research on the topic.

Construct Validity
To analyze construct validity, Pearson correlation with a reduced, six-item version of the scale was assessed, to account for the item overlap. The correlation was very high (r = 0.879). Testing different age groups, educational backgrounds and employment statuses did not reduce the correlations substantially. As shown in Table 6, correlations between the three items of the A-US and the remaining six items of the original scale ranged between 0.862 and 0.923 in all groups. This can be regarded as very strong evidence that the proposed ultra-short scale indeed measures the same construct as the original scale.

Convergent Validity
To analyze the validity of the scale, correlations with adjacent constructs were calculated (see Table 7). We expected positive correlations with right-wing attitudes, left-right self-assessment as well as generalized and group specific prejudices. All of these were found to be highly significant in both samples, showing small to medium effect sizes ranging from r = 0.16 in the left-right-self-assessment in Sample 1 to r = 0.50 in right-wing extremism in Sample 2 (Cohen, 1988).

DISCUSSION AND LIMITATIONS
The aim of the present study was to examine the psychometric properties of the ultra-short screening instrument for authoritarianism A-US. A short scale to monitor authoritarian tendencies in large samples is crucial to understand and predict changes in the social climate. The A-US showed good to excellent psychometric properties. Non-normal distribution of data was assumed on an item level but not for the scale as a whole. This was taken into consideration by using robust estimation and fit indices for the tests of measurement invariance. Factorial validity was assessed by EFA and CFA. Both methods confirmed the one-dimensionality of the A-US with factor loadings ranging between 0.50 (Item 1 in the EFA) and 0.81 (Item 3 in the CFA). Moreover, internal consistency (McDonald's ω) ranged between 0.68 and 0.71 in all samples. Taking into consideration the shortness and purpose of the three-item instrument, this can be evaluated as satisfactory. It is also comparable to the aforementioned VSA. As the A-US is intended for group statistics rather than individual assessment, efficiency may be ranked higher than internal consistency (Ziegler et al., 2014).
Tests of measurement invariance revealed strict invariance of the A-US across sex, employment status and education. This is an important statistical prerequisite that allows for meaningful observed mean comparisons between these groups (Gregorich, 2006). As for the age groups, partial scalar invariance was given when freely estimating the intercepts between groups for either item 2 or item 3. Even though results should be considered with some care as the stepwise procedure of testing for partial invariance has been criticized as possibly leading to idiosyncratic results (Marsh et al., 2018), this allows for meaningful comparisons of intercepts and latent mean scores across groups. Strict invariance could not be assumed across age groups, so observed mean scores should not be compared at all. Some important mean differences were found in relation to sociodemographic parameters. A linear age trend was observed in all item intercepts as well as latent mean scores. In the case of item 3, age group membership accounted for 4.52% of the total variance of authoritarianism. As this item represents the dimension authoritarian conventionalism of the original scale, the results imply that older age groups tend to hold more conservative and traditional attitudes, while also showing more authoritarianism in general. With the data at hand, it is not possible to clarify whether this is a general age trend, due to e.g., changes in personality and cognition associated with older age (Cornelis et al., 2009) or whether the differences are due to birth cohort. As this scale was tested in a German population, the effect may be viewed as a relic of the Nazi-era and the resulting scores might be caused by a more authoritarian environment the older age groups were raised in. International samples should be used to clarify this effect and further analyses should try to separate age effects from birth cohort effects. 
Employment status accounted for 3.95% of the total variance using a comparison of latent mean scores. As the group of retired people showed the highest scores in authoritarianism and the group of people in training and education showed the lowest, an overlap with the age effect is likely. Furthermore, the largest effects were found regarding the educational background, where higher authoritarianism scores were associated with lower educational levels. Group membership explained 9.36% of the variance in latent mean scores. This finding once again suggests that higher education may serve as a buffer against authoritarian and right-wing extremist attitudes (e.g., Rippl, 2002; Zick et al., 2011; Decker et al., 2018).
Construct validity was assessed using Pearson correlations with the long version of the scale. The correlations with the reduced version excluding the three items of the short scale were very high (r = 0.879). Convergent validity was demonstrated by small to medium correlations to other instruments measuring related constructs. The relatively low correlation with left-right self-assessment may be explained mainly by the fact that it is a single-item instrument. In the validation study by Beierlein et al. (2014), two out of three subdimensions showed similar correlations of r = 0.22 (authoritarian submission) and r = 0.21 (conventionalism) with political left-right self-assessment compared to the A-US. Correlations of these two dimensions with prejudices against homosexuals and migrants ranged between r = 0.27 and r = 0.34, and were thus comparable to those of the A-US as well. The dimension authoritarian aggression of the KSA-3 generally reached higher correlations with prejudices and political self-assessment than the A-US. Taking into consideration the relatively low factor loading of the first item of the A-US taken from that dimension, this may be seen as an indicator that authoritarian submission and conventionalism are better captured by the A-US than authoritarian aggression. Some other scales assessing authoritarianism have been criticized for showing an overlap in item content and wording with the parameters they want to explain, leading to an increase in estimated correlations. As this short scale uses rather broadly worded items, little to no overlap is to be expected. The overall correlational pattern of this short scale can thus be viewed as a strong indicator of validity, even though the effect sizes only indicate small to medium effects.
Limitations include the one-dimensionality of the scale, which does not fully reflect the complexity of the construct (Funke, 2005, see above). This may be intensified by the fact that the A-US uses only three items to capture a construct that was originally made up of three dimensions. In most studies, however, authoritarianism is treated as a unitary parameter (Beierlein et al., 2014), suggesting that this simplification may not gravely impair the quality of the scale. Another potential problem concerns the reliability of short scales, as they, once again, may not capture the full complexity of the constructs they are trying to measure. The A-US showed high internal consistency and factor loadings as well as high correlations with the original scale and related constructs. Considering the anticipated use of the scale, it thus demonstrates adequate reliability and validity. Comparisons with older data and other authoritarianism scales could be used to further improve construct validity. Unfortunately, no data covering an adequate time frame are available yet for an analysis of the stability of the A-US over time. Still, the results of this study indicate that the A-US is able to capture authoritarianism in most relevant research and screening contexts. Finally, the one-sided answering format may lead to acquiescence (MacWilliams, 2016). According to Podsakoff et al. (2003), acquiescence reflects "the propensity for respondents to agree (or disagree) with questionnaire items independent of their content" (p. 887). This may especially be a problem when it comes to authoritarianism, a concept already closely linked to conformity, i.e., the tendency to agree. Balanced short scales, like the VSA, may account for this by controlling for the direction of item wording. In this respect, they may be better at differentiating between high-scoring "authoritarians" and mere "conformists" who show a tendency to agree.
With its six items, however, the VSA may still be too long for some research purposes. In an ultra-short scale like the A-US, reversed wording in one of only three items may lead to additional problems, the most important being a possible distortion of item meaning (cf. Rokeach, 1967). In scale construction, we relied on the well-validated KSA-3 by Beierlein et al. (2014), which is capable of capturing the three dimensions of authoritarianism as conceptualized by Altemeyer. In the three-item version we proposed, our focus was to adequately reflect the whole spectrum of authoritarianism rather than to create a scale balanced in item wording. To further evaluate and address the problem of response biases, future research should aim to translate and validate both the ACRV-2 and VSA short scales for use in German populations in order to allow for a more meaningful comparison with the A-US. These possible impairments should be kept in mind when using this short scale. As this instrument is intended to serve only as a very short screener for authoritarian tendencies in large groups, rather than as a measure for individual assessment, most of these shortcomings seem negligible (Ziegler et al., 2014).
Due to these deficits, more elaborate scales should be preferred for complex analyses of the genesis and effects of authoritarianism. The influence of sociodemographic factors on authoritarianism should always be taken into consideration, and the issue of possible response biases should be addressed in future research. Taken together, the results indicate that the A-US is a valid and reliable screening tool for large-scale assessment and monitoring of authoritarian tendencies.