Investigating the Multidimensionality of the Work-Related Flow Inventory (WOLF): A Bifactor Exploratory Structural Equation Modeling Framework

This study investigated the factor structure of the Work-Related Flow Inventory (WOLF) through the application of the bifactor exploratory structural equation modeling (B-ESEM) framework. Using a sample of 577 Chinese teachers, we contrasted a series of competing models, including CFA, ESEM, bifactor CFA, and B-ESEM models. The results suggested that the B-ESEM structure with three S-factors (absorption, work enjoyment, and intrinsic work motivation) and one G-factor (global flow) was the best representation of the WOLF ratings. The results also supported the composite reliability and the strict invariance of this measurement structure between male and female groups. Relative to males, female teachers showed a higher level of global work-related flow experience. Finally, the nomological validity of WOLF ratings was supported by the statistical relationships of the WOLF factors with job satisfaction and autonomy.


INTRODUCTION
There has been an increasing interest in the construct of flow over the last forty years. Flow is a state of consciousness where people become totally immersed in an activity, and enjoy it intensely (Csikszentmihalyi, 1997). When a person is engaged in some activity of his or her preference, whether it be leisure (e.g., playing chess), sport (e.g., swimming), work, or study, it is more likely that the individual may experience flow. According to flow theory (Csikszentmihalyi, 1975), flow not only helps individuals to have pleasure and satisfaction in the activity, but also improves their self-efficacy and self-esteem and promotes their self-growth and subjective well-being.
Evidence shows that people may have more experience of flow during work as opposed to during their spare time (e.g., Delle Fave and Massimini, 2003). Bakker (2008) discussed that work-related flow experience could be conceptualized as three aspects: absorption, work enjoyment, and intrinsic work motivation. Absorption reflects a person's concentration on, and immersion in, the work. Work enjoyment reflects a person's happy feeling and positive view with regard to the quality of his work. Intrinsic work motivation reflects the tendency that a person does the work for pleasure and satisfaction in the work. Flow at work is most likely to occur when a balance is achieved between the demand of a job and a person's capacity and adequate organizational resources available for doing the job successfully (Bakker, 2008).

Psychometric Characteristics and Latent Structure of the WOLF
Despite the existence of many methods (e.g., experience sampling method, questionnaires, neuronal indicators, and psychophysiological measures) and psychometric instruments (e.g., the Swedish Flow Proneness Questionnaire, Ullén et al., 2012; the Flow State Scale, Jackson and Marsh, 1996) developed to assess flow, the work-related flow inventory (WOLF) developed by Bakker (2008) is the most widely administered measure in the work context. By design, the WOLF consists of three dimensions: absorption (4 items), work enjoyment (4 items), and intrinsic work motivation (5 items). As discussed in Csikszentmihalyi (1997), these three dimensions are the important components typically included in research for flow.
Research findings have generally supported the psychometric quality of the WOLF. For example, Bakker (2008) found that the WOLF showed good internal consistency reliability as well as test-retest reliability estimates, in addition to good evidence for its convergent, construct, and predictive validity. Other studies in different cultural settings also provided support for the WOLF's psychometric quality, including its reliability, predictive validity, and convergent validity. The cultural settings from which such evidence came included South Africa (Geyser et al., 2015), Norway (Christensen, 2009), Spain (Salanova et al., 2006), Pakistan (Zubair and Kamal, 2015), Italy (Colombo et al., 2013; Zito et al., 2015), and Turkey (Zekioglu et al., 2017). Similarly, the WOLF has also been shown to have good psychometric characteristics (e.g., reliability and validity) when used in the Chinese cultural context (Zeng, 2013; Chen et al., 2016).
With regard to the latent structure of the WOLF, there are some unresolved issues. Using multiple samples of employees from different occupational groups in the Netherlands, Bakker (2008) provided empirical support for the three-factor CFA model consistent with the original design of three components for the measure, over the one-factor model (i.e., only the general factor of flow), and a couple of competing two-factor models. In general, the three-factor structure of the WOLF has also been supported in other cultural settings, such as South Africa (Geyser et al., 2015), Brazil (Freitas et al., 2019), Italy (Colombo et al., 2013; Zito et al., 2015), Turkey (Zekioglu et al., 2017), and China (Zeng, 2013; Chen et al., 2016). However, most research indicated that this three-factor model typically showed only borderline model fit at best (e.g., Bakker, 2008; Chen et al., 2016), especially when judged against the widely used criteria for model fit assessment (Hu and Bentler, 1999).
Furthermore, it has been argued that two of the three constructs proposed for the WOLF, work enjoyment and intrinsic work motivation, overlap conceptually (Llorens et al., 2013; Happell et al., 2015), as enjoyment could already be covered by intrinsic motivation. Operationally, as discussed in Ryan and Deci (2000), self-report of enjoyment is often used for measuring intrinsic motivation. In a sample of Australian workers, Happell et al. (2015) found that the items representing the work enjoyment and intrinsic work motivation dimensions of the WOLF loaded on one dimension, and they argued that the two-factor solution (i.e., absorption and work enjoyment/intrinsic work motivation) should be retained instead of the conventional three-factor structure. Likewise, Bădoiu and Oprea (2018) showed that the two-factor model had a better fit for the sample data of a Romanian population. In fact, even in the original study (Bakker, 2008), work enjoyment was found to be considerably correlated with intrinsic motivation (e.g., ranging from 0.67 to 0.82), suggesting that these two factors conceptually overlap and have poor discriminant validity. Other studies (e.g., Geyser et al., 2015) also discussed these two issues (i.e., borderline model fit and two overlapping constructs).
On the practice side, how the WOLF score(s) is used is also inconsistent. Some researchers used the composited flow score as the measure of global flow (e.g., Fagerlind et al., 2013;Zubair and Kamal, 2015), while some others used three subscale scores to represent the three domain components of flow (e.g., Demerouti et al., 2012). Still others treated the global flow as a latent variable with the three subscale scores as its indicators (Salanova et al., 2006). These studies, however, did not provide any rationale or practical guidelines about why the WOLF score(s) should be used as shown in the respective studies. Ideally, the way in which the WOLF score(s) is used should be grounded in, and supported by, the latent structure of the measure, as how the score(s) of the WOLF is composited should be guided by the latent structure of the measure. When the latent structure of the WOLF is somewhat uncertain, we cannot be sure what scoring mechanism would be the best representation of the underlying structure of the WOLF.
Recent research (e.g., Stenling et al., 2015; Gu et al., 2017b; Tóth-Király et al., 2018) that examined different approaches for modeling the latent structure of psychological measures indicated that a conventional confirmatory factor analysis approach may often fail to adequately capture the more complicated multidimensionality of some measures, and that more sophisticated modeling approaches may be needed. Such approaches may therefore help to clarify the issues concerning the latent structure of the WOLF discussed above. With these considerations in mind, it became necessary to revisit the latent structure of the WOLF.

Confirmatory Factor Analysis (CFA)
The confirmatory factor analysis (CFA) is the most commonly used approach to model construct-relevant multidimensionality. CFA, however, is often criticized for its overly restrictive independent cluster model (ICM) assumption, which requires that each item is defined by one, and only one, content domain. This assumption is operationalized in a CFA (ICM-CFA) model by constraining all cross-loadings to zeros, which could lead to unintended consequences, such as inflated factor correlations, poor goodness-of-fit indices, and poor discriminant validity, etc. Indeed, research indicated that the ICM-CFA model, even when the model fit was satisfactory, could lead to inflated factor correlations (e.g., Morin et al., 2017).

Bifactor CFA
The bifactor CFA model assumes: (a) the existence of a general factor that accounts for the communality shared by all the items; and (b) the existence of several group factors, each of which accounts for common variance shared within a cluster of items, beyond that of the general factor (Reise, 2012; Gu et al., 2015, 2017a). For model identification, orthogonality is assumed between the general factor and the specific factors. Such a bifactor CFA model better represents the multidimensionality of the underlying factor structure because of the coexistence of a general construct (e.g., flow at work) and some specific constructs (e.g., absorption, work enjoyment, and intrinsic work motivation).

Exploratory Structural Equation Modeling (ESEM)
Exploratory structural equation modeling provides an overarching framework which integrates CFA and exploratory factor analysis (EFA) into a single structural equation modeling (SEM) model. This model is more appropriate for investigating possible multidimensionality of a measure due to the associations between non-target constructs and imperfect items (Asparouhov and Muthén, 2009). ESEM relies on target rotation, which is a confirmatory form of rotation, to freely estimate cross-loadings. Compared with CFA, ESEM provides more accurate, typically lower, estimates of factor correlations, and these more accurate estimates of factor correlations result in better discriminant validity (Asparouhov and Muthén, 2009; Morin et al., 2016).

Bifactor Exploratory Structural Equation Modeling (B-ESEM)
Bifactor exploratory structural equation modeling was recently proposed by Morin et al. (2016) to examine the issue of construct-relevant multidimensionality. B-ESEM integrates both the bifactor model and the ESEM model into a single analytical framework. This new modeling approach not only allows the coexistence of the general construct and its subdomains (e.g., global flow, and absorption, work enjoyment, and intrinsic work motivation as specific components), but also takes the relations between items and non-target constructs into account. Theoretically, the B-ESEM is the most comprehensive and flexible of these models, and it can describe complex psychological characteristics more accurately.
Compared with B-ESEM, ESEM ignores the possible presence of hierarchically higher order construct(s) (e.g., global flow at work), which can lead to inflated cross-loadings. By contrast, the bifactor model, which is essentially a CFA model, neglects the possibility that items may have cross-loadings on the non-target specific factors. The consequence of fixing such cross-loadings to zero is to inflate the variance of the general factor (Sánchez-Oliva et al., 2017). B-ESEM, theoretically, overcomes both of these shortcomings.
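As an illustration of how the four model families differ, the following minimal sketch builds the loading specification that each one implies for the 13 WOLF items. The item labels (a1 to a4, w1 to w4, i1 to i5) follow those used in the Results; the function names and the status labels ("free", "fixed_0", "target_0") are illustrative assumptions and do not correspond to the syntax of any particular software.

```python
# Item-to-factor assignment for the 13 WOLF items (4 + 4 + 5).
TARGETS = {
    "absorption": ["a1", "a2", "a3", "a4"],
    "enjoyment": ["w1", "w2", "w3", "w4"],
    "motivation": ["i1", "i2", "i3", "i4", "i5"],
}


def loading_pattern(model):
    """Return {item: {factor: status}} for 'cfa', 'esem', 'bcfa' or 'besem'.

    Status labels (illustrative, not software syntax):
      'free'     -> loading freely estimated
      'fixed_0'  -> cross-loading constrained to zero (ICM assumption)
      'target_0' -> cross-loading estimated, but rotated toward zero
    """
    if model not in {"cfa", "esem", "bcfa", "besem"}:
        raise ValueError(f"unknown model: {model}")
    exploratory = model in {"esem", "besem"}  # cross-loadings are estimated
    bifactor = model in {"bcfa", "besem"}     # a G-factor is included
    pattern = {}
    for target_factor, items in TARGETS.items():
        for item in items:
            row = {}
            if bifactor:
                row["G"] = "free"  # every item loads on global flow
            for factor in TARGETS:
                if factor == target_factor:
                    row[factor] = "free"
                else:
                    row[factor] = "target_0" if exploratory else "fixed_0"
            pattern[item] = row
    return pattern


def count_status(pattern, status):
    """Count how many loadings in a pattern carry the given status."""
    return sum(s == status for row in pattern.values() for s in row.values())


for name in ("cfa", "esem", "bcfa", "besem"):
    p = loading_pattern(name)
    print(name,
          "free:", count_status(p, "free"),
          "fixed_0:", count_status(p, "fixed_0"),
          "target_0:", count_status(p, "target_0"))
```

With three specific factors, each item has two non-target loadings, which gives 13 × 2 = 26 cross-loadings: fixed to zero in the CFA-type models and estimated in the ESEM-type models.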

Nomological Validity of the WOLF
The nomological validity of WOLF could be supported by appropriate statistical relationships between work flow and external criterion variables such as autonomy and job satisfaction. As Morgeson et al. (2005) discussed, job autonomy reflects how much a job allows an employee to have discretion, freedom, and independence for work scheduling, or allows employees to make the necessary decisions to get the job done. Job satisfaction, on the other hand, is a person's agreeable or positive emotional state that is based on personal evaluation of one's occupation or job experiences (Locke, 1976). As Hackman and Oldham (1980) described in their job characteristics model, five important job characteristics (namely, task significance, skill variety, autonomy, feedback, and task identity) generate and enhance a person's flow experience. Of these five, autonomy seems to have the most beneficial effect on flow (Bakker, 2008;Mäkikangas et al., 2010;Lin and Joe, 2012). Empirical evidence also suggested that autonomy was significantly and positively associated with flow experience. For example, Fullagar and Kelloway (2009) revealed that autonomy was a significantly positive predictor for flow. In addition, many other studies showed that job satisfaction was closely related to work flow or its specific components (e.g., Maeran and Cangiano, 2013;Geyser et al., 2015;Zito et al., 2015).

The Present Study
We conducted this study with three specific aims. First, we intended to investigate WOLF's latent structure, by using both conventional and more recent modeling approaches, such as ESEM model, bifactor model, and B-ESEM model, for the purpose of resolving some issues related to WOLF's latent structure. Second, we intended to examine how invariant the WOLF structure was across gender groups. For this purpose, a series of progressively more stringent invariance conditions (e.g., ranging from configural, weak, strong, and to strict invariance) would be tested. Third, we intended to examine the nomological validity of the WOLF in relation to the relevant constructs of autonomy and job satisfaction, as suggested by the best model that emerged from the modeling analyses under the first aim.

Participants
The participants were 577 teachers recruited in Zhengzhou, a metropolitan area in central China. The sample's average age was 36.80 years old (SD = 9.04), and their average work seniority was 12.20 years (SD = 9.95). The majority of participants were female (71.9%) and married (83.5%). Among the participating teachers, 21.0% were teaching in kindergartens, 40.0% in primary schools, and 39.0% in secondary schools.

Flow at Work
The Work-Related Flow Inventory (WOLF; Bakker, 2008) was used to measure flow at work. This measure had 13 items designed to assess three dimensions of flow experience: (a) absorption (four items; sample item: "I get carried away by my work"), (b) work enjoyment (four items; sample item: "I do my work with a lot of enjoyment"), and (c) intrinsic work motivation (five items; sample item: "I find that I also want to work in my free time"). The items were rated on a 7-point response scale ranging from 1 (never) to 7 (always). For using this measure in the sample of Chinese teachers, the standard procedure of translation and back-translation (Brislin, 1986) was used to translate the original WOLF into Chinese, and both the English and Chinese items of the WOLF are available in Supplementary Table S1 of Supplementary Appendix. Cronbach's α was 0.92 for the total scale, and 0.85, 0.91, and 0.83 for the three subscales of absorption, work enjoyment, and intrinsic work motivation, respectively. The model-based reliability (i.e., omega coefficient, ω; Sijtsma, 2009) is reported in section "Results."
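For readers less familiar with the internal consistency coefficient reported above, the following minimal sketch computes Cronbach's α from raw item responses using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The function name and the small data set are hypothetical and are not the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-respondent item-score rows."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[j] for row in responses])
                 for j in range(n_items)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of four people to two 7-point items (not study data).
demo = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(demo), 2))
```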

Job Satisfaction
The Job Satisfaction Scale (Schriesheim and Tsui, 1980) was used to measure job satisfaction. The self-report scale contained six items, with each being rated on a 5-point Likert scale (1 = strongly disagree; 5 = strongly agree). A sample item is, "How satisfied are you with the nature of the work you perform?" Cronbach's α and the omega coefficient (ω) in the study sample were both 0.86.

Autonomy
Autonomy was measured by using the subscale of self-determination under the Psychological Empowerment Scale (Spreitzer, 1995). The subscale consisted of three items (e.g., "I have significant autonomy in determining how I do my job"). Participants responded to each item on a 5-point Likert scale (1 = strongly disagree; 5 = strongly agree). In this study, Cronbach's α and the omega coefficient (ω) were both 0.83.

Statistical Analysis
To achieve the aims of the study, statistical analyses were carried out in three phases. In the first phase of analyses, for understanding the measurement structure of the WOLF, a series of nine alternative models were examined to assess their respective goodness-of-fit, as follows.

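Model of unitary dimension:
Model 1: One-factor CFA model (global flow).
Models with two sub-domains:
Model 2: Two-factor CFA model (absorption, and work enjoyment/intrinsic work motivation).
Model 3: ESEM model (including absorption, and work enjoyment/intrinsic work motivation).
Model 4: Bifactor CFA model (B-CFA) with two specific domains (absorption, and work enjoyment/intrinsic work motivation).
Model 5: B-ESEM model, including two S-factors (absorption, and work enjoyment/intrinsic work motivation), and one G-factor (global flow).
Models with three sub-domains:
Model 6: Three-factor CFA model (absorption, work enjoyment, and intrinsic work motivation).
Model 7: ESEM model (including absorption, work enjoyment, and intrinsic work motivation).
Model 8: Bifactor CFA model (B-CFA) with three specific domains (absorption, work enjoyment, and intrinsic work motivation).
Model 9: B-ESEM model, including three S-factors (absorption, work enjoyment, and intrinsic work motivation), and one G-factor (global flow).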
Among the nine models above, Model 1 was the baseline model, which assumed one general factor of global flow without considerations for any sub-domains. Model 2 to Model 5 shared the general assumption of two sub-domains of work flow. Model 6 to Model 9 shared the general assumption of three subdomains of work flow.
In the first-order CFA models (Model 2 and Model 6), each item was specified to load on the factor (i.e., the content domain) that the item was assumed to measure, without cross-loadings on any other factors. In the first-order ESEM models (Model 3 and Model 7), all cross-loadings were specified to be freely estimated through oblique target rotation. The B-CFA models (Model 4 and Model 8) assumed that each item simultaneously loaded onto a global flow construct and one of the specific domains of flow, and that all factors were orthogonal (i.e., uncorrelated with each other). As for the B-ESEM models (Model 5 and Model 9), an item was not only defined by the G-factor and by an S-factor of its own, but it also reflected other conceptually adjacent subdomains (i.e., cross-loadings) through orthogonal bifactor-target rotation.
In the second phase of analyses, for the purpose of testing measurement invariance across gender groups, the best fitting model that emerged from the first phase of modeling analyses (i.e., Model 1 to Model 9; described above) was used, and measurement invariance analyses were conducted by using the sequence described in the literature (Millsap, 2011). The analyses tested progressively more stringent invariance assumptions: (a) configural invariance (invariance of factor structure), (b) weak invariance (#a satisfied, plus invariance of factor loadings), (c) strong invariance (#b satisfied, plus invariance of item intercepts), (d) strict invariance (#c satisfied, plus invariance of item uniquenesses), (e) latent variance-covariance invariance (#d satisfied, plus invariance of latent variance-covariance), and (f) latent means invariance (#e satisfied, plus invariance of latent factor means).
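The cumulative logic of this sequence, in which each step retains all earlier equality constraints and adds one more, can be sketched as follows. This is only an illustrative data structure with hypothetical names; it does not estimate any model.

```python
# Each step adds one set of cross-group equality constraints to all earlier ones.
INVARIANCE_STEPS = [
    ("configural", None),  # same factor structure, no equality constraints
    ("weak", "factor loadings"),
    ("strong", "item intercepts"),
    ("strict", "item uniquenesses"),
    ("latent variance-covariance", "latent variances and covariances"),
    ("latent means", "latent factor means"),
]


def constraints_up_to(level):
    """List every parameter set held equal across groups at a given level."""
    names = [name for name, _ in INVARIANCE_STEPS]
    if level not in names:
        raise ValueError(f"unknown invariance level: {level}")
    stop = names.index(level) + 1
    return [added for _, added in INVARIANCE_STEPS[:stop] if added is not None]


for name, _ in INVARIANCE_STEPS:
    print(f"{name}: {constraints_up_to(name)}")
```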
In the third phase of analyses, latent factors representing job satisfaction and autonomy were integrated into the retained measurement model to examine the nomological validity of the WOLF.
All modeling analyses were carried out by using the statistical modeling software Mplus 7.0 (Muthén and Muthén, 2012). In the modeling analyses, the robust maximum likelihood (MLR) estimation method was used, which provides estimates of standard errors and fit indexes appropriate for conditions such as ordinal Likert-scale item responses and data non-normality. For model fit assessment, we considered the following model-fit indices: the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), and the root mean square error of approximation (RMSEA) with its confidence intervals (CI). As suggested in the literature (Hu and Bentler, 1999; Marsh et al., 2004), adequate and excellent model fit may be indicated by values greater than 0.90 and 0.95, respectively, on CFI and TLI, and by values lower than 0.08 and 0.06, respectively, on RMSEA. For comparing alternative models, as discussed in Chen (2007), changes of ΔCFI and ΔTLI ≥ 0.01 and ΔRMSEA ≥ 0.015 in the direction of poorer fit could be considered as evidence against the more restrictive model.

Table 1 presents the means, standard deviations, and Pearson correlations for the measured variables. As expected, the three components of the WOLF (absorption, work enjoyment, and intrinsic work motivation) were positively related to each other (r = 0.66-0.72, p < 0.001). Absorption, work enjoyment, and intrinsic motivation correlated with both autonomy and job satisfaction (r = 0.44-0.68, p < 0.001).

Latent Structure of the WOLF
Results of model fit assessment for the nine alternative models, which represented different latent measurement structures of the WOLF as discussed previously, are displayed in the upper portion of Table 2. Based on comparison of the alternative two-subdomain (Models 2 to 5) and three-subdomain (Models 6 to 9) solutions, it is apparent that the three-subdomain solutions had a much better fit to the data than the two-subdomain counterparts. The parameter estimates for the two-subdomain models, which are reported in Supplementary Tables S2, S3 of Supplementary Appendix, further supported the three-subdomain solutions.
With the superiority of the three-subdomain solutions clearly supported, we shifted our focus to the comparisons of different forms of three-subdomain solutions (i.e., comparisons among Models 6 to 9). As discussed in Morin et al. (2016), we first compared the CFA (Model 6) and ESEM (Model 7) models, and it was revealed that the ESEM model (Model 7) showed better model fit (ΔTLI = 0.04, ΔCFI = 0.06, and ΔRMSEA = -0.02) than the CFA model (Model 6).
Tables 3 and 4 present the standardized factor loadings and factor correlations of these two models, which also provided [...]. Some of the cross-loadings in the ESEM solution were large enough to indicate the possibility of an unmodelled G-factor. Out of 26 cross-loadings, seven were between 0.20 and 0.30 (Items a1, a2, w1, w3, i1, and i2), and six were over 0.30 (Items i1, i2, i3, i4, and i5).
As shown in Table 2 [...] were lower in general than the loadings on the other group factors. In addition, two of five target loadings (Item i2, |λ| = 0.10; Item i5, |λ| = 0.01) on the intrinsic work motivation S-factor were statistically non-significant. These results suggested that the intrinsic work motivation S-factor is less well-defined than the other two, but acceptable.
More importantly, the B-ESEM model's cross-loadings (|λ| = 0.00-0.22; M = 0.09) were substantially lower than those of the ESEM model (|λ| = 0.05-0.51; M = 0.20). Furthermore, in the B-ESEM solution, only two cross-loadings were between 0.20 and 0.30 (Items a2 and w3), and none were over 0.30. All these findings provided strong support for retaining the B-ESEM model as the best representation of the structure of the WOLF.
For further assessing the appropriateness of the B-ESEM model, we calculated model-based coefficients of composite reliability (Perreira et al., 2018) for both the G-factor and the S-factors, based on the standardized model estimates. The composite reliability of both the general flow factor (ω = 0.94) and the work enjoyment S-factor (ω = 0.82) was very good. The composite reliability of the S-factor for absorption (ω = 0.67) was adequate, and that of the intrinsic work motivation S-factor (ω = 0.42) was marginal.
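A minimal sketch of how such a model-based composite reliability coefficient can be computed from standardized estimates is given below. It assumes the commonly used form ω = (Σ|λ|)² / [(Σ|λ|)² + Σδ], where λ are the standardized loadings on the factor of interest and δ are the item uniquenesses. The use of this exact form and the numeric values are assumptions made for illustration; they are not the estimates obtained in this study.

```python
def composite_reliability(loadings, uniquenesses):
    """Omega = (sum|lambda|)^2 / ((sum|lambda|)^2 + sum(delta))."""
    if len(loadings) != len(uniquenesses):
        raise ValueError("one uniqueness is needed per loading")
    # Squared sum of absolute standardized loadings on the factor.
    true_part = sum(abs(lam) for lam in loadings) ** 2
    return true_part / (true_part + sum(uniquenesses))


# Hypothetical standardized estimates for a four-item factor (not study data).
demo_loadings = [0.7, 0.7, 0.7, 0.7]
demo_uniquenesses = [0.51, 0.51, 0.51, 0.51]
print(round(composite_reliability(demo_loadings, demo_uniquenesses), 2))
```

In a bifactor solution with orthogonal factors, an item's uniqueness is what remains after all of its loadings (G-factor, target S-factor, and cross-loadings) are accounted for, so ω for an S-factor can be modest even when the items are reliable indicators of the G-factor.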

Assessment for Gender Group Measurement Invariance
The measurement invariance across gender for the B-ESEM model was assessed, and the findings are displayed in the lower portion of Table 2. The configural invariance model (Model A) showed very good model fit to the data (TLI = 0.96, CFI = 0.98, RMSEA = 0.05). Progressively more stringent invariance constraints were then successively imposed on factor loadings (Model B: weak invariance), item intercepts (Model C: strong invariance), and item uniquenesses (Model D: strict invariance). None of these more stringent invariance conditions caused model fit deterioration beyond the general guidelines (i.e., ΔCFI and ΔTLI ≥ 0.01 and ΔRMSEA ≥ 0.015). However, the model for the invariance of latent variance-covariance (Model E; ΔTLI = -0.02, ΔCFI = -0.01, ΔRMSEA = 0.02) and the model for latent means invariance (Model F; ΔTLI = -0.03, ΔCFI = -0.02, ΔRMSEA = 0.02) were not supported by the data. Further analysis showed that when males' factor means were fixed to zero for model identification purposes, female teachers' factor means were statistically higher on the work flow G-factor (M = 0.46, p < 0.001), but not statistically different on the S-factors (p > 0.05).
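The decision rules applied in this section can be sketched as two small functions: one that labels absolute fit using the cutoffs cited under Statistical Analysis, and one that flags deterioration in fit between successive invariance models. The function names are hypothetical, and treating the three change criteria as alternatives (any one is sufficient) is an assumption of this sketch.

```python
def classify_fit(cfi, tli, rmsea):
    """Label fit using the cutoffs cited in the text (0.95/0.06, 0.90/0.08)."""
    if cfi > 0.95 and tli > 0.95 and rmsea < 0.06:
        return "excellent"
    if cfi > 0.90 and tli > 0.90 and rmsea < 0.08:
        return "adequate"
    return "poor"


def fit_deteriorated(d_cfi, d_tli, d_rmsea, tol=1e-9):
    """True if a more restrictive model loses >= 0.01 in CFI or TLI,
    or gains >= 0.015 in RMSEA, relative to the preceding model.
    Treating the three criteria as alternatives is an assumption here;
    tol guards against floating-point rounding at the boundary."""
    return (d_cfi <= -0.01 + tol
            or d_tli <= -0.01 + tol
            or d_rmsea >= 0.015 - tol)


# Values reported in the text for the gender-invariance models.
print(classify_fit(cfi=0.98, tli=0.96, rmsea=0.05))              # Model A
print(fit_deteriorated(d_cfi=-0.01, d_tli=-0.02, d_rmsea=0.02))  # Model E
print(fit_deteriorated(d_cfi=-0.02, d_tli=-0.03, d_rmsea=0.02))  # Model F
```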

Nomological Validity
For the purpose of examining the nomological validity of the WOLF, external CFA factors for job satisfaction and autonomy [...]. As shown in Table 6, the flow G-factor and two S-factors (i.e., work enjoyment and intrinsic work motivation) were significantly and positively associated with job satisfaction and autonomy. By contrast, the absorption S-factor was not significantly associated with these two external factors.

DISCUSSION
This study is the first attempt to investigate the latent structure of the WOLF by using both CFA and ESEM approaches. Consistent with previous research (e.g., Bakker, 2008; Christensen, 2009; Happell et al., 2015), this study found that the one-factor CFA solution was far from being acceptable, indicating that work-related flow should be considered as consisting of multiple dimensions, rather than of a unitary dimension. The WOLF was originally designed to assess three interrelated content domains (Bakker, 2008), and the three-factor structure was shown in various samples and in different cultures. However, Rodríguez-Sánchez et al. (2011) and Llorens et al. (2013) discussed that only enjoyment and absorption were the essence of the work-related flow experience. Enjoyment could be considered as some kind of motivation (Davis et al., 1992), and intrinsic motivation might be an antecedent, instead of a core component, of work-related flow (Deci and Ryan, 1985; Llorens et al., 2013). The study by Happell et al. (2015) in an Australian sample showed that the items for two domains, work enjoyment and intrinsic work motivation, loaded onto one dimension, providing support for the argument described above. For the purpose of understanding whether these two components might be combined into one factor, we compared the two-factor CFA (absorption and work enjoyment/motivation) and the three-factor CFA (absorption, work enjoyment, intrinsic work motivation) solutions, and found that the goodness-of-fit of the latter model substantially exceeded that of the former. More importantly, we found that the correlation between work enjoyment and intrinsic work motivation in the three-factor CFA model was indeed high (i.e., 0.88), which was in line with previous findings (e.g., Bakker, 2008; Geyser et al., 2015; Zito et al., 2015). With such findings based primarily on conventional CFA approaches, it is difficult to decide which model should be preferable for the WOLF. Therefore, new modeling approaches (e.g., B-CFA, ESEM, and B-ESEM) may be needed to further examine the dimensionality of the WOLF structure.
In line with prior research on multidimensional data, the comparison between the ICM-CFA model and ESEM model in this study revealed that the ESEM model was preferable, as ESEM had better model fit, and the factors showed better differentiations between each other as indicated by the lower inter-factor correlations. The ESEM solution, similar to the three-factor CFA, only considered the subdomains as separate factors for absorption, work enjoyment, and intrinsic work motivation, without the consideration for a possible overarching global factor. The observation of multiple cross-loadings of sizable magnitude (|λ| > 0.20, or even 0.30) in the ESEM model suggested that a global work-related flow factor might be present in the data. The comparison of ESEM and B-ESEM solutions provided support for this possibility. First, B-ESEM had substantially better model fit to the data. Second, the general flow dimension in B-ESEM appeared to be well defined, with the items showing moderate to large loadings on this general flow factor. Third, the composite reliability of the flow G-factor (ω = 0.94) was excellent. Fourth, the specific factors of absorption and work enjoyment were well-defined, while the specific factor of intrinsic work motivation was less well-defined, but generally acceptable. Finally, cross-loadings in the B-ESEM solution were generally lower than those of the ESEM solution.
In general, if the composite reliability (ω) of a specific factor is sufficiently high (e.g., >0.5), it indicates that the subscale score accounts for a meaningful amount of variance beyond the G-factor (Perreira et al., 2018). The findings in this study showed that the specific factors of absorption (ω = 0.67) and work enjoyment (ω = 0.82) each had a substantial amount of specificity of their own, over and above the global flow. On the other hand, the specific factor of intrinsic work motivation (ω = 0.42) was less well-defined and had relatively low composite reliability. However, three of the five target loadings exceeded 0.3, indicating that this specific factor still had an acceptable degree of specificity beyond the G-factor. Therefore, we suggest reporting the total score and the subscale scores of absorption and work enjoyment when using the WOLF in practice. The subscale score of intrinsic work motivation should be used with caution.
As shown earlier, the B-ESEM solution showed the best model fit. In the B-ESEM model, however, Item w3 ("I feel happy during my work") not only reflected the global work-related flow and the subdomain of work enjoyment, but it also had a substantial cross-loading (λ = 0.22) on the non-target intrinsic work motivation S-factor. This, however, was reasonable, because employees who are happy at work are usually motivated intrinsically by their work (Geyser et al., 2015). Psychometrically, it may not be realistic to require that each item reflects one, and only one, content domain of multidimensional constructs (Asparouhov and Muthén, 2009).
In addition, the findings also provided support for strict measurement invariance of the B-ESEM solution across gender groups, suggesting that this model was well-replicated across subsamples of male and female teachers. For the latent mean differences, the results revealed that female teachers showed a higher level of global work-related flow experience than male teachers. These findings were consistent with previous research showing that female teachers reported greater engagement and satisfaction with the work and lower burnout (Okpara et al., 2005;Rey et al., 2012). This finding may be related to socially constructed gender roles. More specifically, as discussed in Motro and Ellis (2016), the society has a higher expectation for women to carry out communal roles and display the related traits (e.g., friendliness, sympathy, gentleness, caring, and kindness, etc.). On the other hand, society has a higher expectation for men to carry out agentic roles and display these associated traits (e.g., power, dominance, independence, aggression, and competence, etc.). The theory about role congruity suggests that, when a group's stereotype is not matched with the expected social roles, biased responses may occur (Diekman and Hirnisey, 2007). Due to the incongruity between the demands of teaching and the typically expected societal roles of males, male teachers may experience lower level of flow.
The relationships between the WOLF factors with external factors of autonomy and job satisfaction supported the nomological validity of the WOLF. The global flow experience was found to be positively associated with autonomy, and this makes sense, as previous research (e.g., Fried and Ferris, 1987;Saavedra and Kwun, 2000) indicated that when employees could schedule their work and determine some aspects of their job, this could contribute to the employees' positive affect and motivation. This finding is in line with the empirical findings in previous research related to the job characteristics model (Hackman and Oldham, 1980), and to the job demands-resources model (Bakker and Demerouti, 2007), in that high levels of job resources (e.g., autonomy and social support) lead to work-related flow (Zito et al., 2016).
The other finding that the overall work-related flow was positively related to job satisfaction is also in line with previous research, which indicated that flow experience had an important effect on job satisfaction (Geyser et al., 2015), and the psychological state of flow was considered critical in redesigning interventions in the workplace in order to promote job satisfaction (Maeran and Cangiano, 2013). Our results also revealed that only the specific factors of work enjoyment and intrinsic work motivation, but not the absorption S-factor, had positive relationship with job satisfaction and autonomy, confirming the notion that absorption might have some overlap with the holistic description of flow (Bakker, 2008).
Despite the strength of this study in using systematic modeling approaches to examine the latent structure of the WOLF, there are some limitations in this study. One limitation is that the study relied on a convenience sample of Chinese teachers, which may limit the generalizability of findings to a wider context. Future research could use samples from other cultures and from other types of employees. Another limitation is that our assessment of the underlying structure of the WOLF was based on cross-sectional data only. Future research may consider the longitudinal stability of the B-ESEM structure.
In summary, our results supported that the B-ESEM solution could best represent the underlying structure of WOLF scores, and this model incorporates two aspects of psychometric multidimensionality: one is the result of the conceptual adjacency of content domains of flow (e.g., work enjoyment and intrinsic work motivation), and the other is associated with the coexistence of the global work-related flow and the three specific components. Furthermore, the strict gender-group measurement invariance of the B-ESEM model was supported. Female teachers, however, showed a higher level of global work-related flow experience than the male teachers. Finally, the nomological validity of WOLF ratings was supported by the statistical relationships of the WOLF factors with job satisfaction and autonomy.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding authors.

ETHICS STATEMENT
The study protocol was approved by Hunan Normal University Research Ethics Committee. All participants gave written informed consent.