Psychometric Evaluation of the Chinese Recovering Quality of Life (ReQoL) Outcome Measure and Assessment of Health-Related Quality of Life During the COVID-19 Pandemic

Objective The primary objective was to translate the Recovering Quality of Life (ReQoL) measures from English to Traditional Chinese and assess their psychometric properties in the Hong Kong (HK) Chinese population. The secondary objective was to investigate the mental health-related quality of life (HRQoL) of this sample during the coronavirus disease 2019 (COVID-19) pandemic. Method The ReQoL was translated into Traditional Chinese following the standard guidelines recommended by the official distributors. Five hundred members of the general population were recruited to participate in a telephone-based survey. The following psychometric properties of the ReQoL were evaluated: construct, convergent, and known-group validity, internal consistency, and test–retest reliability. Item measurement invariance was assessed on the basis of differential item functioning (DIF). Multiple regression analysis was used to assess the relationship between respondents’ characteristics and mental HRQoL. Results Confirmatory factor analysis (CFA) supported a two-factor structure of the ReQoL. The ReQoL correlated significantly with the other mental health, quality of life, and well-being measures, indicating satisfactory convergent validity. Known-group validity analysis confirmed that the ReQoL is able to differentiate between people with different mental health status. Cronbach’s alpha (0.91 and 0.76 for the positive [PF] and negative [NF] factors, respectively) and McDonald’s omega of 0.89 (PF = 0.94, NF = 0.82) indicated good internal consistency, and an intraclass correlation coefficient of 0.75 indicated acceptable test–retest reliability. Four items showed negligible DIF with respect to age. Respondents who were highly educated and without psychological problems reported a high ReQoL score. Conclusion The Traditional Chinese ReQoL was shown to be a valid and reliable instrument to assess recovery-focused quality of life in the HK general population. 
Future studies are needed to appraise its psychometric properties in local people experiencing mental disorders.


INTRODUCTION
Health-related quality of life (HRQoL) captures an individual's or population's perceived physical and mental health status over time. It can provide comprehensive information on the burden of preventable diseases, injuries, and disabilities from the perspective of person-centered care (Yin et al., 2016). HRQoL is usually assessed using patient-reported outcome measures (PROMs), which include multiple items reflecting people's self-perceived physical and emotional functioning and health status. Traditionally, HRQoL has been an important outcome for assessing the effectiveness of interventions on people's physical health (Goldhagen et al., 2016; Xu et al., 2017; Wong E.L.Y. et al., 2019). Recently, however, in evaluating the outcomes of mental health care, promoting "recovery," the extent of which predicts changes in HRQoL (Garner et al., 2014), has drawn increasing professional attention and has emerged as a new paradigm for assessing the full journey of people in overcoming the detrimental effects of mental health problems (Ellison et al., 2016). Mental health recovery is a self-directed process of healing and transformation (Deegan, 2002), which is promoted through an interaction between individual experience, community environment, and social engagement.
Although several PROMs are available to assess the effectiveness of interventions that reduce the symptoms of mental illness, few of them focus on appraising improvement in mental HRQoL (Keetharuth et al., 2018a). Currently, the EuroQol five-dimension instrument (EQ-5D), a generic preference-based measure, is the most recommended PROM for assessing people's HRQoL worldwide (Herdman et al., 2011) and has been increasingly used to measure HRQoL in different populations in HK. However, the EQ-5D focuses on physical health, with only one of its five items capturing mental health. In mental health, the validity of the EQ-5D is rarely reported, and the available evidence suggests a potentially mixed picture (Brazier, 2010). Although there is evidence that generic instruments are able to reflect the impact of common conditions such as mild to moderate depression and anxiety (Lamers et al., 2006), an increasing number of studies have shown conflicting evidence on their validity for patients with schizophrenia, depression, and bipolar disorder (Barton et al., 2009; Papaioannou et al., 2011; Mulhern et al., 2014). Therefore, the Recovering Quality of Life (ReQoL) outcome measure, with an emphasis on mental health, was constructed to assess recovery-focused quality of life (Keetharuth et al., 2018b).
Recovering Quality of Life was developed by a team led from The University of Sheffield, United Kingdom and funded by the Department of Health Policy Research Program (ReQoL, 2021). It is a self-completed questionnaire with two versions, ReQoL-20 and ReQoL-10, which contains 20 and 10 mental health items and one physical health item (not included in the scoring), respectively. The ReQoL was developed with significant inputs from service users not only as participants but also as research partners. The psychometric analyses in the development stage were based on data from over 6,450 participants, which significantly increased the face and content validity of the measure (Keetharuth et al., 2018b). A bifactor model of the ReQoL comprising a global factor and two local factors of negative and positive affects was reported by the developers (Keetharuth et al., 2018a). ReQoL has been translated into different languages and shown good reliability and validity (Keetharuth et al., 2018b;Chua et al., 2020;van Aken et al., 2020).
At present, widespread outbreaks of coronavirus disease 2019 (COVID-19) have drastically changed multiple aspects of the people's lives, with a growing number struggling with several mental health issues (Khan et al., 2020). Mass lockdown, increased rate of unemployment and the fears and uncertainties of the pandemic have not only exacerbated the psychiatric symptoms for patients with mental illnesses, but also affected the general population who may not have previously experienced psychological distress and symptoms of mental illness (Shigemura et al., 2020;Zhang and Ma, 2020). Chew et al. (2020) indicated that these psychological responses affect the well-being of the individual and community, and the impact could persist well after the outbreak. Traditional mental health measures, e.g., General Anxiety Disorder-7 (GAD-7) and Patient Health Questionnaire (PHQ-9), were developed on the basis of meeting clinical criteria and assess the effectiveness of intervention from the perspective of reducing symptoms (Keetharuth et al., 2018b). They cannot be used to reflect the lived experience of general population and measure their HRQoL outcomes. The lack of a well-defined, multidimensional, and psychometrically valid measure of recovery in Hong Kong (HK) has been cited as a potential barrier to providing suitable healthcare for improving their mental HRQoL (Mak et al., 2016). Although the ReQoL was constructed to assess the HRQoL for patients with a broad spectrum of mental illnesses, it could be a useful instrument to capture general wellbeing in the pandemic as well as establishing whether this questionnaire could be used for public health interventions in the general population. Therefore, the primary objective of this study was to translate and culturally adapt the ReQoL from English to Traditional Chinese (ReQoL-TC) and assess its psychometric properties in HK general population. 
The secondary aim was to investigate the recovery-focused HRQoL of this sample using the ReQoL-TC during the COVID-19 pandemic.

Translation and Cultural Adaptation
We adhered to standard guidelines "Translation and Linguistic Validation Process for the ReQoL" provided by the official distributors of the ReQoL in translating and culturally adapting the ReQoL in Traditional Chinese (Wild et al., 2005). Dual forward translation was undertaken independently by two professional translators, who were native Chinese speakers but proficient in English. The local research team used both translations to perform the forward translation reconciliation. A revised version was then produced and sent to another two professional translators, who were native English speakers but proficient in Chinese for backward translation independently. The local and ReQoL research team jointly examined the back translation against the original English version to identify any discrepancies, addressed the disputed items, and refined the translation focused on cultural adaption until consensus was achieved by all the research team members.
Cognitive debriefing was conducted with ten members of the general population who were invited to comment on the response options and any wording they found difficult in understanding the ReQoL-TC. Respondents were asked to describe in their own language what the wording meant to them. We paid special attention to the items that we had to adapt and were confident they were understood in the way intended as per the developer's concept elaboration. After proofreading, the final version of the ReQoL-TC was confirmed.
The local research team has extensive experience in the translation and cultural adaptation of PROMs and other health outcome-related questionnaires. The eligibility of the translators was approved by the ReQoL distributor. The results were discussed with the ReQoL developers, one of whom was invited to join the local research team to monitor the project and ensure the quality of the development.

Sample and Data Collection
A random telephone survey of the general population in HK was carried out by a team of telephone survey professionals in July 2020, following recruitment methods similar to those of a previous study (Chan et al., 2019). Before the formal survey, a pilot study with ten randomly selected persons was conducted to test the logic of the telephone survey. To minimize sampling error, telephone numbers were first selected randomly from an updated telephone directory as seed numbers. Another three sets of numbers were then generated by randomizing the last two digits in order to reach unlisted numbers. Duplicate numbers were screened out, and the remaining numbers were mixed in random order to form the final sample. Interviews were carried out by experienced interviewers between 10:00 and 22:00 on weekdays, and at other times, including weekends and public holidays, when appointments with suitable subjects had been arranged. The inclusion criteria were: (1) HK permanent resident; (2) ≥18 years; (3) no cognitive problems; and (4) able to provide informed consent. Upon successful contact with a target household, one qualified member was selected using the last-birthday random selection method (i.e., the household member who had most recently had their birthday was selected to participate in the telephone interview). Given that a sample size of around 300-500 is believed to provide sufficient power to estimate parameters in confirmatory factor analysis (CFA; DeVellis, 2017), a minimum sample size of 500 was set for this study. In total, data from 500 participants were collected. Approximately 72.2% were female, 60.6% were ≥60 years, 64.4% had completed secondary education or above, 62.2% reported no chronic conditions, and only 4.4% (n = 22) indicated that they had visited a psychiatrist within the last 12 months (Table 1).
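The number-generation step described above can be illustrated with a minimal Python sketch. It is not the survey team's actual procedure: the function name `expand_seed_numbers`, the `sets_per_seed` parameter, and the example phone numbers are hypothetical, and the only behavior taken from the text is that the last two digits of each seed number are randomized three times, duplicates are removed, and the pool is shuffled.

```python
import random


def expand_seed_numbers(seeds, sets_per_seed=3, rng=None):
    """Build a dialing sample from directory seed numbers (illustrative only).

    For every seed, `sets_per_seed` extra numbers are created by replacing
    the last two digits with random digits, so that unlisted numbers can be
    reached. Duplicates are dropped and the pool is shuffled.
    """
    if rng is None:
        rng = random.Random()
    pool = set(seeds)  # a set removes duplicate numbers automatically
    for seed in seeds:
        for _ in range(sets_per_seed):
            suffix = f"{rng.randrange(100):02d}"  # two random digits, 00-99
            pool.add(seed[:-2] + suffix)
    sample = sorted(pool)  # sort first so the shuffle is reproducible
    rng.shuffle(sample)
    return sample


# Hypothetical 8-digit seed numbers, used only to show the call.
numbers = expand_seed_numbers(["21234567", "29876543"], rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes the sample reproducible, which is convenient for auditing a sampling frame.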

Recovering Quality of Life
The ReQoL-TC was used in this study. It comprises 20 mental health items and one physical health item. The ReQoL-TC-10 (11 items) comprises the first 10 items of the ReQoL-TC and the physical health item. Of the 20 items of the ReQoL-TC, 11 are positively worded and nine are negatively worded. All items are scored on a five-point Likert scale ranging from "None of the time" to "Most or all of the time." A sum score is calculated by summing the scores of all items (except for the physical health item), where a higher score indicates a better quality of life.
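A minimal Python sketch of the sum score follows. It rests on assumptions not spelled out in the text: raw responses are coded 0 ("None of the time") to 4 ("Most or all of the time"), and negatively worded items are reverse-scored so that a higher total means better quality of life. The function name and the `negative_items` argument are hypothetical, and the official scoring guidance should be consulted for the actual item keys.

```python
def reqol_sum_score(responses, negative_items):
    """Sum score over the mental health items (illustrative only).

    responses      -- dict mapping item number to raw response 0..4,
                      excluding the physical health item
    negative_items -- set of item numbers that are negatively worded
                      (assumed to be reverse-scored)
    """
    total = 0
    for item, raw in responses.items():
        if raw not in (0, 1, 2, 3, 4):
            raise ValueError(f"item {item}: response must be 0-4, got {raw}")
        # Reverse-score negatively worded items so higher is always better.
        total += (4 - raw) if item in negative_items else raw
    return total


# Toy example with three items; item 2 is treated as negatively worded.
example = reqol_sum_score({1: 4, 2: 0, 3: 2}, negative_items={2})
```

Under this coding a 10-item sum ranges from 0 to 40, consistent with the 0-40 scale mentioned in the Discussion.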

General Anxiety Disorder-7
The GAD-7 is a self-rated scale that measures the severity of generalized anxiety disorder. It has seven items (e.g., "Feeling nervous, anxious, or on edge"), each scored from zero (not at all) to three (nearly every day) (Kroenke et al., 2007). The sum score ranges from zero to 21, and the cut-off points for mild, moderate, and severe anxiety symptoms are 5, 10, and 15, respectively (Spitzer et al., 2006). The Chinese GAD-7 has been validated (Tong et al., 2015). The Cronbach's alpha (internal consistency reliability) of the GAD-7 in our sample was 0.93.
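The cut-off points above translate into a simple banding rule, sketched here in Python. The function name and the label "none" for scores below 5 are illustrative choices; the text only states that scores below 5 indicate no clinical concerns.

```python
def gad7_severity(item_scores):
    """Return (sum score, severity band) for seven GAD-7 item scores (0-3)."""
    if len(item_scores) != 7 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("GAD-7 needs seven item scores, each between 0 and 3")
    total = sum(item_scores)
    if total >= 15:
        band = "severe"
    elif total >= 10:
        band = "moderate"
    elif total >= 5:
        band = "mild"
    else:
        band = "none"  # below the lowest cut-off
    return total, band
```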

Depression Anxiety Stress Scales-21 Items
The Depression Anxiety Stress Scales-21 items (DASS-21) measures a higher-order factor of psychological distress (Lovibond and Lovibond, 1995) over the past week, using seven items in each of the domains of depression, anxiety, and stress. Each item (e.g., "I found it hard to wind down") is rated on a four-point Likert scale ranging from "never applied to oneself" (0) to "very much/most of the time" (3). Final scores for the three subscales are calculated by summing the scores for the relevant items (DASS, 2021). The Chinese DASS-21 has been validated (Wang et al., 2016). The Cronbach's alpha (internal consistency reliability) of the depression, anxiety, and stress subscales in our sample was 0.75, 0.7, and 0.72, respectively.

EQ-5D-5L
The EQ-5D-5L is a generic preference-based measure to estimate people's HRQoL (Herdman et al., 2011). It has two sections: the descriptive system and the visual analog scale (EQ-VAS).
The descriptive system comprises five items (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each with five levels (from "no problem" to "extreme problem"). Utility scores were calculated using the EQ-5D-5L HK value set (Wong E.L. et al., 2019). The EQ-VAS is a vertical scale used to measure people's overall health, with values between 0 (worst imaginable health) and 100 (best imaginable health). The Cronbach's alpha (internal consistency reliability) of the EQ-5D-5L (descriptive system) in our sample was 0.82.

ICEpop CAPability Measure for Adults
The ICEpop CAPability measure for adults (ICECAP-A) is a generic preference-based measure that evaluates an individual's capability well-being (Al-Janabi et al., 2012). The descriptive system of the ICECAP-A has five items (stability, attachment, autonomy, achievement, and enjoyment) with four response options ranging from "fully capable" to "not capable." The Chinese ICECAP-A has been validated (Tang et al., 2018). In the absence of a value set for the Chinese population, we calculated the sum score of the ICECAP-A by summing the scores of the five items, where a higher score represents poorer capability well-being. The Cronbach's alpha (internal consistency reliability) of the ICECAP-A in our sample was 0.77.

Construct, Convergent, and Known-Group Validity
Confirmatory factor analysis was used to assess construct validity. To assess the dimensionality of the ReQoL-TC, two models were developed in line with the original study (Keetharuth et al., 2018a). The first was a bi-factor model, with one global factor and two specific factors containing the positively worded and negatively worded items, respectively. The second was a two-factor model, with the positively worded items and the negatively worded items as two separate factors. Model fit was assessed by the root mean square error of approximation (RMSEA ≤ 0.08), standardized root mean squared residual (SRMR ≤ 0.08), Tucker-Lewis index (TLI ≥ 0.9), and comparative fit index (CFI ≥ 0.9) (DeVellis, 2017). The factor loadings of each item were also checked. For the bi-factor model, we calculated the explained common variance for the global factor to assess its importance relative to the two other factors (Reise et al., 2010). To address the issue of non-normal data, the robust distribution-free weighted least squares (WLSMV) estimator was used (Ferrando and Lorenzo-Seva, 2000; Markon, 2019).
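The fit thresholds and the explained common variance (ECV) can be made concrete with a short Python sketch. The function names are hypothetical, and the ECV formula used here is the usual definition (squared loadings on the global factor divided by all squared loadings), which the text does not spell out. The loadings in the example are toy values, not the study's estimates.

```python
def meets_fit_criteria(rmsea, srmr, tli, cfi):
    """Check the four cut-offs stated above for acceptable model fit."""
    return rmsea <= 0.08 and srmr <= 0.08 and tli >= 0.9 and cfi >= 0.9


def explained_common_variance(global_loadings, specific_loadings):
    """ECV of the global factor in a bi-factor model (assumed definition).

    global_loadings   -- standardized loadings on the global factor
    specific_loadings -- standardized loadings on the specific factors
    """
    global_part = sum(l * l for l in global_loadings)
    specific_part = sum(l * l for l in specific_loadings)
    return global_part / (global_part + specific_part)


# Fit indices as reported in the Results for the two models.
bifactor_ok = meets_fit_criteria(rmsea=0.029, srmr=0.058, tli=0.989, cfi=0.991)
two_factor_ok = meets_fit_criteria(rmsea=0.066, srmr=0.076, tli=0.942, cfi=0.948)

# Toy loadings: equal global and specific loadings give an ECV of 0.5.
toy_ecv = explained_common_variance([0.6, 0.6], [0.6, 0.6])
```

An ECV near 0.5 means the global factor accounts for only about half of the common variance, which is why a value of this size argues against treating a scale as essentially unidimensional.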
Several a priori hypotheses about the relationship between the ReQoL-TC and the other HRQoL and mental health instruments were formulated to test the convergent validity of the ReQoL-TC (Keetharuth et al., 2018b; Chua et al., 2020; van Aken et al., 2020) (specific hypotheses are presented in Supplementary Table 1). The strength of each correlation was estimated using Pearson's correlation coefficient (r), where r ≥ 0.55 was interpreted as adequate.
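For readers who want to see the calculation, Pearson's r and the 0.55 threshold can be sketched in Python as below. The function names are hypothetical, and applying the threshold to the absolute value of r is an assumption made here because some measures are scored in the opposite direction; the text states the criterion only as r ≥ 0.55.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def is_adequate(r, threshold=0.55):
    """Assumed rule: compare the magnitude of r with the threshold."""
    return abs(r) >= threshold
```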
The known-group validity was examined by (i) comparing people self-reporting specific mental health conditions versus those who did not; and (ii) using GAD-7 and DASS clinical cutoff points (where a score of <5 on GAD-7 (Spitzer et al., 2006) and depression [≤9], anxiety [≤7], and stress [≤14] on DASS indicate no clinical concerns) (Brumby et al., 2011). While GAD-7 and DASS-21 do not measure aspects of quality of life per se, it can be assumed that they define broad groups expected to have different quality of life scores.

Item Statistics, Internal Consistency, and Test-Retest Reliability
Feasibility and acceptability of the ReQoL-TC were assessed by the time taken to complete the questionnaire and the proportion of missing values per item (respondents were allowed to skip questions). The mean, standard deviation (SD), median, and range of the ReQoL-TC scores (both 20- and 10-item versions) were reported. We also calculated ceiling and floor effects, skewness, and kurtosis. Internal consistency reliability of the ReQoL-TC was measured by Cronbach's alpha (α > 0.7, acceptable), McDonald's omega (ω > 0.7, acceptable), Guttman's lambda 4 (λ > 0.8, suitable) (McDonald, 1999), the item-total correlation (>0.5, acceptable), and alpha if an item is dropped (DeVellis, 2017). Selected participants (10%) were invited to take part in another telephone survey two weeks later (only respondents who did not report experiencing any significant life event were invited) to assess test-retest reliability using the intraclass correlation coefficient (ICC > 0.7, acceptable; two-way mixed effects model) (Fleiss, 1999).
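As an illustration of two of these statistics, the Python sketch below computes Cronbach's alpha from a respondents-by-items matrix and the proportions of respondents at the lowest and highest possible sum score. It is not the analysis code used in the study (which relied on R packages); the function names and toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])  # number of items
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def floor_ceiling(sum_scores, min_score, max_score):
    """Proportion of respondents at the minimum and at the maximum score."""
    n = len(sum_scores)
    floor = sum(1 for s in sum_scores if s == min_score) / n
    ceiling = sum(1 for s in sum_scores if s == max_score) / n
    return floor, ceiling


# Toy data: three respondents, two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Alpha approaches 1 as items covary more strongly; in the toy data the two items agree only partially, so alpha is well below 1.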

Differential Item Functioning
Differential item functioning (DIF) was evaluated with regard to respondents' natural attributes, i.e., gender (male vs. female) and age (regrouped into two groups; G1: < 50 years vs. G2: ≥ 50 years), which are unchanging characteristics (Cherepanov et al., 2010; Roberts et al., 2014; Zumbo, 1999). DIF analyses were conducted for the two dimensions differentiated in the ReQoL-TC. Univariate (one-way analysis of variance) and multiple (ordinary least squares regression) analyses were used to compare the ReQoL-TC sum scores reported by respondents with different background characteristics, including sex, age, and education. Analyses were conducted in R (R Core Team, 2013). CFA, reliability, ICC, and DIF were analyzed using the "lavaan," "psych," "ICC," and "lordif" packages, respectively. The level of significance was set at p ≤ 0.05.

RESULTS
The model fit statistics from the CFA are presented in Table 2.
The goodness-of-fit indices indicated an acceptable fit for both the bi-factor model (χ² = 214.7, degrees of freedom [df] = 150, p < 0.001, RMSEA = 0.029, SRMR = 0.058, TLI = 0.989, and CFI = 0.991) and the two-factor model (χ² = 540.3, df = 169, p < 0.001, RMSEA = 0.066, SRMR = 0.076, TLI = 0.942, and CFI = 0.948) of the ReQoL-TC. The factor loadings for the two-factor model ranged between 0.29 and 0.86. For the bi-factor model, the loadings for the global factor were smaller than 0.3 for 11 out of 20 items (40 factor loadings: 10 for the positive factor, 10 for the negative factor, and 20 for the global factor), and the explained common variance was 51%. Considering the results of the CFA and the bi-factor structure reported by the original study, we concluded that the bi-factor model outperformed the two-factor model in this sample of the HK general population. The standardized factor loadings for the observed variables of both models are presented in Table 3.
The distributions of the ReQoL-TC sum and factor scores are presented in Table 4. The mean scores (SD) for the ReQoL-TC-20 and the ReQoL-TC-10 were 60.33 (10.52) and 28.54 (6.63), respectively, and no sum score of the three scales showed a ceiling or floor effect. The internal consistency reliability of the ReQoL-TC-20 (α = 0.86 [0.91 and 0.76 for the two factors, respectively]) and the ReQoL-TC-10 (α = 0.84) was acceptable. The ICC values confirmed that the test-retest reliability of the ReQoL-TC-20 (ICC overall = 0.75, ICC positive = 0.79, and ICC negative = 0.71) and the ReQoL-TC-10 (ICC = 0.71) was satisfactory. No missing data were identified for the ReQoL-TC, and the average time to complete the measure was around 5 min, indicating good feasibility and acceptability. The response distribution, factor-level item-total correlations, and alpha if item dropped for the ReQoL-TC are presented in Supplementary Table 3. The sum and factor score distributions of the ReQoL-TC are presented in Supplementary Figure 2. Tables 5 and 6 show the results for the convergent and known-group validity of the ReQoL-TC. All the correlations between measures were significant, and the signs were as expected. Both the ReQoL-TC-20 and the ReQoL-TC-10 showed a significant correlation with GAD-7, DASS-21, EQ-5D, and ICECAP-A scores. Most correlation coefficients of the ReQoL-TC-20 were larger than those of the ReQoL-TC-10. The ICECAP-A item of enjoyment (r = −0.49) showed the strongest correlation with the sum score of the ReQoL-TC-20, followed by the ICECAP-A items of stability (r = −0.47) and attachment (r = −0.47). The correlation coefficients of the ReQoL-TC-10 with the other measures ranged between 0.23 (GAD-7) and 0.48 (ICECAP-A item of attachment). The results of ANOVA indicated that participants with clinical mental health status showed poorer quality of life than those without, confirming the known-group validity of the ReQoL-TC.
Factor-level DIF analysis found that item 14 (negative factor) showed both uniform and non-uniform DIF on sex. Three other negatively worded items, 6 and 10 (uniform) and 9 (non-uniform), showed DIF on age. Two positively worded items, 3 (uniform) and 8 (non-uniform), also showed DIF on age. According to McFadden's R², the effect size of DIF was negligible for all six items (<0.001-0.06) (Supplementary Table 4).
Results of the univariate and multiple analysis are presented in Table 7. Respondents who were highly educated and living with no psychological problems tended to report a high ReQoL-TC sum score.

DISCUSSION
This study presented the development of the Traditional Chinese version of the ReQoL, which exhibited acceptable psychometric properties in the HK general population. The translation process adhered to accepted international translation standards. A two-factor structure, comprising the positively and negatively worded items as separate factors, was confirmed with acceptable model fit and factor loadings. We have not found a commonly agreed threshold for interpreting the explained common variance. However, previous studies have concluded that scales were sufficiently unidimensional if they obtained ECV values of around 70-80%, which is much higher than the 51% found in this study (Reise et al., 2013). The ReQoL-TC showed good convergent validity in its correlations with other instruments measuring HRQoL, mental health, and well-being, and sufficient discriminative power to differentiate people with and without clinical mental health status as defined by the GAD-7 and DASS-21. Additionally, the internal consistency and test-retest reliability of the ReQoL-TC were satisfactory for both the positively and negatively worded factors, and no ceiling or floor effect was detected. Further, a significant and strong correlation between the ReQoL-TC-20 and the ReQoL-TC-10 was identified. In general, the results of this study confirmed that the two-factor ReQoL-TC is a valid and reliable instrument, with good acceptability and feasibility, for assessing recovery-focused quality of life in the HK general population, and that it can be used in research settings. Although no ceiling or floor effects were detected on the sum score of the ReQoL-TC, the score distributions of some negatively worded items were severely skewed. Approximately 92 and 89% of participants chose the option "Never" for the items "I felt like a failure" and "I thought my life was not worth living," respectively. This is in line with the findings from the original United Kingdom study, where 33 and 51% of patients indicated never having any concerns on those two aspects. 
Additionally, the items "I felt in control of my life" and "I felt calm" showed a measure of floor effect, with 31 and 25% of participants indicating they could control their life or feel calm, respectively, which was not reported in the original study. These findings should be interpreted with caution because our survey was completed during the COVID-19 pandemic. Given that several studies have confirmed that the pandemic has negatively affected people's mental health and well-being (Burgess, 2020; Galea et al., 2020), a "COVID bias" might have affected people's responses in our study. Post-pandemic assessments are needed in the future. The ReQoL-TC has a short version and a longer version to serve the different settings in which recovery-focused quality of life is measured. Although the internal consistency and test-retest reliability of both versions were satisfactory in this study, the Cronbach's alpha was lower than that reported by studies in the United Kingdom (ReQoL-20: 0.93 and ReQoL-10: 0.87) and the Netherlands (ReQoL-20: 0.94 and ReQoL-10: 0.9) (Keetharuth et al., 2018b; van Aken et al., 2020). We found that the mean sum score of the ReQoL-TC-10 was lower than half of the ReQoL-TC-20 sum score (scale 0-40). This difference might arise because participants tend to score higher on items with negative wording, and the ReQoL-TC-10 contains only four negatively worded items out of 11. However, previous studies indicated that a mixture of positive and negative items is a crucial element of such a measure, because people with mental health difficulties identified issues that both enhanced and depleted their quality of life (Crawford et al., 2011; Keetharuth et al., 2018b). The psychometric performance of the two versions of the ReQoL-TC should be assessed separately in the future. In our sample, compared with the ReQoL-TC-10, the ReQoL-TC-20 showed a closer relationship with the mental health-related measures, i.e., the GAD-7 and DASS-21. 
This is possibly because more people were experiencing mental health difficulties during the pandemic. This finding should be interpreted with caution because only 500 members of the general population were included in this study. Overall, however, we could expect the ReQoL-TC-10, by virtue of its brevity, to be more practical for measuring recovery-focused quality of life in the general population. For individuals with mental illness, the ReQoL-TC-20 could be more appropriate because of its comprehensiveness. Future studies with both the general population and individuals with mental health problems are needed to investigate the generalizability of the findings of this study. Further, in line with the findings of van Aken et al. (2020), both the ReQoL-20 and the ReQoL-10 showed a significant but not strong correlation with the EQ-5D utility score. This is not surprising: as Papaioannou et al. (2013) indicated, for patients with personality disorders the EQ-5D is suitable for assessing HRQoL but lacks the content validity to fully reflect the impact of the condition. Brazier also indicated that the EQ-5D appears to perform acceptably well in depression and personality disorder, but less well in anxiety, schizophrenia, and bipolar disorder. Furthermore, the strong correlation of the ReQoL-TC sum score with the ICECAP-A items of stability, attachment, and enjoyment supports the view that the ReQoL-TC can capture people's recovery-focused quality of life, as it is intended to (Leamy et al., 2011), instead of assessing the effect of symptom reduction alone.
The associations of the positively and negatively worded factors of the ReQoL-TC with the other HRQoL and mental health measures showed a mixed picture. The negatively worded items correlated strongly with the measures of individuals' mental health status, whereas the positively worded items correlated strongly with the quality of life and well-being measures. Previous studies have indicated the potential impact of using negative or positive wording in assessing individuals' psychological attributes. For example, a study that examined the Rosenberg Self-Esteem scale demonstrated the existence of method effects associated with negatively and/or positively worded items (Schönberger and Ponsford, 2009). Wouters et al. (2012) also indicated the importance of evaluating wording effects when examining the factor structure of the HADS in vulnerable patient groups. However, few of these studies directly demonstrated a significant association between positively worded items and respondents' well-being. Further research is needed to investigate the effect of the negatively and positively worded items of the ReQoL-TC on health in other Chinese populations, and to identify which sociodemographic and personality characteristics are associated with such a response style.
Regarding the structure of the ReQoL-TC, the original study by Keetharuth et al. (2018b) reported a bi-factor model: a global factor and separate factors for the positively and negatively worded items. This finding was not fully supported in our study (low factor loadings for the global factor and a low explained common variance), despite the satisfactory goodness-of-fit indices. However, both studies confirmed the presence of two factors of the ReQoL (positively worded and negatively worded items). Moreover, our model fit was only average, which might be the result of survey bias arising from the interviewer-administered questionnaire, the survey population, and the influence of the COVID-19 pandemic. Considering the difference in the populations of the two studies (patients in the United Kingdom vs. general population in HK) and that the goodness-of-fit of both models was satisfactory, we do not suggest rejecting the two-factor model. However, further assessments are needed to explore the structure of the ReQoL-TC in both the HK general population and individuals experiencing mental illness. The implication of the two factors is that a sum score may not be appropriate for the ReQoL-TC in this population, and the scores of positive and negative affect will have to be kept separate. Moreover, several items (four negatively and one positively worded) showed significant DIF on age or sex, although the effect size of DIF was negligible.
Attention should be paid to the item "I felt like a failure" (item 14), which showed several psychometric problems, such as both uniform and non-uniform DIF, a low factor loading (0.29), and strong skewness (−5.32). There are two possible reasons for this. First, "failure" is a very negative word in Chinese culture, and people are usually unwilling to use that word to describe themselves or others (Bedford and Hwang, 2003); thus, in this study, more than 90% of respondents selected "none of the time". Second, the ReQoL was developed based on inputs from individuals with mental health problems, whereas all respondents in this study were members of the general population; thus, negatively worded items such as the "failure" item were problematic to some extent. However, considering that this was the first study to assess the validity and reliability of the ReQoL-TC and that only 500 members of the general population were surveyed, we retained all the items and do not recommend dropping any at this stage.
Several limitations should be addressed. First, all participants in this study were recruited through a telephone-based survey. This might have led to several biases pertaining to data collection and quality, e.g., participants may not have understood the questions because they could not read them (interview bias). Hence, other modes, such as face-to-face or online surveys, should be used in future studies. Second, our sample is not representative of the HK general population, as few participants were young or had a high income, which might affect the validity and reliability of the ReQoL-TC. Third, while we used the guidelines provided by the developers to adapt the ReQoL for the HK population, we are aware that there are newer guidelines on cultural adaptation (Hernandez et al., 2020). Moreover, although the original ReQoL has been translated and adapted for the HK Chinese population, the adapted version may not be directly usable for comparisons between the two cultures because measurement equivalence between the original and the adapted forms was not tested. Further investigations are needed to independently assess the psychometric properties of the ReQoL-TC-10 in another sample of the local population.

CONCLUSION
This study confirmed that the ReQoL-TC has sound psychometric properties in a sample of the HK general population. It demonstrates good face and content validity, satisfactory convergent and discriminatory validity, and adequate internal consistency and test-retest reliability. A bi-factor structure of the ReQoL-TC with one positive wording factor, one negative wording factor, and a global factor was confirmed. This study also investigated recovery-focused HRQoL using the newly developed ReQoL-TC in the HK general population during the COVID-19 pandemic. Although we showed that the ReQoL-TC is suitable for use in the HK general population, future validation work should be carried out to investigate the performance of the ReQoL-TC in individuals with mental health problems. We also intend to develop a set of preference weights for a preference-based ReQoL-TC, so that quality-adjusted life years can be calculated to support economic evaluation of interventions to improve people's mental HRQoL.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the corresponding author on request, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the institutional review board of The Chinese University of Hong Kong (Ref. ID: SBRE-18-671). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
RX: study concept and design, data analysis and interpretation, software, writing-original draft, and writing-review and editing. AK: study concept and design, data analysis and interpretation, and writing-review and editing. L-LW: software, visualization, and writing-review and editing. AW-LC: provision of study materials and patients, collection and assembly of data, and writing-review and editing. EL-YW: study concept and design, provision of study materials and patients, collection and assembly of data, supervision, and writing-review and editing. All authors contributed to the article and approved the submitted version.