Reliability, Validity, and Factor Structure of Pittsburgh Sleep Quality Index in Community-Based Centenarians

Background The Pittsburgh Sleep Quality Index (PSQI) is a widely used self-report questionnaire that measures general sleep quality in general populations. However, its psychometric properties have yet to be thoroughly examined in longevous persons. Objectives This study aimed to explore the reliability, validity and factor structure of the Chinese-language version of the PSQI in community-dwelling centenarians. Methods A total of 958 centenarians (mean age = 102.8 years; 81.8% females) recruited from 18 regions in Hainan, China, completed the PSQI scale. Cronbach’s alpha coefficient was used to measure the internal consistency. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were performed to explore the validity and factor structure of the PSQI in this sample. Correlations between the global PSQI score and physical function, depression symptoms, self-reported health status and subjective well-being were used to assess divergent validity. Results The Cronbach’s α coefficient of the PSQI was 0.68, and it increased to 0.78 after two components (medication use and daytime dysfunction) were removed. The Spearman correlation coefficients of the PSQI score with each component were statistically significant (P<0.01). EFA yielded a two-factor structure model of the original PSQI-7 and a one-factor structure model of the simplified PSQI-5. The one-factor model with five components (χ2/df =1.59, CFI=0.99, RMSEA=0.03) fit the data well and had good configural invariance across demographic characteristics (P>0.05). Conclusions The original PSQI showed acceptable applicability in Chinese community-dwelling centenarians, and its psychometric characteristics moderately improved after sleeping medication and daytime dysfunction were removed. Further validation studies on PSQI are needed among centenarians from varied backgrounds.


INTRODUCTION
Sleep problems, including insufficient sleep, poor quality sleep, and sleep disorders such as insomnia and sleep apnea, are highly prevalent among older populations (1). Poor sleep quality increases the risk of falls (2,3), obesity (4), cognitive dysfunction (5), cardiovascular diseases (6,7), as well as neurologic or psychiatric conditions (8,9). Thus, sleep quality has become an important index for evaluating the health and life status of older adults, especially for the vulnerable oldest-old populations (10). As oldest-old adults, including centenarians, constitute the fastest growing segment of the world population (11), their sleep quality warrants more attention. Although some studies indicated that centenarians represent a prototype of successful aging (12), the prevalence of poor sleep quality is relatively high in this population (13,14). In addition, both the symptoms and influencing factors of sleep problems in long-lived populations follow different patterns from those seen in younger generations (15,16). Thus, it is imperative to verify tools for measuring sleep quality among this exceptionally aged population.
The Pittsburgh Sleep Quality Index (PSQI) (17), a widely used self-reported questionnaire, is considered to be a generic instrument to measure sleep quality in diverse populations (18,19) and has been employed in several centenarian studies (20,21). The PSQI covers a broad range of indicators relevant to sleep quality, and its components (namely, subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleeping medication, and daytime dysfunction) provide references for clinical decisions. A systematic review including 37 studies confirmed that the PSQI had a fair or good Cronbach's alpha coefficient (ranging from 0.64 to 0.83) (17,22). Previous studies of community-dwelling older adults demonstrated single-factor (23), two-factor (24) and three-factor models (25) with different methodologies. Published findings also showed adequate internal consistency (19) and construct validity but varying factor structures among the English, Chinese, Korean, Portuguese, Nigerian and Greek versions of the PSQI (26-30). A recent systematic review including 45 articles demonstrated the heterogeneous dimensionality of the PSQI in both clinical and community-dwelling populations (31). A cross-sectional study including 17 comparative factor structures explored the fitness of different PSQI factor scoring models (32). Moreover, some studies also reported conflicting results in older adults (33-35).
Although previous studies have explored various properties of the PSQI across different clinical and nonclinical groups (19), its applicability in long-lived populations is not clear. The objective of this study was to examine the reliability, validity and factor structure of the PSQI in a group of community-dwelling Chinese centenarians.

Participants
The sample of this study was obtained from the baseline survey of the China Hainan Centenarian Cohort Study (CHCCS), which was conducted in Hainan, China, from June 2014 to December 2016. Hainan province has one of the highest life expectancies and proportions of centenarians in China and was authorized by the International Expert Committee on Population Aging and Longevity as a "World Longevity Island" (36). The CHCCS was designed as a whole sample study; it aimed to evaluate centenarians' physical and mental health status and identify modifiable factors related to aging and longevity.
Prior to the investigation, a three-step age verification method was adopted to ensure the age authenticity of the enrolled participants; 58 participants failed the age validation (Figure 1). A total of 1002 centenarians were eventually included and surveyed in the CHCCS. Details of the sampling strategy and inclusion and exclusion criteria have been reported elsewhere (37). After further excluding 44 participants who failed to complete the questionnaires due to critical diseases, 958 participants (mean age = 102.8 ± 2.7 years; 81.8% females) from all 18 regions throughout Hainan Province were included. The geographical distribution of participants is shown in Figure 2 with red dots. The participants' primary caregivers assisted in describing the details of the investigation. The ethics committee of the Hainan branch of the Chinese People's Liberation Army General Hospital (Sanya, Hainan) approved the study protocol (serial no. 301hn11201601).

The PSQI
The PSQI is an 18-item constructed questionnaire designed to assess overall sleep quality over a 1-month period. The 18 items are divided into 7 derived component scores: (1) sleep quality; (2) sleep latency; (3) sleep duration; (4) sleep efficiency; (5) sleep disturbance; (6) medication use; and (7) daytime dysfunction. These items are rated in terms of the frequency or severity of the problem on a four-point Likert scale (e.g., 0 = Not during the past month, 1 = Less than once a week, 2 = Once or twice a week, 3 = Three or more times a week). The sum of the component scores yields a global PSQI score that ranges from 0 to 21, with higher scores representing lower sleep quality.
Since there is no consensus on the PSQI threshold for longevous persons, we used a global PSQI score of 7/8 as the cutoff for sleep disorder rather than the original 4/5 cutoff recommended by the tool developers. A cutoff score of >7 points has been reported to be more appropriate for identifying poor sleep in clinical practice (38-40). If any item under a component had a missing value, the component was defined as missing; 42 participants (4.4%) had at least one missing value among the seven components. We performed multiple imputation to address the missing values (41).
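The scoring rule described above can be illustrated with a minimal Python sketch. It assumes the seven component scores have already been derived from the raw items; the names `global_psqi` and `is_poor_sleeper` are hypothetical and are not part of the PSQI or of the study's software.

```python
from typing import Sequence

# Component order follows the list given in the text (components 1-7).
COMPONENTS = (
    "sleep quality", "sleep latency", "sleep duration", "sleep efficiency",
    "sleep disturbance", "medication use", "daytime dysfunction",
)


def global_psqi(component_scores: Sequence[int]) -> int:
    """Sum seven component scores (each 0-3) into a global score (0-21)."""
    if len(component_scores) != len(COMPONENTS):
        raise ValueError("expected seven component scores")
    if any(score not in (0, 1, 2, 3) for score in component_scores):
        raise ValueError("each component score must be 0, 1, 2 or 3")
    return sum(component_scores)


def is_poor_sleeper(global_score: int, cutoff: int = 7) -> bool:
    """Apply the 7/8 cutoff: a global score above 7 is classified as poor sleep."""
    return global_score > cutoff


# Hypothetical respondent, for illustration only.
example_score = global_psqi([1, 2, 1, 2, 1, 0, 1])
example_flag = is_poor_sleeper(example_score)
```

Passing `cutoff=4` would reproduce the original 4/5 threshold recommended by the tool developers.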

The 15-Item Geriatric Depression Scale
Depression symptoms were measured using the Chinese version of the short-form Geriatric Depression Scale (GDS-15), which consists of 15 dichotomous items assessing the presence of depression symptoms during the last week (42). The responses to the 15 items were summed, with higher scores indicating more depressive symptoms (possible range 0-15; observed range 0-15). The GDS-15 has been shown to be useful for assessing depressive symptoms among very old people (43,44).

The Barthel Index for Activities of Daily Living
The Barthel Index consists of 10 items that measure a person's physical activities of daily living (ADL) and is commonly used as a proxy for physical function (45). Daily activities measured by the Barthel Index include grooming, feeding, dressing, bathing, toilet use, transferring from bed to chair, walking, stair climbing, bowel continence, and urinary continence. Each item of the ADL is rated on a scale with a given number of points assigned to each level of activity, and the total score ranges from 0 to 100 points in five-point increments. A higher score indicates a higher level of physical function. Items of the Barthel Index are regarded as dependent if they are performed with any help from other people. In our previous study, the prevalence of functional dependence in ADL among centenarians was 71.2% (46).

The Satisfaction With Life Scale
The Satisfaction With Life Scale (SWLS) is a short five-item instrument and was developed as a way to assess an individual's cognitive judgment of their satisfaction with their life as a whole.
Participants completing the questionnaire are asked to judge how they feel about each of the statements using a seven-point Likert scoring system, with 1 being "strongly disagree" with the statement and 7 being "strongly agree" with the statement. Scale scores range from 5 to 35, with higher scores indicating greater life satisfaction (47).

Visual Analog Scales
Visual analog scales (VAS) are psychometric response scales used to measure subjective characteristics or attitudes and have been used for a multitude of disorders. The VAS score records an individual's self-rated health on a 20-cm, vertical visual analog scale ranging from 0 to 100, with anchors at both ends labeled "the worst health you can imagine" (at 0) and "the best health you can imagine" (at 100). In our previous study, the mean VAS score was 66.9 among oldest-old populations, and respondents who had good sleep quality reported VAS scores that were 7.09 points higher on average than those of respondents who had poor sleep quality (48).

Statistical Analysis
The 958 participants were randomly assigned to two subsamples (a training set and a validation set). Normally distributed continuous variables were described as the mean ± SD; skewed continuous variables were described as the median with upper and lower quartiles; categorical variables were described as numbers with percentages. For statistical analysis, continuous variables were compared using Student's t test (normal distribution) or the Wilcoxon rank-sum test (skewed distribution); categorical variables were compared using the Chi-squared test. As several variables were ordinal, we used the Spearman correlation coefficient to measure the correlations between the global PSQI score and each of its components, the GDS-15, the ADL, the SWLS, and the VAS. Furthermore, the Cronbach's alpha coefficient (49) was used to measure the internal consistency of the seven components with the global score of the original PSQI and of the revised scales (after certain items were deleted). A Cronbach's α coefficient greater than 0.7 indicates good internal consistency (50).
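As an illustration of the internal-consistency step, the following Python sketch computes Cronbach's alpha from the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of the total score), and recomputes it after dropping selected components. The function names and the toy data are hypothetical; this is not the SPSS procedure used in the study.

```python
from statistics import variance
from typing import Collection, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; `rows` holds one list of component scores per respondent."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two components and two respondents")
    # Sample variance of each component across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed (global) score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_without(rows: Sequence[Sequence[float]], dropped: Collection[int]) -> float:
    """Alpha recomputed after removing the components at the given column indexes."""
    kept = [[v for j, v in enumerate(row) if j not in dropped] for row in rows]
    return cronbach_alpha(kept)


# Toy data (not study data): the third column is inconsistent with the first two.
toy = [[0, 0, 3], [1, 1, 0], [2, 2, 2], [3, 3, 1]]
alpha_all = cronbach_alpha(toy)
alpha_reduced = alpha_without(toy, {2})
```

In the toy data, removing the inconsistent column raises alpha, which mirrors the logic of comparing the original scale with a revised scale after certain components are deleted.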
Parallel analysis (PA) (51) with 500 random data matrices was performed to determine the number of factors. The principal component analysis (PCA) and maximum likelihood (ML) extraction methods are the most widely used in studies conducting an EFA of the PSQI components (26,31,52,53). We used both extraction methods to test the stability of loadings, as results may differ with variation in methodology. Given that some components may cross-load on different factors (especially component 1, sleep quality), an oblique rotation, which allows for correlations between factors, was performed to redistribute factor loadings. The criteria for factor extraction included: (1) eigenvalues >1, (2) loadings of each item ≥0.4, and (3) in PA, retention of factors whose eigenvalues from the real data were greater than the corresponding eigenvalues from the random data (either the average or the 95th percentile). Factor loadings were evaluated against the following criteria: ≥0.71 (excellent), 0.63-0.70 (very good), 0.55-0.62 (good), 0.45-0.54 (fair), 0.32-0.44 (poor), and <0.32 (unacceptable and deleted from factor) (54). Given that the multivariate normality assumption was not met in our dataset and the PSQI scores were ordinal, a confirmatory factor analysis using the weighted least squares (WLS) method was conducted in the validation set using Mplus Version 7.4. Both absolute fit indexes (χ2/df; root mean square error of approximation, RMSEA) and incremental fit indexes (comparative fit index, CFI; normed fit index, NFI; incremental fit index, IFI; Tucker-Lewis index, TLI; relative fit index, RFI) were used. A nonsignificant χ2 statistic indicates that the fit of the restricted model is similar to that of the unrestricted model. However, the χ2 statistic is sensitive to small deviations in model fit with large samples (55). For well-specified models, a χ2/df of 5 or less and an RMSEA of 0.1 or less reflect an acceptable fit. For the CFI, NFI, IFI, TLI and RFI, values greater than 0.90 are typically considered acceptable. We also conducted subgroup analyses for the alternative model across demographic characteristics (gender, education, residence, etc.) to examine the stability of the model. Statistical significance was accepted at the two-sided 0.05 level and confidence intervals were computed at the 95% level. Statistical analyses were performed with the Statistical Package for the Social Sciences (SPSS) version 22.0 and Mplus Version 7.4.
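The loading criteria listed above can be expressed as a small lookup, sketched here in Python. Two assumptions are made that the text does not state: each band is treated as starting at its lower bound (so the small gaps between bands, such as 0.70 to 0.71, fall into the lower band), and the absolute value of the loading is used. The function name `classify_loading` is hypothetical.

```python
# (lower bound, label) pairs taken from the criteria in the text, highest first.
LOADING_BANDS = (
    (0.71, "excellent"),
    (0.63, "very good"),
    (0.55, "good"),
    (0.45, "fair"),
    (0.32, "poor"),
)


def classify_loading(loading: float) -> str:
    """Return the quality label for a factor loading (sign ignored, by assumption)."""
    magnitude = abs(loading)
    for lower_bound, label in LOADING_BANDS:
        if magnitude >= lower_bound:
            return label
    return "unacceptable"  # below 0.32: deleted from the factor


# Hypothetical loadings, for illustration only.
labels = [classify_loading(x) for x in (0.75, 0.65, 0.58, 0.50, 0.40, 0.20)]
```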

RESULTS
A total of 958 centenarians (M = 102.8 years old, SD = 2.7) were investigated with the Chinese version of the PSQI scale. The majority of participants were female (81.8%), Han (88.4%), illiterate (91.1%), rural residents (65.7%), lived with families (89.9%), and widowed or divorced (89.9%). The mean ± SD of the global PSQI scores was 8.44 ± 3.09, and the overall prevalence rate of poor sleep quality was 58.25% (cutoff=7/8). The characteristics of the total sample as well as the training/validation set are described in Table 1. After randomization, no significant differences were found across demographic characteristics between the two subsamples.
As Figure 3 shows, the global PSQI score was significantly correlated with each component (P<0.01). Among the seven components, the correlations between the global PSQI score and the medication use (r=0.136) and daytime dysfunction (r=0.315) components were relatively low, and these components were not significantly correlated with some other components. The participants' responses to each component are shown in Table 2. The Cronbach's alpha coefficient of the global PSQI was 0.68, and it increased to 0.78 after two components (medication use and daytime dysfunction) were deleted.
The Kaiser-Meyer-Olkin measure (KMO=0.78) and Bartlett's test of sphericity (χ2 = 328.98, df=21, P<0.001) supported the suitability of the data for factor analysis. An exploratory factor analysis with oblique rotation was performed in the training set. As Figure 4 shows, parallel analysis confirmed two factors for components 1-7 and one factor for components 1-5. According to the PCA, two factors (factor 1 with components 1-5; factor 2 with components 6 and 7) were extracted with a total variance contribution rate of 46.9%. When components 6 and 7 were deleted, one factor (components 1-5) was extracted, which accounted for 43.6% of the variance. All five factor loadings were acceptable according to the given criteria. Each component's loading with the PCA and ML methods is shown in Table 3. The factor loadings obtained with the two extraction methods were largely consistent.
Confirmatory factor analyses were conducted in the validation set to compare the suitability of seven competing models. Model A was the original unidimensional structure with seven components, model B was a two-factor structure and model C was a one-factor model with five components. We also examined four commonly used PSQI models from previous studies of older people (25,28,34,52,53). Details of the seven competing models are presented in Table 4. Some models were modified by adding theoretically justified residual correlations based on the modification index (MI ≥4). To systematically compare the performance of all models, we used the following methods: (1) model fit indices; (2) factor loadings in the CFA model; (3) correlation with the original global PSQI score; and (4) Cronbach's α coefficient. The one-factor model C (Figure 5) with five components (χ2/df =1.59, P=0.157, CFI=0.99, RMSEA=0.03) fit the data better than the other models, and no MI suggested further model modification. To further explore the stability of model C, subgroup analyses (Table 5) were performed in the validation set. No significant difference was found across demographic characteristics based on Δχ2 (P>0.05). Moreover, this revised PSQI-5 model fit both the total sample (χ2/df =2.66, P=0.080, CFI=0.97, RMSEA=0.04) and the training set (χ2/df =2.98, P=0.010, CFI=0.96, RMSEA=0.06) in a satisfactory way.
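As a worked check, the fit values reported for model C can be compared against the acceptability thresholds stated in the statistical analysis section (χ2/df of 5 or less, RMSEA of 0.1 or less, CFI greater than 0.90). The Python sketch below only re-applies those thresholds to the reported numbers; the class and function names are hypothetical, and the other indexes (NFI, IFI, TLI, RFI) are omitted because their values are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitIndices:
    chi2_df: float  # chi-square divided by degrees of freedom
    cfi: float      # comparative fit index
    rmsea: float    # root mean square error of approximation


def acceptable_fit(fit: FitIndices,
                   max_chi2_df: float = 5.0,
                   max_rmsea: float = 0.1,
                   min_cfi: float = 0.90) -> bool:
    """True when all three indexes meet the thresholds stated in the methods."""
    return (fit.chi2_df <= max_chi2_df
            and fit.rmsea <= max_rmsea
            and fit.cfi > min_cfi)


# Values for the one-factor PSQI-5 model C as reported in the text.
MODEL_C = {
    "validation set": FitIndices(chi2_df=1.59, cfi=0.99, rmsea=0.03),
    "total sample": FitIndices(chi2_df=2.66, cfi=0.97, rmsea=0.04),
    "training set": FitIndices(chi2_df=2.98, cfi=0.96, rmsea=0.06),
}

model_c_checks = {name: acceptable_fit(fit) for name, fit in MODEL_C.items()}
```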

DISCUSSION
To the best of our knowledge, this is the first study to date to examine the psychometric properties of the Chinese language version of the PSQI in a community sample of centenarians with a large sample size. The core finding was that the original PSQI-7 had adequate internal consistency and factor stability, while the revised PSQI-5 had better internal consistency and factor stability as an instrument to screen centenarians' sleep quality.
The PSQI showed a fair Cronbach's α coefficient and adequate variance contribution rate compared with previous aged population-based studies from Portugal (25), the United States (53) and China (34). However, two components (medication use and daytime dysfunction) strongly influenced the reliability and construct validity of the scale. The elevated Cronbach's α coefficient (from 0.68 to 0.78) after deleting the two components suggested that their contributions were limited. In addition, as Figure 3 shows, the two deleted components had poor item-total correlations with the global PSQI score (r<0.4) as well as with the other retained components (r<0.2). Their factor loadings were unacceptable in both the one-factor and two-factor CFA models (<0.4). We obtained a one-factor model with good loadings from PCA on the remaining five components. In the fitness comparison of the seven competing models shown in Table 5, neither the original unidimensional model A nor the two-factor model B (Figure 6) showed acceptable goodness of fit for the validation set. Since the EFA is a data-driven approach that may lead to spurious deviations from well-known factor structures, we added four commonly used PSQI models for comparison, but none of them fit the data well. The revised one-factor PSQI-5 model C fit the data well with a non-significant χ2/df (P=0.157), and performed well across demographic characteristics. Therefore, the one-factor PSQI-5 can be recommended as the best model for community-dwelling Chinese centenarians. One PSQI validation study likewise indicated the factorial validity of a unidimensional scale with the same five components among women aged ≥70 in America (56), while another study showed the validity of a three-component unidimensional PSQI scale (sleep quality, sleep latency and sleep disturbances) in a sample of community-dwelling older Malaysians (57). These similar previous results support the feasibility of simplifying the PSQI components in clinical practice.
Discordance with the medication use component was common in previous studies with either clinical or nonclinical samples. Several published findings showed poor correlations or small factor loadings (22,25,33,34,53,58-60) for this component in the PSQI framework. The medication use component was also removed in a previous study to improve the model fitness in a Chinese clinical setting (34). Another study of community-dwelling older adults from Portugal removed this component (25). Centenarians in this study were more likely to be illiterate and to lack access to medical services. Their awareness of sleep problems and their use of sleep medications were relatively low: only 1.98% reported regular use of sleep medications. It should be noted that the medication use component was removed on psychometric grounds. However, there are some situations in which information on medication use may justify its inclusion. In other words, the best possible psychometrics are not always the highest consideration. Also, the use and availability of medication may differ substantially among populations and regions. Therefore, our findings may not generalize to samples of centenarians from other regions.
Although 35.07% of the centenarians showed serious daytime dysfunction, this component was not consistent with the others. Similar results were found in older men (61) and women (56) with osteoporotic fractures and in a study on black and white octogenarians (56). Centenarians in our study tended to have a hunched back (kyphosis), often accompanied by chronic pain such as arthritis; these symptoms are associated with poor sleep quality but not with daytime dysfunction, and a sitting position may sometimes relieve pain experienced when lying down. The centenarians' sleep quality was not significantly correlated with physical function (Barthel index of ADL). Although daytime dysfunction often accompanies inadequate nighttime sleep (62), the association was not common in older people with physical function impairments.
The dimensionality of the PSQI is highly controversial. Several studies supported the original unidimensional measure (34,57,63,64), while others supported a multidimensional index (two or three factors) (25,53,65,66). Diverse sample characteristics and nonuniform methodologies (e.g., factor rotation and extraction methods; estimation method selection) may account for the conflicting results. Groups observed in previous studies have highly heterogeneous demographic characteristics and health status. In addition, the results of single confirmatory and exploratory factor analyses may be inconsistent, as shown in several studies (22,25,67). Thus, it is inappropriate to refer to the results of other studies directly. Using a cross-validation approach, we confirmed the one-factor PSQI-5 model as the optimal structure in our sample (22,53).
Significant correlations between the global PSQI score and multiple self-reported health outcomes provided evidence that the Chinese PSQI had appropriate divergent validity in centenarians. This finding is consistent with correlations obtained in samples of the elderly (68,69). Considering the potential loss of information by deleting two components, we examined the screening consistency of the two PSQI scales. When the sleep quality cutoff was 7/8 for the original PSQI-7, we observed a satisfactory Kappa coefficient (0.801, 95% CI: 0.736-0.863) with the revised PSQI-5 (cutoff=6/7). This indicated a high consistency of screening ability between the two scales; however, their screening abilities need to be tested against a standard clinical diagnosis. The PSQI-5 has appropriate reliability and validity compared to the original PSQI, and it may be useful in clinical practice because it may slightly reduce the time needed for calculating scores. In addition, our results supported the unidimensional formulation of the PSQI suggested by Buysse et al. (17), which makes it feasible to use the global score rather than separate factor scores, as a global score is often sufficient for screening purposes (70).
Some limitations of the current study should be acknowledged. First, the sample was representative of centenarians residing in low- or middle-income regions; extrapolation to centenarians from other settings should be performed with caution. Second, as the current household registration system did not exist in China sixty years ago, the age of the participants based on the Chinese identity card might not be well-validated because of a lack of birth certificates. Nevertheless, strict quality control with a three-step age verification method was applied to limit age inaccuracy, which attenuates age exaggeration. Third, this study was conducted in a community setting and did not include varied diagnostic groups (e.g., sleep apnea and insomnia). Thus, test attributes such as sensitivity, specificity, and known-group validity could not be evaluated. These attributes should be measured to further validate the PSQI in this age group.
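The screening agreement between the PSQI-7 (cutoff 7/8) and the PSQI-5 (cutoff 6/7) discussed above is summarized by a Kappa coefficient. A minimal Python sketch of Cohen's kappa for two binary classifications follows; the function name and the example scores are hypothetical and do not reproduce the study data or its confidence interval.

```python
from typing import Sequence


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two binary ratings of the same respondents."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n  # share classified as poor sleepers by rating a
    p_b = sum(b) / n  # share classified as poor sleepers by rating b
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical global scores for six respondents on each scale.
psqi7_scores = [9, 8, 5, 10, 6, 7]
psqi5_scores = [8, 7, 4, 9, 7, 5]

poor_by_psqi7 = [score > 7 for score in psqi7_scores]  # 7/8 cutoff
poor_by_psqi5 = [score > 6 for score in psqi5_scores]  # 6/7 cutoff
agreement = cohen_kappa(poor_by_psqi7, poor_by_psqi5)
```

Kappa corrects the raw proportion of matching classifications for the agreement expected by chance, which is why it is preferred over simple percent agreement when comparing two screening rules.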
In conclusion, the current findings validate the original PSQI-7 and demonstrate that the revised PSQI-5 had a satisfactory unidimensional factor structure for assessing centenarians' sleep in this sample. Future studies should validate the factor structure and psychometric properties of the PSQI and determine an appropriate cutoff score to identify poor sleep quality in community-based centenarians from varied backgrounds.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of the Hainan branch of the Chinese People's Liberation Army General Hospital (Sanya, Hainan) (serial no. 301hn11201601). The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
CZ, YZ, and YY proposed the concept and design, analyzed and interpreted the data, and wrote the manuscript; CZ, HZ, and YY interpreted the data, drafted and edited the manuscript, supervised the study, and obtained funding; MZ, ZL, CC, and DB drafted and edited the manuscript. All authors contributed to the article and approved the submitted version. YY and YZ are guarantors.