Psychometrics of the Patient Health Questionnaire (PHQ-9) in Uganda: A Systematic Review

Background Depression is screened with many psychological tools, and the Patient Health Questionnaire-9 (PHQ-9) is one of the most commonly used self-administered tools. Uganda is a culturally diverse country with a wide variety of tribes, ethnic groups, languages, and disease conditions, so it is important to know the psychometric properties of the PHQ-9 across different cohorts. However, no prior review has assessed its reliability in this culturally diverse country; this review addresses that knowledge gap. Methods Following the PRISMA guideline, a systematic search was performed in several databases (i.e., PubMed, Africa-Wide Information, AJOL, and PsycINFO, among others), and a total of 51 articles that met the inclusion criteria (e.g., using the PHQ-9) were included in this review. Results The PHQ-9 has been used among individuals above 10 years of age and of both genders, and it has been used most among the HIV patient group (n = 28). The tool is frequently administered by interview and has been translated into several languages (mostly Luganda, n = 31). A cutoff of 10 was commonly used to identify clinical or major depression (n = 23), and the prevalence of depression ranged from 8 to 67%. The tool has been validated for use in two populations: (i) HIV-positive participants and (ii) the general population attending a health facility. The sensitivity and specificity were 92 and 89%, respectively, at a cutoff score of 10, and 67 and 78%, respectively, at a cutoff score of 5. The Cronbach alpha ranged between 0.68 and 0.94. Conclusion The PHQ-9 has been used in several studies in Uganda but validated in only two populations and is commonly used in one language. Thus, validation of the tool in various populations and languages is warranted to improve the tool's acceptance in Uganda.


INTRODUCTION
Over 300 million people worldwide suffer from depression, the single most significant factor contributing to global disability (1). In addition, about nine out of every ten suicides are reported to be due to mental disorders, and depression accounts for almost two-thirds of these cases (2). Given the adverse effects of depression, it is regularly screened for by mental health workers, general practitioners, medical and surgical subspecialists, clinical officers, and nurses (3). Various methods are used to screen for depression, such as the Diagnostic and Statistical Manual of Mental Disorders criteria and psychological tools. The psychological tools include the Patient Health Questionnaire (PHQ), Beck's Depression Inventory (BDI), Hamilton Rating Scale for Depression, Zung Self-Rating Depression Scale (ZSRDS), Montgomery-Asberg Depression Rating Scale, Symptom Checklist-20, Center for Epidemiologic Studies-Depression Scale, Akena Visual Depression Inventory, and Mini-International Neuropsychiatric Interview (MINI) (4)(5)(6)(7). Most of the tools are clinician-administered, but the PHQ, BDI, and ZSRDS are self-administered (4).
The PHQ is a commonly used tool in Uganda, and it has various versions depending on the number of items used for depression assessment, such as the PHQ-9 (9 items), PHQ-8 (8 items), PHQ-4 (4 items), and PHQ-2 (2 items) (8). Items are designed to capture depression symptoms in line with the Diagnostic and Statistical Manual of Mental Disorders criteria, and each item is scored from 0 to 3 (ranging from "not at all" to "nearly every day") according to how often the symptom was experienced within the past 2 weeks. The nine-item tool (PHQ-9) has a total score of 0 to 27, with 1-4 for minimal depression, 5-9 for minor depression, 10-14 for moderate depression, 15-19 for moderately severe depression, and 20-27 for severe depression (4,9,10). Based on a recent systematic review of the PHQ-9, its sensitivity and specificity for major depressive disorder (MDD) range between 37 and 98% and between 42 and 99%, respectively (11). The internal reliability of the PHQ-9 is good, with a Cronbach alpha ranging from 0.67 to 0.89 (11). As with other psychometric tools, the accuracy of the PHQ-9 depends on various factors, such as (i) the administrator: self-report produces many false negatives or positives depending on participants' motives; (ii) the accuracy of the translation into another language: some statements cannot be directly translated into some languages; (iii) the cultural acceptability of the symptoms tested: some depressive symptoms are culturally acceptable (e.g., loss of appetite among adolescent girls, or admitting to feeling sad); and (iv) the physiological or pathological state of the patient at the time of administration: patients with pain and other chronic diseases commonly report sadness, insomnia, and anhedonia (12,13).
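The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than an official scoring implementation: the function names are hypothetical, the bands are copied from the ranges listed in the text, and labeling a total of 0 (which falls outside the listed bands) as "none" is an assumption made here.

```python
from typing import Sequence

# Severity bands as listed in the text (inclusive total-score ranges).
SEVERITY_BANDS = [
    (1, 4, "minimal"),
    (5, 9, "minor"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum nine item scores, each of which must be 0, 1, 2, or 3."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def phq9_severity(total: int) -> str:
    """Map a total score (0-27) to the severity label used in the text."""
    if not 0 <= total <= 27:
        raise ValueError("total score must lie between 0 and 27")
    for low, high, label in SEVERITY_BANDS:
        if low <= total <= high:
            return label
    # Assumption: a total of 0 is not covered by the listed bands.
    return "none"


# Hypothetical respondent answering "several days" (1) to every item.
example_total = phq9_total([1] * 9)
print(example_total, phq9_severity(example_total))
```

A screening cutoff such as 10 would then be applied to the total as a simple comparison (`total >= 10`), independently of the severity label.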
The tool has been used in many cultures and languages, and it has consistently shown good reliability. Uganda is a culturally diverse country with over 54 tribes and five ethnic groups (14). In addition, education levels in the country are too low for the majority of the population to comprehend the tool in its original English form (15). The country also has multiple refugee groups from different neighboring countries (e.g., Sudan, Democratic Republic of Congo, Rwanda, Burundi, Somalia, and Ethiopia), all with different dialects (16)(17)(18)(19), and various disease conditions, such as TB and cancers, whose symptoms (e.g., loss of energy, anhedonia, loss of appetite, and poor sleep) may lead to false positives with the tool (12). Given these issues, the accuracy of depression detection by the PHQ-9 has limited generalizability.
Although the PHQ-9 is widely used in Uganda, a culturally diverse country, to the best of the authors' knowledge no systematic evaluation of its reliability and psychometric properties has been performed. These properties help establish whether a tool is suitable and reliable for use, and they reveal information about its relevance, adequacy, and usefulness. Therefore, a systematic review was undertaken to address this gap, considering all studies that used the PHQ-9 in Uganda; the findings are anticipated to improve the cultural acceptance of the tool in the country.

Search Strategy
The present systematic review adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline (20). A systematic literature search was performed with assistance from the university librarian (Mr. Wilson Adriko) in databases including PubMed (n = 71), AJOL (n = 1), Cochrane Library (n = 4), and Scopus (n = 48), from inception to May 10, 2021 (Supplementary Material 1). Additional searches were carried out in Google Scholar, Africa-Wide Information, PsycINFO, Global Health, Web of Science, CINAHL, and ResearchGate for any articles missed in the primary search. The search strategy used the following keywords: (depression OR depress* OR unipolar* OR major OR "mood disorder") AND (PHQ* OR "Patient Health Questionnaire" OR PHQ-9) AND (Uganda OR Kampala). (Boolean search operators: * matches any word that begins with the characters preceding the operator; double quotes return only results that contain the exact quoted phrase.)

Study Selection Criteria
First, the titles and abstracts of all retrieved articles were screened independently by MMK and SMN. Then, the full text of each article was evaluated to confirm that it met the inclusion criteria. Any disagreement between the two reviewers was settled by the content expert (MAM). Articles were included in this review if they were Ugandan studies using the PHQ-9 and were published in English as a peer-reviewed journal article, a thesis, or a preprint.

Data Eligibility
A total of 124 articles were retrieved from several databases. After removing duplicates, 111 articles remained, and screening their titles and abstracts eliminated a further 48 articles. In addition, the full text of seven articles could not be retrieved; although the corresponding authors were contacted, no responses were received after 1 week. The full texts of the remaining 56 articles and of six articles retrieved by citation searching were assessed for eligibility. Of these, 11 full-text articles were excluded for the following reasons: (i) protocol for a randomized controlled trial (RCT) (n = 3), (ii) did not use the PHQ-9 (n = 4), (iii) conducted in other countries (n = 3), and (iv) was a book (n = 1). In the end, a total of 51 articles met the inclusion criteria of this review and were selected (Figure 1).
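The selection counts reported above can be checked with a short calculation. The sketch below assumes one reading of the paragraph, namely that the 48 eliminated articles are subtracted from the 111 records and that the seven unretrieved full texts are then subtracted from the remainder; the function and key names are hypothetical.

```python
def prisma_flow(records_screened: int,
                excluded_at_screening: int,
                not_retrieved: int,
                from_citation_search: int,
                excluded_at_full_text: int) -> dict:
    """Recompute the stage-by-stage counts of a study selection flow."""
    sought = records_screened - excluded_at_screening
    retrieved = sought - not_retrieved
    assessed = retrieved + from_citation_search
    included = assessed - excluded_at_full_text
    return {
        "sought_for_retrieval": sought,
        "retrieved_from_databases": retrieved,
        "assessed_for_eligibility": assessed,
        "included": included,
    }


# Full-text exclusion reasons and counts as reported in the text.
exclusion_reasons = {
    "RCT protocol": 3,
    "did not use the PHQ-9": 4,
    "other countries": 3,
    "book": 1,
}

flow = prisma_flow(
    records_screened=111,
    excluded_at_screening=48,
    not_retrieved=7,
    from_citation_search=6,
    excluded_at_full_text=sum(exclusion_reasons.values()),
)
print(flow)
```

Under this reading, the counts reproduce the 56 database articles assessed in full text and the 51 articles finally included.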

Bias Evaluation and Quality Assessment
The Joanna Briggs Institute (JBI) checklist was used to evaluate the risk of bias of the included articles (21). The JBI checklist uses four response options ("no," "yes," "unclear," and "not applicable") for the following questions: (1) appropriateness of the sample frame; (2) recruitment procedure; (3) adequacy of the sample size; (4) description of subjects and setting; (5) description of the identified sample; (6) validity of the methods used to screen for depression; (7) reliability of the methods used to screen for depression; (8) adequacy of statistical analyses; and (9) response rate. Articles were assigned one point per "yes," giving a score range of 0-9. Articles with a score of <5 were considered low quality and were to be excluded. No article was excluded after quality assessment (Table 1).
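As an illustration of the scoring rule described above, the following sketch counts one point per "yes" answer and flags scores below 5 as low quality. The function names and the example answers are hypothetical and are not taken from any included article.

```python
from typing import Sequence

VALID_ANSWERS = {"yes", "no", "unclear", "not applicable"}
NUM_QUESTIONS = 9
LOW_QUALITY_THRESHOLD = 5  # scores below this were to be excluded


def jbi_score(answers: Sequence[str]) -> int:
    """Return the number of 'yes' answers across the nine checklist items."""
    if len(answers) != NUM_QUESTIONS:
        raise ValueError("expected one answer per checklist question")
    normalized = [answer.strip().lower() for answer in answers]
    for answer in normalized:
        if answer not in VALID_ANSWERS:
            raise ValueError(f"unexpected answer: {answer!r}")
    return sum(answer == "yes" for answer in normalized)


def is_low_quality(score: int) -> bool:
    """Apply the exclusion rule: a score of less than 5 is low quality."""
    return score < LOW_QUALITY_THRESHOLD


# Hypothetical appraisal of a single article.
example_answers = ["yes", "yes", "unclear", "yes", "yes",
                   "yes", "no", "yes", "not applicable"]
example_score = jbi_score(example_answers)
print(example_score, is_low_quality(example_score))
```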

Data Extraction
A data extraction file was created in Microsoft Excel to record information from the included articles. Two independent reviewers on the team then extracted data using the following criteria: (i) first author name and publication year, (ii) year of data collection, (iii) districts where data were collected, (iv) type of study, (v) study group, (vi) sample size, (vii) gender, (viii) age, (ix) level of education, (x) translated language, (xi) administrator of PHQ-9, (xii) cutoff scores, (xiii) prevalence, (xiv) standard/confirmation tool used, and (xv) psychometric measurements. Any disagreement in the entered data was settled by the team supervisor (SA), who cross-checked and updated the final data extraction file. The final extracted data are presented in Table 1.

Description of the Included Studies
A total of 51 articles (49 peer-reviewed, one preprint, and one thesis) were included in the review. In 24 studies, the study design was both cohort and cross-sectional, but only five randomized clinical trials were found. The included studies were conducted between 2011 and 2021, with sample sizes ranging from 29 to 1,903 participants. Two studies involved participants of only one gender [i.e., female (65) and male (53)]; the lowest number of female participants was 11 (66) and the highest was 1,492 (39), and the percentage of female participants ranged from 21% (66) to 82% (48). Most of the studies reported participants' mean age (n = 37), ranging between 31 and 45 years. One study reported median age (55), and the remaining studies (n = 13) used age categories. The majority of the participants of the included studies had attained education (Table 1). The studies were conducted in various parts of the country, with the majority being conducted in the capital city, Kampala (n = 32), and its surrounding districts. For details of the study distribution in the country, see Figure 2.

Study Population of the Included Studies
The majority of the studies were conducted among HIV-positive participants (n = 28), whereas five were among refugee communities, and others were conducted among patients with other medical conditions (i.e., diabetes, TB, psychosis, and patients with a stoma or stroke). Studies were also done among the general population and community (n = 6). Four studies were done among adolescents and children (age range 10-24).

PHQ-9 Tool Administration
Trained research interviewers administered the PHQ-9 in most studies (n = 42), whereas health workers administered it in five studies, and it was self-administered in only three studies (54,63,67). In addition, the tool was commonly administered in its Luganda translation (n = 31), whereas three studies used Kiswahili, two studies used Kinyarwanda, and one study each used the remaining languages, including Juba Arabic, French, Lusoga, Langi (Luo), and Runyankole/Rukiga.

Cutoff Scores Used in the Included Studies
About 20 studies considered a continuous score for the depression symptoms and reported the corresponding mean and standard deviation, whereas most studies (n = 23) used a cutoff of ≥10 to indicate clinical depression or major depression. Other cutoffs included 1-4 signifying minimal depression (n = 4), 5-9 for minor or mild depression (n = 12), 10-14 for moderate depression (n = 5), 15-19 for moderately severe depression (n = 3), and 20-27 for severe depression (n = 3). Other unique scores included 0-9 for minimal (n = 1) and 5-14 for moderately severe depression (n = 1).

Validation and Psychometric Properties
The PHQ-9 Cronbach alpha was reported in three studies (i.e., 0.68, 0.75, and 0.94) (16,42,65). The tool had mainly been validated with MINI (n = 2) (30, 42), but HSCL was also used (n = 1) (30). The tool was validated for use in two populations (368 HIV patients and 1,407 individuals of the general population), all in the same language, Luganda (30,42). For HIV patients, the sensitivity and specificity were 92 and 89%, respectively, at a cutoff score of 10 (30), and were 67 and 78%, respectively, at a cutoff of 5 for the general population attending a health facility (42).
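For readers less familiar with these measures, the sketch below shows how sensitivity and specificity at a given cutoff, and Cronbach's alpha, are typically computed. It uses the standard textbook definitions; the function names and all example data are hypothetical and do not come from the included studies.

```python
from statistics import variance
from typing import Sequence, Tuple


def diagnostic_accuracy(scores: Sequence[int],
                        reference: Sequence[bool],
                        cutoff: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of 'score >= cutoff' against a
    reference-standard diagnosis (True = condition present)."""
    tp = fp = tn = fn = 0
    for score, has_condition in zip(scores, reference):
        screened_positive = score >= cutoff
        if screened_positive and has_condition:
            tp += 1
        elif screened_positive:
            fp += 1
        elif has_condition:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


def cronbach_alpha(responses: Sequence[Sequence[int]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of item scores."""
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses])
                      for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical total scores and reference diagnoses for eight people.
scores = [3, 12, 8, 15, 10, 4, 9, 20]
reference = [False, True, True, True, False, False, False, True]
print(diagnostic_accuracy(scores, reference, cutoff=10))
print(diagnostic_accuracy(scores, reference, cutoff=5))

# Hypothetical item-level responses (rows = respondents, columns = items).
item_responses = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]
print(cronbach_alpha(item_responses))
```

In this made-up data set, lowering the cutoff from 10 to 5 raises sensitivity and lowers specificity, which is the usual trade-off when choosing a screening threshold.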

DISCUSSION
The present review summarizes the existing evidence for the use, reliability, and validity of the PHQ-9 among studies conducted in different Ugandan cohorts. This review provides an overview of how the PHQ-9 is being used in Uganda and can guide future work on validating depression-measuring tools in this culturally diverse country. The PHQ-9 has been consistently found to have good psychometric properties in many parts of the globe (68). However, this is the first systematic review assessing the psychometric properties of the PHQ-9 based on studies conducted in Uganda. The tool's sensitivity and specificity for MDD were found to be good: 92 and 89%, respectively, at a cutoff score of 10, and 67 and 78%, respectively, at a cutoff of 5. Similar findings for the cutoff score of 10 are reported in many validation studies or systematic reviews outside the country (6,69). The sensitivity was 68.6% [95% confidence interval (CI) 48-83.7%] and 88% (CI 77-94%) (6), while the specificity was 84.5% (CI 74.3-91.1%) (69) and 78% (CI 65-88%) (6). Despite the good psychometric properties, the results are from very few studies, making it difficult to generalize the findings in a country with diverse cultures. However, the tool has been translated into several of the country's languages, with a moderate to excellent reported Cronbach alpha (0.68-0.94). Although a Cronbach's alpha of 0.70 or greater is regarded as acceptable for a self-reported instrument such as the PHQ-9 (70), the studies included in this review were mainly interviewer-administered. Hence, more studies are needed to validate the tool and make it culturally acceptable and understood across the different languages, either by adjusting a few items or by developing culturally applicable tools, such as a tool developed for assessing depression among adolescents living with HIV (13).
The psychometric properties of the PHQ-9 among HIV-positive patients were found to be excellent, with a sensitivity of 92% and 100% and a specificity of 89% and 100% for cutoffs of 10 and 5, respectively (30,39). However, the psychometric properties were poor in the general population at a cutoff of 5 (sensitivity for MDD of 67% and specificity for MDD of 78%) (42), and the tool has not been validated in any other special group. This may be because many studies about depression in the country have been done mainly among HIV-positive people (71). Validation of the PHQ-9 in other languages, clinical or vulnerable groups, and cultures can widen the tool's application and acceptance, because it is simple for patients to understand and accurate in detecting depression (68,72,73). Although the tool has not been validated in many special groups in the country, the PHQ-9 has shown good reliability at a cutoff of 10 in several groups, including (i) patients receiving psychiatric care, with a Cronbach alpha of 0.87, sensitivity of 93%, and specificity of 52% (74); (ii) bariatric surgery candidates, with a sensitivity of 75% and specificity of 76% (75); (iii) university students, with an internal consistency of 0.85, sensitivity of 85%, and specificity of 99% (76); (iv) patients with multiple sclerosis, with a Cronbach alpha of 0.82, specificity of 88%, and sensitivity of 95% (77); (v) the geriatric population, with an internal consistency of 0.89, sensitivity of 95%, and specificity of 67% (78); and (vi) patients with a psychiatric condition, with a Cronbach alpha of 0.88 following translation into Farsi and good correlations with the PHQ-15 (0.64) and BDI-13 (0.70) (72).
The tool has also been validated across several administration approaches, including via telephone (internal consistency of 0.91; good sensitivity of 82% and specificity of 91%) (79) and in a computerized version (Cronbach alpha of 0.88, correlation with the paper version of 0.92) (80), but in this review, the tool was not validated for administration by any of the health workers (3). The diversity of populations and special patient groups in which the tool was used makes its reliability uncertain in multicultural countries, especially in Africa, and thus requires further research.
A number of limitations should be considered when interpreting this review's findings. First, most of the articles included in this review are from the same major study or project. Second, the available data on the reliability of the PHQ-9 pose challenges for any possible meta-analysis. Third, the findings are mainly from the central part of the country; thus, we cannot generalize the acceptability of the PHQ-9. Despite these limitations, this review includes a large sample, with results from both gray literature and peer-reviewed articles, showing how the tool has been used in different populations and cultures, which are notable strengths of the study. The review also included studies from different study groups, which points to the generalizability of the PHQ-9 to a broader population in Uganda.

CONCLUSION
The PHQ-9 has been used in several studies in Uganda but validated in only two populations and is commonly used in one language. Thus, validation of the tool in various populations and languages is warranted to improve the tool's acceptance in Uganda. In addition, with most studies using a cutoff of 10 and above for the PHQ-9, future studies are recommended to adopt this cutoff to have nationally comparable results. Because of the vast use of the PHQ-9 among participants living with HIV, the tool is suggested to be the first choice among this population due to the significant reliability and validation. However, further studies are needed among those population groups with other chronic medical conditions such as stroke, diabetes, hypertension, and TB. In addition, more studies are highly recommended to validate this tool in different languages and in different parts of the country to cater to cultural diversity.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
MK was involved in the conception and initial manuscript drafting. MK and MM designed the study. With the assistance of a librarian, MK identified all eligible articles from all sources and imported them to Endnote 9 to remove duplicates. SN and MK independently selected the remaining articles by title and then by abstract. MM settled any discrepancy between MK and SN about an included article. After a quality check performed by SA, MM and MK did a final review of the articles. All authors were involved in the analysis, interpretation, substantive revisions, and approval of the final version.

ACKNOWLEDGMENTS
We acknowledge the support and guidance provided by Mr. Wilson Adriko in developing the search strings and retrieving the literature for the selected articles.