Health systems strengthening through policy-level integration of community health worker programs into national health systems for improved health outcomes - scorecard metrics validation: A bifactor structural equation model approach

Background: Subsequent to the demonstrated potential of community health workers (CHWs) in strengthening health systems to improve health outcomes, recent literature has defined context and guidelines for integrating CHW programs into mainstream health systems. However, quantitative measures for assessing the extent of CHW program integration into national health systems need to be developed. The purpose of this study was to validate a newly developed scale, Community Health Worker Program Integration Scorecard Metrics (CHWP-ISM), for assessing the degree of integration of CHW programs into national health systems in Sub-Saharan Africa (SSA).

Methods: Data obtained through a pilot study involving a purposively selected sample of 41 participants, drawn from populations involved in CHW program work in selected countries of SSA, formed the basis of a 31-item bifactor model. Data were collected between June and December 2019. By applying a latent variable approach implemented with structural equation modeling, data analysis was mainly done using the R statistical environment, applying factor analysis procedures.

Results: Dimensionality, construct validity, and the CHWP-ISM scale's internal consistency were assessed. Confirmatory factor analysis of the CHW-ISM bifactor model supported a co-occurring CHW integration general factor and six unique domain-specific factors. Both the comparative fit index (CFI) and the Tucker–Lewis Index (TLI) were above 0.9, while the root mean square of the residuals (RMSR) was 0.02. Cronbach's alpha (α), Guttman 6 (Lambda 6), and Omega total (ωt) were above 0.8, indicating good scale reliability.

Conclusion: Statistical significance of the bifactor model suggests that CHW integration has to be examined using factors that reflect a single common underlying integration construct, as well as factors that reflect unique variances for the identified six subject-specific domains. The validated CHWP-ISM could be useful to inform policy advisers, health systems, donors, non-governmental organizations, and other CHW program stakeholders with guidance on how to quantitatively assess the integration status of different components of CHW programs into respective critical functions of the health system. Improved integration could increase CHW program functionality, which could in turn strengthen the healthcare systems to improve health outcomes in the region.


Introduction
Research evidence demonstrating the potential of community health workers (CHWs) in strengthening health systems to improve health outcomes has been growing (1-7), particularly evidence of their efficacy in improving the reach, impact, and efficiency of health services (8). In addition to improving health outcomes in general, evidence has also demonstrated their potential to improve child health outcomes, particularly by reducing childhood morbidity and mortality (1, 9-12). This has been attributed to many reasons, including their placement as a bridge between health facilities and communities (13-16), which accords CHWs an immense and unique advantage of proximity and availability in communities, enabling them to mobilize community members to identify and address their own health needs (13, 17).
However, this evident efficacy of CHWs in delivering community-based preventive and curative services is being curtailed by varying, subjective, or, at worst, absent integration of CHW programs into national health systems (11, 18-20). In an attempt to review practical strategies to reduce the Under 5 Mortality Rate (U5MR), the 2015 Renewed Promise to Child Survival underscored the need to strengthen health systems to deliver high-quality high-impact interventions (HIIs) (21) for child health. Scott et al. highlighted that the integration of CHWs with health systems necessitates their inclusion in public policies that direct the critical functions of the health system: national service delivery, human resources for health, health financing, medical products and technologies, health information, and leadership and governance (22). The call for this health workforce cadre to be integrated into national health systems has been stressed (8, 22-29), in particular in the Sub-Saharan Africa (SSA) region (4, 8).
Recent literature has defined the context, mechanisms, and guidelines for integrating CHW programs into mainstream health systems (1, 19, 25-27, 30-36). However, much of what is currently in use for assessing CHW program integration is based on qualitative measures. Although qualitative metrics add valuable information, given that some integration determinants cannot be captured by quantitative measures, such metrics can be subjective. Qualitative metrics may include human experience or judgment as a factor in measurement, as well as information that can be difficult to measure due to ambiguity. On the other hand, there are advantages to measuring CHW integration quantitatively because the extent can often be clearly expressed as a ratio, percentage, or average that can be compared across two or more program settings. Basing CHW integration metrics on quantitatively validated models increases transparency and consistency. To date, little research has assessed CHW program integration into national health systems quantitatively, or used quantitative integration metrics to do so, at either the policy or the implementation level.
This study validated newly developed scorecard metrics for measuring the extent of CHW programs' integration into national health systems (CHWP-ISM) in Sub-Saharan Africa (38) at the policy level. The paper tests the proposed CHW integration metrics scale's construct validity, dimensionality, and reliability. The items, generated and reported in Mupara et al. (38), included some suggestions proposed by Boateng et al. (37) as best practices for developing and validating scales for health.

Methodology
Theoretical considerations, item selection, and study conceptual framework
The present study was based on the earlier reported study (38), which is part of a wider study undertaken to generate items suitable for assessing the extent of integration of CHW programs into national health systems (HS). That study, by Mupara et al. (38) on integration indicators, was based on the WHO's building blocks for health systems strengthening (HSS) (Figure 1). In this study, CHW program integration at the policy level has been defined as the policy-level guidelines and directives that specify how CHW programs should be mainstreamed into national HS. These policy guidelines were viewed in line with the WHO HS building blocks, namely, service delivery, human resources for health, health information, medical products and technologies, health financing, and leadership and governance (39). Therefore, for the purpose of this study, six matching integration domains whose indicators could be measured to ascertain the extent of CHW program integration, specifically at the policy level, were identified. These were CHW Recruitment, Education, and Certification (REC); CHW Roles and Responsibilities (R&R); CHW Remuneration; CHW Supervision; CHW Information Management; and CHW Supplies and Equipment.

FIGURE 1
Study conceptual framework (item labels as indicated in Table 1). Source: Adapted from the study by Mupara and Mogaka.

Study participants
Purposive sampling was employed to identify and invite a sample of 45 healthcare providers who were involved with

Procedure
This study was guided by the general principles of the Nuremberg Code, the Declaration of Helsinki, and the Institutional Review Board. Between January and April 2019, a study package was delivered to selected eligible participants by hand, courier service, or email, depending on their geographical location (distance). The package included an invitation letter with a study description, a consent form, and a questionnaire. The questionnaires were self-administered. The first section of the questionnaire was designed to elicit demographic information about participants, including their age, gender, educational level, and occupation. Participants were then asked to rate, on a 5-point Likert-type scale ("strongly agree" to "strongly disagree"), how well certain items could be used to measure the extent of integration of CHW programs into national health systems. The items that participants responded to are listed in Table 1. Forty-one participants successfully completed the survey. Completed questionnaires were retrieved from participants by hand delivery and courier services. Data were captured in Excel, cleaned, and processed for analysis using R. There were no missing data.

Statistical considerations
The CHWP-ISM bifactor model was assessed using a latent variable approach implemented with structural equation modeling. The study constructed latent variables (constructs) that were hypothesized to have varying influences on CHW integration. Their relationship with corresponding integration (observable) indicators was then statistically tested using the bifactor modeling method (41, 42) through exploratory and confirmatory factor analyses (EFA/CFA) (43).
The 31 items were grouped to reflect specific domains, as described by Tellegen and Waller (44). The specific domains for this study were CHW REC, CHW R&R, CHW Remuneration, CHW Supervision, CHW Information Management, and CHW Supplies and Equipment, as shown in Table 1. A bifactor latent structure analysis, as explained in the study by Reise et al. (45), was done to assess the scale's unidimensional-multidimensional nature. Bifactor models allow for the assessment of hierarchical models of constructs, examining whether indicators contribute to specific (unique) factors over and above their contribution to a general factor (42). This modeling approach enabled us to derive (via EFA) and test (via CFA) the optimal way to present the CHWP-ISM scale: as a conceptually broad, unidimensional "CHW integration" factor; as a multidimensional scale made of domain-specific factors; or as a blend of the two, reflecting both general and sub-domain factors. The single-factor scale reflected the variance common among all observed measures in the checklist, while the multidimensional checklist reflected additional common variance among item clusters, corresponding to content-specific subdomain constructs. First, an exploratory factor analysis (EFA) model was specified using the "omega" function (46) in the "psych" R package (47), version 1.8.12. The "omega" function incorporates the Schmid-Leiman transformation (41) in estimating bifactor structures. Next, a CFA model was specified using "lavaan" version 0.6-3 (48, 49) in R version 3.6.1 (50), which provided the fitted model to use with the "omegaFromSem" function for CFA. Items were retained in the final model if they had a factor loading >0.20 on either the general factor or any of the specific factors. Goodness of fit was tested using global and local fit indices.
Given that this was a new scale development, cutoff points followed these fit criteria: SRMR ≤0.1, TLI and CFI ≥0.80 for acceptable fit, root mean square error of approximation (RMSEA) ≤0.06, and SRMR ≤0.08 (51). In addition, both modification indices and item content theory were used in model modification. Coefficient alpha (α) and three omega coefficients (ωt, ωh, and ωhs) from the "omegaSem" output were used to examine the internal consistency and reliability of the CHWP-ISM scale.
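The item retention rule described above (keep an item if its loading exceeds 0.20 on the general factor or on any specific factor) can be illustrated with a minimal sketch. The authors ran their analysis in R; this Python fragment is only an illustration, and the function name `retain_items`, the item names, and the loading values are all hypothetical. Comparing absolute values is also an assumption, since the text states only ">0.20".

```python
def retain_items(loadings, cutoff=0.20):
    """Return the names of items whose loading exceeds `cutoff` on any factor.

    `loadings` maps an item name to a list of standardized loadings,
    with the general factor first and the specific factors after it.
    Absolute values are compared (an assumption, see the lead-in).
    """
    return [item for item, row in loadings.items()
            if any(abs(value) > cutoff for value in row)]


# Hypothetical loadings: [general, specific factor 1, specific factor 2]
example_loadings = {
    "x1": [0.45, 0.72, 0.03],
    "x2": [0.10, 0.15, 0.05],  # below the cutoff on every factor
    "x3": [0.05, 0.02, 0.66],  # weak on the general factor, strong on a specific one
}

kept = retain_items(example_loadings)
print(kept)
```

In this toy example, item "x3" is retained even though its general-factor loading is small, because the rule only requires one loading above the cutoff.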

Demographic characteristics of participants
Of the 45 people initially invited, 41 participated (91% response rate). Slightly more participants were female (51%) than male. Most participants (75%) held a diploma or higher as their highest academic qualification, while the 24% who held certificate qualifications were CHWs. All participants had a tertiary education qualification and were proficient in the English language. The rest of the participants' demographic profiles varied greatly and are summarized in Figure 2.

Correlations and descriptive statistics
Descriptive statistics of the data were obtained, including an analysis of outliers, defined as any value more than three standard deviations above or below the mean. Skewness and kurtosis were assessed using the "mardia" function in the "psych" R package (52), version 1.8.12, which applies Mardia's tests for multivariate skewness and kurtosis. Skew values ranged from −1.87 to −0.85, and kurtosis (k) statistics for 30 items ranged from −0.49 to 1.91, with the 31st item at k = 3.61. However, this outlying value did not alter the results, so the item was retained in the dataset. The data were approximately normally distributed, which justified the use of maximum likelihood (ML) estimation in further analysis.
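The three-standard-deviation outlier rule mentioned above can be sketched as follows. This is an illustrative Python fragment, not the authors' R code; the function name `flag_outliers` and the ratings are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption because the text does not say which estimator was used.

```python
from statistics import mean, stdev


def flag_outliers(values, n_sd=3.0):
    """Return the indices of values more than `n_sd` SDs from the mean."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (assumed)
    if spread == 0:
        return []  # no variation, so nothing can be flagged
    return [index for index, value in enumerate(values)
            if abs(value - center) > n_sd * spread]


# Hypothetical 5-point ratings from 41 respondents on one item:
# forty ratings of 5 and a single rating of 1.
ratings = [5] * 40 + [1]
print(flag_outliers(ratings))  # the single rating of 1 (index 40) is flagged
```

With bounded Likert-type responses, a value can only exceed three standard deviations when almost all other responses cluster at the opposite end of the scale, as in this example.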

Variable correlation
We used the R package "corrplot" version 0.84 (53) to visualize the data correlation matrix. The correlogram in Figure 3 displays the variables in the correlation matrix and how they relate to each other. In the upper triangle, positive correlations are displayed in blue and negative correlations in red. Color intensity and the size of the circle are proportional to the correlation coefficients, helping to identify "groups" of variables that share a strong relationship with each other (hierarchical clustering). The lower triangle displays the actual correlation values.

Construct validity
We estimated the CHW-ISM bifactor model by finding the number of latent factors underlying the 31 integration items (observed variables) with an initial EFA. Figure 4 shows the item correlations between observed indicators and the latent variables. Table 2 shows the EFA solution, which indicates seven eigenvalues, five of them >1.0 and two close to 1, implying seven extractable factors (one general factor and six content-specific factors). This is consistent with our prior hypothesis on the CHW-ISM structure: a general (unidimensional) factor and six content-specific unique factors. A further examination of the EFA factor structure in Table 2 shows that, for the most part, the items in the bifactor solution settle into the respective domain-specific parcels corresponding to the six hypothesized domains (Table 1). More generally, however, the items have more loading on the general factor (CHW program integration). This implies a factor structure in which convergent and discriminant validity are evident from the high loadings on the specific factors, with most items loading above 0.70. Loadings on the general factor, however, vary greatly: some items' loadings are <0.2 (not shown because we set a cutoff of ≥0.2), while others are as high as 0.9. This is strong evidence that the main CHW integration construct can be sub-scaled into separate, distinct but correlated elements/subscales.
Column "h2" in Table 2 presents values for item-explained common variance (i.e., percent communality) due to the general factor. These values suggest that most items from all constructs are equally good candidates for inclusion in a unidimensional (one common factor) item set, i.e., they form the core of a unidimensional CHW-ISM scale. This is evidenced by their large communality values (h2).
To examine how well the CHW-ISM bifactor model fits the data, we examined a number of absolute and relative fit indices (51). The root mean square of the residuals (RMSR), which summarizes the differences between the sample covariance matrix and the covariance matrix implied by the hypothesized model, was 0.02 (acceptable fit is below 0.08 (51)). CFI and TLI values above 0.95 are currently considered an indicator of excellent fit (51, 54), and the CHW-ISM scale's CFI and TLI were above this threshold, indicating excellent fit. Average variance extracted (AVE) was 0.7, which is above the recommended threshold of 0.5 for establishing convergent validity. The above factor loading and model fit statistics present evidence for the construct validity of the CHWP-ISM scale.
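Two of the indices reported above, RMSR and AVE, are simple enough to compute by hand. The sketch below uses common textbook definitions in Python: RMSR as the square root of the mean squared off-diagonal residual between the observed and model-implied correlation matrices, and AVE as the mean squared standardized loading of a construct's items. These definitions, the function names `rmsr` and `ave`, and all numbers are assumptions for illustration; software packages differ in details such as whether diagonal elements are included, and this is not the authors' R code.

```python
from math import sqrt


def rmsr(observed, implied):
    """Root mean square of the off-diagonal residuals (lower triangle)."""
    size = len(observed)
    squared = [(observed[i][j] - implied[i][j]) ** 2
               for i in range(size) for j in range(i)]
    return sqrt(sum(squared) / len(squared))


def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(value ** 2 for value in loadings) / len(loadings)


# Hypothetical 3-item example.
observed_r = [[1.00, 0.50, 0.40],
              [0.50, 1.00, 0.30],
              [0.40, 0.30, 1.00]]
implied_r = [[1.00, 0.48, 0.42],
             [0.48, 1.00, 0.30],
             [0.42, 0.30, 1.00]]

example_rmsr = rmsr(observed_r, implied_r)
example_ave = ave([0.8, 0.9, 0.8])
print(round(example_rmsr, 3), round(example_ave, 3))
```

Read against the thresholds quoted in the text, an RMSR below 0.08 and an AVE above 0.5 would both count as acceptable in this toy case.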

CHWP-ISM scale reliability
To test the bifactor model's dimensionality, we examined whether item factor loading of the scale could be explained by one general factor, plus several specific factors corresponding to each of the scale's facets (dimensions).
As seen in Table 2, the standardized coefficient alpha was 0.98, implying that 98% of the observed score variance can be attributed to "true score" variance. However, coefficient alpha is limited in explaining variance when the data are multidimensional, as in the case of a bifactor model (55). The factor structure in Table 2 suggests an essentially congeneric model rather than the tau-equivalent model for which the alpha coefficient is particularly suited (56). Therefore, we also used omega coefficients (41, 45, 46) to indicate the construct reliability of the CHWP-ISM scale. Omega coefficients have been noted to provide a more accurate approximation of a scale's reliability (57).
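To see why alpha tends to be very high for a long scale, the standard formula for standardized alpha, α = k·r̄ / (1 + (k − 1)·r̄), can be evaluated directly, where k is the number of items and r̄ is the mean inter-item correlation. The Python sketch below is illustrative only: the function names are hypothetical, and the mean inter-item correlation it derives is implied by the formula rather than reported in the study.

```python
def standardized_alpha(n_items, mean_r):
    """Standardized Cronbach's alpha from the mean inter-item correlation."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


def mean_r_for_alpha(n_items, alpha):
    """Invert the formula: mean inter-item correlation implied by an alpha."""
    return alpha / (n_items - alpha * (n_items - 1))


# With 31 items, which mean inter-item correlation would yield alpha = 0.98?
implied_mean_r = mean_r_for_alpha(31, 0.98)
print(round(implied_mean_r, 4))

# A much weaker average correlation still gives a high alpha at this length.
print(round(standardized_alpha(31, 0.30), 3))
```

Under this formula, a 31-item scale reaches alpha = 0.98 with a mean inter-item correlation of roughly 0.61, and alpha grows with scale length even when items are only moderately correlated, which is one reason omega coefficients are examined alongside it.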
Omega total (ωt) accounts for the variance due to the general factor, as well as the group factors. Table 2 indicates that ωt = 0.98. Even though ωt is appropriate for varying factor loadings as seen in the bifactor model, its value is influenced by all modeled sources of common variance and includes item-specific variance as error (58). To further clarify sources of variance, we used coefficient omega hierarchical (ωh) and coefficient omega hierarchical subscale/group (ωhs). Coefficient ωh estimates the proportion of variance in total scores that can be attributed to a single general factor. From Table 2, when ωt and ωh are compared, about 43% of all reliable variance in the total scores (0.42/0.98) can be attributed to the general factor, assumed to reflect individual differences in the trait of CHW integration. The remaining 57% ((0.98 − 0.42)/0.98) of the reliable variance in total scores is attributable to the multidimensionality caused by the six group factors. This implies that the CHW-ISM scale is principally (∼40-60) unidimensional-multidimensional, reflecting a co-occurring CHW integration general factor and six unique content-specific factors. This is also supported by the explained common variance (ECV) due to the general factor, which is 0.22, implying that the general factor explains 22% of the common variance extracted, with 78% of the common variance spread across the six unique group factors. On the other hand, the unique variance associated with each of the six subscales, once the variance associated with the general (unidimensional) factor is partitioned out, is indicated by coefficient ωhs. This index reflects the reliability of a subscale score after controlling for the variance due to the general factor (59). In this case, ωhs was reported as REC = 0.89; R&R = 0.73; ReM = 1.21; Sup = 0.68; IT = 0.77; and E&S = 0.38. These values imply that the reliability of the subscales is adequate and justify the use of the CHW-ISM subscales for any future quantitative investigations of CHW integration into health systems.
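The coefficients discussed above can be computed from the standardized loadings of an orthogonal bifactor solution using the commonly cited model-based formulas: the squared sum of general-factor loadings, the squared sums of each group factor's loadings, and the item uniquenesses together make up the total-score variance. The Python sketch below is a hedged illustration under those assumptions; the function name `bifactor_indices` and the toy loadings are hypothetical, and it does not reproduce the authors' R output.

```python
def bifactor_indices(general, groups):
    """Compute omega total, omega hierarchical, and ECV.

    `general` is a list of general-factor loadings, one per item.
    `groups` is a list of lists, one per group factor, each the same
    length as `general`, with 0.0 where an item does not load.
    Assumes standardized loadings and orthogonal factors.
    """
    n_items = len(general)
    communality = [general[i] ** 2 + sum(group[i] ** 2 for group in groups)
                   for i in range(n_items)]
    uniqueness = [1.0 - h2 for h2 in communality]

    general_part = sum(general) ** 2
    group_part = sum(sum(group) ** 2 for group in groups)
    total_variance = general_part + group_part + sum(uniqueness)

    common_general = sum(value ** 2 for value in general)
    common_groups = sum(value ** 2 for group in groups for value in group)

    return {
        "omega_t": (general_part + group_part) / total_variance,
        "omega_h": general_part / total_variance,
        "ecv": common_general / (common_general + common_groups),
    }


# Hypothetical four-item scale with two group factors.
result = bifactor_indices(
    general=[0.5, 0.5, 0.5, 0.5],
    groups=[[0.6, 0.6, 0.0, 0.0],
            [0.0, 0.0, 0.6, 0.6]],
)
general_share = result["omega_h"] / result["omega_t"]
print({name: round(value, 3) for name, value in result.items()})
print(round(general_share, 3))
```

For these toy loadings, ωt is about 0.82, ωh about 0.47, and ECV about 0.41, so the general factor accounts for roughly 58% of the reliable total-score variance (ωh/ωt), which is the same ratio the text uses when comparing ωh with ωt.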

Discussion
With the growing importance of CHW programs on the global health agenda comes the responsibility to create a scientific foundation for CHW integration metrics and evaluation. However, CHW integration into national health systems is a broad concept. This implies that the "CHW integration" construct has diverse content, making it reasonable to question whether it can be adequately measured as a single unidimensional construct or as a multidimensional construct composed of many sub-constructs. In this study, a bifactor model was applied to explore the unidimensional-multidimensional structure of the newly defined CHW integration measurement tool. Study findings provide evidence on the question of whether the data set had a strong enough common factor (unidimensional) or a more complex multidimensional (content-specific subcategory) structure. Furthermore, the tool's validity and reliability as a quantitative measure of CHW integration into healthcare systems were investigated and documented. This provides a strong basis for an objective means of measuring CHW integration into national health systems across healthcare jurisdictions.
Although factor loading was adequate, the bifactor CFA model did not suggest a perfect unidimensional CHW integration construct, as shown in Figure 4. Furthermore, strong content-specific correlations did not point to the main common CHW integration construct. Therefore, the combined CHW-ISM bifactor scale presents a better and more plausible model, which can explain scale reliability at subscale (multidimensional) as well as at full-scale (unidimensional) levels. This is because the bifactor model shows how all items simultaneously measure the common CHW integration trait while accounting for the variance of each item as influenced by domain-specific (subscale) groupings.

FIGURE 4
Exploratory bifactor model for CHW-ISM scale. g, the conceptualized general factor indicative of communalities of all items; REC, "CHW Recruitment, Education and Certification" construct; R&R, "CHW Roles and Responsibilities" construct; ReM, "CHW Remuneration"; SuP, "CHW Supervision"; IT, "CHW Information management"; E&S, "CHW supplies and equipment". Item labels correspond to items as listed in Table 1.

Conclusion
This study aimed to validate the CHW Integration Metrics Scorecard (CHWP-ISM) for assessing the degree of integration of CHW programs into the national health system in SSA. We propose that this validated CHWP-ISM can be used to evaluate the extent of integration of health interventions aimed at strengthening health systems through the WHO HS building blocks. The metrics scorecard can be used to pair component health interventions (integration variables) with the corresponding WHO HS building blocks that they feed into, after which the extent of integration can be determined from the integration variables.
The process-of-integration stage is used to determine whether there are country policies that speak to the inclusion of CHW program integration variables in the respective health system building blocks. Evidence of integration focuses on the specific guidelines developed from those policies, which detail the day-to-day running of CHW programs in a way that shows they are an integral part of the health system. Once these are established to be in place, the presence or absence of integration indicators is used to score all the aspects that make up the specific integration parameter. Thereafter, the aggregated integration indicators can be judged against the scale to determine the extent of integration.
In general, the CHWP-ISM can be used to review the extent of CHW program integration at all levels of the health system, be they local, district, provincial, or national, and in particular to explore the interaction between the respective components of the health intervention under study and its corresponding WHO building block of the health system. It is hoped that the use of this metrics scorecard to assess the extent of integration of CHWs could better strengthen health systems to improve health outcomes in Sub-Saharan Africa. One weakness of the study is the assumption that all participants understand CHW policy issues at the same level. Further studies should be conducted with participants with the same level of understanding of CHW policy.

Table 2 note: The root mean square of the residuals is 0.02. "g", F1, F2, F3, F4, F5, and F6 correspond to constructs "g", REC, R&R, ReM, SuP, IT, and E&S in Figure 4.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
The studies involving human participants were reviewed and approved by Biomedical Research Ethics Committee, University of KwaZulu-Natal. The patients/participants provided their written informed consent to participate in this study.