Analysis of Hierarchical Routine Data With Covariate Missingness: Effects of Audit & Feedback on Clinicians' Prescribed Pediatric Pneumonia Care in Kenyan Hospitals

Background: Routine clinical data are widely used in many countries to monitor quality of care. A limitation of routine data is missing information which occurs due to lack of documentation of care processes by health care providers, poor record keeping, or limited health care technology at facility level. Our objective was to address missing covariates while properly accounting for hierarchical structure in routine pediatric pneumonia care. Methods: We analyzed routine data collected during a cluster randomized trial to investigating the effect of audit and feedback (A&F) over time on inpatient pneumonia care among children admitted in 12 Kenyan hospitals between March and November 2016. Six hospitals in the intervention arm received enhance A&F on classification and treatment of pneumonia cases in addition to a standard A&F report on general inpatient pediatric care. The remaining six in control arm received standard A&F alone. We derived and analyzed a composite outcome known as Pediatric Admission Quality of Care (PAQC) score. In our analysis, we adjusted for patients, clinician and hospital level factors. Missing data occurred in patient and clinician level variables. We did multiple imputation of missing covariates within the joint model imputation framework. We fitted proportion odds random effects model and generalized estimating equation (GEE) models to the data before and after multilevel multiple imputation. Results: Overall, 2,299 children aged 2 to 59 months were admitted with childhood pneumonia in 12 hospitals during the trial period. 2,127 (92%) of the children (level 1) were admitted by 378 clinicians across the 12 hospitals. Enhanced A&F led to improved inpatient pediatric pneumonia care over time compared to standard A&F. Female clinicians and hospitals with low admission workload were associated with higher uptake of the new pneumonia guidelines during the trial period. In both random effects and marginal model, parameter estimates were biased and inefficient under complete case analysis. Conclusions: Enhanced A&F improved the uptake of WHO recommended pediatric pneumonia guidelines over time compared to standard audit and feedback. When imputing missing data, it is important to account for the hierarchical structure to ensure compatibility with analysis models of interest to alleviate bias.

Background: Routine clinical data are widely used in many countries to monitor quality of care. A limitation of routine data is missing information which occurs due to lack of documentation of care processes by health care providers, poor record keeping, or limited health care technology at facility level. Our objective was to address missing covariates while properly accounting for hierarchical structure in routine pediatric pneumonia care.
Methods: We analyzed routine data collected during a cluster randomized trial to investigating the effect of audit and feedback (A&F) over time on inpatient pneumonia care among children admitted in 12 Kenyan hospitals between March and November 2016. Six hospitals in the intervention arm received enhance A&F on classification and treatment of pneumonia cases in addition to a standard A&F report on general inpatient pediatric care. The remaining six in control arm received standard A&F alone. We derived and analyzed a composite outcome known as Pediatric Admission Quality of Care (PAQC) score. In our analysis, we adjusted for patients, clinician and hospital level factors. Missing data occurred in patient and clinician level variables. We did multiple imputation of missing covariates within the joint model imputation framework. We fitted proportion odds random effects model and generalized estimating equation (GEE) models to the data before and after multilevel multiple imputation.
Results: Overall, 2,299 children aged 2 to 59 months were admitted with childhood pneumonia in 12 hospitals during the trial period. 2,127 (92%) of the children (level 1) were admitted by 378 clinicians across the 12 hospitals. Enhanced A&F led to improved inpatient pediatric pneumonia care over time compared to standard A&F. Female clinicians and hospitals with low admission workload were associated with higher uptake of the new pneumonia guidelines during the trial period. In both random effects and marginal model, parameter estimates were biased and inefficient under complete case analysis.

INTRODUCTION
Routine data are widely used in many countries to monitor quality of care and to inform intervention programmes for better patients' health outcomes (1).
Routine data can also be used to highlight areas of concern in clinical performance thus prompting actions and strategies to improve practice at individual or institutional levels (2). Prior studies show that quality of care vary across place and time in spite of standard clinical guidelines (3). These variations can be attributed to multiple factors including changes in clinical guidelines, degree of task complexity, and patient's characteristics, clinician characteristics in addition to organizational and contextual factors at hospital level (3)(4)(5). Between 2013 and 2014, the Kenya Medical Research Institute-Wellcome Trust Research Programme in collaboration with the Ministry of Health, the Kenya Pediatric Association and 14 county-level hospitals initiated a partnership known as the Clinical Information Network (CIN). The main aim of CIN is to collect and use routine pediatric data to promote adoption and adherence to recommended clinical practices through audit and feedback (A&F) cycles (3,(5)(6)(7). While such data from multiple sites enhance generalization of results to wider population, it leads to complex hierarchical data structures, for instance, patients clustered within clinicians, who are then clustered within hospitals.
Besides complex structures, routine data are subject to missing information at any level of hierarchy. Missing information may occur due to lack of documentation of care processes by health care providers, poor record keeping, or limited health care technology at facility level (1,8,9). In the occurrence of missing data, appropriate missing data methods at analysis stage are recommended to avoid biased results (10) informing clinical policies and ultimately leading to poor patients care and outcomes (11).
In the recent past, there has been an increase in literature on quality of care among children admitted with common childhood illnesses in low and middle income countries (3,(12)(13)(14)(15). However, majority of the studies account for variation at patient and hospital levels ignoring variation due to clinicians characteristics in spite of their critical role in delivery of routine care (16). Besides, missing data is a common problem across these studies. Majority of the studies report using complete case analysis (13,17,18) and multiple imputation (15,19,20). A major limitation of complete case records is biased and inefficient parameter estimates due to information loss. In studies where multiple imputation is used to handle missing data, the nature and details of the imputation model are rarely reported posing uncertainty about conclusions and barriers for replicate analyzes. Furthermore, when missing data occur in multilevel data context, incompatibility between the imputation model and the analysis models potentially leads to biased estimates, underestimated cluster level variances, and overestimated individual level variances (10,(21)(22)(23). For example, incompatibilities occur when the imputation model assumes data are single level (i.e., ignoring multilevel structure) while the analysis model of interest is multilevel.
In this study, we aim to address missing covariates while properly accounting for hierarchical structure in inpatient routine data set, that is, patients nested within clinicians who are then nested within hospitals. Specifically, we analyze data from a cluster randomized trial investigating the effect of enhanced audit and feedback on clinicians' prescribed pediatric pneumonia care in Kenyan hospitals. To achieve this objective, we construct and analyze pneumonia Pediatric Admission Quality of Care (PAQC) score adapted to new WHO recommendations on assessment and treatment of inpatient pediatric pneumonia cases. PAQC score is a newly developed ordered composite measure used to benchmark quality of care among children admitted with common childhood illnesses in low and middle income settings.
The remainder of this paper is structured as follows: In the Methods section we present a description of pneumonia trial data followed by statistical analysis methods for cluster correlated and missing data methods, respectively. Thereafter, we present results before and after multiple imputation and conclude with a discussion.

Study Design
In this study we analyzed data from a cluster randomized trial conducted by KEMRI-Wellcome Trust Research programme between March 2016 and November 2016. Details of the trial and the study population are described in full elsewhere (5,24). In summary, the trial was embedded within the larger CIN study (ongoing) (6,7,25). The primary goal of the trial was to investigate whether enhanced audit and feedback improved quality of inpatient pediatrics pneumonia care (i.e., assessment, diagnosis, and treatment of childhood pneumonia) in Kenyan hospitals following new pneumonia guidelines recommended by the World Health Organization (WHO) in 2013 (26). Six hospitals were randomized to receive a standard audit and feedback report on general inpatient pediatric care (control arm). The remaining six hospitals received a standard audit and feedback report in addition to an enhanced audit and feedback targeting assessment, classification and treatment of pneumonia cases (intervention arm) (5,24). Trained data clerks abstracted routine data from the medical records into Research Electronic Data Capture (REDCap) tool after patient's discharge from general pediatric wards. Data abstraction process was guided by a standard operational procedure manual (5). Patients' data spanned history of illness, physical examination, diagnosis, laboratory investigations, treatments, and discharge plans (5,24). Details of admitting clinician including sex and professional qualification were also recorded into a separate database linked to the patients' database by a unique clinician code.
Data quality assurance (DQA) exercises were conducted by CIN research assistants in each hospital every 3 months to check consistencies with data clerk's entries. The Kenya Ministry of Health and Kenya Medical Research Institute's Scientific and Ethical Review Unit approved data collection without individual patient's consent (5).

Outcome: Pneumonia Pediatric Admission Quality of Care Score
Our outcome of interest was pneumonia PAQC score adapted to 2013 WHO pediatric pneumonia treatment guidelines. As earlier mentioned, PAQC score is a summary measure spanning three quality of care domains namely, assessment, clinical diagnosis, and treatment of common childhood illnesses including pneumonia, malaria, diarrhea, and dehydration. Details on PAQC score construction and validation are described in full elsewhere (12,27). With regard to pneumonia PAQC, there are three binary subcomponents in the assessment domain. The first subcomponent represents assessment and documentation of two primary signs and symptoms required for pneumonia identification (i.e., presence of cough or difficulty in breathing). The value 1 in the binary indicator denotes documentation of both cough and difficulty in breathing as either present or absent while 0 denotes lack of documentation of least one primary sign and symptom in a patient's medical record.
The second binary indicator represents assessment and documentation of secondary signs and symptoms required for pneumonia severity classification (i.e., chest indrawing, respiratory rate, grunting, central cyanosis, oxygen saturation, ability to drink, or altered level of alertness). The value 1 in the binary indicator denotes documentation of all secondary signs and symptoms, respectively, while 0 denotes lack of documentation of least one secondary signs and symptom. The third binary indicator of the assessment domain corresponds to 1 when primary and secondary pneumonia signs and symptoms (all primary and secondary signs and symptoms combined) are documented and 0 otherwise (26).
The second PAQC score domain entails integration of information on presenting signs and symptoms by admitting clinician to correctly diagnose and classify pneumonia severity (i.e., severe pneumonia or pneumonia). For example, pneumonia was the correct diagnosis for a child who, in addition to cough and/or difficult breathing (primary signs), presented with lower chest indrawing or respiratory rate >50 for patients aged 2-11 months (or respiratory rate <40 for patients aged 12-59 months) in the absence of all other secondary signs and symptoms. In this study, a binary indicator was created with value 1 representing correct pneumonia severity classification (i.e., is, pneumonia severity documented in the medical record by the admitting clinician was in line with severity implied by presenting signs and symptoms) and 0 representing misclassified pneumonia severity.
The third PAQC score domain consists of two binary indicators. The first binary variable indicates whether oral amoxicillin was prescribed for pneumonia cases (denoted by 1) or not (denoted by 0). The second binary variable indicates whether oral amoxicillin was prescribed according to guideline recommended doses (26). In order to determine correctness of the dose, we created a new variable "dose per kilo body weight" using actual dose given at point of care, patient's weight, and frequency of administration. Among pediatric pneumonia cases, the recommended oral amoxicillin dose should range between 32 and 48 international units per kilogram (IU/Kg) every 12 h. The new variable was then transformed into a binary variable with 1 representing correct dose (that is, dose per kilo body weight and frequencies of administration are in line with guidelines recommendations) and 0 representing incorrect dose (incorrect in either dose per kilo body weight or frequency of administration) or missing dose. Subsequently, we summed all the six binary components across domains to obtain PAQC score; an ordinal outcome on a 7-point scale. We constructed pneumonia PAQC score at patient level. A minimum score of zero corresponded to inappropriate pneumonia care and maximum score of six represented complete adherence to new pneumonia guidelines across domains of care. To assess performance in terms of adherence to pediatric pneumonia guidelines during the trial period, we calculated and plotted the LOESS smoothing curves and the corresponding 95% confidence bands for the mean monthly PAQC score for each intervention arm.

Covariates
The covariates of interest were intervention arm, follow up time in months with their interaction, hospital malaria prevalence status, and hospital admission workload. At clinician level, gender, and cadre were considered (here cadre refers to clinician's level of training that is, clinical officers with diploma-level training and medical officers with a bachelor's degree level training). At patient level, we considered sex, number of comorbid illnesses, and age at admission. Prior to analysis, we converted age for all the patients into months before categorizing them into two age groups that is, patients aged 2-11 months and patients aged 12-59 months. With regard to comorbidities, we determined the total number of clinical diagnoses documented in patient's medical records. The diagnoses of interest included malaria, malnutrition, HIV, Asthma, Tuberculosis (TB), rickets, anemia, diarrhea, and dehydration. For each patient, we created separate binary variables for the diagnosis above with value 1 denoting the presence of a disease and 0 denoting absence of a disease. We then summed the binary indicators and categorized patients into four groups, that is those with 0, 1, 2, 3 or more comorbidities, respectively.

Missing Data Concepts
In the analysis of partially observed data, assumptions were made about the missingness mechanism generating the data (10). Suppose Y (representing both response and independent variables) is an N × p matrix denoting a hypothetical data set containing p variables (j = 1,...,p) for the ith study subject, (i = 1,2,3,. . . ,N). For each study subject, Y i can be partitioned into observed and missing components denoted by Y i obs and Y i miss , respectively. Further letting a missingness indicator R i take the value 1 if Y i is observed and 0 if Y i is missing. Then according to Rubin (28) data are said to be missing completely at random (MCAR) when the probability of missing values in variable is independent of the variable itself or any other observed variable in the data set that is, P(R i |Y i miss , Y i obs ) = P(R i ). When the probability of missing values in a variable does not depend on the variable of interest but are conditionally dependent on other observed variables in the data set, then data are said to be missing at random (MAR) and denoted by P( . When MAR assumption does not hold, then data are said to be Missing Not at Random (MNAR). MNAR mechanism occurs when the missingness depends on the actual value of the missed observation (10).

Investigating the Missing Data Mechanism
Before analyzing partially observed data, it was important to investigate plausible missing data mechanisms (10,29). In this study we generated binary missingness indicators (R i ) for partially observed variables in the pneumonia trial data set. The binary missingness indicators were analyzed separately using a logistic regression model below where X i is a vector of fully observed variables for the ith subject. The vector β denotes fixed regression parameters to be estimated. When the probability of missingness is independent on fully observed variables (P-values for the regression coefficients > 0.05), a variable is said to be MCAR. On the other hand, when the probability of missingness is dependent on fully observed variables (P-values for the regression coefficients < 0.05), then MAR assumptions holds and restricting analysis to complete observations yields bias and inefficient estimates (10,29,30). Similarly, when the probability of missingness is dependent on fully observed covariates but independent of the response variable, then covariate dependent MAR assumptions holds and restricting analysis to complete observations yields unbiased but inefficient estimates due to information loss (10,29,30). We also used graphical methods to investigate missing data patterns underlying pneumonia trial data ( Figure A1 in Supplementary Material).

Multiple Imputation
Multiple imputation (MI) involves substituting each missing value with a set of plausible values given the observed data and an imputation model (10,31). MI is commonly used assuming a MAR mechanism but can also be used when data are MNAR.
Multiple imputed data sets are then analyzed using standard methods and results pooled into a single inference using Rubin's Rule (32). Multiple imputation is preferred over other missing data methods such as list wise or pairwise deletion because uncertainty about the missing values is taken into account (10,23,30,31,33). Additionally, MI separates imputation phase from analysis phase therefore allowing inclusion of auxiliary variables in the imputation model that are predictive of missing variables and the missingness mechanism (10,23,27,(33)(34)(35).
In this study, we imputed missing level 1 and level 2 variables within the joint modeling framework where replacement values are drawn from a multivariate normal distribution in a single step. Multilevel MI was implemented in the newly developed jomo and mitmil packages in R (version 3.4.3) which allows imputation of categorical variables with more than two levels in the second and higher levels of the multilevel structure (36). For the ith patient nested within jth clinician in hospital l, we defined a two level JM imputation model corresponding to where Y i,j,l (1) and Y j,l (2) are vectors of partially observed level 1 variables (patient's sex) and level 2 variables (clinician's sex and cadre), respectively. Predictor variables (X i,j,l (1) ) of missing patient's sex included fully observed follow-up time interacted with feedback arm, hospital admission workload and hospital malaria prevalence status, patient's PAQC score, patient's age and number of comorbid illnesses. Level 2 predictors (X j,l (2) ) for missing clinicians' sex and cadre included follow-up time interacted with feedback arm, hospital admission workload, and hospital malaria prevalence status. Column vectors β 1 and β 2 denote level 1 and level 2 fixed effects, respectively. A clinician random intercept (b j,l ) was included to account for clustering at clinicians' level and to ensure compatibility with substantive models of interests. A burn-in of 1,000 updates and a 1,000 iterations between each of the 30 imputations were considered. We used trace plots to assess convergence (37). Final estimates were pooled according Rubin's rules.

Statistical Analysis
We considered two model families to analyze pneumonia trial data, that is, generalized estimating equations (GEE) and random effects models. The random effects and GEE models differ in terms of estimation and interpretation of parameter estimates (30). We considered both models in order to assess the stability of inferences and conclusions within and across the two methods before and after multiple imputation.

Generalized Estimating Equations (GEE) Model
Generalized estimating equations (GEE) proposed by Liang and Zeger (38) is a quasi-likelihood method for modeling correlated responses within the marginal (population averaged) family of models (29,30). In GEE model a working correlation structure is adopted. However, the parameter estimates in GEE model are consistent even when the association structure is misspecified (29,39). A GEE model is given by where the link function h −1 (•) is a known function, X i is a design matrix for the fixed effects and β is the vector of unknown regression parameters. The vector of regression parameters is interpreted in terms of average response over the population rather than prediction of the effect of changing covariates on a given study subject (29). When the responses are ordered and the proportional odds assumptions of parallel logits hold, the cumulative logits (proportional odds) model is considered (40). For instance, considering ordered pneumonia PAQC score (outcome) for the ith patient nested within jth clinician in hospital l, the proportional odds GEE model of interest corresponds to where α k , k = 1,2,3,4,5,6 are PAQC score intercepts and β ′ s are regression coefficients common across all k−1 cumulative logits.

Random Effects Model
In contrast to population-averaged models, random effects models are useful when drawing inferences with respect to the subject-specific parameters. Given the covariates and random effects, the responses are assumed to be conditionally independent in this model (29,30). A random effects model is denoted by where h −1 (•) is a known link function, X i and Z i are design matrices for the fixed effects and random effects while β and b i are vectors of fixed and random parameters, respectively. The vector b i is assumed to be sampled from a multivariate normal distribution with mean vector 0 and covariance matrix . The vector of regression parameters (β) has subject specific interpretation in terms of the transformed mean response for in individual. Considering pneumonia trial data with ordinal PAQC score as above, proportional odds random intercepts model of interest corresponds to logit[P(Y PACQ Score : i,j,l ≤ k)] = α k + β 1 X age group : i,j,l + β 2 X patient ′ s sex : i,j,l + β 3 X comobidity : i,j,l + β 4 X clinician ′ s cadre : j,l + β 5 X clinician ′ s sex : j,l + β 6 X admission workload : l + β 7 X malaria prevalence : l + β 8 X time in months : l * X trial arm : l + b jl where α k , k = 1,2,3,4,5,6 are PAQC score specific intercepts, β ′ s are estimated regression coefficients (common across all k−1 cumulative logits) and b j,l are clinician's random intercepts. Hospital level random effects were not considered in these analyses due to the few number of clusters.

Statistical Tests for Multiple Parameters
We used Wald tests and likelihood-ratio tests to determine covariates with statistically significant effect on pneumonia PAQC score. The likelihood-ratio tests was used to test for statistical significance of covariates in the random effects models (10,41,42

Descriptive Summaries
In total, 2,299 children aged 2-59 months were admitted in general pediatric wards with childhood pneumonia in 12 CIN hospitals during the trial period. We linked patients and clinicians' databases using unique clinician code present in both databases with a success rate of 92.5% (2,127/2,299) after exclusion of 172/2,299 case records lacking admitting clinician's information. This resulted to three levels of clustering i.e., 2,127 patients admitted by 378 clinicians in 12 hospitals. Of the 2,127 pneumonia cases, 953/2,127 (44.8%) were admitted in six hospitals assigned to enhanced A&F (intervention) arm. The number of pneumonia cases varied across hospitals with a range of 42-356 patients ( Table 1). Five out of 12 hospitals were drawn from high malaria endemic regions (three control and two intervention hospitals) while the remaining seven hospitals (four control and three intervention hospitals) were drawn from low malaria regions in Kenya (25). Furthermore, four in 12 hospitals were high admission workload hospitals that is, more than 1,000 pediatric admissions per annum (three control and one intervention hospitals) while 8/12 were low admission workload hospitals i.e., <1,000 pediatric admissions per annum (three control and five intervention hospitals) irrespective of admission diagnosis. On average, there were 32 clinicians per hospital with a standard deviation of nine clinicians. The number of patients per clinician ranged between 3 and 46. Majority of the admitting clinicians were clinical officer interns at 48.7% (185/378) followed by Medical officer interns at 26.2% (99/378). Clinical officer and medical officers accounted for 1.6% (6/378) each. Approximately, 21.9% (83/378) and 21.7% (82/378) clinicians had missing gender and cadre, respectively ( Table 1). In subsequent analyses we grouped clinicians into two cadres from the initial four. That is, clinical officers (CO) combining clinical officers and clinical officer interns and medical officers    (MO) combining medical officers and medical officer interns, respectively. Approximately, 42% (903/2,127) of patients were aged between 2 and 11 months and 45% (950/2,127) were females. Patient's sex was missing in 0.7% (17/2,127) of case records ( Table 1).
Examining pneumonia PAQC score over time graphically, hospitals in the standard A&F arm (red curve) exhibited a higher mean PAQC score at baseline with no significant fluctuations over time (Figure 1). On the other hand, hospitals assigned to enhanced A&F arm (blue curve) had a lower mean PAQC score at baseline which rapidly improved toward higher score in the first 6 months of follow-up. Although enhanced A&F arm's trend line surpassed that of standard A&F arm after 6 months of follow-up, the 95% confidence bands of the two intervention arms overlapped substantially (Figure 1).
An assessment of missing data patterns suggested a multivariate missing data pattern ( Figure A1 in Supplementary Material). The missing data pattern further revealed similarities between of missing clinician's cadre and sex. That is, nearly all clinicians with missing sex had missing cadre as well. Further investigations into missing data patterns showed that missing clinicians' cadre and sex only occurred in six out of 12 hospitals (Figure 2).
Logistic regression results on plausible mechanisms underlying pneumonia trial data indicated that the probability of missing patient's sex was neither dependent on the outcome (PAQC score) nor fully observed covariates (interaction between intervention arm and follow up time in months, hospital admission workload, and malaria prevalence, patient's age group, and the number of presenting comorbid illnesses). That is, the P-values were >0.05 suggesting a MCAR mechanism (Table A1 in Supplementary Material). On the other hand, the probabilities of missing clinician's cadre and gender were dependent on both the outcome and fully observed covariates suggesting evidence against MCAR (Table A1 in Supplementary Material). Therefore, FIGURE 1 | Mean PAQC score and 95% confidence bands for six hospitals in the standard A&F arm and six hospitals in the enhance A&F arm.
we imputed missing data assuming a MAR mechanism. MI diagnostic test indicated satisfactory convergence ( Figure A2 in Supplementary Material).

Random Effects and GEE Model Results
Test for proportional odds assumption was not statistically significant at 5% level (P = 0.17). Therefore, we assumed parallel logits and fitted proportional odds models to complete case records and imputed datasets. In Table 2, we present the likelihood ratio test and Wald test results for proportional odds random effects and GEE model, respectively. After MI of missing covariates, we observed consistent results between the random effects model and the GEE model in terms of statistical significance of covariates of interest ( Table 2). Specifically, we found statistically significant interaction effect between intervention arm and follow-up time. Similarly, admission workload at hospital level was significant at 5% level. At patients' level, age and the number of comorbidities were statistically significant while at clinicians' level, sex showed significant effect on pneumonia PAQC score ( Table 2).
In Table 3, we present proportional odds ratios and the corresponding 95% confidence interval obtained after fitting the random intercepts model and GEE models before and after multilevel multiple imputation. Standard errors before and after MI are presented in This loss information led to larger standard errors comparison to those obtained after MI of missing covariates in both random  effects and GEE model families. Furthermore, the proportional odds ratios were consistently smaller under complete case analyses compared to those obtained after MI ( Table 3). These results were an indication of bias and inefficiency of parameters estimated under complete case analysis. The six PAQC score intercepts presented in Table 3 denote thresholds (cut points) differentiating adjacent levels of the response variable. For example, intercept 1 in Table 3 denote the odds of PAQC score = 1 vs. PAQC score ≥ 2 for a female patient aged 2-11 months admitted with no comorbidities admitted by a male medical officer in a high workload hospitals located in high malaria prevalence region. The individual fixed effect parameters are the proportional odds ratios of individual variables on PAQC score holding all other variables in the model constant.
From study results, enhanced audit and feedback led to improve uptake of new pneumonia pediatric guideline over time. For instance, considering a patient admitted in an intervention hospital (enhanced audit and feedback arm), the odds of PAQC score = 1 vs. PAQC score ≥ 2 were 1.16 (95% CI: 1.02-1.308) times higher the odds of a patients admitted in a control hospital, for a unit increase in follow-up time and holding other variables at reference levels. Likewise, for a patient admitted in an intervention hospital, the odds of PAQC score = 1 vs. PAQC score ≥ 2 were 1.29 (95% CI: 1.17-1.482) times higher the odds of a patients admitted in a control hospital, for a unit increase in follow-up month (GEE model after MI). These interpretations hold for all other response (PAQC score) levels.
The study results also exhibited shifts in statistical significance before and after multiple imputation for selected variable. Specifically, adjusting for other variables, complete cases analysis lead to insignificant difference between low and high admission workload hospitals on levels of PAQC score in both random effects model and GEE model where the 95% CI confidence intervals contained the value 1. But after MI, the odds of higher pneumonia PAQC score in low workload hospitals were 1.12 (95% CI: 1.08-1.372) and 1.40 (95% CI: 1.103-2.063) times higher than for high workload hospitals for the random intercepts and GEE model, respectively (Table 3).
With regard to random effects model, the variance component between clinicians and the corresponding standard error were inflated under complete cases analysis. A possible explanation for this results is that clinicians with missing cadre and sex were discarded under complete case analysis resulting to fewer number of clinicians (clusters) hence inflated clinicians' variability. On the other hand, all clinicians were retained after MI hence lower variability between clinicians.

DISCUSSION
This study sought to investigate the effect of enhanced A&F on routine pediatric pneumonia care in 12 Kenyan hospitals during a cluster randomized trial. In the analysis we adjusted for patients, clinicians, and hospital levels factors while accounting for covariate missingness across the three levels of hierarchy. The number of pneumonia admissions varied widely across hospitals during the trial period. The outcome of interest (pneumonia PAQC score) is a composite measure representing multiple aspects of pediatric pneumonia care on a 7-point ordinal scale. The advantage of using composite outcomes over individual performance measures is increased statistical efficiency (43)(44)(45)(46)(47). Although we reported and analyzed a fully observed outcome, we note that variations in pneumonia PAQC on the 7-point ordinal scale was attributable to missing data in some of the subcomponents in addition to inappropriate pneumonia care across domains of care (12). Specifically, missing components and those corresponding to inappropriate care were scored zero. Among covariates, clinician variables exhibited the highest proportions of missingness. Approximately 21% of all admitting clinicians had missing sex and cadre, respectively. These observations were consistent with previous results of a cluster randomized trial evaluating the effectiveness of a multifaceted intervention to improve admission pediatric care in eight Kenyan hospitals (10,48). In the said study, 14 and 20% of the clinicians had missing sex and years of experience, respectively. In contrast, patient level variables were fully observed except patient's sex which had <1% missingness. The sharp contrast missingness between clinicians and patients level variables could be due the fact that continued CIN audit and feedback reports focus on the documentation of patient level variables rather than documentation of clinicians' characteristics. Through preliminary investigations, we established that missing clinicians' characteristics occurred in six out of 12 hospitals participating in the trial. The patterns of missingness in the two clinicians level variables was highly correlated. That is, clinicians who did not document their sex were also likely not to document their cadre and vice versa.
To alleviate bias and inefficiency, we used multiple imputation within the joint modeling (JM) imputation framework assuming a MAR mechanism (10,30,31). Although JM imputation framework does not address the full range of complexities that are typical of multilevel data (22,23), it was preferred due to its flexibility coupled with recent statistical software developments in handling categorical variables with more than two levels in second and higher levels of hierarchy (36).
Consistent with our expectations, results demonstrated that multilevel imputation led to more precise parameter estimates compared to complete case analyses in both random effects and GEE models. Adjusting for patients, clinicians and hospital level factors, enhanced A&F improved uptake and adherence to recommended pediatric pneumonia guidelines over time among children aged 2-59 months admitted in six CIN hospitals during the trial period compared to standard A&F on general inpatient pediatric care. The significant difference in the uptake of the pneumonia guidelines between the intervention arms could be due to difference in baseline performance observed in the Loess curves. That is, control hospitals exhibited high baseline performance (on average) thus leaving smaller room for improvement compared to low baseline performance in the enhanced A&F arm hence larger room for improvement over time. These results were consistent with those of the primary analysis (24).
A key difference between our study and that primary analysis is that whereas we analyzed a composite outcome spanning three quality of care domain, Ayieko et al. (24) considered proportion of patients with correct pneumonia classification and treatment, respectively. Furthermore, our study accounted for clinicians' characteristics in addition to patients and hospitals level characteristics accounted for in the primary analysis. From results, the quality of pneumonia care differed between male and female clinicians. It was also evident that junior clinicians (medical officers and clinical officer interns) were responsible for much care during the trial period. However, the quality of care provided did not differ between the cadres. The high number of interns is an indication that hospitals in the trial were teaching and referral hospitals.

Strengths and Implications of the Study
In this study, we investigated plausible missing data mechanism underlying pneumonia trial data. Though often ignored, this step is important in assessing and understanding the implications of missingness in a given data set under analysis. That is, inefficient estimates or both biased and inefficient estimates. In addition to missing data mechanism, we evaluated missing data patterns underlying the trial data set. This was useful in revealing trends and gaps in the quality of routine care. Insight into such information is useful when designing cost effective follow-up or new interventions programmes for optimal and efficient utilization of already stretched resources (49). For instance, based on our study results, a follow up intervention programme aimed at improving documentation and reporting of clinician characteristics should be directed to specific hospitals low documentation of clinicians' level data while resources in hospitals with good documentation practices should be directed elsewhere.
To address missing data, we employed recent statistical software tools to impute missing variables in routine pediatric data. Our choice of imputation tools and method was in consideration of the hierarchical structure of the data and type of variables in the data set. This ensured compatibility between imputation and analysis models of interest thus minimizing bias in parameter estimates (10,23). Further, our choice of proportion odds models to analyze the ordinal outcome was ascertained through formal test further enhancing the validity of our study results. In instances when the proportional odds assumptions are violated, multinomial logistic regression model is recommended (40). In contrast to previous studies reporting quality of inpatient pediatric routine care in CIN hospitals (3,13,15), our study accounted for clinicians who are essential for the delivery of health intervention (16). Ignoring variation at clinician level may lead to biased estimates, overestimation or underestimation of variations in other levels of clustering (50).

LIMITATIONS
A limitation of this study is that we relied on data collected after patient discharge. Therefore, we are unable to ascertain if patients received pneumonia care as documented by health workers (24). We imputed missing data assuming MAR mechanism. Therefore, sensitivity analyses will be undertaken to explore the robustness of the inferences to MAR assumptions.

CONCLUSION
Adjusting for hospitals, admitting clinicians, and patient level factors, enhanced audit, and feedback improved uptake of WHO recommended pediatric pneumonia guidelines compared to standard audit and feedback. Additionally, female clinicians and hospitals with low admission workload were associated with higher uptake of the new pediatric pneumonia guidelines during the trial period. In both random effects and marginal model, parameter estimates were biased and inefficient under complete case analysis. Therefore, multiple imputation is recommended. When analyzing partially observed data with more than one level of clustering, it is paramount to accounts for the hierarchical structure in the imputation model to ensure compatibility with analysis models of interest and hence alleviate bias.

ETHICS STATEMENT
The Kenya Ministry of Health and Kenya Medical Research Institute's Scientific and Ethical Review Unit approved the use of de-identified patient data obtained through retrospective review of medical records without individual patient consent.

AUTHOR CONTRIBUTIONS
SG conducted the analyses. Feedback on the analytic approach was provided by EN, NO, PA, and ME. SG drafted the initial manuscript with feedback on subsequent drafts provided by all authors who then approved the final manuscript.

ACKNOWLEDGMENTS
We would like to thank the Ministry of Health who gave permission for this work to be developed and have supported the implementation of the CIN together with the county health executives and all hospital management teams. We are grateful to the Kenya Pediatric Association for promoting the aims of the CIN and the support they provide through their officers and membership. We also thank the hospital teams involved in service delivery for the sick child. This work is published with the permission of the Director of KEMRI. The CIN