# Modeling Long-Term Graft Survival With Time-Varying Covariate Effects: An Application to a Single Kidney Transplant Centre in Johannesburg, South Africa

^{1}Division of Biostatistics and Epidemiology, School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa^{2}Wits Donald Gordon Medical Centre, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa

**Objectives:** Patients' characteristics that could influence graft survival may also exhibit non-constant effects over time; therefore, violating the important assumption of the Cox proportional hazard (PH) model. We describe the effects of covariates on the hazard of graft failure in the presence of long follow-ups.

**Study Design and Settings:** We studied 915 adult patients that received kidney transplant between 1984 and 2000, using Cox PH, a variation of the Aalen additive hazard and Accelerated failure time (AFT) models. Selection of important predictors was based on the purposeful method of variable selection.

**Results:** Out of 915 patients under study, 43% had graft failure by the end of the study. The graft survival rate is 81, 66, and 50% at 1, 5, and 10 years, respectively. Our models indicate that donor type, recipient age, donor-recipient gender match, delayed graft function, diabetes and recipient ethnicity are significant predictors of graft survival. However, only the recipient age and donor-recipient gender match exhibit constant effects in the models.

**Conclusion:** Conclusion made about predictors of graft survival in the Cox PH model without adequate assessment of the model fit could over-estimate significant effects. The additive hazard and AFT models offer more flexibility in understanding covariates with non-constant effects on graft survival. Our results suggest that the period of follow-up in this study is long to support the proportionality assumption. Modeling graft survival at different time points may restrain the possibility of important covariates showing time-variant effects in the Cox PH model.

## 1. Introduction

The incidence and prevalence of end-stage kidney disease (ESKD) have significantly increased in developing countries, such as South Africa (1). Patients with ESKD have an increased risk of premature death on chronic dialysis therapy and for long term survival, kidney transplantation is the treatment of choice (2). A successful kidney transplant increases the life-expectancy and quality of life of a patient with ESKD. Despite advances in the use of immunosuppressants, recipient and donor factors still compromise the efficacy of a kidney transplant outcome, especially for long-term survival (3, 4). This has brought increased interest in identifying these factors using statistical methods, such as survival analysis. In kidney transplant studies, time-to-graft failure or patient death is usually the event of interest.

Beyond the Kaplan-Meier (KM) estimator, most kidney transplant studies employ the Cox Proportional hazard (PH) model to analyse whether individual patients or donor's characteristics influence the probability of Graft survival (GS) or graft failure (GF). The framework of proportional hazard assumption under the Cox PH model states that factors under study act multiplicatively on the baseline hazard function and either increase or decrease the baseline function at a constant rate (5). This fundamental assumption may not be tenable in kidney transplant studies because the effect of recipient age may impose a strong effect immediately after kidney transplant but gradually fades with time. In this situation, a hazard ratio (HR) does not suggest the same magnitude or size on the survival time. Therefore, the variable is said to have a time-varying effect on survival. Assessing the PH assumption should be the fundamental aspect in the use of the Cox PH model because violation of this assumption could lead to misleading of the resulting parameter interpretation (6). However, if the assumption of PH is violated for any covariate, a more flexible model which does not condition on constant proportional could offer more insight about the relationship between graft survival and the risk factors.

One of these models is the Aalen's additive hazard model (7), which specifies how the covariates impact additively on the hazard, but the effects of the covariates are allowed to vary freely over time. As, however, the closest version of an additive hazard model which is analog to the Cox hazard model is the Lin and Ying (8) model. It assumes the covariates act additively upon an unknown baseline hazard and their effects are constant. Conversely, the effects of the covariates in the model may be constant or time-varying. McKeague and Sasieni (9) proposed a version of the additive model that accommodates both constant and time-varying covariates effects. Although several authors advocated and used the additive hazard models for survival time data, however, the additive hazard model is rarely used in survival data analysis, more especially in kidney transplant research due to lack of familiarities with the model (10, 11). Similarly, the parametric accelerated failure time (AFT) models accommodate time-varying covariates effects. The effect of covariates in an AFT model is constant and act multiplicatively on the survival times (12), and the covariates accelerate or decelerate the occurrence of events of interest i.e., a predictor effect acting to either accelerate or decelerate graft survival time. The formulation of these models allows the estimation of a time ratio (TR) and the regression coefficients are estimated with the method of full maximum likelihood. Parametric survival models were considered by Hashemian et al. (13), in analyzing survival after kidney transplant and noted that parametric survival models provide a more suitable description of the survival data compared with the Cox PH model.

This study is motivated by previous studies on the statistical analysis of kidney transplants done in South Africa (14–16). These studies focused on the comparison of patients and GS or identification of factors that influence survival using the KM estimator and standard Cox PH model. As an extension to these previous studies, this study aims to use a more rational and methodical approach to (i) identify factors that influence long-term GS using purposeful model building strategy, (ii) affirm the importance of assessing the PH assumption in Cox PH model, and finally (iii) show the need to consider additive hazard and AFT models as a complement to the Cox model when the PH assumption is not tenable.

## 2. Patients and Method

We studied patients ≥18 years that underwent their first kidney transplant at Charlotte Maxeke Johannesburg Academic Hospital between 1984 and 2000. This is a retrospective cohort study, which involves 915 adult patients. Patients were followed-up after transplant, and information detailing patients, donors and transplant characteristics were recorded. GS was defined as the period from transplant to GF, loss to follow-up or end of the study. That is patients were right-censored if the graft did not fail by the end of the study or the patients were lost to follow-up (graft failure: 1, censored or alive: 0). Deaths with functioning grafts were not captured in this study. GF rates were computed as the ratio of the number of failed grafts to patient-years (PY) of follow-up and expressed as failure rates per 1,000 PY. Predictor variables or covariates for inclusion in this study were identified from literature using factors shown to significantly influence graft survival (16–18).

The covariates considered in this study are not time-dependent because they were only measured at the beginning of the study. There is no relationship between each variable missingness and the values of the variable or other variables in the study. Nonetheless, we numerically verified the assumption of missing completely at random using the Little's test of MCAR (19). MissForest based imputation method (20) was used to replace the missing data with reasonable values. MissForest is a non-parametric imputation method that can simultaneously impute different types of variables and its algorithm is based on random forest. There is no need to specify the tweaking parameter or the distribution of the data in the algorithm. For each variable with missing observation, the algorithm fits a random forest model using the rest of the variables in the dataset and then predict the missing values for that variable. The imputation procedure continuously run interactively and performance between iterations are assessed until a stopping criterion is reached. This is done in a repeated approach for all the variables with missing value in the dataset. For the continuous variable (donor age) with missing value, we assessed the performance of the imputation algorithm using the normalized root mean square error and for the categorical variables with missing, we used the proportion of falsely classified entries (21). The data were summarized and relevant information available for the patients were extracted. The analysis steps are described in Figure 1.

## 3. Survival Analysis Methods

Let *T*_{i} be a random variable that represents GF time for patient *i* with characteristics *X*_{i}, a p-dimensional covariate vector. Suppose *C*_{i} denotes right censoring times, the distribution of *C*_{i} is independent of *T*_{i} such that *min*(*T*_{i}, *C*_{i}) is observed. Typically, a survival dataset ${{D}}_{m}$ consists of *m* i.i.d. representative observations (*T*_{i}, δ_{i}, *X*_{i}), *i* = 1, …, *n* and δ_{i} = *I*[*T*_{i} ≤ *C*_{i}, δ = 1 or *T*_{i} > *C*_{i}, δ = 0] is defined as censoring indicator.

### 3.1. Cox Proportional Hazard Model

The Cox PH model was used to analyse the effect of the study predictors on GS. The purposeful method of variable selection employed in this study was based on the Cox PH model (5). First, the effects of all the study covariates on graft survival were assessed univariately with the Cox PH model (22). If *T*_{i} follows the Cox PH model, then the hazard function for *T*_{i} at time *t* > 0 conditional on *X*_{i} is given by

where *h*_{0}(*t*) is arbitrary, the unspecified non-negative function of time known as baseline hazard. It corresponds to the hazard when all predictor variables are equal to zero. β denotes the vector of the regression coefficients, which is estimated using the partial likelihood method. The term exp(*X*′β) depends on covariates, but not time. Significant variables at a 25% level of significance in the univariable analysis were included in the multivariable Cox PH model as applied by Hosmer et al. (5) and Bursac et al. (23). Variables were excluded from the model sequentially if they were neither significant predictor of graft survival nor confounders. The procedure for the purposeful method of variable selection is detailed in Hosmer and Lemeshow (5). Under the Cox PH model, a continuous covariate is assumed to have a log-linear functional form. Sometimes the effect of a covariate may not be in a linear association with the log-hazard. Hence, assuming a linear effect when a non-linear effect is applicable results in misspecification, which definitely affects the estimated coefficients and standard errors. The functional form of the continuous covariates was assessed using the plot of martingale residual from a null model and cumulative sums of martingale residual plot (24–26).

One restrictive assumption of the Cox PH model is the PH assumption. The hazard of two individuals with covariates *X*_{1} and *X*_{2} is said to be proportional when the hazard ratio is constant over time. That is, $HR=\frac{h(t,{X}_{1})}{h(t,{X}_{2})}=\frac{{h}_{0}(t)exp(\beta {X}_{1})}{{h}_{0}(t)exp(\beta {X}_{2})}=\frac{exp(\beta {X}_{1})}{exp(\beta {X}_{2})}=exp\left\{\beta ({X}_{1}-{X}_{2})\right\}$. This implies that the ratio of the two hazards remains proportional or constant over time. When the hazard ratio (HR) of a variable is not constant over time, the covariate is said to have a non-proportional or time-varying effect on survival, which suggests that the effect of the covariate changes over time. Test and graphical methods based on scaled Schoenfeld residuals (${\text{r}}_{i}^{w}={n}_{e}\mathrm{\text{var}}(\widehat{\beta}){\mathbf{\text{r}}}_{i}$) and technique based on cumulative sums of martingale residuals ($U(\widehat{\beta},t)=\sum _{i=1}^{n}{X}_{i}{\widehat{M}}_{i}(t)$) (25, 27) were used to verify the validity of proportionality for each selected covariate in the final model. The scaled Schoenfeld residuals vs. time were plotted for each covariate. Under common definition, these residuals are expected to randomly distribute around the zero line slope if proportionality is valid. Also, the observed processes plotted along with 50 simulated processes under the null hypothesis of no model misspecification were compared. The non-proportional hazard assumption for any covariate is revealed if the observed processes are atypical of the simulated processes. A clear lack of fit could be concluded for the Cox PH model due to time-varying covariates effects in the model, which violates the PH assumption.

### 3.2. Additive Hazard Model

To circumvent PH assumption and characterize the nature of the time-varying covariates effects through the cumulative regression function plots, we employed the Aalen additive hazard model, given by

Similar to model 1, *h*_{0} and γ represent the baseline hazard function and vector of time-varying regression coefficients, which may change in magnitude and even sign over time. The flexibility of the Aalen additive hazard is tempered due to the difficulty indirect estimation of the coefficients function. Hence, the cumulative regression coefficients version is estimated based on the least square estimation of the integrated coefficients ${\beta}_{i}(t)={\int}_{0}^{t}{b}_{i}(u)du,i=1,\dots ,p$. These effects are graphed against time to investigate if the covariates in model 1 have time-varying or constant effects over time. The more beta is from 0, the higher the impact the coefficients has had on the hazard of graft failure over the period of follow-up. As well, a positive and a negative slope with an increase in covariates indicate an increase and a decrease in hazard, respectively. For a covariate with significant effect, the confidence bands are likely not to include the zero line. Both the Kolmogorov-Smirnov and Cramer Von Mises tests (28) were used to assess the time-invariant effects of the covariates. The cumulative martingale residual was used to assess the fit of the covariates in the Aalen additive hazard model. To further assess the nature of these covariates, we fitted a variation of model 2, given by

In this version of the additive model, γ_{b}(*t*) and γ_{a} represent a vector of covariates with time-varying and constant effects, respectively. A successive test was done to compare the result of this model and that of the previous models.

### 3.3. Accelerated Failure Time Models

All the significant variables from the Cox PH model were also used to fit AFT models. We used the shape of the hazard function to select the appropriate AFT models, as reported by Khanal et al. (29). The baseline hazard function profile (Figure S1) displays a monotone decreasing hazard, which is closer to log-logistic (when *k* ≤ 1), log-normal (when σ > 1) and Weibull distributions (when γ < 1) (12). The survival functions of the selected distributions are $S\left(t\right)=1-\Phi \left(\frac{logt-\mu}{\sigma}\right)$, *S*(*t*) = {1+^{e}^{θ}*t*^{k}}^{−1} and *S*(*t*) = exp(−λ*t*^{γ}). The distributions are characterized by the location or scale (μ, θ, λ) and shape (σ, *k*, γ) parameters. In the AFT model, the effect of covariates is constant and act multiplicatively on survival times. The log-linear relationship between the variables and the log of survival time is given by

where μ is the model intercept, α is a vector of regression coefficients quantitatively expressing the impact of each explanatory variable on the survival time. A negative value of α indicates that survival time increases with decreasing value of the explanatory variable and vice versa. The exp(α′*X*) is usually referred to as the acceleration factor. σ is the scale parameter and ϵ is the error term, which is assumed to have a specific distribution, such as a logistics or normal distribution. The deviation of log*T* from linearity is modeled by the error term. The distribution *T* is based on the probability distribution of ϵ, and the survival function for *T* can be obtained from the survival function of the distribution of ϵ. The Akaike information criterion defined by *AIC* = −2*l* + 2*k* (30), where l is the log-likelihood of the model and k is the total number of parameters in the model, was used to compare the fit of the AFT models. The best performing model was used to compare the results of the Cox PH and Additive hazard models. To draw valid inferences from the best-performing models, Deviance residuals were used to assess the adequacy of the selected model (5). The deviance residual is express as

where the quantity *r*_{Mi} is the martingale residual. The sign function defined by sign(.) takes the value −1 or +1 if its argument is negative or positive, respectively. The deviance residuals are normalized transformations of the martingale residuals and have a mean of zero. If the model is valid, the *r*_{Di} are more symmetrically distributed around zero compared to *r*_{Mi}. The R codes used to perform the analysis is included in the Supplementary Material.

## 4. Results

### 4.1. Descriptive Statistics

The descriptive information available for the 915 patients in this study is summarized in Table 1. Majority of the patients (85%) received a kidney from cadaveric donors, and white patients accounted for 56% of the total patients in the study. The unadjusted graft failure rates for the study variable categories are also listed in Table 1. Most transplant cases concentrated before 1992, which is the midpoint of kidney transplantation in this study (Figure 2A). We observed 43% cases of graft failure by the end of the study; hence, the censoring rate is about 58%. Graft survival at 1, 5, 10 and 15 years are 81% (95% CI: 78–84%), 66% (95% CI: 63–70%), 50% (95% CI: 47–55%), and 37% (95% CI: 32–42%), respectively. The median follow-up was ~10 years, about 17% grafts failed after the 1st year of follow-up and this period has the highest hazard rate of GF (Figures 2B–D). 18% of the cases have missing observations in their records and there are no missing values in the time variables. The Little's MCAR results show that these observations are missing completely at random (*p*-value = 0.206). MissForest method of imputation was used to address the issue of missing data in this study and the reliability of the method was assessed. The out-of-bag errors estimated by missForest for the continuous variable and categorical variables are 0.02 and 0.14, respectively. This shows good performance of missForest in imputing missing data because the values are close to zero than 1.

**Table 1**. Characteristics of kidney transplant recipient in CMJAH from 1984 to 2000 and partial likelihood ratio test *p*-value for all the study covariates.

**Figure 2**. Exploratory data analysis of the transplant data showing **(A)** barplot of years of kidney transplantation, **(B)** histogram plot of graft survival time variable, **(C)** KM plot of graft survival, (+) indicates censoring, and **(D)** Smoothed graft failure rate per 1,000 PY.

### 4.2. Result From the Cox Proportional Hazard Model

The first step considered in the model building procedure was to explore the relationship between each covariate and graft survival time, univariately. At 25% level of significance, evidence of association with GS is suggested for some variables (Table 1). These variables were deemed candidate for inclusion in the multivariable model.

The multivariable model containing all the significant covariates in the univariable analysis was fitted (Table S1). In order to simplify the model, *p*-values of the covariates based on the partial likelihood test were examined. “Donor age” has the largest *p*-value (*p* = 0.654), which is not statistically significant. Omitting this covariate and refitting the model results in the likelihood ratio (LR) test of 0.202 (Table 2). This is not significant (*p* = 0.653) at 5% level, indicating no improvement over the full model. Furthermore, the change in coefficients $(\Delta \widehat{\beta})$ for each covariate remaining in the model was compared with the original model, the result (Table 2) shows that donor age is neither a significant predictor of graft survival nor a confounder. Next, “Histological acute rejection” and “Urological ESKD” were subsequently removed from the model. The LR tests with *p*-values of 0.349 and 0.227, respectively, show that the model without these covariates is not statistically different from the model with these covariates (Table 2). However, the removal of “Urological ESKD” influenced the coefficients of “Renal disease ESKD” and “Hypertension” by more than 15%. “Urological ESKD” would have been retained in the model if “Renal disease ESKD” and “Hypertension” were significant predictors of graft survival at 10% level of significance. Therefore, we considered “Urological ESKD” as an unimportant confounder and exclude the three variables from the model.

**Table 2**. Partial likelihood ratio test indicating the effect of deleting covariates that are not significant in the multivariable analysis and their highest impact in coefficient change for other covariates.

There is no significant change in the value of $-2LL(\widehat{\beta})$ on deleting “inherited ESKD” and “Hypertension” from model 2, sequentially (*p*-value = 0.169 and 0.145). The deletion did not confound the relationship of any covariate remaining in the model and graft survival. The final covariates in the multivariate model at this stage is shown in Model 3 (Table S1).

In the next stage, “Donor recipient-gender match,” “Donor-recipient blood group match,” and “Clinical acute rejection” that were not significant in the univariable analysis were sequentially added in the multivariable Model 2). Only “Donor-recipient gender” shows a significant relationship with graft survival, with LR test of 4.908 (*p* =< 0.027). Hence, we re-consider this variable at this stage of model building (Model 4; Table 2). The summary of Model 4 is shown in Table S1. We compared the variables selected in the final model with an automated method of variable selection, such as stepwise and best-subset (Table S3). We observed that automated methods are susceptible to selecting more variables, which are not significantly related to GS at 5–10% significant levels.

#### 4.2.1. Assessment of Linearity Assumption

The next step was to assess the functional form of “Recipient age,” as the only continuous variable in the final model (Model 4, Table S1). Figure 3 shows a plot of the martingale residual from a null model and the cumulative martingale residual. The smoothing spline fit shows evidence of linearity for this variable. It is also obvious that the observed processes for this variable are more typical with the 20 simulated realizations from the null distribution with a complimentary *p*-value of 0.100. This indicates that a linear term is needed for “Recipient age” in the model.

**Figure 3**. Linearity assumption assessment. **(A)** Smoothed martingale residual plot from a null Cox PH model vs. recipient age. **(B)** Cumulative martingale residuals plot vs. recipient age (*p* = 0.095).

#### 4.2.2. Assessment of PH Assumption and Overall Goodness-of-Fit

There was no significant two-way interaction between the covariates in the model at 5% level of significance. We assessed the assumption of the Cox PH model to confirm if the covariates interpreted above only shift the baseline hazard up or down, but does not change over the lifetime of a graft. Figure S2 shows evidence of time-varying effects for some covariates in the model, given that the curves seem not to drift apart steadily, as should be expected in the case of constant effects. Table 3 shows the *p*-values of tests based on the scaled Schoenfeld and cumulative residuals for non-proportional hazard assessment. The results of the two tests support evidence of deviation from the proportionality assumption as shown in Table 3. The results are graphically illustrated for each covariate in the Cox PH model (Figures S3, S4). These figures suggest non-constant effects over time for the aforementioned variables. However, when these covariates interacted with time in the extended Cox PH model, only recipient ethnicity shows a non-constant effect (Table S2). The non-constant effect of these covariates indicate a lack of fit in the Cox PH model, which could lead to misleading parameter interpretation.

**Table 3**. Non-proportionality test in the Cox PH model, *p*-values for scaled Schoenfeld residuals and cumulative residuals (*) tests.

### 4.3. Result From Additive Hazard Models

The covariates in model 4 (Table S1) were used to fit the Aalen additive hazard model. The result is comparable to the Cox PH model in identifying the risk factors of graft survival (Table 4). However, the Kolmogorov-Smirnov test shows some evidence of time-varying effect for Donor type and recipient ethnicity in this table (*p*-values < 0.05), this is supported by Von Cramer Mises test (result not included). The plot of the cumulative regression coefficients for the Aalen model is shown in Figure 4. There is a linear increase in the hazard of graft failure with an increase in the recipient age and its confidence interval does not include the zero line, indicating that age has a significant effect on the hazard of graft failure over the years of follow-up. The 95% confidence interval for other plots include the zero line at some time point, indicating covariates with early (e.g., Delayed graft function) and late (donor type) significant effects on graft survival. Only the plot for donor type has a negative effect on graft survival, the effect at some points flattens before it steeply decreases linearly, which by the test is an indication of a time-varying effect. The cumulative plot for recipient ethnicity shows a curvilinear pattern, it displays a steep increase at the beginning of the follow-up and shows a roughly zero slope after the first 10 years. The plot suggests a time-varying effect for recipient ethnicity and also that this covariate may not have a late significant effect on graft survival.

**Table 4**. Tests for non-significant and time-varying effects of the covariates in the Aalen additive hazard model.

**Figure 4**. Estimates of cumulative hazard risk with a 95% pointwise confidence interval based on Aalen's additive model.

The cumulative martingale residual together with 50 simulated processes (Figure 5) under the Aalen model shows that the covariates' behavior is more typical with the model (*p*-values > 0.05), indicating a good fit of Aalen model. The result of the semi-parametric version of the Aalen model is shown in Table 5, all the covariates as previously reported shows significant effects on graft survival. For the covariates with constant effects as suggested by the Aalen model, their estimates are shown in Table 5.

**Table 5**. Analysis of risk factors based on the Cox PH, McKeague and Sasieni hazard, and Weibull AFT models.

### 4.4. Result From Parametric AFT Models

The AIC values of the models are 2,444, 2,492, and 2,471 for Weibull, lognormal, and log-logistic models, respectively. The rule is that any model that conforms to the observed data should adequately lead to a smaller AIC. Based on this, the Weibull model is the best-performing model. The distribution of the deviance residuals from the Weibull model is mostly within the range of ±3 except three observations that are slightly outside this bound (Figure 6). The result of the Weibull model is similar to that of the Additive hazard models in detecting the significant predictors of graft survival and their directional effects (positive or negative effect), although, the interpretations are not the same. For instance, in Table 5, the semi-parametric additive hazard model shows that female patients that received a kidney from female donors had an increase in the hazard of 0.0327 compared with male patients that received a kidney from male donors. Conversely, the Weibull model shows that female recipients of a kidney from female donors had 44% lower graft survival in comparison to male patients that received from male donors. The Weibull models show that the influence of all the predictors except donor type, decelerate graft survival time. Every additional increase in age, on the average age of the recipients, is associated with 5% decrease in graft survival, this indicates that the older the patient, the higher the hazard of graft failure. This is similar to what is observed in the additive hazard model. Also, the results show that graft survival is prolonged (more than twice) among patients that received live kidney transplant compared with those that received a cadaveric kidney transplant.

## 5. Discussion

In this study, 915 adult patients that underwent a kidney transplant at Charlotte Maxeke Academic Hospital Johannesburg, South Africa were analyzed. This study attempts to appropriately employ more statistically justifiable strategies in selecting the best combination of predictors that influence long-term GS post-kidney transplantation. The method of imputation used in this study has been shown (in studies using different biological and medical datasets) to outperform imputation methods, such as multivariate imputation by the chained equation, nearest neighbor and mean imputation (20, 31). The Cox PH model is the most attractive survival model when a set of covariates are of interest in modeling time to graft failure. Fitting a large number of variables in this model could add noise to the estimated quantities, resulting in collinearity among variables and increase the cost of modeling unnecessary predictors. The purposeful variable selection method based on the Cox PH model becomes more complex when there are too many predictors in the data. However, this procedure of model building involves a combination of science, statistical method, experience and common sense (32). The purposeful method has been applied in previous studies (5, 23, 32). These studies comparatively showed that purposeful variable selection method leads to significant variables, confounding factors and a richer model compared with other selection methods; when prediction and identification of risk factors are of interest.

Evaluating the PH assumption for all predictors in the Cox PH model should be a fundamental aspect of the modeling process when using the Cox PH model. Including variable(s) not satisfying the PH assumption leads to an inferior fit of a Cox model i.e., the power of tests is reduced for both variables with constant and non-constant HR in the model. Our results provide evidence of time-varying effects for the covariates in the Cox PH model. This shows that it is necessary to assess this assumption based on the fact that clinical variables effects are rarely constant over time.

The Cox PH, additive hazard and AFT models are used in survival to study the association between risk factors and the event of interest in failure time data. The appropriateness of the individual model may not be known in advance for a specific application. The models may capture the risk process equally or sometimes give a different result (10, 29, 33). For many application in public health, the additive hazard may be plausible since the result gives differences in hazard, rather than a hazard ratio. The same applies to the straightforward interpretation of TR as compared to HR. These models may be compared with regards to the *p*-values of the covariates in the model, since the greatness of *p*-value shows the power to reject the null hypothesis (10).

We identified that factors, such as “donor age” and “acute rejections” previously shown to be important risk factors of GS (34–36) are neither significant nor confounders in this study. The difference between the findings of this present study and these previous studies could be linked to differences in sample size (number of graft failures observed), year of transplant, duration of follow-up and method of data analysis. Nevertheless, it is noteworthy that the significant predictors of GS observed in this study are in agreement with previous studies (11, 16–18, 34, 35, 37, 38).

Prognostic assessment with the Cox PH model is generally based on patients/donors characteristics at the time of evaluation. These characteristics have a greater tendency to change, following a long period of study. We have shown in this study that when long-term follow-up is of interest, survival prediction may be discordant with the Cox PH model. We have statistically shown that the Cox PH model did not capture all the significant aspects of the data analysis and did not provide adequate fit in this study. We were able to investigate the time-varying covariate effects with the Aalen additive model and fully estimates the effects of the covariate with the AFT model. The need to explore beyond the Cox PH model is revealed in the Aalen plot, the plot can aide a nephrologist to understand the pattern by which the covariates influence graft survival after transplantation. Considering censored quantile regression model could be alternatives when the PH assumption is not valid in the Cox PH model.

This study has several important strengths. We have used a rational approach in analyzing the kidney transplant data generated from a South African transplant cohort study. The results of this historical data analysis could help to understand long-term performance and progress of kidney transplant outcome in this region, and how the risk factors influence the survival of the graft after kidney transplant. The analysis involves a combination of both recipients of a cadaveric and living donor kidney transplant, focusing on graft failure because maximizing graft is paramount important to the recipient of a new kidney and transplant unit. We found in this study that predictors of graft survival may exhibit time-varying effects.

On the other hand, this study also has some methodological limitations. We found that multicollinearity is a problem in using the purposeful method of variable selection, especially when the covariates are highly related. Specifically, we noticed that dropping any of the causes of ESKD influences the coefficients of others. Taking a decision on which variable to add or retain some times is challenging. However, because the procedure is governed by a specific rule at each step, the choice or decision to drop or retain any variable was critically assessed to avoid multicollinearity in the final model. In addition, 57% censoring observed is another limitation in this study.

## 6. Conclusion

Additive hazard and AFT models are yet to receive more deserving attention in modeling GS after a kidney transplant. When covariate effects involve certain patterns of heterogeneity in kidney transplant studies, additive hazard and AFT models could offer great flexibility in modeling GS time. The models used in this study describe different features of the relationship between the risk factors and graft failure. Hence, it appears necessary to use these models complementarily to gain a more comprehensive understanding of GS after a kidney transplant.

## Data Availability

The datasets generated for this study are available on request to the corresponding author.

## Ethics Statement

Ethical clearance for JF, approved by the Wits Human Research Ethics Committee. Medical clearance certificate number: M121186.

## Author Contributions

OA performed the data cleaning and analysis, interpreted the ensuing results, wrote, and edited the manuscript. JF provided the data used for this study, supervised the data cleaning steps, and edited the manuscript. EM supervised the analysis and the manuscript production, and edited the manuscript.

## Funding

OA was supported by DELTAS Africa Initiative-SSACAB with funding from the GlaxoSmithKline (GSK). Grant No. 107754/Z/15/Z-DELTAS Africa Sub-Saharan Africa Consortium for Advanced Biostatistics (SSACAB) programmer.

## Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Acknowledgments

The authors would like to thank Dr. Petra Gaylard for her assistance in cleaning the study data. The authors would also like to acknowledge the staff and the patients from the Renal Transplant Unit at Charlotte Maxeke Johannesburg Academic Hospital and all the donors who make kidney transplant possible for patients with kidney failure.

## Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpubh.2019.00201/full#supplementary-material

## References

1. Soliman AR, Fathy A, Roshd D. The growing burden of end-stage renal disease in Egypt. *Renal Fail.* (2012) 34:425–8. doi: 10.3109/0886022X.2011.649671

2. Zukowski M, Kotfis K, Kaczmarczyk M, Biernawska J, Zukowska A, Sulikowski T, et al. Influence of selected factors on long-term kidney graft survival: a multivariable analysis. *Transplant Proc.* (2014) 46:2696–8. doi: 10.1016/j.transproceed.2014.08.016

4. Irish W, Ilsley J, Schnitzler M, Feng S, Brennan D. A risk prediction model for delayed graft function in the current era of deceased donor renal transplantation. *Am J Transplant.* (2010) 10:2279–86. doi: 10.1111/j.1600-6143.2010.03179.x

5. Hosmer DW Jr, Lemeshow S, May S. *Applied Survival Analysis: Regression Modeling of Time-to-Event Data*. Vol. 618. New York, NY: Wiley-Interscience (2008).

6. Bellera CA, MacGrogan G, Debled M, de Lara CT, Brouste V, Mathoulin-Pélissier S. Variables with time-varying effects and the Cox model: some statistical concepts illustrated with a prognostic factor study in breast cancer. *BMC Med. Res. Methodol.* (2010) 10:20. doi: 10.1186/1471-2288-10-20

7. Aalen O. A model for nonparametric regression analysis of counting processes. In: Klonecki W, Kozek A, Rosiński J, editors. *Mathematical Statistics and Probability Theory*. New York, NY: Springer (1980). p. 1–25.

10. Abadi A, Saadat S, Yavari P, Bajdik C, Jalili P. Comparison of Aalen's additive and Cox proportional hazards models for breast cancer survival: analysis of population-based data from British Columbia, Canada. *Asian Pac J Cancer Prev.* (2011) 12:3113–6.

11. BAŞAR E. Aalen's additive, Cox proportional hazards and the Cox-Aalen model: application to kidney transplant data. *Sains Malays.* (2017) 46:469–76. doi: 10.17576/jsm-2017-4603-15

12. Klein JP, Moeschberger ML. *Survival Analysis: Techniques for Censored and Truncated Data*. New York, NY: Springer Science and Business Media (2005).

13. Hashemian AH, Beiranvand B, Rezaei M, Reissi D. A comparison between Cox regression and parametric methods in analyzing kidney transplant survival. *World Appl Sci J.* (2013) 26:502–7. doi: 10.5829/idosi.wasj.2013.26.04.12203

14. Myburgh J, Shapiro M, Maier G, Meyers A. HL-A and cadaver kidney transplantation. 7-year experience at Johannesburg General Hospital. *S Afr Med J.* (1976) 50:1279–84.

15. Myburgh J, Botha J, Meyers A, Smit J, Milne F, Thomson P, et al. The treatment of end-stage renal disease at the Johannesburg Hospital: a 17-year experience. Part II. The role of transplantation. *S Afr Med J.* (1983) 64:522–7.

16. Fabian J, Maher H, Bentley A, Gaylard P, Crymble K, Rossi B, et al. Favourable outcomes for the first 10 years of kidney and pancreas transplantation at Wits Donald Gordon Medical Centre, Johannesburg, South Africa. *S Afr Med J.* (2016) 106:172–6. doi: 10.7196/SAMJ.2016.v106i2.10190

17. Hariharan S, Mcbride MA, Cherikh WS, Tolleris CB, Bresnahan BA, Johnson CP. Post-transplant renal function in the first year predicts long-term kidney transplant survival. *Kidney Int.* (2002) 62:311–8. doi: 10.1046/j.1523-1755.2002.00424.x

18. Wafa EW, Shokeir AA, Akl A, Hassan N, Fouda MA, El Dahshan K, et al. Effect of donor and recipient variables on the long-term live-donor renal allograft survival in children. *Arab J Urol.* (2011) 9:85–91. doi: 10.1016/j.aju.2011.07.003

19. Little RJ. A test of missing completely at random for multivariate data with missing values. *J Am Stat Assoc.* (1988) 83:1198–202.

20. Stekhoven DJ, Bühlmann P. MissForest–non-parametric missing value imputation for mixed-type data. *Bioinformatics.* (2011) 28:112–8. doi: 10.1093/bioinformatics/btr597

21. Pan R, Yang T, Cao J, Lu K, Zhang Z. Missing data imputation by K nearest neighbours based on grey relational structure and mutual information. *Appl Intell.* (2015) 43:614–32. doi: 10.1007/s10489-015-0666-x

23. Bursac Z, Gauss CH, Williams DK, Hosmer DW. Purposeful selection of variables in logistic regression. *Source Code Biol Med.* (2008) 3:17. doi: 10.1186/1751-0473-3-17

24. Therneau TM, Grambsch PM, Fleming TR. Martingale-based residuals for survival models. *Biometrika.* (1990) 77:147–60.

25. Lin DY, Wei LJ, Ying Z. Checking the Cox model with cumulative sums of martingale-based residuals. *Biometrika.* (1993) 80:557–72.

26. Therneau TM, Grambsch PM. *Modeling Survival Data: Extending the Cox Model*. New York, NY: Springer Science and Business Media (2000).

27. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. *Biometrika.* (1994) 81:515–26.

28. Martinussen T, Scheike TH. *Dynamic Regression Models for Survival Data*. New York, NY: Springer Science and Business Media (2007).

29. Khanal SP, Sreenivas V, Acharya SK. Accelerated failure time models: an application in the survival of acute liver failure patients in India. *Int J Sci Res.* (2014) 3:161–66.

30. Akaike H. Information theory and an extension of the maximum likelihood principle. In: Parzen E, Tanabe K, Kitagawa G, editors. *Selected Papers of Hirotugu Akaike*. New York, NY: Springer (1998). p. 199–213.

31. Waljee AK, Mukherjee A, Singal AG, Zhang Y, Warren J, Balis U, et al. Comparison of imputation methods for missing laboratory data in medicine. *BMJ Open.* (2013) 3:e002847. doi: 10.1136/bmjopen-2013-002847

32. Zhang Z. Model building strategy for logistic regression: purposeful selection. *Ann Transl Med.* (2016) 4:111. doi: 10.21037/atm.2016.02.15

33. Giolo SR, Krieger JE, Mansur AJ, Pereira AC. Survival analysis of patients with heart failure: implications of time-varying regression effects in modeling mortality. *PLoS ONE.* (2012) 7:e37392. doi: 10.1371/journal.pone.0037392

34. Tan JC, Kim JP, Chertow GM, Grumet FC, Desai M. Donor–recipient sex mismatch in kidney transplantation. *Gend Med.* (2012) 9:335–47. doi: 10.1016/j.genm.2012.07.004

35. González-Molina M, Burgos D, Cabello M, Ruiz-Esteban P, Rodríguez MA, Gutiérrez C, et al. Impact of immunosuppression treatment on the improvement in graft survival after deceased donor renal transplantation: a long-term cohort study. *Nefrologia.* (2014) 34:570–8. doi: 10.3265/Nefrologia.pre2014.Jun.12327

36. Koo EH, Jang HR, Lee JE, Park JB, Kim SJ, Kim DJ, et al. The impact of early and late acute rejection on graft survival in renal transplantation. *Kidney Res Clin Pract.* (2015) 34:160–4. doi: 10.1016/j.krcp.2015.06.003

37. Morales JM, Marcén R, del Castillo D, Andres A, Gonzalez-Molina M, Oppenheimer F, et al. Risk factors for graft loss and mortality after renal transplantation according to recipient age: a prospective multicentre study. *Nephrol Dial Transplant.* (2012) 27:39–46. doi: 10.1093/ndt/gfs544

Keywords: graft survival, time varying covariate effect, Cox PH model, purposeful selection, additive hazard models

Citation: Achilonu OJ, Fabian J and Musenge E (2019) Modeling Long-Term Graft Survival With Time-Varying Covariate Effects: An Application to a Single Kidney Transplant Centre in Johannesburg, South Africa. *Front. Public Health* 7:201. doi: 10.3389/fpubh.2019.00201

Received: 19 April 2019; Accepted: 08 July 2019;

Published: 25 July 2019.

Edited by:

Jim Todd, London School of Hygiene and Tropical Medicine (LSHTM), United KingdomReviewed by:

Birhanu Ayele, Stellenbosch University, South AfricaInnocent B. Mboya, University of KwaZulu-Natal, South Africa

Copyright © 2019 Achilonu, Fabian and Musenge. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Eustasius Musenge, Eustasius.Musenge@wits.ac.za