Abstract
We apply a pattern-based classification method to identify clinical and genomic features associated with the progression of chronic kidney disease (CKD). We analyze the African-American Study of Chronic Kidney Disease with Hypertension (AASK) dataset and construct a decision-tree classification model consisting of 15 combinatorial patterns of clinical features and single nucleotide polymorphisms (SNPs), seven of which are associated with slow progression and eight with rapid progression of renal disease among AASK patients. We identify four clinical features and two SNPs that can accurately predict CKD progression. Clinical and genomic features identified in our experiments may be used in a future study to develop new therapeutic interventions for CKD patients.
1 Introduction
The main function of the kidney is to remove excess water and waste products from the blood. It also helps to regulate the levels of minerals such as sodium, calcium, and potassium in the blood. Chronic kidney disease (CKD), also known as renal disease, occurs when the kidney loses its function gradually and usually permanently. CKD, defined by reduced glomerular filtration rate (GFR), proteinuria, or structural kidney disease, is a growing public health problem worldwide (see Footnote 1). Many subjects with renal disease of most etiologies progress to severe renal failure and/or end stage renal disease (ESRD), requiring renal replacement therapy, which may involve a form of dialysis or renal transplantation (Lewis et al., 1993; Klahr et al., 1994; DCCT, 1995; Brenner et al., 2001; Lewis et al., 2001; Wright et al., 2002; Niki et al., 2015). However, the progression rate of CKD is very heterogeneous (Lindeman et al., 1985; Lindeman, 1990; Hallan et al., 2006). While a few predictive factors for progression, such as proteinuria, have been detected, identifying those at risk of progression remains a significant problem. It has also been established that several therapies can ameliorate the progression of renal disease, including ACE inhibitors, blood pressure control, tight diabetes control, and perhaps low-protein diets; however, in trials examining these therapeutic modalities, there remains a very significant risk of progression of renal disease in subjects receiving optimal therapy (Lewis et al., 1993; Klahr et al., 1994; DCCT, 1995; Brenner et al., 2001; Lewis et al., 2001; Wright et al., 2002; Niki et al., 2015).
The African-American Study of Chronic Kidney Disease with Hypertension (AASK) was motivated by the high rate of hypertension-related chronic kidney disease in the African-American population and the scarcity of effective therapies. The study was a 21-center, randomized, double-blinded treatment trial of 1,094 African-American patients with hypertension, aged 18 to 70 years. Patients had renal failure with GFR between 20 and 65 ml/min/1.73 m². Patients were randomized to the angiotensin-converting enzyme inhibitor (ACEi) ramipril, the β-blocker (BB) metoprolol, or the dihydropyridine calcium channel blocker (CCB) amlodipine, and to usual (mean arterial pressure (MAP) 102–107) or low (MAP ) blood pressure (BP) goals. The rationale for the treatment arms was that human and animal data suggested that ACEi and CCB might slow progression of renal disease independent of their BP effects (Lewis et al., 1993; Hallan, 1998), and data from observational and treatment studies suggested that a lower BP might have beneficial effects (Klahr et al., 1994; Klag et al., 1997). Although other studies had attempted to achieve a MAP separation (Hansson et al., 1998; Lewis et al., 2001), AASK is the first major trial to actually achieve this goal. The primary outcome was the rate of decline of GFR (GFR slope) based on iothalamate GFR studies at 6-month intervals, with a secondary clinical composite outcome of end stage renal disease (ESRD), a or 50% drop in GFR from baseline (GFR event), or death (Subasi et al., 2017).
The initial AASK results were not conclusive (Wright et al., 2002). While the adopted therapy was shown to slow the progression of renal disease, there was still a high rate of progression to renal failure. The CCB arm of the study was stopped early when interim analysis indicated that CCB was inferior to both BB and ACEi in patients with urine protein/creatinine ratio (about 300 proteinuria/24 h) (Agodoa et al., 2001). The low BP goal of the study did not improve outcomes: there was no beneficial effect of low MAP on the rate of progression of renal disease as defined by GFR slope or clinical composite outcomes (GFR events, ESRD, or death). Subsequently, a similar result was found in the REIN trial (Ruggenenti et al., 1999). Studies in Type 2 diabetes have demonstrated a linear relation of achieved BP to renal outcomes (Bakris et al., 2003; Pohl et al., 2005); however, all the patients in these studies were treated to the same goal BP, so rather than low BP being protective, the ability to achieve lower BPs may have defined a sub-population with a low risk of disease progression. Despite the lack of effect on renal outcomes in AASK, proteinuria was diminished by the lower BP goal. This finding is similar to that previously reported for diabetics (Lewis et al., 2001). Finally, a subgroup analysis in AASK did suggest that patients on a non-protective regimen (CCB) may have benefited from the low BP goal (Contreras et al., 2005). Most importantly, in AASK, ACEi decreased the number of events as compared to both BB and CCB (Wright et al., 2002). These data for ACEi vs. CCB are tabulated in Table 1 (risk reduction adjusted for baseline covariates) and were most dramatic for the hard outcomes, especially ESRD.
TABLE 1
| Ramipril vs. amlodipine | % Risk reduction | 95% CI | p-value |
|---|---|---|---|
| GFR event, ESRD or death | | | 0.004 |
| GFR event or ESRD | | | 0.006 |
| ESRD or death | | | |
| ESRD alone | | | |
Analysis of clinical composite outcomes. CI, confidence interval.
Several possible interventions such as blood pressure control (Wright et al., 2002), diabetes treatment (DCCT, 1995), controlling dietary protein intake (Klahr et al., 1994), and medications with possible renoprotective effects (Ruggenenti et al., 1999; Agodoa et al., 2001; Wright et al., 2002) have been tested in clinical trials. In all cases, the residual rate of progression of chronic kidney disease has remained significant. To date, there are few prediction models to identify which patients are likely to progress significantly. Subasi et al. (2017) identified serum proteomic patterns that can accurately distinguish rapid progression from slow progression among AASK patients. Parsa et al. (2013) examined the effects of variants in the gene encoding apolipoprotein L1 (APOL1) on disease progression and observed that renal risk variants in APOL1 were associated with higher rates of ESRD and progression of chronic kidney disease in African-American patients as compared to white patients. Other recent studies include Rahman et al. (2013), who compared the effects of two antihypertensive dosing schedules (PM dose and add-on dose) with usual therapy (AM dose) on nocturnal blood pressure in former AASK participants, and Chen et al. (2016), where the longitudinal changes in hematocrit in hypertensive renal disease were studied.
The goal of our current study is to apply a pattern-based classification method to identify clinical and genomic features that may serve as prognostic markers for the progression of renal disease among AASK patients. Clinical and genomic features identified in our analysis will be used in a future study to compare disease progression in white patients and African-American patients, both with and without apolipoprotein L1 (APOL1) high-risk variants. The ultimate goal of our AASK data analysis, started in Subasi et al. (2017) and continued in this work, is to identify new targets and provide a basis for new therapeutic interventions for chronic kidney disease.
2 Study Subjects
Closer inspection of the data highlights the current dilemma: although there is a decrease in the number of events with ACEi, there is still a residual event rate of in the trial as a whole and in subjects with urine protein/creatinine , a mild degree of proteinuria of (Figures 1 and 2). In addition, the event rate is essentially constant throughout the 5 years of the trial, indicating that the remaining patients are still at risk of progression. This finding is similar to that of other trials such as MDRD (Klahr et al., 1994; Hebert et al., 1997), the Collaborative Study Group Trial (Lewis et al., 1993), RENAAL (Brenner et al., 2001), and IDNT (Lewis et al., 2001).
FIGURE 1

AASK clinical composite events–all patients.
FIGURE 2

AASK clinical composite events–proteinuria.
Figure 3 indicates the significant heterogeneity of the progression rate of renal disease in the AASK Trial, where the rate of decline of GFR after 6 months in the trial (chronic GFR slope) is depicted in blue for each patient, from the most rapid decline (negative slope) on the left to the least rapid decline (positive slope) on the right. The expected rate of decline of GFR with aging is generally assumed to be (Berg, 2006; Murussi et al., 2006), although longitudinal studies have raised questions about this assumption (Lindeman et al., 1985; Lindeman, 1990). Based on this estimate, approximately of the AASK patients in Figure 3 did not progress (right side, slope ) while approximately progressed rapidly (left side, slope ). The figure also shows that proteinuria, the strongest predictor of progression rate reported in the literature, is not an ideal predictor: there are a number of slow progressors with significant proteinuria (red spikes, right), while a significant number of rapid progressors had no or minimal proteinuria (absence of red bars, left) (Subasi et al., 2017). These data are supported by the observation in genetics studies that proteinuria and progression of renal disease may be disparate phenotypes (Fogarty et al., 2000; Krolewski et al., 2006).
FIGURE 3

AASK Patients stratified by GFR slope with degree of proteinuria superimposed.
2.1 Pre-processing of AASK Data to Predict Progression of Renal Disease
An avenue that has not been carefully explored is a data mining approach to detect the combinations of clinical features and/or single nucleotide polymorphisms (SNPs) that better determine the population at risk for progression of CKD. The goal of this section is to identify combinatorial patterns of clinical features and SNPs that can accurately predict progression of renal disease among AASK patients. To achieve this, we study a selected subset of subjects from the AASK Clinical Trial, chosen on the basis of the glomerular filtration rate (GFR) slope of all AASK patients presented in Figure 3. The original AASK dataset contains 1,094 African-American patients with 88 clinical features and 130 SNPs. Before we start our analysis, we remove features with more than missing values in the dataset. We then remove AASK patients with missing GFR values and more than missing values. This results in 800 AASK patients with 77 clinical features and 113 SNPs. In order to develop a classification model that can predict the rate of decline of kidney function, we identify two “extreme” groups of patients whose disease progression is “slow” (GFR chronic slope ) or “rapid” (GFR chronic slope ). The two subsets of patients, referred to as slow progressors and rapid progressors, are selected from the AASK study based on the chronic GFR slope histogram presented in Figure 4. The resulting reduced dataset contains 138 AASK patients identified as rapid progressors and 75 AASK patients identified as slow progressors.
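The filtering steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the row layout, the function names, and in particular the numeric cutoffs (`max_missing_fraction`, `slow_cutoff`, `rapid_cutoff`) are placeholders chosen for a toy example and are not the thresholds applied to the AASK data.

```python
# Toy sketch of the pre-processing steps; all names and cutoffs are hypothetical.

def drop_sparse_features(rows, max_missing_fraction):
    """Return the feature names whose share of missing (None) values is within the limit."""
    n = len(rows)
    return [
        name for name in rows[0]
        if sum(row[name] is None for row in rows) / n <= max_missing_fraction
    ]

def drop_incomplete_patients(rows, features, slope_key, max_missing_fraction):
    """Drop patients without a GFR slope or with too many missing feature values."""
    kept = []
    for row in rows:
        if row[slope_key] is None:
            continue
        missing = sum(row[name] is None for name in features) / len(features)
        if missing <= max_missing_fraction:
            kept.append({name: row[name] for name in features})
    return kept

def extreme_groups(rows, slope_key, slow_cutoff, rapid_cutoff):
    """Slow progressors have slope >= slow_cutoff, rapid ones slope <= rapid_cutoff."""
    slow = [r for r in rows if r[slope_key] >= slow_cutoff]
    rapid = [r for r in rows if r[slope_key] <= rapid_cutoff]
    return slow, rapid

# Synthetic rows for illustration only (not AASK data).
rows = [
    {"chronic_slope": -6.0, "proteinuria": 1.2, "sparse": None},
    {"chronic_slope": -0.5, "proteinuria": 0.1, "sparse": None},
    {"chronic_slope": 1.0, "proteinuria": None, "sparse": None},
    {"chronic_slope": None, "proteinuria": 0.3, "sparse": 2.0},
    {"chronic_slope": -3.0, "proteinuria": 0.4, "sparse": None},
]

features = drop_sparse_features(rows, max_missing_fraction=0.5)
patients = drop_incomplete_patients(rows, features, "chronic_slope", 0.5)
slow, rapid = extreme_groups(patients, "chronic_slope", slow_cutoff=0.0, rapid_cutoff=-2.5)
```

Patients whose slope falls between the two cutoffs belong to neither group, which is what makes the two retained groups "extreme".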
FIGURE 4

Chronic GFR slope of AASK patients in the reduced data.
Figure 5 shows the principal component analysis (PCA) plot of the AASK patients in the reduced dataset. Table 2 describes the patient population for this study. As can be seen from the table, proteinuria differs markedly between the two progression groups, which supports previous studies showing that proteinuria is the strongest predictor of GFR slope progression in AASK (Wang et al., 2006).
FIGURE 5

PCA plot of AASK patients in the reduced data: * Rapid Progressors and * Slow Progressors
TABLE 2
| Basic Clinical Features | Rapid Progressors | Slow Progressors |
|---|---|---|
| Chronic slope | | |
| GFR | | |
| Proteinuria | | |
| Age | | |
| Weight (kg) | | |
| Height (cm) | | |
| BMI | | |
Baseline characteristics of study population.
2.2 Identification of Significant Clinical and Genomic Features
The resulting AASK dataset, consisting of 138 rapid progressors, 75 slow progressors, 77 clinical features, and 113 SNPs, is further investigated to remove features that are irrelevant for distinguishing rapid progressors from slow progressors. To obtain a classification model effectively and efficiently, we first apply a correlation-based feature selection procedure (Hall and Smith, 1998) to retain only the features that successfully distinguish between rapid and slow progressors in the AASK data. Correlation-based feature selection evaluates the worth of a subset of features by considering the individual predictive ability of each feature along with the degree of redundancy between them. Subsets of features that are highly correlated with the outcome (rapid/slow progression) while having low intercorrelation are preferred. The AASK data are randomly partitioned into ten approximately equal parts; one part is designated as the “test set”, correlation-based feature selection is carried out on the remaining nine parts, which form the “training dataset”, and the result is then evaluated on the cases in the test set. This procedure is repeated ten times, each time taking a different one of the ten parts as the test set. The patients are then re-randomized into ten new subsets and the whole procedure is repeated nine additional times, for a total of 100 tests.
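The trade-off between relevance and redundancy described above is commonly expressed through the merit heuristic of correlation-based feature selection, Merit = k·r_cf / sqrt(k + k(k − 1)·r_ff), where k is the subset size, r_cf the mean feature-outcome correlation, and r_ff the mean feature-feature correlation. The sketch below illustrates this heuristic with a greedy forward search. It rests on assumptions: Pearson correlation stands in for whichever correlation measure the WEKA implementation uses, the greedy search is one possible search strategy, and the feature names and values are synthetic.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def cfs_merit(columns, subset, y):
    """Merit = k*r_cf / sqrt(k + k*(k-1)*r_ff) for the given feature subset."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[f], y)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(columns, y):
    """Add the feature that most improves the merit until no feature improves it."""
    selected, best = [], 0.0
    remaining = set(columns)
    while remaining:
        merit, feat = max((cfs_merit(columns, selected + [f], y), f)
                          for f in sorted(remaining))
        if merit <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = merit
    return selected, best

# Synthetic example: outcome coded 1 = rapid, 0 = slow (hypothetical values).
y = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "marker": [0.9, 0.8, 1.0, 0.7, 0.1, 0.2, 0.0, 0.3],
    "marker_copy": [0.9, 0.8, 1.0, 0.7, 0.1, 0.2, 0.0, 0.3],  # fully redundant
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],                        # unrelated to outcome
}
selected, merit = greedy_cfs(columns, y)
```

In this toy run the search stops after one feature: adding the redundant copy does not raise the merit, and adding the uninformative feature lowers it.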
Table 3 shows the features selected by 10 × 10-fold cross-validation of the correlation-based feature subset selection procedure in WEKA, a commonly used open-source data mining software package (Hall et al., 2009). The rationale for using a small number of features is twofold: it eases the collection of the data needed for prediction on patients from different sources (health systems), and a small number of novel predictors may help inform studies into the mechanisms and treatment of CKD progression if they suggest new and unexplored pathways. The two SNPs and the use of alpha-2 agonist antihypertensive medication, which emerge as predictors, may help in this manner.
TABLE 3
| % Absolute Frequency | Feature |
|---|---|
| 90% | α-agonist |
| 100% | Proteinuria |
| 100% | U.Protein/U.Creatinine |
| 70% | GFR value at G1 visit |
| 100% | CHGB-1 |
| 90% | PLCG2 rs4399527 |
Feature Selection - 10 fold stratified cross validation.
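A selection frequency of the kind reported in Table 3 can be obtained by running the selector on every training split and counting how often each feature is kept. The following sketch shows one way to do this with repeated stratified k-fold splitting. It is an illustration only: `select` is a hypothetical stand-in for the actual feature selection step, the feature names are invented, and only the class sizes (138 rapid, 75 slow) are taken from the text.

```python
import random
from collections import Counter

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    offset = 0
    for lab in sorted(by_class):
        idx = by_class[lab]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[(offset + j) % k].append(i)
        offset += len(idx)
    return folds

def selection_frequency(labels, select, k=10, repeats=10, seed=0):
    """Percentage of the k*repeats training splits in which each feature is selected."""
    rng = random.Random(seed)
    counts = Counter()
    runs = 0
    for _ in range(repeats):
        for test in stratified_folds(labels, k, rng):
            held_out = set(test)
            train = [i for i in range(len(labels)) if i not in held_out]
            counts.update(set(select(train)))
            runs += 1
    return {f: 100.0 * c / runs for f, c in counts.items()}, runs

labels = ["rapid"] * 138 + ["slow"] * 75

def toy_select(train):
    """Hypothetical selector: feature_b is kept only when sample 0 is in the training part."""
    return ["feature_a"] + (["feature_b"] if 0 in train else [])

freq, runs = selection_frequency(labels, toy_select)
```

Because every sample is held out exactly once per repetition, the toy selector drops `feature_b` in one of every ten splits, which makes the tally easy to verify by hand.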
3 Pattern-Based Classification Model to Predict Progression of Renal Disease
3.1 Identification of Combinatorial Patterns of Significant Clinical Features and SNPs
The feature selection analysis in Section 2.2 provides us with a reduced AASK dataset containing 138 rapid progressors and 75 slow progressors with
• four clinical features: α-agonist (peripheral base), proteinuria, urine-protein/urine-creatinine, and GFR value at G1 visit, where α-agonist represents the use of peripheral alpha-2 agonist blood pressure medication
• two SNPs: CHGB-1 and PLCG2 rs4399527.
These six features were validated using 10 × 10-fold cross-validation experiments on seven commonly used classification methods: Random Forest, Decision Trees, Nearest Neighbor, Support Vector Machines, Neural Networks, Logistic Regression, and Naïve Bayes (Hall et al., 2009). In this step, the AASK data are randomly partitioned into ten approximately equal parts; one part is designated as the “test set”, a model is built on the remaining nine parts, which form the “training dataset”, and the model is then tested by predicting the classes of the patients in the test set. This procedure is repeated 10 times, each time taking a different one of the ten parts as the test set. The patients are then re-randomized into 10 new subsets and the procedure is repeated nine additional times, for a total of 100 tests for each of the seven classification methods. Table 4 shows the average accuracy, sensitivity (proportion of correctly classified rapid progressors), and specificity (proportion of correctly classified slow progressors), as well as the average precision, recall, F-measure, and area under the receiver operating characteristic (ROC) curve.
TABLE 4
| Classification Method | Accuracy | Sensitivity | Specificity | Precision | Recall | F-Measure | ROC Area |
|---|---|---|---|---|---|---|---|
| Random forest | 78.33% | 83.63% | 68.79% | 0.71 | 0.69 | 0.68 | 0.86 |
| C4.5 decision tree | 76.77% | 80.53% | 70.18% | 0.68 | 0.70 | 0.67 | 0.78 |
| Nearest neighbor | 70.21% | 76.97% | 58.02% | 0.59 | 0.58 | 0.57 | 0.68 |
| Support vector machines | 72.70% | 77.91% | 63.34% | 0.62 | 0.63 | 0.61 | 0.71 |
| Neural networks | 73.07% | 78.19% | 63.79% | 0.63 | 0.64 | 0.62 | 0.81 |
| Logistic regression | 75.88% | 81.70% | 65.39% | 0.68 | 0.65 | 0.65 | 0.85 |
| Naïve bayes | 70.20% | 57.90% | 93.02% | 0.56 | 0.93 | 0.69 | 0.85 |
Cross-validation of classification methods for AASK samples.
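For readers who wish to reproduce the measures in Table 4 from a confusion matrix, the sketch below computes them with rapid progressors as the positive class, following the definitions of sensitivity and specificity given above. The counts are invented for illustration and are not the study's confusion matrix. Note that in Table 4 the Recall column coincides with Specificity in every row, which is what one would obtain if precision, recall, and F-measure were reported with slow progressors as the class of interest; this reading is an inference from the numbers, not something the table states.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and accuracy with rapid progressors as positives.

    tp: rapid predicted rapid, fn: rapid predicted slow,
    fp: slow predicted rapid,  tn: slow predicted slow.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

def precision_recall_f(hits, misses, false_alarms):
    """Precision, recall, and F-measure for whichever class is treated as 'of interest'."""
    precision = hits / (hits + false_alarms)
    recall = hits / (hits + misses)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical counts for illustration only.
tp, fn, fp, tn = 80, 20, 10, 40

sensitivity, specificity, accuracy = confusion_metrics(tp, fn, fp, tn)

# Class of interest = rapid progressors.
rapid_prf = precision_recall_f(hits=tp, misses=fn, false_alarms=fp)

# Class of interest = slow progressors: recall then equals specificity.
slow_prf = precision_recall_f(hits=tn, misses=fp, false_alarms=fn)
```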
As can be seen in Table 4, Random Forest provides the highest accuracy, while the C4.5 decision tree (Quinlan, 1993), a non-parametric supervised learning method, offers a comparable balance of sensitivity (80.53%) and specificity (70.18%) and yields explicit, interpretable patterns. The C4.5 classification model, consisting of seven patterns, S1-S7, for slow progressors and eight patterns, R1-R8, for rapid progressors, is presented in Table 5 as combinatorial patterns of clinical features and SNPs associated with slow and rapid progression in the AASK dataset. Figures 6 and 7 show the C4.5 decision tree and the heatmap corresponding to the combinatorial patterns presented in Table 5, respectively.
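Thresholds such as those appearing in the C4.5 patterns arise from the way a decision tree chooses its splits. C4.5 ranks candidate splits by the gain ratio, i.e., the information gain divided by the split information. The sketch below evaluates this criterion for a binary split on a numeric feature. It is a simplified illustration: it omits further details of C4.5 (such as pruning and the handling of missing values), the choice of midpoints as candidate thresholds is an assumption, and the data are synthetic.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split 'value <= threshold' vs. 'value > threshold'."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    wl, wr = len(left) / n, len(right) / n
    gain = entropy(labels) - (wl * entropy(left) + wr * entropy(right))
    split_info = -(wl * math.log2(wl) + wr * math.log2(wr))
    return gain / split_info

def best_threshold(values, labels):
    """Pick the midpoint between consecutive distinct values with the highest gain ratio."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))

# Synthetic feature values and outcomes (not AASK data).
values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = ["slow", "slow", "slow", "rapid", "rapid", "rapid", "rapid", "rapid"]

threshold = best_threshold(values, labels)
```

In this toy example the chosen threshold separates the two classes perfectly; on real data the selected split is merely the best available one, and the tree continues to split each branch on further features.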
TABLE 5
| Patterns | C4.5 Classification Model for Renal disease Progression |
|---|---|
| S1 | U. Protein and PLCG2 rs4399527=GC and CHGB 1=TT |
| S2 | U. Protein and PLCG2 rs4399527=GC and CHGB 1=CT and α-agonist and Pro./Creat.Ratio > 0.01706 |
| S3 | U. Protein and PLCG2 rs4399527=GC and CHGB 1=CC |
| S4 | U. Protein and PLCG2 rs4399527=CC and Pro./Creat.Ratio ≤ 0.15714 |
| S5 | U. Protein and PLCG2 rs4399527=GG and CHGB 1=TT and GFR G1 ≤ 59.5816 |
| S6 | U. Protein and PLCG2 rs4399527=GG and CHGB 1=CT and Pro./Creat.Ratio > 0.02177 |
| S7 | U. Protein and PLCG2 rs4399527=GG and CHGB 1=CC |
| R1 | U. Protein and PLCG2 rs4399527=GC and CHGB 1=CT and α-agonist ≤0 and Pro./Creat.Ratio ≤ 0.01706 |
| R2 | U. Protein and PLCG2 rs4399527=GC and CHGB 1=CT and α-agonist >0 |
| R3 | 0 < U. Protein and PLCG2 rs4399527=GC |
| R4 | U. Protein and PLCG2 rs4399527=CC and Pro./Creat.Ratio > 0.15714 |
| R5 | U. Protein and PLCG2 rs4399527=GG and CHGB 1=TT and GFR G1 ≤ 41.4 |
| R6 | U. Protein and PLCG2 rs4399527=GG and CHGB 1=TT and GFR G1 > 59.5816 |
| R7 | U. Protein and PLCG2 rs4399527=GG and CHGB 1=CT and Pro./Creat.Ratio ≤ 0.02177 |
| R8 | U. Protein > 0.5 |
C4.5 classification model for AASK samples.
FIGURE 6

C4.5 decision tree for AASK samples.
FIGURE 7

Heatmap of the C4.5 patterns for AASK samples.
FIGURE 8

Receiver operating curves (ROC).
The pattern characteristics given in Table 6 include
• rapid prevalence: the proportion of all rapid progressors that are covered by the pattern,
• slow prevalence: the proportion of all slow progressors that are covered by the pattern,
• rapid homogeneity: the proportion of rapid progressors among the patients covered by the pattern,
• slow homogeneity: the proportion of slow progressors among the patients covered by the pattern,
• degree: the number of conditions appearing in the description of the pattern in the C4.5 classification model.
TABLE 6
| Pattern | Homogeneity (%) | Slow prevalence | Rapid prevalence | Degree |
|---|---|---|---|---|
| S1 | | | | 3 |
| S2 | | | | 5 |
| S3 | | | | 3 |
| S4 | | | | 3 |
| S5 | | | | 4 |
| S6 | | | | 4 |
| S7 | | | | 3 |
| R1 | | | | 5 |
| R2 | 100 | | | 4 |
| R3 | | | | 2 |
| R4 | | | | 3 |
| R5 | | | | 4 |
| R6 | | | | 4 |
| R7 | | | | 4 |
| R8 | | | | 1 |
C4.5 decision tree pattern characteristics.
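The pattern characteristics defined above are straightforward to compute once a pattern is represented as a conjunction of conditions. The sketch below does so for the single-condition pattern R8 (U. Protein > 0.5) from Table 5. The patient records, the field names `u_protein` and `label`, and the function name are assumptions made for this illustration; the values are synthetic and do not reproduce Table 6.

```python
def pattern_characteristics(pattern, patients, target):
    """Degree, homogeneity, and prevalences of a pattern for the target class.

    pattern: list of predicates, all of which must hold for a patient to be covered.
    patients: list of dicts with a 'label' key equal to 'rapid' or 'slow'.
    """
    other = "slow" if target == "rapid" else "rapid"
    covered = [p for p in patients if all(cond(p) for cond in pattern)]
    n_target = sum(p["label"] == target for p in patients)
    n_other = sum(p["label"] == other for p in patients)
    covered_target = sum(p["label"] == target for p in covered)
    covered_other = len(covered) - covered_target
    return {
        "degree": len(pattern),
        "homogeneity": covered_target / len(covered) if covered else 0.0,
        target + "_prevalence": covered_target / n_target,
        other + "_prevalence": covered_other / n_other,
    }

# Pattern R8 from Table 5: a single condition on urine protein.
r8 = [lambda p: p["u_protein"] > 0.5]

# Synthetic patients for illustration only.
patients = [
    {"u_protein": 0.90, "label": "rapid"},
    {"u_protein": 0.70, "label": "rapid"},
    {"u_protein": 0.60, "label": "slow"},
    {"u_protein": 0.20, "label": "rapid"},
    {"u_protein": 0.10, "label": "slow"},
    {"u_protein": 0.05, "label": "slow"},
]

stats = pattern_characteristics(r8, patients, target="rapid")
```

Here the pattern covers three patients, two of whom are rapid progressors, so it would count as a fuzzy pattern in the terminology of the next section; a pure pattern would have a homogeneity of exactly 1.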
3.2 Validation of Combinatorial Patterns
We remark that the C4.5 classification model given in Table 5 consists of explicit patterns, in which the four clinical features and two SNPs selected in Section 2.2 are assigned threshold values. Note that patterns S1-S7 exhibit high homogeneity for the slow progressors and R1-R8 exhibit high homogeneity for the rapid progressors in the AASK data. For example, patterns S2, S3, S5, and S7 have 100% homogeneity, meaning that all patients covered by each of these patterns are slow progressors. Similarly, the homogeneity of patterns R1, R2, R5, R6, and R7 is also 100%, i.e., all patients covered by each of these patterns are rapid progressors. We refer to such patterns as pure patterns associated with the respective subgroups of AASK patients. We also remark that the classification model contains fuzzy patterns, S1, S4, S6, R3, R4, and R8, i.e., patterns with homogeneity below 100%. For example, the homogeneity of pattern S4 is , meaning that of the patients covered by pattern S4 are slow progressors and the remaining of the patients covered by this pattern are rapid progressors in the AASK Clinical Trial.
As for prevalence, patterns S4 and R8 are significant patterns, S4 covering of all slow progressors, but only of the rapid progressors and R8 covering of all rapid progressors, but only of the slow progressors in the data. While the other patterns in the classification model do not exhibit high prevalence in the associated subgroups within the data, they are still required to predict the progression of all AASK patients in the study. Finally, we observe that these patterns use a small number of features. The degrees of the patterns (number of conditions used in the pattern description) range from one to five. Note that according to pattern R8, the U. Protein levels of of rapid progressors exceed 0.5 and of the patients covered by this pattern are rapid progressors. Similar observations can be made for the other patterns forming the classification model in Table 5.
Based on the 10 × 10-fold cross-validation experiments (Table 4), the classification model correctly classifies 80.53% of rapid progressors and 70.18% of slow progressors and exhibits an average accuracy of 76.77% with 0.68 precision, 0.70 recall, and 0.67 F-measure, validating the distinguishing power of the classification model for the AASK patients in our study. As another measure of the effectiveness of the classification model at predicting rapid or slow progressors, we generate the receiver operating characteristic (ROC) curve, which shows how well the classification model distinguishes between the rapid progressors and slow progressors in the AASK Clinical Trial. The ROC curve is obtained by plotting sensitivity (true positive rate) against 1 − specificity (false positive rate). Based on the 10 × 10-fold cross-validation experiments, the area under the ROC curve is 0.78. The ROC curve corresponding to the C4.5 classification model (built on the entire dataset) in Table 5 is shown in Figure 8.
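The construction of an ROC curve and its area can be illustrated as follows. The sketch sweeps a decision threshold over predicted scores for the rapid class, records the resulting (false positive rate, true positive rate) points, and integrates with the trapezoidal rule. The scores and labels are invented for illustration, the function names are assumptions, and both classes are assumed to be present in the data.

```python
def roc_points(scores, labels, positive="rapid"):
    """Return (false positive rate, true positive rate) points, thresholds high to low."""
    n_pos = sum(1 for lab in labels if lab == positive)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Patients with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == positive:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points with non-decreasing x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical predicted scores for the rapid class and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = ["rapid", "rapid", "slow", "rapid", "slow", "slow"]

points = roc_points(scores, labels)
auc = area_under_curve(points)
```

The area equals the fraction of (rapid, slow) patient pairs in which the rapid progressor receives the higher score: 8 of the 9 pairs in this toy example.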
Thus, we conclude that the combinatorial patterns forming the classification model in Table 5 are high-quality decision rules that can be easily interpreted by medical experts, allowing them to target the clinical features and SNPs associated with the progression of renal disease when developing new therapies.
Funding
ML, ES, and MMS’s work was supported by the National Institutes of Health, grant number 5R21DK67468. KC and EC’s work was supported by a National Science Foundation (NSF) Research Experience for Undergraduates (REU) grant, award number 1359341.
Statements
Data availability statement
The datasets generated for this study can be found in the African American Study of Kidney Disease and Hypertension Study (Clinical Trial) (AASK Trial) https://repository.niddk.nih.gov/studies/aask-trial/.
Author contributions
ES, ML, and MMS are senior co-authors who designed and supervised the entire project and participated in writing the manuscript. MMM, TB, and MSM participated in the study design and performed the combinatorial analysis and participated in writing the manuscript. KC, EC, ZA, and RP were involved in various steps of the combinatorial analysis.
Acknowledgments
Special thanks to the AASK Investigators.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Footnotes
1. Chronic Kidney Disease Surveillance Project, Centers for Disease Control and Prevention: http://nccd.cdc.gov/ckd/
References
Agodoa, L. Y., Appel, L., Bakris, G. L., Beck, G., Bourgoignie, J., Briggs, J. P., et al. (2001). Effect of ramipril vs amlodipine on renal outcomes in hypertensive nephrosclerosis: a randomized controlled trial. JAMA 285, 2719–2728. doi:10.1001/jama.285.21.2719
Bakris, G. L., Weir, M. R., Shanifar, S., Zhang, Z., Douglas, J., van Dijk, D. J., et al. (2003). Effects of blood pressure level on progression of diabetic nephropathy: results from the RENAAL study. Arch. Intern. Med. 163, 1555–1565. doi:10.1001/archinte.163.13.1555
Berg, U. (2006). Differences in decline in GFR with age between males and females. Reference data on clearances of inulin and PAH in potential kidney donors. Nephrol. Dial. Transplant. 21, 2577–2582. doi:10.1093/ndt/gfl227
Brenner, B. M., Cooper, M. E., de Zeeuw, D., Keane, W. F., Mitch, W. E., Parving, H. H., et al. (2001). Effects of losartan on renal and cardiovascular outcomes in patients with type 2 diabetes and nephropathy. N. Engl. J. Med. 345, 861–869. doi:10.1056/NEJMoa011161
Chen, E., Miller, G. E., Yu, T., and Brody, G. H. (2016). The Great Recession and health risks in African American youth. Brain Behav. Immun. 53, 234–241. doi:10.1016/j.bbi.2015.12.015
Contreras, G., Greene, T., Agodoa, L. Y., Cheek, D., Junco, G., Dowie, D., et al. (2005). Blood pressure control, drug therapy, and kidney disease. Hypertension 46, 44–50. doi:10.1161/01.HYP.0000166746.04472.60
DCCT (1995). Effect of intensive therapy on the development and progression of diabetic nephropathy in the diabetes control and complications trial. The Diabetes Control and Complications (DCCT) Research Group. Kidney Int. 47, 1703–1720.
Fogarty, D. G., Hanna, L. S., Wantman, M., Warram, J. H., Krolewski, A. S., and Rich, S. S. (2000). Segregation analysis of urinary albumin excretion in families with type 2 diabetes. Diabetes 49, 1057–1063. doi:10.2337/diabetes.49.6.1057
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. (2009). The WEKA data mining software: an update. SIGKDD Explorations 11 (1), 10–18. doi:10.1145/1656274.1656278
Hall, M. A., and Smith, L. A. (1998). Practical feature subset selection for machine learning. Springer.
Hallan, M. (1998). Calcium antagonists and renal disease. Kidney Int. 54, 1771–1784. doi:10.1046/j.1523-1755.1998.00168.x
Hallan, S. I., Coresh, J., Astor, B. C., Asberg, A., Powe, N. R., Romundstad, S., et al. (2006). International comparison of the relationship of chronic kidney disease prevalence and ESRD risk. J. Am. Soc. Nephrol. 17, 2275–2284. doi:10.1681/ASN.2005121273
Hansson, L., Zanchetti, A., Carruthers, S. G., Dahlöf, B., Elmfeldt, D., Julius, S., et al. (1998). Effects of intensive blood-pressure lowering and low-dose aspirin in patients with hypertension: principal results of the Hypertension Optimal Treatment (HOT) randomised trial. HOT Study Group. Lancet 351, 1755–1762. doi:10.1016/s0140-6736(98)04311-6
Hebert, L. A., Kusek, J. W., Greene, T., Agodoa, L. Y., Jones, C. A., Levey, A. S., et al. (1997). Effects of blood pressure control on progressive renal disease in blacks and whites. Modification of Diet in Renal Disease Study Group. Hypertension 30, 428–435. doi:10.1161/01.hyp.30.3.428
Klag, M. J., Whelton, P. K., Randall, B. L., Neaton, J. D., Brancati, F. L., and Stamler, J. (1997). End-stage renal disease in African-American and white men. 16-year MRFIT findings. JAMA 277, 1293–1298.
Klahr, S., Levey, A. S., Beck, G. J., Caggiula, A. W., Hunsicker, L., Kusek, J. W., et al. (1994). The effects of dietary protein restriction and blood-pressure control on the progression of chronic renal disease. Modification of Diet in Renal Disease Study Group. N. Engl. J. Med. 330, 877–884. doi:10.1056/NEJM199403313301301
Krolewski, A. S., Poznik, G. D., Placha, G., Canani, L., Dunn, J., Walker, W., et al. (2006). A genome-wide linkage scan for genes controlling variation in urinary albumin excretion in type II diabetes. Kidney Int. 69, 129–136. doi:10.1038/sj.ki.5000023
Lewis, E. J., Hunsicker, L. G., Bain, R. P., and Rohde, R. D. (1993). The effect of angiotensin-converting-enzyme inhibition on diabetic nephropathy. The Collaborative Study Group. N. Engl. J. Med. 329, 1456–1462. doi:10.1056/NEJM199311113292004
Lewis, E. J., Hunsicker, L. G., Clarke, W. R., Berl, T., Pohl, M. A., Lewis, J. B., et al. (2001). Renoprotective effect of the angiotensin-receptor antagonist irbesartan in patients with nephropathy due to type 2 diabetes. N. Engl. J. Med. 345, 851–860. doi:10.1056/NEJMoa011303
Lindeman, R. D., Tobin, J., and Shock, N. W. (1985). Longitudinal studies on the rate of decline in renal function with age. J. Am. Geriatr. Soc. 33, 278–285. doi:10.1111/j.1532-5415.1985.tb07117.x
Lindeman, R. (1990). Overview: renal physiology and pathophysiology of aging. Am. J. Kidney Dis. 16, 275–282. doi:10.1016/s0272-6386(12)80002-3
Murussi, M., Gross, J. L., and Silveiro, S. P. (2006). Glomerular filtration rate changes in normoalbuminuric and microalbuminuric Type 2 diabetic patients and normal individuals: a 10-year follow-up. J. Diabetes Complicat. 20, 210–215. doi:10.1016/j.jdiacomp.2005.07.002
Niki, P., Panos, K., and Christos, C. (2015). New targets for end-stage chronic kidney disease therapy. J. Crit. Care Med. 1, 92–95. doi:10.1515/jccm-2015-0015
Parsa, A., Kao, W. H., Xie, D., Astor, B. C., Li, M., Hsu, C. Y., et al. (2013). APOL1 risk variants, race, and progression of chronic kidney disease. N. Engl. J. Med. 369, 2183–2196. doi:10.1056/NEJMoa1310345
Pohl, M. A., Blumenthal, S., Cordonnier, D. J., De Alvaro, F., Deferrari, G., Eisner, G., et al. (2005). Independent and additive impact of blood pressure control and angiotensin II receptor blockade on renal outcomes in the irbesartan diabetic nephropathy trial: clinical implications and limitations. J. Am. Soc. Nephrol. 16, 3027–3037. doi:10.1681/ASN.2004110919
Quinlan, J. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers.
Rahman, M., Greene, T., Phillips, R. A., Agodoa, L. Y., Bakris, G. L., Charleston, J., et al. (2013). A trial of 2 strategies to reduce nocturnal blood pressure in blacks with chronic kidney disease. Hypertension 61, 82–88. doi:10.1161/HYPERTENSIONAHA.112.200477
Ruggenenti, P., Perna, A., Gherardi, G., Garini, G., Zoccali, C., Salvadori, M., et al. (1999). Renoprotective properties of ACE-inhibition in non-diabetic nephropathies with non-nephrotic proteinuria. Lancet 354, 359–364. doi:10.1016/S0140-6736(98)10363-X
Subasi, E., Subasi, M. M., Hammer, P. L., Roboz, J., Anbalagan, V., and Lipkowitz, M. S. (2017). A classification model to predict the rate of decline of kidney function. Front. Med. 4, 97. doi:10.3389/fmed.2017.00097
Wang, X., Lewis, J., Appel, L., Cheek, D., Contreras, G., Faulkner, M., et al. (2006). Validation of creatinine-based estimates of GFR when evaluating risk factors in longitudinal studies of kidney disease. J. Am. Soc. Nephrol. 17, 2900–2909. doi:10.1681/ASN.2005101106
Wright, J. T., Bakris, G., Greene, T., Agodoa, L. Y., Appel, L. J., Charleston, J., et al. (2002). Effect of blood pressure lowering and antihypertensive drug class on progression of hypertensive kidney disease: results from the AASK trial. JAMA 288, 2421–2431. doi:10.1001/jama.288.19.2421
Keywords
classification, genomic analysis, AASK, chronic kidney disease, decision trees
Citation
Moreno MM, Bain TC, Moreno MS, Carroll KC, Cunningham ER, Ashton Z, Poteau R, Subasi E, Lipkowitz M and Subasi MM (2021) Identifying Clinical and Genomic Features Associated With Chronic Kidney Disease. Front. Big Data 3:528828. doi: 10.3389/fdata.2020.528828
Received
22 January 2020
Accepted
30 October 2020
Published
14 January 2021
Volume
3 - 2020
Edited by
Tuan D. Pham, Prince Mohammad bin Fahd University, Saudi Arabia
Reviewed by
Dinh Tuan Phan Le, New York City Health and Hospitals Corporation, United States
Lin Liu, Tsinghua University, China
Copyright
© 2021 Moreno, Bain, Moreno, Carroll, Cunningham, Ashton, Poteau, Subasi, Lipkowitz and Subasi.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Munevver Mine Subasi, msubasi@fit.edu
This article was submitted to Medicine and Public Health, a section of the journal Frontiers in Big Data