Different Pathophysiology and Outcomes of Heart Failure With Preserved Ejection Fraction Stratified by K-Means Clustering

Background: Stratified medicine may enable the development of effective treatments for particular groups of patients with heart failure with preserved ejection fraction (HFpEF); however, the heterogeneity of this syndrome makes it difficult to group patients together by common disease features. The aim of the present study was to find new subgroups of HFpEF using machine learning. Methods: K-means clustering was used to stratify patients with HFpEF. We retrospectively enrolled 350 outpatients with HFpEF. Their clinical characteristics, blood sample test results, hemodynamic parameters assessed by echocardiography, electrocardiography, and jugular venous pulse, and clinical outcomes were used as input features for k-means clustering. The optimal k was determined using Hartigan's rule. Results: HFpEF was stratified into four groups. The characteristic feature in group 1 was left ventricular relaxation abnormality. Compared with group 1, patients in groups 2, 3, and 4 had a high mean mitral E/e′ ratio. The estimated glomerular filtration rate was lower in group 2 than in group 3 (median 51 ml/min/1.73 m² vs. 63 ml/min/1.73 m², p < 0.05). The prevalence of less-distensible right ventricle and atrial fibrillation was higher, and the deceleration time of mitral inflow was shorter, in group 3 than in group 2 (93 vs. 22%, p < 0.05; 95 vs. 1%, p < 0.05; and median 167 vs. 223 ms, p < 0.05, respectively). Group 4 was characterized by older age (median 85 years) and had a high systolic pulmonary arterial pressure (median 37 mmHg), less-distensible right ventricle (89%), and renal dysfunction (median 54 ml/min/1.73 m²). Compared with group 1, group 4 exhibited the highest risk of cardiac events (hazard ratio [HR]: 19; 95% confidence interval [CI] 8.9–41); groups 2 and 3 demonstrated similar rates of cardiac events (group 2 HR: 5.1; 95% CI 2.2–12; group 3 HR: 3.7; 95% CI 1.3–10). The event-free rates were the lowest in group 4 (p for trend < 0.001).
Conclusions: K-means clustering divided HFpEF into 4 groups. Older patients with HFpEF may suffer from complication of RV afterload mismatch and renal dysfunction. Our study may be useful for stratified medicine for HFpEF.


INTRODUCTION
The proportion of heart failure with preserved ejection fraction (HFpEF) increases with age, reaching 50% or more of patients with heart failure (1). Previous studies have shown that HFpEF is multifaceted, and the heterogeneity of this syndrome suggests different etiological and pathophysiological paths by which individual patients develop heart failure (2)(3)(4)(5). This heterogeneity also limits the effectiveness of existing medications, such as renin-angiotensin system inhibitors and/or beta blockers used for heart failure with reduced ejection fraction, and is related to poor outcomes for patients with HFpEF (6,7). Thus, a one-size-fits-all approach cannot improve clinical outcomes, and precision medicine may be needed for patients with HFpEF (8). Although the individual pathophysiology must be known to perform precision medicine, common pathophysiologies for HFpEF may exist. By identifying subgroups of patients with different pathophysiologies of HFpEF, stratified medicine may enable the development of effective treatments for particular groups of patients with HFpEF; however, the multidimensionality of HFpEF makes it difficult to group patients together by common disease features. Machine learning, which can handle many variables simultaneously, has been used to overcome this problem. Indeed, using several machine-learning algorithms, previous studies clarified the phenotypes and therapeutic strategies for HFpEF; however, the features of heart failure with mid-range ejection fraction may have influenced the phenotypes identified, and RV diastolic function was not included as an input feature in previous studies (9)(10)(11)(12)(13). Although RV function plays an important role in the pathophysiology of HFpEF (14), there is a lack of guidance for the assessment and quantification of RV diastolic function (15). Physiologically, the right ventricle has lower contractility and higher compliance than the left ventricle (16).
The loss of high compliance, the most distinctive feature of the right ventricle, may influence the clinical outcomes of HFpEF. Indeed, we reported that the rate of less-distensible right ventricle assessed by jugular venous pulse increased with age and was a risk factor for cardiac events of HFpEF (17,18). If this feature is included in machine learning, a new important subgroup may be found. By enrolling patients meeting the diagnostic criteria of HFpEF described in the heart failure guideline (19) and providing cardiac function features derived from echocardiographic and jugular venous pulse evaluation, this study aimed to clarify new subgroups of HFpEF using machine learning.

MATERIALS AND METHODS
In this study, after receiving approval from the Human Subject Review Committee of our institute, all data from our echocardiographic and jugular venous pulse database and medical records were retrospectively obtained. Between April 2013 and March 2020, 7,437 consecutive outpatients underwent echocardiographic examinations (Vivid 7, General Electric Healthcare, Wauwatosa, WI, USA) for cardiovascular disease. For 2,882 patients, we simultaneously recorded electrocardiography, phonocardiography, and jugular venous pulse measurement, and all data were stored using a hard-disk memory system (echoPAC PC, General Electric Healthcare) for later analyses. A flowchart of this study is shown in Figure 1.
In the present study, we defined patients with HFpEF as those with left ventricular (LV) ejection fraction ≥50%, two or more positive variables of LV diastolic dysfunction, symptoms and/or signs of heart failure, and a brain natriuretic peptide (BNP) level >35 pg/ml (19,21). First, patients were excluded if they lacked data, such as LV ejection fraction, mitral e′, left atrial volume index, tricuspid regurgitant velocity, tricuspid annular plane systolic excursion (TAPSE), jugular venous pulse waveform, BNP, creatinine, and hemoglobin. Patients were also excluded if they had normal LV diastolic function, constrictive pericarditis, cardiac amyloidosis, hypertrophic cardiomyopathy, moderate or severe valvular heart disease, congenital heart disease, acute coronary syndrome within 6 months, uncontrolled angina pectoris, idiopathic pulmonary arterial hypertension, acute decompensated heart failure, LV ejection fraction <50%, kidney failure (estimated glomerular filtration rate [eGFR] < 15 ml/min/1.73 m²), or advanced cancer. We diagnosed 535 patients with HFpEF, but 52 were excluded because of follow-up at another hospital. In total, we retrospectively enrolled 483 patients in the present study. The data from 350 patients obtained during the period from April 2015 to March 2020 were used as original data to find new phenotypes of HFpEF, and data from the other 133 patients obtained during the period from April 2013 to March 2015 were used to validate the phenotypes found by clustering methods. All patients took medications continuously for 4 months. Based on the arrangement of our hospital, informed consent was provided by all patients at the time when they were examined using echocardiography and/or underwent blood sample tests. The study complied with the Declaration of Helsinki.

Evaluation of Cardiac Function
Cardiac function was evaluated as in our previous reports (17,18,20). The jugular venous pulse waveform was used to evaluate RV distensibility. It was recorded in the supine position by well-trained cardiac sonographers. A pulse-wave transducer (TY-306, Fukuda Denshi, Tokyo, Japan) was placed over the neck, above and to the right of the junction of the right clavicle and the manubrium sterni, and held in place manually. The jugular venous waveform was recorded for at least 30 s and digitized at a sampling rate of 600 Hz. Using an off-line moving average technique (Matlab version 14, Mathworks, Natick, MA, USA), respiratory baseline fluctuations (0.1-0.5 Hz) were excluded from the jugular waveform to determine the relative depth of the nadirs of the "X" and "Y" descents (17,18,20). According to the established significance of the jugular venous waveform (22)(23)(24), two cardiologists who were blinded to the clinical data judged whether the jugular venous pulse had a dominant "Y" descent, where the nadir of the "Y" descent was deeper than that of the "X" descent, reflecting a less-distensible right ventricle. LV end-diastolic and end-systolic volumes were measured using a modification of Simpson's method. The LV ejection fraction was calculated as stroke volume divided by end-diastolic volume. LV mass was also calculated using the Devereux formula and was divided by body surface area (LV mass index [LVMI]) (25). To evaluate the diastolic properties of the left ventricle, we measured the early diastolic velocities (e′) using pulsed-wave tissue Doppler from the apical view. We measured the septal and lateral E/e′, and averaged the values for more reliable assessment of LV relaxation and filling pressure (21). If patients had atrial fibrillation (AF), we averaged velocity measurements from 10 consecutive cardiac cycles (21). The left atrial volume index was obtained using the biplane method from both the apical four- and two-chamber views (25).
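As a worked illustration of two of the calculations described above, the following minimal Python sketch computes the LV ejection fraction from end-diastolic and end-systolic volumes and the mean E/e′ as the average of the septal and lateral ratios. The function names and the numeric inputs are hypothetical and chosen only for illustration; they are not values from this study.

```python
def ejection_fraction_percent(edv_ml: float, esv_ml: float) -> float:
    """LV ejection fraction (%) = stroke volume / end-diastolic volume."""
    stroke_volume_ml = edv_ml - esv_ml
    return 100.0 * stroke_volume_ml / edv_ml


def mean_e_over_e_prime(e_cm_s: float, e_prime_septal_cm_s: float,
                        e_prime_lateral_cm_s: float) -> float:
    """Average of the septal and lateral E/e' ratios, as described in the text."""
    septal_ratio = e_cm_s / e_prime_septal_cm_s
    lateral_ratio = e_cm_s / e_prime_lateral_cm_s
    return (septal_ratio + lateral_ratio) / 2.0


# Hypothetical example values (not patient data).
ef = ejection_fraction_percent(edv_ml=100.0, esv_ml=40.0)
ratio = mean_e_over_e_prime(e_cm_s=80.0, e_prime_septal_cm_s=5.0,
                            e_prime_lateral_cm_s=8.0)
print(f"EF = {ef:.0f}%, mean E/e' = {ratio:.1f}")
```

With these illustrative inputs the ejection fraction is 60% and the mean E/e′ is 13.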
In addition, the tricuspid regurgitant jet was detected using the continuous-wave Doppler technique to measure the RV systolic pressure. The peak pressure gradient from the right ventricle to the right atrium was calculated from the peak tricuspid regurgitant velocity (V) using a modified Bernoulli equation (pressure gradient $= 4V^2$). The peak RV pressure was then calculated by adding the peak pressure gradient to the right atrial pressure, which was estimated from the echocardiographic characteristics of the inferior vena cava (26). We regarded RV systolic pressure as systolic pulmonary arterial pressure (SPAP) because of the absence of a gradient across the pulmonic valve and RV outflow tract. The LV ejection fraction, mean mitral e′, mean mitral E/e′ ratio, SPAP, TAPSE, and jugular venous pulse waveform were used as indicators of LV contractility, LV relaxation ability, LV filling pressure, RV afterload, RV contractility, and RV diastolic function, respectively, in this study.
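The SPAP estimate described above can be written as a short calculation. The following Python sketch assumes the velocity is given in m/s and pressures in mmHg; the function names and the example inputs are hypothetical and are not taken from the study data.

```python
def peak_gradient_mmhg(tr_velocity_m_s: float) -> float:
    """Modified Bernoulli equation: pressure gradient = 4 * V^2 (V in m/s)."""
    return 4.0 * tr_velocity_m_s ** 2


def estimate_spap_mmhg(tr_velocity_m_s: float,
                       right_atrial_pressure_mmhg: float) -> float:
    """RV systolic pressure = peak RV-to-RA gradient + right atrial pressure.

    Treated as SPAP when there is no gradient across the pulmonic valve
    and RV outflow tract, as stated in the text.
    """
    return peak_gradient_mmhg(tr_velocity_m_s) + right_atrial_pressure_mmhg


# Hypothetical example: V = 2.8 m/s, estimated right atrial pressure = 5 mmHg.
spap = estimate_spap_mmhg(2.8, 5.0)
print(f"Estimated SPAP: {spap:.1f} mmHg")
```

For these illustrative inputs the gradient is 4 × 2.8² ≈ 31.4 mmHg, giving an estimated SPAP of about 36.4 mmHg.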

K-Means Clustering
We used R and several add-on packages to perform k-means clustering to stratify patients with HFpEF (27)(28)(29). K-means clustering, an unsupervised machine learning method, is one of the most popular clustering techniques. It produces hard (an element can only be a member of one cluster), flat, and polythetic (membership is determined by similarity based on multiple attributes) clusters. The k-means algorithm has no training or testing data per se. It works by creating each cluster around a centroid, which is an average cluster member, namely, the center of a cluster (30). The steps of the k-means clustering algorithm are as follows: First, the number of clusters (k) is specified. Second, k random centroids are initialized from datapoints in the data. Third, for each point, the algorithm finds the nearest centroid and assigns the point to that cluster. To find the nearest centroid, Euclidean distance was used in this study. Fourth, each centroid is updated so as to minimize the within-cluster variance. Lastly, the algorithm stops once the cluster assignments no longer change (30). It is well-known that the number of clusters specified greatly affects the performance of k-means clustering. To determine the optimal k, Hartigan's rule was used in this study. The Euclidean distance formula is not defined for nominal data. To calculate the distance between nominal features, they need to be converted into a numeric format, for which we used dummy coding, where a value of one indicates one category and zero indicates the other (31,32). To prevent features with a larger range of values from dominating the distance calculation, the features used for k-means clustering were standardized using z-scores, as in the following formula (31):
$$z = \frac{x - \mu}{\sigma}$$
where µ is the mean of x and σ is the standard deviation of x.
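The steps described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the R code used in the study: the function names, the synthetic two-feature data (an age-like numeric feature and a dummy-coded AF-like feature), the random seeds, and the use of the population standard deviation are all assumptions made for the example.

```python
import numpy as np


def standardize(X: np.ndarray) -> np.ndarray:
    """Column-wise z-score: (x - mean) / standard deviation."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)          # population SD; an assumption of this sketch
    sigma[sigma == 0] = 1.0        # avoid division by zero for constant columns
    return (X - mu) / sigma


def kmeans(X: np.ndarray, k: int, n_iter: int = 100, seed: int = 0):
    """Plain k-means with Euclidean distance and random data-point initialization."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Step 3: assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 5: stop once the assignments no longer change.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids


# Synthetic illustration only: 20 younger patients without AF, 20 older with AF.
rng = np.random.default_rng(1)
age = np.concatenate([rng.normal(70, 3, 20), rng.normal(85, 3, 20)])
af = np.concatenate([np.zeros(20), np.ones(20)])   # dummy coding: 1 = AF, 0 = no AF
X = standardize(np.column_stack([age, af]))
labels, centroids = kmeans(X, k=2)
print(labels)
print(centroids)
```

Because the result depends on the random initialization, practical implementations usually repeat the procedure from several starting points and keep the solution with the smallest within-cluster sum of squares.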
When the values were standardized by z-scores, positive values were above the overall mean level and negative values were below the overall mean. By examining whether the clusters fall above or below the mean level for each interest category, we can begin to identify patterns that distinguish the clusters from each other. An extreme z-score reflects the features of the cluster (31). Principal component analysis was also applied to visualize the results of k-means clustering (33).
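To illustrate how principal component analysis can project standardized features onto two axes for plotting, the following Python sketch computes the first two principal components with a singular value decomposition. It is a generic illustration under assumed inputs (random synthetic data and a hypothetical function name), not the procedure or package used in the study.

```python
import numpy as np


def pca_2d(Z: np.ndarray):
    """Project the rows of Z onto the first two principal components.

    Returns the 2-D scores, the loadings (coefficients of each feature on
    components 1 and 2), and the fraction of variance each one explains.
    """
    Zc = Z - Z.mean(axis=0)                       # center each feature
    _, S, Vt = np.linalg.svd(Zc, full_matrices=False)
    loadings = Vt[:2]                             # shape (2, n_features)
    scores = Zc @ loadings.T                      # shape (n_samples, 2)
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, loadings, explained[:2]


# Synthetic illustration only: 30 samples, 5 standardized features.
rng = np.random.default_rng(0)
Z = rng.normal(size=(30, 5))
scores, loadings, explained = pca_2d(Z)
print(scores.shape, loadings.shape, explained)
```

Plotting the two columns of `scores`, colored by cluster label, gives a two-dimensional view of the clusters; the rows of `loadings` play the role of the per-feature coefficients that define each axis.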

Documentation of End Points
All 483 patients were followed up at the outpatient clinic of our hospital. We defined deterioration of HFpEF as follows: sudden death, death from heart failure, or hospitalization for deterioration of HFpEF. These cardiac events were reported and adjudicated by cardiovascular specialists at our hospital.

Validation Cohort
We performed an independent validation analysis on 133 of the 483 patients (the data obtained in the period from April 2013 to March 2015). Setting the number of clusters to that estimated from the original cohort, k-means clustering was also performed in the validation cohort. We then examined whether there was again a difference in outcomes among the groups using the same outcome analysis (Cox proportional hazards analysis) used in the original cohort.

Statistical Analysis
Numerical data are expressed as the median (interquartile range), mean ± standard deviation, or z-score. The Shapiro-Wilk test was used to assess the normality of data. To assess homogeneity of variance, Bartlett's test was used in this study. One-way analysis of variance or the Kruskal-Wallis test was used to compare numerical data among groups, and the chi-square test or Fisher's exact test was used to compare categorical data among groups. If a significant difference was observed among groups, Holm's method was used for pairwise comparisons between the groups.
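Holm's method adjusts the p-values of multiple pairwise comparisons in a step-down fashion. The following Python sketch shows the standard adjustment; the function name and the three example p-values are hypothetical and serve only to illustrate the arithmetic (with four groups there would be six pairwise comparisons).

```python
def holm_adjust(p_values):
    """Holm step-down adjustment of a list of p-values.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

For these illustrative inputs the adjusted values are approximately 0.03, 0.06, and 0.06, so only the first comparison would remain significant at p < 0.05.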
For outcomes analyses, we used unadjusted and age-adjusted Cox proportional hazards models to determine the independent association between groups and outcomes. Cardiac events of HFpEF stratified by k-means clustering were estimated using the Kaplan-Meier method. Differences between the event-free curves were examined using the log-rank chi-square test and Holm's method. Significance was established at p < 0.05. All statistical analyses were carried out using EZR (Saitama Medical Center, Jichi Medical University, Saitama, Japan) (34).
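The Kaplan-Meier estimate of the event-free rate can be illustrated with a small Python sketch. The function name and the follow-up data are hypothetical; the analyses in this study were carried out in EZR, not with this code.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier event-free probability at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if a cardiac event occurred at that time, 0 if censored
    Returns a list of (time, event_free_probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up (months) for five patients; 0 marks censoring.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t = {t}: event-free rate = {s:.2f}")
```

In this illustration the event-free rate falls to 0.80, 0.60, and 0.30 at months 2, 3, and 5. Curves computed separately for each cluster are what the log-rank test then compares.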

RESULTS
Patient characteristics, renal function, hemoglobin level, cardiac function, and the rate of cardiovascular events are shown in Table 1. These 37 features were applied to k-means clustering in this study. The optimal k was four, which was detected using Hartigan's rule (Figure 2). HFpEF was stratified into four groups by k-means clustering. The coordinates of the cluster centroids according to stratification using k-means clustering are shown in Table 2. Using principal component analysis, the results of k-means clustering are visualized in Figure 3. Coefficients of each feature used to create the axes of principal components 1 and 2 are shown in Supplementary Table 1.
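Hartigan's rule compares the within-cluster sum of squares obtained with k and k + 1 clusters and keeps adding a cluster while the resulting index exceeds a threshold, conventionally 10. The following Python sketch shows this decision rule under stated assumptions: the function names are hypothetical, the threshold of 10 is the conventional choice rather than a value reported in this study, and the within-cluster sums of squares are invented numbers used only to illustrate the arithmetic for n = 350.

```python
def hartigan_index(wss_k: float, wss_k_plus_1: float, n: int, k: int) -> float:
    """Hartigan's index: (W_k / W_{k+1} - 1) * (n - k - 1)."""
    return (wss_k / wss_k_plus_1 - 1.0) * (n - k - 1)


def choose_k(wss_by_k: dict, n: int, threshold: float = 10.0) -> int:
    """Add clusters while going from k to k + 1 gives an index above the threshold."""
    k = min(wss_by_k)
    while (k + 1) in wss_by_k and hartigan_index(
            wss_by_k[k], wss_by_k[k + 1], n, k) > threshold:
        k += 1
    return k


# Hypothetical within-cluster sums of squares for k = 1..6 (not study results).
wss = {1: 1000.0, 2: 700.0, 3: 520.0, 4: 420.0, 5: 412.0, 6: 405.0}
print(choose_k(wss, n=350))
```

With these invented values the index stays well above 10 up to the step from three to four clusters and drops below 10 for the step from four to five, so the rule selects k = 4.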

Patient Characteristics and Comorbidities
Patient characteristics and comorbidities are shown in Table 3.
Group 1 was composed of younger individuals (median age 70 years) with relatively higher BMI and rate of prior coronary revascularization (35%), in addition to relatively preserved renal function (median eGFR 69 ml/min/1.73 m²). Group 2 was characterized by older age (median age 83 years), the highest proportion of women (73%), and lower eGFR (median 51 ml/min/1.73 m²). Group 3 exhibited intermediate age (median age 77), with higher BMI, the highest prevalence of atrial fibrillation (95%), and the lowest prevalence of prior coronary revascularization (2%). Group 4 was characterized by older age (median age 85 years), higher proportion of women (64%), higher prevalence of atrial fibrillation (56%), and lower eGFR (median 54 ml/min/1.73 m²). The usage rate of loop diuretics was the highest in group 4.

Patient Symptoms and Signs of HFpEF and Cardiac Function
Patient symptoms and signs of HFpEF, cardiac function, and the rate of cardiac events are shown in Table 4. Most patients had dyspnea on exertion. LV relaxation function, suggested by mean mitral e′, decreased in all groups. The mean mitral E/e′ ratio was higher in groups 2, 3, and 4 than in group 1. Group 1 exhibited the lowest rate of volume overload signs and symptoms, such as leg edema, neck vein dilatation, and pleural effusion. LV and RV function and morphology were preserved in group 1 compared with other groups. This group also exhibited a lower value of BNP (median 72 pg/ml). Group 2 demonstrated an intermediate rate of leg edema (40%). In group 2, LV and RV function and morphology were preserved compared with groups 3 and 4. Group 3 had a higher rate of volume overload signs and symptoms (the rate of leg edema and neck vein dilatation was 73 and 36%, respectively). Compared with groups 1 and 2, group 3 exhibited a larger LAVI, shorter deceleration time of mitral inflow (the rate of deceleration time ≤160 ms, 43%), more

Relationship Between Clinical Phenotypes and Patient Outcomes
Cox proportional hazard analysis is shown in These results were almost the same in the age-adjusted model (Table 5). The Kaplan-Meier analysis of HFpEF stratified by k-means clustering is shown in Figure 4. The event-free rate was the lowest for patients in group 4 (p for trend <0.001).
Data are the z-score. Abbreviations are the same as those in Table 1.

Validation of the K-Means Clustering
Groups 2, 3, and 4 in the validation cohort, as in the original cohort, were associated with cardiac events independently of age, with hazard ratios comparable to those of the original cohort (Supplementary Table 6).

DISCUSSION
In the present study, patients with HFpEF were divided into four groups using k-means clustering. These groups had different etiologies and pathophysiologies. The event-free rates of cardiac events differed significantly among some groups. Using a cohort of 350 outpatients with documented HFpEF and a validation cohort of 133 independent outpatients with HFpEF, we demonstrated the feasibility and validity of the k-means clustering technique for HFpEF. Clustering is an unsupervised machine learning task that automatically divides data into clusters, or groups of similar items. It is used for knowledge discovery rather than prediction and provides insight into the natural groupings found within data (30,31). Thus, it is important that physicians understand the clinical significance of the resulting subgroups of HFpEF. K-means clustering is not as sophisticated as more modern clustering algorithms; however, it relies on a simple principle, assigning each point to its nearest centroid. K-means clustering is therefore easy to understand for physicians who are unfamiliar with machine learning, which may be a key advantage. It may consequently be easier for physicians to understand the functional features that characterize HFpEF, such as renal function, AF, mean mitral E/e′ ratio, and RV systolic and diastolic function, which were used in our study, and the combination of complications leading to the poorer prognosis of patients with HFpEF.

Group 1 (Younger Patients With Mild Symptoms and LV Relaxation Abnormality)
Group 1 was the youngest and had a relatively higher BMI (prevalence of obesity, 41%) and rate of prior coronary revascularization (35%). Generally, LV relaxation ability decreases with age; however, the mean mitral e′ of group 1 was the same as that of the other groups. Group 1 had LV relaxation abnormality, which is the most common cardiac dysfunction of HFpEF. Obesity is associated with LVH and incipient LV dysfunction (5). Ischemia can also influence the LV relaxation ability, even in the absence of overt ischemia, and it improves after coronary revascularization (35,36). Thus, these comorbidities may be associated with the poorer LV relaxation ability of group 1. LV relaxation abnormality may be associated with dyspnea on exertion through incomplete LV relaxation due to exercise-induced tachycardia (37). Considering these features, the pathophysiology of group 1 HFpEF resembles one of the previously reported phenotypes of HFpEF, exercise-induced diastolic dysfunction (38). Other cardiac functions and morphology were preserved in group 1, which may have been associated with the highest event-free rate among the groups.
Other abbreviations are the same as those in Table 1. Obesity was defined as a BMI ≥ 25 kg/m² based on the criteria proposed by the Japanese Society for the Study of Obesity. *, comparison between groups 1 and 2, p < 0.05; #, comparison between groups 1 and 3, p < 0.05; +, comparison between groups 1 and 4, p < 0.05; $, comparison between groups 2 and 3, p < 0.05; !, comparison between groups 2 and 4, p < 0.05; ¶, comparison between groups 3 and 4, p < 0.05; ‡, comparison between groups 3 and 4, p = 0.073.

Group 2 (Older Patients With Renal Dysfunction)
Group 2 was older and had a lower eGFR (prevalence of CKD, 66%). LV and RV function and morphology were preserved, except for the rate of increase in LV filling pressure suggested by a mean mitral E/e′ ratio >14. The DT of mitral inflow in group 2 was longer than in groups 3 and 4, which suggested that LV filling depended more on slow filling (39). Although LV relaxation ability in group 2 seemed to be the same as that in group 1, the proportion of patients with increased LV filling pressure in group 2 was higher than that in group 1. The mechanism of the increase in LV filling pressure in group 2 may not be advanced LV diastolic dysfunction, but instead volume overload due to renal dysfunction. Excessive sodium retention increases the extracellular fluid volume in patients with renal failure (40). As the left ventricle is not a volume pump, but a pressure pump (16), excessive sodium retention due to renal dysfunction may cause an increase in the LV filling pressure under the condition of LV relaxation abnormality (41). Renal dysfunction may also be associated with the increase in the prevalence of volume overload signs in group 2. As an inverse relationship between renal function and adverse cardiovascular outcomes has been reported (42), the comorbidity of CKD may also have led to the poorer prognosis of HFpEF in group 2. As LV and RV function and morphology were preserved compared with groups 3 and 4, chronic renocardiac syndrome (cardiorenal syndrome type 4) was assumed to be the pathophysiology of group 2 in this study (41).
Data are the number of patients (%), median (interquartile range), or mean ± SD. RV, right ventricular; RWT, relative wall thickness. Other abbreviations are the same as those in Table 1. *, comparison between groups 1 and 2, p < 0.05; #, comparison between groups 1 and 3, p < 0.05; +, comparison between groups 1 and 4, p < 0.05; $, comparison between groups 2 and 3, p < 0.05; !, comparison between groups 2 and 4, p < 0.05; ¶, comparison between groups 3 and 4, p < 0.05.

Group 3 (AF and Advanced Biventricular Diastolic Dysfunction)
Most patients in group 3 had atrial fibrillation (95%), and advanced LV and RV dysfunction were more common than in groups 1 and 2. Among them, LAVI was the largest, the DT of mitral inflow was the shortest, and the prevalence of less-distensible right ventricle was higher in group 3. HFpEF leads to AF via structural and functional remodeling of the left atrium. On the other hand, because AF itself causes left atrial dilatation, impaired atrial function, and atrial fibrosis, AF may be a direct cause of HFpEF (43). Due to the elimination of atrial contraction, ventricular filling depends more on the rapid filling phase, as suggested by the shorter DT of mitral inflow (39). LV diastolic function may be further impaired by AF, as suggested by the higher proportion of patients with an elevated mean mitral E/e′ ratio. Indeed, AF is also associated with LV myocardial fibrosis, which in turn leads to LV diastolic dysfunction, and successful cardioversion is associated with improvement of LV filling (43). Recently, we demonstrated that AF is associated with a decrease in RV distensibility (17). The relationship between RV diastolic function and AF has not been fully established yet; however, a vicious cycle may be formed between the right side of the heart and AF, similar to the relationship between the left side of the heart and AF. Moreover, chronically high LV filling pressure impacts RV diastolic function through ventricular interaction (44). LV and RV diastolic function in group 3 may deteriorate via these mechanisms. In particular, less-distensible right ventricle may play an important role in the higher prevalence of volume overload signs and cardiovascular event rates (18).

Group 4 (Older Patients With RV Afterload Mismatch and Renal Dysfunction)
Group 4 was characterized by older age and by comorbidities and cardiac dysfunction shared with groups 2 and 3, e.g., AF, RV dysfunction, and renal dysfunction. Moreover, a higher SPAP (SPAP > 35 mmHg, 59%) was also present. Less-distensible right ventricle, higher SPAP, and renal dysfunction deserve attention in the pathophysiology of group 4. When RV preload reserves are lost, as indicated by less-distensible right ventricle, the stroke volume decreases with increased RV afterload, resulting in RV afterload mismatch and further deterioration of the hemodynamics of HFpEF (18). RV failure negatively affects renal function through the increase in right atrial pressure, i.e., congestive kidney failure (5), which is related to a poorer prognosis of HFpEF (45). On the other hand, worsening renal function leads to sodium retention and may evoke volume overload in the presence of a less-distensible right ventricle because the RV preload reserve is limited. Thus, a possible pathophysiological mechanism of group 4 is the formation of a vicious cycle between RV afterload mismatch and renal dysfunction. Volume overload signs and symptoms resistant to loop diuretics and the poorest prognosis among the groups may have been caused by this vicious cycle.

Comparison With HFpEF Groups Identified in Previous Studies
Some of the comorbidities and demographics used to stratify patients with HFpEF have been reported previously; however, our study differed in several respects. Compared with previous reports (9-13), our subjects were older (median age 77 years), and the median age of our group 4 was 85 years. A group of such advanced age, similar to our group 4, has not previously been reported in machine learning studies. The subgroups of HFpEF identified in this study reflected hemodynamic concepts more than those in previous reports because LV and RV function, especially RV distensibility, were included as input features. Echocardiography is excellent for the assessment of cardiac function, but in comparison with LV diastolic function, there is a lack of guidance for the assessment and quantification of RV diastolic function (15). The examination of tricuspid inflow was recommended for the assessment of RV diastolic function (26); however, the echocardiographic assessment of RV function is often difficult due to the complex RV anatomy and these measures do not typically form part of a standard clinical echocardiographic study (15,46). Indeed, RV diastolic function assessed using tricuspid inflow was not included in previous reports (9)(10)(11)(12)(13). To overcome this problem, we focused on the jugular venous pulse. This method may be overlooked in RV assessment, but the waveform pattern can reflect the condition of the right ventricle (22,23). Indeed, we previously reported that the combination of a high RV systolic pressure and less-distensible right ventricle, a situation in which RV afterload mismatch is easily evoked, exhibited the poorest outcomes in HFpEF (18) and that beta-blockers may be useful for patients with HFpEF and preserved RV distensibility (20). The assessment of the jugular venous pulse waveform is therefore useful for the stratification of HFpEF. We hypothesize that the complications of RV afterload mismatch and renal dysfunction are associated with the poorest outcomes of HFpEF.
Thus, to our best knowledge, this is the first study in which RV distensibility assessed by jugular venous pulse was utilized to divide HFpEF by machine learning and we clarified a new phenotype of older age for HFpEF using k-means clustering.

Clinical Implication in the Pathophysiology of Group 4
The relationship between RV afterload mismatch and renal dysfunction is troublesome when deciding therapeutic strategies for patients in group 4. Only diuretics can improve volume overload and possibly the hemodynamics in HFpEF. Diuretics may improve congestive kidney disease, but their excessive use reduces the RV filling pressure, which reduces the stroke volume and may result in prerenal failure (22). Heart rate reduction may exert untoward action in patients with HFpEF and RV afterload mismatch because cardiac output depends more on heart rate. Our study also suggested that progression to cardiorenal syndrome associated with RV afterload mismatch should be prevented by appropriate treatments.

Study Limitations
Several methodological limitations must be considered. First, this was a retrospective study that was conducted at a single center and performed on consecutive patients meeting the eligibility criteria. As we required satisfactory imaging of echocardiography and jugular venous pulse, some patients, such as markedly obese patients with limited windows or fatty neck, may have been underrepresented. Moreover, patients with tachycardia may also have been excluded because of difficulty in separating the E and A waves in the mitral inflow or the "X" and "Y" descent of the jugular venous pulse, as described previously (17,18,20). Second, it is well-known that wild-type transthyretin amyloidosis (ATTRwt) is an underdiagnosed cause of HFpEF (47). If patients had a dominant Y descent in the jugular venous pulse waveform, constrictive pericarditis and/or cardiac amyloidosis were suspected. These diseases were examined as in our previous studies (17,18,20). However, our screening examinations, such as echocardiography, may have been insufficient to detect ATTRwt. Other screening examinations with a higher sensitivity, such as scintigraphy, are needed (47). Thus, early stages of amyloidosis may have been included in the present study. Third, k-means clustering is not as sophisticated as more modern clustering algorithms. As it uses an element of random chance, it is not guaranteed to find the optimal set of clusters and requires a reasonable guess as to how many clusters naturally exist in the data. Stratification by k-means clustering often involves some subjectivity on the part of the researcher even if a technique to find the optimal k, such as Hartigan's rule, is applied (31). To use k-means clustering, categorical data need to be converted to numerical data. Clustering methods also carry a risk of overlapping characteristics among clusters.
Although we performed a validation study to confirm our results, modern clustering algorithms that can analyze both categorical and numerical data may be more suitable for more precise stratification. Clustering methods have advanced since the inception of k-means and modern clustering algorithms may be superior; however, this does not mean that k-means is obsolete. K-means clustering is still used widely because of its simple principles, high flexibility, and satisfactory performance in many cases. The performance of a clustering algorithm depends on both the quality of the clusters themselves and what is done with the information (31). Moreover, as all learning algorithms are only as good as the input data, the features provided to the machine learning algorithm are also important. Indeed, by including RV diastolic function as a feature, we clarified a new phenotype of older age for HFpEF. Fourth, the criteria for HFpEF are updated once every few years and patients meeting the latest criteria were enrolled retrospectively in this study; therefore, many patients were excluded from the original sample, which may have caused selection bias. Lastly, this retrospective study was unable to establish a causal relationship. However, our results may be useful for the management of elderly patients with HFpEF because the prevalence of HFpEF will increase with age. Thus, further prospective clinical studies are warranted to confirm our results.

CONCLUSION
K-means clustering divided HFpEF into four groups. Older patients with HFpEF may suffer from complication of RV afterload mismatch and renal dysfunction. Our results may be useful for stratified medicine for HFpEF.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Imizu Municipal Hospital Ethics/Clinical Trial Review Committee. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
DH and HA worked on the conception, methodology, and formal analysis. Data collection was performed by DH, TN, and JT. DH wrote the manuscript. All authors approved the final version of the manuscript.