Comparing Multimorbidity Patterns Among Discharged Middle-Aged and Older Inpatients Between Hong Kong and Zurich: A Hierarchical Agglomerative Clustering Analysis of Routine Hospital Records

Background: Multimorbidity, defined as the co-occurrence of ≥2 chronic conditions, is clinically diverse. Such complexity hinders the development of integrated/collaborative care for multimorbid patients. In addition, the universality of multimorbidity patterns is unclear given scarce research comparing multimorbidity profiles across populations. This study aims to derive and compare multimorbidity profiles in Hong Kong (HK, PRC) and Zurich (ZH, Switzerland). Methods: Stratified by site, hierarchical agglomerative clustering analysis (dissimilarity measured by Jaccard index) was conducted with the objective of grouping inpatients into clinically meaningful clusters based on age, sex, and 30 chronic conditions among 20,000 randomly selected discharged multimorbid inpatients (10,000 from each site) aged ≥ 45 years. The elbow point method based on average within-cluster dissimilarity, complemented with a qualitative clinical examination of disease prevalence, was used to determine the number of clusters. Results: Nine clusters were derived for each site. Both similarities and dissimilarities of multimorbidity patterns were observed. There was one stroke-oriented cluster (3.9% in HK; 6.5% in ZH) and one chronic kidney disease-oriented cluster (13.1% in HK; 11.5% in ZH) in each site. Examples of site-specific multimorbidity patterns, on the other hand, included a myocardial infarction-oriented cluster in ZH (2.3%) and several clusters in HK with high prevalence of heart failure (>65%) and chronic pain (>20%). Conclusion: This is the first study using hierarchical agglomerative clustering analysis to profile multimorbid inpatients from two different populations to identify universalities and differences of multimorbidity patterns. Our findings may inform the coordination of integrated/collaborative healthcare services.


INTRODUCTION
Multimorbidity is commonly referred to as the co-occurrence of two or more chronic health conditions (1) and is consistently associated with poorer quality of life (2), more healthcare utilization (3), deteriorating mental health (4), and greater risk of mortality (5). Various models of care have been proposed and trialed to address this complexity in clinical practices (6,7).
Multimorbid patients are clinically diverse (8) and may have markedly different prognoses due to different disease combinations (9). Such heterogeneity limits the provision of integrated or collaborative care in various healthcare settings (10,11) as evidence-based clinical guidelines are inadequate (12) given the scarcity of randomized controlled trials conducted on multimorbid populations (13), even those with more prevalent disease combinations (14). This challenge is often further complicated by the need to manage multiple drug regimens (polypharmacy) (15) and the associated adverse effects (16).
Numerous attempts have thus been made to identify common patterns of multimorbidity to simplify the problem (8,17). Prados-Torres et al. (17) identified 97 different combinations of two or more co-occurring chronic conditions, which were mostly represented as cardiometabolic, musculoskeletal, and mental patterns. In a more recent review, Ng et al. (8) updated the literature search and evaluated the methods by which multimorbidity patterns are identified. Clustering analysis has been found to be the commonest approach to identifying disease patterns: grouping together similar diseases that are found in the same individuals (8,17). While this approach omits the possibility that one disease may belong with more than one cluster and that even individuals with the same disease may be clinically distinct from each other given other conditions, very few reviewed studies (18) adopted clustering analysis or other methods to group similar patients instead of similar diseases (8,17). Furthermore, very few studies have compared multimorbidity patterns across countries using the same methods (19). Similarities and dissimilarities between multimorbidity patterns observed in different contexts, therefore, remain unexplored. In fact, the presence of universal patterns may strengthen the rationale for more randomized controlled trials to be conducted on multimorbid patients because results would confer cross-country implications.
This study aims to describe and compare the patterns of co-occurring chronic conditions among discharged multimorbid inpatients in Hong Kong and Zurich using a hierarchical agglomerative clustering analytic approach, which is typically used for grouping similar individuals based on predetermined ranges of characteristics. Since the healthcare systems as well as the cultural and demographic characteristics differ drastically between the two sites, observations of similar multimorbidity patterns may imply universal challenges facing clinicians globally.

Study Design and Data Collection
We conducted a retrospective analysis of clinical records of discharged patients aged ≥ 45 from all public hospitals in Hong Kong (representing >90% of all inpatient services) during January 2010-December 2013 and from the University Hospital Zurich (general acute hospital, teaching hospital for University of Zurich) during August 2009-August 2017. The difference in observation periods was mainly due to the much smaller volume of data from Zurich, which comprises a single hospital, compared with the whole public hospital system in Hong Kong.
The data contained information on patients' age, sex, the length of stay, and the first 15 clinical diagnoses made during the hospital stay, coded with the International Classification of Diseases, Ninth Revision (ICD-9) for Hong Kong and with the Tenth Revision (ICD-10) for Zurich. Multimorbidity was defined as having two or more chronic conditions using a list of 30 diseases coded either by ICD-9 or ICD-10. The list of chronic conditions is based on validated coding algorithms summarized by Tonelli et al. (20) and included alcohol misuse, asthma, atrial fibrillation, chronic heart failure, chronic kidney disease, chronic pain, chronic pulmonary disease, chronic viral hepatitis B, cirrhosis, dementia, depression, diabetes, epilepsy, hypertension, hypothyroidism, inflammatory bowel disease, irritable bowel syndrome, lymphoma, metastatic cancer, multiple sclerosis, myocardial infarction, non-metastatic cancer (breast, cervical, colorectal, lung, and prostate), Parkinson's disease, peptic ulcer disease, peripheral vascular disease, psoriasis, rheumatoid arthritis, schizophrenia, severe constipation, and stroke or transient ischemic attack (TIA). These diseases are all convertible between ICD-9 and ICD-10 in accordance with the Tonelli algorithms, and therefore diseases could be mapped between the sites. Ten thousand patients with two or more of these conditions from each site were randomly selected with the "sample" function, according to the approach described by Ripley (21), in R, version 3.6.0 (R Foundation for Statistical Computing, Vienna, Austria). This sample size was approximately four times the minimum required number recommended for a clustering analysis of 32 variables (22). Thirty-day readmission and length of stay in hospital were measured to compare healthcare utilization between clusters and sites.
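The selection step described above can be illustrated with a minimal Python sketch (the study itself used R's "sample" function). It assumes diagnoses have already been mapped from ICD codes to condition names; the record layout, the function names `is_multimorbid` and `draw_sample`, the seed, and the shortened condition list are hypothetical and serve illustration only.

```python
import random

# Illustrative subset of the 30 chronic conditions named in the text.
CHRONIC_CONDITIONS = {
    "hypertension", "diabetes", "chronic heart failure",
    "chronic kidney disease", "stroke or TIA", "chronic pain",
}

def is_multimorbid(conditions):
    """True when a patient has two or more of the listed chronic conditions."""
    return len(set(conditions) & CHRONIC_CONDITIONS) >= 2

def draw_sample(records, size, seed=0):
    """Randomly select `size` multimorbid patients without replacement."""
    eligible = [r for r in records if is_multimorbid(r["conditions"])]
    return random.Random(seed).sample(eligible, size)

# Toy records (not study data): one entry per patient.
records = [
    {"id": 1, "conditions": ["hypertension", "diabetes"]},
    {"id": 2, "conditions": ["hypertension"]},
    {"id": 3, "conditions": ["stroke or TIA", "hypertension", "chronic pain"]},
    {"id": 4, "conditions": ["chronic kidney disease", "chronic heart failure"]},
    {"id": 5, "conditions": []},
]
sample = draw_sample(records, size=2)
```

In this sketch, patients 2 and 5 can never be drawn because they have fewer than two listed conditions.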
We considered only the first discharge of each patient during the data collection period to observe their clinical profiles and length of stay, then followed them up 30 days after their baseline discharge to observe any readmission.
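A minimal Python sketch of this follow-up logic is given here for illustration. The source does not state the exact operational definitions, so the sketch assumes that length of stay is the number of days between admission and discharge of the first (index) stay, and that a readmission is any later admission within 30 days of the index discharge. The function name `index_stay_and_readmission` and the tuple layout are hypothetical.

```python
from datetime import date

def index_stay_and_readmission(stays, window_days=30):
    """Return (length of index stay in days, readmitted within the window).

    `stays` holds (admission_date, discharge_date) tuples for one patient.
    The index stay is the one with the earliest discharge date.
    """
    ordered = sorted(stays, key=lambda stay: stay[1])
    index_admission, index_discharge = ordered[0]
    length_of_stay = (index_discharge - index_admission).days
    readmitted = any(
        0 <= (admission - index_discharge).days <= window_days
        for admission, _ in ordered[1:]
    )
    return length_of_stay, readmitted

# Toy stays (not study data).
early_return = [
    (date(2010, 1, 1), date(2010, 1, 5)),
    (date(2010, 1, 20), date(2010, 1, 25)),
]
late_return = [
    (date(2010, 1, 1), date(2010, 1, 5)),
    (date(2010, 3, 1), date(2010, 3, 4)),
]
```

Later stays are used only to flag readmission; their diagnoses and durations do not enter the patient's profile in this sketch.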

Clustering Analysis
Stratified by site (Hong Kong and Zurich), a hierarchical agglomerative clustering analysis implemented with the R function "hclust" was conducted to form clusters of patients, starting by grouping similar individuals in terms of age group (categorized according to the World Health Organization's 5-year intervals: 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, 80+), sex, and the presence of the listed chronic conditions, with dissimilarity between patients measured by the Jaccard index (23). Using Ward's method, the pair of clusters or patients merged in each step was the one associated with the smallest increase in the total within-cluster dissimilarity. To inform our decision on the number of clusters to be specified, we first plotted the total within-cluster dissimilarity by number of specified clusters to look for an elbow point at which total within-cluster dissimilarity ceased to decrease significantly. We then conducted a qualitative examination of the disease prevalence across clusters to determine the optimal number of clusters from a clinical perspective, i.e., the balance between interpretability and meaningful clinical grouping of patients. Clustering of patients was then compared between sites.
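The clustering procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation, which used R. It assumes each patient is encoded as a set of binary indicators (one age band, one sex, and the conditions present), and that Ward's criterion is applied through the Lance-Williams update directly on the Jaccard dissimilarities; the source does not specify the encoding or the Ward variant. The names `age_group`, `encode`, and `ward_agglomerate` are hypothetical.

```python
from itertools import combinations

def age_group(age):
    """Map an age of 45 or more to the 5-year bands listed in the text."""
    if age < 45:
        raise ValueError("study population is aged 45 or older")
    if age >= 80:
        return "80+"
    lower = (age // 5) * 5
    return f"{lower}-{lower + 4}"

def encode(age, sex, conditions):
    """Represent a patient as the set of binary features that are present."""
    return {f"age:{age_group(age)}", f"sex:{sex}"} | set(conditions)

def jaccard_dissimilarity(a, b):
    """1 - |A and B| / |A or B| for two feature sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def ward_agglomerate(profiles, n_clusters):
    """Merge patients bottom-up until `n_clusters` clusters remain."""
    clusters = {i: [i] for i in range(len(profiles))}
    dist = {
        frozenset((i, j)): jaccard_dissimilarity(profiles[i], profiles[j])
        for i, j in combinations(clusters, 2)
    }
    next_id = len(profiles)
    while len(clusters) > n_clusters:
        pair = min(dist, key=dist.get)          # closest pair of clusters
        i, j = tuple(pair)
        d_ij = dist.pop(pair)
        n_i, n_j = len(clusters[i]), len(clusters[j])
        merged = clusters.pop(i) + clusters.pop(j)
        for k, members in clusters.items():     # Lance-Williams Ward update
            n_k = len(members)
            d_ki = dist.pop(frozenset((k, i)))
            d_kj = dist.pop(frozenset((k, j)))
            dist[frozenset((k, next_id))] = (
                (n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij
            ) / (n_i + n_j + n_k)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())

# Toy patients (not study data).
patients = [
    encode(78, "M", ["stroke", "hypertension"]),
    encode(76, "M", ["stroke", "hypertension", "heart failure"]),
    encode(52, "F", ["chronic kidney disease", "atrial fibrillation"]),
    encode(54, "F", ["chronic kidney disease", "atrial fibrillation", "diabetes"]),
]
groups = ward_agglomerate(patients, n_clusters=2)
```

With these toy patients, the two older males with stroke form one group and the two younger females with chronic kidney disease form the other. The pairwise dissimilarity table grows quadratically with the number of patients, so a sketch like this is only practical for small samples.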
We performed statistical analyses with R and there were no missing data in the anonymized hospital records. An ethics waiver has been granted by Cantonal Ethics Committee of Zurich for the analysis of Zurich inpatient data (Ref: NZ-B-Nr.2017-00882) while the analysis of Hong Kong inpatient data was approved by the Survey and Behavioral Ethics Committee of the Chinese University of Hong Kong (Project Code: Elderly Care -CUHK). As only secondary analysis of anonymized inpatient data was performed, no informed consent was required.

RESULTS
Over the corresponding study periods (see Figure 1), there were 1,015,225 inpatients aged ≥ 45 discharged from Hong Kong public hospitals, among whom 144,711 were multimorbid; of the 102,936 discharged from the University Hospital Zurich, 37,574 were multimorbid. For each site, 10,000 patients were randomly selected for analysis. Table 1 shows the comparison of sample characteristics and disease prevalence between sites. First, the sample from Hong Kong had an older median age (75 vs. 70) and fewer males (51.2 vs. 57.6%) than the Zurich sample. Second, while the 30-day readmission rate was similar, the median length of the current stay was substantially greater in Zurich (7 vs. 4 days). Third, patients from Zurich had more diagnoses than those from Hong Kong, where only 1.4% of the sample had five or more diseases, compared with 5.0% in Zurich. There were notable differences in specific disease prevalence between the sites. The prevalence of alcohol misuse, epilepsy, and cancer among patients in Zurich was about triple that in Hong Kong. Far fewer Zurich patients had atrial fibrillation and heart failure compared with Hong Kong patients, and there was a sharp contrast in peripheral vascular disease prevalence (0.0% in Hong Kong vs. 10.4% in Zurich).
[...] in the Zurich sample but only 38,964 in the Hong Kong sample, plausibly because patients in Zurich had more diagnoses (see Table 1). In general, the patterns of disease dyads were fairly similar. Among all dyads in Hong Kong, 10,701 were related to hypertension, 7,815 to heart failure, and 2,967 to chronic pain. In Zurich, 13,294 were related to hypertension, [...]
Figure 3 shows the average within-cluster dissimilarity by number of specified clusters. Based on the elbow point method, within-cluster dissimilarity ceased to decline appreciably at five clusters. Hence, we examined clustering schemes from 5 to 10 specified clusters in each site.
This clinical review of clustering schemes indicated that at nine clusters, a balance between interpretability and meaningful categorization of patients was achieved. Thus, clustering patterns with nine specified clusters are presented. The clusters are ordered by size (#1 being the largest and #9 the smallest). Figures 4 and 5 show the characterizing diseases by cluster (most prevalent diseases within clusters and diseases that were most prevalent compared with other clusters) in each site, respectively, while Tables 2 and 3 show exact disease prevalence, median age, median length of stay, and 30-day readmission rates of clusters.
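The elbow inspection used to shortlist the number of clusters can be sketched in Python as follows. The study located the elbow visually and then reviewed candidate schemes clinically; the second-difference rule below is a swapped-in heuristic for illustration only. The function names and the toy curve are hypothetical, and the curve values are not taken from Figure 3.

```python
from itertools import combinations

def jaccard_dissimilarity(a, b):
    """1 - |A and B| / |A or B| for two feature sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def average_within_cluster_dissimilarity(profiles, labels):
    """Mean Jaccard dissimilarity over all patient pairs sharing a cluster."""
    total, pairs = 0.0, 0
    for i, j in combinations(range(len(profiles)), 2):
        if labels[i] == labels[j]:
            total += jaccard_dissimilarity(profiles[i], profiles[j])
            pairs += 1
    return total / pairs if pairs else 0.0

def elbow_by_second_difference(curve):
    """Return the k after which the decline in dissimilarity flattens most.

    `curve` maps a number of clusters k to a within-cluster dissimilarity.
    """
    ks = sorted(curve)
    best_k, best_bend = None, float("-inf")
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        bend = (curve[prev] - curve[k]) - (curve[k] - curve[nxt])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Toy feature sets and cluster labels (not study data).
profiles = [
    {"sex:M", "stroke", "hypertension"},
    {"sex:M", "stroke", "hypertension", "heart failure"},
    {"sex:F", "chronic kidney disease", "atrial fibrillation"},
    {"sex:F", "chronic kidney disease", "atrial fibrillation", "diabetes"},
]
one_cluster = average_within_cluster_dissimilarity(profiles, [0, 0, 0, 0])
two_clusters = average_within_cluster_dissimilarity(profiles, [0, 0, 1, 1])

# Toy dissimilarity curve (made-up values).
toy_curve = {1: 0.90, 2: 0.70, 3: 0.52, 4: 0.50, 5: 0.49}
elbow = elbow_by_second_difference(toy_curve)
```

An automated rule of this kind would at most suggest a starting point; the final choice in the study rested on the clinical interpretability of the candidate schemes.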

Comparison of Clusters Between Sites
Each site had one stroke-oriented cluster (>90% prevalence): one among older adults in Hong Kong (H9, see Table 2) and another among males in Zurich (Z8, see Table 3). In both clusters, hypertension and heart failure were relatively prevalent. Figure 6 shows a comparison of the chord diagrams of these two clusters (H9 and Z8) representing all disease dyads in each site. Further, there was one chronic kidney disease-oriented cluster (>50% prevalence) in each site (H3 and Z3), and both clusters also featured atrial fibrillation (see Figure 7 for chord diagram comparison). While 53% of patients in Z3 suffered from chronic pain, H3 showed the highest prevalence of myocardial infarction across clusters in Hong Kong. The only myocardial infarction-oriented cluster in the study was generated for Zurich among relatively young males (Z9). Zurich featured two additional unique clusters, i.e., epilepsy sometimes combined with chronic pulmonary disease in young patients (Z7), and hypothyroidism in older females (Z5). Several clusters in Hong Kong featured high heart failure prevalence (H2, H6, H8, H4), and chronic pain was also relatively prevalent in those clusters. Two clusters in Zurich (Z6, Z2) showed heart failure prevalence over 50%, but only Z6 had a high proportion of chronic pain. Z2, on the other hand, featured the highest prevalence of peripheral vascular disease across all clusters, whereas that diagnosis was rare in Hong Kong. Some clusters in Hong Kong and Zurich had mixed features and were thus less clear, including chronic pulmonary disease with stroke and heart failure (H5), cirrhosis with severe constipation (H1), depression, dementia, and alcohol misuse (H7), chronic pain and diabetes (Z1), as well as metastatic cancer, cirrhosis, and hepatitis B (Z4).
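The disease dyads underlying the chord diagrams can be tallied with a short Python sketch. It assumes that a dyad is any unordered pair of listed conditions recorded for the same patient and that a dyad is "related to" a disease when the disease is one of its two members; the source does not spell out either definition. The function names `count_dyads` and `dyads_involving` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def count_dyads(patients):
    """Count every unordered pair of conditions that co-occur in a patient."""
    dyads = Counter()
    for conditions in patients:
        for pair in combinations(sorted(set(conditions)), 2):
            dyads[pair] += 1
    return dyads

def dyads_involving(dyads, disease):
    """Total number of dyads in which `disease` is one of the two members."""
    return sum(count for pair, count in dyads.items() if disease in pair)

# Toy condition lists (not study data): one entry per patient.
cluster_patients = [
    ["hypertension", "stroke"],
    ["hypertension", "stroke", "heart failure"],
    ["chronic kidney disease", "atrial fibrillation"],
]
dyads = count_dyads(cluster_patients)
```

Because a patient with n conditions contributes n(n-1)/2 dyads, a sample with more diagnoses per patient yields more dyads in total even when the number of patients is the same.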

DISCUSSION
Findings of this study provide an overview of the clustering patterns of multimorbidity, based on which further investigations could be conducted to inform potential integration of services and collaboration between medical specialties. In addition, the comparison between patient records from Hong Kong and Zurich provides preliminary results on the degree of universality of multimorbidity patterns across world populations. Overall, there were no striking similarities or dissimilarities of the identified clustering patterns beyond established disease relationships between the two sites, with the co-occurrence of known comorbidities being most frequently observed. Specifically, only two out of nine clusters were found to be common across the two sites. In both sites, especially in Hong Kong where fewer diagnoses were recorded, disease dyads mostly fell within the same medical specialties, which may suggest the importance of integrated practices within the specialty relative to cross-specialty collaboration.

Interpretation and Implications
While a focused examination of disease prevalence within each cluster reflects only the previously identified disease relationships, the overall findings jointly represent the distribution of morbidity burden within and across clusters among multimorbid patients in a realistic healthcare setting. In other words, an overall clinical profile of multimorbid inpatients in terms of a variety of chronic conditions is presented. While the differences between the sites may be attributed to the different healthcare delivery and financing mechanisms as well as cultural and demographic factors, the identified similarities of multimorbidity patterns may suggest common specific segments of patients requiring further attention across populations, and the results convey important information at the management level for the planning and coordination of services for multimorbid patients in hospitals and other healthcare facilities. For instance, although it is commonly known that chronic kidney disease and atrial fibrillation are closely related diseases (24), our analysis further showed that patients having these two conditions constituted a significant proportion of multimorbid patients in both sites. Hence, the successful implementation of integrated care for these patients may alleviate the healthcare burden of multimorbidity significantly.
Likewise, the presence of stroke-oriented clusters in both sites, despite their different demographics, suggests that stroke and associated morbidities (25,26) represent a sizeable proportion of multimorbid patients in a hospital setting. While there are existing integrated services for stroke patients in typical healthcare systems of developed societies, it is also important to assess the degree to which these patients contribute to the total burden of multimorbidity.
There are also unique clusters in each site which may be of clinical importance. Specifically, only in Zurich did we observe a myocardial infarction-oriented cluster (99% prevalence). Also, the prevalence of peripheral vascular disease was drastically higher in the Zurich sample than in the Hong Kong sample. These results may suggest potentially different etiologies of cardiovascular diseases between Hong Kong and Zurich due to different lifestyles, living environments, and economic structures. Nevertheless, such differences may also be attributed to different specialization foci of the hospitals, different referral patterns, and other practices that may differ between sites. More observations are needed to investigate the underlying reasons.
In Hong Kong, the clusters featuring high prevalence (>65%) of heart failure also had relatively high prevalence of chronic pain, which was partially the case in Zurich. This may relate to the regular practice of hospital clinicians in the assessment of heart failure, which includes the report of pain (27, 28). If confirmed by further research, this raises the question of whether chronic pain prevalence is currently being underestimated among patients with other diseases (without assessment of pain) and, hence, whether it is necessary to include the assessment of pain for them. In fact, the observed prevalence of heart failure is apparently higher than in previous inpatient research in other populations, such as in Canada (29). Further research is recommended to examine the underlying reasons for this difference.
In each of the two sites, there existed highly complex clusters (H7 and Z4) in which a wide variety of diseases was featured. However, as these clusters did not constitute a markedly large proportion of multimorbid patients, integrated or collaborative care for the rest of the clusters with obvious characterizing diseases should be the priority for alleviating the healthcare burden of multimorbidity.

Relationship With the Literature
While the results of this study are context specific, they provide preliminary information on the replicability of multimorbidity profiling between populations. In the literature, there are at least two recent important systematic reviews on the results of multimorbidity profiling from previous research with a huge variety of statistical methods (8,30). It has been suggested that, with the exceptions of mental and cardiovascular patterns, other patterns did not seem to show good evidence of universality across populations even when stratified by statistical methods of multimorbidity profiling (30). However, even with highly sophisticated meta-analytic approaches, it should be noted that across studies, different ranges of chronic conditions and specifications of statistical analyses were adopted. Therefore, evidence from analyses with the same methods on different populations is important to further our understanding. This study is one of the first attempts to narrow such a gap and, as far as we are aware, the first to adopt hierarchical agglomerative clustering analysis to compare multimorbid patient clustering patterns.

Strengths and Limitations
Despite this novelty and the methodological strength of the well-validated algorithms to define chronic conditions using ICD-9 and ICD-10, this study has several limitations that require caution in interpreting the results. First, while the Hong Kong data are highly representative of the inpatients of the public sector, we did not have access to patient records in the private sector, which may specialize in conditions different from those found prevalent in our sample. Also, University Hospital Zurich was our only source of Zurich data; nevertheless, it is a well-established general acute hospital representing diverse patient intake and standard practices. Second, the validity and comprehensiveness of diagnostic codes may differ between sites [...] a limited predetermined list of 30 diseases for the clustering analysis. Diseases not included in the list were not used for multimorbidity profiling. However, this approach allowed us to use well-validated algorithms (20) that enable first insight into the question of universality without having to validate these algorithms for the specific settings. Importantly, the present study provides preliminary evidence of some degree of universality across populations. Sixth, the comparison between the sites was mainly qualitative, and no statistical tests were applied to quantitatively summarize the differences in the clustering patterns. However, our approach combined clinical reasoning and exploratory machine learning methods in addressing the problem, which may facilitate future research with similar purposes. Last but not least, as the samples were randomly drawn from the Hong Kong and Zurich populations only, external generalizability is limited and the analysis should be replicated in other populations to verify the results.
To conclude, we conducted a hierarchical agglomerative clustering analysis on discharged inpatients from Hong Kong and Zurich based on a list of 30 diseases and provided findings on the universality of multimorbidity patterns across inpatient populations from the two places. The results should facilitate the experimentation and development of integrated or collaborative care for multimorbid patients in the healthcare systems of both populations. Future research should adopt more representative samples, longitudinal disease coding, and comprehensive lists of chronic conditions to verify the results of this study and make recommendations on the care planning and coordination of services for multimorbid inpatients.

DATA AVAILABILITY STATEMENT
The data analyzed in this study is subject to the following licenses/restrictions: Authorization to access the data may be considered by the Hospital Authority of Hong Kong upon reasonable requests. Requests to access these datasets should be directed to Hospital Authority of Hong Kong, hacpaaedr@ha.org.hk.

ETHICS STATEMENT
An ethics waiver has been granted by Cantonal Ethics Committee of Zurich for the analysis of Zurich inpatient data (Ref: NZ-B-Nr.2017-00882) while the analysis of Hong Kong inpatient data was approved by the Survey and Behavioral Ethics Committee of the Chinese University of Hong Kong (Project Code: Elderly Care -CUHK). As only secondary analysis of anonymized inpatient data was performed, no informed consent was required.

AUTHOR CONTRIBUTIONS
FL and PB conceptualized the study design, conducted the analysis, and drafted the manuscript. EB and SW supervised the analysis and result interpretation and critically commented on the manuscript drafts. All authors contributed to the interpretation of results and helped revise the manuscript.