Cardiovascular Phenotypes Profiling for L-Transposition of the Great Arteries and Prognosis Analysis

Objectives: Congenitally corrected transposition of the great arteries (ccTGA) is a rare and complex congenital heart disease characterized by double discordance. Its numerous co-existing anomalies complicate prognosis evaluation and clinical decision-making. We aimed to delineate a novel ccTGA clustering modality guided by the human phenotype ontology (HPO) and to elucidate the relationship between phenotypes and prognosis in patients with ccTGA. Methods: We retrospectively reviewed 270 patients diagnosed with ccTGA at Fuwai Hospital from 2009 to 2020 and performed a cross-sectional follow-up. An HPO-instructed clustering method was applied to ccTGA risk stratification. Kaplan-Meier survival analysis, landmark analysis, and Cox regression analysis were used to investigate differences in outcomes among clusters. Results: The median follow-up time was 4.29 (2.07–7.37) years. Three distinct phenotypic clusters were obtained after HPO-instructed clustering, with 21 patients in cluster 1, 136 in cluster 2, and 113 in cluster 3. Landmark analysis revealed significantly worse mid-term outcomes in cluster 3 than in clusters 1 and 2 for all-cause mortality (p = 0.021) and composite endpoints (p = 0.004). Multivariate analysis indicated that pulmonary arterial hypertension (PAH), atrioventricular septal defect (AVSD), and arrhythmia were risk factors for composite endpoints. Moreover, surgical treatment differed significantly among the three clusters (p < 0.001), and surgical strategies had different effects on the prognosis of the different phenotypic clusters. Conclusions: HPO-instructed clustering can be a potentially powerful tool for phenotypic risk stratification in patients with complex congenital heart diseases, and may improve prognosis prediction and clinical decision-making.


INTRODUCTION
Congenitally corrected transposition of the great arteries (ccTGA), a rare and anatomically complex congenital heart disease with an incidence of approximately 1 in 33,000 live births, is characterized by atrioventricular and ventriculoarterial discordance (1). Accompanying anomalies are almost ubiquitous; the most common are ventricular septal defect (VSD, 70%), pulmonary stenosis (40%), and systemic atrioventricular valve abnormality (2). The resulting heterogeneous physiological and hemodynamic conditions impede clinical decisions and prognostic evaluation.
The human phenotype ontology (HPO), a comprehensive resource for systematically defining and logically organizing human phenotypes, enables computational inference and complex algorithms that support combined genomic and phenotypic analysis (3). It has been used in multiple fields, particularly in genotype-phenotype analysis for genetic syndromes (4), neurodevelopmental diseases (5-7), hereditary hemorrhagic telangiectasia (8), and myofibrillar myopathies (9). Moreover, HPO has been adopted as a powerful tool for personalized and precision medicine (3,10). The scope of HPO applications has gradually broadened as HPO-related tools have evolved, such as Doc2Hpo (11) for HPO concept curation and HPOLabeler (12) for human protein-phenotype studies. Combined with electronic medical records (EMRs), HPO can be used to construct longitudinal footprints of genetic disorders (13) and to expedite genetic diagnoses (14). We previously succeeded in grouping patients with Ebstein's anomaly by employing HPO and EMR (15).
Several reports have investigated the postoperative outcomes of patients with ccTGA, most of whom also had VSD and pulmonary stenosis (or left ventricular outflow tract obstruction) (16,17). However, most studies did not take all of the cardiovascular phenotypes into consideration for prognostic analysis, so bias might exist. Here, we delineated a comprehensive picture of the cardiovascular phenotypes of 270 patients with ccTGA, clustered the patients according to phenotypic similarity using HPO and EMR, and analyzed the outcomes in combination with three types of surgical strategies. We aimed to elucidate the relationship between phenotype and prognosis and to provide a novel phenotypic stratification strategy that might improve prognosis prediction and clinical decision-making in ccTGA.

Study Population
By retrospectively reviewing records from 2009 to 2020, we identified 380 patients who were admitted to Fuwai Hospital and underwent surgery with a diagnosis of ccTGA in the Chinese EMR. The International Classification of Diseases, 10th Revision (ICD-10) has been adopted in our center, and all diagnoses in Chinese can be mapped to this system. Standardized ICD-10 diagnoses were further annotated based on HPO. The diagnoses of all enrolled patients were confirmed by echocardiography and surgery. Baseline demographics, echocardiographic information, electrocardiographs, cardiac CT, catheter data, operation and progress notes, and status at discharge were manually reviewed by two authors and a specialized cardiologist. There was no disagreement in the EMR reviewing process. We then excluded patients with complex congenital heart disease (pulmonary atresia, single ventricle, and double outlet right ventricle), which might skew our clustering results, so that ccTGA was the main diagnosis. A total of 300 patients were included in further analysis. This study was approved by the Ethics Committee of Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (approval no. 2020-1402). Owing to the retrospective nature of the study, informed consent was waived. The whole process of this study is displayed in Supplementary Figure 1.
Abbreviations: ccTGA, congenitally corrected transposition of the great arteries; HPO, human phenotype ontology; VSD, ventricular septal defect; AVSD, atrioventricular septal defect; PAH, pulmonary arterial hypertension; TR, tricuspid regurgitation; PVS, pulmonary valve stenosis; SVEF, systemic ventricular ejection fraction; SVEDD, systemic ventricular end diastolic diameter; MR, mitral regurgitation; PI, pulmonary insufficiency; EMR, electronic medical records.

Phenotype Annotation Based on HPO
We collated cardiovascular phenotypes from surgical and echocardiographic records (before definitive surgery) and standardized them in accordance with the HPO database (https://hpo.jax.org/). All included terms were descendants of "HP:0001626, abnormality of the cardiovascular system" except for "HP:0002092, pulmonary arterial hypertension" (PAH). Phenotypes that did not appear in the HPO database were mapped to their parent terms; for example, "junctional escape rhythm" was recorded as "arrhythmia." In total, 58 terms were annotated.
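The fallback rule for findings that are absent from HPO (trace the finding upward until a term that is present is reached) can be sketched in Python as follows. This is an illustration only: the `PARENT` links, the `IN_ONTOLOGY` set, and both function names are hypothetical, use plain labels rather than real HPO identifiers, and are not taken from the study's annotation process.

```python
# Toy child -> parent links (labels only; NOT a copy of the real HPO graph).
PARENT = {
    "junctional escape rhythm": "arrhythmia",
    "arrhythmia": "abnormality of the cardiovascular system",
    "ventricular septal defect": "abnormality of the cardiovascular system",
}

# Labels assumed to exist as ontology terms (hypothetical subset).
IN_ONTOLOGY = {
    "arrhythmia",
    "ventricular septal defect",
    "abnormality of the cardiovascular system",
}


def annotate(label):
    """Return the label if it is an ontology term, else its nearest ancestor that is."""
    current = label.strip().lower()
    while current not in IN_ONTOLOGY:
        if current not in PARENT:
            raise KeyError(f"no ontology ancestor known for {label!r}")
        current = PARENT[current]
    return current


def annotate_patient(labels):
    """Map a patient's recorded findings to a de-duplicated, sorted list of terms."""
    return sorted({annotate(label) for label in labels})


print(annotate("junctional escape rhythm"))
print(annotate_patient(["junctional escape rhythm", "arrhythmia", "ventricular septal defect"]))
```

De-duplicating after the fallback matters because a specific finding and its parent term may both be recorded for the same patient.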

Patients Clustering by HPO
The principles of phenotype clustering based on HPO were as previously described (15,18-20). Clustering began by calculating the similarity between any two annotated phenotypes. Using the frequency information of each phenotype in the HPO dataset [p(p)], we defined the similarity of a pair of phenotypes (e.g., p1 and p2) over v ∈ anc(p1) ∩ anc(p2), the set of common ancestors of p1 and p2.
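As an illustration of a frequency-based similarity between two terms, the Python sketch below implements Lin's measure, sim(p1, p2) = 2 · max IC(v) / (IC(p1) + IC(p2)), where v ranges over the common ancestors and IC(t) = -log p(t). Choosing Lin's measure is an assumption (it is understood to be the default term similarity in "ontologySimilarity"); the exact expression used in the study is not reproduced here. The toy ontology, the frequencies, and all function names are hypothetical.

```python
import math

# Toy ontology: term -> set of direct parents (hypothetical, not real HPO data).
PARENTS = {
    "cardiovascular": set(),
    "septal defect": {"cardiovascular"},
    "VSD": {"septal defect"},
    "AVSD": {"septal defect"},
    "arrhythmia": {"cardiovascular"},
}

# Hypothetical annotation frequencies p(t); the root has frequency 1 and an
# ancestor is at least as frequent as any of its descendants.
FREQ = {
    "cardiovascular": 1.0,
    "septal defect": 0.4,
    "VSD": 0.2,
    "AVSD": 0.05,
    "arrhythmia": 0.3,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log p(t): rarer terms are more informative."""
    return math.log(1.0 / FREQ[term])


def lin_similarity(t1, t2):
    """Lin similarity: 2 * max IC over common ancestors / (IC(t1) + IC(t2))."""
    denom = information_content(t1) + information_content(t2)
    if denom == 0:
        return 1.0  # both terms are the root and carry no information
    common = ancestors(t1) & ancestors(t2)
    return 2.0 * max(information_content(v) for v in common) / denom


print(round(lin_similarity("VSD", "AVSD"), 3))
print(lin_similarity("VSD", "arrhythmia"))
```

Two terms that share only the root score zero, because the root has zero information content; two septal defects score higher because their shared ancestor is rarer.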
Frontiers in Cardiovascular Medicine | www.frontiersin.org
Because each patient is defined by a set of phenotypes, we calculated a pairwise patient similarity matrix (e.g., for patients c1 and c2) based on "between term set" similarities. Guided by the R package "ontologySimilarity" and based on the similarity matrix (sim_mat), we calculated a distance matrix [max(sim_mat) - sim_mat]. Using the distance matrix, we performed unsupervised hierarchical clustering with the R package "pheatmap." For the parameters of the function "pheatmap," the complete linkage method was employed by default, and "cutree_col" was set to three to obtain three phenotypically heterogeneous clusters (https://cran.r-project.org/web/packages/ontologyIndex/vignettes/intro-to-ontologyX.html, https://rdrr.io/cran/ontologySimilarity/f/vignettes/ontologySimilarity-examples.Rmd).
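The three steps in this paragraph (patient-to-patient similarity, conversion to a distance matrix as max(sim_mat) - sim_mat, and complete-linkage clustering cut into a fixed number of clusters) can be sketched in Python. The study itself used the R packages named above; this sketch is not that code. It assumes that the "between term set" similarity is a best-match average, and the toy term similarities, patients, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical term-to-term similarities; identical terms score 1, unlisted pairs 0.
TOY_SIM = {frozenset({"VSD", "PVS"}): 0.1, frozenset({"TR", "PAH"}): 0.2}


def toy_term_sim(t1, t2):
    """Stand-in for an ontology-based term similarity."""
    return 1.0 if t1 == t2 else TOY_SIM.get(frozenset({t1, t2}), 0.0)


def set_similarity(terms_a, terms_b, term_sim):
    """Best-match average: pair each term with its best match in the other set."""
    a_to_b = sum(max(term_sim(a, b) for b in terms_b) for a in terms_a) / len(terms_a)
    b_to_a = sum(max(term_sim(b, a) for a in terms_a) for b in terms_b) / len(terms_b)
    return (a_to_b + b_to_a) / 2.0


def distance_matrix(patients, term_sim):
    """Build sim_mat for all patient pairs, then return max(sim_mat) - sim_mat."""
    n = len(patients)
    sim = [[set_similarity(patients[i], patients[j], term_sim) for j in range(n)]
           for i in range(n)]
    top = max(max(row) for row in sim)
    return [[top - s for s in row] for row in sim]


def complete_linkage(dist, k):
    """Agglomerative clustering with complete linkage, stopped at k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        # Complete linkage: cluster distance is the largest member-to-member distance.
        _, x, y = min(
            (max(dist[i][j] for i in clusters[x] for j in clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = label
    return labels


patients = [{"VSD", "PVS"}, {"VSD"}, {"TR", "PAH"}, {"TR"}, {"arrhythmia"}]
labels = complete_linkage(distance_matrix(patients, toy_term_sim), k=3)
print(labels)  # [1, 1, 2, 2, 3]
```

Stopping the merges at k clusters plays the same role as cutting the dendrogram into three groups; the naive search over cluster pairs is adequate for a few hundred patients but is not optimized.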

Follow-Up and Clinical Adverse Outcomes
The records of patients who were re-examined in our center were retrieved (such as ECG, echocardiography, any records of readmission, or reported adverse events). A follow-up telephone interview for all enrolled patients was also conducted in March 2021 (inquiring about survival status, morbidity, any adverse events, or reintervention), and records of visits to other hospitals were obtained if available. Total clinical adverse events included all-cause mortality, heart failure, and reinterventions. Primary surgery was defined as the definitive corrective surgery, and reintervention was any heart surgery performed after the primary surgery. We treated all-cause mortality as the primary endpoint, and all-cause mortality plus heart failure as the composite endpoint. For patients who could not be reached, the last medical visit records in our hospital were used as the basis for outcome judgment.
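The endpoint definitions in this section can be made explicit with a small Python sketch. The `FollowUp` record and the function names are hypothetical and serve only to show that reintervention counts as a clinical adverse event but is not part of the composite endpoint.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Outcome flags for one patient (hypothetical record layout)."""
    died: bool = False
    heart_failure: bool = False
    reintervention: bool = False


def primary_endpoint(p):
    """Primary endpoint: all-cause mortality."""
    return p.died


def composite_endpoint(p):
    """Composite endpoint: all-cause mortality or heart failure."""
    return p.died or p.heart_failure


def any_adverse_event(p):
    """Total clinical adverse events: death, heart failure, or reintervention."""
    return p.died or p.heart_failure or p.reintervention


reoperated = FollowUp(reintervention=True)
print(composite_endpoint(reoperated), any_adverse_event(reoperated))  # False True
```

Because one patient can have more than one event type, per-event counts may add up to more than the number of patients with any adverse event.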

Surgical Classification
Because surgical procedures were diverse, we classified them into three types, approximately consistent with a recent study (21).
• Anatomic repair (arterial switch/double arterial root switch/Rastelli with Senning or Mustard, arterial switch/Rastelli with Hemi-Mustard, and bidirectional Glenn).
• Physiologic repair (any cardiac surgery other than anatomical repair and permanent pacemaker placement, so that the morphological right ventricle remains the systemic ventricle).
• Fontan palliation.

Statistical Analysis
All statistical analyses in this study were performed with SPSS Statistics Version 23.0 (IBM Corporation, Armonk, NY, USA) and R software version 3.6.2 (R Foundation for Statistical Computing, Vienna, Austria). Categorical variables were summarized as frequencies (percentages), and continuous variables were summarized as mean ± SD or median (25th to 75th percentiles). Groups were compared with the χ² test; when group size was less than 10, the Fisher exact test and Kruskal-Wallis test were adopted.
FIGURE 1 | HPO terms encoded for the ccTGA-associated cardiovascular anomalies. The tree plot shows the relationship of all the annotated phenotypes. Circles with borders are the phenotypes presented in our cohort (the phenotypes absent in the HPO database are not shown). The shade of the color represents the frequency of terms in the HPO database (the darker the color, the higher the frequency, with the color key on the top). Arrows indicate the relationship of affiliation between phenotypes. HPO, the human phenotype ontology; ccTGA, congenitally corrected transposition of the great arteries.
FIGURE 2 | (A) The phenotypic similarity was calculated to generate the distance matrix, which was further used to produce the heatmap. Both the horizontal and vertical axes indicate patients with ccTGA. The dashed line shows the height at which the tree was cut into three groups. The color represents the degree of similarity between patients (lighter color indicates higher similarity, with the color key on the right). Yellow, blue, and red distinguish the three clusters (yellow for cluster 1, blue for cluster 2, and red for cluster 3); (B) Number of phenotypes of patients in each cluster. ccTGA, congenitally corrected transposition of the great arteries.
The Kaplan-Meier method was used to estimate freedom from adverse events and overall survival, and the log-rank test was used to compare groups. Survival time began at the definitive surgery and ended at death, event, or the last follow-up. Kaplan-Meier analysis was stratified by cluster to test for significant differences, and by surgical strategy to explore the influence of surgical type on the novel clustering modality. Because the mid- to long-term outcome is the main issue in patients with ccTGA, landmark analysis was used for piecewise analysis of patient outcomes. To identify factors associated with adverse events and overall survival, univariable Cox proportional hazards regression was used. After stepwise selection of potential variables, multivariable Cox proportional hazards regression was used to further validate the significant univariate factors (p < 0.05). The Benjamini-Hochberg method was used to adjust p-values.
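The Kaplan-Meier estimate and the landmark split described in this section can be sketched in plain Python. The study used SPSS and R, so this is not the authors' code; the function names and the example follow-up data are hypothetical, and the log-rank test and Cox regression are deliberately left out.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up in years from the definitive surgery
    events : True if the endpoint occurred, False if the patient was censored
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - failed / at_risk
        curve.append((t, survival))
    return curve


def split_at_landmark(times, events, landmark):
    """Split follow-up into an early and a late period at the landmark time.

    Early: every patient, with follow-up truncated (censored) at the landmark.
    Late : only patients still under follow-up past the landmark, with the
           clock restarted at the landmark.
    """
    early_t = [min(t, landmark) for t in times]
    early_e = [e and t <= landmark for t, e in zip(times, events)]
    late = [(t - landmark, e) for t, e in zip(times, events) if t > landmark]
    late_t = [t for t, _ in late]
    late_e = [e for _, e in late]
    return (early_t, early_e), (late_t, late_e)


# Hypothetical follow-up data (years, event flag) for six patients.
times = [1, 2, 3, 5, 6, 8]
events = [True, False, True, True, False, True]

early, late = split_at_landmark(times, events, landmark=4)
print(kaplan_meier(*early))
print(kaplan_meier(*late))
```

Analyzing the late period separately is what allows curves that overlap early but diverge later to be compared from the landmark onward.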

Patient Demographics and Characteristics
Baseline demographic characteristics of this cohort are shown in Table 1. The median age at definitive surgery was 5.4 years (2.1-23.25 years), with 107 female patients (39.6%). The mean follow-up time was 4.88 ± 3.47 years, and the median follow-up time was 4.29 (2.07-7.37) years. Thirty patients were lost to follow-up, and the follow-up rate reached 90% (270/300). Clinical adverse events occurred in a total of 48 patients, of whom 19 died of various causes, 7 had heart failure, and 28 received reinterventions. Table 1 shows the co-existing anomalies present in more than 10 patients. VSD, tricuspid regurgitation (TR), and pulmonary valve stenosis (PVS) were the top three concomitant defects, with frequencies of 80.8, 47.4, and 42.6%, respectively. Preoperative systemic ventricular ejection fraction (SVEF) and systemic ventricular end diastolic diameter (SVEDD) differed significantly among the three clusters (p = 0.002 and p = 0.001, respectively). However, the proportion of patients with SVEF <40% was comparable among the three clusters (p = 0.082). Notably, the occurrence of TR, the most concerning issue in ccTGA, differed significantly among the three clusters (p < 0.001), with the severity shown in Table 1.

HPO-Based Clustering for Patients With ccTGA
We collated the cardiovascular phenotypes of all patients and annotated them to 58 terms. Detailed information on the 58 HPO terms and the abbreviations assigned in the study is presented in Supplementary Table 1. A tree plot was generated to show the distribution and affiliation of each term, and the shade of the color represents the frequency of each item in the HPO database (Figure 1). Most patients presented three or four additional terms; the median number of additional terms per patient was three (range, one to nine) (Figure 2B).
We then performed unsupervised hierarchical clustering to classify the 270 patients into three clusters based on their phenotypic similarity, with sizes of 21, 136, and 113, respectively (Figure 2A). We compared the distribution of the 58 terms across the three clusters and found that 15 of the 58 terms differed significantly in distribution (Table 1, Supplementary Table 2, Supplementary Figure 2).
Each of the three clusters had its own characteristic phenotypic distribution. Patients in cluster 1 had higher frequencies of TR, PAH, cardiomegaly, and arrhythmia. Septal defects and PVS occurred more frequently in cluster 2. Cluster 3 had no distinct phenotypic characteristics but a wide range of phenotypes. Next, we analyzed the distribution of phenotypic combinations in the three clusters and found that cluster 2 had more homogeneous phenotypes, while clusters 1 and 3 were more heterogeneous (Supplementary Figure 3). In cluster 3, more complex phenotypic combinations were observed: a total of 46 distinct terms occurred in 87 different combinations. Seventy-eight patients had a unique phenotypic combination, and the median number of terms carried by each patient in cluster 3 was four, compared with three in clusters 1 and 2 (Figure 2B).
Next, we analyzed the differences in clinical outcomes among the three clusters. Because the co-existing phenotypes varied among the three clusters, they inevitably led to different physiological conditions that might eventually affect survival and increase the risk of adverse events. We found no significant difference among the three clusters in the overall survival rate (p = 0.32, Figure 3A) or composite endpoints (p = 0.15, Figure 3B). However, we noticed that the curves for composite endpoint events began to diverge at about 4 years postoperatively. We therefore performed a landmark analysis and found that patients in cluster 3 had significantly worse mid-term outcomes than those in clusters 1 and 2 (p = 0.021 for overall survival rate, p = 0.004 for composite endpoints; Figures 4A,B).
Finally, we analyzed the impact of surgery on patients and its interaction with phenotypes. Surgical strategies varied significantly among the three clusters (p < 0.001), which might affect the results. For the whole cohort, surgical strategy did not affect the overall survival rate or composite endpoints (p = 0.18 for overall survival rate, p = 0.35 for composite endpoints; Supplementary Figures 4A,B), whereas it contributed markedly to the risk of reinterventions (p = 0.0015), mainly in cluster 2 (p = 0.039; Supplementary Figures 5A,B, 6A,B). We then compared the outcomes of patients receiving the same surgical strategy across the three phenotypic clusters and found that cluster 3 had a worse mid-term outcome than clusters 1 and 2 when surgery was limited to physiologic repair (Supplementary Figures 7A,B, 8A,B, 9A,B), which was largely consistent with the results of comparing the three phenotypic clusters irrespective of surgery (Figures 3A,B).

DISCUSSION
The wide range of co-existing anomalies remains a conundrum in the ccTGA population and causes a socioeconomic burden owing to the broad age range of this population (18-20). In this study, for the first time, we comprehensively summarized the cardiovascular phenotypes of patients with ccTGA and analyzed how these combined phenotypes, together with different surgical strategies, contribute to clinical outcomes. We found that patients with more complex phenotypes had a significantly worse mid-term prognosis, and that surgery had different effects on the prognosis of patients with different phenotypes.
Given the great heterogeneity in anatomy, hemodynamics, and electrophysiology, the presentation, course, management, and outcomes are determined not only by ccTGA itself but also by co-existing anomalies (22). Most patients with ccTGA have at least one additional cardiac phenotype, and these phenotypes significantly affect the natural course of the disease (2). In our cohort, a total of 58 phenotypes were identified, and most patients presented with three or more additional phenotypes (Figures 1, 2B). PAH, AVSD, MR, PI, cardiomegaly, and levocardia were risk factors for either death or composite endpoints (Table 2). Although TR is a common concern in the ccTGA population, it did not contribute to survival or reintervention, which was consistent with reports from other centers (23,24). However, in a previous report from our center, severe TR was associated with composite endpoints, such as death, heart transplantation, and congestive heart failure (25). The discrepancy may be because we did not restrict the degree of TR in this study. In addition to structural abnormalities, the arrhythmic burden of patients with ccTGA, i.e., paroxysmal supraventricular tachycardia, atrial arrhythmia, and complete atrioventricular block, was high and increased with time (26). Even for patients with isolated ccTGA, the occurrence of arrhythmia was reported as one of the important factors affecting the natural history (22). Consistent with this, our study showed that arrhythmias were significantly associated with patient outcomes (p = 0.015). Collectively, the various phenotypes in the ccTGA population are crucial elements in the evaluation of patient prognosis.
In previous studies, VSD and left ventricular outflow tract obstruction were the major co-anomalies considered for patient grouping, and the effects of phenotypes that were less influential or clinically considered less symptomatic were not examined. However, there is no consensus about which phenotypic combinations may have unexpected consequences for patients. As a comprehensive and widely used database of human phenotypes, HPO provides a useful tool for clustering patients with phenotypic heterogeneity: it considers a patient's phenotypes as comprehensively as possible and presents a relatively complete picture of the physiological state of the cardiovascular system, which may serve as a basis for phenotype-based risk stratification. In our study, patients in cluster 3 had a significantly different mid-term prognosis compared with the other two clusters. In cluster 3, a fair number of patients had unique phenotypic combinations, and several specific phenotypes were present in only one patient, which made the group phenotypically extremely heterogeneous. This group could not be defined by a few specific phenotypes, so its members were considered a group of patients with high-risk phenotypic combinations. This does not contradict our goal, which was to identify patients at high risk rather than to focus on the correlation of a few phenotypes with patient outcomes. Grouping patients by phenotypic similarity through HPO can partially eliminate the influence of phenotypic factors, which may also help to build a cohort with less phenotypic bias for exploring the association between prognosis and other factors, such as surgery.
Owing to the diversity and complexity of the anatomy of ccTGA, no consensus has been reached on the optimal surgical treatment (27). Physiological repair and anatomical repair are the two major surgical strategies for ccTGA, while some palliative procedures, such as the Fontan procedure, can achieve acceptable clinical outcomes. To preserve morphologic right ventricular function, the focus of surgical treatment has shifted from physiological repair to anatomical repair, but the survival rate after anatomic repair has varied among studies (28). After taking phenotypic factors into account, we found a significant difference in surgical strategy among the three clusters. The selection of treatment was essentially determined by the different physiological conditions resulting from patients' phenotypes, so it was reasonable that operations varied among groups. However, we found no significant difference in the occurrence of death and heart failure among patients who received different surgical treatments. Consequently, we speculated that the different outcomes of patients in the three phenotypic clusters were caused by the phenotypes themselves rather than by surgery. Among patients who underwent physiological repair, a significant difference in mid-term prognosis was observed between phenotypic clusters. We postulated that the type of physiological repair adopted was determined by the patient's physiological state, which is itself an indirect consequence of the phenotypes. For patients in cluster 2, surgical strategy was associated with a significant difference in prognosis: anatomic repair increased the risk of reintervention compared with physiological repair and Fontan palliation (Supplementary Figures 5A,B). VSD and PVS were distinctive characteristics of cluster 2, which corresponds to the mainstream patient population included in most studies.
For patients with both VSD and pulmonary stenosis, the Fontan procedure is a feasible option when anatomical risk factors impede biventricular repair, and it has achieved satisfactory mid-term outcomes (29). Anatomical repair of ccTGA is associated with significant early mortality and morbidity, trading short-term prognosis for improved long-term prognosis (28,30). Therefore, for patients in cluster 2, more attention should be paid to choosing the appropriate surgical procedure.
If a large cohort were available, we expect that accurate phenotypic risk stratification could be performed based on each patient's disease profile. Software based on such algorithms may therefore be promising for patient risk stratification. In our experience and data, most patients with ccTGA had a satisfactory therapeutic result after surgery, and as a result some did not follow medical advice on periodic revisits. Unexpected adverse events might be avoided if patients with high-risk phenotypes could be accurately identified and informed in advance. However, our preliminary exploration of this novel phenotypic risk stratification modality in 270 patients with ccTGA needs to be verified in future studies with larger cohorts and with different ethnic and genetic backgrounds.
Our study had several other limitations. Owing to its retrospective single-center design, ethnic and genetic background, morbidity, and hospital treatment decisions might be biased. For HPO-based clustering, although all cardiovascular phenotypes were considered, the severity of each phenotype (for instance, mild, moderate, or severe TR) was not. In patients with pulmonary atresia, single ventricle, or double outlet right ventricle, which result in complex physiological conditions, ccTGA may not be the main diagnosis and the treatment received might differ, so these patients were excluded from this study.

CONCLUSION
Diverse co-existing anomalies in patients with ccTGA are the major obstacle to prognosis evaluation, and HPO-instructed clustering provides a novel phenotypic risk stratification strategy that might improve prognosis prediction and clinical decision-making in the ccTGA population.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Fuwai Hospital. Written informed consent for participation was not provided by the participants' legal guardians/next of kin because this is a retrospective study.

AUTHOR CONTRIBUTIONS
QH and HS contributed to the conceptualization, methodology, EMR reviewing, follow-up, data analysis, and manuscript writing. ZZ contributed to the supervision, conceptualization, professional suggestion, and revision. SL contributed to EMR reviewing, supervision, conceptualization, professional suggestion, and revision. XS contributed to the investigation and follow-up. WC did the investigation. YW worked on graphing optimization. RL provided professional revision suggestions and grant for the study. All authors contributed to the article and approved the submitted version.