ORIGINAL RESEARCH article

Front. Cardiovasc. Med., 20 November 2025

Sec. Cardiovascular Genetics and Systems Medicine

Volume 12 - 2025 | https://doi.org/10.3389/fcvm.2025.1694255

Combining WGCNA and machine learning to identify mechanisms and biomarkers of hyperthyroidism and atrial fibrillation


Linyuan Wang1, Kun Yang1, Ruilong Kang1, Pengbo Liu1, Yongzhi Deng2*
  • 1Department of Cardiovascular Surgery, The Affiliated Hospital of Shanxi Medical University, Shanxi Cardiovascular Hospital (Institute), Shanxi Clinical Medical Research Center for Cardiovascular Disease, Taiyuan, China
  • 2Department of Cardiovascular Surgery, Shanxi Cardiovascular Hospital (Institute), the Affiliated Hospital of Shanxi Medical University, Shanxi Clinical Medical Research Center for Cardiovascular Disease, Taiyuan, China

Background: Hyperthyroidism and atrial fibrillation (AF) are interrelated conditions with significant cardiovascular impact. While their clinical association is established, the molecular mechanisms remain unclear. Identifying shared biomarkers and pathways can advance understanding and guide therapy.

Methods: The hyperthyroidism dataset GSE71956 and the AF dataset GSE115574 were obtained from the Gene Expression Omnibus (GEO) database. Differential gene analysis was performed using the “limma” package, and overlapping genes shared by both diseases were identified through weighted gene co-expression network analysis (WGCNA), followed by functional enrichment analysis. Machine learning algorithms were also applied to identify key biomarkers. To validate the predictive results, peripheral blood samples were collected for real-time quantitative polymerase chain reaction (RT-qPCR) analysis. Finally, immune infiltration analysis was conducted to evaluate immune cell changes in hyperthyroidism and AF.

Results: Through differential gene screening and WGCNA, 23 overlapping genes associated with hyperthyroidism and AF were identified. Using least absolute shrinkage and selection operator (LASSO) and random forest (RF) machine learning algorithms, CXCL16 and TMEM127 were ultimately identified as key genes. The two genes demonstrated good diagnostic efficacy in the hyperthyroidism validation set GSE276271 (AUC: TMEM127, 0.636; CXCL16, 0.591) and in the AF validation set GSE2240 (AUC: TMEM127, 0.745; CXCL16, 0.720). RT–qPCR analysis demonstrated that CXCL16 and TMEM127 expression levels were significantly elevated in both the hyperthyroidism and AF groups compared to the control group, aligning with the findings from our prior bioinformatics analysis. Immune analysis revealed significant differences in two immune cell types in both hyperthyroidism and AF.

Conclusion: CXCL16 and TMEM127 are promising biomarkers, offering insights into the shared pathogenesis of hyperthyroidism and AF. These findings provide a foundation for novel diagnostic and therapeutic strategies targeting these conditions.

1 Introduction

Hyperthyroidism is a prevalent endocrine disorder, with a global prevalence of 0.2%–1.3% in iodine-sufficient populations and a higher frequency in iodine-deficient regions (1). The condition is characterized by elevated levels of thyroxine (T4), triiodothyronine (T3), or both, which significantly affect cardiac energy metabolism, cardiovascular function, and the heart's electrical conduction system (2). Cardiovascular complications such as sinus tachycardia and atrial fibrillation (AF) are commonly associated with hyperthyroidism.

AF, the most widespread sustained arrhythmia globally, has become a critical public health issue due to its increasing incidence, associated healthcare burdens, and adverse effects on morbidity and mortality (3). Major risk factors for AF include aging, sedentary lifestyles, obesity, diabetes, metabolic syndrome, and obstructive sleep apnea (4, 5). Moreover, genetic predisposition plays a significant role in AF, with over 140 genetic loci identified as contributors to its pathogenesis (6, 7).

The connection between hyperthyroidism and AF has been acknowledged for over a century (8). For example, a population-based study involving more than 40,000 individuals with hyperthyroidism revealed that 8.3% experienced AF or atrial flutter within a month of diagnosis (9). Recent machine learning analyses of extensive datasets have further substantiated this relationship (10). Additionally, studies from Denmark have indicated a heightened risk of hyperthyroidism following AF onset, emphasizing the importance of thyroid function monitoring after an AF diagnosis (11). Despite these advances, the molecular mechanisms underpinning this relationship remain poorly understood, warranting further research into potential therapeutic strategies.

The advent of high-throughput technologies and bioinformatics has revolutionized the discovery of biomarkers and therapeutic targets. While traditional bioinformatics research largely focused on differential gene expression and protein-protein interaction (PPI) network analyses, these methods often overlooked co-expressed gene clusters shared by hyperthyroidism and AF (12, 13). Furthermore, the precision of PPI network analyses has been questioned. Newer approaches, such as weighted gene co-expression network analysis (WGCNA) and machine learning (ML) algorithms, have emerged as powerful tools for identifying disease-relevant targets. WGCNA identifies disease-associated gene modules by constructing scale-free networks, while machine learning algorithms like Least Absolute Shrinkage and Selection Operator (LASSO) and Random Forest (RF) are increasingly utilized to detect biomarkers and enhance diagnostic accuracy (14).

This study employed mRNA expression datasets from the GEO database to identify co-expression modules shared by hyperthyroidism and AF using WGCNA. Functional enrichment analysis was conducted to explore the biological roles of overlapping genes, while LASSO and RF approaches were applied to pinpoint potential biomarkers. These biomarkers were validated using an independent dataset. To further validate our findings, peripheral blood samples were collected for real-time quantitative polymerase chain reaction (RT-qPCR) analysis. Additionally, immune cell infiltration analysis was performed to investigate immune cell involvement in the pathogenesis of hyperthyroidism and AF. Figure 1 depicts the study flowchart.

Figure 1
Flowchart of immune infiltration analysis starting with hyperthyroidism and atrial fibrillation, leading to an analysis of differentially expressed genes (DEGs). It identifies 1,575 DEGs from hyperthyroidism and 1,799 from atrial fibrillation, processed through Weighted Gene Co-Expression Network Analysis (WGCNA) to produce 23 DEGs. The analysis is further refined using LASSO and RF methods, highlighting TMEM127 and CXCL16 genes.

Figure 1. Flowchart of the study.

2 Materials and methods

2.1 Data collection

Gene expression profiles of hyperthyroidism and AF were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo). The GSE115574 dataset (microarray data on the GPL570 platform) includes 59 atrial tissue samples from 30 patients who underwent mitral regurgitation repair surgery, comprising 28 samples from individuals with AF and 31 from those with sinus rhythm (SR). The GSE71956 dataset (microarray data on the GPL10558 platform) contains 49 peripheral blood CD4+ T lymphocyte samples, including 31 from patients with hyperthyroidism and 18 from healthy controls. To further validate our findings, we selected datasets GSE2240 and GSE276271 as independent validation sets. The GSE2240 dataset (microarray data on the GPL97 platform) includes right atrial tissue samples from 30 patients who underwent cardiac surgery (either valve repair or coronary artery bypass grafting), with 10 samples from patients with AF and 20 from individuals with SR. The GSE276271 dataset (RNA-seq data on the GPL28702 platform) comprises 15 feline thyroid tissue samples, including 11 from hyperthyroidism models and 4 from healthy controls. All the datasets were subjected to standardized data preprocessing.

2.2 Identification of differentially expressed genes (DEGs)

DEGs were analyzed using the “limma” R package by comparing hyperthyroidism samples against controls in the GSE71956 dataset and AF samples against controls in the GSE115574 dataset. Heatmaps and volcano plots were generated with the “pheatmap” and “ggplot2” R packages. Genes with |log2FC| > 0.1 and p-value < 0.05 were deemed significant.
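The two cutoffs above can be illustrated with a minimal Python sketch. It is not the authors' limma workflow: the gene names and statistics are invented placeholders, and only the thresholds (|log2FC| > 0.1, p < 0.05) come from the text.

```python
# Hypothetical limma-style results: (gene, log2 fold change, p-value).
# All values are invented for illustration only.
results = [
    ("GENE_A", 0.35, 0.002),
    ("GENE_B", -0.22, 0.010),
    ("GENE_C", 0.05, 0.001),   # fails the fold-change cutoff
    ("GENE_D", -0.40, 0.200),  # fails the p-value cutoff
]

LOGFC_CUTOFF = 0.1  # |log2FC| > 0.1, as stated in the text
P_CUTOFF = 0.05     # p-value < 0.05, as stated in the text


def classify_degs(rows, logfc_cutoff=LOGFC_CUTOFF, p_cutoff=P_CUTOFF):
    """Split genes into up- and downregulated DEGs using both cutoffs."""
    up, down = [], []
    for gene, logfc, pval in rows:
        if pval < p_cutoff and abs(logfc) > logfc_cutoff:
            (up if logfc > 0 else down).append(gene)
    return up, down


up_genes, down_genes = classify_degs(results)
print("upregulated:", up_genes)
print("downregulated:", down_genes)
```

A gene must pass both cutoffs to count as a DEG; the sign of log2FC then decides whether it is reported as upregulated or downregulated.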

2.3 Weighted gene co-expression network analysis

A co-expression network was constructed using the “WGCNA” R package, with DEGs as input. Hierarchical clustering identified and excluded outliers. A scale-free network was created using the “pickSoftThreshold” function to determine the optimal soft threshold power (β), transforming the similarity matrix into a weighted adjacency matrix. From this, a topological overlap matrix (TOM) was derived to enhance noise reduction. Gene modules were identified through hierarchical clustering and the dynamic tree cut algorithm. Pearson correlation analysis was used to correlate gene modules with clinical traits, and modules showing strong correlations were visualized in a trait-gene network. Shared key genes were identified by intersecting hyperthyroidism- and AF-associated modules.
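The soft-thresholding step can be sketched as follows. This is an illustrative Python version, not the WGCNA R package itself. It assumes an unsigned network, where the adjacency is the absolute Pearson correlation raised to the power β; the text does not state whether a signed or unsigned network was used. The expression values are toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta):
    """Unsigned weighted adjacency: a_ij = |cor(gene_i, gene_j)| ** beta."""
    genes = list(expr)
    adj = {}
    for gi in genes:
        for gj in genes:
            if gi == gj:
                adj[(gi, gj)] = 1.0  # diagonal is fixed at 1
            else:
                adj[(gi, gj)] = abs(pearson(expr[gi], expr[gj])) ** beta
    return adj


# Toy expression matrix (gene -> values across four samples); invented data.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with g1
    "g3": [1.0, 3.0, 2.0, 4.0],  # correlation of 0.8 with g1
}

adjacency = soft_threshold_adjacency(expr, beta=8)
print(round(adjacency[("g1", "g3")], 4))
```

Raising correlations to a power shrinks weak correlations far more than strong ones, which is what pushes the network toward scale-free topology as β increases.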

2.4 Functional enrichment analysis

Key genes were functionally annotated using Gene Ontology (GO) analysis, categorizing them into biological process (BP), cellular component (CC), and molecular function (MF). KEGG pathway analysis identified the pathways and biological roles associated with these genes (15–17).
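The text does not name the statistical test behind the enrichment analysis. A common choice for GO and KEGG over-representation analysis is the hypergeometric test, sketched below in Python as an assumption rather than a description of the authors' pipeline. The background size, term size, and hit count are hypothetical; only the key gene count of 23 comes from the article.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, selected, hits):
    """P(X >= hits): chance of seeing at least `hits` annotated genes when
    `selected` genes are drawn from `background`, of which `annotated`
    belong to the term."""
    upper = min(annotated, selected)
    total = comb(background, selected)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical example: 20,000 background genes, a term with 50 genes,
# 23 key genes (the count reported in this study), 3 of them in the term.
p_value = hypergeom_enrichment_p(background=20000, annotated=50, selected=23, hits=3)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the resulting p-values are corrected for multiple testing across all terms before any term is reported as enriched.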

2.5 Machine learning

To identify robust diagnostic biomarkers, we applied LASSO regression and Random Forest (RF), with hyperparameters optimized to mitigate overfitting. For LASSO (R package “glmnet”), the optimal regularization parameter (λ) was determined by ten-fold cross-validation, selecting the λ value that minimized the cross-validated error. Genes with non-zero coefficients at this λ were retained. For RF (R package “randomForest”), the model was first run with 500 trees, and the optimal number of trees was identified as the point where the out-of-bag (OOB) error reached a minimum. A final model was built using this optimal number. From this model, the top 15 genes were selected based on the mean decrease in importance. Biomarkers were determined by intersecting genes identified by both algorithms. Receiver operating characteristic (ROC) analysis in the GSE276271 and GSE2240 datasets validated the diagnostic performance of the identified biomarkers.
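For a single-gene classifier, the AUC reported by ROC analysis equals the probability that a randomly chosen case has a higher expression value than a randomly chosen control, with ties counted as one half. The Python sketch below computes it that way. The expression values are invented, and the function is an illustration rather than the authors' ROC code.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher; tied pairs contribute 0.5."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented expression values for one candidate gene.
cases = [2.1, 3.0, 2.8]
controls = [1.9, 2.5, 2.0, 1.5]

print(round(auc(cases, controls), 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale for reading the validation AUCs reported in the Results.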

2.6 Study population and blood samples

Peripheral blood samples were collected from 16 patients with hyperthyroidism and 16 patients with AF at the Cardiovascular Hospital affiliated with Shanxi Medical University. Inclusion criteria for the hyperthyroidism group were: (1) a confirmed clinical diagnosis of hyperthyroidism (18); (2) age ≥18 years. Exclusion criteria included: (1) concomitant AF or other arrhythmias; (2) the presence of autoimmune diseases, active infections, coagulation disorders, malignancies, psychiatric disorders, or neurological conditions. For the AF group, inclusion criteria were: (1) a confirmed diagnosis of AF (19); (2) age ≥18 years. Exclusion criteria included: (1) concurrent hyperthyroidism; (2) other arrhythmias requiring clinical intervention, as well as autoimmune diseases, active infections, coagulation disorders, malignancies, psychiatric disorders, or neurological conditions. Additionally, 16 healthy individuals undergoing routine physical examinations at the same hospital during the same period were recruited as the control group. These individuals had no history of endocrine or cardiovascular disease, and were matched with the study groups for age and sex. The study was approved by the hospital's ethics committee (2025WJ025), and informed consent was obtained from all participants or their legal guardians.

2.7 Real-time quantitative PCR

Total RNA was extracted from peripheral blood samples using an RNA extraction kit (Seven Biotech, Beijing, China), and stored at −80 °C. The RNA was then reverse-transcribed into complementary DNA (cDNA) using a reverse transcription kit from the same manufacturer. RT-qPCR was subsequently performed on a real-time PCR detection system using SYBR Green PCR Master Mix (Seven Biotech, Beijing, China), following the manufacturer's protocol. The expression levels of CXCL16 and TMEM127 were normalized to GAPDH as the internal control, and relative gene expression was calculated using the 2^(−ΔΔCT) method. Primer sequences used for RT-qPCR are listed in Supplementary Table S1.
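The relative quantification formula can be illustrated with a short Python sketch. The Ct values are invented. The function assumes a single control ΔCT value, which in practice is usually the mean ΔCT of the control group.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change by the 2^(-ddCT) method.

    dCT  = CT(target gene) - CT(reference gene, here GAPDH)
    ddCT = dCT(sample) - dCT(control)
    """
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_ctrl - ct_reference_ctrl
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)


# Invented Ct values: the target crosses threshold one cycle earlier
# (relative to GAPDH) in the patient sample than in the control.
fold_change = relative_expression(
    ct_target=24.0, ct_reference=18.0,            # patient sample: dCT = 6
    ct_target_ctrl=26.0, ct_reference_ctrl=19.0,  # control: dCT = 7
)
print(fold_change)
```

A ΔΔCT of −1 corresponds to a twofold increase, because each PCR cycle doubles the amount of product under ideal amplification efficiency.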

2.8 Immune infiltration analysis

Immune infiltration was analyzed using CIBERSORT to estimate the proportions of immune cell types based on gene expression data. Immune cell profiles were compared between disease and control groups for both hyperthyroidism and AF datasets.

2.9 Statistical analysis

Statistical analyses were performed using GraphPad Prism 9.0 (GraphPad Software, La Jolla, California) and R (version 4.4.1). All data are presented as mean ± SEM. Statistical significance between two groups was analyzed using an unpaired Student's t-test. When more than two groups were involved, one-way ANOVA was used to analyze differences between groups. A p-value less than 0.05 was considered statistically significant.
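As an illustration of the summary statistics named above, the following Python sketch computes mean ± SEM and the pooled-variance (Student's) t statistic for two groups. The data are invented, and the p-value lookup against the t distribution is omitted because it requires a statistics library.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean)."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )
    return t, na + nb - 2


# Invented relative expression values for two groups.
control = [1.0, 1.2, 0.8, 1.0]
disease = [2.0, 2.4, 1.6, 2.0]

ctrl_mean, ctrl_sem = mean_sem(control)
t_stat, dof = student_t(disease, control)
print(f"control: {ctrl_mean:.2f} ± {ctrl_sem:.3f}")
print(f"t = {t_stat:.3f}, df = {dof}")
```

With three groups (control, hyperthyroidism, AF), the one-way ANOVA mentioned in the text replaces the pairwise t statistic.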

3 Results

3.1 Identification of DEGs

Using R software for data analysis, we identified 1,575 DEGs in the hyperthyroidism dataset (GSE71956), comprising 527 upregulated and 1,048 downregulated genes (Figures 2A,B). In the AF dataset (GSE115574), a total of 1,799 DEGs were identified, with 967 upregulated and 832 downregulated genes (Figures 2C,D). The findings are summarized in Supplementary Table S2.

Figure 2
Panel A presents a volcano plot showing genes with significant downregulation (blue) and upregulation (red). Panel B features a heatmap with hierarchical clustering, comparing control and hyperthyroidism groups. Panel C displays another volcano plot with a different distribution of gene expression changes between control and AF groups. Panel D shows a corresponding heatmap with hierarchical clustering for these groups, using a gradient from blue to red indicating expression levels.

Figure 2. Screening of DEGs. (A) Volcano plot of the hyperthyroidism group; (B) heatmap of the hyperthyroidism group; (C) volcano plot of the AF group; (D) heatmap of the AF group.

3.2 Weighted gene co-expression network analysis

WGCNA was performed on the DEGs from both the hyperthyroidism and AF datasets to identify modules significantly associated with these conditions. For the hyperthyroidism dataset, a soft thresholding power (β) of 12 was selected, resulting in a scale-free network. Eleven modules were identified, with the grey module (correlation coefficient = 0.56) and turquoise module (correlation coefficient = 0.44) showing the strongest positive associations with hyperthyroidism, collectively containing 430 genes. In the AF dataset, a scale-free network was achieved with β = 8. Eleven modules were identified, with the brown (correlation coefficient = 0.59), black (correlation coefficient = 0.53), and turquoise (correlation coefficient = 0.53) modules exhibiting the highest positive correlations with AF, including a total of 641 genes. By intersecting the key modules from both datasets, 23 overlapping genes were identified as potential key genes (Figure 3). The key module results are provided in Supplementary Table S3.

Figure 3
Gene analysis visuals include: Panel A shows a gene dendrogram with module colors indicating different gene groups. Panel B presents a heatmap displaying module-trait relationships, highlighting correlation values and significance between modules and traits like hyperthyroidism and control. Panel C shows another gene dendrogram with module colors for a different dataset. Panel D provides another module-trait relationship heatmap with traits AF and control. Panel E is a Venn diagram illustrating the overlap of gene sets between hyperthyroidism and AF, with 23 genes in common.

Figure 3. WGCNA results. (A) Gene clustering dendrogram for hyperthyroidism, derived from hierarchical clustering based on neighbor-related differences, with module colors. (B) Module-trait relationships for the 11 gene co-expression modules in hyperthyroidism. Each cell displays the correlation coefficient and the corresponding p-value. (C) Gene clustering dendrogram for AF. (D) Module-trait relationships for the 11 gene co-expression modules in AF. (E) Intersection of the hyperthyroidism and AF key modules, yielding 23 key genes.

3.3 Functional enrichment analysis

GO and KEGG pathway enrichment analyses were conducted on the 23 key genes to explore shared biological processes underlying hyperthyroidism and AF. In the CC category, enriched terms included phagolysosome, NADPH oxidase complex, secondary lysosome, 90S preribosome, and endosome lumen. In the MF category, significant terms included superoxide-generating NADPH oxidase activator activity, signaling adaptor activity, signaling receptor complex adaptor activity, eukaryotic initiation factor eIF2 binding, and catalase activity. The top five enriched GO terms and the ten most enriched KEGG pathways are shown in Figure 4, highlighting the most relevant biological insights.

Figure 4
Two dot plots labeled A and B display gene enrichment data.

Figure 4. Functional enrichment analysis of key genes. (A) GO enrichment analysis results. (B) KEGG enrichment analysis results.

3.4 Machine learning screening for biomarkers

Two ML algorithms, LASSO and RF, were employed to analyze the 23 candidate genes. LASSO regression identified eight significant genes associated with hyperthyroidism and ten with AF (Figures 5A,C). Concurrently, RF analysis ranked the top 15 most important genes for each condition (Figures 5B,D). By intersecting the results from both algorithms, two key biomarker genes, CXCL16 and TMEM127, were identified (Figure 5E). ROC validation was conducted for the two biomarker genes using the GSE276271 and GSE2240 validation datasets. In the hyperthyroidism validation dataset, CXCL16 and TMEM127 achieved area under the curve (AUC) values of 0.636 and 0.591, respectively, while in the AF validation dataset, the values were 0.745 and 0.720, demonstrating their diagnostic potential for both diseases (Figures 5F,G). All machine learning results are presented in Supplementary Table S4.

Figure 5
Panel A features LASSO regression plots with coefficients vs. log lambda, and deviance vs. log lambda. Panel B shows random forest error rates by the number of trees and a variable importance chart. Panel C contains additional LASSO regression plots. Panel D displays another random forest error plot and variable importance. Panel E presents a Venn diagram of gene overlaps among models. Panel F and Panel G include ROC curves with true positive vs. false positive rates for TMEM127 and CXCL16, with associated AUC values.

Figure 5. Biomarker screening and ROC validation. (A) LASSO regression model results in the hyperthyroidism group; (B) RF results in the hyperthyroidism group; (C) LASSO regression model results in the AF group; (D) RF results in the AF group; (E) Venn diagram for the LASSO and RF models; (F) ROC curves of the biomarker genes in the hyperthyroidism validation set; (G) ROC curves of the biomarker genes in the AF validation set.

3.5 Expression of diagnostic genes in clinical samples

To further validate our findings, a total of 48 clinical blood samples were collected (16 from patients with hyperthyroidism, 16 from patients with AF, and 16 from controls). RT-qPCR analysis revealed that the expression levels of CXCL16 and TMEM127 were significantly elevated in both the hyperthyroidism and AF groups compared to the control group, corroborating the results of our bioinformatics analysis (Figure 6). Detailed clinical information for all participants is provided in Supplementary Table S5.

Figure 6
Bar graphs labeled A and B show relative mRNA levels normalized to GAPDH among Control, Hyperthyroidism, and AF groups. Graph A shows significant increases in Hyperthyroidism and AF compared to Control, with statistical markers (** and ***) indicating significance. Graph B shows similar trends with statistical significance marked by (*).

Figure 6. Validation of CXCL16 and TMEM127 expression in the hyperthyroidism and AF groups, as measured by RT-qPCR. (A) Relative expression of CXCL16. (B) Relative expression of TMEM127. Significance was indicated as *p < 0.05, **p < 0.01, ***p < 0.001.

3.6 Immune infiltration analysis

To explore the role of immune cells in the pathogenesis of hyperthyroidism and AF, immune infiltration analysis was conducted using the CIBERSORT algorithm (Figures 7A,C). A comparison of immune cell composition between disease and control groups revealed significant differences. Hyperthyroidism samples exhibited elevated levels of resting natural killer (NK) cells and neutrophils (Figure 7B). In contrast, AF samples showed a reduced proportion of regulatory T cells (Tregs) and activated dendritic cells (Figure 7D).

Figure 7
Four-part image comprising two stacked bar charts (A and C) and two box plots (B and D). Charts A and C display the relative percentages of various immune cell types with color codes. Box plots B and D compare the prevalence of immune cell types across two groups marked as CON (blue) and two different conditions (red).

Figure 7. Immune cell infiltration analysis. (A) Bar plot of the proportions of 22 immune cell types in each sample of the hyperthyroidism dataset. (B) Box plot comparing the estimated proportions of 22 immune cell types between hyperthyroidism and control samples. (C) Bar plot of the proportions of 22 immune cell types in each sample of the AF dataset. (D) Box plot comparing the estimated proportions of 22 immune cell types between AF and control samples. *p < 0.05.

4 Discussion

AF is a prevalent cardiac condition, affecting approximately 60 million people worldwide. It is the most common cardiac complication in hyperthyroidism, affecting 5%–15% of individuals with overt hyperthyroidism (8). Research from the United Kingdom indicates that patients with Graves' disease have more than twice the risk of developing AF compared to the general population (20). Additionally, AF in individuals with Graves' disease is associated with increased risks of acute coronary syndromes, stable angina, cardiac hospitalization, and overall mortality (21). Despite these associations, the mechanisms underlying the coexistence of hyperthyroidism and AF, as well as their common biomarkers, remain poorly understood, necessitating further investigation to identify specific and sensitive biomarkers.

WGCNA, a powerful bioinformatics tool for constructing gene co-expression networks from high-throughput microarray data, has gained popularity when combined with machine learning methods (22). Unlike traditional approaches, WGCNA establishes connections between gene expression and clinical data, enabling the discovery of novel therapeutic targets and providing insights into the shared pathogenesis of comorbidities (23). For example, Zhang et al. identified GLUL, NCF2, S100A12, and SRGN as significant biomarkers associated with AF and heart failure (HF), providing a foundation for clinical diagnosis and treatment (24). Similarly, WGCNA, integrated with differential gene and PPI network analyses, has been used to identify hub genes linked to cardioembolic stroke and AF, aiding in stroke diagnosis and prevention strategies (25). Other studies, such as that by Huang et al., identified STAT4 and COL1A2 as biomarkers for HF and depression comorbidities, broadening therapeutic possibilities (26). Zhu et al. further demonstrated the potential of combining WGCNA and ML to identify diagnostic biomarkers like NPPA, OMD, and PRELP for dilated cardiomyopathy with HF (27). These studies indicate that shared biomarkers may provide valuable insights into the complex interplay between hyperthyroidism and AF, thereby enhancing patient prognosis.

In this study, we retrieved gene expression data and clinical information from the GEO database to construct gene co-expression networks using WGCNA. Two machine learning algorithms, LASSO and RF, were employed to identify CXCL16 and TMEM127 as biomarkers significantly associated with both hyperthyroidism and AF. External validation confirmed the diagnostic utility of these two hub genes. To further validate our findings, we conducted RT-qPCR analyses on clinical blood samples. The results demonstrated that the gene expression levels of CXCL16 and TMEM127 were significantly upregulated in both the hyperthyroidism and AF groups compared to the control group. These findings are consistent with our previous bioinformatics analyses.

CXCL16 is a multifunctional CXC chemokine that primarily binds to the CXC chemokine receptor 6 (CXCR6), playing a critical role in immune regulation, inflammatory responses, and cell chemotaxis (28). In immune cell adhesion, CXCL16 facilitates the attachment of immune cells to endothelial and dendritic cells, contributing to the pathogenesis of multiple autoimmune diseases. Studies have demonstrated a strong association between CXCL16 levels and the severity, disease activity, and prognosis of conditions such as multiple sclerosis, autoimmune hepatitis, rheumatoid arthritis, Crohn's disease, and psoriasis (29). In cardiovascular research, serum CXCL16 levels have been positively correlated with the severity of coronary artery disease, highlighting its potential as a biomarker for cardiovascular risk assessment (30). Additionally, CXCL16 has been shown to promote Ly6Chigh monocyte infiltration, exacerbating cardiac dysfunction following acute myocardial infarction (31). Furthermore, elevated plasma CXCL16 levels have been associated with poor clinical outcomes and a higher recurrence rate in patients with AF (32). Although no direct studies have established a link between CXCL16 and hyperthyroidism, its crucial role in immune regulation and inflammation suggests potential involvement in immunopathological processes of hyperthyroidism.

TMEM127 is a transmembrane protein encoded by the TMEM127 gene and is widely expressed across various human tissues (33). As a tumor suppressor, germline mutations in TMEM127 are associated with hereditary pheochromocytomas and paragangliomas. These mutations typically result in a loss of TMEM127 function, leading to aberrant activation of the mTOR signaling pathway and promoting tumor development (34). Additionally, TMEM127 has been implicated in metabolic disorders such as insulin resistance and fatty liver disease, with studies demonstrating a strong correlation between its expression levels and insulin sensitivity (35). Notably, TMEM127 forms a ternary complex with SUSD6 and MHC-I, facilitating the recruitment of WWP2 to mediate the ubiquitination and lysosomal degradation of MHC-I. This mechanism enables cancer cells to downregulate MHC-I expression on their surface, thereby evading immune surveillance (36). Although a direct link between TMEM127 and AF or hyperthyroidism has not yet been established, its crucial role in key signaling pathways and cellular regulation suggests potential relevance. Further investigation into the roles of CXCL16 and TMEM127 in cardiovascular and endocrine metabolic disorders may provide novel insights and directions for future research.

Immune infiltration is a key factor in the progression of AF and hyperthyroidism, with immune cells playing a crucial role in mediating inflammation and other immune processes. Research indicates that abnormal thyroid hormone levels can impact the cardiovascular system by activating inflammatory pathways and oxidative stress responses, with dynamic immune cell changes central to this process (37). Specifically, patients with hyperthyroidism often exhibit a systemic inflammatory state, characterized by elevated levels of pro-inflammatory cytokines such as interleukin-6 (IL-6) and tumor necrosis factor-α (TNF-α). These cytokines contribute to thyroid tissue damage and dysfunction through multiple mechanisms (38), potentially linked to the abnormal activation of neutrophils and resting NK cells. In this study, hyperthyroid samples showed a significant increase in resting NK cells and neutrophils, with a similar trend observed in AF samples. This suggests that both conditions may share common immune-inflammatory activation mechanisms. In AF, the inflammatory response is marked by increased levels of IL-1β, TNF-α, and IL-6 (39). NK cells may contribute to AF-related inflammation by secreting these cytokines, promoting atrial fibrosis and structural remodeling. Neutrophils, which are typically absent in healthy cardiac tissue, respond rapidly to cardiac stress, peaking within 24 h—much earlier than inflammatory monocytes and lymphocytes. Upon activation, neutrophils adhere to and migrate toward injury sites, recruiting additional immune cells via chemokine concentration gradients (40). Furthermore, the interaction between neutrophils and endothelial cells triggers a respiratory burst, leading to oxidative damage to cardiomyocytes and myocardial tissue contraction (41). 
Tregs, a key subset of T cells involved in immune regulation, mitigate inflammation by inhibiting the activation of helper T cells (Th cells) and effector T cells, thereby reducing immune-mediated myocardial damage (42). Tregs are thought to play a critical immunomodulatory role in AF pathogenesis, and restoring their balance or enhancing their activity has been proposed as a therapeutic strategy to mitigate inflammation-driven atrial remodeling. This approach has been validated in atherosclerosis studies (43). Similarly, in Graves' disease, systemic inflammation is accompanied by a decline in Treg immunosuppressive function and a shift toward a cytotoxic phenotype (44). In addition, activated dendritic cells may contribute to the pathogenesis of both hyperthyroidism and AF through antigen presentation, inflammatory cytokine secretion, and interactions with other immune cells. Studies suggest that dendritic cells facilitate cardiac remodeling and functional recovery following myocardial infarction by modulating Tregs and macrophage polarization (45). However, in this study, immune cell infiltration patterns differed between hyperthyroidism and AF, potentially because the AF dataset was derived from atrial tissue, whereas the hyperthyroidism dataset was derived from peripheral blood CD4+ T lymphocytes. Despite these differences, growing evidence supports the central role of inflammation and immune responses in the onset, progression, and prognosis of both hyperthyroidism and AF. Further research into immune cell interactions and regulatory mechanisms may offer a more robust theoretical basis for optimizing clinical management and therapeutic strategies for these conditions.

However, this study has several limitations. Variability in sample sources may introduce bias, and datasets with sufficiently large sample sizes remain limited. Furthermore, it is unclear whether elevated mRNA levels correspond to proportional increases in protein expression, as many biological processes are also regulated by post-translational modifications. We acknowledge that this study was restricted to transcriptomic datasets and peripheral blood validation. Comprehensive phenomic validation would require the integration of proteomic, metabolomic, imaging, and clinical phenotypic data (46). Future studies incorporating multi-omics and longitudinal phenomic analyses are warranted to further advance the phenomics paradigm.

We conducted a bioinformatics analysis using data from the GEO database to investigate the underlying molecular mechanisms, key genes, and patterns of immune cell infiltration associated with hyperthyroidism and AF. Using two machine learning algorithms, LASSO and RF, we identified CXCL16 and TMEM127 as potential diagnostic biomarkers and therapeutic targets for both conditions, offering a foundation for future investigations and therapeutic development.
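As an illustration of how the outputs of two feature-selection methods such as LASSO and RF can be combined, the minimal sketch below keeps only the genes retained by both. It is not the authors' pipeline: intersecting the two selections is an assumption here, the function name `consensus_genes`, the `top_k` cutoff, the placeholder gene names, and all scores are hypothetical, and the inputs are assumed to come from models that were fitted elsewhere.

```python
from typing import Dict, List


def consensus_genes(
    lasso_coefs: Dict[str, float],
    rf_importances: Dict[str, float],
    top_k: int,
    tol: float = 1e-8,
) -> List[str]:
    """Return genes kept by LASSO (non-zero coefficient) that also rank in the RF top_k."""
    # LASSO shrinks uninformative coefficients to zero; keep the rest.
    lasso_selected = {g for g, c in lasso_coefs.items() if abs(c) > tol}
    # Rank genes by RF importance (descending) and keep the top_k.
    ranked = sorted(rf_importances, key=rf_importances.get, reverse=True)
    rf_selected = set(ranked[:top_k])
    # Consensus = genes selected by both methods, sorted for stable output.
    return sorted(lasso_selected & rf_selected)


# Hypothetical toy inputs: placeholder gene names and made-up scores,
# not values from this study.
lasso_coefs = {"GENE_A": 0.42, "GENE_B": 0.0, "GENE_C": -0.17, "GENE_D": 0.0}
rf_importances = {"GENE_A": 0.31, "GENE_B": 0.28, "GENE_C": 0.25, "GENE_D": 0.16}

print(consensus_genes(lasso_coefs, rf_importances, top_k=3))
```

With these toy inputs, GENE_B ranks highly in RF but is dropped because its LASSO coefficient is zero. Requiring agreement between a sparse linear model and a tree ensemble is a conservative design choice that trades sensitivity for fewer method-specific candidates.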

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the Ethics Committee of Shanxi Cardiovascular Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

LW: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing. KY: Visualization, Writing – review & editing. RK: Visualization, Writing – review & editing. PL: Visualization, Writing – review & editing. YD: Conceptualization, Resources, Supervision, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Acknowledgments

The authors express sincere gratitude for the invaluable data support provided by the GEO database.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcvm.2025.1694255/full#supplementary-material

References

1. Chaker L, Cooper DS, Walsh JP, Peeters RP. Hyperthyroidism. Lancet. (2024) 403(10428):768–80. doi: 10.1016/S0140-6736(23)02016-0

2. Cappola AR, Desai AS, Medici M, Cooper LS, Egan D, Sopko G, et al. Thyroid and cardiovascular disease: research agenda for enhancing knowledge, prevention, and treatment. Circulation. (2019) 139(25):2892–909. doi: 10.1161/CIRCULATIONAHA.118.036859

3. Ohlrogge AH, Brederecke J, Schnabel RB. Global burden of atrial fibrillation and flutter by national income: results from the global burden of disease 2019 database. J Am Heart Assoc. (2023) 12(17):e030438. doi: 10.1161/JAHA.123.030438

4. Sagris M, Vardas EP, Theofilis P, Antonopoulos AS, Oikonomou E, Tousoulis D. Atrial fibrillation: pathogenesis, predisposing factors, and genetics. Int J Mol Sci. (2021) 23(1):6. doi: 10.3390/ijms23010006

5. Middeldorp ME, Kamsani SH, Sanders P. Obesity and atrial fibrillation: prevalence, pathogenesis, and prognosis. Prog Cardiovasc Dis. (2023) 78:34–42. doi: 10.1016/j.pcad.2023.04.010

6. Shoemaker MB, Shah RL, Roden DM, Perez MV. How will genetics inform the clinical care of atrial fibrillation? Circ Res. (2020) 127(1):111–27. doi: 10.1161/CIRCRESAHA.120.316365

7. Kim JA, Chelu MG, Li N. Genetics of atrial fibrillation. Curr Opin Cardiol. (2021) 36(3):281–7. doi: 10.1097/HCO.0000000000000840

8. Kostopoulos G, Effraimidis G. Epidemiology, prognosis, and challenges in the management of hyperthyroidism-related atrial fibrillation. Eur Thyroid J. (2024) 13(2):e230254. doi: 10.1530/ETJ-23-0254

9. Frost L, Vestergaard P, Mosekilde L. Hyperthyroidism and risk of atrial fibrillation or flutter: a population-based study. Arch Intern Med. (2004) 164(15):1675–8. doi: 10.1001/archinte.164.15.1675

10. Bekiaridou A, Kartas A, Moysidis DV, Papazoglou AS, Baroutidou A, Papanastasiou A, et al. The bidirectional relationship of thyroid disease and atrial fibrillation: established knowledge and future considerations. Rev Endocr Metab Disord. (2022) 23(3):621–30. doi: 10.1007/s11154-022-09713-0

11. Selmer C, Hansen ML, Olesen JB, Mérie C, Lindhardsen J, Olsen AM, et al. New-onset atrial fibrillation is a predictor of subsequent hyperthyroidism: a nationwide cohort study. PLoS One. (2013) 8(2):e57893. doi: 10.1371/journal.pone.0057893

12. Zhang J, Wang J, Wu Y, Li W, Gong K, Zhao P. Identification of SLED1 as a potential predictive biomarker and therapeutic target of post-infarct heart failure by bioinformatics analyses. Int Heart J. (2021) 62(1):23–32. doi: 10.1536/ihj.20-439

13. Kong X, Sun H, Wei K, Meng L, Lv X, Liu C, et al. WGCNA combined with machine learning algorithms for analyzing key genes and immune cell infiltration in heart failure due to ischemic cardiomyopathy. Front Cardiovasc Med. (2023) 10:1058834. doi: 10.3389/fcvm.2023.1058834

14. Li Y, Hu Y, Jiang F, Chen H, Xue Y, Yu Y. Combining WGCNA and machine learning to identify mechanisms and biomarkers of ischemic heart failure development after acute myocardial infarction. Heliyon. (2024) 10(5):e27165. doi: 10.1016/j.heliyon.2024.e27165

15. Kanehisa M, Furumichi M, Sato Y, Matsuura Y, Ishiguro-Watanabe M. KEGG: biological systems database as a model of the real world. Nucleic Acids Res. (2025) 53(D1):D672–7. doi: 10.1093/nar/gkae909

16. Kanehisa M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. (2019) 28(11):1947–51. doi: 10.1002/pro.3715

17. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. (2000) 28(1):27–30. doi: 10.1093/nar/28.1.27

18. Wiersinga WM, Poppe KG, Effraimidis G. Hyperthyroidism: aetiology, pathogenesis, diagnosis, management, complications, and prognosis. Lancet Diabetes Endocrinol. (2023) 11(4):282–98. doi: 10.1016/S2213-8587(23)00005-0

19. Joglar JA, Chung MK, Armbruster AL, Benjamin EJ, Chyou JY, Cronin EM, et al. 2023 ACC/AHA/ACCP/HRS guideline for the diagnosis and management of atrial fibrillation: a report of the American College of Cardiology/American Heart Association joint committee on clinical practice guidelines. Circulation. (2024) 149(1):e1–156. doi: 10.1161/CIR.0000000000001193

20. Okosieme OE, Taylor PN, Evans C, Thayer D, Chai A, Khan I, et al. Primary therapy of Graves’ disease and cardiovascular morbidity and mortality: a linked-record cohort study. Lancet Diabetes Endocrinol. (2019) 7(4):278–87. doi: 10.1016/S2213-8587(19)30059-2

21. Naser JA, Pislaru SV, Stan MN, Lin G. Incidence, risk factors, and outcomes of incident atrial fibrillation in patients with Graves disease. Mayo Clin Proc. (2023) 98(6):883–91. doi: 10.1016/j.mayocp.2022.12.013

22. Zhang X, Li J, Zhang L, Wu X, Wang Y, Zhang L, et al. Integration WGCNA with LC-MS data for evaluating the processing status and transformation rules of Ligustri Lucidi Fructus: a novel strategy for evaluating the processing technology of traditional Chinese medicines. Talanta. (2025) 282:127029. doi: 10.1016/j.talanta.2024.127029

23. Xiao Y, Pan RJ, An ZH, Liu QM, Zhou SH. A novel strategy for key gene identification in hypertrophic cardiomyopathy based on cuproptosis and multiple WGCNA analyses. Eur Heart J. (2023) 44:ehad655-1854. doi: 10.1093/eurheartj/ehad655.1854

24. Zhang Z, Ding J, Mi X, Lin Y, Li X, Lian J, et al. Identification of common mechanisms and biomarkers of atrial fibrillation and heart failure based on machine learning. ESC Heart Fail. (2024) 11(4):2323–33. doi: 10.1002/ehf2.14799

25. Zhang J, Zhang B, Li T, Li Y, Zhu Q, Wang X, et al. Exploring the shared biomarkers between cardioembolic stroke and atrial fibrillation by WGCNA and machine learning. Front Cardiovasc Med. (2024) 11:1375768. doi: 10.3389/fcvm.2024.1375768

26. Huang K, Zhang X, Duan J, Wang R, Wu Z, Yang C, et al. STAT4 and COL1A2 are potential diagnostic biomarkers and therapeutic targets for heart failure comorbided with depression. Brain Res Bull. (2022) 184:68–75. doi: 10.1016/j.brainresbull.2022.03.014

27. Zhu Y, Yang X, Zu Y. Integrated analysis of WGCNA and machine learning identified diagnostic biomarkers in dilated cardiomyopathy with heart failure. Front Cell Dev Biol. (2022) 10:1089915. doi: 10.3389/fcell.2022.1089915

28. Lauver MD, Katz ZE, Markus H, Derosia NM, Jin J, Ayers KN, et al. The CXCR6-CXCL16 axis mediates T cell control of polyomavirus infection in the kidney. PLoS Pathog. (2025) 21(3):e1012969. doi: 10.1371/journal.ppat.1012969

29. Bao N, Fu B, Zhong X, Jia S, Ren Z, Wang H, et al. Role of the CXCR6/CXCL16 axis in autoimmune diseases. Int Immunopharmacol. (2023) 121:110530. doi: 10.1016/j.intimp.2023.110530

30. Xing J, Liu Y, Chen T. Correlations of chemokine CXCL16 and TNF-α with coronary atherosclerotic heart disease. Exp Ther Med. (2018) 15(1):773–6. doi: 10.3892/etm.2017.5450

31. Zhang J, Hao W, Zhang J, Li T, Ma Y, Wang Y, et al. CXCL16 promotes Ly6Chigh monocyte infiltration and impairs heart function after acute myocardial infarction. J Immunol. (2023) 210(6):820–31. doi: 10.4049/jimmunol.2200249

32. Huang J, Wu N, Xiang Y, Wu L, Li C, Yuan Z, et al. Prognostic value of chemokines in patients with newly diagnosed atrial fibrillation. Int J Cardiol. (2020) 320:83–9. doi: 10.1016/j.ijcard.2020.06.030

33. Guo Q, Cheng ZM, Gonzalez-Cantú H, Rotondi M, Huelgas-Morales G, Ethiraj P, et al. TMEM127 suppresses tumor development by promoting RET ubiquitination, positioning, and degradation. Cell Rep. (2023) 42(9):113070. doi: 10.1016/j.celrep.2023.113070

34. Deng Y, Qin Y, Srikantan S, Luo A, Cheng ZM, Flores SK, et al. The TMEM127 human tumor suppressor is a component of the mTORC1 lysosomal nutrient-sensing complex. Hum Mol Genet. (2018) 27(10):1794–808. doi: 10.1093/hmg/ddy095

35. Srikantan S, Deng Y, Cheng ZM, Luo A, Qin Y, Gao Q, et al. The tumor suppressor TMEM127 regulates insulin sensitivity in a tissue-specific manner. Nat Commun. (2019) 10(1):4720. doi: 10.1038/s41467-019-12661-0

36. Chen X, Lu Q, Zhou H, Liu J, Nadorp B, Lasry A, et al. A membrane-associated MHC-I inhibitory axis for cancer immune evasion. Cell. (2023) 186(18):3903–20.e21. doi: 10.1016/j.cell.2023.07.016

37. Tan Öksüz SB, Şahin M. Thyroid and cardiovascular diseases. Turk J Med Sci. (2024) 54(7):1420–7. doi: 10.55730/1300-0144.5927

38. Lanzolla G, Marinò M, Menconi F. Graves disease: latest understanding of pathogenesis and treatment options. Nat Rev Endocrinol. (2024) 20(11):647–60. doi: 10.1038/s41574-024-01016-5

39. Băghină RM, Crișan S, Luca S, Pătru O, Lazăr MA, Văcărescu C, et al. Association between inflammation and new-onset atrial fibrillation in acute coronary syndromes. J Clin Med. (2024) 13(17):5088. doi: 10.3390/jcm13175088

40. Tang Y, Jiao Y, An X, Tu Q, Jiang Q. Neutrophil extracellular traps and cardiovascular disease: associations and potential therapeutic approaches. Biomed Pharmacother. (2024) 180:117476. doi: 10.1016/j.biopha.2024.117476

41. Huang M, Huiskes FG, de Groot NMS, Brundel B. The role of immune cells driving electropathology and atrial fibrillation. Cells. (2024) 13(4):311. doi: 10.3390/cells13040311

42. Kumar V, Narisawa M, Cheng XW. Overview of multifunctional Tregs in cardiovascular disease: from insights into cellular functions to clinical implications. FASEB J. (2024) 38(13):e23786. doi: 10.1096/fj.202400839R

43. Xia Y, Gao D, Wang X, Liu B, Shan X, Sun Y, et al. Role of Treg cell subsets in cardiovascular disease pathogenesis and potential therapeutic targets. Front Immunol. (2024) 15:1331609. doi: 10.3389/fimmu.2024.1331609

44. Liu Z, Ke SR, Shi ZX, Zhou M, Sun L, Sun QH, et al. Dynamic transition of Tregs to cytotoxic phenotype amid systemic inflammation in Graves’ ophthalmopathy. JCI Insight. (2024) 9(22):e181488. doi: 10.1172/jci.insight.181488

45. Choo EH, Lee JH, Park EH, Park HE, Jung NC, Kim TH, et al. Infarcted myocardium-primed dendritic cells improve remodeling and cardiac function after myocardial infarction by modulating the regulatory T cell and macrophage polarization. Circulation. (2017) 135(15):1444–57. doi: 10.1161/CIRCULATIONAHA.116.023106

46. Ying W. Phenomic studies on diseases: potential and challenges. Phenomics. (2023) 3(3):285–99. doi: 10.1007/s43657-022-00089-4

Keywords: hyperthyroidism, atrial fibrillation, weighted gene co-expression network analysis, machine learning, biomarkers

Citation: Wang L, Yang K, Kang R, Liu P and Deng Y (2025) Combining WGCNA and machine learning to identify mechanisms and biomarkers of hyperthyroidism and atrial fibrillation. Front. Cardiovasc. Med. 12:1694255. doi: 10.3389/fcvm.2025.1694255

Received: 28 August 2025; Accepted: 10 November 2025; Published: 20 November 2025.

Edited by:

Georges Michel Nemer, Hamad Bin Khalifa University, Qatar

Reviewed by:

Lihong Pan, University of Mississippi Medical Center, United States
Peng Li, Nanjing Medical University, China

Copyright: © 2025 Wang, Yang, Kang, Liu and Deng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yongzhi Deng
