AUTHOR=Liu Yanyan, Yin Zhenglang, Wang Yao, Chen Haohao
TITLE=Exploration and validation of key genes associated with early lymph node metastasis in thyroid carcinoma using weighted gene co-expression network analysis and machine learning
JOURNAL=Frontiers in Endocrinology
VOLUME=Volume 14 - 2023
YEAR=2023
URL=https://www.frontiersin.org/journals/endocrinology/articles/10.3389/fendo.2023.1247709
DOI=10.3389/fendo.2023.1247709
ISSN=1664-2392
ABSTRACT=Background: Thyroid carcinoma (THCA), the most common endocrine neoplasm, typically behaves indolently; in some cases, however, lymph node metastasis (LNM) arises at an early stage, and the underlying mechanisms are not yet fully understood. Materials and Methods: Co-expression networks were constructed using the "WGCNA" R package. Unsupervised clustering was performed with the "ConsensusClusterPlus" R package. The ImmuCellAI database was used to evaluate immune cell infiltration. The LASSO, SVM, and Random Forest algorithms were run separately using the "glmnet", "e1071", and "randomForest" R packages, respectively. Molecular docking was performed using the online Mcule 1-Click Docking server. Gene and protein expression levels were experimentally validated using RT-qPCR and immunohistochemistry. Results: Through WGCNA and PPI network analysis, twelve hub genes from two gene modules were identified as the most relevant to LNM potential. The 12 hub genes were differentially expressed in THCA and were significantly correlated with decreased neutrophil infiltration, increased dendritic cell and macrophage infiltration, and activation of the EMT pathway. We propose a novel molecular classification approach and an online web-based nomogram (http://www.empowerstats.net/pmodel/?m=17617_LNM) for evaluating the LNM potential of THCA. Machine learning algorithms identified ERBB3 as the most critical gene associated with LNM potential in THCA.
Differential methylation partially explains the differential expression of ERBB3. ROC analysis identified ERBB3 as a diagnostic marker for THCA (AUC=0.89), THCA with high LNM potential (AUC=0.75), and lymph nodes with tumor metastasis (AUC=0.86). We also present a comprehensive review of endocrine-disrupting chemical (EDC) exposures, environmental toxins, and pharmacological agents that may affect LNM potential. Conclusion: Using bioinformatics analysis, our study identified gene modules and hub genes that influence LNM potential in THCA patients.