AUTHOR=Li Jinzhou, Cui Ting, Huang Zeping, Mu Yanxi, Yao Yalong, Xu Wei, Chen Kang, Liu Haipeng, Wang Wenjie, Chen Xiao
TITLE=Analysis of risk factors for lymph node metastasis and prognosis study in patients with early gastric cancer: A SEER data-based study
JOURNAL=Frontiers in Oncology
VOLUME=Volume 13 - 2023
YEAR=2023
URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2023.1062142
DOI=10.3389/fonc.2023.1062142
ISSN=2234-943X
ABSTRACT=Background: Lymph node status is an important determinant of prognosis in patients with early gastric cancer (EGC), and preoperative diagnosis of lymph node metastasis (LNM) has limitations. This study explored the risk factors for LNM and the independent prognostic factors in EGC patients, and constructed a clinical prediction model for LNM. Methods: Clinicopathological data of EGC patients were collected from the public Surveillance, Epidemiology, and End Results (SEER) database. Univariate and multivariate logistic regression were used to identify risk factors for LNM in EGC patients. A nomogram was developed based on the results of the multivariate regression, and the performance of the LNM model was evaluated by the C-index, calibration curve, receiver operating characteristic (ROC) curve, decision curve analysis (DCA) curve, and clinical impact curve (CIC). An independent data set was obtained from China for external validation. The Kaplan-Meier method and Cox regression model were used to identify potential prognostic factors for overall survival (OS) in EGC patients. Results: A total of 3993 EGC patients were randomly allocated to a training cohort (n=2797) and a validation cohort (n=1196). An external cohort of 106 patients from the Second Hospital of Lanzhou University was used for external validation. Univariate and multivariate logistic regression showed that age, tumor size, differentiation, and examined lymph node count (ELNC) were independent risk factors for LNM.
A nomogram for predicting LNM in EGC patients was developed and validated. The predictive model showed good discriminatory performance, with a concordance index (C-index) of 0.702 (95% CI: 0.679-0.725). The calibration plots showed that the predicted LNM probabilities were consistent with the actual observations in both the internal and external validation cohorts. The AUC values for the training, internal validation, and external validation cohorts were 0.702 (95% CI: 0.679-0.725), 0.709 (95% CI: 0.674-0.744), and 0.750 (95% CI: 0.607-0.892), respectively, and the DCA curves and CIC showed good clinical applicability. Conclusions: In this study, we identified risk factors for LNM and independent prognostic factors in EGC patients, and developed a relatively accurate model to predict LNM in EGC patients.
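
Note on the reported metrics: the C-index and the training-cohort AUC are both 0.702 with the same confidence interval. This is expected, because for a binary outcome such as LNM the concordance index equals the area under the ROC curve. The sketch below illustrates that definition by counting concordant case/non-case pairs. It is not the authors' code: the function name `c_index_binary` and the toy labels and risk scores are hypothetical and are not taken from the SEER or Lanzhou cohorts.

```python
from itertools import product


def c_index_binary(y_true, y_score):
    """Concordance index for a binary outcome (equal to the ROC AUC).

    y_true  : iterable of 0/1 labels (1 = LNM present, by assumption)
    y_score : iterable of predicted risks, same length as y_true
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:        # the case is ranked above the non-case
            concordant += 1.0
        elif p == n:     # ties count as half a concordant pair
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_lnm = [0, 0, 1, 0, 1, 1]
toy_risk = [0.10, 0.35, 0.30, 0.20, 0.80, 0.60]
toy_c = c_index_binary(toy_lnm, toy_risk)
print(round(toy_c, 3))
```

In the toy data, 8 of the 9 case/non-case pairs are ranked correctly, so the index is 8/9. A value of 0.5 would correspond to a model that ranks no better than chance.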