- 1 The Second Clinical Medical School, Anhui University of Chinese Medicine, Hefei, China
- 2 Department of Research, The Second Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
- 3 Department of Gastroenterology, Hefei Second People’s Hospital, Hefei, China
- 4 Department of Gastroenterology, The Second Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
Background: Chronic atrophic gastritis (CAG), a precancerous condition of the stomach, is a major digestive disorder, and its prognosis is influenced by many sociodemographic and clinicopathologic characteristics. This retrospective observational multicenter analysis was conducted to explore risk factors and construct a predictive model for low-grade intraepithelial neoplasia (LGIN) in patients with CAG.
Methods: The training dataset included 317 CAG patients diagnosed and treated in the Second Affiliated Hospital of Anhui University of Chinese Medicine from September 2018 to January 2025. All the baseline characteristics, including gender, age, education, basic diseases, blood indicators, and pathological mechanism during treatment of CAG, were recorded and selected based on both the least absolute shrinkage and selection operator (LASSO) regression analysis with 10-fold cross-validation and logistic regression analysis. After that, the nomogram was established, and its accuracy and predictive performance were evaluated via the area under the receiver operating characteristic (ROC) curves (AUC), calibration curves, Hosmer–Lemeshow goodness-of-fit test, and decision curve analysis (DCA) curves. For the validation dataset, the medical record information of 92 CAG patients diagnosed and treated in the Hefei Second People’s Hospital from November 2023 to January 2025 was recorded for subsequent analysis.
Results: LASSO regression followed by multivariate logistic regression identified family history, HP infection, pepsinogen I, pepsinogen II, bile reflux, and Kimura–Takemoto classification (C3 vs. C1) as significant independent predictors, and the fitting equation was obtained. A nomogram for predicting LGIN in CAG patients was established. The model showed good discrimination, with an AUC of 0.838 (95% CI = 0.789–0.887), a specificity of 0.761, and a sensitivity of 0.791 in the training dataset and an AUC of 0.941 (95% CI = 0.893–0.989), a specificity of 0.852, and a sensitivity of 0.908 in the validation dataset. Moreover, calibration and DCA curves indicated that the model was well calibrated and offered a net benefit for predicting LGIN in CAG patients.
Conclusions: Family history, HP infection, pepsinogen I, pepsinogen II, bile reflux, and Kimura–Takemoto classification were independent predictors of LGIN in CAG patients, and the predictive model built on them showed high accuracy and good calibration.
Introduction
Chronic atrophic gastritis (CAG) is a progressive inflammatory condition characterized by the thinning and degradation of the gastric mucosa, accompanied by the loss of functional gastric glands (1, 2), and it remains a significant global health concern, particularly affecting aging populations and regions with high Helicobacter pylori (HP) infection rates, such as parts of East Asia, Eastern Europe, and South America (3). Symptoms of CAG are diverse and may include abdominal discomfort, often presenting as a dull pain, fullness, or a burning sensation in the upper abdomen. Nausea and vomiting are also prevalent, especially after meals. Some patients may experience a reduced appetite, leading to weight loss over time (4). Meanwhile, CAG may progress to more severe conditions, and the risk of developing peptic ulcers increases as the gastric mucosa deteriorates (5). Moreover, long-term inflammation can cause intestinal metaplasia, a precancerous change that significantly raises the risk of gastric adenocarcinoma (6, 7).
The cause of CAG is often long-term HP infection (8), autoimmune responses (9), or chronic exposure to irritants such as alcohol, smoking, long-term bile reflux, or non-steroidal anti-inflammatory drugs (NSAIDs) (10). Over time, the damaged mucosa may undergo intestinal metaplasia, where stomach cells are replaced by intestinal-type cells, impairing acid production and digestive function (11). Symptoms can range from mild indigestion and bloating to severe deficiencies in vitamin B12 or iron due to malabsorption (12). The progression of CAG to more severe precancerous stages, such as low-grade intraepithelial neoplasia (LGIN), is influenced by multiple factors, including HP infection and alcohol and tobacco intake. LGIN is marked by cellular atypia and architectural distortion confined to the epithelial layer and often emerges in this milieu of chronic injury (13). Early detection through endoscopic surveillance and eradication of HP are critical to halting progression to high-grade dysplasia or invasive carcinoma (14). However, which CAG patients are likely to harbor LGIN remains unclear.
Hence, this study was conducted to develop and validate a novel nomogram that incorporates clinicopathologic factors associated with LGIN to predict LGIN in CAG patients.
Materials and methods
Data source and participants’ information
Our multicenter, retrospective, observational study of CAG patients admitted to two hospitals was approved by the Ethics Committee of the Second Affiliated Hospital of Anhui University of Chinese Medicine and the Hefei Second People’s Hospital and followed the Declaration of Helsinki. The clinical data of 317 patients from September 2018 to January 2025 in the Department of Gastroenterology of the Second Affiliated Hospital of Anhui University of Chinese Medicine were used as the training dataset, and the clinical data of 92 patients from November 2023 to January 2025 in the Department of Gastroenterology of the Hefei Second People’s Hospital were used as the validation dataset.
The inclusion criteria for our clinical data collection were as follows: a) age over 18 years old, b) patients diagnosed through endoscopy or pathology demonstrating CAG, c) patients having complete and searchable clinical information such as blood biomarkers and CAG classification data, and d) patients participating in our retrospective observational study voluntarily. The exclusion criteria were a) patients with CAG who received medical treatment in the past, b) patients having incomplete clinical data, and c) patients who were not willing to participate in our retrospective observational study.
The diagnosis of gastric LGIN requires comprehensive evaluation through multimodal methods. First, endoscopic examination is the core means. Conventional gastroscopy can initially identify mucosal abnormalities (such as erythema, erosion, or mucosal roughness), while enhanced imaging techniques (such as narrow band imaging, magnifying endoscopy, or chromoendoscopy) can further observe microvascular and glandular structural changes and assist in locating suspicious areas. Secondly, histopathological analysis is the key to diagnosis, and multiple biopsies (at least three to five pieces) are required to cover the range of lesions to avoid missed diagnosis; microscopic features include mild nuclear atypia and disordered arrangement, but the lesions are limited to the lower half of the mucosa and need to be differentiated from inflammation or reparative hyperplasia. In addition, HP detection is indispensable because HP infection is an important cause of LGIN, and eradication therapy may reverse some lesions. Finally, it is necessary to combine clinical follow-up to dynamically evaluate the condition. Some cases may progress to high-grade lesions or cancer, and timely intervention is required. During the diagnostic process, attention should be paid to the consistency differences between pathologists, and multidisciplinary consultation is recommended to improve accuracy. In general, the diagnosis of LGIN relies on the close integration of endoscopy and pathology, combined with etiological evaluation and dynamic monitoring, to provide a basis for subsequent treatment decisions.
In our retrospective analysis, the least absolute shrinkage and selection operator (LASSO) and multivariate logistic regression were utilized to establish our predictive model. To limit overfitting, the model was fitted on the training dataset and evaluated on the separate validation dataset; the ratio between the two (317:92, approximately 8:2) lies within the commonly used range of 7:3 to 8:2 (15).
Characteristic selection
Consistent with previous literature (16, 17) and the aim of our retrospective analysis, the following characteristics were analyzed: a) sociodemographic characteristics [gender of patients (male or female), age at diagnosis, education (less than primary school, middle school, or upper college), and marital status (single or married)], b) questionnaire information [obesity (no, yes), hypertension (no, yes) (systolic blood pressure ≥140 mmHg or diastolic blood pressure ≥90 mmHg), depression (no, yes), frailty (no, yes), alcohol consumption (no, yes), smoking (no, yes), diabetes (no, yes), family history (no, yes), dyslipidemia (no, yes) (cholesterol ≥6.19 mmol/L or low-density lipoprotein, LDL, ≥4.14 mmol/L)], c) laboratory data [HP infection (no, yes), glucose (mmol/L), cholesterol (mmol/L), LDL (mmol/L), pepsinogen I (μg/L), pepsinogen II (μg/L), gastrin 17 (pmol/L), alpha fetoprotein (AFP) (μg/L), carcinoembryonic antigen (CEA) (μg/L), carbohydrate antigen 125 (CA125) (U/mL), carbohydrate antigen 199 (CA199) (U/mL), and D-dimer (mg/L)], and d) endoscopic characteristics [Kimura–Takemoto (KT) classification (C1, C2, or C3) and bile reflux (no, yes)].
Statistical analysis
Data were analyzed using R software 4.4.2. Continuous characteristics were presented as median (interquartile range), and comparisons between groups were performed using the Mann–Whitney U (rank sum) test. Categorical characteristics were presented as number (N) and proportion (%), and comparisons between groups were analyzed using the chi-squared test or Fisher’s exact test. Correlations among the candidate characteristics were computed with the function “cor()” and displayed as a correlation matrix heatmap, in which color-coding visualizes the strength and direction of the relationship between each pair of characteristics. Each cell in the matrix shows a correlation coefficient ranging from −1 to 1 based on Pearson’s method. LASSO regression, which is suited to high-dimensional data, was used to select the characteristics for our predictive model (18). The tuning parameter (λ) was chosen by 10-fold cross-validation, and the characteristics with non-zero coefficients at the minimum λ were retained (19). The retained characteristics were entered into a multivariate logistic regression analysis, and those with p-values less than 0.05 were considered independent predictors and used to plot a nomogram illustrating the risk of LGIN in patients with CAG (20).
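The λ selection step can be illustrated with a minimal Python sketch. It is not the code used in this study, which relied on R; it only mimics how the minimum-error λ and the more conservative one-standard-error λ are usually read off a cross-validation curve. The λ grid, mean errors, and standard errors below are hypothetical values made up for the example.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation curve.

    lambda_min: the lambda with the lowest mean cross-validated error.
    lambda_1se: the largest lambda whose mean error is still within one
                standard error of that minimum (a sparser model).
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    limit = cv_mean[i_min] + cv_se[i_min]
    lambda_1se = max(l for l, m in zip(lambdas, cv_mean) if m <= limit)
    return lambdas[i_min], lambda_1se


# Hypothetical 10-fold cross-validation summary (not data from this study).
lambdas = [0.10, 0.05, 0.02, 0.01, 0.005]
cv_mean = [1.20, 1.05, 0.98, 0.97, 0.99]   # mean deviance per lambda
cv_se = [0.05, 0.04, 0.04, 0.04, 0.04]     # standard error per lambda

lambda_min, lambda_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)
```

Choosing the minimum λ, as done here, keeps more characteristics than the one-standard-error rule, which is why a second regression step is useful to prune the retained set.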
The performance of our predictive model was assessed in both the training and validation datasets using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity. Moreover, the calibration curve and the Hosmer–Lemeshow goodness-of-fit test were used to evaluate calibration (21). Decision curve analysis (DCA) was conducted to assess the clinical usefulness of the predictive model by quantifying the net benefits at different threshold probabilities (22).
In the construction of the clinical model and in the subsequent analysis, R version 4.4.2 (http://www.r-project.org, R Foundation for Statistical Computing) was utilized. Baseline characteristics were summarized with the “tableone” package, and the table was drawn with the “flextable” and “officer” packages. The LASSO regression analysis used the “glmnet” package. The “rms” package was used to fit the logistic regression model and plot the nomogram. ROC and DCA curves were plotted with the “pROC” and “rmda” packages, respectively, and the “ResourceSelection” package was used to perform the Hosmer–Lemeshow goodness-of-fit test. A two-sided p-value less than 0.05 was considered statistically significant.
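For readers less familiar with these discrimination metrics, the following Python sketch shows one common way to compute them. It is an illustration only and not the study's R workflow: the predicted probabilities, outcome labels, and the 0.5 cutoff are hypothetical, and the AUC is computed as the proportion of LGIN/non-LGIN pairs in which the LGIN patient receives the higher predicted probability.

```python
def auc_rank(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities and outcomes (1 = LGIN, 0 = non-LGIN).
scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 1, 0]

auc = auc_rank(scores, labels)
sensitivity, specificity = sens_spec(scores, labels, cutoff=0.5)
print(round(auc, 3), round(sensitivity, 3), round(specificity, 3))
```

Sensitivity and specificity depend on the chosen cutoff, whereas the AUC summarizes discrimination across all cutoffs.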
Results
Participant baseline characteristics
Detailed baseline information on the sociodemographic and clinicopathologic characteristics of both the training and validation datasets is shown in Table 1. The training dataset comprised 317 CAG patients, 92 without LGIN (29%) and 225 with LGIN (71%), including 144 male patients (45.4%) and 173 female patients (54.6%); the mean (SD) age was 62.03 ± 12.11 years in non-LGIN patients and 61.28 ± 11.91 years in LGIN patients. The validation dataset comprised 92 CAG patients, 27 without LGIN (29.3%) and 65 with LGIN (70.7%), including 39 male patients (42.4%) and 53 female patients (57.6%); the mean (SD) age was 60.59 ± 15.36 years in non-LGIN patients and 59.88 ± 12.30 years in LGIN patients, as shown in Supplementary Table S1 and Table 1. Nine characteristics differed significantly between the training and validation datasets: hypertension (p < 0.001), dyslipidemia (p < 0.001), HP infection (p = 0.008), glucose (p = 0.001), pepsinogen II (p < 0.001), AFP (p < 0.001), CEA (p = 0.001), CA125 (p < 0.001), and CA199 (p = 0.003).
Correlation heatmap of the predictive characteristics in the training dataset
As depicted in Figure 1, the correlation matrix heatmap of our training dataset shows the pairwise correlations between the predictive characteristics. In the heatmap, warm colors (e.g., red) indicate strong positive correlations, cool colors (e.g., blue) represent negative correlations, and neutral colors (e.g., white) denote weak or no correlation. The correlation analysis covered the predictive variables of our model, which included gender, age, level of education, marital status, obesity, hypertension, depression, frailty, alcohol consumption, smoking, diabetes, dyslipidemia, family history, HP infection, glucose, cholesterol, pepsinogen I, pepsinogen II, gastrin 17, AFP, CEA, CA125, CA199, D-dimer, Kimura–Takemoto classification, and bile reflux. Because the heatmap showed only weak pairwise correlations and no evident multicollinearity, all recorded characteristics were carried forward to the subsequent LASSO regression analysis.
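As a minimal illustration of what each cell of such a heatmap contains, the Python sketch below computes Pearson correlation coefficients for every pair of columns. It is not the study's code, which used R; the column names and values are hypothetical toy data chosen so that the expected coefficients are obvious.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(columns):
    """Nested dict holding the pairwise correlation of every column pair."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy columns (not data from this study).
toy = {
    "marker_a": [1.0, 2.0, 3.0, 4.0],
    "marker_b": [2.0, 4.0, 6.0, 8.0],   # rises with marker_a
    "marker_c": [4.0, 3.0, 2.0, 1.0],   # falls as marker_a rises
}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Coefficients close to 1 or −1 between two candidate predictors would signal possible multicollinearity, while values near 0 indicate that the predictors carry largely independent information.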
Selection of predictive characteristics and nomogram establishment
In our LASSO logistic regression analysis, we leveraged 10-fold cross-validation to obtain the optimal parameter λ for our predictive model and finally screened nine characteristics: alcohol consumption, smoking, family history, HP infection, glucose, pepsinogen I, pepsinogen II, gastrin 17, bile reflux, and Kimura–Takemoto classification (p < 0.05), as depicted in Figure 2. The characteristics with non-zero coefficients in the two LASSO plots were taken as the underlying factors of LGIN. Based on the results of the LASSO logistic regression analysis, the fitting equation of our predictive model is as follows:

Figure 2. Selection of sociodemographic and clinicopathologic characteristics using the LASSO regression analysis. (A) The LASSO coefficient profiles of the candidate characteristics. A coefficient profile plot was produced against the log(λ) sequence. By using 10-fold cross-validation, the characteristics with non-zero coefficients at the optimal λ were selected. (B) The optimal parameter (λ) in the LASSO model was selected via 10-fold cross-validation using minimum criteria. The left dashed line represents λ.min and the right dashed line represents λ.1se.
where the variables of the equation represent, in order, alcohol consumption (No = 0, Yes = 1), smoking (No = 0, Yes = 1), family history (No = 0, Yes = 1), HP infection (No = 0, Yes = 1), pepsinogen I value, pepsinogen II value, gastrin 17 value, bile reflux (No = 0, Yes = 1), and Kimura–Takemoto classification (C1, C2, C3), and the constant term of the formula (0.601) is the intercept.
Multivariate logistic regression analysis further showed that family history (OR = 3.111, 95% CI = 1.620–6.039), HP infection (OR = 4.810, 95% CI = 2.335–10.16), pepsinogen I (OR = 0.982, 95% CI = 0.970–0.993), pepsinogen II (OR = 0.832, 95% CI = 0.755–0.912), bile reflux (OR = 2.388, 95% CI = 1.212–4.727), and Kimura–Takemoto classification (C3 vs. C1) (OR = 3.874, 95% CI = 1.693–9.264) were independently associated with LGIN in CAG patients (p < 0.05), as presented in Table 2. The β values of three characteristics, namely pepsinogen I, pepsinogen II, and gastrin 17, were all less than zero, and the corresponding OR values were less than one, which means that higher values of these three characteristics were protective against LGIN in CAG patients.
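The link between β values and odds ratios in a logistic regression is OR = exp(β), so a negative β always corresponds to an OR between zero and one. The short Python sketch below applies this identity to the odds ratios reported above; it is an illustration of the arithmetic rather than a refit of the model, and the dictionary keys are labels chosen for the example.

```python
import math

# Odds ratios as reported in the text for the multivariate model.
reported_or = {
    "family history": 3.111,
    "HP infection": 4.810,
    "pepsinogen I": 0.982,
    "pepsinogen II": 0.832,
    "bile reflux": 2.388,
    "KT classification (C3 vs. C1)": 3.874,
}

# beta = ln(OR); beta < 0 is equivalent to OR < 1 (a protective direction).
betas = {name: math.log(value) for name, value in reported_or.items()}
protective = [name for name, beta in betas.items() if beta < 0]

# For a continuous predictor the OR applies per unit, so the effect of a
# 10-unit increase is exp(10 * beta).
or_per_10_units_pg1 = math.exp(10 * betas["pepsinogen I"])

for name, beta in betas.items():
    print(f"{name}: beta = {beta:+.3f}")
print("protective direction:", protective)
print("pepsinogen I, OR per 10 units:", round(or_per_10_units_pg1, 3))
```

Because pepsinogen I and pepsinogen II are continuous, their per-unit odds ratios look close to one even though the cumulative effect over a clinically relevant range can be substantial; for pepsinogen I, a 10-unit increase corresponds to an OR of roughly 0.83.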
Meanwhile, the predictive model was plotted as a nomogram constructed using family history, HP infection, pepsinogen I, pepsinogen II, bile reflux, and Kimura–Takemoto classification, as shown in Figure 3. The nomogram provides a visual representation of the impact of each factor, helping doctors conduct individualized risk evaluations in clinical practice. For example, if a patient with CAG had family history, HP infection with pepsinogen I (170) and pepsinogen II (17), bile reflux, and C2 classification, then the patient’s corresponding scores would be approximately 43, 60, 0, 0, 30, and 30, respectively, with a total score of 133. This corresponds to an estimated probability of LGIN of approximately 23%.
Validation and calibration of the predictive model
As shown in Figures 4A, B, the discrimination power of our predictive model for LGIN in CAG patients was assessed by the AUC values of the ROC curves in both the training and validation datasets. The AUC was 0.838 (95% CI = 0.789–0.887), with a specificity of 0.761 and a sensitivity of 0.791, in the training dataset and 0.941 (95% CI = 0.893–0.989), with a specificity of 0.852 and a sensitivity of 0.908, in the validation dataset. The AUC in the validation dataset (0.941) was higher than that in the training dataset (0.838), which indicates that the discrimination of our predictive model was maintained in the external dataset. Furthermore, the Hosmer–Lemeshow goodness-of-fit (GOF) test and the calibration curve were utilized to evaluate our predictive model; a p-value of the Hosmer–Lemeshow GOF test greater than 0.05 indicates no significant lack of fit. The results showed that our model had a good fit for the training (χ2 = 4.1407, df = 8, p-value = 0.8442) and validation datasets (χ2 = 3.3873, df = 8, p-value = 0.9078). The calibration curves for the nomogram based on our multivariate logistic regression analysis in the training and validation datasets are depicted in Figures 5A, B and demonstrated good agreement between the predicted and observed outcomes. In clinical practice, calibration curves are commonly used to evaluate and optimize predictive models, such as those estimating the probability of postoperative complications. By comparing the predicted probability with the actual occurrence probability, doctors can develop more accurate monitoring and intervention plans for LGIN in CAG patients. Moreover, the DCA in both the training and validation datasets indicated that the net benefit of our predictive model was consistently better than that of the two extreme strategies (treat all and treat none) across a wide range of threshold probabilities, supporting its potential clinical utility, as shown in Figure 6.

Figure 4. Receiver operating characteristic (ROC) curve of our predictive model. (A) Training dataset. (B) Validation dataset.
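The Hosmer–Lemeshow p-values reported above can be checked directly from the χ2 statistics, because the test statistic is referred to a chi-squared distribution with the stated degrees of freedom. The Python sketch below is an independent check and not part of the study's R workflow; it uses the closed-form upper-tail probability that holds when the degrees of freedom are even, and the function name is chosen for this example.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-squared variable with an even number of
    degrees of freedom, using the finite-sum closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term
        term *= half / (i + 1)
    return math.exp(-half) * total


# Statistics reported in the text (df = 8 in both datasets).
p_training = chi2_upper_tail_even_df(4.1407, 8)
p_validation = chi2_upper_tail_even_df(3.3873, 8)
print(round(p_training, 4), round(p_validation, 4))
```

Both values agree with the reported p-values to the precision given, and both are far above 0.05, so neither dataset shows a significant departure between predicted and observed event rates.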
According to the DCA, doctors can choose the most appropriate intervention threshold based on changes in net benefit. This helps avoid excessive or inappropriate intervention and improves the quality of medical decision-making for LGIN in CAG patients. Together, the calibration curves and the DCA indicated that our model has good calibration, clinical applicability, and generalizability.
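The net benefit underlying a decision curve is commonly defined as TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability, TP and FP are the numbers of true and false positives at that threshold, and N is the sample size. The Python sketch below illustrates this definition on hypothetical data; it is not the study's analysis, and treating predictions equal to the threshold as positive is an assumption made for the example.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of intervening on patients whose predicted
    probability is at or above the threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of intervening on every patient regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted probabilities and outcomes (1 = LGIN, 0 = non-LGIN).
probs = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 1, 0]

nb_model = net_benefit(probs, labels, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
nb_none = 0.0  # intervening on nobody yields zero net benefit by definition
print(nb_model, nb_all, nb_none)
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies, which is the pattern described for the training and validation datasets.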

Figure 6. Decision curve analysis (DCA) of our predictive model. (A) Training dataset. (B) Validation dataset.
Discussion
Our multicenter retrospective analysis established a clinical predictive model based on LASSO and multivariate logistic regression. At baseline, nine characteristics, namely, hypertension (p < 0.001), dyslipidemia (p < 0.001), HP infection (p = 0.008), glucose (p = 0.001), pepsinogen II (p < 0.001), AFP (p < 0.001), CEA (p = 0.001), CA125 (p < 0.001), and CA199 (p = 0.003), differed significantly between the training and validation datasets. The LASSO and multivariate logistic regression analyses identified family history, HP infection, pepsinogen I, pepsinogen II, bile reflux, and Kimura–Takemoto classification as factors associated with LGIN in CAG patients. Some literature suggested that LGIN may act as a critical transitional stage in the progression of CAG to advanced premalignant lesions, with its presence correlating with more severe mucosal atrophy and intestinal metaplasia (23–25). Notably, CAG patients with LGIN exhibited a higher likelihood of multifocal atrophy and accelerated histological deterioration, aligning with prior evidence that LGIN serves as a marker of genomic instability in the gastric mucosa (26). Our retrospective observational analysis aimed to evaluate how well these characteristics, taken together, predict LGIN in patients with CAG, in order to offer a foundation for the diagnosis and treatment of CAG.
A family history of gastric cancer or premalignant conditions was significantly associated with advanced CAG, with clinical and molecular evidence highlighting inherited susceptibility as a key modifier of disease severity. Epidemiological studies consistently demonstrate that individuals with a family history of gastric cancer or premalignant gastric lesions exhibit a two- or three-fold increased risk of developing advanced CAG, independent of HP infection or environmental exposures (27, 28). This association is likely mediated by genetic polymorphisms in pathways regulating gastric mucosal homeostasis, such as pro-inflammatory cytokines (29), tumor suppressor genes (30), and genes involved in acid secretion (31). Clinically, patients with a family history often present with earlier-onset, multifocal atrophy and accelerated progression to intestinal metaplasia or dysplasia (32).
HP infection is a major etiological factor in the development of CAG, a condition characterized by progressive loss of gastric glandular structures and mucosal thinning (33). HP colonizes the gastric epithelium, triggering a persistent inflammatory response mediated by bacterial virulence factors (e.g., CagA and VacA toxins) and host immune reactions (34). Over time, chronic inflammation disrupts gastric homeostasis, leading to glandular atrophy, parietal cell loss, and hypochlorhydria (reduced stomach acid secretion) (35). These pathological changes are hallmarks of CAG and significantly increase the risk of metaplastic transformations, such as intestinal metaplasia, which is a precursor to gastric cancer (36). Several studies revealed that long-term HP infection accelerates the progression from non-atrophic gastritis to CAG, with bacterial persistence, host genetic susceptibility, and environmental cofactors (e.g., smoking, high-salt diet) influencing disease severity (37, 38).
Pepsinogen I and pepsinogen II, the precursors of the digestive enzyme pepsin, serve as important biomarkers for CAG. In CAG patients, progressive atrophy of the gastric glands leads to reduced secretion of pepsinogen I (produced primarily in the gastric corpus/fundus), while pepsinogen II (produced throughout the stomach) levels remain relatively stable (39). This results in a characteristic decrease in the pepsinogen I/pepsinogen II ratio, which has become a validated non-invasive diagnostic indicator for gastric mucosal atrophy (40). Serum pepsinogen testing (pepsinogen I ≤70 μg/L and pepsinogen I/pepsinogen II ratio ≤3) is widely used to screen for CAG, particularly in high-risk populations for gastric cancer (41). The severity of corpus atrophy correlates strongly with declining pepsinogen I levels, reflecting the loss of acid-secreting parietal cells and enzyme-producing chief cells in CAG progression. This serological approach is especially valuable for detecting early-stage atrophy before endoscopically visible changes occur (42).
Bile reflux, the backward flow of duodenal contents (including bile acids, pancreatic enzymes, and intestinal fluid) into the stomach, is increasingly recognized as a contributing factor to CAG. Prolonged bile reflux damages the gastric mucosal barrier through multiple mechanisms: bile acids disrupt surface mucous cells, induce oxidative stress, and trigger chronic inflammation, accelerating glandular atrophy (43, 44). This process often coexists with HP infection, creating a synergistic effect that exacerbates mucosal injury and impairs healing (45). Bile acids also inhibit proton pump function, reducing gastric acid secretion and altering the gastric microenvironment, which may further promote epithelial metaplasia and atrophy (46). Endoscopically, bile reflux is associated with mucosal erythema, erosions, and bile-stained fluid in the stomach. Chronic exposure to bile reflux correlates with advanced CAG stages and intestinal metaplasia, raising the risk of gastric carcinogenesis (47).
The Kimura–Takemoto classification is a widely used endoscopic grading system that evaluates the extent and pattern of gastric mucosal atrophy in CAG (48). It categorizes atrophy into two main types: closed type (C-type) and open type (O-type), based on the progression of atrophic borders observed during endoscopy. In closed-type atrophy, the atrophic changes remain confined to the lesser curvature of the stomach, while open-type atrophy involves expansion toward the greater curvature and fundus, reflecting more advanced disease (49). In both participating hospitals (the Second Affiliated Hospital of Anhui University of Chinese Medicine and the Hefei Second People’s Hospital), only closed-type (C-type) atrophy was recorded, so the baseline data of our retrospective analysis include only the C-type. This classification correlates closely with histopathological severity, acid secretion levels, and gastric cancer risk (50). The Kimura–Takemoto system aids clinicians in stratifying CAG patients for surveillance, as open-type patterns warrant closer endoscopic monitoring due to their strong link to gastric carcinogenesis (51).
LGIN represents a precancerous lesion in the stomach and is closely associated with CAG. In CAG, prolonged mucosal inflammation and glandular atrophy create a microenvironment conducive to genetic and epigenetic alterations, promoting the development of cellular dysplasia (52). LGIN, characterized by mild-to-moderate architectural distortion and cytological atypia confined to the epithelial layer, frequently arises in areas of CAG with intestinal metaplasia (53) and represents an early neoplastic transformation within this spectrum (54). While LGIN itself carries a lower risk of progression to invasive adenocarcinoma compared to high-grade dysplasia, its presence in CAG significantly elevates cancer risk (55). The combination of atrophic changes, metaplasia, and dysplasia in CAG exemplifies the multistep “Correa cascade” of gastric cancer development (56).
Our retrospective observational analysis has several limitations. First, our study is subject to selection bias. Although it was a multicenter study, the cohort may not represent broader demographic or geographic populations, the sample size of the validation cohort (N = 92) was small, and significant differences in some baseline characteristics, including hypertension, dyslipidemia, and HP infection, between the training and validation cohorts (as shown in Table 1) suggest intercenter variability, limiting model generalizability. Second, the LASSO and multivariate logistic regression used in our analysis are conventional methods, and machine learning algorithms might be used to establish a novel predictive model with better performance (57). Finally, the recorded characteristics may themselves be biased by the short follow-up duration and by serological and histological variability: the pepsinogen I/pepsinogen II ratio and histopathological grading were subject to interlaboratory variability and interobserver discrepancies (58).
Conclusion
In our multicenter retrospective analysis, we identified several factors independently associated with LGIN in CAG patients (family history, HP infection, pepsinogen I, pepsinogen II, bile reflux, and Kimura–Takemoto classification) and established a predictive model to estimate the risk of LGIN in CAG patients.
Data availability statement
Publicly available datasets were analyzed in this study. All recorded datasets can be obtained via email from the corresponding author.
Ethics statement
This study was approved by the Ethics Committee of the Second Affiliated Hospital of Anhui University of Chinese Medicine and the Hefei Second People’s Hospital. All participants volunteered to participate in this study and signed the informed consent form. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.
Author contributions
WD: Conceptualization, Methodology, Visualization, Writing – original draft. CZ: Conceptualization, Resources, Writing – review & editing. HC: Data curation, Writing – review & editing. MG: Conceptualization, Data curation, Writing – review & editing. XX: Data curation, Writing – review & editing. BP: Methodology, Visualization, Writing – review & editing. YZ: Conceptualization, Writing – review & editing. BS: Conceptualization, Writing – review & editing. XL: Conceptualization, Methodology, Project administration, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This study was supported by Anhui Province Major Difficult Diseases Collaborative Research Project of Traditional Chinese Medicine and Western Medicine (Anhui Traditional Chinese Medicine Development Secret (2021) No. 70), Major Project of Anhui Provincial Department of Education (2024AH040149), and Anhui Province Key Research and Development Plan Project (2022e07020023).
Acknowledgments
We would like to acknowledge the support from the Second Affiliated Hospital of Anhui University of Chinese Medicine and the Hefei Second People’s Hospital.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1597099/full#supplementary-material
References
1. Livzan MA, Gaus OV, Mozgovoi SI, and Bordin DS. Chronic autoimmune gastritis: modern diagnostic principles. Diagnostics (Basel). (2021) 11:2113. doi: 10.3390/diagnostics11112113
2. Zhang ZP, Zhu TT, Zhang L, Xing YC, Yan ZQ, and Li QS. Critical influence of cytokines and immune cells in autoimmune gastritis. Autoimmunity. (2023) 56:2174531. doi: 10.1080/08916934.2023.2174531
3. Nguyen CL, Dao TT, Phi TN, Nguyen TP, Pham VT, and Vu TK. Serum pepsinogen: A potential non-invasive screening method for moderate and severe atrophic gastritis among an Asian population. Ann Med Surg (Lond). (2022) 78:103844. doi: 10.1016/j.amsu.2022.103844
4. Chinese Society of Gastroenterology, Cancer Collaboration Group of Chinese Society of Gastroenterology, Chinese Medical Association. Guidelines for diagnosis and treatment of chronic gastritis in China (2022, Shanghai). J Dig Dis. (2023) 24:150–80. doi: 10.1111/1751-2980.13193
5. Hagiwara T, Mukaisho K, Nakayama T, Sugihara H, and Hattori T. Long-term proton pump inhibitor administration worsens atrophic corpus gastritis and promotes adenocarcinoma development in Mongolian gerbils infected with Helicobacter pylori. Gut. (2011) 60:624–30. doi: 10.1136/gut.2010.207662
6. Shin SY, Kim JH, Chun J, Yoon YH, and Park H. Chronic atrophic gastritis and intestinal metaplasia surrounding diffuse-type gastric cancer: Are they just bystanders in the process of carcinogenesis? PLoS One. (2019) 14:e0226427. doi: 10.1371/journal.pone.0226427
7. Wang RP, Song SM, Qin JJ, Yoshimura K, Peng FD, Chu YS, et al. Evolution of immune and stromal cell states and ecotypes during gastric adenocarcinoma progression. Cancer Cell. (2023) 41:1407–1426.e9. doi: 10.1016/j.ccell.2023.06.005
8. Zhang ZF and Zhang XG. Chronic atrophic gastritis in different ages in South China: a 10-year retrospective analysis. BMC Gastroenterol. (2023) 23:37. doi: 10.1186/s12876-023-02662-1
9. Lahner E, Conti L, Annibale B, and Corleto VD. Current perspectives in atrophic gastritis. Curr Gastroenterol Rep. (2020) 22:38. doi: 10.1007/s11894-020-00775-1
10. Kuang WH, Xu JL, Xu FT, Huang WZ, Majid M, Shi H, et al. Current study of pathogenetic mechanisms and therapeutics of chronic atrophic gastritis: a comprehensive review. Front Cell Dev Biol. (2024) 12:1513426. doi: 10.3389/fcell.2024.1513426
11. Jin RU and Mills JC. Are gastric and esophageal metaplasia relatives? The case for Barrett’s stemming from SPEM. Dig Dis Sci. (2018) 63:2028–41. doi: 10.1007/s10620-018-5150-0
12. Lenti MV, Rugge M, Lahner E, Miceli E, Toh BH, Genta RM, et al. Autoimmune gastritis. Nat Rev Dis Primers. (2020) 6:56. doi: 10.1038/s41572-020-0187-8
13. Voltaggio L, Cimino-Mathews A, Bishop JA, Argani P, Cuda JD, Epstein JI, et al. Current concepts in the diagnosis and pathobiology of intraepithelial neoplasia: A review by organ system. CA Cancer J Clin. (2016) 66:408–36. doi: 10.3322/caac.21350
14. Cao Y, Wang DC, Mo GY, Peng Y, and Li Z. Gastric precancerous lesions: occurrence, development factors, and treatment. Front Oncol. (2023) 13:1226652. doi: 10.3389/fonc.2023.1226652
15. Wu WT, Li YJ, Feng AZ, Li L, Huang T, Xu AD, et al. Data mining in clinical big data: the frequently used databases, steps, and methodological models. Mil Med Res. (2021) 8:44. doi: 10.1186/s40779-021-00338-z
16. Weng SF, Hsu HR, Weng YL, Tien KJ, and Kao HY. Health-related quality of life and medical resource use in patients with osteoporosis and depression: A cross-sectional analysis from the national health and nutrition examination survey. Int J Environ Res Public Health. (2020) 17:1124. doi: 10.3390/ijerph17031124
17. Xu RX, Chen YX, Yao ZH, Wu W, Cui JX, Wang RQ, et al. Application of machine learning algorithms to identify people with low bone density. Front Public Health. (2024) 12:1347219. doi: 10.3389/fpubh.2024.1347219
18. Sauerbrei W, Royston P, and Binder H. Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Stat Med. (2007) 26:5512–28. doi: 10.1002/sim.v26:30
19. Yuan XF, Xu QR, Du FX, Gao XX, Guo J, Zhang JN, et al. Development and validation of a model to predict cognitive impairment in traumatic brain injury patients: a prospective observational study. EClinicalMedicine. (2025) 80:103023. doi: 10.1016/j.eclinm.2024.103023
20. Huang YQ, Liang CH, He L, Tian J, Liang CS, Chen X, et al. Development and validation of a radiomics nomogram for preoperative prediction of lymph node metastasis in colorectal cancer. J Clin Oncol. (2016) 34:2157–64. doi: 10.1200/JCO.2015.65.9128
21. Kramer AA and Zimmerman JE. Assessing the calibration of mortality benchmarks in critical care: The Hosmer-Lemeshow test revisited. Crit Care Med. (2007) 35:2052–6. doi: 10.1097/01.CCM.0000275267.64078.B0
22. Vickers AJ, Cronin AM, Elkin EB, and Gonen M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med Inform Decis Mak. (2008) 8:53. doi: 10.1186/1472-6947-8-53
23. Zhang SX, Shen Y, Liu H, Zhu D, Fang JS, Pan HF, et al. Inflammatory microenvironment in gastric premalignant lesions: implication and application. Front Immunol. (2023) 14:1297101. doi: 10.3389/fimmu.2023.1297101
24. Wang ST, Yang HW, Zhang WL, Li Z, and Ji R. Disruption of the gastric epithelial barrier in Correa’s cascade: Clinical evidence via confocal endomicroscopy. Helicobacter. (2024) 29:e13065. doi: 10.1111/hel.13065
25. Zhang MF, Zhong JL, Song ZY, Xu Q, Chen YC, and Zhang ZM. Regulatory mechanisms and potential therapeutic targets in precancerous lesions of gastric cancer: A comprehensive review. BioMed Pharmacother. (2024) 177:117068. doi: 10.1016/j.biopha.2024.117068
26. Rugge M, Genta RM, Malfertheiner P, Dinis-Ribeiro M, El-Serag H, Graham DY, et al. RE.GA.IN.: the real-world gastritis initiative-updating the updates. Gut. (2024) 73:407–41. doi: 10.1136/gutjnl-2023-331164
27. Sepulveda AR. Molecular testing of Helicobacter pylori-associated chronic gastritis and premalignant gastric lesions: clinical implications. J Clin Gastroenterol. (2001) 32:377–82. doi: 10.1097/00004836-200105000-00004
28. Rugge M, Genta RM, Di Mario F, El-Omar EM, El-Serag HB, Fassan M, et al. Gastric cancer as preventable disease. Clin Gastroenterol Hepatol. (2017) 15:1833–43. doi: 10.1016/j.cgh.2017.05.023
29. Rudnicka K, Backert S, and Chmiela M. Genetic Polymorphisms in inflammatory and other regulators in gastric cancer: risks and clinical consequences. Curr Top Microbiol Immunol. (2019) 421:53–76. doi: 10.1007/978-3-030-15138-6_3
30. Zabaleta J. Multifactorial etiology of gastric cancer. Methods Mol Biol. (2012) 863:411–35. doi: 10.1007/978-1-61779-612-8_26
31. Katoh M. Dysregulation of stem cell signaling network due to germline mutation, SNP, Helicobacter pylori infection, epigenetic change and genetic alteration in gastric cancer. Cancer Biol Ther. (2007) 6:832–9. doi: 10.4161/cbt.6.6.4196
32. Esposito G, Dottori L, Pivetta G, Ligato I, Dilaghi E, and Lahner E. The hematological presentation of a multifaceted disorder caused by cobalamin deficiency. Nutrients. (2022) 14:1672. doi: 10.3390/nu14081672
33. Shah SC, Piazuelo MB, Kuipers EJ, and Li D. AGA clinical practice update on the diagnosis and management of atrophic gastritis: Expert review. Gastroenterology. (2021) 161:1325–1332.e7. doi: 10.1053/j.gastro.2021.06.078
34. Han L, Shu X, and Wang J. Helicobacter pylori-mediated oxidative stress and gastric diseases: A review. Front Microbiol. (2022) 13:811258. doi: 10.3389/fmicb.2022.811258
35. Zavros Y, Rieder G, Ferguson A, Samuelson LC, and Merchant JL. Genetic or chemical hypochlorhydria is associated with inflammation that modulates parietal and G-cell populations in mice. Gastroenterology. (2002) 122:119–33. doi: 10.1053/gast.2002.30298
36. Kinoshita H, Hayakawa Y, and Koike K. Metaplasia in the stomach-precursor of gastric cancer? Int J Mol Sci. (2017) 18:2063. doi: 10.3390/ijms18102063
37. Zheng SY, Zhu L, Wu LY, Liu HR, Ma XP, Li Q, et al. Helicobacter pylori-positive chronic atrophic gastritis and cellular senescence. Helicobacter. (2023) 28:e12944. doi: 10.1111/hel.12944
38. Inoue I, Yoshimura N, Iidaka T, Horii C, Muraki S, Oka H, et al. Helicobacter pylori-related chronic gastritis as a risk factor for lower bone mineral density. Calcif Tissue Int. (2025) 116:16. doi: 10.1007/s00223-024-01310-4
39. Kageyama T. Pepsinogens, progastricsins, and prochymosins: structure, function, evolution, and development. Cell Mol Life Sci. (2002) 59:288–306. doi: 10.1007/s00018-002-8423-9
40. Leja M, Kupcinskas L, Funka K, Sudraba A, Jonaitis L, Ivanauskas A, et al. The validity of a biomarker method for indirect detection of gastric mucosal atrophy versus standard histopathology. Dig Dis Sci. (2009) 54:2377–84. doi: 10.1007/s10620-009-0947-5
41. Roubaud-Baudron C, Krolak-Salmon P, Quadrio I, Megraud F, and Salles N. Impact of chronic Helicobacter pylori infection on Alzheimer’s disease: preliminary results. Neurobiol Aging. (2012) 33:1009.e11–9. doi: 10.1016/j.neurobiolaging.2011.10.021
42. Leja M, Camargo MC, Polaka I, Isajevs S, Liepniece-Karele I, Janciauskas D, et al. Detection of gastric atrophy by circulating pepsinogens: A comparison of three assays. Helicobacter. (2017) 22:1–12. doi: 10.1111/hel.12393
43. Bajor A, Gillberg PG, and Abrahamsson H. Bile acids: short and long term effects in the intestine. Scand J Gastroenterol. (2010) 45:645–64. doi: 10.3109/00365521003702734
44. Booth DM, Murphy JA, Mukherjee R, Awais M, Neoptolemos JP, Gerasimenko OV, et al. Reactive oxygen species induced by bile acid induce apoptosis and protect against necrosis in pancreatic acinar cells. Gastroenterology. (2011) 140:2116–25. doi: 10.1053/j.gastro.2011.02.054
45. Scarpignato C, Gatta L, Zullo A, and Blandizzi C. Effective and safe proton pump inhibitor therapy in acid-related diseases - A position paper addressing benefits and potential harms of acid suppression. BMC Med. (2016) 14:179. doi: 10.1186/s12916-016-0718-z
46. He QJ, Liu LM, Wei JG, Jiang JY, Rong Z, Chen X, et al. Roles and action mechanisms of bile acid-induced gastric intestinal metaplasia: a review. Cell Death Discovery. (2022) 8:158. doi: 10.1038/s41420-022-00962-1
47. Haesebrouck F, Pasmans F, Flahou B, Chiers K, Baele M, Meyns T, et al. Gastric helicobacters in domestic animals and nonhuman primates and their significance for human health. Clin Microbiol Rev. (2009) 22:202–23. doi: 10.1128/CMR.00041-08
48. Ohno A, Miyoshi J, Kato A, Miyamoto N, Yatagai T, Hada Y, et al. Endoscopic severe mucosal atrophy indicates the presence of gastric cancer after Helicobacter pylori eradication-analysis based on the Kyoto classification. BMC Gastroenterol. (2020) 20:232. doi: 10.1186/s12876-020-01375-z
49. Takiguchi S, Adachi S, Yamamoto K, Morii E, Miyata H, Nakajima K, et al. Mapping analysis of ghrelin producing cells in the human stomach associated with chronic gastritis and early cancers. Dig Dis Sci. (2012) 57:1238–46. doi: 10.1007/s10620-011-1986-2
50. Sugimoto M, Murata M, Murakami K, Yamaoka Y, and Kawai T. Characteristic endoscopic findings in Helicobacter pylori diagnosis in clinical practice. Expert Rev Gastroenterol Hepatol. (2024) 18:457–72. doi: 10.1080/17474124.2024.2395317
51. Kishikawa H, Ojiro K, Nakamura K, Katayama T, Arahata K, Takarabe S, et al. Previous Helicobacter pylori infection-induced atrophic gastritis: A distinct disease entity in an understudied population without a history of eradication. Helicobacter. (2020) 25:e12669. doi: 10.1111/hel.12669
52. Liu YX, Huang TT, Wang L, Wang Y, Liu Y, Bai JY, et al. Traditional Chinese Medicine in the treatment of chronic atrophic gastritis, precancerous lesions and gastric cancer. J Ethnopharmacol. (2025) 337:118812. doi: 10.1016/j.jep.2024.118812
53. Astudillo P. Wnt5a signaling in gastric cancer. Front Cell Dev Biol. (2020) 8:110. doi: 10.3389/fcell.2020.00110
54. Appelman HD, Umar A, Orlando RC, Sontag SJ, Nandurkar S, El-Zimaity H, et al. Barrett’s esophagus: natural history. Ann N Y Acad Sci. (2011) 1232:292–308. doi: 10.1111/j.1749-6632.2011.06057.x
55. Muzaheed. Helicobacter pylori oncogenicity: Mechanism, prevention, and risk factors. Sci World J. (2020) 2020:3018326. doi: 10.1155/2020/3018326
56. Liu QS, Tang JY, Chen SL, Hu SY, Shen CF, Xiang JY, et al. Berberine for gastric cancer prevention and treatment: Multi-step actions on the Correa’s cascade underlie its therapeutic effects. Pharmacol Res. (2022) 184:106440. doi: 10.1016/j.phrs.2022.106440
57. Valova I, Harris C, Mai T, and Gueorguieva N. Optimization of convolutional neural networks for Imbalanced Set classification. Proc Comput Sci. (2020) 176:660–9. doi: 10.1016/j.procs.2020.09.038
Keywords: chronic atrophic gastritis, low-grade intraepithelial neoplasia, multi-center retrospective analysis, predictive model, nomogram
Citation: Ding W, Zhang C, Chen H, Gao M, Xu X, Pei B, Zhang Y, Song B and Li X (2025) Development and validation of a clinical model to predict low-grade intraepithelial neoplasia in chronic atrophic gastritis patients: a retrospective observational multicenter analysis. Front. Oncol. 15:1597099. doi: 10.3389/fonc.2025.1597099
Received: 20 March 2025; Accepted: 12 May 2025;
Published: 04 June 2025.
Edited by:
Babak Pakbin, Texas A&M University, United States
Reviewed by:
Alireza Yaghoobi, Shahid Beheshti University of Medical Sciences, Iran
Katayoun Pahlavanyali, Texas A&M University, United States
Copyright © 2025 Ding, Zhang, Chen, Gao, Xu, Pei, Zhang, Song and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Xuejun Li