You're viewing our updated article page. If you need more time to adjust, you can return to the old layout.

ORIGINAL RESEARCH article

Front. Oncol., 17 December 2025

Sec. Gynecological Oncology

Volume 15 - 2025 | https://doi.org/10.3389/fonc.2025.1669396

Artificial intelligence-assisted noninvasive preoperative prediction of lymph nodes metastasis in cervical cancer through a clinical-imaging feature combined strategy

  • 1. Department of Gynecology and Obstetrics, Affiliated Dongyang Hospital of Wenzhou Medical University, Dongyang, Zhejiang,Ā China

  • 2. Department of Radiology, Affiliated Dongyang Hospital of Wenzhou Medical University, Dongyang, Zhejiang,Ā China

Article metrics

View details

663

Views

37

Downloads

Abstract

Background:

Lymph node metastasis (LNM) of patients with cervical cancer (CC) is correlated with noticeably reduced five-year survival rate. but the role of conventional detection is limited for preoperative diagnosis of LNM. Therefore, we intended to develop a predictive model for LNM by integrating medical images, clinical data along with artificial intelligence-assisted method.

Methods:

CC patients who underwent radical hysterectomy combined with pelvic lymphadenectomy between January 2013 and October 2024 were retrospectively enrolled in this study. For computed tomography (CT) and ultrasound (US) images, a pre-trained ResNet-18 model on large-scale samples was used to extract representative features, fine-tuned with random cropping data augmentation. For clinical indicators, after normalizing to the range [0,1], a multilayer perceptron block was applied to extract representative features. Then, contrastive learning and feature fusion methods were utilized to integrate similar messages. Finally, a multi-modal contrastive learning framework was developed by consolidating above two parts. The framework was estimated by accuracy, sensitivity, specificity and the area under the receiver operating characteristic curve (AUC).

Results:

This work consisted of 127 CT images of patients with pathologically diagnosed cervical malignancies. After integrating clinical-imaging feature and artificial intelligence-assisted algorithm, the finally developed LNM predicting model achieved a high accuracy of 92.31% with an AUC of 0.88. Additionally, the model also displayed strong sensitivity (80.0%) and specificity (95.45%) in CC cohorts.

Conclusion:

This study presented an efficient noninvasive and highly accurate diagnostic tool for LNM, which may significantly enhance surgical decision-making for lymph node dissection in CC patients with LNM.

Introduction

Cervical cancer (CC), one of the four major malignant tumors affecting women, poses a significant threat to women’s health and lives worldwide (1). In 2020, there were approximately 604,000 cases and 342,000 deaths worldwide (2). Nonetheless, by 2022, the number has risen to 661,000 new cases and 348,000 deaths (3). Actually, lymph node metastasis (LNM) also has been identified asĀ aĀ dangerous factor that can strongly reduce the five-year survivalĀ rateĀ of CC patients from 90% to 60.8% (4, 5). Therefore, the 2018 revision of the International Federation of Gynecology andĀ Obstetrics (FIGO) staging system emphasized the significanceĀ of LNM in staging accuracy and clinical decision-making (6).

As a result, accurately assessing LNM is crucial for CC treatment. For instance, determining the precise status of stage IIIC1r patients at the initial treatment stage can prevent them from undergoing both surgery and chemoradiotherapy, thereby reducing unnecessary financial burden of treatment (7). Furthermore, for CC patients seeking fertility preservation, the 2023 National Comprehensive Cancer Network (NCCN) guidelines state that when patients with IA2-IB1 CC meet the ConCerv criteria could consider conservative surgery (i.e., cervical conization with pelvic lymphadenectomy or sentinel lymph node mapping) (8).

Traditionally, pelvic lymphadenectomy has been a crucial step in examining LNM. However, the LNM rate in early-stage CC is below 20%, meaning that most patients do not benefit from routine pelvic lymphadenectomy and may face increased surgical risks, instead (9, 10). While the newly proposed sentinel lymph node mapping still lacks mountainous clinical reports (11). To this end, imaging technology plays a pivotal role in preoperative assessment and has been incorporated into the 2018 FIGO revision as a key metric in the staging system (6). However, in clinical practice, directly using imaging technology for LNM prediction presents several challenges: (i) the same imaging modality may demonstrate varying advantages depending on the specific task, making it suboptimal to rely on a single modality for LNM prediction; (ii) imaging technologies exhibit limited sensitivity in detecting micro-metastases, and their interpretation is subjective, potentially leading to missed or misdiagnosed cases; and (iii) some imaging modalities, such as MRI, are expensive and may not be widely accessible. For these reasons, and because the majority of patients in our retrospective cohort did not undergo MRI examinations, we focused on developing a model using the more universally available CT, ultrasound, and clinical data. This approach ensures broader applicability and avoids the significant sample size reduction that would result from requiring MRI for all patients.

As the progression of artificial intelligence (AI) technology in the field of computational pathology, AI-assisted multi-modal learning method is developed to help extract more extensive, substantial, and distinctive features through integrating images, tissue sections as well as texts (12). For example, Wei etĀ al. (13) proposed a novel multi-modal learning framework for genotype prediction in Glioma. Magazine ā€œCell Reports Medicineā€ also newly published a paper concerning the method, which could accurately forecast stomach cancer response to neoadjuvant chemotherapy with noticeably reduced distances to tumor invasion margin and enhanced inflammatory infiltration in responders (14). Whereas, multi-modal learning method is rarely employed to LNM related research on CC patients.

Based on aforenoted backgrounds, we aim to purport a multi-modal contrastive learning framework for accurate LNM prediction in CC patients through integrating CT images, US images and laboratory indicators. This approach would present a novel direction for accurate LNM prediction in CC.

Materials and methods

This was a single-center retrospective cohort design, enrolling consecutive CC patients who underwent radical hysterectomy combined with pelvic lymphadenectomy at Dongyang People’s Hospital, affiliated with Wenzhou Medical University (a tertiary teaching hospital in China), between January 1, 2013, and October 31, 2024. The study protocol was approved by the Institutional Ethics Review Committee (Approval No.: 2024-YX-362), with a waiver of informed consent due to its retrospective nature. All data processing strictly adheres to the principles of the Declaration of Helsinki, and original medical records are securely stored in the hospital’s electronic medical record archiving system.

Patients enrollment

Patients were included based on the following criteria:

  • Histopathological confirmation of CC, with diagnosis based on the 2018 FIGO staging system;

  • Newly diagnosed cases, i.e., patients who have not received neoadjuvant chemotherapy or radiotherapy;

  • Complete surgical records and pathological reports;

  • Preoperative imaging assessment, including CT and US.

The exclusion criteria including:

  • Concurrent diagnosis of other malignant tumors;

  • Abnormal liver or kidney function tests (alanine aminotransferase (ALT)/aspartate aminotransferase (AST) > 2 times the upper normal limit, creatinine clearance < 60 mL/min);

  • Presence of hematologic or immune system disorders, or acute/chronic infections;

  • Missing key data.

After rigorous screening, a total of 127 patients were included in the final analysis (FigureĀ 1). Among them, 16 cases with LNM were set as positive group, while 111 cases were classified as the negative group.

FigureĀ 1

Flowchart depicting criteria for cervical cancer patient selection from 2013 to 2024 for a study on radical hysterectomy with pelvic lymphadenectomy. Inclusion criteria include histopathological confirmation, new diagnosis without prior treatment, complete records, and preoperative assessment. Exclusion criteria include other malignancies, abnormal liver or kidney tests, hematologic or immune disorders, and missing data. The final enrolled population consists of 127 patients, with 16 having lymph node metastasis and 111 without.

Flowchart of patient selecting criteria and enrolling number. Here, ALT means alanine aminotransferase, AST means aspartate aminotransferase, FIGO means the International Federation of Gynecology and Obstetrics, CT means computed tomography, LNM means lymph node metastasis.

Imaging acquisition

CT Scanning Protocol. CT Scanning was carried out in accordance to paper published before (15). Briefly, Patients were positioned in the supine position, with the scan range covering the region from the pubic symphysis to the iliac crest. Then, a 64-slice spiral CT scanner was used for both non-contrast and contrast-enhanced scans of the entire abdomen and pelvis with a slice thickness of 5 mm. An iopromide contrast agent (300 mg/mL, 1.5 mL/kg) was administered intravenously. The CT scanners used in this study included the Canon Aquilion One, Philips Brilliance, Siemens Force, and Siemens Somatom series, all of which offered high-resolution imaging and rapid scanning capabilities. All CT images were obtained from the institutional picture archiving and communication system and saved in DICOM format.

US Examination Protocol. Patients were placed in the lithotomyĀ position, and transvaginal ultrasound was performed with a probe frequency of 4–8 MHz. The probe was carefully inserted into the vaginal fornix and gradually rotated to obtain multi-plane imaging. Key observations included cervical morphology, size, myometrial echogenicity, and endometrial characteristics. Additionally, a detailed evaluation of any detected masses was performed, focusing on external morphology, boundary definition, and internal echogenicity to comprehensively capture imaging features of cervical lesions. The US equipment used in this study included the PHILIPS EPIQ7W and Voluson E8, both of which provide advanced imaging capabilities for precise CC assessment.

Laboratory data

Squamous Cell Carcinoma Antigen (SCC-Ag) Detection. SCC-Ag levels were measured using the MAGLUMI2000 chemiluminescence immunoassay analyzer, employing a sandwich immunoluminescence assay (ARCHITECT i2000SR, Abbott). The critical threshold for SCC-Ag was 1.8 ng/mL.

CA125 and CA19–9 Detection. CA125 and CA19–9 as common tumor markers (16, 17) were determined with the Cobas electrochemiluminescence immunoassay analyzer, applying a double-antibody sandwich electrochemiluminescence assay. The reference ranges were:

  • CA125: < 35 U/mL;

  • CA19-9: < 39 U/mL.

Complete Blood Count (CBC) Testing. CBC count was analyzed applying the XE-2100 automated hematology analyzer, including:

  • Absolute neutrophil count: 1.8āˆ’6.3 Ɨ 109/L;

  • Absolute lymphocyte count: 1.1āˆ’3.2 Ɨ 109/L;

  • Platelet (PLT) count: 125āˆ’350 Ɨ 109/L.

Serum Albumin (Alb) Level. Serum Alb levels were tested by the Hitachi 7600–120 automated biochemical analyzer, based on the bromocresol green colorimetric method. The reference range was 40–55 g/L.

Inflammation/Nutritional Index Calculations. Based on the laboratory test results, the following indices were calculated:

  • Neutrophil-to-Lymphocyte Ratio (NLR);

  • Platelet-to-Lymphocyte Ratio (PLR);

  • Systemic Immune-Inflammation Index (SII).

• Prognostic Nutritional Index (PNI).

Postoperative pathological diagnoses were independently reviewed and confirmed by two senior pathologists to ensure accuracy and reliability. The pathological reports included key parameters such as tumor size, histological type, differentiation grade, depth of invasion, LNM status (including the number and location of metastatic nodes), and surgical margin status.

Model architectures

The overall analytical framework was illustrated in FigureĀ 2.

FigureĀ 2

Diagram of a machine learning model pipeline involving ultrasound images, CT images, and clinical data. Ultrasound and CT images are processed by ResNet-18 image encoders, while clinical data, including NLR, PLR, and CA125, are processed by an MLP data encoder. Outputs are fused in a feature fusion module, leading to contrastive learning. The process is optimized by a combined loss function, \(L_{CE} + L_{contrastive}\).

The overall analytical framework of multi-modal contrastive learning procedure.

Multi-modal Data Encoding. Here, we employed a pre-trained ResNet-18 model (18) to extract features from CT and US images. This model was pretrained on large-scale publicly available datasets that include CT images of various target organs (19). As such, the backbone incorporates extensive medical knowledge and provides an effective starting point for fine-tuning on our target tasks, especially when training data for the limited downstream tasks. We extracted the hidden feature from ResNet-18 immediately before its final linear layer as the general CT and US image representation. Denoted by Hct ∈ D and Hus ∈ D where D = 512. For clinical data, a multilayer perceptron (MLP) block comprising two linear layers was utilized. The first layer outputted a 256-dimensional feature vector, while the second layer projects the data outputted a 512-dimensional space (Hclin ∈ D).

Contrastive Learning. The contrastive learning module was designed to align CT, US image features and clinical data features in a shared latent space. Given the extracted feature vectors from above two modalities, we optimized the model using an additional contrastive loss function (20) that encouraged positive pairs (i.e., the same patient data) to be closer in feature space while pushing negative pairs (i.e., different patients) apart. Specifically, for each patient i, we extracted CT and US image features as shown in Equation 1.

and were the raw CT and US image data from patient i, respectively. While, and were feature vector extracted from and with and function from ResNet-18 model. In the same time, we also extracted clinical feature vectors were extracted using the multilayer perceptron block as described in Equation 2.

To improve contrastive representation learning, we applied additional projection layers as defined in Equation 3.

where gCT, gus and gclin were MLP-based projection heads that mapped the features to a lower-dimensional contrastive space, i.e., 128 in our experiment. To maximize the similarity among a patient’s CT features, US features and clinical features while minimizing similarity between different patients’ feature pairs, we used the cosine similarity function defined in Equation 4.

And the model was also optimized with the InfoNCE loss function to further maximize the similarity between positive pairs (21) as shown in Equation 5.

The overall contrastive loss is computed as described in Equation 6.

where Ļ„ was a temperature scaling hyperparameter that controlled the sharpness of similarity scores, we use Ļ„ = 1 as the default in this experiment.

Optimization Objective. Apart from contrastive learning, the learned CT, US and clinical features were also fused together by concatenation as shown in Equation 7.

This fused representation was then passed to a binary classifier with loss function for the final prediction. The overall training objective is shown in Equation 8.

Here means cross entropy loss, means contrastive loss. equals 0 or 1, indicating the positive or negative property of patient’s LNM.

Implementation details

To ensure a rigorous evaluation and mitigate overfitting, the entire dataset was randomly divided into three independent sets: training (80%), validation (10%), and testing (10%). The model parameters were optimized using the training set, and the model checkpoint that achieved the highest accuracy on the validation set was selected. All performance metrics reported in this study (e.g., in TableĀ 1) are based solely on the held-out test set, which was not used at any stage of training or model selection.

TableĀ 1

Characteristics Total n percentile (n=127) P value Test statistic
LN+ (n=16) LN- (n=111)
Age (years) 56.0 (49.5,63.8) 54.8 (48.0,64.0) 0.676 zĀ =Ā -0.418
SCC-Ag level 0.016 x² = 5.853
ā€ƒNormal (<1.8ng/ml) 5 (31.3%) 70 (63.1%)
ā€ƒElevated (≄1.8ng/ml) 11 (68.8%) 41 (36.9%)
CA125 level 0.126 x² = 2.337
ā€ƒNormal (<35ng/ml) 13 (81.3%) 103 (92.8%)
ā€ƒElevated (≄35ng/ml) 3 (18.8%) 4 (3.6%)
CA19–9 level 0.013 x² =Ā 6.160
ā€ƒNormal (<39ng/ml) 13 (81.3%) 107 (96.4%)
ā€ƒElevated (≄39ng/ml) 3 (18.8%) 8 (7.2%)
PLT (*10^9/l) 244.3 ± 48.3 245.0 ± 65.5 0.968 t = 0.040
Alb (g/l) 42.2 (40.1,46.9) 42.4 (40.3,44.9) 0.83 zĀ =Ā -0.214
NLR 3.2 (1.9,4.1) 2.9 (1.7,3.2) 0.383 zĀ =Ā -0.872
PLR 158.2 (129.6,173.8) 163.6 (113.6,184.1) 0.551 zĀ =Ā -0.596
SII 764.7 (430.4,1005.4) 693.3 (375.5,829.4) 0.372 zĀ =Ā -0.894
PNI 50.4 ± 5.1 51.0 ± 5.4 0.693 t = 0.396
Clinical stage <0.001 x² = 92.045
FIGO I-II 3 (18.8%) 110 (99.1%)
FIGO III-IV 13 (81.3%) 1 (0.9%)
Differentiation Grade 0.909 x² = 0.013
ā€ƒWell/Moderately differentiated 10 (62.5%) 71 (64.0%)
ā€ƒPoorly/undifferentiated 6 (37.5) 40 (64.0%)
Pathological lymphovascular invasion status 0.061 x² = 3.520
ā€ƒPositive 8 (50.0%) 81 (73.0%)
ā€ƒNegative 8 (50.0%) 30 (27.0%)
Pathological perineural invasion status 0.559 x² = 0.341
ā€ƒPositive 14 (87.5%) 102 (91.9%)
ā€ƒNegative 2 (12.5%) 9 (8.1%)

Clinicopathological characteristics of patients.

To train our framework, The Adam optimizer with an initial learning rate of 1e-3, following a cosine annealing schedule for learning rate decay was carried out. The model was trained for 30 epochs, and all experiments were implemented using the PyTorch framework on NVIDIA A100 GPUs. To enhance data quality and improve generalization, Normalization and augmentation techniques were utilized. Firstly, both CT and US images underwent random cropping1 followed by resizing to a resolution of 224. Secondly, clinical data were converted into the range [0,1] to mitigate numerical instability and reduce biases induced by raw values. Lastly, we adopt mini-batch training with a batch size of 8, where the sampling probability was adjusted to ensure that underrepresented classes had a higher chance of being selected. This strategy was proved particularly effective for classification tasks with class imbalances.

Validation and evaluation of models

To objectively evaluate the effectiveness of the approach in detecting LNM in CC cohorts, multiple performance indicators were adopted, containing accuracy , sensitivity , specificity , and the area under the receiver operating characteristic curve (AUC). Here, TP, TN, FP, and FN were represented as true positives, true negatives, false positives, and false negatives, respectively.

Statistical analysis

Clinical data were analyzed using SPSS version 26.0 (Statistical Package for the Social Sciences). The Kolmogorov–Smirnov (K-S) test was used to assess the normality of continuous variables. Normally distributed data were expressed as mean ± standard deviation (x ± s), and compared using the independent samples t-test. Non-normally distributed data were presented as median and interquartile range (P25, P75), with comparisons performed using the Mann–Whitney U test. Categorical variables were reported as frequency and percentage, and analyzed with the chi-square test or Fisher’s exact test, as appropriate.

Results

Patient characteristics

In this study, 127 eligible patients were divided into LNM-positive group (LNM+, n = 16) and LNM-negative group (LNM-, n = 111). The patient baseline characteristics and conventional diagnostic features were summarized in TableĀ 1. No obvious differences were seen between LNM+ and LNM- patients in terms of age, CA125 level, PLT level, Alb level, NLR, PLR, SII, differentiation grade, pathological lymphovascular invasion status and pathological perineural invasion status between two groups (p > 0.05), indicating no easily apparent diagnosis for LNM. Only the SCC-Ag, CA19–9 and FIGO based clinical stage showed marked differences between the groups (p< 0.05, respectively).

Predictive performance

The overall performance of this study was summarized in TableĀ 2. The ablation study highlighted the impact of incorporating additional modalities and contrastive learning on prediction accuracy. Specifically, using a single modality, the highest accuracy achieved was 80.77%, with clinical data proving to be the most effective. This was likely because clinical data tended to be more precise, whereas CT and US images contained redundant information, requiring the model to filter out noise.

TableĀ 2

CT US Clinical Contrastive ACC (%) SE (%) SP (%) ROC (%)
āœ“ āœ— āœ— āœ— 76.92 60.00 80.95 71.42
āœ— āœ“ āœ— āœ— 73.08 60.00 76.19 79.04
āœ— āœ— āœ“ āœ— 80.77 60.00 85.71 84.96
āœ“ āœ“ āœ“ āœ— 84.62 60.00 90.48 86.67
āœ“ āœ“ āœ“ āœ“ 92.31 80.00 95.45 88.17

Results of lymph node classification.

A checkmark indicated the use of this type of data, while a cross implied that this type of data is not applied. CT, using CT data; US, with ultrasound US data; Clinical, with applying Clinical data; Contrastive, employing contrastive learning; ACC, accuracy; SE, sensitivity; SP, specificity; ROC, receiver operating characteristic.

When integrating all above three modalities, the accuracy was improved to 84.62%. Noticeably, by leveraging multi-modal data and contrastive learning method, our model achieved an impressive 92.31% accuracy in LNM prediction, demonstrating its strong efficacy. The corresponding AUC curves in FigureĀ 3 further validated the effectiveness of our approach, with a high AUC of 0.88.

FigureĀ 3

Averaged Receiver Operating Characteristic (ROC) curve displaying true positive rate versus false positive rate. The orange line represents the mean ROC with an area under the curve (AUC) of 0.88, including a shaded region indicating variability. A gray dashed line represents the diagonal of no-discrimination.

Receiver operating characteristic curve of the developed framework.

Notably, the ablation study (upper part of TableĀ 1) revealed that clinical data alone provided the highest predictive performance among single-modality inputs. This observation aligns with clinical intuition, given that laboratory markers such as CA125, NLR, and PLR are established correlates of lymph node metastasis, thereby offering interpretability into the model’s decision process.

Discussion

This study novelly established a predicting model based on multi-modal contrastive learning to assess the occurrence of LNM in CC patients. The model addressed the limitations of traditional methods and offered an efficient, non-invasive tool for preoperative evaluation, thereby supporting clinical decision-making and improving patient outcomes.

Current LNM evaluations rely primarily on imaging modalities. However, these methods face two major challenges: (1) image interpretation depends heavily on physician expertise, inducing subjectivity; and (2) limited sensitivity in detecting micrometastases (e.g., <5 mm) (22). To address these challenges, the integration of artificial intelligence and medical imaging offers a promising solution. While traditional machine learning methods show clinical potential, their reliance on manual feature engineering (e.g., regions of interest annotation) limited its practical utilization (23). Recently, advancements in unsupervised learning, deep feature extraction and multi-modal feature fusion have demonstrated promising improvements of enhancing model capability (24–26).

Motivated by recent advances, this study integrated CT and US imaging with a range of clinical data—including tumor markers (CA125, SCC, CA19-9) and emerged prognostic indices such as SII and PNI based on a systemic review (27). CT and US were chosen as the primary imaging modalities due to their cost-effectiveness and diagnostic performance, which is comparable to MRI for evaluating LNM and parametrial invasion (24). Through multi-modal integration, the model captures complementary information across data sources, boosting prediction accuracy from 76.92%-80.77% in single modality to 84.62% in three combined modalities, accompanied with increased ROC from 0.71-0.84 to 0.86. Furthermore, given the highly imbalanced nature of our dataset, there was a risk that the model could be biased toward predicting the majority class. However, by employing a weighted sampling strategy during training, our model achieved high sensitivity and specificity, indicating that the model effectively distinguished true labels rather than defaulting to majority-class predictions. Machine learning models generally require large-scale, high-quality data for accurate prediction. So, one of the primary challenges for accurately prognosticate LNM in CC is the limited available training data. To address this limitation, a pre-trained model (19) on a large-scale public medical dataset was adopted to acquire general medical representations. Additionally, to make effective predictions, we also proposed several strategies to fully leverage the available data, including the integration of multi-modal data and the application of contrastive learning.

Moreover, this study employed a contrastive learning framework that mapped heterogeneous modalities into a shared latent space. By maximizing similarity between positive pairs and minimizing it between negative pairs using the InfoNCE loss function, the model extracted discriminative cross-modal features. This approach significantly enhances performance, raising accuracy to 92.31% and AUC to 0.88. This result was better than previously published LNM prediction model in pancreatic cancer (74.4%, 0.79)Ā (28) as well as newly published similar paper in CC (87%, 0.81) (29). To enable a direct comparison on our dataset, we reproduced a closely related baseline model for cervical cancer LNM predictionĀ (29). Our proposed framework attained a higherĀ accuracy of 92.31% compared to 87.25% achieved by this baseline,Ā highlighting the effectiveness of our multimodal contrastive learning approach in integrating complementary information. This improvement highlights the effectiveness of our method in integrating complementary information from CT, ultrasound, and clinical data to enhance lymph node metastasis prediction performance.

The proposed multi-modal contrastive learning model holds substantial clinical value. First, accurate preoperative LNM prediction can help avoid unnecessary surgical interventions, reducing treatment burden. Beyond the acute complications, the choice of treatment modality has profound implications for long-term quality of life. As highlighted by Di Donna etĀ al., multimodal treatments for cervical cancer, including radical surgery and chemoradiotherapy, are frequently associated with long-term urinary, gastrointestinal, and sexual dysfunctions (30). Our non-invasive predictive model aids in refining clinical decision-making. Second, including US as a data source provides a more accessible and cost-effective solution, particularly beneficial in low-resource settings where US is more available than other imaging equipment. Moreover, the model’s high sensitivity and specificity enhance its reliability in handling imbalanced datasets, which is common in real-world applications.

Despite the promising results, this study has several limitations. It was a single-center retrospective study with a limited sample size, which may affect the generalizability of the proposed method. To partially address this, we employed a pretrained ResNet-18 model trained on large-scale medical datasets to enhance generalization capability. Furthermore, to evaluate robustness under single-center data constraints, we conducted supplementary experiments assessing cross-age generalization and resilience to Gaussian blur noise that simulated multi-center imaging variability. The model maintained strong performance (e.g., accuracy >85%, AUC >81%) across these challenging conditions, underscoring its robustness.

Nonetheless, this preliminary effort yielded encouraging outcomes. Future work will include multicenter validation with larger cohorts. Additionally, although contrastive learning contributed to performance improvement, external validation is essential to further assess its clinical utility. Finally, this study focused exclusively on CT, US, and clinical data. Subsequent research will aim to incorporate additional imaging modalities such as MRI and PET to further enhance performance.

Conclusion

In conclusion, this study built an efficient LNM prediction strategy in CC patients using a multi-modal contrastive learning approach along with a pre-trained model on large-scale samples. The model demonstrated excellent predictive performance and hold significant potential for clinical application. It is expected to provide more accurate diagnostic and therapeutic guidance for CC patients, optimize treatment decisions, and improve patient outcomes.

Statements

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by Dongyang Hospital of Wenzhou Medical University, Dongyang, China. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

Author contributions

JZ: Writing – original draft. CF: Data curation, Writing – original draft, Writing – review & editing. JD: Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the Zhejiang Medicine and Health Science and Technology Project (grant number 2023ZL751).

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authorsĀ and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1

    Xu M Cao C Wu P Huang X Ma D . Advances in cervical cancer: current insights and future directions. Cancer Commun (Lond). (2025) 45:77–109. doi:Ā 10.1002/cac2.12629

  • 2

    Sung H Ferlay J Siegel RL Laversanne M Soerjomataram I Jemal A et al . Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. (2021) 71:209–49. doi:Ā 10.3322/caac.21660

  • 3

    Bray F Laversanne M Sung H Ferlay J Siegel RL Soerjomataram I et al . Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. (2024) 74:229–63. doi:Ā 10.3322/caac.21834

  • 4

    Wright JD Matsuo K Huang Y Tergas AI Hou JY Khoury-Collado F et al . Prognostic performance of the 2018 international federation of gynecology and obstetrics cervical cancer staging guidelines. Obstet Gynecol. (2019) 134:49–57. doi:Ā 10.1097/AOG.0000000000003311

  • 5

    Guimarães YM Godoy LR Longatto-Filho A Reis R . Management of early-stage cervical cancer: A literature review. Cancers (Basel). (2022) 14:575. doi: 10.3390/cancers14030575

  • 6

    Bhatla N Aoki D Sharma DN et al . Cancer of the cervix uteri. Int J Gynaecol Obstet. (2018) 143 Suppl 2:22–36. doi:Ā 10.1002/ijgo.12611

  • 7

    Zhong ML Liu YC Yang JT Wang YN Ao MH Xiao Y et al . Treatment for locally resectable stage IIIC1r cervical cancer: surgery or chemoradiotherapy? BMC Cancer. (2024) 24:217. doi:Ā 10.1186/s12885-024-11944-0

  • 8

    Schmeler KM Pareja R Lopez Blanco A Fregnani JH Lopes A Perrotta M et al . ConCerv: a prospective trial of conservative surgery for low-risk early-stage cervical cancer. Int J Gynecol Cancer. (2021) 31:1317–25. doi:Ā 10.1136/ijgc-2021-002921

  • 9

    Diaz JP Sonoda Y Leitao MM Zivanovic O Brown CL Chi DS et al . Oncologic outcome of fertility-sparing radical trachelectomy versus radical hysterectomy for stage IB1 cervical carcinoma. Gynecol Oncol. (2008) 111:255–60. doi:Ā 10.1016/j.ygyno.2008.07.014

  • 10

    Frumovitz M dos Reis R Sun CC Milam MR Bevers MW Brown J et al . Comparison of total laparoscopic and abdominal radical hysterectomy for patients with early-stage cervical cancer. Obstet Gynecol. (2007) 110:96–102. doi:Ā 10.1097/01.AOG.0000268798.75353.04

  • 11

    Kampan NC Teik CK Shafiee MN . Where are we going with sentinel nodes mapping in ovarian cancer? Front Oncol. (2022) 12:999749. doi:Ā 10.3389/fonc.2022.999749

  • 12

    Ullah U Lee JS An CH Lee H Park SY Baek RH et al . A review of multi-modal learning from the text-guided visual processing viewpoint. Sensors (Basel). (2022) 22:6816. doi:Ā 10.3390/s22186816

  • 13

    Wei Y Chen X Zhu L Zhang L Schƶnlieb CB Price S et al . Multi-modal learning for predicting the genotype of glioma. IEEE Trans Med Imaging. (2023) 42:3167–78. doi:Ā 10.1109/TMI.2023.3244038

  • 14

    Gao P Xiao Q Tan H Song J Fu Y Xu J et al . Interpretable multi-modal artificial intelligence model for predicting gastric cancer response to neoadjuvant chemotherapy. Cell Rep Med. (2024) 5:101848. doi:Ā 10.1016/j.xcrm.2024.101848

  • 15

    Qi Y Hu Y Lin C Song G Shi L Zhu H et al . A preoperative predictive model based on multi-modal features to predict pathological complete response after neoadjuvant chemoimmunotherapy in esophageal cancer patients. Front Immunol. (2025) 16:1530279. doi:Ā 10.3389/fimmu.2025.1530279

  • 16

    Xiao Y Sun S Zheng N Zhao J Li X Xu J et al . Development of PDAC diagnosis and prognosis evaluation models based on machine learning. BMC Cancer. (2025) 25:512. doi:Ā 10.1186/s12885-025-13929-z

  • 17

    Bojan A Pricop C Vladeanu MC Bararu-Bojan I Halitchi CO Giusca SE et al . The predictive roles of tumour markers, hemostasis assessment, and inflammation in the early detection and prognosis of gallbladder adenocarcinoma and metaplasia: A clinical study. Int J Mol Sci. (2025) 26:3665. doi:Ā 10.3390/ijms26083665

  • 18

    He K Zhang X Ren S Sun J . Deep Residual Learning for Image Recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Washington, DC: IEEE Computer Society. (2016) p. 770–778.

  • 19

    Chen S Ma K Zheng Y . Med3D: Transfer Learning for 3D Medical Image Analysis. (2019). pp. 1–12.

  • 20

    Guo Q Liao Y Li Z Liang S. Multi-modal representation via contrastive learning with attention bottleneck fusion and attentive statistics features. Entropy (Basel). (2023) 25:1421. doi:Ā 10.3390/e25101421

  • 21

    Chen T Kornblith S Norouzi M Hinton G . A Simple Framework for Contrastive Learning of Visual Representations. (2020).

  • 22

    Sun J Wu S Jin Z Ren S Cho WC Zhu C et al . Lymph node micrometastasis in non-small cell lung cancer. BioMed Pharmacother. (2022) 149:112817. doi:Ā 10.1016/j.biopha.2022.112817

  • 23

    Chen X Liu W Thai TC Castellano T Gunderson CC Moore K et al . Developing a new radiomics-based CT image marker to detect lymph node metastasis among cervical cancer patients. Comput Methods Programs Biomed. (2020) 197:105759. doi:Ā 10.1016/j.cmpb.2020.105759

  • 24

    Zhang K Sun K Zhang C Ren K Li C Shen L et al . Using deep learning to predict survival outcome in non-surgical cervical cancer patients based on pathological images. J Cancer Res Clin Oncol. (2023) 149:6075–83. doi:Ā 10.1007/s00432-022-04446-8

  • 25

    Lai J Chen Z Liu J Zhu C Huang H Yi Y et al . A radiogenomic multimodal and whole-transcriptome sequencing for preoperative prediction of axillary lymph node metastasis and drug therapeutic response in breast cancer: a retrospective, machine learning and international multicohort study. Int J Surg. (2024) 110:2162–77. doi:Ā 10.1097/JS9.0000000000001082

  • 26

    Ming Y Dong X Zhao J Chen Z Wang H Wu N et al . Deep learning-based multimodal image analysis for cervical cancer detection. Methods. (2022) 205:46–52. doi:Ā 10.1016/j.ymeth.2022.05.004

  • 27

    Han X Liu S Yang G Hosseinifard H Imani S Yang L et al . Prognostic value of systemic hemato-immunological indices in uterine cervical cancer: A systemic review, meta-analysis, and meta-regression of observational studies. Gynecol Oncol. (2021) 160:351–60. doi:Ā 10.1016/j.ygyno.2020.10.011

  • 28

    Chen X Wang W Jiang Y Qian X . A dual-transformation with contrastive learning framework for lymph node metastasis prediction in pancreatic cancer. Med Image Anal. (2023) 85:102753. doi:Ā 10.1016/j.media.2023.102753

  • 29

    Zhu Y Fu C Du J Jin Y Du S Zhao F et al . Prediction of cervical cancer lymph node metastasis via a multimodal transfer learning approach. Br J Hosp Med (Lond). (2024) 85:1–14. doi:Ā 10.12968/hmed.2024.0428

  • 30

    Büchner J Caruana I Künkele A Rives S Vettenranta K Bader P et al . Chimeric antigen receptor T-cell therapy in paediatric B-cell precursor acute lymphoblastic leukaemia: curative success and future directions. Cancers. (2023) 15:5734. doi: 10.3389/fped.2021.784024

Summary

Keywords

cervical cancer, lymph node metastasis, artificial intelligence, multi-modal contrastive learning, predictive model

Citation

Zhang J, Fu C and Du J (2025) Artificial intelligence-assisted noninvasive preoperative prediction of lymph nodes metastasis in cervical cancer through a clinical-imaging feature combined strategy. Front. Oncol. 15:1669396. doi: 10.3389/fonc.2025.1669396

Received

19 July 2025

Revised

28 October 2025

Accepted

03 December 2025

Published

17 December 2025

Volume

15 - 2025

Edited by

Jin-Ghoo Choi, Yeungnam University, Republic of Korea

Reviewed by

Basilio Pecorino, Kore University of Enna, Italy

Lalasa Mukku, Christ University, India

Updates

Copyright

*Correspondence: Junqiang Du,

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Outline

Figures

Cite article

Copy to clipboard


Export citation file


Share article

Article metrics