Development and validation of a diagnostic model for predicting cervical lymph node metastasis in laryngeal and hypopharyngeal carcinoma

Objectives The lymph node status is crucial for guiding the surgical approach for patients with laryngeal and hypopharyngeal carcinoma (LHC). Nonetheless, occult lymph node metastasis presents challenges to assessment and treatment planning. This study seeks to develop and validate a diagnostic model for evaluating cervical lymph node status in LHC patients. Materials and methods This study retrospectively analyzed a total of 285 LHC patients who were treated at the Department of Otolaryngology Head and Neck Surgery, Daping Hospital, Army Medical University, from January 2015 to December 2020. Univariate and multivariate logistic regression analyses were employed to construct the predictive model. Discrimination and calibration were used to assess the predictive performance of the model. Decision curve analysis (DCA) was performed to evaluate the clinical utility of the model, and validation was conducted using 10-fold cross-validation, Leave-One-Out Cross Validation, and bootstrap methods. Results This study identified significant predictors of lymph node metastasis in LHC. A diagnostic predictive model was developed and visualized using a nomogram. The model demonstrated excellent discrimination, with a C-index of 0.887 (95% CI: 0.835-0.933). DCA analysis indicated its practical applicability, and multiple validation methods confirmed its fitting and generalization ability. Conclusion This study successfully established and validated a diagnostic predictive model for cervical lymph node metastasis in LHC. The visualized nomogram provides a convenient tool for personalized prediction of cervical lymph node status in patients, particularly in the context of occult cervical lymph node metastasis, offering valuable guidance for clinical treatment decisions.


Introduction
Globally, head and neck squamous cell carcinoma is the seventh most frequent cancer.LHC have an annual incidence rate of 5.9 per 100,000 people in Europe, which translates to about 40,000 new cases per year (1).LHC are prone to cervical lymph node metastasis, particularly occult metastasis, significantly impacting patient prognosis, With 5-year relative survival rates of only 25% to 61% (1)(2)(3).Even in young patients, the 5-year overall survival rate can plummet to as low as 66%, with a recurrence rate of 62%, indicating a less than favorable prognosis (4).Studies (5,6) have indicated that the overall lymph node metastasis rate among patients with LHC is as high as 59%, even among patients without clinically palpable cervical lymphadenopathy (CN0), there exists a notable risk of occult metastasis, estimated to be approximately 36% to 56%.Variations in metastatic patterns exist among patients with different tumor sites and stages.The remarkably high rate of lymph node metastasis and occult metastasis in LHC pose substantial challenges to clinical management, imposing significant economic and social burdens on patients, and profoundly affecting their prognosis (7).
Currently, the clinical assessment of cervical lymph nodes primarily encompasses imaging studies, ultrasound, and biopsy.The primary challenge encountered lies in accurately evaluating metastatic lymph nodes prior to treatment, particularly occult metastases.However, imaging may introduce concerns such as potential allergic reactions to contrast media and have limitations in the diagnosis of occult metastasis, the accuracy of fluorodeoxyglucose positron emission tomography-computed tomography(FDG-PET/CT) assessment is only 74%, and it carries risks such as radiation exposure and allergic reactions (8, 9).Furthermore, while ultrasound contributes to some extent to the evaluation of cervical lymph node status, its accuracy is relatively modest, even when combined with invasive biopsies, with an overall accuracy of only 77.4% (10).Preoperative biopsy also has limitations including low utilization rates, invasiveness, and the risk of biopsy failure (9).Accurate preoperative assessment of cervical lymph node status is critical, as incorrect assessments could impact subsequent patient care and treatment outcomes.As inadequate evaluation can lead to misguided treatment strategies for patients.Untreated metastatic lymph nodes can lead to shorter survival times and negative prognoses.Conversely, overtreatment such as performing neck dissection on patients without metastases may pose risks such as chyle leakage, recurrent laryngeal nerve injury, and even tracheostomy.
The current predictive models for assessing cervical lymph nodes are characterized by a singular approach, focusing solely on morphological features such as volume and thickness.They lack detailed consideration of important influencing factors such as age, tumor location, and T stage.Additionally, their clinical application is cumbersome, requiring assistance from ultrasound specialists in some cases, which presents inconvenience for clinicians.These limitations collectively constrain the practicality and accuracy of these models in clinical practice.Consequently, there is an urgent need for a more accurate and non-invasive approach to assess cervical lymph node status in patients with LHC.Such an approach would enhance diagnostic precision, facilitate tailored treatment strategies, and alleviate the physical and financial burdens on patients.
This study relies on clinical data from previous LHC patients within our department, aiming to construct a non-invasive predictive model based on patient clinical characteristics.The goal is to assist in a comprehensive assessment of cervical lymph nodes' status, thereby providing valuable support for subsequent patient treatment decisions.

Data collection
As this is a retrospective study, sample size calculation was not conducted prior to data collection.We gathered all inpatient cases between January 2015 and December 2020 and then screened them based on inclusion and exclusion criteria.We aimed to include all eligible cases to enhance the reliability of statistical outcomes and analyses, as depicted in the flowchart.The variables chosen for analysis was based on clinical relevance and data availability, guided by clinical guidelines and input from oncology experts.This included commonly used clinical variables such as demographic characteristics (gender, age, smoking history, alcohol consumption), tumor features (tumor location, TNM staging), and pertinent hematological parameters like neutrophils and platelet counts, as advised by existing literature.
This study collected data from 322 patients at the Department of Otolaryngology Head and Neck Surgery, Daping Hospital, Army Medical University.Among them, 37 patients were excluded due to reasons such as treatment abandonment, readmission, or incomplete medical records.Therefore, the final analysis included 285 patients.The clinical data encompassed various patient characteristics and hematologic parameters.Hematologic parameters were assessed within one week before surgery.Based on these variables, we successfully established a clinical predictive model, which was subsequently evaluated and validated.Ultimately, we obtained a predictive model capable of non-invasively predicting cervical lymph node metastasis in patients with LHC.This study received approval from the Ethics Committee of the Army Medical Center of PLA [Approval Number: 2023 (170)] and adhered to the ethical principles outlined in the Declaration of Helsinki.

Statistical analysis
Data analysis was conducted using R version 4.2.2.Count variables were depicted as median (IQR), while categorical variables were represented as numerical (%).Employing the "glm" function, univariable logistic regression analysis was performed, with lymph node metastasis as the outcome variable and age, smoking index, neutrophil count, lymphocyte count, platelet count, NLR, PLR, hemoglobin, and albumin as predictor variables.Optimal cutoff values were identified using the maximum Youden index method (Youden index = sensitivity + specificity -1).Subsequently, variables were transformed into categorical variables based on these cutoff values.Single-factor logistic regression analysis was then repeated using the transformed categorical variables along with other categorical variables as predictors.Variables with p < 0.05 were selected using backward stepwise multiple logistic regression to construct the final model.The "rms" function was utilized to create a nomogram visualizing the final model, which incorporated tumor location, T stage, neutrophil count, and platelet count.
Discrimination, calibration, clinical utility, and clinical impact of the model were assessed separately using receiver operating characteristic (ROC) curves generated by the "proc" package, calibration curves produced by the "rms" package, decision curves analysis, and clinical impact curves generated by the "rmda" package, respectively.Finally, the model underwent validation using 10-fold cross-validation, leave-one-out cross-validation, and bootstrap methods, please refer to the flowchart in Figure 1 for details.

Results
In the end, we included 285 patients for statistical analysis.Among them, the majority were male, accounting for 275 cases (96.5%), while there were only 10 female patients (3.5%), consistent with previous research findings (11).The median age of these patients was 61 years, with 245 cases (86.0%) having no lymph   1.
In univariate logistic regression analysis, sex differences did not significantly impact lymph node metastasis (p=0.583,OR= 0.641, 95% CI: 0.131-3.137).Similarly, age, smoking history, smoking index, alcohol consumption history, lymphocyte count, hemoglobin, albumin, hypertension, diabetes, pathological type, and other factors did not show significant predictive value for lymph node metastasis in univariate analysis.However, we found that the following variables had predictive value for the presence of lymph node metastasis: tumor location (p<0.001,OR=23.111,95% CI: 7.651-69.808),indicating that the probability of lymph node metastasis in hypopharyngeal cancer was approximately 23.1 times that of laryngeal cancer; T stage (p<0.001,OR=14.491,95% CI: 6.300-33.331),indicating that the probability of lymph node metastasis in advanced-stage tumors (T3-4) was approximately 14.5 times that of early-stage tumors (T1-2); neutrophil count (p<0.001,OR=6.072, 95% CI: 2.681-13.75),indicating that the probability of lymph node metastasis was approximately 6.1 times higher in patients with high neutrophil counts (>5.0) compared to those with low neutrophil counts (≤5.0); platelet count (p=0.005,OR=3.651, 95% CI: 1.476-9.03),indicating that the probability of lymph node metastasis was approximately 3.6 times higher in patients with high platelet counts (>168) compared to those with low platelet counts (≤168); NLR (p=0.016,OR=2.358, 95% CI: 1.169-4.756),indicating that the probability of lymph node metastasis was approximately 2.4 times higher in patients with high NLR (>2.5) compared to those with low NLR (≤2.5);PLR (p=0.02,OR=2.352, 95% CI: 1.144-4.839),indicating that the probability of lymph node metastasis was approximately 2.4 times  higher in patients with high PLR (>100) compared to those with low PLR (≤100).The above univariate logistic regression analysis considered the impact of each variable on lymph node metastasis.To more accurately assess the influence of multiple variables on the outcome and their potential confounding effects, we employed a backward stepwise approach to introduce these variables into a multivariate logistic regression analysis to determine their contributions.In the final multivariate analysis, NLR and PLR were excluded, and the following variables were retained: tumor location (p<0.001,OR= 19.049, 95% CI: 4.946-73.37),T stage (p<0.001,OR=14.673,95% CI: 5.55-38.791),neutrophil count (p<0.025,OR= 3.321, 95% CI: 1.159-9.514),and platelet count (p<0.038,OR= 3.102, 95% CI: 1.064-9.046).Ultimately, we established a cervical lymph node metastasis prediction model for LHC based on tumor location, T stage, neutrophil count, and platelet count as predictive factors.The Hosmer-Lemeshow test indicated a good model fit (p=0.352,Chi-square=3.266,df=3).
To personalize the prediction of cervical lymph node status in patients, we created a nomogram based on the model's predictive factors, as shown in Figure 2. Using each patient's Total points, we can predict their risk of cervical lymph node metastasis.

Discrimination
We assessed the predictive model's ability to distinguish between the lymph node metastasis group and the non-metastasis group using ROC curves.The results demonstrated that the prediction model constructed with tumor location, T stage, neutrophil count, and platelet count had excellent discrimination ability, with an Area Under the Curve (AUC) of 0.887, 95% CI: 0.835-0.933.The AUC value at this point is equivalent to the concordance index, as shown in Figure 3.

Calibration
In Figure 4A

Clinical utility
Next, we assessed the clinical utility of the model.We used DCA and Clinical Impact Curve to evaluate how patients benefit from the model.In Figure 5A, the x-axis represents the threshold probability values, and clinicians make selections based on the actual situation.The green horizontal line indicates that all patients have no metastasis and receive no intervention, resulting in a net benefit of 0. The blue diagonal line represents all patients having metastasis and receiving intervention, resulting in a linear net benefit.Both of these lines are considered ineffective.The red curve represents the DCA curve.Within the threshold probability range of 0 to 0.9, the DCA curve is positioned above both the None and All ineffective lines, indicating that the model adds clinical value to patients within these threshold probability values and demonstrates clinical utility.In Figure 5B, the Clinical Impact Curve demonstrates that when the horizontal axis exceeds 0.3, the solid line approaches the dashed line, indicating close alignment between model predictions and  The ROC curve of the predictive model.AUC, Area Under the Curve.Our model has demonstrated remarkable efficacy in real-world clinical scenarios.Take, for example, a recent case involving a 62year-old male patient diagnosed with laryngeal cancer (cT4N2M0) upon admission.Enhanced CT scans indicated the possibility of left cervical lymph node involvement.Consequently, the patient underwent a comprehensive procedure involving total laryngectomy and left neck lymph node dissection, with postoperative histopathology confirming the absence of lymph node metastasis (depicted in Figure 6).Notably, upon admission, the patient presented with a neutrophil count of 5.45 × 10^9/L and a platelet count of 159 × 10^9/L.According to our model's calculations, the estimated risk of cervical lymph node metastasis for this patient was a mere 32.7% (95% CI: 0.118-0.638).This case underscores the strengths of our model in predicting neck lymph node status, thereby aiding clinical decision-making.The study indicates that current clinical assessment methods are no longer sufficient to accommodate the complex needs of patients (12,13).In contrast, our model has the potential to prevent both overtreatment and under-treatment resulting from limited assessments.

Model validation
This study employed three validation methods, including 10fold cross-validation, Leave-One-Out Cross Validation, and Bootstrap, to ensure the robustness of the model.The results in Table 3 show that all three methods indicate model accuracy of around 90%, with Kappa values exceeding 0.44 and the area under the ROC curve above 0.84.This confirms the model's excellent generalization ability.

A B
Calibration curves of the model under different functions.

A B
Evaluation of the predictive model using DCA and Clinical Impact Curve.(A) The y-axis represents the net benefit, and the x-axis represents the threshold probability.Within the range of 0-0.9, the DCA curve is located above the "None" and "All" lines, indicating the model's suitability in this range.(B) The y-axis represents the number of individuals classified as high risk.The solid red line represents the number of individuals predicted as high risk by the model, while the dashed blue line represents the number of individuals predicted as high risk who experienced the outcome event.
In the range where the x-axis exceeds 0.3, the two lines are close together, indicating the effectiveness of the model.

Discussion
Our study successfully established and validated a diagnostic prediction model that can non-invasively predict neck lymph node status in patients before treatment, providing valuable information for their subsequent treatment.This model offers a more accurate and convenient method for lymph node assessment, avoiding the potential issues associated with traditional methods of lymph node evaluation, such as drug allergies, invasive procedures, suboptimal sample quality, and inadequate assessment (8, 12,14).
Tobacco and alcohol are widely recognized as risk factors for head and neck tumors (15,16), as they can damage mucosal cells, increase exposure to carcinogens, and promote tumor initiation and progression (17).However, the association of tobacco and alcohol with tumor cell metastasis to lymph nodes remains unclear (18)(19)(20).In univariate logistic regression analysis, our study results show no significant correlation between drinking history and neck lymph node metastasis (p=0.584,OR=1.207, 95% CI: 0.616-2.363),and smoking index (p=0.07,OR=2.475, 95% CI: 0.929-6.595)and smoking history (p=0.192,OR=2.264, 95% CI: 0.664-7.722)were also not confirmed as risk factors for lymph node metastasis.
Although we initially considered NLR and PLR as potential predictive factors, they were not included in the final prediction model in the multivariate logistic regression analyses.While neutrophils and platelets play important roles in tumor metastasis (21,22), the roles of NLR and PLR in the field of cancer are still uncertain (23)(24)(25).While they showed a trend toward lymph node metastasis in univariate analysis, their impact was not as significant as tumor location, T stage, neutrophil count, and platelet count.Additionally, NLR and PLR are influenced by various factors, including medications and other disease states, which may introduce confounding variables (26).In contrast, tumor location, T stage, neutrophil count, and platelet count showed significant associations with neck lymph node metastasis in both univariate and multivariate logistic regression analyses.The multivariate results revealed that advanced-stage (T3-4) patients had a 15-fold higher risk of lymph node metastasis compared to early-stage (T1-2) patients (p<0.001,OR=14.673,95% CI: 5.55-38.791),and the risk of neck metastasis for hypopharyngeal cancer was 19 times higher than that for laryngeal cancer (p<0.001,OR=19.049,95% CI: 4.946-73.37).High neutrophil count (>5.0) was associated with a 3-fold higher risk of metastasis compared to low neutrophil count (≤5.0) (p=0.025,OR=3.321, 95% CI: 1.159-9.514),and high platelet count (>168) was associated with a 3-fold higher risk of metastasis compared to low platelet count (≤168) (p=0.038,OR=3.102, 95% CI: 1.064-9.046),as detailed in Table 2, indicating an association between these factors and neck lymph node metastasis.
Tumor staging is closely associated with lymph node metastasis, and as the T stage increases, the probability of lymph node metastasis also rises (27).In a study on occult lymph node metastasis in laryngeal cancer, the rate of occult lymph node metastasis was 15.4% for T2 supraglottic laryngeal cancer, while it was as high as 35.7% for T4 cases (28).Another study (29) pointed out that lymph node metastasis in hypopharyngeal cancer is closely related to tumor staging, with advanced cases (T3-T4) being more prone to lymph node metastasis than early cases (T1-2), which is consistent with our research findings.Research suggests that tumors in different locations exhibit significant differences in lymph node metastasis (11), particularly tumors in the hypopharynx, which are considered high-risk factors for lymph node metastasis, with a metastasis rate as high as 70% (30,31).Additionally, hypopharyngeal cancer is more likely to metastasize to the tracheal lymph nodes (32).In contrast, the lymph node metastasis rate for laryngeal cancer is only 16.2%, even in T4-stage laryngeal cancer cases, the metastasis rate remains only 36% (33,34), consistent with our predictive model, highlighting the significant impact of different tumor locations on the risk of lymph node metastasis.Neutrophils also play an important role in tumor metastasis (22,35), as they can travel with circulating tumor cells and promote tumor cell proliferation by releasing cytokines, further validating this perspective in our study.Platelets can also participate in the tumor metastasis process in multiple ways, including promoting the survival of tumor cells, assisting in evading immune surveillance, and facilitating tumor cell adhesion to endothelial cells for penetration into capillaries for distant metastasis (36).Currently, drugs that intervene in the interaction between platelets and tumor cells are being explored to inhibit tumor metastasis (21), and our study results support this perspective.Unfortunately, the aforementioned studies primarily focused on analyzing individual factors in lymph node metastasis, without considering the synergistic effects of multiple factors in lymph node metastasis.This is precisely the highlight of our research.
We comprehensively employed various methods including discriminative analysis, calibration curve, Hosmer-Lemeshow test, and clinical utility to assess the performance of our model.The results demonstrated that our model excelled in all these metrics, displaying remarkable discriminative power, goodness of fit, and significant potential for clinical application.This underscores the importance of our predictive model in assessing neck lymph node status among LHC patients.Another highlight of our study is the multi-level validation conducted on the model, including 10-fold cross-validation, leave-one-out cross-validation, and bootstrap methods.This not only helps prevent model overfitting but also provides a comprehensive evaluation of its generalization ability.Consistent validation results across different methods consistently indicated outstanding accuracy and stability, affirming the model's strong generalization potential.
Currently, there is limited research on predictive models specifically targeting occult lymph node metastasis in laryngeal and hypopharyngeal carcinoma.Previous models have been criticized for incomplete evaluation factors, poor clinical applicability, or mismatched tumor types (37,38).Our study integrates oncological characteristics with laboratory findings to comprehensively assess lymph node status.Additionally, we employ multiple validation methods to ensure the reliability of our results.
Our developed predictive model exhibits promising potential and practicality for clinical treatment decisions.Demonstrating robust discrimination and calibration, the model accurately assesses cervical lymph node status in patients with laryngeal and hypopharyngeal carcinoma, providing valuable insights for clinical decision-making.Moreover, the predictors utilized in the model are based on routine clinical examinations, showcasing their broad applicability and utility in everyday clinical practice.This facilitates personalized treatment approaches, ultimately enhancing treatment outcomes.
However, our study does have limitations.Firstly, it is based on single-center cases and lacks external validation, restricting the model's applicability to broader populations.Additionally, retrospective designs may introduce confounding factors that could affect result accuracy.To address these limitations, we propose conducting prospective studies across multiple centers and populations to validate the research model, ensuring its generalizability.Furthermore, we suggest exploring the integration of tumor biomarkers and genomic data into the predictive model.By incorporating these biomarkers and genetic information, we can enhance the accuracy and precision of assessing lymph node status in patients.Additionally, investigating the potential synergy between traditional clinical variables and novel biomarkers warrants further exploration.These approaches hold promise for developing comprehensive and precise predictive models, thereby improving clinical outcomes for patients.

Conclusion
We have developed and validated a diagnostic prediction model that can predict lymph node status using non-invasive methods before treatment, addressing the clinical challenge of detecting occult neck lymph node metastasis.This model holds promise for clinical application, providing valuable insights for patient management.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material.Further inquiries can be directed to the corresponding authors.

FIGURE 1 Flow
FIGURE 1Flow diagram of study design.
, we utilized the Calibration curve to evaluate and visualize the disparity between the model's predicted values and actual values.The horizontal axis represents predicted values, while the vertical axis represents actual values.The blue dashed line labeled "Ideal" serves as a reference line, the red solid line labeled "Apparent" represents the fit between the model's predicted values and actual values, and the green solid line labeled "Bias-corrected" indicates the fit between the model's predicted values and actual values after calibration.The shape of the curve suggests good consistency between the model's predicted values and actual values.In Figure 4B, with Dxy=0.774,indicating a strong correlation between predicted values and actual values, and p=0.950, indicating good model fit.

FIGURE 2
FIGURE 2 Nomogram for Predicting Neck Lymph Node Metastasis Risk in LHC.Tumor location, Primary tumor site; T stage, Primary tumor stage; Neutrophil, neutrophil count, Platelet, platelet count.

FIGURE 3
FIGURE 3 (A) Using the calibrate function, the bias-corrected line or the Apparent line approaching the Ideal line indicates higher consistency between predicted and actual values.(B) Using the val.prob function, Dxy represents the magnitude of correlation between predicted and actual values; Brier represents mean squared error, with smaller values indicating better calibration performance; p=0.950 indicating a good fit.

6
FIGURE 6 Patient's Enhanced CT scans and Postoperative Pathology.(A-B) Axial and coronal CT images of the patient's neck obtained after contrast enhancement.The images reveal an enlarged lymph node in the left level II area (arrow).The radiologist assessed this lymph node preoperatively, considering the possibility of lymph node metastasis associated with the tumor.(C) Pathological findings from postoperative tumor specimens show the presence of keratin pearls, indicative of squamous cell carcinoma (HE staining, 100x magnification).(D) Postoperative lymph node pathology results do not demonstrate the presence of cancer cells (HE staining, 100x magnification).

TABLE 1
Demographic and clinicopathologic characteristics of 285 patients.

TABLE 1 Continued
Age, Smoking Index, Neutrophil, Platelet, Lymphocyte, NLR, PLR, Albumin, and Hemoglobin are presented as Median (IQR); Sex, T stage, Tumor location, Smoking history, Drinking history, Hypertension, Diabetes, and Pathology are presented as N (%).Percentages may not sum to 100% due to rounding.NLR, Neutrophil-to-Lymphocyte Ratio; PLR, Platelet-to-Lymphocyte Ratio; Smoking index: number of cigarettes smoked per day multiplied by the number of years of smoking.

TABLE 2
Odds ratios (95% Confidence Intervals) and other statistical parameters obtained from logistic regression analysis using all patients.
Ref, reference category; B, Coefficient; SE, Standard Error; OR, Odds Ratios; Z, standard score; NLR, Neutrophil-to-lymphocyte ratio; PLR, Platelet-to-lymphocyte ratio; Smoking index, Number of cigarettes smoked per day multiplied by the number of years of smoking.

TABLE 3
Validation results of the model using three different methods: 10-fold cross-validation, leave-one-out cross validation, and bootstrap.