Prediction models for lymph node metastasis in cervical cancer based on preoperative heart rate variability

Background The occurrence of lymph node metastasis (LNM) is one of the critical factors in determining the staging, treatment and prognosis of cervical cancer (CC). Heart rate variability (HRV) is associated with LNM in patients with CC. The purpose of this study was to validate the feasibility of machine learning (ML) models constructed with preoperative HRV as a feature of CC patients in predicting CC LNM. Methods A total of 292 patients with pathologically confirmed CC admitted to the Department of Gynecological Oncology of the First Affiliated Hospital of Bengbu Medical University from November 2020 to September 2023 were included in the study. The patients' preoperative 5-min electrocardiogram data were collected, and HRV time-domain, frequency-domain and non-linear analyses were subsequently performed. Six ML models were constructed based on 32 parameters. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. Results Among the six ML models, the random forest (RF) model showed the best predictive performance, with the following metrics on the test set: AUC (0.852), accuracy (0.744), sensitivity (0.783), and specificity (0.785). Conclusion The RF model built with preoperative HRV parameters showed superior performance in CC LNM prediction, but multicenter studies with larger datasets are needed to validate our findings, and the physiopathological mechanisms between HRV and CC LNM need to be further explored.


Introduction
Cervical cancer (CC) is a common gynecologic malignancy worldwide and one of the leading causes of cancer-related deaths in women (Sung et al., 2021). Clinical treatment is usually administered according to the disease stage of patients. For example, early CC patients are often treated with surgery, while patients with locally advanced CC need to be considered for combined radiotherapy and chemotherapy (Bhatla et al., 2021; Burmeister et al., 2022; Pecorino et al., 2022; Mereu et al., 2023). In 2018, the International Federation of Gynecology and Obstetrics (FIGO) included lymph node status in the CC staging criteria (Kido and Nakamoto, 2021). Since then, the occurrence of lymph node metastasis (LNM) has become an important factor in determining the staging and treatment modalities of CC (Hou et al., 2020). In addition, LNM has been proven to be an important risk factor for CC recurrence and patient death (Polterauer et al., 2010; Moreira et al., 2020). Therefore, it is of great significance to accurately assess whether LNM occurs in CC patients before treatment to make the best treatment decision and prognosis assessment.
Imaging methods such as magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound are currently the preferred choice for early non-invasive detection of CC lymph node status. However, the assessment of lymph node involvement by imaging methods relies mainly on lymph node size and morphological features. False-positive results may occur when there is a combination of inflammation, tuberculosis and hyperplastic lymph node lesions, and false-negative results may occur when there are small metastatic lymph nodes or micrometastatic foci. In recent years, with the development of artificial intelligence technology, radiomics approaches have emerged and been widely studied in the prediction of preoperative LNM status in patients with colorectal, bladder, breast, and biliary tract cancers (Huang et al., 2016; Wu et al., 2018; Mao et al., 2020). However, for radiomics-based early prediction models of CC LNM, the reproducibility of the radiomics features and the robustness of the models remain to be demonstrated, because lymph node images are complex and their delineation relies on the subjective judgment of the diagnostician (Ji et al., 2019).
The autonomic nervous system (ANS) is an important component of the tumor microenvironment, which is involved in and modifies the cancer process (Manganaro et al., 2021). Evidence suggests that the ANS interacts with the development of inflammation, immunity, and metastasis in a variety of cancers (Bautista and Krishnan, 2020; Kamiya et al., 2021). ANS function, as characterized by heart rate variability (HRV), has been extensively studied in cancer prognostic assessment, and preoperative HRV has been shown to be strongly associated with LNM status in a variety of malignant tumors. For example, Hu et al. (2018) and Simó et al. (2018) found that HRV decreased with tumor progression in patients with gastric cancer and correlated with LNM. Wang et al. (2021) found that in CC patients, HRV was significantly lower in the LNM group than in the no LNM group, and this association was independent of confounding factors such as age. If HRV can be used as a feature variable to build machine learning (ML) models to predict CC LNM, it would help to simplify the examination method for LNM.
The main objective of this study was to establish ML models to predict lymph node status based on preoperative short-term HRV features in CC patients, thereby providing new ideas for the preoperative prediction of LNM in CC patients.

Subjects
The study was approved by the Medical Ethics Committee of Bengbu Medical University (Bengbu, Anhui, China) (2023-14). The experimental process was performed in strict accordance with the ethical standards set out in the 1964 Declaration of Helsinki and its amendments. All patients were informed of the detailed purpose, process, risks and adverse effects of the experiment and signed an informed consent form.
The study subjects were 427 CC patients admitted to the Department of Gynecological Oncology of the First Affiliated Hospital of Bengbu Medical University from November 2020 to September 2023. The inclusion criteria were as follows: (1) CC confirmed by histopathological examination (squamous pathological type) and (2) new-onset patients who had not received surgical treatment, radiotherapy or chemotherapy. The exclusion criteria were as follows: (1) carcinoma in situ; (2) incomplete pathological data; (3) poor quality of electrocardiographic signals; and (4) ectopic beats >5% of all beats.

Data collection and heart rate variability analysis
The 5-min supine electrocardiogram data of CC patients were collected 1 day before surgery using a single-lead miniature electrocardiograph (version 2.8.0, Healink-R211B, Healink Ltd., Bengbu, China) with the sampling rate of the electrocardiograph set to 400 Hz and the bandwidth of the signal set to 0.6-40 Hz. The patient was asked to keep quiet and breathe steadily, and lead V6 was used.
The Pan-Tompkins algorithm was used to extract the electrocardiographic R-R interval (RRI) time series (Pan and Tompkins, 1985). Artifacts caused by extraction techniques, interference, and ectopic beats were automatically corrected using a time-varying threshold algorithm (Lipponen and Tarvainen, 2019). HRV analysis was then performed to obtain a total of 32 HRV parameters.
Time-domain indicators included the standard deviation of all normal-to-normal intervals (SDNN), root mean square of successive interval differences (RMSSD), number of successive RR interval pairs that differed by more than 50 ms (NN50), NN50 divided by the total number of RR intervals (pNN50), triangular interpolation of normal-to-normal intervals (TINN), RR interval triangular index (RRTi, sampling interval 1/128 s), deceleration capacity (DC), and acceleration capacity (AC).
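To make the first four time-domain definitions concrete, the following is a minimal sketch rather than the analysis software used in the study. The function name `time_domain_hrv`, the sample (n - 1) denominator for SDNN, and the reporting of pNN50 as a percentage are assumptions made for illustration. The pNN50 denominator follows the wording above (total number of RR intervals).

```python
import math


def time_domain_hrv(rr_ms):
    """Basic time-domain HRV metrics from RR intervals given in milliseconds.

    Illustrative sketch only; not the software used in the study.
    """
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two RR intervals")

    mean_rr = sum(rr_ms) / n
    # SDNN: standard deviation of the intervals (assumption: n - 1 denominator).
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))

    # Successive differences between neighbouring intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # NN50: successive pairs differing by more than 50 ms.
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    # pNN50: NN50 over the total number of RR intervals, as worded in the text
    # (assumption: expressed as a percentage).
    pnn50 = 100.0 * nn50 / n

    return {"SDNN": sdnn, "RMSSD": rmssd, "NN50": nn50, "pNN50": pnn50}


if __name__ == "__main__":
    print(time_domain_hrv([800, 860, 790, 800, 900]))
```

Some toolkits divide NN50 by the number of successive differences instead of the number of intervals, so pNN50 values from different software may differ slightly on short recordings.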
Prior to the frequency-domain analysis, the RR interval sequences were uniformly resampled at 4 Hz using cubic spline interpolation. The power spectral density of the RR interval time series was then estimated using the fast Fourier transform (FFT)-based Welch periodogram method (150 s window width, 50% overlapping windows).

Machine learning modeling
Because the original dataset was imbalanced between the LNM (+) and LNM (-) categories, the synthetic minority over-sampling technique (SMOTE) was used to bring the LNM (+) and LNM (-) groups to a ratio of 0.8:1 in order to enhance model performance. The dataset was divided into training and test sets at a ratio of 7:3. Six ML models, namely adaptive boosting (AdaBoost), Gaussian naive Bayes (GNB), logistic regression (LR), random forest (RF), support vector machine (SVM) and XGBoost, were built for CC LNM status classification. We used 10-fold cross-validation to evaluate the performance of the models on the training set. The optimal model was selected, and its classification performance on the test set was further evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity and specificity. ML models were constructed and validated using Python (Version 3.9) and the R programming language (Version 3.6.3).
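The core idea of SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours, can be sketched in a few lines. This is an illustrative standard-library sketch, not the implementation used in the study: the function names `smote` and `n_synthetic_needed`, the choice of k = 5 neighbours, the fixed random seed, and the reading of 0.8:1 as LNM (+) to LNM (-) are all assumptions.

```python
import math
import random


def n_synthetic_needed(n_minority, n_majority, ratio=0.8):
    """Number of synthetic minority samples needed to reach minority:majority = ratio:1."""
    return max(0, round(ratio * n_majority) - n_minority)


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples by interpolating between minority neighbours.

    minority: list of feature vectors (lists of floats) from the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        # Random point on the segment between the base sample and its neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    # With 62 LNM (+) and 230 LNM (-) patients, a 0.8:1 target needs 122 new samples.
    print(n_synthetic_needed(62, 230))
```

Because every synthetic sample is a convex combination of two real minority samples, it always lies within the range already spanned by the minority class in each feature.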

Statistical analysis
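Different representations were applied according to the type of data: mean ± standard deviation for normally distributed continuous data, median (first quartile, third quartile) for non-normally distributed continuous data, and count (percentage) for count data. The Shapiro-Wilk test was used to test the distribution normality of continuous variables. Independent samples t-tests and Mann-Whitney U-tests were performed to compare continuous variables between two groups. The chi-square test was used to compare count data between two groups. SPSS Statistics 26.0 (IBM Corp., Chicago, IL, United States of America) software was used for statistical analysis. P < 0.05 was defined as a significant difference.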

Patient characteristics
This study finally included 292 CC patients with an age and body mass index (BMI) of 54.0 ± 10.9 years and 24.8 ± 3.2 kg/m², respectively. All patients were divided into LNM (-) and LNM (+) groups based on histopathologic findings. The LNM (-) group included 230 patients, and the LNM (+) group included 62 patients. After statistical analysis, we found no significant differences between the LNM (-) and LNM (+) groups in terms of age, BMI, hypertension, diabetes, tubal ligation, menopausal status, and adjuvant chemotherapy (P > 0.05). Table 1 describes the basic clinical characteristics of CC patients.
Table 2 shows the statistical results of the HRV parameters in CC patients. No HRV parameter was significantly different (P > 0.05) between the CC LNM (-) and LNM (+) groups; these results were obtained from the analysis of the raw data of 292 CC patients.

Diagnostic performance of the six machine learning models
Six ML models were built based on the 32 HRV features. Figure 1 shows the AUC of 10-fold cross-validation for the six ML models on the validation set. Among them, the RF model had the highest AUC for 10-fold cross-validation (AUC = 0.904). The calibration curve (Figure 2) shows that, among the six models, the RF model had the best agreement between the predicted probability and the actual probability of LNM, with a Brier score of 0.147.
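For readers less familiar with the two quantities reported here, the sketch below shows how an AUC and a Brier score can be computed from binary labels and predicted probabilities. It is an illustration only: the function names and the toy data are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), which is equivalent to the area under the ROC curve.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels (1 = LNM positive) and predicted probabilities.
    labels = [0, 0, 1, 1]
    probs = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, probs), brier_score(labels, probs))
```

A lower Brier score indicates predicted probabilities closer to the observed outcomes, while a higher AUC indicates better ranking of positive cases above negative ones; the two metrics therefore capture calibration and discrimination, respectively.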
The best-performing RF model was further tested using the test set. The ROC curves of the RF model in the training set, validation set, and test set are shown in Figures 3A-C, respectively. Figure 3D shows the decision curve of the RF model in the test set; the RF model achieved a high net clinical benefit across most of the high-risk threshold range. Table 3 shows the predictive metrics of the RF model on the training set, validation set, and test set.
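The threshold-based metrics and the decision curve mentioned above can be illustrated with a short sketch. It assumes the standard decision curve definition of net benefit, TP/n - (FP/n) × pt/(1 - pt) at risk threshold pt; the function names and the toy data are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn), calling a case positive when probability >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity and specificity at a given probability threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on the model at risk threshold pt (0 < pt < 1)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical labels (1 = LNM positive) and predicted probabilities.
    labels = [1, 1, 0, 0, 0]
    probs = [0.9, 0.4, 0.6, 0.2, 0.1]
    print(classification_metrics(labels, probs))
    print([round(net_benefit(labels, probs, t), 3) for t in (0.2, 0.5, 0.8)])
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing it with the "treat all" and "treat none" strategies.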

Model interpretation
The SHAP summary plot of the RF model is shown in Figure 4. The 20 features in the SHAP summary plot are arranged along the vertical axis in descending order of feature importance, with a higher position indicating greater importance for the model's prediction of LNM status. In order, the features are SampEn_MSE2, α1, ApEn, SampEn_MSE3, REC, DET, α2, ShanEn, SDHR, Lmean, SDNN, SampEn_MSE5, TINN, LF/HF, SampEn_MSE4, CD, MeanHR, MinHR, HF, and MaxHR.

Discussion
In this study, we used HRV parameters obtained from the preoperative 5-min electrocardiogram data of CC patients to develop ML models for the classification of LNM status. Among them, the RF model performed the best with an AUC of 0.852, accuracy of 0.744, sensitivity of 0.783, and specificity of 0.785 on the test set. The results showed that the RF model based on preoperative HRV features could be used for CC LNM prediction.
In recent years, ML techniques have been proposed to identify CC LNM, mainly using invasively obtained hematological parameters and/or non-invasive imaging parameters as model input features. For example, Ou et al. (2022) used pretreatment hematological parameters of CC patients to build ML models to predict LNM; their Cforest model achieved an AUC of only 0.620. In addition, uncontrollable factors such as drugs and inflammation can affect the stability of hematological indicators, and different testing reagents and equipment can bias the test results (Niu et al., 2020), all of which hinder the popularization of this method. Arezzo et al. (2023) established LR and XGBoost models to predict LNM in patients with advanced CC using clinical data and pelvic MRI as the characteristic parameters, and the XGBoost model showed the better predictive performance (89% accuracy, 83% precision, 78% recall, and AUC 0.79). Although the model based on clinical features and MRI showed some performance improvement over the hematological parameter model, the method is more costly and the test is more time-consuming. In comparison, the RF model established with HRV parameters as features achieved a higher test-set AUC (0.852) than those reported for the above methods. In addition, HRV detection is non-invasive, low cost, safe and easy to perform, which makes it well suited to clinical promotion and application.
In ML-related studies, previous scholars have mostly used the P-value of statistical methods as a criterion for feature selection. However, this approach has certain pitfalls: the P-value is typically used to make a "one-size-fits-all" judgment with a threshold value of 0.05 or 0.01, which makes it easy to miss the potential contributions of features to the model prediction. In our study, although no statistically significant differences in HRV metrics were observed between the LNM (-) and LNM (+) groups, the RF model based on HRV parameters reached an AUC of 0.852 on the test set. The threshold for a significant difference (P < 0.05) is too strict and may ignore the contribution of some features to the classification (Guo et al., 2019). Traditional statistical methods may not be suitable for feature selection when modeling in ML, as ML methods can mine more potential relationships between data.
In this study, we used SHAP analysis to address the interpretability of the best-performing RF model. The SHAP analysis showed that non-linear HRV parameters contributed more to the RF model. HRV analysis includes traditional time-domain, frequency-domain and non-linear analyses. Compared with traditional time-domain and frequency-domain parameters, non-linear parameters reflect the complexity of physiological signals better and can detect subtle changes in the early stages of disease (Busa and van Emmerik, 2016; Shi et al., 2017, 2019; Cui et al., 2020; Liao et al., 2022). The multiscale entropy (MSE) complexity analysis method was proposed by Costa et al. (2002); it quantifies the non-linear dynamics of complex systems on multiple scales based on sample entropy (SampEn) and measures signal complexity more comprehensively. Several studies have applied the complexity indicators quantified by this method to disease prediction, classification and prognostic assessment (Lin et al., 2016; Frassineti et al., 2021; Tang et al., 2021; Yang et al., 2021; Liao et al., 2022). For example, Frassineti et al. (2021) found that MSE analysis had prescreening value in neonatal seizures. Tang et al. (2021) showed that MSE analysis was helpful for the identification of high-risk pulmonary hypertension patients. In contrast to single time scale analyses, MSE captures signal behavior across a range of time scales (Busa and van Emmerik, 2016). In practical applications, MSE analysis is more accurate for long-duration electrocardiogram data; the length of the electrocardiogram data we analyzed was 5 min, so only 5 scales were analyzed. Interestingly, Zhang et al. (2021) used 5 scales as well and noted that MSE showed greater discriminatory power in identifying coronary artery lesions. Combined with our findings, this implies that MSE measures may be promising indicators for detecting disease states, although the underlying mechanisms remain unclear. Detrended fluctuation analysis (DFA) provides an interpretation of shorter time series and can quantify the fractal behavior of complex dynamical systems (Peng et al., 1994; Nayak et al., 2018; Gu et al., 2022). In our results, we observed that the short-term fluctuation slope α1 from DFA made an important contribution to the RF model, which may be related to the complex physiological mechanism behind it. The physiological context of DFA has been shown to be related to subtle interactions between sympathetic and vagal nerves (Tulppo et al., 2005; Beckers et al., 2006; Mandarano et al., 2022). As mentioned in the introduction, cancer progression involves dysfunction of ANS regulation, and LNM, as an important step in cancer progression, is associated with a combination of factors such as immune function and inflammatory response, which can be influenced by the ANS through their regulation (Li et al., 2013; Le et al., 2016). In summary, the results of this study suggest that altered physiological complexity and ANS dysfunction are closely associated with CC LNM, but the specific mechanisms need to be further explored.

FIGURE 1
Receiver operating characteristic (ROC) curves for the six ML models on the validation set.

FIGURE 2
Calibration curves for the six ML models on the validation set. The dotted line represents the perfect calibration curve, i.e., the predicted probability matches the true probability perfectly. The numbers in the legend represent the Brier scores of the ML models; the smaller the Brier score, the closer the predicted probability of the ML model is to the true probability.
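As an illustration of the multiscale entropy analysis discussed in this section, the sketch below coarse-grains an RR interval series at scales 1 to 5 and computes sample entropy at each scale. It is a minimal sketch under stated assumptions, not the implementation used in the study: the embedding dimension m = 2, the tolerance r = 0.2 times the standard deviation of the original series (held fixed across scales), the use of "less than or equal to r" for matches, and all function names are assumptions chosen for illustration.

```python
import math


def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def population_sd(series):
    mean = sum(series) / len(series)
    return math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))


def sample_entropy(series, m=2, r=None):
    """Sample entropy: -ln(A / B), where B and A count template pairs that
    match within tolerance r at lengths m and m + 1 (self-matches excluded)."""
    n = len(series)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    if r is None:
        r = 0.2 * population_sd(series)  # assumption: common default tolerance

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


def multiscale_entropy(series, max_scale=5, m=2, r_factor=0.2):
    """Sample entropy of the coarse-grained series at scales 1..max_scale."""
    r = r_factor * population_sd(series)  # tolerance fixed from the original series
    return [sample_entropy(coarse_grain(series, s), m, r) for s in range(1, max_scale + 1)]


if __name__ == "__main__":
    # Synthetic signal used purely for demonstration (not patient data).
    demo = [800 + 50 * math.sin(0.3 * i) + 20 * math.sin(1.7 * i) for i in range(300)]
    print(multiscale_entropy(demo))
```

Each coarse-graining step shortens the series by the scale factor, which is one reason why short recordings such as 5-min electrocardiograms support only a small number of scales.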

Limitations
Our study also presents some limitations. First, there were some differences in the proportion of patients in the LNM (-) and LNM (+) groups in this study, and although we corrected for the sample imbalance using the SMOTE method, this may still have interfered with the results and affected the generalization ability of the model. Second, this study was a single-center study. Because HRV collection is prone to interference from the environment and other factors, external validation in a multicenter study is essential. Third, the physiopathological mechanisms between HRV parameters, especially non-linear parameters, and CC LNM need to be further explored.

Conclusion
In conclusion, we investigated the feasibility of ML modeling using preoperative HRV parameters to predict CC LNM and demonstrated that the RF model may be a helpful detection tool. Being easy to implement, non-invasive and inexpensive, the technique is amenable to further clinical studies to refine our methodology and to determine the optimal application of the technique in clinical practice.
In the SHAP summary plot (Figure 4), each point represents one patient. The horizontal axis is the SHAP value of the feature, the absolute value of which indicates the degree to which the feature affects the model output. Patients with higher SHAP values are at higher risk of developing LNM. Red indicates higher feature values, purple indicates feature values close to the overall mean, and blue indicates lower feature values.

FIGURE 3
Receiver operating characteristic (ROC) curves of the RF model on the training set (A), validation set (B), test set (C), and the decision curve on the test set (D).

FIGURE 4
SHAP summary plot of 20 HRV parameters of the RF model.

TABLE 1
Clinical and demographic data.
Values are expressed as the mean ± standard deviation, or the number of patients (percentages). LNM, lymph node metastasis; N, number of individuals; BMI, body mass index.

TABLE 2
Differences in HRV indicators between the LNM (-) and LNM (+) groups.
HRV, heart rate variability; LNM, lymph node metastasis; N, number of individuals.
The Shapiro-Wilk test was used to test the distribution normality of continuous variables. Independent samples t-tests and Mann-Whitney U-tests were performed to compare continuous variables between two groups. The chi-square test was used to compare count data between two groups. SPSS Statistics 26.0 (IBM Corp., Chicago, IL, United States of America) software was used for statistical analysis. P < 0.05 was defined as a significant difference.

TABLE 3
Predictive metrics of RF model on training set, validation set and test set.