Personalized chemotherapy selection for patients with triple-negative breast cancer using deep learning

Background: Potential uncertainties and overtreatment exist in adjuvant chemotherapy for triple-negative breast cancer (TNBC) patients. Objectives: This study aims to explore the performance of deep learning (DL) models in personalized chemotherapy selection and to quantify the impact of baseline characteristics on treatment efficacy. Methods: Patients who received the treatment recommended by the models were compared to those who did not. Overall survival for treatment according to model recommendations was the primary outcome. To mitigate bias, inverse probability treatment weighting (IPTW) was employed. A mixed-effect multivariate linear regression was employed to visualize the influence of certain baseline features of patients on chemotherapy selection. Results: A total of 10,070 female TNBC patients met the inclusion criteria. Treatment according to the recommendations of the Self-Normalizing Balanced (SNB) individual treatment effect for survival data model was associated with a survival benefit (IPTW-adjusted hazard ratio: 0.53, 95% CI, 0.32–8.60; IPTW-adjusted risk difference: 12.90, 95% CI, 6.99–19.01; IPTW-adjusted difference in restricted mean survival time: 5.54, 95% CI, 1.36–8.61), which surpassed other models and the National Comprehensive Cancer Network guidelines. No survival benefit for chemotherapy was seen for patients not recommended to receive this treatment. SNB predicted that older patients with larger tumors and more positive lymph nodes are the optimal candidates for chemotherapy. Conclusion: These findings suggest that the SNB model may identify patients with TNBC who could benefit from chemotherapy. This novel analytical approach may provide debiased individual survival information and treatment recommendations. Further research is required to validate these models in clinical settings with more features and outcome measurements.


Introduction
Breast cancer is the most prevalent malignant tumor in women worldwide (1) and the leading cause of cancer-related deaths (2). Triple-negative breast cancer (TNBC) is the most aggressive subtype of breast cancer (3), characterized by the absence of estrogen receptors (ERs) and progesterone receptors (PRs), as well as the lack of overexpression of human epidermal growth factor receptor 2 (HER2) (4). Patients with TNBC account for 10–20% of breast cancer cases diagnosed each year (5), and they have a higher rate of recurrence and mortality (6).
Currently, adjuvant chemotherapy is the standard of care for operable TNBC, but it is only partially effective (7). For example, the National Comprehensive Cancer Network (NCCN) guidelines only recommend adjuvant chemotherapy for patients with tumor size larger than 1 cm (beyond T1b) or pN+ (8). However, a number of studies have found that patients with T1b TNBC still benefit after receiving adjuvant chemotherapy (9,10). In addition to tumor size, age, race, surgery, and radiation therapy are also important indicators for chemotherapy decisions (11). This indicates that the therapeutic heterogeneity of adjuvant chemotherapy cannot be ignored in the TNBC population.
The individuality of the patient should be at the core of every treatment decision (12). Estimating the average treatment effect with randomized control trials (RCTs) or observational studies that incorporate extensive statistical theories only provides a coarse summary of the distribution of a treatment effect, which may be inapplicable or even misleading at the individual level (13). Traditionally, to assess the heterogeneity of treatment and select the optimal treatment for a particular patient, researchers have had to repeatedly subdivide subgroups based on clinical experience to approximate an individual patient or a particular class of patients and then conduct RCTs within these subgroups. However, this traditional approach is not only very expensive and time-consuming but also ethically restrictive (14). Inferring unbiased individual treatment effects (ITEs) in observational studies is challenging because observational data can be affected by numerous biases (13). With machine learning, ITEs can be predicted through counterfactual reasoning (13). Previous studies (15,16) have demonstrated that deep learning (DL)-based treatment recommendation systems can effectively predict ITEs, recognize treatment heterogeneity, and select the optimum treatment for patients.
The aim of this study is to establish a set of DL-based treatment guidelines that provide optimal adjuvant chemotherapy recommendations for TNBC patients at the individual level and help patients achieve the longest possible survival.

Study design and setting
This was a population-based retrospective cohort study making individualized adjuvant chemotherapy recommendations for patients with TNBC using DL. All participants in this study were included in the Surveillance, Epidemiology, and End Results (SEER) 18 database, which tracks cancer patients in 18 regions of the United States and represents approximately 27.8% of the national population (17). This study followed the Strengthening the Reporting of Observational Studies in Epidemiology reporting guidelines (18).
Female patients diagnosed with ductal, lobular, or ductal-lobular carcinoma as a single primary cancer between 2010 and 2016, who underwent either breast-conserving surgery (BCS) or mastectomy, were included in the study. The exclusion criteria were as follows: (1) missing demographic information; (2) unknown HER2, ER, or PR status; (3) carcinoma in situ; (4) unknown laterality and bilateral breast cancer; (5) unspecified Tumor Node Metastasis (TNM) stage or tumor size; (6) unknown metastasis sites; (7) unknown axillary lymph node status; (8) uncertainty as to whether adjuvant or neoadjuvant systemic treatment and radiotherapy were performed; (9) unknown histologic grades and types; and (10) incomplete follow-up or multiple malignancies. The inclusion process is illustrated in Figure 1A.
We collected baseline information (sex, age, race, income, and marriage status), tumor characteristics (location, size, laterality, histological grade, histologic type, and TNM stage), and treatment details (type of surgery and chemotherapy) for cases from the SEER database. Patients were excluded if any of the included clinical characteristics were undocumented or missing. The primary outcome of this study was overall survival (OS), defined as the time interval from diagnosis to all-cause death. Patients who remained alive on 31 December 2020 were censored in the study. The tumor stage was determined according to the 7th American Joint Committee on Cancer Staging Manual.

Algorithms
The T-learner adopts two models to estimate the ITEs by $\mathrm{ITE}(x) = \mu_1(x) - \mu_0(x)$, where $\mu_1$ and $\mu_0$ denote the models trained on the corresponding treatment populations (19). This approach is composed of two estimators trained in different treatment groups, representing different treatment hypotheses in inference. The ITE is computed as the difference in predictions between these two estimators, which can be any prediction model, such as the Cox proportional hazards (CPH) model. While the T-learner can exclude certain confounding factors, it remains vulnerable to inconsistent predictive performance (13) and biased treatment allocation (20) due to disparate patient numbers and imbalanced baseline characteristics in the two treatment groups. DeepSurv (21) was originally proposed to relax the linearity and normality assumptions of CPH by replacing the single-layer linear model of CPH with a multilayer perceptron (MLP). In a follow-up study (15), it was found that combining DeepSurv with the T-learner was effective in inferring ITEs.
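As a rough illustration of the T-learner idea described above, the following sketch fits one model per treatment arm and takes the difference of their predictions. It is not the authors' implementation: a one-covariate least-squares line is swapped in for the survival models (CPH or DeepSurv), and the names `fit_line`, `t_learner`, and the toy data are hypothetical.

```python
# Minimal T-learner sketch (pure Python, toy data, illustrative only).
# Assumption: a one-covariate least-squares line stands in for the
# outcome model; the paper uses survival models such as CPH or DeepSurv.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def t_learner(xs, ys, treated):
    """Fit one model per treatment arm and return an ITE function."""
    x1 = [x for x, t in zip(xs, treated) if t == 1]
    y1 = [y for y, t in zip(ys, treated) if t == 1]
    x0 = [x for x, t in zip(xs, treated) if t == 0]
    y0 = [y for y, t in zip(ys, treated) if t == 0]
    a1, b1 = fit_line(x1, y1)  # mu_1: trained on treated patients only
    a0, b0 = fit_line(x0, y0)  # mu_0: trained on untreated patients only

    def ite(x):
        # Difference between the two arm-specific predictions.
        return (a1 + b1 * x) - (a0 + b0 * x)

    return ite


# Hypothetical toy data: covariate, outcome (e.g. survival months), arm.
xs = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
ys = [10.0, 12.0, 14.0, 16.0, 9.0, 9.0, 9.0, 9.0]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
ite = t_learner(xs, ys, treated)
```

Because each arm is fitted separately, an arm with few patients or a different covariate mix yields a less reliable estimator, which is the weakness the balanced-representation models are meant to address.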
Cox Mixtures with Heterogeneous Effects (CMHE) (22) operates on the assumption that the cohort consists of latent subgroups with different survival scenarios. Within each risk group, the proportional hazards assumption holds, a concept known as the conditional proportional hazards assumption. To maximize the representation of diverse risk groups, the expectation-maximization technique was implemented.
Balanced Individual Treatment Effect for Survival data (BITES) (20), a semi-parametric DL survival regression model, addresses the issue of selection bias using representation-based causal inference. It contains a shared network and two risk networks (three MLPs) and uses the Integral Probability Metrics to maximize the p-Wasserstein […] 1B. SNB inherits the overall architecture of BITES, with the MLPs replaced by self-normalizing neural networks (SNNs) (26). The neuron activations of SNNs automatically converge toward zero mean and unit variance, which in turn avoids exploding and vanishing gradients. Therefore, the feature extraction ability and robustness of SNB are improved, which is expected to yield more accurate predictions of factual and counterfactual survival outcomes and, in turn, more accurate ITEs. The shared network calculates a balanced (debiased) latent representation using the Smoothed Optimal Transport loss (27). Each risk network represents the corresponding treatment group, akin to a T-learner.
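The self-normalizing behavior of SNNs mentioned above comes from the scaled exponential linear unit (SELU) activation. The sketch below shows that activation and a single dense layer in plain Python, purely for illustration; the helper name `snn_layer` and the weight layout are assumptions, and SNB itself is built with PyTorch rather than this code.

```python
import math

# SELU constants from the self-normalizing neural network formulation.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(z):
    """Scaled exponential linear unit applied to one pre-activation."""
    if z > 0:
        return SELU_SCALE * z
    return SELU_SCALE * SELU_ALPHA * (math.exp(z) - 1.0)


def snn_layer(inputs, weights):
    """One dense layer followed by SELU; `weights` is a list of rows."""
    return [selu(sum(w * x for w, x in zip(row, inputs))) for row in weights]


# Hypothetical two-unit layer with identity weights.
activations = snn_layer([1.0, -1.0], [[1.0, 0.0], [0.0, 1.0]])
```

Positive inputs are scaled slightly above identity, while negative inputs saturate toward a fixed negative value; together with suitably initialized weights, this is what pulls activations toward zero mean and unit variance across layers.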

Calculation of individual treatment effect
When estimating the ITEs, we can observe only one outcome per patient; the alternative scenario remains hypothetical and thus unobservable. These counterfactual outcomes therefore need to be predicted by models. The individual survival distribution, which describes the change in survival probability over time, is obtained from the predicted log hazard ratios and treatment-specific baseline hazards.
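For proportional-hazards-type models, the step from a predicted log hazard ratio to a survival curve follows the standard relation S(t | x) = exp(-H0(t) · exp(h(x))), where H0 is the cumulative baseline hazard. The sketch below applies that relation on a small time grid; the function name and the example hazard values are hypothetical, and in the setting described here each treatment arm would have its own baseline hazard.

```python
import math


def survival_curve(cum_baseline_hazard, log_hazard_ratio):
    """Return S(t | x) = exp(-H0(t) * exp(h(x))) on the grid of H0."""
    risk = math.exp(log_hazard_ratio)
    return [math.exp(-h0 * risk) for h0 in cum_baseline_hazard]


# Hypothetical cumulative baseline hazard at years 0, 5, and 10.
h0_grid = [0.0, 0.1, 0.2]
curve_reference = survival_curve(h0_grid, 0.0)            # hazard ratio 1
curve_high_risk = survival_curve(h0_grid, math.log(2.0))  # hazard ratio 2
```

A larger log hazard ratio lowers the whole curve, so two patients sharing the same baseline hazard can still have very different predicted survival.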
We define the clinically interpretable potential outcome as the area under the survival curve for an individual over a specified period (10 years), termed restricted survival time (RST). The formula can be described as:

$$\mathrm{ITE}(x) = \int_0^{t} S_1(t \mid x)\,dt - \int_0^{t} S_0(t \mid x)\,dt$$

where $t$ indicates the preset time horizon, $x$ indicates the covariates, and $S_1(t \mid x)$ and $S_0(t \mid x)$ are the predicted survival distributions for an individual under the two treatment scenarios, respectively. Individualized treatment recommendations can then be obtained based on the value of ITEs.
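The RST-based ITE described above can be sketched numerically by integrating each predicted survival curve with the trapezoidal rule and taking the difference. This is a minimal illustration under assumptions: the curves and time grid are invented, the function names are hypothetical, and the rule "recommend chemotherapy when the ITE is positive" is one plausible reading of how recommendations follow from ITE values.

```python
def restricted_survival_time(times, survival):
    """Area under S(t) between times[0] and times[-1] (trapezoidal rule)."""
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (survival[i] + survival[i - 1]) / 2.0
    return area


def individual_treatment_effect(times, surv_chemo, surv_no_chemo):
    """ITE = RST under chemotherapy minus RST without chemotherapy."""
    return (restricted_survival_time(times, surv_chemo)
            - restricted_survival_time(times, surv_no_chemo))


# Hypothetical predicted curves for one patient over a 120-month horizon.
times = [0.0, 60.0, 120.0]
surv_chemo = [1.0, 0.9, 0.8]
surv_no_chemo = [1.0, 0.8, 0.6]
ite = individual_treatment_effect(times, surv_chemo, surv_no_chemo)
recommend_chemo = ite > 0  # assumed decision rule: positive ITE
```

In this toy case the RST is 108 months with chemotherapy and 96 months without, so the ITE is 12 months in favor of chemotherapy.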

Model development, validation, and treatment recommendation
We trained five models in total: SNB, BITES, CMHE (22), DeepSurv (21), and CPH. DeepSurv and CPH were trained and used with a T-learner structure.
Initially, we selected patients diagnosed in 2010 to serve as an external testing set concealed from the models. The remaining patients were randomly allocated to a training set (70% of the samples), used for building the models, and a testing set (30% of the samples), unseen by the models and used for evaluating model performance. During training, we used five-fold cross-validation to tune the hyperparameters of the model; each time, the model was trained on four-fifths of the training set and validated on the remaining one-fifth. The training process was automatically terminated if the validation loss did not decrease in 1,000 iterations. Tuned hyperparameters included the nodes and layers of MLPs or SNNs, learning rate, mini-batch size, the strength of the Smoothed Optimal Transport loss (applicable to BITES and SNB), and the number of risk groups (applicable to CMHE). No imputation was performed because there were no missing values. Before being fed to the models, all categorical variables with more than three levels were one-hot encoded.
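The data partition described above (2010 diagnoses held out, a random 70/30 split of the rest, and five folds within the training set) might look like the following sketch. The record layout, the seed, and the function names are assumptions made for illustration; the authors' repository may organize this step differently.

```python
import random


def split_patients(patients, seed=0):
    """Hold out 2010 diagnoses, then split the rest 70/30 at random."""
    external = [p for p in patients if p["year"] == 2010]
    rest = [p for p in patients if p["year"] != 2010]
    rng = random.Random(seed)
    rng.shuffle(rest)
    cut = int(round(0.7 * len(rest)))
    return rest[:cut], rest[cut:], external


def five_folds(train):
    """Yield (fit, validation) pairs for five-fold cross-validation."""
    for k in range(5):
        validation = train[k::5]
        fit = [p for i, p in enumerate(train) if i % 5 != k]
        yield fit, validation


# Hypothetical records: only an id and the diagnosis year are needed here.
patients = [{"id": i, "year": 2010 + i % 7} for i in range(70)]
train, test, external = split_patients(patients)
folds = list(five_folds(train))
```

Holding out an entire diagnosis year, rather than a random sample, gives a test set the models cannot have seen in any form during tuning.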
To explore the recommendation effect of models, we divided the patients into the recommended (Consis.) and anti-recommended (Inconsis.) groups based on whether the actual treatment they received was consistent with the model recommendations. The multivariate hazard ratio (HR), 10-year risk difference (RD), and the difference in the 10-year restricted mean survival time (DRMST) were calculated between the Consis. and Inconsis. groups to evaluate the protective effects of models. The HR compares the relative risk of an event occurring between two groups over time; RD represents the absolute difference in event rates between two groups; and DRMST measures the change in average survival time between two groups. Overall, these metrics measure the survival advantage that following model recommendations can provide over not following them. A positive DRMST indicates longer survival in the treatment group. A positive RD suggests a higher event rate in the treatment group, while a negative RD indicates a lower rate. All models used the same ITE calculation methods. Because the Consis. group may have better prognostic factors, inverse probability treatment weighting (IPTW) was used to correct the baseline imbalance between the Consis. and Inconsis. groups. Demographic and tumor characteristics were adjusted, including age, race, marriage status, income, location, laterality, histology, grade, TNM stage, tumor size, and lymph node positivity. Treatment variables were not adjusted as they were measured after exposure (treatment recommendation) and may introduce unmeasured confounding (28).
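The grouping and weighting steps above can be sketched as follows. This is an illustration only: the propensity scores (the probability of belonging to the Consis. group given baseline covariates) are assumed to be estimated elsewhere, the positive-ITE recommendation rule is an assumption, and the study itself performed IPTW with the R package ipw rather than with this code.

```python
def consistency_groups(ites, received_chemo):
    """1 if the received treatment matches the recommendation, else 0."""
    groups = []
    for ite, received in zip(ites, received_chemo):
        recommended = 1 if ite > 0 else 0  # assumed rule: treat if ITE > 0
        groups.append(1 if recommended == received else 0)
    return groups


def iptw_weights(groups, propensity):
    """Weight 1/p for Consis. patients and 1/(1 - p) for Inconsis. ones."""
    return [1.0 / p if g == 1 else 1.0 / (1.0 - p)
            for g, p in zip(groups, propensity)]


def weighted_event_rate(events, weights, groups, group):
    """IPTW-weighted share of patients in `group` who had the event."""
    num = sum(w * e for e, w, g in zip(events, weights, groups) if g == group)
    den = sum(w for w, g in zip(weights, groups) if g == group)
    return num / den


# Hypothetical patients: predicted ITE, chemotherapy received, death event.
ites = [2.0, -1.0, 3.0, -0.5]
received_chemo = [1, 1, 0, 0]
events = [0, 1, 1, 0]
propensity = [0.5, 0.5, 0.25, 0.8]  # assumed P(Consis. | covariates)

groups = consistency_groups(ites, received_chemo)
weights = iptw_weights(groups, propensity)
rate_consis = weighted_event_rate(events, weights, groups, 1)
rate_inconsis = weighted_event_rate(events, weights, groups, 0)
```

Patients who were unlikely to end up in the group they are in receive larger weights, which makes the two weighted groups resemble each other on the adjusted baseline covariates.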
To account for the effect of covariates on relative efficacy, we calculated the linear relationship between patient characteristics and ITEs (29). Considering that the SEER database contains patients originating from different regions, a mixed-effect linear regression was used to calculate this effect. It enables the model to account for and capture regional heterogeneity, thereby improving the accuracy and generalizability of the estimates (30).

Statistical analyses
All statistical analyses were performed using R version 4.1.3 and Python version 3.8. Models were built with the Python packages PyTorch 2.0.0 and scikit-survival 0.19.0, with the main code provided by the original papers cited above. We made some improvements and integrations to the source code, which is open source on GitHub: https://github.com/xinyi1999/MyPublication. This repository documents the model code, ITE calculations, and other methods. Metrics were calculated using the R packages survival and rms. The IPTW was conducted using the R package ipw. The mixed-effect linear regression was developed using the R package lme4. Continuous variables are reported as median and interquartile range (IQR), and categorical variables are expressed as numbers and percentages (%). The log-rank test was used to compare the Kaplan-Meier (KM) curves.

Model performance
The testing set contained 2,573 patients, while the external testing set included 1,468 patients diagnosed in 2010. All performance indicators were calculated in the testing and external testing sets with a preset time horizon of 10 years. The detailed model performance is shown in Table 2.
The models predicted patients' factual and counterfactual survival purely based on baseline covariates. Then, the ITEs and subsequent treatment recommendations were obtained. The metric of interest is how much survival advantage can be gained by following model recommendations, which is reflected by the protective effect in the Consis. group compared to the Inconsis. group. We judged model performance on the IPTW-corrected metrics, as they were largely unaffected by other prognostic factors. We also compared the NCCN guidelines with the models. The NCCN guidelines recommend that TNBC patients with pT1-3pN0-1mi and tumor > 1 cm or with pN+ receive chemotherapy (8). Patients whose actual treatment was consistent with the NCCN guidelines were compared to those whose treatment was inconsistent.
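For the guideline comparison above, consistency with the NCCN rule can be expressed as a simple predicate. The sketch below is a deliberate simplification and not a clinical tool: it reduces the rule to "tumor larger than 1 cm, or node-positive," ignores the pT1-3 and pN1mi qualifiers, and uses hypothetical field names.

```python
def nccn_recommends_chemo(tumor_size_mm, node_positive):
    """Simplified rule: tumor larger than 1 cm, or node-positive disease."""
    return node_positive or tumor_size_mm > 10.0


def is_consistent(received_chemo, tumor_size_mm, node_positive):
    """True when the treatment received matches the simplified rule."""
    return received_chemo == nccn_recommends_chemo(tumor_size_mm, node_positive)


# Hypothetical patient: 12 mm tumor, node-negative, received chemotherapy.
example_consistent = is_consistent(True, 12.0, False)
```

Unlike the model-based recommendation, this rule depends on only two covariates, which is why it cannot reflect interactions with age or other baseline features.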
The KM curves of the SNB-recommended Consis. group versus the Inconsis. group in the testing and external testing sets are presented in Figures 2A,B, while those for breast cancer-specific survival (BCSS) are shown in Figures 2C,D. Better OS was observed for the Consis. group in the testing (P of log-rank test = 0.0029; P of IPTW-adjusted log-rank test = 0.0433) and external testing (P of log-rank test = 0.0490; P of IPTW-adjusted log-rank test = 0.0284) sets. The BCSS of the Consis. group was also better than that of the Inconsis. group, albeit with degraded performance (Testing set: P of log-rank test = 0.0630; P of IPTW-adjusted log-rank test = 0.0330; External testing set: P of log-rank test = 0.0031; P of IPTW-adjusted log-rank test = 0.0081).
Whether the protective effect of SNB was affected by an imbalance in treatment proportions is also of interest. Thus, the interventional natural direct effect (INDE), proposed by Diaz et al. (32), was calculated to separate the effect of SNB on OS improvement from the effect of the treatment variables. We treated the treatments (chemotherapy and surgical type) as a mediator and adjusted for baseline features. The standardized mean difference (SMD) before and after IPTW correction is shown in Supplementary Figure S1A (testing set) and Supplementary Figure S1B (external testing set). Covariates were balanced after IPTW, with between-group SMDs smaller than 0.1 (33).
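The balance check mentioned above relies on the standardized mean difference. A minimal sketch for one continuous covariate is given below, using the common definition (difference in weighted means divided by the square root of the average of the two group variances). The function names and the toy ages are hypothetical, and the exact SMD variant used for the supplementary figures is not stated in the text.

```python
import math


def weighted_mean_var(values, weights):
    """Weighted mean and weighted (population) variance."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardized_mean_difference(x_a, w_a, x_b, w_b):
    """Absolute SMD between two groups for one covariate."""
    mean_a, var_a = weighted_mean_var(x_a, w_a)
    mean_b, var_b = weighted_mean_var(x_b, w_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2.0)
    return abs(mean_a - mean_b) / pooled_sd


# Hypothetical ages in two groups, with unit weights (no IPTW applied).
ages_consis = [50.0, 60.0, 70.0]
ages_inconsis = [60.0, 70.0, 80.0]
unit = [1.0, 1.0, 1.0]
smd_unweighted = standardized_mean_difference(ages_consis, unit,
                                              ages_inconsis, unit)
balanced = smd_unweighted < 0.1  # threshold used in the text
```

Passing IPTW weights in place of the unit weights gives the post-weighting SMD, which is the quantity compared against the 0.1 threshold.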

Treatment heterogeneity
Treatment heterogeneity can be captured by the presence of very different average treatment effects (ATEs) in different subgroups, indicating that patients with different characteristics respond heterogeneously to the same treatment. Patients were divided into chemotherapy recommended (CTR) and chemotherapy not recommended (CNR) groups based on whether chemotherapy was recommended by SNB. Similarly, patients were divided into NCCN-recommended chemotherapy (NCR) and NCCN-not-recommended chemotherapy (NNR) groups according to whether they met the NCCN guidelines. This analysis was done using the combined population of the testing and external testing sets. Figure 4A shows the HR and IPTW-adjusted HR of chemotherapy in these subgroups.

Deep learning-based treatment insights
The ITE values reflect the difference in RST between chemotherapy and non-chemotherapy, indicating the additional survival time of an individual patient receiving chemotherapy. Considering that patients were from different regions, we fitted a mixed-effect linear regression that predicts ITEs from the covariates, with reporting region set as a random effect, in the combined population of the testing and external testing sets. The beta values obtained can be interpreted as follows: with other features held constant, the presence of a covariate or a one-unit increase in it increases the 10-year survival time difference of chemotherapy over no chemotherapy by beta. These results are presented in Figure 4B. For every 1 mm increase in the size of a patient's tumor, chemotherapy increases their survival time by a relative 0.05 months (95% CI, 0.04-0.06) over 10 years. Similarly, advanced age (0.11, 95% CI, 0.10-0.12) and more positive lymph nodes (0.26, 95% CI, 0.12-0.40) were associated with better efficacy of chemotherapy. Patients with tumors in the upper inner quadrant were not recommended for chemotherapy (−0.52, 95% CI, −0.82 to −0.22). Subsequently, we conducted a subgroup analysis (Supplementary Table S1), in which the efficacy of chemotherapy increased with age and tumor size.
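One way to write the regression described above is as a random-intercept model. The form below is a sketch under that assumption (the text states only that reporting region was set as a random effect); the symbols $u_j$, $\varepsilon_{ij}$, and the indices are introduced here for illustration.

```latex
% Patient i in reporting region j, covariates x_{ij1}, ..., x_{ijK}
\mathrm{ITE}_{ij} = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ijk} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Worked reading of one coefficient, using the reported tumor-size estimate:
% a tumor 10 mm larger, with all other covariates held fixed, shifts the
% predicted ITE by
10 \times \beta_{\text{size}} = 10 \times 0.05 = 0.5 \text{ months}
```

Under this reading, each beta describes a within-region association between a covariate and the predicted benefit of chemotherapy, with region-level differences absorbed by $u_j$.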

Model interpretation
We used SurvSHAP(t), the first method introduced to date that can provide a time-dependent interpretation with a solid theoretical basis (34), to interpret the functional output of SNB. Figure 4C visualizes the eight most important variables, sorted by aggregated Shapley values and rankings over 500 observations. The horizontal bars represent the number of observations in which the importance of the variable is ranked first, second, and so on, as indicated by the given color. The "treatment" variable indicates the effect of using different risk networks and baseline hazards.
Age was deemed the most important prognostic factor in 418 samples, followed by histologic grade, laterality, and surgical type.
One patient was randomly selected from the testing set and analyzed with SNB, as shown in Supplementary Figure S2. With the help of SNB, the survival probability under different treatment plans was clearly demonstrated. Based on the predicted survival distribution, various indicators of survival advantage can then be calculated, including differences in mortality, time at risk, and RST, to help users choose a more appropriate treatment plan for themselves.

Discussion
Determining which TNBC patients require adjuvant chemotherapy involves multifactorial considerations (11) and remains controversial (7). Avoiding overtreatment and individualizing treatment plans for patients are key to achieving precision medicine.
Therefore, in this study, we evaluated SNB, which outperformed state-of-the-art models, widely used alternatives, real-world physician choices, and NCCN guidelines. After correcting for biases with IPTW, following SNB recommendations was associated with roughly half the hazard of death (IPTW-adjusted HR, 0.53), outperforming alternative approaches. In addition to OS, following SNB guidance significantly improved BCSS in TNBC patients. We observed that the NCCN guidelines resulted in a positive RD and successfully identified treatment heterogeneity; however, these findings were statistically significant only in univariate metrics not corrected by IPTW. Treatment selection often needs to consider complex feature interactions rather than being based on fixed guidelines (25), and our study demonstrated that DL models are well suited to accomplish this, as evidenced by the stronger protective effect of SNB compared with the NCCN guidelines.
Artificial intelligence-guided intervention studies provide the opportunity to gain insights from DL-based treatments by interpreting model recommendations associated with ITE values. We accounted for the influence of confounding factors on treatment recommendations by keeping other covariates constant. Thus, compared to the conclusions from traditional methods, these results are largely independent of confounding factors and are quantifiable, which provides an essential basis for visualizing the impact of baseline characteristics on the relative efficacy of chemotherapy.
Consistent with previous studies (10,35), we found that for every 1 mm increase in the size of a patient's tumor, chemotherapy resulted in a relative extension of their 10-year survival time by 0.05 months. Other features, including the number of positive lymph nodes (8) and age (11), also significantly affect chemotherapy efficacy. Interestingly, we found that chemotherapy was not recommended for patients with tumor sites located in the inner lower quadrant. The relationship between chemotherapy efficacy and tumor location has rarely been discussed; past studies have only mentioned that TNBC patients with tumor sites located in the inner lower quadrant have a poorer prognosis after receiving neoadjuvant chemotherapy (36,37). Therefore, this result can only be used as a reference at present. Its reliability needs to be further investigated with more data and by more teams, which may provide clinicians with new treatment ideas.
Regarding the widely reported effects of age and tumor size, the results of our subgroup analyses are also consistent with the findings of current mainstream studies. Patients with TNBC over 65 years of age were more likely to benefit from adjuvant chemotherapy (38), whereas no statistically significant difference was found in relatively younger patients. This indicates a greater need to incorporate multiple factors in the younger population when making final treatment decisions. In addition, consistent with the NCCN guidelines, adjuvant chemotherapy improves survival in TNBC with tumor size greater than 1 cm (8). However, for patients with smaller tumor sizes, it is important to combine other factors to make the final decision (39).
Developing a survival benefit visualization tool is essential for enabling patients and physicians to make informed treatment decisions. Such a tool facilitates the visual comparison of expected outcomes from various treatment options via a graphical treatment recommendation system, which incorporates multiple individual and comparative survival metrics. However, crafting personalized treatment plans and executing visual prognostic analyses remains challenging in practice (12,40). Most current models utilize patient characteristics to generate prognostic factors, yet these are often influenced by biases from different treatments (41). The SNB model has the potential to overcome these challenges by more accurately demonstrating individual outcomes following various treatment regimens. With SNB, patients and physicians can visualize the anticipated effects of different treatment choices, which can play a pivotal role in the final decision-making process. In addition, the cost of treatment is a key consideration for patients; if the costs of various therapies are incorporated in the future, SNB could help patients identify the most cost-effective option. It is also worth mentioning that, for patients who have lost the ability to make decisions on their own, SNB can help their families objectively analyze the pros and cons of different treatment options. All of this points to the future application of DL models in clinical treatment. In the future, improvements in data quality and the inclusion of more disease types will refine these models further, laying a strong foundation for the entire field of precision medicine.

Limitations
Due to database restrictions, we could not access some important features, such as Ki67, TILs, BRCA status, the presence of positive margins, and patient treatment switching or termination. Given that such biological factors are highly important prognostic markers for the survival of TNBC patients, we strongly advocate further research on this topic once the relevant data become available. Although the absence of these data can affect treatment outcomes, the model's usefulness is expected to increase as the variety and quality of variables improve. The generalizability of our results is limited by the use of a single database for training and testing the model. This approach may introduce biases associated with demographic and geographic diversity that do not accurately reflect the entire patient population. Despite our best efforts to control for bias in the data, reliance on retrospective data inherently limits the ability to control variables and interventions that were not initially recorded, which may introduce unmeasured bias and inconsistent observation times (42). Subsequent studies are recommended to test the protective effect of the model through randomized control trials, prospective cohort studies, or target trial emulation (43). Furthermore, considering the subjective nature of patient decisions, it is vital to include additional prognostic factors, such as complications and survival benefits, with anticipated improvements in database variables facilitating this comprehensive approach.

Conclusion
To the best of our knowledge, this is the first study to develop individualized adjuvant chemotherapy recommendations for TNBC patients using DL. Moreover, our study supports SNB's potential to enhance clinical treatment decision-making and offer quantitative treatment insights. The model predicted enhanced chemotherapy efficacy in patients with older age, larger tumors, and a higher number of positive lymph nodes.

FIGURE 1 Inclusion process and model architecture. (A) Selection and exclusion criteria for patient inclusion. (B) The model architecture of Self-Normalizing Balanced (SNB) individual treatment effect for survival data.

FIGURE 2 The Kaplan-Meier curves of the Consis. and Inconsis. groups. (A) Kaplan-Meier curves comparing the overall survival between the Consis. group and the Inconsis. group in the testing set. The p-values are derived from the log-rank test (p = 0.0029) and the IPTW-adjusted log-rank test (IPTW-adjusted p = 0.0433). (B) Kaplan-Meier curves comparing the overall survival between the Consis. group and the Inconsis. group in the external testing set. The p-values are from the log-rank test (p = 0.0490) and the IPTW-adjusted log-rank test (IPTW-adjusted p = 0.0284). (C) Kaplan-Meier curves comparing the breast cancer-specific survival between the Consis. group and the Inconsis. group in the testing set. The p-values are derived from the log-rank test (p = 0.063) and the IPTW-adjusted log-rank test (IPTW-adjusted p = 0.0330). (D) Kaplan-Meier curves comparing the breast cancer-specific survival between the Consis. group and the Inconsis. group in the external testing set. The p-values are from the log-rank test (p = 0.0031) and the IPTW-adjusted log-rank test (IPTW-adjusted p = 0.0081).

FIGURE 3 Causal paths illustrating the effects of Self-Normalizing Balanced (SNB) individual treatment effect for survival data on survival outcomes. (A) Causal path diagram showing the effects of SNB on overall survival (OS) in the testing set. The diagram quantifies the Interventional Natural Direct Effect (INDE: −0.06; 95% CI, −0.07 to −0.05) and Interventional Natural Indirect Effect (INIE: 0.01; 95% CI, 0.00 to 0.01), illustrating the direct and mediated impacts on OS, adjusted for baseline features as confounders. (B) Causal path diagram in the external testing set, similarly detailing SNB's impact on OS. The diagram displays the INDE (−0.06; 95% CI, −0.08 to −0.05) and INIE (0.01; 95% CI, 0.00 to 0.01), highlighting the robustness of SNB's effect across different testing scenarios. X indicates patients' baseline features, which were adjusted as intermediate confounders.

FIGURE 4 Model interpretation of treatment effects and variable impact. (A) Average treatment effect (ATE) and treatment heterogeneity. Hazard ratios (HRs) and IPTW-adjusted HRs demonstrate the ATE of chemotherapy across different patient groups, including chemotherapy recommended (CTR) and not recommended (CNR) groups classified by SNB, as well as groups classified according to the NCCN guidelines (NCR and NNR). (B) Interpretation of model recommendation behavior. Beta values are from a mixed-effect linear regression predicting individual treatment effects from covariates, with the region as a random effect, in the combined testing and external testing sets. These beta values indicate the impact of a one-unit increase in covariates on the survival time difference over 10 years between chemotherapy and no chemotherapy. (C) Interpretation of overall output using SurvSHAP(t). The panel visualizes the aggregation of the eight most important variables influencing treatment decisions, based on the Shapley values derived from the SurvSHAP(t) analysis over 500 observations. This diagram details how variables rank in terms of importance across multiple observations.

SNB, Self-Normalizing Balanced individual treatment effect for survival data; BITES, Balanced Individual Treatment Effect for Survival data; CMHE, Cox Mixtures with Heterogeneous Effects; CPH, Cox proportional hazards model; NCCN, National Comprehensive Cancer Network treatment guidelines; IBS, integrated Brier score; HR, hazard ratio; RD, 10-year risk difference; DRMST, the difference in the 10-year restricted mean survival time. a Integrated Brier score in the non-chemotherapy group; b Integrated Brier score in the chemotherapy group. The bold font indicates that the model performs best in this metric. Patients with pT1-3pN0-1mi and tumor > 1 cm or with pN+ are recommended to receive adjuvant chemotherapy according to the NCCN guidelines.