A Risk Stratification Model for Predicting Overall Survival and Surgical Benefit in Triple-Negative Breast Cancer Patients With de novo Distant Metastasis

Background and Aims: This research aimed to construct a novel model for predicting overall survival (OS) and surgical benefit in triple-negative breast cancer (TNBC) patients with de novo distant metastasis. Methods: We collected data from the Surveillance, Epidemiology, and End Results (SEER) database for TNBC patients with distant metastasis diagnosed between 2010 and 2016. Patients were excluded if data on metastatic status, follow-up time, or clinicopathological information were incomplete. Univariate and multivariate analyses were applied to identify significant prognostic parameters. By integrating these variables, a predictive nomogram and risk stratification model were constructed and assessed with C-indexes and calibration curves. Results: A total of 1,737 patients were included. Patients enrolled from 2010 to 2014 were randomly assigned to two cohorts, 918 patients in the training cohort and 306 patients in validation cohort I, and 513 patients enrolled from 2015 to 2016 were assigned to validation cohort II. Seven clinicopathological factors were included as prognostic variables in the nomogram: age, marital status, T stage, bone metastasis, brain metastasis, liver metastasis, and lung metastasis. The C-indexes were 0.72 (95% confidence interval [CI] 0.68–0.76) in the training cohort, 0.71 (95% CI 0.68–0.74) in validation cohort I and 0.71 (95% CI 0.67–0.75) in validation cohort II. Calibration plots indicated that the nomogram-based predictions were consistent with the recorded prognosis. A risk stratification model was further generated to divide patients into three prognostic groups. In all cohorts, the median overall survival time in the low-, intermediate- and high-risk groups was 17.0 months (95% CI 15.6–18.4), 11.0 months (95% CI 10.0–12.0), and 6.0 months (95% CI 4.7–7.3), respectively. Locoregional surgery improved prognosis in both the low-risk (hazard ratio [HR] 0.49, 95% CI 0.41–0.60, P < 0.0001) and intermediate-risk groups (HR 0.55, 95% CI 0.46–0.67, P < 0.0001), but not in the high-risk group (HR 0.73, 95% CI 0.52–1.03, P = 0.068). All stratified groups benefited prognostically from chemotherapy (low-risk group: HR 0.50, 95% CI 0.35–0.69, P < 0.0001; intermediate-risk group: HR 0.34, 95% CI 0.26–0.44, P < 0.0001; and high-risk group: HR 0.16, 95% CI 0.10–0.25, P < 0.0001). Conclusion: A predictive nomogram and risk stratification model were constructed to assess prognosis in TNBC patients with de novo distant metastasis; these tools may provide additional insight to support therapeutic decisions and further studies.


INTRODUCTION
Triple-negative breast cancer (TNBC) is a biologically aggressive disease that accounts for ∼15% of breast malignancies (1). Despite the rapid development of treatment methods such as surgery, chemotherapy and immunotherapy, TNBC remains a common cause of cancer-related death, mainly due to distant metastasis (2).
Cancer metastasis is a complicated process involving several stages, such as invasion of the extracellular matrix, epithelial-mesenchymal transition, angiogenesis, immune evasion, and distal colonization (3). During distant metastasis, cancer cells (the "seed") usually escape from the primary site and adapt to the distant microenvironment (the "soil"), a process that can be mediated by the "seed and soil" interaction (4). Furthermore, distant target organs can be changed and prepared for the arrest and colonization of circulating cancer cells (5,6). In triple-negative breast cancer, several studies have indicated that different genes mediate tumor cell metastasis to bone, lung, brain or liver tissue, resulting in organ-specific metastatic heterogeneity (7-10).
In the real world, metastatic TNBC is a heterogeneous neoplasm with diverse prognostic outcomes, which can be influenced by demographic features, including age, race and marital status, as well as clinicopathological parameters (for example, tumor size, grade, and clinical treatment) (11-14). Different metastatic sites can also influence the survival outcomes of TNBC. For instance, visceral metastasis results in a poorer prognosis than bone metastasis (15). Thus, given the clinicopathological factors that may influence patient survival, it is vital to construct a comprehensive analytic model to accurately estimate the prognostic outcome of every patient. Such a predictive model can help physicians make therapeutic decisions and design clinical trials.
In recent years, the nomogram has become a widely used predictive model for assessing prognostic outcome, especially in cancer patients (16-20). Several nomograms have been established for predicting the risk of recurrence, the benefit of radiation or the response to neoadjuvant chemotherapy in breast cancer (21-23).

Cohort Population and Data Processing
This was a retrospective study based on data from the Surveillance, Epidemiology, and End Results (SEER) database. In this study, case selection was conducted on the basis of the following inclusion and exclusion criteria.
Inclusion criteria: (1) pathological diagnosis was made between 2010 and 2016; (2) molecular subtype of triple-negative breast cancer; and (3) at least one distant site of de novo metastasis.

Statistical Analysis
We randomly assigned the patients enrolled from 2010 to 2014 to two cohorts, the training cohort and validation cohort I, at a ratio of three to one, and assigned the patients enrolled from 2015 to 2016 to validation cohort II. Descriptive statistics were used to summarize the clinicopathological features of the three cohorts. Overall survival (OS) was compared among subgroups with the Kaplan-Meier method and log-rank tests. Multivariate modeling was then conducted to identify independent predictors of survival. To account for potential competing risks, breast cancer-specific survival (BCSS) was further analyzed with univariate and multivariate models, and cumulative mortality curves were generated to assess the impact of competing mortality. Statistical significance was defined as a two-sided P < 0.05. Statistical analyses were performed with SPSS 22.0.
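The cohort assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the record layout (a dict with a `year` key), the function name `assign_cohorts`, and the fixed seed are all hypothetical, and the patients here are synthetic placeholders sized to match the reported cohorts.

```python
import random


def assign_cohorts(patients, seed=0):
    """Split patient records into (training, validation I, validation II).

    `patients` is a list of dicts with a hypothetical "year" key holding
    the year of diagnosis; the key name is an assumption for illustration.
    """
    early = [p for p in patients if 2010 <= p["year"] <= 2014]
    late = [p for p in patients if 2015 <= p["year"] <= 2016]

    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = early[:]         # copy so the caller's list is untouched
    rng.shuffle(shuffled)

    n_train = round(len(shuffled) * 3 / 4)  # three-to-one ratio
    return shuffled[:n_train], shuffled[n_train:], late


# Synthetic records sized like the cohorts reported in the paper.
patients = [{"id": i, "year": 2010 + i % 5} for i in range(1224)]
patients += [{"id": 1224 + i, "year": 2015 + i % 2} for i in range(513)]

training, validation_1, validation_2 = assign_cohorts(patients)
print(len(training), len(validation_1), len(validation_2))
```

With 1,224 patients from 2010 to 2014, a three-to-one split yields the 918 and 306 patients reported for the training cohort and validation cohort I; the later-period patients form a temporally separate validation cohort.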
Based on the results of the multivariate model, a nomogram was constructed with the rms and survival packages in R. We used 2-, 3-, and 5-year OS as endpoints in the nomogram. One thousand bootstrap resamples were used to calculate C-indexes and generate calibration plots, which assessed the predictive accuracy of the nomogram. Furthermore, a risk stratification model was developed on the basis of each patient's total score in the nomogram to divide all cases into three prognostic groups.
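For readers unfamiliar with the C-index, the following sketch shows one common definition (Harrell's concordance index) together with a simple percentile bootstrap interval. It is an assumption-laden illustration: the study used R packages, whose bootstrap procedure may differ from the percentile approach shown here, and the function names and toy data are hypothetical.

```python
import random


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before patient j's follow-up time.
    Returns None when no comparable pair exists.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk died earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    return concordant / comparable if comparable else None


def bootstrap_c_index(times, events, risk_scores, n_resamples=1000, seed=0):
    """Percentile 95% interval for the C-index from bootstrap resamples."""
    rng = random.Random(seed)
    n = len(times)
    values = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        c = concordance_index([times[k] for k in idx],
                              [events[k] for k in idx],
                              [risk_scores[k] for k in idx])
        if c is not None:
            values.append(c)
    values.sort()
    low = values[int(0.025 * len(values))]
    high = values[min(int(0.975 * len(values)), len(values) - 1)]
    return low, high


# Toy data: survival in months, event flag (1 = death), nomogram-like score.
times = [3, 6, 9, 12, 20, 24]
events = [1, 1, 0, 1, 1, 0]
scores = [260, 240, 150, 180, 90, 60]
print(concordance_index(times, events, scores))
print(bootstrap_c_index(times, events, scores, n_resamples=200))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.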

Patient Characteristics
The flowchart of the patient selection process is shown in Figure 1. In total, 1,737 patients met the selection criteria: 918 patients in the training set, 306 patients in validation set I and 513 patients in validation set II. The patients' baseline clinicopathological features and OS data were summarized for each subgroup. In terms of the different metastatic sites, 41.4% (380/918), 9.0% (83/918), 28.8% (264/918), and 41.9% (385/918) of the patients in the training set had metastasis to the bone, brain, liver and lung, respectively. The median overall survival time was 11.0 (95% CI 9.6-12.4), 6.0 (95% CI 3.5-8.5), 9.0 (95% CI 7.3-10.7), and 12.0 (95% CI 10.6-13.4) months for patients with bone, brain, liver and lung metastasis, respectively.
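Median survival times such as those above are typically read off a Kaplan-Meier curve as the first time at which the estimated survival probability falls to 0.5 or below. The sketch below illustrates that calculation on made-up data; it assumes (the text does not say so explicitly) that the reported medians were obtained this way, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events[i]` is 1 for an observed death and 0 for a censored patient.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy follow-up data in months; the third patient is censored at 5 months.
times = [2, 4, 5, 6, 8, 10]
events = [1, 1, 0, 1, 1, 1]
print(median_survival(kaplan_meier(times, events)))
```

If the curve never reaches 0.5 within follow-up, the median is undefined, which is why the helper returns `None` in that case.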

Nomogram Construction and Validation
A predictive nomogram integrating seven independent risk factors for prognosis was constructed (Figure 3), and scores were assigned to the clinical variables in each subgroup (Table 3). Among all included variables, brain metastasis had the highest score (100), followed by liver metastasis (score 99), T stage (T4: score 72; T3: score 11; T2: score 7), age (≥70: score 70; 50-69: score 23), bone metastasis (score 63), lung metastasis (score 47), and marital status (unmarried: score 36). The total score of an individual patient was obtained by adding the scores for all of the patient's clinical variables. The likelihood of 2-, 3-, and 5-year OS could then be obtained by locating the total score on the "total points" axis and drawing a straight line down to the survival axes (Figure 3).
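The scoring rule above amounts to a simple sum, sketched below. The point values are those quoted in the text; everything else is an assumption made for illustration: that the unlisted reference categories (age under 50, married, T1, no metastasis at a given site) contribute 0 points, and the function and variable names.

```python
# Points quoted in the text; T1 = 0 is an assumed reference category.
T_STAGE_POINTS = {"T1": 0, "T2": 7, "T3": 11, "T4": 72}
METASTASIS_POINTS = {"brain": 100, "liver": 99, "bone": 63, "lung": 47}
UNMARRIED_POINTS = 36


def age_points(age):
    """Age bands from the text; under 50 is assumed to score 0."""
    if age >= 70:
        return 70
    if age >= 50:
        return 23
    return 0


def total_points(age, married, t_stage, metastatic_sites):
    """Sum the nomogram points for one patient."""
    points = age_points(age)
    points += 0 if married else UNMARRIED_POINTS
    points += T_STAGE_POINTS[t_stage]
    points += sum(METASTASIS_POINTS[site] for site in metastatic_sites)
    return points


# A 45-year-old married patient with a T2 tumor and bone metastasis only.
print(total_points(45, True, "T2", ["bone"]))              # 7 + 63 = 70
# A 72-year-old unmarried patient, T4, brain and liver metastases.
print(total_points(72, False, "T4", ["brain", "liver"]))   # 70 + 36 + 72 + 100 + 99 = 377
```

Converting a total score into a 2-, 3-, or 5-year survival probability still requires the nomogram axes in Figure 3; that mapping is not reproduced here.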
The C-indexes in the training (0.72, 95% CI 0.68-0.76), validation I (0.71, 95% CI 0.68-0.74), and validation II (0.71, 95% CI 0.67-0.75) cohorts suggested acceptable predictive accuracy of the model. The calibration plots in the training set suggested that the predicted outcome agreed well with the recorded survival results (Figures 4A,B). The calibration curves in validation sets I and II also showed that the nomogram-based predictions were consistent with the recorded prognosis (Figures 4C-F).

Risk Stratification Model
A risk stratification model was then generated from each patient's total nomogram score, dividing all patients into three prognostic groups: a low-risk group (792/1,737, 45.6%; total score < 150), an intermediate-risk group (692/1,737, 39.8%; total score 150-249), and a high-risk group (253/1,737, 14.6%; total score ≥ 250) (Figure 3). In all cohorts, the median overall survival time in the low-, intermediate- and high-risk groups was 17.0 months (95% CI 15.6-18.4), 11.0 months (95% CI 10.0-12.0), and 6.0 months (95% CI 4.7-7.3), respectively. Kaplan-Meier analysis indicated that the risk stratification model could accurately differentiate survival among the three prognostic groups (Figures 5A-C). Cumulative mortality curves were generated to assess the impact of competing events. There was no significant difference in competing mortality in any cohort (P > 0.05) (Figures 5D-F), indicating that the primary outcome in this research was not affected by potential competing risks.
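The cut-offs above translate directly into a lookup, sketched here as an illustration. The thresholds and median survival figures are taken from the text; the function name and the dictionary layout are hypothetical.

```python
# Median OS in months per group, as reported in the text.
MEDIAN_OS_MONTHS = {"low": 17.0, "intermediate": 11.0, "high": 6.0}


def risk_group(total_score):
    """Map a total nomogram score to the risk group defined in the text."""
    if total_score < 150:
        return "low"
    if total_score < 250:       # 150-249
        return "intermediate"
    return "high"               # 250 and above


for score in (70, 150, 377):
    group = risk_group(score)
    print(score, group, MEDIAN_OS_MONTHS[group])
```

The reported medians are group-level summaries, so the lookup indicates which stratum a patient falls into rather than an individual survival prediction.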

Survival Benefit of Surgery and Systemic Therapy in Stratified Risk Groups
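To further assess the survival benefit of surgery, Kaplan-Meier curves were generated in the stratified risk groups. Locoregional surgery improved prognosis in both the low-risk (hazard ratio [HR] 0.49, 95% CI 0.41-0.60, P < 0.0001) and intermediate-risk groups (HR 0.55, 95% CI 0.46-0.67, P < 0.0001) (Figures 6A,B). However, surgery did not significantly improve prognosis in the high-risk group (HR 0.73, 95% CI 0.52-1.03, P = 0.068) (Figure 6C). All stratified groups benefited prognostically from chemotherapy (low-risk group: HR 0.50, 95% CI 0.35-0.69, P < 0.0001; intermediate-risk group: HR 0.34, 95% CI 0.26-0.44, P < 0.0001; high-risk group: HR 0.16, 95% CI 0.10-0.25, P < 0.0001) (Figures 7A-C).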
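A hazard ratio whose 95% CI includes 1 is not significant at the 0.05 level, which is why the high-risk result (HR 0.73, 95% CI 0.52-1.03) is read as no demonstrated benefit. The sketch below back-calculates an approximate Wald P value from a reported HR and CI. It assumes the interval is symmetric on the log scale with a critical value of 1.96; the paper does not state which test produced its P values, so small differences from the reported figures are expected. The function name is hypothetical.

```python
import math


def wald_p_from_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided P value from a hazard ratio and its 95% CI.

    Assumes the CI is symmetric on the log scale (a Wald interval).
    Returns (z statistic, p value).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p


# High-risk group, surgery vs. no surgery (values from the text).
z_high, p_high = wald_p_from_ci(0.73, 0.52, 1.03)
print(round(z_high, 2), round(p_high, 3))

# Low-risk surgery estimate (HR 0.49, 95% CI 0.41-0.60).
z_low, p_low = wald_p_from_ci(0.49, 0.41, 0.60)
print(p_low < 0.0001)
```

For the high-risk group this approximation gives a P value of about 0.07, in line with the reported 0.068.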

DISCUSSION
In the present study, a nomogram was constructed and validated for predicting survival outcomes in TNBC patients with distant metastasis. We included 1,737 patients and identified seven demographic and clinicopathological features as prognostic factors: age, marital status, T stage, and bone/brain/liver/lung metastasis. C-index assessment and calibration curves suggested that the nomogram had acceptable predictive accuracy. Moreover, a risk stratification model was generated on the basis of each patient's total score from the nomogram, and the survival benefits of therapeutic choices were analyzed in the classified risk groups.
To the best of our knowledge, this is the first large-cohort, comprehensive retrospective study to develop a predictive nomogram for the prognosis of TNBC patients with distant organ metastasis. Our prognostic model can feasibly be applied in clinical practice to predict the survival probability of each individual patient and to remind doctors of the expected benefits of different treatments. Furthermore, the newly established risk stratification system identifies high-risk patients who need additional adjuvant therapies. Follow-up intervals can be shortened in the high-risk subgroup to allow timely adjustment of treatment protocols. In the meantime, these high-risk patients can also be encouraged to take part in ongoing clinical trials of novel drugs. Moreover, this predictive tool can help control confounding bias in research design, especially in studies with overall survival as the primary endpoint. In brief, we believe that the patients enrolled for nomogram construction represent the majority of metastatic TNBC patients, which supports the translational value of this predictive model in real situations.
In our findings, demographic features (age and marital status) and clinicopathological variables (T stage and bone/brain/liver/lung metastasis) were independent prognostic factors, consistent with previous publications (11,24,25). Among the distant metastatic sites, brain metastasis was associated with the poorest prognosis, followed by liver, lung and bone metastasis. A previous large-cohort study considered breast cancer patients as a whole population and showed a similar trend in the influence of different distant metastatic sites on patient survival (13).
The standard treatment for TNBC patients with de novo distant metastasis usually consists of palliative systemic therapies such as chemotherapy. The survival benefit of locoregional resection remains controversial. The multicenter, phase III, randomized controlled trial MF07-01 indicated that locoregional treatment could improve 5-year survival in de novo stage IV breast cancer patients (26). A recently published multicenter retrospective study in France indicated that locoregional treatment improved overall survival in breast cancer patients with synchronous metastasis, especially in patients with the HR-positive/HER2-negative and HER2-positive molecular subtypes (27). Another retrospective study in Chinese patients showed that surgical removal of the primary tumor could improve the prognosis of patients with bone metastasis alone (28). Importantly, surgery can offer solid pathological evidence for molecular classification, alleviate clinical symptoms and reduce tumor burden. However, not all patients obtain a survival benefit from locoregional therapy. The ABCSG-28 trial did not show a survival benefit for locoregional surgery in de novo metastatic breast cancer (29). Another open-label randomized controlled trial in India also found that breast surgery did not prolong survival in patients with primary metastasis (30). Thus, personal demographic and clinicopathological parameters need to be considered carefully when making a therapeutic decision for each patient. It is vital to construct a risk stratification model integrating all these parameters to precisely identify those patients who can prognostically benefit from locoregional resection. Notably, in our model, surgery improved the survival outcome only in the low- and intermediate-risk groups, not in the high-risk group, which provides more accurate information for therapeutic decisions.
To our knowledge, this research is among the first studies to construct a predictive nomogram for general metastatic TNBC patients. However, the present research has several limitations. The first is the retrospective nature of SEER-based research. Second, information about some potential prognostic parameters, such as the Eastern Cooperative Oncology Group (ECOG) performance status score, the detailed chemotherapy protocol and multigene signature assessment, was not provided in the database (31-33). In addition, the database only included information on de novo distant metastasis. Some patients may have developed metachronous metastasis during follow-up, which cannot be determined from the database. Furthermore, only patients diagnosed from 2010 to 2016 were enrolled for analysis, since distant metastatic locations and molecular classification have been recorded in the SEER database only since 2010. Last, the majority of enrolled patients were Caucasian or black, so the nomogram needs to be validated in external cohorts, especially in Asian patients. Thus, we suggest that further prospective studies be performed and that more prognostic variables be considered to improve our predictive model.
In summary, a novel predictive nomogram and risk stratification model were constructed for predicting individual survival in TNBC patients with de novo distant metastasis. This prognostic model may help clinical physicians make better decisions and may help in the design of future prospective studies.

DATA AVAILABILITY STATEMENT
The datasets analyzed for this study can be found in the SEER database (https://seer.cancer.gov/).

ETHICS STATEMENT
This research was based on publicly available data from the SEER database, and the data-use agreement was signed. Patients' informed consent was not required because there was no direct interaction with patients and no personally identifiable information was used in this study. In addition, this research was conducted in compliance with the Declaration of Helsinki.

AUTHOR CONTRIBUTIONS
ZW, HW, and X-SC designed this study. ZW, XS, and YF performed the search and collected data. S-SL and S-ND rechecked data. ZW and XS performed analysis and wrote the manuscript. X-SC and K-WS helped to revise the manuscript. All authors approved the final version of the manuscript.