Prognostic Factor Analysis and Model Construction of Triple-Negative Metaplastic Breast Carcinoma After Surgery

Objective: This study aimed to analyze the prognostic factors of patients with triple-negative (TN) metaplastic breast carcinoma (MpBC) after surgery and to construct a nomogram for predicting 3-, 5-, and 8-year overall survival (OS).
Methods: A total of 998 patients extracted from the Surveillance, Epidemiology, and End Results (SEER) database were randomly assigned to a training or validation group in a 7:3 ratio. The clinical characteristics of patients in the training and validation sets were compared, and multivariate Cox regression analysis was used to identify the independent risk variables for the OS of patients with TN MpBC after surgery. The selected variables were then examined with Kaplan-Meier (KM) curves and the log-rank test. The nomogram for predicting OS was constructed and validated using the concordance index (C-index), receiver operating characteristic (ROC) curves with the area under the curve (AUC), calibration curves, and decision curve analyses (DCAs). Patients were then stratified into high-risk and low-risk groups, and KM curves were plotted for each group.
Results: Multivariate Cox regression analysis indicated that age, marital status, clinical stage at diagnosis, surgery type, chemotherapy, and regional node status were independent predictors of prognosis in patients with MpBC after surgery. Separate KM curves for the screened variables gave results consistent with the Cox regression analysis. A prediction model was built and visualized as a nomogram based on these findings. For the training and validation cohorts, the C-index of the nomogram was 0.730 and 0.719, respectively. The AUC values for 3-, 5-, and 8-year OS were 0.758, 0.757, and 0.785 in the training group and 0.736, 0.735, and 0.736 in the validation group, respectively. The calibration curves showed close agreement between observed and predicted OS. The clinical applicability of the nomogram was further supported by the DCA. In both the training and validation sets, the KM curves for the risk subgroups revealed substantial differences in survival probabilities (P <0.001).
Conclusion: This study presents a nomogram, built from a Cox regression model based on the SEER database, that can be used to predict the prognosis of patients with TN MpBC after surgery.


INTRODUCTION
Breast cancer (BC) is a leading cause of cancer death in women, especially in women aged 20-59 years (1). Metaplastic breast cancer (MpBC) is a rare and highly heterogeneous histological subtype of BC, accounting for approximately 0.3 to 1% of all BC (2). Although MpBC is infrequent, it usually carries a poor prognosis and consequently a relatively high mortality rate (3). The term "metaplastic cancer" was first introduced by Huvos et al. in 1973 (4). As defined by the World Health Organization (WHO), MpBC combines at least two histological cell types with metaplastic changes to squamous and/or mesenchymal elements (3). These tumors are highly heterogeneous, with elements ranging from chondroid, osseous, and spindle to squamous and rhabdomyoid (5). According to their biological behavior and histopathological characteristics, MpBCs can be divided into high-grade and low-grade malignancies (6). Most MpBCs are triple negative (TN) in phenotype, lacking expression of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) (7).
Triple-negative breast cancer (TNBC) is generally considered a subtype with more aggressive behavior, a greater chance of recurrence, and a worse prognosis (8). Triple-negative metaplastic breast cancer (TN MpBC) is more resistant than other TNBCs to conventional chemotherapy and carries a worse prognosis, with roughly double the risk of local recurrence and distant metastasis (34% vs. 15.5%) (9). A study of 59,519 patients by Giovanni et al. showed that MpBC was associated with worse OS than TNBC, with a significant 40% increase in the risk of death (10). TN MpBC differs from other common BCs in its pathological and clinical features. However, its prognostic and predictive factors remain largely unknown. Therefore, an accurate prediction model for TN MpBC is urgently needed.
Nomograms have recently been widely adopted as a forecasting method across oncology, for example in endometrial sarcoma (11), BC (12), lung cancer (13), gallbladder cancer (14), prostate cancer (15), and kidney carcinoma (16). They have shown high discriminative ability in predicting survival on validation and meet the requirements for an integrated model. By providing a concise graphical tool, a nomogram can assist clinicians and patients in making informed treatment decisions. The Surveillance, Epidemiology, and End Results (SEER) database is one of the most comprehensive large-scale tumor registries in North America, gathering a vast quantity of evidence-based and medicine-related data and covering approximately one-third of the U.S. population (17). In this study, we used the SEER database to create a nomogram that could predict the prognosis of patients with TN MpBC.

Patient Selection Criteria
Patients diagnosed with TN MpBC between 2010 and 2015 were identified in the SEER database, which draws on 18 population-based cancer registries, using SEER*Stat (version 8.3.9).
Inclusion criteria were as follows: ICD-O-3 histology code 8575, triple-negative breast subtype, and surgery performed. Patients with unknown adjusted AJCC 6th N category, unknown adjusted AJCC 6th stage, unknown race recode, unknown tumor size, or unknown regional node positivity were excluded from this study.

Cohort Definition and Variable Declaration
Eligible patients were randomly divided into training and validation cohorts in a 7:3 ratio using the "caret" package in R version 4.1.1. The training cohort was used to build the prediction model, and the validation cohort was used to confirm its accuracy and applicability. Thirteen variables were recorded to describe the characteristics of patients: age, sex, marital status at diagnosis, ethnicity, tumor size, clinical stage, surgery type, chemotherapy, radiotherapy, laterality of tumor, primary site, historic stage, and positive regional nodes. Age was dichotomized at 63 years. Surgery types included breast-conserving surgery (BCS) and mastectomy. The clinical stage was assigned according to the American Joint Committee on Cancer (AJCC) 6th Edition breast cancer staging criteria. Tumor size was divided into three categories: <20, 20-50, and >50 mm. Primary sites were classified as the central portion of the breast, the lower-inner, lower-outer, upper-inner, or upper-outer quadrant of the breast, and others. The historic stage was classified as localized, regional, or distant. Regional node status was recorded as negative or positive.
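As an illustration of the cohort definition described above, the following minimal Python sketch performs a simple random 7:3 split and applies the age and tumor size groupings. It is not the authors' R code: the function names, the fixed seed, and the handling of boundary values (63 years, 20 mm, 50 mm) are assumptions, and a plain shuffle is used here, which may differ from the partitioning routine in "caret".

```python
import random

def split_train_validation(patient_ids, train_fraction=0.7, seed=2022):
    """Shuffle patient IDs and split them into training and validation lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

def age_group(age_years, cutoff=63):
    """Dichotomize age at 63 years; assigning exactly 63 to the older group is an assumption."""
    return ">=63" if age_years >= cutoff else "<63"

def tumor_size_group(size_mm):
    """Three size categories from the text: <20, 20-50, and >50 mm."""
    if size_mm < 20:
        return "<20"
    if size_mm <= 50:
        return "20-50"
    return ">50"

# Hypothetical patient IDs 0..997 standing in for the 998 eligible patients.
train, validation = split_train_validation(range(998))
print(len(train), len(validation))  # 699 299
print(age_group(70), tumor_size_group(35))  # >=63 20-50
```

The group sizes printed here follow only from rounding 998 x 0.7 in this sketch and are not the cohort sizes reported in Table 1.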

Development of the Prognosis Model
Statistical analyses were conducted using R software (4.1.2) and SPSS 21.0. Candidate factors were screened using univariate Cox analysis, and variables with P <0.05 were entered into the multivariate Cox proportional hazards regression model. For each variable, the hazard ratio (HR) and corresponding 95% confidence interval (CI) were calculated.
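To make the reporting of HRs concrete, the sketch below shows the usual conversion from a Cox regression coefficient and its standard error to an HR, a Wald 95% CI, and a two-sided Wald P value. The coefficient and standard error are hypothetical values chosen for illustration, not estimates from Table 2, and the function names are assumptions.

```python
import math

def hazard_ratio_with_ci(beta, se, z=1.96):
    """Convert a Cox log-hazard coefficient and its standard error to HR and 95% CI."""
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return hr, lower, upper

def wald_p_value(beta, se):
    """Two-sided P value of the Wald test, using the normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical coefficient and standard error, not taken from the study.
hr, lo, hi = hazard_ratio_with_ci(beta=0.50, se=0.20)
print(f"HR = {hr:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # HR = 1.65 (95% CI 1.11-2.44)
print(f"P = {wald_p_value(0.50, 0.20):.3f}")
```

A CI that excludes 1 corresponds to a Wald P value below 0.05, which is the threshold used for variable screening in the text.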
Based on the results of the multivariate analysis, the independent risk variables were included in the nomogram to estimate the probability of 3-, 5-, and 8-year OS in MpBC following comprehensive therapy. The "rms" package was used to plot the nomogram. Patients were stratified into high-risk and low-risk groups, and Kaplan-Meier (KM) curves with the log-rank test, generated with the "survival" package in R, were used to assess the significance of the difference in overall survival (OS) between the two groups. OS was defined as the interval between the date of diagnosis and the date of death from any cause or the last follow-up.
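The KM estimate and the two-group log-rank test mentioned above can be sketched in plain Python as follows. This is a didactic re-implementation under simple assumptions (right-censored data, event coded 1 for death and 0 for censoring, two groups only), not the "survival" package routines used in the study, and the follow-up times are hypothetical.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time (events: 1 = death, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value) with 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    death_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in death_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A just before t
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p_value

# Hypothetical follow-up in months for two small groups.
high_t, high_e = [4, 9, 14, 20, 31], [1, 1, 1, 0, 1]
low_t, low_e = [25, 40, 52, 60, 72], [0, 1, 0, 0, 0]
print(kaplan_meier(high_t, high_e))
print(logrank_test(high_t, high_e, low_t, low_e))
```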

Validation of Nomogram
The nomogram was then validated using several approaches. The concordance index (C-index) was calculated to measure predictive accuracy and discrimination. Receiver operating characteristic (ROC) curves were plotted, and the area under the curve (AUC) was calculated to further examine predictive accuracy. C-index and AUC values greater than 0.7 are commonly taken to indicate acceptable discrimination. Calibration curves were plotted to assess the agreement between predicted probabilities and observed outcome frequencies. Decision curve analyses (DCAs) were performed to evaluate the clinical applicability and net benefit of the nomogram.
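Two of the validation measures above can be illustrated with short Python functions: Harrell's C-index for right-censored data and the net benefit that underlies a decision curve. Both are simplified sketches with assumed function names and hypothetical data. In particular, the net benefit function treats the outcome at a fixed time horizon as a known binary value and ignores censoring, which a full survival DCA would need to handle.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ordered correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i died strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable

def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    n = len(outcomes)
    true_pos = sum(1 for r, y in zip(predicted_risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(predicted_risks, outcomes) if r >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)

# Hypothetical data: follow-up in months, death indicator, and model risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risks))  # 0.8

# Hypothetical predicted risks of death by a fixed horizon and observed outcomes.
print(net_benefit([0.9, 0.8, 0.1, 0.2], [1, 1, 0, 0], threshold=0.5))  # 0.5
```

A decision curve is obtained by evaluating the net benefit over a range of thresholds and comparing it with the "treat all" and "treat none" strategies.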

Patient Characteristics
Data on 998 patients with TN MpBC were extracted from the SEER database according to the screening criteria and randomized into training and validation groups in a 7:3 ratio. Table 1 summarizes the demographic and clinical baseline features of these individuals. Almost all included patients were women (99.6%). The median age of the patients included in this study was 63 years. [...] were not, and the rest of the information remained unknown. Additionally, most patients (73.7%) had no regional node metastasis. Approximately 50.6 and 49.4% of the tumors were located in the left and right breast, respectively, and the upper-outer quadrant was the most common site (36.9%). In terms of therapy, mastectomy was performed in 57.9% of the patients, and the rest underwent BCS. Chemotherapy was received by 65.9% of the included patients with TN MpBC and radiotherapy by 46.1% (Table 1). Age differed significantly between the training and validation groups (P <0.05), whereas the distribution of the other variables did not differ significantly between the two groups (Table 1).

Prognostic Variables Screening
A univariate Cox survival analysis was performed for each variable in the training set. As shown in Table 2, univariate Cox regression identified nine variables (age, marital status, clinical stage, tumor size, primary site, historic stage, surgery type, chemotherapy, and regional node status) that were significantly associated with OS in TN MpBC (P <0.05). Because tumor size largely corresponds to the T category of the clinical stage, only the TNM stage classification, and not tumor size, was included in the multivariate analysis to avoid redundancy. These variables were then entered into the multivariate Cox regression model. Based on the multivariate analysis, we identified age ≥63 years (P = 0.094), unmarried status (P = 0.02), higher stage (P <0.01), positive regional nodes (P = 0.066), no chemotherapy (P <0.01), and mastectomy (P = 0.04) as independent risk variables for poor prognosis in patients with TN MpBC (Table 2). KM curves drawn separately for these six factors gave consistent results. Patients younger than 63 years and married patients were more likely to survive longer than older or unmarried patients (age [P <0.001], Figure 1A; marital status [P <0.001], Figure 1B). Survival also declined with higher clinical stage (P <0.001) (Figure 1C) and positive regional lymph nodes (P <0.001) (Figure 1D). Treatment type mattered as well: patients with MpBC who received BCS or chemotherapy tended to have a higher survival probability (surgery type [P <0.001], Figure 1E; chemotherapy [P <0.001], Figure 1F). These findings corroborate the Cox regression results. In summary, age, marital status, clinical stage, regional nodes, chemotherapy, and surgery type were significantly associated with OS.

Nomogram Construction and Validation
The six screened factors were used to construct a nomogram for OS in TN MpBC (Figure 2), integrating all predictors to estimate the 3-, 5-, and 8-year survival of patients with MpBC. The C-index was 0.730 (95% CI: 0.713-0.747) in the training group and 0.719 (95% CI: 0.693-0.745) in the validation group, indicating favorable discrimination. These results agree with the ROC curves and AUC values (Figure 3). The AUC values for 3-, 5-, and 8-year OS all exceeded 0.70, showing that the nomogram has good predictive efficiency for OS. The AUC value for 3-year OS was 0.758 in the training cohort (P <0.001) (Figure 3A) and 0.736 in the validation cohort (P <0.001) (Figure 3B). For 5-year OS, the AUC values in the training and validation groups were 0.757 (P <0.001) (Figure 3A) and 0.735 (P <0.001) (Figure 3B), respectively. For 8-year OS, the AUC values were 0.785 (P <0.001) (Figure 3A) and 0.736 (P <0.001) (Figure 3B), respectively. In both the training and validation cohorts, the calibration curves showed a high level of consistency between the observed outcomes and the nomogram predictions of 3-, 5-, and 8-year OS (Figures 4A-F).
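For readers unfamiliar with how a nomogram turns a Cox model into points, the sketch below shows one common construction: each predictor's contribution to the linear predictor is rescaled so that the predictor with the widest range spans 0 to 100 points, and the predicted survival follows from the baseline survival raised to the exponentiated linear predictor. The coefficients, the baseline survival, and the reduced set of three predictors are hypothetical placeholders, not the fitted values behind Figure 2.

```python
import math

# Hypothetical log-hazard coefficients per category (reference level = 0.0).
# Placeholders for illustration only, not estimates from the study.
COEFFICIENTS = {
    "age": {"<63": 0.0, ">=63": 0.4},
    "chemotherapy": {"yes": 0.0, "no": 0.6},
    "regional_nodes": {"negative": 0.0, "positive": 0.3},
}

def points_table(coefficients):
    """Rescale coefficients so the widest-ranging predictor spans 0-100 points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefficients.values())
    return {
        name: {level: 100.0 * (beta - min(lv.values())) / widest for level, beta in lv.items()}
        for name, lv in coefficients.items()
    }

def total_points(patient, table):
    """Sum the points of a patient's categories across all predictors."""
    return sum(table[name][patient[name]] for name in table)

def survival_probability(patient, coefficients, baseline_survival):
    """Cox prediction S(t | x) = S0(t) ** exp(linear predictor)."""
    lp = sum(coefficients[name][patient[name]] for name in coefficients)
    return baseline_survival ** math.exp(lp)

table = points_table(COEFFICIENTS)
patient = {"age": ">=63", "chemotherapy": "no", "regional_nodes": "negative"}
print(round(total_points(patient, table), 1))
# 0.85 is a hypothetical baseline survival at a fixed horizon for the reference patient.
print(round(survival_probability(patient, COEFFICIENTS, baseline_survival=0.85), 3))
```

In a printed nomogram, the same mapping is read graphically: points are summed on the total points axis and projected onto the 3-, 5-, and 8-year survival axes.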

Risk Assessment
Based on the preceding analysis and the newly built nomogram, we performed a postoperative risk stratification, dividing patients into high- and low-risk groups. In both the training (P <0.001) and validation sets (P <0.001), the KM curves of the risk subgroups showed clearly different survival probabilities (Figures 6A,B). The high-risk group had distinctly worse survival than the low-risk group. These findings indicate that the risk classification system has strong predictive value for the prognosis of patients with MpBC, further supporting the potential applicability of this prognostic model.
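The text does not state how the cutoff between the high- and low-risk groups was chosen. The sketch below assumes, purely for illustration, that the median nomogram score of the training cohort is used as the cutoff and then applied unchanged to the validation cohort; the scores and function names are hypothetical.

```python
import statistics

def fit_cutoff(training_scores):
    """Derive the risk cutoff from the training cohort (median split, an assumption)."""
    return statistics.median(training_scores)

def assign_risk_group(score, cutoff):
    """Label a patient as high risk when the score exceeds the training cutoff."""
    return "high" if score > cutoff else "low"

# Hypothetical total nomogram points for a few patients.
training_scores = [40, 85, 120, 150, 200, 90]
validation_scores = [60, 110, 180]

cutoff = fit_cutoff(training_scores)
groups = [assign_risk_group(s, cutoff) for s in validation_scores]
print(cutoff)  # 105.0
print(groups)  # ['low', 'high', 'high']
```

Fixing the cutoff on the training cohort before looking at the validation cohort keeps the validation comparison independent of the data used to define the groups.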

DISCUSSION
MpBC is a rare yet deadly form of BC that consists of epithelial and mesenchymal histological components (18). The WHO classifies MpBC into epithelial and mixed epithelial-mesenchymal types (19). According to its biological behavior and histopathological features, MpBC can be subclassified as high or low grade (3). Low-grade MpBC usually comprises adenosquamous carcinoma and fibromatosis-like MpBC, whereas squamous cell carcinoma, spindle cell carcinoma, and carcinoma with heterologous mesenchymal differentiation make up high-grade MpBC (20). The pathogenesis of MpBC is usually attributed to molecular alterations and genetic changes, some of which can be pursued as therapeutic targets preclinically and clinically (21). According to previous studies, MpBC typically shows molecular alterations in epithelial-to-mesenchymal transition (EMT) (22) and phosphoinositide 3-kinase (PI3K) signaling (23). Research has shown that the aggressiveness and poor clinical outcomes of MpBC can be explained by EMT characteristics together with PI3K pathway hyperactivation (24). Reis-Filho found epidermal growth factor receptor (EGFR) gene amplification in nearly 34% of MpBC cases (25), suggesting that EGFR tyrosine kinase inhibitors could be employed as therapeutics against MpBC. Other alterations have also been reported, involving nitric oxide signaling, Wnt/β-catenin signaling, altered immunological response, and cell cycle dysregulation (21). Because more than 90% of MpBC is negative for ER or PR and HER2 (26), and considering its rarity, research on the clinicopathological features and prognostic factors of patients with MpBC has been limited (3). No practical guideline or consensus on diagnosis, treatment, and prognosis has yet been globally acknowledged, which has recently boosted enthusiasm for MpBC research. It is well known that TNBC has a poor prognosis relative to the other molecular types of breast cancer (17).
Existing studies have shown that the prognosis of MpBC is even worse than that of TNBC (27). Zhang et al. examined 30,000 patients with BC and found that the 5-year disease-free survival (DFS) and OS rates of patients with MpBC were 67.9 and 78.7%, much lower than the 86.0 and 90.6% of patients with TNBC, implying that the prognosis of MpBC is poorer than that of TNBC (28). This is consistent with the results of Pakha et al., who found that the 5-year OS rate of MpBC was only 64%, significantly lower than that of invasive ductal carcinoma (IDC) and TNBC (29). The long-term prognosis of patients is strongly influenced by the immunohistological subgroup of MpBC, with TN MpBC having the worst prognosis of all (30). It is therefore critically important to investigate the factors that affect the prognosis of TN MpBC in order to identify high-risk patients.
In this study, 50.6 and 49.4% of the tumors were located in the left and right breast, respectively, and the upper-outer quadrant was the most common site (36.9%). This suggests that, in either breast, the upper-outer quadrant is the site where MpBC is most likely to occur. The median age of the patients in this cohort was 63 years. In the training group, the age distribution was even, with half of the patients older than 63 years and half younger, whereas in the validation group more patients were ≤63, and the difference in age distribution was statistically significant. Because the nomogram nonetheless performed similarly in both groups, this difference does not appear to have affected the predictive efficiency of the model. In our study, we confirmed that age, marital status, clinical stage, regional lymph nodes, chemotherapy, and surgery type all play a role in predicting the prognosis of MpBC.
Although existing studies have shown that patients with MpBC have a relatively low rate of regional node metastasis, around 22-31% (8, 31, 32), our results show that regional node involvement affects the prognosis of MpBC. Other researchers have reached similar conclusions (28). In a study of 90 patients with MpBC in China, Zhang et al. confirmed that regional node status was an independent predictor of OS (28). Lee et al. likewise concluded that positive regional nodes lead to a dismal clinical outcome (33). In our study, mastectomy was another factor associated with a worse prognosis in patients with MpBC. Previously published studies have also reported favorable prognostic implications of BCS, whereas no surgery and mastectomy had the opposite effect (34). Some authors, however, have reported that the type of surgery was not a prognostic factor for disease-specific survival (DSS) or OS (35,36). In retrospective studies, this discrepancy could be attributable to selection bias. Our findings also indicate that receipt of chemotherapy affects the prognosis of MpBC. Conventional anthracycline plus cyclophosphamide chemotherapy has been reported to be ineffective in patients with MpBC (37). Even with the most effective paclitaxel-containing regimens, the clinical efficacy in patients with advanced MpBC is less than 20% (38). Thus, it can only be tentatively concluded that adjuvant chemotherapy containing paclitaxel is the regimen most likely to benefit patients with MpBC (38). Although few viable systemic treatment options exist for MpBC, and the choice of chemotherapy regimen and its efficacy remain debatable, chemotherapy improves the prognosis of patients with MpBC (39). The role of chemotherapy in the prognosis of patients with MpBC has been described before (19).
Chemotherapy was associated with prolonged survival in the report of Rakha et al., although this effect was limited to early-stage disease (19). Tumor stage is also related to the prognosis of MpBC, and larger tumor size adversely affects the OS of patients with MpBC. Several studies have found that the higher the tumor stage, the worse the prognosis of patients with MpBC, which matches our findings (3,34,40). The relationship between histological subtype and MpBC prognosis has also been discussed in many articles. Yamaguchi et al. reported a significantly higher metastatic risk in MpBC containing spindle cells (41). MpBC rich in spindle cells shows more aggressive biological behavior (33). Intriguingly, other work has shown that MpBC with spindle cells has a higher frequency of PIK3CA mutation and may benefit from radio-/chemotherapy (29). In this study, 998 postoperative patients diagnosed with TN MpBC were extracted from the SEER database. Our findings indicate that age, marital status, clinical stage, regional node metastasis, chemotherapy, and surgery type all influence OS; we therefore developed a prediction model and a nomogram based on these variables. The model predicts the 3-, 5-, and 8-year OS of patients with MpBC with good accuracy, providing a scientific basis for predicting the prognosis of MpBC. Identifying these characteristics and understanding their role in disease aggressiveness and progression could lead to more customized treatment for this patient group and may contribute to current knowledge on MpBC management strategies.
Our study has limitations. First, the data came from the SEER database, which lacks several valuable characteristics, including comorbidity, the specific chemotherapy regimen, radiotherapy dose, target volume, endocrine therapy, and detailed pathological information. Second, almost all included patients were from Europe and America, so the findings need to be further investigated in Asian populations. Third, this was a retrospective analysis, which provides a lower level of evidence.

Conclusions
Using six clinicopathological variables, we developed a prediction model and nomogram for predicting the OS of postoperative patients with MpBC. Validation showed that the model performed well in terms of discrimination and calibration. This tool can assist physicians with patient counseling, prognostic evaluation, and tailored treatment decisions, although further external validation is needed.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.