Establishment and Verification of a Predictive Model for Node Pathological Complete Response After Neoadjuvant Chemotherapy for Initial Node Positive Early Breast Cancer

Objective: Axillary node status after neoadjuvant chemotherapy (NCT) in early breast cancer patients influences the choice of axillary surgical staging procedure. This study aimed to estimate the likelihood that a patient achieves node pathological complete response (pCR) after NCT. By constructing and validating a clinical preoperative scoring prediction model, we sought to identify the patients most likely to benefit from sentinel lymph node biopsy (SLNB) following NCT and to reduce the risk of missing positive lymph nodes. Methods: Data from the Chinese Society of Clinical Oncology Breast Cancer Database (CSCO-BC), covering March 2010 to December 2018, were used to identify factors independently associated with node pCR after NCT by binary logistic regression analysis. A predictive model for ypN0 was established by assigning scores to the significant factors. Model performance was validated in a cohort of NCT patients treated between January 2019 and December 2019 at Henan Cancer Hospital, and model discrimination was evaluated by the area under the receiver operating characteristic (ROC) curve (AUC). Results: Multivariate regression analysis showed that node stage before chemotherapy, Ki-67 expression level, biologic subtype, and breast pCR were all independently associated with ypN0 after chemotherapy. The scoring system was constructed by transforming and summing the odds ratio (OR) values of each variable, giving a total score of 1–5. The AUC was 0.715 in the training set and 0.770 in the validation set. Conclusions: A model for predicting ypN0 after chemotherapy in newly diagnosed cN+ patients was established and verified, and it showed good discrimination. The model can inform axillary surgical planning and reduce the risk of missing positive lymph nodes by SLNB after NCT. It is of value for identifying initial cN+ patients who are more appropriate for SLNB post-chemotherapy.


INTRODUCTION
Sentinel lymph node biopsy (SLNB) is considered the standard method for managing the axillary nodes in patients with clinically lymph node-negative (cN0) early breast cancer (BC). Neoadjuvant chemotherapy (NCT) is widely used in locally advanced BC, triple-negative (TN) BC, and human epidermal growth factor receptor-2 positive (HER2+) BC (1). NCT increases the likelihood that a patient can undergo breast-conserving therapy, can significantly downstage the axilla, and permits in vivo evaluation of treatment effect. The probability of nodal negativity after NCT affects the choice of axillary staging operation. In patients with cN0 disease before treatment, the feasibility of SLNB after chemotherapy has been confirmed and is widely accepted. In patients downstaged from initially lymph node-positive (cN+) disease to clinically node-negative disease after chemotherapy, the safety of SLNB has also been reported, but it remains a focus of controversy (2). The main concern is that patients with negative sentinel lymph nodes may still harbor metastatic nodes that go undetected after NCT.
The risk of missing metastatic nodes in this population is positively correlated with the false-negative rate (FNR) of SLNB and with the burden of axillary lymph node metastasis after NCT. Several major studies have changed the way surgeons manage the axilla in BC patients treated with NCT. According to the American College of Surgeons Oncology Group (ACOSOG) Z1071 trial, the SN FNAC trial, and the SENTINA trial, SLNB is a relatively safe and feasible procedure after NCT, although the overall FNR was 12.6, 8.4, and 14.2%, respectively (3)(4)(5). In retrospective studies, the FNR of SLNB ranged from 5 to 25% (6,7). SLNB after NCT therefore still carries a relatively high FNR. On the one hand, the FNR can be lowered by improving the surgical technique of SLNB, including dual-tracer mapping with dye combined with a radionuclide, placing a marker clip in positive lymph nodes before NCT and removing the clipped node during the operation, retrieving more than two sentinel lymph nodes, and examining the nodes by immunohistochemistry (8).
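For readers less familiar with the metric, the FNR discussed above is conventionally the share of patients with residual nodal disease whose sentinel nodes were nevertheless negative, FN / (FN + TP). The following is a minimal sketch of that definition; the function name and the example counts are hypothetical and are not taken from any of the cited trials.

```python
def false_negative_rate(false_negatives: int, true_positives: int) -> float:
    """FNR = FN / (FN + TP), computed among patients with residual nodal disease."""
    node_positive = false_negatives + true_positives
    if node_positive == 0:
        raise ValueError("FNR is undefined when no patient has residual nodal disease")
    return false_negatives / node_positive


# Hypothetical counts for illustration only (not trial data):
# 100 patients with residual nodal disease, 12 of whom had a negative SLNB.
print(f"{false_negative_rate(12, 88):.1%}")  # 12.0%
```

Because the denominator counts only node-positive patients, selecting patients who are likely to be ypN0 does not change the FNR itself; it lowers the absolute number of patients in whom a false-negative result can occur.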
On the other hand, patient selection can be optimized. SLNB should be performed in patients with a low burden of axillary lymph node metastasis, or no nodal metastasis at all, after chemotherapy. When SLNB after NCT is considered for a patient with cN+ disease at diagnosis, the approach is most beneficial for patients likely to have a complete nodal response. Ideally, doctors would identify which patients will respond to chemotherapy and select those most likely to achieve node pathological complete response (pCR, ypN0) as candidates for SLNB after NCT. In this way, the risk of missing positive lymph nodes would be reduced.
Preoperative models that predict the likelihood of a patient achieving axillary pCR after NCT can help guide this decision. Multiple models predicting axillary pCR after NCT have been published in various cohorts. However, these models were limited by single-institution experience or by small sample sizes in multicenter studies (9)(10)(11). Only three models, reported in two studies, were based on data from the National Cancer Data Base (NCDB) (12,13). This study aimed to develop a clinical preoperative scoring model that predicts the likelihood of axillary pCR after NCT based on the CSCO-BC database, and to verify the model with independent data from Henan Cancer Hospital. The goal was to establish and verify a model for predicting ypN0 after chemotherapy in newly diagnosed cN+ patients, in order to guide axillary surgical planning, identify initial cN+ patients who are more appropriate for SLNB after chemotherapy, and reduce the risk of missing metastatic nodes.

Study Population
After approval by the Institutional Review Board, we identified all patients with primary BC who received preoperative chemotherapy in the CSCO BC database from March 2010 to December 2018 and at Henan Cancer Hospital from January 2019 to December 2019. The CSCO BC database is an authoritative cancer registry that contains anonymized BC cases from nine large hospitals representing all regions of China. Variables include patient sex, age at diagnosis, menstrual status, location of the primary breast tumor, pathological type, estrogen receptor (ER), progesterone receptor (PR), HER2, Ki-67, type of operation, cTNM stage, axillary pathology, and postoperative breast pathology.

Inclusion and Exclusion Criteria
The inclusion criteria were as follows: (1) cTNM stage before treatment available, based on the 7th edition of the American Joint Committee on Cancer (AJCC) staging system; (2) invasive BC confirmed by core needle biopsy before chemotherapy; (3) axillary lymph nodes positive at diagnosis (cN+); (4) known ER, PR, HER2, and Ki-67 status before chemotherapy; (5) receipt of preoperative chemotherapy; (6) axillary lymph node dissection after chemotherapy, with breast surgery performed according to local treatment standards; (7) postoperative pathology of the axillary lymph nodes and breast available. Patients with any of the following were excluded: male sex, bilateral BC, axillary lymph node-negative disease (cN0), metastasis to internal mammary or supraclavicular lymph nodes, distant metastasis (M1), inflammatory BC, BC during pregnancy, stage 0 disease or ductal carcinoma in situ (DCIS) at diagnosis, axillary nodes or primary breast tumor resected before treatment, simultaneously missing ER, PR, and HER2 results, preoperative endocrine or radiation therapy, and absent postoperative breast and lymph node pathology or no operation. After screening, 1,814 patients were eligible and included in the final analysis. The 1,497 patients from the CSCO BC database were assigned to a "training set" (used to create the initial model of post-NCT ypN0), and the 317 patients from Henan Cancer Hospital were assigned to a "validation set" used to confirm model performance (Figure 1).

Pathology
Immunohistochemistry (IHC) was used to assess HER2, Ki-67, ER, and PR status at diagnosis. ER or PR positivity was defined as nuclear staining in ≥1% of tumor cells. ER+ or PR+ status was collapsed into a single hormone receptor-positive (HR+) category. HER2 positivity was determined according to the ASCO/CAP guidelines: an IHC score of 3+ was considered positive, as was an IHC score of 2+ with a positive fluorescence in situ hybridization (FISH) result (14). Ki-67 expression was categorized as high (>30%) or low (<30%) based on the ratio of nuclear-positive cells to all tumor cells in 10 high-power fields. Based on HR and HER2 status, patients were divided into four subtypes: HR+HER2−, HR+HER2+, HR−HER2−, and HR−HER2+. pCR was defined as the absence of tumor cells in the axillary lymph nodes (axillary pCR) or the breast (breast pCR) after NCT (1).
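The subtype assignment described above can be sketched as a small rule set. This is an illustration only, not the authors' code: the function and parameter names are hypothetical, the IHC 2+ rule is read here as requiring a positive FISH result (in line with the ASCO/CAP approach the text cites), and an IHC 2+ case with no FISH result is treated as HER2-negative as a labeled assumption. Subtype labels use an ASCII hyphen for the minus sign.

```python
from typing import Optional


def is_hr_positive(er_percent: float, pr_percent: float) -> bool:
    """HR+ if ER or PR shows nuclear staining in >= 1% of tumor cells."""
    return er_percent >= 1.0 or pr_percent >= 1.0


def is_her2_positive(ihc_score: int, fish_positive: Optional[bool] = None) -> bool:
    """IHC 3+ is positive; IHC 2+ is positive only with a positive FISH result.

    ASSUMPTION: an IHC 2+ case without a FISH result is treated as negative.
    """
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        return fish_positive is True
    return False


def biologic_subtype(er_percent: float, pr_percent: float,
                     ihc_score: int, fish_positive: Optional[bool] = None) -> str:
    """Return one of HR+HER2-, HR+HER2+, HR-HER2-, HR-HER2+."""
    hr = "+" if is_hr_positive(er_percent, pr_percent) else "-"
    her2 = "+" if is_her2_positive(ihc_score, fish_positive) else "-"
    return f"HR{hr}HER2{her2}"


print(biologic_subtype(0.0, 0.0, 1))         # HR-HER2-
print(biologic_subtype(90.0, 5.0, 2, True))  # HR+HER2+
```

Ki-67 is left out of the sketch because the text gives the cutoffs as >30% and <30% without saying how a value of exactly 30% is handled.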

Statistical Analyses
Statistical analyses were conducted with SPSS, version 23. In the training set, categorical variables were first assessed by univariate logistic regression. Factors significant at the 0.1 level were entered into the multivariate analysis, which used binary logistic regression in the training set. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated. An OR >1 indicated an increased likelihood of ypN0. The ORs of the significant independent predictors were translated into points for the model. The receiver operating characteristic (ROC) curve was plotted, and predictive accuracy was evaluated by the area under the ROC curve (AUC). A 95% CI was calculated for each AUC, and each AUC was compared with 0.5 by Z test. The model was then validated in the validation set.
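For an ordinal score such as the one used here, the AUC has a direct interpretation: it is the probability that a randomly chosen ypN0 patient has a higher score than a randomly chosen node-positive patient, with ties counted as one half. The sketch below computes that quantity by pairwise comparison. It is not the SPSS procedure used in the study; the function name and the toy data are hypothetical.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUC as P(score of a ypN0 case > score of a non-ypN0 case), ties count 0.5.

    outcomes: 1 = ypN0 (event), 0 = residual nodal disease.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data for illustration only: six patients with model scores 1-5.
toy_scores = [1, 2, 2, 3, 4, 5]
toy_outcomes = [0, 0, 1, 0, 1, 1]
print(round(auc_from_scores(toy_scores, toy_outcomes), 3))  # 0.833
```

The pairwise loop is quadratic in cohort size, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.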

Patient Characteristics
Baseline clinical and pathologic characteristics of the patients in the training and validation sets are shown in Table 1. A total of 1,814 female BC patients were included in the current study, with 1,497 in the training set and 317 in the validation set. In the training set, patient age ranged from 19 to 77 years, with a mean of 48 years.

Establishment of the Scoring System
The scoring system, derived from the OR of each independent predictor, is presented in Table 3. The total score of every patient in the training set was calculated with this scoring system. Because the cumulative score ranged from 2 to 10.5, it was rescaled to a 1-5 numeric scale, as shown in Table 4. Table 4 also gives the distribution of model scores and the corresponding ypN0 rate in the training and validation sets. Higher scores were associated with a stepwise increase in the axillary pCR rate, as shown in Figure 2. Among patients with a score of 5, the axillary pCR rate was 77% in the training set and 92.9% in the validation set.

Effectiveness Evaluation of Scoring System Model
Based on the model scores of all patients, the ROC curve for ypN0 after NCT was drawn (Figure 3). The AUC was 0.715 in the training set (95% CI 0.688-0.742, P <0.001) and 0.770 in the validation set (95% CI 0.716-0.823, P <0.001), indicating that the scoring system discriminates significantly better than chance.
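As a rough plausibility check on the reported P values, the Z statistic against an AUC of 0.5 can be approximated from the published confidence intervals. The sketch below assumes the intervals are symmetric normal-approximation (Wald-type) intervals, so that the standard error is the CI width divided by 2 × 1.96. That is an assumption; the exact standard error SPSS used for the test may differ. The function name is hypothetical.

```python
import math
from typing import Tuple


def z_test_auc_vs_chance(auc: float, ci_low: float, ci_high: float,
                         z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate Z and two-sided P for H0: AUC = 0.5 from a reported 95% CI.

    ASSUMPTION: the CI is a symmetric normal-approximation interval.
    """
    se = (ci_high - ci_low) / (2.0 * z_crit)
    z = (auc - 0.5) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


for label, auc, lo, hi in [("training", 0.715, 0.688, 0.742),
                           ("validation", 0.770, 0.716, 0.823)]:
    z, p = z_test_auc_vs_chance(auc, lo, hi)
    print(f"{label}: z = {z:.1f}, P < 0.001 is {p < 0.001}")
```

Under this approximation both Z statistics are well above 9, which agrees with the reported P <0.001 for both sets.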

DISCUSSION
SLNB is the standard method for axillary lymph node staging in patients with early BC and cN0 disease. However, the safety of SLNB remains controversial in patients with initial cN+ disease who are downstaged to cN0 after chemotherapy. Earlier studies have reported axillary pCR rates of 22-44% after NCT, with higher rates of 40-74% in patients with triple-negative and HER2-positive BC (15,16). Ideally, doctors would be able to identify patients who respond well to chemotherapy, are more likely to reach ypN0 after NCT, and are therefore suitable for SLNB after NCT. In this way, the risk of missing positive lymph nodes can be lowered. In the current study, we established a clinical prediction model that predicts, with good discrimination, pathologically negative nodal status following NCT in patients with cN+ BC.
In the multivariate analysis, node stage before chemotherapy, Ki-67 expression, biologic subtype, and breast pCR were independently associated with axillary pCR post-chemotherapy. This is consistent with the results of earlier studies. A 1-5 scoring system was constructed by transforming and summing the OR values of each variable. To assess the performance of the scoring system, the ROC curve for ypN0 after NCT was plotted. The AUC was 0.715 in the training set and 0.770 in the validation set, indicating significant discrimination. In the scoring system established in this study, breast pCR after chemotherapy carried a large weight (4.5 to 10.5). Many earlier studies have reported that the axillary pCR rate is considerably higher in patients with breast pCR than in those without (17)(18)(19). According to the Netherlands Cancer Registry, among newly diagnosed cN+ patients, the axillary pCR rate after NCT was 45% in patients with breast pCR but only 9.4% in patients without breast pCR (20).
How to determine breast pCR status before the operation is, of course, one of the problems that must be solved before this model can be applied. Many studies have reported strategies for predicting breast pCR after NCT, and even the need for breast surgery in patients with breast pCR after NCT has been questioned (21). Image-guided minimally invasive biopsy (MIB) is expected to predict breast pCR accurately. A single-center prospective study from MD Anderson Cancer Center enrolled 40 patients with clinical T1-3N0-3M0, TN or HER2-positive BC who were assessed as having complete or partial remission by ultrasound or mammography after NCT and who underwent fine-needle aspiration biopsy or core needle biopsy before the operation. Combining the two biopsy methods predicted breast pCR with an accuracy of 98.0% and an FNR of 5.0% (22). Another prospective, single-center cohort study also showed that vacuum-assisted MIB can accurately diagnose pCR provided that pathological evaluation confirms a representative sample (23). At present, pathologic response in the breast is not available preoperatively at the time of surgical decision-making. However, these studies suggest that MIB has good prospects for clinical application in predicting breast pCR (22,23).
Numerous models have been established to predict ypN0 after NCT, but almost all are based on single-center data or small multicenter samples, and only a few are based on the large NCDB database. The most recent model, reported in 2018, was constructed with 19,115 clinically node-positive BC patients who underwent NCT followed by breast surgery and axillary lymph node dissection (70% assigned to a "testing cohort" to create the initial model and 30% assigned to a "validation cohort" to confirm model performance). The model was designed to predict pathologically node-negative status. The study found that age, histological type, initial N stage, histological grade, molecular classification, and breast pCR status independently predicted ypN0 after NCT. The AUCs of the training and validation sets were 0.781 and 0.788, respectively (12). However, both the training and validation data were obtained from the NCDB, so the model lacks verification with external independent data. Murphy et al. did validate their model with external data from the Mayo Clinic, but the validation sample appears too small (n = 180) (13). The advantage of our model is that it was established on China's authoritative CSCO BC database (training set) and independently verified with BC patient data from Henan Cancer Hospital (validation set). Although tumor histology and grade are not included in our model, we obtained similar AUC values. The AUC was 0.715 in the training set and 0.770 in the validation set, which suggests that the prediction model based on this scoring system is stable, with good discrimination. It may also suggest that our model is more suitable for the Asian population.
If primary data were available, more parameters could be included in our model, such as tumor histology and grade, tumor response on MRI, ultrasound, or clinical examination, and chemotherapy details, and the model might be greatly improved. This information is incomplete or unavailable in the CSCO BC database and therefore was not included in the model. However, to our knowledge, our model is the first to be based on a large Asian population database, and it has the largest sample size among such models. We therefore included a more representative and heterogeneous cohort of patients from across China, from hospitals of varying sizes and practice settings.
The model may provide a reliable method for screening patients with initial cN+ disease who are suitable for SLNB after NCT. For patients likely to be node-negative, axillary staging with SLNB would be recommended, whereas for patients at higher risk of persistent node-positive disease, axillary lymph node dissection (ALND) may be considered. For example, the probability of ypN0 for patients with a score of 1 is less than 20%; SLNB should be undertaken with caution, and direct ALND may even be recommended. For patients with a score of 2-3, the probability of ypN0 is 20-50%, and SLNB can be considered. For patients with a score of 4-5, the probability of ypN0 is more than 70%, and direct SLNB is recommended. Hence, the model allows surgeons to offer SLNB to patients who are more likely to have a nodal response to NCT, thereby reducing the chance of false-negative events.
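The score-based suggestions in this paragraph amount to a three-tier lookup, sketched below. The function name and the wording of the returned strings are illustrative; the score bands and probability ranges are those stated in the text. This is a reading aid, not a clinical decision tool.

```python
from typing import Tuple


def suggested_axillary_approach(score: int) -> Tuple[str, str]:
    """Map a 1-5 model score to (stated probability of ypN0, suggested approach)."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("model score must be an integer from 1 to 5")
    if score == 1:
        return ("<20%", "SLNB with caution, or direct ALND")
    if score <= 3:
        return ("20-50%", "SLNB can be considered")
    return (">70%", "direct SLNB recommended")


for s in range(1, 6):
    probability, approach = suggested_axillary_approach(s)
    print(f"score {s}: ypN0 {probability} -> {approach}")
```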
The model established in this study is based on the authoritative BC registry database in China and was verified with independent data from Henan Cancer Hospital. As far as we know, this is the first clinical prediction model for ypN0 based on an Asian population database. The shortcomings of this study include its retrospective, database-based design. A large amount of data was missing, including chemotherapy regimens, so many patients were excluded from the study. Notably, chemotherapy regimens are included in neither our model nor the models based on the NCDB (12,13). A higher axillary pCR rate was reported in patients treated with a taxane without an anthracycline (23.7%) than in those treated with an anthracycline without a taxane (19%) (15). Unfortunately, we cannot rule out the impact of chemotherapy regimens. On the other hand, this makes the study more reflective of real-world patients. We also suggest that using a large, heterogeneous cohort of patients to generate the model makes it more broadly applicable to patients across medical centers. A multicenter BC NCT database containing many more variables is being established, and we hope to improve the model in the future.
In brief, we established and validated a model to predict ypN0 after chemotherapy in newly diagnosed cN+ patients. The model, which includes clinical nodal category, Ki-67 expression, biologic subtype, and breast pCR, showed good discrimination. This clinically useful model can support the reasonable choice of axillary surgery after NCT and reduce the risk of missing positive lymph nodes in SLNB after NCT.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because at present, the CSCO BC database is not a fully public database. Requests to access the datasets should be directed to zlyyliuzhenzhen0800@zzu.edu.cn.

ETHICS STATEMENT
The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was performed in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethical Review Committee of the Affiliated Cancer Hospital of Zhengzhou University (No. 2019001). Because the study was retrospective, the requirement for informed consent was waived by the Affiliated Cancer Hospital of Zhengzhou University.