Nomogram to Predict Tumor-Infiltrating Lymphocytes in Breast Cancer Patients

Background: Tumor-infiltrating lymphocytes (TILs) play important roles in the prediction of prognosis and neoadjuvant therapy (NAT) efficacy in breast cancer (BRCA) patients, in this study, we identified clinicopathological factors related to BRCA TILs, then to construct and validate nomogram to predict high density of TILs. Methods: A total of 826 patients diagnosed with BRCA in Sun Yat-Sen University cancer center were enrolled in nomogram cohort. TILs were assessed using hematoxylin-eosin (H&E) staining by two pathologists. Complete clinical data were collected for analysis. Then the enrolled patients were split into a training set and validation set at a ratio of 8:2. and the backward multivariate binary logistic regression model was used to establish nomogram for predicting BRCA TILs, which were further evaluated and validated using the C-index, receiver operating characteristic (ROC) curves and calibration curves. Then another independent NAT cohort of 106 patients was established for verifying this nomogram in NAT efficacy prediction. Results: TILs were significantly correlated with body mass index (BMI), tumor differentiation, ER, PR, HER2 expression, Ki67, blood biochemical indicators including total bilirubin (TBIL), indirect bilirubin (IBIL), total protein (TP), Globulin (GLOB), inorganic phosphorus (IP), calcium (Ca). In which ER expression level [OR = 0.987, 95%CI (0.982–0.992), p < 0.001], IP [OR = 4.462, 95%CI (1.171∼17.289), p = 0.029], IBIL [OR = 0.906, 95%CI (0.845–0.966), p = 0.004] and TP [OR = 1.053, 95%CI (1.010–1.098, p = 0.016)] were independent predictors of TILs. Then nomogram was established, for which calibration curves (C-index = 0.759) and ROC curve (AUC = 0.759, 95%CI 0.717–0.801) in training sets, calibration curves (C-index = 0.708) and ROC curve (AUC = 0.708, 95%CI 0.617–0.800) in validation sets demonstrated great evaluation efficiency. Besides, independent NAT cohort verified this nomogram can distinguish patients with greater NAT efficacy (p = 0.041). Conclusion: The finds of clinicopathological factors associated with TILs could help clinicians to understand the tumor immunity of BRCA and improve treatment system for patients, and the established nomogram with high evaluation efficiency may be used as a complement tool for distinguishing patients with better NAT efficacy.


INTRODUCTION
According to the latest cancer statistics, in 2021, there would be about 284,200 newly diagnosed breast cancer (BRCA) patients in the United States, accounting for around 15% of all cancer diagnosis. In the female population, BRCA has the highest incidence rate (30%) and the second mortality rate (15%), and the incidence has steadily increased in the past 2 decades, therefore, BRCA is considered as a major threat to women's health (Siegel et al., 2021). Although the treatment methods of BRCA patients has been improved to a certain extent, there is still space for improvement in precise treatment and management. And in clinical practice, there is an urgent need for indicators to predict prognosis, treatment efficacy and guide selections of treatment options.
Tumor-infiltrating lymphocytes (TILs) refer to mononuclear immune cells infiltrating into the tumor tissue, which have been reported and studied in a variety of solid tumors, including BRCA, colon cancer, cervical cancer, melanoma, and lung cancer (Underwood, 1974). TILs in the breast tissue mainly consist of CD8 + T cells, as well as helper (CD4 + ) T cells, CD19 + B cells and a very small amount of NK cells with different infiltration degrees (Chin et al., 1992;Whitford et al., 1992). TILs is a highly complicated system (Gajewski et al., 2013). Thus, we need to do furthering research to better understanding tumor immunity.
TILs are of great significance in neoadjuvant treatment (NAT) efficacy evaluation for BRCA patients. Studies have shown that TILs are associated with positive disease-free survival (DFS) in triple-negative breast cancer (TNBC) and HER2 + patients. While TILs are related to increased overall survival (OS) in TNBC and HR+ HER2-patients (Denkert et al., 2018a), and TILs can serve as an independent prognosis factor for TNBC patients (Pruneri et al., 2016a). Besides, the level of TILs is also related to tumor metastasis in BRCA, research indicate metastatic BRCA tended to have lower TILs (Zhu et al., 2019). What's more, TILs were also reported to be correlated with the efficacy of NAT. For TNBC and HR+ HER2-subtypes, patients with higher TILs are more likely to have pathological complete response (pCR) after neoadjuvant chemotherapy (Hamy et al., 2019). A possible explanation for this association was that chemotherapy, including cyclophosphamide, and gemcitabine, can indirectly stimulate TILs mediated adaptive immune response by reducing immunosuppressive factors (Denkert et al., 2018a). Therefore, chemotherapy, besides being tumoricidal, may additionally have an immunotherapeutic effect via stimulation of immune responses in TILs leading to complete clinical responses (Pruneri et al., 2016a). For HER2+ subtype, patients with higher TILs tended to benefit more from neoadjuvant targeted therapy in regard of pCR (Ignatiadis et al., 2019;De Angelis et al., 2020).And the same trend has also been observed in BRCA immunotherapy (Loi et al., 2021), However, beneficial effects of neoadjuvant endocrine-therapy were found in tumors with low TILs infiltration for HR + HER2-patients (Skriver et al., 2020a;Skriver et al., 2020b;Lundgren et al., 2020).
Overall, TILs in BRCA patients are of great clinical significance in NAT and prognosis evaluation. However, there are some limitations for traditional H and E staining. Firstly, for patients receiving NAT, tumor sample could only be obtained through pre-operative fine-needle aspiration biopsy, the sample size is small and local, which on the one hand cannot represent the whole tumor, due to the heterogeneous distribution of TILs in BRCA patients (Dieci et al., 2018), and will affect the accuracy of detection on the other hand. Therefore, a prediction model, based on accurate TILs data of NAT naive surgical resection specimens and baseline available indictors, has important clinical significance for evaluating the baseline level of TILs and the efficacy of NAT. Finally, the prediction model of TILs, can serve as a newly prognosis tool for BRCA patients, whose tumor samples cannot be obtained (et: samples from other hospitals could not be borrowed and used, or the tissue itself was too small for further staining) to perform H and E staining and morphological evaluation of TILs. Thus, for those two kinds of specific BRCA patients, traditional H and E staining has noteworthy deficiencies, so we need an prediction model with great accuracy. Therefore, NAT naive surgical resection specimens were analyzed and clinical data was collected, aiming to explore associations of TILs with clinicopathological characteristics, and to propose nomogram model for predicting high TILs in BRCA patients.

Nomogram Cohort Patients
From June 2020 to March 2021, a total number of 826 patients' data were collected in Sun Yat-Sen University Cancer Center. The diagnosis of invasive breast cancer was confirmed by postoperative pathology. The main inclusion criteria were: 1. 18-80 years of age; 2. eligibility for radical tumor resection after comprehensive evaluation. Exclusion criteria mainly were: 1) previous NAT, i.e., chemotherapy, radiotherapy, targeted therapy, or immunotherapy, etc; 2) secondary tumors or multifocal tumors; 3) HIV infection or other immune system diseases; 4) past use of immune agents or drugs and health care products that may affect immune function; 5) state of inflammation and infection within nearly 1 week. This study was approved by the Ethics Committee of Sun Yat-Sen University Cancer Center.

General Information and Examination Results
Patients' information was obtained from the breast cancer single disease research platform of Sun Yat-Sen University Cancer Center. Inclusion and exclusion criteria were described above. General patient information, histopathological findings and specific blood test results were assessed retrospectively. General patient information included age, gender, BMI, blood type, hypertension, and diabetes. Blood test results included blood routine, blood biochemical, hormone level, and tumor markers. All blood test results were obtained from the patients within 1 week before surgery. The histopathological data included postoperative paraffin pathological findings: pathological type and stage, TNM staging, degree of tumor differentiation, immunohistochemical staining results. ER positivity was defined as ER ≥ 1%. PR positivity was defined as PR ≥ 1%.

Tumor-Infiltrating Lymphocyte Assessment
Postoperative tumor paraffin specimens from 826 patients underwent H and E staining, and two pathologists independently calculated stromal TILs density percentages; the final values of TILs were obtained through unified discussion, with the measurement standard referring to "RECOMMENDATIONS BY AN INTERNATIONAL TILs-For the supplemental versions of WORKING GROUP 2014 and 2018",the standardized approach mainly including: 1) select tumor aera.2) define stromal area.3) scan at low magnification.4) determine type of inflammatory infiltrate.5) assess the percentage of stromal TILs (Salgado et al., 2015;Dieci et al., 2018). In this study, "10%" TILs was taken as the cut-off value; therefore, patients with >10% TILs were considered the high TILs group, and the remaining individuals constituted the low TILs group.

Construction and Validation of Nomogram
To improve the robustness and reliability of our prediction model, the nomogram cohort were split into a training set and validation set in a random manner without replacement at a ratio of 8:2. Training set was used to construct the predicted nomogram and perform internal validation. For training cohort, baseline predictors of TILs in BRCA patients were evaluated by the univariate and multivariate binary logistic regression model. Then variables with p value less than 0.05 were included in the backward multivariate binary logistic regression model (Erratum: Borderud et al., 2015) to further screen the optimal prediction model with the lowest Akaike information criterion (AIC). Finally, nomogram establishment was performed based on the optimal prediction model with the R software (version 4.0.1). Bootstrapping with 80 samples was applied for internal validation of the nomogram. The performance of nomogram was assessed by Harrell's concordance index (C-index). Calibration curves were implemented to validate the accuracy and reliability of the nomogram (Kramer and Zimmerman, 2007). In the end, model performance in predicting high TILs were quantified using the area under the curve (AUC) of the receiver operating characteristic (ROC) analysis in training set and validated in validation set, respectively. During the validation of the nomogram, the total points of each patient in the validation cohort were calculated according to the established nomogram, then ROC curve were derived using the total points as predicting parameter. All analyses and plots were performed using the following R packages: "rms", "calibrate", "pROC" and "nomogramEx".

Validation of Nomogram in NAT Cohort
In order to verify the clinical significance of TILs nomogram model, we included an independent cohort treated with NAT, to verify our nomogram model with the prediction of NAT pCR ratio. The main inclusion criteria were: 1) 18-80 years of age female patients; 2) Preoperative diagnosed as the early and intermediate stage breast cancer, and eligibility for NAT after comprehensive evaluation according to the latest NCCN BRCA guideline. Exclusion criteria mainly were: 1) previous anti-tumor therapy, i.e., surgery, chemotherapy, radiotherapy, targeted therapy, or immuno-therapy, etc; 2) secondary tumors or multifocal tumors; 3) HIV infection or other immune system diseases; 4) past use of immune agents or drugs and health care products that may affect immune function; 5) state of inflammation and infection within nearly 1 week; 6) Participated in any prospective drug clinical study. Then, a total of 106 patients' data were collected in Sun Yat-Sen University Cancer Center from December 2017 to March 2021.Total points of each patient in the NAT cohort were calculated according to the established nomogram. NAT efficacy was calculated with Pathological complete response (pCR), pCR defined as pathological Miller-payne grade5 together with no lymph nodes metastasis. Patients were divided into predicted high and low TILs groups according to the optimal cut-off points defined by ROC analysis in nomogram training set.

Statistical Analysis
Categorical variables were analyzed by the Chi-square test or Fisher's exact test. Continuous variables were analyzed by nonparametric tests. Associations of clinical and pathological variables with TILs were determined. Then, the optimal cutoff value of TILs was determined by ROC curve analysis, predicting low and high-immune cell groups and further stratifying the patients into the low and high-TILs groups. All statistical tests were two-sided, and p < 0.05 was considered significant. Statistical analysis was performed with Programming language R (version 4.0.1, http://www.R-project.org).

General Patient Characteristics and Histopathological Findings
A total of 826 patients were included in nomogram cohort according to the above criteria, H andE staining was performed in enrolled cases to assess the density of TILs, and the results were confirmed by two pathologists. As shown in  Table.1. Related factors were included for analysis, and the associations of these factors with TILs in tumor tissue samples were analyzed. It was found that tumor differentiation (p < 0.001), TNM staging (p 0.045) and BMI (p 0.015) were significantly correlated with TILs. In addition, age, gender, marital status, blood group, tumor location, tumor T and N stages, hypertension, and diabetes mellitus were not significantly correlated with tumor TILs.
Further analysis showed that even in HR + HER2-breast cancer, ER expression level (p 4.6e-07) and PR (p 0.00043) were negatively correlated with TILs, indicating that the association between TILs and HR is independent of molecular subtyping. Patients with TILs <30% showed increased ER and PR expression levels, especially ER levels (Figures 2A,B). Besides, in HR + HER2-patients (p 1.6e-11) and HER2+, TNBC patients (p 0.0072), the expression of Ki67 was positively correlated with TILs ( Figures 2C,D).
However, we found no significant correlations between TILs and indicators of blood routine (Supplementary  Table  S3) and hormone levels (Supplementary Table S2).

Nomogram for Evaluating Breast Cancer TILs
Various factors affecting BRCA TILs were included in the analysis. In order to establish a predicted model that can be used in baseline situation, we only include TILs associated factors that can be accurately detected at baseline biopsy tissue and blood samples, including ER expression level, PR expression level, Ki67 expression level, HER2 status, and significant blood indicators mentioned above (TP, GLOB, IP, Ca, TBIL, and IBLI).
We divided the nomogram cohort with complete parameter information mentioned above (n 773) into training set and validation set in a random manner without replacement at a ratio of 8:2. Nomogram model construction was performed in training set (n 618). In univariate logistic regression analysis, all of the baseline indicators showed significant correlation with TILs and were further enrolled in the backward multivariate logistic regression model (Table.4). It turned out that the model with IBIL, IP TP, histology grade, ER, and Ki67 as input variables has the lowest AIC. Therefore, factors of PR, HER2, TBIL, GLOB were excluded after backward multivariate binary logistic regression analysis and further nomogram construction.
Finally, a nomogram model for BRCA (Figure 3) was further established with variates of IBIL, IP TP, histology grade, ER, and Ki67 ( Figure 3A). Harrell' concordance indicators of the nomogram were assessed (c-index 0.772), and calibration curves showed good agreement between predicted and observed values ( Figure 3B). In ROC curve analysis to evaluate the discrimination power of the TILs nomogram, the AUC was 0.759 (95%CI 0.717-0.801) in training set and 0.708 (95%CI 0.617-0.800) in validation set, respectively ( Figures 3C,D). Univariate binary logistic analysis showed that large point value calculated using the nomogram model had a significant association with higher TILs (OR 1.033 95%CI:1.026-1.04, p < 0.001).
In addition, we also found that ER expression level [OR 0.987

Validation of Nomogram in NAT Patients
To further verify the prediction effect of this nomogram on NAT efficacy, another independent NAT cohort was collected for testing. A total of 106 patients were included, including 26 patients with pCR and 80 patients with non-PCR (Table.5), and scored those patients with this nomogram model. Patients were divided into the high TILs group and the low TILs group. 30% (24/80) of patients in the high TILs group achieved pCR, for which only 7.69% (2/26) in the low TILs group (p 0.041) ( Figure 4A). And in pCR group, patients tended to have higher nomogram points compared with non-pCR group (p 0.022) ( Figure 4B).

DISCUSSION
This study retrospectively analyzed factors affecting TILs in BRCA patients and established nomogram model with a large sample size and relatively complete and comprehensive clinical data. Through a series of statistical tests, a model with certain predictive efficacy was established and validated, and relevant factors affecting TILs were obtained. With considerable clinical significance, aiming to help clinicians understand the influencing factors of TILs and evaluate the TILs status of some specific BRCA patients. There is a nomogram model to predict TILs in ovarian cancer (Dai et al., 2021), while to our knowledge, this is the first nomogram model for predicting TILs in BRCA patients with a large sample size and explicit clinical significance. Some influencing factors of TILs were obtained. HR + HER2-BRCA had the lowest density of TILs, significantly less than TNBC and HER2 + BRCA cases, corroborating withprevious findings (Stanton et al., 2016). In addition, based on a large sample size, this study firstly found that even in HR + HER2-BRCA, TILs were negatively correlated with ER and PR expression, and positively associated with Ki67. In TNBC and HER2+ BRCA cases, there was a positive correlation between TILs and Ki67 expression. These findings indicated that molecular subtyping of BRCA alone is not enough to predict TILs, and detailed information has valuable significance to improve TILs evaluation. Based on the postoperative pathological results of patients in this study, TILs were correlated with pathological grade, peritumor vascular invasion, molecular typing and other factors, which were in agreement with the conclusions of previous studies (Pruneri et al., 2016b;Criscitiello et al., 2020;Dülgar et al., 2020). In addition, this study innovatively found that BRCA peri-tumor nerve invasion was also negatively correlated with TILs, providing another influencing factor as a reference, although the specific mechanism still needs to be further explored.
Furthermore, this study firstly found significantly negative correlations between TBIL, IBIL, and IP with TILs, based on large sample size. To our knowledge, this is the first work to report an association between bile acid metabolism and BRCA TILs. Previous studies have suggested that bile acids, as tumor suppressants, can regulate the production and function of CD4, Th17, and Treg cells in peripheral blood, which impacts the body's tumor immunity (Hang et al., 2019;Campbell et al., 2020). Lithocholic acid (LAC) as a part of bile acid metabolism, which has been reported to inhibit the proliferation and invasion of BRCA cells and induce their death through lipid metabolism and other mechanisms (Krishnamurthy et al., 2008;Luu et al., 2018;Mikó et al., 2018). Besides, studies have shown that bile acids can affect the expression of chemokine CXCL16, and then affect the infiltration of natural killer T cells in liver cancer through intestines-liver axis, and then affect the biological behavior of tumor (Ma et al., 2018). While there is no such concept as intestines-liver axis in BRCA, and the specific mechanism behind this phenomenon remains unclear, our results may bridge a connection between bile acid metabolism, tumor immunity, and tumor biological behavior, providing clinical evidence for subsequent mechanism studies. As for IP, this study suggested a significant positive correlation with TILs. In the above multivariate logistic model, IP was the strongest independent predictor. According to previously reported data, IP increase can reduce the occurrence risk of HR-BRCA (Papadimitriou et al., 2021). Besides, some clinical BRCA treatment drugs, such as fulvestrant and bisphosphonates, may also affect IP metabolism (Robertson et al., 2016). Therefore, the relationship between IP and TILs deserves further investigation, as well as the role of IP in BRCA. Due to the limitations of retrospective study, a higher level of evidence can't be provided, which could only confirm the correlation between the two, but not for determining the causal relationships. Furthermore, during clinical treatment, especially during NAT, abnormal blood index as bile acid or blood biochemical caused by drug therapy (as   neoadjuvant chemotherapy, targeted therapy and immunecheckpoint inhibitors) or other reasons, should be paid fully attention by clinicians, which may be related with TILs and furthering tumor immunity, or even prognosis and treatment efficacy.
In recent years, the correlation between body metabolism and tumor biological behaviors has become one of the research hotspots and attracted increasing attention. This study suggested that patients' BMI was correlated with BRCA TILs. Previous studies have reported that BMI in BRCA patients is related to tumor prognosis and sensitivity to chemotherapy: obesity increases the risk of BRCA recurrence and mortality by about 35-40% (Jiralerspong and Goodwin, 2016). Another retrospective study confirmed that BMI is associated with the efficacy of docetaxel chemotherapy in BRCA patients (Desmedt et al., 2020). Accordingly, in vivo experiments have confirmed that obesity can affect the function and number of CD8 + immune cells through the related STAT3 pathway, metabolite GATM and ROC curves for the TILs prediction nomogram in the validation set. All the points assigned on the top point scales for various factors were summed to generate a total point score. The total score was projected on the bottom scales to determine the probability of high-density TILs in an individual. The nomogram-predicted frequency of high TIL density was plotted on the x-axis, and the actual observed frequency of high T cell density was plotted on the y-axis. The AUC was calculated, and its 95% CI was estimated by bootstrapping. TILs, tumor-infiltrating lymphocytes; ROC, receiver operating characteristic; CI, confidence interval. ACSBG1 gene (Ringel et al., 2020;Zhang et al., 2020;Maguire et al., 2021). Meanwhile this study based on lager sample size, firstly demonstrated BMI as an independent influencing factor for BRCA TILs, which may further impact on the prognosis and treatment efficacy. Consequently, the relationships and mechanism between BMI, TILs, BRCA prognosis, and treatment efficacy are worthy of further investigation. While the present findings could help us to understand the correlation between BMI and BRCA, and promote the positive effect of weight control and physical exercise on the prognosis and treatment of BRCA patients. Studies have shown that TILs can predict the prognosis of BRCA, and can serve as an independent prognosis factor for TNBC (Pruneri et al., 2016b;Denkert et al., 2018a). Thus, the nomogram could be of great significant in prognostic evaluation especially for patients whose pathological slides cannot be obtained for further immunohistochemical staining. In addition to prognosis evaluation, studies have shown that TILs can be used as a predictor of the efficacy of different neoadjuvant therapies, a meta-analysis based on a large sample size indicated that for BRCA of various subtypes, patients with TILs >10% (intermediate and high TILs) had high NAT pCR ratio than patients with TILs ≤ 10 (Low TILs) (Denkert et al., 2018b). Besides, with the application of neoadjuvant immune-checkpoint inhibitors (ICIs) in breast cancer, Loi, S, et al. found that TILs with 10% cut-off value can be used to screen specific benefit groups of ICIs (Loi et al., 2021). Therefore, for this Nomogram model, TILs 10% was used as cut-off value and patients were divided into two groups: patients with TILs> 10% were defined as high TILs subgroup, while TILs ≤ 10% is low TILs subgroup. Patients in the High TILs group tended to have better NAT efficacy and ICIs efficacy. And our independent NAT cohort also verified this view.
For patients receiving NAT, the method of obtaining specimens for immunohistochemical examination was fine needle biopsy. Due to the heterogeneous distribution of BRCA TILs (Denkert et al., 2018a;Dieci et al., 2018), the accuracy of TILs evaluation from needle biopsy samples were much inferior to those from postoperative specimens, which may further affected the screening of patients benefiting from NAT. In this study, we proposed a nomogram predicted model for TILs, which only required parameters that can be accurately obtained from the baseline tissue. Therefore, this nomogram model could be helpful for the evaluation of NAT efficacy. And with consideration of side effects of various treatment methods available, we could make the preferable selection for BRCA NAT.
At present, clinical evaluation methods for the efficacy of BRCA chemotherapy mainly include Oncotype DX 21 gene testing (Wang et al., 2018) and MammaPrint 70 gene testing (Krop et al., 2017). It can help clinicians to screen out suitable patients with early and middle stage HR+ HER2− BRCA, who could benefit from chemotherapy. However, those methods are expensive, and require additional blood samples from patients. While our nomogram model is convenient, quick, and noninvasive, and it can be expected to be applied in the evaluation of much more kinds of neoadjuvant therapy for subtypes of BRCA patients.
In this study, postoperative specimens were collected retrospectively for TILs assay, and factors influencing TILs were analyzed, for which may have great clinical significance. Then a nomogram model was established using basic and widely available clinical information (BMI, tumor differentiation, ER, Ki67, IP, IBIL, GLOB), which can accurately estimate the actual TILs of patients, providing a tool for patients and clinicians to make estimation of prognosis and therapeutic efficacy, so as to achieve precise treatment in BRCA.
The limitations of this study mainly include: 1) This study was a retrospective study conducted by single center, which has inherent limitations. 2) We only validated this nomogram with NAT patients in cohort with small sample size, and other items as prognosis and immunotherapy were not verified.
3) The correlation between some indicators such as bilirubin and TILs has been found, but the mechanism has not been further explored.

CONCLUSION
The findings of clinicopathological factors associated with TILs could help clinicians understand the tumor immunity of BRCA patients, serval factors including IBIL and IP as independent predictors of TILs have not been reported before, which may have great clinical significance. And the established nomogram with high evaluation efficiency may be used as a complement tool for distinguishing patients with better neoadjuvant therapeutic efficacy.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Sun yat-sen university cancer center. The ethics committee waived the requirement of written informed consent for participation.