Mutant Allele Frequency-Based Intra-Tumoral Genetic Heterogeneity Related to the Tumor Shrinkage Mode After Neoadjuvant Chemotherapy in Breast Cancer Patients

The shrinkage mode of the tumor after neoadjuvant chemotherapy (NAC) is an important index for evaluating the likelihood of breast-conserving surgery. However, no adequate measure exists to predict the shrinkage mode after NAC. In this study, we collected formalin-fixed, paraffin-embedded samples from 24 patients before and after treatment and sequenced a panel of 456 cancer-related genes by targeted next-generation sequencing. The pathological shrinkage mode was reconstructed in three dimensions after surgery, and the level of genetic intra-tumor heterogeneity was estimated by mutant-allele tumor heterogeneity (MATH) and examined for its correlation with the shrinkage mode after NAC. A total of 17 matched pairs of primary and residual tumor tissue were successfully analyzed. TP53 and PIK3CA were the most commonly mutated genes both before and after NAC, and no recurrent mutations were significantly associated with the shrinkage mode. Receiver operating characteristic analysis of the MATH values of samples before and after NAC showed that patients could be classified into concentric and non-concentric shrinkage modes using a MATH threshold of 58. The MATH value was associated with the shrinkage mode of breast cancer in a non-linear model: patients whose MATH value was below 58 both before and after NAC displayed a concentric shrinkage mode. The area under the curve was 0.89, with a sensitivity of 0.69 and a specificity of 1. Intra-tumor heterogeneity measured by MATH might therefore be a promising aid in choosing the type of surgery.

INTRODUCTION
Neoadjuvant chemotherapy (NAC) is prescribed increasingly in patients with advanced breast cancer (1). Previous studies have shown that NAC could facilitate breast conservation in locally advanced breast cancer, reducing volume resection in breast-conserving therapies (2,3).
In contrast, patients whose tumors were downsized by NAC were reported to have a higher local recurrence rate after breast-conserving therapy than those whose tumors were not (4). Furthermore, large lumpectomy volumes were still resected in patients with a high response (5). Tumors shrink after NAC in one of two modes: concentric shrinkage mode (CSM) and non-concentric shrinkage mode (NCSM). Patients with CSM after NAC are considered ideal candidates for breast-conserving treatment (BCT), whereas NCSM can lead to false-negative reporting of margins, which may increase the risk of locoregional recurrence (6). At the St. Gallen International Expert Consensus Conference on the Primary Therapy of Early Breast Cancer 2017, experts voted that different surgical strategies should be adopted for breast cancer based on shrinkage mode (7). Predicting whether a patient will present CSM or NCSM after NAC is therefore a critical criterion for evaluating the therapeutic effect of NAC, and predictive measurements are urgently needed to inform the design of the surgical scheme and treatment strategy.
Methods for estimating the shrinkage mode after NAC are still under investigation. Previous studies have reported that tumor response to NAC varies greatly with clinicopathologic variables [i.e., molecular subtype (8); clinical stage (9)]. Patients with triple-negative and human epidermal growth factor receptor 2 (HER2)-positive tumors have a higher probability of achieving CSM (8). However, studies of genetic composition that could improve this stratification are still lacking.
Recently, remarkable advances in oncogene investigations have made it possible to incorporate next-generation sequencing (NGS) technology into precise clinical care. NGS-based assessment of intra-tumoral genetic heterogeneity for monitoring the response to chemotherapy is currently being explored. Tumors with high genetic heterogeneity are thought to contain more resistant populations and distinct subpopulations, leading to worse survival (10,11). Thus, we speculated that a complex genetic composition might also influence the shrinkage mode after NAC. A better understanding of the genomic information underlying the shrinkage mode may suggest clearer classification methods for surgical strategy.
Abbreviations: NAC, neoadjuvant chemotherapy; ITH, intra-tumor heterogeneity; MATH, mutant-allele tumor heterogeneity; ROC, receiver operating characteristic; CSM, concentric shrinkage mode; NCSM, non-concentric shrinkage mode; FFPE, formalin-fixed, paraffin-embedded; BCT, breast-conserving treatment; NGS, next-generation sequencing.
Several bioinformatics methods based on NGS have been proposed to explore tumor genetic heterogeneity (12-14). Mutant-allele tumor heterogeneity (MATH) is a general measure of genetic heterogeneity that is based on the estimated allelic frequencies of the tumor's mutations and turns heterogeneity into a measurable variable (15).
This study was designed to identify the pathological shrinkage modes after NAC for breast carcinomas using NGS genomic profiling. Combined with clinical data on the response, the intra-tumor genetic heterogeneity may help in understanding breast cancer biology and in planning local-regional management.

Patients and Samples
Between October 2016 and April 2018, 24 patients with histopathologically confirmed primary invasive breast cancer of clinical stage II or III were enrolled. Breast MRI was performed before NAC and a week before surgery. Patients diagnosed with distant metastatic disease before surgery or not examined by MRI before NAC were excluded. All patients received a full course of anthracycline- and taxane-based chemotherapy and underwent radical surgery. Trastuzumab was delivered to human epidermal growth factor receptor 2 (HER2)-positive patients.
To define the tumor molecular subtypes, we assessed the immunohistochemistry (IHC) expression of estrogen receptor and progesterone receptor, HER2 status, and the proliferation index (Ki-67). Fluorescence in situ hybridization was conducted when HER2 expression was detected as grade 2 on IHC. An expression rate of 1% was used as the cutoff to define positive hormone receptors. HER2 was considered positive when HER2 expression was detected as grade 3 on IHC or when HER2 gene amplification was detected by fluorescence in situ hybridization. Ki-67 expression over 20% was considered high.
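The scoring rules above can be sketched as a few small functions. This is only an illustration and not the authors' code: the function and parameter names are hypothetical, and reading the 1% cutoff as "greater than or equal to 1%" is an assumption.

```python
from typing import Optional


def hormone_receptor_positive(percent_stained: float) -> bool:
    """Hormone receptor status at the 1% cutoff (assumed to mean >= 1%)."""
    return percent_stained >= 1.0


def her2_positive(ihc_grade: int, fish_amplified: Optional[bool] = None) -> bool:
    """HER2 is positive for IHC grade 3 or for gene amplification on FISH.

    IHC grade 2 is equivocal and needs a FISH result to be resolved.
    """
    if ihc_grade == 3:
        return True
    if fish_amplified is not None:
        return fish_amplified
    if ihc_grade == 2:
        raise ValueError("IHC grade 2 requires a FISH result")
    return False


def ki67_high(percent: float) -> bool:
    """Ki-67 expression over 20% is considered high."""
    return percent > 20.0
```

Raising an error for an unresolved grade 2 result, rather than silently returning negative, is a design choice of this sketch.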
This retrospective study was approved by the Institute Review Board of Shandong Cancer Hospital (institute review board approval number: SDTHEC201802002). Informed consent was obtained from all patients.

Clinical-Pathological Patterns
We divided shrinkage mode into two modes, CSM and NCSM (16).
MRI was performed at baseline before NAC. CSM was defined as a residual tumor with a longest diameter of ≤2 cm and a retraction rate of ≥50% compared with the longest diameter of the primary tumor before NAC. It included four modes: pathologic complete response (PCR); isolated (concentric shrinkage without surrounding lesions); nodular (residual multinodular lesions); and clumps with scattered (concentric shrinkage with surrounding lesions). NCSM was defined as a residual tumor after NAC with a longest diameter of >2 cm and/or a retraction rate of <50% compared with the longest diameter of the primary tumor before NAC. It included two modes: isolated (concentric shrinkage with surrounding lesions) and clumps with scattered and diffuse (replaced lesions, diffuse in whole quadrants) (Figure 1) (16).
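As a rough sketch, the size criterion can be written as a single function. It assumes that "≤2 cm" refers to the longest diameter of the residual tumor and that the retraction rate is the relative reduction of the longest diameter; both readings are assumptions, the function name is hypothetical, and the morphological sub-modes are not modeled.

```python
def shrinkage_mode(primary_diameter_cm: float, residual_diameter_cm: float) -> str:
    """Classify shrinkage as "CSM" or "NCSM" from the longest diameters.

    CSM: residual longest diameter <= 2 cm AND retraction rate >= 50%.
    NCSM: residual longest diameter > 2 cm and/or retraction rate < 50%.
    """
    if primary_diameter_cm <= 0:
        raise ValueError("primary diameter must be positive")
    if residual_diameter_cm < 0:
        raise ValueError("residual diameter cannot be negative")
    # Relative reduction of the longest diameter after NAC.
    retraction_rate = (primary_diameter_cm - residual_diameter_cm) / primary_diameter_cm
    if residual_diameter_cm <= 2.0 and retraction_rate >= 0.5:
        return "CSM"
    return "NCSM"
```

For example, a tumor that shrinks from 4.0 cm to 1.5 cm has a retraction rate of 62.5% and falls under CSM, whereas one that shrinks from 3.0 cm to 2.0 cm has a retraction rate of about 33% and falls under NCSM.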

Tissue DNA Isolation and Purification
Genomic DNA was extracted from the FFPE samples using the GeneRead DNA FFPE Kit (Qiagen, USA). The quality of the purified DNA was assayed by gel electrophoresis, and the DNA was quantified with a Qubit® 4.0 Fluorometer (Life Technologies, USA).

Library Construction and 456 Gene Panel Sequencing
The purified genomic DNA was first fragmented into pieces of around 200-300 bp using an enzymatic method (5× WGS Fragmentation Mix, Qiagen, USA). End repair, A-tailing, and T-adaptor ligation were then performed, and the pre-library was prepared with polymerase chain reaction (PCR) reagents. The products then underwent exon capture. Captured fragments were subsequently purified and hybridized to the 456-gene panel designed by Berry Oncology Corporation, which includes drug-target genes and hot-spot mutated genes related to cancer development. SNV, indel, gene fusion, and copy number variation data were detected with the 456-gene panel at 1,000× coverage. All results were annotated with COSMIC, TCGA, ClinVar, and the in-house Berry Oncology database (Berry Oncology Corporation, Supplementary Table 1).

Bioinformatics Analysis of Mutations
FASTP (17) was used to trim adapters and to remove low-quality sequences to obtain clean reads. The clean reads were aligned to the Ensembl GRCh37/hg19 reference genome using BWA (18). PCR duplicates were processed with gencore (19). SAMtools (20) was applied to detect single-nucleotide variations (SNVs), insertions, and deletions. HGVS variant descriptions were annotated with ANNOVAR (21). We excluded SNVs with PopFreqMax > 0.05 and retained for further analysis non-synonymous SNVs with a variant allele frequency (VAF) > 0.5%, or with VAF > 0.1% in cancer hotspots.
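The final filtering step can be illustrated with a short sketch. The record layout and field names below are hypothetical (they are not ANNOVAR output columns), and VAF is stored as a fraction, so 0.5% is 0.005.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    """Hypothetical annotated SNV record used only for this illustration."""
    gene: str
    vaf: float            # variant allele frequency as a fraction (0.005 == 0.5%)
    pop_freq_max: float   # maximum population allele frequency (PopFreqMax)
    nonsynonymous: bool
    in_hotspot: bool


def passes_filter(v: Variant) -> bool:
    """Apply the population-frequency and VAF thresholds described in the text."""
    if v.pop_freq_max > 0.05:
        return False  # likely a common germline polymorphism
    if not v.nonsynonymous:
        return False
    # Hotspot positions are kept at a lower VAF threshold (0.1% vs. 0.5%).
    min_vaf = 0.001 if v.in_hotspot else 0.005
    return v.vaf > min_vaf


def filter_variants(variants: List[Variant]) -> List[Variant]:
    """Return only the variants that pass the filter."""
    return [v for v in variants if passes_filter(v)]
```

The lower hotspot threshold reflects the idea that a low-frequency call at a known cancer hotspot is more likely to be real than one at an arbitrary position.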

Statistical Analysis
Mroz et al. (15) introduced a measure of heterogeneity termed MATH, which is the ratio of the scaled median absolute deviation (MAD) to the median, expressed as a percentage. To investigate the correlation between intra-tumoral genetic heterogeneity and the shrinkage mode after NAC, we used the MATH value to assess intra-tumoral genetic heterogeneity.
The MATH value for tumors was based on the distribution of mutant-allele fractions among specific mutated loci, calculated as the percentage ratio of the width (MAD scaled by a constant factor so that the expected MAD of a sample from a normal distribution equals the standard deviation) to the center (median) of its distribution (15):
$$\mathrm{MATH} = 100 \times \frac{\mathrm{MAD}}{\mathrm{median}}$$
The Fisher exact test was used for the comparison of MATH value between CSM and NCSM groups. P < 0.05 was considered statistically significant. MATH value was assessed using the area under the curve of the receiver operating characteristic (ROC). All statistical analyses were performed with SPSS 22.0 software.
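The MATH definition above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function name is hypothetical, and the scale factor 1.4826 is the customary constant that makes the MAD of normally distributed data match its standard deviation, which the text describes but does not state numerically.

```python
from statistics import median
from typing import Sequence

# Customary constant that makes the MAD consistent with the standard
# deviation for normally distributed data (assumed; not given in the text).
MAD_SCALE = 1.4826


def math_score(mutant_allele_fractions: Sequence[float]) -> float:
    """Return MATH = 100 * scaled MAD / median of the mutant-allele fractions."""
    if len(mutant_allele_fractions) == 0:
        raise ValueError("at least one mutant-allele fraction is required")
    center = median(mutant_allele_fractions)
    if center <= 0:
        raise ValueError("median mutant-allele fraction must be positive")
    deviations = [abs(f - center) for f in mutant_allele_fractions]
    scaled_mad = MAD_SCALE * median(deviations)
    return 100.0 * scaled_mad / center
```

Note that a sample with a single mutated locus has a MAD of zero and therefore a MATH value of zero, whatever the allele fraction.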

RESULTS
We carried out high-coverage sequencing of 456 cancer-relevant genes on 24 paired pre-/post-NAC samples. Seven paired specimens (7/24, 29.2%) were excluded from the analysis because of an insufficient amount or poor quality of DNA. For 17 (17/24, 70.8%) patients, both the primary tumor biopsy and the residual tissue were successfully accessed and analyzed. Thirteen patients showed CSM, and four patients showed NCSM. Patient characteristics are listed in Table 1.
We compared the frequency of TP53 and PIK3CA hot-spot mutations between pre-NAC and post-NAC samples in the CSM and NCSM groups. The frequency of TP53 hot-spot mutations declined from 70% pre-NAC to 30% post-NAC in the CSM group and rose from 50 to 100% in the NCSM group. The frequency of PIK3CA hot-spot mutations rose from 23% pre-NAC to 38% post-NAC in the CSM group and from 50 to 75% in the NCSM group (Table 2).
From all these samples, we identified 142 variants in 456 genes. One hundred thirteen mutations were detected in the pre-NAC group, and 70 mutations (70/142, 49.3%) were found only in the pre-NAC group (Supplementary Table 2). The mutation frequency of NOTCH1 was 17.6% (3/17) in pre-NAC samples. Meanwhile, CSF1R, JAK2, MAP3K1, MECOM, PAX5, and PTEN were mutated in 11.8% (2/17) among

FIGURE 1 | CSM means that the longest diameter retraction rates of the residual tumor were ≤2 cm and ≥50% compared with the longest diameter of the primary tumor before NAC and included four modes, PCR, pathologic complete response; isolated, concentric shrinkage without surrounding lesions; nodular, residual multinodular lesions; clumps with scattered, concentric shrinkage with surrounding lesions. NCSM means that the longest diameter retraction rates of residual tumor after NAC were >2 cm and/or <50% compared with the longest diameter of the primary tumor before NAC and included two modes, isolated, concentric shrinkage with surrounding lesions; clumps with scattered and diffuse replaced lesions (diffuse in whole quadrants). CSM, concentric shrinkage mode; NAC, neo-adjuvant chemotherapy; PCR, pathologic complete response; NCSM, non-concentric shrinkage mode.

Mutant-Allele Tumor Heterogeneity Score Before and After Treatment Was Associated With the Pathological Pattern After Neoadjuvant Chemotherapy
First, we analyzed the MATH value of 17 paired samples. The MATH value of the CSM group was significantly lower than that of the NCSM group (39.66 vs. 102.7, P < 10⁻⁴; Figure 4). We also compared the MATH value of the samples before NAC with that of the samples after NAC; the mean MATH value was higher before NAC than after NAC (64.05 vs. 52.67). In the CSM group, the MATH value increased after NAC in 6 (6/13, 46%) patients and decreased in 6 (6/13, 46%) patients. In addition, when only one mutation was detected both before and after treatment, the MATH value before and after was the same. In the NCSM group, the MATH value increased in 3 (3/4, 75%) patients and decreased in 1 (1/4, 25%) patient (Figure 2). Second, based on the ROC curve analysis, the optimal threshold value of MATH was 58: tumors with a MATH value below this threshold, whether measured in samples taken before or after NAC, showed CSM after NAC (Figure 5). The ROC curve relating the MATH value to the shrinkage mode is shown in Figure 6. The area under the curve was 0.89, with a sensitivity of 0.69 and a specificity of 1. The threshold value of 58 indicated good accuracy in distinguishing the two shrinkage modes.
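The threshold-based classification and its summary statistics can be sketched as follows. This is an illustration under stated assumptions, not the SPSS analysis used in the study: CSM is treated as the positive class, a MATH value below the threshold is taken to predict CSM, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    math_values: Sequence[float],
    is_csm: Sequence[bool],
    threshold: float = 58.0,
) -> Tuple[float, float]:
    """Predict CSM when MATH < threshold and score against the true modes."""
    tp = fn = tn = fp = 0
    for value, csm in zip(math_values, is_csm):
        predicted_csm = value < threshold
        if csm and predicted_csm:
            tp += 1
        elif csm:
            fn += 1
        elif predicted_csm:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both CSM and NCSM cases are required")
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(math_values: Sequence[float], is_csm: Sequence[bool]) -> float:
    """AUC as the probability that a CSM case has a lower MATH value than an
    NCSM case, with ties counted as one half (Mann-Whitney formulation)."""
    csm = [v for v, c in zip(math_values, is_csm) if c]
    ncsm = [v for v, c in zip(math_values, is_csm) if not c]
    if not csm or not ncsm:
        raise ValueError("both CSM and NCSM cases are required")
    wins = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in csm for b in ncsm)
    return wins / (len(csm) * len(ncsm))
```

With made-up values such as `[20, 40, 95, 90, 110]` and modes `[True, True, True, False, False]`, two of the three CSM cases fall below 58 and both NCSM cases fall above it, so the sensitivity is 2/3 and the specificity is 1. These numbers are purely illustrative and are not study data.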

Characteristics of the Mutant-Allele Tumor Heterogeneity Value Before and After Neoadjuvant Chemotherapy in Patients With Different Molecular Typing
Considering the different MATH values of various molecular types of breast cancer at baseline levels, we assessed whether patients with different molecular typing met the threshold in the same way.
Three of four patients had one MATH value below the threshold of 58 (Figure 5).

DISCUSSION
In this study, we characterized the genomic landscape of a breast cancer cohort before and after NAC and reconstructed the pathological shrinkage mode in three dimensions for postoperative samples. We associated the MATH value of samples before and after NAC with the shrinkage mode after NAC. Our findings extend the knowledge in the field of BCT after NAC in several ways.
First, our results may provide an effective way to select suitable patients for BCT after NAC. One issue currently under discussion is how to identify the patients who are suitable for BCT. A lack of effective measurements for selecting patients may decrease both the safety and the rate of BCT. On the one hand, BCT after NAC in unscreened patients may increase the local recurrence rate. A recent meta-analysis reported that NAC was associated with more frequent local recurrence than adjuvant chemotherapy: the 15-year local recurrence was 21.4% for neoadjuvant therapy vs. 15.9% for adjuvant chemotherapy (4). On the other hand, because of safety concerns, some clinicians have challenged the choice of BCT after NAC (22-24), which may be due to insufficient estimation of NCSM. Our analysis revealed that NGS of pre- and post-NAC samples might help to select patients who are suitable for BCT. However, our findings require confirmation in a larger dataset and in multicenter research.
Second, our results suggested that genetic heterogeneity within a tumor may be the major factor determining the shrinkage mode after NAC. The factors known to affect the shrinkage mode after NAC are currently limited to clinical parameters; genetic information has not been available. Previous research by our group and others (8,16) has investigated the association between the shrinkage mode and tumor subtype, showing that patients with triple-negative and HER2-positive tumors have a higher probability of achieving CSM. However, we found that the shrinkage mode differed within the same subgroups, so subtype alone was insufficient to meet clinical needs. In this research, our results provided direct evidence that breast cancer patients with a high MATH value before and after NAC would show NCSM.
Our results also showed that breast cancer is a highly heterogeneous disease with few highly recurrent mutated genes. In accordance with previous reports (11,25), mutations in most genes other than TP53 and PIK3CA occurred at a low frequency in the samples before and after NAC. We were not able to find specific tumor somatic mutations associated with the shrinkage mode after NAC. However, our results showed that patients with CSM had a greater reduction in the number of mutations after NAC. Previous reports (26,27) have suggested that changes in mutations may be linked to sensitivity to chemotherapeutic drugs. Drawing on these reports, we propose a possible explanation. Tumors with higher heterogeneity contain a greater variety of clones than homogeneous ones. Patients with low MATH values have a relatively uniform response and sensitivity to NAC and are prone to CSM. In patients with high MATH values, the tumor contains a greater variety of clones; some of their cancer cells are sensitive to NAC, whereas the rest are not. When the cells carrying mutations are eliminated and few new mutations emerge, the mutant alleles are relatively reduced, the patients show CSM, and the MATH values decrease below the threshold. Conversely, in patients with NCSM, the cancer cells might be resistant to NAC; without sufficient loss of mutations after NAC, the mutant alleles remain or are not sufficiently reduced.
These results also raise the questions of whether assessing the heterogeneity of the primary tumor alone is sufficient to predict treatment outcome and whether the state after chemotherapy needs to be considered. Previous studies (15,28) assessed the heterogeneity of the primary tumor to study its correlation with treatment outcomes without considering the heterogeneity of the tumor after chemotherapy, and they did not produce the desired results. Similarly, in our study, some patients whose primary tumor had a high MATH value still achieved the desired result, and even PCR, whereas the MATH value after NAC was consistent with the outcome. Meanwhile, we also observed changes in heterogeneity during NAC. Both pre- and post-NAC assessments of intra-tumoral heterogeneity may be needed for risk stratification, a conjecture that is consistent with a previous report (27). Because of the small sample size, the threshold could be determined only imprecisely, and multiple thresholds would fit our findings. We selected 58 as the threshold, and all of our results were consistent with it. A larger sample size and a multicenter cohort are needed to help determine the threshold value.
Notably, the molecular subtypes of breast cancer differed in MATH before and after NAC. Our results showed that the MATH value increased after NAC in 75% of luminal A patients, that 87.5% of HER2-positive patients were below the threshold before NAC, and that all luminal B patients were above the threshold before NAC. Combined with a previous report (29), these results suggest that molecular characteristics are related to heterogeneity. If this conclusion is confirmed, and considering that the cost of serial sequencing assays limits their clinical implementation, the sample to be sequenced could be selected according to molecular typing. For patients with luminal B tumors, NGS of the post-NAC specimen would be considered first. Conversely, for patients with luminal A or HER2-positive tumors, NGS of the primary tumor biopsy would be the first choice.
Traditionally, the pattern of the residual tumor is classified into CSM and NCSM based on morphological changes shown on MRI. Of note, we used a more precise definition of CSM that further characterizes the shrinkage mode with concentric shrinkage. This method considers the morphology of surrounding lesions as well as the extent of the residual tumor. We defined the mode with the longest diameter retraction rate of the residual tumor ≤2 cm and ≥50% as CSM, which was also a good candidate for BCT and regarded as a good response to NAC. The definition appears to be consistent with the "limited multifocal regression" reported by Diane C (30), who further subdivided multifocal regression into diffuse and limited multifocal shrinkage. Their results likewise suggest that only diffuse multifocal shrinkage, and not the limited multifocal mode, is a risk factor that portends a worse outcome. In addition, a study from the Netherlands by Briete Goorts et al. reported that patients with Pinder classification 50-90% were regarded as pathological responders (31).
Our study has some limitations. First, the small sample size may bias the interpretation of the results and limits the precise determination of the threshold value. Second, the restricted cancer gene panel may affect the calculation of MATH, because MATH was developed based on whole-exome sequencing (15). Furthermore, more accurate methods for determining heterogeneity, for example those that include changes in copy number, may strengthen the study.

CONCLUSIONS
NGS provides a way to show the dynamics of genetic heterogeneity before and after NAC in breast cancer with different shrinkage modes. Our results suggest that MATH scores may correlate with the pathological shrinkage mode after NAC and that NGS-based approaches may have the potential to estimate the shrinkage mode and to enable the selection of the most appropriate patients for BCT during NAC. Specifically, patients with CSM had a greater reduction in the number of mutations after NAC, and we provide possible explanations for the association between genetic heterogeneity and the shrinkage mode. Finally, the choice between pre- and post-NAC samples for sequencing can be based on individualized molecular typing. These findings might help to optimize the choice of surgical options. Additional large clinical studies are needed to enable the selection of cost-effective sequencing options and to further determine the threshold.

DATA AVAILABILITY STATEMENT
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive in BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under accession number HRA000501 that can be accessed at http://bigd.big.ac.cn/gsa-human.

ETHICS STATEMENT
This retrospective study was approved by the Institute Review Board of Shandong Cancer Hospital (IRB Approval Number: SDTHEC201802002). The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.