IMMUNE MICROENVIRONMENT PROFILING OF NORMAL APPEARING COLORECTAL MUCOSA

Introduction: Lynch syndrome (LS) is the most common hereditary cause of colorectal cancer (CRC), increasing lifetime risk of CRC by up to 70%. Despite this higher lifetime risk, disease penetrance in LS patients is highly variable and most LS patients undergoing CRC surveillance will not develop CRC. Therefore, biomarkers that can correctly and consistently predict CRC risk in LS patients are needed to both optimize LS patient surveillance and help identify better prevention strategies that reduce risk of CRC development in the subset of high-risk LS patients.


Introduction
Lynch Syndrome (LS) is the most common hereditary cause of colorectal cancer (CRC), with a prevalence of approximately 0.1-0.4% in the general population, and accounts for 1-4% of patients with CRC (1). It is an autosomal dominant disease caused by a germline pathogenic variant (PV) in one of the DNA mismatch repair (MMR) genes (MLH1, MSH2, MSH6 or PMS2) or by deletions in the EPCAM gene that lead to silencing of MSH2 via promoter hypermethylation. LS is characterized by a very rapid transformation along the adenoma-carcinoma sequence that usually occurs in 1-3 years, in contrast to the 10-15-year timeline for MMR proficient tumors. Depending on the MMR genes involved, the lifetime risk of LS patients developing CRC is reported to be as high as 60% without surveillance (2). Given this heightened risk, the National Comprehensive Cancer Network (NCCN) recommends that regularly scheduled surveillance colonoscopy be performed more frequently and begin at earlier ages for individuals with LS (3). Most LS patients undergoing high quality colonoscopy surveillance, however, do not develop CRC. This dichotomy raises the potential of safely lengthening colonoscopy surveillance intervals in a subset of LS patients deemed to be at lower risk of developing CRC, if they can be accurately identified.

CRC in LS patients displays a highly microsatellite instable (MSI-H) phenotype, which is associated with increased immune infiltration of the CRC tumor microenvironment. This increase is in response to immunogenic frameshift peptides generated by the defective MMR machinery (4, 5). Interestingly, a systematic review of CRC literature indicates that LS-associated CRC tumors have an increased immune response when compared to sporadic MSI-H CRC tumors, even at the premalignant stage (6). A potential reason for this could be that normal appearing colonic crypts from LS patients can exhibit MMR deficiency (7-9).
The presence of MMR deficient crypts, however, was independent of an LS patient's cancer history (10). Such observations have resulted in an increasing realization that the immune status of normal colorectal mucosa in LS needs to be better characterized and understood. Toward this goal, a recent seminal study compared immune infiltration in tumor-distant normal appearing colorectal mucosa of LS patients with and without CRC with that of sporadic MSI-H and microsatellite-stable (MSS) CRC patients (11). Interestingly, it found elevated T-cell infiltration in normal mucosa of cancer-free LS patients when compared with MSS CRC patients, adding to the evidence that immunogenic frameshift peptides can induce an antitumor immune profile in the absence of tumor. It also identified altered immune profiles between normal mucosa of cancer-free LS carriers and LS CRC patients and raised the possibility that the immune profile of normal mucosa may be a risk-modifier in LS patients and could potentially help improve patient surveillance.

To expand this concept, we sought to determine whether LS impacts the constitutive inflammatory state of colorectal tissue, with the long-term goal of identifying biomarkers to incorporate into CRC prevention approaches. Specifically, we profiled the cytokine-based, local immune signaling microenvironment of normal appearing colorectal mucosa from LS and non-LS patients without active CRC, but with or without a history of CRC.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. medRxiv preprint doi: https://doi.org/10.1101/2023.03.03.23286594; this version posted March 6, 2023.

Establishing biomarker fidelity. As demonstrated in Figure 1, we applied the ratio-of-cross coefficient of variation (rxCOV) metric to establish the quality of the biomarker measurements. rxCOV uses an objective threshold of zero to verify that the good performance of a biomarker in separating patient groups was not an assay-associated experimental artifact (14). The rxCOV metric was independently applied to mELISA data from each visit.

Biomarker selection. Biomarkers that passed the fidelity check were used to select a subset capable of separating the two patient groups being compared. The selection was performed using logistic regression with an elastic-net penalty (15), implemented using the glmnet R package (16, 17). The elastic-net penalty overcomes a shortcoming of lasso-based biomarker selection, which can randomly select one biomarker from a set of highly correlated biomarkers while ignoring the rest of the correlated group. Since cytokine-based signaling includes both autocrine and paracrine components and is pleiotropic in nature, lasso-based biomarker selection can be particularly deleterious in this context (15, 18). The elastic-net penalty, on the other hand, can perform efficient biomarker selection while also accounting for the grouping effect of highly correlated biomarkers. As a result, it excludes trivial biomarkers but performs grouped selection, selecting the whole group of correlated biomarkers if some within the group are important in differentiating the two groups being compared. Such a selection process is better suited for cytokine-based biomarker selection. Specifically, we performed elastic-net-based penalized logistic regression utilizing two nested loops. The outer loop, with 100 iterations, sampled with replacement 70% of patients in each of the two groups being compared to generate a range of patient cohorts capturing the underlying patient distribution within each group.
For each outer loop iteration, an inner loop with 200 iterations was used to optimize the elastic-net penalized logistic regression model based on leave-one-out cross validation and to estimate the selection probability for each biomarker based on its ability to consistently classify the two patient groups. The final biomarker selection was performed by estimating their overall selection probabilities as a function of varying confidence thresholds. The advantage of this selection process is that it eschews use of an arbitrary threshold for biomarker selection and instead generates an overall selection-probability estimate of each biomarker's ability to help differentiate the two patient groups. Biomarkers with a value of more than 50% on this scale were selected. Biomarker selection was independently performed for each visit.
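The resampling idea behind the selection probability can be sketched in a few lines. The following is a minimal illustration in standard-library Python, not the glmnet-based implementation used in the study: it fits an elastic-net penalized logistic regression by proximal gradient descent on each resampled cohort and reports how often each biomarker receives a non-zero coefficient. The function names, the fixed penalty values (`lam`, `alpha`), and the toy data are all assumptions made for illustration; the inner leave-one-out cross-validation loop that tunes the model is omitted.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance (constant columns become 0)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] if sds[j] > 0 else 0.0 for j in range(p)]
            for r in rows]


def soft_threshold(value, cutoff):
    """Shrink toward zero; values within +/- cutoff become exactly zero (L1 part)."""
    if value > cutoff:
        return value - cutoff
    if value < -cutoff:
        return value + cutoff
    return 0.0


def fit_elastic_net_logistic(rows, labels, lam=0.2, alpha=0.5, n_steps=200):
    """Proximal-gradient fit with penalty lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    lam and alpha are fixed illustrative values, not tuned by cross validation.
    """
    x = standardize(rows)
    n, p = len(x), len(x[0])
    # Conservative step size from an upper bound on the gradient's Lipschitz constant.
    step = 1.0 / (0.25 * (p + 1) + lam * (1 - alpha))
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_steps):
        resid = []
        for xi, yi in zip(x, labels):
            z = intercept + sum(b * v for b, v in zip(beta, xi))
            resid.append(1.0 / (1.0 + math.exp(-z)) - yi)
        intercept -= step * sum(resid) / n
        for j in range(p):
            grad = sum(r * xi[j] for r, xi in zip(resid, x)) / n + lam * (1 - alpha) * beta[j]
            beta[j] = soft_threshold(beta[j] - step * grad, step * lam * alpha)
    return beta


def selection_probabilities(group_a, group_b, n_outer=100, frac=0.7, seed=0):
    """Fraction of resampled cohorts in which each biomarker gets a non-zero coefficient."""
    rng = random.Random(seed)
    counts = [0] * len(group_a[0])
    for _ in range(n_outer):
        # Sample 70% of each group with replacement, as in the outer loop described above.
        sample_a = rng.choices(group_a, k=max(2, round(frac * len(group_a))))
        sample_b = rng.choices(group_b, k=max(2, round(frac * len(group_b))))
        labels = [0] * len(sample_a) + [1] * len(sample_b)
        beta = fit_elastic_net_logistic(sample_a + sample_b, labels)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_outer for c in counts]


# Toy data (hypothetical): biomarker 0 separates the groups, biomarker 1 is a
# highly correlated copy of biomarker 0, and biomarker 2 is pure noise.
toy = random.Random(1)
group_a, group_b = [], []
for _ in range(20):
    s = toy.gauss(0, 1)
    group_a.append([s, s + toy.gauss(0, 0.1), toy.gauss(0, 1)])
    s = toy.gauss(3, 1)
    group_b.append([s, s + toy.gauss(0, 0.1), toy.gauss(0, 1)])

probs = selection_probabilities(group_a, group_b, n_outer=50)
selected = [j for j, q in enumerate(probs) if q > 0.5]
print(probs, selected)
```

In this toy setting the two correlated biomarkers are expected to be retained together, which is the grouping behaviour that motivates the elastic-net penalty over the lasso.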

Prediction: A Random Forest (19) classifier was trained on 70% of the patient data and the remaining 30% was used for validation. Two hundred bootstraps were performed to capture the patient data distribution in the mutually exclusive training and validation sets.

To profile the impact of LS and CRC history on the constitutive inflammatory state of colorectal tissue, we directly compared the immune microenvironment in the following groups: 1) LS patients with prior history of non-metastatic CRC (LS-CRC); 2) LS patients with no prior history of malignancy (LS-NoCRC);
We first established the reproducibility of our ex vivo colorectal explant culture and multiplexed ELISA-based profiling of the local immune signaling microenvironment in the colonic mucosa of LS patients over repeat visits. Toward this goal, we leveraged our ability to follow LS patients who are under annual to biannual endoscopic surveillance at our medical center. We obtained biopsies from those LS patients who did not experience any change in health status or develop any disease between their repeat visits spread over a period of two years (Figure 2A). The biopsies were cultured ex vivo (see Material and Methods) and profiled using multiplexed ELISA. We next tested the fidelity of each of the cytokines comprising the immune profile using our rxCOV metric to identify the subset of high-fidelity cytokines. Using the pairwise Wilcoxon signed-rank test (Graphpad Prism, v9.0, Boston MA) (20), we compared the expression levels of each high-fidelity cytokine across the two visits. Our results show that over the nearly two-year period the expression levels of most cytokines remained stable, with the difference in expression being statistically significant at the 0.05 level for only 9/32 of the selected biomarkers (Figure 2B). Although we are further expanding our temporal study to include additional visits, our initial findings indicate that we can reproducibly capture the constitutive inflammatory state of the local immune signaling microenvironment across repeat patient visits.

We explored Lynch status associated modification of the local immune signaling microenvironment of normal appearing mucosa in the absence of a history of CRC.
Using our computational method, we compared the cytokine profiles of LS patients without any history of CRC (LS-NoCRC) with the baseline healthy controls (HC) (Figure 3). We performed this comparison over two successive patient visits and identified Eotaxin-3, IL-16, IL-17A, IL-1α, and TNF-β as the set of cytokines that were consistently differentially modulated along the LS-status axis (Figure 3A). We tested the strength of this differential modulation by validating their ability to predict the LS status of patients within the two cohorts using a Random Forest-based classifier. Quantification of this performance via area under the Receiver Operating Characteristic curve (aucROC) revealed a relatively consistent level of performance over two patient visits that were separated by two years (Figure 3B). A direct comparison of expression levels of the identified biomarkers from LS-NoCRC and HC is presented for visit 1 (Figure 3C) and for LS-NoCRC visit 2 vs the single visit of HC (Figure 3D). We eschewed the typical statistical significance-based analysis that correlates differential signature with the outcome. Instead, we implemented an outcome-driven biomarker selection strategy that selected those markers that were most capable of capturing the impact of LS without a history of CRC, while accounting for their pleiotropic and redundant activity. As a result, the differential expression of individual selected biomarkers captures the subtle effect of LS on the colorectal microenvironment even though it might not be statistically significant.
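As an illustration of how an aucROC summary over repeated 70/30 splits can be computed, the sketch below uses only the Python standard library. It is not the analysis code from the study: the nearest-centroid scorer is a hypothetical stand-in for the Random Forest classifier, the repeated splits are drawn without replacement so that training and validation sets are mutually exclusive (an assumption about the resampling scheme), and the data are simulated.

```python
import math
import random


def auc_roc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mean_and_se(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, sd / math.sqrt(n)


def centroid_scorer(train_rows, train_labels):
    """Hypothetical stand-in classifier: project onto the difference of class centroids."""
    p = len(train_rows[0])

    def centroid(cls):
        members = [r for r, y in zip(train_rows, train_labels) if y == cls]
        return [sum(r[j] for r in members) / len(members) for j in range(p)]

    c0, c1 = centroid(0), centroid(1)
    direction = [b - a for a, b in zip(c0, c1)]
    return lambda row: sum(d * v for d, v in zip(direction, row))


def repeated_split_auc(rows, labels, fit, n_repeats=200, train_frac=0.7, seed=0):
    """Repeat a random 70/30 split, fit on the training part, score AUC on the rest."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    aucs = []
    while len(aucs) < n_repeats:
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train, valid = idx[:cut], idx[cut:]
        y_train = [labels[i] for i in train]
        y_valid = [labels[i] for i in valid]
        if len(set(y_train)) < 2 or len(set(y_valid)) < 2:
            continue  # redraw any split that is missing a class
        score = fit([rows[i] for i in train], y_train)
        aucs.append(auc_roc([score(rows[i]) for i in valid], y_valid))
    return mean_and_se(aucs)


# Simulated two-group data with a modest shift in the first feature.
sim = random.Random(2)
rows = [[sim.gauss(0, 1), sim.gauss(0, 1)] for _ in range(20)]
rows += [[sim.gauss(1, 1), sim.gauss(0, 1)] for _ in range(20)]
labels = [0] * 20 + [1] * 20

mean_auc, se_auc = repeated_split_auc(rows, labels, centroid_scorer)
print(f"aucROC: {100 * mean_auc:.1f} (mean) +/- {100 * se_auc:.2f} (se)")
```

Reporting the mean together with its standard error over the repeats mirrors the "mean ± se" format used for the aucROC values in the Figure 3 legend.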

Gene ontology-based enrichment analysis of the differential profile of Eotaxin-3, IL-16, IL-17A, IL-1α, and TNF-β revealed that, among other things, regulation of IL-6 production was fundamentally associated with LS status-based modification of the immune signaling microenvironment (21, 22) (Figure 3E). Interestingly, although IL-6 was not directly selected by our computational method, its enrichment suggests a potential close association between JAK/STAT signaling and LS status of the patient. Additionally, enrichment of IL-1 suggests the possibility of macrophage-mediated lymphocyte activation (23).

biomarkers that consistently differentiated between the two groups over repeated visits using the same strategy detailed in the previous subsection (Figures 4A-D). Interestingly, the presence of a CRC background resulted in IL-6 being explicitly included in the selected list, suggesting its more explicit role in the normal colonic mucosa of patients with a CRC background.

Pathway analysis based on the selected biomarkers revealed an enriched role of IL-10, IL-4 and IL-13 signaling, suggesting an immunosuppressive phenotype dependence (Figure 4E) (24). Concurrently, gene ontology-based enrichment analysis revealed an enriched role of IL-12, IL-23, NF-κB, and nitric oxide synthase (NOS) complexes that indicate a more proinflammatory phenotype (25, 26) and suggest a competition between the proinflammatory and immunosuppressive phenotypes in the normal colonic mucosa (Figure 4F). We note that many of these enriched complexes are implicated or dysregulated in patients with inflammatory bowel disease (27-30). Additionally, the TGF-beta complex was enriched. Aberration in TGF-beta signaling due to mutations in TGF-beta receptors is commonly found in MMR deficient CRC, suggesting a potential residual effect of CRC history (31). Thus, our selected biomarkers seem to indicate that the impact of LS in the background of CRC is managed through a competition between the immunosuppressive and proinflammatory phenotypes along with a residual effect of earlier CRC. The dominance of one phenotype over the other might potentially be concordant with increased risk of CRC relapse.

Finally, we studied the modification in the local immune signaling microenvironment of normal appearing colonic mucosa due to different CRC histories in patients with LS. Specifically, we profiled the LS-NoCRC and LS-CRC patient groups. Our computational analysis identified IL-8.1, IP-10, MCP-1, GMCSF, and IL-1β as biomarkers that were consistently able to differentiate between the two groups in both visits (Figures 5A-D).

Pathway analysis combined with gene ontology-based enrichment analysis revealed enrichment of signaling pathways and molecular complexes that are a subset of those that capture the impact of differing LS status in patients with a history of CRC (Figures 5E, F). Direct comparison of Figure 4E with 5E, and of Figure 4F with 5F, seems to suggest that the impact on the local immune signaling microenvironment of normal appearing mucosa due to differing LS status in patients with CRC history is much broader than the impact associated with differing CRC histories in LS patients. Importantly, we have now identified a consistent set of immune signaling signatures predictive of risk in both these settings over two patient visits. We aim to validate these signatures and better understand their biological underpinnings as part of our ongoing prospective study.

Biomarkers that can correctly predict CRC risk in LS patients are needed to successfully mitigate the risk through prevention approaches including surveillance colonoscopy, vaccines, and chemoprevention. Ideally these approaches will be personalized with the goal of reducing costs by optimizing surveillance intervals or through predicting and monitoring agents that are most likely to prevent the development of colonic neoplasia. We took a novel approach to address this issue. Taking advantage of the large Lynch patient population that regularly undergoes surveillance colonoscopies at our institution, we were able to identify a cohort of patients that we followed over repeat visits. We obtained colorectal biopsies from these patients and, using an ex vivo explant system combined with multiplexed ELISA, profiled the local immune signaling microenvironment of their normal appearing mucosa. Unlike most studies that focus on active disease in LS patients, our selection criteria mimicked two real-world surveillance scenarios for LS patients who are currently cancer-free: 1) those with no history of CRC who may develop CRC for the first time, or 2) those with a history of CRC who could experience a second primary CRC. By profiling the immune microenvironment of their normal appearing mucosa and using a history of CRC as a surrogate for CRC risk, our prospective and ongoing study aims to elucidate subtle but robust differences associated with immune modulation dependent on LS status and the residual effect of a prior resected CRC. Importantly, our study established reproducibility of our results over repeat visits when there was no significant change in the patient's health status, thereby identifying a set of potential biomarkers that warrant further investigation in future prevention studies.

The potential of using differentially expressed serum and plasma cytokines as biomarkers for detecting the presence of CRC has been observed by multiple groups. By combining logistic regression models with multiplex ELISAs, multiple teams have proposed groups of biomarkers capable of distinguishing between CRC and HC. Both the combination of serum levels of IL-9, Eotaxin, GM-CSF, and TNF-α and the combination of IL-4, IL-8, Eotaxin, IP-10, and TNF-α can distinguish between patients with CRC and HC (32). High serum IL-8, high IL-6, low MCP-1, low IL-1ra and low IP-10 were also able to distinguish CRC patients with active disease from HC, with low serum IP-10 in combination with high IL-8 and IL-6 being associated with metastasized disease (33). However, these studies focus on detecting active disease, and not on characterizing alterations in the immune signaling microenvironment of normal mucosa in LS patients in a CRC-history dependent manner. By focusing on the latter, we anticipate that our findings have the potential to assess CRC risk and help in preemptively mitigating it by optimizing surveillance intervals and identifying immunomodulating prevention agents.

An important aspect of this study is that our analysis was informed by the redundant and pleiotropic nature of cytokine activity. We specifically avoided two pitfalls. First, we did not employ a simple p-value based determination of significance of differential cytokine expression to select important biomarkers. This is necessary due to the interconnectedness of cytokine signaling. It is possible that cytokines without a statistically significant differential expression between the comparison groups are important for separating those two groups. To avoid this pitfall, we utilized the predictive ability of the cytokines as the selection criterion. Second, while testing their predictive ability we utilized an elastic net penalty instead of the more commonly used lasso penalty to explicitly account for their correlated nature, while also excluding trivial associations.

Our analysis also benefited from our use of the rxCOV metric, which we previously developed to avoid conflating assay-associated experimental variability with true significance of differential expression of any cytokine. Utilizing the rxCOV metric allowed us to filter out cytokines that truly were not statistically significant or whose differential expression was overwhelmed by assay-associated noise. Although this latter aspect might have reduced the number of selected biomarkers for any two comparison groups, it did ensure that the differential expression of selected biomarkers truly had predictive ability in the context of our multiplexed ELISA measurements of explant colorectal mucosal cultures. The overall consistency of biomarkers selected between visit 1 and visit 2, which occurred nearly two years later, demonstrates the robustness of both the assay and analytical methods.

Due to the significant impact on healthcare resources, there is debate on the frequency with which LS patients should undergo surveillance colonoscopy (34). Our long-term goal is to have a robust set of biomarkers that are strongly predictive of individual patient risk for developing colorectal neoplasia to identify those LS patients for whom surveillance intervals can be lengthened, based on their immune microenvironment detected from normal-appearing rectal biopsies. We note that although epidemiological studies have shown that risk of developing CRC in LS patients is correlated with MMR gene type, it remains unclear how to identify those patients in the genotype-defined cohorts that benefit from increased CRC surveillance. On the other hand, it is becoming increasingly probable that the local immune microenvironment of the normal colonic mucosa might be a more relevant, sensitive, and specific indicator of the evolving risk of an LS patient developing colorectal neoplasia (11). It is also potentially a better indicator than serum biomarker levels, which characterize systemic alterations that are not specific to the subtle alterations in the local microenvironment. Our approach of utilizing the patient biopsy obtained during regularly scheduled surveillance to profile their local immune microenvironment, therefore, has the potential to be incorporated into managing patient surveillance intervals.

Our study is currently limited by lack of complementary imaging data that corroborates cytokine activity differences in the context of immune cell infiltrates and their states of activation and polarization. Future work will focus on addressing this aspect while continuing to build on our strength of being able to follow patients over multiple visits, increasing the relatively small number of patients and expanding our cytokine repertoire. Combining these findings with sequencing-based analysis will help us not only optimize surveillance strategies but also identify potential immunoprevention candidate targets that will have the benefit of a long follow-up.

This study was focused on characterizing the local immune signaling microenvironment of normal appearing mucosa in clinically relevant LS patient groups and utilizing this characterization to identify a set of biomarkers that were consistently able to differentiate patients in an LS and CRC status dependent manner over two visits. Our candidate biomarkers require validation in prospective, blinded studies using an independent cohort of LS patients. Further work is required to determine if the biomarkers can play a role in selecting prevention strategies, such as identifying potential biological pathways that should be targeted or monitoring response by detecting alterations in the immune signaling microenvironment.

Acknowledgements

This project was supported in part by the Hillman Developmental Funds (P30CA047904). The authors would like to recognize the study participants and thank the clinical coordinators (Nancy Abubaker, Marietta Kocher, and Parker Ulrich) and laboratory staff (Katrina Culbertson, Alyssa Hein, Katie Sauka, Aaron Siegel, and Amanda Swistok) for their assistance in patient recruitment, sample collection and tissue assays.
to compute the overall selection probability for each biomarker. Those with overall selection probability greater than 0.5 comprise the selected biomarkers.
Biopsies obtained during these visits were profiled using an ex vivo explant system and multiplexed ELISA. Biopsies from HC and non-LS patients with a history of CRC (sporadic-CRC patients) were obtained during a single visit. (B) Pairwise comparison of biomarker expression between the two visits. Only the biomarkers that passed the rxCOV fidelity threshold are shown.

boxplots were obtained using a Random Forest classifier trained on 70% of the patient data, with the remaining 30% used for validation. Two hundred bootstraps were performed to generate the boxplots and test the stability of the performance. Visit 1 aucROC: 75.6 (mean) ± 0.72 (se). Visit 2 aucROC: 80.7 (mean) ± 0.66 (se). (C, D) Expression levels of the selected cytokines for visit 1 and 2, respectively. (E) Gene ontology-based enrichment analysis.
Gene ontology-based enrichment analysis. (F) Reactome pathway analysis.