Plasma-Derived Inflammatory Proteins Predict Oral Squamous Cell Carcinoma

Oral squamous cell carcinoma (OSCC) remains a major concern, with high morbidity and mortality worldwide despite current knowledge and advances in treatment. OSCCs diagnosed at a late stage often require wide excision with or without neck dissection, radiotherapy, or chemotherapy. Even when deemed successful, treatment often results in diminished quality of life, impaired function, and disfigurement. Strategies for early detection are urgently needed for patients afflicted with this disease. Inflammatory plasma protein biomarkers have been shown to be potential tests for early detection and disease monitoring in several cancers. There has been no study on inflammation-related plasma biomarkers in OSCC. The objectives of the study were to use a multiplex approach to screen plasma-derived biomarkers and to examine the association of measurable proteins with OSCC. A total of 260 plasma samples (210 OSCC and 50 normal controls) were collected to measure the concentration of inflammation-related biomarkers using electrochemiluminescence multiplex assays. After screening of 82 potential biomarkers in the first 160 OSCC, 16 cytokines, chemokines, and growth factors were identified and verified in the second set of samples containing 50 OSCC and 50 normal. After adjustment for age and batch effects, differential expression analysis showed that 14 biomarkers were markedly lower in OSCC and that interleukin 1 receptor antagonist (IL1Ra) was significantly higher. By performing unsupervised clustering analysis, we observed a distinct group of normal samples and two subgroups of OSCC. A LASSO-penalized regression model based on IL2, IL1Ra, and macrophage migration inhibitory factor (MIF) classified OSCC with high accuracy, with sensitivity of 0.96 and specificity of 0.92. In conclusion, this is the first paper to identify potential inflammatory plasma protein biomarkers in patients with OSCC. With further validation, the set of biomarkers can potentially be used to assist in early detection of OSCC when the disease is localized and at a more treatable stage.


INTRODUCTION
Oral squamous cell carcinoma (OSCC), the most common form of oral cancer, remains a global health issue, accounting for 274,000 new cases and 145,000 deaths each year (1,2). Despite advances in treatment, the improvement in 5-year survival rates (30-60%) has been modest, mainly because of the aggressive nature of this disease and its high recurrence rates in lymph nodes and distant organs (3,4). Early detection of cancerous lesions at more localized and treatable stages can potentially improve this dismal outcome. However, OSCCs are often caught at a late stage, because detection largely relies on regular screenings with invasive diagnostic biopsies. In addition, post-treatment complications include scarring and trauma, which often cause tissue alterations and can preclude identification of early recurrence. Moreover, repeated biopsies for post-treatment monitoring are impractical and can further traumatize the yet-to-heal mucosal surface. Therefore, there is a need for a non-invasive tool for early detection of OSCC to improve clinical treatment and patients' quality of life.
Advances in genomic technologies have made possible the early detection of key biological events that contribute to tumorigenesis of OSCC. It is largely accepted that OSCC, like other cancers, is a genetic disease characterized by loss of heterozygosity (5,6), deregulated expression of cell cycle or proliferation proteins (7-10), and dysregulation of microRNA expression (11-14). Harnessing the immune response directed against tumors is another promising avenue, given the well-established evidence of immune-related molecules reacting to tumor antigens in a variety of cancer types. Accordingly, identification of biomarkers that are specific to the OSCC environment would provide an effective strategy for cancer screening.
The circulatory system is known to contain components that reflect diverse physiological and pathological states. Therefore, blood sampling, as opposed to tissue biopsy, is an attractive avenue for developing a relatively less invasive screening test, especially with the advent of proteomic technologies such as mass spectrometry or microarray-based assays. Previous studies comparing serum or plasma levels among healthy controls, premalignant lesions, and OSCC have revealed significant differences in several proteins, such as angiogenic factors (15-17), cytokines, chemokines, and growth factors (18-21). The objectives of this study were to use a multiplex approach to screen plasma-derived biomarkers and to examine the association of measurable proteins with OSCC.

MATERIALS AND METHODS
Study Population
The OSCC patients were identified from a pan-Canadian surgical trial (NCT01039298) (22). Among the 443 patients enrolled between 2010 and 2016, we identified 210 OSCCs from the oral anatomical sites (ICD-10 site codes of C02.0-C06.9) with at least 3 years of post-surgery follow-up. The blood samples were collected at the time of surgery, processed within 4 h of collection, and had not gone through any freeze-thaw cycle. Patient baseline demographic data included age, sex, ethnicity, smoking history, exposure to second-hand smoke, and alcohol consumption. Clinical-pathological data included lesion anatomical site, clinical assessment of tumor size and neck lymphadenopathy, tumor grade, and depth of invasion. Outcome data included overall survival, disease-specific survival, and development of nodal disease during post-surgery follow-up.
In addition to the OSCC samples, 60 normal samples were used. Ten were existing normal blood samples collected for other studies and served as the baseline normal for Cohort 1; 50 plasma samples from participants of the British Columbia Generations Project (BCGP) were requested based on one-to-one matching criteria for age (±5 years), sex, and smoking history. These participants, recruited between 2010 and 2016, had no known cancer history from the time of blood collection up to the last data update in November 2017. Pre-analytical conditions were the same as those of the OSCC samples: whole-blood samples were processed within 4 h of collection and had not gone through any freeze-thaw cycle. These BCGP samples served as the baseline normal in comparative analyses. Supplementary Tables 1, 2 summarize the study population by Cohort. The study schema is illustrated in Figure 1.
This study was carried out in accordance with the recommendations of University of British Columbia Clinical Research Ethics and the BC Cancer Research Ethics General Guidance Notes (GNs), BC Cancer Agency Research Ethics Board. This study utilized the clinical information and samples collected from existing studies which were approved by the BC Cancer Agency Research Ethics Board (REB#H09-03090 and REB#H08-01354, respectively). The present study was approved under REB#17-02031.

Sample Preparation
Whole blood samples were collected in EDTA vacutainer tubes (Becton Dickinson, Franklin Lakes, NJ, USA), either at the time of surgery for OSCC or at the time of enrollment in BCGP, stored at 4 °C, and processed within 4 h of collection. The OSCC whole blood samples were centrifuged at 1,500 × g for 15 min at room temperature to separate the plasma, which was then stored at −80 °C until use. The BCGP whole blood samples were processed for plasma separation in accordance with BCGP's standard operating protocol, with centrifugation at 1,300 × g for 10 min at 4 °C (23).

Electrochemiluminescence Multiplex Assays
Plasma protein expression was measured with multiplex electrochemiluminescence (ECL) detection assays using commercially available kits from Meso Scale Diagnostics (MSD) (Rockville, MD, USA). We first screened potential OSCC biomarkers among the 82 biomarkers across Cohort 1 (150 OSCC and 10 normal;   Table 3. Based on the differential expression analysis from Cohort 1 and the current literature (24-26), we identified 16 candidate biomarkers and verified them by performing ECL assays (V-PLEX Angiogenesis Panel 1 Human Kit, n = 3; V-PLEX Vascular Injury Panel 2 Human Kit, n = 3; U-PLEX Biomarker Group 1 Human, n = 10) on Cohort 2 (60 OSCC and 50 BCGP normal; Figure 1). The verification results for the 16 biomarkers in Cohort 2 are summarized in Supplementary Table 3. To assess assay reproducibility, we also randomly selected 10 OSCC samples from Cohort 1 and repeated the measurement of the 16 candidate biomarkers as part of Cohort 2.
All ECL assays were conducted as per the manufacturer's protocols. Briefly, supplied 96-well plates were washed and coated (for U-PLEX kits) with monoclonal antibodies followed by addition of serially diluted calibrator standard in duplicates and plasma samples (20 to 50 µL with dilution factor as per assay protocol), incubation with shaking (1-2 h, room temperature), washing (three times each well), addition of 20-50 µL SULFO-TAG conjugated secondary monoclonal antibodies, and final incubation with shaking (1-2 h, room temperature). MSD Read Buffer was added to each well right before loading the plates for signal detection on the QuickPlex SQ 120 (Meso Scale Diagnostics, LLC, Rockville, MD, USA). Pre-analytical data processing was performed on MSD Discovery Workbench software version 4.0.12 to calculate the concentration of each biomarker in each sample based on the standard curves generated from calibration standards using the four-parameter logistic fit.
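For readers unfamiliar with the four-parameter logistic (4PL) model, the following minimal Python sketch shows one common parameterization of the curve and how a concentration can be back-calculated from a measured signal. It is not the MSD Discovery Workbench implementation; the function names and all parameter values are hypothetical and chosen only for illustration.

```python
def four_pl(conc, a, b, c, d):
    """Signal predicted by a four-parameter logistic curve.

    a: response at zero concentration
    d: response at saturating concentration
    c: concentration at the inflection point
    b: slope factor
    """
    return d + (a - d) / (1.0 + (conc / c) ** b)


def back_calculate(signal, a, b, c, d):
    """Invert the curve to get the concentration that yields `signal`."""
    if not min(a, d) < signal < max(a, d):
        raise ValueError("signal lies outside the range covered by the curve")
    return c * ((a - d) / (signal - d) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters for one analyte (not taken from the study).
params = dict(a=120.0, b=1.1, c=850.0, d=1.5e6)

signal = four_pl(100.0, **params)            # signal expected at 100 pg/mL
estimate = back_calculate(signal, **params)  # recovers the concentration
```

Signals outside the asymptotes of the fitted curve cannot be inverted, which is why the sketch raises an error rather than extrapolating.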

Statistical Analysis
All data analysis was performed using R version 3.4.4. For comparative analyses, we considered the BCGP samples as the "normal" group as opposed to the "diseased" OSCC group. Patient demographics and clinical-pathological characteristics were compared by using Student's t-test for continuous variables or Fisher's exact test for categorical variables. All statistical tests at p < 0.05 were considered significant.
To screen for potential OSCC biomarkers, we performed the unpaired two-group Wilcoxon-Mann-Whitney test to compare concentration levels between OSCC and normal samples in Cohort 1. Biomarkers with p < 0.05 after correction for multiple testing with the Benjamini-Hochberg (BH) procedure were considered significant and taken forward as candidate biomarkers for verification in Cohort 2. Finally, differential expression analysis of the 16 candidate biomarkers was performed on the 210 OSCC against the 50 BCGP normal samples.
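The multiple-testing correction described above can be sketched as follows. The analysis in this study was run in R; the Python function below and the example p-values are illustrative assumptions, not study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five biomarkers.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

In this toy example four raw p-values fall below 0.05, but only the first two remain below 0.05 after adjustment.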
To examine the association between biomarker concentrations and OSCC, we first used logistic regression analyses to assess the potential impact of patient demographic variables, selecting those with p < 0.05 as potential confounding factors. Linear regression analyses for differential expression between OSCC and normal were then performed for each biomarker, adjusting for confounding variables and for batch effects that may interfere with clustering analysis.
To investigate the presence of subgroups within the study population, we used hierarchical clustering (pheatmap v1.0.10) (27), with the concentration (pg/mL) data matrix for the candidate biomarkers across the 210 OSCC and 50 normal samples as input. We used Ward.D2 as the clustering method, with Pearson correlation and Euclidean distance as the distance measures for clustering the columns and rows, respectively. Further, the relationships among candidate biomarkers in OSCC or in normal samples are presented by network visualization (qgraph v1.5) (28), with the Pearson correlation coefficient matrix of log10-transformed concentrations as input. The output is a network of circular nodes, each representing a candidate biomarker, connected by lines that represent the strength of significant correlations (p < 0.05): the greater the distance between two biomarkers, the lower the correlation, and the absence of a connecting line denotes zero correlation. The placement of the nodes represents how the biomarkers cluster.
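As an illustration of the network input described above, the following Python sketch computes a Pearson correlation matrix from log10-transformed concentrations. The marker names and values are hypothetical, and converting correlation to a distance as 1 - r is a common convention assumed here, not a detail stated in this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(profiles):
    """Pairwise Pearson r on log10-transformed concentrations."""
    names = list(profiles)
    logged = {k: [math.log10(v) for v in profiles[k]] for k in names}
    return {a: {b: pearson(logged[a], logged[b]) for b in names} for a in names}


# Hypothetical concentrations (pg/mL) of three markers in five samples.
toy = {
    "marker_A": [10.0, 20.0, 40.0, 80.0, 160.0],
    "marker_B": [5.0, 10.0, 20.0, 40.0, 80.0],
    "marker_C": [900.0, 450.0, 300.0, 150.0, 100.0],
}

r = correlation_matrix(toy)
# Assumed convention: strongly correlated markers get a small distance.
dist = {a: {b: 1.0 - r[a][b] for b in r[a]} for a in r}
```

Here marker_A and marker_B rise together and end up close to each other, whereas marker_C falls as the others rise and ends up far from both.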
To explore the potential of these biomarkers in detection of OSCC, we first performed LASSO penalized regression analysis (glmnet v2.0-16) (29) to identify the biomarkers and baseline covariates that classify OSCC with the highest accuracy. The regression model input was log10-transformed biomarker concentration. We randomly partitioned the entire study population (n = 260) into a training set for model development and a test set for evaluation of the fitted model. The biomarkers with the highest discriminative performance were assessed by computing their sensitivity and specificity in classifying normal and OSCC in the test set. A receiver operating characteristic (ROC) curve was then generated (pROC 1.12.1) (30), with the area under the curve (AUC) as an estimate of predictability for OSCC.
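The AUC can be read as the probability that a randomly chosen OSCC sample receives a higher predicted score than a randomly chosen normal sample. The sketch below computes it that way in Python; it is not the pROC implementation used in this study, and the scores are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities of OSCC from a fitted classifier.
oscc_scores = [0.91, 0.85, 0.77, 0.64, 0.40]
normal_scores = [0.55, 0.30, 0.22, 0.10]

auc = roc_auc(oscc_scores, normal_scores)
```

In this toy example 19 of the 20 OSCC/normal pairs are ordered correctly, giving an AUC of 0.95.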

RESULTS
Study Population
Patient demographics are summarized in Table 1 (OSCC and BCGP normal) and tumor characteristics are summarized in Supplementary Table 2. OSCC patients were mainly middle-aged, ever-smokers, and White; compared to BCGP participants, the OSCC patients were older (p < 0.001) (Table 1). The majority of OSCC lesions were on the tongue (66.7%) and early-stage (72.3%). In addition, 33.3% of the patients had loco-regional recurrence, 18.1% died of disease, and 10.9% died of other causes (Supplementary Table 2). We performed multivariate logistic regression analysis to assess the potential association of demographic variables with OSCC (Table 1). Age and ethnicity were significantly associated with OSCC.
To assess differential expression between the 210 OSCC and 50 BCGP normal samples for the 16 candidates, we performed linear regression analysis with adjustment for age, ethnicity, and batch as confounding variables (Table 2). This revealed 15 candidate biomarkers that were significantly differentially expressed (p < 0.05), with 14 significantly lower and IL1Ra significantly higher in OSCC samples compared with BCGP normal (Table 2 and Figure 2A). Given that the objective was to identify biomarkers for early detection of OSCC, we also performed differential expression analysis between normal (n = 50) and early-stage OSCC (T1/T2 and depth of invasion [DOI] ≤ 10 mm; n = 152) (31), and between early-stage and late-stage OSCC (T3/T4 and/or DOI > 10 mm; n = 58) (Table 1). We observed similar differences between normal and early-stage OSCC, but there were significantly higher concentrations of CRP and SAA in late-stage OSCC (Supplementary Tables 5, 6).
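The early-stage and late-stage definitions used in this comparison can be written as a simple rule. The Python function below is a sketch of that rule only; its name and argument format are hypothetical.

```python
def oscc_stage_group(t_category, doi_mm):
    """Group a tumor as 'early' or 'late' using the definitions in the text.

    early: T1/T2 and depth of invasion (DOI) <= 10 mm
    late:  T3/T4 and/or DOI > 10 mm
    """
    if t_category not in {"T1", "T2", "T3", "T4"}:
        raise ValueError("unknown T category: %r" % (t_category,))
    if t_category in {"T1", "T2"} and doi_mm <= 10:
        return "early"
    return "late"


# Hypothetical tumors, not study data.
examples = [("T1", 4.0), ("T2", 12.0), ("T3", 3.0)]
groups = [oscc_stage_group(t, doi) for t, doi in examples]
```

Note that a small T1/T2 tumor is still grouped as late-stage when its depth of invasion exceeds 10 mm.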

Unsupervised Clustering of Biomarker Expression Reveals Subgroups of Samples
To investigate the extent of biomarker heterogeneity across the 260 samples, we performed unsupervised hierarchical clustering, which revealed 3 main clustered groups (CGs) of samples (Figure 2B). CG1 comprised mainly normal samples (65.7%), while most of the OSCC clustered into CG2 and CG3, with CG2 showing distinctively lower levels of ICAM1, I309, MCP3, MIP1a, and IL1a. CG2 and CG3 had similar baseline demographics and clinical-pathological characteristics (Supplementary Table 4), suggesting that other clinical or biological factors are associated with the clustering. Interestingly, compared to CG3, more CG2 OSCC had a greater tumor size of T3/T4 (9.9 vs. 5.3%) and were lymph node positive at the time of surgery (21.7 vs. 13%), and significantly more late-stage OSCC were in CG2 (p = 0.005). This suggests that the plasma levels of these biomarkers may reflect tumor stage at the time of initial diagnosis. Given that the biomarkers were also clustered into three groups by hierarchical clustering, we investigated the relationships among them by computing Pearson correlation coefficients with correlation network visualization (Figure 3). The network of BCGP normal (Figure 3A) consisted of strong and tight correlations (thick red lines) for most of the biomarkers, except MIF and bFGF, which showed a negative correlation (thin faded black line). In contrast, OSCC (Figure 3B) showed 2 tight clusters (MCP3, I309, ICAM1, MIP1a, and MCSF; IL2, IL6, and IL10) with fewer biomarkers, a separate but close relationship between MIF and bFGF, and negative correlations of ICAM1 with IL2/IL10, I309 with Tie2, MCSF with IL2/IL10, and MIP1a with IL10 (faded black lines). This suggests that there are subgroups of OSCC that may differ in biological processes, introducing noise into the relationships between these biomarkers. In addition, there was an inverse relationship for several markers, where the correlation is positive in BCGP normal but negative in OSCC. This may also reflect the consequences of the response mechanisms of certain cytokines in the presence of a tumor.

Discriminative Performance of Plasma Biomarkers
As a preliminary step to investigate the diagnostic performance of these circulating biomarkers, we randomly partitioned the 260 samples into training (n = 195, 158 OSCC and 37 normal) and test (n = 65, 52 OSCC and 13 normal) sets. LASSO penalized regression was performed on the training set which selected MIF, IL2, and IL1Ra as variables that best classified OSCC and normal. These selected variables were then applied to the test set. Although the test set was small, the model achieved high performance with AUC of 0.96, sensitivity of 0.96, specificity of 0.92, PPV of 0.98 and NPV of 0.86 (Table 3 and Figure 4).
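The reported test-set metrics follow from a 2 × 2 confusion matrix. The counts below are not reported in this paper; they are inferred here as a set of counts that is consistent with the test-set size (52 OSCC, 13 normal) and the rounded values above, and should be read as an assumption. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # OSCC correctly called OSCC
        "specificity": tn / (tn + fp),  # normal correctly called normal
        "ppv": tp / (tp + fp),          # called OSCC that are OSCC
        "npv": tn / (tn + fn),          # called normal that are normal
    }


# Inferred counts (assumption): 50 of 52 OSCC and 12 of 13 normal classified correctly.
metrics = diagnostic_metrics(tp=50, fn=2, tn=12, fp=1)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With only 13 normal samples in the test set, a single misclassified normal sample shifts specificity by about 0.08, which illustrates why the small test set calls for cautious interpretation.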

DISCUSSION
OSCC, with its poor survival and significant impact on quality of life, has been an under-studied disease. As an immune-inhibitory disease, it is known to be associated with increased expression of cytokines and chemokines in the tumor microenvironment, with mounting evidence associating these inflammatory changes with disease stage and recurrence (32,33). Therefore, circulating biomarkers may also reflect pathological disease states, and blood samples, which contain detectable proteins, can potentially be used to screen for OSCC. This is the first study to identify, by comparison with normal samples, a set of potential plasma inflammatory protein biomarkers that distinguish OSCC from normal. Blood sampling is a more convenient and relatively less invasive means of sample collection than tissue biopsy. In addition, examining blood protein components provides a systemic view of the changes that occur in the presence of cancer. A comprehensive proteomics approach is suitable for biomarker discovery; however, conducting the assays and executing the proteomic analysis while integrating the sources of variation that contribute to aberrantly expressed proteins can be costly. The MSD platform, which is ECL-based, is commercially available, can be customized to screen or measure a set of targeted proteins, and has shown promising, robust results in disease screening (34,35). Moreover, compared to the enzyme-linked immunosorbent assay (ELISA), ECL has been found to have higher sensitivity, with the capability to multiplex up to ten biomarkers per sample. As the first step of biomarker screening, we used MSD to screen for potential targets and verify them in an independent set of test samples. In our study, we assessed the potential confounding impact of age. Compared to the normal controls, all differentially expressed biomarkers remained significant after adjustment for age and batch effects.
Our study samples were subjected to only 1 to 2 freeze-thaw cycles. The effects of freeze-thawing and storage time on the detectable concentration of blood proteins have been investigated. Studies have found that freeze-thaw, up to a maximum of five cycles, and long-term storage of samples at −80 °C caused minimal changes to the concentrations of blood plasma proteins (36-38). A recent study investigated the effects of freeze-thaw by comparing concentrations of inflammatory proteins, of which 9 overlap with our panel (39). No significant differences were found between never-frozen samples and samples that had undergone at least one freeze-thaw cycle. For a blood-based diagnostic test in clinical settings, immediate processing of fresh samples is convenient. However, to be cost-effective in clinical settings, an MSD assay should analyze 80 collected samples. Thus, subjecting samples to at least one freeze-thaw cycle may be unavoidable. Nevertheless, if the plasma biomarkers in this study are validated as OSCC specific, more economical clinical platforms, such as ELISA, can be designed to test fresh blood samples in clinical settings.
The most intriguing observation from this study is the significantly low level of these biomarkers in OSCC compared to normal samples. Although this was unexpected, OSCC is known to be an immune-inhibiting disease; therefore, the observed low-level expression may reflect this. Another possible explanation is that tumor-associated inflammation-related proteins are present at higher concentrations in the tumor tissue than in the blood. Further investigation of the expression of these targets in tumor tissue and unaffected tissue, and of its correlation with circulating levels, may shed light on the underlying mechanism. However, given the multifaceted nature of these biomarkers, whose expression is affected by a variety of signaling pathways, the biological explanation of the observed expression requires further experiments that are beyond the scope of this study. Nevertheless, with the observed blood levels we were able to estimate the performance in classifying OSCC, which achieved high sensitivity and specificity.
The present study found a strong association between OSCC and altered levels of proteins involved in the immune response, including decreased IL2 and MIF and increased IL1Ra. IL2 is one of the key cytokines with a regulatory role in T-cell expansion and activation through the main signaling pathways (STAT, PI3K-AKT, and MAPK) that mediate survival, proliferation, differentiation, activation, and cytokine production in different types of immune cells (40,41). IL2 is predominantly produced by antigen-stimulated T cells, NK cells, and activated dendritic cells. The absence of IL2 is thus consistent with the immune-deficient character of head and neck cancer, including OSCC (42). We observed almost undetectable levels of circulating IL2 in OSCC.
MIF is a pro-inflammatory cytokine that is constitutively expressed and readily secreted by activated immune cells, promoting cell proliferation and angiogenic activities and facilitating the detection of antigens and the production of other inflammatory cytokines (43). With regard to carcinogenesis, high expression of MIF has been found to inhibit the regulatory effects of p53-mediated apoptosis in tumor cell lines and to inhibit cytotoxic CD8+ T cells (44,45). In addition, MIF was also demonstrated to activate T cells through the production of pro-inflammatory molecules, including IL2 and IL6 (44). The low expression of MIF may explain the observed low level of IL2.
IL1Ra is a structural variant of the IL1 ligand with anti-inflammatory effects that result from competitive binding to IL1 receptors. Therefore, the elevated level of IL1Ra in circulation may indicate the presence of inflammatory effects of IL1 in tumor tissues, which trigger IL1Ra to counterbalance the signaling pathways activated by IL1. This suggests that expression of IL1Ra plays a role in restraining tumor progression (46). Several studies have demonstrated the expression of IL1Ra to be positively correlated with progression and lymph node metastasis (47-49), to inhibit IL-1-mediated prostate cancer regression (50), and to increase the growth rate of glioblastoma cells (51). In our study, the expression of IL1Ra was markedly higher in patients with OSCC than in normal controls. In addition, we observed a significantly higher level of IL1Ra in OSCCs that developed lymph node disease (fold change 1.1, p = 0.03). These results may suggest that the increase in IL1Ra served to reduce the tumor-mediated production of IL1 (52) and that IL1Ra could be of value in assessing disease severity.
In exploring the correlations among the candidate biomarkers using network visualization, we observed interesting reversed relationships between biomarkers in OSCC compared with normal BCGP samples. For example, bFGF was negatively correlated with MIF among normal BCGP samples, but in OSCC it showed significant positive correlation with MIF and several other biomarkers. This reflects previous reports demonstrating the production of MIF in the presence of growth factors and its induction of tumor growth (53,54). We also observed a significant negative correlation between ICAM1 and IL10 among the OSCC samples, suggesting an inhibitory role of IL10 on ICAM1 and T-cell activation (55). In addition, the biomarkers in OSCC clustered into more subgroups, suggesting biological differences within OSCC, although we did not find significant differences between clustered groups with regard to demographics, tumor clinical-pathological characteristics, or outcome.
The limitations of the study should be considered. First, this is not a large-scale mass spectrometry or microarray study that comprehensively interrogates the complex plasma proteome. Therefore, biological sources of variability in observed expression, such as protein isoforms or pre- or post-transcriptional modifications, could not be identified. Instead, we wanted to apply a clinically translatable platform to investigate the clinical value of immune-related biomarkers derived from easily accessible biosamples. Second, the retrospective nature of this study limited our control over pre-analytical processing parameters, such as centrifugation time and speed, between laboratories. However, to our knowledge, there have been few reports that centrifugation speed significantly affects the detectable concentration of these proteins. A few studies even suggest that plasma-derived proteins are relatively robust to various sample processing methods (56,57). With regard to the study population, the matched normal samples from the BCGP are the best available samples and the most representative of the general non-OSCC population, with comprehensive data on demographics and smoking and no known history of any cancer during follow-up. However, we do not have detailed medical information on the presence of oral premalignant diseases, autoimmune diseases, use of immunomodulators, or other related conditions that could put this population at an increased risk of developing malignancies and that may in turn contribute to the observed biomarker alterations. Third, 50% of the OSCC patients in this study were non-smokers, which differs from other geographic regions where tobacco-related OSCC remains high. Therefore, it is worth noting that findings from our OSCC population may not be generalizable. Lastly, the observed aberrantly expressed proteins may be due to changes in metabolic or other physiological states that could not be captured in this study. This limitation applies to all blood biomarker studies because of the varying genetic and non-genetic factors, e.g., medical comorbidities and diets, in the population (58).
Future work is warranted to determine the mechanisms by which most of the identified biomarkers are under-expressed in OSCC compared to normal. Other future directions may include a validation study with samples collected from different institutes, with determination of the best methods (e.g., ELISA vs. ECL) and cut-offs for the various targets identified in this study. In addition, studies that investigate the temporal levels of these markers by repeating measurements before treatment, after treatment, and at the time of local-regional recurrence or years into disease-free follow-up are important and can help to determine the value of these biomarkers in early identification of local and regional recurrence during follow-up.
In conclusion, this is the first paper to identify potential inflammatory plasma protein biomarkers in patients with OSCC. With further validation in a larger cohort, including paired blood samples collected over the course of disease management, this set of biomarkers has the potential to assist in early detection of OSCC.

AUTHOR CONTRIBUTIONS
HK and CP contributed to the conception and design of the study. KL, XL, and YZ executed assay experiments and data acquisition. KL and NL performed the statistical analysis. KL wrote the first draft of the manuscript. All authors interpreted and critically revised the manuscript to its final form. All authors gave final approval and agree to be accountable for all aspects of the work.