ORIGINAL RESEARCH article
CDKN2A Copy Number Loss Is an Independent Prognostic Factor in HPV-Negative Head and Neck Squamous Cell Carcinoma
- 1Yale School of Medicine, New Haven, CT, United States
- 2Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT, United States
- 3Yale Cancer Center, Yale School of Medicine, New Haven, CT, United States
- 4University of Connecticut School of Medicine, Farmington, CT, United States
- 5Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States
- 6Cancer Outcomes, Public Policy, and Effectiveness Research (COPPER) Center at Yale, Yale School of Medicine, New Haven, CT, United States
Background: HPV infection is associated with high p16 expression and good prognosis in head and neck squamous cell carcinomas (HNSCCs). Analysis of CDKN2A, the gene encoding p16, may further elucidate the association between p16 expression and prognosis. We sought to determine whether CDKN2A copy number loss was associated with poor survival in HPV-negative HNSCCs.
Methods: The Cancer Genome Atlas HNSCC clinical and genomic data were obtained and integrated. Patients <80 years old with a primary tumor in the oral cavity, oropharynx, hypopharynx, or larynx were included. Stratifying by copy number loss status, CDKN2A mRNA and p16 protein expression levels were examined and overall survival (OS) and disease-free survival (DFS) were evaluated.
Results: A total of 401 patients with HPV-negative HNSCC were identified, of whom 146 demonstrated CDKN2A copy number loss. The CDKN2A copy number loss group expressed significantly lower levels of CDKN2A mRNA and p16 protein than did the non-copy number loss group. Median OS for patients with and without CDKN2A copy number loss was 16.5 and 46.6 months, respectively (p = 0.007). Median DFS for the two groups was 11.6 and 19.2 months, respectively (p = 0.03). In both univariate and multivariable analyses, stage IV designation, receipt of chemotherapy, and CDKN2A copy number loss were predictive of OS.
Conclusion: CDKN2A copy number loss predicted poor survival independently of other patient and treatment factors and may be a clinically useful prognostic factor.
Introduction
It is well known that head and neck squamous cell carcinomas (HNSCCs) caused by human papillomavirus infection (HPV-positive) have considerably better prognosis than those not associated with HPV infection (HPV-negative) (1–3). To differentiate between HPV-positive and HPV-negative disease, p16 immunohistochemistry has historically been used; HPV viral protein E7 has been observed to downregulate pRb and subsequently increase p16 expression (4, 5). While the gold standard of HPV detection in head and neck cancers is now RNA-based detection of viral proteins E6/E7 (6), clinical and molecular analysis of p16 tumor data suggests that p16 may play an important role in the pathogenesis of head and neck cancers (7–10).
Given current understanding that HPV-positive and HPV-negative HNSCCs are clinically and biologically distinct, analysis of HNSCCs should ideally be stratified by HPV status. Interestingly, recent studies suggest that p16 expression varies greatly even among only HPV-positive or only HPV-negative HNSCCs (11–13). This wide variability in gene expression amongst patients with the same HPV status suggests that differences in p16 expression cannot be explained solely by HPV infection. CDKN2A, the gene that encodes p16, is frequently inactivated via copy number loss among HPV-negative head and neck cancer patients (14). Given the high prevalence of CDKN2A copy number loss, it is possible that this genomic abnormality may largely explain the wide variability in p16 expression among HPV-negative tumors. Moreover, considering the role of p16 as a known tumor suppressor, it is possible CDKN2A copy number loss may independently predict survival even when considering HPV-negative HNSCCs alone. We aim to investigate the emerging clinical significance of CDKN2A copy number loss in HPV-negative head and neck cancers using The Cancer Genome Atlas (TCGA).
Materials and Methods
Data Source and Study Population
The Cancer Genome Atlas is a joint effort by the National Cancer Institute and National Human Genome Research Institute that collected genomic and clinical patient data for 33 types of cancer. We analyzed TCGA head and neck cancer data, integrating various types of genomic measurements with clinical metadata. The TCGA data were analyzed as follows: previously published results of PCR-based RNA-detection of HPV E6/E7 RNA were used to identify HPV-negative cancers (15), and Affymetrix SNP6 copy number measurements were used to identify patients with CDKN2A copy number loss. CDKN2A mRNA expression (RNA-Seq v2) and p16 protein quantification (reverse phase protein array) were evaluated to characterize tumor CDKN2A expression. Since CDKN2A mRNA and p16 protein are downstream products of tumor CDKN2A DNA, mRNA and protein expression were expected to be relatively lower in individuals with CDKN2A copy number loss as long as the gene was transcriptionally and translationally active.
We included patients with a primary tumor of known HPV-negative status in the oral cavity, oropharynx, hypopharynx, or larynx. Patients 80 years of age or older were excluded from the overall survival and disease-free survival (DFS) analyses. Clinical data were obtained from the TCGA Genomic Data Commons (16). Raw copy number data were acquired from the Broad Institute’s Genome Data Analysis Centers Firehose website (17). HPV status designations for this cohort were downloaded from the supplementary files of a recent publication by Nulton et al. (15). mRNA counts and protein expression data were obtained from the MSKCC Cancer Genomics Data Server through the “cgdsr” R package (18).
Statistical Analyses and Variable Definitions
Copy number loss is defined as the loss of a chromosomal segment (e.g., during DNA replication). Loss of one or both copies of a gene contained in the deleted chromosomal segment often results in functional deficit due to gene under-expression (19). The primary independent variable, CDKN2A copy number loss, was defined a priori as having a relative log2 copy number ratio <−1. Kolmogorov–Smirnov testing was performed to evaluate differences in mRNA and protein expression between groups, both to validate CDKN2A copy number loss status and to investigate the transcriptional and translational effects of copy number loss. mRNA read count data were preprocessed by library size normalization using the TMM method, followed by log-transformation and z-scoring of mRNA reads (20).
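The group assignment and expression scaling described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: only the log2 ratio threshold of −1 comes from the text, while the function names, the pseudocount of 1, and the example values are hypothetical. The TMM library size normalization step is not reimplemented here; the sketch assumes counts that have already been normalized.

```python
import math
from statistics import mean, stdev

# Threshold stated in the text: relative log2 copy number ratio < -1.
LOSS_THRESHOLD = -1.0


def has_copy_number_loss(log2_ratio, threshold=LOSS_THRESHOLD):
    """Return True when the log2 copy number ratio falls below the threshold."""
    return log2_ratio < threshold


def zscore_log_expression(normalized_counts, pseudocount=1.0):
    """Log-transform already-normalized counts, then z-score across samples.

    The pseudocount is an assumption made here to avoid log2(0); the paper
    does not state which offset, if any, was used.
    """
    logged = [math.log2(c + pseudocount) for c in normalized_counts]
    mu = mean(logged)
    sd = stdev(logged)  # sample standard deviation
    return [(x - mu) / sd for x in logged]


# Hypothetical log2 ratios for four tumors (illustrative values only).
example_ratios = {"T1": -1.8, "T2": -0.3, "T3": 0.1, "T4": -1.0}
loss_status = {k: has_copy_number_loss(v) for k, v in example_ratios.items()}
print(loss_status)
```

Because the comparison is strict, a tumor with a ratio of exactly −1 (half the reference signal) is placed in the non-copy number loss group.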
Overall survival (OS) and DFS were the outcome variables examined. Wilcoxon rank-sum and χ² tests were performed to assess the relationships between CDKN2A copy number loss and various demographic, clinicopathologic, and treatment variables. Survival analyses were conducted using the Kaplan–Meier method with log-rank testing for significance. Cox proportional hazards models with and without multiple imputation of missing values were also fit to identify demographic, clinicopathologic, and treatment factors associated with survival. Feature selection for multivariable analyses was performed by including clinicopathologic features previously reported to be prognostic in HNSCCs, patient and treatment factors found to be prognostic in our univariate Cox regressions, and clinicopathologic variables that differed significantly in prevalence between our copy number loss and non-copy number loss groups. All independence and hypothesis tests were performed using a two-sided significance level of 0.05. R version 3.4.1 and the following R packages were used to perform all data visualization and statistical analyses: “ggplot2,” “survival,” “survminer,” “interval,” and “mice” (21–26).
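To make the Kaplan–Meier step concrete, the following sketch computes a survival curve and its median from right-censored follow-up times. It is an illustrative reimplementation, not the R "survival" code used in the study; the function names and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each patient (e.g., months)
    events : 1 if the event (death or recurrence) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing this follow-up time; patients censored
        # at time t are still counted as at risk at t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy cohort of five patients (illustrative values only).
toy_curve = kaplan_meier([5, 8, 12, 20, 25], [1, 0, 1, 1, 0])
print(toy_curve, median_survival(toy_curve))
```

The median is reported as undefined (None) when the curve never reaches 0.5, which can happen in groups with few events.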
Results
Demographic and Clinicopathologic Differences Between Genomic Groups
We identified 401 patients under age 80 with HPV-negative head and neck cancer. Of these 401 patients, 146 (36.4%) exhibited CDKN2A copy number loss. The median age of all HPV-negative patients was 61 years (range 19–79). The cohort was mostly male (73.1%), with no significant difference in sex distribution between copy number groups. Anatomic site of lesion varied: 223 (56%) were cancers of the oral cavity, 66 (16%) were of the oropharynx, 105 (26%) were of the larynx, and 6 (1%) were of the hypopharynx. Clinical stage was used in place of pathologic stage for 44 patients for whom clinical stage but not pathologic stage information was available. The CDKN2A copy number loss group included a slightly greater proportion of African Americans and Stage III/IV tumors and tended to have higher rates of smoking and heavy alcohol consumption (Table 1). The only statistically significant differences between groups at the 0.05 significance level were in smoking status and heavy alcohol consumption.
Table 1. Demographic and clinicopathologic characteristics of all HPV-negative head and neck squamous cell carcinomas stratified by CDKN2A copy number.
mRNA/Protein Expression Differences Between Test and Control Groups
The CDKN2A copy number loss group exhibited significantly lower CDKN2A mRNA expression than did the non-copy number loss group (median −1.10 vs. 0.64, p < 2.2 × 10^−16; Figure 1A). Similarly, analysis of protein assay results for 174 samples for which RPPA data were available revealed a lower expression of p16 in the CDKN2A copy number loss group (median −0.94 vs. 0.19, p < 7.1 × 10^−11; Figure 1B). These results corroborate CDKN2A copy number loss status assignments. Furthermore, they demonstrate that CDKN2A copy number loss has a functional impact on gene transcription and translation.
Figure 1. Expression of (A) CDKN2A mRNA and (B) p16 protein in copy number loss vs. non-copy number loss groups for all HPV-negative head and neck squamous cell carcinomas.
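The expression comparison between groups relies on the two-sample Kolmogorov–Smirnov test, whose statistic is the largest vertical gap between the two empirical distribution functions. A minimal sketch of that statistic follows; the function name and the example z-scores are hypothetical, and the p-value calculation is left out.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|.

    Both empirical CDFs are right-continuous step functions, so the maximum
    gap is attained at one of the observed data points.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    n, m = len(a), len(b)
    d = 0.0
    for x in a + b:
        f_a = bisect_right(a, x) / n  # fraction of sample_a that is <= x
        f_b = bisect_right(b, x) / m  # fraction of sample_b that is <= x
        d = max(d, abs(f_a - f_b))
    return d


# Illustrative z-scored expression values for two small groups.
loss_group = [-1.4, -1.1, -0.9, -0.2]
no_loss_group = [-0.3, 0.4, 0.7, 1.2]
print(ks_statistic(loss_group, no_loss_group))
```

A statistic near 1 means the two distributions barely overlap, while a value near 0 means they are nearly indistinguishable.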
Overall survival and DFS were examined to evaluate the clinical implications of CDKN2A copy number loss. Median OS for patients with and without CDKN2A copy number loss was 16.5 and 46.6 months, respectively (p < 0.007; Figure 2A). CDKN2A copy number loss was also significantly associated with decreased DFS (median 11.6 vs. 19.2 months, p < 0.03; Figure 2B). There was no significant difference in OS or DFS between groups for early-stage (stage I or II) disease (Figures 3A,B), but among advanced-stage (stage III or IV) tumors, copy number loss was indicative of poor OS (median 16.0 vs. 27.4 months, p < 0.02; Figure 3C) and DFS (median 9.0 vs. 17.0 months, p < 0.04; Figure 3D).
Figure 2. Differences in (A) overall survival (OS) and (B) disease-free survival (DFS) between CDKN2A copy number loss and non-copy number loss groups for all tumors.
Figure 3. Differences in (A) overall survival (OS) and (B) disease-free survival (DFS) between CDKN2A copy number loss and non-copy number loss groups for early-stage (stage I/II) tumors; Differences in (C) OS and (D) DFS between CDKN2A copy number loss and non-copy number loss groups for advanced-stage (stage III/IV) tumors.
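The survival curves above were compared with the log-rank test. As a rough sketch of how that test works, the code below accumulates observed minus expected events in one group at each event time and converts the resulting chi-square statistic (1 degree of freedom) to a p-value. It is an illustrative stdlib implementation with hypothetical names and toy data, not the code used in the study.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in pooled if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank variance is zero; groups cannot be compared")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: group A fails early, group B fails late (illustrative only).
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
print(round(chi2, 2), p < 0.05)
```

With identical groups the statistic is 0 and the p-value is 1, which is a convenient sanity check for any implementation.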
On univariate Cox proportional hazards analysis, African American race (HR: 1.70; 95% CI: 1.05–2.76), Stage IV AJCC pathologic tumor stage (HR: 3.19; 95% CI: 1.01–10.03), receipt of chemotherapy (HR: 1.68; 95% CI: 1.22–2.33), and CDKN2A copy number loss (HR: 1.54; 95% CI: 1.12–2.11) were associated with poor OS (Table 2). In adjusted analysis using a multivariable Cox proportional hazards model, AJCC pathologic tumor stage (HR: 1.86; 95% CI: 1.06–3.26), receipt of chemotherapy (HR: 1.87; 95% CI: 1.12–3.12), and CDKN2A copy number loss (HR: 1.42; 95% CI: 1.01–2.00) remained significant (Table 3). Given the considerable amount of unrecorded data on alcohol consumption and receipt of chemotherapy, multiple imputation was performed and multivariable Cox regression was applied to the imputed dataset. This analysis revealed that the same variables of tumor stage (HR: 1.79, 95% CI: 1.03–3.11), receipt of chemotherapy (HR: 1.64, 95% CI: 1.13–2.37), and CDKN2A copy number loss (HR: 1.51, 95% CI: 1.06–2.16) were independently associated with poor survival (Table 3).
Table 2. Univariable Cox proportional hazards analysis of all HPV-negative head and neck squamous cell carcinomas.
Table 3. Multivariable Cox proportional hazards model of all HPV-negative head and neck squamous cell carcinomas.
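A reader can check that a reported hazard ratio and its confidence interval are mutually consistent by working on the log scale. The sketch below assumes the intervals are 95% Wald intervals, symmetric around log(HR); the paper does not state how the intervals were constructed, so this is an assumption. It recovers an approximate standard error, z statistic, and two-sided p-value. The function name is hypothetical, and rounding in the published figures limits precision.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_hr(hr, ci_low, ci_high, z_crit=Z_95):
    """Approximate (standard error, z, two-sided p) from an HR and its 95% CI.

    Assumes a Wald interval: log(HR) +/- z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value


# Univariate and multivariable estimates for CDKN2A copy number loss,
# taken from the text above.
for label, args in [("univariate", (1.54, 1.12, 2.11)),
                    ("multivariable", (1.42, 1.01, 2.00))]:
    se, z, p = wald_from_hr(*args)
    print(label, round(se, 3), round(z, 2), p < 0.05)
```

Under this assumption, an interval whose lower bound sits just above 1, as in the multivariable estimate, corresponds to a p-value just under 0.05.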
Discussion
In our integrated genomic and clinical analysis of TCGA, CDKN2A copy number loss was associated with poor prognosis in HPV-negative head and neck cancer independently of other known prognostic factors including age, advanced tumor stage, and African American race (13, 14, 27). CDKN2A copy number loss was also strongly associated with decreased CDKN2A mRNA and protein expression, demonstrating significant impact on gene transcription and translation.
Univariate survival analysis found CDKN2A copy number loss to indicate worse prognosis in all HPV-negative disease. Stratifying by copy number and sub-stratifying by stage (early vs. late) showed that CDKN2A copy number loss indicated significantly poorer OS and DFS in advanced-stage but not early-stage disease. Given that our cohort consists primarily of advanced-stage tumors, lack of observed survival difference in the early-stage cohort may be due to low sample size. Follow-up studies with a greater number of early-stage tumors would be useful to validate this finding.
Notably, copy number loss retained prognostic value on multivariable analysis. For this analysis, we included possible covariates identified by reviewing previous reports of patient factors predictive of survival and conducting univariate Cox regressions on our own data. Additionally, we included the variables of smoking and heavy alcohol consumption (both of which differed significantly in prevalence between the compared groups) in our multivariable Cox regression to identify possible confounders. The finding that high pathologic stage is an independent predictor of survival is consistent with previous findings, and the finding that receipt of chemotherapy also independently predicts poor survival is not surprising considering that patients who receive chemotherapy tend to have more advanced, systemic disease even among high-stage cancers. The finding that African American race was predictive of survival on univariate analysis has precedent (13), but its prognostic value was not maintained in our multivariable analysis. This too is consistent with previous reports suggesting that the difference in survival between racial groups is likely related to socioeconomic factors resulting in treatment disparities (28). Interestingly, CDKN2A copy number loss was found to be significant on multivariable analysis, suggesting it may be clinically useful as an independent prognostic factor.
Some missing or unavailable clinical data limit the conclusions that can be drawn from this study. For instance, data on receipt of chemotherapy and adjuvant radiotherapy were sparse (missing in 43 and 16% of patients, respectively). These variables are of particular clinical interest, as outcomes analysis of individuals who receive these treatments can reveal insights into efficacy and help shape best practice guidelines. We were able to incorporate variables with sparse data into our multivariable analyses through categorical representation of unknowns or through multiple imputation, but these approaches are not perfect substitutes for actual values. Thus, as highlighted in a recent editorial, we stress the importance of documenting such treatment information more completely in future data collection efforts (29). Another limitation encountered was the absence of comorbidity data. We had access to many major demographic and clinicopathologic variables, but full comorbidity histories were not available. Such data are helpful when performing retrospective cohort-based analyses to more comprehensively control for clinical confounders. Additionally, though we used OS and DFS as outcome measures, cancer-specific survival would have been ideal. To facilitate access to comorbidity and cancer-specific survival data, we suggest that future genomics data collection efforts like TCGA consider linking with Medicare to provide researchers with more complete longitudinal clinical data for patients age 65 and older.
Despite these limitations, this study highlights strengths of the dataset and our integrated approach to clinical and genomic analysis. To our knowledge, of all clinically oriented genomic studies of HNSCCs, this cohort includes the largest collection of HPV-negative HNSCCs to date. In genomics research, sample size is often limited because of the extensive costs and workflow required to acquire and sequence patient samples (30). With the large number of samples in TCGA, we believe that our cohort is more representative of the total population of HPV-negative HNSCCs than smaller HNSCC cohorts of previously published clinical analyses. A methodologic strength of this study is that it integrates data from different sequencing platforms to validate the functional significance of the genomic abnormality, prior to survival analysis. We emphasize the importance of including complementary mRNA and protein expression data when evaluating mutations and copy number alterations, as not all mutations or abnormalities of a given gene have the same (or any) effect. Transcriptional and translational analysis can provide insights into the biological relevance of mutations and other genomic abnormalities in the context of other disease influences.
In conclusion, we found that CDKN2A copy number loss was associated with low expression of CDKN2A mRNA and p16 protein and indicated poor clinical prognosis in terms of disease progression and OS. These survival differences remained significant on multivariable analysis, suggesting CDKN2A copy number loss may have clinical utility as an independent prognostic factor for advanced-stage HNSCC. Through this analysis, we demonstrate the power and limitations of the TCGA database in analyzing the clinical impact of a genomic abnormality. Future large-scale genomic data collection efforts should emphasize linking genomic data with robust, longitudinal treatment and outcomes data to accelerate clinical discovery.
Author Contributions
WC and JY conceived of the presented idea. WC, RB, ZH, JC, and JY contributed to experimental design, selection of outcomes measures, and variable selection. WC preprocessed and performed the initial data analysis. WC, RB, AM, TH, ZH, JC, and JY contributed to interpretation of the results. SG and JT verified the analytical methods. JY supervised the findings of this work. All authors discussed the results and contributed to writing the final manuscript.
Conflict of Interest Statement
JY receives research funding from 21st Century Oncology. All other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Ang KK, Harris J, Wheeler R, Weber R, Rosenthal DI, Nguyen-Tân PF, et al. Human papillomavirus and survival of patients with oropharyngeal cancer. N Engl J Med (2010) 363(1):24–35. doi:10.1056/NEJMoa0912217
2. Chung CH, Zhang Q, Kong CS, Harris J, Fertig EJ, Harari PM, et al. p16 protein expression and human papillomavirus status as prognostic biomarkers of nonoropharyngeal head and neck squamous cell carcinoma. J Clin Oncol (2014) 32(35):3930–8. doi:10.1200/JCO.2013.54.5228
3. Kumar B, Cordell KG, Lee JS, Prince ME, Tran HH, Wolf GT, et al. Response to therapy and outcomes in oropharyngeal cancer are associated with biomarkers including human papillomavirus, epidermal growth factor receptor, gender, and smoking. Int J Radiat Oncol Biol Phys (2007) 69(2 Suppl):S109–11. doi:10.1016/j.ijrobp.2007.05.072
4. Andl T, Kahn T, Pfuhl A, Nicola T, Erber R, Conradt C, et al. Etiological involvement of oncogenic human papillomavirus in tonsillar squamous cell carcinomas lacking retinoblastoma cell cycle control. Cancer Res (1998) 58(1):5–13.
5. Lewis JS Jr, Chernock RD, Ma XJ, Flanagan JJ, Luo Y, Gao G, et al. Partial p16 staining in oropharyngeal squamous cell carcinoma: extent and pattern correlate with human papillomavirus RNA status. Mod Pathol (2012) 25(9):1212–20. doi:10.1038/modpathol.2012.79
8. Ramshankar V, Soundara VT, Shyamsundar V, Ramani P, Krishnamurthy A. Risk stratification of early stage oral tongue cancers based on HPV status and p16 immunoexpression. Asian Pac J Cancer Prev (2014) 15(19):8351–9. doi:10.7314/APJCP.2014.15.19.8351
9. Lam AK, Lo CY, Leung P, Lang BH, Chan WF, Luk JM. Clinicopathological roles of alterations of tumor suppressor gene p16 in papillary thyroid carcinoma. Ann Surg Oncol (2007) 14(5):1772–9. doi:10.1245/s10434-006-9280-9
11. Weinberger PM, Yu Z, Haffty BG, Kowalski D, Harigopal M, Brandsma J, et al. Molecular classification identifies a subset of human papillomavirus – associated oropharyngeal cancers with favorable prognosis. J Clin Oncol (2006) 24(5):736–47. doi:10.1200/JCO.2004.00.3335
12. Lewis JS Jr, Thorstad WL, Chernock RD, Haughey BH, Yip JH, Zhang Q, et al. p16 positive oropharyngeal squamous cell carcinoma: an entity with a favorable prognosis regardless of tumor HPV status. Am J Surg Pathol (2010) 34(8):1088–96. doi:10.1097/PAS.0b013e3181e84652
13. Stephen JK, Divine G, Chen KM, Chitale D, Havard S, Worsham MJ. Significance of p16 in site-specific HPV positive and HPV negative head and neck squamous cell carcinoma. Cancer Clin Oncol (2013) 2(1):51–61. doi:10.5539/cco.v2n1p51
15. Nulton TJ, Olex AL, Dozmorov M, Morgan IM, Windle B. Analysis of The Cancer Genome Atlas sequencing data reveals novel properties of the human papillomavirus 16 genome in head and neck squamous cell carcinoma. Oncotarget (2017) 8(11):17684–99. doi:10.18632/oncotarget.15179
18. Jacobsen A. cgdsr: R-Based API for Accessing the MSKCC Cancer Genomics Data Server (CGDS). R Package Version 1.2.5 (2015). Available from: https://CRAN.R-project.org/package=cgdsr (Accessed: January 3, 2018).
20. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics (2010) 26(1):139–40. doi:10.1093/bioinformatics/btp616
21. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing (2016). Available from: https://www.R-project.org/ (Accessed: January 3, 2018).
22. Wickham H. ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag (2009). Available from: http://ggplot2.org (Accessed: January 3, 2018).
23. Therneau T. A Package for Survival Analysis in R. R Package Version 2.38 (2015). Available from: http://CRAN.R-project.org/package=survival (Accessed: January 3, 2018).
24. Kassambara A, Kosinski M. survminer: Drawing Survival Curves Using ’ggplot2’. R Package Version 0.4.0 (2017). Available from: https://CRAN.R-project.org/package=survminer (Accessed: January 3, 2018).
29. Chen WS, Townsend JP, Yu JB. Radiation-specific clinical data should be included in existing large-scale genomic datasets. Int J Radiat Oncol Biol Phys (2017) 98(1):8–10. doi:10.1016/j.ijrobp.2017.01.023
Keywords: CDKN2A, head and neck neoplasms, prognostic biomarkers, genomics and genetics, outcomes assessment
Citation: Chen WS, Bindra RS, Mo A, Hayman T, Husain Z, Contessa JN, Gaffney SG, Townsend JP and Yu JB (2018) CDKN2A Copy Number Loss Is an Independent Prognostic Factor in HPV-Negative Head and Neck Squamous Cell Carcinoma. Front. Oncol. 8:95. doi: 10.3389/fonc.2018.00095
Received: 14 February 2018; Accepted: 19 March 2018;
Published: 04 April 2018
Edited by: Makoto Tahara, National Cancer Centre (Japan), Japan
Reviewed by: Torahiko Nakashima, Kyushu University, Japan
Edgar K. Selzer, Medizinische Universität Wien, Austria
Copyright: © 2018 Chen, Bindra, Mo, Hayman, Husain, Contessa, Gaffney, Townsend and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: James B. Yu, email@example.com