Original Research ARTICLE
Assessing the Relationship Between Leukocyte Telomere Length and Cancer Risk/Mortality in UK Biobank and TCGA Datasets With the Genetic Risk Score and Mendelian Randomization Approaches
- 1Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- 2Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China
- 3Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, United States
- 4Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, United States
- 5Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
Background: Telomere length is an important indicator of tumor progression and survival for cancer patients. Previous work investigated the associations between genetically predicted telomere length and cancers; however, the types of cancers investigated in those studies were relatively limited or the telomere length-associated genetic variants employed often came from genome-wide association studies (GWASs) with small sample sizes.
Methods: We constructed the genetic risk score (GRS) for leukocyte telomere length based on 17 associated genetic variants available from the largest telomere length GWAS of up to 78,592 individuals. Then, a comprehensive analysis was undertaken to evaluate the association between the constructed GRS and the risk or mortality of a wide range of cancers [i.e., 37 cancers in the UK Biobank and 33 cancers in The Cancer Genome Atlas (TCGA)]. We further applied two-sample Mendelian randomization (MR) to estimate the causal effect of leukocyte telomere length on UK Biobank cancers via summary statistics.
Results: In the UK Biobank dataset, we found that the GRS of leukocyte telomere length was associated with a decreased risk of nine types of cancer (i.e., significant association with multiple myeloma, chronic lymphocytic leukemia, kidney/renal cell cancer, bladder cancer, malignant melanoma, basal cell carcinoma, and prostate cancer and suggestive association with sarcoma/fibrosarcoma and Hodgkin’s lymphoma/Hodgkin’s disease). In addition, we found that the GRS was suggestively associated with an increased risk of leukemia. In the TCGA dataset, we observed suggestive evidence that the GRS was associated with a higher death hazard of rectum adenocarcinoma (READ), sarcoma (SARC), and skin cutaneous melanoma (SKCM), while the GRS was associated with a lower death hazard of kidney renal papillary cell carcinoma (KIRP). The MR results further supported an association of leukocyte telomere length with the risk of malignant melanoma, Hodgkin’s lymphoma/Hodgkin’s disease, chronic lymphocytic leukemia, and multiple myeloma.
Conclusion: Our study reveals that telomere played diverse roles in different types of cancers. However, further validations in large-scale prospective studies and deeper investigations of the biologic mechanisms are warranted.
Introduction
The telomere is a specialized structure composed of 6-bp TTAGGG repeats and plays an important role in genomic stability by protecting DNA against damage and fusion (de Lange, 2005). Due to the inability of DNA polymerase to fully extend the 3′ end of the DNA strand, the telomere becomes progressively shorter with each round of cell division. Telomere length is thus a biomarker of cellular and overall biological aging. Once a critically short telomere length is reached, the cell is triggered to enter senescence, which ultimately leads to cell growth arrest or apoptosis (Shay and Wright, 2019). In stem and progenitor cells, telomere length is maintained by the enzyme telomerase (Hackett and Greider, 2002; Shawi and Autexier, 2008). Telomerase has been shown to be activated in almost all human tumors; such activation can result in the continuous division of cancer cells and is the key component of the tumorigenic phenotype of human cancer cells (Stewart and Weinberg, 2006; O’Sullivan and Karlseder, 2010).
Prior studies have demonstrated that telomere length is associated with many age-related diseases and disorders (e.g., cancers and neurodegenerative disorders) (Zhu et al., 2011) and that a shorter telomere length in tumor tissues is an important indicator of tumor progression and survival for cancer patients (Ma et al., 2011; Xu et al., 2016). However, not all studies reported consistent findings (Supplementary Table S1), partly reflecting the complicated role of telomeres in human cancers. The diversity in cancer types, ethnicities, study designs, measurement methods, and selected tissues for telomere length in previous work further complicates the observed association. Given the severe disease burden of cancers worldwide (Siegel et al., 2019), understanding the association between telomere length and cancers can provide valuable insights into the development of cancers and has the potential to improve the prevention and treatment strategies for cancers.
On the other hand, in the past few years, a number of single nucleotide polymorphisms (SNPs) have been identified to be associated with leukocyte telomere length through genome-wide association studies (GWASs) (Levy et al., 2010; Gu et al., 2011; Mangino et al., 2012; Codd et al., 2013; Pooley et al., 2013; Dorajoo et al., 2019). Relying on associated genetic variants, many studies have been undertaken to investigate the association between genetically predicted leukocyte telomere length and cancers. However, the types of cancers investigated in previous studies (Zhang et al., 2015; Li et al., 2020) were relatively limited. In addition, the telomere length-associated SNPs employed in previous studies (Zhang et al., 2015; Rode et al., 2016; Haycock et al., 2017) often came from GWASs with small sample sizes (Levy et al., 2010; Codd et al., 2013).
Recently, a large-scale GWAS of leukocyte telomere length was conducted with the largest sample size to date (up to ∼80,000) (Li et al., 2020), which allows us to choose more appropriate SNPs to study the multilocus genetic profile of leukocyte telomere length via the genetic risk score (GRS) approach (Ripatti et al., 2010; Dudbridge et al., 2013; Eusden et al., 2015; Guo et al., 2016; Goldman, 2017; Tosto et al., 2017; Bogdan et al., 2018; De La Vega and Bustamante, 2018; Zeng et al., 2019b). Briefly, GRS is an efficient and powerful genetic method to explore the association between an exposure and complex diseases by integrating multiple genetic variants with weak effects, and it dramatically enhances the predictability of complex diseases through genetic polymorphisms (Belsky et al., 2013; Khera et al., 2018; Duncan et al., 2019; Khera et al., 2019). Moreover, several cancer-relevant cohorts, such as The UK Biobank (Bycroft et al., 2018) and The Cancer Genome Atlas (TCGA) (Hoadley et al., 2018), have collected a variety of cancer-related omics and clinical information, which makes it feasible to systematically investigate a large number of types of cancers.
Based on these valuable data resources, in the present work, we evaluated the association between leukocyte telomere length and 37 cancers from the UK Biobank cohort as well as 33 cancers from the TCGA dataset using the genetic risk score method. We further applied two-sample Mendelian randomization (MR) (Burgess et al., 2017; Hartwig et al., 2017) to assess the association between leukocyte telomere length and multiple cancers for which summary statistics are available from the UK Biobank cohort. Our study revealed that telomeres played cancer-specific roles and that a shorter leukocyte telomere length can either increase or decrease the risk/mortality of cancers, depending on the cancer type. However, further validations in large-scale prospective studies and deeper investigations of the biological mechanism of leukocyte telomere length on various types of cancers are warranted.
Materials and Methods
Selection of Instrumental Variables for Leukocyte Telomere Length
We obtained the summary statistics (e.g., effect size and effect allele) of leukocyte telomere length from the ENGAGE consortium as well as the EPIC-CVD and EPIC-InterAct cohorts (Supplementary Table S2; Li et al., 2020), which was the largest GWAS of telomere length (N = 78,592) undertaken in the European population to date. In this study, leukocyte telomere length was measured as a continuous variable and a linear additive regression was implemented to investigate the association for each genetic variant (Li et al., 2020). In particular, in the association analysis, the age of participants was included as a covariate to remove the influence of chronological age. We selected 17 independent index SNPs that were strongly associated with leukocyte telomere length (p < 5.00E-8; see Table 1) to construct the GRS. Note that, because telomere length shortens progressively with age, to facilitate the interpretation of our results, we flipped the sign of the effect sizes of the selected SNPs so that the relationship under investigation corresponds to a shorter leukocyte telomere length.
Table 1. Independent index single nucleotide polymorphisms (SNPs) associated with leukocyte telomere length in the European population.
Construction of Genetic Risk Score
We construct the GRS of leukocyte telomere length for each individual as the weighted sum of effect-allele counts across the selected index SNPs:

$$\mathrm{GRS} = \sum_{j=1}^{J} \hat{\beta}_j G_j,$$

where $\hat{\beta}_j$ is the estimated marginal SNP effect on the shorter leukocyte telomere length for the jth selected index SNP (e.g., Table 1) (Li et al., 2020) and J is the number of index SNPs available in the dataset. Gj is the individual-level genotype of the same SNP in the UK Biobank (Bycroft et al., 2018) or TCGA dataset (Hoadley et al., 2018) and is coded as 0, 1, or 2, representing the number of effect alleles. Following prior work (Zeng et al., 2019b), we do not rescale the GRS by a constant, as its p-value would not be altered regardless of whether the GRS is scaled or not. We instead standardize the GRS so that its mean is zero and its variance is equal to 1.
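A minimal Python sketch of the score construction described above is given below. It is an illustration only: the analyses in this study were run in R, and the function names, weights, and genotypes shown here are hypothetical placeholders rather than values from Table 1.

```python
from statistics import mean, pstdev


def genetic_risk_score(genotypes, weights):
    """Weighted effect-allele count for one individual: sum_j beta_j * G_j."""
    if len(genotypes) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(b * g for b, g in zip(weights, genotypes))


def standardize(values):
    """Centre to mean 0 and scale to variance 1 (population SD)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical effect sizes for three SNPs (NOT taken from Table 1).
weights = [0.08, 0.05, 0.11]
# Effect-allele counts (0, 1, or 2) for four hypothetical individuals.
cohort = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]

raw = [genetic_risk_score(g, weights) for g in cohort]
grs = standardize(raw)
```

Because standardization is a linear transformation, it changes the scale of the estimated GRS effect but not the p-value of the corresponding test.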
Two-Stage Regression Model in the UK Biobank and TCGA Using GRS
To link GRS with the risk of cancers from the UK Biobank (Table 2; Bycroft et al., 2018), we apply an additive logistic regression while adjusting for a set of available covariates (i.e., age, gender, smoke, drink, and BMI).
Table 2. Association between the genetic risk score (GRS) of leukocyte telomere length and the risk of 37 UK Biobank cancers.
$$\mathrm{logit}(\mu_i) = \log\frac{\mu_i}{1-\mu_i} = \mathrm{GRS}_i \times \theta + X_i^{T}\alpha,$$

where μi is the expectation of yi, with yi = 1 or 0 indicating whether individual i has cancer or not; θ is the effect size of GRS; and Xi is the vector of standardized covariates with effect sizes α. Of note, we assume that all of the entries in the first column of X are 1, representing the intercept term.
We next evaluate the effect of GRS on the mortality of cancers from TCGA (Table 3; Hoadley et al., 2018) with the Cox proportional hazards model (Cox, 1972) while controlling for available clinical covariates (i.e., age at diagnosis, gender, and stage).
Table 3. Association between the genetic risk score (GRS) of leukocyte telomere length and the mortality of 33 TCGA cancers.
$$h(t_i \mid \mathrm{GRS}_i, X_i) = h_0(t_i)\exp\left(\mathrm{GRS}_i \times \theta + X_i^{T}\alpha\right),$$

where ti is the observed survival time and h0(t) is an arbitrary baseline hazard function. Cancer-specific covariates are considered for some cancers in TCGA [e.g., the status of estrogen and progesterone receptors for breast invasive carcinoma (BRCA)]. In the logistic or Cox model, we are mainly interested in estimating θ and testing the null hypothesis H0: θ = 0. We further examine the interaction effect between GRS and each of the clinical covariates (e.g., GRS × gender) if GRS is detected to be associated with a given cancer.
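Given an estimate of θ and its standard error from either model, an OR (logistic model) or HR (Cox model), its 95% CI, and a two-sided p-value for H0: θ = 0 can be obtained as sketched below. The sketch assumes a Wald-type test with a normal reference distribution, which is an assumption made for illustration; the function name and input values are hypothetical.

```python
import math


def wald_summary(theta, se, z_crit=1.959964):
    """Ratio estimate exp(theta), Wald 95% CI, and two-sided normal p-value."""
    z = theta / se
    # P(|Z| > |z|) for a standard normal Z, via the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "ratio": math.exp(theta),
        "ci_low": math.exp(theta - z_crit * se),
        "ci_high": math.exp(theta + z_crit * se),
        "z": z,
        "p": p,
    }


# Hypothetical log-odds estimate and standard error (not from Table 2).
res = wald_summary(theta=-0.094, se=0.02)
```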
Two-Sample MR Analysis
Besides the GRS method, we also perform the two-sample MR analysis to estimate the causal effect of leukocyte telomere length on cancers in the UK Biobank using summary statistics (Sudlow et al., 2015). In observational studies, MR is a flexible approach for causal inference to avert confounding and reverse causality (Zeng et al., 2019a; Yu et al., 2020). In brief, we estimate the causal effect of leukocyte telomere length (again, denoted as θ) relying on all the available instrumental variables (Table 1) through the commonly employed inverse-variance weighted (IVW) method (Burgess et al., 2017; Hartwig et al., 2017).
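$$\hat{\theta} = \frac{\sum_{j} \hat{\beta}_{j}^{x}\hat{\beta}_{j}^{y}/\sigma_{yj}^{2}}{\sum_{j} (\hat{\beta}_{j}^{x})^{2}/\sigma_{yj}^{2}}, \qquad \mathrm{var}(\hat{\theta}) = \frac{1}{\sum_{j} (\hat{\beta}_{j}^{x})^{2}/\sigma_{yj}^{2}},$$

where $\hat{\beta}_{j}^{x}$ and $\sigma_{xj}^{2}$ are the effect size and the variance, respectively, of the instrumental variable j for the exposure X (i.e., leukocyte telomere length; Li et al., 2020), and $\hat{\beta}_{j}^{y}$ and $\sigma_{yj}^{2}$ are the effect size and the variance, respectively, for the same instrumental variable j on the outcome Y (i.e., cancer in the UK Biobank; Sudlow et al., 2015).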
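The fixed-effects IVW estimator can be sketched in a few lines of Python. This is an illustration under the usual first-order weights (only the outcome variances enter the weights), not the code used in this study; the function name and summary statistics are hypothetical.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effects inverse-variance weighted causal estimate and its SE.

    beta_x: SNP effects on the exposure
    beta_y: SNP effects on the outcome
    se_y:   standard errors of beta_y
    """
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for four instruments.
beta_x = [0.08, 0.05, 0.11, 0.07]
beta_y = [-0.040, -0.026, -0.052, -0.037]
se_y = [0.010, 0.012, 0.015, 0.011]

theta_hat, theta_se = ivw_estimate(beta_x, beta_y, se_y)
odds_ratio = math.exp(theta_hat)
```

When the outcome is binary and the SNP-outcome effects are log odds ratios, exponentiating the estimate gives an OR per unit change in the exposure.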
To guarantee the validity of our MR analysis, before the formal analysis, we examine the pleiotropic effects of instruments by removing index SNPs that may be potentially related to individual cancers if the Bonferroni-adjusted p-values are less than 0.05. We also conduct a series of sensitivity analyses: (i) weighted median-based (Bowden et al., 2016b) and maximum likelihood methods (Burgess et al., 2013), which are robust when some instrumental variables might be invalid; (ii) MR-Egger regression (Bowden et al., 2016a; Burgess and Thompson, 2017), which guards against horizontal pleiotropic effects; and (iii) leave-one-out (LOO) analysis (Noyce et al., 2017) and Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) test (Verbanck et al., 2018) to examine potential instrumental outliers.
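As an illustration of the leave-one-out idea, the sketch below recomputes a fixed-effects IVW estimate after dropping each instrument in turn. It is a simplified example with hypothetical inputs and function names, not a replacement for the dedicated MR software used for the sensitivity analyses.

```python
def ivw(beta_x, beta_y, se_y):
    """Fixed-effects IVW point estimate with first-order weights."""
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y))
    return num / den


def leave_one_out(beta_x, beta_y, se_y):
    """IVW estimate with instrument k removed, for every k."""
    estimates = []
    for k in range(len(beta_x)):
        keep = [i for i in range(len(beta_x)) if i != k]
        estimates.append(
            ivw(
                [beta_x[i] for i in keep],
                [beta_y[i] for i in keep],
                [se_y[i] for i in keep],
            )
        )
    return estimates


# Hypothetical summary statistics; the last instrument is a deliberate outlier.
beta_x = [0.08, 0.05, 0.11, 0.07]
beta_y = [-0.040, -0.025, -0.055, 0.030]
se_y = [0.010, 0.012, 0.015, 0.011]

full = ivw(beta_x, beta_y, se_y)
loo = leave_one_out(beta_x, beta_y, se_y)
# A large shift after removing one instrument flags it as influential.
shifts = [abs(est - full) for est in loo]
most_influential = shifts.index(max(shifts))
```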
UK Biobank and TCGA Cancer Datasets
The UK Biobank dataset consists of approximately 500,000 individuals (Bycroft et al., 2018). We selected age, gender, smoke, drink, and BMI as covariates and originally chose 79 self-reported cancers in up to 337,198 independent individuals (28,820 cases and 308,378 controls) of European ancestry, but only included cancers with at least 60 cases (this cutoff is somewhat arbitrary) and treated cancer-free individuals as controls. Finally, a total of 37 cancers remained in up to 335,036 individuals (27,641 cases for various cancers and 307,395 shared cancer-free controls after removing individuals with missing values). The genotypes were provided by the UK Biobank after the research application was approved. However, we could only obtain 15 SNPs because two were missing (i.e., rs3219104 on PARP1 and rs55749605 on SENP7) in the UK Biobank. In addition, because summary-level statistics are necessary for the two-sample MR analysis, we could only consider 28 cancers from the UK Biobank (n = 420,473) (Sudlow et al., 2015; Supplementary Table S6). The summary statistics of these cancers were obtained from https://pan.ukbb.broadinstitute.org/.
Then, we obtained the survival and clinical information of 33 cancers from TCGA (Hoadley et al., 2018). We selected the overall survival time and status as the outcome and primarily included age at diagnosis, gender, and pathologic tumor stage as covariates because many other important clinical covariates were missing for most of the patients. When the pathologic tumor stage was unavailable, we instead employed the clinical stage (i.e., for CESC, DLBC, OV, THYM, UCEC, and UCS) or histological grade (i.e., for LGG). Note that all three stage variables were missing in five cancers (i.e., GBM, LAML, PCPG, PRAD, and SARC). For each cancer, we only kept samples from the primary cancer tissue and excluded those with missing values in clinical covariates. More details about these TCGA cancers are given in Table 3 and Supplementary Table S3. For each cancer, we filtered out SNPs that had a missingness rate >0.95 across individuals, genotype calling rate <0.95, minor allele frequency (MAF) < 0.01, or Hardy–Weinberg equilibrium (HWE) p-value < 1.00E-4. We next performed an imputation procedure by first phasing the genotypes with SHAPEIT (Delaneau et al., 2013) and then imputing the SNPs based on the Haplotype Reference Consortium panel (McCarthy et al., 2016) on the Michigan Imputation Server using minimac3 (Das et al., 2016). The imputed genotypes were filtered by removing SNPs with an HWE p-value < 1.00E-4, a genotype call rate <95%, a MAF < 0.01, or an imputation score <0.30. After the imputation of genotypes, all of the 17 SNPs were available in TCGA.
Finally, we performed a simulation-based power calculation for detecting a non-zero effect of GRS on cancers in the UK Biobank and TCGA datasets. First, we simulated genotypes for 17 independent SNPs with varying MAFs (Table 1) and then calculated the GRS. Two independent covariates (one binary and the other continuous) were also included, each with an effect size of 0.5. We generated a case–control variable y with the probability of exp(η)/(1 + exp(η)) and η = GRS × θ + 0.5X1 + 0.5X2. We created a population of 2,000,000 individuals and then randomly sampled 50 (or 100 or 150) cases and 300,000 controls (together with their GRS and covariates) as the subset for the final simulation analysis.
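The case–control simulation described above can be sketched as follows. This is a scaled-down illustration, not the simulation code of this study: the MAFs and weights are hypothetical, the prevalence of the binary covariate (0.5) and the standard normal continuous covariate are assumptions, the GRS is left unstandardized for brevity, and the population is much smaller than 2,000,000.

```python
import math
import random


def simulate_individual(rng, mafs, weights, theta):
    """Draw genotypes, covariates, and case status for one individual."""
    # Genotype = number of effect alleles, drawn as two Bernoulli(MAF) alleles.
    genotypes = [int(rng.random() < f) + int(rng.random() < f) for f in mafs]
    grs = sum(b * g for b, g in zip(weights, genotypes))
    x1 = int(rng.random() < 0.5)   # binary covariate (assumed prevalence 0.5)
    x2 = rng.gauss(0.0, 1.0)       # continuous covariate (assumed N(0, 1))
    eta = grs * theta + 0.5 * x1 + 0.5 * x2
    prob = math.exp(eta) / (1.0 + math.exp(eta))
    y = int(rng.random() < prob)
    return grs, x1, x2, y


rng = random.Random(2020)
mafs = [0.10, 0.25, 0.40]       # hypothetical MAFs
weights = [0.08, 0.05, 0.11]    # hypothetical SNP weights

population = [
    simulate_individual(rng, mafs, weights, theta=0.10) for _ in range(5000)
]
cases = [row for row in population if row[3] == 1]
controls = [row for row in population if row[3] == 0]

# Sample a fixed number of cases and controls from the simulated population.
sample = rng.sample(cases, 50) + rng.sample(controls, 300)
```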
Second, to simulate survival datasets, we first generated genotypes and calculated the GRS in the same way as described above. Again, two independent covariates were included, each with an effect size of 0.5. Then, we employed the inverse probability method (Bender et al., 2005) to create survival times which followed a Weibull distribution, with the shape parameter being 1 and the scale parameter being 0.01. The location parameter of this Weibull distribution was determined by the GRS and the two covariates [i.e., μ = exp(η), with η = GRS × θ + 0.5X1 + 0.5X2]. The censoring rate was fixed at 50% in a random manner (this high censoring rate is similar to that observed in the TCGA cancer dataset). The sample size was set to 100, 300, or 500.
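A minimal sketch of the inverse probability method for Weibull survival times is shown below, using the parameterization T = [−log(U) / (λ exp(η))]^(1/ν) with shape ν = 1 and scale λ = 0.01. The censoring mechanism (each individual censored with probability 0.5 at a uniformly drawn time before the event) is an assumption made for illustration, as are the function names and the standard normal GRS.

```python
import math
import random


def weibull_survival_time(rng, eta, shape=1.0, scale=0.01):
    """Inverse-probability draw of a Weibull time with hazard scaled by exp(eta)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return (-math.log(u) / (scale * math.exp(eta))) ** (1.0 / shape)


def simulate_patient(rng, grs, x1, x2, theta, censor_prob=0.5):
    """Return (observed time, status) with status 1 = death, 0 = censored."""
    eta = grs * theta + 0.5 * x1 + 0.5 * x2
    event_time = weibull_survival_time(rng, eta)
    if rng.random() < censor_prob:
        # Assumed mechanism: censoring occurs uniformly before the event.
        return rng.uniform(0.0, event_time), 0
    return event_time, 1


rng = random.Random(1)
cohort = [
    simulate_patient(
        rng,
        grs=rng.gauss(0.0, 1.0),        # standardized GRS (assumed N(0, 1))
        x1=int(rng.random() < 0.5),     # binary covariate
        x2=rng.gauss(0.0, 1.0),         # continuous covariate
        theta=0.10,
    )
    for _ in range(300)
]
```

With shape 1, the Weibull distribution reduces to an exponential distribution, so the baseline mean survival time (η = 0) is 1/0.01 = 100 time units.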
In both simulations, the effect size of GRS θ was set to 0.05, 0.10, or 0.20, approximately corresponding to odds ratios (ORs) [or hazard ratios (HRs)] of 1.05, 1.10, and 1.20. Each simulation was repeated 1,000 times, and the power was calculated as the proportion of replicates in which the p-value of GRS was less than 1.67E-3, approximately equal to the significance level after the Bonferroni correction for 30 types of cancers (0.05/30).
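The arithmetic behind these settings can be checked with a short sketch: the Bonferroni threshold is 0.05/30, the ratio scale follows from exponentiating θ, and the empirical power is the fraction of replicates below the threshold. The p-values below are hypothetical placeholders, not simulation output.

```python
import math


def empirical_power(p_values, alpha):
    """Proportion of simulation replicates with p-value below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


# Bonferroni-corrected significance level for 30 cancer types.
alpha = 0.05 / 30

# Effect sizes on the log scale and the corresponding OR/HR.
thetas = (0.05, 0.10, 0.20)
ratios = [math.exp(t) for t in thetas]

# Hypothetical p-values from four replicates, for illustration only.
power = empirical_power([0.001, 0.01, 0.0001, 0.5], alpha)
```

The exact ratios are about 1.051, 1.105, and 1.221, so the values 1.05, 1.10, and 1.20 quoted in the text are approximations.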
Throughout our study, we utilized the R software (version 3.6.1) to implement all the analyses. The association was declared to be statistically significant if the false discovery rate (FDR) is <0.05 (Benjamini and Hochberg, 1995), while the association was deemed to be suggestive if the unadjusted p-value is <0.05.
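For readers unfamiliar with the Benjamini–Hochberg procedure, the sketch below computes FDR-adjusted p-values in plain Python. It is meant as an illustration of the adjustment, not as the code used in this study; the function name and input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical unadjusted p-values for four cancers.
p_raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(p_raw)

# Classification rule used in the text: FDR < 0.05 is significant,
# unadjusted p < 0.05 (but FDR >= 0.05) is suggestive.
labels = [
    "significant" if q < 0.05 else ("suggestive" if p < 0.05 else "none")
    for p, q in zip(p_raw, fdr)
]
```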
Results
Association Between GRS and UK Biobank Cancers
The 17 selected index SNPs collectively explain about 1.37% phenotypic variance of leukocyte telomere length, and all the F statistics are above 10 (ranging from 27.9 to 205.4, with an average of 63.3) (Table 1), largely ruling out the possibility of weak instrument bias (Cragg and Donald, 1993; Burgess et al., 2017; Zeng and Zhou, 2019a). Based on the constructed GRS, we first investigate the association between leukocyte telomere length and the risk of UK Biobank cancers (Table 2). We detect that the GRS of leukocyte telomere length is significantly associated with a decreased risk of seven types of cancers (Table 2), including multiple myeloma [OR = 0.77, 95% confidence interval (CI) = 0.63–0.93, FDR = 0.021], chronic lymphocytic leukemia (OR = 0.82, 95%CI = 0.71–0.94, FDR = 0.020), kidney/renal cell cancer (OR = 0.86, 95%CI = 0.78–0.95, FDR = 0.017), bladder cancer (OR = 0.91, 95%CI = 0.84–0.98, FDR = 0.030), malignant melanoma (OR = 0.91, 95%CI = 0.88–0.95, FDR = 9.56E-05), basal cell carcinoma (OR = 0.94, 95%CI = 0.90–0.97, FDR = 0.010), and prostate cancer (OR = 0.94, 95%CI = 0.91–0.98, FDR = 0.020). Suggestive associations are observed for two types of cancers including sarcoma/fibrosarcoma (OR = 0.84, 95%CI = 0.72–0.98, FDR = 0.063) and Hodgkin’s lymphoma/Hodgkin’s disease (OR = 0.89, 95%CI = 0.79–0.99, FDR = 0.069). In addition, we discover that the GRS of leukocyte telomere length is also marginally related to an increased risk of leukemia (OR = 1.20, 95%CI = 1.02–1.41, FDR = 0.058).
We further examine the interaction effect of GRS and one of the covariates (e.g., age, gender, smoke, drink, or BMI) for each of the 10 cancers. We observe that the interaction term is statistically significant between smoke and GRS for sarcoma/fibrosarcoma (OR = 0.83, 95%CI = 0.71–0.97) as well as between drink and GRS for leukemia (OR = 0.82, 95%CI = 0.69–0.97) (Supplementary Table S4).
Association Between GRS and TCGA Cancers
We now examine the effect size of GRS on 33 TCGA cancers through the Cox proportional hazards model. We observe suggestive evidence that the GRS of leukocyte telomere length is related to a higher death hazard of READ (HR = 1.72, 95%CI = 1.09–2.73, p = 0.020), SARC (HR = 1.29, 95%CI = 1.06–1.58, p = 0.011), and SKCM (HR = 1.19, 95%CI = 1.03–1.37, p = 0.018) and is associated with a lower death hazard of KIRP (HR = 0.66, 95%CI = 0.47–0.93, p = 0.019), suggesting that a genetically decreased leukocyte telomere length may lead to a worse overall survival of READ, SARC, and SKCM but a better overall survival of KIRP. However, all these associations become non-significant after accounting for multiple comparisons (FDR > 0.05). Neither suggestive nor significant associations are identified between GRS and the remaining cancers (Table 3). We further examine the interaction effect of GRS and each of the covariates (e.g., age at diagnosis, gender, or stage) for each of the four cancers. We do not identify any statistically significant interactions (Supplementary Table S5).
Association Between Leukocyte Telomere Length and UK Biobank Cancers Using the Two-Sample MR
With the selected 17 instrumental variables, we further perform MR analysis to investigate the causal effect of leukocyte telomere length on each of the 28 cancers from the UK Biobank. As no evidence of effect heterogeneity is present across instruments (all the p-values for the Cochran’s Q test are greater than 0.05), only the results estimated via the fixed-effects IVW method are displayed below. Among the 28 cancers, we identify that leukocyte telomere length is associated with a decreased risk of nine cancers (Supplementary Table S6), including basal cell carcinoma, malignant melanoma, skin cancer, bladder cancer, kidney/renal cell cancer, Hodgkin’s lymphoma/Hodgkin’s disease, thyroid cancer, chronic lymphocytic leukemia, and multiple myeloma. We also observe that leukocyte telomere length is associated with an increased risk of leukemia (Supplementary Table S6).
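The heterogeneity check mentioned above can be illustrated with a short sketch of Cochran’s Q statistic for the per-instrument ratio estimates. The sketch assumes first-order inverse-variance weights and uses hypothetical inputs; the resulting Q would be compared with a chi-square distribution with (number of instruments − 1) degrees of freedom, a step omitted here because it requires a statistical library.

```python
def cochran_q(beta_x, beta_y, se_y):
    """Cochran's Q for heterogeneity of ratio estimates across instruments.

    Returns (Q, degrees of freedom).
    """
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    # The weighted mean of the ratios equals the fixed-effects IVW estimate.
    theta = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - theta) ** 2 for w, r in zip(weights, ratios))
    return q, len(beta_x) - 1


# Hypothetical summary statistics for four instruments.
beta_x = [0.08, 0.05, 0.11, 0.07]
beta_y = [-0.040, -0.026, -0.052, -0.037]
se_y = [0.010, 0.012, 0.015, 0.011]

q_stat, df = cochran_q(beta_x, beta_y, se_y)
```

A Q value close to its degrees of freedom is compatible with homogeneous instrument effects, in which case the fixed-effects IVW estimate is a reasonable summary.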
We now validate the observed causal associations shown above through various sensitivity analyses (Supplementary Table S6). Here, we focus on the associations that are significant in all sensitivity analyses (i.e., the p-values of the weighted median and maximum likelihood methods are both <0.05) and show no evidence of horizontal pleiotropic effects (i.e., the p-value of the MR-Egger intercept is >0.05). Four types of cancers remain, including malignant melanoma (OR = 0.58, 95%CI = 0.44–0.79, FDR = 0.004), Hodgkin’s lymphoma/Hodgkin’s disease (OR = 0.30, 95%CI = 0.13–0.69, FDR = 0.008), chronic lymphocytic leukemia (OR = 0.20, 95%CI = 0.08–0.54, FDR = 0.004), and multiple myeloma (OR = 0.18, 95%CI = 0.05–0.66, FDR = 0.018). Of note, both the weighted median method and the maximum likelihood method generate causal effect estimates consistent with those of the IVW method (Supplementary Table S6). In addition, we create scatter plots for the SNP effect sizes of leukocyte telomere length and these four cancers (Figure 1); no instrument appears to be a potential outlier. This finding is also supported by MR-PRESSO, which indicates the absence of instrument outliers at the significance level of 0.05.
Figure 1. Relationship between the single nucleotide polymorphism (SNP) effect sizes of leukocyte telomere length (LTL) (x-axis) and the corresponding effect sizes of cancer (y-axis). (A) Malignant melanoma. (B) Hodgkin’s lymphoma/Hodgkin’s disease. (C) Chronic lymphocytic leukemia. (D) Multiple myeloma. In the plot, horizontal/vertical lines represent the 95% confidence interval.
To further examine whether a single instrumental variable may strongly influence the causal effects of leukocyte telomere length on these four cancers, we perform the LOO analysis. Again, the LOO analysis results demonstrate that none of the 17 instruments substantially impacts the estimated causal effect. Therefore, we conclude that a shorter leukocyte telomere length likely decreases the risk of malignant melanoma, Hodgkin’s lymphoma/Hodgkin’s disease, chronic lymphocytic leukemia, and multiple myeloma. This finding is also consistent with the results derived by the GRS regression above.
Power Calculation for the Association Between GRS and Cancers in the UK Biobank/TCGA Datasets
According to our simulations, we have sufficient power to detect the association in the UK Biobank because the total sample size is large, even though only a few cancer cases are included. Specifically, we observe that the estimated power approaches 100% even when the number of cases is only 50 and the OR is only 1.05. In contrast, due to the relatively weak effect size and small sample size in the simulated TCGA cancer dataset, under our simulation settings, we have only low to moderate power to detect the association between GRS and the survival risk of cancer (Figure 2). For example, when the sample size is 300, the statistical power is only 3.0% or 10.7% when the HR is set to 1.05 or 1.10, respectively. As expected, the power improves as the sample size and effect size increase.
Figure 2. Estimated power in the simulation to evaluate the association between genetic risk score (GRS) and cancers in The Cancer Genome Atlas (TCGA). In the simulation, the effect sizes of GRS were set to 0.05, 0.10, and 0.20 and the sample sizes of cancer were set to 100, 300, and 500.
Discussion
Summary of the Results of the Present Study
The main objective of our study was to investigate whether there existed associations between genetically predicted leukocyte telomere length and various types of cancers. To achieve this, we first constructed the GRS of leukocyte telomere length based on associated SNPs from a large-scale GWAS and evaluated the effect of GRS on the risk and mortality of cancers. We found statistical evidence supporting the existence of associations between GRS and cancers in the UK Biobank and TCGA. Briefly, based on the GRS, a shorter leukocyte telomere length was associated with a decreased risk of some cancers (i.e., multiple myeloma, chronic lymphocytic leukemia, kidney/renal cell cancer, bladder cancer, malignant melanoma, basal cell carcinoma, prostate cancer, sarcoma/fibrosarcoma, and Hodgkin’s lymphoma/Hodgkin’s disease) as well as with a decreased mortality of KIRP. In contrast, a shorter leukocyte telomere length was associated with an increased risk of leukemia as well as an increased mortality of READ, SARC, and SKCM. The results of the MR analysis also supported the existence of an association between leukocyte telomere length and various cancers, including malignant melanoma, Hodgkin’s lymphoma/Hodgkin’s disease, chronic lymphocytic leukemia, and multiple myeloma. The diverse associations between leukocyte telomere length and cancers may in part reflect the different carcinogenic mechanisms of telomeres in specific cancer types, further suggesting that telomere length is a valuable indicator of cancer risk and prognosis.
Discoveries Combined With the Previous Study
We found that the observed associations between leukocyte telomere length and cancers in the present study (i.e., multiple myeloma, chronic lymphocytic leukemia, kidney/renal cell cancer, bladder cancer, malignant melanoma, basal cell carcinoma, and prostate cancer) are largely consistent with prior findings obtained with MR (Supplementary Table S1; Zhang et al., 2015; Ojha et al., 2016; Haycock et al., 2017; Machiela et al., 2017; Li et al., 2020; Went et al., 2020). In particular, several previous studies demonstrated that a shorter telomere length was associated with a decreased lung cancer risk or mortality and that the association was present in adenocarcinoma while absent in squamous cell carcinoma (Supplementary Table S1; Zhang et al., 2015; Haycock et al., 2017; Kachuri et al., 2018; Yuan et al., 2018), which may be attributed to differences in the biological characteristics of the subtypes of lung cancer. In the present study, inconsistent associations were also identified across related cancers. For example, we discovered that leukocyte telomere length had opposite effects on the risk of leukemia and chronic lymphocytic leukemia. However, we observed that leukocyte telomere length displayed similar effects on the risk of malignant melanoma and basal cell carcinoma. These findings suggest that leukocyte telomere length may influence the risk or mortality of cancer in a histology-specific manner and also emphasize the unique roles of telomeres in the development of cancers.
Although the molecular mechanism remains unclear, some prior studies implied that both short and long telomere lengths played an important role in the etiology of cancers (Cui et al., 2012; Cheng et al., 2017; Nelson and Codd, 2020). Cells with longer telomeres have greater proliferative potential and a higher probability of accruing mutations (Hanahan and Weinberg, 2011); therefore, telomere shortening is generally considered to be a protective mechanism against tumorigenesis (Rode et al., 2016; Zhang et al., 2017; Kuo et al., 2019). However, it has been proposed that telomere shortening can give rise to end-to-end chromosome fusions and attenuate the DNA damage response, thus increasing genomic instability and finally initiating carcinogenesis (Wu et al., 2003). These findings indicate that telomeres play a dual role in cancer development, and this role seems to depend on the type of cancer and the balance between the proliferation and senescence of cells in cancers.
Strengths and Limitations of Our Study
One advantage of our study is that more than 50 diverse types of cancers were investigated; it is thus feasible to undertake a systematic evaluation in the present analysis. In addition, methodologically, the GRS analysis can be viewed as a two-stage regression model within the framework of instrumental variable-based causal inference (Baum et al., 2003; Hernán and Robins, 2006; Zeng et al., 2019a). Specifically, leukocyte telomere length is the exposure of interest and the associated SNPs are the carefully selected instrumental variables which are supposed to satisfy the necessary assumptions of instruments (Lawlor et al., 2008; Sheehan et al., 2008; Zeng et al., 2019a; Zeng and Zhou, 2019a, b). In the first stage, the effect size of each instrumental variable is estimated with an external large-scale GWAS dataset; in the second stage, the influence of leukocyte telomere length on various cancers is assessed based on the genetically determined leukocyte telomere length which is predicted with the chosen instrumental variables. Therefore, under the principle of instrumental variable inference and provided that the instrument assumptions hold, the estimated effect of GRS can be interpreted as causal. In this sense, besides the MR method, we are also investigating the causal association between leukocyte telomere length and cancers by constructing a GRS.
Finally, some shortcomings of this study should also be mentioned. First, the majority of the individuals in the UK Biobank and TCGA were of European ancestry, so our results may not be applicable to other populations. Second, we used telomere length measured in blood leukocytes rather than in all cell types in vivo; however, leukocyte telomere length was demonstrated to be highly correlated with that in cells from other tissues (Friedrich et al., 2000; Wilson et al., 2008; Butt et al., 2010). Third, as described before, the associations of leukocyte telomere length with the mortality of TCGA cancers were only suggestive, and the sample size of these cancers was not sufficiently large to maintain high power to detect weak associations. Therefore, further investigations with a larger sample size are required to validate our results.
Conclusion
Our study reveals that telomere length plays diverse roles in different types of cancer; however, further validation in large-scale prospective studies and deeper investigation of the biological mechanisms are warranted.
Data Availability Statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Author Contributions
PZ conceived the idea for the study. PZ, YW, XZ, SH, and HZ obtained the data. PZ and YG cleaned the datasets, performed the data analyses, and drafted the manuscript. PZ, YG, and YW interpreted the results of the data analyses. All authors approved the manuscript and provided relevant suggestions.
Funding
The research of PZ was supported in part by the Youth Foundation of Humanity and Social Science funded by the Ministry of Education of China (18YJC910002), the Natural Science Foundation of Jiangsu Province of China (BK20181472), the China Postdoctoral Science Foundation (2018M630607 and 2019T120465), the QingLan Research Project of Jiangsu Province for Outstanding Young Teachers, the Six-Talent Peaks Project in Jiangsu Province of China (WSN-087), the Training Project for Youth Teams of Science and Technology Innovation at Xuzhou Medical University (TD202008), the Postdoctoral Science Foundation of Xuzhou Medical University, the National Natural Science Foundation of China (81402765), and the Statistical Science Research Project from the National Bureau of Statistics of China (2014LY112). The research of SH was supported in part by the Social Development Project of Xuzhou City (KC19017). The research of YW was supported in part by the National Natural Science Foundation of China (81402764 and 81973142).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
We thank the ENGAGE Telomere Consortium as well as the EPIC-CVD and EPIC-InterAct cohorts for making the leukocyte telomere length summary data publicly available and are grateful to all the investigators and participants who contributed to this study. This study has been conducted using the UK Biobank resource under application number 30686. The UK Biobank was established by the Wellcome Trust Medical Charity, Medical Research Council, Department of Health, Scottish Government, and the Northwest Regional Development Agency. It has also had funding from the Welsh Assembly Government, British Heart Foundation, and Diabetes UK. The UK Biobank GWAS data can be accessed from the UK Biobank repository (https://biota.osc.ox.ac.uk/). The genetic and phenotypic UK Biobank data are available through application to the UK Biobank (https://www.ukbiobank.ac.uk). The UK Biobank summary statistics can be accessed from https://pan.ukbb.broadinstitute.org/. The TCGA data are publicly available from https://portal.gdc.cancer.gov/legacy-archive/. We also thank the editor, the associate editor, and two reviewers for their constructive comments, which substantially improved our manuscript.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2020.583106/full#supplementary-material
References
Belsky, D. W., Moffitt, T. E., Sugden, K., Williams, B., Houts, R., McCarthy, J., et al. (2013). Development and evaluation of a genetic risk score for obesity. Biodemogr. Soc. Biol. 59, 85–100. doi: 10.1080/19485565.2013.774628
Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. Ser. B 57, 289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x
Bogdan, R., Baranger, D. A. A., and Agrawal, A. (2018). Polygenic risk scores in clinical psychology: bridging genomic risk to individual differences. Ann. Rev. Clin. Psychol. 14, 119–157. doi: 10.1146/annurev-clinpsy-050817-084847
Bowden, J., Del Greco, M. F., Minelli, C., Smith, G. D., Sheehan, N. A., and Thompson, J. R. (2016a). Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I² statistic. Int. J. Epidemiol. 45, 1961–1974. doi: 10.1093/ije/dyw220
Bowden, J., Smith, G. D., Haycock, P. C., and Burgess, S. (2016b). Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet. Epidemiol. 40, 304–314. doi: 10.1002/gepi.21965
Burgess, S., and Thompson, S. G. (2012). Improving bias and coverage in instrumental variable analysis with weak instruments for continuous and binary outcomes. Stat. Med. 31, 1582–1600. doi: 10.1002/sim.4498
Butt, H. Z., Atturu, G., London, N. J., Sayers, R. D., and Bown, M. J. (2010). Telomere length dynamics in vascular disease: a review. Eur. J. Vasc. Endovasc. Surg. 40, 17–26. doi: 10.1016/j.ejvs.2010.04.012
Bycroft, C., Freeman, C., Petkova, D., Band, G., Elliott, L. T., Sharp, K., et al. (2018). The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209. doi: 10.1038/s41586-018-0579-z
Cheng, Y., Yu, C., Huang, M., Du, F., Song, C., Ma, Z., et al. (2017). Genetic association of telomere length with hepatocellular carcinoma risk: a Mendelian randomization analysis. Cancer Epidemiol. 50(Pt A), 39–45. doi: 10.1016/j.canep.2017.07.011
Codd, V., Nelson, C. P., Albrecht, E., Mangino, M., Deelen, J., Buxton, J. L., et al. (2013). Identification of seven loci affecting mean telomere length and their association with disease. Nat. Genet. 45, 422–427. doi: 10.1038/ng.2528
Cui, Y., Cai, Q. Y., Qu, S. M., Chow, W. H., Wen, W. Q., Xiang, Y. B., et al. (2012). Association of leukocyte telomere length with colorectal cancer risk: nested case-control findings from the Shanghai Women’s Health Study. Cancer Epidemiol. Biomarkers Prev. 21, 1807–1813. doi: 10.1158/1055-9965.Epi-12-0657
Dorajoo, R., Chang, X., Gurung, R. L., Li, Z., Wang, L., Wang, R., et al. (2019). Loci for human leukocyte telomere length in the Singaporean Chinese population and trans-ethnic genetic studies. Nat. Commun. 10:2491. doi: 10.1038/s41467-019-10443-2
Duncan, L., Shen, H., Gelaye, B., Meijsen, J., Ressler, K., Feldman, M., et al. (2019). Analysis of polygenic risk score usage and performance in diverse human populations. Nat. Commun. 10:3328. doi: 10.1038/s41467-019-11112-0
Friedrich, U., Griese, E., Schwab, M., Fritz, P., Thon, K., and Klotz, U. (2000). Telomere length in different tissues of elderly patients. Mech. Ageing Dev. 119, 89–99. doi: 10.1016/s0047-6374(00)00173-1
Gu, J. A., Chen, M., Shete, S., Amos, C. I., Kamat, A., Ye, Y. Q., et al. (2011). A genome-wide association study identifies a locus on chromosome 14q21 as a predictor of leukocyte telomere length and as a marker of susceptibility for bladder cancer. Cancer Prevent. Res. 4, 514–521. doi: 10.1158/1940-6207.Capr-11-0063
Guo, Y., Andersen, S. W., Shu, X. O., Michailidou, K., Bolla, M. K., Wang, Q., et al. (2016). Genetically predicted body mass index and breast cancer risk: Mendelian randomization analyses of data from 145,000 women of European descent. PLoS Med. 13:e1002105. doi: 10.1371/journal.pmed.1002105
Hartwig, F. P., Davey Smith, G., and Bowden, J. (2017). Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int. J. Epidemiol. 46, 1985–1998. doi: 10.1093/ije/dyx102
Haycock, P. C., Burgess, S., Nounu, A., Zheng, J., Okoli, G. N., Bowden, J., et al. (2017). Association between telomere length and risk of cancer and non-neoplastic diseases: a Mendelian randomization study. JAMA Oncol. 3, 636–651. doi: 10.1001/jamaoncol.2016.5945
Hoadley, K. A., Yau, C., Hinoue, T., Wolf, D. M., Lazar, A. J., Drill, E., et al. (2018). Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell 173, 291–304.e6. doi: 10.1016/j.cell.2018.03.022
Kachuri, L., Saarela, O., Bojesen, S. E., Davey Smith, G., Liu, G., Landi, M. T., et al. (2018). Mendelian randomization and mediation analysis of leukocyte telomere length and risk of lung and head and neck cancers. Int. J. Epidemiol. 48, 751–766. doi: 10.1093/ije/dyy140
Khera, A. V., Chaffin, M., Aragam, K. G., Haas, M. E., Roselli, C., Choi, S. H., et al. (2018). Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50:1219. doi: 10.1038/s41588-018-0183-z
Khera, A. V., Chaffin, M., Wade, K. H., Zahid, S., Brancale, J., Xia, R., et al. (2019). Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587–596.e9. doi: 10.1016/j.cell.2019.03.028
Kuo, C. L., Pilling, L. C., Kuchel, G. A., Ferrucci, L., and Melzer, D. (2019). Telomere length and aging-related outcomes in humans: a Mendelian randomization study in 261,000 older participants. Aging Cell 18, e13017. doi: 10.1111/acel.13017
Lawlor, D. A., Harbord, R. M., Sterne, J. A., Timpson, N., and Davey Smith, G. (2008). Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat. Med. 27, 1133–1163. doi: 10.1002/sim.3034
Levy, D., Neuhausen, S. L., Hunt, S. C., Kimura, M., Hwang, S. J., Chen, W., et al. (2010). Genome-wide association identifies OBFC1 as a locus involved in human leukocyte telomere biology. Proc. Natl. Acad. Sci. U.S.A. 107, 9293–9298. doi: 10.1073/pnas.0911494107
Li, C., Stoma, S., Lotta, L. A., Warner, S., Albrecht, E., Allione, A., et al. (2020). Genome-wide association analysis in humans links nucleotide metabolism to leukocyte telomere length. Am. J. Hum. Genet. 106, 389–404. doi: 10.1016/j.ajhg.2020.02.006
Ma, H. X., Zhou, Z. Y., Wei, S., Liu, Z. S., Pooley, K. A., Dunning, A. M., et al. (2011). Shortened telomere length is associated with increased risk of cancer: a meta-analysis. PLoS One 6:e20466. doi: 10.1371/journal.pone.0020466
Machiela, M. J., Hofmann, J. N., Carreras-Torres, R., Brown, K. M., Johansson, M., Wang, Z., et al. (2017). Genetic variants related to longer telomere length are associated with increased risk of renal cell carcinoma. Eur. Urol. 72, 747–754. doi: 10.1016/j.eururo.2017.07.015
Mangino, M., Hwang, S. J., Spector, T. D., Hunt, S. C., Kimura, M., Fitzpatrick, A. L., et al. (2012). Genome-wide meta-analysis points to CTC1 and ZNF676 as genes regulating telomere homeostasis in humans. Hum. Mol. Genet. 21, 5385–5394. doi: 10.1093/hmg/dds382
McCarthy, S., Das, S., Kretzschmar, W., Delaneau, O., Wood, A. R., Teumer, A., et al. (2016). A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283. doi: 10.1038/ng.3643
Noyce, A. J., Kia, D. A., Hemani, G., Nicolas, A., Price, T. R., De Pablo-Fernandez, E., et al. (2017). Estimating the causal influence of body mass index on risk of Parkinson disease: a Mendelian randomisation study. PLoS Med. 14:e1002314. doi: 10.1371/journal.pmed.1002314
Ojha, J., Codd, V., Nelson, C. P., Samani, N. J., Smirnov, I. V., Madsen, N. R., et al. (2016). Genetic variation associated with longer telomere length increases risk of chronic lymphocytic leukemia. Cancer Epidemiol. Biomarkers Prev. 25, 1043–1049. doi: 10.1158/1055-9965.Epi-15-1329
Pooley, K. A., Bojesen, S. E., Weischer, M., Nielsen, S. F., Thompson, D., Al Olama, A. A., et al. (2013). A genome-wide association scan (GWAS) for mean telomere length within the COGS project: identified loci show little association with hormone-related cancer risk. Hum. Mol. Genet. 22, 5056–5064. doi: 10.1093/hmg/ddt355
Ripatti, S., Tikkanen, E., Orho-Melander, M., Havulinna, A. S., Silander, K., Sharma, A., et al. (2010). A multilocus genetic risk score for coronary heart disease: case-control and prospective cohort analyses. Lancet 376, 1393–1400. doi: 10.1016/S0140-6736(10)61267-6
Shim, H., Chasman, D. I., Smith, J. D., Mora, S., Ridker, P. M., Nickerson, D. A., et al. (2015). A multivariate genome-wide association analysis of 10 LDL subfractions, and their response to statin treatment, in 1868 Caucasians. PLoS One 10:e0120758. doi: 10.1371/journal.pone.0120758
Sudlow, C., Gallacher, J., Allen, N., Beral, V., Burton, P., Danesh, J., et al. (2015). UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12:e1001779. doi: 10.1371/journal.pmed.1001779
Verbanck, M., Chen, C.-Y., Neale, B., and Do, R. (2018). Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698. doi: 10.1038/s41588-018-0099-7
Went, M., Cornish, A. J., Law, P. J., Kinnersley, B., van Duin, M., Weinhold, N., et al. (2020). Search for multiple myeloma risk factors using Mendelian randomization. Blood Adv. 4, 2172–2179. doi: 10.1182/bloodadvances.2020001502
Wilson, W. R. W., Herbert, K. E., Mistry, Y., Stevens, S. E., Patel, H. R., Hastings, R. A., et al. (2008). Blood leucocyte telomere DNA content predicts vascular telomere DNA content in humans with and without vascular disease. Eur. Heart J. 29, 2689–2694. doi: 10.1093/eurheartj/ehn386
Wu, X., Amos, C. I., Zhu, Y., Zhao, H., Grossman, B. H., Shay, J. W., et al. (2003). Telomere dysfunction: a potential cancer predisposition factor. J. Natl. Cancer Inst. 95, 1211–1218. doi: 10.1093/jnci/djg011
Xu, X., Qu, K., Pang, Q., Wang, Z., Zhou, Y., and Liu, C. (2016). Association between telomere length and survival in cancer patients: a meta-analysis and review of literature. Front. Med. 10, 191–203. doi: 10.1007/s11684-016-0450-2
Yu, X., Yuan, Z., Lu, H., Gao, Y., Chen, H., Shao, Z., et al. (2020). Relationship between birth weight and chronic kidney disease: evidence from systematics review and two-sample Mendelian randomization analysis. Hum. Mol. Genet. 29, 2261–2274. doi: 10.1093/hmg/ddaa074
Yuan, J. M., Beckman, K. B., Wang, R., Bull, C., Adams-Haduch, J., Huang, J. Y., et al. (2018). Leukocyte telomere length in relation to risk of lung adenocarcinoma incidence: findings from the Singapore Chinese Health Study. Int. J. Cancer 142, 2234–2243. doi: 10.1002/ijc.31251
Zeng, P., Wang, T., Zheng, J., and Zhou, X. (2019a). Causal association of type 2 diabetes with amyotrophic lateral sclerosis: new evidence from Mendelian randomization using GWAS summary statistics. BMC Med. 17:225. doi: 10.1186/s12916-019-1448-9
Zhang, C., Doherty, J. A., Burgess, S., Hung, R. J., Lindstrom, S., Kraft, P., et al. (2015). Genetic determinants of telomere length and risk of common cancers: a Mendelian randomization study. Hum. Mol. Genet. 24, 5356–5366. doi: 10.1093/hmg/ddv252
Zhang, X., Zhao, Q., Zhu, W., Liu, T., Xie, S. H., Zhong, L. X., et al. (2017). The association of telomere length in peripheral blood cells with cancer risk: a systematic review and meta-analysis of prospective studies. Cancer Epidemiol. Biomarkers Prev. 26, 1381–1390. doi: 10.1158/1055-9965.Epi-16-0968
Keywords: leukocyte telomere length, cancer, genetic risk score, UK Biobank, TCGA, Mendelian randomization
Citation: Gao Y, Wei Y, Zhou X, Huang S, Zhao H and Zeng P (2020) Assessing the Relationship Between Leukocyte Telomere Length and Cancer Risk/Mortality in UK Biobank and TCGA Datasets With the Genetic Risk Score and Mendelian Randomization Approaches. Front. Genet. 11:583106. doi: 10.3389/fgene.2020.583106
Received: 14 July 2020; Accepted: 24 September 2020; Published: 23 October 2020.
Edited by: Lei Zhang, Soochow University, China
Reviewed by: Jian Gu, University of Texas MD Anderson Cancer Center, United States
Yuehua Cui, Michigan State University, United States
Copyright © 2020 Gao, Wei, Zhou, Huang, Zhao and Zeng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Ping Zeng, email@example.com
†These authors share first authorship