Telomere Maintenance Pathway Activity Analysis Enables Tissue- and Gene-Level Inferences

Telomere maintenance is one of the mechanisms ensuring indefinite divisions of cancer and stem cells. Good understanding of telomere maintenance mechanisms (TMM) is important for studying cancers and designing therapies. However, molecular factors triggering selective activation of either the telomerase dependent (TEL) or the alternative lengthening of telomeres (ALT) pathway are poorly understood. In addition, more accurate and easy-to-use methodologies are required for TMM phenotyping. In this study, we have performed literature based reconstruction of signaling pathways for the ALT and TEL TMMs. Gene expression data were used for computational assessment of TMM pathway activities and compared with experimental assays for TEL and ALT. Explicit consideration of pathway topology makes bioinformatics analysis more informative compared to computational methods based on simple summary measures of gene expression. Application to healthy human tissues showed high ALT and TEL pathway activities in testis, and identified genes and pathways that may trigger TMM activation. Our approach offers a novel option for systematic investigation of TMM activation patterns across cancers and healthy tissues for dissecting pathway-based molecular markers with diagnostic impact.


INTRODUCTION
Telomeres perform protective functions at the ends of linear eukaryotic chromosomes. They constitute one of the basic molecular factors conditioning the ability of cells to divide. Excessive cell divisions lead to incomplete reconstitution of telomeres resulting in telomere shortening and loss of proper structure of telomeric ends (Blackburn, 1991;Chow et al., 2012;Martínez and Blasco, 2015). Highly proliferative cells, such as cancer and stem cells, may utilize mechanisms for preserving telomere ends despite many rounds of divisions (Greenberg, 2005). These are known as telomere maintenance mechanisms (TMMs; Hug and Lingner, 2006). The cells may trigger activation of different TMM pathways, either using the telomerase reverse transcriptase driven synthesis (TEL; Hug and Lingner, 2006;Shay, 2016), or via DNA break-induced repair (BIR) like processes, also known as alternative lengthening of telomeres (ALT; Cesare and Reddel, 2008;Neumann et al., 2013;Sobinoff and Pickett, 2017;Jia-Min Zhang et al., 2019). The TEL pathway is more commonly occurring in stem cells and the majority of cancers, while ALT is mostly activated in tumors of mesenchymal and neuroepithelial origin (liposarcomas, osteosarcomas, and oligodendroglial gliomas), but can also be found in tumors of epithelial origin (carcinomas of the breast, lung, and kidney) (Henson et al., 2002). Some cancers such as neuroblastomas and liposarcomas, do not show evidence for activation of any of the two TMM pathways, exhibiting the ever shorter telomeres phenotype (Costa et al., 2006;Dagg et al., 2017). Finally, some indications have recently prompted that certain cancer entities (liposarcoma and other sarcomas, some tumor types) might also have both of the TMM pathways activated (Costa et al., 2006;Gocha et al., 2013).
Current experimental assays for TMM phenotyping have several shortcomings. For example, the telomeric-repeat amplification protocol (TRAP) assay for estimating telomerase activity is not very sensitive and is time and resource consuming (Fajkus, 2006). Assays to measure ALT activity are based on the assessment of chromosomal and/or cellular markers, such as C-circles, ALT-associated nuclear bodies or heterogeneous distributions of telomere length (Pickett and Reddel, 2015;Jia-Min Zhang et al., 2019), which are usually observed in ALT-type cancers. However, recent studies strongly suggest that none of these markers alone is sufficient to define the ALT status of a cell (Jia- Min Zhang et al., 2019). Attempts to use gene expression signatures for classification of TMM mechanisms have been made (Lafferty-Whyte et al., 2009). However, those signatures are applicable to specific data (Lafferty-Whyte et al., 2009) and they do not provide mechanistic details about TMM pathway activation.
Here we were set to develop a complementary approach to TMM detection that utilizes widely available gene expression data in combination with molecular interaction topologies in the TMM pathways. Establishment and analysis of TMM pathway topologies is not a trivial issue, because there is no holistic understanding of the functional context of the molecular factors, of their interactions and of the mechanisms triggering TMM activation. Previous research has identified transcriptional regulators of telomerase complex assembly (Yuan et al., 2019), however, how the enzyme components are processed and brought together (Schmidt and Cech, 2015), how the enzyme is recruited to the telomeres and what promotes final synthesis (Chen et al., 2012), is largely not clear and scattered throughout the literature in the best case. Even less is known about regulation of ALT on a gene level. Although it is considered as a break-induced repair (BIR)-like process, some of the usually accepted BIR factors are not always involved (Jia- Min Zhang et al., 2019). We previously developed a TMM-pathway approach under consideration of TEL and ALT and applied it to colon cancer (Nersisyan et al., 2019). However, overall there is no pathway representation, neither of TEL, nor of ALT TMM, which has been proven in a wider context of cells and/or tissues.
In the first part of the manuscript, we show how gene expression data can be used for TMM phenotyping making use of the TEL and ALT TMM pathways. We have constructed these pathways based on available knowledge about molecular factors and interactions involved in TMMs by further developing our previous work (Nersisyan et al., 2019). For demonstration, we have analyzed available gene expression data on different cancers with independent experimental TMM annotations (Lafferty-Whyte et al., 2009). In the second part, we apply comprehensive bioinformatics analyses to discover details of TEL and ALT activation in healthy human tissues.

Literature Based Reconstruction of Telomere Maintenance Pathways
In order to address the current lack of signaling pathway representations of telomere maintenance mechanisms (TMMs), we have performed a literature search to identify genes involved in TEL and/or ALT TMMs and to define their interaction partners and functional role (Figure 1). Overall, we identified 38 (ALT) and 27 (TEL) genes derived from 19 and 13 references, respectively (Tables 1, 2). We have considered interactions among these genes in terms of pathway topologies and took into account complex formation and other molecular events with possible impact for TEL or the ALT TMM in order to describe pathway activation in time and space (Figure 1).
The ALT pathway is represented through a series of branches describing DNA damage and assembly of APB bodies, which then promote separation of one of the telomere strands and invasion to another telomeric template from sister chromatids, other chromosomes or extra-chromosomal telomeric sequences, followed by DNA polymerase delta assisted telomere synthesis and ultimate processing of Holliday junctions, which are formed during strand invasion ( Figure 1A; Henson et al., 2002;Pickett and Reddel, 2015;Jia-Min Zhang et al., 2019;Sobinoff and Pickett, 2020). The ALT pathway thus takes into account not only APB formation, but also many other events leading to telomere synthesis. We have included the molecular factors involved in those events based on studies where both presence of APBs and C-circle assays were used for ALTdetection (Supplementary Table S4). The TEL pathway describes expression, post-transcriptional modifications, recruitment and assembly of different components of the telomerase complex, namely hTERT, hTR, and dyskerin, followed by the formation of a catalytically active telomerase complex, its recruitment to telomeres and telomere synthesis by telomerase and DNA polymerase alpha (Figure 1B; Hug and Lingner, 2006;Tseng et al., 2015;Rice and Skordalakes, 2016).
Our pathway model is based on the following general assumptions: (i) TMM pathways are subnets of a global network of cellular processes. We only include genes that have unambiguous effect on either TEL or ALT pathway. (ii) Alternative splicing and/or post-transcriptional regulatory effects mediated, e.g., via transcription factors, non-coding RNA (miRNA or lncRNA) or interactions on protein level, are not explicitly considered in our TMM pathways. We assume that these factors to a certain degree are implicitly taken into account by the measured expression levels of the genes in the TMM pathways. (iii) The TMM pathways are treated as directed graphs to mimic signal transduction from source to sink as described below.

TMM Pathway Activities Are Supported by Experimental Assays
For assessment of TEL and ALT TMM pathway activities in a given sample we use gene expression data and the pathway signal flow (PSF) algorithm as implemented in Cytoscape (apps PSFC and TMM; see "Materials and Methods" section for details). The algorithm computes the PSF score in each of the pathway nodes by considering signal propagation through all upstream activating and inhibiting interactions and complex and linker node types making use of the fold change (FC) expression values of the involved pathway genes with respect to their mean expression in the respective data set. The PSF score thus reflects the activity of all upstream events and of their topology in contrast to gene set overexpression measures often used alone for functional assessment (Nersisyan et al., 2014. The PSF scores of the final sink nodes then estimate the overall activity of the TEL and the ALT pathway, respectively.  Application to two publicly available microarray gene expression datasets from cell lines and liposarcoma tissues delivers an ALT/TEL-PSF data couple for each sample, which is then plotted into an ALT-versus-TEL PSF coordinate systems (Figures 2A,B). Each sample is color-coded according to its assignment to TEL + /ALT − or TEL + /ALT − or double negative TEL − /ALT − phenotypes which were determined by independent experimental assays (APB assay for ALT and TRAP assay for TEL) alongside with the gene expression measurements. Using support vector machine (SVM) learning, TEL positive and negative samples were separated by a vertical line, and ALT positive and negative samples by a horizontal line in both data sets (Figures 2A,B).
Detailed inspection revealed that all double negative healthy mesenchymal stem cells and mortal cell lines (green circles) indeed locate in the ALT − /TEL − quadrant formed by the perpendicular lines of our classification scheme thus indicating agreement with experimental assays. For single positive ALT + /TEL − (red) and ALT − /TEL + (blue) samples one finds their accumulation in the respective top-left and down-right quadrants, respectively, as expected. A certain fraction of theses phenotypes is found "displaced" in the left-down and top-right quadrants assigning them to double negative and double positive TMM cases, respectively. For some samples, we observed discordance in TMM PSF values between technical replicates, which was also noticeable on the level of gene expression (Supplementary Data 2), suggesting possible technical issues during microarray processing. Overall, we obtained 80-85% agreement between our computational assessment of TMM activity as ALT or TEL single positive,   Chen et al., 2018 and full agreement between double negative samples and the experimental annotations taken from the original publication.

Analysis of Pathway Activation Patterns at Single Gene and Single-Sample Resolution
For a better understanding of particular reasons of the diversity observed in the ALT/TEL plots we visualized pathway activity patterns for selected samples in  Figure 2C), which pushes one replicate to the upper right quadrant corresponding to the double positive TEL + /ALT + pathway phenotype. The C33 cell line (T2), derived from cervical squamous cell carcinoma, shows clear activation of the major TEL pathway nodes, while the main ALT events, such as telomere recruitment to APB, strand invasion and telomere synthesis, are suppressed, pushing it to the TEL + /ALT − quadrant in agreement with experimental assignment ( Figure 2C). The bladder carcinoma cell line 5,637 (T4), experimentally assessed to have high telomerase activity, showed relatively low TEL PSF values. As seen in Figure 2C, the expression and processing of the two telomerase complex subunits hTERT (TEL pathway branch TERT) and dyskerin (branch DKC1) were highly activated in this cell line, however, the expression of the factors processing the RNA template hTR was low, thus being a bottleneck for the telomerase complex formation, explaining the observed discrepancy. We could also identify the genes responsible for this, as described in detail below.
Looking at the patterns of the A7 sample in the dataset of liposarcoma tissues ( Figure 2D), we observed high activity of the APB branch of the ALT pathway, indicating that both the computational annotation and the experimental assay point on accumulation of APB bodies in this tissue. However, as the expression of downstream factors in the ALT pathway was low ( Figure 2D), it compromised the ultimate activation of this pathway. This result supports recent studies showing that the mere presence of APB bodies doesn't necessarily lead to the activation of ALT (Jia- Min Zhang et al., 2019). Similar branchactivation patterns were observed for the A6 and A9 tissue samples (Supplementary Figure S3). It is important to note that these samples had high TEL pathway activity, which is in line with previous observations of high false-positive rate of the APB assay associated with overexpression of TERC or TERT (Henson and Reddel, 2010). In summary, inspection of the PSF patterns along the pathways identifies genes and branches contributing to activation or deactivation of ALT and TEL with single sample Cell lines

Partial Influence (PI) Analysis Identifies Gene-Specific Triggers of Pathway Activities
As a second additional option of extracting gene level information responsible for pathway activation changes, we analyzed the partial influence (PI) of each gene on TMM pathway activities ("Materials and Methods" section). PI enables understanding of which genes act as triggers to activate or to deactivate selected nodes in the pathways. Genes, increasing or decreasing the PSF of the target node have positive or negative PI's, respectively. Activating nodes with log fold change (FC) expression above or below zero in a given sample will thus have a positive or negative PI, respectively. Nodes with inhibitory effect, on the other hand, will have the reverse association with PI ( Figure 3A).
We have examined the PIs in the TEL pathway of the 5,637 (T4) bladder carcinoma cell line (Figure 2A), where branchlevel analyses showed suppression of the hTR maturation branch ( Figure 2C)  encoding the hTR maturation factors NAF1, WRAP53 and PARN (Figure 3) were responsible for low activity of the hTR branch in this sample. Indeed, bringing the relative expression of NAF1 to a fold change value of 1 (Figure 3B) was sufficient to push the activity of the TEL pathway over the threshold for classifying it as TEL + (log PSF of 0). It is important to mention that the used microarray gene expression datasets did not contain expression values for the TERC gene itself, and only the expression of hTR processing factors contributed to the hTR branch in this case. A similar branch activation and PI pattern was observed for the liposarcoma tissue T8, where the TMM-PSF diverged from the experimental assay results (Figure 2B and Supplementary Figure S6). In the other misclassified sample (T6), we observed high PSF activity of the telomerase complex, however, the low TEL pathway activity was driven by low expression of STN1, which stimulates polymerase alpha in synthesis of the complementary telomeric strand after the action of the telomerase complex (Supplementary Figure S6).
PI analysis also identified that various genes were responsible for activation of the APB branch of the ALT pathway in the A6 and A7 liposarcoma tissues, while down-regulation of RAD51 and CHEK1 led to suppression of the downstream strand invasion events leading to low ALT PSF activity in both samples (Supplementary Figure S5). Hence, PI analysis extracts genes which act as triggers for switching TMM on or off with possible impact for altering between ALT and TEL and vice versa.

Comparison With TelNet Genes
Our curated TMM-pathway approach considered in total 63 genes extracted from recent publications (see above and "Materials and Methods" section). As an alternative option we made use of TelNet database (Braun et al., 2018) which collected 2,094 genes with impact for telomere biology and extracted 336 genes annotated as ALT-or TEL-associated (Supplementary Table S3). Separate hierarchical clustering of the expression values of the TEL-and ALT-genes in the cell line and liposarcoma samples analyzed above, well separates double negative ALT − /TEL − from the single positive ALT and TEL samples on one hand, and also the latters each from another. Between 82% and 85% of the samples were properly annotated compared with the experimental annotations. Agreement with PSF-based annotations is high (96 and 100%, respectively). Hence, gene clustering and PSF based classifications and experimental annotations are well-aligned (Figure 4), and those samples that were misclassified by the PSF algorithm were also misclassified with the TelNet gene set clustering. This simple comparison served as an independent validation for the selection of genes in our TMM-PSF approach using TelNet. Note that the overlap between both collections is 42 genes, meaning that 67% of the TMM-pathway genes are considered in TelNet what we attribute to our more recent curation. Moreover, the TMM pathway approach clearly makes use of a markedly reduced number of genes after strict curation and,  moreover, enables topology-based analysis not only to extract genes, but also relevant branches and triggers for switching between TMM pathways. We have also explored the TMM related 12 gene sets from the C5 collection of the Molecular Signatures Database 1 . Together they contained 138 unique genes, either involved in the pathways curated by us or in the TelNet database with ambiguous role related to TMM and not included in the pathways by design (see section "Materials and Methods").
It is worth mentioning that gene sets are typically applied using enrichment or overexpression signature metrics, which are conceptually different but complementary to the PSF-approach that explicitly considers interaction topologies (see Nersisyan et al., 2014 for details). We have previously applied the TMM PSF method together with several telomere maintenance gene signatures such as the Reactome "extension of telomeres" or "telomere maintenance" and found that the latter were able to estimate the overall activity of the TEL TMM, however without providing details of the activating mechanisms as opposed to our approach (Nersisyan et al., 2019). Pathway analysis thus provides an alternative option of estimating TMM activity with an overall detailed resolution of involved sub-processes affecting telomere maintenance.

Telomere Maintenance in Healthy Human Tissues
We have previously applied our PSF approach to study telomere maintenance states in Lynch syndrome and sporadic colorectal cancer subtypes (Nersisyan et al., 2019). Here we expand 1 https://www.gsea-msigdb.org/gsea/msigdb this method to evaluate the state of telomere maintenance mechanisms in healthy human tissues.
We estimated TEL and ALT activities in units of PSF across a series of fifteen tissues (overall N = 17-80 samples per tissue taken from donors evenly distributed across age and sex groups) making use of RNA-seq data taken from the GTEx portal ( Figure 5). The TEL pathway activity was generally low across the tissues (Figure 5A), while ALT shows slightly enhanced PSF-values ( Figure 5B). Interestingly, testis showed considerably increased activity of ALT consistently in all of the samples, and also activation of TEL, however, only in half of them, while the other half showed virtually deactivated TEL (Figures 5A,B).
For a more detailed view on the TMM activation patterns in testis, we stratified the PSF-values of the full sample set (N = 129) according to the age of the donors (Figures 5C,D). No clear aging-trend of TEL and ALT PSF was observed except for a shift of the TEL-PSF distribution toward larger values for younger men of age 20-29 years (Mann-Whitney U test p = 0.01). On the other hand, we observed a clear bi-modal distribution of samples with high or low TEL pathway activities regardless of age ( Figure 5E). The detailed analysis of TEL pathway activation patterns in terms of PI showed that the observed variability was mainly driven by the expression of TERC (Figure 5F). This gene was not expressed in half of the testis samples leading to very low TEL activities. Importantly, TERC was generally found as the main limiting factor for TEL pathway activity across all the tissues studied, meaning that low TERC expression strongly downscales TEL-PSF.
PI analysis of the ALT pathway in testis mainly implicated the importance of RAD51, suggesting RAD51-dependent ALT activation. In the RAD51-dependent pathway, telomere-bound POT1 is replaced by RPA, followed by RAD51 recruitment and strand invasion. Low expression of BLM helicase and of the nuclear receptor NR2F2 potentially limits hyperactivity of ALT in this tissue (Conomos et al., 2014;Zhang et al., 2021), although it has recently be shown that in some cases NR2F2 may not be directly involved in ALT (Alhendi and Royle, 2020). Interestingly, a recent study of Episkopou et al. (2019) have identified that the expression of the testis-specific Y-encoded-like protein 5 (TSPYL5), a previously unrecognized APB body component, is crucial for survival of ALT + cells, as it protects replaced POT1 from proteasomal degradation. We also found high expression of TSPYL5 in our testis samples showing slight correlation with ALT activity (Pearson R = 0.3, Supplementary Figure S7). This supports possible involvement of TSPYL5 in maintaining viability of ALT + cells in healthy human testis.
In summary, our analyses show that both TEL and ALT pathways of telomere maintenance may be activated in human testis. In this tissue, the main driver for ALT pathway activation is RAD51, while the limiting factor for TEL pathway activity is the expression of TERC, which was observed only in a subset of samples, notably more pronounced in younger subjects.

DISCUSSION
The activation of telomere maintenance mechanisms can serve as a phenotypic biomarker for prognostic purposes and for choosing chemotherapies, e.g., by direct targeting of the TMM pathways (Villa et al., 2008;Sugarman et al., 2019;Chen et al., 2020). However, little is known about molecular triggers leading to activation of telomere maintenance mechanisms via the TEL or the ALT pathways. Some cancer tissues are prone to activation of the ALT pathway, such as the mesenchyme-originating liposarcomas, osteosarcomas and glioblastomas, while others are more permissive of TEL activation (Henson et al., 2002). At the same time, some tumors do not activate neither TEL nor ALT TMM pathways, while others may activate both or switch activation from TEL to ALT or vice versa (Costa et al., 2006;Gocha et al., 2013). The molecular mechanisms behind such cellular decisions are mostly unknown. Owing to the role of TMM in cancer prognosis and the promise of telomeretargeting therapies (Villa et al., 2008;Sugarman et al., 2019;Chen et al., 2020), it is of paramount importance to investigate these activation patterns, as well as to come up with better approaches to assess or predict TMM states of tissues and individual cells.
Our study aimed at combining information about molecular factors involved in the TMMs to study the mechanisms of activation of either the TEL or the ALT pathways in cancers and healthy human tissues, and to provide a complementary bioinformatics method for identification of TMM phenotypes from gene expression data. To reach this goal, we have reconstructed TEL and ALT pathways of telomere maintenance taking into account, first of all, comprehensive review and recent original articles of the last 3 years (Supplementary Tables S1,S2). To the best of our knowledge this is the first attempt of providing a holistic view and quantitative analysis of signaling events involved in TMM.
The TMM pathway topologies proved by quantitative assessment of the TEL and ALT TMMs activity in cancer cell lines and tissues based on two gene expression data taken from a previous study (Lafferty-Whyte et al., 2009). Our pathway approach not only considered the expression values of the genes in each pathway, but also their mutual (activating or inhibiting) interactions, complex formation, as well as linking operator nodes, alltogether potentially influencing the final pathway activity states. According to the combinations of the activity of TEL and ALT, we have classified the samples into four TMM phenotypes (TEL − /ALT − , TEL + /ALT − , TEL − /ALT + , TEL + /ALT + ) in 80-85% agreement with independent experimental annotations of TMM in the samples. The absence of TERC expression data due to the lack of microarray probes for this gene may have limited the accuracy of the TEL pathway activity estimation in part of the samples.
Our TMM-PSF method stands out with a couple of advantages: (a) it helps to easily annotate samples based on ALT/TEL pathway activity values, and (b) it provides molecular details for dissecting the role of genes and sub-events in the overall activation of the pathway. For example, we could show that for some samples despite their low ALT activity, the APBpathway branch was highly active, which may explain why these samples were detected as ALT positive in the independent APB assay. This is in agreement with recent studies showing that the existence of APBs does not ensure telomere synthesis and many APBs in the cell may lack ALT activity (Jia- Min Zhang et al., 2019). In consequence APB-based ALT tests may lead to false positives in samples with telomerase overexpression (Henson and Reddel, 2010).
It is important to note that there is no gold-standard method for TMM phenotyping of cells/tissues. All currently available experimental assays have their drawbacks (Pickett and Reddel, 2015;Jia-Min Zhang et al., 2019). The activity of the telomerase enzyme is usually assessed by the TRAP assay that measures the amount of DNA synthesized from a telomerelike template in vitro. However, it has low sensitivity, is not well suited for single-cell analysis and does not account for downstream processes, such as recruitment of the enzyme to the telomeres and synthesis of the complementary strand (Fajkus, 2006). ALT activity is assessed by assays that measure the abundance of extra-telomeric C-circles or of APB's, or heterogeneity of telomere length (Sobinoff and Pickett, 2017). However, while these biomarkers may be common in many ALT positive cells/tissues, it has been shown that they alone are not sufficient for promoting ALT (Jia- Min Zhang et al., 2019). New methods for direct monitoring of telomere elongation in ALT cells are still being adapted (Verma et al., 2018). There is an urging need for novel assays to detect TMM states that could help in understanding cellular response to chemotherapies and for development of TMM targeted therapies (Villa et al., 2008;Sugarman et al., 2019;Chen et al., 2020;Recagni et al., 2020). Particularly useful will be assays that allow for TMM assessment in single cells. As it has been shown previously, and confirmed in this study, both TEL and ALT pathway may be coactivated in the same tumor. It will be extremely important to assess this issue by single cell transcriptomics and our pathway approach whether those pathways co-exist in the same cell or show mosaic activation in the tissue (Costa et al., 2006;Gocha et al., 2013). Additionally, switching from TEL to ALT as a cellular response to a therapy is also possible (Gan et al., 2002;Bechter et al., 2004;Shay et al., 2012;Recagni et al., 2020). It is possible that the poorer agreement with experimental assays in the liposarcoma tissue samples, compared with the cell lines, was caused by the presence of different cell types in those tissues. In this sense, using RNA-seq gene expression data to assess TMM activity from single cells is a promising future direction.
We have previously applied our approach to dissect molecular factors involved in TMM activation in Lynch syndrome and sporadic colorectal cancer subtypes in order to study association of ALT with microsatellite and chromosomal instability in those cancers (Nersisyan et al., 2019). Here we expanded beyond these studies to investigate TMM activation patterns in healthy human tissues. The experimental assays have so far been used to detect ALT activity in cancers only. Previous studies using telomeric DNA tagging have found evidence for telomere elongation via the ALT pathway in the absence of high telomerase activity in mammalian somatic tissues during early development (Liu et al., 2007;Neumann et al., 2013). A recent study by Novakovic et al. (2016) has identified elevated TERRA levels, elongated telomeres and some evidence for ALT-specific C-circles in human placenta. The authors note, however, that the amount of C-cicles was low compared to ALT positive cancers, suggesting that a mild ALT phenotype may exist in specific placental cells. Overall, these studies show that more sensitive methods to detect low ALT levels are needed to further investigate TMM states in healthy human tissues with higher resolution. In this study, gene expression datasets from healthy human tissues from the GTEx portal were used to assess TMM activity states. We found high ALT and TEL pathway activities, first of all, in testis. Interestingly, TEL was activated only in a subset of testicular tissues paralleled with marked TERC expression, while depleted TERC levels lead to low TEL phenotypes. Indications for analogous binary mosaicism effects were reported in previous studies. In humans, TERC is mainly expressed in the primary spermatocytes, however at a lower level than in other spermatogenic cells (Paradis et al., 1999). In addition, TERT expression also shows mosaic-like regulation in testis, depending on the cells of the tissue or on the stage of spermatogenesis (Ozturk, 2015). Another study has shown that telomere length increases during the development of male germ cells from spermatogonia to spermatozoa, inversely correlated with enzymatic activity of telomerase (Achi et al., 2000). This could provide the link between the observed TEL mosaicism, and activation of the ALT pathway in testis.
The high ALT activity observed in testis was largely conditioned by upregulation of RAD51, suggesting that the ALT pathway is activated in a RAD51-dependent, rather than independent manner in testis (Jia- Min Zhang et al., 2019). Interestingly, the expression of the testis-specific Y-encodedlike protein 5 (TSPYL5), a recently identified APB body component that is crucial for survival of ALT + cells (Episkopou et al., 2019) was associated with ALT activity in healthy human testis in our study. Finally, our results indicate elevated TMM activity in testis in agreement with the recent study, also performed on GTEx datasets. Accordingly, telomeres are the longest in healthy human testis compared with other tissues and are in weak negative correlation with age (Demanelis et al., 2020).
There are several limitations to this study that we would like to mention. The accuracy of pathway activity measurements depends on the accuracy of expression values of TMM genes. In the absence of protein abundance measures, one can also include gene-specific translation regulators, such as miRNAs (Santambrogio et al., 2014;Salamati et al., 2020), alternative splicing or functional variants (Ludlow et al., 2019), and the lack of measurements for TERRA abundance (Novakovic et al., 2016).
Another limitation of our study is the lack of other assays as alternative measures for TMM phenotype. In particular a dataset coupled with C-circle assay based measurement of ALT could serve as an additional validation and to estimate generalizability of the curated ALT pathway. However, as the presence of APB was not always indicative of ALT pathway activity in our study, and as we had curated the ALT pathway genes from studies that were originally based on both APB and C-circle based assays (see Supplementary  Table S4), we believe that our approach should not have systematic biases toward only capturing the presence of APBs. The latter is evident, since we do have samples in datasets showing high scores at the node describing for telomere recruitment to APBs, for which we have computed low ALT pathway activities (A7 in Figure 2D and A6,A9 in Supplementary Figure S3) and vice versa (A2 and A3 in Supplementary Figure S3).
In summary, we have reconstructed the TEL and ALT TMM pathways from previous literature. It has been carefully curated relying on reported molecular ingredients contributing to TMM and their interactions. Pathway signal flow activity estimates obtained from gene expression data have been shown to reliably estimate the TMM phenotype as a novel complementary approach to experimental assays. The main advantage of our approach is its "white box" (in contrast to "black box") nature, meaning that the resulting TMM phenotype can be "dissected" at gene level. In other words, we can explore gene-specific effects on sub-processes or events triggering activation of TMM. The method estimates TEL and ALT pathway activity in the same sample, thus enabling to establish its state in a TEL/ALT continuum with impact for investigating co-activation and switching events between the two pathways, e.g., upon cancer treatment and development. Of importance, owing to its sensitivity, it detects subtle activation of TEL and ALT in healthy human tissues. Despite the actuality of our pathway topologies it is important to note that new studies about additional factors affecting TMMs are permanently appearing with possible consequences for the pathways reconstructed here. Notably, the topologies were reconstructed in a generic manner: as the pathways can be activated via different mechanisms, not all the genes are required for the pathway activation in different situation. Future applications will show what branches and components are important in different situations. Single cell transcriptomics is one important field to essentially improve resolution of our TMM pathway method. Expression based methods thus potentially provide an independent and complementary approach to assess the TMM state. Because of the complex nature of TEL and ALT TMM, single gene expression markers or gene set signatures are potentially insufficient in most cases. Our TMM pathway method paves a new way for omics-based evaluation of TMM with a series of applications ranging from cell experiments to cancer diagnostics.

Literature Search and Pathway Construction
TEL and ALT pathways presented in this publication were carefully constructed based on comprehensive literature search resulting in 31 publications which provided relevant knowledge about TMM. To find and select publications describing factors involved in the telomere maintenance mechanisms, we searched the PubMed database with terms "telomerase" or "alternative lengthening of telomeres". The first phase was based on "review" articles published before 2020. These reviews provided the order of molecular processes involved in the pathways and thus their basic topology. The TMM genes mentioned in the review articles were chosen and then used in an extended search of the form (gene or protein) AND ("telomerase" OR "alternative lengthening of telomeres"). We also included genes not yet mentioned in review articles, by repeating the initial search with "telomerase" or "alternative lengthening of telomeres", concentrating on the original research articles of the last 3 years (2017-2020).
The articles were read in chronological order and in case of current consensus about the functional role of the mentioned genes they were included in the respective pathway, otherwise they were ignored. To ensure quality, we confirmed each interaction by inter-researcher agreement between the two authors of this manuscript that have curated the pathways. The present version of ALT and TEL pathways (TMMv2.0) is based on a previous version (TMMv1.0) (Nersisyan, 2017;Nersisyan et al., 2019), and makes use of an improved topology, branch and interaction structure according to updated literature knowledge. The network in.xgmml format may be accessed from Supplementary Data 4.

Data Sources and Preprocessing
For approval of ALT and TEL TMM pathways we made use of gene expression data on cell lines and liposarcoma tissues taken from the study by Lafferty-Whyte et al. (2009) which were annotated as double negative ALT − /TEL − (ALT and TEL inactive) or single positive ALT + /TEL − (ALT active) or ALT − /TEL + (TEL active) using independent experiments. The gene expression matrix files were downloaded from the Gene Expression Omnibus (GEO) repository (accession GSE14533). This dataset contains microarray gene expression profiling data for ten cell lines cultured from different tissues, and seventeen liposarcoma tumor samples along with four human Mesenchymal Stem Cells (hMSC) samples isolated from the bone marrow of healthy individuals. ALT activation was assessed by the presence of ALT-associated promyelocytic leukemia bodies (APBs) (Costa et al., 2006;Cairney et al., 2008b), while TEL activation was measured using the telomeric-repeat amplification protocol for telomerase activity detection (TRAP-assay) (Kim et al., 1994;Cairney et al., 2008a). Among the cell lines, four were ALT positive (ALT + /TEL − ), four were telomerase positive (ALT − /TEL + ), and two were ALT/TEL double inactive (ALT − /TEL − ). Among the liposarcoma samples, nine were ALT positive and eight were telomerase positive. Most of the samples had two technical replicates. In case of multiple microarray probes mapping to the same gene, the probe with highest standard deviation of values was considered. Cell line and tissue data were processed separately. Gene expression values higher than the 0.9 percentile in each of these sets were limited to that percentile value.
Healthy human tissue RNA-seq gene expression data was obtained from the GTEx portal (release V8) in units of transcripts per million (TPM). Fold changes were computed in comparison to the average of non-zero TPM values per gene. Data of the top 15 most common tissues extracted from subjects suffered violent or fast death from natural causes were selected. They were grouped by age and sex. For cross-tissue analysis we selected a maximum number of 10 subjects per age and sex group). For age-dependent analysis of the testis transcriptome all 129 available testis samples were chosen.

Pathway Signal Flow (PSF) Activity and Partial Influence (PI)
The pathway signal flow (PSF) algorithm (Arakelyan and Nersisyan, 2013;Nersisyan et al., 2014Nersisyan et al., , 2017 was used to asses TMM pathway activity. It computes the activation along the whole pathway based on relative expression values of its member genes and of their interactions. Details are described in the supplement (section Pathway Signal Flow algorithm and Supplementary Figure S1). The PSF algorithm is implemented in the Cytoscape app PSFC (v1.1.8) . For the specific tasks applied in this work, we have implemented a higher-level app, TMM (v0.8) 2 . It compares TMM pathway activation patterns with experimental annotations, uses PSFC for pathway activity computation and it also produces reports for TMM phenotype comparison across samples. The app is written in Java (major version 8), the source code is available at https: //github.com/lilit-nersisyan/tmm. The app user guide, along with the example datasets and network files can be accessed at the project homepage http://big.sci.am/software/tmm/.
The partial influence (PI) of a source gene estimates the extent to which its expression affects the activity of a downstream target node in the pathway. The PI-value depends on the expression value of the source gene, pathway topology and the expression of other pathway members. It is computed by neutralizing the fold change of the gene to FC = 1, and calculating the log ratio of PSF at the target node before and after neutralizing its expression (Supplementary Figure S4). To compute the mean PI across all the samples in the testis, we have generated a mean sample by averaging fold change values for each gene, and performed PI analyses on it.

AUTHOR CONTRIBUTIONS
LN, AA, and HB conceived the study. LN and AS performed pathway curation and data analysis. LN designed the software packages and prepared the figures. All authors contributed to data interpretation and manuscript preparation.