Long Intergenic Non-Coding RNAs: Novel Drivers of Human Lymphocyte Differentiation

Upon recognition of a foreign antigen, CD4+ naïve T lymphocytes proliferate and differentiate into subsets with distinct functions. This process is fundamental for effective immune function, as CD4+ T cells orchestrate both the innate and adaptive immune responses. Traditionally, this differentiation event has been regarded as the acquisition of an irreversible cell fate, so that memory and effector CD4+ T subsets were considered terminally differentiated cells or lineages. Consequently, these lineages are conventionally defined by their prototypical sets of cytokines and transcription factors. However, recent findings suggest that CD4+ T lymphocytes possess remarkable phenotypic plasticity, as they can often redirect their functional program depending on the milieu they encounter. New questions therefore arise: which molecular determinants underlie plasticity and stability, and how does the balance between these two opposing forces drive cell fate? In some cases, the mere expression of cytokines and master regulators cannot fully explain lymphocyte plasticity. We should consider other layers of regulation, including epigenetic factors such as the modulation of chromatin state or the transcription of non-coding RNAs, whose high cell specificity hints at their involvement in cell fate determination. In this review, we focus on recent advances in understanding the specification of CD4+ T lymphocyte subsets from an epigenetic point of view. In particular, we emphasize the emerging importance of non-coding RNAs as key players in these differentiation events. We also present new data from our laboratory highlighting the contribution of long non-coding RNAs in driving human CD4+ T lymphocyte differentiation.

THE REVOLUTIONS OF REGULATORY NON-CODING RNAs
At the beginning of this century, the results of the Human Genome Project highlighted the complexity of our genome. What emerged was that the informative fraction of the genome is larger than expected. Subsequent analyses revealed that the vast majority of informative sequences do not encode proteins. Indeed, while processed transcripts cover 62.1% of the human genome (74.7% for primary transcripts), exons of protein-coding genes cover only 2.94% of the genome (1). From an evolutionary point of view, genome size is closely related to coding potential in prokaryotes, which have haploid genomes composed primarily of protein-coding sequences (~88%). Conversely, in eukaryotes, there is no correlation between the number of protein-coding genes and organismal complexity. These observations are likely explained by the evolution of a more sophisticated architecture for controlling gene expression, which includes the expansion of non-coding regulatory RNAs (ncRNAs) (2). Thus, we should reassess the centrality of protein-coding RNAs in favor of non-coding ones.
Non-coding RNAs with fundamental cellular functions have been known since the discovery of the first transfer RNA (tRNA) (3) and also comprise ribosomal RNAs (rRNAs). Nonetheless, interest in non-coding RNAs with regulatory functions arose with the discovery of the first human micro-RNA, let-7 (4). To apply a theoretical framework to the transcriptome, regulatory ncRNAs are usually classified by size: "small" ncRNAs are less than 200 nucleotides in length, whereas "long" or "large" ncRNAs (lncRNAs) range from more than 200 to tens of thousands of nucleotides (Table 1).
Further complicating the picture, lncRNAs seem to be the preferred substrate for the generation of small RNAs (21). This maze of non-coding transcripts also emerged from a genome-wide identification of lncRNAs in mouse CD8+ T lymphocytes, in which 18 of the identified lncRNAs appeared to overlap annotated miRNAs and 21 appeared to overlap snoRNAs (37).
Table 1 (partial)
... | >200 | Gene expression regulation, regulation of cellular processes (33,34)
uaRNAs | 3′ UTR-derived RNAs | <1000 | Derived from within 3′ untranslated region (3′ UTR) sequences; function still not clearly understood (35)
circRNA | Circular RNA | 100 to >4000 | Diverse, from templates for viral replication to transcriptional regulators (36)

Both classes can be further classified according to their position relative to known genomic sequences, as in the case of promoter-associated RNAs (PASRs) or transcription initiation small RNAs (tiRNAs). In particular, long non-coding RNAs are usually classified relative to neighboring protein-coding genes. They are defined as "sense" if they are transcribed from the same strand as the protein-coding gene or "antisense" if they are transcribed from the opposite strand. They are "divergent" if their promoter and that of the coding transcript are in close proximity and arranged head to head. They are "exonic" or "intronic" if they overlap one or more exons or an intron of the protein-coding gene, respectively. Finally, they are "intergenic" (or "intervening"; lincRNAs) if they lie within a sequence between two protein-coding genes (38). In this review, we focus on this last category, which is probably the most studied because the location of these lncRNAs avoids complications arising from overlap with other genes. The majority of known lncRNAs are generated by the same transcriptional machinery as mRNAs. Transcribed lincRNA loci are therefore marked by RNA polymerase II occupancy and by histone modifications shared with active protein-coding genes, such as H3K4me3 at promoters and H3K36me3 within gene bodies (39). They are capped with methylguanosine at their 5′ end, spliced, and polyadenylated, although the widespread representation of this last property among known lncRNAs could be partly due to the RNA sequencing strategies used for their identification (15,40). Indeed, a broader analysis found that about 39% of lncRNAs have at least one of the six most common poly(A) motifs, compared to 51% of coding transcripts (1). These properties imply that few distinctive biochemical features allow lncRNAs to be distinguished from protein-coding mRNAs. Among them, lncRNAs have an unusual exon structure, with on average 2-5 exons. Intriguingly, lncRNAs are significantly more likely to overlap repetitive elements, particularly RNA-derived transposable elements (TEs). The latter account for about 30% of human lncRNA nucleotides, often in proximity to the transcriptional start site (TSS), which suggests that TEs could be important drivers of lncRNA evolution (see below).
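The size-based and position-based classifications described above can be illustrated with a minimal Python sketch. It is not the pipeline used in any of the cited studies. The `Transcript` record, the function names, the 0-based half-open coordinates, and the 1,000 bp `window` used to decide that two promoters are "in close proximity" are all assumptions introduced here for illustration. The sketch also compares a lncRNA with a single protein-coding gene, whereas a real annotation would have to check every neighboring gene before calling a transcript intergenic.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record; coordinates are 0-based, end-exclusive."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"
    exons: Tuple[Tuple[int, int], ...] = ()


def size_class(length_nt: int) -> str:
    # The text uses 200 nt as the boundary; treating exactly 200 nt
    # as "long" is an assumption of this sketch.
    return "small" if length_nt < 200 else "long"


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    return a_start < b_end and b_start < a_end


def tss(t: Transcript) -> int:
    # First transcribed base: leftmost on "+", rightmost on "-".
    return t.start if t.strand == "+" else t.end - 1


def is_divergent(lnc: Transcript, gene: Transcript, window: int) -> bool:
    # Head to head: opposite strands, nearby TSSs, and each transcript
    # running away from the other's TSS.
    if lnc.chrom != gene.chrom or lnc.strand == gene.strand:
        return False
    l, g = tss(lnc), tss(gene)
    if abs(l - g) > window:
        return False
    return l < g if gene.strand == "+" else l > g


def classify(lnc: Transcript, gene: Transcript, window: int = 1000) -> List[str]:
    """Label a lncRNA relative to ONE protein-coding gene."""
    labels: List[str] = []
    same_chrom = lnc.chrom == gene.chrom
    if same_chrom and overlaps(lnc.start, lnc.end, gene.start, gene.end):
        # Simplification: strand labels are assigned only on overlap.
        labels.append("sense" if lnc.strand == gene.strand else "antisense")
        hits_exon = any(overlaps(lnc.start, lnc.end, s, e) for s, e in gene.exons)
        labels.append("exonic" if hits_exon else "intronic")
    else:
        labels.append("intergenic")
    if is_divergent(lnc, gene, window):
        labels.append("divergent")
    return labels
```

For example, with a plus-strand gene spanning 1000-5000, a minus-strand transcript lying entirely upstream whose TSS falls within the window would be labeled both intergenic and divergent, which shows that these categories are not mutually exclusive under such simple rules.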
Nonetheless, the main difference between lncRNAs and protein-coding genes lies, by definition, in their coding potential: lncRNAs do not possess open reading frames (ORFs), as evaluated by the conservation of ORF codons (41), ORF length, the presence of known protein domains, in vitro translation (42,43), and ribosome footprinting assays (44). However, these conceptual constraints are highly artificial: short, non-canonical peptides have been found to arise from small ORFs within ncRNAs (45-48); lncRNA genes can also code for proteins and have a double function (49); and ultimately, coding potential does not exclude a function as RNA, even for known mRNAs (50). Evolution further blurs the boundaries between coding and non-coding genes, as ncRNAs can evolve by pseudogenization. This event can follow disruption of the ancestral ORF, but not of the untranslated regulatory regions (UTRs), in duplicates of protein-coding genes (50), or it can occur without duplication, through the co-option of ancestral genes for different, non-coding functions (51). This was the case for the long-known Xist RNA, involved in silencing the inactive X chromosome in eutherians. In particular, two exons of the protein-coding gene Lnx3 are homologous to Xist. This gene retained protein-coding capacity at least in the common ancestor of marsupials and placentals. Conversely, the Xist A-repeat implicated in X-silencing is not conserved. This sequence likely arose from the insertion of a TE recruited to form a proto-Xist gene (52,53). Therefore, the differences in dosage compensation among marsupials, eutherians, and monotremes can be ascribed to the presence of a Xist-independent X chromosome inactivation (XCI) in the mammalian ancestor and to the peculiar evolution of the proto-Xist gene by pseudogenization in the eutherian ancestor. Intriguingly, other lncRNAs involved in X-inactivation are likewise examples of pseudogenization (54).
The boundary between coding and non-coding is even less defined when ncRNAs arise from the joining of coding and non-coding exons through alternative splicing (55,56), from untranslated regions of mRNAs (57,58), or from the opposite strand of an overlapping protein-coding gene (59). Strikingly, more than half of the protein-coding genes in mammals have a complementary non-coding transcript (60). These findings further challenge our "linear" model of the genome, prompting a re-evaluation of current dogma and gene definitions. Genomic regions are indeed far more complex than previously thought: genes can be used for different purposes, and different functional elements can co-locate, intermingling coding and non-coding regions.
Interest in lncRNAs has been growing rapidly, and their expression has been quantified in many different tissues and cell types by high-throughput sequencing (RNA-seq). These efforts produced catalogs with little overlap, so that the number of known lncRNAs is still growing, in contrast with the number of known protein-coding genes, which has been remarkably stable over the years. Indeed, lncRNAs are far more cell-specific than mRNAs, and they are generally expressed at lower levels but more dynamically across differentiation stages. For this reason, the immune system is an excellent context in which to deepen our knowledge of lncRNAs. While many excellent reviews cover recent advances in understanding the role of these molecules within the innate branch (61,62), little is still known about their importance for the human adaptive immune system. Effector lymphocytes are highly specialized cells that arise from common progenitors through differentiation processes that are still not completely understood. Moreover, lymphocytes can be purified by cell sorting from the blood of healthy donors, and in vitro differentiation protocols are available; together, these provide the ideal setting for the identification of lncRNAs expressed in the human immune system and for their functional characterization.
Indeed, the growing interest in lncRNAs and the lack of knowledge of their expression patterns in the human immune system prompted us to perform RNA-seq analysis on 13 human primary lymphocyte subsets purified by FACS sorting from healthy donors (CD4+ naïve, TH1, TH2, TH17, Treg, TCM, and TEM; CD8+ naïve, TCM, and TEM; B naïve, B memory, and B CD25+) and to develop a bioinformatics pipeline for lincRNA identification.
Through this analysis, we identified long intergenic ncRNA genes expressed in these subsets and confirmed that the cell specificity of lincRNAs is higher than that of protein-coding genes, even when lincRNA genes are compared with genes encoding membrane receptors, which are generally regarded as the most accurate markers for defining lymphocyte subsets. In addition, a major outcome of this analysis is the identification, through de novo transcriptome reconstruction, of 563 novel, previously unannotated long intergenic ncRNA genes, increasing by ~12% the number of lincRNAs known to be expressed in human lymphocytes (63). Intriguingly, a fraction of lincRNAs specific for B cells and a fraction of "pan-T" lincRNAs also exist (63). It would be extremely interesting to study these lincRNAs during lymphocyte development in order to understand their likely distinctive role in the development of thymic or bone-marrow-derived cells.
These observations imply that the little overlap between available catalogs is a direct consequence of lncRNA specificity and that this limitation can be overcome only by assessing lncRNA expression in each highly purified cell type at different developmental stages, instead of considering tissues as a whole. Moreover, because of their specificity of expression, human lymphocyte lincRNAs that are not yet annotated in public resources would not have been identified without de novo transcriptome reconstruction. As mentioned before, such tissue specificity has been linked to the enrichment of TEs in proximity to lincRNA TSSs (64,65). Finally, RNA-seq experiments performed on an in vitro differentiation time course of human CD4+ naïve T cells suggest that lincRNA-specific expression in human lymphocyte subsets is acquired during their activation-driven differentiation from naïve to memory cells (63).
The functional flexibility of long non-coding regulatory RNAs derives from their intrinsic propensity to fold into thermodynamically stable secondary and higher-order structures that function as interaction modules (87). Each module can fold independently of the others, forming bonds through the Watson-Crick, Hoogsteen, and ribose faces (88,89). These RNAs can rapidly shift between diverse stable structural conformations, allowing allosteric transitions that can act as switches in response to environmental stimuli. They are also processed faster than mRNAs, given that they need not be translated, which allows a rapid response to signals. LncRNAs can also be regulated via more than a hundred different nucleotide modifications, as in the case of tRNAs, rRNAs, and snoRNAs (90-92), that modulate their function and probably their structure. RNAs can generate multiple modules within their structure, allowing interaction with multiple players, the reception of multiple stimuli, and the generation of multiple outputs. The required pairing is likely extremely flexible, as in the case of micro-RNAs, and tolerates mismatches, bulges, and wobble pairs (93). Many of these interaction modules derive from repetitive elements, such as transposons, which took advantage of the fewer constraints that lncRNA sequences have compared to protein-coding genes (1,94). Indeed, the rate of sequence evolution of lncRNAs is higher than that of protein-coding genes, even though these transcripts also exhibit evolutionary signatures of functionality. They evolve under modest but detectable selective pressure, accumulating fewer substitutions than neutrally evolving sequences (95,96). Conservation of relatively small units of lncRNA sequence (estimated to be less than 5%) could likely be sufficient to preserve their function, considering their already mentioned modular structure (97). This could be the reason why existing bioinformatic approaches fail to detect low-level and scattered selective constraint within these loci (97).

Figure. LincRNAs described in the immune system (1-5). (1) Modulation of cell growth and apoptosis mediated by GAS5, which inhibits the binding of glucocorticoid receptors to their DNA responsive elements; (2) Jα recombination guided by the PARL TEA; (3) Tmevpg1 recruits WDR5 to induce IFNγ expression; (4) Nron modulates the import-export of NFAT to the nucleus; (5) IFNα1-AS acts as a competing endogenous RNA, releasing IFNα from micro-RNA inhibition. Other mechanisms described for lincRNAs outside the immune system are shown in light red.

Frontiers in Immunology | T Cell Biology
Through such a plastic and versatile structure, lncRNAs can exert their functions by binding to proteins, other RNAs (98), and probably also DNA, although there is still little evidence for the existence of RNA:DNA triplexes (99,100). In particular, lncRNAs can act as scaffolds, bridging different molecules in a coordinated hub, as in the case of NEAT1, a highly abundant lncRNA that controls the sequestration of proteins involved in the formation of paraspeckles, nuclear domains associated with mRNA retention and pathologically enriched in influenza and herpes virus infections (101,102). LncRNAs can also act as guides, recruiting proteins to specific loci: this has been hypothesized for the recombination events that generate genetic diversity in developing lymphocytes, such as class switch (CS) and V(D)J recombination, which seem to be mediated by sense and antisense transcripts that dictate the locations of combinatorial events (103-105). Furthermore, lncRNAs can act as control devices or riboswitches in response to extracellular stimuli. For example, they can act as decoys, precluding pre-existing interactions, as in the case of GAS5 RNA, which detaches the glucocorticoid receptor (GR) from its responsive elements under conditions of growth arrest (106,107). Nonetheless, the regulatory potential of lncRNAs has been better characterized in the context of the epigenetic regulation of transcription, which ultimately defines the cell transcriptome.

THE ROLE OF LONG NON-CODING RNAs IN EPIGENETICS
Histone and DNA modifications, together with the three-dimensional conformation of chromosomes within the nucleus, define, at least in part, the epigenetic landscape of the cell. This extremely dynamic context modulates gene expression and dictates the final transcriptional output in response to environmental stimuli. By definition, these modifications are then propagated through cell divisions. This process is important at every moment of a cell's life, but particularly during differentiation. Indeed, every cell within our body harbors the same genome, but each cell acquires a particular phenotype according to intrinsic and extrinsic cues that ultimately define its epigenome and therefore its fate during differentiation. Epigenetics also defines the extent to which this fate is irreversible or plastic (108-110).
As mentioned before, human lymphocytes are an interesting model system for understanding the basis of cell fate specification and plasticity. Indeed, although the broad range of effector lymphocytes has traditionally been described as composed of distinct lineages, it has become increasingly clear that these cells also have notable features of plasticity. Differentiation of naïve cells into specific helper subsets requires the integration of extrinsic cues that converge into cell-intrinsic changes in the epigenetic landscape of the genome (111,112). Interest within the field has focused on the regulation of the prototypical cytokine genes of each subset, such as the Ifng gene for TH1 or Il-4 for TH2 CD4+ lymphocytes. Much work has been done in both cases to define the complex genetic structure of these loci and the cis-regulatory elements bound by TFs and chromatin modifiers that promote or repress their transcription (113-115). The importance of establishing epigenetic memory at these fundamental loci was also underlined by treatment with DNA methylation inhibitors (116,117) or histone deacetylase inhibitors (118-120) and by deletion of DNA methyltransferase (121-123), which caused, respectively, constitutive production of IFN-γ, enhanced production of both TH1 and TH2 prototypic cytokines, and inability to activate the proper pattern of expressed cytokines. The same is true for deletion of components of the trithorax group (TrxG) or the polycomb repressive complex (PRC), which dictate active or repressive epigenetic marks at loci that are fundamental for proper T-helper cell differentiation, such as Il-4, Il-5, Il-13, and Gata3 (124-129). The pattern of chromatin marks is conventional for signature cytokines: active marks are present at prototypical cytokines, whereas repressive marks restrain the expression of antagonistic molecules.
However, master regulators and other TFs, usually considered definers of lineage-specific identity, are characterized by bivalent poised domains, in which both active and repressive chromatin marks are present (130,131). This epigenetic histone status is also typical of promoters in embryonic stem cells, where it poises the expression of key developmental genes, thus allowing their timely activation in the presence of differentiation signals while precluding expression in their absence (132). Indeed, while the expression of master TFs is quite rapid, cell divisions are required for cytokine loci to become accessible or, conversely, repressed. In particular, GATA3 and T-bet/STAT proteins initiate the epigenetic changes at the IFN-γ and IL-4 loci that follow the initial activation of naïve T cells and differentiation toward the TH1 and TH2 cell fates (133,134). These observations imply that T-helper cells harbor both clear-cut and plastic epigenetic marks. Nonetheless, we must consider that even cytokine genes that are clearly defined epigenetically can be expressed or repressed in unexpected contexts, as reported for TH1 cells converted into IL-4-producing cells during strong TH2-polarizing helminth infections (135) or for stable TH1/TH2 hybrid cells derived after parasite infections (136). Therefore, other players must be involved in defining the degree of plasticity of lymphocytes in response to these ever-changing environmental conditions during differentiation.
Long non-coding regulatory RNAs have been linked to the epigenetic control of gene expression since the first studies of the already mentioned Xist transcript, involved in X chromosome inactivation in eutherians. Many other lncRNAs have been associated with chromatin or DNA modifiers and even TFs, through specific mechanistic studies or high-throughput screenings (82,137-139). This interplay can be observed across a broad range of eukaryotic organisms, suggesting that the epigenetic role of lncRNAs is conserved, even if their sequence conservation is often limited (as described previously). It seems that lncRNAs could act as scaffolds, physically associating with proteins that modify chromatin, either activating or repressing gene expression. Thanks to the already discussed structural properties of RNA, lncRNAs could organize multiple players in spatially and temporally concerted actions (138). Moreover, thanks to their ability to base pair with other nucleic acids, they could recruit these modifiers to specific loci, thereby conferring specificity of action on them (98). How such specificity is achieved has been an unsolved issue, given that chromatin modifiers do not possess an intrinsic bias toward consensus sequences, at least in mammals, whereas in Drosophila these "docking sites" are well defined (140,141). Interestingly, while many of these enzymes lack DNA-binding properties, they instead possess RNA-binding motifs (142-144).

www.frontiersin.org
The majority of reported lncRNAs are involved in the repression of gene transcription, in particular by interacting with polycomb group (PcG) proteins. The first examples of a direct interaction with PRC2 are the already mentioned Xist (145) and Kcnq1ot1, which is expressed only from the paternal chromosome in mammals and is involved in the silencing of 8-10 protein-coding genes (146). In both cases, lncRNAs are strictly required for the enrichment of PRC2-associated proteins and for the trimethylation of lysine 27 of histone H3 at specific loci. Furthermore, other lncRNAs such as NEAT2 and TUG1 promote the relocation of growth-control genes to foci of PcG proteins (called PcG bodies), therefore likely facilitating the concerted repression or activation of the transcription units in response to mitogenic signals (147). Many other protein complexes have been found to interact with lncRNAs; the majority target histones, as either methylases or demethylases, but others are involved in DNA methylation (148). Indeed, lncRNAs can bind proteins that are part of the TrxG (68), which antagonize PcG-mediated silencing (149). Interestingly, an antisense lncRNA has recently been implicated in recruiting a regulator of DNA demethylation to a specific promoter (150). DNA demethylation remains largely unknown as a process and has only recently been associated with active enzymatic reactions, via the TET family of methylcytosine dioxygenases (151,152). Even in this case, one of the unsolved questions has been how locus specificity can be achieved, particularly because DNA demethylation is often restricted to a few dinucleotides at the TSS. The precise mechanism through which lncRNAs could direct DNA or chromatin modification, though, has never been described. Indeed, all reported examples describe only correlations between lncRNA-modifier associations and the loss of modification after lncRNA gene silencing.
Long non-coding regulatory RNAs are thought to confer binding specificity on modifiers and to recruit them either in cis or in trans. In the first case, lncRNAs could act directly on the sites where they are synthesized, without needing to leave the DNA. The current hypothesis suggests that the 5′ region of the nascent transcript could bind proteins while the 3′ end is transcriptionally lagging, being still tethered to chromatin by RNA polymerase (153). This model is particularly intriguing because, through this mechanism, lncRNAs could exert an allele-specific effect, as in the well-studied case of Xist. Regulation in trans is instead achieved when lncRNAs modulate genes across great distances or even on different chromosomes (154). Regarding this dichotomy, we must underline once again its artificiality. Indeed, chromosomes fold into complex, three-dimensional territories together with specialized subnuclear bodies. Proteins that are part of the transcriptional or splicing machinery, and regulators of these processes, group at these foci (155,156). These structures are not static; on the contrary, large-scale chromosomal repositioning is observed in response to environmental stimuli or during differentiation (157,158). Subnuclear movements are of key importance in regulating events such as the transcription and rearrangement that occur at immunoglobulin loci during B lymphocyte development (159). The dynamic folding of the genome into higher-order structures encompasses loci belonging to the same chromosome, even hundreds of kilobases apart, or to different ones, bringing together regions that are distant if we consider the genome as linear. In this context, it is therefore extremely difficult to discern which regulations occur in cis and which in trans, especially when they involve long-distance interactions. Intriguingly, lincRNAs have been found to regulate the formation of subnuclear structures, as in the case of NEAT1, which is required for paraspeckle nucleation (101).
LncRNAs can also directly affect the three-dimensional organization of chromosomes by enhancing the function of proteins involved in loop formation, such as the insulator protein CTCF (160). There are also many examples of lncRNAs involved in three-dimensional local chromatin looping that brings the ncRNA gene together with the region that it regulates within the same chromosome (68,161). Recently, a lincRNA called Firre has been shown to recruit specific gene loci located on different chromosomes, acting as a docking station for organizing trans-chromosomal associations. Consistently, genetic deletion of Firre leads to a loss of proximity of several trans-interactions (162). A peculiar type of lncRNA transcribed from enhancer regions (eRNAs) has also been described. Classic enhancer elements therefore likely act through the transcription of these lncRNAs, which upregulate expression at promoters via the recruitment of the Mediator complex (163,164). Finally, there is increasing evidence that even promoters could be transcribed (165), producing lncRNAs probably involved in the enhancer-promoter loop that was hypothesized years ago but never fully resolved (166).

LONG NON-CODING RNAs IN THE ADAPTIVE IMMUNE SYSTEM
The adaptive immune system is an extraordinary context for studying the role of lincRNAs in differentiation. Indeed, upon antigen stimulation, naïve CD4+ T cells differentiate into distinct T-helper subsets that were traditionally considered lineages and defined by a prototypic set of expressed cytokines and master TFs. Recently, this relatively simple scenario, although useful, has been called into question. CD4+ T cells have been shown to exhibit substantial plasticity, and it has become increasingly clear that they can change their pattern of cytokines and TFs according to the milieu they encounter during their life (167,168). Moreover, in some cases, they can express other cytokines and TFs together with their prototypical set. The best examples include IL-10, once thought to specifically identify TH2 cells and now known to be produced also by TH1, Treg, and TH17 cells (169), and IFN-γ, the classic TH1 cytokine, which is frequently released by TH17 cells simultaneously with IL-17 (170,171). Regarding master TFs, Tregs can express Foxp3 (their prototypical TF), but also RORγt (the TH17 TF) and Runx3 (172-174); similarly, TFH cells can differentiate from FOXP3-positive cells while also expressing Bcl6 (their specific TF) (175,176). In this context, lncRNAs have a fundamental role in governing flexibility and plasticity or the maintenance of cell identity, together with lineage-specific TFs and other ncRNAs. In particular, what is emerging from the literature is that ncRNAs typically act as fine-tuners of fate choices, and this seems to be true not only in the immune system. Nonetheless, in the case of CD4+ T-cell subsets that are specified but not fully determined, subtle changes in extrinsic signals can reverberate through responsive ncRNAs, inducing changes that alter cell phenotype (38,177,178).
Usually, the stability of lineage identity is achieved through the implementation and inheritance of epigenetic modifications, but, as mentioned before, lncRNAs can act directly on histone and DNA modifiers, redefining this context.

Conversely, lncRNAs can also buffer this situation in other conditions, acting as maintainers of cell identity. In the cellular system, lncRNAs can be regarded as minor nodes in a huge interconnected network (179), as they usually interact with few other players. This condition allows them to be more flexible and sensitive to variations without disrupting the integrity of the whole network (180). This is true both over very short periods, as cells can easily and rapidly adapt to their environment, and over long evolutionary periods, as lncRNAs are among the fastest-evolving sequences in the genome (95,181-183). Conversely, master transcription regulators can be considered highly connected hubs, which confer robustness on the network. Indeed, very few protein-coding genes have been lost from worms to humans, and mutations are most often pathological (184,185). Several single-case or genome-wide studies on lncRNAs in the murine adaptive immune system or in cell lines are now available in the literature, whereas only a few studies have been conducted so far in the human context. The studies that have unveiled the function and mechanism of a specific lncRNA are so few that they can be counted on one hand (Table 3).
The importance of studies in the human immune system is underlined by the fact that the differences between experimental animal models and humans in terms of immunologic responses are still a subject of debate (199-201). Moreover, there is increasing evidence that ncRNAs are poorly conserved between animal models and humans (202,203). In particular, lncRNAs are fast-evolving elements, as demonstrated by the fact that over 80% of human lncRNAs arose in the primate lineage, only 3% are conserved across tetrapods, and most mammalian lncRNAs lack known orthologs outside vertebrates (97). Even between mouse and human, lncRNAs are poorly conserved (204-206). Despite their rapid evolution, lncRNAs are under stronger selection than neutral sequences, and in particular than intergenic regions, but under significantly weaker selection than mRNAs (96,97,207). It must be underlined that the reported conservation rate could be overestimated: substitution rates are derived from whole-genome alignments and rest on the assumption that a segment of homology belongs to the same RNA class in both genomes, but this is not necessarily the case. Indeed, in another genomic context a specific lncRNA gene segment could be transcribed and processed as part of a protein-coding RNA (208). A striking example is Hotair, which is involved in the regulation of the highly conserved cluster of Hox genes (68). The human lincRNA is conserved in the mouse genome (209); nonetheless, only the 3′ region is effectively part of the murine homolog (183). The importance of studying lincRNAs specifically within the human immune system derives from these considerations, but this field is still poorly investigated.

Table 3 | Studies on lncRNAs in the adaptive immune system.
GATA3-AS1: Specifically expressed in TH2 cells (198)
The majority of the studies focused on the innate immune system (210-212) or analyzed pathological situations, such as cancer-related lncRNAs (192,213) or responses to specific infections (102,214-216), mostly in mice. The first functional study that focused on the adaptive immune system, and in particular on TH1 and TH2 lymphocytes, involved a lincRNA, Tmevpg1, that is selectively expressed in TH1 cells via STAT4 and T-bet, both in mouse and in human. It participates in the induction of IFN-γ expression strictly in response to the TH1 differentiation program and not in other cellular contexts. These results highlight once again the complexity of the gene expression regulatory network and the specificity of action of lincRNAs (196). Another paper described a lincRNA, GATA3-AS1, specifically expressed in primary TH2 cells and hypothesized its co-regulation with GATA3 (198). GAS5, expressed in human T lymphocytes, is degraded in optimal growth conditions, but under starvation it accumulates and contributes to growth arrest (107). In this situation, it competes with the DNA-binding sequences of the glucocorticoid receptor (GR), suppressing GR-mediated transcription (106). Broader studies have been performed on the CD8+ T-cell transcriptome (37), and recently on CD4+ T lymphocytes (189), but still in mouse models. In B cells, chromatin remodeling associated with V(D)J recombination has been potentially linked to widespread antisense intergenic transcription that occurs in the variable (V) region of the immunoglobulin heavy chain (Igh) locus (104,105). So far, no published study has performed a deep transcriptomic analysis on human primary lymphocytes from healthy donors to identify lncRNAs fundamental for differentiation processes. These few examples are just clues to the importance that lincRNAs could have for the proper function of the human immune system as well, and they prompt a deeper analysis of their role in this particularly intriguing context.

LONG NON-CODING RNAs AS EPIGENETIC MODULATORS IN LYMPHOCYTE DIFFERENTIATION
Traditionally, the secretion of IFN-γ and TNF-α characterizes TH1 lymphocytes, whereas IL-4, IL-5, and IL-13 are considered the prototypic cytokines secreted by TH2 cells. According to this classic paradigm, these differences underlie the different functions exerted by these lymphocytes: TH1 cells are considered important for eliminating intracellular bacteria and viruses, whereas TH2 cells are important for resisting parasitic infections (217). The availability of solid in vitro differentiation protocols allowed a deep understanding of the genetic mechanisms governing these cells. Since the discovery of this dichotomy, other cell subsets have been identified, but the TH1/TH2 paradigm was undoubtedly useful. Therefore, it is no coincidence that, among the few lncRNAs identified in the immune system, many of those functionally characterized have been described in these two cell subsets. Nevertheless, as mentioned before, just one lincRNA, Tmevpg1 (also known as NeST or IFNG-AS1), has been characterized in depth. Tmevpg1 is located proximal to the IFN-γ gene in both mice and humans, is antisense and convergently transcribed with respect to the neighboring gene, and plays a role in chromatin remodeling. This transcript is a TH1-specific lincRNA: it requires STAT4 and T-bet to be transcribed and is also bound by CTCF and cohesin during lineage-specific induction (196). Therefore, Tmevpg1 is directly dependent on the activation of a TH1-polarizing transcriptional program, in which the presence of IL-12 leads to the activation of the JAK/STAT pathway via STAT4 (and STAT1), which induces the expression of T-bet. Interestingly, the Tmevpg1 gene harbors sequences regulated by histone acetylation and DNase I hypersensitive sites found in TH1 but not TH2 cells (218,219). Tmevpg1, in turn, plays a direct part in defining the proper TH1 cytokine expression pattern, influencing Ifng transcription in the presence of T-bet (196) via WDR5 binding and H3K4 trimethylation, as shown in mouse models (197).
Given the increasing number of lncRNAs described in different cellular contexts and the high number of specific lincRNAs expressed in the different lymphocyte subsets identified with the aforementioned RNA-seq analysis, many more lincRNAs with a relevant function in the human immune system will likely be characterized in the future. A major limitation of studies on lncRNAs, though, is that little is known about the biochemical or molecular function of lncRNA genes. Unlike classical protein-coding genes, hints on their function cannot be gained simply by analyzing their primary sequence, and the application of computational methods to infer lncRNA function is still in its infancy. As lincRNAs have been reported to influence the expression of neighboring genes (25,26,28,39), one possible approach to investigate their putative function is to focus on lymphocyte lincRNAs proximal to protein-coding genes involved in key cell functions.
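The proximity-based selection described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the studies cited here: the feature names, coordinates, and the 200 kb window are hypothetical values chosen only for illustration, and a real analysis would start from annotated transcript coordinates and expression data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval (hypothetical toy annotation)."""
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def gap(a: Feature, b: Feature) -> int:
    """Distance in bp between two intervals on the same chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def proximal_pairs(lincs, genes, window=200_000):
    """Return (lincRNA, gene, distance) triples within `window` bp, closest first.

    The 200 kb default is an arbitrary assumption, not a value from the text.
    """
    pairs = []
    for linc in lincs:
        for gene in genes:
            if linc.chrom != gene.chrom:
                continue  # only same-chromosome neighbors are considered
            distance = gap(linc, gene)
            if distance <= window:
                pairs.append((linc.name, gene.name, distance))
    return sorted(pairs, key=lambda p: p[2])


# Hypothetical toy data: none of these coordinates are real.
lincs = [
    Feature("linc_A", "chr1", 1_000, 2_000),
    Feature("linc_B", "chr1", 900_000, 901_000),
    Feature("linc_C", "chr2", 1_000, 2_000),
]
genes = [Feature("GENE_X", "chr1", 142_000, 150_000)]

hits = proximal_pairs(lincs, genes)
print(hits)
```

Candidates selected in this way would still need to be filtered by subset-specific expression and validated experimentally; genomic proximity alone only suggests, and does not establish, a regulatory relationship.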
Through this approach, we identified a TH1-specific lincRNA located ~140 kb upstream of MAF, which was therefore called linc-MAF-4. MAF is a TF involved in TH2 differentiation and required for the efficient secretion of IL-4 by TH17 and the proper development of TFH cells (220-222). Intriguingly, the expression of linc-MAF-4 is negatively correlated with the expression of MAF: linc-MAF-4 expression is high and specific in TH1 lymphocytes, where MAF is lowly expressed, whereas in TH2 cells the expression of linc-MAF-4 is extremely low and MAF is highly expressed. Consistently, linc-MAF-4 knock-down in naïve CD4+ T cells increased the expression of MAF and, interestingly, induced a more general skewing of the whole transcriptomic profile of these cells toward a TH2-like fate (63). The regulation exerted by linc-MAF-4 on the MAF gene was analyzed in more detail, and this lincRNA proved to modulate MAF expression in cis, as hypothesized from the expression analysis. Linc-MAF-4 exerts this regulation by exploiting a chromatin loop that brings its genomic region close to the promoter of the MAF gene. Indeed, the chromatin organization of this region allows the linc-MAF-4 transcript to recruit chromatin remodelers that inhibit MAF transcription. In particular, linc-MAF-4 was found to associate with EZH2, the key enzymatic subunit of the PRC2 complex, and with LSD1. These proteins methylate H3K27 and demethylate H3K4, respectively: two histone modifications associated with transcriptional repression (63). A similar mechanism was described for other lincRNAs, such as HOTAIR and MEG3 (154,223), but never before for lncRNAs expressed in the adaptive immune system (Figure 2).
Changes in lincRNA expression during naïve to memory CD8+ T-cell differentiation (37) and during the differentiation of naïve CD4+ T cells into distinct helper T-cell lineages (189) have been described in the mouse immune system. To our knowledge, though, linc-MAF-4 is the first example of a lincRNA playing a role in the proper differentiation of human TH1 cells, suggesting that, besides cytokines and TFs, lncRNAs take part in the TH1 differentiation program, as already shown in many other cell types. At this point, an obvious question arises: to what extent are these cells plastic? These findings provide evidence that it is possible to redirect the differentiation path of naïve CD4+ T cells by acting on their lincRNA content. Nevertheless, could it be possible to modulate already committed cells? We would expect that neither the mere downregulation of a lincRNA nor a lincRNA knock-out would be sufficient: as mentioned before, lincRNAs are minor nodes in a huge interconnected network composed of feedback mechanisms and epigenetic marks that act to stabilize a pre-existing differentiation status. However, a modulation of lincRNA content may be sufficient to make these cells more responsive to environmental cues that could overcome stabilizing forces, inducing a sort of trans-differentiation event. Functional characterization of other lncRNAs is required to address this crucial issue and to assess the extent of their contribution to cell differentiation and to the maintenance of cell identity in human lymphocytes. Based on what we have discussed so far about lncRNA functions and cell specificity, we believe that future studies will show how these molecules could be exploited as new molecular targets for the development of novel and highly specific therapies for diseases such as autoimmunity, immunodeficiencies, allergy, and cancer.