- 1Department of Life Science, University of Bath, Bath, United Kingdom
- 2National Heart and Lung Institute, Imperial College London, London, United Kingdom
Background: Innate immunity involves the detection and removal of pathogens and is an ancient physiological response observed across all living organisms. Long non-coding RNAs (lncRNAs) are a novel family of RNA transcripts that regulate the innate immune response. Although the identification of functional lncRNAs has been hindered by their poor evolutionary sequence conservation, it has been speculated that syntenic conservation (position relative to protein coding genes) might provide an alternative approach. To examine this hypothesis, we have produced a catalogue containing the LPS-induced transcriptional changes across 27 vertebrate species, which was employed to identify syntenically conserved and functional LPS-induced human lncRNAs.
Methods: Transcriptomics was employed to compare differential lncRNA expression, as well as mRNA expression, between LPS-stimulated human cells and 26 vertebrate species, including 7 primates, 10 mammals, 4 birds and 5 fish. The function of manually annotated syntenically conserved human lncRNAs was examined in LPS-stimulated monocytic THP-1 cells using antisense mediated knockdown.
Results: Sequencing data from 9 LPS-stimulated human cell studies was analysed to produce a high confidence catalogue of 1036 mRNAs and 71 lncRNAs that were differentially expressed in 4 or more cell types. Examination of the mRNAs involved in the LPS signaling pathway showed evolutionary conservation across human, primates, mammals, birds and fish, one notable exception being the absence of the LPS sensing complex (MD2/TLR4/CD14) in fish. To overcome poor sequence conservation amongst lncRNAs, we employed syntenic position to show that many of the 71 lncRNAs identified in humans were conserved and induced across vertebrates, the most common being MITA1 (IL7AS) and LINC02541. Knockdown studies showed that MITA1 (IL7AS) and LINC02541 negatively regulate the human LPS-induced inflammatory response in monocytic THP-1 cells. Significantly, we have identified regions/domains of MITA1 (IL7AS) and LINC02541 that are conserved across vertebrates.
Conclusion: This report provides a comprehensive catalogue of the transcriptional response following activation of the LPS-induced innate immune response across 9 human cell types and 26 vertebrate species. We show that many of the LPS-induced human lncRNAs are syntenically conserved and induced across vertebrate species and that this can predict a functional role in the human innate immune response.
Introduction
Life is traditionally divided into 5 kingdoms comprised of the animals (vertebrates and non-vertebrates), plants, fungi, protista (single cell eukaryotic) and monera (bacteria and archaebacteria). The innate immune response involves the detection and removal of pathogens (i.e. non-self) and is an ancient physiological response observed in all living organisms, that is thought to have emerged at the same time as eukaryotic cells approximately 1 billion years ago (1, 2). In animals, this innate immune response is commonly mediated by specialised phagocytic cells that recognise conserved molecular structures entitled pathogen-associated molecular patterns (PAMPs) upon bacteria, viruses and fungi, through binding to pathogen recognition receptors (PPR). This recognition results in phagocytosis and induction of an inflammatory response including the release of effector molecules such as oxygen and nitrogen species, as well as widespread changes in gene expression (3). In contrast, the adaptive immune response in the form of B- and T-cell mediated IgG memory (jawed) and VLR-based memory (jawless) is only observed in vertebrate species and is thought to have emerged during the Cambrian period approximately 500 million years ago (mya) (1, 2).
Long non-coding RNAs (lncRNAs) are a novel family of RNA transcripts longer than 200 nucleotides in length and have low coding potential (4–6). Based upon their relative position to protein coding genes, lncRNAs are currently divided into three families including i) long intergenic lncRNAs (lincRNAs), which are located between protein coding genes, ii) divergent transcript (DT) lncRNAs, which share a common promoter region with a protein coding gene but are transcribed from the opposite DNA strand and, iii) antisense (AS) lncRNAs, that are transcribed across a protein coding gene but on the opposite DNA strand. LncRNAs are known to regulate a host of physiological and pathological responses, including the innate immune response (7–9). However, the identification of functional lncRNAs, as well as their mechanism of action, has been hindered by their poor evolutionary sequence conservation. This contrasts with protein coding genes where sequence conservation has permitted the identification of orthologous (and functionally comparable) genes across species. To overcome this issue, one approach has been to identify conserved lncRNAs at the transcriptional (RNA) levels based upon their syntenic position (position relative to protein coding genes) (10, 11). Using this approach, Hezroni et al. (10), analysed tissues from 17 vertebrate species and identified 1000’s of conserved lncRNAs whilst Necsulea et al. (11) identified ~ 400 lncRNAs based upon the analysis of 11 tetrapod species. This approach has also been employed to identify individual lncRNAs that show syntenic conservation between human and mouse including those that regulate the innate immune response such as lnc13 (12), IL7AS (13), lnc-ulcerative colitis (Lnc-UC) (14), lncRNA-GM (15), gastric adenocarcinoma predictive linc (GAPLINC) (16), ncRNA inducing MHC-I and immunogenicity of tumor (LIMIT) (17) and LncRNA-ISIR (18). Significantly, Schmerer et al. have just undertaken a comprehensive analysis examining the pathway dependencies, conservation, functions and protein interaction of lncRNAs involved in the activation of innate immunity in human macrophages. Specifically, the report identifies 4 lncRNAs (ROCKI, LINC00158, LINC01215 and AC022816.1) that regulate the innate immune response and show conservation across primates and mammals (19).
Activation of Toll like receptor (TLR)-4 by lipopolysaccharide (LPS), a membrane component of gram-negative bacteria, was one of the first PAMP-PRR interactions to be characterised (20). TLR4 is a member of the interleukin-1 and toll-like receptors (TIR) superfamily that forms a complex with myeloid differentiation factor 2 (MD2) and CD14, which is essential for LPS binding. Engagement of members of the TIR family stimulates a common intracellular signaling pathway culminating in the activation of transcription factors such as nuclear factor-κB (NF-κB) and activator protein-1 (AP-1) and the subsequent production of pro-inflammatory mediators (3).
Significantly, activation using LPS has become a standard approach for examining the innate immune response in vertebrates, including the differential expression of mRNAs and lncRNAs. Building on this observation, we have assembled a unique resource containing transcriptomics data from all published data on LPS-stimulated human cells, as well as LPS-stimulated cells and tissues from 26 vertebrate species (7 primates, 10 mammals, 4 birds and 5 fish). Using this catalogue, initial analysis identified 1036 mRNAs and 73 lncRNAs that were differentially expressed in 4 or more human cell types. Using these datasets, subsequent comparative transcriptomics has then been employed to characterise differential mRNA and lncRNA expression across vertebrate species and identify which LPS-induced human lncRNAs are syntenically conserved. Crucially, this analysis has shown the two most syntenically conserved and differentially expressed human lncRNAs, MITA1 (IL7-AS) and LINC02541, are negative regulators of the LPS-induced innate immune response.
Methods
Sourcing and indexing of the reference genomes
The fasta sequence (species.toplevel.fa.gz) of the human reference genome hg38 (human) or the relevant vertebrate genomes were sourced from Ensembl (https://ftp.ensembl.org/pub/current_fasta/). When not available from Ensembl these were sourced from either UCSC genome browser (https://hgdownload.soe.ucsc.edu/downloads.html) or NCBI (https://www.ncbi.nlm.nih.gov/datasets/genome/).
FASTA files from the relevant species were then indexed with hisat2 (21) (version 2.2.1) using the following command:
hisat2-build -p 16 <species.fa> genome
Sourcing and alignment of RNA sequencing data
Bulk RNA sequencing data was obtained by searching the sequence read archive (SRA) using the terms lipopolysaccharide, LPS or bacteria. Data was downloaded from the SRA using fastq dump with the following command lines:
Paired-end data: fastq-dump -I –split-files <SRR file name>
Single end data: fastq-dump -I <SRR file name>
Dependent upon the format of the sequencing data (Figures 1A, 2), this was aligned to the relevant indexed reference genome using Hisat2 (21) (version 2.2.1) using the following command line options: Paired-end, stranded data: hisat2 –rna-strandness FR –dta -x <pathway to indexed reference genome> -1 <reverse_strand_sequence_data.fa> -2 <forward_strand_sequence_data.fa> -S <output.sam> Paired-end, non-stranded data: hisat2 –dta -x <pathway to indexed reference genome> -1 <first_strand_sequence_data.fa> -2 <forward_strand_sequence_data.fa> -S <output.sam> Single-end, stranded data: hisat2 -q –dta –rna-strandness F -x <pathway to indexed reference genome> -U <sequence_data.fa> -S <output.sam> Single-end, non-stranded data: hisat2 –dta -x <pathway to indexed reference genome> -U <sequence_data.fa> -S <output.sam>
Figure 1. Comparative transcriptomics analysis of the LPS-induced responses in human cells. Sequencing data from 17 human cell types (A) was analysed using the protocol outlined (B) to identify differentially expressed mRNAs and lncRNAs in response to LPS-stimulation. Those datasets that showed differential expression of > 500 mRNAs and > 50 lncRNAs were deemed to have produced a robust response to LPS and employed in subsequent analysis. (C) shows the distribution of alignments expressed as fragments per kilobase per million mapped read (FPKM) across human and vertebrate datasets and with common housekeeping genes. Secreted proteins (D), transcription factors (E) and enzymes (F) showing the largest absolute FPKM change following exposure to LPS. Data in C is a box and whisper plot showing the mean +/- SEM and the max/min value whilst data in D, E, F are the mean +/- SEM across the 9 LPS-responsive cells and all have a q < 0.005 when comparing control versus LPS stimulation (multiple paired Wilcoxon test).
Figure 2. Summary of the source and analysis of LPS-induced response in 26 vertebrate species. Sequencing data from 26 vertebrate species was analysed using the protocol outlined in Figure 1A with the listed reference genomes and RNA databases. Unless otherwise stated in brackets, these were obtained from the Ensembl database. Table also includes the data source, period of exposure to LPS, RNA isolation technique and addition sequencing information.
In the case of Sumatran Orangutan, White turfed Ear Marmoset and Mouse Lemur, which did not have matching paired data, sequencing data was aligned using the following command: hisat2 –no-discordant –no-mixed –rna-strandness FR –dta <pathway to indexed reference genome> -1 <first_strand_sequence_data.fa> -2 <forward_strand_sequence_data.fa> -S <output.sam>
Output SAM files were sorted, converted to BAM files and indexed in Samtools (22) using the following command lines:
samtools sort -@ 8 –o <output.bam> <output.sam>
samtools index <output.bam>
Gene quantification (counts/gene) and determination of differential gene expression
Gene expression (counts/gene) and differential gene expression was assessed using FeatureCounts (23) and Deseq2 (24), respectively. Alignment files (.bam) were initially converted to counts/gene in FeatureCounts using the following command line: featureCounts -p -a <RNA database.gtf> -g gene_id -o <featureCounts_output_table.txt> <Control_*.bam> <LPS_*.bam>
where Control_*.bam and LPS_*.bam are all the alignment files for a specific dataset.
The <featureCounts_output_table.txt> is then used as input for the Deseq2 Rscript programme (download from curl http://data.biostarhandbook.com/books/rnaseq/code/deseq2.r) using the following command:
cat featureCounts_output_table.txt | Rscript deseq2.r AxB > results_deseq2.csv
where A and B are the n numbers for the control and LPS repeats.
Determination of normalised gene expression
Normalised gene expression from each alignment file (.bam), which accounts for both gene length and the total number of reads, is given as FPKM (fragments per kiobase of transcript per million reads) and was determined using Stringtie2 (25) using the following command line: stringtie <output.bam> -G <RNA database.gtf> -e –A <gene_quantification_table.txt>
Ab initio assembly and the identification of syntenic lncRNAs
The alignment files for LPS samples were merged and indexed using Bamtools (26) and Samtools (22) to produce a single file for each experiment using the following command line:
bamtools merge -in <LPS_1.bam> -in <LPS_X.bam> -out <Non_sorted_merged.bam> samtools sort -@ 4 -o <Merged.bam> <Non_sorted_merged.bam>
bamtools index -in <Merged.bam>
We then performed ad intio transcript assembly with Stringtie2 (25) using the following command line:
Stringtie <Merged.bam> -o <Merged.gtf>
Normalised gene expression, gene quantification (counts/gene) and differential gene expression was then determined using the <Merged.gtf>, as described previously.
Identification of species orthologs of human genes
We extracted the human orthologous genes for a particular species from Ensembl 112. Where there were multiple orthologs against a specific human gene, these were sorted to produce one ortholog using the following criteria:
%id. query gene identical to target [Species] gene (high to low)
Gene-order conservation score (high to low)
Orthology confidence [0 low, 1 high]
P adjusted FPKM (low to high)
Absolute expression FPKM (high to low)
Multiple sequence alignment and identification of conserved domains
Strand specific sequences for syntenic version of MITA1 (IL7AS) (Supplementary Data 1) and LINC02541 (Supplementary Data 2) were extracted from the relevant reference genomes (Figure 2) using the relevant STRG code (Supplementary Table S40). These FASTA file were then submitted for multi-sequence alignment using multiple alignment using fourier transform (MAFFT - www.ebi.ac.uk/jdispatcher/msa/mafft?stype=dna) or identification of domains (up to 30 nucleotides) using the XSTREME in meme suite (https://meme-suite.org/meme/), with comparison against MITA1 (IL7AS) and LINC02541.
Culture and LPS-stimulation of human monocytic THP-1 cells
Monocytic THP-1 cells were cultured in growth media (RPMI supplemented with 10% (v:v) FCS, 2 mM L-glutamine, 100 U/ml penicillin, 100 μg/ml streptomycin and 50 nM of 2-Mercaptoethanol (all GIBCO, Life Technologies) and incubated in a 37°C, 5% (v:v) CO2 humidified incubator. Cells were stimulated for 4h with/without 1 μg/ml LPS (E. coli 055:B5, Sigma Aldrich). Total RNA was extracted using the RNeasy kit (Qiagen), which included an on-column DNase treatment (Qiagen), according to the manufacturer’s guidelines. RNA concentration was determined using the Qubit 2.0 (Life Technologies). RNA quality was measured using the Agilent Bioanalyser and produced RIN values of >8.0. Samples were dispatched to BGI Genomics (Hong Kong, China) where they were processed using the Illumina TruSeq (polyA) library kits prior to 100 nucleotide paired-end, stranded sequencing. Data is deposited in the ENA under PRJEB100692 and has been included as part of the analysis described in Figure 1.
Investigation of lncRNA function in human monocytic THP-1 cells
Human monocytic THP-1 cells were transfected with antisense locked nucleic acid (LNA) GapmeRs against IL7AS at a final concentration of 30 nM in 100 μl of serum- and antibiotic-free RPMI-1640 medium, supplemented with 5μl of HiPerFect (Qiagen). The LNA GapmeRs mix was added to each well of a 24-well plate and THP1 cells (5x105 cells per well) in 100 μl of their growth complete medium were added on top of the LNA GapmeRs mix. Cells were then incubated for 16h. The following day cells were diluted with 400 μl of complete medium and stimulated for 4h with/without 1 μg/ml LPS (E. coli 055:B5, Sigma Aldrich). Total RNA was extracted using the RNeasy kit (Qiagen), which included an on-column DNase treatment (Qiagen), according to the manufacturer’s guidelines. RNA concentration was determined using the Qubit 2.0 (Life Technologies). RNA quality was measured using the Agilent Bioanalyser and produced RIN values of >8.0. Samples were dispatched to BGI Genomics (Hong Kong, China) where they underwent polyA isolation prior to 100 nucleotide paired-end, stranded sequencing. Data is deposited in the European Nucleotide Archive (ENA) under PRJEB101967.
Human monocyte-derived macrophages
Monocyte derived macrophages (MDMs) were generated from monocytes isolated from peripheral blood mononuclear cells using a Percoll gradient and adherence technique, followed by culture for 12 days in RPMI supplemented with 10% v/v fetal calf serum 2 mM L-glutamine and 100 µg/ml penicillin/streptomycin with either 2 ng·mL−1 GM-CSF (cat. no. 7954-GM-059/CF, Bio-Techne, Abingdon, UK) to generate G-Mφ or 100 ng·mL−1 M-CSF (cat. no. 216-MC-500, Bio-Techne) to generate M-Mφ. MDMs were then stimulated for 4h with/without 1 μg/ml LPS (E. coli 055:B5, Sigma Aldrich). Total RNA was extracted using the RNeasy kit (Qiagen), which included an on-column DNase treatment (Qiagen), according to the manufacturer’s guidelines. RNA concentration was determined using the Qubit 2.0 (Life Technologies). RNA quality was measured using the Agilent Bioanalyser and produced RIN values of >8.0. Samples were dispatched to BGI Genomics (Hong Kong, China) where they underwent Ribozero isolation prior to 100 nucleotide paired-end, stranded sequencing. These studies were performed under ethic approval number 09/H0801/85 from the National Research Ethics Service (NRES), South West London REC1. Data is deposited in the European Nucleotide Archive (ENA) under PRJEB101891 and has been included as part of the analysis described in Figure 1.
Availability of data and materials
All sequencing data is available through NCBI sequence read archive (SRA) or European Nucleotide Archive (ENA) using the accession numbers listed in the Figures or Methods.
All Supplementary Data can be obtained at FigShare at the following DOI: 10.6084/m9.figshare.30868874.
Results
Catalogue of differential mRNA and lncRNA expression following LPS-induced activation of human innate immune response
To produce a comprehensive catalogue of high confidence, differentially expressed mRNAs and lncRNAs following activation of the human innate immune response, we identified bulk RNA sequencing data from 17 human primary cells or cells lines, following stimulation with LPS for up to 24h (Figure 1A). Differentially gene expression including mRNAs and lncRNAs, was assessed with the GENCODE database (v47), based on a p-adjusted value < 0.05 and an absolute change of > 1 FPKM. Of relevance, Gencode v47 includes updated annotations based on long read data which has resulted in a substantial increase in annotated lncRNAs from 20310 (v46) to 34915 (v47), and a smaller increase in mRNAs from 19411 (v46) to 20092 (v47) (27).
Comparison of alignment statistics across these datasets (mean ± SD) demonstrated that whilst overall alignment was high (95 ± 3%), there was large variation in the aligned reads (496 ± 327M), counts per sample (34 ± 25M) and % alignment to known genes (51 ± 17). In contrast, normalisation of the data using Fragments per Kilobase per Million mapped fragment (FPKM) produce a much smaller variation across samples (543 ± 88M) and indicated that this metric might be employed to compare gene expression across datasets. In light of this observation, we examined the FPKM of 39 previously identified housekeeping genes (28–31) and showed that ubiquitin C (UBC) (SD = 24%), importin 8 (IPO8) (SD = 37%), pumilio RNA binding family member 1 (PUM1) (SD = 35%) and SNW domain containing 1 (SNW1) (SD = 35%) demonstrated the least variation in expression across human datasets (Figure 1C) (Supplementary Table S10).
To ensure that we analysed cells that gave a robust LPS response, we excluded those datasets with < 500 differentially expressed mRNAs and < 50 differentially expressed lncRNAs, leaving 9 datasets for downstream analysis (Figure 1A, Supplementary Data 10). Interestingly, cell responses to LPS did not appear to be linked to the expression levels of TLR4, CD14 or LY96, the components of the complex responsible for binding LPS (Figure 1A).
Compilation of data across these 9 cell types showed differential expression of 7112 mRNAs in a least one cell type (Supplementary Data 10). To reduce this number for subsequent comparison across vertebrate species, we focused on those genes that were differentially expressed in 4 or more cell types. This identified 1036 differentially expressed mRNAs, of which the majority were up-regulated (746 mRNAs). (Supplementary Table S11). As might be expected, KEGG analysis of the up-regulated genes detected multiple pathways associated with immunity/inflammation, the most significant being TNF-signaling (5.4 x 10-21), NOD-like signaling (6.9 x 10-17) and NF-κB signaling (8.8 x 10-17) but also necroptosis (2.4 x 10-6) and apoptosis (2.3 x 10-4). This indicates that LPS induces both an inflammatory and apoptotic response. No pathways were identified for the down-regulated genes. To classify these up-regulated genes, we cross-referenced our gene list with data from the human protein atlas (https://www.proteinatlas.org/) (32) and identified 72 secreted proteins, 127 transcription factors, 229 enzymes, 67 transporters, 19 G-protein coupled receptors and 4 voltage-gated ion channels (Supplementary Table S11). Examination of the mRNAs showing the largest absolute change (FPKM) indicated that IL1β and CXCL8 are highest amongst secreted proteins (Figure 1D), which also includes additional cytokines (IL1a, IL6 and IL23A) and chemokines (CXCL1/2/10/11). Amongst the transcription factors (Figure 1E), there was upregulation of mRNAs coding for NF-κB (NFKBIA, NFKB1, NFKB2, NFKBIZ, BCL3), STAT (STAT4, STAT5A and STAT2) and AP-1 (JUNB, ATF5, ATF3) signaling complexes. The most expressed enzymes included those involved in metabolism (SAT1, ACOD1, NAMPT, ACSL1, IDO1), as well as the well-known inflammatory enzyme, PTGS2 (COX2) which catalyses the production of prostaglandins (Figure 1F).
In the case of lncRNAs, we observed 979 lncRNAs that were differentially expressed in a least one cell type. As with mRNAs, we focused upon lncRNAs that were differentially expressed in 4 or more cells, which identified 96. These lncRNAs were manually annotated to remove any that might represent mRNAs and classified as either lincRNAs (located between protein coding genes), antisense (transcribed across a protein coding gene but on the opposite DNA strand) or divergent transcripts (which share a common promoter region with a protein coding gene but are transcribed from the opposite DNA strand). This analysis ultimately identified 73 lncRNAs that were differentially expressed in response to LPS (67 up-regulated and 6 down-regulated) including 32 lincRNAs, 18 antisense (AS) and 23 divergent transcripts (DT) (Supplementary Table S12). Amongst the lincRNAs were 5 miRNA host genes (MIR155HG, MIR3945, MIR3142, MIR122HG, MIR9-1HG), that code for miR-155, miR-3945, miR-146a, miR-222 and miR-9-1, as well as 2 small nucleolar RNA host genes (SHNG12 and SHNG15).
Catalogue of differentially expressed mRNAs and lncRNAs following LPS-induced activation of innate immune response across vertebrates
To examine differential expression of mRNAs and lncRNAs across vertebrates, we identified data from 26 species including 7 primates [chimpanzee (33), orangutan (34), macaque (33), baboon (33), sooty mangabey (35), marmoset (34) and mouse lemur (34)], 10 mammals [mouse (36), rat (37), mole rat (38), rabbit (39), cow (37), water buffalo (37), horse (37), sheep (39), goat (37) and pig (37)], 4 birds [chicken (40), quail (41), duck (42) and zebra finch (43)] and 5 fish [salmon (44), cod (45), trout (46), sole (47) and zebrafish (48)] (Figure 2). The differential expression of mRNAs and lncRNAs across these vertebrate species (Supplementary Tables S13-S38) was analysed using gene annotations provided by Ensembl, following alignment to the relevant reference genomes (Supplementary Tables S13-S38). Exceptions to this approach included mouse data, which was annotated using Gencode M31 and water buffalo data, which were annotated using data from NCBI Refseq (Figure 2).
In contrast to the human data which were obtained from isolated cells, the vertebrate datasets was derived from a variety of sources including bone marrow-derived macrophages (BMDM) in mammals and chickens, as well as whole blood in primates and birds. Other differences included the duration of LPS exposure, which varied from 2 to 24 h, as well as the RNA isolation techniques employed. As with humans, comparison across these datasets demonstrated good overall alignment (92 ± 4%) but a large variation in the total aligned reads (802 ± 670M), reads per sample (44 ± 36M) and % alignment to known genes (50 ± 14%). In contrast to human cells, FPKM demonstrated wide variation (75 ± 53M) although this appeared to be caused by data derived from a single publication that employed whole blood with rRNA and globulin depletion. In the absence of these samples, FPKM variation was 60 ± 14M (Figure 1C). Once again these observations suggested that FPKM could be employed to compare gene expression across samples. Examination of housekeeping genes identified in human cells, showed that IPO8 and PUM1 (but not SNW1 and UBC) were also conserved across vertebrates (Figure 1C).
Figure 3A shows the evolutionary branching of the vertebrate species employed in this report including (jawed) fish (emerged ~420 mya), birds (emerged ~160 mya), mammals (emerged ~225 mya), primates (~85 mya) and humans (~0.3 mya). From Figure 3D, it can be seen that the mean number of mRNAs in primates, mammals, birds and fish are 21147, 21865, 16493 and 23585 (when factoring in the whole genome duplication in salmon and trout), respectively. This is comparable to the ~20,000 in human (GENCODE v47). Likely due to variation in availability/depth of sequencing data and annotation, the numbers of lncRNAs varies from 538 (Sooty Mangabey) to 15166 (horse), which are all considerably lower than the 34915 annotated in humans (GENCODE v47) (Figure 3H). Examination of the LPS-induced transcriptional response across vertebrates (Figure 3E) showed considerable differences in the number of differentially expressed mRNAs which ranged from 36 mRNAs in rabbits to 6852 mRNAs in mice. We observed similar variability in the number of differentially expressed lncRNAs that ranged from 0 to 301 (mouse), compared with the 607 in human (Figure 3I). In the case of orangutans and mouse lemurs, we had insufficient replicates (i.e. n=1) to be able to perform statistical analysis. In the subsequent evaluation, data were grouped into primates, mammals, birds and fish.
Figure 3. LPS induced differential expression of mRNAs and lncRNAs across 26 vertebrate species. (A) shows the evolutionary branching of the vertebrates species examined in this report, (B) shows the % of mRNAs in the stated species that have human orthologs, (C) shows the % conservation in the species mRNAs with the human mRNA orthologs, (D) shows the number of annotated mRNAs in the indicated species, (E) shows the number of differentially expressed mRNAs following LPS stimulation, (F) shows the % of mRNAs orthologs in the indicated species that match the 1016 mRNAs that are differentially expressed in humans following LPS stimulation, (G) shows the % of mRNAs in the indicated species that are differentially expression (column E) and that overlap with the 1016 mRNAs that are differentially expressed in human following LPS stimulation, (H) shows the number of annotated lncRNAs in the indicated species and (I) shown the number of differentially expressed lncRNAs following LPS stimulation.
LPS signaling pathways were conserved across vertebrates
We next investigated whether the LPS signaling pathway is conserved across vertebrates. As shown in Figure 4, primates and mammals express all the proteins (shown in yellow) seen in humans; many of which are upregulated in response to LPS, particularly IRAK2 and members of the NF-κB transcription complex. Similarly, despite the emergence of birds and jawed fish approximately 180 and 420 million years ago, these vertebrates still express the majority of the proteins in the LPS signaling pathway. The notable exception is that fish lack the components of the LPS receptor, including CD14, TLR4 and MD2, which means that they must detect and respond to LPS via an alternative pathway. Overall, this indicates that the LPS signaling pathway is highly conserved across vertebrates confirming that this pathway emerged early in animal evolution.
Figure 4. Comparison of the LPS signaling pathways in vertebrates. Transcriptomics data from LPS-stimulated cells and tissues of 26 vertebrates including 7 primates, 10 mammals, 4 birds and 5 fish were employed to examine the expression of the proteins involved in the LPS signaling pathway. Protein expressed at > 1 FPKM across > 50% of cells/tissues are shown in yellow whilst those proteins that were significantly differentially expression are indicated with a red arrow.
Comparison of the LPS induced mRNA response across vertebrates
To allow comparison with the human LPS response, we initially employed Ensembl to identify the human orthologs of the various vertebrate mRNAs (Supplementary Table S39). Reflecting their evolutionary divergence, the mean number of human mRNAs orthologs in primates, mammals, birds and fish were 82%, 75%, 75% and 39% (Figure 3B), with mean % sequence conservation across these orthologous mRNAs being 94%, 83%, 71% and 56%, respectively (Figure 3C).
Our previous analysis of the LPS-induced response in human cells identified 746 up-regulated and 290 down-regulated mRNAs (Supplementary Table S11). The data in Figure 3F and heatmap in Figure 5A shows the orthologs of these 1036 human mRNAs that are also differentially expressed across the various vertebrate species. Following exclusion of those datasets that did not give a robust response (< 250 differential expressed mRNAs), the mean number of human LPS-induced ortholog genes that were also seen in primates, mammals, birds and fish was 37 ± 15%, 39 ± 6%, 9 ± 5% and 8 ± 2%, respectively (Figure 3G). Interestingly, following the merger of the data within each group, it was shown that the majority of LPS induced mRNAs in humans are also seen primates (70%) and mammals (92%), although this number is dramatically reduced in birds (27%) and fish (23%) (Figure 5B). Using this data, it was shown that 70 mRNAs are differential expressed across humans, primates, mammals, birds and fish (Supplementary Table S11).
Figure 5. Comparison of the LPS induced transcription response in humans with that in vertebrates. Transcriptomics data from LPS-stimulated cells and tissues of 26 vertebrates including 7 primates, 10 mammals, 4 birds and 5 fish were employed to identify which orthologs match those observed in LPS-stimulated human cells (1036 mRNAs). (A) heatmap comparing individual species against human LPS-induced mRNAs, (B) heatmap comparing merged group data from primates, mammals, birds and fish against human LPS-induced mRNAs, (C) block diagram showing the most commonly expressed (conserved) secrete proteins, transcription factors and enzymes. A redline or block indicates that the specific ortholog is differentially expressed in the indicated species or group (p-adjusted < 0.05).
As with the human data (Figures 2D–F), we classified these orthologous mRNAs by function into secreted proteins, transcription factors and enzymes, and plotted those expressed in the highest number of vertebrates (Figure 5C). Among secreted proteins, the inflammatory mediator IL1β and transmembrane proteoglycan, syndecan 4 (SDC4), were induced in the largest number of vertebrates (16/24) and across both primates, mammals, birds and fish. Interleukin-10 (IL10), interleukin-1 receptor antagonist (IL1RN), penetraxin (PTX), TNF superfamily member 10 (TNFSF10), heparin binding EGF like growth factor (HBEGF) and Fas receptor (Fas) were also induced across all 4 vertebrate groups. Interestingly, a number of common inflammatory mediators such as IL1a, IL27, CXCL8 and IL12B were only induced in primates and mammals. As has already been highlighted, the most often induced transcription factors were members of the NF-κB complex (NFKB1, RELB, NFKB2, NFKBIZ, NFKBIE, NFKBIA, REL). In addition, we observed induction of IRF1 and STAT3 across all 4 groups. The most conserved and induced enzymes included kinases (RIPK2, MAP3K8, IRAK2, CDK17), and others that have been implicated in the regulation of inflammation including PTGS2 (COX2) and PDE4B. The most conserved enzyme was TNF alpha induced protein 3 (TNFAIP3) (16/24), RIPK (15/24) and PTGS2 (15/24), which along with RNF19B, ACOD1, BIRC2, CFLAR, PDE4B and CNP, were induced in both primates, mammals, birds and fish (Figure 5C).
Identification of syntenically conserved LPS-induced lncRNAs across vertebrates
We had previously identified 73 lncRNAs that were differentially expressed in 4 or more human cell types in response to LPS exposure (Figure 6A, Supplementary Table S12). To determine whether these LPS-induced lncRNAs are syntenically conserved across vertebrates, we undertook ab initio transcript assembly using the RNA sequencing data from the 26 vertebrate species and employed syntenic position to identify potential orthologs (Figure 6B, Supplementary Tables S13A-S38A). These ab initio transcript assemblies were also employed to determine differential expression and the log2 fold change (Figure 6B, Supplementary Table S40). Using expression in a least 2 species as a threshold, we showed that of the 73 lncRNAs identified in humans, 39 lncRNAs were expressed in primates, 32 lncRNAs in mammals, 5 lncRNAs in birds and a 4 lncRNAs in fish (Figure 6B, Supplementary Table S40). Based on significant differential expression we showed that 11 lncRNAs were induced in primates (MIR155-HG, MITA1, CD44-DT, LINC02605, LINC02541, SLC30A4-AS, MIR3142HG, MIR222HG, PIK3R5-DT, PARAIL, MIR9-1HG), 11 in mammals (MIR-155-HG, MITA1, LINC02605, LINC02541, SLC30A4-AS, LINC00222, CARINH, ENSG00000267737, ENSG00000270069, PIK3R5-DT, MIR9-1HG), 2 in birds (MITA1 and LINC02605) and none in fish (Figure 6B). Using expression alone, the most commonly expressed lncRNAs across the 25 species were MIR-155-HG (18/25), MITA1 (18/25), LINC02605 (18/25), LINC02541 (15/25) and LINC-PINT (16/25) (Figure 6B). Based on differential expression (Supplementary Tables S40), MITA1, LINC02605, LINC02541 and MIR-155-HG were shown to demonstrate the most consistent induction being significantly up-regulated in 17/19, 17/19, 7/15 and 8/18 species, respectively (Figure 6B). Other syntenic lncRNAs that were differentially expressed across multiple species included SLC30A4-AS (5/13), MIR3142-HG (4/10), ENSG00000267737 (4/10) and SIMALR (4/10). Of relevance, MIR155HG and MIR3142, which are the host genes for miR-155 and miR-146a, have previously been shown to be master regulators of innate and adaptive immune response (7, 9). This indicates that our analysis pipeline provides a robust approach to the identification of functionally relevant lncRNAs.
Figure 6. Expression and LPS-induced increases in lncRNAs across vertebrates. (A) Red boxes indicate which lncRNAs were significantly increased (+) or decreased (–) (p-adjusted < 0.05) in the relevant human cell type in response to LPS stimulation whilst the number in white gives the log2 fold change and, (B) Blue boxes indicated which vertebrate species demonstrated the presence of a syntenic lncRNA. The presence of a white number indicates a significant (p-adjusted < 0.05) increase (+) or decrease (-) in expression whilst the magnitude of the number is the log2 fold change. in the relevant vertebrate species.
MITA1 (IL7AS) and LINC02541 are syntenically conserved and differentially expressed across vertebrates
We have shown that MIR155HG, MITA1, LINC02605 and LINC02541 were syntenically conserved and differentially expressed in human, primate, mammals and/or birds in response to LPS stimulation. Since the action of MIR155HG is likely mediated through the production of miR-155, a well-known regulator of the innate immune response, we focused on MITA1, LINC02605 and LINC02541. Significantly, we have previously shown that MITA1 and LINC02605 are isoforms of a lncRNA located antisense to IL7, which we had previously named IL7-AS. Given this previous observation we shall forthwith refer to MITA1 and LINC02605, as MITA1 (IL7AS).
Ab initio transcript assembly using data obtained from LPS-stimulated human macrophages showed that these lncRNAs consisted of multiple transcripts. Thus, MITA1 (IL7AS) contained > 9 potential exons that could be assembled into 11 transcripts (Supplementary Figure S1A) whilst LINC02541 contained > 13 potential exons that could be assembled into 19 transcripts (Supplementary Figure S1B). To confirm the differential expression data, we visually examined the syntenic versions of these lncRNAs (STRG numbers) across primates (Figure 7), as well as mammals and birds (Figure 8) using the relevant alignment files within the IGV genome browser. This analysis confirmed LPS-induced MITA1 (IL7AS) expression in all primates, mammals and bird species although this was not significant in orangutans and rabbits. Given that fish do not express IL7, we were unable to determine MITA1 (IL7AS) expression using the data from salmon, cod, trout, sole and zebrafish (data not shown). To ascertain whether the increase in MITA1 (IL7AS) might be linked to that of IL7, we also assessed the expression of the latter (Figures 7, 8). However, this appeared unlikely since only 7 of 19 species demonstrated a significant increase in IL7 following exposure to LPS and, furthermore, the magnitude of the MITA (IL7AS) (FPKM: 17.5 +/- 5.4) response was significantly elevated (p = 0.004, paired t-test) compared with IL7 (FPKM: 0.5 +/- 2.0). Although not as prevalent, we observed significant induction in the syntenic versions of LINC02541 in primates (chimpanzee, rhesus, baboons) and mammals (mouse, rat, mole rat, cow, sheep) but not in birds or fish (Figures 7, 8).
Figure 7. LPS-induced MITA1 (IL7AS) and LINC02541 expression in primates. Following ab initio assembly using the RNA sequencing data from the indicated primates, the syntenic location was employed to identity and quantify the relevant orthologs for MITA1 (IL7AS) and LINC02541 (STRG number derived from Supplementary Tables S13A-S38A). Histograms of the control and LPS alignment data are displayed using the Integrative Genomics Viewer (IGV). Data on IL7 expression were derived from Supplementary Tables S13-S38. Expression analysis for IL7, MITA1 (IL7AS) and LINC002541 are based on FPKM and are mean +/- SEM of 3–10 experiments where *p < 0.05, **p < 0.01 and ***p < 0.001 (unpaired T-test).
Figure 8. LPS-induced MITA1 (IL7AS) and LINC02541 expression in mammals and birds. Following ab initio assembly using the RNA sequencing data from the indicated mammals and birds, the syntenic location was employed to identity and quantify the relevant orthologs for MITA1 (IL7AS) and LINC02541 (STRG number derived from Supplementary Tables S13A-S38A). Histograms of the control and LPS alignment data are displayed using the Integrative Genomics Viewer (IGV). Data on IL7 expression were derived from Supplementary Tables S13-S38. Expression analysis for IL7, MITA (IL7AS) and LINC002541 are based on FPKM and are mean +/- SEM of 3–10 experiments where *p < 0.05, **p < 0.01 and ***p < 0.001 (unpaired T-test).
In summary, we have shown that human LPS-induced MITA (IL7AS) and LINC02541 are syntenically conserved across vertebrate species and are also induced in response to LPS exposure. Subsequent studies investigated whether these LPS-responsive human lncRNAs contained conserved regions and examined their functional role in the innate immune response.
MITA1 (IL7AS) and LINC02541 have short regions that are conserved across vertebrate species
Having established that the syntenic version of MITA1 (IL7AS) and LINC02541 are induced in response to LPS stimulation across vertebrate species, we proceeded to examine whether there are conserved regions (Figure 9A) and domains/motifs (Figure 9B). Multiple sequence alignment of the vertebrate species against MITA1 (IL7AS) showed an average coverage (COV – the % of the vertebrate species that is covered by MITA1) of 66% and average percentage identify (PID – proportion of positions in the vertebrate species that has a pairwise alignment with MITA1 (IL7AS) that is the same) of 8%. This did not differ significantly between primates (COV - 55%; PID – 7%), mammals (COV - 71%: PID – 9%) and birds (COV - 69%; PID – 8%). For LINC02541, the average COV and PID across all vertebrates was 46% and 13%, respectively. There was some variation in the COV but not the PID which remained relatively consistent (COV - 66%; PID – 16%), mammals (COV - 38%: PID – 12%) and birds (COV - 69%; PID – 8%); although this variation is likely related to the smaller sample size.
Figure 9. Identification of conserved regions in MITA1 (IL7AS) and LINC02541 across vertebrate species. The sequences of syntenic versions of MITA1 (IL7AS) (Supplementary Data 1) and LINC02541 (Supplementary Data 2) were extracted from the relevant reference genome. These sequences were then compared against MITA1 (IL7AS) or LINC02541 by (A) multiple sequence alignment (MAFFT) to assess coverage (COV) and percentage identity (PID) or (B) detection of motifs/domains (MEME suite).
Overall, these results suggest that both MITA1 (IL7AS) and LINC02541 contain significant regions of conservation across vertebrates. To identify specific motifs/domains (up to 30 nucleotides), these sequences were submitted to the XSTREME option in MEME suite. This analysis identified 4 motifs/domains in MITA1 (IL7AS) and 13 motifs/domains in LINC02541, that were conserved across all the vertebrate species (Figure 9B).
MITA1 (IL7AS) and LINC02541 negatively regulate the LPS-induced inflammatory response in the human monocytic THP-1 cell line
To determine the potential function of MITA1 (IL7-AS) and LINC02541, we designed LNA-based antisense sequences targeting exon 1 in MITA1 (IL7AS) and exon 3 in LINC02541, which are shared across the various predicted transcripts (Supplementary Figure S1). These were transfected into human monocytic THP-1 cells and their action upon the LPS-induced inflammatory response at 4hr was examined by RNA sequencing (Figure 10). Compared with a scrambled negative control, the targeting of exon 1 in MITA1 (IL7AS) (Figure 10A) resulted in the down-regulation of 26 mRNAs (including MITA1 (IL7-AS)) and the up-regulation of 130 mRNAs (Figure 10A; Supplementary Table S43). Pathway analysis indicated that the up-regulated genes were linked to multiple inflammatory pathways including the NF-κB and Toll-like receptor signaling pathways (Figure 10A). No pathways were associated with the down-regulated genes. Knockdown of exon 3 in LINC02541 caused down-regulation of 915 mRNAs (including LINC02541) and up-regulation of 419 mRNAs, when compared with a scrambled control (Figure 10B; Supplementary Table S44). Once again, the down-regulated genes were not associated with any pathways whilst the up-regulated genes were linked with various inflammatory pathways including TNF and NF-κB signaling (Figure 7B). Overall, these studies indicate that MITA1 (IL7AS) and LINC02541 are negative regulators of the LPS-induced inflammatory response and support the contention that syntenic conservation can provide an approach for the identification of functional lncRNAs in human phenotypic responses.
Figure 10. Knockdown studies demonstrate a function for MITA (IL7AS) and LINC02541 expression in the LPS-induced inflammatory response in human THP-1 monocytic cells. Human monocytic THP-1 cells were transfected overnight with either an LNA-based antisense sequence against exon 1 in MITA1 (IL7AS) (MITA (IL7AS) Antisense), exon 3 in LINC02541 (LINC02541 Antisense) or a scrambled negative control (Negative Control Antisense), stimulated for 4hr with LPS and then the change in the profile of mRNA expression determined by RNA sequencing. The number of up- and down-regulated mRNAs was determined following comparison with scrambled negative control before pathway analysis using DAVID (https://davidbioinformatics.nih.gov). Data is the mean +/- SEM of 3–4 experiments where ***p < 0.001 (unpaired T-test).
Discussion
Comparative transcriptomics is a powerful technique for examining the differences in mRNA expression across species, as well as through evolutionary time. This approach is also central to investigating the role of long non-coding RNAs (lncRNAs), which can only be detected at the transcriptional level. However, the availability of data has commonly limited comparison between humans and other model organisms including mice and zebrafish. To address this limitation, we have mined all the transcriptomics data following LPS-induced activation of the innate immune response and created a unique resource containing data from 9 human cell types and 27 vertebrate species, including 7 primates, 10 mammals, 4 birds and 5 fish.
Initial comparison of the expression of genes involved in the LPS signaling pathway showed that this was highly conserved across vertebrates including primates (emerged ~70 mya), mammals (emerged ~120 mya), birds (emerged ~190 mya) and (jawed) fish (emerged ~420 mya), indicating that this has been present since the emergence of vertebrate (~480 mya). As previously reported, the only noticeable difference is the absence of the LPS sensing complex (TLR4/CD14/MD2) in fish (49), which appears to have evolved in vertebrates following their migration onto land (~ 480 mya). The observation that fish respond to LPS indicates that they must recognise this PAMP via an uncharacterised mechanism.
Having demonstrated conserved expression of the human LPS-signaling pathway across vertebrates, we proceeded to examine the LPS-induced transcription changes induced via this pathway. To this end, we compiled a high confidence list of LPS responsive mRNAs in humans by examining LPS-induced response in 9 different cell types. By selecting those that were differentially expressed in more than 4 cell types, we identified 1036 mRNAs that were employed for subsequent comparative transcriptomics. Based on orthology, we show evolutionary conservation between the LPS-induced mRNA changes in primates (82%), mammals (75%) and birds (75%), although this was dramatically reduced in fish (39%). Examination of individual genes that are induced across vertebrates identified many well characterised inflammatory modulators including IL-1β, CXCL8 and PTGS (COX2), as well as members of the NF-κB and AP1 transcription family. This also identified less characterised genes such as RNF19B, RIPK2, HBEGF, SDC4, NR4A3, PTX3 and BIRC3, that might warrant further investigation based on their evolutionary induction/conservation. Given the recent emergence of the importance of metabolism in the regulation of the immune response (50), it is also interesting to note the LPS-induction of multiple metabolic enzymes across vertebrates including SAT1, ACOD1, cytidine/uridine monophosphate kinase 2 and NAMPT.
Recent studies have highlighted the importance of lncRNAs in the regulation of the innate immune response (7–9). However, unlike protein-coding genes, their poor evolutionary conservation has precluded the search for lncRNAs based upon genomic DNA sequence. To address this challenge and identify functional lncRNAs that regulate human innate immunity, we have mined our datasets to identified conserved and differentially expression lncRNAs using transcriptional expression and syntenic position. This approach has previously been employed to compare tissue/cell specific lncRNA expression across vertebrates (10, 11). Our initial analysis in 9 human cell types identified 973 LPS-induced lncRNAs, of which 73 lncRNAs were differentially expressed in more than 4 cell types. Amongst this list of lncRNAs were the majority that had previously been shown to regulate the LPS-induced response in human cells including MITA (IL7AS) (8 cell types) (13, 51), LINC02605 (7 cell types) (51), ADORA2A-AS1 (52), IL1β-RBT46 (1 cell type) (53), PACER (1 cell type) (54), NEAT1 (3 cell types) (55) and SUGCT-AS1 (3 cell types) (56), as well as the 4 lncRNAs recently identified by Schmerer et al. (19) including ROCKI (1 cell type) (57), LINC00158 (6 cell types), LINC01215 (2 cell types) and AC022816.1 (6 cell types). Interestingly, there were a number that were expressed but not significantly up-regulated including THRIL (58), MALAT1 (59), HOTAIR (60), LINC00305 (61), GAPLINC (16), lnc-IL7R (62) and CYP1B1-AS1 (63). This could be explained by alternative methods of detection (i.e. microarrays), cell type specific expression and/or the fact that their function is mediated by constitutive expression levels. In subsequent studies, we manually examined the syntenic conservation across 26 vertebrate species of the 73 lncRNAs that were induced in 4 or more human cell types. Remarkably, we found these lncRNAs were also syntenically expressed in primates (39 lncRNAs), mammals (32 lncRNAs), birds (5 lncRNAs) and fish (4 lncRNAs). Importantly, of these, many were shown to be significantly differentially expressed in response to LPS including 11 in primates and mammals and 2 in birds, with none in fish.
To assess whether syntenic conservation can identify human lncRNAs that regulate the LPS-induced innate immune response, we focused upon the two human lncRNAs whose syntenic versions are most commonly induced across vertebrate, MITA1 (IL7AS) (primates, mammals and birds) and LINC02541 (primates and mammals).
MITA1 (IL7AS), whose syntenic versions were induced in 18 vertebrate species, is located upstream of IL7, a hematopoietic growth factor that is important for lymphocyte development and multiple other immune functions (64). MITA1 (metabolism-induced tumor activator 1) was named from studies in hepatocytes showing that its up-regulation is driven by energy stress and linked to hepatocellular carcinoma metastasis (65). Of relevance, this lncRNA was also identified in a screen for NF-κB dependent lncRNAs following TNFα- and LPS-stimulation of p65-/- and Ikkβ-/- mouse embryonic fibroblasts (MEF) and named mNAIL (mouse NF-κB Associated Immunoregulatory Long non-coding RNA) (66). In the current report, transcriptional profiling using RNAseq has shown that MITA1 (IL7AS) is a negative regulator of multiple inflammatory mediators including IL6, CCL2, CXCL8 and CXCL2 following LPS-stimulation of monocytic THP-1 cells. This is in agreement with our previous knockout studies examining the role of MITA1 (IL7AS) during IL6 release from monocytic THP-1 and mouse macrophage RAW264.7 cells (13). In contrast, studies in epithelial cells derived from the lung (A549) (13) and colon (SW480) (67) have shown that MITA (IL7AS) is a positive regulator of the release of IL6 and other inflammatory mediators. In the latter report, the regulation of CCL2, CCL5, CCL7 and IL6 expression by MITA (IL7AS) was mediated through an interaction with p300, that modulates the assembly of the SWI/SNF complex at their promoters (67). Significantly, mNAIL was also shown to positively regulate the release of inflammatory mediators in vivo using an LPS-induced model of endotoxemia, as well as a dextran sulfate sodium (DSS) induced model of colitis (66). These studies were performed using a mouse model mNAILΔNF-κB, in which the two NF-κB sites that drive mNAIL (e.g. MITA (IL7AS)) expression were removed. Mechanistic studies in the LPS-induced mouse model showed that mNAIL acts primarily in myeloid cells to promotes the release of inflammatory mediators through inhibition of Wip1 phosphatase, leading to activation of p38 MAP kinase and p65 transcriptional factors (66). Overall, these reports would indicate that the function and mechanism of action of MITA1 (IL7AS) is both cell- and/or species-dependent. This is agreement with previous observations that syntenic conservation does not always equate to functional and mechanistic conservation. Thus, FAST is a syntenically conserved lncRNA found in both mESCs and hESCs that differs in both its processing and localisation, that in turn impacts upon its function and mechanism of action (52).
In contrast to MITA1 (IL7AS), LINC02541, which was induced in 8 species, has not been previously implicated in the immune response. To ascertain the role of LINC02541, our knockdown studies showed that this lncRNA was also a negative regulator of the LPS-induced inflammatory response in human monocytic THP1 cells. To obtain additional information on LINC002541, we examined a number of lncRNA databases. In agreement with our transcriptional data, genomic data from LncBook 2.0 (68) identified homologs across 32 primate and mammalian species but not birds (chicken and zebrafinch) or fish (zebrafish). This database also identified SNPs that are associated with multiple carcinomas including haematopoietic and lymphoid carcinomas. Interestingly, a role in cancer is supported by publications showing that LINC02541 promotes EMT in clear renal cell carcinoma (52) and following hypoxia (69). Mechanistic studies demonstrated that hypoxia/HIF-1α-driven LINC02541 expression regulates core EMT regulators including TWIST1, SLUG and VEGF genes through an interaction with the YY1 complex and the promotion of histone 4 lysine 16 acetylation (H4K16Ac) marks (70, 71).
As with protein coding genes, it is currently believed that the actions of lncRNAs are mediated through domains that permits interactions with target proteins, RNA and/or DNA regions (4–6). However, the identification of these domains has been hindered by access to datasets on the lncRNA sequences across multiple species. Given that we have identified syntenic versions of MITA1 (IL7AS) in 18 species and LINC02541 in 15 species, we extracted these sequences and compared them with the human version. Significantly, we have able to identify conserved regions and domains across these lncRNAs and speculate that these might be important in mediating their negative regulation of the innate immune response.
In summary, the LPS-induced response is unique in the fact that we have transcriptional data spanning 27 vertebrate species ranging through humans, primates, mammals, birds and fish. Our annotation and compilation of this data will provide an important resource for the future analysis of individual mRNAs and lncRNAs both across vertebrate species and in the context of evolution. Specifically, we believe that the identification of syntenically conserved lncRNAs will aid in the identification of functional lncRNAs in the human innate immune response, as well as the elucidation of their mechanism of action.
Data availability statement
The datasets for this study can be found at the GEO SRA using the accession numbers listed in Figure 1 and 2.
Ethics statement
The studies involving humans were approved by National Research Ethics Service (NRES), South West London REC1. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Ethical approval was not required for the study involving animals in accordance with the local legislation and institutional requirements because the animal data was obtained from a public RNA data repository (NCBI - SRA).
Author contributions
MH: Data curation, Formal analysis, Writing – original draft, Writing – review & editing. RR: Writing – review & editing, Data curation, Resources. PF: Writing – review & editing, Resources. LD: Writing – review & editing, Supervision, Resources. ML: Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. MRH was supported by funding from the Biotechnology and Biological Sciences Research Council (BB/N015630/1).
Conflict of interest
Author MR was employed by the company Theramir Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: 10.6084/m9.figshare.30868874
References
1. Buchmann K. Evolution of innate immunity: clues from invertebrates via fish to mammals. Front Immunol. (2014) 5:459. doi: 10.3389/fimmu.2014.00459
2. Kimbrell DA and Beutler B. The evolution and genetics of innate immunity. Nat Rev Genet. (2001) 2:256–67. doi: 10.1038/35066006
3. Carpenter S and O’Neill LAJ. From periphery to center stage: 50 years of advancements in innate immunity. Cell. (2024) 187:2030–51. doi: 10.1016/j.cell.2024.03.036
4. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. (2011) 25:1915–27. doi: 10.1101/gad.17446611
5. Mattick JS, Amaral PP, Carninci P, Carpenter S, Chang HY, Chen LL, et al. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat Rev Mol Cell Biol. (2023) 24:430–47. doi: 10.1038/s41580-022-00566-8
6. Ulitsky I and Bartel DP. lincRNAs: genomics, evolution, and mechanisms. Cell. (2013) 154:26–46. doi: 10.1016/j.cell.2013.06.020
7. Hadjicharalambous MR and Lindsay MA. Long non-coding RNAs and the innate immune response. Noncoding RNA. (2019) 5. doi: 10.3390/ncrna5020034
8. Robinson EK, Covarrubias S, and Carpenter S. The how and why of lncRNA function: An innate immune perspective. Biochim Biophys Acta Gene Regul Mech. (2020) 1863:194419. doi: 10.1016/j.bbagrm.2019.194419
9. Peltier DC, Roberts A, and Reddy P. LNCing RNA to immunity. Trends Immunol. (2022) 43:478–95. doi: 10.1016/j.it.2022.04.002
10. Hezroni H, Koppstein D, Schwartz MG, Avrutin A, Bartel DP, and Ulitsky I. Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species. Cell Rep. (2015) 11:1110–22. doi: 10.1016/j.celrep.2015.04.023
11. Necsulea A, Soumillon M, Warnefors M, Liechti A, Daish T, Zeller U, et al. The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature. (2014) 505:635–40. doi: 10.1038/nature12943
12. Castellanos-Rubio A, Fernandez-Jimenez N, Kratchmarov R, Luo X, Bhagat G, Green PH, et al. A long noncoding RNA associated with susceptibility to celiac disease. Science. (2016) 352:91–5. doi: 10.1126/science.aad0467
13. Roux BT, Heward JA, Donnelly LE, Jones SW, and Lindsay MA. Catalog of differentially expressed long non-coding RNA following activation of human and mouse innate immune response. Front Immunol. (2017) 8:1038. doi: 10.3389/fimmu.2017.01038
14. Wang S, Lin Y, Li F, Qin Z, Zhou Z, Gao L, et al. An NF-κB-driven lncRNA orchestrates colitis and circadian clock. Sci Adv. (2020) 6. doi: 10.1126/sciadv.abb5202
15. Wang Y, Wang P, Zhang Y, Xu J, Li Z, Li Z, et al. Decreased Expression of the Host Long-Noncoding RNA-GM Facilitates Viral Escape by Inhibiting the Kinase activity TBK1 via S-glutathionylation. Immunity. (2020) 53:1168–81.e7. doi: 10.1016/j.immuni.2020.11.010
16. Vollmers AC, Covarrubias S, Kuang D, Shulkin A, Iwuagwu J, Katzman S, et al. A conserved long noncoding RNA, GAPLINC, modulates the immune response during endotoxic shock. Proc Natl Acad Sci U.S.A. (2021) 118. doi: 10.1073/pnas.2016648118
17. Li G, Kryczek I, Nam J, Li X, Li S, Li J, et al. LIMIT is an immunogenic lncRNA in cancer immunity and immunotherapy. Nat Cell Biol. (2021) 23:526–37. doi: 10.1038/s41556-021-00672-3
18. Xu J, Wang P, Li Z, Li Z, Han D, Wen M, et al. IRF3-binding lncRNA-ISIR strengthens interferon production in viral infection and autoinflammation. Cell Rep. (2021) 37:109926. doi: 10.1016/j.celrep.2021.109926
19. Schmerer N, Janga H, Aillaud M, Hoffmann J, Aznaourova M, Wende S, et al. A searchable atlas of pathogen-sensitive lncRNA networks in human macrophages. Nat Commun. (2025) 16:4733. doi: 10.1038/s41467-025-60084-x
20. Poltorak A, He X, Smirnova I, Liu MY, Van Huffel C, Du X, et al. Defective LPS signaling in C3H/HeJ and C57BL/10ScCr mice: mutations in Tlr4 gene. Science. (1998) 282:2085–8. doi: 10.1126/science.282.5396.2085
21. Pertea M, Kim D, Pertea GM, Leek JT, and Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc. (2016) 11:1650–67. doi: 10.1038/nprot.2016.095
22. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. (2009) 25:2078–9. doi: 10.1093/bioinformatics/btp352
23. Liao Y, Smyth GK, and Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. (2014) 30:923–30. doi: 10.1093/bioinformatics/btt656
24. Glover JM, Leeds JM, Mant TG, Amin D, Kisner DL, Zuckerman JE, et al. Phase I safety and pharmacokinetic profile of an intercellular adhesion molecule-1 antisense oligodeoxynucleotide (ISIS 2302). J Pharmacol Exp Ther. (1997) 282:1173–80. doi: 10.1016/S0022-3565(24)36939-3
25. Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, and Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. (2015) 33:290–5. doi: 10.1038/nbt.3122
26. Barnett DW, Garrison EK, Quinlan AR, Strömberg MP, and Marth GT. BamTools: a C++ API and toolkit for analyzing and managing BAM files. Bioinformatics. (2011) 27:1691–2. doi: 10.1093/bioinformatics/btr174
27. Mudge JM, Carbonell-Sala S, Diekhans M, Martinez JG, Hunt T, Jungreis I, et al. GENCODE 2025: reference gene annotation for human and mouse. Nucleic Acids Res. (2025) 53:D966–d75. doi: 10.1093/nar/gkae1078
28. Hounkpe BW, Chenou F, de Lima F, and De Paula EV. HRT Atlas v1.0 database: redefining human and mouse housekeeping genes and candidate reference transcripts by mining massive RNA-seq datasets. Nucleic Acids Res. (2021) 49:D947–d55. doi: 10.1093/nar/gkaa609
29. Chua SL, See Too WC, Khoo BY, and Few LL. UBC and YWHAZ as suitable reference genes for accurate normalisation of gene expression using MCF7, HCT116 and HepG2 cell lines. Cytotechnology. (2011) 63:645–54. doi: 10.1007/s10616-011-9383-4
30. Tóth O, Rácz GA, Oláh E, Tóth M, Szabó E, Várady G, et al. Identification of new reference genes with stable expression patterns for cell cycle experiments in human leukemia cell lines. Sci Rep. (2025) 15:1052. doi: 10.1038/s41598-024-84802-5
31. Rácz GA, Nagy N, Tóvári J, Apáti Á, and Vértessy BG. Identification of new reference genes with stable expression patterns for gene expression studies using human cancer and normal cell lines. Sci Rep. (2021) 11:19459. doi: 10.1038/s41598-021-98869-x
32. Sjöstedt E, Zhong W, Fagerberg L, Karlsson M, Mitsios N, Adori C, et al. An atlas of the protein-coding genes in the human, pig, and mouse brain. Science. (2020) 367. doi: 10.1126/science.aay5947
33. Hawash MBF, Sanz-Remón J, Grenier JC, Kohn J, Yotova V, Johnson Z, et al. Primate innate immune responses to bacterial and viral pathogens reveals an evolutionary trade-off between strength and specificity. Proc Natl Acad Sci U.S.A. (2021) 118. doi: 10.1073/pnas.2015855118
34. McMinds R, Jiang RHY, Adapa SR, Cornelius Ruhs E, Munds RA, Leiding JW, et al. Bacterial sepsis triggers stronger transcriptomic immune responses in larger primates. Proc Biol Sci. (2024) 291:20240535. doi: 10.1098/rspb.2024.0535
35. Palesch D, Bosinger SE, Tharp GK, Vanderford TH, Paiardini M, Chahroudi A, et al. Sooty mangabey genome sequence provides insight into AIDS resistance in a natural SIV host. Nature. (2018) 553:77–81. doi: 10.1038/nature25140
36. Wang L, Oh TG, Magida J, Estepa G, Obayomi SMB, Chong LW, et al. Bromodomain containing 9 (BRD9) regulates macrophage inflammatory responses by potentiating glucocorticoid receptor activity. Proc Natl Acad Sci U.S.A. (2021) 118. doi: 10.1073/pnas.2109517118
37. Young R, Bush SJ, Lefevre L, McCulloch MEB, Lisowski ZM, Muriuki C, et al. Species-specific transcriptional regulation of genes involved in nitric oxide production and arginine metabolism in macrophages. Immunohorizons. (2018) 2:27–37. doi: 10.4049/immunohorizons.1700073
38. Gorshkova EA, Gubernatorova EO, Dvorianinova EM, Yurakova TR, Marey MV, Averina OA, et al. Macrophages from naked mole-rat possess distinct immunometabolic signatures upon polarization. Front Immunol. (2023) 14:1172467. doi: 10.3389/fimmu.2023.1172467
39. Bush SJ, McCulloch MEB, Lisowski ZM, Muriuki C, Clark EL, Young R, et al. Species-specificity of transcriptional regulation and the response to lipopolysaccharide in mammalian macrophages. Front Cell Dev Biol. (2020) 8:661. doi: 10.3389/fcell.2020.00661
40. Bush SJ, Freem L, MacCallum AJ, O’Dell J, Wu C, Afrasiabi C, et al. Combination of novel and public RNA-seq datasets to generate an mRNA expression atlas for the domestic chicken. BMC Genomics. (2018) 19:594. doi: 10.1186/s12864-018-4972-7
41. Gormally BMG and Lopes PC. The effect of infection risk on female blood transcriptomics. Gen Comp Endocrinol. (2023) 330:114139. doi: 10.1016/j.ygcen.2022.114139
42. Jax E, Müller I, Börno S, Borlinghaus H, Eriksson G, Fricke E, et al. Health monitoring in birds using bio-loggers and whole blood transcriptomics. Sci Rep. (2021) 11:10815. doi: 10.1038/s41598-021-90212-8
43. Scalf CS, Chariker JH, Rouchka EC, and Ashley NT. Transcriptomic analysis of immune response to bacterial lipopolysaccharide in zebra finch (Taeniopygia guttata). BMC Genomics. (2019) 20:647. doi: 10.1186/s12864-019-6016-3
44. Alterman JF, Hall LM, Coles AH, Hassler MR, Didiot MC, Chase K, et al. Hydrophobically Modified siRNAs Silence Huntingtin mRNA in Primary Neurons and Mouse Brain. Mol Ther Nucleic Acids. (2015) 4:e266. doi: 10.1038/mtna.2015.38
45. Jin X, Morro B, Tørresen OK, Moiche V, Solbakken MH, Jakobsen KS, et al. Innovation in nucleotide-binding oligomerization-like receptor and toll-like receptor sensing drives the major histocompatibility complex-II free Atlantic cod immune system. Front Immunol. (2020) 11:609456. doi: 10.3389/fimmu.2020.609456
46. Fuchylo U, Alharbi HA, Alcaraz AJ, Jones PD, Giesy JP, Hecker M, et al. Inflammation of gill epithelia in fish causes increased permeation of petrogenic polar organic chemicals via disruption of tight junctions. Environ Sci Technol. (2022) 56:1820–9. doi: 10.1021/acs.est.1c05839
47. Gong Z, Guo C, Wang J, Chen S, and Hu G. Establishment and identification of a skin cell line from Chinese tongue sole (Cynoglossus semilaevis) and analysis of the changes in its transcriptome upon LPS stimulation. Fish Shellfish Immunol. (2023) 142:109119. doi: 10.1016/j.fsi.2023.109119
48. Zhang Q, Kopp M, Babiak I, and Fernandes JMO. Low incubation temperature during early development negatively affects survival and related innate immune processes in zebrafish larvae exposed to lipopolysaccharide. Sci Rep. (2018) 8:4142. doi: 10.1038/s41598-018-22288-8
49. Martínez-López A, Tyrkalska SD, Alcaraz-Pérez F, Cabas I, Candel S, Martínez Morcillo FJ, et al. Evolution of LPS recognition and signaling: The bony fish perspective. Dev Comp Immunol. (2023) 145:104710. doi: 10.1016/j.dci.2023.104710
50. Zotta A, O’Neill LAJ, and Yin M. Unlocking potential: the role of the electron transport chain in immunometabolism. Trends Immunol. (2024) 45:259–73. doi: 10.1016/j.it.2024.02.002
51. Baek M, Chai JC, Choi HI, Yoo E, Binas B, Lee YS, et al. Analysis of differentially expressed long non-coding RNAs in LPS-induced human HMC3 microglial cells. BMC Genomics. (2022) 23:853. doi: 10.1186/s12864-022-09083-6
52. Chini A, Guha P, Malladi VS, Guo Z, and Mandal SS. Novel long non-coding RNAs associated with inflammation and macrophage activation in human. Sci Rep. (2023) 13:4036. doi: 10.1038/s41598-023-30568-1
53. NE II, Heward JA, Roux B, Tsitsiou E, Fenwick PS, Lenzi L, et al. Long non-coding RNAs and enhancer RNAs regulate the lipopolysaccharide-induced inflammatory response in human monocytes. Nat Commun. (2014) 5:3979. doi: 10.1038/ncomms4979
54. Krawczyk M and Emerson BM. p50-associated COX-2 extragenic RNA (PACER) activates COX-2 gene expression by occluding repressive NF-κB complexes. Elife. (2014) 3:e01776. doi: 10.7554/eLife.01776
55. Liu J, Li X, Yang P, He Y, Hong W, Feng Y, et al. LIN28A-dependent lncRNA NEAT1 aggravates sepsis-induced acute respiratory distress syndrome through destabilizing ACE2 mRNA by RNA methylation. J Transl Med. (2025) 23:15. doi: 10.1186/s12967-024-06032-7
56. Lim YH, Yoon G, Ryu Y, Jeong D, Song J, Kim YS, et al. Human lncRNA SUGCT-AS1 regulates the proinflammatory response of macrophage. Int J Mol Sci. (2023) 24. doi: 10.3390/ijms241713315
57. Zhang Q, Chao TC, Patil VS, Qin Y, Tiwari SK, Chiou J, et al. The long noncoding RNA ROCKI regulates inflammatory gene expression. EMBO J. (2019) 38. doi: 10.15252/embj.2018100041
58. Li Z, Chao TC, Chang KY, Lin N, Patil VS, Shimizu C, et al. The long noncoding RNA THRIL regulates TNFα expression through its interaction with hnRNPL. Proc Natl Acad Sci U S A. (2014) 111:1002–7. doi: 10.1073/pnas.1313768111
59. Zhao G, Su Z, Song D, Mao Y, and Mao X. The long noncoding RNA MALAT1 regulates the lipopolysaccharide-induced inflammatory response through its interaction with NF-κB. FEBS Lett. (2016) 590:2884–95. doi: 10.1002/1873-3468.12315
60. Obaid M, Udden SMN, Deb P, Shihabeddin N, Zaki MH, and Mandal SS. LncRNA HOTAIR regulates lipopolysaccharide-induced cytokine expression and inflammatory response in macrophages. Sci Rep. (2018) 8:15670. doi: 10.1038/s41598-018-33722-2
61. Zhang DD, Wang WT, Xiong J, Xie XM, Cui SS, Zhao ZG, et al. Long noncoding RNA LINC00305 promotes inflammation by activating the AHRR-NF-κB pathway in human monocytes. Sci Rep. (2017) 7:46204. doi: 10.1038/srep46204
62. Cui H, Xie N, Tan Z, Banerjee S, Thannickal VJ, Abraham E, et al. The human long noncoding RNA lnc-IL7R regulates the inflammatory response. Eur J Immunol. (2014) 44:2085–95. doi: 10.1002/eji.201344126
63. Arunima A, Niyakan S, Butler SM, Clark SD, Pinson A, Kwak D, et al. CYP1B1-AS1 regulates CYP1B1 to promote Coxiella burnetii pathogenesis by inhibiting ROS and host cell death. Nat Commun. (2025) 16:7493. doi: 10.1038/s41467-025-62762-2
64. Chen D, Tang TX, Deng H, Yang XP, and Tang ZH. Interleukin-7 biology and its effects on immune cells: mediator of generation, differentiation, survival, and homeostasis. Front Immunol. (2021) 12:747324. doi: 10.3389/fimmu.2021.747324
65. Ma M, Xu H, Liu G, Wu J, Li C, Wang X, et al. Metabolism-induced tumor activator 1 (MITA1), an Energy Stress-Inducible Long Noncoding RNA, Promotes Hepatocellular Carcinoma Metastasis. Hepatology. (2019) 70:215–30. doi: 10.1002/hep.30602
66. Akıncılar SC, Wu L, Ng QF, Chua JYH, Unal B, Noda T, et al. NAIL: an evolutionarily conserved lncRNA essential for licensing coordinated activation of p38 and NFκB in colitis. Gut. (2021) 70:1857–71. doi: 10.1136/gutjnl-2020-322980
67. Liu X, Lu Y, Zhu J, Liu M, Xie M, Ye M, et al. Antisense IL-7, promotes inflammatory gene transcription through facilitating histone acetylation and switch/sucrose nonfermentable chromatin remodeling. J Immunol. (2019) 203:1548–59. doi: 10.4049/jimmunol.1900256
68. Li Z, Liu L, Feng C, Qin Y, Xiao J, Zhang Z, et al. LncBook 2.0: integrating human long non-coding RNAs with multi-omics annotations. Nucleic Acids Res. (2023) 51:D186–d91. doi: 10.1093/nar/gkac999
69. Peng PH, Lai JC, Chang JS, Hsu KW, and Wu KJ. Induction of epithelial-mesenchymal transition (EMT) by hypoxia-induced lncRNA RP11-367G18.1 through regulating the histone 4 lysine 16 acetylation (H4K16Ac) mark. Am J Cancer Res. (2021) 11:2618–36.
70. Cao Y, Matsubara T, Zhao C, Gao W, Peng L, Shan J, et al. Antisense oligonucleotide and thyroid hormone conjugates for obesity treatment. Sci Rep. (2017) 7:9307. doi: 10.1038/s41598-017-09598-z
Keywords: vertebrate, innate immunity, LPS, long non-coding RNAs, evolution, transcriptomics
Citation: Hadjicharalambous MR, Ryder RR, Fenwick PS, Donnelly LE and Lindsay MA (2026) Catalogue of LPS-induced transcriptional changes across vertebrates identifies syntenically conserved human long non-coding RNAs that regulate the innate immune response. Front. Immunol. 16:1686490. doi: 10.3389/fimmu.2025.1686490
Received: 15 August 2025; Accepted: 25 November 2025; Revised: 21 November 2025;
Published: 12 January 2026.
Edited by:
Orestes Foresto-Neto, University of São Paulo, BrazilReviewed by:
Laura La Paglia, National Research Council (CNR), ItalyAryashree Arunima, Texas A and M University, United States
Copyright © 2026 Hadjicharalambous, Ryder, Fenwick, Donnelly and Lindsay. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Mark A. Lindsay, bS5hLmxpbmRzYXlAYmF0aC5hYy51aw==
†Present address: Marina R. Hadjicharalambous, Theramir Ltd, 1 Alkidamou, High-Tech Cluster, Agios Athanasios, Limassol, Cyprus
Richard R. Ryder, Foley’s School, Limassol, Cyprus
Richard R. Ryder1†