Transcriptomic and Network Analysis of Minor Salivary Glands of Patients With Primary Sjögren’s Syndrome

Primary Sjögren’s syndrome (pSS) is a systemic autoimmune disease characterized primarily by immune-mediated destruction of exocrine tissues, such as those of the salivary and lacrimal glands, resulting in the loss of saliva and tear production, respectively. This disease predominantly affects middle-aged women, often in an insidious manner with the accumulation of subtle changes in glandular function occurring over many years. Patients commonly suffer from pSS symptoms for years before receiving a diagnosis. Currently, there is no effective cure for pSS and treatment options and targeted therapy approaches are limited due to a lack of our overall understanding of the disease etiology and its underlying pathology. To better elucidate the underlying molecular nature of this disease, we have performed RNA-sequencing to generate a comprehensive global gene expression profile of minor salivary glands from an ethnically diverse cohort of patients with pSS. Gene expression analysis has identified a number of pathways and networks that are relevant in pSS pathogenesis. Moreover, our detailed integrative analysis has revealed a primary Sjögren’s syndrome molecular signature that may represent important players acting as potential drivers of this disease. Finally, we have established that the global transcriptomic changes in pSS are likely to be attributed not only to various immune cell types within the salivary gland but also epithelial cells which are likely playing a contributing role. Overall, our comprehensive studies provide a database-enriched framework and resource for the identification and examination of key pathways, mediators, and new biomarkers important in the pathogenesis of this disease with the long-term goals of facilitating earlier diagnosis of pSS and to mitigate or abrogate the progression of this debilitating disease.


INTRODUCTION
Sjögren's syndrome (SS) is a chronic, inflammatory autoimmune disease typically characterized by focal lymphocytic infiltration of exocrine glands that predominantly affect the salivary and lacrimal glands, resulting in oral and ocular dryness, respectively. While this disease preferentially involves salivary and lacrimal glands, systemic effects are also observed. Indeed, a wide range of other affected organs include the skin, kidney, and lungs (1). Sjögren's syndrome may exist as an independent entity, referred to as primary Sjögren's syndrome (pSS), or alternatively as secondary Sjögren's syndrome, which occurs in conjunction with other autoimmune connective tissue diseases such as rheumatoid arthritis or systemic lupus erythematosus (SLE). The prevalence of pSS ranges from 0.01% to 3% of the general population, primarily affecting middle-aged women with a female to male ratio of up to 20:1 (2,3). While pSS is overwhelmingly dominated by ocular and oral dryness, fatigue, and pain, serious disease sequelae associated with increased mortality are observed, such as cryoglobulinemic vasculitis, B cell lymphoma and pulmonary fibrosis (4).
pSS is a multifactorial complex disease involving both genetic and environmental factors such as viral infections, that may influence disease progression and severity (5,6). Viruses have long been considered potential players in SS pathology with Epstein-Barr and human T cell leukemia type I viruses being the most commonly associated with this disease (7)(8)(9)(10)(11). Over the years, studies also focused on the contribution of different immune cell types to the development of SS including T and B cells (12)(13)(14). More recently, these studies have expanded to examine the intimate role of signaling molecules and pathways in SS pathogenesis including various cytokines and chemokines. Indeed, multiple chemokines have been implicated in SS, including CCL19, CCL21, and CXCL13, all of which have been shown to play important roles in driving the immunerelated effects associated with this disease (15)(16)(17). In addition to cytokines and chemokines, several signaling pathways are implicated in SS pathogenesis. Interestingly, emerging evidence suggests a prominent role for interferon (IFN) signaling in SS pathology (18). Indeed, activated IFN signaling in the salivary gland of SS patients has been associated with this disease (17,19). Despite extensive efforts directed towards identifying the underlying molecular mechanisms contributing to SS pathogenesis, current treatment options are limited to managing clinical symptoms as no effective treatments, or cures, have been developed to date.
In order to better understand the underlying molecular landscape contributing to SS pathogenesis, we have performed RNA-sequencing to examine the global gene expression profiles of minor salivary glands (MSGs) from non-SS controls and pSS patients. Functional gene enrichment and network analysis of the MSGs revealed important molecular players and pathways that are relevant in pSS pathology. Moreover, our integrated transcriptomic analysis has uncovered a distinct pSS molecular signature highlighted by pertinent genes, genetic biomarkers and pathways that may be important in driving this disease. Furthermore, we show that the cellular dysfunction in the MSGs of pSS involves intricate contribution of various immune cells types, with epithelial cells also playing a potentially important role. Overall, our comprehensive studies have not only re-affirmed the importance of key signaling molecules and pathways, but have also identified novel genes and important cellular subtypes. This knowledge can be mined for effective diagnosis and monitoring of pSS pathology with the long-term goal of developing targeted therapeutic strategies to better treat this chronic debilitating disease.

Patients Samples
Human minor salivary glands (MSG) from 10 pSS patients and 10 non-SS controls were collected at the Oklahoma Medical Research Foundation (OMRF) as previously described (20). An additional three pSS patients and three non-SS controls were collected at the Augusta University Dental College of Georgia (AUDCG) (Supplementary Table 5). The OMRF and the AUDGC Institutional Review Boards approved all research procedures and the study participants gave written informed consent in compliance with the Declaration of Helsinki. All primary Sjögren's syndrome patients were classified as either pSS or non-SS in accordance with the 2002 revised American European Consensus Group Sjögren's syndrome criteria (AECG) (21). Patients also met the 2016 ACR-EULAR criteria for pSS (22,23). The non-SS controls were subjects with subjective sicca symptoms, based on positive responses to the AECG standardized dry eyes and dry mouth questions, but who did not have the objective criteria of glandular dysfunction to be classified as pSS. MSG biopsy tissue from female pSS patients (n = 13, mean age = 55.1 years) and age and sex-matched non-SS control subjects (n = 13, mean age = 54.2 years) were analyzed. All pSS patients displayed focal lymphocytic sialadenitis with a focus score of greater than one/4 mm 2 in accordance with the standard of care as detailed in Daniels et al. (24). A majority of patients exhibited Ro/SSA autoantibodies and salivary gland hypofunction (whole unstimulated salivary flow < 0.1 ml/min). A summary of clinical characteristics and patient demographics is provided in Table 1 and Supplementary Table S5.

RNA Isolation and Quantitative RT-PCR
Total RNA from human MSGs from non-SS controls and pSS patient samples from OMRF was extracted from optimal cutting temperature (OCT) embedded tissues and isolated and purified using TRIzol (Invitrogen, 15596018) with BioMashers (TaKaRa, 9790A). Total RNA from human MSGs from non-SS controls and pSS patient samples from CDMUA was similarly isolated and purified using TRIzol with BioMashers. All RNA was phase separated by chloroform and further isolated using the Direct-zol RNA Miniprep kit (Zymo Research, R2050) according to the manufacturer's instructions. For quantitative reverse-transcription polymerase chain reaction (qRT-PCR) a total of 0.8 micrograms of RNA was reverse transcribed using the iScript cDNA Synthesis Kit (Bio-Rad, 1708890) according to the manufacturer's instructions. Quantitative reverse-transcription polymerase chain reaction was performed on a CFX96 Touch ™ Real-Time PCR Detection System (Bio-Rad, 1855195) using iQ SYBR Green Supermix (Bio-Rad, 1708882). All qRT-PCR assays were performed in duplicates in at least three independent experiments. Relative expression values of each target gene were normalized to glyceraldehyde 3-phosphate dehydrogenase (GAPDH) expression. The RNA samples used for qRT-PCR analysis shown in Supplementary Figure S2 represent three non-SS controls (S2, S4, and S5) and three pSS (S12, S14, and S16) collected from the OMRF. In parallel, an additional set of four patient samples were utilized consisting of three non-SS controls (S21-S23) and three pSS (S25-S27) collected from the CDMUA as well as one non-SS control (S24) and one pSS (S28) from the OMRF Supplementary Figure S3. These patient samples serve as an independent cohort for the qRT-PCR data since these samples were not processed for RNA-seq studies. Primer sequences are provided in Supplementary Table S1. RNA-Sequencing, Differentially Expressed Gene (DEG), Ingenuity Pathway Analysis (IPA), and Enrichment Analyses cDNA libraries were prepared using the TrueSeq RNA Sample Preparation Kit (Illumina) from RNA samples isolated from eight non-SS controls and nine pSS patients obtained from the OMRF (see Figure 1A and Table 1 for sample ID numbers and additional patient information). The cDNA libraries were then sequenced on an Illumina NovaSeq sequencer (50 cycle paired-end). After initial quality control metrics were determined using FASTQC v0.4.3, the raw reads were then mapped to the reference genome (Hg38 build) with Hisat2 (25) v2.1.0. using Bowtie (26) v2.2.6 as the underlying aligner. Reads mapping uniquely to each gene in the reference genome were then quantified using featureCounts (27) from the Subread (28) package. The resulting counts matrix was imported into R for read count normalization and differential gene expression analysis using the DESeq2 (29) package. An adjusted p-value < 0.05 based on Benjamini-Hochberg method was used as a cut-off for determining differentially expressed genes. Gene ontology analysis for identification of enriched pathways was performed using the Database for Analysis Visualization and Integrated Discovery (DAVID) v6.8 (30,31). The databases were queried by providing the list of official names of genes of interest. The resulting KEGG Pathway (32)(33)(34)(35) Figure 5A, a table containing official names and corresponding foldchange values of the 80 genes that are common across all three datasets was used as input for the Ingenuity Pathway Analysis (IPA) software (Qiagen). The core analysis function of the software was used to interpret the data and generate the graph ( Figure 5A) of canonical pathways that correspond to the changes in gene expression. Figures 5B-D were generated from the genes enriched in the indicated canonical pathways using STRING v11 (38)(39)(40)(41), and the clusters were defined based on likelihood of protein-protein interaction using the STRING's implementation of the Markov cluster algorithm (MCL) clustering (42). Sequencing data generated for this study has been deposited in the Gene Expression Omnibus (GEO) database under the accession number GSE157159.

Determination of Immune Cell Contribution to Gene Expression
Transcripts per million (TPM) normalized gene expression matrix was generated according to the method proposed by Wagner et al. (43) and was used as input for estimating cell type enrichment using the xCell (44) program with default settings. The resulting cell type estimation matrix was filtered to remove cell types with enrichment scores less than 0.1 and used as input for generating the heatmap in Figure 4.

Protein Expression Dataset
Differentially expressed genes (DEGs) with absolute foldchange values greater than 1 between pSS and non-SS control subjects were separated into upregulated and downregulated genes. Each subset was then used to filter the RNA gene expression values from tissues downloaded from the Human Protein Atlas (HPA) (45). The salivary gland and several immune-related tissues were then further extracted and used to generate the boxplots visualizing the estimated contribution of these tissues to the overall differential gene expression profiles observed between the non-SS control and pSS tissues.

Defining the Transcriptome of the Minor Salivary Gland of pSS Patients
To better define the global gene expression patterns of pSS and to identify new molecular players that may contribute to the  development of this disease, we performed RNA-sequencing based expression profiling of human MSGs from eight non-SS controls and nine pSS patients. In order to better analyze the overall changes in gene expression patterns between cases and controls, we utilized principal component analysis (PCA). The resulting PCA plot revealed a clear degree of separation between the different samples with each of the control and pSS patient groups appearing as separate entities ( Figure 1A). Overall, our analysis suggests that the patient samples segregated well based on their mRNA expression profiles.
To better appreciate the underlying molecular drivers of pSS, we compared the transcriptomic profiles of non-SS control and pSS MSGs. Our analysis identified 5529 differentially expressed genes (DEGs), with 2979 genes upregulated and 2550 genes downregulated with the top 100 upregulated and downregulated genes shown in Figure 1B ( Figure 1B and Supplementary Table  S2). To better appreciate the biological relevance of the global transcriptomic differences between the control and pSS glands, we performed pathway analysis based on the top 100 DEGs. As expected, in the glands of pSS patients we observed specific Z-Score S18 S20 S14 S16 S15 S12 S8 S1 S2 S17 S19 S11 S3 S6 S4 S5 S7   enrichment of biological processes associated with immune responses including B cell proliferation, regulatory T cell differentiation, regulation of interleukin-12 production and chemokine-mediated signaling pathway-many of which have been implicated in SS pathogenesis (17,46,47) ( Figure 1C and Supplementary Table S3). In contrast, downregulated genes were associated with negative regulation of cell-cell adhesion, O-glycan processing, and glycoprotein metabolic process, all of which are important processes necessary for proper salivary gland function including host-mediated defense mechanisms ( Figure 1D and Supplementary Table S4) (48).

Integrated Analysis Identifies a pSS Molecular Signature
To confirm the robustness of our sequencing results, we next compared the global transcriptomes of the control and pSS glands by utilizing RNA-sequencing (RNA-seq) datasets described here, to additional RNA-seq (Liu et al.) (49) and microarray array datasets  Figure S2). In order to further validate these findings, we performed additional qRT-PCR analyses using an independent cohort of four pairs of pSS and control patient samples and observed similar trends in overall gene expression (Supplementary Figure S3 and Supplementary Table S5). Having identified 80 common genes across the three datasets, we next sought to gain a better understanding of the underlying biological functions and pathways associated with these genes. Not surprisingly, our analysis identified enrichment of genes associated with a number of biological processes which have been previously linked to pSS including immune responses, type I interferon signaling pathway, defense response to virus, chemokine-mediated signaling pathway, inflammatory response, and interferon-g-mediated signaling pathway (17,18,62,63) ( Figure 2C and Supplementary Table S6). Taken together, our integrated analysis has identified a molecular signature which includes a number of genes and pathways which have previously been shown to play a role in pSS, highlighting the power of this analysis.

Defining the Nature of the Transcriptomic Changes in pSS Salivary Glands
To further evaluate the changes in gene expression in pSS and better define the overall nature of these transcriptomic changes, we mined the publicly available transcriptome database for human tissues generated by the HPA project (45). Upon comparison of the average gene expression levels of the pSS salivary gland datasets to various human tissue datasets, we found select enrichment of genes in the pSS patient samples to be those that are predominantly expressed in immune-specific tissues and cell types ( Figure 3A). Given this result, it is tempting to speculate that the specific enrichment of genes associated with immune organs and tissues is a reflection of immune cell infiltration commonly observed in the glands of pSS patients. Conversely, we found the genes that were downregulated in pSS patients were highly enriched in the tonsil and salivary gland tissues compared to other immune tissues ( Figure 3B). This suggests that in addition to the immune cell-type related changes, other salivary gland specific cell types such as epithelial cells, may also be affected in pSS (64,65). Armed with a global view of the overall transcriptional changes in the glands of pSS patients, we next sought to characterize the cell types that may be contributing to the alterations in gene expression. Towards this end we utilized xCell (44), a cell type enrichment analysis tool that allowed us to estimate the cell types contributing to the bulk RNA-seq expression profile using published information from single-cell RNA-seq datasets (66)(67)(68). As expected, we observed enrichment of genes associated with various immune cells types including B cells, conventional dendritic cells (cDC), plasmacytoid dendritic cells (pDC) and CD4 + effector memory T cells (CD4 + Tem) in the pSS glands, which is in good agreement with previous reports (69-74) ( Figure 4). Interestingly, our analysis demonstrated a decrease in genes commonly expressed in epithelial cells ( Figure  4). Indeed, these results correlate well with our own DEG analysis demonstrating that genes which were downregulated in pSS were highly enriched in normal salivary gland tissues as described in Figure 3B. Taken together, our results highlight the complex cell intrinsic and extrinsic changes occurring in the glands of pSS patients.

Functional Gene Regulatory Network Analysis
Over the years, gene regulatory network analyses have emerged as an important tool in identifying transcriptional control programs, regulatory relationships and signaling networks that operate in the gene-rich environments during development and in disease. Towards this end, we utilized ingenuity pathway analysis (IPA) to explore differences in pathways and gene regulatory networks between the pSS and non-pSS control samples. Our analysis revealed the top two differentially regulated pathways to include the T-helper cell pathways-Th1 and Th2 ( Figure 5A), which is in good agreement with previous studies detailing the role of T cells in SS (75). Surprisingly, we also observed specific enrichment of DEGs that are commonly associated with repression of the PD-1/PDL-1 cancer immunotherapy pathway ( Figure 5A and Supplementary  Table S7). The possible association of pSS with repressed PD-1/ PDL-1 pathway is surprising given that some studies have reported elevated expression levels of PD-1 and PDL-1 in the salivary glands of pSS patients (76)(77)(78). However, our finding is in agreement with recent reports of the development of a Sjögren's like syndrome in cancer patients treated with PD-1/PD-L1 checkpoint inhibitors (79,80). Further support for pSS development resulting from repressed PD-1/PDL-1 comes from mouse models which have demonstrated that animals with deletion of PD-1 develop autoimmune diseases that include lupus-like arthritis and glomerulonephritis (81,82).
Given the conflicting roles for the PD-1/PDL-1 pathway in pSS pathogenesis, further careful studies are warranted. Yet another interesting finding was the observed enrichment of genes associated with the neuroinflammation signaling pathway ( Figure 5A). This is in line with recent studies examining the role of the neuroendocrine system in systemic autoimmune diseases including rheumatoid arthritis, SLE and SS (83,84). Finally, the role of pattern recognition receptors in recognition of bacteria and viruses was also among the over-represented pathways in our DEG dataset, which is in accordance with emerging evidence, suggesting a role for toll-like receptor (TLR) activation in pSS ( Figure 5A) (85). Gene regulatory network analyses detailing the signaling networks connecting the DEGs for each of the aforementioned pathways are demonstrated in Figures 5B-D. Overall, while our RNA-seq driven data analysis revealed enrichment in a number of immune-mediated functional DOCK2  CCR7  AC2R  CXCR4  CCL19  STAT1  CXCL11  CXCL10  CXCL9  CXCL13  CCR1  IL7R  LTB  ICOS  CD3D  LCK  TAP1  IFI44L  MS4A1  MMP9  GZMK  CMPK2  CD69  MPEG1  GZMA  ZC3H12D  IFITM1  PSMB9  NLRC5  IRF8  PLAC8  EVI2A  BANK1  HCP5  RASSF2  MX1  GMFG  EPSTI1  C16orf54  GPR183  IFI27  NAPSB  STAP1  COTL1  PPP1R16B  AIM2  CD48  SAMD9L  XAF1  IFIT3  FCRLA  FCRL1  SAMD9  CR2  DOCK10  SAMSN1  SRGN  FCRL3  MCOLN2  GBP1  GBP5  CD52  DOCK8  TAGAP  CYBB  PARP14  PRKCB  FCRL2  WDFY4  TMEM156  HIST1H2BJ  TIGIT  SELL  CD2

DISCUSSION
While the underlying molecular mechanisms driving the pathogenesis of SS has been an area of extensive research over the last several decades, very few advancements have been made in treating this disease. Although various strategies have been employed over the years to address this, recent advances in nextgeneration sequencing approaches like RNA-seq have provided unprecedented insight into the complex molecular circuitry that governs dysregulated transcriptional networks contributing to diseased states. Here we have utilized RNA-seq to generate a comprehensive global gene expression profile of MSGs from patients with pSS. Using an integrative based approach, we have identified a pSS molecular gene signature that may offer insight into new players and pathways important for driving this disease. Moreover, we have utilized sophisticated computational and bioinformatics-driven analyses to identify important cell types that may be contributing to the underlying transcriptomic changes in the MSGs of pSS patients. Our RNA-seq based approach identified a large number (~5529) of DEGs between non-SS control and pSS glands. More specifically, our analysis revealed 2979 upregulated and 2550 downregulated genes. These results were particularly surprising given that a recent RNA-seq study performed by Liu et al., and an integrated microarray-based study by Min et al., revealed a total of 293 and 382 DEGs, respectively (49,50). While discrepancies between the number of DEGs identified in our study detailed here, compared to the microarray-based study can be attributed to the inherent limitations associated with microarray technologies, the differences we observed between our study and that of the RNA-seq study performed by Liu et al., is particularly intriguing. A potential explanation for this discrepancy could be ascribed, among many things, to differences in patient background and demographics, disease severity, focus score and the inherent variability associated with genomic data collection and processing. We suspect that the patients included in the Liu et al. study were all of Chinese descent and thus likely represent a more homogenous patient population. Conversely, the patients included in our analysis were more ethnically diverse (Table 1). Hence, it is tempting to speculate that the increased number of DEGs observed in our data set is a direct reflection of ethnic diversity and more representative of the genetic heterogeneity commonly associated with this disease (86,87). It is also plausible that the differences in focus scores between the two patient groups may in part account for the significant difference in DEGs. Indeed, the high average patient focus scores (8.4) of our study compared to that of Liu et al. (1.8), might be indicative of enhanced immune cell activation in the minor salivary gland biopsies chosen for our study. We suspect that this heightened immune activation might broadly influence not only the inflammatory infiltrates but also the epithelial cell populations, which may reflect the differences in DEGs. Future studies aimed at more detailed investigations of the dynamic changes in the transcriptomic landscape as it pertains to patient focus scores will be of both prognostic and therapeutic value.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi. nlm.nih.gov/geo/, GSE157159.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Oklahoma Medical Research Foundation and the Augusta University Dental College of Georgia, Institutional Review Boards. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AO and R-AR designed the experiments. AO, EH, ES, SM, MC, SS, JK, and R-AR performed experiments. AO, EH, and R-AR analyzed the data. CL, AR, LR, RS, DL, DS, KG, KS, SD, ZK, and AF provided essential tools. AO and R-AR wrote the paper. All authors contributed to the article and approved the submitted version.