Gene Co-expression Networks Identifies Common Hub Genes Between Cutaneous Sarcoidosis and Discoid Lupus Erythematosus

In this study we analyzed gene co-expression networks of three immune-related skin diseases: cutaneous sarcoidosis (CS), discoid lupus erythematosus (DLE), and psoriasis. We propose that investigation of gene co-expression networks may provide insights into underlying disease mechanisms. Microarray expression data from two cohorts of patients with CS, DLE, or psoriasis skin lesions were analyzed. We applied weighted gene correlation network analysis (WGCNA) to construct gene-gene similarity networks and cluster genes into modules based on similar expression profiles. A module of interest that was preserved between datasets and corresponded with case/control status was identified. This module was related to immune activation, specifically leukocyte activation, and was significantly increased in both CS lesions and DLE lesions compared to their respective controls. Protein-protein interaction (PPI) networks constructed for this module revealed seven common hub genes between CS lesions and DLE lesions: TLR1, ITGAL, TNFRSF1B, CD86, SPI1, BTK, and IL10RA. Common hub genes were highly upregulated in CS lesions and DLE lesions compared to their respective controls in a differential expression analysis. Our results indicate common gene expression patterns in the immune processes of CS and DLE, which may have indications for future therapeutic targets and serve as Th1-mediated disease biomarkers. Additionally, we identified hub genes unique to CS and DLE, which can help differentiate these diseases from one another and may serve as unique therapeutic targets and biomarkers. Notably, we find common gene expression patterns in the immune processes of CS and DLE through utilization of WGCNA.


INTRODUCTION
Cutaneous sarcoidosis (CS), discoid lupus erythematosus (DLE), and psoriasis are immune-related cutaneous disorders with different pathologies and clinical presentations. CS occurs in up to one third of patients with systemic sarcoidosis, an inflammatory disease characterized by non-caseating granulomas (1). CS is often considered the "great imitator" in dermatology due to its large range of morphologies, including papules, plaques, lupus pernio, and scar psoriaform and ulcerative lesions (1). DLE is the most prevalent type of chronic cutaneous lupus erythematosus characterized by pathogenic autoantibodies and immune complexes (2). The most common presentation of DLE is coin-shaped plaques on the scalp and ears (2,3). DLE can occur as a skin manifestation of systemic lupus erythematosus (SLE) in up to 20% of patients (2,3). Psoriasis is a common skin condition that affects over 7 million Americans (4). The hallmark of immune dysfunction in psoriasis is uncontrolled keratocyte proliferation and differentiation (5). Plaque psoriasis is the most common subtype, presenting with well-defined areas of erythematous plaques with silvery scales (5). The severity of psoriasis can greatly vary, with the joints being affected in 20-30% of patients (6). While psoriasis was traditionally considered a Th1-mediated disease, recent studies suggests that psoriasis may be predominantly Th17-mediated (7). In contrast, both CS and DLE may be predominantly Th1mediated diseases (8,9).
Within the past 20 years, biological treatments for several skin diseases, including psoriasis, atopic dermatitis, urticaria, and pemphigus vulgaris have emerged as major therapeutic breakthroughs (10). First-line therapy for cutaneous sarcoidosis consists of corticosteroids and second-line therapies consist of tetracyclines, hydroxychloroquine, and methotrexate. Biologics, e.g., anti-TNF, have been used to treat chronic or resistant cutaneous sarcoidosis with improvement or worsening of disease (11). A limited number of studies support the use of both infliximab and adalimumab as third-line therapies for cutaneous sarcoidosis, with some reports of etanercept, rituximab, golimumab, and ustekinumab being successful as well (12). For DLE, first line therapies include photoprotection in conjunction with topical or oral corticosteroids, topical calcineurin inhibitors, and systemic antimalarial therapy (2,3,13). Refractory lesions may be treated with intralesional corticosteroid injections (2,13). Chronic DLE lesions that are not responsive to topical corticosteroids or topical calcineurin inhibitors may be responsive to intralesional corticosteroid injections (2). Intravenous immunoglobin, rituximab, and dapsone have successfully treated cutaneous lupus lesions in a limited number of studies, as well as toxilizumab and anti-CD4 antibody in single reports (14). For mild to moderate psoriasis, first-line therapies include topical therapies including corticosteroids, vitamin D3 analogs, and combination products (15). Moderate to severe psoriasis can be treated with systemic therapies such as phototherapy, acitretin, methotrexate, and cyclosporine (15). For patients who do not respond to systemic therapies, biologics may be used. Infliximab has been found to be the most effective, followed by ustekinumab, adalimumab, and etanercept (15). These therapies do not address the presence of concomitant diseases of sarcoidosis and psoriasis. For example, anti-TNF in sarcoidosis may induce psoriasis skin lesions. Thus, we undertook this study to dissect the common and differentially expressed pathways among these three diseases.
Biological therapies must target a specific immune component that plays a key role in disease pathogenesis. Ideally, treatment should be directed to a patient-specific target (16). We have previously used gene co-expression networks to identify genes and molecular pathways of a disease state associated with clinical traits (17), as well as identifying similar immunological mechanisms between sarcoidosis and idiopathic pulmonary (18). In this study, we characterized commonly altered biological pathways in cutaneous sarcoidosis (CS), discoid lupus erythematosus (DLE), and psoriasis using gene coexpression networks. We created gene co-expression networks of microarray data from two previous studies. The first study found that active CS skin lesions showed several thousand differentially expression genes compared to non-lesional skin in CS patients and healthy controls. These differentially expressed genes showed a strong Th1 profile of sarcoidosis and expression of interleukin (IL)-23 and IL-23R with limited expression of other Th17 pathway genes (8). The second study found that DLE skin lesions demonstrated a predominance of IFN-γ-producing Th1 cells and an absence of IL-17-producing Th17 cells compared to psoriasis skin lesions (9).
In this study, we hypothesized that there would be common gene expression patterns between CS and DLE due to the similarities between the two diseases. Both CS and DLE are related to systemic disease, have a greater prevalence in African American populations (2,19), and are predominantly Th1mediated (8,9). To investigate this hypothesis, microarray expression data with weighted gene co-network analysis (WGCNA) was applied. Since genes with similar expression patterns are likely to be functionally related, WGCNA clusters genes with correlated expression profiles into groups known as modules. WGCNA was used to identify the most relevant module in immune-related skin disorders. Hub genes within the module were identified using intramodular connectivity and proteinprotein interaction (PPI) networks. The hub genes were further characterized by differential gene expression (DGE) analysis. We propose that characterization of commonly altered biologic pathways in CS, DLE, and psoriasis may uncover immunological targets and/or biomarkers.

Data Collection and Preprocessing
Microarray data and associated clinical data was obtained from the NCBI Gene Expression Omnibus (GEO). Dataset 1 (GSE32887) (8) included 15 skin samples from CS lesions, 11 skin samples of non-lesional skin (NLS) on the same patients with CS, and 5 skin samples from healthy volunteers (control 1). Dataset 2 (GSE52471) (9) included 7 skin samples from DLE lesions, 18 skin samples from psoriasis lesions, and 13 skin samples from healthy volunteers (control 2). Both datasets were generated using Affymetrix Human Genome U133A 2.0 Array. The collapseRows() function was used to filter the probes to include only unique genes that were present in both datasets.

Co-expression Network Construction
The WGCNA package on R (20) was used to construct coexpression networks of both datasets. We utilized code from (21). Since both datasets used the same platform, they were comparable. To create a scale-free network, an appropriate soft power was selected to promote strong connections between genes and filter out weak ones. Pearson correlation was used to measure the concordance of pair-wise genes. The Pearson correlation matrix was transformed into a weighted network for each dataset using the adjacency() function. Dataset 1 was used to construct modules. The dynamic tree-cutting function with a cut height of 0.99, a minimum cluster size of 30, and deep split of 3 identified modules with similar expression patterns. Modules are given arbitrary color names for easier tracking.

Identification of Module of Interest and GO Enrichment Analysis
The modules constructed from dataset 1 were mapped onto dataset 2. A preservation Z-score summary for each module was calculated using the modulePreservation() function in order to identify highly preserved modules between the two datasets. A Z-score of greater than five was used as the threshold for module preservation (21). TheGOenrichmentAnalysis() function in WGCNA was used to annotate each module with significant biological functions. Module eigengenes (ME), which can be interpreted as the level of expression of each module within each sample, were obtained for all samples across studies. We performed a 3-group comparison using a single-factor ANOVA to compare MEs of preserved modules between the CS lesions, NLS, and controls in dataset 1. We repeated this procedure for DLE, psoriasis, and controls for dataset 2. For modules with significant ANOVA results, we performed follow-up pairwise t-tests assuming unequal variance between each subgroup in each dataset. The module which demonstrated a significant difference between case/control status in both datasets was investigated further.

Identification of Hub Genes in Module of Interest
The genes in the module of interest were ranked by intramodular connectivity. The top 5% of genes in dataset 1 and dataset 2 were considered potential hub genes (22). The lists were cross-referenced to identify overlapping genes. We also used a second method to identify hub genes. The 2,000 genes (the maximum allowed) with the highest intramodular connectivity for each dataset were entered into the STRING (23) database to construct a protein-protein interaction (PPI) network. The minimum protein interaction score was set to high confidence (0.7). The output from the PPI network was imported into Cytoscape (24), an open source software platform for visualizing complex networks. The top 5% of genes with the highest degree were considered potential hub genes and the lists from the two datasets were cross referenced to identify overlapping genes (25). We termed genes that were common to both network types and common between datasets as "hub genes." Gene expression of hub genes was compared between disease groups using Kruskal-Wallis test and post-hoc Dunn's test with Bonferroni correction for each gene.

Validation of Hub Genes Using Differential Gene Expression (DGE) Analysis
The limma package (26) on R was used to identify differentially expressed genes between CS lesions and DLE lesions vs. their respective controls to validate our results. No covariates were used in the limma model because case/control status was the only variable. The lmFit() function and empirical Bayes method in the limma package was used to analyze the genes. The topTable() function summarized the results from the linear fit. We used the criteria of p < 0.01 and log2 (fold-change) to define differentially expressed genes. We adjusted our p-values via the Benjamini-Hochberg Procedure.

Co-expression Network Construction Identifies a Module of Interest
Our filtering process resulted in 12,991 genes to use in network construction. The co-expression network resulted in 14 modules, six of which were preserved between datasets. Of the six preserved modules, two modules were related to cellular division, one was related to biosynthesis, one was related to the nucleus, one was related to sensory perception, and one was related to leukocyte activation. The top 10 GO terms for these preserved modules can be found in File 1. All P-values reported have been adjusted for multiple comparisons. While all six preserved modules in dataset 2 revealed significant differences between case/control status (p < 0.01), only the leukocyte activation module significantly differed between case/control status in dataset 1 (p < 0.001). We therefore selected the leukocyte activation module as our module of interest. The module of interest was significantly increased in both the CS lesions and DLE lesions compared to their respective controls (p < 0.001). There was no significant difference between NLS and control in dataset 1. Psoriasis showed decreased expression of the leukocyte activation module compared to both DLE (p < 0.001) and controls in dataset 2 (p < 0.05).

Hub Genes Involved in Sarcoidosis and Lupus Pathogenesis Are Identified
The leukocyte activation module contained a total of 3,511 genes. From the co-expression network, the two datasets had 21 potential hub genes in common. From the PPI network, the two datasets had 74 potential hub genes in common. We found seven genes that were hubs in both network construction methods: TLR1, ITGAL, TNFRSF1B, CD86, SPI1, BTK, and IL10RA (Figure 1). All seven hub genes were upregulated in both the CS and DLE compared to their respective controls (p < 0.05). Additionally, all seven genes were significantly upregulated in the CS lesions compared to NLS, which did not show any significant differences from their controls. The psoriasis lesion samples showed decreased expression of BTK and TNFRSF1B compared to controls ( Table 1). The seven hub genes were unique to the leukocyte activation module and were not found in any other preserved module. The expression level of each hub gene based on case/control status is demonstrated in Figure 2.   Seventeen hub genes were found only in dataset 1, including CD40, CCR5, CCL5, MYD88, IL15, and CXCR3. Six hub genes were found only in the dataset 2, of note TLR7, FCGR2A, and CCR2. The listed genes have primarily been indicated in the pathogenesis of sarcoidosis or lupus in the literature (27)(28)(29)(30). Full gene lists can be found in Table 2.

Common Hub Genes Are Highly Upregulated in Differential Expression Analysis
The common hub genes were found to be individually upregulated in disease states compared to controls. The DGE analysis demonstrates that as a group, the common hub genes are highly upregulated compared to other DEGs in the datasets for both CS and DLE vs. respective controls. Volcano plots of DEGs with labeled hub genes are shown in Figure 3.
All 17 hub genes unique to dataset 1 showed significant upregulation in the DGE analysis of CS lesions vs. control. Ten of the 17 hub genes unique to dataset 1 were also significantly upregulated DEGs in the DLE vs. control analysis. The remaining seven were upregulated DEGs only in the CS vs. control analysis ( Table 2). Of the six hub genes significant only to dataset 2, all six were upregulated in DLE lesions vs. control, as well as the CS vs. control analysis. Thus, in the DGE analysis there were only seven hub genes that were unique to CS and no hub genes that were unique to DLE.

DISCUSSION
Our findings suggest that there may be common gene expression patterns in the immune processes of CS and DLE. Since there were no significant differences between control and NLS, it appears that the gene dysregulation is confined to the sarcoidosis lesions. Additionally, psoriasis does not demonstrate the same patterns of gene expression differences as demonstrated in CS and DLE, suggesting that similarities between CS and DLE may extend beyond being immune-related skin disorders. The common hub genes we identified may be further investigated as biomarkers of Th1-driven inflammatory disorders. These genes may represent underlying drivers of Th1-skewed immune disease and could serve as a therapeutic target in CS and DLE. We also identified hub genes that are unique to CS and DLE, which can help differentiate these diseases from one another and may serve as unique markers rather than general Th1-mediated disease markers.
Closer examination of the seven identified hub genes reveals involvement of pathways that activate the immune system. Some of these genes encode for proteins that are already therapeutic targets. For example, TNFRSF1B encodes a protein that is a member of the TNF-receptor superfamily, which mediates the recruitment of anti-apoptotic proteins (31,32). TNF-α inhibitors are currently used to treat a variety of immune-related disorders including sarcoidosis, lupus, and plaque psoriasis (33). However, TNF-α inhibitors do not consistently work for sarcoidosis (34) and are also complicated in SLE treatment due to the fact that TNF-α inhibitors can induce lupus (35). Future studies must utilize gene-regulatory networks to stratify phenotypes of sarcoidosis and lupus that are responsive to specific therapies.
Other hub genes identified encode for immune system related proteins that have shown promise as biomarkers or targets. CD86 encodes a protein that is expressed on antigen presenting cells (APCs) and serves as the costimulatory signal for T-cell activation (31). In sarcoidosis patients, alveolar macrophages (AM) act as APCs and express high levels of CD86 (31) to stimulate T-cell activation. BTK, which plays a crucial role in Bcell development, has been identified as having multiple roles in the production of autoantibodies and the pathogenesis of lupus (36). BTK inhibitors serve as a promising new therapeutic target and are currently being tested to treat lupus in animal models (37). TLR1 is a member of the toll-like receptor (TLR) family, which plays a fundamental role in pathogen recognition and innate immunity. Toll-like receptors recognize pathogenassociated molecular pathways (PAMPS) that are expressed on infectious agents (38). TLR1 in particular is expressed at higher levels than other TLRs and has been associated with infectious diseases that affect the skin, including Leprosy (39) and Lyme Disease (40). TLRs are thought to play a role in both sarcoidosis and lupus pathogenesis (41,42), and has been suggested as a potential therapeutic target for lupus (43).
Hub genes may be useful as biomarkers for disease activity as well. IL10RA is a receptor for interleukin 10, an antiinflammatory cytokine. IL-10 is increased in active pulmonary sarcoidosis as a compensatory response to increased expression of proinflammatory cytokines (44). IL-10 mediates disease activity of SLE (45). Together, the genes identified are those known to influence inflammatory responses.
The hub genes unique to CS and DLE are also potentially useful as disease markers and treatment targets. Of the 17 hub genes unique to the sarcoidosis samples, many are involved in T-cell activation and proliferation, such as CD40, CD80, CCR5, CCL5, and IL15, or cellular signaling, such as SYK, MYD88, and CXCR3. Some of the hub genes have been indicated in the pathogenesis of sarcoidosis. For example, CD40 which is required to activate APCs, is expressed in higher levels on AMs in patients with sarcoidosis (46). The expression of CD40 correlates with CD86, a common hub gene (46).
The C-C chemokine receptor 5 (CCR5) binds to chemokine ligand RANTES (CCL5) and is expressed on T-cells and macrophages. CCR5 mRNA is increased in bronchoalveolar lavage fluid (BALF) of sarcoidosis patients and may serve as a marker of pulmonary disease (47). Furthermore, certain CCR5 haplotypes are associated with sarcoidosis. The CCR5 haplotype FIGURE 3 | Volcano plots mark differentially expressed genes (orange dots) in cutaneous sarcoidosis lesions vs. control (top) and discoid lupus erythematosus lesions vs. control (bottom). The orange dots represent differentially expressed genes using the criteria of p < 0.01 and log2 (fold-change). Black, gray, and red dots are genes that do not meet the criteria of a differentially expressed gene. The common hub genes (TLR1, ITGAL, TNFRSF1B, CD86, SPI1, BTK, and IL10RA) are marked.
HHC is strongly correlated with parenchymal lung disease in sarcoidosis, however, appears not to increase susceptibility to sarcoidosis and is only relevant after disease induction (48).
As a whole, we identified sarcoidosis genes that reflect immune cell activation. In contrast, the unique genes of the lupus group are more involved in pathogen recognition and degradation, such as TLR7, and FCGR2A. Some FCGR2A polymorphisms may increase susceptibility and development of SLE in certain ethnic populations (30). Data from mouse models have shown that TLR7, which is involved with PAMP recognition, serves a pathogenic role in the development of SLE, while TLR9 serves a protective role (28). Altered expression of TLR7 and TLR9 has been suggested as a biomarker to identify a subset of SLE patients that may respond to a targeted therapeutic approach (29). CCR2+ T-cells, which aid in monocyte chemotaxis, are found to be selectively decreased during SLE flares and could potentially serve as a biomarker (27).
Together, our study indicates that our methodology may be informative in proposing key gene regulatory points in the immune processes of sarcoidosis and lupus. Notably, we find common gene expression patterns in the immune processes of CS and DLE. We have utilized WGCNA to identify genes that may underlie the pathogenesis of Th1-mediated skin disorders. The study limitations include lack of clinical data regarding therapy or systemic involvement, as well as sample collection at a single time point. Future analysis of the skin lesions at multiple points with functional interruption would be necessary to characterize the specific roles of hub genes in disease pathogenesis. We underscore that our demonstration of high intramolecular connectivity of the hub genes strongly suggests regulatory roles. We have identified hub genes that have been previously identified, as well as novel gene candidates for lupus and sarcoidosis. Lupus is a systemic autoimmune disease with skin manifestations, while sarcoidosis has been viewed as a predominantly pulmonary disorder with skin manifestations. Our results suggest that sarcoidosis may also be a systemic disease with immune dysregulation. Studies that stratify samples by therapy, organ involvement or disease progression are warranted. Future studies that utilize gene-regulatory networks and identify new genes may enhance our understanding of lupus or sarcoidosis as systemic disorders with implications for biomarkers or therapies.