Integrated Single Cell and Bulk RNA-Seq Analysis Revealed Immunomodulatory Effects of Ulinastatin in Sepsis: A Multicenter Cohort Study

Sepsis is a leading cause of morbidity and mortality in the intensive care unit, which is caused by unregulated inflammatory response leading to organ injuries. Ulinastatin (UTI), an immunomodulatory agent, is widely used in clinical practice and is associated with improved outcomes in sepsis. But its underlying mechanisms are largely unknown. Our study integrated bulk and single cell RNA-seq data to systematically explore the potential mechanisms of the effects of UTI in sepsis. After adjusting for potential confounders in the negative binomial regression model, there were more genes being downregulated than being upregulated in the UTI group. These down-regulated genes were enriched in the neutrophil involved immunity such as neutrophil activation and degranulation, indicating the immunomodulatory effects of UTI is mediated via regulation of neutrophil activity. By deconvoluting the bulk RNA-seq samples to obtain fractions of cell types, the Myeloid-derived suppressor cells (MDSC) were significantly expanded in the UTI treated samples. Further cell-cell communication analysis revealed some signaling pathways such as ANEEXIN, GRN and RESISTIN that might be involved in the immunomodulatory effects of UTI. The study provides a comprehensive reference map of transcriptional states of sepsis treated with UTI, as well as a general framework for studying UTI-related mechanisms.


Detailed methods for WGCNA
The first step in constructing a weighted gene co-expression netwoek (WGCNA) was to choose the soft thresholding power to which co-expression similarity was raised to calculate adjacency. We chose from a set of values from 1 to 30 based on the criterion of approximate scale-free topology. Finally, modules were identified in the resulting dendrogram using the Dynamic Tree Cut algorithm. This algorithm has several advantages such as capability of identifying nested clusters and flexibility. Modules with similar expression profiles were merged at the threshold of 0.25. Gene significance was defined as the student t-test statistic for testing differential expression between ulinastatin (UTI) and controls. The significance level was adjusted for multiple testing with Bonferroni correction.
Module eigengene was calculated for each module as the first principal component of gene expressions for that module. Correlation analysis was performed to relate module eigengene to external traits including age, sex, SOFA, days and UTI group.
Gene significance for UTI group was correlated to the module membership to investigate whether genes significantly associated with UTI group was also associated with module membership. Module membership (eigengene-based connectivity) for each gene was calculated by correlating its gene expression profile with the module eigengene of a given module. For a given module, a module membership value of 0 indicates that a gene is not part of the module, whereas a module membership of − 1 or 1 is highly connected to the module.

Enrichment analysis for biological function and transcription factors
Modules associated with important clinical trait such as UTI group were further analyzed for their enrichment in Gene Ontology (GO) pathways. Specifically, the gene set from a given module were enriched to GO terms to find whether some of functional GO terms are overrepresented using annotations for that gene set. Dot plot shows the gene ratio and adjusted p values for each enriched GO terms. Enriched terms were organized into a network with edges connecting overlapping gene sets. In this way, mutually overlapping gene sets are tend to cluster together, making it easy to identify functional modules. The category net plot depicts the linkages of genes and GO terms as a network, which is helpful to see which genes are involved in enriched pathways and genes that may belong to multiple annotation categories.
Modules (gene lists) significantly correlated with the mortality trait were tested for its overrepresentation in transcription factor (TF) binding motifs by using RcisTarget. Two types of databases (i.e. Gene-motif rankings and the annotation of motifs to transcription factors) were employed in the analysis: Gene-motif rankings which provides the rankings of all the genes for each motif and the annotation of motifs to transcription factors. Parameter settings for the score of each pair of gene-motif were: species = Homo sapiens, Scoring/search space = 500 bp uptream the transcription start site (TSS), Number of orthologous species = 10. The annotation of motifs to transcription factors was performed using the motifAnnotations_hgnc ('mc9nr', 24,453 motifs).

Identification of miRNA-target interactions
The multiMiR package was employed for the retrieval of miRNA-target interactions from 14 external databases in R. These databases are comprehensive collections of predicted and validated miRNA-target interactions and their associations with diseases and drugs. The module of interest (related to UTI group) was those associated with mortality outcome. It was interesting to check whether some, or all, of these genes within a module were targeted by the same miRNA(s). We restricted our search to the "mirtarbase" table because this table included only experimentally validated miRNA-target interactions.

Results
Weighted gene co-expression network analysis (WGCNA) eFig 1. Sample clustering and clinical trait heatmap