Causal Relationship Between Gut Microbiota and Autoimmune Diseases: A Two-Sample Mendelian Randomization Study

Background Growing evidence has shown that alterations in gut microbiota composition are associated with multiple autoimmune diseases (ADs). However, it is unclear whether these associations reflect a causal relationship. Objective To reveal the causal association between gut microbiota and AD, we conducted a two-sample Mendelian randomization (MR) analysis. Materials and Methods We assessed genome-wide association study (GWAS) summary statistics for gut microbiota and six common ADs, namely, systemic lupus erythematosus, rheumatoid arthritis, inflammatory bowel disease, multiple sclerosis, type 1 diabetes (T1D), and celiac disease (CeD), from published GWASs. Two-sample MR analyses were first performed to identify causal bacterial taxa for ADs in discovery samples. Significant bacterial taxa were further replicated in independent replication outcome samples. A series of sensitivity analyses was performed to validate the robustness of the results. Finally, a reverse MR analysis was performed to evaluate the possibility of reverse causation. Results Combining the results from the discovery and replication stages, we identified one causal bacterial genus, Bifidobacterium. A higher relative abundance of the Bifidobacterium genus was associated with a higher risk of T1D [odds ratio (OR): 1.605; 95% CI, 1.339–1.922; PFDR = 4.19 × 10−7] and CeD (OR: 1.401; 95% CI, 1.139–1.722; PFDR = 2.03 × 10−3), respectively. Further sensitivity analyses validated the robustness of the above associations. The results of reverse MR analysis showed no evidence of reverse causality from T1D and CeD to the Bifidobacterium genus. Conclusion This study implied a causal relationship between the Bifidobacterium genus and T1D and CeD, thus providing novel insights into the gut microbiota-mediated development mechanism of ADs.


INTRODUCTION
Autoimmune diseases (ADs) are conditions in which an individual's immune system mistakenly attacks its host's tissues. Patients with ADs often endure lifelong debilitating symptoms, loss of organ function, reduced productivity at work, and high medical expenses. ADs are considered a significant cause of morbidity and mortality worldwide. Accumulating evidence demonstrates a steady rise in the incidence of ADs over the last few decades (1).
Although the etiology and pathogenesis of ADs are not fully understood, genetic components, environmental factors, and their interactions have great significance in their development. In addition, growing evidence suggests that alterations in gut microbiota composition are closely related to autoimmunity (2,3). The gut microbiota is defined as the community of microorganisms that live in the human gastrointestinal tract. Gut microbial dysbiosis has been observed in many AD studies. For example, multiple studies reported a decrease of Firmicutes/ Bacteroidetes ratio in systemic lupus erythematosus (SLE) patients and type 1 diabetes (T1D) patients (4,5). A casecontrol study reported an increased abundance of Methanobrevibacter and Akkermansia and decreased abundance of Butyricimonas in patients with multiple sclerosis (MS) (6). Chen et al. (7) found that rheumatoid arthritis (RA) patients had a decrease in Faecalibacterium and expansion of Eggerthella and Collinsella.
All the above gut microbiota-AD associations were derived from cross-sectional studies, leaving the causal nature of these associations elusive. However, establishing causal relationships not only deepens the understanding of gut microbiota-derived AD pathogenesis but also has the capacity to guide microbiotaorientated interventions against AD in the clinic. Therefore, there is an urgent need to elucidate the causal relationship between the gut microbiota and various types of AD.
Mendelian randomization (MR) is a statistical approach that implies causal association from an exposure to an outcome. It uses genetic variants associated with exposure as a surrogate for exposure to assess the association between the surrogate and the outcome (8). Thanks to fruitful findings from large-scale genome-wide association studies (GWASs) conducted to date at both gut microbiota and disease levels (9)(10)(11), MR analysis has been widely applied to various scenarios, including the causal associations between gut microbiota and AD. In previous studies, Garcıá-Santisteban et al. (12) performed an MR analysis and identified a causal association between gut microbiota composition and celiac disease (CeD). Another study by Inamo (13) identified no causal association between gut microbiota composition and RA. The above two studies fall short in that they treat gut microbiota composition as a whole without distinguishing specific taxa, while different microbial taxa may have distinct effects on human health. During the preparation of this article, Zhang et al. (14) and Xiang et al. (15) investigated the causal effects of specific microbial taxa on two ADs, inflammatory bowel disease (IBD) and SLE. However, studies on other ADs are still sparse.
In the present study, aiming to investigate the causal relationship between gut microbiota and a broad range of ADs, we conducted a comprehensive two-sample MR analysis of six ADs, including SLE, RA, IBD, MS, T1D, and CeD.

Ethics Statement
Our analysis used publicly available GWAS summary statistics. No new data were collected, and no new ethical approval was required. The flowchart of the study is shown in Figure 1. Briefly, gut microbiota served as the exposure, while ADs served as the outcome. Single-nucleotide polymorphisms (SNPs) significantly associated with specific gut microbiota taxa were selected as instrumental variables (IVs) based on strict inclusion and exclusion criteria. Outcome samples included both discovery and replication samples. A series of sensitivity analyses was performed for significant associations. Finally, reverse MR analysis was performed to mitigate the potential impact of ADs on the causal gut microbiota.

Gut Microbiota Sample
Summary statistics for gut microbial taxa were obtained from a large-scale multi-ethnic GWAS meta-analysis that included 18,340 individuals from 24 cohorts (16). The microbial composition was profiled by targeting three distinct variable regions of the 16S rRNA gene. To account for differences in sequencing depth, all datasets were rarefied to 10,000 reads per sample. Taxonomic classification was performed using direct taxonomic binning. In each cohort, only the taxa present in more than 10% of the samples were included to explore the effect of host genetics on the abundance of gut bacterial taxa. The study-wide cutoffs included an effective sample size of at least 3,000 individuals and presence in at least three cohorts. A total of 211 taxa (131 genera, 35 families, 20 orders, 16 classes, and 9 phyla) were included. After adjustment for age, sex, technical covariates, and genetic principal components, Spearman's correlation analysis was performed to identify genetic loci that affected the covariate-adjusted abundance of bacterial taxa. More details on the microbiota data were described elsewhere (16).

Autoimmune Disease Discovery Samples
In the discovery stage, GWAS summary statistics for each of the six ADs were extracted from publicly available GWAS analyses. Summary statistics for SLE were obtained from a publicly available GWAS meta-analysis, including 7,219 cases and 15,991 controls of European ancestry (17). Summary statistics for RA were extracted from a GWAS meta-analysis, including 14,361 RA cases and 43,923 controls of European ancestry from 18 studies (18). Summary statistics for IBD were obtained from a GWAS meta-analysis of 25,042 IBD cases and 34,915 controls of European ancestry (19). Summary statistics of MS were derived from the discovery stage of the latest GWAS meta-analysis of the International MS Genetics Consortium (IMSGC), including 14,802 MS cases and 26,703 controls of European ancestry (20). Summary statistics of T1D were derived from a GWAS with 6,683 T1D cases and 12,173 controls of European ancestry (21). Finally, summary statistics of CeD were obtained from a GWAS meta-analysis, including 12,041 CeD cases and 12,228 controls (22). Detailed information on the datasets is provided in Table 1.

Autoimmune Disease Replication Samples
Significant bacterial taxa identified in the discovery stage were replicated during the replication stage. The replication outcome samples for RA, IBD, MS, and T1D were obtained from the UK Biobank study, which is a large prospective cohort study with approximately 500,000 participants aged 40-69 years from 22 centers across the United Kingdom. The replication sample of SLE was a single GWAS from Spain, including 907 patients with SLE and 1,524 healthy controls (23). The replication sample for CeD is a GWAS meta-analysis of five samples, including 4,533 CeD cases and 10,750 controls of European ancestry (24). Detailed information on the replication samples is presented in Table 1.

Selection of Instrumental Variables
The 211 bacterial taxa were categorized at six taxonomic levels. Of these, the genus is the smallest and most specific taxonomic level. To identify each causal bacteria group as specifically as possible, we analyzed 131 bacterial taxa at the genus level only. Fourteen taxa with unknown groups were excluded, meaning 117 bacterial taxa were included in the subsequent MR analysis.
SNPs associated with gut bacterial taxa at the genome-wide significance threshold P < 5.0 × 10 −8 were selected as potential IVs. A series of quality control steps was implemented to select eligible IVs. First, SNPs with inconsistent alleles between the exposure and outcome samples (i.e., A/G vs. A/C) were excluded. Second, palindromic A/T or G/C alleles were excluded. Third, SNPs within each bacterial taxon were clumped to retain only independent SNPs. The linkage disequilibrium (LD) threshold for clumping was set to r 2 < 0.01, and the clumping window size was set to 500 kb. LD was estimated based on the Europeanbased 1,000 Genome Projects reference panel. Fourth, the MR pleiotropy residual sum and outlier (MR-PRESSO) test was applied to detect potential horizontal pleiotropy and to eliminate the effects of pleiotropy by removing outliers (25). Finally, to assess the strength of the selected SNPs, the following equation was used to calculate the F statistics for each bacterial taxon: where R 2 is the portion of exposure variance explained by the IVs, n is the sample size, and k is the number of IVs. An Fstatistic ≥10 indicates no strong evidence of weak instrument bias (26). IVs with F-statistics less than <10 were considered weak IVs and were excluded.

Statistical Analysis
We performed MR analysis to estimate the causal effect of the gut microbiota on the six ADs. For bacterial genera containing only one SNP, the Wald ratio method was used for the MR analysis. The causal effect was calculated by dividing the SNP-outcome effect estimated by the SNP-exposure effect estimate. For bacterial genera containing multiple SNPs, multiple tests, including fixed-/random-effects inverse variance weighted (IVW) test (27), weighted median method, and MR-Egger regression test were performed. Cochrane's Q test was performed to assess the heterogeneity among SNPs associated with each bacterial genus. In the presence of heterogeneity (P < 0.05), the random-effects IVW test was used instead to provide a more conservative but robust estimate. The weighted median test can generate consistent estimates when ≥50% of the weights come from valid IVs (28). The MR-Egger regression test allows pleiotropy present in more than 50% of IVs (29). Significant genera identified in the discovery samples were replicated in replication samples. The replication MR analysis procedure was the same as that used in the discovery stage. To evaluate the robustness of the identified causal associations, we performed two sensitivity analyses, including the MR-Egger intercept test and leave-one-out analysis. The intercept of the MR-Egger regression test can provide an estimate of the degree of directional pleiotropy (29). The leave-one-out analysis was performed to evaluate whether the significant results were driven by a single SNP.

Reverse Mendelian Randomization Analysis
To explore whether ADs have any causal impact on the identified significant bacterial genus, we also performed a reverse MR analysis (i.e., ADs as exposure and the identified causal bacterial genus as outcome) using SNPs that are associated with ADs as IVs.
All statistical analyses were conducted using R (version 4.0.3). The IVW, weighted median, and MR-Egger regression methods were performed using the "TwoSampleMR" package (version

Selection of Instrumental Variables
After a series of quality control steps, 32 SNPs associated with 13 genera were selected as IVs. Specifically, 19 independent SNPs (P < 5.0 × 10 −8 , r 2 < 0.01) were associated with 13 genera for SLE, 17 independent SNPs were associated with 12 genera for RA, 19 SNPs were associated with 13 genera for MS, 18 SNPs were associated with 12 genera for IBD, 7 SNPs were associated with 3 genera for T1D, and 6 SNPs were associated with 3 genera for CeD (Supplementary

Causal Effects of Gut Microbiota on Autoimmune Diseases
In the discovery stage, the genetically predicted relative abundance of two genera, Bifidobacterium and Ruminococcus, was associated with the risk of SLE, MS, T1D, and CeD.
Ruminococcus was also associated with the risk of IBD (  Table S2, there was no evidence of a causal association between any microbial taxa and RA. These two genera Bifidobacterium and Ruminococcus were replicated in the replication samples. The causal effects of the Bifidobacterium genus on T1D and CeD were successfully replicated, as shown in Table 3. The effect direction was consistent with that in the discovery sample, which strengthened the confidence of the true causal associations.

Sensitivity Analyses
No evidence of heterogeneity was observed between the genetic IVs for Bifidobacterium (Supplementary Table S3). None of the MR-Egger regression intercepts deviated from null, indicating no evidence of horizontal pleiotropy (all intercept P > 0.05) ( Supplementary Table S4). Additionally, the leaveone-out analysis showed no marked difference in causal estimations of Bifidobacterium on T1D and CeD, suggesting   that none of the identified causal associations were driven by any single IV (Figures 2, 3). In reverse MR analysis, there was no evidence of a causal effect of T1D and CeD on Bifidobacterium ( Table 4). Detailed information on the IVs used in the reverse MR analyses is shown in Supplementary  Table S5.

DISCUSSION
In this study, we performed two-sample MR analyses to investigate the causal association between gut microbiota and six common ADs (SLE, RA, MS, IBD, T1D, and CeD). Combining evidence from both discovery and replication samples, we identified that the bacterial genus Bifidobacterium was causally associated with the risk of T1D and CeD.
Bifidobacterium is the primary microbe that colonizes the human gut. Previous observational studies have demonstrated that Bifidobacterium plays an important role in the pathogenesis of multiple ADs. However, observational studies have yielded conflicting results regarding the effect pattern. Two case-control studies showed that the relative abundance of Bifidobacterium was higher in T1D patients than that in controls (30,31). Similarly, a higher relative abundance of Bifidobacterium was observed in patients with CeD (32). In line with these studies, our study suggested that the increased relative abundance of Bifidobacterium was causally associated with a higher risk of T1D and CeD, indicating its harmful effect on both diseases. In contrast, several other studies observed a lower relative abundance of Bifidobacterium in T1D and CeD patients, suggesting its protective effect (33)(34)(35).
Recent studies have shown that probiotic intervention, mainly of the Lactobacillus and Bifidobacterium genera, can effectively attenuate the progression of multiple ADs, including T1D and CeD. In a double-blinded, placebo-controlled trial, probiotic intervention with Bifidobacterium breve BR03 and B. breve B632 has shown a positive effect on decreasing the production of the pro-inflammatory cytokine tumor necrosis factor-a (TNF-a) in children with CeD on a gluten-free diet (36). In contrast, Smecuol et al. (37) did not detect significant changes in TNF-a in CeD patients treated with Bifidobacterium infantis. Similarly, Groele et al. (38) reported that administration of Lactobacillus rhamnosus GG and Bifidobacterium lactis Bb12 had no significant effect on maintaining the residual pancreatic beta-cell function in children with newly diagnosed T1D. There was no significant difference in cytokine levels and intestinal permeability (zonulin levels) between the probiotics and placebo groups (38).
Some functional studies have shown evidence of the antiinflammatory effects of Bifidobacterium, while others have reported its pro-inflammatory effects. A previous study showed that Bifidobacterium adolescentis significantly increased Th17 cell levels in several other gut-associated organs, while elevated Th17 cell responses have been associated with autoimmune/ inflammatory disease in both mice and humans (38). In addition, Loṕez et al. (39) reported that some Bifidobacterium bifidum strains could induce the secretion of large amounts of interleukin IL-17 and promote Th17 cell polarization. Combining evidence from observational studies, MR analysis, clinical trials, and functional studies, we speculated that the positive and negative effects of Bifidobacterium on ADs may be species-and strainspecific. The causal relationship between Bifidobacterium and ADs needs to be further explored at more specialized levels (i.e., species level and strain level).
In previous studies, Zhang et al. (14) and Xiang et al. (15) performed MR analyses to investigate the effects of gut microbiota on IBD and SLE, respectively. Our study differs from their studies in the following three aspects: First, our study is more comprehensive in its investigation of ADs. Unlike the above two studies that analyzed two separate diseases, we comprehensively analyzed six common AD diseases. This will give us an opportunity to evaluate common gut microbiota that are causally related to multiple ADs. Second, the quality control procedure for selecting IVs was stricter in our study. We selected independent GWAS SNPs as IVs and conducted a series of sensitivity analyses, including horizontal pleiotropy assessment and reverse MR analysis, to maximally fulfill basic MR assumptions. In contrast, the above two studies used a fairly loose P-value threshold (P < 1 × 10 −5 ) to select eligible IVs. Third, Zhang et al. (14) used summary-level data of gut microbiota in a relatively small sample size (N = 1,126 twin pairs). Instead, the sample size in the present study was much larger (N = 18,340). Meanwhile, the causal associations identified in the discovery stage were further replicated in independent replication outcome samples, which enhanced the confidence of the true causal relationship.
Nevertheless, our study had several limitations. First, while the majority of participants in the GWAS summary data used in our study were of European ancestry, a small number of the gut microbiota data were taken from sets consisting of other races, which may partially bias our estimates. Second, bacterial taxa were only analyzed at the genus level but not at a more specialized level such as species or strain levels. When microbiota GWASs use more advanced shotgun metagenomic sequencing analysis, the results can be more specific and accurate. Third, our study used gut microbiota data from a meta-analysis of mostly adult individuals, whereas the CeD study was conducted in children. Finally, most ADs are more prevalent in women than in men (e.g., SLE, RA, and MS). However, our study did not analyze the two genders separately, which may have influenced our results. It would be helpful to perform a gender-specific MR analysis in future endeavors.
In conclusion, our findings support the potentially causal effects of the Bifidobacterium genus on T1D and CeD. Although Bifidobacterium is generally considered beneficial bacteria, specific species and strains of Bifidobacterium may have varying effects on human health. Therefore, the potential mechanisms of specific species and strains of Bifidobacterium in the development of AD need to be further investigated.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
LZ and Y-FP designed the research. QX and J-JN collected the data. QX and BL analyzed the data. J-JN, B-XH, S-SY, X-TW, G-JF, and HZ performed the literature search. QX and BL drafted the article. Y-FP and LZ jointly supervised the study. All authors were involved in writing the paper. All authors contributed to the article and approved the submitted version.

FUNDING
This study was partially supported by funding from the National Natural Science Foundation of China (31771417) and a project funded by the Priority Academic Program Development (PAPD) of Jiangsu higher education institutions. The numerical calculations in this paper have been done on the supercomputing system of the National Supercomputing Center in Changsha. This study was supported by grants from Suzhou Science and Technology Bureau (No. SKY2021022),