Multi-Level Analyses of Genome-Wide Association Study to Reveal Significant Risk Genes and Pathways in Neuromyelitis Optica Spectrum Disorder

Background Neuromyelitis optica spectrum disorder (NMOSD) is an inflammatory disease of the central nervous system and it is understandable that environmental and genetic factors underlie the etiology of NMOSD. However, the susceptibility genes and associated pathways of NMOSD patients who are AQP4-Ab positive and negative have not been elucidated. Methods Secondary analysis from a NMOSD Genome-wide association study (GWAS) dataset originally published in 2018 (215 NMOSD cases and 1244 controls) was conducted to identify potential susceptibility genes and associated pathways in AQP4-positive and negative NMOSD patients, respectively (132 AQP4-positive and 83 AQP4-negative). Results In AQP4-positive NMOSD cases, five shared risk genes were obtained at chromosome 6 in AQP4-positive NMOSD cases by using more stringent p-Values in both methods (p < 0.05/16,532), comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4. Fifty potential susceptibility gene sets were determined and 12 significant KEGG pathways were identified. Sixty-seven biological process pathways, 32 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained from the GO annotations of the 128 pathways identified. In the AQP4 negative NMOSD group, no significant genes were obtained by using more stringent p-Values in both methods (p < 0.05/16,485). The 22 potential susceptibility gene sets were determined. There were no shared potential susceptibility genes between the AQP4-positive and negative groups, furthermore, four significant KEGG pathways were also identified. Of the GO annotations of the 165 pathways identified, 99 biological process pathways, 37 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. Conclusion The potential molecular mechanism underlying NMOSD may be related to proteins encoded by these novel genes in complements, antigen presentation, and immune regulation. The new results may represent an improved comprehension of the genetic and molecular mechanisms underlying NMOSD.


INTRODUCTION
Neuromyelitis optica spectrum disorder (NMOSD) is an inflammatory disease of the central nervous system that predominantly affects the spinal cord and optic nerves (Wingerchuk et al., 2015). The estimated prevalence is 0.5 to 10 persons (predominantly women) per population of 100,000 (Hor et al., 2020), especially high incidence in Asia, and with an incidence of 0.445/100,000 (0.433-0.457) in China (Tian et al., 2020). The pathogenic role of anti-aquaporin 4 autoantibody (AQP4-Ab) has been established in previous studies. Complement-dependent cytotoxicity and antibodydependent cell-mediated cytotoxicity have been reported (Wingerchuk et al., 2007). Progression in NMOSD research has led to the deduction that environmental and genetic factors underlie the etiology of NMOSD, environmental factors including changes in the diet, exposure to infections, stressful life events, seasonal variation and so on (Akaishi et al., 2020;Hor et al., 2020;Rafiee et al., 2020).
Recent studies on genome-wide association (GWAS) of NMOSD have amplified the understanding of NMOSD. Previous studies revealed that a common promoter single-nucleotide polymorphisms (SNP) in CYP7A1 has a protective/gene dosedependent effect on the risk of NMO (Kim et al., 2010). Recently, Estrada et al. (2018) identified two independent SNP (rs1150757 and rs28383224) in the major histocompatibility complex (MHC) region associated with AQP4-Ab positive NMOSD, however, the susceptibility genes and associated pathways of NMOSD patients with AQP4-Ab positive and negative have not been elucidated.
As a useful complementary approach for SNP-GWAS, genebased tests play an increasing role in genetics research. Studies have shown that gene-based tests can be more powerful than SNP-GWAS approach and suitable for integrating the results of pathway tests from GWAS (Liu et al., 2010). Secondary analysis from a NMOSD-GWAS dataset originally published in 2018 (215 NMOSD cases and 1244 controls) was conducted to identify potential susceptibility genes and associated pathways of NMOSD (Estrada et al., 2018). The original data was obtained at the NHGRI-EBI GWAS Catalog 1 . In phase I, more stringent p-Values were used to determine the intersection of two new gene-based tests for the GWAS approach, called Versatile Genebased Association Study-2 version 2 (VEGAS2) and PLINK, which were used to conduct a gene-based association study in AQP4-positive (p < 0.05/16,532) and negative (p < 0.05/16,485) NMOSD patients, respectively. Additionally, for exploring more potential genes, we used p < 0.05 to get the share genes of these two methods. In phase II, gene sets identified with meta p < 0.05/n ("n" represents the number of genes shared by the VEGAS2 and PLINK tests) were carried forward to the next phase. In phase III, protein-protein association networks were performed by STRING, meanwhile, KEGG pathways and GO analysis were conducted for the NMOSD susceptibility genes to gain further insights into the genes (Figure 1).
In AQP4-positive NMOSD cases, where more stringent p-Values were used in both methods, five shared risk genes were obtained at chromosome 6 (p < 0.05/16,532), comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4. Furthermore, we determined 50 potential susceptibility gene sets of AQP4positive NMOSD, and the top 5 of these genes were the same as the genes obtained using more stringent p-Values after which 12 significant KEGG pathways (p < 0.05) were identified. In the GO annotations of the 128 pathways identified, 67 biological process pathways, 32 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. No significant genes were obtained by using more stringent p-Values in both methods in the AQP4-negative NMOSD group (p < 0.05/16,485). The 22 potential susceptibility gene sets were determined. There were no shared potential susceptibility genes between the AQP4-positive and negative groups, furthermore, four significant KEGG pathways were also identified. Of the GO annotations of 165 pathways identified, 99 biological process pathways, 37 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. In short, the new results may represent significant steps toward defining the genetic mechanism underlying the association of NMOSD.

Samples
The original data were obtained from the NHGRI-EBI GWAS Catalog see text footnote 1. The large-scale NMOSD-GWAS dataset that comprises 215 NMOSD cases (132 AQP4-positive and 83 AQP4-negative) and 1244 controls of European ancestry was analyzed. The originally published data in 2018 used 2006 NMO diagnostic criteria that require optic neuritis and transverse myelitis plus two of the following three supportive elements: (1) longitudinally extensive lesions (≥3 vertebral segments in length); (2) magnetic resonance imaging of the brain with normal findings or with findings inconsistent with MS; and (3) NMO-IgG seropositivity. After subjecting the dataset to certain qualitycontrol methods, 7,138,498 autosomal SNPs were available for genetic analysis. Please refer to the original text for more details (Estrada et al., 2018). To investigate whether there are genetic differences between AQP4-positive and AQP4 negative NMOSD, gene-based association study in AQP4-positive and negative NMOSD patients was conducted.

Gene-based test
The first gene-based test method used was a versatile genebased association study (VEGAS2 2 ) developed by Aniket Mishra and Stuart Macgregor (Mishra and Macgregor, 2015). VEGAS2 distributes SNPs to every gene of 17,787 autosomes in terms of the location on the UCSC Genome Browser hg18 group. The method integrates the overall information of the gene and its SNPs while also illustrating their details such as the sizes of genes and the linkage disequilibrium (LD) between SNPs. ± 50 kb of 5 and 3 . UTRs, which is defined as the inclusion of SNPs screening within genes. Subsequently, the association p-Values of SNPs within a given gene are converted to upper-tail chi-squared statistics with one degree of freedom (df) (Liu et al., 2010). The statistic of this gene-based test is the total of all χ 2 1 df statistics in the given gene. This method uses the multivariate normal simulations to illustrate the LD structure of SNPs within the given gene with reference to the HapMap2 genotype information (Liu et al., 2010). Simulations of 10 3 are first processed. If the out-coming empirical p-Value is less than 0.1 then 10 4 simulations will be processed. If the empirical p-Value of 10 4 simulations is less than 0.001, the process will perform 10 6 simulations. Due to the calculated reasons, no more simulations will be processed if the empirical p-Value is 0. More stringent p-Values have been used for both methods in AQP4-positive NMOSD cases (p < 0.05/16,532) and in the AQP4-negative NMOSD group (p < 0.05/16,485). For exploring more potential genes, p < 0.05 was used to get the intersection of two methods and an empirical p-Value of 0 from 10 6 simulations can be explained as p < 10 −6 , which surpasses p-Value < 2.8 10 −6 (≈0.05/17,787(the autosomal genes' amount)) (Liu et al., 2010). Gene-based, test method based, user-friendly bioinformatics tools, like PLINK which was a set-based analysis, were used and this method combined a meta-analysis process and PLINK to calculate the data of GWAS (Moskvina et al., 2011). The theoretical description of this method is detailed in the previously published 2 https://vegas2.qimrberghofer.edu.au/ FIGURE 1 | Flow diagram of the three-phase analysis design. In phase I, two new gene-based tests for the GWAS approach, called VEGAS2 and PLINK, were used to conduct a gene-based association study. In phase II, Gene sets identified with meta p < 0.05/n were carried forward to the next phase. In phase III, protein-protein association networks were performed by STRING. Meanwhile, KEGG pathways and GO analysis were carried out for these NMOSD susceptibility genes. VEGAS2 and PLINK are two kinds of new gene-based tests. "n" represents the number of genes shared by the VEGAS2 and PLINK tests.
literature (Liu et al., 2017). The PLINK is an opensource C/C++ whole-genome association study software 3 (Purcell et al., 2007). Based on the HapMap data available in PLINK, "plink -bfile hapmap_CEU_r23a -set-screen nmo.txt -make-set glist.dat" was conducted for gene-wise calculations (Moskvina et al., 2011). In the code above, "nmo.txt" is the file with SNP IDs and their p-Value in NMO GWAS data, other files are provided by the software (Moskvina et al., 2011).

Meta-analysis
Fisher's method was used to integrate the p-Values of common risk genes derived from two gene-based tests. Fisher's method is a simple meta-analysis approach for significance computation (Kippola and Santorico, 2010). The statistics of Fisher's method are computed as: In 2-1, i is the number of studies, p i is the p-Values of the gene and k is the sum count of studies. This formula follows the chi-squared distribution with 2 kdf. 3. Protein-protein association networks performed by string STRING (version 11) was performed for functional partnerships and interactions between the proteins encoded by the overlapped NMOSD genes by two gene-based tests 4 . The STRING database plans on assembling and integrating the proteins association information (Szklarczyk et al., 2017). The subtypes of associations in STRING are direct (physical) and indirect (functional) interactions. The source of the STRING database includes assembling and evaluating obtainable tested data from known protein complexes and pathways, co-expression study, inter-genomes shared selective signals, scientific article text-mining, and interaction knowledge calculated transfer between organisms (Szklarczyk et al., 2017). The association network of STRING provides much information of proteins including annotations, cross-links, domain structures, 3D protein structures, and their types of interaction (Szklarczyk et al., 2015). The new version (v11) of STRING updates the organisms' amount to 5,090 and increased the capacity of the input to the genome-wide level (Szklarczyk et al., 2019). In a study of the genome-wide dataset, the subsets can be visualized as association networks and conducted as enrichment analysis. Furthermore, STRING actualized novel classification systems for enrichment analysis in addition to classical systems such as Gene Ontology and KEGG (Szklarczyk et al., 2019).

Pathway-based test
In the study, widely used online assessment analysis tools were highlighted for enrichment analysis, WebGestalt 2019 5 (Liao et al., 2019). A hypergeometric test was conducted to explore an over-representation of the screened genes among the genes in a given pathway (Wang et al., 2013). In this pathway, the p-Values of more than J disease-related genes were observed and computed by the following formula: In 4-1, m is the total number of target genes related to a given disease, N is the number of reference genes, S is the number of genes in the designated pathways. The pathway selection interval was set between 20 and 300 genes to avoid bias caused by testing an extremely narrow or broad enrichment of genes on the pathway (Liao et al., 2019). According to the result, the pathways with p-Value < 0.05 were regarded as risk pathways.

Gene-Based Test of NMOSD GWAS Dataset
In the AQP4-positive NMOSD cases, 6,804,718 NMOSD SNPs were mapped to 21,275 and 16532 genes, respectively, using VEGAS2 and PLINK. More stringent p-Values were used in both methods, five shared risk genes were obtained at chromosome 6 (p < 0.05/16,532), comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4 (Table 1 and detailed results are listed in Supplementary Table 1). For exploring more potential genes, p < 0.05 was used to get the intersection of two methods after which 961 genes were identified using VEGAS2, and 544 genes using PLINK with a significance of p < 0.05, 315 common genes were revealed by VEGAS2 and PLINK (Supplementary Table 2), of which 50 gene sets passed the Bonferroni-corrected statistical test at a significance of meta p < 1.59 × 10 −4 (p = 0.05/315) (Supplementary Table 2). The 50 potential susceptibility gene sets of AQP4-positive NMOSD are shown in Table 2 and the top 5 of these genes were same with the ones obtained with more stringent p-Values (p < 0.05/16,532).
In the AQP4-negative NMOSD group, 6,423,746 NMOSD SNPs were mapped to 21,170 and 16485 genes, respectively using VEGAS2 and PLINK. More stringent p-Values were used in both  Frontiers in Genetics | www.frontiersin.org methods, no significant genes were determined (p < 0.05/16,485).
To explore more potential genes, p < 0.05 was used to get the intersection of two methods after which 988 genes were identified using VEGAS2 and 473 genes using PLINK with a significance of p < 0.05, 305 common genes were identified (Supplementary Table 2) of which 22 gene sets passed the Bonferroni-corrected statistical test at a significance of meta p < 1.63 × 10 −4 (p = 0.05/305). The 22 potential susceptibility gene sets of AQP4 negative NMOSD are shown in Table 3. Shared susceptibility genes between the AQP4-positive and negative groups were absent.

Protein-Protein Association Network Analysis of NMOSD Susceptibility Genes
To further verify these findings, a protein-protein interaction network analysis was conducted for AQP4-positive and negative NMOSD potential susceptibility genes. Interestingly, significant connectivity among proteins encoded by these NMOSD susceptibility genes was highlighted according to the protein-protein interaction network in STRING (version 11.0) (Figures 2, 3).

DISCUSSION
Progression in NMOSD research has led to the deduction that environmental and genetic factors underlie the etiology of NMOSD. NMOSD genetic risk is still required to be identified to date. A NMOSD-GWAS dataset comprising 215 NMOSD cases (132 AQP4-positive and 83 AQP4-negative) and 1244 control cases of European ancestry which contain 7,138,498 autosomal SNPs was chosen as regards to the present study. As a useful complementary approach to per SNP-GWAS, gene-based tests play many roles in genetics research. Studies have shown that gene-based to be more powerful than the per SNP-GWAS approach, also, they are suitable for integrating the results of pathway tests from GWAS (Liu et al., 2010).
Recently, Estrada et al. (2018) identified two independent single nucleotide polymorphisms (SNPs) (rs1150757 and rs28383224) in the MHC region associated with AQP4-Ab positive NMOSD. For further verification and as a supplement to Estrada et al's. (2018) research based on SNP analysis, our study is based on gene analysis and focused on functional comparison. In the study, more stringent p-Values were used in both methods and five shared risk genes were obtained at chromosome 6 in AQP4-positive NMOSD cases (p < 0.05/16,532), while shared risk genes in the AQP4 negative NMOSD group were absent (p < 0.05/16,485). Further, there were 50 and 22 potential susceptibility genes in AQP4-positive and negative NMOSD, respectively. Combining the protein-protein association network analysis and pathway tests, focus was on susceptibility genes that were not only significant in gene-based tests, but also meaningful in the functional analyses. In AQP4-positive NMOSD group, the five risk genes comprised CFB associated with complements, HLA-DQA1 associated with antigen processing, and EHMT2, MSH5, and SLC44A4 associated with immune regulation. Complement factor B (CFB) localizes to the MHC class III region on chromosome 6 and is a component of the alternative pathway of complement activation. Upon activation of the alternative pathway, CFB is cleaved after which it yields the noncatalytic chain Ba and the catalytic subunit Bb. The active subunit Bb is a serine protease which associates with C3b to form the alternative pathway C3 convertase. Bb is involved in the proliferation of preactivated B lymphocytes, while Ba inhibits their proliferation. Previous studies suggest that CFB may contribute to SLE (Gateva et al., 2009). Similarly, the analysis showed that CFB may be involved in the pathogenesis of NMOSD. Using the KEGG pathways analysis, it was determined that complement and coagulation cascades are associated with C2-CFB locus, previous studies showed AQP4 antibodies and complement-mediated damage are associated with NMOSD. Additionally, terminal complement inhibitor therapy of NMOSD has been proposed and proved effective among AQP4-IgG-positive NMOSD patients (Pittock et al., 2019). Similarly, the GO analysis demonstrated that regulation of humoral immune response (GO:0002920), protein processing (GO:0016485), regulation of complement activation (GO:0030449), regulation of protein activation cascade (GO:2000257) and complement activation (GO:0006956) are associated with C2-CFB locus. Based on molecular function, serine-type endopeptidase activity (GO:0004252), serine-type peptidase activity (GO:0008236) and serine hydrolase activity (GO:0017171) are associated with C2-CFB locus. Hence, our analysis shows that complements associated with C2-CFB locus play vital roles in the pathogenesis of NMOSD. However, in Estrada et al's. (2018) study, C4 CNV was associated with NMO-IgG+ but not NMO-IgG-, which is slightly different.
HLA-DQA1 (MHC class II, DQ alpha 1) gene belongs to a group of MHC genes that encode the HLA-DQ heterodimer. Previous studies indicated that HLA-DQA1 is strongly associated with SLE (Demirci et al., 2016), systemic sclerosis (SSc) (Guo et al., 2016) and anticitrullinated protein antibodies (ACPA)positive RA (Guo et al., 2019). Estrada et al's. (2018) study demonstrated that HLA-DQA1 (rs28383224) was shared between AQP4-positive and negative NMOSD, suggesting that at least one common genetic determinant exists for both groups. However, in our study, there were no shared genes between the AQP4positive and negative group. Given the complexity of the MHC region, larger studies will be needed to determine the role of HLA alleles to understand the effect of these haplotypes on the NMOSD subgroup.
In AQP4-positive NMOSD, SLE (hsa05322) was the most significant pathway (p = 6.62E-05) associated with HLA-DQA1, HIST1H2BL, HIST1H2AL, C2, and HIST1H3J locus. Other immune diseases, such as RA (hsa05323) were associated with HLA-DQA1 and ATP6V1G2 locus and it was also revealed that autoimmune diseases may coexist with NMOSD or share common pathological pathways. In previous studies, NMOSD with AQP4-IgG + has been shown to be associated with a high frequency of autoantibodies and autoimmune diseases FIGURE 2 | Network of known and predicted interactions between proteins encoded by NMOSD potential susceptibility genes were identified by GWAS of AQP4-positive NMOSD. Network nodes represent the proteins produced by a single, protein-coding gene locus. Edges represent protein-protein associations meant to be specific and meaningful. In STRING, blue edge represents protein-protein associations with known interactions from curated databases, purple edge represents protein-protein associations with known interactions experimentally determined, green edge represents protein-protein associations with predicted interactions with gene neighborhood, red edge represents protein-protein associations with predicted interactions with gene fusions, Navy blue edge represents protein-protein associations with predicted interactions with gene co-occurrence, and other associations with text mining (yellow edge), co-expression (black edge), and protein homology (lavender edge).
including SLE (Asgari et al., 2018), RA, Sjogren's syndrome (SS) (Pittock et al., 2008), myasthenia gravis (MG) (McKeon et al., 2009), and antiphospholipid syndrome (APS) (Guerra et al., 2018). Regarding Estrada et al's. (2018) study, NMO-IgG+ is genetically similar to SLE. Additionally, KEGG pathway analysis demonstrated staphylococcus (S.) aureus infection (hsa05150) to be associated with C2, CFB, and HLA-DQA1 locus was the second most significant signal (p = 8.17E-04) in AQP4-positive NMOSD. Previous studies showed that superantigens (SAgs) might be potent T cell activators, and it causes the deregulation of the immune response resulting in the proliferation of autoreactive T cells and the development and/or exacerbation of chronic autoimmune diseases (Foster, 2005). Among bacterial superantigens, the relevance of S. aureus superantigens with MS exacerbation has been depicted (Mulvey et al., 2011;Libbey et al., 2014). Kumar (Kumar et al., 2015). Our study revealed that the infection disease pathway might be involved in the pathogenesis of AQP4positive NMOSD.
The MSH5 (MutS homolog 5) gene located in MHC class III region comprises 26 exons and spans 25 kb and is involved in DNA mismatch repair and meiotic recombination (Clark et al., 2013). Prior studies confirmed MSH5 as susceptibility loci for SLE, SS, and MS (Song et al., 2013;Demirci et al., 2016;Shen et al., 2019). Additionally, Fernando et al. determined the presence of risk and protective signals in and surrounding MSH5 (best risk SNP rs3130490; best protective SNP rs409558) in SLE (Fernando et al., 2012). According to the GO analysis, DNA repair complex (GO:1990391) and FIGURE 3 | Network of known and predicted interactions between proteins encoded by NMOSD potential susceptibility genes were identified by GWAS of AQP4-negative NMOSD.  (Clark et al., 2020). Prior studies identified aberrant expression of apoptosis and DNA damage regulatory genes in MS, suggesting that DNA methylation may be effective in neuronal loss in RRMS (Gökdoǧan Edgünlü et al., 2020). Consistent with previous studies, MSH5 might contribute to the pathogenesis of NMOSD. Euchromatic Histone Lysine Methyltransferase 2 (EHMT2) is a methyltransferase and is involved with the demethylation of histone H3 at lysine 9 (H3K9) (Fiszbein et al., 2016). In vitro studies presented that EHMT2 promoted neuronal and immature oligodendrocyte differentiation was required for oligodendrocyte maturation. In previous studies, the methylation of histone H3K9, catalyzed by EHMT1 and EHMT2, was most depleted in patients with anti-neutrophil cytoplasmic autoantibody (ANCA)-associated vasculitis (AAV) (Yang et al., 2016). Ding et al. presented EHMT2 as a potent positive regulator of FOXP3 expression, playing essential roles in controlling immune responses in autoimmune diseases (Ding et al., 2019). A recent analysis observed higher C3 and lower EHMT2 plasma levels related to increasing brain atrophy in MS patients (Malekzadeh et al., 2019). The EHMT2 inhibitor triggered the inhibition of human diffuse large B-cell lymphoma cell proliferation leading to G1 phase arrest and induced apoptosis via endogenous and exogenous apoptotic pathways (Xu et al., 2021). The histone modification is a critical factor in regulating gene expression. Recent studies in autoimmune diseases suggest that increased expression of TLR2 in CD4 + T cells enhances immune reactivity and promotes IL-17 expression through upregulating H3K4 while downregulating H3K9 tri-methylation level in SLE (Liu et al., 2015). Additionally, Ding et al. discovered that increased B cell lymphoma 6 protein upregulates H3K27 trimethylation and downregulates H3K9/H3K14 acetylation of the MicroRNA-142 promoter in SLE CD4 + T cells (Ding et al., 2020). Araki et al. suggests that the histone lysine methylation (HKM) modifying enzymes results in changes of the gene expression of RA synovial fibroblasts (Araki et al., 2018). With respect to the GO analysis, histone H3K9 methylation (GO:0051567) and histone methyltransferase activity (H3K27 and H3K9) (GO:0046976 and GO:0046976), associated with EHMT2 locus, might be involved in the pathogenies of NMOSD.
The SLC44A4 (solute carrier family 44 member 4) gene located at 6p21.33 encodes human thiamine pyrophosphate transporter (TPPT), which is involved in the uptake of choline by cholinergic neurons. The SLC44A4 gene was reported to be associated with ulcerative colitis (UC) susceptibility in the Indian, Japanese, and Chinese populations (Gupta and Thelma, 2016;Wu et al., 2017).
There were no shared potential susceptibility genes between the AQP4-positive and negative group. Given data limitations and the complexity of the MHC region, Further studies will be needed to understand the genetic and molecular mechanism underlying NMOSD.

CONCLUSION
In AQP4-positive NMOSD cases, five shared risk genes were obtained at chromosome 6 using stringent p-Values in both methods (p < 0.05/16,532) comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4. The 50 potential susceptibility gene sets were determined and 12 significant KEGG pathways were identified. In GO annotations of the 128 pathways identified, 67 biological process pathways, 32 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. There were no shared potential susceptibility genes between the AQP4-positive and negative group, also 4 significant KEGG pathways were identified. In the GO annotations of the 165 pathways identified, 99 biological process pathways, 37 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. The potential molecular mechanism underlying NMOSD may be related to proteins encoded by the novel genes in complements, antigen presentation, and immune regulation. The results may represent an improved understanding of the genetic and molecular mechanism underlying NMOSD.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
TL, HL, and C-SY conceived and designed the study for NMOSD. TL and HL analyzed the GWAS data and wrote the manuscript. LY and F-DS were responsible for subject guidance. C-SY was responsible for research supervision and manuscript revision. YL, S-AD, MY, Q-XZ, and BF provided technical support. All authors approved the final version for submission.

FUNDING
This work was supported by the Youth Incubation Fund of Tianjin Medical University General Hospital (ZYYFY2019008) and the National Key Clinical Specialty Construction Project of China.