Association Between FoxO1, A2M, and TGF-β1, Environmental Factors, and Major Depressive Disorder

Introduction Investigations of gene-environment (G×E) interactions in major depressive disorder (MDD) have been limited to hypothesis testing of candidate genes while poly-gene-environmental causation has not been adequately address. To this end, the present study analyzed the association between three candidate genes, two environmental factors, and MDD using a hypothesis-free testing approach. Methods A logistic regression model was used to analyze interaction effects; a hierarchical regression model was used to evaluate the effects of different genotypes and the dose-response effects of the environment; genetic risk score (GRS) was used to estimate the cumulative contribution of genetic factors to MDD; and protein-protein interaction (PPI) analyses were carried out to evaluate the relationship between candidate genes and top MDD susceptibility genes. Results Allelic association analyses revealed significant effects of the interaction between the candidate genes Forkhead box (Fox)O1, α2-macroglobulin (A2M), and transforming growth factor (TGF)-β1 genes and the environment on MDD. Gene-gene (G×G) and gene-gene-environment (G×G×E) interactions in MDD were also included in the model. Hierarchical regression analysis showed that the effect of environmental factors on MDD was greater in homozygous than in heterozygous mutant genotypes of the FoxO1 and TGF-β1 genes; a dose-response effect between environment and MDD on genotypes was also included in this model. Haplotype analyses revealed significant global and individual effects of haplotypes on MDD in the whole sample as well as in subgroups. There was a significant association between GRS and MDD (P = 0.029) and a GRS and environment interaction effect on MDD (P = 0.009). Candidate and top susceptibility genes were connected in PPI networks. Conclusions FoxO1, A2M, and TGF-β1 interact with environmental factors and with each other in MDD. Multi-factorial G×E interactions may be responsible for a higher explained variance and may be associated with causal factors and mechanisms that could inform new diagnosis and therapeutic strategies, which can contribute to the personalized medicine of MDD.


INTRODUCTION
Major depressive disorder (MDD) is the most common psychiatric disorder and is associated with high morbidity, mortality, costs, and risk of suicide (1)(2)(3). The heritability of MDD is approximately 30%-40% (4) and may be greater for recurrent, early-onset disease (5). Establishing the contribution of genetics to MDD can lead to more accurate and timely diagnosis and clinical management, which has significant public health implications.
Studies on main genetic effects in MDD have reported conflicting findings (6). One reason for this is that risk of MDD is polygenic, with many susceptibility genes exerting small effects (7). Additionally, genotypic variations among individuals may increase the risk of MDD only upon exposure to adverse environmental factors, a phenomenon known as gene-environment (G×E) interaction (8). The association between candidate susceptibility genes and disease occurrence has mainly been predicated on the observation that genetic variants not only increase MDD risk but can also explain the underlying biological and molecular mechanisms (9). However, such hypothesis-driven approaches to the study of MDD etiology can only detect a fraction of genetic variants (8). On the other hand, a hypothesis-free approach requires enormous datasets that include both exposure [e.g., stressful life events (10,11) and childhood adversities (9)] and phenotype (MDD) data. To this end, only three genome-wide environmental interaction studies has been carried out (12)(13)(14). In a recent report, a novel omicsbased approach was used to identify genetic variants for genomewide G×E interaction studies based on cross-species and -tissue biological prioritization strategies. However, genomics approaches usually cannot identify causal variants or genes (15). Integrating multiple omics approaches can provide a more comprehensive view of the biology of MDD.
Three candidate MDD susceptibility genes [Forkhead box (Fox) O1, transforming growth factor (TGF)-b1, and a2 macroglobulin (A2M)] that are ideal candidates for G×E interaction analyses were previously identified using an omics-based approach. It has been suggested that environmental and genetic factors contribute equally to the development of mental illness (16). Distal environmental risk factors such as family history (FH) and culture are important because they increase vulnerability to proximal factors; (17) however, the latter are more relevant for G×E investigations since they are more likely to meet the criteria for risk factors and lend themselves to biologically plausible hypotheses regarding their effects on specific neural systems that underlie psychopathological symptoms.
Genome-wide association studies (GWAS) have identified more than 100 genetic variants associated with MDD (18). Each of these has a small effect and the number of associated genetic markers increases with sample size. However, the utility of individual genetic markers for MDD risk prediction is uncertain (19). A multilocus genetic risk score (GRS)-based analysis has been proposed for combining the relatively small effects of single genes to better assess the complex relationship between genetic markers and MDD (20). GRS analyses have shown that including more weakly associated genetic variants improves the prediction of mental illness risk (21).
Proteins interact to mediate many cellular processes; disrupting one subunit of a protein complex can lead to direct and indirect functional consequences in various diseases and physiological conditions (22). For a specific illness, interactions between proteins encoded by susceptibility genes tend to be more frequent than those between random proteins, as revealed by protein-protein interaction (PPI) network analyses (22).
In the present study, we investigated the association between FoxO1, TGF-b1, and A2M genes and MDD in a large population. G×E, G×G, G×G×E, and environment × environment (E×E) interactions were analyzed in the context of MDD. We also assessed haplotypes of these three genes and the relationship between haplotype and environment. We used a GRS approach to calculate the combined effects of these three genes on the prediction of MDD and the GRS-environment interaction (GRS×E) in MDD. Finally, we examined FoxO1, TGF-b1, and A2M PPI networks to clarify their interactions with each other and with other MDD-related proteins.

Study Population
From November 2014 to December 2018, 800 MDD patients (564 women and 236 men) were recruited for the study (mean age: 45.64 ± 14.10 years) along with 800 age-and sex-matched control subjects from the same geographic area in Northern China. All subjects were of Chinese Han ethnicity and provided written, informed consent before participating in the study. The study protocol was approved by the ethics committee of Harbin Medical University.

Independent Measures
Participants completed three questionnaires: a sociodemographic questionnaire, the Chinese version of the 24-item Hamilton Rating Scale for Depression (HRSD-24), and the Life Events Scale (LES). The self-rating socio-demographic questionnaire was adapted from the version developed by the Epidemiology Department of Harbin Medical University and was used to collect detailed background information on family psychiatric history, childhood trauma history, and socioeconomic background. The HRSD-24 (23) is widely used to measure depression symptoms; (24-26) a number of patients above the threshold (21 points) were included in the study. The LES questionnaire for measuring negative life events contained 48 items in three dimensions: family life (28 items), work-related problems (13 items), and social and other aspects (seven items). The LES was scored based on the occurrence/absence (1 and 1, respectively) and frequency (0 or more) of SLEs (26,27).

Statistical Analysis
Haploview v.4.0 software was used to assess Hardy-Weinberg Equilibrium (HWE) and pairwise linkage disequilibrium (LD) and calculate minimal allele frequency (MAF) for genotyped polymorphisms (28). The c 2 test/Fisher's exact test was used to analyze differences in the distributions of independent variables. The Bonferroni method was adopted for multiple-testing correction. Associations between phenotype and independent variables were analyzed by multivariable logistic regression, with polymorphisms scored as 0, 1, or 2 depending on the carrier status of the minor allele, sex, family history (FH), occurrence/ absence or number of SLEs and CAs, and interaction effects (E×E/G×E/G×G/G×G×E/GRS×E). Allele-carrier status and dose-response effect of environmental factors were also determined. Hierarchical regression analysis was performed using RStudio v.1.1.423 software. Alpha levels were corrected with the number of polymorphisms of candidate genes (five polymorphisms for FoxO1 and TGF-b1; six polymorphisms for A2M), so that the p values correction was 0.05/5 = 0.01 for FoxO1 and TGF-b1, 0.05/6 = 0.0083 for A2M. Study power was calculated with QUANTO 1.2.4 (http://hydra.usc.edu/gxe/).
Haplotype analysis was performed using UNPHASED v.3.0.11 software (29). Maximum likelihood haplotype frequencies in the study population were assessed with the expectation maximization algorithm. Rare haplotypes with a frequency < 1% were excluded from analyses. The global association and effect of individual haplotypes on phenotype were analyzed in the total sample and in four subgroups classified according to positive/negative environmental factors (SLEs/ CAs). The significance level of the global association was determined with the likelihood ratio test. Individual effects of each haplotype, i.e., the difference in effect between a haplotype and all others pooled together, were computed with the score test. A permutation analysis was conducted to assess the reliability of the results; 1,000 random permutations were set to generate empirical P values. Permuted P < 0.05 was considered significant in the haplotype analyses.

GRS Analysis
To estimate the cumulative contribution of genetic factors to MDD in an individual by taking into account the reported risk alleles, we used the predictABEL package in RStudio software to compute weighted GRS with the following formula: where GRS is the sum of the effect estimates, k is the number of independent genetic variants with strong association as risk predictors, b i is the weighted coefficient from logistic regression analysis, and N i is the number of risk alleles for each locus. The association between GRS and MDD was evaluated by logistic regression analysis of FH, sex, GRS, SLEs/ CAs, and the interaction between GRS and SLEs/CAs. internationalgenome.org/data-portal/sample). STRUCTURE assumes that there are K populations in the dataset. The admixture and correlated frequency models had a burn-in length of 10,000 and 10,000 Markov chain Monte Carlo repeats and took into consideration immigration and geography-based genetic isolation. The program was run several times at each K value from 2 to 6 to obtain consistent results.

PPI Analysis
We investigated whether FoxO1, A2M, and TGF-b1 genes interact with each other and are involved in the PPI network containing protein products of the top MDD susceptibility genes identified by a recent GWAS (30)(31)(32). The PPI network comprises nodes and edges representing protein and physical interactions, respectively. Proteins encoded by MDD susceptibility genes were used as seed proteins. STRING v.11.0 software (https://string-db.org/cgi/input.pl) (33) was used to reconstruct the PPI network.

Descriptive Statistics
Demographic information on the study population is shown in Table 1. Genetic markers were successfully genotyped at a rate > 95%. No significant deviation from HWE was observed, and MAF was > 5% for each polymorphism (Supplemental Table  S1). Pairwise LD D' values are shown in Figure 1.

Interaction Between SLEs and Candidate Gene Genotypes in MDD
The two FoxO1 (rs17592371 and rs28553411), three A2M (rs10842847, rs10842849, and rs226415), and two TGF-b1 (rs12462166 and rs12983775) gene polymorphisms showed significant interactions with environmental factors ( Table 3).     That is, high exposure to SLEs was related to an increase in MDD risk ( Figure 2B).
We evaluated the G×G interaction between FoxO1 rs17592371 and rs28553411, between the TGF-b1 rs12462166 and rs12983775, and among A2M rs10842847, rs10842849, and rs226415 as well as the three-way interaction between the two genetic markers that were significant in the G×G interaction analysis and LES ( Table 3). The interaction between the two polymorphisms of the  Table 3). There results remained significant after controlling for family history and sex in the regression model. The effect of E×E interaction on MDD was significant (b = 0.746, SE = 0.312, z = 2.392, P = 0.016).

Haplotype Analysis
Haplotype analysis was performed for the total sample and subgroups. In the former, 12 haplotypes of five FoxO1 gene polymorphisms, 15 haplotypes of five TGF-b1 gene polymorphisms, and 14 haplotypes of A2M gene polymorphisms had a frequency > 1%. The haplotypes of the FoxO1 gene showed a significant global association (c 2 = 19.732, df = 13, P global < 0.001) and the less common haplotype (G-A-G-C-T-A) of the A2M gene was significantly associated with MDD (c 2 = 4.931, P effect = 0.026). In subgroup samples, haplotypes of the A2M gene within the SLE occurrence group (c 2 = 21.267, df = 11, P global = 0.030) and haplotypes of FoxO1 (c 2 = 41.792, df = 11, P global < 0.001) and TGF-b1 (c 2 = 34.531, df = 14, P global = 0.001) genes within the CA occurrence group were significant in the global association. Two less common haplotypes (G-A-A-C-G-G and G-A-G-C-T-A) of the A2M gene within the SLE occurrence group (c 2 = 5.815, P effect = 0.015; c 2 = 5.814, P effect = 0.015) and a less common A2M haplotype (G-A-G-C-T-A) in the CA occurrence group (c 2 = 3.947, P effect = 0.046) were associated with MDD.

Population Stratification Analysis
The combined population of YRI, CHB, and CEU exhibited obvious stratification (Supplemental Figure S1A). In the triangle chart with K = 3, each angle represented a potentially independent ancestry and the different colored dots represented individuals in assumed population components that did not cluster with the same color in the triangle, indicating that there was no significant stratification in our sample (Supplemental Figure S1B). Consistent results were obtained with K values ranging from two to six. Thus, population stratification was unlikely to be a confounding factor in this study.

DISCUSSION
The interaction between genetic markers and environment determines the risk of MDD (8). However, most hypothesisdriven studies of MDD to date have not produced reproducible findings. In this study, we investigated candidate MDD risk genes using a hypothesis-free, omics-based, cross-tissue and -species approach. We found that FoxO1 rs17592371 and rs2297626, A2M rs669 and rs226415, and TGF-b1 rs12462166 alleles were associated with MDD. We also found that FoxO1 rs17592371 and rs28553411, A2M rs10842847, rs10842849, and rs226415, and TGF-b1 rs12462166 and rs12983775 interacted with the environment in MDD, suggesting a potential relationship between these genes and the etiology of MDD, which could be a bona fide association or the result of linkage. A known allelic association between a gene and a disorder may reflect a G×E interaction, but the absence of such an association does not disqualify a gene as a candidate for disease risk. Some evidences showed that FoxO1 could express in hippocampus and corpus striatum, FoxO1 may contribute to the pathological  process of MDD or other psychiatric disorders (18,39). It was suggested that immune system associated with MDD and A2M played an important role in the immune system, which indicated that A2M may associate with MDD. Some studies found that the expression of A2M was higher in MDD patients comparing healthy people (21,40). TGF-b1 can restrain autoimmune response and maintain immune system stability, some studies found that TGF-b1 could moderate the imbalance of proinflammatory cytokines and anti-inflammatory cytokines in MDD patients (15,16). All of these evidences above were consistent with our findings.
The carrier status of the minor allele and number of SLEs were separately used to test different genotypic effects on MDD and the dose-response relationship between SLEs and MDD. The effect of SLEs on MDD was more highly significant for the TT than for the TC genotype of FoxO1 rs17592371 and for the GG as compared to the GT genotype of TGF-b1 rs12462166. These results indicated that mutant homozygous genotypes may be more significant than the heterozygous genotype for the effect of SLEs on MDD, as previously suggested (41,42). We also found that exposure to SLEs was related to an increased risk for MDD, which is consistent with previous reports (42,43), indicating that although the effect of a single environmental factor may be quite small, the cumulative effect of multiple factors may be large and that strong effects typically result from a chain of related events rather than a single factor.
Because gene variants can interact, we examined the SNPs that showed significant G×E interactions to investigate G×G interactions. We found that two polymorphisms in the FoxO1 gene and the A2M gene showed significant interactions and that the G×G×E interaction was also significant. This was similar to a previous finding that serotonin transporter-linked polymorphic region (5-HTTLPR) gene and SNP rs140700 both interacted with the environment and with each other in MDD, and that the 5-HTTLPR×rs140700×E interaction was significant (41). Much of the genetic variation associated with a complex trait can be explained by the joint contribution of multiple genetic markers as well as environmental influence (44). G×G and G×G×E interaction analyses are considered as biologically relevant and explain the role of heritability (45)(46)(47)(48). We also found that the E×E (CA × SLE) interaction in MDD was significant. CA increases the likelihood of SLE occurrence (49) that is, early negative experiences increase vulnerability to such events later in life, resulting in a sequential E×E interaction. For instance, mistreatment in childhood caused sensitization to the effects of specific types of SLE in adulthood (49). The present study is the first report of an E×E interaction in MDD.
Although direct associations between haplotype and psychiatric disorders have been reported (50,51), there have been no studies demonstrating an interaction between haplotype and environmental factors. In our study, the global effect of haplotype on MDD was significant for the FoxO1 and TGF-b1 genes in the total sample and in subjects that had experienced CA. A significant individual effect and global effect of haplotype on MDD was found for the A2M gene in both the total and subgroup samples. The global effect of haplotype on MDD was always significant in combination with environmental factors. Interactions existed not only between the environment and genes, but also between the environment and haplotype in MDD.
GRS analyses have shown that the prediction of mental illness is improved by including more weakly associated genetic variants, suggesting that these influence the risk of mental disorder (21,40). Our results showed that the association between 16 SNP GRS and MDD was significant, with a significant interaction between GRS and SLEs. Some of the 16 SNPs did not show a significant conditional or interaction effect. GRSs may detect the effects of weaker associated genetic variants (52). The results of the PPI analysis showed that FoxO1, A2M, and TGF-b1 were connected, suggesting that a wide variety of cellular processes are involved in MDD. This could explain the reason for the additive effects of FoxO1, A2M, and TGF-b1 gene variants indicated by the GRSs. In addition, several top MDD susceptibility genes were included in this PPI network, although the clinical significance of these associations requires further investigation. The disturbances of certain cellular processes or pathways contributing to the risk of MDD has been emerging and gained evidences supporting (22,53,54). Constructing highly interconnected PPI networks between FoxO1, A2M, and TGF-b1 protein and proteins of multiple defined risk genes for MDD may reveal the underlying biological mechanisms. FoxO1, A2M, and TGF-b1 protein could participate in this PPI network, indicating their potential involvement in the common molecular network modulating the pathogenesis of MDD. It is noteworthy that A2M protein directly interacted with TGF-b1 protein, indicating their connection in the biological process relevant to MDD. A recent study found that A2M as an inflammatory fluid proteinase scavenger could bind to a plethora of cytokines, including TGF-b1. (55) We investigated the potential influence of population stratification in our samples using STRUCTURE software v.2.3.4 and the third-stage sample set, but did not find any evidence of stratification, which confirmed that the results were not affected by this confounding factor.
Our study had three major limitations. First, we did not use standard questionnaires that have confirmed reliability and validity to measure CA. Second, we did not analyze additional SNPs in our samples and the third-stage sample set, which can improve the reliability of population stratification analysis. Lastly, including a gene expression microarray analysis in our study would have allowed us to compare the transcriptomes of subjects exposed to SLEs/CAs and non-exposed individuals, which could provide more information for predicting MDD risk.
In conclusion, our data indicated that the hypothesis-free approach is useful for identifying novel genes contributing to the G×E interaction in MDD. FoxO1, A2M, and TGF-b1 genes not only interact with environmental factors but also associate with multiple other genes in the MDD G×E interaction. FoxO1, A2M, and TGF-b1 genes may serve the clinical diagnose and treatment of MDD by providing biomarkers, which can contribute to personalized medicine. Future studies will need to focus on the complexity of poly-gene-environmental causation to obtain more detailed insight into the etiology of MDD.

DATA AVAILABILITY STATEMENT
All datasets for this study are included in the article/ Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by ethics committee of Harbin Medical University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
MZ and LC conducted the statistical analyses and wrote the first draft of the manuscript. ZQ and JZ provided expertise in MDD search. TZ, WZ, and SK did the experiment. XZ, EZ, and XS collected the data. XQ and HP revised the manuscript. YY and XY designed this study and provided expertise. All authors were involved in modifying the secondary-analysis design and editing the manuscript. All authors contributed to the article and approved the submitted version.