Genome-Wide Interaction and Pathway Association Studies for Body Mass Index

Objective: We investigated gene interactions (epistasis) for body mass index (BMI) in a European-American adult female cohort via genome-wide interaction analyses (GWIA) and pathway association analyses. Methods: Genome-wide pairwise interaction analyses were carried out for BMI in 493 extremely obese cases (BMI > 35 kg/m2) and 537 never-overweight controls (BMI < 25 kg/m2). To further validate the results, specific SNPs were selected based on the GWIA results for haplotype-based association studies. Pathway-based association analyses were performed using a modified Gene Set Enrichment Algorithm (GSEA) (GenGen program) to further explore BMI-related pathways using our genome wide association study (GWAS) data set, GIANT, ENGAGE, and DIAGRAM Consortia. Results: The EXOC4-1q23.1 interaction was associated with BMI, with the most significant epistasis between rs7800006 and rs10797020 (P = 2.63 × 10-11). In the pathway-based association analysis, Tob1 pathway showed the most significant association with BMI (empirical P < 0.001, FDR = 0.044, FWER = 0.040). These findings were further validated in different populations. Conclusion: Genome-wide pairwise SNP-SNP interaction and pathway analyses suggest that EXOC4 and TOB1-related pathways may contribute to the development of obesity.


INTRODUCTION
Obesity is a worldwide epidemic associated with increased morbidity of chronic diseases, including diabetes, cardiovascular diseases, metabolic syndrome, and cancer. In 2015, 603.7 million adults and 107.7 million children were obese; furthermore, in many countries the incidence of obesity continues to rise, doubling since 1980 (Afshin et al., 2017). This in turn imposes an enormous burden on the public health system. Many studies have shown that 40-70% of inter-individual variability in obesity can be attributed to genetic factors (Zaitlen et al., 2013;Locke et al., 2015). Currently, large-scale genomewide association studies (GWASs) and meta-analyses have successfully identified in excess of 75 loci associated with obesity (Fall and Ingelsson, 2014). Nevertheless, these genetic variability can explain only a minor fraction of obesity cases Speliotes et al., 2010). This is partly due to the existence of other mechanisms such as epigenetics, gene-environment, and gene-gene interactions, that influence the heritability of obesity (Gibson, 2010;Wang X. et al., 2010). Almost one-third of the genetic variance in the etiology of obesity were due to nonadditive factors, according to the family, twin and adoption studies (Stunkard et al., 1986;Price, 1987;Sorensen et al., 1989;Stunkard et al., 1990).
SNP-SNP interactions are considered to be potential sources of the unexplained heritability of common diseases (Manolio et al., 2009). In research to date on the influence of interactions, most studies invariably selected loci based on biological knowledge and known associated loci, studies of genome-wide gene x gene interactions are rare. Speliotes et al. (2010) tested SNP-SNP interaction effects among 32 BMI-associated SNPs in their GWAS result, however, no significant results were obtained after multiple test corrections. Young et al. (2016) found the interaction rs11847697(PRKD1)-rs9939609 (FTO) associated with BMI via pairwise SNP × SNP interactions analysis based on 34 established BMI-related SNPs in European American adolescents. Ding et al. (2012) also examined Gene-Gene interactions for abdominal obesity in Chinese population. Nevertheless, these studies ignored genomic regions that were not individually associated but could contribute to disease development if combined.
Until now, following the traditional GWAS approach, genome-wide interaction analyses (GWIAs) were used to investigate SNP-SNP interactions. This method did not need the selection of candidate sites, but computational time was a very large barrier. With the advancement of computing technology, the major barrier has been overcome, and SNP-SNP interaction studies gradually focused on the whole genome level. Wei et al. (2012) performed GWIAs for BMI using multiple human populations, and found eight interactions that had a significant P-value in one or more cohorts. Their studies further demonstrated the GWIA is an effective approach to explain the genetic factor of BMI. SNP-SNP interactions have always been explained by mapping to gene-gene interactions, and genomewide pathway-based association analysis will further support the interpretation of gene-gene interactions. "Pathway" means a gene set collected from the same biological or functional pathway. Pathway-based association analysis will measure the correlations between phenotypes and gene sets based on the whole genome. This approach can provide additional biological insights and allow one to explore new candidate genes (Wang K. et al., 2010).
Compared to association analysis, fewer studies have assessed potential gene-gene interactions in obesity, and the relatively high heritability of obesity still has not been completely explained. We explored genome-wide IBD (identical by descent) sharing in obese families using linkage with data derived from genome-wide genotyping data, observing an interaction between 2p25-p24 and 13q13-21 that may influence extreme obesity (Dong et al., 2005). In the present study, we sought to discover novel susceptibility loci through assessing interaction effects with BMI across the whole genome, and to determine how multiple genetic variants contribute to the development of obesity.

Subjects
One thousand and seventy-one (1071) unrelated European American adults were recruited, 1030 of which were females. In this study, we carried out our analyses only in females, comprising 493 extremely obese cases (BMI > 35 kg/m 2 ) and 537 never over-weight controls (BMI < 25 kg/m 2 ). The collection processes have been described in our previous report (Wang et al., 2011). All participants gave informed consent, and the investigation protocol was approved by the Committee on Studies Involving Human Beings at the University of Pennsylvania.

Genome-Wide Interaction Analysis
About 550,000 SNP markers were genotyped by Illumina HumanHap 550 SNP Arrays in our previous GWAS (Wang et al., 2011). PLINK 1.90 was used to perform GWIA for BMI. Due to the computational-demand, we used the "-fastepistasis" command to screen for association. This test was based on a Z-score for the difference in SNP1-SNP2 association (odds ratio) between cases and controls by logistic regression, Z = [log(R)−log(S)]/sqrt[SE(R) + SE(S)], where R and S are the odds ratios in cases and controls, respectively (Purcell et al., 2007). We excluded SNPs of minor allele frequencies (MAF) < 5%. After frequency and genotyping pruning, 497174 SNPs were used to carry out interaction analyses. A total of 123,590,744,551 valid SNP-SNP tests were performed. We then selected the SNPs with interaction P < 1 × 10 −8 (Bonferroni-corrected significant threshold P = 4.05 × 10 −13 ) to analyze interactions by logistic regression based on allele dosage for each SNP.

Haplotype-Based Association Analysis
Eight hundred and thirty-one (831) SNP-SNP interactions showed P < 10 −8 in the results of the SNP-SNP interaction tests based on Z-scores. In order to rule out the possibility of an accidental finding, we mapped these SNPs to genes, then excluded the SNP-SNP interactions by the following criteria: 1 neither SNPs exist in genes; 2 either of the two SNPs exist independently in a gene. Through the above exclusion criteria, the rs7800006(EXOC4)-rs10797020(1q23.1) interaction was the most significant (P = 2.63 × 10 −11 ), where there were 39 interactions with P < 10 −8 between EXOC4 and 1q23.1. Five SNPs exist in the EXOC4 gene region and 9 SNPs exist in the 1q23.1 region, but their interaction P-value did not pass Bonferroni multiple tests. However, the Bonferroni correction test is highly conservative and would overcorrect for the non-independent SNPs, which fall within blocks of strong linkage disequilibrium (LD) (Duggal et al., 2008). Morris and Kaplan (2002) have reported that haplotype-based association analyses are more powerful than single allele-based methods when multiple disease-susceptibility mutations occur within the same gene. Epstein and Satten (2003) also have pointed out that haplotypes are useful during disease development due to the interaction of multiple cis-acting susceptibility variants located at the gene.
Therefore, in case of producing false negatives, we selected the 5 SNPs that exist in EXOC4 and the 9 SNPs that exist in 1q23.1, respectively, for the next haplotype-based association analysis, which were conducted by PLINK1.07. The haplotype windows were defined at two SNPs, three SNPs, and four SNPs.

Genome-Wide Pathway-Based Association Analysis
To further study the gene-gene interactions by pathway analysis, the GenGen program was used to analyze pathwaybased association based on the modified Gene Set Enrichment Algorithm (GSEA) (Subramanian et al., 2005;Wang et al., 2007). The calculation steps have been outlined previously (Li et al., 2015). In this study, a total of 518230 SNPs passed the qualitycontrol thresholds of minor allele frequencies > 0.01 and Hardy-Weinberg equilibrium > 0.001, which covered 17,438 genes, mapping SNPs to 20 kb upstream and downstream of each gene. A total of 1347 gene sets were selected from BioCarta, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) databases, gene set sizes were between 5 and 200 genes.

Replication of the Pathway-Based Association Results
We further attempted to replicate the GenGen results in data sets from the GIANT (N = 339,224) (Locke et al., 2015), ENGAGE (N = 87,048) (Horikoshi et al., 2015), and DIAGRAM (N = 119,688) (Wood et al., 2016) consortia. Given that no Frontiers in Genetics | www.frontiersin.org phenotypes and genotypes were available online from the three consortia, GSA-SNP software (Nam et al., 2010) was carried out to perform the pathway associations analyses using the GWAS P-values. To better compare with GenGen analysis results, we obtained SNP specific P-values from GIANT, ENGAGE, and DIAGRAM GWASs, and the same SNPs identified by the GenGen analysis were selected for the pathway association analysis for BMI in the three consortium data sets.
As described above, the flow chart of experimental analysis was shown in Figure 1.

RESULTS
The average age of the 1030 female subjects was 42.2 ± 9.0 years (range, 17-65 years). In our study, we defined BMI > 35 kg/m 2 as "cases, " N = 493, and BMI < 25 kg/m 2 as "controls, " N = 537 (Figure 2). Distributions of BMI in cases and controls are shown in Table 1.

Genome-Wide Pathway-Based Association Analysis
In the genome-wide pathway-based association study carried out with GenGen, 43 pathways achieved a significance of empirical P < 0.05 (Figure 4 and Supplement Table 1). The Tob1 pathway (role of Tob in T-cell activation) showed the most significant association with BMI (empirical P < 0.001, FDR = 0.044, FWER = 0.040, Table 4). Empirical P-values (denoted as "nominal P" values by the GenGen program) were calculated based on the 1000 phenotype permutations.
Replication studies were conducted in data sets from the GIANT, ENGAGE, and DIAGRAM consortia by GSA-SNP. The Tob1 pathway did not have a significant P-value in these settings. However, the pathway GO0051169 (nuclear transport) containing TOB1 was associated with BMI in GIANT and ENGAGE consortia, and passed FDR correction for multiple testing (P GIANT = 0.048, FDR GIANT = 0.015; P ENGAGE = 0.041, FDR ENGAGE = 0.036, Table 4). The EXOC4-contained pathway hsa04530 was also associated with BMI in ENGAGE and DIAGRAM consortium data sets by GSA-SNP (Table 4). GO0030165 containing EXOC4 was also related to BMI in the GIANT and ENGAGE consortium data sets and passed FDR correction (Table 4).

DISCUSSION
In the context of genetic epidemiology, although GWASs have found the majority of BMI-related genes identified to date, combined these loci explain only about 4% of the phenotypic variation of BMI (Loos, 2018). Modest and rare variants have been ignored by the GWASs, partly because of the other mechanisms, including epigenetics, gene-gene and geneenvironment interactions, and statistical issues (Gibson, 2010;Wang X. et al., 2010;Chiefari et al., 2013Chiefari et al., , 2016Lee et al., 2014). To date, fewer studies have examined the effects of interactions on obesity. Despite this, some obesity-related interactions still have been found, including PRKD1-FTO and WNT4-WNT5A (Wei et al., 2012;Young et al., 2016;Dong et al., 2017). Pathway-based analysis is an alternative approach to detect gene interactions. Liu et al. (2010) had found that the vasoactive intestinal peptide pathway was significantly correlated with BMI and fat mass, FIGURE 3 | Circos visualization of mapped SNP-SNP interactions for BMI (P < 1 × 10 −9 ). The curves represent the interactions between the two SNPs, and the color gradually changes from red to blue as the P-value decreases.
suggesting that this pathway plays an important role in the development of obesity. Our previous studies also revealed that the Rac1pathway was associated with the obesity-related phenotype plasma adiponectin (Li et al., 2015).
In the present study, our GWIA for BMI found an interaction between EXOC4 and 1q23.1 that may contribute to the development of obesity, although this interaction did not pass the Bonferroni correction test, they had the lowest interaction P-value (P = 4.05 × 10 −13 ) after accidental exclusion. To further examine whether EXOC4 and 1q23.1 were related to BMI, we selected the SNPs locate in EXOC4 and 1q23.1 accordingly base on the results of GWIA to carry out haplotype-based association analyses, the results verified that EXOC4 contributed to BMI.
In genome-wide pathway-based association studies, the relation between the TOB1 pathway and BMI was identified. EXOC4 and TOB1 associated with BMI were replicated in GIANT, ENGAGE, and DIAGRAM data sets. To our knowledge, these findings have not been identified having main effects in previous BMIrelated studies.
EXOC4 (exocyst complex component 4, also known as SEC8) is a component of the exocyst complex involved in the targeting of exocytic vesicles, which participate in temporal and spatial regulation of exocytosis (Hsu et al., 1996;TerBush et al., 1996). Numerous research results show that exocysts interact directly or indirectly with many proteins including cell membranes, cytoskeletal, the small GTPases and other proteins in the cell     (Benjamini and Hochberg, 1995). * * False discovery rate used by Subramanian et al. (2005). * * * Family wise-error rate.
EXOC4 is located in a widely replicated obesity linkage peak on chromosome 7q22-q36 (Feitosa et al., 2002;Li et al., 2003), and has been connected with various diseases, such as type 2 diabetes, cancer, and neuronal disorders. GLUT4 (glucose transporter 4) transports most of the glucose in muscle and adipose tissue; the docking and tethering of the GLUT4 vesicle to the plasma membrane is mediated via EXOC4 (Inoue et al., 2003;Inoue et al., 2006). A population genetic study also identified several type 2 diabetes-associated SNPs near EXOC4 in The NHLBI Family Heart Study (Laramie et al., 2008).
Nineteen genes are involved in BMI-related Tob1 pathway (role of Tob in T-cell activation) : TOB1, TOB2, IFNG, IL2,  IL2RA, IL4, SMAD3, SMAD4, TGFB1, TGFB2, TGFB3,  TGFBR1, TGFBR2, TGFBR3, CD3D, CD3E, CD3G, CD247, and CD28. This pathway is a component of balanced functioning of the immune system. TOB1 represses T cell activation and is a member of a family of genes with antiproliferative properties. Research has shown that TOB1 interacts with the TGF (transforming growth factor) and can stimulate transcription factors SMAD4 and SMAD2, increasing their binding to the IL-2 promoter and helping to repress IL-2 expression, suggesting that interference in TOB1 function be associated with autoimmune disease (Tzachanis et al., 2001;Tzachanis and Boussiotis, 2009;Gibson, 2010). Numerous studies have found a significant correlation between obesity and many autoimmune diseases, adipokines such as leptin, adiponectin and resistin may be key players in interactions among them (Versini et al., 2014).
TOB1and TOB2 belong to the TOB family of anti-proliferative proteins that have the potential to regulate cell growth. As a repressor of the p38/MAPK pathway, TOB1 can suppress p38/MAPK signaling by decreasing phosphorylation of p38 and ATF2 (Sun et al., 2013;Ng et al., 2017); p38/MAPK acts as an enhancer of adipogenesis contributes to obesity (Patel et al., 2003). The miR-32-TOB1-FGF21 pathway can regulate brown adipose tissue adipocyte function and development and is associated with obesity and metabolic syndrome (Ng et al., 2017). The biological functions mentioned above are consistent with our study results and provided evidence of a direct connection between TOB1 and obesity.
Traditional GWASs have identified many obesity-associated genes, however, additional loci have yet to be identified. EXOC4 and Tob1 pathway genes may be among these from our GWIA and genome-wide pathway-based association analysis.
EXOC4 join in the tight junction signal pathway: this pathway receives not only assembly signals but also transmit information (Zihni et al., 2014). Therefore, EXOC4 may play a role in signal transmission from sensory perception to the brain, thus affecting obesity. The Tob1 pathway may contribute to obesity through the MAPK pathway. Needless to say, molecular biological experiments are needed to repeat the results. For the GWASs, statistical replication is the golden rule to prevent false positives. Although our findings were replicated in different populations with different methods, it also needs to be confirmed in larger populations by GWIAs.

ETHICS STATEMENT
All participants gave informed consent, and the investigation protocol was approved by the Committee on Studies Involving Human Beings at the University of Pennsylvania.

AUTHOR CONTRIBUTIONS
W-DL designed the study, researched data, and edited the manuscript. HJ researched data and wrote the manuscript. KW researched data and edited the manuscript. RP designed the study and contributed to the discussion. YoZ, MZ, and YuZ researched data. YW researched data and contributed to discussion. All authors have reviewed the manuscript.