Investigating the shared genetic architecture between COVID-19 and obesity: a large-scale genome-wide cross-trait analysis

Observational studies have reported high comorbidity between obesity and severe COVID-19. The aim of this study was to explore whether genetic factors are involved in the co-occurrence of the two traits. Based on available genome-wide association study (GWAS) summary statistics, we estimated the genetic correlation and performed cross-trait meta-analysis (CPASSOC) and colocalization analysis (COLOC) to detect pleiotropic single nucleotide polymorphisms (SNPs). At the gene level, we identified genes using Functional Mapping and Annotation (FUMA) and Multi-marker Analysis of GenoMic Annotation (MAGMA). Potential functional genes were further investigated by summary-data-based Mendelian randomization (SMR). Finally, causality was assessed using the latent causal variable (LCV) model. A significant positive genetic correlation was found between obesity and COVID-19. We found 331 shared SNPs by CPASSOC and 13 shared risk loci by COLOC. At the gene level, we obtained 3546 pleiotropic genes, among which 107 genes were found to be significantly expressed by SMR. Lastly, we observed that these genes were mainly enriched in immune pathways and signal transduction. These findings could provide new insights into the etiology of the comorbidity and have implications for future therapeutic trials.

throughout the body. Obesity, a public health epidemic, affects over 650 million adults and 124 million children and adolescents globally (2). Considering the high infectiousness and serious adverse outcomes of COVID-19 and the high prevalence of obesity (3), an in-depth exploration of the relationship between the two could help to develop effective policies and personalized treatments to control the spread of the epidemic and reduce healthcare expenditures.
Previous studies have found that COVID-19 exacerbates metabolic dysfunction (4), including diabetes mellitus (5), vertebral fractures (6), and obesity. One study showed that 90% of SARS-CoV-2-infected patients with respiratory failure had a BMI higher than 25 kg/m2 (7). From an immunological point of view, excess dietary energy accumulates in white adipose tissue, where immune cells affect the overall balance through metabolism. For example, hypertrophied adipocytes recruit polarized macrophages that produce excessive amounts of inflammatory cytokines such as IL-6, TNF-α, and IL-1 (8), and blood levels of the proinflammatory adipokine leptin are elevated, while expression of the anti-inflammatory ACE2 receptor is reduced in lung epithelial cells (9). This imbalance significantly impairs the efficiency of the innate immune response. In addition, the sequelae of COVID-19, also known as long COVID, have received increasing attention: the number of affected people is gradually increasing, multiple organ systems are involved, and the underlying mechanism is unknown (10). Some studies have found that exercise can be effective in alleviating post-COVID-19 syndrome and improving the physical strength and respiratory function of patients with COVID-19 (11). Using Mendelian randomization (MR), Xiong et al. found that physical activity and recreational sedentary behavior were associated with COVID-19 severity and hospitalization rates (12). Because obesity is a risk factor for the development and prognosis of a variety of diseases, understanding its relationship with COVID-19 is of profound significance. Most past studies have been retrospective or observational and are therefore prone to confounding factors that can bias the results. As obesity is highly heritable, the aim of this study was to examine, from a genetic perspective, whether it is an intrinsic cause of COVID-19 and to reveal the anatomical and pathophysiological mechanisms involved.
Understanding the genetic architecture of both traits may not only explain the higher risk or worse prognosis of COVID-19 in obese individuals compared with normal-weight individuals, but also provide insight into the pathogenesis of SARS-CoV-2 infection, which could help to manage obesity effectively in the context of COVID-19. This study employed genome-wide cross-trait analysis to identify overlapping and distinct genetic architectures, thereby offering novel insights into disease mechanisms.

Materials and methods
The study workflow is shown in Figure 1; the figure elements were obtained from SMART (https://smart.servier.com/). The GWAS summary statistics for COVID-19 were derived from the COVID-19 Host Genetics Initiative (https://www.covid19hg.org/), an international consortium that aims to discover genetic variants associated with susceptibility to and severity of COVID-19. The GWAS of COVID-19 in the European population was obtained from COVID-19 HGI GWAS round 7 and included hospitalized COVID-19, critical COVID-19, and SARS-CoV-2 infection. The SNPs linked to BMI were acquired from the GIANT consortium, a meta-analysis including 2.4 million SNPs (13). To standardize the data, we first filtered out SNPs that were not present in the 1000 Genomes European population. Then we excluded SNPs without rsIDs or
FIGURE 1 Overview of the research on the shared genetic architecture between COVID-19 and obesity.

General genetic correlation analysis
Because heritability (h2) is distributed over thousands of variants with small effects, it is not sufficient to focus only on SNPs that reach significance within or across traits. To measure the average sharing of genetic effects across the entire genome between obesity and COVID-19, we used linkage disequilibrium score regression (LDSC) to estimate h2 (14) and the genetic correlation (rg) (15) based on the GWAS summary statistics. With reference data obtained from the European samples of phase 3 of the 1000 Genomes (1KG) project, LDSC can integrate the association evidence for multiple traits of interest (continuous and dichotomous) from one or more studies. The LD score regression intercept was used as a correction factor, which is more powerful and accurate than genomic control.
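The regression idea behind the genetic correlation estimate can be sketched as follows. This is a minimal illustration, not the LDSC software: it assumes the cross-trait model E[z1_j * z2_j] = sqrt(N1 * N2) * rho_g / M * l_j + intercept, fits it by ordinary least squares on noise-free toy numbers, and omits the regression weights and block-jackknife standard errors that the real tool uses. All function names and numeric values below are hypothetical.

```python
import math

def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def genetic_correlation(ld_scores, z1, z2, n1, n2, m, h2_1, h2_2):
    """Estimate rg by regressing the product z1*z2 on the LD score.

    The slope equals sqrt(n1 * n2) * rho_g / m, where rho_g is the genetic
    covariance and m is the number of SNPs; the intercept absorbs sample
    overlap between the two studies.
    """
    products = [a * b for a, b in zip(z1, z2)]
    slope, intercept = ols_slope_intercept(ld_scores, products)
    rho_g = slope * m / math.sqrt(n1 * n2)   # genetic covariance
    rg = rho_g / math.sqrt(h2_1 * h2_2)      # genetic correlation
    return rg, intercept

# Hypothetical toy data lying exactly on the line 0.01 + 0.002 * ld_score.
toy_ld = [10.0, 50.0, 100.0, 200.0]
toy_z1 = [1.0, 1.0, 1.0, 1.0]
toy_z2 = [0.01 + 0.002 * l for l in toy_ld]
toy_rg, toy_intercept = genetic_correlation(
    toy_ld, toy_z1, toy_z2,
    n1=100_000, n2=100_000, m=1_000_000, h2_1=0.2, h2_2=0.05,
)
print(round(toy_rg, 6), round(toy_intercept, 6))
```

In this toy setup the slope of 0.002 corresponds to a genetic covariance of 0.02, which is then scaled by the two heritabilities to give rg.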

Local genetic correlation analysis
Since traditional global approaches only consider the average rg across the genome, they may fail to detect scenarios where the shared signal is confined to specific regions or has opposing directions at different loci. We therefore used LAVA (Local Analysis of [co]Variant Association), which detects regions of shared genetic association between phenotypes by analyzing local genomic regions (16). Pairwise local rg tests were performed on 2,495 genomic loci covering the entire genome; this multivariate genetic association analysis can reveal more complex and conditional genetic relationships.

Cross-trait meta-analysis
To identify the risk SNPs associated with the joint phenotypes (COVID-19 and obesity), we implemented Cross Phenotype Association (CPASSOC), which allows meta-analysis of continuous traits based on GWAS summary statistics. CPASSOC provides two test statistics, SHom and SHet. SHom is an extension of the linear combination of univariate test statistics that allows sample sizes to be used as weights; its statistical power is diminished when there is between-study heterogeneity. Thus, we used SHet, which can sustain statistical power even in the presence of heterogeneity by assigning greater weights. Because a significant SNP often cannot be taken directly as the causal variant owing to linkage disequilibrium, Functional Mapping and Annotation (FUMA) was used (17) to provide annotation information for SNPs by functional category, especially for non-coding or intergenic regions. Among the provided information, CADD scores above 12.37 indicate potentially deleterious effects on the protein, and RegulomeDB scores offer insight into the regulatory functionality of SNPs by considering their association with expression quantitative trait loci and chromatin marks.
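The idea of a sample-size-weighted linear combination of univariate statistics, which SHom extends, can be illustrated with a small sketch. This is not the CPASSOC implementation and does not reproduce SHet; it only shows a generic weighted z-score combination that accounts for correlation between the trait statistics. The function names, the square-root-of-sample-size weights, and the toy numbers are assumptions made for illustration.

```python
import math

def combined_z(z_scores, sample_sizes, corr):
    """Sample-size-weighted linear combination of correlated z-scores.

    Under the null, the z-scores are jointly normal with correlation
    matrix `corr`, so w'z / sqrt(w' corr w) is standard normal.
    """
    w = [math.sqrt(n) for n in sample_sizes]
    k = len(w)
    numerator = sum(wi * zi for wi, zi in zip(w, z_scores))
    variance = sum(w[i] * corr[i][j] * w[j] for i in range(k) for j in range(k))
    return numerator / math.sqrt(variance)

def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical z-scores for one SNP in two traits with equal sample sizes.
z_independent = combined_z([3.0, 4.0], [100, 100], [[1.0, 0.0], [0.0, 1.0]])
z_correlated = combined_z([3.0, 4.0], [100, 100], [[1.0, 0.5], [0.5, 1.0]])
print(z_independent, two_sided_p(z_independent))
print(z_correlated, two_sided_p(z_correlated))
```

With positively correlated statistics the combined z-score shrinks, because the two traits carry partly redundant evidence.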

Colocalization analysis
The cross-trait meta-analysis can include genetic loci that are associated with only one of the traits. We therefore employed the colocalization method (COLOC) (18) to investigate whether the same genetic variant within a locus is responsible for both traits. COLOC uses a Bayesian approach to calculate posterior probabilities for five mutually exclusive hypotheses about the sharing of causal variants in a genomic region. These hypotheses are H0 (no association), H1 or H2 (association with one specific trait), H3 (association with both traits, involving two distinct SNPs), and H4 (association with both traits, involving one shared SNP). A locus was considered colocalized if PPH4 or PPH3 was greater than 0.7. We used the R package "coloc" in RStudio to extract summary statistics for variants within 5 Mb of the top SNP at each shared locus after annotation in FUMA.
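How the five posterior probabilities arise from per-SNP evidence can be sketched as below. This is an illustrative re-implementation of the general approach, not the code of the R package: it assumes Wakefield approximate Bayes factors per SNP, a single prior standard deviation of 0.15 for the effect size, and prior probabilities p1 = p2 = 1e-4 and p12 = 1e-5. These settings and the toy z-scores are assumptions; the source does not state which priors were used.

```python
import math

def log_sum(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_diff(a, b):
    """log(exp(a) - exp(b)), valid for a > b."""
    return a + math.log1p(-math.exp(b - a))

def log_abf(z, variance, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP from its z-score and the
    variance of its effect estimate (prior_sd is an assumed setting)."""
    r = prior_sd ** 2 / (prior_sd ** 2 + variance)
    return 0.5 * (math.log(1.0 - r) + r * z * z)

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 for one region."""
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + s1,
        "H2": math.log(p2) + s2,
        # all SNP pairs minus the pairs where both traits share one SNP
        "H3": math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),
        "H4": math.log(p12) + s12,
    }
    total = log_sum(list(log_h.values()))
    return {name: math.exp(value - total) for name, value in log_h.items()}

# Hypothetical region of five SNPs; the second SNP is strong in both traits.
z_trait1 = [0.5, 6.0, 0.3, -0.2, 1.0]
z_trait2 = [0.1, 5.5, -0.4, 0.6, 0.2]
labf1 = [log_abf(z, 0.01) for z in z_trait1]
labf2 = [log_abf(z, 0.01) for z in z_trait2]
pp = coloc_posteriors(labf1, labf2)
# Decision rule stated in the text: PPH3 or PPH4 greater than 0.7.
colocalized = pp["H3"] > 0.7 or pp["H4"] > 0.7
print(pp, colocalized)
```

In this toy region the same SNP drives both signals, so nearly all posterior mass falls on H4.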

Multi-marker analysis of GenoMic annotation
Gene and gene-set analyses have been suggested as potentially more powerful alternatives to the typical single-SNP analyses performed in GWAS. FUMA annotates SNPs to genes based on physical location. In addition, we used MAGMA (Multi-marker Analysis of GenoMic Annotation) (19) to obtain genes or gene sets significantly associated with the traits. MAGMA is a fast and flexible tool that uses a multiple regression approach to properly incorporate LD between markers. We compared the gene sets generated by MAGMA with the gene sets annotated based on physical location. After applying the Bonferroni correction, the resulting genes represented the final set of pleiotropic genes identified at the gene level. Additionally, we performed tissue enrichment analysis using MAGMA and the 54 tissue types from GTEx (v.8) to determine the specific tissues associated with the shared genes. To address multiple testing, we adopted the Benjamini-Hochberg procedure.
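The two multiple-testing corrections mentioned above can be sketched in a few lines. This is a generic illustration of the Bonferroni and Benjamini-Hochberg adjustments, not the MAGMA or FUMA implementation; the function names and the example p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * n, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical gene-level or tissue-level p-values.
example_p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(example_p))
print(benjamini_hochberg(example_p))
```

Bonferroni controls the family-wise error rate and is the stricter of the two; Benjamini-Hochberg controls the false discovery rate, which is why it retains more tests at the same nominal threshold.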

Summary-data-based Mendelian randomization
We used summary-data-based Mendelian randomization (SMR) to identify putative functional genes underlying the statistical associations for obesity and COVID-19. SMR integrates GWAS and eQTL data to investigate the expression of the pleiotropic genes at the mRNA level, using the MR framework to test for a link between gene expression and a target phenotype (20). The eQTL data were drawn from two reference panels, the Genotype-Tissue Expression project (GTEx) (21) and the Encyclopedia of DNA Elements project (ENCODE) (22). SMR can also perform the heterogeneity in dependent instruments (HEIDI) test to evaluate whether the observed association is due to linkage. Significant shared functional genes between COVID-19 and obesity were defined as those that passed the SMR threshold (p < 0.05) and the HEIDI-outlier test (p > 0.01) in the SMR analyses of both traits.
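The single-SNP SMR test and the filtering rule above can be sketched as follows. This is a simplified illustration rather than the SMR software: it assumes the commonly used approximate statistic T_SMR = z_GWAS^2 * z_eQTL^2 / (z_GWAS^2 + z_eQTL^2), compared against a chi-square distribution with one degree of freedom, and it takes the HEIDI p-value as a given input because the HEIDI test itself requires LD information from multiple SNPs. Function names and numeric inputs are hypothetical.

```python
import math

def chi2_1df_sf(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def smr_test(b_gwas, se_gwas, b_eqtl, se_eqtl):
    """Return (effect of expression on trait, T_SMR, p-value) for the
    top cis-eQTL SNP of a gene."""
    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl
    b_xy = b_gwas / b_eqtl
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    return b_xy, t_smr, chi2_1df_sf(t_smr)

def passes_filters(p_smr, p_heidi, p_smr_max=0.05, p_heidi_min=0.01):
    """Thresholds from the text: p_SMR < 0.05 and p_HEIDI > 0.01."""
    return p_smr < p_smr_max and p_heidi > p_heidi_min

def shared_functional_gene(trait_a, trait_b):
    """A gene is shared if it passes both filters in both traits.
    Each argument is a (p_smr, p_heidi) tuple."""
    return passes_filters(*trait_a) and passes_filters(*trait_b)

# Hypothetical summary statistics for one gene.
b_xy, t_smr, p_smr = smr_test(b_gwas=0.1, se_gwas=0.02, b_eqtl=0.5, se_eqtl=0.05)
print(b_xy, t_smr, p_smr)
print(shared_functional_gene((p_smr, 0.20), (0.03, 0.15)))
```

Requiring the HEIDI p-value to exceed the threshold keeps genes for which there is no evidence that the association is driven by linkage between distinct causal variants.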

Enrichment analysis
To gain a better understanding of the biological implications of the final pleiotropic genes, defined as the overlap between the genes detected by MAGMA and those obtained from annotation, we performed an enrichment analysis of these genes in terms of Gene Ontology (GO) biological processes (23) and Kyoto Encyclopedia of Genes and Genomes (KEGG) (24) pathways using the "clusterProfiler" R package. These two analyses can reveal the enriched biological functions and metabolic pathways in a gene set. The GO analysis annotates genes to biological processes (BP), molecular functions (MF), and cellular components (CC) in a hierarchically structured manner.
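The statistical core of an over-representation analysis of this kind can be sketched with a hypergeometric tail probability. This is a generic illustration, not the clusterProfiler code; the function name and the toy gene counts are hypothetical.

```python
from math import comb

def enrichment_p(n_background, n_in_set, n_query, n_overlap):
    """Probability of observing at least `n_overlap` annotated genes when
    `n_query` genes are drawn without replacement from a background of
    `n_background` genes, of which `n_in_set` belong to the pathway."""
    total = comb(n_background, n_query)
    upper = min(n_in_set, n_query)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 background genes, 5 in the pathway,
# 4 query genes, 2 of which fall in the pathway.
toy_p = enrichment_p(n_background=20, n_in_set=5, n_query=4, n_overlap=2)
print(toy_p)
```

In practice one p-value of this kind is computed per GO term or KEGG pathway and then adjusted for multiple testing.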

Latent causal variable analysis
Finally, to examine whether the genetic correlation between COVID-19 and BMI has a causal component, we used the latent causal variable (LCV) model (25), in which the genetic correlation between two traits is mediated by a latent variable that causally affects each trait. Compared to MR, it can overcome the heterogeneity of instrumental variables. We used the genetic causality proportion (GCP) to measure the degree of partial causality and quantify the impact of BMI on COVID-19. The GCP scale spans from 0 to 1, representing the absence of partial genetic causality and the presence of full genetic causality, respectively.

Genetic correlations
The heritability of BMI was 0.2116, and the heritability of critical COVID-19 was 0.0062, which was the highest among the three COVID-19 phenotypes. The overall genetic correlation between obesity and COVID-19 was positive. The rg between BMI and hospitalized COVID-19 was 0.363 (P=8.91E-15). There was also a significant rg between BMI and critical COVID-19 (rg=0.3451, P=1.54E-08), and a weaker correlation was found between COVID-19 and BMI (rg=0.0722, P=8E-04). The largest sample overlap was only 0.018, between critical COVID-19 and BMI (Table 1). LAVA identified seven local genetic correlations between hospitalized COVID-19 and BMI. Among them, the most significant locus was located at chr1:77895395-79065286 (P=9.84E-09), and the second most significant locus was at chr7:13676748-15013694 (P=9.32E-07). The locus at chr19:55182974-55714085 was shared between COVID-19 and BMI. We did not observe any additional region that showed a significant local genetic correlation between critical COVID-19 and BMI (Table 2).
After physically annotating the pleiotropic SNPs obtained from the above results in FUMA, we obtained the corresponding risk loci. Further colocalization analysis identified that there were 13 loci

Candidate pleiotropic genes and tissue specificity
MAGMA analysis was performed on the CPASSOC results and identified 5162 genes that overlapped with the genes mapped to loci physically annotated by FUMA (Supplementary Table 3). Among these genes, 273 were associated with hospitalized COVID-19, 2433 with COVID-19, and 2456 with critical COVID-19. After applying the Bonferroni correction, MAGMA identified 3546 significant pleiotropic genes, of which 152 were detected in two or more trait pairs. Most of these genes were shared between COVID-19 and critical COVID-19 (Supplementary Table 4). After correction, we observed significant enrichment in brain tissues for BMI and every form of COVID-19. Enrichment was identified in 10 brain regions, including the brain cerebellum (P=1.66E-29) and the brain cerebellar hemisphere (P=2.98E-15) for COVID-19, and the same two regions for critical COVID-19 (P brain cerebellum = 1.02E-17, P brain cerebellar hemisphere = 4.28E-17). The genes shared between hospitalized COVID-19 and BMI were mainly enriched in the brain cortex (P=4.55E-07) and brain frontal cortex BA9 (P=3.05E-06) (Supplementary Table 5, Figure 2B). Further functional gene analysis using SMR found 20 significant genes between hospitalized COVID-19 and BMI, 39 significant genes between critical COVID-19 and BMI, and 48 significant genes between COVID-19 and BMI (Supplementary Table 6).
Combining the above results, we identified seven common genes that were significant in all analytic methods at both the SNP and gene levels (Table 4). Among them, ADORA2B is located on chromosome 17 and was significantly expressed in blood, lung, brain frontal cortex BA9, and brain cerebellar hemisphere. The most significant tissue was the lung (P critical COVID-19 = 1.018E-03, P BMI = 7.37E-07), and all PHEIDI values were above 0.01. In addition, ZSWIM7 was the most common gene between BMI and critical COVID-19 and was found in 12 tissues; its expression was most significant in the brain cortex (P BMI = 1.12E-04, P COVID-19 = 3.09E-09). Only one significant gene, CCND3, was detected by both COLOC and SMR between COVID-19 and BMI. It is located on chromosome 6 and was significantly expressed in whole blood (P COVID-19 = 0.028, P BMI = 2.30E-05). There were no functional genes shared between hospitalized COVID-19 and BMI.
We did not find evidence of a causal component in the genetic association between COVID-19 and BMI. The P value from LCV between COVID-19 and BMI was 0.78. LCV also indicated a lack of statistically significant association between hospitalized COVID-19 and BMI, with a p-value of 0.57, and yielded a p-value of 0.56 when assessing the association between critical COVID-19 and BMI.

Discussion
Obesity is one of the common risk factors for COVID-19. A deeper exploration of the genetic correlation and genetic pleiotropy between the two can not only further our understanding of their interaction but also provide evidence that weight management helps to reduce viral infection in the context of COVID-19. Through LDSC, we found a positive genetic correlation between COVID-19 and obesity regardless of the COVID-19 phenotype, which is consistent with previous epidemiological surveys. To reduce the bias that region-specific genetic structure introduces into the overall correlation, we further conducted a local genetic correlation analysis. In the pleiotropy analysis, we found that 39 loci were linked to COVID-19, 207 to hospitalized COVID-19, and 85 to critical COVID-19. After validation with the COLOC algorithm, 13 of these loci had posterior probabilities greater than 70%, and the loci were further mapped to 3546 genes through MAGMA analysis. Finally, combining the filtering steps above, we performed SMR analysis and obtained eight potential drug targets (PSMR<0.05, PHEIDI>0.01). This study identifies a specific shared genetic architecture underlying the comorbidity between obesity and COVID-19, indicating that the comorbidity is not attributable to age and health status alone.
The identification of biomarkers for COVID-19 may enable early detection of patients at risk of developing severe illness. Genetic factors can partially control the up- or down-regulation of amino acid pathways that play a role in the COVID-19 immune response (26). COVID-19 susceptibility and severity are influenced by host genetic polymorphisms (27). In addition, Li et al. found that obesity plays an important role in the development of severe COVID-19 when studying the causal relationship between nonalcoholic fatty liver disease and COVID-19 (28). Hakonars et al. found that obesity, rather than diabetes, is the crucial risk factor for hospitalization in COVID-19 patients (29). However, these studies only used MR to explore the causal relationship, without delving into the underlying genetic connections or quantifying the genetic architecture. We selected GWAS data from two large-scale studies to elucidate the overlap between the two traits at several genomic levels, so that the interpretation of the relationship between COVID-19 and obesity is as complete as possible. The diverse manifestations of COVID-19 can be attributed partly to the host's genomic background. In contrast to the other two COVID-19 phenotypes, SMARCA4, which encodes ATP-dependent chromatin remodeling factor 4 (30), was detected specifically between critical COVID-19 and BMI by all of the methods used. SMARCA4 has been identified as the second most significant gene after ACE2 for COVID-19 (31). A previous study found that SMARCA4 was the essential mutation in children with ASD and widespread low-density lipoprotein-related lipidome derangements (32). Furthermore, this study revealed that the expression of SMARCA4 in whole blood influenced the development of both traits, which is consistent with the finding that SMARCA4-LDLR haplotypes are determinants of plasma lipids and that the catalytic activity of SMARCA4 is necessary for ACE2 expression and viral susceptibility (33). Weight control may be the most crucial modifiable risk factor for preventing the development of severe COVID-19.
We identified some genes reported in previous studies to be associated with COVID-19 and obesity. The ADORA2A encodes the

The results of enrichment of the shared genetic architecture between COVID-19 and obesity. (A) The circle plot of the GO results; (B) the bubble plot of the KEGG analysis.
(40,41). Previous studies have found that the expression of A2AR in the adipose tissue of obese mice induced by a high-fat diet (HFD) was significantly increased and was mainly found in adipose tissue macrophages (42). Adenosine-A2AR signaling has also been found to activate brown adipose tissue, which has an anti-obesity effect by inducing the browning of white adipose tissue (43). In conclusion, the dysfunction of adenosine-A2AR signaling mediated by ADORA2A mutations may be a common cause of obesity and COVID-19 due to disrupted adenosine function. Moreover, the CCND3 gene encodes the protein cyclin D3, a key molecule in cell cycle regulation, which is preferentially expressed in adipose tissue, and its expression is strongly induced during the terminal stages of 3T3-L1 adipogenesis (44). In addition, cyclin D3 was found to disrupt the function of the envelope and membrane proteins of SARS-CoV-2 by affecting spike trafficking and the incorporation of the E protein into virions (45). The discovery of these genes and the pathologic processes involved can provide insights for future studies of obesity in COVID-19 pathogenesis and long COVID.
In this study, we found that brain regions are involved in the co-occurrence of both traits using SMR and enrichment analysis. Brain regions and neurons help maintain energy balance and homeostasis by perceiving and processing various metabolic signals, as observed in the hypothalamus. Meanwhile, the peripheral inflammation caused by COVID-19 may have long-term consequences for the neurological system of recovered patients, such as neurodegenerative diseases like dementia (46). The excessive intake of saturated fatty acids can activate the innate immune system and impair adaptive immunity, leading to chronic inflammation and compromised host defense against viruses, and these outcomes may be worsened by continued excessive fat intake. In addition, we found a high enrichment of pleiotropic genes in the pituitary gland. COVID-19 patients with pituitary dysfunction experience changes in multiple endocrine organs, tissues, and hormones, making them susceptible to diabetes, obesity, and fractures (47). This is related to the expression of ACE2 mRNA in hypothalamic and pituitary cells (48), indicating a close relationship between the management of pituitary diseases in the context of COVID-19 and the occurrence and development of complications. We also identified some shared genes not previously reported to be associated with either trait, which are worthy of future study. This study has several strengths. Previous studies almost exclusively used Mendelian randomization to investigate the causal relationship between obesity and COVID-19, which can be influenced by horizontal pleiotropy and cannot identify the specific genetic architecture underlying the effects. We explored the genetic overlap between the two traits in depth, from the SNP level to the functional level, using various post-GWAS methods. Additionally, enrichment analysis was employed to reveal the genetic factors regulating anatomical and physiological changes. In terms of causal relationships, we used LCV, a model that, unlike MR, quantifies the causal proportion and is less susceptible to confounding variables. Our study also has limitations. First, we only used a European sample because of the limited sample size of other ancestry groups; the addition of future GWAS data could further strengthen our results. Second, our findings need to be confirmed through basic experiments. Third, our work concentrated only on the SNP, gene, and mRNA levels; more detailed research can be done at the protein level in the future.

Conclusions
This paper reveals in detail the shared genetic structure of COVID-19 and obesity and supports their intrinsic link, a finding that provides strong support for future decisions on weight management to reduce COVID-19 infection and progression.
FIGURE 2 (A) The Manhattan plot of the SNP-based test based on the CPASSOC results for critical COVID-19; (B) the result of tissue enrichment in GTEx v8 of shared genes between critical COVID-19 and BMI. Red represents significant tissues; blue represents tissues that did not reach the threshold.

TABLE 1
The source of GWAS and genome-wide genetic correlation between COVID-19 and BMI using LDSC.
h2, heritability; rg, genetic correlation; inter, the extent of sample overlap between the two GWAS.

TABLE 2
The local genetic correlation between COVID-19 and BMI in 2495 loci by LAVA.

TABLE 3
Results from colocalization analysis for each pleiotropic locus identified from CPASSOC. The percentages are given for H3 (association with both traits, involving two distinct SNPs) and H4 (association with both traits, involving one shared SNP).
of 0.56 when assessing the association between critical COVID-19 and BMI.

TABLE 4
The significant functional genes detected in both COVID-19 and BMI.
Tissue, the source of eQTL; ENCODE, Encyclopedia of DNA Elements project; CHR, chromosome; START, genomic starting point; STOP, genomic ending point; NSNPS, number of SNPs within the gene.