Germline Genetic Variants of Viral Entry and Innate Immunity May Influence Susceptibility to SARS-CoV-2 Infection: Toward a Polygenic Risk Score for Risk Stratification

The ongoing COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, has affected all aspects of human society, healthcare in particular. Although older patients with preexisting chronic illnesses are more prone to develop severe complications, younger, healthy individuals might also exhibit serious manifestations. Previous studies aimed at detecting genetic susceptibility factors for earlier epidemics have provided evidence of certain protective variations. Following SARS-CoV-2 exposure, viral entry into cells followed by recognition and response by the innate immune system are key determinants of COVID-19 development. Our aim was to conduct a thorough review of the literature on the role of single nucleotide polymorphisms (SNPs) as key agents affecting the viral entry of SARS-CoV-2 and innate immunity. Several SNPs within the scope of our approach were found to alter susceptibility to various bacterial and viral infections. Additionally, a multitude of studies confirmed genetic associations between the analyzed genes and autoimmune diseases, underlining the versatile immune consequences of these variants. Based on confirmed associations it is highly plausible that the SNPs affecting viral entry and innate immunity might confer altered susceptibility to SARS-CoV-2 infection and its complex clinical consequences. Anticipating several COVID-19 genomic susceptibility loci based on the ongoing genome-wide association studies, our review also proposes that a well-established polygenic risk score would be able to clinically leverage the acquired knowledge.


INTRODUCTION
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), the virus responsible for the ongoing COVID-19 pandemic, has so far infected more than 108 million people worldwide, with a reported mortality rate between 0.5 and 10% in different countries (1). SARS-CoV-2 is a novel coronavirus originally detected in China. The specific mechanism by which it infects humans and affects human health is not fully understood. The clinical characteristics of COVID-19 usually include fever, fatigue, dry cough, and dyspnea, while severe infections may result in bilateral pneumonia and life-threatening acute respiratory distress syndrome (ARDS). Although severe complications usually manifest in older patients with concurrent chronic diseases (e.g., high blood pressure, diabetes), young, healthy individuals might also suffer from critical consequences of the disease, requiring intensive care. The wide range of disease susceptibility, especially in younger patients, suggests that differences in the genetic background of individuals might contribute to this variability. In fact, the analysis of previous, unrelated infectious diseases provides clear evidence that specific protective genetic variations are enriched in populations where certain infections are endemic. For instance, sickle cell trait and specific HLA antigens in African populations confer diminished susceptibility to malaria infection (2,3). Another example is Δ32, a 32-base pair deletion of the CCR5 gene, which prevents cellular entry of human immunodeficiency virus (HIV), resulting in effective resistance against HIV infection in individuals homozygous for this variation (4).
In the present review we aim to summarize previously published genotype-phenotype studies of genes which might play a role in the susceptibility to COVID-19. The associations between various single nucleotide polymorphisms (SNPs) and certain traits were studied using targeted and genome-wide approaches. In the targeted approach, a hypothesis-driven selection of specific genes/SNPs is analyzed in cases and controls, while genome-wide association studies (GWASs) make it possible to detect novel genomic loci conferring susceptibility to various traits/diseases. Our examination focuses on genetic variants of two key processes in the initiation of the disease: viral entry, and recognition and response by the innate immune system. Also, as several international collaborations are ongoing to provide large-scale genomic susceptibility data, we propose that a well-established polygenic risk score would be able to optimally leverage the acquired knowledge.
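As an illustration of the targeted case-control approach, the effect of a single SNP is commonly summarized as an allelic odds ratio computed from a 2x2 table of allele counts. The following is a minimal sketch, not the method of any specific study cited here: the function name, the use of the Woolf (log-based) 95% confidence interval, and all allele counts are assumptions chosen for illustration only.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Return (odds ratio, lower 95% limit, upper 95% limit).

    Inputs are allele counts (not genotype counts) in cases and controls.
    The interval uses the Woolf approximation on the log odds ratio and
    requires all four counts to be non-zero.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - 1.96 * se_log_or)
    upper = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, invented for this example.
odds_ratio, lower, upper = allelic_odds_ratio(
    case_risk=120, case_other=280, ctrl_risk=90, ctrl_other=310
)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

In a GWAS, essentially the same per-variant test is repeated across the genome, which is why a far stricter significance threshold is required there than in a hypothesis-driven study of a few SNPs.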

VIRAL ENTRY
Considerable effort has been directed at deciphering how SARS-CoV-2 enters human cells. Key data in this regard originate from studies focusing on SARS-CoV, responsible for the SARS epidemic of 2002-2003, which shares 79.6% sequence identity with SARS-CoV-2 (5). The spike protein of SARS-CoV binds to angiotensin-converting enzyme 2 (ACE2), which serves as a receptor for the virus (6), and recent data confirmed that SARS-CoV-2 also binds ACE2 in vitro (7-9). Further analyses revealed that the spike protein of SARS-CoV-2 is cleaved by transmembrane protease serine 2 (TMPRSS2) (7), facilitating viral entry. Also of note, both ACE2 and TMPRSS2 are primarily expressed in bronchial transient secretory cells (10), which may explain the predilection of the virus for the lower airways. Additionally, the proprotein convertase FURIN was shown to pre-activate the viral entry of SARS-CoV-2 (11), while additional factors such as PIKfyve, TPCN2 and cathepsin L (CTSL) are also critical in this process (12). Table 1 summarizes genetic variants of the aforementioned genes with suggested genotype-phenotype findings. The main physiological function of ACE2 is catalyzing the hydrolysis of angiotensin I and angiotensin II into angiotensin (1-9) and angiotensin (1-7), respectively, contributing to blood pressure regulation (34). Therefore, numerous SNP association studies have sought to ascertain the role of ACE2 genetic variants in certain cardiovascular and metabolic traits. Across several populations, ACE2 polymorphisms have been associated with susceptibility to cardiovascular and metabolic diseases including hypertension and type 2 diabetes mellitus, underlining the potential functional impact of these SNPs on ACE2 expression and/or function (13-21). A TMPRSS2 SNP has been linked to TMPRSS2-ERG genetic fusion, which is a frequent molecular event in prostate cancer (22, 23).
More importantly, a study examining patients of the 2009 swine flu pandemic caused by the H1N1 influenza virus found that TMPRSS2 SNP rs2070788 is associated with severity of the disease (24). Additionally, genotype-specific TMPRSS2 expression was confirmed in human lung tissues regarding rs2070788 and rs383510, the latter being tagged to the former polymorphism. Mechanistically, rs383510 was found to enhance the transcription of TMPRSS2 mRNA, and these 2 SNPs were also found to associate with susceptibility to the H7N9 influenza virus (24).
Certain high-throughput screening studies identified rs4702, a common genetic variant of the proprotein convertase FURIN, as a susceptibility factor for schizophrenia and hypertension (25, 26), while other studies correlated another SNP, rs17514846, with various other traits including coronary artery disease, metabolic syndrome and longevity (27-29).
While we found no SNP association studies for PIKfyve, certain variants of the TPCN2 gene, coding for a cation-selective ion channel, were found to be associated with type 2 diabetes mellitus (T2DM) and hair color (30, 31). In the case of CTSL, two studies confirmed that a promoter polymorphism correlates with hypertension in Asian and American populations (32, 33).

INNATE IMMUNITY
After SARS-CoV-2 has successfully infected cells, a complex immune response is initiated, for which a rapid and coordinated response of the innate immune system is a prerequisite (35). Following infection, the innate immune system recognizes viral antigens mainly by RIG-I-Like Receptors (RLRs) and Toll-Like Receptors (TLRs) (35). In the first step of the RLR-dependent immune response, the cytoplasmic RNA sensors RIG-I and MDA5 recognize viral RNA, after which interaction with mitochondrial antiviral signaling protein (MAVS) initiates signaling changes activating interferon regulatory factors IRF3 and IRF7, resulting in type I IFN (IFN-α and IFN-β) production and antiviral response (35-38). Supplementary Table 1 summarizes the SNP association studies concerning the agents implicated in viral recognition and response by the innate immune system. Several RIG-I SNPs were found to be associated with neutralizing antibody levels after measles and rubella vaccinations, while other studies found RIG-I SNPs to be associated with nasopharyngeal carcinoma and EV71-induced hand, foot, and mouth disease (39-43). MDA5 genetic variants were thoroughly investigated in relation to autoimmunity, with several associations being found with psoriasis, systemic lupus erythematosus (SLE), type 1 diabetes mellitus (T1DM), hypothyroidism and multiple sclerosis (MS) (44-53). Polymorphisms in MAVS were analyzed with regard to inflammatory response, and rs7269320 was found to be associated with osteoarthritis (54). Moreover, in an African American cohort, where rs11905552 of the MAVS gene was much more frequent than in European Americans, this SNP was associated with low type I IFN production in patients with SLE (55). Studies focusing on genetic variants of IRF3 and IRF7 found associations with SLE and systemic sclerosis (56-59), while IFN-α genetic variants were found to be associated with mixed connective tissue disease and prognosis in glioma patients (60, 61).
Genetic variants of the adapter molecule MYD88 are associated with tuberculosis susceptibility, Buerger disease and treatment response in patients with RA (90, 161-163). SNPs of the other key adapter molecule, TRIF, are associated with pneumonia susceptibility and thyroid cancer (164,165). In addition to the type I IFN response, viral recognition by the innate immune system leads to NF-kB activation. NF-kB is a multiprotein complex consisting of NFKB1, NFKB2, RELA, RELB, and REL (166). Type I IFN response and NF-kB activation result in IL-6 and IL-8 production (35). The activation of these mediators contributes to inflammation and complex antiviral immune response (35).
Polymorphisms of IL6 have been shown to predispose to pulmonary tuberculosis, acute lung injury in patients with systemic inflammatory response syndrome and post-infectious irritable bowel syndrome (98,194,195). An association with RA has also been proposed (196). rs1800795 has been shown to have a role in the prognosis of patients following renal and lung transplantation (197-199). IL6 SNPs were also confirmed to have a role in the susceptibility to various cardiovascular disorders including hypertension and stroke (200-202).
IL-8 is encoded by the CXCL8 gene, SNPs of which have been linked to infectious, autoimmune, and neoplastic diseases. Acne vulgaris, chronic periodontitis, and invasive aspergillosis among immunocompromised patients have been shown to be associated with various variants (203-205). Autoimmune diseases including idiopathic pulmonary fibrosis, childhood IgA nephropathy, erosive oral lichen planus, childhood asthma, and Graves' disease have also been linked to genetic variants of CXCL8 (206-210). Concerning neoplastic diseases, non-small cell lung cancer and gastric cancer have been proposed to be associated with CXCL8 SNPs (154,211,212).
In conclusion, the large majority of the discussed SNPs present pleiotropic effects, among which the frequent presence of various autoimmune and infection-related traits highlights their putative involvement in the susceptibility to and severity of COVID-19.

TOWARD PRECISION RISK ASSESSMENT: PREDICTING COVID-19 SUSCEPTIBILITY AND SEVERITY BASED ON A POLYGENIC RISK SCORE
As genetic susceptibility to COVID-19 is the subject of several large ongoing international collaborations, we anticipate that a large amount of evidence regarding susceptibility loci will become available in the near future. Indeed, recent studies identified germline variants of TLR3- and IRF7-dependent type I IFN immunity that are associated with more severe COVID-19 infection (213). In particular, disease-causing germline variants have been detected in TLR3, UNC93B1, TICAM1, TBK1, IRF3, IRF7, IFNAR1, and IFNAR2 in patients with life-threatening COVID-19 (213). Another recent study analyzing 1,610 COVID-19 patients and 2,205 control subjects from the first wave in heavily affected Italy and Spain found two loci, on chromosomes 3 and 9, significantly associated with COVID-19 (214). On chromosome 3 the affected region includes several genes which might alter COVID-19 susceptibility, including chemokine receptors, while on chromosome 9 the association signal coincided with the ABO blood group locus (214). ABO blood group has independently been linked to COVID-19 susceptibility (214-216). Further studies are needed to confirm these associations in independent populations.
Applying this knowledge to detect individuals with elevated risk for severe disease might help to prioritize them for vaccination and stricter protection measures. As COVID-19 susceptibility and severity seem to have a polygenic background, we propose that a curated polygenic risk score (PGRS) might facilitate the detection of individuals with high risk for infection (Figure 1). Based on genome-wide analyses, polygenic risk scores are able to detect high-risk individuals in various diseases, fine-tuning the more widely used risk stratification based on baseline anthropometric and physiological characteristics (217,218). A recent GWAS on a cohort of COVID-19 patients from the U.K. found eight lead variants from independent genome-wide significant regions, including rs2236757 in IFNAR2, which codes for interferon α and β receptor subunit 2 (219). Though the individual odds ratio for each of these relatively frequent variants varies between 1.3 and 2.1, the combined odds ratio for an individual harboring all of these susceptibility variants rises to 29.5, underlining the applicability of a polygenic risk score (219).
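The way modest per-variant odds ratios can compound into a large combined odds ratio can be sketched with the log-additive model that underlies most polygenic risk scores: the score is the sum of the risk-allele dosages (0, 1 or 2) weighted by the natural logarithm of each variant's odds ratio, and exponentiating the score gives the combined odds ratio relative to a non-carrier. The sketch below assumes independent variants with multiplicative effects. The SNP labels and odds ratios are hypothetical placeholders rather than the values reported in (219), and the function names are our own.

```python
import math

# Hypothetical per-allele odds ratios; placeholders, not published estimates.
HYPOTHETICAL_ODDS_RATIOS = {"snp_a": 1.3, "snp_b": 1.5, "snp_c": 2.0}


def polygenic_risk_score(dosages, odds_ratios):
    """Weighted sum of risk-allele dosages, with log(odds ratio) as weight.

    `dosages` maps SNP label -> number of risk alleles carried (0, 1 or 2);
    SNPs missing from `dosages` are treated as dosage 0.
    """
    return sum(
        math.log(odds_ratio) * dosages.get(snp, 0)
        for snp, odds_ratio in odds_ratios.items()
    )


def combined_odds_ratio(dosages, odds_ratios):
    """Odds ratio relative to a non-carrier under the multiplicative model."""
    return math.exp(polygenic_risk_score(dosages, odds_ratios))


# An individual carrying one risk allele at each hypothetical SNP:
carrier = {"snp_a": 1, "snp_b": 1, "snp_c": 1}
# Equals the product 1.3 * 1.5 * 2.0 = 3.9 (up to floating-point rounding).
print(combined_odds_ratio(carrier, HYPOTHETICAL_ODDS_RATIOS))
```

Because the weights add on the log scale, every additional risk allele multiplies the combined odds ratio, so eight variants of moderate effect can plausibly yield a combined estimate an order of magnitude larger than any single one. In practice the weights would be taken from GWAS effect estimates, and the independence assumption would need to be checked against linkage disequilibrium between the variants.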
In addition to COVID-19 susceptibility, genetic predictors of disease severity and treatment response might also be included. In particular, based on the effectiveness of glucocorticoid administration confirmed by the randomized, controlled RECOVERY clinical trial (220), it would be interesting to see whether pharmacogenetic modifiers of glucocorticoid action, sensitivity and metabolism contribute to the severity of COVID-19 infection and to treatment response (221).
It is important to note that the majority of the observed associations in Table 1 and Supplementary Table 1 were only validated in specific populations. By analyzing the population-specific allelic frequencies of the reviewed viral entry and innate immunity-related SNPs (Supplementary Table 2), we can conclude that the large variations in SNP frequencies might heavily influence their association with various traits in select populations. Additionally, pronounced differences in risk allele frequencies of the eight proposed lead COVID-19-related SNPs (219) are present in different populations (Supplementary Table 3).
Moreover, these differences most probably alter epistatic interactions between genes, adding a further layer of complexity (222).
Therefore, the observed population dependency of genotype-phenotype associations would probably result in population-specific PGRSs rather than a universal PGRS optimal for all populations. Dedicated efforts to perform population-specific GWASs regarding COVID-19 susceptibility and severity to build population-specific PGRSs are needed to address these differences.
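One consequence of differing risk allele frequencies can be illustrated numerically: even if the per-variant weights were identical across populations, the expected distribution of a log-additive score would shift with allele frequency, so a risk threshold calibrated in one population would not transfer directly to another. The following sketch assumes Hardy-Weinberg proportions and mutually independent variants, so that the dosage at a SNP with risk allele frequency p has mean 2p and variance 2p(1-p). All frequencies, odds ratios and names are hypothetical and are not taken from the Supplementary Tables.

```python
import math

# Hypothetical per-allele odds ratios and risk allele frequencies.
ODDS_RATIOS = {"snp_a": 1.3, "snp_b": 1.5, "snp_c": 2.0}
FREQS_POPULATION_1 = {"snp_a": 0.10, "snp_b": 0.30, "snp_c": 0.05}
FREQS_POPULATION_2 = {"snp_a": 0.40, "snp_b": 0.10, "snp_c": 0.20}


def score_distribution(allele_freqs, odds_ratios):
    """Return (mean, standard deviation) of the score in one population.

    Assumes Hardy-Weinberg proportions and independent variants, so the
    risk-allele dosage at each SNP follows Binomial(2, p).
    """
    mean = 0.0
    variance = 0.0
    for snp, odds_ratio in odds_ratios.items():
        p = allele_freqs[snp]
        weight = math.log(odds_ratio)
        mean += 2 * p * weight
        variance += 2 * p * (1 - p) * weight ** 2
    return mean, math.sqrt(variance)


mean_1, sd_1 = score_distribution(FREQS_POPULATION_1, ODDS_RATIOS)
mean_2, sd_2 = score_distribution(FREQS_POPULATION_2, ODDS_RATIOS)
print(f"population 1: mean = {mean_1:.3f}, sd = {sd_1:.3f}")
print(f"population 2: mean = {mean_2:.3f}, sd = {sd_2:.3f}")
```

This sketch only captures the shift caused by allele frequencies. Population differences in effect sizes, linkage disequilibrium and epistatic interactions are not modeled here and would add to the discrepancy.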

DISCUSSION
The disruption caused by the COVID-19 pandemic has as yet unknown consequences for human society as a whole and for the health of each affected patient. Understanding the susceptibility to this disease is important both to detect high-risk individuals and to decipher the molecular mechanisms underlying the development of the clinical phenotype. Viral entry and innate immunity are key mechanisms in the initiation of SARS-CoV-2 infection. We performed a thorough literature review of genotype-phenotype association studies regarding agents of these mechanisms. Our results indicated that SNPs in the genes of these processes are frequently associated with susceptibility to various bacterial and viral infections. Additionally, several autoimmune diseases are also linked to these genes, underlining the versatile immune consequences of these genetic variants. Based on the confirmed associations it is highly plausible that the abovementioned SNPs might confer altered susceptibility to SARS-CoV-2 infection and its complex clinical consequences.
In addition to viral entry and innate immunity, other mechanisms including adaptive immunity are also of paramount importance regarding the susceptibility to COVID-19 (35). To better characterize putative genomic susceptibility loci, well-designed, international GWASs are needed.
As multiple GWASs on host genetic susceptibility are ongoing, several genomic susceptibility loci are expected to be detected. Translating these individual susceptibility variants into clinically relevant polygenic risk scores would leverage this acquired knowledge to detect high-risk individuals, who could then be prioritized for vaccination and stricter protective measures.
All things considered, genetic variants of genes involved in viral entry and innate immunity might alter the susceptibility to and prognosis of COVID-19. Further GWASs are needed to better characterize susceptibility loci and to develop clinically relevant risk stratification strategies.