Cancer and COVID-19 Susceptibility and Severity: A Two-Sample Mendelian Randomization and Bioinformatic Analysis

The clinical management of patients with COVID-19 and cancer is a Gordian knot that has been discussed widely without reaching a consensus. We used two-sample Mendelian randomization (MR) to investigate the causal association between a genetic predisposition to cancers and COVID-19 susceptibility and severity. Moreover, we explored the mutation landscape, expression pattern, and prognostic implications of genes involved with COVID-19 in distinct cancers. Among all of the cancer types we analyzed, only the genetic predisposition to lung adenocarcinoma was causally associated with increased COVID-19 severity (OR = 2.93, β = 1.074, se = 0.411, p = 0.009), with no obvious heterogeneity (Q = 17.29, p = 0.24) or asymmetry of the funnel plot. In addition, the results of the pleiotropy test indicated that the instrument SNPs were unlikely to affect COVID-19 severity via pathways other than lung adenocarcinoma susceptibility (p = 0.96). Leave-one-out analysis identified no outlier instrument SNP whose elimination altered the statistical significance, which further supported the reliability of the MR results. Broad mutation and differential expression of COVID-19-associated genes were also found in cancers, which may provide valuable information for developing new treatment modalities for patients with both cancer and COVID-19. For example, ERAP2, a risk factor for COVID-19-associated death, is upregulated in lung squamous cancer and negatively associated with patient prognosis. Hence, ERAP2-targeted treatment may simultaneously reduce COVID-19 disease severity and restrain cancer progression. Our results highlight the importance of strengthening medical surveillance for COVID-19 deterioration in patients with lung adenocarcinoma by showing their causal genetic association. For these patients, a delay in anticancer treatment, such as chemotherapy and surgery, should be considered.


INTRODUCTION
Coronavirus disease 2019 (COVID-19), which arises from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, can result in severe illnesses such as acute respiratory distress syndrome, multiorgan dysfunction syndrome, and consequent death, and it has become a public health emergency of international concern (Huang et al., 2020).
During the COVID-19 pandemic, the clinical management of patients with cancer has been a Gordian knot that has been discussed broadly without reaching a consensus (Moujaess et al., 2020). Two observational studies from China revealed that patients with cancer were more susceptible to COVID-19 and its severe complications (Dai et al., 2020; Liang et al., 2020). Several potential explanations may account for the vulnerability to COVID-19 among patients with cancer. First, most chemotherapy induces myelosuppression and an immunosuppressive condition in patients, which increases the risk of succumbing to COVID-19 and the occurrence of adverse events. Second, the physiological function of patients with cancer is dramatically undermined, owing either to disease progression or to posttreatment side effects, which may synergistically lead to an unfavorable prognosis from COVID-19 in patients with cancer (Diao et al., 2020). Third, immune checkpoint inhibitors (ICIs) have been widely used in the treatment of multiple cancers. ICIs could induce immune-related pneumonitis followed by lung injury, which could enhance the risk of developing severe COVID-19 pneumonia. However, some studies also reported that SARS-CoV-2 infection causes functional exhaustion of cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells, with significantly higher levels of exhaustion markers such as programmed death-1 (PD-1) than in healthy controls (Zheng et al., 2020), suggesting that the use of ICIs may activate the anti-COVID-19 ability in the host and improve the patient prognosis. Two ongoing clinical trials will demonstrate the pros and cons of adopting ICIs in .
These different findings revealed the close relationship between cancer and COVID-19 susceptibility and severity, and it is difficult to speculate on their causal relationship given the unavoidable bias derived from unmeasured confounding factors in traditional observational studies. Hence, we introduced Mendelian randomization (MR) to investigate the causal association between cancers and COVID-19 susceptibility and severity.
MR is an increasingly acknowledged statistical method that uses genetic variants to determine whether an observational association between a risk factor and an outcome is consistent with a causal effect (Emdin et al., 2017). Individuals who carry the variant and those who do not are followed up until the development of an outcome of interest. Because these genetic variants are typically unrelated to confounding factors, differences in the outcome between those who carry the variant and those who do not could be attributed to the difference in the risk factor, which makes verification of the causal association accessible (Sekula et al., 2016). MR is based on 3 assumptions: (1) the genetic variant is associated with the risk factor; (2) the genetic variant is not associated with confounders; and (3) the genetic variant influences the outcome only through the risk factor. Given the difficulty of concurrently measuring the exposure and outcome traits in the same cohort, as an alternative, summary-level data from different genome-wide association study (GWAS) consortia can be used to carry out MR analyses, taking gene exposure measures from one GWAS and gene outcome measures from another GWAS (two-sample MR) (Lawlor, 2016).
The present study investigated the causal association between cancer and COVID-19 susceptibility and severity using two-sample MR analysis. Moreover, we also explored the mutation landscape, expression pattern, and prognostic implications of genes involved with COVID-19 in distinct cancers.
The E-value is defined as the minimum strength of an association that an unmeasured confounder would need to have with both the exposure and the outcome to fully explain away a specific exposure-outcome association, conditional on the measured covariates (Blum et al., 2020; VanderWeele and Ding, 2017). A large E-value implies that considerable unmeasured confounding would be needed to explain away an effect estimate, while a small E-value implies that little unmeasured confounding would be needed. The E-value was calculated on a website (https://www.evaluecalculator.com/); for an observed risk ratio (denoted RR), it is given by E-value = RR + √[RR × (RR − 1)]. If the original risk ratio is below 1, then one first takes the inverse before applying the E-value formula (Mathur et al., 2018). Hence, we calculated the E-value to estimate the bias from unmeasured confounders in the existing observational studies focused on the relationship between cancer and COVID-19 susceptibility and severity.
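As a worked illustration of the E-value formula, the following minimal Python sketch applies it to a risk ratio, inverting the ratio first when it is below 1. The function name `e_value` is hypothetical, and the sketch is not the code behind the online calculator used in the study.

```python
import math


def e_value(rr: float) -> float:
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)).

    Risk ratios below 1 are inverted before the formula is applied.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective associations: use the inverse
    return rr + math.sqrt(rr * (rr - 1.0))


if __name__ == "__main__":
    # A risk ratio of 2.0 and its inverse 0.5 give the same E-value.
    print(e_value(2.0), e_value(0.5))
```

A risk ratio of exactly 1 yields an E-value of 1, meaning that no unmeasured confounding is needed to explain a null association.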

Identify Cancer-Associated Genetic Variants
Initially, we selected 10 cancers with no obvious genetic predisposition by sex to minimize the bias derived from sex differences (glioma, squamous lung cancer, lung adenocarcinoma, melanoma, lymphoid leukemia, hepatocellular carcinoma, colorectal carcinoma, kidney cancer, gastric cancer, and pancreatic cancer). Then, we identified the SNPs that were significantly associated with each cancer at the threshold of p < 5e-8 in the GWAS Catalog repository (https://www.ebi.ac.uk/gwas/) (MacArthur et al., 2017). SNPs without the information required for the MR analysis, such as β (se), effect alleles, and their frequency, were eliminated. In addition, we only included SNPs associated with cancer susceptibility; other traits, such as the patients' overall survival time or chemotherapy resistance, were not within our scope.
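The selection rules above can be sketched as a simple filter. This is an illustration only: the record layout (keys such as `pval`, `beta`, `se`, `effect_allele`, `eaf`, and `trait`) is a hypothetical stand-in and does not reflect the actual export format of the GWAS Catalog.

```python
# Fields an SNP record must carry to be usable in the MR analysis
# (hypothetical key names chosen for this sketch).
REQUIRED_FIELDS = ("beta", "se", "effect_allele", "eaf")


def select_instruments(records, p_threshold=5e-8, trait="susceptibility"):
    """Keep genome-wide significant susceptibility SNPs with complete data."""
    kept = []
    for rec in records:
        if rec.get("trait") != trait:
            continue  # e.g. survival or chemotherapy-resistance associations
        if any(rec.get(field) is None for field in REQUIRED_FIELDS):
            continue  # missing beta, se, effect allele, or allele frequency
        if rec.get("pval") is not None and rec["pval"] < p_threshold:
            kept.append(rec)
    return kept
```

Each record is a plain dictionary here, so the same filter can be applied to rows read from any tabular export once the columns are renamed accordingly.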

Select GWAS Involved With COVID-19 Susceptibility and Severity
The meta-analysis results of SNP-based association analysis pertaining to COVID-19 susceptibility and severity were obtained from the COVID-19 Host Genetics Initiative (https://www.covid19hg.org/) (Release 4) (COVID-19 Host Genetics Initiative, 2020). The phenotype "COVID vs. laboratory/self-reported negative" assessed the effect of SNPs on COVID-19 susceptibility, while the phenotype "very severe respiratory confirmed COVID versus not hospitalized COVID" was selected to evaluate the effect of SNPs on COVID-19 severity.

Procedures for Two-Sample MR Analysis
For standard two-sample MR, it is important to ensure that the instruments for the exposure are independent. Among SNPs whose pairwise linkage disequilibrium (LD) R² exceeds the specified threshold, only the SNP with the lowest p-value is retained. We pruned all SNPs in LD using the default cutoff value recommended by the "MR-base" platform. The clumping distance was set to 10,000 kb and the R² threshold to 0.01. By default, if a particular requested SNP is not found in the outcome GWAS, then a SNP (proxy) in LD with the requested SNP (target) is searched for instead. Once the exposure and outcome data are obtained, the next step is to harmonize the effects of the instrumental variants, which means that the effect of a SNP on the exposure and the effect of that SNP on the outcome must correspond to the same allele. For inferable palindromic SNPs, we tried to infer the forward strand alleles using allele frequency information. For noninferable palindromic SNPs, the allele frequency no longer provides information about the strand, and such SNPs were discarded. This applies to any palindromic SNP with a minor allele frequency above 0.42.
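The palindromic-SNP rule described above can be illustrated with a short Python sketch. It is not the harmonization routine of the TwoSampleMR package; the function names are hypothetical, and only the 0.42 minor allele frequency cutoff is taken from the text.

```python
# Allele pairs that read the same on both strands (A/T and C/G).
PALINDROMIC_PAIRS = {frozenset(("A", "T")), frozenset(("C", "G"))}


def is_palindromic(allele_1: str, allele_2: str) -> bool:
    """True when the two alleles are complementary to each other."""
    return frozenset((allele_1.upper(), allele_2.upper())) in PALINDROMIC_PAIRS


def strand_is_inferable(effect_allele_freq: float, maf_threshold: float = 0.42) -> bool:
    """Allele frequency can resolve the strand only when the MAF is not too close to 0.5."""
    minor_allele_freq = min(effect_allele_freq, 1.0 - effect_allele_freq)
    return minor_allele_freq <= maf_threshold


def keep_snp(allele_1: str, allele_2: str, effect_allele_freq: float) -> bool:
    """Discard only palindromic SNPs whose strand cannot be inferred."""
    if not is_palindromic(allele_1, allele_2):
        return True
    return strand_is_inferable(effect_allele_freq)
```

Non-palindromic SNPs are always kept because their strand can be resolved from the alleles alone, whatever the allele frequency.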
Once the exposure and outcome data were harmonized, the effects and standard errors for each instrument SNP were available for the exposure and outcome traits. Such information could be utilized to perform MR analysis. Inverse-variance weighted (IVW) estimation is a classic method to pool the MR effects of each instrument SNP.
With $\beta_{Xj}$ and $\beta_{Yj}$ denoting the effects of SNP $j$ on the exposure and the outcome, respectively, the variance term was calculated as $\mathrm{se}(\beta_{Yj})^{2} / \beta_{Xj}^{2}$, and the pooled fixed-effect inverse-variance weighted estimate ($\theta_{IVW}$) was calculated as $\theta_{IVW} = \dfrac{\sum_{j} \beta_{Yj}\,\beta_{Xj}\,\mathrm{se}(\beta_{Yj})^{-2}}{\sum_{j} \beta_{Xj}^{2}\,\mathrm{se}(\beta_{Yj})^{-2}}$. In addition, other methods for two-sample MR, including MR-Egger, weighted mode, weighted median, and simple mode, were also adopted for reference. Each method differs in theory and has distinct strengths. The weighted mode introduces an extra element similar to IVW and the weighted median, weighting each SNP's contribution to the clustering by the inverse variance of its outcome effect (Hartwig et al., 2017). A median-based estimator is an alternative approach that takes the median effect of all available SNPs, which has the advantage that only half of the SNPs need to be valid instruments for unbiased causal effect estimates. The weighted median estimate allows stronger SNPs to contribute more toward the estimate, which is achieved by weighting the contribution of each SNP by the inverse variance of its association with the outcome (Bowden et al., 2016a). MR-Egger adapts the IVW analysis by allowing a nonzero intercept, allowing the net horizontal pleiotropic effect across all SNPs to be unbalanced or directional (Bowden et al., 2016b). Hence, these methods should be considered, especially when the basic assumptions behind MR, such as the absence of a horizontal pleiotropic effect and heterogeneity, are not well satisfied.
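The fixed-effect IVW pooling can be sketched in a few lines of Python. This is a minimal illustration rather than the TwoSampleMR implementation used in the study; the function and argument names are hypothetical, and the standard error shown is the usual fixed-effect one, the square root of the inverse of the summed weights.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_x: SNP effects on the exposure
    beta_y: SNP effects on the outcome
    se_y:   standard errors of the outcome effects
    Returns (theta_ivw, se_ivw).
    """
    if not (len(beta_x) == len(beta_y) == len(se_y)) or not beta_x:
        raise ValueError("inputs must be non-empty and of equal length")
    numerator = sum(by * bx / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    denominator = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    theta = numerator / denominator
    se_theta = math.sqrt(1.0 / denominator)
    return theta, se_theta
```

The estimate is equivalent to a weighted average of the per-SNP ratio estimates, each weighted by the inverse of its variance term.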
Heterogeneity in the causal effects among instruments is a marker of potential violations of the instrumental variable assumptions. Heterogeneity could be assessed for the IVW and Egger estimates, and this can be used to navigate between models of horizontal pleiotropy. In addition, we depicted funnel plots to visualize any heterogeneity of effect estimates, where the causal effect estimates for each SNP were depicted on the x-axis and the inverse standard error (a measure of instrumental strength) for the association was represented on the y-axis. Asymmetry about the vertical line is indicative of heterogeneity.
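One common way to quantify the heterogeneity described above is Cochran's Q computed around the IVW estimate. The sketch below assumes that form (per-SNP ratio estimates weighted by the inverse of their variance); the function name is hypothetical, and it is not the heterogeneity routine of TwoSampleMR.

```python
def cochran_q(beta_x, beta_y, se_y):
    """Cochran's Q for per-SNP ratio estimates around the IVW estimate.

    Returns (Q, degrees_of_freedom); under homogeneity Q is approximately
    chi-square distributed with (number of SNPs - 1) degrees of freedom.
    """
    if len(beta_x) < 2:
        raise ValueError("at least two SNPs are required")
    # Weight of each ratio estimate: inverse of se(beta_y)^2 / beta_x^2.
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    theta_ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q_stat = sum(w * (r - theta_ivw) ** 2 for w, r in zip(weights, ratios))
    return q_stat, len(beta_x) - 1
```

The p-value can then be read from the upper tail of a chi-square distribution with the returned degrees of freedom, for example with `scipy.stats.chi2.sf(q_stat, df)` if SciPy is available.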
To evaluate the robustness of the MR effect estimates and identify any potential outliers, each instrument SNP was sequentially eliminated from the analysis (referred to as a leave-one-out analysis). If the precision and direction of the association between the cancer-predicting SNPs and COVID-19 susceptibility and severity remained largely unaltered, then the results were probably not driven by any outliers. All analyses were performed using the R package "TwoSampleMR" (version 0.5.5). Bonferroni correction was performed by dividing the p-value of 0.05 by the number of testing methods (here, the threshold is 0.05/5 = 0.01).
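The leave-one-out procedure and the Bonferroni threshold can be sketched as follows. The sketch re-estimates a fixed-effect IVW effect after dropping each SNP in turn; it is an illustration with hypothetical function names, not the TwoSampleMR code used in the study.

```python
import math


def _ivw(beta_x, beta_y, se_y):
    """Fixed-effect IVW estimate and its standard error."""
    denominator = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    numerator = sum(by * bx / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    return numerator / denominator, math.sqrt(1.0 / denominator)


def leave_one_out(snp_ids, beta_x, beta_y, se_y):
    """Return {dropped SNP id: (theta, se)} with that SNP excluded."""
    results = {}
    for i, snp in enumerate(snp_ids):
        keep = [j for j in range(len(snp_ids)) if j != i]
        results[snp] = _ivw(
            [beta_x[j] for j in keep],
            [beta_y[j] for j in keep],
            [se_y[j] for j in keep],
        )
    return results


def bonferroni_threshold(alpha: float = 0.05, n_tests: int = 5) -> float:
    """Significance threshold after dividing alpha by the number of tests."""
    return alpha / n_tests
```

An SNP would be flagged as influential if dropping it changes the direction of the estimate or moves its p-value across the corrected threshold.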

Investigate the Expression Pattern and Prognostic Implications of Genes Involved in COVID-19 Susceptibility and the Severity in Cancers
The omics and clinical data of cancers derived from the same primary organs as the cancer types investigated in the MR analysis were collected from The Cancer Genome Atlas (TCGA) (https://www.cancer.gov/). The whole exome sequencing (WES) and copy number variation (CNV) data of each cancer were integrated and processed through cBioPortal (www.cbioportal.org) (Cerami et al., 2012; Gao et al., 2013). Given the lack of adjacent normal samples in the TCGA database, we incorporated the transcriptome data of the corresponding normal organs from the Genotype-Tissue Expression (GTEx) datasets. Using the combined transcriptome data from TCGA and GTEx, we compared the expression of genes associated with COVID-19 between tumor tissues and normal tissues using the cutoff of logFC >1 and adjusted p-value <0.01. The association between the gene expression level and the patient prognosis was also investigated. The patients were divided into two groups based on the median gene expression level, and overall survival (OS) and disease-free survival (DFS) were compared between the two groups using Kaplan-Meier methods. The log-rank test was performed to evaluate the statistical significance of the survival differences (the cutoff was set as p < 0.05). The core code for performing the MR analysis is attached in the Supplementary Method.
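The median split and two-group log-rank comparison described above can be illustrated with a standard-library Python sketch. It is not the code used in the study; the function names are hypothetical, and the p-value relies on the fact that the log-rank statistic for two groups is compared against a chi-square distribution with one degree of freedom.

```python
import math
from statistics import median


def split_by_median(expression):
    """Flag samples whose expression is above the cohort median."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test on lists of follow-up times and event flags.

    events_* hold 1 for an observed event and 0 for a censored sample.
    Returns (chi-square statistic, p-value).
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n ** 2 * (n - 1))
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

In practice the boolean flags from `split_by_median` would be used to partition the follow-up times and event indicators into the high- and low-expression groups before calling `logrank_test`.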

A Summary of the Observational Studies Pertaining to the Cancer and COVID-19 Susceptibility and Severity
Many studies have investigated the clinical characteristics of COVID-19 patients with cancer (Table 1). However, most of them did not compare the susceptibility and severity of COVID-19 between patients with and without cancer. Only three studies provided direct evidence that patients with cancer appear more vulnerable to SARS-CoV-2. Although a multivariable model was applied to adjust for common confounding factors when evaluating the risk for severe COVID-19 manifestations, many potential unmeasured confounders may still exist and affect the reliability of the results (Dai et al., 2020; Liang et al., 2020; Mehta et al., 2020). Hence, we calculated the E-value to estimate the possibility that unmeasured confounders biased the results. The E-values for the observed association between cancer and COVID-19-induced mortality, rates of ICU admission, severe or critical symptoms, and deterioration were 4.11, 2.76, 2.73, and 4.18, respectively, which were not large enough to eliminate the possibility that unmeasured confounders biased the inferred causality.

Identification of Instrumental Variables for Cancer and the Effects of These Instrument SNPs on COVID-19 Susceptibility and Severity
A total of 258 GWASs associated with the effect of SNPs on 10 cancers were included for screening instrumental variables (Figure 1; Supplementary Tables S1-S2). A total of 9,536,812 participants across distinct regions were included in these studies. We obtained access to 2 GWASs involved in COVID-19 susceptibility and severity. For COVID-19 susceptibility, we used GWAS data contributed by 23 independent studies, comprising 24,057 participants diagnosed with COVID-19 and 218,062 control participants with laboratory or self-reported negative COVID-19 test results. In addition, the GWAS data of 4 cohorts, including 269 COVID-19 patients with severe respiratory symptoms and 688 nonhospitalized COVID-19 participants, were analyzed to identify SNPs strongly associated with COVID-19 severity (Supplementary Table S3). Once these data were collected, we obtained the effects of cancer-associated SNPs on COVID-19 susceptibility and severity and performed MR analysis. The SNPs used in the MR analysis and their effects are listed in Supplementary Table S4.

Study of the Causal Association Between the Genetic Predisposition to Cancer and COVID-19 Susceptibility
Among all of the cancer types we analyzed, only a genetic predisposition to colorectal cancer was causally associated with COVID-19 susceptibility (Figure 2A; Table 2). Specifically, an increased genetic predisposition to colorectal cancer could decrease the risk of succumbing to COVID-19 (β = −0.053, se = 0.019, p = 0.005). No obvious heterogeneity was detected by Q statistics (Q = 91.01, p = 0.833; Supplementary Table S5) or by asymmetry of the funnel plot. In addition, the results of the pleiotropy test indicated that the instrument SNPs were unlikely to affect COVID-19 susceptibility via pathways other than colorectal cancer susceptibility (p = 0.680; Supplementary Table S6). Leave-one-out analysis identified no outlier instrument SNP whose elimination altered the statistical significance, which further supported the reliability of the MR results (Supplementary Table S7).

The Causal Association Between Genetic Susceptibility to Cancer and COVID-19 Severity
Then, we further analyzed the causal association between genetic predisposition to cancers and COVID-19 severity. Among all of the cancer types we analyzed, only genetic susceptibility to lung adenocarcinoma was causally associated with COVID-19 severity (Figure 2B; Table 2). Specifically, an increased genetic predisposition to lung adenocarcinoma could increase the risk of succumbing to severe COVID-19 (Figures 2C,D; OR = 2.93, β = 1.074, se = 0.411, p = 0.009). No obvious heterogeneity was detected by Q statistics (Q = 17.29, p = 0.24; Supplementary Table S5) or by asymmetry of the funnel plot (Figure 2F). In addition, the results of the pleiotropy test indicated that the instrument SNPs were unlikely to affect COVID-19 severity via pathways other than lung adenocarcinoma susceptibility (p = 0.96; Supplementary Table S6). Leave-one-out analysis identified no outlier instrument SNP whose elimination altered the statistical significance, which further supported the reliability of the MR results (Figure 2E; Supplementary Table S7). The causal relationship between genetic susceptibility to lung adenocarcinoma and COVID-19 severity was also confirmed in the "weighted median" model, which allows stronger SNPs to contribute more toward the estimate (β = 1.016, se = 0.505, p = 0.044). To test whether heterogeneity was derived from differences among the cohorts, we performed subgroup analysis using only the SNPs in the study by McKay et al. (2017), which is the largest lung adenocarcinoma GWAS cohort available in the GWAS Catalog. As expected, the result of this subgroup analysis still demonstrated that the genetic predisposition to lung adenocarcinoma was causally related to an increased risk for severe COVID-19 (IVW model, β = 1.299, se = 0.410, p = 0.002) without any obvious heterogeneity or horizontal pleiotropy (Q = 0.42, p = 0.88 for heterogeneity; p = 0.79 for pleiotropy).
To further support our conclusion, we performed MR analysis based on the hospitalized and non-hospitalized COVID-19 cohorts and reached a similar conclusion (β = 0.28, se = 0.15, p = 0.05).
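The reported effect sizes can be cross-checked by converting a log-odds estimate and its standard error into an odds ratio, a 95% confidence interval, and a two-sided p-value. The sketch below assumes a normal approximation for the test statistic; the function name is hypothetical, and the confidence interval it prints is an illustration rather than a value reported in the study.

```python
import math


def summarize_log_odds(beta: float, se: float, z_crit: float = 1.959964):
    """Return (odds ratio, (CI lower, CI upper), two-sided p-value)."""
    odds_ratio = math.exp(beta)
    ci = (math.exp(beta - z_crit * se), math.exp(beta + z_crit * se))
    z = beta / se
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return odds_ratio, ci, p_value


if __name__ == "__main__":
    # IVW estimate for lung adenocarcinoma and COVID-19 severity from the text.
    print(summarize_log_odds(1.074, 0.411))
```

Applied to β = 1.074 and se = 0.411, the odds ratio rounds to the reported 2.93 and the p-value to approximately 0.009, which is consistent with the IVW result above.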

Expression Pattern and Prognostic Implications of Genes Involved in COVID-19 in Cancers
Furthermore, we investigated the mutation landscape of genes involved in COVID-19 in cancers. Among these genes, 2 genes (APOE and SLC6A20) were associated with COVID-19 susceptibility, 6 genes (LZTFL1, CCR9, FYCO1, CXCR6, XCR1, and ABO) were related to COVID-19 severity, 4 genes (ERAP2, BRF2, TMEM181, and ALOXE3) were associated with COVID-19 mortality, and 5 genes (ACE2, ANPEP, DPP4, ENPEP, and TMPRSS2) were associated with SARS-CoV-2 receptors (Supplementary Table S8). In total, 20% (1,062/5,292) of patients harbored intratumoral mutations of at least one COVID-19-related gene. BRF2 was the most frequently mutated gene, and its alterations were mainly amplifications (Figure 3). Mutations in most genes associated with COVID-19 severity were deletions and missense mutations. Patients harboring mutations in COVID-19-associated genes had increased disease-specific survival (p = 0.02) (Figure 4A). Mutual exclusivity analysis revealed broad co-occurrence among COVID-19-associated genes (Supplementary Table S9). Mutations occurring simultaneously in LZTFL1 and CCR9 were the most frequent across cancers. In addition, we explored whether the mutation status of other genes was related to alterations of COVID-19-associated genes. The results showed that TTN, TP53, and MUC16 were the top 3 genes that were frequently mutated along with the alteration of COVID-19-associated genes, suggesting potential crosstalk among these genes in cancer development (Figure 4B).
Next, we investigated the transcript expression pattern of these genes across distinct cancers. Differential expression of COVID-19-associated genes was universally observed between cancer and normal tissues (Figure 4C). Notably, the APOE transcript was upregulated in six cancers, while the expression of ANPEP was downregulated in seven cancers, suggesting that these genes may have oncogenic and antitumor effects, respectively. To further evaluate the correlation between the expression level of these genes and the patient prognosis, we divided patients into two groups according to the median transcript level and conducted survival analysis. Many genes were associated with patient overall and disease-free survival (Figure 4C). For example, overexpression of DPP4 was associated with prolonged survival of patients with kidney, lung, or liver cancers. In contrast, overexpression of APOE was an unfavorable factor for tumors in the colon, liver, and pancreas.

DISCUSSION
During the COVID-19 outbreak, rational allocation of medical resources became urgent as medical and nursing resources were extremely lacking (Lee et al., 2020). Identifying vulnerable populations susceptible to COVID-19 and individuals who may suffer from severe manifestations contributes significantly to optimizing the allocation of medical resources. Questions have been raised about the biological vulnerability of patients with cancer to COVID-19, and several preliminary cross-sectional studies have also provided evidence to support this assumption. However, the causality of the association could not be confirmed due to numerous methodological biases and unmeasured confounders (Dai et al., 2020;Moujaess et al., 2020). The screening procedures for COVID-19 were more broadly and frequently performed in hospitalized patients, such as patients with cancer, than in the general population. In this context, the detection rate of COVID-19 is likely greater in patients with cancerous diseases than in the nonhospitalized population. Hence, the incidence of COVID-19 may seem to be increased in the cancerous population when the detection rate is confused with the actual incidence.
MR is an effective tool to assess the causal relationship between exposure factors and the outcome (Emdin et al., 2017). Here, we performed two-sample MR to evaluate whether a genetic predisposition to cancer is causally associated with COVID-19 susceptibility and severity based on public GWAS data. We demonstrated that a genetic predisposition to lung adenocarcinoma, as opposed to lung squamous cancer, is causally related to COVID-19 severity but not susceptibility (OR = 2.93, β = 1.074, se = 0.411, p = 0.009), suggesting that increased surveillance for severe COVID-19-associated complications should be conducted among hospitalized patients with lung adenocarcinoma. To avoid bias from weak instruments, we calculated the F-statistic for the SNPs used in the MR analysis. The results showed that all SNPs qualified, with an F-statistic larger than 10 (Supplementary Table S10). Interestingly, we found that a genetic predisposition to colorectal cancer was negatively associated with COVID-19 susceptibility (β = −0.053, se = 0.019, p = 0.005). However, such a small β effect may not have valuable clinical implications. Moreover, the β value mainly reflects the causal association between a genetic predisposition to colorectal cancer and COVID-19 susceptibility, whereas many hospitalized patients with colorectal cancer have received either cytotoxic chemotherapy or surgery, which theoretically increases the risk of succumbing to COVID-19.
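The text does not state how the F-statistic was computed. A commonly used per-SNP approximation from summary statistics is the squared ratio of the SNP-exposure effect to its standard error, and the sketch below assumes that form; the function names and the example numbers are hypothetical.

```python
def f_statistic(beta_exposure: float, se_exposure: float) -> float:
    """Approximate per-SNP F-statistic: (beta / se) squared.

    This approximation is an assumption of the sketch, not necessarily
    the formula used for Supplementary Table S10.
    """
    if se_exposure <= 0:
        raise ValueError("standard error must be positive")
    return (beta_exposure / se_exposure) ** 2


def is_strong_instrument(beta_exposure: float, se_exposure: float,
                         threshold: float = 10.0) -> bool:
    """Apply the conventional F > 10 rule for instrument strength."""
    return f_statistic(beta_exposure, se_exposure) > threshold
```

Under this approximation, any SNP that passes the genome-wide significance threshold of p < 5e-8 has an F-statistic of roughly 30 or more, so the F > 10 rule is expected to hold for instruments selected at that threshold.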
We also evaluated the mutation landscape, expression pattern, and prognostic implications of genes involved with COVID-19 in distinct cancers to explore whether novel targeted treatment could be applied for patients with both cancer and COVID-19. For instance, ERAP2 is a risk factor for COVID-19-related death, and it is upregulated in lung squamous cancer and negatively associated with the patient prognosis. Therefore, ERAP2 might be a potential treatment target to both relieve COVID-19 severity and restrain cancer progression.
To the best of our knowledge, this study is the first MR analysis to report the causal association between a genetic predisposition to lung adenocarcinoma and an increased risk for severe COVID-19, suggesting that closer and more active surveillance should be applied to such patients in clinical practice. Specifically, for patients with lung adenocarcinoma, considering its causal association with COVID-19 severity, we suggest that chemotherapy or surgery could be postponed until they recover from COVID-19. Drugs that may exacerbate cytokine storms or lung injury should also be avoided. In addition, we explored potential molecular targets for the concurrent treatment of patients with both cancer and COVID-19, which may help optimize clinical decisions for such patients.
Certainly, the present study has some limitations. First, due to the inaccessibility of primary data, we could not adjust for some key confounding factors; for example, the ethnic composition of patients across cohorts was difficult to assess and adjust for. Statistical heterogeneity tests and subgroup analysis were performed to minimize such unavoidable bias. Second, although our data showed no causal association between a genetic predisposition to most cancers and COVID-19 susceptibility and severity, this should not be taken to mean that medical surveillance for all of these patients could be reduced to the same level as that applied to the general population. Third, the cancer state is a binary exposure that could introduce unexpected bias, which may attenuate the causal association between the genetic predisposition to lung adenocarcinoma and increased COVID-19 severity. For patients undergoing cytotoxic chemotherapy or who have undergone surgery, intensified management and surveillance for COVID-19 infection and deterioration remain important.

CONCLUSION
The management of cancer patients with COVID-19 is a knotty problem whose resolution requires wisdom and a joint effort by researchers around the world. Our results highlighted the importance of strengthening medical surveillance for COVID-19 deterioration in patients with lung adenocarcinoma by showing their genetic causal association. For these patients, a delay of anticancer treatment, such as chemotherapy and surgery, should be considered.