ORIGINAL RESEARCH article

Front. Genet., 17 June 2021
Sec. Applied Genetic Epidemiology

Genome-Wide Association Analyses Identify Variants in IRF4 Associated With Acute Myeloid Leukemia and Myelodysplastic Syndrome Susceptibility

Junke Wang1†, Alyssa I. Clay-Gilmour2,3†, Ezgi Karaesmen1, Abbas Rizvi1, Qianqian Zhu4, Li Yan4, Leah Preus1, Song Liu4, Yiwen Wang1, Elizabeth Griffiths5, Daniel O. Stram6, Loreall Pooler6, Xin Sheng6, Christopher Haiman6, David Van Den Berg6, Amy Webb7, Guy Brock7, Stephen Spellman8, Marcelo Pasquini9, Philip McCarthy10, James Allan11, Friedrich Stölzel12, Kenan Onel13, Theresa Hahn5‡ and Lara E. Sucheston-Campbell1,14*‡
  • 1College of Pharmacy, The Ohio State University, Columbus, OH, United States
  • 2Department of Epidemiology, Mayo Clinic, Rochester, MN, United States
  • 3Department of Epidemiology & Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  • 4Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, United States
  • 5Department of Medicine, Roswell Park Comprehensive Cancer Center, Buffalo, NY, United States
  • 6Department of Preventive Medicine, University of Southern California, Los Angeles, CA, United States
  • 7Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
  • 8Center for International Blood and Marrow Transplant Research, Minneapolis, MN, United States
  • 9Center for International Blood and Marrow Transplant Research, Medical College of Wisconsin, Milwaukee, WI, United States
  • 10Department of Medicine, Roswell Park Comprehensive Cancer Center, Buffalo, NY, United States
  • 11Northern Institute for Cancer Research, Newcastle University, Newcastle upon Tyne, United Kingdom
  • 12Department of Internal Medicine I, University Hospital Carl Gustav Carus Dresden, Technical University Dresden, Dresden, Germany
  • 13Department of Pediatrics, Mount Sinai Medical Center, Miami Beach, NY, United States
  • 14College of Veterinary Medicine, The Ohio State University, Columbus, OH, United States

The role of common genetic variation in susceptibility to acute myeloid leukemia (AML), and myelodysplastic syndrome (MDS), a group of rare clonal hematologic disorders characterized by dysplastic hematopoiesis and high mortality, remains unclear. We performed AML and MDS genome-wide association studies (GWAS) in the DISCOVeRY-BMT cohorts (2,309 cases and 2,814 controls). Association analysis based on subsets (ASSET) was used to conduct a summary statistics SNP-based analysis of MDS and AML subtypes. For each AML and MDS case and control we used PrediXcan to estimate the component of gene expression determined by their genetic profile and correlated this imputed gene expression level with risk of developing disease in a transcriptome-wide association study (TWAS). ASSET identified an increased risk for de novo AML and MDS (OR = 1.38, 95% CI, 1.26-1.51, Pmeta = 2.8 × 10–12) in patients carrying the T allele at rs12203592 in Interferon Regulatory Factor 4 (IRF4), a transcription factor which regulates myeloid and lymphoid hematopoietic differentiation. Our TWAS analyses showed that increased IRF4 gene expression is associated with increased risk of de novo AML and MDS (OR = 3.90, 95% CI, 2.36-6.44, Pmeta = 1.0 × 10–7). The identification of IRF4 by both GWAS and TWAS contributes valuable insight into the role of genetic variation in AML and MDS susceptibility.

Introduction

Genome-wide association studies (GWAS) have been successful at identifying risk loci in several hematologic malignancies, including acute myeloid leukemia (AML; Knight et al., 2009; Lv et al., 2017; Walker et al., 2019). Recently, genomic studies have identified common susceptibility loci among chronic lymphocytic leukemia (CLL), Hodgkin lymphoma (HL), and multiple myeloma, demonstrating shared genetic etiology among these B-cell malignancies (BCM; Bhattacharjee et al., 2012; Law et al., 2017; Went et al., 2018).

Given the evidence of a shared genetic basis across BCM and the underlying genetic predisposition for AML and myelodysplastic syndromes (MDS) observed in family, epidemiological, and genetic association studies (Goldin et al., 2012; Gao et al., 2014; Churpek, 2017; Walker et al., 2019), we hypothesized that germline variants may contribute to both AML and MDS development. Using the DISCOVeRY-BMT (Determining the Influence of Susceptibility COnveying Variants Related to one-Year mortality after BMT) study population (2,309 cases and 2,814 controls), we performed AML and MDS genome-wide association studies in European Americans and used these data sets to inform our hypothesis. To address the disease heterogeneity within and across our data we used a validated meta-analytic association test based on subsets (ASSET; Bhattacharjee et al., 2012). ASSET tests the association of SNPs with all possible AML and MDS subtypes and identifies the strongest genetic association signal. To systematically test the association of genetically predicted gene expression with disease risk, we performed a transcriptome-wide association study (TWAS; Gamazon et al., 2015; Gusev et al., 2016). This allows a preliminary investigation into the role of non-coding risk loci, which might be regulatory in nature and impact expression of nearby genes. The TWAS statistical approach, PrediXcan (Gamazon et al., 2015), was used to impute tissue-specific gene expression from a publicly available whole blood transcriptome panel into our AML and MDS cases and controls. The predicted gene expression levels were then tested for association with AML and MDS. The use of both a GWAS and a TWAS in the DISCOVeRY-BMT study population allowed us to identify AML and MDS associations with IRF4, a transcription factor which regulates myeloid and lymphoid hematopoietic differentiation and has been previously identified in GWAS of BCM (Law et al., 2017).

Materials and Methods

Study Design and Population

Our study used a nested case-control design derived from the parent study, DISCOVeRY-BMT (Determining the Influence of Susceptibility COnveying Variants Related to 1-Year Mortality after unrelated donor Blood and Marrow Transplant) (Hahn et al., 2015). The DISCOVeRY-BMT cohort was compiled from 151 centers around the world through the Center for International Blood and Marrow Transplant Research (CIBMTR). Briefly, the parent study was designed to find common and rare germline genetic variation associated with survival after an unrelated donor blood and marrow transplant (URD-BMT). DISCOVeRY-BMT consists of two cohorts of acute lymphoblastic leukemia (ALL), AML, and MDS patients and their 10/10 human leukocyte antigen (HLA)-matched unrelated healthy donors. Cohort 1 was collected between 2000 and 2008, and Cohort 2 between 2009 and 2011.

Acute myeloid leukemia and MDS patients from the DISCOVeRY-BMT patient cohorts were used as cases, and all the unrelated donors from both cohorts were used as controls. AML subtypes included de novo AML with normal cytogenetics, de novo AML with abnormal cytogenetics, and therapy-related AML (t-AML). De novo AML patients had no antecedent MDS and no chemotherapy or radiation for prior cancers. MDS subtypes included de novo MDS, defined as patients without antecedent chemotherapy or radiation for prior cancers, and therapy-related MDS (t-MDS). Patient cytogenetic subtypes were available; however, due to limited sample sizes for each cytogenetic risk group, we consider here only broad categories. Controls were unrelated, healthy donors aged 18–61 years who passed a comprehensive medical exam and were disease-free at the time of donation. All patients and donors provided written informed consent for their clinical data to be used for research purposes and were not compensated for their participation.

Genotyping, Imputation, and Quality Control

Genotyping and quality control in the DISCOVeRY-BMT cohort have previously been described in detail (Hahn et al., 2015; Clay-Gilmour et al., 2017; Karaesmen et al., 2017; Zhu et al., 2018). Briefly, samples were assigned to plates to ensure an even distribution of patient characteristics, and genotyping was performed at the University of Southern California Genomics Facility using the Illumina Omni-Express BeadChip® containing approximately 733,000 single nucleotide polymorphisms (SNPs; Yan et al., 2012). SNPs were removed if the missing rate was >2.0%, if the minor allele frequency (MAF) was <1%, or for violation of Hardy-Weinberg equilibrium proportions (P < 1.0 × 10–4).

Problematic samples were removed based on the SNP missing rate, reported-genotyped sex mismatch, abnormal heterozygosity, cryptic relatedness, and population outliers. Population stratification was assessed via principal components analysis using Eigenstrat software (Price et al., 2006) and a genomic inflation factor (λ) was calculated for each cohort. Following SNP quality control, 637,655 and 632,823 SNPs from the OmniExpress BeadChip in Cohorts 1 and 2, respectively, were available for imputation. SNP imputation was performed using the Haplotype Reference Consortium panel, hg19/build 37, via the Michigan Imputation server (Das et al., 2016; McCarthy et al., 2016). Variants with imputation quality scores <0.8 or minor allele frequency (MAF) < 0.005 were removed, yielding almost 9 million high quality SNPs available for analysis in each cohort.
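The pre-imputation SNP filters described above (missing rate, MAF, and Hardy-Weinberg proportions) can be illustrated with a minimal sketch. This is not the pipeline used in the study, which relied on QCTOOL and Plink; the `SnpCounts` container and function names are hypothetical, and the 1-df chi-square test is an assumption, since the text does not state which Hardy-Weinberg test was applied.

```python
import math
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Genotype counts for one biallelic SNP (hypothetical container)."""
    n_aa: int       # homozygous for the first allele
    n_ab: int       # heterozygous
    n_bb: int       # homozygous for the second allele
    n_missing: int  # samples without a genotype call


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(snp: SnpCounts, max_missing: float = 0.02,
                  min_maf: float = 0.01, min_hwe_p: float = 1e-4) -> bool:
    """Apply the three SNP-level thresholds quoted in the text."""
    called = snp.n_aa + snp.n_ab + snp.n_bb
    if called == 0:
        return False
    missing_rate = snp.n_missing / (called + snp.n_missing)
    p = (2 * snp.n_aa + snp.n_ab) / (2 * called)
    maf = min(p, 1.0 - p)
    if missing_rate > max_missing or maf < min_maf:
        return False
    return hwe_chi2_pvalue(snp.n_aa, snp.n_ab, snp.n_bb) >= min_hwe_p
```

A SNP is retained only if it passes all three checks; the MAF check runs before the Hardy-Weinberg test so that monomorphic SNPs never reach the chi-square computation.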

Statistical Analysis

Genome-Wide SNP Associations With AML and MDS

Quality control and statistical analyses were implemented using QCTOOL-v2, R 3.5.2 (Eggshell Igloo), Plink-v1.9, and SNPTEST-v2.5.4-beta3. Logistic regression models adjusted for age, sex, and three principal components were used to perform single SNP tests of association with de novo MDS, t-MDS, and AML by subtype (de novo AML with normal cytogenetics, de novo AML with abnormal cytogenetics, and t-AML) in each cohort. European American healthy donors were used as controls. SNP meta-analyses of cohorts 1 and 2 were performed by fitting random effects models (Lee et al., 2017). To identify the strongest association signal with AML and MDS we conducted a summary statistic SNP-based association analysis (ASSET) implemented in R statistical software (Bhattacharjee et al., 2012). ASSET tests each SNP for association with outcome using an exhaustive search across non-overlapping AML and MDS case groups while accounting for the multiple tests required by the subset search, as well as any shared controls between groups (Bhattacharjee et al., 2012).
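The idea of an exhaustive subset search can be sketched as follows. This is a simplified illustration, not the ASSET R package: it pools each subset with inverse-variance weights and keeps the subset with the largest absolute z statistic, and it deliberately omits the two features the text attributes to ASSET, namely the correction for the subset search and the adjustment for shared controls. Function names and inputs are hypothetical.

```python
import math
from itertools import combinations


def fixed_effect_z(betas, ses):
    """Inverse-variance pooled z statistic for a set of log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled / pooled_se


def best_subset(betas, ses):
    """Return (indices, z) for the non-empty subset with the largest |z|.

    Simplification: subsets are treated as independent and no
    multiple-testing adjustment is made for the search itself.
    """
    n = len(betas)
    best = None
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            z = fixed_effect_z([betas[i] for i in subset],
                               [ses[i] for i in subset])
            if best is None or abs(z) > abs(best[1]):
                best = (subset, z)
    return best


# Hypothetical per-subtype log odds ratios and standard errors.
subset, z = best_subset([0.3, 0.3, 0.0], [0.1, 0.1, 0.1])
```

With these made-up inputs the search selects the first two groups and drops the third, whose null effect would dilute the pooled signal.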

Heritability Estimation of AML and MDS

We calculated heritability of AML and MDS combined and by independent subtypes as the proportion of phenotypic variance explained by all common genotyped SNPs, using the genome-based restricted maximum likelihood method performed with the Genome-wide Complex Trait Analysis (GCTA) software (Yang et al., 2011; Deary et al., 2012; Lee et al., 2012). We report heritability on the observed scale due to genome-wide genotyped variants as well as heritability on the liability scale assuming AML and MDS disease prevalence of 0.0001 (Lee et al., 2013; Lu et al., 2014; Mitchell et al., 2015).
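For orientation, the conversion from the observed scale to the liability scale can be sketched with the standard transformation for ascertained case-control samples. That GCTA applies exactly this form to these data is an assumption here, as is the use of the overall case fraction (2,309 of 5,123 samples); the function name is hypothetical and the sketch is not expected to reproduce the reported estimates exactly.

```python
from statistics import NormalDist


def observed_to_liability(h2_obs: float, prevalence: float,
                          case_fraction: float) -> float:
    """Convert observed-scale heritability to the liability scale.

    h2_liab = h2_obs * [K(1-K)]^2 / (z^2 * P(1-P)), where K is the
    population prevalence, P the proportion of cases in the sample, and
    z the standard normal density at the liability threshold.
    """
    normal = NormalDist()
    threshold = normal.inv_cdf(1.0 - prevalence)
    z = normal.pdf(threshold)
    k, p = prevalence, case_fraction
    return h2_obs * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Inputs taken from the text; the case fraction is an assumption.
h2_liability = observed_to_liability(0.46, 0.0001, 2309 / 5123)
```

Because cases are heavily over-represented relative to a prevalence of 0.0001, the liability-scale value is much smaller than the observed-scale value.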

Transcriptome-Wide Association Study of AML and MDS

To prioritize GWAS findings and identify expression quantitative trait loci (eQTL)-linked genes, we carried out gene expression tests of association with de novo AML and MDS using PrediXcan (Gamazon et al., 2015). This method leverages the well-described functional regulatory enrichment in genetic variants relatively close to the gene body (i.e., cis-regulatory variation) to inform models relating SNPs to gene expression levels in data with both gene expression and SNP genotypes available. Robust prediction models are then used to estimate the effect of cis-regulatory variation on gene expression levels. Using imputation, the cis-regulatory effects on gene expression from these models can be predicted in any study with genotype measurements, even if measured gene expression is not available. Thus, we imputed the cis-regulatory component of gene expression into our data for each individual using models trained on the whole blood transcriptome panel (n = 922) from the Depression Genes and Networks study (DGN; Battle et al., 2014), yielding expression levels of 11,200 genes for each case and control. The resulting estimated gene expression levels were then used to perform gene-based tests of differential expression between AML and MDS cases and controls, adjusted for age and sex. A fixed effects model with inverse variance weighting using the R package metafor was used for meta-analysis of cohorts 1 and 2. A Bonferroni-corrected transcriptome-wide significance threshold was set at P < 4.5 × 10–6.
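The imputation step described above amounts to a weighted sum of an individual's allele dosages, using SNP weights learned in the reference panel. The sketch below assumes this additive form; the SNP identifiers and weights are invented for illustration and are not taken from the DGN models, and the function is not part of the PrediXcan software.

```python
def predict_expression(weights: dict, dosages: dict) -> float:
    """Genetically predicted expression of one gene for one individual.

    weights: SNP id -> effect weight from a reference prediction model.
    dosages: SNP id -> effect-allele dosage (0 to 2) in the study sample.
    SNPs missing from the genotype data contribute nothing to the sum.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


# Hypothetical model for a single gene and one individual's genotypes.
gene_weights = {"snp_1": 0.5, "snp_2": -0.2, "snp_3": 0.1}
person_dosages = {"snp_1": 2.0, "snp_2": 1.0}
predicted = predict_expression(gene_weights, person_dosages)
```

The predicted values for all individuals would then enter a logistic regression of case status on predicted expression, adjusted for age and sex as described in the text.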

Functional Annotation of Genetic Variation Associated With AML and MDS

To better understand the potential function of the variants identified by GWAS and ASSET analyses, we annotated significant SNPs using publicly available data. eQTLGen, a consortium analysis of the relationship of SNPs to gene expression in 30,912 whole blood samples, was used to determine whether significant and suggestive SNPs (P < 5 × 10–6) were whole blood cis-eQTLs, defined as allele-specific associations with gene expression (Võsa et al., 2018). The Genotype-Tissue Expression project (GTEx) was used to test for significant eQTLs in >70 additional tissues (Carithers et al., 2015). We also tested for a difference in the log fold expression of IRF4 between an independent set of AML samples from The Cancer Genome Atlas (N = 170) and GTEx whole blood samples (N = 70). AML and MDS SNP associations were also placed in the context of previous GWAS using Phenoscanner, a comprehensive variant-phenotype database of large GWAS, which includes results from the NHGRI-EBI GWAS catalog, the UK Biobank, the NIH Genome-Wide Repository of Associations between SNPs and Phenotypes, and publicly available summary statistics from more than 150 published genome association studies. Results were filtered at P < 5 × 10–8 and the R statistical software package phenoscanner was used to download all data for our significant variants (Staley et al., 2016). Chromatin state data based on the 25-state Imputation Based Chromatin State Model across 24 blood, T-cell, HSC and B-cell lines were downloaded from the Roadmap Epigenomics project (Roadmap Epigenomics Consortium et al., 2015). Figures including chromatin state information and results from previous GWAS were constructed using the R Bioconductor package Gviz (Mifsud et al., 2015; Cairns et al., 2016; Spurrell et al., 2016).

Lastly, we sought to identify promoter interaction regions (PIRs), defined as significant interactions between gene promoters and distal genomic regions. Variants in PIRs can be connected to potential gene targets and thus can impact gene function (Spurrell et al., 2016). Briefly, Hi-C libraries enriched for promoter sequences are generated with biotinylated RNA baits complementary to the ends of promoter-containing restriction fragments. Promoter fragments become bait for the pieces of the genome with which they frequently interact, allowing regulatory elements and enhancers to be pulled down and sequenced. Statistical tests of bait-target pairs are done to define significant PIRs and their targets (Cairns et al., 2016; Schofield et al., 2016; Schoenfelder et al., 2018). To identify the genomic features with which our significant SNPs might be interacting via chromatin looping, we used publicly available Promoter Capture Hi-C (PCHi-C) data on a lymphoblastoid cell line (LCL), GM12878, and two ex vivo CD34+ hematopoietic progenitor cell lines (primary hematopoietic G-CSF mobilized stem cells and hematopoietic stem cells) (Schofield et al., 2016). We integrated our SNP data with the PCHi-C cell line data and visualized these interactions using circos plots (Yu et al., 2018).

Results

DISCOVeRY-BMT Cases and Controls

Results of quality control have been described elsewhere (Karaesmen et al., 2017). Following quality control, the DISCOVeRY-BMT cohorts include 1,769 AML and 540 MDS patients who received URD-BMT as treatment and 2,814 unrelated donors as controls (Supplementary Table 1). The majority of AML cases are de novo (N = 1,618), of which 543 have normal cytogenetics; 6% of patients had therapy-related AML (t-AML). The most frequently reported previous cancers in patients with t-AML were breast (N = 51), non-Hodgkin Lymphoma (NHL; N = 23), HL (N = 14), Sarcoma (N = 12), Gynecologic (N = 8), Acute Lymphoblastic Leukemia (N = 6), and Testicular (N = 6). Prior therapies for these patients were approximately equally divided between single agent chemotherapy and combined modality chemotherapy plus radiation. Almost half of MDS patients had Refractory Anemia with Excess Blasts (RAEB)-1 or RAEB-2. Of patients with t-MDS (∼18% of MDS patients), 65% had antecedent hematologic cancers or disorders. The most frequently reported antecedent cancers in MDS patients were NHL (N = 27), breast (N = 15), Acute Lymphoblastic Leukemia (N = 8), HL (N = 8), AML (N = 8), Sarcoma (N = 6), and CLL (N = 5) (Supplementary Table 1). Overall, the distribution of antecedent cancers differed significantly between t-MDS and t-AML, with almost 2/3 of t-MDS and 1/3 of t-AML patients diagnosed with a prior hematologic cancer.

SNP Associations With AML and MDS

Genome-wide association studies of AML by subtype (de novo abnormal cytogenetics, de novo normal cytogenetics and t-AML) and MDS (de novo and t-MDS) are shown in Supplementary Figure 1. No population stratification was observed in the PCA, and the genomic inflation factors for cohorts 1 and 2 were 1.04 and 1.03, respectively. Quantile-quantile plots of SNPs after post-imputation quality control (MAF > 0.005, imputation quality scores > 0.8) are shown in Supplementary Figure 3. To identify loci that show association with AML and MDS we used ASSET. For SNPs to be considered, we used previously defined criteria (Law et al., 2017): ASSET SNP associations were required at P ≤ 5.0 × 10–8 with significant individual one-sided subset tests (P < 0.01); the variant association could not be driven by a single disease; and the variant could not be both positively and negatively associated in different cohorts of the same disease. In the ASSET GWAS analyses we identified a novel typed SNP associated with AML and MDS on Chromosome 6 (Figure 1). The T allele at rs12203592, a variant in intron 4 of Interferon Regulatory Factor 4 (IRF4), conferred increased risk of de novo abnormal cytogenetic AML, de novo normal cytogenetic AML, de novo MDS and t-MDS (OR = 1.38; 95% CI, 1.26–1.51, Pmeta = 2.8 × 10–12). t-AML showed no association with rs12203592. The effect allele frequency was 19% in de novo AML, MDS and t-MDS cases versus 14% in controls. ASSET analyses also identified another variant in modest linkage disequilibrium (LD), r2 = 0.7, with rs12203592 in the regulatory region of IRF4; the A allele at rs62389423 showed a putative association with de novo AML and MDS (OR = 1.36; 95% CI, 1.21-1.52, Pmeta = 1.2 × 10–7) (Figure 2A). We identified one significant association in the subtype GWAS which was disease specific. The C allele at rs78898975 in TATA-box binding protein associated factor 2 (TAF2) was associated with an increased risk of t-MDS (ORmeta = 5.87, 95% CI = 3.20, 10.76, Pmeta = 9.9 × 10–9) but not de novo MDS (OR = 1.8, 95% CI = 0.81, 1.45, Pmeta = 0.20) (Supplementary Figure 1). The effect allele frequency was 7% in t-MDS, 2% in de novo MDS and 1.5% in controls.
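As a reading aid for the odds ratios and confidence intervals reported above, the sketch below recovers an approximate standard error, z statistic, and two-sided P-value from an odds ratio and its 95% CI, assuming a symmetric Wald interval on the log odds scale. That assumption is ours; ASSET computes its P-values with an adjustment for the subset search, so values obtained this way are only a rough cross-check. The function name is hypothetical.

```python
import math


def wald_summary(odds_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964):
    """Back out (SE, z, two-sided P) from an OR and its 95% Wald CI."""
    log_or = math.log(odds_ratio)
    # A symmetric 95% CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# rs12203592 estimate as reported in the text.
se, z, p_value = wald_summary(1.38, 1.26, 1.51)
```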

Figure 1. ASSET analysis and associations by AML and MDS subgroup. Forest plot of the odds ratios (OR) for the association between rs12203592 in IRF4 and MDS and AML subtypes. The variant resides on chromosome 6, outside the major histocompatibility complex region. Studies were weighted by the inverse of the variance of the log (OR). The solid gray vertical line is positioned at the null value (OR = 1); values to the right represent risk-increasing odds ratios. Horizontal lines show the 95% CI and the box is the OR point estimate for each case-control subset, with its area proportional to the weight of the patient group. The diamond is the overall effect estimated by ASSET, with the 95% CI given by its width.

Figure 2. IRF4 region with AML and MDS associated SNP p-values annotated with previous GWAS and Roadmap Epigenome Chromatin States. (A) ASSET analysis AML and MDS SNP associations in the IRF4 region. The x-axis is the chromosome position in kilobase pairs and y-axis shows the –log10 (p-values) for de novo AML and MDS susceptibility. The associated SNPs in the IRF4 region, rs12203592 and rs62389423, are highlighted with sky blue lines drawn through the point to show the relationship of the variant to GWAS hits and Roadmap Epigenome data (2C). rs12203592 and rs62389423 show moderate linkage disequilibrium (r2 = 0.7); rs62389423 and rs62389424 are almost perfectly correlated (r2 = 0.95). (B) Previously reported GWAS SNPs in the IRF4 region. Phenotypes are color coded and all variants are associated at P < 5 × 10–8. (C) Genes in the region annotated with the chromatin-state segmentation track (ChromHMM) from Roadmap Epigenome data for all blood, T-cell, HSC and B-cells. The cell line numbers shown down the left side correspond to specific epigenome road map cell lines. 
E029: Primary monocytes from peripheral blood; E030: Primary neutrophils from peripheral blood; E031: Primary B cells from cord blood; E032: Primary B cells from peripheral blood; E033: Primary T cells from cord blood; E034: Primary T cells from blood; E035: Primary hematopoietic stem cells; E036: Primary hematopoietic stem cells short term culture; E037: Primary T helper memory cells from peripheral blood 2; E038: Primary T helper naïve cells from peripheral blood; E039: Primary T helper naïve cells from peripheral blood; E040: Primary T helper memory cells from peripheral blood 1; E041: Primary T helper cells PMA-Ionomycin stimulated; E042: Primary T helper 17 cells PMA-Ionomycin stimulated; E043: Primary T helper cells from peripheral blood; E044: Primary T regulatory cells from peripheral blood; E045: Primary T cells effector/memory enriched from peripheral blood; E046: Primary Natural Killer cells from peripheral blood; E047: Primary T CD8 naïve cells from peripheral blood; E048: Primary T CD8 memory cells from peripheral blood; E050: Primary hematopoietic stem cells G-CSF mobilized Female; E051: Primary hematopoietic stem cells G-CSF mobilized Male; E062: Primary Mononuclear Cells from Peripheral Blood; E116: Lymphoblastic Cell Line. The colors indicate chromatin states imputed by ChromHMM and shown in the key titled “Roadmap Chromatin State.”

A previous genome-wide association study of AML done in European American cases and controls reported a susceptibility variant in BICRA (rs75797233) (Walker et al., 2019). The variant was not significantly associated with AML risk in our meta-analyses (OR = 1.08, 95% CI = 0.78–1.37). However, their cohort did not include patients who received an allogeneic transplant as curative therapy, and the distribution of AML subtypes differed between the studies. DISCOVeRY-BMT AML cases included more cases with unfavorable cytogenetics (trisomy, monosomy) than the AML cases from Walker et al. (2019). The BICRA (rs75797233) SNP may be more likely to be associated with cytogenetic subtypes that comprise a prognostically less severe AML. In addition, the lower frequency (MAF = 0.02) of this imputed variant (info score > 0.8 in both cohorts) possibly reduced power to detect an effect.

Functional Annotation of SNP Associations With AML and MDS

Multiple GWAS of healthy individuals have shown associations between the T allele at rs12203592 and higher eosinophil counts, lighter skin color, lighter hair, less tanning ability, and increased freckling (Astle et al., 2016; Staley et al., 2016). GWAS have also identified associations between this allele and increased risk of childhood acute lymphoblastic leukemia in males, non-melanoma skin cancer, squamous cell carcinoma, cutaneous squamous cell carcinoma, basal cell carcinoma, actinic keratosis, and progressive supranuclear palsy (Figure 2B; Staley et al., 2016). Furthermore, analyses of multiple B-cell malignancies recently identified rs9392017, adjacent to IRF4, as a pleiotropic susceptibility variant associated with both CLL and HL (Di Bernardo et al., 2008; Mifsud et al., 2015; Schofield et al., 2016; Law et al., 2017). This SNP is approximately 40 kb away from rs12203592, although the two are not in LD (r2 = 0.01).

The rs12203592 risk allele is associated with increased expression of IRF4 in whole blood (P = 1.48 × 10–29; Võsa et al., 2018). IRF4 is a key transcription factor for lymphoid and myeloid hematopoiesis (Shaffer et al., 2008; Pratt et al., 2010; Havelange et al., 2011; Salaverria et al., 2011) and rs12203592 resides in a regulatory region across Blood, HSC, B-Cell and T-Cell lines (Figure 2C). RegulomeDB scores rank how likely a variant is to be a regulatory element, from 1a (most likely) to 7 (no data); the variant’s score of 2b indicates that it is likely to affect transcription factor binding (Boyle et al., 2012). While the HL and CLL pleiotropic variant rs9392017 was not a significant eQTL for IRF4 in whole blood, PCHi-C cell line data from both GM12878 and the ex vivo CD34+ hematopoietic progenitor cell lines show chromatin looping between rs9392017 and the regulatory region containing rs12203592 (Supplementary Figure 2).

The t-MDS associated C allele at rs78898975 is correlated with significantly lower expression of TAF2 (P = 1.95 × 10–13) and DEPTOR (P = 4.7 × 10–9) in whole blood (Võsa et al., 2018; Kamat et al., 2019).

Heritability Estimates of AML and MDS

The heritability of AML and MDS on the observed scale due to genotyped variants was 0.46 (standard error, SE = 0.07). Transforming this to the liability scale and assuming a disease prevalence of 0.0001 resulted in a heritability of 0.10 (SE = 0.02), which differed significantly from a heritability of zero (P = 2.0 × 10–16). De novo AML with normal cytogenetics and de novo MDS had similar heritability on the liability scale, at 9% (SE = 0.03, P = 1.9 × 10–3) and 14% (SE = 0.04, P = 1.4 × 10–4), respectively. Therapy-related AML and MDS were tested independently, and the estimated proportion of variance explained by all SNPs was 7% for t-AML and 4% for t-MDS; however, the SEs were high and the heritability did not differ significantly from zero.

Transcriptome-Wide Association Study—PrediXcan

Using PrediXcan (Gamazon et al., 2015) gene expression imputation models trained on the DGN data set, we identified one transcriptome-wide significant gene associated with de novo AML and MDS. Increased expression of IRF4 was associated with an increased risk for the development of de novo AML and MDS (OR = 3.90; 95% CI, 2.36-6.44, Pmeta = 1.0 × 10–7), consistent with our SNP-level findings (Figure 3). This association is consistent with gene expression analyses of TCGA AML samples compared to GTEx whole blood, which show that IRF4 expression is 1.75-fold greater in AML samples than in GTEx whole blood (P < 0.01). Whole blood transcriptome models also identified two additional genes with suggestive associations with de novo AML and MDS. Increased expression of AKT Serine/Threonine Kinase 1 (AKT1) at 14q32.33 was associated with an increased risk for the development of de novo AML and MDS (OR = 1.56; 95% CI, 1.25–1.95, Pmeta = 1.0 × 10–4) (Figure 4). Likewise, increased expression of Ras guanyl nucleotide-releasing protein 2 (RASGRP2) was associated with an increased risk for development of de novo AML and MDS (OR = 4.05; 95% CI, 1.84-8.91, Pmeta = 5 × 10–4) (Figure 4). Other suggestive gene associations (Pmeta < 5 × 10–4) were identified with limited or no evidence of biological plausibility for AML/MDS etiology (Supplementary Table 2).
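The cohort-level TWAS estimates were combined with a fixed effects, inverse-variance weighted model and judged against a Bonferroni threshold, as stated in the Methods. A minimal sketch of both calculations follows; it is not the metafor implementation, the function names are hypothetical, and the 0.05 family-wise level is an assumption. Dividing 0.05 by the 11,200 imputed genes gives roughly 4.5 × 10–6, which agrees with the stated threshold.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level controlling the family-wise error rate."""
    return alpha / n_tests


# Hypothetical log odds ratios from two cohorts with equal precision.
pooled_log_or, pooled_se = fixed_effect_meta([1.0, 2.0], [0.5, 0.5])
pooled_or = math.exp(pooled_log_or)

# Threshold for 11,200 gene-level tests at an assumed alpha of 0.05.
threshold = bonferroni_threshold(0.05, 11200)
```

With equal standard errors the pooled estimate is the simple average of the two cohort estimates, and its standard error shrinks by a factor of the square root of two.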

Figure 3. Manhattan plot of the de novo AML and MDS GWAS and TWAS. The plot represents the TWAS p-values (top) of each gene and de novo AML and MDS GWAS P-values (bottom) of each SNP included in the case-control association study. Significant and suggestive genes are highlighted in orange and labeled by their gene symbols. The orange horizontal line on the top represents the transcriptome-wide significance threshold of P = 4.5 × 10–6. The orange horizontal line on the bottom represents the genome-wide threshold of P = 5.0 × 10–8.

Figure 4. Regional plots of PrediXcan-TWAS and SNP associations with AML and MDS. Each box represents one of the PrediXcan-TWAS significant genes AKT1, IRF4 and RASGRP2 ± 0.5 megabases. The grey shaded bars represent the gene, where height is the gene expression association and width is the gene region in base pairs; the purple dots represent SNP associations with AML and MDS. The -log10 (P-values) are shown on the y-axis. Green and red lines denote the transcriptome-wide and genome-wide significant P-values, respectively. Results are filtered for imputation quality (rsq > 0.8) and heterogeneity of effect between cohorts.

Discussion

We performed the first large-scale AML and MDS GWAS in a URD-BMT population, providing evidence of novel pleiotropic risk loci associated with increased susceptibility to AML and MDS. The DISCOVeRY-BMT cohorts from the Center for International Blood and Marrow Transplant Research (CIBMTR) allow us to capture ∼99% of all AML and MDS patients who received a URD-BMT performed in the United States within the given time-frame (i.e., Cohort 1: 2000–2008 and Cohort 2: 2009–2011).

We identified an association between the T allele at rs12203592 (typed) in IRF4 and an increased risk for the development of de novo AML, de novo MDS and t-MDS in patients who had undergone URD-BMT compared to healthy donor controls. While therapy-related myeloid neoplasms have been shown to be genetically and etiologically similar to other high-risk myeloid neoplasms (McNerney et al., 2017), in our transplant population t-AML did not associate with this variant, while t-MDS did show evidence of association with rs12203592. We also identified a genome-wide significant t-MDS variant which was an eQTL for both the TAF2 and DEPTOR genes. Differences in the associations identified in t-MDS compared to t-AML could be due to factors related to underlying susceptibility. For example, the primary diagnosis was a hematologic malignancy in 61% of t-MDS cases but in only 31% of t-AML cases (Supplementary Table 1). However, this merits further exploration in larger cohorts. We also provide the first estimates of the heritability of AML and MDS, at between 9 and 14%, which are in line with other GWAS of cancer heritability on the liability scale, indicating that genetic variation contributes to AML and MDS susceptibility (Sampson et al., 2015).

The rs12203592 SNP has been shown to regulate IRF4 transcription by physical interaction with the IRF4 promoter through a chromatin loop (Visser et al., 2015). This SNP resides in an important position within NFkB motifs in multiple blood and immune cell lines, supporting the hypothesis that this SNP may modulate NFkB repression of IRF4 expression (Ward and Kellis, 2012; Kheradpour and Kellis, 2014). Furthermore, this SNP resides in a hematopoietic transcription factor that has been previously identified to harbor a hematological cancer susceptibility locus, rs9392017, which we show interacts with the region containing our susceptibility variant. These data add to the mounting evidence that there could be pleiotropic genes across multiple hematologic cancers (Slager et al., 2012; Mitchell et al., 2016; Law et al., 2017; Went et al., 2018; Vijayakrishnan et al., 2019). Imputed gene expression logistic regression models showed a significant association between higher predicted levels of IRF4 expression and the risk for development of de novo AML or MDS (Gamazon et al., 2015). We also see this in the TCGA data, where gene expression analyses show significantly higher expression of IRF4 in AML samples versus GTEx samples. Although IRF4 functions as a tumor suppressor gene in early B-cell development (Acquaviva et al., 2008), in multiple myeloma IRF4 is a well-established oncogene (Shaffer et al., 2008), with oncogenic implications extending to adult leukemias (De Silva et al., 2012) and lymphomas (Bisig et al., 2012), as well as pediatric leukemia. IRF4 overexpression is a hallmark of the activated B-cell-like type of diffuse large B-cell lymphoma and is associated with classical Hodgkin lymphoma (cHL), plasma cell myeloma and primary effusion lymphoma (Carbone et al., 2002). In a case-control study of childhood leukemia, IRF4 expression was higher in immature B-common acute lymphoblastic leukemia and T-cell leukemia, with the highest expression levels in pediatric AML patients, compared to controls (Adamaki et al., 2013). In addition to the CLL genetic susceptibility loci identified in IRF4, high expression levels of the gene have been shown to correlate with poor clinical prognosis (Allan et al., 2010).

Transcriptome-wide association studies can be a powerful tool to help prioritize potentially causal genes. It is, however, imperative to investigate the SNP and gene-expression associations in the context of the surrounding variants and genes to reduce the possibility of a false signal from co-regulation. Co-regulation can occur when there are multiple GWAS and TWAS hits due to linkage disequilibrium, and thus it becomes difficult to determine which locus is driving the phenotypic association. In our study, the SNP rs12203592 is a significant eQTL for only IRF4; this implies that the SNP and imputed gene expression signal we identified is not being driven by co-regulation of neighboring SNPs and/or genes. When considering non-imputed gene expression sets, eQTLGen (Võsa et al., 2018) corroborates this finding; rs12203592 is significantly associated only with increased expression of IRF4. In addition, the relationship of rs12203592 to IRF4 expression in blood seems tissue specific, as GTEx data across over 70 tissues show association with only lung tissue at P = 9.1 × 10⁻⁹. The specificity of rs12203592 to IRF4 expression in blood and the lack of correlation between IRF4 expression and other genes in DISCOVeRY-BMT give confidence that the observed ASSET association is the potential susceptibility locus in the region. The functional significance of variants in this gene in hematopoiesis and its previous recognition as a locus associated with the risk for development of other hematological malignancies further strengthen the evidence of an association of IRF4 with development of AML and MDS. A limitation of the TWAS methodology is the highly tissue-specific nature of gene expression and regulation. Our use of whole blood from GTEx to create relevant genotype weights may be less representative of gene expression in AML and MDS cases than in controls.

In addition to IRF4, we identified an association between the risk for development of de novo AML or MDS and higher expression of AKT1. AKT1 is an oncogene that plays a critical role in the PI3K/AKT pathway. AML patients frequently show increased AKT1 activity, providing leukemic cells with growth- and survival-promoting signals (Tang et al., 2015). Enhanced AKT activation has been implicated in the transformation from MDS to AML, and overexpression of AKT has been shown to induce leukemia in mice (Kharas et al., 2010).

We also identified AML and MDS gene expression associations with RASGRP2, which is expressed in various blood cell lineages and platelets, acts on the Ras-related protein Rap and functions in platelet adhesion. GWAS have identified significant variants in this gene associated with immature dendritic cells (% CD32+) and immature fraction of reticulocytes, a blood cell measurement shown to be elevated in patients with MDS versus controls (Astle et al., 2016). RASGRP2 expression has not been studied in relation to AML or MDS; however, RASGRP2/Rap1 signaling was recently shown to be functionally linked to the CD38-associated increase in CLL cell migration. The migration of CLL cells into lymphoid tissues because of proliferation induced by B-cell receptor activation is thought to be an important component of CLL pathogenesis (Mele et al., 2018). This finding has implications for the design of novel treatments for CD38+ hematological diseases (Mele et al., 2018). These data imply that replication of these gene expression associations with the development of AML and MDS is warranted.

This is the largest genome-wide AML and MDS susceptibility study to date. Despite our relatively large sample size, the complexity of cytogenetic risk groups in these diseases limits our analysis, particularly with respect to therapy-related AML and MDS. The DISCOVeRY-BMT study population is composed mostly of European American non-Hispanics, and thus validation of these associations in a non-white cohort of patients is imperative. Lastly, TWAS is a powerful way to begin prioritizing causal genes for follow-up after GWAS; however, there are limitations. TWAS tests for association with genetically predicted gene expression and not total gene expression, which includes environmental, technical and genetic components (Wainberg et al., 2019).
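The distinction between genetically predicted and total gene expression can be made concrete with a small sketch. In the general TWAS approach, expression is imputed as a weighted sum of allele dosages using weights trained in a reference panel, and case status is then regressed on that imputed value. The code below is a simplified illustration under stated assumptions, not the pipeline used in this study: the weights, dosages, sample size and effect size are hypothetical, and the logistic model omits covariates.

```python
import math
import random


def predict_expression(dosages, weights):
    """Genetically predicted expression: weighted sum of allele dosages."""
    return [sum(w * d for w, d in zip(weights, row)) for row in dosages]


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of the 2x2 information matrix.
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def demo(n=400, seed=1):
    """Simulate hypothetical data and test predicted expression against case status."""
    rng = random.Random(seed)
    weights = [0.5, -0.2, 0.3]  # hypothetical eQTL weights for one gene
    dosages = [[rng.choice([0, 1, 2]) for _ in weights] for _ in range(n)]
    predicted = predict_expression(dosages, weights)
    # Hypothetical true model: higher predicted expression raises risk.
    status = [
        1 if rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * e))) else 0
        for e in predicted
    ]
    return fit_logistic(predicted, status)


if __name__ == "__main__":
    intercept, slope = demo()
    print("log-odds change per unit of predicted expression:", slope)
```

Note that `predict_expression` uses only genotype information, so any environmental or technical contribution to measured expression is absent from the tested predictor, which is the limitation described above.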

Our results provide evidence for the impact of common variants on AML and MDS susceptibility, and further characterization of the 6p25.3 locus might provide a more mechanistic basis for the pleiotropic role of IRF4 in AML and MDS susceptibility. The co-identification of variants in IRF4 associated with the risk for both myeloid and lymphoid malignancy supports the importance of broader studies that span the spectrum of hematologic malignancies.

Data Availability Statement

The datasets presented in this article are not readily available because the de-identified individual participant data that underlie the reported results are not available on dbGaP, as the informed consent is not compliant with the NIH Genomic Data Sharing Policy. Requests to access the datasets should be directed to the Center for International Blood and Marrow Transplant Research (www.cibmtr.org) by emailing contactus@cibmtr.org, or to Dr. Alyssa Clay-Gilmour (claygila@mailbox.sc.edu).

Author Contributions

JW, AC-G, LS-C, and TH designed the research, performed research and analysis, and wrote the manuscript. CH, DV, XS, and LPo performed the genotyping. SL, LPr, AW, and GB performed the quality control of genomic data. All authors reviewed and approved the manuscript.

Funding

This work was supported by grants from the National Institutes of Health. LS-C and TH were supported by 1R01HL102278 and 1R03CA188733 to perform this work. EK is supported by the Pelotonia Foundation Graduate Student Fellowship. Any opinions, findings, and conclusions expressed in this material are those of the author(s) and do not necessarily reflect those of the Pelotonia Fellowship Program or The Ohio State University. AC-G was supported by CA9204, Mayo Clinic R25 Training Grant, when she performed a majority of this work. The CIBMTR is supported by Public Health Service Grant/Cooperative Agreement 5U24-CA076518 from the National Cancer Institute (NCI), the National Heart, Lung and Blood Institute (NHLBI) and the National Institute of Allergy and Infectious Diseases (NIAID); a Grant/Cooperative Agreement 5U10HL069294 from NHLBI and NCI; a contract HHSH250201200016C with Health Resources and Services Administration (HRSA/DHHS); two Grants N00014-15-1-0848 and N00014-16-1-2020 from the Office of Naval Research; and grants from Alexion; Amgen, Inc.; Anonymous donation to the Medical College of Wisconsin; Astellas Pharma US; AstraZeneca; Be the Match Foundation; Bluebird Bio, Inc.; Bristol Myers Squibb Oncology; Celgene Corporation; Cellular Dynamics International, Inc.; Chimerix, Inc.; Fred Hutchinson Cancer Research Center; Gamida Cell Ltd.; Genentech, Inc.; Genzyme Corporation; Gilead Sciences, Inc.; Health Research, Inc. Roswell Park Cancer Institute; HistoGenetics, Inc.; Incyte Corporation; Janssen Scientific Affairs, LLC; Jazz Pharmaceuticals, Inc.; Jeff Gordon Children’s Foundation; The Leukemia & Lymphoma Society; Medac, GmbH; MedImmune; The Medical College of Wisconsin; Merck & Co, Inc.; Mesoblast; MesoScale Diagnostics, Inc.; Miltenyi Biotec, Inc.; National Marrow Donor Program; Neovii Biotech NA, Inc.; Novartis Pharmaceuticals Corporation; Onyx Pharmaceuticals; Optum Healthcare Solutions, Inc.; Otsuka America Pharmaceutical, Inc.; Otsuka Pharmaceutical Co, Ltd. – Japan; PCORI; Perkin Elmer, Inc.; Pfizer, Inc.; Sanofi US; Seattle Genetics; Spectrum Pharmaceuticals, Inc.; St. Baldrick’s Foundation; Sunesis Pharmaceuticals, Inc.; Swedish Orphan Biovitrum, Inc.; Takeda Oncology; Telomere Diagnostics, Inc.; University of Minnesota; and Wellpoint, Inc. The views expressed in this article do not reflect the official policy or position of the National Institutes of Health, the Department of the Navy, the Department of Defense, Health Resources and Services Administration (HRSA) or any other agency of the U.S. Government. This manuscript has been released as a pre-print at bioRxiv (Wang et al., 2020).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.554948/full#supplementary-material

Footnotes

  1. http://www.haplotype-reference-consortium.org/home
  2. https://github.com/phenoscanner/phenoscanner
  3. https://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/imputed12marks/jointModel/final/

References

Acquaviva, J., Chen, X., and Ren, R. (2008). IRF-4 functions as a tumor suppressor in early B-cell development. Blood 112, 3798–3806. doi: 10.1182/blood-2007-10-117838

Adamaki, M., Lambrou, G. I., Athanasiadou, A., Tzanoudaki, M., Vlahopoulos, S., and Moschovi, M. (2013). Implication of IRF4 aberrant gene expression in the acute leukemias of childhood. PLoS One 8:e72326. doi: 10.1371/journal.pone.0072326

Allan, J. M., Sunter, N. J., Bailey, J. R., Pettitt, A. R., Harris, R. J., Pepper, C., et al. (2010). Variant IRF4/MUM1 associates with CD38 status and treatment-free survival in chronic lymphocytic leukaemia. Leukemia 24, 877–881. doi: 10.1038/leu.2009.298

Astle, W. J., Elding, H., Jiang, T., Allen, D., Ruklisa, D., Mann, A. L., et al. (2016). The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415.e19–1429.e19. doi: 10.1016/j.cell.2016.10.042

Battle, A., Mostafavi, S., Zhu, X., Potash, J. B., Weissman, M. M., McCormick, C., et al. (2014). Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res. 24, 14–24. doi: 10.1101/gr.155192.113

Bhattacharjee, S., Rajaraman, P., Jacobs, K. B., Wheeler, W. A., Melin, B. S., Hartge, P., et al. (2012). A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits. Am. J. Hum. Genet. 90, 821–835. doi: 10.1016/j.ajhg.2012.03.015

Bisig, B., Gaulard, P., and de Leval, L. (2012). New biomarkers in T-cell lymphomas. Best Pract. Res. Clin. Haematol. 25, 13–28. doi: 10.1016/j.beha.2012.01.004

Boyle, A. P., Hong, E. L., Hariharan, M., Cheng, Y., Schaub, M. A., Kasowski, M., et al. (2012). Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 22, 1790–1797. doi: 10.1101/gr.137323.112

Cairns, J., Freire-Pritchett, P., Wingett, S. W., Várnai, C., Dimond, A., Plagnol, V., et al. (2016). CHiCAGO: robust detection of DNA looping interactions in Capture Hi-C data. Genome Biol. 17:127. doi: 10.1186/s13059-016-0992-2

Carbone, A., Gloghini, A., Aldinucci, D., Gattei, V., Dalla-Favera, R., and Gaidano, G. (2002). Expression pattern of MUM1/IRF4 in the spectrum of pathology of Hodgkin’s disease. Br. J. Haematol. 117, 366–372. doi: 10.1046/j.1365-2141.2002.03456.x

Carithers, L. J., Ardlie, K., Barcus, M., Branton, P. A., Britton, A., Buia, S. A., et al. (2015). A novel approach to high-quality postmortem tissue procurement: the GTEx project. Biopreserv. Biobank. 13, 311–319. doi: 10.1089/bio.2015.0032

Churpek, J. E. (2017). Familial myelodysplastic syndrome/acute myeloid leukemia. Best Pract. Res. Clin. Haematol. 30, 287–289. doi: 10.1016/j.beha.2017.10.002

Clay-Gilmour, A. I., Hahn, T., Preus, L. M., Onel, K., Skol, A., Hungate, E., et al. (2017). Genetic association with B-cell acute lymphoblastic leukemia in allogeneic transplant patients differs by age and sex. Blood Adv. 1, 1717–1728. doi: 10.1182/bloodadvances.2017006023

Das, S., Forer, L., Schönherr, S., Sidore, C., Locke, A. E., Kwong, A., et al. (2016). Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287. doi: 10.1038/ng.3656

De Silva, N. S., Simonetti, G., Heise, N., and Klein, U. (2012). The diverse roles of IRF4 in late germinal center B-cell differentiation. Immunol. Rev. 247, 73–92. doi: 10.1111/j.1600-065X.2012.01113.x

Deary, I. J., Yang, J., Davies, G., Harris, S. E., Tenesa, A., Liewald, D., et al. (2012). Genetic contributions to stability and change in intelligence from childhood to old age. Nature 482, 212–215. doi: 10.1038/nature10781

Di Bernardo, M. C., Crowther-Swanepoel, D., Broderick, P., Webb, E., Sellick, G., Wild, R., et al. (2008). A genome-wide association study identifies six susceptibility loci for chronic lymphocytic leukemia. Nat. Genet. 40, 1204–1210. doi: 10.1038/ng.219

Gamazon, E. R., Wheeler, H. E., Shah, K. P., Mozaffari, S. V., Aquino-Michaels, K., Carroll, R. J., et al. (2015). A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47, 1091–1098. doi: 10.1038/ng.3367

Gao, J., Gentzler, R. D., Timms, A. E., Horwitz, M. S., Frankfurt, O., Altman, J. K., et al. (2014). Heritable GATA2 mutations associated with familial AML-MDS: a case report and review of literature. J. Hematol. Oncol. 7:36. doi: 10.1186/1756-8722-7-36

Goldin, L. R., Kristinsson, S. Y., Liang, X. S., Derolf, A. R., Landgren, O., and Björkholm, M. (2012). Familial aggregation of acute myeloid leukemia and myelodysplastic syndromes. J. Clin. Oncol. 30, 179–183. doi: 10.1200/JCO.2011.37.1203

Gusev, A., Ko, A., Shi, H., Bhatia, G., Chung, W., Penninx, B. W., et al. (2016). Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252. doi: 10.1038/ng.3506

Hahn, T., Sucheston-Campbell, L. E., Preus, L., Zhu, X., Hansen, J. A., Martin, P. J., et al. (2015). Establishment of definitions and review process for consistent adjudication of cause-specific mortality after allogeneic unrelated-donor hematopoietic cell transplantation. Biol. Blood Marrow Transplant 21, 1679–1686. doi: 10.1016/j.bbmt.2015.05.019

Havelange, V., Pekarsky, Y., Nakamura, T., Palamarchuk, A., Alder, H., Rassenti, L., et al. (2011). IRF4 mutations in chronic lymphocytic leukemia. Blood 118, 2827–2829. doi: 10.1182/blood-2011-04-350579

Kamat, M. A., Blackshaw, J. A., Young, R., Surendran, P., Burgess, S., Danesh, J., et al. (2019). PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics 35, 4851–4853. doi: 10.1093/bioinformatics/btz469

Karaesmen, E., Rizvi, A. A., Preus, L. M., McCarthy, P. L., Pasquini, M. C., Onel, K., et al. (2017). Replication and validation of genetic polymorphisms associated with survival after allogeneic blood or marrow transplant. Blood 130, 1585–1596. doi: 10.1182/blood-2017-05-784637

Kharas, M. G., Okabe, R., Ganis, J. J., Gozo, M., Khandan, T., Paktinat, M., et al. (2010). Constitutively active AKT depletes hematopoietic stem cells and induces leukemia in mice. Blood 115, 1406–1415. doi: 10.1182/blood-2009-06-229443

Kheradpour, P., and Kellis, M. (2014). Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments. Nucleic Acids Res. 42, 2976–2987. doi: 10.1093/nar/gkt1249

Knight, J. A., Skol, A. D., Shinde, A., Hastings, D., Walgren, R. A., Shao, J., et al. (2009). Genome-wide association study to identify novel loci associated with therapy-related myeloid leukemia susceptibility. Blood 113, 5575–5582. doi: 10.1182/blood-2008-10-183244

Law, P. J., Sud, A., Mitchell, J. S., Henrion, M., Orlando, G., Lenive, O., et al. (2017). Genome-wide association analysis of chronic lymphocytic leukaemia, Hodgkin lymphoma and multiple myeloma identifies pleiotropic risk loci. Sci. Rep. 7:41071. doi: 10.1038/srep41071

Lee, C. H., Eskin, E., and Han, B. (2017). Increasing the power of meta-analysis of genome-wide association studies to detect heterogeneous effects. Bioinformatics 33, i379–i388. doi: 10.1093/bioinformatics/btx242

Lee, S. H., Harold, D., and Nyholt, D. R. ANZGene Consortium, International Endogene Consortium, Genetic and Environmental Risk for Alzheimer’s disease Consortium (2013). Estimation and partitioning of polygenic variation captured by common SNPs for Alzheimer’s disease, multiple sclerosis and endometriosis. Hum. Mol. Genet. 22, 832–841. doi: 10.1093/hmg/dds491

Lee, S. H., Yang, J., Goddard, M. E., Visscher, P. M., and Wray, N. R. (2012). Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics 28, 2540–2542. doi: 10.1093/bioinformatics/bts474

Lu, Y., Ek, W. E., Whiteman, D., Vaughan, T. L., Spurdle, A. B., Easton, D. F., et al. (2014). Most common ‘sporadic’ cancers have a significant germline genetic component. Hum. Mol. Genet. 23, 6112–6118. doi: 10.1093/hmg/ddu312

Lv, H., Zhang, M., Shang, Z., Li, J., Zhang, S., Lian, D., et al. (2017). Genome-wide haplotype association study identify the FGFR2 gene as a risk gene for acute myeloid leukemia. Oncotarget 8, 7891–7899. doi: 10.18632/oncotarget.13631

McCarthy, S., Das, S., Kretzschmar, W., Delaneau, O., Wood, A. R., Teumer, A., et al. (2016). Haplotype Reference Consortium. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283. doi: 10.1038/ng.3643

McNerney, M. E., Godley, L. A., and Le Beau, M. M. (2017). Therapy-related myeloid neoplasms: when genetics and environment collide. Nat. Rev. Cancer 17, 513–527. doi: 10.1038/nrc.2017.60

Mele, S., Devereux, S., Pepper, A. G., Infante, E., and Ridley, A. J. (2018). Calcium-RasGRP2-Rap1 signaling mediates CD38-induced migration of chronic lymphocytic leukemia cells. Blood Adv. 2, 1551–1561. doi: 10.1182/bloodadvances.2017014506

Mifsud, B., Tavares-Cadete, F., Young, A. N., Sugar, R., Schoenfelder, S., Ferreira, L., et al. (2015). Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat. Genet. 47, 598–606. doi: 10.1038/ng.3286

Mitchell, J. S., Johnson, D. C., Litchfield, K., Broderick, P., Weinhold, N., Davies, F. E., et al. (2015). Implementation of genome-wide complex trait analysis to quantify the heritability in multiple myeloma. Sci. Rep. 5:12473. doi: 10.1038/srep12473

Mitchell, J. S., Li, N., Weinhold, N., Försti, A., Ali, M., van Duin, M., et al. (2016). Genome-wide association study identifies multiple susceptibility loci for multiple myeloma. Nat. Commun. 7:12050. doi: 10.1038/ncomms12050

Pratt, G., Fenton, J. A., Allsup, D., Fegan, C., Morgan, G. J., Jackson, G., et al. (2010). A polymorphism in the 3′ UTR of IRF4 linked to susceptibility and pathogenesis in chronic lymphocytic leukaemia and Hodgkin lymphoma has limited impact in multiple myeloma. Br. J. Haematol. 150, 371–373. doi: 10.1111/j.1365-2141.2010.08199.x

Price, A. L., Patterson, N. J., Plenge, R. M., Weinblatt, M. E., Shadick, N. A., and Reich, D. (2006). Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909.

Roadmap Epigenomics Consortium, Kundaje, A., Meuleman, W., Ernst, J., Bilenky, M., Yen, A., et al. (2015). Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330. doi: 10.1038/nature14248

Salaverria, I., Philipp, C., Oschlies, I., Kohler, C. W., Kreuz, M., Szczepanowski, M., et al. (2011). Translocations activating IRF4 identify a subtype of germinal center-derived B-cell lymphoma affecting predominantly children and young adults. Blood 118, 139–147. doi: 10.1182/blood-2011-01-330795

Sampson, J. N., Wheeler, W. A., Yeager, M., Panagiotou, O., Wang, Z., Berndt, S. I., et al. (2015). Analysis of heritability and shared heritability based on genome-wide association studies for thirteen cancer types. J. Natl. Cancer Inst. 107:djv279. doi: 10.1093/jnci/djv279

Schoenfelder, S., Javierre, B. M., Furlan-Magaril, M., Wingett, S. W., and Fraser, P. (2018). Promoter Capture Hi-C: high-resolution, genome-wide profiling of promoter interactions. J. Vis. Exp. 28:57320. doi: 10.3791/57320

Schofield, E. C., Carver, T., Achuthan, P., Freire-Pritchett, P., Spivakov, M., Todd, J. A., et al. (2016). CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets. Bioinformatics 32, 2511–2513. doi: 10.1093/bioinformatics/btw173

Shaffer, A. L., Emre, N. C., Lamy, L., Ngo, V. N., Wright, G., Xiao, W., et al. (2008). IRF4 addiction in multiple myeloma. Nature 454, 226–231. doi: 10.1038/nature07064

Slager, S. L., Camp, N. J., Conde, L., Shanafelt, T. D., Achenbach, S. J., Rabe, K. G., et al. (2012). Common variants within 6p21.31 locus are associated with chronic lymphocytic leukaemia and, potentially, other non-Hodgkin lymphoma subtypes. Br. J. Haematol. 159, 572–576. doi: 10.1111/bjh.12070

Spurrell, C. H., Dickel, D. E., and Visel, A. (2016). The ties that bind: mapping the dynamic enhancer-promoter interactome. Cell 167, 1163–1166. doi: 10.1016/j.cell.2016.10.054

Staley, J. R., Blackshaw, J., Kamat, M. A., Ellis, S., Surendran, P., Sun, B. B., et al. (2016). PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics 32, 3207–3209. doi: 10.1093/bioinformatics/btw373

Tang, Y., Halvarsson, C., Nordigården, A., Kumar, K., Åhsberg, J., Rörby, E., et al. (2015). Coexpression of hyperactivated AKT1 with additional genes activated in leukemia drives hematopoietic progenitor cells to cell cycle block and apoptosis. Exp. Hematol. 43, 554–564. doi: 10.1016/j.exphem.2015.04.007

Vijayakrishnan, J., Qian, M., Studd, J. B., Yang, W., Kinnersley, B., Law, P. J., et al. (2019). Identification of four novel associations for B-cell acute lymphoblastic leukaemia risk. Nat. Commun. 10:5348. doi: 10.1038/s41467-019-13069-6

Visser, M., Palstra, R. J., and Kayser, M. (2015). Allele-specific transcriptional regulation of IRF4 in melanocytes is mediated by chromatin looping of the intronic rs12203592 enhancer to the IRF4 promoter. Hum. Mol. Genet. 24, 2649–2661. doi: 10.1093/hmg/ddv029

Võsa, U., Claringbould, A., Westra, H., Bonder, M. J., Deelen, P., Zeng, B., et al. (2018). Unraveling the polygenic architecture of complex traits using blood eQTL meta-analysis. bioRxiv [Preprint]. doi: 10.1101/447367

Wainberg, M., Sinnott-Armstrong, N., Mancuso, N., Barbeira, A. N., Knowles, D. A., Golan, D., et al. (2019). Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 51, 592–599. doi: 10.1038/s41588-019-0385-z

Walker, C. J., Oakes, C. C., Genutis, L. K., Giacopelli, B., Liyanarachchi, S., Nicolet, D., et al. (2019). Genome-wide association study identifies an acute myeloid leukemia susceptibility locus near BICRA. Leukemia 33, 771–775. doi: 10.1038/s41375-018-0281-z

Wang, J., Clay-Gilmour, A. I., Karaesmen, E., Rizvi, A., Zhu, Q., Yan, L., et al. (2020). Genome-wide association analyses identify variants in IRF4 associated with acute myeloid leukemia and myelodysplastic syndrome susceptibility. bioRxiv [Preprint]. doi: 10.1101/773952

Ward, L. D., and Kellis, M. (2012). HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 40, D930–D934. doi: 10.1093/nar/gkr917

Went, M., Sud, A., Speedy, H., Sunter, N. J., Försti, A., Law, P. J., et al. (2018). Genetic correlation between multiple myeloma and chronic lymphocytic leukaemia provides evidence for shared aetiology. Blood Cancer J. 9:1. doi: 10.1038/s41408-018-0162-8

Yan, L., Ma, C., Wang, D., Hu, Q., Qin, M., Conroy, J. M., et al. (2012). OSAT: a tool for sample-to-batch allocations in genomics experiments. BMC Genomics 13:689. doi: 10.1186/1471-2164-13-689

Yang, J., Lee, S. H., Goddard, M. E., and Visscher, P. M. (2011). GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82. doi: 10.1016/j.ajhg.2010.11.011

Yu, Y., Ouyang, Y., and Yao, W. (2018). shinyCircos: an R/Shiny application for interactive creation of Circos plot. Bioinformatics 34, 1229–1231. doi: 10.1093/bioinformatics/btx763

Zhu, Q., Yan, L., Liu, Q., Zhang, C., Wei, L., Hu, Q., et al. (2018). Exome chip analyses identify genes affecting mortality after HLA-matched unrelated-donor blood and marrow transplantation. Blood 131, 2490–2499. doi: 10.1182/blood-2017-11-817973

Keywords: acute myeloid leukemia, myelodysplastic syndrome, genome-wide association study, blood and marrow transplantation, pleiotropy

Citation: Wang J, Clay-Gilmour AI, Karaesmen E, Rizvi A, Zhu Q, Yan L, Preus L, Liu S, Wang Y, Griffiths E, Stram DO, Pooler L, Sheng X, Haiman C, Van Den Berg D, Webb A, Brock G, Spellman S, Pasquini M, McCarthy P, Allan J, Stölzel F, Onel K, Hahn T and Sucheston-Campbell LE (2021) Genome-Wide Association Analyses Identify Variants in IRF4 Associated With Acute Myeloid Leukemia and Myelodysplastic Syndrome Susceptibility. Front. Genet. 12:554948. doi: 10.3389/fgene.2021.554948

Received: 23 April 2020; Accepted: 19 April 2021;
Published: 17 June 2021.

Edited by:

Oskar A. Haas, St. Anna Children’s Hospital, Austria

Reviewed by:

Yafang Li, Baylor College of Medicine, United States
Mark Z. Kos, The University of Texas Rio Grande Valley, United States

Copyright © 2021 Wang, Clay-Gilmour, Karaesmen, Rizvi, Zhu, Yan, Preus, Liu, Wang, Griffiths, Stram, Pooler, Sheng, Haiman, Van Den Berg, Webb, Brock, Spellman, Pasquini, McCarthy, Allan, Stölzel, Onel, Hahn and Sucheston-Campbell. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lara E. Sucheston-Campbell, lara.sucheston-campbell@osumc.edu

†These authors share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.