The Association Between Single-Nucleotide Polymorphisms of Co-Stimulatory Genes Within Non-HLA Region and the Prognosis of Leukemia Patients With Hematopoietic Stem Cell Transplantation

To avoid graft rejection, hematopoietic stem cells with matched classical human leukocyte antigen (HLA) alleles are the primary choice for clinical allogeneic transplantation. However, even when fully HLA-matched hematopoietic stem cells are used, some patients still have a poor prognosis after hematopoietic stem cell transplantation (HSCT), suggesting that the HLA system is not the only determinant of HSCT outcomes. In this study, we investigated whether single-nucleotide polymorphisms (SNPs) of co-stimulatory genes within non-HLA regions were related to the outcomes of HSCT. Genomic DNA from 163 patients who had acute leukemia and received HSCT, and from their respective donors, was collected for analysis. Thirty-four SNPs located in four co-stimulatory genes, namely cytotoxic T-lymphocyte associated protein 4 (CTLA4), CD28, tumor necrosis factor ligand superfamily 4 (TNFSF4), and programmed cell death protein 1 (PDCD1), were selected to explore their relationship with adverse outcomes after transplantation, including mortality, cytomegalovirus infection, graft-versus-host disease, and relapse. Our results revealed that nine SNPs in the CTLA4 gene, five SNPs in the PDCD1 gene, two SNPs in the TNFSF4 gene, and four SNPs in the CD28 gene were significantly associated with the occurrence of adverse outcomes post-HSCT. These SNPs may play important roles in the immune response to allografts post-HSCT and may serve as targets for developing strategies to identify appropriate donors.

INTRODUCTION
Leukemia is a type of cancer characterized by abnormal blood cells. It can be classified into myeloid and lymphoid lineages depending on the type of aberrantly multiplying cells, and it can also be distinguished as acute or chronic according to the rate of disease progression. Acute leukemia is characterized by a poor survival rate. Nowadays, the main treatment for acute leukemia is hematopoietic stem cell transplantation (HSCT), which enables reconstruction of the immune and hematopoietic systems by transplanting autologous or allogeneic hematopoietic stem cells into patients (1). Human leukocyte antigen (HLA) genes are located on the short arm of human chromosome 6 and play vital roles in the immune response to allografts (2,3). Hence, it is mandatory to confirm that the HLA alleles are matched between recipients and donors before transplantation (3-5), especially for the classical HLA genes, such as HLA-A, -B, -C, and -D (-DR, -DQ, -DP). Because these genes are closely linked to each other, the HLA genes are inherited as haplotypes (6).
According to the relationship between the donor and recipient, HSCT can be divided into related and unrelated transplantation. In general, the survival rate of the former is higher than that of the latter. The incidence of adverse outcomes, such as acute graft-versus-host disease (GVHD), is generally lower for related transplantation than for unrelated transplantation (7,8). GVHD is one of the complications of allogeneic HSCT and is caused by the donor's T cells attacking the organs and tissues of recipients (9). Cytomegalovirus (CMV) infection is also a complication of allogeneic HSCT, resulting from the destruction of the immune system in patients receiving high-dose treatment regimens (10). Patients with CMV infection and GVHD usually have a higher risk of recurrence and death (11). Allografts obtained from HLA-matched siblings are better than those obtained from HLA-haploidentical parents, siblings, or unrelated sources (7,12,13). Because the probability that any given sibling is HLA-matched is 25%, there is only about a one-third chance of obtaining an HLA-matched related donor. Most patients can only rely on public donation. Nevertheless, the source of donation is limited, and patients are usually unable to acquire a suitable donor in time. Because graft failure still occurs even when an HLA-matched sibling is chosen as the donor, additional factors beyond HLA are likely to be involved in the regulation of allograft rejection (14,15).
The association between HSCT and non-HLA genes such as tumor necrosis factor ligand superfamily 4 (TNFSF4), cytotoxic T-lymphocyte associated protein 4 (CTLA4 or CD152), programmed cell death protein 1 (PDCD1, PD-1 or CD279), and CD28 has been shown in previous studies (16-19). These genes belong to the co-stimulatory system. The imbalance of co-stimulatory molecules is one of the immune escape mechanisms in hematological cancers. It may promote the development of various autoimmune diseases and cancers (20). Several studies indicate that CD28, CTLA4, TNFSF4, and PDCD1 play important roles in the immune system and in transplantation (21-23). In addition, genetic variants such as single-nucleotide polymorphisms (SNPs) in HLA and non-HLA regions have been linked to the success or failure of HSCT among different ethnic populations (24,25). In this study, we explored the association between donor SNPs of co-stimulatory genes (CTLA4, CD28, TNFSF4, and PDCD1) and the mortality, CMV infection, GVHD, and relapse of their corresponding recipients in the Taiwanese population. This study provides new insights into the roles of co-stimulatory genes in prognosis after transplantation and may lead to strategies for identifying appropriate donors for HSCT.

MATERIALS AND METHODS
Patients and HLA Typing
This study was reviewed and approved by the Institutional Review Board of Chang Gung Memorial Hospital, and its approval IDs were 201304949B0, 201700769B0, 201701849B0, 201801985B0, and 201901246B0. All donors and recipients, except the donors of unrelated HSCT, provided signed informed consent. All methods of the study were performed according to the ethical requirements and regulations. For unrelated donors, informed consent was waived because the HLA-matched donors were selected by the Stem Cells Center in Taiwan and the identities of the donors were anonymized and not disclosed to the physicians and research team. A total of 163 patients receiving HSCT were enrolled in this study, of whom 99 were diagnosed with acute myeloid leukemia (AML) and 64 with acute lymphoblastic leukemia (ALL). The clinical characteristics of the 163 patients are shown in Table 1. All donor-recipient pairs had fully matched HLA as revealed by high-resolution HLA typing using the SeCore kit (Thermo Fisher, Waltham, MA). The MicroSSP Allele Specific Typing Tray (Thermo Fisher, Waltham, MA) was used with sequence-specific primers to resolve ambiguous alleles from the SeCore typing.

Definition of Outcomes
Mortality referred to patients who died during the study period. A CMV-infected case was defined by the presence of CMV antigen or DNA in the peripheral blood of recipients after transplantation. CMV antigen in the leukocytes was determined by the CMV Antigenemia Assay (MONOFLUO™, Bio-Rad). The test was considered positive when more than two polymorphonuclear leukocytes (PMN) were positive for CMV antigen in a total of 50,000 PMN. The CMV DNA Quantitative Amplification test is a real-time quantitative PCR assay (COBAS® AmpliPrep/COBAS® TaqMan® CMV Test, Roche). The nucleic acid test was considered positive when the Ct value was < 37. These two assays can assist clinicians in monitoring the status of CMV infection.
According to the International Bone Marrow Transplant Registry, GVHD was considered as acute GVHD (aGVHD) when it occurred within 100 days after transplantation. It can be divided into four grades according to the clinical characteristics of organs as defined below. Grade I: maculopapular rash over <25% of body area with no liver or gastrointestinal involvement; Grade II: maculopapular rash over 25% to 50% of body area, diarrhea 500 to 1500 ml/day, and bilirubin 2 to 6 mg/dl; Grade III: maculopapular rash over >50% of body area, and severe diarrhea; Grade IV: skin blisters, bilirubin >15 mg/dl, severe diarrhea with pain, and life-threatening. Grades I-II were defined as mild GVHD, and Grades III-IV were defined as severe GVHD. Chronic GVHD (cGVHD) usually occurs more than 100 days after transplantation or occurs continually for more than 100 days without remission (26). Patients without any symptoms of aGVHD or cGVHD during the study period were defined as no GVHD.
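The GVHD categories defined above can be expressed as a small decision rule. The following Python sketch is illustrative only and is not the authors' procedure: the function name `classify_gvhd` and its arguments are hypothetical, and it simplifies the definition by treating any onset later than day 100 as chronic GVHD, without modeling acute GVHD that persists beyond 100 days without remission.

```python
from typing import Optional


def classify_gvhd(onset_day: Optional[int], grade: Optional[int]) -> str:
    """Map one recipient's GVHD record to one of the four study categories.

    onset_day: days after transplantation at which GVHD began (None if never).
    grade: acute GVHD grade 1-4 (ignored when onset is after day 100).
    Assumption: onset after day 100 is treated as chronic GVHD.
    """
    if onset_day is None:
        return "no GVHD"
    if onset_day > 100:
        return "cGVHD"
    if grade not in (1, 2, 3, 4):
        raise ValueError("acute GVHD requires a grade from 1 to 4")
    # Grades I-II are mild, grades III-IV are severe.
    return "mild GVHD" if grade <= 2 else "severe GVHD"
```

For example, `classify_gvhd(45, 3)` would fall into the severe GVHD category, whereas `classify_gvhd(150, None)` would be counted as cGVHD.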
Relapse was defined as recurrence of malignancy based on one or more of the following: bone marrow morphology, minimal residual disease by flow cytometry, cytogenetics, imaging results, or short tandem repeat (STR) analysis. High-throughput amplicon sequencing (AmpFISTR Identifiler Amplification Kit, Thermo Fisher, Waltham, MA) was performed according to the manufacturer's instructions to analyze STR and to evaluate HSCT engraftment for identification of mixed chimerism (27-29). The presence of >5% recipient STR alleles in the chimerism test was considered a surrogate marker of disease relapse.

Selection of SNPs
Based on our initial screening of SNPs present in the CTLA4 gene of the Taiwanese population, on studies demonstrating the importance of the promoter and exon 1 in gene expression, and on SNPs with reported clinical associations (30), a total of eight DNA fragments of the four co-stimulatory genes (CTLA4, TNFSF4, CD28, and PDCD1) were selected for analyses of donor SNPs (Table 2). A total of 17 SNPs in CTLA4, 3 SNPs in TNFSF4, 9 SNPs in CD28, and 5 SNPs in PDCD1 were subjected to association analysis with the risk for relapse, mortality, GVHD, and CMV infection. All SNP variants were deposited in the NCBI database dbSNP, and the accession numbers are provided in Supplementary Table S1. Data can be accessed with the following link: https://www.ncbi.nlm.nih.gov/SNP/snp_viewTable.cgi?handle=WANGWT.

PCR and Sequencing
Peripheral blood (3 ml) was collected from the corresponding donors of the recipients, and the genomic DNA was extracted by

Statistical Analysis
Single-locus association tests were performed to identify the donor SNPs that were associated with the defined outcomes in the recipients. The allele or genotype frequencies of cases (patients with the indicated outcome) and controls (patients without the indicated outcome) were compared with the Cochran-Armitage trend test (Trend test) and the allelic test using the PLINK software v1.07 (31). The allele effects of the SNPs on each outcome were further examined using logistic regression analysis assuming three modes of inheritance: the additive, recessive, and dominant models. p < 0.05 was considered statistically significant. The Haploview 4.2 software (32) was used to determine linkage disequilibrium (LD). The pairwise linkage disequilibrium value D' and the haplotype blocks of SNPs were determined. A haplotype block was defined as a region in which the SNPs showed no evidence of historical recombination.
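To make the two single-locus tests concrete, the following Python sketch computes the Cochran-Armitage trend statistic and the allelic chi-square statistic from genotype counts. It is a minimal illustration and not the PLINK implementation: the function names are hypothetical, genotype counts are assumed to be ordered by the number of copies of the tested allele (0, 1, 2), the trend weights are assumed to be (0, 1, 2), no continuity correction is applied, and the genotype counts in the usage note are invented for illustration.

```python
import math


def chi2_1df_pvalue(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on (n_0, n_1, n_2) genotype counts."""
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    cols = [r + s for r, s in zip(cases, controls)]
    sum_w_cases = sum(w * r for w, r in zip(weights, cases))
    sum_w_cols = sum(w * c for w, c in zip(weights, cols))
    sum_w2_cols = sum(w * w * c for w, c in zip(weights, cols))
    denominator = n_cases * n_controls * (n * sum_w2_cols - sum_w_cols ** 2)
    if denominator == 0:
        raise ValueError("test undefined: monomorphic SNP or empty group")
    chi2 = n * (n * sum_w_cases - n_cases * sum_w_cols) ** 2 / denominator
    return chi2, chi2_1df_pvalue(chi2)


def allelic_test(cases, controls):
    """2x2 chi-square test on allele counts derived from genotype counts."""
    a = 2 * cases[2] + cases[1]        # tested allele in cases
    b = 2 * cases[0] + cases[1]        # other allele in cases
    c = 2 * controls[2] + controls[1]  # tested allele in controls
    d = 2 * controls[0] + controls[1]  # other allele in controls
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("test undefined: monomorphic SNP or empty group")
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denominator
    return chi2, chi2_1df_pvalue(chi2)
```

With invented counts such as `trend_test((10, 20, 30), (30, 20, 10))`, both tests return a chi-square statistic and its p-value, which would then be compared with the 0.05 threshold used in this study.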

RESULTS
Patient Characteristics and Study Design
A total of 163 patients receiving HSCT including 99 patients with AML and 64 patients with ALL were enrolled in this study. Clinical characteristics and the tracking data of mortality, CMV infection, relapse, and GVHD for these patients are listed in Table 1.
Given the importance of co-stimulatory signals in the immune system and in transplantation tolerance, the associations of the four co-stimulatory genes CTLA4, TNFSF4, CD28, and PDCD1 with mortality, relapse, CMV infection, and GVHD after HSCT were analyzed in this study. Based on our initial screening of SNPs in the CTLA4 gene of the Taiwanese population, the importance of the promoter and exon 1 in gene expression, and the SNPs with clinical association, the genomic regions covering the promoter, exon 1, exon 4, and 3'-UTR of CTLA4 (17 SNPs), the promoter and exon 1 of TNFSF4 (3 SNPs), the promoter and exon 1 of CD28 (9 SNPs), and the promoter, exon 1, and exons 4-5 including intron 4 of PDCD1 (5 SNPs) were amplified for all donors by PCR using the forward and reverse primers (Tables 2, 3). The PCR amplicons were sequenced, and the association of candidate SNPs with the adverse outcomes of patients with AML and ALL was analyzed by the Trend test and the allelic test, and by logistic regression analysis under the additive, dominant, or recessive model. The genotype and allele frequencies for all donors are summarized in Supplementary Tables S2-S5.

Association of Donor SNPs With the Mortality, CMV Infection, and Relapse of Patients With AML and ALL
Among the 34 SNPs analyzed, four SNPs (rs733618, rs11571316, and rs3087243 in CTLA4, and rs41386349 in PDCD1) and one SNP (rs11571315 in CTLA4) elicited significant recessive allelic effects and contributed to post-HSCT mortality for patients with AML and ALL, respectively (Table 4). For patients with AML, the C allele of rs733618 (CC vs. CT+TT, p = 0.0376, OR = 2.77, and 95% CI = 1.07-7.22) and the G allele of rs11571316 (GG vs. AA+AG, p = 0.0441, OR = 2.32, and 95% CI = 1.03-5.24) located in the promoter region of CTLA4 were associated with a higher risk for mortality. The SNP rs3087243 in the 3'-UTR of CTLA4 (GG vs. AA+AG, p = 0.0441, OR = 2.32, and 95% CI = 1.03-5.24) and rs41386349 in intron 4 of PDCD1 (GG vs. AA+AG, p = 0.0362, OR = 2.62, and 95% CI = 1.07-6.42) also conferred recessive effects on the risk for mortality of patients with AML. In addition, rs41386349 also elicited additive effects on post-HSCT mortality for patients with AML. For patients with ALL, the C allele of rs11571315 (CC vs. CT+TT, p = 0.0289, OR = 6.14, and 95% CI = 1.21-30.99) in the promoter region of CTLA4 was associated with a higher risk for mortality.
One SNP (rs6705653 in PDCD1) and four SNPs (rs36084323, rs41386349, rs6705653, and rs2227982 in PDCD1) were associated with CMV infection in patients with AML and ALL, respectively (Table 5). For patients with AML, the C allele of rs6705653 (CC+CT vs. TT, p = 0.0138, OR = 7.91, and 95% CI = 1.54-40.71) elicited a dominant effect and contributed to the risk for CMV infection. For patients with ALL, the alternative T allele of the same SNP rs6705653 elicited an additive effect (Trend test: p = 0.0198, and additive effect: p = 0.0186) or a dominant effect on the risk for CMV infection (TT+CT vs. CC, p = 0.0201, OR = 3.95, and 95% CI = 1.25-12.49). The C allele of rs36084323, the A allele of rs41386349, and the G allele of rs2227982 in the PDCD1 gene were also associated with a higher risk for CMV infection (allele model: p = 0.0265, 0.0356, and 0.0252, respectively).
One SNP (rs200353921 in CD28) and three SNPs (rs5839828, rs36084323, and rs2227982 in PDCD1) were associated with the risk of disease relapse in patients with AML and ALL, respectively (Table 6). For patients with AML, the T allele of rs200353921 located in the promoter region of the CD28 gene was associated with a higher risk of relapse (allele model: p = 0.0343 for T vs. A, OR = 2.1, and 95% CI = 1.06-4.18). For patients with ALL, the G7 allele of rs5839828, the C allele of rs36084323, and the G allele of rs2227982 in the PDCD1 gene were also associated with a higher risk for disease relapse (allele model: p = 0.0008, 0.0095, and 0.0018, respectively).
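The odds ratios and 95% confidence intervals reported for the recessive and dominant models can be illustrated with a 2x2 table obtained by collapsing genotype counts. The Python sketch below is an assumption-laden illustration rather than the analysis pipeline used here: the function names are hypothetical, the interval is a Wald interval on the log odds ratio (which coincides with the estimate from a logistic regression containing only the single binary genotype term), and the counts in the usage note are invented.

```python
import math


def collapse(genotypes, model):
    """Collapse (n_0, n_1, n_2) copies of the risk allele to (exposed, unexposed)."""
    n0, n1, n2 = genotypes
    if model == "recessive":   # two copies vs. zero or one copy
        return n2, n0 + n1
    if model == "dominant":    # one or two copies vs. zero copies
        return n1 + n2, n0
    raise ValueError("model must be 'recessive' or 'dominant'")


def odds_ratio_wald_ci(cases, controls, model, z=1.96):
    """Odds ratio with a Wald confidence interval (z = 1.96 gives about 95%)."""
    a, b = collapse(cases, model)      # cases: exposed, unexposed
    c, d = collapse(controls, model)   # controls: exposed, unexposed
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell makes the Wald interval undefined")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For instance, `odds_ratio_wald_ci((20, 15, 15), (30, 15, 5), "recessive")` compares carriers of two risk alleles with all other donors; an interval that excludes 1 corresponds to a significant association at the 0.05 level.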

Association of Donor SNPs With the Status of GVHD in Patients With AML and ALL
The status of GVHD was classified into four categories including GVHD III-IV (severe GVHD), GVHD I-II (mild GVHD), chronic GVHD (cGVHD), and no GVHD. Two SNPs (rs1234314 and rs45454293) in the promoter region of TNFSF4 were associated with the risk for severe GVHD (GVHD III-IV) in patients with AML. No SNP was associated with the risk for mild GVHD in patients with AML. Four SNPs (rs231775 in CTLA4, and rs41386349, rs6705653, and rs2227982 in PDCD1) were associated with the risk for mild GVHD (GVHD I-II) in patients with ALL. The A allele of rs231775 in exon 1 of CTLA4 (allele model: p = 0.0343 for A vs. G, OR = 2.28, and 95% CI = 1.07-4.89), the A allele of rs41386349 (allele model: p = 0.0436 for A vs. G, OR = 2.71, and 95% CI = 1.03-7.1) and the T allele of rs6705653 (Trend test: p = 0.0086; allele model: p = 0.0039 for T vs. C, OR = 3.53, and 95% CI = 1.51-8.28) in intron 4 of PDCD1, and the G allele of rs2227982 (Trend test: p = 0.0194; allele model: p = 0.0055 for G vs. A, OR = 3.4, and 95% CI = 1.44-8.03) in exon 5 of the PDCD1 gene contributed to a higher risk for mild GVHD. The three SNPs in the PDCD1 gene were also associated with CMV infection, as mentioned above.
No SNP was associated with the risk for cGVHD in patients with AML. Five SNPs (rs5742909 and rs231775 in CTLA4, rs28541784 in CD28, and rs6705653 and rs2227982 in PDCD1) were associated with the risk for cGVHD in patients with ALL. The C allele of rs5742909 (allele model: p = 0.0465 for C vs. T) and the G allele of rs231775 (recessive model: p = 0.0279 for GG vs. AA+AG) in the CTLA4 gene contributed to a higher risk for cGVHD, although under different models. In addition, the T allele of rs28541784 in the CD28 gene was associated with a higher risk for cGVHD (Trend test: p = 0.0473; allele model: p = 0.0303 for T vs. C, OR = 2.78, and 95% CI = 1.11-6.98). Of the SNPs located in the PDCD1 gene, the C allele of rs6705653 (allele model: p = 0.0066 for C vs. T) and the A allele of rs2227982 (allele model: p = 0.0305 for A vs. G) also contributed to a higher risk for cGVHD.
Two SNPs (rs3181096 and rs3181098 in CD28) and four SNPs (rs4553808, rs62182595, rs16840252, and rs5742909 in CTLA4) were associated with protective effects against the development of GVHD in patients with AML and ALL, respectively.

Linkage Disequilibrium
The SNPs (n = 20) that were associated with the risk for adverse outcomes in patients with either AML or ALL were subjected to LD analysis (Figure 1). Several pairs of SNPs had high or complete LD, including rs3087243 with rs231775 (D' = 0.97), with rs62182595 (D' = 0.98), and with rs11571316 (D' = 0.96) in the CTLA4 gene; rs6705653 with rs41386349 (D' = 1) in the PDCD1 gene; and rs3181096 with rs3181098 (D' = 0.96) in the CD28 gene. In addition, there were three haplotype blocks, comprising SNPs in CD28, CTLA4, and PDCD1, respectively. These data imply a potential genetic linkage of these SNPs in the human genome.
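For readers unfamiliar with the D' measure, the following Python sketch shows how it is obtained from the four haplotype counts of a pair of biallelic SNPs. It is a simplified illustration, not the Haploview procedure: the function name `d_prime` is hypothetical, phased haplotype counts are assumed to be known (with unphased genotype data, haplotype frequencies would first have to be estimated), and the absolute value of D' is returned.

```python
def d_prime(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Absolute D' for two biallelic SNPs from phased haplotype counts.

    A/a are the alleles of the first SNP and B/b those of the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    if d_max == 0:
        return 0.0  # at least one SNP is monomorphic, so LD is not informative
    return abs(d) / d_max
```

A value of 1 indicates that at most three of the four possible haplotypes are observed, which is the pattern expected when no recombination has occurred between the two SNPs.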

DISCUSSION
SNPs located in the HLA regions have been reported to be associated with post-HSCT adverse outcomes (33). In this study, we investigated further whether there is an association between 34 donor SNPs in the four co-stimulatory genes (TNFSF4, CTLA4, CD28, and PDCD1) and the occurrence of adverse outcomes (mortality, relapse, CMV infection, and GVHD) for patients with AML and ALL. Our data revealed that 10 and 12 SNPs located in these four genes were related to the adverse outcomes of patients with AML and ALL, respectively. Co-stimulatory molecules play a critical role in immune regulation and are involved in the pathogenesis of autoimmune diseases, cancers, and graft rejection (34). During T-cell activation, CD28 provides a stimulatory signal when it interacts with CD80/CD86 on the antigen-presenting cells. CTLA4 is then expressed on the activated T cell, playing a role in negative regulation of T-cell activation by competing with CD28 for CD80/CD86 to prevent excessive T-cell activation (35,36). PDCD1, like CTLA4, plays a negative regulatory role in T-cell activation to develop immune tolerance, which can prevent the development of autoimmune diseases or prevent the immune system from killing cancer cells (37). In addition, the OX40 ligand encoded by TNFSF4 is key to coordinating innate and adaptive immune cells and plays an important role in the life cycle of immune cells, including differentiation, activation, inhibition, and apoptosis (38).
Several findings were noted in this study. Most adverse outcome-related SNPs are unique to patients with either AML or ALL, except rs6705653, which is associated with CMV infection in both leukemia types. These data imply that the four co-stimulatory molecules may elicit different functional activities toward AML and ALL cancer cells. In addition, several SNPs are related to more than one clinical outcome in patients with ALL: rs41386349 is related to the risk for CMV infection and GVHD I-II; rs36084323 is related to CMV infection and relapse; rs6705653 is related to CMV infection, GVHD I-II, and chronic GVHD; and rs2227982 is related to CMV infection, relapse, GVHD I-II, and chronic GVHD. These data further indicate that the co-stimulatory molecules are involved in multiple aspects of immune activity and in susceptibility to CMV infection in transplantation.
Notably, the SNPs of CTLA4 and PDCD1 are associated with several adverse outcomes in patients with AML or ALL. Four SNPs in the CTLA4 gene are associated with the risk for mortality (rs733618, rs11571316, rs3087243, and rs11571315). These SNPs are also known to be associated with autoimmune diseases and cancers (39). Another five SNPs in the CTLA4 gene are related to the status of GVHD (rs5742909, rs4553808, rs62182595, rs16840252, and rs231775). Among these SNPs, rs733618, rs11571315, rs11571316, rs5742909, rs4553808, rs62182595, and rs16840252 are within the promoter region, rs231775 is in exon 1, and rs3087243 is in the 3'-UTR of the CTLA4 gene. Because CTLA4 expression is important for evading surveillance by host immune cells (40), these SNPs are likely to modulate CTLA4 gene expression, thereby altering the immune response and conferring a risk for mortality post-HSCT. Consistent with this notion, SNPs located in the promoter region have been shown to affect gene expression (41). The genotypic variants of rs3087243 have been shown to be associated with CTLA4 expression in patients with inflammatory bowel disease (42). The A allele of rs231775 is known to confer higher mRNA efficiency than the G allele, leading to the production of more CTLA4 protein (30). It is worth investigating further whether the abovementioned SNPs regulate CTLA4 expression, leading to aGVHD and cGVHD. Five SNPs (rs36084323, rs5839828, rs41386349, rs6705653, and rs2227982) in the PDCD1 gene are associated with the risk for relapse, mortality, CMV infection, and GVHD. This is consistent with the key roles of PDCD1 in regulating the allogeneic immune response in transplantation. Maintenance of graft tolerance is related to the interaction between PDCD1 (PD-1) and PD-L (43). Post-transplantation lymphoproliferative disorder, which develops under conditions of T-cell dysfunction or immunosuppression after HSCT, is also related to the expression of PDCD1 (44). Among the SNPs, rs5839828 and rs36084323 are within the promoter region, rs6705653 and rs41386349 are in intron 4, and rs2227982 is in exon 5. The SNPs located in the promoter and exon regions may affect transcription and alter the encoded amino acid, respectively. Whether the intronic SNPs have any effect on PDCD1 expression is not clear. Nevertheless, intronic SNPs have been linked to aberrant splicing that results in mutant proteins (45).
Four SNPs (rs200353921, rs3181096, rs3181098, and rs28541784) in the CD28 gene are associated with the GVHD grades and relapse for patients with AML and ALL. These SNPs may directly or indirectly alter CD28 expression to induce different degrees of cellular responses (46), which, in turn, affect the risk of GVHD and relapse for leukemic patients after HSCT. The interplay between CD28 and GVHD has been reported in several previous studies. CD28 in donor T cells contributes to the pathogenesis and severity of GVHD in a mouse model (17). Abnormal expression of CD28 and CTLA4 in peripheral blood leukocytes of patients with AML may promote the development of aGVHD after HSCT (47). Consistent with these previous studies, our data revealed that the donor SNPs in the CD28 gene were related to the development of GVHD in AML patients, regardless of the grade status. Moreover, two SNPs (rs45454293 and rs1234314) in the promoter region of the TNFSF4 gene are associated with the development of GVHD grades III and IV for patients with AML. In accord with our findings, Tripathi et al. showed that the OX40L (TNFSF4)-OX40 interaction not only induces aGVHD but is also an essential part of the progression of aGVHD (48). The SNPs in the promoter region are likely to modulate OX40L (TNFSF4) expression, resulting in excessive OX40L-OX40 interaction, which subsequently increases the risk of GVHD.
FIGURE 1 | Linkage disequilibrium (LD) analysis of the donor SNPs that were associated with the adverse outcomes of patients with AML and ALL. The pairwise linkage disequilibrium (D') is given for each pair of SNPs. Red boxes indicate pairs of SNPs with high LD; the lighter the color, the lower the LD.
In addition to genetic studies associating SNPs with the prognosis of leukemia patients post-HSCT, other studies have integrated both clinical and genetic variables to generate predictive models for clinical outcomes after HSCT (49-51). In this regard, Martinez-Laperche et al. applied the least absolute shrinkage and selection operator (LASSO) procedure to generate a model that improves the prediction of severe GVHD (grades III-IV) (49). The model including both clinical and genetic variables performed better than models containing only clinical variables or only genetic polymorphisms. Other risk models integrating SNPs and clinical variables have also been shown to predict the risk for GVHD in specific organs (50,51). An extension of the current study is to integrate clinical variables with our SNP data for multivariate regression analysis and association study. Increasing the number of enrolled donor-recipient pairs may further validate and confirm the importance of these SNPs in the development of adverse outcomes post-HSCT for patients with leukemia.
In conclusion, a total of 10 and 12 SNPs in the co-stimulatory genes are associated with post-HSCT adverse outcomes for patients with AML and ALL, respectively. Because these SNPs are present in the donor DNA, they provide a basis for developing a screening panel of SNPs to search for and select appropriate donors for transplantation. It is also worth investigating further the effects of these SNPs on the expression of these co-stimulatory genes to elucidate the underlying mechanisms of transplantation failure.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.