HLA-G 3’UTR Polymorphisms Are Linked to Susceptibility and Survival in Spanish Gastric Adenocarcinoma Patients

HLA-G is a non-classical class I HLA molecule that induces tolerance by acting on receptors of both innate and adaptive immune cells. When overexpressed in tumors, it limits surveillance by the immune system. The HLA-G gene shows several polymorphisms that affect mRNA and protein levels. We decided to study the implication of two polymorphisms (rs371194629; 14bp INS/DEL and rs1063320; +3142 C/G) in paired tissue samples (tumoral and non-tumoral) from 107 Spanish patients with gastric adenocarcinoma and 58 healthy control individuals, to assess the possible association of the HLA-G gene with gastric adenocarcinoma susceptibility, disease progression and survival. The presence of somatic mutations involving these polymorphisms was also analyzed. The frequency of the 14bp DEL allele was increased in patients (70.0%) compared to controls (57.0%, p=0.025). In addition, the haplotype formed by the combination of the 14bp DEL/+3142 C variants was also increased in patients (54.1% vs 44.4%, p=0.034, OR=1.74, CI95% 1.05-2.89). Kaplan-Meier analysis revealed that 14bp DEL/DEL patients showed lower 5-year life expectancy than INS/DEL or INS/INS patients (p=0.041). Adjusting for TNM staging (Cox regression analysis) disclosed a significant difference in death risk (p=0.03), with an expected hazard 2.6 times higher. Finally, no somatic mutations were found when comparing these polymorphisms in tumoral vs non-tumoral tissues, which indicates that this is a preexisting condition in patients and not a de novo, tumor-restricted event. In conclusion, the variants predominant in patients were those increasing HLA-G mRNA stability and HLA-G expression, clearly involving this molecule in gastric adenocarcinoma susceptibility, disease progression and survival and making it a potential target for immunotherapeutic approaches.


INTRODUCTION
Gastric epithelial adenocarcinomas are the most common form of stomach tumors, representing 90% of the cases. Adenocarcinomas are usually located in the cardia (31.0%), antrum (26.0%) and body (14.0%), and their development is associated with infection by Helicobacter pylori, as 84.0% of the patients have been infected with this bacterium (1). In 2020, this tumor was the 6th most frequent cancer in the world (11.1 new cases per 100,000 persons per year), and the 5th with the highest mortality rate (7.7 patients per 100,000 persons per year), which made it one of the most aggressive tumors. The 5-year survival rate for this type of cancer was 29.5%, a situation that has been maintained for the last 30 years (2,3).
Tumors with a high frequency of somatic mutations, such as stomach malignancies (4), express tumor neo-antigens on their surface and are, thus, potential targets of the immune response. However, even in an immunocompetent organism, neoplastic cells develop mechanisms, such as the expression of immunomodulatory molecules such as HLA-G, to evade the action of the immune system. Published works describe that HLA-G plays a key role in the cancer immunoediting mechanism, attenuating the elimination of tumor cells (5)(6)(7)(8).
HLA-G is a non-classical class I HLA molecule composed of a heavy chain bound to β2-microglobulin. The HLA-G gene, located on chromosome 6, comprises 7 introns and 8 exons and encodes the heavy chain. Exon 1 codes for the signal peptide; exons 2, 3 and 4 for the extracellular domains α1, α2 and α3, respectively; and exons 5 and 6 for the transmembrane region and the cytoplasmic domain, respectively (9). Exon 7 is transcribed into the pre-mRNA but is not present in the mature mRNA, whereas exon 8 is not translated; the latter contains the 3'UTR region, involved in the transcriptional regulation of the gene (10,11).
The 3'UTR region of the HLA-G gene shows numerous variations that can have an impact on the mRNA and, therefore, on the protein levels (10). Of these variants, rs371194629 and rs1063320 have been studied in several types of cancer and pathologies (21)(22)(23).
The rs371194629 polymorphism (14bp INS/DEL) is caused by the deletion (DEL) from the ancestral variant (INS) (10, 24) of a 14bp segment (5'-ATTTGTTCATGCCT-3') located at the +2960 position of the 3'UTR region. This 14bp segment has been associated with both the splicing and the stability of the mRNA (10,25,26), as it contains an AUUUG domain putatively exerting an AU-pentamer-like effect, decreasing mRNA stability (27). Therefore, the DEL allele provides a higher stability of the mRNA (25), associated with a high expression of HLA-G (26).
The rs1063320 polymorphism (+3142 C/G) consists of the transversion of a cytosine (C, the ancestral variant) to a guanine (G) at position +3142 of the 3'UTR region, which modulates the affinity for this region of the miRNAs miR-148a, miR-148b and miR-152 (known to either favor direct mRNA degradation or block mRNA translation into protein) (28)(29)(30). When a C is found at position +3142, miRNA affinity decreases, increasing mRNA availability and the production of HLA-G (10, 11).
Other authors have described the expression of HLA-G in gastric tumors (31); therefore, studying the polymorphisms related to the expression levels of this molecule (such as 14bp INS/DEL and +3142 C/G) could serve to identify new genetic markers involved in the risk and evolution of this pathology.
We therefore decided to study the influence of the aforementioned polymorphisms, and of the haplotypes they form, in paired tissue samples (tumoral and non-tumoral) from a group of patients with gastric adenocarcinoma, and to compare the frequencies obtained with those of a control population. This approach will allow us to determine whether the HLA-G variants may be adequate gastric adenocarcinoma risk markers and whether somatic mutations take place in tumoral, but not healthy, gastric tissue.

PATIENTS, MATERIALS, AND METHODS
Samples used in this study were obtained from the Servicio de Cirugía General y Digestiva, Hospital Príncipe de Asturias (Alcalá de Henares, Madrid) and sent to the Departamento de Inmunología (Facultad de Medicina, Universidad Complutense de Madrid, Madrid), where they were processed.

Patients
One hundred and seven Spanish patients diagnosed with gastric adenocarcinoma were included in this study. Patients were classified according to the TNM staging criteria (stages I through IV) (32) (Table 1).

Tissue Samples
Tumoral (T) and distal, non-tumoral (NT), gastric tissue samples were obtained from each patient upon surgery. A total of 214 tissue samples (T+NT) were available for the study.

Controls
A total of 58 sex- and age-matched Spanish healthy donors, from the same geographic region as the patients, were included as controls. DNA was obtained from blood or saliva samples, as described in the following section.

Genomic DNA Extraction
DNA isolation, both from blood and tissue, was carried out using the Illustra Nucleon BACC (GE Healthcare) kit, following the manufacturer's instructions. In the case of tissue specimens, fragments of 25 mg were mechanically disrupted and subjected to proteinase K treatment, followed by the DNA precipitation protocol included in the kit. DNA from saliva control samples was extracted using the Oragene DNA 500 kit (DNA Genotek) and purified with PrepIT-L2P (DNA Genotek).
The concentration and quality of the DNA extracted from each sample were determined by spectrophotometric methods in a NanoDrop One (Thermo Scientific).

Analysis of the 14bp Polymorphism
The region of exon 8 containing the 14bp polymorphism was amplified by PCR, using previously published primers and conditions (33,34) (Table 2A), which were further confirmed using the NCBI BLAST tool. Amplified products (224bp for the INS variant or 210bp for the DEL variant) were resolved by electrophoresis in 3% agarose gels for 80 min at 90V (Figure 1A).
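The reading of the gel described above can be sketched as a small lookup from band sizes to genotype. This is only an illustration: the function name `call_14bp_genotype` and the sizing tolerance are assumptions introduced here, not part of the published protocol.

```python
# Hypothetical helper: call the 14bp INS/DEL genotype from the amplicon
# sizes seen on the gel (224 bp = INS, 210 bp = DEL, as stated in the text).
INS_BP = 224
DEL_BP = 210


def call_14bp_genotype(band_sizes_bp, tolerance_bp=5):
    """Return 'INS/INS', 'INS/DEL' or 'DEL/DEL' from observed band sizes.

    tolerance_bp is an assumed sizing error for reading an agarose gel;
    it must stay below 7 bp so the two expected windows do not overlap.
    """
    has_ins = any(abs(b - INS_BP) <= tolerance_bp for b in band_sizes_bp)
    has_del = any(abs(b - DEL_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_ins and has_del:
        return "INS/DEL"
    if has_ins:
        return "INS/INS"
    if has_del:
        return "DEL/DEL"
    raise ValueError("no band matches the expected 224 bp or 210 bp products")
```

A heterozygous sample shows both products, so `call_14bp_genotype([224, 210])` returns `"INS/DEL"`.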

Analysis of the +3142 C/G Polymorphism
The PCR primers and conditions used have been published previously (34,35) (Table 2B). The PCR product was subjected to BseSI digestion (PCR-RFLP) and resolved in a 2% agarose gel for 50 min at 90V. Effective digestion discloses a "G" at position +3142, yielding two bands of 316bp and 90bp, whereas an undigested amplicon indicates a "C" at this position and yields a single 406bp band (Figure 2A).
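The interpretation of the digestion pattern can be sketched in the same way. The band sizes come from the text; the function name `call_3142_genotype` and the tolerance are hypothetical choices made for this illustration.

```python
# Hypothetical helper: read the +3142 C/G genotype from a BseSI digest.
# Band sizes follow the text: 406 bp uncut = C; 316 bp + 90 bp = G.
def call_3142_genotype(band_sizes_bp, tolerance_bp=10):
    """Return 'C/C', 'C/G' or 'G/G' from the bands observed after digestion."""

    def seen(expected_bp):
        # True if any observed band lies within the assumed sizing error.
        return any(abs(b - expected_bp) <= tolerance_bp for b in band_sizes_bp)

    uncut = seen(406)             # C allele: amplicon not digested
    cut = seen(316) and seen(90)  # G allele: both fragments present
    if uncut and cut:
        return "C/G"
    if uncut:
        return "C/C"
    if cut:
        return "G/G"
    raise ValueError("band pattern does not match any expected genotype")
```

Requiring both the 316 bp and 90 bp fragments before calling a G allele is a design choice of this sketch, meant to avoid calling a genotype from a partial pattern.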

Sequencing
DNA sequencing (Sanger method) was carried out in 272 samples (107 patient paired samples plus 58 controls), to confirm the PCR and PCR-RFLP results (Figures 1B, 2B).

Statistics
The polymorphism data obtained by PCR or PCR-RFLP were analyzed with the SNPStats software. This tool assesses Hardy-Weinberg equilibrium (exact test), performs chi-square tests, and estimates odds ratios (OR) for the association between polymorphisms and disease by applying logistic regression models that consider the dominant, recessive, codominant and log-additive models of inheritance. SNPStats also allows the analysis of linkage disequilibrium, using the D statistic and a correlation coefficient, and the analysis of haplotypes (EM algorithm) (36,37).
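As an illustration of what a Hardy-Weinberg check does, the following minimal sketch computes the asymptotic chi-square goodness-of-fit statistic. Note that this is a swapped-in technique: the study used the exact test implemented in SNPStats, not this approximation. The function name and the genotype counts in the usage note are hypothetical, and both alleles are assumed to be present in the sample.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic and p-value (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts. This is the asymptotic
    test, shown only for illustration; it is not the exact test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With hypothetical counts of 25, 50 and 25 the observed genotypes equal the expected ones, so the statistic is 0 and the p-value is 1.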
The Kaplan-Meier method was used to estimate the 5-year survival function of patients with gastric cancer according to different genetic factors (GraphPad Prism 8.0 software). Multivariate Cox regression models were used to simultaneously assess the effect of genetic factors and other factors such as comorbidities, clinical features and demographic characteristics on the 5-year survival of patients (R software). For all the Cox regression fits, the individual and global Schoenfeld tests indicated that neither any covariate in the model nor the model as a whole violated the proportional hazards assumption, meaning that the hazard ratio stays constant over time (38).
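For readers unfamiliar with the Kaplan-Meier estimator, a minimal pure-Python sketch is given below. It is not the GraphPad Prism procedure used in the study; the function name and the follow-up data in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time per patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve
```

For five hypothetical patients with times `[6, 12, 12, 24, 60]` and events `[1, 1, 0, 1, 0]`, the estimate steps down at months 6, 12 and 24; censored patients reduce the risk set without producing a step.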
P-values below 0.05 were considered statistically significant. As two polymorphisms were considered, the Holm-Bonferroni (HB) sequential correction method for multiple testing was applied to the statistical analyses when required. The HB method compares the k-th ranked p-value (in ascending order) to the nominal significance level (0.05) divided by (n-k+1), where in this case n = 2 (the number of polymorphisms) and k = 1 and 2.
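The sequential rule just described can be sketched as follows. The function name `holm_bonferroni` is introduced here for illustration, and the strict inequality mirrors the "below 0.05" criterion stated in the text; the two p-values in the usage note are those reported in the Results for the log-additive and recessive models.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans (True = still significant), in input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    significant = [False] * n
    for rank, idx in enumerate(order, start=1):
        threshold = alpha / (n - rank + 1)
        if p_values[idx] < threshold:
            significant[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return significant
```

Calling `holm_bonferroni([0.034, 0.017])` compares 0.017 with 0.05/2 = 0.025 and then 0.034 with 0.05/1 = 0.05, so both tests remain significant.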

RESULTS
The clinical characteristics of patients are shown in Table 1.

Genetic Analysis
Hardy-Weinberg equilibrium was confirmed for both polymorphisms and groups of individuals included in the study (data not shown).

14bp INS/DEL Polymorphism
Comparing the frequency of the 14bp INS/DEL polymorphism in patients and our control group yielded statistically significant differences. The 14bp DEL variant was more frequent in the group of patients (70.0%) than in the control group (57.0%, p=0.025, see Table 3A). Genotype distribution (INS/INS, INS/DEL, DEL/DEL) showed no statistically significant differences between both groups (p=0.089, Table 3B), although DEL/DEL individuals were more abundant in patients (50.9%) than in controls (34.5%).
The best-fit model for this polymorphism, based on the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values, was the log-additive model, where each copy of DEL modifies the risk of developing gastric cancer in an additive form. DEL-allele bearers showed a higher risk of developing gastric cancer (p=0.034, OR=1.65, CI95% 1.04-2.61, see Table 3C).
As already mentioned, the DEL variant provides greater stability to the HLA-G mRNA, and, hence, yields higher levels of the protein, favoring tumor progression.

+3142 C/G Polymorphism
Although no significant difference was observed in the distribution of the alleles of this polymorphism (Table 4A), when considering the inheritance models of the different genotypes, significant differences were found. Based on the AIC and BIC values, the recessive inheritance model is the one that best fits the data obtained, with an increased frequency of C/C individuals in the group of patients (30.8%) compared to controls (14.3%, p=0.017, OR=2.68, CI95% 1.14-6.28, see Table 4C).
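As a rough check of the arithmetic behind this odds ratio, the sketch below computes a crude OR directly from the two C/C frequencies quoted above. It is not the logistic regression fitted by SNPStats, and the function name `odds_ratio` is introduced only for this illustration.

```python
def odds_ratio(p_cases, p_controls):
    """Crude odds ratio from the proportion of carriers in cases and controls."""
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


# C/C genotype frequencies quoted in the text: 30.8% of patients, 14.3% of controls.
or_cc = odds_ratio(0.308, 0.143)
```

The result is approximately 2.67, close to the reported OR of 2.68; the small gap is presumably due to rounding of the published percentages.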
Like the DEL variant of the previous polymorphism, the +3142 C variant would yield higher levels of the final protein, possibly favoring tumor progression.
Based on the Holm-Bonferroni sequential correction, significant associations are still detected in both +3142 (through the recessive model with the first-ranked p-value = 0.017<0.025) and 14bp (through the log-additive model with the second-ranked p-value=0.034<0.05) polymorphisms.

Haplotype Formed by the 14bp INS/DEL and +3142 C/G Polymorphisms
These two polymorphisms lie in close vicinity (182bp) and are in linkage disequilibrium, according to the values obtained with the SNPStats software (D'=0.98, r=0.73, Table 5), forming haplotypes.
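To make the two linkage disequilibrium measures concrete, the following sketch computes D, D' and r for a pair of biallelic loci from one haplotype frequency and the two allele frequencies. The function name and the frequencies in the usage note are hypothetical and are not the values in Table 5; SNPStats estimates haplotype frequencies with the EM algorithm, which is not reproduced here.

```python
import math


def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # D is scaled by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r = d / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r
```

With hypothetical values `ld_stats(0.5, 0.7, 0.5)`, allele B is found only on haplotypes carrying allele A, so D' equals 1 even though r is lower (about 0.65). This shows how D' can approach 1 while the correlation coefficient stays well below it, as in the values reported above.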
The frequency of the 14bp-DEL/+3142-C (DEL/C) haplotype was higher in patients (54.1%) than in controls (44.4%; p=0.034, OR=1.74, CI95% 1.05-2.89; Table 5). This is consistent with the allele frequencies obtained for each polymorphism, as the variants producing higher expression of HLA-G are more frequent in patients with gastric adenocarcinoma. Therefore, this supports the results obtained at the individual level for each polymorphism and denotes DEL/C as a possible risk haplotype.

Somatic Mutations
The distribution of the polymorphisms studied was compared in paired (T and NT) gastric tissue samples in all patients. In no instance was a difference found between the paired tissue samples analyzed, irrespective of the polymorphisms (14bp INS/DEL or +3142 C/G) considered. To further confirm these data, blood samples (EDTA) from patients were drawn, and the results obtained matched those of tissues (results not shown).

Survival Curves
Five-year survival rate of patients with gastric cancer was calculated. Only patients enrolled in the study for at least 5 years (

DISCUSSION
The association between HLA-G, immunoediting and cancer has been of great interest in recent years (40)(41)(42). In fact, the HLA-G-mediated signaling pathway is now considered a new therapeutic immune checkpoint, in addition to other well-established ones (41).
We thus decided to analyze the involvement of HLA-G in the development of gastric adenocarcinoma. To this end, we examined in a group of 107 patients the influence of two polymorphisms (14bp INS/DEL and +3142 C/G) that affect the stability and availability of the HLA-G mRNA and, therefore, the level of the corresponding protein. A higher frequency of the variants increasing HLA-G mRNA stability (14bp DEL and +3142 C) was expected in patients, as they would favor the development and progression of gastric adenocarcinoma. Our results clearly show that HLA-G polymorphisms are linked to gastric cancer susceptibility.

14bp INS/DEL Polymorphism
A significant difference was found in the frequency of the DEL allele in patients compared to controls (Table 3A). Since already published work (9)(10)(11) showed that this variant increases HLA-G expression, we suggest that patients bearing the DEL allele will extensively display HLA-G, allowing tumor progression. These differences were also confirmed using a larger cohort of controls of Italian origin (N=245) obtained in a bibliographic search (43) (data not shown): the DEL allele was present in 57.0% of individuals, significantly different from our diseased group (p=0.001). We used an Italian population as they have a similar HLA genetic background (44). Likewise, the 14bp DEL variant was linked to the development of breast, esophageal and colorectal cancer (39, 45-47). We may conclude that the 14bp DEL variant is a potential new gastric carcinoma risk marker.

FIGURE 4 | Multivariate Cox regression carried out to assess simultaneously the effect of other factors (sex, age, cancer stage, cancer location and cancer type) on 5-year survival of patients. Our analyses indicate that polymorphism 14bp is significantly associated with 5-year survival when adjusting by cancer stage. The "*" symbol means p < 0.05, and the "***" symbol means p < 0.005.

As far as inheritance models are concerned, and according to the AIC, BIC and OR values (Table 3C), the most appropriate model to describe the distribution of these genotypes is the log-additive one, whereby only the risk variant (DEL) has an effect on disease susceptibility, and this effect is additive. Thus, the presence of a single copy of DEL (as in INS/DEL individuals) implies 1.65 times the risk of suffering gastric cancer, and the presence of two copies (DEL/DEL individuals; 50.9% in patients vs 34.5% in controls) further increases this risk.
Previously published research revealed that the DEL allele affects the stability and levels of HLA-G mRNA (25,48) and the levels of soluble HLA-G protein (26,49). These reports reinforce the notion that the log-additive model is the one that best fits our study: the high frequency of DEL/DEL patients (50.9%) suggests a strong genetic predisposition of this genotype to the development of gastric cancer, tightly related to a high expression of HLA-G, since the more copies of DEL are present, the more the HLA-G mRNA is stabilized and translated.
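The form of the log-additive model can be illustrated with a short calculation: log-odds increase linearly with the number of risk alleles, so the odds ratio multiplies with each copy. The sketch below uses the per-allele OR of 1.65 quoted from Table 3C; the function name is hypothetical, and the two-copy value is a consequence of the model's form rather than a separately reported estimate.

```python
import math

PER_ALLELE_OR = 1.65  # log-additive OR per DEL copy, as quoted in the text


def log_additive_or(copies, per_allele_or=PER_ALLELE_OR):
    """OR relative to INS/INS for a genotype carrying `copies` DEL alleles."""
    # Log-odds increase linearly with the number of risk alleles.
    return math.exp(copies * math.log(per_allele_or))
```

Under this model, `log_additive_or(2)` gives 1.65 squared, roughly 2.72, for DEL/DEL individuals relative to INS/INS.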
Moreover, similar findings have been reported by other authors. Eskandari-Nasab E et al. (47) reported a higher frequency of the 14bp DEL allele and of the DEL/DEL genotype in breast cancer patients compared to the control group. Likewise, Jiang Y et al. (45) reported in a meta-analysis that the HLA-G 14bp DEL allele and the DEL/DEL genotype were associated with increased cancer risk.

+3142 C/G Polymorphism
As for the +3142 C/G polymorphism, no statistically significant differences in allele frequency were detected between patients and control subjects, although there is a slight increase in the frequency of the C allele in patients compared to controls (Table 4A), and an increase in the frequency of the C/C genotype (Table 4B); again, this result fits the proposed hypothesis, since bearing the C allele favors a higher expression of HLA-G (9-11, 28, 30).
In this case, the best inheritance model, based on AIC and BIC values, is the recessive one (Table 4C). According to this model, C/C individuals present a 2.68 times increased risk of developing gastric cancer. Previous works described that +3142C leads to higher expression of HLA-G (28,30), due to a lower affinity of different miRNAs (miR-148a, miR-148b and miR-152) for the HLA-G mRNA. Similarly to 14bp DEL, which is described to increase mRNA stability, the +3142 C/C genotype is overrepresented in patients (30.8% vs 14.3%, p=0.017), possibly implicating this polymorphism in the development of gastric cancer.
Again, results lending support to our data have been published elsewhere. Jiang Y et al. (50) described in a meta-analysis that the HLA-G +3142 C>G mutation significantly decreased cancer risk, both in the allelic and recessive comparison models.

Haplotype Formed by the 14bp INS/DEL and +3142 C/G Polymorphisms
Disease susceptibility has long been linked to extended haplotypes of the HLA system (51,52). The combination of the alleles of these two polymorphisms renders different haplotypes. Assuming the role the variants may have in HLA-G levels and cancer risk, the combination of the DEL and C alleles would pose the highest cancer risk. In fact, and according to the calculations carried out by the SNPStats software, the INS/G combination (considered as the reference value by the software) would be underrepresented in patients, whereas the DEL/C haplotype is significantly more frequent in patients (54.1%) than in healthy controls (44.4%) (Table 5). This indicates that the former haplotype is a protective factor, while the latter is a risk factor for this disease. The increase in the frequency of the DEL/C haplotype matches the results obtained in the analysis of the individual polymorphisms, suggesting that both the 14bp DEL and +3142C variants (that presumably lead to a higher HLA-G expression) are associated with susceptibility to gastric adenocarcinoma. This is, to the best of our knowledge, the first time that an association between the HLA-G 3'UTR region and the development of gastric cancer is disclosed in our population.

Somatic Mutations
Random mutations take place in cancer, conferring on cells a proliferative and invasive capacity and allowing them to escape immune surveillance. In our case, an increase in variants favoring HLA-G expression and thus abrogating the immune response (i.e., 14bp DEL and +3142C) could be expected in tumoral (T) but not in non-tumoral (NT) distal cells. However, after analysis of paired (T+NT) tissue samples from the 107 patients studied, no somatic mutations were found. The polymorphism present in each T sample analyzed matched that of the paired NT sample in every single patient tested. We can then confidently conclude that the polymorphisms studied here, and their influence on gastric cancer, are a pre-existing condition in these patients.

Survival
Beyond their effect on disease risk, we assessed whether these variants were involved in life expectancy. None of the comorbidities studied were linked to patient survival, whereas TNM staging and the 14bp polymorphism revealed a clear association. The disease-specific survival rate (Figures 3 and 4) is significantly diminished, and the hazard ratio increased (by up to 14-fold), in patients who, according to published works (25,26,48,49), express higher levels of HLA-G (in our case, bearers of the DEL/DEL genotype). This would allow tumor cells to evade the immune system and proliferate unchecked, leading to disease dissemination and death. The 14bp DEL variant has already been associated with worse survival in a cohort of patients with colorectal cancer (39), a tumor with similar histological features to gastric adenocarcinoma, which supports the relevance of this polymorphism in the progression of cancer.
This finding suggests the possibility of considering this molecule as a potential target for therapeutic approaches. Downregulating HLA-G expression with miRNAs (as has been done in other clinical settings) (53)(54)(55) or blocking (with monoclonal antibodies) its interaction with cognate receptors (in a way similar to PD-1/PD-L1 current immunotherapies) (56)(57)(58), will make tumors visible to immunocompetent cells, eliciting an active immune response.
Although the functional effect of these polymorphisms on HLA-G expression could explain their linkage to disease susceptibility and progression, we cannot exclude the possibility that these polymorphisms are in linkage disequilibrium (LD) with other genes (e.g., other class I HLA genes) that could truly mediate the development and prognosis of this disease.
A limitation of this study is that our research is focused on genetic polymorphisms and, although we have confirmed HLA-G expression in tissue and sHLA-G in plasma in part of our cohort of patients (Supplementary Figure 1), we could not evaluate the association of HLA-G expression with the risk of developing gastric cancer or the survival of patients.
Further studies, focusing on HLA-G expression on the patients, are required to precisely assess the role of HLA-G in gastric cancer.
Beyond these limitations, a larger study with a more extensive cohort of patients and including other 3'UTR polymorphisms (such as +3003 T/C or +3184 A/G) would increase the reliability of the associations proposed herein. To ease comparisons between groups, future studies with larger cohorts will include ancestry informative markers (AIMs).
We conclude that the polymorphisms 14bp INS/DEL and +3142 C/G of the HLA-G gene mediate gastric cancer risk and survival, and suggest the possibility of establishing new therapeutic approaches aiming at counterbalancing the negative role of this protein in tumors.

DATA AVAILABILITY STATEMENT
The nucleotide sequences have been deposited in GenBank under accession numbers MZ130952-MZ130955.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Comité Ético de Investigación Clínica, Hospital Clínico San Carlos, Madrid, Spain. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
CV-Y: manuscript writing, investigation, and analysis. IJ: design, analysis, and supervision. MM-A: investigation support and validation. EM-L: analysis. AL-N: investigation support. FS-T: manuscript revision. AG-C, AL-G, IL, and RG: patient follow-up, sample and data collection. AA-V: critical review, project administration, and funding acquisition. JM-V: manuscript writing and revision, supervision, project administration, and funding acquisition. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by grants from Instituto de Salud Carlos III (PI18/00626 and PI18/00721), with funds from the European Union (Fondo Europeo de Desarrollo Regional FEDER). IJ is the recipient of a Universidad Complutense de Madrid-Real Colegio Complutense Harvard grant (Ayudas para contratos predoctorales de personal investigador en formación CT18/16).

ACKNOWLEDGMENTS
We thank Prof. E. D. Carosella (Saint Louis Hospital, Paris, France) for scientific support and Darío Martínez Martínez for help with data analysis.