The Contribution of Genetic Variants to the Risk of Papillary Thyroid Carcinoma in the Kazakh Population: Study of Common Single Nucleotide Polymorphisms and Their Clinicopathological Correlations

Objective: Risk for developing papillary thyroid carcinoma (PTC), the most common endocrine malignancy, is thought to be mediated by lifestyle, environmental exposures and genetic factors. Recent progress in genome-wide association studies of thyroid cancer has led to the identification of several genetic variants conferring risk of this malignancy across different ethnicities. We set out to elucidate the impact of selected single nucleotide polymorphisms (SNPs) on PTC risk and, for the first time, to evaluate clinicopathological correlations of these genetic variants in the Kazakh population.
Methods: Eight SNPs were genotyped in 485 patients with PTC and 1,008 healthy control Kazakh subjects. Association analysis and multivariable modeling of PTC risk by the genetic factors, supplemented with rigorous statistical validation, were performed.
Results: Five of the eight SNPs, rs965513 (FOXE1/PTCSC2, P = 1.3E-16), rs1867277 (FOXE1 5'UTR, P = 7.5E-06), rs2439302 (NRG1 intron 1, P = 4.0E-05), rs944289 (PTCSC3/NKX2-1, P = 4.5E-06) and rs10136427 (BATF upstream, P = 9.8E-03), were significantly associated with PTC. rs966423 (DIRC3, P = 0.07) showed a suggestive association. rs7267944 (DHX35) was associated with PTC risk in males (P = 0.02), rs1867277 (FOXE1) conferred higher risk in subjects older than 55 years (P = 7.0E-05), and rs6983267 (POU5F1B/CCAT2) was associated with pT3–T4 tumors (P = 0.01). The contribution of the genetic component (unidirectional independent effects of rs965513, rs944289, rs2439302 and rs10136427, adjusted for age and sex) to PTC risk in the analyzed series was estimated to be 30–40%.
Conclusion: The genetic factors analyzed in the present work display significant association signals with PTC, either in the whole-group analysis or in particular clinicopathological groups, and account for about one-third of the risk for PTC in the Kazakh population.

INTRODUCTION
Papillary thyroid carcinoma (PTC), a well-differentiated thyroid cancer of follicular cell origin, accounts for about 80% of all thyroid cancers worldwide and is the most common endocrine malignancy. According to the IARC, thyroid cancer affected 567,233 patients worldwide in 2018, making it the 9th most prevalent human cancer (3.1%), with an average age-standardized incidence of 6.7 and a mortality rate of 0.42 per 100,000 population (1). Region-specific incidence rates vary broadly, from 1.0 per 100,000 population in Micronesia to 15.0 in North America. In Kazakhstan, the age-standardized incidence of thyroid cancer was 2.4 per 100,000 population, accounting for 1.4% of all newly diagnosed cancers in the country in 2018.
With improvements in cancer detection and diagnosis, the incidence of thyroid cancer is growing in most countries, displaying one of the fastest rates of increase among common cancers. While advances in thyroid nodule visualization, such as ultrasound imaging, and the ease of nodule assessment by ultrasound-guided fine-needle aspiration cytology have likely contributed to this uptrend, additional reasons for the increasing incidence are under investigation. Besides ionizing radiation, the well-established risk factor for thyroid cancer, other environmental agents are being discussed or considered, including iodine deficiency, natural and technogenic pollutants with hormone-disrupting effects, exposure to excessive nitrate (2,3) and various chemicals.
As a complex disease, PTC is thought to depend not only on environmental but also on genetic factors. Studies of familial thyroid cancer have estimated the contribution of the genetic component to disease risk to range from 28 to 53% (4,5). At the population level, hereditary factors that may contribute to a phenotype (e.g. the development of a condition or a disease) are usually identified in genetic association studies. To date, a number of well-powered genome-wide association studies (GWAS) and target gene investigations in thyroid cancer have been performed in groups of different ethnicities, in individuals either exposed or not exposed to radiation (6-14). GWAS findings and subsequent independent replication studies have convincingly demonstrated robust associations of the SNPs rs965513 (FOXE1, forkhead box E1 and/or PTCSC2, papillary thyroid carcinoma susceptibility candidate 2; chromosome 9q22.33), rs944289 (PTCSC3, papillary thyroid carcinoma susceptibility candidate 3 and/or NKX2-1, NK2 homeobox 1; 14q13.3), rs1867277 (FOXE1; 9q22.33), rs2439302 (NRG1, neuregulin 1; 8q12) and rs966423 (DIRC3, disrupted in renal carcinoma 3; 2q35) with differentiated thyroid cancer, principally PTC (15-27), reviewed in (28). The strength of the association signal for these SNPs, in terms of odds ratios (OR), ranged from 1.28 to 1.70 in most studies. More recent studies have identified associations between rs6983267 (POU5F1B, POU class 5 homeobox 1B and/or CCAT2, colon cancer associated transcript 2; 8q24) and thyroid cancer in different populations. A systematic review with meta-analysis of four studies, which included a total of 2,825 cases and 9,684 controls, confirmed the G allele of rs6983267 to be a risk factor for thyroid cancer (OR = 1.08, P = 0.01) (29).
Novel GWAS candidate loci continue to emerge. A recent combined analysis of GWAS results and an Italian replication study provided evidence of association of differentiated thyroid cancer risk with rs10136427 (BATF, basic leucine zipper ATF-like transcription factor; 14q24.3; OR = 1.40, P = 4.35E-07) and rs7267944 (DHX35, DEAH-box helicase 35; 20q12; OR = 1.39, P = 2.13E-08). These associations were replicated in the Polish and Spanish populations with little evidence of population heterogeneity (combined OR = 1.30, P = 9.30E-07 and OR = 1.32, P = 1.34E-08, respectively) (10).
To the best of the authors' knowledge, the associations of rs10136427 (BATF, 14q24.3) and rs7267944 (DHX35, 20q12) with PTC have not yet been replicated in independent studies. We therefore aimed to examine the six well-described SNPs discussed above, namely rs965513 (FOXE1/PTCSC2, 9q22.33), rs944289 (PTCSC3/NKX2-1, 14q13.3), rs1867277 (FOXE1; 9q22.33), rs2439302 (NRG1, 8q12), rs966423 (DIRC3, 2q35) and rs6983267 (POU5F1B/CCAT2, 8q24.2), and the two SNPs newly discovered to be associated with thyroid cancer (10), rs10136427 (BATF, 14q24.3) and rs7267944 (DHX35, 20q12), in a relatively large case-control series. This work is the first to characterize the eight SNPs in the Kazakh population. In addition to the classical association analysis, we estimated the contribution of the genetic variants to PTC risk and assessed their relationships with the clinicopathological characteristics of tumors, since the information available in the literature is very limited.

Study Population
A total of 485 patients with histologically confirmed PTC (90.3% females; mean age 54.78 ± 13.3 years, range 18-87 years) who underwent surgery between 1980 and 2015, and 1,008 healthy control subjects (78.7% females; mean age 39.0 ± 15.8 years, range 15-83 years), all of Kazakh origin, were recruited. Clinicopathological information was retrieved from medical records (Table 1). The pathological classification was based on the World Health Organization definitions (30). Pathological staging (pTNM, where the T category defines the anatomic extent of the tumor, N the lymph node involvement and M distant metastases) followed the 7th edition of the TNM classification system (31). Patients and control subjects had no history of radiation exposure. All participants or their parents/guardians gave written informed consent in accordance with the Declaration of Helsinki. A peripheral venous blood sample was collected from each participant. The protocol of this study was approved by the corresponding ethics committees.

DNA Isolation and Genotyping
Blood DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN, Tokyo, Japan) according to the manufacturer's protocol. DNA of sufficient quality for genotyping was obtained from all 485 PTC patients and 1,008 control participants.
Genotyping was performed with predesigned Custom Applied Biosystems TaqMan SNP Genotyping Assays (Table 2) using TaqMan Genotyping Master Mix (all reagents from ThermoFisher Scientific) and 10 ng of genomic DNA per 10 µl reaction in a LightCycler 480 (Roche, Indianapolis, IN). Cycling conditions for all SNPs were as follows: denaturation at 95°C for 10 min, followed by 60 cycles of 92°C for 15 s and 62°C for 1 min. As a quality control, 15-20% of all samples were randomly selected and re-run in duplicate for each SNP. Full concordance between the experiments was observed.

Association Analysis
We used PLINK 1.9 (32) software to run multiplicative genetic models in the case-control sample for each SNP, with age and sex as covariates. This type of model evaluates the impact of individual alleles of a polymorphic locus on the disease. Multiplicative models have been used in the vast majority of genome-wide and replication association studies of thyroid cancer (6-27); using them in our work allowed us to compare the strength of association signals (ORs) between the previous studies and our findings. For consistency, the risk alleles were assigned according to the cited literature sources; summary information on the risk alleles is provided in Table 3. Multiple testing correction (the Benjamini-Hochberg method) and the adaptive label-swapping permutation test (maximum of 10^6 permutations) were performed using options available in PLINK.
Associations between each SNP and the clinicopathological parameters of PTCs were assessed using logistic regression models with binary outcomes, namely sex (F vs M), age (≥55 vs <55 years old), pathological tumor (pT) category (pT3 or pT4 vs pT1 or pT2) or nodal disease (N1 vs N0, i.e. present vs absent), as dependent variables, and individual SNPs, age and sex (where applicable) as explanatory variables. The LOGISTIC procedure in the 3.71 release of SAS Studio for the 9.4M5 version of SAS (SAS Institute, Cary, NC, USA) was used for these calculations. Exact two-sided tests, permutation tests and the exact test for equality of allele frequencies for stratified groups were performed using the 'HardyWeinberg' package in R (37).
Table footnotes: 1 Mean ± standard deviation and (range) for age in years, count data and (%) for other variables. 2 The pathological cancer staging (pTNM, where the T category defines the anatomic extent of cancer for the tumor, N for the lymph nodes and M for distant metastases; 0, 1 and X in the N and M categories correspond to absent, present, and unknown, respectively) according to the 7th edition of TNM classification system (31).
All tests were two-sided; P < 0.05 was considered statistically significant.
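The Benjamini-Hochberg correction mentioned in this section was run with PLINK's built-in options. As a minimal sketch of what that adjustment does, the Python function below computes Benjamini-Hochberg adjusted P-values for a list of raw P-values. The function name and the four example P-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four tests (illustration only).
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A test is declared significant at a false discovery rate of 5% when its adjusted P-value is below 0.05.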

Predictive Modeling of Papillary Thyroid Carcinoma
To evaluate the impact of the genetic component on PTC risk in the given case-control sample, we used multivariable logistic regression modeling. The initial full model included all eight SNPs in the study, with age and sex, as explanatory variables. The reduced model was determined by stepwise or non-automatic selection of variables so as to minimize the Akaike information criterion. Statistical validation of the reduced model was performed using permutation analysis as described previously (38), and by bootstrapping with a 0.9 sampling rate (i.e., selecting 90% of the data for each sample using the unrestricted random sampling method) in 10,000 replicates using the SURVEYSELECT procedure. Receiver operating characteristic (ROC) analysis, supplemented with leave-one-out cross-validation, was performed to assess the predictive performance of the reduced model.
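The validation steps above were carried out in SAS. Purely as an illustration, the Python sketch below shows two of the ingredients: the area under the ROC curve computed as the probability that a randomly chosen case scores higher than a randomly chosen control, and a 90% resample. It assumes that "unrestricted random sampling" means sampling with replacement. The function names, scores and labels are hypothetical.

```python
import random


def roc_auc(scores, labels):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes are required to compute the AUC")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def resample(records, rate, rng):
    """Draw round(rate * n) records with replacement (assumed meaning of unrestricted sampling)."""
    k = round(rate * len(records))
    return rng.choices(records, k=k)


# Hypothetical predicted risks and case (1) / control (0) labels.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels))

rng = random.Random(0)
replicate = resample(list(zip(scores, labels)) * 5, 0.9, rng)
print(len(replicate))
```

In a full validation, the model would be refitted on each replicate and the spread of the resulting estimates compared with the original confidence intervals.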
After obtaining evidence that certain examined SNPs display statistically significant association signals, we set out to determine the performance of a statistical model of the risk for PTC based on genetic factors. The reduced model included four SNPs: rs965513 (FOXE1/PTCSC2), rs944289 (NKX2-1, PTCSC3), rs2439302 (NRG1) and rs10136427 (BATF) (Table 5). Their association signals remained significant after correction for multiple testing (Bonferroni and FDR). Statistical validation by permutation confirmed the significant association of these SNPs with cancer risk, and the confidence intervals remained nearly unchanged on bootstrapping. Of note, the OR estimates for the four SNPs in the model were very similar to those obtained in the single-SNP models (see Table 4), indicative of an independent contribution of each SNP to thyroid cancer risk. The area under the ROC curve (AUC) was 0.82 (95% CI 0.80-0.84; P = 3.2E-183 as compared with a random classifier); cross-validation did not demonstrate model overfit returning a
The strong association of rs7267944 (DHX35) with patients' sex prompted us to test its association with PTC risk using stratified sampling. While no association was found in females (OR = 0.95, 95% CI 0.76-1.18, P = 0.62 adjusted for age), the association signal in males was significant (OR = 1.83, 95% CI 1.09-3.09, P = 0.02 adjusted for age). The difference in effect size was statistically significant (P = 0.023, the Breslow-Day test). No deviations from Hardy-Weinberg equilibrium in the groups of PTC patients or healthy control subjects, either non-stratified or stratified by sex, were found for this genetic variant (P > 0.4 for any exact two-sided test, P > 0.4 for any permutation test). The exact test for equality of allele frequencies between males and females was negative in the control subjects (P = 0.06), but a strong inequality was observed in PTC patients (P = 8.26E-05), in line with the other statistical findings.
The modifying effect of age on rs1867277 (FOXE1) was tested in the respective groups of patients and control subjects younger or older than 55 years. In the younger group, rs1867277 displayed an association signal with OR = 1.44 (95% CI 1.16-1.80, P = 9.5E-04 adjusted for sex), and in the older group with OR = 1.84 (95% CI 1.36-2.49, P = 7.0E-05 adjusted for sex). It was not surprising that the association was significant in both age groups, as rs1867277 was significant in the whole-group association analysis, which was adjusted for age (see Table 4). The effect of rs1867277 on the risk for PTC appeared more pronounced in subjects older than 55 years, although the difference did not reach statistical significance (P = 0.20). No deviations from Hardy-Weinberg equilibrium in the groups of PTC patients or healthy control subjects were found for rs1867277 (P > 0.1 for any exact two-sided test, P > 0.1 for any permutation test). The exact test for equality of allele frequencies was negative for subjects younger than 55 years (P = 0.85), while in older subjects an inequality was found (P = 0.03).
We also tested the association of rs6983267 (POU5F1B/CCAT2) with PTC of different pT stage. In pT1-T2 tumors, the association was insignificant with OR = 1.0 (95% CI 0.82-1.21, P = 0.96 adjusted for age and sex), while in pT3-T4 tumors the signal was significant, OR = 1.47 (95% CI 1.09-1.98, P = 0.01 adjusted for age and sex). The difference in ORs was statistically significant (P = 0.03). No deviations from Hardy-Weinberg equilibrium in the groups of PTC patients or healthy control subjects were found for rs6983267 (P > 0.1 for any exact two-sided test, P > 0.1 for any permutation test). Exact test for equality of allele frequencies in the subgroup of PTCs of pT1-pT2 category was negative (P = 0.32) and marginally significant in pT3-pT4 PTCs (P = 0.07).
Table footnotes: 1 The multiplicative model adjusted for age and sex. 2 The risk allele is specified in brackets. 3 False discovery rate, the Benjamini-Hochberg procedure.
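The stratified comparisons above report a P-value for the difference between two odds ratios, using the Breslow-Day test in the case of rs7267944. The genotype counts needed for that test are not given in the text, so the sketch below uses a different and simpler approximation: a z-test on the difference of log odds ratios, with standard errors recovered from the reported 95% confidence intervals. It is not the authors' procedure. For the three pairs of estimates quoted above, it gives P-values close to the reported ones.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)


def heterogeneity_test(or_a, ci_a, or_b, ci_b):
    """z statistic and two-sided P-value for log(OR_b) - log(OR_a)."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (math.log(or_b) - math.log(or_a)) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# Estimates quoted in the text: (OR, (lower, upper)) for each stratum.
comparisons = {
    "rs7267944, females vs males": (0.95, (0.76, 1.18), 1.83, (1.09, 3.09)),
    "rs1867277, <55 vs >=55 years": (1.44, (1.16, 1.80), 1.84, (1.36, 2.49)),
    "rs6983267, pT1-T2 vs pT3-T4": (1.00, (0.82, 1.21), 1.47, (1.09, 1.98)),
}
for name, args in comparisons.items():
    z, p = heterogeneity_test(*args)
    print(f"{name}: z = {z:.2f}, P = {p:.3f}")
```

Because the inputs are rounded published values, small discrepancies from the reported P-values are expected.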

DISCUSSION
In the present study we set out to determine the impact of genetic factors on PTC risk in the Kazakh population. We focused on several SNPs found to display confident association signals in previous studies across different populations/ethnicities, and also tested two SNPs that have been newly discovered in GWAS of thyroid cancer. Our genotyping results confirmed the associations of rs965513 (FOXE1/PTCSC2, 9q22.33), rs1867277 (FOXE1 5'UTR, 9q22.33), rs944289 (PTCSC3/NKX2-1, 14q13.3) and rs2439302 (NRG1 intron 1, 8q12) with PTC, using conventional multivariable analysis supplemented by rigorous statistical validation.
The functional roles of rs965513 and rs1867277, located in the FOXE1 locus on chromosome 9q22.33, have been linked to the transcriptional regulation of FOXE1 and PTCSC2. rs965513 was shown to affect the expression of FOXE1, PTCSC2 and TSHR (thyroid stimulating hormone receptor) in thyroid tissue (34), and rs1867277 regulates FOXE1 expression through the recruitment of USF1/USF2 transcription factors (15), implicating these SNPs in thyroid homeostasis and development. Of note, transgenic mice overexpressing FOXE1 in their thyroids displayed retarded proliferation of follicular cells, suggestive of a tumor suppressor function (39). A meta-analysis that combined data from 23 studies in different countries and ethnicities estimated that the rs965513[A] risk allele had an OR = 1.58 (95% CI 1.32-1.90) in the pooled sample, and OR = 1.65 and 1.49 in Caucasian and Asian populations, respectively (40). Interestingly, in the Kazakh population, which is of Asian descent, we found an OR = 2.25 (95% CI 1.85-2.73), which is one of the strongest association signals ever reported for the FOXE1 locus. Also of interest is the age-related effect of rs1867277 (i.e., the higher risk in patients aged more than 55 years), which is reported here for the first time. It should be noted that although rs965513 and rs1867277 are located in the same genetic locus, their effects on PTC risk are independent (41). The age relatedness of the rs1867277 effect could be addressed in independent or already available studies. rs944289, located on chromosome 14q13.3, regulates expression of the PTCSC3 lncRNA, which has a tumor suppressor effect in thyroid cancer cell lines, through the recruitment of C/EBPα and C/EBPβ transcription factors (35). The PTCSC3 level was found to be significantly lower in PTC than in normal thyroid tissue (26), which corresponds well with its tumor suppressor function.
Interestingly, besides PTC, rs944289 was also associated with follicular adenoma (26), indicating its broader function in thyroid tumorigenesis. rs2439302 is located in intron 1 of NRG1 on chromosome 8q12. NRG1 encodes a ligand of human epidermal growth factor receptor 3 (HER3), whose dimers with HER2 can activate the MAPK and AKT pathways known to play an important role in PTC (42). Like PTCSC3, NRG1 was also earlier associated with follicular adenoma (26).
Table footnotes: 1 Multivariate logistic regression adjusted for age and sex unless otherwise specified. 2 The T category (defines the anatomic extent of cancer for the tumor) from the pathological cancer staging according to the 7th edition of TNM classification system (31); here, the advanced tumors (pT3-T4) are contrasted to less advanced tumors (pT1-T2). 3 The N category (defines the regional lymph node involvement) from the pathological cancer staging according to the 7th edition of TNM classification system (31); here, cases with nodal disease (N1) are contrasted to those without lymph node involvement (N0). 4 The Cochran-Armitage test for trend. 5 Adjusted for age. 6 Adjusted for sex. 7 The risk allele is specified in brackets. Statistically significant associations are shown in bold.
rs10136427, whose association with PTC risk was confirmed in the Kazakh population for the first time, was previously detected by a GWAS in the Italian, Polish, and Spanish populations, which provided strong evidence of its association with DTC (12). rs10136427 is located in an intergenic region upstream of BATF. BATF proteins are "AP-1 inhibitors"; findings in mouse myeloid leukemia cells suggested that they can act as tumor suppressors by promoting cell growth arrest and cell differentiation. Whether BATF could play a similar role in other tissues, such as the thyroid, remains unknown. Since this genetic variant was not directly associated with BATF expression, it was suggested that this genetic locus may act as a trans-regulatory region controlling the expression of distant genes that reside on the same or even different chromosome(s) (trans-eQTL) (12).
rs966423, located in a DIRC3 intron on chromosome 2q35, displayed a marginally significant association (P = 0.07) in the Kazakh population. This genetic variant was significantly associated with the risk for thyroid cancer in both European and Asian ethnicities, with ORs from 1.28 to 1.34 (reviewed in (28)). In our study, the OR = 1.18 (adjusted for age and sex) is lower than those previously described. It is therefore likely that our sample size (485 PTC patients and 1,008 healthy control subjects) did not provide sufficient statistical power for this variant (achieved power 44%). We interpret the association signal of this genetic variant as suggestive in the Kazakh population. The functional role of the DIRC3 lncRNA is likely to be that of a tumor suppressor, and its relevance to thyroid cancer (8,9) and other human malignancies such as, originally, familial renal cancer (43), melanoma (44), breast cancer (45) and laryngeal squamous cell carcinoma (46) has been reported.
The finding for rs7267944, which is located approximately 280 kb telomeric to DHX35 on chromosome 20q12, was somewhat unexpected. While the signal of rs7267944 was insignificant in the whole-group association analysis, we noticed a strong modifying effect of sex. Accordingly, we found a significant association of rs7267944 with PTC in males but not in females. DHX35 encodes a putative RNA helicase of the DEAD/DEAH-box family, whose members are implicated in translation initiation, RNA splicing, and ribosome and spliceosome assembly. DHX35 is relatively highly expressed in the endometrium, ovaries, prostate and testes, possibly pointing to a sex-specific biological function (47). DHX35 is also expressed in the thyroid, although its role in tissue homeostasis and carcinogenesis remains unestablished. Within our study we could not determine the reason for the association of rs7267944 with sex, which, besides a true association, could reflect sampling bias, a phenomenon specific to the given ethnic group (and relevant environmental exposures), or chance. This could be clarified in an independent study in the Kazakh population, and also in other ethnic groups, by researchers with access to genotyping data for rs7267944 [or other SNP(s) in strong or perfect linkage disequilibrium with it] and to clinical/demographic information. It is also possible that rs7267944 points to a genetic factor other than DHX35, on chromosome 20q12 or elsewhere, due to a trans-eQTL effect.
A recent GWAS identified rs6983267 (POU5F1B/CCAT2) as a key locus in the 8q24 region previously associated with DTC/PTC. However, the association with thyroid cancer was somewhat controversial, since a significant association was found in the Polish and UK populations but not in the Spanish, Italian, and Japanese groups (10). In the Kazakh population under study, we did not observe a significant association signal in the whole-group analysis, yet a correlation with higher pT tumor stage was detected. When pT3-T4 tumors were tested, a significant association with PTC risk was confirmed. It is tempting to attribute the controversies in the association of rs6983267 with thyroid cancer in different populations not only to genetic heterogeneity but also to different distributions of the clinicopathological characteristics of tumors in country-specific samples. rs6983267 resides in the intronic region of POU5F1B, which encodes a transcriptional activator implicated in cancers of multiple sites (48-51). It is worth noting that this genetic variant is also located inside the CCAT2 lncRNA, which is upregulated in colon cancer and implicated in other human malignancies (52-55). The exact roles of POU5F1B and CCAT2 in PTC remain to be established.
Having confirmed the associations of rs965513 (FOXE1/PTCSC2), rs944289 (PTCSC3/NKX2-1), rs1867277 (FOXE1), rs2439302 (NRG1), rs10136427 (BATF) and, suggestively, rs966423 (DIRC3) with PTC in the whole group, and of rs7267944 (DHX35) and rs6983267 (POU5F1B/CCAT2) on subgroup analysis, we combined these genetic variants in a statistical model to evaluate their contribution to PTC risk, since the predictive strength of genetic variants can be improved by combining multiple SNPs in a model (27,56). The final model, which included four SNPs, rs965513 (FOXE1/PTCSC2), rs944289 (NKX2-1, PTCSC3), rs2439302 (NRG1) and rs10136427 (BATF), was subjected to statistical validation to ensure its reliability. The model had good predictive strength as judged by the ROC analysis (AUC = 0.82). Based on two different analogs of the coefficient of determination for logistic regression models, we considered it safe to claim that the four SNPs in the optimal model, adjusted for age and sex effects, could explain about 30-40% of the risk for PTC in the Kazakh population examined in this study with a retrospective case-control design.
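The text does not name the two analogs of the coefficient of determination that were used. As an illustration only, the Python sketch below computes two widely used pseudo-R² measures for logistic regression, McFadden's and Nagelkerke's, from the log-likelihoods of the null (intercept-only) and fitted models. Whether these correspond to the measures used in the study is an assumption, and the log-likelihood values and sample size in the example are made up.

```python
import math


def mcfadden_r2(ll_null, ll_model):
    """McFadden pseudo-R2: 1 - (log-likelihood of model / log-likelihood of null)."""
    return 1.0 - ll_model / ll_null


def cox_snell_r2(ll_null, ll_model, n):
    """Cox-Snell pseudo-R2, based on the likelihood ratio and sample size n."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R2: Cox-Snell rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


# Made-up log-likelihoods and sample size (illustration only, not study data).
ll_null, ll_model, n = -100.0, -80.0, 200
print(round(mcfadden_r2(ll_null, ll_model), 3))
print(round(nagelkerke_r2(ll_null, ll_model, n), 3))
```

Different pseudo-R² measures generally give different values for the same model, which is one reason to report a range rather than a single figure.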
Our study had certain advantages, such as the homogeneous ethnicity of the participants, a large sample size that provided sufficient statistical power to detect meaningful associations, and thorough selection of the genetic variants. On the other hand, the study was not devoid of limitations. We could not fully rule out sampling bias and acknowledge insufficient age and sex matching of cases and controls, which could affect the accuracy of some of the results obtained in the study. Also, the clinicomorphological information was not detailed enough to analyze potentially clinically relevant correlations, and the lack of data on the participants' lifestyles and environmental exposures impeded the assessment of the impact of these factors on PTC risk.
In summary, our results demonstrate the existence of genetic determinants of susceptibility to PTC among the SNPs analyzed in this work in the Kazakh population. We confirm the associations of rs965513 (FOXE1/PTCSC2), rs944289 (PTCSC3/NKX2-1), rs1867277 (FOXE1), rs2439302 (NRG1), and rs10136427 (BATF). The association of rs966423 (DIRC3) with PTC risk in the Kazakh population is suggestive. The association signals in terms of ORs were generally comparable to those in typical Asian and European populations, and that of rs965513 (FOXE1/PTCSC2) was among the highest so far reported. We also detected an age-related effect of rs1867277 (FOXE1), which conferred higher risk for PTC in patients older than 55 years, an association of rs7267944 (DHX35) with PTC risk in males, and an association of rs6983267 (POU5F1B/CCAT2) with more advanced pT tumor stage. We estimate the contribution of genetic factors to the susceptibility to PTC in the analyzed series from Kazakhstan at 30-40%, accounting for age and sex. To better understand the impact of the different factors affecting PTC risk, further studies would be desirable to increase the number of potential genetic loci and to include data on individual lifestyle and exposures to environmental agents.

DATA AVAILABILITY STATEMENT
The dataset used in this study may be made available upon request to Dr. Zhanna Mussazhanova (mussazhanova@nagasaki-u.ac.jp), subject to approval by the Semey Medical University Ethics Committee. Information on all genetic variants analyzed in this study is available in the NCBI dbSNP database (https://www.ncbi.nlm.nih.gov/snp/).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Semey Medical University Ethics Committee and Nagasaki University Human Genome Ethics Committee. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
ZM, VS, ME, ZKa, RI, SY, and MN conceived and designed the study. ZM, AK, ME, ZKa, RI, ZY, MM, AM, MS, ZKo, and SB collected and formalized the clinical information. ZM and TR performed the experiments, analyzed and formalized the raw data. VS and HK performed the statistical analyses. TR, SY, KM, and MN contributed comments to the paper. ZM and VS wrote the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported in part by the Atomic Bomb Disease Institute of Nagasaki University, The Joint Hiroshima University, Nagasaki University, and Fukushima Medical University Research Base for Radiation Accidents and Medical Science, and by Takeda Science Foundation.