Prevalence and Genetic Analysis of Thalassemia and Hemoglobinopathy in Different Ethnic Groups and Regions in Hainan Island, Southeast China

Background: There are limited studies on the molecular profile of thalassemia in Hainan, the free trade island in China. Our aim was to reveal, for the first time in a large-sample study, the prevalence and molecular mutation spectrum of thalassemia in different ethnic groups and regions of Hainan. Methods: A total of 231,596 individuals from 19 cities and counties in Hainan were screened by hematological parameter analysis, and further genetic analysis was performed on individuals with MCV less than 82 fL. Results: In total, 31,780 (13.72%) subjects were diagnosed as thalassemia carriers. The overall prevalence of α-thalassemia, β-thalassemia, and α+β-thalassemia was 11.04%, 1.48%, and 1.20%, respectively. We further analyzed the molecular profiles of thalassemia in various ethnic groups, focusing on the differences between the Han and Li populations. The frequency of thalassemia in the Li population (47.03%) was much higher than that in the Han population (9.37%). Except for β-thalassemia (1.31% in Li vs. 1.47% in Han), the frequencies of α-thalassemia (39.59% in Li vs. 7.35% in Han) and α+β-thalassemia (6.13% in Li vs. 0.56% in Han) were markedly higher in Li than in Han. The most frequent α-thalassemia genotypes in Han were αα/--SEA (25.55%), -α3.7/αα (22.17%), -α4.2/αα (21.59%), αWSα/αα (8.93%), and -α3.7/-α4.2 (4.17%), and those in Li were -α4.2/αα (17.24%), -α3.7/αα (17.16%), -α3.7/-α4.2 (15.09%), αWSα/αα (9.69%), and αWSα/-α3.7 (8.06%). The αα/--SEA genotype was the most frequent α-thalassemia genotype in Han but accounted for only 1.87% in Li. For β-thalassemia, the three most frequent genotypes in both Han and Li were βCD41/42(-TTCT)/βN, β-28(A>G)/βN, and βIVS-Ⅱ-654(C>T)/βN, but the frequency of βCD41/42(-TTCT)/βN in Li (90.96%) was much higher than that in Han (56.32%) and than the data reported in other provinces of China. Additionally, the prevalence of thalassemia ranged from 8.16% to 34.35% across Hainan; Wuzhishan, Baoting, Qiongzhong, and Baisha had a higher prevalence than other areas. Conclusion: Our study revealed, for the first time, the ethnic and regional differences in the prevalence of thalassemia in the childbearing age population of Hainan, indicating that the prevalence of thalassemia among the Li people is the highest in China. These findings will be useful for genetic counseling and the prevention of thalassemia.

INTRODUCTION
Thalassemia is one of the most widely distributed autosomal monogenic diseases in the world. It was first found in Italy, Malta, Greece, and other regions along the Mediterranean coast. The region south of the Yangtze River in China is also a high-risk area for this disease; the prevalence in Guangxi (19.10%), Hainan (12.95%), and Guangdong (11.90%) exceeds 10% (He et al., 2018; Peng et al., 2021; Wang et al., 2022).
At the molecular level, thalassemia can be divided into several subtypes according to the type of globin gene defect involved; the most common subtypes are α-thalassemia and β-thalassemia. The α-globin gene is located on chromosome 16, with two linked α genes on each homologous chromosome. Depending on whether one or both of the linked α genes are deleted or less active due to mutations, α-gene defects can be classified into α+ and α0 (Piel and Weatherall, 2014). The β-globin gene is located on chromosome 11; its defects include point mutations and deletions, of which point mutations are the most common. Currently, over 200 types of β-gene defects have been identified, ranging from mild mutations that result in a relative reduction in β-globin peptide chain synthesis (β+) to severe mutations that completely inhibit β-globin peptide chain synthesis (β0) (Needs et al., 2022). Patients with HbH disease (α0/α+) can develop mild to moderate anemia with age and may require blood transfusion, while patients with the α-thalassemia major genotype (α0/α0) usually die during the fetal stage or shortly after birth. Patients with β-thalassemia major or some β-thalassemia intermedia genotypes (β+/β+, β0/β+, or β0/β0) rely on transfusion to sustain life (Taher et al., 2018). Therefore, the birth of children with severe thalassemia brings heavy mental and economic burdens to their families and society.
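The α+/α0 scheme described above can be illustrated with a minimal sketch. The haplotype-to-class table and the function name `classify_alpha_genotype` are assumptions introduced here for illustration; only the two outcomes named in the text (α0/α+ as HbH disease, α0/α0 as α-thalassemia major) are taken from the paragraph, and every other combination is simply reported as a carrier state.

```python
# Hypothetical lookup table: one affected α gene is treated as α+,
# loss of both linked α genes as α0 (following the definition in the text).
ALPHA_HAPLOTYPE_CLASS = {
    "αα": "normal",
    "-α3.7": "α+",
    "-α4.2": "α+",
    "αWSα": "α+",
    "--SEA": "α0",
}


def classify_alpha_genotype(genotype: str) -> str:
    """Classify a genotype written as 'haplotype/haplotype'."""
    left, right = genotype.split("/")
    classes = {ALPHA_HAPLOTYPE_CLASS[left], ALPHA_HAPLOTYPE_CLASS[right]}
    if classes == {"normal"}:
        return "no α-globin defect"
    if classes == {"α0"}:
        return "α-thalassemia major (α0/α0)"
    if classes == {"α0", "α+"}:
        return "HbH disease (α0/α+)"
    # Any other combination is left unspecified beyond carrier status.
    return "carrier"


for g in ["αα/--SEA", "-α3.7/--SEA", "--SEA/--SEA", "-α3.7/-α4.2"]:
    print(g, "->", classify_alpha_genotype(g))
```

The table covers only a handful of haplotypes mentioned in this article; a real classifier would need a curated mutation list.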
Hainan Island, located in the southernmost part of China, covers an area of 35,400 square kilometers and contains 19 cities and counties, with an estimated population of about 10 million. Hainan Island is also a multi-ethnic settlement, with a population mainly composed of Han and Li people and a minority of Miao, Zhuang, and Hui people. The Li people are the earliest residents of Hainan Island and mainly live in Lingshui, Baoting, Baisha, Qiongzhong, Ledong, and Changjiang. People of childbearing age are individuals who are able to bear children, usually aged 15 to 50 for women and 16 to 65 for men. To prevent and control severe thalassemia, the Hainan provincial government has implemented the "Hainan Pregnancy Thalassemia Screening Program" since 2019 to provide free genetic diagnosis of thalassemia for pregnant women and their partners. Our previous study showed that the overall prevalence of thalassemia in Hainan was as high as 12.95% (Wang et al., 2022). However, the prevalence and molecular mutation spectrum of thalassemia in different ethnic groups and regions have not been reported. In this study, we used the data obtained from this program to characterize the prevalence and molecular spectrum of thalassemia in different ethnic groups and regions in Hainan, and found that the prevalence of thalassemia in Hainan shows clear ethnic and regional differences. These findings will be useful for genetic counseling and the prevention of thalassemia.

Participants
The subjects of this study were couples who underwent prenatal health check-ups in medical institutions in 19 cities and counties in Hainan between January 2020 and December 2021. Both partners were included in the study after signing informed consent. Sex, age, ethnicity, and domicile were recorded for all participants. Participation was voluntary, and the study was approved by the Ethics Committee for Clinical Investigation of Hainan Women and Children's Medical Center.

Hematological Parameter Analysis
All subjects were recruited from medical institutions within 19 cities and counties in Hainan. A 2 ml peripheral venous blood sample was taken from each subject and stored in an EDTA anticoagulant tube. Hematological parameters were measured with a hematology analyzer at the medical institution where each subject was recruited. Subjects with MCV values less than 82 fL were considered possible thalassemia carriers.
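The screening rule above amounts to a single threshold comparison, sketched below. The function name, the sample identifiers, and the MCV values are hypothetical and serve only to show that the cutoff is strict (a value of exactly 82 fL is not flagged).

```python
MCV_CUTOFF_FL = 82.0  # cutoff stated in the text


def is_possible_carrier(mcv_fl: float) -> bool:
    """Return True when MCV is strictly below the screening cutoff."""
    return mcv_fl < MCV_CUTOFF_FL


# Hypothetical MCV values (fL), not study data.
samples = {"S1": 75.3, "S2": 82.0, "S3": 90.1}
flagged = [sid for sid, mcv in samples.items() if is_possible_carrier(mcv)]
print(flagged)
```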

Gene Diagnosis of Thalassemia
For the diagnosis, 2 ml of peripheral blood was collected from each participant in an EDTA anticoagulant tube and immediately transported by cold chain to Hainan Women and Children's Medical Center and Sanya Women and Children's Hospital for gene diagnosis of thalassemia. The temperature of the cold chain was 4-8°C. Specimen transport was generally completed within 2-5 days, and no more than 7 days elapsed between blood collection and receipt of the specimen by the laboratory. Gap polymerase chain reaction (Gap-PCR) was used to identify three common Chinese α-globin gene deletion mutations (-α3.7, -α4.2, --SEA) and rare deletion mutations (#20193401915, Yaneng Biosciences, Shenzhen, China). Reverse dot-blot hybridization was used to identify three common nondeletional mutations of α-thalassemia (Hb CS, Hb QS, and Hb WS) (#20173401107, Yaneng Biosciences, Shenzhen, China) and 17 common β-globin gene mutations in China (#20163400463, Yaneng Biosciences, Shenzhen, China). DNA sequencing was used to detect rare and unknown thalassemia gene mutations.

Statistics
Excel 2016 and GraphPad Prism 8 (GraphPad Software, La Jolla, CA) were used for statistical analyses. Composition ratios between two or more groups were compared using Fisher's exact test or the chi-squared test, and multiple comparisons of composition ratios among different groups were corrected using the Bonferroni method. Values of p < 0.05 were considered statistically significant.
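As an illustration of this kind of comparison, the sketch below computes a Pearson chi-squared test for a 2×2 table and applies a Bonferroni-adjusted threshold across several pairwise comparisons. It is not the authors' GraphPad Prism workflow: the function names are introduced here, no continuity correction is applied (an assumption), and the counts are hypothetical placeholders rather than study data.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def bonferroni_significant(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Flag p-values below alpha divided by the number of comparisons."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical (carriers, non-carriers) counts for three pairwise comparisons.
tables = [(40, 60, 10, 90), (40, 60, 35, 65), (10, 90, 35, 65)]
results = [chi_squared_2x2(*t) for t in tables]
flags = bonferroni_significant([p for _, p in results])
for (stat, p), flag in zip(results, flags):
    print(f"chi2={stat:.3f}  p={p:.3g}  significant after Bonferroni: {flag}")
```

For small expected cell counts, Fisher's exact test (as named in the text) would be the more appropriate choice than the chi-squared approximation used in this sketch.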

Molecular Epidemiological Characteristics of Thalassemia Genotypes in Childbearing Age Population of Hainan
In this study, we screened 231,596 individuals of the childbearing age population from across Hainan province. In total, 31,780 (13.72%) subjects were diagnosed as carriers of thalassemia, and the frequencies of α-thalassemia, β-thalassemia, and α+β-thalassemia were 11.04%, 1.48%, and 1.20%, respectively. These data indicate that the prevalence of thalassemia among the childbearing age population in Hainan ranks second in China, lower only than that in Guangxi, which is consistent with our previous findings (Wang et al., 2022). We further analyzed the frequency of each specific mutation among all α (or β) mutant chromosomes in Hainan. The five most common mutations of α-thalassemia were -α3.7, -α4.2, --SEA, αWSα, and αQSα, which accounted for 99.15% of all α-thalassemia carriers. These results suggest that the molecular profile of α-thalassemia in Hainan differs from that previously described in other parts of southern China, such as Guangxi, Guangdong, and Fujian, where --SEA and αCSα consistently had higher allele frequencies and --SEA usually had the highest allele frequency (He et al., 2018; Peng et al., 2021; Zhuang et al., 2021). A total of 16 β-thalassemia mutations were identified among β- and α+β-thalassemia carriers in the present study. βCD41/42(-TTCT) was the most frequent β-thalassemia mutation, with an allele frequency of 73.08%; this ranking is similar to that reported in other regions of China, but the frequency in Hainan was higher (He et al., 2018; Peng et al., 2021; Lai et al., 2017). Other common β-gene mutations were β-28(A>G) (10.84%), βIVS-Ⅱ-654(C>T) (5.25%), βCD71/72(+A) (3.96%), βCD17(A>T) (3.62%), and βCD26(GAG>AAG) (1.71%). The ranking of the most frequent genotypes also differed from that in other regions of China (Peng et al., 2021; He et al., 2017; Zhuang et al., 2021). In addition, this study identified some genotypes that are uncommon in the Chinese population.
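The headline percentages above can be cross-checked with simple arithmetic. The sketch below uses only the figures reported in the text; the variable names are introduced here for illustration.

```python
screened = 231_596   # individuals screened (from the text)
carriers = 31_780    # diagnosed thalassemia carriers (from the text)

# Overall carrier prevalence as a percentage, rounded to two decimals.
overall_pct = round(100 * carriers / screened, 2)

# Reported subtype prevalences (percent of all screened individuals).
subtype_pct = {"α": 11.04, "β": 1.48, "α+β": 1.20}
subtype_sum = round(sum(subtype_pct.values()), 2)

print(overall_pct, subtype_sum)
```

Both values agree with the reported overall prevalence of 13.72%, which indicates that the three subtype frequencies are expressed against the full screened population and are mutually exclusive.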
The fusion gene results from a fusion between the α2 and ψα1 genes, and the Fusion/--SEA genotype can cause HbH disease. A total of 32 fusion gene carriers were detected; when the fusion gene was combined with an α+ gene, the hemoglobin level of the carriers was normal or slightly decreased, presenting an α-thalassemia minor phenotype. Both HKαα and αααanti4.2 result from the unequal
Frontiers in Genetics | www.frontiersin.org June 2022 | Volume 13 | Article 874624