Frontiers in Genetics (Front. Genet.), ISSN 1664-8021, Frontiers Media S.A.
doi: 10.3389/fgene.2021.676917
Genetics, Original Research
Insights From Y-STRs: Forensic Characteristics, Genetic Affinities, and Linguistic Classifications of Guangdong Hakka and She Groups
Chunfang Luo1,2†, Lizhong Duan3†, Yanning Li1,4†, Qiqian Xie1†, Lingxiang Wang5, Kai Ru5, Shahid Nazir6, Muhammad Jawad6, Yifeng Zhao7, Fenfen Wang8, Zhengming Du8, Dehua Peng2, Shao-Qing Wen5*‡, Pingming Qiu1*‡, Haoliang Fan1,5,9*‡
1School of Forensic Medicine, Southern Medical University, Guangzhou, China
2Heyuan Municipal Public Security Bureau, Heyuan, China
3Beijing Municipal Public Security Bureau, Beijing, China
4School of Basic Medicine, Gannan Medical University, Ganzhou, China
5Institute of Archaeological Science, Fudan University, Shanghai, China
6Department of Forensic Sciences, University of Health Sciences, Lahore, Pakistan
7Nanjing Zhenghong Judicial Identification Institute, Nanjing, China
8First Clinical Medical College, Hainan Medical University, Haikou, China
9School of Basic Medicine and Life Science, Hainan Medical University, Haikou, China
Edited by: Atif Adnan, China Medical University, China
Reviewed by: Guanglin He, Sichuan University, China; Rashed Alghafri, Dubai Police, United Arab Emirates
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Guangdong province is situated in the south of China and has a population of 113.46 million. Hakka is officially recognized as a branch of Han Chinese, and She is an officially recognized minority group in mainland China. There are approximately 25 million Hakka people, who mainly live in the eastern and northern regions of Guangdong, while there are only 0.7 million She people. The genetic characterization and forensic parameters of these two groups are poorly defined (She) or still need to be explored (Hakka). In this study, we genotyped 475 unrelated Guangdong males (260 Hakka and 215 She) with the Promega PowerPlex® Y23 System. A total of 176 and 155 different alleles were observed across all 23 Y-STRs for Guangdong Hakka (with allele frequencies ranging from 0.0038 to 0.7423) and Guangdong She (0.0047–0.8605), respectively. The gene diversity ranged from 0.4877 to 0.9671 (Guangdong Hakka) and from 0.3277 to 0.9526 (Guangdong She), while the haplotype diversities were 0.9994 and 0.9939 for Guangdong Hakka and Guangdong She, with discrimination capacity values of 0.8885 and 0.5674, respectively. On both geographical and linguistic scales, the phylogenetic analyses showed that Guangdong Hakka has a close relationship with Southern Han and that the gene pool of Guangdong Hakka was influenced by surrounding Han populations. The predominant haplogroups of the Guangdong She group were O2-M122 and O2a2a1a2-M7, and Guangdong She clustered with Tibeto-Burman language-speaking populations (Guizhou Tujia and Hunan Tujia), which suggests that the Guangdong She group may be one of the branches of Tibeto-Burman populations and that the Huonie dialect of the She language may be a branch of the Tibeto-Burman language family.
Hakka is a far-reaching ethnic group with a worldwide distribution and is officially recognized as a branch of Han Chinese in China. Hakka is unusual in that the group is not named after a region (Zhong, 2019). There are about 80 million Hakka people, and ∼50 million live in southern parts of China (mainly Guangdong, Jiangxi, Fujian, Guangxi, Sichuan, Hainan, Hunan, Zhejiang, Taiwan, Hong Kong, and Macao). Guangdong province, which lies in the southernmost part of mainland China, is an important region for Lingnan culture and has its own distinctive language, history, and culture. The Hakka population is mainly settled in the eastern and northern regions of Guangdong, comprising Meizhou, Heyuan, Huizhou, Shaoguan, and Qingyuan (Figure 1). The origin of Hakka has not yet been clearly defined. At present, there are two views: that the Hakka descend from Northern Han (Luo, 1989) or from Southern Han (Fang, 2007). According to a previous study based on 14 Y-SNPs (Li et al., 2003), the majority of the Fujian Hakka gene pool (80.2%) came from Northern Han. On the other hand, the frequency of a 9-bp deletion in mitochondrial region V is 21.74% in Meizhou Hakka, which indicated that Meizhou Hakka had close relationships with Fujian Hakka (∼0.197) and other populations from South China (Cai et al., 2015). From the perspective of physical anthropology, Zheng et al. examined 86 anthropologic characteristics in 650 male and 704 female Chinese Hakka adults living in Guangdong and Jiangxi and found that the physical characteristics of the Hakka population are a mixture of those of South-Asian and North-Asian ethnic populations (Zheng et al., 2013). Guangdong province is home to the Hakka population, but no comprehensive study of the Y-chromosomal profile of Guangdong Hakka is available.
Figure 1. Locations of population distributions and sampling information. (A,B) Geographic positions of Guangdong Hakka, Guangdong She, and 159 worldwide populations with nine language families (48,637 individuals in total). (C) Distributions of geographical dialects in Guangdong province and detailed sampling information of Guangdong Hakka and Guangdong She groups.
She is one of the largest ethnic minorities in Jiangxi, Fujian, and Zhejiang provinces, and She people are also found in Anhui and Guangdong provinces. The She are a shifting-cultivation group in South China who, over more than a thousand years, have migrated from their original habitations in Fenghuang Mountain in Chaozhou, Guangdong, to Fujian, Zhejiang, Jiangxi, Anhui, and other provinces (Shi, 2013; Liu et al., 2017). The total population of the She group is 709,592 according to the 2010 census, of which 25% live in South Zhejiang and 53% in East Fujian. Only 3.8% (27,000) live in Guangdong province, although Fenghuang Mountain in Chaozhou, Guangdong, is believed to be the origin of the She group's culture and civilization (Liu et al., 2017). She people speak the She language, which has different dialects such as Shanke, Dongjia, and Huonie. The She language is a branch of the Sino-Tibetan language family and is the primary language of the She group (Lei, 2005). Linguistically, the She language (mainly the Shanke and Dongjia dialects) belongs to the Hmong-Mien language branch (Lu and Wang, 2019). The Guangdong She population uses only the Huonie dialect. Interestingly, these linguistic divergences may reflect differences in the genetic affinities of distinct She groups. Fujian She is more closely related to the Dongbei (Northern Han) and Hunan (Southern Han) populations (Liu et al., 2017). The genetic heterogeneity of the She population across regions indicates that the origins of Guangdong She and Fujian She may not be the same, and that Guangdong She may not be a Hmong-Mien language-speaking group.
Analysis of Y-chromosomal variants is very helpful for determining the patterns of present and past gene flow between populations (Oppenheimer, 2012). Y-chromosome short tandem repeats (Y-STRs) play an important role in forensic molecular biology (Prinz et al., 1997; Adnan et al., 2016). The use of Y-STRs also allows the simultaneous analysis of closely related and distantly related populations (Ballantyne and Kayser, 2012). In this study, we addressed three main issues: (1) the forensic efficiency of the Promega PowerPlex® Y23 System (Promega Corporation, Fitchburg, WI, United States) in the Guangdong Hakka and Guangdong She groups, (2) the population structures of the two groups, and (3) the language classification of Guangdong She (Huonie dialect). To this end, we used the Promega five-dye multiplex system to genotype 23 Y-STRs in 260 Guangdong Hakka and 215 Guangdong She males and evaluated the effectiveness of the system for forensic applications in the two groups. Furthermore, we performed a range of population genetic analyses against globally dispersed human populations and closely related regional populations to clarify these issues.
Materials and Methods
Sample Preparation
In this study, a total of 475 unrelated male individuals (260 Hakka and 215 She) were recruited from Guangdong province of China (Figure 1C). Blood samples of all volunteers were collected on FTA cards (Whatman™, GE Healthcare, Chicago, IL, United States) with their written informed consent. This study and the procedures were approved by the Institutional Review Boards of Hainan Medical University and the Medical Ethics Committee of Hainan Medical University (no. HYLL-2020-012). All the experimental procedures were performed following the standards of the Declaration of Helsinki.
Y-STR Amplification and Genotyping
For each sample, one bloodstain punch (1.2 mm × 1.2 mm) was used directly as the PCR template without DNA extraction. Y-STRs were co-amplified using the Promega PowerPlex® Y23 System (Promega Corporation, Fitchburg, WI, United States), a five-dye multiplex kit that analyzes the 17 Yfiler Y-STRs (DYS19, DYS385a/b, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, and Y_GATA_H4) together with six additional discriminating Y-STRs (DYS481, DYS533, DYS549, DYS570, DYS576, and DYS643). Amplification was performed on a Veriti® 96 Well Thermal Cycler System (Thermo Fisher Scientific, Waltham, MA, United States) following the manufacturer's instructions. The amplified products were separated by capillary electrophoresis on a 3500XL Genetic Analyzer (Thermo Fisher Scientific, Waltham, MA, United States), and data were analyzed using GeneMapper® ID-X Software v1.6 (Thermo Fisher Scientific, Waltham, MA, United States).
Statistical Analysis
Haplotype and allele frequencies were calculated using Arlequin v3.5 (Excoffier and Lischer, 2010). Gene diversity (GD) was calculated as per the following formula:
$$GD = \frac{n}{n-1}\left(1 - \sum_{i} p_i^{2}\right)$$
where n is the number of alleles at each Y-STR locus and $p_i$ is the frequency of the i-th allele. Haplotype diversity (HD) was estimated in the same way as GD, with n and $p_i$ taken as the total number of haplotypes and the frequency of the i-th haplotype, respectively. Discrimination capacity (DC) was determined as the ratio between the number of different haplotypes and the sample size. Match probability (MP) was defined as $MP = \sum_{i} p_i^{2}$, where $p_i$ is the frequency of the i-th haplotype.
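The definitions above can be checked against the haplotype occurrence counts reported in Table 1. The following is a minimal sketch, not the authors' Arlequin workflow: the function name `forensic_parameters` and the dictionary layout are assumptions made for illustration. It takes n as the number of distinct haplotypes, as stated in the text; many references use the sample size instead, which would give slightly lower HD values.

```python
def forensic_parameters(spectrum):
    """Compute FUH, HD, DC, and MP from a haplotype occurrence spectrum.

    `spectrum` maps "times a haplotype was observed" to "number of distinct
    haplotypes observed that many times" (hypothetical input layout).
    """
    n_haplotypes = sum(spectrum.values())
    n_samples = sum(times * count for times, count in spectrum.items())
    # Match probability: sum of squared haplotype frequencies.
    mp = sum(count * (times / n_samples) ** 2 for times, count in spectrum.items())
    # Haplotype diversity with n = number of distinct haplotypes,
    # following the definition given in the text.
    hd = n_haplotypes / (n_haplotypes - 1) * (1 - mp)
    return {
        "samples": n_samples,
        "haplotypes": n_haplotypes,
        "FUH": spectrum.get(1, 0) / n_haplotypes,  # fraction of unique haplotypes
        "HD": hd,
        "DC": n_haplotypes / n_samples,  # discrimination capacity
        "MP": mp,
    }


# Occurrence counts taken from Table 1 of this article.
SHE = {1: 65, 2: 43, 3: 6, 4: 5, 5: 1, 6: 1, 15: 1}
HAKKA = {1: 208, 2: 18, 3: 4, 4: 1}

she = forensic_parameters(SHE)
hakka = forensic_parameters(HAKKA)

for name, params in (("She", she), ("Hakka", hakka)):
    print(name, {key: round(value, 4) for key, value in params.items()})
```

With these inputs the sketch reproduces the sample sizes (215 and 260) and, after rounding, the HD, DC, FUH, and MP values listed in Table 1.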
Genetic relationships between the studied populations and other reference populations were assessed using RST (Excoffier et al., 1992; Excoffier and Smouse, 1994). Pairwise genetic distances (RST) and the corresponding p values between populations were estimated by analysis of molecular variance (AMOVA) using the online tool available at YHRD. Principal component analysis (PCA) and multidimensional scaling (MDS) were conducted based on allele frequencies using the R programming language. Additionally, neighbor-joining phylogenetic trees (Saitou and Nei, 1987) were constructed from the genetic distance matrices (RST matrices) in Molecular Evolutionary Genetics Analysis 7.0 software (Kumar et al., 2016) and visualized with the Interactive Tree of Life v5 (Letunic and Bork, 2019).
Quality Control
The Y-STR typing experiments were performed strictly following the recommendations of the Chinese National Standards, Scientific Working Group on DNA Analysis Methods (SWGDAM, 2010) and the recommendations of the DNA Commission of the International Society of Forensic Genetics (Gusmao et al., 2006; Carracedo et al., 2013; Roewer et al., 2020). Control DNA 9948 and sdH2O were employed as positive and negative controls in each batch of PCR amplification and electrophoresis, respectively. In addition, our laboratory has passed the proficiency testing for Y-STR typing which is organized by YHRD and has been accredited under ISO/IEC 17025:2005 and China National Accreditation Service for Conformity Assessment.
The haplotype data of 475 unrelated male individuals from Guangdong Hakka and Guangdong She populations in the present study have been submitted to the YHRD database and received the accession number YA004707 (Guangdong Hakka, n = 260) and YA004708 (Guangdong She, n = 215). The Y-STR profiles, with variants (null alleles, off-ladder alleles, or copy number variants) for all samples, were re-amplified and re-genotyped by AmpFLSTR® Yfiler® PCR Amplification Kit (Thermo Fisher Scientific, Waltham, MA, United States) and Geno-ID Y41 Human Typing (Guangzhou Koalson Intelligent BioRobotics, Guangzhou, Guangdong, China) for confirmation.
Results and Discussion
Allele Frequencies, Allele Numbers, and GD Values
The allele frequencies, allele numbers, and GD values of 23 Y-STR loci in the Guangdong Hakka and She groups are presented in Supplementary Tables 1, 2, respectively.
In Guangdong Hakka, 176 distinct alleles were observed across all loci, and the corresponding allelic frequencies ranged from 0.0038 to 0.7423 (DYS438). The number of different alleles varied from three (DYS437) to 40 (DYS385a/b). The GD values of six Y-STRs were greater than 0.9, and the highest (0.9671) and lowest (0.4877) GD estimates corresponded to loci DYS385a/b and DYS438, respectively. We identified three off-ladder alleles: the microvariant 19.1 at DYS576, which was observed twice, and 16.1 at DYS458. All intermediate alleles found in Guangdong Hakka were confirmed by re-amplification and re-genotyping, and these alleles are commonly observed in YHRD.
In the Guangdong She group, a total of 155 different alleles were observed, and the number of different alleles per locus ranged from four (at seven Y-STRs) to 36 (at DYS385a/b). The allele frequencies varied from 0.0047 to 0.8605 (DYS438). Several loci were highly polymorphic in the She group, namely DYS458 (0.9526), DYS385a/b (0.9180), DYS481 (0.9146), and DYS448 (0.9060), whereas DYS438 (0.3277) and DYS391 (0.4875) lay at the low extreme of the GD distribution (GD < 0.5). No null alleles, off-ladder alleles, or copy number variants were found in the Guangdong She group.
Haplotypes and Forensic Parameters
The haplotypes and haplotype frequencies of 23 Y-STRs in the two groups are shown in Supplementary Tables 3, 4, respectively. A total of 231 different haplotypes were found in 260 Guangdong Hakka male individuals, of which 208 (90.04%) were unique. It was observed that 18 haplotypes (H006–H023) occurred twice, four haplotypes (H002–H005) occurred thrice, while only one haplotype (H001) was observed as being shared by four individuals (Supplementary Table 3). Genotyping with the 23 Y-STRs determined 122 distinct haplotypes in the Guangdong She group, and the fraction of unique haplotypes was only 53.28%. Additionally, 43 haplotypes (S015–S057) were detected twice, six haplotypes (S009–S014) and five haplotypes (S004–S008) were found three and four times, respectively, and the haplotypes S003, S002, S001 appeared five, six, and 15 times, respectively (Supplementary Table 4).
As shown in Table 1, the overall HD values of the Guangdong Hakka and She groups were 0.9994 and 0.9939, respectively. Moreover, the DC value of the Guangdong She group (0.5674) was much lower than those of the Guangdong Hakka group (0.8885) and the Guangdong Han group (0.9706), which suggests that the discrimination power of the Promega PowerPlex® Y23 System may be insufficient for the Guangdong She group. A likely explanation is that the Guangdong She group is relatively isolated, with limited gene flow, which reduces the discrimination capacity of the Y-STR system. In this regard, more knowledge is needed about the mutability of currently known Y-STRs, and rapidly mutating Y-STRs should be incorporated into forensic systems to enhance discrimination capacity, especially for isolated groups.
Table 1. Forensic characteristics of 23 Y-STRs in Guangdong populations (Hakka, She, and Han).
Number of observed haplotypes, by number of occurrences (a dash means none observed):
Occurrences    She         Hakka       Han(a)
1              65          208         320
2              43          18          10
3              6           4           -
4              5           1           -
5              1           -           -
6              1           -           -
15             1           -           -
N              122         231         330
FUH            0.5328      0.9004      0.9697
HD             0.9939      0.9994      0.9997
DC             0.5674      0.8885      0.9706
MP             1.42E-02    4.91E-03    3.31E-03
(a) Unpublished data for Guangdong Han (n = 340), genotyped at 23 Y-STRs with the Promega PowerPlex® Y23 System.
N, number of haplotypes; FUH, fraction of unique haplotypes; HD, haplotype diversity; DC, discrimination capacity; MP, match probability.
PCA
Dimensionality reduction analyses including PCA, MDS, linear discriminant analysis, Laplacian Eigenmaps, and locally linear embedding can accelerate the speed of algorithm execution, improve the performance of the analysis model, and reduce the complexity of data at the same time. To illustrate the genetic landscapes of globally dispersed human populations, especially for Guangdong Hakka and She groups, the dimensionality reduction analyses (PCA and MDS) were conducted based on the frequencies of 23 Y-STRs.
The PCA provides a method of visualizing the essential patterns of genetic relationships and allows us to identify and plot the major patterns within a multivariate dataset to indicate that the populations with closer geographical distances have more intimate relationships. We collected the 23 Y-STR frequency profiles of 48,162 individuals from 159 worldwide populations having nine language families within five continents (Supplementary Table 5) to conduct PCA with Guangdong Hakka and She groups (Yaju et al., 2014; Westen et al., 2015; Bai et al., 2016b; Garcia et al., 2016; Jung et al., 2016; Pickrahn et al., 2016; Wang L. et al., 2016; Wang Y. et al., 2016; Guo et al., 2017; Iacovacci et al., 2017; Jun et al., 2017; Lacerenza et al., 2017; Qiang et al., 2017; Spolnicka et al., 2017; Wang M. et al., 2017; Zgonjanin et al., 2017; Zhang et al., 2017; Cao et al., 2018; Fan et al., 2018a,b,c; Khubrani et al., 2018; Liu Y. et al., 2018; Juan et al., 2018; Liu Y. J. et al., 2018; Wang et al., 2018, 2020; Zhou et al., 2018; D’Atanasio et al., 2019; Fan et al., 2019; He et al., 2019; Henry et al., 2019; Ip et al., 2019; Jankova et al., 2019; Kai et al., 2019; Lang et al., 2019; Liu et al., 2019, 2020; Luo et al., 2019; Santos Stange et al., 2019; Tao et al., 2019a,b; Wang C. Z. et al., 2019; Wang X. et al., 2019; Wang Y. et al., 2019; Wu et al., 2019; Xie et al., 2019; Yaju et al., 2019, 2020; Zhabagin et al., 2019; Zhang et al., 2019; Al-Snan et al., 2020; Dezhi et al., 2020; Fan et al., 2020; Feng et al., 2020; Haiying et al., 2020; Hanguang et al., 2020; Jannuzzi et al., 2020; Kang et al., 2020; Shonhai et al., 2020; Song et al., 2020; Tang et al., 2020; Yin et al., 2020; Zeyad et al., 2020). As shown in Figure 2, the first and second components (PC1 and PC2) accounted for 6.85 and 5.69% of the total variances observed within these populations, respectively. 
From the perspective of geography, Guangdong Hakka was located in the cluster of south Chinese populations (Figure 2A). Additionally, Guangdong Hakka was observed close to the Southern Han populations (Figure 2B) from the linguistic point of view, while the Guangdong She group was situated in a relatively isolated location with no definite relationship with other Chinese populations from both geographic and linguistic scales (Figure 2).
Figure 2. Principal component analysis (PCA) based on the frequencies of 23 Y-STRs among Guangdong Hakka, Guangdong She, and 159 worldwide human populations. (A) PCA from geographical scale. (B) PCA from linguistic scale.
To further clarify the genetic relationships on a smaller scale, we performed PCA within Chinese populations (Figure 3). Populations from different regions of China clustered separately, and Guangdong Hakka and Southern Han clustered together on the left side. Guangdong She was located at the bottom center and clustered with south Chinese populations. However, from the perspective of linguistics, the language relationships between Guangdong She and the surrounding south Chinese populations, illustrated in Figure 3B, were still not distinct.
Figure 3. Principal component analysis (PCA) based on the 23 Y-STRs frequency profiles among Guangdong Hakka, Guangdong She, and 83 Chinese populations. (A) PCA from geographical scale. (B) PCA from linguistic scale.
MDS
MDS plots were constructed from allele frequencies using both Euclidean and Manhattan distances. As shown in Figure 4, each population is represented by a colored dot in the multidimensional space, and the distances between dots reflect the genetic relationships among populations from distinct geographic areas or language families.
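The two distance measures that feed the MDS plots can be illustrated with a short sketch. The population names and allele-frequency vectors below are hypothetical toy values, not data from this study, and the sketch covers only the pairwise distance step; the MDS embedding itself, which the authors ran in R, is not reproduced here.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two allele-frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Manhattan (city-block) distance between two allele-frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def distance_matrix(profiles, metric):
    """Return sorted population names and the square pairwise distance matrix."""
    names = sorted(profiles)
    matrix = [[metric(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical allele frequencies at a single locus with three alleles.
profiles = {
    "PopA": [0.70, 0.20, 0.10],
    "PopB": [0.60, 0.30, 0.10],
    "PopC": [0.10, 0.30, 0.60],
}

names, d_euc = distance_matrix(profiles, euclidean)
_, d_man = distance_matrix(profiles, manhattan)

for i, row_name in enumerate(names):
    for j in range(i + 1, len(names)):
        print(row_name, names[j], round(d_euc[i][j], 4), round(d_man[i][j], 4))
```

In this toy example PopA and PopB are close under both metrics while PopC is distant from both, which is the kind of pattern an MDS plot would render as two nearby dots and one isolated dot.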
Figure 4. Multidimensional scaling (MDS) plots based on the frequencies of 23 Y-STRs among Guangdong Hakka, Guangdong She, and 159 worldwide human populations (including 83 Chinese populations). (A) Euclidean-based MDS plot in worldwide populations from geographical scale. (B) Euclidean-based MDS plot in worldwide populations from linguistic scale. (C) Manhattan-based MDS plot in worldwide populations from geographical scale. (D) Manhattan-based MDS plot in worldwide populations from linguistic scale. (E) Euclidean-based MDS plot in Chinese populations from geographical scale. (F) Euclidean-based MDS plot in Chinese populations from linguistic scale. (G) Manhattan-based MDS plot in Chinese populations from geographical scale. (H) Manhattan-based MDS plot in Chinese populations from linguistic scale.
From both the geographical and linguistic perspectives, the Euclidean distance-based MDS results (Figures 4A,B) showed no apparent differences from the PCA results (Figures 2A,B). Moreover, in the Manhattan distance-based MDS plots (Figures 4C,D), Guangdong Hakka intertwined with the Southern Han cluster, in accordance with the Euclidean distance-based MDS plots (Figures 4A,B) and the PCA plots (Figure 2). In addition, Guangdong She was again located in a relatively isolated position.
Furthermore, to illustrate regional genetic relationships, we performed MDS analyses with Euclidean and Manhattan distances on Guangdong Hakka, Guangdong She, and 83 other Chinese populations (Figures 4E–H). The MDS results demonstrated that Guangdong Hakka had a close relationship with Southern Han populations and that Guangdong She was situated in a relatively isolated location from the perspectives of both geography and linguistics, in line with the results of the PCA.
RST and Phylogenetic Analyses
The pairwise RST values and the corresponding p values between Guangdong Hakka and 13 other Chinese Han populations from Southern and Northern Han regions (Kwak et al., 2005; Wu et al., 2011; Zhang et al., 2014; Shi et al., 2015; Shu et al., 2015; Bai et al., 2016b; Li et al., 2016; Wang L. et al., 2016; Wang Y. et al., 2016; Zhou et al., 2016, 2018; Nothnagel et al., 2017; Wang H. et al., 2017; Huang et al., 2019; Lang et al., 2019; Sun et al., 2019; Guan et al., 2020), based on 23 Y-STR haplotypes, are listed in Supplementary Table 6. Guangdong Hakka was most closely related to Guangdong Jieyang Han (RST = 0.0028) and Jiangxi Han (RST = 0.0029). The phylogenetic relationships among Guangdong Hakka and the Chinese Han populations are displayed in a neighbor-joining tree (Figure 5). Guangdong Hakka first clustered with Guangdong Jieyang Han, followed by Jiangxi Han. Geographically, Jieyang is surrounded by Shanwei, Chaozhou, and Meizhou and lies close to the habitations of Guangdong Hakka (Figure 1C), while Jiangxi, which borders Guangdong, is one of the dominant Hakka settlements. The genetic relationships between Guangdong Hakka and other Han populations shown in the phylogenetic tree were therefore in accordance with their geographical relations. The paternal relationships revealed by Y-STRs indicate that Guangdong Hakka had close relationships with Southern Han and that there was extensive gene flow between Guangdong Hakka and the surrounding Han populations. Our results corroborate the findings of previous studies (Du et al., 2019; Han et al., 2020), which used STRs and Y-STRs in the Guangdong Meizhou Hakka population.
Figure 5. Neighbor-joining phylogenetic tree between Guangdong Hakka and 13 Han Chinese populations (including three Northern Han and 10 Southern Han) based on the matrix of pairwise RST values.
The dimensionality reduction analyses discussed above did not clarify the genetic relationships between Guangdong She and other Chinese populations. Therefore, we used the Y-STR haplotype profiles of 42 Chinese minorities and Guangdong She to assess the pairwise genetic distances between populations by AMOVA (Zhu et al., 2005, 2006a,b, 2008; He and Guo, 2013; Shan et al., 2014; Luo et al., 2015, 2019; Ou et al., 2015; Shu et al., 2015; Bai et al., 2016a; Bian et al., 2016; Fu et al., 2016; Gao et al., 2016; Yao et al., 2016; Guo et al., 2017; Ji et al., 2017; Zhao et al., 2017; Cao et al., 2018; Chen et al., 2018; Liu Y. J. et al., 2018; Atif et al., 2019; Liu et al., 2019; Xie et al., 2019; Dezhi et al., 2020; Feng et al., 2020; Song et al., 2020; Tang et al., 2020). The pairwise RST values and the corresponding p values between Guangdong She and Chinese minority populations from five language families are displayed in Supplementary Table 7. No significant difference was observed between the Guangdong She and Guizhou Tujia populations (RST = 0.0046, p = 0.0574), while significant genetic differences were observed between Guangdong She and all other Chinese minority populations (p < 0.05). The phylogenetic relationships between Guangdong She and the 42 Chinese minorities were visualized in a neighbor-joining tree. As shown in Figure 6, the Tibeto-Burman language-speaking Tibetans and the Altaic language-speaking Uighurs and Mongolians clustered together at the upper end, while the language relationships at the bottom, especially for Tai-Kadai, Hmong-Mien, and Tibeto-Burman, were ambiguous. The Guangdong She group clustered with two Tibeto-Burman populations, Guizhou Tujia and Hunan Tujia. In contrast, the Fujian She group clustered with the Hmong-Mien language-speaking Miao populations from Guizhou.
Figure 6. Neighbor-joining phylogenetic tree between Guangdong She and 42 other minorities with five different language families based on pairwise RST values.
Prediction of Y-Haplogroups
The phylogenetic analysis above suggested that Guangdong She clustered in the same branch as the Tibeto-Burman language-speaking Tujia populations. To further confirm these genetic relationships, we used our in-house database, which was composed of 37,754 pieces of Y-SNP/STR data and 109,142 Y-STR (Wang et al., 2015) mainly from East and Southeast Asia, to make more precise haplogroup predictions for the 215 Guangdong She males in this study. Haplogroups were assigned for 212 of the 215 genotyped Y-STR profiles (98.60%), and a total of six Y-haplogroups were defined in Guangdong She, which belong to the major clades O2 and O1. The detected haplogroups were O2-M122 (45.75%), O2a2a1a2-M7 (25.47%), O1a-M119 (10.38%), O1b1a1a-M95 (8.02%), O-M175 (7.08%), and O2a2b1a1-M117 (3.30%), which were named according to ISOGG, 2019. In addition, a PCA was performed among 77 populations (Wen et al., 2004; Gan et al., 2008; Li et al., 2008; Cai et al., 2011; Deng et al., 2013; Fan et al., 2018c; Rowold et al., 2019; Luo et al., 2020; Fan et al., 2021) comprising 4,195 individuals in total. These included Tai-Kadai, Hmong-Mien, Tibeto-Burman, and Chinese (Southern and Northern Han) populations from Southeast and East Asia, among them three She groups (Fujian, Guangdong Chaoshan, and Guangdong She). As shown in Figure 7, the first and second principal components together explained 33.07% of the total variance. Populations from different language branches were relatively well separated. The Guangdong She group had a close relationship with the Guangdong Chaoshan She group and clustered with Tibeto-Burman language-speaking populations (including Tujia, Tibetan, and Naxi minorities), while Fujian She was located between the Hmong-Mien and Han populations, especially Southern Han.
The interpopulation comparison demonstrated that (1) different branches of the She population, such as the Fujian and Guangdong She groups, may have distinct origins from the perspectives of genetics and linguistics, especially according to the phylogenetic analyses based on Y-STRs and Y-SNPs, and (2) the Guangdong She and Chaoshan She groups have close affinities with Tibeto-Burman language-speaking populations, based on the evidence of Y-haplogroups, which carry information about subsequent colonization, differentiation, and migration overlaid on recent population ranges.
Figure 7. Principal component analysis based on Y-haplogroup frequencies between three She groups and 74 populations (4,195 individuals in total, which included 22 Hmong-Mien, 20 Tai-Kadai, 16 Southern Han, 11 Northern Han, and five Tibeto-Burman language-speaking populations mainly from Southeast and East Asia).
Conclusion
In the present study, a total of 475 unrelated Guangdong males (260 Hakka and 215 She) were genotyped at 23 Y-STRs (17 Yfiler and six additional Y-STRs) using the Promega PowerPlex® Y23 System. For Guangdong Hakka, a total of 176 different alleles were found, with corresponding allelic frequencies ranging from 0.0038 to 0.7423 and GD values ranging from 0.4877 to 0.9671, and the effectiveness of the system for Guangdong Hakka was sufficient (HD = 0.9994, DC = 0.8885). On both geographical and linguistic scales, the phylogenetic analyses indicated that Guangdong Hakka had a close relationship with Southern Han and that there was extensive gene flow between Guangdong Hakka and the surrounding Han populations. For Guangdong She, we identified 155 distinct alleles with allele frequencies ranging from 0.0047 to 0.8605. The GD values for the 23 Y-STRs ranged from 0.3277 to 0.9526, and the overall DC and HD were 0.5674 and 0.9939, respectively. The predominant haplogroups of the Guangdong She group were O2-M122 and O2a2a1a2-M7. Moreover, Guangdong She clustered with Tibeto-Burman language-speaking populations (Guizhou Tujia and Hunan Tujia), which suggests that the Guangdong She group may be one branch of Tibeto-Burman populations and that the Huonie dialect of the She language may be a branch of the Tibeto-Burman language family.
Data Availability Statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Ethics Statement
The studies involving human participants were reviewed and approved by Medical Ethics Committee of Hainan Medical University. The patients/participants provided their written informed consent to participate in this study.
Author Contributions
HF contributed to conceptualization, formal analysis, visualization, and writing—original draft preparation. CL and DP took charge of resources and contributed to project administration. HF, LW, and KR were in charge of software. LD, YL, QX, and YZ conducted the investigation. LD, YZ, FW, and ZD performed the validation. CL contributed to data curation. QX, SN, and MJ contributed to writing—review and editing. PQ and S-QW supervised the study. CL, HF, FW, and ZD took charge of funding acquisition. All authors reviewed the manuscript.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The handling editor AA declared a past collaboration with the author SN.
Funding
This study was supported by grants from the Program of Heyuan for Social Development and Technology Plans (No. 190620161503089), the Program of Hainan Association for Science and Technology Plans to Youth R&D Innovation (QCXM201705), and the National Undergraduate Innovation and Entrepreneurship Training Program (Nos. 201911810008 and 201911810023).
Acknowledgments
We would like to thank the donors who contributed samples for this study.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.676917/full#supplementary-material