Significant East Asian Affinity of the Sichuan Hui Genomic Structure Suggests the Predominance of the Cultural Diffusion Model in the Genetic Formation Process

The ancestral origin and genomic history of the Chinese Hui people remain poorly understood owing to the paucity of genome-wide data. Some evidence suggests that an eastward migration of Central Asians gave rise to the modern Hui people (the demic diffusion hypothesis); other evidence favors the cultural diffusion hypothesis, which posits that East Asians adopted Muslim culture and thereby formed the modern culturally distinct populations. The extent to which the observed genetic structure of the Huis was shaped by the movement of people or by the assimilation of Muslim culture remains highly contentious. We analyzed over 700 K SNPs in 109 western Chinese individuals (49 Sichuan Huis and 60 geographically close Nanchong Hans) together with available ancient and modern Eurasian sequences to explore the genomic makeup and origin of the Hui and neighboring Han populations. The results from PCA, ADMIXTURE, and allele-sharing-based f-statistics revealed a strong genomic affinity between Sichuan Huis and Neolithic-to-modern Northern East Asians, suggesting a massive gene influx from East Asians into the Sichuan Hui people. Three-way admixture models in the qpWave/qpAdm analyses further revealed a small stream of gene influx from western Eurasians into the Sichuan Hui people. This was directly confirmed by the admixture events from temporally distinct western sources into the Sichuan Hui people in the qpGraph-based phylogenetic model, suggesting a key role for the cultural diffusion model in the genetic formation of the Sichuan Huis. ALDER-based admixture date estimation showed that the observed western Eurasian admixture signal was introduced into the Sichuan Huis during the historic period, which is concordant with the extensive western–eastern communication along the Silk Road and with the historically documented migration history of the Huis.
In summary, although significant cultural differentiation exists between Hui people and their neighbors, our genomic analysis showed their strong genetic affinity with modern and ancient Northern East Asians. Our results support the hypothesis that the Sichuan Huis arose from a mixture of minor western Eurasian ancestry and predominant East Asian ancestry.

Affinity f4-statistics in the form f4(Reference population1, Hui_Boshu; reference population2, Mbuti) were used to explore the genetic continuity and admixture between potential ancestral sources and Boshu Hui based on the merged 1240K dataset. The red color denoted that the third population (bottom population list) shared more derived alleles with the first population (right population list) than with the second one (Boshu Hui). The green color denoted that the third population shared more alleles with Boshu Hui than with the first one. The tree was constructed based on the f4 matrix, and populations color-coded in red were possible ancestral sources.

Figure S10. Excess allele sharing between Boshu Hui and modern and ancient East Asians showed genetic admixture based on the merged Human Origin dataset.
Affinity f4-statistics in the form f4(Reference population1, Hui_Boshu; reference population2, Mbuti) were used to explore the genetic continuity and admixture between potential ancestral sources and Boshu Hui based on the merged 1240K dataset. The red color denoted that the third population (bottom population list) shared more derived alleles with the first population (right population list) than with the second one (Boshu Hui). The green color denoted that the third population shared more alleles with Boshu Hui than with the first one. The tree was constructed based on the f4 matrix, and populations color-coded in red were possible ancestral sources.

Figure S11. Results of f4-statistics showed genomic relationship inferred from f4(Reference population1, reference population2; Hui_Boshu, Mbuti) based on the merged Human Origin dataset.
Symmetrical f4-statistics in the form f4(Reference population1, reference population2; Hui_Boshu, Mbuti) were used to test for excess allele sharing between Boshu Hui and northern East Asians and all Sinitic speakers (right population list, color-coded in red) relative to other Eurasian reference populations. Reference population1 was the right population list, and reference population2 was the bottom population list. Red color showed positive f4 values, which suggested that Boshu Hui shared significantly more alleles with reference population1 (right population list) than with reference population2 (bottom population list). Blue color showed negative f4 values, which suggested that Boshu Hui shared excess alleles with reference population2 (bottom population list) relative to reference population1 (right population list). Statistically significant f-statistics were marked as "+". The tree was constructed based on the f4 matrix, and populations color-coded in red were possible ancestral sources.
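The sign convention described in this caption can be illustrated with a minimal sketch of the f4 estimator, computed as the mean over SNPs of (a - b) * (c - d) for allele frequencies a, b, c, d in the four populations. The function name `f4` and the toy frequencies below are hypothetical and serve only to show how the sign is read; they are not the data or the software behind the figure.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """Estimate f4(A, B; C, D) as the mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies (one value per SNP)
    for one population; all four lists must cover the same SNPs.
    """
    n = len(freq_a)
    if n == 0 or not (len(freq_b) == len(freq_c) == len(freq_d) == n):
        raise ValueError("all populations need the same non-empty SNP set")
    products = [
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    ]
    return sum(products) / n


# Hypothetical toy frequencies at three SNPs (illustration only).
ref1 = [0.9, 0.8, 0.7]      # stands in for reference population1
ref2 = [0.1, 0.2, 0.3]      # stands in for reference population2
target = [0.8, 0.7, 0.6]    # stands in for the tested population
outgroup = [0.5, 0.5, 0.5]  # stands in for the outgroup

# Positive value: the target shares more alleles with ref1 than with ref2.
symmetric = f4(ref1, ref2, target, outgroup)
# Swapping the two references flips the sign.
swapped = f4(ref2, ref1, target, outgroup)
```

With these toy values `symmetric` is positive and `swapped` is its negative, matching the red (positive) and blue (negative) reading of the heat map.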

Figure S12. Excess allele sharing between Nanchong Han and modern and ancient Eurasians.
(A) Symmetrical f4-statistics in the form f4(Reference population1, reference population2; Han_Nanchong, Mbuti) were used to test the genetic affinity between Eurasian reference populations. (B) Affinity f4-statistics in the form f4(Reference population1, Han_Nanchong; reference population2, Mbuti) were used to explore the genetic continuity and admixture between potential ancestral sources and Nanchong Han. The red color denoted that the third population shared more derived alleles with the first population than with the second one. The green color denoted that the third population shared more alleles with the second population than with the first one. The tree was constructed based on the f4 matrix, and populations color-coded in red were possible ancestral sources.

Figure S13. Results of f4-statistics showed genomic relationship inferred from f4(Reference population1, reference population2; Han_Nanchong, Mbuti) based on the merged Human Origin dataset.
Symmetrical f4-statistics in the form f4(Reference population1, reference population2; Han_Nanchong, Mbuti) were used to test for excess allele sharing between Nanchong Han and northern East Asians (right population list) relative to other Eurasian reference populations. Reference population1 was the right population list, and reference population2 was the bottom population list. Red color showed positive f4 values, which suggested that Nanchong Han shared significantly more alleles with reference population1 (right population list) than with reference population2 (bottom population list). Blue color showed negative f4 values, which suggested that Nanchong Han shared excess alleles with reference population2 (bottom population list) relative to reference population1 (right population list). Statistically significant f-statistics were marked as "+". The tree was constructed based on the f4 matrix, and populations color-coded in red were possible ancestral sources.
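The captions mark statistically significant f4 values with "+", but do not state how significance was assessed. As a hedged illustration only, the sketch below assumes a simplified delete-one block jackknife over equally sized genomic blocks and an assumed threshold of |Z| > 3; the function name `jackknife_z`, the per-block values, and the threshold are all hypothetical and are not taken from the source.

```python
import math


def jackknife_z(block_values):
    """Return (estimate, standard error, Z) from per-block f4 values.

    Assumes equally sized blocks, so the overall estimate is the plain
    mean of the block values and every block gets the same weight.
    """
    n = len(block_values)
    if n < 2:
        raise ValueError("need at least two blocks")
    total = sum(block_values)
    estimate = total / n
    # Leave-one-out estimates: recompute the mean with each block removed.
    loo = [(total - v) / (n - 1) for v in block_values]
    loo_mean = sum(loo) / n
    variance = (n - 1) / n * sum((x - loo_mean) ** 2 for x in loo)
    se = math.sqrt(variance)
    if se == 0:
        raise ValueError("zero standard error; Z is undefined")
    return estimate, se, estimate / se


# Hypothetical per-block f4 values (illustration only).
blocks = [0.010, 0.012, 0.008, 0.011, 0.009]
estimate, se, z = jackknife_z(blocks)
is_significant = abs(z) > 3  # assumed threshold, not stated in the source
```

Production tools typically weight blocks by their SNP counts; the equal-weight version here is kept deliberately short.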

Figure S14. Excess allele sharing between Nanchong Han and modern and ancient Eurasians based on the merged Human Origin dataset.
Affinity f4-statistics in the form f4(Reference population1, Han_Nanchong; reference population2, Mbuti) were used to explore the genetic continuity and admixture between potential ancestral sources and Nanchong Han based on the merged Human Origin dataset. The red color denoted that the third population (bottom population list) shared more derived alleles with the first population (right population list) than with the second one (Nanchong Han). The green color denoted that the third population shared more alleles with Nanchong Han than with the first one. Statistically significant f-statistics were marked as "+". The tree was constructed based on the f4 matrix.

We used 28 years as the length of one generation. All years marked at the bottom were calculated using the formula Year = 1950 - 28 × (Generation - 1). All comprehensive raw data are presented in Supplementary Table S23.
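The conversion from admixture time in generations to a calendar year can be written as a short function. The sketch below applies the stated formula with the stated generation length of 28 years; the function name `generation_to_year` and the example inputs are hypothetical.

```python
def generation_to_year(generation, generation_length=28, reference_year=1950):
    """Convert an admixture time in generations to a calendar year.

    Implements Year = 1950 - 28 * (Generation - 1) with the defaults;
    a negative result would correspond to a year BCE.
    """
    return reference_year - generation_length * (generation - 1)


# Hypothetical generation counts (illustration only).
year_at_1 = generation_to_year(1)    # one generation maps to the reference year
year_at_11 = generation_to_year(11)  # ten generation lengths before 1950
```

Fractional generation estimates can be passed directly, since the formula is linear in the generation count.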