Ancient Mitogenomes Reveal the Origins and Genetic Structure of the Neolithic Shimao Population in Northern China

Shimao City is considered an important political and religious center during the Late Neolithic Longshan period of the Middle Yellow River basin. The genetic history and population dynamics among the Shimao and other ancient populations, especially the Taosi-related populations, remain unknown. Here, we sequenced 172 complete mitochondrial genomes, ranging from the Yangshao to Longshan period, from individuals related to the Shimao culture in northern Shaanxi Province and Taosi culture in southern Shanxi Province, Middle Yellow River basin. Our results show that the populations inhabiting Shimao City had close genetic connections with an earlier population in the Middle Neolithic Yangshao period of northern Shaanxi Province, revealing a mostly local origin for the Shimao Society. In addition, among the populations in other regions of the Yellow River basin, the Shimao-related populations had the closest maternal affinity with the contemporaneous Taosi populations from the Longshan period. The Shimao-related populations also shared more affinity with present-day northern Han populations than with the minorities and southern Han in China. Our study provides a new perspective on the genetic origins and structure of the Shimao people and the population dynamics in the Middle Yellow River basin during the Neolithic period.

over 4 million square meters. It is one of the largest known town sites in northern China, dating from the late Neolithic (Longshan period) to the early Bronze Age (Xia Dynasty). Shimao City is one of the biggest central settlements of the late Longshan period in northern China, with a large scale, clear structure, and complete chronological sequence (Sun et al., 2020a). It is considered one of the most important ancient sites in China and has attracted great attention regarding the origin and development of Chinese civilization and early states (Sun et al., 2020a).
The ages of the different sites in Shimao City are not completely consistent, showing a trend of gradual diffusion from the Huangchengtai to the outer city. Taking the sites involved in our study as an example, the Huangchengtai in the center and the Hanjiagedan, the Houyangwan, and the Mahuangliang in the inner city of Shimao City are earlier, about 4,250-4,050 BP, while the Dongmen site of the outer city is later, about 4,050-3,750 BP (Sun et al., 2020a).
(1) Huangchengtai
The Huangchengtai site (HCT) is the core area of Shimao City. Enormous palaces and high-grade building sites with complex structures were found in this area, as mentioned before. "Luxury goods" such as jades, stone models, and murals symbolizing social rank, together with excavated production remains such as copper casting and bone working, are important evidence for inferring that the HCT was the core area where high-level nobles or kings lived. These relics show that the HCT was not only a high-level aristocratic settlement but also a religious center. In addition, according to archaeological studies, the culture of the northwest may have influenced the appearance of the stone carvings of the HCT (Guo, 2013; Li, 2017). Among 12 individuals from the HCT, Shimao_HCT_II3_2 and Shimao_HCT_nS3_1 are directly radiocarbon dated to 4,148-3,984 cal BP and 4,082-3,895 cal BP, respectively.
(2) Hanjiagedan
The Hanjiagedan site (HJGD) is located on an oval hill southeast of the HCT. In 2014, the Shimao archaeological team excavated the HJGD site and found 31 house sites, 41 tombs, 27 ash pits, four ditches, and one kiln site (Sun et al., 2016). Thousands of relics were unearthed, including pottery, stone, and bone objects. It is both a residential and a burial site in the inner city of Shimao City, and its function changed over time: it was used as a residential site in the early stage but was later abandoned and used as a cemetery. Although the cemetery was seriously robbed and disturbed, its scale still indicates that the site was a large noble cemetery in Shimao City. There are differences in identity and rank among the tomb occupants, and the trend toward social complexity intensified. In addition, martyrdom is observed in some tombs, which may reflect a society marked by class differentiation and frequent wars (Sun et al., 2020a). Two individuals (Shimao_HJGD_M21S and Shimao_HJGD_M24h) are dated to the Longshan period, 3,835-3,699 cal BP and 3,977-3,849 cal BP, respectively.

(3) Houyangwan
The Houyangwan site (HYW) is northeast of the HCT site. Like the HJGD, it is a residential area in the inner city of Shimao City. Trial excavations of the HYW were conducted in 2012. House sites and tombs are the main remains of this site. The house sites are cave dwellings, and the tombs include vertical caves, earth pit tombs, and urn coffin tombs. The unearthed relics are mainly pottery, along with a few stone tools, bone vessels, and many animal bones. Notably, martyrdom is observed in some tombs (Sun et al., 2015; Sun et al., 2020a). Two individuals, Shimao_HYW_T1M2b and Shimao_HYW_T2M2, are directly radiocarbon dated to 3,975-3,840 cal BP and 3,811-3,570 cal BP, respectively, and are also assigned to the Longshan period.

(4) Mahuangliang
The Mahuangliang site (MHL) is located in the inner city of Shimao City. Only one individual from this site is included in our analysis, directly radiocarbon dated to 3,894-3,722 cal BP.

(5) Dongmen
The Dongmen site (DM) is located in the northeast of the outer city. It is a prehistoric gate site in China with a distinct structure and exquisite design. Important relics such as jade wares, pottery, murals, and stone carvings with anthropomorphic features have been unearthed in the surrounding layers. Jades were found in the stone walls and may have had the religious function of warding off evil spirits. A total of six sacrificial pits with human skulls were found at this site. These skulls, densely buried, lie under the early ground surface or the stone wall. Physical anthropological research shows that there are more women than men and no minors in these pits. In their physical characteristics, they are highly consistent with the indigenous people of the pre-Qin period along the Great Wall in Inner Mongolia (Chen et al., 2016). The pits are likely related to foundation-laying or sacrificial activities during the construction of the city wall (Shao, 2016; Sun et al., 2020a). The radiocarbon dates of individuals Shimao_DM_K4_8, Shimao_DM_M2, and Shimao_DM_K6_2 are 4,144-3,976 cal BP, 3,390-3,253 cal BP, and 4,084-3,902 cal BP, respectively.

Shengedaliang
The Shengedaliang site (SGDL) is located in Yejihe Village, Dabaodang Town, Shenmu County, Yulin City, northern Shaanxi Province. Many remains, such as tombs, ash pits, house foundations, and rammed-earth foundations, were unearthed in 2013-2014. Substantial artifacts dating to the Longshan and Xia periods were unearthed, including pottery, stone tools, bone artifacts, and other relics. The assemblage of artifacts unearthed from this site is essentially the same as those found at the Xinhua site (XH), Shimao City, and the Muzhuzhuliang site (MZZL) (Guo et al., 2016). Individuals SGDL_M7_2014, SGDL_M17, SGDL_M14, and SGDL_M25 are directly radiocarbon dated to 3,811-3,570 cal BP, 3,868-3,650 cal BP, 3,959-3,728 cal BP, and 3,969-3,831 cal BP, respectively, and are assigned to the Longshan period.

Muzhuzhuliang
The Muzhuzhuliang site (MZZL) is located in Yejihe Village, Dabaodang Town, Shenmu County, Yulin City, northern Shaanxi Province. Many house sites and ash pits were discovered, along with a few tombs, pottery kilns, ditches, and other remains. The unearthed relics are very similar to those from the SGDL site. This site is considered the most complete settlement with circular moats of the late Longshan period in northern Shaanxi (Guo, 2015; Wang et al., 2015). Physical anthropological research shows that the ancient population of the MZZL is close to the East Asian type of Mongoloid, and its skull characteristics resemble those of the ancient population of Miaozigou, which is located in south-central Inner Mongolia (Chen et al., 2015). The radiocarbon dates of MZZL_H32, MZZL_M3, and MZZL_M7 also fall in the Longshan period: 4,082-3,895 cal BP, 3,966-3,727 cal BP, and 3,964-3,722 cal BP, respectively.

Xinhua
The Xinhua site (XH) is located in Xinhua Village, Dabaodang Town, Shenmu County, northern Shaanxi Province. A total of 155 ash pits, 72 tombs, 33 house sites, five kiln sites, and one jade pit, together with abundant artifacts (such as pottery, stone tools, bones, and jades), were unearthed (Sun, 2005; Xing et al., 2005). Notably, 32 pieces of jade were unearthed in K1, which is considered a sacrificial pit (Sun, 2002). The radiocarbon dates of individuals XH_M1b, XH_M58, and XH_M59 are 3,835-3,652 cal BP, 3,868-3,696 cal BP, and 4,231-3,998 cal BP, respectively.

Zhaishan
The Zhaishan site (ZS) is in Wangshamao Village, Tianjiazhai Town, Fugu County, northern Shaanxi Province, 60 kilometers away from Shimao City in the southwest. The site contains a stone city settlement dated to the Longshan period, covering an area of about 600,000 square meters. A rammed-earth platform wrapped with a stone wall was found in the north of the city, similar to the structure of Shimao City; the platform may therefore be the core area of the ZS stone city. In addition, ZS is attributed to the Shimao culture because of the similarities of its artifacts. Twenty-one tombs with an obvious hierarchy were found. The tombs excavated at ZS can be divided into different ranks according to the scale of the tombs, the number of funerary objects, and the presence of burial utensils and martyrs, which shows that class differentiation existed among the populations in this settlement (Sun et al., 2020b; Shao, 2021). The site is considered to date to the Longshan period, around 4,050-3,750 BP.

Taosi
The Taosi site (TS) is located in the south of Taosi Village, Xiangfen County, Linfen City, Shanxi Province, covering an area of 2.8 million square meters. It is one of the largest Longshan cultural sites in the Central Plain, dated from 4,300 to 3,900 BP according to radiocarbon data and archaeological evidence, and comprises three periods: early (~4,300-4,100 BP), middle (~4,100-4,000 BP), and late (~4,000-3,900 BP). Based on the characteristics of the TS remains, archaeologists regard the Taosi culture as another new type of the Longshan Culture in the Middle YR (Gao, 1980; He, 2004; Yan and He, 2005). The similarities between Shimao City and the Taosi site in jade, color painting, and acts of violence indicate close interaction and connection between the two regions (Xu, 2014). Physical anthropological research shows that the morphological characteristics of the human bones from the Longshan period are intermediate between East Asian and South Asian Mongoloid (Zhang, 2009). Among three individuals from TS, one individual, TS_G33, is directly radiocarbon dated to 3,869-3,697 cal BP, within the Longshan period.

Zhoujiazhuang
The Zhoujiazhuang site (ZJZ) is located in Zhoujiazhuang Village, Hengshui Town, Jiangxian County, Shanxi Province. ZJZ spans a long time range, including relics of the Yangshao period, Miaodigou period II, the Longshan, Erlitou, and Erligang periods, and the Zhou and Han dynasties. Among them, the remains of the Longshan period are the most widely distributed, covering an area of 4.5 million square meters. According to archaeological study, the overall characteristics of the artifacts are close to those of the Taosi site (Dai et al., 2018; Tian and Dai, 2018). The 37 individuals from the ZJZ are considered to date to the Longshan period, 4,150-3,700 BP (Sun, 2018).

Groupings for the newly reported ancient individuals
We collected five individuals from the ML and 16 individuals from the WZGL. These two sites show similarities in geographical location, date, and excavated relics, so we grouped these 21 individuals as "preShimao_MW".
A total of 71 individuals were obtained from Shimao City. We grouped them based on their dates and locations within Shimao City. The "Shimao_HCT" group contains 10 individuals collected from the HCT, which was in the center of Shimao City and had the highest political status. We grouped the HJGD, the HYW, and the MHL in the inner city of Shimao City as "Shimao_NC" (n=48) for their similar dates, locations, and relics. The 13 individuals from the Dongmen site in Shimao's outer city were named the "Shimao_DM" group.
For the other Shimao-related individuals, we grouped them as follows. The 12 individuals from the SGDL and six individuals from the MZZL were grouped as "MZZSGDL" for their close geographical distances and similar date. The 12 and 10 individuals from the XH and the ZS were grouped as "XH" and "ZS", respectively. As for the individuals from Shanxi, we grouped the three individuals from the TS and 37 individuals from the ZJZ as "TSZJZ" for their similar dates, locations, and excavated relics.
Three pairs of individuals from the HJGD share identical mitochondrial sequences and were treated as maternal kin pairs (Shimao_HJGD_M6S and Shimao_HJGD_M26S, Shimao_HJGD_M34S and Shimao_HJGD_M34h, Shimao_HJGD_M36O and Shimao_HJGD_M36h), as was one pair from the XH (XH_M4a and XH_M24). From each of these four pairs, we excluded the individual with the relatively lower coverage: Shimao_HJGD_M6S, Shimao_HJGD_M34S, Shimao_HJGD_M36O, and XH_M24. In addition, we excluded six individuals because of high contamination (> 4%): Shimao_HYW_T2M2 from the HYW (8.0%), Shimao_DM_K4_10_2 from the DM (9.5%), MZZL_M7 and MZZL_M8 from the MZZL (6.8% and 5.1%, respectively), and XH_M1a and XH_M48 from the XH (10.2% and 14.0%, respectively). After these exclusions, there were 44 individuals in the Shimao_NC group, 12 in the Shimao_DM group, four from the MZZL, and nine in the XH group. In summary, we used 162 individuals in these groups for further analysis.
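The exclusion bookkeeping in this paragraph can be checked with a short tally. The following is a minimal sketch in Python, assuming the pre-filter group sizes and the sample-to-group assignments stated in this section; the variable and function names are illustrative and are not part of the study's pipeline.

```python
# Contamination cutoff in percent, as stated in the text.
CONTAMINATION_CUTOFF = 4.0

# Individuals per group or site before filtering, as stated in the text.
initial_counts = {
    "preShimao_MW": 21,
    "Shimao_HCT": 10,
    "Shimao_NC": 48,
    "Shimao_DM": 13,
    "SGDL": 12,
    "MZZL": 6,
    "XH": 12,
    "ZS": 10,
    "TSZJZ": 40,
}

# Lower-coverage member of each pair with identical mitogenomes -> group.
lower_coverage_kin = {
    "Shimao_HJGD_M6S": "Shimao_NC",
    "Shimao_HJGD_M34S": "Shimao_NC",
    "Shimao_HJGD_M36O": "Shimao_NC",
    "XH_M24": "XH",
}

# Sample -> (group, contamination estimate in percent), as stated in the text.
contamination = {
    "Shimao_HYW_T2M2": ("Shimao_NC", 8.0),
    "Shimao_DM_K4_10_2": ("Shimao_DM", 9.5),
    "MZZL_M7": ("MZZL", 6.8),
    "MZZL_M8": ("MZZL", 5.1),
    "XH_M1a": ("XH", 10.2),
    "XH_M48": ("XH", 14.0),
}


def filtered_counts(initial, kin, contam, cutoff):
    """Subtract excluded individuals from the per-group counts."""
    counts = dict(initial)
    for group in kin.values():
        counts[group] -= 1
    for group, percent in contam.values():
        if percent > cutoff:
            counts[group] -= 1
    return counts


final_counts = filtered_counts(
    initial_counts, lower_coverage_kin, contamination, CONTAMINATION_CUTOFF
)
total_used = sum(final_counts.values())
print(final_counts, total_used)
```

Under these assumptions, the tally reproduces the group sizes and the total of 162 individuals reported above.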

Published individual classification and nomenclature
Our dataset includes 801 ancient individuals and 7,641 present-day individuals (Table S1). These individuals are grouped into four clusters according to the PCA based on haplogroup frequency and FST heatmap based on genetic distance: North-eastern Asian (NEA: North Asians and Northern East Asians), South-eastern Asian (SEA: South Asian and Southern East Asians), Central and West Eurasian (CWE: Central and West Asian including populations from Xinjiang, China, and European populations).
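As an illustration of the input to a haplogroup-frequency PCA, the sketch below builds a per-population haplogroup frequency table in Python. It is only a sketch under stated assumptions: the function name, the population labels, and the toy haplogroup assignments are hypothetical, and the haplogroup resolution and software actually used for the PCA and FST analyses are not specified in this section.

```python
from collections import Counter, defaultdict


def haplogroup_frequencies(assignments):
    """Return {population: {haplogroup: frequency}} from (population, haplogroup) pairs."""
    counts = defaultdict(Counter)
    for population, haplogroup in assignments:
        counts[population][haplogroup] += 1
    freqs = {}
    for population, counter in counts.items():
        total = sum(counter.values())
        freqs[population] = {hg: n / total for hg, n in counter.items()}
    return freqs


# Hypothetical toy input, not data from this study.
toy_assignments = [
    ("PopA", "D"), ("PopA", "D"), ("PopA", "M"), ("PopA", "B"),
    ("PopB", "M"), ("PopB", "M"),
]
freqs = haplogroup_frequencies(toy_assignments)
print(freqs)
```

Each population's frequencies sum to one, so the resulting table can be arranged as a populations-by-haplogroups matrix for an ordination method such as PCA.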

Classification of ancient samples from NEA.
The PCA results show that populations from northern China (north of the Qinling-Huaihe Line) and populations from North Asia (Baikal_EN, Baikal_EBA, N.Mongolia_LBA, Xiongnu_HP) cluster together, far from the populations in southern China (Figure 2B). Therefore, we put the samples from northern China, Mongolia, and the Lake Baikal basin into one group called Northeastern Asia (NEA). Although Xinjiang is located in northern China, genetic analyses show that its individuals from different regions and periods had different genetic structures, so we grouped these individuals following the genetic results. Accordingly, we combined 224 samples from Xinjiang and 220 samples from Central and Western Asia into one group, named Central and Western Eurasia (CWE).
We grouped 473 samples from northern China based on archaeological culture, date, and geographical location. Seventy-four individuals are from the Gansu-Qinghai region and Tibet, including 15 individuals from areas above 4,000 meters (Ding et al., 2020). For the samples dated to 3,150-511 BP, we followed Ding et al. and grouped them with seven ancient individuals from the high valleys of the Himalayan arc in Nepal as HTP_IA (Jeong et al., 2016; Ding et al., 2020). To explore the samples from the YR basin further, we grouped the LTP_IA populations more finely according to archaeological culture and geographical location. For the individuals in the Gan-Qing region, we grouped the 29 individuals of the Majiayao culture, with an average age of 3,957 BP (range 5,040-411 BP), together as GQMajiaY_EBA. Eleven individuals belonging to the Qijia culture, with an average age of 3,351 BP (range 4,065-1,791 BP), were placed in one group named GQQijia_BA, and eight individuals of the Kayue culture, with dates ranging from 2,500 BP to 3,200 BP, were placed in one group named GQKayue_LBA. In addition, we grouped the 11 individuals from areas of Tibet below 4,000 meters in altitude as LTP_IA.
In the Middle YR, in addition to the 40 TSZJZ samples from southern Shanxi that we analyzed, there are 52 samples from the Qingtai site in Henan. Since these samples are from the same site and date to the Yangshao period (5,500-5,000 BP), we placed them in one group named QT_MN.
In the Lower YR, 87 samples dated to 9,500-1,800 BP were obtained from Shandong (Liu J et al., 2021). According to current research, the genetic composition of Shandong populations changed around 4,600 BP (Liu J et al., 2021). Therefore, following Liu J et al., we placed the 50 individuals dated after 4,600 BP in one group called SD_LN. For the individuals before 4,600 BP, we observed that five earlier individuals from the Bianbian, the Xiaogao, and the Xiaojingshan belong to the early Neolithic period (~9,500-7,000 BP), while the other 37 samples belong to the middle Neolithic period (~6,000-4,600 BP). In addition, the five earlier individuals carry the haplotypes N9a2'4'5'11, B4c1c, and D4b2b2, which are not carried by other Shandong individuals. Therefore, we placed the five individuals in one group named SD_EN and the other 37 individuals in a group named SD_MN.
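The date-based part of this Shandong grouping can be sketched as a simple rule. The Python function below is illustrative only: its name is hypothetical, the text leaves a gap between ~7,000 and ~6,000 BP so the 7,000 BP boundary between SD_EN and SD_MN is an assumption, and an age of exactly 4,600 BP is assigned to SD_MN by assumption. The study also considered haplotypes when separating SD_EN, which this sketch does not model.

```python
# Boundary stated in the text: genetic composition changed around 4,600 BP.
LN_BOUNDARY_BP = 4600
# Assumed boundary between early and middle Neolithic (not given exactly in the text).
EN_MN_BOUNDARY_BP = 7000


def shandong_group(age_bp):
    """Return the group label for a Shandong individual given its age in years BP."""
    if age_bp < LN_BOUNDARY_BP:
        return "SD_LN"  # dated after 4,600 BP (younger)
    if age_bp >= EN_MN_BOUNDARY_BP:
        return "SD_EN"  # early Neolithic, ~9,500-7,000 BP
    return "SD_MN"      # middle Neolithic, ~6,000-4,600 BP


for age in (9500, 5000, 3000):
    print(age, shandong_group(age))
```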
In addition, we obtained haplogroup information for 36 individuals from the Halahaigou site in Inner Mongolia. The individuals are dated to ~4,500 BP and belong to the middle Neolithic period (Zhao, 2009). Therefore, we named them Halahaogou_MN.

Classification of ancient samples from other regions.
We obtained 59 individuals from southern China (south of the Qinling-Huaihe Line) and divided them into four groups (FJ_LN, 11 individuals; YN_LN, five individuals; YNHC_LN, 11 individuals; GX_HE, 32 individuals), following Liu Y et al. (2021). In addition, the 22 individuals of HTP_IA are also located in the south of China, and haplogroup M has the highest frequency among them, as in the above four groups in southern China. As a result, we put these five groups within the South Asian populations.
For the 220 individuals in the CWE, we followed the grouping of Wang et al. and divided them into 16 populations.

Classification of present-day individuals
We collected 7,641 present-day individuals from East Asia, South Asia, North Asia, and Central and Western Eurasia (Table S1) and grouped them mainly by geographical location. Except for the individuals from China, we followed the grouping of Wang et al. and divided them into 26 populations.
We grouped the 2,102 present-day samples from China more finely, mainly based on geographical location and nationality. Since the Han are the largest nationality in China and are widely distributed, we divided the Han individuals into the Northern Han (NChina_Han) and the Southern Han (SChina_Han) (Liu Y et al., 2021). NChina_Han includes 388 samples from Beijing, Ningxia, Liaoning, Shandong, Shaanxi, Gansu, and Xinjiang, while SChina_Han contains 168 samples from Hubei, Hunan, Guangdong, Guangxi, and Yunnan. Ethnic minorities in northern China were grouped by nationality, while non-Han individuals in southern China were grouped by considering geographical location and nationality together.

Supplementary Tables
Supplementary Table 1. Information on the 172 new samples in this study.