Reanalysis on Phylogeographic Pattern of Sharpbelly Hemiculter leucisculus (Cyprinidae: Cultrinae) in China: A Review and the Implications for Conservation

Hemiculter leucisclus, as a widely distributed freshwater fish in China, provides an interesting model to explore the impact of drainage evolution and geologic history in the Pleistocene on diversification patterns. We collected the mitochondrial cytochrome b (cytb) gene and the recombination activating gene 2 (RAG2) from 1,070 individuals from 59 sampling locations. Phylogenetic and population genetic approaches were used to describe the phylogeographic pattern and to test how the geological and climatic factors on diversification. The results suggested that there existed four sublineages of the H. leucisclus across six river systems, among which two sublineages, showing strongly indigenous characteristics, are constrained to particular geographical regions in China. The molecular data and ancestral states demonstrated that the H. leucisclus possibly originated from the Pearl River basin during the later Pliocene. The phylogeographic pattern in H. leucisclus appears to have been driven by palaeoenvironmental perturbations rather than anthropogenic translocations. The geographically constrained sublineages A in the middle and lower Pearl River basin and sublineage B in the upper Yangtze River basin deserves special protection.


INTRODUCTION
Landscape features and geological history are hypothesized to strongly affect the biogeographical processes (Manel et al., 2003), and this can be especially true in freshwater systems (Leclerc et al., 2008). The river system evolution can shape and influence the distributions and phylogeographical pattern of freshwater-limited lineages among drainages (Schultheiß et al., 2014). The population structure and diversification of freshwater fishes are primarily influenced by the dynamics of river morphology, affecting connections between river basins and lakes. The separation of water bodies often leads to genetic divergence as a result of genetic drift and reproductive isolation (Husemann et al., 2012). And the intraspecific structure of riverine fish is usually clustered by river basins (Bartáková et al., 2015).
H. leucisclus often occur in high abundance in river and lake, because of high fertility, potentially reducing the amount of genetic drift through counteracting genetic differentiation among populations (Tibbets and Dowling, 1996). Further, passive dispersal by flooding may reduce genetic differentiation, at least for riverine fish within river basins (Van Leeuwen et al., 2013). Though the distribution of freshwater fish is often limited by water area, long-distance dispersal may occur when fish or their larvae are transported passively, either by accidentally or on purpose by humans. Hence, the broadly distributed H. leucisclus tends to show high connectivity and gene flow mediated by strong adaptation and fecundity, as well as their aggregative characteristic.
However, despite multiple modes of passive dispersal, strong population divergence is often observed among local populations in the freshwater system (Husemann et al., 2012), either resulting of natural geographic isolation or anthropogenic fragmentation. In China, the rapid uplift of Tibet Plateau changed the trajectory of the river system in China, and ultimately leading to the formation of a flowing eastward drainage system (Clark et al., 2004). The Yangtze River was built between Oligocene and Miocene, but the unification of the upper and middle Yangtze River in the Three Gorges mountain region has happened between Late Pliocene and Early Pleistocene (Zheng et al., 2013). However, it is generally believed that the drainage pattern of the modern Yellow River was closely followed by the great uplift of the Tibet plateau once again, about 1.7 million years ago (Mya), but the ultimate formation of the Yellow River was at about 0.15 Mya, when Sanmenxia was completely cut through by Yellow River and went through to east (Li et al., 2001). The development of major river systems in the Pearl River Basin was derived from the Yanshan tectonic movement, the connectivity between the Yangtze River and the Pearl River system was believed before the early Pleistocene (Miao et al., 2004). However, the Pearl River basins and Yangtze River basins are divided by the Yunnan-Guizhou Plateau and Nanling Mountains, which hold back the cold weather from the north, forming the warmer climate and well-watered tropical land in the Pearl River basins in China. As a result, the distinct and different evolutionary histories of the river systems can be expected to affect current patterns of diversity of the widely distributed freshwater species.
In this study, we used the partial mitochondrial cytochrome b gene (cytb, mitochondrial gene) and the recombination activating gene 2 (RAG2, nuclear DNA) and combined the data of previous studies to estimate the phylogeographic pattern and evolutionary history of H. leucisclus across the three major river systems in China, and to evaluate the conservation status and strategy of this species.

Samples and Sequence Data Preparation
Ten populations of H. leucisclus in the Yellow River basin and six populations in Huaihe River basin (Supplementary File 1) were collected between May 2017 and August 2018. The genomic DNA was extracted and the cytb was amplified following Fan and He (2014). We sequenced 345 individuals de novo (accession number in GenBank, ON315395-ON315739) and withdrew 725 sequences from the previous studies (Chen et al., 2017;Wang L. H. et al., 2021). A total of 1,070 cytb sequences from 59 localities across three major River systems in China were analyzed in this study (Figure 1 and Supplementary File 1). In addition, the RAG2 gene was also amplified following Chen et al. (2017) and the new data (accession number in GenBank, ON315740-ON315840) included 101 individuals distributed in the Yellow River and Huaihe River basin. Another 45 RAG2 sequences were also retrieved from a previous study (Chen et al., 2017), and a total of 146 RAG2 sequences were analyzed (Supplementary File 1). Cycle-sequencing reactions were executed using BigDye terminator v. 3.1 and analyzed using an ABI-PRISM 3730 sequencer at Sangon Biotech (Shanghai, China). Both DNA strands were sequenced for both new cytb and RAG2 fragments in this study.

Phylogenetic Analysis
The new datasets generated and analyzed during this study are available in the GenBank (Supplementary File 1). All sequences were aligned by using MAFFT implemented in PhyloSuite v1.2.2 (Zhang et al., 2020). The aligned nucleotide sequences cytb and RAG2 were concatenated and used to construct the phylogenetic tree by using the maximum likelihood (ML) method implemented in RAxML v8.2.4 with 1,000 bootstrap replicates. Each gene was treated as unlinked. We were able to include all 1,070 individuals assayed since it is not required to have the same individuals analyzed for every genetic marker if partitions are considered unlinked. Bayesian inferences (BI) analysis was performed in MrBayes v3.2.6 using the Markov Chain Monte Carlo method with 10 million generations and sampling trees every 1,000 generations. The first 25% of trees were discarded as burn-in with the remaining trees being used for generating a consensus tree. The species of Hemiculter bleekeri and Culter alburnus (Supplementary File 1) were used as outgroups. The final trees were visualized in FIGTREE v1.4.4.
To further visualize haplotype diversity and distribution of H. leucisclus, we generated haplotype networks using 1,070 cytb sequences and colored each haplotype by the geographic region from which river system it was collected (Supplementary File 1). Another four haplotype networks in different river systems were also generated using cytb sequences. The haplotype networks were constructed using network v. 10 1 and applying the medianjoining and maximum parsimony options. Four sublineages were found in this study, and the genetic diversity of each was calculated using DNASP v6.12.03.

Divergence Time Estimation
The divergence time was estimated using a molecular clock approach as implemented in BEAST v2.6.6. We used a reduced dataset of cytb: specimens for which one or six sequences from the same region were included. This approach resulted in 322 terminals-317 representatives of the 58 populations of H. leucisclus and five outgroup sequences (H. bleekeri and C. alburnus) (Supplementary File 2). As the lacking of fossil evidence, we first used a conservative approach by employing the strict-clock model from previous studies (Chen et al., 2017), a range of substitution rate of 1.00-2.00% per million years is often adopted and widely employed for cytb gene analysis in cyprinid fish (Durand et al., 2002;Ketmaier et al., 2004). The GTR + I + G model was implemented, and the "speciation: Yule process" tree prior was used to construct the tree. We ran four independent runs for 50 million generations, logging trees every 5,000 generations. Convergence was checked with TRACER v1.7.1. Posterior trees from the four runs were combined after removing the first 10% as burn-in in LogCombiner. The maximum credibility tree was created in TreeAnnotator available in the BEAST package. We performed another molecular clock analysis with similar settings but using a "relaxed clock model" with two calibration points. The first calibration point was the timing of separation between the upper Pearl River and Yangtze River during the early Pleistocene (2.5-1.7 Mya). The node was calibrated using a normal distribution with a mean at 2.1 Mya and a standard deviation of 0.2 Mya. According to the Cyprinidae fossils found in China, the recent cyprinid fauna of the East Asian mainland (Wang, 2005), including the Cultrins, was most likely established in the Pliocene, which was used as an additional calibration point for the common ancestor of Culter and Hemiculter.

Reconstruction of Ancestral Areas
We further performed a biogeographic reconstruction of ancestral areas for species of H. leucisclus in China using BioGeoBEARS in RASP v4.02. The analyses were conducted on a fully resolved topology from the BEAST analysis containing a subset of individuals which covered the range of four cytb mtDNA lineages of H. leucisclus and H. bleekeri  River basin, (E) the Huaihe River basin, (F) the river basin of southeastern, and (G) the river basin of southwestern. All six models of geographic range evolution were compared in a likelihood framework (DEC, DEC + j, DIVALIKE, DIVALIKE + j, BAYAREALIKE and BAYAREALIKE + j).
The best-fit model was assessed by comparing Akaike's information criterion and likelihood−ratio tests, and DIVALIKE + j was chosen.

Phylogenetic Tree and Haplotype Network
The ML and BI tree showed a clear identification of four sublineages found in H. leucisclus (sublineage A, B, C, and D, Figure 1). However, most specimens of H. leucisclus collected from ZG and JY in the Yangtze River basin were classified as H. bleekeri both in the ML and BI tree. Furthermore, the median joining network of cytb showed that there were so many mutated positions between the haplotype collected from ZG and from other localities (Figure 2), indicating a very distant phylogenetic relationship. Our results suggested that the specimens from ZG and JY were likely misidentified as H. leucisclus in the previous study (Chen et al., 2017). In addition, four mtDNA lineages coinciding with the ML and BI trees were found in the haplotype network, and a significant phylogeographical pattern with two specific regions was found for H. leucisclus, one in the Pear River basin (sublineage A), and another in upper Yangtze River basin (sublineage B) (Figure 2). Sublineage C had a wide distribution across the six river systems and there were many shared haplotypes, while sublineage D had a relatively narrow distribution range in the Yangtze River and Yellow River basins. The genetic diversity indices showed a high level of genetic diversity in different sublineages, except for sublineage B (π = 0.00406, h = 0.87, Supplementary Table 1).

Divergence Time Estimation
We utilized a constant trait specific cytb Cyprinidae clock, specimens from different river systems form four distinctive clades. The divergence between H. bleekeri and H. leucisclus was at approximately 3.69 Mya. The divergence of sublineage A was dated at approximately 2.27 Mya, and the split time of sublineage B was about 1.42 Mya. Sublineage C diverged from sublineage at approximately 0.83 Mya (Figure 3). By applying a fossil calibrated molecular clock, we estimated the molecular diversification between H. leucisclus and its sister group of H. bleekeri was approximately 4.25 Mya. Sublieage A was estimated to have separated at about 2.57 Mya, followed by the divergence of sublineage B at about 1.62 Mya. Sublineage D separated from C at approximately 0.96 Mya (Figure 3).

Ancestral Range Reconstruction
The results of the ancestral range reconstruction are depicted in Figure 4, which showed that the most recent common ancestor (TMRCA) of H. leucisclus was mainly distributed in the Pearl River basin (node 35, D: 44.49%). There existed 20 dispersal and 11 vicariance events that occurred during the evolution of H. leucisclus (Supplementary Tables 2, 3). TMRCA of H. leucisclus probably diverged into two distinct clades in the late Pliocene. One clade was just distributed in the middle and lower Pearl River basin, while the other clade further split into two separate sublineages with no dispersal and vicariance event.

DISCUSSION
The role of past climate fluctuations and geological events, playing in extant species diversification and distribution, is one of the main focuses of phylogenetics and biogeography (Rull, 2011). One consequence of drastic climatic change and geological events is the frequent reconfiguration of landscape. Meanwhile, spatial and geographical barriers prevent gene flow, which in turn affects the phylogeographic pattern (Tittes and Kane, 2014). However, rivers can act as dispersal corridors for fish, promoting or counteracting geographical isolation and diversification (Vernesi et al., 2016).
Our results identified four distinct sublineages for H. leucisclus across different river systems in China with a significant phylogeographic patterns. One of the sublineages, sublineage C, had a very wide distribution across all the river systems in this study, even distributed in many isolated drainage systems, including Lancangjiang River, the upper Pearl River, the Fenhe River, and Weihe River (in Yellow River system), Huaihe River, and the river basins of southeastern China (Figure 1), suggesting it's distributions did not completely match the river delimitation, as well as the geographic zone. The widely distributed sublineage C in different river systems could be attributed to their spatial proximity, which enables H. leucisculus populations from the Yangtze River basin to expand to nearby rivers easily by flooding in summer (Chen et al., 2017). However, the flooding cannot explain the strong admixture among far distant sampling locations across different river systems and even isolated drainage (Figure 2). Hence, anthropogenic translocations, especially the large-scale introduction of economically valuable fish half a century ago, offer alternatively possible explanations.
The formation of the Nanling Mountain Range could drive the separation of sublineage A from others (Wang L. H. et al., 2021), and the separation of sublineage B distributed in Sichuan Basin could be attributed to the isolation of populations by mountains barrier in the Southwestern and Central China (Figure 1). The Dabie Mountains-Eastern Qinling, Daba Mountains and Wushan Mountains have shaped the phylogeographic pattern of many other species, including Sinopotamon acutum (Fang et al., 2015), Microtus fortis (Gao et al., 2017), and Hyalessa maculaticollis (Liu et al., 2018). However, the populations from the Yellow River and Huaihe River basins had not yet formed independent lineage, although there are many watersheds existing in Qinling-Dabie Mountains. Furthermore, the populations from the middle and lower Pearl River basin and Hainan Island were unexpectedly found to have a close relationship. Considering the molecular data, the divergence time of H. leucisclus far postdated the formation of these mountains. For example, the ultimate formation of Nanling Mountain and the watershed between Yangtze River and Pearl River basin was dated to the Himalayan movement during the tertiary (Wang, 1993), suggesting that dispersal rather than vicariance has created the current distribution pattern of H. leucisclus in China.
Furthermore, the climate oscillations and geologic events during the quaternary could be a major driver of intraspecific divergence for H. leucisclus. Sublineage A with the highest nucleotide diversity (π = 0.00742, Supplementary Table 1) only existed in warm and moist tropical or subtropical areas, which provided important refugia for freshwater species during glaciations (Gao et al., 2012;Yan et al., 2013). Our results suggested that H. leucisclus originated in the Pearl River basin at 4.25-3.69 Mya (Figures 3A,B). And our data are consistent with the conclusion of previous studies (Chen et al., 2017). And the divergence time of lineage A was at approximately 2.57-2.27 Mya, which fit well with the separation of Upper Yangtze River and Pearl River during Late Pliocene and Early Pleistocene (Miao et al., 2004). It was speculated that the active river capture and reverse events may have caused the connection of the Yangtze River and the upper Pearl River in some regions before the Early Pleistocene. Because the upper Yangtze River originally drained into the South China Sea through the Paleo-Red River until it was captured by the middle and lower reaches of the Yangtze River and reversed its route to flow into the East China Sea (Perdices et al., 2005). The resulting river capture ultimately generated the isolation between the Yangtze River and Pearl River systems, and in turn, likely had a dramatic effect on the evolution of many freshwater species (Gu et al., 2019). Furthermore, the separation of sublineage B at approximately 1.42-1.62 Mya exactly coincided with the timescale of the formation of the current Yangtze River (Li et al., 2001). It was reported that the unification of the upper and middle Yangtze River in the Three Gorges Mountain region occurred between the Late Pliocene and Early Pleistocene (Zheng et al., 2013). Hence, the connection of the upper and middle Yangtze River facilitated species dispersal from west to east, similar to what was observed in the frog Quasipaa boulengeri (Yan et al., 2013) and the freshwater snail Bellamya (Gu et al., 2019). Therefore, the establishment of the eastward-flowing the Yangtze River and the changes in river systems could have provided opportunities for dispersal and gene flow between closely related freshwater species, especially fish. It was surprising that sublineage D had a relatively narrow and discontinuous distribution in Yangtze River and Yellow River basins. The divergence time estimation between sublineage C and D was at 0.96-0.83 Mya, basically matched with the Dagu Glaciation (1.1∼0.9 Mya) in China (Jing and Liu, 1999). The result of repeated episodes of range contraction and expansion during glacial cycling has been hypothesized to be important driving forces of vicariant speciation or intraspecific divergence in many fish species (Chen et al., 2017;Li et al., 2020).
The evolutionary significant units have been proposed for the assessments of biodiversity and have a significant impact on conservation program (Niemiller et al., 2012), especially the recognition of phylogeographic patterns has important implications for the accurate assessment of the conservation of genetic resources. According to the distribution of the four sublineages in different river systems, two sublineages were found in the Yellow River basin, three sublineages in the Yangtze River basin, and two sublineages in the Pearl River basin (Supplementary Figure 1), suggesting that the genetic structure of H. leucisclus has not been completely changed by recent human-mediated translocations and flooding. Based on our new findings, the conservation of H. leucisclus should be reassessed, as the current conservation strategies are performed under the assumption of only one species and one gene pool. Especially for the sublineage B in a couple of rivers in the upper Yangtze River basin with very low nucleotide diversity. In a previous study, the morphological diversification found in H. leucisculus along climate gradients is primarily caused by the divergent selection pressures among different environments (Wang L. H. et al., 2021). Given these indications and molecular information, we advocate different protection strategies for H. leucisculus in different river systems. There should be two protection units for H. leucisculus in the Yellow River basin, three protection units for H. leucisculus in the Yangtze River basin, and two protection units for H. leucisculus in the Pearl River basin. While only one protection unit for H. leucisculus is necessary in Huaihe River basin. Furthermore, a thorough survey of wild populations of these lineages, including an assessment of their ecology, will be required for a more precise evaluation of their conservation status.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The animal study was reviewed and approved by the Animal Care Committee of Hunan Normal University.

AUTHOR CONTRIBUTIONS
QG conceived and designed the study. HZ and YS analyzed the data. HY, HZ, SL, and ZS carried out tissue dissection, DNA isolation, sample preparation for sequencing, and collect the mtDNA data from GenBank. QG and HZ drafted the manuscript. MW revised the language. All authors read and approved the final manuscript.

FUNDING
This study was supported by the Natural Science Foundation of Hunan (Grant No. 2021JJ30442) and the National Natural Science Foundation of China (No. 31601851).

ACKNOWLEDGMENTS
We greatly appreciate the reviewers for their very helpful comments on previous versions of the manuscript.