Genetic Diversity and Selection of Plasmodium vivax Apical Membrane Antigen-1 in China–Myanmar Border of Yunnan Province, China, 2009–2016

Plasmodium vivax apical membrane antigen-1 (PvAMA-1) is an important vaccine candidate for vivax malaria. However, antigenic variation within PvAMA-1 is a major obstacle to the design of a global protective malaria vaccine. In this study, we analyzed the genetic polymorphism and selection of the PvAMA-1 gene from 152 P. vivax isolates from imported cases to China, collected in the China–Myanmar border (CMB) area in Yunnan Province (YP) during 2009–2011 (n = 71) and 2014–2016 (n = 81), in comparison with PvAMA-1 gene information from Myanmar (n = 73), collected from public data. The overall nucleotide diversity of the PvAMA-1 gene from the 152 YP isolates was 0.007 with 76 haplotypes identified (Hd = 0.958). Results from the population structure suggested three groups among the YP and Myanmar isolates with optimized clusters value of K = 7. In addition, YP (2014–2016) isolates generally lacked some K components that were commonly found in YP (2009–2011) and Myanmar. Meanwhile, PvAMA-1 domain I is found to be the dominant target of positive diversifying selection and most mutation loci were found in this domain. The mutation frequencies of D107N/A, R112K/T, K120R, E145A, E277K, and R438H in PvAMA-1 were more than 70% in the YP isolates. In conclusion, high genetic diversity and positive selection were found in the PvAMA-1 gene from YP isolates, which are significant findings for the design and development of PvAMA-1-based malaria vaccine.


INTRODUCTION
Malaria is still a serious infectious disease that threatens human health and affects social and economic development in the world. According to the World Health Organization (WHO), by 2019, there were 229 million malaria cases, an increase of one million over 2018 and 409,000 malaria deaths worldwide (WHO, 2020). Plasmodium vivax is one of the five species of Plasmodium that regularly infect humans and cause malaria, and is the most widely distributed human malaria species outside the African continent with an estimated 2.5 billion people at risk of infection (Vogel, 2013;Howes et al., 2016). Plasmodium vivax is widely prevalent in parts of Asia (Flannery et al., 2019), especially, in some countries bordering China, such as Myanmar, Laos, and Vietnam (von Seidlein et al., 2019;Brashear et al., 2020).
Plasmodium vivax was once a serious epidemic in China (Feng et al., 2015). Although there were no local cases of P. vivax in China since 2017 (Lai et al., 2019), there is still a huge risk of re-emerging cases due to the existence of Anopheles vectors (Zhang et al., 2018). In the China-Myanmar border (CMB) area, with the development of the Belt and Road Initiative, the risk of imported malaria cases to China is increasing (Lai et al., 2019). Therefore, studies to dissect the genetic backgrounds of P. vivax in the CMB area are of great significance to provide information for the control and elimination of vivax malaria in Myanmar and other related countries.
In recent years, drug resistance to P. vivax was frequently reported (Imwong et al., 2007;Lu et al., 2010;Dayananda et al., 2018). Meanwhile, there are many other factors which contribute to the difficulties of P. vivax control, such as hypnozoites (White and Imwong, 2012), early gametocytogenesis, frequent low parasitemias, high infectivity to mosquitoes, and shorter development cycle in the vector host compared with other species of Plasmodium (Mueller et al., 2009;Mueller and Adams, 2017). Thus, the development of a stable and effective vaccine has been proposed as a possible aid to drugs for effective control and elimination of vivax malaria. Plasmodium vivax apical membrane antigen-1 (PvAMA-1) is an important candidate for malaria vaccine (Mueller et al., 2015;Beeson et al., 2019). Apical membrane antigen-1 (AMA-1) is expressed in the microneme of apicomplexan parasites and is present in all Plasmodium species (Bittencourt et al., 2020). It is a type I transmembrane protein binding with rhoptry neck proteins (Srinivasan et al., 2011). AMA-1 is involved in merozoite reorientation and tight junction formation during the invasion process (Gaur et al., 2004;Richard et al., 2010;Lamarque et al., 2011) and is essential for parasite survival (Triglia et al., 2000). It has been reported that antibodies against the ectodomain of Plasmodium falciparum AMA-1 (PfAMA-1) can inhibit erythrocyte invasion, and its immunization protects against malaria infection (Remarque et al., 2008;Kusi et al., 2009). The extracellular domain of AMA-1 is divided into three subdomains referred to as domain I, domain II, and domain III based on the conserved cysteine residues (Pizarro et al., 2005). Domain I and domain II cover the polymorphic regions and have been shown to be the major targets that elicit inhibitory responses (Remarque et al., 2008). In general, the formation of moving junction between merozoite and erythrocyte is the key procedure for the successful invasion of host cells by the parasite, and the mechanism includes the interaction between the conserved hydrophobic groove in domain II loop of AMA-1 and the conserved rhoptry neck protein 2 (RON2) loop (Bargieri et al., 2013). PvAMA-1, therefore, becomes an important immune target (Múfalo et al., 2008). Meanwhile, due to the highly polymorphic feature of the PvAMA-1 gene, it has been used as molecular marker for population genetic studies (Joshi, 2003). However, antigenic variation is a major challenge in the design of protective malaria vaccine (Esmaeili Rastaghi et al., 2014). Recently, a report on genetic diversity of Pvama-1 in India also showed this trend (Kale et al., 2021).
In the present study, the genetic diversity and selection of PvAMA-1 gene in the CMB area were investigated in P. vivax isolates collected from Yunnan Province (YP) in China during 2009-2011 and 2014-2016 and compared with the public PvAMA-1 gene information from Myanmar, to advance our knowledge of the rational design of vivax malaria vaccine.

Ethics Statement
This study was conducted according to the principles expressed in the Declaration of Helsinki. Before blood collection, the study protocol, potential risks, and benefits were explained to the participants, and written informed consent was obtained from all adult participants and from the parents or legal guardians of children. Blood was collected following institutional ethical guidelines reviewed and approved by the Ethics Committee at the National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (no. 20120826).

Sample Collection and DNA Extraction
A total of 180 blood samples of malaria patients infected with P. vivax were collected from the CMB area in YP (China) during 2009-2011 (n = 77) and 2014-2016 (n = 103) ( Figure 1). All samples were stained by Giemsa and verified by microscopic examination then confirmed for single infection with P. vivax by nested PCR (Zhou et al., 2014). Genomic DNA was extracted from whole blood using the DNeasy Blood & Tissue Kit (Qiagen, Germany) as previously reported (Chen et al., 2017;Kassegne et al., 2020).

PCR Amplification and Sequencing
To identify Plasmodium species, nested PCR was performed as previously described (Zhou et al., 2014). In the first PCR, the DNA fragment was amplified by 2×Taq PCR MasterMix (Tiangen Biotech, Beijing, China) using the primers rPLU1 5′-TCAAAGATTAAGCCATGCAAGTGA-3′ and rPLU5 5′-CCTGTTGTTGCCTTAAACTTC-3′. One microliter of template DNA was added to a 20-ml PCR mixture consisting of 0.4 mM of each primer, 10 ml 2×Taq PCR MasterMix (Tiangen Biotech, Beijing, China) containing 0.1 U Taq polymerase/ml, 500 mM deoxynucleotide triphosphates (dNTP), 3 mM MgCl 2 , 100 mM KCl, and 20 mM Tris-HCl, pH 8.3. The cycling parameters to amplify the fragments were as follows: initial denaturation at 94°C for 5 min; 30 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 1 min; followed by a final extension at 72°C for 5 min. One microliter of the first PCR product was used in the second amplification. Conditions and concentrations used for the second amplification were identical to those used for the first, except that rVIV1/rVIV2 (rVIV1 5′-CGCTTCTAGCTTAATCCACATAACTGATAC-3′, rV IV2 5′-ACTTCCAAGCCGAAGCAAAGAAAGTCCTTA-3′) were used as primers and amplification was performed over 35 cycles. The size of the DNA target amplified by these outer primers is about 1,600-1,700 bp and that by the inner primers is 121 bp. PCR products were qualitatively analyzed on 2% agarose gel and were sent to Beijing Genomics Institution (BGI, Shenzhen, China) for sequencing.
The PvAMA-1 fragment was amplified by PrimerSTAR Max DNA Polymerase (Takara, Japan) using the primers SeqF1 5′-CCCTACCAGCGGCTACTTC-3′ and SeqR1 5′-CGTTTGCTT GGCCAACTC-3′. All PCR amplifications were performed in a 50-ml PCR reaction volume containing 0.4 mM of each primer pair, 1.5 mM MgCl 2 , 1× PCR buffer (50 mM KCl, 10 mM Tris-Cl, pH 8.3), 0.2 mM dNTPs, and 0.5 unit of DNA polymerase. The cycling parameters to amplify the fragments were as follows: initial denaturation at 98°C for 5 min, 35 cycles of denaturation at 98°C for 10 s, annealing at 69°C for 15 s, extension at 68°C for 1 min 30 s, and a final extension at 68°C for 5 min. The PCR fragments were about 1,900 bp in size and contained all the gene fragments of PvAMA-1 (1,689 bp). The PCR products were qualitatively analyzed on 1% agarose gel and were sent to Beijing Genomics Institution (BGI, Shenzhen, China) for sequencing. All unique mutations were carefully checked, and ambiguous bases were confirmed by resequencing. Meanwhile, a set of 73 sequences of Pvama-1 (1,290 bp) isolates from Myanmar was downloaded from NCBI (KX495505-KX495577) (Zhu et al., 2016).

Data Analysis
All sequences were assembled and aligned by DNAMAN. After all the sequences were aligned, data were trimmed to 1,689 bp, which contains the full coding sequence of PvAMA-1. We performed a multiple sequence alignment of the PvAMA-1 gene sequences using MEGA6 (Tamura et al., 2013). PvAMA-1 gene sequence PVP01_0934200.1 (1,689 bp) obtained from PlasmoDB (http://PlasmoDB.org) was used as reference.
After all the sequences were aligned, three domains of PvAMA-1 were also divided. These included domain I (462 bp, nucleotides 280-741), domain II (297 bp, nucleotides 793-1,089), and domain III (192 bp,nucleotides 1,353). To investigate the genetic polymorphism and selection of PvAMA-1 in different time periods in the CMB area, we used DnaSP (Librado and Rozas, 2009) to calculate the number of haplotypes (H), the mean value of nucleotide differences (k), nucleotide diversity (p), and haplotype diversity (Hd) as previously described (Murhandarwati et al., 2020), with a sliding window of 100 bp and step size of 25 bp for nucleotide diversity (p). KaKs_Calculator 2.0 software (Wang et al., 2010) was used to calculate the non-synonymous (Ka) and synonymous (Ks) substitution rates. The Ka and Ks values and Ka/Ks ratios were calculated based on a model-averaged method (Wang et al., 2010). Ka/Ks calculation was used to estimate the selection pressure of PvAMA-1 gene pairs. The algorithm was NG (Nei and Gojobori, 1986) and YN, which is an alternative model for NG (Yang and Nielsen, 2000). Additionally, Tajima's D test was performed in DnaSP to evaluate the neutrality theory of evolution (with Fu and Li's test as a double check). The probability of recombination between adjacent nucleotides per generation was calculated using DnaSP. The linkage disequilibrium (LD) between different polymorphic sites was computed based on the D and R 2 indices. The calculations performed through DnaSP were based on the default parameters.
The recombination region was calculated by Recombination Detection Program v.4.101 (Martin et al., 2015) with the MaxChi method (Maynard Smith, 1992). After removing the recombination block, a haplotype network based on the Pvama-1 sequence was constructed using the NETWORK software Version 10200 with the median-joining method (Bandelt et al., 1999). To assess allele ancestry, STRUCTURE software was used to assess clustering of isolates under the ancestry model "Use Population Information to test for migrants" (Evanno et al., 2005). Six iterations for the numbers of clusters (K) from three to eight were run, each with a burning period of 5,000 steps and 10,000 Markov chain Monte Carlo iterations. The best K was selected as previously described (Evanno et al., 2005). The sequences of nine additional Pvama-1 populations (Myanmar, KX495505-KX495577; Thailand, FJ784891-FJ785121; Korea, KM230319-KM230384; Sri Lanka, EF218679-EF218701; Iran, JX624732-JX624760; Papua New Guinea (PNG), KC702402-KC702503; India, MH657021-MH657120; Venezuela, EU346015-EU346087; and Brazil, MH049550-MH049589) were analyzed together (Gunasekera et al., 2007;Ord et al., 2008;Putaporntip et al., 2009;Arnott et al., 2013;Zakeri et al., 2013;Kang et al., 2015;Zhu et al., 2016;Bittencourt et al., 2020;Kale et al., 2021). Here, the India population (Kale et al., 2021) should be considered as a longdistance geography distribution outgroup since Kale et al. had exhibited the genetic difference between Myanmar and South Asia. Before the Network and Structure analyses, all the sequences of PvAMA-1 were analyzed and cut to 1,290 bp according to the fragment of PvAMA-1 isolates from Myanmar (Zhu et al., 2016) through MEGA6 and using Arlequin3.5 to analyze the molecular variance (AMOVA) to evaluate fixation (F ST ) (Excoffier and Lischer, 2010). Meanwhile, the migration rate between YP and Myanmar populations was estimated by Migrate-n version 4.4.3 (Beerli, 2006).

Genetic Diversity of Pvama-1 in Plasmodium vivax Isolates From the CMB Area
Of the 152 Pvama-1 sequences from the CMB area, there were 76 haplotypes, giving an overall haplotype diversity (Hd) of 0.958 ( Table 1). The average number of pairwise nucleotide differences (k) for the entire 1,689 bp in different time periods including the full sampling period (referred to as Total), 2009-2011, and 2014-2016 were 12.191, 12.933, and 10.814, respectively ( Table 1). The number of haplotypes was 76, 54, and 23 for Total, 2009Total, -2011Total, , and 2014Total, -2016 (Table 1) Table 1). Nucleotide diversity (p) of the YP samples from 2009 to 2011 and from 2014 to 2016 was 0.00766 and 0.00640, respectively, and that of the total samples was 0.00722 ( Table 1). Nucleotide diversity (p) of the YP samples in three domains was 0.01701, 0.00482, and 0.00355, respectively, in which the Pvama-1 domain I showed the highest genetic variation. Recently, Kale et al. (2021) reported the genetic diversity of PvAMA-1 in India and confirmed that the high genetic variation was observed in Pvama-1 domain I. p values for 2009-2011 and 2014-2016 were ranging from 0.000 to 0.03771 (350-380 bp) and from 0.000 to 0.02533 (350-380 bp), respectively ( Figure 2A).
In order to assess the neutral evolving of PvAMA-1 and its three domains, Tajima's D test was performed. Tajima's D value of the full PvAMA-1 fragment was 0.205 for Total, 0.321 for 2009-2011, and 0.311 for 2014-2016 (P > 0.1; Table 1). However, the Tajima's D values obtained for the three domains were obviously different ( Table 1)    for 2014-2016 (YN, P < 0.05), suggesting a significant positive selection for PvAMA-1 of P. vivax populations in the CMB area during these times ( Table 1). The LD index (R 2 ) also declined with distance, suggesting that intragenic recombination may also contribute to the PvAMA-1 diversity (Figure 3). Furthermore, we performed the LD test on different time periods and found the 2009-2011 subpopulation under similar pattern for LD SNP pairs with the whole population ( Figure 3). In contrast, the 2014-2016 subpopulation showed more LD SNP pairs than the 2009-2011 subpopulation did, which may suggest less recombination ratio from the founder effect ( Figure 3). Meanwhile, we analyzed the change in the frequency of haplotypes over time, and the result showed that the frequency of haplotypes decreased significantly in 2015 and 2016 samples ( Figure 4A).

Mutations of PvAMA-1 in Isolates From the CMB
A total of 40 amino acid mutation sites were found in YP samples. Among them, 37 and 29 amino acid mutation sites were found in 2009-2011 and 2014-2016 samples, respectively. In the Total samples, the mutation frequencies of 19 mutation sites were less than 10%; for 11 mutation sites, 15%-45%; and for 10 mutation sites, more than 50% ( Table 2). More importantly, the mutation frequencies of D107N/A, R112K/T, K120R, E145A, E277K, and R438H were more than 70% ( Table 2). Some mutation sites with low frequency in the 2009-2011 samples disappeared in the 2014-2016 samples ( Table 2, Table S1). There were also three new mutation sites in the 2014-2016 samples, in which the mutation frequencies of N316T, M319I, and K336R were 0.66%, 7.24%, and 0.66%, respectively (Table 2). Furthermore, there were 19 mutation sites in domain I of PvAMA-1, in which the most mutated amino acid sites were distributed. The number of amino acid mutation sites in domains II and III were seven and three, respectively.
Here, we also analyzed the distribution of these mutation sites of YP samples in other groups (Table S1). In the 37 mutation sites of YP samples, the lowest coincidence rate was found in the Korean samples and the highest coincidence rate in the Iran and India samples (Table S1). At the same time, we also calculated the mutation ratio of these mutation sites of YP samples in the Indian population ( Figure 4B). It showed that the relatively preserved amino acid changes found in the YP PvAMA-1 were well-conserved in India ( Figure 4B showed great differentiation with F ST values of 0.15269 and 0.24836, respectively. The very great differentiation was found between YP isolates and Korea isolates (the value of F ST is 0.3761) ( Table 3). The differentiation between YP isolates and other isolates was moderate ( Table 3).
One recombination event of YP samples was detected by Recombination Detection Program v.4.101, containing a 602-bp fragment, located in 493-1,094 bp of Pvama-1. After removing the recombination block, a total of 889 sequences from the global populations, length 688 bp, were analyzed by the NETWORK software. Network analysis results for PvAMA-1 populations showed that there was an obvious cluster of populations of global samples, except those from Korea and PNG which were partially separated ( Figure 5A). Among them, the haplotype sharing ratio of YP samples was the highest. The YP population has shared haplotypes with all other populations except Brazil. Twenty-seven of the 51 YP haplotypes (52.9%) were shared with other populations, of which 59.3% (16/27), 44.4% (12/27), and 44.4% (12/27) were identical to some of the Pvama-1 haplotypes observed in Thailand, Myanmar, and India populations, respectively. Haplotype 286 is the predominant haplotype in the YP population, with a frequency of 17.1%. Haplotype 17 is shared by most populations, consisting of YP, Myanmar, Thailand, Sri Lanka, Iran, and India populations. The haplotype network, drawn by excluding the 102 singletons from the analysis, showed that clusters from the Asian populations, Oceania, and South American overlapped ( Figure 5A).
Through the analysis by MEGA6, a total of 165 SNPs from the 10 populations (YP, Myanmar, Thailand, South Korea, Papua New Guinea, Sri Lanka, Iran, India, Venezuela, and Brazil) were detected. The results of principal component analysis (PCA) showed similar results with the network analysis, where only the Korea and PNG populations were partially separated from the global samples ( Figures 5B, C).
Furthermore, the structure analysis results of global populations suggested 10 groups with optimized clusters value of K = 7 ( Figure 5D). The results of structure analysis show that the YP and Thailand samples have the most K components (n = 7), followed by Myanmar and India (n = 6), Sri Lanka and Iran (n = 5), and PNG (n = 4) and the least K components (n = 3) in samples from Korea, Venezuela, and Brazil ( Figure 5D). The distribution of    Figure 5D), indicating that the YP population was more abundant than the Myanmar population. In addition, the migration rate between YP and Myanmar populations was estimated by Migrate-n version 4.4.3 (Beerli, 2006). The mean Q values for YP and Myanmar populations were 0.06478 and 0.03100, respectively. The result showed asymmetric gene flow between YP and Myanmar populations, with the number of migrant individuals per generation (Nm) from Myanmar to YP (15.090) being higher than that from YP to Myanmar (11.189). Meanwhile, the 2009Meanwhile, the -2011Meanwhile, the , 2014Meanwhile, the -2016, and Myanmar samples were analyzed by STRUCTURE software using no admixture model. A total of 62 SNPs from the three populations were detected by MEGA6. The structure results suggested three groups among the CMB samples with optimized clusters value of K = 7 ( Figure S1B). The results of the structure analysis for the three populations show that the 2014-2016 samples generally lacked some K components that were commonly found in other samples. The distribution of K5 was particularly prominent as shown in Figure S1B.

DISCUSSION
According to the statistical report of the WHO (2020), the malaria elimination target may not be achieved as expected by 2020. With the bottleneck of drug resistance to P. vivax (Imwong et al., 2007;Lu et al., 2010;Dayananda et al., 2018), new intervention tools and strategies are urgently needed to efficiently control and eliminate vivax malaria. For example, comprehensive research is needed to develop vaccine strategy. Though antigenic variation is one of the major challenges that affects the development of malaria vaccine (Esmaeili Rastaghi et al., 2014), advanced knowledge of parasite antigenic variants is a prerequisite for the rational design of a vaccine that might be efficient in various endemic areas.
In this study, we analyzed the genetic diversity of the PvAMA-1 gene from the border area of China and Myanmar and assessed its genetic variation across different time periods. Through the analysis of the full-length PvAMA-1 gene sequence, the results of Tajima's D test and Fu and Li's tests showed that there was no significant balancing selection in the 2009-2011 and 2014-2016 samples, which suggests that the PvAMA-1 gene was not under selection. However, the Ka/Ks values of PvAMA-1 were positive and statistically significant in all cases of the YP samples, which indicated that the PvAMA-1 of P. vivax populations in the CMB area was under significant positive selection during these times. Meanwhile, the linkage disequilibrium (R 2 ) results showed that there was more long-distance linkage in 2014-2016 than it was in 2009-2011. This indicates that the number of genetic recombinations in 2014-2016 was lower and more ancestor SNPs were retained, which is not consistent with the time periods. The 2014-2016 samples showed more long-distance linkage (Figure 3), which aligns with the growing scarcity of Chinese strains to participate in regional recombination. At the same time, the network analysis showed that the frequency of sharing haplotypes between 2014-2016 and Myanmar samples was higher than that in the 2009-2011 and Myanmar samples. This may indicate that the PvAMA-1 samples from the Chinese side of the China-Myanmar border were closer to the samples from the Myanmar side over time. Furthermore, the structure analysis showed that the 2014-2016 samples lacked some K components that were commonly found in other samples such as in the 2009-2011 samples ( Figure S1B). This may indicate that the control of malaria transmission in China reduced the number and class of P. vivax spread in the CMB region, while the transmission from Myanmar continued. In addition, the movement of sections of the population that includes tourism, traveling and holidays, and border trade business, particularly in those free ports on the border, may also affect the population structure and genetic characteristics of malaria in this region (Cui et al., 2012). The structure analysis of PvAMA-1 in isolates from YP, Myanmar, Thailand, Korea, PNG, Sri Lanka, Iran, India, Venezuela, and Brazil revealed that the population structure in the CMB and Thailand is the most complex and abundant ( Figure 5D). The study by Arnott et al. (2013) reported that diversity was the highest in PNG and Thailand, while it was the lowest in Venezuela. This suggests that there is greater genetic diversity of the PvAMA-1 gene in Southeast Asia, also including the CMB region. Although highly diverse, it was observed that the majority of the YP Pvama-1 haplotypes (49%) are shared with Thailand, Myanmar, and India populations, with moderate F ST values observed between YP, Myanmar, Thailand, and India isolates. This indicate that a YP-PvAMA-1-based multicomponent malaria vaccine may be effective in this entire region. Therefore, it is of great significance to explore the genetic diversity of PvAMA-1 in the CMB to provide instructive insights for the development of effective diagnostics and vaccines for P. vivax.
In a previous study, PvAMA-1 domains I and II covering the polymorphic regions have been shown to be major targets that elicit inhibitory responses (Remarque et al., 2008). In this study, Tajima's D test showed that part of domain I (nt 401-500, Tajima's D: 2.233, P < 0.05) of PvAMA-1 was under positive balancing selection in the YP population. Similar findings have been reported from other studies where highly significant positive values have been consistently observed within domain I of the Myanmar, PNG, Iran, India, and Venezuela populations (Ord et al., 2008;Arnott et al., 2013;Zakeri et al., 2013;Zhu et al., 2016;Kale et al., 2021). Such informative findings suggest this domain a dominant target of host immune responses. Domains II and III showed direction selection in YP samples through some time periods different from those of Myanmar samples, in which all PvAMA-1 domains were balancing selected (Zhu et al., 2016). The domain II of Pvama-1 has been reported as highly immunogenic (Pizarro et al., 2005;Mufalo et al., 2008;Remarque et al., 2008;Gentil et al., 2010). In addition, positive selection within domain II of Pvama-1 populations from Sri Lanka has been observed (Gunasekera et al., 2007). However, in this study, no evidence was found for diversifying selection on domains II and III of Pvama-1 from the CMB area as confirmed by neutrality tests, which is in agreement with previous reports (Ord et al., 2008;Arnott et al., 2013).
Sequences of Pvama-1 from the YP population were compared to the reference sequence (PVX_092275), and 61 SNPs resulting in 40 amino acid substitutions were identified in the YP Pvama-1. Most mutation loci were found in domain I of PvAMA-1 from YP samples, which aligns with the findings in other populations as previously reported (Gunasekera et al., 2007;Ord et al., 2008;Putaporntip et al., 2009;Arnott et al., 2013;Zakeri et al., 2013;Kang et al., 2015;Zhu et al., 2016;Bittencourt et al., 2020;Kale et al., 2021). Meanwhile, the AMA-1 ligand-binding site and a major target of protective immunity have been proven to be a hydrophobic trough composed of domains I and II (Coley et al., 2007). The polymorphic residues 197,200,201,204,and 225 have been proven to be important for PvAMA-1 binding (Coley et al., 2007). However, no mutations were found in these important amino acid loci in the YP samples used in this study.
In this study, the mutation frequencies of 10 mutation sites were more than 50% ( Table 2). Most importantly, the six tightly preserved amino acid changes (D107, R112, K120, E145, E277, and R438), which are the most outstanding characteristics found in the YP PvAMA-1 analyzed in this study, were well-preserved in all the populations except that from Korea, which revealed some inconsistent amino acid sites (Kang et al., 2015;Bittencourt et al., 2020). Meanwhile, when compared with other populations, PvAMA-1 of YP had more high-frequency mutation sites. These results also suggest that PvAMA-1 from the CMB area showed different patterns of polymorphic nature compared with those from other geographical areas, and more abundant genetic diversity was observed among isolates globally (Arnott et al., 2013;Kang et al., 2015;Zhu et al., 2016).
Some of the SNPs identified in P. vivax isolates globally, including E145K, P210S, R249H, G253E, K352E, R438H, and N445D, overlap with the B-cell epitope regions. These amino acid changes may affect the protein structure by causing changes in charge and polarity of the protein and might help parasites to escape from host immunity (Anders et al., 1998). In this study, except the R249H, other mutations were all observed ( Table 2) in PvAMA-1 which contained more immune escape-related mutations than those of PvAMA-1 from other reported populations (Kang et al., 2015). This indicates that P. vivax in the CMB area is under strong immune pressure.
Collectively, we found a high diversity and a complex population structure of PvAMA-1 in the CMB region of Yunnan Province, China, in comparison with the PvAMA-1 from other populations. The study revealed the unique genetic diversity of Pvama-1 in the CMB area, which is an instructive finding for the development of extensive and effective malaria vaccines.

CONCLUSION
This study provides the first in-depth understanding of the genetic diversity of PvAMA-1 from different time periods in the CMB. PvAMA-1 domain I is the dominant target of positive diversifying selection. Meanwhile, the majority of the YP Pvama-1 haplotypes, shared with Thailand, Myanmar, and India populations, indicate the possibility of a YP-PvAMA-1-based multicomponent malaria vaccine with an effect on this entire region. These results suggest PvAMA-1 a dominant target of host immune selection and a potential vaccine target.

DATA AVAILABILITY STATEMENT
All materials and data supporting these findings are contained within the manuscript and supplementary figure and table. The sequences have been deposited in the GenBank database under the accession numbers OK605600 -OK605751 for the China-Myanmar border isolates in Yunnan Province of China.

ETHICS STATEMENT
This study was conducted according to the principles expressed in the Declaration of Helsinki. Before blood collection, the study protocol, potential risks and benefits were explained to the participants, and written informed consent from all adult participants and from the parents, or legal guardians of children. Blood was collected following institutional ethical guidelines reviewed and approved by the ethics committee at National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (no. 20120826).