Genome-Wide Analysis of the Malaria Parasite Plasmodium falciparum Isolates From Togo Reveals Selective Signals in Immune Selection-Related Antigen Genes

Malaria is a public health concern worldwide, and Togo is no exception. Approaches that yield biological insights for disease elimination are therefore a research priority. Local selection on malaria pathogens is driven by multiple factors, including host immunity. We undertook a genome-wide analysis of sequence variation in a sample of 10 Plasmodium falciparum (Pf) clinical isolates from Togo to identify local-specific signals of selection. Paired-end short-read sequences were mapped and aligned onto > 95% of the 3D7 Pf reference genome sequence at high fold coverage. Data on 266 963 single nucleotide polymorphisms were obtained, with average nucleotide diversity π = 1.79 × 10⁻³. Both principal component and neighbor-joining tree analyses showed that the Togo parasites clustered according to their geographic (Africa) origin. In addition, the average genome-wide diversity of Pf from Togo was much higher than that from other African samples. Tajima’s D value of the Togo isolates was −0.56, suggesting directional selection and/or recent population expansion. Against this background, within-population analyses identifying loci under balancing and recent positive selection indicated that host immunity has been the major selective agent. Importantly, 87 and 296 parasite antigen genes with Tajima’s D values > 1 and in the top 1% haplotype scores, respectively, include a significant representation of membrane proteins at the merozoite stage that invades red blood cells (RBCs) and parasitized RBC surface proteins that play roles in immunoevasion, adhesion, or rosetting. This is consistent with expectations that elevated signals of selection due to allele-specific acquired immunity are likely to operate on antigenic targets.
Collectively, our data suggest a recent expansion of Pf population in Togo and evidence strong host immune selection on membrane/surface antigens reflected in signals of balancing/positive selection of important gene loci. Findings from this study provide a fundamental basis to engage studies for effective malaria control in Togo.

INTRODUCTION
Malaria clinical presentation ensues when Plasmodium parasites invade and destroy red blood cells (RBCs). Fever and chills occur at the time of rupture of infected RBCs (iRBCs) containing merozoites that are freed to invade uninfected RBCs (1,2). Failure to receive prompt treatment may lead to dyserythropoietic anaemia or severe malaria. P. falciparum (Pf) is the most dangerous malaria parasite because of the high level of mortality with which it is associated, its widespread resistance to antimalarial medicines, and its dominance in the world's most malarious continent, Africa (3-5).
In Togo, malaria transmission occurs for most of each year. Although decades of control efforts have reduced the disease burden, the entire country's population is still at risk of falciparum malaria infection (6). In addition, challenges in parasite control have kept the infection a public health concern and may aggravate the difficulty of treatment. The clinical spectrum of malaria in Togo usually ranges from asymptomatic carriage of malaria parasites to a febrile disease that may evolve into a severe, life-threatening illness, making the infection a major cause of morbidity and mortality, especially in children (7,8). Antimalarial drug resistance (e.g., parasite resistance to chloroquine or pyrimethamine) has been experienced across Africa. In early investigations in Togo, clinical and parasitological therapeutic failure rates of 3% and 3.8% were observed for artemether-lumefantrine (AL) and artesunate-amodiaquine (ASAQ), respectively (6), which drew the entire country's attention to possible emerging resistance to artemisinins. However, a recent study showed therapeutic efficacy of AL and ASAQ, with no delay in the clearance of mutant parasites (9). Pf surface-exposed proteins are targets of host immune responses, and repeated exposure to the parasite in endemic areas induces a slow and gradual development of acquired immunity to clinical malaria, which is usually evidenced as a decline in the prevalence of clinical episodes (10,11). Hence, acquiring information on both immunity-related antigens and drug resistance genes, to support effective interventions that sustain and advance the fight against the malaria parasite in Togo, is a research priority.
Complete sequencing of the Pf genome has boosted postgenomic studies of malaria (12). It provides fundamental knowledge for better understanding of the cellular and molecular mechanisms of infection and immunity to develop new control methods, including new drugs and vaccines, improved diagnostics, and effective vector control techniques.
With the rapid development of sequencing technologies (13), genomic data from hundreds of falciparum isolates worldwide have been investigated and shared by large collaborative initiatives such as the MalariaGEN Pf Community Project and the Pf3k Consortium. Genomic analyses of high-density single nucleotide polymorphisms (SNPs) derived from whole-genome variation in the parasite have mostly focused on vaccine antigen genes and drug resistance genes. However, to date, nothing is known about the genomes of malaria isolates in Togo, which could limit joint research with other endemic areas in sub-Saharan Africa.
In this study, we performed the first whole-genome sequencing (WGS) of Pf clinical isolates from Togo. With the aim of accelerating the pursuit of effective malaria control, we applied genomic approaches to high-density SNPs derived from whole-genome variation to provide biological insights on target genes, especially those under host immune selection.

Sampling Sites and Ethics Statement
Malaria transmission in Togo occurs for most of each year with seasonal outbreaks (9), and populations are served by health facilities experienced in the management of malaria cases. For this study, clinical samples were collected at health centres in urban areas of Agou-Gadzépé (7°28'01'' N; 1°55'01'' E) and Atakpamé (7°52'87'' N; 1°13'05'' E) in Agou and Ogou prefectures, respectively, in the Plateaux Region (Figure 1).

Sampling of Malaria Parasites and Extraction of Genomic DNA
Subjects naturally exposed to malaria who received a parasitological diagnosis by Giemsa-stained thick blood smear microscopy at 1000× magnification were referred to our study. Whole blood specimens from subjects diagnosed with Pf asexual parasitaemia (parasites counted per 200 leukocytes, and parasite density calculated as the number of parasites per microliter by assuming a fixed leukocyte count of 8000 cells/µL of blood) were sampled as dried blood spots (DBSs) on Whatman FTA cards (GE Healthcare) as recommended by the manufacturer. Genomic DNA was extracted from DBSs using the QIAGEN DNeasy Blood & Tissue Kit (Qiagen) according to the manufacturer's instructions, and monospecies infection was confirmed by polymerase chain reaction (PCR). Ten clinical samples with high parasitaemia (parasite density > 50000/µL) and of sufficient DNA quality and quantity were selected to ensure the integrity of sequencing.
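The parasite density calculation described above can be illustrated with a minimal sketch. It assumes the conventional thick-smear formula (parasites counted, multiplied by the assumed leukocyte count per microliter, divided by the leukocytes counted); the function name and the example count of 1500 parasites are hypothetical and are not taken from the study data.

```python
def parasite_density_per_ul(parasites_counted, leukocytes_counted=200,
                            assumed_wbc_per_ul=8000):
    """Estimate parasites per microliter from a thick-smear count.

    Assumption: a fixed leukocyte count (default 8000 per microliter)
    stands in for the patient's true white-cell count.
    """
    if leukocytes_counted <= 0:
        raise ValueError("leukocytes_counted must be positive")
    return parasites_counted * assumed_wbc_per_ul / leukocytes_counted


# Hypothetical smear: 1500 parasites seen while counting 200 leukocytes.
density = parasite_density_per_ul(1500)
print(density)  # 60000.0, which would exceed a 50000 per microliter cutoff
```

Because the leukocyte count is fixed rather than measured, the result is an estimate; a sample near the selection threshold could fall on either side of it if the true count differs.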

Whole-Genome Sequencing
WGS of Pf clinical isolates from Togo was performed by OE Biotech (Shanghai). Extracted genomic DNA was sheared into 150 bp fragments using a Covaris instrument. The fragmented DNA molecules were used to construct Illumina sequencing libraries with the TruSeq DNA LT Sample Prep Kit (Illumina). All libraries were sequenced on the Illumina HiSeq X10 platform according to the manufacturer's protocol (18), using the direct sequencing approach described previously (17). Reads were filtered with Trimmomatic-3.0 (19) to remove adapter sequences and low-quality sequences. The sequencing reads have been submitted to the Short Read Archive of the National Centre for Biotechnology Information.

Identification of SNPs and Population Structure
All sequenced reads from the 10 samples were mapped to the Pf 3D7 genome using Burrows-Wheeler Aligner and Sequence Alignment/Map tools (SAMtools-1.3) (20). Samples in which reads covered < 95% of the 3D7 reference genome were removed. For high-quality SNP calling, sequencing reads were genotyped using an in-house pipeline based on GATK best practices and SnpEff workflows (21), with Pf3K known sites (15).
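As an illustration of the sample-level filter, the sketch below computes breadth of coverage from per-position read depths and keeps samples at or above a 95% threshold. It assumes that the coverage criterion refers to the fraction of reference positions covered by at least one read; the sample names and depth vectors are hypothetical, and in practice the depths would come from a tool such as SAMtools rather than from in-memory lists.

```python
def breadth_of_coverage(depths, min_depth=1):
    """Fraction of reference positions with depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


def passing_samples(depth_by_sample, threshold=0.95):
    """Return the names of samples whose breadth meets the threshold."""
    return [name for name, depths in depth_by_sample.items()
            if breadth_of_coverage(depths) >= threshold]


# Hypothetical toy reference of 100 positions.
toy = {
    "sample_A": [12] * 96 + [0] * 4,   # breadth 0.96, kept
    "sample_B": [9] * 50 + [0] * 50,   # breadth 0.50, removed
}
print(passing_samples(toy))  # ['sample_A']
```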
Principal component analysis (PCA) and neighbor-joining analysis were performed to investigate major geographical divisions in population structure. PCA and the neighbor-joining tree of all samples were generated with SPSS-Ver25 and Mega-Ver6.0, respectively, to compare Pf SNPs from the Togo isolates with those from 62 isolates collected worldwide (15-17).

Tests for Signatures of Selection
For SNPs in all populations, nucleotide diversity (π), an estimator of the population mutation rate, was computed genome-wide in 4 kb sliding windows with a 2 kb step across each chromosome in Arlequin-Ver3.5 (22). To distinguish genes evolving neutrally from those under selective pressure or genetic hitchhiking, Tajima's D value (TD) was also calculated for each sliding window and the corresponding gene.
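To make the window statistics concrete, the following is a minimal sketch of nucleotide diversity and Tajima's D computed from per-site allele counts in 4 kb windows with a 2 kb step. It follows the standard Tajima (1989) formulas and is not Arlequin's implementation; the function names, the 0-based position convention, and the toy inputs are assumptions made for illustration.

```python
import math


def pairwise_diversity(allele_counts, n):
    """Mean pairwise differences summed over biallelic sites.

    allele_counts: copies of one allele at each site, among n sequences.
    """
    return sum(2 * k * (n - k) / (n * (n - 1)) for k in allele_counts)


def tajimas_d(allele_counts, n):
    """Tajima's D; returns None when it is undefined (n < 4 or no SNPs)."""
    segregating = [k for k in allele_counts if 0 < k < n]
    s = len(segregating)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = pairwise_diversity(segregating, n)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


def sliding_windows(positions, allele_counts, n, chrom_length,
                    size=4000, step=2000):
    """Yield (start, end, per-bp pi, Tajima's D) for each window."""
    results = []
    for start in range(0, max(chrom_length - size, 0) + 1, step):
        end = start + size
        counts = [k for pos, k in zip(positions, allele_counts)
                  if start <= pos < end]
        results.append((start, end,
                        pairwise_diversity(counts, n) / size,
                        tajimas_d(counts, n)))
    return results
```

In this sketch an excess of rare alleles (for example, all singletons) gives a negative D, whereas an excess of intermediate-frequency alleles gives a positive D, which is the pattern the TD > 1 threshold is meant to capture.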
In addition, the integrated haplotype score (iHS), a long-range haplotype diversity approach, was employed to identify genes under recent positive selection. iHS compares integrated extended haplotype homozygosity (EHH) values between alleles at a given SNP (23). iHS was computed for the Togo clinical isolates by tracking the decay of haplotype homozygosity for both the ancestral and derived haplotypes extending from every SNP site (24). For this test, we restricted the analyses to SNPs with inferred ancestral states and minor allele frequencies equal to or higher than 5% (25). iHS scores were estimated using Selscan-Ver1.10a (26).
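The idea behind the statistic can be sketched in a few lines: EHH is the probability that two haplotypes carrying the same core allele are identical from the core SNP out to a given marker, and the integrated value is the area under that decay curve. The code below is a simplified illustration and not Selscan's implementation; the 0/1 allele coding, the trapezoid integration, the 0.05 cutoff, and the toy haplotypes are assumptions, and the frequency-bin standardization that turns the log ratio into iHS is omitted.

```python
from collections import Counter
from math import comb, log


def ehh(haplotypes, core, marker, allele):
    """EHH between the core SNP index and a marker index for one core allele."""
    carriers = [h for h in haplotypes if h[core] == allele]
    if len(carriers) < 2:
        return 0.0
    lo, hi = sorted((core, marker))
    classes = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    return sum(comb(c, 2) for c in classes.values()) / comb(len(carriers), 2)


def integrated_ehh(haplotypes, positions, core, allele, cutoff=0.05):
    """Area under the EHH curve in both directions (trapezoid rule)."""
    total = 0.0
    for direction in (-1, 1):
        prev_pos, prev_val = positions[core], 1.0
        i = core + direction
        while 0 <= i < len(positions):
            val = ehh(haplotypes, core, i, allele)
            total += abs(positions[i] - prev_pos) * (prev_val + val) / 2
            if val < cutoff:
                break
            prev_pos, prev_val = positions[i], val
            i += direction
    return total


# Hypothetical toy data: five haplotypes, three SNPs, core SNP at index 1.
haps = [[0, 1, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
pos = [0, 100, 200]
ihh_1 = integrated_ehh(haps, pos, core=1, allele=1)
ihh_0 = integrated_ehh(haps, pos, core=1, allele=0)
unstandardized = log(ihh_1 / ihh_0)  # the sign convention varies between tools
```

In the toy data, haplotypes carrying allele 1 at the core stay identical over a longer distance than those carrying allele 0, so the log ratio is positive, the kind of extended-haplotype signal that a high |iHS| reflects.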
To assess whether genes associated with putative functions were enriched among the genes with high Tajima's D values (> 1.0) or high |iHS| (top 1% of scores), gene ontology (GO) term analysis was conducted. Genes with TD > 1.0 were classed as genes of potential interest for GO analysis. The analysis was performed using the GO Enrichment tool of PlasmoDB (http://plasmodb.org/plasmo/, PlasmoDB Ver-46). Adjusted P values were generated from Fisher's exact test, and statistical significance was set at P < 0.05.
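The enrichment test can be illustrated with a one-sided Fisher's exact test, which for a 2 × 2 table reduces to the upper tail of the hypergeometric distribution. This is a sketch of the general technique and not PlasmoDB's implementation; the function name and all counts in the example are hypothetical, and the multiple-testing adjustment is left out because the adjustment method is not stated in the text.

```python
from math import comb


def enrichment_p_value(hits, selected, annotated, background):
    """One-sided Fisher's exact P value for GO term enrichment.

    hits:       selected genes that carry the GO term
    selected:   genes in the candidate list (e.g. TD > 1 or top 1% |iHS|)
    annotated:  background genes that carry the GO term
    background: all genes considered
    Returns P(X >= hits) under the hypergeometric null.
    """
    upper = min(selected, annotated)
    tail = sum(comb(annotated, i) * comb(background - annotated, selected - i)
               for i in range(hits, upper + 1))
    return tail / comb(background, selected)


# Hypothetical counts: 12 of 87 candidate genes carry a term that
# annotates 150 of 5000 background genes.
p = enrichment_p_value(hits=12, selected=87, annotated=150, background=5000)
print(p < 0.05)
```

Exact integer arithmetic with `math.comb` avoids the rounding problems of factorial-based formulas at genome-scale counts, at the cost of speed for very large tables.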

Genetic Diversity of Falciparum Isolates From Togo
We used a direct sequencing approach, which requires only high parasitaemia and no leukocyte filtration (17), to sequence the genomes of Pf clinical isolates from Togo. Of the 10 clinical samples sequenced, seven provided sufficient coverage (> 95% sequences mapping over the 3D7 reference genome) (Table 1). The remaining three samples mapped onto only 58.89%, 56.43%, and 46.83% of the reference (data not shown) and were excluded from further analysis. Overall, each Togo isolate generated between 55 and 176 M paired-end reads of 150 bp. All sequencing reads have been deposited in the National Centre for Biotechnology Information (NCBI) Short Read Archive (BioProject Accession Number: PRJNA616298). A variable proportion of reads (3.8-14.2%) from the isolates mapped to the reference and aligned onto at least 95% of the reference 3D7 strain genome at high fold coverage (7.2-33.9x).
For analysis of polymorphism, a total of 266963 SNPs at common loci were available after quality filtering (Table 1). The list of SNPs for all the isolates is provided in Supplementary Table 1. Of the 266963 SNPs, after excluding low-frequency SNPs (103497 SNPs with minor allele frequency < 5%), a total of 163466 SNPs across the seven isolates were identified and could be mapped to coding sequences. In addition, SNPs were identified across 4614 genes on 14 chromosomes in the samples, and 931 genes had more than five SNPs (Figure 2A). These genes were considered informative for comparisons of polymorphic nucleotide sites.

Comparison of Genetic Diversity of the Isolates Among Different Endemic Regions
Overall genome-wide π of Pf clinical isolates from Togo was estimated at 1.79 × 10⁻³. Genetic diversity was lower in intronic regions but higher in exonic and intergenic regions (Supplementary Table 1). Supplementary Figure 1 shows the π map of the isolates across the 14 chromosomes. Interestingly, we observed that genes in the Togo samples carry more SNPs, suggesting greater genetic diversity than that reported from other African samples (π = 1.03 × 10⁻³) (27), but lower than that of isolates from CMB (π = 2.87 × 10⁻²) (17).
We then performed PCA and neighbor-joining analyses of all strains to assess major geographical differences. Within the African group, the Togo isolates showed greater divergence than the 3D7 strain genome. The neighbor-joining tree displayed two distinct branches separating two major clades that correspond to the Asia and Africa geographical groups of samples (Figure 2B). The isolates from the two regions were clearly distinct, and the African isolates formed sub-clusters of two (or three) monophyletic clades. The outcome of the PCA was similar to that of the neighbor-joining analysis. The major axis of differentiation (F1) of the PCA clearly distinguished the two major Asia and Africa groups of isolates, in accordance with their geographical origins (Figure 2C). A similar observation was noted in recent studies on Pf isolates from CMB (17,28). In addition, among the African samples, the Togo samples exhibited greater genetic diversity than has been reported from other African regions. The second and third principal components (F2 and F3) defined a distinct South-Asian cluster and better separated the African samples according to their locations, with the Togo samples well differentiated from other African samples (Figure 2D). Furthermore, the Togo isolates were widely separated in our PCA result, suggesting high diversity of Pf from Togo.

DISCUSSION
P. falciparum originated in Africa and spread to other continents as human migration gradually formed new populations (29). In this study, both the PCA and neighbor-joining tree analyses showed that the parasites derived from Togo clustered according to their geographic origin and distinguished two major clades that correspond to the Asia and Africa geographical groups of samples (17,38). In addition, our data revealed that the average nucleotide diversity of Pf from Togo is much higher than that from other African samples but lower than that of parasites from the CMB, probably due to the historically different antimalarial drugs used in that area (17). However, locally varying selection on pathogens due to differences in host immunity may be the major factor behind the high nucleotide diversity observed in Togo isolates in comparison to other African isolates. The purpose of the Tajima test is to detect deviation from neutrality, in other words, to indicate processes such as balancing selection, selective sweeps, and population expansion. This study revealed that some antigen genes that are related to RBC invasion and disease severity, and known to be polymorphic and under balancing selection by the host immune system (31,39), had TD < 0, suggesting a selective sweep (directional selection) and/or recent population expansion. Interestingly, previous scans for evidence of positive selection on Pf have clearly identified loci that have undergone selective sweeps (38,49,50) as well as loci that are apparently under balancing selection, including those encoding targets of acquired immunity (31). In addition, other investigations have observed multiple genes under recent positive selection by computation of iHS in other parasite populations (39,40,51,52). Therefore, we applied iHS here as a complementary analysis to assess signals of host immune selection.
In Pf isolates from Togo, host immunity appears to have been the major selective agent among genes showing signals of recent positive selection. Among the top outlier genes (top 1% |iHS| as a strong-hit threshold, followed by GO enrichment analysis), 31 of the 306 genes with known functions included six RBC invasion-linked antigen genes (msp1, msp7, mspdbl1, mspdbl2, ra, and sera6) (32, 41) and 22 antigen genes (six rifs, seven vars, and eight stevors) that are associated with roles in evasion of host immunity, rosetting, or cytoadhesion (35-37), among which is var2csa, a gene related to pregnancy-associated (placental) malaria (34,47) (Table 3). GO analysis of the genes of potential interest under balancing selection by the host immune system revealed six genes related to RBC invasion (aarp, flp, msp3, msp7, pl, and sera5) (32), one var (PF3D7_0302300) associated with pathogenesis (GO: 0009405), and phistb rlp1, which is implicated in placental cytoadherence to microvasculature (47).
Interestingly, we found that most of the gene family members with elevated |iHS| are located close to each other on the chromosome. For example, of three sera genes that are contiguously arranged on chromosome two, sera6 was in the top 1% of SNP loci (|iHS| = 2.61685) and the other two were included in the top 5% |iHS| list. The same pattern was observed on chromosome two for msp4, in the top 1% |iHS| (|iHS| = 2.83572), and msp2, in the top 5% |iHS| (|iHS| = 2.08998). As similarly observed for eight serine-repeat antigen genes in P. vivax isolates (25), this could be explained by positive natural selection increasing the prevalence of both the selected variant and nearby variants, generating local regions of extended haplotypes.
We identified genes that are likely to have been under exceptionally strong recent positive selection. Given that these genes encode membrane/surface proteins, they would have been under strong selection as potential targets of host immunity, which may explain the high iHS scores that we observed (39,41,42). For example, the highly elevated |iHS| associated with the gene encoding the MSP1 antigen was consistent with a previous report on Pf isolates from Gambia and Guinea, as this gene has a complex pattern of polymorphism that is likely to result from different selective processes (38). MSP1, a core member of the band 3 co-ligand complex during RBC invasion (32), has been validated as one of the leading blood-stage malaria vaccine antigens, with sequences incorporated in experimental vaccine trials (41). In addition, highly supported windows of elevated iHS scores were also observed on chromosomes two and 10, incorporating sera6 and a cluster of different antigen genes (including ra, mspdbl1, and mspdbl2), respectively. Similarly, genes under strong positive selection in the Togo isolates include those encoding known surface antigens such as vars (PF3D7_1100200, PF3D7_0425800, PF3D7_1300300, and PF3D7_0400400) and promising targets of immunity that require further study [members of the rif (PF3D7_0223100, PF3D7_1400600, and PF3D7_0100200) and stevor (PF3D7_1040200, PF3D7_0631900, PF3D7_1300900, and PF3D7_0832600) families]. They are known to bind to cerebral endothelial/RBC surface receptors and have been identified or reported previously as immune targets that may serve to prevent severe malaria (43-45, 53, 54).
This analysis failed to detect selection signals for some important antigen genes such as lsa3, ama1, msp2, msp3, eba175, or circumsporozoite protein, csp [which have entered vaccine development (39,41)], and rif (PF3D7_0100400, PF3D7_0401600, or PF3D7_1254800), vars (PF3D7_1150400, PF3D7_0533100, or PF3D7_0412700), or stevors (PF3D7_1254100 or PF3D7_0300400), to mention a few (43,44,53,55), which have been identified or validated as targets of acquired immunity for vaccine development (39,42). One reason could be that iHS may not be suitable for detecting positive selection on SNPs that have reached fixation in a local population (28). Another possible explanation is that these genes may be less targeted by host immunity in Togo subjects, given that malaria transmission intensity and parasite genetic diversity are known to vary greatly among different parts of Africa due to variation in rainfall abundance and seasonality (39). Immunological investigations using larger numbers of samples are needed in the future.
Positively skewed allele frequency distributions indicating the operation of balancing selection on Pf genes in other parasite populations have been reported (31,38,39,56). In this study, phistb rlp1, which encodes PHISTb domain-containing RESA-like protein 1 at the surface of iRBCs and was reported previously as most likely under balancing selection (31,38), was also identified. It interacts with VAR2CSA and modulates knob-associated heat-shock protein 40 expression on the iRBC surface, and thus may regulate VAR2CSA expression to confer stable chondroitin sulfate A binding capacity and the parasite's cytoadherence (47). The var2csa gene was also detected among genes under strong positive selection in the Togo isolates. It encodes a particular parasite adhesion molecule (PfEMP1) expressed on the surface of iRBCs that mediates sequestration of Pf-iRBCs in the placenta through binding to host receptors such as chondroitin sulfate A. Signals of strong balancing selection were evident in a similar subset of genes in Togo and other West African isolates. This is consistent with expectations that balancing selection due to allele frequency-dependent acquired immune responses is likely to operate on antigenic targets in Togo subjects (38). Such evidence could lead to studies toward a vaccine that induces antibodies to prevent placental adhesion/sequestration, thereby reducing the maternal anaemia and infant deaths that are associated with malaria in pregnancy (34,39). Furthermore, we found high |iHS| for two particularly important antigen genes (msp7 and phistb rlp1), although they appear to be under balancing selection. The msp7 gene, in association with msp1, is important in invasion of mature RBCs and has been reported as a potential target of acquired immunity (32). As similarly observed for the csp gene in P. knowlesi isolates (30), these genes could be targets of both balancing and directional selection due to their location within an elevated window of haplotype homozygosity on chromosomes, or might have hitchhiked to intermediate allele frequencies with a linked locus under selection within population-specific isolates.
Of the eight Pf drug resistance genes identified within elevated iHS regions in Togo samples, none were among the five known drug resistance genes (crt, mdr1, dhfr, dhps, and k13), suggesting that the Togo parasite population is not under strong antimalarial drug selection. This is consistent with a recent study in Togo that showed therapeutic efficacy of AL and ASAQ without delay in the clearance of mutant parasites (9). However, GO analysis showed that the drug resistance genes that we identified within the top 1% |iHS| (abcI3 and apiap2) or with TD > 1 (aat1 and fpps/ggpps) were highly significantly (P < 0.001) enriched. In addition, our study suggested additional drug resistance genes under strong positive selection (Supplementary Table 8), which have been reported previously (48,49).

CONCLUSION
This study assessed the first whole-genome sequences of Pf isolates from Togo. Our results showed that the parasites derived from Togo clustered according to their geographic origin and suggest greater genetic diversity of Pf isolates in Togo than seen in other African countries. In addition, Tajima's D values were predominantly negative, consistent with directional selection and/or a history of recent expansion of the Pf population in Togo. Against this background, there was evidence of balancing and positive selection on particular genes. Loci showing evidence of recent positive selection and balancing selection indicate that host immunity has been the major selective agent. This is reflected in a significant representation of genes that encode membrane proteins expressed at the merozoite stage that invades RBCs and parasitized RBC surface proteins implicated in immunoevasion, rosetting, or cytoadhesion. Our study contributes insightful information on the current epidemiological scenario of malaria in Togo and provides a fundamental basis for studies toward effective malaria control in Togo.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/, PRJNA616298.

ETHICS STATEMENT
Permission was obtained from all malaria subjects before collecting specimens. Blood collection was made with informed consent from all individuals or their parents, under a study   . 18490741100). The sponsor played no roles in the study design or in the collection, analysis, or interpretation of the data, in writing the report, or in the decision to submit the article for publication.