Application of 23 Novel Serological Markers for Identifying Recent Exposure to Plasmodium vivax Parasites in an Endemic Population of Western Thailand

Thailand is aiming for malaria elimination by the year 2030. However, the high proportion of asymptomatic infections and the presence of the hidden hypnozoite stage of Plasmodium vivax are impeding these efforts. We hypothesized that a validated surveillance tool utilizing serological markers of recent exposure to P. vivax infection could help to identify areas of ongoing transmission. The objective of this exploratory study was to assess the ability of P. vivax serological exposure markers to detect residual transmission “hot-spots” in Western Thailand. Total IgG levels were measured against a panel of 23 candidate P. vivax serological exposure markers using a multiplexed bead-based assay. A total of 4,255 plasma samples from a cross-sectional survey conducted in 2012 of endemic areas in the Kanchanaburi and Ratchaburi provinces were assayed. We compared IgG levels with multiple epidemiological factors that are associated with an increased risk of P. vivax infection in Thailand, including age, gender, and spatial location, as well as Plasmodium infection status itself. IgG levels to all proteins were significantly higher in the presence of a P. vivax infection (n = 144) (T-test, p < 0.0001). Overall seropositivity rates varied from 2.5% (PVX_097625, merozoite surface protein 8) to 16.8% (PVX_082670, merozoite surface protein 7), with 43% of individuals seropositive to at least 1 protein. Higher IgG levels were associated with older age (>18 years, p < 0.05) and males (17/23 proteins, p < 0.05), supporting the paradigm that men have a higher risk of infection than females in this setting. We used a Random Forests algorithm to predict which individuals had exposure to P. vivax parasites in the last 9-months, based on their IgG antibody levels to a panel of eight previously validated P. vivax proteins. Spatial clustering was observed at the village and regional level, with a moderate correlation between PCR prevalence and sero-prevalence as predicted by the algorithm. Our data provides proof-of-concept for application of such surrogate markers as evidence of recent exposure in low transmission areas. These data can be used to better identify geographical areas with asymptomatic infection burdens that can be targeted in elimination campaigns.

Thailand is aiming for malaria elimination by the year 2030. However, the high proportion of asymptomatic infections and the presence of the hidden hypnozoite stage of Plasmodium vivax are impeding these efforts. We hypothesized that a validated surveillance tool utilizing serological markers of recent exposure to P. vivax infection could help to identify areas of ongoing transmission. The objective of this exploratory study was to assess the ability of P. vivax serological exposure markers to detect residual transmission "hot-spots" in Western Thailand. Total IgG levels were measured against a panel of 23 candidate P. vivax serological exposure markers using a multiplexed beadbased assay. A total of 4,255 plasma samples from a cross-sectional survey conducted in 2012 of endemic areas in the Kanchanaburi and Ratchaburi provinces were assayed. We compared IgG levels with multiple epidemiological factors that are associated with an increased risk of P. vivax infection in Thailand, including age, gender, and spatial location, as well as Plasmodium infection status itself. IgG levels to all proteins were significantly higher in the presence of a P. vivax infection (n = 144) (T-test, p < 0.0001). Overall seropositivity rates varied from 2.5% (PVX_097625, merozoite surface protein 8) to 16.8% (PVX_082670, merozoite surface protein 7), with 43% of individuals seropositive to at least 1 protein. Higher IgG levels were associated with older age (>18 years, p < 0.05) and males (17/23 proteins, p < 0.05), supporting the paradigm that men have a higher risk of infection than females in this setting. We used a Random Forests algorithm to predict which individuals had exposure to P. vivax parasites in the

INTRODUCTION
Thailand is a region of low malaria transmission, with 5,432 reported confirmed clinical cases in 2019 (data obtained from the Bureau of Vector Borne Diseases Control, Ministry of Public Health, Thailand). As malaria transmission in Thailand has declined, a change in infection dynamics has occurred with a higher proportion of infections now due to Plasmodium vivax. Another notable feature is that a very high proportion of infections are asymptomatic (Nguitragool et al., 2019), suggesting the number of cases reported by the World Health Organization is likely an underestimate. This mirrors changes that have occurred in other low-moderate transmission countries, such as the Solomon Islands (Quah et al., 2019), Myanmar (Liu et al., 2019), and Laos (Lover et al., 2018), where P. vivax is now the dominant species and a high proportion of infections are asymptomatic.
To achieve malaria elimination, national malaria control programs now need to switch their primary focus from case management to interruption of transmission. This requires monitoring and surveillance activities to detect infections, rather than just morbidity and mortality. This includes identifying residual transmission foci that can be specifically targeted for elimination. Current surveillance tools, such as microscopy and rapid diagnostic tests, are not well suited for identifying lowdensity, asymptomatic infections. An additional major challenge for P. vivax elimination is the presence of an arrested liverstage, known as hypnozoites, which can reactivate weeks to months after the primary infection to cause relapsing infections. Relapsing infections can be responsible for over 80% of all bloodstage infections (Robinson et al., 2015;Taylor et al., 2019), thus a major gap in our diagnostic/surveillance toolkit for malaria is the inability to identify individuals that harbor hypnozoites in the absence of concurrent blood-stage infections.
Serology has been proposed as a useful means of surveillance for P. vivax, as antibodies are possible to measure in point-ofcontact tests and can reflect recent exposure not just current infections. Multiple studies have reported on the use of serology for identifying long-term changes in Plasmodium transmission patterns, but the application of serology as an intervention requires more actionable information. In order to address these limitations of current surveillance tools we have recently identified and validated a panel of novel serological exposure markers for classifying individuals as recently exposed to P. vivax (Longley et al., 2020). We demonstrated that 57/60 tested P. vivax proteins were able to classify individuals as recently exposed within the past 9 months, with various levels of accuracy. Using a combination of IgG levels to eight specific P. vivax proteins resulted in the highest level of accuracy (sensitivity and specificity of ∼80%), better than any single protein alone. Due to the biology of the P. vivax parasite, essentially all individuals harboring hypnozoites will have had a blood-stage infection within the past 9-months (White, 2011;Battle et al., 2014). Therefore, our hypothesis is that serological exposure markers can indirectly identify individuals with P. vivax hypnozoites; in support of this hypothesis, we demonstrated that individuals who were classified as seropositive by our algorithm went on to have recurrent P. vivax infections at much higher rates than those who were sero-negative (Longley et al., 2020). Our serological exposure markers that can identify recent exposure within the past 9months with ∼80% sensitivity and specificity thus provide a novel and potentially highly informative method for identifying regions of residual transmission and individuals at the highest risk of recurrent infections.
The objective of our current study was to assess the applicability of a panel of 23 candidate P. vivax serological exposure markers for detecting areas of residual transmission and individuals at greater risk of infection within an endemic-area of western Thailand. The 23 candidate P. vivax serological markers were selected based on our prior work (Longley et al., 2020), as they had the best accuracy for classifying individuals as recently exposed or not. We aimed to further assess these markers by measuring the association between IgG levels to the 23 P. vivax proteins with various epidemiological factors associated with increased risk of P. vivax infection, and by applying the algorithm we previously developed using IgG levels to a subset of eight P. vivax proteins for predicting recent exposure of the individuals in the past 9-months.

Study Population
We used plasma samples collected from a cross-sectional survey conducted in September 2012 at three sites of the Kanchanaburi and Ratchaburi provinces in western Thailand (Bongti, Kong Mong Tha, and Suan Phueng), covering a 250 km central span of the Thai Myanmar border (Nguitragool et al., 2017). This cross-sectional survey was conducted toward the end of a period of rapidly declining malaria transmission in Thailand (Chu et al., 2020). Briefly, 4,309 individuals were enrolled and a fingerprick blood sample (250 µl) was collected. Blood was separated into pellet and plasma, with both components stored at −20 • C until usage. The presence of Plasmodium falciparum and P. vivax parasites was determined using 18S rRNA qPCR (Rosanas-Urgell et al., 2010;Wampfler et al., 2013). The presence of P. vivax infections was 3.09% and P. falciparum infections was 0.86% (0.26% had mixed P. vivax/P. falciparum infections) (Nguitragool et al., 2017). Body temperature was also recorded, and fever was defined as >37.5 • C. Most P. vivax (91.7%) and P. falciparum (89.8%) infections were not accompanied by any febrile symptoms (Nguitragool et al., 2017). Demographic characteristics and behavioral risk data (such as bednet usage and travel) were obtained by trained interviewers using a structured questionnaire. Plasmodium infections were most common in adolescent and adult males (Nguitragool et al., 2017).
In the current study, data from 4,255 individuals with epidemiological data and plasma samples available were used. Table 1 provides an overview of the demographic and epidemiological study characteristics of these volunteers.
We also utilized data we had previously generated against the same P. vivax proteins in 274 malaria-naïve negative control samples (Longley et al., 2018a), in order to set seropositivity cut-offs for each P. vivax protein. These samples were obtained from 102 individuals from the Volunteer Blood Donor Registry (VBDR) in Melbourne, Australia, 72 individuals from the Thai Red Cross in Bangkok, Thailand and 100 individuals from the Australian Red Cross in Melbourne, Australia. Standard screening procedures in Thailand exclude individuals from donating to the Red Cross if they have had a confirmed malaria infection within the previous 3 years or have traveled to malaria endemic areas within the past year.

Ethics Statement
Written informed consent or assent was obtained from each participant in the cross-sectional survey and, where required, their legal guardian. This study was approved by the Ethics Committee of the Faculty of Tropical Medicine, Mahidol University (Ethical Approval number MUTM 2012-044-01).
Collection and/or use of the malaria-naïve negative control samples was approved under the Human Research Ethics Committee at WEHI (#14/02).

P. vivax Proteins
A total of 23 P. vivax proteins were selected based on our prior work identifying and validating candidate serological markers of recent exposure to P. vivax (Longley et al., 2018a). These proteins, either individually or in combination, could accurately classify individuals as infected within the last 9-months, including in individuals from a Thai cohort conducted in a similar region of western Thailand as the present study. Note that the selection was made after our first iteration of the algorithm, available as version 1 in BioRxiv (Longley et al., 2018b), with those proteins that could most accurately classify recent infection selected. We have since made further improvements to the algorithm (Longley et al., 2020). Table 2 provides details of the 23 P. vivax proteins used in this study. The majority of proteins (n = 18) were expressed using a wheat germ cellfree (WGCF) system (CellFree Sciences, Matsuyama, Japan) at either Ehime University or CellFree Sciences. The remainder were produced using an Escherichia coli expression system at the Walter and Eliza Hall Institute in Melbourne or the Institut Pasteur in Paris. Whilst most of the P. vivax proteins selected are annotated in PlasmoDB, one is not (PVX_097715, a hypothetical protein). PVX_097715 is located near the pvmsp3 gene family, but was excluded as being a member of this family as it does not encode the characteristic features (Jiang et al., 2013). This gene is protein coding, and whilst future studies into the function of this protein would be interesting, for our purposes the fact it induces a serological response is sufficient. Our panel included two PVX_087885 proteins (rhoptry associated membrane antigen, RAMA), covering the same amino acid sequence, but produced by two different laboratories. The data in this manuscript shows similar trends in antibody responses to these two proteins, and future work will continue using only one of these proteins (PVX_087885A, for which we have larger stocks of protein). Our panel also includes two versions of PVX_094255, however, these are two different fragments with unique (non-overlapping) amino acid sequences. In addition, our panel included two proteins annotated as merozoite surface protein 3 (MSP3) in PlasmoDB (PVX_097720 and PVX_097680), and three proteins annotated as MSP7 (PVX_082650, PVX_082700, and PVX_082670). These are all unique constructs from multi-gene families (Garzón-Ospina et al., 2016;Kuamsab et al., 2020) and all performed well at classifying recent infections in our previous work (Longley et al., 2020). Sequence identity between the P. vivax proteins and their P. falciparum orthologs was performed using two methods. (1) PlasmoDB (release 51) was used to identify the P. falciparum (strain 3D7) ortholog. Full length sequences for both proteins obtained from PlasmoDB were aligned using EMBOSS Needle and the percent identity reported. (2) The P. vivax protein construct sequence (not always full-length) was searched on NCBI BLASTp (accessed May 13, 2021) for matches in P. falciparum 3D7 and the percent identity reported.

Multiplexed Bead-Based Assay for Total IgG Measurements
Protein-specific IgG antibody levels were measured using a multiplexed bead-based assay, utilizing Luminex R technology, as previously described (Franca et al., 2016;Longley et al., 2017a). All plasma samples were run at a dilution of 1/100. Median fluorescent intensity (MFI) values from the Luminex R -200 were converted to arbitrary relative antibody units (RAU) based on a standard curve generated with a positive control plasma pool from highly immune adults from Papua New Guinea (PNG) (Franca et al., 2016). The standard curve was run on every plate. A five-parameter logistic function was used to obtain an equivalent dilution value compared to the PNG control plasma (ranging from 1.95 × 10 −5 to 0.02). The conversion was performed in R (version 3.5.3).
For 21 proteins IgG levels were measured in all 4,255 individuals. For proteins PVX_088820 (Pv-fam-a) and PVX_097715 (hypothetical protein) levels were measured in 1,873 and 3,916 individuals, respectfully, due to availability of protein-coupled beads at the time of the study or following quality control of plate data. The complete IgG dataset generated is provided in Supplementary Table 1.

Statistical Analyses
Data analysis and presentation were performed in Prism version 8 (GraphPad, United States) or Stata version 12.1 (StataCorp, United States). Antibody values were log 10 transformed to better fit a normal distribution. Spearman's r correlations were used to determine the correlation between IgG levels and age. T-tests were used to determine the association between age and gender, bednet usage and the presence of Plasmodium infections. Oneway ANOVA was used to compare IgG levels across multiple groups (i.e., regions and age categories), with adjustment for multiple comparisons performed using Sidak's test. To determine which variables had the greatest influence on IgG levels we used a multivariable linear regression model with a stepwise backward approach, as previously described (Longley et al., 2017a). Age was included as both a linear and quadratic term, reflecting the assumption that antibody levels first increase with age but then reach a plateau.
A Random Forest classification algorithm was used to predict which individuals had prior exposure to P. vivax in the last 9 months. The algorithm used antibody data against the top eight P. vivax serological exposure markers we had previously identified (Longley et al., 2020): PVX_099980 (merozoite surface protein 1, MSP1-19), PVX_096995 (Pv-fam-a), PVX_112670 (Pv-fam-a), PVX_087885B (RAMA), PVX_097625 (MSP8), PVX_097720 (MSP3α), PVX_094255B (reticulocyte binding protein 2b, RBP2b), and KMZ83376.1 (erythrocyte binding protein region II, EBPII). The algorithm was trained with 4 datasets [Thailand (n = 826), Brazil (n = 925), the Solomon Islands (n = 754), and negative controls (n = 274)], from our previous work (Longley et al., 2020). Different diagnostic targets along the receiver operating characteristic (ROC) curve of the algorithm could be selected (Supplementary Figure 1): we selected a threshold with 62% sensitivity and 90% specificity given the PCR prevalence in this study was low (thus a high specificity target was prioritized to reduce the number of potential false positives). Since the performance of our validated diagnostic tests are known, we can account for potential misclassification bias due to imperfect sensitivity and specificity in classifying individuals in our cross-sectional study. Given that the measured sero-prevalence (M) is a function of the true sero-prevalence (T) and the sensitivity (SE) and specificity (SP) of our diagnostic test, we can derive the following equation: We can rearrange our equation to solve for the true (T) or, in this case, the adjusted (A) sero-prevalence, which we expect to be very close to the true sero-prevalence: It should be noted that this does not account for relapse periodicity, lifetime exposure, immunity, and other factors that contribute to the uncertainty of these estimates.

Immunogenicity and Seropositivity of the IgG Response Against 23 Candidate P. vivax Serological Exposure Markers
We first determined population levels of immunogenicity or seropositivity against the 23 P. vivax proteins, using two distinct methods.
The multiplexed bead-based assay is run with a reference standard curve generated from a pool of plasma from hyperimmune individuals from PNG. We assessed the proportion of our Thai volunteers who exceeded the IgG level of the PNG pool (RAU exceeding equivalent of 1/100 dilution, i.e., 0.01), to determine the proportion of individuals who have highly immunogenic responses. This varied from 0.8% (PVX_097625, MSP8) to 9.96% (PVX_097715, hypothetical protein) across the 23 P. vivax proteins ( Figure 1A).
To determine seropositivity rates we set a protein-specific seropositivity cut-off using IgG values (RAU) we previously obtained against a panel of 274 malaria-naïve negative control samples (from Melbourne, Australia and Bangkok, Thailand). The cut-off was set at the mean plus two-times the standard deviation. Seropositivity rates varied from 2.5% (PVX_097625, MSP8) to 16.8% (PVX_082670, MSP7F) ( Figure 1A). As the transmission level is low in this region of western Thailand (4.47% prevalence for any Plasmodium infection in this study), we also calculated seropositivity rates in the subset of individuals who had concurrent P. vivax infections during the crosssectional survey (n = 76-144, dependent on protein). This resulted in an increase in seropositivity rates for nearly all proteins, with the maximum reaching 56% of individuals seropositive against P. vivax erythrocyte binding protein (EBPII, KMZ83376.1) ( Figure 1B).
The breadth of the response per individual was also calculated, based on how many proteins each individual was seropositive to. This was calculated for 21 of the 23 P. vivax proteins that had complete data (PVX_097715-hypothetical protein and PVX_088820-Pv-fam-a were excluded due to incomplete data, as detailed in section "Materials and Methods"). The breadth of the response per individual ranged from 0 to 21 proteins. Overall, 43% of individuals were seropositive to at least 1 protein; this increases to 90% for individuals with a concurrent P. vivax infection.

Association of IgG Level With Individual Demographic Variables Known to Relate to the Risk of Malaria Infection
A number of individual-level demographic variables are known to be associated with risk of malaria infections, including age, gender and bednet usage. In particular, males were shown to have a higher risk of Plasmodium exposure in the individuals from the current cross-sectional survey (Nguitragool et al., 2017), and IgG levels are generally expected to increase with age in malariaendemic regions (Longley et al., 2017a) due to increasing chance of exposure to the parasite.
We observed a weak but statistically significant correlation between IgG levels and age for all 23 P. vivax proteins (Spearman's r 0.13-0.59, p < 0.0001). The clearest distinction was between adults (>18 years of age) and children (<18 years of age) (Figure 2A), with a significant difference between these two groups for 19/23 proteins (one-way ANOVA with Sidak's multiple comparison test, p < 0.0001). For 18 of 23 P. vivax proteins there was a significantly higher IgG level in males than females (T-tests, p < 0.05) (Figure 2B).
We also assessed whether IgG levels varied with reported bednet usage (defined as bednet used last night/in general as yes or no). There was a significantly higher IgG level in individuals who reported using a bednet compared to those who did not for 9 of 23 P. vivax proteins (39%) (PVX_112670 Pv-fam-a, PVX_003770 MSP5, PVX_087885 RAMA, PVX_097680 MSP3β, PVX_082670 MSP7F, PVX_097720 MSP3α, KMZ83376.1 EBPII, PVX_110810 Duffy binding protein region II (DBPII), and PVX_092995 Pv-fam-a). The strongest association was for the proteins PVX_097680 (MSP3β) and KMZ83376.1 (EBPII) (Ttests, p < 0.0001). There was a significantly lower IgG level in individuals who reported using a bednet for 1 protein only, PVX_088820 (a Pv-fam-a protein) (p = 0.0019) (noting that this was the protein with the smallest sample size, n = 1,873). In the epidemiological analysis of the cross-sectional survey data, bednet usage was not associated with a reduction in risk of infection (Nguitragool et al., 2017).

Association of IgG Levels With P. vivax and P. falciparum Infection Status
A total of 190 individuals (4.4% of total 4,309 individuals) in the cross-sectional survey had a current Plasmodium infection at the time of sampling (determined by qPCR) (Nguitragool et al., 2017). The majority of infections were caused by P. vivax (n = 144, 3.3%), with the remainder due to P. falciparum (n = 46, 1.1%). 11 of these infected individuals had mixed P. falciparum/P. vivax infections. Samples were not assayed for other species of Plasmodium. As these were crosssectional samples we do not know when the individuals acquired the P. vivax infection, but note that the average duration of an (untreated) P. vivax infection in Thailand has been estimated as relatively short (29 days) (White et al., 2018). P. vivax infections were most prevalent in Bongti, in males, and in adults (aged 18-49, though infections were also observed in children) (Nguitragool et al., 2017). The majority of P. vivax infections (91.7%) were asymptomatic (Nguitragool et al., 2017). The 12 individuals with symptomatic P. vivax infections (fever recorded at the cross-sectional survey or reported within the prior 2 days) had an age range of 6-70 years.
We assessed how the IgG level differed in individuals with or without Plasmodium infections (caused by either P. vivax or P. falciparum). As expected, there was a statistically significantly higher IgG level in individuals with a current P. vivax FIGURE 2 | Association of IgG levels against 23 P. vivax proteins with age and gender. (A) IgG levels compared between individuals from the 2012 cross-sectional survey as split by age: 0-7, 7-13, 13-18, and more than 18 years. Statistical significance between groups was assessed by one-way ANOVA with Sidak's multiple comparison test. (B) IgG levels compared between gender of participants (male or female). Statistical significance was assessed by T-test. The black dashed line indicates the assay minimum, the blue dashed line the assay maximum, and the red dashed line is equivalent to a 1/100 dilution of the PNG pool. * p < 0.05, * * p < 0.01, * * * p < 0.001, * * * * p < 0.0001. infection compared to those without for all proteins tested (p < 0.0001) ( Figure 3A). Interestingly, the same relationship was also evident for individuals with P. falciparum infections (p < 0.001) ( Figure 3B). Notably, sequence identity between the P. vivax proteins and their P. falciparum orthologs is low (<44%, 2 have no orthologs, see Supplementary Table 2), suggesting this effect is unlikely to be due to cross-reactivity. The higher IgG levels in individuals with P. falciparum infection remained evident even when the 11 individuals that were co-infected with P. vivax were excluded (p < 0.01). This finding is likely attributable to shared epidemiological risk factors for P. vivax and P. falciparum exposure. In addition, we directly compared total IgG levels between individuals with P. vivax and P. falciparum mono-infections. For the majority of proteins (18/23) there was no significant difference in IgG level between those with P. vivax and P. falciparum infections (Supplementary Figure 2). However, for five protein constructs (including both the PVX_087885 RAMA constructs), there was a significantly higher IgG level in the group with concurrent P. falciparum infections. Two of these proteins have moderate levels of sequence identity between the orthologs (RAMA, 27.9% and TRAP 40.7%). Importantly, the parasite density in individuals with P. falciparum infections was on average almost 14× higher than parasite density in P. vivax infected individuals (average log copy number, 5,710 vs 409).
In individuals with P. vivax infections, we also assessed whether there was a relationship between parasite density and IgG level. There was no correlation evident between parasite density (measured as number of copies/µl blood by PCR) and IgG levels for all P. vivax proteins (p > 0.05), suggesting that Note these include individuals with P. falciparum-P. vivax co-infections. The black dashed line indicates the assay minimum, the blue dashed line the assay maximum, and the red dashed line is equivalent to a 1/100 dilution of the PNG pool. Statistical significance was assessed using T-tests. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. current P. vivax antigenic density was not affecting the magnitude of the IgG response.

Association of IgG Levels With Spatial Household Location of Study Individuals
Individuals enrolled in this cross-sectional survey lived in three main regions: Bongti (three villages), Kong Mong Tha (one village), and Suan Phueng (four villages). These regions have stable populations, with less than 2% of individuals reporting travel in the previous months (Nguitragool et al., 2017). They are all situated on the Thai-Myanmar border but in distinct areas over a 250 km span. There was evidence of spatial heterogeneity in terms of which individuals in the survey had P. vivax infections (i.e., evidence of spatial clustering or hot-spots of infection) (Figure 4), with similar P. vivax prevalence rates of 3.82 and 3.12% in Bongti and Suan Phueng, respectively, but much lower at 1.45% in Kong Mong Tha (Nguitragool et al., 2017). There was also variation in P. vivax prevalence rates amongst the villages within the three study sites, which is also depicted in Figure 4.
We used a one-way ANOVA to determine whether we could also identify differences in IgG levels between the three main regions. Protein PVX_088820 (Pv-fam-a) was excluded as all IgG data obtained was from individuals living in Bongti only. For all remaining proteins there was a statistically significant difference between groups (p < 0.05) ( Table 3), with the exception of PVX_094255A (RBP2b 1986−2653 ). Using pairwise comparisons and adjusting for multiple testing, five proteins induced significantly different IgG levels in all three regions [PVX_096995 Pv-fam-a, PVX_097680 MSP3β, PVX_097625 MSP8, PVX_110810 DBPII, and PVX_095055 Rh5 interacting protein (RIPR)], whilst the most common pattern was for significant differences between Bongti compared to both Kong Mong Tha and Suan Pheung (evident for 13 proteins). The trend was for the highest IgG levels in Bongti followed by Suan Pheung then Kong Mong Tha, reflecting the distribution of P. vivax infections. We further assessed the difference in P. vivax-specific antibodies between villages once we had used the IgG levels to the validated subset of eight P. vivax proteins to classify individual as recently exposed to P. vivax or not, as this reduced the information to one variable for the antibody response rather than 21.

Key Variables Influencing IgG Level
We identified multiple variables that were associated with differences in IgG levels to the P. vivax proteins: age, gender, bednet usage, Plasmodium infections, and spatial location. We therefore used a multivariable regression model to identify the variables that were having the highest influence on IgG levels. For nearly all P. vivax proteins, age, gender, P. vivax and P. falciparum infections, and spatial location all had significant influences on IgG levels in the final multivariable model (using a backward stepwise approach as detailed in section "Materials and Methods"). A number of proteins had a slightly different combination of variables in the final model as detailed in Supplementary Table 3. Plasmodium infections had the biggest impact on varying the IgG level.

Predicting Recent Exposure to Blood-Stage P. vivax Infections
We used the antibody responses generated to our previously identified top eight P. vivax serological exposure markers (PVX_099980 MSP1-19, PVX_096995 Pv-fam-a, PVX_112670 Pv-fam-a, PVX_087885 RAMA, PVX_097625 MSP8, PVX_097720 MSP3α, PVX_094255B RBP2b, and KMZ83376.1 EBPII) to predict which individuals had exposure to P. vivax in the previous 9 months. Individuals with this serological signature are more likely to have recurrent P. vivax bloodstage infections in the following year than those who do not (Longley et al., 2020). Using the classification algorithm also enabled us to reduce multiple antibody levels to one variable; whether or not the individuals were predicted sero-positive or sero-negative according to the algorithm. As described in section "Materials and Methods, " since PCR prevalence in the study was low, we also chose a high specificity target to reduce the number of potential false positives that may contribute to the over-estimation of sero-prevalence. Using a 62% sensitivity and 90% specificity target, the algorithm identified 644 sero-positive individuals with an estimated 15.6% (95% CI 14.5-16.7%) sero-prevalence (Table 4). This diagnostic target identified 75% of currently infected individuals, i.e., 108 out of 144 PCR+ cases. This target identified an additional 556 who likely had a P. vivax infection within the past 9-months ( Table 4). This suggests a potential 17% of individuals (144 + 556) in this cross-sectional survey are either currently infected (in the blood or the liver) or have had recent past exposure to P. vivax. Given we know the sensitivity and specificity of our test, we can calculate an adjusted seroprevalence using the formula in section "Materials and Methods." With the high specificity target at 90 and 62% sensitivity, an adjusted sero-prevalence of 10.8% is calculated.

DISCUSSION
Serological exposure markers for identifying regions of residual or ongoing transmission of P. vivax could play a unique role in control and elimination efforts of malaria-endemic countries in the Asia-Pacific (and elsewhere). In this study, we measured total IgG antibody responses against a panel of 23 P. vivax proteins that we have previously shown to be accurate markers FIGURE 5 | Spatial distribution of model predicted recent P. vivax exposure status. The location of individuals (at the household level) who were classed as seropositive (orange dot) in the algorithm, using the antibody responses against the top eight P. vivax proteins. The size of orange dots represents the seropositive cases ranging from 1 to 11. The shading in orange represents multiple sero-positive individuals in the same household or multiple positive households in the one area. Blue dots are indicative of seronegative in the algorithm. The map was made using QGIS 3.16 software. Bongti (A) had 354 seropositive individuals out of a total 2,334 individuals tested (sero-prevalence of 15.2%). Suan Phueng (B) was the second largest region with seropositive prevalence of 259 in the 1,514 individuals (17.1%). Kong Mong Tha (C) was the smallest region and also had the lowest sero-prevalence, with 51 of the 407 individuals having a serological signature of recent P. vivax exposure (12.5% sero-prevalence).
of recent exposure to P. vivax parasites. By associating total IgG responses with the presence of Plasmodium infections, in addition to various epidemiological risk factors of exposure, we provide further evidence that these responses can be used as markers of residual transmission.
The majority of individuals surveyed in this cross-sectional study had no detected Plasmodium infections by qPCR. Accordingly, the seropositivity rates against the 23 P. vivax proteins were low, ranging from 2.5% (PVX_097625, MSP8) to 16.8% (PVX_082670, MSP7F). The seropositivity rates increased when we limited our analysis to the 144 individuals with current P. vivax infections, reaching a maximum of 56% against the protein KMZ83376.1 (EBPII). These seropositivity rates are very similar to what we have previously observed in a longitudinal cohort study conducted in the Kanchanaburi province (Longley et al., 2018a). Of the 23 P. vivax proteins, our top eight proteins as previously identified were PVX_094255 (RBP2b), PVX_087885 (RAMA), PVX_099980 (MSP1-19), PVX_096995 (a Pv-fama protein), PVX_097625 (MSP8), PVX_112670 (Pv-fam-a), KMZ83376.1 (EBPII), and PVX_097720 (MSP3α). In the entire study set of this cross-sectional survey the seropositivity rates to the top eight were similar to the rest of the P. vivax proteins, however in the individuals with current P. vivax infections five of the top eight proteins exhibited the highest levels of seropositivity (>40%). Whilst we do not know when the individuals in this cross-sectional survey acquired the P. vivax infection, the relatively high levels of seropositivity against our markers in individuals with current infections is a promising feature. An additional protein not within our top eight that had a high seropositivity rate in individuals with current P. vivax infections was MSP7, with 53% seropositivity. The selection of the top eight proteins was made using data from three individual studies conducted in Thailand, Brazil, and the Solomon Islands (Longley et al., 2020); this was to identify markers that work well across multiple geographic regions, and not just in Thailand, for example.
For all 23 P. vivax proteins, we observed higher total IgG levels in individuals with current P. vivax infections compared to those without. This is again expected given these proteins were chosen due to their ability to identify individuals with recent past exposure to P. vivax parasites. We observed no relationship between P. vivax antigenic density and IgG FIGURE 6 | Association between village-level PCR prevalence and adjusted P. vivax sero-prevalence. P. vivax sero-prevalence was predicted by the algorithm incorporating antibody responses to the panel of eight P. vivax proteins, using the 62% sensitivity, 90% specificity target. The estimates were adjusted, at the village-level, for to account for the known sensitivity and specificity. A Pearson r correlation was performed to assess the association between PCR prevalence and adjusted P. vivax sero-prevalence. Data passed the Anderson-Darling test for normality. level. A plausible reason for this finding, in contrast to our previous results (Longley et al., 2017b), is that all infections were asymptomatic and of relatively low parasite density. In addition, another key difference between the studies is that the present uses cross-sectional data (exact acquisition date of infection unknown) compared to an enrolled clinical cohort in our prior study (rough infection date known). What was perhaps unexpected was the statistically significant association of higher total IgG levels against all proteins in individuals with P. falciparum infections. This association remained even after the 11 individuals with co-infections were excluded. Potential reasons include cross-reactivity between orthologs in P. vivax and P. falciparum, or more simply because individuals with P. falciparum infections are also at risk of past P. vivax infections [as evidenced by recurrent P. vivax infections following treatment of P. falciparum infections (Douglas et al., 2011)]. As the sequence identity between the P. vivax proteins and their P. falciparum orthologs is low it is the latter explanation that seems most likely. Increased risk of prior exposure to Plasmodium parasites is also likely the reason for the initially counter-intuitive finding of higher IgG levels in individuals using bednets. Previous analysis of the epidemiological data from this cross-sectional survey demonstrated that bednets had no protective effect in this population (Nguitragool et al., 2017), potentially due to the early outdoor biting phenotype of some Plasmodium vectors in Thailand (Sriwichai et al., 2016). It is plausible that the individuals who chose to use bednets did so because of past illness due to Plasmodium infections, thus they may be a group at higher risk of recent past exposure, potentially explaining the higher IgG levels. It is also likely that the malaria control program targeted distribution of bednets to where they were most needed, thus bednet usage could be considered a marker of where there is transmission.
A known risk factor for Plasmodium infections in Thailand is gender, with adult males being at higher risk of infection due to their occupation (Nguitragool et al., 2017). This increased risk of infection was also reflected in IgG levels to our 23 P. vivax proteins, which were generally higher in adults over 18 years of age and in men. The individuals in this study lived in three main regions, Bongti, Kong Mong Tha, and Suan Phueng. P. vivax infections rates are highest in Bongti, and this was again reflected in our IgG measurements: IgG levels were highest in Bongti out of the three regions. Together, these analyses demonstrate that our panel of P. vivax serological exposure markers can be applied in a low-transmission setting to identify both individuals at high risk of infection and spatial clustering of infections at the region level.
We finally used a Random Forest algorithm that incorporated the antibody responses against the top eight P. vivax serological exposure markers to predict which individuals had recent exposure to the parasite in the past 9 months. We used a cut-off prediction value to prioritize a high specificity target (i.e., 90% specificity, 62% sensitivity, Supplementary Figure 1). A high specificity target for the serological exposure marker tool is better suited for surveillance to identify hot-spots or clusters of infections. As the number of true recent infections is low, it is important to have a high specificity rate to not over-estimate the levels of recent transmission. With the 62% sensitivity and 90% specificity target used in the current study, we identified a total 664 sero-positive individuals with an estimated 15.6% sero-prevalence. This included 108 out of the 144 individuals with PCR confirmed P. vivax infections. After adjusting the seroprevalence estimate for the known sensitivity and specificity, we calculate an adjusted sero-prevalence of 10.8%. While the true cumulative exposure to infectious bites in the last 9 months is unknown in this population, we can infer from a longitudinal study in Thailand with similar PCR prevalence [mean 3.4%, range 1.7-4.2% (Nguitragool et al., 2019)], that the cumulative PCR prevalence over 9 months was 10%. Since our study was conducted earlier (2012 vs 2013-2014) and during the high season, we expect slightly higher transmission levels and higher cumulative exposure, and therefore our adjusted estimate falls within the range that we would expect in this setting. We observed a significant moderate correlation between villagelevel PCR prevalence and the adjusted sero-prevalence from the algorithm, again demonstrating that our markers can be used for surveillance to determine clustering or hot-spots of infection.
Two systematic reviews on serological responses to Plasmodium proteins have highlighted that (a) serological markers of surveillance could be used as an important and reliable tool for assessing malaria endemicity and exposure history, and (b) that key knowledge gaps remain, particularly concerning the appropriate P. vivax proteins to be used and studies outside the South American region (Cutts et al., 2014;Folegatti et al., 2017). There have been a limited number of prior studies using detection of antibody responses against various P. vivax proteins as an indication of past exposure in the Asia-Pacific region; these include two studies conducted in Cambodia and Vanuatu. These studies have mainly used responses against relatively well-known proteins, such as the circumsporozoite protein (CSP), MSP1 (which is also in our eight-protein panel), the Duffy binding protein (DBP, in our larger panel) and apical membrane antigen 1 (AMA1) (Kerkhof et al., 2016;Idris et al., 2017). A challenge highlighted was the relatively long estimated antibody half-life against most P. vivax proteins tested (Kerkhof et al., 2016), suggesting it would be difficult to distinguish between recent and long-term changes in transmission patterns. Indeed the study in Vanuatu was aiming to demonstrate changes in transmission over a 4-year period (Idris et al., 2017), which is useful for monitoring endemicity and demonstrating the power or utility of various interventions. However, long-term changes in transmission are not useful for applying sero-surveillance as an intervention in itself. The power of our specific panel of proteins is that they were carefully selected to reflect recent exposure to P. vivax parasites, rather than lifetime exposure, in low-transmission settings. Furthermore, using the combination of IgG responses to our top eight proteins enables classification of recent infections with high accuracy and also introduces flexibility to account for potential differences in antibody half-life and seropositivity rates in different geographic regions. We have demonstrated the power of our markers for surveillance and identifying hot-spots of infections. These hot-spots can then be further targeted using our markers, where all sero-positive individuals could be treated with anti-malarial drugs.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
Written informed consent or assent was obtained from each participant in the cross-sectional survey and, where required, their legal guardian. This study was approved by the Ethics Committee of the Faculty of Tropical Medicine, Mahidol University (Ethical Approval number MUTM 2012-044-01). Collection and/or use of the malaria-naïve negative control samples was approved under the Human Research Ethics Committee at WEHI (#14/02).

AUTHOR CONTRIBUTIONS
IM, JS, and RL designed the study. CK, KK, PS-A, WN, and JS conducted the cross-sectional survey. ET, TT, W-HT, MH, CC, and JuH expressed the proteins. JB and RL performed the protein coupling. SC, CK, and JeH performed the antibody measurements. KS performed the sequence identity analysis. SC and RL did the data management and analysis. MW and NN performed the statistical modeling. SC and RL wrote the first draft of the manuscript. All authors contributed to data interpretation and revision of the manuscript.

FUNDING
Funding for this study was provided by the National Institute of Allergy and Infectious Diseases (NIH grant 5R01 AI 104822 to JS), the National Health and Medical Research Council (#1092789, #1134989, and #1043345 to IM, #1143187 to W-HT, and #1173210 to RL) and Global Health Innovative Technology Fund (https://www.ghitfund.org/) (T2015-142 to IM). We also acknowledged support from the National Research Council of Thailand and the Victorian State Government Operational Infrastructure Support and Australian Government NHMRC IRIISS. RL was supported in part by the Jared Purton Award from the Australian and New Zealand Society of Immunology. W-HT was a Howard Hughes Medical Institute-Wellcome Trust International Research Scholar (https://www.hhmi.org/programs/biomedical-research/ international-programs,~208693/Z/17/Z).

ACKNOWLEDGMENTS
We gratefully acknowledge the extensive field teams that contributed to sample collection and qPCR assays. We thank all individuals that participated in the cross-sectional survey. We thank Connie Li-Wai-Suen for writing the R script used for converting MFI values to relative antibody units. We also thank Christopher King (Case Western Reserve University) for provision of the PNG control plasma pool.