Dissemination Dynamics of HIV-1 Subtype B Pandemic and Non-pandemic Lineages Circulating in Amazonas, Brazil

The HIV-1 epidemic in the Amazonas state, as in most of Brazil, is dominated by subtype B. The state, nonetheless, is singular for its significant co-circulation of the variants BCAR, which can mostly be found in the Caribbean region, and BPAN, a clade that emerged in the United States and aggregates almost the totality of subtype B infections world-wide. The Amazonian HIV-1 epidemic provides a unique scenario to compare the epidemic potential of BPAN and BCAR clades spreading in the same population. To reconstruct the spatiotemporal dynamic and demographic history of both subtype B lineages circulating in Amazonas, we analyzed 1,272 HIV-1 pol sequences sampled in that state between 2009 and 2018. Our phylogeographic analyses revealed that while most BCAR infections resulted from a single successful founder event that took place in the Amazonas state around the late 1970s, most BPAN infections resulted from the expansion of multiple clusters seeded in the state since the late 1980s. Our data support the existence of at least four large clusters of the pandemic form in Amazonas, two of them nested in Brazil’s largest known subtype B cluster (BBR–I), and two others resulting from new introductions detected here. The reconstruction of the demographic history of the most prevalent BPAN (n = 4) and BCAR (n = 1) clades identified in Amazonas revealed that all clades displayed a continuous expansion [effective reproductive number (Re) > 1] until most recent times. During the period of co-circulation from the late 1990s onward, the Re of Amazonian BPAN and BCAR clusters behaved quite alike, fluctuating between 2.0 and 3.0. These findings support that the BCAR and BPAN variants circulating in the Brazilian state of Amazonas displayed different evolutionary histories, but similar epidemic trajectories and transmissibility over the last two decades, which is consistent with the notion that both subtype B variants display comparable epidemic potential. Our findings also revealed that despite significant advances in the treatment of HIV infections in the Amazonas state, BCAR and BPAN variants continue to expand and show no signs of the epidemic stabilization observed in other parts of the country.


INTRODUCTION
The HIV-1 subtype B pandemic started when the ancestral virus arrived and first established itself in the Caribbean region during the mid-1960s (Gilbert et al., 2007). The subsequent subtype B spread generated a set of local clades, designated as B CAR , that remained mostly confined to the Caribbean region (Gilbert et al., 2007;Cabello et al., 2014). One of those viruses, however, migrated from the Caribbean to the United States around the late 1960s and established a pandemic clade, called B PAN , that was then disseminated worldwide (Worobey et al., 2016). The remarkable dissemination of the B PAN clade was probably shaped by ecological, rather than virological factors (Gilbert et al., 2007;Arantes et al., 2020); but there are no studies comparing the epidemic potential of B CAR and B PAN clades co-circulating outside the Caribbean region.
The HIV-1 subtype B epidemic in Brazil is mainly driven by the B PAN clade, with a few notable exceptions, like the Northern state of Amazonas, which is characterized by the cocirculation of the B PAN (75%) and B CAR (25%) variants at a high prevalence (Divino et al., 2016;Crispim et al., 2019;Gräf et al., 2021). Among regional HIV-1 epidemics in Brazil, the one in the state of Amazonas stands out as the second largest AIDS detection rate (34.8 cases per 100,000 inhabitants) in the country, well above the mean national rate (17.8 cases per 100,000 inhabitants) (Ministério da Saúde, 2020). Furthermore, despite important advances in HIV diagnosis and treatment, the Amazonas HIV/AIDS epidemic is not stabilized and has displayed a rising AIDS incidence over the last 10 years (Ministério da Saúde, 2020).
The Amazonian HIV-1 epidemic thus provides a great opportunity to compare the epidemic dynamics of the B PAN and B CAR clades spreading in the same population outside the Caribbean region. Previous studies identified four major B PAN (B PAN−BR−I to B PAN−BR−IV ) and four major B CAR (B CAR−BR−I to B CAR−BR−IV ) clusters circulating in Brazil (Mir et al., 2015;Divino et al., 2016). The lineage B CAR−BR−I aggregates the majority (51%) of non-pandemic subtype B sequences from Brazil (Divino et al., 2016) and its most recent common ancestor (MRCA) could be traced back to Amazonas in the late 1970s, from a viral migration probably originated in the French Guiana (Divino et al., 2016;Arantes et al., 2019). The cluster composition and the evolutionary history of the B PAN clade in the Amazonas state are currently unknown.
This work aims to characterize the spatiotemporal dynamics of the major B PAN clusters circulating in the state of Amazonas and to compare the evolutionary and demographic history of major pandemic and non-pandemic subtype B lineages that drive the expanding HIV-1 subtype B epidemic in this Northern Brazilian state.

HIV-1 Subtype B pol Sequence Dataset From Amazonas State
In this study, we used a total of 1,272 HIV-1 subtype B pol sequences (nucleotides 2,253-3,275 of reference strain HXB2) from Amazonas state sampled between 2009 and 2018 that were either available at the Los Alamos HIV Database 1 by March 2021 or were recently published (Chaves et al., 2021;Gräf et al., 2021) and made available in GenBank 2 . Only one sequence per subject was selected. Sequences were aligned using the ClustalW program (Larkin et al., 2007) and all sites associated with major antiretroviral drug resistance in protease and reverse transcriptase were excluded.

Phylogenetic Classification of Amazonian HIV-1 Subtype B Sequences
Amazonian HIV-1 subtype B sequences were initially classified as B CAR or B PAN and next within major B CAR or B PAN Brazilian clusters by using an evolutionary placement algorithm (EPA) available in RAxML v.8.0.0 (Stamatakis, 2014) for the rapid assignment of query sequences to edges of a reference phylogenetic tree under a maximum-likelihood (ML) model. This analysis allowed us to classify sequences within 10 different clades: B CAR−BR−I to B CAR−BR−IV , B PAN−BR−I to B PAN−BR−IV , other B CAR lineages, and other B PAN lineages.

Selection of Brazilian and Global Reference HIV-1 Subtype B Pol Datasets
After the initial cluster assignment, the HIV-1 subtype B pol Amazonian sequences were aligned with different sub-sets of non-Amazonian subtype B pol reference sequences (covering nucleotides 2,253-3,260 relative to HXB2 genome). Amazonian subtype B sequences classified within major B CAR (B CAR−BR−I to B CAR−BR−IV ) or B PAN (B PAN−BR−I to B PAN−BR−IV ) Brazilian clusters were aligned with Brazilian sequences from other states that were previously classified within those major lineages (Mir et al., 2015;Divino et al., 2016). Amazonian subtype B sequences classified as "others B CAR lineages" were aligned with sequences representative of the B CAR diversity in the Caribbean region (n = 228) that were also described previously (Cabello et al., 2014;Mendoza et al., 2014). Finally, Amazonian subtype B sequences classified as "others B PAN lineages" were aligned with: (i) one subset of closely related B PAN sequences from Brazil (n = 687) selected from a large dataset of Brazilian sequences (n = 88,441) described previously (Gräf et al., 2021) and (ii) one subset of closely related B PAN sequences from different countries (n = 1,700) selected from a large dataset of worldwide sampled sequences (n = 71,160) recovered from Los Alamos HIV Database. We used the basic local alignment search tool (BLAST) 3 to select the 10 subtype B reference sequences (Brazilian and worldwide) with the highest similarity score (>95%) to each Amazonian subtype B sequence.

Detection of Major Amazonian HIV-1 Subtype B Clades
Amazonian and non-Amazonian subtype B sequences were subject to ML phylogeographic analyses to identify the B CAR and B PAN sub-clusters that probably originated in Amazonas.
ML phylogenetic trees were inferred with the PhyML program (Guindon et al., 2010) using an online web server (Guindon et al., 2005) under the GTR + I + 4 nucleotide substitution model, as selected by the jModelTest program (Posada, 2008), and the SPR branch-swapping algorithm of heuristic tree search. The reliability of the obtained tree topology was estimated with the approximate likelihood-ratio test (aLRT) (Anisimova and Gascuel, 2006) based on the Shimodaira-Hasegawa-like procedure. Trees were rooted using subtype D sequences and visualized using the FigTree v1.4.0 software (Rambaut, 2009). The ML trees were employed for the ancestral character state reconstruction (ACR) of epidemic locations with PastML (Ishikawa et al., 2019), using the maximum likelihood Joint (Pupko et al., 2000) and marginal posterior probabilities approximation (MPPA) methods with an F81-like model. Beyond Amazonas, remaining Brazilian states were aggregated in five discrete locations according to their geographic regions and sequences from other countries were aggregated in a single "non-Brazilian" location. Amazonian B CAR and B PAN clades were defined as those monophyletic clusters with high support (aLRT ≥ 0.85) that were mostly composed of sequences from Amazonas (>85%) and displayed Amazonas as the most probable (P ≥ 0.85) state location of its MRCA. Amazonian clades were further subdivided according to size into large (n > 30), medium (n = 10-30), and small (n = 2-9) clades.

Phylodynamic Analysis
The study of epidemiological and evolutionary parameters of major B CAR and B PANDEMIC clusters from Amazonas was done by Bayesian inference using coalescent and birth-death tree priors implemented, respectively, in BEAST v1.10 (Drummond et al., 2002;Suchard et al., 2018) with BEAGLE (Suchard and Rambaut, 2009) to improve run-time, and in BEAST 2.6 (Bouckaert et al., 2019) software packages. Dated phylogenies were inferred with the flexible Bayesian skyline coalescent model (Drummond, 2005). Changes across time in their effective sample size (N e ) were estimated using the coalescent Bayesian Skygrid (BSKG) model (Gill et al., 2012), and in their effective reproductive number (R e ), using the Birth-death Skyline (BDSKY) model (Stadler et al., 2012). For BDSKY, the sampling rate (δ) was set to zero for the period before the oldest sample and estimated afterward. The R e was modeled in a piecewise manner in equidistant intervals from the most recent sample up to the root of the tree with a lognormal prior (mean = 0; standard deviation = 1), and the becoming non-infectious rate with a lognormal prior (mean = 0.25; standard deviation = 0.5). All Bayesian MCMC analyses were performed using the GTR + I + 4 nucleotide substitution model, and a relaxed uncorrelated lognormal molecular clock model (Drummond et al., 2006) with a uniform prior distribution on the substitution rate that encompasses mean values previously estimated for the subtype B pol gene (2.0-3.0 × 10 −3 subst./site/year) (Hue et al., 2005;Zehender et al., 2010;Chen et al., 2011;Mendoza et al., 2014;Bello et al., 2018). MCMC chains were run for 50-100 × 10 6 generations and convergence and uncertainty of parameter estimates were assessed by calculating the effective sample size (ESS) and 95% highest probability density (HPD) values, respectively, after excluding the initial 10% of each run with Tracer v1.7.1 . The convergence of parameters was considered when ESS ≥ 200.

Statistical Analysis
Demographic information of age group and gender of individuals with samples included in the present study were compared using Pearson's chi-squared test as implemented in R version 3.6.3 (R Core Team, 2018), with 10,000 replicates. The false discovery rate (FDR) method was used to correct for multiple hypothesis testing and to reduce false positives. Statistical significance was defined as p-values <0.05.

RESULTS
The 1,272 HIV-1 subtype B pol sequences from Amazonas were assigned to either B CAR (23%) or B PAN (77%) lineages ( Table 1). The sub-lineage assignment reveals that most B CAR Amazonian sequences belong to the major Brazilian clade B CAR−BR−I (89%), while the remaining sequences were classified within clades B CAR−BR−II (1%), B CAR−BR−III (1%), or branched outside known Brazilian clades (8%) ( Table 1). Although a high proportion of B PAN Amazonian sequences also branched within major Brazilian B PAN clades (45%), particularly the B PAN−BR−I (37%), most pandemic sequences (55%) branched outside known countrywide Brazilian clades ( Table 1). To identify the major subtype clades circulating in Amazonas, we next conducted independent ML phylogeographic analyses for: (i) sequences that branched within major Brazilian B CAR or B PAN clades, by combining Brazilian sequences sampled in Amazonas and other states, and (ii) sequences that branched outside major Brazilian clades, by combining Amazonian sequences with either B CAR sequences of Caribbean origin or B PAN sequences sampled in   Table 1). The major B CAR founder event resulted in the B CAR−BR−I clade (Figure 1) while the remaining B CAR sequences were distributed among a few local clusters of small size (2-9 sequences, 7%) or singletons (4%) (Supplementary Figures 1A-C). A few B PAN introductions (n = 4) originated highly supported (aLRT > 0.85) Amazonian B PAN clades of large size that were mostly composed by sequences from Amazonas (>85%) and most probably arose in Amazonas (P > 0.90). Two major clusters, B  Figures 2, 3 and Supplementary Figures 1D-F).
To study in more detail the evolutionary and demographic history of lineages B CAR and B PAN spreading in Amazonas, we selected the five major clades that display both local epidemic importance -as, combined, they comprise 38% of HIV-1 subtype B infections in the state -and adequate sample sizes to give reliable demographic estimates. Time-scaled trees were reconstructed using a Bayesian coalescent model with an informative clock rate prior due to the weak temporal structure of Amazonian subtype B pol datasets (Supplementary Figure 2). Posterior estimates, that were, as expected, largely influenced by the selected clock rate prior, traced the median T MRCA of major Amazonian clades to the late 1970s for B CAR−BR−I , the late 1980s for B PAN−AM−I , the mid-1990s for B PAN−AM−II , and the late 1990s for B PAN−BR−I−AM−I and B PAN−BR−I−AM−II ( Table 2). These findings support that the major B CAR clade was successfully spreading in Amazonas for about 10 years before the emergence of the major B PAN clades, which may explain the singular high prevalence of non-pandemic subtype B variants in Amazonas with respect to most other Brazilian states. The Frontiers in Microbiology | www.frontiersin.org  (Figures 4A-E). The temporal trajectories of the R e estimated using the Bayesian BDSKY model, however, support that all major Amazonian HIV-1 subtype B clades continuously expanded (median R e > 1) over all the studied period, with some temporal fluctuations in the rate of expansion (Figures 4A-E). The clade B CAR−BR−I reached the highest median R e (2.5-2.6) between the late 1970s and the early 1990s, while the B PAN clades reached the highest median R e (2.9-3.4) between the mid-1990s and the mid-2000s ( Table 2). Despite those differences in the early phase of spread, all major Amazonian subtype B clades converge to the roughly similar median growth rate (R e = 1.6-2.3) at the most recent time period analyzed (2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018), with no evidence of recent epidemic stabilization (R e > 1) ( Figure 4F).

DISCUSSION
Previous studies demonstrate that the expanding HIV-1 subtype B epidemic in the Northern Brazilian state of Amazonas was driven by both pandemic (B PAN ) and non-pandemic (B CAR ) viral variants Divino et al., 2016;Arantes et al., 2019;Crispim et al., 2019;Chaves et al., 2021;Gräf et al., 2021), thus creating a great opportunity to compare the epidemic dynamics of both subtype B forms spreading in   the same population. This study revealed that the B PAN and B CAR epidemics in Amazonas have been shaped by different evolutionary histories, but displayed very similar transmissibility and expansion dynamics at most recent times. Our analysis confirmed that variants B CAR and B PAN were introduced multiple times in the Amazonas state, although the estimated number of B PAN introductions was 16 times higher than that of B CAR . The founder event that originated the clade B CAR−BR−I occurred in the late 1970s (Divino et al., 2016;Arantes et al., 2019) and gave rise to 89% of B CAR and 19% of total subtype B infections in Amazonas. In sharp contrast, most B PAN sequences from Amazonas branched into multiple state-specific clusters of medium/small size (57%) or appeared as unclustered infections (19%). The four largest B PAN Amazonian clades identified probably arose between the late 1980s and late 1990s and, together, comprise 23% of the B PAN and 17% of all subtype B infections in the state. These findings support that the early introduction (late 1970s) of the B CAR−BR−I ancestor in Amazonas from neighboring Caribbean countries probably drove its successful establishment and wide dissemination in the state. Although the B PAN strains arrived in Amazonas later, they reached a high prevalence because they were introduced at much higher numbers and spread through more transmission networks than B CAR strains.
The AIDS detection rate increased ∼10% in the Amazonas state between 2009 and 2019 (Ministério da Saúde, 2020). This finding is consistent with our BDSKY analyses that support a continuous expansion (R e > 1) of major B CAR and B PAN Amazonian clades over all the studied period. The BSKG model indicates a recent stabilization of some Amazonian B PAN clades since the late 2000s, and a previous study conducted by our group also indicated a recent epidemic stabilization of the clade B CAR−BR−I since the late 2000s . Although the median estimated R e of the B CAR and B PAN Amazonian clades was somewhat lower between 2010 and 2018 (1.6-2.3) than during the previous decades (2.5-3.4), we found no solid evidence of epidemic stabilization or reduction in the BDSKY analyses. A previous study pointed out that the BSKG model requires strongly informative data to prevent erroneous estimates of N e stabilization (Volz and Didelot, 2018). Thus, we hypothesize that the much larger number of recent (2009-2018) B CAR−BR−I sequences used in the present study (n = 267) compared to the previous one (n = 45) allowed us to obtain a more accurate demographic reconstruction of the epidemic pattern in the last two decades.
It is interesting to note that the B CAR (2.5-2.6) and B PAN (2.9-3.4) Amazonian clades reached similar highest median R e values. Furthermore, the highest median R e estimated here using a birth-death approach was comparable to the previous ones estimated for the B CAR−BR−I clade (3.8) using a coalescent-based approach , but lower than those estimated for major B PAN Brazilian lineages spreading in the Southeastern region (5.0-7.9) (Mir et al., 2015). These findings support that differences in the spreading dynamics of subtype B lineages may reflect discrepancies in the connectivity of underlying transmission networks across different Brazilian states/regions, rather than intrinsic differences in viral transmissibility. A preliminary analysis of the available demographic data (sex and age) of HIV-infected subjects from Amazonas revealed no significant differences between major B CAR and B PAN clades (Supplementary Table 2), supporting that both viral lineages are possibly spreading through networks with similar epidemiological properties. This observation is also consistent with a previous study that revealed comparable epidemic growth rates of B CAR and B PAN lineages circulating in the French Guiana (Bello et al., 2018).
Our study has some limitations. First, inferences about potential sources, total number of viral introductions, and local clade size in Amazonas were limited by both the incomplete sampling of local population and the limited number of non-Amazonian reference sequences included in each ML phylogenetic analysis as revealed by the variable phylogenetic and phylogeographic placement of some Amazonian sequences that branched basal to each local clade. The bulk of B CAR and B PAN sequences that compose each major Amazonian clade, however, remained constant across analyses, and our major phylogeographic conclusions were robust to sampling bias. Second, time-scale reconstructions were largely influenced by the selected clock rate prior due to the weak temporal structure of Amazonian HIV-1 subtype B pol datasets. Despite this, the T MRCA here obtained were fully consistent with the overall time-scale of dissemination of the HIV-1 B CAR and B PAN lineages in the Americas and Brazil described in previous studies (Gilbert et al., 2007;Cabello et al., 2014;Mir et al., 2015;Worobey et al., 2016;Arantes et al., 2019;Bello et al., 2019). Finally, the lack of epidemiological data regarding the mode of transmission of individuals analyzed reduced the power of our study to confirm any association between the inferred rate of viral spread and the ecological characteristics of local transmission networks in Amazonas.
In summary, this study highlights that the HIV-1 epidemic in the Amazonas state mostly results from the local expansion of one B CAR strain (B CAR−BR−I ) introduced around the late 1970s and of multiple B PAN viral strains introduced since the late 1980s. Albeit the earlier introduction of the B CAR−BR−I clade granted a much prolonged period of local spread than that of the B PAN strains, this was compensated by a much higher number of independent introductions and the concurrent establishment of multiple B PAN local transmission networks. Despite significant differences in the pattern of early establishment, major B CAR and B PAN clades circulating in Amazonas have been spreading at a quite similar rate over the last two decades, arguing against the hypothesis of significant differences in their intrinsic transmissibility. Our analyses also demonstrate that major Amazonian B CAR and B PAN clades continued to spread and showed no clear signs of recent epidemic stabilization, supporting the relevance of designing more effective strategies to prevent HIV transmission in the region.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: GenBank accession numbers: KEXV01000001 to KEXV01046877; HQ127524 to HQ127607; KU762066 to KU762066; MH673055 to MH673280; and MW545333 to MW545424; Los Alamos HIV Sequence Database (http://www. hiv.lanl.gov).

ETHICS STATEMENT
Ethical review and approval were not required for the study on human participants in accordance with the Local Legislation and Institutional Requirements. Written informed consent for participation was not required for this study in accordance with the National Legislation and the Institutional Requirements.

AUTHOR CONTRIBUTIONS
GB conceived and designed the study and supervised the experiments. IA conducted the experiments and analyzed the data. YO, TG, and MG provided HIV-1 sequence data and intellectual input. IA and GB wrote the first draft of the manuscript. All authors assisted with the writing and approved the final manuscript.