Human Immunodeficiency Virus-1 Diversity in the Moscow Region, Russia: Phylodynamics of the Most Common Subtypes

This study analyzes the HIV-1 subtype diversity and its phylodynamics in Moscow region, which is the most densely populated area of Russia characterized by high rates of internal and external migration. The demographic and viral data from 896 HIV-infected individuals collected during 2011–2016 were analyzed. The study revealed broad diversity in the HIV-1 subtypes found in Moscow, which included A6 (85.1%), B (7.6%), CRF02_AG (1.2%) and URF_A6/B recombinants (4.2%). Other HIV-1 subtypes were detected as single cases. While A6 was most prevalent (>86.0%) among heterosexuals, injecting drug users and cases of mother-to-child transmission of HIV, subtype B (76.3%) was more common in men who have sex with men. Phylogenetic reconstruction revealed that the A6 sequences were introduced into the epidemic cluster that arose approximately around 1998. Within the subtype B, six major epidemic clusters were identified, each of which contained strains associated with only one or two dominant transmission routes. The date of origin of these clusters varied between 1980 and 1993, indicating that the HIV-1 B epidemic began much earlier than the HIV-1 A6 epidemic. Reconstruction of the demographic history of subtypes A6 and B identified at least two epidemic growth phases, which included an initial phase of exponential growth followed by a decline in the mid/late 2010s. Thus, our results indicate an increase in HIV-1 genetic diversity in Moscow region. They also help in understanding the HIV-1 temporal dynamics as well as the genetic relationships between its circulating strains.


INTRODUCTION
HIV-1 is a rapidly evolving human pathogen, characterized by significant genetic diversity. The pandemic group M is subdivided into nine subtypes (A-D, F-H, J, K), each of which have been further divided into sub-subtypes (A1-A4, A6, F1-F2), and multiple inter-subtype recombinants (CRFs) (Foley et al., 2016). The global distribution of HIV subtypes is heterogeneous and varies significantly even within the same continent/country (Hemelaar et al., 2011;Bobkova, 2013).
The HIV epidemic started in Russia in the mid-1990s with an outbreak of the subtype A infection among intravenous drug users (IDUs) (Bobkov et al., 1998). As per the current nomenclature, the viral variant that caused the first epidemic and accounted for up to 90% of the infections is the sub-subtype A6 (formerly FSU-A or IDU-A) (Foley et al., 2016). During the following years, the HIV epidemic spread further affecting heterosexuals (HSX) and men who have sex with men (MSM). Ten years later sexual transmission accounted for about 37.3% of the new cases (FedAC, 2010).
During the last few years, the recombinant CRF063_02A1 has been responsible for more than 71.0% of the new HIV infections among HSXs and IDUs in Central Siberia (Gashnikova et al., 2015). Such changes in the prevalence of HIV variants in Russia are likely to affect the state of infection in Moscow region, which is characterized by a high population density and high rates of internal and external migration. Moscow located in the European part of Russia is a highly developed transport hub, for communications both within and outside Russia. The first HIV infection in Moscow region (not Moscow city) was reported in 1994; by the end of 2018 the number of people living with HIV (PLHIV) in this area was estimated to be 41,949 and the prevalence among adults has reached 0.7%.
There have been few studies on HIV diversity in Moscow region. While sub-subtype A6 has been described as the causative agent of the first HIV outbreak among IDUs probably introduced from St. Petersburg (Bobkov et al., 2001;Diez-Fuertes et al., 2015), by 2010 no significant changes were reported in the spectrum of circulating HIV subtypes [A6 (93.4%) and B (6.6%)] in the region (Giliazova et al., 2010). However, to date, there has been no information regarding the evolutionary history and population dynamics of HIV in Moscow region.
In this study, we aim to analyze the HIV-1 genetic diversity and reconstruct the temporal dynamics of its most common subtypes in Moscow region, using phylogenetic and phylodynamic approaches.

Study Population
The study involved 896 patients who were diagnosed with HIV infection and received antiretroviral therapy (ART) between 2011 and 2016 at the Moscow RCAC which is 2.1% of PLHIV in this area. Three criteria for formation of study population have been applied: (1) well-documented history of HIV-infection in patients, (2) permanent residency in Moscow region, and (3) informed consent for the research study.

Genotyping and Dataset
Plasma samples were genotyped using the ViroSeq HIV-1 Genotyping System (Abbott, USA). Multiple sequence alignments were made using MAFFT (Katoh et al., 2017)The total length of the alignment was 1299 nucleotides which covered the entire protease and partial reverse transcriptase (positions 2253-3551, HXB2-numbering). HIV-1 subtyping was performed using the COMET HIV-1 (Struck et al., 2014) and REGA Subtyping Tool v3.0 (Pineda-Pena et al., 2013) and subsequently confirmed by phylogenetic analysis. New inter-subtype or inter-CRF sequences were analyzed using jpHMM (Schultz et al., 2012). Moscow region sequences for A6 (N = 5) and B (N = 9) (2007-2008 sampling years) from the Los Alamos HIV-1 database (https://www.hiv.lanl.gov) were additionally included in the analysis. The epidemiological clusters were identified in the initial dataset of A6 (N = 768) and B (N = 77) subtypes. The evolutionary population dynamics was estimated in the optimized dataset by excluding (1) presence of more than 1% of ambiguous bases, (2) sequences with especially high evolutionary rates and (3) potential late chronic infections by AIDS symptoms. Finally, 320 and 71 sequences for A6 and B subtypes, respectively were selected. All codons associated with major drug-resistance mutations (Johnson et al., 2011) were removed from the final sequence alignment.

Phylogenetic Reconstruction and Phylodynamic Analysis
Selecting the best-fit model of nucleotide substitution was performed using jModelTest v.2.1.4 (Darriba et al., 2012). According to the Akaike information criterion (AIC), the best model for each dataset was General Time Reversible model with proportion of invariable sites and gamma-distributed rate variation among sites (GTR+I+G).The epidemiological clusters were identified using the maximum-likelihood phylogenetic analysis implemented by the IQ-TREE (Nguyen et al., 2015) with 1000 replicates for bootstrap and Shimodaira-Hasegawa (SH)-aLRT test. The clusters with an SH-aLRT support >0.9 were considered reliable.
The time of the most recent common ancestor (TMRCA) and the effective population size was estimated employing the Bayesian Markov Chain Monte Carlo (MCMC) approach using the BEAST v1.10.0 (Suchard et al., 2018). The temporal scale of the evolutionary process was inferred from the sampling dates of the sequences using a strict molecular clock model and the Bayesian Skyline coalescent tree prior. Comparing molecular clock models (as well as demographic models) in Bayesian framework were performed by calculating the AIC for MCMC with Tracer v1.7.1. Two independent MCMC chains were run of 50-500 × 10 6 steps with logging every 2000 generations, excluding first 25%. Convergence of the chains was estimated based on the Effective Sample Size in Tracer v1.7.1. The parameter estimates with ESS > 200 were accepted. We summarized the maximum clade credibility trees from the posterior distribution of trees using Tree Annotator and visualized them in FigTree v1.4.0. The datasets were analyzed with TempEst (Rambaut et al., 2016) for evaluating the temporal signal sufficiency.

Statistical Analysis
Statistical analysis was performed using STATISTICA v10.0 (StatSoft, USA) with discrete categorical data and the Pearson's chi-squared (or Fisher's extact) test. Differences were considered significant at P < 0.05.

Study Population
We evaluated 896 HIV-1 infected patients with a median age of 35 years (range 1-67). The proportion of men and women was about the same (58.5 and 41.5%, respectively). The dominant transmission routes were heterosexual contact (49.8%), intravenous drug use (39.8%), mother-to-child transmission (MTCT, 5.5%) and MSM contact (4.2%). The demographic characteristics of patients grouped by subtypes are summarized in Table 1.
Based on sequences identified by BLAST searches and phylogenetic analysis, sub-subtype A1 was found to be related to the East African strains (Uganda/Rwanda), subtype F1 to Angola and Romania, subtype C to the India/China Cstrains; subtype G to the Portuguese/ Spanish G-strains and CRF01_AE to the Philippines AE-strains. Sub-subtype A6, CRF03_AB, CRF63_02A1, and CRF02_AG viruses were related to strains circulating in the former Soviet Union (FSU) countries. While most of the subtype B sequences (N=64) belonged to the Pandemic (≪Western-B≫) variant, four sequences belonged to the FSU-B variant, previously identified among IDUs in Ukraine (Nabatov et al., 2002) (Figure 1).

Identification of HIV-1 Subtypes A6 and B Epidemic Clusters, and TMRCA Estimation
Maximum-likelihood and Bayesian phylogenetic analyses identified several clusters/subclusters of HIV-1 subtypes A6 and B (Figure 2 and  for Cluster 1 and subclusters 1 and 2 were estimated to be 1998.3 [1996.4-1999.9] -2000.4 [1999.2-2001.5]. Analysis of subtype B sequences identified six clusters (1-6) including 64 (83.2%) sequences. Unlike A6, the genetic structuring of subtype B was to some extent dependent on the transmission route. The smallest cluster (cluster 4) contained ∼5% of the B sequences obtained mainly from IDUs. While the sequences in cluster 3 (largest, ∼25% sequences) were from IDUs and MSM-individuals (in equal proportion), they were predominantly from MSM in clusters 4 and 5, and from HSX-individuals in cluster 2 and 6. The TMRCA for clusters 1-6 were estimated to be 1980.6 [1977.4-1995.3] -1993.1[1986.6-1999.7] (Figure 2 and Table 2).

DISCUSSION
This is the most extensive study in Moscow region to date, devoted to understanding the diversity and temporal dynamics of the common HIV-1 subtypes. We have confirmed that subsubtype A6 (85.1%) is still the major HIV-1 subtype in the region (Giliazova et al., 2010) (as elsewhere in Russia) (Kazennova et al., 2011;Lapovok et al., 2017), followed by subtype B at 7.6%. The A6 subtype predominates in HSXs and IDUs while subtype B is the most prevalent variant in MSM group. This unequal distribution repeats the molecular pattern of the epidemic in Russia as a whole. Apparently, the ≪ founder effect≫ and the relative social isolation/persistence of individual risk groups (at least at the beginning of the epidemic) are responsible for this phenomenon. We also observed a significant spread (4.2%) of unique recombinants between the two main subtypes, with at least 11 sequences distinct from the currently described A6/B recombinants. We are planning further studies with these recombinants, including full-length genome analysis to determine if they are new circulating forms. The CRF02_AG is the fourth most common HIV variant in the region, which is not surprising given the high level of labor migration from Central Asia where this subtype is widespread. Finally, we have identified at least seven other HIV-1 subtypes, which points to the increasing genetic complexity of the HIV-1 epidemic in the region. We also investigated the emergence of HIV-1 A6 and B epidemics and their growth rates in Moscow region using the molecular clock. According to our estimates, A6 sub-subtype was introduced into the region around 1998 and marked the onset of massive epidemic (Figure 2). These findings are in good agreement with other seroepidemiological studies (Bobkov et al., 2001). Subsequently, the divergent evolution of A6 strains and possible existence of several independent transmission networks apparently led to the formation of two epidemic sub-clusters, identified in this study. The study of the genetic relationships between HIV-1 Bstrains revealed that the spread of the subtype B in Moscow region involved at least six viral lineages that arose between 1980 and 1993 (Figure 2). Clusters 1-3, 5, and 6, which also belong to the Pandemic subtype B were responsible for 70.1% of the sexually transmitted infections and played a dominant role in the B-epidemic in Moscow region among MSM and HSXs. On the other hand, the FSU-B strain (Cluster 4) was responsible for infections in IDUs.
Though we are confident that HIV-1 subtype B penetrated into the region much earlier than A6, the origin of these clusters deserves further investigation. The growth pattern of the B epidemic is similar to that of A6 and represents the combined population dynamics of different subtype B-clusters. With an initial phase of rapid growth (not as rapid as for subtype A6), it appears that subtype B encountered less favorable conditions for local expansion, such as a slower rate of sexual transmission (Maljkovic Berry et al., 2007). The implementation of ART among all population groups could account for the slowdown in the epidemic since the mid-2010s.
The study can contribute to a better understanding and assessment of the social and biological driving forces behind the development of the epidemic process in one of the key regions for the HIV epidemic in Russia, as well as predicting of future trends of HIV infections and suggesting effective preventive measures.

ETHICS STATEMENT
This study was approved by the local Ethics Committee of the Moscow Regional AIDS Centre (RCAC). Informed consent was signed by all the subjects.

AUTHOR CONTRIBUTIONS
AL and MB planned and designed the study. NL and AP collected the epidemiological data and nucleotide sequences. FM performed the recombination analysis and submitted sequences. EK performed the analysis of the epidemiological data. AL performed the phylogenetic and phylodynamic analysis, produced the illustrations and wrote the manuscript. MB supervised the project and edited the manuscript. All authors participated in the critical review of this manuscript.