Global associations of key populations with HIV-1 recombinants: a systematic review, global survey, and individual participant data meta-analysis

Introduction: Global HIV infections due to HIV-1 recombinants are increasing and impede prevention and treatment efforts. Key populations suffer most new HIV infections, but their role in the spread of HIV-1 recombinants is unknown. We conducted a global analysis of the associations between key populations and HIV-1 recombinants.

Methods: We searched PubMed, EMBASE, CINAHL, and Global Health for HIV-1 subtyping studies published from 1/1/1990 to 31/12/2015. Unpublished data was collected through a global survey. We included studies with HIV-1 subtyping data of key populations collected during 1990-2015. Key populations assessed were heterosexual people (HET), men who have sex with men (MSM), people who inject drugs (PWID), vertical transmissions (VERT), commercial sex workers (CSW), and transfusion-associated infections (BLOOD). Logistic regression was used to determine associations of key populations with HIV-1 recombinants. Subgroup analyses were performed for circulating recombinant forms (CRFs), unique recombinant forms (URFs), regions, and time periods.

Results: Eight hundred and eighty five datasets including 77,284 participants from 83 countries were included. Globally, PWID were associated with the greatest odds of recombinants and CRFs (OR 2.6 [95% CI 2.46–2.74] and 2.99 [2.83–3.16]), compared to HET. CSW were associated with increased odds of recombinants and URFs (1.59 [1.44–1.75] and 3.61 [3.15–4.13]). VERT and BLOOD were associated with decreased odds of recombinants (0.58 [0.54–0.63] and 0.43 [0.33–0.56]). MSM were associated with increased odds of recombinants in 2010–2015 (1.43 [1.35–1.51]). Subgroup analyses supported our main findings.

Discussion: As PWID, CSW, and MSM are associated with HIV-1 recombinants, increased preventative measures and HIV-1 molecular surveillance are crucial within these key populations.

Systematic review registration: PROSPERO [CRD42017067164].


1. Introduction
In 2021, 38.4 million people were living with HIV worldwide and 1.5 million people became newly infected (1). The HIV pandemic remains a major global health challenge, and its extreme global genetic diversity impedes treatment and prevention efforts (2). Global temporal analysis indicates that the HIV-1 pandemic is diversifying, with increases in both the numbers of distinct HIV-1 variants and proportions of recombinant strains (3-5). Increasing diversity impacts HIV diagnosis and treatment, drug resistance, viral load measurement, transmission, disease progression, immune responses, and vaccine development (2, 6-10).
After zoonotic transmission from chimpanzees to humans in Central Africa around 1900, the HIV-1 group M epidemic rapidly diversified into distinct subtypes, designated by the letters A-D, F-H, and J-L (11,12). HIV-1 subtypes spread across the globe throughout the 20th century, resulting in HIV-1 subtype distributions that greatly vary by region (3,13). The genetic complexity of the HIV pandemic continues to increase over time, largely driven by the high mutation and recombination rates of the error-prone reverse transcriptase enzyme (14). Recombination occurs when an individual is co-infected with multiple strains which combine into a new variant (15). The resulting variants are designated as circulating recombinant forms (CRFs) or unique recombinant forms (URFs). CRFs, which are characterized by community spread, must be fully sequenced and found in at least three epidemiologically unlinked individuals. More than 120 distinct CRFs have been described to date, and more CRFs continue to be identified (16). URFs are unique recombinant sequences without evidence of onward transmission. The proportion of recombinants has been increasing over time, both globally and in most regions, and recombinants now constitute nearly a quarter of all HIV-1 infections globally (4). In addition to increasing the genetic complexity of the HIV pandemic, recombination may confer an evolutionary advantage, leading to altered transmission and/or virulence (17,18).
In 2021, 70% of new HIV infections occurred within key populations and their sexual partners, though these populations account for <5% of the global population (1). It is estimated that men who have sex with men (MSM) have 28 times the risk of HIV infection relative to heterosexual (HET) adult men, female commercial sex workers (CSW) have 30 times the risk relative to other adult women, and people who inject drugs (PWID) have 35 times the risk compared to those who do not inject drugs (1). Additionally, people in areas without comprehensive blood screening are particularly vulnerable to HIV infection through transfusions with infected blood (BLOOD) (19), and children born to mothers with HIV can become infected via vertical transmission (VERT) during pregnancy, labor, delivery, or breastfeeding (20). Prior work indicates that HIV can follow a chain of transmission among these groups, spreading from PWID to CSW who transmit the virus to their HET clients. The virus can then be transmitted to the client's female sexual partner before VERT transmission of HIV infection to children (20,21). Transmission among MSM and during blood transfusions has also played a major historical role in the spread of HIV, particularly across Asia, Europe, and North America (19,21,22). Though these key populations are known to play a role in HIV transmission, it is unclear what role they play in the spread of HIV-1 recombinants. Since these populations often face difficulties accessing HIV services and have an increased risk of infection (1), potentially by multiple strains, they may be more likely to develop novel HIV strains. These recombinant strains may cross from key populations into the general population, making the overall HIV epidemic more complex.
The global proportion of HIV infections with recombinants is increasing and key populations globally account for most new HIV infections. However, there is an evidence gap regarding the global association of key populations with HIV-1 recombinants. To address this gap, we conducted a global analysis of the association between multiple key populations and HIV-1 recombinants using the largest global HIV-1 molecular epidemiology database assembled to date.
2. Materials and methods

2.1. Data collection
Data on the global distribution of HIV-1 subtypes and recombinants among key populations were obtained through a systematic literature review (PROSPERO: CRD42017067164), review of specialist journals and reports, and global survey of experts (3). We searched PubMed (29,825 citations retrieved), Embase (Ovid) (25,914 citations), CINAHL (Ebscohost) (451 citations), and Global Health (Ovid) (9,707 citations) for studies reporting HIV-1 subtyping data published from Jan 1, 1990 to Dec 31, 2015. This time period covers the period for which reliable estimates of national HIV prevalence were available. Search terms were Medical Subject Headings (MeSH) and Emtree terms, free text words, and synonyms, including "HIV," "Subtype," "recombinant," "CRF," and "URF" (Appendix pp2-5). No language or methodology filters were used. All references retrieved were combined in Endnote reference manager, and duplicates removed (Endnote X9; Clarivate Analytics, Philadelphia, PA). Authors RE, JY, LD-T and JH screened titles and abstracts, retrieved relevant full text articles, and assessed articles against the eligibility criteria. Additional published data were derived from the WHO HIV Drug Resistance Report 2012 (23), published reviews and reports on HIV diversity, and papers indexed on Scopus that referenced previous publications on global HIV-1 diversity (Appendix pp6-8). Additionally, four specialist journals (AIDS, Journal of AIDS, Journal of Virology, AIDS Research and Human Retroviruses) were screened for relevant articles published between January 1990 and February 2016. Using a data collection template, unpublished original HIV-1 subtyping data was collected through a global survey of members in the WHO-UNAIDS Network for HIV Isolation and Characterisation.

2.2. Eligibility criteria and data extraction
Studies were eligible for inclusion if they were prevalence studies of key populations living with HIV with original HIV-1 subtyping data, known country and year of sample collection, and a minimum of 20 participants. Studies that only contained incident infections or untyped samples were excluded. Full-length genomes or any genome segment could be used for subtyping, no minimum sequence length was specified, all online subtyping tools were accepted, and subtyping data from each included dataset was assumed to be correct. Authors RE, JY, LD-T, and JH extracted the following information for each data set: country, city or region, sample collection year(s), study type, key population, HIV-1 subtyping method(s), and genome segment(s) analyzed. The primary outcome was the number of each key population designated by the original authors as each HIV-1 subtype (A, B, C, D, F, G, H, J, K), CRFs, and URFs. Country designation was based on where samples were taken. One subtype/CRF/URF was assigned to each participant. Subtyping methods included sequencing, heteroduplex mobility assay, and serotyping. The vast majority of data was acquired by sequencing (100% in 2010-2015), mostly of partial genome sequences, mainly pol (94.4% in 2010-2015) (3). Contributing researchers were assumed to have obtained consent from participants, and no personal identifiable information was retrieved. Formal assessment of individual study quality was not performed. Discrepancies were resolved by the senior reviewer (JH).

2.3. Key populations
Based on the populations specified by each study, participants were categorized as heterosexual (HET), men who have sex with men (MSM), people who inject drugs (PWID), vertical transmissions (VERT), commercial sex workers (CSW), and transfusion-associated infections (BLOOD) by author NN and confirmed by JH. Studies involving multiple key populations were assigned to the key population comprising at least 95% of data or excluded if no single key population met the 95% threshold. Studies with unspecified or indeterminate key populations were excluded. Any discrepancies or ambiguities were resolved by JH.
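The 95% assignment rule described above can be illustrated with a minimal Python sketch. The function name `assign_key_population` and the per-study count dictionary are hypothetical and are not taken from the study's analysis; the sketch only assumes that each study reports participant counts per key population.

```python
from typing import Dict, Optional


def assign_key_population(counts: Dict[str, int],
                          threshold: float = 0.95) -> Optional[str]:
    """Return the key population that makes up at least `threshold` of a
    study's participants, or None if no single population reaches it
    (such a study would be excluded).

    `counts` maps a key-population label (e.g. "HET", "MSM") to the number
    of participants in that population; this input format is an assumption.
    """
    total = sum(counts.values())
    if total == 0:
        return None
    # Pick the largest group and check its share of the study.
    group, n = max(counts.items(), key=lambda item: item[1])
    return group if n / total >= threshold else None


# Hypothetical studies, for illustration only.
mostly_msm = assign_key_population({"MSM": 96, "HET": 4})    # assigned to MSM
mixed = assign_key_population({"MSM": 60, "HET": 40})         # excluded
```

Studies with unspecified or indeterminate populations are not covered by this sketch, since the text indicates these were excluded before any threshold was applied.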

2.4. Meta-analysis
As most studies provided data on a single key population, one-stage meta-analysis of individual-participant data of different studies was performed. For logistic regression, HIV-1 variants were categorized as "Subtype" or "Recombinant" (CRF/URF). A univariate binomial logistic regression model was constructed to analyse the global association of each key population with HIV-1 recombinants. To assess the global association of key populations with CRFs and URFs separately, logistic regression was repeated using a multinomial model.
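When a binomial logistic model contains only a categorical key-population predictor with HET as the reference, the odds ratio for one key population coincides with the cross-product ratio of the corresponding 2x2 table, and a Wald 95% confidence interval can be formed on the log scale. The Python sketch below shows that calculation under this assumption. The function name and the counts are hypothetical, and this is not the code used for the reported analysis.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int,
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: recombinant infections in the key population of interest
    b: "pure" subtype infections in the key population of interest
    c: recombinant infections in the reference population (e.g. HET)
    d: "pure" subtype infections in the reference population
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not study data).
or_, lo, hi = odds_ratio_wald(a=50, b=50, c=25, d=75)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant association at roughly the 5% level. The multinomial model for CRFs and URFs would need a separate fit and is not sketched here.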
Countries were grouped into 14 regions (Appendix p9) and data were assigned to four periods: 1990-1999, 2000-2004, 2005-2009, and 2010-2015. All participants in each dataset were assigned to periods based on the midpoint year of the reported sample collection period. Datasets whose sampling years were evenly split between two periods (e.g., 2003-2006) were excluded from time-stratified analyses. To assess temporal and geographic differences in the associations with recombinants, the binomial logistic model was separately stratified into subgroups by time period and region. For all logistic models, "Subtype" was used as the reference group. Odds Ratios (ORs) were reported with 95% confidence intervals. In the Appendix, pairwise ORs are reported globally and for each region (pp11-13) and period subgroup analyses were repeated for the multinomial logistic regression model (p14). Statistical analyses were performed using STATA 17.0 (StataCorp LLC, College Station, TX). This systematic review is reported according to the PRISMA guidelines, as applicable.
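The midpoint rule for assigning datasets to periods can be sketched in Python as follows. This is one possible reading of the rule, not the authors' implementation: the helper names are hypothetical, and a dataset is treated as "evenly split" when its midpoint falls exactly between two calendar years that belong to different periods.

```python
from typing import Optional

# The four analysis periods named in the text (inclusive year ranges).
PERIODS = [(1990, 1999), (2000, 2004), (2005, 2009), (2010, 2015)]


def period_of(year: int) -> Optional[str]:
    """Return the label of the period containing `year`, or None."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    return None


def assign_period(first_year: int, last_year: int) -> Optional[str]:
    """Assign a dataset to a period by the midpoint of its sampling years.

    Returns None when the midpoint straddles two different periods, in
    which case the dataset would be left out of time-stratified analyses.
    """
    total = first_year + last_year
    lower = total // 2
    upper = lower + (total % 2)  # equals `lower` when the midpoint is a whole year
    lower_period, upper_period = period_of(lower), period_of(upper)
    return lower_period if lower_period == upper_period else None


print(assign_period(2001, 2003))  # midpoint 2002
print(assign_period(2003, 2006))  # midpoint 2004.5, between two periods
```

Under this reading, a dataset sampled in 2003-2006 returns `None` and is dropped only from the time-stratified analyses; it would still contribute to the global and regional models.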

3. Results

3.1. Data collection
A total of 885 datasets including 77,284 participants from key populations from 83 countries were included (Figure 1). The systematic literature search yielded 208 datasets comprising 23,988 participants. Six hundred and seventeen datasets with 48,984 participants were collected from the global survey, and 60 datasets with 4,312 participants were obtained from other published sources.
Most included participants were heterosexual people (58.2%) and MSM (25.7%), with smaller proportions of participants representing PWID (8.1%), VERT (5.3%), CSW (2.2%), and BLOOD (0.6%) (Table 1). Data from the HET population was the largest in each time period, while there was no data for CSW in the most recent period. Most participants were derived from Western and central Europe, and North America (WCENA) followed by East, West, and Southern Africa. HET data was available in every geographic region while all BLOOD participants were derived from East Asia and WCENA. HET was selected as the reference group as it was the only population represented in all regions and time periods.

3.2. Global association of key populations with recombinants
The global distribution of HIV-1 subtypes, CRFs and URFs among key populations in 1990-2015 is shown in Table 2. During 1990-2015, the largest proportion of recombinant infections was found among PWID (52.8%) and CSW (40.5%), followed by HET (30.1%), MSM (29.4%), VERT (19.9%), and BLOOD (15.6%). PWID had the highest proportion of CRFs (49.1%) and CSW had the highest proportion of URFs (17.6%). The proportion of recombinant infections grew consistently across periods for HET, MSM, and BLOOD and across the first three periods for PWID. Global proportions of individual CRFs for each key population are included in the Appendix p16.
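As a rough consistency check, crude odds ratios relative to HET can be computed directly from the recombinant proportions quoted above. The Python sketch below uses only those percentages; the variable names are illustrative, and the calculation ignores sampling error, so it does not reproduce the confidence intervals from the fitted models.

```python
# Proportion of recombinant infections per key population, 1990-2015,
# as quoted in the text above.
recombinant_share = {
    "PWID": 0.528,
    "CSW": 0.405,
    "HET": 0.301,
    "MSM": 0.294,
    "VERT": 0.199,
    "BLOOD": 0.156,
}


def odds(p: float) -> float:
    """Convert a proportion into odds."""
    return p / (1.0 - p)


reference_odds = odds(recombinant_share["HET"])

# Crude odds ratio of recombinant infection for each population vs. HET.
crude_or = {
    group: odds(share) / reference_odds
    for group, share in recombinant_share.items()
    if group != "HET"
}

for group, value in crude_or.items():
    print(f"{group}: crude OR vs HET = {value:.2f}")
```

The crude ratios land close to the odds ratios reported for the global binomial model (for example about 2.6 for PWID and about 0.97 for MSM), which is what one would expect if the model contains key population as its only predictor. Small differences are attributable to rounding of the published percentages.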

FIGURE 1
Data collection flowchart. * For example, HIV-positive immigrants only. † For example, data only provided for subtype B and non-B participants. ‡ For example, subtypes referred to disease states, not HIV subtypes. § For example, HIV subtyping data could not be assigned to a specific key population, as multiple key populations were present in the study.
association with recombinants relative to HET [0.97 (0.93-1.01)] (Figure 2A; Table 3). Relative to HET, independent associations of each key population with CRFs and URFs varied substantially (Figure 2A; Appendix p15). PWID were significantly associated with increased odds of CRFs [2.99 (2.83-3.16)], while CSW were significantly associated with increased odds of URFs [3.61 (3.15-4.13)]. VERT was associated with decreased odds of CRFs
Relative to HET, the strength of associations with recombinants across time differed by key population (Figure 2B; Appendix pp14, 15). PWID were associated with increased odds of recombinants across all periods. CSW were initially associated with increased odds of recombinants, but the strength of the association decreased with time before leading to decreased odds in the 2005-2009 period. VERT alternated from increased to decreased odds of recombinants across time.

TABLE 2
Global distribution of HIV-1 subtypes, CRFs, and URFs within key populations in 1990-2015 and each of four time periods (1990-1999, 2000-2004, 2005-2009, 2010-2015).

* Total CRFs is the sum of CRF01_AE, CRF02_AG, and Other CRFs. † Total recombinants is the sum of total CRFs and URFs. ‡ Total is the sum of total recombinants and all HIV-1 subtypes. BLOOD, blood/plasma transfusion associated infections; CRF, circulating recombinant form; CSW, commercial sex workers; HET, heterosexual; MSM, men who have sex with men; PWID, people who inject drugs; URF, unique recombinant form; VERT, vertical transmission (mother to child).

FIGURE 2
No data on recombinants was available for BLOOD in -, and no data was available for CSW in -. Error bars represent the 95% confidence intervals. Square areas are proportional to the number of participants in each key population analyzed. Odds ratios and 95% CI are provided in the Appendix p (*P < . , **P < . ). BLOOD, blood/plasma transfusion associated infections; CRF, circulating recombinant form; CSW, commercial sex workers; HET, heterosexual; MSM, men who have sex with men; PWID, people who inject drugs; URF, unique recombinant form; VERT, vertical transmission (mother to child).

3.3. Regional association of key populations with recombinants
The regional distribution of HIV-1 subtypes, CRFs, and URFs for each key population is included in the Appendix pp17-19. The association of key populations with recombinants varied by region (Table 3). Compared to HET, PWID had the greatest odds of recombinants in Eastern Europe and central Asia [EECA; ], followed by Latin America [4.16 (3.02-5.74)] and East Asia [3.20 (2.50-4.09)]. PWID were significantly associated with CRFs in ] and East Asia [3.26 (2.55-4.18)], and both CRFs and URFs in Latin America .
BLOOD was associated with decreased odds of recombinants in East Asia

4. Discussion
A strong association between PWID and recombinants and CRFs was observed globally across all periods and in most regions. Only in SE Asia, where CRF01_AE has a prevalence of ∼70-80% (3), were PWID associated with decreased odds of recombinant strains. The prevalence of recombinant epidemics among PWID in most regions, where "pure" HIV-1 subtypes are typically the most prevalent overall (3), and subtype-based epidemics among PWID in SE Asia, where recombinant strains are highly prevalent, indicates that HIV-1 circulates among PWID via transmission networks distinct from the HET population. This finding extends previous studies suggesting that HIV is transmitted among independent PWID networks across multiple continents (24,25). Furthermore, the association with recombinants across all periods indicates that PWID play a major role in the global diversification of HIV-1.
CSW were associated with increased odds of recombinants and URFs, particularly in the periods 1990-1999 and 2000-2004. Across East Africa and Latin America, CSW were significantly more likely to be infected with URFs than the HET population. These findings highlight that novel HIV strains frequently arise within the CSW population. URFs arise independently and lack evidence of transmission, minimizing the likelihood that the observed association is due to reverse causation. The decreased odds of CRFs in West Africa may be related to the high prevalence of CRF02_AG (3), similarly to the case of PWID and CRF01_AE in SE Asia. Additional data is required to identify factors contributing to the diminishing global association of CSWs with recombinants across time.
Though VERT was associated with decreased odds of recombinants and CRFs, results greatly varied across times and regions. While biological differences in recombinant strains may cause increased rates of vertical transmission relative to "pure" subtypes (26), high levels of heterogeneity indicate that VERT is not a major driver of increasing HIV-1 diversity.
BLOOD was associated with decreased odds of recombinants and CRFs in both East Asia and WCENA. Particularly in East Asia, where CRF01_AE is highly prevalent, blood transfusion recipients were significantly less likely to have a recombinant strain of HIV than the heterosexual population, which may reflect the geographical origins of the blood donor base. However, the small number of datasets means that the observed association is subject to limitations of power and representativeness. Additional data is required to clarify the association between BLOOD and recombinants.
MSM did not have a significant global association with recombinants overall, likely due to a positive association with CRFs and negative association with URFs. The positive association between MSM and CRFs was strongest in East Asia where the prevalence of CRFs has grown from 25.9 to 75.5% during 1990-2015 (3). Within this region, MSM had nearly double the odds of CRFs as HET, indicating that MSM may be at the forefront of the growing epidemic across East Asia. Similar results were seen in West Africa where the proportion of URFs grew from 3.4 to 15.5% over the same period (3), and findings indicated that MSM had 2.44 times the odds of being infected with URFs. These associations suggest that MSM likely play a major role in the spread of new strains in some regions. Despite an overall association with recombinants that was not significant and historical associations with HIV-1 subtype B (27), the significant association in 2010-2015 and increasing trend across time indicate that MSM may be associated with an increased risk of recombinants.
A key strength of this study is its unprecedented large size, including 77,284 participants from 83 countries, collected from key populations globally during 1990-2015. To our knowledge, this is the first comprehensive analysis of the association between key populations and HIV-1 subtypes and recombinants at a global and regional level. Additionally, data was collected through both a literature search and a global survey, with the inclusion of unpublished data enabling increased regional coverage and improved coverage of recent time periods.
The study also had some limitations. Estimates of associations of key populations with HIV-1 variants are dependent on the underlying data. There was notable variation in coverage by key population, geographic region, and time period. Although the numbers of participants were generally high, the limited number of datasets included from BLOOD means that results must be interpreted with caution. Conclusions could not be independently drawn for any key populations in the Caribbean, Ethiopia, Oceania, and Middle East and North Africa (MENA) due to persistent data gaps that have been previously noted (1,28). Similarly, the absence of data for certain key populations in some regions (e.g., MSM data in Central Africa, East Africa, and MENA) may reflect limited access to healthcare due to sociolegal restrictions (29). Most data were not drawn from nationally representative surveys and we were unable to weight country-level data according to relative numbers of people of key populations living with HIV in each country, as comprehensive global data on key populations is not available. Hence, reported distributions of HIV-1 variants should not be interpreted as representative of key populations in each region or globally. As HIV subtyping data for most studies was primarily based on pol sequencing rather than the whole genome (3), recombination outside of this genome region was likely missed, leading to an underestimation of recombinants. Seventy four CRFs were described at the time of data collection (up until 2015), contributing to the discrepancy between the 48 CRFs identified within the datasets contributing to this study and the >120 CRFs that have been described to date (16). Findings could be subject to bias due to heterogeneity in study design, inclusion/exclusion criteria, subtyping methods, and rates of treatment and migration across regions. In particular, differences in participant recruitment and the definition of key populations between studies could affect observed associations with recombinants. Lastly, insufficient data was available to conduct analysis for transgender women.

TABLE 3
Odds ratios and 95% confidence intervals by region for binomial/multinomial logistic regression between key populations and recombinants, CRFs, and URFs compared to HIV-1 subtypes. Significant results are in bold. * Regions with insufficient data on key populations to fit a logistic regression model for recombinants (Caribbean, Ethiopia, Middle East and North Africa, Oceania) were included in the global association, but not as independent regions. Additional data on global and regional odds ratios of key populations and HIV-1 recombinants is included in the Appendix pp11-13.
BLOOD, blood/plasma transfusion associated infections; CRF, circulating recombinant form; CSW, commercial sex workers; HET, heterosexual; MSM, men who have sex with men; PWID, people who inject drugs; URF, unique recombinant form; VERT, vertical transmission (mother to child).
Among key populations, increased risk of HIV infection, potentially by multiple strains, and difficulty accessing treatment, potentially leading to increased viral loads, may contribute to the formation and onward spread of HIV-1 recombinants (18,30). The increasing diversity of the HIV pandemic has implications across diagnosis, treatment, and prevention (2, 6-10). Efforts to prevent the spread of novel HIV strains should consider approaches for key populations such as PWID, CSW, and MSM that are at increased risk of developing and transmitting recombinants. In the case of PWID, this may require prevention-based approaches such as distribution of sterile injection equipment (31,32), opiate substitution treatment (33), and increased access to antiretroviral therapies (34). For CSW and MSM, prevention efforts should focus on increasing availability of the dapivirine vaginal ring for cisgender women, oral TDF-based pre-exposure prophylaxis (PrEP), and long-acting injectable cabotegravir (1,35,36). Increased HIV testing among key populations will help detect and treat new HIV infections early. These efforts can help limit the spread of traits from newly-emergent, highly virulent strains (18). Structural reform may also be necessary as the criminalization of these three key populations is associated with worse HIV outcomes and inadequate viral suppression (37,38), potentially accelerating HIV-1 diversification.
In summary, this is the first study to comprehensively analyse the global association of key populations with HIV-1 recombinants. PWID, CSW, and MSM were significantly associated with recombinants globally and across multiple regions. As key populations and their partners account for 70% of new HIV infections (1), it is apparent that key populations are driving the genetic diversification of the global HIV-1 pandemic, posing a challenge to diagnostics, treatments, and vaccines against HIV. Therefore, additional surveillance of HIV-1 molecular epidemiology and increased preventative measures should be targeted toward these key populations.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author contributions
NN assessed eligibility of manuscripts, conducted the analyses, designed figures and tables, interpreted the data, and wrote the manuscript. RE, JY, and LD-T screened the electronic literature search results for relevant manuscripts, assessed their eligibility, extracted data, and collected additional published data. SK designed and did the electronic literature search. JH conceived, designed, coordinated the study, wrote the systematic review protocol, assisted with the literature search, assessed eligibility of manuscripts, collected additional published data, conducted the global survey, extracted data, designed the analysis plan, interpreted the data, and wrote the manuscript. All authors read and approved the final version of the manuscript.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.