ORIGINAL RESEARCH article

Front. Cell. Infect. Microbiol.
Sec. Virus and Host
Volume 14 - 2024 | doi: 10.3389/fcimb.2024.1388059

Optimization of genetic distance threshold for inferring the CRF01_AE molecular network based on next-generation sequencing

Provisionally Accepted

  • 1 Key Laboratory of AIDS Immunology, Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, China
  • 2 The First Affiliated Hospital of China Medical University, China

The final, formatted version of the article will be published soon.

HIV molecular networks based on genetic distance (GD) have been used extensively. However, the GD threshold for non-B subtypes differs from that of subtype B. This study aimed to optimize the GD threshold for inferring the CRF01_AE molecular network. Next-generation sequencing data of partial CRF01_AE pol sequences were obtained for 59 samples from 12 transmission pairs enrolled from a high-risk cohort between 2009 and 2014. The paired GD was calculated using the Tamura-Nei 93 model to infer a GD threshold range for HIV molecular networks. A total of 2,019 CRF01_AE pol sequences, together with information on recent HIV infection (RHI), were collected from newly diagnosed individuals in Shenyang from 2016 to 2019 and used to construct molecular networks, in order to assess the ability of the inferred GD thresholds to predict recent transmission events. When HIV transmission occurred within a span of 1-4 years, the mean paired GD between the sequences of the donor and recipient within the same transmission pair was as follows: 0.008, 0.011, 0.013, and 0.023 substitutions/site. Using these four GD thresholds, 98.9%, 96.0%, 88.2%, and 40.4% of all randomly paired GD values from the 12 transmission pairs were correctly identified as originating from the same transmission pairs. In the real-world dataset, as the GD threshold increased from 0.001 to 0.02 substitutions/site, the proportion of RHI within the molecular network gradually increased from 16.6% to 92.3%. Meanwhile, the proportion of links with RHI gradually decreased from 87.0% to 48.2%. The two curves intersected at a GD of 0.008 substitutions/site. A suitable range of GD thresholds, 0.008-0.013 substitutions/site, was identified to infer the CRF01_AE molecular transmission network and to identify HIV transmission events that occurred within the past three years. This finding provides valuable data for selecting appropriate GD thresholds when constructing molecular networks for non-B subtypes.
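
To illustrate the kind of calculation summarized above, the following is a minimal Python sketch and not the authors' pipeline. It computes the Tamura-Nei 93 distance between two aligned sequences using the standard closed-form estimator, links pairs at or below a chosen GD threshold, and reports two proportions. All function names are hypothetical. Several choices are assumptions of this sketch rather than details stated in the abstract: base frequencies are pooled over the two sequences being compared, gaps and ambiguity codes are skipped, "proportion of RHI within the molecular network" is read as the share of RHI individuals that have at least one link, and "proportion of links with RHI" is read as the share of links that involve at least one RHI individual.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}


def tn93_distance(s1, s2):
    """Tamura-Nei 93 distance (substitutions/site) for two aligned sequences."""
    # Assumption: sites with gaps or ambiguity codes are simply skipped.
    pairs = [(a, b) for a, b in zip(s1.upper(), s2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    # Assumption: base frequencies are pooled over the two sequences.
    freq = {base: 0.0 for base in "ACGT"}
    for a, b in pairs:
        freq[a] += 0.5 / n
        freq[b] += 0.5 / n
    if min(freq.values()) == 0.0:
        raise ValueError("a base is absent; the TN93 formula is undefined")
    g_r = freq["A"] + freq["G"]  # purine frequency
    g_y = freq["C"] + freq["T"]  # pyrimidine frequency
    p1 = sum({a, b} == {"A", "G"} for a, b in pairs) / n  # A<->G transitions
    p2 = sum({a, b} == {"C", "T"} for a, b in pairs) / n  # C<->T transitions
    q = sum((a in PURINES) != (b in PURINES) for a, b in pairs) / n  # transversions
    k1 = 2 * freq["A"] * freq["G"] / g_r
    k2 = 2 * freq["C"] * freq["T"] / g_y
    k3 = 2 * (g_r * g_y
              - freq["A"] * freq["G"] * g_y / g_r
              - freq["C"] * freq["T"] * g_r / g_y)
    # math.log raises ValueError if the sequences are too divergent.
    return (-k1 * math.log(1 - p1 / k1 - q / (2 * g_r))
            - k2 * math.log(1 - p2 / k2 - q / (2 * g_y))
            - k3 * math.log(1 - q / (2 * g_r * g_y)))


def build_links(seqs, threshold):
    """Return all pairs of sequence IDs whose TN93 distance is <= threshold."""
    return [(i, j) for i, j in combinations(sorted(seqs), 2)
            if tn93_distance(seqs[i], seqs[j]) <= threshold]


def rhi_proportions(links, rhi_ids):
    """Return (share of RHI individuals in the network, share of links with RHI).

    Both definitions are this sketch's reading of the abstract, not the paper's.
    """
    in_network = {node for link in links for node in link}
    rhi_in_network = len(in_network & rhi_ids) / len(rhi_ids) if rhi_ids else 0.0
    with_rhi = sum(1 for i, j in links if i in rhi_ids or j in rhi_ids)
    links_with_rhi = with_rhi / len(links) if links else 0.0
    return rhi_in_network, links_with_rhi
```

Calling `build_links` and `rhi_proportions` repeatedly over a grid of thresholds (for example 0.001 to 0.02 substitutions/site) would trace out two curves of the kind described above; real analyses would typically rely on an established TN93 implementation rather than this sketch.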

Keywords: HIV, Genetic distance, Next-generation sequencing, Molecular network, CRF01_AE

Received: 19 Feb 2024; Accepted: 28 Mar 2024.

Copyright: © 2024 Hu, Zhao, Liu, Gao, Ding, Hu, An, Shang and Han. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Mx. Hong Shang, Key Laboratory of AIDS Immunology, Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China
Prof. Xiaoxu Han, The First Affiliated Hospital of China Medical University, Shenyang, China