Molecular Characteristics of Human Adenovirus Type 3 Circulating in Parts of China During 2014–2018

Human adenoviruses (HAdVs) are important pathogens causing respiratory infections; 3.5–11% of childhood community-acquired pneumonia is associated with HAdV infection. Human adenovirus type 3 (HAdV-3), which can lead to severe morbidity and mortality, is one of the most prevalent genotypes among adenoviruses responsible for acute respiratory infections (ARIs) in children in China. To identify the genetic variation of HAdV-3 in children with ARIs in China, a molecular epidemiological study was conducted. A total of 54 HAdV-3 isolates were obtained from children with ARIs in Beijing, Wenzhou, Shanghai, Shijiazhuang, Hangzhou, Guangzhou, and Changchun from 2014 to 2018. Of these, 32 strains were selected for whole-genome sequencing, while the hexon, penton base, and fiber genes were sequenced for the remaining strains. Bioinformatics analysis was performed on the obtained sequences. The phylogenetic analyses based on whole-genome sequences, major capsid protein genes (hexon, penton base, and fiber), and early genes (E1, E2, E3, and E4) showed that the HAdV-3 strains obtained in this study always clustered together with the reference strains from mainland China, while the HAdV-3 prototype strain formed a cluster independently. Compared with the prototype strain, all strains possessed nine amino acid (AA) substitutions at neutralization antigenic epitopes of hexon. The homology models of the hexon protein of the HAdV-3 prototype and strain BJ20160214 showed that there was no evident structural change at the AA mutation sites. Two AA substitutions were found at the Arg-Gly-Asp (RGD) loop and hypervariable region 1 (HVR1) of the penton base. A distinct AA insertion (20P) in the highly conserved PPPSY motif of the penton base, which had never been reported before, was observed. Recombination analysis indicated that partial regions of the protein IIIa precursor, penton base, and protein VII precursor genes among all HAdV-3 strains in this study were from HAdV-7.
This study showed that the genomes of the HAdV-3 strains in China were highly homologous. Some AA mutations were found at antigenic sites; however, their significance needs further study. Our data demonstrate the molecular characteristics of HAdV-3 circulating in China and are beneficial for further epidemiological exploration and the development of vaccines and drugs against HAdV-3.

HAdV-3, first isolated from a patient with acute respiratory infection (ARI) in the winter of 1952-1953, is one of the most prevalent serotypes responsible for respiratory infectious diseases in children and adults worldwide (Lynch and Kajon, 2016). Our previous study indicated that HAdV-3, the predominant type of HAdV in China, accounted for 44.4% of community-acquired pneumonia (CAP) cases caused by HAdV in children (Duan et al., 2019). Outbreaks of respiratory infections related to HAdV-3 have been reported many times. In 2011, an outbreak of febrile respiratory disease and pharyngoconjunctival fever in Hangzhou, China, was caused by HAdV-3; in 2005, an outbreak of respiratory infection and conjunctivitis associated with HAdV-3 infection occurred in a pediatric long-term care facility in Illinois, United States (James et al., 2007). A number of studies have demonstrated that HAdV-3 may lead to severe pneumonia in immunocompetent children (Rebelo-de-Andrade et al., 2010) and adults (Barker et al., 2003). The positive rate of neutralizing antibodies against HAdV-3 shows an age-dependent increase: the seroprevalence against HAdV-3 was low (12.07-33.96%) in 1- to 5-year-old children, whereas it was high (64.29-81.25%) in healthy adults (Tian et al., 2016, 2020). Furthermore, no effective vaccines or drugs for HAdV-3 are available. Therefore, monitoring the prevalence of HAdV-3 in China and analyzing the genetic stability and amino acid (AA) variation of the HAdV-3 genome, especially in the hypervariable regions 1-7 (loops 1 and 2) of the hexon gene, are of great importance for developing vaccines and drugs against HAdV-3.
To date, there are only 36 HAdV-3 complete genome sequences available in GenBank, which is inadequate to reflect the molecular epidemiological characteristics of HAdV-3. In this study, 54 HAdV-3 isolates were obtained from children with ARI in parts of China from 2014 to 2018, including Beijing, Wenzhou, Shanghai, Shijiazhuang, Hangzhou, Guangzhou, and Changchun. Thirty-two strains of HAdV-3 were randomly selected for whole-genome sequencing. Bioinformatics analyses of the HAdV-3 sequences were conducted. Our study promotes the understanding of the epidemiology and evolution of HAdV-3 and provides a foundation for the development of effective vaccines and public health strategies.
Footnote 1: http://hadvwg.gmu.edu/

Strains
On the basis of a network monitoring viral pathogens of respiratory tract infections among children, samples positive for HAdV-3 from 2014 to 2018 were collected from parts of China. The sentinel hospitals in our surveillance network are distributed in northern China (Beijing Children's Hospital and Children's Hospital of Hebei Province), eastern China (Yuying Children's Hospital, Xin Hua Hospital, and The Children's Hospital-Zhejiang University School of Medicine), southern China (The First Affiliated Hospital of Guangzhou Medical University and Guangzhou Women and Children's Medical Center), and northeastern China (Children's Hospital of Changchun). The samples were collected from these sentinel sites monthly. The inclusion criteria were uniform and followed the guideline (Subspecialty Group of Respiratory Diseases The Society of Pediatrics; Chinese Medical Association The Editorial Board Chinese Journal of Pediatrics, 2013). The samples were inoculated onto HEp-2 cells for virus isolation. A total of 54 strains were obtained, of which 32 strains were randomly selected for whole-genome sequencing and analysis. The information on the strains is shown in Table 1.

Extraction of Viral Nucleic Acid
The viral nucleic acid was directly extracted from isolates using a QIAamp MinElute Virus Spin Kit (QIAGEN, Hilden, Germany) in line with the manufacturer's instructions.

Sequencing of Hexon, Penton Base, and Fiber Gene
Sequences of the hexon, penton base, and fiber genes were amplified by polymerase chain reaction (PCR) using HotStar Taq Plus Master Mix Kits (QIAGEN, Hilden, Germany). The primers used in this study have been described earlier. PCR products were sequenced using the Sanger sequencing method.

Sequencing of Whole Genome
A total of 700 ng DNA per sample was used as input material for the DNA sample preparations. Sequencing libraries were generated using the NEBNext® Ultra DNA Library Prep Kit for Illumina® (NEB, Ipswich, MA, United States) following the manufacturer's recommendations, and index codes were added to attribute sequences to each sample. Briefly, the chip DNA was purified using the AMPure XP system (Beckman Coulter, Brea, CA, United States). After adenylation of the 3' ends of the DNA fragments, the NEBNext Adaptor with hairpin loop structure was ligated to prepare for hybridization. Electrophoresis was then used to select DNA fragments of the specified length. USER Enzyme (NEB, Ipswich, MA, United States) was used with size-selected, adaptor-ligated DNA at 37 °C for 15 min followed by 5 min at 95 °C before PCR. PCR was then performed with Phusion High-Fidelity DNA polymerase, Universal PCR primers, and Index (X) Primer. Finally, PCR products were purified (AMPure XP system), and library quality was assessed on the Agilent Bioanalyzer 2100 system. The clustering of the index-coded samples was performed on a cBot Cluster Generation System using the HiSeq 4000 PE Cluster Kit (Illumina) according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced on an Illumina HiSeq 4000 platform, and 150-bp paired-end reads were generated. The quality-controlled sequence data were de novo assembled using Shovill v1.0.9 and BWA v0.7.17-r1188 with a minimum average coverage of 22.
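The coverage threshold above can be related to read counts with simple arithmetic: average coverage is the total number of sequenced bases divided by the genome length. The following is a minimal sketch, not part of the study's pipeline; the function names are hypothetical, and the 35,000 bp genome length is a round illustrative figure assumed here rather than a value taken from the text.

```python
import math


def average_coverage(n_reads, read_length, genome_length):
    """Average depth = total sequenced bases / genome length."""
    return n_reads * read_length / genome_length


def reads_needed(target_coverage, read_length, genome_length):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_length / read_length)


# Illustrative assumption: a genome of about 35,000 bp and 150-bp reads.
GENOME_LENGTH = 35_000  # assumed round figure, not from the study
READ_LENGTH = 150       # read length stated in the text

n = reads_needed(22, READ_LENGTH, GENOME_LENGTH)
print(n, round(average_coverage(n, READ_LENGTH, GENOME_LENGTH), 2))
```

For paired-end data, each read of a pair counts separately in `n_reads`.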

Phylogenetic Analysis
Whole-genomic reference sequences of HAdV-3 were downloaded from the GenBank database. Reference sequences typed as HAdV-3 only based on the hexon gene without the penton base or fiber genes were excluded. A total of 36 whole-genomic reference sequences obtained from GenBank were enrolled for phylogenetic analysis. Reference sequences of the hexon, penton, and fiber genes of HAdV-3 were also selected from the GenBank database for phylogenetic analysis. The reference sequences are listed in Table 2.
MAFFT software (https://www.ebi.ac.uk/Tools/msa/mafft/) was used to conduct multiple alignment of the hexon, penton base, and fiber genes and of the whole-genomic sequences. MEGA v6.0 (Sudhir Kumar, Arizona State University, Tempe, AZ, United States) software was used to generate phylogenetic trees with the neighbor-joining method and the Kimura two-parameter model. The robustness of the phylogenetic trees was assessed by the bootstrap method with 1,000 replicates.
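To make the distance model concrete, the Kimura two-parameter distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below is an illustration only and is not MEGA's implementation; the function name is hypothetical, and skipping any site with a gap or ambiguous base in either sequence is an assumption made for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences (hypothetical): one transition in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

A neighbor-joining tree is then built from the matrix of such pairwise distances, and bootstrap support is obtained by resampling alignment columns with replacement and repeating the procedure.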

Analysis of Genetic Variation
The genetic variations of the hexon, penton base, and fiber genes were determined with BioEdit v7.2.0 software.
The homology models of the hexon protein of the HAdV-3 prototype and strain BJ20160214 were built by submitting the AA sequences to the SWISS-MODEL server. The positions of mutated AA residues were labeled with PyMOL v2.4 in the HAdV-3 hexon structures.

Recombination Analysis
SimPlot v3.5.1 software was used to complete similarity plots and recombination detection (bootscan approach) of the whole-genomic sequences of HAdV-3. The default settings
FIGURE 1 | Phylogenetic analysis of the major capsid genes: (A) penton base, (B) fiber, and (C) hexon. The phylogenetic trees were generated using the neighbor-joining method based on the Kimura two-parameter model with 1,000 replicates. The red dot indicates the strains obtained in this study. The black triangle indicates HAdV-3 prototype strain GB (accession number AY599834). The empty square indicates the whole-genome reference sequences. AB900147, AB900148, and AB900149 are from the same strain.
Fifty-four penton base gene sequences obtained in this study and 42 reference sequences from around the world available in GenBank were subjected to phylogenetic analysis. The phylogenetic tree of the penton base gene sequences formed three clusters. The prototype strain GB (AY599834) formed a separate cluster 1. Cluster 2 consisted of four sequences from the United States and one sequence from Japan. All the strains circulating in China from 2004 to 2020, including the 54 isolates obtained in this study, as well as some of the strains circulating in the United States, Japan, and South Korea constituted cluster 3 (Figure 1).
(Pring-Akerblom et al., 1995); the RGD region is located at AA positions 300-363 of the penton base, and HVR1 at AA positions 150-171 of the penton base (Madisch et al., 2007). The 54 hexon sequences we obtained were highly consistent and had the same AA variations. Therefore, seven strains were selected as representatives for AA variation analysis, and strain BJ20160214 was selected for homology modeling.
The phylogenetic tree generated from the 54 fiber gene sequences obtained in this study and 48 reference sequences from around the world available in GenBank formed three clusters. Cluster 1 and cluster 2 were consistent with cluster 1 and cluster 2 in the phylogenetic tree of the penton base. Similarly, the 54 isolates obtained in this study were in cluster 3 with all strains from China and some strains from other countries (Figure 1).
A total of 105 hexon gene sequences were analyzed, including 54 sequences obtained in this study and 51 reference sequences from around the world downloaded from GenBank. As shown in the phylogenetic tree, the 105 hexon gene sequences could be stratified into six clusters. Corresponding to the phylogenetic trees of the penton base and fiber genes, the prototype strain GB (AY599834) was located in a distinct cluster 1; four strains from the United States and one strain from Japan comprised cluster 2; and all strains from mainland China were in cluster 3, which also included strains circulating in other countries or regions. Interestingly, three strains (AF542104, AF542110, and AF542127) from South Korea formed cluster 4 and cluster 5 in the hexon phylogenetic tree; however, these three strains all belonged to cluster 2 in the phylogenetic tree of the fiber gene, together with the epidemic strains from the United States and Japan. Cluster 6 was composed of two strains circulating in Taiwan, China.

AA Variation Analysis
After HAdV infection, the host produces neutralizing antibodies primarily targeting the neutralization epitopes on the surface of hexon. Antigenic domains and type-specific determinants have been mapped to loops 1 and 2 of the hexon and classified into seven hypervariable regions (HVRs): HVR1-6 (loop 1) and HVR7 (loop 2) (Sumida et al., 2005; Roberts et al., 2006). AA variations in loop 1 and loop 2 were analyzed. Compared with the prototype strain GB, the 54 strains obtained in this study had three AA substitutions (G141R, E299G, and N302D) in the loop 1 region and six AA substitutions (N411D, T418R, T429A, A439D, P440T, and T445A) in the loop 2 region. All reference sequences from China had the same mutations in loop 1 and loop 2, except the reference sequence Guangzhou02, which had an additional AA substitution (M221V) in loop 1. The homology models of the HAdV-3 prototype hexon and strain BJ20160214 hexon were built based on the AdC68 hexon crystal structure (PDB ID 2OBE). The mutation sites were marked on the homology models (Figure 2). We aligned the two hexon model structures and found no distinct difference at the AA mutation sites of loop 1 and loop 2.
Compared with the prototype strain GB, an AA substitution (D327N) was observed at the Arg-Gly-Asp (RGD) loop in all strains circulating in China, and an AA substitution (T159I) was found at hypervariable region 1 (HVR1) in the 54 isolates obtained in this study. However, another AA substitution (D162G), besides T159I, occurred in the reference sequence Guangzhou02. In addition, a distinct AA insertion (20P) was found in six isolates obtained in this study at the highly conserved PPPSY motif of the penton base, which was not found in the reference sequences (Figure 2).
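Substitution labels such as G141R combine the prototype residue, its position in the prototype protein, and the residue in the isolate. A minimal sketch of how such labels can be derived from a pairwise protein alignment is given below; it is an illustration rather than the BioEdit workflow used in the study, the function name and toy sequences are hypothetical, and numbering positions by ungapped reference residues is an assumption.

```python
def list_substitutions(ref_aln, query_aln):
    """Return substitution labels (e.g. 'G2R') from two aligned protein sequences.

    Positions are numbered along the reference, ignoring alignment gaps.
    Insertions and deletions are skipped; only residue changes are reported.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    ref_pos = 0
    for ref_res, query_res in zip(ref_aln, query_aln):
        if ref_res != "-":
            ref_pos += 1  # advance only on real reference residues
        if ref_res == "-" or query_res == "-":
            continue  # gap column: an indel, not a substitution
        if ref_res != query_res:
            substitutions.append(f"{ref_res}{ref_pos}{query_res}")
    return substitutions


# Toy alignment (hypothetical, not real hexon sequence):
reference = "MGTA-K"
isolate = "MRTAPR"
print(list_substitutions(reference, isolate))
```

In this toy case the inserted residue in the isolate (the column where the reference has a gap) does not shift the numbering of later reference positions.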

Phylogenetic Analysis of the Complete Genome
Since the analysis of the major capsid protein genes showed that the strains we obtained were highly consistent, we randomly selected 32 strains for whole-genome sequencing and analysis. The nucleotide sequence identities were 99.3-100% among the 32 HAdV-3 isolates. The phylogenetic dendrogram based on the whole-genome sequences formed three clusters. In accord with the phylogenetic trees of the major capsid genes, cluster 1 was formed by prototype strain GB. Cluster 2 consisted of four strains circulating in the United States from 1988 to 2007. The 32 isolates obtained in this study and reference strains circulating in China, the United States, and Korea from 2002 to 2020 were located in cluster 3. Further, the 32 whole-genome sequences obtained in this study formed a small evolutionary branch supported by a significant bootstrap value in cluster 3, together with a reference sequence collected in the United States in 2007 (Figure 3). These results indicated that the strains obtained in this study had high percent identity and high homology.
FIGURE 3 | Phylogenetic analysis of the complete genome of HAdV-3. The phylogenetic tree was generated using the neighbor-joining method based on the Kimura two-parameter model with 1,000 replicates. The red dot indicates the strains obtained in this study. The black triangle indicates HAdV-3 prototype strain (accession number AY599834).
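An identity range such as 99.3-100% is the minimum and maximum over all pairwise comparisons of aligned sequences. The sketch below illustrates that calculation under stated assumptions: the function names and toy sequences are hypothetical, and columns with a gap in either sequence are excluded from the denominator, which may differ from how the study's software treats gaps.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def identity_range(aligned):
    """Minimum and maximum pairwise identity among named aligned sequences."""
    values = [
        percent_identity(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    ]
    return min(values), max(values)


# Toy alignment (hypothetical strain names and sequences):
toy = {
    "strain1": "ACGTACGTAC",
    "strain2": "ACGTACGTAC",
    "strain3": "ACGTACGTAA",
}
print(identity_range(toy))
```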

Phylogenetic Analysis of Early Genes
The early genes of HAdV include the E1, E2, E3, and E4 genes. Multiple recombination events in early gene regions have occurred in species HAdV-C (Dhingra et al., 2019). The early genes play a crucial role in disrupting host immune defense mechanisms as well as in the transcription and replication of HAdV (Tauber and Dobner, 2001; Zeng and Carlin, 2019). In this study, the early genes were extracted from the whole-genome sequences of the 32 isolates for phylogenetic analysis. The phylogenetic trees based on the E1, E2A, E2B, E3, and E4 sequences formed three clusters, which were consistent with the tree based on whole-genome sequences. Similarly, clusters 1 and 2 were formed by the prototype strain GB and four strains circulating in the United States from 1988 to 2007, respectively. The 32 isolates obtained in this study were in cluster 3 together with the rest of the reference strains circulating in China, the United States, and Korea from 2002 to 2020 (Figure 4).

Recombination Analysis
Recombination plays a significant role in the evolution of HAdV. The recombination analysis of the 32 HAdV-3 whole-genome sequences demonstrated that a gene fragment coding for parts of the protein IIIa precursor, penton base, and protein VII precursor was derived from HAdV-7 by recombination (Figure 5). To further confirm the recombination event, phylogenetic analysis of the recombinant region was performed, which showed that the recombinant region clustered with HAdV-7 (Supplementary Figure 1). The strain KF268128 isolated in 1988, the earliest strain among the reference sequences except the prototype strain GB, was analyzed by SimPlot. A similar recombination event was observed, indicating that the recombination event had taken place as early as 1988 (Supplementary Figure 2).
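The idea behind a similarity plot can be sketched as a sliding-window comparison: within each window of the alignment, the query is compared with every reference, and a region where the closest reference changes (for example from HAdV-3 to HAdV-7) hints at recombination. The code below is a simplified illustration and not SimPlot's algorithm; the function names, the plain identity measure, the window and step values, and the toy sequences are all assumptions.

```python
def window_similarity(query, reference, window=200, step=20):
    """Percent identity of query vs. reference in sliding windows.

    Returns (window midpoint, percent identity) tuples. Gap-gap columns
    are not counted as matches.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        matches = sum(1 for a, b in zip(q, r) if a == b and a != "-")
        profile.append((start + window // 2, 100.0 * matches / window))
    return profile


def closest_reference(query, references, window=200, step=20):
    """For each window, name the reference most similar to the query."""
    profiles = {
        name: window_similarity(query, seq, window, step)
        for name, seq in references.items()
    }
    n_windows = len(range(0, len(query) - window + 1, step))
    best = []
    for i in range(n_windows):
        name = max(profiles, key=lambda n: profiles[n][i][1])
        best.append((i * step + window // 2, name))
    return best


# Toy example (hypothetical sequences): the query matches "typeA" in its
# first half and "typeB" in its second half.
query = "A" * 10 + "C" * 10
refs = {"typeA": "A" * 10 + "G" * 10, "typeB": "T" * 10 + "C" * 10}
print(closest_reference(query, refs, window=10, step=10))
```

A bootscan goes one step further by building a bootstrapped tree per window and plotting how often the query groups with each reference, which this sketch does not attempt.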

DISCUSSION
In the present study, phylogenetic analyses of the hexon, penton base, and fiber sequences were performed for the 54 HAdV-3 isolates collected from seven cities in China between 2014 and 2018, and whole-genome sequences were analyzed for 32 of them. Except for the hexon gene, which formed six clusters, all other genes formed three clusters in the phylogenetic trees. Notably, all strains obtained from mainland China grouped in the same cluster in the phylogenetic trees based on all genes, while the prototype strain GB was in a separate cluster. In addition, the nucleotide and AA identities among the strains obtained in this study were high, ranging from 99.3 to 100%. Therefore, these results indicated that the HAdV-3 strains currently circulating in China had high identity and homology.
The hexon gene sequences were stratified into six branches, which may be correlated with the heterogeneous HVRs of hexon. There are seven HVRs (HVR1-7) in the hexon protein, and their AA sequences are quite dissimilar among different types of HAdV but relatively conserved among different strains of one HAdV type. However, a study of variation in the HVRs of HAdV-3 hexon reported that the HVRs of HAdV-3 strains were highly heterogeneous and could be categorized into 25 hexon variants, which was higher than the number of hexon variants of other HAdV types (Haque et al., 2018). The heterogeneity of the hexon HVRs increases the challenge of developing vaccines and drugs for HAdV-3. Meanwhile, it prompted us to attach importance to analyzing the genetic variation characteristics of HAdV-3, especially the genes coding antigen epitopes (such as loop 1 and loop 2 of hexon).
HAdV-11, AY163756; HAdV-14, AY803294; HAdV-16, AY601636; HAdV-21, AY601633; HAdV-34, AY737797; HAdV-35, AY128640; HAdV-50, AY737798; and HAdV-55, FJ643676. The genome sequences of the 32 strains obtained in this study have high identity, and the results of the recombination analysis were consistent. Therefore, BJ20180775 was selected as the representative to display the results of the recombination analysis. Protein IIIa precursor, 12,051-13,817 nt gene location of prototype strain GB, without gaps. Penton base, 13,905-15,539 nt gene location of prototype strain GB, without gaps. Protein VII precursor, 15,553-16,131 nt gene location of prototype strain GB, without gaps.
The phylogenetic trees based on the penton base, fiber, early genes, and genome all formed three clusters, while six clusters were separated in the hexon phylogenetic tree. These differences in the branch structures of phylogenetic trees based on different genes of the same strains suggested that the three strains that formed cluster 4 and cluster 5 in the hexon phylogenetic tree may be derived from genetic recombination. Nevertheless, since no penton base or complete genome sequences were available for these three strains, we could not acquire further molecular biological information. Besides, two strains from Taiwan, China, formed cluster 6 in the hexon phylogenetic tree. No further phylogenetic or recombination information could be obtained because their penton base gene and genome sequences were lacking. This suggests that sequencing of the hexon, penton base, and fiber genes and of the whole genome of HAdV should be strengthened in the future to obtain more epidemiological and genetic information on HAdV.
The RGD motif and HVR1, on the surface of the penton base, are type specific and hypervariable (Madisch et al., 2007). The RGD loop binds to the αvβ3 or αvβ5 integrins to facilitate endocytosis of the virus (Wickham et al., 1994). HVR1 may be a target of neutralizing antibodies. Recombination events around the HVR1 region have been reported (Madisch et al., 2007). Therefore, AA variation analysis at the RGD loop and HVR1 was conducted. Some AA mutations in the antigenic epitopes of hexon and in HVR1 and the RGD loop of the penton base were found by comparison with the prototype strain GB. These mutations included substitutions between hydrophilic and hydrophobic AAs (T429A, A439D, T445A, and T159I) and the insertion of a heterocyclic AA (20P). Similar AA substitutions in hexon have been observed in strains circulating in Japan, Korea, Germany, and Taiwan, China (Haque et al., 2018). The locations of the mutations in our study were not included in the epitopes that have been functionally confirmed in previous reports (Leen et al., 2004; Chakupurakal et al., 2013; Keib et al., 2017). However, we cannot completely rule out an influence of these mutations on antigenicity, which warrants further immunological investigation. It is worth noting that the insertion (20P) in the highly conserved PPPSY motif of the penton base gene was discovered for the first time in this study. We analyzed and compared all penton base sequences available in GenBank, and no such insertion was observed. The PPPSY motif in this conserved region is an essential motif for the dodecahedron penton particle structure (Zubieta et al., 2005). The influence of a heterocyclic AA (20P) insertion in PPPSY on the structural stability and other properties of the penton base is currently unclear, and further studies are needed.
Recombination is a significant mechanism for the evolution of HAdV. Among the 104 known HAdV genotypes, most new genotypes were produced by recombination (see text Footnote 1). Recombination analysis of the 32 HAdV-3 strains in this study demonstrated that partial regions of the protein IIIa precursor, penton base, and protein VII precursor of these strains were derived from HAdV-7 by recombination. Protein IIIa is a minor capsid protein of HAdV and is responsible for stabilizing interactions between hexons and the penton base (Liu et al., 2010; Reddy et al., 2010). Protein IIIa also plays a role in virus packaging: it is involved in packaging viral DNA into the capsid (Ma and Hearing, 2011). The penton base is a major capsid protein, which is connected to the fiber. It contains the RGD loop region, which can interact with integrins to promote virus internalization (Wickham et al., 1994). The effect of recombination at this location on the structural stability, susceptibility, and virulence of the virus is still unclear, and further research is required.
In conclusion, this study comprehensively illustrated the molecular evolution characteristics of HAdV-3 circulating in China during 2014-2018. Our results revealed high homology of HAdV-3 circulating in China, and a few AA mutations were observed at antigenic epitopes of the hexon gene and the RGD loop of the penton base gene. Whole-genome sequence analysis of currently circulating HAdV-3 strains in China indicated that partial regions of the protein IIIa precursor, penton base, and protein VII precursor genes were derived from HAdV-7 by recombination, which might have occurred as early as 1988. Our research enriches the molecular epidemiology data of HAdV-3 in China and is conducive to further epidemiological exploration of HAdV-3-related severe clinical diseases and the development of vaccines and drugs against HAdV-3.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
was selected as the representative to display the results of recombination analysis. Protein IIIa precursor, 12,051-13,817 nt gene location of prototype strain GB, without gaps. Penton base, 13,905-15,539 nt gene location of prototype strain GB, without gaps. Protein VII precursor, 15,553-16,131 nt gene location of prototype strain GB, without gaps.
Supplementary Figure 2 | Phylogenetic analysis of recombinant region. Recombinant region: 12,286-16,193 nt gene location of prototype strain GB, without gaps. The phylogenetic tree was generated using the neighbor-joining method based on the Kimura two-parameter model with 1,000 replicates. The red dot indicates strains obtained in this study.