A novel long-tailed myovirus represents a new T4-like cyanophage cluster

Cyanophages affect the abundance, diversity, metabolism, and evolution of picocyanobacteria in marine ecosystems. Here we report an estuarine Synechococcus phage, S-CREM2, which represents a novel viral genus and leads to the establishment of a new T4-like cyanophage clade named cluster C. S-CREM2 possesses the longest tail (~418 nm) among isolated cyanomyoviruses and encodes six tail-related proteins that are exclusively homologous to those predicted in the cluster C cyanophages. Furthermore, S-CREM2 may carry three regulatory proteins in the virion, which may play a crucial role in optimizing the host intracellular environment for viral replication at the initial stage of infection. The cluster C cyanophages lack auxiliary metabolic genes (AMGs) that are commonly found in cyanophages of the T4-like clusters A and B and encode unique AMGs like an S-type phycobilin lyase gene. A variation in the composition of tRNA and cis-regulatory RNA genes was observed between the marine and freshwater phage strains in cluster C, reflecting their different modes of coping with hosts and habitats. The cluster C cyanophages are widespread in estuarine and coastal regions and exhibit equivalent or even higher relative abundance compared to those of clusters A and B cyanophages in certain estuarine regions. The isolation of cyanophage S-CREM2 provides new insights into the phage–host interactions mediated by both newly discovered AMGs and virion-associated proteins and emphasizes the ecological significance of cluster C cyanophages in estuarine environments.

tRNA genes are widely distributed in the T4-like cyanophage genomes, with numbers ranging from 0 to 33 (Enav et al., 2012). Enav et al. propose that phages carry tRNAs to compensate for the codon usage discrepancy between phages and hosts, enabling phage cross-infectivity of hosts with divergent G + C contents (Enav et al., 2012). In addition, cyanophage tRNA genes are predicted to facilitate the expression of specific AMGs (Enav et al., 2012; Xu et al., 2018). Cyanophage genomes also contain non-coding RNA (ncRNA) genes with regulatory functions. Currently, T4-like cyanophages have been found to contain several cis-regulatory RNAs, including glnA, manA, PhotoRC-II, and wcaG, which are thought to play a regulatory role in important host bioprocesses, including photosynthesis, nitrogen metabolism, and exopolysaccharide production (Weinberg et al., 2010; Wang et al., 2022; Zheng et al., 2023).
Generally, virions (i.e., infectious virus particles) consist of structural proteins. However, certain virions have been found to encapsulate non-structural proteins that regulate host metabolism. The best-known non-structural virion protein in phages is the RNA polymerase of N4 podoviruses (Falco et al., 1980; Kazmierczak et al., 2002). Virion-associated protein kinases (VAPKs) have been identified in various virus families except for dsRNA and ssDNA viruses. VAPKs are especially prevalent in animal and plant viruses, where they play crucial roles in multiple stages of the viral life cycle, including infection, uncoating, transcription, and replication (Hui, 2002). VAPKs have also been detected within the virions of cyanophages. For instance, putative serine/threonine protein kinases have been identified in the S-CRM01 and S-TIM5 virions (Dreher et al., 2011; Sabehi et al., 2012). APH, ChoK, and Rio2 kinases were detected in the virions of a non-T4 cyanomyovirus, S-CBWM1, and are predicted to be involved in host bioprocesses such as antibiotic resistance, protein binding to phospholipids and choline, and ribosome biogenesis (Xu et al., 2018). In addition, nicotinamide/nicotinate mononucleotide adenylyltransferase (NMNAT)-like proteins are found in the virion proteomes of the cyanomyoviruses S-CBWM1, S-SZBM1, and S-SCSM1 and are thought to be involved in NAD⁺ synthesis during infection (Xu et al., 2018; Rong et al., 2022; Wang et al., 2022). These regulatory proteins are thought to create an optimal environment for viral replication upon entry into the host cell.
The discovery of novel cyanophage isolates continues to enhance our understanding of viral genetic diversity, evolution, phage-host interactions, and potential ecological functions. Here, we characterized a new T4-like cyanophage, S-CREM2, which was isolated from the Changjiang River Estuary. S-CREM2 represents a new viral genus and possesses the longest tail found so far in cyanomyoviruses. The identification of S-CREM2 led to the establishment of a novel T4-like cyanophage lineage, referred to as cluster C, which is as prevalent as the previously identified clusters A and B in the estuarine environment. The isolation and characterization of S-CREM2 provide new insights into phage-host interactions and the ecological distribution of the newly established cluster C cyanophages.

Cyanophage isolation
Synechococcus sp. CRE1902 was isolated from the surface water of the Changjiang River Estuary (31.52°N, 122.64°E) in July 2019 and used as the host organism for cyanophage isolation. Synechococcus sp. CRE1902 was grown in seawater-based SN medium with a salinity of 25‰ (SN25) (Waterbury and Willey, 1988) and incubated at 22°C under constant cool-white light at an intensity of 20 μE m⁻² s⁻¹. Cyanophage S-CREM2 was obtained from a surface seawater sample collected in the Changjiang River Estuary (31.31°N, 122.49°E) in July 2019. The viral seawater used for cyanophage isolation was prepared by 0.22-μm filtration to remove bacterial cells and subsequently stored in the dark at 4°C until further use. Cyanophages were first enriched by adding 20 μl of the above viral seawater to 180 μl of exponentially growing Synechococcus sp. CRE1902 cultures (optical density at 750 nm (OD750) = 0.5) in a 96-well microtiter plate. After the lysis of Synechococcus cells, the lysates were collected and centrifuged at 10,000 × g, 4°C for 10 min. The supernatants were filtered through 0.22-μm-pore-size sterile syringe filters (Millipore, Millex®-G, USA) and subsequently used for phage purification. Phage purification was performed using the plaque assay method (Suttle and Chen, 1992) and repeated three times.

Host range determination
Nine Synechococcus strains were used to determine the host range of S-CREM2: five estuarine strains (CB0101, CRE1901, CRE1902, CBW1107, and CBW1101) and four oceanic strains (CC9311, WH8102, WH7803, and WH7805). About 20 μl of S-CREM2 suspension was added to 180 μl of exponentially growing Synechococcus cultures in 96-well microtiter plates in triplicate, while the controls received 20 μl of SN25 medium. All plates were incubated under the same conditions as described above and observed daily for cell lysis.

Phage amplification and purification
To amplify S-CREM2, phage suspensions were added to 2 L of exponentially growing Synechococcus sp. CRE1902 cultures (OD750 = 0.5) at a multiplicity of infection of 0.01. The resulting lysates were treated with DNase I and RNase A, both at a concentration of 2 μg mL⁻¹, at room temperature for 1 h. Subsequently, the NaCl concentration of the lysates was adjusted to 1 M, and the lysates were incubated on ice for 30 min (Xu et al., 2015). The treated lysates were then centrifuged at 10,000 × g, 4°C for 20 min. The resulting supernatants were filtered through 0.45-μm-pore-size polycarbonate membrane filters to remove cell debris. Phage particles in the supernatants were concentrated using 10% (w/v) polyethylene glycol 8000 at 4°C for 24 h and then precipitated by centrifugation at 12,000 × g for 1 h. The resulting S-CREM2 pellet was resuspended in TM buffer (20 mM Tris-Cl and 10 mM MgSO4) and subjected to CsCl-gradient centrifugation (200,000 × g at 4°C for 6 h) using a SW 41Ti rotor (Beckman Optima L-100XP, Beckman Coulter, CA, USA). The visible phage band was extracted and subjected to 30-kDa centrifugal ultrafiltration to remove CsCl from the phage suspension.

Transmission electron microscopy observation
Ten microliters of the CsCl-purified phage suspension were adsorbed onto a 200-mesh carbon-coated copper film for 1 min and then negatively stained with 2% (w/v) uranyl acetate for 30 s. The excess dye was gently removed using filter paper, and the staining process was repeated. After drying for 30 min, the prepared sample was observed using a Tecnai G2 Spirit BioTwin transmission electron microscope (FEI Tecnai G2 F20, Thermo Fisher Scientific, Waltham, MA, USA). The Xplore3D image transmission system (USA) was used to capture high-quality images of the phage particles.

Phage DNA extraction and genome sequencing
Phage particles were first treated with a cocktail buffer containing proteinase K (100 mg mL⁻¹), SDS (10%, wt/vol), and EDTA (0.5 M). Subsequently, phage DNA was extracted using the phenol-chloroform method as previously described (Chen et al., 2006). A whole-genome shotgun strategy was used to construct the PE150 library. The obtained raw data were subjected to quality filtering, trimming, and de novo assembly using IDBA v1.1.3 (Peng et al., 2012) and megahit v1.2.9 (Li et al., 2016). Any remaining gaps in the cyanophage genome were closed using pilon v1.24 and bcftools v1.17 (Narasimhan et al., 2016). The complete genome sequence has been submitted to the GenBank database under accession no. OR473000.

Genome annotation and comparative genomic analyses
The putative open reading frames (ORFs) of S-CREM2 were predicted using GeneMarkS (Besemer and Borodovsky, 2005), the RAST server (Brettin et al., 2015), and the MetaGene Annotator (Noguchi et al., 2008). Translated ORFs were annotated by combining the results of a homolog search against the NCBI non-redundant (NR) database, conserved domain prediction, and a remote homolog search using the HHpred server (Söding et al., 2005). The ORF homolog search against the NR database was conducted using BLASTP with an e-value cutoff of <10⁻⁵ and a bit score of >40 (Pruitt et al., 2007). Conserved domains within ORFs were predicted by searching against the NCBI Conserved Domain Database (CDD) (Marchler-Bauer et al., 2011), with an e-value cutoff of <10⁻³, a bit score of >40, and a coverage of >40%. For ORFs without predicted conserved domains, an HHpred search against the PDB_mmCIF70_18_Jun, UniProt-SwissProt-viral70_3_NOV_2021, and SCOPe70_2.08 structural/domain databases was conducted, with a probability cutoff of >90%, to supplement the ORF annotation. tRNA genes in the S-CREM2 genome were identified using tRNAscan-SE (Chan and Lowe, 2019). Other ncRNA genes were predicted by searching against the Rfam database (Yao et al., 2007). Comparative genomic analyses of cluster C cyanophages were conducted and visualized using Easyfig v2.2.3 (Sullivan et al., 2011).
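The BLASTP cutoffs described above (e-value <10⁻⁵, bit score >40) can be illustrated with a minimal Python sketch. It assumes the hits were exported in BLAST's default 12-column tabular format (`-outfmt 6`); the names `Hit`, `parse_outfmt6`, and `filter_hits` are hypothetical helpers introduced here for illustration and are not part of the authors' pipeline.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    """One BLAST hit, keeping only the fields needed for filtering."""
    query: str
    subject: str
    identity: float   # percent identity
    evalue: float
    bitscore: float


def parse_outfmt6(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse default BLAST tabular output (assumed: -outfmt 6, 12 columns).

    Default column order: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), float(f[10]), float(f[11]))


def filter_hits(hits: Iterable[Hit],
                max_evalue: float = 1e-5,
                min_bitscore: float = 40.0) -> List[Hit]:
    """Keep hits with e-value < 1e-5 and bit score > 40 (cutoffs from the text)."""
    return [h for h in hits if h.evalue < max_evalue and h.bitscore > min_bitscore]


if __name__ == "__main__":
    # Two made-up hits: the first passes both cutoffs, the second fails both.
    example = [
        "ORF1\tsubjA\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
        "ORF2\tsubjB\t30.0\t50\t35\t0\t1\t50\t1\t50\t1e-3\t35.0",
    ]
    for hit in filter_hits(parse_outfmt6(example)):
        print(hit.query, hit.subject, hit.evalue, hit.bitscore)
```

The same function could be reused for the CDD search by passing `max_evalue=1e-3`, although the additional >40% coverage criterion would need alignment and domain lengths that this sketch does not parse.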

Phylogenetic analyses
Phylogenetic analyses of the phycobilin lyase and CP12 genes were conducted using the MEGA 7.0 software package (Kumar et al., 2016). The phycobilin lyase phylogenetic tree was constructed based on amino acid sequences, while the CP12 phylogenetic trees were based on nucleotide sequences. The maximum-likelihood method with the Jones-Taylor-Thornton (JTT) model and the neighbor-joining method with the p-distance model were both used in the phylogenetic tree construction with 1,000 bootstrap replicates. Phylogenomic analyses of S-CREM2 and 40 T4-like cyanophages were performed based on the amino acid sequences of 31 core genes. The core genes were identified among the 41 cyanophages using OrthoFinder v2.5.2 (Emms and Kelly, 2015), aligned using MAFFT v7.52 (Katoh et al., 2009), and edited using trimAl v22.9.0 (Capella-Gutiérrez et al., 2009). The phylogenomic tree was constructed with RAxML v8.2.12 (Stamatakis, 2014) using the maximum-likelihood method with the PROTGAMMAJTT model (bootstrap replicates = 100). Five cyanophage representatives from each of clusters A, B, and C were selected to analyze intergenomic similarity using VIRIDIC (Moraru et al., 2020). All five phages in cluster C, S-CREM2, S-CRM01, S-B68, S-H34, and S-N03, were used in the analysis. Cyanophages S-PM2, S-RSM4, S-SM2, S-SSM7, and P-HM1 were chosen to represent cluster A, while S-ShM2, Syn10, S-RIM8, S-IOM18, and S-RIM2 were selected to represent cluster B.

Virion protein determination by mass spectrometry analysis
The CsCl-purified phage suspensions were used for the virion protein determination. Virion proteins were digested using the FASP procedure described by Wiśniewski et al. (2009). The resulting tryptic peptides were analyzed using a Q Exactive mass

Recruitments of reads from metagenomic data
To estimate the relative abundances and distributional patterns of T4-like cyanophage clusters A, B, and C, fragment recruitment was performed using virome datasets from both marine and freshwater environments. The five representative cyanophages in each cluster used in the intergenomic similarity analysis were selected, and core genes shared among clusters A, B, and C were used for the recruitment analyses. Viromes used in this study include Global Ocean Virome 2.0 (GOV 2.0) (Gregory et al., 2019), Delmarva Estuarine Virome (DEV) (Sun et al., 2021), and Pearl River Estuary Virome (PREV) (Xu et al., 2022; Supplementary Table S1). The GOV 2.0 datasets were downloaded from the iMicrobe website. The DEV datasets were obtained from the NCBI SRA database. The PREV was sourced from the National Omics Data Encyclopedia. Core gene homolog recruitment was conducted using BLASTN, with the following thresholds: an e-value of <1e−5, a bit score of >40, a nucleotide identity of >95%, an alignment length of >90 bp, and a coverage of >40% (Mizuno et al., 2016; Martinez-Hernandez et al., 2017). The relative abundances of T4-like cyanophage clusters A, B, and C were normalized as the total recruited nucleotides (kb) per kilobase of core genes per gigabase of metagenome (KPKG) (Martinez-Hernandez et al., 2017).
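The recruitment thresholds and the KPKG normalization can be written out as a short Python sketch. Two points are assumptions made here rather than details given in the text: coverage is interpreted as the aligned fraction of the read, and the `Alignment` record, `passes_thresholds`, and `kpkg` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One BLASTN read-to-core-gene alignment (hypothetical record type)."""
    identity: float    # percent nucleotide identity
    length: int        # alignment length in bp
    evalue: float
    bitscore: float
    read_length: int   # length of the recruited read in bp


def passes_thresholds(a: Alignment) -> bool:
    """Apply the recruitment cutoffs listed in the text.

    Assumption: "coverage" is taken as alignment length / read length.
    """
    coverage = 100.0 * a.length / a.read_length
    return (a.evalue < 1e-5 and a.bitscore > 40 and a.identity > 95
            and a.length > 90 and coverage > 40)


def kpkg(recruited_bp: float, core_gene_bp: float, metagenome_bp: float) -> float:
    """Recruited kb per kilobase of core genes per gigabase of metagenome."""
    recruited_kb = recruited_bp / 1e3
    core_gene_kb = core_gene_bp / 1e3
    metagenome_gb = metagenome_bp / 1e9
    return recruited_kb / core_gene_kb / metagenome_gb


if __name__ == "__main__":
    # Made-up numbers for illustration only: 30 kb recruited to 30 kb of
    # core genes from a 2-Gb virome.
    print(kpkg(recruited_bp=30_000, core_gene_bp=30_000, metagenome_bp=2e9))
```

Normalizing by both core-gene length and virome size makes values comparable between clusters with different core-gene lengths and between viromes of different sequencing depth.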

Morphology and host range of S-CREM2
Cyanophage S-CREM2 and its host, Synechococcus sp. CRE1902, a member of Synechococcus subcluster 5.1 clade VI, were both isolated from the surface seawater of the Changjiang River Estuary in July 2019. Transmission electron microscopy observation revealed that S-CREM2 is a myovirus, possessing an isometric capsid (~96 nm in diameter) and an extraordinarily long contractile tail (~418 nm in length) (Figure 1; Supplementary Figure S1). Cyanomyovirus isolates rarely have tails longer than 200 nm (Sullivan et al., 2005; Clokie et al., 2008; Dreher et al., 2011; Sabehi et al., 2012; Xu et al., 2018; Rong et al., 2022; Wang et al., 2022; Zheng et al., 2023). S-CREM2 has the longest tail among the cyanomyoviruses, and even among the myoviruses, isolated so far. In contrast to the strong cross-infectivity of most cyanomyovirus isolates (Sullivan et al., 2003, 2008; Wang et al., 2022), S-CREM2 exhibits a narrow host range (Table 1). Among the nine Synechococcus strains examined in this study, S-CREM2 exclusively infected its original host, failing to cross-infect any other strains, even those in the same phylogenetic clade as the host. Previous studies have

Genomic features of S-CREM2
The genome of S-CREM2 was assembled into a circularly permuted, double-stranded DNA molecule with a length of 174,876 bp and a G + C content of 47.92%. A total of 219 ORFs, two tRNA genes, and a cis-regulatory RNA gene were predicted in the S-CREM2 genome (Figure 2; Supplementary Table S2). Of the 219 ORFs, 92 were annotated with predicted functions and grouped into four categories: structural formation (32 ORFs), DNA replication and metabolism (29 ORFs), regulation (27 ORFs), and lysis (4 ORFs) (Figure 2; Supplementary Tables S2, S3). The remaining 127 ORFs had unknown functions, with 49 ORFs having no matches in the NR database. A total of 159 ORFs of S-CREM2 showed homology with those of T4-like cyanophages that infect Prochlorococcus and Synechococcus, which indicates that S-CREM2 is a member of the T4-like cyanophages (Supplementary Table S2).
To investigate the phylogenetic relationship between S-CREM2 and other T4-like cyanophages, a set of 31 core genes was identified in S-CREM2 (Supplementary Table S4) and 40 reference T4-like cyanophages. Phylogenomic analysis based on these 31 core genes revealed that S-CREM2 clustered with Synechococcus phages S-B68, S-N03, S-H34, and S-CRM01 and formed a discrete clade, which is divergent from the well-characterized clusters A and B proposed by Ignacio-Espinoza and Sullivan (2012) (Figure 3). The new clade encompassing S-CREM2, S-B68, S-N03, S-H34, and S-CRM01 was named cluster C, in which the marine phage strains S-CREM2, S-B68, S-N03, and S-H34 exhibit a closer phylogenetic relationship with each other and are relatively distant from the freshwater strain S-CRM01. In addition, the G + C contents of the marine strains in cluster C (47.9-51.7%) are much higher than that of the freshwater strain S-CRM01 (39.7%) (Table 2) and those of clusters A (37.8-43%) and B (36.7-42.2%) (Jiang et al., 2020). A total of 150 ORFs in S-CREM2 showed homology with the T4-like cluster C cyanophages, and 46 of them were exclusively homologous to cluster C. Within cluster C, S-CREM2 shares the largest number of homologous genes (144) with S-H34. In addition, S-CREM2 has 137, 135, and 78 ORFs homologous with S-N03, S-B68, and S-CRM01, respectively (Supplementary Figure S2; Supplementary Table S2).
Five cyanophages in each T4-like cluster were selected as representatives to calculate nucleotide-based intergenomic similarities. The nucleotide similarities between cluster C cyanophages and representatives of clusters A and B (4.4-6.8%) are much lower than those within cluster C (11.4-68%) (Supplementary Figure S3). S-CREM2 showed nucleotide similarities of 11.4-32.8% with the other four cyanophages in cluster C, with the highest similarity observed with S-B68 and the lowest with S-CRM01. Following the genus-level classification criteria in phage taxonomy, a whole-genome nucleotide similarity of less than 50% is indicative of different genera (Adriaenssens and Brister, 2017). Thus, we propose classifying S-CREM2 as a representative of a new viral genus.
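The genus-level reasoning above amounts to comparing each pairwise similarity against the 50% cutoff. A minimal Python sketch follows, using only the two S-CREM2 values stated explicitly in the text (highest with S-B68, lowest with S-CRM01); the function names are hypothetical, and treating exactly 50% as "same genus" is an assumption made here.

```python
from typing import Dict

# Genus demarcation used in the text (Adriaenssens and Brister, 2017).
GENUS_THRESHOLD = 50.0  # percent whole-genome nucleotide similarity


def shares_genus(similarity_percent: float,
                 threshold: float = GENUS_THRESHOLD) -> bool:
    """True if two genomes are similar enough to be placed in one genus.

    Assumption: a similarity exactly at the threshold counts as same genus.
    """
    return similarity_percent >= threshold


def is_novel_genus(similarities: Dict[str, float]) -> bool:
    """True if the query shares a genus with none of the listed genomes."""
    return not any(shares_genus(v) for v in similarities.values())


if __name__ == "__main__":
    # Similarities of S-CREM2 to its closest and most distant cluster C
    # relatives, as reported in the text; the other two values fall in between.
    s_crem2 = {"S-B68": 32.8, "S-CRM01": 11.4}
    print(is_novel_genus(s_crem2))  # True
```

Because even the closest relative (S-B68, 32.8%) falls below the cutoff, the intermediate values for S-H34 and S-N03 cannot change the outcome.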

DNA replication and metabolism genes of S-CREM2
A total of 29 ORFs in the S-CREM2 genome are related to DNA replication and metabolism, including those encoding DNA polymerase, helicase, primase, ligase, various endonucleases, and enzymes involved in nucleotide metabolism and DNA damage repair. Nucleotide metabolism genes encoded in the S-CREM2 genome include nrdA (ORF163), nrdB (ORF164), thyX (ORF192), and a DNA adenine methylase gene (ORF8). Specifically, nrdA, nrdB, and thyX can provide DNA monomers for viral replication (Myllykallio et al., 2002; Gon et al., 2006; Koehn and Kohen, 2010). DNA adenine methylase, which is involved in nucleotide methylation, plays an important role in enhancing DNA stability (Miller et al., 2003). Among the DNA damage repair genes, the product of ORF10 (putative pyrimidine dimer DNA glycosylase) may function as a base excision repair protein, thereby reducing the accumulation of pyrimidine dimers caused by UV damage (Grafstrom et al., 1982; Walker et al., 2006), while the S-CREM2 putative UvsY (ORF1) and DNA repair exonuclease SbcCD ATPase subunit (ORF148) may be able to remove mutated bases and nucleotide fragments (Wilson and Murray, 1991). The expression of these genes may be crucial in maintaining accurate transcription and translation when the virus or host is subjected to external damage (Kemp and Hu, 2017). Additionally, a CRISPR-Cas9 nuclease gene (ORF12) is predicted in the S-CREM2 genome. Bacterial CRISPR-Cas9 nuclease is associated with chromosome rearrangement and genotoxicity, and it functions as a component of the adaptive immune system, which defends against viral infection by degrading DNA originating from invading viruses or other foreign sources (Cui et al., 2020). The S-CREM2 CRISPR-Cas9 nuclease gene may have been acquired from cyanobacterial hosts through horizontal gene transfer.
Among the 29 DNA replication and metabolism ORFs predicted in the S-CREM2 genome, 27 ORFs are homologous to those predicted in other T4-like cluster C cyanophages, with the highest amino acid identity for each gene ranging from 32.7 to 94.6% and averaging 73.1%. Additionally, 28 of the 29 DNA replication and metabolism ORFs showed homology with genes predicted in T4-like cluster A and B cyanophages, with the highest amino acid identity for each gene ranging from 29.3 to 81.4% and averaging 54.6% (Supplementary Figure S4A).

Structural genes of S-CREM2
A total of 32 ORFs were predicted to encode structural proteins in the S-CREM2 genome, including terminase large subunit, terminase small subunit, portal protein, adaptor, stopper, sheath terminator, capsid-related proteins, and tail-related proteins (Figure 2; Supplementary Tables S2, S3), 24 of which were detected in the virion proteome by mass spectrometry analysis (Table 3). Of the 32 structural ORFs, 31 show homology to ORFs predicted in other cluster C cyanophages, with the highest amino acid identity for each gene ranging from 30.6 to 92.9% and averaging 64.6% (Supplementary Figure S4B). The S-CREM2 ORF219, encoding a long tail fiber distal subunit, shows no homology with genes of any cyanophage but is homologous to genes predicted in heterotrophic bacteria, other bacteriophages, and Ostreococcus lucimarinus viruses, with amino acid identities ranging from 26.5 to 67% (Supplementary Table S2). Of the 32 S-CREM2 structure-related ORFs, 25 are homologous with those predicted in the T4-like cluster A and B cyanophages, with the highest amino acid identity for each gene ranging from 26.8 to 70.5% and averaging 46.9% (Supplementary Figure S4B). It is worth noting that the S-CREM2 ORFs involved in structure formation exhibit a lower degree of conservation than ORFs related to DNA replication and metabolism (Supplementary Figure S4). Notably, six tail-related ORFs of S-CREM2 are exclusively homologous to genes predicted in the T4-like cluster C cyanophages. By comparing the S-CREM2 structural proteins involved in virion formation with those of the T4 phage, the overall architecture of the S-CREM2 virion was predicted (Figure 4). Twenty-three proteins were mapped to the virion structure, with 21 detected in the virion proteome (Figure 4). Most of the structural proteins of S-CREM2 closely resemble those of the T4 phage. However, the adaptor and the long tail fiber are different from those of the T4 phage and show homology with those of Escherichia phage vB_EcoP_SU10 and Escherichia phage K1F, respectively (Supplementary Table S5).

Virion-associated proteins of S-CREM2
A total of 52 S-CREM2-encoded proteins were detected in the virion proteome by mass spectrometry analysis, including 24 structural proteins, six non-structural proteins, and 22 proteins with unknown function (Figure 2; Table 3). Notably, among the 22 proteins with unknown function, 13 are encoded in the genome region (ORF84-145) primarily occupied by structural genes and may also be structural proteins, suggesting that distinctive proteins contribute to the formation of the unique virion with its extraordinarily long tail. The six non-structural proteins include S-adenosyl methionine (SAM) hydrolase (ORF6), APH/ChoK-like kinase (ORF14), cytidylyltransferase (ORF154), CRISPR-Cas9 nuclease (ORF12), and two endolysins (ORF208, 209) and may be encapsulated within the capsid as virion-associated proteins.
SAM hydrolase is essential for the degradation of S-adenosyl methionine (SAM) (Jerlström Hultqvist et al., 2018). SAM serves as a crucial methyl donor for methyltransferases that act on nucleic acids, proteins, and lipids in bacterial cells (Loenen, 2006). As a defense mechanism, bacteria employ SAM to differentiate their own DNA from that of foreign invaders (Wilson and Murray, 1991). It is reported that the phage-encoded SAM hydrolase can degrade SAM, switching off the bacterial defense (Jerlström Hultqvist et al., 2018; Guo et al., 2021). The entry of the phage-encoded SAM hydrolase into the host cell upon infection may protect the phage genomic DNA from attack by the host restriction-modification systems. VAPKs are common in enveloped viruses infecting animals and plants but are rarely discovered in phages (Hui, 2002). Recently, protein kinase-like proteins have been repeatedly detected in cyanophage virion proteomes (Dreher et al., 2011; Sabehi et al., 2012; Xu et al., 2018) and are speculated to regulate host bioprocesses by phosphorylating specific substrates such as serine, threonine, or tyrosine residues of proteins, aminoglycosides, and choline. The putative APH/ChoK-like kinase detected in the S-CREM2 virions is homologous to, and shares 32% amino acid identity with, the putative protein kinase detected in the S-CRM01 virions. In prokaryotes, APHs phosphorylate and inactivate aminoglycoside antibiotics (Wright and Thompson, 1999), whereas ChoKs facilitate the formation of phosphorylcholine and play an important role in phosphorylcholine-associated lipopolysaccharide modifications on the cell surface and in cell stress (Thomsen et al., 2003). The S-CREM2 APH/ChoK-like kinase may influence antibiotic resistance and the stress tolerance of the host cells (Wright and Thompson, 1999; Thomsen et al., 2003).
Figure caption: The maximum-likelihood phylogenomic tree based on the 31 core genes among S-CREM2 and 40 T4-like cyanophages. Bootstrap values are calculated based on 100 replicates. The 31 core genes contain 11 DNA replication-related genes, 13 structure-related genes, four AMGs, and three hypothetical genes.
Liu et al. 10.3389/fmicb.2023.1293846 Frontiers in Microbiology frontiersin.org
Cytidylyltransferase is a homolog of NMNAT. Previous studies have speculated that phage-encoded NMNAT may regulate host metabolism by affecting NAD⁺ levels in the cell and promote the production of phage progeny (Raffaelli et al., 1997, 1999, 2001; Wang et al., 2022). The frequent detection of protein or small-molecule kinases and cytidylyltransferases in the virion proteomes of cyanophages suggests that these proteins may be carried by the virion and enter the host cells to create an optimized intracellular environment that fosters phage replication at the initial stage of infection. However, it is also possible that these phage-encoded regulatory proteins are highly expressed during infection and were not separated from virions during CsCl purification. Further efforts are needed to verify their presence in the phage virions.

Limited and unique AMGs in T4-like cluster C cyanophages
In contrast to the numerous and diverse AMGs identified in T4-like cyanophages of clusters A and B, only a limited number of AMGs were predicted in the T4-like cluster C cyanophages (Figure 5). Only the freshwater strain, S-CRM01, contains all six of the AMGs most commonly found in T4-like cyanophages of clusters A and B: psbA, hli, phoH, mazG, cobS, and hsp20 (Ignacio-Espinoza and Sullivan, 2012; Jiang et al., 2020). By contrast, the four marine strains in cluster C, S-CREM2, S-H34, S-N03, and S-B68, encode only phoH, mazG, and hsp20 (Figure 5).
Though lacking the commonly found photosynthesis genes psbA and hli, the four marine cyanophage strains in cluster C encode an S-type phycobilin lyase gene, cpcV (Figure 5). Phycobilin lyases catalyze the covalent ligation between phycobilin chromophores and phycobiliproteins at specific binding sites, facilitating the synthesis of the phycobilisome (Bretaudeau et al., 2013). The expression of phage cpcVs may assist light absorption in infected host cells, providing energy for phage replication (Six et al., 2007; Xu et al., 2018). Phycobilin lyase genes are common in T4-like cyanophages of clusters A and B. However, all of the phycobilin lyase genes in clusters A and B are T-type, cpeT or cpcT. The cpcV gene is found only in T4-like cluster C cyanophages and S-CBWM1 (Xu et al., 2018). Phylogenetic analysis revealed that cpcT and cpeT of cyanophages in clusters A and B grouped into a stable branch with those of picocyanobacteria. The cpcV of cluster C cyanophages formed an individual clade with those of S-CBWM1 and a putative prophage of Synechococcus sp. SYN20 but did not cluster with any host-derived cpcV (Figure 6A). The discovery of more cpcV homologs from both cyanophages and cyanobacteria would help clarify the evolutionary source and trajectory of the cyanophage cpcVs in future studies.
In addition, the S-CREM2 genome carries a cp12 gene that is related to carbon metabolism. As a Calvin cycle inhibitor, the phage-encoded CP12 was proposed to redirect the host carbon flow from the Calvin cycle to the pentose phosphate pathway, resulting in the accumulation of ATP, NADPH, and pentose that is favorable for phage dNTP biosynthesis (Thompson et al., 2011). The phylogeny of cp12 revealed that cp12 homologs from marine T4-like cyanophages of clusters A and B and from cyanopodoviruses both grouped with those of marine picocyanobacteria, indicating that cyanophages may acquire cp12 from their hosts. However, the S-CREM2 cp12 clustered with those of three cyanosiphoviruses and formed a very deep branch (Figure 6B), which suggests that the S-CREM2 cp12 has a different origin or has followed a divergent evolutionary trajectory from those of T4-like cyanophages in clusters A and B.
Figure caption: Maximum-likelihood phylogenetic trees of phycobilin lyase genes (A) and cp12s (B) from picocyanobacteria and cyanophages. The phylogenetic analyses of the phycobilin lyase genes were performed based on the amino acid sequences, while the cp12 trees were constructed using nucleotide sequences. Numbers near each branch node represent the bootstrap values (maximum-likelihood/neighbor-joining, ML/NJ) of ≥50%. The bootstrap replicates = 1,000.

Distinct ncRNA profiles between the marine and freshwater phage strains in cluster C
Three ncRNA genes were identified in the S-CREM2 genome, including two tRNA genes and a cis-regulatory RNA gene (wcaG) (Figure 2). The numbers of tRNA genes vary greatly among cyanophages in T4-like cluster C (Table 2). The marine phages in cluster C contain no more than five tRNA genes, while the freshwater strain, S-CRM01, contains 33 tRNA genes, which cover all 20 amino acid specificities. Notably, the G + C content of S-CRM01 (39.7%) is much lower than those of the marine phage strains in cluster C (47.9-51.7%), whereas the G + C contents of their hosts show the opposite pattern. The G + C contents of Synechococcus CRE1902 and WH7803, which are hosts of S-CREM2 and S-B68, are 57.4 and 60.2%, respectively. The G + C content of the S-CRM01 host, Synechococcus LC16, is not available. Synechococcus LC16 is a member of the Cyanobium gracile cluster. Since the G + C contents of cyanobacteria in the same phylogenetic clade are usually similar, the G + C content of Synechococcus LC16 can be estimated from that of the type strain in the Cyanobium gracile cluster, Synechococcus PCC6307, which is 68.5% and much higher than those of Synechococcus CRE1902 and WH7803. It is speculated that phages carry tRNAs to overcome the codon usage difference from their hosts (Enav et al., 2012). The large difference in tRNA number between S-CRM01 and the marine phages in cluster C can be explained by the larger discrepancy in G + C content between S-CRM01 and its host than between the marine phage strains and their hosts. A cis-regulatory RNA gene, wcaG, was also predicted in the genomes of the other three marine phages of cluster C. S-H34 also contains an extra glnA. However, no cis-regulatory elements were found in the S-CRM01 genome (Table 2). In prokaryotes, the cis-regulatory RNA wcaG acts as a regulator of exopolysaccharide production-related genes, and glnA regulates the expression of genes related to nitrogen metabolism (Weinberg et al., 2010). Phage cis-regulatory RNAs may also play similar roles in altering host metabolism during infection. Different compositions of tRNA and cis-regulatory RNA genes between the marine and freshwater phages in cluster C may reflect their different modes of coping with their hosts and habitats.
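The G + C argument above can be made concrete with a small Python sketch that computes G + C content from a sequence and compares the phage-host gaps using the percentages quoted in the text. The function names are hypothetical, and the S-CRM01 host value is the estimate from Synechococcus PCC6307, as in the text, not a measured value for Synechococcus LC16.

```python
def gc_content(sequence: str) -> float:
    """Percent G + C among unambiguous bases (A, C, G, T) in a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


def gc_gap(phage_gc: float, host_gc: float) -> float:
    """Absolute difference in G + C content (percentage points)."""
    return abs(host_gc - phage_gc)


if __name__ == "__main__":
    # (phage G + C %, host G + C %) taken from the text; the LC16 value is
    # the PCC6307-based estimate.
    pairs = {
        "S-CREM2 / CRE1902": (47.92, 57.4),
        "S-CRM01 / LC16 (estimated)": (39.7, 68.5),
    }
    for name, (phage, host) in pairs.items():
        print(f"{name}: gap = {gc_gap(phage, host):.1f} percentage points")
```

With these values the gap for S-CRM01 (about 29 percentage points) is roughly three times that for S-CREM2 (about 9.5), which is consistent with the much larger tRNA complement of S-CRM01.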

Ecological distribution of cluster C cyanophages
The distribution and relative abundance of the T4-like cluster C cyanophages in the marine environment were investigated by metagenomic fragment recruitment analyses and compared with those of T4-like cluster A and B cyanophages. Among the 120 viromes used for the recruitment analyses, cluster C-like cyanophages were detected in 29 viromes retrieved from various ecosystems, including temperate and subtropical estuaries, diverse coastal regions, and open oceans in both tropical and polar regions (Figure 7A). The five cluster C-like cyanophages are widespread in estuarine and coastal regions. Specifically, the S-CREM2-like cyanophages are more abundant in coastal environments, whereas the S-H34-, S-N03-, and S-B68-like cyanophages are more prevalent in estuarine environments. In contrast, ORF homologs of the freshwater strain, S-CRM01, are rarely detected in marine ecosystems (Figure 7B). The presence of these five cluster C-like cyanophages in the open sea, however, is quite limited: only four of the 18 open sea viromes used in this study contain cluster C-like cyanophages (Figure 7). The distributional pattern of the cluster C-like cyanophages is congruent with that of their hosts, Synechococcus subcluster 5.1 clades V, VI, and IX (Table 2), which also thrive in estuarine and coastal environments but are rarely observed in the open sea (Xia et al., 2015; Sohm et al., 2016).
The cluster A- and B-like cyanophages are prevalent across various marine ecosystems. Although cluster C-like cyanophages are generally less abundant than specific cluster A- and B-like members across marine ecosystems, specific cluster C-like members exhibit comparable or even higher relative abundances in certain estuarine regions (Figure 7B). This suggests that cluster C-like cyanophages play important ecological roles in the estuarine environment, roles that have previously been overlooked because these phages were not known to exist.
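To compare relative abundances across viromes of different sizes and phage genomes of different lengths, recruited read counts need to be normalized. The sketch below is illustrative only. It assumes that KPKG denotes reads recruited per kilobase of genome per gigabase of metagenome, a definition this section does not state; the function name and all input numbers are hypothetical, and only the virome counts (29 of 120, and 4 of 18 open sea viromes) come from the text.

```python
def kpkg(recruited_reads: int, genome_length_bp: int, metagenome_size_bp: int) -> float:
    """Reads recruited per kb of genome per Gb of metagenome.

    ASSUMPTION: this is one common reading of "KPKG"; the exact
    definition used in the study is not given in this section.
    """
    genome_kb = genome_length_bp / 1_000
    metagenome_gb = metagenome_size_bp / 1_000_000_000
    return recruited_reads / genome_kb / metagenome_gb


# Hypothetical example: 500 reads recruited to a 200-kb phage genome
# from a virome containing 2 Gb of sequence.
example_value = kpkg(500, 200_000, 2_000_000_000)
print(f"KPKG (hypothetical inputs): {example_value}")

# Detection frequencies reported in the text.
all_viromes_fraction = 29 / 120  # viromes containing cluster C-like phages
open_sea_fraction = 4 / 18       # open sea viromes containing them
print(f"all viromes: {all_viromes_fraction:.1%}, open sea: {open_sea_fraction:.1%}")
```

Normalizing by both genome length and metagenome size keeps a long genome or a deeply sequenced virome from appearing more abundant merely because it offers more opportunities for reads to be recruited.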

Conclusion
Cyanophage S-CREM2 represents a new viral genus. The discovery of S-CREM2 extends the known range of tail lengths in cyanomyoviruses and leads to the establishment of a new T4-like cyanophage clade, cluster C. The far fewer but unique AMGs and the various virion-associated regulatory proteins of S-CREM2 may drive phage-host interactions that differ from those of cluster A and B cyanophages. The T4-like cluster C cyanophages are widespread in estuarine and coastal environments. Specific members of this cluster may play important roles in certain estuarine ecosystems, given their equivalent or even higher relative abundance compared to cyanophages of clusters A and B. The isolation of S-CREM2 and the establishment of the T4-like cluster C cyanophages provide new insights into phage diversity, evolution, and phage-host interactions in the marine environment.

FIGURE 2
FIGURE 2 Genome organization of S-CREM2. ORFs with different functions are indicated by different colored arrows, and red dotted boxes represent virion proteins identified by mass spectrometry analysis. The number inside each arrow indicates the ORF number. ncRNA genes are labeled below the ORF bars.

FIGURE 3
FIGURE 3 and morphology features of the T4-like cyanophages in cluster C.

FIGURE 4
FIGURE 4 Predicted architecture of the S-CREM2 virion. Bold fonts indicate structural proteins detected by mass spectrometry analysis.

FIGURE 5
FIGURE 5 Comparative analysis of the AMGs in different T4-like cyanomyovirus clusters. Phylogenomic tree of S-CREM2 and 40 T4-like cyanophages based on 31 core genes. Colored boxes on the left signify T4-like cyanomyovirus clusters.

FIGURE 7
FIGURE 7 Comparison of the environmental distribution of T4-like cluster A-, B-, and C-like cyanomyoviruses. (A) Locations of the publicly available viromes used for the distributional analyses of the T4-like cyanophages. Red and blue dots indicate the presence and absence of cluster C-like cyanophages, respectively. (B) Relative abundances of five representative cyanophage strains from each cluster in metagenome databases. Results are shown for the 29 viromes containing cluster C-like cyanophages, together with five additional open sea viromes included to better demonstrate the distributional pattern of cluster A and B cyanophages in the open ocean. Relative abundance was normalized by KPKG.

TABLE 1
Host range of S-CREM2.

TABLE 3
The virion proteome of S-CREM2.