Crystal Structures of R-Type Bacteriocin Sheath and Tube Proteins CD1363 and CD1364 From Clostridium difficile in the Pre-assembled State

Diffocins are high-molecular-weight phage tail-like bacteriocins (PTLBs) that some Clostridium difficile strains produce in response to SOS induction. Similar to the related R-type pyocins from Pseudomonas aeruginosa, R-type diffocins act as molecular puncture devices that specifically penetrate the cell envelope of other C. difficile strains to dissipate the membrane potential and kill the attacked bacterium. Thus, R-type diffocins constitute potential therapeutic agents to counter C. difficile-associated infections. PTLBs consist of rigid and contractile protein complexes. They are composed of a baseplate, receptor-binding tail fibers and an inner needle-like tube surrounded by a contractile sheath. In the mature particle, the sheath and tube structure form a complex network comprising up to 200 copies of a sheath and a tube protein each. Here, we report the crystal structures together with small angle X-ray scattering data of the sheath and tube proteins CD1363 (39 kDa) and CD1364 (16 kDa) from C. difficile strain CD630 in a monomeric pre-assembly form at 1.9 and 1.5 Å resolution, respectively. The tube protein CD1364 displays a compact fold and shares highest structural similarity with a tube protein from Bacillus subtilis but is remarkably different from that of the R-type pyocin from P. aeruginosa. The structure of the R-type diffocin sheath protein, on the other hand, is highly conserved. It contains two domains, whereas related members such as bacteriophage tail sheath proteins comprise up to four, indicating that R-type PTLBs may represent the minimal protein required for formation of a complete sheath structure. Comparison of CD1363 and CD1364 with structures of PTLBs and related assemblies suggests that several conformational changes are required to form complete assemblies. 
In the sheath, rearrangement of the flexible N- and C-terminus enables extensive interactions between the other subunits, whereas for the tube, such contacts are primarily established by mobile α-helices. Together, our results combined with information from structures of homologous assemblies allow constructing a preliminary model of the sheath and tube assembly from R-type diffocin.


INTRODUCTION
In order to infect bacteria, bacteriophages use an attachment organelle known as the "tail," which recognizes the host cell and attaches the phage's capsid to it. The tail then acts as a molecular syringe to puncture the cell envelope and establish an inner channel that translocates genomic DNA and proteins from the capsid into the cell cytoplasm (Goldberg et al., 1994; Leiman and Shneider, 2012; Hu et al., 2015). Tailed bacteriophages (order Caudovirales) can be divided into three families: Myoviridae (long contractile tail), Siphoviridae (long non-contractile tail), and Podoviridae (short non-contractile tail) (Sharp, 2001; Ackermann, 2003; Veesler and Cambillau, 2011).
Interestingly, bacterial genomes often contain gene clusters encoding structural elements that are evolutionarily related to bacteriophage tail structures with regard to morphology, size, and mechanism of action but that lack a phage capsid. These include, e.g., the T6SS, needle-like particles such as the Afp from Serratia entomophila (Hurst et al., 2004) or the PVC (Yang et al., 2006), as well as PTLBs such as the "pyocins" from Pseudomonas aeruginosa or the "monocins" from Listeria monocytogenes (Nakayama et al., 2000; Lee et al., 2016). PTLBs, also referred to as tailocins, are high-molecular-weight protein particles that are widespread in bacteria and for a subset of which antibacterial activity against competing strains of the same species has been demonstrated (Ghequire and Mot, 2015; Scholl, 2017). Two types of PTLBs have been identified: the R-type PTLBs, which are rigid and contractile and are related to Myoviridae tail structures (Kageyama, 1964; Ishii et al., 1965), and the F-type PTLBs, which are flexible and non-contractile and are related to Siphoviridae tails (Takeya et al., 1967; Govan, 1974; Kuroda and Kageyama, 1979). Both types are exemplified by the best-studied PTLBs so far, namely the R- and F-type pyocins from P. aeruginosa (Nakayama et al., 2000; Michel-Briand and Baysse, 2002; Scholl, 2017). A more recently identified group of PTLBs are the R-type "diffocins" from Clostridium difficile, which constitute the only Gram-positive R-type PTLBs known to date (Scholl, 2017).
R-type PTLBs share a common structure and mechanism with bacteriophage tails of the Myoviridae family and, although they cannot independently replicate, it has been suggested that PTLBs should not be considered defective prophages but rather as specifically adapted to their function as antibacterial instruments (Nakayama et al., 2000; Leiman et al., 2009; Ghequire and Mot, 2015). R-type PTLBs are produced and accumulate within the bacterium under SOS response conditions and are only released upon lysis of the cell (Matsui et al., 1993; Williams et al., 2008; Lee et al., 2016). Whereas for members of other strains from the same species, contact with only a single particle can lead to cell death, cells harboring the respective PTLB genes themselves are provided with a resistance mechanism (Scholl et al., 2009; Köhler et al., 2010). Thus, sister cells of the same strain gain advantage from self-sacrifice of one member during which PTLBs are released (Lee et al., 2016; Scholl, 2017).
Abbreviations: Afp, antifeeding prophage from Serratia entomophila; CDIs, C. difficile infections; Cryo-EM, cryo electron microscopy; ESI, electrospray ionization; HCP(1), hemolysin co-regulated tube protein (1); MD, molecular dynamics; ORF, open reading frame; PDB, Protein Data Bank; PTLBs, phage tail-like bacteriocins; PVC, virulence cluster from Photorhabdus luminescens; Q-TOF, quadrupole time of flight; rmsd, root mean square deviations; SAD, single-wavelength anomalous dispersion; SAXS, small angle X-ray scattering; SeMet, seleno-L-methionine; SLIC, sequence- and ligation-independent cloning; T6SS, type VI secretion system; TLS, translation libration screw; VTFM, variable target function method.
All R-type PTLBs consist of an inner needle-like rigid tube surrounded by a contractile sheath. Both of these complexes are composed of disks that assemble from six sheath or tube subunits via an axial sixfold rotational symmetry and are stacked atop each other after right-handed rotation (Heymann et al., 2013; Ge et al., 2015; Nováček et al., 2016; Brackmann et al., 2017; Salih et al., 2018). With up to 200 copies of the individual subunits in one sheath or tube assembly (Takeda and Kageyama, 1975), these proteins constitute the main body of the particle, which is capped by a complex baseplate structure with six attached receptor-binding proteins (Taylor et al., 2018). These proteins are responsible for recognition and binding to specific receptors on the surface of the target cell and thus determine the very narrow specificity spectrum for a subset of strains within the species (Meadow and Wells, 1978; Nakayama et al., 2000; Williams et al., 2008; Köhler et al., 2010; Gebhart et al., 2012; Kocíncová and Lam, 2013). Binding triggers extensive conformational changes in the baseplate, accompanied by contraction of the sheath, which, due to extensive interactions, drives the rigid inner tube into the cell envelope. The driving force of sheath contraction and subsequent movement of the tube resides in the high-energy metastable structure of the extended state of the sheath (Leiman and Shneider, 2012; Ge et al., 2015; Taylor et al., 2016; Brackmann et al., 2017). As a consequence of the puncturing, ions can pass through the tube, resulting in fast dissipation of the membrane potential and death of the attacked cell (Figure 1) (Uratani and Hoshino, 1984; Strauch et al., 2001). Thus, the complex network between the contractile sheath and the rigid tube is essential for the killing potency of R-type PTLBs (Kube and Wendler, 2015; Brackmann et al., 2017; Scholl, 2017).
Their impressive killing capacity makes R-type PTLBs promising candidates for an efficient treatment of bacterial infections and, given their narrow strain-specificity, it seems likely that they can be administered without the risk of transmissible resistance or adverse effects on the beneficial flora (Ghequire and De Mot, 2014; Behrens et al., 2017; Scholl, 2017). Additionally, several studies demonstrated that R-type PTLBs can be specifically bioengineered to also target other species or strains (Williams et al., 2008; Scholl et al., 2009, 2012), and proof-of-concept studies have been conducted to treat the respective infections (Merrikin and Terry, 1972; Haas et al., 1974; Scholl and Martin, 2008; Ritchie et al., 2011; Gebhart et al., 2015). Thus, R-type PTLBs could serve as novel therapeutics to treat infections, particularly those caused by antibiotic-resistant bacteria.
FIGURE 1 | The genome of Clostridium difficile strain CD630 contains a 25-gene cluster encoding the R-type "diffocin." Diffocins form after SOS induction and are released by cell lysis to bind closely related but unprotected C. difficile strains. Contraction of the diffocin sheath (orange) drives the tube (blue) through the wall of the attacked cell and leads to death by dissipation of the membrane potential.
The anaerobic Gram-positive bacterium C. difficile is a human pathogen that can be found in the intestine of humans as well as animals (Hall and O'toole, 1935; Larson et al., 1977; George et al., 1978; Bartlett, 1994). Symptoms of infections range from diarrhea to life-threatening pseudomembranous colitis (Poutanen and Simor, 2004; Borali and Giacomo, 2016) and C. difficile-associated morbidity and mortality in humans have increased in recent years (Lessa et al., 2015). C. difficile and its spores are resistant to a range of antibiotics as well as to various environmental factors and can therefore form reservoirs in healthcare settings, which significantly enhances the risk for hospital-acquired infections (Bignardi, 1998; Zilberberg et al., 2008; Vohra and Poxton, 2011; Tenover et al., 2012; Slimings and Riley, 2014). Additionally, a preceding treatment with broad-spectrum antibiotics constitutes a major risk factor for CDIs since the protective microbiome in the gut is disrupted, providing a less competitive habitat for relapsing C. difficile (Theriot and Young, 2015). Therefore, alternatives to conventional antibiotics are urgently needed to specifically target C. difficile, and the R-type diffocins, possibly after bioengineering, constitute promising candidates (Gebhart et al., 2015). Toward this, insights into their three-dimensional structures could provide valuable information about their activity, assembly, and specificity. In this respect, R-type pyocins from P. aeruginosa are currently the most thoroughly studied R-type PTLBs, whereas structural information for other particles such as the C. difficile R-type diffocins is largely absent. Additionally, in contrast to prophage-like elements and bacteriophage tails, no crystal structures of the free sheath and tube building blocks are available, which precludes insights into the pre-assembly states of these proteins. Here, we therefore aimed at determining the crystal structures of these components from C. difficile strain CD630 individually. By extensive comparison with homologous structures, we demonstrate that several conformational changes are required in order to form the R-type diffocin particle. We also provide a model of the assembled sheath and tube structures of the R-type diffocin contractile apparatus, which may serve as templates for future cryo electron microscopy (cryo-EM) studies.

Cloning, Expression, and Purification
The genes encoding the R-type diffocin sheath protein CD1363 and tube protein CD1364 were PCR-amplified from genomic C. difficile CD630 DNA, kindly provided by Prof. Dr. Ralf Gerhard (Hannover Medical School, Hannover, Germany). Both genes were subcloned into a pOPINM expression vector (Addgene plasmid #26044; a kind gift from Ray Owens; Berrow et al., 2007) via SLIC (Li and Elledge, 2007), using the oligonucleotide primers listed in Supplementary Table S1 to yield fusion proteins with N-terminal 6xHis-MBP-tags and HRV3C protease cleavage sites. Positive clones of both constructs were identified by colony PCR and the correct sequence was verified by sequencing. Optimal conditions for recombinant protein production were determined in small-scale test expression experiments and large-scale expression was performed with Escherichia coli BL21-CodonPlus-RIL cells (Agilent Technologies, Santa Clara, CA, United States) for 24 h in M9 minimal medium (Miller, 1972) at 20 °C for seleno-L-methionine-(SeMet-)labeled sheath protein CD1363 or in auto-induction medium (Studier, 2014) at 25 °C for native tube protein CD1364. Cell pellets were resuspended in 20 mM HEPES, 300 mM NaCl, 3 mM DTT, pH 7.5, homogenized and centrifuged. The supernatants were applied onto a 5 mL MBPTrap HP column on an ÄKTA system (GE Healthcare Life Sciences, Pittsburgh, PA, United States) and eluted with 10 mM maltose in the same buffer. Tags were cleaved off with HRV3C protease during overnight incubation at 4 °C. To subsequently remove cleaved affinity tags and uncleaved protein, a nickel affinity chromatography step was performed via a 5 mL HisTrap™ HP chelating column (GE Healthcare Life Sciences) and the flow-through containing the respective protein was concentrated using a Vivaspin 6 10 kDa cutoff concentrator (GE Healthcare Life Sciences).
As a polishing step, the proteins were passed through a Superdex 75 16/60 or 26/60 prep grade column (GE Healthcare Life Sciences) and fractions containing the proteins of interest were concentrated and dialyzed against 20 mM HEPES, 300 mM NaCl, 3 mM DTT, pH 7.5. Protein purity was assessed by SDS-PAGE and protein concentrations were determined spectroscopically using extinction coefficients calculated from the amino acid sequence via the PROTPARAM web server (Gasteiger et al., 2005). The correct molecular masses of the purified proteins were confirmed at the Department of Chemical Biology at the Helmholtz Center for Infection Research, using a maXis HD ultra-high resolution (UHR) Q-TOF mass spectrometer equipped with an Apollo II electrospray source (Bruker Daltoniks, Bremen, Germany) and an Ultimate 3000RS autosampler together with a binary high gradient pump (Dionex/Thermo Fisher Scientific, Waltham, MA, United States). A solvent flow of 50 µL/min was used for infusion of a calibrating solution as well as for the sample. Data were processed using the Data Analysis Software Version 4.2 (Bruker Daltoniks). Original mass spectra containing peaks from multiply charged ions were smoothed and deconvoluted to obtain a singly charged mass spectrum. The UV traces of the size exclusion polishing step, SDS-PAGE analysis of purified proteins and ESI-MS spectra are shown in Supplementary Figure S1.
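The spectroscopic concentration determination described above can be illustrated with a minimal sketch. It assumes the commonly used sequence-based estimate in which tryptophan, tyrosine and cystine contribute 5500, 1490 and 125 M^-1 cm^-1 at 280 nm, followed by the Beer-Lambert law. The function names and the toy sequence are hypothetical and are not taken from the study; the actual values for CD1363 and CD1364 were obtained from the PROTPARAM web server.

```python
# Per-residue molar absorption at 280 nm (M^-1 cm^-1); assumed standard
# values for sequence-based estimates (Trp, Tyr, cystine).
EPS_TRP = 5500
EPS_TYR = 1490
EPS_CYSTINE = 125


def extinction_coefficient(sequence: str, reduced: bool = True) -> int:
    """Estimate the molar extinction coefficient at 280 nm from a sequence.

    With reduced=True (e.g. in a DTT-containing buffer) cysteines are
    assumed not to form disulfides and do not contribute.
    """
    seq = sequence.upper()
    eps = seq.count("W") * EPS_TRP + seq.count("Y") * EPS_TYR
    if not reduced:
        # Each disulfide (cystine) pairs two cysteine residues.
        eps += (seq.count("C") // 2) * EPS_CYSTINE
    return eps


def concentration_mg_per_ml(a280: float, eps: float, mass_da: float,
                            path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c [mol/L] = A / (eps * l); times mass gives g/L = mg/mL."""
    if eps <= 0:
        raise ValueError("extinction coefficient must be positive")
    return a280 / (eps * path_cm) * mass_da


# Hypothetical toy sequence, NOT the sequence of CD1363 or CD1364.
TOY_SEQUENCE = "MKWYAYCC"
```

Calling `extinction_coefficient(TOY_SEQUENCE)` counts one tryptophan and two tyrosines; the cystine term is only added when `reduced=False`.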

Small Angle X-Ray Scattering (SAXS) Analysis
The oligomeric states of the purified proteins were assessed by SAXS experiments at beamline BM29 at the European Synchrotron Radiation Facility (ESRF, Grenoble, France; Pernot et al., 2013) equipped with a PILATUS 1M detector, using a sample-to-detector distance of 2.867 m and a wavelength of λ = 0.9919 Å. All data collection steps were carried out at 4 °C and scattering data from buffer ingredients were subtracted from protein scattering data using PRIMUS software (Konarev et al., 2003). The best data were obtained in 20 mM HEPES pH 7.5, 100 mM NaCl, 3 mM DTT with 0.63 and 1.25 mg/mL of sheath protein CD1363 or tube protein CD1364, respectively. Scattering data were normalized, averaged and merged using PRIMUS software and SAXS envelopes were calculated with DAMMIF, DAMSEL, DAMSUP, DAMAVER, and DAMFILT of the ATSAS software package (Franke and Svergun, 2009). Fitting of the respective crystal structures to the experimental scattering curves and to the calculated envelopes was performed with CRYSOL (Svergun et al., 1995) and with UCSF Chimera (Pettersen et al., 2004), respectively.
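As a rough illustration of how a theoretical scattering curve is compared with measured data, the following sketch computes the momentum transfer q = 4π sin(θ)/λ and a reduced chi-square after least-squares scaling of the model curve. This is a simplified stand-in under stated assumptions, not the CRYSOL implementation; in particular, CRYSOL additionally adjusts hydration-shell and excluded-volume parameters, and its exact discrepancy definition may differ. All function names are hypothetical.

```python
import math


def momentum_transfer(two_theta_deg: float, wavelength: float) -> float:
    """q = 4*pi*sin(theta)/lambda, where 2*theta is the scattering angle.

    The unit of q is the inverse of the wavelength unit (e.g. 1/Angstrom).
    """
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


def best_scale(i_exp, i_calc, sigma):
    """Scale factor c that minimises sum(((I_exp - c * I_calc) / sigma)^2)."""
    num = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    den = sum(m * m / s ** 2 for m, s in zip(i_calc, sigma))
    return num / den


def reduced_chi_square(i_exp, i_calc, sigma):
    """Reduced chi-square of the scaled model curve against the data."""
    if not (len(i_exp) == len(i_calc) == len(sigma)) or len(i_exp) < 2:
        raise ValueError("need equally long curves with at least two points")
    c = best_scale(i_exp, i_calc, sigma)
    total = sum(((e - c * m) / s) ** 2
                for e, m, s in zip(i_exp, i_calc, sigma))
    return total / (len(i_exp) - 1)
```

Values close to 1 indicate that the deviations between model and data are of the order of the experimental errors.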

Protein X-Ray Crystallography and Structure Determination
Vapor diffusion experiments were set up in a 96-well sitting drop format at 19 °C using a dispensing robot (Zinsser Analytics, Frankfurt, Germany). A volume of 0.1 µL of protein solution at a concentration of 20 mg/mL for both the SeMet-labeled sheath protein CD1363 and native tube protein CD1364 was mixed with an equal amount of reservoir solution to give the concentrations listed in Table 1. The drops were equilibrated against 70 µL of reservoir solution and crystals appeared after a few days. Before flash cooling the crystals in liquid nitrogen, 10% (v/v) of (2R,3R)-(-)-2,3-butanediol was added as cryoprotectant.
X-ray data collection for native tube protein CD1364 was carried out at 100 K at beamline 14.1 at BESSY II (Helmholtz-Zentrum Berlin (HZB), Berlin, Germany; Gerlach et al., 2016). For SeMet-labeled sheath protein CD1363, a SAD data set was collected on beamline PXII at the Swiss Light Source (SLS; Paul Scherrer Institut, Villigen, Switzerland; Fuchs et al., 2014). Diffraction data were indexed, integrated and scaled with XDS (Kabsch, 2010) and AIMLESS (Evans and Murshudov, 2013) from the CCP4 suite (Winn et al., 2011). Details of data collection are listed in Table 1. The structure of SeMet-labeled sheath protein CD1363 was determined using AUTOSOL (Terwilliger et al., 2009). Eight SeMet sites were identified, indicating the presence of one molecule in the asymmetric unit (Matthews, 1968; Winn et al., 2011). The structure of the native tube protein CD1364 was solved by molecular replacement with PHASER (McCoy et al., 2007), using a truncated model of a monomer of P54332 from Bacillus subtilis (B. subtilis; PDB: 2GUJ; unpublished). Initial model building was performed using AUTOBUILD (Terwilliger et al., 2008) and final models were obtained by iterative cycles of TLS-motion refinement with PHENIX.REFINE (Afonine et al., 2012) and manual rebuilding using COOT (Emsley et al., 2010). All programs were run through the PHENIX suite (Adams et al., 2010). Coordinates of the SeMet-labeled sheath protein CD1363 and tube protein CD1364 structures are available at the PDB (Berman et al., 2000) under accession codes 6GKW and 6GKX and refinement statistics are summarized in Table 1. Secondary structure elements were assigned using the DSSP web server (Kabsch and Sander, 1983; Touw et al., 2015). The DALI web server was used to analyze structural similarities and to calculate Z-scores as well as sequence similarities toward homologous proteins (Holm and Laakso, 2016).
Sequence alignments were generated with the PROMALS3D web server (Pei et al., 2008) and alignment [...] (Baker et al., 2001). Topology diagrams were generated using TopDraw (Bond, 2003) [...] (Webb and Šali, 2016). The models were further treated by two cycles of VTFM-based optimization and MD-based refinement in slow mode (Šali and Blundell, 1993) as implemented in MODELLER (Webb and Šali, 2016). Loops and secondary structure elements that exhibited severe clashes, such as residues F36-G55, Y65-S68, and V121-G125 of the tube protein CD1364 and residues Y308-E326 of the sheath protein CD1363, were excluded from the assembly. Since the 5LI2 template used for modeling of the tube disk comprises a single layer only, generation of two additional disks was achieved via superimposition of both single disks of R-type diffocin sheath and tube onto 5LI2 and their subsequent merging. Afterward, the merged model contained both an R-type diffocin sheath and tube and was superimposed back onto two adjacent R-type diffocin disks of the pyocin-based sheath model such that the three-layered tube structure follows the sheath geometry of this model. Figures were generated as described above.
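The estimate of the number of molecules per asymmetric unit mentioned above rests on the Matthews coefficient, V_M = V_cell / (Z × n × M), where Z is the number of symmetry operators, n the number of molecules per asymmetric unit and M the molecular mass. A minimal sketch is given below; the function names and all numeric inputs in the usage note are hypothetical and do not correspond to the actual cell parameters of the CD1363 or CD1364 crystals (these are listed in Table 1). The solvent fraction uses the common approximation 1 − 1.23/V_M.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume (cubic Angstrom) from cell lengths and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg
                                 + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_sym_ops, n_mol_asu, mass_da):
    """V_M in cubic Angstrom per Dalton."""
    if min(n_sym_ops, n_mol_asu) < 1 or mass_da <= 0:
        raise ValueError("invalid cell content")
    return volume / (n_sym_ops * n_mol_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / vm
```

For example, a hypothetical cell of 312,000 Å^3 with four symmetry operators and one 39 kDa molecule per asymmetric unit gives V_M = 2.0 Å^3/Da, i.e., a solvent fraction of roughly 0.39, which lies in the range usually observed for protein crystals.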

Cloning, Expression, and Purification
Genes of CD1363 and CD1364 from C. difficile strain 630 were successfully PCR-amplified from genomic DNA and subcloned into a pOPINM expression vector (Berrow et al., 2007) via SLIC (Li and Elledge, 2007), using the oligonucleotide primers listed in Supplementary Table S1. Positive clones for both constructs were identified via colony PCR and the correct sequences were confirmed via sequencing. During test expression experiments, the highest yields of soluble protein were obtained with E. coli BL21-CodonPlus-RIL cells in auto-induction medium (Studier, 2014) for both proteins. However, since SeMet-labeling was required to determine the structure of sheath protein CD1363, large-scale expression for this protein was performed in M9 minimal medium (Miller, 1972).
The obtained fusion proteins comprised a combination of an N-terminal 6xHis and an MBP-tag followed by an HRV3C protease cleavage site to remove both tags simultaneously. The proteins were affinity-purified using an MBPTrap HP column (GE Healthcare Life Sciences, Pittsburgh, PA, United States). Subsequently, the N-terminal 6xHis-MBP-tags were cleaved off with HRV3C protease during overnight incubation and uncleaved fusion protein along with free N-terminal tags were removed using a 5 mL HisTrap™ HP column (GE Healthcare Life Sciences). Tag-free CD1363 (~39 kDa) and CD1364 (~16 kDa) were identified in the flow-through, concentrated and then finally purified via gel filtration, in which both proteins eluted as single prominent peaks (Supplementary Figure S1A). Via mass spectrometry analysis using electrospray ionization (ESI)-TOF, the full integrity and correct molecular weight were confirmed for the pooled fractions (CD1363: [...] Figure S1B).

General Organization of the R-Type Diffocin Tube Protein CD1364
The R-type diffocin tube protein CD1364 crystallized in space group P4₃2₁2 containing one molecule in the asymmetric unit. Initial phases were obtained by molecular replacement using a truncated model of a single central β-barrel from the phage-like element P54332 protein XkdM from B. subtilis (PDB: 2GUJ, unpublished), and the structural model was subsequently refined to a resolution of 1.5 Å with Rwork = 19.15% and Rfree = 22.00% (Table 1).
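For readers less familiar with the refinement statistics quoted above, the crystallographic R-factor is R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|, evaluated over the reflections used in refinement (Rwork) or over a small set withheld from refinement (Rfree). The sketch below is illustrative only: it assumes that the calculated amplitudes are already on the same scale as the observed ones, and it uses a deterministic "every Nth reflection" split purely for demonstration, whereas refinement programs normally select the free set at random. Function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |F_obs| - |F_calc| |) / sum(|F_obs|), amplitudes pre-scaled."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den


def split_work_free(reflections, free_every=20):
    """Toy split: every Nth reflection is withheld as the 'free' set."""
    work = [r for i, r in enumerate(reflections) if i % free_every != 0]
    free = [r for i, r in enumerate(reflections) if i % free_every == 0]
    return work, free
```

An Rfree that is only a few percentage points above Rwork, as reported here, is generally taken as a sign that the model is not overfitted.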
CD1364 possesses a compact fold with a central β-barrel flanked by two α-helices on each of its open sides (Figures 2A,B). Whereas α-helices α2 and α4 cap the β-barrel on one side, the α-helices on the other side protrude away from the β-barrel such that α1 covers the β-barrel of a symmetry-related neighbor (Figure 2A). Higher B-factors indicate increased flexibility for α1, which hints at structural changes that may occur during tube assembly.
As was suspected from the molecular replacement process, CD1364 shares highest structural and sequence similarity with XkdM from B. subtilis as well as with gp104 from the bacteriophage ϕ812 tail tube (Nováček et al., 2016) (Supplementary Table S2), and the overall fold for these proteins is mainly preserved (Supplementary Figure S2). Conserved residues are primarily identified within β1, β2, β3, and β6 as well as in α1 and α2 (Supplementary Figure S3). Notably, in XkdM, α1 does not protrude away from the core of the structure as in the crystal structure of CD1364 discussed here, but rather caps the β-barrel of the same molecule, again suggesting a high degree of flexibility for this region in both CD1364 and XkdM (Supplementary Figure S2).
Although the corresponding sequence identity is low, tube structures of other bacteriophage tails, bacterial T6SSs and R-type pyocin can be superimposed with reasonable rmsd onto CD1364 (Figure 2D and Supplementary Table S2). These structures all have a central β-sheet architecture with at least one α-helix (α2 in CD1364) in common. However, the β-sheet does not appear as compact as in CD1364 and can rather be described as two β-sheets tilted by an angle of 90° among the less related homologs such as FIIR2 from R-type pyocin (PDB: 3J9Q, 5W5E; Ge et al., 2015; Zheng et al., 2017), the T6SS hemolysin co-regulated tube protein 1 (HCP1) from Burkholderia pseudomallei (PDB: 3WX6; Lim et al., 2015) or phage tail tube protein gp19 from bacteriophage T4 (PDB: 5W5F; Zheng et al., 2017; Supplementary Figure S2). In addition, several differing loop conformations as well as insertions of secondary structure elements are observed. Particularly, a prominent loop (K33-G58 in CD1364) is variable and folds into a flexible α-helix in CD1364 (α1), XkdM from B. subtilis and HCP1 from the T6SS of B. pseudomallei (Lim et al., 2015), whereas these residues form a protruding β-hairpin in FIIR2 as well as in gp19 from bacteriophage T4 (Zheng et al., 2017; Figure 2D and Supplementary Figure S2). Interestingly, the number of amino acids of this loop is nearly unchanged (25-28) and although several residues from this region contribute to inter-disk contacts in the assembled tubes of bacterial T6SSs, the tail of bacteriophage T4 as well as R-type pyocin (Lim et al., 2015; Wang et al., 2017; Zheng et al., 2017), the corresponding sequence is not conserved (Lim et al., 2015), indicating potentially different principles of tube assembly in these particles.
FIGURE 2 | Structural analysis of Clostridium difficile R-type diffocin tube protein CD1364. The structures of monomeric CD1364 (blue and red) and a symmetry-related molecule (gray) are shown as cartoon representation in (A) and the corresponding topology diagram is shown in (B). Secondary structure motifs are numbered consecutively. SAXS analysis of purified CD1364 is presented in (C) with a fit of the experimental SAXS data (black dots) and a theoretical curve calculated from the crystal structure (green dots) together with the CD1364 crystal structure fitted to the calculated SAXS envelope. In (D), monomeric structures from homologous tube assemblies of R-type pyocin (FIIR2; PDB: 5W5E; Zheng et al., 2017) and the bacteriophage ϕ812 tail (gp104; PDB: 5LI2; Nováček et al., 2016) were superimposed onto CD1364. N- and C-termini are labeled with N and C, respectively.

General Organization of the R-Type Diffocin Sheath Protein CD1363
The R-type diffocin sheath protein CD1363 crystallized in space group C121 with one molecule in the asymmetric unit. Despite extensive attempts, it was not possible to determine the structure via molecular replacement. Therefore, SAD data were collected from a crystal obtained from SeMet-labeled CD1363 protein. These crystals were of the same space group and exhibited identical cell parameters, allowing for straightforward structure solution and refinement. Because crystals of the labeled protein diffracted to higher resolution, the structure was refined against these data (Table 1) and the subsequent discussion refers to SeMet-labeled CD1363.
FIGURE 3 | Structural analysis of Clostridium difficile R-type diffocin sheath protein CD1363. The structure of monomeric CD1363 is shown as cartoon representation in (A) with Domain I and II colored in yellow and orange, respectively. Domain I contains a central three-helix bundle that is stabilized by hydrophobic interactions and by a salt bridge (see insert). The C-terminal part of this domain is organized as a three-stranded β-sheet that hydrophobically interacts with the helix bundle. The general domain organization of CD1363 is shown as topology diagram in (B). SAXS analysis of purified CD1363 is presented in (C) with a fit of the experimental SAXS data (black dots) and a theoretical curve calculated from the crystal structure (orange dots) as well as a fit of the CD1363 crystal structure to the SAXS envelope. In (D), monomeric structures of homologous sheath proteins from assembled R-type pyocin (FIIR2; PDB: 3J9R; Ge et al., 2015) and from monomeric DSY3957 from Desulfitobacterium hafniense (PDB: 3HXL; Aksyuk et al., 2011) were superimposed onto CD1363. N- and C-termini are labeled with N and C, respectively.
In the crystal form discussed here, CD1363 is monomeric and comprises two domains (Figure 3A). The domain arrangement is best described as a "Russian doll" system (Aksyuk et al., 2011) in which domain II (residues G26-S225; domain numbering according to Leiman and Shneider, 2012) is an insertion into domain I (residues G4-R25 and residues L226-I354; Figure 3B and Supplementary Figure S4). At the N-terminus, residues P6-A25 form a long α-helix that closely interacts with α-helices α9 and α10, which flank the central antiparallel three-stranded β-sheet. Domain I is furthermore completed by three smaller α-helices. The larger domain II is composed of a central six-stranded β-sheet and five smaller β-strands flanked by four and two α-helices, respectively (Figure 3A).
Despite relatively low sequence identities, several PDB entries related to sheath protein CD1363 were identified via the DALI web server (Holm and Laakso, 2016), and the corresponding rmsd as well as Z-scores indicate a conserved fold for domain I and II of CD1363 (Supplementary Figure S4 and Supplementary Table S3). However, apart from the R-type pyocin sheath protein FIR2, all related structures contain one or two additional domains (domain III and IV) likewise organized as insertions of one another in a Russian doll arrangement. Generally, the architecture of sheath domain III and IV is less conserved and differs between the contractile systems (Supplementary Figure S4). For instance, the fold of the surface-exposed domain III of T6SSs is highly T6SS-specific and plays a crucial role for the recycling mechanism of the contracted sheath in these systems (Leiman and Shneider, 2012; Kube et al., 2014; Kudryashev et al., 2015; Brackmann et al., 2017; Wang et al., 2017).
Yet, sheath proteins containing only domains I and II still form complete sheath assemblies, as has been demonstrated for FIR2 or CD1363 of the R-type pyocins and diffocins, respectively (Scholl et al., 2011; Gebhart et al., 2012; Ge et al., 2015). This indicates that the R-type bacteriocin sheath proteins represent the minimal requirement for sheath formation (Taylor et al., 2018), which is in contrast to former studies suggesting that at least three domains are necessary (Aksyuk et al., 2011; with differing domain annotation).

Oligomeric State of the Sheath and Tube Proteins CD1363 and CD1364
The ability of the sheath and tube subunits to assemble into oligomeric structures is essential for the formation of fully functional bacteriophage tails, T6SSs or R-type PTLBs. Typically, polymerization of the tube as well as of the uncontracted sheath is initiated upon binding to initiation factors in the baseplate, and assembly of the sheath furthermore requires the tube polymer as a template (Kostyuchenko et al., 2003; Leiman and Shneider, 2012; Kudryashev et al., 2015). However, despite these external factors, preparations of purified tube and sheath proteins have been observed to also spontaneously self-assemble into oligomers in vitro. For instance, tube proteins of bacterial T6SSs pack into hexameric rings in the absence of baseplate components (Ballister et al., 2008; Leiman et al., 2009; Douzi et al., 2014), whereas tail sheath proteins of bacteriophage T4 as well as ϕKZ assemble into irreversibly contracted polysheaths of different lengths without the tube structure or the baseplate components being present (Moody, 1967; Efimov et al., 2002; Kurochkina et al., 2009). This can impede crystallization and structure determination, especially when assessing the unassembled state of these proteins.
In contrast to homologous proteins from other sources, recombinant purified R-type diffocin tube and sheath proteins CD1364 and CD1363 showed no tendency for spontaneous assembly but eluted predominantly in single prominent peaks at volumes corresponding to the molecular weight of the monomeric species in size exclusion chromatography (CD1364: 16 kDa; CD1363: 39 kDa; Supplementary Figure S1A). To further compare their oligomeric states in solution with the crystal structures described above, SAXS experiments have been performed here. As indicated by low χ-values of 0.71 for CD1364 and 1.84 for CD1363, the theoretical curves calculated from the crystal structures fitted well with the experimental SAXS data as well as with the calculated envelopes, confirming their monomeric state in solution (Figures 2C, 3C). Notably, in the tube protein CD1364, α1, which interacts with a neighboring molecule in the crystal structure, sticks out of the calculated envelope. This suggests that the interaction observed in the crystal structure is a consequence of crystallization and that α1 is not swapped in solution but rather packs onto the β-barrel of the same molecule such as observed for tube protein XkdM from B. subtilis (PDB: 2GUJ; unpublished; Supplementary Figure S2). Thus, based on our results, we can conclude that the sheath and tube proteins CD1363 and CD1364 adopt a stable monomeric pre-assembly state after translation and that this state is reflected in the crystal structures obtained here. However, as will be outlined in the following paragraph, significant structural changes are likely to accompany formation of the fully assembled R-type PTLB particle.

Comparison to Homologous Proteins in the Free and Particle-Assembled State
As stated above, both the sheath and the tube proteins assemble into sixfold symmetrical disks that stack on top of each other with a slight right-handed rotation in PTLBs (Heymann et al., 2013; Ge et al., 2015; Nováček et al., 2016; Salih et al., 2018). Comparison of the crystal structures of the monomeric tube and sheath proteins CD1364 and CD1363 to homologous proteins in these assemblies allows for speculation about structural changes that may accompany particle formation.
CD1364 can be superimposed with low rmsd values onto the hexameric assembly of tube protein gp104 in an electron microscopy structure of the native bacteriophage ϕ812 tail (PDB: 5LI2; Nováček et al., 2016) (Supplementary Table S2). The comparison reveals that the architecture of the central β-barrel remains largely unaltered in the assembled tube, whereas the positions of α-helices α2, α3, and α4 change significantly. In the hexameric assembly, α4 establishes interactions with neighboring subunits of the same disk and of the disk below, whereas α2 packs tightly onto the β-barrel of the same molecule (Figure 2D and Supplementary Figure S2). Both α-helices are located at the outer surface of the tube. Comparison with the monomeric structure of XkdM from B. subtilis (PDB: 2GUJ) suggests that the movement of α2 can be described as a rotation such that residues that point toward α4 in the unassembled structure face the β-barrel in the hexameric ring. Furthermore, the region corresponding to α3 of CD1364 forms an extended loop in gp104 and interacts with α2 of the adjacent molecule on the other side of the β-barrel. With respect to the mobile α1, no conclusions can be drawn on the basis of the ϕ812 tail tube structure, as the corresponding loop region is not resolved in PDB entry 5LI2. However, given that the number of residues in the loop comprising α1 is almost identical even among less homologous proteins, further conclusions may be drawn from tube structures of other contractile systems, such as that of FIIR2 from R-type pyocin (PDB: 3J9Q, 5W5E; Ge et al., 2015; Zheng et al., 2017) or gp19 from the bacteriophage T4 tail (PDB: 5W5F; Zheng et al., 2017). As outlined above, the corresponding loop folds into a protruding β-hairpin in these assemblies (Figure 2D and Supplementary Figure S2). This β-hairpin runs underneath an adjacent molecule of the same disk and thus connects it to two molecules of the lower disk in these tube assemblies. Similarly, the smaller loop region that corresponds to α4 in CD1364 forms β-strands that extend the central β-barrel in both FIIR2 and gp19 and mainly interact with the central β-sheet of the adjacent subunit (Figure 4A). These conformational arrangements generate a continuous antiparallel β-sheet of 24 β-strands that spans the complete inner surface of the tube structure (Chang et al., 2017; Wang et al., 2017; Zheng et al., 2017). Together, this indicates that assembly of the R-type diffocin tube structure not only requires movements of the four α-helices, as observed for the ϕ812 tail tube, but might also involve significant refolding, specifically around α1.

FIGURE 4 | Structural differences between the monomeric R-type diffocin tube and sheath proteins CD1364 and CD1363 and the respective subunits in the R-type pyocin particle. (A) Two subunits (FIIR2) from two disks of the pyocin tube (PDB: 5W5E; Zheng et al., 2017) are shown in cartoon representation (blue and yellow). A β-hairpin runs underneath a neighboring molecule of the same disk and establishes interactions with the molecules of the disk below. Each β-barrel is extended by two β-strands that form interactions with the adjacent central β-barrel. (B) Three subunits (FIR2) from two disks of the R-type pyocin sheath assembly (PDB: 3J9Q; Ge et al., 2015) form a four-stranded β-sheet via exchange and refolding of the N- and C-termini (labeled N and C, respectively). For comparison, structures of the monomeric unassembled R-type diffocin tube (CD1364) and sheath (CD1363) proteins are presented, and regions that correspond to the respective R-type pyocin elements are colored red.
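The structural comparisons in this section are quantified as rmsd values between equivalent atoms of two superimposed structures (Supplementary Table S2). As a minimal sketch of what such a value expresses, the function below computes the rmsd for two coordinate lists that are assumed to be already superimposed and paired atom by atom (for example Cα positions). The function name `rmsd` and the pairing are illustrative assumptions; the superposition step itself and the software used for the reported values are not shown.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input unit, e.g. angstrom) between
    two equally long lists of (x, y, z) positions of equivalent atoms.

    The structures are assumed to be superimposed already; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Because the deviation is averaged over all paired atoms, a low overall rmsd is compatible with a conserved β-barrel core combined with a few strongly displaced elements such as the α-helices discussed above, provided the displaced elements are short or excluded from the pairing.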
In the sheath protein, significant refolding upon particle assembly seems to occur especially within the N- and C-termini, as can be concluded from an overlay of monomeric CD1363 with the electron microscopy structure of FIR2 in the assembled R-type pyocin from P. aeruginosa (PDB: 3J9Q, 3J9R; Ge et al., 2015). While the N-terminus of monomeric CD1363 folds into a long α-helix (α1; P6-A24) that packs tightly against two other long α-helices of domain I (α9 and α10) via extensive hydrophobic interactions (Figure 3A magnified insert, Figure 3B, and Supplementary Figure S5), it adopts an extended structure in FIR2 of the P. aeruginosa R-type pyocin particle (Figure 3D and Supplementary Figure S4) to contribute a new strand to a β-sheet with three parallel and one antiparallel strand in the C-terminal region of domain I within a monomer of the next disk. This β-sheet is further expanded by a strand from the C-terminus of the monomer neighboring the chain that contributes the N-terminus from the disk below (Figure 4B). This creates an "interwoven mesh" that not only connects the monomers within one disk but also ties neighboring disks to one another in both the extended and the contracted state. Similar mesh-like organizations have been found in several T6SSs, indicating that these interactions are likewise conserved in assembled R-type PTLBs (Clemens et al., 2015; Kudryashev et al., 2015; Salih et al., 2018). Interestingly, the C-terminus in monomeric CD1363 also forms a β-strand (β14) that interacts hydrophobically with residues of α9 and α10 as well as with the same residues of the neighboring β-strand (β13 in CD1363) as observed in the pyocin particle, albeit within the same chain (Figure 3A magnified insert and Figure 3B). This indicates that the length of the C-terminus is well adapted to enable sheath formation by arm exchange between chains lying next to each other. At the same time, the hydrophobic character of residues in the refolded N-terminus is conserved (Supplementary Figure S5), and the finding that the N-terminus adopts a similar α-helix in the prophage sheath protein LIN1278 from Listeria innocua (PDB: 3LML; Aksyuk et al., 2011) suggests that the formation of the contractile apparatus of R-type PTLBs is enabled by low refolding barriers of the N- and C-terminal sequences in domain I of the sheath proteins. This is also corroborated by the crystal structure of the prophage sheath protein DSY3957 from Desulfitobacterium hafniense (PDB: 3HXL; Aksyuk et al., 2011), where the N-terminus displays a structure and interactions similar to those in the assembled FIR2 sheath structure (Figure 3D), even though the crystal structure does not reveal disks as in the PTLBs. Notably, sheath formation involves only residues from domain I, whereas domain II is positioned at the surface of the sheath, where it does not contact other subunits or the tube (Figure 4B), which is also expected for domains III and IV in the sheath proteins gp18, LIN1278, and DSY3957 (Supplementary Figure S4 and Supplementary Table S3).

FIGURE 5 | Preliminary model of the extended sheath and tube assembly of R-type diffocin, generated on the basis of the sheath from R-type pyocin (PDB: 3J9Q; Ge et al., 2015) and the tube of the phage ϕ812 tail (PDB: 5LI2; Nováček et al., 2016). (A) Top view onto three disks of CD1363 sheath and CD1364 tube layers, with the interacting subunits colored in yellow, red, and green, respectively. (B) Side view of three disks of the CD1363 sheath assembly. Because of helical disk stacking, the surface ridges wind around the particle. (C) Three disks of tube subunits, presented using the same color code as in (A,B). Of note, since the 5LI2 template for tube modeling contained a single tube disk only, two additional disks for the R-type diffocin tube were generated via superimpositions on pyocin as described in the "Materials and Methods" section. White stars indicate positions of the attachment points for the loop comprising α-helix 1 in CD1364, which was excluded from the model due to clashes but might establish important inter-disk contacts to fill the gaps observed in the tube assembly. (D) Putative contact points between sheath and tube subunits of the R-type diffocin particle with the respective electrostatic potential. For CD1363, the attachment helix α10 and, for CD1364, the central β-barrel are shown as cartoon.
In bacterial T6SSs as well as bacteriophage tails, baseplate gp48/54/gpU and gp25 subunits are required to initiate tube and sheath polymerization, respectively (Nakayama et al., 2000; Kostyuchenko et al., 2003; Leiman and Shneider, 2012; Brunet et al., 2015), and similar proteins are found in the gene clusters of Afp, PVC, or R-type pyocin (Sarris et al., 2014; Kube and Wendler, 2015). Using fold recognition via the Phyre2 web server⁷ (Kelley et al., 2015), the ORF CD1370 of C. difficile strain 630 was found to encode a protein with high homology to the sheath initiator protein gp25 of T4 bacteriophage (PDB: 5IW9; Gebhart et al., 2012; Taylor et al., 2016), but no tube initiator could be identified via this approach. In the recent structure of the T4 bacteriophage baseplate (PDB: 5IV5, 5IV7), gp25 adopts the same sixfold symmetry as observed in the assembled sheath of R-type pyocins, and initiation of sheath polymerization likely involves a β-strand exchange mechanism similar to that described above (Taylor et al., 2016, 2018). Thus, conformational changes in the first layer of pre-assembled sheath proteins are probably induced upon binding to a homologous sheath initiator protein that might be present in every baseplate complex.

Preliminary Model of the Contractile Apparatus of R-Type Diffocin in the Extended State
Based on the high homology of CD1364 to the assembled tube protein in the tail of bacteriophage ϕ812 (PDB: 5LI2; Nováček et al., 2016) as well as of CD1363 to FIR2 in the assembled sheath of the extended R-type pyocin structure (PDB: 3J9Q; Ge et al., 2015), we constructed a preliminary model of the extended R-type diffocin particle using MODELLER (Webb and Šali, 2016). As is common for phage tail-like contractile systems, the sheath and tube disks of the templates used are organized with identical helical symmetry (Taylor et al., 2016, 2018; Wang et al., 2017), and the resulting model of R-type diffocin reproduces the pitch of disks observed in R-type pyocin (translation of approximately 38 Å, rotation of 18°; Ge et al., 2015; Figure 5A). However, since several residues were not resolved in the tube template (PDB: 5LI2), the corresponding loops, including the potentially important region around α1, could not be modeled, causing substantial gaps between the tube disks (Figure 5C). In the R-type pyocin tube, these gaps are primarily occupied by the protruding β-hairpin discussed earlier, which underlines the potential importance of structural rearrangements in R-type diffocin building blocks. In the sheath assembly, in contrast, only a few residues had to be excluded, and the model reflects very well the extensive mesh-like interactions as well as the prominent surface ridges known from other contractile systems in the extended sheath (Figure 5B) (Clemens et al., 2015; Ge et al., 2015; Kudryashev et al., 2015; Wang et al., 2017; Salih et al., 2018).
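The helical arrangement described above (sixfold disks, each shifted by about 38 Å and rotated by about 18° relative to the disk below) can be written as a simple coordinate transformation. The sketch below is an illustration only and not the modeling procedure used here: the function name `place_subunit`, the choice of z as helix axis, and the convention that a counterclockwise rotation combined with a rise along +z represents the right-handed stacking are assumptions made for this example.

```python
import math

def place_subunit(point, disk, position, rise=38.0, twist_deg=18.0, fold=6):
    """Map a point of the reference subunit (disk 0, position 0) onto the
    subunit at ring position `position` (0..fold-1) of disk number `disk`.

    The helix axis is taken as z. Each disk is shifted by `rise` (angstrom)
    and rotated by `twist_deg` relative to the disk below; within a disk the
    subunits are related by 360/fold degrees.
    """
    x, y, z = point
    angle = math.radians(disk * twist_deg + position * 360.0 / fold)
    # Counterclockwise rotation about +z combined with a rise along +z
    # (assumed convention for a right-handed helix).
    xr = x * math.cos(angle) - y * math.sin(angle)
    yr = x * math.sin(angle) + y * math.cos(angle)
    return (xr, yr, z + disk * rise)
```

For example, `place_subunit((10.0, 0.0, 0.0), disk=5, position=0)` rotates the point by 5 × 18° = 90° and lifts it by 5 × 38 Å = 190 Å. Applying the same transformation to the sheath and to the tube subunits keeps the two layers in register, which is what identical helical symmetry implies.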
In the R-type pyocin, the main contacts between sheath and tube are established via electrostatic interactions of a positively charged region in attachment helix α10 of sheath domain I with a complementarily charged patch in the central β-barrel of the respective tube protein. We examined the electrostatic potential of the modeled R-type diffocin sheath and tube for similarly charged patches, and our analysis indeed revealed equivalent regions in the sheath and tube but with opposite charges (Figure 5D). Consistent with this, the sequence alignment of homologous sheath proteins demonstrates that the positions of basic residues in α10 are not conserved among these proteins (Supplementary Figure S5). However, our finding of opposite charges with respect to R-type pyocin may also be a consequence of our preliminary model not accurately reflecting all aspects of the diffocin particle, as is also suggested by clashes seen between attachment helix α10 and tube residues. This reiterates that further rearrangements seem to be required for R-type diffocin particle formation (Figure 5A). The revelation of detailed interactions will thus have to await cryo-EM studies, for which promising preliminary work has recently been published (Hegarty et al., 2016).

⁷ http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index

CONCLUSION
Here, we have determined the crystal structures and SAXS envelopes of the R-type diffocin tube and sheath proteins CD1364 and CD1363 in their pre-assembled monomeric state. Our data indicate that several conformational changes are necessary to enable formation of the diffocin particle. Together with the finding that recombinant monomers of both proteins were stable enough for crystallization studies, this suggests that the refolding energy barriers in diffocins are higher than in similar assemblies with spontaneously self-assembling building blocks. This could make the R-type diffocins an ideal system to study the particle formation process in vitro. Further, the crystal structures of CD1364 and CD1363 together with cryo-EM structures of related particles enabled us to construct a preliminary model of the contractile apparatus of R-type diffocins. Because significant differences to the available templates exist, however, several structural details will have to be revealed in future studies addressing the complete diffocin particle.

AUTHOR CONTRIBUTIONS
WB designed and directed the project. NS and E-MG performed the experiments. NS, JP, and E-MG analyzed the data. NS, JP, and WB generated figures. JP wrote the manuscript with discussion and comments from all authors.

FUNDING
This work has been supported by the Federal State of Lower Saxony, Niedersächsisches Vorab (VWZN2889/3215/3266).
Beutling at the Department Chemical Biology at the Helmholtz Center for Infection Research for performing mass spectrometry experiments. We are grateful to the staff members of the Dortmund Protein Facility at the Max Planck Institute of Molecular Physiology (Dortmund, Germany) for their assistance in SLIC and for test expression experiments. We also thank the beamline scientists at the European Synchrotron Radiation Facility (ESRF, Grenoble, France) for assistance in using BioSAXS beamline BM29. We furthermore thank the Helmholtz-Zentrum Berlin (HZB, Berlin, Germany) for the allocation of synchrotron radiation beamtime at beamline 14.1 at BESSY II and the beamline staff for their assistance. We also acknowledge the Paul Scherrer Institute (Villigen, Switzerland) for synchrotron radiation beamtime at beamline PXII of the Swiss Light Source (SLS) and thank the beamline scientists for their support.