Genetic Characterization of the Exceptionally High Heat Resistance of the Non-toxic Surrogate Clostridium sporogenes PA 3679

Clostridium sporogenes PA 3679 is a non-toxic endospore former that is widely used as a surrogate for Clostridium botulinum by the food processing industry to validate thermal processing strategies. PA 3679 produces spores of exceptionally high heat resistance without botulinum neurotoxins, permitting the use of PA 3679 in inoculated pack studies while ensuring the safety of food processing facilities. To identify genes associated with this heat resistance, the genomes of C. sporogenes PA 3679 isolates were compared to several other C. sporogenes strains. The most significant difference was the acquisition of a second spoVA operon, spoVA2, which is responsible for transport of dipicolinic acid into the spore core during sporulation. Interestingly, spoVA2 was also found in some C. botulinum species which phylogenetically cluster with PA 3679. Most other C. sporogenes strains examined both lack the spoVA2 locus and are phylogenetically distant within the group I Clostridium, adding to the understanding that C. sporogenes are dispersed C. botulinum strains which lack toxin genes. C. sporogenes strains are thus a very eclectic group, and few strains possess the characteristic heat resistance of PA 3679.


INTRODUCTION
Clostridium botulinum, Clostridium baratii, and Clostridium butyricum species produce various types of botulinum neurotoxin (BoNT), the causative agent of the neuroparalytic botulism poisoning (Dodds and Hauschild, 1989; Collins and East, 1998; Rossetto et al., 2014). These species cluster into six groups defined by their metabolic and physiological traits (Collins and East, 1998; Rossetto et al., 2014). Group I (proteolytic) C. botulinum strains are particularly important to the food industry, as they produce endospores of high heat resistance that may survive inadequate thermal processing strategies and result in food spoilage and foodborne botulism (Townsend et al., 1938; Gross et al., 1946; Ingram and Robinson, 1951; Stumbo et al., 1975; Rossetto et al., 2014). Clostridium sporogenes is closely related to C. botulinum group I strains, but differs in two characteristic respects: it lacks the BoNT toxin genes and it produces spores with even higher heat resistance (Nakamura et al., 1977; Bull et al., 2009; Brown et al., 2012; Diao et al., 2014).
C. sporogenes PA 3679 (PA 3679) is widely used in testing commercial thermal food processing procedures for their ability to prevent foodborne botulism in shelf-stable products (McClung, 1937; Brown et al., 2012; Rossetto et al., 2014). PA 3679 is a non-toxic surrogate whose spores are more heat resistant than those of group I C. botulinum, providing a safe alternative test organism for confirming that neurotoxic spores have been eliminated during the thermal process without introducing the target pathogen to the food processing facilities (Brown et al., 2012; Diao et al., 2014). PA 3679 was originally isolated from spoiled canned corn in 1927 by E.J. Cameron of the National Canner's Association (Townsend et al., 1938; Brown et al., 2012). However, the properties of PA 3679 that give it such high heat resistance have not been well explored at a genetic level.
Dipicolinic acid (DPA) is synthesized in the mother cell by shunting the product of DapA, dihydrodipicolinic acid (DHDPA), from the process of lysine biosynthesis (Daniel and Errington, 1993; Orsburn et al., 2010). The dicistronic spoVF operon codes for dipicolinic acid synthase subunits A and B (dpaA/spoVFA and dpaB/spoVFB), which convert DHDPA to DPA. Although the spoVF operon has been identified in many Bacillus species (Daniel and Errington, 1993; Onyenwoke et al., 2004) and Peptoclostridium difficile (Donnelly et al., 2016), this operon is not found in all endospore formers. Other members of the class Clostridia, including Clostridium perfringens and Thermoanaerobacter spp., lack spoVF (Onyenwoke et al., 2004). In C. perfringens, an alternate dipicolinic acid synthase, EtfA, has been demonstrated to produce DPA in vitro and in vivo, with knockout mutants lacking this metabolite (Orsburn et al., 2010). Following synthesis, DPA is transported to the core by three to seven products coded by the spoVA operon (Tovar-Rojo et al., 2002; Paredes-Sabja et al., 2008b; Li et al., 2012; Perez-Valdespino et al., 2014). Of these, products coded by spoVAC, spoVAD and spoVAE seem particularly important and are especially well conserved in both Bacillus and Clostridium species (Onyenwoke et al., 2004; Paredes-Sabja et al., 2008b; Donnelly et al., 2016).
In addition to the genes for DPA synthesis and transport, several other genes are associated with core dehydration, though their roles are less clear. Spore maturation proteins A and B (products of spmA and spmB) both play a significant role in reducing core water content, though the mechanism is not understood (Paredes-Sabja et al., 2008a; Orsburn et al., 2009). The dac genes (dacA, dacB, dacC, and dacF in B. subtilis) code for D-alanyl-D-alanine carboxypeptidases which regulate peptidoglycan crosslinking. Both dacB and dacF genes are under the control of sporulation specific sigma factors, and their products regulate spore cortex formation (Popham et al., 1999). Knockout mutants lacking either gene show diminished heat resistance, presumably due to reduced cortex integrity under high heat conditions (Popham et al., 1999; Paredes-Sabja et al., 2008a; Orsburn et al., 2009).
In a previous study, we sequenced eight C. sporogenes samples labeled "PA 3679" obtained from a variety of sources and which displayed differential heat resistance (Schill et al., 2016). From our analyses, we distinguished two distinct clades of C. sporogenes isolates. Clade I isolates had significantly lowered heat resistance, with two (1990 and 2007) featuring near-identical genotypes. Clade I isolates did not survive heat treatment at 105 °C for 5 min and displayed D97 °C and D100 °C values of 2.97 and 2.28 min, respectively (the decimal reduction time, D, is equal to the time required under a given condition to destroy a population of microorganisms by one logarithm). In contrast, all isolates from clade II exhibited near-identical genotypes and heat resistance profiles of the original PA 3679 isolate by E.J. Cameron, with an estimated D121 °C of 1.28 min (Diao et al., 2014), and survived thermal processing at temperatures from 117 °C to 121 °C. Given the two clades of C. sporogenes with differing heat resistance, we were presented with an opportunity to elucidate the specific genomic differences conferring the exceptional heat tolerance of PA 3679 spores.
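The definition of D given above corresponds to a log-linear survival model, log10(N0/N) = t/D. The following is a minimal Python sketch of that relationship, assuming first-order inactivation holds over the whole treatment (real survivor curves can show shoulders or tails). The function names are illustrative and are not part of the study; the D values are those quoted in the text.

```python
def log_reductions(minutes: float, d_value: float) -> float:
    """Number of decimal (log10) reductions after `minutes` of heating at a
    temperature where the decimal reduction time is `d_value` minutes."""
    if d_value <= 0:
        raise ValueError("D value must be positive")
    return minutes / d_value


def surviving_fraction(minutes: float, d_value: float) -> float:
    """Expected fraction of the initial spore population that survives,
    under the assumed log-linear model."""
    return 10 ** (-log_reductions(minutes, d_value))


# D values quoted in the text, in minutes
D100_CLADE_I = 2.28    # clade I isolates at 100 °C
D121_CLADE_II = 1.28   # PA 3679 (clade II) at 121 °C

if __name__ == "__main__":
    for label, d in (("clade I, 100 °C", D100_CLADE_I),
                     ("clade II, 121 °C", D121_CLADE_II)):
        print(label, round(log_reductions(5.0, d), 2), "log reductions in 5 min")
```

Note that the two D values refer to different temperatures, so they are not directly comparable without a temperature-dependence (z-value) model, which the text does not provide.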

Genomes Used in Study
Eight genomes used in this study (Table 1) were from our previous study (Schill et al., 2016). The annotations of C.

Pan-Genomic Analysis
For pan-genomic comparison, the eight strains previously described were clustered using Roary 3.6.2 (Page et al., 2015) using a 70% identity threshold. Roary's core gene alignment was trimmed using BMGE 1-1 (Criscuolo and Gribaldo, 2010) to 2,511,737 sites across 2,751 core genes. PhyML 3.1 (Guindon et al., 2010) was used with GTR + I + F + G (4 categories) to generate a maximum likelihood (ML) tree for clustering. Roary_plots.py (https://github.com/sangerpathogens/Roary/tree/master/contrib/roary_plots) was used to generate the orthologous cluster map. Orthologs unique to the five clade II isolates were examined and those related to sporulation were investigated. Roary was used with 23 C. sporogenes (including the eight in this study) and 15 group I C. botulinum genomes, to generate a concatenated nucleotide alignment of 389 core genes (216,294 sites) using a 70% identity threshold. A maximum likelihood tree was generated as above, with the addition of 100 bootstraps in PhyML.

Analysis of spoVA and Conserved Genes
Orthologous groups identified from the pan-genomic analysis were searched using known genes related to spore heat resistance. Additional homology searches using BLAST (Altschul et al., 1990) were run to find any missed homologs. Both blastp and tblastn searches were conducted using known reference genes (see Supplementary Table 1). Orthologous groups for each gene were compared and aligned using Geneious 9.1.5 (Kearse et al., 2012). Conserved domains in the aligned clusters were identified with InterProScan 5 (Jones et al., 2014).

Analysis of the spoVA Operon
Using the 38 Clostridium above, plus C. tetani E88, OrthoFinder 0.2.8 (Emms and Kelly, 2015) identified 6,168 orthologous groups, 840 of which were unique orthologs present in all 39 strains. All spoVA genes identified by the pan-genomic analysis were located in the ortholog groups produced by OrthoFinder. Neighboring genes and operons were also identified in all 38 group I Clostridium species examined. spoVA2 operons and neighboring genes for several representative species were aligned and compared using EasyFig 2.2.2 (Sullivan et al., 2011). Conserved domains were identified using InterProScan 5, and predicted protein structures were calculated using the RaptorX webserver (Källberg et al., 2012).

Pan-Genome Analysis
The Roary pan genome generated using the eight clade I and clade II C. sporogenes isolates contained a total of 4,899 distinct orthologous groups, 2,751 of which represented the core genes, each with a unique ortholog in all eight isolates (Figure 1). These core orthologs included many genes previously identified as related to spore heat resistance. The heat resistance core orthologs are further characterized in Supplementary Figure 1, and described later in detail. There were 751 ortholog groups that were present in the five clade II isolates, but absent in clade I. Of those, 278 ortholog groups code for hypothetical proteins. Seven of the 751 were sporulation specific, of which four ortholog groups were related to germination. The remaining three ortholog groups constituted a second set of spoVA genes not found in the clade I isolates, henceforth dubbed the spoVA2 locus.
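The core and clade-specific counts above come down to set operations on a presence/absence table. The sketch below illustrates that logic in Python under stated assumptions: the table is represented as a plain dictionary from ortholog group to the set of isolates carrying it, and the group and isolate names are invented for illustration. It does not parse Roary's actual output files and is not the authors' pipeline.

```python
from typing import Dict, Iterable, List, Set


def core_groups(presence: Dict[str, Set[str]],
                isolates: Iterable[str]) -> List[str]:
    """Ortholog groups that are present in every listed isolate."""
    required = set(isolates)
    return sorted(g for g, members in presence.items() if required <= members)


def clade_specific_groups(presence: Dict[str, Set[str]],
                          clade: Iterable[str],
                          others: Iterable[str]) -> List[str]:
    """Groups present in all isolates of `clade` and absent from all `others`."""
    clade_set, other_set = set(clade), set(others)
    return sorted(
        g for g, members in presence.items()
        if clade_set <= members and not (members & other_set)
    )


# Hypothetical toy data: two isolates per clade, four ortholog groups.
CLADE_I = ["I_a", "I_b"]
CLADE_II = ["II_a", "II_b"]
PRESENCE = {
    "group_core": {"I_a", "I_b", "II_a", "II_b"},
    "group_clade_II_only": {"II_a", "II_b"},
    "group_partial": {"II_a"},
    "group_clade_I_only": {"I_a", "I_b"},
}

if __name__ == "__main__":
    print(core_groups(PRESENCE, CLADE_I + CLADE_II))
    print(clade_specific_groups(PRESENCE, CLADE_II, CLADE_I))
```

A group carried by only some clade II isolates (`group_partial` here) is deliberately excluded, matching the "present in the five clade II isolates, but absent in clade I" criterion.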

Examination of the spoVA2 operon
As mentioned in the pan-genomic analysis, a second locus of spoVA genes was found in a single pentacistronic operon, spoVA2 (Figure 2A). InterProScan searches of the spoVA2 genes revealed conserved domains from two types of spoVA operons. To explore this further, a collection of the C. sporogenes genomes in GenBank (at the time of writing) plus fifteen commonly studied group I C. botulinum strains and one C. tetani strain (Table 2) were clustered using OrthoFinder. For all five clade II isolates and five additional C. botulinum species, this spoVA2 operon was conserved and clustered separately from the traditional spoVA operon, which was found in all 39 species. All 39 species showed similar spoVA loci as clustered in OrthoFinder. The 38 group I Clostridia showed a conserved genomic neighborhood around the site of the spoVA2 operon inclusion (Figure 2B). The spoVA2 operon and its neighboring regions were perfectly conserved in all clade II (PA 3679) isolates, so only one representative sequence (Camp) is depicted in the figure. The clade I isolates similarly had only a single nucleotide difference across the whole region, which did not affect gene coding, so 2007 was chosen as the representative in Figure 2B.
FIGURE 1 | Pangenomic cluster matrix of Clostridium sporogenes isolates. A total of 4,899 orthologous clusters were identified; orthologs present or absent are indicated in blue or white, respectively. The maximum likelihood tree on the left was generated from an alignment of 2,751 core orthologs using a GTR + I + F + G model.
The spoVA2 operon itself was well conserved in all strains in which it was found. In addition to SpoVAC, SpoVAD, and SpoVAEb, the operon encodes two other proteins: a hypothetical protein and a membrane protein (Figure 2A). Neither protein has domain similarity or sequence homology to SpoVAA or SpoVAB. Both feature a domain of unknown function (DUF), DUF1657 (IPR012452; PF07870), and the membrane protein contains an additional uncharacterized YcaP domain (PTHR34582) composed of three transmembrane domains and DUF421. Predicted 3D structures of the DUF1657 hypothetical protein, the YcaP/DUF1657 membrane protein and SpoVAC, SpoVAD, and SpoVAE are depicted in Supplementary Figure 2, with high similarity to previously reported examples. Also of note is the downstream neighbor of spoVA2, the xanthine dehydrogenase (xdh) operon. The xdh operon and a gene encoding isochorismate hydrolase are present in all spoVA2-containing strains, as well as in the closely related C. botulinum A strain ATCC 3502 (which lacks spoVA2; Figure 2B). However, in several strains, the xdh genes are partial or pseudogenes.

Characterization of SASP
A total of eight different SASP-encoding ortholog groups were found, each group containing an ortholog from every one of the eight investigated genomes. The traditional α/β-type SASP, with both a gpr cleavage domain (IPR018126; Prosite PS00304) and a DNA-binding domain (IPR018126; Prosite PS00684), was encoded by three of these orthologous groups. Translations of the genes in those groups, named ssp1, ssp2, and ssp3, also displayed the characteristic α/β-type SASP Pfam domain (IPR001448; PF00269).
A fourth SASP-encoding ortholog group showed high sequence conservation to a previously described ssp4 in C. perfringens (Li and McClane, 2008; Li et al., 2009). The product coded by these ssp4 orthologs had the characteristic α/β-type SASP Pfam domain and the gpr cleavage domain, but lacked the conserved DNA-binding domain. The fifth SASP-coding ortholog group contained orthologs labeled ssp5. Again, translations displayed the conserved SASP Pfam domain; however, they lacked both the gpr cleavage domain and the DNA-binding domain typical of α/β-type SASP.
The remaining three SASP-encoding ortholog groups exhibited the conserved domains and sequence similarity to minor types of SASP not associated with high heat resistance in previous studies: the H-type SASP and the tlp type SASP (Cabrera-Hernandez et al., 1999; Wetzel and Fischer, 2015). All of the SASP-encoding ortholog groups in this study appear to be monocistronic, and the amino acid alignments of each SASP are available in Supplementary Figure 1.

Characterization of Conserved Sporulation Genes
A number of additional sporulation-related orthologous groups were found with representative orthologs from all eight isolates. Six D-alanyl-D-alanine carboxypeptidase encoding orthologous groups were found, and their respective orthologous genes were dubbed dac1 through dac6. One orthologous group, dac4, encoded proteins with high homology to DacF (blastp e-values below 1e-105) and contained the two expected conserved domains: Peptidase S11, N-terminal domain (IPR001967; Pfam PF00768) and Penicillin Binding Protein 5, C-terminal domain (PBP5_C) (IPR012907; Pfam PF07943). Two orthologous groups, dac2 and dac5, encoded proteins similar to DacB (with blastp e-values below 1e-49), which characteristically have the same two conserved domains as DacF. The three remaining dac ortholog groups (dac1, dac3, and dac6) showed poor similarity to dacB or dacF and are likely D-alanyl-D-alanine carboxypeptidases unrelated to sporulation. The amino acid alignments of all Dac proteins are available in Supplementary Figure 1.
The spoVF operon, coding for DPA synthase subunits A and B, was not found in any of the eight isolates. Instead, three orthologous groups, containing orthologs dubbed etfA_1, etfA_2, and etfA_3, encoded products with high protein sequence similarity (blastp e-values below 1e-130) to EtfA from C. perfringens, an alternate DPA synthase. All three EtfA homologs were present in all eight genomes. Only EtfA_1 contained the correct array of conserved domains associated with C. perfringens EtfA. EtfA_3 lacked a Prosite conserved motif (IPR018206; PS00696) and EtfA_2 contained an extra conserved domain: an N-terminal 4Fe-4S ferredoxin-type iron-sulfur binding domain. The orthologous groups for 4-hydroxy-tetrahydrodipicolinate synthase (DapA) and 4-hydroxy-tetrahydrodipicolinate reductase (DapB) were both present in all eight isolates, as was a second DapB orthologous group (encoded by dapB_2). Orthologous groups encoding spore maturation protein A (SpmA) and B (SpmB) were also found and contained an ortholog in all eight isolates. Other orthologs typically associated with germination were also identified: germination protease (gpr), putative germination protease (yyaC) and spore photoproduct lyase (splB) orthologs were found in all eight genomes. Supplementary Table 1 summarizes the orthologous genes, includes those which were found in C. botulinum A strain ATCC 3502, and lists locus tags and further information for all the genes in this study.

Phylogenomic Comparison
The phylogeny in Figure 3 (upper) depicts a branching of C. sporogenes and C. botulinum strains into two mixed groups. The majority of C. sporogenes strains are grouped together in the right group, though clade II (PA 3679) isolates group on the left. The majority of C. botulinum strains are in the left group, though several are present in the right group. All strains possessing the spoVA2 locus are in the left group. The xanthine dehydrogenase operon and isochorismate hydrolase are present in all members of the left group, though degenerated in some strains, and absent in all strains in the right group. The clade II isolates form an extremely well conserved group consistent with coming from the same original spore crop. The clade I isolates also group as expected, showing similarity to several C. sporogenes strains and one C. botulinum strain, Prevot 594. The pairwise genetic distances comparison in Figure 3 (lower) shows consistent results with the core gene phylogenetic tree; however, the species in the right branch of the phylogeny are split into two more distinct groups than the phylogenetic tree alone would suggest.
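The Discussion attributes these pairwise distances to Mash. As a rough illustration of what such a distance measures, the sketch below computes the Mash distance D = -(1/k) ln(2j / (1 + j)) from a Jaccard index j of shared k-mers. It makes several simplifying assumptions: the Jaccard index is computed exactly on toy sequences rather than estimated from MinHash sketches as Mash does, the k-mer size is arbitrary, and the sequences are invented. It is not the analysis performed in this study.

```python
import math
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct k-mers of a sequence (no reverse-complement handling)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Exact Jaccard index of two k-mer sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j: float, k: int) -> float:
    """Mash distance for Jaccard index `j` and k-mer size `k`.
    Returns 1.0 when no k-mers are shared and 0.0 for identical sets."""
    if j <= 0.0:
        return 1.0
    if j >= 1.0:
        return 0.0
    return -math.log(2.0 * j / (1.0 + j)) / k


if __name__ == "__main__":
    K = 3  # toy value chosen for short example sequences
    seq_a = "ACGTACGTTGCA"   # hypothetical sequences
    seq_b = "ACGTACGATGCA"
    j = jaccard(kmers(seq_a, K), kmers(seq_b, K))
    print("Jaccard:", round(j, 3), "distance:", round(mash_distance(j, K), 3))
```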

DISCUSSION
Ensuring the quality and safety of packaged foods is an ongoing process that, when everything works as intended, ideally goes unnoticed by the consumer. While current methods for food preservation have an excellent track record, bacteria evolve over time and there is a non-negligible risk that these methods may no longer be adequate in the near future (the rise of antibiotic resistance, for example, is a sharp reminder that things can change quickly in the microbial world). Here, we were offered the opportunity to examine the genetic differences behind the low and high heat resistance of clade I and II isolates of C. sporogenes, knowledge that could be applied to the detection and prevention of heat resistance in pathogenic species of Clostridium and other common foodborne pathogens. Importantly, we discovered that the genetic locus that most likely conveys a meaningful improvement in PA 3679 spore heat resistance is part of the bacterial mobilome, and that this heat resistance-conferring island could be, and likely has been, transferred from/to a number of pathogenic species. Our study also serves as a reminder that not all C. sporogenes isolates identified as PA 3679 strains actually possess the capability for high heat resistance, and those that do not cannot fulfill the role of non-toxic thermal surrogates. Processes vetted using these deficient strains may not perform up to the desired specifications, with potentially dire implications for the safety of the foods packaged by these processes.

Impact of the spoVA2 Locus
Undoubtedly, the most significant difference found between the low and high heat resistance isolates was the presence of a second set of spoVA genes in the clade II (PA 3679) group. This is not the first reported incidence of multiple spoVA operons in an endospore former. At least one group IV Clostridium species (Brunt et al., 2016), several species of Geobacillus and Bacillus cereus, as well as several of the more heat resistant Bacilli have multiple spoVA loci. Adding copies of spoVA carried on a mobile Tn1546 transposon (spoVAmob, Figure 2A) creates an additive resistance effect, greatly increasing the concentration of DPA in the Bacillus subtilis spore core (a D112.5 °C increase from 0.2 min to 25.6 min with three copies; Berendsen et al., 2016). Our results are compatible with these previous observations, and suggest a compelling hypothesis about spore heat resistance. As spore formation is temporally limited, the SpoVA apparatus encounters a flow rate challenge. Increasing the flow rate with extra pumps can either move more DPA in the given time, or overcome losses due to diffusion (or both), resulting in a much higher concentration of DPA in the spore core. This effect should scale until the maximal amount of DPA has been added, a point that has apparently not been reached in Bacillus. Our findings suggest the same for Clostridia, with the implication that further multiplication of this operon in the genetic repertoire of an endospore former may give it the ability to survive current canning processes. The origin of the PA 3679 spoVA2 operon, however, is not entirely clear. This operon was not contained in the same Tn1546 mobile element as in Bacillus species, but individual genes within the spoVA2 locus (hypothetical protein, membrane protein, spoVAC, spoVAD, and spoVAEb) showed a higher sequence similarity to foreign loci than to the native spoVA locus in PA 3679 (Figure 2A).
Blastn searches of the contiguous spoVA2 operon gave high sequence homology (>80% identity, >99% query coverage) to the expected C. botulinum species from this study, plus C. argentinense CDC 2741, C. neonatale, C. saccharobutylicum DSM 13864, and C. saccharoperbutylacetonicum N1-4. Given this information, horizontal acquisition seems more likely than a paralogous duplication event. This idea is further supported by the two additional genes (coding for the DUF1657 domain-containing and YcaP domain-containing proteins), which show homology to spoVA2mob from B. subtilis yet are absent in the native spoVA locus. The YcaP domain-containing protein is of particular interest, as Berendsen et al. (2016) knocked out the orthologous protein in spoVAmob, which severely diminished spore heat resistance in B. subtilis. While the roles of SpoVAD (Li et al., 2012) and SpoVAC (Velásquez et al., 2014) in DPA transport are partially established, a full understanding of the roles of all spoVA2 proteins needs further study. The presence of additional spoVA operons in Clostridium species has not been previously explored, though it is a phenomenon that occurs not just in PA 3679, but also in several closely related C. botulinum species (Figure 2B) and at least one C. argentinense strain (Brunt et al., 2016). This might seem paradoxical, as C. botulinum is generally considered to have lower spore heat resistance than C. sporogenes. However, (1) there is a large amount of variance and inconsistency in heat resistance data, owing to the variety of environmental factors involved, (2) the heat resistance of the C. botulinum species possessing the spoVA2 locus has not been widely studied, and (3) this study only examined a comparison of C. sporogenes strains. Perhaps C. botulinum species containing the spoVA2 locus also feature increased heat resistance.
This would present a considerable challenge for designing thermal processing strategies that effectively eliminate this dangerous pathogen in a food product. Future studies will be required to explore other spore heat resistance factors that may differ between C. botulinum and C. sporogenes.

High Conservation of SASPs
The α/β-type SASP have been recognized primarily for their function in maintaining spore DNA integrity when exposed to a variety of factors (Setlow, 2014b). Many studies of α/β-type SASP knockouts have demonstrated a significant loss of heat resistance when a functional DNA protection mechanism is lacking (Setlow, 2007, 2014b). However, the protection provided by SASP is not additive, as only one or two of the paralogous SASP-encoding genes are expressed in large amounts and confer maximal heat resistance (Setlow, 2014b). The presence of eight SASP-encoding orthologous groups in our isolates clouded the search for differential heat resistance; thus, it was decided to focus on faulty or absent SASP. The eight isolates in our study share three orthologs similar to those in other Clostridia: ssp1, ssp2, and ssp3 (Raju et al., 2006; Galperin et al., 2012). These encode proteins containing all the major α/β-type SASP conserved domains, and show very little difference in protein sequence between clade I and clade II isolates, suggesting the presence of functional SASP-DNA protection in all isolates (Supplementary Figure 1). Additionally, the minor SASP (ssp5, H-type, and tlp) are also well conserved, though not directly implicated in heat resistance.
The one SASP that demonstrated a unique feature was that encoded by the ssp4 orthologous group. Previous research on the orthologous SASP in C. perfringens suggested that the presence of an aspartate (D), or other negatively charged or large amino acid at position 36 (Li and McClane, 2008) correlates with higher heat resistance when compared to other residues (Li et al., 2009). The clade I and II strains in this study displayed either a threonine (T) or isoleucine (I) residue at this position, respectively; and it is worth noting that C. botulinum A strain ATCC 3502 features an Ssp4 with an I at that position, yet still produces spores with a lower heat resistance than PA 3679. The lack of a negative charge at this position also does not appear to impede spore heat resistance for clade II (PA 3679) isolates. While a potential increase in spore heat resistance for PA 3679 with an I36D mutation is worth investigating, it would appear that the SASP-DNA protection mechanism provided by Ssp1-3 is already sufficient given its current robustness. As Setlow (2014b) has suggested, this dynamic hits a saturation point, beyond which more or better SASPs are no longer the limiting factor for higher spore heat resistance and the potential effect, if any, of T36I or T36D substitutions in Ssp4 for improving heat resistance of clade I isolates is unclear.

Conserved Sporulation Genes
Possessing six D-alanyl-D-alanine carboxypeptidases appears typical for many Clostridium and Bacillus species. All eight isolates in this study have potential orthologs for DacB and DacF, the two carboxypeptidases which have a demonstrated effect on spore heat resistance. Dac2 through Dac5 contained all the expected conserved domains. Based on sequence homology, Dac4 is most likely the DacF ortholog, and the DacB homolog is likely either Dac2 or Dac5 (both showed similar e-values). Ultimately, the determination of the role of each Dac will require future experiments to determine which one is regulated by σF (DacF, expressed in the forespore) and which one by σE (DacB, expressed in the mother cell). From this study, none of the potential orthologs appeared to be significantly different between the clade I and clade II isolates (Supplementary Figure 1), thus they are likely not responsible for the differential heat resistance.
DPA synthesis in these C. sporogenes isolates is not controlled via a spoVF mechanism; instead, DPA is potentially synthesized via an electron transport flavoprotein α-subunit, as seen in C. perfringens (Orsburn et al., 2010). The EtfA_3 orthologous group lacked a C-terminal Prosite conserved domain, and the EtfA_2 orthologous group had an extraneous N-terminal FerB domain, making them both unlikely candidates. The EtfA_1 product, which contained all the expected domains and had the highest sequence homology, is the most likely ortholog. Future experiments will need to replicate the experiments from Orsburn et al. (2010) in order to prove conclusively that the product of etfA_1 is capable of DPA synthesis in vitro and in vivo. Electron transport flavoproteins are common, but their use for DPA synthesis is fairly unusual. Regardless, the three potential orthologous groups show a high degree of sequence conservation in all eight isolates, thus none of them are likely to account for the heat resistance difference we see between clade I and clade II isolates.
The SpmA and SpmB orthologs were present and highly conserved, generating little ambiguity about their identities. All expected domains were present, and the minimal variation between clade I and clade II sequences makes it unlikely that they contribute to the differential heat resistance. All additional genes examined showed a very high sequence similarity to unique orthologous groups containing representatives from all eight isolates. In addition, their involvement in spore heat resistance is mostly tangential, again making them unlikely factors in the observed change (for more information, see Supplementary Table 1 and Supplementary Figure 1).

Interrelatedness of Group I Clostridium Species and Origins of the spoVA2 Locus
The phylogenetic tree produced in this study (Figure 3) was consistent with previous studies (Kenri et al., 2014; Weigand et al., 2015; Williamson et al., 2016). However, PA 3679 strains did not group with other C. sporogenes strains, instead clustering deep in the left branch. The Mash pairwise distances corroborated the phylogenetic tree, demonstrating not only the position of PA 3679 strains with a group of C. botulinum, but heterogeneity among C. sporogenes strains in general. Considering how very different PA 3679 strains appear from the other C. sporogenes strains, the assertion that C. sporogenes strains in general are suitable non-toxic surrogates is questionable. Most of the C. sporogenes strains examined lack the spoVA2 locus, and given their phylogenetic relatedness to the clade I isolates from this study, it is likely that they possess similar low heat resistance profiles. C. sporogenes CDC24533 is the exception, possessing spoVA2, and has the potential to produce spores that are resistant to high temperatures similar to PA 3679, warranting further investigation.
The high degree of conservation observed between the spoVA2 operons present in species from the left side of the tree (Figure 3) argues in favor of a common origin, but it is unclear if this distribution results from multiple independent acquisition events from similar sources or rather from a single acquisition in their shared common ancestor followed by independent losses. While heat resistance confers an obvious advantage to species exposed to extreme temperatures like those involved in canning processes, those conditions are rarely met in the environment and one can envision that the added benefit may be rather minimal in normal circumstances, and thus commonly lost during pruning processes. In any case, the presence of the spoVA2 operon in botulinum neurotoxin-containing Clostridium species strongly argues in favor of maintaining stringent canning processes that meet or exceed spore destruction targets of heat-resistant C. sporogenes isolates.

CONCLUSIONS
The high heat resistance of Clostridium sporogenes PA 3679 is unique among observed C. sporogenes strains. While this resistance is most likely influenced by the presence of an extra spoVA2 operon, other factors including differential expression, altered function of canonical sporulation proteins and/or additional novel sporulation proteins could be involved. Further studies will be required to circumscribe the full set of factors that confer this thermal endurance on PA 3679 and to better define the mechanisms that are involved in its endospore survival. Furthermore, because the potential for higher heat resistance also exists in both harmless and pathogenic species, strategies to detect and reduce thermal stability in foodborne organisms, as well as to maintain safe standards of food processing, will need to be revisited.

AUTHOR CONTRIBUTIONS
RB, JP, KS, and YW designed the study and drafted the manuscript. RB conducted the work and RB and JP conducted the analysis.

FUNDING
This work was supported by a C.V. Starr fellowship to RB and by funds from the Illinois Institute of Technology to JP. YW was supported by an appointment to the Research Participation Program at the Center for Food Safety and Applied Nutrition administered by the Oak Ridge Institute for Science and Education via an interagency agreement between the U.S. Department of Energy and the FDA.