Candidatus Nitrosocaldus cavascurensis, an Ammonia Oxidizing, Extremely Thermophilic Archaeon with a Highly Mobile Genome

Ammonia oxidizing archaea (AOA) of the phylum Thaumarchaeota are widespread in moderate environments but their occurrence and activity has also been demonstrated in hot springs. Here we present the first enrichment of a thermophilic representative with a sequenced genome, which facilitates the search for adaptive strategies and for traits that shape the evolution of Thaumarchaeota. Candidatus Nitrosocaldus cavascurensis has been enriched from a hot spring in Ischia, Italy. It grows optimally at 68°C under chemolithoautotrophic conditions on ammonia or urea converting ammonia stoichiometrically into nitrite with a generation time of approximately 23 h. Phylogenetic analyses based on ribosomal proteins place the organism as a sister group to all known mesophilic AOA. The 1.58 Mb genome of Ca. N. cavascurensis harbors an amoAXCB gene cluster encoding ammonia monooxygenase and genes for a 3-hydroxypropionate/4-hydroxybutyrate pathway for autotrophic carbon fixation, but also genes that indicate potential alternative energy metabolisms. Although a bona fide gene for nitrite reductase is missing, the organism is sensitive to NO-scavenging, underlining the potential importance of this compound for AOA metabolism. Ca. N. cavascurensis is distinct from all other AOA in its gene repertoire for replication, cell division and repair. Its genome has an impressive array of mobile genetic elements and other recently acquired gene sets, including conjugative systems, a provirus, transposons and cell appendages. Some of these elements indicate recent exchange with the environment, whereas others seem to have been domesticated and might convey crucial metabolic traits.


INTRODUCTION
Ammonia oxidizing archaea (AOA) now collectively classified as Nitrososphaeria within the phylum Thaumarchaeota (Brochier-Armanet et al., 2008;Spang et al., 2010;Stieglmeier et al., 2014;Kerou et al., 2016a) represent the sole archaeal group that is globally distributed in oxic environments efficiently competing with aerobic bacteria. Because of their large numbers in the ocean plankton, in marine sediments, in lakes and in soils, AOA are considered one of the most abundant groups of prokaryotes on this planet Prosser and Nicol, 2008;Erguder et al., 2009;Schleper and Nicol, 2010;Pester et al., 2011;Hatzenpichler, 2012;Stahl and de la Torre, 2012;Offre et al., 2013). All currently cultivated AOA strains gain energy exclusively through the oxidation of ammonia to nitrite, i.e., they perform the first step in nitrification. They grow chemolithoautotrophically from inorganic carbon supply. Some strains show growth only in the presence of small organic acids (Tourna et al., 2011;Qin et al., 2014) which seem to catalyze degradation of reactive oxygen species (ROS) from the medium (Kim et al., 2016). Different from their bacterial ammonia oxidizing counterparts, AOA are often adapted to rather low levels of ammonia for growth, which seems to favor their activities in oligotrophic environments, such as the marine pelagic ocean and marine sediments, but also in acidic environments, where the concentration of ammonia decreases in favor of ammonium . However, AOA occur also in large numbers in terrestrial environments, including fertilized soils and some waste water treatment plants, and several studies indicate alternative energy metabolisms (Mussmann et al., 2011;Alves et al., 2013). Although ammonia oxidation in archaea has not been biochemically resolved, the presence of genes for an ammonia monooxygenase in all AOA with remote similarity to methane and ammonia monooxygenases of bacteria implies involvement of this complex in the process (Konneke et al., 2005;Treusch et al., 2005;Nicol and Schleper, 2006). Hydroxylamine has been suggested to be the first product of ammonia oxidation (Vajrala et al., 2013), but further conversion to nitrite is performed in an unknown process, as no homolog of the bacterial hydroxylamine dehydrogenase has been found in the genomes of AOA. However, nitric oxide (NO) has been suggested to be involved in the process, because NO production and re-consumption have been observed (Martens-Habbena et al., 2015;Kozlowski et al., 2016b) and the NO scavenger PTIO was shown to inhibit AOA at very low concentrations (Yan et al., 2012;Shen et al., 2013;Martens-Habbena et al., 2015).
Ammonia oxidation by AOA has also been documented to occur in hot springs. With the help of molecular marker genes, diverse ammonia oxidizing thaumarchaea were found in hot terrestrial and marine environments (Reigstad et al., 2008;Zhang et al., 2008;Jiang et al., 2010;Cole et al., 2013). Furthermore, in situ nitrification activities up to 84 • C were measured using isotopic techniques in AOAcontaining habitats in Iceland (Reigstad et al., 2008), in Yellowstone National park (Dodsworth et al., 2011), in a Japanese geothermal water stream (Nishizawa et al., 2016), and in enrichment cultures from Tengchong Geothermal Field in China (Li et al., 2015). In addition, an enrichment of an ammonia oxidizing thaumarchaeon, Ca. Nitrosocaldus yellowstonensis, from a hot spring in Yellowstone National park grew at temperatures of up to 74 • C confirming that archaeal ammonia oxidation, different from bacterial ammonia oxidation, indeed occurs at high temperatures (de la Torre et al., 2008).
A remarkable diversity of archaea with different metabolisms has been described from terrestrial and marine hot springs and hyperthermophilic organisms tend to be found at the base of almost all lineages of archaea. Therefore, it has often been proposed that the ancestor of all archaea and the ancestors of mesophilic lineages of archaea were hyperthermophilic or at least thermophiles (Barns et al., 1996;Lopez-Garcia et al., 2004;Gribaldo and Brochier-Armanet, 2006;Brochier-Armanet et al., 2012;Eme et al., 2013). This has been supported by sequence analyses which suggested a parallel adaptation from hot to moderate temperatures in several lineages of archaea (Groussin and Gouy, 2011;Williams et al., 2017). Indeed, the thermophilic strain Ca. Nitrosocaldus yellowstonensis emerged in 16S rRNA phylogeny as a sister group of all known mesophilic AOA (de la Torre et al., 2008) indicating that AOA also evolved in hot environments.
However, until now, no genome of an obligate thermophilic AOA was available to analyze in depth the phylogenetic placement of thermophilic AOA and to investigate adaptive features of these ammonia oxidizers that have a pivotal position in understanding the evolution of Thaumarchaeota.
In this work we present the physiology and first genome of an extremely thermophilic AOA of the Nitrosocaldus lineage that we cultivated from a hot spring in Southern Italy. We analyze its phylogeny based on ribosomal proteins and reconstruct metabolic and genomic features and adaptations.

Sampling and Enrichment
About 500 mL of mud were sampled from a terrestrial hot spring on the Italian island Ischia at "Terme di Cavascura" and stored at 4 • C until it arrived at the laboratory in Vienna (after about one week, October 2013). A temperature of 77 • C and pH 7-8 were measured in situ using a portable thermometer (HANNA HI935005) and pH stripes. Initial enrichments (20 mL final volume) were set up in 120 mL serum flasks (two times autoclaved with MilliQ water) containing 14 mL of autoclaved freshwater medium (FWM) (de la Torre et al., 2008;Tourna et al., 2011) amended with non-chelated trace element solution (MTE) (Konneke et al., 2005), vitamin solution, FeNaEDTA (20 µL each), 2 mM bicarbonate, 1 mM ammonium chloride and 2 mL of 0.2 µm filtrated hot spring water. Serum flasks were inoculated with 4 mL (20%) of sampled mud, sealed with gray rubber stoppers (6x boiled and autoclaved) and incubated aerobically at 78 • C while rotating at 100 rpm.

Enrichment Strategies
The temperature was changed after one week to 70 • C and medium amendment with filtrated hot spring water was discontinued after four transfers as there was no effect observed. Growth was monitored by microscopy, nitrite production and ammonia consumption. Cultures were transferred at a nitrite concentration of ∼700 µM and in case ammonia was depleted before the desired nitrite concentration was reached, cultures were fed with 1 mM ammonium chloride. For a preliminary growth temperature test, cultures were incubated at 55, 60, 65, 70, 72, 74, 76, and 80 • C, with cultures at 70 • C showing the highest nitrite production rate. The antibiotics Sulfamethazine, Rifampicin and Novobiocin (100 µg mL −1 ) were used alone and in combination with pyruvate, glyoxylate and oxalacetate (0.1 mM), but with little effect on enrichment. The use of the mentioned organic acids or N,N -Dimethylthiourea (DMTU, all 0.1 mM) as ROS scavengers increased the abundance of heterotrophic bacteria and even reduced nitrite production. Filtration of enrichments with 1.2 µm filters had no or even detrimental effect on AOA abundance when done repeatedly and 0.45 µm filtration sterilized the cultures.
As nitrite production rate increased over time, inoculum size was decreased from 20 to 10% and finally to 5%. Omitting vitamin solution from the medium led to an increase in nitrite production rate and AOA abundance. Most crucial for increasing thaumarchaeal abundance was keeping an exact timing on passaging cultures in late exponential phase (after 4 days) and setting up multiple cultures from which the best (based on microscopic observations) was used for further propagation. Based on cell counts maximal enrichments of up to 92% were achieved (see Results for more details).

Cultivation
Cultures are routinely grown at 68 • C using 5% inoculum in 20 mL FWM amended with MTE and FeNaEDTA solutions, 1 mM NH 4 Cl, 2 mM NaHCO 3 and are transferred every 4 days once they reach about 700 µM NO 2 − . 1 mM NH 4 Cl was substituted by 0.5 mM urea to confirm urease activity, while the rest of the medium components and conditions remained unchanged.
Inhibition Test with 2-phenyl-4, 4, 5, 5,-tetramethylimidazoline-1-oxyl 3-oxide (PTIO) Cultures were grown in 20 mL FWM with 5% inoculum under standard conditions and different amounts of an aqueous PTIO stock solution were added when early to mid-exponential growth phase was reached (ca. 63 h after inoculation; 20, 100, and 400 µM final PTIO concentration). Ammonia oxidation ceased immediately at all applied PTIO concentrations, but cultures with 20 µM PTIO were able to resume nitrite production 72 h after the addition of PTIO.

Amplification of amoA Gene
Specific primers were designed for the amplification of the amoA gene from Candidatus N. cavascurensis: ThAOA-277F: CCA TAY GAC TTC ATG ATA GTC and ThAOA-632R: GCA GCC CAT CKR TAN GTC CA (Alves and Schleper, unpublished). They were based on an alignment of amoA gene sequences from available genomes, all PCRamplified sequences related to Nitrosocaldus identified in GenBank through BLAST searches, and several reference sequences from the orders Nitrososphaerales, Nitrosopumilales, and Nitrosotaleales. ThAOA-277F targets a gene region conserved only among Ca. Nitrosocaldus spp., and primer ThAOA-632R was a modification of CrenamoA616r(48x)  to match the amoA of Nitrosocaldus spp.; ThAOA-632R contains additional degenerated bases that account for the most common substitutions among all amoA genes, since only two sequences containing this region were available for Ca. Nitrosocaldus spp.
PCR conditions were 95 • C for 10 min as initialization, followed by 35 cycles of 30 s denaturing at 95 • C, 45 s primer annealing at 55 • C, and elongation at 72 • C for 45 s, finishing with 10 min final elongation at 72 • C.

Genome Assembly
DNA was prepared from 480 mL of culture using standard procedures and sequenced using a PacBio Sequel sequencer at the VBCF (Vienna BioCenter Core Facilities GmbH) with a SMRT Cell 1Mv2 and Sequel Sequencing Kit 2.1. Insert size was 10 kb. Around 500000 reads were obtained with an average size of ∼5 kb (N50: 6866 nt, maximal read length: 84504 nt).
The obtained PacBio reads were assembled using the CANU program (version 1.4, parameters "genomeSize = 20 m, corMhapSensitivity = normal, corOutCoverage = 1000, errorRate = 0.013") (Koren et al., 2017), and then "polished" with the arrow program from the SMRT analysis software (Pacific Biosciences, United States). The Ca. N. cavascurensis genome consisted of two contiguous, overlapping contigs of 1580543 kb and 15533 kb, respectively. The resulting assembly was compared to a previous version of the genome bin we had obtained by using IDBA-UD and Newbler for assembly, and differential coverage binning on ∼150 nt reads from five runs of IonTorrent sequencing (use of GC% and reads coverage from IonTorrent PGM reads) (Albertsen et al., 2013;Nurk et al., 2013). The new assembly confirmed the previous version as all nine contigs from the previous aligned with MUMMER to the largest of the two contigs, except for a ca. 21 kb region we had not manually selected using differential coverage binning because of a variable read coverage between runs. Interestingly, this region belongs to an integrated conjugative element (ICE1, see results) that might perhaps be excised occasionally. A repeated region between both extremities of the longest contig, and the 2 nd small contig obtained was found, and analyzed using the nucmer program from the MUMMER package (see Supplementary Figure 1) (Kurtz et al., 2004). This region was merged using nucmer results, and the longest contig obtained was "circularized" using the information of the nucmer program, as sequence information from the 2nd contig was nearly identical (>99.5%) to that of the extremities of the long contig. It coincided with a repeat-rich adhesin (Supplementary Figure 1). The origin of replication was predicted using the Ori-Finder 2 webserver (Luo et al., 2014). By analogy with Nitrosopumilus maritimus, it was placed after the last annotated genomic object, before the ORB repeats and the cdc gene. The annotated genome sequence has been deposited to the European Nucleotide Archive (ENA) with the study accession number: PRJEB24312.

Genome Annotation
The first annotation of the Ca. N. cavascurensis genome was obtained by the automatic annotation pipeline MicroScope (Vallenet et al., 2009(Vallenet et al., , 2017Medigue et al., 2017). Annotations from Nitrososphaera viennensis EN76 were carried over when aligned sequences were 35% identical (or 30% in syntenic regions) and covered 70% of the sequence. All subsequent manual annotations were held on the platform. Putative transporters were classified by screening against the Transporter Classification Database (Saier et al., 2014). To assist the annotation process, several sets of proteins of interest from Kerou et al. (2016b) (EPS, metabolism), Offre et al. (2014) (transporters), Makarova et al. (2016) (archaeal pili and flagella) and ribosomal proteins were screened specifically within our set of genomes using HMMER and HMM protein profiles obtained from databases (TIGRFAM release 15, PFAM release 28, SUPERFAMILY version 1.75), or built from arCOG families (Gough et al., 2001;Haft et al., 2003;Finn et al., 2014;Makarova et al., 2015). In the former case, the trusted cutoff "-cut_tc" was used if defined in the profiles. In the latter case, for each family of interest, sequences were extracted from the arCOG2014 database, aligned using MAFFT ("linsi" algorithm) (Katoh and Standley, 2014), and automatically trimmed at both extremities of conserved sequence blocks using a home-made script relying on blocks built by BMGE v1.12 (BLOSUM 40) (Criscuolo and Gribaldo, 2010). HMMER was then used to screen genomes for different sets of specialized families (Eddy, 2011).
In order to visualize and compare the genetic organization of sets of genes (AMO, Urease -Ure, type IV pili), we built HMM profiles (as above except alignments were manually curated for AMO and Ure), based on the corresponding arCOG families and integrated the respective sets of profiles in the MacSyFinder framework that enables the detection of sets of co-localized genes in the genome (parameters inter_gene_max_space = 10, min_nb_mandatory_genes = 1 for AMO and Ure, inter_gene_max_space = 10, min_nb_mandatory_genes = 4 for type IV pili) (Abby et al., 2014). MacSyFinder was then run with the models and profiles on the genome dataset, and the resulting JSON files were used for visualization in the MacSyView web-browser application (Abby et al., 2014). The SVG files generated by MacSyView from the predicted operons were downloaded to create figures.

Reference Phylogeny
Ribosomal proteins were identified in genomes using a set of 73 HMM profiles built for 73 arCOG families of ribosomal proteins (see Genome Annotation). When multiple hits were obtained in a genome for a ribosomal protein, the best was selected. The sequences were extracted and each family was aligned using the MAFFT program ("linsi" algorithm) (Katoh and Standley, 2014). Alignments were curated with BMGE (version 1.12, default parameters and BLOSUM 45) (Criscuolo and Gribaldo, 2010) and then checked manually. In several cases, sequences identified as part of a given ribosomal protein family did not align well and were excluded from the analysis. In the end, 59 families with more than 35 genomes represented were selected to build a phylogeny. The corresponding family alignments were concatenated, and a tree was built using the IQ-Tree program version 1.5.5 (Nguyen et al., 2015) (model test, partitioned analysis with the best model selected per gene, 1000 UF-Boot and 1000 aLRT replicates). A tree drawing was obtained with the Scriptree program (Chevenet et al., 2010), and then modified with Inkscape.

Protein Families Reconstruction and Phyletic Patterns Analysis
Homologous protein families were built for the set of genomes selected. A blast similarity search was performed with all sequences against themselves, and the results used as input for the Silix and Hifix programs to cluster the sets of similar sequences into families (Miele et al., 2011(Miele et al., , 2012. For sequences to be clustered in a same Silix family, they had to share at least 30% of identity and the blast alignment cover at least 70% of the two sequences lengths. The distribution of protein families of interest was analyzed across the genome dataset.

Phylogenetic Analyses
Sequences from Silix families of interest were extracted in fasta files, and a similarity search was performed against a large database of sequences which consisted of 5750 bacterial and archaeal genomes (NCBI Refseq, last accessed in November 2016), and the genomes from our genome dataset. Sequences with a hit having an e-value below 10 −10 were extracted, and the dataset was then de-replicated to remove identical sequences. An alignment was then obtained for the family using MAFFT ("linsi" algorithm), filtered with BMGE (BLOSUM 30) and a phylogenetic tree was reconstructed by maximum-likelihood using the IQ-Tree program (version 1.5.5, best evolutionary model selected, 1000 UF-Boot and 1000 aLRT replicates). A tree drawing was obtained with the Scriptree program (Chevenet et al., 2010), and modified with Inkscape.

Identification and Annotation of Integrated MGE
Integrated mobile genetic elements (MGE) were identified as described previously (Kazlauskas et al., 2017). Briefly, the integrated MGE were identified by searching the Ca. N. cavascurensis genome for the presence of the hallmark genes specific to mobile genetic elements, such as integrases, casposases, VirB4 ATPases, large subunit of the terminase, major capsid proteins of achaeal viruses, etc. In the case of a positive hit, the corresponding genomic neighborhoods were analyzed further. IS elements were searched for using ISfinder (Siguier et al., 2006). The precise borders of integration were defined based on the presence of direct repeats corresponding to attachment sites. The repeats were searched for using Unipro UGENE (Okonechnikov et al., 2012). Genes of integrated MGE were annotated based on PSI-BLAST searches (Altschul et al., 1997) against the nonredundant protein database at NCBI and HHpred searches against CDD, Pfam, SCOPe and PDB databases (Soding, 2005).

Enrichment of Ca. Nitrosocaldus cavascurensis
An ammonia oxidizing enrichment culture was obtained from thermal water outflow of the Terme di Cavascura on the Italian island Ischia. After repeated transfers into artificial freshwater medium supplemented with 1 mM ammonium and 2 mM bicarbonate, highly enriched cultures were obtained that exhibited almost stoichiometric conversion of ammonia into nitrite within 4-5 days ( Figure 1A). A single phylotype of AOA but no AOB or Nitrospira/commamox was identified in the enrichment via amplification and sequencing of amoA and 16S rRNA gene fragments (data not shown) and the presence of a single AOA phylotype was confirmed by metagenomic sequencing (see below). Its closest relative in the 16S rRNA database was fosmid 45H12 isolated from a gold mine metagenome with 99% identity and the next closest cultivated relative was Ca. Nitrosocaldus yellowstonensis with 96% identity (Nunoura et al., 2005;de la Torre et al., 2008). The AOA has been propagated for 4 years within a stable enrichment culture and its relative enrichment usually ranged between 75 and 90% based on qPCR or cell counts respectively. Different strategies were tested in order to obtain a better enrichment (heat treatment, antibiotics, dilution to extinction, filtration) which resulted in a maximal enrichment of 92% (based on cell counts) but not in a pure culture. In high throughput sequencing analyses using general prokaryotic primers to amplify the V2/V4 region of the 16S rRNA gene we identified phylotypes of the Deinococcus/Thermus group comprising up to 97% of the bacterial contaminating sequences (with 96% identity to sequences from the Thermus genus) and to minor extent sequences related to the Chloroflexi and Armatimonadetes (not shown). Although Aigarchaeota had been detected in earlier enrichments, the AOA was the only archaeon discovered in the high-level enrichments used in this study. Ca. N. cavascurensis does not encode a catalase, and in the light of current knowledge on the auxiliary role of ROS scavengers in the growth of AOA (Kim et al., 2016), it is possible that a ROS detoxification mechanism is provided by the remaining bacteria in the enrichment culture, although supplementation with α-keto acids did not have a positive effect as it had on other AOA enrichments (see Materials and Methods).
The shortest generation time of the thermophilic AOA was approximately 23 h (±1.1), as measured by nitrite production rates and in qPCR, which is comparable to the generation time of Ca. N. yellowstonensis (de la Torre et al., 2008). It grew in a temperature range from about 55 to 74 • C with an apparent (but relatively wide) temperature optimum around 68 • C (Figure 2).
In fluorescence in situ hybridizations (FISH) with archaeaand bacteria-specific probes, all coccoid-shaped cells of less than 1 µm in diameter were assigned to the AOA, while all shorter and longer rod-shaped morphotypes were clearly assigned to bacteria (Figures 3A,B). Scanning electron microscopy revealed the typical spherical and irregular shape of cocci, as seen, e.g., for the ammonia oxidizing archaeon Nitrososphaera viennensis (Tourna et al., 2011) or hyperthermophilic and halophilic coccoid archaea strains with a diameter of around 0.7 µm (Figures 3C,D). Based on its relationship with Ca. Nitrosocaldus yellowstonensis (de la Torre et al., 2008), and the location it was sampled from (Terme di cavascura, Italy) we named this organism provisionally Candidatus Nitrosocaldus cavascurensis.

Ca. Nitrosocaldus cavascurensis Represents a Deeply Branching Lineage of AOA
We gathered all complete or nearly complete 27 genome sequences available for Thaumarchaeota. These included 23 genomes of closely related cultivated or uncultivated AOA all harboring amo genes, and four genomes obtained from metagenomes that do not have genes for ammonia oxidation. Among the latter were two assembled genomes from moderate environments (Fn1 and NESA-775) and two from hot environments (BS4 and DS1) (Beam et al., 2014;Lin et al., 2015). In addition, two Aigarchaeota genomes, and a selection of representative genomes of Crenarchaeota (11) were included to serve as outgroups (Petitjean et al., 2014;Raymann et al., 2015;Adam et al., 2017;Williams et al., 2017). This resulted in a 40 genomes dataset (see Materials and Methods). A maximum likelihood analysis based on 59 concatenated ribosomal proteins (in one copy in at least 35 of the 40 genomes) resulted in a highly supported tree: UF-boot and aLRT supports were maximal or

Genome and Energy Metabolism of Ca. Nitrosocaldus cavascurensis
The genome of Ca. N. cavascurensis contains 1.58 Mbp, with 1748 predicted coding sequences, one 16S/23S rRNA operon and 29 tRNA genes. It has a G + C content of 41.6%.
Similar to the genomes of most marine strains (Nitrosopumilales) and to Ca. Nitrosocaldus yellowstonensis, but different from those of the terrestrial organisms (Nitrososphaerales) it encodes all putative subunits of the ammonia monooxygenase in a single gene cluster of the order amoA, amoX, amoC, amoB (Figure 4) indicating that this might represent an ancestral gene order. Different from several other AOA it has a single copy of amoC. The genome contains a cluster of genes for the degradation of urea, including the urease subunits and two urea transporters (Figures 4, 5) with a similar structure to the urease locus of Nitrososphaerales. Accordingly, growth on urea, albeit slower than on ammonia, could be demonstrated ( Figure 1B). Urease loci were so far found in all Nitrososphaerales genomes, and in some Nitrosopumilales (Hallam et al., 2006;Park et al., 2012;Spang et al., 2012;Bayer et al., 2016). No urease cluster could be found in non-AOA Thaumarchaeota, nor in Aigarchaeota. In Crenarchaeota, only one Sulfolobales genome harbored a urease (Metallosphaera sedula DSM 5348). A specific protein family of a putative chaperonin (Hsp60 homolog, arCOG01257) was found to be conserved within the urease loci of Ca. N. cavascurensis and the Nitrososphaerales ("soil-group" of AOA) (Figure 4) as also observed in some bacteria (e.g., in Haemophilus influenzae). Unlike the conserved "thermosome" Hsp60 (also part of arCOG01257), this particular Hsp60 homolog was not found in any Nitrosopumilales genome. Given the conservation of the urease loci between Ca. Nitrosocaldus and Nitrosophaerales, it Genes are represented on an uninterrupted cluster when less than five genes separated them, otherwise, genes separated by more than five gene positions are displayed on another cluster. For AMO clusters, the number of different amoC homologs found in the genome outside of the main AMO cluster is indicated in the amoC box. Putative urea transporters are indicated by gray boxes outlined in blue for SSS-type urea transporter, and pink for UT urea transporter. The chaperonin Hsp60 homolog found in the urease cluster and discussed in the main text is indicated with a red box underlining its conserved position over multiple loci. Note that Nitrososphaera evergladensis and Nitrososphaera viennensis both have an extra set of three urease genes (UreABC) not displayed here.
is possible that these genes have a common origin, and were acquired en bloc from bacteria by the ancestor of AOA. It is likely that the Hsp60 homolog is involved in stabilization of the urease, as demonstrated in the deltaproteobacterium Helicobacter pylori (Evans et al., 1992).
Intriguingly, a gene encoding a nitrite reductase that is present in all genomes of AOA (except the sponge symbiont, Ca. Cenarchaeum symbiosum, Bartossek et al., 2010) and which is highly transcribed during ammonia oxidation in Nitrososphaera viennensis (Kerou et al., 2016b) and in metatranscriptomic datasets (Shi et al., 2011) is missing from the genome. Nitrite reductases have been postulated to be involved in ammonia oxidation by providing nitric oxide (NO) for the oxidation step to nitrite (Kozlowski et al., 2016a). We therefore tested, if the organism was affected by the NOscavenger PTIO which inhibited at low concentrations all AOA tested so far (Martens-Habbena et al., 2015;Kozlowski et al., 2016a). Ammonia oxidation of Ca. N. cavascurensis was fully inhibited at concentrations as low as 20 µM, similar to those affecting Nitrososphaera viennensis and Nitrosopumilus maritimus (Figure 6). This indicates that NO is also an important intermediate in the ammonia oxidation of this thermophile and it might be produced by an unknown nitrite reductase or perhaps by an unidentified hydroxylamine dehydrogenase (HAO), as recently shown for the HAO of the ammonia oxidizing bacterium Nitrosomonas europaea (Caranto and Lancaster, 2017). Alternatively, NO might be supplied through the activity of other organisms in the enrichment culture. Indeed we found a nirK gene in the genome of the Deinococcus species of our enrichment culture. The sensitivity to PTIO reinforces the previously raised hypothesis that NO represents an important intermediate or cofactor in ammonia oxidation in archaea Simon and Klotz, 2013;Kozlowski et al., 2016b). However, the full understanding of ammonia oxidation in AOA and the role of NO awaits further biochemical investigations.
Pathways of central carbon metabolism and -fixation of Ca. N. cavascurensis were found to be generally similar to those of other AOA, underlining the strikingly high similarity and conservation of metabolism within the Nitrososphaeria (Figure 5) (Kerou et al., 2016b).
Handelsmanbacteria (49% identity). In a phylogenetic tree of the protein family, it clusters together with the second group of crenarchaeal genes ("Cren Type-2, " Supplementary Figure 2, see Materials and Methods) which lack essential catalytic residues and whose function remains unknown (Ramos-Vera et al., 2011). Könneke et al. (2014) suggested an independent emergence of the cycle in Thaumarchaeota and autotrophic Crenarchaeota based on the unrelatedness of their respective enzymes (Könneke et al., 2014). But the existence of the second gene in this deep-branching lineage rather indicates that both genes could have been present in a common ancestor of Crenarchaeota and Thaumarchaeota and that their CO 2 fixation pathways could have a common origin. No homolog of the experimentally characterized malonic semialdehyde reductase from Nitrosopumilus maritimus (Otte et al., 2015), or any other iron-containing alcohol dehydrogenase protein family member, was found in the Ca. N. cavascurensis genome. Since this enzyme (Nmar_1110) is so far the only characterized malonic semialdehyde reductase active in mesophilic conditions (Otte et al., 2015), we searched for homologs of the corresponding (but unrelated) thermophilic enzymes from Crenarchaea (Metallosphaera sedula) (Kockelkorn and Fuchs, 2009) and Chloroflexi (Hügler et al., 2002). The best hits of Msed_1993, a member of the 3-hydroxyacyl-CoA dehydrogenase family, were the 3-hydroxybutyryl-CoA dehydrogenase homologs of the thaumarchaeal genomes, as observed already by Könneke et al. (2014). The reduction of malonic semialdehyde in the 3-hydroxypropionate bi-cycle of Chloroflexus aurantiacus is carried out by a bifunctional protein (Caur_2614) with an N-terminal short-chain alcohol dehydrogenase domain and a C-terminal aldehyde dehydrogenase domain (Hügler et al., 2002). No homolog of this bifunctional enzyme was identified in the genome of Ca. N. cavascurensis, but a number of short-chain alcohol dehydrogenases with low identities (∼30%) to Caur_2614 exist (Supplementary Table 1) that could serve as candidates in the absence of a canonical malonic semialdehyde reductase. Whether temperature adaptation leads to non-orthologous gene replacement we observe in mesophilic AOA remains to be seen upon experimental characterisation of the thermophilic thaumarchaeal enzyme.
A full oxidative TCA cycle is present in Ca. N. cavascurensis, including a malic enzyme and a pyruvate/phosphate dikinase connecting the cycle to gluconeogenesis. Additionally, and like most mesophilic AOA, Ca. N. cavascurensis encodes a class III polyhydroxyalkanoate synthase (phaEC) allowing for the production of polyhydroxyalkanoates, carbon polyester compounds that form during unbalanced growth and serve as carbon and energy reserves (Poli et al., 2011;Spang et al., 2012) ( Figure 5 and Supplementary Table 1).
The presence of a full set of genes encoding for the four subunits (and maturation factors) of a soluble type 3b [NiFe]-hydrogenase, uniquely in Ca. N. cavascurensis, indicates the ability to catalyze hydrogen oxidation potentially as part of the energy metabolism. This group of hydrogenases is typically found among thermophilic archaea, is oxygentolerant and bidirectional, and can couple oxidation of H 2 to reduction of NAD(P) in order to provide reducing equivalents for biosynthesis, while some have been proposed to have sulfhydrogenase activity (Kanai et al., 2011;Peters et al., 2015;Greening et al., 2016) (and references therein). Although classified by sequence analysis (Sondergaard et al., 2016) and subunit composition as a type 3b hydrogenase, the alpha and delta subunits belong to the arCOG01549 and arCOG02472 families respectively, which contain coenzyme F 420 -reducing hydrogenase subunits, so far exclusively found in methanogenic archaea. Given the fact that Thaumarchaeota can synthesize this cofactor and encode a number of F 420dependent oxidoreductases with a yet unknown function, it is interesting to speculate whether oxidized F 420 could also be a potential substrate for the hydrogenase. Expression of the hydrogenase is likely regulated through a cAMP-activated transcriptional regulator encoded within the hydrogenase gene cluster (Supplementary Table 1).
Putative S-layer subunits were identified in the genome of Ca. N. cavascurensis (NCAV_0187 and NCAV_0188, Supplementary Table 1). Although the overall identity among thaumarchaeal S-layer subunits is very low (<30% between different genera), investigations of the genomic neighborhood reveals conserved synteny of this region.

Adaptations to Thermophilic Life
The molecular adaptations that enable survival and the maintenance of cell integrity at high temperatures have been the subject of intense studies since the discovery of thermophilic organisms. The issue of extensive DNA damage occurring at high temperatures has led to the study of systems of DNA stabilization and repair in thermophilic and hyperthermophilic archaea. Among them, the reverse gyrase, a type IA DNA topoisomerase shown to stabilize and protect DNA from heatinduced damage, is often (but not always (Brochier-Armanet and Forterre, 2007)) found in thermophiles, and is even considered a hallmark of hyperthermophilic organisms growing optimally above 80 • C (Bouthier de la Tour et al., 1990;Forterre, 2002). However, the gene might not always be essential for survival at high temperature in the laboratory (Atomi et al., 2004;Brochier-Armanet and Forterre, 2007). Interestingly, we could not identify a gene encoding for reverse gyrase in the genome of Ca. N. cavascurensis. This might either reflect that its growth optimum is at the lower end of extreme thermophiles or that there is a separate evolutionary line of adaptation to thermophily possible without reverse gyrase.
The following DNA repair mechanisms were identified in the genome of Ca. N. cavascurensis, in agreement with the general distribution of these systems in thermophiles (for reviews see Rouillon and White, 2011;Grasso and Tell, 2014;Ishino and Narumi, 2015): (a) Homologous recombination repair (HR): Homologs of RadA and RadB recombinase, Mre11, Rad50, the HerA-NurA helicase/nuclease pair, and Holliday junction resolvase Hjc are encoded in the genome. These genes have been shown to be essential in other archaeal (hyper)thermophiles, leading to hypotheses regarding their putative role in replication and more generally the tight integration of repair, recombination and replication processes in (hyper)thermophilic archaea (Grogan, 2015).
(b) Base excision repair (BER): the machinery responsible for the repair of deaminated bases was identified in the genome, including uracil DNA glycosylases (UDG) and putative apurinic/apyrimidinic lyases. Deletion of UDGs was shown to impair growth of (hyper)thermophilic archaea (Grogan, 2015).
(c) Nucleotide excision repair (NER): Homologs of the putative DNA repair helicase: nuclease pair XPB-Bax1 and nuclease XPF were found in the genome, but repair nuclease XPD could not be identified. XPD is present in all so far analyzed mesophilic AOA, but it is also absent in other thermophiles (Kelman and White, 2005). It should be noted that NER functionality in archaea is still unclear, and deletions of the respective genes were shown to have no observable phenotype (Grogan, 2015).
(d) Translesion polymerase: A Y-family polymerase with lowfidelity able to perform translesion DNA synthesis is encoded in Ca. N. cavascurensis.
Key enzymes of all the above-mentioned systems are also found in mesophilic AOA. Given their extensive study in the crenarchaeal (hyper)thermophiles, it would be interesting to characterize their respective functions and regulation in both mesophilic and thermophilic Thaumarchaeota.
(e) Bacterial-type UvrABC excision repair: In contrast to mesophilic AOA and in agreement with the known distribution of the system among mesophilic archaea and bacteria but its absence in (hyper)thermophiles, Ca. N. cavascurensis does not encode homologs of this repair machinery.
Ca. N. cavascurensis could potentially produce the polyamines putrescine and spermidine, which have been shown to bind and stabilize compacted DNA from thermal denaturation, acting synergistically with histone molecules, also present in AOA (Higashibata et al., 2000;Oshima, 2007). Although a putative spermidine synthase is also found in mesophilic AOA, the gene encoding for the previous step, S-adenosylmethionine decarboxylase (NCAV_0959), is only found in Ca. N. cavascurensis, located in tandem with the former. The biosynthesis of putrescine (a substrate for spermidine synthase) is unclear, since we could not identify a pyruvoyl-dependent arginine decarboxylase (ADC, catalyzing the first of the two-step biosynthesis of putrescine). However, it was shown that the crenarchaeal arginine decarboxylase evolved from an S-adenosylmethionine decarboxylase enzyme, raising the possibility of a promiscuous enzyme (Giles and Graham, 2008).
The production of thermoprotectant compounds with a role in stabilizing proteins from heat denaturation seems to be a preferred strategy of heat adaptation in Ca. N. cavascurensis. Firstly, the presence of mannosyl-3-phosphoglycerate synthase (NCAV_1295) indicates the ability to synthesize this compatible solute, shown to be involved in heat stress response and protection of proteins from heat denaturation in Thermococcales (Neves et al., 2005). Homologous genes are also present in mesophilic members of the order Nitrososphaerales and other (hyper)thermophiles (Spang et al., 2012;Kerou et al., 2016b). Secondly, only Ca. N. cavascurensis, but no other AOA encodes a cyclic 2, 3-diphosphoglycerate (cDPG) synthetase (NCAV_0908), an ATP-dependent enzyme which can synthesize cDPG from 2,3-biphosphoglycerate, an intermediate in gluconeogenesis. High intracellular concentrations of cDPG accumulate in hyperthermophilic methanogens, where it is required for the activity and thermostability of important metabolic enzymes (Shima et al., 1998).

Notable Features of the DNA Replication and Cell Division Systems in
Ca. N. cavascurensis Strikingly, only one family B replicative polymerase PolB was identified in the genome of Ca. N. cavascurensis (NCAV_1300), making it the only archaeon known so far to encode a single subunit of the replicative family B polymerase, as in Crenarchaeota multiple paralogs with distinct functions coexist (Makarova et al., 2014). Both subunits of the D-family polymerases PolD present in all other AOA and shown to be responsible for DNA replication in Thermococcus kodakarensis (also encoding both polD and polB families) (Cubonova et al., 2013) were absent from the genome, raising intriguing questions about the role of the polB family homolog in mesophilic Thaumarchaeota. Sequence analysis indicated that the Ca. N. cavascurensis homolog belongs to the polB1 group present exclusively in the TACK superphylum and shown recently by Yan and colleagues to be responsible for both leading and lagging strand synthesis in the crenarchaeon Sulfolobus solfataricus (Yan et al., 2017). However, the activity of Sulfolobus solfataricus PolB1 is dependent on the presence and binding of two additional proteins, PBP1 and PBP2, mitigating the strand-displacement activity during lagging strand synthesis and enhancing DNA synthesis and thermal stability of the holoenzyme, respectively. No homologs of these two additional subunits were identified in Ca. N. cavascurensis, raising the question of enzymatic thermal stability and efficiency of DNA synthesis on both strands.
Homologs of genes for the Cdv cell division system proteins CdvB (3 paralogs) and CdvC, but not CdvA, were identified in Ca. N. cavascurensis. This is surprising given that all three proteins were detected by specific immuno-labeling in the mesophilic AOA Nitrosopumilus maritimus, where CdvA, CdvC and two CdvB paralogs were shown to localize midcell in cells with segregated nucleoids, indicating that this system mediates cell division in Thaumarchaeota (Pelve et al., 2011). The Ca. N. cavascurensis CdvB paralogs (as all other thaumarchaeal CdvBs) all share the core ESCRT-III with the crenarchaeal CdvB sequences, contain a putative (but rather unconvincing) MIM2 motif necessary for interacting with CdvC in Crenarchaeota, while they all lack the C-terminal wingedhelix domain responsible for interacting with CdvA (Samson et al., 2011;Ng et al., 2013). Interestingly, one of the Ca. N. cavascurensis paralogs (NCAV_0805) possesses a 40 aminoacids serine-rich C-terminal extension right after the putative MIM2 motif absent from other thaumarchaea. It is worth noting that CdvA is also absent from the published Aigarchaeota genomes Ca. Caldiarchaeum subterraneum (Nunoura et al., 2011) and Ca. Calditenuis aerorheumensis (Beam et al., 2016), while both phyla (Thaumarchaeota and Aigarchaeota) encode an atypical FtsZ homolog. Thermococcales, albeit they presumably divide with the FtsZ system, also encode CdvB and CdvC homologs, while no CdvA homolog is detectable (Makarova et al., 2010). Given the emerging differences in the molecular and regulatory aspects of the Cdv system between Crenarchaeota and Thaumarchaeota (Pelve et al., 2011;Ng et al., 2013), the intriguing additional roles of the system (Samson et al., 2017) and the fact that only CdvB and CdvC are homologous to the eukaryotic ESCRT-III system and therefore seem to have fixed roles in evolutionary terms, this observation raises interesting questions regarding the versatility of different players of the cell division apparatus.

Ca. N. cavascurensis Has a Dynamic Genome
The Ca. N. cavascurensis genome contains large clusters of genes that showed deviations from the average G+C content of the genome (i.e., 41.6%), indicating that these regions might have been acquired by lateral gene transfer (Figure 7).
Two of the larger regions were integrative and conjugative elements (ICE-1 and ICE-2 in Figures 7, 8) of 65.5 and 43.6 kb, respectively. Both are integrated into tRNA genes and flanked by characteristic 21 and 24 bp-long direct repeats corresponding to the attachment sites, a typical sequence feature generated upon site-specific recombination between the cellular chromosome and a mobile genetic element (Grindley et al., 2006). ICE-1 and ICE-2 encode major proteins required for conjugation (colored in red in Figure 8A), including VirD4like and VirB4-like ATPases, VirB6-like transfer protein and VirB2-like pilus protein (in ICE-2). The two elements also share homologs of CopG-like proteins containing the ribbon-helixhelix DNA-binding motifs and beta-propeller-fold sialidases (the latter appears to be truncated in ICE-1). The sialidases of ICE-1 and ICE-2 are most closely related to each other (37% identity over 273 aa alignment). It is not excluded that ICE-1 and ICE-2 have inherited the conjugation machinery as well as the genes for the sialidase and the CopG-like protein from a common ancestor, but have subsequently diversified by accreting functionally diverse gene complements. Indeed, most of the genes carried by ICE-1 and ICE-2 are unrelated. ICE-1, besides encoding the conjugation proteins, carries many genes for proteins involved in DNA metabolism, including an archaeoeukaryotic primase-superfamily 3 helicase fusion protein, Cdc6like ATPase, various nucleases (HEPN, PD-(D/E)XK, PIN), DNA methyltransferases, diverse DNA-binding proteins and an integrase of the tyrosine recombinase superfamily (yellow in Figure 8A). The conservation of the attachment sites and the integrase gene as well as of the conjugative genes indicates that this element is likely to be a still-active Integrative and Conjugative Element (ICE) able to integrate into the chromosome and excise from it as a conjugative plasmid.
ICE-2 is shorter and encodes a distinct set of DNA metabolism proteins, including topoisomerase IA, superfamily 1 helicase, ParB-like partitioning protein, GIY-YIG and FLAP nucleases as well as several DNA-binding proteins. More importantly, this element also encodes a range of proteins involved in various metabolic activities as well as a Co/Zn/Cd efflux system that might provide relevant functions to the host (green in Figure 8A). In bacteria, ICE elements often contain cargo genes that are not FIGURE 7 | Genomic regions of atypical GC% content contain mobile genetic elements in Ca. N. cavascurensis. The genome is displayed in the form of a ring where genes are colored by functional categories (arCOG and COG), with the mean GC% content (blue circle) and local GC% as computed using 5 kb sliding windows (black graph) in inner rings. The minimal and maximal values of local GC% are displayed as dashed gray concentric rings. The predicted origin of replication is indicated with "Ori." Several large regions show a deviation to the average GC% that correspond to mobile genetic elements, i.e., ICEs (integrated conjugated elements), a specific pilus (see Figure 8), and a defense system (CRISPR-Cas Type IB and three CRISPR arrays (see text). . CHLP, geranylgeranyl diphosphate reductase; GGPS, geranylgeranyl pyrophosphate synthetase; UbiC, chorismate lyase; UbiA, 4-hydroxybenzoate octaprenyltransferase; RmlB, dTDP-glucose 4,6-dehydratase; UbiE, Ubiquinone/menaquinone biosynthesis C-methylase; HslU, ATP-dependent HslU protease; PflA, Pyruvate-formate lyase-activating enzyme; SuhB, fructose-1,6-bisphosphatase; TrxB, Thioredoxin reductase; YyaL, thoiredoxin and six-hairpin glycosidase-like domains; BCAT, branched-chain amino acid aminotransferase; PncA, Pyrazinamidase/nicotinamidase; SseA, 3-mercaptopyruvate sulfurtransferase; ThyX, Thymidylate synthase; CzcD: Co/Zn/Cd efflux system. (B) A head-tail like provirus from the Caudovirales order with genes in blue indicating putative viral functions. (C) Ca. N. cavascurensis specific type IV pilus locus next to a set of chemotaxis and flagellum (archaellum) genes. Homologs shared between the flagellum and the T4P are displayed with the same colors. A scheme of the putative type IV pilus is shown on the left with the same color code (inspired by drawings in Makarova et al., 2016). mb, membrane; cyto, cytoplasm; MCP, methyl-accepting chemotaxis proteins. related to the ICE life cycle and that confer novel phenotypes to host cells (Johnson and Grossman, 2015). It is possible that under certain conditions, genes carried by ICE-1 and ICE-2, and in particular the metabolic genes of ICE-2, improve the fitness of Ca. N. cavascurensis. Notably, only ICE-1 encodes an integrase, whereas ICE-2 does not, suggesting that ICE-2 is an immobilized conjugative element that can be vertically inherited in AOA. Given that the attachment sites of the two elements do not share significant sequence similarity, the possibility that ICE-2 is mobilized in trans by the integrase of ICE-1 appears unlikely.
The third mobile genetic element is derived from a virus, related to members of the viral order Caudovirales (Figure 8B) (Prangishvili et al., 2017). It encodes several signature proteins of this virus group, most notably the large subunit of the terminase, the portal protein and the HK97-like major capsid protein (and several other viral homologs). All these proteins with homologs in viruses are involved in virion assembly and morphogenesis. However, no proteins involved in genome replication seem to be present. The element does not contain an integrase gene, nor is it flanked by attachment sites, which indicates that it is immobilized. Interestingly, a similar observation has been made with the potential provirus-derived element in Nitrososphaera viennensis (Krupovic et al., 2011). Given that the morphogenesis genes of the virus appear to be intact, one could speculate that these elements represent domesticated viruses akin to gene transfer agents, as observed in certain methanogenic euryarchaea (Eiserling et al., 1999;Krupovic et al., 2010), or killer particles (Bobay et al., 2014), rather than deteriorating proviruses. Notably, ICE-1, ICE-2 and the provirus-derived element all encode divergent homologs of TFIIB, a transcription factor that could alter the promoter specificity to the RNA polymerase.
A fourth set of unique, potentially transferred genes encodes a putative pilus ( Figure 8C). All genes required for the assembly of a type IV pilus (T4P) are present, including an ATPase and an ATPase binding protein, a membrane platform protein, a prepilin peptidase, and several prepilins (Makarova et al., 2016). Based on the arCOG families of the broadly conserved ATPase, ATPase-binding protein, membrane-platform protein, and prepilin peptidase (arCOG001817, -4148, -1808, and -2298), this pilus seems unique in family composition, but more similar to the archetypes of pili defined as clades 4 (A, H, I, J) by Makarova and colleagues, which are mostly found in Sulfolobales and Desulfurococcales (Figure 4 from Makarova et al., 2016).
Yet, the genes associated to this putative pilus seem to be more numerous, as the locus consists of approximately 16 genes (versus ∼5 genes for the aap and ups in Sulfolobus). Such a combination of families is not found either in the T4P or flagellar loci found in analyzed Thaumarchaeota genomes. Prepilins are part of the core machinery of T4P, but display a high level of sequence diversity. The three prepilins found in Ca. N. cavascurensis T4P locus correspond to families that are not found in any other genome from our dataset (arCOG003872, -5987, -7276). We found a putative adhesin right in between the genes encoding the prepilins, and therefore propose that this Ca. N. cavascurensisspecific type IV pilus is involved in adhesion ( Figure 8C). This pilus thus appears to be unique in protein families composition when compared to experimentally validated T4P homologs in archaea (flagellum, ups, aap, bindosome Zolghadr et al., 2007;Frols et al., 2008;Tripepi et al., 2010;Henche et al., 2012), bioinformatically predicted ones (Makarova et al., 2016), and the pili found in other Thaumarchaeota, which correspond to different types. Interestingly, this pilus gene cluster lies directly next to a conserved chemotaxis/archaellum cluster as the one found in Nitrososphaera gargensis or Nitrosoarchaeum limnia (four predicted genes separate the T4P genes and cheA, Figure 8C) (Spang et al., 2012). This suggests that this pilus might be controlled by exterior stimuli through chemotaxis. The interplay between the archaellum and pilus expression would be interesting to study in order to comprehend their respective roles.
The genome of Ca. N. cavascurensis also carries traces of inactivated integrase genes as well as transposons related to bacterial and archaeal IS elements, suggesting that several other types of mobile genetic elements have been colonizing the genome. Collectively, these observations illuminate the flexibility of the Ca. N. cavascurensis genome, prone to lateral gene transfer and invasion by alien elements. Accordingly, we found a CRISPR-Cas adaptive immunity system among the sets of genes specific to Ca. N. cavascurensis that we could assign to the subtype I-B (Abby et al., 2014). We detected using the CRISPRFinder website (Grissa et al., 2007) at least three CRISPR arrays containing between 4 and 101 spacers presumably targeting mobile genetic elements associated with Ca. N. cavascurensis, reinforcing the idea of a very dynamic genome. Interestingly, the second biggest CRISPR array (96 spacers) lies within the integrated conjugative element ICE-1, which we hypothesize to be still active. This suggests that ICE-1 may serve as a vehicle for the horizontal transfer of the CRISPR spacers between Ca. N. cavascurensis and other organisms present in the same environment through conjugation, thus spreading the acquired immunity conferred by these spacers against common enemies.

CONCLUSION
We present an obligately thermophilic ammonia oxidizing archaeon from a hot spring on the Italian island of Ischia that is related to, but also clearly distinct from Ca. Nitrosocaldus yellowstonensis. It contains most of the genes that have been found to be conserved among AOA and are implicated in energy and central carbon metabolism, except nirK encoding a nitrite reductase. Its genome gives indications for alternative energy metabolism and exhibits adaptations to (high temperature) extreme environments. However, it lacks an identifiable reverse gyrase, which is found in most thermophiles with optimal growth temperatures above 65 • C and apparently harbors a provirus of head-and tail structure that is not usually found at high temperatures. Ca. N. cavascurensis differs also in its gene sets for replication and cell division, which has implications for the function and evolution of these systems in archaea. In addition, its extensive mobilome and the defense system indicate that thermophilic AOA are in constant exchange with the environment and with neighboring organisms as discussed for other thermophiles (van Wolferen et al., 2013). This might have shaped and continues to shape the evolution of Thaumarchaeota in hot springs. The pivotal phylogenetic position of Ca. N. cavascurensis will enable the reconstruction of the last common ancestor of AOA and provide further insights into the evolution of this ecologically widespread group of archaea.
For our enriched strain we propose a candidate status with the following taxonomic assignment: Nitrosocaldales order Nitrosocaldaceae fam. and Candidatus Nitrosocaldus cavascurensis sp. nov. Etymology: L. adj. nitrosus, "full of natron, " here intended to mean nitrous (nitrite producer); L. masc.n. caldus, hot; cavascurensis (L.masc. gen) describes origin of sample (Terme di Cavascura, Ischia) Locality: hot mud, outflow from hot underground spring, 77 • C Diagnosis: an ammonia oxidizing archaeon growing optimally around 68 • C at neutral pH under chemolithoautotrophic conditions with ammonia or urea, spherically shaped with a diameter of approximately 0.6-0.8 µm, 4% sequence divergence in 16S rRNA gene from its next cultivated relative Ca. Nitrosocaldus yellowstonensis.

AUTHOR'S NOTE
Another report on the enrichment and genome analysis of a thermophilic Thaumarchaeota was submitted to Frontiers in Microbiology, Daebeler et al. (2018, in review), shortly before submission of this manuscript. The core genomes of both organisms share many of the above-discussed features, but it remains to be seen if the observed similarities extend to their shell genome.

AUTHOR CONTRIBUTIONS
CS conceived the study; MS did the first enrichments; MM and CR made the growth characterizations; KP did the electron microscopy; SSA assembled the genome and performed the phylogenetic and genomic analyses for annotation; MKe annotated all metabolic and information processing genes; MKr analyzed the mobile genetic elements; CS, SSA, and MKe wrote the manuscript with contributions from MKr and MM.

FUNDING
This project was supported by ERC Advanced Grant TACKLE (695192) and Doktoratskolleg W1257 of the Austrian Science Fund (FWF). SSA was funded by a "Marie-Curie Action" fellowship, grant number THAUMECOPHYL 701981.

ACKNOWLEDGMENTS
We are grateful to Lucilla Monti for sharing her geological knowledge and for guiding us to the hot springs of Ischia in spring and fall 2013 and to Silvia Bulgheresi for crucial help in organizing the fall expedition. We also thank Romana Bittner for excellent technical assistance and continuous cultivation and the 12 participants of the Bachelor practical class (University of Vienna) for help in sampling hot springs on Ischia and for setting up initial enrichments. We are grateful to Ricardo Alves for primer design, Thomas Pribasnig for qPCR and to Thomas Rattei and Florian Goldenberg for help in running computations and for providing access to the CUBE servers in Vienna. Special thanks go to LABGeM, the National Infrastructure "France Génomique" and the MicroScope team of the Genoscope (Evry, France) for providing the pipeline of annotation, and for their help during this study. We thank Eduardo Rocha and Marie Touchon for kindly providing access to their curated genome database.