Polymorphisms in Brucella Carbonic Anhydrase II Mediate CO2 Dependence and Fitness in vivo

Some Brucella isolates are known to require an increased concentration of CO2 for growth, especially in the case of primary cultures obtained directly from infected animals. Moreover, the different Brucella species and biovars show a characteristic pattern of CO2 requirement, and this trait has been included among the routine typing tests used for species and biovar differentiation. By comparing the differences in gene content among different CO2-dependent and CO2-independent Brucella strains, we have confirmed that carbonic anhydrase (CA) II is the enzyme responsible for this phenotype in all the Brucella strains tested. Brucella species contain two CAs of the β family, CA I and CA II; genetic polymorphisms exist for both of them in different isolates, but only those putatively affecting the activity of CA II correlate with the CO2 requirement of the corresponding isolate. Analysis of these polymorphisms does not allow the determination of CA I functionality, while the polymorphisms in CA II consist of small deletions that cause a frameshift that changes the C-terminus of the protein, probably affecting its dimerization status, essential for the activity. CO2-independent mutants arise easily in vitro, although with a low frequency ranging from 10^-6 to 10^-10 depending on the strain. These mutants carry compensatory mutations that produce a full-length CA II. At the same time, no change was observed in the sequence coding for CA I. A competitive index assay designed to evaluate the fitness of a CO2-dependent strain compared to its corresponding CO2-independent strain revealed that while there is no significant difference when the bacteria are grown in culture plates, growth in vivo in a mouse model of infection provides a significant advantage to the CO2-dependent strain. This could explain why some Brucella isolates are CO2 dependent in primary isolation. The polymorphism described here also allows the in silico determination of the CO2 requirement status of any Brucella strain.


INTRODUCTION
Brucella species are facultative intracellular Gram-negative coccobacilli that cause brucellosis, the most prevalent zoonosis with more than 500,000 human cases reported worldwide every year (Pappas et al., 2006). Brucella isolates are routinely identified and classified by biochemical and phenotypical characteristics like urease activity, CO2 dependence, H2S production, erythritol and of not only physiologically essential but also industrially useful metabolites, such as amino acids, nucleotides, and fatty acids (Mitsuhashi et al., 2004). A role for CA in the intracellular pH regulation has also been demonstrated in some bacteria (Marcus et al., 2005).
Brucella species contain two different β-CAs, first identified in Brucella suis 1330 and thus named Bs1330 CAI and Bs1330 CAII. Both CAs contain the amino acid residues involved in the binding of the Zn ion (typical of the β family of CAs), as well as those involved in the catalytic site. Their activity has been verified in vitro, and it is slightly higher in Bs1330 CAII than in Bs1330 CAI (Joseph et al., 2010, 2011). Pérez-Etayo et al. (2018) compared CA I and CA II activity (activity defined empirically as that allowing growth in a normal atmosphere, the same definition used throughout this study) in several strains of B. suis, B. abortus, and B. ovis and determined that CA II is not functional in CO2-dependent B. abortus and B. ovis, thus establishing a correlation between CA activity and CO2 dependence. They also observed that CA I is active in B. suis 1330 or 513, but not in B. abortus 2308W, 292, and 544. Moreover, although an active CA I alone is enough to support CO2-independent growth of B. suis in rich media, it cannot do so in minimal media, nor can it support CO2-independent growth of B. abortus at all. A similar result was obtained by Varesio et al. (2019), who identified BcaA BOV (CA II) as the enzyme responsible for the growth of B. ovis in a standard, unsupplemented atmosphere (0.04% CO2), in this case by whole-genome sequencing (WGS) of CO2-independent mutants. Interestingly, they also reported that upon a CO2 downshift, B. ovis initiates a gene expression program that resembles the stringent response and results in transcriptional activation of its type IV secretion system. This shift is absent in B. ovis strains carrying a functional copy of CA.
The classical biotyping mentioned above, despite its limitations and the emergence of new molecular approaches to identify and classify Brucella at different taxonomic levels, is still extensively used by reference laboratories, often side by side with the molecular methods (Garin-Bastuji et al., 2014). However, although there is a known link between phenotype and its genetic cause in some traits like urease activity or erythritol sensitivity (Sangari et al., 1994, 2007, 2010), there is still a gap between the information provided by the molecular methods and the phenotype of Brucella isolates. With the availability of more genome sequences, it should be possible to reduce this gap by comparing the phenotypic characteristics of Brucella strains with their genome content. Comparative genomics of whole-genome sequences is especially interesting in bacterial pathogenesis studies (Hu et al., 2011). Pathogenomics can be considered a particular case of comparative genomics, and it has been extensively used for the identification of putative virulence factors in bacteria by comparing virulent and avirulent isolates (Pallen and Wren, 2007), although in principle it could be applied to the elucidation of any phenotypic trait. The genus Brucella is very homogeneous, with over 90% identity on the basis of DNA-DNA hybridization assays within the classical species, so relatively minor genetic variation between species sometimes results in striking differences. As an example, only 253 single-nucleotide polymorphisms (SNPs) separate Brucella canis from its nearest B. suis neighbor (Foster et al., 2009), but their host specificity differs widely: while B. canis is almost entirely restricted to the Canidae family, B. suis has a wide host range that includes pigs, dogs, rodents, hares, horses, reindeer, musk oxen, wild carnivores, and humans.
Similarly, there are only 39 SNPs consistently different between the vaccine strain B. abortus S19 and strains B. abortus 9-941 and 2308, two well-known virulent isolates (Crasta et al., 2008). In recent years, a large number of Brucella genomes representing all species and biovars have been sequenced, and this wealth of information is already resulting in new molecular epidemiology and typing methods (O'Callaghan and Whatmore, 2011). We have tested the potential of pathogenomics to unveil phenotypic traits in Brucella by defining the pangenome/pseudogenes of a set of Brucella strains and comparing it with the CO2 dependence of those strains. This process has allowed us to identify CA II as the enzyme responsible for growth of the bacteria at atmospheric CO2 concentrations and to extend the analysis to new species of Brucella. All the sequenced genomes of Brucella contain two β-CA genes, but only those that carry a defective β-CA II require supplemental CO2. Reversion of this phenotype happens in vitro at a low frequency and is accompanied by a compensatory mutation that results in a full-length β-CA II product. We have also tested the hypothesis that the presence of a truncated β-CA II confers a competitive advantage in vivo, as a way to explain why a mutation with such a low frequency could become fixed in some Brucella species and biovars. A competitive assay shows that one such mutant is significantly enriched in a mouse model of infection when compared with its corresponding full-length β-CA II strain. This could explain why CO2-dependent strains are selected in vivo. The polymorphisms affecting β-CA II encoding genes allow the prediction of the CO2 dependence status of any given strain, thus having the potential to replace the classical assay to characterize Brucella isolates.

Bacterial Strains and Growth Conditions
The bacterial strains and plasmids used in this work are listed in Table 1. Brucella strains were grown at 37°C for 48-96 h in a 5% CO2 atmosphere in Brucella broth (BB) or agar (BA) medium (Pronadisa, Spain). Media were supplemented with 10% fetal bovine serum (FBS) to grow B. ovis. All experiments with live Brucella were performed in a Biosafety Level 3 (BSL3) facility at the Department of Molecular Biology of the University of Cantabria, and animal infections with Brucella were conducted at the University of Cantabria animal facilities, also under BSL3 conditions.

Bioinformatic Methods
Genomic and protein sequences of the different Brucella species were obtained from GenBank and the Broad Institute. To allow easy comparison between the genes and pseudogenes in the different Brucella species, we constructed the panproteome of a set of fully annotated Brucella genomes (Supplementary Table S1). To construct this set, we started with all the coding sequences (CDSs) annotated in the B. suis 1330 genome. Next, we found the most probable functional counterparts for the pseudogenes annotated in B. suis 1330. The pseudogene list was taken directly from the original annotation of the B. suis 1330 genome. Finally, we added those CDSs located in indels from the other genomes that are not present in B. suis 1330. We assigned a new gene name to every CDS in our set following the Bru1_xxxx and Bru2_xxxx nomenclature, depending on the location of the gene in the B. suis genome. CDSs from indels were also renamed with a nomenclature, BRU1_iXXXX, the "i" indicating their origin from indels absent in B. suis 1330. The file pan_pep provided in the Supplementary Material is a multifasta protein file containing the sequence of all 3,496 CDSs present at least once in any of the genomes used and constitutes the first version of the Brucella panproteome. The genes and pseudogenes annotated in these genomes were tabulated and assigned to one of the different gene families present in those genomes. In this way, we constructed a spreadsheet with the pseudogenes in each genome using a uniform nomenclature. The analysis of the CA sequences at both the DNA and protein levels was extended to a group of 35 Brucella genomes (Supplementary Table S2). A theoretical structural model of Brucella Ba2308 CAII was generated by molecular threading using the protein homology and recognition engine Phyre2 (Kelley et al., 2015), taking the atomic coordinates of the best hit as template. The pdb model generated was visualized using the PyMOL Molecular Graphics System, version 1.3 (Schrödinger, LLC, Portland, OR, United States).
Primers used in this study (Table 2) were designed with Primer3 and synthesized by Sigma-Aldrich. Different CO2-dependent Brucella strains from our collection were streaked onto BA plates and grown in a 5% CO2 atmosphere. Individual colonies were then re-streaked in duplicate plates and incubated at 5% CO2 and ambient atmosphere to check for the correct CO2 dependence phenotype. They were grown as a lawn in fresh BA plates, and the growth was resuspended in phosphate-buffered saline (PBS). The suspension was serially diluted, and each dilution seeded in duplicate in BA plates. One dilution series was incubated at 5% CO2 to enumerate the number of bacteria in the inoculum, while the second was incubated at ambient atmosphere to select for CO2-independent colonies. The mutation rate was expressed as number of mutants per number of initial bacteria. Individual mutants were selected, and genomic DNA was obtained by using the InstaGene matrix as described by the supplier (Bio-Rad Laboratories, United Kingdom). CA I and CA II complete sequences from the different strains were amplified by PCR with oligonucleotides BS192_0456.F/R and BS191_1911.F/R, respectively, and sequenced to determine if there was any change compared to the corresponding parental sequence.
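The frequency described above (mutants per initial bacterium) can be illustrated with a minimal Python sketch. The function name, the colony counts, and the dilutions are hypothetical, and the sketch assumes that both dilution series were seeded with the same volume, which the text does not state.

```python
def reversion_frequency(mutant_colonies, mutant_dilution,
                        total_colonies, total_dilution):
    """Return CO2-independent mutants per initial bacterium.

    mutant_colonies: colonies on the plate kept at ambient atmosphere
    total_colonies:  colonies on the plate kept at 5% CO2
    *_dilution:      dilution of the suspension seeded on each plate
                     (1.0 = undiluted, 1e-7 = seventh tenfold dilution)
    Assumes the same volume was seeded on both plates (an assumption).
    """
    mutants = mutant_colonies / mutant_dilution
    total = total_colonies / total_dilution
    return mutants / total


# Hypothetical counts: 12 colonies from the undiluted suspension at ambient
# atmosphere, 150 colonies from the 10^-7 dilution at 5% CO2.
freq = reversion_frequency(12, 1.0, 150, 1e-7)
print(f"{freq:.1e} mutants per initial bacterium")
```

With these made-up counts the frequency is 12 / (1.5 × 10^9), that is, 8 × 10^-9.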

Infection and Intracellular Viability Assay of Brucella abortus in J774 Cells
J774.A1 macrophage-like cells [American Type Culture Collection (ATCC), TIB-67] were cultured in RPMI medium with 2 mM L-glutamine and 10% FBS at 37°C in 5% CO2 and 100% humidity. Confluent monolayers were trypsinized, and 2 × 10^5 cells per well were incubated for 24 h before infection in 24-well tissue culture plates. Macrophages were infected with Brucella strains in triplicate wells at an MOI of 50. After infection for 30 min, the wells were washed five times with sterile PBS and further incubated for 30 min in RPMI with 2 mM L-glutamine, 10% FCS, and 50 µg gentamicin per milliliter to kill extracellular bacteria. That was taken as time 0 post infection, and the medium was changed to contain 10 µg gentamicin per milliliter. The number of intracellular viable B. abortus was determined at different time points by washing three times with PBS, lysing infected cells with 0.1% Triton X-100 in H2O, and plating a series of 1:10 dilutions on BA plates for colony-forming unit (CFU) determination.
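As an illustration of the CFU determination from 1:10 dilutions, the following Python sketch converts a plate count into CFU per well. The plated and lysate volumes are not given in the text, so `plated_ml` and `lysate_ml` are hypothetical parameters, and the counts are invented.

```python
import math


def cfu_per_well(colonies, tenfold_dilutions, plated_ml, lysate_ml):
    """Estimate viable intracellular bacteria per well from one plate count.

    colonies:          colonies counted on the plate
    tenfold_dilutions: number of 1:10 dilution steps before plating
    plated_ml:         volume spread on the plate (assumed value)
    lysate_ml:         total lysate volume per well (assumed value)
    """
    cfu_per_ml = colonies * 10 ** tenfold_dilutions / plated_ml
    return cfu_per_ml * lysate_ml


# Hypothetical triplicate wells counted at the 10^-3 dilution.
counts = [42, 38, 51]
wells = [cfu_per_well(c, 3, plated_ml=0.1, lysate_ml=1.0) for c in counts]
mean_log10 = sum(math.log10(w) for w in wells) / len(wells)
print(f"mean log10 CFU per well: {mean_log10:.2f}")
```

Averaging on the log10 scale is a common convention for CFU data, used here as an assumption rather than as the authors' stated procedure.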

Competitive Infection Assays
The following protocol was approved by the Cantabria University Institutional Laboratory Animal Care and Use Bioethics Committee and was carried out in accordance with the Declaration of Helsinki and the European Communities Council Directive (86/609/EEC). Comparison of fitness between CO2-dependent and isogenic CO2-independent strains was done through a competitive infection assay in order to minimize animal-to-animal variation. BALB/c mice (CIFRA, Spain) were injected with 1:1 mixtures of B. abortus 292 (CO2 dependent, wild type) and B. abortus 292mut1 (a spontaneous Ba292 CAII CO2-independent mutant). Two hundred microliters of a suspension containing approximately 10^8 bacteria was administered intraperitoneally to a group (n = 6) of 6- to 8-week-old female BALB/c mice. Mice were sacrificed 8 weeks after infection, and the liver and spleen were removed aseptically and homogenized with 5 ml of BB containing 20% glycerol. Samples were serially diluted and plated in quadruplicate on BA plates. Half of the plates were incubated with 5% CO2, and the other half at ambient atmosphere. Additionally, colonies grown at 5% CO2 were replica-plated and incubated at both CO2 concentrations, to measure the ratio of CO2-dependent and CO2-independent colonies in two independent ways. For in vitro competitive index (CI) assays, BA plates were seeded, forming a lawn with the infection mix, and incubated at 37°C with 5% CO2 for 8 weeks, with repeated subculture in fresh BA plates every 4-5 days in the same conditions. The ratio of CO2-dependent and CO2-independent colonies was determined with the same protocol as the in vivo CI. The CI was calculated as the ratio of mutant to wild-type bacteria recovered at the end of the experiment divided by the ratio of mutant to wild-type bacteria in the inoculum, and the differences between groups were analyzed by Student's two-tailed t test with significance set at P < 0.05.
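The CI definition above can be sketched in Python as follows. All counts are hypothetical. The sketch assumes that plates at 5% CO2 count both strains while plates at ambient atmosphere count only the CO2-independent mutant, so the wild-type count is obtained by subtraction. Only the pooled-variance t statistic is computed here; a two-tailed P value would additionally require the t distribution (for example via `scipy.stats.ttest_ind`).

```python
import math
import statistics


def mutant_to_wt_ratio(cfu_with_co2, cfu_ambient):
    """Ratio of CO2-independent (mutant) to CO2-dependent (wild-type) CFU.

    Assumes 5% CO2 plates count both strains and ambient plates count
    only the mutant (an interpretation of the plating scheme).
    """
    return cfu_ambient / (cfu_with_co2 - cfu_ambient)


def competitive_index(output_ratio, input_ratio):
    """CI = (mutant/wild type recovered) / (mutant/wild type inoculated)."""
    return output_ratio / input_ratio


def student_t(a, b):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical inoculum: 200 CFU at 5% CO2, 100 CFU at ambient -> 1:1 mix.
input_ratio = mutant_to_wt_ratio(200, 100)

# Hypothetical (total, mutant) counts recovered from mice and from plates.
in_vivo = [competitive_index(mutant_to_wt_ratio(t, m), input_ratio)
           for t, m in [(300, 30), (250, 20), (400, 50)]]
in_vitro = [competitive_index(mutant_to_wt_ratio(t, m), input_ratio)
            for t, m in [(200, 95), (220, 115), (180, 90)]]

t_stat, dof = student_t(in_vivo, in_vitro)
print(in_vivo, in_vitro, t_stat, dof)
```

A CI of 1 means both strains are equally fit; values below 1 mean the mutant was outcompeted by the wild type.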

Identification of the Gene Responsible for the CO2 Dependence in Brucella abortus
The first evidence of the involvement of CA in the CO2 dependence phenotype came from the analysis of pseudogenes in the 10 fully annotated Brucella genomes (Supplementary Table S1). After tabulation of the pseudogenes, their distribution across the different species was compared with the target phenotype; in this particular case, we interrogated the spreadsheet n_pseudos.xls (Supplementary Material) to find out which genes are pseudogenes only in those strains in our list that are CO2 dependent, namely B. abortus 9-941 and B. ovis. Three genes met this criterion: Bru1_1050, which encodes a multidrug resistance efflux pump; Bru1_1827, which encodes CA II; and Bru2_1236, which encodes an adenosylmethionine-8-amino-7-oxononanoate aminotransferase. Given the requirement of CA for growth of other microorganisms at ambient CO2 concentrations, and to check whether Bru1_1827 could be responsible for the CO2 dependence phenotype, we retrieved and aligned the DNA and corresponding amino acid sequences obtained from a set of 35 Brucella strains with a known requirement for CO2 (Wattam et al., 2014; Supplementary Table S2).

Frontiers in Microbiology | www.frontiersin.org

FIGURE 1 | Alignment of sequences of carbonic anhydrase (CA) II from representative Brucella isolates. The genomes shown here are the representative species for each of the clusters of identical sequences obtained from the 35 selected Brucella strains. Those clusters formed by species that are CO2 dependent are shown in blue. (A) Partial DNA sequences, with nucleotides that differ from the consensus wild-type sequence highlighted in red, and (B) protein sequences, with amino acids that differ from the consensus wild-type sequence highlighted in red. Red triangles indicate the four zinc-binding residues, Cys42, Asp44, His98, and Cys101, and blue triangles the catalytic dyad Asp44-Arg46.
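The pseudogene screen described in this section amounts to a set comparison, which the following Python sketch illustrates. The strain and gene names are placeholders, not the contents of n_pseudos.xls, and the criterion is read here as "pseudogene in every CO2-dependent strain and in no CO2-independent strain", which is an interpretation of the text.

```python
def pseudogenes_specific_to(pseudogenes, target_strains):
    """Genes that are pseudogenes in all target strains and in no other.

    pseudogenes:    dict mapping strain name -> set of pseudogene names
    target_strains: set of strain names showing the phenotype of interest
    """
    in_all_targets = set.intersection(
        *(pseudogenes[s] for s in target_strains))
    in_any_other = set().union(
        *(genes for strain, genes in pseudogenes.items()
          if strain not in target_strains))
    return in_all_targets - in_any_other


# Placeholder data (hypothetical strains and genes).
table = {
    "strain_A": {"gene_1", "gene_2", "gene_3"},
    "strain_B": {"gene_2", "gene_3", "gene_4"},
    "strain_C": {"gene_3", "gene_5"},
    "strain_D": {"gene_5"},
}
co2_dependent = {"strain_A", "strain_B"}

candidates = pseudogenes_specific_to(table, co2_dependent)
print(sorted(candidates))
```

In this toy table only gene_2 is shared by both CO2-dependent strains and intact everywhere else; gene_3 is excluded because it is also a pseudogene in strain_C.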
Sequences were clustered with VSEARCH (Rognes et al., 2016), resulting in 10 unique sequences that were aligned with ClustalW (Larkin et al., 2007). The CO2-independent isolates code for full-length identical proteins except for the B. abortus 2308 and 2308A strains, which have an extra amino acid, Ala113. In contrast, the CO2-dependent isolates contain different frameshifts or single-point mutations, which result in truncated or altered proteins (Figure 1).
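Grouping strains into clusters of identical sequences can be illustrated with a few lines of Python. This is not VSEARCH and does not reproduce its algorithm or options; it is a minimal stand-in that only groups exact, case-insensitive matches, and the strain names and sequences are invented.

```python
from collections import defaultdict


def cluster_identical(sequences):
    """Group strain names by identical sequence.

    sequences: dict mapping strain name -> DNA sequence
    Returns a dict mapping each unique sequence -> list of strain names.
    """
    clusters = defaultdict(list)
    for strain, seq in sequences.items():
        clusters[seq.upper()].append(strain)
    return dict(clusters)


# Invented toy sequences.
toy = {
    "strain_A": "ATGAAACCCGGG",
    "strain_B": "atgaaacccggg",
    "strain_C": "ATGAAACCCCGGG",
}
clusters = cluster_identical(toy)
for seq, strains in clusters.items():
    print(seq, strains)
```

One representative per cluster can then be passed to a multiple sequence aligner.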
A group of three B. abortus strains (86/8/59, 9-941, and 292) shows an extra "C" at position 337 in the CAII gene when compared with the wild-type allele, leading to a frameshift that causes a premature stop, truncating half of the protein. B. ovis ATCC 25840 shows an extra "G" at position 523, which similarly leads to a frameshift that alters the last third of the protein at the C-terminus. Finally, three B. pinnipedialis strains (M163/99/10, M292/94/1, and B2/94) contain a SNP, 557T > C, that causes a non-conservative amino acid substitution, Leu186Pro (Figure 1B).
C337 also appears in CO2-independent B. abortus biovar 1 strains, like S19, 2308, or NCTC 8038, but in these cases, there are additional mutations that restore the original open reading frame (ORF): two extra nucleotides in strain 2308, or a one-nucleotide deletion in B. abortus NCTC 8038 and S19. These changes do not affect the conserved amino acid residues typical of β-CAs involved in the catalytic cycle, that is, the four zinc-binding residues, Cys44, Asp46, His105, and Cys108, and the catalytic dyad Asp46 and Arg48 (Figure 1B). Some of the strains analyzed here (B. suis 1330 and B. abortus strains 2308W, 292, and 544) were also analyzed by Pérez-Etayo et al. (2018), and our results are in complete agreement.
There is one discrepancy involving B. abortus Tulya, a biovar 3 strain that according to the literature (Alton et al., 1988) should be CO2 dependent, but according to our analysis, it codes for a full-length CA II, thus being grouped with the CO2-independent isolates. To resolve this apparent puzzle, we plated a sample of B. abortus Tulya from our laboratory stock and determined its CO2 dependence. Contrary to the original reference strain phenotype and in agreement with our in silico analysis, this isolate was indeed CO2 independent. The complete BaTulya CAII was amplified by PCR from our strain and sequenced, confirming the published sequence. This strain originated from the collection kept in the Centro de Investigación y Tecnología Agroalimentaria of Aragón (CITA), Zaragoza, Spain, where it is also labeled as being CO2 independent, suggesting that this is not the result of a contamination or selection of a CO2-independent mutant in our hands.
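The idea of an in silico check can be sketched in Python: translate a CA II coding sequence and test whether its C-terminus still matches that of a functional reference. The sequences below are invented toy ORFs, not Brucella genes, and the "matching C-terminal tail" rule is a simplified heuristic assumed for illustration, not the authors' procedure.

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def translate(dna):
    """Translate from the first base up to the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def c_terminus_intact(dna, reference_dna, tail=3):
    """True if the last `tail` residues match those of the reference."""
    return translate(dna)[-tail:] == translate(reference_dna)[-tail:]


# Toy ORFs (hypothetical): reference, +1 frameshift, and a revertant in
# which two further inserted bases restore the reading frame.
reference = "ATGAAACCCGGGTTTGCATAA"
frameshift = "ATGAAACCCCGGGTTTGCATAA"
revertant = "ATGAAACCCCGCGGGTTTGCATAA"

for name, seq in [("frameshift", frameshift), ("revertant", revertant)]:
    status = "intact" if c_terminus_intact(seq, reference) else "altered"
    print(name, translate(seq), status)
```

In the toy revertant the restored frame comes with one extra residue, which mirrors the kind of compensatory change described for strain 2308. A substitution that lies outside the compared tail would not be detected by this heuristic, so a real check would compare the full-length protein.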
As Brucella species code for two different CAs (Joseph et al., 2010), we repeated the analysis for the CA I CDSs.
Although several isolates contain a polymorphism consisting of a 24-nt deletion between two 11-nt direct repeats or different SNPs (Supplementary Figure S1), there was no obvious correlation between the presence of these polymorphisms and CO2 dependence. Pérez-Etayo et al. (2018) demonstrated that CA I from B. abortus strains 2308W, 292, and 544 is inactive, while that from B. suis strains 1330 and 513 is active, although it can only mediate CO2 independence in complex media and in a rather prototrophic host. Comparison of Bsuis513 CAI and Babortus2308W CAI reveals a difference of only one amino acid, the valine at position 74 being replaced by a glycine.

Brucella CO2-Independent Spontaneous Mutants Present a Modified CA II Sequence
Comparison of some of the CA II sequences of B. abortus biovar 1 CO2-independent strains like 2308, S19, or NCTC 8038 with those of the other Brucella CO2-dependent and CO2-independent isolates suggests that reversion of the CO2 requirement coincides with the introduction of compensatory mutations able to reverse the initial frameshift described above. CO2-independent mutants have been previously reported to appear at a low frequency (3 × 10^-10) in cultures of CO2-dependent strains by subculturing in vitro in the absence of supplementary CO2 (Marr and Wilson, 1950). We measured the frequency of the reversion in six CO2-dependent strains from our laboratory collection, by growing duplicate cultures with or without CO2. We first checked the phenotype of all the strains by streaking them on a BA plate that was incubated without added CO2. All the strains but B. abortus Tulya, as reported above, failed to grow in these conditions, in agreement with the published phenotype. We then plated overnight cultures from the CO2-dependent strains to obtain colonies grown at ambient atmosphere and calculated the frequency of revertants for those strains (Table 3). The B. abortus strains had a frequency similar to the one described by Marr and Wilson (1950), 10^-8 to 10^-10, but B. ovis and B. pinnipedialis had a higher frequency of reversion, 10^-6. In an exploratory effort to identify a possible cause for these differences in mutation rates, we analyzed the presence and identity between strains of the most obvious proteins that could be involved in this phenotype, like DNA polymerases, MutT, MutS, and MutD. Blastp analysis showed that, in all cases, the protein not only was present in all strains but also had 100% identity, so we could not find any difference that could explain our results. Analysis of the frequency of reversion in more CO2-dependent strains may reveal whether this is a species, biovar, or even isolate phenotype.
We selected a few revertants from each strain and amplified by PCR the CA I and CA II coding regions. The amplicons were then sequenced to determine if any compensatory mutation had appeared in those loci. In all cases, we found compensatory mutations in the same region, around nucleotides 333-343. All mutations in this hot spot resulted in full-length CA II proteins (Figure 2), or in the case of B. pinnipedialis, a C-to-T change that reverts the Leu-to-Pro substitution. In this case, we also found the insertion of a nucleotide triplet (CGC or CCG) at the hot spot, which results in the addition of an extra amino acid, either Ala113 or Arg113, that is, the same position where the extra codon in Ba2308 CAII is located. Although some of the compensatory mutations appear several times, the most common situation was to find different mutations for the same sequence. As expected, reversion of the CO2 dependence phenotype did not produce any change in the CDSs of CA I, reinforcing the hypothesis that CA II plays the main role in CO2 independence.

Structural Modeling of Babortus2308 CAI and Babortus2308 CAII
A single amino acid substitution, Val74 in Bsuis513 CAI to Gly74 in Babortus2308W CAI, putatively renders the protein inactive, while the mutations in CA II in CO2-dependent Brucella isolates do not affect the region where the active center is located, at the N-terminal part of the protein (Figure 1B). Moreover, a non-conservative Leu186Pro substitution, far from the active center, is enough to induce CO2 dependence in the B. pinnipedialis strains analyzed. To better understand the effect of the observed mutations, a theoretical structural model of Ba2308 CAI and Ba2308 CAII was built with Phyre2. The modeled structures closely resembled those of other β-CAs that have been crystallized, displaying matches with 100% confidence.

FIGURE 2 | Nucleotide changes in CA II from selected CO2-independent mutants of different Brucella strains. Partial sequence of the regions where the original CO2-dependent strains had the mutations that caused the defective phenotype (shown in red), and the changes observed after selection and sequencing of different spontaneous CO2-independent mutants (shown in blue).
The closest structural homolog to Ba2308 CAI is 1DDZ, a β-CA from the red alga Porphyridium purpureum (Mitsuhashi et al., 2000), with 45% identity. Each 1DDZ monomer contains two internally repeated structures, each one homologous to Ba2308 CAI. Overlapping of the modeled structures shows how the mutated residue Gly76 lies in close proximity to the coordinated zinc atom and also to the dimer interface (Figure 3). In the position equivalent to Val76 in Bsuis513 CAI, 1DDZ contains Ile173 or Ile427, both among the most hydrophobic of amino acids. These residues establish hydrophobic contacts in the interface between the domains: Ile173 with Val441 and Phe442, and Ile427 with Phe168 and Tyr190. Identical (Phe71 and Val90) and similar (Phe93) residues are located in the equivalent positions in Brucella Ba2308 CAI. The presence of a glycine in Brucella Ba2308 CAI instead of a hydrophobic residue (valine in Bsuis513 CAI, isoleucine in 1DDZ) disrupts these hydrophobic interactions and could impair dimerization. Besides, this substitution could locally alter the folding of this region and affect the nearby residues that are coordinating the Zn atom. In both cases, the structure and consequently the activity of the protein would be affected. Indeed, a Val-to-Gly substitution, located in the dimerization surface, was shown to interfere with dimerization of citrate synthase from Thermoplasma acidophilum (Kocabiyik and Erduran, 2000), not only reducing its catalytic activity (about 10-fold) but also decreasing its thermal and chemical stability.

FIGURE 3 | Structural model of Brucella Ba2308 CAI. (A) Predicted structure of a monomer of Q2YL41, the CA I from Brucella abortus 2308, created using Phyre2 with the structure of the β-carbonic anhydrase from the red alga Porphyridium purpureum (1ddz) as template. Gly76 is depicted in red and the nearby zinc atom as a pink ball. (B) X-ray structure of the P. purpureum monomer, composed of two internally repeated structures. The N-terminal half (residues 1-308, equivalent to the sequence of monomeric Ba2308 CAI) is in green, and the C-terminal half (residues 309-564, equivalent to the second molecule of a putative dimer from Ba2308 CAI) is in gray. The position equivalent to Gly76 from Ba2308 CAI is occupied in P. purpureum by Ile173 (in the N-terminal half) or Ile427 (in the C-terminal half). These residues establish hydrophobic contacts in the interface between the domains: Ile173 with Val441 and Phe442 (upper zoom image) and Ile427 with Phe168 and Tyr190 (lower zoom image). Identical (Phe71 and Val90) and similar (Phe93) residues are located in the equivalent positions in Brucella Ba2308 CAI (A). The presence of a glycine in Brucella Ba2308 CAI instead of an isoleucine disrupts these hydrophobic interactions and could impair dimerization. Besides, this substitution could locally alter the folding of this region and affect the nearby residues that are coordinating the Zn atom.
The model structure obtained for Ba2308 CAII is shown in Figure 4, along with the dimer structure of the best hit obtained, 5SWC, showing 29% identity and 100% confidence. 5SWC is the β-CA CcaA from Synechocystis sp. PCC 6803. As Ba2308 CAII contains an extra codon, the residue highlighted in red, Leu187, is the equivalent residue to the Leu186Pro change that is present in the CO2-dependent B. pinnipedialis strains.
In this structure, the protein crystallizes as a dimer, with the N-terminal arm composed of two α-helical segments (H1 and H2) that extend away from the rest of the molecule and make significant contacts with the last β-sheet of an adjacent monomer (in the case of Ba2308 CAII, His188 with Met1 and Trp191 with Leu4). This interaction between monomers has been determined to be crucial for the establishment of the dimer (Cronk et al., 2001). In the case of B. abortus strains 86/8/59, 9-941, and 292, the premature stop would cause the complete loss of the C-terminal end of the protein, including the last β-sheet, involved in the formation of the dimer. B. ovis ATCC 25840 also shows a completely altered C-terminus, and although the new amino acid sequence would remain folded as a β-sheet, it shows a completely different amino acid composition that would prevent the establishment of the right molecular interactions between the adjacent monomers. Regarding the last mutation observed in CO2-dependent strains, the SNP present in B. pinnipedialis strains M163/99/10, M292/94/1, and B2/94 causes a non-conservative amino acid substitution, Leu186Pro. The model predicts that this change will occur at the last β-sheet, in the area of interaction with the N-terminus of the adjacent monomer. Proline is an amino acid that confers exceptional conformational rigidity and as such is a known disruptor of both α-helices and β-sheets. This substitution is therefore predicted to disrupt the dimerization of Brucella CA II.

Competitive Infection Assays
Strain 2308 is not only a CO2-independent Brucella isolate but also one of the most widely used virulent challenge strains, while S19, also a CO2-independent Brucella isolate, is an attenuated vaccine strain. In vitro cell assays using J774 macrophages did not detect any difference in virulence between a CO2-dependent B. abortus 292 strain and its corresponding CO2-independent revertant (Figure 5). Additionally, we could not find any report in the literature suggesting that the CO2 dependence phenotype is related to virulence. However, there is one puzzling fact: despite the expected low frequency of a frameshift mutation, this mutation is somehow fixed in several species and biovars of Brucella. It is then reasonable to think of it as having a biological advantage in specific situations. CI assays have been used to reveal subtle differences in fitness between two strains, and intra-animal experiments help to minimize inherent inter-animal biological variation and also improve the identification of mutations or isolates with reduced or improved competitive fitness within the host (Falkow, 2004). As this could be the case with Ba2308W CAII, we performed a CI experiment using B. abortus 292 and one of its CO2-independent mutants, 292mut1. As a control, we grew the same initial mixture in BA plates that were incubated at 37°C with 5% CO2, to determine whether any change in CI could be attributed to the CO2 concentration alone or to some other factor related to growth within an animal. Results are shown in Figure 6. During the course of the experiments in mice, there was a significant enrichment of the strain carrying the truncated form of Ba2308W CAII, B. abortus 292, when compared with the CO2-independent revertant able to produce a complete active form of Ba2308W CAII. There was no significant difference in the ratio of the two strains between liver and spleen, so the colony counts were combined for each mouse to give the ratio in that mouse. At the same time, there was no significant enrichment or change in the ratio in cultures grown on plates. This suggests that inactivation of Ba292 CAII confers some fitness advantage in vivo and could eventually result in the displacement of its corresponding CO2-independent counterpart. This hypothesis could explain why, despite the low frequency of mutation, CO2-dependent strains appear on primary isolation. As there are some other species and biotypes of Brucella that are CO2 dependent on primary isolation, we could infer that the fitness advantage is also present in those species and biotypes.

FIGURE 6 | Competitive index assay of a mixture of Brucella abortus 292 and its corresponding CO2-independent spontaneous mutant B. abortus 292mut1. The competitive index was calculated by dividing the output ratio of mutant to wild-type bacteria by the input ratio of mutant to wild-type bacteria, in the two groups tested, regarding the original inoculum. Thus, for strains with the same fitness, the result should be 1. The differences between groups were analyzed by Student's two-tailed t test with significance set at * P < 0.05. ns, non-significant.

DISCUSSION
Diagnosis of brucellosis is usually achieved by serological detection in both animals and humans. This could be enough to warrant the initiation of response measures, like the start of antibiotic therapy in humans or immobilization or sacrifice of animals. However, isolation, identification, and subtyping of brucellae not only are definitive proof of infection but also allow epidemiological surveillance. Depending on the laboratory, this process is carried out by a combination of classical and modern molecular methods. The classical typing methods consist of the phenotypic characterization of the isolates, using biochemical and immunological tests (CO2 requirement; H2S production; urease activity; agglutination with monospecific A, R, and M sera; growth on media with thionin or basic fuchsin; or sensitivity to erythritol), and susceptibility to lytic Brucella phages (Alton et al., 1988). These methods require culture of the bacteria, are usually time-consuming and laborious, and do not offer good discriminatory power. Moreover, in the last few years, the field has experienced a revolution with the advent of new molecular methods, resulting in the description of new species and a better understanding of the population structure of the genus Brucella. Thus, the classical methods are being replaced or complemented by modern molecular methods. These methods range from PCR detection systems targeting different loci (like ery, bcsp31, or IS711), which allow species and even biovar differentiation (Mayer-Scholl et al., 2010; López-Goñi et al., 2011), to multilocus sequence analysis (MLSA), which has been successfully used to describe the phylogenetic relationships of isolates and the global population structure of the genus Brucella (Whatmore et al., 2016).
More recently, with the advent of WGS and especially with the drop in sequencing prices, WGS has been proposed as the new routine typing method, particularly in groups with a high degree of similarity at the biochemical or serological levels (Chattaway et al., 2017), like Brucellaceae. But these methods are still far from being routine in most brucellosis laboratories, particularly in developing countries, and the classical methods are still routinely used in reference laboratories. Although genomic information offers the potential to unveil most of the phenotypic traits in bacteria, there are still important attributes that are not evident in the genome sequence. Thus, there is a gap between the classical typing schemes and the molecular methods, and some features still cannot be attributed to any specific genetic trait. In the case of Brucella, host specificity is particularly interesting, as it cannot yet be predicted from the genome sequence. It is reasonable to think that as molecular typing improves, we should advance in closing the gaps between classical and molecular typing, and we would be able to predict the full virulence and host specificity of a given isolate by analyzing the genome content. We have started to address this gap by looking at the genomic differences between Brucella isolates regarding one of the classical tests for typing, namely, CO2 requirement.
Brucella abortus biovars 1, 2, 3, and 4 and some isolates from biovar 9, as well as B. ovis, require an increased concentration of CO2 for growth, as do most strains of B. pinnipedialis, but only some of B. ceti. We selected 10 Brucella strains, which have been sequenced and annotated and whose CO2 dependence status was known, to construct a Brucella pangenome based on the B. suis 1330 genome annotation. This resulted in a collection of 3,496 CDSs. We next compared the distribution of pseudogenes (as annotated in the databases) and absent genes with CO2 dependence, resulting in only three candidate genes: Bru1_1050, which encodes a multidrug resistance efflux pump; Bru1_1827, which encodes CA II; and Bru2_1236, which encodes an adenosylmethionine-8-amino-7-oxononanoate aminotransferase. The most obvious candidate was CA II, as it has been shown to be required for growth under ambient air in a number of microorganisms. To confirm our initial result, we extracted and aligned the DNA and amino acid sequences of CA II from an extended set of sequenced strains with a known CO2 phenotype. Those strains that are able to grow in atmospheric concentrations of CO2 carry a full-length copy of the protein, while those that are not contain truncated or mutated versions of the protein. Brucella species also carry a second CA, CA I, but the polymorphisms found at both the DNA and protein levels do not allow us to infer CO2 dependence. This result is in agreement with that reported by Pérez-Etayo et al. (2018) and Varesio et al. (2019) and further extends the range of strains tested.
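The comparison of pseudogene and absent-gene distribution with CO2 dependence amounts to looking for genes whose intact or disrupted status tracks the phenotype across all strains. A minimal sketch of that filtering step is given below, under the assumption that each gene has already been scored as intact or not in every strain; the function name, the data layout, and the toy gene and strain labels in the usage note are hypothetical and are not the authors' pipeline or data.

```python
def candidate_genes(gene_intact, co2_dependent):
    """Return genes whose status perfectly tracks the CO2 phenotype.

    gene_intact:   {gene_id: {strain: True if the gene is present and intact,
                              False if it is absent or annotated as a pseudogene}}
    co2_dependent: {strain: True if the strain requires added CO2}

    A gene is kept when it is intact in every CO2-independent strain and
    absent or disrupted in every CO2-dependent strain.
    """
    candidates = []
    for gene, status in gene_intact.items():
        if all(status[strain] != dependent
               for strain, dependent in co2_dependent.items()):
            candidates.append(gene)
    return sorted(candidates)
```

For example, with toy strains `s1` and `s2` (CO2 independent) and `s3` (CO2 dependent), a gene scored intact in `s1` and `s2` but disrupted in `s3` would be returned, whereas a gene intact in all three strains would not.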
A direct application of this result would be the determination of the CO2 dependence status of any given strain by determining the sequence at the CA II locus. This is actually the case in B. abortus Tulya, where our analysis predicted that our stock should be CO2 independent, as it was the original stock from CITA. Laboratory determination of the phenotype confirmed the in silico result. This approach could be used to determine or at least narrow down candidate genes for different phenotypes, obviously with monogenic traits being the easiest to determine.
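The in silico determination described above can be illustrated with a simple check: translate the CA II open reading frame of a strain and compare the product with a full-length reference protein. The sketch below is an assumption-laden simplification, not the authors' method: the function names, the `tail` parameter, and the rule itself (same length and identical C-terminal residues) are hypothetical, and a real analysis would start from a proper alignment of the locus.

```python
from itertools import product

# Standard codon table in NCBI TCAG order; the amino acid assignments are
# the same in the bacterial table (table 11).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf_dna):
    """Translate an ORF from its first base up to the first stop codon.

    Trailing bases that do not form a complete codon are ignored.
    Only unambiguous A/C/G/T bases are supported in this sketch.
    """
    dna = orf_dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def predict_co2_status(orf_dna, reference_protein, tail=3):
    """Predict 'independent' only if the product has the reference length
    and the same last `tail` residues; otherwise predict 'dependent'."""
    protein = translate(orf_dna)
    same_length = len(protein) == len(reference_protein)
    same_tail = protein[-tail:] == reference_protein[-tail:]
    return "independent" if same_length and same_tail else "dependent"
```

In this toy rule, a frameshifting insertion changes the length or the C-terminal residues of the product, and a substitution in the last residues (such as a leucine replaced by a proline) changes the tail, so both would be flagged as predicted CO2 dependence.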
We have found three different mutations that caused dependence on added CO2: two independent insertions (C337 and G523) that either cause a premature stop or completely change the C-terminus of the protein, and an SNP that changes a leucine for a proline in the last β-sheet. All bacterial β-CAs crystallized so far are active as dimers or tetramers and inactive as monomers, and all of them have an N-terminal α-helix arm that extends away from the rest of the molecule and makes significant contact with the last β-sheet of an adjacent monomer (Supuran, 2016). In all the cases observed in this work, the mutations do not affect the active site, but all of them potentially change the sequence and structure of the protein at the C-terminus, so the most obvious hypothesis is that it is the modified structure of the proteins that causes the loss of activity. Inactive Brucella CA II proteins either lack the last β-sheet completely or have a very different sequence composition that disrupts this last β-sheet. The substitution of a leucine by a proline in the β-sheet is a particular example of this latter case, as proline is known to be a very disruptive amino acid for both α-helix and β-sheet structures. As these contacts seem to be important for dimerization, we can hypothesize that all the mutations found in CA II will have a strong impact on the dimerization or multimerization of CA II, which will remain as a monomer and lose its activity (which we have defined as that allowing growth in a normal atmosphere). But there is a caveat in this reasoning. We, as well as others (Pérez-Etayo et al., 2018), have been unable to obtain a full-length mutant of CA II, despite being able to obtain one of CA I (data not shown). Moreover, a transposon sequencing analysis shows that CA II is essential, at least for B. abortus 2308 (Sternon et al., 2018). This experiment was apparently carried out without added CO2, so the result is not unexpected.
It would be interesting to know if, performed in the presence of 5-10% CO2, they would have observed insertions only in the C-terminus of the protein, where the mutations in the natural CO2-dependent isolates accumulate. This means that the C-terminal part of the protein still carries out at least some of its functions as a monomer. We have not found any information regarding the activity of β-CAs as monomers, but in the α-CA from Thermovibrio ammonificans, the destabilization of the tetramer by reduction of the cysteines results in the dissociation of the tetrameric molecule into monomers with lower activity and reduced thermostability. It seems reasonable to think that this is the case also for Brucella CA II.
Carbonic anhydrase II catalyzes the fixation of CO2 with high efficiency when forming dimers, but the low efficiency of the carboxylation reaction when acting as a monomer would require the presence of higher amounts of CO2.
A similar situation could be taking place in the case of CA I. Modeling of the structure of Babortus2308 CAI allows us to hypothesize the role of the single residue that differs from Bsuis513 CAI, which has to be responsible for the absence of activity in the former. Its localization close to the Zn atom and to the dimer interface probably results in the destabilization of the dimer, lowering or abolishing its activity. However, it would be necessary to purify and biochemically characterize the monomers of both Babortus2308 CAI and Babortus2308 CAII to confirm our model.
These mutations can only be selected in high-CO2 environments, like those present inside animals, where high CA II activity would be dispensable, as this atmosphere generates enough bicarbonate in solution to fulfill the metabolic requirement of the bacteria (Nishimori et al., 2009). We have determined the frequency of appearance of CO2-independent isolates, and although there is a huge variation between strains, it ranges from 10⁻⁶ to 10⁻¹⁰, as previously described. Despite their low frequency, these mutations were somehow selected in several species and biovars of Brucella, suggesting that they provide some biological advantage. To test this hypothesis, we performed a competitive assay both in vitro and in vivo. This assay resulted in a significant enrichment of the strain carrying an inactive CA in animals, but not on culture plates. Pérez-Etayo et al. (2018) assayed the bacterial loads of B. ovis PA and B. ovis PA Tn7 Ba2308W CAII in the spleens of BALB/c mice at 3 and 8 weeks post infection and found that there was no significant difference between a CO2-dependent strain and its corresponding CO2-independent strain at the level of multiplication in the mouse model. This apparent contradiction with our own results could be due to the different species used or to the different experimental design used to test this hypothesis. When researchers try to determine subtle differences in fitness between two given strains, a competitive assay has a higher discrimination power (Eekels et al., 2012; Shames et al., 2017), as any effect is amplified over time. Although the ultimate reason behind this competitive advantage is currently unknown, it would explain why some strains and biovars of Brucella are dependent on CO2 in primary isolation, despite the low frequency of mutation.
It is also noteworthy that this phenotype is only observed in certain species and biovars, suggesting that the competitive advantage of the CA II mutants only applies to a subset of host/pathogen pairs. As CA II is essential, the mutant strains still would have to produce the protein, and thus, the metabolic gain should be negligible for them. Another possibility would be that the dimer form of the enzyme is too active in a high-CO2 environment and causes a deleterious acidification in the bacteria. By evolving this sophisticated system that reversibly alters the dimerization state of the protein, Brucella is able to adjust to the different requirements encountered during its biological cycle.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

ETHICS STATEMENT
The animal study was reviewed and approved by the Cantabria University Institutional Laboratory Animal Care and Use Bioethics Committee.

AUTHOR CONTRIBUTIONS
FS conceived and coordinated the study, conducted the bacteriology work, and wrote the manuscript. JG analyzed the data and wrote the manuscript. YO, CG-R, AS, and BA-R conducted the bacteriology work. All authors interpreted the data, corrected the manuscript, and approved the content for publication.

FUNDING
This work was supported by grant BFU2011-25658 from the Spanish Ministry of Science and Innovation and by grant 55.JU07.64661 from the University of Cantabria to FS. BA-R was supported by a scholarship from the DGAPA-UNAM PASPA program. The authors acknowledge the help of María J. Lucas and Elena Cabezón in the drawing and interpretation of crystallographic data.