Confirmation of the Need for Reclassification of Neisseria mucosa and Neisseria sicca Using Average Nucleotide Identity Blast and Phylogenetic Analysis of Whole-Genome Sequencing: Hinted by Clinical Misclassification of a Neisseria mucosa Strain

The taxonomy of the genus Neisseria remains confusing, particularly regarding Neisseria mucosa and Neisseria sicca. In 2012, ribosomal multi-locus sequence typing reclassified both as N. mucosa, but data concerning 17 N. sicca strains remain available in GenBank. Continuous progress in high-throughput sequencing has made whole-genome data readily accessible, promoting the development of average nucleotide identity (ANI) and high-resolution phylogenetic analysis. Here, we report that a Neisseria isolate, which caused native-valve endocarditis and multiple embolic brain infarcts in a patient with congenital heart disease, was misidentified as N. sicca by VITEK MS. This isolate was reclassified as N. mucosa by ANI blast (ANIb) and by phylogenetic analysis using whole-genome data yielded by the PacBio Sequel and Illumina NovaSeq PE150 platforms. The confusion evident in the GenBank and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) databases suggests that N. mucosa (n = 13) and N. sicca (n = 16) in GenBank should be reclassified using ANIb and high-resolution phylogenetic analysis. The whole-genome data of 30 strains (including the clinical isolate) were compared with the data of 27 type Neisseria strains (including one N. sicca and two N. mucosa type strains) as a genomic index. In total, 25 (8 originally identified as N. mucosa and 17 originally identified as N. sicca) and 7 (1 originally identified as N. sicca and 6 originally identified as N. mucosa) strains were reclassified into the N. mucosa and Neisseria subflava groups, respectively; 1 residual N. mucosa strain was reclassified as Neisseria meningitidis. In conclusion, a combination of ANIb and robust phylogenetic analysis reclassified strains originally identified as N. mucosa and N. sicca into (principally) the N. mucosa group and the N. subflava group. The misclassified N. sicca and N. mucosa strains in the GenBank and MALDI-TOF MS databases should be corrected, and we recommend that GenBank adopt an updated genomic classification strategy for strains originally identified as N. mucosa and N. sicca.

INTRODUCTION
Neisseria mucosa and Neisseria sicca are both non-pathogenic species that inhabit the human pharynx (Caugant and Brynildsrud, 2020). They are phenotypically similar, although N. mucosa forms mucoid colonies and reduces nitrates, whereas N. sicca forms dry, wrinkled colonies and does not reduce nitrates (Tønjum, 2005). Because both species exhibit variable features, identification to the species level using phenotypic characters is difficult. Genomic analyses are superior to traditional phenotypic approaches (Paul et al., 2019), and whole-genome sequencing (WGS) is now accessible and affordable (Zong, 2020). On the basis of ribosomal multi-locus sequence typing (rMLST) findings, Bennett et al. (2012) proposed that N. sicca was a variant of N. mucosa. Using a phylogenetic taxonomic method, Caugant and Brynildsrud (2020) suggested that N. sicca and N. mucosa did not form monophyletic groups and might constitute a single species. Although the N. sicca strains evaluated by Bennett et al. (2012) have been corrected in GenBank, a number of genomes labeled N. sicca remain.
Current WGS-based, high-resolution genomic taxonomic methods include average nucleotide identity blast (ANIb) and phylogenetic analysis. ANIb, calculated over all genes conserved between two genomes, measures the evolutionary distance between the strains (Altschul et al., 1997; Konstantinidis and Tiedje, 2005). Phylogenetic analysis uses gene sequences to infer the evolutionary pattern (Staley, 2006). The two approaches are complementary and are often used together. For example, among 181 Neisseria isolates examined, seven putative novel Neisseria species were identified by ANI and phylogenetic analysis (Diallo et al., 2019).
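As a rough illustration of what an ANIb value summarizes, the sketch below averages the identities of BLASTN hits obtained when fragments of one genome are searched against another. The 1,020-nt fragment length and the 30% identity and 70% alignment-coverage cut-offs follow the commonly cited ANIb convention and are assumptions here, not settings reported in this study; the `FragmentHit` container and the toy numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

FRAGMENT_LENGTH = 1020  # assumed ANIb fragment size (nt); not stated in this study


@dataclass
class FragmentHit:
    """Best BLASTN hit of one query fragment (hypothetical container)."""
    identity: float       # percent identity of the alignment, 0-100
    aligned_length: int   # number of fragment bases covered by the alignment


def anib(hits: Iterable[FragmentHit],
         min_identity: float = 30.0,
         min_coverage: float = 0.70) -> Optional[float]:
    """Mean percent identity of the fragments that pass the assumed filters."""
    kept = [
        h.identity
        for h in hits
        if h.identity >= min_identity
        and h.aligned_length / FRAGMENT_LENGTH >= min_coverage
    ]
    # With no qualifying fragment, the genomes share too little to compare.
    return sum(kept) / len(kept) if kept else None


# Toy data: the third fragment aligns over too short a stretch and is dropped.
toy_hits = [
    FragmentHit(identity=97.0, aligned_length=1020),
    FragmentHit(identity=95.0, aligned_length=900),
    FragmentHit(identity=99.0, aligned_length=300),
]
print(anib(toy_hits))
```

In practice the calculation is usually run in both directions (genome A against B and B against A), and the two values need not be identical.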
Here, we report the case of the clinical isolate SAMN18451419, which was identified as N. sicca by VITEK MS; it was reclassified as N. mucosa by ANIb and phylogenetic analysis (via WGS using the PacBio Sequel and Illumina NovaSeq PE150 platforms). This Neisseria strain was isolated from a patient with infective native-valve endocarditis and multiple embolic brain infarcts. This case and the findings by Bennett et al. (2012) and Caugant and Brynildsrud (2020) suggest that N. sicca and N. mucosa might be the same species. We thus used phylogenetic and ANIb analysis to reassign the "N. mucosa" and "N. sicca" strains in GenBank to the correct groups.

Ethics Approval Statement
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Clinical Research Ethics Committee of the First Affiliated Hospital, Zhejiang University School of Medicine [approval no. 2021IIT026 (fast track)].

Sampling and Clinical Data
A Gram-negative coccus was isolated from the blood culture of a patient with an intermittent fever that had persisted for 3 weeks. Clinical data were retrieved from the patient's medical record.

Bacterial Isolation and Clinical Identification
The isolate was plated on blood and chocolate agar (Autobio, China) plates at 35°C under 5% (v/v) CO2 before testing (Humphries et al., 2021). Single colonies were selected from the plates. For phenotypic characterization, 3% (v/v) hydrogen peroxide and tetramethyl-p-phenylenediamine hydrochloride were used in the catalase and oxidase assays, respectively. For molecular characterization, the isolate was submitted to matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) using a VITEK MS platform (bioMérieux) and VITEK 2 Compact software (bioMérieux).

Whole-Genome Sequencing and Analysis
The Neisseria strain described earlier was subjected to WGS. Genomic DNA was extracted using a commercial kit (QIAGEN Gentra Puregene Yeast/Bact Kit, Germany) in accordance with the manufacturer's instructions. The library for single-molecule real-time (SMRT) sequencing was constructed (insert size 10 kb) using the SMRTbell Template kit version 1.0; a next-generation sequencing library was generated using the NEBNext Ultra DNA Library Prep Kit for Illumina, in accordance with the manufacturer's recommendations. Whole-genome sequences were obtained using the PacBio Sequel and Illumina NovaSeq PE150 platforms at Beijing Novogene Bioinformatics Technology Co., Ltd. Clean reads were assembled using SMRT Link ver. 5.1.0 (Supplementary Figure 1).

Data Availability
The complete genome data of strain SAMN18451419 have been deposited in GenBank with the accession number CP072524.

Identification of Virulence-Associated Genes of Strain SAMN18451419
Virulence genes were identified using the Virulence Factor Database.

PubMLST Analysis of Strain SAMN18451419
After draft genome assembly, we used multi-locus sequence typing (MLST) for taxonomic identification. The assembled scaffolds were submitted to the Neisseria MLST website (Jolley et al., 2018).
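The logic of assigning a sequence type can be sketched as a lookup of the isolate's allelic profile across the seven housekeeping loci of the Neisseria MLST scheme. The profile table, allele numbers, and sequence-type numbers below are invented for illustration and do not come from PubMLST; a profile with no match corresponds to a new sequence type.

```python
from typing import Dict, Optional, Tuple

# The seven housekeeping loci of the Neisseria MLST scheme.
LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")

# Hypothetical profile table: allele numbers per locus -> sequence type (ST).
KNOWN_PROFILES: Dict[Tuple[int, ...], int] = {
    (10, 20, 30, 40, 50, 60, 70): 9001,
    (11, 21, 31, 41, 51, 61, 71): 9002,
}


def assign_st(alleles: Dict[str, int]) -> Optional[int]:
    """Return the ST for an allelic profile, or None if the profile is novel."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return KNOWN_PROFILES.get(profile)


# Made-up isolate whose combination of alleles is absent from the table.
isolate = {"abcZ": 10, "adk": 21, "aroE": 30, "fumC": 41,
           "gdh": 50, "pdhC": 61, "pgm": 70}
st = assign_st(isolate)
print("new sequence type" if st is None else f"ST-{st}")
```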

Phylogenetic Analysis
To characterize the evolutionary relationships among Neisseria isolates, we built a core genome-based phylogenetic tree from the following genomes: strain SAMN18451419, previously identified N. mucosa (n = 13) and N. sicca (n = 16), and 27 type Neisseria genomes (Table 1). All genomes were annotated using RAST. The core genes in Neisseria genomes were identified using RAST and Roary. A maximum-likelihood phylogenetic tree based on the core single-nucleotide polymorphism alignments was generated using MEGA X (Kumar et al., 2018). Phylogenetic tree visualizations were produced with the Interactive Tree of Life.
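To make the "core single-nucleotide polymorphism alignment" step concrete, the sketch below keeps only those columns of an aligned core genome that vary among strains and contain no gaps or ambiguous bases. It is a simplified stand-in, not the Roary or MEGA X implementation; the function name and toy sequences are hypothetical.

```python
from typing import Dict, List


def core_snp_columns(alignment: Dict[str, str]) -> Dict[str, str]:
    """Reduce an aligned core genome to its variable, unambiguous columns."""
    names = list(alignment)
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    snps: Dict[str, List[str]] = {name: [] for name in names}
    for column in zip(*(alignment[name].upper() for name in names)):
        bases = set(column)
        # Keep a column only if every base is A/C/G/T and at least two differ.
        if bases <= set("ACGT") and len(bases) > 1:
            for name, base in zip(names, column):
                snps[name].append(base)
    return {name: "".join(bases) for name, bases in snps.items()}


toy_alignment = {
    "strain_A": "ATGCCGTA-A",
    "strain_B": "ATGTCGTAGA",
    "strain_C": "ATGTCGAAGA",
}
print(core_snp_columns(toy_alignment))
```

Only the fourth and seventh columns of the toy alignment survive: the ninth differs among strains but contains a gap, so it is not treated as a core SNP.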

Average Nucleotide Identity Blast Analysis
ANIb analysis was performed using pyani. The following genomes were analyzed: strain SAMN18451419, previously identified N. mucosa (n = 13) and N. sicca (n = 16), and 27 type Neisseria strains. Pairwise ANIb values were clustered and visualized as a heatmap. All software mentioned above was run with default settings.
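One way to read a pairwise ANIb matrix is to link genomes whose identity reaches a species-level cut-off and then inspect the resulting groups. The sketch below does this with single-linkage grouping. The 95% cut-off is a commonly used convention assumed here (this study reports clustering on a heatmap, not a fixed threshold), and the genome labels and ANI values are made up.

```python
from typing import Dict, List, Set, Tuple


def group_by_ani(ani: Dict[Tuple[str, str], float],
                 threshold: float = 95.0) -> List[Set[str]]:
    """Single-linkage groups of genomes connected by ANI >= threshold."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Union-find lookup with path halving; registers unseen genomes.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in ani.items():
        root_a, root_b = find(a), find(b)
        if value >= threshold and root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups: Dict[str, Set[str]] = {}
    for genome in list(parent):
        groups.setdefault(find(genome), set()).add(genome)
    return sorted(groups.values(), key=sorted)


# Hypothetical pairwise values (percent identity).
toy_ani = {
    ("mucosa_type", "sicca_type"): 96.8,
    ("mucosa_type", "isolate_X"): 97.1,
    ("sicca_type", "isolate_X"): 96.5,
    ("mucosa_type", "subflava_type"): 86.0,
    ("sicca_type", "subflava_type"): 85.7,
    ("isolate_X", "subflava_type"): 85.9,
}
for group in group_by_ani(toy_ani):
    print(sorted(group))
```

Single linkage can chain borderline genomes into one group, so a thresholded grouping of this kind is best read alongside the full heatmap and the phylogenetic tree.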

Case Presentation
A 52-year-old man with a recurrent fever was admitted (day 0). A schematic of the disease course is shown in Figure 1. Before admission, he had developed non-specific symptoms, including high fever, fatigue, anorexia, nausea, and vomiting (all persisted for 3 weeks). Considering the high leukocyte count (15.9 × 10⁹/L) and C-reactive protein level (144.1 mg/L), despite the absence of an obvious infective focus, empirical outpatient treatment (azlocillin sodium) was administered before admission; this treatment was ineffective. Empirical ticarcillin disodium and clavulanate potassium were prescribed from days 0 to 5. A Gram-negative diplococcus was isolated from blood cultures on day 1. Echocardiography on day 4 revealed vegetations attached to the anterior and posterior leaflets of the mitral valve, with dimensions of approximately 19 × 10 and 12 × 7.8 mm, respectively (Figure 1B). Cranial magnetic resonance imaging (MRI) on day 6 showed multiple abnormal signals in the left occipital lobe and on both sides of the ventricle (Figure 1C). The antibiotics were switched to ceftriaxone and amoxicillin sodium clavulanate potassium, based on the drug sensitivities of the isolate (minimal inhibitory concentration of ceftriaxone ≤0.12 µg/ml; recorded from days 6 to 13). Cranial contrast-enhanced MRI revealed abnormal enhancement of punctate, patchy, and ring-shaped lesional regions (compared with the day 6 MRI image); infectious lesions were first considered on day 10 (Figure 1D). The body temperature returned to the normal level, followed by downward trends in the indicators of inflammation from days 1 to 10. However, a relapse developed on day 11. Follow-up heart ultrasonography showed that the vegetations attached to the mitral valve had shrunk (Figure 1E). The patient was then transferred to our main campus for further treatment. Piperacillin sodium and tazobactam sodium were prescribed from days 13 to 16. However, the high fever persisted.
Follow-up ultrasonography of the heart, as well as cranial MRI, yielded results similar to the findings on day 15. The antibiotics were changed to ceftriaxone sodium and compound sulfamethoxazole from days 16 to 38. Symptoms of cerebral infarction manifested on day 17. Subsequent blood cultures were negative. Cranial contrast-enhanced computed tomography (CT) on day 18 revealed low-density foci in the left frontal, parietal, and occipital lobes, with minor hemorrhage after infarction (Figure 1F). On day 25, cranial CT showed that the focus of ischemic infarction had expanded relative to day 18 and that new low-density lesions had developed in the left cerebellar hemisphere (Figure 1G). On day 33, mitral valve replacement, tricuspid valvuloplasty, and left atrium folding were performed after preoperative transesophageal echocardiography had revealed a highly echoic mass (dimensions of approximately 1.07 × 1.08 × 1.69 cm) attached to the anterior mitral valve leaflet (Figure 1H). Histology revealed valvular connective tissue hyperplasia and collagenization, mucus changes, calcification, and both acute and chronic inflammatory cell infiltration and necrosis (Figure 1I). Follow-up ultrasonography of the heart (after the operation) revealed normal mechanical valve function without any obvious paravalvular leakage. On day 39, follow-up cranial CT showed that the low-density focus had been absorbed; the size was reduced relative to the size on day 31.
The patient was discharged to home on anticoagulant therapy and scheduled for follow-up outpatient review 1 month later.
FIGURE 2 | Phylogenetic analysis of 57 Neisseria strains. The maximum-likelihood tree shows the phylogeny of the genus Neisseria based on WGS. The phylogenetic tree was generated for clinical isolate SAMN18451419, as well as 13 N. mucosa strains, 16 N. sicca strains, and 27 Neisseria type strains (including two N. mucosa and one N. sicca type strains). Red indicates clinical isolate SAMN18451419. WGS, whole-genome sequencing.
Frontiers in Microbiology | www.frontiersin.org

Clinical Phenotypic Identification of Strain SAMN18451419
The blood and chocolate agar plates both grew off-white Gram-negative diplococci (Figure 1A). The isolates were positive for both catalase and oxidase. Isolates from both plates were identified as N. sicca by VITEK MS and VITEK 2 Compact with 99.9% confidence.

General Genomic Features of Strain SAMN18451419
Sequencing of strain SAMN18451419 revealed a genome size of 2,566,407 bp with a G+C content of 51.1%. There was one scaffold featuring 2,295 protein-encoding genes, 61 transfer RNAs, and 12 ribosomal RNAs (Supplementary Figure 2).
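For reference, G+C content is the percentage of G and C bases among all unambiguous bases. A minimal sketch on a toy sequence (not the actual genome; the function name is hypothetical):

```python
def gc_content(sequence: str) -> float:
    """Percent G+C among A/C/G/T bases; ambiguous symbols such as N are ignored."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


# Toy sequence: 4 of its 8 unambiguous bases are G or C.
print(round(gc_content("ATGCGCNNTA"), 1))
```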

Identification of Virulence-Associated Genes in Strain SAMN18451419
The genome encoded 21 putative virulence factors (Supplementary Figure 2).

PubMLST Analysis of Strain SAMN18451419
Strain SAMN18451419 was identified as a new sequence type using the Neisseria public databases for molecular typing and microbial genome diversity (PubMLST).

Phylogenetic Analysis of N. mucosa and N. sicca
We explored the phylogenetic relationships among N. mucosa, N. sicca, and the 27 Neisseria type strains based on the core genomes obtained via WGS. Clear separation of the strains listed in Table 1 was evident in the phylogenetic tree (Figure 2). Eight strains originally identified as N. mucosa (including two N. mucosa type strains ATCC 19696 and NCTC 10774) and 17 strains originally identified as N. sicca (including the clinical isolate SAMN18451419 and one N. sicca type strain ATCC 29256) clustered into the N. mucosa group. Six strains originally identified as N. mucosa and one strain originally identified as N. sicca clustered into the N. subflava group. Notably, strain SAMEA104189262, originally classified as N. mucosa, clustered with the N. meningitidis type strains NCTC 3372 and ATCC 13091.
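The strain counts reported here can be cross-checked with a short tally. The numbers are taken from the text, with the type strains and the clinical isolate added to the GenBank strains carrying the same original label; the dictionary layout is only an illustrative choice.

```python
# Genomes carrying each original label, including type strains and the clinical isolate.
original_totals = {
    "N. mucosa": 13 + 2,     # 13 GenBank strains + 2 type strains
    "N. sicca": 16 + 1 + 1,  # 16 GenBank strains + clinical isolate + 1 type strain
}

# Phylogenetic placement -> number of strains by original label (from the text).
reassigned = {
    "N. mucosa group": {"N. mucosa": 8, "N. sicca": 17},
    "N. subflava group": {"N. mucosa": 6, "N. sicca": 1},
    "N. meningitidis": {"N. mucosa": 1, "N. sicca": 0},
}

# Every genome with a given original label should be placed exactly once.
for label, total in original_totals.items():
    placed = sum(counts[label] for counts in reassigned.values())
    print(f"{label}: {placed} of {total} genomes placed")

group_sizes = {group: sum(counts.values()) for group, counts in reassigned.items()}
print(group_sizes)
```

The group sizes sum to 33, that is, the 30 study strains plus the three N. mucosa and N. sicca type strains.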

Average Nucleotide Identity Blast Analysis of N. mucosa and N. sicca
Consistent with the results of phylogenetic analysis, the eight strains originally identified as N. mucosa (including two N. mucosa type strains ATCC 19696 and NCTC 10774) and the 17 strains originally identified as N. sicca (including the clinical isolate SAMN18451419 and one N. sicca type strain ATCC 29256) co-clustered. Six strains originally identified as N. mucosa and one strain originally identified as N. sicca clustered with the N. subflava type strain ATCC 49275. One strain originally identified as N. mucosa (SAMEA104189262) clustered with two N. meningitidis type strains NCTC 3372 and ATCC 13091 (Figure 3). Based on the combination of robust phylogenetic and ANIb analyses, the proposed reclassification of previously identified N. mucosa and N. sicca strains is listed in Table 2.

DISCUSSION
N. mucosa and N. sicca are genetically closely related and have a contentious taxonomic history [described in Bergey's Manual of Systematic Bacteriology and addressed by Bennett et al. (2012)]. Traditional bacterial taxonomic assignments use phenotypic approaches; these are generally unsatisfactory because  (Brodie et al., 1971). WGS has greatly assisted bacterial taxonomy. Data ranging from small regions (from which 16S ribosomal RNAs are transcribed) to the entire genome are readily available (Mechergui et al., 2014). Modern genomic analysis approaches based on WGS, rMLST, cgMLST, ANIb, and phylogenetic analysis have flourished recently (Bennett et al., 2012; Zong, 2020). We used phylogenetic analysis to reclassify the originally (mis)identified N. sicca and N. mucosa strains into the N. mucosa group, confirming the findings by Bennett et al. (2012). The term "N. mucosa group" is used because N. mucosa [originally termed Diplococcus mucosus by Von Lingelsheim in 1906 (Bennett et al., 2012)] was the first such species to be identified (Tønjum, 2005). The N. subflava group, in which one N. sicca, six N. mucosa, one N. subflava, and two N. flavescens strains co-clustered in our results, was named in accordance with the same principle, as in the work by Bennett et al. (2012). Only one strain originally identified as N. sicca belonged in the N. subflava group in the work by Bennett et al. (2012), whereas we found that one strain originally identified as N. sicca and six strains originally identified as N. mucosa clustered into that group. The reason may be that the six N. mucosa strains analyzed in our study, most of whose sequences were uploaded to GenBank after 2012, were not included by Bennett et al. (2012). According to our results, some strains originally identified as N. mucosa and N. sicca appear to be variants of N. subflava. The N. mucosa and N. subflava groups of the phylogenetic tree also clustered in the ANIb heatmap, indicating that ANIb analysis supported our initial conclusion.
The genus Neisseria includes two pathogenic species, N. meningitidis and N. gonorrhoeae; the remaining species are opportunistic pathogens (Caugant and Brynildsrud, 2020). Compared with the opportunistic Neisseria species, N. meningitidis causes significant morbidity and mortality (Wang et al., 2019). Notably, one strain originally identified as N. mucosa was reclassified as N. meningitidis. This shows that some GenBank genomes have not been curated or checked, creating taxonomic errors. We also found that strain SAMN18451419, from a patient with congenital heart disease, was misidentified by VITEK MS as N. sicca. Together, these findings indicate that both the MS and GenBank databases require revision of their Neisseria classification.

CONCLUSION
We offer a detailed and well-supported description of the phylogenetic relationship between the strains (erroneously) classified as N. mucosa and N. sicca, as confirmed by ANIb analysis. These strains were reclassified into the N. mucosa group and N. subflava group via high-resolution genomic taxonomy. The GenBank and MALDI-TOF MS databases thus require correction. In addition, the classification strategy for strains originally identified as N. mucosa and N. sicca in GenBank should be updated as genomic analysis approaches continue to develop.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repositories and accession numbers can be found in the article/Supplementary Material.

ETHICS STATEMENT
The study was conducted in accordance with the Declaration of Helsinki and was approved [2021IIT026 (fast track)] by the Clinical Research Ethics Committee of the First Affiliated Hospital, Zhejiang University School of Medicine. The patient provided written informed consent to participate in this study. Written informed consent was obtained from the individual for the publication of any potentially identifiable images or data included in this article.