Evaluation of a Novel MALDI Biotyper Algorithm to Distinguish Mycobacterium intracellulare From Mycobacterium chimaera

Accurate and timely mycobacterial species identification is imperative for successful diagnosis, treatment, and management of disease caused by nontuberculous mycobacteria (NTM). The current most widely utilized method for NTM species identification is Sanger sequencing of one or more genomic loci, followed by BLAST sequence analysis. MALDI-TOF MS offers a less expensive and increasingly accurate alternative to sequencing, but the commercially available assays used in clinical mycobacteriology cannot differentiate between Mycobacterium intracellulare and Mycobacterium chimaera, two closely related potentially pathogenic species of NTM that are members of the Mycobacterium avium complex (MAC). Because this differentiation of MAC species is challenging in a diagnostic setting, Bruker has developed an improved spectral interpretation algorithm to differentiate M. chimaera and M. intracellulare based on differential spectral peak signatures. Here, we utilize a set of 185 MAC isolates that have been characterized using rpoB locus sequencing followed by whole genome sequencing in some cases, to test the accuracy of the Bruker subtyper software to identify M. chimaera (n = 49) and M. intracellulare (n = 55). 100% of the M. intracellulare and 82% of the M. chimaera isolates were accurately identified using the MALDI Biotyper algorithm. This subtyper module is available with the MALDI Biotyper Compass software and offers a promising mechanism for rapid and inexpensive species determination for M. chimaera and M. intracellulare.


INTRODUCTION
The use of Matrix Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) for identification of microbial specimens has scaled up dramatically over the past decade due to the continual improvement of available tools, including expanded spectral peak databases [reviewed in Doern and Butler-Wu (2016); Rahi et al. (2016); and Alcaide et al. (2018)].
Species identification of bacteria is now honed to strain-specific and subspecies distinctions in many cases (Neville et al., 2011; Huang et al., 2018), and the technique has been extended to the characterization of nonbacterial organisms such as fungi (Chalupova et al., 2014; Levesque et al., 2015). These approaches are relatively rapid, inexpensive, and accurate for use in clinical diagnostics (Neville et al., 2011), environmental sample analysis, food safety, and other applications (Carbonnelle et al., 2011; Florio et al., 2018). MALDI-TOF MS is increasingly utilized to distinguish clinically relevant nontuberculous mycobacteria (NTM) (Levesque et al., 2015; Costa-Alcalde et al., 2018).
The Mycobacterium avium complex (MAC) comprises several clinically important mycobacterial species including M. avium (subsp. hominissuis), M. intracellulare, and M. chimaera (Diel et al., 2018; Forbes et al., 2018), among others [e.g., M. colombiense (Murcia et al., 2006) and M. yongonense (Castejon et al., 2018)]. Although M. avium and M. intracellulare species are more frequently observed in clinical cases, a recent series of M. chimaera infections originating in cardiac surgery suites (van Ingen et al., 2017) illuminates the importance of identifying members of the MAC at the species level in a rapid and accurate manner. While the spectrum of M. avium is relatively distinct from the other two species, M. intracellulare and M. chimaera are highly similar to each other (van Ingen et al., 2012), and not readily distinguished using the standard approaches of most MALDI-TOF peak interpretation algorithms (Boyle et al., 2015a; Leyer et al., 2017). Pulmonary infections caused by the different MAC species are distinct in their virulence and clinical features, indicating that species-level identification is important for effective prognostics (Boyle et al., 2015b; Kim et al., 2017). Bruker Daltonik (Bremen, Germany) developed a commercially available algorithm to differentiate M. chimaera and M. intracellulare from each other using only MALDI-TOF peak data. This algorithm was found to perform well against 59 bacterial isolates of European origin using the internal transcribed spacer (ITS) sequence to identify species (Pranada et al., 2017). Here we set out to evaluate the same MALDI algorithm against 111 isolates of United States origin. Sequence from the rpoB locus was used to determine species, but in 16 discrepant cases, the isolates were whole genome sequenced and species identity was determined using phylogenomics.

MATERIALS AND METHODS
NTM Species Determination Using Sanger Sequencing
All samples were of clinical origin within the United States, either bronchoalveolar lavage, sputum, or tissue. Isolates were sent to National Jewish Health for NTM species identification and/or antimicrobial susceptibility testing. DNA from 300 isolated cultures of acid-fast bacilli was analyzed by Sanger sequencing a targeted 711 base pair region of the rpoB locus on an ABI 3730xL, and the resulting sequences were queried against the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) database. The rpoB target is one of several recommended for NTM species determination according to current national clinical microbiology guidelines (Forbes et al., 2018); other targets that are utilized for this purpose are 16S rRNA, secA, ITS, and hsp65 (Lecorche et al., 2018). From this collection, 185 isolates were identified as one of the MAC species according to rpoB sequence, i.e., M. avium (n = 74), M. intracellulare (n = 55), or M. chimaera (n = 56).

MALDI-TOF MS
MAC strains were subcultured onto Middlebrook 7H11 agar media and incubated for 7-10 days to obtain sufficient biomass for MALDI-TOF MS identification. Colonies from the 7H11 plates were harvested and heat-killed in high performance liquid chromatography (HPLC) grade water, followed by an acetonitrile/formic acid extraction according to the manufacturer's protocol. The MALDI-TOF target was spotted with 1 µL of extract supernatant and overlaid with 1 µL of α-Cyano-4-hydroxycinnamic acid (HCCA) matrix prior to data capture using the Bruker MALDI-TOF Biotyper system (software v4.0). A minimum score of 1.8 was required. The complete MALDI spectrum dataset was also sent to Bruker Daltonik in Bremen, Germany, for parallel spectrum analysis.
For samples identified by MALDI-TOF MS as M. chimaera-intracellulare group, a second-tier analysis was performed using a recently developed research subtyping software from Bruker Daltonik [(Pranada et al., 2017), herein referred to as MBT Subtyping Module]. One sample, NTM-184, was eliminated from the sensitivity and specificity calculations because the MALDI results placed it in the M. chimaera-intracellulare group, but the rpoB result was M. avium, and this sample did not have whole genome sequence (WGS) information (see below). An additional seven samples, six of which were identified as M. chimaera using rpoB, were also eliminated from these calculations because the MBT Subtyping Module did not yield a result. These strains are listed in Table 1 under the MBT Subtyper as "M. chim/M. int." It should be noted that the MBT Subtyping Module returned a high score on these seven samples, but because it was unable to distinguish the two species, it did not make a call in these cases.
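The two-tier workflow described above can be summarized as a small decision function. The following is only an illustrative sketch: the function name, the label strings, and the handling of scores below 1.8 are assumptions made for this example and do not represent the Bruker software interface.

```python
from typing import Optional

MIN_SCORE = 1.8  # minimum Biotyper score stated in the text
GROUP_LABEL = "M. chimaera-intracellulare group"


def two_tier_call(first_tier_label: str, score: float,
                  subtyper_call: Optional[str]) -> str:
    """Return a final label for one isolate (hypothetical helper).

    subtyper_call is None when the subtyping step does not make a call.
    """
    if score < MIN_SCORE:
        # Assumption: a score below the threshold yields no identification.
        return "no reliable identification"
    if first_tier_label != GROUP_LABEL:
        # For example, M. avium is resolved by the first tier alone.
        return first_tier_label
    if subtyper_call is None:
        # Unresolved isolates are reported as "M. chim/M. int." in Table 1.
        return "M. chim/M. int."
    return subtyper_call
```

Under this sketch, only isolates placed in the M. chimaera-intracellulare group reach the second tier, and an isolate without a subtyper call keeps the unresolved label and is excluded from the accuracy calculations.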

Phylogenomic Analysis of Ambiguous NTM Strains
For almost all samples, the rpoB species identification was considered to be the actual species of that strain. In a few cases, samples yielded discrepant results, and for these samples, WGS was used to assign a species identification to a given mycobacterial strain. Genomic DNA was isolated according to a protocol adapted from Käser et al. (2010), employing a column-based DNA cleanup in lieu of a phenol-chloroform extraction and alcohol precipitation. Genomic libraries were constructed using Nextera XT and sequenced using Illumina chemistry (Illumina, Inc., San Diego, CA).
Illumina reads were trimmed of adapters and low quality bases (< Q20) using Skewer (Jiang et al., 2014). Trimmed reads were assembled into scaffolds using Unicycler (Wick et al., 2017), and genome assemblies were compared against a selection of reference genomes to calculate average nucleotide identities (ANI, chjp/ANI on GitHub.com, Supplementary Table S1) and to assign a species call to each isolate (Goris et al., 2007; Richter and Rossello-Mora, 2009). Trimmed reads and background reference genomes were mapped to the M. chimaera CDC 2015-22-71 genome (Hasan et al., 2017) using Bowtie2 software (Langmead and Salzberg, 2012). Single nucleotide polymorphisms (SNPs) were called using the SAMtools mpileup program (Li et al., 2009), and a multi-fasta sequence alignment was created from concatenated basecalls from all strains (Davidson et al., 2014; Page et al., 2016). Resulting sequences were used to make a maximum likelihood (ML) phylogenetic tree using RAxML-NG with a GTR+G substitution model (Kozlov et al., 2018) and 1,000 bootstrap replicates. The tree was annotated and visualized with ggtree (Yu et al., 2017).
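The step of building a multi-fasta alignment from concatenated basecalls can be illustrated with a short sketch. This is not the pipeline used in the study: the function names and the toy sequences are hypothetical, and skipping columns that contain an ambiguous base ("N") is an assumption of this example.

```python
from typing import Dict, List


def variable_site_alignment(calls: Dict[str, str]) -> Dict[str, str]:
    """Keep only the columns at which at least two strains differ.

    calls maps a strain name to its basecalls at the same reference
    positions (equal-length strings). Columns containing 'N' are skipped,
    which is an assumption of this sketch.
    """
    lengths = {len(seq) for seq in calls.values()}
    if len(lengths) != 1:
        raise ValueError("all strains must cover the same positions")
    keep: List[int] = []
    for i in range(lengths.pop()):
        column = {seq[i] for seq in calls.values()}
        if "N" not in column and len(column) > 1:
            keep.append(i)
    return {name: "".join(seq[i] for i in keep) for name, seq in calls.items()}


def to_fasta(alignment: Dict[str, str]) -> str:
    """Serialize the alignment as multi-fasta text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in alignment.items())


# Toy basecalls (hypothetical, for illustration only).
toy = {"ref": "ACGTAC", "strain_A": "ACGTTC", "strain_B": "ATGTNC"}
snp_alignment = variable_site_alignment(toy)
```

In the toy input only the second column is retained, because the fifth column contains an ambiguous call and the remaining columns are invariant. The resulting multi-fasta text is the kind of input a maximum likelihood tree builder would take.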

Comparison of WGS to Single Locus Species Determination
Using primer sequences of known targets, three regions were extracted from the WGS data. These were previously targeted regions from the following loci: 16S rDNA (Springer et al., 1996; van Ingen et al., 2012), the 16S-23S internal transcribed spacer (ITS) (Schweickert et al., 2008), and rpoB (Adekambi et al., 2003; Ben Salah et al., 2008). These sequences were used to BLAST against the non-redundant NCBI database and the top species call or calls are reported in Table 1.
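Extracting a locus from an assembly with a primer pair amounts to an in silico amplification. The sketch below assumes exact primer matches on a single strand and uses invented toy sequences. The actual primers, any mismatch tolerance, and the tool used in the study are not specified here.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_amplicon(contig: str, forward: str, reverse: str) -> Optional[str]:
    """Return the region spanning both primer sites, or None if not found.

    Exact matching on the given strand only (a simplification).
    """
    start = contig.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on
    # this strand is its reverse complement.
    rev_site = reverse_complement(reverse)
    end = contig.find(rev_site, start + len(forward))
    if end == -1:
        return None
    return contig[start:end + len(rev_site)]


# Toy contig and primers (hypothetical sequences).
contig = "TTTACGGACCCCTTTTGATTCAAA"
amplicon = extract_amplicon(contig, forward="ACGGA", reverse="GAATC")
```

The extracted region, primers included, would then be submitted as the BLAST query. A production version would also need to search the opposite strand and every scaffold in the assembly.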

RESULTS
From the initial strain collection, 185 isolates were identified by rpoB amplicon sequencing as one of the three prominent clinical MAC species, specifically 74 M. avium, 55 M. intracellulare, and 56 M. chimaera. These 185 strains were then analyzed using MALDI-TOF MS at National Jewish Health, in Denver, Colorado, United States and the data were re-analyzed in Bremen, Germany, yielding identical results according to the Biotyper 4.0 software. Of the 74 M. avium isolates, 73 were identified as M. avium. A single M. avium sample (as identified by rpoB) and the remaining 111 samples were all identified as being in the M. chimaera-intracellulare group.
These 112 strains were then subjected to a second tier of analysis wherein their MALDI-TOF MS spectra were evaluated using the recently developed MBT Subtyping Module software; see Methods (Pranada et al., 2017). One sample, NTM-006, was identified as M. chimaera by rpoB but as M. intracellulare by WGS, thus the sample was categorized as M. intracellulare. This was the only sample for which WGS yielded a result that differed from rpoB. In this case, the WGS result was considered to be the actual species for calculations. After removing the eight samples that were unconfirmed (see Methods), the samples numbered 55 M. intracellulare and 49 M. chimaera that were suitable to evaluate the MBT Subtyping Module algorithm for its ability to distinguish these two species.
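The sample accounting in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only counts reported in the text. It assumes that the seventh sample without a subtyper call was M. intracellulare and that NTM-006 was not among the seven excluded samples. Both assumptions are inferred from the reported totals and are not stated explicitly in the text.

```python
# rpoB-based counts for the two species of interest.
reference = {"M. intracellulare": 55, "M. chimaera": 56}

# NTM-006: M. chimaera by rpoB, M. intracellulare by WGS; WGS takes priority.
reference["M. chimaera"] -= 1
reference["M. intracellulare"] += 1

# Seven samples had no subtyper call; six were M. chimaera by rpoB.
# Assumption: the seventh was M. intracellulare.
reference["M. chimaera"] -= 6
reference["M. intracellulare"] -= 1

# NTM-184 (M. avium by rpoB, no WGS) is the eighth excluded sample; it is
# part of the 112 second-tier strains but not of these two categories.
second_tier_total = 112
excluded = 7 + 1
evaluated = second_tier_total - excluded
```

The tally ends at 55 M. intracellulare and 49 M. chimaera, which together account for the 104 evaluated samples.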
Sixteen isolates were selected for WGS. These strains were investigated in greater depth using extracted sequences of specific loci (Table 1) in addition to phylogenomic tree analysis using the WGS data (Figure 1). The specific loci were selected to reflect the regions that are most commonly used for mycobacterial species identification. For most Mycobacteria spp., these loci are useful for identifying to the NTM species level. But the close relationship of M. chimaera and M. intracellulare is apparent in these genes since the two species' sequences over these regions are largely interchangeable, consistent with the initial confounding observations that led to the naming of M. chimaera (Tortoli et al., 2004). Based on these extracted sequences, the rpoB fragment was most reflective of the WGS result (Table 1).
For the 49 samples identified by sequencing as M. chimaera, the MBT Subtyping Module identified 40 as M. chimaera and 9 as M. intracellulare for a sensitivity of 81.6%. However, because all of the Subtyper M. chimaera calls were in fact M. chimaera according to whole genome sequencing, the positive predictive value for a call of M. chimaera is 100%. The M. intracellulare (n = 55) data differed in that all 55 samples were identified as M. intracellulare for a sensitivity of 100%, but nine of the M. chimaera samples were called M. intracellulare by the Subtyping software, reducing the specificity, and the positive predictive value is 86% or 55/(55+9). This indicates that a minority (∼18%, or 9/49) of the true M. chimaera samples would be recognized as M. intracellulare using the MALDI Biotyper Compass software.
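The performance figures above follow directly from the reported counts, as the short calculation below shows. The variable and function names are chosen for this example. Specificity is computed per species by treating the other species as the negative class.

```python
# Counts reported in the text (reference species vs. subtyper call).
chim_called_chim = 40  # true M. chimaera called M. chimaera
chim_called_int = 9    # true M. chimaera called M. intracellulare
int_called_int = 55    # true M. intracellulare called M. intracellulare
int_called_chim = 0    # true M. intracellulare called M. chimaera


def ratio(numerator: int, denominator: int) -> float:
    return numerator / denominator


# M. chimaera as the positive class.
sens_chim = ratio(chim_called_chim, chim_called_chim + chim_called_int)
ppv_chim = ratio(chim_called_chim, chim_called_chim + int_called_chim)

# M. intracellulare as the positive class.
sens_int = ratio(int_called_int, int_called_int + int_called_chim)
ppv_int = ratio(int_called_int, int_called_int + chim_called_int)
spec_int = ratio(chim_called_chim, chim_called_chim + chim_called_int)

for name, value in [("sensitivity, M. chimaera", sens_chim),
                    ("PPV, M. chimaera", ppv_chim),
                    ("sensitivity, M. intracellulare", sens_int),
                    ("PPV, M. intracellulare", ppv_int),
                    ("specificity, M. intracellulare", spec_int)]:
    print(f"{name}: {value:.1%}")
```

Because the two classes are complementary in this comparison, the specificity for M. intracellulare equals the sensitivity for M. chimaera (40/49).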

DISCUSSION
The recent outbreak of M. chimaera-contaminated heater-cooler devices used in cardiac surgery has increased the clinical urgency for differentiating M. chimaera from M. intracellulare. The MALDI-TOF MS approach to species identification is less expensive and more rapid than sequencing methods, but until now has been unable to distinguish these two closely related MAC species. Our findings indicate that a call of M. chimaera by the Bruker subtyping algorithm tested here is correct every time for a positive predictive value of 100%. We also found that a call of M. intracellulare is correct 86% of the time, that is, 14% of our samples that were recognized as M. intracellulare by MALDI are actually more closely related to M. chimaera than to M. intracellulare. Given the regional diversity of NTM species, it is possible that these strains were not available in the initial European sample set during test design. A number of these strains cluster more closely with the M. chimaera reference strains (samples 178, 019, 232, etc.), although they are arguably relatively distant from any available reference. The dynamic nature of bacterial phylogenomics requires us to integrate currently available references and whole genome approaches with an understanding that species nomenclature is always evolving as new branches and clades become apparent.
Frontiers in Microbiology | www.frontiersin.org
With increasing numbers of M. chimaera and M. intracellulare isolates requiring species identification in diagnostic reference laboratories, it is clear that improved tools are needed to quickly and accurately identify and distinguish these close mycobacterial species, in most cases inseparable using average nucleotide identity measures (Supplementary Table S1). We observe a qualitative difference in genome size for these two species and performed a statistical comparison of available complete reference genomes from M. chimaera (mean ± 1 s.d., 6.24 ± 0.29 Mb, n = 8) and M. intracellulare (5.52 ± 0.12 Mb, n = 6), and found that M. chimaera was consistently larger by ∼700,000 bases (p < 0.001, Mann-Whitney U test). This information may provide a path to discovering markers that can distinguish the two species, but the standard and reliable tools used to distinguish most NTM from one another often fail to discriminate between these very closely related strains. The discerning capability of the Bruker Subtyper module with MALDI spectral data offers an available and notable advance in MAC species identification.
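The genome size comparison can be sanity-checked from the summary values alone. The sketch below computes the difference in means and the smallest two-sided p-value that an exact Mann-Whitney U test can produce with group sizes of 8 and 6. That bound is reached only if every M. chimaera genome is larger than every M. intracellulare genome and there are no ties, which is an assumption here because the individual genome sizes are not listed in this section.

```python
from math import comb

n_chimaera, n_intracellulare = 8, 6
mean_chimaera_mb, mean_intracellulare_mb = 6.24, 5.52

# Difference in mean genome size, in megabases.
mean_diff_mb = mean_chimaera_mb - mean_intracellulare_mb

# Under the null hypothesis every assignment of ranks to the smaller group
# is equally likely; there are C(14, 6) such assignments.
arrangements = comb(n_chimaera + n_intracellulare, n_intracellulare)

# With complete separation only the two most extreme arrangements are as
# extreme as the observed one, giving the smallest two-sided exact p-value.
min_two_sided_p = 2 / arrangements

print(f"difference in means: {mean_diff_mb:.2f} Mb")
print(f"smallest attainable two-sided p: {min_two_sided_p:.5f}")
```

The difference in means of about 0.72 Mb matches the reported ∼700,000 bases. The smallest attainable p-value, 2/3003, lies below 0.001, so the reported significance level is attainable with these sample sizes.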

AUTHOR CONTRIBUTIONS
MT, MK, GS, and MaS designed the study. LEE wrote the manuscript and performed whole genome sequencing. NKH, PG, and DD performed rpoB sequencing and MALDI-TOF analysis. MT, MK, and GS performed parallel MALDI-TOF spectral analysis and developed and implemented the subtyping software. NAH analyzed whole genome sequencing data, extracted loci of interest, and created visualizations. All authors contributed to writing, editing, and overall presentation of the manuscript.