Identification of pathogens and detection of antibiotic susceptibility at single-cell resolution by Raman spectroscopy combined with machine learning

Rapid, accurate, and label-free detection of pathogenic bacteria and antibiotic resistance at single-cell resolution is a technological challenge for clinical diagnosis. Overcoming the cumbersome culture process of pathogenic bacteria and time-consuming antibiotic susceptibility assays will significantly benefit early diagnosis and optimize the use of antibiotics in clinics. Raman spectroscopy can collect molecular fingerprints of pathogenic bacteria in a label-free and culture-independent manner, which is suitable for pathogen diagnosis at single-cell resolution. Here, we report a method based on Raman spectroscopy combined with machine learning to rapidly and accurately identify pathogenic bacteria and detect antibiotic resistance at single-cell resolution. Our results show that the average accuracy of identification of 12 species of common pathogenic bacteria by the machine learning method is 90.73 ± 9.72%. Antibiotic-sensitive and antibiotic-resistant strains of Acinetobacter baumannii isolated from hospital patients were distinguished with 99.92 ± 0.06% accuracy using the machine learning model. Meanwhile, we found that sensitive strains had a higher nucleic acid/protein ratio and antibiotic-resistant strains possessed abundant amide II structures in proteins. This study suggests that Raman spectroscopy is a promising method for rapidly identifying pathogens and detecting their antibiotic susceptibility.


Introduction
Antimicrobial resistance (AMR) is a global public health challenge. It has been estimated that almost 5 million deaths globally are associated with bacterial AMR, including more than 1.2 million deaths attributable to AMR in 2019 (Murray et al., 2022). The number of deaths worldwide will reach 10 million a year by 2050 if the trend in rising AMR is not efficiently contained (O'Neill, 2016). The emergence of AMR is ascribed to the overuse and misuse of antibiotics in clinical treatment, the livestock industry, and aquaculture (Andersson and Hughes, 2014;Aslam et al., 2018;Ben et al., 2019). Diagnostic uncertainty is a major reason for the overprescribing of antibiotics in clinical practice (Llor and Bjerrum, 2014;Michael et al., 2014;Roope et al., 2019), largely because of shortcomings in test time and accuracy in detecting infectious pathogens and antibiotic resistance (Michael et al., 2014;Dadgostar, 2019;Jamrozik and Selgelid, 2020). Methods for rapid and accurate diagnosis are highly desirable to mitigate AMR and allow rational antibiotic therapy (Germond et al., 2018;Vasala et al., 2020).
Traditional pathogenic identification involves both phenotypic and molecular methods. Generally, phenotypic diagnoses are based on bacterial growth and metabolism. For instance, bacterial identification and the antimicrobial susceptibility test (AST) are performed simultaneously by the Vitek or BD Phoenix commercial systems in clinical microbiology (Syal et al., 2017;Franco-Duarte et al., 2019). These systems evaluate bacterial growth on a series of biochemical substrates or different carbon sources to identify bacterial species. Recently, MALDI-TOF mass spectrometry has been widely used in clinical microbiology laboratories because of its high throughput and rapid performance to identify isolated bacterial colonies (Wieser et al., 2012;Strejcek et al., 2018;Vasala et al., 2020). However, these methods are laborious and require time-consuming cell culture. Moreover, many bacteria are slow growing or non-culturable in the laboratory. For example, Mycobacterium tuberculosis needs more than 2 weeks for cultivation, sometimes up to 6-8 weeks (Acharya et al., 2020). Molecular methods such as 16S rDNA gene sequencing and quantitative PCR provide high sensitivity and specificity, and they are also very rapid. In some cases, such as metagenomic sequencing, bacterial culture is not required for these nucleic acid-based detection methods. However, molecular methods conventionally require extensive sample preparation. Difficulties in preparing target DNA can arise when samples are present in tiny amounts or contaminated by interfering substances. Moreover, molecular methods of bacterial identification are all destructive. They cannot be utilized to identify living microbes in situ for real-world samples.
Raman spectroscopy is a non-invasive, culture- and label-free technique that is able to monitor the chemical composition and metabolism of single live microorganisms in real time (Lee et al., 2021). The Raman spectrum of an individual cell represents an ensemble of different molecular vibration modes and structures, including nucleic acids, proteins, lipids, and carbohydrates (Lorenz et al., 2017;He et al., 2019;Yan et al., 2021a). High dimensional and complex Raman bands provide rich information about variable cellular phenotypes to distinguish different bacteria (Germond et al., 2018;Du et al., 2020). However, it is challenging to classify different species because of the weak Raman signal of a single bacterium, spectral variations between individuals of the same species, and spectral overlaps of different molecules (Khan et al., 2018;Ho et al., 2019;Yan et al., 2021b). A variety of chemometric analyses and machine learning methods, such as principal component analysis (PCA), support vector machine (SVM), random forest (RF), and convolutional neural networks (CNN), have been employed to analyze complicated Raman spectral data (Senger and Scherr, 2020;Xu et al., 2020). Of these, machine learning has shown great promise in rapidly and accurately identifying microorganisms at single-cell resolution (Ho et al., 2019;Lu et al., 2020;Zhou et al., 2022). Besides Raman-based identification of bacteria, Raman spectroscopy has also been used to determine bacterial antibiotic resistance when coupled with stable isotope probing, such as heavy water labeling (Yang et al., 2019;Zhang et al., 2020;Yi et al., 2021). However, this method requires pre-labeling and culturing bacteria in the presence of antibiotics. In theory, the label-free signatures of bacterial Raman spectra are excellent phenotypic indicators of antibiotic resistance (Germond et al., 2018;Verma et al., 2021). It is worth exploring the possibility of identifying bacterial species and detecting antibiotic susceptibility simultaneously using single-cell Raman spectra.
In this study, we developed a method combining Raman spectroscopy with machine learning to identify a pathogenic bacterium and predict its antibiotic resistance rapidly and accurately at single-cell resolution. The average accuracy of identification of 12 species of common pathogenic bacteria by the machine learning model was 90.73 ± 9.72%. The optimal machine learning model predicted the antibiotic susceptibility of Acinetobacter baumannii isolated from hospital patients with 99.92% accuracy. Meanwhile, we found that antibiotic-resistant A. baumannii strains showed more abundant amide II structures in proteins and a lower nucleic acid/protein ratio than antibiotic-sensitive strains.

Bacterial and yeast strains
The pathogens used in this study included seven species of gram-negative bacteria (A. baumannii, Enterobacter cloacae, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, Salmonella enterica, and Vibrio parahaemolyticus), three species of gram-positive bacteria (Staphylococcus aureus, S. epidermidis, and Streptococcus pneumoniae), and two species of fungi (Cryptococcus neoformans and Candida albicans). Five clinical strains of A. baumannii (ST2 sequence type) were isolated from sputum of different patients in the intensive care unit of Cangzhou Central Hospital, Hebei Province, China. The five clinical A. baumannii strains all contained the carbapenem resistance gene oxa23, which was confirmed by PCR assays as previously described (Woodford et al., 2006;Jinshu et al., 2022).

Single-cell Raman spectral measurements
The bacterial and yeast cells were cultured in Luria-Bertani (LB) medium or yeast extract peptone dextrose (YPD) medium with various culture times under different culturing conditions. The microbial cells were collected and suspended in 0.85% NaCl solution. A total of 20 μl of the cell suspension was injected into the sealed chamber for Raman spectral measurement. The Raman spectra were measured by laser tweezers Raman spectroscopy (LTRS) as previously described (Lu et al., 2020). Raman spectral calibration of the LTRS was performed at 620.9 cm⁻¹, 1001.4 cm⁻¹, and 1602.3 cm⁻¹ of the 10 μm polystyrene spheres. The integration time was set at 60-90 s. For each species, the Raman spectra of at least 300 single cells collected from different batches were acquired. These Raman spectra were the training data used for training the machine learning models. To test the classifying accuracy of the machine learning models, 80 single cells per species were gathered from other culturing batches, which were completely different from the batches used for model training. The Raman spectra acquired from these single cells were the testing data, used for testing the models.

Raman spectral data processing and training of machine learning models
The spectra were pre-processed as follows: Savitzky-Golay smoothing to remove noise and polynomial fitting to remove the fluorescence background, followed by min-max normalization. The spectral range between 555 cm⁻¹ and 1,815 cm⁻¹ was selected for model training and model testing. We used algorithms based on random forest (setting, 101 decision trees), support vector machine (selecting linear kernel), decision tree (using C5.0 algorithm), bagging (loading parallel backend), and naive Bayes (no Laplace correction) to build models trained on the Raman spectral data. A 10-fold cross-validation of the training dataset was used to evaluate the robustness of the models. In this study, the training time for the five models was about 4 hours using an HP workstation (16 GB, i7-8550U CPU). Independent spectra acquired from 80 single cells of each species were used to assess the performance of the models. Sample identification included the processes of sample preparation (cell collection for 1-2 min), Raman spectral acquisition (integration time for 60-90 s), and species prediction (less than 30 s).
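The pre-processing chain described above can be sketched as follows. The authors performed this step in R; the Python version below is only an illustration. The 5-point quadratic Savitzky-Golay window, the first-order (linear) baseline as a simplified stand-in for the polynomial fit, the choice to crop to 555-1,815 cm⁻¹ before normalizing, and all function names are assumptions, since the original settings are not reported.

```python
# Illustrative sketch only: window size, baseline order, step order, and
# function names are assumptions, not the authors' reported R settings.

SG_WEIGHTS = (-3, 12, 17, 12, -3)  # 5-point quadratic Savitzky-Golay weights (sum 35)


def savgol_smooth(y):
    """Smooth with a 5-point Savitzky-Golay filter; the two edge points on each side are kept."""
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(w * y[i + k - 2] for k, w in enumerate(SG_WEIGHTS)) / 35.0
    return out


def subtract_linear_baseline(x, y):
    """Fit y ~ a + b*x by least squares and subtract the fitted line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def min_max(y):
    """Scale intensities to [0, 1]; assumes the spectrum is not constant."""
    lo, hi = min(y), max(y)
    return [(v - lo) / (hi - lo) for v in y]


def preprocess(shifts, intensities, lo=555.0, hi=1815.0):
    """Smooth, remove the baseline, crop to [lo, hi] cm^-1, then normalize."""
    smoothed = savgol_smooth(intensities)
    flattened = subtract_linear_baseline(shifts, smoothed)
    kept = [(s, v) for s, v in zip(shifts, flattened) if lo <= s <= hi]
    cropped_shifts = [s for s, _ in kept]
    normalized = min_max([v for _, v in kept])
    return cropped_shifts, normalized
```

A real fluorescence background usually needs a higher-order or iterative polynomial fit; the linear fit here only keeps the sketch short.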

The architecture of the antibiotic susceptibility detection model and training details
The antibiotic susceptibilities of all six A. baumannii strains (one reference strain and five clinical isolates) were first determined by the VITEK 2 instrument (bioMérieux, France) according to the instruction manual. The antibiotics used in the test included imipenem, meropenem, ampicillin, cefoperazone, and cefepime. The reference strain and all the clinical isolates were cultured in LB medium without adding any antibiotics, and then the cells were harvested to acquire the Raman spectra. Data on the antibiotic resistance of each strain are listed in Supplementary Table S1. The training data for constructing the RF antibiotic susceptibility detection model included 523 spectra acquired from different A. baumannii strains. An independent testing dataset of 1,255 spectra from six A. baumannii strains was collected for testing the predictive performance of the identification models and the antibiotic susceptibility detection model. The RF antibiotic susceptibility detection model with 101 decision trees was evaluated by 10-fold cross-validation. Subsequently, the model with the highest prediction accuracy was selected. The receiver operating characteristic (ROC) curve was used to assess the diagnostic accuracy of the antibiotic susceptibility detection model in terms of antibiotic resistance. The A. baumannii strains were grouped based on the antimicrobial susceptibility test (Supplementary Table S1). The Raman spectra of strain ZB180325 were labeled as sensitive to imipenem and cefoperazone. The Raman spectra of strain ZB180589 were labeled as sensitive to cefoperazone and ampicillin. The Raman spectra of strains ZB180791 and ZB18102 were labeled as sensitive to imipenem and meropenem, respectively. We trained the RF model on the four-antibiotic prediction task. Completely different batches of the Raman spectra of strains were used to test the model. The main features of the Raman spectra of antibiotic-resistant and antibiotic-sensitive A. baumannii were analyzed by principal component analysis (PCA).
The Raman spectral pre-processing, building of machine learning models, PCA, and statistical analysis were all performed using the R language (R ≥ 3.6.1). In this study, the main open-source R packages included "caret" and "hyperSpec."

Raman spectra of 12 common clinical pathogens
To gather a training dataset, we measured the Raman spectra of 12 different pathogens, including seven gram-negative bacteria (A. baumannii, Enterobacter cloacae, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, Salmonella enterica, and Vibrio parahaemolyticus), three gram-positive bacteria (Staphylococcus aureus, S. epidermidis, and Streptococcus pneumoniae), and two fungi (Cryptococcus neoformans and Candida albicans). To minimize batch effects on the final classification, we constructed a reference dataset of 3,982 spectra from bacteria and yeast under different culturing conditions to cover a more varied cellular physiological status and heterogeneity in the same species. The normalized average Raman spectra of the 12 pathogens with the standard deviations (SD) are shown in Figure 1A. The Raman spectral profiles of bacteria and fungi differ visibly. Although the Raman spectra of the 10 pathogenic bacteria look similar in pattern, the Raman intensities and the SD at the same Raman shift differ clearly among the pathogens (Figure 1A). Subsequently, we analyzed the seven species of gram-negative bacteria and three species of gram-positive bacteria by principal component analysis (PCA). The PCA plot suggests that the gram-positive and gram-negative bacteria are separated into two clusters (Figure 1B). The most common differences between Raman peaks of the two categories were at 1,000, 1,285, and 1,553 cm⁻¹ (Figure 1C). It is likely that the gram-positive bacteria have higher levels of phenylalanine (1,000 cm⁻¹) and protein (1,285 cm⁻¹; Choi et al., 2018). The gram-negative bacteria possess abundant amide II structures in proteins (Figure 1D). The spectral differences indicate that the composition and concentration of biomolecules are different for different species, and so it will be possible to extract the characteristics of different pathogens based on these informative variables.
Frontiers in Microbiology 04 frontiersin.org

Machine learning for pathogen identification from Raman spectra
As the 10 pathogenic bacteria bear a high resemblance in their Raman spectra, it is challenging to discern the subtle spectral differences between pathogens. Machine learning is a computer-based strategy that can extract subtle variations in hidden features within Raman spectra. Here, we compared the identification accuracy of five machine learning methods: random forest (RF), support vector machine (SVM), naive Bayes (NB), bagging, and decision tree (DT). The Raman dataset of 3,982 spectra was used to train the machine learning models. We used 10-fold cross-validation to evaluate the discriminative ability of these models. In this process, one fold was randomly split out and used to validate the model trained on the other nine folds. This process was repeated until each of the 10 folds had acted as the test set once. To account for both the identification accuracy and the successes occurring by chance, two metrics, accuracy and Cohen's kappa (Cohen, 1960;García et al., 2009;Vieira et al., 2010), were utilized to evaluate the robustness of these models. The accuracies of both RF and SVM were higher than those of DT, bagging, and NB (Supplementary Figure S1), indicating that the performances of the RF and SVM models were superior to the other models for accurately identifying pathogens at single-cell resolution. Likewise, the kappa scores for both RF and SVM were higher than those of the other three models. The data demonstrate that these two models have good consensus agreement for classifying microbial pathogens.
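The cross-validation scheme and the two metrics described above can be sketched as follows. This is a minimal Python illustration rather than the authors' R code; the function names and the fixed shuffling seed are assumptions. Cohen's kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e is the agreement expected by chance from the label frequencies.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is in the test fold exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary illustrative choice
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance; undefined if chance agreement equals 1."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)
```

For example, with true labels a, a, b, b and predictions a, b, b, b, the accuracy is 0.75, the chance agreement is 0.5, and kappa is 0.5.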
We further tested the models on the independent test dataset gathered from a separately cultured batch that consisted of 960 spectra for the 12 pathogens (80 spectra per pathogen). Four indicators, accuracy, kappa, recall, and F1 score, were selected to evaluate the performances of the different machine learning models. The results suggested that RF is slightly superior to SVM (Supplementary Table S2). The RF model identified the 12 pathogens with an average accuracy of 90.73 ± 9.72% (Figure 2). For the two fungal pathogens, C. albicans and C. neoformans, the identification accuracy reached 100%. The identification accuracies for the gram-positive bacteria S. epidermidis and S. pneumoniae were 98.75%, higher than for S. aureus with an accuracy of 91.25%. The classifying accuracy for K. pneumoniae was 100%, the highest accuracy among the gram-negative bacteria. A. baumannii had the second highest accuracy, at 95%. The identifying accuracy for E. coli was 82.5%. The accuracies for the other three gram-negative bacteria (E. cloacae, P. aeruginosa, and V. parahaemolyticus) were between 75 and 83%.
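As a rough illustration of how per-pathogen figures and a "mean ± SD" summary can be derived from a confusion matrix, consider the Python sketch below. The confusion-matrix layout, the function names, and the use of the sample standard deviation are assumptions; the source does not state how the ± value was computed. With a fixed number of test spectra per species, the per-pathogen accuracy appears to correspond to the per-class recall.

```python
from statistics import mean, stdev


def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of cells of true class i predicted as class j."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp
        fp = sum(row[i] for row in confusion) - tp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"recall": recall, "precision": precision, "f1": f1}
    return metrics


def summarize(values):
    """Return (mean, sample SD); the choice of sample SD is an assumption."""
    return mean(values), stdev(values)


# Hypothetical three-class example with 80 test cells per class (not the paper's data).
toy_labels = ["A", "B", "C"]
toy_confusion = [[80, 0, 0], [4, 72, 4], [0, 8, 72]]
toy_metrics = per_class_metrics(toy_confusion, toy_labels)
toy_mean, toy_sd = summarize([toy_metrics[l]["recall"] for l in toy_labels])
```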

Detection of the Acinetobacter baumannii antibiotic-resistant strain by machine learning
Species-level identification is only the first step in clinical practice; choosing the correct antibiotic against bacterial infections is more important for the clinical outcome. To step toward a culture-free antibiotic susceptibility test using Raman spectroscopy, we used five multidrug-resistant (MDR) A. baumannii strains isolated from patients in the intensive care unit (ICU; Supplementary Table S1) as a proof-of-concept. A. baumannii is a ubiquitous opportunistic pathogen that is responsible for a broad range of severe nosocomial infections such as bloodstream infections (Antunes et al., 2014), especially in the ICU and immunocompromised patients (Eliopoulos et al., 2008;Vázquez-López et al., 2020). Carbapenem-resistant A. baumannii has been listed at the top of the greatest threat list by the World Health Organization (WHO; Abadi et al., 2019;Vázquez-López et al., 2020;Murray et al., 2022). Rapid and accurate diagnosis of antibiotic resistance is critical to allow timely performance of an effective therapeutic scheme (Butler et al., 2019).
We cultured one antibiotic-sensitive A. baumannii strain (isolate code: ZB18051) and five multidrug-resistant strains (isolate codes: ZB18101, ZB18102, ZB180325, ZB180589, and ZB180791) to acquire single-cell Raman spectra. To avoid the effects of residual antibiotics from cell culture when discerning the antibiotic-resistant strain, the multidrug-resistant strains and the drug-sensitive strain were cultured in the same LB medium without supplementing antibiotics. The 523 Raman spectra from the antibiotic-sensitive strain ZB18051 and four randomly selected multidrug-resistant strains (isolate codes: ZB18101, ZB18102, ZB180325, and ZB180589) were collected to build a training dataset. The training dataset was used to train the antibiotic susceptibility detection model. A separate testing dataset, consisting of 1,255 spectra from six A. baumannii strains, was prepared for the model validation. First, we used the pathogenic identification models constructed previously to predict the 1,255 spectra of A. baumannii. The RF identification model achieved the highest identification accuracy of 95.86%, which is consistent with the previous prediction on 80 A. baumannii spectra (Figure 3A). This further confirmed that RF is the best identification model. Then, an RF model was constructed to distinguish antibiotic resistance from sensitivity using the training dataset. The sensitivity (true positive rate) and specificity (true negative rate) were evaluated by the receiver operating characteristic (ROC) curve (Figure 3B). The value of AUC (the area under the ROC curve) was 1, suggesting that the RF model detects antibiotic-resistant and -sensitive A. baumannii with very high specificity and sensitivity based on the Raman spectra. As seen in Figure 3C, for the antibiotic-sensitive strains, the rate of correct detection was 100%. For the antibiotic-resistant strains, just 0.09% of cells were mistakenly predicted as sensitive cells. Thus, the mean accuracy predicted by the RF model reached 99.92 ± 0.06% (Figure 3C). These results indicate that Raman spectroscopy combined with RF is not only a reliable approach to accurately identify pathogens at single-cell resolution, but is also able to accurately distinguish between antibiotic-resistant and antibiotic-sensitive bacteria without labeling and antibiotic treatment.
It is desirable in clinical practice to choose the correct antibiotics to treat infectious diseases. Four antibiotics (imipenem, meropenem, cefoperazone, and ampicillin) are available to treat A. baumannii strains (Supplementary Table S1). The A. baumannii strains were grouped based on the antimicrobial susceptibility test of A. baumannii. The independent training and testing data of A. baumannii were used to train and test the RF model, respectively. The average accuracy predicted by the RF model was 91.80%. The RF model accurately predicted the preferential antibiotics for the treatment of multidrug-resistant A. baumannii (Figure 3D). This result was also consistent with the results of testing with the VITEK 2 system (Supplementary Table S1). This implies that Raman spectroscopy combined with machine learning is able to identify the species of bacteria in situ, detect the antibiotic susceptibility, and facilitate the correct antibiotic choice at single-cell resolution.
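The sensitivity, specificity, and AUC used to evaluate the resistant-versus-sensitive model can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code: "resistant" is treated as the positive class, the scores stand in for a classifier's predicted probability of resistance, and the function names are hypothetical. The AUC is computed as the probability that a randomly chosen resistant cell receives a higher score than a randomly chosen sensitive cell, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred, positive="resistant"):
    """Return (true positive rate, true negative rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores, positive="resistant"):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 1, as reported for Figure 3B, corresponds to every resistant cell scoring above every sensitive cell.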

Antibiotic resistance characteristics of Acinetobacter baumannii
Raman spectroscopy offers rich information on chemical compositions such as DNA/RNA, proteins, lipids, and carbohydrates, which are linked with cellular physiological functions, biochemical metabolism, and transcriptomic features (Germond et al., 2018;He et al., 2021;Cui et al., 2022). Since Raman spectral signatures of multidrug-resistant A. baumannii contribute to the discrimination of antibiotic-resistance phenotypes, we sought to detect the corresponding physiological or metabolic features of bacterial resistance based on the Raman spectra. The Raman difference spectrum was calculated by subtracting the average spectrum of antibiotic-sensitive A. baumannii from the average spectrum of antibiotic-resistant A. baumannii. The Raman bands at 860-918 cm⁻¹ (polysaccharides and proteins), 1,336-1,367 cm⁻¹ (carbohydrates and proteins), 1,554 cm⁻¹ (amide II of proteins), and 1,602 cm⁻¹ (C-C or C-N protein bonds) were increased in the antibiotic-resistant strains (Figure 4A; Supplementary Figure S2; Table 1). In contrast, Raman peaks at 729, 783, and 1,576 cm⁻¹ (DNA/RNA Raman bands) were decreased in the antibiotic-resistant strains. In addition, the Raman peaks at 1,002 cm⁻¹ (phenylalanine) and 1,281 cm⁻¹ (amide III of protein) in antibiotic-sensitive A. baumannii were also slightly increased (Figure 4A; Supplementary Figure S2; Table 1).

FIGURE 2
Identification accuracy of the random forest identification model for 12 pathogens. The confusion matrix shows the percentage of accurate prediction for each pathogen.
The Raman band changes reflect the differences between antibiotic-resistant and antibiotic-sensitive strains in the composition and proportion of carbohydrates, proteins, and nucleic acids.
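The difference-spectrum calculation described in this section (average resistant spectrum minus average sensitive spectrum) can be sketched in Python as below. The function names, the threshold used to call a band increased or decreased, and the toy intensities are all hypothetical and are not the study's data.

```python
def mean_spectrum(spectra):
    """Average intensity at each Raman shift across single-cell spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def difference_spectrum(resistant, sensitive):
    """Mean resistant spectrum minus mean sensitive spectrum."""
    mean_r, mean_s = mean_spectrum(resistant), mean_spectrum(sensitive)
    return [r - s for r, s in zip(mean_r, mean_s)]


def changed_bands(shifts, diff, threshold):
    """Shifts whose difference exceeds +threshold (increased) or falls below -threshold (decreased)."""
    increased = [s for s, d in zip(shifts, diff) if d > threshold]
    decreased = [s for s, d in zip(shifts, diff) if d < -threshold]
    return increased, decreased


# Toy normalized intensities at three shifts (hypothetical values).
toy_shifts = [783, 1002, 1554]
toy_resistant = [[0.2, 0.5, 0.9], [0.4, 0.5, 0.7]]
toy_sensitive = [[0.6, 0.5, 0.4], [0.8, 0.5, 0.2]]
toy_diff = difference_spectrum(toy_resistant, toy_sensitive)
toy_increased, toy_decreased = changed_bands(toy_shifts, toy_diff, threshold=0.1)
```

Positive values in the difference spectrum mark bands that are stronger in the resistant group, and negative values mark bands that are stronger in the sensitive group.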
To further reveal the primary differences between antibiotic-resistant and antibiotic-sensitive A. baumannii, principal component analysis (PCA) was used to reduce the dimensionality of the spectral data and extract the main features. The first two principal components (PC1 and PC2) derived from the Raman spectral data accounted for 57.05% of the original variance. The scores plot of PC1 and PC2 suggested that the antibiotic-resistant and antibiotic-sensitive A. baumannii were completely separated into two clusters (Figure 4B). The two clusters represented a significant difference in Raman spectra along the PC1 score values. The Raman peak at 1,554 cm⁻¹ showed a clear positive correlation with resistant bacteria. The loadings of PC1 and PC2 indicated that the Raman spectral difference between antibiotic-resistant and antibiotic-sensitive strains was predominantly assignable to the protein and nucleic acid signals (Figure 4C). The PC1 loadings confirmed that the amide II Raman intensity at 1,554 cm⁻¹ was increased for the antibiotic-resistant A. baumannii (Figures 4A,D; Supplementary Figures S2, S3). The data indicate that antibiotic-resistant strains possess abundant amide II structures in proteins. This might reflect active synthesis of oxacillinase enzyme in the A. baumannii strain containing the oxa23 gene, even in the absence of antibiotic in the environment. In contrast to antibiotic-resistant A. baumannii, the Raman signal of nucleic acids in the antibiotic-sensitive strain showed a higher intensity at the 783 cm⁻¹ peak (Figures 4A,D; Supplementary Figures S2, S3). Moreover, the average ratio of the Raman intensity for I783/I1554 (nucleic acids/proteins) was 2.20 in the antibiotic-sensitive strain, while the average ratio of I783/I1554 in the antibiotic-resistant strains was close to 4.58 (Figure 4B).
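To make the PCA step concrete, the following Python sketch extracts the first two principal components from a small data matrix by power iteration with deflation and reports the fraction of variance each one explains. It is an illustration only: the authors used R, the function names are assumptions, and the fixed starting vector would fail in the special case where it is exactly orthogonal to a principal component.

```python
import math


def center(rows):
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def power_iteration(matrix, iters=500):
    """Return (largest eigenvalue, unit eigenvector) of a symmetric matrix."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(d) for j in range(d))
    return eigenvalue, v


def pca(rows, n_components=2):
    """Return (loadings, explained-variance ratios, scores) for the top components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    loadings, ratios = [], []
    for _ in range(n_components):
        eigenvalue, v = power_iteration(cov)
        loadings.append(v)
        ratios.append(eigenvalue / total_variance)
        # Deflate: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)] for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in loadings] for r in centered]
    return loadings, ratios, scores
```

Summing the first two explained-variance ratios gives the kind of figure quoted above for PC1 and PC2, and the loadings indicate which Raman shifts drive each component.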

Discussion
Rapid pathogen identification and diagnosis of antibiotic susceptibility are critical for efficient clinical therapy and deceleration of AMR emergence. In this study, we applied machine learning techniques to Raman spectroscopy to identify 12 common clinical pathogens at single-cell resolution. We also showed that the machine learning model can detect single antibiotic-resistant A. baumannii cells with high accuracy based on the Raman spectrum. Significantly, the combination of Raman spectra and machine learning could predict the bacterial antibiotic resistance in the absence of antibiotic treatment. We envision that it could be important to develop a method to detect the drug resistance of pathogens in situ. Such an approach combined with an automated system would rapidly and accurately identify each microbial cell in clinical samples, provide an opportunity for analyzing the biochemical and metabolic characteristics of each cell, and could even directly explore pathogenic ecophysiology in native habitats.
This study uses five machine learning algorithms (RF, SVM, DT, NB, and bagging) to construct prediction models. Compared with the accuracy of the other models, the SVM and RF models had the best performance. The average accuracy of the RF identification model to identify 12 common clinical pathogens was 90.73%, and the RF antibiotic susceptibility detection model predicted the antibiotic susceptibility of A. baumannii with an accuracy of 99.92%. Thus, RF is the best machine learning algorithm for construction of a prediction model in this study. RF works by building many independent decision trees that vote on the pathogen and outputting the category label that receives the majority vote (Biau and Scornet, 2016;Shaikhina et al., 2019). This method might be more robust than the single DT and bagging methods (Cui et al., 2020). Moreover, RF model training takes less time than for DT and SVM (Parmar et al., 2019), which is a merit when applying this to larger datasets.
Box plots represent the median and first and third quartiles, with the whiskers representing the minimum and maximum values within 1.5 interquartile ranges from the first and third quartiles. Two-sided t-tests were applied to compare the statistical significance between antibiotic-resistant and antibiotic-sensitive strains. ****p ≤ 0.0001. Each dot represents the Raman intensity of a single A. baumannii cell.
Antibiotic resistance is a consequence of the immense genetic plasticity of bacterial pathogens. Understanding the molecular mechanisms of resistance is of paramount importance to design strategies for curtailing the emergence and spread of resistance. This study lays the foundation to infer the mechanism of antibiotic resistance from Raman spectral signatures. Compared with the antibiotic-sensitive A. baumannii strain, the Raman spectral differences of the five antibiotic-resistant strains were largely consistent, especially at the protein Raman peaks (1,554 and 1,602 cm⁻¹), which showed significantly increased intensity. Since all five strains contain the oxa23 gene that encodes the oxacillinase enzyme, the stronger protein Raman peaks might be related to the high expression level of oxacillinase in vivo. In addition, the increased phospholipids (1,445 cm⁻¹), polysaccharides and proteins (860-918 cm⁻¹ and 1,330-1,367 cm⁻¹) in antibiotic-resistant A. baumannii might promote changes in the bacterial cell membrane, resulting in enhanced biofilm formation (Ramirez-Mora et al., 2019;Gieroba et al., 2020;Park et al., 2021). Because of the complex and high-dimensional nature of the spectra, representing many biomolecules, it remains a challenge to assign a Raman spectrum wavenumber directly to a specific biomolecule to infer the molecular mechanism of bacterial antibiotic resistance. More efforts are required to develop computational methods to allow mapping between Raman spectra and biomolecular profiles. Meanwhile, the integration of big data obtained from single-cell genomes, transcriptomes, proteomes, and metabolomes with machine learning would assist in revealing the bacterial drug-resistance mechanism based on the Raman spectra.
The antibiotic-resistant A. baumannii strains used in this study are resistant to multiple drugs, including imipenem, ampicillin, sulfamethoxazole, ceftriaxone, and levofloxacin (Supplementary Table S1). This multiple drug resistance makes clinical anti-infective treatment more difficult and poses one of the most serious threats to public health. The RF antibiotic susceptibility detection model built in this study can only distinguish resistant strains from sensitive strains. This model is, thus far, unable to discern the specific antibiotic resistance. Future study will focus on applying deep learning in modeling to determine the spectral characteristics of resistance to each specific antibiotic. Such a technique would allow for accurate treatment and would limit MDR.
Although Raman spectroscopy has not yet been applied to pathogenic identification and antibiotic susceptibility detection in clinical practice, a standardized Raman spectral database of pathogenic microorganisms covering more cell physiological states, growth media and conditions, resistant and susceptible strains, and greater diversity in antibiotic susceptibility profiles would bridge the gap between academic research and clinical implementation. Raman spectroscopy combined with technologies such as hollow-core optical fiber or microscopy would enable the analysis of a single pathogen without time-consuming culturing and complex laboratory analysis (Neugebauer et al., 2015). The Raman spectral dataset of clinical strains would promote the clinical use of a portable device for field tests. Non-destructive, culture-independent, label-free, and rapid identification of pathogenic microorganisms and the detection of antibiotic susceptibility in patient samples in a single step would be a revolution, improving patient outcomes.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.