Vaginal microbiota molecular profiling and diagnostic performance of artificial intelligence-assisted multiplex PCR testing in women with bacterial vaginosis: a single-center experience

Background Bacterial vaginosis (BV) is one of the most common microbiological syndromes. Molecular methods, such as multiplex real-time PCR (mPCR) and next-generation sequencing, have revolutionized our understanding of microbial communities. Here, we aimed to use a novel multiplex PCR test to evaluate the microbial composition and dominant lactobacilli in non-pregnant women with BV, and to combine the test with machine learning algorithms to determine its diagnostic significance. Methods Residual material from 288 vaginal secretion samples from healthy women and BV patients, sent for routine diagnostics, was collected and subjected to the mPCR test. Subsequently, decision tree (DT), random forest (RF), and support vector machine (SVM) hybrid diagnostic models were constructed in a cohort of 99 women (74 BV patients and 25 healthy controls) and validated in a separate cohort of 189 women (75 BV patients, 30 intermediate vaginal microbiota subjects, and 84 healthy controls). Results The detection rate or abundance of Lactobacillus crispatus and Lactobacillus jensenii was significantly reduced in BV patients compared with healthy women, while that of Lactobacillus iners, Gardnerella vaginalis, Atopobium vaginae, BVAB2, Megasphaera type 2, Prevotella bivia, and Mycoplasma hominis was significantly increased. The hybrid diagnostic models were then constructed and validated in an independent cohort. The model constructed with the support vector machine algorithm achieved excellent prediction performance (area under the curve: 0.969, sensitivity: 90.4%, specificity: 96.1%). Moreover, for subjects with a Nugent score of 4 to 6, the SVM-BV model might be more robust and sensitive than the Nugent scoring method. Conclusion This mPCR test can be used effectively to evaluate key vaginal microbiota in women with BV, women with intermediate vaginal microbiota, and healthy women. In addition, this test may be used as an alternative to clinical examination and the Nugent scoring method in diagnosing BV.


Introduction
BV is one of the most common infectious diseases occurring in the female lower genital tract (FLGT) and can lead to adverse maternal outcomes (Javed et al., 2019; Ding et al., 2021). Timely and accurate clinical diagnosis is essential to improve patient prognosis and reduce antibiotic use. However, the diagnostic methods currently used in clinical practice have many limitations (Coleman and Gaydos, 2018). Multiplex quantitative PCR (mPCR) and next-generation sequencing (NGS) (Fredricks et al., 2005; Coleman and Gaydos, 2018) have revolutionized our understanding of microbial communities by enabling complete and accurate vaginal microbiota profiling to determine the primary causative agent. These methods do not require microbial culture; instead, they involve direct extraction of genetic material from samples and the use of highly variable nucleic acid sequences to classify species, thereby constituting effective strategies for the comprehensive characterization of microbial diversity. Based on gene sequencing analysis, vaginal microbial communities have been divided into separate categories determined by their composition, the so-called community state types (CST I: Lactobacillus crispatus-dominated [L. crispatus]; CST II: Lactobacillus gasseri-dominated [L. gasseri]; CST III: Lactobacillus iners-dominated [L. iners]; CST V: Lactobacillus jensenii-dominated [L. jensenii]; CST IV: anaerobic bacteria-dominated). Microbial communities dominated by L. gasseri (CST II) and L. jensenii (CST V) are less common than CST I and CST III and are considered opportune vaginal microbiota (Ravel et al., 2011; Abou Chacra et al., 2022). Moreover, the pathogenicity of different microorganisms in the host may be influenced by lifestyle and environmental factors, genetic susceptibility, and other factors. Because of the high cost and limited availability of NGS, multiplex real-time PCR has drawn more attention (Kusters et al., 2015; Drew et al., 2020; Darie et al., 2022), but there is still a lack of reports on the simultaneous quantitative detection of these four Lactobacillus spp. and BV-related pathogenic microorganisms (BVPs) by the mPCR method. Therefore, there is an urgent need to develop an mPCR assay that can comprehensively evaluate the key lactobacilli and BVPs in the FLGT, which will have important implications for the clinical management of BV in Chinese women of childbearing age.
Thus, this study aimed to evaluate the microbial composition and dominant lactobacilli species in non-pregnant women with bacterial vaginosis using a multiplex PCR test and to determine the diagnostic significance of its combination with artificial intelligence algorithms through a case-control study. A flow chart of our experimental design is shown in Figure 1.

Study approval and sample collection
This study was approved by the Ethics Committee of Northwestern University (Xi'an, China) and the Ethics Committee of the First Affiliated Hospital of Xi'an Medical College (Xi'an, China) and was conducted in compliance with the ethical guidelines of the Declaration of Helsinki of the World Medical Association. We randomly collected vaginal secretion samples from 4865 subjects at the First Affiliated Hospital of Xi'an Medical College from January 2022 to April 2022. The inclusion criteria were: age of at least 18 years; not pregnant; no current use of hormonal or barrier contraceptive products, vaginal douching, or tobacco or alcohol abuse; no hospitalization or systemic use of medication for chronic diseases or of antibiotics/probiotics within the 6 months before sample collection; and no intercourse on the day before sampling. Then, smear microscopy, Gram staining, and a dry chemoenzymatic method (Bacterial vaginosis Test kit, BioPerfectus Co., Ltd., China) were used to determine the following parameters in accordance with the provided instructions: pH, flora density and diversity, Donders' score, Nugent score (Nugent et al., 1991) ), the samples were divided into two groups according to Nugent score: the 7-10 group (376 cases) and the 0-3 group (1092 cases). In the 7-10 group, 131 pregnant women, 16 patients with malignant tumors, 88 patients with other non-reproductive tract infections, 43 patients who had received treatment, and 23 patients with missing information or without symptoms were excluded, and 74 symptomatic BV patients were retained as the BV group in the modeling cohort. The healthy group in the modeling cohort consisted of 25 non-pregnant, asymptomatic healthy women from the 0-3 group who underwent pre-pregnancy physical examination. The 25 healthy individuals (NO group) and 74 BV samples (BV group) in the modeling cohort were then analyzed to screen candidate lactobacilli and BVPs, and the BV diagnostic model was constructed and internally validated. Finally, vaginal secretion samples from another 10607 subjects were randomly collected at the Affiliated Hospital of Xi'an Medical College from May 2022 to March 2023. From these, 83 healthy samples, 76 BV samples, and 30 intermediate flora samples were obtained for external validation according to the sample screening method described above. Clinical information and laboratory data for all participants were recorded in detail (Supplementary Table S2).

Vaginal sample collection and nucleic acid extraction
Two vaginal swabs per patient were sampled simultaneously. Secretions were collected from the lower third of the vagina by rotating the swabs (HCY Transsystem CY93050T swab, HCY technology, China) three times to ensure uniform distribution on both swabs. One swab was used for non-molecular tests, which were performed at the laboratory of the First Affiliated Hospital of Xi'an Medical College. The second swab was used for the extraction and purification of microbial nucleic acids with the Upure Microbial DNA Kit ® (BioKeystone, Chengdu, China). DNA purity was assessed by the 260/280 absorbance ratio using a NanoDrop 2000 (Thermo Scientific, Wilmington, DE, United States), and the DNA was stored at −80°C for further use.

Primer and TaqMan probe design
Our qPCR methods have not been described elsewhere and were developed in-house. crispatus, L. gasseri, L. jensenii, and L. iners) were downloaded from the NCBI database. MAFFT software was used for global sequence alignment to screen for conserved sequences. Primers were designed using the Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/) online tool according to the primer design principles and were synthesized by Bioengineering (Shanghai, China) Co., Ltd. The PCR reaction employed universal primers for the host gene (β-actin) to provide the internal positive control. The specificity of the primers and probes was verified by Sanger sequencing of the PCR products.

Development of the BVLaB assay
We used four independent reactions to achieve simultaneous detection of 15 microbial targets, with four targets per reaction. Probes were labeled with different fluorescent molecules. The mPCR amplification was performed on the ABI 7500 platform (Applied Biosystems, Wilmington, DE, United States) in a 20 μL reaction mixture containing 5 μL of nucleic acid template, 4 μL of 5× buffer (FAPON Biotech Inc., Guangzhou, China), 2 μL of 10× solution I (FAPON Biotech Inc., Guangzhou, China), 1 μL of dNTP mix (FAPON Biotech Inc., Guangzhou, China), 0.275 μL of Hotstart Hitaq polymerase mix (FAPON Biotech Inc., Guangzhou, China), and sterilized distilled water. The reaction conditions were as follows: 95°C for 5 min, followed by 40 cycles of 95°C for 50 s and 60°C for 40 s.

Construction and validation of diagnostic model
Univariate analysis was used to compare the characteristics of the 15 microorganisms between the healthy and BV groups and to screen indicators that could be used to construct diagnostic models. Differential factors or high-resolution indicators (sensitivity or specificity higher than 80%) were used as candidate features for model construction. Three machine learning algorithms, support vector machine (SVM), decision tree (DT), and random forest (RF), were used to construct mixed diagnostic models of BV. During modeling, the modeling cohort was randomly divided into a training dataset and a validation dataset in a ratio of 2:1. We then internally validated our whole model development strategy using the validation dataset of the modeling cohort. In addition, an independent validation dataset of 159 samples was used to verify the performance of the diagnostic models. The predictive performance of the diagnostic models was evaluated using area under the curve (AUC), sensitivity, and specificity.
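As a minimal sketch of the evaluation step described above, the following Python code shows a 2:1 random split and how sensitivity, specificity, and AUC can be computed from labels and model scores. It assumes binary labels (1 = BV, 0 = healthy) and a continuous score where higher means more BV-like; the function names and the fixed seed are illustrative assumptions, and the sketch does not reproduce the SVM, DT, or RF fitting, which the study performed in Perl and R.

```python
import random
from typing import List, Sequence, Tuple


def split_two_to_one(n_samples: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and split them 2:1 into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an illustrative assumption
    cut = (2 * n_samples) // 3
    return indices[:cut], indices[cut:]


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random BV sample outscores a random
    healthy sample, counting ties as one half (rank-based definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else (0.5 if p == n else 0.0) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With the 99-sample modeling cohort, `split_two_to_one(99)` yields 66 training and 33 validation indices. The pairwise AUC is quadratic in the number of samples, which is acceptable for cohorts of this size.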

Statistical analysis
Quantification cycle (Ct) values obtained by the mPCR were used for subsequent data analysis, with a uniform value of 40 assigned to microorganisms that were either undetected or had Ct values exceeding the maximum quantification cycle. A lower Ct value indicated a higher abundance of the organism in the sample. Data preprocessing, model construction, and statistical analysis were performed using Perl (www.perl.org) and R (https://www.r-project.org/) software. The Kolmogorov-Smirnov and Shapiro-Wilk tests were used to test the normality of the data distribution. ANOVA, Pearson's chi-squared test, and Fisher's exact test were used to compare continuous and categorical data. Feature selection and diagnostic performance evaluation of the models were performed by differential analysis and ROC analysis, respectively. Statistical analyses were performed only for data with a sample size of more than 20 cases. A p value less than 0.05 was considered statistically significant.
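The Ct assignment rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an undetected target is represented as `None`, and the function name `normalize_ct` is hypothetical rather than taken from the study's Perl or R code.

```python
from typing import Optional

MAX_CT = 40.0  # value the text assigns to undetected or over-limit targets


def normalize_ct(raw_ct: Optional[float], max_ct: float = MAX_CT) -> float:
    """Return the Ct used for analysis.

    Undetected targets (None, an assumed encoding) and readings above the
    maximum quantification cycle are both replaced by max_ct.
    """
    if raw_ct is None or raw_ct > max_ct:
        return max_ct
    return raw_ct


# Hypothetical readings for one sample: target name -> raw Ct (None = undetected).
sample = {"L. crispatus": 21.3, "G. vaginalis": None, "A. vaginae": 41.7}
cleaned = {target: normalize_ct(ct) for target, ct in sample.items()}
```

Because a lower Ct means higher abundance, assigning 40 places undetected organisms at the low-abundance end of the scale, so they can be analyzed alongside measured values.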

Development and performance evaluation of the BVLaB assay
Species-specific primers designed manually from variable regions are summarized in Supplementary Table S1. To assess the sensitivity of the BVLaB assay, the detection limit of the mPCR was determined using serial dilutions of the template. As shown in Figure 2A and Supplementary Table S1, the newly developed method amplified all targets in the range of 10^3 to 10^8 copies mL^−1 with a strong linear relationship, and all 15 targets were detectable even when the amount of DNA template was as low as 15 copies mL^−1. In addition, no non-specific amplification of other organisms or of no-template controls was detected by the BVLaB assay. These results indicate that the new assay is highly sensitive for the amplification of low quantities of DNA and thus could be applied to detect pathogenic microorganisms.
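The linear relationship mentioned above is usually assessed with a standard curve, in which Ct is regressed on the base-10 logarithm of the template copy number. The sketch below is a generic illustration of that calculation, not the authors' procedure: the function names are hypothetical, and the efficiency formula E = 10^(−1/slope) − 1 is the conventional qPCR definition rather than something stated in the text.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(copies: Sequence[float], ct_values: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of Ct against log10(copies).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log10(c) for c in copies]
    y = list(ct_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope: float) -> float:
    """Conventional qPCR efficiency: 10^(-1/slope) - 1 (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0
```

A slope near −3.32 corresponds to an efficiency near 1.0 (one doubling per cycle), and an r_squared close to 1 reflects the kind of strong linearity reported for the dilution series.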

Population characteristics and molecular testing results
A total of 288 samples were collected for differential analysis and model construction in this study. Information regarding the clinical samples is summarized in Table 1. The detection rates of the 15 microorganisms differed significantly across the samples. Overall, the detection rate of each microorganism ranged from 9.09% to 94.95%. Among BVPs, the detection rate of G. vaginalis was the highest (94.95%), followed by A. vaginae (77.78%), and that of Megasphaera 1 was the lowest (9.09%). Among the four Lactobacillus species, L. iners was detected in the largest number of samples, with detection rates of 85.86% and 75.29% in the Lactobacilli cohort and modeling cohort, respectively, followed by L. crispatus (64.7% and 41.41% in the Lactobacilli and modeling cohort, respectively).

Alteration of BVPs distribution in vaginal discharges
To determine whether the distribution of the 11 BVPs differed between healthy women and BV patients, 99 samples (NO = 25 and BV = 74) in the modeling cohort were independently tested by the assay. As shown in Figures 2B-E, the heatmap displayed the alteration of BVP distribution between the two groups. And the PCA results indicated a significant difference in the distribution of the first two principal components between the  hominis (p < 0.0001), G. vaginalis (p = 0.0138). As expected, these six BVPs were present in higher abundance in the BV group than in the healthy group: A. vaginae (p ≤ 0.0001), BVAB2 (p ≤ 0.0001), Megasphaera 2 (p ≤ 0.0001), M. hominis (p ≤ 0.0001), G. vaginalis (p = 0.0138), and P. bivia (p = 0.0039).

Alteration of Lactobacilli distribution in vaginal discharges
Lactobacillus spp. are closely associated with the maintenance of a stable microenvironment in the FLGT. As presented in Figure 3A, there were statistically significant differences in the distribution of L. crispatus and L. jensenii between the NO and BV groups. The PCA results demonstrated that the distribution features of these four Lactobacillus species can be used to distinguish whether subjects are BV positive (Figure 3B). The detection rate of L. iners was higher in the BV group than in the NO group, whereas those of L. crispatus and L. jensenii showed the opposite trend (Figure 3C). The abundance of L. crispatus and L. jensenii in the BV group was significantly lower than that in the NO group (both p ≤ 0.0001) (Figure 3D). In addition, coexistence analysis (Figure 3E) showed that L. iners was the predominant type in the BV group, mainly in samples in which one or two Lactobacillus species were detected. There was a significant correlation between L. jensenii and L. gasseri, which may be related to the low detection rate of these two lactobacilli (Figure 3E).
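Detection-rate comparisons such as those above are typically made on a 2×2 table (detected or not detected, by group) using Pearson's chi-squared test, one of the tests named in the statistical analysis. The following Python sketch shows that calculation for illustration only: the function name is hypothetical, no continuity correction is applied, and the counts used in any call are placeholders rather than data from this study.

```python
import math
from typing import Tuple


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) for the table

                     detected   not detected
        group 1         a            b
        group 2         c            d

    Returns (chi2, p_value) with 1 degree of freedom. Every row and column
    total must be non-zero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

When expected cell counts are small, Fisher's exact test, also listed in the statistical analysis, is the usual alternative.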

Correlation analysis between lactobacilli and BVPs
The similarity in the distribution of microorganisms in different infection states might be used to predict the interactions between microorganisms. The detection rates and abundance levels of L. iners and all BVPs were higher in the BV group than in the NO group (Figures 4A, B). As shown in Figure 4C, we found positive correlations between the abundance of L. iners and some BVPs in both groups, but the BVP species associated with L. iners were not consistent between the two groups. For example, in the NO group, the abundance of P. bivia and U. urealyticum showed significant positive correlations with L. iners, while in the BV group L. iners had a significant positive correlation with BVAB2 and A. vaginae, which suggests that L. iners may be an independent factor in the development of BV. In addition, we found correlations between various BVPs in the BV group: a significant positive correlation emerged between the abundance of A. vaginae and P. bivia, and G. vaginalis was also significantly positively correlated with M. curtisii, suggesting a synergistic effect of these BVPs on the occurrence of BV.

Candidate indicators for BV diagnosis
Given the observed differences in the distribution of microorganisms between healthy individuals and BV patients, the

Model construction and performance evaluation
Given our findings on microbial distribution patterns in BV patients, three lactobacilli (L. crispatus, L. jensenii, and L. iners), seven BVPs (G. vaginalis, A. vaginae, Megasphaera 2, P. bivia, M. mulieris and M. hominis), and the age of the subjects were used for modeling. As shown in Figure 5, the diagnostic AUC of the DT model was 0.736 (95% CI: 0.594-0.932) in the modeling cohort, with a sensitivity and specificity of 82.6% and 70.0%, respectively. The diagnostic AUC, sensitivity, and specificity of the SVM and RF models were all 100%. In the external validation dataset, the diagnostic AUC of the DT model was 0.745 (95% CI: 0.678-0.813), with a sensitivity and specificity of 81.9% and 67.1%, respectively. The diagnostic AUC of the RF model was 0.830 (95% CI: 0.775-0.885), with a sensitivity and specificity of 97.6% and 68.4%, respectively. The SVM model had a diagnostic AUC of 0.969 (95% CI: 0.945-0.99), with a sensitivity and specificity of 90.4% and 96.1%, respectively. Overall, the SVM algorithm achieved the highest AUC of the three models in both the internal and external validation datasets, indicating that the SVM model has the greatest potential for diagnosing BV. Further investigation of the diagnostic consistency for intermediate scores using the SVM model is presented in Figure 5D. The heatmap results showed that 93.3% (28/30) of the IBV samples were also identified as BV positive by the new method, whereas the diagnosis of the other 6.7% (2/30) of the intermediate BV samples differed between the two methods.

FIGURE 4 Correlation analysis between key lactobacilli and BV-related microorganisms in the vaginal microenvironment. (A, B) Changes in the detection rate and abundance of 15 microorganisms between healthy women and BV patients. (C) The lower triangular correlation matrix shows the correlations between the distribution of 15 microorganisms in the BV and healthy groups. Mean difference represents the difference in Ct value of a microorganism between the BV group and the healthy group; a positive value indicates that the content of the indicator in the healthy group is lower than that in the BV group. GV, Gardnerella vaginalis; AV, Atopobium vaginae; BF, Bacteroides fragilis; BVAB2, Bacterial vaginosis-associated bacteria 2; UU, Ureaplasma urealyticum; MM, Mobiluncus mulieris; MC, Mobiluncus curtisii; MH, Mycoplasma hominis; M1, Megasphaera 1; M2, Megasphaera 2; PB, Prevotella bivia; LI, Lactobacillus iners; LJ, Lactobacillus jensenii; LG, Lactobacillus gasseri; LC, Lactobacillus crispatus. *, p<0.05; **, p<0.01; ***, p<0.001.

Discussion
Several previous studies have reported the high sensitivity and accuracy of mPCR methods for the diagnosis of BV (Cartwright et al., 2012; Hilbert et al., 2016; Gaydos et al., 2017; Coleman and Gaydos, 2018; van der Veer et al., 2018; Carter et al., 2023). For example, the Allplex bacterial vaginosis assay, a multiplex PCR-based test that combines quantitative results for Lactobacillus spp., G. vaginalis, and A. vaginae with qualitative detection of Megasphaera 1, B. fragilis, BVAB2, and Mobiluncus spp., has been reported to have a sensitivity and specificity of 65% and 98%, respectively (Drew et al., 2020). Kusters et al. also developed a semiquantitative multiplex PCR assay for five microorganisms (G. vaginalis, A. vaginae, Megasphaera 1, L. crispatus, and L. iners) for BV diagnosis (Kusters et al., 2015). The FDA-approved nucleic acid-based diagnostic tests for BV, such as the BD MAX ™ Vaginal Panel (Becton, Dickinson and Company, United States) and the Aptima ® BV Assay (Hologic, Inc., San Diego, CA), also target Lactobacillus spp. (two to three species) and several BVPs, and both have high sensitivity but moderate specificity. These studies suggest that lactobacilli, like pathogenic bacteria, are a good indicator of changes in the vaginal microenvironment. Moreover, in addition to the known BV-related pathogens such as G. vaginalis and A. vaginae, a high proportion of genital mycoplasma (M. hominis and U. urealyticum) infections has been found in patients with BV, and some studies have shown that these genital mycoplasma infections are also related to the drug resistance of BV (Lendamba et al., 2022; Rosário et al., 2023). However, none of the previous methods could reflect changes in the levels of the representative species of the five CSTs at the same time. Therefore, we developed the BVLaB assay, which runs on general-purpose fluorescence PCR instruments, for key microorganisms in the five CSTs and microorganisms with high detection rates reported in previous studies (Kinoshita et al., 2014; Javed et al., 2019; Lendamba et al., 2022; Li et al., 2022; Carter et al., 2023).
Similar to previous reports (Gryaznova et al., 2022; Rak et al., 2022; Roachford et al., 2022), the detection rates and/or abundance of G. vaginalis, P. bivia, A. vaginae, BVAB2, Megasphaera 2 (not Megasphaera 1), and M. hominis changed significantly during the transition from the healthy state to the diseased state. These results may indicate a high abundance of G. vaginalis and P. bivia in the FLGT and suggest that these two bacteria act as early colonizers, whereas A. vaginae and other BV-associated bacteria are secondary colonizers, and that the early colonizers may evade the immune system while forming a bacterial vaginosis biofilm (Muzny et al., 2019; Randis and Ratner, 2019; Muzny et al., 2020). Significant differences in the genomic characteristics and metabolic processes of the four lactobacilli may lead to their different roles in the vaginal microenvironment. A genome-wide study reported that L. iners has a smaller genome than L. crispatus, L. jensenii, and L. gasseri (Kwak et al., 2020; Zheng et al., 2021). L. crispatus, L. jensenii, and L. gasseri are the main producers of the D(−) isomer of lactic acid (Edwards et al., 2019; Kerry-Barnard et al., 2022), whereas L. iners mainly produces the L(+) isomer (van Houdt et al., 2018; Witkin et al., 2019), which might lead to differences in host protection and inhibition of pathogen colonization in the CSTs dominated by these microorganisms. In this study, the difference in the distribution of L. crispatus and L. iners between the healthy and diseased states was similar to that reported in previous studies (Ravel et al., 2011; Javed et al., 2019). In addition, we found a significant positive correlation between the abundance of L. iners and the abundance of multiple BVPs, indicating that an increase in L. iners in the vaginal microenvironment is related to an increased risk of BV (Nilsen et al., 2020; Zhu et al., 2022). The differential distribution of these microorganisms suggests that they have the potential to serve as molecular markers for BV diagnosis.
The "grey zone" of intermediate Nugent scores has been one of the most controversial issues in the diagnosis of BV. Sometimes, in these cases, some of the symptomatic relapses were due to   (Campisciano et al., 2021). In our study, we also observed misdiagnosis of the intermediate microbiota by Gram staining that was not confirmed by the SVM-based qPCR method. These results may indicate that molecular techniques can better elucidate qualitative and quantitative changes in the vaginal microbiota, helping clinicians decipher some of the "grey zones" in clinical practice. Furthermore, as age affected prediction accuracy, we constructed BV diagnostic models based on differences in microbial distribution patterns and found that the SVM algorithm outperformed the other two modeling strategies. The SVM model was more robust than the models built using the other algorithms, and this advantage was confirmed by the performance evaluation metrics obtained in cross-validation. Although some studies have shown that machine learning algorithms have great potential for building and optimizing risk assessment and diagnosis models, there are still barriers to applying such techniques in clinical settings, such as limited interpretability and instability (Hardy et al., 2021; Song and Lee, 2022).
The present study has some limitations. Although the distribution characteristics of four lactobacilli and several key BVPs in the vaginal secretions of healthy women and BV patients have been identified, the sample size used in this study is still relatively small. In addition, the patient population was exclusively Chinese women, and the results may not be generalizable to other racial groups because the microbiome may differ among groups in both normal and disease states. Therefore, this is an exploratory and preliminary study, and further studies with larger sample sizes are needed to confirm these results. Moreover, the stability and accuracy of models developed using machine learning algorithms depend closely on sample size, and more data can help eliminate the interference of potential confounding and nonrandom factors in the model. Furthermore, the synergistic, antagonistic, and additive effects of vaginal microorganisms in the transition from the healthy state to the diseased state are still not clear. We therefore propose to address this limitation in further studies with larger sample sizes. Our subsequent work will also focus on the underlying causes of microbial changes and abnormalities during different stages of disease progression.

Conclusion
In conclusion, a highly sensitive, specific, indigenous, single-run mPCR for BV diagnosis has been developed, which can simultaneously detect L. crispatus, L. gasseri, L. jensenii, L. iners and 11 key BVPs. This preliminary study also provided information about the distribution characteristics of BV-associated microorganisms in the vaginal secretions of healthy women and BV patients and combined these with machine learning algorithms to construct a diagnostic model of BV, which may contribute to understanding the dynamic changes of the microbial community in BV and provides a rapid, comprehensive, and accurate diagnostic strategy.
, Lactobacillus, Trichomonas, blastospore, catalase, cleanliness, leukocyte esterase, sialidase, and other relevant indicators. According to the test results, the samples were divided into the following types: (1) Normal flora: the density of vaginal flora was grade II-III, the diversity of vaginal flora was grade II-III, the dominant bacteria were Lactobacillus spp., no other pathogens were detected, and the vaginal microbial function was normal. (2) Aerobic vaginitis (AV): Donders' score ≥ 3 points; positive hydrogen peroxide, leukocyte esterase, and β-glucuronidase/coagulase results can be used for auxiliary diagnosis. (3) Vulvovaginal candidiasis (VVC): Gram-positive ovoid blastospores or tubular pseudohyphae were observed under the oil immersion objective. (4) Trichomonas vaginitis (TV): a large number of white blood cells with Trichomonas spp. were observed. (5) BV: Nugent score ≥ 7 points, clue cells positive, pH ≥ 4.4, and Whiff test positive. (6) Intermediate microbiota (IBV): the Nugent score was 4 to 6 points. After excluding 3397 cases with other reproductive tract infections (such as VVC, TV, etc.

FIGURE 1 Flowchart of the three phases in this study: I. design and development of the species-specific 15-plex PCR assay; II. case-control study of the association with bacterial vaginosis and selection of candidate microbial markers for the diagnosis of bacterial vaginosis; III. development of machine learning diagnostic models and validation tests on clinical samples.
FIGURE 2 Amplification plots of 15 microorganisms using the new mPCR assay and the distribution pattern of 11 BV pathogenic microorganisms (BVPs) in vaginal discharges. (A) Amplification results of the 15 targets by the mPCR assay. (B) Heatmap showing the differential abundance of 11 BVPs between healthy women and BV patients. (C) Relationships among samples visualized by principal component analysis. (D, E) Differences in detection rate and abundance between healthy women and BV patients. Ct, quantification cycle values in PCR. **, p<0.01; ***, p<0.001.
FIGURE 3 Distribution pattern of four species of Lactobacillus spp. in vaginal discharges. (A) Heatmap of hierarchical clustering analysis showing the differential abundance of four lactobacilli between healthy women and BV patients. (B) Relationships among samples visualized by principal component analysis. (C, D) Differences in detection rate and abundance between healthy women and BV patients. (E) Barplot and lower triangular correlation matrix showing the co-detection of four species of Lactobacillus spp. LC_1sps indicates that only Lactobacillus crispatus was detected in the sample, LC_2sps indicates that Lactobacillus crispatus and one of the other lactobacilli were detected in the sample, etc. LI, Lactobacillus iners; LJ, Lactobacillus jensenii; LG, Lactobacillus gasseri; LC, Lactobacillus crispatus. Ct, quantification cycle values in PCR. *, p<0.05; **, p<0.01; ***, p<0.001; ****, p<0.0001.
FIGURE 5 Analysis results of the three machine learning models in the modeling, external validation, and intermediate microbiota datasets. (A-C) Performance of the BV diagnostic models constructed by the three machine learning algorithms in the internal and external validation datasets. (D) Prediction results of the constructed SVM model on the intermediate vaginal flora samples. ROC, receiver operating characteristic; AUC, area under the curve; Sens, sensitivity; Spec, specificity; CI, 95% confidence interval; DT, decision tree algorithm; RF, random forest algorithm; SVM, support vector machine algorithm.

TABLE 1
Clinical and laboratory statistics collected from the modeling and validation datasets.
respectively. Moreover, ROC analysis was then performed based on the abundance of each microorganism (Ct value), and the cut-off value (CoV) with the maximum Youden's index was used as the BV diagnostic criterion. L. crispatus showed the best diagnostic performance (CoV: 21.24, sensitivity 97%, specificity

TABLE 2
The results of performance evaluation of each microorganism of detection status as a diagnostic indicator for BV.

worsening of the preexisting underdiagnosed dysbiosis state rather than BV relapse. For example, Campisciano et al. showed that only 17 (16.7%) of 102 women diagnosed as intermediate BV by Gram's stain were confirmed by qPCR as an intermediate clinical picture (partial BV)