A Retrospective Paired Comparison Between Untargeted Next Generation Sequencing and Conventional Microbiology Tests With Wisely Chosen Metagenomic Sequencing Positive Criteria

Objectives: To evaluate the performance of metagenomic next generation sequencing (mNGS), using adequate criteria, for the detection of pathogens in lower respiratory tract (LRT) samples in a paired comparison with conventional microbiology tests (CMT). Methods: One hundred sixty-seven patients from four different intensive care units (ICUs) in mainland China during 2018, with both mNGS and CMT results of LRT samples available, were reviewed. The reads per million ratio (RPMsample/RPMnon-template-control ratio, abbreviated RPMsample/RPMNTC ratio) and the standardized strictly mapped reads number (SDSMRN) were the two criteria chosen for identifying positive pathogens reported from mNGS. A McNemar test was used for the paired comparison between mNGS and CMT. Results: One hundred forty-nine cases were included in the final analysis. The RPMsample/RPMNTC ratio criterion achieved higher accuracy than the SDSMRN criterion for bacteria, fungi, and viruses [bacteria (RPMsample/RPMNTC ratio vs. SDSMRN), 65.1 vs. 55.7%; fungi, 75.8 vs. 71.1%; DNA virus, 86.3 vs. 74.5%; RNA virus, 90.9 vs. 81.8%]. In the paired comparison with culture, mNGS was superior in bacteria detection only when an SDSMRN ≥3 was used as the positive criterion [SDSMRN positive, 92/149 (61.7%); culture positive, 54/149 (36.2%); p < 0.001]; however, it identified significantly more fungi and DNA viruses than CMT with either criterion [fungi (RPMsample/RPMNTC ratio vs. SDSMRN vs. culture), 23.5 vs. 29.5 vs. 8.7%, p < 0.001; DNA virus (RPMsample/RPMNTC ratio vs. SDSMRN vs. PCR), 14.1 vs. 20.8 vs. 11.8%, p < 0.05]. Conclusions: Metagenomic next generation sequencing may contribute to revealing the LRT infection etiology in hospitalized patients with potential fungal infections and in settings with limited access to multiplex PCR of LRT samples, provided that a suitable criterion such as the RPMsample/RPMNTC ratio is chosen.


INTRODUCTION
Lower respiratory tract (LRT) infection is a common cause of intensive care unit (ICU) admission. The Management of Severe sepsis in Asia's Intensive Care unitS (MOSAICS) study revealed that among 1,285 severe sepsis patients admitted to Asian ICUs, 37.4% of the sources of infection were attributed to the lungs (1). Meanwhile, acute respiratory infection often leads to an unfavorable outcome, with in-hospital mortality increasing by 21.8% over a decade in one French region (2). Yet a point prevalence study of ICU infection, which recruited 1,150 centers, reported that less than two-thirds (5,259/8,135, 65%) of the ICU patients with probable or definite infections had at least one positive culture (3). Even when cultures, targeted molecular methods, and serology were combined to routinely investigate the microbial etiology of LRT infection, Leven et al. found that the potential pathogen detection rate was only 59% (1,844/3,104) (4). In approximately 40% of the cases, the causative agent remained undetermined by conventional microbiological tests (CMT).
Next generation sequencing, a high-throughput technique whose cost has fallen substantially since 2004 (5), is now widely used in clinical metagenomics, specifically for unbiased pathogen detection. In particular, metagenomic next generation sequencing (mNGS) allows for the discovery and taxonomic classification of microorganisms, which could have a major impact on infectious disease diagnosis (6).
Large studies have evaluated the diagnostic performance of mNGS compared with CMTs, either as a first-line test or as a last resort in clinical scenarios. Parize et al. designed a prospective study with a cohort of 101 patients and a 30-day follow-up. They reported that untargeted next generation sequencing had a high negative predictive value of 98.4% (95% CI 95.3-100%) for bacteria and viruses, but was limited in the identification of fungi or parasites (7). Xing et al. reported an mNGS detection rate of 57% in a series of 213 cases, while in the laboratory of Chiu, mNGS succeeded in 32 central nervous system (CNS) infection diagnoses from 204 patients. The performance of mNGS remains controversial, and studies in LRT specimens are limited (8,9).
A multicenter study was therefore conducted in patients with suspected LRT infections for whom both mNGS and CMT results from LRT samples were available, with the aim of interpreting mNGS results fairly for common infectious agents.

Study Design
One hundred sixty-seven patients admitted to four different ICUs in mainland China during 2018 were retrospectively studied. These cases were reviewed against the following inclusion criteria: (i) patients with respiratory symptoms requiring oxygen therapy or any other organ support in the ICU, (ii) patients with radiology images (chest X-ray or CT scan) showing lung infiltration, (iii) patients with a suspected infection causing the conditions described in criteria (i) and (ii), and (iv) patients with lower respiratory tract samples collected for both mNGS and CMT.

Data Collected From Medical Records
Demographic information, comorbidities, immune state, disease severity upon ICU admission, duration between ICU admission and the attempt to sequence the LRT specimen, ICU intervention, clinical outcome, and conventional microbiological test results were all collected from the medical record of each patient through the hospital information system. Disease severity was assessed using the Acute Physiology and Chronic Health Evaluation (APACHE) II score (10) and the Sequential Organ Failure Assessment (SOFA) score (11). A higher score indicates a more severe clinical presentation. ICU interventions in this study included, but were not limited to, vasopressor utilization, invasive or non-invasive mechanical ventilation, and continuous renal replacement therapy. Clinical outcome was defined as the length of stay in the ICU and ICU mortality.

Retrieving the LRT Sample and CMT Positive Agreement on Clinically Significant Microbes
Lower respiratory tract samples such as sputum, tracheal aspirates, or bronchoalveolar lavage fluids were investigated in this study. The available CMTs of LRT samples (smear, culture, and multiplex PCR) in participating centers and the definition of significant pathogens were consistent with a previous publication (Supplementary Tables 1, 2) (12). Microbes detected via CMT were identified as true pathogens rather than commensals or contaminants. Serum antigen test results were also reviewed, while serum antibody test results were excluded because they fail to distinguish acute infection. More specifically, immunoglobulin G (IgG) antibodies reported in our cohort lacked documentation of a 4-fold rise, while IgM antibody test results are not recommended for the infection diagnoses of adenovirus, Mycoplasma pneumoniae, and Legionella/Chlamydia spp. (13,14).

Library Construction, Sequencing, Bioinformatic Analysis, and Criteria for a Positive Result
After extraction, nucleic acid fragments underwent end repair, adapter ligation, and amplification to construct the library, the quality of which was assessed with an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, United States). The qualified library was sequenced on the BGISEQ-100/50 platform. A non-template specimen was run in parallel for the purpose of contamination control (non-template control, NTC).
Raw sequencing data were filtered by removing low-quality reads, and clean reads were mapped to the human reference genome (hg19) to subtract the host portion by Burrows-Wheeler Alignment. The remaining reads were classified by simultaneously mapping to four microbial genome databases, composed of 4,152 whole genome sequences of viral taxa, 3,446 bacteria, 206 fungi, and 140 parasites associated with human disease (15).
The mapped reads number of each microbe in each respiratory sample was normalized in three ways (15,16).
(i) Mapped reads abundance relative to other microbes in the same sample:

$$\mathrm{RPM} = \frac{\mathrm{MRN} \times 10^{6}}{\text{Total sequencing reads}}$$

where RPM is reads per million and MRN is the mapped reads number.

(ii) Mapped reads abundance relative to the same microbe at the species level in other samples of this study cohort using Z-score:

$$Z = \frac{\chi - \mu}{\sigma}$$

χ is the log10 transformation of RPM of one microbe. µ is the mean of log10 RPM of the same microbe in this study cohort. σ is the SD of log10 RPM of the same microbe in this study cohort.

(iii) Uniquely mapped reads in this study were described as stringently mapped reads number (SMRN) and normalized as SDSMRN:

$$\mathrm{SDSMRN} = \frac{\mathrm{SMRN} \times 20 \times 10^{6}}{\text{Total sequencing reads}}$$

The positive criteria for the mNGS result were set as follows, which were consistent with literature reviews at the species level and those that found that microbes were evidenced of pathogenicity in the lungs (15-17). A value of "1" was adjusted to the RPM of the non-template control with no pathogenic bacteria, fungi, or viruses (18
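The three normalizations above, together with the RPMsample/RPMNTC ratio, can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the function names are hypothetical, the use of the sample SD for σ is an assumption, and the substitution of 1 for an NTC RPM of zero follows our reading of the adjustment described above.

```python
import math
from statistics import mean, stdev


def rpm(mapped_reads: int, total_reads: int) -> float:
    """Reads per million: MRN * 10^6 / total sequencing reads."""
    return mapped_reads * 1e6 / total_reads


def sdsmrn(smrn: int, total_reads: int) -> float:
    """Standardized SMRN: SMRN * 20 * 10^6 / total sequencing reads."""
    return smrn * 20e6 / total_reads


def rpm_ratio(rpm_sample: float, rpm_ntc: float) -> float:
    """RPMsample / RPMNTC.

    Assumption: when the microbe is absent from the non-template control
    (RPM of 0), the NTC RPM is set to 1 so the ratio stays defined.
    """
    denominator = rpm_ntc if rpm_ntc > 0 else 1.0
    return rpm_sample / denominator


def z_scores(cohort_rpms: list[float]) -> list[float]:
    """Z-score of log10(RPM) for one microbe across the cohort.

    Assumption: sigma is the sample standard deviation (n - 1 denominator).
    All RPM values must be positive for the log transform.
    """
    logs = [math.log10(value) for value in cohort_rpms]
    mu = mean(logs)
    sigma = stdev(logs)
    return [(x - mu) / sigma for x in logs]


# Illustrative (made-up) numbers: 50 mapped reads and 3 stringently
# mapped reads in a run of 20 million total reads.
sample_rpm = rpm(50, 20_000_000)
sample_sdsmrn = sdsmrn(3, 20_000_000)
sample_ratio = rpm_ratio(sample_rpm, 0.0)
```

With a total yield of 20 million reads, SDSMRN equals the raw SMRN, which is the point of the factor of 20 × 10^6: it rescales every sample to a common sequencing depth before a threshold is applied.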

Statistical Analysis
A Mann-Whitney U-test was used for non-parametric data analysis. A 2 × 2 contingency table was drawn to calculate sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Data were reported as absolute values with 95% CI (7). The marginal frequencies of the 2 × 2 table were tested by a McNemar test. Significance was considered when p < 0.05. The SPSS 22.0 software was used.
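As an illustration of the calculations described above, the following Python sketch derives the diagnostic metrics from a 2 × 2 table and computes a two-sided exact McNemar p-value from the discordant pairs. It is not the SPSS procedure used in the study; the function names are hypothetical, the choice of the exact binomial form of the test is an assumption, and the counts in the usage lines are made up.

```python
from math import comb


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Metrics from a 2 x 2 table with CMT as the reference.

    tp: mNGS+ / CMT+, fp: mNGS+ / CMT-, fn: mNGS- / CMT+, tn: mNGS- / CMT-.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b and c are the discordant pair counts (positive by one test only).
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * lower_tail)


# Made-up counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=5, tn=45)
p_value = mcnemar_exact(b=0, c=5)
```

Only the discordant pairs enter the McNemar test, so two methods can differ significantly in detection rate even when their overall agreement is high.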

Patient Characteristics and General Sequencing Information
One hundred forty-nine cases were included in the final analysis because the total sequencing reads were missing from the mNGS reports of 18 patients, making both RPM and SDSMRN impossible to calculate. Among these 149 patients [median age 59, interquartile range (IQR) from 47.5 to 70; 104 (69.8%) male], 75 (50.3%) were immunocompromised. Fifty (33.6%) were prescribed prolonged corticosteroids (defined as a minimum dose of 0.3 mg/kg/d of prednisone or an equivalent for at least 3 weeks) (12) as therapy for autoimmune disease [37 (24.8%)], post-transplantation [8 (5.4%)], or other conditions. Twenty-one (14.9%) were receiving chemotherapy for cancer, leukemia, or lymphoma. The median APACHE II and SOFA scores were 20 (IQR from 15 to 24) and 9 (IQR from 6 to 11), respectively. The median ICU stay was 12 days (IQR from 7 to 21.5 days), and ICU mortality was 51%. The LRT samples of patients were retrieved and sent for mNGS a median of 1 day (IQR from 1 to 3 days) after ICU admission (Table 1).

Pathogen Identified by SDSMRN Criterion and Comparison of Two Criteria
The mNGS detected 92 (92/149, 61.7%) bacterial cases that fulfilled SDSMRN ≥3. The SDSMRN criterion identified ∼1.5 times more bacterial cases than the RPM ratio criterion and also had a better bacterial identification rate than culture (p < 0.001). More fungal infections were suggested by SDSMRN ≥3 (44/149, 29.5%) than by RPMsample/RPMNTC ratio ≥1. The detection rate of mNGS for fungi was higher than that of culture with either criterion in the paired analysis of 149 patients (SDSMRN vs. culture, p < 0.001; RPM ratio vs. culture, p < 0.001) (Figure 4). The accuracy of mNGS in detecting fungi was higher than that for bacteria, which was also independent of the criterion (Table 2).

Due to the inability of culturing
Viral etiology was revealed only in one-third (54/149, 36.2%) of the patients in this study when using a multiplex PCR. The viruses targeted by the PCR were limited to cytomegalovirus (CMV), influenza A/B, adenovirus, rhinovirus, respiratory syncytial virus (RSV), metapneumovirus, and parainfluenza virus. Because the samples of 96 patients only underwent a metagenomic DNAseq analysis, from which RNA viruses cannot be detected, the accuracy of mNGS was evaluated for DNA virus and RNA virus cases separately. Both an RPMsample/RPMNTC ratio ≥1 (21/149, 14.1%) and an SDSMRN ≥3 (31/149, 20.8%) identified more DNA virus cases than the PCR (6/51, 11.8%). The mNGS was superior to the PCR in detecting DNA viruses in a paired comparison between each of the two criteria and the multiplex PCR of the 51 cases (RPM ratio vs. PCR, p = 0.007; SDSMRN vs. PCR, p < 0.001). Six (6/53, 11.3%) RNA viral infections were suggested by the RPM ratio criterion and nine (9/53, 17.0%) by SDSMRN ≥1. The PCR revealed fewer cases (5/46, 10.9%), but the difference from mNGS was not significant (RPM ratio vs. PCR, p = 0.5; SDSMRN vs. PCR, p = 0.313) (Figure 4).

DISCUSSION
Microbiological etiology was only revealed in 12.7% of the hospitalized patients with pneumonia diagnoses in mainland China by CMTs (20). The mNGS was seen as a promising tool for its broad-spectrum and unbiased pathogen detection, with a sensitivity increase of 15% in diagnosing infection compared to culture (21). This study is the first to use the RPM ratio and SDSMRN as criteria for the positive identification of pathogens reported by mNGS in such a large-scale cohort of LRT infections. We also made paired comparisons between mNGS and the CMT of LRT samples in terms of their performance in detecting causative microbes. The RPM ratio criterion performed better, with a higher accuracy in identifying bacteria, fungi, and viruses, than the SDSMRN criterion in this cohort. In the paired comparison with culture, mNGS was superior in bacteria detection only when SDSMRN ≥3 was used as the positive criterion, but it identified significantly more fungi and DNA viruses than CMT with either criterion.
The mNGS test has lacked a unified criterion for reporting clinically significant microbes because of varied sequencing instruments, differing bioinformatic analysis pipelines, and the unstandardized productivity and quality control of each platform. Some studies defined mNGS positive as high-ranking microbes by sequencing abundance relative to other microbes in the same sample, probably with the purpose of balancing host background (21,22). The human proportion of the total sequencing yield from an upper respiratory sample like an oropharyngeal swab [median (IQR), 5.1% (1.1-39.1%)] was clearly less than that of an LRT sample, which might result in more reads left for microbe mapping (23). However, a case series also showed that microbes among the top 10 pathogens listed by mNGS were absent in a validation run with nanopore sequencing, which has a much longer read length and a recognized 96.6% sensitivity for pathogen detection in LRT samples (24,25). Other evaluations of mNGS using absolute values as criteria, such as SMRN or unique reads number, remain controversial since the difficulty of extracting nucleic acids varies from species to species and the total sequencing yield varies from sample to sample (26). RPM was believed to be an adequate criterion for mNGS because of its documented success in maximizing specificity for bacteria and sensitivity for fungi or viruses among immunocompromised children with pulmonary infections (18). A study by Zinter also used a combination with a Z-score ≥2 to set thresholds, which was, unfortunately, not applicable to this study: a Z-score ≥2 left only eight mNGS-positive cases (8/149, 5%) in the study population (Supplementary Figure 1). The RPMsample/RPMNTC ratio was also used to subtract contaminants introduced by the platform. The RPM ratio criterion was compared with the SDSMRN criterion, which was reported by Qian et al. to have a decent sensitivity (95%) in detecting respiratory bacteria and fungi (27).
The mNGS sensitivity for fungi detection was similar to that in a study by Li, which also made paired comparisons between mNGS and culture, while the mNGS of this study achieved higher specificity by choosing the criterion of RPMsample/RPMNTC ratio ≥1 (28). Consistent with previous research results, mNGS did not outperform comprehensive CMTs for the diagnosis of invasive pulmonary aspergillosis (IPA) but was superior to Gomori methenamine stains for P. jirovecii pneumonia. When the clinical view was set as the final call of infection and mNGS and simplex PCR of P. jirovecii were compared with the diagnoses of the clinician, both the sensitivity and specificity of mNGS and the P. jirovecii PCR were very close in the previous study (12). In this cohort, with a paired analysis, no significant difference was found between the SDSMRN criterion and P. jirovecii PCR, but the P. jirovecii PCR showed its advantage over the RPM ratio criterion. In a population with highly suspected P. jirovecii pneumonia, there was probably no need to use mNGS to look for a needle in a haystack (28). For viral detection, mNGS is not confined, as multiplex PCR is, to the causative agents hypothesized by the clinician. However, the identification of novel emerging viruses requires annotated microbial genome references as well as long-read and deep sequencing methods (29).

Strengths
The majority of published studies evaluated the diagnostic performance of mNGS by setting the final call for infection by clinicians as the gold standard. However, the mNGS performance in pathogen detection might have been undervalued if clinical impressions involved bias. By reviewing existing reports of both mNGS and CMTs from suspected severe pneumonia patients, this study utilized direct paired comparisons between mNGS and CMT to reveal whether these two methods were significantly different in causative pathogen detection. The study implied that mNGS performed better than fungal culture. Thus, clinicians who are faced with immunocompromised populations or who are working in less-developed areas without access to simplex fungus PCRs or antigen tests are advised to order mNGS for LRT specimens.
The findings also suggest that RPM could be widely used by clinicians for the interpretation of mNGS results and that the RPMsample/RPMNTC ratio criterion performed better regardless of pathogen category.

Limitations
The research was limited in several ways. Not all samples from this cohort had a PCR test of the suspected viral pathogen. The mNGS performance in detecting common viruses responsible for LRT infection was only evaluated in a small proportion of patients because of the need to compare with available PCR results. This would be less of an issue in a future prospective study with all samples sent for both mNGS and PCR screens. Also, the criteria selected for identifying positive microbes reported by mNGS were those documented in previous studies. It would be better to derive a threshold based on the quantity of sequenced species, with quantitative PCR run to validate those in each sample. Lastly, the effects of mNGS results on clinical decisions were not discussed; these would be better evaluated in a randomized controlled trial.

CONCLUSIONS
The RPM ratio criterion performed better with a higher accuracy for bacteria, fungi, and viruses than the SDSMRN criterion. By choosing positive criteria wisely, mNGS may contribute to revealing LRT infection etiology in hospitalized groups of potential fungal infections and in situations with less access to the multiplex PCR of LRT samples from the laboratory.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Peking Union Medical College Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
HQin drafted the paper. JP, LL, JW, LP, XH, MH, HQiu, and BD contributed to the data collection. All authors contributed to the article and approved the submitted version.