Targeted next-generation sequencing for pulmonary infection diagnosis in patients unsuitable for bronchoalveolar lavage

Background: Targeted next-generation sequencing (tNGS) has emerged as a rapid diagnostic technology for identifying a wide spectrum of pathogens responsible for pulmonary infections.

Methods: Sputum samples were collected from patients unable or unwilling to undergo bronchoalveolar lavage. These samples underwent tNGS analysis to diagnose pulmonary infections. Retrospective analysis was performed on clinical data, and the clinical efficacy of tNGS was compared to conventional microbiological tests (CMTs).

Results: This study included 209 pediatric and adult patients with confirmed pulmonary infections. tNGS detected 45 potential pathogens, whereas CMTs identified 23 pathogens. The overall microbial detection rate significantly differed between tNGS and CMTs (96.7% vs. 36.8%, p < 0.001). Among the 76 patients with concordant positive results from tNGS and CMTs, 86.8% (66/76) exhibited full or partial agreement. For highly pathogenic and rare/noncolonized microorganisms, tNGS, combined with comprehensive clinical review, directly guided pathogenic diagnosis and antibiotic treatment in 21 patients. This included infections caused by Mycobacterium tuberculosis complex, certain atypical pathogens, Aspergillus, and nontuberculous Mycobacteria. Among the enrolled population, 38.8% (81/209) of patients had their treatment adjusted based on tNGS results. Furthermore, tNGS findings revealed age-specific heterogeneity in pathogen distribution between children and adults.

Conclusion: CMTs often fall short in meeting the diagnostic needs of pulmonary infections. This study highlights how tNGS of sputum samples from patients who cannot or will not undergo bronchoalveolar lavage yields valuable insights into potential pathogens, thereby enhancing the diagnosis of pulmonary infections in specific cases.


Introduction
Pulmonary infection is a prominent contributor to the global disease burden, causing substantial morbidity and mortality worldwide (1). Prompt identification of pathogens is crucial for initiating timely and appropriate treatment, thereby improving clinical outcomes (2,3). The prevailing diagnostic framework for pulmonary infection relies on physicians constructing a differential diagnosis from patient history, evaluating clinical presentations, interpreting imaging findings, conducting auxiliary tests, and subsequently employing microbiological assays to pinpoint causative agents. However, this conventional approach faces challenges in pathogenic diagnosis because of extended turnaround times, limited sensitivity, and the wide array of potential pathogens, encompassing bacteria, viruses, fungi, and atypical agents. Consequently, the etiology of pulmonary infections remains elusive in 19%-62% of cases (4-6).
Targeted next-generation sequencing (tNGS) has emerged as a technology with notable potential that addresses the limitations of metagenomic NGS (mNGS) while still identifying a broad spectrum of pathogens. Earlier investigations have underscored the efficacy of mNGS in significantly expediting pathogenic diagnosis in pulmonary infections (7,8). Nevertheless, widespread clinical adoption of mNGS has been impeded by its considerable cost, susceptibility to host nucleic acid interference, and the need to detect DNA and RNA separately. By combining ultra-multiplex polymerase chain reaction (PCR) with high-throughput sequencing, tNGS allows simultaneous identification of multiple common pathogens. Although tNGS detects fewer pathogens than mNGS, it offers distinct advantages in terms of the cost and diagnostic process. A preliminary report has demonstrated the effectiveness of tNGS in detecting respiratory pathogens at a quarter of the cost of mNGS (9). Furthermore, another study revealed comparable diagnostic performance between tNGS and mNGS in microbiological testing of bronchoalveolar lavage fluid (BALF) (10).
Both BALF and sputum are commonly used sample types from the lower respiratory tract. Sputum, due to its noninvasive collection procedure and high patient acceptability, proves to be a more accessible option for early pathogenic screening than BALF. Therefore, sputum is often utilized for pathogen detection when obtaining BALF samples is infeasible or declined by patients. Consequently, tNGS of sputum holds promise as a pragmatic approach in such scenarios. However, the existing published evidence supporting the efficacy of sputum-based tNGS in patients with pulmonary infections is predominantly confined to small case series (11).
This study endeavors to elucidate the potential utility of tNGS in the pathogenic diagnosis of pulmonary infections. Employing a tNGS assay targeting 153 pathogens (Supplementary Table S1), we sought to assess its clinical performance in comparison to conventional microbiological tests (CMTs). Furthermore, we aimed to shed light on the heterogeneity of pathogen distribution within the study population based on tNGS results.

Study design
This retrospective case series involved the analysis of 234 sputum samples, collected between April and November 2022 at The First People's Hospital of Qinzhou in China. These samples were subjected to both tNGS and CMTs. Investigators conducted a thorough review of clinical data related to each patient diagnosed with pulmonary infection who underwent both tNGS and computed tomography scans. The study received approval from the local Ethics Committee (approval number: 2022081) and was conducted in accordance with the 1964 Declaration of Helsinki and its subsequent amendments. All data utilized in this study were obtained anonymously and exclusively employed for analysis in this paper. The confidentiality of patient information was rigorously upheld, and informed consent was obtained from the patients.
The inclusion criteria for this study were as follows: (i) patients diagnosed with pulmonary infection; (ii) implementation of both tNGS and CMTs for pathogenic diagnosis; (iii) availability of complete clinical data; and (iv) patients who provided informed consent to participate. Exclusion criteria included: (i) patients declining sample collection for tNGS; (ii) sputum samples failing to meet the tNGS quality standards; and (iii) patients with incomplete clinical data.

Sample collection
Adequate and informative sputum sample collection hinges on proper instructions. Trained nurses guided patients to brush their teeth and rinse their mouth with saline in the morning, take a deep breath, and then forcefully expel sputum from the respiratory tract, making an effort to avoid contamination with oral and nasopharyngeal secretions. The samples were collected in sterile containers with secure lids. Patients were explicitly instructed that sputum from a forceful cough was required, and saliva should not be introduced into the collection cup. For infants or young children unable to produce sputum through coughing, a disposable suction tube was employed to extract sputum under negative pressure. In cases where patients faced challenges in generating sputum through forced expectoration, alternative methods such as sputum induction and tracheal aspirate were considered. It is crucial to emphasize that all these procedures were exclusively conducted by physicians trained in accordance with specific collection protocols. Approximately 1-3 mL of sputum was collected and preserved at −20°C within 48 h for tNGS analysis. Additionally, residual sputum and blood samples from select patients were obtained for CMTs, including smear microscopy, culture, PCR, and serologic testing (Supplementary Presentation S1).

Targeted next-generation sequencing

Sample preparation
A volume of 650 μL of the sample was liquefied by combining it with an equal volume of 80 mmol/L dithiothreitol in a 1.5 mL centrifuge tube. The mixture was homogenized for 15 s using a vortex mixer. Meanwhile, a positive control and a negative control from the Respiratory Pathogen Detection Kit (KS608-100HXD96, KingCreate, Guangzhou, China) were set up to monitor the whole tNGS experimental process.

Nucleic acid extraction
Subsequently, 500 μL of the homogenate was utilized for total nucleic acid extraction and purification via the MagPure Pathogen

Library construction and sequencing
The library was constructed using the Respiratory Pathogen Detection Kit, and a no-template control was set up to monitor the library construction and sequencing process. This process encompassed two rounds of PCR amplification. The sample nucleic acid and cDNA were employed as templates, and a set of 153 microorganism-specific primers was selected for ultra-multiplex PCR amplification to enrich the target pathogen sequences, spanning bacteria, viruses, fungi, mycoplasma, and chlamydia. After the amplification, PCR products underwent purification with beads, followed by amplification using primers containing sequencing adapters and distinct barcodes. The quality and quantity of the constructed library were evaluated using the Qsep100 Bio-Fragment Analyzer (Bioptic, Taiwan, China) and Qubit 4.0 fluorometer (Thermo Scientific, Massachusetts, United States), respectively. Generally, the library fragments exhibited sizes within the approximate range of 250-350 bp, and the library concentration was maintained at a minimum of 0.5 ng/μL. The concentration of the mixed library was reassessed and subsequently diluted to a final concentration of 1 nmol/L. Subsequently, 5 μL of the mixed library was mixed with 5 μL of freshly prepared NaOH (0.1 mol/L). Following brief vortexing and centrifugation, the library was incubated at room temperature for 5 min. The diluted and denatured library was subsequently subjected to sequencing on an Illumina MiniSeq platform using a universal sequencing reagent kit (KS107-CXR, KingCreate, Guangzhou, China). On average, each library yielded approximately 0.1 million reads, with a sequencing read length of single-end 100 bp.
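The concentration steps described above can be checked with a short calculation. The sketch below is illustrative only: it assumes the conventional approximation of 660 g/mol per base pair for double-stranded DNA (not stated in the text), and the function names are hypothetical.

```python
def ng_per_ul_to_nmol_per_l(conc_ng_ul: float, mean_len_bp: float,
                            bp_mass: float = 660.0) -> float:
    """Convert a dsDNA library concentration from ng/uL to nmol/L.

    bp_mass is the assumed average molar mass of one base pair (g/mol).
    """
    return conc_ng_ul * 1e6 / (bp_mass * mean_len_bp)


def conc_after_mixing(conc: float, vol: float, added_vol: float) -> float:
    """Concentration of a solute after adding added_vol of a solute-free liquid."""
    return conc * vol / (vol + added_vol)


# Minimum library concentration from the text (0.5 ng/uL) at an assumed
# mid-range fragment size of 300 bp.
stock_nmol_l = ng_per_ul_to_nmol_per_l(0.5, 300)

# Mixing 5 uL of 1 nmol/L library with 5 uL of 0.1 mol/L NaOH halves both.
library_nmol_l = conc_after_mixing(1.0, 5, 5)
naoh_mol_l = conc_after_mixing(0.1, 5, 5)

print(round(stock_nmol_l, 2), library_nmol_l, round(naoh_mol_l, 3))
```

Under these assumptions, a library at the stated minimum concentration is still above the 1 nmol/L target, so it can be diluted rather than concentrated before denaturation.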

Bioinformatics analysis
Sequencing data were analyzed using the data management and analysis system (v3.7.2, KingCreate). The raw reads were first identified by their adapter sequences. Reads longer than 50 bp were retained, followed by low-quality filtering to retain reads with Q30 > 75%, ensuring high-quality data. The single-end reads were then aligned against a self-built clinical pathogen database to determine the read count of each specific amplification target in each sample. The reference sequences used for read mapping were curated from several sources, including the GenBank, RefSeq, and Nucleotide databases from NCBI.
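The two read-level filters can be sketched as follows. This is not the KingCreate pipeline: it assumes that "Q30 > 75%" means more than 75% of the bases in a read have a Phred score of at least 30, and that quality strings use the Phred+33 encoding. The function name and parameters are hypothetical.

```python
def passes_read_filter(seq: str, qual: str, min_len: int = 50,
                       q30_fraction: float = 0.75, offset: int = 33) -> bool:
    """Return True if a single-end read passes length and Q30 filtering.

    seq and qual are the sequence and quality lines of one FASTQ record
    and are assumed to have the same length.
    """
    # Keep only reads strictly longer than min_len bases.
    if len(seq) <= min_len:
        return False
    # Count bases whose decoded Phred score is at least 30.
    q30_bases = sum(1 for ch in qual if ord(ch) - offset >= 30)
    # Require the Q30 fraction to strictly exceed the threshold.
    return q30_bases / len(qual) > q30_fraction


# 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
good = passes_read_filter("A" * 60, "I" * 60)
borderline = passes_read_filter("A" * 60, "I" * 45 + "5" * 15)
print(good, borderline)
```

The borderline read has exactly 75% of its bases at Q30 or better, so it is rejected under the strict inequality assumed here.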

Interpretation of tNGS results
In line with the experimental principle of targeted amplification of microbial sequences using specific primers, the amplicon coverage and normalized read count of detected microorganisms within the sample constituted the primary interpretation indicators. To categorize a microorganism as a potential pathogen, the following criteria were established: (i) bacteria (excluding Mycobacterium tuberculosis complex), fungi, and atypical pathogens: amplicon coverage ≥50% and normalized read count ≥10; (ii) viruses: amplicon coverage ≥50% and normalized read count ≥3, or normalized read count ≥10; (iii) Mycobacterium tuberculosis complex: normalized read count ≥1.
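The three reporting criteria above can be written as a small decision function. The sketch below is a minimal illustration, not the vendor's software: the `Detection` class, its field names, and the category labels are hypothetical, and amplicon coverage is expressed as a fraction.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    organism_class: str  # "bacteria", "fungi", "atypical", "virus", or "mtbc"
    coverage: float      # amplicon coverage as a fraction (0.0 to 1.0)
    norm_reads: float    # normalized read count


def is_potential_pathogen(d: Detection) -> bool:
    """Apply the class-specific thresholds stated in the text."""
    if d.organism_class == "mtbc":
        # Mycobacterium tuberculosis complex: any normalized read counts.
        return d.norm_reads >= 1
    if d.organism_class == "virus":
        # Either moderate reads with good coverage, or high reads alone.
        return (d.coverage >= 0.5 and d.norm_reads >= 3) or d.norm_reads >= 10
    if d.organism_class in {"bacteria", "fungi", "atypical"}:
        # Both coverage and read-count thresholds must be met.
        return d.coverage >= 0.5 and d.norm_reads >= 10
    raise ValueError(f"unknown organism class: {d.organism_class}")


print(is_potential_pathogen(Detection("virus", 0.2, 12)))
print(is_potential_pathogen(Detection("bacteria", 0.2, 12)))
```

The two example calls differ only in organism class: a virus with low coverage still qualifies on read count alone, whereas a bacterium with the same values does not.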
Subsequently, two experienced clinicians independently conducted a comprehensive assessment of the patient's clinical data to determine the presence of pulmonary infection and the clinical relevance of potential pathogens. This assessment included the patient's medical history, symptoms, imaging findings, tNGS results, and CMT outcomes. In cases of divergent interpretations, consultation with a senior physician was pursued to achieve a consensus.

Statistical analyses
Quantitative variables were represented as medians with accompanying ranges, while categorical variables were presented as counts with percentages. Statistical analyses were performed using SPSS 22.0 software (IBM, Armonk, NY, United States). A p-value < 0.05 was considered statistically significant.
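The section does not name the test used to compare detection rates between the two methods. For paired positive/negative results obtained on the same patients, an exact McNemar test is one common choice. The sketch below illustrates that assumption alongside the median-and-range summary. The input values are hypothetical, the helper names are invented for illustration, and this is not the SPSS procedure the authors ran.

```python
from math import comb
from statistics import median


def median_range(values):
    """Return (median, minimum, maximum) for a quantitative variable."""
    return median(values), min(values), max(values)


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant pair counts.

    b: pairs positive by method A only; c: pairs positive by method B only.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Lower tail P(X <= k) for X ~ Binomial(n, 0.5), doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical inputs, not study data.
ages_years = [1, 3, 4, 7, 62]
print(median_range(ages_years))
print(mcnemar_exact(b=10, c=0) < 0.05)
```

Only the discordant pairs enter the test, which is why it suits comparisons in which both methods are applied to every patient.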

Patient characteristics
Initially, a total of 234 patients diagnosed with pulmonary infection and undergoing tNGS were considered for review and potential enrollment in this study. Among them, 25 patients were subsequently excluded because of the absence of paired CMTs (n = 19), repetition (n = 4), or incomplete data (n = 2). Consequently, a final cohort of 209 patients met the enrollment criteria and underwent further analysis (Figure 1). The median age of this cohort was 4 years, with 141 (67.5%) of the patients being male. Comorbidity was identified in 61.7% of these patients. A majority of patients (180, 86.12%) had been exposed to antibiotics before sample collection (Table 1).
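The enrollment arithmetic and reported proportions can be reproduced from the counts given in this paragraph. The following sketch uses only those counts; the variable and function names are illustrative.

```python
screened = 234
exclusions = {
    "no paired CMTs": 19,
    "repeated patient": 4,
    "incomplete data": 2,
}

# Patients remaining after all exclusions are applied.
enrolled = screened - sum(exclusions.values())


def percent(n: int, total: int, digits: int = 1) -> float:
    """Percentage of total represented by n, rounded to the given digits."""
    return round(100 * n / total, digits)


male_pct = percent(141, enrolled)
antibiotic_pct = percent(180, enrolled, digits=2)
print(enrolled, male_pct, antibiotic_pct)
```

The three exclusion counts sum to the 25 excluded patients, and the derived percentages match the values reported in the text.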
Additionally, measles virus was detected through tNGS in 2 patients who had been vaccinated within 1 month; however, measles was not considered in the diagnosis. Four microorganisms were identified solely via CMTs: Elizabethkingia meningoseptica, Burkholderia cepacia, S. epidermidis, and C. tropicalis. Each of these microorganisms was detected in only 1 patient within the study population, with the latter two not falling within the detection range of tNGS.

Clinical impact of tNGS
It is important to highlight that within this study, standard reference results for pathogenic diagnosis were unavailable because of the absence of qualified BALF samples for detection and the lack of quality evaluation for individual sputum samples. Nevertheless, through the application of tNGS, several highly pathogenic and rare/

Discussion
The emergence of mNGS has ushered in significant advancements in infectious disease diagnosis, particularly for pulmonary infections. mNGS has been embraced by the medical community as a supplementary diagnostic method alongside CMTs because it identifies pathogens more swiftly, accurately, and comprehensively than CMTs. However, the considerable cost associated with mNGS has posed a substantial hurdle to its widespread clinical applicability (12-14). This constraint has spurred the development and clinical adoption of tNGS. Notably, tNGS does not merely reduce expenses; rather, it strikes a balance between cost and detection capabilities. Our previous study indicated that the tNGS technology employed here exhibits comparable performance to mNGS in detecting respiratory pathogens while reducing costs by three-quarters (9), highlighting the potential utility of tNGS in pathogenic diagnosis.
Within this study, we aimed to explore the clinical utility of tNGS for diagnosing pulmonary infections in a patient population unable or unwilling to undergo bronchoalveolar lavage and to compare its efficacy against CMTs. Accordingly, we employed a tNGS assay targeting 153 pathogens to evaluate its detection performance in sputum samples, addressing concerns related to sample collection, testing costs, turnaround time, and accessibility. Among the enrolled patients, tNGS identified a greater number of potential pathogens (45 vs. 23), encompassing common or clinically relevant respiratory pathogens, including bacteria, viruses, fungi, and atypical pathogens. Moreover, tNGS exhibited a higher positive detection rate compared to CMTs (96.7% vs. 36.8%). These findings indicate that a considerable portion of patients may harbor potential pathogens that remain undetected despite undergoing CMTs. Furthermore, among the 76 patients exhibiting positive results for both tNGS and CMTs, 86.8% demonstrated either complete or partial consistency between the two methods. This evidence underscores the promising potential of tNGS in these patients. However, interpreting tNGS results presents significant challenges. A growing body of evidence suggests that the respiratory tract is not a sterile environment and that potentially pathogenic microorganisms are ubiquitous (15). Noninvasive or minimally invasive sputum samples from the lower respiratory tract are prone to contamination by the patient's own endogenous upper respiratory tract flora (16,17). Given that pulmonary infections often arise from the patient's own flora (18), such contamination can complicate the interpretation of tNGS results in sputum samples. The central question remains: does the presence of potential pathogens indicate colonization or infection? If infection is present, does it affect the upper or lower respiratory tract? However, in most cases, definitively determining whether a specific microorganism has
caused infection based solely on

Heterogeneity in the pathogen spectrum between children and adults.

and assist clinicians to improve the differential diagnosis and identification of mixed infections.

Another primary objective of this study was to investigate the distinctions in the pathogen spectrum between children and adults through the analysis of sputum samples. This endeavor provides valuable insights into the likely potential pathogens or colonizing flora prevalent among patients of varying ages. Bacterial infections exhibited a diverse age-related pattern. Numerous respiratory bacteria colonize the respiratory tract of asymptomatic individuals and can subsequently cause opportunistic pulmonary infections. This pattern is observed in organisms such as S. pneumoniae, H. influenzae, and S. aureus among children, and K. pneumoniae, A. baumannii, and P. aeruginosa among adults (20-23). A comprehensive 11-year surveillance study of respiratory infectious diseases conducted by the Chinese Center for Disease Control and Prevention supports our findings. That study identified specific age thresholds for the detection rates of various bacteria: 9 years for S. pneumoniae, 6 years for H. influenzae, 2 years for S. aureus, 16 years for K. pneumoniae, and 40 years for P. aeruginosa (no data available for A. baumannii).

In relation to DNA viruses, especially the herpesviruses detected in this study, distinct age patterns were observed among different types. However, except for CMV, the clinical implications of detecting HSV, EBV, HHV-6, and HHV-7 in the respiratory tract remain unclear, and they rarely lead to pulmonary infections. Existing research suggests that herpesviruses may reactivate in patients with severe infections or malignancies and in transplant recipients, with implications for prognosis and mortality (24-27). In contrast, the age patterns associated with RNA viruses, atypical pathogens, and fungi are comparatively straightforward. We observed that HRV, RSV, HAdV, HPIV, and M. pneumoniae were predominantly detected in children, potentially because their immunity is lower than that of adults and because school environments offer more opportunities for transmission (28). Fungal infections were more prevalent in adults, which may be linked to the complex comorbidities and infection severity typically found in this age group.
Conditions such as COPD, bronchiectasis, diabetes, malignant tumors, sepsis, and severe pneumonia are all high-risk factors for invasive fungal diseases (29,30). In alignment with prior research, these outcomes underscore the presence of age-specific heterogeneity in the distribution of respiratory microorganisms. This heterogeneity may inform age-differentiated interpretation of tNGS results and the identification of relevant pathogens.
While tNGS demonstrated superior detection performance for potential pathogens in sputum samples, our study findings underscore the importance of interpreting positive results with caution. Nonetheless, when integrated with comprehensive clinical analysis, tNGS proves valuable for identifying highly pathogenic and rarely colonizing microorganisms, such as M. tuberculosis, certain atypical pathogens, Aspergillus, and nontuberculous Mycobacteria. Such organisms seldom colonize the respiratory tract, so their detection is more likely to indicate infection. Moreover, treatment adjustments were made in response to tNGS results for 38.8% (81/209) of patients, with tNGS directly guiding antibiotic treatment in 10.0% (21/209) of patients. Patients with unidentified pathogens may benefit from the insights provided by tNGS. Larger-scale clinical studies are needed to further elucidate the role of tNGS in the clinical diagnosis and treatment of pulmonary infections.
This study has several limitations that warrant consideration. First, the absence of standard reference results for pathogenic diagnosis prevented the calculation of sensitivity and specificity for tNGS, thus hindering a comprehensive assessment of its diagnostic performance. Second, distinguishing between microbial colonization and infection posed significant challenges because the current tNGS technology lacks uniform standards for pathogenic diagnosis. Third, the sample size was relatively small, and the study duration was limited, potentially affecting the robustness of the results.
In conclusion, our study underscores the clinical utility of sputum-based tNGS in the pathogenic diagnosis of pulmonary infections among patients who are not candidates for bronchoalveolar lavage. The findings demonstrate that tNGS yields a notably higher positive detection rate compared to CMTs, which aids in the early identification of potential pathogens, particularly those that are highly pathogenic or rarely colonize. This technology also supports clinical treatment decision-making. Additionally, the study reveals age-specific heterogeneity in the distribution of pathogens, indicating the necessity for distinct interpretations of tNGS results among patients of varying ages. Nonetheless, further research is imperative to establish clear indications, criteria for sample selection, and guidelines for result interpretation in the context of tNGS.

The human samples used in this study were primarily isolated as part of a previous study for which ethical approval was obtained. Written informed consent for participation was not required from the participants or the participants' legal guardians/next of kin in accordance with the national legislation and institutional requirements.

FIGURE 1
Flow diagram of the study.

FIGURE 2
Distribution of potential pathogens in the study cohort and the respective contributions of tNGS and CMTs for pathogen detection.

FIGURE 3
Consistency of pathogen detection results between tNGS and CMTs.

TABLE 1
Baseline characteristics of the 209 patients enrolled.

TABLE 2
Differences in clinical characteristics between children and adults.
a COPD, chronic obstructive pulmonary disease.