Lung Tissue Microbiome Is Associated With Clinical Outcomes of Idiopathic Pulmonary Fibrosis

Background: Several studies using bronchoalveolar lavage fluid (BALF) reported that lung microbial communities were associated with the development and clinical outcome of idiopathic pulmonary fibrosis (IPF). However, the microbial communities in IPF lung tissues are not well known. This study aimed to investigate bacterial communities in lung tissues and determine their impact on the clinical outcomes of patients with IPF. Methods: Genomic DNA extracted from lung tissues of patients with IPF (n = 20; 10 non-survivors) and age- and sex-matched controls (n = 20) was amplified using fusion primers targeting the V3 and V4 regions of the 16S rRNA genes with indexing barcodes. Results: Mean age of IPF subjects was 63.3 yr, and 65% were male. Alpha diversity indices did not significantly differ between IPF patients and controls, or between IPF non-survivors and survivors. The relative abundance of Lactobacillus, Paracoccus, and Akkermansia was increased, whereas that of Caulobacter, Azonexus, and Undibacterium was decreased in patients with IPF compared with that in the controls. A decreased relative abundance of Pelomonas (odds ratio [OR], 0.352, p = 0.027) and Azonexus (OR, 0.013, p = 0.046) was associated with a diagnosis of IPF in the multivariable logistic analysis adjusted for age and gender. Multivariable Cox analysis adjusted for age and forced vital capacity (FVC) revealed that higher relative abundance of Streptococcus (hazard ratio [HR], 1.993, p = 0.044), Sphingomonas (HR, 57.590, p = 0.024), and Clostridium (HR, 37.189, p = 0.038) was independently associated with IPF mortality. The relative abundance of Curvibacter (r = 0.590) and Thioprofundum (r = 0.373) was correlated positively, whereas that of Anoxybacillus (r = −0.509) and Enterococcus (r = −0.593) was correlated inversely with FVC. In addition, the relative abundance of the Aquabacterium (r = 0.616) and Peptoniphilus (r = 0.606) genera was positively correlated, whereas that of the Fusobacterium (r = −0.464) and Phycicoccus (r = −0.495) genera was inversely correlated with distance during the 6-min walking test. Conclusions: The composition of the microbiome in lung tissues differed between patients with IPF and controls and was associated with the diagnosis, mortality, and disease severity of IPF.


INTRODUCTION
Idiopathic pulmonary fibrosis (IPF) is a chronic progressive fibrosing interstitial lung disease of unknown etiology (1). It is characterized by worsening dyspnea, impaired lung function, decreased quality of life, and a poor prognosis (1). The pathogenesis of IPF involves both genetic (2,3) and environmental factors (4,5). Repeated epithelial injuries caused by multiple environmental factors, such as smoking, microaspiration, organic and inorganic dust, and viral infection (4,5), can lead to abnormal wound healing processes, such as epithelial-mesenchymal transition (6), in genetically susceptible individuals who have a mutation affecting airway defense (MUC5B), telomerase function (TERT), or immune responses (TOLLIP, TLR3, and IL1RN) (2,3,7). Considerable evidence supports an association between several viruses (8)(9)(10)(11) and the development or acute exacerbation (AE) of IPF (12,13). The fact that combined therapy with steroid, azathioprine, and N-acetylcysteine increased the mortality and hospitalization rates of patients with IPF (14) also suggests that infectious organisms are involved in IPF progression.
Along with the development of culture-independent molecular-sequencing techniques, such as 16S ribosomal RNA (16S rRNA) gene sequencing (15), several studies of bronchoalveolar lavage fluid (BALF) have suggested that lung microbial communities are associated with the clinical course of IPF (16)(17)(18)(19)(20)(21). The findings of the Correlating Outcomes With Biochemical Markers to Estimate Time-progression in IPF (COMET) study revealed that an increased bacterial burden in BALF from patients with IPF (n = 65), compared with controls (n = 44), is associated with a 10% decline in forced vital capacity (FVC) at 6 months and mortality (16). In contrast, a study of explanted lung tissues from patients with IPF (n = 40) showed very low bacterial abundance in IPF lung tissues that was similar to that of negative controls (22). These contradictory findings could be attributed to different types of samples or sample collection times. Therefore, the composition and impact of the lung tissue microbiome at diagnosis on clinical outcomes in patients with IPF are not well defined. Our study aimed to identify the diversity and composition of the bacterial communities in lung tissues at the time of diagnosis and determine their association with clinical outcomes, such as survival, disease severity, and progression in patients with IPF.

Study Population
All participating patients with IPF were diagnosed between January 2011 and December 2013 at Asan Medical Center, Seoul, Republic of Korea and met the diagnostic criteria of the American Thoracic Society (ATS)/European Respiratory Society/Japanese Respiratory Society/Latin American Thoracic Association statement (1). Samples of lung tissues from patients with IPF (n = 20; 10 non-survivors [cause of death: AE = 1, disease progression = 2, unknown = 7]) were aseptically obtained at the time of surgical biopsy for diagnosis. Samples from age- and gender-matched controls (lung cancer patients; n = 20) with no histological evidence of disease, collected aseptically at the time of surgery, were obtained from the Bio-Resource Center of Asan Medical Center. None of the patients with IPF or the controls had been treated with antibiotics, steroids, anti-fibrotic agents, or probiotics within 1 month before undergoing surgery. Lung tissues were procured under protocol #2016-1366. This study was conducted in accordance with the Declaration of Helsinki (2013) and was approved by the Institutional Review Board of Asan Medical Center (2018-1096). Written informed consent was obtained from all study participants.
Clinical and survival data of all patients were retrospectively collected from medical records, telephone interviews, and/or the National Health Insurance of Korea. Spirometry, total lung capacity (TLC) determined by plethysmography and diffusing capacity for carbon monoxide (DLco) measured according to published recommendations are expressed as ratios (%) of normal predicted values (23)(24)(25). The patients with IPF underwent 6-min walk tests (6MWT) according to the ATS guidelines (26). Baseline clinical data at the time of IPF diagnosis were collected within one month of sample acquisition.

Bacterial 16S rRNA Gene Sequencing
Tissue samples were frozen in liquid nitrogen immediately after collection and stored at −80 °C. Genomic DNA was extracted from lung tissues using Mo Bio PowerSoil® DNA Isolation Kits (Mo Bio Laboratories, Carlsbad, CA, USA) according to the instructions of the manufacturer. The variable V3 and V4 regions of the 16S rRNA genes were amplified using the following specific forward and reverse primers with overhang adapters: 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3′ and 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3′, respectively (27). The PCR proceeded using a 5 ng/µl DNA template, 2 × KAPA HiFi HotStart Ready Mix (KAPA Biosystems, Wilmington, MA, USA), and the amplicon PCR forward and reverse primers. The PCR protocol comprised initial incubation at 95 °C for 3 min, followed by 25 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, then 72 °C for 5 min, and holding at 4 °C. After PCR clean-up and index PCR, the libraries from all samples were pooled and sequenced as 300 bp paired-end reads on the Illumina MiSeq platform (Illumina Inc., San Diego, CA, USA) as described by the manufacturer (28,29). Distilled water provided in the PCR kit was used as a negative control, and no amplification was detected in the negative control during the processes.

Reconstruction and Compositional Analysis
Paired-end reads of the 16S rRNA gene were merged using Fast Length Adjustment of Short reads (FLASH; http://ccb.jhu.edu/software/FLASH/), which merges pairs of reads when the original DNA fragments are shorter than twice the length of the reads (30). Pre-processing and clustering were performed using CD-HIT-OTU (http://weizhongli-lab.org/cd-hit-otu/). Short reads (56,825) were filtered out and extra-long tails were trimmed. After filtering, the remaining reads were clustered at 100% identity. Chimeric reads (254,891) were filtered, and secondary clusters were recruited into primary clusters. After excluding reads with all other noise (5,292,165), the remaining reads (3,484,551) were clustered into operational taxonomic units (OTUs) at a cutoff of 97% (31,32). Feature tables, such as abundance and representative sequence files, were created using UCLUST in Quantitative Insights Into Microbial Ecology (QIIME1; https://qiime.org) software (33). Taxonomy was assigned based on information about organisms with the closest similarity to the representative sequence of each OTU, using the Basic Local Alignment Search Tool (BLAST), version 2.4.0, against the NCBI 16S microbial reference database. Taxonomy was not assigned when the query coverage of the best match in the database was <85%, and the identity of the matched area was <85%.
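The taxonomy rule in the last sentence above amounts to a threshold filter on the best database match of each OTU. The following is a minimal sketch of that filter, not the authors' pipeline. It assumes that a match must reach both 85% thresholds to be accepted; the text joins the two conditions with "and", so this stricter reading is an assumption. The function name `assign_taxonomy` and its parameters are hypothetical and belong to neither BLAST nor QIIME.

```python
MIN_COVERAGE = 85.0  # percent query coverage of the best match (from the text)
MIN_IDENTITY = 85.0  # percent identity of the matched area (from the text)


def assign_taxonomy(taxon, query_coverage, identity):
    """Return the taxon of the best match, or None when the match is too weak.

    Assumption: the match is rejected when EITHER value falls below 85%.
    `taxon` may be None when the database returned no match at all.
    """
    if taxon is None:
        return None
    if query_coverage < MIN_COVERAGE or identity < MIN_IDENTITY:
        return None
    return taxon


# Hypothetical best matches for three OTUs: (taxon, coverage %, identity %).
hits = [("Ralstonia", 99.0, 97.5), ("Pelomonas", 80.0, 99.0), (None, 0.0, 0.0)]
assigned = [assign_taxonomy(*hit) for hit in hits]
```

Under the literal reading of the sentence (reject only when both values are below 85%), the `or` in the condition would become `and`, and the second hypothetical OTU would keep its assignment.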

Statistical Analysis
Continuous data were analyzed using Mann-Whitney U tests, and categorical data were analyzed using Fisher exact tests. The decline rate of lung function and exercise capacity for one year was estimated by linear regression analysis. Correlations between the relative abundance of the microbiome and clinical parameters were assessed using Spearman's correlation coefficients (r). The risk of microbial relative abundance for a diagnosis of IPF was expressed as odds ratio (OR) with 95% CI using binary logistic regression. In addition, the risk of microbial relative abundance for IPF mortality was presented as hazard ratio (HR) with 95% CI using Cox proportional hazards regression analyses. Alpha diversity, which describes the diversity of OTUs within each sample, was represented using four indices: Observed counted the actual number of different taxa evident in a sample, Chao 1 non-parametrically estimated the richness of the species (34), Shannon estimated richness and evenness of species present in a sample considering the distribution of strains belonging to each species (35), and Inverse Simpson was the reciprocal of the probability that two randomly selected objects in a sample belong to the same species (36). Principal coordinates analysis (PCoA), based on weighted UniFrac methods to obtain phylogenetic and quantitative indices for assessing abundance differences among groups (IPF vs. controls, survivors vs. non-survivors), was conducted for all samples using QIIME1 (37). The exploratory and differential microbial compositions were analyzed using QIIME1. All data were statistically analyzed using SPSS version 24.0 (IBM Corp., Armonk, NY, USA), and values with p < 0.05 (two-tailed) were considered statistically significant.
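As an illustration of the four alpha diversity indices named above, the sketch below computes them from the per-OTU read counts of a single sample. It is not the QIIME1 implementation used in the study. It assumes the bias-corrected form of Chao 1 and the natural logarithm for Shannon; the paper does not state which variant or logarithm base was used, and the function name `alpha_diversity` is hypothetical.

```python
import math


def alpha_diversity(otu_counts):
    """Compute Observed, Chao 1, Shannon, and Inverse Simpson for one sample."""
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")

    observed = len(counts)

    # Chao 1, bias-corrected form (assumption): S_obs + F1*(F1-1) / (2*(F2+1)),
    # where F1 and F2 are the numbers of OTUs seen exactly once and twice.
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    chao1 = observed + f1 * (f1 - 1) / (2 * (f2 + 1))

    proportions = [c / total for c in counts]
    # Shannon: -sum(p * ln p); the natural log is an assumption.
    shannon = -sum(p * math.log(p) for p in proportions)
    # Inverse Simpson: reciprocal of the probability that two reads drawn at
    # random (with replacement) belong to the same OTU.
    inverse_simpson = 1.0 / sum(p * p for p in proportions)

    return {
        "observed": observed,
        "chao1": chao1,
        "shannon": shannon,
        "inverse_simpson": inverse_simpson,
    }


# Hypothetical sample with four equally abundant OTUs.
even_sample = alpha_diversity([10, 10, 10, 10])
```

For a perfectly even sample such as the hypothetical one above, Observed, Chao 1, and Inverse Simpson all equal the number of OTUs, and Shannon equals the logarithm of that number; uneven samples pull Shannon and Inverse Simpson below those values.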

Microbial Diversity and Composition
Among 20 patients with IPF, the mean age was 63.3 yr, and 65.0% were male (Table 1). Lung function (FVC, DLco, and TLC) was worse in the patients with IPF than in the controls, whereas demographics, lung function, and exercise capacity during the 6MWT did not significantly differ between IPF non-survivors and survivors. Alpha diversity indices, such as Observed, Chao 1, Shannon, and Inverse Simpson, did not differ between the IPF and control groups (Supplementary Figure A1). However, the PCoA plot revealed dissimilarity in the weighted UniFrac distance between the IPF and control groups (Figures 1A-C), especially between five of the patients with IPF (non-survivors, n = 3; Figures 1A,C red circles) and controls, indicating more heterogeneity in the microbial distribution.
Among the 10 most frequent taxa, the genus Ralstonia was the most prevalent in the IPF and control groups, followed by Nocardia and Pelomonas (Supplementary Figure A2). In contrast, Lactobacillus, Enterobacter, Tetragenococcus, and Neisseria were frequently identified in IPF, whereas Haemophilus, Caulobacter, Bradyrhizobium, and Thermomonas were prevalent in the controls. The relative abundance of Lactobacillus, Paracoccus, and Akkermansia was increased, whereas that of Caulobacter, Azonexus, and Undibacterium was decreased in patients with IPF compared with the controls (Figure 2). In the logistic analysis adjusted for age and gender, a diagnosis of IPF was independently associated with lower relative abundance of the genera Pelomonas (OR, 0.352; 95% CI, 0.139-0.891; p = 0.027) and Azonexus (OR, 0.013; 95% CI, 0.000-0.926; p = 0.046; Table 2).
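The odds ratio, confidence interval, and p-value reported for Pelomonas can be checked against one another. The sketch below assumes that the interval is a Wald interval, symmetric on the log-odds scale, which is the usual output of binary logistic regression but is not stated in the paper. The function name `wald_from_odds_ratio` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_from_odds_ratio(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover the log-odds coefficient, its standard error, z, and p.

    Assumption: the confidence interval is exp(beta +/- z_crit * SE).
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = 2 * (1 - normal.cdf(abs(z)))  # two-tailed p-value
    return beta, se, z, p


# Values reported for Pelomonas: OR 0.352, 95% CI 0.139-0.891.
beta, se, z, p = wald_from_odds_ratio(0.352, 0.139, 0.891)
```

Under this assumption the recovered two-tailed p-value lands close to the reported p = 0.027, so the three published numbers are mutually consistent. The same check cannot be run for Azonexus, because its lower confidence limit is rounded to 0.000 and has no finite logarithm.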

Microbial Communities: IPF Non-survivors vs. Survivors
Alpha diversity did not significantly differ between non-survivors and survivors of IPF (Supplementary Figure A3). However, the PCoA plot showed that the distribution of microbes differed between non-survivors and survivors (Figure 3). Among the 10 most frequent taxa, Ralstonia and Nocardia were the most common in both groups (Supplementary Figure A4). The genus Streptococcus was more abundant in non-survivors compared with survivors. In addition, the genera Neisseria, Haemophilus, Rothia, and Rubrobacter were frequently detected in non-survivors, while the genera Tetragenococcus, Enterobacter, Lactobacillus, and Caulobacter were prevalent in survivors. The relative abundance of the genera Bifidobacterium (2.77% [non-survivors] vs. 0.68% [survivors], p = 0.003) and Olsenella (0.51% vs. 0.41%, p = 0.013) was significantly higher in non-survivors than in survivors (Figures 4A,B).

Impact on Survival
The median follow-up period for patients with IPF was 3.0 yr (interquartile range: 1.5-5.4 yr), and the median survival period was 3.1 yr. Unadjusted Cox analysis significantly associated the relative abundance of the Streptococcus, Sphingomonas, Veillonella, and Clostridium genera with IPF mortality. Neisseria and Granulicatella were also marginally associated with IPF mortality (

Association With Disease Severity
The relative abundance of Curvibacter and Thioprofundum was positively correlated with FVC in patients with IPF, whereas that of Anoxybacillus, Enterococcus, Akkermansia, and Clostridium was negatively correlated with FVC (Figure 5A and Supplementary Table A1). The relative abundance of Thermomonas and Peptoniphilus was positively correlated with DLco, whereas that of Granulicatella and Rhodoferax was positively correlated with TLC. The relative abundance of the Aquabacterium, Nakamurella, and Peptoniphilus genera was positively correlated, whereas that of the Fusobacterium, Anaerococcus, and Phycicoccus genera was negatively correlated with distance during the 6MWT (Figure 5B and Supplementary Table A2). The relative abundance of the genera Rhodoferax and Lactococcus was positively correlated with resting and lowest oxygen saturation (SpO2) during the 6MWT.
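The correlations in this section are Spearman coefficients, that is, Pearson correlations computed on ranks rather than on raw values. The sketch below shows the calculation with tied values receiving their average rank. The study itself used SPSS; the function names and the example data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: relative abundance (%) of one genus and FVC (% predicted).
abundance = [0.1, 0.4, 0.2, 0.9, 0.6]
fvc = [55.0, 70.0, 62.0, 88.0, 75.0]
r = spearman_r(abundance, fvc)
```

Because only the ordering matters, a strictly monotonic relationship gives r = 1 or r = −1 even when it is far from linear, which suits relative abundance data that are skewed and contain many low values.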

Association With Disease Progression
We estimated the decline rate in lung function and exercise capacity for 1 yr and compared them between survivors and non-survivors (Table 4). Non-survivors had a faster decline rate in DLco, and in distance and the lowest SpO2 during the 6MWT, compared with survivors. The relative abundance of the Granulicatella and Paracoccus genera was positively correlated, while that of the Novosphingobium genus was negatively correlated with the decline rate in FVC (Figure 6A and Supplementary Table A3). The relative abundance of Bifidobacterium was positively associated, whereas that of Streptococcus was negatively associated with the decline rate in DLco. The relative abundance of the Lactobacillus, Staphylococcus, Granulicatella, and Selenomonas genera was positively correlated with the decline rate of TLC. The relative abundance of Staphylococcus and Variovorax was positively associated, while that of the Legionella, Anoxybacillus, Acidocella, and Hyphomicrobium genera was negatively associated with the decline rate in distance during the 6MWT (Figure 6B and Supplementary Table A4). The relative abundance of the Beijerinckia, Mycobacterium, and Microbulbifer genera was positively correlated, whereas that of the Enterobacter genus was negatively correlated with the decline rate in resting SpO2. The relative abundance of the genus Staphylococcus was positively correlated with the decline rate in the lowest SpO2 during the 6MWT.
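The decline rates compared here were estimated by linear regression over 1 yr. A minimal sketch of such an estimate is the ordinary least-squares slope of a measurement against time, shown below. The paper does not report how many measurements per patient were used or when they were taken, so the time points, the values, and the function name `annual_decline_rate` are all hypothetical.

```python
def annual_decline_rate(times_yr, values):
    """Least-squares slope of `values` against time in years.

    A negative result means the measurement fell over time.
    """
    n = len(times_yr)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times_yr) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_yr)
    if sxx == 0:
        raise ValueError("all measurements share the same time point")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_yr, values))
    return sxy / sxx


# Hypothetical patient: FVC (% predicted) at baseline, 6 months, and 1 yr.
slope = annual_decline_rate([0.0, 0.5, 1.0], [70.0, 67.0, 64.0])
```

For the hypothetical patient above, the slope is −6 percentage points of predicted FVC per year. A per-patient slope of this kind is the quantity that would then be correlated with the relative abundance of each genus.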

DISCUSSION
The microbial communities in the lung tissues differed between patients with IPF and controls, and between IPF non-survivors and survivors. When adjusted for age and gender, a decreased relative abundance of the genera Pelomonas and Azonexus was associated with a diagnosis of IPF. A higher relative abundance of the Streptococcus, Sphingomonas, and Clostridium genera was an independent prognostic factor in patients with IPF, and several genera correlated with disease severity and progression. We found no differences in the alpha diversity of lung tissue microbiomes between patients with IPF and controls, whereas studies of BALF from patients with IPF yielded different results (16,19). Molyneaux et al. found a significantly decreased alpha diversity index for the microbiome in BALF samples from 65 patients with IPF at the time of diagnosis (Shannon diversity index, 3.81 ± 0.08 vs. 4.11 ± 0.10; p = 0.005) compared with controls (n = 44) (16). The Shannon diversity index was also decreased in BALF from mice treated with bleomycin (n = 6), compared with control mice (n = 6, p < 0.05) (19). These contradictory findings could be attributed to differences in baseline demographics and treatment of the subjects in the previous study (16); there were differences in age (68 [IPF] vs. 58.2 years [controls], p < 0.0001) and inhaled steroid therapy (6.2 vs. 0.0%, p = not significant) between the IPF and control groups, and these might have affected the alpha diversity of the microbiome. Our findings were in line with those of Kitsios et al., who identified separate clusters on PCoA plots of Bray-Curtis dissimilarity distances among explanted lung tissues from patients with IPF (n = 40), cystic fibrosis (n = 5), and donors (n = 7) (22).
The relative abundance of Lactobacillus was increased in lung tissues from patients with IPF compared with controls. Lactobacillus generally resides in the gastrointestinal and reproductive tract, where it maintains a healthy microecology with lactic acid production (38,39). However, given the well-known association between IPF and gastroesophageal reflux disease (GERD) (40), the high prevalence of GERD in IPF might contribute to the increase in the relative abundance of Lactobacillus in IPF. Moreover, Harata et al. found higher expression of the mRNA for interleukin-1, tumor necrosis factor, and monocyte chemotactic protein-1 in the respiratory tracts of mice infected with influenza and treated with intranasal Lactobacillus rhamnosus than in infected and untreated mice (41). Since proinflammatory cytokines and chemokines are associated with the pathogenesis of IPF (42), the immunoregulatory effect of Lactobacillus might contribute to the pathophysiology of IPF. We found a higher relative abundance of Bifidobacterium in IPF non-survivors than in survivors. Like Lactobacillus, Bifidobacterium can produce lactic acid (43,44). Levels of lactic acid and lactate dehydrogenase-5, which induce the differentiation of fibroblasts into myofibroblasts by activating transforming growth factor (TGF)-ß1, were elevated in lung tissues from patients with IPF (n = 6), compared with healthy persons (n = 6) (45). Therefore, bacteria that produce lactic acid might also contribute to the progression of IPF.
A higher relative abundance of the Streptococcus genus was independently associated with IPF mortality in our study, which is in line with previous reports (21)(46)(47)(48). The COMET-IPF study of 55 patients with IPF found that a higher relative abundance of Streptococcus OTU was an independent prognostic factor for disease progression (HR, 1.11; 95% CI, 1.04-1.18; p = 0.0009) according to a multivariable Cox proportional hazard analysis adjusted for age, gender, smoking status, and desaturation during the 6MWT (21). Infection with Streptococcus pneumoniae significantly increased hydroxyproline levels in lung tissues from mouse models of TGFß1-induced lung fibrosis compared with mock infection (46). In addition, fibrosis and collagen deposition were increased in lung tissues from mice treated with both bleomycin and Streptococcus pneumoniae serotype 3 compared with mice that were either treated with bleomycin or infected with Streptococcus pneumoniae (48). These results suggest that Streptococcus infection could induce IPF disease progression. Furthermore, lung vascular permeability and neutrophil and monocyte counts were increased in BALF from mice treated with pneumolysin, which is a pore-forming cytotoxin released by Streptococcus pneumoniae that causes alveolar epithelial injury (47).
In this study, the relative abundance of the Anoxybacillus genus in IPF lung tissues was correlated with IPF disease severity. The relative abundance of the Firmicutes phylum was inversely correlated with FVC in BALF samples from 34 patients with IPF (r = −0.5514, p = 0.0007) (19). Our results are consistent with these findings because Anoxybacillus belongs to the Firmicutes phylum and is negatively correlated with FVC. Another study also found an increased relative abundance of Firmicutes in BALF from mice treated with bleomycin (19), suggesting that an increased prevalence of the Anoxybacillus genus is associated with the pathogenesis of IPF.
This study had some limitations. Although we matched the baseline characteristics, such as age and sex, between the IPF and control groups, other confounding factors might have affected the microbial communities. However, we tried not to include patients who had been treated with agents that might affect the microbiota. The number of samples analyzed was not large. Nevertheless, we identified significant differences in the distribution and clinical impact of the microbiomes of patients with IPF compared with controls. Because non-malignant and non-fibrotic lung tissues from lung cancer patients were used as controls, the microbial community in the control group might be affected by lung cancer. However, in studies of the lung tissue microbiome in other diseases, it is common to use normal tissue from lung cancer patients as the normal control (49,50). Even in lung tissue microbiome studies in lung cancer patients, non-cancer tissue from the lung cancer patient was used as a control (50). The cross-sectional design of this study prevented the identification of causal relationships between changes in microbial communities and IPF development. Additional long-term clinical studies should address this issue. Despite these limitations, the strength of our study is that we are the first to reveal the microbial communities in lung tissues from patients when they were initially diagnosed with IPF and the impact of these communities on their survival.
In conclusion, our findings of specific microbial communities in lung tissues from patients with IPF, and of associations between the relative abundance of some genera and clinical parameters, such as diagnosis, mortality, disease severity, and progression in such patients, suggest that microbial communities in the lungs play a role in the pathogenesis of IPF.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI SRA BioProject, accession no: PRJNA761508.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of Asan Medical Center (2018-1096). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
JS was the guarantor of the manuscript for designing and supervising the entire study. H-YY and JS took responsibility for the data analysis. S-JM contributed to sample collection and preparation. H-YY and JS drafted the initial manuscript. All the authors discussed the results and reviewed the manuscript.

FUNDING
This study was supported by a grant from the Basic Science Research Program of the National Research Foundation of Korea (NRF), which is funded by the Ministry of Science and Technology (NRF-2019R1A2C2008541), South Korea. The sponsor had no role in the design of the study, the collection and analysis of the data, or the preparation of the manuscript.