Risk of newly diagnosed interstitial lung disease after COVID-19 and impact of vaccination: a nationwide population-based cohort study

Objectives Previous studies suggested that coronavirus disease 2019 (COVID-19) could lead to pulmonary fibrosis, but the incidence of newly diagnosed interstitial lung disease (ILD) after COVID-19 is unclear. We aimed to determine whether COVID-19 increases the risk of newly diagnosed ILD and whether vaccination against COVID-19 can reduce this risk. Methods This retrospective cohort study used data from the Korean National Health Insurance claim-based database. Two study groups and propensity score (PS)-matched control groups were constructed: Study 1: participants diagnosed with COVID-19 (COVID-19 cohort) and their PS-matched controls; Study 2: COVID-19 vaccinated participants (vaccination cohort) and their PS-matched controls. Results In Study 1, during a median 6 months of follow-up, 0.50% of the COVID-19 cohort (300/60,518) and 0.04% of controls (27/60,518) developed newly diagnosed ILD, with an incidence of 9.76 and 0.88 per 1,000 person-years, respectively. The COVID-19 cohort had a higher risk of ILD [adjusted hazard ratio (aHR), 11.01; 95% confidence interval (CI), 7.42–16.32] than controls. In Study 2, the vaccination cohort had a lower risk of newly diagnosed ILD than controls (aHR, 0.44; 95% CI, 0.34–0.57). Conclusion Using nationwide data, we demonstrated that COVID-19 was associated with a higher incidence rate of newly diagnosed ILD, but that this risk could be mitigated by COVID-19 vaccination.


Introduction
The declaration of a global coronavirus disease 2019 (COVID-19) pandemic in March 2020 followed the emergence of a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), in December 2019 (1,2). However, the rapid decline in COVID-19 cases following the surge caused by the omicron variant led many governments to relax their mandates and limit population-wide interventions (3).
In this post-COVID-19 era, many patients who have had COVID-19 continue to experience persistent respiratory symptoms such as coughing and dyspnoea, despite the absence of detectable viral infection. Some patients exhibit abnormal findings on follow-up chest computed tomography (CT) or pulmonary function tests after COVID-19 (4-7). During the COVID-19 pandemic, there was widespread concern about the subsequent development of pulmonary fibrosis after COVID-19; thus, previous studies investigated pulmonary fibrosis in patients with persistent respiratory symptoms following COVID-19 (8-10). One previous meta-analysis investigated persistent lung sequelae of COVID-19 and reported that recovered patients still had chest CT abnormalities up to 1 year after infection (10).
However, to the best of our knowledge, no studies have examined the incidence of newly diagnosed interstitial lung disease (ILD) following COVID-19 using a nationwide population-based cohort. Determining the exact incidence of fibrotic sequelae caused by COVID-19 upon entering the post-COVID-19 era will aid in the long-term management of these patients. Therefore, we aimed to evaluate the association between prior COVID-19 and the incidence of newly diagnosed ILD in adults using a nationwide cohort dataset. Additionally, we assessed the impact of COVID-19 vaccination on the incidence of newly diagnosed ILD.

Data source
This was a retrospective cohort study using the Korean National Health Insurance claims-based dataset. The National Health Insurance Service (NHIS) is the government-managed universal insurance provider in Korea and covers 97% of the Korean population (∼50 million people) (11,12). During the COVID-19 pandemic, the Korean government encouraged people to get tested for COVID-19 without delay by subsidizing the cost of diagnosis and treatment for individuals who met criteria related to COVID-19 and provided health insurance services to all Koreans with COVID-19 (NHIS-2022-1-623) (13). The NHIS database therefore includes medical data for all patients who underwent SARS-CoV-2 testing. The NHIS dataset includes demographic variables, socioeconomic characteristics, healthcare utilization (e.g., outpatient visits, emergency department visits, and hospitalization), health screening examination findings, disease diagnoses based on the 10th revision of the International Classification of Disease (ICD-10) codes, and treatments such as medication, procedures, and surgeries (11). The NHIS has been widely used in epidemiologic studies to identify risk factors for COVID-19 and post-COVID complications (14-17).

Study population
The Korean government provided a COVID-19 study database (N = 8,464,242) composed of data from 561,158 subjects diagnosed with COVID-19 at least once from October 2020 to December 2021 and 7,903,084 subjects who were not diagnosed with COVID-19 during the same period (those with negative results in SARS-CoV-2 tests or those who did not undergo SARS-CoV-2 tests). These 7,903,084 subjects were selected by stratified sampling by age and sex from the entire NHIS database (N = ∼50 million subjects), excluding the 561,158 subjects who were diagnosed with COVID-19.
After excluding 4,614,779 subjects who did not undergo health screening examination between 2019 and 2020, 3,849,463 subjects who received a routine health screening examination between 2019 and 2020 were initially selected (Figures 1, 2). The recruitment period for Study 1 was from October 8, 2020, to June  1; Supplementary Figure 1A).
Study 2 aimed to evaluate whether COVID-19 vaccination was associated with a reduced risk of newly diagnosed ILD by comparing the risk of newly diagnosed ILD between participants who were vaccinated against COVID-19 and their 1:1 PS-matched controls. From the 3,849,463 participants, we excluded 13,808 with a diagnosis of ILD before the index date (COVID-19 vaccination date), and an additional 2,074,910 subjects with a follow-up duration of <6 months. Additionally, 92 patients diagnosed with ILD on the date of COVID-19 vaccination and their matched pairs (n = 92) were excluded from the vaccinated group and non-vaccinated group, respectively. Finally, 185,457 subjects who received COVID-19 vaccination (vaccination cohort) were 1:1 PS matched with 185,457 subjects who were not vaccinated (controls) (Figure 2; Supplementary Figure 1B).
Our study protocol was approved by the Institutional Review Board of Hanyang University Hospital (No. 2023-06-054). The requirement for informed consent was waived because all patient records were anonymized before use.

Study exposure
The study exposure of Study 1 was COVID-19. Laboratory diagnosis of SARS-CoV-2 infection was defined as a positive result from a real-time RT-PCR assay of nasal or pharyngeal swabs from patients with a history of SARS-CoV-2 (U071) defined using ICD-10 codes (18). The cohort entry date for each patient tested for SARS-CoV-2 was the date of the first SARS-CoV-2 test.
The study exposure of Study 2 was COVID-19 vaccination. During the study period, the NHIS provided COVID-19 vaccination data for the study purpose. The vaccination cohort included subjects who received at least one dose of the vaccine. For patients who received multiple vaccinations, the index date was set to the first vaccination date. During the study period, there were no subjects who received a third vaccination.

Study outcome
The primary outcome was newly diagnosed ILD. ILD was defined as one or more claims under ICD-10 diagnostic code J84.x as a major or minor diagnosis (14).

Covariates
Basic demographic characteristics of age, sex, residential area, and income status were collected from the dataset. Income status was divided into the highest 30% (high), lowest 30% (low), and the rest (middle); individuals supported by the medical aid program were classified as the low-income group. Residential areas were classified as metropolitan cities, middle- and small-sized cities, or rural areas. Smoking status (never-smoker or smoker) and alcohol consumption (none, 1-2 times a week, 3-4 times a week, and almost every day) were determined based on a self-reported questionnaire. Body mass index (BMI) was calculated as body weight divided by the square of height (kg/m²), and subjects were classified into four groups based on BMI as follows: normal (18.5-22.9 kg/m²), low (<18.5 kg/m²), overweight (23.0-24.9 kg/m²), or obese (≥25 kg/m²).
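The BMI grouping described above can be illustrated with a short Python sketch. This is not the authors' SAS code; the function names `bmi` and `bmi_category` are hypothetical, and the treatment of values that fall between the published one-decimal cut-offs (for example, 22.95 kg/m²) as half-open intervals is an assumption.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kg divided by the square of height in m."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value (kg/m^2) to the four groups named in the text.

    Assumption: the published ranges are read as half-open intervals
    (<18.5, 18.5 to <23.0, 23.0 to <25.0, >=25.0).
    """
    if value < 18.5:
        return "low"
    if value < 23.0:
        return "normal"
    if value < 25.0:
        return "overweight"
    return "obese"
```

For instance, a hypothetical subject weighing 70 kg at 1.75 m has a BMI of about 22.9 kg/m² and would fall in the normal group under these cut-offs.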

Statistical analysis
Descriptive statistics are presented as numbers (percentages) for categorical variables and means ± standard deviations (SD) for continuous variables. We compared two groups using the χ² test for categorical variables and t-tests for continuous variables. The incidence rates of ILD were calculated by dividing the number of incident events by the total follow-up time and are expressed per 1,000 person-years. A cumulative incidence plot was used to compare the incidence of ILD, and a log-rank test was used to evaluate significant differences between groups.
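The incidence rate calculation can be sketched as follows. The helper names are hypothetical, and the person-year totals are not reported in the text: `implied_person_years` only back-calculates them from the reported event counts and rates for Study 1, so the results are approximations subject to rounding in the published rates.

```python
def incidence_rate_per_1000(events: int, person_years: float) -> float:
    """Incident events divided by total follow-up, per 1,000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years


def implied_person_years(events: int, rate_per_1000: float) -> float:
    """Back-calculate follow-up time from an event count and a reported rate."""
    if rate_per_1000 <= 0:
        raise ValueError("rate_per_1000 must be positive")
    return 1000.0 * events / rate_per_1000


# Reported Study 1 figures: 300 events at 9.76 and 27 events at 0.88
# per 1,000 person-years. The follow-up totals below are derived, not reported.
covid_person_years = implied_person_years(300, 9.76)
control_person_years = implied_person_years(27, 0.88)
```

Under these inputs, both arms work out to roughly 30,700 person-years, which is consistent with two matched cohorts of 60,518 subjects followed for about 6 months each.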
We performed 1:1 PS matching between the study cohort and controls based on age, sex, BMI, smoking status, alcohol consumption, economic status, residential area, and comorbidities (hypertension, DM, CKD, allergic rhinitis, and dyslipidemia). Standardized mean difference (SMD) was used to examine the balance of covariate distributions between the groups, and an SMD > 0.1 was considered to indicate an imbalance (27). Cox proportional hazards regression analyses were used to evaluate the risk of incident ILD. To further minimize potential bias that could persist even after PS matching, we additionally adjusted for age, sex, BMI, smoking status, alcohol consumption, economic status, residential area, and comorbidities (hypertension, DM, CKD, allergic rhinitis, and dyslipidemia). Stratified analyses were also performed by sex, age, BMI, smoking status, alcohol consumption, economic status, residential area, and comorbidities. A two-sided p-value < 0.05 was considered statistically significant. All statistical analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).
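The balance check based on the SMD can be sketched in Python using the commonly used definitions for continuous and binary covariates. The text does not state which SMD formula was applied, so the formulas below are an assumption, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def smd_continuous(treated: list[float], control: list[float]) -> float:
    """Absolute difference in means divided by the pooled standard deviation."""
    diff = abs(mean(treated) - mean(control))
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        return 0.0 if diff == 0 else math.inf
    return diff / pooled_sd


def smd_binary(p_treated: float, p_control: float) -> float:
    """SMD for a binary covariate, given the proportion in each group."""
    diff = abs(p_treated - p_control)
    pooled_sd = math.sqrt(
        (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    )
    if pooled_sd == 0:
        return 0.0 if diff == 0 else math.inf
    return diff / pooled_sd


def is_balanced(smd: float, threshold: float = 0.1) -> bool:
    """Apply the rule from the text: an SMD above 0.1 indicates imbalance."""
    return smd <= threshold
```

As a hypothetical example, a covariate present in 50% of one group and 40% of the other gives an SMD of about 0.20 and would be flagged as imbalanced, whereas identical proportions give an SMD of 0.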

Baseline characteristics
The baseline characteristics of the study participants (n = 121,036) in Study 1 are presented in Table 1. After PS matching, there were no significant imbalances in baseline characteristics between the COVID-19 and control cohorts (all SMDs < 0.1). The mean age of the study population was 51 years and 47.9% were male.
A total of 370,914 subjects were analyzed for Study 2. The baseline characteristics of study participants in Study 2 are summarized in Supplementary Table 1. There were no significant imbalances in baseline characteristics between the vaccination and control cohorts (all SMDs < 0.1).

Incidence and risk of newly diagnosed ILD according to COVID-19
In Study 1, the incidence and risk of newly diagnosed ILD were compared between the COVID-19 cohort and the PS-matched controls. During 6 months of follow-up, 0.50% of the COVID-19
Frontiers in Public Health frontiersin.org

Incidence and risk of newly diagnosed ILD according to COVID-19 vaccination
In Study 2, the incidence and risk of newly diagnosed ILD were compared between the vaccination cohort and controls. As shown in Table 3, 0.04% of the vaccination cohort (76/185,457) and 0.09% of controls (173/185,457) developed ILD, with an incidence of 0.80 and 1.83 per 1,000 person-years, respectively. Subjects in the vaccination cohort had a lower risk of ILD (aHR, 0.44; 95% CI, 0.34-0.57) than controls, consistent with the results shown in Figure 3B (log-rank p < 0.001).

Discussion
Our study is the largest comprehensive study to evaluate the risk of newly diagnosed ILD in adults after COVID-19 based on analyses of nationwide cohort data. Subjects with COVID-19 had an ILD incidence rate of 9.76 per 1,000 person-years, which is 11-fold higher than that of those without COVID-19. Additionally, we analyzed the effects of COVID-19 vaccination and found that vaccination was effective at reducing the incidence of newly diagnosed ILD.
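As a rough plausibility check on the fold differences quoted above, the crude incidence rate ratios can be recomputed from the reported event counts and rates. The sketch below is not the authors' Cox model: it uses an unadjusted rate ratio with a Wald-type interval on the log scale, which is an assumption introduced here, and the function name `crude_rate_ratio` is hypothetical.

```python
import math


def crude_rate_ratio(
    events_exposed: int,
    rate_exposed: float,
    events_control: int,
    rate_control: float,
    z: float = 1.96,
) -> tuple[float, float, float]:
    """Unadjusted rate ratio with an approximate 95% interval.

    The standard error of the log rate ratio is approximated by
    sqrt(1/events_exposed + 1/events_control).
    """
    ratio = rate_exposed / rate_control
    se_log = math.sqrt(1 / events_exposed + 1 / events_control)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Reported figures: events and incidence per 1,000 person-years.
study1 = crude_rate_ratio(300, 9.76, 27, 0.88)    # COVID-19 vs. controls
study2 = crude_rate_ratio(76, 0.80, 173, 1.83)    # vaccinated vs. controls
```

With these inputs, the crude ratio is about 11.1 for Study 1 and about 0.44 for Study 2, close to the reported adjusted hazard ratios. This is what one would expect when the matched groups are well balanced, although the crude interval is only an approximation of the model-based one.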
Many previous studies have reported persistent ILD after COVID-19 (5,6,10,28-32); however, the number of subjects included in these studies was relatively small, and the follow-up duration was relatively short, ranging from a few weeks (30) to 1 year after COVID-19 (6,10,28,29,31). For example, a Chinese study reported that 118 out of 1,279 patients underwent CT, and among these patients, 65 (55%) had persistent CT abnormalities (28). Swiss and French multicentre prospective studies reported that severe COVID-19 patients show a greater decrease in lung function and more radiologic lung sequelae than non-severe COVID-19 patients (6,29). Although these previous studies demonstrated an association between COVID-19 and ILD, the actual estimated incidence and risk of clinically relevant ILD
Another important finding of our study is that COVID-19 vaccination can decrease the risk of post-COVID-19 ILD at the population level, demonstrating the preventive role of COVID-19 vaccination against this serious post-COVID-19 complication. However, at the individual level, there have been some concerns that the COVID-19 vaccine could cause or aggravate ILD (33-36). Although some patients recovered with corticosteroid treatment (33-35), one study, in which 37% of patients had previously diagnosed ILD, reported a mortality rate of 15% (35). Accordingly, given a previous study that demonstrated the safety of the vaccine (37) and our findings, which support COVID-19 vaccination as protection against post-COVID-19 ILD, a balanced understanding of the relationship between COVID-19 vaccination and ILD risk is needed.
ILD may develop after COVID-19 for several reasons. First, post-infectious organizing status or ARDS-related fibrosis should be considered in patients with severe COVID-19 pneumonia and ARDS (38). Ventilator-induced lung injury could also be associated with post-infection fibrotic changes. However, ILD may occur without prior COVID-19 acute respiratory distress syndrome or lung injury from mechanical ventilation during treatment of severe COVID-19. One study delineated immune-proteomics in the airway and peripheral blood of healthy controls and post-COVID-19 patients 3-6 months after discharge (39). The authors of that study reported that increased B cell numbers and altered monocyte subsets were associated with widespread lung abnormalities. Another earlier study reported that an increase in the frequency of airway and lung B cells similar to that seen in the airway after COVID-19 was also present in idiopathic pulmonary fibrosis (IPF), a representative ILD (40). Thus, in the post-COVID-19 airway, B cells may be directly promoting aberrant lung tissue repair (41). Peripheral blood mononuclear cell signatures in COVID-19 lung disease and IPF have also been shown to exhibit comparable transcriptional signatures and have prognostic value (42). There is also evidence that monocyte and T cell subsets are involved in the pathogenesis of post-COVID-19 ILD; a recent study evaluating blood samples from subjects with post-COVID-19 ILD, IPF, and controls showed that survivors of post-COVID-19 ILD had higher expression of genes related to naïve and memory CD4 T cells, Tregs, memory CD8 T GZMB+, memory CD8 T GZMK+, and naïve CD8 T cells, but lower levels in IPF, suggesting most subjects with post-COVID-19 ILD have partially or completely resolved pulmonary fibrosis, while most patients with IPF have progressive disease (43). Further studies are still needed to determine the exact mechanism(s) underlying ILD after COVID-19, but the results of previous studies suggest that the complex immune response to the virus itself may promote the development of ILD after COVID-19.
Several limitations to our study should be acknowledged. First, our study used a dataset derived specifically from the Korean population, which potentially restricts the generalizability of our findings to other countries or ethnic groups. Second, the identification of ILD and other comorbidities relied on ICD-10 codes, so diagnoses might have been over- or underestimated. Additionally, we could not classify the subtypes of ILD. Third, laboratory data and pulmonary function test data were not available and could not be incorporated into our analyses. Fourth, we are unable to provide post-COVID-19 ILD outcomes related to the delta or omicron variants, since our study period (from October 2020 to June 2021) preceded the dominance of the delta and omicron variants (major variant changes in Korea are as follows: alpha and beta variants from December 2020, gamma variant from January 2021, delta variant from May 2021, and omicron variant from November 2021). Further study of post-COVID-19 ILD associated with delta or omicron variants will be required. An additional limitation is that, due to the relatively short follow-up duration of our dataset, we were unable to provide results with more than 1 year of follow-up. Thus, further study with longer follow-up is needed. Finally, the exact mechanism by which the COVID-19 vaccine prevents newly diagnosed ILD after COVID-19 cannot be determined through our study. Future research is required to explore the mechanisms behind the increased risk of post-COVID-19 ILD and the protective role of different types of COVID-19 vaccines against post-COVID-19 ILD.
In conclusion, based on analyses of a nationwide dataset, we demonstrated that COVID-19 is associated with a higher incidence rate of newly diagnosed ILD. Additionally, we suggest that COVID-19 vaccination reduces the risk of developing ILD by preventing COVID-19 itself.

Abbreviations: aHR, adjusted hazard ratio; BMI, body mass index; CI, confidence interval; CKD, chronic kidney disease; COVID-19, coronavirus disease 2019; CT, computed tomography; DM, diabetes mellitus; ICD-10, the 10th revision of the International Classification of Disease; ILD, interstitial lung disease; NHIS, National Health Insurance Service; PS, propensity score; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; SMD, standardized mean difference.

TABLE 1 Baseline characteristics.
*Income status was divided into the highest 30% (high), the lowest 30% (low), and the rest (middle); individuals supported by the medical aid program were classified as the low-income group.

cohort (300/60,518) and 0.04% of controls (27/60,518) developed newly diagnosed ILD, with an incidence of 9.76 and 0.88 per 1,000 person-years, respectively. Similarly, there was a significant difference in the cumulative incidence of newly diagnosed ILD between the COVID-19 cohort and controls (Figure 3A, log-rank p < 0.001). As shown in Table 2, the COVID-19 group had a higher

Adjusted for age, sex, BMI, smoking status, alcohol consumption, economic status, residential area, and comorbidities (hypertension, diabetes mellitus, chronic kidney disease, allergic rhinitis,

COVID-19 in real-world clinics has not been reported. Our results indicate that the risk of ILD is about 11-fold higher in those with COVID-19 than those without COVID-19. Notably, the cumulative curve showed a steep increase in the slope until about 2 months after infection, but ILD incidence showed a steady increase even after 2 months. Based on the results of our study, we recommend that clinicians perform follow-up chest CTs or pulmonary function tests in post-COVID-19 patients over a long period to detect abnormalities in the lung interstitium, especially in those patients with persistent respiratory symptoms (e.g., cough and dyspnoea).