Associations of chest X-ray trajectories, smoking, and the risk of lung cancer in two population-based cohort studies

Objectives Despite the increasing use of computed tomography (CT), chest X-ray (CXR) remains the first-line investigation for suspected lung cancer (LC) in primary care. However, the associations of CXR trajectories and smoking with LC risk remain unknown. Methods A total of 52,486 participants from the PLCO and 21,994 participants from the NLST were included. The associations of CXR trajectories with LC risk were evaluated with multivariable Cox regression models and pooled with meta-analyses. Further analyses were conducted to explore the associations stratified by smoking status and the factors associated with progression and regression in CXR. Results Compared to stable negative CXR (CXRSN), the HRs (95% CIs) of LC incidence were 2.88 (1.50–5.52), 3.86 (2.03–7.35), and 1.08 (0.80–1.46) for gain of positive CXR (CXRGP), stable positive CXR (CXRSP), and loss of positive CXR (CXRLP), while the HRs of LC mortality were 1.58 (1.33–1.87), 2.56 (1.53–4.29), and 1.05 (0.89–1.25). Similar trends were observed across smoking statuses. However, LC risk with CXRGP exceeded that with CXRSP among ever smokers [2.95 (2.25–3.88) vs. 2.59 (1.33–5.02)] and current smokers [2.33 (1.70–3.18) vs. 2.26 (1.06–4.83)]. Moreover, compared to CXRSN among never smokers, even without progression in CXR, the HRs (95% CIs) of LC incidence were 7.39 (5.60–9.75) and 31.45 (23.58–41.95) for ever and current smokers, while those of LC mortality were 6.30 (5.07–7.81) and 27.17 (21.65–34.11). If participants gained positive CXR, the HRs of LC incidence climbed to 22.04 (15.37–31.60) and 71.97 (48.82–106.09) for ever and current smokers, while those of LC mortality climbed to 11.90 (8.58–16.50) and 38.92 (27.04–56.02). CXRLP was associated with decreased LC risk. However, even when smokers lost their positive CXR, the increased risks of LC incidence and mortality did not decrease to a non-significant level. Additionally, smoking was significantly associated with increased risk of CXRGP but not CXRLP.
Conclusion LC risk differed across CXR trajectories and was modified by smoking status. Comprehensive intervention incorporating CXR trajectories and smoking status should be recommended to reduce LC risk.


Introduction
Lung cancer (LC) has ranked as the most common cancer and the leading cause of cancer mortality in men for several years, and in 2020 it was the third most common cancer and the second leading cause of cancer mortality in women worldwide (1)(2)(3). In 2020, an estimated 2.2 million new LC cases and 1.8 million LC deaths occurred, accounting for 11.4% of all new cancer cases and 18.0% of all cancer deaths (4). Owing to population aging, the persistently high prevalence of tobacco use, and worsening air pollution, LC incidence is expected to continue rising in many countries. Reducing the growing burden of LC has become a global concern, especially for transitioning and developing countries.
Although low-dose computed tomography (LDCT) has greatly altered the landscape of LC screening since 2011 (5,6), chest X-ray (CXR) is still the first-line examination for lung cancer in primary healthcare because of its wider availability (7), lower radiation dose (8), lower demand for technicians, and relatively lower cost compared with LDCT. Moreover, with the widespread rise of artificial intelligence (AI), deep-learning-based automatic diagnostic models based on CXR are expected to significantly improve the early detection rate of LC (9)(10)(11). However, before sophisticated AI-assisted CXR diagnostic technology is widely used in resource-limited regions, reducing the missed diagnoses and false-positive diagnoses associated with the traditional CXR exam is the key to improving the effectiveness of CXR examination. Currently, the associations of different CXR trajectories with LC risk remain unknown. Ignoring CXR trajectories, especially progression and regression in CXR, is presumed to be a leading cause of the non-significant reduction in LC mortality in previous CXR screening trials. Evaluating CXR independently, without reference to other risk factors (especially smoking), may also dilute the effects of CXR screening for LC. However, until now, few studies have investigated the associations of CXR trajectories with LC risk, and no study has explored the stratified effects by smoking or the interaction between CXR and smoking on LC risk. Therefore, in this secondary analysis of data from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial (12)(13)(14)(15) and the National Lung Screening Trial (NLST) (6), we first aimed to investigate the associations of CXR trajectories with LC incidence and mortality, and meta-analyses were conducted to obtain pooled results beyond the individual studies.
Furthermore, we aimed to evaluate the stratified associations by smoking status, the interaction between CXR and smoking on LC risk, and the potential factors associated with progression and regression in CXR.

Source of population
The trial registration numbers (on ClinicalTrials.gov) are NCT00002540 for PLCO and NCT00047385 for NLST. The designs of the PLCO and NLST cancer screening trials have been previously published. Briefly, from 1993 to 2001, the PLCO cancer screening trial randomized 154,887 participants aged 55-74 years in a 1:1 ratio to the intervention arm, which received multiple screening exams for prostate, lung, colorectal, and ovarian cancers, or to the control arm, which received usual care (12)(13)(14)(15). For LC screening, participants in the screening arm received four annual posterior-anterior CXR exams. There were several changes in the screening protocol, including that never smokers randomized after December 1995 were no longer offered a T3 CXR exam unless they insisted on it. A positive CXR screening exam was defined as one or more nodules, a mass, hilar or mediastinal lymph node enlargement, infiltrate, consolidation, or alveolar opacity. Participants with positive CXR were encouraged to receive further diagnostic evaluation from their primary care physicians. From 2002 to 2004, the NLST cancer screening trial randomized 53,452 smokers aged 55-74 years, with at least 30 pack-years of smoking history and at most 15 years since smoking cessation, in a 1:1 ratio to receive three annual LDCT screenings (the intervention arm) or posterior-anterior CXR screenings (the control arm) (6). Radiologists were required to review the images in two ways: looking at the image without reference to historical images (isolation read) and then looking at the image again with reference to historical images (comparison read). A positive exam was defined as any non-calcified nodule or mass with diameter ≥ 4 mm, or any other abnormality that appeared suspicious for lung cancer (in the radiologist's judgment).
Participants with either a positive result or a negative result but with other clinically significant abnormalities were strongly encouraged to receive a diagnostic evaluation for lung cancer or other suspected condition. Each participating center's institutional review board approved the protocols, and all participants provided written informed consent.

Selection of participants
In the PLCO trial, after excluding the participants in the control arm, a total of 77,443 participants in the screening arm were initially included in this study. After further excluding 6,811 participants who did not receive any CXR examination or had an inadequate screen, 3,819 participants with one round of CXR examination, 4,937 participants with two rounds of CXR examination, 2,620 participants with three rounds of CXR examination enrolled in 1993-1995, 3,175 smoking participants with three rounds of CXR examination enrolled in 1996-2001, and 3,595 participants who did not meet the definitions of the four classic CXR trajectories (see the following section), a total of 52,486 participants were finally included in this study (Supplementary e-Figure S1). In the NLST trial, after excluding the participants who received LDCT in the intervention arm, a total of 26,730 participants who received CXR in the control arm were initially included in this study. After further excluding 4,037 participants without any CXR examinations and 699 participants who did not meet the definitions of the four classic CXR trajectories (see the following section), a total of 21,994 participants were finally included in this study (Supplementary e-Figure S2).
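The participant flow above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts reported in the text; the dictionary keys and the helper name `remaining` are illustrative labels, not terms from the trials.

```python
# Participant-flow check using the counts reported in the text.
# The labels and the helper name are illustrative assumptions.
plco_initial = 77_443
plco_exclusions = {
    "no CXR or inadequate screen": 6_811,
    "one round of CXR": 3_819,
    "two rounds of CXR": 4_937,
    "three rounds, enrolled 1993-1995": 2_620,
    "smokers with three rounds, enrolled 1996-2001": 3_175,
    "outside the four classic trajectories": 3_595,
}

nlst_initial = 26_730
nlst_exclusions = {
    "no CXR examination": 4_037,
    "outside the four classic trajectories": 699,
}


def remaining(initial: int, exclusions: dict) -> int:
    """Return the cohort size left after subtracting every exclusion count."""
    return initial - sum(exclusions.values())


plco_final = remaining(plco_initial, plco_exclusions)
nlst_final = remaining(nlst_initial, nlst_exclusions)
print(plco_final, nlst_final)
```

Both totals agree with the final sample sizes stated in this section.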

Determination of CXR trajectories
Based on multiple rounds of CXR screening, and to achieve comparable CXR trajectories between the PLCO and NLST trials, four classic CXR trajectories were defined in this study: stable negative CXR (CXR SN), gain of positive CXR after persistent negative CXR with no subsequent negative diagnosis (CXR GP), stable positive CXR (CXR SP), and loss of positive CXR after persistent positive CXR with no subsequent positive diagnosis (CXR LP). To avoid the confounding effect of unstable findings on the four classic CXR trajectories, disordered fluctuations on CXR were not included in this study. These comprised negative CXR that progressed to positive CXR and subsequently regressed to negative CXR again, and positive CXR that regressed to negative CXR and subsequently progressed to positive CXR again.
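The four definitions can be read as a rule on the number of sign changes in a participant's chronological sequence of results. The sketch below is one possible encoding, not the authors' code: the function name, the boolean representation (True = positive), the short labels, and the requirement of at least two rounds are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def classify_trajectory(results: Sequence[bool]) -> Optional[str]:
    """Map chronological CXR results (True = positive) to a trajectory label.

    Returns "SN", "GP", "SP", or "LP", or None for a disordered fluctuation
    (more than one change of result), which the study excluded.
    """
    if len(results) < 2:
        # Assumption: a trajectory needs at least two screening rounds.
        raise ValueError("at least two screening rounds are needed")

    # Count how often the result changes between consecutive rounds.
    changes = sum(a != b for a, b in zip(results, results[1:]))

    if changes == 0:
        return "SP" if results[0] else "SN"
    if changes == 1:
        # One change: positive-to-negative is a loss, negative-to-positive a gain.
        return "LP" if results[0] else "GP"
    return None


print(classify_trajectory([False, False, True, True]))
```

Counting changes keeps the rule independent of the number of rounds, which differs between the two trials.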

Information of baseline variables
After informed consent, all participants in the PLCO trial were provided with a baseline questionnaire to collect participant-reported information on demographics and potential risk factors for the PLCO cancers, such as smoking history, family history of cancer, and medical history. Similar baseline variables, with some differences in detail, were collected in the NLST trial. To achieve comparable data in both the PLCO and NLST trials, the following variables were finally included in this study: age (55-59, 60-64, 65-69, and 70-74 years), sex (female and male), race/ethnicity (non-Hispanic white and other), education level (<senior high school, senior high school, and college and above), marital status (married/living as married, widowed/divorced/separated, and never married), smoking status (never smoking [only for the PLCO trial], ever smoking, and current smoking), body mass index (BMI) (<18.5, 18.5-25, 25-30, and > 30 kg/m²), and family history of lung cancer (no and yes). BMI was calculated as weight in kilograms divided by the square of height in meters (kg/m²).
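The BMI formula and the four categories can be sketched as follows. The text does not say which group the boundary values 25 and 30 belong to, so the handling of exact boundaries here is an assumption, as are the function names.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Assign one of the four BMI groups used as a baseline variable.

    Assumption: 25 falls in the "25-30" group and 30 stays in "25-30",
    because the top group is written as "> 30" in the text.
    """
    if value < 18.5:
        return "<18.5"
    if value < 25:
        return "18.5-25"
    if value <= 30:
        return "25-30"
    return ">30"


example = bmi(70.0, 1.75)
print(round(example, 1), bmi_category(example))
```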

Ascertainment of endpoints
The primary endpoint events of this study were LC incidence and mortality. In the PLCO trial, the incidence and mortality of LC were ascertained primarily by mailed Annual Study Update (ASU) questionnaires after the last-round CXR and supplemented by repeated mailings or phone calls to participants who did not respond to the ASU questionnaire. Mortality data were further supplemented by periodic linkage to the National Death Index (NDI), and a more accurate assessment of LC deaths was adjudicated by an independent Death Review Process (DRP). The cancer data were collected until 31 December 2009, and mortality data were collected through 2018 in the PLCO trial.
The NLST confirmed diagnoses of LC through medical record abstraction (MRA), which was triggered by an annual or semi-annual study update form, a positive CT or CXR screening exam, or a direct report by relatives or physicians, and was supplemented by NDI Plus searches. If the MRA process did not find records indicating an LC diagnosis, the LC was not considered confirmed, even if a source such as a death certificate indicated LC. An endpoint verification process was used to determine definitively whether LC was the cause of death. Active follow-up data were collected on cancer diagnoses and deaths that occurred through 31 December 2009. Extended follow-up data were collected for deaths through 31 December 2015.
In both the PLCO and NLST trials, if LC was diagnosed, further information was recorded about cancer characteristics (including histopathological type, grade, location, size of tumor, and TNM components of stage), initial treatment, and cancer progression. Furthermore, in this study, follow-up for the primary outcomes in both the PLCO and NLST trials ended at the date of the LC diagnosis (for LC incidence only), death, loss to follow-up, or the end of the follow-up period, whichever came first.
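The "whichever came first" rule for LC incidence can be sketched as below. This is an illustration under stated assumptions rather than the study's code: the function name, the `start` argument (the text does not state the time origin), and the tie-breaking choice are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple


def incidence_follow_up(
    start: date,
    lc_diagnosis: Optional[date],
    death: Optional[date],
    lost: Optional[date],
    study_end: date,
) -> Tuple[int, bool]:
    """Return (follow-up days, event flag) for the LC incidence endpoint.

    Follow-up stops at the earliest of LC diagnosis, death, loss to
    follow-up, or the end of the follow-up period. Only an LC diagnosis
    counts as an event; every other stopping reason is censoring.
    """
    candidates = [(study_end, False)]
    if lc_diagnosis is not None:
        candidates.append((lc_diagnosis, True))
    if death is not None:
        candidates.append((death, False))
    if lost is not None:
        candidates.append((lost, False))

    # Earliest date wins; on a tie, prefer the diagnosis (an assumption).
    end, event = min(candidates, key=lambda c: (c[0], not c[1]))
    return (end - start).days, event


# Hypothetical participant; the end date matches the cancer follow-up cutoff.
print(incidence_follow_up(date(2000, 1, 1), date(2005, 1, 1),
                          date(2007, 1, 1), None, date(2009, 12, 31)))
```

For the mortality endpoint the same pattern would apply with LC death as the event and without stopping at the diagnosis date.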

Statistical analysis
In this secondary data analysis, analysis of variance or the chi-square test was performed to compare the distribution of baseline variables and pathological characteristics of LC between different groups. The log-rank test was initially used to compare LC incidence and mortality between the four classic CXR trajectories. Multivariable Cox proportional hazards regression models were used to analyze the associations of CXR trajectories with LC incidence and mortality after adjusting for all available baseline variables (including age, sex, race, education level, marital status, smoking status, family history of lung cancer, and BMI). The associations were measured as hazard ratios (HRs) and 95% confidence intervals (CIs). Due to the potential heterogeneity between the PLCO and NLST trials, meta-analyses with random-effects models were conducted to pool the study-specific associations from the PLCO and NLST trials.
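Pooling two study-specific HRs with a random-effects model can be sketched as follows. The text does not name the estimator, so the DerSimonian-Laird method is used here as an assumption. The function name is hypothetical, and the input values in the usage line are made-up placeholders, not results from either trial.

```python
import math
from typing import List, Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_random_effects(
    estimates: List[Tuple[float, float, float]]
) -> Tuple[float, float, float, float]:
    """Pool (HR, lower, upper) triples; returns (HR, lower, upper, tau^2).

    Assumption: DerSimonian-Laird estimate of between-study variance.
    """
    if len(estimates) < 2:
        raise ValueError("at least two studies are needed")

    # Work on the log scale and recover each standard error from the CI width.
    y = [math.log(hr) for hr, _, _ in estimates]
    se = [(math.log(hi) - math.log(lo)) / (2 * Z_95) for _, lo, hi in estimates]

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1 / s ** 2 for s in se]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    pooled_se = math.sqrt(1 / sum(w_star))

    return (
        math.exp(pooled),
        math.exp(pooled - Z_95 * pooled_se),
        math.exp(pooled + Z_95 * pooled_se),
        tau2,
    )


# Placeholder inputs for two hypothetical studies.
print(pool_random_effects([(1.5, 1.2, 1.9), (3.0, 2.0, 4.5)]))
```

When the two estimates disagree strongly, the between-study variance grows and the pooled interval widens, which is why a random-effects pooled result tends to be more conservative than a fixed-effect one.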
Subgroup analyses were performed to evaluate the associations of CXR trajectories with LC risk stratified by smoking status in the PLCO trial but not the NLST trial, because the NLST trial included no never smokers. Further analyses were conducted to examine the interaction between CXR trajectories and smoking on LC risk in the PLCO trial. Additionally, multivariable logistic regression models were used to investigate the potential factors associated with progression and regression in CXR, and associations were measured as odds ratios (ORs) and 95% confidence intervals (CIs).
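One common way to examine such an interaction is to build a joint exposure variable with a single reference group, which matches the comparisons reported against CXR SN among never smokers. The text does not describe how the variable was coded, so the sketch below is an assumption; the names `SMOKING`, `TRAJECTORIES`, `REFERENCE`, and `joint_dummies` are hypothetical, and only the two trajectories that start from a negative first-round CXR are shown.

```python
from itertools import product
from typing import Dict

SMOKING = ["never", "ever", "current"]
TRAJECTORIES = ["SN", "GP"]  # trajectories starting from a negative first-round CXR
REFERENCE = ("never", "SN")  # never smokers with stable negative CXR


def joint_dummies(smoking: str, trajectory: str) -> Dict[str, int]:
    """One-hot encode the joint smoking-by-trajectory group.

    The reference group is represented by all zeros, so each regression
    coefficient compares one joint group with the reference.
    """
    if smoking not in SMOKING or trajectory not in TRAJECTORIES:
        raise ValueError("unknown smoking status or trajectory")

    levels = [lv for lv in product(SMOKING, TRAJECTORIES) if lv != REFERENCE]
    return {f"{s}_{t}": int((s, t) == (smoking, trajectory)) for s, t in levels}


print(joint_dummies("current", "GP"))
```

The resulting indicator columns could then be entered into a Cox model alongside the other baseline covariates.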
All statistical analyses were conducted via R software (version 4.1.0). A p-value < 0.05 was considered statistically significant.

Association of CXR trajectories with LC risk
After a median follow-up of 16.95 and 10.30 years in the PLCO and NLST trials, respectively, a total of 889 (1.7%) LC cases and 1,186 (2.3%) LC deaths were documented in the PLCO trial, while a total of 532 (2.4%) LC cases and 823 (3.7%) LC deaths were recorded in the NLST trial. As shown in Figure 1, participants with CXR SP had higher risks of LC incidence and mortality than participants with other CXR trajectories in the PLCO trial (all p-values < 0.01), and almost the same results were observed in the NLST trial.
As shown in Table 1

Association of CXR trajectories with LC risk by smoking status
In the PLCO trial, after stratifying by smoking status and adjusting other available baseline factors (

Interaction between CXR trajectories and smoking on LC risk
Among participants with first-round negative CXR, compared to CXR SN among never smokers, even without progression in CXR, the adjusted HRs (95% CIs) of LC incidence were 7.39 (5.60-9.75) for ever smokers and 31.45 (23.58-41.95) for current smokers. [...] were also observed for both ever and current smokers, especially for current smokers [68.24 (31.00-150.18)]. Even when these smokers lost their positive CXR, the increased risks of LC incidence and mortality did not decrease to a non-significant level (Table 3).

Pathological characteristics of LC associated with CXR trajectories
As shown in Table 4 and Supplementary e-Tables S3 and S4, the proportion of advanced-stage LC in patients with CXR GP was lower than in those with CXR SN [55.1% vs. 65.6% (p-value = 0.039) for PLCO, 39.2% vs. 66.3% (p-value < 0.001) for NLST, and 48.6% vs. 65.8% (p-value < 0.001) for the meta-analysis], while there was no significant difference in cancer stage between CXR SP and CXR LP. A similarly lower proportion of poorly differentiated LC was observed in participants with CXR GP compared to those with CXR SN [49.6% vs. 60.9% (p-value = 0.020) for the meta-analysis], while no significant difference in cancer grade was observed between CXR SP and CXR LP. Moreover, in the PLCO trial, no significant differences in the histological types of LC were observed between CXR SN and CXR GP, or between CXR SP and CXR LP (both p-values > 0.05, Supplementary e-Table S3).

Potential factors associated with progression and regression in CXR
As shown in

Discussion
After systematically searching for comparative studies, published before 2023, on the association between progression (or trajectory) on chest X-ray and LC risk, we found few studies focused on this topic, and even fewer that explored the association stratified by smoking status. To the best of our knowledge, this was the first study to investigate the associations of CXR trajectories with LC risk, and also the first to evaluate the interaction between CXR trajectories and smoking status on LC risk. Based on two independent, well-curated, multicenter community-based LC screening trials, we found that LC risk varied across CXR trajectories. Overall, progression in CXR was significantly associated with increased risk of LC incidence and mortality even after stratification by smoking status, while regression in CXR was associated with decreased LC risk among never or ever smokers but not current smokers. Furthermore, significantly higher LC risks were observed in smokers than in never smokers, even in the absence of obvious chest symptoms in CXR. Additionally, smoking was significantly associated with increased risk of progression in CXR but not with regression in CXR.
Previous studies suggested that chest X-ray screening for LC may have a false-negative rate of at least 20% (8), and the UK Biobank study also indicated that CXR failed to identify nearly 17.7% of lung cancer patients in the year before diagnosis (16). Similarly, CXR screening is also likely to have a high percentage of false positives (17)(18)(19). All these studies remind general practitioners that a single round of CXR cannot rule out lung cancer; therefore, multiple rounds of CXR screening or monitoring are needed to reduce missed or false diagnoses of LC (20). However, no studies had explored whether there was a significant difference in LC risk across different CXR trajectories, especially between CXR SP and CXR GP. In this study, overall, LC risk with CXR SP was significantly higher than that with CXR GP. This is easy to understand, since people with CXR SP are likely to have a more serious chest disorder than those with CXR GP. This association can be more clearly observed in never smokers. However, this result cannot be observed in all settings. Conversely, among both ever and current smokers, consistently higher risks of LC incidence were observed with CXR GP than with CXR SP. These results suggest that smoking-related CXR GP may be associated with a higher LC risk than smoking-related CXR SP, and that this type of CXR SP deserves more attention than CXR GP in never smokers. The harmful effects of smoking have been fully addressed in the previous and latest WHO reports on the global tobacco epidemic 2021 (21), the previous and updated reports of the Surgeon General on the Health Consequences of Smoking (22), and the latest Cancer Atlas (third edition) released in 2019 (23). In summary, all smoked and traditional smokeless tobacco products cause cancer.
Although lung cancer is the predominant cancer caused by cigarette smoking, cancers at 19 or more other sites or sub-sites are causally associated with smoking. Additionally, smoked tobacco products cause even more deaths from vascular and respiratory conditions than from cancer.
Consistent with a large body of previous research evidence (24)(25)(26)(27), smoking can increase LC risk by dozens of times, especially among current smokers. However, most previous studies suggest that smoking increases LC risk primarily by increasing lung symptoms. In this study, we observed that even in people with stable negative CXR, smoking still increased LC risk 7-30 times. These results suggest that smoking could increase LC risk through other pathways that produce no obvious symptoms in the lung. Although this was also reported in previous studies, it was rarely observed (28)(29)(30). On the other hand, although we observed a clear interaction between smoking and CXR trajectories on LC risk, the increased risk of LC still existed among current smokers even if positive CXR regressed to negative CXR. Evidence from both CXR SN and CXR LP further suggested that smoking increased LC risk not only by increasing lung symptoms but also through other pathways without obvious lung symptoms. This was the second key finding of this study. This result suggested that we should not only focus on the obvious lung symptoms associated with smoking but also learn more about how smoking increases LC risk through other pathways without obvious lung symptoms (20). The latter is frequently ignored in current LC prevention practice but would be very helpful in understanding the carcinogenic mechanisms of smoking. Additionally, two other findings also deserve more attention. First, as mentioned above, it was easy to understand that smoking was associated with increased risk of progression in CXR. This was consistent with several previous studies (24,31). However, unexpectedly, we found no association between smoking and regression in CXR. This result may be related to the small sample size or the limited analytical variables available in this study. Several factors, such as diet, exercise, and previous lung diseases, may influence regression in CXR.
The lack of this information could bias the current results. The current results further supported the interaction between CXR trajectories and smoking on LC risk. However, the causes of regression in CXR may be much more complex than expected, and more research is needed to investigate them. Second, among participants with first-round negative CXR, lower proportions of advanced LC and poorly differentiated LC were observed with CXR GP than with CXR SN. Since both advanced stage and poor differentiation are associated with worse prognosis, participants with CXR GP deserve special attention and more follow-up. This will allow us to detect potential LC earlier and to reduce progression in CXR with effective interventions.
In addition to the findings mentioned above, several limitations of this study should also be considered. First, the NLST trial did not include never smokers, and the number of CXR rounds differed between the PLCO and NLST trials. Therefore, the current results could be biased by the potential heterogeneity between the PLCO and NLST trials. As noted above, heterogeneity did exist. However, the overall associations between CXR trajectories and LC risk in the two trials were very similar. Since the PLCO and NLST trials recruited two completely different populations, the results of the two cohorts can be seen as mutual independent validation. Moreover, because meta-analyses with random-effects models were used to combine the results of the two cohorts, a more conservative conclusion was drawn in this study. Second, ignoring the disordered fluctuations in CXR could bias the current results. In fact, the sample size of these populations was relatively small, and the clinical significance of these disordered fluctuations in CXR is unclear. Therefore, ignoring the disordered fluctuations is unlikely to have a significant impact on the current results. Third, due to the lack of never smokers in the NLST trial, stratified analyses by smoking and interaction analyses between CXR trajectories and smoking could only be conducted in the PLCO trial. This could also bias the current results. Further studies are needed to validate the current results. Fourth, CXR can only reflect relatively limited pulmonary symptoms, so some minor progressions in CXR would be missed. LDCT trajectories or more detailed trajectories of lung symptoms need to be investigated to detect potential LC earlier in the future.

Conclusion
Based on two large-scale lung cancer screening trials, progression in CXR was generally associated with increased LC risk, while regression was associated with decreased LC risk. However, smoking greatly increased LC risk even in the absence of obvious chest symptoms in CXR, which deserves more attention in exploring the carcinogenic mechanisms of smoking. Moreover, smoking reversed the pattern of higher LC risk with CXR SP than with CXR GP that was observed in general participants or never smokers, and regression in CXR returned the LC risk associated with positive CXR to the reference level of CXR SN only among never or ever smokers, not among current smokers. Given the high incidence and mortality of LC worldwide, the high prevalence of smoking, and the strong association of CXR trajectories with LC risk, comprehensive intervention incorporating CXR trajectories and smoking status should be recommended to reduce LC risk, especially in resource-limited regions, thereby preventing more life-threatening adverse clinical outcomes.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Ethics statement
The studies involving human participants were reviewed and approved by The National Cancer Institute and their local institutional review board. The patients/participants provided their written informed consent to participate in this study.