Gender differences, environmental pressures, tumor characteristics, and death rate in a lung cancer cohort: a seven-years Bayesian survival analysis using cancer registry data from a contaminated area in Italy

Introduction In Taranto, Southern Italy, adverse impacts on the environment and human health due to industrial installations have been studied. In the literature, associations have been reported between gender, environmental factors, and lung cancer mortality in women and men. The aim of this study was to investigate the relationships between gender, residence in areas with high environmental pressures, bronchus/lung cancer characteristics, and death rate. Methods Data from the Taranto Cancer Registry were used, including all women and men with invasive bronchus/lung cancer diagnosed between 1 January 2016 and 31 December 2020 and with follow-up to 31 December 2022. Bayesian mixed effects logistic and Cox regression models were fitted with the approach of integrated nested Laplace approximation, adjusting for patients and disease characteristics. Results A total of 2,535 person-years were observed. Male gender was associated with a higher prevalence of histological grade 3 (OR 2.45, 95% CrI 1.35–4.43) and lung squamous-cell carcinoma (OR 3.04, 95% CrI 1.97–4.69). Variables associated with higher death rate were male gender (HR 1.24, 95% CrI 1.07–1.43), pathological/clinical stage II (HR 2.49, 95% CrI 1.63–3.79), III (HR 3.40, 95% CrI 2.33–4.97), and IV (HR 8.21, 95% CrI 5.95–11.34), histological grade 3 (HR 1.80, 95% CrI 1.25–2.59), lung squamous-cell carcinoma (HR 1.18, 95% CrI 1.00–1.39), and small-cell lung cancer (HR 1.62, 95% CrI 1.31–1.99). Variables associated with lower death rate were other-type lung cancer (HR 0.65, 95% CrI 0.44–0.95), high immune checkpoint ligand expression (HR 0.75, 95% CrI 0.59–0.95), lung localization (HR 0.73, 95% CrI 0.62–0.86), and left localization (HR 0.85, 95% CrI 0.75–0.95). 
Discussion The results among patients with lung cancer did not show an association between residence in the contaminated site of national interest (SIN) and the prevalence of the above mentioned prognostic factors, nor between residence in SIN and death rate. The findings confirmed the independent prognostic values of different lung cancer characteristics. Even after adjusting for patients and disease characteristics, male gender appeared to be associated with a higher prevalence of poorly differentiated cancer and squamous-cell carcinoma, and with an increased death rate.



Introduction
In 2020, lung cancer was the second most commonly diagnosed cancer and the leading cause of cancer death. With an estimated 2.2 million new cases and 1.8 million deaths in 2020, it represented approximately one in 10 (11.4%) cancers diagnosed and one in 5 (18.0%) cancer deaths (1, 2). The risk of developing this cancer is associated with older age combined with a history of smoking cigarettes. It is more common among men than women and among those with lower socioeconomic status. Among non-smokers, important lung cancer risk factors are exposure to second-hand smoke, exposure to ionizing radiation, and occupational exposure to lung carcinogens, such as asbestos (3).
Treatments for lung cancers are based on the biological subtyping of the tumors, and several disease characteristics, such as staging, grading, morphology, PD-L1 expression, topography, and laterality, represent prognostic factors and/or targets for therapies (4-9). Specifically, tumor, node, metastases (TNM) classification is a system used to describe the amount and spread of cancer in a patient's body. In TNM classification, T describes the size of the tumor and any spread of cancer to nearby tissues, N describes the spread of cancer to nearby lymph nodes, and M describes the metastasis, i.e., the spread of cancer to other parts of the body. TNM combinations are grouped into five less-detailed stages, from 0 (carcinoma in situ, where abnormal cells are present but have not spread to nearby tissues) to I-II-III (invasive cancer, where the higher the number, the larger the tumor and the more it has spread to nearby tissues) to IV (invasive, metastatic cancer, where cancer has spread to distant parts of the body) (4, 10-12). In addition to TNM staging, histologic grading is a predictor of disease outcome in lung cancer patients, with higher tumor grade (lower differentiation) being associated with a poorer prognosis (differentiation describes how much a tumor resembles the normal tissue from which it arose) (5, 13). Tumor morphology is another prognostic factor in patients with lung cancer. Non-small-cell lung cancer (NSCLC) is any type of epithelial lung cancer other than small-cell lung cancer (SCLC). The most common types of NSCLC are lung adenocarcinoma (LAC) and lung squamous-cell carcinoma (LSCC), but there are several other types that occur less frequently, and all types can occur in unusual histological variants. NSCLC is usually less sensitive to chemotherapy and radiation therapy than SCLC, but patients with resectable cancer may be cured through surgery or surgery followed by chemotherapy. Conversely, SCLC is a distinct subtype of lung cancer that presents as a proliferation of small cells. It is more responsive to chemotherapy and radiation therapy than other cell types of lung cancer; however, it is difficult to cure as SCLC has a greater tendency to spread widely even before diagnosis takes place (4, 6, 7). Programmed death-ligand 1 (PD-L1) expression is also an important prognostic factor in lung cancer. PD-L1 is a ligand of the programmed death protein 1 (PD-1) coinhibitory immune checkpoint expressed on tumor cells and infiltrating immune cells. Tumors with the expression of PD-L1 ≥ 50% are amenable to first-line treatment with targeted biological agents such as pembrolizumab, with improvement in survival (4, 6, 7). In previous studies, authors have also found associations between bronchus/lung cancer topography, laterality, and death rate. Specifically, increased death rates were reported in patients with cancer localization in the main bronchus or on the right side (8, 9).
As far as gender differences are concerned, higher lung cancer incidence and mortality were reported in men, even when other clinical and demographic characteristics were considered. Regarding environmental pressures, which can also be linked to gender differences in exposure patterns, air pollution is a recognized risk factor for lung cancer incidence and mortality and is a major health concern for Europeans, with more than 300,000 premature deaths each year attributed to chronic exposure to fine particulate matter alone. Part of this mortality is due to lung cancer, and the International Agency for Research on Cancer (IARC) has classified outdoor air pollution and particulate matter in outdoor air pollution as carcinogenic to humans (Group 1), with sufficient evidence for lung cancer (14-19).
In some areas of the province of Taranto, a coastal city in the Apulia region in Southern Italy, various industrial installations and polluting sources (a steel plant, an oil refinery, urban discharges, harbor activities, and the shipyard of the Italian Navy) have been operating in close proximity to the resident population for decades with well-known and extensively studied adverse impacts on the environment and human health (20-34). With regard to environmental, feed, and food impacts, it is of particular importance that the area shows contamination of these matrices by metals and persistent organic pollutants, specifically dioxins and PCBs. Moreover, some of these substances have been detected in human biological samples (21, 24-29). As far as human health effects are concerned, evidence has been produced after studying the populations who resided in the contaminated site of national interest (SIN) of Taranto.
In particular, cohort studies have reported an increased risk for different types of cancer incidence, including lung cancer incidence in women and men (20, 31). Some studies have also noted an increased risk for all-cause hospitalization; for circulatory, respiratory, digestive, and urinary diseases hospitalization; and for different types of cancer hospitalization, including lung cancer hospitalization in women and men (20, 33, 34). Different studies have also indicated an increased risk for all-cause mortality; for circulatory and digestive diseases mortality; and for some types of cancer mortality, including lung cancer mortality in women and men (20, 23, 30, 33, 34).
To summarize, associations have been reported between the aforementioned factors and lung cancer mortality in women and men. The aim of this study was to investigate the relationships between gender, residence in areas with high environmental pressures, bronchus/lung cancer characteristics, and death rate.

Study area and baseline epidemiological data
The study area is the province of Taranto, which consists of 29 municipalities with a total resident population of 555,999 on 1 January 2023 (35). The SIN of Taranto consists of two municipalities, Taranto (the provincial capital) and Statte, with population sizes of 188,098 and 12,917, respectively, on 1 January 2023 (33-35). The study area with municipalities and SIN is shown in Figure 1. The map was created with QGIS version 3.28.4.
From 2015 to 2019, Taranto Province recorded 1,740 bronchus/lung cancer (ICD10 codes C34.0 to C34.9) cases, with a directly standardized rate of 21.5 cases per 100,000 inhabitants in women and 96.8 cases per 100,000 in men, and a median age of 70 years in women and 72 years in men. In the studied period, no trachea cancer (C33) cases were recorded. In the same period, 67% of patients with bronchus/lung cancer requiring hospitalization were admitted to a hospital in the Taranto Province, 21% to an extra-provincial hospital in the Apulia region, and 12% to an extra-regional hospital. Between 2013 and 2017, the relative standardized five-year bronchus/lung cancer survival rate was 24.5 (95% CI 19.4-29.9) for women and 17.8 (95% CI 14.7-21.2) for men (31).
From 2013 to 2017, in the SIN, 121 bronchus/lung cancer deaths were recorded among women, with a standardized mortality ratio (reference: Apulia region) of 125 (90% CI 107-145); and 451 bronchus/lung cancer deaths were recorded among men, with a standardized mortality ratio (reference: Apulia region) of 118 (90% CI 109-127) (33). From 2015 to 2019, in the SIN, 202 bronchus/lung cancer cases were recorded among women, with a directly standardized rate of 30.6 cases per 100,000 inhabitants and a standardized incidence ratio (reference: Taranto Province) of 191.2 (95% CI 165.8-219.5). For the same period, in the SIN, 582 bronchus/lung cancer cases were recorded among men, with a directly standardized rate of 111.1 cases per 100,000 inhabitants and a standardized incidence ratio (reference: Taranto Province) of 125.7 (95% CI 115.7-136.4) (31).
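The standardized ratios above are expressed on a base of 100, so the excess relative risk with respect to the reference population can be read as the ratio minus 100. The following is a minimal sketch of that arithmetic using the SIN figures quoted in this section; the function name `excess_percent` and the dictionary layout are illustrative assumptions, not part of the registry's tooling.

```python
def excess_percent(ratio_base_100):
    """Excess relative risk (%) implied by a ratio expressed on a base of 100.

    Hypothetical helper: a standardized ratio of 125 means 25% more events
    than expected in the reference population.
    """
    return ratio_base_100 - 100.0


# Point estimate, lower bound, upper bound, as quoted in the text.
sin_ratios = {
    "SMR women, 2013-17 (ref. Apulia)": (125.0, 107.0, 145.0),
    "SMR men, 2013-17 (ref. Apulia)": (118.0, 109.0, 127.0),
    "SIR women, 2015-19 (ref. Taranto Province)": (191.2, 165.8, 219.5),
    "SIR men, 2015-19 (ref. Taranto Province)": (125.7, 115.7, 136.4),
}

for label, bounds in sin_ratios.items():
    point, low, high = (round(excess_percent(b), 1) for b in bounds)
    print(f"{label}: {point:+.1f}% ({low:+.1f}% to {high:+.1f}%)")
```

The same subtraction applies to the interval bounds because the transformation is a simple shift.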

Data source and cancer cohort
Data from the Taranto Cancer Registry of the Italian Association of Cancer Registries (AIRTUM) were used, including all women and men with invasive bronchus/lung cancer (ICD10 codes C34.0 to C34.9) diagnosed between 1 January 2016 and 31 December 2020 who resided in Taranto Province at the time of diagnosis. In the studied
In the cross-sectional study, the studied outcomes were each of the tumor characteristics (prevalence), and the studied exposures were gender and residence in areas with high environmental pressures. The aim of this step was to assess possible associations between gender, environmental factors, and each of the tumor characteristics. In the longitudinal study, the studied outcome was all-cause death (incidence), and the studied exposures were gender, residence in areas with high environmental pressures, and tumor characteristics. The aim of this step was to assess possible associations between gender, environmental factors, tumor characteristics, and death. Adjustment variables recorded at the time of diagnosis were age class (40-59, 60-69, 70-79, ≥ 80 years), year, patient ID, and municipality of residence. Adjustment for the patient's ID and municipality of residence was provided to account for the heterogeneity related to possible unobserved individual or ecological level variables (e.g., tobacco use, alcohol consumption, social and material deprivation, and access to health services).

Statistical analysis
Data analysis was performed using R version 4.2.3. Bayesian inference was performed with package INLA version 22.12.16. Complex models could be fitted with the Bayesian approach, including generalized linear models and survival analyses. The possible non-independence and heterogeneity of observations could be taken into account by fitting mixed models with both fixed and random effects. While traditional survival analysis relies on parameter estimation based on partial likelihood, Bayesian approaches for time-to-event data allow us to use the full likelihood to estimate all unknown elements in the model. Bayesian generalized linear models comprise Bayesian logistic regression for binary response data. However, the computation of the posterior and other quantities of interest in these complex models is usually much more difficult than frequentist calculations. The Integrated Nested Laplace Approximation (INLA) is a deterministic method for Bayesian calculations that applies to a wide class of models called Latent Gaussian Models. INLA provides fast and accurate approximations to the posterior marginals through a clever use of Laplace approximations and advanced numerical methods, taking computational advantage of sparse matrices. In most cases, INLA is both faster and more accurate than other methods for Bayesian computation (36-40).
The cross-sectional study analyzed the associations between gender, residence in areas with high environmental pressures, and tumor characteristics using a series of mixed effects binary logistic regressions. Pathological/clinical staging (TNM III-IV; TNM IV), histological grading (grade 3), morphology (SCLC in patients with LAC, LSCC, or SCLC; LSCC in patients with LAC or LSCC), immune checkpoint ligand expression (PD-L1 ≥ 50%), topography (lung), and laterality (left, excluding patients with bilateral cancer) were considered as binary outcome variables. For each regression model, records with missing values for the analyzed outcome were excluded. Gender and residence in areas with high environmental pressures were included as binary fixed effects. Age class and year were included as multinomial fixed effects. Patient ID and municipality of residence were included as multinomial random effects (random intercepts). Bayesian binary logistic regression models were fitted with the INLA approach for latent Gaussian models, computing odds ratios (OR) and 95% credible intervals (CrI). An independent and identically distributed random effect was chosen for patient ID and municipality of residence (37-39).
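For orientation, the structure of each logistic model described above can be written out as follows. This is a sketch under stated assumptions rather than the authors' exact specification: the symbols ($y_{ij}$ for the binary tumor characteristic of patient $i$ in municipality $j$, $\mathbf{a}_i$ and $\mathbf{t}_i$ for the dummy-coded age class and year, $u_i$ and $v_j$ for the random intercepts) are introduced here for illustration, and prior distributions are omitted because the text does not report them.

```latex
\begin{aligned}
y_{ij} \mid p_{ij} &\sim \mathrm{Bernoulli}(p_{ij}), \\
\operatorname{logit}(p_{ij}) &= \beta_0 + \beta_1\,\mathrm{male}_i + \beta_2\,\mathrm{SIN}_i
  + \mathbf{a}_i^{\top}\boldsymbol{\gamma} + \mathbf{t}_i^{\top}\boldsymbol{\delta} + u_i + v_j, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_j \sim \mathcal{N}(0, \sigma_v^2), \\
\mathrm{OR}_{\mathrm{male}} &= \exp(\beta_1), \qquad \mathrm{OR}_{\mathrm{SIN}} = \exp(\beta_2).
\end{aligned}
```

Under this reading, the reported odds ratios and credible intervals correspond to posterior summaries of $\exp(\beta_1)$ and $\exp(\beta_2)$.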
The longitudinal study analyzed the associations between gender, residence in areas with high environmental pressures, tumor characteristics, and death using a mixed effects Cox proportional hazard regression. The time axis was the difference in days between the day of cancer diagnosis and the last day of follow-up (event or right censoring). All-cause death was considered as the binary outcome variable (event). The proportional hazard assumption was verified through the analysis of plotted survival curves between the different levels of the variables. Gender, residence in areas with high environmental pressures, pathological/clinical staging (TNM I, II, III, IV), histological grading (grade 1-2, 3), morphology (LAC, LSCC, SCLC, OTLC), immune checkpoint ligand expression (PD-L1 0-49%, ≥ 50%), topography (main bronchus, lung), and laterality (right, left) were included as binary or multinomial fixed effects. An "NA" (not available) category was created for the records with missing values for the analyzed exposures. Due to low frequency, the bilateral cancer category was merged with the NA category. Age class and year were included as multinomial fixed effects. Patient ID and municipality of residence were included as multinomial random effects (random intercepts). Bayesian Cox regression models were fitted with the INLA approach for latent Gaussian models, computing hazard ratios (HR) and 95% credible intervals (CrI). An independent and identically distributed random effect was chosen for patient ID and municipality of residence, while a random walk model of order two was chosen for the baseline hazard function (36-40).
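The time axis and event indicator described above can be sketched as follows. This is a minimal illustration, assuming that the only source of right censoring is the administrative end of follow-up on 31 December 2022; the function name `follow_up` and its arguments are hypothetical, and other censoring reasons (for example, loss to follow-up) are not modeled here.

```python
from datetime import date

# End of follow-up as stated in the Methods.
FOLLOW_UP_END = date(2022, 12, 31)


def follow_up(diagnosis, death=None, end=FOLLOW_UP_END):
    """Return (days, event) for one patient.

    days  : days from cancer diagnosis to death or to the end of follow-up
    event : 1 if all-cause death was observed by `end`, 0 if right-censored
    """
    if death is not None and death <= end:
        return (death - diagnosis).days, 1
    return (end - diagnosis).days, 0


# Illustrative (invented) patients, not registry data.
print(follow_up(date(2016, 1, 1), death=date(2016, 1, 31)))  # death observed
print(follow_up(date(2020, 12, 31)))                         # censored at end
```

Summing the `days` values over patients and dividing by 365.25 would give person-years of observation on the same time axis.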
Generalized variance inflation factors (GVIF) were calculated to test the presence of multicollinearity in the data. Sensitivity analyses were performed by examining the extent to which the results were affected by changes in methods, models, variables, influential observations, and inclusion/exclusion criteria. Different combinations of included patients and variables were tested, and for the included variables, different collapsed categories, as well as changes in the type of estimated effects (fixed or random), were also tested. The models were iteratively refitted by excluding from the dataset each age class and year one at a time.

Results
Baseline patients and disease characteristics are shown in Table 1. A total of 2,535 person-years were observed, 1,893 for men, 1,212 for residents in SIN, and 1,118 for deceased patients, with a median (IQR) age of 72.0 (66.0; 78.2) years.
The results of the mixed effects Bayesian binary logistic regression models are reported in Table 2. Mutually adjusting and adjusting for baseline age class, year, patient ID, and municipality of residence, the fixed effect variable male gender was associated with a higher prevalence of grade 3 (OR 2.45, 95% CrI 1.35-4.43) and morphology LSCC (OR 3.04, 95% CrI 1.97-4.69), while the fixed effect variable residence in SIN did not appear to be clearly associated with the prevalence of the investigated tumor characteristics.

Discussion
The results of the present study confirmed that TNM staging, histological grading, morphology, immune checkpoint ligand expression, topography, and laterality are independent prognostic factors for mortality in lung cancer patients. Of interest was the finding of an overall negative association, on average over the follow-up period, between high PD-L1 expression and death rate in our cohort, although this association may not be constant over time. This association could be explained by the development and implementation of PD-L1 targeting drugs, such as pembrolizumab, in recent years (4, 6, 7). Pembrolizumab is a humanized monoclonal antibody that inhibits the interaction between the programmed death protein 1 (PD-1) coinhibitory immune checkpoint expressed on tumor cells and infiltrating immune cells and its ligand, PD-L1 (6). In general, patients are eligible for this first-line immunotherapy treatment if their cancer-tissue sample shows a positive expression for PD-L1 in ≥50% of neoplastic cells (4). For these reasons, the presence of PD-L1 ≥ 50% could be taken as a proxy for treatment with immune checkpoint inhibitors, a piece of information not directly available in the cancer registry. Given this assumption, our findings could be supported by the scientific literature on the improvement of overall survival in the treated patients (4, 6, 7). Another interesting result was the negative prognostic value of the presence of missing data ("NA" category) for some variables in our cohort. This could be partly explained by the fact that patients without information on some variables (e.g., grading) could correspond to poor prognosis patients who, due to a severe condition at the time of diagnosis, were unable to undergo further interventions or investigations.
With regard to gender differences in tumor characteristics, according to our study, there appeared to be a lower prevalence of LAC and a higher prevalence of grade 3 tumors among men. While the former result appears to be consistent with gender differences in lung cancer characteristics reported in the literature (15), the latter result was a peculiar and interesting finding of our study. With regard to gender differences in lung cancer prognosis, an important, related result appears to be the association between male patients and increased all-cause death rate. This finding was observed independently of all other factors analyzed, as the HR was adjusted for the other variables included in the Bayesian mixed effects regression model. Specifically, this indicates that male patients in the lung cancer cohort have an excess relative risk for all-cause mortality of 24% (95% CrI 7-43%) compared to female patients. Male patients also present an excess odds ratio for grade 3 tumors of 145% (95% CrI 35-343%) compared to female patients, which is, in turn, a factor independently associated with an increased all-cause death rate. Besides, male patients in the LAC/LSCC cohort present an excess odds ratio for LSCC of 204% (95% CrI 97-369%) compared to female patients, which is also, in turn, a factor independently associated with increased all-cause death rate.
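The excess percentages quoted above follow directly from the reported ratios as (ratio − 1) × 100, applied to the point estimate and to each credible-interval bound. A minimal sketch of that conversion, using the hazard and odds ratios for male gender reported in this paper; the helper name `excess_percent` is an illustrative assumption:

```python
def excess_percent(ratio):
    """Excess relative risk or excess odds (%) implied by an HR or OR."""
    return round((ratio - 1.0) * 100)


# (point estimate, lower bound, upper bound) for male vs. female patients.
male_estimates = {
    "HR, all-cause death": (1.24, 1.07, 1.43),
    "OR, grade 3": (2.45, 1.35, 4.43),
    "OR, LSCC (LAC/LSCC cohort)": (3.04, 1.97, 4.69),
}

for label, (point, low, high) in male_estimates.items():
    print(f"{label}: {excess_percent(point)}% "
          f"(95% CrI {excess_percent(low)}-{excess_percent(high)}%)")
```

Because the transformation is monotone, the bounds of the credible interval can be converted one by one in the same way as the point estimate.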
These direct and indirect effects of gender on overall survival could probably explain the difference in relative standardized five-year bronchus/lung cancer survival between women (24.5) and men (17.8) observed in Taranto Province in the years 2013-17 (31). Moreover, the excess relative risk for mortality independently associated with male patients (direct effect) not only confirms what was already known in relation to the lower survival reported for men in the general population and among lung cancer patients (14, 16, 35), but also suggests that this excess relative risk could be different in the cohort followed in this study from the excesses reported elsewhere, of 6% and 14%, respectively, for mortality in men compared to women (14, 16). Although these differences could be attributable to random error, bias, and/or methodological differences, we could not completely rule out the hypothesis that the excess mortality risk in male patients with lung cancer compared to female patients could be different in the population residing in the province of Taranto. In general, gender-based differences in women with lung cancer could be observed in terms of exogenous risk factors (tobacco use, second-hand smoke, asbestos, radon, radiation, and infections), endogenous risk factors (estrogen and genetic polymorphism), diagnosis (diagnosis at a younger age and with never-smoking history), and outcome and mortality (superior surgical outcomes, differences in response to therapies and adverse effect rates, and improved survival across stages and histologies) (15). Therefore, our findings could indicate an interaction between gender differences in lung cancer prognosis and a disadvantaged and/or polluted external context, which is also linked to the second main determinant analyzed in the present study, namely the environmental pressures.
In this regard, the results of this study did not show a clear association between residence in SIN and prevalence of the above-mentioned prognostic factors, or between residence in SIN and all-cause death rate. Briefly, this suggests that, within the followed-up lung cancer cohort, patients who resided in SIN appeared to have approximately the same risk of all-cause mortality as the patients who resided in other municipalities of the province. To evaluate how the random effect variable municipality of residence affects the association between residence and mortality, in sensitivity analysis, the model was refitted without random effects. Even in this analysis, residence in SIN was not associated with a higher death rate in patients with lung cancer (HR 0.97, 95% CrI 0.87-1.08).
These results do not seem to be consistent with what is already known in relation to the increased risk for all-cause mortality reported for women and men residing in the SIN of Taranto. In fact, the latest epidemiological studies on the resident population reported an excess relative risk for all-cause mortality of 7% (90% CI 5-9%) in women and 10% (90% CI 8-13%) in men in SIN of Taranto compared to the Apulia region for the years 2013-17 (33). A probable explanation is related to the aforementioned very low overall survival of patients with diagnosed invasive lung cancer (31). Specifically, on the one hand, we suppose that the high absolute case fatality rate in these patients is probably not significantly influenced by environmental pressures once the lung cancer has developed, and therefore, it was observed independently from their residence in SIN. On the other hand, we suggest that this high mortality rate in the lung cancer cohort could basically act as an important competing risk to the other causes of death associated with environmental pressures (e.g., cardiovascular diseases) and mask with its magnitude the excess relative risk for all-cause mortality that has been conversely reported for the general population who reside in SIN (31, 33). Besides, a lung cancer cohort is presumably largely made up of smokers or ex-smokers, and tobacco use increases mortality as well. Therefore, the selection of the cohort conditional on the diagnosis of lung cancer could also have influenced the results, preventing the adverse effect of residence in SIN on mortality from being clearly observed.
Whatever the explanations, these findings confirmed the well-known ethical questions regarding the environmental health issues in the contaminated site, as several epidemiological studies have reported an increased risk for lung cancer incidence, hospitalization, and mortality in the entire population residing in the SIN of Taranto (20, 23, 30, 31, 33, 34). In other words, even if we have not found in the present study a difference in survival related to residing in SIN in patients with diagnosed lung cancers, the development of the disease has been clearly associated with residence in SIN in the entire population. In particular, the latest epidemiological data on resident populations reported in SIN an excess relative risk (reference: Taranto Province) for lung cancer incidence of 91% (90% CI 66-120%) in women and 26% (90% CI 16-36%) in men for the years 2015-19 (31). These data raise another point of reflection. Even if female lung cancer patients present a lower all-cause death rate compared to male patients, and even if the women in SIN present a lower absolute incidence rate for lung cancer compared to men, in the SIN of Taranto, a higher excess relative risk for lung cancer incidence was reported in women compared to men (31). This could explain why, in SIN, a higher excess relative risk (reference: Apulia Region) for lung cancer mortality in the general population for the years 2013-2017 was reported in women compared to men (25% vs. 18%) (33). As discussed above, women in our LAC/LSCC cohort also presented a higher prevalence of LAC compared to men. In this regard, it is worth pointing out the interesting finding of the seemingly higher prevalence of LAC in SIN compared to other municipalities, even if the null value is included in the 95% credible interval (OR 1.25, 95% CrI 0.94-1.64; given the nature of the data and models, the LAC odds ratio is the reciprocal of the LSCC odds ratio reported in Table 2). The same result was observed in the model without random effects. According to a previous meta-analysis (41), the association with particulate matter exposure was significant for LAC incidence and unclear for LSCC incidence. For these reasons, a hypothesis could be that an overall average higher population exposure to environmental pollutants in SIN could be linked to a higher prevalence of LAC. However, since this study used an ecological variable (i.e., residence in the SIN of Taranto) to ascertain exposure to environmental pressures, this approach is potentially prone to ecological fallacy. In addition, there is a lack of specificity in the exposure assessment, as the specific chemical pollutants could be varied and come from different sources in the studied region. Moreover, in addition to gender differences and the well-known pressures of a strictly environmental nature, the two municipalities of Taranto and Statte present relatively high municipality-level deprivation indexes. This metric is a regionally referenced deprivation index that uses individual data of the general population and housing census of 2011. For the calculation of the index, five conditions were chosen by the authors to best describe the multidimensional concept of social and material deprivation: low level of education, being unemployed, living on rent, living in a crowded house, and living in a single-parent family. The index was calculated as the sum of standardized indicators and is also available categorized into quintiles (42). Although an ecological-level adjustment for the deprivation index was to some extent provided by including the municipality of residence in the regression models as a random effect, taking the value of the index itself into consideration for descriptive purposes can also be useful to interpret the results. Regardless, particular attention should be paid to the interpretation of this index because it is an ecological-level indicator and because the latest available index relates to the 2011 census (42). These limits, therefore, do not guarantee that the available deprivation index represents an accurate indicator of deprivation at the individual level in the years covered by the present study.
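The reciprocal relationship between the LAC and LSCC odds ratios mentioned above holds because, in a cohort restricted to LAC and LSCC, the two outcomes are complementary. A minimal sketch of the conversion follows, assuming quantile-based credible intervals (so that the bounds invert and swap); the function name `reciprocal_ratio` is hypothetical, and the computed values are only arithmetic transformations of the figures quoted in the text, so they may differ from Table 2 by rounding.

```python
def reciprocal_ratio(point, lower, upper):
    """Invert an odds ratio and its interval for the complementary outcome.

    Taking the reciprocal reverses the ordering, so the new lower bound
    comes from the old upper bound and vice versa.
    """
    return 1.0 / point, 1.0 / upper, 1.0 / lower


# LAC odds ratio for residence in SIN, as quoted in the text.
lscc_point, lscc_low, lscc_high = reciprocal_ratio(1.25, 0.94, 1.64)
print(f"LSCC OR: {lscc_point:.2f} ({lscc_low:.2f}-{lscc_high:.2f})")
```

An interval that contains 1 before the inversion still contains 1 afterwards, so the conclusion about the null value does not depend on which of the two outcomes is modeled.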
However, it should be taken into account that gender, socioeconomic status, deprivation, and inequalities could not only exert an effect on harmful habits (e.g., cigarette smoking), health conditions, and mortality but also potentially affect the utilization of health services (42, 43). Furthermore, in this regard, the SIN corresponds almost completely to the provincial capital, Taranto, which could potentially influence access to healthcare at territorial and hospital levels, and in terms of regional and extra-regional mobility. This is linked to another limitation of the study, which is the lack of available data about the preventive or diagnostic-therapeutic pathways followed by these patients, including information regarding their access to smoking cessation programs/services. In general, the fact that both SIN and extra-SIN municipalities belong to the same Local Health Authority, and that, consequently, the entire studied cohort could virtually access the same healthcare services, did not lead us to suppose that there could be relevant biases in relation to these aspects. However, there is the possibility that residing in the provincial capital could facilitate early cancer diagnosis due to the higher accessibility of healthcare services to the population. In the same way, residing in SIN and being conscious of the overall average higher lung cancer incidence and mortality could potentially influence access to healthcare services. Conversely, SIN also corresponds to an area with a high level of deprivation, a factor that could potentially exert negative effects on early diagnosis probability, therefore acting in the opposite direction with unclear overall net effects. Socioeconomic deprivation could increase the probability of tobacco use as well. In this regard, another potential limitation of the study could be the lack of information about harmful habits such as smoking or alcohol consumption. These data are unfortunately not available in cancer registries or in health records, and many other published longitudinal studies that use this kind of data lack these details (42-47). However, part of the influence of these factors may have been indirectly captured in the analysis using mixed models with random effects, which take into account the heterogeneity between patients and areas. Moreover, we did not expect this lack of information to be a limitation in the strict sense, as these variables could act not as confounders but rather as mediators between gender and residence in SIN and mortality. In a broad sense, deprivation and tobacco use could be part of a broad range of possible mediators between these factors and mortality, which could comprise sociocultural factors, risky behaviors, diseases and treatments, and biological factors.
To summarize, as mentioned previously, the lack of information about individual-level environmental exposures, socio-cultural indicators, harmful habits, and utilization of healthcare services could represent a limitation of the present study. However, from another perspective, the same elements could also be considered starting points for what can be done in the future. Specifically, it would be interesting to update and expand upon this epidemiological study by recovering further individual-level data about specific environmental exposures (distance from the different polluting sources, exposure to airborne pollutants through dispersion models, and biomonitoring), risky behaviors (cigarette smoking, alcohol abuse, high-fat diet, and physical inactivity), gender-specific pressures and socioeconomic factors (updated indicators of deprivation at individual or census-tract level), and access to prevention programs and diagnostic-therapeutic paths (timing, place, and type of interventions).
In conclusion, the results confirmed the independent prognostic values of different lung cancer characteristics. Despite the limitations discussed above, even after adjusting for patients and disease characteristics, in the cohort of patients with invasive lung cancer, male gender appeared to be associated with a higher prevalence of poorly differentiated cancer and squamous-cell carcinoma, and with an increased death rate.

FIGURE 2
Survival probabilities in the lung cancer cohort, conditional on each analyzed variable and unconditional on other variables. Province of Taranto, 2016-20, follow-up to 31/12/2022. Time: days of follow-up. Outcome (incidence): all-cause death.
Map of the province of Taranto (grid interval: 20 km) (EPSG:32632 - WGS 84 / UTM zone 32N) (Modified from Italian National Institute of Statistics. Administrative boundaries. https://www.istat.it/it/archivio/222527).
As a general rule, baseline patients and disease characteristics refer to the time of diagnosis. Mortality data (all-cause mortality) relative to the follow-up period (2016-2022) were retrieved from the Taranto Province's Causes of Death and Health Registries. Patients with no mortality follow-up information due to extra-provincial transfer before 31 December 2022 (right-censoring, loss-to-follow-up) contributed to the person-time until the date of transfer (n = 8).
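The person-time rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `follow_up`, the three example patients, and the 365.25 days-per-year conversion are assumptions made only for illustration. Each patient contributes time from diagnosis to the earliest of death, extra-provincial transfer, or 31 December 2022, and only an observed death counts as an event.

```python
from datetime import date

# Administrative end of follow-up, as stated in the text.
STUDY_END = date(2022, 12, 31)


def follow_up(diagnosis, death=None, transfer=None, study_end=STUDY_END):
    """Return (days_at_risk, event) for one patient.

    event is 1 if death is observed during follow-up, otherwise 0
    (right-censored at transfer or at the end of the study).
    """
    # Candidate exit dates, each paired with its event flag.
    exits = [(study_end, 0)]
    if transfer is not None:
        exits.append((transfer, 0))
    if death is not None:
        exits.append((death, 1))
    # Earliest exit wins; on a tie, an observed death takes precedence.
    exit_date, event = min(exits, key=lambda e: (e[0], -e[1]))
    return (exit_date - diagnosis).days, event


# Hypothetical patients (illustrative dates, not registry data).
patients = [
    {"diagnosis": date(2016, 1, 1), "death": date(2016, 12, 31)},
    {"diagnosis": date(2020, 1, 1), "transfer": date(2021, 1, 1)},
    {"diagnosis": date(2020, 12, 31)},
]

results = [follow_up(**p) for p in patients]
# Assumed convention: 365.25 days per person-year.
person_years = sum(days for days, _ in results) / 365.25
deaths = sum(event for _, event in results)
crude_rate = deaths / person_years  # deaths per person-year
```

In this sketch, the transferred patient adds time at risk up to the transfer date but no event, which is how right-censored patients enter the denominator of a death rate.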
In this regard, two epidemiological studies on different lung cancer patients' cohorts reported excess relative risks
Baseline patients and disease characteristics and follow-up survival status in the lung cancer cohort, by gender, residence in SIN, and survival status.

TABLE 2
Results of the mixed effects Bayesian INLA binary logistic regression models in the lung cancer cohort, mutually adjusted and adjusted for baseline age class, year, patient ID, and municipality of residence.

TABLE 3
Results of the mixed effects Bayesian INLA Cox proportional hazards regression model in the lung cancer cohort, mutually adjusted and adjusted for baseline age class, year, patient ID, and municipality of residence.
Outcome (incidence): all-cause death. N: incident cases and non-cases. n: incident cases.