Capturing COPD heterogeneity: anomaly detection and parametric response mapping comparison for phenotyping on chest computed tomography

Background Chronic obstructive pulmonary disease (COPD) poses a substantial global health burden, demanding advanced diagnostic tools for early detection and accurate phenotyping. In this line, this study seeks to enhance COPD characterization on chest computed tomography (CT) by comparing the spatial and quantitative relationships between traditional parametric response mapping (PRM) and a novel self-supervised anomaly detection approach, and to unveil potential additional insights into the dynamic transitional stages of COPD. Methods Non-contrast inspiratory and expiratory CT of 1,310 never-smoker and GOLD 0 individuals and COPD patients (GOLD 1–4) from the COPDGene dataset were retrospectively evaluated. A novel self-supervised anomaly detection approach was applied to quantify lung abnormalities associated with COPD, as regional deviations. These regional anomaly scores were qualitatively and quantitatively compared, per GOLD class, to PRM volumes (emphysema: PRMEmph, functional small-airway disease: PRMfSAD) and to a Principal Component Analysis (PCA) and Clustering, applied on the self-supervised latent space. Its relationships to pulmonary function tests (PFTs) were also evaluated. Results Initial t-Distributed Stochastic Neighbor Embedding (t-SNE) visualization of the self-supervised latent space highlighted distinct spatial patterns, revealing clear separations between regions with and without emphysema and air trapping. Four stable clusters were identified among this latent space by the PCA and Cluster Analysis. As the GOLD stage increased, PRMEmph, PRMfSAD, anomaly score, and Cluster 3 volumes exhibited escalating trends, contrasting with a decline in Cluster 2. The patient-wise anomaly scores significantly differed across GOLD stages (p < 0.01), except for never-smokers and GOLD 0 patients. In contrast, PRMEmph, PRMfSAD, and cluster classes showed fewer significant differences. Pearson correlation coefficients revealed moderate anomaly score correlations to PFTs (0.41–0.68), except for the functional residual capacity and smoking duration. The anomaly score was correlated with PRMEmph (r = 0.66, p < 0.01) and PRMfSAD (r = 0.61, p < 0.01). Anomaly scores significantly improved fitting of PRM-adjusted multivariate models for predicting clinical parameters (p < 0.001). Bland–Altman plots revealed that volume agreement between PRM-derived volumes and clusters was not constant across the range of measurements. Conclusion Our study highlights the synergistic utility of the anomaly detection approach and traditional PRM in capturing the nuanced heterogeneity of COPD. The observed disparities in spatial patterns, cluster dynamics, and correlations with PFTs underscore the distinct – yet complementary – strengths of these methods. Integrating anomaly detection and PRM offers a promising avenue for understanding of COPD pathophysiology, potentially informing more tailored diagnostic and intervention approaches to improve patient outcomes.


Introduction
Chronic obstructive pulmonary disease (COPD) remains a major global health burden, ranking as the third leading cause of mortality worldwide (1).Characterized by progressive airflow limitation, COPD predominantly arises from prolonged exposure to harmful airborne particles, particularly in individuals with a history of smoking (2,3).Early detection and accurate phenotyping of COPD are paramount, as timely intervention, including smoking cessation and appropriate treatments, may slow disease progression and improve patient outcomes.Although pulmonary function testing (PFT) and, in particular, gold standard spirometry, play a central role in COPD diagnosis, its ability to detect early-stage disease and reliably characterize its heterogeneity remains limited (4,5).
Recent advances in imaging technology, particularly computed tomography (CT), have been instrumental to gain insights into COPD pathophysiology, by quantifying emphysema and small-airway disease.Several emphysema quantification methods are based on measuring the relative area of the lungs below a specific Hounsfield unit (HU) threshold, i.e., low attenuation areas (LAA), on inspiratory CT scans, showing significant correlations with pulmonary function test (PFT) parameters (6)(7)(8).
While large airways down to the first sub-segmental generations are clearly visible on CT, smaller, more distal subsegmental airways and respiratory bronchioles cannot be detected unless becoming more conspicuous due to mucus retention or peribronchial inflammation.However, small airways disease can be indirectly assessed and quantified with CT scans in expiration, when small airway obstruction results in air trapping (8).Several methods, including measuring the LAA below −856 HU (LAA-856) on expiratory scans, have been proposed for evaluating air trapping (8).However, the optimal CT-based method to assess small-airway disease remains a subject of ongoing debate.In recent years, parametric response mapping (PRM) has emerged as a novel approach to phenotyping COPD by utilizing both inspiration and expiration CT scans (9).PRM allows differentiation between emphysematous and non-emphysematous air trapping regions.For small-airway disease, PRM identifies lung areas with densities greater than or equal to −950 HU on inspiration CT and less than −856 HU on expiration CT.However, PRM's current method for small-airway disease assessment focuses on slight dynamic density changes within each voxel and may not fully consider emphysema's potential contribution to air trapping assessment.As it is a mutually exclusive voxel-wise method, each voxel is assigned exclusively to either the emphysematous or non-emphysematous category, potentially oversimplifying the intricate interplay between emphysema and small-airway disease.Additionally, its method's dependence on fixed thresholds introduces limitations in capturing variations across diverse patient populations, while the reliance on registration may be susceptible to specific methods and anatomical variations.Moreover, the sensitivity to CT-protocol variations poses a challenge in ensuring consistent and reproducible results across different imaging settings.
With advances in deep learning (DL) based artificial intelligence (AI), several algorithms have been developed to target the direct interpretation of CT scans, without the need to pre-define COPD features of interest.While the majority relies on supervised learning (10-14), self-supervised (15) and unsupervised methods (16) have been gaining a lot of attention.Their ability to capture complex disease heterogeneity without the need for explicit annotation, makes them particularly advantageous in scenarios where manual labeling may be challenging or subjective.Moreover, these methods have the potential to discover novel disease subtypes or manifestations that may not be predefined in the training dataset, enhancing their adaptability to the diverse and evolving nature of COPD.
In particular, Almeida et al. (15) introduced a self-supervised DL anomaly-detection approach that identifies COPD lung regions as anomalies, reflecting the varied manifestations of COPD across different phenotypes, including large and small airway disease, parenchymal scars, and emphysema.The method harnesses informative latent representations and a generative model to identify deviations from the distribution of normal-appearing lung regions of individuals without airflow obstruction.These deviations are then presented as a lung anomaly map with anomaly scores.Importantly, it has demonstrated its value in distinguishing normal individuals from those with COPD and predicting lung function decline (17).While being a promising tool for the assessment of the severity of COPD, the regional anomaly scores derived from this method have not been further explored.Hence, our study aims to compare the quantitative and spatial relationships between PRM functional small airway disease and emphysema measurements to regional anomaly scores.This can further provide critical insights into COPD heterogeneity and implications for personalized patient care.

Materials and methods
The objective of our study, depicted in Figure 1, was to perform a quantitative and qualitative comparison of spatial relationships between traditional (1) Parametric response mapping (PRM) volumes and the (2) Self-Supervised Anomaly Detection method (15,17).Our hypothesis posits that the anomaly scores derived from self-supervised learning align with PRM volumes, and their associations with clinical metrics are comparable.To enhance our understanding of what the self-supervised method learns, we also employ (3) Principal Component Analysis (PCA) & Clustering on the self-supervised representation level.This approach identifies stable clusters, allowing us to compare them with the aforementioned two methods.

Study cohort
In the present study, we utilized the Genetic Epidemiology of COPD (COPDGene) study.The COPDGene study (ClinicalTrials.govIdentifier NCT00608764) recruited never-smoker controls and current and former smokers, who had a smoking history of ≥10 packyears.The enrollment period was conducted between 2008 and 2011, targeting individuals aged 45-80 years.
Comprehensive assessments were performed, including paired chest CT scans during inspiration (Insp) and expiration (Exp), pulmonary function tests (PFT), and questionnaire evaluations.Specific details regarding the CT acquisition protocol can be found in Almeida et al. (17).
Ethical approval was obtained, and written consent was acquired from all participants after the study protocol received approval from the respective clinical center's review board.The inclusion and exclusion criteria were previously described in Almeida et al. (15,17).
In accordance with the established protocol defined in Almeida et al. (15,17), the dataset was partitioned into distinct sets for training, evaluation and testing.

Parametric response mapping
Parametric response mapping (PRM) was applied to the paired Insp and Exp CT scans as described in the original work (9) of the test set.This process categorized the lung parenchyma into functional small-airway disease (PRM fSAD ), emphysema (PRM Emph ), and normal lung (PRM Normal ).To minimize the contribution of airways and vessels, specific minimum and maximum attenuation values were defined for both scans.PRM Emph was defined by voxels between −1,000 HU and − 950 HU in the inspiratory CT and between −1,000 HU and −856 HU in the expiratory CT scan.PRM fSAD was defined by voxels between −950 HU and −810 HU in the inspiratory CT and − 1,000 HU and −856 HU in the expiratory CT.Lastly, PRM Normal was defined by voxels between −950 HU and −810 HU in the inspiratory CT and −856 and −500 HU in the expiratory CT scan.The top section of Figure 1 illustrates this step.
Following the PRM analysis, the subsequent sections detail the application of (2) Anomaly Detection and (3) PCA & Clustering, both involving two common steps: quantitative pre-processing and extraction of self-supervised latent features.

Extraction of self-supervised latent features
Following the extraction of 3D patches, the objective was to create a representation per patch that encapsulated relevant information related to its imaging features.This representation vector, of size 1 × 512, was later employed for both (1) Anomaly Detection and (2) PCA & Clustering.The self-supervised learning strategy described in Almeida et al. (15,17) was employed, where a subset of the COPDGene subjects was used to train a self-supervised contrastive model, without using any labels.This model is based on the idea of learning representations that maximize the agreement between differently augmented views of the same region via a contrastive loss, i.e., attracting regions that look similar and repel the ones that do not.Based on this, informative latent representations were generated per patch, as illustrated in Figure 1.

Visualization of the latent space: t-distributed stochastic neighbor embedding
For understanding the information conveyed in the selfsupervised latent features of each region and for visualization purposes, t-Distributed Stochastic Neighbor Embedding (t-SNE) (18) was applied as a non-linear dimensionality reduction technique.t-SNE is a common dimensionality reduction technique that maps high-dimensional data into a lower-dimensional space while preserving the pairwise similarities between data points.In order to preserve better global structure, non-standard affinity methods were initialized with the evaluation set, and later transformed to the test set.This was implemented using the openTSNE package (19).

Anomaly detection
Subsequently, having the patch-level latent representations, the anomaly detection model was applied, as illustrated in Figure 1.
The model aimed to quantify the degree to which a specific region or patient deviates from the pre-defined "normality, " as defined in Almeida et al. (17).This normative baseline was defined by lung regions with less than 1% emphysema from individuals without airflow obstruction (never-smoker controls and GOLD0).The distribution of these "normal/healthy" latent features, as derived from the self-supervised contrastive approach, serves as the reference for the model.Region, i.e., patch-wise, anomaly scores are computed using the negative log-likelihood.Patient-level anomaly scores are subsequently obtained by aggregating scores from all regions.Further details can be found in Almeida et al. (15,17).
For visualization purposes of the anomaly lung map, min-max normalization was applied to the anomaly scores, corresponding to the 5th and 95th percentiles of the dataset.

Principal component analysis and clustering
Principal Component Analysis (PCA) & Clustering served as a direct comparative approach to the anomaly detection model, as both operate on the self-supervised latent features of the test set.These representations were obtained by a region-similarity approach, which grouped similar patterns together (whether region-or intensity-wise).The objective with the PCA & Clustering was to explore and identify clusters of regions sharing similar characteristics within the informative latent features, and directly compare it to the anomaly detection method and to the reference PRM volumes.

PCA on the self-supervised latent features
PCA was applied to the latent vectors of each region of the test set, in order to mitigate multicollinearity along the 512 features.The eigenvectors of the covariance matrix were analyzed.Then to reduce the dimension space, the dataset was projected onto the first few uncorrelated principal components, representing dominant eigenvectors of the covariance matrix.The optimal number of Principal Components (PCs) was decided based on Horn's parallel analysis (20), through a Scree Plot, and based on the Kaiser Criterion.

Clustering the PCA
Once determined the subset of PCs to retain, cluster analysis was conducted on the dataset.Various clustering methods, including K-means and Gaussian finite mixture model-based methods were compared using Silhouette, Davies-Bouldin and Calinski and Harabasz scores (21).The Silhouette score aimed for maximization, providing a measure of the separation among different clusters.Davies-Bouldin, targeted for minimization, assessed the similarity of each cluster to its next closest neighbor.Calinski and Harabasz, also targeted for minimization, estimated the cohesion and separation of points within a cluster, based on the distance between cluster centroids.Based on these scores, the optimal clustering method and number of clusters were determined and applied for further analysis.Further information is provided in the Supplementary materials.However, it's important to note that these clusters were primarily generated for comparison purposes.They serve as a reference point to assess the efficacy of the anomaly detection method, since both employ the same self-supervised latent features.

Statistical analysis
Comparisons among PRM volumes, Patient-wise Anomaly Scores and each class of Anomaly Region Clusters were made according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages using the Jonckheere-terpstra test.A post-hoc Tukey-test was then applied for multiple pairwisecomparisons between GOLD stages.
The relationships between pulmonary function tests, clinical data and each PRM class, anomaly scores and cluster groups were evaluated through Pearson's correlation coefficient.Confidence intervals (CI) were calculated using bootstrap resampling method on 10,000 samples.Pulmonary function tests and clinical data included parameters such as FEV1% predicted, FEV1/FVC, Functional Residual Capacity (FRC), Total Lung Capacity (TLC), FRC/TLC, BODE index (body mass index, air-flow obstruction, dyspnea, exercise capacity), St. George's Respiratory Questionnaire (SGRQ), the 6-min-walking-test (6MWT) and smoking duration.Correlations were interpreted as follows: 0.00-0.10(negligible), 0.10-0.39(weak), 0.40-0.69(moderate), 0.70-0.89(strong), and 0.90-1.00(very strong) (22).Differences between correlation coefficients were assessed via the R package "cocor" (23), utilizing the Zou et al. (24) method.This involved calculating the difference between correlation coefficients for each pair of groups and determining a 95% confidence interval (CI) for that difference.If the CI included zero, the null hypothesis that the two correlations are equal was retained; if the CI did not include zero, the null hypothesis was rejected.
Linear mixed effects models (LMM) were utilized to predict clinical parameters based on the PRM volumes, with adjustments made for relevant covariates including age, gender, body mass index (BMI), smoking status (0: never-smoker control, 1: former smoker, and 2: current smoker), smoking duration, and a random term for the study site.To assess the contribution of the anomaly score beyond morphological lung changes in predicting clinical variables (FEV1%, FEV/FVC, FRC, TLC, FRC/TLC, BODE, SGRQ, and 6MWT), it was then introduced as an additional predictor in the PRM-adjusted LMM models.The overall conditional coefficient of determination (R2), adjusted for the number of regressors, was reported.Models were compared through the likelihood ratio test of nested models.
The agreements between PRM relative volumes and volumes obtained from the Cluster groups were assessed using the Bland-Altman method (25).A p-value of <0.05 was considered as statistically significant and adjustments were made for multiple comparisons, using the Holm-Bonferroni method, when applicable.
All statistical analyses were performed using R software version 4.3.4(R Foundation for Statistical Computing, Vienna, Austria).

Results
Patient characteristics are summarized in Table 1.

Visualization of the latent space: t-distributed stochastic neighbor embedding
The comprehensive representation of the lung regions in the latent space, achieved through self-supervised contrastive learning, is visually depicted using t-SNE (Figure 2).Each point represents a feature vector (1 × 512) per lung region (3D patch).A clear separation between regions with and without emphysema and gas trapping is consistently illustrated.Also, regions exhibiting less than 1% Emphysema and less than 1% Air Trapping are distinctly discernable from diseased regions.Notably, factors such as gender information do not contribute to the separation of these latent features (Supplementary Figure S4).

PCA and cluster analysis
PCA applied to the region-wise latent vectors determined that 85 factors should be retained, meeting the criteria of obtained eigenvalues surpassing those from random data, as per Horn's parallel analysis and the Kaiser Criterion (Supplementary Figure S1).Supplementary Figure S2 provides a comparison between the number of clusters and the method applied to the 85 retained PCs.Cluster analysis identified four stable clusters (Clusters 1, 2, 3, and 4) through K-means with a mini-batch size of 1,000.More details about the clusters are available in the Supplementary materials.

Qualitative results
Figure 3 showcases representative CT images individuals from all GOLD grades.The first and second column represent a coronal slice of the inspiratory and expiratory CT scans, while the following columns are the visualization of PRM maps, maps of the anomaly score and cluster classes, overlayed on the inspiratory CT. Figure 4 represents the spatial overlap between region-wise anomaly scores and PRM Emph and PRM fSAD for two former-smoker individuals: GOLD 0 (A) and GOLD 3 (B), with more than 30 years of smoking history.

PRM volumes, anomaly score, cluster groups by GOLD score
Figure 5 depicts a comparison between PRM volumes, anomaly score and relative size of the 4 clusters, relatively to the GOLD stage.Significant differences were found between PRM volumes, anomaly scores and all cluster classes according to the GOLD stage (Table 2), except for class 4. With an increase in the GOLD stage, PRM Emph , PRM fSAD , Anomaly Score and relative volume of cluster 3 increased, while cluster 2 decreased.Post-hoc Tukey analysis (Supplementary Table S1) revealed significant intergroup differences in the anomaly score between all GOLD stages (p < 0.01 corrected), except between controls and GOLD 0. No differences were observed between controls and GOLD 0, controls and GOLD 1, and GOLD 0 and GOLD 1 for PRM Emph ; and controls and GOLD 0 and GOLD 4 and GOLD 3 for PRM fSAD .Cluster 1 showed significant differences between high grades COPD (GOLD 3 and 4), and Cluster 2 between GOLD 2 and 3. Notably, statistically significant differences between never-smoker controls and GOLD 0 subjects were only found by Cluster 3.

Relationships between PRM volumes, anomaly score, cluster groups
Table 3 shows the Pearson correlation coefficient and p-values corrected with Holm-Bonferroni for PRM volumes, anomaly score and cluster groups.Each PRM volume showed statistically significant correlations with PFTs and clinical data.Both PRM fSAD , PRM Emph , Anomaly score and Cluster 3 are negatively correlated with spirometry volumes (FEV1 and FEV1/FVC) and with the walking distance, and positively correlated with FRC, TLC, FRC/ TLC, BODE and smoking duration.Pearson correlation coefficients for the anomaly score are moderate, with the exception of FRC and smoking duration, and very well comparable with the ones from PRM volumes, while less strong for the Cluster 3. The anomaly score showed significantly higher correlations than PRM fSAD for SGRQ and the distance walked in the 6-min walking test; and stronger than PRM Emph for FRC/TLC and smoking duration.No significant differences were found between correlations of the anomaly score and PRM fSAD for FEV1, FEV1/FVC, BODE and smoking duration; and between the anomaly score and PRM Emph for FEV1, FRC, SGRQ and the distance walked in the 6-min walking test.The anomaly scores also showed significantly higher correlations than all Clusters for all clinical variables, except for TLC.Clusters 1 and 2 follow the same trends as the PRM Healthy volumes: positively correlated with FEV1, FEV1/FVC and walking distances and negatively correlated with FRC, TLC, FRC/TLC, BODE, SGRQ, and smoking duration.
Figure 6 illustrates linear relationships between patient-wise anomaly scores and PRM-derived emphysema and fSAD volumes.The anomaly score shows to be significantly correlated with PRM Emph (r = 0.66, p < 0.01) and PRM fSAD (r = 0.61 p < 0.01).As depicted in Table 4, adding the anomaly score to the PRM baseline models statistically improves the fit of all LMM to predict clinical measures, including FEV1, FEV/FVC, FRC, FRC/TLC, BODE, SGRQ, and 6MWT (only to PRM fSAD ).No differences were found for predicting the TLC and the 6MWT (for PRM Emph ).
Bland-Altman plots for the volume (log transformed) agreement of PRM-derived volumes and clusters which showed significant correlation to the PFTs and clinical data (Clusters 1, 2, 3) are presented in Figure 7 showing the differences (D) against averages (A).Since the Bland-Altman plots displayed non-constant bias even after log transformation of the measurements, a regression-based approach (25) was used to compute the bias and limits of agreement (LoA).The mean differences are given by , 1 .Thus, the degree of agreement is not constant across the range of measurements.

Discussion
In this study, we aimed to explore the potential of a deep-learning self-supervised anomaly detection method for phenotyping Chronic Obstructive Pulmonary Disease (COPD) using computed tomography (CT) scans of 1,310 never-smoker controls and GOLD 0-4 from the COPDGene cohort.COPD remains a significant global health burden, necessitating early detection and accurate phenotyping for effective intervention.While traditional methods like parametric response mapping (PRM) have provided insights into COPD pathophysiology (27,28), they may not fully capture the complexity and heterogeneity of the disease.This present study marks the first of its kind, comparing PRM-derived functional small-airway disease and emphysema measurements against regional anomaly scores derived from a recently proposed self-supervised deep-learning anomaly detection approach.
Before delving into the comparison itself, it was important to understand what spatial distribution is encoded by the self-supervised latent features.As unveiled through t-distributed Stochastic Neighbor Embedding (t-SNE) analysis, it showcases a notable attraction among similar disease-related areas, concurrently distancing healthy regions from those corresponding to healthy patients.Notably, this visualization is rooted in the self-supervised contrastive learning of lung regions and, as such, does not rely on labels.It is noteworthy that this approach effectively captures features stemming from emphysema and gas trapping, while factors such as gender information do not contribute to this attraction.Furthermore, an intriguing observation surfaces in COPD patients, where even regions with low levels of emphysema exhibit spatial resemblances with areas demonstrating higher emphysema levels.This pattern potentially hints at underlying features beyond emphysema that play a role in clustering these regions together, possibly indicating the presence of early-stage COPD markers that could progress into advanced emphysematous changes.
Investigating the spatial relationships between small airway disease and emphysema manifestations captured by PRM alongside regions identified as anomalies by the self-supervised approach has revealed interesting insights.Traditionally, the progression of COPD involves the narrowing or destruction of small airways preceding the Visualization is colored by regions with less than 1% emphysema from healthy individuals (green), regions with more than 1% emphysema from healthy individuals (dark green), regions with less than 1% emphysema from COPD individuals (yellow) and regions with more than 1% emphysema from COPD individuals (red).(C) Same as in B but related to airway disease.Healthy individuals were defined as controls and GOLD 0, while COPD individuals as GOLD 1, 2, 3, or 4.  29), a transition not easily detectable using conventional spirometry or visible on CT scans (27,30).Our exemplary patients' regions, colored solely by PRM fSAD , often exhibit areas with higher anomaly scores in the anomaly maps, as illustrated in Figures 3, 4A.This observation raises the intriguing possibility that the anomaly detection approach serves as an additional means to Visual comparison of PRM maps, anomaly score and cluster volumes on exemplary subjects across several GOLD stages (0-4).As the severity increases (GOLD score), so do the areas detected as emphysema and functional small airway disease by PRM.On the same fashion, the region anomaly scores also shift from green to red for more severe cases.Interestingly, these areas overlap PRM Emph and PRM fSAD .Particularly for low GOLD levels, it attributes higher scores to PRM fSAD areas, possibly indicative of differential progressive features.Lastly, the cluster maps reveal four aggregations, indicative of different disease components.Exemplary coronal sections of overlapping of PRM and anomaly maps for two former-smoker individuals.Example (A) is a 64 years-old former-smoker female who does not fulfill the criteria for COPD (GOLD 0; FEV1% post-bronchodilator 126% of predicted value; FEV1/FVC 78%; smoking history of 30 years).Example (B) is a 72 years-old former-smoker male with severe COPD (GOLD 3; FEV1% post-bronchodilator 31% of predicted value; FEV1/FVC 34%; smoking history of 37 years).Top left corresponds to the inspiratory scans, top right to the PRM maps, bottom left to the expiratory registered scan, bottom right to the anomaly score map.The bigger image corresponds to the overlapping of the anomaly maps to the PRM maps, in which regions deviating from the norm are overlapping with PRM fSAD .The closer to red, the higher the anomaly detected.PRM healthy tissue (green), PRM functional small-airway disease (yellow), and PRM emphysema (red). 10.3389/fmed.2024.1360706 Frontiers in Medicine 09 frontiersin.orghighlight transitional stages in the disease process, paramount for understanding early pathological changes (31).It may also suggest that small airway disease undergoes dynamic changes before evolving into more advanced stages of emphysema, particularly relevant for individuals with normal lung function (GOLD 0, Figure 4A).Therefore, more importantly than the spatial similarity, the divergence    4).Further investigations, including the assessment of normal CT voxels alongside CT airway measurements and PRM analyses, are warranted to elucidate the underlying pathophysiological mechanisms responsible for these observations and to validate these findings in prospective longitudinal studies.Furthermore, our study revealed a significant positive correlation between the patient-wise anomaly scores and PRM-derived fSAD (r = 0.61, p < 0.01) and Emphysema volumes (r = 0.66, p < 0.01).Overall, these findings underscore the complementary nature of the two approaches and the potential benefit of integrating them to achieve a more comprehensive assessment of COPD heterogeneity.Moreover, our study reveals a parallel finding to Hwang HJ et al. (32), where a novel emphysema air-trapping composite (EAtC) has been proposed.In particular, their functional air trapping (fAT) component captures air trapping in both emphysematous and non-emphysematous areas.Just as in their work, our investigation suggests that employing the anomaly detection comprehensive approach yields better or comparable correlations with clinical variables (SGRQ, 6MWT; =FEV1, =FEV1/FVC, =BODE and = smoking duration) compared to conventional PRM-based small-airway disease assessments, which focus solely on non-emphysematous air trapping.Similarly, the anomaly score exhibited stronger correlations than PRM Emph for parameters like FRC/ TLC and smoking duration, while no differences were found for FEV1, FRC, SGRQ and the distance walked in the 6-min walking test.This suggests the ability of the anomaly detection approach in characterizing small-airway disease and emphysema more comprehensively, potentially encompassing aspects that conventional methods might overlook.Subsequent analysis confirmed this hypothesis, as adding the anomaly score to LMM adjusted for both PRM volumes, significantly improved the prediction of clinical variables (FEV1, FEV/FVC, FRC, FRC/TLC, BODE, SGRQ).This provides further evidence that the anomaly score captures nuanced features beyond the structural characteristics from the PRM analysis and underscores their complementary nature.
While the debate surrounding air trapping in emphysematous regions remains, evidence from both Hwang HJ et al. (32) and our study supports the notion that small-airway disease is not confined solely to areas with preserved alveoli but can coexist within emphysematous zones as well.This duality emphasizes the interplay between different disease manifestations and highlights the potential for anomalies, as captured by the anomaly detection approach, to encompass both emphysematous and non-emphysematous regions.Additionally, it may also explain why, in contrast to PRM fSAD and PRM Emph , the patient-wise anomaly score significantly differs across all GOLD stages (p < 0.01), except between never-smoker controls and GOLD 0.
Our cluster analysis and visualization of PRM-derived regions have illuminated distinct patterns within the dataset.By employing principal component analysis (PCA) for dimensionality reduction, we successfully identified four stable clusters that exhibited variations corresponding to different stages of the disease.Notably, Cluster 1 appears to embody regions that persist consistently across all GOLD stages, indicative of common characteristics shared across disease phenotypes.Cluster 2, on the other hand, emerges as a distinct subset that represents healthy regions and unaffected by the disease, potentially explaining its volume decline as disease severity increases.The most intriguing finding lies in Cluster 3, where regions manifest features closely aligned with PRM-derived fSAD (Figure 7), which suggests an association with early-stage COPD characteristics.It is conceivable that Cluster 1 represents areas that have never experienced the impact of COPD, while Cluster 3 may signify regions embarking on the initial stages of disease progression.This interpretation finds support in the progressively increasing representation of Cluster 3 across GOLD classes, hinting at its role in the evolution of the disease from milder to more advanced stages.Although the correlations observed between PFTs and PRM-derived Emphysema and fSAD classes are superior to Cluster The non-constant bias observed in the Bland-Altman analysis suggests that the agreement between PRM-derived volumes and cluster volumes is not uniform across the entire range of measurements.This non-uniformity may stem from inherent differences in how PRM and the clustering method characterize and quantify lung regions.The regression-based equations provide a nuanced understanding of this non-constant bias, indicating that the discrepancy between methods is influenced by the magnitude of the measurements.These findings may imply that the clusters, despite their ability to capture distinct patterns related to COPD stages, may not uniformly agree with PRM-derived volumes across all levels of disease severity.The varying agreement observed in different parts of the measurement range could be attributed to the complex and heterogeneous nature of COPD, where different phenotypes and disease manifestations may impact the agreement between methods differently.Moreover, it's important to acknowledge an unexpected discrepancy observed in the distribution of clusters between the left and right lungs.Visual inspection of the original inspiratory and expiratory CT scans did not consistently support the distinct left/right lung asymmetry detected by the clustering algorithm.Notably, we observed an apparent concentration of clusters 3 and 4 in the left lung and clusters 1 and 2 in the right lung.This left/right lung asymmetry prompts caution in the interpretation of cluster-specific findings.While our primary objectives center around the comparative analysis between PRM and anomaly detection, this unexpected observation underscores the need for transparency regarding potential sources of bias in the clustering component.This information adds a layer of transparency to our study, acknowledging that the agreement between PRM and clustering might be influenced by factors that vary across different regions of the lung or disease states.Importantly, it's worth noting that the anomaly score, which consistently demonstrated significant correlations with PRM-derived measurements and clinical data, exhibited a more uniform agreement, providing a robust and complementary perspective on COPD phenotyping.Our findings also indicate that the anomaly detection method provides a more robust and complementary perspective on COPD phenotyping compared to Clustering.This is demonstrated by statistically significant higher correlations between anomaly scores and clinical variables, along with larger effect sizes.Anomaly detection's ability to identify deviations from the distribution of "normal" samples makes it better suited for capturing subtle variations indicative of disease pathology, contrasting with Clustering's reliance on proximity-based grouping, while based on the exact same latent features.Furthermore, while our findings indicate a strong potential for anomaly detection in COPD phenotyping, further research is needed to establish its generalizability across diverse populations.Additionally, our current exploration does not delve into distinguishing COPD from other comorbidities, such as lung cancer, warranting careful consideration and validation in future investigations.
In conclusion, we introduce the anomaly detection approach as a novel perspective in COPD phenotyping, extending the lens beyond threshold-based analysis and methodologies focused on distinct imaging features.By identifying anomalies spanning a spectrum of disease features, including those beyond functional small-airway disease and emphysema, we unveil a more comprehensive viewpoint, offering an additional layer of information that goes beyond traditional PRM phenotyping.The resulting insights offer a window into the intricate distribution of the disease.This transparency in AI diagnostics empowers clinicians to personalize interventions while supplementing established methods.By integrating this innovative approach into clinical practice, we lay the groundwork for refined diagnostics and tailored interventions, ultimately leading to enhanced patient outcomes.

Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article.This research was funded by the State Ministry of Baden-Wuerttemberg for Sciences, Research and Arts, Germany grant number 32-5400/58/3 and by Helmholtz Imaging (HI), a platform of the Helmholtz Incubator on Information and Data Science.The COPDGene study was funded by National Institutes of Health grants U01HL089856 and U01 HL089897 and also supported by the COPD Foundation through contributions made by an Industry Advisory Board comprised of Pfizer, AstraZeneca, Boehringer Ingelheim, Novartis, and Sunovion.

FIGURE 1
FIGURE 1Methodology pipeline.A subject is composed of paired inspiratory and expiratory CT scans which are analyzed by 3 methods: (1) Parametric response mapping (PRM), which relies on the spatial alignment of the two scans and defines emphysema and airway disease areas based on strict thresholds; (2) Anomaly detection, which attributes region-wise anomaly scores based on the distribution of normal lung features, defined by a self-supervised contrastive method; (3) Principal Component Analysis (PCA) & Clustering, which applies dimensionality reduction (PCA) to the latent features from the self-supervised contrastive method to find stable clusters.All three methods produce lung maps, which were then compared qualitatively and quantitively.

FIGURE 2 t
FIGURE 2t-Distributed Stochastic Neighbor Embedding (t-SNE) visualizations of the self-supervised contrastive latent space vectors.Each dot represents a region (3D ROI) or more specifically, an embedding of its latent representation into a two-dimensional space, and its color represents a clinical or radiological characteristic (GOLD, emphysema and air trapping measures at the patch-level).The dotted regions in the t-SNE maps emphasize distinct groups (bottom -healthy, center right and upper right -diseased), serving as illustrative examples.For each dotted region, several examples are provided to illustrate representative inspiratory patches extracted from those dotted regions.Readers are guided to recognize consistent groups positioning across visualizations, enhancing overall interpretation of the t-SNE plots.(A) Visualization is colored by regions with and without emphysema and airway disease.(B) Visualization is colored by regions with less than 1% emphysema from healthy individuals (green), regions with more than 1% emphysema from healthy individuals (dark green), regions with less than 1% emphysema from COPD individuals (yellow) and regions with more than 1% emphysema from COPD individuals (red).(C) Same as in B but related to airway disease.Healthy individuals were defined as controls and GOLD 0, while COPD individuals as GOLD 1, 2, 3, or 4.
development of emphysema (

FIGURE 5
FIGURE 5Distribution of PRM emphysema (A), PRM airway disease (fSAD) (B), patient-wise anomaly score (C) and cluster classes (D), according to the GOLD stage.While PRM Emph , PRM fSAD and the anomaly score clearly increase with the disease severity, the relative volumes of the cluster classes showed distinctly different distributions according to GOLD severity; cluster 3 increases, whereas cluster 2 decreases with an increase in GOLD stage.GOLD, Global Initiative for Obstructive Lung Disease.

TABLE 3
Correlation of PRM volumes, anomaly score, and cluster groups with PFTs and clinical data.
3, Cluster 3 is the only able to differentiate never-smoker controls and GOLD 0 subjects.Amidst these notable clusters, Cluster 4 adds a layer of complexity.While present in a smaller subset of patients (n = 24), notably it covers over 50% of the lung in 22 individuals, showing distinct clinical features like high gas trapping (31.3 ± 23.4%) and extensive smoking history (33 ± 12 years).Still, the decision to retain it was supported by robust clustering metrics detailed in the Supplementary materials.