ORIGINAL RESEARCH article

Front. Immunol., 28 November 2025

Sec. Cancer Immunity and Immunotherapy

Volume 16 - 2025 | https://doi.org/10.3389/fimmu.2025.1690829

This article is part of the Research Topic: Novel Immune Markers and Predictive Models for Diagnosis, Immunotherapy and Prognosis in Lung Cancer

DDR1 as a key prognostic biomarker in non-small cell lung cancer: identification, validation, and potential therapeutic implications

RenCai Lu1,2,3, Lu Qian4, XueQin Sun3, JianFang Zhang3, YeQian Cui3, HuiMei Su5, DongDong Xv3*, ShaoBo Wang3*
  • 1Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, China
  • 2School of Medicine, Kunming University of Science and Technology, Kunming, China
  • 3Department of Nuclear Medicine, The First People’s Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
  • 4Department of Pathology, The First People’s Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
  • 5Department of Nuclear Medicine, No.926 Hospital, Joint Logistics Support Force of PLA, Kaiyuan, Yunnan, China

Background: Non-small cell lung cancer (NSCLC) remains the leading cause of cancer-related death, with a limited response to immune checkpoint inhibitors (ICIs). Discoidin domain receptor 1 (DDR1) is a collagen-binding kinase that is implicated in tumor progression and immune escape, but its role in NSCLC is unclear. This study aimed to clarify the clinical significance and therapeutic potential of DDR1 via bioinformatics, machine learning, in vitro experiments, and clinical sample analysis.

Materials and Methods: NSCLC patients were stratified by DDR1 expression based on retrospective RNA-seq data from The Cancer Genome Atlas (TCGA); after quality control, 495 lung adenocarcinoma (LUAD) and 481 lung squamous cell carcinoma (LUSC) tumor samples, together with 57 LUAD and 48 LUSC normal samples, were retained for further analysis. The analyses included survival, mutation, immune landscape, drug sensitivity, single-cell heterogeneity, and functional Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses, accounting for the heterogeneity among NSCLC subtypes. A machine learning-based 4-gene prognostic model was constructed and externally validated using two independent datasets: GSE30219 (GPL570 platform, 278 NSCLC samples) and GSE41271 (GPL6884 platform, 274 NSCLC samples). The biological functions of DDR1 were further evaluated via in vitro assays and immunohistochemistry (IHC) of clinical samples.

Results: High DDR1 expression was correlated with advanced T stage (P<0.05), poor progression-free survival (PFS) (HR = 1.62, P<0.001), and an immunosuppressive microenvironment. Drug sensitivity analysis revealed that high DDR1 expression was associated with reduced sensitivity to methotrexate but increased sensitivity to vinblastine, doxorubicin, cisplatin, and docetaxel, with no significant difference observed for gefitinib. Single-cell heterogeneity analysis revealed that DDR1 was enriched in tumor-associated macrophages and neutrophils. A LASSO and random survival forest (RSF) machine learning algorithm revealed a 4-gene signature (PKP2, DKK1, TEF, and GJB5) with strong prognostic value (C-index=0.728). DDR1 knockdown suppressed cell proliferation, migration, and invasion and induced apoptosis in NSCLC cell lines. IHC of clinical samples confirmed DDR1 overexpression in 55.88% of NSCLC patients.

Conclusion: Our study demonstrated that DDR1 promotes tumor progression and immune evasion and is frequently overexpressed in NSCLC patients, suggesting that DDR1 is a potential prognostic biomarker and therapeutic target.

1 Introduction

Lung cancer incidence has risen from 11.4% to 12.4% of all cancers worldwide between 2020 and 2022, with its mortality increasing from 18.0% to 18.7% of global cancer deaths, making it the most frequently diagnosed malignancy. Non-small cell lung cancer (NSCLC) accounts for approximately 85% of all cases (1–3). The majority of NSCLC patients are diagnosed at an advanced or even metastatic stage, with a five-year survival rate of less than 15% (4, 5). Despite advances in conventional, targeted, and immune checkpoint therapies, patient outcomes remain variable due to individual differences and tumor escape mechanisms (4, 6, 7). Currently, the most widely employed immunotherapy strategy in clinical practice is PD-1 and CTLA-4 immune checkpoint inhibition. However, nearly half of patients with NSCLC do not respond to ICIs, and some patients with high expression levels fail to respond to monotherapy (8). Despite significant advances in immunotherapy, available clinical biomarkers remain suboptimal, as none possess sufficient sensitivity or specificity to reliably predict responders or non-responders (8). The shortcomings associated with these treatments underscore the urgent need for reliable prognostic biomarkers to guide personalized treatment strategies.

Discoidin domain receptor 1 (DDR1), a collagen-binding receptor tyrosine kinase encoded on chromosome 6p21.3, promotes tumor progression and immune exclusion by regulating cell motility, extracellular matrix remodeling, and collagen alignment that restricts immune cell infiltration (9, 10). Aberrant DDR1 activation has been implicated in multiple cancers, including pancreatic, cervical, hepatic, and breast malignancies (11–16). In a triple-negative breast cancer mouse model, DDR1 knockout enhanced CD8+ and CD4+ T-cell infiltration, increased IFN-γ production, disrupted peripheral collagen organization, and suppressed tumor growth (10). However, another study reported that pharmacological inhibition or genetic deletion of DDR1 increased tumor burden and inadvertently promoted a protumorigenic microenvironment (17). These findings suggest that while DDR1 is a potential therapeutic target, its role in shaping the tumor microenvironment (TME) and clinical implications in NSCLC remain unclear and require further investigation. Unlike conventional biomarkers such as PD-L1 with limited predictive accuracy, DDR1 may provide complementary or enhanced predictive value through its dual role in extracellular matrix remodeling and immune exclusion.

To address these gaps, we employed integrative bioinformatics approaches to comprehensively characterize DDR1 expression patterns, genetic alterations, immune landscape associations, and prognostic significance in NSCLC, supported by machine learning and external validation (18).

In this study, we identified DDR1 as a key oncogenic driver in NSCLC that modulates the immune microenvironment and influences immunotherapy response. Using machine learning, we developed and externally validated a robust prognostic model for clinical outcome prediction. DDR1 expression was further confirmed in clinical samples, and in vitro experiments demonstrated that DDR1 knockdown influences NSCLC cell function. These findings establish DDR1 as a key oncogenic regulator in NSCLC progression and offer new insights into therapeutic strategies targeting DDR1.

2 Materials and methods

2.1 Data acquisition and preprocessing

RNA-seq data for TCGA-LUAD and TCGA-LUSC cohorts were obtained from the TCGA website (https://portal.gdc.cancer.gov/) and converted to TPM, then log2(TPM + 1) transformed. After excluding samples with missing follow-up or incomplete clinical and staging information, 495 LUAD tumor and 57 control samples, and 481 LUSC tumor and 48 control samples were retained. These were combined as NSCLC data for further analysis. NSCLC datasets with follow-up data, GSE30219 (19) and GSE41271 (20), were downloaded from GEO (https://www.ncbi.nlm.nih.gov/geo/) via GEOquery (https://bioconductor.org/packages/release/bioc/html/GEOquery.html, v2.70.0) (21). GSE30219 (GPL570 platform) includes 278 NSCLC samples, and GSE41271 (GPL6884 platform) includes 274, both used for external validation.
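The normalization step described above can be illustrated with a minimal Python sketch. It assumes raw read counts and effective gene lengths are available per sample; the function name, the placeholder genes, and all numeric values are hypothetical and are not taken from the TCGA data.

```python
import math

def counts_to_log2_tpm(counts, lengths_kb):
    """Convert raw read counts for one sample to log2(TPM + 1).

    counts: dict gene -> raw read count
    lengths_kb: dict gene -> effective gene length in kilobases
    """
    # Reads per kilobase for each gene
    rpk = {g: counts[g] / lengths_kb[g] for g in counts}
    # Scale so that TPM values sum to one million within the sample
    scale = sum(rpk.values()) / 1e6
    return {g: math.log2(rpk[g] / scale + 1) for g in rpk}

# Toy sample with hypothetical counts and gene lengths
sample_counts = {"DDR1": 300, "GENE_A": 100, "GENE_B": 600}
sample_lengths_kb = {"DDR1": 3.0, "GENE_A": 1.0, "GENE_B": 6.0}
log2_tpm = counts_to_log2_tpm(sample_counts, sample_lengths_kb)
```

Because TPM is rescaled within each sample, the back-transformed values always sum to one million, which makes expression levels comparable across samples of different sequencing depth.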

DNA methylation data for TCGA-LUAD and TCGA-LUSC were downloaded from UCSC-Xena (https://toil.xenahubs.net) using HumanMethylation450 BeadChip, including 732 NSCLC tumor samples. DDR1 methylation was calculated as the mean beta value across all sites (22). Somatic mutation data were obtained using TCGAbiolinks (https://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html, v2.30.0) (23) and visualized with maftools (https://bioconductor.org/packages/release/bioc/html/maftools.html, v2.18.0) (24). CNV analysis used “Masked Copy Number Segment” data via TCGAbiolinks.

scRNA-seq data (GSE117570 (25)) were downloaded from GEO, including eight samples (GSM3304007–GSM3304014). Low-quality cells were removed by filtering out those with fewer than 200 detected genes or with mitochondrial gene expression exceeding 5% of total reads to ensure data reliability (26–28), leaving 11,481 cells for analysis.
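The two quality filters can be sketched as follows. This is an illustrative Python version under stated assumptions, not the Seurat workflow itself: cells are represented as plain dictionaries, mitochondrial genes are recognised by an "MT-" prefix, and the function name and toy cells are hypothetical.

```python
def passes_qc(cell_counts, min_genes=200, max_mito_fraction=0.05):
    """Return True if a cell passes both quality filters.

    cell_counts: dict gene symbol -> read count for one cell
    """
    total = sum(cell_counts.values())
    if total == 0:
        return False
    # Number of genes with at least one read in this cell
    detected = sum(1 for c in cell_counts.values() if c > 0)
    # Mitochondrial genes are recognised by the "MT-" prefix (assumption)
    mito = sum(c for g, c in cell_counts.items() if g.startswith("MT-"))
    return detected >= min_genes and mito / total <= max_mito_fraction

# Three hypothetical cells: one acceptable, one sparse, one mitochondria-rich
good = {f"GENE{i}": 1 for i in range(250)}
good["MT-CO1"] = 5
sparse = {f"GENE{i}": 1 for i in range(150)}
stressed = {f"GENE{i}": 1 for i in range(250)}
stressed["MT-CO1"] = 50

cells = [("good", good), ("sparse", sparse), ("stressed", stressed)]
kept = [name for name, cell in cells if passes_qc(cell)]
```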

2.2 Survival analysis

The surv_cutpoint function from the survminer package was utilized to determine the optimal cutoff value for DDR1 expression in NSCLC patients, with survival time and survival status used as outcome variables (29). On the basis of this optimal cutoff, NSCLC patients were divided into high and low DDR1 expression groups. Kaplan–Meier analysis was then performed using the survival (https://cran.r-project.org/web/packages/survival/index.html, version = 3.4-8) and survminer packages to investigate the association between DDR1 expression and patient survival time.
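The idea behind a data-driven cutoff can be illustrated with a short sketch. Note that surv_cutpoint relies on maximally selected rank statistics; the Python code below swaps in a simpler scan over candidate cutoffs scored by a two-group log-rank chi-square statistic. All data values, function names, and the min_group parameter are hypothetical.

```python
def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d_high = sum(1 for i in at_risk
                     if times[i] == t and events[i] and in_high[i])
        observed += d_high
        expected += d * n_high / n
        if n > 1:
            # Hypergeometric variance of the event count in the high group
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0

def best_cutpoint(expression, times, events, min_group=2):
    """Scan candidate cutoffs; return (cutoff, statistic) with the largest statistic."""
    best = (None, -1.0)
    for cut in sorted(set(expression)):
        in_high = [x > cut for x in expression]
        n_high = sum(in_high)
        # Skip splits that leave either group too small
        if n_high < min_group or len(expression) - n_high < min_group:
            continue
        stat = logrank_chi2(times, events, in_high)
        if stat > best[1]:
            best = (cut, stat)
    return best

# Hypothetical cohort: higher expression goes with shorter follow-up to event
expression = [5.1, 5.4, 5.9, 6.2, 7.0, 7.3, 7.8, 8.1]
months = [60, 55, 50, 48, 12, 10, 8, 5]
status = [1, 1, 1, 1, 1, 1, 1, 1]
cutoff, statistic = best_cutpoint(expression, months, status)
median_split = logrank_chi2(months, status, [x > 6.2 for x in expression])
```

Unconstrained scans can favour very unbalanced splits, so a minimum group size is usually enforced. Because the cutoff is chosen to maximise the statistic, the resulting P value is optimistic unless it is corrected for the multiple candidate cutoffs.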

2.3 Construction of a nomogram based on DDR1 expression for NSCLC prognosis prediction

Univariate and multivariate Cox regression analyses were performed on DDR1 expression and clinical factors. On the basis of these results, a nomogram (30) was constructed using the rms package (https://cran.r-project.org/web/packages/rms/, version = 6.8-0) (31). Decision curve analysis (DCA) is a straightforward method for evaluating the clinical utility of prediction models, diagnostic tests, and molecular biomarkers. Therefore, calibration curves and DCA plots for 1-year, 3-year, and 5-year predictions were drawn to assess the predictive accuracy of the model.
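The quantity plotted in a decision curve is the net benefit at a given risk threshold. The sketch below shows the standard formula for a binary outcome at a fixed time horizon, ignoring censoring for simplicity; survival-based DCA replaces the raw counts with Kaplan–Meier estimates. The function name and the toy predictions are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    n = len(outcome)
    true_pos = sum(1 for p, y in zip(predicted_risk, outcome)
                   if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(predicted_risk, outcome)
                    if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)

# Hypothetical predicted risks and observed outcomes (1 = event)
risks = [0.9, 0.8, 0.3, 0.2, 0.6]
events = [1, 1, 0, 0, 0]

nb_model = net_benefit(risks, events, threshold=0.5)
# "Treat all" reference strategy: everyone is classified as high risk
nb_treat_all = net_benefit([1.0] * len(events), events, threshold=0.5)
```

A model is considered clinically useful at a threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.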

2.4 Cellular mutations and copy number variations

The maftools package in R was used to visualize the somatic mutation profiles between the DDR1 high-expression and low-expression groups in NSCLC, and the mutation differences between the two groups were compared. The results are displayed in a waterfall plot. Additionally, GISTIC 2.0 analysis (32) was conducted on the downloaded CNV segments via GenePattern (https://cloud.genepattern.org) (33) to study the differences in copy number variations between the DDR1 high-expression and low-expression groups.

2.5 DDR1 and immune-related features

Four algorithms were used to assess immune infiltration in NSCLC tumor samples. CIBERSORT (https://cibersort.stanford.edu/) estimated the proportions of 22 immune cell types using a known reference matrix and support vector regression (34). ESTIMATE (https://bioinformatics.mdanderson.org/estimate/) inferred tumor purity and stromal/immune cell content based on transcriptomic data via the ESTIMATE R package (35), generating ESTIMATEScore, ImmuneScore, StromalScore, and TumorPurity.

MCPcounter (http://github.com/ebecht/MCPcounter) (36) quantified the abundance of nine immune cell types using whole-transcriptome data. TIMER (https://cistrome.shinyapps.io/timer/) (37) provided immune infiltration estimates through its online platform. These tools jointly evaluated immune infiltration in NSCLC, and Pearson correlation analysis assessed the relationship between infiltration and risk score (RS).

Seven immune modulatory gene types were retrieved from previous studies (38) and their correlation with DDR1 expression was analyzed. To evaluate DDR1’s role in immunotherapy response, immune-related signatures were collected, including cancer immune cycle genes (39), cytotoxic activity (CYT), and tertiary lymphoid structure (TLS) markers (40, 41). CYT and TLS scores were calculated via ssGSEA. The TIDE algorithm (http://tide.dfci.harvard.edu) (42) was used to predict immunotherapy response and immune escape. Tumor mutation burden (TMB) data were obtained from TCGA. Wilcoxon tests compared scores between high and low DDR1 expression groups.

2.6 Drug sensitivity prediction

To assess the sensitivity of NSCLC patients to common chemotherapy drugs, data from the Genomics of Drug Sensitivity in Cancer (GDSC) database (https://www.cancerrxgene.org/) (43) were used. The pRRophetic package in R was then used to estimate the half-maximal inhibitory concentration (IC50) of each drug for each patient, with IC50 values z-score normalized across cell lines to account for baseline variability (44). The Wilcoxon test was applied to compare differences in drug sensitivity between the high and low DDR1 expression groups.
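The group comparison can be sketched with a Wilcoxon rank-sum (Mann–Whitney U) test. The Python version below uses the normal approximation without tie or continuity correction, so its P values may differ slightly from those of R's wilcox.test; the function name and the predicted IC50 values are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Mann-Whitney U for group x and a two-sided normal-approximation P value."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average rank, 1-based
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value

# Hypothetical z-scored IC50 estimates for the two DDR1 expression groups
ic50_high = [1.2, 1.5, 1.9, 2.1, 2.4]
ic50_low = [0.1, 0.3, 0.5, 0.8, 1.0]
u_stat, p_val = rank_sum_test(ic50_high, ic50_low)
```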

2.7 Cell population annotation

The Seurat object for single-cell data was visualized using UMAP, revealing 16 clusters. Through manual annotation on the basis of cell type marker genes, 13 distinct cell types were identified as follows: B cells (IGHG1, IGHA1, and IGKC), CD8 T cells (CD8B and CD8A), cytotoxic cells (KLRF1, CTSW, and KLRB), dendritic cells (CLEC10A and MS4A6A), fibroblasts (PROCR and CD151), M1 macrophages (CCR7), M2 macrophages (MRC1, CD163, PPARG, and TREM2), macrophages (GPC4, RAI14, and BCAT1), monocytes (VCAN, FCN1, and IL1B), neutrophils (VNN3 and SLC22A4), NK cells (PSMD4 and TINAGL1), other T cells (TRAT1, CD96, and ITM2A), and T helper cells (BATF, ANP32B, and SNRPD1). Differential gene expression between these cell types was assessed using the FindAllMarkers function, and the results were visualized via a heatmap.

2.8 Pseudotime analysis

Pseudotime analysis allows for the ordering of cells along a trajectory on the basis of their gene expression profiles, effectively mapping each cell to a corresponding position in the developmental trajectory. By analyzing gene expression status, cells can be grouped into multiple differentiation states, and an intuitive lineage tree of the predicted differentiation and developmental trajectories of the cells can be generated (45). The results of pseudotime analysis require confirmation of the differentiation starting and ending points on the basis of the trajectory distribution of cell types and the changes in the expression of characteristic genes. For this analysis, the monocle package in R (https://bioconductor.org/packages/release/bioc/html/monocle.html, version = 2.30.0) was used to perform the pseudotime analysis (46).

2.9 CellChat analysis

To study intercellular communication and identify the mechanisms of signaling molecules at single-cell resolution, the CellChat package in R (https://github.com/sqjin/CellChat, version = 1.6.1) was used for cell communication analysis (47). CellChat is a public knowledge database that contains information on ligands, receptors, cofactors, and their interactions, along with pathway annotations. Using social network analysis tools, pattern recognition methods, and manifold learning techniques, CellChat identifies differentially expressed ligands and receptors in each cell type and clusters various communication patterns across different cell groups and pathways. Through these analyses, specific signaling roles of each cell group can be determined, and novel functional intercellular communication mechanisms for certain cell types can be discovered.

2.10 Differentially expressed gene analysis

Differentially expressed genes (DEGs) between samples with high DDR1 expression and low DDR1 expression were analyzed using the limma package (https://bioconductor.org/packages/release/bioc/html/limma.html, version = 3.58.1) (48). Genes with |logFC| > 0.5 and P value < 0.05 were selected as DEGs for further investigation (49).
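The selection rule can be written as a simple filter over the differential expression results. The sketch below assumes a table of log fold changes and P values such as limma would produce; the function name, placeholder genes, and values are hypothetical.

```python
def select_degs(results, lfc_cutoff=0.5, p_cutoff=0.05):
    """Split genes into up- and down-regulated DEGs.

    results: dict gene -> (log fold change, P value)
    """
    up, down = [], []
    for gene, (log_fc, p_value) in results.items():
        # Both the effect-size and the significance threshold must be met
        if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return sorted(up), sorted(down)

# Hypothetical results for four placeholder genes
toy_results = {
    "GENE_A": (1.20, 0.001),   # up-regulated
    "GENE_B": (-0.80, 0.010),  # down-regulated
    "GENE_C": (0.30, 0.001),   # significant but below the fold-change cutoff
    "GENE_D": (0.90, 0.200),   # large change but not significant
}
up_genes, down_genes = select_degs(toy_results)
```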

2.11 Enrichment analysis

To explore functional and pathway differences between the two groups, enrichment analyses were conducted on the DEGs. Gene Ontology (GO) analysis (50) (covering BP, MF, and CC categories) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis (51) were performed using the clusterProfiler package in R (52), with false discovery rate (FDR) < 0.05 indicating significance.

Gene set enrichment analysis (GSEA) (53), implemented via clusterProfiler, assessed biological process differences between the groups, with FDR < 0.05 as the threshold. Gene set variation analysis (GSVA) (54), a nonparametric method for evaluating gene set enrichment across samples, was conducted using the GSVA package in R.

The “h.all.v7.5.2.symbols.gmt” gene set from MSigDB (55) was used for GSVA on high and low DDR1 expression groups. Limma was applied to compare pathway scores (P.adjust < 0.05, |logFC| > 0.1), and results were visualized with a heatmap (56).

2.12 Construction of a prognostic model for NSCLC using machine learning algorithms

Univariate Cox analysis was first performed on DEGs in both training and validation datasets. Genes with p < 0.05 and consistent hazard ratio directions in at least two datasets were selected as key genes (57). These were used to construct a prognostic model for NSCLC via a multi-step computational framework.

In the TCGA-NSCLC dataset, ten machine learning algorithms were applied: random survival forest (RSF), elastic net (Enet), lasso, ridge, stepwise Cox, Cox boost, plsRcox, SuperPC, GBM, and survival support vector machine (survival-svm) (57, 58). Some algorithms (e.g., lasso, stepwise Cox, RSF) included feature selection. These were combined into 101 model combinations via 10-fold cross-validation, and their performance was evaluated by C-index across TCGA, GSE30219, and GSE41271. The model with the highest average C-index was selected as optimal.
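The model selection criterion can be illustrated with a minimal sketch of Harrell's concordance index and the choice of the model with the highest average across cohorts. This is a simplified Python illustration, not the pipeline used in the study; the cohort names, model names, and all values are hypothetical.

```python
def harrell_c_index(times, events, risk):
    """Fraction of comparable patient pairs ranked correctly by the risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before the follow-up time of patient j
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical cohorts: (follow-up times, event indicators)
cohorts = {
    "cohort_1": ([5, 10, 15, 20], [1, 1, 0, 1]),
    "cohort_2": ([3, 6, 9, 12], [1, 0, 1, 1]),
}
# Hypothetical risk scores produced by two candidate models
predictions = {
    "model_a": {"cohort_1": [0.9, 0.7, 0.4, 0.1], "cohort_2": [0.8, 0.6, 0.5, 0.2]},
    "model_b": {"cohort_1": [0.1, 0.7, 0.4, 0.9], "cohort_2": [0.8, 0.2, 0.5, 0.6]},
}

mean_c = {
    model: sum(harrell_c_index(*cohorts[c], scores[c]) for c in cohorts) / len(cohorts)
    for model, scores in predictions.items()
}
best_model = max(mean_c, key=mean_c.get)
```

Averaging over the training and external cohorts, rather than ranking on the training cohort alone, penalises models that fit one dataset well but generalise poorly.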

Using this model, prognostic genes were identified and used to calculate a risk score (RS) for each patient. The model’s performance was validated using ROC curves with the timeROC package in R (59). The optimal RS cutoff was determined using surv_cutpoint from survminer (29), classifying patients into high-RS and low-RS groups. Kaplan–Meier analysis (survival and survminer packages) assessed the RS’s prognostic value.
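For readers unfamiliar with the Kaplan–Meier estimator used to compare the risk groups, a minimal product-limit sketch is given below. It is an illustration in Python rather than the survival package implementation, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        # Patients still under observation just before time t
        n_at_risk = sum(1 for ti in times if ti >= t)
        # Events observed exactly at time t
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - d / n_at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up (months) and status (1 = event, 0 = censored)
follow_up = [2, 4, 4, 6, 8]
status = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(follow_up, status)
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why the estimate steps down only at observed event times.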

2.13 Validation of in vitro experiments

The human NSCLC cell lines HARA, NCI-H292, and CALU-3 were acquired from the Kunming Institute of Zoology, Chinese Academy of Sciences, and cultured under standard conditions according to the provider’s protocols. Reverse transcription−quantitative (RT–q)PCR, western blotting and immunofluorescence were used to select two cell lines with high DDR1 expression for subsequent experiments. The primers used for RT–qPCR were as follows: for DDR1, forward: CCGACTGGTTCGCTTCTACC, reverse: CGGTGTAAGACAGGAGTCCATC; for β-actin, forward: CATGTACGTTGCTATCCAGGC, reverse: CTCCTTAATGTCACGCACGAT. Log-phase cells were trypsinized, seeded in 24-well plates (1×10⁵/well), and transfected with DDR1-small interfering (si) RNA using DMEM-based complexes. After transfection (48 h), the cells were cultured in supplemented medium, and DDR1 knockdown was confirmed via RT–qPCR and western blotting. The functional assays used were as follows: (1) Cell viability: Absorbance (450 nm) was measured at 0–72 h after CCK-8 (C0037, Beyotime, China) incubation; (2) Colony formation assay: 1×10³ transfected cells/well were cultured for 14 days, fixed with formaldehyde, crystal violet-stained for 15 min, and counted (×40 magnification); (3) Migration/Invasion: 1×10⁴ cells in Matrigel-coated (invasion) or uncoated (migration) Transwell chambers were quantified after 24 h via crystal violet staining; (4) Wound healing: Monolayer scratches were imaged before and after 24 h of incubation; (5) Adhesion: Matrigel-coated 96-well assays were analyzed via 0.3% crystal violet staining; and (6) Apoptosis: Annexin V-FITC/PI (Absin, China)-stained cells were assessed by flow cytometry. All experiments included triplicate measurements and statistical validation.

2.14 Validation of pathological features in clinical samples

Paraffin-embedded tissue samples obtained from 34 patients (mean ± SD age, 62.58 ± 9.65 years; 7 females, 27 males) were retrospectively subjected to pathological examination. Ethical approval for the study was received from the Institutional Review Board of the First People’s Hospital of Yunnan Province (Approval No. KHLL2025-KY131) and was conducted in accordance with the Declaration of Helsinki and Good Clinical Practice Guidelines. The requirement for informed consent from all patients was waived because of the retrospective nature of the study. Patient tissue samples were cut into 4-μm-thick sections, deparaffinized, and rehydrated before being treated with antigen retrieval solution (10 mmol/L sodium citrate buffer, pH 6.0) and then reacted with anti-DDR1 antibodies (1:3200; Cell Signaling Technology). Immunoreactivity was assessed by one pathologist with 7 years of experience in a blinded fashion as follows: no staining (-); mild staining (+); moderate staining (++); and strong staining (+++). Quantitative immunohistochemical (IHC) scoring was conducted using ImageJ software to compare the staining intensities between NSCLC tumor tissues and paired adjacent normal pulmonary tissues.

2.15 Statistical analysis

A flowchart of the bioinformatic analysis process is shown in Figure 1. All bioinformatic data calculations and statistical analyses were performed using R programming (https://www.r-project.org/, version 4.1.2). For comparisons of continuous variables between two groups, statistical significance for normally distributed variables was assessed using the independent Student’s t test, whereas the Mann–Whitney U test (i.e., Wilcoxon rank-sum test) was used to analyze differences for nonnormally distributed variables. Differences between the two groups were compared using the R package ggpubr (60), survival analysis was conducted using the survival package in R, and Kaplan–Meier survival curves were generated to display survival differences. The log-rank test was used to assess the significance of survival time differences between the two patient groups, and the results were visualized using the survminer package (61). Spearman’s correlation analysis was used to calculate the correlation coefficients between different molecules. In this study, all the statistical P values were two-sided, with P < 0.05 considered statistically significant. GraphPad Prism (version 9.5.1) and ImageJ (http://imagej.org) were used for the analysis of results of in vitro experiments and pathological examination of clinical samples.
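The difference between rank-based (Spearman) and linear (Pearson) correlation can be illustrated with a short sketch. Spearman's coefficient is computed here as the Pearson coefficient of the average ranks; the function names and toy values are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but non-linear relationship between two molecules
gene_x = [1, 2, 3, 4, 5]
gene_y = [1, 4, 9, 16, 25]
rho = spearman(gene_x, gene_y)
r = pearson(gene_x, gene_y)
```

For a perfectly monotonic but curved relationship, the rank-based coefficient reaches 1 while the linear coefficient stays below 1, which is why the choice of method should be stated alongside each reported correlation.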

Figure 1
Flowchart showing the analysis process involving scRNA-seq and TCGA-LUAD/LUSC data. It includes pseudotime analysis, nomogram, DDR1 variance analysis, immune infiltration, drug sensitivity, and algorithm combinations leading to constructing a prognosis model. Various analyses such as single-cell expression UMAP, gene ontology, and differential expression groupings are detailed.

Figure 1. Flowchart of the bioinformatic analysis process.

3 Results

3.1 DDR1 expression differences and survival analysis

Differential expression analysis of DDR1 in different subgroups of the TCGA dataset revealed that DDR1 expression was significantly higher in tumor samples than in control samples (tumor: 7.06 ± 0.968 vs. normal: 5.95 ± 0.465, P < 0.05) (Figure 2A). DDR1 expression was also significantly greater in tumor samples from patients over 65 years of age (over 65 years: 7.14 ± 0.917 vs. under 65 years: 6.94 ± 1.03, P < 0.05) and from males (male: 7.18 ± 0.963 vs. female: 6.89 ± 0.95, P < 0.05) compared to the corresponding reference groups (Figures 2B, C). DDR1 expression was greater in T3 stage samples (T1: 6.93 ± 0.97 vs. T2: 7.10 ± 0.965 vs. T3: 7.21 ± 0.913 vs. T4: 7.01 ± 1.10, P < 0.05) (Figure 2D). No statistically significant difference in DDR1 expression was detected across groups with different M stages, N stages, and pathological stages (P > 0.05) (Figures 2E–G).

Figure 2
Boxplots and a survival curve chart show DDR1 expression analysis. Panels A to G compare DDR1 expression in different clinical groups: normal vs. tumor, age groups, gender, cancer stages, and other clinical parameters. Panel H depicts a survival curve with progression-free survival over time for high and low DDR1 expression groups, with a significant p-value of 0.00033.

Figure 2. Differential expression of DDR1 and survival analysis in the TCGA dataset. (A) Differential expression of DDR1 between tumor and control samples. (B) Differential expression of DDR1 between different age groups. (C) Differential expression of DDR1 between different sex groups. (D) Differential expression of DDR1 across different pathological stages. (E) Differential expression of DDR1 across different T stages. (F) Differential expression of DDR1 across different N stages. (G) Differential expression of DDR1 across different M stages. (H) Survival analysis results.

Using the surv_cutpoint function from the survminer package, the optimal cutoff point for DDR1 expression was determined on the basis of the survival time and status of NSCLC patients. This cutoff was then used to classify the patients into high and low DDR1 expression groups (Supplementary Table 1). Survival analysis was performed on the basis of these groupings to investigate the association between DDR1 expression and patient survival. The results revealed that in the TCGA dataset, the high DDR1 expression group had significantly shorter progression-free survival (PFS) (P < 0.05) (Figure 2H).

3.2 Nomogram for predicting NSCLC prognosis based on DDR1 expression

Univariate Cox regression analysis revealed that the T stage (P < 0.05), N stage (P < 0.05), pathological stage (P < 0.05), and DDR1 expression (P < 0.05) were significantly associated with NSCLC prognosis (Figure 3A). Multivariate analysis further identified the pathological stage and DDR1 expression as independent prognostic factors (Figure 3B). A nomogram based on these factors was constructed (Figure 3C), and DCA was used to assess its clinical utility at 1, 3, and 5 years (Figure 3E). Calibration curves (Figure 3D) revealed that predictions of the 1-year nomogram model were most consistent with those of the ideal model, with other time points also showing high accuracy.

Figure 3
Five-panel figure analyzing various factors related to a study. Panel A and B are forest plots showing hazard ratios (HR) for variables like DDR1, gender, age, and stages, indicating significant and non-significant risk factors with circles of different sizes. Panel C is a nomogram depicting total points to predict probability over time. Panel D is a calibration curve comparing observed overall survival (OS) to nomogram-predicted OS over one, three, and five years. Panel E is a decision curve showing standardized net benefit across high-risk thresholds for different models.

Figure 3. Construction of a nomogram for NSCLC patients based on DDR1 expression from the TCGA database. (A) Univariate Cox analysis. (B) Multivariate Cox analysis. (C) Nomogram for clinical features. (D) Calibration curves for 1-, 3-, and 5-year outcomes. (E) Decision curve analysis (DCA) for the nomogram model.

3.3 Mutational differences between the high and low DDR1 expression groups

The expression level of DDR1 was significantly negatively correlated with the DNA methylation level (R = -0.43, P < 0.05) (Figure 4A). On the basis of DDR1 mutation status, NSCLC samples were categorized into DDR1-mutant and DDR1-wild-type groups. No significant differences in DDR1 expression levels or survival times were detected between the DDR1-mutant and wild-type groups (P > 0.05) (Figures 4B, C). Furthermore, no statistically significant differences in the tumor mutational burden (TMB) were found between the high and low DDR1 expression groups (Figure 4D).

Figure 4
A multi-panel scientific figure consists of five sections. (A) A scatter plot shows an inverse correlation between DDR1 expression and DNA methylation. (B) A box plot compares DDR1 levels in wild-type and mutated groups. (C) A Kaplan-Meier curve illustrates survival probability for wild-type and mutated groups over time. (D) A box plot compares tumor mutational burden (TMB) in high and low groups. (E) A complex heatmap depicts mutations in various genes across samples, with a bar plot indicating TMB and a legend detailing mutation types and groups, highlighting differences in high and low groups.

Figure 4. Somatic mutation analysis between the high and low DDR1 expression groups. (A) Correlation analysis between DDR1 expression and the DNA methylation level. (B) DDR1 expression differences between the DDR1-mutant and wild-type groups. (C) Survival analysis between the DDR1-mutant and wild-type groups. (D) Distribution of TMB between the high and low DDR1 expression groups. (E) Waterfall plot showing common somatic gene mutations, with a bar chart on the right indicating the mutation frequency in each group.

Compared with common somatic mutations in NSCLC, distinct mutation patterns were observed between the DDR1 expression groups (Figure 4E). CNV analysis revealed that in the high DDR1 expression group, amplifications on chromosomes 3, 8, and 11 and deletions on chromosomes 2 and 9 were more frequent (Supplementary Figures 1A, B). In the low DDR1 expression group, amplifications on chromosomes 1, 8, and 14 and deletions on chromosome 9 were more common (Supplementary Figures 1C, D).

3.4 Association between DDR1 and immune-related features

Immune and stromal cell infiltration levels were estimated using four algorithms. Pearson correlation analysis revealed significant associations between DDR1 expression and immune cell infiltration. Among the immune deconvolution algorithms, the strongest correlations were observed for T_cells_CD4_memory_activated (R = -0.18, P < 0.001) and Macrophages_M0 (R = 0.17, P < 0.001) in CIBERSORT (16 cell types); ImmuneScore (R = -0.46, P < 0.001) and ESTIMATEScore (R = -0.45, P < 0.001) in ESTIMATE (4 cell types); T_cells (R = -0.31, P < 0.001) and Monocytic_lineage (R = -0.30, P < 0.001) in MCPcounter (8 cell types); and T_cell_CD8 (R = -0.34, P < 0.001) and DC (R = -0.25, P < 0.001) in TIMER (6 cell types) (Figure 5A).

Figure 5
Heatmaps illustrating the expression of DDR1 gene in various cell types and immune-related markers. Part A shows association with immune and stromal scores, cell types, and tumor purity, categorized by analysis type (ESTIMATE, CIBERSORT). Part B displays DDR1 expression correlated with a range of antigen-presenting, cell adhesion, and immune regulatory markers, grouped by functional type. Data is color-coded from blue (low expression) to red (high expression), with multiple sections indicating distinct gene or marker clusters.

Figure 5. Immune infiltration analysis. (A) Correlations between immune cell infiltration and DDR1 expression evaluated by four algorithms. (B) Correlations between the expression levels of DDR1 and those of seven types of immunoregulatory genes.

Pearson correlation analysis revealed a significant association between the levels of DDR1 and those of 63 immunoregulatory genes (P < 0.05), with the strongest correlations observed for CD276 (R = 0.48, P < 0.001), VTCN1 (R = 0.41, P < 0.001), and SLAMF7 (R = -0.37, P < 0.001) (Figure 5B).

3.5 Drug sensitivity analysis based on DDR1

The IC50 values of chemotherapy drugs (methotrexate, vinblastine, doxorubicin, cisplatin, docetaxel, and gefitinib) were calculated in NSCLC patients. Methotrexate had a greater IC50 in the high DDR1 expression group (P < 0.05) (Figure 6A), whereas vinblastine, doxorubicin, cisplatin, and docetaxel had lower IC50 values in the high DDR1 expression group (P < 0.05) (Figures 6B–E). No significant difference in the IC50 was observed for gefitinib between the two groups (P > 0.05) (Figure 6F).

Figure 6
Box plots compare IC50 values for six drugs between high and low groups. Panels A to E show significant differences in IC50 values, with high (red) and low (blue) groups. Panel F for Gefitinib shows no significant difference. Statistical significance is indicated by asterisks.

Figure 6. Drug sensitivity analysis. Differences in the IC50 values for methotrexate (A), vinblastine (B), doxorubicin (C), cisplatin (D), docetaxel (E), and gefitinib (F) between the high and low DDR1 expression groups.

Figure 7
Five-panel figure presenting various box plots and a heatmap. Panel A displays a box plot comparing TLS for high and low groups with a significant p-value. Panel B shows TIDE with a p-value of 2.7e-05. Panel C presents CYT with a significant p-value. Panel D illustrates diverse box plots for fractions across different steps and comparisons, indicating group differences. Panel E is a heatmap showing correlations of gene expressions with values ranging from -1.0 to 1.0, color-coded from blue to red.

Figure 7. Correlation analysis of immune therapy predictors. Distribution of TLS (A), TIDE (B) and CYT (C) scores between the high and low DDR1 expression groups. (D) Differences in tumor immune cycle steps between the high and low DDR1 expression groups. (E) Correlations between immune checkpoint genes and DDR1 expression.

3.6 DDR1-based immunotherapy response prediction

To investigate the role of DDR1 in the immunotherapy response, we examined the relationships between DDR1 and immunotherapy predictors (TLS, TIDE, and CYT). TLS was higher in the low DDR1 expression group (P < 0.05) (Figure 7A), whereas TIDE was higher in the high DDR1 expression group (P < 0.05) (Figure 7B). CYT was higher in the low DDR1 expression group (P < 0.05) (Figure 7C). Tumor immune cycle analysis revealed significant differences in 16 steps between the high and low DDR1 expression groups of NSCLC patients (P < 0.05) (Figure 7D), with higher proliferation levels in the high-risk group. Pearson correlation analysis revealed the strongest positive correlation between CD276 and DDR1 levels (R = 0.48, P < 0.05) and the strongest negative correlation between HLA-DMB and DDR1 levels (R = -0.37, P < 0.05) (Figure 7E).

3.7 Single-cell heterogeneity analysis

UMAP clustering of the NSCLC single-cell dataset revealed 16 clusters (Figure 8A), and manual annotation revealed 13 cell types (B cells, CD8 T cells, cytotoxic cells, dendritic cells, fibroblasts, M1 cells, M2 cells, macrophages, monocytes, neutrophils, NK cells, other T cells, and T helper cells) (Figure 8B). Marker gene expression across cell types was visualized in a bubble plot (Figure 8C). Differential expression analysis and a heatmap (Figure 8D) highlighted the top two genes with the most significant expression differences. The proportions of each cell type are shown in bar charts (Figures 8E, F), revealing that macrophages predominated in the tumor samples, whereas other T cells were most abundant in the normal samples.

Figure 8
UMAP plots (A, B) show cell clusters labeled by type, with each color representing a different cell type, such as monocytes, T helper cells, and fibroblasts. Dot plot (C) illustrates the expression of various features across cell types, with size and color indicating percentage and average expression. Heatmap (D) displays gene expression levels across cell identities, with a gradient from blue to red. Bar plots (E, F) compare cell type proportions in tumor versus normal samples, with each color representing a specific cell type.

Figure 8. Single-cell heterogeneity. (A) Clustering results for single-cell data. (B) Cell type annotation results. (C) Bubble plot of marker gene expression across different cell types. (D) Heatmap of differentially expressed genes in the single-cell transcriptome (red indicates upregulated expression, blue indicates downregulated expression). (E) Proportions of cell types in tumor samples. (F) Proportions of cell types in normal samples.

3.8 Pseudotime analysis of single-cell data

Pseudotime analysis was conducted to explore the developmental trajectories of 13 cell types in tumor samples, as shown in the differentiation trajectory plot (Figure 9A) and timeline plot (Figure 9B). DDR1 expression decreased along the developmental trajectory (Figure 9C). A UMAP plot further revealed differential DDR1 expression among cell types, with notably higher expression in macrophages (Wilcoxon test: avg_log2FC = 1.87, P_val_adj < 0.001) and neutrophils (Wilcoxon test: avg_log2FC = 2.74, P_val_adj < 0.001) (Figure 9D).

Figure 9
Four graphs are shown: (A) A scatter plot with different cell types indicated in various colors along pseudotime on components 1 and 2. (B) Similar to A, but colored by pseudotime progression from light to dark blue. (C) A scatter plot showing relative expression of DDR1 against pseudotime, with data points transitioning from dark to light blue. (D) A UMAP plot with different cell types, colored by DDR1 expression levels from light to dark purple.

Figure 9. Pseudotime analysis. (A) Differentiation trajectory of 13 cell subpopulations in tumor samples (colors represent different cell subtypes). (B) Developmental timeline of 13 cell subpopulations (color intensity from dark to light represents pseudotime). (C) Timeline showing changes in DDR1 expression during the developmental trajectory. (D) UMAP plot showing DDR1 expression across different cell types.

3.9 CellChat analysis of cell communication

To better understand the interactions between macrophages, neutrophils, and other cell types, CellChat analysis was conducted on tumor samples. The results revealed a close connection between these cell types (Figure 10A). The NOTCH signaling pathway was identified as a key pathway linking macrophages and neutrophils (Figure 10B). Network centrality analysis revealed that neutrophils likely act as senders, whereas macrophages function as receivers in the NOTCH pathway (Figure 10C). Further analysis of the receptor–ligand interactions, visualized in bubble plots, revealed that the ANXA1–FPR1 pair represented a strong interaction between macrophages and M2 macrophages (Figure 11A), and the APP–CD74 pair represented a strong interaction between neutrophils and M2 macrophages (Figure 11B).

Figure 10
Panel A presents two complex network diagrams displaying cell interactions, with the left diagram focusing on the number of interactions and the right on interaction strength. Panel B shows a chord diagram illustrating the NOTCH signaling pathway network among various immune cells. Panel C contains bar graphs representing the roles of cells (sender, receiver, mediator, influencer) in the NOTCH signaling pathway, with bars indicating importance levels.

Figure 10. Cell communication analysis between macrophages, neutrophils, and other cell types. (A) Interactions between different cell types in tumor samples. (B) Predicted NOTCH signaling network. The size of the circles is proportional to the number of cells in each group, and the edge width indicates the communication probability. (C) Heatmap showing the relative importance of each cell group on the basis of four network centrality metrics calculated from the NOTCH signaling network.

Figure 11
Two dot plot charts labeled A and B compare expressions of various marker genes across cell types. Colored dots indicate communication probability, with a gradient from blue (low) to red (high). Dot size reflects p-values, with larger dots signifying greater significance. The x-axis lists cell interactions, while the y-axis lists gene pairs.

Figure 11. Receptor–ligand interactions between macrophages, neutrophils, and other cell types. Bubble plot of receptor–ligand interactions between macrophages (A), neutrophils (B) and other cell types (color intensity indicates interaction strength, and bubble size reflects significance).

3.10 Investigation of potential biological mechanisms enriched in the high and low DDR1 expression groups

To compare gene expression between the high and low DDR1 expression groups, we identified 382 upregulated and 1,749 downregulated DEGs in NSCLC tumor samples from the TCGA database (Figure 12A) (Supplementary Table 2). The results of the GO and KEGG enrichment analyses are shown in Supplementary Table 3 and Supplementary Table 4. The DEGs were associated with pathways such as the cell cycle, toxoplasmosis, and hepatocellular carcinoma (Figure 12B); biological processes such as epidermis development, skin development and gland development (Figure 12C); cellular components such as the cornified envelope, basal part of the cell and clathrin-coated endocytic vesicle (Figure 12D); and molecular functions such as cadherin binding, MHC class II protein complex binding and cell–cell adhesion mediator activity (Figure 12E).

Figure 12
The image contains multiple scientific data visualizations related to gene expression and biological pathways. A) Volcano plot showing differentially expressed genes, with points colored to indicate significance and regulation direction. B) Circular diagram illustrating gene interaction networks across various categories. C-E) Donut plots with associated tables representing GO terms for biological processes, cellular components, and molecular functions, with z-scores and logFC data. F) Bubble plot showing pathway activation and suppression with gene ratios and p-values. G) Heatmap displaying gene expression across different pathways, with color-coded group labels.

Figure 12. Enrichment analysis between the high and low DDR1 expression groups. (A) Volcano plot of the differential expression analysis results (red: upregulated; blue: downregulated). (B–G) Enrichment analysis results for KEGG analysis, BP analysis, CC analysis, MF analysis, GSVA, and GSEA.

GSEA enrichment analysis was performed on the basis of the log2FoldChange values of the DEGs from the differential analysis. The results are shown in Supplementary Table 5. Figure 12F displays the top four most significant pathways in terms of both activation and inhibition, including autoimmune thyroid disease, asthma, allograft rejection, the intestinal immune network for IgA production, the estrogen signaling pathway, human papillomavirus infection, the PI3K-Akt signaling pathway, and Staphylococcus aureus infection.

GSVA was performed on common pathways for the high and low DDR1 expression groups, followed by differential analysis of pathway scores using the limma package. The results are shown in the heatmap. All GSVA results are provided in Supplementary Table 6. GSVA revealed significant differences (p < 0.05) in the enrichment of gene sets such as HALLMARK_SPERMATOGENESIS, HALLMARK_NOTCH_SIGNALING, and HALLMARK_HEDGEHOG_SIGNALING between the high and low DDR1 expression groups (Figure 12G).

3.11 Construction of the prognostic model for NSCLC

On the basis of the DEGs identified according to the expression levels of DDR1 mentioned above, univariate Cox analysis was performed on the DEGs in both the training and validation datasets, identifying 88 genes with consistent hazard ratios (P < 0.05) across two or more datasets. In the TCGA-NSCLC cohort, 101 algorithm combinations were used to construct a predictive model via 10-fold cross-validation. The robustness of the model was evaluated in multiple validation cohorts, and the LASSO + RSF model, with the highest average C-index (0.728), was selected (Figure 13A). LASSO analysis revealed 19 genes (ARRB1, CHEK2, DIRAS2, DKK1, EIF4EBP1, FST, GJB5, HES2, HSD11B2, NPM3, PKP2, SHOX2, SORBS2, SUSD4, TEF, TFAP4, TSPAN7, WDR72, and ZIC2) (Figures 13B, C), and RSF analysis revealed the importance of these genes (Figure 13D). Four genes with importance scores >0.01 were selected as prognostic genes (Figure 13E). Risk regression coefficients were calculated using the proportional hazards model, and the RiskScore formula was constructed accordingly. The formula was as follows:

Figure 13
The image includes multiple panels with statistical data and visualizations related to risk assessment models in oncology. Panel A displays a heatmap comparing various models with their C-index scores color-coded for different cohorts. Panel B shows a plot of partial likelihood deviance versus log lambda. Panel C presents a Lasso path of coefficients versus L1 norm. Panel D illustrates variable importance in a decision tree model. Panel E displays a coefficient plot with several gene names. Panels F, H, J depict Kaplan-Meier survival curves for different cohorts. Panels G, I, K show ROC curves with AUC values for each dataset.

Figure 13. Construction of the prognostic model for NSCLC. (A) C-index of 101 algorithm combinations in 4 cohorts. (B, C) LASSO analysis. (D) RSF analysis. (E) Risk regression coefficients for 4 genes. (F) Survival curves of high/low RiskScore groups in TCGA; (G) ROC analysis for 1-, 3-, and 5-year survival in the TCGA cohort. (H) Survival curves in the GSE41271 cohort. (I) ROC analysis in the GSE41271 cohort. (J) Survival curves in the GSE30219 cohort. (K) ROC analysis in the GSE30219 cohort.

RiskScore = (0.135 × PKP2 exp.) + (0.104 × DKK1 exp.) + (−0.118 × TEF exp.) + (−0.132 × GJB5 exp.).

Risk scores ranged from 0.2 to 2.1, with the median value of 1.0 used as the cutoff to classify patients into high- and low-risk groups. After the RiskScore for each NSCLC patient was calculated, patients were grouped according to the cutoff from the survminer package (Supplementary Table 7). Survival analysis revealed that the low-RiskScore group had significantly better survival in the TCGA cohort (P < 0.05) (Figure 13F), as well as in the GSE41271 (P < 0.05) (Figure 13H) and GSE30219 cohorts (P < 0.05) (Figure 13J). ROC analysis of the model yielded 1/3/5-year AUCs of 0.600/0.645/0.663 in the TCGA cohort (Figure 13G), with similar results in the GSE41271 (0.716/0.603/0.608) (Figure 13I) and GSE30219 (0.673/0.692/0.662) cohorts (Figure 13K). These findings indicate stable performance of our model across multiple cohorts.
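The RiskScore calculation and grouping described above can be sketched as follows. This is a minimal illustration, not the authors' code: the coefficients are those given in the formula, whereas the function names, the example expression values, and the rule that a score exactly equal to the cutoff is assigned to the low-risk group are assumptions.

```python
# Coefficients taken from the RiskScore formula reported in the text
COEFFICIENTS = {"PKP2": 0.135, "DKK1": 0.104, "TEF": -0.118, "GJB5": -0.132}


def risk_score(expression):
    """Weighted sum of the four prognostic gene expression values."""
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def risk_group(score, cutoff):
    """Assign a patient to the high- or low-risk group.

    Assumption: scores strictly above the cutoff are 'high'; the handling
    of ties is not specified in the text.
    """
    return "high" if score > cutoff else "low"


# Hypothetical expression profile (illustration only, not study data)
patient = {"PKP2": 8.0, "DKK1": 6.0, "TEF": 3.0, "GJB5": 2.0}
score = risk_score(patient)
print(round(score, 3), risk_group(score, cutoff=1.0))
```

Because PKP2 and DKK1 carry positive coefficients and TEF and GJB5 carry negative ones, higher expression of the first pair raises the score, whereas higher expression of the second pair lowers it.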

3.12 Comparison of the NSCLC prognostic model with other signatures

To comprehensively compare the performance of our prognostic model with that of other signatures, we systematically reviewed published prognostic signatures or models. A total of 38 signatures, including those for LUAD, LUSC, and NSCLC, were included in this study. Univariate Cox regression analysis revealed that the RiskScore from our prognostic model was a risk factor for NSCLC prognosis in the TCGA, GSE30219, and GSE41271 cohorts (Supplementary Figure 2A). We then calculated the C-index values for the 38 signatures across the three cohorts. The results revealed that in the TCGA cohort (Supplementary Figure 2B), GSE30219 cohort (Supplementary Figure 2C), and GSE41271 cohort (Supplementary Figure 2D), our model consistently ranked among the top signatures, outperforming the majority of previously published signatures.
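For readers unfamiliar with the C-index used in this comparison, the sketch below implements Harrell's concordance index for right-censored survival data. It is an assumption that this variant matches the one used in the analysis; the function name and the toy inputs are hypothetical, and pairs with tied survival times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  : follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk score (higher means worse expected outcome)
    """
    n = len(times)
    if not (n == len(events) == len(risks)):
        raise ValueError("inputs must have the same length")
    concordant = 0.0
    comparable = 0
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event had the higher risk score
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (illustration only, not study data)
times = [5, 12, 20, 31, 40]
events = [1, 1, 0, 1, 0]
risks = [1.8, 1.1, 1.3, 0.9, 0.4]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why signatures can be ordered by their C-index within each cohort.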

3.13 Knockdown of DDR1 in vitro inhibits the proliferation of lung cancer cells

To investigate the role of DDR1 in lung cancer cell proliferation, invasion, migration, and adhesion, we performed siRNA-mediated knockdown of DDR1 in NSCLC cell lines. RT–qPCR, western blotting, and immunofluorescence confirmed DDR1 expression in HARA, CALU3, and NCI-H292 cells (Supplementary Figures 3A–C). CALU3 and NCI-H292 cells were selected for subsequent experiments. The knockdown efficiency of DDR1-siRNA was validated by RT–qPCR and western blotting, which revealed significant suppression of DDR1 expression in CALU3 and NCI-H292 cells (P < 0.05) (Supplementary Figures 3D–G). MTT, apoptosis, and colony formation assays revealed a significant reduction in the proliferative capacity of CALU3 and NCI-H292 cells following DDR1-siRNA transfection (Figures 14A–C). DDR1 depletion significantly impaired cell migration, invasion, and adhesion, as demonstrated by the results of the scratch, Transwell, and adhesion assays (Figures 14D–G).

Figure 14
Panel A shows line graphs comparing cell viability over 72 hours for si-NC and si-DDR1 in CALU3 and NCIH292 cells, with si-DDR1 showing decreased viability. Panel B displays flow cytometry plots and bar graphs showing higher apoptosis rates in si-DDR1 treated cells compared to si-NC. Panel C depicts colony formation assays with fewer colonies in si-DDR1, supported by a bar graph. Panel D includes wound healing assay images and bar graphs illustrating reduced wound healing in si-DDR1. Panels E, F, and G show invasive, migratory, and adherent cell assessments, each with bar graphs indicating decreased cell numbers in si-DDR1.*** indicates p<0.001, ** indicates p<0.01.

Figure 14. Effect of DDR1 knockdown on the proliferation, invasion, migration, and adhesion of CALU3 and NCI-H292 cells. A cell viability assay (A), apoptosis analysis by flow cytometry (B), and a colony formation assay (C) demonstrated changes in cell proliferation following DDR1-siRNA transfection. Scratch assays (D) revealed that DDR1 knockdown significantly reduced the migration ability of CALU3 and NCI-H292 cells. Transwell assays (E, F) revealed a marked decrease in the number of cells that crossed the Transwell membrane, indicating a significant reduction in migration and invasion capacity. Furthermore, an adhesion assay (G) demonstrated that DDR1 knockdown led to a substantial decrease in the adhesion of CALU3 and NCI-H292 cells to the extracellular matrix. ***P<0.001, **P<0.01, *P<0.05. NC, negative control; si, small interfering; OD, optical density.

3.14 Pathological examination of clinical samples

A total of 34 NSCLC samples were included in this study, of which 19 (55.88%, 19/34; mean ± SD age, 63.84 ± 9.32 years; 2 females, 17 males) were positive for DDR1. Among them, 5 of 17 LUAD cases (29.41%) and 14 of 16 LUSC cases (87.50%) were positive for DDR1. The IHC score of DDR1 in positive samples was significantly greater than that of adjacent normal tissues (27.84 ± 5.815 vs. 3.734 ± 1.895, P < 0.0001). Moreover, the IHC score of DDR1 in LUSC was significantly greater than that in LUAD (30.03 ± 5.102 vs. 24.02 ± 2.689, P = 0.0062) (Figure 15).

Figure 15
Panel A shows immunohistochemistry images of tissue samples with varying staining intensities from negative to strong positive, indicated as (-) to (+++). Panel B features two bar charts: the first compares IHC scores between tumor and normal tissues, showing a significant decrease in normal tissues (****), and the second compares LUSC and LUAD, indicating LUSC has a higher IHC score (**).

Figure 15. DDR1 expression in NSCLC tumor tissue. (A) Representative IHC images of DDR1 expression in NSCLC tissues. -: negative staining, +: mild positive staining, ++: moderate positive staining, +++: strong positive staining. Magnification: upper, 20×; lower, 80×. (B) Comparison of the IHC scores of DDR1 in NSCLC tumor tissues and paired adjacent normal tissues. ****P < 0.0001. (C) Comparison between LUSC and LUAD. **P < 0.01.

4 Discussion

This study systematically explores DDR1’s role in NSCLC progression and prognosis via bioinformatics, immunohistochemistry of clinical samples, and in vitro experiments. Elevated DDR1 expression was linked to advanced tumor stage (T3), older age (>65), female sex, and shorter progression-free survival, highlighting its prognostic relevance. Multivariate analysis identified DDR1 and pathological stage as independent prognostic factors, supporting the construction of a predictive nomogram. Our findings align with those of Sun Ho Yang et al. (62), who reported DDR1 overexpression in 61% of NSCLC cases, especially invasive adenocarcinoma. In comparison, our study observed DDR1 expression in 55.88% of cases, a difference likely due to sample size. Additionally, univariate analysis showed that high DDR1 expression correlated with poorer overall survival. Previous studies have shown that collagen I-activated DDR1 promotes NSCLC cell migration and invasion by upregulating MMP-2, N-cadherin, and vimentin (63).

Our analysis revealed that DDR1 expression in NSCLC is significantly negatively correlated with DNA methylation levels (R = -0.43, P < 0.05), suggesting that epigenetic regulation is a potential driver of DDR1 dysregulation in tumors. Global DNA hypomethylation and promoter hypermethylation-induced gene inactivation are well-established epigenetic hallmarks of cancer cells (64, 65). Low-methylation epigenotypes are associated with a poorer prognosis for LUSC (66). An inverse relationship between DDR1 promoter methylation and DDR1 expression was observed at the five CpG sites previously analyzed in NSCLC, with hypomethylation identified as an independent prognostic factor for disease-free survival (67). Studies on DDR1 methylation remain limited, and further research is needed to elucidate its underlying mechanisms.

Notably, although DDR1 mutations were detected in the NSCLC cohorts, no significant differences in survival or expression were detected between the mutant and wild-type groups (P > 0.05), and the difference in TMB between the DDR1 expression groups was not significant (P > 0.05). These findings suggest that DDR1 mutation or expression may not be the primary drivers of its oncogenic activity in NSCLC and that DDR1-driven tumor progression may rely more on microenvironmental interactions (e.g., ECM remodeling) than on intrinsic genomic instability, with DDR1 overexpression promoting collagen alignment and creating a barrier to immune infiltration independent of mutational burden (10). Our in vitro cellular experiments further confirmed that DDR1 knockdown inhibits the proliferation and migration of NSCLC cells.

Comparative somatic mutation profiling revealed distinct mutation patterns between the high and low DDR1 expression groups. High DDR1 expression was associated with amplifications on chromosomes 3, 8, and 11, which harbor key oncogenes such as PIK3CA on chromosome 3 and MYC on chromosome 8 (68). Notably, PIK3CA genomic gain, as detected by FISH, has been reported in 43% of lung cancers, with a preference for squamous cell carcinoma (69, 70). Additionally, deletions on chromosomes 2 and 9 impact key tumor suppressors, including the well-known CDKN2A on chromosome 9 (71). The activation or amplification of these oncogenes and the deletion of tumor suppressors play crucial roles in lung cancer pathogenesis. Conversely, low DDR1 expression cohorts presented amplifications of chromosomes 1 and 14, potentially linked to alternative protumorigenic pathways such as MDM4 (chr1) or AKT1 (chr14) activation (72, 73).

Further multialgorithm analyses revealed that elevated DDR1 expression is significantly correlated with immune cell infiltration in NSCLC, which may play a crucial role in tumor promotion and immune escape. In addition to the aforementioned role of DDR1 in influencing the TME by promoting collagen realignment and restricting immune cell infiltration, the significant upregulation of immune checkpoint genes in the high DDR1 expression group may enhance immune evasion by inhibiting T-cell function. Notably, the association of DDR1 with 63 immunoregulatory genes underscores its central role in tumor immune regulation. For example, DDR1 modulates the MMP and chemokine secretion to recruit macrophages, shaping an immunosuppressive TME, a mechanism validated in hepatic metastasis models where DDR1 silencing reduces MMP2/9 and prometastatic factors in HSCs, thereby inhibiting tumor growth and immune escape (74).

Drug sensitivity analysis revealed that the high DDR1 expression group presented increased IC50 values for methotrexate but increased sensitivity to vinblastine, doxorubicin, cisplatin and docetaxel. This may be attributed to DDR1-driven collagen barriers, which promote drug efflux and contribute to matrix-mediated drug resistance (75). DDR1 inhibitors can overcome ECM-mediated drug resistance by disrupting DDR1/PYK2/FAK signaling, remodeling the tumor microenvironment, and enhancing the efficacy of conventional chemotherapeutic agents (76, 77). However, these findings are based on preclinical studies and warrant further investigation in clinical settings. With respect to immunotherapy prediction, elevated TIDE scores in the high DDR1 expression group suggest an immune evasion phenotype, whereas reduced CYT scores reflect impaired cytotoxic T-cell activity. These findings align with preclinical studies of DDR1 inhibitory antibodies: ECD-neutralizing antibodies disrupt collagen fiber alignment, mitigate immune exclusion and inhibit tumor growth in immunocompetent hosts (10). The above analyses highlight the immunotherapeutic potential of targeting DDR1 and point to promising DDR1-directed therapeutic agents; however, rigorous preclinical and clinical validation of these findings is imperative.

UMAP-based clustering of the single-cell heterogeneity data revealed 16 clusters corresponding to 13 major cell types, highlighting the complexity of the NSCLC TME. Notably, macrophages were found to be the predominant cell type in tumor tissues, whereas T cells were more enriched in normal tissues, which is consistent with previous reports suggesting macrophage enrichment and T-cell abnormalities in NSCLC tumors (78, 79). Pseudotime analysis revealed decreasing DDR1 expression along progression trajectories, with higher levels in macrophages/neutrophils. This dynamic pattern positions DDR1 as a regulator of NSCLC microenvironment remodeling, suggesting that coordinated changes in immune cell composition (notably immunosuppressive shifts) are linked to both immune evasion and fibrosis-like ECM reorganization (80). CellChat analysis revealed macrophages and neutrophils as central NOTCH signaling mediators, with neutrophils functioning as senders and macrophages as receivers, which aligns with known mechanisms of macrophage polarization and myeloid-derived suppressor cell (MDSC) recruitment (81). Key ligand–receptor pairs (ANXA1–FPR1 and APP–CD74) that drive these interactions regulate the phenotypes of tumor-associated macrophages (TAMs), which are functionally linked to the TME (82, 83). Enrichment analysis of the DDR1-high and DDR1-low groups revealed enrichment of the cell cycle and PI3K–Akt signaling pathways, both of which are well-established drivers of tumor proliferation, survival, and therapeutic resistance in NSCLC (84, 85). GSVA/GSEA revealed that DDR1-high-expressing tumors are associated with developmental oncogenic programs (NOTCH/Hedgehog), pathways mechanistically linked to cell fate regulation (proliferation/differentiation/apoptosis) and core tumorigenic processes (invasion/migration) (86, 87). Together, these data suggest that DDR1 is a regulator of the ECM and cellular adhesion and a key factor in the TME.

In this study, we developed a robust prognostic model for NSCLC by integrating the LASSO and random survival forest algorithms and identified four key genes (PKP2, DKK1, TEF, and GJB5). The combined expression of these genes effectively stratified patients into high- and low-risk groups with significantly different survival outcomes. The model demonstrated strong and consistent predictive performance across the TCGA, GSE41271, and GSE30219 cohorts, underscoring its generalizability and clinical potential. Both PKP2 and DKK1 promote the development and progression of NSCLC (88, 89). TEF and GJB5 appeared to function as protective factors in our model. Emerging evidence indicates that TEF expression is lower in NSCLC tissues than in normal tissues and that TEF expression is positively correlated with better clinical survival rates (90). In bladder cancer, upregulation of TEF expression significantly retarded cell growth by inhibiting the G1/S transition via regulation of AKT/FOXOs signaling (91). GJB5, a gap junction protein, is traditionally viewed as a tumor suppressor, and loss of connexin-mediated communication is linked to cancer progression (92–94). Overexpression of GJB5 in NSCLC cells reduced cell proliferation, induced a delay in the G1 phase, inhibited anchorage-independent growth and suppressed cell migration and invasion (95). However, the roles of TEF and GJB5 in NSCLC remain underexplored and warrant further investigation. Furthermore, our comparison with 38 published prognostic signatures demonstrated that our model ranked first in the TCGA cohort, eighth in the GSE30219 dataset, and thirteenth in the GSE41271 dataset, indicating robust and consistent predictive performance across multiple independent cohorts. This result supports prior findings that combining robust feature selection with machine learning enhances model performance and reproducibility in high-dimensional omics data (18).

Notably, our study has several limitations. First, our findings are derived from bioinformatics analyses of publicly available datasets and require validation in real-world patient cohorts. Variations in datasets and analytical approaches may lead to inconsistent results, even when the same pathological condition is investigated (96). Second, the prognostic significance of DDR1 in the context of immunotherapy warrants prospective evaluation in larger patient populations. Finally, the number of pathological samples analyzed in this study is relatively limited, underscoring the need for further investigations with expanded sample sizes to better elucidate the role of DDR1 in NSCLC.

5 Conclusion

In conclusion, our research indicates that DDR1 is upregulated in NSCLC and closely associated with a poor prognosis, serving as one of the important driving factors for the development of NSCLC. In vitro validation and pathological analysis revealed that DDR1 has a significant effect on the biological function of NSCLC cells, promoting tumor cell proliferation, migration and invasion. Notably, we further revealed potential biological mechanisms and identified potential therapeutic drugs that may function in individuals with high DDR1 expression. In addition, through systematic integration of DEGs selected on the basis of DDR1 expression profiles, we developed a robust machine learning-driven prognostic model that demonstrated predictive performance superior to that of traditional models. These results highlight the potential of DDR1 as a therapeutic target in NSCLC.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving humans were approved by the Ethics Committee of the First People’s Hospital of Yunnan Province. The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin because of the retrospective nature of the study.

Author contributions

RL: Funding acquisition, Formal Analysis, Conceptualization, Methodology, Writing – original draft, Writing – review & editing. LQ: Data curation, Resources, Writing – original draft. XS: Data curation, Writing – original draft. JZ: Writing – original draft. YC: Writing – original draft, Data curation. HS: Software, Validation, Writing – original draft. DX: Funding acquisition, Validation, Writing – review & editing, Supervision, Visualization. SW: Visualization, Validation, Funding acquisition, Writing – review & editing, Supervision.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This work was supported by the Yunnan Provincial Science and Technology Department Social Development Special Project (202403AC100018), the Kunming University of Science and Technology & the First People's Hospital of Yunnan Province Joint Special Project on Medical Research (No. KUST-KH2022026Y, KUST-KH2023017Y), and the National Natural Science Foundation of China (No. 82260355).

Acknowledgments

We thank the online language editing service from American Journal Experts (AJE) for improving the grammar and expression of our manuscript before submission.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2025.1690829/full#supplementary-material

Keywords: discoidin domain receptor 1, non-small cell lung cancer, bioinformatics, machine learning, immunotherapy

Citation: Lu R, Qian L, Sun X, Zhang J, Cui Y, Su H, Xv D and Wang S (2025) DDR1 as a key prognostic biomarker in non-small cell lung cancer: identification, validation, and potential therapeutic implications. Front. Immunol. 16:1690829. doi: 10.3389/fimmu.2025.1690829

Received: 22 August 2025; Revised: 05 November 2025; Accepted: 17 November 2025; Published: 28 November 2025.

Edited by:

Zhenwei Shi, Guangdong Academy of Medical Sciences, China

Reviewed by:

Adiba Sultana, Guangdong Provincial People’s Hospital, China
Yifei Deng, Chengdu Seventh People’s Hospital, China

Copyright © 2025 Lu, Qian, Sun, Zhang, Cui, Su, Xv and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: ShaoBo Wang; DongDong Xv
