- 1Department of Histology and Embryology, Shanxi Medical University, Taiyuan, Shanxi, China
- 2Department of Tumor Biobank, Shanxi Province Cancer Hospital/Shanxi Hospital Affiliated to Cancer Hospital, Chinese Academy of Medical Sciences/Cancer Hospital Affiliated to Shanxi Medical University, Taiyuan, China
- 3Department of Translational Medicine, Shanghai Shengdi Medicine Co. Ltd., Shanghai, China
- 4School of Basic Medicine, Shanxi Medical University, Jinzhong, China
- 5Department of Radiobiology, Shanxi Province Cancer Hospital/Shanxi Hospital Affiliated to Cancer Hospital, Chinese Academy of Medical Sciences/Cancer Hospital Affiliated to Shanxi Medical University, Taiyuan, China
- 6Department of Hepatobiliary and Pancreatogastric Surgery, Shanxi Province Cancer Hospital/Shanxi Hospital Affiliated to Cancer Hospital, Chinese Academy of Medical Sciences/Cancer Hospital Affiliated to Shanxi Medical University, Taiyuan, China
Background: Epidermal growth factor receptor (EGFR) is a key protein in cellular signaling that is overexpressed in many human cancers, making it a compelling therapeutic target. On-target severe skin toxicity has limited its clinical application. Dual-targeting therapy represents a novel approach to overcome the challenges of EGFR-targeted therapies.
Methods: A single-cell tumor-normal RNA transcriptomic meta-atlas of lung adenocarcinoma (LUAD) and normal lung tissues was constructed from published data. Tumor associated antigens (TAAs) were screened from genes that were expressed on the cell surface and could distinguish cancer cells from normal cells. Expression of MUC1 and EGFR in tumors and normal tissues was detected by immunohistochemistry (IHC), bulk transcriptomic and single-cell transcriptomic analyses. RNA cut-off values were calculated using paired analysis of RNA sequencing and IHC in patient-derived tumor xenograft samples, and were used to estimate the abundance of EGFR- and MUC1-positive subjects in The Cancer Genome Atlas Program (TCGA) database. Survival analysis of EGFR and MUC1 expression was carried out using the transcription and clinical data from TCGA.
Results: A candidate TAA target, transmembrane glycoprotein mucin 1 (MUC1), showed strong expression in cancer cells and low expression in normal cells. Single-cell analysis suggested that EGFR and MUC1 together had better tumor specificity than the combination of EGFR with other drug targets. IHC data confirmed that EGFR and MUC1 were highly expressed in LUAD and colorectal cancer (CRC) clinical samples but not in various normal tissues. Notably, co-expression of EGFR and MUC1 was observed in 98.4% (n=64) of patients with LUAD and in 91.6% (n=83) of patients with CRC. It was estimated that EGFR and MUC1 were expressed in 97.5% of LUAD samples in the TCGA dataset. In addition, high expression of EGFR and MUC1 was significantly associated with poor prognosis of LUAD and CRC patients.
Conclusions: Single-cell RNA, bulk RNA and IHC data demonstrated the high expression levels and co-expression patterns of EGFR and MUC1 in tumors but not normal tissues. Therefore, EGFR and MUC1 are a promising TAA combination for therapeutic targeting, which could enhance on-tumor efficacy while reducing off-tumor toxicity.
Introduction
Epidermal growth factor receptor (EGFR), a tyrosine kinase receptor whose activation leads to receptor dimerization and tyrosine autophosphorylation, mediates tumor cell survival and proliferation in lung cancer, colorectal cancer (CRC) and breast cancer (1–4). Monoclonal antibody inhibitors of EGFR have been developed for cancer therapy in the last two decades, including cetuximab and necitumumab approved by the U.S. Food and Drug Administration (FDA) for lung adenocarcinoma (LUAD) and squamous cell lung cancer (5, 6), and panitumumab and cetuximab approved by the FDA and European Medicines Agency (EMA) for the treatment of metastatic CRC (7). However, anti-EGFR therapy can be marred by chronic and disfiguring adverse reactions such as acne-like rash, abnormal hair growth, and ocular abnormalities, which could worsen patients’ quality of life and even trigger treatment termination (7).
Tumor associated antigens (TAAs) are important tumor targets for drug development with abnormal expression on tumor cells (8). While drugs targeting a single TAA face obstacles such as on-target, off-tumor toxicity and antigen escape, strategies targeting dual TAAs could improve selectivity for tumor cells and reduce drug toxicity (9), through stronger binding to tumor cells expressing both TAAs than to normal tissues expressing a single target. To find an optimal TAA target to combine with EGFR, we constructed a computational pipeline which integrated multiple single-cell RNA sequencing datasets of both tumor and normal tissues from public sources. From a pool of TAAs identified by single-cell analysis, transmembrane glycoprotein mucin 1 (MUC1) was distinguished as a candidate target with promising druggability, as affirmed by extensive literature review and evaluation of its therapeutic tractability.
Through analysis of an independent lung cancer single-cell dataset, we showed that co-expression of EGFR and MUC1 was specific to tumor cells in the tumor microenvironment. We then confirmed the expression patterns of EGFR and MUC1 in tumor tissues by immunohistochemistry (IHC). EGFR and MUC1 were co-expressed in 63 (98.4%) of the 64 patients with LUAD and in 76 (91.6%) of the 83 patients with CRC. In addition, we performed paired analysis of RNA sequencing and IHC data of patient-derived tumor xenograft (PDX) samples, and estimated that EGFR and MUC1 were expressed in 97.5% of LUAD samples of The Cancer Genome Atlas Program (TCGA) database. High expression of EGFR and MUC1 was associated with poor prognosis in LUAD and CRC. Together, our results illustrated the expression patterns of EGFR and MUC1 in tumor and normal tissues, and suggested that they were promising drug targets for developing cancer therapies targeting dual TAAs.
Materials and methods
Single-cell RNA sequencing data processing
For LUAD, a tumor-normal single cell meta-atlas was constructed through integrating single-cell RNA data of primary LUAD samples and normal lung tissue samples from public single cell datasets (Kim et al. (10), Qian et al. (11), E-Madissoon et al. (12), Braga et al. (13), Supplementary Table 1). An external single cell dataset from Wu et al. (14) was used for validation (Supplementary Table 1). Files of raw UMI count matrices were downloaded from GEO (15), Array Express (16) or author-referred websites and imported into Scanpy (17). To ensure data quality, cells with fewer than 3 detected genes, fewer than 200 unique molecular identifier (UMI) counts, or mitochondrial gene percentages greater than 20% were filtered out.
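The three quality-control thresholds above can be illustrated with a minimal per-cell sketch. It assumes each cell is a plain dictionary of raw UMI counts and that mitochondrial genes are recognized by the `MT-` prefix; the function name `passes_qc` and the prefix rule are illustrative assumptions, not the authors' Scanpy code.

```python
def passes_qc(counts, min_genes=3, min_umis=200, max_mito_frac=0.20):
    """Return True if one cell passes the QC thresholds described in the text.

    counts: dict mapping gene symbol -> raw UMI count for a single cell.
    Mitochondrial genes are assumed to start with "MT-" (an assumption).
    """
    total_umis = sum(counts.values())
    if total_umis == 0:
        return False
    detected_genes = sum(1 for c in counts.values() if c > 0)
    if detected_genes < min_genes or total_umis < min_umis:
        return False
    mito_umis = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    # Cells with more than 20% mitochondrial counts are filtered out
    return mito_umis / total_umis <= max_mito_frac


# Hypothetical cells for illustration only
good_cell = {"EGFR": 150, "MUC1": 100, "ACTB": 50, "MT-CO1": 10}
high_mito_cell = {"EGFR": 100, "MUC1": 50, "ACTB": 50, "MT-CO1": 100}
print(passes_qc(good_cell), passes_qc(high_mito_cell))
```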
Next, a global-scaling normalization method was applied, adjusting the read count in each cell to 10^6 and subsequently performing log-transformation. The top 2000 highly variable genes (HVG) were identified and used in the principal component analysis (PCA), from which the top 30 components were retained for downstream dimensional reduction and clustering analysis using default parameters.
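As a rough illustration of the global-scaling step, the sketch below rescales one cell's counts to a total of 10^6 and then applies a log transform. It assumes the natural-log form log(1 + x), which is the default behavior of Scanpy's `log1p` step; the base of the logarithm is not stated in the text, so this is an assumption, and `normalize_log_cpm` is a hypothetical helper name.

```python
import math


def normalize_log_cpm(counts, target_sum=1e6):
    """Scale one cell's raw counts to `target_sum` and apply log(1 + x).

    counts: dict mapping gene symbol -> raw count for a single cell.
    """
    total = sum(counts.values())
    if total == 0:
        # An empty cell stays at zero expression for every gene
        return {gene: 0.0 for gene in counts}
    return {
        gene: math.log1p(c / total * target_sum)
        for gene, c in counts.items()
    }


# Hypothetical two-gene cell: counts of 1 and 3 become 250,000 and 750,000
# counts per million before the log transform.
print(normalize_log_cpm({"EGFR": 1, "MUC1": 3}))
```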
Because the datasets were collected from different studies, the dataset ID was treated as the batch variable and corrected using a ridge-regularized linear regression model. Additionally, the BBKNN algorithm with default parameters was applied for evaluation and visualization (18).
For cell type annotation, a two-step process was employed. First, SingleR (19) was used for automatic cell annotation with HumanPrimaryCellAtlasData as the reference. Following this, curated markers or available annotation files from previous research (20) were manually applied to refine the cell type annotations based on the SingleR predictions (marker genes are listed in Supplementary Table 2). This resulted in the construction of a tumor-normal single-cell meta-atlas for the discovery of TAAs.
Malignant cell identification
The raw count matrices of tumor samples with manually annotated cell labels were imported into the CopyKAT (21) analysis with default parameters. Immune cells and fibroblasts from each sample served as normal cell references. Epithelial cells with diploid or undefined predictions were excluded from further analysis. In contrast, epithelial cells with aneuploid predictions were identified as malignant cells and retained in the filtered tumor-normal single-cell meta-atlas.
Random-forest construction
The filtered tumor-normal single-cell meta-atlas was randomly divided into training, testing, and validation datasets at a 6:2:2 ratio. A random forest classifier with 1000 trees was built using the R package ‘randomForest’ (22). The model was initially trained on the training dataset with 10-fold cross-validation and then evaluated on the testing dataset to assess performance. The average AUC score was calculated using the ROCR R package (23) to select the best-fitting model. Feature importance scores were calculated to determine the influence of genes in the final random forest model. The highest-ranked genes by MeanDecreaseAccuracy were overlapped with cell surface proteins from the in silico human surfaceome database (24) to extract the top 100 genes with surface expression.
Expressing cell fraction
The raw counts of each gene in the filtered tumor-normal single-cell meta-atlas were binarized to describe the expression pattern. A raw count greater than 0 was considered expressed (True), while a raw count of 0 was considered non-expressed (False). The expressing cell fraction (ECF) of a gene was calculated as the percentage of cells in a specific cell group in which the gene was expressed.
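The ECF definition above amounts to a simple proportion. A minimal sketch, assuming the raw counts of one gene across the cells of one group are available as a list (the function name is hypothetical):

```python
def expressing_cell_fraction(raw_counts):
    """Percentage of cells in a group with a raw count > 0 for one gene.

    raw_counts: list of raw counts of a single gene, one entry per cell.
    """
    if not raw_counts:
        return 0.0
    expressed = sum(1 for c in raw_counts if c > 0)
    return 100.0 * expressed / len(raw_counts)


# Hypothetical group of four cells, two of which express the gene
print(expressing_cell_fraction([0, 2, 0, 5]))  # 50.0
```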
LUAD, CRC, breast cancer tissues and normal samples
Paraffin sections of normal tissue samples for IHC experiments were sourced from Guilin Fanpu Biotech, Inc. with ethical approval for research, and the sample characteristics were provided in Table 1. Paraffin sections and tissue microarrays (TMAs) of human LUAD, CRC and breast cancer tissues from randomly and anonymously selected patients (LUAD, n=64; CRC, n=83; breast cancer, n=20) were provided by Shanxi Province Cancer Hospital with the patients’ informed consent. Patient characteristics, including age, sex, stage, tumor size, etc., were listed in Table 2. The pathological subtype of LUAD diagnosis was made in accordance with the 2015 World Health Organization Classification (25). The TMAs of LUAD and CRC PDX models were also obtained from Shanxi Province Cancer Hospital.
IHC
The IHC protocol for MUC1 and EGFR was as follows:
i) For antigen retrieval, sections were treated with retrieval solution (high pH) in Leica BOND-MAX for both EGFR and MUC1 at 98°C for 20 min before inhibiting endogenous peroxidase activity for 5 min at room temperature (RT) with Tris-EDTA/EGTA pH=9; ii) sections were incubated with a commercially available MUC1 antibody (clone MRQ17, 1:100; CNT) or anti-EGFR monoclonal antibody (clone EP22, 1:50; CNT) for 20 min at RT; iii) an enhanced labelled polymer system (Power Vision) with 3,3′-diaminobenzidine (DAB) was used; and iv) sections were counterstained with hematoxylin. Slides were then dehydrated and coverslipped.
Evaluation of immunohistochemical staining. Digital images of IHC-stained TMA slides were obtained at x20 magnification using a whole-slide scanner (PRECICE 500 slide scanner; UNIC). Tumor regions on slides were annotated. For each case, total EGFR or MUC1 immunohistochemical staining was evaluated under light microscopy. The staining results were expressed as the histochemistry score (H-score), which was determined by summing the products of the three staining intensity values (weak (1+), moderate (2+), and strong (3+) membranous staining) and the percentages of cells stained at each intensity in the slide, giving a range of 0 to 300.
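The H-score arithmetic can be written out as a short sketch. The function name and the input validation are illustrative additions; the weighting (1, 2 and 3 for weak, moderate and strong staining) follows the definition in the text.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """H-score = 1*(% weak) + 2*(% moderate) + 3*(% strong), range 0 to 300.

    Each argument is the percentage (0-100) of tumor cells stained at that
    intensity; unstained cells contribute nothing to the score.
    """
    percentages = (pct_weak, pct_moderate, pct_strong)
    if any(p < 0 for p in percentages) or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to <= 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


# Hypothetical slide: 10% weak, 20% moderate, 30% strong, 40% unstained
print(h_score(10, 20, 30))  # 140
```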
Bulk RNA sequencing of PDX samples
RNA of PDX samples was extracted using the Qiagen AllPrep DNA/RNA kit, followed by library construction using the TruSeq RNA sample preparation kit (Illumina) and paired-end sequencing on HiSeq 2000. Quality control of RNA sequencing data was conducted using FastQC (26) and MultiQC (27). Adapter trimming, as well as the removal of low-quality reads (quality score < 30) and short reads (< 30 bp), was performed using Trim Galore (28). Then, the reads were aligned to the Homo sapiens genome (Human GRCh38) using Hisat2 (29) with default parameters. SAM files were converted to BAM files using Samtools (30), and raw gene read counts were subsequently extracted from the BAM files using featureCounts (31).
Calculation of RNA cut-off values from IHC and RNA sequencing data of PDX samples
Using the paired RNA sequencing data and IHC scores from the same PDX samples, we aimed to determine the transcriptomic cut-off values for EGFR and MUC1 that would classify samples with positive or negative staining in IHC. Receiver operating characteristic (ROC) curves were drawn using the `roc` function of the R package `pROC` (32), based on the EGFR or MUC1 H-scores from IHC and the log2(TPM+1) values from RNA sequencing of the same PDX samples. The area under the ROC curve (AUC) was calculated, and the best threshold was set as the cut-off.
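To make the threshold selection concrete, the sketch below picks a cut-off by maximizing Youden's J (sensitivity + specificity - 1) over candidate thresholds placed midway between consecutive observed values. This is a simplified stand-in for the `pROC` workflow rather than a reimplementation of it; the function name, the midpoint candidates, and the example values are assumptions.

```python
def youden_cutoff(values, is_positive):
    """Return (threshold, J) maximizing Youden's J = sensitivity + specificity - 1.

    values: expression values such as log2(TPM+1), one per sample.
    is_positive: booleans, True when the paired IHC H-score is > 0.
    A sample is called positive when its value is above the threshold.
    """
    pos = [v for v, p in zip(values, is_positive) if p]
    neg = [v for v, p in zip(values, is_positive) if not p]
    uniq = sorted(set(values))
    if not pos or not neg or len(uniq) < 2:
        raise ValueError("need both classes and at least two distinct values")
    # Candidate thresholds: midpoints between consecutive distinct values
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    best_threshold, best_j = None, -2.0
    for t in candidates:
        sensitivity = sum(v > t for v in pos) / len(pos)
        specificity = sum(v <= t for v in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Hypothetical paired data: two IHC-negative and three IHC-positive samples
expr = [0.0, 1.0, 2.0, 3.0, 4.0]
ihc_positive = [False, False, True, True, True]
print(youden_cutoff(expr, ihc_positive))  # (1.5, 1.0)
```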
TCGA data processing
The TCGA-LUAD and TCGA-COAD datasets were downloaded via the TCGAbiolinks package (33). For the transcriptomic TPM expression matrix, the data category ‘Transcriptome Profiling’, data type ‘Gene Expression Quantification’, workflow type ‘STAR - Counts’ and the ‘tpm_unstrand’ values were selected. TPM values were converted into log2(TPM+1). For clinical information, the data category ‘Clinical’, data type ‘Clinical Supplement’, and data format ‘BCR XML’ were chosen. Additionally, phenotype files of TCGA-LUAD and TCGA-COAD were downloaded from the UCSC Xena website (34) to provide complementary clinical information. For patients with multiple samples, only the tumor sample (sample type code 01-09) with the latest plate value was retained according to the TCGA barcode guideline. The log-rank test was used to compare Kaplan-Meier survival curves between groups, utilizing the R package `survival` (35). Univariate and multivariate Cox proportional hazards models were applied to evaluate the hazard ratio, using the R package `survminer` (36).
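The per-patient sample selection can be sketched from the TCGA barcode layout (project, tissue source site, participant, sample and vial, portion and analyte, plate, center). The sketch assumes that "latest plate" means the lexicographically largest plate field, which is an interpretation and not stated in the text; the function name and the example barcodes are made up for illustration.

```python
def select_tumor_samples(barcodes):
    """Keep one tumor aliquot per patient: sample type 01-09, latest plate.

    barcodes: iterable of full TCGA aliquot barcodes, e.g.
    "TCGA-05-4244-01A-01R-1107-07".
    Returns a dict mapping patient ID -> selected barcode.
    """
    best = {}
    for barcode in barcodes:
        parts = barcode.split("-")
        if len(parts) < 6 or not parts[3][:2].isdigit():
            continue  # not a full aliquot barcode
        sample_type = int(parts[3][:2])
        if not 1 <= sample_type <= 9:
            continue  # codes 10 and above are normal or control samples
        patient = "-".join(parts[:3])
        plate = parts[5]
        # Assumption: the "latest" plate is the largest plate ID as a string
        if patient not in best or plate > best[patient][0]:
            best[patient] = (plate, barcode)
    return {patient: barcode for patient, (plate, barcode) in best.items()}


# Made-up barcodes for one patient: two tumor aliquots and one normal sample
example = [
    "TCGA-05-4244-01A-01R-1107-07",
    "TCGA-05-4244-01A-01R-A12B-07",
    "TCGA-05-4244-11A-01R-1107-07",
]
print(select_tumor_samples(example))
```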
Statistical analysis
The prognosis of LUAD and CRC patients was estimated through Kaplan-Meier survival curves. The criterion for statistical significance was p < 0.05 in all evaluations unless otherwise indicated.
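For readers unfamiliar with the Kaplan-Meier (product-limit) estimator, a minimal sketch follows. It is a generic textbook implementation offered for illustration, not the R `survival` code used in the study; the function name and the toy follow-up data are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored patients leave the risk set
    return curve


# Toy data: events at times 1, 3 and 4, one patient censored at time 2
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
# [(1, 0.75), (3, 0.375), (4, 0.0)]
```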
Results
TAA screening based on single-cell meta-atlas
To identify TAA targets, we constructed a tumor-normal single cell meta-atlas by integrating multiple single-cell transcriptome datasets of primary lung tumors and normal lung tissues (10–13) for analysis (Figure 1A). The ECF of each gene was compared with the average expression level for tumor and normal cell populations, and the genes that contributed most to distinguishing between individual malignant and normal tissues were selected by random forest analysis. Feature importance (FI) was gauged through the metric of ‘mean decrease in accuracy,’ a measure that reflected the model’s proficiency in aligning the surfaceome expression profiles with the initial labels of tumor or normal as annotated by our meta-atlas. The top 100 genes with the highest FI were selected for detailed literature review, with various aspects taken into consideration, such as drug development status, signaling pathway, gene expression and mutations. Among them, MUC1, which showed low expression in various normal cell types, was finally selected for subsequent evaluation (Figures 1B, C).
 
  Figure 1. Identification of MUC1 as a potential TAA target. (A) Overview of TAA discovery pipeline based on single-cell transcriptomic analysis. (B) Expression levels of MUC1 in normal cells (epithelial cell as an example) and tumor cells. (C) Expression levels (evaluated by ECF) of MUC1 in various types of normal cells.
Single-cell analysis confirmed the co-expression of EGFR and MUC1 in tumor cells
We performed single-cell analysis based on data from an independent study (14) to validate the co-expression of EGFR and MUC1 in cancer cells, as well as in other cells of the tumor microenvironment. We used the cut-off of raw count values > 0 for both genes to define co-expression at the single-cell transcriptomic level. Co-expression of EGFR and MUC1 was estimated in malignant cells and stromal cells, including fibroblasts, endothelial cells, normal epithelial cells and immune cells (conventional CD4 T cells, CD8 T cells, exhausted CD8+ T cells, proliferating T cells, NK cells, monocytes or macrophages). Co-expression of EGFR and MUC1 in stromal cells was lower than 10% (mostly under 5%) in every patient (Figure 2; Supplementary Table 3). By contrast, 44% of LUAD patients had more than 10% EGFR and MUC1 co-expression in malignant cells, which was significantly higher than the co-expression percentages in other cell populations (P-value = 2.088e-10). The co-expression of EGFR and MUC1 in tumor cells was higher than that of EGFR combined with targets under active drug development for lung cancer such as MET, HER3 (ERBB3) or Trop2 (TACSTD2) (Figure 2), suggesting a potential safety advantage of developing therapies targeting EGFR and MUC1.
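The co-expression rule (raw count > 0 for both genes in the same cell) can be sketched as follows. The function name and the toy counts are illustrative; in practice the calculation would be repeated per patient and per cell type.

```python
def coexpression_percent(counts_a, counts_b):
    """Percentage of cells with raw count > 0 for both genes.

    counts_a, counts_b: raw counts of two genes, aligned cell by cell
    within one cell population of one patient.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("count lists must be aligned cell by cell")
    if not counts_a:
        return 0.0
    both = sum(1 for a, b in zip(counts_a, counts_b) if a > 0 and b > 0)
    return 100.0 * both / len(counts_a)


# Toy population of five cells; cells 2 and 5 express both genes
egfr = [0, 3, 1, 0, 2]
muc1 = [1, 2, 0, 0, 4]
print(coexpression_percent(egfr, muc1))  # 40.0
```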
 
  Figure 2. Co-expression of MUC1 and EGFR in LUAD single-cell RNA sequencing data (GSE148071). Co-expression percentages of EGFR with MUC1 (A), MET (B), TROP2 (C), HER3 (D) in different cell clusters in the tumor microenvironment are shown as box plots. Gene expression was quantified as log2(TPM+1). Each dot represents one patient. The horizontal line in the box plot denotes the median, and the box denotes the 25th to 75th quantiles of the data.
IHC analysis of MUC1 and EGFR expression in normal tissues and tumor samples
To examine the protein levels of MUC1 and EGFR in normal tissues, we conducted IHC on a panel of tissue samples from different anatomic sites (Table 1). The IHC method for EGFR and MUC1 was established and optimized by screening for the most specific primary antibody and testing appropriate antibody concentrations (Supplementary Figure 1). EGFR showed positive staining in skin, liver, testicle, prostate, adrenal gland and esophagus, while MUC1 displayed more restricted staining in lung, breast, kidney and stomach (Figure 3). Importantly, the data suggested that these two targets did not coexist in any of the normal tissues examined, which is crucial to minimize potential on-target, off-tumor toxicity.
 
  Figure 3. IHC analysis of MUC1 and EGFR in normal human tissues. (A, B) Representative images of EGFR and MUC1 showing no co-expression in normal skin and lung tissues. (C) Intensity and H-score of EGFR and MUC1 in various normal tissues. The symbols represent the staining intensity: -, negative; +, weak; ++, moderate; +++, strong.
As aberrant expression of EGFR and MUC1 has been reported in various epithelial tumors (37–39), we carried out IHC experiments on LUAD, CRC and breast cancer tissues (Table 2). EGFR generally showed a membranous staining pattern, while MUC1 exhibited both membranous and cytoplasmic patterns, consistent with previous reports (40–42) (Figure 4A). Of the 64 clinical LUAD samples analyzed, EGFR protein expression was observed in 63 (98.4%) cases, and MUC1 stained positive in all of the samples. Among CRC clinical samples, 79 (95.2%) and 81 (97.6%) were positive for EGFR and MUC1, respectively. As EGFR and MUC1 were stained in consecutive sections with a thickness of 4 μm, which was smaller than the average diameter of tumor cells (7-20 μm in most cases), we could observe whether they were co-expressed in the same tumor cells (Figure 4B). By this approximate assessment, EGFR and MUC1 were detected on the same tumor cells in 63 (98.4%) LUAD and 76 (91.6%) CRC samples, counting samples in which at least 1% of tumor cells stained positive for both EGFR and MUC1 regardless of staining intensity.
 
  Figure 4. IHC analysis of EGFR and MUC1 in LUAD, CRC and breast invasive carcinoma (BRCA) clinical samples. (A) Representative images of EGFR and MUC1 staining profiles. The image in the black box on the right is a magnification of the black box on the left to visualize the tumor cells. (B) Representative images showing co-expression, differential expression and non-expression of EGFR and MUC1 in consecutive sections of LUAD and CRC samples. (C, D) Expression levels of EGFR and MUC1 quantified by H-score in LUAD, CRC and BRCA. (E, F) Percentages of samples with different cut-offs of H-score for EGFR (E) and MUC1 (F).
The H-score, which was the sum of the staining intensities multiplied by their corresponding cell percentages, averaged 207.7 (EGFR) and 285.9 (MUC1) in LUAD and 87.6 (EGFR) and 158.3 (MUC1) in CRC (Figures 4C–F). Using H-score ≥100 and H-score ≥200 as the cut-off values for medium and high expression of each protein, medium levels of EGFR and MUC1 were found in 54 (84.3%) LUAD and 19 (22.9%) CRC samples (Supplementary Figure 2). Furthermore, 60.9% of LUAD and 3.6% of CRC samples showed high levels of EGFR and MUC1 (Supplementary Figure 2). However, the staining intensity and percentage of EGFR were much weaker in breast cancer samples which were therefore not used for the following analysis (Figures 4C, E).
Taken together, IHC analysis demonstrated that EGFR and MUC1 were present at medium to high levels in the majority of LUAD and CRC samples, and in line with single-cell analysis, they tended to be co-expressed on tumor cells. Moreover, they did not show high expression or co-existence patterns in normal tissues.
EGFR-MUC1 as dual-TAA targets could potentially cover a large population of NSCLC patients
To estimate the size of the patient population with EGFR and MUC1 expression using the TCGA database, we calculated the cut-off values of RNA expression corresponding to the positivity of EGFR and MUC1 protein levels. We used PDX samples due to their similarity to primary tumors, and obtained RNA sequencing and IHC staining results from the same PDX samples. In 24 and 62 PDX samples of LUAD and CRC respectively, the staining patterns of EGFR and MUC1 were similar to those in primary clinical samples (Figures 5A, B). All LUAD samples (mean H-score=262) and all CRC samples (mean H-score=174) were positive for EGFR, while 95.8% of LUAD samples (mean H-score=202) and 100% of CRC samples (mean H-score=178) were positive for MUC1 (Figures 5C, D). The cut-off log2(TPM+1) values were determined to predict IHC positivity through ROC curve analysis, in which a larger AUC indicated a better predictive performance (a higher consistency between TPM and H-score). Figures 5E, F show the ROC curves and the AUC values for LUAD PDX samples, illustrating a high AUC value (0.97 by H-score >0) for EGFR and a medium AUC value (0.6 by H-score >0) for MUC1 in LUAD; the AUC values for both EGFR and MUC1 in CRC were low (data not shown). Therefore, we proceeded with analyses of EGFR and MUC1 in LUAD.
 
  Figure 5. Expression analysis of EGFR and MUC1 in PDX samples and TCGA database. (A, B) Representative images of EGFR (A) and MUC1 (B) staining profiles in PDX samples. (C, D) EGFR and MUC1 expression levels quantified by H-score in LUAD and CRC PDX samples. (E, F) ROC curves for EGFR (E) and MUC1 (F) in LUAD PDX samples which were drawn using H-score and RNA log2(TPM+1) data from the same PDX sample. AUCs and log2(TPM+1) threshold values corresponding to positive protein staining by Youden’s cut-off are also illustrated. (G) Estimated percentage of EGFR and MUC1 expression in LUAD patients in the TCGA database.
We estimated the optimal threshold value of RNA expression corresponding to protein positivity (H-score >0) using the roc function in the R package “pROC” and took it as the cut-off (43). The log2(TPM + 1) cut-off points of 2.23 for EGFR and 1.5 for MUC1 (Figures 5E, F) were applied to the TCGA database to define positivity. Consistent with our above results, 504 of 517 (97.5%) LUAD samples were classified as positive for EGFR, and MUC1 was classified as positive in all 517 cases (100%). Thus, the proportion of tumor samples positive for both EGFR and MUC1 was estimated to be 97.5% (Figure 5G), which indicated that a large population of LUAD patients could potentially benefit from EGFR-MUC1 dual-targeting therapies.
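Applying the two cut-offs to a cohort can be sketched as below. The cut-off values come from the text; the function name, the strict "greater than" comparison, and the example TPM values are assumptions made for illustration.

```python
import math

EGFR_CUTOFF = 2.23  # log2(TPM+1) cut-off for EGFR reported in the text
MUC1_CUTOFF = 1.5   # log2(TPM+1) cut-off for MUC1 reported in the text


def dual_positive_percent(samples):
    """Percentage of samples above both RNA cut-offs.

    samples: list of (egfr_tpm, muc1_tpm) tuples on the linear TPM scale.
    Whether the original analysis used > or >= is not stated; > is assumed.
    """
    if not samples:
        return 0.0
    positive = 0
    for egfr_tpm, muc1_tpm in samples:
        egfr_pos = math.log2(egfr_tpm + 1) > EGFR_CUTOFF
        muc1_pos = math.log2(muc1_tpm + 1) > MUC1_CUTOFF
        if egfr_pos and muc1_pos:
            positive += 1
    return 100.0 * positive / len(samples)


# Made-up TPM values: the first and last samples pass both cut-offs
cohort = [(10.0, 5.0), (1.0, 5.0), (10.0, 0.5), (20.0, 8.0)]
print(dual_positive_percent(cohort))  # 50.0
```

With the counts reported above, 504 dual-positive samples out of 517 correspond to 504 / 517 ≈ 97.5%.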
EGFR and MUC1 correlated with poor prognosis of LUAD and CRC patients
To further evaluate the clinical significance of EGFR and MUC1 expression, we analyzed the correlation between EGFR and MUC1 expression and the prognosis of LUAD and CRC patients through Kaplan-Meier survival curves. To predict the effect of EGFR and MUC1 on the survival of individuals who might need biological treatments, patients with advanced tumor stages (TNM tumor stage III/IV) from the TCGA database were selected for the survival analysis. In LUAD patients, high expression of EGFR or MUC1 alone showed a trend toward reduced overall survival (OS) and disease-free interval (DFI), although the differences were not statistically significant (Supplementary Figures 3A–D). Furthermore, LUAD patients with high expression of both EGFR and MUC1 had significantly worse prognosis than other patients (Figures 6A, B, p-value=0.033 for OS, p-value=0.027 for DFI). In CRC patients, prognosis of patients with high EGFR or MUC1 expression alone was slightly worse than that of patients with low expression (Supplementary Figures 3E–H), while patients with high expression of both EGFR and MUC1 had significantly worse OS than patients with low expression levels of both targets, with a similar but borderline trend for DFI (Figures 6C, D, p-value=0.045 for OS, p-value=0.053 for DFI). These results suggested that high expression of EGFR-MUC1 might serve as a prognostic factor for LUAD and CRC patients, and that the development of dual-targeting therapies might provide effective treatment options for patients with late-stage LUAD or CRC.
 
  Figure 6. Correlation of EGFR-MUC1 expression with prognosis of LUAD and CRC patients. Survival curves showing the association between EGFR-MUC1 expression and prognosis of LUAD (A, B) and CRC patients (C, D).
Discussion
The EGFR-dependent pathway has an important role in epithelial cancer biology, which has led to the development of cetuximab and necitumumab. However, severe skin toxicity, most likely due to the expression of EGFR in normal epithelium, hinders the widespread use of these drugs (7). To increase tumor selectivity and reduce tumor escape, dual-targeting therapies using antibodies, antibody-drug conjugates (ADCs) or chimeric antigen receptor T cells have emerged (44, 45), as targeting two TAAs simultaneously might better distinguish tumor cells from normal cells.
In this work, we screened TAA targets using a single-cell analysis pipeline, and found a potential target, MUC1, with high expression in tumor cells and relatively low or undetectable expression in normal tissues. MUC1 is a glycoprotein involved in the proliferation, metabolism, metastasis and invasion of multiple tumor types (46–49), and is overexpressed in various epithelial cancers with aberrant glycosylation (50). Whereas MUC1 protein is confined to the apical surface of epithelial cells under normal physiological conditions, it is distributed all over the cell surface and within the cytoplasm in tumor cells, which might improve the safety of drugs targeting MUC1 (47). Thus, MUC1 presents unique properties in cancer cells, including the aberrant glycosylation pattern, loss of polarity and overexpression, making it an attractive TAA target (47). However, the clinical efficacy of monotherapy targeting MUC1 was much lower than expected. The glyco-engineered humanized monoclonal antibody PankoMab-GEX showed no difference from placebo for the primary endpoint of progression-free survival (51). One possible reason could be that these anti-MUC1 antibodies were developed to target the N-terminal subunit (MUC1-N), which is usually shed from the cell surface and released into the peripheral blood. The shed MUC1-N could bind the antibodies and prevent them from binding to surface MUC1 (52).
Several studies have demonstrated that MUC1 drives EGFR expression and signaling in different cellular contexts (53–56). Through its C-terminus, MUC1 stimulated EGFR promoter activation, thereby increasing EGF-dependent signaling, spheroid survival and cellular proliferation (55). Furthermore, knockout of MUC1 in tumor cells resulted in higher sensitivity to EGFR inhibitors, and activated EGFR stimulated MUC1 expression in human uterine and pancreatic cancer cell lines (55). Therefore, EGFR-MUC1 dual-targeting therapies are predicted to improve the response of tumor cells, and analysis of EGFR and MUC1 co-expression patterns in clinical samples is essential to provide a solid rationale for drug development.
We investigated the expression patterns of EGFR and MUC1 in patients with LUAD and CRC by IHC, and observed high expression of EGFR and MUC1 in tumor tissues. In contrast, there was lower expression of both proteins in various normal tissues. In line with previous studies, co-expression of EGFR and MUC1 in the same tumor cells was found in most cases, including 63 (98.4%) LUAD and 76 (91.6%) CRC samples. Using the RNA cut-off values for EGFR and MUC1 positivity, we estimated that 97.5% of LUAD patients expressed EGFR and MUC1 based on data from 517 TCGA cases, which suggested that EGFR and MUC1 expression was prevalent in LUAD patients. Furthermore, high expression of EGFR and MUC1 was prognostic for poor survival of LUAD and CRC patients.
There are some limitations in our study. Firstly, although we showed the expression patterns of EGFR and MUC1 in a panel of normal tissues and various tumor tissues, the sample size was too limited to allow a thorough comparison between tumor and normal tissues. A larger collection of normal samples and tumor samples with paired tumor-adjacent tissues needs to be examined to better elucidate the potential toxicity of targeting EGFR and MUC1. Secondly, we did not present experimental data to confirm the efficacy and safety of the dual-targeting strategy in this study. However, other companies have recently reported preclinical data on bispecific ADCs targeting EGFR and MUC1, which suggested the great therapeutic potential of dual-targeting drugs for cancers co-expressing EGFR and MUC1, as well as EGFR- or MUC1-expressing tumors (57, 58).
In summary, through a generalizable pipeline for screening TAAs by single-cell transcriptomic and IHC analysis, MUC1 was selected as a candidate target to combine with EGFR, so as to increase tumor specificity over normal cells. We demonstrated that EGFR and MUC1 were co-expressed on tumor cells in the majority of LUAD and CRC clinical samples. High expression of EGFR and MUC1 was correlated with unfavorable prognosis of LUAD and CRC patients. Given that EGFR and MUC1 expression was present in a large patient population, our work has shed light on the prospects of developing EGFR-MUC1 dual-targeting therapies, such as bispecific antibodies and ADCs, which might improve tumor selectivity while reducing side effects on normal organs.
Data availability statement
The public datasets used in this study are available in the TCGA database (https://www.cancer.gov/tcga), ArrayExpress database (https://www.ebi.ac.uk/biostudies/arrayexpress), and Tissue Stability Cell Atlas (https://www.tissuestabilitycellatlas.org/). The details can be found in the article and its Supplementary Materials.
Ethics statement
The studies involving humans were approved by Shanxi Province Cancer Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. The animal study was approved by Shanxi Province Cancer Hospital. The study was conducted in accordance with the local legislation and institutional requirements.
Author contributions
HC: Investigation, Project administration, Validation, Writing – review & editing. QY: Investigation, Methodology, Writing – original draft, Writing – review & editing. QX: Project administration, Visualization, Investigation, Writing – review & editing. LZ: Investigation, Methodology, Validation, Writing – review & editing. WY: Data curation, Formal analysis, Validation, Writing – review & editing. YY: Data curation, Formal analysis, Validation, Writing – review & editing. ST: Formal analysis, Validation, Writing – review & editing. CLin: Project administration, Visualization, Writing – original draft. YZ: Data curation, Software, Visualization, Writing – review & editing. RZS: Methodology, Writing – review & editing. YM: Resources, Writing – review & editing. NY: Methodology, Writing – review & editing. HW: Resources, Writing – review & editing. FC: Methodology, Writing – review & editing. ML: Methodology, Writing – review & editing. JM: Resources, Writing – review & editing. CLiao: Conceptualization, Resources, Supervision, Writing – original draft. RFS: Conceptualization, Funding acquisition, Resources, Supervision, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by Central Government Funds for Guiding Local Scientific and Technological Development (YDZJSX2022B014) and the Project of Shanxi Provincial Key Research and Development Program (201803D421054) in China.
Conflict of interest
Authors QX, LZ, WY, YY, ST, CLin, YZ, and CLiao were employed by the company Shanghai Shengdi Medicine Co. Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2024.1433033/full#supplementary-material
Supplementary Figure 1 | Establishment of IHC method and evaluation of staining intensity. Representative examples of IHC method optimization (A) and positive samples with different staining intensities (B). (A) An optimized antibody concentration of 0.5 μg/mL was selected considering MUC1 staining sensitivity and specificity in tonsil tissue. (B) Membrane staining was scored as follows: 0 for no staining visible at a magnification of ×400; 1+ for light staining visible at a magnification of ×400; 2+ for intermediate staining visible at a magnification of ×400; and 3+ for dark staining of the linear membrane visible at a magnification of ×100.
Supplementary Figure 2 | Percentages of EGFR and MUC1 expression in LUAD (A) and CRC (B) samples according to different H-score cut-offs (H-score >0, H-score ≥100, H-score ≥200).
Supplementary Figure 3 | Correlation of EGFR or MUC1 expression with prognosis in LUAD and CRC patients. (A, B) Survival curves showing the association between EGFR expression and OS rate (A) or DFI (B) of LUAD patients. (C, D) Survival curves showing the association between MUC1 expression and OS rate (C) or DFI (D) of LUAD patients. (E, F) Survival curves showing the association between EGFR expression and OS rate (E) or DFI (F) of CRC patients. (G, H) Survival curves showing the association between MUC1 expression and OS rate (G) or DFI (H) of CRC patients.
Keywords: EGFR, MUC1, TAA, LUAD, CRC
Citation: Cui H, Yu Q, Xu Q, Lin C, Zhang L, Ye W, Yang Y, Tian S, Zhou Y, Sun R, Meng Y, Yao N, Wang H, Cao F, Liu M, Ma J, Liao C and Sun R (2024) EGFR and MUC1 as dual-TAA drug targets for lung cancer and colorectal cancer. Front. Oncol. 14:1433033. doi: 10.3389/fonc.2024.1433033
Received: 15 May 2024; Accepted: 23 October 2024;
Published: 27 November 2024.
Edited by:
Brian D. Adams, Brain Institute of America, United States
Reviewed by:
Qian Liu, Capital Medical University, China
Barani Kumar Rajendran, Yale University, United States
Copyright © 2024 Cui, Yu, Xu, Lin, Zhang, Ye, Yang, Tian, Zhou, Sun, Meng, Yao, Wang, Cao, Liu, Ma, Liao and Sun. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Ruifang Sun, sunruifang@sxmu.edu.cn; Cheng Liao, cheng.liao@hengrui.com
†These authors have contributed equally to this work