A Diagnostic Model With IgM Autoantibodies and Carcinoembryonic Antigen for Early Detection of Lung Adenocarcinoma

Immunoglobulin M (IgM) autoantibodies, as the early appearing antibodies in humoral immunity when stimulated by antigens, might be excellent biomarkers for the early detection of lung cancer (LC). We aimed to develop a multi-analyte integrative model combining IgM autoantibodies and a traditional tumor biomarker that could be a valuable and powerful auxiliary diagnostic tool and might improve the accuracy of early detection of lung adenocarcinoma (LUAD). A customized protein array based on cancer driver genes was constructed and applied in the discovery cohort consisting of 68 LUAD patients and 68 normal controls (NCs); 31 differentially expressed IgM autoantibodies were identified. The top 5 candidate IgM autoantibodies [based on the area under the receiver operating characteristic curve (AUC) ranking], namely, TSHR, ERBB2, survivin, PIK3CA, and JAK2, were validated in the validation cohort using enzyme-linked immunosorbent assay (ELISA), which included 147 LUAD samples, 72 lung squamous cell carcinoma (LUSC) samples, 44 small cell lung carcinoma (SCLC) samples, and 147 NCs. These indicators presented diagnostic capacity for LUAD, with AUCs of 0.599, 0.613, 0.579, 0.601, and 0.633, respectively (p < 0.05). However, none of them showed a significant difference between the SCLC and NC groups, and only the IgM autoantibody against JAK2 showed a higher expression in LUSC than in NC (p = 0.046). Through logistic regression analysis, with the five IgM autoantibodies and carcinoembryonic antigen (CEA), one diagnostic model was constructed for LUAD. The model yielded an AUC of 0.827 (sensitivity = 56.63%, specificity = 93.98%). The diagnostic efficiency was superior to that of either CEA (AUC = 0.692) or IgM autoantibodies alone (AUC = 0.698). Notably, the accuracy of this model in early-stage LUAD reached 83.02%. In conclusion, we discovered and identified five novel IgM indicators and developed a multi-analyte model combining IgM autoantibodies and CEA, which could be a valuable and powerful auxiliary diagnostic tool and might improve the accuracy of early detection of LUAD.


INTRODUCTION
With estimates of 2.2 million new cases and 1.8 million deaths, lung cancer (LC) is the second most common cancer and the leading cause of cancer death worldwide, approximately accounting for one-tenth (11.4%) of cancer occurrence and one-fifth (18.0%) of cancer deaths (1). The 5-year survival rate is 57% for patients with localized tumors, while this decreased to 5% for patients at the metastatic stage (2), which indicates that the high mortality rate of LC is closely related to cancer stage. Traditionally, LC is classified into non-small cell lung carcinoma (NSCLC) and small cell lung carcinoma (SCLC). There are two major types of NSCLC: lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) (3). LUAD, as the most common lung malignancy, is frequently found in women and non-smokers (3,4). A large number of studies have shown that, if LC patients with adenocarcinoma in situ (AIS) and minimally invasive adenocarcinoma (MIA) could undergo radical surgery, their 5-year disease-free survival rate may approach 100% (5). Therefore, the early diagnosis and treatment of LUAD are essential to reduce the mortality of LC.
Currently, low-dose spiral CT (LDCT) and pathological tissue biopsy are used to screen and detect LC patients clinically, but the former has high false-positive rates and the latter is traumatic, which causes some excessive diagnosis, unnecessary tests, invasive procedures, and, rarely, radiation-induced cancers (6). In recent years, serological biomarkers have received widespread attention because of their advantages, such as being simple, noninvasive, and easily accepted by patients (7,8). Traditional serum tumor markers had been used in the auxiliary diagnosis of cancers in clinical practice, such as carcinoembryonic antigen (CEA), cancer antigen 125 (CA-125), and cytokeratin-19 fragment (CYFRA 21-1), but their diagnostic ability was limited by their unsatisfactory:: sensitivity and specificity for LC (9,10).
Tumors are the products of the malignant transformation of normal cells, which are characterized by continuous proliferation and metastasis in the body. The prominent feature of tumor cells in immunology is the appearance of certain tumor-associated antigens that are not visible or have low expressions in normal cells of the same type (11). Due to the presence of tumorassociated antigens, it is bound to be recognized by the body's immune system and thus stimulate adaptive immune responses, including cellular immunity and humoral immunity (12,13). Immunoglobulin G (IgG) and immunoglobulin M (IgM) autoantibodies are produced as an important part of humoral immunity and are secreted into the blood. Recent researches have provided substantial evidence that patients with cancers could develop humoral immune response and then produce autoantibodies in the early stage even before cancer diagnosis (14,15). Therefore, as the primary and secondary response products, IgM and IgG autoantibodies have great potential as early diagnostic indicators of LC. Related studies on IgG autoantibodies in the early diagnosis of LC have made admirable progress (16)(17)(18). However, studies regarding IgM autoantibodies are limited. Thus, more research is needed to provide evidence for IgM as an earlier indicator for discriminating LC patients and normal individuals.
In the present study, we aimed to screen valuable IgM autoantibody indicators for LUAD by protein array and verify them in another sample cohort with enzyme-linked immunosorbent assay (ELISA). Ultimately, the five candidate IgM autoantibodies and CEA were integrated to construct a diagnostic model to improve the diagnostic efficiency for LUAD. The diagnostic model might be able to improve the treatment status of LC patients and increase their survival rate.

Study Population and Serum Collection
All serum samples included in this study were obtained from the Specimen Biobank in Henan Key Medical Laboratory of Tumor Molecular Biomarkers collected from a provincial hospital in Zhengzhou, Henan Province, China, between 2016 and 2019. Two independent sample cohorts (a discovery cohort and a validation cohort) were used in this research. The discovery cohort consisted of 68 LUAD patients (LUADs) and 68 normal controls (NCs) matched by gender and age. In addition, 147 LUADs, 147 matched NCs, 72 LUSC patients, and 44 SCLC patients were included in the validation cohort. The blood samples of all LC patients were drawn upon their first diagnosis without any other cancers, antitumor treatment, and autoimmune diseases. All NCs were individuals who had a health checkup without history of cancer, pulmonary diseases, and autoimmune diseases. The sera were extracted and stored according to standard protocols (19). The study was approved by the Medical Ethics Committee of Zhengzhou University, and all the patients and NCs signed an informed consent before their participation in the study.
The serum CEA test results were provided by the laboratory of the hospital. It was obtained using the MODULARE70 automatic analyzer and supporting kits produced by Roche in Switzerland. The principle was electrochemiluminescence. The experimental operations were carried out by professional and technical personnel. Moreover, the results were released after inspection by experienced laboratory physicians.

Selection of Candidate IgM Autoantibodies
Analysis of the receiver operating characteristic (ROC) of each autoantibody in LUADs and NCs was applied based on the SNR value of each samples. Thirty-one IgM autoantibodies were screened and showed higher distinguishing values in LUADs [area under the ROC curve (AUC) > 0.5, p < 0.05]. In view of the optimal choice in terms of the number of indicators when constructing the model, we set the top 5 based on AUC rankings (TSHR, ERBB2, survivin, PIK3CA, and JAK2) among the 31 differential autoantibodies as the candidate IgM indicators (Supplementary Table S1).

Enzyme-Linked Immunosorbent Assay
ELISA was used to detect the serum levels of candidate autoantibodies discovered by the protein array. The five purified recombinant proteins (TSHR, ERBB2, survivin, PIK3CA, and JAK2) were coated at an optimal concentration of 0.125 mg/ml (50 ml/well) at 4°C overnight. After blocking with 2% BSA buffer (50 ml/well) at 4°C overnight, all serum samples diluted 1:100 were piped into the antigen-coated wells (50 ml/ well) and incubated at 37°C for 1 h. Then, the plates were washed three times with PBST. Subsequently, the horseradish peroxidase (HRP)-conjugated anti-human IgM antibody (1:2,000 dilution) or the HRP-conjugated anti-human IgG antibody (1:5,000 dilution) was added into each microplate well (50 ml/well) and incubated at 37°C for 1 h, followed by washing five times with PBST. TMB (3,3′,5,5′-tetramethylbenzidine) substrate solution was used as the detecting reagent (50 ml/well), and the enzymesubstrate reaction was terminated by the addition of a sulfuric acid solution (25 ml/well). Finally, the color change was measured spectrophotometrically at wavelengths of 450 and 620 nm. The difference of the optical density (OD) values at 450 and 620 nm was applied in further statistical analysis. To ensure the reliability of the serum levels of the IgM autoantibodies, a positive control, a negative control, and two blank controls were used in every plate and the samples of each group were dispersed on the plates.

Statistical Analysis
Data analysis and visualization was performed using DAVID (Database for Annotation, Visualization, and Integrated Discovery), SPSS Statistics 26.0, GraphPad Prism 8.0, and R-4.0.0 software. The c 2 test and Student's t-test were applied to compare the differences in the characteristics between two cohorts. Besides, differences in the levels of autoantibodies in LC patients and NCs were analyzed using the non-parametric Mann-Whitney U test. Logistic regression was employed to construct a diagnostic model for LUAD combining autoantibodies and CEA. ROC analysis and the AUC with 95% confidence interval (CI) were used to evaluate the diagnostic performance of the biomarkers and the model. The sensitivity and specificity were determined based on the cutoff value, which was defined as the OD value at the maximum Youden's index, while specificity is more than 90%. In all tests, p < 0.05 (twotailed) was considered statistically significant.

Overall Study Design and Characteristics of the Study Subjects
In the whole study, we employed a three-phase strategy to identify IgM autoantibodies for the early diagnosis of LUAD and to construct a diagnostic model combining potential IgM autoantibodies and CEA ( Figure 1). In phase I, the serum samples from the discovery cohort composed of 68 LUADs and 68 matched NCs were individually profiled on the protein array for the screening of IgM autoantibodies. A total of 31 IgM autoantibodies were observed to have higher intensity in the sera of LUADs than in NCs. In phase II, the top 5 (AUC ranking) of the 31 overexpressed IgM autoantibodies were tested in a larger cohort comprising sera from 147 LUADs, 147 matched NCs, 72 LUSC patients, and 44 SCLC patients using indirect ELISA. In phase III, we selected 83 LUADs with CEA results and 83 NCs matched by age and sex from the validation cohort. Based on the CEA and IgM autoantibody results of these populations, a diagnostic model was built, which showed a much improved performance using logistic regression.
The clinical characteristics of all the subjects involved in this study are described in Table 1. There were no significant differences in sex, smoking history, TNM stage, status of lymph node metastasis, and distant metastasis in LUADs of the two cohorts.

Human Protein Array Analysis
In order to identify candidate IgM autoantibodies, we assembled the discovery cohort of 136 serum samples collected from 68 LUADs and 68 matched NCs. The levels of IgM autoantibodies against 154 recombinant proteins were evaluated in the sera using the protein array. As shown in Figure 2A, the 154 proteins, the anti-human IgM (positive control), and a buffer (negative control) were arranged in order in the protein chips. The stability test results revealed that the average repeatability between the different batches of chips was 0.98, manifesting that the overall  stability was good ( Figure 2B). Furthermore, as exhibited in the representative results from LUADs and NCs ( Figures 2C, D), the IgM response of LUADs was apparently stronger than that of NCs. The positive and negative controls ensured the credibility of the experimental results. Based on the criteria AUC > 0.5 and p < 0.05, 31 of the screened IgM autoantibodies showed higher expressions in LUADs, but much less in NCs ( Figure 3A). Supplementary Table S1 illustrates the characteristics and functions of these 31 IgM autoantibodies, which were functionally relevant in tumorigenesis. Subsequently, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) analyses of the corresponding genes were performed with DAVID. KEGG pathway enrichment analysis showed that the proteins were mostly involved in pathways in cancer, PI3K-Akt signaling pathway and Ras signaling pathway ( Figure 3B). GO enrichment analysis revealed that the main biological processes involved cell proliferation, cell migration, regulation of angiogenesis, and adaptive immune response ( Figure 3C), whereas the molecular functions of these proteins were primarily related to protein binding, ATP binding, and receptor signaling protein tyrosine kinase activity ( Figure 3D). As for the cellular component terms in the GO analysis, the proteins  were mainly localized in the nucleus, cytoplasm, and plasma membrane ( Figure 3E).

Performance of Five Candidate IgM Autoantibodies in the Discovery and Validation Cohorts
Of the 31 aforementioned IgM autoantibodies, we selected the top 5 (AUC ranking) as the candidate biomarkers. In the discovery cohort, the levels of IgM autoantibodies against TSHR, ERBB2, survivin, PIK3CA, and JAK2 (the top 5 in AUC ranking) were significantly higher in LUADs than those in NCs ( Figure 4). The AUCs of the single IgM autoantibodies ranged from 0.665 to 0.758. Similar results were also found in another independent cohort (validation cohort) that included 147 LUADs and 147 matched NCs ( Figure 5), with AUCs ranging from 0.579 to 0.633. These results indicated the consistency of the results from the protein array and from ELISA. Moreover, the specificities of the five IgM indicators in LUADs were investigated in 72 LUSC and 44 SCLC patients. Only the titer of IgM autoantibodies against JAK2 in LUSC patients was higher than in NCs (p = 0.046) ( Figure 5A). Moreover, the diagnostic performance of each IgM autoantibody in LUADs of the validation cohort was assessed using a series of evaluation indices in diagnostic experiments ( Table 2). Regrettably, the individual IgM autoantibodies showed poor diagnostic value, with sensitivities ranging from 9.52% to 17.01% when the specificities were set higher than 90%. The diagnostic accuracies only ranged from 51.70% to 54.76%. Besides, the levels of the five IgM autoantibodies in LUADs with different clinical characteristics (clinical stage, gender, age, smoking history, and condition of lymph node metastasis, and distant metastasis) were stratified and analyzed in Supplementary Figure S1.
Since IgG autoantibodies to tumor-associated antigens have been frequently investigated, we also conducted an analysis on the five IgG autoantibodies. As shown in Supplementary Figure  S2, three IgG autoantibodies (ERBB2, JAK2, and PIK3CA) were found to be significantly higher in LUADs than in NCs (AUC > 0.05, p < 0.05).

Construction of a Diagnostic Model Combining IgM Autoantibodies and CEA
Given the insufficiency of a single IgM autoantibody for LUAD detection, we selected and matched the populations (83 LUADs and 83 NCs) with CEA results from the validation cohort to construct a diagnostic model that combines candidate IgM autoantibodies and CEA. Binary logistic regression was adopted to combine the IgM autoantibodies with CEA. It was found that the AUC of CEA alone was only 0.692 and that of the five IgM autoantibodies (combined) was 0.698. However, the AUC of the model that included both five IgM autoantibodies and CEA reached 0.827 ( Figure 6A). This model effectively improved the diagnostic performance in LUADs. Moreover, the AUCs of this model in early-and advanced-stage LUADs reached 0.774 and 0.860, respectively (Figures 6B, C).

Application of the Diagnostic Model Combining IgM Autoantibodies and CEA in Stratified Diagnosis of LUAD
LUADs were stratified according to their clinical characteristics of tumor stage, lymph node metastasis, and distant metastasis. ROC analysis and evaluation of the diagnostic performance were performed for every subgroup and in NCs ( Table 3). The results revealed that the LUADs in every subgroup could be significantly distinguished from NCs based on this diagnostic model. In addition, the AUCs in LUADs with early-stage disease, lymph node metastasis (−), and distant metastasis (−) were higher than 0.7 (0.744, 0.804, and 0.769, respectively). Moreover, the AUCs in LUADs with advancedstage disease, lymph node metastasis (+), and distant metastasis (+)   Table 3). Interestingly, the model that included five IgM autoantibodies and CEA possessed great diagnostic value in earlystage LUAD (AUC = 0.744, p < 0.05) ( Table 3), but the model that included three IgG autoantibodies and CEA had no diagnostic value in early-stage LUAD (AUC = 0.549, p > 0.05) ( Supplementary  Table S2).

DISCUSSION
In recent studies, the EarlyCDT-Lung test has achieved remarkable results in distinguishing LC patients from high-risk The optical density (OD) value at the maximum Youden's index while specificity is more than 90% was defined as the cutoff value. Sen, sensitivity; Spe, specificity; AUC, area under the receiver operating characteristic curve; CI, confidence interval; LUAD, lung adenocarcinoma; OD, optical density; PPV, positive predictive value; NPV, negative predictive value; +LR, positive likelihood ration; −LR, negative likelihood ration. subjects. The panel, which includes seven IgG autoantibodies, demonstrated sensitivity and specificity values for newly diagnosed LC of about 40% and 90%, respectively (22)(23)(24).
Regrettably, this detection method has not been widely used clinically due to its limitation of low sensitivity. Theoretically, IgM autoantibodies, as the first antibodies produced by the immune response, may be more suitable for screening indicators of early cancer diagnosis. The study by Fitzgerald et al. found that the IgM autoantibodies against CADM1, ICLN, SEC16, and ZNF768 were increased in the serum of colorectal cancer patients and demonstrated that these IgM autoantibodies are potential biomarkers for colorectal cancer (25). However, research works on the screening of autoantibody biomarkers for LC diagnosis and treatment have mainly focused on IgG autoantibodies (26)(27)(28). Notably, a study has shown that IgM autoantibodies play a crucial part in the immunosurveillance mechanisms against malignant epithelial cells (29). There is an urgent need to explore the expressions of IgM autoantibodies in LC patients for the screening of indicators that have the potential to identify malignant lesions at an earlier stage. A fundamental target in cancer research is to understand the mechanisms of cell transformation. Cancer driver genes are involved in tumorigenesis and development across cancer types (30). Hence, IgM autoantibodies against proteins encoded by cancer driver genes might be potential biomarkers for the detection of LUAD. To test this hypothesis, we customized the protein array based on the 138 cancer-driving genes revealed in the genomic landscapes of common forms of human cancer by Vogelstein et al. (31) and endeavored to explore the expressions of IgM autoantibodies in the sera of LUADs and NCs in order to screen for indicators with potential diagnostic value. In phase I of this study, there was a significantly higher IgM autoantibody response to 31 antigens in LUADs than in NCs. In addition, the results indicated that the top 5 IgM autoantibodies (TSHR, ERBB2, survivin, PIK3CA, and JAK2) were capable of identifying patients with LUAD from NC individuals, whether it is the small sample screening data of the protein chip or the large sample verification data of ELISA. Importantly, these indicators showed excellent specificity in LUADs, but without increasing in other subtypes of LC.
ERBB2 is a member of the epidermal growth factor (EGF) receptor family of receptor tyrosine kinases, and its IgG autoantibody has been extensively assessed as a potential diagnostic biomarker for gastric cancer (32) and breast cancer (33). In a recent study, the determination of autoantibodies to TSHR presented helpful in the diagnosis and management of patients with autoimmune thyroid disease (34). In our previous study, potential IgG autoantibodies were screened in 100 serum samples from LC patients and 50 sera from NCs using focused protein array based on cancer driver genes. Interestingly, higher levels of IgG autoantibodies against survivin, PIK3CA, and JAK2 were observed in LC patients than in NCs (19). Therefore, IgM  autoantibodies to TSHR, ERBB2, survivin, PIK3CA, and JAK2 may also be used as potential biomarkers for LUAD. In phase II of the present study, we found that singleautoantibody indicators always showed an insufficient diagnostic ability, as in previous studies (35)(36)(37). Moreover, the AUCs of the five IgM and CEA were only 0.698 and 0.692, respectively. However, combining the five IgM autoantibodies with CEA by logistic regression resulted in a significant improvement in the AUC, sensitivity, and specificity, as well as the positive predictive value (PPV) and negative predictive value (NPV) of the assay. Besides, it is worth noting that the diagnostic accuracy in early-stage LUADs reached 83.02%. The results were better than that of the currently used EarlyCDT-Lung test with an accuracy of about 68% (22,23,38). Therefore, integrating two different types of biomarkers tends to have better diagnostic performance than that of a panel of multiple indicators of a single type.
Interestingly, the diagnostic performance of the model with five IgM autoantibodies and CEA (AUC = 0.827) was a little higher than that of the model with three IgG autoantibodies and CEA (AUC = 0.781). Additionally, for early-stage LUAD, the model with five IgM autoantibodies and CEA showed great diagnostic value, but the model with three IgG autoantibodies and CEA had no diagnostic value. This would justify IgM autoantibodies as having the advantage of potentially identifying malignant lesions at an earlier stage than IgG autoantibodies, as reported in the study of Pedchenko et al. (39), where high titers of IgM lung cancer-associated autoantibodies in the serum of patients 2 years before clinical presentation were detected. The diagnostic model that combined IgM and CEA may be a great auxiliary diagnostic tool for LUAD, especially for early-stage LUAD. Meanwhile, the predictive probability value of the diagnostic model increased with the aggressiveness of LUAD, which indicated that the model may have certain predictive power for the condition of patients.
Our study displayed several novel features. Firstly, we adopted a new modality of a customized protein array based on cancer driver genes to improve the possibility of screening potential IgM autoantibodies, which yielded satisfactory results. Secondly, ELISA was applied to validate the candidate IgM autoantibodies in an independent validation cohort with the aim of identifying robust LUAD biomarkers. Thirdly, the model constructed with the combination of IgM autoantibodies and CEA improved the diagnostic value of the existing clinical auxiliary diagnostic method, which implied that multi-analyte tests could ameliorate the problem of the insufficient diagnostic value of biomarkers.
There were certain limitations in this retrospective study. In future work, we will longitudinally explore the levels of these IgM indicators using pre-diagnostic longitudinal sera from LUADs. Moreover, we will verify the performance of the diagnostic model in a larger sample cohort to evaluate its capability in the early stage of LUAD.
In conclusion, we performed a meaningful approach of a protein array assay to identify novel IgM autoantibodies for the detection of LUAD. The diagnostic model that combined five IgM autoantibodies and CEA may be a more accurate method for the early detection of LUAD and possess certain predictive power for the severity of tumor.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Zhengzhou University. The patients/participants provided written informed consent to participate in this study. Written informed consent was obtained from individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
LD was responsible for study conception and design. LD and JZ provided administrative support. TW, LP, FL, XiZ, and XW contributed to the provision of study materials or patients. XuZ, JL, YW, ML, and DJ collected and organized data. XuZ analyzed and interpreted the data and wrote the manuscript. All authors contributed to the article and approved the final manuscript.