AUTHOR=Xu Yong, Wang Yao, Liang Leilei, Song Nan TITLE=Single-cell RNA sequencing analysis to explore immune cell heterogeneity and novel biomarkers for the prognosis of lung adenocarcinoma JOURNAL=Frontiers in Genetics VOLUME=Volume 13 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.975542 DOI=10.3389/fgene.2022.975542 ISSN=1664-8021 ABSTRACT=Background: Single-cell RNA sequencing is necessary to understand tumor heterogeneity, and the cell type heterogeneity of lung adenocarcinoma (LUAD) has not been fully studied. Method: We clustered the single-cell sequencing dataset GSE149655 and reduced its dimensionality. We then statistically analysed the subpopulations obtained by cell annotation to find those highly enriched in tumor tissues. Monocle was used to predict the developmental trajectory of five subpopulations, BEAM was used to find the regulatory genes of the five branches, q-values (qval) were used to screen the key genes, and CellChat was used to analyse cell communication among the subpopulations. Next, we used the differentially expressed genes from TCGA-LUAD to screen for overlapping genes and established a prognostic risk model. To assess the independence of the model in clinical application, univariate and multivariate Cox regression were used to estimate the hazard ratio (HR), its 95% confidence interval (CI) and the P value. Finally, the novel biomarker genes were verified by qPCR and immunohistochemistry. Results: The single-cell dataset GSE149655 was subjected to quality control, filtering and dimensionality reduction. In total, 23 subsets were screened and annotated by marker genes. Through statistical analysis of 11 subgroups, five important subgroups were selected. Cell trajectory and cell communication analyses showed that the interactions among these five subpopulations are complex and that communication between them is dense.
We screened the marker genes of these five subpopulations that are also differentially expressed in tumorigenesis, 462 genes in total, and constructed a 10-gene prognostic risk model based on these genes. The 10-gene signature is robust and achieves stable predictive performance in datasets from different platforms. The results showed that HLA-DRB5 expression was negatively correlated with the risk of LUAD, and CCDC50 expression was positively correlated with the risk of LUAD. Conclusions: We identified a prognostic risk model comprising CCL20, CP, HLA-DRB5, RHOV, CYP4B1, BASP1, ACSL4, GNG7, CCDC50 and SPATS2 as new biomarkers and verified their predictive value for the prognosis of LUAD; these genes may help to stratify patients and may serve as new therapeutic targets.
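
The abstract does not give the form of the risk score or the cutoff used to stratify patients. As a minimal sketch, assuming the common approach of a linear score (the sum of each signature gene's Cox coefficient multiplied by its expression) and a split at the cohort median, the idea can be illustrated as follows. The coefficients and expression values are hypothetical placeholders, not values from the study; only the signs for HLA-DRB5 (negative) and CCDC50 (positive) follow the direction reported in the abstract, and only two of the ten genes are shown for brevity.

```python
from statistics import median

# HYPOTHETICAL coefficients for illustration only; these are NOT the fitted
# values from the study. Only the signs follow the abstract: HLA-DRB5 is
# negatively and CCDC50 positively associated with risk.
HYPOTHETICAL_COEFS = {
    "HLA-DRB5": -0.10,
    "CCDC50": 0.25,
}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefs.items())


def stratify(scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for three hypothetical patients.
patients = {
    "P1": {"HLA-DRB5": 8.0, "CCDC50": 2.0},
    "P2": {"HLA-DRB5": 2.0, "CCDC50": 6.0},
    "P3": {"HLA-DRB5": 5.0, "CCDC50": 4.0},
}

scores = {pid: risk_score(expr, HYPOTHETICAL_COEFS) for pid, expr in patients.items()}
groups = stratify(scores)

for pid in patients:
    print(pid, round(scores[pid], 2), groups[pid])
```

In this toy cohort, the patient with low HLA-DRB5 and high CCDC50 expression receives the highest score and falls in the high-risk group, which matches the direction of association stated above. A real analysis would fit the coefficients with a Cox regression on survival data rather than fixing them by hand.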