Bioinformatics Analysis of Omics Data for Biomarker Identification in Clinical Research

240.8K views, 283 authors, 45 articles, 5 editors

Breast cancer represents the number one cause of cancer-associated mortality globally. The most aggressive molecular subtype is triple-negative breast cancer (TNBC), for which limited therapeutic options are available. It is well known that breast cancer prognosis and tumor sensitivity to immunotherapy are shaped by the tumor microenvironment. Breast cancer gene expression profiles were extracted from the METABRIC dataset, and two TNBC clusters displaying distinct immune features were identified. Activated immune cells formed a large proportion of cells in the high-infiltration cluster, which correlated with a good prognosis. Differentially expressed genes (DEGs) between the two heterogeneous subtypes were used to further explore the underlying immune mechanism and to identify prognostic biomarkers. Functional enrichment analysis revealed that the DEGs were predominantly related to processes involved in the activation and regulation of innate immune signaling. Using network analysis, we identified two modules from which genes were selected for further prognostic investigation. Validation in independent datasets revealed that CXCL9 and CXCL13 were good prognostic biomarkers for TNBC. We also compared these two genes with immune markers (CYT, APM, TILs, and TIS) and with checkpoint marker expression, and found statistically significant correlations in both the METABRIC and TCGA datasets. The potential of CXCL9 and CXCL13 to predict chemotherapy sensitivity was also evaluated. We found that CXCL9 and CXCL13 were good predictors of chemotherapy response, with higher expression in chemotherapy-responsive patients than in non-responsive patients. In brief, immune infiltrate characterization of TNBC revealed heterogeneous subtypes with unique immune features, which allowed for the identification of informative and reliable characteristics representative of the local tumor immune microenvironment; these characteristics are potential candidates to guide the management of TNBC patients.
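
As a rough illustration of the marker comparison described in this abstract, the sketch below computes a cytolytic activity (CYT) score and correlates it with a chemokine gene. It assumes the commonly used definition of CYT as the geometric mean of GZMA and PRF1 expression, and it assumes a Spearman rank correlation; the abstract does not state which definition or statistic the authors used. The expression values are made-up toy numbers, not data from METABRIC or TCGA.

```python
import numpy as np

def cyt_score(gzma, prf1):
    """Cytolytic activity, assumed here to be the geometric mean of GZMA and PRF1."""
    gzma = np.asarray(gzma, dtype=float)
    prf1 = np.asarray(prf1, dtype=float)
    return np.sqrt(gzma * prf1)

def rank(values):
    """1-based ranks; tied values share their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])

# Toy expression values for five hypothetical samples (illustrative only).
gzma = [4.0, 9.0, 16.0, 1.0, 25.0]
prf1 = [1.0, 4.0, 4.0, 1.0, 16.0]
cxcl9 = [3.0, 8.0, 10.0, 0.5, 40.0]

cyt = cyt_score(gzma, prf1)
rho = spearman(cxcl9, cyt)
print("CYT per sample:", cyt)
print("Spearman rho (CXCL9 vs CYT):", rho)
```

A real analysis would also report a p-value and adjust for multiple comparisons across markers, which this sketch leaves out.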

6,259 views, 16 citations

Background: Atherosclerotic cardiovascular diseases account for a quarter of global deaths. Most of these fatal diseases, such as coronary atherosclerotic disease (CAD) and stroke, occur in the advanced stage of atherosclerosis, for which candidate therapeutic targets have not been fully established. This study aims to identify hub genes and possible regulatory targets involved in the treatment of advanced atherosclerotic plaques.

Material/Methods: Microarray datasets GSE43292 and GSE28829, each containing an advanced atherosclerotic plaque group and an early lesion group, were obtained from the Gene Expression Omnibus database. Weighted gene co-expression network analysis (WGCNA) was conducted to identify advanced plaque-related modules. Module conservation analysis was applied to assess the similarity of advanced plaque-related modules between GSE43292 and GSE28829. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of these modules were performed with Metascape. Differentially expressed genes (DEGs) were mapped to the advanced plaque-related modules, and the module membership values of the DEGs in each module were calculated to identify hub genes. Hub genes were further validated for their expression in atherosclerotic samples, their capacity to distinguish CAD, and their potential functions in advanced atherosclerosis.

Results: The lightgreen module (MElightgreen) in GSE43292 and the brown module (MEbrown) in GSE28829 were identified as advanced plaque-related modules. Conservation analysis showed high similarity between these two modules. GO and KEGG enrichment analyses revealed that genes in both MElightgreen and MEbrown were enriched in immune cell activation, secretory granules, cytokine activity, and immunoinflammatory signaling. RBM47, HCK, CD53, TYROBP, and HAVCR2 were identified as common hub genes. They were validated to be upregulated in advanced atherosclerotic plaques, to distinguish CAD patients from non-CAD individuals well, and to regulate immune cell function-related mechanisms in advanced atherosclerosis.

Conclusions: We have identified RBM47, HCK, CD53, TYROBP, and HAVCR2 as immune-responsive hub genes related to advanced plaques, which may provide potential intervention targets to treat advanced atherosclerotic plaques.
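
The module membership step mentioned in the methods can be sketched as follows. In WGCNA, a module eigengene is the first principal component of the module's standardized expression matrix, and a gene's module membership is its Pearson correlation with that eigengene. The code below is a minimal NumPy illustration on simulated data, not the authors' pipeline (which would typically use the WGCNA R package); the function names, the simulated matrix, and the 0.8 cutoff are all assumptions made for the example.

```python
import numpy as np

def module_eigengene(expr):
    """First principal component of a samples x genes module matrix."""
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    me = u[:, 0]
    # Orient the eigengene so it correlates positively with average expression.
    if np.corrcoef(me, z.mean(axis=1))[0, 1] < 0:
        me = -me
    return me

def module_membership(expr, me):
    """Pearson correlation of each gene (column) with the module eigengene."""
    return np.array(
        [np.corrcoef(expr[:, j], me)[0, 1] for j in range(expr.shape[1])]
    )

# Simulated module: five genes share one latent signal with increasing noise.
rng = np.random.default_rng(0)
n_samples = 30
driver = rng.normal(size=n_samples)
noise_sd = np.array([0.1, 0.3, 0.6, 1.0, 3.0])
expr = driver[:, None] + rng.normal(size=(n_samples, 5)) * noise_sd

me = module_eigengene(expr)
kme = module_membership(expr, me)

# Genes with high membership are hub candidates; 0.8 is an illustrative cutoff.
hub_candidates = np.where(np.abs(kme) > 0.8)[0]
print("module membership:", np.round(kme, 3))
print("hub candidate indices:", hub_candidates)
```

In this toy setup the low-noise genes track the eigengene closely and the noisiest gene does not, which mirrors how hub genes are separated from peripheral module members.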

6,035 views, 31 citations
Methods
14 December 2020
A t-SNE Based Classification Approach to Compositional Microbiome Data
Xueli Xu, 3 more, and Ximing Xu
Figure: Overview of the procedure of our approach.

As a data-driven dimensionality reduction and visualization tool, t-distributed stochastic neighbor embedding (t-SNE) has been successfully applied in a variety of fields. In recent years, it has also received increasing attention for classification and regression analysis. This study presented a t-SNE based classification approach for compositional microbiome data, which enabled us to build classifiers and classify new samples in the reduced-dimensional space produced by t-SNE. The Aitchison distance was employed to modify the conditional probabilities in t-SNE to account for the compositionality of microbiome data. To classify a new sample, its low-dimensional features were obtained as the weighted mean vector of its nearest neighbors in the training set. Using the low-dimensional features as input, three commonly used machine learning algorithms were considered for the classification tasks in this study: logistic regression (LR), support vector machine (SVM), and decision tree (DT). The proposed approach was applied to two disease-associated microbiome datasets and achieved better classification performance than classifiers built in the original high-dimensional space. The analytic results also showed that t-SNE with the Aitchison distance improved classification accuracy in both datasets. In conclusion, we have developed a t-SNE based classification approach that is suitable for compositional microbiome data and may also serve as a baseline for more complex classification models.
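
Two ingredients of this approach can be sketched in a few lines: the Aitchison distance, which is the Euclidean distance between centered log-ratio (clr) transformed compositions, and the placement of a new sample at a weighted mean of its nearest training neighbors in the low-dimensional space. The abstract does not specify the weighting scheme or how zeros are handled, so the inverse-distance weights, the pseudocount, the value of `k`, and the toy coordinates below are assumptions for illustration only. In practice `Y_train` would come from a t-SNE fit on the training set.

```python
import numpy as np

def clr(x, pseudocount=1e-6):
    """Centered log-ratio transform of compositions stored in the last axis."""
    x = np.asarray(x, dtype=float)
    x = x / x.sum(axis=-1, keepdims=True)  # close each composition to sum 1
    x = x + pseudocount                    # avoid log(0); an assumed zero fix
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_distance(a, b):
    """Euclidean distance between clr-transformed compositions."""
    return float(np.linalg.norm(clr(a) - clr(b)))

def embed_new_sample(x_new, X_train, Y_train, k=3, eps=1e-12):
    """Weighted mean of the low-dimensional coordinates of the k nearest
    training samples, with neighbors chosen by Aitchison distance."""
    d = np.linalg.norm(clr(X_train) - clr(x_new), axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / (d[nearest] + eps)           # inverse-distance weights (assumed)
    w = w / w.sum()
    return w @ Y_train[nearest]

# Toy taxa counts (rows are samples) and hypothetical 2-D embedding coordinates.
X_train = np.array([[10, 20, 70], [15, 25, 60], [60, 30, 10], [70, 20, 10]])
Y_train = np.array([[-5.0, 1.0], [-4.0, 2.0], [6.0, -1.0], [7.0, 0.0]])

y_new = embed_new_sample(np.array([12, 22, 66]), X_train, Y_train, k=2)
print("embedded new sample:", y_new)
```

The resulting coordinates can then be passed to any classifier trained on `Y_train`, such as the logistic regression, SVM, or decision tree models named in the abstract.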

13,005 views, 47 citations
Open for submission
Frontiers in Genetics

Computational Pangenomics: Software Tools and Applications in Plants and Agriculture
Edited by Alan Cleary, Joann Mudge, Mathieu Rouard, Thiruvarangan Ramaraj, Indika Kahanda, Brendan Mumey
Deadline: 11 October 2025
Recommended Research Topics
59K views, 79 authors, 15 articles

Frontiers in Genetics

Identification of Multi-Biomarker for Cancer Diagnosis and Prognosis based on Network Model and Multi-omics Data
Edited by Chunquan Li, Dechao Bu, DECHEN Lin LIN, Sun Liang, Masaharu Hazawa
79.7K views, 152 authors, 19 articles

Frontiers in Genetics

Bioinformatics Analysis of Omics Data for Biomarker Identification in Clinical Research, Volume II
Edited by Lixin Cheng, Hongwei Wang, Shibiao Wan
184.9K views, 330 authors, 53 articles

Frontiers in Genetics

Papers of Sixth China Computer Federation Bioinformatics Conference
Edited by Xuefeng Cui, Fa Zhang, Wang Guohua, Chunhou Zheng, Hongmin Cai, Qinghua Jiang, Kang Ning
10.9K views, 19 authors, 4 articles

Frontiers in Genetics

Identification of Multi-Biomarker for Cancer Diagnosis and Prognosis based on Network Model and Multi-omics Data - Volume II
Edited by Chunquan Li, Dechao Bu, DECHEN Lin LIN, Masaharu Hazawa, Sun Liang
27K views, 49 authors, 5 articles