<?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0">
      <channel xmlns:content="http://purl.org/rss/1.0/modules/content/">
        <title>Frontiers in Genetics | Computational Genomics section | New and Recent Articles</title>
        <link>https://www.frontiersin.org/journals/genetics/sections/computational-genomics</link>
        <description>RSS Feed for Computational Genomics section in the Frontiers in Genetics journal | New and Recent Articles</description>
        <language>en-us</language>
        <generator>Frontiers Feed Generator,version:1</generator>
        <pubDate>2026-05-03T19:30:43.769+00:00</pubDate>
        <ttl>60</ttl>
        <item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1787544</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1787544</link>
        <title><![CDATA[Application of integrated nested Laplace approximation to identify hot spots of methylation heterogeneity in healthy individuals from the MAMELI cohort]]></title>
        <pubdate>2026-04-29T00:00:00Z</pubdate>
        <category>Brief Research Report</category>
        <author>Tiago Nardi</author><author>Eva Dariol</author><author>Rachele Matsagani</author><author>Donya Zojaji</author><author>Stefano Gustincich</author><author>Luca Pandolfini</author><author>Elia Biganzoli</author><author>Valentina Bollati</author>
        <description><![CDATA[DNA methylation is an epigenetic regulator of gene expression and cell identity, which can be shaped by both physiological and pathological factors, including environmental exposure. The identification of sites with high methylation variability can be computationally challenging, especially in large-scale studies. To address this, we propose a framework based on the integrated nested Laplace approximation (INLA) to model methylation with Bayesian generalized linear mixed models (GLMMs), accounting for subject covariates, genomic annotations, and cell composition. To validate the methodology, we sequenced 158 healthy subjects with nanopore and analyzed a panel of 13 genes related to inflammation and stress response. We identified a set of hypervariable CpG sites whose genomic context and methylation levels were consistent with a regulatory role, making them potential candidates for epigenomic association studies. In our comparison, INLA results were concordant with those obtained with MCMC-based methods, with runtimes shorter by orders of magnitude. The computational efficiency of the framework allows for fast exploratory data analysis, model testing, and iterative prototyping, making it viable for large-scale studies that otherwise would be computationally prohibitive.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1845666</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1845666</link>
        <title><![CDATA[Correction: BRMDA: prediction model for potential microbe-drug associations based on bilinear attention networks and random forest]]></title>
        <pubdate>2026-04-21T00:00:00Z</pubdate>
        <category>Correction</category>
        <author>Ge Yu</author><author>Fang Chen</author><author>Hui Chen</author><author>Shichang Tang</author><author>Mingmin Liang</author><author>Xianzhi Liu</author><author>Bin Zeng</author><author>Lei Wang</author>
        <description></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1793277</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1793277</link>
        <title><![CDATA[Exploring Ebola virus-associated gene expression through comparative analysis]]></title>
        <pubdate>2026-04-17T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Mostafa Rezapour</author><author>Sean V. Murphy</author><author>David A. Ornelles</author><author>Thomas D. Shupe</author><author>Stephen J. Walker</author><author>Alan R. Jacobson</author><author>Metin Nafi Gurcan</author><author>Patrick M. McNutt</author><author>Anthony Atala</author>
        <description><![CDATA[Introduction: Ebola virus (EBOV) infection triggers intense host transcriptional responses that overlap extensively with those induced by other viral and bacterial pathogens. This overlap complicates the identification of EBOV-specific gene expression signatures and limits diagnostic specificity. Defining transcriptional markers that distinguish EBOV from other infections is essential for improving molecular diagnostics and advancing understanding of EBOV-specific host responses. Methods: We developed a multi-step filtering framework using blood-derived RNA-Seq data from nonhuman primates and human cohorts organized into independent training and test sets. In the training cohort, differential expression analysis was performed using an edgeR-based GLMQL-MAS approach to identify EBOV-associated genes. Candidates were filtered against non-EBOV comparator datasets, including mpox virus, influenza, bacterial pneumonia, acute HIV-1 infection, and multiple SARS-CoV-2 variants, to remove broadly shared host-response genes. Genes included in the NanoString nCounter® Host Response Panel were additionally excluded. The resulting EBOV-specific signature was evaluated in independent EBOV and non-EBOV test cohorts using principal component analysis and logistic regression. Functional enrichment was assessed using KEGG pathways. Results: Initial analysis identified numerous interferon-stimulated genes that were similarly upregulated across infections. After cross-infection filtering and NanoString exclusion, 281 EBOV-specific genes were identified. Optimization within the training cohort yielded a top-50 gene set that clearly separated EBOV from non-EBOV samples. In the independent test cohort, classification performance improved substantially, with the F1 score increasing from 37.5% when all genes were used to 95.0% after applying the top-50 gene set. Enrichment analysis of the top-50 EBOV-specific genes revealed significant association with vascular, coagulation, secretory, and metabolic pathways. ADAMTS1 showed consistent upregulation in EBOV while remaining downregulated or inactive in comparator infections. Discussion: Structured cross-pathogen filtering enables identification of EBOV-specific transcriptional features beyond shared antiviral responses. The validated gene signature generalizes across independent cohorts and highlights biologically distinct pathways, which supports its potential utility for host-based diagnostic development.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1688627</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1688627</link>
        <title><![CDATA[A deep learning model for predicting essential proteins based on an attention mechanism]]></title>
        <pubdate>2026-04-15T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Shunxian Zhou</author><author>Haodong Zhou</author><author>Sisi Chen</author><author>Yangtai Xu</author><author>Lei Wang</author>
        <description><![CDATA[Introduction: Essential proteins are key to cellular viability, yet their experimental identification is costly and time-consuming. Methods: In this study, DLAM is introduced as a deep learning framework that integrates four complementary biological cues, namely, domain composition, subcellular localization, orthology, and gene expression, together with a weighted protein–protein interaction network. The heterogeneous signals are encoded into compact representations and learned by an attention-enhanced network to score protein essentiality. Results: On the DIP dataset, DLAM achieves consistently better performance than representative centrality measures and conventional machine-learning classifiers. In a further expanded baseline study based on the larger BioGRID dataset containing more proteins, we conducted comparative experiments between DLAM and four recently proposed deep learning methods (TCBB2021, EPGAT, BMC2022, and ACDMBI). On the BioGRID dataset, we evaluated DLAM using stratified five-fold cross-validation. Across folds, DLAM achieves consistently strong discrimination and ranking performance (reported as the mean ± std. for ROC-AUC and AP) and maintains a stable F1-score under a validation-selected decision threshold. This suggests that, under the same evaluation protocol, DLAM has strong ranking and discrimination capability. Moreover, it also exhibits good and stable performance on other metrics such as accuracy, precision, recall, and F-measure. Discussion: These results indicate that jointly modeling multi-source biological information with interaction topology yields more reliable essential-protein prediction under class imbalance.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1803456</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1803456</link>
        <title><![CDATA[MoJKNet: a jumping knowledge graph framework for multi-omics cancer subtype prediction]]></title>
        <pubdate>2026-04-07T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Jiangjie Lou</author><author>Xiaoguang Pan</author><author>Xuanlong Wang</author>
        <description><![CDATA[Cancer remains one of the leading causes of morbidity and mortality worldwide and poses a major threat to global public health. Despite substantial advances in early diagnosis and therapeutic strategies, patient outcomes vary widely due to the pronounced molecular and clinical heterogeneity of tumors. Accurate identification of cancer subtypes is therefore essential for elucidating tumor heterogeneity, improving prognostic assessment, and enabling precision medicine. In recent years, multi-omics technologies have provided unprecedented opportunities to characterize cancer at multiple molecular layers, including genomic, epigenomic, transcriptomic, and proteomic levels. However, effectively integrating high-dimensional and heterogeneous multi-omics data remains a major challenge. Moreover, many existing graph convolutional network–based integration methods suffer from over-smoothing and limited utilization of deep feature representations, which restrict their ability to capture complex multi-scale relationships inherent in cancer biology. To address these challenges, we propose MoJKNet, a novel multi-omics integration framework for cancer subtype classification. MoJKNet incorporates a jumping knowledge network (JK-Net) to adaptively aggregate node representations across multiple propagation depths, thereby alleviating over-smoothing and enhancing feature extraction within each omics modality. Subsequently, a multimodal autoencoder combined with similarity network fusion (SNF) is employed to capture complementary information across different omics layers. Finally, a graph attention network (GAT) assigns adaptive feature weights to enable accurate cancer subtype prediction. We evaluated MoJKNet on seven cancer types from The Cancer Genome Atlas (TCGA). 
Experimental results demonstrate that MoJKNet consistently outperforms state-of-the-art methods, including MOGCAN, MOGONET, and MoGCN, in terms of precision, recall, and F1-score, achieving nearly a 10% performance improvement on the COADREAD dataset. Ablation studies further confirm the critical contribution of the jumping knowledge mechanism to improved representation learning. Overall, MoJKNet provides an effective and generalizable solution for multi-omics data integration and cancer subtype classification, with strong potential for downstream biological interpretation and translational applications.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1819270</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1819270</link>
        <title><![CDATA[Federated, governed, and interoperable? The emerging architecture of public human genomic data infrastructures: a European perspective]]></title>
        <pubdate>2026-04-01T00:00:00Z</pubdate>
        <category>Mini Review</category>
        <author>Marco Antonio Tangaro</author><author>Matteo Chiara</author><author>Graziano Pesole</author><author>Federico Zambelli</author>
        <description><![CDATA[Public infrastructures for human genomic data are increasingly incorporating federated approaches alongside centralized and cloud-native models, yet operational federation remains constrained by unsolved challenges at the legal, semantic, and technical layers. We describe the current landscape along three analytical axes, taking a primarily European perspective while drawing on global examples to highlight broader trends. First, we compare architectural models, centralized archives such as the European Genome-phenome Archive (EGA) and the database of Genotypes and Phenotypes (dbGaP), cloud-native platforms for data analysis, and federated networks exemplified by the European Genomic Data Infrastructure (GDI), highlighting their specific trade-offs on scalability, sovereignty, and analytical flexibility. Second, we examine the governance layer, from the tension between the GDPR’s consent requirements and large-scale secondary use, through the European Health Data Space (EHDS) and Health Data Access Bodies, to machine-readable authorization via GA4GH Passports and the Data Use Ontology. Third, we assess interoperability and semantic alignment, including the role of GA4GH technical standards, FAIR metadata principles, and emerging schema harmonization efforts such as the German Human Genome-Phenome Archive (GHGA). We argue that the central challenge is no longer building individual platforms, but aligning heterogeneous regulatory interpretations, metadata models, and trust frameworks across jurisdictions. Addressing this alignment gap will determine whether federated genomics delivers on its promise of large-scale, privacy-preserving data reuse.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1780660</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1780660</link>
        <title><![CDATA[A single-cell multi-omics atlas of human eyelid skin]]></title>
        <pubdate>2026-03-31T00:00:00Z</pubdate>
        <category>Data Report</category>
        <author>Weiguang Ma</author><author>Yanwen Xu</author><author>Junjie Chen</author><author>Yue Yuan</author><author>Xiaoyu Wei</author><author>Qiuting Deng</author><author>Ruikang Li</author><author>Jiaxin Du</author><author>Zhongjin Zhang</author><author>Zhentao Zhou</author><author>Quhuan Li</author><author>Haiyan Shen</author><author>Jufang Zhang</author><author>Jufang Wang</author><author>Pengfei Cai</author><author>Pengcheng Guo</author>
        <description><![CDATA[The skin acts as the first barrier protecting the human body against the external environment. Transcriptional heterogeneity in human skin is widely confirmed, but the regulatory mechanisms remain largely unexplored. In this study, we employed high-throughput single-cell chromatin accessibility and transcriptome sequencing (HT-scCAT-seq), a technique for simultaneous analysis of the transcriptome and epigenome. We used HT-scCAT-seq to analyze 10,065 cell profiles from four adult human eyelid skin samples and defined 13 distinct cell types. In addition, we described detailed molecular signatures and identified key gene regulatory networks enriched in each cell type. Our dataset is a valuable resource for further research in human skin biology.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1785259</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1785259</link>
        <title><![CDATA[Genomic and transcriptomic insights into aflatoxin-induced intrahepatic cholangiocarcinoma: an integrated pathway and meta-analysis study]]></title>
        <pubdate>2026-03-31T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Fengping Li</author><author>Yuanyang Ma</author><author>Yan Cheng</author><author>Shanshan Zou</author>
        <description><![CDATA[Introduction: Environmental and chemical exposures are major yet incompletely characterized drivers of human carcinogenesis. Aflatoxin, a potent food-borne mycotoxin, has been implicated in tumor initiation, proliferation, and immune suppression in intrahepatic cholangiocarcinoma (ICC), but its genomic mechanisms remain poorly defined. Methods: We employed extensive literature data mining (LDM) to identify genes associated with both aflatoxin exposure and ICC, enabling construction of a mechanistic genetic pathway linking exposure to disease. These pathways were further evaluated using a meta-analysis of five Gene Expression Omnibus (GEO) expression datasets, followed by functional annotation to characterize their biological roles. Results: LDM identified 1,754 ICC-associated genes, of which 427 were also linked to aflatoxin, with 154 positioned as potential intermediate regulators connecting aflatoxin exposure to ICC. Meta-analysis revealed significant expression alterations in six genes upon aflatoxin exposure, including upregulation of CRP, CDK2, AXL, and MIR221 (overexpression >50%, p < 0.05) and downregulation of F2 and BUB1B (reduced expression >60%, p < 0.014). Co-expression analysis indicated strong interactions among these regulators (Fisher’s Z > 0.53, p < 0.05), suggesting coordinated molecular responses associated with ICC progression. Functional annotation further highlighted inflammatory responses, cytokine dysregulation, and kinase-related signaling as key processes potentially linking aflatoxin exposure to ICC development. Discussion: These findings provide a systems-level view of the genomic mechanisms underlying aflatoxin-associated ICC carcinogenesis and identify candidate molecular mediators linking environmental toxin exposure to tumor development. This integrative framework may facilitate exposure-informed biomarker discovery and potential preventive or therapeutic strategies, particularly in regions where aflatoxin exposure remains prevalent.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1770432</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1770432</link>
        <title><![CDATA[A graph clustering algorithm with hypergraph learning and a core-attachment strategy for protein complex identification]]></title>
        <pubdate>2026-03-30T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Jie Wang</author><author>Xiancan Yang</author><author>Pengbo Yang</author><author>Jian Yang</author><author>Yangyang Miao</author>
        <description><![CDATA[Protein complexes play a crucial role in cellular biological processes. Identifying these complexes is essential for understanding cellular functions and biological mechanisms. Graph clustering approaches to identify protein complexes in protein-protein interaction (PPI) networks have become a significant research hotspot in data mining and bioinformatics. Many graph clustering methods have been developed for protein complex identification. However, most existing methods only utilize original networks to discover dense subgraphs and ignore higher-order topological characteristics. Considering the prevalent multi-relational and complex interactions in biological networks, a graph clustering algorithm based on hypergraph learning and a core-attachment strategy, called HLCA, is proposed for protein complex identification. Hypergraph networks are employed to directly model multi-relational interactions. Building on this, a multi-level hypergraph is used to capture higher-order topology, and a core-attachment strategy is adopted to identify protein complexes. Firstly, the original PPI network is transformed into a hypergraph network. Secondly, a hierarchical compression strategy is applied to recursively compress the hypergraph into smaller hypergraphs at various levels, forming a multi-level analytical framework. Thirdly, hypergraph convolution is performed across different hierarchical levels to obtain node representations at each level. These node representations are then combined to produce complete node embeddings. Based on these node embeddings, a weighted PPI network is constructed by cosine similarity from the original PPI network. Core clusters are obtained in this weighted network by cluster density. Finally, remaining protein nodes are added to the core clusters using a core-attachment strategy combining hyperedge density and overlap. The effectiveness of HLCA is evaluated by comparing it with other protein complex identification methods on multiple datasets. Experimental results show that the proposed method outperforms comparison methods regarding F-measure and Accuracy.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1762055</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1762055</link>
        <title><![CDATA[Advances in algorithms for normalizer gene selection in qRT-PCR: implications for cancer biology and precision medicine]]></title>
        <pubdate>2026-03-27T00:00:00Z</pubdate>
        <category>Mini Review</category>
        <author>Abhay Kumar Pathak</author><author>Sukhad Kural</author><author>Lalit Kumar</author><author>Sumit Saini</author><author>Manjari Gupta</author>
        <description><![CDATA[Quantitative Reverse Transcription Polymerase Chain Reaction (qRT-PCR) plays a significant role in gene expression analysis in cancer research and precision medicine. It allows precise quantification of gene expression variation, which is necessary for understanding tumor biology, identifying predictive biomarkers, and developing therapeutic interventions. However, the accuracy and stability of qRT-PCR data rely heavily on finding stable reference genes. Gene stability refers to minimal variation in the expression levels of a candidate reference gene across different biological conditions, sample groups, and technical replicates. Traditionally, housekeeping genes such as β-actin, GAPDH, and 18S rRNA have been used for normalization, but their expression stability can vary under different experimental settings. Over time, mathematical and statistical tools such as geNorm, NormFinder, BestKeeper, and gQuant have been developed to find the most stable reference genes. These algorithms have become essential in ensuring accurate and reproducible data in cancer research, where gene expression profiles can vary significantly across different tumor types, stages, and individual patients. This review focuses on the progression and advancements of traditional and advanced reference gene selection methods, their applications in cancer research, and their significant role in precision medicine. It presents an overview of the commonly employed normalizers, outlining their respective advantages and limitations, and includes a concise discussion on the assessment of gene stability across diverse experimental contexts. Additionally, it emphasizes their use in cancer research and their importance in enhancing the accuracy and consistency of gene expression normalization, particularly within precision medicine.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1757318</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1757318</link>
        <title><![CDATA[BRMDA: prediction model for potential microbe-drug associations based on bilinear attention networks and random forest]]></title>
        <pubdate>2026-03-26T00:00:00Z</pubdate>
        <category>Methods</category>
        <author>Ge Yu</author><author>Fang Chen</author><author>Hui Chen</author><author>Shichang Tang</author><author>Mingmin Liang</author><author>Xianzhi Liu</author><author>Bin Zeng</author><author>Lei Wang</author>
        <description><![CDATA[Introduction: Uncharted microbe-drug relationships constitute an under-exploited reservoir of therapeutic leads. In this manuscript, we introduced a hybrid framework named BRMDA by coupling a bilinear attention network with a random-forest classifier to systematically expose latent microbe-drug associations. Methods: Firstly, BRMDA integrated multiple drug-centric, microbe-centric, and disease-centric similarity profiles, along with experimentally validated microbe–drug associations, to construct a unified heterogeneous graph. Then, the bilinear attention network and random-forest classifier were employed to compute the predicted scores for potential microbe-drug associations based on the newly constructed unified heterogeneous graph. Next, benchmarking experiments were conducted under a rigorous five-fold cross-validation protocol using the MDAD dataset to validate the prediction performance of BRMDA. Additionally, case studies were performed, focusing on front-line antibiotics including amoxicillin and ciprofloxacin as well as clinically relevant pathogens including Bacillus cereus and Mycobacterium tuberculosis, to evaluate the translational validity of the proposed model. Conclusion: Intensive experimental results demonstrated that BRMDA outperformed seven state-of-the-art competitors in terms of both AUC and AUPR, and 9 out of the top 10 associations predicted by the model were corroborated by independent literature evidence. These findings underscored the accuracy and translational potential of BRMDA, offering a data-driven compass for antimicrobial discovery and microbe-oriented therapeutic design.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1723592</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1723592</link>
        <title><![CDATA[Identifying mitochondrial genes and potential biological functions in pre-eclampsia: bioinformatics and experimental insights]]></title>
        <pubdate>2026-03-24T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Yaohui Wang</author><author>Zhixin Du</author><author>Liping Yang</author><author>Junlin Hou</author><author>Xiaolin Li</author><author>Mengyang Fan</author><author>Chenyang Yu</author><author>Jianhua Sun</author><author>Li Zhou</author><author>Lingling Li</author><author>Pengbei Fan</author>
        <description><![CDATA[Background: Pre-eclampsia (PE) is a specific type of gestational hypertension associated with high morbidity and mortality. This study aims to identify mitochondria-related regulatory molecules in PE through bioinformatics analysis, which will help pinpoint potential therapeutic targets and elucidate potential mechanisms of action in PE. Methods: This study integrated three PE placental transcriptome datasets (n = 103/157) to screen for mitochondrial-related hub genes. Key gene screening was performed by combining three machine learning algorithms (Random Forest, LASSO, and SVM), followed by the construction of a diagnostic neural network model. Additionally, single-cell sequencing data were utilized to analyze the cellular expression patterns of candidate genes in the placenta. To further elucidate the underlying mechanisms, functional validation was conducted both in a PE rat model and in vitro using HTR-8 cells, supplemented by multi-omics correlation analysis. Results: Machine learning analysis identified three key genes (GCLM, SNAP23, RHOT2), and the diagnostic model built upon them demonstrated excellent performance (training set AUC = 0.907; validation set AUC = 0.875). Single-cell analysis revealed the expression patterns of these genes within specific cell subtypes, consistent with the transcriptional features of trophoblast cell populations. In the PE rat model, downregulation of GCLM and SNAP23 and upregulation of RHOT2 were significantly correlated with clinical phenotypes such as hypertension and proteinuria, as well as changes in placental inflammatory factor levels (TNF-α, IL-1β, IL-6). Specifically, SNAP23 and GCLM showed negative correlations with inflammatory cytokines but positive correlations with fetal weight, while RHOT2 expression positively correlated with disease severity. In vitro experiments confirmed that overexpression of SNAP23 restored mitochondrial membrane potential, reduced reactive oxygen species levels, and suppressed cytokine release in lipopolysaccharide (LPS)-treated HTR-8 cells. Multi-omics analysis further indicated that these genes are involved in immune dysregulation and mitochondrial dysfunction during PE progression. Conclusion: This study establishes GCLM, SNAP23, and RHOT2 as mechanistically important biomarkers for pre-eclampsia. Among them, modulation of SNAP23 shows therapeutic potential in alleviating mitochondrial damage and inflammatory responses in PE, providing a new direction for intervention strategies.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1723401</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1723401</link>
        <title><![CDATA[Integrating dual convolutional networks and BiLSTM for precision prediction of chronic myeloid leukemia from protein sequences]]></title>
        <pubdate>2026-03-23T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Hend Khalid Alkahtani</author><author>Ayman Qahmash</author>
        <description><![CDATA[Introduction: Chronic Myeloid Leukemia (CML) is a hematologic malignancy characterized by the occurrence of the Philadelphia chromosome [t(9; 22)(q34; q11)], leading to the creation of the BCR–ABL fusion gene. The fusion gene expresses a constitutively active tyrosine kinase that stimulates the uncontrolled growth and survival of myeloid cells, making it both a diagnostic marker and a therapeutic target. Conventional diagnostic techniques, including cytogenetic examination, fluorescence in situ hybridization (FISH), and polymerase chain reaction (PCR), while accurate, remain invasive, require enormous resources, and often detect the disease at more advanced stages. Computational methods based on protein sequence analysis offer a non-invasive, scalable solution; however, contemporary machine learning methods are strongly dependent on manually designed features, limiting their ability to effectively capture long-range dependencies and subtle contextual interactions. Methods: To counter these shortcomings, we propose a Dual Convolutional Neural Network–Bidirectional Long Short-Term Memory (Dual CNN–BiLSTM) framework for the accurate prediction of CML from protein sequences. The model includes two parallel CNN modules of different kernel sizes for multi-scale motif discovery, followed by a BiLSTM layer for modeling bidirectional sequential dependencies. The combination of features is realized by concatenating ProtBERT embeddings with Pseudo Amino Acid Composition (PseAAC) and Dipeptide Composition (DPC). Results: An experimental evaluation over curated UniProtKB sets of CML-associated proteins indicates improved performance, with an accuracy of 97.5% and a 0.98 ROC–AUC. Discussion: The proposed framework delivers breakthroughs to computational oncology and enables early, non-invasive screening for CML.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1769896</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1769896</link>
        <title><![CDATA[BioMutation: a portable graphical user interface for mutagenesis and feature analysis in proteins, nucleic acids, and their complexes]]></title>
        <pubdate>2026-03-17T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Tushar Gupta</author><author>Pradeep Pant</author>
        <description><![CDATA[Protein and nucleic acid mutational studies are central to understanding biomolecular structure, function, and interactions, yet existing computational tools often lack user-friendly interfaces for high-throughput and systematic mutagenesis. To address this limitation, we present BioMutation, a graphical user interface (GUI) that enables automated, batch-wise introduction of substitution-based mutations in proteins, nucleic acids, and their complexes facilitated via UCSF ChimeraX functionalities. BioMutation supports user-defined and class-wise amino acid substitutions in proteins, as well as user-specified and combinatorial substitutions in DNA and RNA. The tool automatically generates libraries of mutated structures in standard formats (PDB, MOL2, and mmCIF) suitable for downstream computational studies, including molecular docking and molecular dynamics simulations. In addition, BioMutation includes a Google Colab–based Structure Analyzer module for comparative assessment of physicochemical features before and after mutation. By integrating automation, flexibility, and accessibility within a single platform, BioMutation facilitates efficient in silico mutagenesis for applications in structural biology, biomolecular engineering, and drug discovery. The GUI is freely available at https://github.com/Computational-biolab/BioMutation.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1742595</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1742595</link>
        <title><![CDATA[An in silico protocol for predicting genetic biomarkers in rare diseases: a case study in sporadic amyotrophic lateral sclerosis]]></title>
        <pubdate>2026-03-12T00:00:00Z</pubdate>
        <category>Methods</category>
        <author>Ali Aguerd</author><author>Badreddine Nouadi</author><author>Abdelkarim Ezaouine</author><author>Imad Fenjar</author><author>Faiza Bennis</author><author>Fatima Chegdani</author>
        <description><![CDATA[Studying the genetics of rare diseases is challenging because small sample sizes limit the statistical power of standard methods such as genome-wide association studies (GWAS). We created a new machine-learning approach to find candidate single nucleotide polymorphisms (SNPs) when data are scarce. Our method trains a Random Forest model to spot similarities between SNPs. We used 189 known sporadic amyotrophic lateral sclerosis (sALS)-linked SNPs as positive examples and 938,544 unrelated SNPs as negatives. The model learns from genomic location, significance levels, nearby genes, and other features. When we tested it on sALS, it performed exceptionally well, with 93.8% accuracy and near-perfect AUC scores. The method uncovered 1,890 new SNP candidates for sALS. Among these, 209 reached genome-wide significance, and 50 appeared repeatedly in our analyses, making them strong candidates. Key genes like SARM1, OPHN1, and BPTF emerged from the results, all connected to neural health and survival pathways. Our examination revealed a notable excess of SNPs on chromosome 18 compared to expectations. This non-random distribution underscores the region’s particular interest. Here, our approach demonstrates its ability to extract meaningful signals from a restricted sample. The results generated by this approach enable early diagnosis of the disease under study, explanation of its mechanism, and identification of therapeutic targets.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1685927</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1685927</link>
        <title><![CDATA[An auxiliary diagnosis model for the pathological classification of cervical cancer based on radiomics biomarkers]]></title>
        <pubdate>2026-03-11T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Mei Wang</author><author>Yu Cao</author><author>Mengchen Zhu</author><author>Peilin Zhao</author><author>Qin Zhou</author><author>Hongxiang Lan</author><author>Jizhao Liu</author><author>Junqiang Lei</author>
        <description><![CDATA[Introduction: Cervical cancer remains a major global health burden, and accurate pathological classification is essential for personalized treatment planning. However, conventional radiomics studies often rely on manual lesion delineation and are limited in extracting meaningful imaging biomarkers from heterogeneous cervical cancer lesions. Methods: We proposed a convolutional recurrent feature extraction (CRFE)-based automatic segmentation framework for cervical cancer MRI images and developed histogram-based imaging features reflecting lesion pixel concentration trends. These features were integrated with conventional radiomics and clinical features. Feature engineering and machine learning classifiers, including random forest (RF), XGBoost, support vector machine, and logistic regression, were evaluated to construct an auxiliary diagnostic model for pathological classification. The dataset included 114 patients with cervical cancer who underwent MRI examinations. Results: The CRFE segmentation model achieved an Intersection over Union (IoU) of 0.9443, a Dice coefficient of 0.5980, and an F1-score of 0.7085. Feature selection retained 30 key imaging biomarkers, including the median of the histogram, GLSZM large-area low gray-level emphasis (LoG, σ = 2.0 mm, 3D), and GLRLM long-run low gray-level emphasis (LoG, σ = 2.0 mm, 3D). Among the evaluated classifiers, the RF model achieved the best performance, with an accuracy of 87.27% and an F1-score of 86.91% in pathological classification. Discussion: The proposed deep learning–radiomics framework enables accurate lesion segmentation and effective pathological classification of cervical cancer. This auxiliary diagnostic model may reduce unnecessary invasive procedures and improve early screening and clinical decision-making.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1738448</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1738448</link>
        <title><![CDATA[Identification of the PFK gene family in Solanum species and expression analysis in the fruit of Solanum lycopersicum]]></title>
        <pubdate>2026-03-09T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Zepeng Wang</author><author>Zhongyu Wang</author><author>Ruiqiang Xu</author><author>Qingyuan Meng</author><author>Jintao Wang</author><author>Ning Li</author><author>Qinghui Yu</author>
        <description><![CDATA[Introduction: Phosphofructokinase (PFK) is a crucial rate-limiting enzyme in glycolysis, essential for sugar metabolism and fruit quality. This study provides the first pan-genome-scale analysis of the PFK family across Solanum species. Methods: Using pan-genome data, 156 PFK genes were identified across 12 Solanum species. Comprehensive bioinformatic analyses, protein-protein interaction predictions, and promoter motif scans were performed. Expression patterns across four fruit developmental stages were characterized via RNA-seq and validated by qRT-PCR. Results: The PFK family, categorized into PFK and PFP subfamilies, expanded primarily through segmental duplication under strong purifying selection. We identified distinct, stage-specific expression patterns, with SolyPFK07 and SolyPFPA2 emerging as key regulators of sugar accumulation. Promoters contained numerous elements responsive to hormones and abiotic stresses. Conclusion: PFK genes are vital for fruit development, sugar metabolism, and stress adaptation. These findings offer a theoretical basis and genetic resources for the molecular breeding of high-quality tomatoes.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1766223</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1766223</link>
        <title><![CDATA[Multiscale computational genomics in Wilson disease: from atomic dynamics to clinical prediction]]></title>
        <pubdate>2026-03-03T00:00:00Z</pubdate>
        <category>Review</category>
        <author>Moujun Luan</author><author>Qingkai Xue</author><author>Yujie Cao</author><author>Gangli Cheng</author><author>Xingxing Huo</author>
        <description><![CDATA[Wilson disease (WD) is an autosomal recessive disorder caused by pathogenic variants in the ATP7B gene, leading to toxic copper accumulation. The integration of computational genomics approaches is now essential for deciphering the complex genotype-phenotype relationships and advancing towards targeted therapies. This review synthesizes how multiscale computational strategies are transforming WD research. At the atomic level, molecular dynamics (MD) simulations reveal the conformational dynamics of the ATP7B protein, the functional impact of mutations, and the detailed copper transport cycle. At the systems level, machine learning (ML) models integrate genomic, epigenomic, transcriptomic, and clinical data to classify variant pathogenicity, predict disease subtypes, and forecast clinical outcomes such as cirrhosis or neurological deterioration. Furthermore, multi-omics network analyses uncover disease-associated regulatory modules, elucidate the role of epigenetic dysregulation, and implicate emerging pathways like cuproptosis in WD pathogenesis. Critically, these computational insights are increasingly guiding therapeutic innovation, including the in silico design of allosteric modulators (e.g., nanobodies) and pharmacological chaperones to correct ATP7B folding. By bridging scales from molecular structure to patient phenotypes, computational genomics provides a powerful, integrative framework that holds the potential to accelerate the development of dynamic, mechanism-based therapies and pave the way for personalized medicine in Wilson disease.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1715155</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1715155</link>
        <title><![CDATA[Case Report: Association of a rare single nucleotide variant in the KCNH2 gene with drug-induced QT prolongation]]></title>
        <pubdate>2026-02-24T00:00:00Z</pubdate>
        <category>Case Report</category>
        <author>Tianci Wang</author><author>Charlene R. Norgan Radler</author><author>Mohanakrishnan Sathyamoorthy</author>
        <description><![CDATA[Background: Long QT Syndrome (LQTS) is characterized by prolonged QT intervals on electrocardiogram, which may progress into life-threatening polymorphic ventricular tachycardia and sudden cardiac death. Variants in the KCNH2 gene have been associated with congenital LQTS, with thousands identified to date but very few clinically characterized. Objectives: To describe the rare single nucleotide variant KCNH2 (NM_000238.4):c.1066C>T (p.Arg356Cys) associated with drug-induced QT prolongation and to assess its pathogenicity risk using in silico tools and protein structural modeling in accordance with American College of Medical Genetics and Genomics (ACMG) guidelines. Methods: Next-generation sequencing was performed for a patient presenting with drug-induced QT prolongation who was found to carry the rare KCNH2 1066C>T variant. Thirteen established gene discovery computational tools were employed to analyze the variant in silico. Additionally, structural modeling of the variant’s region within the wild-type protein was performed utilizing AlphaFold. Results: The clinical phenotype associated with the KCNH2 1066C>T variant has not been previously described in the literature, except in combination with a variant in the KCNQ1 gene. Computational analysis with a meta-predictor, REVEL, supported variant pathogenicity, while predictive modeling and AlphaMissense illustrated the uncertainty of structural impacts in a disordered region. Risk analysis of the variant performed utilizing ACMG guidelines and ClinGen criteria-specific recommendations resulted in an overall classification of “uncertain significance”. Conclusion: To our knowledge, this is the first study reporting a direct phenotype-to-genotype association between the KCNH2 1066C>T variant and drug-induced QT prolongation, supplemented by in silico analyses and ACMG-based variant risk stratification. Our study underscores the importance of recognizing genetic predisposition in drug-induced QT prolongation and motivates further investigation of KCNH2 variants within the N-linker region.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fgene.2026.1807544</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fgene.2026.1807544</link>
        <title><![CDATA[Editorial: Advancements in sequencing technologies for epigenomic and transcriptomic analysis: from bulk to single-cell resolution]]></title>
        <pubdate>2026-02-18T00:00:00Z</pubdate>
        <category>Editorial</category>
        <author>Laura Veschetti</author><author>Massimiliano Cocca</author><author>Dougba Noel Dago</author><author>Giovanni Malerba</author>
        <description></description>
      </item>
      </channel>
    </rss>