AUTHOR=Peng Xiayu, Wei Yong, Zhou Xue
TITLE=Enhancing pathogen identification through AI-assisted metagenomic sequencing
JOURNAL=Frontiers in Microbiology
VOLUME=Volume 16 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2025.1634194
DOI=10.3389/fmicb.2025.1634194
ISSN=1664-302X
ABSTRACT=Introduction: To address the limitations of current metagenomic identification approaches, we proposed a principled AI-assisted architecture that enhances accuracy, scalability, and biological interpretability through three core innovations. Methods: Firstly, we developed a structured probabilistic model that formulates pathogen detection as a hierarchical and compositional inference task under taxonomic and ecological constraints. This framework enables the integration of phylogenetic priors and sparsity-aware mechanisms, reducing noise and ambiguity. By modeling taxonomic structure and ecological dependencies, the approach ensures more accurate identification, especially in complex or low-abundance microbial communities. Secondly, we introduced the Taxon-aware Compositional Inference Network (TCINet), a deep learning model that processes sequencing reads to produce taxonomic embeddings. TCINet estimates abundance distributions via masked neural activations that enforce sparsity and interpretability, while also propagating uncertainty through log-normal variance modeling. Designed to respect microbial phylogeny and co-occurrence patterns, TCINet enables scalable, biologically plausible inference across diverse clinical and environmental datasets. Thirdly, we presented the Hierarchical Taxonomic Reasoning Strategy (HTRS), a post-inference module that refines predictions by enforcing compositional constraints, propagating evidence across taxonomic hierarchies, and calibrating confidence using entropy and variance-based metrics.
HTRS includes context-aware thresholding and co-occurrence priors to adaptively optimize performance based on dataset characteristics. Results: Together, these innovations create a unified framework for metagenomic identification that combines probabilistic modeling, deep learning, and structured reasoning. Discussion: The architecture delivers robust and interpretable results, making it suitable for applications in clinical diagnostics, environmental monitoring, and ecological research.
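
The abstract describes the post-inference refinement only at a high level. As a rough illustration of the kind of steps it names (thresholding, enforcing a compositional constraint, propagating evidence up a taxonomic hierarchy, and an entropy-based confidence signal), a minimal Python sketch might look like the following. This is not the authors' HTRS implementation: the toy taxonomy, the function names, the fixed `min_fraction` cutoff (standing in for the adaptive, context-aware threshold), and the example abundances are all assumptions made for illustration.

```python
import math

# Hypothetical toy taxonomy mapping species to genus (illustrative only).
PARENT = {
    "E. coli": "Escherichia",
    "E. fergusonii": "Escherichia",
    "S. aureus": "Staphylococcus",
    "S. epidermidis": "Staphylococcus",
}


def threshold_and_renormalize(abundance, min_fraction):
    """Drop taxa below min_fraction, then rescale so the fractions sum to 1.

    The rescaling is the compositional constraint: relative abundances
    must form a valid distribution after filtering.
    """
    kept = {t: a for t, a in abundance.items() if a >= min_fraction}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {t: a / total for t, a in kept.items()}


def propagate_to_parents(abundance, parent):
    """Aggregate evidence one level up by summing children into their parent."""
    out = {}
    for taxon, a in abundance.items():
        p = parent[taxon]
        out[p] = out.get(p, 0.0) + a
    return out


def normalized_entropy(abundance):
    """Shannon entropy divided by log(n): 0 for one taxon, 1 for a uniform mix."""
    probs = [a for a in abundance.values() if a > 0]
    if len(probs) <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(len(probs))


# Assumed species-level estimates (made-up numbers, not results from the paper).
species = {
    "E. coli": 0.70,
    "E. fergusonii": 0.06,
    "S. aureus": 0.23,
    "S. epidermidis": 0.01,
}

filtered = threshold_and_renormalize(species, min_fraction=0.02)
genus = propagate_to_parents(filtered, PARENT)
uncertainty = normalized_entropy(genus)

print(filtered)
print(genus)
print(round(uncertainty, 3))
```

In this sketch a low normalized entropy at the genus level would be read as a confident call dominated by one lineage, while a value near 1 would flag an ambiguous sample. A fuller version would presumably adapt the cutoff to sequencing depth and incorporate variance estimates, as the abstract suggests.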