ORIGINAL RESEARCH article

Front. Cell. Infect. Microbiol.

Sec. Antibiotic Resistance and New Antimicrobial drugs

This article is part of the Research Topic: Machine Learning and AI-Driven Insights into Microbial Pathogenesis and Drug Resistance

Uncovering the potential virulence factors of emerging pathogens using AI/ML-based tools: a case study in Emergomyces africanus

Provisionally accepted
Peter F. Farag1*, Karema S. Abdel-monem2, Hibah M. Albasri3, Areej A. Alhhazmi4, Rana H. Ismail5
  • 1Department of Microbiology, Faculty of Science, Ain Shams University, Cairo, Egypt
  • 2Department of Microbiology and Biochemistry, Faculty of Science, Benha University, Al-Qalyubia, Egypt
  • 3Department of Biology, College of Science, Taibah University, Al-Madinah, Saudi Arabia
  • 4Clinical Laboratory Sciences Department, Applied Medical Sciences, Taibah University, Al-Madinah, Saudi Arabia
  • 5Department of Microbiology, Faculty of Science, Ain Shams University, Cairo, Egypt

The final, formatted version of the article will be published soon.

Background: We are currently in the era of artificial intelligence (AI), which has become deeply embedded across nearly all scientific disciplines. Harnessing this technology to predict virulence factors of emerging pathogens can improve our understanding of their pathogenicity, especially since the majority of these pathogens' proteomes are composed of hypothetical or uncharacterized proteins. Moreover, emerging orphan proteins are expressed from novel open reading frames. Therefore, this study aimed to develop a pipeline for predicting and annotating the structures of species-specific secreted proteins of these pathogens, with Emergomyces africanus selected as a model organism.
Methods: The proteome of E. africanus CBS 136260 was retrieved from the NCBI database. The secretome of this fungus was predicted with the ML-based tools SignalP and Phobius, targeting signal peptide (SP)-bearing proteins. Species-specific proteins were detected using BLASTp (sequence level) and AFDB clusters (structure level). AlphaFold2, an AI-based system, was used to build structural models of hypothetical proteins specific to Emergomyces. DeepFRI was used to predict functional annotations of these proteins from their structures, while the DALI server was used to detect structurally similar homologs. Candidate proteins were subjected to molecular docking analysis against MHC-II.
Results: The structure modeling and homology matching revealed several protein domains similar to toxins (scorpion toxin-like, cytolysin, CARDS toxin, defensin-like), allergens, adhesins, hydrolytic enzymes, and inhibitors. Novel domains with putative functions (ion binding, proteolysis, transferase activity, and protein binding) were also discovered. In immunoinformatics and molecular docking studies, a cytolysin-like domain-containing protein (Gene ID: ACJ72_08076) outperformed the other selected proteins in binding to MHC-II (docking score = −318.74; confidence score = 0.96).
Conclusion: The findings suggest that AI and ML tools can be employed in the preliminary stage to explore host-pathogen interactions and anticipate novel virulence genes.
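
The filtering logic summarized in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say whether SignalP and Phobius predictions were intersected or merged, so the `require_both` switch is an assumption, and the function name, arguments, and protein IDs are hypothetical placeholders. The inputs are assumed to be plain sets of protein IDs already parsed from each tool's output.

```python
from typing import Iterable, Set


def candidate_secreted_specific(
    signalp_ids: Iterable[str],
    phobius_ids: Iterable[str],
    ids_with_homologs: Iterable[str],
    require_both: bool = True,
) -> Set[str]:
    """Return IDs predicted as secreted that lack homologs in other taxa.

    signalp_ids / phobius_ids: proteins each tool flags as bearing a signal peptide.
    ids_with_homologs: proteins with a significant hit outside the genus
    (for example from a BLASTp search); these are treated as not species-specific.
    require_both: assumption, not stated in the abstract. True keeps only
    proteins flagged by both tools; False keeps proteins flagged by either.
    """
    sp, ph = set(signalp_ids), set(phobius_ids)
    secretome = sp & ph if require_both else sp | ph
    # Drop anything that has a homolog elsewhere, leaving species-specific candidates.
    return secretome - set(ids_with_homologs)


if __name__ == "__main__":
    # Made-up placeholder IDs, not real E. africanus gene IDs.
    signalp = {"P1", "P2", "P3"}
    phobius = {"P2", "P3", "P4"}
    homologs = {"P3"}
    print(sorted(candidate_secreted_specific(signalp, phobius, homologs)))
```

Requiring agreement between the two predictors trades sensitivity for fewer false positives; setting `require_both=False` gives the more permissive union. The surviving IDs would then be the candidates passed on to structure modeling and annotation.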

Keywords: AI tools, emerging pathogens, Emergomyces, pathogenicity, virulence factors

Received: 30 Sep 2025; Accepted: 20 Nov 2025.

Copyright: © 2025 Farag, Abdel-monem, Albasri, Alhhazmi and Ismail. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Peter F. Farag, peter_jireo@sci.asu.edu.eg

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.