MINI REVIEW article

Front. Microbiol.

Sec. Microbiotechnology

This article is part of the Research Topic: Microbial Strategies for Phytoremediation Enhancement

Machine Learning Approaches for Data-Driven Hydrocarbon Bioaugmentation and Phytoremediation: The Role of Multi-Omics Insights

Provisionally accepted
  • 1Nnamdi Azikiwe University Faculty of Biosciences, Awka, Nigeria
  • 2Clinical Technology Department, Respiratory Care Program Faculty of Applied Medical Sciences, Umm Al-Qura University, Makkah, Saudi Arabia
  • 3Department of Chemical Engineering, Brunel University London, Uxbridge, UB8 3PH, United Kingdom, London, United Kingdom

The final, formatted version of the article will be published soon.

ABSTRACT: Hydrocarbon contamination, particularly with polycyclic aromatic hydrocarbons (PAHs), poses a significant environmental challenge worldwide because of its persistence and its carcinogenic effects on ecosystems and human health. This review explores how machine learning (ML) algorithms can enhance the efficiency of bioaugmentation and phytoremediation through predictive modeling, real-time optimization of microbial consortia, and plant species selection. Traditional bioremediation methods, such as bioaugmentation and phytoremediation, are characterized by slow degradation rates and sub-optimal performance in complex, multi-contaminant environments. Combining ML models with multi-omics data offers a predictive approach to optimizing bioremediation by providing a systematic understanding of microbial and plant-mediated hydrocarbon degradation processes. Using supervised learning methods such as support vector machines and neural networks, ML models can predict which microbial strains or plant species will effectively degrade hydrocarbons under specific environmental conditions. Additionally, combining multi-omics data with ML facilitates the identification of critical genes, enzymes, and metabolic pathways involved in hydrocarbon degradation and offers insights into the molecular mechanisms that drive the bioremediation process. However, the translation of laboratory-based ML models into large-scale, real-world bioremediation strategies is hindered by the complex, dynamic nature of contaminated environments. This review highlights these hindrances and provides directions for future research, including the development of field-deployable technologies, adaptive ML models, and real-time environmental monitoring strategies. The integration of ML with multi-omics holds substantial promise for enhancing the efficiency, adaptability, and scalability of bioremediation strategies, ultimately mitigating the carcinogenic risks often associated with the hydrocarbon-polluted lithosphere.
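
To make the supervised-learning idea concrete, the following is a minimal sketch of how a support vector machine might separate effective from ineffective degradation conditions. It is not the authors' method: the abstract does not specify a model configuration, so this uses a plain-Python linear SVM trained by batch subgradient descent on the hinge loss rather than a kernel SVM or a library implementation. The two features (a scaled abundance of a hypothetical degradation-gene marker and a scaled soil temperature), every data value, and the labels are invented placeholders for illustration only.

```python
# Toy linear SVM: L2-regularised hinge loss, batch subgradient descent.
# ASSUMPTION: all features, values, and labels are synthetic placeholders,
# not measurements or results from any study.

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Return (weights, bias) for labels y in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        sum_w = [0.0] * d  # sum of y_i * x_i over margin violators
        sum_b = 0.0        # sum of y_i over margin violators
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:  # inside the margin or misclassified
                for j in range(d):
                    sum_w[j] += yi * xi[j]
                sum_b += yi
        # Subgradient step: shrink w (L2 penalty), move toward violators.
        w = [wj - lr * (lam * wj - sj / n) for wj, sj in zip(w, sum_w)]
        b += lr * sum_b / n
    return w, b


def predict(w, b, x):
    """Return +1 (predicted effective degradation) or -1 (ineffective)."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical samples: (scaled gene-marker abundance, scaled temperature).
X = [(1.0, 0.5), (0.8, -0.2), (0.9, 0.1),
     (-1.0, -0.5), (-0.8, 0.2), (-0.9, -0.1)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print("weights:", w, "bias:", b)
print("prediction for a new hypothetical site:", predict(w, b, (0.7, 0.0)))
```

In practice the feature vector would be derived from multi-omics and environmental measurements, and a tested library implementation with cross-validation would be preferable to this hand-written optimizer; the sketch only shows the shape of the prediction task.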

Keywords: bioaugmentation, cancer-risk mitigation, hydrocarbon contamination, machine learning, multi-omics, phytoremediation

Received: 05 Dec 2025; Accepted: 29 Jan 2026.

Copyright: © 2026 Okafor, Alghamdi, Anguilano and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Saeed M. Alghamdi