Embodiment of machine learning in biogas power industry: an opinion

Achinas, Spyridon; Vitola Quintero, Marena; Euverink, Gerrit Jan Willem

doi:10.3389/fenrg.2025.1589782

OPINION article

Front. Energy Res., 01 October 2025

Sec. Bioenergy and Biofuels

Volume 13 - 2025 | https://doi.org/10.3389/fenrg.2025.1589782

This article is part of the Research Topic: Artificial Intelligence/Machine Learning Applications for Sustainable Energy, Biofuels, and Chemicals

Spyridon Achinas¹*

Marena Vitola Quintero²

Gerrit Jan Willem Euverink¹

¹Faculty of Science and Engineering, University of Groningen, Groningen, Netherlands
²Faculty of Engineering, Rafael Núñez University Corporation, Cartagena, Colombia

1 Introduction

In response to escalating environmental challenges and the global energy crisis, Europe has set ambitious targets to reduce greenhouse gas emissions and increase the production of renewable energy. Biogas, derived from the anaerobic digestion (AD) of organic waste, is poised to play a pivotal role in the transition towards a sustainable energy landscape and a circular economy. According to recent studies, biogas production not only mitigates greenhouse gas emissions but also contributes to waste management and energy security (Yao et al., 2023; Wang et al., 2022). Current research efforts focus on implementing machine learning (ML) techniques in bioenergy systems (Yao et al., 2023). The growth in available computational power has made it practical to run complex and time-consuming algorithms, supporting accurate decision-making in the energy industry (Wang et al., 2022). ML makes it possible to build complex prediction models from high-dimensional datasets that help energy experts optimize processes (Ukoba et al., 2024). The demand for transforming bioenergy systems makes the development and implementation of novel ML-based approaches a necessity. Accurate metering of complex process parameters makes ML a powerful tool for operators in plant monitoring and subsequent decision-making (Duchesne et al., 2020). ML can also support the optimization of reactor parameters, enhancing the efficiency of the power plant.

Over the last decade, the optimization of the AD process for biogas production has been under investigation (Habib et al., 2024). AD plant operations have also benefited from pioneering work on learning-based platforms, as ML tools can incorporate big data and leverage predictive learning to support biogas plants (Wang et al., 2020). ML tools are increasingly being applied across three primary categories within the AD process: monitoring, modeling, and optimization. In monitoring, ML algorithms enable real-time data analysis, facilitating early detection of process anomalies and improving operational stability (Cinar et al., 2021). For modeling, ML techniques such as Artificial Neural Networks (ANN) and Random Forest (RF) are used to predict biogas yield based on input parameters like substrate composition, temperature, and pH (Andrade Cruz et al., 2022). Optimization, on the other hand, leverages ML to fine-tune process parameters, enhancing biogas production efficiency and reducing operational costs (Kim et al., 2022).
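As a rough illustration of the monitoring category, the sketch below flags readings that deviate strongly from the recent history of a signal. It is a minimal rolling z-score rule, not a method taken from the cited studies; the function name, the window length, the threshold, and the flow values are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Constant history: treat any change as a deviation.
            if readings[i] != mu:
                flagged.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Hypothetical hourly biogas flow (m3/h) with a sudden drop at index 7.
flow = [50.1, 49.8, 50.3, 50.0, 49.9, 50.2, 50.1, 35.0, 50.0, 49.7]
print(flag_anomalies(flow))
```

A full-scale plant would more likely apply such a rule to several signals at once (gas flow, pH, temperature) and tune the window to the sampling rate of its sensors.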

Despite the promising potential of ML in biogas production, several challenges persist. One major issue is the complexity of bioprocess data, which often require extensive preprocessing before they can be used as input for ML models. We recognize the controversy regarding the interpretation of experimental data for replicable model implementation at industrial scale. Extracting, quantifying, and evaluating bioprocess features is crucial, as raw data from AD processes are often noisy and non-linear (Beltramo and Hitzmann, 2019). Additionally, the lack of standardized datasets and the variability in feedstock composition further complicate the application of ML in full-scale biogas plants (Danish, 2023).
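To make the preprocessing step concrete, the following sketch cleans a short pH series before it would be passed to a model. It assumes two simple steps that are common in practice but are not prescribed by the cited work: forward-filling missing readings and smoothing spikes with a rolling median. The function names and the sample values are hypothetical.

```python
from statistics import median

def fill_missing(values):
    """Replace None entries with the last valid reading (forward fill).
    Assumes the first reading is valid."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def median_filter(values, window=3):
    """Smooth a series with a centered rolling median (odd window).
    Windows are truncated at the edges of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed

# Hypothetical pH log with one dropped reading and one spike (9.8).
ph_raw = [7.1, 7.2, None, 9.8, 7.2, 7.3, 7.1]
ph_clean = median_filter(fill_missing(ph_raw))
print(ph_clean)
```

The median is used here rather than the mean because a single spike does not pull it away from the neighbouring values.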

The technology also needs to be further developed and popularized among biogas plants because of the dynamics of the AD process and the uncontrollable environmental conditions (Beltramo and Hitzmann, 2019). Traditional methods are criticized for oversimplifying uncertain process parameters and for being unable to monitor bioreactor efficiency under perturbations. ML-based tools, by contrast, can capture and evaluate a large number of discriminative features over time that traditional methods cannot assess.

At the moment, although research points to many superior properties of AI, the field is still in its infancy, with real-life applications yet to surface. We believe that deeper investigation is required to compile insights from ML implementation in biogas production. This article briefly highlights intrinsic aspects of ML integration in biogas systems and provides the authors’ viewpoints on the potential use of ML in the biogas industry.

2 Carving out new territory in biogas power

The gap between the efficacy and the reliability of learning algorithms calls into question whether the AI industry can win the support of the biogas industry (Figure 1) while resolving the technological implications (Beltramo and Hitzmann, 2019). The biogas industry has struggled with the uneven applicability of AI approaches because of technical barriers, and an overview that weighs the ongoing debate on AI tools is timely. A wide range of bioenergy applications, from process monitoring to parameter selection, can be attributed to ML. Combining innovative ML techniques with data quality management across these applications remains challenging for biogas producers (Khatri and Khatri, 2022). Bioenergy providers expect manufacturers of ML tools to take on AI projects by securing additional investment.

Figure 1. High-level drivers of machine learning in biogas technology.

An important problem in bioprocess optimization is subjectivity: an expert still has a higher chance of selecting the wrong bioreactor parameters than specialized software, although software can itself be based on subjective estimates. A variety of ML-based tools have been successfully applied in biogas production, each with its unique strengths. Artificial Neural Networks (ANN) are widely used for their ability to model complex, non-linear relationships in biogas production data (Barik and Murugan, 2015). Random Forest (RF) and Support Vector Machines (SVM) are particularly effective for classification tasks and feature selection, making them ideal for optimizing substrate mixtures (Khatri and Khatri, 2022). Bayesian Networks (BN) offer probabilistic reasoning capabilities, which are useful for handling uncertainty in AD processes (De Clercq et al., 2020). Additionally, Extreme Gradient Boosting (XGBoost) has shown promise in improving prediction accuracy by combining multiple weak models into a robust ensemble (Habib et al., 2024). These methods generate datasets, recognize patterns in these large datasets, and translate a qualitative and largely subjective task into a quantitative and reproducible one.

The field of ML aims to deliver estimates and process selection recommendations for biogas production under high levels of uncertainty (Ukoba et al., 2024). ML allows complex AD process models to be built from high-dimensional datasets to aid engineers and operators in biogas yield prediction (Khatri and Khatri, 2022). Emerging breakthroughs in the development of monitoring and optimization models for bioreactor efficiency are an intrinsic pillar for companies that make ML-related energy products and services available to biogas power plants (Sonwai et al., 2023; Habib et al., 2024).

These companies develop software with sophisticated predictive models that focus on simplifying the anaerobic digestion process and removing constraints related to waste composition analysis, operational efficiency, and regulatory compliance. Kanadevia-INOVA (Switzerland) has developed the DPM AI system to increase the operational reliability and productivity of dry biogas plants. It is an in-house, AI-based development for the early detection of digester biology problems and can be refitted to any biogas facility operating in continuous mode (Kanadevia-INOVA, 2023). BioGASMAS (Lithuania) is an AI-powered analytics provider for the biogas industry (BioGASMAS, 2021). Its product, BioGASMAS, is digital twin-based software that replicates plant functions and collects (1) data from sensors and (2) document-based data entered by the plant team. Algorithms collate and leverage the data, and AI-powered analytics enable the development of predictive models that help biogas producers identify trends and send alerts to the responsible facility staff. Anessa (Canada) has developed the Anessa AD software, based on digital twin technology, for monitoring plant performance and optimizing plant layout. Specifically, it tests feedstock combinations, detects anomalies in gas production, temperature, pH, and pressure, and optimizes these process parameters (Anessa, 2024).

3 Realising the value of ML

The advent of AI has shown that bioenergy production systems can be improved, as ML enables parameter selection, process monitoring, and optimization (Kim et al., 2022). Fundamental discoveries are being made to overhaul struggling energy processes and to ensure that the algorithms function as intended (Ukoba et al., 2024). The goal is to evolve the algorithms and to use databases of multimodal data types. The lifecycle of ML tools in computational bioprocess modeling aims to enhance data vigilance and improve efficiency and validity (see Figure 2). The convergence of ML innovation and bioinformatics research is essential for understanding the strategy and delivering AI solutions.

Figure 2. ML lifecycle scheme using algorithms to optimize and validate bioprocesses.

Collecting datasets and integrating them is a delicate task because the data must be categorized into a finite number of classes. Sustained research efforts at the global level create a broad basis for technological investment, with the promise of reliable solutions, to spur the field of AI. The focus on producing ML algorithms that harness AI to extract information from vast amounts of data raises the question of whether manufacturers can achieve a profitable technological path (Duchesne et al., 2020).

Implementing AI practices is essential for a decisive shift towards reliable decision support systems (Beltramo and Hitzmann, 2019). Process monitoring and optimization are the main catalysts for the development of ML-based platforms. Scientific hubs expect European funding schemes to support AI-oriented projects with additional budgets for ML platforms. The gap between research and commercialization calls into question whether the bioenergy industry can drive the introduction of advanced decision-making tools in biopower generation (Liao and Yao, 2021).

Technological advancements in ML, referred to by some as a critical juncture for bioenergy production, have led to the expectation that computational engineers will expand the range of end solutions (Yao et al., 2023). The advent of ML may strengthen the biogas industry and the struggling bioenergy sector. ML solutions hinge on the complex interplay between data preprocessing and learning. A large number of ML-related objectives and interventions outline the status quo; it has frequently been noted that the use of ML prediction tools for yield forecasting has benefited the power sector.

A large number of recent studies on AI may herald an unprecedented industrial focus on ML tools for biogas production (Khatri and Khatri, 2022; Habib et al., 2024). Applying ML tools can improve biogas production efficiency and can be used to fine-tune a suitable set of parameters. The technological deadlock in data reliability has prompted a strong push to fuse different ML tools. Future ML ventures may reinforce their competitiveness in the bioenergy sector.

4 Leveraging ML algorithms in biogas power

The development of ML involves marginalization, which enhances the accuracy and calibration of the models. ML approaches have thus been used to build predictive models for several AD parameters, such as substrate and inoculum characteristics, temperature, reactor configuration, pH, hydraulic retention time (HRT), and organic loading rate (OLR).
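Two of the parameters listed above, the hydraulic retention time (HRT) and the organic loading rate (OLR), are derived quantities rather than direct sensor readings. The sketch below computes them from their standard definitions, HRT = V/Q and OLR = Q·S/V, for a continuously fed digester. The function names and the plant figures are hypothetical and serve only to show the arithmetic.

```python
def hydraulic_retention_time(reactor_volume_m3, feed_flow_m3_per_day):
    """HRT in days: working volume divided by daily feed flow."""
    return reactor_volume_m3 / feed_flow_m3_per_day

def organic_loading_rate(feed_flow_m3_per_day, feed_vs_kg_per_m3, reactor_volume_m3):
    """OLR in kg VS per m3 of reactor per day: daily volatile-solids load
    divided by working volume."""
    return feed_flow_m3_per_day * feed_vs_kg_per_m3 / reactor_volume_m3

# Hypothetical digester: 2000 m3 working volume, fed 100 m3/day
# of substrate containing 60 kg VS per m3.
hrt = hydraulic_retention_time(2000, 100)
olr = organic_loading_rate(100, 60, 2000)
print(hrt, olr)  # 20.0 days, 3.0 kg VS/(m3·d)
```

Because both quantities share the same volume and flow, they are strongly correlated for a fixed feedstock, which is worth keeping in mind when both are offered to a model as separate input features.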

Vien et al. (2024) developed an ML-based monitoring strategy to enhance diagnostics for the performance optimization of Melbourne Water’s Western Treatment Plant (WTP). Their model uses real-time operational data to make probabilistic predictions of biogas performance in the wastewater treatment plant. Gan et al. (2024) studied various tools (RF, ANN, EN) with 92 datasets to evaluate the synergistic effects of co-digesting palm oil mill effluent and decanter cake for COD and biogas prediction. Andrade Cruz et al. (2022) reviewed the application and challenges of ML tools in AD and concluded that ANN, RF, and SVM are particularly reliable and effective in biogas yield prediction, bioreactor stability monitoring, and the optimization of the AD bioprocess based on real-time parameters.

Tufaner and Demirci (2020) applied non-linear regression models (LMM/three layers FFN) to predict the biogas production rate of an anaerobic hybrid reactor treating synthetic wastewater. They used several process parameters (such as pH, alkalinity, OLR, COD, and volatile solids) as input variables, applying 60 datasets. It is worth mentioning that ANNs have been widely implemented to predict the performance of biogas plants (Barik and Murugan, 2015; Flores-Asis et al., 2018; Beltramo et al., 2019; Ismail et al., 2019; Neto and Ozorio, 2021); however, researchers have reported a lack of reliability in the interpretation and pattern recognition of large datasets.

The formation of scientific consortia is regarded as a crucial impetus for developers to consider efficient computational approaches in bioenergy analytics and to invest in the AI field. For practical decision-support applications in the biogas sector, stakeholders’ interest in innovative ML tools appears to have strengthened.

ML techniques have been implemented in biogas production and optimization using substrate characteristics as input variables (Ghatak and Ghatak, 2018; Almomani, 2020). Sonwai et al. (2023) followed the RF approach with 14 datasets to determine critical factors related to biogas yield and examined their impact on the anaerobic degradation of lignocellulosic biomass. A feed-forward ANN approach was also implemented to predict the biogas generation rate from the co-digestion of rice silage and vegetable residues (Singh and Uppaluri, 2023). Researchers concluded that using substrate attributes as input variables in the models provided a more accurate interpretation of the diverse demands of bioreactor feeding.

Manufacturers of ML tools for bioenergy operations are conceiving ML as a nascent solution for probabilistic inference that can aid operational decision-making. The promise of ML applications is tantalizing because of their ability to intuitively capture the causal links between biogas plant factors that are stored in data portfolios (De Clercq et al., 2020). Previous research has concluded that such probabilistic approaches are more intuitive than many other ML techniques for providing decision support. Strengthening ML-based modeling strategies is conceivable; however, considering business-like practices is vital to steer the process decision-making field onto a profitable trajectory (Offie et al., 2023).

The selection of machine learning algorithms for optimizing biogas production necessitates a critical analysis of their advantages and limitations. Artificial Neural Networks (ANNs) excel at modeling complex nonlinear relationships in large datasets, making them ideal for predicting biogas yield under variable operational conditions (Andrade Cruz et al., 2022). However, their “black-box” nature impedes result interpretability—a significant challenge in industrial settings where transparency is critical for decision-making (Beltramo and Hitzmann, 2019). In contrast, Support Vector Machines (SVMs) offer superior interpretability and efficacy in classification tasks, particularly with moderate-dimensional data, but their performance degrades with excessively large or noisy datasets (Khatri and Khatri, 2022).

Generative Adversarial Networks (GANs) emerge as a promising alternative for simulating dynamic operational scenarios in biogas plants, enabling real-time optimization of parameters like hydraulic retention time (Amin et al., 2024). Yet, their implementation demands high computational costs and extensive training data, limiting feasibility for small-scale or resource-constrained facilities (Safari et al., 2024). While ANNs and SVMs are more accessible for budget-limited projects, GANs represent a strategic investment for large-scale plants aiming to maximize efficiency through synthetic data generation and digital twins (EcoData Center, 2023).

Algorithm selection hinges on project-specific objectives. For real-time monitoring and anomaly detection, SVMs may prove more suitable due to their robustness with incomplete data (Cinar et al., 2021). Conversely, ANNs are preferable for modeling systems with interdependent variables (e.g., substrate composition and environmental conditions) (Dominguillo-Ramírez et al., 2023). Though less mature in industrial applications, GANs offer unique potential for bioprocess innovation—provided technical and economic barriers are overcome (Amin et al., 2024). This comparison underscores the need for hybrid approaches that leverage each algorithm’s strengths to address the multifaceted challenges of sustainable biogas production.

Although the transition to ML has constraints and difficulties, an unequivocal technological strategy is pivotal for the creativity and maturity of ML approaches in biogas production. Technological advancements and scientific breakthroughs in the AD process affirm the applicability of ML with big data, a fact that necessitates the confluence of ML techniques. Technological delays slow the uptake of ML; however, scientific AI hubs and focused ML research efforts are working painstakingly to ensure its implementation in the biogas sector.

5 Ensconcing the bioenergy-ML affinity

At the same time, contentious arguments over the translation of ML algorithms into practice jeopardize the commercialization of these tools. As several studies indicate, understanding AD variables and conditions may improve the development of ML algorithms, which is the essence of the issue (Wang et al., 2020).

The clustering of these techniques is the sine qua non for the correct use of machine learning in biogas plant optimization. Adding a new piece of evidence to an ML model requires a fairly small number of probabilities and edges in the graph and avoids complicated extension steps. The graph is a crucial component that provides a compact representation of the knowledge surrounding the system. Looking ahead, leveraging ML promises a marked shift towards data-driven tools that are averse to process risk.

We believe that one of the major obstacles to implementing artificial intelligence in biogas plants is the lack of universal protocols for data collection and storage. According to Cinar et al. (2021), fewer than 30% of European plants use compatible formats, hindering the development of scalable predictive models. This issue is further exacerbated by the absence of sector-specific regulations defining mandatory measurement parameters (International Energy Agency, 2022). A potential emerging solution is the development of cloud-based platforms like BioGasML, which provides standardized templates for recording critical variables such as pH, temperature, and substrate composition (Safari et al., 2024).

While institutional investments in AI literacy are undeniably crucial, bridging the gap between theoretical potential and practical implementation requires standardized frameworks tailored to biogas systems. Recent initiatives like the EU OpenBiogas API (European Biogas Association, 2023) demonstrate how blockchain-based interoperability can mitigate data fragmentation, yet sector-wide adoption remains limited. An effective solution lies in adaptive hybrid models that combine mechanistic anaerobic digestion equations (e.g., ADM1) with lightweight ML algorithms (e.g., decision trees for anomaly detection), reducing computational overhead while maintaining interpretability (Offie et al., 2023). Such frameworks could be piloted through public-private partnerships, with regulatory incentives for plants that achieve NREL’s algorithmic certification benchmarks (NREL, 2024).
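The hybrid idea can be sketched in a few lines. The example below is deliberately simplified and rests on stated assumptions: a first-order kinetic curve, B(t) = B0(1 − e^(−kt)), stands in for the far more detailed ADM1, and a single learned threshold on the model residual stands in for a decision tree of depth one. The parameter values, observations, and labels are hypothetical.

```python
import math

def first_order_methane(t_days, b0, k):
    """Mechanistic part: cumulative methane yield B(t) = B0 * (1 - exp(-k t))."""
    return b0 * (1.0 - math.exp(-k * t_days))

def fit_stump(abs_residuals, is_anomaly):
    """ML part: choose the residual threshold with the fewest misclassifications.
    Returns None if all residuals are identical."""
    values = sorted(set(abs_residuals))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(abs_residuals) + 1
    for threshold in candidates:
        errors = sum((r > threshold) != label
                     for r, label in zip(abs_residuals, is_anomaly))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def is_anomalous(t_days, observed, threshold, b0=300.0, k=0.1):
    """Flag an observation whose distance from the mechanistic curve exceeds the threshold."""
    return abs(observed - first_order_methane(t_days, b0, k)) > threshold

# Hypothetical labelled history: day, observed yield (mL CH4/g VS), operator label.
days = [5, 10, 15, 20, 25, 30]
observed = [120.0, 188.0, 200.0, 258.0, 240.0, 286.0]
labels = [False, False, True, False, True, False]

residuals = [abs(o - first_order_methane(t, 300.0, 0.1)) for t, o in zip(days, observed)]
threshold = fit_stump(residuals, labels)
print(round(threshold, 1), is_anomalous(12, 150.0, threshold))
```

The appeal of this arrangement is that the learned component stays small and readable: an operator can inspect one threshold expressed in the same units as the measured yield.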

Moreover, the “incremental” critique underscores a missed opportunity: leveraging federated learning to address data scarcity without compromising proprietary plant data. Projects like EcoData Analytics’ synthetic data platform (EcoData Center, 2023) show promise, but their scalability depends on integrating domain-specific constraints (e.g., feedstock variability protocols from BioGASMAS’ digital twins). A hierarchical validation pipeline, in which synthetic data train base models and real-world microdata fine-tune them, could resolve the trade-off between reliability and model complexity (Amin et al., 2024). We agree that this approach could transform incremental gains into systemic progress by embedding ML adaptability into existing ISO 20670:2018 biogas monitoring standards.
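The two-stage pipeline described above can be illustrated with a toy example. The sketch assumes the simplest possible model, y = w·x fitted by gradient descent, so that the effect of each stage is easy to follow; the simulator slope, the plant measurements, the learning rate, and the step counts are all hypothetical and are not drawn from the cited platforms.

```python
def mse(data, w):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(data, w, learning_rate, steps):
    """Gradient descent on the MSE of y = w * x, starting from weight w."""
    for _ in range(steps):
        grad = 2 * sum((w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad
    return w

# Stage 1: plentiful synthetic data from a hypothetical simulator (y = 2.0 * x).
synthetic = [(x, 2.0 * x) for x in range(1, 6)]
# Stage 2: a few hypothetical plant measurements; the real response is steeper.
real = [(1, 2.5), (2, 5.0), (3, 7.5)]

w_base = train(synthetic, w=0.0, learning_rate=0.01, steps=200)
w_tuned = train(real, w=w_base, learning_rate=0.01, steps=20)

print(round(w_base, 3), round(w_tuned, 3))
print(mse(real, w_base) > mse(real, w_tuned))
```

The short fine-tuning stage moves the weight only part of the way towards the plant data, which is the intended behaviour when real measurements are scarce and the synthetic prior should not be discarded entirely.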

Additionally, data quality is compromised by the widespread use of uncalibrated sensors in small and medium-sized plants. A study by Haque et al. (2024) revealed that 68% of pH sensors in anaerobic digesters exhibited deviations >0.5 units after 6 months of operation. To address this, companies like Siemens Energy have developed IoT-enabled self-calibrating sensors that adjust readings every 12 h using thermodynamic correction algorithms. This approach has reduced errors by 72% in pilot tests across 15 plants (Siemens Energy, 2023).

Another critical challenge we see in this ML-biogas industry nexus is substrate heterogeneity. The composition and origin of biomass and biowaste vary, leading to inconsistencies in historical data. To mitigate this, a study examined the prediction of methane yield using an artificial neural network (ANN) model. The model used 340 experimental data points on biomass from various sources, i.e., manure, plants, maize, grass, and tubercle. The proposed ANN-based model showed significantly high predictive power, with lower RMSE values and lower prediction error in most cases, and the authors concluded that the model can be implemented in the preliminary stages of bioprocess design in biogas-related projects (Dominguillo-Ramírez et al., 2023).
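For readers less familiar with the metric, the root mean squared error (RMSE) is the square root of the mean of the squared differences between observed and predicted values, so it is expressed in the same units as the yield itself. The sketch below shows the calculation; the function name and the methane yields are hypothetical and are not taken from the cited study.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical methane yields (mL CH4/g VS): measured versus model output.
observed = [310.0, 250.0, 400.0, 180.0]
predicted = [300.0, 260.0, 390.0, 190.0]
print(rmse(observed, predicted))  # 10.0
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, which suits applications where occasional large prediction errors are costly.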

Similarly, the scarcity of labeled data limits the training of advanced models. Startups like EcoData Analytics are tackling this issue through synthetic data generation, where physicochemical models simulate thousands of realistic operational scenarios (EcoData Center, 2023). These synthetic datasets, validated against real-world measurements in 10% of cases, enable algorithm training without compromising plant confidentiality. A successful Norwegian case study demonstrated that this method can reduce predictive model development time by 60% (Nordic Biogas Report, 2024).

Interoperability between systems remains a critical barrier. While large plants use SAP or PI System software, smaller facilities rely on manual spreadsheets. To bridge this gap, the EU-led OpenBiogas initiative is developing a universal API that connects 18 different formats via blockchain, ensuring traceability and security. Recent trials show an 80% reduction in data transmission errors across platforms (European Biogas Association, 2023). To validate model reliability, standardized benchmarking is essential. The U.S. National Renewable Energy Laboratory (NREL) introduced the first quality certification for biogas algorithms in 2024, assessing accuracy, robustness, and algorithmic fairness (NREL, 2024). Plants adopting this standard reported a 35% reduction in production prediction failures over 6 months.

In the technical domain, transformer-based models have been considered effective for predicting failures in biogas generators. These systems combine Long Short-Term Memory (LSTM) networks with attention mechanisms to process multivariate temporal data, achieving 40% higher accuracy than traditional methods (Araujo-Varga et al., 2022). Another breakthrough is dynamic optimization via digital twins. Denmark’s BioCirc plant employs generative adversarial networks (GANs) to simulate 1,200 operational scenarios per minute. This technology autonomously adjusts hydraulic retention times, increasing annual production by 2.1 GWh (Amin et al., 2024).

When new learning techniques become mainstream, questions inevitably arise about their longevity and efficacy prior to their integration into the market. These dilemmas and this skepticism fuel the debate on whether better guardrails and standardized frameworks can address pervasive bias in algorithms. The ML industry is considering these validity issues; however, sound technological experience is needed to avoid operational risks, as careless investments can imperil the viability of these tools.

6 Conclusion

This article has explored the transformative potential of ML in the biogas sector, highlighting its applications in monitoring, modeling, and optimization. While ML offers significant advantages, including improved prediction accuracy and operational efficiency, challenges such as data preprocessing, model interpretability, and scalability remain. To fully realize the potential of ML in biogas production, it is essential to invest in research and development, foster collaboration between academia and industry, and integrate ML training into bioenergy programs. We agree that as the field continues to evolve, ML is expected to play an increasingly critical role in driving the transition towards a sustainable and circular bioeconomy. These approaches may improve the value chain of ML tool functionality and attract commercial interest. To overcome the weakness of limited expertise and knowledge, we suggest that training in ML tools should become part of university bioenergy programs as well as of lifelong learning. The allure of ML has created a positive bias among investors, who pin their hopes for the world’s bioenergy future on learning tools.

Author contributions

SA: Conceptualization, Investigation, Writing – original draft. MV: Writing – review and editing. GE: Writing – review and editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Almomani, F. (2020). Prediction of biogas production from chemically treated co-digested agricultural waste using artificial neural network. Fuel 280, 118573. doi:10.1016/j.fuel.2020.118573