AUTHOR=Azrai Muhammad , Aqil Muhammad , Andayani N. N. , Efendi Roy , Suarni , Suwardi , Jihad Muhammad , Zainuddin Bunyamin , Salim , Bahtiar , Muliadi Ahmad , Yasin Muhammad , Hannan Muhammad Fitrah Irawan , Rahman , Syam Amiruddin TITLE=Optimizing ensembles machine learning, genetic algorithms, and multivariate modeling for enhanced prediction of maize yield and stress tolerance index JOURNAL=Frontiers in Sustainable Food Systems VOLUME=Volume 8 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/sustainable-food-systems/articles/10.3389/fsufs.2024.1334421 DOI=10.3389/fsufs.2024.1334421 ISSN=2571-581X ABSTRACT=The frequent occurrence of drought, stemming from unpredictable climate-induced weather patterns, makes it challenging to breed drought-tolerant maize and to identify adaptable genotypes. This study explores the optimization of machine learning (ML) techniques to predict both grain yield and stress tolerance index (STI) of maize crosses under normal and drought-induced stress. Three popular ML algorithms were optimized using a genetic algorithm (GA) and ensemble ML to enhance data capture, facilitating the selection of highly drought-tolerant genotypes. In addition, a Multi-trait Genotype-Ideotype Distance (MGIDI) model was used to identify superior maize hybrids well suited to drought conditions. In total, 35 genotypes comprising 31 hybrid candidates and four commercial varieties were evaluated across three normal and drought-affected sites. The findings show that the Random Forest-Genetic Algorithm (RF-GA) combination yields high accuracy (R² = 0.91 for grain yield and 0.79 for STI), whereas the Support Vector Machine-GA (SVM-GA) and K-Nearest Neighbor-GA (KNN-GA) models give less favourable predictions. Notably, the ensemble models, which combine predictions from diverse individual models, consistently deliver the best results.
The ensemble meta-models incorporating SVM, KNN, and RF achieve high R² values (0.92 for grain yield and 0.82 for STI) on the testing datasets. The efficacy of these ensemble models can be attributed to the diversity of their member models, which allows a more comprehensive exploration of the feature landscape. Of the six hybrids with the highest STI values, both the ML-based optimized model and MGIDI identify four as highly drought-tolerant, namely H06, H10, H13, and H35. Thus, combining ML with MGIDI enables researchers to discern the most impactful traits for each genotype and holds promise for advancing drought-tolerant maize breeding and expediting the development of resilient varieties.
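
The abstract names the stress tolerance index (STI) as a prediction target but does not define it. As a minimal sketch, assuming the widely used definition STI = (Yp × Ys) / (mean Yp)², where Yp and Ys are a genotype's yields under normal and stress conditions, the index could be computed and ranked as follows. The abstract does not confirm that this exact formula was used; the function name, genotype labels, and yield values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def stress_tolerance_index(yield_normal, yield_stress):
    """Return one STI value per genotype: (Yp * Ys) / mean(Yp)^2.

    Assumes the commonly used STI definition; the source abstract
    does not state which formula the authors applied.
    """
    if len(yield_normal) != len(yield_stress):
        raise ValueError("yield lists must have the same length")
    mean_normal = mean(yield_normal)  # mean yield under normal conditions
    return [yp * ys / mean_normal ** 2
            for yp, ys in zip(yield_normal, yield_stress)]


# Hypothetical grain yields (t/ha), not data from the study.
genotypes = ["G1", "G2", "G3"]
yp = [10.0, 8.0, 12.0]  # normal conditions
ys = [6.0, 7.0, 4.0]    # drought stress

sti = stress_tolerance_index(yp, ys)

# Rank genotypes from highest to lowest STI.
ranked = sorted(zip(genotypes, sti), key=lambda pair: pair[1], reverse=True)
for name, value in ranked:
    print(f"{name}: STI = {value:.2f}")
```

Under this definition, a genotype scores highly only if it yields well in both environments, which is why STI is a natural target when selecting for drought tolerance without sacrificing yield potential.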