AUTHOR=Holgado-Apaza Luis Alberto , Isuiza-Perez Dany Dorian , Ulloa-Gallardo Nelly Jacqueline , Vilchez-Navarro Yban , Aragon-Navarrete Ruth Nataly , Quispe Layme Wilian , Quispe-Layme Marleny , Castellon-Apaza Danger David , Choquejahua-Acero Remo , Prieto-Luna Jaime Cesar TITLE=A machine learning approach to identifying key predictors of Peruvian school principals' job satisfaction JOURNAL=Frontiers in Education VOLUME=Volume 10 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/education/articles/10.3389/feduc.2025.1580683 DOI=10.3389/feduc.2025.1580683 ISSN=2504-284X ABSTRACT=School principals encounter contemporary demands that impact their job satisfaction and leadership effectiveness. Despite the significance of this issue, there is limited research on satisfaction predictors for these professionals, particularly using machine learning approaches. This study identified key predictors of job satisfaction among Peruvian school principals by applying an ensemble of feature selection methods and evaluating five machine learning algorithms (Random Forest, Decision Trees-CART, Histogram-Based Gradient Boosting, XGBoost, and LightGBM) with data from the 2018 National Survey of Directors. The main variables identified included satisfaction with salary, geographic location of the educational institution, relationships with students and teachers, workplace climate, student learning achievements, and job benefits. Economic factors proved important, such as gross and net income, and the minimum monthly amount required to meet household needs. Time-related aspects also exerted influence, including hours dedicated to training, time spent on administrative and/or teaching duties outside working hours, travel time to and from the Local Educational Management Unit (UGEL), duration of stays at the UGEL, and commuting time from principal residence to the educational institution.
The Histogram-Based Gradient Boosting algorithm, optimized with Bayesian techniques and trained with data balanced through Random Oversampling, achieved a balanced accuracy of 0.63 on a test set with real-world class distribution. When using Generative Adversarial Networks to balance only the training set, better results were obtained in recall (0.74), precision (0.72), and F1 score (0.70). SHAP analysis revealed that economic factors primarily influenced dissatisfied principals, while interpersonal factors were more important for highly satisfied principals, suggesting a hierarchical pattern of needs. The findings could inform strategies to enhance principals' job satisfaction and strengthen data-driven educational policies.
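
As a rough illustration of the evaluation setup described in the abstract, the following Python sketch applies random oversampling to the training data only and computes balanced accuracy on a test set left at its original class distribution. It is not the authors' code: the class names, the toy data, and the helper functions `random_oversample` and `balanced_accuracy` are hypothetical, and no classifier is trained (the predictions are made up only to show how the metric behaves).

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy training data: three satisfaction levels, imbalanced.
train_rows = [[i] for i in range(10)]
train_labels = ["high"] * 6 + ["medium"] * 3 + ["low"] * 1
bal_rows, bal_labels = random_oversample(train_rows, train_labels)
balanced_counts = Counter(bal_labels)  # every class now has 6 rows

# Hypothetical test labels keep their original (imbalanced) distribution;
# the predictions are invented and lean toward the majority class.
test_true = ["high", "high", "high", "high", "medium", "medium", "low", "low"]
test_pred = ["high", "high", "high", "high", "medium", "high", "high", "high"]
plain_acc = sum(t == p for t, p in zip(test_true, test_pred)) / len(test_true)
bal_acc = balanced_accuracy(test_true, test_pred)
```

In this toy case plain accuracy is 0.625 while balanced accuracy is 0.5, because the per-class recalls (1.0, 0.5, 0.0) are averaged with equal weight. This is why balanced accuracy is a natural metric to report when the test set keeps a real-world, imbalanced class distribution.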