ORIGINAL RESEARCH article

Front. Plant Sci.

Sec. Plant Breeding

Volume 16 - 2025 | doi: 10.3389/fpls.2025.1594736

Optimizing Soybean Variety Selection for the Pan-African Trial Network using Factor Analytic Models and Envirotyping

Provisionally accepted
Maurício S. Araújo1*, Bruno F. Fregonezi1, André A. Stella1, João P. S. Pavan1, Natally F. Lima1, Erica P. Leles2, Michelle F. Santos2, Peter Goldsmith2, Godfree Chigeza3, Brian W. Diers2, José Baldin Pinheiro1
  • 1University of São Paulo, São Paulo, Brazil
  • 2Feed the Future Innovation Lab, University of Illinois Urbana-Champaign, Illinois, United States
  • 3International Institute of Tropical Agriculture, Ibadan, Oyo State, Nigeria

The final, formatted version of the article will be published soon.

Soybean is a global food and industrial crop; however, climate change significantly affects its grain yield. The selection of varieties with high adaptation to the target population of environments is therefore imperative in Sub-Saharan Africa. This study aimed to identify soybean varieties with high overall performance and stability using multi-environment trial data from the Pan-African Soybean Trial Network. Additionally, we sought to determine the environmental factors influencing yield through envirotyping tools. A total of 169 soybean varieties were evaluated across 83 environments in two southeastern African countries: 19 locations in Malawi (47 trials) and 14 locations in Zambia (36 trials). The trials followed a randomized complete block design with three replications. Data for 37 environmental features were obtained from NASA POWER and SoilGrids. We fitted factor analytic (FA) models to estimate genotype adaptation across environments. Additionally, we applied an environmental kernel approach and the XGBoost method to assess the number of mega-environments. The FA model with four factors provided the best fit, with 82.44% of the variance explained and an average semi-variance ratio (ASVR) of 81.95%. Approximately 59.6% of the genotype-by-environment interaction was of the crossover type. Varieties V025, V035, and V158 exhibited high yield potential and reliability but displayed moderate stability. Three mega-environments were identified, with growing degree days, mean temperature, and photosynthetically active radiation use efficiency being the features most strongly associated with soybean grain yield. To improve the identification of variety adaptation in these environments, it is essential to integrate machine learning models with crop growth modeling when assessing associations between environmental features and soybean yield.
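As an illustration of how the "variance explained" figure of a factor analytic model is commonly computed, the following is a minimal Python sketch, not the authors' code. It assumes the usual FA decomposition of the genetic covariance across environments, G = ΛΛ' + Ψ, where Λ holds the environment loadings and Ψ is a diagonal matrix of environment-specific variances. The function name `fa_variance_explained` and the toy loadings and specific variances are hypothetical; they are not estimates from the trial network.

```python
import numpy as np


def fa_variance_explained(loadings, specific_var):
    """Return (G, per-environment % explained, overall % explained).

    loadings     : (n_env, k) array of FA loadings (Lambda), hypothetical input
    specific_var : (n_env,) array of specific variances (diagonal of Psi)
    """
    loadings = np.asarray(loadings, dtype=float)
    specific_var = np.asarray(specific_var, dtype=float)

    # Covariance captured by the k common factors: Lambda Lambda'
    common = loadings @ loadings.T
    # Full genetic covariance across environments: Lambda Lambda' + Psi
    g_cov = common + np.diag(specific_var)

    # Share of each environment's genetic variance captured by the factors
    per_env = 100.0 * np.diag(common) / np.diag(g_cov)
    # Overall share: trace of the common part over trace of the full matrix
    overall = 100.0 * np.trace(common) / np.trace(g_cov)
    return g_cov, per_env, overall


# Toy example: 3 environments, 2 factors (illustrative values only)
toy_loadings = np.array([[0.8, 0.1],
                         [0.6, -0.3],
                         [0.7, 0.4]])
toy_specific = np.array([0.10, 0.20, 0.15])

g_cov, per_env, overall = fa_variance_explained(toy_loadings, toy_specific)
print(np.round(per_env, 2), round(overall, 2))
```

In practice the loadings and specific variances would come from a fitted linear mixed model; the sketch only shows the arithmetic that turns those estimates into a percentage.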

Keywords: Glycine max, linear mixed models, Environmental data, adaptation, stability

Received: 16 Mar 2025; Accepted: 15 May 2025.

Copyright: © 2025 Araújo, Fregonezi, Stella, Pavan, Lima, Leles, Santos, Goldsmith, Chigeza, Diers and Pinheiro. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Maurício S. Araújo, University of São Paulo, São Paulo, Brazil

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.