- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Westville Campus, Durban, South Africa
Introduction: As agricultural participation continues to shift under the pressures of urbanization, climate change, and evolving socioeconomic conditions, understanding the drivers behind household engagement becomes increasingly vital.
Methods: This study explores these dynamics using household budget survey data, applying decision trees, random forests, and gradient boosting to uncover trends in model performance and variable importance over time.
Results: Our comparative analysis reveals a consistent decline in decision tree accuracy, which reflects the model's limited ability to capture increasingly complex and non-linear relationships in household behavior. In contrast, ensemble learners—random forests and gradient boosting—combine multiple weak learners, typically shallow decision trees, to improve predictive performance. Random forests aggregate predictions through bagging, while gradient boosting builds trees sequentially to correct prior errors. These methods demonstrated superior sensitivity and balanced accuracy in identifying agricultural participants, particularly by 2017–2018, when random forests achieved a notably low out-of-bag error rate for classifying agricultural sales participants. However, early-year specificity remained a challenge. Key predictors evolved from income-dominated variables in 2002–2003 to a more nuanced mix of household size, age, water access, and geographic context by 2017–2018. While all models identified overlapping predictors, ensemble methods were more effective in capturing subtle interactions and demographic shifts. Decision trees, though less accurate overall, provided valuable insights into spatial variation, especially in 2010-2011 when district-level factors were prominent. Rural households consistently showed higher participation rates, with urbanization and regional disparities becoming increasingly influential.
Discussion: These findings highlight the strength of ensemble learning in capturing the complexity of agricultural engagement and underscore the need for adaptive, data-driven policy strategies. The observed shifts in variable importance reflect a changing socioeconomic landscape, calling for targeted interventions that address local realities and emerging challenges such as climate volatility and rural-to-urban migration.
1 Introduction
The rapid growth of the global population has placed the agricultural sector at a critical juncture, thereby presenting significant challenges for food production systems [1]. As the sector grapples with urgent issues such as climate change, resource depletion, and food insecurity, predicting future agricultural participation and its triggers has never been more important. Such predictive insights are vital for policymakers, farming households, and development practitioners, enabling them to devise effective strategies to enhance agricultural productivity and resilience in the face of evolving socio-economic and environmental conditions. Among plausible considerations, it could prove prudent to explore context-specific strategies tailored to local geographic conditions, such as adopting sustainable agricultural practices that account for regional climate variations and available resources. Educational programs aimed at farming households could be among the considerations for enhancing understanding of innovative farming techniques while accommodating the diverse household dynamics affecting food accessibility. Beyond these, strategies focused on increasing crop yields through improved seed varieties and efficient irrigation methods can form priority considerations, alongside optimizing supply chain management. These could potentially contribute to mitigating food shortages and ensuring a more resilient agricultural sector [2–4]. In the broader Sub-Saharan African (SSA) context, agricultural participation remains predominantly driven by smallholder farming, which accounts for nearly 70% of food supply on the continent [5]. However, recent evidence suggests that productivity growth has been slow, averaging only about 1% annually since the 1980s, despite significant investments in agricultural research and extension services [6, 7].
Persistent barriers such as insecure land tenure, limited access to credit, and inadequate irrigation infrastructure continue to constrain household engagement in agriculture [8, 9]. Climate change further exacerbates these challenges, with erratic rainfall, prolonged droughts, and floods threatening crop yields and rural livelihoods [10, 11]. These dynamics underscore the vulnerability of farming households and the urgent need for climate-smart agricultural practices, including conservation tillage, agroforestry, and water harvesting systems, which remain under-adopted in many SSA countries [8].
The urgency for such strategies in Lesotho is underscored by an analysis revealing troubling historical trends in household participation in agriculture, which could be indicative of a worsening situation. Evaluated through household budget surveys (HBS) conducted in 2002–2003, 2010–2011, and 2017–2018, the findings indicate that household consumption expenditures on agricultural products have declined across all surveys (Figure 1). This trend could suggest that households may not have consistently relied on agriculture to sustain their livelihoods for the past years but rather, on other means. Similar patterns have been observed in other SSA countries, where rural households increasingly diversify income sources due to low returns from agriculture and high vulnerability to climate shocks [12]. In densely populated areas with limited land availability, such as Malawi and Ethiopia, farming alone rarely guarantees food security or a living income, pushing households toward off-farm employment [12]. These structural constraints—combined with market volatility and policy biases favoring cash crops over food crops—have contributed to declining engagement in subsistence farming across the region [41]. Among many potential lines of inquiry, the following could warrant investigation. For instance, what might be the underlying reasons causing households not to actively engage in agriculture, despite its potential to improve food security and alleviate poverty? Are there any barriers related to access to land, capital, or technology that hinder families from participating? What other factors might be at play? How do shifts in market demand influence the decisions of households to participate in agriculture? What about the general future of agriculture and its drivers in the country?
Figure 1. Participation by possession of agricultural items—Lesotho district dynamics by HBS period (2002–2003 to 2017–2018).
A detailed understanding of these dynamics is critical not only for enhancing agricultural participation but also for devising effective strategies for poverty eradication. While this study examines participation trends across three survey periods (2002–2003, 2010–2011, and 2017–2018), it provides a broad analytical lens without limiting its scope to a single explanatory framework. However, the limited number of time points constrains the ability to establish robust temporal patterns. The notable differences observed in the 2010–2011 period may reflect external influences such as economic shocks, policy shifts, or climate events, which are not explicitly addressed in this analysis. This underscores the need for future research to incorporate additional time points and contextual data for a more comprehensive understanding and validation of observed trends.
Interestingly, while general participation in agriculture in the country reflects a long-run decline, the likelihood of households possessing at least one agricultural item—a component of participation—remains relatively high. During this period, nearly two-thirds of households owned at least one agricultural item, and this was notable in both 2002–2003 and 2017–2018, contrasting with the more varied observations in 2010–2011. In particular, in the 2002–2003 cycle, agricultural participation was relatively high across districts, showing less variability; however, the 2010–2011 period demonstrated an overall decrease in participation, characterized by notable variability, with many districts experiencing declines. Some districts plummeted to participation rates as low as 6%.
This dip could be reflective of a collective challenge faced by farming households, possibly due to economic or environmental stressors that hindered production and resource access. The subsequent period, 2017–2018, reflected a modest resurgence in possession of agricultural items across most districts. In contrast, spending on agricultural items or products remained relatively low, with a slight increase in recent years, while participation in selling agricultural products experienced a sustained decline over the three measurement cycles.
This juxtaposition underscores the complexity of agricultural participation, reflecting the dynamic interplay of various factors that fluctuate over time, differ across geographic space, and respond to specific events. Such variability necessitates a thorough investigation into the motivations behind household involvement in agriculture. In particular, it is crucial to understand how existing inertia may perpetuate current participation and subsequently shape future engagement in agricultural practices. For context, possession within agricultural households was determined using criteria that included essential items such as land availability, cattle ownership, and ownership of plowing implements. These factors are instrumental in defining agricultural assets. On the other hand, participation in agricultural activities—whether through spending on farming inputs or selling agricultural products—appears to be nuanced and influenced by numerous consumer expenditure variables, thereby complicating the analysis.
Traditional modeling techniques often face challenges in capturing these complex relationships due to their inherent assumptions and limitations. Given the complex, highly uncertain nature of these considerations, the literature supports the use of machine learning (ML) to model them more efficiently [13, 14]. This area of research continues to receive increased attention and interest on several fronts, such as crop disease detection, weed detection, yield prediction, and crop recognition [14]. This heightened focus has inspired this paper, which aims to identify plausible future triggers of agricultural participation in Lesotho. Exploring this area is crucial for informing policies and strategies aimed at improving food security and economic resilience, especially in rural communities. At the same time, employing ML techniques can enhance the detection of these triggers, thereby contributing to the development of novel solutions and strategies for addressing challenges within this delicate yet important sector.
2 Toward model prediction: study area, context to the data collection and variable choice
Data for this study were sourced from three independent rounds of the Lesotho Household Budget Survey (HBS): 2002–2003, 2010–2011, and 2017–2018, with sample sizes of n = 5,992; 5,318; and 4,295, respectively. The variation in sample sizes reflects the independent nature of each survey round, where sampling frames were recalculated based on population dynamics and enumeration areas prevailing at the time of data collection. Each round was designed to be nationally representative, employing stratified sampling techniques aligned with demographic and geographic distributions.
The datasets were used to hypothesize household participation in agriculture, conceptualized as arising from household activities that blend producer and consumer behaviors. From a latent variable perspective, the analysis incorporates a range of factors—such as socioeconomic status, resource acquisition, and demographic characteristics—to assess their influence on household participation in agricultural activities. This framework allows for the evaluation of these variables not only to provide a retrospective view of their influence but also to serve as potential predictors of future engagement in agriculture.
Although the surveys share a common structure, minor variations in feature availability were observed across rounds due to evolving questionnaire designs. To ensure comparability, a harmonization process was applied to align core variables across datasets, focusing on household demographics, income, resource access, and agricultural participation markers. Where features differed, equivalent proxies were identified, and categorical variables were standardized through recoding. Continuous variables were normalized to maintain scale consistency across rounds, while at the same time, missing data were handled through systematic exclusion of incomplete records to maintain analytical integrity and avoid introducing artificial variance that could distort participation patterns. Survey weights provided by the HBS were incorporated during model development to account for sampling design and mitigate bias, ensuring population-level representativeness.
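To make the harmonization steps described above concrete, the following minimal sketch illustrates categorical recoding, systematic exclusion of incomplete records, and min–max normalization on a toy table. The column names (residence, income) and codes are hypothetical, not the actual HBS variable names.

```python
import pandas as pd

def harmonize(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Standardize a categorical coding that differs across survey rounds
    # (e.g., "U"/"R" in one round vs. "urban"/"rural" in another).
    df["residence"] = df["residence"].replace({"U": "urban", "R": "rural"})
    # Exclude incomplete records rather than imputing, mirroring the
    # systematic-exclusion strategy described in the text.
    df = df.dropna()
    # Normalize a continuous variable to [0, 1] for scale consistency.
    span = df["income"].max() - df["income"].min()
    df["income"] = (df["income"] - df["income"].min()) / span
    return df

raw = pd.DataFrame({
    "residence": ["U", "rural", "R", None],
    "income": [1200.0, 300.0, 800.0, 500.0],
})
clean = harmonize(raw)
```

Survey weights would enter later, at model fitting, e.g., via per-observation sample weights.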
2.1 Classes: participation and non-participation
The study modeled participation in agricultural activities as a function of several key factors influencing household engagement, where P represents participation in agriculture, determined by three components of agricultural output: output sold, Os; output for own final consumption, Oc; and output possessed, Op, which could be retained for further processing by farming households.
We express this relationship as:

P = f(Os, Oc, Op)
Each outcome variable is defined as a linear combination of its thematic indicators.
For instance, for Output sold (Os):
Let X = (X1, X2, …, Xm) denote the vector of thematic indicators for output sold (e.g., eggs, fish, grain crops).
Then:

Os = Σ_{j=1}^{m} αj Xj
where α = (α1, …, αm) is a vector of weights representing the relative importance or contribution of each indicator (e.g., eggs, fish, grain crops) to the output sold. These weights were computed based on the frequency of households reporting the sale of each item, converted into normalized proportions that sum to one. The assumption is that each indicator contributes linearly and independently to overall output sold.
Regarding Output for Own Final Consumption (Oc):
Let Y = (Y1, Y2, …, Ym) denote the vector of thematic indicators for output consumed by the household.
Then:

Oc = Σ_{j=1}^{m} βj Yj
where β = (β1, …, βm) represents the weights for each thematic indicator under own consumption, computed using the frequency of households reporting each activity and converted into normalized proportions that sum to one. The assumption here is that these indicators collectively influence household consumption decisions in a linear additive manner.
While for Output Possessed (Op), let Z = (Z1, Z2, …, Zm) denote the vector of possession indicators (e.g., land, cattle, plowing implements). Then:

Op = Σ_{j=1}^{m} γj Zj
where γ = (γ1, …, γm) denotes weights for possession indicators, derived from ownership prevalence and scaled to reflect their relative importance in agricultural capacity. The assumption is that possession of these assets directly enhances participation potential.
Substituting the linear forms into the main equation:

P = f(Σ_{j=1}^{m} αj Xj, Σ_{j=1}^{m} βj Yj, Σ_{j=1}^{m} γj Zj)
The weighted sums were computed using:

O = Σ_{j=1}^{m} wj Vj

where V is the vector of thematic indicators (X, Y, or Z), wj is the weight for indicator j, and m is the number of indicators in that group. Weights were normalized so that Σ_{j=1}^{m} wj = 1 within each thematic group.
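The frequency-based weighting and the weighted sum above can be sketched as follows; the item names and household counts are invented for illustration.

```python
# Each indicator's weight is the share of households reporting it,
# normalized so the weights sum to one (hypothetical counts below).
report_counts = {"eggs": 120, "fish": 30, "grain_crops": 450}
total = sum(report_counts.values())
weights = {k: v / total for k, v in report_counts.items()}

# A household's binary indicator vector (1 = household sells the item).
x = {"eggs": 1, "fish": 0, "grain_crops": 1}

# Weighted sum O_s = sum_j w_j X_j for this household.
o_s = sum(weights[k] * x[k] for k in weights)
```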
2.2 Plausible predictors
Additionally, the specification included household characteristics, denoted H, comprising the following: age of the household head, Hage; education level of the household head, Hedu; household size, Hsiz; and sex of the household head, Hsex. The literature confirms that these variables bear significant influence on agricultural participation [15–17]. Other explored plausible predictors included the household geographic setting, Hgeo; household socioeconomic status such as income and economic activity, denoted Heac; and household-related resource acquisition, including ownership of cattle and access to water, denoted Hres.
In line with this model specification, we underscore the pivotal argument raised by Dossa et al. [18], which posits that socioeconomic dynamics significantly influence the household decision to engage in agricultural activities, particularly when considering the type and purpose of agricultural output. This assertion is further corroborated by insights from Yee and Man [19], who emphasize the importance of geographic setting in this context. These variables were also added to the specification, allowing us to extend our function to incorporate them as follows:
P = f(Os, Oc, Op, H, Hgeo, Hres, Heac)

Where:
Hgeo = f(Urban–Rural Residence, District, Ecological Zone)
Hres = f(Access to Water, Access to Electricity, Ownership of Cattle, Land Access)
Heac = f(Employment, Income, Economic Activity, Spending on Farming Inputs)
Each of these predictor groups was modeled as part of the input feature vector in the machine learning framework under the assumption of additive contribution to participation probability. Categorical variables (e.g., urban-rural residence, district) were encoded using one-hot encoding to avoid imposing artificial order, while continuous variables (e.g., income, household size) were normalized to maintain scale consistency. The assumption is that these predictors influence participation independently, and their combined effect is captured through the classification algorithms without manually specified interaction terms, although tree-based models can learn interactions automatically.
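The encoding scheme described above can be sketched with scikit-learn; the feature names and values below are assumptions for demonstration only, not the actual HBS variables.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature table standing in for the HBS predictors.
X = pd.DataFrame({
    "residence": ["urban", "rural", "rural"],  # categorical, no inherent order
    "district": ["Maseru", "Berea", "Maseru"],
    "income": [1200.0, 300.0, 800.0],          # continuous
    "hh_size": [4, 7, 5],
})

pre = ColumnTransformer([
    # One-hot encode categoricals so no artificial ordering is imposed.
    ("cat", OneHotEncoder(), ["residence", "district"]),
    # Standardize continuous features for scale consistency.
    ("num", StandardScaler(), ["income", "hh_size"]),
])
X_enc = pre.fit_transform(X)  # encoded matrix fed to the classifiers
```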
A fundamental aspect of our analysis is the interplay between agricultural output metrics and household characteristics, which is instrumental in assessing the likelihood of participation in agricultural endeavors. Specifically, we examine how various factors—such as socioeconomic status, education level, access to land and resources, and demographic variables—correlate with agricultural output within a predictive modeling framework. These characteristics serve as input features for binary classifiers, enabling segmentation of households based on estimated probabilities of participation rather than deterministic categorization. For example, households with higher educational attainment may exhibit higher predicted probabilities of adopting innovative agricultural techniques, whereas households experiencing economic distress may show lower predicted probabilities despite resource access.
3 Analytical framework
The literature surrounding the predictive ability of ML-based binary classifiers reveals substantial variability. A compelling example is provided in the work of Yee and Man [19], which examines the determinants of employee satisfaction. The authors highlight the advantages of employing Naive Bayes Classifiers (NBC) for predicting outcomes using panel data, with the technique reflecting simplicity, efficiency, and an ability to manage missing and noisy data effectively. The NBC algorithm's reliance on the assumption of feature independence allows it to perform robustly, even when the data exhibits complexities such as multicollinearity or sparsity. However, it is crucial to recognize that the effectiveness of various ML techniques can differ based on the specific attributes of the dataset at hand, the feature dimensionality, and the nature of the problem domain. For example, in the context of crop yield prediction, existing literature highlights a notable preference for algorithms such as Neural Networks (NN), Linear Regression (LR), Random Forest (RF), Decision Trees (DT), Support Vector Machines (SVM), and Gradient Boosting (GB) methods (e.g., XGBoost, LightGBM), as noted by Van Klompenburg et al. [20]. Gradient Boosting algorithms, in particular, are renowned for their high predictive accuracy and flexibility in handling various data complexities, often outperforming other models in structured data scenarios. Each of these techniques offers distinct advantages tailored to specific analytical needs. NNs, for example, are often favored for their ability to capture intricate non-linear relationships within the data, making them suitable for complex agricultural systems. In contrast, RFs are recognized for their robustness against overfitting due to their ensemble approach, which aggregates the predictions of multiple decision trees to enhance overall model reliability [21].
The choice of traditional machine learning over deep learning was driven by the nature of the dataset and the research objectives. Household Budget Survey data are structured, tabular, and relatively small in size (n ≈ 5,000 per round), which limits the advantage of deep learning models that typically require large, high-dimensional datasets to outperform traditional methods [22]. Moreover, interpretability is critical for policy-oriented research; traditional ML models provide clearer insights into feature importance and decision rules, whereas deep learning models often operate as “black boxes,” making it difficult to extract actionable policy recommendations [23]. Additionally, traditional ML algorithms are computationally efficient and well-suited for handling mixed data types (categorical and continuous), which aligns with the characteristics of the HBS dataset.
Given this context, our analysis opted to employ DT and RF as primary methodologies, leveraging their strengths in handling varying feature sets and capturing non-linear relationships effectively. Additionally, we incorporated GB techniques to explore potential improvements in predictive performance, given their proven efficacy in similar classification tasks. To effectively implement these ML techniques, data preparation was essential, and this involved addressing class imbalance issues and ensuring the data was suitable for model training and evaluation.
Specifically, regarding the class imbalance and ensuring equitable representation of participation across survey rounds, a combination of Synthetic Minority Over-sampling Technique (SMOTE) [24] and class weight adjustments was employed. SMOTE was used to synthetically augment the minority class, thereby balancing the class distribution without losing information, while class weights were tuned to penalize misclassification of the minority class more heavily, aligning with best practices outlined by He and Garcia [25] and Fernández et al. [26]. This hybrid approach leverages the strengths of both resampling and algorithm-level weighting to improve model sensitivity to minority classes, which is critical in contexts with skewed participation data.
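In practice this step would use an off-the-shelf implementation such as SMOTE from the imbalanced-learn package; the minimal sketch below shows only the core idea of interpolating synthetic minority samples between nearest neighbors, on invented data, with a single neighbor (k = 1) for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_minority(X_min: np.ndarray, n_new: int) -> np.ndarray:
    """Minimal SMOTE sketch: each synthetic point is a random
    interpolation between a minority sample and its nearest
    minority-class neighbor."""
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distance from sample i to every other minority point.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        d[i] = np.inf
        j = int(np.argmin(d))   # nearest neighbor (k = 1 for brevity)
        lam = rng.random()      # interpolation factor in [0, 1)
        new.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(new)

# Three hypothetical minority-class households in a 2-D feature space.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X_synth = smote_minority(X_min, n_new=4)
```

The complementary algorithm-level weighting corresponds to, for example, setting class_weight="balanced" in scikit-learn classifiers, which penalizes misclassification of the minority class more heavily.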
Finally, for comparison and interpretability, a light-touch analysis of LR was conducted. This provided a baseline assessment and facilitated evaluation of the trade-offs associated with a simpler, more transparent model.
3.1 Decision trees
DTs rank among the most prominent and interpretable classification methods within the realm of supervised learning [27]. One of their key strengths lies in their visualization capabilities, which allow for an intuitive representation of the decision-making process. The technique is structured as a series of nodes, branches, and leaves: each node represents a feature or attribute used to split the data, the branches denote the outcomes of these splits, and the leaves correspond to the final classifications or predicted outcomes. This hierarchical structure not only makes it easy to understand the logic behind the classification process but also facilitates traceability of how specific attributes influence the decision-making outcome. In this study, we leverage this technique to classify household participation in agricultural activities, aiming to pinpoint the key variables that influence household decisions to engage in agriculture. As noted by Feng and Park [28], DTs operate by recursively partitioning the input feature space based on significant attribute values, allowing for the identification of the most impactful factors.
The DT model can be mathematically represented as follows:

P = f(X) = Σ_{i=1}^{n} wi I(X ∈ Ri)

where P is the predicted outcome (i.e., participation in agriculture); f(X) is the function that maps input features X to the predicted classes; wi are the weights (or probabilities) assigned to the categories; and I(X ∈ Ri) is an indicator function that equals 1 if the input X falls within the region Ri defined by a specific split in the tree. Each region Ri corresponds to a leaf node, defined by a specific set of decision rules or splits, and captures households that share similar characteristics based on the partitioning criteria (e.g., income thresholds, land ownership, or resource access).
In general, this model outlines a systematic approach to understanding the dynamics of agricultural participation by combining input features X, comprising demographic variables, socioeconomic variables, geographic setting, and access to resources, to predict the likelihood of participation P. The emerging weights, wi, influence the predicted classes through the function f(X), while the indicator function I(X ∈ Ri) clarifies decision boundaries toward identifying key characteristics and trends among participants vs. non-participants. This methodological consistency provides a robust framework for inferring future trends in agricultural participation, potentially offering valuable insights regarding the sector.
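A toy illustration of this piecewise form: each region Ri is a set of decision rules and wi is the participation rate in that leaf; since the regions partition the feature space, exactly one indicator is active for any household. The thresholds and rates below are invented for illustration, not estimated from the HBS data.

```python
# Leaf regions R_i as (rule, w_i) pairs; a rule returns True when a
# household's features x fall inside that region.
regions = [
    (lambda x: x["income"] < 500 and x["cattle"] >= 1, 0.8),
    (lambda x: x["income"] < 500 and x["cattle"] == 0, 0.4),
    (lambda x: x["income"] >= 500, 0.2),
]

def predict(x: dict) -> float:
    # P = sum_i w_i * I(x in R_i); exactly one term is nonzero.
    return sum(w for rule, w in regions if rule(x))

p = predict({"income": 300, "cattle": 2})  # falls in the first region
```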
To enhance model complexity and predictive power, we extend this formulation by incorporating probabilistic decision boundaries and feature interactions:

P(Y = 1 | X) = Σ_{i=1}^{n} πi(X) · P(Y = 1 | X ∈ Ri)
Here,
• πi(X) is a soft assignment function (e.g., based on kernel density or logistic weighting) that reflects the degree to which X belongs to region Ri;
• P(Y = 1 | X ∈ Ri) is the conditional probability of participation given that X falls within region Ri.
This probabilistic extension allows for smoother decision boundaries and better handling of uncertainty, especially in cases where feature distributions overlap significantly. In our application, the input feature vector X comprises demographic characteristics, socioeconomic indicators, geographic context, and access to agricultural resources. The model systematically integrates these variables to predict the likelihood of household participation in agriculture. The resulting weights wi and probabilities P(Y = 1 | X∈ Ri) provide insights into the relative importance of each feature and region, enabling the identification of key patterns and trends among participants and non-participants.
3.2 Random forest
The RF algorithm, another ML technique, stands out as an optimal choice for this study due to its robustness and versatility in managing complex, high-dimensional datasets commonly encountered in agriculture. One of the key advantages of this technique is its applicability across a wide range of predictive problems, requiring only a minimal number of parameters for tuning, which contributes to its reputation as a user-friendly ML technique [29]. As noted by Breiman [30], the RF algorithm effectively handles categorical variables with a multitude of values, a valuable feature for our datasets, which include categorical variables such as district, with 10 distinct values, and zone, with four. Building on this foundation, our approach aligns with the literature [see Gegiuc et al. [31]] while contextualizing the RF model within the specific framework of agricultural participation analysis.
In each iteration from d = 1, 2,…, D, where D is the total number of trees, we first create a bootstrap sample Z(d) of size N from the training dataset. Next, we construct an RF tree Td using the bootstrapped data by recursively performing the following steps at each terminal node until the minimum node size nmin is reached:
(i) randomly select m variables from p where p denotes the total number of predictor variables (features) in the dataset, including agricultural outputs and household characteristics;
(ii) identify the optimal feature and corresponding split point based on impurity reduction (e.g., Gini index or entropy);
(iii) split the node into two child nodes. We then compile the ensemble of trees {Td}, d = 1, …, D.
For predicting a new data point x, we employ the RF for classification purposes, where the prediction from the dth tree is denoted as Ĉd(x). Consequently, the aggregated prediction is determined by the majority vote across all D trees: ĈD,rf(x) = majority vote {Ĉd(x)}, d = 1, …, D. This ensemble-based approach enhances generalization by reducing variance and mitigating overfitting, especially in heterogeneous datasets. Moreover, RF provides internal metrics such as feature importance scores, which are instrumental in identifying the latent drivers of agricultural participation.
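The bagging-and-majority-vote procedure above can be sketched with scikit-learn's RandomForestClassifier, including the out-of-bag (OOB) error estimate referred to elsewhere in the paper; the synthetic data below merely stands in for the HBS features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the household features and participation label.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

rf = RandomForestClassifier(
    n_estimators=200,      # D trees, each fit on a bootstrap sample Z(d)
    max_features="sqrt",   # m randomly chosen candidate variables per split
    oob_score=True,        # internal out-of-bag accuracy estimate
    random_state=0,
)
rf.fit(X, y)

oob_error = 1.0 - rf.oob_score_          # OOB error rate
importances = rf.feature_importances_    # impurity-based importance scores
```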
3.3 Gradient boosting
The GB comes in as a powerful ensemble learning technique that constructs predictive models through an iterative, stage-wise process, optimizing a specified loss function by combining weak learners—typically decision trees—into a strong, cohesive predictor [32]. Its core principle lies in sequentially fitting new models to the residual errors of prior models, thereby minimizing overall prediction error through gradient descent optimization in function space. This approach enables GB to capture complex, non-linear relationships that simpler models often overlook.
At each iteration m, GB adds a new base learner hm(x) to the existing ensemble Fm−1(x), producing an improved composite model:

Fm(x) = Fm−1(x) + ν hm(x)

where ν is the learning rate controlling the contribution of each new learner. The new base learner approximates the negative gradient of the loss function L with respect to the current predictions:

hm(x) ≈ −[∂L(y, F(x))/∂F(x)], evaluated at F = Fm−1
This flexibility allows GB to optimize a wide array of loss functions, including logistic loss for binary classification, making it particularly suitable for modeling household participation in agriculture. In this context, GB effectively models interactions among demographic, socioeconomic, and geographic variables, among others in the model, refining predictions by focusing on difficult-to-classify cases.
The predictive function can be expressed as:

P = σ(FM(x)) = σ(Σ_{m=1}^{M} ν hm(x))

where P is the predicted probability of participation, σ(·) is the sigmoid function for binary outcomes, and ν hm(x) represents the contribution of each boosting iteration.
In summary, GB offers practical advantages, including support for diverse loss functions, feature importance metrics for interpretability, and partial dependence plots for visualizing marginal effects—critical for policy-oriented analysis [33]. Its robustness to multicollinearity and ability to handle heterogeneous data make it well-suited for complex socioeconomic datasets. However, challenges such as class imbalance and skewed participation require mitigation; in this study, sample weighting and resampling techniques (under- or over-sampling) were applied to reduce bias and ensure balanced representation [25], thereby improving prediction reliability and fairness across household profiles.
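The stage-wise update for logistic loss can be sketched directly: each new learner hm is a shallow regression tree fit to the negative gradient y − σ(F), and the ensemble is updated as Fm = Fm−1 + ν hm. The data and hyperparameters below are illustrative only, not the study's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the household features and participation label.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

v, M = 0.1, 50                 # learning rate (nu) and boosting rounds
F = np.zeros(len(y))           # F_0: start from zero log-odds
for m in range(M):
    residual = y - sigmoid(F)  # negative gradient of the logistic loss
    h = DecisionTreeRegressor(max_depth=2, random_state=m).fit(X, residual)
    F = F + v * h.predict(X)   # F_m = F_{m-1} + v * h_m(x)

p = sigmoid(F)                 # predicted participation probabilities
acc = ((p > 0.5) == y).mean()  # training accuracy of the boosted model
```

In practice one would use a tuned library implementation (e.g., scikit-learn's GradientBoostingClassifier, XGBoost, or LightGBM) rather than this hand-rolled loop.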
3.4 Logistic regression
To identify potential triggers for future agricultural participation, we initiated our analysis by investigating the retrospective factors influencing household engagement in agriculture. For this purpose, we employed FIRTH logistic regression, with particular emphasis on Penalized Maximum Likelihood Estimation (PMLE) as the core methodological framework.
FIRTH logistic regression proves advantageous in mitigating issues commonly associated with standard maximum likelihood estimation, particularly in contexts involving small or imbalanced datasets [34, 35, 40]—a characteristic present in the three household budget survey datasets utilized in this study. The PMLE is further considered easier to implement and less computationally intensive compared to alternative approaches [36]. The essence of the PMLE lies in its ability to refine the score function by integrating a corrective term designed to offset the first-order bias from the asymptotic expansion of maximum likelihood estimation. This incorporation of a bias-reducing term is particularly beneficial in small sample contexts, where the corrective effect diminishes as the sample size increases [37, 38]. The breakdown of the FIRTH model is outlined in the equations that follow. This method is derived from the Newton–Raphson algorithm, which computes the vector of first derivatives as follows (see Lu 2016):

U*(β) = Σ_{i=1}^{n} [yi − ŷi + qi(1/2 − ŷi)] xi = 0
Where:
• U(β)* is the penalized score function (gradient of the penalized log-likelihood with respect to the parameter vector β);
• β is the vector of regression coefficients for the predictors (e.g., household characteristics, agricultural outputs);
• xi is the predictor vector for observation i (e.g., age, education, land ownership);
• yi is the observed binary outcome for observation i (1 = participates in agriculture, 0 = does not participate);
• ŷi is the predicted probability of participation for observation i;
• qi is the ith diagonal element of the “hat” matrix Q, which adjusts for bias.
The hat matrix Q is defined as:
Q = W^(1/2) X (XᵀWX)^(−1) Xᵀ W^(1/2)
Where:
• X is the design matrix of predictors, and
• W is the diagonal matrix of weights, with ith diagonal element wi = ŷi(1 − ŷi).
The correction acts as a shrinkage quantity designed to remove bias by adding one-half of the logarithm of the determinant of the Fisher information matrix to the log-likelihood, while simultaneously adjusting the score function [39]. As the log-likelihood approaches zero over successive iterations, this adjustment counteracts bias in parameter estimation, particularly in cases with small sample sizes or rare events.
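Under the definitions above, the penalized-score iteration can be sketched in Python with NumPy. This is an illustrative translation of the equations, not the code used in the study; the function name `firth_logit` and its defaults are assumptions for the sketch.

```python
import numpy as np

def firth_logit(X, y, n_iter=50, tol=1e-8):
    # Newton-Raphson on Firth's penalized score (illustrative sketch).
    # X must include an intercept column.
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))          # predicted probabilities
        w = p * (1.0 - p)                            # diagonal of W
        xtwx_inv = np.linalg.inv(X.T @ (w[:, None] * X))
        # q_i: i-th diagonal element of the hat matrix Q
        q = w * np.einsum("ij,jk,ik->i", X, xtwx_inv, X)
        # penalized score: U(beta)* = sum_i (y_i - p_i + q_i(1/2 - p_i)) x_i
        score = X.T @ (y - p + q * (0.5 - p))
        step = xtwx_inv @ score                      # Newton-Raphson update
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

A useful property, consistent with the motivation given above, is that the estimates remain finite even under complete separation, where ordinary maximum likelihood diverges.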
4 Results
The study employed four analytical approaches to identify plausible predictors or triggers of agricultural participation among households. The PMLE method provided a retrospective analysis, highlighting historical factors that have influenced household participation in agriculture. To deepen our understanding of these predictors and enable predictive modeling, the study further utilized DT, GB, and RF algorithms to assess the feature importance of all variables included in the model specification. This approach not only clarified the relationships between predictors and participation but also facilitated the prediction of plausible future household engagement in agricultural activities based on these identified factors.
4.1 Retrospective view of triggers for agricultural participation: insights from regression modeling
Using the PMLE, the analysis identified key factors that exhibit inertia, indicating their potential significance in future agricultural programming and policy formulation. The conclusions presented here were derived using Firth logistic regression, which adjusts for small sample bias and imbalanced data by applying PMLE. Statistical significance was assessed at α = 0.05, and p-values were used to determine whether each predictor had a meaningful association with household participation in agriculture. This suggests that future programs could benefit from the consistency of these factors while also accommodating the potential emergence of new predictors, including an updated set of confounding variables that may vary over time. In particular, the analysis identified several statistically significant variables, comprising geographic location, household size, sex, age and marital status of the household head, as well as household income and access to water resources; see Table 1. The possession of agricultural items was used for comparison purposes, given its substantial weight relative to participation measured through spending and selling agricultural products, with over two-thirds of the households demonstrating possession. The analysis in Table 1 presents p-values for each predictor across the three survey rounds (2002–2003, 2010–2011, and 2017–2018). A p-value below 0.05 indicates that the variable is statistically significant in influencing participation for that survey year. For example, household size and age consistently show strong significance (p = 0.0000), suggesting they are persistent drivers of participation. Conversely, variables such as access to electricity and spending on farming inputs exhibit high p-values (>0.7), indicating weak or no association with participation. Missing values in the table mean these variables were not collected in those HBS survey rounds, reflecting design limitations rather than lack of effect. 
The consistency of this process provides valuable insights into prior probabilities, which can be instrumental in making informed predictions about future participation in agriculture. This foundational understanding establishes a basis for applying ML techniques. While DT, RF, and similar algorithms do not explicitly incorporate prior probabilities and instead focus on feature values during training, the consistency of specific influential variables can guide their predictions.
Table 1. P-values (Sig., for α = 0.05) from Firth regression for household participation in agriculture based on possession of agricultural items.
4.2 Prospective view of triggers for agricultural participation: insights from ML
To ensure reliability and reduce overfitting, the study applied a consistent data-splitting strategy across all machine learning models—DT, RF, and GB—using a 75% training and 25% testing split. This approach provides sufficient data for model learning while preserving an independent test set for unbiased performance evaluation. Maintaining the same split ratio across algorithms enhances comparability of results and supports robust validation of predictive accuracy. The training data set served as a foundation for model development, allowing for the adjustment of learning parameters, while the test segment provided a critical framework for evaluating model performance on unseen data. Based on the DT, as illustrated in Figure 2a, the output summary indicates how in 2002–2003 the tree classifies households based on various predictors such as urban and rural classification, age of the head of the household, the household size, district where the household is located, and the main activity of the household head. Each node in the tree presents a predicted class (e.g., participation or non-participation) along with a confidence level, typically shown as a probability value (e.g., yprob = 0.75), which reflects the proportion of households in that node belonging to the predicted class. The tree reveals that rural households are generally more likely to participate in agriculture than their urban counterparts. Younger individuals (under 41 years) show a significant tendency not to participate, whereas older individuals have higher participation rates. Household size plays a crucial role, with larger households among older individuals indicating an increased likelihood of agricultural involvement. The tree also emphasizes the importance of geographic location and district affiliations, suggesting that targeted interventions could effectively enhance participation among specific demographics, such as older households in rural areas.
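The 75/25 partitioning described above can be sketched as follows. The helper `stratified_split` is illustrative only (the study does not state whether its split was stratified); stratifying by class keeps the participation ratio comparable between the training and test partitions.

```python
import numpy as np

def stratified_split(y, train_frac=0.75, seed=42):
    # Split indices 75/25 while preserving the class ratio in each part
    rng = np.random.default_rng(seed)
    train = []
    for cls in np.unique(y):
        idx = np.flatnonzero(y == cls)
        rng.shuffle(idx)
        train.extend(idx[: int(round(train_frac * len(idx)))])
    train = np.sort(np.array(train))
    test = np.setdiff1d(np.arange(len(y)), train)
    return train, test
```

With 400 non-participants and 100 participants, for example, this yields 375 training and 125 test indices with the 4:1 class ratio preserved in both.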
This raises a policy consideration: should programs reinforce participation among groups already inclined to engage, or focus on those less likely to participate—such as younger or urban households—to broaden inclusion and equity in agricultural development?
Figure 2. Decision Trees for possessing agricultural items or products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
In 2010–2011—as illustrated in Figure 2b—the root node indicates that approximately 33% of households are likely to participate in agriculture, which is shown by the predicted probabilities (yprob) of 0.6755 for non-participation and 0.3245 for participation. The decision tree reveals several important splits that shed light on key factors influencing agricultural participation. The first significant split is based on the urban and rural variables, where rural households are differentiated from urban households. Among the rural population, the tree further segments the households by district. The analysis shows that households in districts with lower designation codes exhibit higher probabilities of participating in agriculture (and these are mostly located in lowland areas with arable land). In contrast, those from urban settings have a lower likelihood of participation, as indicated by the loss and predicted values at various nodes (i.e., urbanization may reduce engagement in agriculture). For instance, among urban households, a significant split occurs at district coded < 5, where households show a predilection toward non-participation. For rural households with district coded < 6, the probabilities hint toward a higher likelihood of participation, particularly those younger than 38 years. Geographic location, age, and district categorization remain key factors. Again, this highlights a strategic policy question: should interventions be concentrated in high-probability areas to maximize impact, or redirected toward underrepresented groups to foster inclusive growth?
At the same time, the DT output for 2017–2018 reveals several important insights based on demographic and household characteristics (Figure 2c). At the root node, the overall probability of participation is approximately 30.3%. The tree first splits based on urban-rural status, with urban households showing a higher likelihood (53.2% probability of not participating) than rural households at 18.6% probability of not participating in agriculture. For rural households, further segmentation occurs with age as a key factor: those younger than 41 years predominantly display non-participation, while older individuals show a higher likelihood of engaging in agriculture (61.3% probability of participation). Among older households, household size again plays a critical role; smaller households (size < 2) show a 42.0% probability of participation, while larger households (size ≥ 2) show a slightly lower probability at 33.8%. This suggests that while older age increases the likelihood of participation, household size may moderate this effect, with smaller households being marginally more engaged than larger ones in this subgroup.
Urban households, particularly those with older individuals (49+), show minimal engagement (11.3%). These findings suggest that while certain demographic groups are consistently more engaged, policy design must weigh whether to reinforce these trends or actively support less-engaged populations to ensure broader agricultural inclusion. A potential confounding variable, education level, significantly influences agricultural engagement among younger urban households: those with lower educational attainment participate at a higher rate than their more educated peers (67.2% probability of participation).
Further to the analysis of household participation in agriculture based on spending on agricultural items (see Figure 3a), the root node indicates that nearly half of the households engaged in agricultural spending. The first split reveals a critical threshold at an income level of M3,061. Households with an income greater than or equal to this amount show a low likelihood of participating in agriculture, as evidenced by 85.84% confidence that they do not spend on agricultural items. Households with an income below M3,061 exhibit complete participation in agricultural spending (100% probability). This analysis highlights an inverse relationship between income level and agricultural spending participation, suggesting that lower-income households are more inclined to invest in agricultural items, which could serve as a means of production or as a way to supplement their livelihoods due to their low purchasing power. In contrast, higher-income households tend to abstain from such expenditures, as they can afford to rely on alternative sources for their needs. In 2010–2011, the DT analysis of spending on agricultural items reveals significant insights regarding participation in agriculture, classified as class 1 (see Figure 3b). The model begins with a nearly even probability distribution at the root node, where the likelihood of not participating (class 0) is slightly greater than participating (class 1). The initial split based on the “district” variable identifies a crucial threshold of below 2, effectively distinguishing between two primary branches.
Figure 3. Decision Trees for spending on agricultural items or products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
The left branch, containing 108 instances, shows a strong bias toward class 0, with a 90.7% probability, indicating that individuals in this category, particularly those in districts below the threshold, are less likely to invest in agricultural spending. A further division based on the “zone” variable highlights that while this group has a terminal node with 95% confidence in class 0, there remains a small subset where class 1 is present, albeit with less certainty. In contrast, the right branch, which includes districts with values above or equal to 2, exhibits a striking predominance of class 1, with a 93% probability, indicating that higher spending in certain districts is strongly associated with participation in agriculture. An emerging theme from these findings is the importance of “district” and “zone” as key factors influencing agricultural spending decisions, suggesting that lower values in these variables correlate with a reduced likelihood of participation, while higher values tend to facilitate greater involvement in agricultural expenditure. Extending the same analysis for 2017–2018 (see Figure 3c), the DT provides insight into household participation in agriculture based on household size. Households at or above the 5.5-member split (node 2) demonstrate a significant loss of 102 with a y-value of 0, indicating that these larger households do not participate in agricultural spending (probability of participation is only 21.6%). In contrast, households with fewer than 5.5 members (node 3) show complete participation, with a y-value of 1 and a probability of 100% for engaging in agricultural spending. This stark division suggests that smaller households were more inclined to allocate resources toward agricultural items in 2017–2018.
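Threshold splits such as the M3,061 income cut in Figure 3a are found by scanning candidate cut-points and keeping the one that most reduces Gini impurity. A minimal sketch of that search follows; the data are illustrative, not the survey values.

```python
import numpy as np

def gini(labels):
    # Gini impurity: 1 minus the sum of squared class proportions
    p = np.bincount(labels) / len(labels)
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    # Scan midpoints between consecutive sorted values of one predictor
    # and keep the threshold with the largest impurity reduction
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    n = len(y)
    parent = gini(ys)
    best_thr, best_gain = None, 0.0
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue
        child = (i * gini(ys[:i]) + (n - i) * gini(ys[i:])) / n
        if parent - child > best_gain:
            best_gain = parent - child
            best_thr = (xs[i] + xs[i - 1]) / 2.0
    return best_thr, best_gain
```

On toy data where lower-income households spend on agriculture and higher-income ones do not, the routine recovers the midpoint between the two income groups as the optimal threshold, mirroring how the DT arrived at its income cut.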
In terms of household participation in agriculture through the selling of agricultural products—in 2002–2003—the initial split was based on the household income at a threshold of M2,129 (Figure 4a). Households with an income equal to or exceeding this threshold (node 2) exhibit a lower participation rate in selling agricultural products, as indicated by a loss of 82 and a y-value of 0 (21.5% probability of participation). In stark contrast, those with incomes below this threshold (node 3) demonstrate complete participation, achieving a y-value of 1 and a probability of 100% for actively selling agricultural products. This analysis highlights a relationship between lower income levels and increased involvement in agricultural sales, suggesting that economic constraints may drive households to engage more actively in selling agricultural outputs as a means of subsistence or additional income.
Figure 4. Decision Trees for selling agricultural items or products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
In 2010–2011, as illustrated in Figure 4b, the DT analysis revealed complex interdependencies based on geographic and demographic characteristics. Comprising a total of 419 households following resampling, the model begins with a split at the district level, specifically identifying districts with a value code less than 3 (node 2), where participation is more likely, with a participation probability of 27.3%. The two districts involved share boundaries (Butha-Buthe and Leribe), suggesting potential homogeneity in household practices. Further divisions underscore the significance of locality and household attributes, with households in districts with lower codes (i.e., those located largely in the lowlands side of the country) exhibiting heightened engagement, especially those in zone categories. For example, within the first district category, households in zone values less than 3 display a very low participation probability (9.1%) compared to small subsets in zones 3 and above. Meanwhile, in the urban-rural split, households with fewer than 3.5 members also show a significant drop in participation, while larger households tend to have higher engagement, particularly when older (age over 65.5 years) members are involved. On the other hand, households in districts with values of 3 or higher are more active in selling agricultural products. Overall, the findings suggest that geographic location, household size, and age play vital roles in influencing the propensity of households to engage in selling agricultural products, highlighting the multifaceted dynamics that drive agricultural market participation. In the same vein, in 2017–2018, the analysis revealed a relationship between household size and market engagement—see Figure 4c. At the root node, there is a balanced starting point with equal probabilities of households participating or not in agricultural sales (50% each).
However, a decisive split occurs based on household size at the threshold of 8.5 members. Households with 8.5 or more members (node 2) show a very low probability of participation (only 3.0%), indicating that larger households may be less likely to actively sell agricultural products.
Conversely, households with fewer than 8.5 members (node 3) exhibit a complete rate of participation (100%), suggesting that during this period, smaller households were more engaged in agricultural sales. This contrast highlights how household size can significantly influence economic activities in agriculture, with smaller households likely relying more on selling products as a source of income or sustenance, whereas larger households may have differing resource dynamics or obligations that limit their engagement in such markets. In the context of 2017–2018, several economic dynamics may have played a pivotal role in shaping these participation patterns. For instance, fluctuations in agricultural commodity prices during this period could have incentivised smaller households to engage more actively in selling their products to capitalize on favorable market conditions. Additionally, in-country programs aimed at promoting alternative livelihood strategies should not be overlooked as possible confounders. The effects of climate variability and droughts prevalent in some agricultural regions could have created a sense of urgency among smaller household producers to monetise their harvests quickly, leading to increased market participation.
Next, using the models derived from DT and RF, the study predicts outcomes based on the training and the test data sets. As shown in Table 2, from the training data set, the RF model utilizing 500 trees and considering four variables at each split indicates a classification task from the perspective of agricultural item possession in 2002–2003. The Out-of-Bag (OOB) error rate is estimated at 25.08%, suggesting that nearly a quarter of the predictions could be inaccurate when validated on unseen data, reflecting a moderate model generalization level. An examination of the confusion matrix reveals that 702 instances of class 0 were correctly classified, while 767 were misclassified as class 1, resulting in a high error rate of 52.21% for class 0, indicating persistent challenges in accurately identifying non-participants. In contrast, the model performed considerably better for class 1, correctly classifying 2,665 instances and misclassifying 360 as class 0, leading to a lower error rate of 11.90%.
Table 2. Model output for classification—Possession of agric items in 2002–2003: true values from training data.
This discrepancy suggests that the model continues to exhibit a bias toward accurately identifying participants, highlighting the need for enhanced class balance strategies, refined feature selection, or alternative ensemble techniques such as gradient boosting, especially since class weights have already been considered.
Further to the examination, the error rate plot reveals a consistent error rate across the increasing number of trees (Figure 5a). The stability of the overall OOB error rate—represented by the black line—suggests that adding more trees beyond 500 does not significantly enhance the model's performance, indicating a plateau in predictive capability. The red and green lines reflect class-specific error rates, typically corresponding to participation and non-participation respectively, and help illustrate how the model distinguishes between these outcomes. This consistency across all lines reinforces the reliability of the model in generalizing to unseen data, while also confirming that the chosen number of trees achieves a balanced trade-off between complexity and accuracy. When comparing the confusion matrix outputs of the DT, RF, and GB models on the test dataset (as illustrated in Table 3), the study observes key differences in their classification performance and resultant metrics. The DT model misclassifies a substantial number of instances, with 102 false positives (FP) and 293 false negatives (FN), resulting in an overall accuracy of 73.63%. The RF model improves upon this, yielding 113 false positives and 222 false negatives, which contributes to a higher accuracy of 77.64%. This improvement is reflected in the RF's confusion matrix, where it correctly predicts 253 true negatives (TN) and 910 true positives (TP), compared to DT's 199 TN and 904 TP. The GB model, while slightly lower in overall accuracy at 74.37%, shows a different trade-off: 309 TN and 805 TP, with 201 FP and 183 FN. These results translate into better sensitivity for RF (0.5326) and GB (0.6280) compared to DT (0.4045), indicating that both ensemble methods are more effective at identifying positive instances. However, DT maintains the highest specificity (0.8986), followed by RF (0.8895) and GB (0.8002), suggesting that DT is more conservative and less prone to false positives.
Figure 5. Error Rates for possessing agricultural items across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
Table 3. Confusion matrix and statistics—Possession of agric items or products as per HBS 2002–2003: predicted values from test data.
The Kappa statistic, which measures agreement between predicted and actual classifications beyond chance, further supports the superiority of RF (0.4498) and GB (0.4243) over DT (0.3364). This indicates moderate agreement for RF and GB, with RF showing the strongest consistency. Balanced accuracy, which accounts for both sensitivity and specificity, is highest for GB (0.7141), closely followed by RF (0.7111) and DT (0.6515). This suggests that GB and RF perform more evenly across both classes, making them more reliable in imbalanced classification scenarios.
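The Table 3 statistics for the DT can be reproduced directly from its confusion-matrix cells. Note one labeling subtlety: the reported sensitivity of 0.4045 implies that the class with 492 actual cases (199 classified correctly) is treated as "positive" (caret's first-factor-level convention), so the cells are regrouped accordingly here.

```python
# DT cell counts for the 2002-2003 test run, with the 492-case class
# taken as "positive" so the computed metrics match the reported ones
tp, fn = 199, 293        # actual positives: correctly classified / missed
tn, fp = 904, 102        # actual negatives: correctly classified / missed
n = tp + fn + tn + fp

accuracy = (tp + tn) / n
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (sensitivity + specificity) / 2

# Cohen's kappa: observed agreement corrected for chance agreement
p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
kappa = (accuracy - p_chance) / (1 - p_chance)
```

Running this recovers the accuracy (0.7363), sensitivity (0.4045), specificity (0.8986), balanced accuracy (0.6515), and Kappa (0.3364) reported for the DT.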
Statistical significance is confirmed through the p-Value [Acc > NIR], which tests whether the model's accuracy is significantly better than the No Information Rate. All models show highly significant results: DT (p = 3.199e-08), RF (p = 6.7223e-16), and GB (p = 7.75e-10). RF's extremely low p-value indicates the strongest evidence that its performance is not due to chance. McNemar's Test p-Value, which assesses the symmetry of classification errors, shows significant disagreement for DT (p < 2.2e-16) and RF (p = 3.620e-09), suggesting systematic differences in error types. GB's p-value (0.3857) is not statistically significant, implying its errors are more evenly distributed. In summary, RF demonstrates the best overall performance in terms of accuracy, Kappa, and statistical significance, while GB excels in sensitivity and balanced accuracy. DT, although more specific, lags behind in identifying positive cases. These findings illustrate the advantages of ensemble methods over single-tree models, with RF offering a balanced and statistically robust approach, and GB providing heightened sensitivity for detecting positive instances. Regarding participation in terms of possession, as per the 2010–2011 data set, the output from the RF model indicates that it is a classification run with 500 trees and a consideration of three variables at each split (Table 4). The model estimates an OOB error rate of 24.14%, suggesting that approximately a quarter of predictions may be incorrect when evaluated on unseen or test data (and this could include future household budget survey cycles with ceteris paribus in terms of the survey design, and so on).
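The p-Value [Acc > NIR] is a one-sided exact binomial test of the observed accuracy against the No Information Rate. The sketch below reimplements it for the 2002–2003 DT, assuming (from the confusion matrix) 1,103 correct predictions out of 1,498 and an NIR of 1006/1498, the majority-class proportion; these counts are inferred, not stated explicitly in the source.

```python
import math

def binom_p_greater(k, n, p0):
    # Exact one-sided binomial tail P(X >= k) under H0: accuracy = NIR,
    # evaluated in log space to avoid underflow for large n
    logs = [
        math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
        + x * math.log(p0) + (n - x) * math.log(1.0 - p0)
        for x in range(k, n + 1)
    ]
    m = max(logs)
    return math.exp(m) * sum(math.exp(t - m) for t in logs)

# 2002-2003 DT: 1,103 of 1,498 test households classified correctly
p_val = binom_p_greater(1103, 1498, 1006 / 1498)
```

The resulting tail probability lands in the same order of magnitude as the reported DT value (p = 3.199e-08), confirming that the accuracy gain over the NIR is far beyond chance.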
Table 4. Model output for classification—Possession of agric items in 2010–2011: true values from training data.
The confusion matrix reveals specific performance metrics: the model correctly classified 2,355 instances of class 0 but misclassified 346 instances as class 1, resulting in a class error rate of approximately 12.81% for class 0.
In contrast, for class 1, it correctly identified 729 instances but misclassified 558 as class 0, yielding a higher class error rate of about 43.36%. To address this imbalance, SMOTE (Synthetic Minority Over-sampling Technique) was applied to generate synthetic examples of the minority class, and class weighting was incorporated to penalize misclassification of underrepresented instances. These techniques aimed to improve the model's sensitivity and overall balance between precision and recall. Despite these enhancements, the model still shows stronger performance in identifying non-participants (class 0) than participants (class 1), though the gap has narrowed compared to unbalanced training. This suggests that while SMOTE and weighting improved minority class representation, further tuning or alternative ensemble methods may be needed to optimize classification of agricultural participants.
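The core of SMOTE, which the study applied here, is interpolation between a minority-class observation and one of its nearest minority-class neighbours. The sketch below is a minimal illustration of that step, not the implementation used in the study; `smote` and its defaults are assumptions for the sketch.

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    # Minimal SMOTE: each synthetic point is a random interpolation
    # between a minority observation and one of its k nearest
    # minority-class neighbours
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dist = np.linalg.norm(X_min - X_min[i], axis=1)
        j = rng.choice(np.argsort(dist)[1 : k + 1])  # skip the point itself
        lam = rng.random()                           # interpolation weight
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)
```

Because each synthetic point lies on a segment between two real minority observations, the generated examples stay inside the feature ranges of the minority class rather than duplicating existing rows, which is what improves the model's exposure to underrepresented participants.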
The visualization of the OOB error illustrates a consistent error rate that remains stable despite an increasing number of trees (Figure 5b). This stability reinforces the model's reliability and suggests that the chosen number of trees is sufficient to prevent overfitting while maintaining generalization performance. In the broader context of evaluating classifier performance on the possession of agricultural items, the subsequent comparison of DT, RF, and GB models (Table 5) reveals how balancing techniques like SMOTE and weighting contribute to improved fairness, robustness and predictive consistency across models.
Table 5. Confusion Matrix and Statistics—Possession of agric items or products as per HBS 2010–2011: predicted values from test data.
The Table further reveals distinct performance characteristics for each classifier. The DT model achieves an accuracy of 75.56%, while RF improves slightly to 76.77%, and GB follows closely at 76.09%. All models perform significantly better than the NIR, with p-values well below 0.001—DT (p = 8.36e-12), RF (p < 2.34e-16), and GB (p = 6.65e-11)—indicating that their predictive performance is not due to chance and is statistically significant. Sensitivity scores are high for DT (84.98%) and RF (86.89%), reflecting strong performance in identifying agricultural participants. GB, while slightly lower in sensitivity (76.80%), compensates with a markedly higher specificity of 74.59%, compared to 56.39% for DT and 56.63% for RF. This suggests that GB is more effective at correctly identifying non-participants, offering a better balance between the two classes. The Balanced Accuracy metric further supports this, with GB achieving the highest score at 75.69%, followed by RF (71.76%) and DT (70.69%). The Kappa statistic, which measures agreement between predicted and actual classifications beyond chance, is also highest for GB (0.4831), compared to RF (0.4548) and DT (0.4279), indicating stronger model reliability and consistency. Additionally, GB records the highest Positive Predictive Value (PPV) at 86.55%, reinforcing its strength in predicting actual participants. Despite RF's marginally higher sensitivity, GB's superior specificity, balanced accuracy, and predictive value—combined with its statistically significant performance—suggest that it offers a more equitable and generalisable classification outcome.
These results underscore the value of incorporating gradient boosting as an alternative ensemble technique, particularly when class weights and synthetic sampling have already been applied. Overall, while all three models are effective, GB emerges as the most balanced and statistically robust option for assessing agricultural participation based on possession data.
Regarding the 2017–2018 data set, the RF model output indicates a classification task with 500 trees, considering three variables at each split (Table 6). The OOB error rate is estimated at 21.77%, suggesting that the model exhibits reasonable predictive capability, with less than a quarter of predictions expected to be incorrect on unseen data. Analyzing the confusion matrix shows that the model correctly classified 2,043 instances of class 1 (participants), while misclassifying 203 instances as class 0 (non-participants), leading to a low class error rate of approximately 9.04% for participants.
Table 6. Model output for classification—Possession of agric items in 2017–2018: true values from training data.
However, the model struggled with class 0, correctly identifying 455 instances but misclassifying 520 as class 1, resulting in a high class error rate of around 53.33%. This contrast in performance between the two classes highlights the RF model's challenge in accurately categorizing non-participants.
The imbalance in classification accuracy suggests a bias toward predicting participation, which may stem from underlying class distribution or feature dominance. The OOB error rate plot (Figure 5c) reveals important insights into model performance based on the number of trees in the ensemble. With an OOB estimate of 21.77% at 500 trees, the plot indicates that the error rate stabilizes around this point, suggesting that adding more trees beyond 500 does not significantly improve the predictive accuracy of the model. The error rate exhibits a long-run plateau, indicating diminishing returns from additional trees and suggesting that the model has attained an appropriate level of complexity. This stability is vital for model efficiency, as it identifies the point beyond which further computation can be avoided without compromising performance. Consequently, this finding affirms that the current setup is adequate for producing reliable classification results.
Based on the 2017–2018 data set (see Table 7), all three models demonstrate statistically significant predictive performance, with accuracy scores exceeding the NIR of 66.85% and p-values well below 0.001—DT (p = 5.585e-13), RF (p = 3.369e-13), and GB (p = 1.935e-05)—indicating that their classification results are unlikely to be due to random chance. Among the models, RF achieves the highest accuracy at 76.91%, narrowly outperforming DT (76.82%) and GB (72.72%). However, GB demonstrates superior sensitivity at 59.55%, compared to RF (50.84%) and DT (48.88%), suggesting that GB is more effective at identifying true positives, i.e., households possessing agricultural items.
Table 7. Confusion Matrix and Statistics—Possession of agric items or products as per HBS 2017–2018: predicted values from test data.
Conversely, DT and RF outperform GB in specificity, with DT at 90.67%, RF at 89.83%, and GB trailing at 79.25%, indicating that DT and RF are better at correctly identifying non-participants. In terms of PPV, which reflects the proportion of predicted positives that are actual positives, DT leads with 72.20%, followed by RF (71.26%) and GB (58.73%). GB, however, records the highest Negative Predictive Value at 79.80%, suggesting stronger reliability in predicting non-participants.
Balanced Accuracy, which averages sensitivity and specificity, is highest for RF (70.34%), followed closely by DT (69.77%) and GB (69.40%), indicating relatively consistent performance across both classes. The Kappa statistic, measuring agreement beyond chance, is strongest for RF (0.4384), followed by GB (0.3866) and DT, which shows the weakest agreement, reinforcing RF's overall reliability. Additionally, the McNemar's Test p-values show significant differences in classification errors for DT (p = 5.031e-13) and RF (p = 1.422e-10), while GB's result (p = 0.8152) suggests no significant imbalance in misclassification, further supporting GB's balanced performance. Overall, while RF offers the highest accuracy and agreement, GB demonstrates greater sensitivity and more balanced error distribution, making it a compelling alternative when the priority is to reduce false negatives. These findings underscore the importance of aligning model selection with classification priorities—whether the goal is to maximize overall accuracy or to ensure equitable identification of both participants and non-participants.
Furthermore, the analysis of Mean Decrease Gini (MDG) scores from the RF model reveals evolving patterns in the importance of variables influencing agricultural item possession across the three survey cycles. Notably (see Figure 6), age and income consistently rank among the top predictors, with age scoring 284.93 in 2002–2003 and maintaining high importance through 2010–2011 (278.76) and 2017–2018 (244.26). Income follows a similar trajectory, peaking in 2002 (248.11), dipping in 2010–2011 (135.30), and rising again in 2017–2018 (192.85), underscoring the enduring relevance of demographic and economic factors. Household size also emerges as a key variable, particularly in 2002–2003 (183.88) and 2017–2018 (134.49), suggesting that family composition plays a growing role in determining agricultural possession. Meanwhile, education and urban/rural setting maintain moderate importance across all years, reflecting their steady influence on access to and ownership of agricultural assets.
This trend is further reinforced by the GB model's variable importance output, which also identifies income, age, and urban-rural setting as pivotal predictors of possession in both 2002–2003 and 2017–2018 (Figure 7). In 2010–2011, however, GB highlights district, age, and urban-rural setting as the most influential variables, indicating a temporary shift toward spatial and demographic factors during that period. The convergence between RF and GB in identifying age and urban-rural setting as consistently important across all years suggests a stable relationship between demographic structure and agricultural engagement. At the same time, the divergence in the prominence of district and income across models and years reflects the dynamic nature of agricultural possession determinants, shaped by evolving policy environments, infrastructural access, and socio-economic transitions. These insights underscore the value of multi-model analysis in capturing nuanced shifts in variable importance and guiding more targeted, context-sensitive interventions.
Figure 7. Important Variables [GB]—Possession of Agric items/products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
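Importance rankings of the kind shown in Figures 6, 7 can be reproduced in outline. The sketch below is an analogue, not the study's pipeline: scikit-learn's impurity-based `feature_importances_` corresponds to the Mean Decrease Gini reported by R's randomForest, and the same attribute exists for gradient boosting. The data are synthetic and the feature names are placeholders for the HBS variables.

```python
# Sketch: impurity-based variable importance for RF and GB, the analogue of
# the Mean Decrease Gini scores in Figures 6-7. Data are synthetic and the
# feature names are placeholders, not the actual HBS variables.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

X, y = make_classification(n_samples=600, n_features=6, n_informative=3,
                           random_state=0)
features = ["age", "income", "hh_size", "education", "district", "urban_rural"]

# RF: 500 trees, 4 candidate variables per split (the mtry-style setting
# reported for the possession models).
rf = RandomForestClassifier(n_estimators=500, max_features=4,
                            random_state=0).fit(X, y)
# GB: shallow trees fitted sequentially, each correcting prior errors.
gb = GradientBoostingClassifier(n_estimators=200, max_depth=3,
                                random_state=0).fit(X, y)

for name, rf_imp, gb_imp in sorted(zip(features, rf.feature_importances_,
                                       gb.feature_importances_),
                                   key=lambda t: t[1], reverse=True):
    print(f"{name:12s} RF={rf_imp:.3f}  GB={gb_imp:.3f}")
```

Comparing the two columns mirrors the multi-model reading in the text: predictors that rank highly under both learners are more robust signals than those favored by only one.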
In the context of classifying agricultural participation through the sale of agricultural products, the RF analysis on the 2002–2003 data set demonstrates strong predictive performance (Table 8). The model was constructed using 500 trees, with four variables considered at each split, and achieved an OOB error rate of 9.11%, indicating that fewer than one in 10 predictions are expected to be incorrect when applied to unseen data—an encouraging sign of the model's generalization capability. The confusion matrix offers further insight into class-level performance. Among the non-participant group (class 0), the model correctly classified 580 instances, with only 12 misclassified as participants, resulting in a very low class error rate of 2.03%. This highlights the model's high specificity and precision in identifying non-participants. Conversely, for the participant group (class 1), the model correctly identified 516 instances, but misclassified 99 as non-participants, leading to a higher class error rate of 16.10%. This discrepancy suggests that while the model is highly effective at recognizing non-participants, it is less reliable in detecting participants, potentially due to class imbalance or overlapping feature distributions.
Table 8. Model output for classification—Selling agric items or products in 2002–2003: true values from training data.
When evaluating the overall performance of the model as additional trees are included, the OOB error rate plot demonstrates long-term stability; this trend is illustrated in Figure 8a. It suggests that further increases in the number of trees are unlikely to yield significant improvements in RF predictive accuracy. Such stability indicates a well-defined feature set that effectively captures the patterns underlying participation through the sale of agricultural items or products. In this context, robust feature selection plays a crucial role in identifying the key drivers of selling behavior, and the consistency of the model's performance could serve as a basis for developing targeted interventions to encourage participation in this trade.
Figure 8. Error Rates for selling agricultural items or products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
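Curves like those in Figure 8 can be traced directly from a forest's out-of-bag predictions. The sketch below grows one forest incrementally using scikit-learn's `warm_start` mechanism; the data are synthetic stand-ins for the survey records, so the numbers are illustrative only.

```python
# Sketch: tracing an OOB-error-vs-number-of-trees curve of the kind shown
# in Figure 8. One forest is grown incrementally with warm_start, so earlier
# trees are reused as the ensemble expands. Synthetic data throughout.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=800, n_features=8, n_informative=4,
                           random_state=1)

rf = RandomForestClassifier(warm_start=True, oob_score=True, random_state=1)
curve = []
for n_trees in range(25, 501, 25):
    rf.set_params(n_estimators=n_trees)
    rf.fit(X, y)                          # adds trees; earlier ones are kept
    curve.append((n_trees, 1 - rf.oob_score_))

# Typically the error drops sharply at first, then plateaus.
for n_trees, err in curve[::4]:
    print(f"{n_trees:4d} trees  OOB error = {err:.3f}")
```

The plateau in such a curve is what the text reads as evidence that 500 trees are sufficient and that further ensemble growth yields diminishing returns.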
In terms of performance—as illustrated in Table 9—all three models demonstrate strong and statistically significant accuracy in predicting participation through the sale of agricultural products. The DT model achieves the highest accuracy at 90.84%, slightly outperforming RF (89.83%) and GB (87.59%). All models exceed the NIR of 52.85%, with p-values < 2e-16, confirming that their predictive performance is highly significant and not due to chance. The Kappa statistic, which measures agreement beyond random classification, is highest for DT (0.8169), followed by RF (0.7940) and GB (0.7502), indicating that DT provides the most reliable classification consistency. In terms of sensitivity, which reflects the true positive rate for identifying sellers of agricultural products (class 1), DT achieves a perfect score of 1.000, outperforming RF (0.9765) and GB (0.9108). This suggests that DT is most effective at correctly identifying participants, a critical factor for policy targeting and program design. However, when evaluating specificity, which measures the ability to correctly identify non-participants (class 0), GB leads with 83.68%, followed by DT (81.73%) and RF (81.05%). This indicates that GB is better at avoiding false positives, offering a more balanced classification across both groups. Further, Positive Predictive Value (PPV) is highest for GB (86.22%), suggesting it is most reliable in confirming actual participants among predicted positives. RF follows with 85.25%, and DT with 84.47%. Meanwhile, Negative Predictive Value is perfect for DT (1.0000), indicating no false negatives, while RF (96.86%) and GB (89.33%) also perform well. The Balanced Accuracy, which averages sensitivity and specificity, is highest for DT (90.87%), followed by RF (89.35%) and GB (87.38%), reinforcing DT's overall strength in classification performance. 
Notably, McNemar's test p-values for DT (p = 4.321e-14) and RF (p = 2.797e-06) indicate significant asymmetry in classification errors, whereas GB (p = 0.1198) shows no significant imbalance, suggesting greater stability in its predictions. Overall, while DT excels in identifying true positives and offers the highest balanced accuracy, GB provides more equitable performance across both classes, particularly in specificity and PPV. These results affirm that the features selected for all models are consistent and relevant predictors of agricultural participation in Lesotho. Their reliability lays a strong foundation for further research into the determinants of sectoral engagement, enabling data-driven interventions tailored to specific population segments—such as rural vs. urban households, or youth vs. adult participants—within the agricultural economy.
Table 9. Confusion Matrix and Statistics—Selling agric items or products as per HBS 2002–2003: predicted values from test data.
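McNemar's test, used throughout this section to probe error asymmetry, compares only the two off-diagonal cells of a paired table. The sketch below assumes the common continuity-corrected chi-square form (the default of R's `mcnemar.test`); the counts are illustrative, not taken from the HBS tables.

```python
# Sketch: McNemar's test for asymmetry in the off-diagonal errors of a
# paired confusion table, using the continuity-corrected chi-square form.
# Counts are illustrative, not drawn from the study's tables.
from math import erfc, sqrt

def mcnemar(b, c):
    """b and c are the two off-diagonal cell counts (discordant errors).
    Returns (chi-square statistic, p-value)."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

stat, p = mcnemar(b=40, c=12)       # strongly asymmetric errors
print(f"chi-square = {stat:.3f}, p = {p:.5f}")
```

A small p-value (as for DT and RF above) signals that one kind of misclassification dominates the other; a large one (as for GB) signals a balanced error distribution.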
A deeper analysis of the RF classification output for the 2010–2011 data on the sale of agricultural products reveals important insights into the model's performance and its ability to distinguish between participants and non-participants (Table 10). The model was configured with 500 trees and considered three variables at each split, yielding an OOB error rate of 18.15%. This suggests that approximately one in five predictions may be incorrect when applied to unseen data, indicating moderate generalization performance with room for improvement. The confusion matrix further clarifies the model's classification dynamics. For the non-participant group (class 0), the model correctly classified 242 instances, while 22 were misclassified as participants, resulting in a low class error rate of 8.33%. This reflects a strong ability to identify non-participants accurately. However, for the participant group (class 1), the model correctly identified 100 instances, but misclassified 56 as non-participants, leading to a substantially higher class error rate of 35.90%. This imbalance in performance suggests that the model is less sensitive to participant characteristics, potentially due to class imbalance, overlapping feature distributions, or insufficiently informative predictors.
Table 10. Model output for classification—Selling agric products in 2010–2011: true values from training data.
When compared to the 2002–2003 model—which achieved a significantly lower OOB error rate of 9.11%—the 2010–2011 model reflects a decline in predictive accuracy. This deterioration may be attributed to shifts in agricultural engagement patterns, changes in household characteristics, or greater heterogeneity in the 2010–2011 dataset, all of which could complicate the model's ability to generalize effectively. Despite this, the OOB error rate plot for 2010–2011 demonstrates stability across increasing tree counts, suggesting that the model is not overfitting and that its performance has plateaued with the current configuration.
As depicted in Figure 8b, the plot demonstrates a gradual leveling off of the OOB error rate as more trees are added, indicating that the ensemble consolidates its predictive accuracy with additional trees and that an adequately sized ensemble can enhance classification reliability and reduce variance. However, while the stability of the OOB error rate indicates consistent model performance, it is essential to recognize that the overall error rate remains relatively high at 18.15%. This suggests that although the model is stable with respect to the number of trees, challenges related to the underlying data distribution or feature selection remain unaddressed.

Expanding the comparison to include GB alongside DT and RF provides a more comprehensive view of model performance in predicting participation through the sale of agricultural items in 2010–2011 (Table 11). All three models demonstrate statistically significant predictive power, with p-values well below 0.001, confirming that their performance exceeds random chance. In terms of overall accuracy, RF leads slightly with 84.29%, followed by DT at 82.86% and GB at 78.57%. However, accuracy alone does not capture the full picture. When examining sensitivity—the ability to correctly identify participants—RF again outperforms the others with 96.51%, followed by DT (86.86%) and GB (86.05%). This indicates that RF is most effective at detecting true positives, making it particularly useful for identifying households engaged in agricultural sales.
Table 11. Confusion Matrix and Statistics—Selling agric items or products as per HBS 2010–2011: predicted values from test data.
Conversely, specificity, which measures the correct identification of non-participants, is highest for GB at 83.68%, compared to DT (79.02%) and RF (64.81%). This suggests that GB is better at avoiding false positives, offering a more balanced classification across both classes. The Balanced Accuracy, which averages sensitivity and specificity, reflects this trade-off: DT achieves 82.94%, RF 80.66%, and GB 76.36%. The Kappa statistic, which accounts for agreement beyond chance, is highest for DT (0.6576), followed closely by RF (0.6490) and GB (0.5383), indicating that DT and RF provide stronger classification consistency. However, McNemar's test p-values reveal that RF (p = 0.001384) exhibits a statistically significant imbalance in misclassification, while DT (p = 0.1124) and GB (p = 0.3613) do not, suggesting that GB offers greater stability in its error distribution. In terms of predictive reliability, Positive Predictive Value is highest for GB (86.22%), followed by RF (81.37%) and DT (79.87%), indicating GB's strength in confirming actual participants. RF, however, leads in NPV at 92.11%, suggesting high confidence in its predictions of non-participation. Overall, RF demonstrates the highest sensitivity and accuracy, making it ideal for maximizing participant detection. DT offers balanced performance with strong agreement metrics, while GB provides greater specificity and predictive precision, especially in avoiding false positives. These distinctions highlight the importance of aligning model selection with policy goals—whether prioritizing outreach to participants, minimizing misclassification, or ensuring equitable treatment across groups. Incorporating ensemble diversity or hybrid approaches may further enhance classification robustness in future applications.

In recent years, the classification of agricultural product sales using the RF method has shown exceptional performance, particularly in the 2017–2018 data set (Table 12).
Leveraging an ensemble of 500 trees and evaluating three variables at each split, the model achieved an impressively low OOB error rate of 1.57%. This minimal error rate underscores the RF model's high efficiency in capturing complex data patterns while maintaining strong generalization and minimizing misclassifications.
Table 12. Model output for classification—Selling agric items or products in 2017–2018: true values from training data.
The confusion matrix further illustrates the model's precision. It perfectly classified all 122 instances of class 0 (non-participants), resulting in a class error rate of 0.00%. For class 1 (participants), the model correctly identified 113 instances, with only 4 misclassified as non-participants, yielding a low class error rate of 3.42%. These results reflect the model's robust predictive capability and balanced performance across both classes, making it highly effective for distinguishing between participants and non-participants in agricultural sales. Such performance metrics are particularly encouraging, as they highlight the strength of the RF approach in handling structured classification tasks, especially in domains with potentially overlapping or imbalanced class distributions.
The OOB error rate plot for 2017–2018 (Figure 8c) further reinforces this, showing a progressive decline in error as the number of trees increases. The initial sharp drop in error with the addition of more trees suggests that the model benefits significantly from ensemble diversity, while the eventual plateau indicates that the model reaches a point of stability and optimal complexity.

When delving into the performance metrics of DT, RF, and GB for classifying agricultural product sales in 2017–2018, it becomes evident that all three models exhibit exceptional predictive capabilities (Table 13). Each model achieves high accuracy, with DT slightly leading at 97.67%, followed closely by RF (97.50%) and GB (96.25%). These results are statistically significant, as indicated by p-values < 2e-16, confirming that all models perform well above the No Information Rate.
Table 13. Confusion Matrix and Statistics—Selling agric items or products as per HBS 2017–2018: predicted values from test data.
In terms of sensitivity, both DT and RF attain a perfect score of 1.000, meaning they correctly identified all participants in the dataset. GB also performs strongly with a sensitivity of 97.37%, indicating only a minor shortfall in detecting true positives. However, GB demonstrates slightly lower specificity (92.54%) compared to DT (95.31%) and RF (95.24%), suggesting that GB is somewhat more prone to false positives when identifying non-participants. The Kappa statistic, which measures agreement between predicted and actual classifications beyond chance, is highest for DT (0.9535), followed closely by RF (0.9500) and GB (0.9249), reinforcing the reliability of all three models, with DT showing the strongest consistency. Additionally, PPV is highest for DT (95.59%), while NPV is perfect for both DT and RF (1.000), and slightly lower for GB (97.56%), again highlighting the superior performance of DT and RF in ruling out false negatives. The Balanced Accuracy, which averages sensitivity and specificity, is nearly identical for DT (97.66%) and RF (97.62%), with GB slightly lower at 96.30%. Importantly, McNemar's Test p-values for all models are non-significant—DT (0.2482), RF (0.4795), and GB (1.0000)—indicating no statistically significant differences in misclassification patterns between classes.
In summary, while DT marginally outperforms RF and GB across most metrics, RF remains a highly competitive model due to its ensemble robustness and generalization strength. GB, although slightly behind in specificity and agreement, offers competitive sensitivity and predictive precision, making it a valuable alternative, especially in contexts where minimizing false negatives is critical.
These findings are further enriched by the RF variable importance analysis (Figure 9), which reveals evolving trends in predictors of agricultural participation across the survey years. In 2002–2003, income emerged as the most dominant predictor of agricultural selling behavior, with an MDG score of 302.46—far surpassing all other variables. This underscores the centrality of economic capacity in determining market participation during that period. Other influential variables included economic activity (53.54), age (37.21), and education (32.73), highlighting the role of employment status and human capital in shaping selling decisions. Spatial factors such as zone, district, and urban-rural setting also held moderate importance, suggesting that geographic context played a supporting role. By 2010–2011, however, the landscape shifted: district became the leading predictor (71.17), followed by age (26.64) and zone (17.89), while income's importance declined sharply to 20.41. This shift indicates a growing influence of spatial and demographic factors over purely economic ones in determining agricultural selling behavior. In 2017–2018, the pattern evolved further, with household size emerging as the most significant predictor (79.67), accompanied by age (9.72), while income (5.74), district (4.70), and other variables registered minimal influence. This transition suggests a move toward household-level and demographic determinants, possibly reflecting changes in labor dynamics, intra-household decision-making, or resource allocation.

These RF-based trends align closely with the GB model's variable importance outputs (Figure 10), which identify income, age, and urban-rural setting as key predictors in 2002–2003 and 2017–2018, and district, age, and urban-rural setting in 2010–2011.
Figure 9. Mean Decrease Gini [RF] for important variables—Selling Agric items or products across survey years.
Figure 10. Important Variables [GB]—Selling Agric items or products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
The consistent prominence of age across all years in both models underscores its enduring relevance to agricultural participation. Meanwhile, the rise of household size in 2017–2018 and the fluctuating importance of spatial and economic variables reflect the dynamic nature of agricultural engagement, shaped by evolving socio-economic conditions and policy environments. These insights highlight the value of multi-model analysis in capturing nuanced shifts in participation drivers and guiding more responsive, evidence-based interventions.

Extending the analysis to agricultural spending participation in 2002–2003, as presented in Table 14, offers valuable insight into the classification performance of the RF model. Built with 500 trees and evaluating four variables at each split, the model achieved an OOB error rate of 8.02%, indicating strong generalization and predictive reliability.
Table 14. Model output for classification—Spending on agric items/products in 2002–2003: true values from training data.
The confusion matrix reveals that the model perfectly classified all 356 instances of class 0 (non-participants), resulting in a class error rate of 0.00% for this group. This underscores the RF model's exceptional specificity and precision in identifying households not engaged in agricultural spending. However, for class 1 (participants), the model correctly identified 314 instances, but misclassified 59 as non-participants, yielding a class error rate of approximately 15.82%.
This discrepancy suggests that while the model is highly effective at ruling out non-participation, it is less sensitive to the nuances of participant behavior. Turning to the OOB error rate plot used to visualize the performance of the RF model, the study observes a clear pattern of improvement as the number of trees increases. Initially—as shown in Figure 11a—the OOB error rate declines sharply, reflecting the gain in predictive accuracy from each added tree under the ensemble learning mechanism. This trend indicates that additional trees capture the underlying structure of the data more effectively. The error rate later stabilizes, suggesting that once a sufficient number of trees (in this case, 500) is employed, the model reaches diminishing returns in error reduction.

Extending the analysis to include GB alongside DT and RF provides a more comprehensive assessment of model performance in predicting agricultural spending participation in 2002–2003 (Table 15). All three models demonstrate statistically significant predictive accuracy, with p-values < 2e-16, confirming that their performance exceeds the No Information Rate and is not due to chance. In terms of overall accuracy, DT leads slightly at 90.05%, followed by RF (89.39%) and GB (85.71%). Nonetheless, accuracy alone does not capture the full spectrum of performance. DT and RF both achieve perfect sensitivity (1.0000), meaning they correctly identified all participants (positive class).
Figure 11. Error Rates for spending on agricultural items or products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
Table 15. Confusion Matrix and Statistics—Spending agric items or products as per HBS 2002–2003: predicted values from test data.
GB, while slightly lower at 90.70%, still demonstrates strong sensitivity, though it missed a few positive cases. When evaluating specificity, which reflects the ability to correctly identify non-participants (class 0), GB outperforms both DT and RF with a score of 80.17%, compared to 79.26% for DT and 77.59% for RF. This suggests that GB is more effective at minimizing false positives, offering a more balanced classification across both classes. The Kappa statistic, which measures agreement beyond chance, is highest for DT (0.7991), followed by RF (0.7847) and GB (0.7121), indicating that DT and RF provide stronger classification consistency. However, GB's McNemar's test p-value (0.09097) is non-significant, suggesting no substantial misclassification bias, whereas DT (p = 1.166e-09) and RF (p = 9.443e-07) show statistically significant differences in error distribution between classes. In terms of predictive reliability, PPV is highest for DT (83.95%), with GB (83.57%) and RF (83.23%) close behind. DT and RF also achieve perfect NPV scores of 1.0000, while GB records a slightly lower NPV of 88.57%, indicating that DT and RF are more confident in ruling out false negatives. The Balanced Accuracy, which averages sensitivity and specificity, is highest for DT (89.63%), followed by RF (88.79%) and GB (85.44%). These results suggest that while DT offers the most consistent and accurate classification, RF provides a robust ensemble-based alternative, and GB contributes greater specificity and stability, particularly in avoiding misclassification bias. In general, each model presents distinct strengths: DT excels in sensitivity and agreement, RF offers balanced predictive power with ensemble robustness, and GB provides enhanced specificity and error stability.
These insights are critical for tailoring predictive strategies and refining feature selection, especially when aiming to understand the determinants of agricultural spending behavior and design targeted interventions across diverse household profiles.

At the same time, the RF model output for classifying agricultural spending participation in 2010–2011 demonstrates strong and balanced predictive performance, as detailed in Table 16. Configured with 500 trees and evaluating three variables at each split, the model achieved an OOB error rate of 7.21%, indicating that only a small fraction of predictions are expected to be incorrect when applied to unseen data—a clear sign of generalization strength. The confusion matrix reveals that the model correctly classified 118 non-participants, with 11 misclassified as participants, resulting in a class error rate of 8.53%. For the participant group, 120 instances were correctly identified, while 12 were misclassified as non-participants, yielding a class error rate of 9.09%. These nearly symmetrical error rates reflect a well-balanced model performance across both classes, which is particularly noteworthy given the historical tendency of models to struggle more with participant classification. The OOB error rate plot (Figure 11b) reinforces this assessment, showing a stable decline in error as the number of trees increases, followed by a plateau once optimal complexity is reached. This trend confirms that the RF model benefits from ensemble diversity but also suggests that additional trees beyond 500 offer minimal gains, highlighting the model's efficiency and robustness.
Table 16. Model output for classification—Spending on agric items or products in 2010–2011: true values from training data.
However, despite this commendable performance, the slightly higher misclassification rate among participants points to a persistent challenge in capturing the full complexity of positive class behavior. This may stem from latent variables, feature overlap, or subtle socio-economic factors not fully represented in the current feature set. A broader evaluation of model performance in predicting agricultural spending participation in 2010–2011 reveals distinct strengths across DT, RF, and GB, as detailed in Table 17. DT emerges with the highest overall accuracy at 91.43%, followed by RF at 89.66% and GB at 88.51%, a trend echoed in their respective Kappa statistics—DT at 0.8284, RF at 0.7930, and GB at 0.7698—indicating DT's superior consistency beyond chance classification.
Table 17. Confusion Matrix and Statistics—Spending on agric items or products as per HBS 2010–2011: predicted values from test data.
While RF and GB both achieve slightly higher sensitivity scores of 88.89% compared to DT's 88.41%, DT outperforms in specificity, correctly identifying non-participants at a rate of 94.37%, vs. 90.48% for RF and 88.10% for GB. DT also leads in Positive Predictive Value (93.85%) and Balanced Accuracy (91.39%), reinforcing its reliability in both precision and overall classification balance. Although RF and GB trail slightly in these metrics, they offer competitive performance, with RF benefiting from ensemble robustness and GB contributing greater error stability, as evidenced by its non-significant McNemar's Test p-value (1.000). These findings suggest that while DT is ideal for high-precision, interpretable classification tasks, RF and GB provide valuable alternatives for more complex or variable datasets, making all three models instrumental in shaping predictive strategies and informing targeted agricultural policy interventions.

In 2017–2018, the RF model's output for classifying agricultural spending participation reflects a moderate yet stable level of predictive accuracy, as shown in Table 18. Configured with 500 trees and evaluating three variables at each split, the model achieved an OOB error rate of 15.3%, indicating that approximately one in seven predictions may be incorrect when applied to unseen data. The confusion matrix reveals that the model performed well in identifying non-participants (class 0), correctly classifying 535 instances with only 22 misclassifications, resulting in a low class error rate of 3.95%. However, its performance was notably weaker for participants (class 1), where 412 cases were correctly classified but 142 were misclassified as non-participants, yielding a significantly higher class error rate of 25.63%. This disparity highlights a persistent challenge in accurately identifying the participant group, suggesting a higher susceptibility to false negatives and a need for improved sensitivity.
Table 18. Model output for classification—Spending on agric items or products in 2017–2018: true values from training data.
The OOB error rate plot further supports this assessment, showing a clear decline in error as more trees are added to the ensemble, followed by a plateau around the 500-tree mark (Figure 11c). This stabilization indicates that the model has reached an optimal level of complexity, where additional trees offer minimal gains in predictive performance. It also suggests that the RF model has effectively captured the underlying structure of the data, particularly for the non-participant class, but struggles with the more variable patterns associated with participants. In evaluating the performance of classification models on the 2017–2018 test dataset, a comparison between DT, RF, and GB reveals distinct strengths and trade-offs across the three approaches (Table 19). The DT model achieved the highest overall accuracy at 86.10%, correctly classifying 371 non-participants and 267 participants. RF and GB followed closely, both with an accuracy of 82.75%, though RF correctly identified 174 non-participants and 133 participants, while GB classified 166 and 141, respectively. All models demonstrated statistically significant predictive power, with p-values below 2.2e-16 when compared to the No Information Rate (NIR), confirming their effectiveness.
Table 19. Confusion Matrix and Statistics—Spending on agric items or products as per HBS 2017–2018: predicted values from test data.
The Kappa statistic, which measures agreement beyond chance, was highest for DT at 0.7219, followed by RF (0.6556) and GB (0.6554), indicating that DT offers the most consistent classification reliability. In terms of sensitivity, DT excelled with a perfect score of 1.0000, successfully identifying all participants, while RF and GB recorded slightly lower values of 94.57 and 90.22%, respectively. Conversely, GB demonstrated the highest specificity at 75.40%, outperforming RF (71.12%) and DT (72.16%), suggesting that GB is more effective at correctly identifying non-participants. Further distinctions emerge in predictive value metrics.
GB achieved the highest Positive Predictive Value at 78.30%, narrowly surpassing DT (78.27%) and RF (76.32%). However, DT maintained a perfect Negative Predictive Value of 1.0000, compared to RF (93.01%) and GB (88.68%), reinforcing its strength in ruling out false negatives. Balanced Accuracy, which averages sensitivity and specificity, was highest for DT at 86.08%, with RF and GB closely aligned at 82.84 and 82.81%, respectively. McNemar's Test p-values for all models were significant, indicating some degree of misclassification bias, though DT's performance remained the most stable.

Complementing this performance analysis, the RF model's variable importance output (Figure 12) reveals evolving predictors of agricultural spending over time. In 2002–2003, income was the most dominant variable (MDG: 207.54), followed by age (23.52), activity (19.83), and education (17.91), indicating that economic capacity and demographic characteristics were central to spending behavior. Spatial variables such as district (15.12), zone (6.42), and urban/rural setting (3.55) played moderate roles, while household-level factors like household size (14.60) and marital status (8.48) contributed meaningfully. In 2010–2011, the landscape shifted: district surged in importance (78.46), with age (9.78) and income (8.96) remaining relevant but less dominant. This shift suggests a growing influence of geographic context and demographic structure over economic status in shaping agricultural spending decisions. More recently, in 2017–2018, the pattern evolved further, with household size emerging as the most significant predictor (276.84), followed by income (53.44) and age (47.12), marking a return to household-level and demographic determinants. Spatial variables such as zone (21.52) and district (31.44) remained influential but were overshadowed by structural factors.
As illustrated in Figure 13, these RF-based trends align closely with the GB model's variable importance outputs, which identified income, age, and district as pivotal in 2002–2003; district, age, and income in 2010–2011; and household size, age, and income in 2017–2018. The consistent prominence of age and income across all years in both models underscores their enduring relevance, while the rise of household size in 2017–2018 reflects shifting dynamics in household decision-making and resource allocation. Together, these insights highlight a transition from spatial and economic predictors toward more nuanced demographic and structural factors, reinforcing the importance of adaptive, context-aware modeling in understanding agricultural spending behavior.
Figure 13. Important Variables [GB]—Spending on Agric items/products across survey years: (a) 2002–2003, (b) 2010–2011, and (c) 2017–2018.
5 Discussion
Understanding the evolving dynamics of agricultural participation is critical in the context of rapid urbanization, climate variability, and shifting socio-economic conditions. Using DT, RF, and GB models across nationally representative datasets from the three survey years, this study identifies notable shifts in classification performance and variable importance over time. In 2002–2003, the DT model achieved a classification accuracy of 90.84% for households selling agricultural products. RF and GB, while slightly lower in overall accuracy, demonstrated stronger sensitivity and balanced accuracy, indicating their capacity to better identify participating households. By 2010–2011, model performance declined across all three algorithms. RF maintained relatively stable predictive capacity with an OOB error rate of 9.11%, while GB continued to perform well in terms of sensitivity and balanced accuracy. This period coincided with the aftermath of the 2008–2009 global financial crisis and regional economic disruptions, which may have influenced household decisions regarding agricultural engagement. DT models during this period placed greater emphasis on spatial variables such as “district” and “zone,” suggesting that localized economic or environmental shocks may have played a more prominent role in shaping participation. In 2017–2018, RF achieved its lowest OOB error rate (1.05%) in selling classifications, and GB maintained strong performance in identifying positive cases. These results align with a period of renewed investment in rural infrastructure and agricultural support programs in Lesotho, including improvements in water access and extension services. DT performance continued to decline, indicating that its simpler structure may be less effective in capturing the increasingly complex patterns of agricultural participation.
Variable importance analysis across the models reveals a shift from economic indicators like “income” in 2002 to demographic and infrastructural factors—such as “household size,” “age,” and “water access”—by 2017–2018. This transition suggests a growing influence of demographic pressures and resource accessibility on agricultural engagement. The prominence of water access, in particular, reflects the increasing relevance of environmental constraints. Lesotho has experienced recurrent droughts and rainfall variability over the past two decades, which have significantly impacted smallholder agriculture. While direct climate variables were not included in the models, water access serves as a proxy for environmental stress, highlighting the importance of integrating climate indicators—such as precipitation trends or drought indices—in future analyses.
The analysis also distinguishes between different forms of participation—selling, spending, and possession—revealing nuanced behavioral patterns. Households spending on agricultural items were often linked to lower income, suggesting subsistence-oriented investment. Rural households consistently showed higher participation, while urban areas exhibited growing disengagement, reflecting the influence of urbanization and shifting labor markets. GB was particularly effective in capturing these subtleties, offering heightened sensitivity in detecting positive cases across diverse household profiles. From a methodological standpoint, the use of three distinct algorithms enables a comprehensive understanding of participation dynamics. Ensemble methods like RF and GB are well-suited to capturing non-linear relationships and complex interactions. Although explicit feature engineering was not the primary focus of this study, the models' performance suggests that interaction effects—such as between age and household size or district and water access—are being effectively leveraged. Future work could enhance interpretability by incorporating engineered features and testing interaction terms directly.
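The claim that ensembles capture interactions a single shallow tree cannot is easy to illustrate. The sketch below constructs a hypothetical age × household-size interaction on synthetic data (not the survey's) and compares a depth-1 tree against gradient boosting:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n = 2000
age = rng.uniform(18, 80, n)
hh_size = rng.integers(1, 10, n).astype(float)
# Hypothetical interaction: "participation" only when the household is
# large AND the head is middle-aged -- an AND rule no single split matches
y = ((hh_size >= 5) & (age > 30) & (age < 60)).astype(int)
X = np.column_stack([age, hh_size])

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
stump = DecisionTreeClassifier(max_depth=1).fit(Xtr, ytr)
gb = GradientBoostingClassifier(n_estimators=100, max_depth=2,
                                random_state=0).fit(Xtr, ytr)
stump_acc, gb_acc = stump.score(Xte, yte), gb.score(Xte, yte)
```

On this construction no single split isolates a majority-positive region, so the depth-1 tree collapses to predicting the majority class, while the sequence of depth-2 boosted trees recovers the joint rule almost exactly; this is the mechanism, in miniature, behind GB's heightened sensitivity across diverse household profiles.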
Policy implications emerging from this analysis are multifaceted. First, targeted interventions should focus on districts with persistently low participation, enhancing access to water, land, and agricultural inputs. Second, urban disengagement calls for tailored outreach and training programs aimed at younger populations, potentially integrating urban agriculture initiatives. Third, adaptive support mechanisms—responsive to both economic and climatic variability—are essential to ensure resilience in household-level agricultural decisions. The models also highlight the importance of localized strategies, with DT pointing to district-level variation and RF/GB revealing demographic and resource-based disparities. The dynamics of participation in agriculture—through selling, spending, and possession—reflect a complex interplay of economic, demographic, geographic, and environmental factors. As Lesotho continues to navigate challenges such as climate change, urban migration, and shifting consumer preferences, agricultural policy must evolve in tandem. Aligning predictive insights from DT, RF, and GB models with broader macroeconomic conditions and environmental realities supports the development of strategies that are both data-informed and context-sensitive. This approach helps ensure that participation remains resilient, inclusive, and responsive to the country's development trajectory.
Data availability statement
The data analyzed in this study is subject to the following licenses/restrictions: the data is accessed upon formal request from the Lesotho Bureau of Statistics. Requests to access these datasets should be directed to: www.bos.gov.ls.
Author contributions
KR: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. RC: Conceptualization, Investigation, Methodology, Project administration, Supervision, Validation, Writing – review & editing. TZ: Conceptualization, Supervision, Validation, Writing – review & editing. KC: Conceptualization, Supervision, Validation, Writing – review & editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Acknowledgments
The authors are grateful to the Lesotho Bureau of Statistics for providing the data sets used in this study.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Keywords: agricultural participation, classification, decision trees, ensemble, feature selection, machine learning, random forest
Citation: Ramalebo K, Chifurira R, Zewotir T and Chinhamu K (2026) Decoding the future of agricultural participation: machine learning insights to unravel the plausible triggers. Front. Appl. Math. Stat. 11:1693403. doi: 10.3389/fams.2025.1693403
Received: 27 August 2025; Revised: 22 November 2025;
Accepted: 27 November 2025; Published: 05 January 2026.
Edited by:
Jiangjiang Zhang, Hohai University, China
Reviewed by:
Norman Peter Reeves, Sumaq Life LLC, United States
Endris Assen Ebrahim, Debre Tabor University, Ethiopia
Rose Nakibuule, Makerere University, Uganda
Copyright © 2026 Ramalebo, Chifurira, Zewotir and Chinhamu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Katiso Ramalebo, ramalebok@gmail.com