- 1Office de Gestion de la Calidad, Universidad Nacional Toribio Rodriguez de Mendoza de Amazonas, Chachapoyas, Peru
- 2Facultad de Ingeniería Zootecnista, Biotecnología, Agronegocios y Ciencia de Datos, Universidad Nacional Toribio Rodríguez de Mendoza de Amazonas, Chachapoyas, Peru
- 3Pontificia Universidad Catolica del Peru, Lima, Peru
There is concern about the levels of stress faced by college students and their effects on mental health and academic performance. This study aimed to characterize academic stress levels in college students, using data mining algorithms to classify and predict risk patterns. Data were collected from 287 students using the SISCO Academic Stress Inventory, and classification algorithms and association rules were applied using WEKA software. The results revealed that 75.3% of the students experienced high stress levels, primarily linked to psychological reactions and academic demands. The study also compared the predictive performance of 13 algorithms, of which J48, LMT, and SimpleLogistic achieved classification accuracies above 89%, surpassing results previously reported in similar educational contexts. Association rule mining further showed that being single and childless was strongly associated with elevated stress levels, highlighting demographic risk profiles often overlooked in earlier research. By integrating predictive modeling with demographic and behavioral factors, this study extended prior literature by showing how data mining can simultaneously classify and explain academic stress, offering actionable insights for universities to design targeted, evidence-based interventions.
1 Introduction
A growing number of young people pursue higher education seeking to strengthen their academic foundation and enhance their opportunities in an increasingly competitive labor market (Carrasco et al., 2024; Sánchez et al., 2024; Xiao et al., 2024). However, this educational process unfolds within a context of high academic demands, economic uncertainty, and limited professional opportunities, which can generate high levels of stress and anxiety (Leslie et al., 2021; Noman et al., 2021). The transition to university life is also a crucial stage of personal development, involving major changes in habits, lifestyle, and support networks, along with the challenge of adapting to a new academic system (Durán Acevedo et al., 2021). If unmanaged, these conditions may trigger mental health problems that directly affect academic performance, persistence, and overall student wellbeing (Castillo-Navarrete et al., 2024; Gogichadze et al., 2023).
Higher education benefits both individuals and society as a whole (Hitches et al., 2022); nevertheless, the academic journey presents numerous challenges (Durán Acevedo et al., 2021). Internationally, students report high stress levels related to their education, which can negatively impact health, quality of life, and academic achievement (Pascoe et al., 2020; Powell and Graham, 2017). However, academically confident students experience less stress, adapt more successfully to college, and are generally considered healthier and happier individuals (Chemers et al., 2001). Success in higher education encompasses both academic achievement and life satisfaction, which can be predicted by academic confidence (self-efficacy) and stress (Krumrei-Mancuso et al., 2013). Therefore, examining these two factors may show how best to support college students in reaching their full potential and which individuals may need support (Hitches et al., 2022).
University students tend to experience higher levels of stress and anxiety compared to the general population (Bewick et al., 2010; Bonneville-Roussy et al., 2017; Deasy et al., 2015). In the United Kingdom, mental health problems have increased fivefold (Thorley, 2017); in Asia, 11% of students report stress and anxiety (Cuttilan et al., 2016); and in Malaysia, 47.1% exhibit decreased psychological wellbeing (Zulkefly and Baharudin, 2010). There has been growing research interest in stress and coping strategies, particularly among medical students (Melaku and Bulcha, 2021; Tran et al., 2023; Uyen et al., 2024). Among Vietnamese medical students, more than 30% report moderate to high stress (Nguyen-Thi et al., 2024; Pham et al., 2023). Prolonged exposure to high stress leads to cognitive and emotional overload, psychological distress, school dropout, poor quality of life, and reduced empathy, with professional risks such as medical errors (Gleichgerrcht and Decety, 2013; Ruzhenkova et al., 2018; Wong and Chapman, 2023).
Research suggests that college students experience stress from multiple sources (academic, psychosocial, and financial), often related to professor-student relationships and high personal or external expectations (Ragab et al., 2021; Saxena et al., 2014). Physical problems can also be significant stressors affecting academic performance and quality of life (Coffin et al., 2023; Maity et al., 2022). Academic-related stressors in particular have been found to cause more distress than interpersonal, intrapersonal, or environmental ones (Sadiq et al., 2021). Early identification of these stressors may help design effective prevention and intervention programs aimed at reducing psychological problems among this high-risk population (Klein and McCarthy, 2022; Slavin et al., 2012).
Students tend to experience high stress, especially in their first year, due to difficulties balancing studies, work, and family responsibilities, as well as limited knowledge of the teaching profession (Geng and Midford, 2015). Exams, assessments, and professional practice requirements increase this burden (Deasy et al., 2015; Gustems-Carnicer et al., 2019), negatively affecting academic performance (Gustems-Carnicer et al., 2019). Female students often report higher stress than males (Deasy et al., 2015), while the effect tends to decrease with age (Gustems-Carnicer et al., 2019). Time management and self-regulation strategies can reduce stress and anxiety (Heikkilä et al., 2012), yet many students are reluctant to seek help or do not know how to access support (Deasy et al., 2015; Geng and Midford, 2015). This highlights the importance of timely interventions in higher education to better understand and address students' needs (Hitches et al., 2022).
There are studies on academic stress focused on the use of machine learning to predict stress levels based on academic performance and study load (Shahapur et al., 2024), artificial neural networks to estimate mental health (Pei, 2022), and classification algorithms such as Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF) to detect mental stress by evaluating factors such as internet use, academic workload, exam pressure, and family environment (Ahuja and Banga, 2019; Firoz et al., 2023; Wicaksono and Sriani, 2024). Data mining has proven to be a tool for characterizing academic stress in students by identifying stressors and offering the possibility of early interventions (Anand et al., 2023). The application of this methodology in the field of academic stress remains scarce; therefore, the proposed study seeks to fill this gap by using data mining, applying algorithms to classify and predict stress levels, and using association rules to reveal hidden relationships between variables, which will allow a better understanding of the factors that influence academic stress.
2 Materials and methods
2.1 Place of study
The study population comprised students from the 2nd to 12th−14th academic cycles of the different degree programs at the Universidad Nacional Toribio Rodríguez de Mendoza de Amazonas, enrolled in the 2024-II academic semester.
2.2 Methodology
The study analyzed stress levels among students enrolled at the Universidad Nacional Toribio Rodríguez de Mendoza de Amazonas during the 2024-II semester. Data collected on students' stress levels were analyzed using data mining algorithms implemented in the open-source software WEKA (version 3.8.6; see Figure 1). Although this cross-sectional approach offers strong descriptive and predictive value, it does not capture variations in stress levels across semesters, a limitation that should be considered when interpreting the results.
2.2.1 Data collection
Data were collected through an online survey distributed via institutional emails to students and faculty in May, coordinated by the university's Office of Information Technology. The primary instrument was the SISCO Academic Stress Inventory developed by Guzmán-Castillo et al. (2022). The questionnaire was structured in four sections: (I) general data; (II) stressors; (III) physical, psychological, and behavioral reactions when the respondent was worried or nervous; and (IV) stress coping strategies. Items were answered on a five-point Likert scale, where (1) is never, (2) is rarely, (3) is sometimes, (4) is almost always, and (5) is always. The questionnaire demonstrated high internal consistency, with a Cronbach's alpha coefficient of 0.89.
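As a point of reference for the internal consistency figure, the following is a minimal sketch of how Cronbach's alpha can be computed from Likert responses. The function name and the toy responses are hypothetical and are not the study's data; the sketch assumes complete responses with every item scored in the same direction.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                       # number of items
    # Variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical toy data: three respondents, three items on a 1-5 Likert scale.
# The items move together perfectly, so alpha reaches its maximum.
toy_responses = [
    [1, 1, 1],
    [3, 3, 3],
    [5, 5, 5],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))
```

Values close to 1 indicate that the items vary together, which is the sense in which a coefficient of 0.89 is read as high internal consistency.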
The data set used in the study includes students enrolled in semester 2024-II. Table 1 provides information on variables affecting academic stress during the semester under study. Detailed information on the variables was obtained using a survey, for which the participants gave informed consent. A total of 383 students participated in the survey, providing a robust dataset for subsequent analysis.
2.2.2 Stress data processing
The raw academic stress data were processed and classified to prepare the input parameters for Data Mining. All stress data corresponding to semester 2024-II were identified by their level. Surveys containing incomplete or inconsistent data were excluded, resulting in 287 valid responses for analysis.
Stress levels were categorized using 20 key variables defined as attributes within the WEKA data mining framework. WEKA provides a robust framework for analyzing and classifying data, allowing the integration of multiple attributes to identify patterns and relationships (Witten et al., 2011).
These 20 essential attributes include variables directly related to stress level (Area of study, Training cycle, Number of courses enrolled, Hours of study, Age, Gender, Marital status, Number of children, Academic performance, Level of worry or nervousness, Level of stressors, Physical reactions, Psychological reactions, Behavioral reactions, Strategy 1, Strategy 2, Strategy 3, Strategy 4, Strategy 5, Strategy 6).
When the attributes in Table 1 are analyzed for stress level, the sample presents an almost perfectly balanced distribution between genders, with 50.2% male and 49.8% female. The largest age group (47%) is between 21 and 25 years old, representing the central stage of university life. In the academic cycle, 41.1% are in specialty training, followed by 30% in specific training and 28.9% in general training. This distribution reflects different moments in the academic trajectory, with a slight predominance of students in more advanced stages of their careers. The area of training shows a clear predominance of Engineering and Architecture with 40.1%, followed by Accounting, Economics, and Administrative Sciences with 31%. Social Sciences (15%) and Health Sciences (13.9%) have a lower representation, which could indicate the profile of the educational institution or the most popular careers. Almost all are single (96.2%), reflecting the typical profile of university students. Only 9.8% belong to vulnerable populations. Some 93.7% experience moments of nervousness throughout the semester, which indicates that almost all are going through situations that generate considerable anxiety and tension. Some 39.4% report a moderate level of worry, while 30% indicate that they are quite worried. This suggests that more than 69% of students are experiencing considerable levels of stress that could impact their academic performance and personal wellbeing. In terms of stressors, 60.3% are at a medium level, while 24.4% perceive them at a high level. Reactions are similarly distributed: physical (51.6% medium), psychological (45.6% medium), and behavioral (45.6% medium). This is evidence that students are experiencing considerable impact in terms of stress, affecting aspects such as their health, academic performance, and behavior.
Table 2 presents the coping strategies used by the students, revealing that the majority employ methods of moderate intensity. Some 38.7% of the students indicate that they “sometimes” defend their preferences or feelings without harming others, while 39.4% elaborate occasional plans to organize their tasks. In addition, 33.8% occasionally seek information about their stressful situations. On the other hand, 35.2% of the students engage “sometimes” in self-praise, which could reflect an attempt at self-confidence, while 27.2% also engage in religious practice or attend church/temple in stressful situations, albeit less frequently. Finally, 33.1% of students verbalize their concerns and seek emotional support from others “sometimes.” These results suggest that students are developing strategies to cope with stress, but also highlight the need for more support and tools to effectively manage these situations.
In Figure 2, the level of stress is particularly alarming: 75.3% of students present a high level of stress, which means that three out of four students are in a condition of high emotional and psychological vulnerability. This percentage indicates a structural problem in the educational system that requires immediate attention.
2.2.3 Use of data mining algorithms
The implementation of data mining was performed in two phases: In the first, algorithms were applied to classify and predict stress level data; the objective was to reveal the ability of data mining algorithms to classify and predict stress levels. In the second stage, association rules were applied to stress level data; the objective was to reveal hidden relationships between variables in the form of rules.
To ensure robustness, a wide range of classification algorithms was selected, including trees, rule-based models, probabilistic approaches, ensembles, and logistic methods. This diversity allows for comparisons across different paradigms, avoiding dependence on the strengths or weaknesses of a single model and aligning with recommendations in recent educational data mining studies (Corrales et al., 2021; Khairy et al., 2024). The data mining classification algorithms employed were:
J48: A machine learning algorithm used in data mining to create decision trees. It uses a training dataset to build a tree to make rule-based decisions. Each node of the tree represents a feature of the data and branches according to criteria that maximize the homogeneity of the classes. These trees are used to classify new data instances or predict outcomes based on their characteristics. J48 is a specific implementation of C4.5 in the Weka software (Witten et al., 2016; Saranaval and Gayathri, 2018).
LMT (Logistic Model Tree): Provides a very good description of the data. It consists of a decision tree structure with logistic regression functions at the leaves (McMahan et al., 2005). As in ordinary decision trees, a test on one of the attributes is associated with each internal node (Fayaz et al., 2021; Landwehr et al., 2005).
SimpleLogistic: A statistical model for binary classification using the probability that an observation belongs to a class. The SimpleLogistic classifier simply implements this model, providing fast results for classification problems (Ali, 2021; Witten et al., 2011).
RandomForest: It is an ensemble model based on the construction of multiple decision trees from random subsets of data. Each tree is trained independently, and its predictions are combined to obtain a robust final result (Breiman, 2001).
REPTree: It is an efficient implementation of decision trees, which builds the tree using information gain and prunes it with reduced-error pruning, making tree construction fast and suitable for large volumes of data (Hall et al., 2009).
DecisionTable: Uses a decision table to represent the rules that determine the output as a function of the inputs. It is suitable for problems where the relationship between inputs and outputs is simple and well-defined (Witten et al., 2016).
Bagging: Bagging, or Bootstrap Aggregating, is an ensemble method that trains multiple models on random subsets of the data and then averages or votes their predictions to improve model accuracy. This approach is effective in reducing model variance and improving model stability and generalization (Breiman, 1996).
Bayes Network Classifier (BayesNet): Bayesian networks are probabilistic models that represent dependency relationships between variables. These models allow inferences and classifications based on probability distributions, being useful in cases where variables are interrelated (Heckerman, 1997).
NaiveBayes: A classifier based on Bayes' theorem that assumes that features are independent of each other. Despite this simplification, the algorithm often provides good results in classification tasks (Rennie et al., 2003).
SMO (Sequential Minimal Optimization): An algorithm designed to train support vector machines (SVM) efficiently, solving quadratic optimization problems in minimal steps, which makes training much faster and suitable for large data sets (Platt, 1998).
JRip: An algorithm that generates classification rules based on the RIPPER algorithm, which divides data into attributes and constructs logical rules for prediction (Cohen, 1995).
DecisionStump: It is a simple classifier that uses a single attribute of the data to split it into two groups, creating a decision tree with a single node. Despite its simplicity, it is mainly used as a base classifier in ensemble techniques such as AdaBoost, where several weak models (such as DecisionStump) are combined to improve performance (Freund and Schapire, 1996).
Locally weighted learning (LWL): A learning approach that fits prediction models for data near a specific query point. This approach allows the model to better adapt to local variations in the data, making it especially useful when relationships in the data are not homogeneous across the feature space. LWL adjusts the model based on the closeness of the data to the query point, which improves the accuracy of local predictions (Atkeson et al., 1997).
PART: It is a hybrid algorithm that combines rule induction with decision tree construction. This algorithm generates a set of classification rules based on a decision tree, which are then interpreted to make decisions. The advantage of PART is that it produces rules that are easy to understand, which facilitate the interpretation of the model's decisions (Ibarguren et al., 2018).
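The splitting idea shared by the tree-based learners above (J48, REPTree, DecisionStump) can be sketched as choosing the attribute that most reduces class entropy. This is a simplified illustration, not WEKA's implementation: it uses plain information gain (C4.5 refines this into a gain ratio), and the attribute names, values, and labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the data on one attribute."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels):
    """Attribute with the highest information gain (the root of a one-level tree)."""
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))

# Hypothetical toy records; labels follow the 1/2/3 stress coding.
rows = [
    {"stressors": "high", "gender": "F"},
    {"stressors": "high", "gender": "M"},
    {"stressors": "low", "gender": "F"},
    {"stressors": "low", "gender": "M"},
]
labels = [3, 3, 2, 2]
root = best_attribute(rows, labels)
print(root)
```

In this toy case the stressor level separates the classes completely while gender carries no information, so the stressor attribute is selected. A full tree learner repeats the same choice recursively within each branch.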
The classification processes were performed using data mining algorithms on the stress level database for the 2024-II semester. The classification has two main objectives. The first is to determine the algorithm that best represents all the data. The other is to evaluate the ability of the data mining algorithms to predict the desired parameter.
In association rule mining, the Apriori and predictive Apriori algorithms were applied. The attribute distributions of both algorithms were analyzed. In this way, deeper insights into the level of academic stress were discovered. The Apriori algorithm finds the most frequent attributes in the dataset and generates association rules with these attributes (Chen and Yin, 2022). When generating the rules with the algorithm, the confidence criterion determined in the implementation is taken into account.
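The level-wise search that Apriori performs for frequent item sets can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the minimum support threshold, and the five toy records are hypothetical, and the sketch omits the rule-generation step and the optimizations of the WEKA implementation.

```python
from itertools import combinations

def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all item sets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {s: support(s) for s in current}
        level = {s: v for s, v in level.items() if v >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent

# Hypothetical toy records using attribute=value items.
records = [
    {"Marital_status=1", "Number_Children=0", "Stress_Level=3"},
    {"Marital_status=1", "Number_Children=0", "Stress_Level=3"},
    {"Marital_status=1", "Number_Children=0", "Stress_Level=2"},
    {"Marital_status=1", "Number_Children=1", "Stress_Level=3"},
    {"Marital_status=2", "Number_Children=1", "Stress_Level=1"},
]
frequent = apriori_frequent_itemsets(records, min_support=0.6)
```

The prune step is what keeps the search tractable: an item set can only be frequent if every one of its subsets is frequent, so most larger candidates are discarded without counting them.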
2.2.4 Evaluation of results
For classification, cross-validation was used as the test method in the modeling studies of this phase. In addition, the Kappa statistic, mean absolute error (MAE), and root mean square error (RMSE) were used as evaluation metrics. The accuracy of each algorithm is determined by the percentage of correctly classified instances in the dataset. Considering only the performance, some modifications were made to some parameters to obtain better results.
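To make the relationship between accuracy and the Kappa statistic concrete, the following sketch computes both from a confusion matrix. The function name and the two-class matrix are hypothetical and do not come from the study; rows are taken to be actual classes and columns predicted classes.

```python
def accuracy_and_kappa(confusion):
    """Accuracy and Cohen's kappa from a square confusion matrix."""
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical confusion matrix for illustration only.
toy_confusion = [
    [40, 10],
    [5, 45],
]
acc, kappa = accuracy_and_kappa(toy_confusion)
print(round(acc, 2), round(kappa, 2))
```

Kappa discounts the agreement a classifier would reach by chance given the class totals, which is why it is reported alongside raw accuracy when class sizes are unequal.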
For the association rules, three criteria were used to assess each relationship: support, confidence, and lift.
Support: It is the percentage of transactions in the database that contain both item sets A and B. The support of the rule A ⇒ B is the probability that a given transaction contains both A and B, which is expressed by the probability value P(A∪B) (Prajapati et al., 2017). A high degree of support indicates that the pattern occurs consistently in the data and that the resulting rules are effective association rules. On the other hand, a low degree of support indicates that the pattern appears only occasionally and that the resulting rules have little value for research. Equation 1 represents the definition of association rule support between A and B (Valdivia et al., 2020):
$$\mathrm{support}(A \Rightarrow B) = P(A \cup B) \tag{1}$$
Confidence: The percentage of transactions in database D containing item set A that also contain item set B (Han et al., 2012). Confidence is calculated as a conditional probability and is expressed relative to the support of the antecedent item set (Agrawal et al., 1993), as represented by Equation 2:
$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \Rightarrow B)}{\mathrm{support}(A)} \tag{2}$$
In Equation 2, support (A ⇒ B) is the number of transactions containing the set of items A and B, and support (A) is the number of transactions containing the set of items A (Prajapati et al., 2017).
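Lift: It compares how often A and B occur together with how often they would be expected to occur together if both item sets were statistically independent of each other (Brijs et al., 2003). The calculation is shown in Equation 3:
$$\mathrm{lift}(A \Rightarrow B) = \frac{\mathrm{confidence}(A \Rightarrow B)}{\mathrm{support}(B)} = \frac{P(A \cup B)}{P(A)\,P(B)} \tag{3}$$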
The lift of the rule A ⇒ B shows how much the probability of B increases if A occurs (Lee et al., 2019). There are three cases:
- When lift (A ⇒ B) > 1, there is a positive interdependence between the antecedent and the consequent; therefore, the rule is considered valuable.
- When lift (A ⇒ B) < 1, there is a negative interdependence between the antecedent and the consequent.
- When lift (A ⇒ B) = 1, A and B are independent and there is no correlation between them.
Therefore, the higher the lift, the greater the interest in the generated rule. Lift is thus used to rank the rules that meet the minimum thresholds of support and confidence (Huaman et al., 2024).
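The three criteria can be worked through on a small example. The sketch below is illustrative only: the function name and the five toy records are hypothetical and are not the survey data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent => consequent."""
    a, b = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(a <= t for t in transactions)          # records containing A
    n_b = sum(b <= t for t in transactions)          # records containing B
    n_ab = sum((a | b) <= t for t in transactions)   # records containing A and B
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)
    return support, confidence, lift

# Hypothetical toy records.
records = [
    {"single", "no_children", "high_stress"},
    {"single", "no_children", "high_stress"},
    {"single", "no_children", "high_stress"},
    {"married", "children", "medium_stress"},
    {"single", "children", "medium_stress"},
]
support, confidence, lift = rule_metrics(
    records, {"single", "no_children"}, {"high_stress"}
)
print(round(support, 2), round(confidence, 2), round(lift, 2))
```

In this toy case three of five records contain both sides of the rule (support 0.6), every record with the antecedent also has the consequent (confidence 1.0), and the lift of about 1.67 falls in the first case above, a positive interdependence.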
3 Results
Algorithm J48 with a correct classification percentage of 89.55% shows that students with high stress levels tended to exhibit medium levels of psychological reactions, medium stressors, low behavioral reactions, and low physical reactions. Likewise, students present a high level of stress when they are in the specialty cycle, with a high level of stressors, a low level of behavioral reactions, and a low level of physical reactions (see Figure 3).
Figure 3. Decision tree J48, for stress level. The terminal leaves display the numbers 1, 2, and 3, which correspond to the stress levels classified according to the SISCO Academic Stress Inventory: 1 = Low stress, 2 = Medium stress, and 3 = High stress. Each branch leads to a final prediction of the students' stress level based on their academic and personal characteristics.
In the context of stress level analysis, LMT classifies students into three classes, using an equation for each class based on variables such as nervousness, stressors, and physical, psychological, and behavioral reactions. The results for LMT also showed a correct classification rate of 89.55%.
In the stress level analysis, the SimpleLogistic classifier algorithm predicts students' classes using logistic regression, based on the variables in the model, with 89.55% in terms of correct classification.
Although simple, JRip proved effective in terms of both accuracy and interpretability. The stress level analysis correctly classified a significant percentage of students, with a correct classification score of 88.85%.
In the student stress analysis, RandomForest showed a correct classification rate of 88.85%.
In the stress analysis, REPTree obtained a percentage of correct classification of 88.50%, showing that students with high stress levels tended to exhibit medium or high behavioral reactions, and a low level in their physical reactions (see Figure 4).
Figure 4. REPTree decision tree for stress level. The numerical values at the leaves indicate the predicted stress level: 1 = Low stress, 2 = Medium stress, and 3 = High stress. The interpretation of each branch shows how combinations of physical, psychological, and behavioral reactions determine the final classification of students' stress levels.
DecisionTable, which is suitable for problems where the relationships between attributes and classes are easily interpreted, obtained a correct classification rate of 88.15% in the stress analysis.
The Bagging or Bootstrap Aggregating method reduces variance and improves the stability of the results. In the stress analysis, it showed similar performance, with a correct classification rate of 88.50%.
The Bayes Network Classifier (BayesNet) approach is useful for modeling and classifying data with complex relationships between variables. In the stress analysis, it showed a correct classification rate of 88.50%.
Although its assumption of independent features is rarely realistic, NaiveBayes is still effective for classification tasks, especially on high-dimensional problems. In the stress analysis, a correct classification rate of 87.80% was obtained.
The SMO algorithm, for the analysis of students' academic stress levels, showed a correct classification of 86.76%.
In the stress level analysis, DecisionStump showed a correct classification rate of 85.71%, reflecting its ability to contribute to an effective ensemble model.
In the stress analysis, the LWL showed a correct classification rate of 85.71%, adapting well to variations in the data.
Despite being relatively simple, the PART algorithm has proven to be effective in data classification. In the stress analysis, it obtained a correct classification rate of 84.32%.
Table 3 shows that all applied algorithms achieved over 84% correct classification, confirming robust capacity to identify academic stress patterns in students under WEKA's default hyperparameters. The top-performing models were J48, LMT, and SimpleLogistic, each with 89.55% correct classification and a Kappa index of 0.74, reflecting substantial agreement beyond chance. Similarly, JRip (88.85%), RandomForest (88.85%), and REPTree (88.50%) exceeded 88%, while DecisionTable, Bagging, and BayesNet remained close to 88%. In the lower but still acceptable range were NaiveBayes, SMO, DecisionStump, LWL, and PART. Error metrics (MAE and RMSE) were generally low, though slightly higher for SMO.
Beyond numerical accuracy, the models reveal that academic stress results from the interaction of multiple factors. The decision tree J48 highlights the importance of the specialty cycle and level of stressors; LMT and SimpleLogistic emphasize nervousness and psychological reactions as decisive predictors; and other models reinforce the role of behavioral and physical responses. Taken together, the findings underscore that students in advanced academic stages, facing cumulative stressors and lacking strong adaptive responses, are particularly vulnerable to experiencing high levels of stress.
Table 4 presents the precision, recall, and other performance metrics by class for the evaluated algorithms. The results show strong performance in identifying the high-stress class (class 3), with recall and F-measure values generally above 0.90, indicating that the models accurately capture the patterns associated with this category. The medium-stress class (class 2) achieves intermediate levels of performance, with recall values ranging from 0.74 to 0.87 depending on the algorithm, reflecting acceptable detection rates though with a higher risk of misclassification. The lowest results are observed in the low-stress class (class 1), where recall and F-measure remain between 0.00 and 0.22, highlighting the challenge of accurately detecting cases in this group. The metrics reflect consistent algorithmic behavior, with greater reliability in the categories that show clearer and more distinguishable patterns.
Table 5 shows the confusion matrices for the evaluated algorithms. The high-stress class (class 3) accounts for most of the correct classifications, confirming consistency with the high recall values obtained. In the medium-stress class (class 2), a significant number of cases were misclassified as high stress, indicating that students at the intermediate level tend to be classified as higher risk. The low-stress class (class 1) is the most affected, as most of its cases were assigned to the higher categories, with few correct predictions, showing the difficulty of the models in distinguishing this group. The confusion matrices demonstrate solid performance in the majority category but also reveal limitations in differentiating lower stress levels.
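The per-class metrics reported in Table 4 are derived from confusion matrices like those in Table 5. The sketch below shows the calculation on a hypothetical three-class matrix; the counts are illustrative and are not taken from the study's tables. Rows are assumed to be actual classes and columns predicted classes.

```python
def per_class_metrics(confusion):
    """Precision, recall, and F-measure per class (classes numbered from 1)."""
    k = len(confusion)
    metrics = {}
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # actual c, predicted other
        fp = sum(confusion[r][c] for r in range(k)) - tp   # predicted c, actual other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (
            2 * precision * recall / (precision + recall)
            if precision + recall else 0.0
        )
        metrics[c + 1] = (precision, recall, f_measure)
    return metrics

# Hypothetical matrix: class 1 = low, 2 = medium, 3 = high stress.
toy_confusion = [
    [1, 3, 1],
    [0, 15, 5],
    [0, 2, 73],
]
metrics = per_class_metrics(toy_confusion)
for cls, (p, r, f) in metrics.items():
    print(cls, round(p, 2), round(r, 2), round(f, 2))
```

With a small minority class, a handful of misassigned cases is enough to push recall toward zero even when overall accuracy stays high, which matches the pattern described for the low-stress class.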
Table 6 presents the association rules generated using the Apriori algorithm, applied to explore the relationships between variables such as number of children, stress level, stressors, and marital status in college students. The table's columns show key metrics to assess the strength of the associations: confidence, lift, leverage, and conviction. It is observed that the rules with the highest level of confidence, especially those relating to marital status and number of children, also present a high lift, indicating a significant relationship between the variables. These results allow us to identify relevant patterns of personal conditions associated with the level of stress in the population studied.
The results of the Apriori association analysis reveal that, considering stress level as the dependent variable, there is a strong correlation between a high level of stress (Stress_Level = 3) and certain demographic characteristics of the participants. Specifically, 94% of high-stress people have no children (Number_Children = 0), and 96% have a specific marital status (Marital_status = 1), “single.” This association suggests that single people without children are more likely to experience elevated levels of stress, which may reflect factors such as work or academic pressures that significantly affect students with these demographic characteristics.
This finding shows that, in the analyzed population, the absence of direct family support is linked to greater vulnerability to academic demands. The recurrence of these conditions across different rules confirms that academic stress does not depend solely on academic factors but is reinforced by personal characteristics. In other words, young, single, and childless students lack support networks that could buffer the emotional impact of university workload, making them particularly prone to experiencing high levels of stress. In this sense, these results provide evidence for designing preventive interventions targeted at these profiles, considering both the academic context and the personal factors that increase vulnerability.
4 Discussion
Analysis of data collected from 287 university students showed that 75.3% experienced high stress levels, 21.6% moderate stress, and only 3.1% low stress. These results indicate that three out of four students were highly vulnerable emotionally and psychologically within the university environment, which is consistent with previous studies that have identified higher education as a particularly critical stage for mental health (Bewick et al., 2010; Bonneville-Roussy et al., 2017). Similarly, research such as Cuttilan et al. (2016) reports worrying levels of stress, although the percentage observed in our sample is even higher, evidencing the particular severity in our academic context.
In terms of predictive performance, the best-performing data mining algorithms (J48, LMT, and SimpleLogistic) achieved classification accuracies above 89%, demonstrating a strong ability to identify stress-level patterns. This robust performance reinforces the reliability of the findings and aligns with previous research supporting the application of data mining techniques in student mental health analysis (Corrales et al., 2021; de la Fuente et al., 2020; Khairy et al., 2024). This level of accuracy exceeds the 78% reported by Corrales et al. (2021), who used logistic regression and classification trees, highlighting the robustness of data mining tools in this type of educational research.
The analysis of coping strategies revealed that although students employ certain mechanisms, these are inconsistent and often ineffective. For example, only 39.4% occasionally made plans to manage stress, while 33.8% sometimes sought information about their stressful situations. This trend suggests an intermittent use of coping strategies, which is consistent with Nguyen-Thi et al. (2024) and Frydenberg (2014), who warn that without consistent adaptive strategies, students remain vulnerable to the negative effects of academic stress. Additionally, practices such as emotional venting or the use of religious support are even less frequent, limiting avenues for emotional release in the face of university pressure.
The association rules generated using the Apriori algorithm show particularly revealing patterns: single (Marital_Status = 1) and childless (Number_Children = 0) students are highly associated with an elevated level of stress (Stress_Level = 3). High confidence values (≥96%) and lift scores above 1 further support these relationships, indicating that such personal characteristics are positively associated with, and raise the likelihood of, high stress. These findings are consistent with previous studies emphasizing the role of social support in mitigating academic stress (Gogichadze et al., 2023; Maity et al., 2022; Ragab et al., 2021). They also agree with Corrales et al. (2021), who found that personal and contextual conditions (such as lack of social support and high academic demands) were key determinants of students' stress levels. In addition, the lack of family responsibilities could imply a greater dedication to academic life, which, although beneficial in terms of concentration, can translate into more intense pressure, economic worries, and anxiety about the future.
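As an illustration of how the confidence and lift values discussed above are obtained, the following minimal Python sketch computes support, confidence, and lift for a rule of the form {Marital_Status = 1, Number_Children = 0} ⇒ {Stress_Level = 3}. The attribute names follow the coding used in this article, but the ten records, the value Marital_Status = 2 for "not single", and the helper names `matches` and `rule_metrics` are hypothetical and serve only to show the arithmetic; they are not the study's data or the WEKA implementation.

```python
from typing import Dict, List

Record = Dict[str, int]


def matches(record: Record, condition: Record) -> bool:
    """Return True when the record has every attribute value in the condition."""
    return all(record.get(key) == value for key, value in condition.items())


def rule_metrics(records: List[Record], antecedent: Record,
                 consequent: Record) -> Dict[str, float]:
    """Compute support, confidence, and lift of the rule antecedent => consequent."""
    n = len(records)
    if n == 0:
        raise ValueError("records must not be empty")
    n_a = sum(matches(r, antecedent) for r in records)   # antecedent holds
    n_c = sum(matches(r, consequent) for r in records)   # consequent holds
    n_ac = sum(matches(r, antecedent) and matches(r, consequent) for r in records)
    support = n_ac / n                                   # P(A and C)
    confidence = n_ac / n_a if n_a else 0.0              # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0        # P(C | A) / P(C)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy sample (NOT the study data): 10 students.
records: List[Record] = (
    [{"Marital_Status": 1, "Number_Children": 0, "Stress_Level": 3}] * 5
    + [{"Marital_Status": 1, "Number_Children": 0, "Stress_Level": 2}] * 1
    + [{"Marital_Status": 2, "Number_Children": 1, "Stress_Level": 3}] * 2
    + [{"Marital_Status": 2, "Number_Children": 2, "Stress_Level": 2}] * 1
    + [{"Marital_Status": 2, "Number_Children": 1, "Stress_Level": 1}] * 1
)

metrics = rule_metrics(
    records,
    antecedent={"Marital_Status": 1, "Number_Children": 0},
    consequent={"Stress_Level": 3},
)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because lift divides confidence by the overall frequency of the consequent, its ceiling depends on how common high stress is in the sample. If the consequent's frequency equals the reported 75.3% prevalence, lift cannot exceed 1/0.753 ≈ 1.33, and a confidence of 0.96 would correspond to a lift of roughly 1.27.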
The association rules, characterized by high confidence and lift values, suggest that the absence of personal support networks defines a student profile that is particularly vulnerable to the demands of the university environment. These characteristics, rather than functioning as simple background variables, indicate greater vulnerability to institutional pressures in the absence of social and emotional support. Close relationships and family responsibilities may act as protective factors that buffer stress (Gogichadze et al., 2023; Maity et al., 2022). This interpretation aligns with the findings of Li and Hasson (2020) and Hitches et al. (2022), who argue that resilience to stress depends not only on individual competencies but also on affective bonds and contextual resources. Therefore, academic stress should not be seen merely as a consequence of external or familial factors, but as a systemic experience within higher education, one that universities must recognize and address as an institutional responsibility.
The elevated stress levels experienced by single, childless students in college settings can be attributed to a variety of factors. These include academic pressures, financial concerns, and personal lifestyle choices, which together create a challenging environment for students (Cody et al., 2024). Understanding these factors is crucial to developing effective interventions to support student wellbeing (Chemagosi, 2024). Students often face intense academic demands, including heavy workloads and high-performance standards, leading to significant stress (Slimmen et al., 2022). Pressure to perform well on exams can exacerbate feelings of anxiety and stress (Islam and Rabbi, 2024). Many students experience financial difficulties, which can generate additional pressure and anxiety, affecting their overall mental health (Mofatteh, 2021). On the other hand, the analysis of associated variables reveals that marital status and childlessness not only describe demographic conditions, but also reflect student profiles with a greater focus on their academic career, while simultaneously exposing them to internal pressures without the emotional buffer that close personal relationships might represent (Hitches et al., 2022; Li and Hasson, 2020).
Higher education benefits the individual, as well as the community and society to which it contributes; however, the educational path is not without challenges (Durán Acevedo et al., 2021). Internationally, students report a high level of stress related to their education, which can have adverse effects on their health, quality of life, and academic achievement (Pascoe et al., 2020). However, when students are academically confident, they experience less stress, adapt more successfully to college, and are considered “healthier” and “happier” individuals (Chemers et al., 2001). Success in higher education encompasses not only student achievement but also satisfaction with life, which academic confidence (self-efficacy) and stress can predict, respectively (Krumrei-Mancuso et al., 2013). Therefore, examining the latter factors may provide information on how best to support college students to reach their full potential and which individuals may most need such support (Hitches et al., 2022).
Furthermore, the results show that academic stress does not arise as an isolated factor but as a consequence of multiple interrelated dimensions: academic pressure, course load, frequent nervousness, moderate or high physical and psychological reactions, and insufficient coping strategies. All this reinforces the need for a comprehensive approach to university stress prevention and care, as already proposed by authors such as Slavin et al. (2012) and Castillo-Navarrete et al. (2024).
From a theoretical standpoint, the study reinforces the notion of academic stress as a systemic phenomenon, resulting from the interaction between curricular demands, students' resources, and institutional conditions. In line with current models of stress in higher education (Misra and Castillo, 2004; Moradi et al., 2011), it confirms that task overload, evaluation pressure, and internal competition are key triggers of this issue. Moreover, it demonstrates that data mining algorithms provide robust classification accuracy, thereby contributing methodological evidence that enriches the theoretical framework on student stress prediction.
From a practical perspective, the findings highlight the urgent need for preventive policies in curriculum design, academic workload management, and the establishment of permanent emotional support spaces. Additionally, it is recommended to incorporate risk profile analyses, such as marital status or the absence of support networks, into student wellness programs (Kim and Sax, 2017; Stallman, 2010). Universities should implement early intervention systems based on predictive models, develop workshops in socioemotional skills, and create mentoring spaces that strengthen support networks, offering not only crisis management but also preventive and sustainable strategies for student wellbeing.
The evidence gathered in this study makes it clear that academic stress should not be seen as a mere individual reaction to study load, but rather as the result of a deeply rooted institutional construct, driven by structural demands, competitive dynamics, and the lack of effective support mechanisms. This perspective aligns with the findings of Pascoe et al. (2020) and Powell and Graham (2017), who warn about the detrimental impact that the modern educational system can have on students' mental health. In this context, it becomes essential for universities to take on an active and transformative role in promoting student wellbeing by implementing policies that include early emotional support, stress management workshops, socioemotional skills programs, and a redesign of curricular dynamics to make them more humane, balanced, and responsive to students' real needs. Only through these structural changes can we move toward a more sustainable, empathetic educational experience that places mental health at the core of professional development.
However, the results should be interpreted with caution considering certain limitations of the study. First, the data were collected through self-administered questionnaires, which may introduce biases such as social desirability or recall errors (Licht-Ardila et al., 2021). Although a validated instrument such as the SISCO inventory was used, the subjective perception of stress does not always accurately reflect its actual intensity. In addition, since most variables are self-reported and qualitative in nature, there is a risk that responses reflect individual perceptions rather than objective measures, which may weaken the strength of the inferences. Therefore, the conclusions should be understood as indicative of patterns and associations rather than direct causal relationships. Moreover, while data mining algorithms provide strong predictive capabilities, their internal logic is not always fully interpretable, especially in complex models such as ensembles or those based on sequential optimization (Atzmueller et al., 2024; Cortis and Davis, 2021). The study was also cross-sectional, limiting the ability to capture how stress levels fluctuate throughout an academic program. Research by Bewick et al. (2010), for instance, has shown that stress can vary significantly across academic cycles, suggesting that future studies should incorporate longitudinal approaches for a deeper understanding of these dynamics.
Additionally, it must be acknowledged that class imbalance was a methodological limitation. Although tests were conducted using techniques such as SMOTE resampling, undersampling, and cost-sensitive learning, these did not improve performance on the minority class (low stress) and, in some cases, even reduced overall accuracy. Likewise, macro-averaged metrics were explored for a fairer evaluation across categories, but these highlighted the difficulty of the model in learning patterns from underrepresented classes. Therefore, per-class metrics and weighted averages were reported as the primary reference, while noting that future work should expand the sample or implement more robust hybrid approaches. Transforming the educational experience into a space that values not only academic performance but also the student's overall wellbeing requires universities to assume an active commitment to the creation of emotionally healthy environments, with permanent psychological support programs, preventive stress management workshops, and the promotion of more collaborative and less competitive settings that strengthen socio-emotional skills and peer support networks, thereby contributing to the development of resilient, balanced, and emotionally healthy professionals.
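To make the class-imbalance limitation concrete, the following Python sketch contrasts per-class, macro-averaged, and weighted-averaged metrics on a three-class confusion matrix. The row totals (9 low, 62 moderate, 216 high) are approximate class counts derived from the reported percentages of the 287 respondents; the cell values themselves are hypothetical and are not the confusion matrix of any model in this study. The function names `per_class_metrics` and `averaged` are likewise illustrative.

```python
from typing import Dict, List, Tuple

labels = ["low", "moderate", "high"]

# Rows = actual class, columns = predicted class (same order as `labels`).
# Hypothetical cell values; only the row totals follow the reported distribution.
confusion: List[List[int]] = [
    [0, 6, 3],      # actual low (9 students): never predicted as low
    [0, 48, 14],    # actual moderate (62 students)
    [0, 6, 210],    # actual high (216 students)
]


def per_class_metrics(cm: List[List[int]],
                      names: List[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall, F1, and support for each class of a confusion matrix."""
    result: Dict[str, Dict[str, float]] = {}
    for i, name in enumerate(names):
        tp = cm[i][i]
        actual = sum(cm[i])                      # row total
        predicted = sum(row[i] for row in cm)    # column total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        result[name] = {"precision": precision, "recall": recall,
                        "f1": f1, "support": float(actual)}
    return result


def averaged(metrics: Dict[str, Dict[str, float]], key: str) -> Tuple[float, float]:
    """Return (macro average, support-weighted average) of one metric."""
    total = sum(m["support"] for m in metrics.values())
    macro = sum(m[key] for m in metrics.values()) / len(metrics)
    weighted = sum(m[key] * m["support"] for m in metrics.values()) / total
    return macro, weighted


metrics = per_class_metrics(confusion, labels)
total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(labels))) / total
macro_recall, weighted_recall = averaged(metrics, "recall")
majority_baseline = max(sum(row) for row in confusion) / total

print(f"accuracy={accuracy:.3f} macro_recall={macro_recall:.3f} "
      f"weighted_recall={weighted_recall:.3f} baseline={majority_baseline:.3f}")
```

In this hypothetical case the weighted recall equals overall accuracy and stays near 0.90, while the macro average drops sharply because the low-stress class is never recovered. A classifier that always predicted high stress would already reach about 75.3% accuracy on this class distribution, which is why per-class figures are informative alongside the weighted averages.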
5 Conclusions
The results of this study show a worrisome picture, revealing that three out of four students experience high stress levels, affecting both their wellbeing and academic performance. This phenomenon is not isolated, as the majority of respondents also report medium or high levels of physical, psychological, and behavioral reactions to stressful situations. Despite employing some coping strategies, such as occasionally making plans or seeking information, these actions do not seem to be systematic or effective enough to mitigate the impact of stress. In addition, the association rules analysis revealed that single and childless students are the most vulnerable, suggesting that the lack of personal support networks could aggravate the experience of stress in academic life.
Regarding the performance of the data mining algorithms applied, a high accuracy rate was achieved in the prediction of stress levels, exceeding 89% in some cases. Beyond this performance, the main contribution is that these techniques help identify key determinants, offering a practical framework for prioritizing student support. In particular, models based on decision trees and logistic regression showed that the academic cycle, the level of stressors, and emotional reactions are major determinants in the classification of stress levels. Taken together, the methodological evidence supports the use of analytics not only to classify risk but also to inform targeted, data-driven decisions in higher education.
Based on these findings, universities must design student welfare policies that address not only critical cases but also preventive measures. They should implement continuous emotional support programs, socioemotional skills workshops, and personalized counseling in time management and stress management. Structural adjustments are also recommended, for example, reviewing teaching loads and assessment pressure, to foster more balanced learning environments. Likewise, support networks among students should be strengthened by promoting meeting and mentoring spaces where students can share experiences, concerns, and coping strategies in a safe and accompanied way. In this approach, mental health should operate as a transversal axis that guides policies, resource allocation, and evaluation of metrics.
This study opens a fertile space for future research on academic stress and its management in higher education. Although data mining algorithms demonstrated great accuracy in classifying stress levels, future work should adopt longitudinal designs to track changes across academic cycles and transition periods. Cross-institutional and cross-regional studies are needed to test generalizability and contextual effects. It is also crucial to examine additional variables, such as external social support, digital habits and resilience, socioeconomic constraints, and prior mental health, within explanatory models. Finally, intervention-based research should link predictive models to early-warning protocols, evaluate effectiveness through academic and wellbeing outcomes (e.g., retention, GPA, help-seeking), and assess ethical, fairness, and interpretability dimensions of algorithmic tools.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
Ethics statement
Ethical approval was not required for the study involving humans in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
YR: Writing – original draft, Funding acquisition, Supervision. LQ: Methodology, Writing – original draft, Validation, Software. OC: Formal analysis, Writing – original draft, Investigation, Writing – review & editing, Conceptualization. ES: Validation, Investigation, Writing – review & editing, Writing – original draft, Visualization. JA: Writing – original draft, Project administration, Investigation, Supervision. JM: Writing – original draft, Data curation. RC: Writing – original draft, Resources.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the Universidad Nacional Toribio Rodríguez de Mendoza de Amazonas.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Agrawal, R., Imieliński, T., and Swami, A. (1993). “Mining association rules between sets of items in large databases,” in Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (Washington, DC: Association for Computing Machinery), 207–216. doi: 10.1145/170035.170072
Ahuja, R., and Banga, A. (2019). Mental stress detection in university students using machine learning algorithms. Procedia Comput. Sci. 152, 349–353. doi: 10.1016/j.procs.2019.05.007
Ali, S. (2021). Intrusion Detection Using the WEKA Machine Learning Tool 2013 A Report Submitted in Partial Fulfillment of the Requirements for the Degree of MASTER OF ENGINEERING in the Department of Electrical and Computer Engineering. Available online at: https://dspace.library.uvic.ca/bitstreams/44548e81-969a-4093-8dcf-27e89d969b03/download (Accessed May 15, 2025).
Anand, R., Md, A., Urooj, S., Mohan, S., Alawad, M., and C, A. (2023). Enhancing diagnostic decision-making: ensemble learning techniques for reliable stress level classification. Diagnostics 13:3455. doi: 10.3390/diagnostics13223455
Atkeson, C. G., Moore, A. W., and Schaal, S. (1997). Locally weighted learning for control. Artificial Intell. Rev. 11, 75–113. doi: 10.1023/A:1006511328852
Atzmueller, M., Fürnkranz, J., Kliegr, T., and Schmid, U. (2024). Explainable and interpretable machine learning and data mining. Data Min. Knowl. Discov. 38, 2571–2595. doi: 10.1007/s10618-024-01041-y
Bewick, B., Koutsopoulou, G., Miles, J., Slaa, E., and Barkham, M. (2010). Changes in undergraduate students' psychological well-being as they progress through university. Stud. Higher Educ. 35, 633–645. doi: 10.1080/03075070903216643
Bonneville-Roussy, A., Evans, P., Verner-Filion, J., Vallerand, R. J., and Bouffard, T. (2017). Motivation and coping with the stress of assessment: gender differences in outcomes for university students. Contemp. Educ. Psychol. 48, 28–42. doi: 10.1016/j.cedpsych.2016.08.003
Brijs, T., Vanhoof, K., and Wets, G. (2003). Defining interestingness for association rule. Int. J. Inf. Theories Appl. 10, 370–375.
Carrasco, A. M., Sánchez, E., Reina, Y., Cruz, O., Chávez, R., Maicelo, Y., et al. (2024). Comprehensive wellness in university life: an analysis of student services and their impact on quality of life. J. Educ. Soc. Res. 14:514. doi: 10.36941/jesr-2024-0190
Castillo-Navarrete, J. L., Bustos, C., Guzman-Castillo, A., and Zavala, W. (2024). Academic stress in college students: descriptive analyses and scoring of the SISCO-II inventory. PeerJ 12, 1–16. doi: 10.7717/peerj.16980
Chemagosi, M. J. (2024). Student Well-Being in Higher Education Institutions. IGI Global. pp. 81–106. doi: 10.4018/979-8-3693-4417-0.ch004
Chemers, M. M., Hu, L. T., and Garcia, B. F. (2001). Academic self-efficacy and first-year college student performance and adjustment. J. Educ. Psychol. 93, 55–64. doi: 10.1037/0022-0663.93.1.55
Chen, M., and Yin, Z. (2022). Classification of cardiotocography based on the apriori algorithm and multi-model ensemble classifier. Front. Cell Dev. Biol. 10, 1–8. doi: 10.3389/fcell.2022.888859
Cody, K., Scott, J. M., and Simmer-Beck, M. (2024). Examining the mental health of university students: a quantitative and qualitative approach to identifying prevalence, associations, stressors, and interventions. J. Am. College Health 72, 776–786. doi: 10.1080/07448481.2022.2057192
Coffin, T., Wray, J., Sah, R., Maj, M., Nath, R., Nauhria, S., et al. (2023). A review and meta-analysis of the prevalence and health impact of polycystic ovary syndrome among medical and dental students. Cureus 15:e40141. doi: 10.7759/cureus.40141
Cohen, W. W. (1995). “Fast effective rule induction,” in Machine Learning Proceedings 1995: Proceedings of the Twelfth International Conference on Machine Learning, ed. M. Kaufmann (Tahoe City, CA: Elsevier), 115–123. doi: 10.1016/B978-1-55860-377-6.50023-2
Corrales, C. A., Rojas, J. E., Atoche, W. J., Cáceres, A. A., and Rodriguez, M. Á. (2021). “Caracterización Del Nivel De Estrés De Alumnos De Ingeniería Mediante Herramientas De Data Mining,” in Proceedings of the 19th LACCEI International Multi-Conference for Engineering, Education, and Technology: “Prospective and Trends in Technology and Skills for Sustainable Social Development” “Leveraging Emerging Technologies to Construct the Future” (Latin American and Caribbean Consortium of Engineering Institutions). doi: 10.18687/LACCEI2021.1.1.489
Cortis, K., and Davis, B. (2021). Over a decade of social opinion mining: a systematic review. Artificial Intell. Rev. 54, 4873–4965. doi: 10.1007/s10462-021-10030-2
Cuttilan, A. N., Sayampanathan, A. A., and Ho, R. C. M. (2016). Mental health issues amongst medical students in Asia: a systematic review [2000-2015]. Ann. Transl. Med. 4, 1–11. doi: 10.3978/j.issn.2305-5839.2016.02.07
de la Fuente, J., Amate, J., González-Torres, M. C., Artuch, R., García-Torrecillas, J. M., and Fadda, S. (2020). Effects of levels of self-regulation and regulatory teaching on strategies for coping with academic stress in undergraduate students. Front. Psychol. 11:22. doi: 10.3389/fpsyg.2020.00022
Deasy, C., Coughlan, B., Pironom, J., Jourdan, D., and Mcnamara, P. M. (2015). Psychological distress and lifestyle of students: implications for health promotion. Health Promot. Int. 30, 77–87. doi: 10.1093/heapro/dau086
Durán Acevedo, C. M., Carrillo Gómez, J. K., and Albarracín Rojas, C. A. (2021). Academic stress detection on university students during COVID-19 outbreak by using an electronic nose and the galvanic skin response. Biomed. Signal Process. Control 68:102756. doi: 10.1016/j.bspc.2021.102756
Fayaz, S. A., Zaman, M., and Butt, M. A. (2021). An application of logistic model tree (LMT) algorithm to ameliorate Prediction accuracy of meteorological data. Int. J. Adv. Technol. Eng. Explor. 8, 1424–1440. doi: 10.19101/IJATEE.2021.874586
Firoz, M., Monirul, M., Shidujaman, M., Islam, A., and Habib, Md. (2023). “University student's mental stress detection using machine learning,” in Seventh International Conference on Mechatronics and Intelligent Robotics (ICMIR 2023), eds. S. Patnaik and T. Shen (SPIE) (Kunming), 113. doi: 10.1117/12.2690039
Freund, Y., and Schapire, R. E. (1996). “Experiments with a new boosting algorithm,” in Machine Learning: Proceedings of the Thirteenth International Conference, 1–9.
Frydenberg, E. (2014). Coping research: historical background, links with emotion, and new research directions on adaptive processes. Aust. J. Psychol. 66, 82–92. doi: 10.1111/ajpy.12051
Geng, G., and Midford, R. (2015). Investigating first year education students' stress level. Aust. J. Teacher Educ. 40. doi: 10.14221/ajte.2015v40n6.1
Gleichgerrcht, E., and Decety, J. (2013). Empathy in clinical practice: how individual dispositions, gender, and experience moderate empathic concern, burnout, and emotional distress in physicians. PLoS ONE 8:e61526. doi: 10.1371/journal.pone.0061526
Gogichadze, M., Mgbedo, N., Landia, N., and Odzelashvili, I. (2023). Sleep quality, perceived academic stress and mental health among international students at the University of Georgia during COVID-19: a cross-sectional study. J. Neurol. Sci. 455:122242. doi: 10.1016/j.jns.2023.122242
Gustems-Carnicer, J., Calderón, C., and Calderón-Garrido, D. (2019). Stress, coping strategies and academic achievement in teacher education students. Euro. J. Teacher Educ. 42, 375–390. doi: 10.1080/02619768.2019.1576629
Guzmán-Castillo, A., Bustos, C., Zavala, W., and Castillo Navarrete, J. L. (2022). Inventario SISCO del estrés académico: revisión de sus propiedades psicométricas en estudiantes universitarios. Terapia Psicológica 40, 197–211. doi: 10.4067/S0718-48082022000200197
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009). The WEKA data mining software. ACM SIGKDD Explor. Newsl. 11, 10–18. doi: 10.1145/1656274.1656278
Heckerman, D. (1997). Bayesian networks for data mining. Data Min. Knowl. Discov. 1, 79–119. doi: 10.1023/A:1009730122752
Heikkilä, A., Lonka, K., Nieminen, J., and Niemivirta, M. (2012). Relations between teacher students' approaches to learning, cognitive and attributional strategies, well-being, and study success. Higher Educ. 64, 455–471. doi: 10.1007/s10734-012-9504-9
Hitches, E., Woodcock, S., and Ehrich, J. (2022). Building self-efficacy without letting stress knock it down: stress and academic self-efficacy of university students. Int. J. Educ. Res. Open 3:100124. doi: 10.1016/j.ijedro.2022.100124
Huaman, A. A., Quiñones, L., Yalta, R., Huaman, A., Adrianzen, O. D., and Rodriguez, J. S. (2024). “Toward enhanced customer transaction insights: an apriori algorithm-based analysis of sales patterns at university industrial corporation,” in IJACSA, International Journal of Advanced Computer Science and Applications, Vol. 15. Available online at: www.ijacsa.thesai.org (Accessed May 15, 2025).
Ibarguren, I., Pérez, J. M., Muguerza, J., Gurrutxaga, I., and Arbelaitz, O. (2018). UnPART: PART without the ‘partial' condition of it. Inf. Sci. 465, 505–522. doi: 10.1016/j.ins.2018.07.022
Islam, S., and Rabbi, F. (2024). Exploring the sources of academic stress and adopted coping mechanisms among university students. Int. J. Stud. Educ. 6, 255–271. doi: 10.46328/ijonse.203
Khairy, D., Alharbi, N., Amasha, M. A., Areed, M. F., Alkhalaf, S., and Abougalala, R. A. (2024). Prediction of student exam performance using data mining classification algorithms. Educ. Inf. Technol. 29, 21621–21645. doi: 10.1007/s10639-024-12619-w
Kim, Y. K., and Sax, L. J. (2017). “The impact of college students' interactions with faculty: a review of general and conditional effects,” in Higher Education: Handbook of Theory and Research, Vol. 32, ed. M. Paulsen (Cham: Springer). doi: 10.1007/978-3-319-48983-4_3
Klein, H. J., and McCarthy, S. M. (2022). Student wellness trends and interventions in medical education: a narrative review. Human. Social Sci. Commun. 9:92. doi: 10.1057/s41599-022-01105-8
Krumrei-Mancuso, E. J., Newton, F. B., Kim, E., and Wilcox, D. (2013). Psychosocial factors predicting first-year college student success. J. Coll. Stud. Dev. 54, 247–266. doi: 10.1353/csd.2013.0034
Landwehr, N., Hall, M., and Frank, E. (2005). Logistic model trees. Machine Learn. 59, 161–205. doi: 10.1007/s10994-005-0466-3
Lee, S., Cha, Y., Han, S., and Hyun, C. (2019). Application of association rule mining and social network analysis for understanding causality of construction defects. Sustainability 11:618. doi: 10.3390/su11030618
Leslie, K., Brown, K., and Aiken, J. (2021). Perceived academic-related sources of stress among graduate nursing students in a Jamaican University. Nurse Educ. Pract. 53:103088. doi: 10.1016/j.nepr.2021.103088
Li, Z.-S., and Hasson, F. (2020). Resilience, stress, and psychological well-being in nursing students: a systematic review. Nurse Educ. Today 90:104440. doi: 10.1016/j.nedt.2020.104440
Licht-Ardila, V., Soto-Gualdron, S. N., and Angulo-Rincon, R. (2021). Nivel de estrés y rendimiento académico en estudiantes universitarios que trabajan y los que no. Espacios 42, 82–90. doi: 10.48082/espacios-a21v42n07p06
Maity, S., Wray, J., Coffin, T., Nath, R., Nauhria, S., Sah, R., et al. (2022). Academic and social impact of menstrual disturbances in female medical students: a systematic review and meta-analysis. Front. Med. 9:821908. doi: 10.3389/fmed.2022.821908
McMahan, H. B., Likhachev, M., and Gordon, G. J. (2005). “Bounded real-time dynamic programming,” in Proceedings of the 22nd International Conference on Machine Learning - ICML '05 (New York, NY: Association for Computing Machinery), 569–576. doi: 10.1145/1102351.1102423
Melaku, L., and Bulcha, G. (2021). Evaluation and comparison of medical students stressors and coping strategies among undergraduate preclinical and clinical year students enrolled in Medical School of Arsi University, Southeast Ethiopia. Educ. Res. Int. 2021, 1–12. doi: 10.1155/2021/9202156
Misra, R., and Castillo, L. G. (2004). Academic stress among college students: comparison of American and International students. Int. J. Stress Manag. 11, 132–148. doi: 10.1037/1072-5245.11.2.132
Mofatteh, M. (2021). Risk factors associated with stress, anxiety, and depression among university undergraduate students. AIMS Public Health 8, 36–65. doi: 10.3934/publichealth.2021004
Moradi, A., Pishva, N., Ehsan, H. B., Hadadi, P., and Pouladi, F. (2011). The relationship between coping strategies and emotional intelligence. Procedia Soc. Behav. Sci. 30, 748–751. doi: 10.1016/j.sbspro.2011.10.146
Nguyen-Thi, T. T., Le, H. M., Chau, T. L., Le, H. T., Pham, T. T., Tran, N. T., et al. (2024). Prevalence of stress and related factors among healthcare students: a cross-sectional study in Can Tho City, Vietnam. Annali Di Igiene Medicina Preventiva e Di Comunita 36, 292–301. doi: 10.7416/ai.2023.2591
Noman, M., Kaur, A., and Nafees, N. (2021). Covid-19 fallout: interplay between stressors and support on academic functioning of Malaysian university students. Child. Youth Serv. Rev. 125:106001. doi: 10.1016/j.childyouth.2021.106001
Pascoe, M. C., Hetrick, S. E., and Parker, A. G. (2020). The impact of stress on students in secondary school and higher education. Int. J. Adolesc. Youth 25, 104–112. doi: 10.1080/02673843.2019.1596823
Pei, J. (2022). Prediction and analysis of contemporary college students' mental health based on neural network. Comput. Intell. Neurosci. 2022, 1–10. doi: 10.1155/2022/7284197
Pham, T. T., Pham, T. T., Le, C. N., Suwanbamrung, C., Le, H. T., Nguyen, T. T. T., et al. (2023). Stress reduction intervention for preventive medicine students in Vietnam's limited resources setting. Arch. Balkan Med. Union 58, 158–166. doi: 10.31688/ABMU.2023.58.2.09
Platt, J. C. (1998). “Fast training of support vector machines using sequential minimal optimization,” in Advances in Kernel Methods, eds. C. J. C. Burges, B. Schölkopf, and A. J. Smola (The MIT Press). doi: 10.7551/mitpress/1130.003.0016
Powell, M. A., and Graham, A. (2017). Wellbeing in schools: examining the policy–practice nexus. Austral. Educ. Res. 44, 213–231. doi: 10.1007/s13384-016-0222-7
Prajapati, D. J., Garg, S., and Chauhan, N. C. (2017). Interesting association rule mining with consistent and inconsistent rule detection from big sales data in distributed environment. Future Comput. Informatics J. 2, 19–30. doi: 10.1016/j.fcij.2017.04.003
Ragab, E. A., Dafallah, M. A., Salih, M. H., Osman, W. N., Osman, M., Miskeen, E., et al. (2021). Stress and its correlates among medical students in six medical colleges: an attempt to understand the current situation. Middle East Curr. Psychiatry 28, 1–10. doi: 10.1186/s43045-021-00158-w
Rennie, J. D. M., Shih, L., Teevan, J., and Karger, D. R. (2003). “Tackling the poor assumptions of naive bayes text classifiers,” in Proceedings of the Twentieth International Conference on International Conference on Machine Learning, 616–623.
Ruzhenkova, V., Ruzhenkov, V., Khamskaya, I. S., Ruzhenkova, V. V., Ruzhenkov, V. A., Lukyantseva, I. S., et al. (2018). “Academic stress and its effect on medical students' mental health status,” in Drug Invention Today, Vol. 10. Available online at: https://www.researchgate.net/publication/332625907 (Accessed May 15, 2025).
Sadiq, A., Ashraf, M. F., Zakaullah, P., and Asghar, A. (2021). Measuring the stressors in undergraduate medical students: a cross sectional study. Sustain. Business Soc. Emerg. Econ. 3, 367–373. doi: 10.26710/sbsee.v3i3.1995
Sánchez, E., Reina, Y., Cruz, O., Torres, M., Carrasco, A. M., and Chávez, R. (2024). Analysis of social demand and labor supply for university study programs: case study in the province of Rodriguez de Mendoza, Amazonas region. Cogent Educ. 11, 1–20. doi: 10.1080/2331186X.2024.2406589
Saranaval, N., and Gayathri (2018). Performance and classification evaluation of J48 algorithm and Kendall's based J48 algorithm (KNJ48). Int. J. Computer Trends Technol. 59, 73–80. doi: 10.14445/22312803/IJCTT-V59P112
Saxena, Y., Shrivastava, A., and Singh, P. (2014). Gender and stress among medical students. Indian J. Physiol. Pharmacol. 58, 147–151.
Shahapur, S., Chitti, P., Patil, S., Abhay, C., Shivaram, V., Rayanaikar, V., et al. (2024). Decoding minds: estimation of stress level in students using machine learning. Indian J. Sci. Technol. 17, 2002–2012. doi: 10.17485/IJST/v17i19.2951
Slavin, S. J., Schindler, D., Chibnall, J. T., Fendell, G., and Shoss, M. (2012). PERMA: a model for institutional leadership and culture change. Academic Med. 87:1481. doi: 10.1097/ACM.0b013e31826c525a
Slimmen, S., Timmermans, O., Mikolajczak-Degrauwe, K., and Oenema, A. (2022). How stress-related factors affect mental wellbeing of university students: a cross-sectional study to explore the associations between stressors, perceived stress, and mental wellbeing. PLoS ONE 17:e0275925. doi: 10.1371/journal.pone.0275925
Stallman, H. M. (2010). Psychological distress in university students: a comparison with general population data. Aust. Psychol. 45, 249–257. doi: 10.1080/00050067.2010.482109
Thorley, C. (2017). Not by Degrees: Improving Student Mental Health in the UK's Universities. Available online at: https://www.ippr.org/articles/not-by-degrees (Accessed May 15, 2025).
Tran, D.-S., Nguyen, D.-T., Nguyen, T.-H., Tran, C.-T.-P., Duong-Quy, S., and Nguyen, T.-H. (2023). Stress and sleep quality in medical students: a cross-sectional study from Vietnam. Front. Psychiatry 14:1297605. doi: 10.3389/fpsyt.2023.1297605
Uyen, P. D., Vo, M. T., Hoai, T. T., Huynh, G., Tran, M. H., Phung, H. N., et al. (2024). Depression in final-year medical students in Ho Chi Minh City, Vietnam: the role of career-choice motivation. J. Med. Educ. Curricular Dev. 11. doi: 10.1177/23821205241238602
Valdivia, A., Martínez-Cámara, E., Chaturvedi, I., Luzón, M. V., Cambria, E., Ong, Y.-S., et al. (2020). What do people think about this monument? Understanding negative reviews via deep learning, clustering and descriptive rules. J. Ambient Intell. Humanized Comput. 11, 39–52. doi: 10.1007/s12652-018-1150-3
Wicaksono, P., and Sriani, S. (2024). Application of support vector machine algorithm for students' final assignment stress classification. JIKO (Jurnal Informatika Dan Komputer) 7, 138–144. doi: 10.33387/jiko.v7i2.8618
Witten, I. H., Frank, E., and Hall, M. A. (2011). Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edn. Morgan Kaufmann (Elsevier). doi: 10.1016/B978-0-12-374856-0.00001-8
Witten, I. H., Frank, E., Hall, M. A., and Pal, C. J. (2016). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann. Available online at: https://www.google.com.mx/books/edition/Data_Mining/1SylCgAAQBAJ?hl=es-419&gbpv=0 (Accessed May 15, 2025).
Wong, W. H., and Chapman, E. (2023). Student satisfaction and interaction in higher education. Higher Educ. 85, 957–978. doi: 10.1007/s10734-022-00874-0
Xiao, L., Yao, M., and Liu, H. (2024). Beliefs about the universality of meaning in life enhance psychological and academic adjustment among university students: the role of meaning in life and stress mindset. Child. Youth Serv. Rev. 158:107460. doi: 10.1016/j.childyouth.2024.107460
Keywords: academic stress, data mining, college students, ranking rule, stress prediction, stress coping
Citation: Reina Marín Y, Quiñones Huatangari L, Cruz Caro O, Sánchez Bardales E, Alva Tuesta JN, Maicelo Guevara JL and Chávez Santos R (2025) Characterization of the stress level of university students using data mining algorithms. Front. Res. Metr. Anal. 10:1637206. doi: 10.3389/frma.2025.1637206
Received: 28 May 2025; Accepted: 31 October 2025; Published: 21 November 2025.
Edited by:
Jose Manuel Martinez-Vicente, University of Almeria, Spain
Reviewed by:
Nahumi Nugrahaningsih, University of Palangka Raya, Indonesia
Suleyman Alpaslan Sulak, Necmettin Erbakan University, Türkiye
Copyright © 2025 Reina Marín, Quiñones Huatangari, Cruz Caro, Sánchez Bardales, Alva Tuesta, Maicelo Guevara and Chávez Santos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Omer Cruz Caro, omer.cruz@untrm.edu.pe
†ORCID: Yuri Reina Marín orcid.org/0000-0002-9402-4104
Lenin Quiñones Huatangari orcid.org/0000-0002-0953-328X
Omer Cruz Caro orcid.org/0000-0001-5664-3222
Einstein Sánchez Bardales orcid.org/0009-0002-8577-4330
Judith Nathaly Alva Tuesta orcid.org/0000-0003-1850-1535
Jorge Luis Maicelo Guevara orcid.org/0009-0008-9097-0348
River Chávez Santos orcid.org/0000-0002-3705-8682