ORIGINAL RESEARCH article

Front. Educ., 18 September 2025

Sec. STEM Education

Volume 10 - 2025 | https://doi.org/10.3389/feduc.2025.1593000

Student evaluation of teaching prior, during and after the COVID-19 pandemic

  • 1Doctoral School of Psychology, Károli Gáspár University of the Reformed Church in Hungary, Budapest, Hungary
  • 2Institute of Psychology, Department of General Psychology and Methodology, Károli Gáspár University of the Reformed Church in Hungary, Budapest, Hungary
  • 3Faculty of Informatics, Eötvös Loránd University, Budapest, Hungary

Introduction: The primary aim of this study is to determine whether the quality of education declined at a Hungarian university during the COVID-19 lockdown. In contrast to studies with smaller sample sizes, we analyzed student evaluation of teaching (SET) results from the university's IT faculty (both undergraduate and postgraduate programs) to provide empirical evidence for the academic discourse on the quality of education before, during, and after the COVID-19 pandemic. Although controversy surrounds SET questionnaires regarding their validity, reliability, and application, valuable information can be extracted from student feedback.

Method: This research retrospectively investigates the quality of education during the COVID-19 pandemic by analyzing SET questionnaires collected at a Hungarian university from 2015 to 2022. This timeframe allows for a unique comparison between the pre-COVID, COVID, and post-COVID educational periods, providing essential context for the pandemic's impact.

Results: Our analysis of SET questions yielded statistically significant results; however, the effect sizes were negligible (ε² < 0.01), indicating a lack of meaningful differences between the COVID and non-COVID periods.

Discussion: The results suggest that, despite the widespread disruption across all areas of life during the pandemic, the quality of education, with all its inherent challenges, did not substantially decrease at this Hungarian university.

Introduction

The COVID-19 pandemic required an immediate response from higher education institutions, leading to the suspension of on-campus attendance. While traditional in-person education could not be maintained, online education presented new challenges for institutions, teachers, and students alike (Adedoyin and Soykan, 2020; Daniel, 2020). Given the scarcity of examples of such drastic global educational reorganization, ensuring the quality of education became a central concern, as the transition to online education had to occur with extreme haste (Means et al., 2020). This adaptation was facilitated by advancements in teaching methodologies and support systems during the pandemic (Abu Talib et al., 2021; Lin et al., 2022; Zhang et al., 2022; Boyle and Cook, 2023). This research aims to contribute to the academic literature by evaluating feedback on the quality of education during the pandemic through the analysis of large-sample quantitative data from Student Evaluation of Teaching (SET) surveys.

Concerns regarding the negative effects of the pandemic on higher education were widely expressed (Boysen, 2023), encouraging researchers to investigate the issue. Abu Talib et al. (2021), in their review of research on student and faculty feedback, found that approximately 22% of articles addressed negative aspects of online education (e.g., anxiety, doubts, overwhelming effects), while the remaining research proactively gathered information concerning the exceptional situation caused by the pandemic.

Jin et al. (2021) observed that students recognized the convenience introduced by online learning during COVID-19 and found the quality of online lectures equally satisfactory (Angelova, 2020). Surprisingly, Alam and Singh (2021) found that IT students performed better with online teaching and learning compared to in-person education. Not only did student performance remain stable, but teacher performance also appeared unaffected by COVID-19 (Reis et al., 2021), irrespective of staff gender, academic title, or field of teaching (Berniak-Wozny et al., 2021). From an institutional perspective, no significant difference in course feedback was detected during the switch to online learning (Rodrigo et al., 2022). Furthermore, Charytanowicz et al. (2022) found no clear evidence that pandemic disruption and online learning led to knowledge deficiencies.

However, one study found that 80% of students would choose a traditional educational setting (Kazainé, 2021), indicating dissatisfaction with online teaching and a preference for on-site learning (Guo, 2020). Only slightly more students perceived online education as less intensive than traditional lectures (Angelova, 2020). Moreover, the transition from traditional to online education was evaluated negatively, with courses becoming less enjoyable, less interesting, and lower in learning value (Garris and Fleck, 2022). When comparing the experiences of engineering students, Warfvinge et al. (2022) found that during the pandemic, student satisfaction with courses decreased, students received less feedback, and understanding course completion requirements became more difficult.

Ahmed et al. (2022) examined feedback from IT students on teaching in a fully online environment during the COVID period to identify key factors for effective education. Analyzing data from 27,622 individuals, they identified two factors that characterized successful education:

1. Student-faculty relationship, encompassing:

• Respectful relationships between teachers and students

• Ensuring a supportive educational climate

• Care and support for students

• Fair, objective, and consistent treatment of students

2. Routine activities, such as:

• Tracking student attendance

• Providing accurate class status

• Availability of online teaching materials

• Clear structure and communication of course framework and expectations

Sepahi et al. (2021), in their small-sample study, found that the most important teaching characteristics during the COVID period, from the students' perspective, were effective presentation, observable professional and ethical attitudes, and consultation and support in problem-solving.

LeBlanc (2021) investigated SET results during pandemic-affected semesters and found that SET scores deteriorated, although they still received relatively high average ratings. Boysen (2023) also investigated whether SET results would change negatively, in line with assumptions that the pandemic adversely affected online education. Boysen's results showed a significant difference between the non-COVID and COVID periods (the latter receiving significantly higher ratings, though with a small effect), and he concluded that average SET results were not affected even by an external event of this magnitude. Campos et al. (2022) followed up on an education reform implemented during the COVID-19 period and found that, alongside a more positive perception of the reform, there were no negative changes in SET assessments. A more recent study by Fullerton et al. (2024), comparing SET scores across pre-COVID and COVID periods, did not reveal meaningful changes.

Despite the ongoing debate surrounding the reliability of SET as a source of information, it remains a widely applied instrument at universities. Pineda and Steinhardt (2023), in their comprehensive analysis, noted that the focus of SET is divided between viewing the methodology as a management tool (prevalent in the Americas) and as a tool to improve the quality of education. The aim of SET surveys is, among other things, to support the quality assessment of teaching (Zabaleta, 2007), aid decisions regarding promotions (Clayson, 2009; Berezvai et al., 2020; Goos and Salomons, 2017), and provide students with an opportunity to express opinions about a subject or instructor (Baxter, 1991; Safavi et al., 2013). Professional judgment on the methodology is heterogeneous, with sharp criticism (Cook et al., 2022; Lakeman et al., 2023; Spooren et al., 2013) alongside views of the method as a useful approach for gathering information (Ali et al., 2021; Murray, 1997; Ulker, 2021).

Factors unrelated to teaching quality, such as age (Murray et al., 2020; Stonebraker and Stone, 2015), likability (Clayson, 2022), physical appearance (Reinsch et al., 2020), and cultural characteristics (Arnold and Versluis, 2019), can all affect SET ratings. Some research has examined, with varying results, whether gender also affects student assessment (Boring and Ottoboni, 2016; Kreitzer and Sweet-Cushman, 2022; Tangalakis et al., 2022). Feistauer and Richter (2018) also investigated the effect of teacher likability on SET outcomes. Their results indicated that likability positively affects SET scores; however, they noted the low number of participants in their study, which relied on voluntary participation. Students who assess the SET questionnaire and its usefulness positively tend to give higher scores, making the reliability of the measurement questionable (Spooren and Christiaens, 2017).

SET also appears to be an unreliable measure of teaching effectiveness, as only a weak relationship was found between current academic performance and teaching effectiveness (Sánchez et al., 2020). Esarey and Valdes (2020) argue that even unbiased, reliable SETs may contain reporting error that should not be ignored when interpreting results. According to Uttl (2021, 2023), it is a mistake to assess teachers' educational effectiveness based solely on SET results. Additionally, Zabaleta (2007) found a moderate correlation between low grades and low SET scores in a sample of 18,175 students. This could lead teachers to award better grades in the hope of receiving higher SET scores, resulting in grade inflation (Nowell, 2007). Berezvai et al. (2020), examining factors influencing SET evaluations, found that results are not independent of the grade received for the given subject: a better grade correlates with a higher SET rating for the instructor. Berezvai et al. (2019) also found that financial incentives for educators significantly increased grade averages in courses where students had previously received low average grades.

When reviewing the SET literature, one encounters various experimental settings and sometimes conflicting results; thus, appropriate context is required to support the future use of identified relationships and effects in the research field (Ali et al., 2021). Active student participation in SET surveys is essential for ensuring quality education (Salvo-Garrido et al., 2022). However, SET data collection itself proves challenging, which forms the cornerstone of validity concerns (Ali et al., 2021). In many cases, the completion rate of SET is considered low (Nulty, 2008), increasing the chance of sampling error (Goos and Salomons, 2017; Wolbring and Treischl, 2016). Meta-analyses on the relationship between student learning and SET were reviewed by Uttl et al. (2017), who found that previous conclusions were often based on small sample sizes. For this reason, they re-ran the analyses adjusting for sample size and found no correlation between the variables (Uttl et al., 2017).

The widespread use of SET results raises additional expectations regarding the reliability and validity of the measurement (Zhao et al., 2022); however, it is increasingly accepted that SET can play an active role in improving the quality of education (Ali et al., 2021; Penny, 2003). Examples include the use of SET to improve the quality of educational work (Golding and Adam, 2016) and in research supporting administrative activity (Barrow and Grant, 2016). SET scales focusing on fundamental aspects of learning (e.g., teacher helpfulness) strongly predict high overall teacher ratings (Park and Dooris, 2020). It is also important to highlight that students associate value with SET surveys (Stein et al., 2021), supporting the use of this key measurement for both administration and educators (Linse, 2017).

Following prior research building on SET results (LeBlanc, 2021; Boysen, 2023), our aim was to determine whether differences could be detected by comparing not only COVID-19 period SET scores to pre-COVID results but also by including post-COVID SET results in the analysis. Analyses of this scale and type remain scarce in the literature to date (Fullerton et al., 2024). Based on Warfvinge et al. (2022), we sought to detect the potential negative effects of “distance learning” using the available data. Based on these considerations, our research question is:

RQ: What effect did the COVID-19 period have on SET results that reflect the quality of education?

Based on Ahmed et al. (2022), we examined key SET questions related to the course framework and requirements, which were found to be indispensable during COVID-19 education.

H1: The perception of course structure received lower ratings during the COVID-19 period compared to pre-COVID and post-COVID periods.

H2: The feasibility of course requirements received lower ratings during the COVID-19 period compared to the pre-COVID and post-COVID periods.

We assume that due to the COVID-19 pandemic, teachers were required to provide more support and motivation, which would manifest in higher scores on the corresponding SET questions (Sepahi et al., 2021).

H3: Students gave higher scores on the “teacher's professional, motivating and supportive attitude” question during the COVID-19 period compared to pre-COVID and post-COVID periods.

Based on Means et al. (2020), we assume that during the COVID-19 pandemic, due to the transition to online education, the interactivity of practice courses could not be achieved as effectively as in the preceding period. However, we assume that after the experience and practice gained during COVID-19, students considered the available teaching methodologies successful.

H4: When judging the interactivity of practice courses, students gave lower ratings during the COVID-19 period compared to pre-COVID and post-COVID periods.

Materials and methods

Hungarian universities conduct SET surveys every semester to monitor educational quality. The surveys and evaluations are available at all faculties and serve multiple purposes. Students provide separate feedback on course quality and on the teacher's performance. This research is based on SET data from one Hungarian university's IT Faculty, including both undergraduate and postgraduate students' feedback. The completion period for the surveys begins on the first day of the exam period and concludes on its penultimate day. The Quality Office is responsible for gathering information across all faculties. Students provide anonymous, voluntary feedback on the courses they have signed up for each semester. This anonymity aims to promote full disclosure and authenticity of information. The university's Senate stipulates that students who complete all questionnaires receive 8 extra points as a reward, usable for course admission in the subsequent semester. This practice aims to remedy low completion rates by introducing incentives, such as the possibility of early subject admission for survey respondents (Lukáts et al., 2023). Naturally, as the number of evaluable responses increases, the number of non-evaluable responses may also increase, and response bias must be considered; however, an overall increase in evaluable data can be achieved (Lukáts et al., 2023).

The research builds on individual SET questions to test the hypotheses. All individual SET questions were assigned a label to make the analysis and its interpretation easier (Table 1).

Table 1. Labels of individual SET questions.

The university collects student feedback using a four-point Likert scale, where 1 is the minimum and 4 is the maximum score. While this raises questions regarding potential validity and reliability errors, Chang (1994) suggests that the choice of the number of points on a Likert scale may depend on the empirical setting. In our setting, the even number of points prevents students from choosing a neutral stance, which serves institutional purposes.

This study is based on a secondary analysis of existing data obtained from the university. The SET results were extracted from the NEPTUN educational system in an anonymized format. Student identities and feedback were anonymized to the extent that only the number of survey responses is available for each course and the corresponding teacher evaluations. Consequently, no information is available regarding response rates, gender, age, or other sample characteristics. While this represents a limitation of the study, the volume of available data provides a sufficient basis for interpreting SET scores across the key time periods.

We defined a “Period” variable to categorize the data into pre-COVID (spring 2015 to autumn 2019), COVID (spring 2020 to spring 2021), and post-COVID (autumn 2021 to spring 2022) periods. We also defined a “Course Type” variable, enabling the analysis to differentiate between Lectures, Practice courses, and a combination of the two, labeled Lecture and practice. All other courses (labs, seminars) were treated uniformly, as the research did not aim to investigate these course types.
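To make this coding concrete, the sketch below shows one way the Period and Course Type variables could be derived in Python. It is an illustration only: the (year, term) semester encoding and the raw course-type labels are assumptions, since the structure of the actual NEPTUN export is not documented here.

```python
# Minimal sketch of the Period and Course Type coding. The (year, term) and
# raw course-type fields are assumptions for illustration; the structure of
# the actual NEPTUN export is not described in this paper.

PRE_COVID = {(year, term) for year in range(2015, 2020) for term in ("spring", "autumn")}
COVID = {(2020, "spring"), (2020, "autumn"), (2021, "spring")}
POST_COVID = {(2021, "autumn"), (2022, "spring")}


def period(year: int, term: str) -> str:
    """Map a semester to the Period variable (pre-COVID / COVID / post-COVID)."""
    if (year, term) in PRE_COVID:
        return "pre-COVID"
    if (year, term) in COVID:
        return "COVID"
    if (year, term) in POST_COVID:
        return "post-COVID"
    return "out of scope"


def course_type(raw_type: str) -> str:
    """Map raw course types to the Course Type categories used in the analysis."""
    mapping = {
        "lecture": "Lecture",
        "practice": "Practice",
        "lecture and practice": "Lecture and practice",
    }
    # Labs, seminars, etc. are treated uniformly and not analyzed separately.
    return mapping.get(raw_type.strip().lower(), "Other")
```

For example, period(2020, "autumn") falls into the COVID category, while period(2021, "autumn") is coded as post-COVID.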

Results

IBM SPSS 25.0 for Windows (IBM Corp., Armonk, NY, USA) was used to conduct the analysis. The sample size (Table 2) is sufficient for the analysis (Nulty, 2008; Uttl et al., 2017). Kruskal-Wallis H tests were conducted to examine differences in the ordinal data (Field, 2017); this non-parametric approach is suitable for the dataset because the assumption of equality of variances was violated (Table A1). Mean values of the selected SET questions by Period can be found in Table 3. Since the individual SET questions assessed distinct constructs rather than a single underlying factor, scale reliability analysis was not conducted.
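For readers who wish to reproduce the procedure outside SPSS, the fragment below is a minimal sketch of the same analysis in Python; the long-format table and its "period" and "score" column names are assumptions, not the authors' actual data layout.

```python
# Minimal sketch (not the authors' SPSS workflow): Kruskal-Wallis H test across
# the three periods for one SET item, plus the rank-based epsilon-squared
# effect size. Assumes a hypothetical long-format DataFrame with a "period"
# column (pre-COVID / COVID / post-COVID) and a "score" column (ratings 1-4).
import pandas as pd
from scipy import stats


def kruskal_by_period(item_df: pd.DataFrame) -> dict:
    """Compare one SET item across the three periods."""
    groups = [g["score"].to_numpy() for _, g in item_df.groupby("period")]
    h_stat, p_value = stats.kruskal(*groups)
    n = len(item_df)
    # Rank-based epsilon-squared (Tomczak and Tomczak, 2014).
    epsilon_squared = h_stat * (n + 1) / (n**2 - 1)
    return {"H": h_stat, "p": p_value, "epsilon_squared": epsilon_squared}
```

In this workflow, a period effect would only be taken forward to post hoc testing if it cleared the Bonferroni-adjusted threshold described below and its effect size were non-negligible.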

Table 2. Frequency of feedback based on periods.

Table 3. Mean value of SET questions based on the period variable.

The first analysis addressed the first three hypotheses, investigating differences among the three Periods across all course types on the selected SET questions. A Bonferroni correction was applied to control for Type I error across the multiple comparisons (adjusted significance threshold of p = 0.01 for three tests, rounded down). The analysis revealed statistically significant (p < 0.001) differences among the periods (Table 4). However, these differences demonstrated negligible effects (ε² < 0.01), indicating minimal practical implications (Tomczak and Tomczak, 2014). For this reason, post hoc comparisons were not conducted, in line with recommendations to avoid overinterpreting trivial effects (Funder and Ozer, 2019).
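For reference, the rank-based epsilon-squared reported above can be obtained directly from the Kruskal-Wallis statistic, following Tomczak and Tomczak (2014):

```latex
\epsilon^{2}_{R} \;=\; \frac{H}{(n^{2}-1)/(n+1)} \;=\; \frac{H\,(n+1)}{n^{2}-1}
```

where H is the Kruskal-Wallis test statistic and n is the total number of ratings pooled across the three periods. With samples of this size, even trivial differences between periods reach p < 0.001 while ε² stays below 0.01, which is why the statistically significant results are not treated as practically meaningful.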

Table 4. Kruskal-Wallis test.

The second analysis focused on the Interactivity of Practice courses across the three periods. A Kruskal-Wallis H test with Bonferroni correction was again conducted. The test revealed significant differences (p < 0.001) among the three periods regarding the Interactivity of Practice courses (Table 5); however, the effect was negligible (ε² < 0.01) (Tomczak and Tomczak, 2014), so post hoc testing was not conducted.

Table 5. Kruskal-Wallis test.

A summary of the results can be found in Table 6.

Table 6. Hypothesis and results.

Discussion

Following Warfvinge et al. (2022), we initially suspected a decrease in SET scores attributable to the COVID-19 pandemic. Based on Ahmed et al. (2022), we examined SET questions related to the framework and requirements of the courses. Our findings indicate that perceptions of course structure and of the feasibility of course requirements during the COVID-19 period did not differ meaningfully from the other periods, despite statistically significant results. We suggest this is attributable to the inherent stability of well-established course structures and requirements. Furthermore, considering the unique circumstances of the pandemic, educators likely tailored certain requirements to online education, ensuring students had a fair and manageable workload throughout the semester.

The third hypothesis investigated the level of support teachers provided to students. We assumed that students' heightened need for support and motivation (Sepahi et al., 2021) would manifest in higher scores on the corresponding SET question. This assumption was not supported by the data; despite statistically significant differences in “teacher's professional, motivating and supportive attitude,” the practical difference was negligible. The data suggest that students' evaluation of teacher support remained similarly high across the pre-COVID, COVID, and post-COVID periods. While it is possible that students' need for support and motivation did not change considerably during the COVID-19 period, it seems more probable that teachers were able to provide the required amount of support just as effectively as before and after the pandemic.

The final hypothesis focused on the Interactivity of Practice courses, as the transition of these courses may have been particularly challenging (Means et al., 2020), potentially leading to decreased SET evaluations. The analysis established a statistically significant, yet negligible, difference regarding the Interactivity of Practice courses. It is important to note that interactivity does not solely rely on in-person education, as online platforms offer a variety of tools to ensure collaboration and interactivity during lessons. This novelty, brought about by the pandemic, could have factored into the evaluation of interactivity.

Contrary to LeBlanc (2021), we did not find a deterioration in SET scores; however, our results align with the relatively high overall evaluation of the selected SET questions observed in his study. Boysen (2023) and Fullerton et al. (2024) reported significant differences, with the COVID-19 period receiving higher SET scores, but these effects were also marginal. A key novelty of this research is the inclusion of the post-COVID period in the analysis, providing a more comprehensive context for comparison. Our results support the conclusion that no meaningful negative change can be detected in SET scores during the COVID-19 pandemic.

We acknowledge limitations in the data collection process, as this secondary analysis relied on an anonymized dataset provided by the university. To comply with data protection guidelines, student identities and feedback were anonymized to the extent that only the number of survey responses was available. Consequently, no information was accessible regarding response rates, gender, age, or other sample characteristics. Incentives offered to students for completing SET surveys may distort responses; however, they can also increase the overall volume of evaluable data (Lukáts et al., 2023). The use of a four-point Likert scale may polarize student responses, potentially affecting the results. A further limitation is that the outcomes are single-item measures, so internal-consistency reliability is not applicable. Although the validity of using the mean value as an indicator has been questioned (Fullerton et al., 2024), our analysis appears to support contemporary findings on the subject. While limitations exist, the volume of available data provides a sufficient basis for interpreting SET scores across the key time periods.

Our research aimed to provide factual data, based on a large sample size, regarding the quality of education before, during, and after the COVID-19 pandemic. Our objective was to understand how students experienced education during these periods and to contribute to the academic literature concerning their feedback. By building on results from a sufficient sample size and providing contextual information, we aimed to offer useful insights despite the controversy surrounding the value of SET questionnaires.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by Institute Research Ethics Committee of the Institute of Psychology at Károli Gáspár University of the Reformed Church in Hungary. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants' legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

MP: Writing – review & editing, Writing – original draft, Formal analysis. ST: Writing – original draft, Writing – review & editing, Formal analysis, Data curation, Methodology. RT: Data curation, Formal analysis, Methodology, Writing – review & editing, Conceptualization, Supervision, Writing – original draft.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abu Talib, M., Bettayeb, A. M., and Omer, R. I. (2021). Analytical study on the impact of technology in higher education during the age of COVID-19: systematic literature review. Educ. Inf. Technol. 26, 6719–6746. doi: 10.1007/s10639-021-10507-1

Adedoyin, O. B., and Soykan, E. (2020). Covid-19 pandemic and online learning: the challenges and opportunities. Interact. Learn. Environ. 31, 863–875. doi: 10.1080/10494820.2020.1813180

Ahmed, N., Nandi, D., and Zaman, A. G. M. (2022). Analyzing student evaluations of teaching in a completely online environment. Int. J. Modern Educ. Comput. Sci. 14:13. doi: 10.5815/ijmecs.2022.06.02

Alam, M., and Singh, P. (2021). Performance feedback interviews as affective events: an exploration of the impact of emotion regulation of negative performance feedback on supervisor–employee dyads. Hum. Resour. Manag. Rev. 31:100740. doi: 10.1016/j.hrmr.2019.100740

Ali, A., Crawford, J., Cejnar, L., Harman, K., and Sim, K. N. (2021). What student evaluations are not: scholarship of teaching and learning using student evaluations. J. Univ. Teach. Learn. Pract. 18. doi: 10.53761/1.18.8.1

Angelova, M. (2020). Students' attitudes to the online university course of management in the context of COVID-19. Int. J. Technol. Educ. Sci. 4, 283–292. doi: 10.46328/ijtes.v4i4.111

Arnold, I. J. M., and Versluis, I. (2019). The influence of cultural values and nationality on student evaluation of teaching. Int. J. Educ. Res. 98, 13–24. doi: 10.1016/j.ijer.2019.08.009

Barrow, M., and Grant, B. M. (2016). Changing mechanisms of governmentality? Academic development in New Zealand and student evaluations of teaching. High. Educ. 72, 589–601. doi: 10.1007/s10734-015-9965-8

Baxter, E. P. (1991). The teval experience, 1983–88: the impact of a student evaluation of teaching scheme on university teachers. Stud. High. Educ. 16, 151–178. doi: 10.1080/03075079112331382954

Berezvai, Z., Lukáts, G., and Molontay, R. (2020). Can professors buy better evaluation with lenient grading? The effect of grade inflation on student evaluation of teaching. Assess. Eval. High. Educ. 46, 1–16. doi: 10.1080/02602938.2020.1821866

Berezvai, Z., Lukáts, G. D., and Molontay, R. (2019). A pénzügyi ösztönzők hatása az egyetemi oktatók osztályozási gyakorlatára [The effect of financial incentives on university instructors' grading practices]. Közgazdasági Szemle 66, 733–750. doi: 10.18414/KSZ.2019.7-8.733

Berniak-Wozny, J., Rataj, M., and Plebańska, M. (2021). The impact of learning mode on student satisfaction with teaching quality: evaluation of academic staff teaching before and during COVID-19. Educ. Sci. 11:643. doi: 10.35808/ersj/2497

Boring, A., and Ottoboni, K. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. Sci. Open Res. doi: 10.14293/S2199-1006.1.SOR-EDU.AETBZC.v1

Boyle, F., and Cook, E. J. (2023). Developmental evaluation of teaching quality: evidencing practice. J. Univ. Teach. Learn. Pract. 20. doi: 10.53761/1.20.01.11

Boysen, G. A. (2023). Student evaluations of teaching during the COVID-19 pandemic. Scholarsh. Teach. Learn. Psychol. 9, 254–263. doi: 10.1037/stl0000222

Campos, E., Daruich, S. D. N., de la O, J. F. E., Castaño, R., Escamilla, J., and Hosseini, S. (2022). Educational model transition: Student evaluation of teaching amid the COVID-19 pandemic. Front. Educ. 7. doi: 10.3389/feduc.2022.991654

Chang, L. (1994). A psychometric evaluation of 4-point and 6-point Likert-type scales in relation to reliability and validity. Appl. Psychol. Meas. 18, 205–215. doi: 10.1177/014662169401800302

Charytanowicz, M., Zoła, M., and Suszyński, W. (2022). The impact of the COVID-19 pandemic on higher education: assessment of student performance in computer science. PLOS ONE 17:e0273135. doi: 10.1371/journal.pone.0305763

Clayson, D. (2009). Student evaluations of teaching: are they related to what students learn? J. Market. Educ. 31, 16–30. doi: 10.1177/0273475308324086

Clayson, D. (2022). The student evaluation of teaching and likability: what the evaluations actually measure. Assess. Eval. High. Educ. 47, 313–326. doi: 10.1080/02602938.2021.1909702

Cook, C., Jones, J., and Al-Twal, A. (2022). Validity and fairness of utilising student evaluation of teaching (SET) as a primary performance measure. J. Furth. High. Educ. 46, 172–184. doi: 10.1080/0309877X.2021.1895093

Daniel, S. J. (2020). Education and the COVID-19 pandemic. Prospects 49, 91–96. doi: 10.1007/s11125-020-09464-3

Esarey, J., and Valdes, N. (2020). Unbiased, reliable, and valid student evaluations can still be unfair. Assess. Eval. High. Educ. 45, 1106–1120. doi: 10.1080/02602938.2020.1724875

Feistauer, D., and Richter, T. (2018). Validity of students' evaluations of teaching: biasing effects of likability and prior subject interest. Stud. Educ. Eval. 59, 168–178. doi: 10.1016/j.stueduc.2018.07.009

Field, A. (2017). Discovering Statistics using IBM SPSS Statistics (North American Ed.). SAGE Publications.

Fullerton, J. A., Kendrick, A., and Schoeneman Jr, J. P. (2024). A “COVID bump” in communication course evaluations: implications for future assessment. J. Mark. Commun. 30, 221–241. doi: 10.1080/13527266.2023.2273537

Funder, D. C., and Ozer, D. J. (2019). Evaluating effect size in psychological research: sense and nonsense. Adv. Methods Pract. Psychol. Sci. 2, 156–168. doi: 10.1177/2515245919847202

Garris, C. P., and Fleck, B. (2022). Student evaluations of transitioned-online courses during the COVID-19 pandemic. Scholarsh. Teach. Learn. Psychol. 8, 119–139. doi: 10.1037/stl0000229

Golding, C., and Adam, L. (2016). Evaluate to improve: useful approaches to student evaluation. Assess. Eval. High. Educ. 41, 1–14. doi: 10.1080/02602938.2014.976810

Goos, M., and Salomons, A. (2017). Measuring teaching quality in higher education: assessing selection bias in course evaluations. Res. High. Educ. 58, 341–364. doi: 10.1007/s11162-016-9429-8

Guo, S. (2020). Synchronous vs asynchronous online teaching of physics during the COVID-19 pandemic. Phys. Educ. 55:065007. doi: 10.1088/1361-6552/aba1c5

Jin, Y. Q., Lin, C. L., Zhao, Q., Yu, S. W., and Su, Y. S. (2021). A study on traditional teaching method transferring to e-learning under the Covid-19 pandemic: from Chinese students' perspectives. Front. Psychol. 12. doi: 10.3389/fpsyg.2021.632787

Kazainé, A. (2021). Online vagy hagyományos tantermi oktatás? Hallgatói elégedettség kérdőíves felmérése egy rendkívüli oktatási félévről [Online or traditional classroom education? A questionnaire survey of student satisfaction in an extraordinary semester]. Education 30, 508–514. doi: 10.1556/2063.30.2021.3.10

Kreitzer, R. J., and Sweet-Cushman, J. (2022). Evaluating student evaluations of teaching: a review of measurement and equity bias in sets and recommendations for ethical reform. J. Acad. Ethics 20, 73–84. doi: 10.1007/s10805-021-09400-w

Lakeman, R., Coutts, R., Hutchinson, M., Massey, D., Nasrawi, D., Fielden, J., et al. (2023). Playing the SET game: how teachers view the impact of student evaluation on the experience of teaching and learning. Assess. Eval. High. Educ. 48, 749–759. doi: 10.1080/02602938.2022.2126430

LeBlanc, H. P. (2021). COVID-19 effects on communication course and faculty evaluations. J. Mass Commun. Educ. 76, 469–476. doi: 10.1177/10776958211034116

Lin, P.-H., Huang, L.-R., and Lin, S.-H. (2022). Why teaching innovation matters: evidence from a pre- vs peri-COVID-19 pandemic comparison of student evaluation data. Front. Psychol. 13. doi: 10.3389/fpsyg.2022.963953

Linse, A. R. (2017). Interpreting and using student ratings data: guidance for faculty serving as administrators and on evaluation committees. Stud. Educ. Eval. 54, 94–106. doi: 10.1016/j.stueduc.2016.12.004

Lukáts, G. D., Berezvai, Z., and Molontay, R. (2023). Assessing the effects of a reformed system of student evaluation of teaching. Periodica Polytechnica Soc. Manag. Sci. 31, 164–177. doi: 10.3311/PPso.18817

Means, B., Peters, V., Neisler, J., and Griffiths, R. (2020). STEM Courses during the COVID Pandemic: Lessons from Spring 2020. Digital Promise Global.

Murray, D., Boothby, C., Zhao, H., Minik, V., Bérubé, N., Larivière, V., et al. (2020). Exploring the personal and professional factors associated with student evaluations of tenure-track faculty. PLOS ONE 15:e0233515. doi: 10.1371/journal.pone.0233515

Murray, H. G. (1997). Does evaluation of teaching lead to improvement of teaching? Int. J. Acad. Dev. 2, 8–23. doi: 10.1080/1360144970020102

Nowell, C. (2007). The impact of relative grade expectations on student evaluation of teaching. Int. Rev. Econ. Educ. 6, 42–56. doi: 10.1016/S1477-3880(15)30104-3

Nulty, D. D. (2008). The adequacy of response rates to online and paper surveys: what can be done? Assess. Eval. High. Educ. 33, 301–314. doi: 10.1080/02602930701293231

Park, E., and Dooris, J. (2020). Predicting student evaluations of teaching using decision tree analysis. Assess. Eval. High. Educ. 45, 776–793. doi: 10.1080/02602938.2019.1697798

Penny, A. R. (2003). Changing the agenda for research into students' views about university teaching: four shortcomings of srt research. Teach. High. Educ. 8, 399–411. doi: 10.1080/13562510309396

Pineda, P., and Steinhardt, I. (2023). The debate on student evaluations of teaching: global convergence confronts higher education traditions. Teach. High. Educ. 28, 859–879. doi: 10.1080/13562517.2020.1863351

Reinsch, R. W., Goltz, S. M., and Hietapelto, A. B. (2020). Student evaluations and the problem of implicit bias. J. Coll. Univ. Law 45, 114–139. Available online at: https://digitalcommons.mtu.edu/michigantech-p/1712

Reis, C. F. S., Simões, M. F. S., and Flores-Tena, M. J. (2021). Students' pre and during COVID-19 perception of higher education switch to online: an exploratory study in Portugal. Cypriot J. Educ. Sci. 16, 2368–2377. doi: 10.18844/cjes.v16i5.6352

Rodrigo, C., Herbert, C., Saunders, D., Thomas, S., and Polly, P. (2022). Influence of COVID-19 restrictions on student satisfaction with undergraduate pathology teaching in an Australian university. Front. Educ. 7. doi: 10.3389/feduc.2022.1014906

Safavi, S. A., Bakar, K. A., Tarmizi, R. A., and Alwi, N. H. (2013). Faculty perception of improvements to instructional practices in response to student ratings. Educ. Assess. Eval. Account. 25, 143–153. doi: 10.1007/s11092-013-9160-3

Salvo-Garrido, S., Sagner-Tapia, J., Bravo-Sanzana, M., and Torralbo, C. (2022). Profiles of good teaching practices in STEM disciplines: an analysis of mixed methods of academic and assessment variables of teaching in the first cycle of civil engineering. Front. Educ. 7. doi: 10.3389/feduc.2022.849849

Sánchez, T., Gilar-Corbi, R., Castejón, J.-L., Vidal, J., and León, J. (2020). Students' evaluation of teaching and their academic achievement in a higher education institution of ecuador. Front. Psychol. 11. doi: 10.3389/fpsyg.2020.00233

Sepahi, V., Salari, F., Khoshay, A., and Rezaei, M. (2021). Evaluation of professors toward e-learning during COVID-19 and its associated factors from the perspective of the students of kermanshah university of medical sciences (2020). Educ. Res. Med. Sci. 10. doi: 10.5812/erms.111994

Spooren, P., Brockx, B., and Mortelmans, D. (2013). On the validity of student evaluation of teaching: the state of the art. Rev. Educ. Res. 83, 598–642. doi: 10.3102/0034654313496870

Spooren, P., and Christiaens, W. (2017). I liked your course because I believe in (the power of) student evaluations of teaching (SET): students' perceptions of a teaching evaluation process and their relationships with SET scores. Stud. Educ. Eval. 54, 43–49. doi: 10.1016/j.stueduc.2016.12.003

Stein, S. J., Goodchild, A., Moskal, A., Terry, S., and McDonald, J. (2021). Student perceptions of student evaluations: enabling student voice and meaningful engagement. Assess. Eval. High. Educ. 46, 837–851. doi: 10.1080/02602938.2020.1824266

Stonebraker, R. J., and Stone, G. S. (2015). Too old to teach? The effect of age on college and university professors. Res. High. Educ. 56, 793–812. doi: 10.1007/s11162-015-9374-y

Tangalakis, K., Kelly, K., KonYu, N., and Hall, D. (2022). The impact of teaching from home during the covid-19 pandemic on the student evaluations of female academics. J. Univ. Teach. Learn. Pract. 19, 160–175. doi: 10.53761/1.19.1.10

Tomczak, M., and Tomczak, E. (2014). The need to report effect size estimates revisited: an overview of some recommended measures of effect size. Trends Sport Sci. 1, 19–25.

Ulker, N. (2021). How can student evaluations lead to improvement of teaching quality? A cross-national analysis. Res. Post Compul. Educ. 26, 19–37. doi: 10.1080/13596748.2021.1873406

Uttl, B. (2021). “Lessons Learned from Research on Student Evaluation of Teaching in Higher Education,” in Student Feedback on Teaching in Schools: Using Student Perceptions for the Development of Teaching and Teachers eds. W. Rollett, H. Bijlsma, and S. Röhl (Springer Nature Switzerland AG), 237–256. doi: 10.1007/978-3-030-75150-0_15

Uttl, B. (2023). Student Evaluation of Teaching (SET): why the emperor has no clothes and what we should do about it. Hum. Arenas 7. doi: 10.1007/s42087-023-00361-7

Uttl, B., White, C. A., and Gonzalez, D. W. (2017). Meta-analysis of faculty's teaching effectiveness: student evaluation of teaching ratings and student learning are not related. Stud. Educ. Eval. 54, 22–42. doi: 10.1016/j.stueduc.2016.08.007

Warfvinge, P., Löfgreen, J., Andersson, K., Roxå, T., and Åkerman, C. (2022). The rapid transition from campus to online teaching—how are students' perception of learning experiences affected? Eur. J. Eng. Educ. 47, 211–229. doi: 10.1080/03043797.2021.1942794

Wolbring, T., and Treischl, E. (2016). Selection bias in students' evaluation of teaching. Res. High. Educ. 57, 51–71. doi: 10.1007/s11162-015-9378-7

Zabaleta, F. (2007). The use and misuse of student evaluations of teaching. Teach. High. Educ. 12, 55–76. doi: 10.1080/13562510601102131

Zhang, L., Carter, R. A. Jr., Qian, X., Yang, S., Rujimora, J., and Wen, S. (2022). Academia's responses to crisis: a bibliometric analysis of literature on online learning in higher education during COVID-19. Br. J. Educ. Technol. 53, 620–646. doi: 10.1111/bjet.13191

Zhao, L., Xu, P., Chen, Y., and Yan, S. (2022). A literature review of the research on students' evaluation of teaching in higher education. Front. Psychol. 13. doi: 10.3389/fpsyg.2022.1004487

Appendix

Table A1. Levene's test of equality of error variances.

Keywords: SET, COVID-19, student feedback, higher education, STEM-field

Citation: Pusker M, Takács S and Takács R (2025) Student evaluation of teaching prior, during and after the COVID-19 pandemic. Front. Educ. 10:1593000. doi: 10.3389/feduc.2025.1593000

Received: 13 March 2025; Accepted: 02 September 2025;
Published: 18 September 2025.

Edited by:

Gladys Sunzuma, Bindura University of Science Education, Zimbabwe

Reviewed by:

Valentine Ukachukwu Okwara, University of the Free State, South Africa
Svetoslav Atanasov, Trakia University, Bulgaria

Copyright © 2025 Pusker, Takács and Takács. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Máté Pusker, pusker.mate@kre.hu
