Gamification tailored for novelty effect in distance learning during COVID-19

The pandemic led to increased use of online teaching tools. One such tool, which might have helped students stay engaged despite the distance, is gamification. However, gamification is often criticized because of its novelty effect, while others state that novelty is a natural part of gamification. Therefore, we investigated whether the novelty effect of gamification brings incremental value in comparison to other novelties in a course. We created achievement- and socialization-based gamification connected to coursework and practice tests. We then measured students’ behavioral engagement and performance in a quasi-experiment. On the one hand, results show that ICT students engaged and performed moderately better in a gamified condition than in the control condition over time. On the other hand, BA course results show no difference between the gamified and practice test conditions or in their novelty effects. We conclude that an external gamification system yields better results than a classical design but does not exceed the effect of practice tests.


Introduction
Gamification is an intentional or emergent transformation of a non-game environment to a more game-like state (Koivisto and Hamari, 2019). Throughout the last decade, gamification has been used in the workplace, in school education, with various mobile apps for healthcare, fitness, self-learning, and many other contexts (Huotari and Hamari, 2017; Looyestyn et al., 2017; Sardi et al., 2017). Notably, implementing a gamified design in such a context usually aims to change behavioral outcomes, performance, motivation, and attitudes (Treiblmaier and Putz, 2020). This can be done using game elements such as badges, leaderboards, and narratives, out of which the most common is the PBL triad: points, badges, and leaderboards. Previous studies have shown that well-designed gamification has the potential to promote motivation, increase behavioral task engagement (Looyestyn et al., 2017), and improve performance (Landers et al., 2017). However, many researchers admit that the gamification effect may rather be caused by a novelty effect (Koivisto and Hamari, 2019). A novelty effect means that if we add something new to the environment, people get curious and temporarily more engaged with the environment. For instance, as gamification is something new to the environment, it may temporarily affect the users' outcomes due to such a novelty effect. More specifically, the users' initial gain in engagement and performance has often declined over time in gamification studies (Farzan et al., 2008; Hamari, 2013). Raftopoulos (2020) even suggests that we should consider such a novelty effect a systemic design feature of gamification. That is, we should strive to reap its benefits, for example, by irregularly adding something new to the design or redesigning the whole gamification once in a while.
However, even if we consider the novelty effect a feature of gamification rather than an intervening variable, we may ask whether it brings incremental value compared with other novelties in a course. This study aims to examine how implementing an achievement- and social-based gamification design to enhance knowledge gain and retention in a university course differs, in the short term and the long term, from adding several practice tests with simple feedback. Therefore, we address how effective gamification is and how useful it is to gamify compared to providing other novel catalyzers of change in an educational context.

Gamification and its novelty effect in education
Many researchers have extensively examined the potential effect of various gamification designs in education (such as the effect on motivation or learning performance) throughout the last decade (Koivisto and Hamari, 2019). However, the results appear to be somewhat mixed. In their review, Koivisto and Hamari (2019) found mainly positive effects of gamification in education/learning experiments (68%, 19 studies), but also a substantial number of null or ambiguous effects (25%, seven studies). Similarly, a review of gamified second-language acquisition in higher education (Boudadi and Gutiérrez-Colón, 2020) shows mainly positive (73%, 11 studies) with some ambiguous (20%, three studies) results. As the authors of both reviews agree, the unclear effect may primarily be caused by the type of gamification (e.g., Duolingo is less suitable for more skilled language learners) or by small sample size. Other recent reviews also agree that negative results occur rarely and may be caused by incompatibility with the user, the gamification type, or the type of educational content (Zainuddin et al., 2020a; Metwally et al., 2021). Thus, we may assume that gamification has a predominantly positive effect in an educational context. This is also in line with a recent meta-analysis which found a medium positive effect of gamification in terms of performance feedback and enjoyment (Bai et al., 2020).
Recent large-sample studies further support this finding. For example, Legaki et al. (2021) showed that when using a gamified app in a forecasting course, students had better learning outcomes than students who only saw a lecture or read a book after the lecture. Legaki et al. (2020) also found a similar effect in a long-term experiment during a statistics course with 365 other students. Furthermore, El-Beheiry et al. (2017) concluded that gamifying a virtual reality surgery simulator through points and competition leads to higher simulator use and improved performance in comparison to providing students only with the simulator. Not only does this show that the gamification effect may add to the experiential learning of a simulator (Kolb et al., 2001), but also that the novelty of gamification may increase the supposed novelty effect of a simulator if we consider it a feature. This is crucial to our research aim, as it means the effect of gamification and its novelty may add to the effect of other new course features and their novelty. The importance of this generalization is further emphasized by the fact that simulations are often difficult to distinguish from gamification and other game-based learning (van Gaalen et al., 2020).
The idea that a novelty effect might play a role in learning behavior and outcomes is supported by the manifestation of the said effect in multiple studies. Farzan et al. (2008) report that the initial spike in contributions to a gamified organizational network diminished, possibly because the system did not adjust to the users (i.e., stopped being novel to them). Additionally, Koivisto and Hamari (2014) observed that perceived enjoyment, playfulness, and usefulness of a gamified exercise application decreased with time, suggesting a novelty effect was at play. As this novelty effect on exercising was more substantial for younger users, we should inspect its role in educational gamification. This is further supported by a gamified long-term quasi-experiment where students with gamification outperformed the control group in Test 1, but did not differ from the control group in Tests 2 and 3 (Sanchez et al., 2020).
In contrast, van Roy and Zaman (2018) found no novelty effect in a course with long-term gamification. Instead, they found a curvilinear relationship (first negative, later positive) between the time of use and autonomous and controlled motivation, as defined by self-determination theory (Deci and Ryan, 2015). Such a discrepancy in the occurrence of the novelty effect could be explained by the fact that their gamification introduced something new each week. This is consistent with the remarks of Raftopoulos (2020) and also with Tsay et al. (2019), who stated that adding new interactions later in the second year of a gamified two-term course mitigated a drop in engagement after the novelty effect wore off.
However, one could ask whether regularly transforming the gamification is too costly compared to other methods that could be sustained more easily, and whether gamification should rather be used in short-term tasks where even shallow gamification leads to positive results (Lieberoth, 2014). Therefore, we must not only compare gamification with a control group like Tsay et al. (2019), but also compare gamification with other novel elements that may improve students' behavior and outcomes (such as quizzes or chatbots). The ideal time for such experiments was the first semester with COVID-19, as the sudden change to full-time online teaching brought the necessity of engaging students in new ways.

Students' engagement during the pandemic
The first studies on the pandemic unsurprisingly, yet sadly, report that the time has been quite hard on students. Although students usually had the time to study, their well-being and health suffered, leading to, for instance, stress, anxiety, and depressive symptoms (Safa et al., 2021), deterioration of family and peer relations (Morris et al., 2021), and even post-traumatic stress disorder and suicidality (Czeisler et al., 2020). Due to these and other adverse effects, students' learning performance and engagement could suffer if not cared for, especially since the restrictions thwarted many common methods of teaching, learning, and socialization (Morris et al., 2021). The fact that the words "distance learning" directly put distance between the students and their classes speaks for itself.
In distance learning during a pandemic, online tools and other technology-based methods are the key and most feasible approaches to teaching and studying. In order to narrow the distance, we need to seek out those methods which sustain students' engagement in courses. Previous studies have shown that methods such as online practice, online peer discussions, videos, and teleconferencing can maintain engagement and lead to good learning performance (Campillo-Ferrer and Miralles-Martínez, 2021). Consequently, we may ask whether gamification has the potential to maintain or even increase students' engagement and which method would be more cost-effective with respect to possible novelty effects.
Frontiers in Education 03 frontiersin.org
Although we could not have foreseen the COVID-19 pandemic, a blessing in disguise was that we had decided to implement our gamification and practice tests (i.e., online practice) with simple feedback in the spring term, during which the pandemic began and all universities in the Czech Republic closed down. Accordingly, most courses were just a combination of online lectures and self-study. This gave us a unique opportunity to examine whether gamification helps students learn and remain engaged in the course compared to practice tests (i.e., a quasi-experiment) and no changes to the course. Thus, we hypothesize the following:
H1: Students with gamification engage more in course materials than those without it both in the short-term and over time.
H2: Students perform better in practice tests with gamification than without it both in the short-term and over time.
H3: Students with gamification perform better in the course than those without gamification.

Participants and study plan
The sample of our quasi-experiment included 278 Czech university students from three courses: two in the field of business administration (BA; 120 and 65 students) and one in information and communication technologies (ICT; 93 students). We chose this sample because of the course size, which allowed us to split the sample into two parts in 1 year. Also, the same teachers taught both BA courses and the courses had similar requirements, difficulty, and topics (psychology in HR). The courses differed only in the degree of study and thus were easily comparable. For one half of the students in the BA courses, we implemented only the practice tests; for the other half of BA students, we implemented the tests and their gamification. Contrastingly, we redesigned the whole coursework in the ICT course, creating a gamified and a non-gamified version of the course. However, we built the gamification with the same system, game elements, and aesthetics as the BA gamification. In all courses, the gamified and the non-gamified groups were distributed equally and randomly.
This study plan allowed us to examine the gamification novelty effect compared to other novelty effects and to the original course design, which would not have been possible to do only in the BA courses due to sample size. This way we were able to examine the generalizability of gamification across various uses of one gamified system (i.e., adding something new with it vs. redesigning current coursework). Unfortunately, the pandemic weakened such comparability as it led to one unexpected difference between the BA and ICT courses and gamification. Originally, there were supposed to be practice tests in all courses. However, the tests were canceled in the ICT course because they were not ready to be converted to online administration without the risk of flaws and cheating.
On the other hand, this shift to distance learning provided us with the opportunity to tailor the gamification to the current situation: we decided to emphasize the gamification features connected to the problems of distance learning in our design (i.e., how to keep students focused during online seminars and lectures, how to help them be proactive in them, how to help them practice gained knowledge in-between lectures, how to get some feedback without continuous direct contact with the lecturer, how to help students communicate with one another).

Design
We prepared a quasi-experimental design to inspect the difference in gamification and control condition over the course of 12 weeks. Examining the differences over time allowed us to observe the changes in them, thus granting us the possibility to infer about the novelty effects. However, we first examined from what the students would benefit the most.
In the BA courses, we first identified possible gaps in their design by interviewing the lecturers, investigating the syllabuses, and examining the yearly course outputs (grade points, students' feedback, etc.). Based on this, we concluded that students could benefit from the possibility of exercising their critical thinking over cases from a workbook. We also determined that students need to solidify the knowledge gained after each lecture and seminar. Thus, we translated a suitable English workbook by Robbins and Judge (2017), a companion to the handbook used in the class, into Czech and transferred it to online exercise tests divided by lecture topics. Each test consisted of five random questions from a larger pool to make the exercise brief and motivate students to repeat the tests. After each test submission, the students received simple feedback (the number of correct answers). We implemented these tests in the administrative system so that they became available only after the topic was discussed with the students. For those who participated in the gamification, we also linked these tests to the gamification system. These systems were then reviewed with course lecturers and discussed in cognitive interviews with previous students and IT experts. Based on these discussions, we finalized our instructions, exercise tests, and gamification.
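The test assembly and feedback described above can be illustrated with a minimal Python sketch. It assumes a hypothetical question pool stored as dictionaries with a `correct` key; the function names, the pool, and the answer format are illustrative assumptions and do not describe the administrative system actually used.

```python
import random


def draw_practice_test(question_pool, n_questions=5, rng=None):
    """Draw n_questions distinct questions at random from one topic's pool."""
    if rng is None:
        rng = random.Random()
    return rng.sample(question_pool, n_questions)


def simple_feedback(questions, answers):
    """Return the number of correct answers (the only feedback given)."""
    return sum(1 for q, a in zip(questions, answers) if q["correct"] == a)


# Hypothetical pool of 20 questions for one lecture topic.
pool = [{"id": i, "correct": "a"} for i in range(20)]
test = draw_practice_test(pool, rng=random.Random(0))
# Hypothetical student answers to the five drawn questions.
score = simple_feedback(test, ["a", "b", "a", "a", "c"])
```

Because each attempt draws a fresh sample, repeating a test exposes a student to different questions on the same topic, which is the property the repeat-test badges rely on.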
We based our gamification on the achievements, socialization, and immersion framework by Xi and Hamari (2019) while focusing on the first two parts of the framework for several reasons. First, achievement-based gamification has repeatedly been found reliable and valid in academic environments (Koivisto and Hamari, 2019), especially if we strive for a long-term effect on behavioral outcomes and performance (Kuo and Chuang, 2016). Second, such gamification can be easily based on self-determination theory by promoting students' competence through challenges and feedback, their autonomy by making the tests voluntary and giving them enough challenges to choose from, and their relatedness to others by sharing their gamification success, comparisons, and discussing the test questions and answers (Treiblmaier et al., 2018). Such gamification should motivate intrinsically rather than extrinsically (Tsay et al., 2019). In fact, there is longstanding evidence for this positive effect of satisfying basic psychological needs outside of gamification. If teachers give feedback supporting competence and if they support autonomy, students perform better (Guay et al., 2008). In work settings, supportive leadership with an individualized approach (i.e., transformational leadership) and skill-based compensation also lead to better performance and employee health (Gagné and Forest, 2008). Similar outcomes have been found in many other contexts (Deci and Ryan, 2008). Third, such gamification can also be in line with goal-setting theory. It sets clear goals through challenges and relative position leaderboards, that is, leaderboards showing only peers with similar scores (Landers et al., 2017; Koivisto and Hamari, 2019). Finally, with achievement- and socialization-based gamification, it was easy to add novel challenges and practice test types in order to sustain the positive outcomes of a novelty effect as proposed in previous research (Tsay et al., 2019; Raftopoulos, 2020).
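To make the idea of a relative position leaderboard concrete, the following is a minimal Python sketch. It assumes that "peers with similar scores" means a fixed window of neighbors in the ranking; the window size, the tie-breaking rule, and the data are hypothetical and are not taken from our gamification system.

```python
def relative_leaderboard(scores, student, window=2):
    """Return (rank, name, points) rows for the student and nearby peers.

    scores maps student names to points. Ties are broken alphabetically,
    which is an arbitrary choice made only for this illustration.
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    position = [name for name, _ in ranked].index(student)
    start = max(0, position - window)
    end = min(len(ranked), position + window + 1)
    return [
        (rank + 1, name, points)
        for rank, (name, points) in enumerate(ranked)
        if start <= rank < end
    ]


# Hypothetical scores for seven students.
scores = {"s1": 50, "s2": 40, "s3": 35, "s4": 30, "s5": 20, "s6": 10, "s7": 5}
board = relative_leaderboard(scores, "s4", window=1)
```

In this example the student ranked fourth sees only the third, fourth, and fifth rows, so the nearest goal (overtaking the next peer) stays within reach.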
We designed our challenges and achievements to motivate students to engage in several activities. First, as we wanted them to try and explore "the game," we created badges for logging in and opening the first test, among others. Second, we made test-completion badges of varying difficulty (i.e., one test with/without a full score, several tests with a certain number of points) to highlight the importance of trying tests that include various topics and gaining better scores. Third, we prepared badges for repeating the same tests to solidify the knowledge. Fourth, we devised time-constraint badges to give them feedback on their ability to finish the final exam in time. Fifth, to increase relatedness, we created badges for answering other students' questions and reporting system bugs. Such a design should correspond to the needs we found with our gap analysis and gamification design recommendations (Furdu et al., 2017; Mekler et al., 2017).
In the ICT course, we examined the design gaps by forming a focus group with the lecturers, observing the syllabus, and examining the course outputs. We discovered that students had not been very active in the course (low non-mandatory seminar attendance, low activity in seminar discussions). Furthermore, their continuous coursework had not reached the expected quality (homework, presentations, and graphics). Thus, we focused on students' buy-in, course activity (attendance, activity in and across seminars), and some coursework aspects (when they begin to work on a task, how they work, and what outcomes they present). We also used badges for logging in, looking in the forums for the first time, similar to BA courses. Moreover, we created badges of varying difficulty, time-constrained badges, and social interaction badges.
We also present the game elements, examples of their rules, and the objectives we strived for with these elements and rules in Appendix 2. This allows us to highlight the comparability and differences of the course objectives and the gamification design across them. Although the specific rules sometimes differ, the objectives and challenge types are very similar. Based on this, we assume the courses and their gamification are at least partially comparable.

Procedure
We introduced our research to students at the beginning of each course. In a randomly chosen half of the seminar groups, we presented our addition to the course with gamification, and in the other half without it. In this presentation, we described what the system looked like, how it functioned, the general purpose of the research, and the requirements if the students decided to participate. We also assured them of data anonymization and confidentiality. We then obtained informed consent and collected initial data. We collected their data on performance and engagement in the tests, course materials, and course performance continuously throughout the 12-week semester, without interruptions or pauses in data collection. The semester schedule and the research plan can be found in Appendix 1.

Manipulation
We gamified half of the seminar groups in the BA courses, while the others received only practice tests. We gamified half of the seminar groups in the ICT course while leaving the other half unchanged. Thus, we have two types of control groups.

Behavioral engagement
Based on the recommendation of Tsay et al. (2019), we measured behavioral engagement in multiple ways connected to what we were trying to achieve with our design. We expected the students to gain feedback from the practice tests (and gamification) and thus look more often and sooner in the given literature. Such an aim should have helped students learn in the disorganized times of the first pandemic wave in spring 2020 when it was difficult for them to grasp online education, be motivated, and access learning materials (Kohli et al., 2021). Thus, we observed how often the students looked into each of the course learning materials since the introduction of our gamification (i.e., the number of views of each course material in the information system). We looked into both the total number of views and the views per semester week (and lecture topic). We also examined how soon students opened their coursework. This variable was calculated as the number of days between the release of a weekly study material and the date on which the student first opened the material. We assume this is a valid measure of the success of our tailored design due to self-determination theory. Were we successful in designing the gamification in accordance with this theory, students should have been more autonomously motivated. In previous research, this meant participants stayed longer on task in a free-choice period (Ryan and Rigby, 2019) or that they reported more curiosity about the gamified activity (Treiblmaier and Putz, 2020). Thus, observing whether students use the materials more often and whether they go through various topics allowed us to examine further support of our hypotheses.
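The two engagement measures can be expressed as a short Python sketch. The dates, the function names, and the assumption that view logs are available as a list of dates per student and material are hypothetical; the sketch only illustrates the definitions given above, not the information system's actual export format.

```python
from datetime import date


def opening_delay_days(release_date, view_dates):
    """Days between a material's release and the student's first view.

    Returns None when the student never opened the material.
    """
    if not view_dates:
        return None
    return (min(view_dates) - release_date).days


def views_per_week(view_dates, semester_start):
    """Count views per semester week, with week 1 starting at semester_start."""
    counts = {}
    for viewed in view_dates:
        week = (viewed - semester_start).days // 7 + 1
        counts[week] = counts.get(week, 0) + 1
    return counts


# Hypothetical log for one student and one weekly study material.
semester_start = date(2020, 2, 17)
release = date(2020, 3, 2)
views = [date(2020, 3, 6), date(2020, 3, 4), date(2020, 3, 12)]

delay = opening_delay_days(release, views)
weekly = views_per_week(views, semester_start)
```

How to treat students who never opened a material (here, `None`) is a design choice that affects the opening-time measure and would need to be fixed before analysis.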
We also examined behavioral engagement with practice tests themselves: What amount of the weekly course topics the students tried to practice in the tests and how many tests the students went through, regardless of the topic. This measure's validity is based on the same principle as the previous one.

Performance
We were interested in several types of performance. The first was course performance, i.e., the number of grade points (GP) gained in the course. These points were obtained similarly across all courses (a project assignment and a final exam). Although GP can be subject to bias due to subjective evaluation, the measure is the most common objective outcome in education (Canfield et al., 2015), including gamification research (Domínguez et al., 2013). Furthermore, the GP evaluation is highly (and similarly) standardized both on the university and course level. Moreover, the course instructors were subjected to a blinding procedure as they did not know which students were in the experimental condition. Therefore, their evaluation could not be biased by an effort to help the experimenter or by attentional bias toward such students. Based on all of this, we assume GP are a valid measure of students' learning outcomes.
The second performance measure was practice test performance, measured in total points earned from practice tests by giving the right answers in multiple-choice questions. Although multiple-choice questions are one of the easier testing forms since students do not have to come up with the answer themselves, they are a valid measure of performance and a valid tool for knowledge practice (Considine et al., 2005). However, given that we were also interested in students' knowledge broadness and precision, we looked not only into total points (whether students answered correctly more often across all the various questions), but also into points per practice test (whether students answered more questions correctly in one test). We also note this second performance measure is relevant only to the BA courses as there were no practice tests in the ICT course.

Descriptive statistics
Starting with 284 Czech university students, we lost 10 participants to drop-out, and 121 were excluded because they did not engage in the practice tests or the gamification system. We attribute this huge sample loss (a limit to our study) to the pandemic, as for some students doing anything but the most necessary work could have been too much, especially in the chaotic times of the first wave. Thus, the sample available for data analysis consisted of 274 students, but the meaningful data consisted only of 153 students who were, on average, 21.69 years old (SD = 1.57). The majority were men (94, 61%). A total of 62 students came from the Bachelor BA course, 32 from the Master BA course, and 59 from the ICT course (see Table 1 for more stratification information). All variables except for material opening time (i.e., one of the measures of behavioral engagement) were non-normally distributed; thus, we used non-parametric tests. Specifically, points were negatively skewed due to low course difficulty, and other behavioral engagement measures (be it course or practice tests) were positively skewed. We present the main descriptive statistics and correlations in Tables 2, 3.
As expected, we found a positive relationship between grade points and the number of course materials studied, supporting the idea that behavioral engagement in a course may lead to better outcomes. Similarly, the grade points were higher when the material opening time was shorter. Further, the weak correlation between the two behavioral engagement variables supports the statement of Tsay et al. (2019) that such engagement should be assessed from multiple points of view when gamifying. Interestingly, although practice test measures are related to most of the other variables, only the points gained in them are related to grade points. This result may be explained by our large sample loss or by course difficulty.

Hypotheses testing - BA courses
We first performed two Mann-Whitney U tests to test the hypothesis (H1) that behavioral engagement in the course differs across conditions. As we did not find a significant difference in either the number of course materials the students went through [U(N_Game = 37, N_nGame = 57) = 1273.5, z = 1.7, p = 0.09] or the material opening time [U(N_Game = 37, N_nGame = 57) = 864.5, z = −1.47, p = 0.14], we did not gain support for H1 on the whole course level. However, as we were interested in the novelty effect, we also decided to examine whether gamification leads to decreased engagement over time. Although a repeated-measures mixed model would be best suited for this, we first looked into the visualization of the growth curve. The graphs for both the number of materials (Graph 1) and opening time (Graph 2) show that those in the gamified condition fare slightly better, as the number of studied materials is higher for them and their material opening time is lower. However, these differences are too small to continue with a sensible evidence-driven analysis. Simultaneously, we can see that the drops in engagement in the gamified condition somewhat copy the drops in the control condition. Thus, we could not support the first hypothesis that those involved in gamification would engage more in the course.
To test the hypothesis that students with gamification perform better in the practice tests (H2), we once again performed a Mann-Whitney U test on the total sum of points with no significant result [U(N_Game = 37, N_nGame = 57) = 1114.5, z = 0.62, p = 0.53] and on the sum of points per test with no significant result [U(N_Game = 37, N_nGame = 57) = 1027.5, z = −0.07, p = 0.95]. Regarding the growth curve (Graphs 3, 4), we found substantial differences at the start of the semester in total points, which diminished to an insubstantial difference. Thus, we did not proceed with a repeated-measures mixed model. Nevertheless, a positive gamification effect is noticeable at first in total points. That is, students with gamification initially tried out more tests and gained more points in general, but were not significantly more successful per test than those with practice tests only. Further, despite the reduction of the positive effect or even a converse effect across time, we found weak partial support for H2 that practice test performance would be higher with gamification in case of buy-in (i.e., in the short-term).
Finally, we also observed the gamification effect on course performance with no significant result [U(N_Game = 37, N_nGame = 57) = 900.5, z = −1.07, p = 0.29]. That is, we could not support the hypothesis (H3) that students would perform better in the exam with the gamified design.
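For readers who want to relate the reported U and z statistics, the following Python sketch shows the textbook large-sample normal approximation of the Mann-Whitney U statistic. It is an illustration under stated assumptions: it applies no tie or continuity correction, and the software used for the reported tests may have applied such corrections, so values need not match exactly.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation of U (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


# Group sizes and U statistics as reported for the two H1 tests.
z_materials = mann_whitney_z(1273.5, 37, 57)
z_opening = mann_whitney_z(864.5, 37, 57)
```

Under these assumptions, the approximation gives values within rounding distance of the z statistics reported for the two engagement measures; where scores contain many ties, an uncorrected approximation can deviate more.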

Hypotheses testing - ICT course
Testing the hypothesis (H1) that behavioral engagement is higher in the gamified course, we first performed two Mann-Whitney U tests. Although we did not find a significant difference in the number of course materials the students went through [U(N_Game = 29, N_nGame = 28) = 444.5, z = 0.625, p = 0.53], there was a medium positive effect (Me_Game = 1.71, Me_nGame = 4.18) on the material opening time [U(N_Game = 29, N_nGame = 28) = 265, z = −2.25, p < 0.05, d = 0.63]. Thus, we gained partial support for H1 at the whole course level. We further examined whether gamification leads to decreased engagement over time by looking at the growth curves for the number of materials (Graph 5) and the opening time (Graph 6). We can see that those in the gamified condition fare better in some sense as their material opening time is lower. However, these differences are too small to continue with a sensible evidence-driven analysis. Simultaneously, we can see that the drops in engagement in the gamified condition somewhat copy the drops in the control condition. Thus, we only found some support for the first hypothesis that those involved in gamification would engage more in the course. The difference in material opening time persists over time; however, it does not widen.
Regarding the hypothesis that students with gamification will perform better in the course (H3), we found a moderate effect via a Mann-Whitney U test [U(N_Game = 30, N_nGame = 29) = 569, z = 2.03, p < 0.05, d = 0.73] with a higher score in the gamified condition (Me_Game = 59.75, Me_nGame = 57). Our results for a course redesigned with gamification support this hypothesis.
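This section does not state how the effect sizes d were derived from the rank-based tests. As an illustration only, one commonly used conversion first computes r = z / sqrt(N) and then d = 2r / sqrt(1 - r^2). The Python sketch below applies this conversion to the reported opening-time test; the choice of conversion is an assumption and may differ from the procedure actually used.

```python
import math


def z_to_r(z, n_total):
    """Effect size r for a rank-based test: r = z / sqrt(N)."""
    return z / math.sqrt(n_total)


def r_to_d(r):
    """Convert r to Cohen's d: d = 2r / sqrt(1 - r^2)."""
    return 2 * r / math.sqrt(1 - r ** 2)


# Reported z and group sizes for material opening time in the ICT course.
r_opening = z_to_r(-2.25, 29 + 28)
d_opening = abs(r_to_d(r_opening))
```

Other conversions exist (for example, estimators based on group means and pooled standard deviations), and they can yield noticeably different values of d from the same data.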

Discussion
This study aimed to examine how behavioral engagement and knowledge gain in a university course differ during the pandemic if we redesign the whole course with gamification or if we expand the course with something novel that is either gamified or not (i.e., practice tests) and that is supposed to help the students with distance learning difficulties. Specifically, we hypothesized that both engagement (H1) and learning performance (H2, H3) would be higher with gamification focused on activity, feedback, and practice. Furthermore, we explored whether such gamification's novelty effect wears off compared to the control condition. This allowed us to assess both the immediate and long-term effects of our gamified course redesign or extension when using an external gamification system (i.e., a system that is not a part of what the students usually use in the course and the university administrative system) focused on achievements and socialization. In this sense, we have come to several significant findings. First, we found partial support for the positive effect of our achievement- and socialization-based gamification delivered via an external system. Namely, we found a moderate reduction in the length of time before students opened the course materials for the first time in the course redesigned with our gamification focused on feedback, activity, socialization, and performance. This initial difference persisted across the semester, although gamification did not lead to further widening effects across time. These results correspond with both review studies on the positive impact of gamification (Koivisto and Hamari, 2019) and previous studies on gamification and behavioral engagement (e.g., Çakıroğlu et al., 2017; Huang et al., 2018; Zainuddin et al., 2020b). Simultaneously, the results differ from previous novelty effect studies. We did not find the U-shaped positive difference of Rodrigues et al. (2022).
Therefore, it is possible that a downside of our study is that by adding something new consistently, we lost the possibility of a familiarization effect (the positive difference at the end of the study after a period of no difference, which is caused by knowing the game design better). However, the upside of our regular additions is that the significant engagement difference we found persisted across time, and where there was no negative difference, none was created, unlike in previous studies (Sanchez et al., 2020). At the same time, our results are in line with Tsay et al. (2019), who managed to overcome the novelty effect and sustain engagement when iterating their gamification based on student feedback and favored activities. Therefore, we recommend using our design where problems in distance or hybrid learning, and maybe even e-learning, arise from students' untimely or low commitment to course activities or from poor time management. However, we should note that the success of such a design is also highly dependent on a well-done gap analysis and the consequent fit of the design to the students and lecturers, to their needs, and to the environment.
At the same time, students in the gamified course did not open more materials than those in the control group. Similarly, extending the course with gamified practice tests showed no difference in either measure of engagement compared to extending it with non-gamified practice tests. This points to the interpretation that when using an external gamification system for distance learning, redesigning the whole course may be better than expanding the course with something novel that is either gamified or non-gamified. This is further backed by the conflicting evidence we found for H3 that students with gamification would perform better in the final exam because only the ICT course data supported this hypothesis.
However, there are other explanations for this inconsistency. For instance, BA courses may have differed from the ICT course. While the BA courses focused on both theory and practice, the ICT course was predominantly practical. Thus, working continuously may have, for example, seemed to be a more sensible goal in the ICT course. Furthermore, we could have designed a gamification that is more suitable for the ICT course students. Although the aesthetics and gamification system are identical, and although we strived for similar types of challenges which would simultaneously suit the environment, it is possible we created a design that is more fitting for the ICT course. Such an interpretation would also be in line with the work of Legaki et al. (2021), who found their gamification to be more suitable to engineering students than to BA students. Another reason may be the course difficulty. If BA courses were easy to get through, the course gaps could have been so small that we would not detect a meaningful difference. This is supported by the success rate in each course in the last 5 years and by the change in the success rate in 2020 (see Appendix 3). Moreover, the difference between ICT success rate in 2020 and in 2015-2019 shows a substantial improvement which provides further support for the effectiveness of our gamified redesign. However, we should also note the effect may have been caused or partially caused by the pandemic as teachers were more lenient during the first wave, at least in terms of deadlines (e.g., Armstrong-Mensah et al., 2020;Gillis and Krull, 2020).
The assumption about BA course difficulty is also supported by the fact that the exam performance of those who used the practice tests did not differ from those we excluded because they did not partake in the extracurriculars. Meaning, people who did not take up the offer to use extra course activities performed similarly to those partaking even though their behavioral engagement differed (see Appendix 3). It is also possible that the students prepared similarly well for the exam, but their other outcomes (e.g., enjoyment, long-term retention) may have differed. At the end of the semester, we asked the students how engaged they felt in the practice tests and participants with gamification felt more engaged in them than those without it (see Appendix 3). However, this difference in psychological engagement needs to be taken with a grain of salt as it is based only on one half of the sample. Simultaneously, the difference in behavioral engagement may be confounded. As we do not know why these participants chose not to use the tests (and gamification), it is possible the reason for this decision also led to the difference in behavioral engagement. Furthermore, the initial gap analysis of the ICT course led to finding more gaps in students' proactivity in that course than in the BA courses. Therefore, it is possible our gamification is more suitable in cases where activity needs more support. In sum, there may be some course or student differences which reduce the comparability across courses. Future research should thus use a non-manipulated control group to properly assess whether practice tests and gamification design work similarly well or do not differ from no addition to the course during distance learning. Similarly, researchers should focus on other outcomes in easy courses. Finally, we recommend examining students' attitudes toward similar gamification designs across various fields.
Our second major finding lies in the partial support for H2. Students gained more points in the practice tests with gamification at the beginning of the semester, yet not in general nor per test. Moreover, the initial difference diminished over time and eventually even reversed. There are multiple possible reasons for this development. For instance, we may have been unsuccessful in sustaining the novelty effect of gamification on test performance. Thus, while the students in gamified conditions started off better, their willingness to try out new tests and consequent higher test performance might have decreased once the novelty wore off. This is consistent with the results of Hanus and Fox (2015) who, similarly to us, created a gamification with achievements and leaderboards and found a drop in intrinsic motivation after a while.
However, such an interpretation contradicts the positive trend we can see in behavioral engagement measures. As the curves for practice test performance vastly changed after the outbreak of COVID-19, we might assume that the pandemic caused this discrepancy. According to Nieto-Escamez and Roldán-Tapia (2021), students in gamified experiments often stopped partaking in the gamification during the pandemic to conserve energy, owing to inadequate physical and psychological conditions. In our case, students would logically stop using the gamified system, even if they still used the practice tests. Thus, sustaining the novelty effect with novel additions to the gamification would not have an impact. Simultaneously, such conditions would explain our sample loss. This once again points to an assumption that in crisis times, such as the pandemic, achievement- and socialization-based gamified designs in education (and possibly even other fields) should focus more on sustaining students' activity in the current curriculum than in new extracurricular activities.
Finally, students with gamification might have focused on a different goal than those without gamification. Unlike the students in the control condition, they repeated the same tests more often to obtain better results in them (see Appendix 3). Meaning, our design might have prompted them to correct and learn from their mistakes. If this happened, it is possible students in the gamified condition opened fewer course materials because they learned through practice tests and their repetition and did not feel as high a need to revisit the materials as those in the control condition. This once again suggests that researchers and practitioners should choose carefully what the outcomes in gamified settings in various contexts should be. Simultaneously, we need to consider the circumstances of future users. This study and previous gamification studies conducted during the pandemic show that under high stress and other physical and mental difficulties, users may not be able to utilize some of the gamified designs they have at hand (e.g., Lelli et al., 2020; Liénardy and Donnet, 2020) as gamification tends to create a cognitive load on them (Suh, 2015). Given the differences in BA and ICT courses, we extrapolate this load might become heavier if we add new extracurricular activities to the course together with the gamified design. Therefore, redesigning the course without such activities may be a better solution in distance learning. Such a possibility should be further examined in future studies.
Altogether, we established some support for the assumption that our gamified design may be a suitable catalyst for change, even in pandemic times. Although our gamified design does not seem to work better than a more traditional and more easily developed teaching method (practice tests), it leads to better outcomes in comparison to using the most traditional methods of lectures, self-study, and homework without a gamified design. In a gamified course, students start working on their coursework earlier during the pandemic, corresponding with the results of Pakinee and Puritat (2021). However, unlike these authors, who created a more competitive gamification than us and whose students did not have any lectures with the teachers, we also found a moderate effect on students' final exam performance. This could mean, as the authors themselves suggest, that competitive gamification elements need to be chosen carefully with respect to the users and their personality. But we may also suggest that when redesigning a course with gamification for distance learning purposes, we should still use those traditional teaching methods we are able to and possibly even intertwine them with the gamified design. Such a proposition should be further examined, as we can expect distance learning will still be needed in the future due to its benefits (Goudeau et al., 2021) or the concurrent energy crisis and other crises.
GRAPH 1 N of studied materials over time.
GRAPH 2 t of opening materials over time.

Limitations and future work
Our study had several limitations. The main limitation is our sample loss, probably caused by the pandemic, as in other studies (Nieto-Escamez and Roldán-Tapia, 2021). Possibly, the differences we have found or were not able to find may have been caused by the specific sample loss as we lost more participants in the gamified BA courses. Further, although we could not foresee the pandemic, we might have shed some light on its role and other determinants of sample loss via qualitative interviews with drop-out students. While this was not feasible due to the pandemic, it once again shows that gamification data collections are best done in mixed-method designs. This is especially so since the first pandemic wave was chaotic, even though our university started using Zoom, Teams, and e-learning very early on. Initially, many students and teachers were not used to using technology that much and in such a way. Students who had to get more used to being online might have benefited more from the gamification and practice tests than the others, but simultaneously might not have had the capacity to utilize the opportunity, unlike those more used to online communication and work.
The second limitation is the lack of a true control group in the BA courses combined with the differences between the BA courses and the ICT course. Even though we strived for a similar design and system, the results in the courses may have been caused by course differences. Therefore, gamified and non-gamified practice tests may function similarly well and better than no extension to the BA courses while some extensions to the ICT course might work similarly well as redesigning it with gamification. Thus, we recommend using a very large course where multiple grouping would be possible with a reasonable sample size to further test our results. However, finding such a course, which also could be sensibly gamified, may prove difficult. Our results are also limited by possible individual differences in our sample. Previous research (e.g., Amo et al., 2020; Pakinee and Puritat, 2021) shows extraversion and trait competitiveness may play a role in a socialization-based gamification. Although our gamified designs were only partially socialization-based and competition was rather a marginal part of it, the individual differences may have played a role we could not tackle. Therefore, future research on the gamification novelty effect should consider these possible differences when designing, providing, and analyzing the gamification.
GRAPH 3 Points gained in exercise tests.
GRAPH 4 Points gained per practice tests.
GRAPH 5 N of studied materials over time.
Lastly, our gamified designs were not originally planned to resolve the problems stemming from the pandemic and distance learning. Thus, our gamified BA design could have been ill-suited for such a situation, leading to no significant results in the BA courses. However, if so, it proves that redesigning a whole course so that students attend course meetings, are active in them, start working on their coursework sooner and more efficiently may be a viable solution in distance learning. Meaning, researchers should focus on such goal types in future gamification of distance learning in order to further support these findings or to find out which ones are the most relevant to making distance learning more accessible and efficient.

Conclusion
The aim of this study was to examine the novelty effect and effect of gamification in comparison to another novelty in a course as well as to a non-manipulated condition during distance learning. We found that practice tests and gamified practice tests lead to similar results in engagement and practice test performance across time. Although such results point to the conclusion that gamification is not always the better (nor the worse) solution in terms of objective outcomes, we conclude using similar gamified design of practice tests may still be the better choice due to subjective outcome differences (e.g., in enjoyment) found in previous studies (e.g., Lieberoth, 2014;Treiblmaier and Putz, 2020). Furthermore, our engagement and practice test performance results are in contrast with Sanchez et al. (2020) who found that gamified practice tests led to a significant decrease in performance across time even in comparison with traditional practice tests. We ascribe this contrast to the fact that we added something novel to the gamified environment regularly across time as proposed by Raftopoulos (2020) and Tsay et al. (2019). Therefore, our first main contribution to educational gamification is that we should extend the design regularly in times of distance learning in order to prevent negative consequences of a novelty effect. This is also in line with the results we found when comparing a course redesigned with gamification and its original form. Students with the redesigned course performed better in the final exam and kept their higher engagement across time which is in contrast with previous novelty effect studies where they lost it (e.g., Koivisto and Hamari, 2014;Rodrigues et al., 2022). Once again, this points to the contribution that gamified designs should present something novel across time. Simultaneously, this meant the students were more pro-active in such a course and started working on their tasks sooner even in distance learning times. 
Thus, our second contribution is that redesigning a course with such an achievement- and socialization-based gamification in distance learning is a viable solution to close the distance, especially if we focus on coursework, pro-activity, and attendance. Although we should be mindful of the possible BA and ICT course differences, it also seems redesigning the whole course with gamification is more viable than redesigning only a course extension.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
Ethical review and approval were not required for the study on human participants in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.