Students' interactions with an artificial intelligence assistant in a remote chemistry laboratory

Lizano-Sánchez, Fiorella; Idoyaga, Ignacio Julio; Orduna, Pablo; Rodriguez-Gil, Luis; Arguedas-Matarrita, Carlos

doi:10.3389/feduc.2025.1712743

ORIGINAL RESEARCH article

Front. Educ., 05 November 2025

Sec. Digital Education

Volume 10 - 2025 | https://doi.org/10.3389/feduc.2025.1712743

This article is part of the Research Topic: Generative AI Tools & Software for Education

Fiorella Lizano-Sánchez¹

Ignacio Julio Idoyaga²

Pablo Orduna³

Luis Rodriguez-Gil⁴

Carlos Arguedas-Matarrita¹^*

¹Universidad Estatal a Distancia, San José, Costa Rica
²Universidad de Buenos Aires, Buenos Aires, Argentina
³LabsLand, San Francisco, CA, United States
⁴LabsLand, Bilbao, Spain

The integration of artificial intelligence (AI) in education has demonstrated potential for solving individualised learning challenges, particularly through virtual assistants in natural science subjects. This study analyses students' interactions with an intelligent assistant integrated into the acid-base titration II remote laboratory to characterise learning needs and identify the roles assigned to the assistant during experimental activities. The intelligent assistant used OpenAI GPT-4o, customised with laboratory-specific information. The interactions were analysed using a mixed methodology combining content analysis with descriptive statistical analysis. They were systematically categorised, revealing four main dimensions in the use of the assistant: development of remote experiences, data processing, application and conceptual understanding, and guidance for the preparation of laboratory reports. Students positioned the assistant mainly as a tutor for procedural and calculation support, a conceptual support resource connecting experimental practice with theoretical understanding, and a mediator in scientific communication for report preparation. This role diversification responded directly to students' main difficulties in experimental procedures, concentration calculations, theoretical understanding, and scientific writing. The study supports the need for AI-powered pedagogical support and demonstrates the versatility of artificial intelligence in remote laboratories.

1 Introduction

Artificial intelligence is becoming more important in education due to its popularisation and greater accessibility in recent years. This expansion has demonstrated its potential to solve long-standing problems such as individualised learning, adaptive assessment, and support throughout the educational process. The integration of AI in this field has been extensive, encompassing the utilisation of applications such as Khanmigo and Duolingo, as well as the adoption of virtual assistants in the field of natural science subjects (Dogru and Faulconer, 2025; Erümit and Sarialioglu, 2025).

The role of a virtual assistant depends on the design and pedagogical strategy with which it is integrated. Virtual assistants have been employed as virtual tutors that provide learning support; as academic assistants that facilitate administrative and organisational processes; as auxiliary motivators in the learning process in virtual or hybrid classrooms (Gubareva and Lopes, 2020); as facilitators that allow multimodal interaction with educational resources (Todericiu, 2025); and as analysis and monitoring tools that record interaction data, progress levels, and reports that teachers can use for decision-making (Sajja et al., 2023).

A variety of studies have demonstrated how AI is revolutionising the field of chemistry and its education. A study conducted by Berber et al. (2025) revealed that the most frequent applications of AI in the field of chemistry are: predicting protein structures, optimising processes to accelerate drug development, and predicting ozone concentrations in the atmosphere using machine learning. These advances highlight the necessity to incorporate AI literacy into education in order to prepare students for a field where these tools will become increasingly relevant to their professional development.

1.1 Intelligent assistant integrated with remote laboratories

The integration of artificial intelligence into the field of education has been progressive, with the development of innovative learning ecosystems resulting from the convergence of artificial intelligence with various emerging technologies. Among the documented applications, the integration of AI with augmented reality and virtual reality (Lampropoulos, 2025) and the application of AI in virtual laboratories (Paladines et al., 2021) are of particular note.

However, despite this diversity of applications, one emerging and relatively unexplored area is the integration of artificial intelligence in educational remote laboratories. The pioneering study in this area is being developed by the Remote Hub Lab at the University of Washington (Hussein et al., 2024).

A remote laboratory is real equipment that is accessed via the internet (Orduña et al., 2016), and is available at all times. There are remote laboratories in the areas of physics, biology and chemistry, and in the latter several developments have been made, including the Acid Base Titration II (Idoyaga et al., 2024).

As mentioned, remote laboratories are available 24 h a day, 7 days a week. The resulting limitation is that students do not have the support of teachers during experimental activities, because they use these laboratories at widely varying times (Arias-Navarro et al., 2024). In response to this limitation, a research project is being developed that focuses on integrating an intelligent assistant into the remote acid-base titration II laboratory, with the purpose of providing educational support to students during the experimental learning process.

This research has been carried out in three phases. A first study focused on the perspective of teachers regarding the potential use of artificial intelligence in remote laboratories (Lizano-Sánchez et al., 2025a). A second study evaluated the students' experience after using the intelligent assistant integrated into the remote acid-base titration II laboratory (Lizano-Sánchez et al., 2025b). The present article is the third phase, which presents a detailed analysis of the students' interactions with the intelligent assistant. Together, the three studies provide valuable information to optimise the operation of the assistant and establish guidelines for the effective implementation of this technology in remote laboratories.

1.2 Customisation of the assistant

The intelligent assistant was built using the OpenAI Assistants API, which has evolved over the past few years. It was progressively adapted to new versions and features, up to the GPT-4o model (May 2024) (OpenAI., 2024) used in this study. OpenAI has newer models, such as GPT-4.1 (May 2025) (OpenAI., 2025a), o3-mini (January 2025) (OpenAI., 2025b) or o1 (September 2024) (OpenAI., 2025b), which often perform better in benchmarks covering science (and Chemistry in particular) such as GPQA Diamond (Rein et al., 2023), and most of the newer ones cost less than GPT-4o. However, for the sake of consistency, all the experiments in this article were carried out using the same model (GPT-4o). There are also newer models (e.g., GPT-5, August 2025) (OpenAI., 2025c) that are not available in the OpenAI Assistants API. Other companies (Anthropic -Claude-, Google -Gemini-, DeepSeek -R1 and V3.1-, Meta -Llama-) also have newer, more advanced models at the time of this writing.

However, even if it is very tempting to use the latest model every time there is a new update, it is also impossible to draw conclusions if every experience uses a different model. For this reason, the authors decided not to upgrade the models until the current study was completed.

From a technical standpoint, our assistant comprises three main components, all hosted and integrated within the LabsLand remote-laboratory platform, which also hosts the acid-base titration II experiment:

• User interface (UI): developed with Angular and Web Components, providing a familiar assistant-style chat experience so that students can interact with minimal adaptation effort, while allowing flexibility for experimentation and future extensions.

• Server-side component: deployed in LabsLand's infrastructure, this handles communication between the UI and the OpenAI API through a custom protocol, manages authentication, logs all interactions, and stores data securely.

• Instructor and administrator configuration interface: a web-based UI that allows teachers to provide contextual information, laboratory descriptions, procedural guidance, and behavioural constraints (e.g., “do not give complete numerical solutions”) that tailor the assistant's behaviour to educational purposes.

The assistant's context included:

• Information about the laboratory setup, instruments, and safety guidelines;

• Excerpts from the instructors' laboratory manual;

• Instructions preventing the model from directly providing numeric results to promote reasoning instead of answer-giving.

All student messages were routed through the LabsLand backend, which handled authentication, storage, and moderation. The assistant itself had no Internet access or control over physical equipment, and could only respond to textual questions related to the experiment.
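The way teacher-provided context could be combined into a single instruction text for the model can be sketched as follows. This is a minimal illustration in Python, not the LabsLand implementation: the class, field, and function names are hypothetical, the sample strings are placeholders, and no call to the OpenAI API is shown.

```python
from dataclasses import dataclass, field


@dataclass
class AssistantContext:
    """Teacher-provided context (all field names are hypothetical)."""
    lab_setup: str
    manual_excerpts: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)


def build_instructions(ctx: AssistantContext) -> str:
    """Join the context pieces into one instruction text for the model."""
    parts = ["Laboratory setup:\n" + ctx.lab_setup]
    if ctx.manual_excerpts:
        parts.append("Manual excerpts:\n" + "\n".join(ctx.manual_excerpts))
    if ctx.constraints:
        rules = "\n".join(f"- {rule}" for rule in ctx.constraints)
        parts.append("Behavioural constraints:\n" + rules)
    return "\n\n".join(parts)


# Placeholder content; only the constraint text is quoted from the article.
ctx = AssistantContext(
    lab_setup="Acid-base titration II: burette with stopcock valve.",
    manual_excerpts=["Record the initial burette volume before titrating."],
    constraints=["do not give complete numerical solutions"],
)
instructions = build_instructions(ctx)
print(instructions)
```

Keeping the behavioural constraints in a separate list mirrors the configuration interface described above, where teachers can adjust them without touching the laboratory description.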

The objective of this study is to analyse students' interactions with the intelligent assistant integrated into the acid-base titration II remote laboratory, to characterise the learning needs that arose during the experimental activity and to identify the roles that students assigned to the assistant through their usage patterns.

2 Methods

A mixed methodology was employed. For the qualitative element of the study, a content analysis approach (Bardin, 1986) was used to analyse the students' interactions with the assistant, in order to establish categories and emerging dimensions. Furthermore, a descriptive statistical analysis (Hernández et al., 2014) was conducted to quantify occurrence frequencies, identify distributions, and visualise the primary learning barriers through graphical representation of data.

2.1 Participants

The study involved 150 students enrolled in the Common Basic Cycle at the University of Buenos Aires. All participants were taking the introductory Chemistry course, which covers fundamental chemistry concepts and introduces Analytical Chemistry and Organic Chemistry. This represented the students' first exposure to both remote laboratories and experimental activities in general. Throughout the course, students conducted experimental work through remote laboratory platforms. While the use of the remote laboratory to perform the experimental activity was mandatory, students had the opportunity to choose whether or not to use the intelligent assistant.

2.2 Experimental activity

The students were tasked with conducting an experimental acid-base titration activity, described in a document divided into two sections and available in the course's learning management system. The first section focused on the use of hydrochloric acid as a titrant, while the second centred on the use of acetic acid samples as a titrant. The second section, in which the intelligent assistant was employed, is the one relevant to this study.

The second part of the experiment required the use of the acid-base titration II remote laboratory. Students were allotted 1 h in the laboratory to carry out the experimental activity, which was structured as follows (Figure 1):

1. Introduction: Includes theoretical concepts about acetic acid as a weak carboxylic acid and its medical and industrial applications. It also includes supplementary resources with further information.

2. Case study: Presents a problem situation in which students must resolve a case from the Pathological Histology laboratory, where titration solutions are required to detect human papillomavirus.

3. Objectives and procedure: Defines the purpose of the practice, as well as the steps to follow. Includes two videos, one that explains the remote laboratory and another that explains the use of the intelligent assistant.

4. Report guidance: Provides specific guidance for each section of the report.

5. Report preparation: Data analysis, findings and their explanation from a scientific point of view.

Figure 1

Flowchart depicting the stages of an acid-base titration project. Steps: Acid-base Titration II, Introduction, Case Study, Objectives and Procedure, Guide to the Report, and Preparation of the Report. Complementary Resources connect to Remote Laboratories and Intelligent Assistant.

Figure 1. Structure of the Acid-Base Titration experimental activity used in the chemistry course at the CBC of the University of Buenos Aires.

The students could repeat the laboratory experience as many times as necessary, and they gave informed consent for the corresponding analysis of their interactions.

2.3 Analysis and creation of categories

The messages that the students sent to the assistant were extracted from the LabsLand database, where all exchanged messages are automatically stored. A total of 308 conversations were recorded, comprising 818 interactions from the 150 students. These were then subjected to a systematic process to construct categories based on Glaser and Strauss's (1967) Constant Comparative Method, with researcher triangulation. Initially, 18 categories were identified (Figure 2), followed by a refinement stage in which five categories containing information that did not fit the study objectives were eliminated. Subsequently, categories with similar meanings or functions were grouped together: three were integrated into a category designated “calculation,” two into “theory,” and a further two into “help with report.” The emerging categories were grouped into four dimensions that reflect the different types of support that students needed during the experimental activity, which are detailed below.
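The category bookkeeping described above can be checked with a few lines of arithmetic. The sketch below is illustrative only (the variable names are not taken from the study); it reproduces the nine final categories that are listed under the four dimensions.

```python
initial_categories = 18
eliminated = 5  # categories that did not fit the study objectives

# Each merge folds several categories into one: (resulting name, number merged).
merges = [("calculation", 3), ("theory", 2), ("help with report", 2)]

remaining = initial_categories - eliminated
for _name, merged in merges:
    remaining -= merged - 1  # n categories collapse into a single one

print(remaining)  # 9
```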

Figure 2

Flowchart detailing the categorization process of student interactions with AI, totaling 818. It includes five stages: Identification, Encoding, Debugging, Grouping, and Categorization. Under each stage, interactions and categories are analyzed, filtered, grouped, or constructed, with numbers given for categorized interactions and categories at each step.

Figure 2. Systematic process for creating categories and dimensions from student interactions with the intelligent assistant.

2.3.1 Development of remote experiences

- Procedure: Questions about the methodological steps and sequence of actions to carry out the laboratory practice.

- Equipment: Questions about the instruments and materials used, including technical characteristics such as equipment capacities and volumes.

- Remote Laboratories: Queries related to access, navigation and use of the remote laboratory platform.

2.3.2 Data processing

- Data collection: Questions about how to record, measure and collect experimental values during titration.

- Calculations: Questions on mathematical formulae and procedures for determining concentrations, pH and unit conversions.

2.3.3 Application and conceptual understanding

- Theory: Conceptual questions on chemical fundamentals of titration such as colour changes, acid-base reactions and equivalence point.

- Application in real contexts: Questions relating practical situations and clinical applications of acetic acid to the experimental activity.

- Examples: Request for practical demonstrations and illustrations to gain a better understanding of concepts and procedures.

2.3.4 Guide to preparing the laboratory report

- Help with reporting: Questions about how the laboratory report is written, including how it is structured, the words used, and how the results are presented.

Subsequently, a descriptive statistical analysis was carried out to calculate the frequencies of the patterns in the categories with the highest number of interactions. The data obtained from this analysis were grouped by levels of importance (minor, moderate and critical). For this purpose, graphical visualisations were created with RStudio. This quantitative analysis enabled the identification of specific patterns of frequent queries and the determination of the most relevant difficulties in each thematic axis, thus providing information about the challenges faced by the students during the experimental activity.
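The grouping by level of importance amounts to thresholding the frequency of each query pattern. The sketch below is a Python illustration, not the authors' RStudio code. The cut-offs are taken from the figure descriptions (critical from 10 interactions for “Procedure” and from 20 for “Calculation”, moderate from 5), the counts are those reported in the text, and the function name is hypothetical.

```python
def importance_level(count: int, critical_min: int, moderate_min: int = 5) -> str:
    """Return 'critical', 'moderate' or 'minor' for a query-pattern frequency."""
    if count >= critical_min:
        return "critical"
    if count >= moderate_min:
        return "moderate"
    return "minor"


# Frequencies reported in the text for the two most frequent patterns per category.
procedure = {"Start of the experiment": 22, "General procedure": 11}
calculations = {"Molar concentration": 54, "Mass/volume concentration": 25}

procedure_levels = {
    name: importance_level(n, critical_min=10) for name, n in procedure.items()
}
calculation_levels = {
    name: importance_level(n, critical_min=20) for name, n in calculations.items()
}
print(procedure_levels)
print(calculation_levels)
```

Passing the critical cut-off as a parameter reflects the fact that the figures use different thresholds for different categories.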

3 Results and discussion

After the debugging process of the students' interactions with the assistant, 646 queries were obtained (Figure 2). The analysis of these interactions yielded four dimensions, comprising various categories (see Table 1). These dimensions facilitate the identification of the primary roles assigned to the assistant by students during the learning process. It is important to note that not only was the type of support requested identified, but also the areas in which students concentrated their needs.

Table 1

Table 1. Total number of interactions classified according to dimensions and categories.

The first dimension, Development of remote experiences, groups the interactions related to instructions on carrying out the practice, configuring the equipment, and more technical questions about the use of the laboratory. Some examples of these interactions are:

- OK, I'm at the second stage, what do I do? (Procedure)

- What is the stopcock valve? (Equipment)

- How to use this remote laboratory? (Remote laboratory)

The category with the most interactions in this dimension was “Procedure,” with 84 interactions. An analysis of the students' most frequently asked questions about the procedure (Figure 3) revealed that the critical problems (40.2%) were concentrated in two patterns. “Start of the experiment” accounted for 22 interactions (26.8%), indicating a lack of clear initial orientation and uncertainty about the first step. “General procedure” accounted for 11 interactions (13.4%), indicating a lack of clarity regarding the sequence of actions and a need for detailed instructions.

Figure 3

Bar graph displaying interactions categorized by importance level: Critical (≥10), Moderate (5-9), and Minor (<5). Critical has the highest interactions, followed by Moderate. Procedure types are color-coded, including starting the experiment, ending, solutions, titrant selection, and more, each with specific interaction counts.

Figure 3. Grouping according to frequency of interactions by level of importance of the “Procedure” category.

These types of questions show that, although students have access to a laboratory guide and introductory videos, they still require real-time support from the assistant, which acts as a tutor that can clarify sequences and confirm steps, reducing obstacles to the progress of the experimental activity. Furthermore, the diversity of the interactions allows the assistant to adapt its responses to the particular needs and level of understanding of each student. Technological models of this kind, which promote personalised learning, enable students to develop competences in accordance with their prior knowledge and skills (Alamri et al., 2021).

The second dimension of Data processing is concerned with the interactions that are focused on the recording of data and subsequent calculations for the analysis of results. This dimension has been identified as the one with the highest number of interactions, suggesting that it is a primary area of concern within this subject matter. This is further substantiated by the following set of questions:

- How do I read the volume of the burette to find out V, and what would be the initial volume? (Data collection)

- Hello, what is the formula to calculate the initial concentration of the reaction? (Calculations)

- How do I know what value to put in each unknown in the formula? (Calculations)

In this dimension, the category with the most interactions is “Calculations,” with 129 interactions. Figure 4 shows that 39.7% of the queries refer to the same fundamental problem, the calculation of the molar concentration of the substance, with a total of 54 interactions. Together with the 25 queries about mass/volume concentration (18.4%), these account for 58.1% of the total. This finding confirms that concentration calculations are the major difficulty experienced by students in this experiment, ranking at the critical level of importance.

Figure 4

Bar chart illustrating the number of interactions by importance level. “Critical” (20 or more) with one large red and orange bar, “Minor” (fewer than 5) with several thin colored bars, and “Moderate” (5 to 19) with stacked bars. A legend identifies the calculation types by color, including sample/analyte concentration, pH calculation, and general formulas.

Figure 4. Grouping according to frequency of interactions by level of importance of the “Calculation” category.

These results are consistent with a study by Raviolo et al. (2021) in which half of the first-year students taking their first chemistry subject demonstrated an absence of a clear conceptual understanding of molarity. These findings underscore the urgent need to develop more effective pedagogical strategies and educational resources that address these conceptual difficulties in concentration calculations. The teaching of fundamental concepts in this area has the potential to enhance students' comprehension of acid-base titration.
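For reference, the two concentration calculations that dominated the queries can be sketched as follows. This is a hedged illustration rather than the course's worked solution: it assumes a 1:1 reaction between a monoprotic acid and a monoprotic base, the function names are hypothetical, and the volumes and molarity are made-up example values rather than study data.

```python
ACETIC_ACID_MOLAR_MASS = 60.05  # g/mol for CH3COOH


def molar_concentration(known_molarity: float, known_volume_ml: float,
                        unknown_volume_ml: float) -> float:
    """Molarity of the unknown solution at the equivalence point (1:1 reaction)."""
    # At equivalence the moles match: C_known * V_known = C_unknown * V_unknown.
    return known_molarity * known_volume_ml / unknown_volume_ml


def mass_volume_concentration(molarity: float, molar_mass: float) -> float:
    """Convert a molar concentration (mol/L) into a mass concentration (g/L)."""
    return molarity * molar_mass


# Illustrative values only: 12.5 mL of a 0.100 mol/L solution neutralise 10.0 mL.
molarity = molar_concentration(0.100, 12.5, 10.0)
grams_per_litre = mass_volume_concentration(molarity, ACETIC_ACID_MOLAR_MASS)
print(f"{molarity:.3f} mol/L, {grams_per_litre:.2f} g/L")
```

The volume units cancel in the first function, so millilitres can be used directly as long as both volumes share the same unit.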

In addition, other queries were directed to the application of formulas and data interpretation, which, although not the majority, are equally important for the analysis. These queries show students' difficulty in interpreting equations and identifying the values to be used for each variable. This is consistent with the study by Towns et al. (2025), which reveals chemistry students' difficulty in connecting mathematical calculation with the chemical phenomenon being studied and stresses the importance of emphasising the connexions between calculus and chemistry to improve learning and teaching.

In summary, these results reveal that, despite the fact that the data collection procedures are in the laboratory guide, the interpretation, manipulation and analysis of data remains a challenge. This trend confirms that students positioned artificial intelligence in a tutorial role that not only guides the step-by-step development of the activities, but also offers differentiated support according to the requirements of each student (Lin et al., 2023).

It is important to highlight that, although the assistant is of great help in understanding, interpreting and validating the results, students must continue to be encouraged to reason about the content studied. In other words, the assistant needs to be configured so that it has a guiding function and avoids providing direct solutions, especially in the case of calculations. This requires teachers to become more proficient in the use of artificial intelligence tools (Zawacki-Richter et al., 2019).

The third dimension of Application and conceptual understanding includes interactions that establish links between chemical concepts and application in everyday contexts, as well as requests for examples that allow a deeper understanding of the theoretical underpinning:

- What is the reason for the colour change? (Theory)

- How do I know the equivalence point and the end point? As there are times when you can see that it changes colour and then becomes transparent (Theory).

- In what products can acetic acid be found? (Application in real-world contexts)

- Can you give me an example of what my hydrochloric acid table should look like? (Example)

The category with the most interactions in this dimension is “Theory,” with 170 interactions. Figure 5 shows that the critical problems (40.8%) are found in the identification of the titrant with 25 queries (14.8%), the equivalence point and end point with 23 queries (13.6%) and colour changes with 21 queries (12.4%). This demonstrates confusion about which substance acts as the titrant and how to differentiate between an acid and a base, as well as difficulty in understanding the role of each reagent.

Figure 5

Bar chart showing interactions by importance level: Critical (20 or more), Minor (less than 5), and Moderate (5-19). Critical shows a significant number of interactions, primarily involving titrant/analyte identification. Moderate includes various concepts such as phenolphthalein function and basic concepts. Minor interactions are few, spanning multiple categories. Color-coded legend identifies each theoretical concept.

Figure 5. Grouping according to frequency of interactions by level of importance of the “Theory” category.

Problems were also encountered in identifying when to stop titration, as well as interpreting the different shades at the exact moment of change. This reinforces the importance of giving a clear explanation of the mechanisms of colour change, defining basic terminology with examples and relating concepts to real applications. This approach facilitates the development of connexions between prior knowledge and new scientific experiences, enhancing knowledge acquisition (Pozo Municio, 2023).

Tools such as remote laboratories have the potential to reinforce these aspects by affording students a range of degrees of freedom to experiment. This is due to the fact that students can modify the variables of the experiment by selecting different configurations and, especially, they can repeat the practice as many times as they consider necessary. This, together with the integration of the intelligent assistant, allows users to clarify any doubts that arise with an immediacy that only this type of tool can provide.

The students adopted the assistant as a conceptual support resource, utilising it to facilitate the gradual construction of learning within a conceptual framework. These interactions reveal that students employed artificial intelligence as a bridge between the experimental component and the conceptual understanding of chemical phenomena.

In the same category, some interactions asked for images to be generated as examples, for a deeper understanding of the experimental activity:

- Can you use pictures? (Example)

- Show me a picture of the experiment (Example)

At the time of the study, the GPT-4o model was used; however, image generation was not configured. The above interactions demonstrate the importance of multimodal educational resources, which are fundamental to optimising learning because they engage multiple sensory channels simultaneously, increasing comprehension and retention of information. According to a study by Luo (2023), the combination of multiple sensory stimuli promotes cognitive performance and engagement in learning. Moreover, this approach facilitates the establishment of diverse learning pathways that can adapt to the individual styles and requirements of learners.

In the final dimension, Guide to preparing the laboratory report, interactions related to the writing and presentation of the final report are presented, showing the need for support in aspects of scientific communication. The most frequently asked questions include:

- In which verb tense should the introduction section of the report be written? (Assistance with report)

- How do I make the table for the lab? (Assistance with report)

- How do I put together the report? (Assistance with report)

This dimension has a single category, “Assistance with report.” An analysis of the most frequently asked questions (Figure 6) showed that 45.5% of the queries (critical level of importance) were related to the preparation of tables with 21 interactions (17.1%), conclusions of the report with 18 interactions (14.6%) and objectives of the experiment with 17 interactions (13.8%). These queries reveal problems in organising and structuring experimental data, difficulty in connecting results with experimental objectives, and a lack of clarity about the purpose of the experimental activity. This finding indicates that students face the challenge of documenting and scientifically communicating their results.
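The shares at the critical level can be recomputed from the reported counts. The sketch below is a minimal check that assumes a category total of 123 queries. That total is not stated in the text; it is inferred here from the reported percentages, so it should be read as an assumption. Under it, the three shares come out at 17.1%, 14.6% and 13.8%, which add up to the reported 45.5%.

```python
# Counts reported in the text for the three critical query patterns.
counts = {"tables": 21, "conclusions": 18, "objectives": 17}

# ASSUMPTION: total inferred from the reported percentages, not given in the text.
category_total = 123

shares = {name: round(100 * n / category_total, 1) for name, n in counts.items()}
critical_share = round(100 * sum(counts.values()) / category_total, 1)

print(shares)
print(critical_share)
```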

Figure 6

Bar chart depicting types of assistance by importance level and number of interactions. Critical assistance includes table creation, report conclusions, and experiment objectives. Moderate includes theoretical foundation and practical work. Minor has bibliographic references. Each type is color-coded with interaction counts in parentheses.

Figure 6. Grouping according to frequency of interactions by level of importance of the “Assistance with Report” category.

Laboratory reporting constitutes a fundamental skill in science education, as it involves the ability to organise and present the results of experimental activity, as well as developing skills to argue and communicate those results in an effective way (Gormally et al., 2022).

This finding reveals that the assistant not only acts as a resource to solve technical or conceptual doubts, but that students used it as a mediator in scientific communication. This emerging role is significant in demonstrating how students utilised the assistant to develop communicative competencies and refine technical writing skills, which are an essential component of chemistry education (Tilstra, 2001).

The temporal patterns of use of the intelligent assistant were also analysed (see Table 2). The temporal analysis was restricted to each student's initial interaction with the assistant in each laboratory session; subsequent interactions within the same time range were excluded to avoid biasing the results. The total number of interactions recorded was 296, which exceeds the number of participating students (N = 150). This discrepancy arises because some students returned to the laboratory on multiple occasions and used the assistant again. The average number of sessions per student was 1.93, indicating the presence of supplementary reinforcement or practice sessions.

Table 2

Table 2. Number of first interactions by time range.

The results show that more than half of the uses of the assistant (58.78%) are concentrated in the ranges from 16:00 to 23:59 h (174 interactions). This temporal window aligns with periods that fall outside the conventional teaching workday. This finding is consistent with the observations of Matarrita and Concari (2018) who asserted that remote laboratories, due to their uninterrupted accessibility, are often employed by students in instances where they do not have direct teaching assistance available.
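The filtering and binning behind this temporal analysis can be sketched as follows. The log rows, identifiers, and function names are hypothetical, and the width of the time ranges is an assumption (8-hour ranges are used here for brevity); only the final check uses figures reported in the text (174 of 296 first interactions).

```python
from collections import Counter
from datetime import datetime

# Hypothetical log rows: (student_id, session_id, ISO timestamp).
log = [
    ("s1", "a", "2025-05-05T17:02:00"),
    ("s1", "a", "2025-05-05T17:10:00"),  # later message in the same session
    ("s1", "b", "2025-05-06T09:30:00"),
    ("s2", "c", "2025-05-05T22:45:00"),
]


def first_interactions(rows):
    """Keep only the earliest timestamp per (student, session) pair."""
    first = {}
    for student, session, stamp in rows:
        when = datetime.fromisoformat(stamp)
        key = (student, session)
        if key not in first or when < first[key]:
            first[key] = when
    return first


def time_range(when, width_hours=8):
    """Label the time range a timestamp falls into (range width is an assumption)."""
    start = (when.hour // width_hours) * width_hours
    end = start + width_hours - 1
    return f"{start:02d}:00-{end:02d}:59"


counts = Counter(time_range(t) for t in first_interactions(log).values())
print(counts)

# Share reported in the text: 174 of the 296 first interactions.
reported_share = round(100 * 174 / 296, 2)
print(reported_share)
```

Counting one entry per student and session, rather than every message, prevents long conversations from inflating a single time range.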

Overall, the categories show that students gave the assistant roles as a tutor, conceptual support resource and mediator in scientific communication, and these roles allowed for guidance in data analysis and clarification of conceptual doubts. These uses demonstrate the versatility of AI in remote laboratories, but also raise challenges about the balance between guiding and fostering autonomy. A future line of improvement could include the incorporation of multimedia resources such as images and diagrams to enrich the assistant's answers. Furthermore, it would be beneficial to configure the assistant to guide the student in problem solving through reflective questioning and gradual support, enhancing students' self-regulated learning.

4 Conclusions and future perspectives

The use of the intelligent assistant integrated into the remote acid-base titration II laboratory revealed that students assigned different roles to the artificial intelligence according to their learning needs. The analysis of the interactions showed that students positioned the assistant mainly as a tutor to resolve procedural and calculation doubts, a conceptual support resource to connect experimental practice with theoretical understanding, and a mediator in scientific communication to guide the preparation of laboratory reports. This diversification of roles arose in direct response to the main difficulties identified among students: understanding experimental procedures, concentration calculations, theoretical understanding, and scientific writing.

The study shows that, to make better use of remote laboratories, it is necessary to continue integrating artificial intelligence tools as a component that enhances student performance, supports teachers, and optimises these educational resources.

In the future, we plan to improve the results through both exogenous improvements (driven by current trends in the field) and endogenous improvements (those that we build). Regarding exogenous improvements, the competing companies that develop these models are constantly improving them, with multiple releases per year. These improvements are measured with benchmarks such as GPQA Diamond (Graduate-Level Google-Proof Q&A Benchmark), which focuses on biology, physics, and chemistry. "Google-Proof" in this context means that the solution cannot simply be searched for; human experts (holding or pursuing a PhD in the field) answer 65% of the questions correctly, or 74% after retrospectively identified mistakes are discounted (Rein et al., 2023). The model used in this contribution (GPT-4o, May 2024) scores 53%, whereas OpenAI o1 obtains 78.0%. The latest GPT-5 reports 85.7% (OpenAI, 2025c), Claude 4 Opus 83.3%, and Gemini 2.5 Pro 83.0% (Anthropic, 2025). As these models approach 100%, new benchmarks on which human experts in the field score progressively lower will need to be created, as has happened with previous benchmarks (e.g., SQuAD, SuperGLUE) once they lost their discriminative power.
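The benchmark comparison in this paragraph can be made explicit with a short sketch. The scores are those quoted in the text; the variable and function names are illustrative assumptions, and the code makes no claim about how the benchmark itself is computed.

```python
# GPQA Diamond scores (percent correct) as quoted in the text.
EXPERT_BASELINE = 65.0            # human experts, raw
EXPERT_BASELINE_CORRECTED = 74.0  # after discounting retrospectively identified mistakes

MODEL_SCORES = {
    "GPT-4o (May 2024)": 53.0,  # model used in this study
    "OpenAI o1": 78.0,
    "GPT-5": 85.7,
    "Claude 4 Opus": 83.3,
    "Gemini 2.5 Pro": 83.0,
}


def models_above(threshold: float) -> list[str]:
    """Names of models scoring strictly above the threshold, sorted alphabetically."""
    return sorted(name for name, score in MODEL_SCORES.items() if score > threshold)


def headroom(score: float) -> float:
    """Percentage points left before the benchmark is saturated at 100%."""
    return round(100.0 - score, 1)


if __name__ == "__main__":
    print(models_above(EXPERT_BASELINE_CORRECTED))
    for name, score in MODEL_SCORES.items():
        print(f"{name}: {score}% (headroom {headroom(score)} points)")
```

On these figures, the model used in the study sits below the raw expert baseline, while every later model listed exceeds the corrected one, which is the trend the paragraph describes.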

Additionally, the APIs provided by these companies are also improving, including in how they switch between models. For example, GPT-5 (OpenAI, 2025c) is announced as a multipurpose model that reasons for longer on more complex prompts and responds more quickly to prompts that can be solved faster, probably by using simpler models internally. This makes it easier for developers to create better assistants like the one described in this article. Regarding endogenous improvements, we plan to provide the assistant with more information, such as what the student is doing in the laboratory, and to upload screenshots and laboratory data so that it can better explain what is happening.

Hallucinations are a known issue in generative AI. The system, however, is designed to remain compatible with different large language models (LLMs), and the risk of hallucinations depends heavily on the specific engine used. As new, more reliable models emerge, we expect the frequency of hallucinations to decrease, improving performance in similar educational applications. In this setting the risk is low: relevant cases are rare, and because the assistant does not interact directly with the chemical equipment, there is no safety risk.

Finally, the frequency and patterns of use of the assistant validate the need to provide pedagogical support precisely when the teacher is not available, to facilitate greater appropriation of concepts and improve knowledge acquisition during experimental activities. This analysis provides the necessary basis for implementing improvements to the assistant and facilitating its scalability to remote laboratories in other areas such as physics and biology.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

FL-S: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. II: Methodology, Writing – original draft. PO: Data curation, Resources, Writing – original draft. LR-G: Conceptualization, Investigation, Resources, Software, Visualization, Writing – original draft, Writing – review & editing. CA-M: Conceptualization, Funding acquisition, Investigation, Project administration, Supervision, Validation, Visualization, Writing – original draft.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Acknowledgments

This article is part of the results of Project PROY0018-2024 registered with the Vice-Rectorate for Research of Universidad Estatal a Distancia.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feduc.2025.1712743/full#supplementary-material

References

Alamri, H. A., Watson, S., and Watson, W. (2021). Learning technology models that support personalization within blended learning environments in higher education. TechTrends 65, 62–78. doi: 10.1007/s11528-020-00530-3

Anthropic. (2025). Introducing Claude 4. Available online at: https://www.anthropic.com/news/claude-4 [Accessed August 26, 2025].

Arias-Navarro, E., Moya, C. N., Lizano-Sánchez, F., Arguedas-Matarrita, C., Mora-Ley, C., and Idoyaga, I. (2024). Study of free fall using an ultra-concurrent Laboratory at the University. IJOE 20, 4–15. doi: 10.3991/ijoe.v20i02.43099

Bardin, L. (1986). El análisis de contenido. Madrid: Akal.

Berber, S., Brückner, M., Maurer, N., and Huwer, J. (2025). Artificial intelligence in chemistry research-implications for teaching and learning. J. Chem. Educ. 102, 1445–1456. doi: 10.1021/acs.jchemed.4c01033

Dogru, M. S., and Faulconer, E. K. (2025). ChatGPT as a virtual laboratory teaching assistant in undergraduate biology. Res. Sci. Educ. doi: 10.1007/s11165-025-10271-z

Erümit, A. K., and Sarialioglu, R. Ö. (2025). Artificial intelligence in science and chemistry education: a systematic review. Discov. Educ. 4:178. doi: 10.1007/s44217-025-00622-3

Glaser, B. G., and Strauss, A. L. (1967). The discovery of grounded theory. Chicago: Aldine Publishing Company.

Gormally, C., Sullivan, C. S., Szeinbaum, N., and Nair, K. (2022). Motivating and Shaping Scientific Argumentation in Lab Reports. CBE Life Sci. Educ. 21:ar71. doi: 10.1187/cbe.21-11-0316

Gubareva, R., and Lopes, R. (2020). Virtual assistants for learning: a systematic Literature Review. In Proceedings of the 12th International Conference on Computer Supported Education (CSEDU 2020), 1, 579–586. doi: 10.5220/0009417600970103

Hernández, R., Fernández, C., and Baptista, P. (2014). Metodología de la investigación (6.ª edición). McGraw-Hill Interamericana de México, S.A.

Hussein, R., Zhang, Z., Amarante, P., Hancock, N., Orduna, P., and Rodriguez-Gil, L. (2024). Integrating Personalized AI-Assisted Instruction Into Remote Laboratories: Enhancing Engineering Education with OpenAI's GPT Models. 2024 IEEE Frontiers in Education Conference (FIE), Washington, DC, USA, 2024, pp. 1–7. doi: 10.1109/FIE61694.2024.10892918

Idoyaga, I. J., Montero-Miranda, E., Lizano-Sánchez, F., Medina, G. L., and Arguedas-Matarrita, C. (2024). “Comparative Study of Two Ultra-Concurrent Laboratories of Acid-Base Titration,” in Online Laboratories in Engineering and Technology Education. Lecture Notes in Networks and Systems, eds. May, D., Auer, M.E., Kist, A. (Cham: Springer), 1135. doi: 10.1007/978-3-031-70771-1_17

Lampropoulos, G. (2025). Combining artificial intelligence with augmented reality and virtual reality in education: current trends and future perspectives. Multimodal Technol. Interact. 9:11. doi: 10.3390/mti9020011

Lin, C. C., Huang, A. Y. Q., and Lu, O. H. T. (2023). Artificial intelligence in intelligent tutoring systems toward sustainable education: a systematic review. Smart Learn. Environ. 10:41. doi: 10.1186/s40561-023-00260-y

Lizano-Sánchez, F., Idoyaga, I., Capuya, F., Orduña, P., Rodríguez-Gil, L., and Arguedas-Matarrita, C. (2025b). Students' experience using an Artificial Intelligence tool integrated into a Remote Chemistry Laboratory. In Proceedings of 23rd International Conference on Smart Technologies and Education 2025, Lecture Notes in Networks and Systems, Springer, in press.

Lizano-Sánchez, F., Idoyaga, I., Orduña, P., Rodríguez-Gil, L., and Arguedas-Matarrita, C. (2025a). Teachers' perspective on the use of artificial intelligence on remote experimentation. Front. Educ. 10:1518896. doi: 10.3389/feduc.2025.1518896

Luo, H. (2023). Editorial: advances in multimodal learning: pedagogies, technologies, and analytics. Front. Psychol. 14:1286092. doi: 10.3389/978-2-8325-3917-0

Matarrita, C., and Concari, S. (2018). Características deseables en un laboratorio remoto para la enseñanza de la física: Indagando a los especialistas. Caderno Brasileiro Ensino Física 35, 702–720. doi: 10.5007/2175-7941.2018v35n3p702

OpenAI. (2024). Hola, GPT-4o. Available online at: https://openai.com/es-ES/index/hello-gpt-4o/ [Accessed August 26, 2025].

OpenAI. (2025a). Introducing GPT-4.1 in the API. Available online at: https://openai.com/index/gpt-4-1/ [Accessed August 26, 2025].

OpenAI. (2025b). OpenAI o3-mini. Available online at: https://openai.com/es-ES/index/openai-o3-mini/ [Accessed August, 2025].

OpenAI. (2025c). Presentamos GPT-5. Available online at: https://openai.com/es-ES/index/introducing-gpt-5/ [Accessed August 26, 2025].

Orduña, P., Rodriguez-Gil, L., Garcia-Zubia, J., Angulo, I., Hernandez, U., and Azcuenaga, E. (2016). LabsLand: A sharing economy platform to promote educational remote laboratories maintainability, sustainability and adoption. In 2016 IEEE frontiers in education conference (FIE), October 2016, Erie, United States. doi: 10.1109/FIE.2016.7757579

Paladines, J., Ramírez, J., and Berrocal-Lobo, M. (2021). Integrating a dialog system with an intelligent tutoring system for a 3D virtual laboratory. Interact. Learn. Environ. 31, 4476–4489. doi: 10.1080/10494820.2021.1972012

Pozo Municio, J. I. (2023). Aprender en la educación primaria (1.ª edición). Fundación Universitat Oberta de Catalunya (FUOC).

Raviolo, A., Farré, A. S., and Traiman, N. (2021). Students' understanding of molar concentration. Chem. Educ. Res. Pract. 22, 486–497. doi: 10.1039/D0RP00344A

Rein, D., Hou, B. L., Stickland, A. C., Petty, J., Pang, R. Y., Dirani, J., et al. (2023). GPQA: A graduate-level Google-proof Q&A benchmark. arXiv preprint arXiv:2311.12022.

Sajja, R., Sermet, Y., Cwiertny, D., and Demir, I. (2023). Integrating AI and learning analytics for data-driven pedagogical decisions and personalized interventions in education. arXiv preprint.

Tilstra, L. (2001). Using journal articles to teach writing skills for laboratory reports in general chemistry. J. Chem. Educ. 78:762. doi: 10.1021/ed078p762

Todericiu, I. A. (2025). Virtual assistants: a review of the next frontier in AI interaction. Acta Univ. Sapientiae Inform. 17:1. doi: 10.1007/s44427-025-00002-7

Towns, M. H., Rodriguez, J. M. G., McAfee, S. C., et al. (2025). The intersection of chemistry and calculus: a mutually beneficial crossroad. Int. J. Res. Undergrad. Math. Ed. doi: 10.1007/s40753-025-00272-8

Zawacki-Richter, O., Marín, V. I., Bond, M., and Gouverneur, F. (2019). Systematic review of research on artificial intelligence applications in higher education–where are the educators? Int. J. Educ. Techn. High. Educ. 16:39. doi: 10.1186/s41239-019-0171-0

Keywords: artificial intelligence, remote laboratories, chemistry education, intelligent assistant, student learning support

Citation: Lizano-Sánchez F, Idoyaga IJ, Orduna P, Rodriguez-Gil L and Arguedas-Matarrita C (2025) Students' interactions with an artificial intelligence assistant in a remote chemistry laboratory. Front. Educ. 10:1712743. doi: 10.3389/feduc.2025.1712743

Received: 25 September 2025; Accepted: 20 October 2025;
Published: 05 November 2025.

Edited by:

Sami Heikkinen, LAB University of Applied Sciences, Finland

Reviewed by:

Pongkit Ekvitayavetchanukul, Khon Kaen University, Thailand
Renata Nemeth, Eötvös Loránd University, Hungary

Copyright © 2025 Lizano-Sánchez, Idoyaga, Orduna, Rodriguez-Gil and Arguedas-Matarrita. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Fiorella Lizano-Sánchez, flizanos@uned.ac.cr
