Coexistence of Misconceptions and Scientific Conceptions in Chemistry Professors: A Mental Chronometry and fMRI Study

Using mental chronometry and functional magnetic resonance imaging (fMRI), 17 pre-university and university professors were tested with a cognitive task that required distinction of true or false statements about chemistry. These were prepared in pairs of similar statements, which differed only by congruency: while congruent stimuli involved no plausible interference of misconceptions, the incongruent match did. Results show longer response times and more activation in brain areas related to inhibitory control for incongruent compared to congruent statements, thereby supporting the hypothesis that misconceptions interfere in the production of scientific answers, even by experts. Possible educational implications are discussed.


Misconceptions and Conceptual Change in Science Education
Learners make all kinds of errors. Some of them originate from difficulties in memory encoding or retrieval, while others have to do with shortcomings in procedural knowledge and application. Most of the time, such errors can rather easily be corrected with intelligible teaching, proximal tutoring or practice. But many other errors appear to resist this kind of classic yet proven intervention. Their recurrence (Stavy et al., 2006) and the stability of their structure suggest that something coherent lies at their origin. This influence can be, and has been, studied (at least since the 1970s) by science education researchers through interviews and questionnaires, and has been found to be of a mental representational nature.
Such representations about what things are and about what they do have been found in every scientific field and are sometimes called misconceptions, or alternative, erroneous, students', or initial conceptions (Duit and Treagust, 2012). For example, "clouds are made of gas (usually water vapor)" is a misconception because clouds are made of tiny liquid water particles; otherwise, they would be invisible. The widespread belief that "microbes are all toxic" is also false because most microorganisms are in fact completely harmless, while some are useful and a few are even essential for the preservation of human life. Entire inventories of misconceptions have been developed: "vaccines eradicate viruses," "sugar is toxic," and "all mountains are ancient volcanoes" are all common examples.
According to one of the most classic examples of a misconception that has been intensively studied by the science education community, a heavier object would always fall (accelerate) faster than a lighter one, and hit the ground first, even in a vacuum (Lazonder and Ehrenhard, 2014). This is a misconception because it conflicts with the scientific knowledge that in a vacuum, say near the moon, objects will accelerate at the exact same rate, regardless of their mass. While some erroneous representations can derive from physically interacting with the world, others could be socially transmitted, like the one according to which magnets attract metal (Thouin, 2015, p. 93). Indeed, this conception is incomplete, and thus erroneous, because magnets attract only three metallic elements (Fe, Co, Ni) and their alloys out of (at least) 91. But as the "Magneto" character is said to attract metal in most "X-Men" movies, it is possible that mass media sometimes cultivate such false ideas, albeit not necessarily intentionally. Misconceptions are also socially transmitted through peers, folk culture, and sometimes even through careless teaching (Taber, 2001).
Regardless of their origin, when students' conceptions contradict scientific knowledge, they can seriously interfere with educational efforts and survive persuasive teaching. Indeed, they have long been considered resistant and hard to "change" (Stepans et al., 1986; Taylor and Kowalski, 2004).
To address this kind of difficulty, many conceptual change models have been proposed through the years, among which some of the most popular are Posner et al.'s classic model (Posner et al., 1982), Vosniadou's ontological categories (Vosniadou and Brewer, 1992), and diSessa's p-prims (diSessa, 1993). While some of these models concentrate on effective teaching, others focus on change as a learning process. And while some of them consider learning as a (re-)mobilization and (re-)coordination of pieces of knowledge, others see it as re-qualifications within ontological categories, or theories. More than 86 distinct conceptual change models of all sorts have been recorded (Potvin et al., 2020). While these models suggest a diverse array of recommendations, it appears widely accepted that:
• Explicitly taking into account these misconceptions (identifying, revealing, discussing, and studying their origins) might be vital for conceptual change; and that
• Conceptual conflicts (that reveal the shortcomings of misconceptions) generally appear to play a positive role, even if authors sometimes advise caution while trying to trigger such conflicts.
However, the vast majority of conceptual change perspectives are also characterized by a generalized lack of interest in what happens to students' conceptions when change has been recorded as successful. We believe this to be rather unsurprising, since most models postulate explicitly or implicitly that initial conceptions have to be altered in order for change to occur. And indeed, misconceptions are then usually considered as "completely abandoned (Villani, 1992), modified (Limon, 2001), replaced (Posner et al., 1982, p. 212), reorganized (Jensen and Finley, 1995, p. 149), eliminated (Nersessian, 1998), rejected (Hewson, 1981, p. 385), transformed, or restructure[d] (Limon, 2001, p. 359)" (Potvin, 2013). Thus, if a conceptual "change" initiative had been successful, it is often implicitly presumed that there should not remain any trace of the targeted initial conception, or of its initial form.
However, not all of the proposed models or interventions were based on the premise that initial conceptions are altered when conceptual learning occurs. A growing number of them, on the contrary, postulate, implicitly or explicitly, that such learning is the result of a competition between coexisting ideas [Kendeou et al., 2016 (for refutational texts); Ohlsson, 2009 (resubsumption hypothesis); Dawson, 2014 (conceptual mediation); Solomon, 1983 (avoiding extinction of misconceptions); Nadelson et al., 2018 (Dynamic model of CC)]. Still, we argue that the general idea of coexistence of multiple conceptions or representations within a learner's mind remains, even today, rather marginal (or implicit) in the conceptual change research tradition and never became a major trend. Since no study has been able to refute the possibility of coexistence, we believe that this claim deserves further verification.

The Coexistence/Persistence Claim
Recently published studies using mental chronometry [reaction times (RT)] and neurophysiological methods [electroencephalography (EEG) and functional magnetic resonance imaging (fMRI)] have argued that initial conceptions might in fact not be erased when conceptual learning succeeds. Such studies have tested participants with cognitive tasks and compared congruent conditions, in which it is presumed that no misconception should impair the production of correct answers, to incongruent conditions, in which a documented misconception could. To give an example of a task, Stavy and Babai tested the interference of a geometric misconception according to which, when a shape shows more surface, its perimeter will be greater, which is sometimes true, but not always. They asked participants to say which one of two shapes had the larger perimeter (left, right or none). Figure 1 shows a congruent stimulus (A) [correct answer is right. It is a congruent stimulus because participants can find the correct answer by using the widespread conception that "the larger the area the larger the perimeter"] and an incongruent one (B) [correct answer is none. It is an incongruent stimulus because participants can be misled by applying "the larger the area the larger the perimeter" conception and thus choose the left shape]. According to their results, incongruent stimuli are generally answered less accurately, and even when correct answers are produced, they take longer than in congruent trials.
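The logic of such a perimeter task can be illustrated with a minimal Python sketch. The three shapes below are hypothetical and are not the stimuli of Figure 1; they were chosen only so that the area cue agrees with the correct answer in one pair and disagrees in the other. The function names are likewise illustrative.

```python
from math import hypot, isclose

def area(vertices):
    """Shoelace formula for a simple polygon given as ordered (x, y) vertices."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

def perimeter(vertices):
    """Sum of the edge lengths of the polygon."""
    n = len(vertices)
    return sum(
        hypot(vertices[(i + 1) % n][0] - vertices[i][0],
              vertices[(i + 1) % n][1] - vertices[i][1])
        for i in range(n)
    )

def larger(left_value, right_value):
    """Return which side has the larger value: 'left', 'right' or 'none'."""
    if isclose(left_value, right_value):
        return "none"
    return "left" if left_value > right_value else "right"

def classify(left, right):
    """A pair is congruent when the 'larger area' cue gives the correct answer."""
    intuitive = larger(area(left), area(right))
    correct = larger(perimeter(left), perimeter(right))
    return ("congruent" if intuitive == correct else "incongruent"), correct

# Hypothetical shapes (not the ones used by Stavy and Babai).
small_square = [(0, 0), (3, 0), (3, 3), (0, 3)]
big_square = [(0, 0), (4, 0), (4, 4), (0, 4)]
# Same square with a 1 x 1 corner removed: smaller area, identical perimeter.
notched_square = [(0, 0), (4, 0), (4, 3), (3, 3), (3, 4), (0, 4)]

pair_a = classify(small_square, big_square)    # area cue and perimeter agree
pair_b = classify(big_square, notched_square)  # area cue points left, perimeters are equal
```

In the second pair, removing a corner lowers the area without changing the perimeter, which is the kind of situation in which the intuitive rule misleads.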
Other tasks like this one have been used to study the effects of the presumed presence of scientific misconceptions in many scientific disciplines: electricity (Zhu et al., 2019), mechanics (Brault Foisy et al., 2015), states of matter (Babai and Amsterdamer, 2008), buoyancy (Potvin et al., 2015a; see Figure 2: In this task the misconception that larger/heavier objects have a stronger tendency to sink is tested. A = congruent; B = incongruent), biology, and sometimes many disciplines at the same time (Shtulman and Valcarcel, 2012; Allaire-Duquette et al., 2019, etc.). To our knowledge, when researchers contrast incongruent and congruent conditions (incongruent > congruent), they generally report significant differences in response times and EEG or fMRI signatures, even for knowledgeable participants. Since the only apparently defendable difference that exists between incongruent and congruent stimuli is the possible presence and therefore interference of a conceptual distractor (if the task is designed appropriately), results have been interpreted as indications of the activation of cognitive control or inhibition processes, also sometimes called inhibitory control. This function can be defined as a cognitive process that allows an individual to prevent the activation or manifestation of their habitual or dominant behavioral responses to stimuli. Indeed, delays in response are routinely interpreted as the result of the cognitive load that comes with the resolution of perceptual or conceptual internal conflicts. Likewise, late positive potential (LPP) and P-300 peaks (electrically positive difference at 300 ms), recordable by electroencephalography (EEG), have been associated with conflict resolution (Zhu et al., 2019). Finally, certain brain regions whose BOLD signal (Blood-Oxygen-Level Dependent) can be recorded through functional magnetic resonance imaging (fMRI) (Amaro and Barker, 2006) have also been associated with conflict detection or resolution.
These precise and rather small regions are the ventrolateral prefrontal cortex (VLPFC), the dorsolateral prefrontal cortex (DLPFC), and the anterior cingulate cortex (ACC) (Masson et al., 2012). For example, such regions had been widely recognized to activate when correctly resolving Stroop tasks (telling the color of a red-printed "blue" word) (Parris et al., 2019). A considerable number of studies in psychology have thus studied the brain mechanisms that activate when participants have to resist distractors of all kinds (Koch et al., 2010), and thus can be used as a rather robust basis for interpreting other tasks that test for conceptual distractors and could eventually have educational relevancy (Mason and Zaccoletti, 2020).
In all the studies that we referred to in the preceding four paragraphs, the general conclusion was similar: in the production of correct or expert answers, evidence of conflict was recorded. This conflict is generally interpreted as evidence of the persistence of the initial conceptions in participants' resolution processes, because it is the presence and the activation (conscious or not) of those conceptions that interfere in the resolution process, trigger temporary conflict and eventually require inhibitory control (Nenciovici et al., 2019; Vaughn et al., 2020). Thus, the activation of inhibitory control is considered a good indication of the coexistence of multiple conceptions in a single mind, and thus supports the coexistence/persistence claim.

Research Problems and Question
In recent years, positive results about the coexistence/persistence hypothesis have been accumulating. However, very few of these results were obtained by testing misconceptions in the field of chemistry. Even though a very large number of misconceptions in this particular field have been identified (Barke et al., 2009) and the coexistence hypothesis has been around for quite some time (Mortimer, 1995; Talanquer, 2009), very few studies have tested it directly.
The first study that we know of used mental chronometry to demonstrate the persistence of misconceptions involved in the classification of substances. Children often tend to associate the ability to flow (or to "be poured") with liquids (Stavy and Stachel, 1985) and rigidity with solids (Stavy and Stachel, 1985; Krnel et al., 2003). When classifying different substances, incongruent stimuli (viscous liquids and powders) took longer than congruent stimuli to be assessed correctly. However, the authors did not refer (at the time) to inhibitory processes to explain the delay in response times.
Two other studies directly tested the coexistence/persistence claim, but unfortunately used very few chemistry items (among other scientific ones). Thus, their specific effects were not distinguished. Also, these items were not necessarily exclusive to the chemistry universe and could have been associated with other scientific fields (e.g., "Plants emit carbon dioxide gas" and "A natural substance can pollute" [Allaire-Duquette et al., 2019, p. 4, supplementary material], and "Fire is composed of matter" [Shtulman and Valcarcel, 2012, p. 211]). However, the results clearly supported the coexistence claim.
Yet it is also possible to find results that come to somewhat divergent conclusions. For example, an fMRI study that compared the brain activity of novices and experts in chemistry was conducted by Nelson et al. (2007). This study used a task in which participants had to choose between correct representations of evaporation (molecules remaining intact) and incorrect ones (molecules breaking apart). They observed that experts activated the frontal regions more, which are generally associated with working memory, while novices used visual areas more (in the back of the brain). While these results do not completely contradict the possibility of coexistence (frontal regions activated to respond accurately), the conclusions of the authors pointed to working memory and semantic retrieval, rather than inhibitory control. While one could attribute these results to the nature of the task (visual vs. "not based on experience"; Nelson et al., 2007, in the section "Discussion"), it remains difficult to hypothesize about the absence of a typical inhibitory signature, since we have few details about precisely how the study was conducted.
Finally, we must not rule out the possibility that other studies have been conducted, but did not obtain positive results. Since negative results are not easily publishable, it is possible that existing negative results in chemistry have remained unknown. The scarcity of results about chemistry education, and their sometimes anomalous nature, suggest that there could be differences between misconceptions in this field and misconceptions in other scientific fields. And indeed, it has often been argued that chemistry learning might be distinct: "chemistry is a very conceptual subject, and many of its concepts are rather abstract [. . .] many refer to ideas that are not so easily demonstrated [. . .] involving hypothetical sub-microscopic entities" (Taber, 2009, p. 14). Thus, it is possible that chemistry learning, unlike many other disciplines, could not rely as easily on kinesthetic or embodied interactions or on direct observations. Therefore, conceptual knowledge in chemistry could be constituted on different bases than other scientific disciplines.
This might be why a very popular conceptual change model that prevails in the chemistry education field (more than in others) is the "common sense reasoning" paradigm, which is "grounded in a set of presuppositions about the surrounding world and the nature of things, and relies on mental strategies to make decisions and build inferences based on the information that is readily available" (Talanquer, 2006, p. 812). It is thus possible that chemical conceptions (as well as misconceptions) might not be grounded so much on physical interaction (compared to physics for instance), but maybe more on the need to solve abstract and school-typical problems while having few usable observations at hand. In such a context, students often illegitimately fall back on superficial and macroscopic observations or apparent properties, or on the only data that is immediately available. As for socially rooted misconceptions, it has been argued that most chemical misconceptions that teachers face in higher education might originate from "shortcomings resulting from previous learning" (Cormier, 2014, p. 35), rather than mass media or folk knowledge.
Thus, it is possible that chemistry misconceptions might be different in origin, and thus in nature, from misconceptions in other fields, like physics. This is the first reason why we believe it is important to address the coexistence claim in chemistry. Another issue in the present study might be the importance of showing that the persistence of misconceptions, if any, can occur with experts as well as novices. Indeed, in previous studies, persistence was usually tested merely by analyzing the correct answers produced by learners at different pre-university levels, instead of at the expert level. It might not be so surprising, indeed, to record conflicts during periods of life when participants are in the process of developing scientific thinking.
We thus propose testing the coexistence/persistence claim with confirmed experts in order to provide a stronger, albeit more difficult to achieve demonstration of coexistence. For the purposes of this study, we have chosen to define experts as qualified and confirmed professors of chemistry instead of Ph.D.s or other professionals of the field (technicians or chemical engineers, for example) because these professionals are in close proximity, on a day-to-day basis, with a large enough spectrum of chemical concepts, but also with their associated misconceptions through the work they accomplish with their students. We have also chosen chemistry professors because in the context of an educational study, their identification as experts makes sense and might allow educators to see how the results can ultimately resonate in their context.
Our general research question is "do chemistry professors have to inhibit frequent misconceptions while producing scientifically correct answers to chemistry questions?" We believe that providing a positive answer to this question will allow a better understanding of errors made by experts, but also by less-than-experts. Indeed, if misconceptions persist in the minds of experts, we can certainly infer that they should also persist within novices' minds. Such results would also allow our study to bring indirect support to conceptual change models that presuppose the coexistence of multiple conceptions. Based on previous and available studies, we hypothesize that we will observe longer latencies (hypothesis 1) as well as a stronger activation of brain regions usually associated with inhibitory control [DLPFC (hypothesis 2) and VLPFC (hypothesis 3)] when experts in chemistry respond to incongruent stimuli than when they respond to equivalent but congruent stimuli. In addition, activation of the anterior cingulate cortex is also expected (hypothesis 4), although previous results demonstrate that this has not always necessarily been required to conclude that inhibitory control is active (Brault Foisy et al., 2015).

Participants
In order to ensure that we recruited experts, our participants had to be university or pre-university (CÉGEP: professional and general teaching college, which is a direct prerequisite to attend state universities at ages 17-18, Canada) professors with at least 5 years of work experience. They had to be right-handed to avoid the possible greater variability of lateralization of the brain regions associated with language that can sometimes be observed among left-handers (Pujol et al., 1999; Josse and Tzourio-Mazoyer, 2004). Cerebral maturity was secured since anyone who met the inclusion criteria was necessarily over 18 years of age. Gender was not controlled, for purposes of educational relevancy (generalization). Participants signed a consent form and completed an fMRI contraindication screening form, and were compensated with $20 (CAD) and an image of their brain. Traveling expenses were also compensated when necessary. A total of 19 participants were recruited, but the data of two of them had to be discarded because of technical problems [final N = 17: 9 female and 8 male; mean age: 43 (SD = 7.4); mean years of teaching experience: 14 (SD = 5.5); 6 Ph.D., 7 M.A. and 4 B.Sc. (These last four were CÉGEP professors, teaching analytic and organic chemistry to 18-19 year-olds)].

Protocol and Ethics
After completing the forms approved by the ethics committees of the Université du Québec à Montréal and Unité de neuroimagerie fonctionnelle (UNF-Joint Research Ethics Committee of the Neuroimagerie, 13-14-023), participants were placed in an fMRI simulator, where they could experience the environment. After confirmation of their capability to undergo the experiment, they were accompanied to the real MRI machine and tested.

Materials
The cognitive task was designed with the E-Prime® software and was inspired by the one designed and used by Shtulman and Valcarcel (2012) and Shtulman and Harrington (2015). It consisted of 40 pairs of textual statements concerning various misconceptions in chemistry (e.g., "The strength of an acid depends on its concentration") whose scientific validity had to be assessed by responding with the index finger of the right hand (if the statement was judged to be true) or the middle finger of the right hand (false). Each pair included one statement that was congruent with intuition or common sense. This statement could therefore be properly evaluated by most people, whether novices or experts in chemistry. The other statement, on the other hand, was incongruent with intuition or common sense, and selecting the correct answer required overcoming a misconception. These statements were presented for a maximum of 10 s using a block protocol (Amaro and Barker, 2006). Each block was composed of four statements of the same condition (congruent or incongruent), followed by a 15 s pause (presentation of a fixation cross). In addition, two mixed blocks composed of congruent and incongruent statements were added to increase the variability of the presentation of stimuli and prevent participants from inferring the structure of the task. These blocks have not, however, been analyzed.
The task was divided into two sessions separated by a short break of approximately 1 min. Each session lasted approximately 5 min during which participants had to respond to 40 stimuli (4 blocks of 4 congruent statements, 4 blocks of 4 incongruent statements and 2 blocks of 4 mixed statements). All stimuli and blocks were presented randomly to avoid presentation bias. However, the first block of the session was always a mixed block in order to add a practice run and allow participants to get used to the task. Figure 3 illustrates a typical example of an incongruent block.
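The session structure described above can be sketched in a few lines of Python. This is only an illustration of the stated constraints (ten blocks of four statements, a mixed block always first, remaining blocks in random order); it is not the E-Prime script used in the study, and the function name `build_session` and the random seed are hypothetical.

```python
import random

STATEMENTS_PER_BLOCK = 4

def build_session(rng):
    """Return the block order for one session.

    The first block is always mixed (practice run); the remaining nine
    blocks (4 congruent, 4 incongruent, 1 mixed) are shuffled.
    """
    remaining = ["congruent"] * 4 + ["incongruent"] * 4 + ["mixed"]
    rng.shuffle(remaining)
    return ["mixed"] + remaining

# Hypothetical seed, used only to make the example reproducible.
session = build_session(random.Random(0))
n_stimuli = len(session) * STATEMENTS_PER_BLOCK  # 10 blocks x 4 statements
```

With two such sessions, each participant sees 80 statements, of which the 64 presented in pure congruent or incongruent blocks (32 pairs) enter the analysis.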
To ensure the reliability and validity of the analysis, stimuli were prepared in pairs and validated by two experienced chemists with a specialization in science education. They were written in such a way as to ensure that they differed only in congruity. Thus, pairs were included in the task only if they respected the following criteria:
1. Relevant to what is taught in chemistry classes;
2. Misconceptions to overcome in incongruent statements were identified as common (or frequent) in at least one science education research article;
3. The statements were scientifically valid;
4. The statements of the same pair dealt with the same scientific concept;
5. Statements from the same pair had the same familiarity and similar complexity of analysis; and
6. The congruent and incongruent statements of a same pair had the same readability and similar lengths [the length criterion was secured by using Flesch-De Landsheere's readability formula (De Landsheere and Mialaret, 1976)].
7. Also, the task (as a whole) had to contain the same number of true and false statements.
The entire set of statements is available in Appendix.

Analysis
Behavioral analysis was conducted with the SPSS 24® software, while the pre-treatment and analysis of the brain imaging data were conducted with the SPM8 package (Wellcome Department of Imaging Neuroscience, London, United Kingdom), which works with the MATLAB interface. Pre-treatment included correction of movement [based on Friston et al. (1996)], normalization [based on the segmentation method described by Ashburner and Friston (2000)] and smoothing [Gaussian filter of 6 mm (full width at half maximum)]. Determining the significance level of t-tests remains a complicated issue in brain imaging (Bennett et al., 2009). Indeed, the large number of voxels to be tested leads to a high risk of false positives (type 1 errors) with the classic threshold of p = 0.05. However, the correction of the threshold for multiple comparisons, such as the Bonferroni correction, can also result in an exceedingly strict threshold that does not allow all significant activations to be detected (type 2 errors). Unless otherwise indicated, the threshold of p = 0.001 (uncorrected) was therefore chosen to report the brain analyses of this study. The general linear model was used for modeling the data. More precisely, trial-related activity was modeled by convolving a vector of trial onsets with a canonical hemodynamic response function.
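To make the last modeling step concrete, the following minimal sketch builds one regressor by convolving trial onsets with a hemodynamic response function. It is written in plain Python rather than SPM8/MATLAB, and it uses a generic double-gamma shape whose parameters (shapes 6 and 16, undershoot ratio 1/6) are assumptions made for illustration, as are the repetition time of 2 s and the onset values; none of these figures come from the study.

```python
from math import exp, gamma

def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * exp(-t) / gamma(shape)

def double_gamma_hrf(t):
    """Illustrative double-gamma response (assumed parameters):
    a positive peak about 5 s after onset minus a smaller, later undershoot."""
    return gamma_pdf(t, 6) - gamma_pdf(t, 16) / 6

def build_regressor(onsets, n_scans, tr):
    """Convolve a train of unit impulses at `onsets` (in seconds) with the
    response function, sampled at each scan time i * tr."""
    return [
        sum(double_gamma_hrf(i * tr - onset) for onset in onsets)
        for i in range(n_scans)
    ]

# Hypothetical example: four trial onsets in one block, 30 scans, TR = 2 s.
block_regressor = build_regressor([0.0, 3.5, 7.0, 10.5], n_scans=30, tr=2.0)

# Response to a single trial, sampled every second, to inspect the shape.
single_trial = build_regressor([0.0], n_scans=30, tr=1.0)
```

In a general linear model, one such regressor per condition is fitted to each voxel's time series, and the contrast between the fitted weights (e.g., incongruent minus congruent) is what the t-tests evaluate.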
Since the objective was to characterize the brain activity of all participants when assessing incongruent stimuli, it was necessary to perform a second-level analysis (Masson et al., 2012). Indeed, the first-level analysis (intra-subject or fixed effects) provided a portrait of the brain activity of each of the participants. However, it was sensitive to extreme data and therefore could not be used to generalize the results to larger populations. To compensate for this shortcoming, second-level analysis (random effects analysis) was used to combine the brain activity of all participants and compare the conditions using paired t-tests. Thus, a brain region was significantly activated only if it was activated by most participants, since the participants reacting strongly to the task could not compensate for the weaker activity of the other participants. These two analyses, carried out using SPM8, were used for this research.
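The reason why a random effects analysis resists a single strongly reacting participant can be shown with a small sketch of a paired t-test computed at one voxel. The activation values below are invented for illustration (five hypothetical participants, arbitrary units), and the function is a plain-Python stand-in, not the SPM8 routine.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(condition_a, condition_b):
    """Paired t statistic and degrees of freedom for per-participant values."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical contrast values at one voxel for five participants.
congruent = [0.0, 0.0, 0.0, 0.0, 0.0]

# Case 1: one participant reacts very strongly, the others not at all.
incongruent_outlier = [10.0, 0.0, 0.0, 0.0, 0.0]

# Case 2: same mean difference (2.0), but shared by all participants.
incongruent_consistent = [2.0, 2.5, 1.5, 2.0, 2.0]

t_outlier, df = paired_t(incongruent_outlier, congruent)
t_consistent, _ = paired_t(incongruent_consistent, congruent)
```

Both cases have the same mean difference, but the between-participant variability in the first case keeps its t value low, whereas the consistent case yields a much larger one.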

Accuracy
The 17 participants achieved an average accuracy rate of 96.3% (SD = 3.44) when assessing congruent stimuli and 71.47% (SD = 11.40) for incongruent stimuli. A t-test for paired samples was then performed to compare the two conditions. This test highlighted a significant difference, t(16) = 9.757; p < 0.001, with a very large effect size (d = 2.95) according to Cohen (1988). The very high (96%) accuracy for congruent stimuli confirms that these stimuli were indeed in accordance with intuition or common sense and that our participants were indeed experts. Their rather high (71%) accuracy rate for incongruent stimuli also supports the hypothesis that our participants were able to overcome most widespread conceptual difficulties. However, for the remainder of the analyses, only pairs of stimuli for which both congruent and incongruent stimuli were answered correctly were considered. Indeed, the objective was not to formulate a judgment about the level of expertise of our participants, but to assess what happened when they did behave as true experts. From the 544 pairs of statements composing the entire data collected, 168 pairs (31% of total) containing either one of the 21 incorrectly answered congruent statements or one of the 155 incorrectly answered incongruent ones were therefore removed from the analysis. Thus, the number of usable stimuli for each of the categories was 376.
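The figures reported in this paragraph can be cross-checked with simple arithmetic. In the sketch below, the effect size is computed as the mean difference divided by the root mean square of the two standard deviations; the text does not state which formula was used, so this choice is an assumption, retained only because it reproduces the reported value. The count of pairs in which both statements were missed is likewise an inference from the reported totals.

```python
from math import sqrt

n_participants = 17
total_pairs = 544
removed_pairs = 168
wrong_congruent = 21
wrong_incongruent = 155

# 544 pairs over 17 participants: 32 analyzed pairs each (pure blocks only).
pairs_per_participant = total_pairs // n_participants

usable_pairs = total_pairs - removed_pairs
removed_share = removed_pairs / total_pairs

# 176 errors fall in 168 removed pairs, so some pairs had both statements wrong
# (inference, assuming every removed pair is counted once).
pairs_with_both_wrong = wrong_congruent + wrong_incongruent - removed_pairs

# Effect size under the assumed formula (not stated in the text).
mean_congruent, sd_congruent = 96.3, 3.44
mean_incongruent, sd_incongruent = 71.47, 11.40
pooled_sd = sqrt((sd_congruent ** 2 + sd_incongruent ** 2) / 2)
cohens_d = (mean_congruent - mean_incongruent) / pooled_sd
```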

Response Times for Correct Answers
The descriptive characteristics of the response times can be found in Table 1. On average, participants took 3,058 ms to correctly assess congruent stimuli and 3,631 ms to correctly assess incongruent stimuli. A significant difference was observed using a Wilcoxon rank test, z = 7.796; p < 0.001, whose effect size (r = 0.40) is medium.
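As a rough check of the reported effect size, the sketch below uses the common conversion r = z / sqrt(N). The text does not state the formula or the N that were used; taking N as the 376 usable pairs is an assumption, retained here because it reproduces the reported value.

```python
from math import sqrt

z = 7.796
n_pairs = 376  # assumed N: usable pairs of correctly answered statements

# Effect size under the assumed conversion r = z / sqrt(N).
r = z / sqrt(n_pairs)

mean_congruent_ms = 3058
mean_incongruent_ms = 3631
slowdown_ms = mean_incongruent_ms - mean_congruent_ms
relative_slowdown = slowdown_ms / mean_congruent_ms
```

Under these assumptions, correct answers to incongruent statements took 573 ms longer on average, or roughly 19% more time than their congruent counterparts.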

Brain Activation for Correct Answers
A first analysis sought to determine the regions of the brain that were significantly more activated when participants responded correctly to congruent stimuli (congruent > incongruent). No brain region stood out during this analysis. Then, a second analysis focused on the evaluation of incongruent stimuli (incongruent > congruent). The results of this analysis are reported in Table 2 and in Figure 4. The table presents the significantly activated regions, their coordinates in the MNI space, the number of voxels (k) in the region as well as the value of the t-test. Analysis of brain activity during the assessment of incongruent stimuli highlighted three significantly more active regions. The first was localized in the anterior region of the supplementary motor area (pre-SMA) (Figure 4A) located in the upper right and middle frontal cortex, the second activation was found to overlap the left ventrolateral prefrontal cortex (Figure 4B) and the left anterior insula, while the last region is located in the dorsolateral prefrontal cortex (Figure 4C).

DISCUSSION
The following discussion is organized around the initially formulated hypotheses. It also discusses other possible interpretations of the results as well as their limits, and extends to possible educational considerations and implications.

Behavioral Results (Hypothesis 1)
Unsurprisingly, differences in recorded accuracies between congruent (96%) and incongruent stimuli (71%) suggest that some misconceptions might persist even in experts. However, it naturally cannot be expected that experts are perfect and avoid all possible conceptual traps we set for them. Also, since  correctness is stricter (all possible cases must conform to the rule) than incorrectness (a single case can refute a rule), it is thus normal to see even experts not answering correctly from time to time. Indeed, it is unlikely that a participant will be familiar with all possible cases. It is also important to see that, while accuracies can inform us about the persistence of misconceptions within populations, they cannot inform us about the persistence of misconceptions within individuals. Indeed, by analyzing only accuracies, one cannot infer interpretations about the reasons why an answer could have been incorrect. Thus, accuracies cannot play a part in the confirmation or disconfirmation of hypothesis 1. Indeed, only correct answers that allow a presumption of overcoming the conceptual interference could support an eventual confirmation. So, when considering results from Table 1 and the reaction time results of the Wilcoxon rank test (which include only correct answers), the calculated difference between incongruent (3,631 ms) and congruent (3,058 ms) stimuli appears to support hypothesis 1 (longer latencies for incongruent > congruent) and thus the general coexistence/persistence claim. Indeed, if we are convinced by looking at the stimuli that the only difference that exists between the two conditions is the possible involvement (or not) of a misconception during the successful resolution, then the RT difference has to be attributed to overcoming (inhibiting) the presence and interference of the misconception. 
These results are similar to the ones obtained in other contexts, and with misconceptions in other fields, by testing RTs (Babai and Amsterdamer, 2008; Goldberg and Thompson-Schill, 2009; Babai et al., 2010; Shtulman and Valcarcel, 2012; Brault Foisy et al., 2015; Potvin et al., 2015a; Shtulman and Harrington, 2015).

Brain Imaging Results (Hypotheses 2, 3, and 4)
First, the absence of significantly more activated regions in the congruent > incongruent contrast (Table 2) suggests that our congruent stimuli indeed referred to more intuitive and easy-to-answer situations than our incongruent ones and posed no particular or recordable difficulty (their accuracy was also very high). It also suggests that they constitute a defendable "baseline" for the opposite contrast (incongruent > congruent), for which we considered the presence of possible misconceptions, and from which we now discuss the significantly activated regions. Results presented in Table 2 support the confirmation of hypotheses 2 and 3 because both the left VLPFC and left DLPFC show greater activations (hypotheses 2 and 3 were about a greater activation of these regions for incongruent > congruent). As argued earlier, the left VLPFC is indeed known to be activated during inhibition tasks (Tamm et al., 2002; Swick et al., 2008; Swick and Chatham, 2014; Nozari et al., 2016). The meta-analysis by Laird et al. (2005) on the Stroop task, which is a prototypical example of an inhibition task, underlines the frequent activation of the left VLPFC. The large-scale study on the Go/No-Go task by Steele et al. (2013) also made it possible to highlight an activation of that region when inhibition is presumed to help produce correct answers. Finally, Masson et al. (2014) also obtained significant activation of the left VLPFC while participants solved their inhibition task, in which the presumed distractor is a frequent misconception in electricity.
Several studies, however, seem to indicate that inhibitory control is more often lateralized in the right VLPFC (Aron et al., 2004; Levy and Wagner, 2011; Cai et al., 2014). Nevertheless, our activation of the left VLPFC could be explained by the type of stimuli used in this study and by the fact that this region is usually activated when semantic inhibition is necessary. Indeed, the classic Hayling task of semantic inhibition does activate the left VLPFC (Collette et al., 2001). Bilateral activation is also sometimes observed when visual stimuli can be easily recoded into verbal form (Wig et al., 2004). Similarly, the logic task used by Houdé et al. (2000) presented a left lateralization when analyzing cerebral activity following a teaching "warning" intervention. Houdé and Borst (2015) associated this activation in the left VLPFC with the strong verbal component of the task.
As for the left DLPFC, it is also commonly associated with inhibitory control and working memory (Miller and Cohen, 2001) as well as the ability to relate information from different brain regions (Blumenfeld and Ranganath, 2007; Blumenfeld et al., 2011). Several neuropsychological tasks have highlighted its activation when it is necessary to inhibit a tempting response (Bush et al., 1998; Liddle et al., 2001; Menon et al., 2001; Monchi et al., 2001; Laird et al., 2005). According to Heekeren et al. (2006), the left DLPFC is responsible for keeping relevant information in memory to perform a task and to allow the evaluation and comparison of information. It is thus not surprising to see it activated in the experiment since our participants could have had to keep in working memory the alternative and scientific conceptions in order to compare them and then select the correct answer. The existence of such a comparison process could also explain the observed lags in RTs.

ACC
Our results, however, cannot confirm hypothesis 4 (higher activation of the ACC for incongruent > congruent) because we did not observe any significant activation of the anterior cingulate cortex (ACC), similar to Brault Foisy et al. (2015). It is possible that the ACC was activated in incongruent trials, but not enough to pass the significance threshold, or that it was activated in both conditions. Indeed, the ACC is often associated with the overall expected value of control (EVC) (Shenhav et al., 2016), and not necessarily with mere conflict detection. It is also possible that a combination of both these explanations prevented us from observing the activation of the ACC.
Unexpected activated regions: Left anterior insula and pre-SMA. The role of the insula is difficult to define since this region can apparently be activated in a large variety of functions such as the processing of emotions, pain, and language, as well as in situations requiring cognitive control (Singer et al., 2009; Kurth et al., 2010; Chang et al., 2012). These authors argue that the insula could have the role of linking and integrating information from several systems. This region is also frequently activated during decision-making under uncertainty. Several authors have also observed its activation during Stroop and Go/No-Go tasks (Aron et al., 2004; Chang et al., 2012). Swick et al. (2011) also carried out a meta-analysis of 66 studies that used a Go/No-Go or Stop-Signal task and emphasized the important role of the insula and pre-SMA for success in these two tasks. Thus, given the uncertainties that prevail regarding its function, the insula, which is nevertheless sometimes involved in the resolution of problems requiring cognitive control, could hardly bring indisputable support to our general hypothesis.
As for the SMA, more precisely its anterior part (pre-SMA), it is frequently associated with the planning and temporal organization of conscious physical movements (Tanji and Shima, 1994; Gerloff et al., 1997; Lee et al., 1999; Nachev et al., 2007).
Even though several studies and meta-analyses argue for its involvement in inhibitory control tasks (Derrfuss et al., 2005; Simmonds et al., 2008), it appears plausible that the interference of the misconceptions merely leads to the need to change first-inclination responses and, consequently, to change the finger used to respond. Thus, participants might merely have had to plan a new movement in order to press the correct button.

Other Possible Interpretations
An alternative interpretation of our results could be derived from Nelson et al. (2007), who also reported activation in the left VLPFC and DLPFC for the incongruent > congruent contrast performed on the results obtained with their evaporation task. According to their interpretation, success in their task could be attributed to working memory combined with the retrieval of semantic information. Indeed, the left VLPFC is sometimes associated with semantic memory (i.e., knowledge of facts, the meaning of words and the properties of objects), even if its precise role in such cognitive processes is still debated (McDermott et al., 2003; Badre et al., 2005).
This region would also contribute to the success of tasks requiring control of semantic processes (Badre and Wagner, 2007), although researchers still argue about the possibility of a division of functions according to the regions of the left VLPFC (Thompson-Schill and Botvinick, 2006). According to Badre and Wagner (2007), the anterior part of the left VLPFC (BA 47, pars orbitalis) could be responsible for controlling access to conceptual representations, while the middle part (BA 45, pars triangularis) might be activated after retrieval in order to resolve the competition between active representations. In contrast, Thompson-Schill and Botvinick (2006) argue that these two processes can very well be integrated into a single model where the retrieval of semantic representations necessarily involves a selection between several representations.
We believe, nonetheless, that the inhibition hypothesis remains the best explanation for our results. Given the nature of our task, it is impossible to precisely identify the activation of the left VLPFC according to the separation proposed by Badre and Wagner (2007). On the other hand, this activation supports the proposition of Thompson-Schill and Botvinick (2006) and of Nozari et al. (2016) in which the selection between two competing representations is integrated into retrieval processes. This selection between two representations can actually be associated with a form of semantic inhibition. In addition, unlike Nelson et al. (2007), the results of our study revealed activation in the pre-SMA, which can be associated with motor inhibition. Activating this region suggests that participants had to modify their movements to inhibit their intuitive responses. The cognitive work carried out by the participants, therefore, might very well go beyond the mere retrieval of semantic information.

Limits
Among the limits of this research is, of course, the small number of participants (N = 17). Indeed, it always remains impossible to conduct studies with complete generalization power, even if the studied population is as small as the number of chemistry professors. Given its size, our sample could therefore be unevenly tainted in ways we did not think about, thus threatening a possible generalization to all chemistry professors. For example, our entire sample is composed of culturally rather uniform (French-Canadian) professors. It is also composed exclusively of voluntary participants. And thus, while we are unable to imagine in which direction such characteristics could have skewed our results, we cannot rule out the possibility that they did. Therefore, replication through similar, converging studies would be needed.
Another limit is that, even though our results reached statistical significance, this study did not present results with corrected thresholds, which is increasingly becoming a standard in the field. It is possible that the small number of participants or the reduction of considered pairs of stimuli for analysis (we could only consider pairs when both incongruent and congruent stimuli were answered correctly) explains this limitation, at least in part. However, it is also possible that the very high automatization of scientific knowledge in chemistry professors made it more difficult for us to detect the interference of misconceptions. Indeed, selecting chemistry experts was a bolder methodological choice that made reaching significance thresholds more challenging and, in the event of positive results, more educationally interesting.

Possible Educational Implications
We believe that the general confirmation of our broad inhibitory control hypothesis suggests that, as has been seen in other subfields of science (Shtulman and Valcarcel, 2012), misconceptions and scientific conceptions can coexist in chemistry professors' minds. Even though they have developed the highest level of expertise in their field, chemistry professors appear to have to suppress ordinary and widespread misconceptions in order to produce correct answers to chemistry problems. Our results therefore support a definition of expertise that considers inhibitory control mechanisms as more central, instead of a definition that focuses only on the availability and appropriate mobilization of scientific concepts and models.
In this context, chemistry professors might have to develop better vigilance about the messages they address to learners. Indeed, under time pressure or other constraints on the cognitive processing of information (Kelemen et al., 2012), the cognitive function of inhibition might temporarily be impaired and teachers might then make conceptual errors (explicit or passive) during teaching and thus mislead their students.
Chemistry teachers could also possibly choose to develop a somewhat different perspective on learning. Indeed, in a conceptual prevalence perspective, the production of correct answers is no longer seen as a definitive deconstruction, rejection or replacement, but more as an overtaking operation. Thus, setbacks in students' observable gains are more easily acceptable, and could possibly even be considered as normal, unless a more durable prevalence of scientific conceptions can be obtained through practice and automatization. The production of correct answers could then merely be considered as a first step in the development of durable expertise. Thus, this could be a game changer in the consideration of students' errors.
The role of cognitive conflicts in learning could also be reconsidered. Indeed, RTs and brain activity record significant differences when, before performing a cognitive task, participants are given warnings about the possible presence of distractors, conceptual traps, or told that people usually make many errors during its resolution (Houdé et al., 2000; Babai et al., 2015). Even though the optimal utilization of such warning interventions is not perfectly understood, novices appear to trigger more expert-like functions when they are used.
Finally, we believe that the confirmation of our hypothesis supports the use of conceptual change models that recognize and build on the possible coexistence of multiple conceptions. Many such models (Ohlsson, 2009; Potvin, 2013; Dawson, 2014; Kendeou et al., 2016; Potvin and Cyr, 2017; Nadelson et al., 2018) have been proposed and suggest varied approaches, but all of them consider the importance of monitoring the status of initial conceptions, even after students become capable of good performance. For further discussion on what it could mean to teach science while assuming coexistence, we refer the reader to Potvin (2017).

CONCLUSION
This research initiative tested the coexistence/persistence claim according to which misconceptions and scientific conceptions can coexist in experts' minds. The experiment was conducted with chemistry professors, considered experts in the field in this study. Based on the results presented, namely, longer response times and more activation in the left DLPFC and VLPFC during the incongruent trials compared to the congruent ones, we believe that our general hypothesis is supported.
We also believe that, even though we did not find the activation of the ACC and did not reach corrected thresholds, our results are nevertheless convergent and statistically significant and suggest that chemistry learning, while presenting distinctions compared to learning in other fields of science, does not completely escape the coexistence phenomenon. Therefore, it could be assumed that misconceptions are not completely erased or modified during chemistry learning (possibly at any level) and that they could still interfere, even in unconscious or implicit ways, with teaching and learning efforts. In this context, pedagogical interventions and models that assume coexistence of conceptions should be favored. Further comparative research about the respective merits of such models might thus be needed and could aid in the development of future educational research programs.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Comité d'Éthique Mixte de la Recherche de l'Institut Universitaire de Gériatrie de Montréal, Québec, Canada. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
SM directed and supervised all aspects of the project with the help of GM-R and PP. GM-R developed the task with the help of CC and collected and analyzed the data. PP wrote the article with the help of all authors. All authors contributed to the article and approved the submitted version.