# INSIGHT AND INTUITION – TWO SIDES OF THE SAME COIN?

EDITED BY: Michael Öllinger, Kirsten G. Volz and Eörs Szathmáry

PUBLISHED IN: Frontiers in Psychology

#### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714

ISBN 978-2-88945-519-5

DOI 10.3389/978-2-88945-519-5

# INSIGHT AND INTUITION – TWO SIDES OF THE SAME COIN?

Topic Editors:

Michael Öllinger, Parmenides Foundation, Germany

Kirsten G. Volz, University of Tübingen, Germany

Eörs Szathmáry, Parmenides Foundation, Germany

Image: Natali\_ Mis/Shutterstock.com

Insight and intuition might be the most mysterious and fascinating fields of human thinking and problem solving.

They are different from standard and analytical problem solving accounts and provide the basis for creative and innovative thinking.

Until now they have been investigated in separate academic fields with differing traditions. This eBook therefore attempts to bridge the gap between the two processes and to provide a more integrated perspective. Several experts address the underlying cognitive processes and provide a broad spectrum of new empirical, theoretical, and methodological insights.

Citation: Öllinger, M., Volz, K. G., Szathmáry, E., eds. (2018). Insight and Intuition – Two Sides of the Same Coin? Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-519-5

# Table of Contents

*Editorial: Insight and Intuition – Two Sides of the Same Coin?*

Michael Öllinger, Kirsten Volz and Eörs Szathmáry

# CHAPTER 1

# CONCEPTUAL ASPECTS

*Intuition and Insight: Two Processes That Build on Each Other or Fundamentally Differ?*

Thea Zander, Michael Öllinger and Kirsten G Volz

Michael Öllinger and Albrecht von Müller

# CHAPTER 2

# PHENOMENOLOGICAL ASPECTS

*Insight is not in the Problem: Investigating Insight in Problem Solving Across Task Types*

Margaret Elizabeth Webb, Daniel R Little and Simon James Cropper

*Intuitive Feelings of Warmth and Confidence in Insight and Noninsight Problem Solving of Magic Tricks*

Mikael Ringstad Hedne, Elisabeth Norman and Janet Metcalfe

*What About False Insights? Deconstructing the Aha! Experience Along its Multiple Dimensions for Correct and Incorrect Solutions Separately*

Amory H. Danek and Jennifer Wiley

# CHAPTER 3

# NEUROCOGNITIVE ASPECTS

*Cognitive Architecture With Evolutionary Dynamics Solves Insight Problem*

Anna Fedor, István Zachar, András Szilágyi, Michael Öllinger, Harold P. de Vladar and Eörs Szathmáry

*Neural Correlates of Learning From Induced Insight: A Case for Reward-Based Episodic Encoding*

Jasmin M. Kizilirmak, Hannes Thuerich, Kristian Folta-Schoofs, Björn H Schott and Alan Richardson-Klavehn

# CHAPTER 4

# CREATIVITY

*Incubation and Intuition in Creative Problem Solving*

Kenneth James Gilhooly

*The Role of Intuition in the Generation and Evaluation Stages of Creativity*

Judit Pétervári, Magda Osman and Joydeep Bhattacharya

*A Neurocognitive Framework for Human Creative Thought*

Arne Dietrich and Hilde Haider

# CHAPTER 5

# APPLICATIONS


# Editorial: Insight and Intuition – Two Sides of the Same Coin?

Michael Öllinger<sup>1</sup>\*, Kirsten Volz<sup>2</sup> and Eörs Szathmáry<sup>1</sup>

<sup>1</sup> Parmenides Foundation, Pullach im Isartal, Germany, <sup>2</sup> Universität Tübingen, Tübingen, Germany

Keywords: insight, intuition, evolution, creativity, representational change

**Editorial on the Research Topic**

#### **Insight and Intuition – Two Sides of the Same Coin?**

When we prepared this research topic, we had the strong feeling that there is a need to systematically investigate the relationship between insight and intuition. Although there have been approaches attempting to link these concepts (e.g., Bowers et al., 1990), we found several blind spots. Particularly, we missed a coherent model or at least well-defined and proper cognitive processes which unambiguously demarcate insight from intuition. We now have evidence from empirical and theoretical contributions which shed light on those blind spots.

All contributions agree that intuition and insight are based on distinguishable cognitive processes, but they emphasize and detail partly quite different aspects. We are positive that our research topic will help to draw a clearer and more coherent picture, and inspire further research.

From a conceptual point of view, Zander et al. proposed that intuition is characterized by an experience-based and continuous process, whereas insight relies on a discontinuous process. An insight is realized by the problem solver all of a sudden, as if coming "out of the blue." Given this assumption, they aimed at developing a paradigm in which insight and intuition could be investigated by the same tasks. They identified semantic coherence tasks as an ideal candidate for this challenge.

In the same vein, Zhang et al. proposed the details of an experimental procedure which addresses the underlying processes of insight and intuition within a unified experimental paradigm. They also analyzed similarities and differences between these two processes, focusing on the different roles that tacit knowledge plays in each. Both Zhang et al. and Zander et al. stressed the importance of a single paradigm that allows both processes to be investigated, in order to uncover the significant differences and similarities between insight and intuition.

Öllinger and von Müller proposed that coherence building and search might structure the problem-solving process, yielding a stage model. In their proposal, coherence building acts as the basis for both intuition and insight. For the latter, however, a change of the initial coherent representation is crucial. The change is driven by the realization, after repeated failure, that the initial problem representation cannot lead to the solution.

Edited and reviewed by: Sumitava Mukherjee, Ahmedabad University, India

#### \*Correspondence:

Michael Öllinger michael.oellinger@parmenides-foundation.org

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 28 June 2017 Accepted: 20 April 2018 Published: 14 May 2018

#### Citation:

Öllinger M, Volz K and Szathmáry E (2018) Editorial: Insight and Intuition – Two Sides of the Same Coin? Front. Psychol. 9:689. doi: 10.3389/fpsyg.2018.00689

Pétervári et al. investigated the relationship between intuition and creativity. They reviewed the relevant literature and detected a strong link between the two concepts. The authors decomposed creativity into two separate processes, idea generation and idea evaluation, and linked intuition to the stage of idea generation. This investigation also highlighted an obvious link to dual process models detailing the interplay of exploitation and exploration, such as those used in AI and in various fields of biology (evolution and development), and might inform further research in these fields.

In sum, it seems that insight and intuition play different roles at different stages of the problem-solving process, differing by information integration, generation of hypotheses, search and eventually the change of representations. However, at this point empirical questions still remain: "What particular processes are underlying intuition and insight?" and "How could they be investigated?". The following contributions elaborate on these questions.

Hedne et al. tested whether subjective feelings of intuition are predictive of successful problem solving. During problem solving, participants were asked to make metacognitive judgments. Insight trials showed a higher accuracy than non-insight trials. Based on their findings, the authors suggested that insight relies more on unconscious processes than non-insight problem solving. Insightful attempts are characterized by a deeper understanding of the solution. The authors speculate that, at a metacognitive level, problem solvers become aware that insightful solutions are more complete and better understood than non-insight solutions.

Complementarily, Gilhooly showed the importance of the unconscious work hypothesis for insightful solutions in an extensive literature review on incubation and creativity. The author contrasted the unconscious work hypothesis with alternative explanations and demonstrated convincingly that unconscious work becomes a driving factor and the main process for intuition and new insight during incubation, a phase of problem solving in which participants do not make deliberate solution attempts.

The work of Dietrich and Haider proposes a new cognitive architecture which provides the key ingredients for answering the question: Why are our brains so creative? They detailed 10 foundational concepts, such as evolutionary algorithms, which show the importance of recombining these concepts in a new and creative way. Prediction, scaffolding and competition of representations provide a dynamic which is sufficient for generating new candidate solutions, which are tested against a fitness function. Importantly, the authors also pointed out the open problems and further research questions which have to be addressed in the future to complete the evolutionary picture of creativity.

Beyond mere phenomenology, Fedor et al. implemented a cognitive architecture which is able to solve a difficult insight problem. The four-tree insight problem requires overcoming an ill-defined, over-constrained problem representation. Doing so leads to a larger search space which contains the solution. The framework is based on Darwinian Neurodynamics. The model evolved candidate solutions by replicating and evaluating neural representations in parallel. The authors emphasize that this parallel search must happen in the unconscious domain. They convincingly demonstrated that the model behaved comparably to human problem solvers.

Another key feature of insight is the Aha! experience. Little is known about the exact nature of this subjective experience. Clarifying the underlying processes might be crucial, since almost all neuroscientific studies rely on the pre-supposed relationship between Aha! and correct and insightful solutions.

Danek and Wiley scrutinized the question of whether the Aha! experience is a reliable indicator of a correct and insightful solution. The authors proposed a multicomponent construct which decomposes the Aha! into distinct facets (suddenness, certainty, surprise, pleasure, etc.). Their study indicated that Aha! experiences were also found for incorrect solutions, and that correct solutions differed behaviorally (e.g., by faster solution times) from incorrect solutions.

Webb et al. were interested in the relationship between accuracy and Aha! ratings across problem types [insight, noninsight, compound remote associates (CRA)]. They found that classical insight problems elicited stronger Aha! experiences than hybrid types (like CRA), or non-insight problems. They demonstrated that an Aha! is elicited during an insightful problem-solving process and linked to accuracy.

Kizilirmak et al. shed light on the neural correlates which occur when insightful solutions are induced. Participants solved compound-remote-association tasks while lying in an fMRI scanner. The authors proposed that induced insight is the result of an interplay of detecting novel congruent schemata (medial prefrontal cortex) and the left hippocampus, which forms a novel meaning by the interrelatedness of familiar items. Additionally, positive memory effects of induced insight were found 24 h after the learning phase.

Finally, two contributions addressed the question of how interventions affect intuition and insight. The first study trained thinking on contraries and the second addressed the interplay of intuitive processes and depression.

Branchini et al. addressed the question: How does training which fosters thinking on contraries influence the solution of insight problems? They applied the training either to small groups or individually. The main finding was that trained persons in small groups focused more strongly on problem elements that were relevant for the solution. The study provides potential evidence that group processes might help to overcome self-imposed constraints.

Remmers and Michalak scrutinized the impact of depression on intuition. From their review of the relevant literature, they provided evidence of an impaired decision-making process in persons who suffer from depression. They stated that depression impedes coherent and holistic representations, resulting in unsatisfying states. Depression increases the likelihood of dysfunctional solutions that have negative behavioral consequences. They discussed potential treatments (e.g., metacognitive training) which might improve beneficial intuitive processes in depressed patients and reduce maladaptive intuitive processes.

In summary, these contributions to the research topic demonstrate convincingly how intuition and insight could be demarcated and modeled and provide a potentially productive paradigm for further research on this issue. There are still open questions: "Are insight and intuition different stages in the stream of problem solving (two sides of the same coin), or do the two differ in their underlying processes, such as discontinuous vs. continuous?". An exciting period of further research lies ahead.

# REFERENCES

Bowers, K. S., Regehr, G., Balthazard, C., and Parker, K. (1990). Intuition in the context of discovery. Cogn. Psychol. 22, 72–110. doi: 10.1016/0010-0285(90)90004-N

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

# AUTHOR CONTRIBUTIONS

MÖ wrote the editorial. KV and ES reviewed and edited the manuscript.

Copyright © 2018 Öllinger, Volz and Szathmáry. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Intuition and Insight: Two Processes That Build on Each Other or Fundamentally Differ?

Thea Zander<sup>1</sup>\*, Michael Öllinger<sup>2,3</sup> and Kirsten G. Volz<sup>4</sup>

<sup>1</sup> Department of Psychology, University of Basel, Basel, Switzerland, <sup>2</sup> Parmenides Foundation, Munich, Germany, <sup>3</sup> Department Psychology, Ludwig-Maximilians-Universität München, Munich, Germany, <sup>4</sup> Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany

Intuition and insight are intriguing phenomena of non-analytical mental functioning: whereas intuition denotes ideas that have been reached by sensing the solution without any explicit representation of it, insight has been understood as the sudden and unexpected apprehension of the solution by recombining the single elements of a problem. By face validity, the two processes appear similar; according to a lay perspective, it is assumed that intuition precedes insight. Yet, predominant scientific conceptualizations of intuition and insight consider the two processes to differ with regard to their (dis-)continuous unfolding. That is, intuition has been understood as an experience-based and gradual process, whereas insight is regarded as a genuinely discontinuous phenomenon. Unfortunately, the two processes have been investigated differently and without much reference to each other. In this contribution, we therefore set out to fill this lacuna by examining the conceptualizations of the assumed underlying cognitive processes of both phenomena, and by also referring to the research traditions and paradigms of the respective field. Based on early work put forward by Bowers et al. (1990, 1995), we refer to semantic coherence tasks consisting of convergent word triads (i.e., the solution has the same meaning to all three clue words) and/or divergent word triads (i.e., the solution means something different with respect to each clue word) as an excellent kind of paradigm that may be used in the future to disentangle intuition and insight experimentally. By scrutinizing the underlying mechanisms of intuition and insight, we hope with this theoretical contribution to stimulate the experimental studies that are needed but still lacking, and to initiate scientific cooperation between the research fields of intuition and insight, which are currently still separated from each other.

Keywords: intuitive decision making, insight problem solving, continuity, discontinuity, non-analytical solution processes

# INTRODUCTION

There are situations in which decision makers arrive at an idea or a decision not by analytically inferring the solution but by either sensing the correct solution without being able to give reasons for it, or by realizing the solution all of a sudden without being able to report on the solution process. Roughly, the former phenomenon has been called intuition, the latter insight. Both have fascinated the public as well as the scientific audience.

#### Edited by:

Snehlata Jaswal, Indian Institute of Technology Jodhpur, India

#### Reviewed by:

Michał Wierzchoń, Jagiellonian University, Poland

Elisabeth Norman, University of Bergen, Norway

#### \*Correspondence:

Thea Zander thea.zander@unibas.ch

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 02 May 2016 Accepted: 31 August 2016 Published: 13 September 2016

#### Citation:

Zander T, Öllinger M and Volz KG (2016) Intuition and Insight: Two Processes That Build on Each Other or Fundamentally Differ? Front. Psychol. 7:1395. doi: 10.3389/fpsyg.2016.01395

Here are two historical cases that illustrate the two phenomena (Gladwell, 2005; Mclean, as cited in Klein and Jarosz, 2011): The first is known as the Getty kouros and happened to the J. Paul Getty Museum in Los Angeles at the end of the 20th century. The museum was offered the opportunity to add an over-life-sized statue in the form of a kouros, allegedly from Ancient Greece and thus worth several million, to its art collection. Before the contract could be concluded, several experts set out to verify the authenticity of the statue and its origin, using a substantial number of high-tech methods for their analyses. After a year of thorough inspection, the experts reached the conclusion that the statue was authentic. At the same time, the former curator of the Metropolitan Museum of Art in New York, by chance, cast a glance at the artwork and spontaneously raised doubts regarding its authenticity. Thereupon, other men of renown who were asked for their spontaneous assessment of the kouros also reported that they felt that something was wrong with it, without being able to tell the reason for this impression (cf. Gladwell, 2005). Interestingly, up to now, it has not been entirely clarified whether the statue stems from Ancient Greece or whether it is a modern forgery. Yet, the curator, who instantaneously "felt" that something was wrong and acted upon this impression although he was not able to name a specific reason, is a paramount example of what it means to have an intuition that is strong enough to act upon.

For an example of a sudden insight into the solution of a complex problem, consider Wagner Dodge, a smokejumper who survived the Mann Gulch Fire in August 1949 (Mclean, as cited in Klein and Jarosz, 2011). On a very hot day, a fire broke out in Mann Gulch, a canyon near Helena in Montana. Sixteen smokejumpers were flown close to the fire in order to extinguish it. After they had parachuted out of the aircraft, they realized that the fire was much worse than expected: They faced an uncontrollable blaze. The biggest problem was that they were in danger of being entrapped by the fire. They could not escape and thus their lives were immediately threatened. For a moment they were desperately helpless and bustled around without a plan. They faced an impasse: well-known routines would not get them any further, and they might have been caught in a mental set, that is, the tendency to try to solve a problem based on previously successful solution attempts to similar kinds of problems that are inefficient or cannot be transferred to the problem at hand (see Luchins and Luchins, 1959, as well as Öllinger et al., 2008). After a while, all at once, Wagner Dodge had the sudden idea to ignite an "escape fire" ahead of the group (i.e., he had a sudden aha-experience). Although he had never heard of such a possibility, he abruptly realized that if he could quickly clear an area of vegetation, the blaze would have nothing to feed on when it arrived at the cinders. He put his idea into action, ignited an additional fire and stepped into the middle of the newly burnt area. In this way, he saved his life; the other smokejumpers, who did not trust him, lost their lives in the fire. Today, escape fires belong to the standard practice of wildland fire services (Mclean, as cited in Klein and Jarosz, 2011).

Based on these examples, both phenomena – intuition and insight – may be conceived of as non-analytical thought processes that result in certain behavior that is not based on an exclusively deliberate and stepwise search for a solution. Non-analytical thought means a thought process in which no deliberate deduction takes place: individuals are not engaged in the consecutive testing of the obvious and/or typical routes to solution that define deliberate analysis. Instead, intuitions are characterized by the decision maker feeling out the solution without an available, tangible explanation for it; insights are characterized by the fact that the solution suddenly and unexpectedly pops into the mind of the decision maker or problem solver and is instantaneously self-evident. Despite these apparent similarities of the two phenomena, intuition and insight have been conceptualized rather differently in the scientific literature up to now with regard to the underlying cognitive mechanisms as well as to the experimental designs routinely used to gain empirical evidence. The aim of our contribution is therefore to scrutinize the similarities and differences of the cognitive mechanisms underlying intuition and insight by drawing on and extending early ideas by Bowers et al. (1990, 1995). The central question is whether intuition and insight are two qualitatively distinct phenomena, appearing similar only by face validity, or whether they are indeed similar/related and may only unfold on different levels of processing. To address this question, we draw on the latest contributions in the field and include recent research findings that were not available at the time of Bowers et al. (1990, 1995).

First, we will give an overview of predominant definitions of intuition and insight from a cognitive-psychological perspective. Second, we will elaborate on the underlying cognitive processes of both phenomena, thereby aiming to pin down similarities and differences. Both similarities and differences will be addressed against the background of the research history of intuition and insight as well as in light of predominant experimental paradigms that have been used to investigate the two phenomena. The paper ends by outlining open questions and highlighting future directions in scientific research that may advance our understanding of the underlying cognitive processes of intuition and insight (as well as of their relatedness).

# DEFINING INTUITION AND INSIGHT

# Theoretical Characterization of Intuition

Although most people "intuitively" know what an intuition is, the scientific community is split over its definition as well as its conceptualization. Despite this disagreement, common ground is that intuition is an experience-based process resulting in a spontaneous tendency toward a hunch or a hypothesis (Bowers et al., 1990; Volz and Zander, 2014). Taking all major definitions into consideration, it is possible to distil certain characteristics that prominent definitions of intuition have in common (Glöckner and Witteman, 2010; Volz and Zander, 2014).

Firstly, there is the aspect of non-conscious processing, which means that intuition occurs with very little awareness about the underlying cognitive processes so that people are mostly not able to report on these. Yet, intuitive processes can partly or completely be made conscious at some point in the entire judgmental process (e.g., Gigerenzer, 2008). In this regard, intuitive processing is not directly conscious or non-conscious, but can be viewed as reflecting cognitive processing on the fringe of human consciousness (Mangan, 1993, 2001, 2015; Norman, 2002, 2016; Price, 2002; Norman et al., 2006, 2010). Secondly, there is the aspect of automaticity or uncontrollability. Intuitive processing appears in the form of spontaneous and instantaneous ideas or hunches that cannot be intentionally controlled, in the sense that they can be neither intentionally evoked nor ignored (e.g., Topolinski and Strack, 2008). The unintentional nature of intuition implies that intuition comes along without attentional effort and thus intuitive processing has been described as fast and effortless (e.g., Hogarth, 2001). Thirdly, there is the aspect of experientiality. Intuitive processing is based on tacit knowledge that has been acquired without attention during a person's life and is thus fueled by it (e.g., Bowers et al., 1990). In combination these aspects result in the subjective experience of "knowing without knowing why" as Claxton (1998, p. 217) put it. Lastly, there is the aspect of the initiation of action. The non-conscious, experience-based, and unintentional process finally results in a strong tendency toward a hunch, which serves as a go-signal that is strong enough to initiate action. As a result, people act in accordance with their intuitive impression or feeling (e.g., Gigerenzer, 2008). For a more detailed overview of the different aspects, consult Glöckner and Witteman (2010) or Volz and Zander (2014).

In line with these aspects, Gigerenzer (2008) has focused, inter alia, on the experiential basis of intuition and states that intuition may hardly be possible without pre-existing knowledge and experiences. To revert to the example of the Getty kouros, the interplay of the given (visible) information was dissonant for someone who had seen lots of antique statues before; a beginner to the field may have arrived at a completely different judgment. By intuitively apprehending the situation, the curator relied on specific long-term-memory content that had been primarily acquired by studying, analyzing, and reflecting about a great number of statues resulting in associative and unattended learning. Volz and Zander (2014) refer to this kind of memory content as tacitly (in)formed cue-criterion relationships. On this view, different environmental cues can have different predictive power with respect to the criterion at hand; the situational validity of the cues will moderate whether the cue is used outright. In the above example, the curator judged the grade of authenticity of the kouros (criterion) from the subjective impression that the statue's outer appearance had on him (cue). By doing this, the curator could not only rely on the given information (i.e., the visible kouros), but had to non-consciously activate further relevant knowledge from memory, that is to activate associatively learned cue-criterion relationships. Thus, the mental representation constructed during intuitive processing goes beyond the existing, perceivable information. Consequently, the curator's feeling of unease when having a look at the statue resulted from an incomplete cue-criterion relationship that was taken as diagnostic for the assessment of the statue's authenticity.

In addition to the aspect of experientiality and the unconscious read-out of implicitly learned cue-criterion relationships, Gigerenzer (2008) describes intuition as felt knowledge that aids decision making not only in cases in which the decision maker already has a large amount of prior experience with a particular situation, but also when time and cognitive capacity are limited. According to the author, shadowy situations – caused either by a blurry sensory input that is barely detectable, or by the temporary unavailability of necessary information about the individual decisional components, which does not allow for foreseeing all consequences of a decision – foster intuitive processing. Intuition then manifests itself in the use of certain heuristics that may form highly successful cognitive shortcuts (Gigerenzer, 2008; Gigerenzer and Gaissmaier, 2011).

# Insight and Aha-Experience

In contrast to the above elaborations on intuition, the term insight has been used to refer to the sudden and unexpected understanding of a previously incomprehensible problem or concept. In this sense, Jung-Beeman et al. (2004, p. 506) explicate the nature of insight as "the recognition of new connections across existing knowledge." Sometimes the solution to a difficult problem may suddenly pop into mind and the decision maker or problem solver may immediately recognize the complex connections, as illustrated earlier in the episode of the smokejumper Wagner Dodge. Problems seem to be processed and solved by re-grouping or re-combining (i.e., re-structuring) existing information in a new way so that self-imposed constraints can elegantly be relaxed (Duncker, 1935; Wertheimer, 1959; Ohlsson, 1992). Wagner Dodge had prior knowledge: For instance, he knew how fires can most commonly be extinguished and that fires need vegetation or some other foundation to burn on. Furthermore, he knew about terrestrial conditions, and most importantly, he knew that smoke and fire could kill him. The solution to the problem occurred when he non-consciously combined all pieces of knowledge with each other in a new way so as to avoid death by fire.

Such insightful solutions are associated with privileged storage in long-term memory, similar to single-trial learning. Recent studies observed a memory advantage for items that were solved by insight compared with non-insight solutions (Danek et al., 2013) as well as compared with items that were not self-generated (Kizilirmak et al., 2015). It is therefore very likely that Wagner Dodge never forgot how to ignite escape fires in the wild.

Yet, it has to be emphasized that an exact definition of the term insight has proven to be difficult, not least because the term has been used in many different ways in problem-solving research. Another hindrance is that it is very difficult to empirically operationalize the psychological construct of insight (Knoblich and Öllinger, 2006), a problem similar to that in research on intuition. To date, researchers disagree as to whether there are certain necessary and/or sufficient conditions to determine whether an insight has occurred. For example, due to the absence of objective physiological markers indicating the occurrence of an insight, mainly reports in the form of the subjective aha-experience have been used ex post to determine whether an insight has occurred during the solution process of a certain problem (e.g., Gick and Lockhardt, 1995; Bowden et al., 2005; Danek et al., 2013). Danek et al. (2013, p. 2) state that the aha-experience is "the clearest defining characteristic of insight problem solving." Topolinski and Reber (2010) define the aha-experience as the sudden and unexpected understanding of the solution, which comes with ease and is accompanied by positive affect as well as confidence in the truth of the solution. Given scientific endeavors to (objectively) pin down whether an insight has occurred, it can be summarized that insight and aha-experience have been equated. However, to date, there is disagreement as to whether (a) every insight is accompanied by an aha-experience, and (b) aha-experiences can only accompany insights and never occur for presented solutions (i.e., solutions that are not generated by the individual herself; cf. Klein and Jarosz, 2011; Kizilirmak et al., 2015).

In order to help clarify the conceptual muddle on insight, Knoblich and Öllinger (2006) proposed a classification of insight on three dimensions: first, on a phenomenological dimension, insight is opposed to a systematic and stepwise solution approach. Instead, it can be described as the sudden, unintended, and unexpected appearance of a solution idea, which is accompanied by a strong emotional component – the subjective and involuntary aha-experience. Second, on a task dimension, the literature on insight distinguishes between predefined insight problems and non-insight problems, with insight problems requiring sudden solution ideas and non-insight problems requiring a rather incremental solution approach. If such an insight problem is solved, it is inferred that an insight has very likely taken place. For example, the nine-dot problem (Maier, 1930), the eight-coin problem (Ormerod et al., 2002), and the candle problem (Duncker, 1935) belong to such classical insight problems. However, a disadvantage of this distinction is that there are no unique criteria for an insight problem, and most of these problems could be solved with or without having an insight (Öllinger et al., 2014); most of the proposed criteria refer back to the subjective experience of aha, which has led to a circular definition of insight and insight problems. To circumvent this disadvantage, Bowden et al. (2005) have suggested using a class of problems that can be solved either with insight or without insight. Last, on a process dimension, recent research is concerned with the underlying cognitive mechanisms of insight and how these differ from non-insight problem solving. The predominant assumption here is that the non-conscious cognitive process of a mental set shift enables a changed representation of the problem's elements (Ohlsson, 1992, 2011), which in turn leads to a sudden insight into the solution.
For instance, in the nine-dot problem, the sudden realization that moves beyond the virtual nine-dot square are possible may lead to the relaxation of the perceptually driven boundary constraints and thus to a representational change of the problem space, which subsequently enables insightful solutions (for a detailed explanation of the three dimensions, consult Knoblich and Öllinger, 2006)<sup>1</sup>.

# DIFFERENT RESEARCH TRADITIONS OF INTUITION AND INSIGHT

After having defined both cognitive phenomena, intuition and insight, it becomes obvious that both share a similarity in terms of persisting conceptual difficulties. Moreover, with regard to the subjective phenomenology they reveal a distinct picture: While intuition means to non-consciously understand environmental patterns and to act in accordance with this first impression without being able to justify it (Bowers et al., 1990), insight problem solving deals with situations in which a solution pops into a person's mind out of the blue (Durso et al., 1994). Yet, both processes can be viewed as nonanalytical solution or thought processes, where no incremental search takes place. In the following, we will critically elaborate on the cognitive processes assumed to underlie intuition and insight. The starting point will be a few words on the research history of both, which helps to explain why the two fields of research have developed independently over time.

# The Single- vs. Dual-System View on Intuition

Intuition research has been deeply integrated in research on judgment and decision making that investigates how humans decide between alternatives and judge situations (Plessner et al., 2008). Yet this took some time, during which intuition had been neglected due to its elusiveness (Betsch, 2008). Now researchers agree that "intuition need not to be "magical" – it can be defined and explained scientifically" (Sadler-Smith, 2008, p. 1). It has to be emphasized, though, that, historically, the concept of intuition has fallen between (at least) two stools: The fast-and-frugal-heuristic approach – which sees the concept in a positive light, as it serves as the basis for heuristics and thus is a valid strategy that can be used successfully when time and cognitive capacity are limited in a fuzzy real world (Gigerenzer et al., 1999) – and the heuristics-and-biases approach – which conceives of heuristics based on intuition as a source of erroneous and biased thinking that demonstrates human cognitive fallibility (Kahneman and Tversky, 1974). Both approaches have localized the concept of intuition completely differently within human thought processes and assign qualitatively different functions to it. Today, due to their continuing, fundamentally contradictory assumptions concerning human cognition, the fast-and-frugal-heuristic approach and the heuristics-and-biases approach remain pitted against each other. Conceptually, the key difference may be that Kahneman and Tversky (1974) and Kahneman (2011) advocate a dual-system view on human thinking (intuition vs. deliberation), whereas Kruglanski and Gigerenzer (2011) and Mega et al. (2015) favor a single-system view of unified processes in thinking and reasoning.

<sup>1</sup>There is the idea that a period in which a person, after encountering an impasse, is no longer consciously engaged in finding the solution and puts the problem aside (i.e., the incubation period) fosters sudden insights into the solution (e.g., Gilhooly et al., 2012). Ritter and Dijksterhuis (2014) explain that unconscious thought processes continue to find the problem's solution by reorganizing memory content, eventually resulting in gist-based representations. This occurs in the absence of a person's conscious attempts. It has to be emphasized, however, that empirical studies revealed different results as to whether incubation periods are beneficial for problem solving. The specific conditions under which positive incubation effects take place have to be further investigated (Sio and Ormerod, 2009).

Additionally, it has to be emphasized that, since interest in intuition has mainly originated from the area of judgment and decision making, implications for intuition with respect to problem solving processes (and insight) are rather hard to derive from this kind of research. This may have complicated experimentally clarifying the relationship between intuition and insight.

# Intuition As Experience-Based Perception of Coherence and As an Antecedent of Insight

To anticipate elaboration taking place later in this contribution, we mention a third approach in intuition research, which has developed independently from any dual- or single-system perspective and has its roots in the creativity and problem-solving literature (Mednick, 1962; Bowers et al., 1995; Dorfman et al., 1996). Intuition is here conceived as the experience-based perception or recognition of environmental meaning/coherence in terms of a sensitization toward the detection of hidden patterns whose structure cannot be immediately verbalized. For example, in the different versions of the semantic coherence task originally developed by Bowers et al. (1990), participants are asked to judge the semantic coherence of word triads and to name a fourth word that may be the semantic link between the words, if it exists. Research has found that in these tasks participants are able to correctly categorize word triads as semantically coherent or incoherent – intriguingly even when they are not able to name the fourth word, which is a paramount example of intuitive processing (e.g., Bowers et al., 1990; Bolte and Goschke, 2005). They rather feel the semantic link between the three words, but are not (yet) able to report on the reasons in terms of a solution concept that describes the semantic associations between the triad's constituents. The concept of fringe consciousness (Mangan, 1993, 2001, 2015) may be helpful to further understand intuition as the preliminary perception of environmental coherence. Price and Norman (2008), referring to the concept of fringe consciousness, have explained that the stream of consciousness does not only include a nucleus of consciously available information, but also a non-conscious fringe that contains cognitive signals of temporarily unavailable, non-conscious information processing that is constantly going on in the background (as it accompanies cognition). These signals arise continuously as byproducts of cognitive processes.
Yet, they are only consciously experienced when attention is drawn to them (Reber et al., 2004). Regarding the semantic coherence task, the product of this non-conscious processing on the fringe (i.e., the subjectively experienced intuition) is consciously perceivable, but its antecedents, direct content, and underlying processing mechanisms are outside of awareness (see also Topolinski and Strack, 2009a).

On this view, intuitive responses have been understood as "intuitive antecedents of insight" (Bowers et al., 1995, p. 27). As far as we know, this has been the first (and only) conception that up to now has addressed a potential link between intuition and insight. The early work of Bowers and colleagues allows more detailed assumptions to be derived concerning the interaction of intuition and insight. Moreover, this conceptualization produced valuable empirical paradigms (e.g., semantic and visual coherence judgment tasks) that are particularly suited to investigate insight and its intuitive precursors. Therefore, we will elaborate on this conception later in this contribution when aiming to clarify the conceptual relationship between intuition and insight<sup>2</sup>.

# The Special-Process vs. Nothing-Special View on Insight

In contrast, research on insightful thinking has its roots in Gestalt psychology, which investigated the integration and ordering mechanisms of human perception and problem solving (e.g., Köhler, 1921; Duncker, 1945; Metzger, 1953). Similar to intuition research, the research on insight problem solving is also located between two different views: The special-process view – which posits that insight problem solving involves a unique cognitive process that is qualitatively different from the processes that non-insight problem solving utilizes – and the business-as-usual or nothing-special view – which assumes that largely the same cognitive processes are involved in insight and non-insight problem solving (Seifert et al., 1995). Despite these two views, scientists have been highly fascinated by the topic since its early description by the Gestalt psychologists. This great interest culminated in the seminal book "The nature of insight," which mainly deals with the Gestalt psychologists' view on insight problem solving (Sternberg and Davidson, 1995).

# Interim Summary I

In sum, both concepts, due to their elusiveness, had to fight for recognition as an established field of research. Regrettably, research on intuition and research on insight have developed mostly independently from each other. This is in sharp contrast to a lay perspective on the two phenomena, which would rather endorse the view that intuition and insight are inherently intertwined, with intuition being an antecedent of insight (in terms of a slight previous impression on the fringe of consciousness). Yet, the two branches of research evolved from different research traditions using different scientific paradigms and, unfortunately, have referred to one another only marginally (e.g., by Bowers et al., 1990). Therefore, we think it is now time to scrutinize the relationship between the two phenomena in greater depth. Based on Bowers et al.'s (1990, 1995) work, we will do this by elaborating on the cognitive similarities and differences of the two phenomena and by offering preliminary process ideas on their relationship.

<sup>2</sup>For the sake of completeness, it has to be emphasized that metacognitive processes may play a role in intuitive processing as well. To keep the scope of our argumentation focused, we decided not to elaborate on this notion. Please see Mealor and Dienes (2013), Storm and Hickman (2015), or Thompson et al. (2011). Particular emphasis may be placed on the concept of experience-based metacognitive feelings (e.g., Koriat and Levy-Sadot, 1999).

# DIFFERENCES IN THE COGNITIVE PROCESSES ASSUMED TO UNDERLIE INTUITION AND INSIGHT

# The Continuity Model of Intuition: Intuition As a Gradual Process

In the majority of conceptualizations, intuitive processing has been described within a continuity model locating intuition on one end of the continuum and insight on the other. A prominent example is the two-stage model put forward by Bowers et al. (1990). The authors define intuition as the preliminary perception of coherence in the environment triggered by tacit knowledge that has been acquired unintentionally during a person's life (i.e., the cue-criterion relationships that we addressed earlier in this contribution, see also Volz and Zander, 2014). While tacit, or implicit, knowledge is seen as the foundation on which intuitions are based (e.g., Lieberman, 2000), in our view, intuition should not be regarded solely as a phenomenon of, or even be equated with, implicit memory processing. As Volz and Zander (2014) clarify, there are several important differences between intuition and implicit memory concerning both the format in which information is stored in memory and the kind of signal that accompanies the respective cognitive process. The fact that implicit knowledge is seen as only one component of processing is similar to the field of implicit cognition in general. Here, implicit knowledge is assumed to be supplemented and/or completed by antecedent hunches of the correct solution, that is, the subjectively experienced nearness to the solution (Reber et al., 2007).

Based on Polanyi's (1966) concept of tacit knowledge, Bowers (1984, p. 256) defined intuition as "sensitivity and responsiveness to information that is not consciously represented, but which nevertheless guides inquiry toward productive and sometimes profound insights." According to the author, the cognitive processing from an intuitive hunch toward an explicit insight is gradual and proceeds in two stages. In the first stage, the guiding or intuitive stage, environmental cues trigger the activation of tacit knowledge associatively connected in semantic memory, which results in an implicit perception of coherence that cannot (yet) be explained verbally. This process is characterized by the automatic spread of activation proposed by Collins and Loftus (1975). In the second stage of intuition, the integrative or insight stage, information becomes consciously available, which is enabled via a gradual accumulation of the previously activated concepts. The previous, implicit activation now becomes explicitly represented, which may thus also be interpreted as a form of insight processing. Hence, in Bowers et al.'s (1990, 1995) conception, intuition precedes insight in that explicit representations are anticipated by a sensitization toward environmental patterns or structure. Yet, besides the idea of a gradual, successive accumulation of activated concepts in associative memory, it has unfortunately remained unclear which cognitive and/or physiological conditions foster the transition from sensed intuition to justified insight.

Bowers et al.'s (1990) approach is not only theoretically important; it also carries paradigmatic weight. In order to empirically test their model's assumptions, the authors developed several novel paradigms (verbal as well as perceptual ones), which today, after slight revisions, belong to the standard paradigms of intuition research (e.g., Bolte and Goschke, 2005; Volz and von Cramon, 2006; Topolinski and Strack, 2009b; Hicks et al., 2010; Remmers et al., 2014; Zander et al., 2015). One of them is the semantic coherence task mentioned above, consisting of word triads that can be either semantically coherent (e.g., SALT, DEEP, and FOAM) or incoherent (e.g., DREAM, BALL, and BOOK). Semantic coherence is determined via a fourth word that each of the triad's constituents associatively hints at (e.g., SEA for the coherent triad). Participants are instructed to perform a semantic coherence judgment, that is, to indicate via button press whether a given triad is coherent or incoherent. Researchers found that people showed an above-chance discrimination between coherent and incoherent triads even when they were not able to name the fourth word (e.g., Bowers et al., 1990; Bolte and Goschke, 2005). In other words, people were intuitively sensitized to the detection of coherence prior to its explicit recognition (i.e., before having an explicit insight into the underlying semantic structure). Using a similar task, which consists of up to 15 semantically target-related clue words (i.e., the Accumulated Clues Task), it could be observed that participants continuously approached the explicit representation of environmental patterns/meaning (Bowers et al., 1990; Reber et al., 2007), which has recently also been demonstrated on a neuronal level using the semantic coherence task (Zander et al., 2015). These results are perfectly in line with Bowers et al.'s (1990) definition of intuition and the corresponding gradual two-stage model.
As another important aspect concerning the link between intuition and insight, Bowers et al. (1990) suggested the concept of semantic convergence to differentiate between triads that are rather easily solved by non-consciously reading out the common association (i.e., convergent triads) and triads that require a reorganization of semantic associations (i.e., divergent triads; see also the section Bridging the gap between the underlying processes of insight and intuition, second part).

To put it in a nutshell, according to the continuity model – as Bowers et al. (1990) defined and tested it by means of verbal and visual coherence tasks – intuition and insight (in terms of an explicit representation that can be verbalized) are inherently intertwined: intuition and insight build upon each other and the one can hardly occur without the other. That is, intuitive processing is the non-conscious precursor of insight, and the two evolve at different processing stages. Accordingly, intuition and insight are not considered qualitatively distinct or mutually exclusive. Instead, a crosstalk between the two is possible and even required to some extent. Importantly, Bowers et al. (1995) noted that a thought process that appears to be sudden on a phenomenological level (like an aha-experience) nevertheless could have continuous underlying processes that have led to the particular subjective experience. Thus, they do not exclude the existence of subjective aha-experiences accompanying the successful solution generation in their verbal tasks.

Along these lines, when investigating insights from a naturalistic perspective (i.e., in a field setting and not in controlled laboratory settings), Klein and Jarosz (2011) found that a substantial number of insights occurred gradually and in a (non-conscious) evidence-accumulating fashion. Following the naturalistic-decision-making approach (Zsambok and Klein, 1997), the authors aimed at investigating the natural occurrence of insights by analyzing a collection of reported insight incidents (comprising a radical shift in understanding) that had occurred in different domains of everyday life and different occupations (e.g., invention, firefighting, management, and the like). The authors found that (a) impasses did not occur in each insight case, (b) not every incident of an insight was accompanied by an aha-experience, and (c) an intuitive feeling of how near the solution might be occurred in many cases before the actual solution was reached. These results indicate that insights in a naturalistic setting may differ from insights synthetically induced by the class of pre-defined insight problems (e.g., eight-coin problem, Ormerod et al., 2002) in the degree to which the solution is derived gradually. Thus, in the naturalistic setting, a continuous solution approach (as advocated in intuition research) may be applicable.

# The Discontinuity Model of Insight: Insight As the Result of a Mental Restructuring Process

Contrary to the idea of a gradual solution approach, there is the discontinuity model of problem solving: insight is strongly linked to cognitive processes that restructure mental problem representations in order to allow the generation of a solution to a complex problem. A prominent example of a discontinuity model is the representational change theory put forward by Ohlsson (1992, 2011) that combines the Gestalt psychological approach (characterized by a person being unable to report conscious solution strategies, cf. Duncker, 1945) and the information-processing view on problem solving (characterized by a conscious search through alternatives in a problem space, which is a controllable and reportable process, cf. Newell and Simon, 1972). According to the representational change theory, and in sharp contrast to the two-stage model developed by Bowers et al. (1990), prior knowledge and experiences are postulated to hamper (instead of promote) the generation of solutions since they easily turn into constraints (Knoblich et al., 1999). Based on this, Ohlsson (1992) introduced the idea that an impasse, that is, a "blind lane" in which one is caught in wrong solution attempts and finds no way forward, or in which problem-solving attempts cease, is the precondition for a representational change that results in an insight. According to the author, a restructuring process is required, during which self-imposed constraints of the problem representation change and the problem solver obtains a "fresh look" at the problem. Problem solvers may then be able to rearrange either the individual components or the general assumptions about how to solve the problem. A putative mechanism assumed to drive such restructuring processes is the relaxation of self-imposed constraints. The representational change theory became very influential; several studies have tested and corroborated its assumptions (e.g., Knoblich et al., 2001; Kershaw and Ohlsson, 2004; Öllinger et al., 2006, 2013).

In an eye movement study, for example, participants were asked to transform an incorrect arithmetic statement, composed of Roman numerals formed from matchsticks, into a correct one by moving only a single matchstick. Interestingly, it could be observed that before the correct solution of difficult problems was generated, solvers suddenly attended longer to elements of the equation (e.g., the operators) that they had hardly noticed before. This was taken as evidence that successful solvers overcame self-imposed constraints (Knoblich et al., 2001). Research on the underlying cognition of the representational change theory could also help in understanding the subjective aha-experience as a subjective marker of insight: a recent study conducted by Danek et al. (2016) provides first evidence that the self-reported rates of aha-experiences depend on the degree of constraint relaxation that is necessary to solve the given problem. The authors found that the more constraints had to be relaxed, the fewer aha-experiences were reported, which was interpreted such that the execution of several necessary solution steps (that are needed to gain a representational change) minimizes or even eliminates the experience of suddenness as a key attribute of subjective aha-experiences.

# Interim Summary II

To summarize, according to a discontinuity model, the cognitive processes of intuition and insight seem to be qualitatively distinct. No crosstalk between them is possible. Moreover, the first (intuitive) look at a problem, resulting in a mental impasse, biases the subsequent solution. To be more precise, the intuitive apprehension of a problem necessarily leads to an impasse, and restructuring processes are needed so as to overcome the bias and to solve the problem. This can be demonstrated, for example, via the utilization of magic tricks in order to probe insight problem solving. To explicate, Danek et al. (2013) recently introduced a novel paradigm consisting of magic tricks to investigate the cognitive underpinnings of insight problem solving. When these magic tricks are viewed, the intuitive viewing pattern, which the magician intentionally exploits, will very likely prevent an understanding of the trick, that is, it will initially impede the solution to the problem. The solution is only within reach when the intuitive apprehension of the magic-trick situation, that is, the first and rapidly formed impression, can be overcome. Classical insight problems such as the famous candle problem (Duncker, 1935) utilize the same rationale.

# BRIDGING THE GAP BETWEEN THE UNDERLYING PROCESSES OF INSIGHT AND INTUITION

# Dual-System Models of Thinking and Reasoning

This discontinuity approach resembles the experimental procedure in typical judgment and decision-making studies conducted within the heuristics-and-biases framework (Kahneman, 2011). This framework draws on a class of psychological models that are very well known in social and cognitive psychology and are called dual-system or dual-process models (e.g., Evans and Frankish, 2009; Kahneman, 2011). These models assume two different modes of thinking, which Stanovich and West (2000) called System 1 (described as e.g., non-conscious, fast, associative, holistic, automatic, and emotional) and System 2 (described as e.g., conscious, slow, analytic, serial, controlled, and affect-free). In other words, according to dual-system models, judgments may be formed via two qualitatively distinct processes or systems – an intuitive one (System 1) or a deliberate one (System 2). The intuitive strategy, thereby, is thought to require some sort of a feeling that "tells" a person which option is the optimal one. Thus, affective feelings are here seen as a crucial component that is inherent to the entire decision process. In contrast, when thoroughly deliberating on the pros and cons of multiple options, the solution to the decision process is considered to come to mind by way of logic and exhaustively sensible considerations of probable consequences. Thus, System 2 processing is here thought to not need or even to not involve any affective contribution.

Despite the large number of contributions that support the dual-systems view both theoretically and empirically, such theories have nevertheless recently come under strong fire (Keren and Schul, 2009; Kruglanski and Gigerenzer, 2011). The main point of criticism put forward by Keren and Schul (2009, p. 534) is that "the different dual-system theories lack conceptual clarity, that they are based upon methodological methods that are questionable, and that they rely on insufficient (and often inadequate) empirical evidence." Kruglanski and Gigerenzer (2011) provide a unified approach and explain that both intuition and deliberation rely on the same functional principles (i.e., they are based on if-then rules), which is dependent on environmental conditions. In reply, Evans and Stanovich (2013) recently countered that such criticism is overstated because it treats dual-system models as a class sharing identical theoretical assumptions. They clarify that there are indeed different assumptions and terminologies subsumed under the dual-system framework, which need to be considered. Nevertheless, there is also neuronal evidence against the assumptions of the dual-system approach (Mega et al., 2015). The authors conducted a functional magnetic resonance imaging study and asked participants to judge either intuitively or deliberately the authenticity of emotional facial expressions. Interestingly, the authors found that intuition and deliberation recruit the same neuronal networks – a finding well in line with Kruglanski and Gigerenzer's (2011) proposal. It can be summarized that the dual-system framework is being much debated at the moment (see also volume 8 of Perspectives on Psychological Science, 2013) and therefore, it is very likely that there will be a revised conception in the foreseeable future.

# Dual-System Models and the Discontinuity Model of Insight: Intuition As the First and Biased Problem Representation

Having briefly outlined the key assumptions of the dual-system framework as well as potential points of criticism, we now elaborate on why we think the experimental approach of the insight problem solving literature (e.g., Danek et al., 2013) is similar to the one pursued by the heuristics-and-biases framework (Kahneman, 2011). A typical task used by researchers of the heuristics-and-biases approach is the bat-and-ball problem. Participants are told that a bat and a ball together cost \$1.10 and that the bat costs \$1 more than the ball. Then they are asked to state how much the ball costs. A vast number of experiments showed that the first "intuitive answer," following Kahneman's terminology, is 10 cents, but after a while of conscious deliberation (i.e., analytical thought) participants find out that the correct answer is 5 cents (Kahneman, 2011). The same principle is employed here as in the magic-trick paradigm: the first and rapidly formed judgment, which is intentionally induced by the task material, is incorrect and hampers the generation of the correct solution (here 5 cents). In terms of the representational change theory, an over-constrained problem representation is activated, in which a simple goal representation is set up: total sum minus bat immediately yields the cost of the ball. Overcoming these assumptions seems difficult and requires a more sophisticated goal representation that combines two pieces of information: (1) bat − ball = 1 and (2) bat + ball = 1.10. Substituting bat = ball + 1 from (1) into (2) gives ball + ball + 1 = 1.10, and hence ball = 0.05.
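The arithmetic of the bat-and-ball problem can be checked with a few lines of code. The following is only an illustrative sketch; the function and variable names are our own and do not stem from the cited studies. It contrasts the over-constrained "intuitive" shortcut (total minus difference) with the solution that satisfies both constraints at once.

```python
from fractions import Fraction


def solve_bat_and_ball(total, difference):
    """Solve bat + ball = total and bat - ball = difference for (bat, ball)."""
    ball = (total - difference) / 2
    bat = ball + difference
    return bat, ball


# Exact rational arithmetic avoids floating-point rounding of 1.10.
total = Fraction(110, 100)   # bat + ball = 1.10
difference = Fraction(1)     # bat - ball = 1.00

bat, ball = solve_bat_and_ball(total, difference)

# The intuitive shortcut treats the difference as the price of the bat.
intuitive_ball = total - difference

print(float(ball))            # 0.05
print(float(intuitive_ball))  # 0.1

# The intuitive answer violates the sum constraint: 1.10 + 0.10 != 1.10.
print((intuitive_ball + difference) + intuitive_ball == total)  # False
```

The last line makes the conflict explicit: the 10-cent answer respects the price difference but not the total, which is exactly the constraint the simplified goal representation drops.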

Together, experiments from both scientific fields show that by exploiting people's intuitive apprehension of a problem, the solution is precluded from the beginning. To overcome the impasse or bias, it is suggested that the problem solver may engage in restructuring the problem space or in analytic strategies so as to eventually be able to solve the problem and arrive at the objectively correct answer. Thus, there might be a reasonable mapping of the discontinuity model onto the common dual-system model: first, the intuitive system starts (whether by default first or in parallel to System 2) and leads to an over-constrained or biased problem representation that subsequently may lead to an impasse or conflict. Essential for reaching a solution is that the problem solver or decision maker (i) realizes that the fast initial apprehension of the problem precludes its solution and (ii) engages in a representational change to overcome the initial problem representation (Öllinger et al., 2014). Since, by definition, System 2 processing is slower than System 1 processing, it can smooth out the first and hasty attempts made by System 1. In the diction of dual-system theorists, the analytic mind is called up when encountering an impasse or conflict and will attempt to deliberately solve the problem by applying certain rational strategies. Importantly, Systems 1 and 2, or intuition and insight, are here considered to be qualitatively different: "hare and tortoise."

Equally important, System 1 is considered subordinate to System 2, and its hasty responses need to be tamed (cf. Kahneman, 2011, p. 185). Kahneman (2011, p. 44) states: "One of the main functions of System 2 is to monitor and control thought and actions "suggested" by System 1, allowing some to be expressed directly in behavior and suppressing or modifying others." Given such an understanding of intuition and insight, the discontinuity model may suffer from the very same conceptual problem as a dual-system account of reasoning: how, and by which factors, is a conflict or impasse detected? "Who" eventually launches the restructuring processes that are needed to overcome the error? How does restructuring of the first problem representation take place? This may be viewed as a variation of the "homunculus problem."

Hence, within the discontinuity conception of insight, intuition is not regarded as helpful or diagnostic for the generation of a pending insight. In line with this idea, Metcalfe and Wiebe (1987) investigated feelings of warmth accompanying insight and incremental problem solving using classical insight problems and algebraic problems. They used feeling-of-warmth ratings to assess how close participants intuitively felt to the solution, which was taken to indicate the subjective nearness to the solution. Interestingly, they found that these subjective feelings of warmth differed for insight and non-insight solutions insofar as they could predict performance only on incremental algebra problems. For insight problems such intuitive feelings were lacking. Given this result, one may conclude that intuition differs from insight concerning the (introspective) access to non-conscious processing: whereas decision makers intuit the solution to a problem, people solving the problem by insight appear to lack such hunches. Thus, in addition to the continuity/discontinuity distinction, insightful solutions, in contrast to intuitive ones, seem to be discrete phenomena in terms of availability to awareness. However, it is also possible that consciously assessing how close or far the solution is, is simply easier for non-insight tasks, since non-insight tasks are well-defined insofar as there are clear starts, solution paths, and goals, which enables exact planning of the necessary steps and their order (as, for example, in algebraic problems). Conversely, classical insight problems may be technically well-defined (in that there is also a clear start and goal; see, e.g., the famous nine-dot problem), but since the problem's different components are unhelpfully represented in the problem solver's mental set, it is difficult or even impossible to estimate how far or close the solution is.

# Interim Summary III

As an interim summary, it may be concluded that intuition research advocates a continuity model, in which intuition and insight build upon each other in a gradual and cumulative fashion: people are non-consciously sensitized toward patterns or meaning in the environment and act accordingly (e.g., Bowers et al., 1990). In contrast, insight research focuses on a discontinuity model, in which the initial representation of the problem (i.e., early intuition) biases later solution attempts and has to be overcome in order to reach a solution. Here, no intuitive precursors of insight in terms of a subjectively felt nearness toward the solution are assumed. This latter model resembles famous, yet recently heavily criticized, dual-system models in judgment and decision-making research insofar as in both approaches the participant's first intuitive apprehension of a problem biases its later solution.

# SEMANTIC COHERENCE TASKS USED IN INTUITION AND INSIGHT RESEARCH: WORD TRIADS AND REMOTE ASSOCIATES

Interestingly, in the semantic domain, intuition research following Bowers et al.'s (1990) approach and contemporary insight research have used similar stimuli, yet with different task rationales. This overlap could serve as an excellent starting point for the necessary, and so far lacking, common investigations. As described earlier in this contribution, in the tradition of Bowers et al. (1990, 1995), typical coherence judgment tasks include semantically coherent and incoherent word triads, a task that dates back to the work of Mednick (1962). Here, response patterns of both triad types (i.e., coherent vs. incoherent) are compared to each other. In recent research on insight problem solving, Bowden et al. (2005) presented a novel framework and a new class of problems in order to probe insight problem solving. The authors equate subjectively reported aha-experiences with insight and have used word triads based on Mednick's (1962) task to investigate the neuronal underpinnings of insight. They presented a large number of problems that can be solved either with or without insight (i.e., Aha! vs. Non-Aha!) and that do not require a lot of time to be solved (Kounios and Beeman, 2014). They found that Aha! solutions showed neural patterns distinct from those of Non-Aha! solutions. Unlike intuition research, they applied (1) only word triads that are solvable in principle (i.e., no incoherent triads), and (2) word triads that consist of compound remote associates.

Bowers et al. (1990) distinguished two types of triads and termed them convergent and divergent triads, respectively. For convergent triads the common associate means the same with respect to each clue word, whereas for divergent triads the common associate is more remote and changes its meaning with respect to each clue word. An example of a coherent convergent triad is SALT DEEP FOAM – SEA, and an example of a divergent triad is AGE MILE SAND – STONE. Unlike convergent triads, divergent triads are built such that one needs to detect the multiple meanings of the solution word in order to associate it with the meanings of the three clue words. As divergent triads may require a restructuring of the different meanings of the clues with respect to the solution, these kinds of triads could well be regarded as an insight condition.

According to Bowden and Jung-Beeman (2007), divergent triads are not as complex as classical insight problems, but they can nevertheless be used as a kind of insight problem. Like typical insight tasks, (1) they misdirect retrieval processes (i.e., the first word of a divergent triad biases later thought toward a specific, yet wrong direction), (2) the strategy that has led to the correct solution cannot be reported by the problem solver, and (3) aha-experiences can occur.

For such divergent triads, Cranford and Moss (2012), using a verbal protocol method, found that there are two different types of insight problems, of which only one shows the traditional characteristics of an insight. It has to be emphasized that, unlike Bowden et al. (2005), the authors consider all three components (subjective aha-experience, impasse, and restructuring) as necessary for an insight to occur. They could show that some problems consisting of divergent triads could be solved via immediate insight, whereas others were solved by non-immediate or delayed insight. Interestingly, only the latter type of insight showed the supposed phases of insight. Fedor et al. (2015) elaborated on this question and found that the classical insight sequence (i.e., constrained search, impasse, insight, extended search, and solution) is a rather rare event. Much more often, participants showed quite different insight sequences (i.e., a flexible order of the different problem-solving stages), which have to be further specified in the future. We consider this line of research (Cranford and Moss, 2012; Kounios and Beeman, 2014; Fedor et al., 2015) promising and important for future endeavors, which may initiate the common investigation of intuition and insight.

# CONCLUSION, OPEN RESEARCH QUESTIONS, AND FUTURE DIRECTIONS

To conclude, we set out to disentangle the underlying mechanisms of intuition and insight so as to clarify their relationship. At first sight, intuition and insight seem to be conceptualized very differently: while the intuition literature favors a continuity model, insight has been described within a discontinuity model. In a continuity model, early (semantic) readout processes are taken as diagnostic for the non-conscious detection of environmental patterns and/or meaning (in terms of an antecedent of later explicit mental representation or insight). Intuition is described as aiding decision making and problem solving when time and cognitive capacity are limited and necessary information is temporarily unavailable. Contrary to this, in a discontinuity model early intuitive responses misdirect the generation of a correct solution or are experimentally utilized to bias solution attempts. In this case, intuitions lead people astray. Instead of employing intuition, mental restructuring processes (i.e., qualitative changes in the non-conscious search processes) are needed to overcome biased intuitive impressions or apprehensions so as to eventually solve the problem. In that respect, a discontinuity model resembles dual-process accounts in judgment and decision making.

Except for early work by Bowers et al. (1990, 1995) and Dorfman et al. (1996), there have not been many empirical investigations so far that explore similarities and differences in the underlying neurocognitive mechanisms of intuition and insight. A major drawback here may be that there are no tasks that easily enable a direct empirical comparison between the two concepts. Nevertheless, we consider it very important to test intuitive and insight solution processes by means of exactly the same task and within the same participants. Such a task needs to be created. With this theoretical contribution, we therefore aim to initiate common investigations of both fields of research to detect neurocognitive similarities and differences between intuitive processing and insight problem solving. A good starting point for common empirical investigations may be the use of different types of triads [for example, divergent and convergent triads, as formerly suggested by Bowers et al. (1990)] in order to induce gradual and discontinuous solution attempts. We also consider it important to investigate not only the cognitive processes that may underlie intuition and insight, but also the neuronal processes involved. Future studies may shed light on the specific (and maybe distinct) neuronal correlates, which will then also allow drawing conclusions about the theoretical conceptualization of the two phenomena. Interesting research questions would be (as a non-exhaustive list): (1) Are the neuronal correlates different for the two types of triads (convergent versus divergent triads)? (2) Do aha-experiences also occur for convergent triads? (3) Do feelings-of-warmth ratings occur for both types of triads or only for convergent triads? (4) Do verbal protocols differ for the two types of triads? (5) How can the assumed recursive coherence building process be neuronally mapped?
The further investigation of the underlying cognitive and neuronal processes of restructuring may also greatly advance our understanding of the topic. Here, Öllinger et al. (2006, 2013) obtained influential results that may be carried forward in future research. Equally important, following Kounios and Beeman (2014) in using current neuroimaging techniques may promote the detection of objective physiological markers of insight (in the form of a specific neuronal or electrophysiological activation pattern accompanying the experience of impasses and ahas as well as correlating mental restructuring processes). Kounios and Beeman (2014) as well as Sandkühler and Bhattacharya (2008) have already obtained promising results in this respect; thus, their research may be a good starting point for the future. To sum up, intuition and insight are intriguing (non-analytical) mental phenomena that need to be further investigated in the future.

# AUTHOR CONTRIBUTIONS

TZ developed the theoretical conception; wrote the article. MÖ developed the theoretical conception; revised the manuscript. KV developed the theoretical conception; revised the manuscript.

# ACKNOWLEDGMENTS

This work was funded by the Werner Reichardt Centre for Integrative Neuroscience (CIN) at the University of Tübingen, an Excellence Cluster within the framework of the Excellence Initiative (EXC 307) funded by the Deutsche Forschungsgemeinschaft (DFG).

# REFERENCES

Sadler-Smith, E. (2008). Inside Intuition. Abingdon: Routledge.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Zander, Öllinger and Volz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Approaching the Distinction between Intuition and Insight

Zhonglu Zhang, Yi Lei\* and Hong Li

Research Centre for Brain Function and Psychological Science, Shenzhen University, Shenzhen, China

Intuition and insight share a similar cognitive and neural basis. There are, however, some essential differences between the two. In this short review, we distinguish between intuition and insight in two respects. First, intuition and insight concern different aspects of information processing. Whereas intuition involves a "yes or no" judgment, insight is related to "what" the solution is. Second, tacit knowledge plays different roles in intuition and insight. On the one hand, tacit knowledge is conducive to intuitive judgment. On the other hand, tacit knowledge may first impede but later facilitate the occurrence of insight. Furthermore, we share theoretical and methodological views on how to approach the distinction between intuition and insight.

Keywords: intuition, insight, judgment, solution, RAT, tacit knowledge

#### Edited by:

Michael Öllinger, Parmenides Foundation, Germany

#### Reviewed by:

Sascha Topolinski, University of Cologne, Germany; Sumitava Mukherjee, Indian Institute of Management Indore, India

> \*Correspondence: Yi Lei leiyi821@vip.sina.com

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 18 April 2016 Accepted: 28 July 2016 Published: 09 August 2016

#### Citation:

Zhang Z, Lei Y and Li H (2016) Approaching the Distinction between Intuition and Insight. Front. Psychol. 7:1195. doi: 10.3389/fpsyg.2016.01195

# BACKGROUND

Intuition can be conceived of as a sudden apprehension of coherence (pattern, meaning, structure) above chance level with little conscious retrieval (Bowers et al., 1990; Bolte et al., 2003; Bolte and Goschke, 2005; Volz and Von Cramon, 2006; Ilg et al., 2007; Topolinski and Strack, 2008; Topolinski, 2011). By contrast, insight is defined as sudden access to a solution through restructuring, or changing, the problem representation (Ohlsson, 1984; Knoblich et al., 1999; Öllinger and Knoblich, 2009; Öllinger et al., 2013; Kounios and Beeman, 2014). The nature of intuition and of insight has been empirically investigated and theoretically discussed in the literature, but separately. Only a few theoretical discussions address the relationship between the two. In fact, they have much in common and are intimately linked with each other. For example, both occur in somewhat similar situations where the final results are not clear. That is, an intuitive judgment is made under uncertain circumstances, perhaps due to time pressure or lack of sources (Kahneman, 2003), whereas insight is preceded by an impasse in which individuals do not know what to do next although they have made great efforts (Ohlsson, 1984; Knoblich et al., 1999). In addition, both intuition and insight rely on the unconscious spreading activation of semantic associates (Ohlsson, 1984; Bowers et al., 1990; Bowden and Beeman, 1998; Jung-Beeman et al., 2004; Bolte and Goschke, 2005; Ilg et al., 2007; Cai et al., 2009; Sio et al., 2013) and the activation of the right superior temporal cortex (Jung-Beeman et al., 2004; Ilg et al., 2007). In line with this, they share a common counterpart for comparison, namely the analytic process, which operates in a deliberately controlled style under the framework of dual-process theory (Epstein, 1994; Sloman, 1996; Stanovich and West, 2000; Kahneman, 2003).
Moreover, fluency, defined as the relative speed and efficiency of information processing (Reber et al., 2004), plays a role in both phenomena. Processing fluency of the encoded material (without actually retrieving the solution) is the driving force of the gut feeling of intuition, not only in coherence judgments (e.g., Topolinski and Strack, 2009; Topolinski, 2011) but also in intuitive judgments of the solvability of problems (e.g., Topolinski et al., 2016). In insight, the fluency of solution retrieval is a rather epiphenomenal factor that does not cause the insight itself but elicits its distinctive experiential feature (the "Aha" feeling) (Topolinski and Reber, 2010).

# DIFFERENCES BETWEEN INTUITION AND INSIGHT

Though intuition and insight share overlapping cognitive and neural features, as summarized above, they are actually not the same coin and can be essentially differentiated from each other to a large extent. Some studies have addressed the differences between them. For example, insight comes after intuition and enters consciousness (Volz and Von Cramon, 2006). In addition, intuition is continuous whereas insight is discontinuous (e.g., Bowers et al., 1990). Furthermore, as Reber et al. (2007) showed, both subjective closeness and objective closeness increase significantly in intuitive judgment, whereas in insight problem solving subjective closeness does not increase significantly and lags far behind objective closeness. The behavioral and phenomenological differences have thus been well documented. We propose that intuition and insight differ from each other not only at the behavioral and phenomenological levels but, in essence, also at the cognitive level. We discuss these differences below in two respects.

First, intuition and insight concern two distinct aspects of information processing. Though unconsciously activated information plays a common and fundamental role in both intuition (e.g., Bolte and Goschke, 2005; Ilg et al., 2007) and insight (e.g., Jung-Beeman et al., 2004; Sio et al., 2013), it is guided by different cognitive operations. For intuition, this unconsciously activated information is guided by an intuitive judgment task on whether there is coherence, that is, a fourth associated word for the triad. More specifically, intuition mainly involves a "yes/no" judgment, namely intuitive judgment, which is intimately related to decision-making behavior (Tversky and Kahneman, 1974; Dane and Pratt, 2007, 2009). In this regard, intuition cares little about "what the ultimate result is" but rather about the individual's subjective decision on whether there is a solution or not. For insight, however, this unconsciously activated information is guided by conscious retrieval, which requires accessing the insightful solution (the fourth associated word for the triad). In other words, insight is about "what" the solution is rather than about judgment. Evidence from functional magnetic resonance imaging (fMRI) studies supports the views above to some extent. Using the Remote Associate Test (RAT; Mednick and Mednick, 1967), Ilg et al. (2007) and Jung-Beeman et al. (2004) investigated the neural basis of intuition and insight, respectively. Both found activity in the right superior temporal cortex, which was regarded as reflecting the common role of the unconsciously activated information (Ilg et al., 2007). Moreover, they found additional neural activity that can distinguish the different cognitive operations (intuitive judgment vs. retrieving insightful solutions) on the unconsciously activated information.
Specifically, the task of intuitive judgment activates brain areas such as the bilateral inferior parietal cortex that are generally related to decision making under uncertainty (Paulus et al., 2001; Ilg et al., 2007). On the other hand, the task of retrieving insightful solutions elicited gamma-band activity, which indexes accessibility to conscious representations (Engel and Singer, 2001; Jung-Beeman et al., 2004).

Second, the role of tacit knowledge in intuition and insight should be different. Intuition mainly benefits from tacit knowledge. Activation of tacit knowledge starts to spread from the three concepts (e.g., in the RAT) and finally converges on the common remote associate. As activation accumulates, it can facilitate the intuitive judgment without triggering conscious retrieval (Ilg et al., 2007). Meanwhile, this accumulated activation gives individuals the feeling of subjective closeness to the solution (Reber et al., 2007). The whole processing stream, from the primary activation of tacit knowledge to the final intuitive judgment, proceeds continuously and without any barrier (Bowers et al., 1990). All this indicates that tacit knowledge benefits the intuitive judgment of coherence, resulting in a continuous pattern. In contrast, tacit knowledge may play a double role (first harmful and then helpful) in the occurrence of insight. In this sense, tacit knowledge can be divided into valid and invalid categories. In insight problem solving, solvers first encounter an impasse, which is mainly caused by the strong activation of unhelpful tacit knowledge (Ohlsson, 1984; Knoblich et al., 1999, 2001). The impasse can be overcome when weak but valid tacit knowledge is activated and accessed (Knoblich et al., 2001; Bowden and Jung-Beeman, 2007), and this mainly relies on activity in the right anterior superior temporal gyrus (Jung-Beeman et al., 2004; Bowden and Jung-Beeman, 2007).

# APPROACHING THE DISTINCTION BETWEEN INTUITION AND INSIGHT: THEORETICAL AND METHODOLOGICAL PROPOSALS

As mentioned above, intuition and insight are two mutually related but different cognitive constructs. However, the differences (as well as the commonalities) summarized above are based only on the theoretical and empirical data in the respective fields of intuition and insight. To better understand the nature of intuition and insight, two concerns should be taken into consideration. First, to what extent are intuition and insight related to and distinguished from each other? Second, research that systematically and directly examines their mechanisms in the same experiment is lacking thus far. In this vein, we share our viewpoints below.

Theoretically, future research can consider how the unconsciously activated information interacts with intuitive judgment and the conscious retrieval of insightful solutions, respectively. Though there is some neuroimaging evidence, as summarized above, that partly supports the view that the unconsciously activated information is guided by different cognitive operations (namely "yes/no" judgment for intuition and conscious retrieval of solutions for insight), relevant studies in both fields are relatively few and need to be replicated and expanded. In addition, as we have argued, tacit knowledge may play different roles in intuition and insight. Some tacit knowledge may be helpful for intuitive judgment but harmful for the occurrence of insight (and vice versa). We suggest that more empirical studies be conducted to examine how tacit knowledge influences intuition and insight.

Methodologically, we propose that future research can directly examine and compare the cognitive and neural mechanisms of intuition and insight in the same experiment, and this is possible for two reasons. First, the same materials, namely RAT items, have been widely used in studies of both intuition (e.g., Bowers et al., 1990; Bolte et al., 2003; Bolte and Goschke, 2005; Ilg et al., 2007; Topolinski and Strack, 2008, 2009; Topolinski, 2011) and insight (e.g., Bowden and Beeman, 1998; Jung-Beeman et al., 2004; Cai et al., 2009; Sio et al., 2013). The RAT consists of a certain number of items, and each item comprises the three words of a triad as well as their common associate (the solution word; Mednick, 1962; Mednick and Mednick, 1967). For example, the triad "night, wrist, stop" is associated with the solution word "watch." In insight problem solving, the task for the participants is to retrieve the solution word from the three words. Only those solutions accompanied by "aha" feelings are regarded as insightful ones (e.g., Bowden and Jung-Beeman, 2003, 2007; Jung-Beeman et al., 2004). In the intuitive judgment task, there are not only coherent triads (e.g., "night, wrist, stop") with a common associate but also incoherent triads (e.g., "house, lion, butter") without any common associate. Participants do not need to retrieve the solution word but judge whether the triads are coherent or not (e.g., Bolte and Goschke, 2005; Ilg et al., 2007). Second, intuition and insight occupy different phases in the stream of information processing. Intuition occurs at the moment of coherence judgment, when the potential solutions have not been retrieved (Ilg et al., 2007). Insight, however, comes at a later stage (Volz and Von Cramon, 2006), occurring at the moment of solution retrieval (Jung-Beeman et al., 2004), which cannot be predicted by the intuitive judgment of FOK (feeling of knowing) (Metcalfe and Wiebe, 1987).
Considering these two points, we suggest that they can be measured successively in one experimental paradigm with the RAT as the materials. A general paradigm is developed as follows (it should be noted that this is one, but not the only, way to explore the differences between intuition and insight).

As described in **Figure 1**, the RAT items (the coherent triads with solutions) and the incoherent triads (without solutions) can be pooled and then presented to the participants one by one in random order. Considering that intuitive judgment and solution retrieval occupy different phases in the stream of information processing in problem solving, participants can be instructed to complete the two tasks successively. Specifically, participants can receive the coherence judgment task first, in which they are asked to judge whether the word triads have a common associate. In light of previous research (e.g., Bolte et al., 2003; Bolte and Goschke, 2005; Ilg et al., 2007), intuition can be measured when the coherence judgments are made without the solutions having been retrieved. After the coherence judgment task, participants can be told to retrieve the solutions to the problems. According to the previous literature (Jung-Beeman et al., 2004; Bowden and Jung-Beeman, 2007), insight can be measured at the moment when correct solutions are retrieved and reported as insightful.
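The two-stage logic of this paradigm can be sketched in code. The following is a minimal, hypothetical sketch rather than an implementation of any published procedure: the function names, the dictionary fields, and the scoring rule are our own assumptions, and the two items are simply the examples given in the text.

```python
import random

# Example items from the text; a real study would use a full, normed item set.
COHERENT = [(("night", "wrist", "stop"), "watch")]
INCOHERENT = [(("house", "lion", "butter"), None)]


def build_trial_list(coherent, incoherent, seed=None):
    """Pool coherent and incoherent triads and shuffle them into one list."""
    trials = [
        {"triad": triad, "solution": solution, "coherent": solution is not None}
        for triad, solution in coherent + incoherent
    ]
    random.Random(seed).shuffle(trials)
    return trials


def score_trial(trial, judged_coherent, solution_known_at_judgment,
                retrieved_word, reported_aha):
    """Hypothetical scoring rule for one trial (an assumption, not the authors').

    Stage 1 (coherence judgment) counts as intuitive only if no solution word
    was available when the judgment was made. Stage 2 (solution retrieval)
    counts as insight only if the correct word was retrieved and the
    participant reported an aha-experience.
    """
    solved = trial["solution"] is not None and retrieved_word == trial["solution"]
    return {
        "intuitive_judgment": not solution_known_at_judgment,
        "judgment_correct": judged_coherent == trial["coherent"],
        "insight": solved and reported_aha,
    }


trials = build_trial_list(COHERENT, INCOHERENT, seed=1)
coherent_trial = next(t for t in trials if t["coherent"])
result = score_trial(coherent_trial, judged_coherent=True,
                     solution_known_at_judgment=False,
                     retrieved_word="watch", reported_aha=True)
print(result)
```

Keeping the two stages as separate fields reflects the idea that intuition and insight are measured at different moments of the same trial, so both can be analyzed within the same participants and items.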

Furthermore, researchers can investigate and compare the cognitive and neural basis of intuition and insight based on the above-introduced paradigm by utilizing brain imaging techniques such as fMRI, electroencephalography (EEG), and so on. For example, with its high spatial resolution, fMRI can be used to localize "where" the neural signals related to the cognitive events are at the millimeter level. fMRI has been used in the fields of both intuition and insight, and the relevant studies have found some brain regions, such as the right superior temporal cortex, to be activated in intuition and insight (Jung-Beeman et al., 2004; Ilg et al., 2007). This provides potential regions of interest (ROI), based on which future research can build hypotheses and further examine the neural basis of intuition and insight. Similarly, with its millisecond-level temporal resolution, EEG would be useful in elucidating the neural correlates of intuition or insight by providing neural markers such as event-related potentials (e.g., N100, N200, P300) in the time domain or neural oscillations (e.g., alpha, beta, gamma) in the frequency domain. With the RAT, Jung-Beeman et al. (2004) found a gamma-band oscillation associated with conscious retrieval in insight problem solving. In addition, they observed an alpha burst preceding the gamma burst. This insight-specific alpha effect may reflect unconscious solution-related processing (Jung-Beeman et al., 2004). By contrast, there are few EEG studies of intuition. Thus, one straightforward question would be, for example, whether alpha-band or gamma-band oscillations can be observed at the moment of intuition. In short, brain imaging techniques would help to advance the fields of both intuition and insight.
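To illustrate what a frequency-domain measure such as band power amounts to, the following sketch computes the power of a signal within a frequency band using a plain discrete Fourier transform. It is a toy example under stated assumptions: the signal is a synthetic 10 Hz sine wave rather than EEG data, the band limits (8 to 12 Hz for alpha, 30 to 45 Hz for gamma) are illustrative conventions that vary across studies, and the function name is our own.

```python
import cmath
import math


def band_power(signal, sampling_rate, low_hz, high_hz):
    """Sum the squared DFT magnitudes of all bins between low_hz and high_hz."""
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):          # non-negative frequencies only
        freq = k * sampling_rate / n     # frequency of DFT bin k in Hz
        if low_hz <= freq <= high_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(signal))
            power += abs(coeff) ** 2
    return power


# One second of a synthetic 10 Hz oscillation sampled at 128 Hz (not real EEG).
sampling_rate = 128
signal = [math.sin(2 * math.pi * 10 * i / sampling_rate)
          for i in range(sampling_rate)]

alpha = band_power(signal, sampling_rate, 8, 12)    # assumed alpha range
gamma = band_power(signal, sampling_rate, 30, 45)   # assumed gamma range
print(alpha > gamma)  # True: the 10 Hz signal has its power in the alpha band
```

In practice, researchers would rely on established EEG toolboxes and time-frequency methods; the sketch only shows the quantity that a statement such as "an alpha burst preceding the gamma burst" refers to.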

# CONCLUSIONS

As we have summarized, intuition and insight can be differentiated from each other by considering whether the unconsciously activated information is guided by intuitive judgment or conscious retrieval, and by considering the different roles of tacit knowledge. Nevertheless, the differences may not be limited to these two aspects, which in fact need more empirical examination and evidence. We propose that, by means of brain imaging techniques, future research can directly examine the cognitive and neural mechanisms of both intuition and insight based on the RAT in one experiment.

# AUTHOR CONTRIBUTIONS

ZZ drafted the manuscript; YL and HL provided critical revisions.

# ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (31571153, 31100740, 31271088, and 30370488) and the MOE Project of Key Research Institute of Humanities and Social Sciences at Universities (11JJD190002).

# REFERENCES

verbal problems with insight. PLoS Biol. 2:e97. doi: 10.1371/journal.pbio.0020097


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Zhang, Lei and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Search and Coherence-Building in Intuition and Insight Problem Solving

#### Michael Öllinger1,2 \* and Albrecht von Müller1,3

<sup>1</sup> Parmenides Center for the Study of Thinking, Pullach, Germany, <sup>2</sup> Psychological Department, Ludwig-Maximilians-Universität München, Munich, Germany, <sup>3</sup> Philosophical Department, Ludwig-Maximilians-Universität München, Munich, Germany

Coherence-building is a key concept for a better understanding of the underlying mechanisms of intuition and insight problem solving. There are several accounts that address certain aspects of coherence-building. However, there is still no proper framework defining the general principles of coherence-building. We propose a four-stage model of coherence-building. The first stage starts with spreading activation restricted by constraints. This dynamic is a well-defined, rule-based process. The second stage is characterized by detecting a coherent state. We adopt a fluency account, assuming that the ease of information processing indicates the realization of a coherent state. The third stage evaluates the result of the coherence-building process and assesses whether the given problem is solved. If the coherent state does not fit the requirements of the task, the process re-enters at stage 1. These three stages characterize intuition. For insight problem solving a fourth stage is necessary, which restructures the given representation after repeated failure, so that a new search space results. The new search space enables new coherent states. We review the most important findings, outline our model, present a large number of examples, and derive potential new paradigms and measures that might help to decipher the underlying cognitive processes.

Keywords: insight, intuition, binding, coherence, stage models

#### Edited by:

Shira Elqayam, De Montfort University, United Kingdom

#### Reviewed by:

Linden John Ball, University of Central Lancashire, United Kingdom; Ulrich Von Hecker, Cardiff University, United Kingdom

#### \*Correspondence:

Michael Öllinger, michael.oellinger@parmenidesfoundation.org

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 06 September 2016; Accepted: 05 May 2017; Published: 29 May 2017

#### Citation:

Öllinger M and von Müller A (2017) Search and Coherence-Building in Intuition and Insight Problem Solving. Front. Psychol. 8:827. doi: 10.3389/fpsyg.2017.00827

# INTRODUCTION

During 1916 Max Wertheimer, the famous Gestaltist, and Einstein had several discussions. Wertheimer was keen to understand Einstein's outstanding thinking. He realized that Einstein had been puzzled from a very early stage by apparently unanswerable questions, such as: "What would happen if one rode on a ray of light, or what would happen if one ran fast enough? Would the light stop to move?" Einstein felt an incoherence between the novel experimental findings of the time and the given theoretical assumptions. However, he was not able to put the single pieces together and arrange them in a new coherent picture. It was unclear what such a new picture should look like. According to Wertheimer, Einstein experienced the intuition that the common presuppositions in physics might be wrong. By that time, Einstein had the ingenious insight that the measurement of time is dependent on the applied frame of reference. By using this insight, he relaxed the existing dogmas, and eventually the single pieces became part of a coherent picture.

Wertheimer doubted that Einstein attained his great insight by the concatenation of logical operations. "Einstein did not put ready-made axioms, or mathematical formulas together" (p. 183). He emphasized that Einstein's progress was characterized by structural changes which were driven by overcoming the traditional understanding of physical events, time and simultaneity. Wertheimer remarked that Einstein's thinking was often far ahead of the available mathematical apparatus.

Einstein himself reported that his thinking was not bound to words. He used mostly pictures and imagination, as his early thought experiments (Gedankenexperiment, see above) demonstrated. "I very rarely think in words at all. A thought comes, and I may try to express it in words afterward" (Wertheimer, 1959).

Einstein's thinking shows how, quite literally, a new and coherent picture can lead to the solution of a difficult problem.

Currently, coherence-building plays an important role within cognitive psychology. Coherence is the key concept in a great number of studies on intuition (e.g., Dorfman et al., 1996; Shirley and Langan-Fox, 1996; Bolte et al., 2003; Bolte and Goschke, 2008; Volz et al., 2008; Dehaene, 2009; Topolinski and Strack, 2009b; Zander et al., 2016) and in a few studies on insight problem solving (Metcalfe and Wiebe, 1987; Bowden and Beeman, 1998; Kounios et al., 2006).

Intuition can be understood as a largely unconscious process that provides a hunch for a judgment and is often accompanied by an affective state or gut feeling (Gigerenzer and Todd, 2001a; Kruglanski and Gigerenzer, 2011). A standard task that demonstrates the dynamic of intuitive judgments is the word-triads task, which was introduced by Mednick and Mednick (1967). The original task requires finding a fourth word which builds meaningful compounds with three given words (e.g., SALT, DEEP, FOAM could be associated with the word SEA, resulting in three meaningful compounds such as SEA SALT, etc.). In a modified version (Bolte et al., 2003; Topolinski and Strack, 2009c) participants were asked to make quick judgments on whether a given triad was coherent or incoherent without searching for associates. Note that incoherent trials had no obvious associate (e.g., DREAM, BALL, and BOOK).

Insight problem solving requires participants to find the solution to a given problem. Examples are the solutions of the coherent triads presented above, or more difficult problems such as puzzles (Sternberg and Davidson, 1995; Jung-Beeman et al., 2004; Bowden et al., 2005; Öllinger and Knoblich, 2009; Öllinger et al., 2014). Insight problems are often characterized by the fact that they are resistant to standard solution approaches. They often require restructuring the given problem or goal representation (Ohlsson, 1984a,b, 1990, 2011; Fleck and Weisberg, 2013). Insight problem solving usually goes beyond the information which is actually given (cf. Bowers et al., 1990, p. 74).

Although intuition and insight are often treated as different research domains, they obviously share certain features (see below). There are only a few studies addressing both and aiming at an integrated framework (Bowers et al., 1990, 1995; Kihlstrom, 1998; Topolinski and Reber, 2010; Zander et al., 2016). In this vein, we attempt to provide an integrated view which merges both domains by rule-based coherence-building processes.

In Bowers et al.'s (1990) seminal work on "Intuition in the context of discovery," coherence was supposed to be the key process underlying intuition and insight. Coherence results from a largely unconscious and guided search process, which converges on an integrated representation of the given information that surpasses the threshold to consciousness.

In greater detail, the guiding stage is driven by spreading activation within mnemonic networks (Collins and Loftus, 1975). Those activation patterns build up to an implicit and unconscious "perception of coherence" (Bowers et al., 1990, p. 74). This tacit perception of coherence guides the thought toward a more "explicit perception in question." It is important to note that Bowers et al. (1990) did not assume that such an implicit coherent representation is equal to the later consciously experienced coherence. Rather, it provides a fragmentary representation which can be enriched gradually by accumulating information.

Eventually, the integrative stage provides the result of a completed accumulation process. The activation within the network becomes so strong that it crosses the threshold to consciousness. At this stage coherence is recognized as a hunch, which needs to be validated by an analytic validation process.

Although an exact definition of coherence was not provided, Bowers and colleagues' experimental design elucidates its alleged characteristics. For example, in experiment 3a Bowers and colleagues asked participants to find an unknown solution word while a list of up to 15 clue words was presented successively. Each clue word was associated with the unknown solution word. An example of the accumulated clues task is: (1) "Times", (2) "Inch", (3) "Deal", (4) "Peg", (5) "Head", (6) "Foot", (7) "Dance", (8) "Table", (9) "Person", (10) "Town", (11) "Math", (12) "Four", (13) "Block", (14) "Table", (15) "Box". The target word is "Square."

One result was that participants needed up to 10 clue words to find the solution. **Figure 1** illustrates the idea of associations and spreading activation. Each clue word is associated with the unknown target word (solution).

# LIMITATIONS OF BOWERS' STAGE MODEL

Given the importance of Bowers and colleagues' approach, we want to draw attention to a few concerns that we have with the current model.

First, the idea of a guided accumulation process is striking, but it seems underspecified. Spreading activation elicits unspecific neighboring nodes in the network. This means that the more clues are provided, the more activity there is to confuse the search process. The question is: what guides the process? Pure associations would not be able to guide it, since too many unspecific associations are activated by, e.g., 10 very different clue words. That is, the potential search space would explode.

We propose that the given information activates concepts from long-term memory. Spreading activation provides a bulk of information which either belongs to the solution of the problem or not. We assume that finding a coherent representation requires constraining the search space. In the easiest case this could be attained by identifying overlapping features or meanings as in the word clue example above.

We conclude that for those problems a concerted interplay is necessary between spreading activation and constraining (Ohlsson, 1990; Thagard and Verbeurgt, 1998; Thagard, 2002) the activation landscape in a goal-directed manner. More difficult problem representations require constraining the search space by prior knowledge, hypotheses, or chunking of information, which structures and guides the process of coherence-building (for implications, see below).

Our second concern is that the transition between the unconscious and the conscious stage is somewhat unclear. We adopt a fluency account (Topolinski and Strack, 2009a,c) which relies on the ease of processing of the given information. We assume that a constrained activation leads to a balance state (Heider, 1946), which can be processed easily and results in the realization of a coherent state.

Third, the result of the integration process is a hunch or intuition which has to be validated and checked (Wallas, 1926). We propose a separate process for that and a re-entry loop if the result is unsatisfactory or erroneous (**Figure 2**). Importantly, after repeated failure it might be necessary to restructure the search space in order to find a coherent state in an even larger search space. The new search space allows new information to be integrated.

We hypothesize that this four-stage model allows us to describe coherence-building. We further suggest that implicit and explicit processes are involved at each stage; however, the ratio between them varies greatly across stages. Therefore, different measures are necessary to pinpoint the underlying cognitive processes at the different stages.

In the following section, we will elaborate on the four stages by collecting evidence from different fields for each stage.

# STAGE 1: SPREADING ACTIVATION AND CONSTRAINING

As **Figure 1** illustrates, a spreading activation account is not sufficient to explain the emergence of coherence. Pure spreading activation would result in an unsynchronized activation of unrelated information, which distorts the coherence-building process (see Wns in **Figure 1**).

We assume that each word activates associations (via spreading activation in the semantic network). The given clues constrain (shape) the search space. They strengthen particular features of the activated concepts and, at the same time, inhibit others. The interplay between the features of the clues, which could also be interrelated, constrains the search space until the solution word is isolated. Coherence is attained by finding the intersection of all the associations of the clue words. For the clue experiment this means that the more clues are provided, the narrower the search space becomes until the target word is isolated. The more overlapping associations the clues have, the more likely the detection of a coherent state. For the word-triads task this would explain why "coherent triads" are processed faster than "incoherent triads": incoherent triads share fewer associations that constrain the search space than coherent triads do.
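The narrowing of the search space by successive clues can be sketched as a running set intersection. This is only an illustration under strong assumptions: the association sets below are hypothetical and hand-picked, not association norms from any study, and real spreading activation is graded rather than all-or-none.

```python
# Hypothetical association sets, chosen only for illustration.
ASSOCIATIONS = {
    "Times": {"square", "table", "number", "newspaper"},
    "Inch": {"square", "number", "ruler", "foot"},
    "Dance": {"square", "number", "floor", "music"},
    "Box": {"square", "container", "office"},
}


def narrow_search_space(clues, associations):
    """Return the candidate set that remains after each successive clue."""
    candidates = None
    history = []
    for clue in clues:
        linked = associations[clue]
        # The first clue opens the search space; every later clue intersects it.
        candidates = set(linked) if candidates is None else candidates & linked
        history.append(set(candidates))
    return history


history = narrow_search_space(["Times", "Inch", "Dance", "Box"], ASSOCIATIONS)
sizes = [len(step) for step in history]  # size of the search space per clue
solution = history[-1]                   # what is left after the last clue
```

In this toy example the candidate set never grows as clues are added, and only the shared associate survives the last clue. A triad without a common associate would end with an empty set.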

Our argument is closely related to the work of Holyoak and Thagard (1995), Thagard and Verbeurgt (1998), and Thagard (2002). They provided a rule-based definition of coherence. Coherence follows a constraint satisfaction process. Constraint satisfaction is an idea which has been successfully applied in connectionist models, for example to model the perception of ambiguous figures (McClelland et al., 1986; McClelland and Rumelhart, 1989). An illustrative example is a model of the Necker cube (Necker, 1832), in which the nodes of one cube representation excite each other in parallel. This leads to a stable and coherent state. The excited nodes concurrently inhibit the nodes of the alternative cube representation.
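The settling behavior of such a network can be sketched with two rival groups of nodes. The sketch below is not the published Necker cube model: the number of nodes, the weights (`excite`, `inhibit`), the update rate, and the clipping of activations to the range 0 to 1 are all assumptions made for illustration. Nodes within one interpretation excite each other, and each node inhibits its counterpart in the rival interpretation.

```python
def settle(act_a, act_b, excite=0.1, inhibit=0.3, rate=0.5, steps=50):
    """Synchronously update two rival interpretations until they settle.

    act_a, act_b: initial activations (0..1) of corresponding nodes in
    interpretation A and interpretation B. All parameters are illustrative.
    """
    a, b = list(act_a), list(act_b)
    for _ in range(steps):
        new_a, new_b = [], []
        for i in range(len(a)):
            # Support from the other nodes of the same interpretation,
            # minus inhibition from the rival node at the same position.
            net_a = excite * (sum(a) - a[i]) - inhibit * b[i]
            net_b = excite * (sum(b) - b[i]) - inhibit * a[i]
            new_a.append(min(1.0, max(0.0, a[i] + rate * net_a)))
            new_b.append(min(1.0, max(0.0, b[i] + rate * net_b)))
        a, b = new_a, new_b
    return a, b


# A slight initial advantage decides which interpretation wins.
front, back = settle([0.6] * 4, [0.5] * 4)
back_wins, front_loses = settle([0.5] * 4, [0.6] * 4)
```

With these settings a small initial advantage for one interpretation is amplified until that interpretation is fully active and the rival is fully suppressed, which corresponds to one stable, coherent reading of the figure.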

Thagard and Verbeurgt (1998, see p. 2–3 for the detailed list) stated seven computational principles that define coherence:



This means that for the clue task we start with the clues "Times" and "Inch." Let us further assume that the concept "Times" activates, among others, a concept such as "Newspaper," and "Inch" a concept such as "unit of measurement," which results in a negative constraint between the activated concepts. A "feeling" of incoherence would occur. Providing additional information eventually results in a coherent state, once positive constraints between all the concepts are mutually activated.
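The clue example can be sketched as a small constraint satisfaction problem. Following the general formulation of Thagard and Verbeurgt (1998), the sketch assumes that elements are divided into accepted and rejected ones, that a positive constraint is satisfied when both elements fall on the same side, and that a negative constraint is satisfied when exactly one of the two is accepted. The concrete elements and constraints, and the decision to treat the presented clues as always accepted, are hypothetical choices for illustration. The exhaustive search is only feasible for toy problems.

```python
from itertools import product


def satisfied(accepted, positive, negative):
    """Count the constraints satisfied by a given set of accepted elements."""
    pos = sum((a in accepted) == (b in accepted) for a, b in positive)
    neg = sum((a in accepted) != (b in accepted) for a, b in negative)
    return pos + neg


def best_partition(fixed, free, positive, negative):
    """Try every accept/reject assignment of the free elements.

    `fixed` elements (here: the presented clues) are always accepted.
    Returns the best score and one accepted set that reaches it.
    """
    best_score, best_set = -1, set()
    for choice in product([True, False], repeat=len(free)):
        accepted = set(fixed) | {e for e, keep in zip(free, choice) if keep}
        score = satisfied(accepted, positive, negative)
        if score > best_score:
            best_score, best_set = score, accepted
    return best_score, best_set


# Two clues whose first associations conflict with each other.
early_score, _ = best_partition(
    fixed=["Times", "Inch"],
    free=["newspaper", "unit"],
    positive=[("Times", "newspaper"), ("Inch", "unit")],
    negative=[("newspaper", "unit")],
)

# More clues, all of which are positively linked to the concept "square".
late_score, late_accepted = best_partition(
    fixed=["Times", "Inch", "Dance", "Box"],
    free=["newspaper", "unit", "square"],
    positive=[("Times", "newspaper"), ("Inch", "unit"), ("Times", "square"),
              ("Inch", "square"), ("Dance", "square"), ("Box", "square")],
    negative=[("newspaper", "unit")],
)
```

In the two-clue case no assignment satisfies all three constraints, which corresponds to the residual "feeling" of incoherence. With more clues, a larger share of the constraints can be satisfied, and the shared concept is part of the best solution found.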

Einstein struggled with incoherent pictures resulting from pieces of information which did not fit together. Re-connecting the given information with the new understanding of the importance of the reference frame resolves the incoherence and consequently results in a coherent picture, that is, positive and satisfied constraints.

However, Einstein's thinking also shows the limitation of a pure constraint satisfaction account, because the solution is not always available in the initially activated search space. Sometimes it is necessary to overcome the given constraints to find novel coherent states (see stage 4 below).

Another question related to constraint satisfaction is how coherence could be implemented at a neuronal level. Lakoff and Johnson (1980) proposed a neural theory of metaphor (NTM) (Lakoff, 2009, 2014), which provides a detailed mechanism for coherence-building that has some relevance for our discussion. The following elements constitute NTM (Lakoff, 2009):

• Neural groups. Small networks of neurons. Neurons can mutually be part of different groups.


We postulate that a coherent state is closely related to Gestalt circuits. There are some nodes, e.g., A, B, C, D, and a Gestalt node G (**Figure 3**). If node G is firing, the nodes A, B, C, D are also firing. If a few nodes are activated and a threshold is surpassed, G is elicited. When G is inhibited, at least one of the other nodes is also inhibited. We propose that a Gestalt node could serve as a coordinating hub which binds information together. The node G constrains the search space by exciting A, B, C, D and inhibiting other nodes such as E and F. NTM provides several mechanisms for how distant concepts are linked together and how inferences can be drawn. Most important is the assumption that co-activation of remote concepts links those concepts and results in a coherent state.
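The behavior of a Gestalt circuit described here can be sketched in a few lines. The class name, the all-or-none activations, and the numeric threshold are assumptions for illustration. The sketch only covers the excitatory completion and the inhibition of non-members, not the case in which G itself is inhibited.

```python
class GestaltCircuit:
    """Toy Gestalt circuit: part nodes plus one coordinating node G."""

    def __init__(self, parts, threshold):
        self.parts = set(parts)
        # Hypothetical threshold: number of active parts needed to elicit G.
        self.threshold = threshold

    def respond(self, active):
        """Return the set of firing nodes after the circuit has responded."""
        active = set(active)
        if len(active & self.parts) >= self.threshold:
            # G fires, completes the pattern (all parts fire),
            # and inhibits nodes that do not belong to the circuit.
            return self.parts | {"G"}
        # Below threshold nothing changes.
        return active


circuit = GestaltCircuit(parts={"A", "B", "C", "D"}, threshold=3)
completed = circuit.respond({"A", "B", "C", "E"})  # enough parts are active
unchanged = circuit.respond({"A", "E"})            # too few parts are active
```

Once three of the four parts are active, the circuit completes the missing part D and suppresses the unrelated node E, which is one way to read the claim that G constrains the search space.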

We assume that for the clue example each single word could be seen as a Gestalt node. The word co-activates several other words or meanings that are linked to it. At the beginning (i.e., when the first few clues are provided) the clues excite remote nodes that overlap only partially and weakly. The more clues are presented, the more likely it becomes that a particular node will be increasingly co-activated until it reaches the threshold to consciousness. The target word could be viewed as a new Gestalt node which binds distinct features of all the other clue words together.

The tight link between Gestalt perception, binding, and consciousness was shown by the detection of synchronized EEG signals (Singer, 1999; Engel and Singer, 2001; Jung-Beeman et al., 2004; Uhlhaas et al., 2006; Öllinger, 2009). An instructive example is the sudden recognition of an ambiguous figure showing a Dalmatian dog sniffing at the ground (Tallon-Baudry and Bertrand, 1999). At first glance the image seems to be a scrambled pattern of black and white patches. After a while the patches re-organize, apparently out of nothing, into an arrangement of meaningful objects. One explanation for this phenomenon emphasizes the importance of gamma oscillations when viewers consciously recognize the Dalmatian dog. Tallon-Baudry and Bertrand (1999) proposed that this pattern stands for a binding process that builds a coherent picture from scrambled information.

# STAGE 2: COHERENCE DETECTION

How does a person realize that a coherent state has been reached? There are two intimately related concepts which might address this question. First, Topolinski and Strack (2009a,b,c), Topolinski et al. (2009), and Topolinski and Reber (2010) showed that process fluency is closely related to a coherent state. As mentioned above, coherent triads are processed more fluently than incoherent triads. Process fluency can be defined as the ease with which given information is processed by the cognitive system.

Process fluency could also be exploited as an indicator of a transition in a person's behavior while solving a series of problems. Haider and Frensch (1999), Wagner et al. (2004), Gaschler et al. (2013), Haider et al. (2013), and Dietrich and Haider (2015) have pursued the idea that such transitions occur during skill learning. They used, for example, the number reduction task (Wagner et al., 2004). In this task, participants are confronted with strings composed of three different digits, e.g., the string 1 1 4 4 9 4 9 4. There are two rules that have to be obeyed:


The task is to process the string stepwise from left to right. For the example given above 1 1 → 1. Then the task requires problem solvers to use the result from the first reduction and to take the next number from the string: 1 4 → 9; etc. The result of number reduction will be 9 for the string above. The strings were composed in a way that they either could be solved by this step-wise or sequential method, or much faster by realizing that there is a hidden rule, where the solution to the problem is already determined after the second attempt, since the sequence of the reduced digits is symmetrical [see Wagner et al. (2004) for the details of the task].
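The stepwise procedure can be sketched as follows. The two rules are not reproduced in the text above, so the sketch assumes them: two identical digits yield that same digit, and two different digits yield the third digit of the set {1, 4, 9}. This assumption is consistent with the worked reductions 1 1 → 1 and 1 4 → 9. The function names are hypothetical.

```python
DIGITS = {1, 4, 9}  # the three digits used in the example string


def reduce_pair(a, b):
    """Assumed rules: same digits give that digit, different give the third."""
    if a == b:
        return a
    (third,) = DIGITS - {a, b}
    return third


def responses(string):
    """Process the string from left to right and collect every response."""
    result = string[0]
    collected = []
    for digit in string[1:]:
        # Each step combines the previous result with the next digit.
        result = reduce_pair(result, digit)
        collected.append(result)
    return collected


resp = responses([1, 1, 4, 4, 9, 4, 9, 4])
final_answer = resp[-1]
```

For the example string the responses are 1, 9, 1, 4, 4, 1, 9. The last three responses mirror the three before them, so the final answer equals the second response. This is the regularity that makes the shortcut possible.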

The number reduction task makes it possible to determine the moment at which participants begin to use the hidden rule. A sudden drop in solution time is detectable, which cannot be explained by a stepwise learning process. Haider et al. (2013) postulated that after a large number of attempts implicit processes extract and detect the underlying regularity of the given sequences. This enables a processing shortcut, resulting in much higher process fluency. Such distinct behavioral changes can be noticed consciously by the participants. Noticing them allows participants to gain conscious insight into the symmetric nature of the response strings.

Another indicator that helps to realize a coherent state is the change of the affective state. This addresses the famous Aha! experience. The Aha! is described by a few dimensions, such as suddenness, positive affect, or the feeling of being right (Topolinski and Reber, 2010; Danek et al., 2013; Danek and Wiley, 2016). It seems conceivable that such changes could easily be detected by the problem solver and could lead to the re-evaluation of the problem-solving process.

It is important to note that an Aha! experience is not a proper predictor for the correctness of the solution (Koffka, 1935; Danek and Wiley, 2016; Salvi et al., 2016).

# STAGE 3: EVALUATION

At this stage the result of the coherence-building process is evaluated. The problem solver validates whether the solution fits the given requirements and meets the desired goal. The solution is either found and coherent or the result is incorrect, which necessitates a restart of the search.

Heider (1946) called a coherent state a state of balance. The given elements (information) fit together and there are no contradicting relations between them. Heider's account explains the need for coherence. Incoherence leads to tension within the system, and there is a tendency toward a balance state. This might explain why, in the first place, the cognitive system has a drive toward coherence. Heider's field theoretical approach addressed the relations between persons and objects. Heider aimed at providing the determinants of social behavior and social perception. Beyond that, we propose that Heider's account is generally applicable to situations where mutual relations of interdependent information are given. It provides a rule-based framework explaining the dynamics of coherence-building. Cartwright and Harary (1956, p. 266) summarized Heider's account as follows.

Given a P-O-X unit consisting of a person P, another person O, and an impersonal unit X, the relations between the parts of the unit are interdependent. If P likes O and O is seen as responsible for X, then there would be a tendency for P also to like X. This would be a balance state. If X has a negative relation with P, then an imbalanced state results. In the person O the need arises to change the situation toward a balance state, e.g., by changing the relation between P and O from "like" to "not like." A state of balance results. Cartwright and Harary (1956) showed by a general graph theoretical account that Heider's three-element approach can be extended to more complicated situations.
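The balance criterion can be sketched for signed graphs of any size. The sketch relies on the structure theorem associated with Cartwright and Harary (1956): a signed graph is balanced exactly when its nodes can be split into two groups such that positive relations hold only within groups and negative relations only between groups. The data format, with +1 for a positive and -1 for a negative relation, is an assumption for illustration.

```python
from collections import deque


def is_balanced(edges):
    """Check balance of an undirected signed graph.

    edges: dict mapping (u, v) pairs to +1 (positive) or -1 (negative).
    """
    neighbors = {}
    for (u, v), sign in edges.items():
        neighbors.setdefault(u, []).append((v, sign))
        neighbors.setdefault(v, []).append((u, sign))

    group = {}  # node -> +1 or -1, the side of the two-group split
    for start in neighbors:
        if start in group:
            continue
        group[start] = 1
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other, sign in neighbors[node]:
                # A positive relation keeps the group, a negative one flips it.
                expected = group[node] * sign
                if other not in group:
                    group[other] = expected
                    queue.append(other)
                elif group[other] != expected:
                    return False  # a cycle with a negative sign product
    return True


all_positive = is_balanced({("P", "O"): 1, ("O", "X"): 1, ("P", "X"): 1})
p_dislikes_x = is_balanced({("P", "O"): 1, ("O", "X"): 1, ("P", "X"): -1})
p_dislikes_both = is_balanced({("P", "O"): -1, ("O", "X"): 1, ("P", "X"): -1})
```

The three calls reproduce the example: the all-positive triad is balanced, a single negative relation between P and X makes it imbalanced, and additionally changing the relation between P and O to negative restores balance.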

Following this account, incoherence leads to the drive to search for a state of balance, and there is a schema that justifies that the deductions within the given information are mutually consistent. This implies the search for new relationships between the existing information, driven by logical consistency with the existing information. This search process might to a great extent be unconscious, but it will be shaped by the person's attention, deliberations, prior knowledge, attitudes, and motivations.

The theory of balance has some similarities with Thagard and colleagues' idea of constraint satisfaction (see above). An important question is how the cognitive system resolves existing conflicts.

Hélie and Sun (2010) proposed an elegant framework that provides a conflict resolution mechanism. Their explicit-implicit interaction theory (EII theory) assumes the parallel activation of implicit processes, which are mainly associative, and explicit processes, which are driven by attention and characterized by more precise and distinct information processing. The explicit processes are predetermined by hard constraints. Processing a new problem simultaneously activates the two systems. Conflict resolution is necessary when no satisfying result is found. As a consequence, the results from both systems (implicit and explicit) are integrated into one representation. This result is fed in as new input. The program repeats the conflict resolution and integration cycle until the goal state is found.

The authors tested their model on a famous study of insight problem solving (Durso et al., 1994). Originally, Durso et al. (1994) introduced a graph theoretical approach that combined the idea of semantic network analysis with the concept of restructuring (Ohlsson, 1984a,b). The goal of the study was to uncover participants' underlying knowledge structures when solving an insight problem. Durso et al. (1994) asked participants to solve the following puzzle: "A man walks into a bar and asks for a glass of water. The bartender points a shotgun at the man. The man says, 'Thank you,' and walks out." (Durso et al., 1994, p. 95). While solving the problem participants answered 'yes' and 'no' questions. The questions were intended to reveal the individual problem representation, e.g., question: "Was the man thirsty?" – answer: "No". Afterward participants were asked to judge the relatedness of pairs of concepts (e.g., bartender, surprise). From these data semantic graphs were constructed. In the next step, the authors compared the semantic graphs of solvers and non-solvers. It was found that solvers were more likely to represent direct connections between concepts which refer to the solution (e.g., surprise and remedy). Non-solvers focused more strongly on facts which were explicitly given (e.g., bartender and man). Solvers represented important aspects of the problem very early. Durso et al. (1994) concluded that the relatedness between certain concepts determines the likelihood of restructuring (see below, stage 4).

Given this finding, Hélie and Sun (2010) modeled this puzzle, known as the hiccups problem, with the connectionist network CLARION. CLARION's explicit knowledge system was fed with answers to the yes–no questions. Initially, it mainly represented the given task instruction. The associations between concepts were randomly determined and built the implicit system. The degree of randomness was varied between conditions. The authors found that the higher the randomness score, the more likely a graph structure emerged which resembles the solvers' structure actually found by Durso et al. (1994). Higher variation rates allowed better conflict resolution, which results in the desired solution.

Importantly, the authors suggested that higher randomness leads to more frequent remote and distant concept associations. Those associations are often incoherent with the given explicit knowledge representation. The conflict between the implicit and explicit representations might result in the generation of new and insightful hypotheses which help to solve the problem.

Conflict detection also plays an important role in the field of intuition research. Kahneman (2012) showed how misleading first intuitions can be. For example, in the famous Linda problem a number of statements about a fictitious person are given. Linda is 31 years old, outspoken, bright, single. She majored in philosophy. As a student she was deeply concerned with issues of discrimination, social justice, and also participated in anti-nuclear demonstrations (Tversky and Kahneman, 1983).

After reading the description participants were asked to choose the statement which seems more probable: (a) Linda is a bank teller, or (b) Linda is a bank teller and is active in the feminist movement. Most participants opt for statement (b). The answer is wrong. Assume, for illustration, that the probability that Linda is a bank teller is 60%, that the probability that she is active in the feminist movement is 70%, and that the two are independent. The probability of the conjunction is then the product of both, 42% (0.6 × 0.7 = 0.42). Because probabilities cannot exceed 1, the product can never be larger than either factor. Consequently, option (a) is the only correct answer. Tversky and Kahneman (1983) proposed that participants used an implicit (intuitive, system 1) heuristic which is biased toward option (b), because (b) seems more representative than (a). After a deliberate evaluation (system 2) it should become clear that (a) is the correct answer. In our discussion this means that a first coherent representation of the problem results from prior knowledge constraints or heuristics which restrict the evaluation process. Consequently, a conflict is detected between the apparent solution and the actual (logical) solution. In our model the participant would also commit an error, since external feedback on whether the solution is correct would be necessary at the evaluation stage to restart the process. Then the coherence-building process could be restructured. We do not agree with Kahneman's conclusion that intuitive processes are per se problematic. Moreover, there are alternative accounts which demonstrate how the conjunction fallacy could be explained (e.g., Tentori et al., 2013).
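The logical point of the Linda example does not depend on the particular numbers or on independence. As a short worked derivation, with $A$ standing for "Linda is a bank teller" and $B$ for "Linda is active in the feminist movement" (symbols introduced here for illustration), the product rule gives:

$$
P(A \wedge B) = P(A)\,P(B \mid A) \le P(A), \qquad P(A \wedge B) = P(B)\,P(A \mid B) \le P(B),
$$

because every conditional probability is at most 1. With the illustrative values used above and independence assumed, $P(A \wedge B) = 0.6 \times 0.7 = 0.42 \le 0.6$.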

In contrast to Kahneman, Gigerenzer and Todd (2001a,b), Kruglanski and Gigerenzer (2011), and Mega et al. (2015) assumed that intuitions help to solve problems in a fast and frugal way. Consider, for example, the following question: "Which city has the better football team – Karlsruhe or Munich?" You do not know Karlsruhe, so you opt for Munich. The recognition heuristic (Gigerenzer and Todd, 2001a) helps to solve the problem. The idea is that uncertainty is reduced by relying on the ease of recognition. That means that a processing advantage indicates a potential solution to the problem. In our example, larger cities are more familiar. This corresponds to a higher likelihood of having a successful football team. However, Gigerenzer's approach also has its limitations. Replacing the cities in the above example with Nuremberg and Hoffenheim would lead the recognition heuristic to a wrong solution. Hoffenheim is fairly unknown but has a better football team than Nuremberg.
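The decision rule in this example can be sketched as follows. This is a minimal reading of the recognition heuristic as described above, not an implementation from the cited work. The function name and the choice to return `None` when recognition does not discriminate are assumptions.

```python
def recognition_choice(option_a, option_b, recognized):
    """Pick the recognized option if exactly one of the two is recognized."""
    a_known = option_a in recognized
    b_known = option_b in recognized
    if a_known and not b_known:
        return option_a
    if b_known and not a_known:
        return option_b
    # Both or neither recognized: the heuristic cannot decide.
    return None


# A person who recognizes Munich but not Karlsruhe.
first = recognition_choice("Karlsruhe", "Munich", recognized={"Munich"})

# A person who recognizes Nuremberg but not Hoffenheim.
second = recognition_choice("Nuremberg", "Hoffenheim", recognized={"Nuremberg"})

# A person who recognizes both cities.
third = recognition_choice("Karlsruhe", "Munich", recognized={"Karlsruhe", "Munich"})
```

The second call returns Nuremberg, which according to the example above is the wrong answer. The rule itself carries no information about correctness, which is why a separate evaluation step is needed.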

In sum, both the deliberate and the intuitive account need an evaluation process which justifies that the found solution is plausible and reliable. Both systems can provide erroneous results.

Generally, Kruglanski and Gigerenzer (2011) criticized (sensu Keren and Schul, 2009; Keren, 2013) that the dichotomy between an intuitive and a deliberate system might be arbitrary. They proposed a rule-based account which relies in principle on if-then rules, as does our approach. The authors elaborated on this assumption and argued that deliberate and intuitive judgments could be based on the same rules, as they demonstrated for the recognition heuristic (p. 100). They further assume that rules could be hardwired and that explicit rules become implicit after training and expertise. The rule selection process is constrained by the task type: certain heuristics fit the task requirements, others do not. Again, expertise and prior knowledge play an important role.

Thomson et al. (2015) provided a cognitive model that is based on the ACT-R framework to model intuitive decision making. ACT-R is a sophisticated production system. Productions consist of an IF part (the conditions) and a THEN part which represents an action. If a condition is matched, the production system will execute the action. Productions could be newly learned, modified, or compiled. They are mostly explicit at the beginning of learning, and become implicit after repeated training. The ACT-R system is divided into an implicit and a declarative (explicit) memory system, and has a goal stack which controls the flow of operations like a working memory. Information is stored in chunks. The strength of a chunk is determined by its recency and its frequency of retrieval (ease of recall). The authors assume that implicit memory content is activated by matching the given information. Spreading activation is presupposed and implemented by allowing associations between existing chunks. Attentional processes guide activation. The strengths of associations are determined by their co-occurrence in the past. The authors emphasized that intuition is a blend of consciously accessible and consciously inaccessible information. They suggest that retrieval processes are mainly unconscious, whereas declarative knowledge elements and the selection of heuristics and strategies are more deliberate and conscious.
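Two of the ingredients named above, an IF-THEN production and a chunk strength that depends on recency and frequency, can be illustrated in a few lines. This is a sketch under stated assumptions and not the model of Thomson et al. (2015): the activation function follows ACT-R's base-level learning equation, B = ln(Σ t_j^(-d)), with the conventional decay d = 0.5, and the production and all names are hypothetical.

```python
import math

def base_level_activation(use_times, now, decay=0.5):
    """ACT-R style base-level activation: ln of the summed age ** -decay of past uses."""
    ages = [now - t for t in use_times if t < now]
    if not ages:
        return float("-inf")  # never used, so effectively not retrievable
    return math.log(sum(age ** -decay for age in ages))

def production(state):
    """Hypothetical rule: IF exactly one option is recognized THEN choose it."""
    known = [c for c in state["options"] if c in state["recognized"]]
    if len(known) == 1:              # IF part (condition)
        return {"choose": known[0]}  # THEN part (action)
    return None                      # condition not matched, the rule does not fire

rare_old = base_level_activation([1.0], now=100.0)              # used once, long ago
frequent = base_level_activation([1.0, 40.0, 80.0], now=100.0)  # used three times
recent = base_level_activation([99.0], now=100.0)               # used once, just now

print(frequent > rare_old, recent > rare_old)  # True True
print(production({"options": ["Karlsruhe", "Munich"], "recognized": {"Munich"}}))
```

Both a more frequent and a more recent use raise the activation of a chunk, which is the sense in which strength reflects ease of recall.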

Taken together, the results of different fields show that intuitive and insightful problem solving could be modeled by rule-based accounts that entail similar properties (like implicit and explicit systems). Problem solving needs both systems to detect conflicts which drive the search for new associations. Eventually, the search results in a new coherent state. However, there are situations where the building of new associations or the combination of implicit and explicit information is not enough. These situations require a deeper structural change, namely the restructuring of the search space.

# STAGE 4: RESTRUCTURING AS COHERENCE BUILDING PROCESS

The Gestalt psychologists (Wertheimer, 1923, 1959; Duncker, 1935; Koffka, 1935; Katona, 1940; Köhler, 1947) showed a major interest in the question of under which conditions perceptual information is grouped into meaningful units. They identified that similarity, symmetry, and the proximity of perceptual elements affect the grouping process. For Köhler (1947) and Wertheimer (1959), re-grouping (restructuring) of the given information was the major factor for productive or insightful thinking.

**Figure 4** illustrates the grouping dynamics by the Parallelogram-Square problem (Wertheimer, 1925). The task requires determining the sum of the area of the parallelogram and the area of the square, given "a" and "b" (**Figure 4A**). A beautiful solution entails that the given lines are restructured so that two right-angled triangles result (**Figure 4B**). Eventually, the triangles form a rectangle (new grouping, **Figure 4C**). Now, it is simple to determine the area: "a" × "b".

Within the field of insight problem solving, constraints play a significant role (Isaak and Just, 1995; Knoblich et al., 1999). Ohlsson (1992, 2011) argues that a problem activates prior knowledge from long-term memory. The activated knowledge imposes constraints on the representation. It was demonstrated in several studies (Knoblich et al., 1999, 2001; Kershaw and Ohlsson, 2004; Öllinger et al., 2006, 2008, 2013, 2014; Danek et al., 2013, 2014; Kershaw et al., 2013) that self-imposed constraints were the main source of problem difficulty. The relaxation of constraints leads to a new problem representation which allows for novel insights. There is a major transition from a state of "not knowing a solution" to a state of "knowing a solution" (Ohlsson, 2011; Danek et al., 2014).

It is important to note that constraint satisfaction does not need to provide a solution. **Figure 5** shows the famous Nine-dot problem. The task is to connect the given nine dots by four connected straight lines, without lifting the pen or retracing a line.

The Nine-dot problem proves to be extremely difficult. The common explanation claimed that a Gestalt-like perception of the given nine dots prevents drawing lines beyond the perceptual boundaries (Maier, 1930; Kershaw and Ohlsson, 2004; Öllinger et al., 2014). Importantly, it has hardly been recognized (Öllinger et al., 2014) that after problem solvers had relaxed the perceptual constraint, an even larger search space resulted, as adumbrated in **Figure 5C**. The scattered dots emphasize that after constraint relaxation (restructuring) lines could be drawn to arbitrary positions outside the boundaries of the nine dots. Consequently, it is not trivial to find the correct sequence of lines connecting all dots (Weisberg and Alba, 1981). Öllinger et al. (2014) showed that the concerted interplay between heuristics, which restrict the search space, and constraint relaxation, which expands the search space, is sufficient to solve the problem.
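The effect of constraint relaxation on the size of the search space can be made concrete with a small enumeration. This is an illustrative sketch and not the procedure of Öllinger et al. (2014). It assumes, for simplicity, that line endpoints lie on an integer lattice: the 3 × 3 lattice of the nine dots stands for the perceptual constraint, and a 4 × 4 lattice that extends one unit beyond the dots stands for the relaxed constraint. Candidate paths are counted generously, including paths that double back on themselves.

```python
DOTS = [(x, y) for x in range(3) for y in range(3)]
ALL_DOTS = (1 << len(DOTS)) - 1

def lattice(n):
    """All integer points of an n x n lattice with its origin at the lower-left dot."""
    return [(x, y) for x in range(n) for y in range(n)]

def covered(p, q):
    """Bitmask of the nine dots that lie on the closed segment from p to q."""
    mask = 0
    for i, (x, y) in enumerate(DOTS):
        cross = (q[0] - p[0]) * (y - p[1]) - (q[1] - p[1]) * (x - p[0])
        inside = (min(p[0], q[0]) <= x <= max(p[0], q[0])
                  and min(p[1], q[1]) <= y <= max(p[1], q[1]))
        if cross == 0 and inside:
            mask |= 1 << i
    return mask

def count_paths(points, n_lines=4):
    """Return (number of candidate paths, number of paths covering all nine dots)."""
    seg = {(p, q): covered(p, q) for p in points for q in points if p != q}
    total = solved = 0

    def extend(last, mask, lines_left):
        nonlocal total, solved
        if lines_left == 0:
            total += 1
            solved += mask == ALL_DOTS
            return
        for q in points:
            if q != last:
                extend(q, mask | seg[(last, q)], lines_left - 1)

    for start in points:
        extend(start, 0, n_lines)
    return total, solved

inside = count_paths(lattice(3))   # lines must end on one of the nine dots
relaxed = count_paths(lattice(4))  # lines may end one unit outside the dots

print(inside[0], inside[1] > 0)    # 36864 False
print(relaxed[0], relaxed[1] > 0)  # 810000 True
```

Under these assumptions the constrained space contains no solution at all, whereas the relaxed space contains solutions but is more than twenty times larger, which is why relaxation alone does not make the problem easy.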

In sum, restructuring allows problem solvers to search for the solution within a new search space. The larger search space enables the activation of new concepts. The new concepts could be integrated or build interrelationships with already existing concepts of the problem representation. It is necessary that the larger search space is restricted by constraints that guide the coherence-building process.

# EXAMPLES AND GENERALIZATION

In this section, we elaborate on the stage model. **Figure 6** shows an introductory example which illustrates the basic principles of coherence-building. First, three arbitrary dots were presented. According to our model, in stage 1 implicit processes spread activation and constrain the search space by prior knowledge. The dots "start" to build interrelationships with each other. At a neural level the dot pattern results in a synchronized spatial activation pattern which organizes the three dots into a unified representation (Koffka, 1935; Singer, 1999; Engel and Singer, 2001; Tallon-Baudry, 2003; Hebb, 2005).

Following Lakoff's approach (**Figure 3**), the three dots will be connected via a Gestalt node which orchestrates the interplay and co-activation (Hebb's rule: "fire together, wire together") of the three dots. The Gestalt node coordinates the coherent state. The three dots build a triangle. The concept of a triangle (another Gestalt node) is associated with knowledge about triangles (form, rules, and theorems). This would be the result of stage 2. At a conscious level the recognition of a triangle occurs. At stage 3 the evaluation could focus on the question of whether this finding is significant, reliable, or interesting. However, it is neither necessary nor pre-determined that a triangle is recognized. Other coherent representations are conceivable and are mostly driven by the given task set, context, prior knowledge, and/or instructions, e.g., the three dots could also activate the concepts of a number (three) or trinity. Others will recognize the dots as representing individual subjects who have certain relationships: two of the dots seem to be more closely linked, and one seems to be more distant. In principle, a rather large number of coherent states are possible, all of which could be evaluated or further developed. Maybe the last example led the reader into a phase of restructuring which changes the coherence-building process (stage 4) from the triangle to the social domain.
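Hebb's rule, as invoked above, can be sketched as a weight update in which the link between two units grows whenever both are active at the same time. The four units (three dots plus one unrelated unit), the learning rate, and the function name are hypothetical choices made for illustration only.

```python
def hebbian_step(weights, activity, rate=0.25):
    """Strengthen the link between every pair of distinct units that are co-active."""
    n = len(activity)
    return [
        [weights[i][j] + (rate * activity[i] * activity[j] if i != j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]

# Units 0-2 stand for the three dots; unit 3 is an unrelated, inactive unit.
weights = [[0.0] * 4 for _ in range(4)]
pattern = [1.0, 1.0, 1.0, 0.0]

for _ in range(4):  # the three dots are presented together four times
    weights = hebbian_step(weights, pattern)

print(weights[0][1], weights[0][3])  # 1.0 0.0
```

After repeated co-activation the three dot units are mutually linked, while the inactive unit stays unconnected, which is one simple way to read the binding of the dots into a unified representation.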

Japanese haikus (von Müller, 2015) illustrate the dynamics of coherence-building in a more sophisticated field. Haikus are poems that have a well-defined phrase structure like in the famous haiku:

#### the stillness penetrating the rock a cicada's cry

#### Basho (1644–1694)

Initially, reading Basho's beautiful haiku word by word might seem confusing. It does not immediately become clear what is meant by the given words and how they are interrelated, so that a state of imbalance and conflict might occur. After a few iterations through the phrases it is possible that new interrelationships between the concepts are elicited. First there is the image of a state of silence that is turned into a state of noise by a cicada's cry. The contrast increases and is emphasized. It is alternating between stillness and noise, where both are so intense that even a rock is penetrated. This draws the picture of strong forces which almost hurt. Lastly, it is imaginable to assign different directions to the forces caused by noise and stillness. It seems that noise drills into the rock, whereas stillness corrodes the rock. The whole meaning unfolds from the presence of all three parts of the haiku, and the constant re-interpretation (restructuring) of the different parts might result in a vivid image of the scene until a beautiful coherent representation (image) results.

Haikus might provide a rich source for new empirical research, e.g., to investigate in more detail how coherence-building is influenced when the order of the phrases is shuffled or words are replaced. Would it result in the same coherent image at the end or would it result in a distorted image which becomes meaningless?

Our final example is taken from the domain of insight problem solving. It is used to demonstrate how our model promotes a more detailed and elaborated view on problem representations of already well-known standard insight problems. We chose Duncker's (1945) tumor problem: "Given a human being with an inoperable stomach tumor, and lasers which destroy organic tissue at sufficient intensity, how can one cure the person with these lasers and, at the same time, avoid harming the healthy tissue that surrounds the tumor?" Duncker was one of the first to use thinking aloud protocols to uncover participants' thought processes (Ericsson and Simon, 1993). **Figure 7** shows Duncker's thinking aloud analysis of various solution attempts.

The rightmost path in **Figure 7** shows an elegant solution to the problem. The solution requires superimposing rays of weak intensity at the tumor, so that the tumor is destroyed and the surrounding skin is not affected.

The tumor problem proved resistant to hints and analogical transfer (Gick and Holyoak, 1980, 1983), and was difficult to solve. For quite a long time it was unclear what caused the difficulty of the problem.

Grant and Spivey (2003) provided participants with a sketch of the problem, such as **Figure 8**. In a first experiment, they recorded the eye-movement patterns. They analyzed the patterns of successful and unsuccessful problem solvers. They found that successful solvers were more likely to attend to the surrounding skin, whereas unsuccessful participants fixated on the tumor. Ingeniously, the authors ran a second experiment with three conditions. In the animated skin condition, the skin was flickering. In the animated tumor condition, the tumor was flickering. In the third condition a static picture was presented (control condition). As expected, the animated skin condition outperformed the two other groups (solvers: 67% animated skin condition; 33% animated tumor condition; 37% static control condition).

Duncker's and Grant and Spivey's findings suggest an initial representation of the tumor problem as depicted in **Figure 9A**. Initially, the given concepts were constrained by the importance of the tumor and did not integrate the remote concept "superposition", which is the key concept of the solution. After evaluation (stage 3) it becomes clear that a solution within this representation is impossible, and a state of imbalance is reached which increases the need to drive toward a state of balance. Restructuring (stage 4), which expands the search space, is necessary. For the tumor problem restructuring requires a broad associative search with a high variation rate (Hélie and Sun, 2010). Importantly, the search process is not blind, but guided by constraints which are stated by the instruction and the goal representation, which is strongly tied to the concept of "skin" (Dietrich and Haider, 2015). New associations are possible and a state of balance between the given concepts could be attained. Consequently, a coherent representation results which entails the solution to the problem.

According to Grant and Spivey's (2003) finding, skin becomes the driving concept which integrates superposition and leads to new interrelated concepts. Hebbian learning is elicited and leads to a new coherent representation which links the concepts tumor, destruction, laser, and superposition.

Currently, we see a large gap between the empirical data, which demonstrate effects of varying experimental conditions, and the underlying knowledge structures. We propose that our four-stage model allows for the pinpointing of knowledge structures. To do so, it is necessary to validate hypothetical assumptions on potential problem representations by using quantitative measures which reveal the actual knowledge structure. We assume that the four-stage model might help to choose the appropriate means. In the next section we will summarize a few potential measures at the behavioral level.

# MEASURING COHERENCE BUILDING

Since the early work of cognitive psychologists (Newell and Simon, 1972) it has been a main goal to discover significant individual representations during the stream of problem solving. This also holds true for measuring coherence. How could the experimenter realize that a coherent state is achieved? Answering this question is important for the empirical test of our model. We assume that it is helpful to have different measures which could be assigned to the different stages of our model. We list a few potential measures:


nodes) are activated during spreading activation by an individual at the beginning of the problem solving process or later on (stage 1–stage 4). This is crucial for learning more about the actual representations of the problem and potential changes during the time course.


We assume that a detailed understanding of coherence-building is the key to answering the questions of when and why a biased or inappropriate representation leads to false intuitions, or why problem solvers get stuck in an impasse. We also assume that to pinpoint the processes and knowledge structure it is crucial to decide whether a problem is solved with or without insight. Nowadays, we either rely on the weak and tautological assumption that insight problems require insight, or we rely on subjective experience like indicating an Aha! (Öllinger and Knoblich, 2009).

# DISCUSSION

We demonstrated that intuition and insight share some significant features and could be explained within a four-stage model. In both domains constraints play an important role. Constraints drive coherent states (Thagard, 2002), but also restrict the search space. Prior knowledge imposes rules and activates heuristics and problem solving strategies (Ohlsson, 2011). Intuition is in our understanding a result of a mainly automatic and implicit process which results from constraining processes and simple heuristics and rules which could lead to the solution or could be misleading.

A simple pattern matching mechanism guides the selection of competing rules or heuristics (Kruglanski and Gigerenzer, 2011). The selected rule determines the processing of the given information and determines the frame for the coherence building process. For example, in our three-dot example we showed that, according to the selected rules, the three dots could cohere in a triangle, a number representation, or social interactions. Spreading activation (Collins and Loftus, 1975) and the variation of combinatorial links between remote concepts are key features which help to come up with new coherent states of difficult problems (Hélie and Sun, 2010).

Coherence in this framework could be understood as a state of balance (Heider, 1946), in which the concepts within the constrained representation have a consistent interrelatedness without conflicts. Such a state leads to a higher process fluency which causes detectable behavioral changes (Wagner et al., 2004; Topolinski and Strack, 2009c). Gestalt nodes (Lakoff, 2009) stand for the condensed meaning of the linked concepts and bind the given pieces of information together. Additionally, new meanings (links) could emerge by the binding processes. At the neural level, it seems plausible that Hebbian learning plays an important role and strengthens the connection between simultaneously activated concepts.

Our model extends Bowers et al.'s (1990) model in a few aspects. In contrast to Bowers' model we assume a constraining process at the guiding stage in addition to spreading activation. Importantly, the accumulation of information is not a necessary criterion for a coherent state in our model. The coherence-building in our model is implemented by constraint satisfaction.

The result is a balanced state. If such a state is reached, then the process fluency will increase and cause behavioral changes (Wagner et al., 2004). Those changes help the coherent state to surpass the threshold to consciousness. Coherence-building is recursive, widely implicit, and consists of conflict resolution and the integration of information. The process is affected and guided by attentional processes and deliberate thinking. We also emphasized the existence of a restructuring stage which overcomes already elicited coherent representations by changing the search space. This indicates a qualitative change in the problem-solving process. The constraint satisfaction process is still active, but now more remote concepts could be integrated in a new representation. As we showed, variation plays an important role in building those new associations (Fedor et al., 2017). In our understanding restructuring demarcates intuition from insight. Intuition could result from the realization of a coherent representation, resulting in a hunch about how a problem could be solved and accompanied by affective and cognitive processes. Insight, in contrast, results from a restructured problem representation which allows a new and unusual solution to a problem and suddenly leads to a deep understanding of the given problem. That means intuition evaluates the coherence of the given information, whereas insight evaluates the result of restructuring.
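Coherence-building by constraint satisfaction can be sketched in the spirit of Thagard's (2002) formulation, in which elements are divided into an accepted and a rejected set, a positive constraint is satisfied when both of its elements fall into the same set, and a negative constraint is satisfied when they fall into different sets. The elements and weights below are hypothetical labels inspired by the tumor problem and are not taken from any of the cited studies.

```python
from itertools import product

def coherence(assignment, positive, negative):
    """Sum of the weights of all satisfied constraints."""
    score = 0.0
    for a, b, w in positive:  # satisfied when both elements land in the same set
        if assignment[a] == assignment[b]:
            score += w
    for a, b, w in negative:  # satisfied when the elements land in different sets
        if assignment[a] != assignment[b]:
            score += w
    return score

def most_coherent(elements, positive, negative):
    """Brute-force search over all accept/reject partitions of the elements."""
    best = None
    for values in product([True, False], repeat=len(elements)):
        assignment = dict(zip(elements, values))
        score = coherence(assignment, positive, negative)
        if best is None or score > best[0]:
            best = (score, assignment)
    return best

elements = ["high_intensity_ray", "destroy_tumor", "spare_tissue"]
positive = [("high_intensity_ray", "destroy_tumor", 1.0),
            ("destroy_tumor", "spare_tissue", 1.0)]
negative = [("high_intensity_ray", "spare_tissue", 1.5)]

best_score, best_assignment = most_coherent(elements, positive, negative)
total_weight = sum(w for _, _, w in positive + negative)

print(best_score, total_weight)  # 2.5 3.5
```

In this toy representation no partition satisfies all constraints, so the best achievable coherence stays below the total weight. Such a residual conflict corresponds to the state of imbalance that, in the model described above, calls for restructuring rather than further search within the same representation.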

# OPEN QUESTIONS AND LIMITATIONS

An important premise of the four-stage model is that constraint satisfaction and binding are the basic processes, one at a cognitive, the other at a neural level. There are alternative accounts that question the idea of binding by synchrony (Hayworth, 2012) or provide alternative accounts for the combination of information, such as the latching mechanism provided by Amati and Shallice (2007) and Song et al. (2014), or binding by convolution introduced by Thagard and Stewart (2011). We leave this question open to be answered by future work. We are confident that our model would also work with an alternative binding process.

Another open point is why the system tends to search for a coherent state or a state of balance. Related to this point is the question of whether there are problems where imbalance is necessary to solve the problem. Furthermore, it would be helpful to determine, at an individual level, which personality traits or characteristics increase the probability of finding coherence.

The notion of a rule-based account is also questionable. This refers to the notion of dual systems. Dual system accounts in general differentiate between a fast, unconscious, unlimited, holistic system (system 1) and a slow, deliberate, logical, restricted system (system 2) (Evans, 2008; Kahneman, 2012). Insight and intuition are often assigned to system 1 processes. For many years, there have been discussions about whether such two separate systems, modes, or processes are necessary, plausible, well-defined, and complete (Evans, 2008; Kruglanski and Gigerenzer, 2011; Kahneman, 2012; Evans and Stanovich, 2013; Mega et al., 2015).

In our line of argumentation, we followed Kruglanski and Gigerenzer (2011), who proposed a rule-based account in which the rules range between an explicit and an implicit level. We think this account is also supported by the modeling accounts we reviewed above (Hélie and Sun, 2010; Thomson et al., 2015).

In contrast, Evans and Stanovich (2013) disagree with this proposal and argue that there are clear indicators for two systems. Most important would be the fact that only system 2 supports hypothetical thinking and imposes a heavy working memory load. Again, we are not in a position to resolve this discussion right now, but we think that our model might help to search for unified processes which vary in the processing stage.

In sum, we hope to have demonstrated a more general model of insight and intuition which shows that insight and intuition are two different sides of the same coin.

# AUTHOR CONTRIBUTIONS

MÖ provided the review on grouping phenomena in psychology. AvM developed the idea on coherence building and provided the philosophical foundations.

# REFERENCES


in Knowledge Organization, ed. R. Schvaneveldt (Norwood, NJ: Ablex), 267–277.


tile human cerebral cortex. Nature 532, 453–458. doi: 10.1038/nature17637



Thagard, P. (2002). Coherence in Thought and Action. Cambridge, MA: MIT Press.

Thagard, P., and Stewart, T. C. (2011). The AHA! experience: creativity through emergent binding in neural networks. Cogn. Sci. 35, 1–33. doi: 10.1111/j.1551-6709.2010.01142.x


Wallas, G. (1926). The Art of Thought. New York, NY: Harcourt Brace Jovanovich.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Öllinger and von Müller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Insight Is Not in the Problem: Investigating Insight in Problem Solving across Task Types

Margaret E. Webb\*, Daniel R. Little and Simon J. Cropper

Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, VIC, Australia

The feeling of insight in problem solving is typically associated with the sudden realization of a solution that appears obviously correct (Kounios et al., 2006). Salvi et al. (2016) found that a solution accompanied by sudden insight is more likely to be correct than a solution reached through conscious and incremental steps. However, Metcalfe (1986) indicated that participants would often present an inelegant but plausible (wrong) answer as correct with a high feeling of warmth (a subjective measure of closeness to solution). This discrepancy may be due to the use of different tasks or due to different methods in the measurement of insight (i.e., using a binary vs. continuous scale). In three experiments, we investigated both findings, using many different problem tasks (e.g., Compound Remote Associates, so-called classic insight problems, and non-insight problems). Participants rated insight-related affect (feelings of Aha-experience, confidence, surprise, impasse, and pleasure) on continuous scales. As expected, we found that, for problems designed to elicit insight, correct solutions elicited higher proportions of reported insight in the solution compared to non-insight solutions; further, correct solutions elicited stronger feelings of insight compared to incorrect solutions.

#### Edited by:

Michael Öllinger, Parmenides Foundation, Germany

#### Reviewed by:

Carlo Reverberi, University of Milano-Bicocca, Italy

Ute Schmid, University of Bamberg, Germany

\*Correspondence:

Margaret E. Webb mbwebb@student.unimelb.edu.au

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 10 June 2016 Accepted: 06 September 2016 Published: 26 September 2016

#### Citation:

Webb ME, Little DR and Cropper SJ (2016) Insight Is Not in the Problem: Investigating Insight in Problem Solving across Task Types. Front. Psychol. 7:1424. doi: 10.3389/fpsyg.2016.01424

Keywords: insight, problem solving, accuracy

# INTRODUCTION

Insight or Aha is often identified as the subjectively distinct feeling of sudden and unexpected understanding that may accompany attempts to solve a problem (Sternberg and Davidson, 1995; Davidson and Sternberg, 2003; Cushen and Wiley, 2012; Weisberg, 2014). This feeling of sudden comprehension often alerts the problem solver to a potentially correct solution (Irvine, 2015). However, as noted as early as Poincaré (1913), feelings of Aha may also accompany ideas that turn out to be incorrect. Recent investigations into the relationship between Aha and accuracy indicate that the Aha experience predicts accuracy (Salvi et al., 2012, 2016); however, these investigations focus on non-classic insight problems (i.e., problems such as Compound Remote Associates, Rebus Puzzles, and Anagrams, as opposed to the classic riddle-like problems favored by Gestalt psychologists that typically populate the insight problem solving literature). In this paper, we compare newer "non-classic" insight problems with the classic problems and non-insight problems and explore the accuracy-Aha relationship across problem types.

Definitions of insight vary on three dimensions: process, task, and phenomenology (Öllinger and Knoblich, 2009). Process concerns the cognitive mechanisms through which insightful solutions are generated. Descriptions of an insightful problem solving process emphasize a sudden certainty of a correct response, with little or no conscious access to the processing of the solution, whereas an analytic process emphasizes a deliberate and systematic evaluation of the problem, emphasizing logical deduction and strategic thinking (Kounios et al., 2008; Topolinski and Reber, 2010).

The task dimension concerns the identification of tasks that elicit sudden (insight) solutions; these tasks are often identified through comparison to other tasks that require stepwise solutions (Öllinger and Knoblich, 2009). The concept of a problem space in which all possible problem states are mapped provides a useful tool for differentiating problem types. A regular problem has a well-defined problem space with operators that are obvious enough to enable steady stepwise progress toward the solution (DeYoung et al., 2008). In contrast, an ill-defined problem does not allow a clear mapping of the initial problem space, and the method of achieving the solution is unclear. Indeed, an ill-defined problem often deliberately extends the problem space by misdirecting the solver (Ovington, 2016). For these problems, insight is often described as a restructuring of the ill-defined problem space, which occurs after a period of impasse. The sudden narrowing of the problem space enables an easier generation of a solution. Classic insight tasks are typical of ill-defined problems (DeYoung et al., 2008).

The phenomenology of insight focuses on the Aha experience and is typically examined using case studies and anecdotes; however, there has been a recent push to explore the phenomenology using self-report in laboratory studies (e.g., Bowden et al., 2005; Danek et al., 2014b). The use of the term insight is inclusive of the Aha experience as well as other insight-related affect, such as confidence, impasse, surprise, and pleasure, proposed to accompany an Aha experience (Danek et al., 2014a).

# Insight Problems and Problems Thereof

The methodological challenges of objectively measuring a subjective phenomenon such as insight are well known (Öllinger and Knoblich, 2009; Ash et al., 2009). Historically, researchers have used "classic insight problems," ill-defined problems originally used by Gestalt psychologists to elicit feelings of insight upon realization of the solution. Gestalt psychologists investigated insight as the result of perceptual and cognitive restructuring (Klein and Jarosz, 2011). This is of note predominantly because the Gestaltists had backgrounds in visual science and were particularly invested in perceptual restructuring, which is a sudden change in the way in which an object is perceived (say in Rubin's Face/Vase illusion, where the perception shifts from figure/face to ground/vase). Similarly, much insight research revolves around cognitive restructuring, a sudden change in the way a problem is perceived. Restructuring the problem makes the correct solution easy to obtain. This sudden ease of solution results in the feelings of pleasure, joy, and the rise of confidence associated with insight.

An example of a classic insight problem follows:

Water lilies double in area every 24 h. At the beginning of summer there is one water lily on the lake. It takes 60 days for the lake to become completely covered with water lilies. On which day is the lake half covered? (Sternberg and Davidson, 1982).

The answer (59) may or may not be immediately apparent: problem solvers frequently fixate on what they perceive as the key information of "60 days," and "half full" and either begin calculating the answer from day 1 or conclude that the answer is 30 (Bowden, 1997). For others, it becomes immediately apparent that, if water lilies double in area every 24 h and that if the lake is full on day 60, it must be half full on day 59 (see Appendix for other example problems). This question functions as an insight problem only if the solver initially misconstrues the problem space (focusing on the information of "60" and "half-full"). Sudden realization of the solution is accompanied by a feeling of certainty, as the answer is simple to check. For a comprehensive review of this style of research, see Sternberg and Davidson (1995).
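The reasoning behind the answer can be verified by working backward from the day on which the lake is fully covered. This is a minimal sketch; the function name is hypothetical, and coverage is expressed as a fraction of the lake.

```python
from fractions import Fraction

def covered_fraction(day, full_day=60):
    """Fraction of the lake covered on `day`, given daily doubling and full cover on `full_day`."""
    return Fraction(1, 2 ** (full_day - day))

half_day = next(d for d in range(1, 61) if covered_fraction(d) == Fraction(1, 2))

print(half_day)                     # 59
print(covered_fraction(30) < Fraction(1, 1_000_000_000))  # True
```

On day 30, the intuitive answer, less than a billionth of the lake is covered, which shows how far the "half of 60" response is from the correct solution.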

Non-insight problems have been used as a control for insight problems, particularly when contrasting individual differences in problem solving (e.g., Fleck, 2008; Gilhooly and Fioratou, 2009; DeCaro et al., 2015). Non-insight problems are designed to be solvable through a process of systematic application of knowledge and logical deduction (Bowden, 1997, p. 548). For example:

If you have four coins, two slightly heavy and two slightly light, but which look and feel identical, how could you find out which are which in two weighings on a balance scale? (Schooler et al., 1993).

The answer to this question requires systematic consideration of the problem space and of the potential steps toward solving the problem. The solution: (1) place one coin on either side of the balance (if they do not balance, you have identified one heavy and one light coin), (2) replace one coin with one of the remaining coins. This weighing will provide the remaining information.
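That two weighings are enough can be checked by enumerating all six ways in which two of the four coins can be the heavy ones. The sketch below is one reading of the solution above: the coins are labeled A to D (hypothetical labels), the first weighing compares A with B, and the second replaces B with C.

```python
from itertools import combinations

COINS = "ABCD"

def weigh(left, right, heavy):
    """Balance outcome for two single coins: '>' if left is heavier, '<' if lighter, '=' if equal."""
    left_heavy, right_heavy = left in heavy, right in heavy
    if left_heavy == right_heavy:
        return "="
    return ">" if left_heavy else "<"

def outcomes(heavy):
    """First weighing: A vs. B. Second weighing: B is replaced by C, so A vs. C."""
    return weigh("A", "B", heavy), weigh("A", "C", heavy)

# Map every possible pair of outcomes to the set of heavy coins that produces it.
table = {outcomes(set(h)): set(h) for h in combinations(COINS, 2)}

print(len(table))  # 6
```

Because the six possible assignments produce six different outcome pairs, the two weighings always identify which coins are heavy and which are light.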

This problem and problems like it tend to be categorized as non-insight problems (Gilhooly and Murphy, 2005). However, for someone with little or no training in logic, the underlying mechanisms may involve the experience of insight. The Aha experience may arise from recognizing that one cannot complete this task in two weighings if one attempts to weigh all four coins at once. From a phenomenological perspective, for experienced puzzle solvers, it is possible that neither or possibly both the heavy/light coin problem and the problem of the lilies raised above may be considered insight problems (Bowden, 1997). In other words, classic insight and non-insight problems alike can be solved with both insight and analysis (Weisberg, 2014). In the absence of feedback from the problem solver or other kinds of compelling evidence, the previously held a priori assumption that particular problems elicit insight, and are solved using particular processes (i.e., insightful or analytic), is highly problematic. Though classification and use of non-insight problems stems largely from the seminal papers of Metcalfe and Wiebe (1987) and Weisberg (1995), these authors noted the vagueness of the distinction between insight and non-insight problems, and there has been little systematic investigation of insight and non-insight problem types since the publication of those papers.

While research historically has contrasted classic insight problems with non-insight problems, more contemporary research uses a single problem task as indicative of both insightful and non-insightful problem solving, relying on the participant's self-report to determine whether or not insight has occurred (e.g., Bowden, 1997; Bowden and Jung-Beeman, 2003a; MacGregor and Cunningham, 2008). Non-classic problems, such as Compound Remote Associates (Bowden and Jung-Beeman, 2003b), rebus puzzles (MacGregor and Cunningham, 2008), or anagrams (Novick and Sherman, 2003; Salvi et al., 2016), are presented to participants, who provide a solution and also information about their experience of insight. This shift to asking whether or not insight was experienced in these non-classical problems was sparked by Bowden and Jung-Beeman (2003a) and Jung-Beeman et al. (2004), and has been followed by a line of research largely centered around Compound Remote Associates (CRAs; Bowden and Jung-Beeman, 2003b).

The taxonomy of these problem tasks as pure insight, non-insight, or as both insight and non-insight (hybrid) has been debated for decades (see particularly Metcalfe and Wiebe, 1987; Weisberg, 1995 for contrasting viewpoints). Yet there is little recent evidence regarding the efficacy of classic (or proposed pure) problems to elicit insight, or lack thereof, consistently. There is also little investigation regarding the effect of accuracy on insight.

# Accuracy and Aha

Salvi et al. (2016) investigated four types of non-classic insight problems (as classified by Cunningham et al., 2009): CRAs, Rebus Puzzles, Anagrams, and Visual Puzzles and found that insightful processes elicited a higher proportion of correct responses. The authors had solvers use a dichotomous indication of whether or not the problem had been solved insightfully or analytically. Danek et al. (2016) used a similar dichotomous measure, investigating three widely used classic insight problems, and found that, across all problem types, participants reported experiencing insight in only 51.9% of correctly solved trials. Three questions arise from these studies: (1) are there differences between classic and non-classic problem types in the degree of insight elicited in the solution of each task, (2) are there differences between so-called insight and non-insight problems, and (3) what influence does solution accuracy have on the experience of insight?

Methods of insight self-report have varied between: dichotomous (insight/analytic) responses (Danek et al., 2016; Salvi et al., 2016), Likert scales (Bowden and Jung-Beeman, 2003a), and rating scales (0–100; Danek et al., 2014a), but have typically been analyzed as reports of insight (or insightful processing) vs. non-insight (or analytical processing). The strength of an insight response and its relationship with other significant components of insight phenomenology (such as a feeling of Impasse, and Confidence) are yet to be examined in depth (Danek et al., 2014a, 2016).

Given the differences in the purported solution methods, ill-defined problems such as classic insight problems and CRAs may elicit greater amounts of insight compared to well-defined problems such as non-insight problems. Nonetheless, non-insight problems may also evoke insight in their solution, though this may only be evident when insight is measured on a continuous scale. Given the clarity of the problem space in non-insight problems, performance accuracy may be higher when problem solving is time-constrained in non-insight problems compared to insight problems or CRAs.

# The Present Research

The current study investigated ratings of insight (the Aha experience and other insight-related affect: i.e., confidence, surprise, impasse, and pleasure; Danek et al., 2014a) and performance accuracy in order to examine the relationship between accuracy and Aha across problem types (insight problems, non-insight problems, and CRAs). We were interested in the differences in performance accuracy and Aha ratings across problem types, and predicted that there would be (1) higher accuracy rates on non-insight problems compared to insight problems and CRAs, and (2) significantly higher ratings of Aha in insight problems and CRAs compared to non-insight problems. We predicted that feelings of Aha would be predictive of correct solutions in classic insight problems and CRAs but that ratings of Aha would not be predictive of correct solutions in non-insight problems.

# EXPERIMENT 1

# Methods

## Participants

University of Melbourne students (193: 118 female, age range, 17–52, mean, 19.639) completed the study for course credit. Before beginning the study, participants were provided with consent forms detailing the proposed study. Nine participants were removed for errors on more than 20% of the tasks.

# Materials

### **Problem solving tasks**

Insight in problem solving was measured with a mixture of "classic" insight and non-insight problems, and compound remote associates (CRAs; Bowden and Jung-Beeman, 2003b).

#### **"Classic" insight and non-insight problems**

Riddle tasks and brain teasers were drawn from the existing insight problem solving literature (e.g., Schooler et al., 1993; Gilhooly and Murphy, 2005; Karimi et al., 2007), and categorised as insight and non-insight problems based on their classification in previous studies (see Appendix 1 for problems). Participants were given 4 min per problem to generate solutions. Accuracy and RT were recorded.

### **Compound remote associates**

CRAs (Bowden and Jung-Beeman, 2003b) are verbal association tasks patterned after the Remote Associates Test (RAT: Mednick, 1962). Three words are presented to the participant, each of which can be combined with a fourth word to make compound words (e.g., potato/tooth/heart can all be combined with sweet). CRAs were developed in response to criticisms of classic insight problems, particularly the limited number of problems and the need for different types of problems (incrementally solvable, "non-insight problems") used as a control. Participants had 30 s to generate the fourth word.

#### Procedure

Each participant was individually tested: problem sets and questionnaires were presented in random order. No solutions were given.

#### **Problem solving sets**

There were two problem-solving sets: classic and non-classic problem solving, respectively. The classic "insight" and "incremental" (non-insight) problems were randomly interleaved within a set. Participants were given no information about whether the problem to be solved was classified as "insight" or "non-insight" but were given 4 min to work through the problem. In the CRA problem set, 20 problems, selected for varying difficulty levels (Bowden et al., 2005), were presented in random order. Five practice trials preceded the set. Participants were given 30 s to solve the word association task.

Before the problem solving set, participants were given the following information (drawn from Danek et al., 2014b):

We would also like to know whether you experienced a feeling of insight when you solve each task: A feeling of insight is a kind of "Aha!" characterized by suddenness and obviousness (and often relief!)—like a revelation. You are relatively confident that your solution is correct without having to check it. In contrast, you experienced no Aha! if the solution occurs to you slowly and stepwise. As an example, imagine a light bulb that is switched on all at once in contrast to slowly dimming it up. We ask for your subjective rating whether it felt like an Aha! experience or not, there is no right or wrong answer. Just follow your intuition.

After each problem solving task, participants rated five feelings during the problem solving task: (1) Confidence that the given response was correct ("very unsure" to "very sure"), (2) Strength of the insight experience ("very weak" to "very strong"), (3) Pleasantness of the insight experience ("very unpleasant" to "very pleasant"), (4) Surprising nature of the insight experience ("not surprising at all" to "very surprising"), (5) Feeling of impasse before the insight experience ("no impasse at all" to "very stuck")<sup>1</sup>. Participants responded by moving a slider (pre-set at 50) along a scale of 0–100.

#### **Questionnaires**

A series of individual differences measures were presented in random order. These included the O-LIFE (Oxford-Liverpool Inventory of Feelings and Experiences; Mason and Claridge, 2006), Raven's (1985) Advanced Progressive Matrices, a verbal fluency measure adapted from Lezak (2004), and an adaptation of the Alternative Uses Task (AUT: Guildford et al., 1978). These measures are reported elsewhere in a follow-up study of the same sample.

# RESULTS AND DISCUSSION

# Descriptive Statistics

Problems were scored as either correct or incorrect and averaged across category (insight, non-insight, CRAs), as were the ratings of insight related affect. Descriptive statistics of performance accuracy, and ratings of insight-related affect are displayed in **Table 1**.

<sup>1</sup>Scales 2:5 are drawn from the methodology of Danek et al. (2014b).

TABLE 1 | Means and standard deviations for accuracy and insight quale for classic insight and non-insight problems, and for compound remote associates (CRAs).
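The scoring step described above (binary accuracy and 0–100 ratings averaged within each problem category) can be sketched as follows. This is an illustrative sketch only: the function name, the tuple layout, and the example trials are assumptions, not the authors' analysis code or data.

```python
from collections import defaultdict
from statistics import mean


def category_means(trials):
    """Average accuracy and Aha rating per problem category for one participant.

    trials: iterable of (category, correct, aha) tuples, where `correct` is a
    bool and `aha` is a slider rating on the 0-100 scale (hypothetical layout).
    """
    grouped = defaultdict(list)
    for category, correct, aha in trials:
        grouped[category].append((int(correct), aha))
    return {
        category: {
            "accuracy": mean(c for c, _ in rows),
            "aha": mean(a for _, a in rows),
        }
        for category, rows in grouped.items()
    }


# Hypothetical trials for a single participant (not data from the study).
example_trials = [
    ("insight", True, 80),
    ("insight", False, 20),
    ("non-insight", True, 40),
    ("cra", False, 10),
]
example_summary = category_means(example_trials)
```

Each participant then contributes one accuracy score and one mean rating per category, which is the unit of analysis the repeated-measures comparisons appear to use.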

We include in our results the Bayes Factor (BF<sub>10</sub>), which is the ratio of model evidence for the alternative hypothesis (i.e., that there is an effect) to that for the null hypothesis<sup>2</sup>. This enables us to provide a more nuanced picture of the data in relation to the question addressed by the experiment than a standard p-value (Wagenmakers et al., 2016).

#### Accuracy and Insight: Differences across Problem Types?

The first question was whether the accuracy of insight and non-insight problem solving differed across problem types (classic-insight, classic-non-insight and non-classic insight). A Bayesian repeated-measures ANOVA indicated strong evidence for a difference in accuracy of response between problem types, BF<sub>10</sub> > 150, η<sup>2</sup> = 0.22. The effect was largely explained by low accuracy on CRAs compared to insight (mean difference = 0.14, BF<sub>10</sub> > 150) and non-insight (mean difference = 0.17, BF<sub>10</sub> > 150) problems (see **Figure 1A**). There was no difference between insight and non-insight problems (mean difference = 0.035, SE = 0.032, p = 0.81). The low accuracy on CRAs is congruent with Bowden and Jung-Beeman's (2003b) normative data, from which the problems were drawn.

<sup>2</sup>Bayesian tests were computed using JASP (Love et al., 2015). The null hypothesis indicates that the effect size equals zero; the alternative hypothesis is that the effect size is not equal to 0 and is assigned a Cauchy prior (Rouder et al., 2009). Any Bayes factor less than 1 (reported BF < 1) indicates support for the null hypothesis (Kass and Raftery, 1995). Any BF > 150 indicates very strong support for the alternative hypothesis. 1 < BF < 3 provides weak evidence for the alternative hypothesis ("barely worth a mention," Kass and Raftery, 1995, see also Jeffreys, 1961), 3 < BF < 20 is considered positive evidence for the alternative hypothesis, and 20 < BF < 150 is considered strong support for the alternative hypothesis.
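The evidence categories in footnote 2 can be written as a small lookup, which may help when reading the Bayes factors reported below. This is a sketch under stated assumptions: the function name is hypothetical, the middle band is read as 3 < BF < 20, and the handling of values exactly at a boundary (1, 3, 20, 150) is an arbitrary choice because the footnote does not specify it.

```python
def describe_bf10(bf10):
    """Map a Bayes factor (BF10) to the verbal categories used in footnote 2.

    Boundary values are assigned to the lower band except at 150; this is an
    assumption, since the footnote only gives open intervals.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf10 < 1:
        return "support for the null hypothesis"
    if bf10 < 3:
        return "weak evidence for the alternative hypothesis"
    if bf10 < 20:
        return "positive evidence for the alternative hypothesis"
    if bf10 <= 150:
        return "strong support for the alternative hypothesis"
    return "very strong support for the alternative hypothesis"
```

For example, a value of 1.562 falls in the weak band, whereas a value of 0.279 counts as support for the null hypothesis.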

A second Bayesian repeated-measures ANOVA (see **Figure 1B**) indicated strong evidence for difference between problem types on the level of reported insight, BF<sub>10</sub> > 150. Post-hoc analyses indicated that the effect was driven by higher reported insight in insight problems compared to both non-insight problems (mean difference = 10.15, BF<sub>10</sub> > 150) and CRAs (mean difference = 14.49, BF<sub>10</sub> > 150). There was no significant difference between CRAs and non-insight problems (mean difference = 4.34, BF<sub>10</sub> = 1.562). The difference between insight and non-insight problems is congruent with the literature indicating that these are solved with different underlying processes (Gilhooly and Murphy, 2005; Chu and MacGregor, 2011). The difference in ratings of insight between insight problems and CRAs speaks against the use of CRAs as insight problems; however, this may simply indicate that CRAs are a hybrid insight problem (i.e., CRAs may be used as both insight and non-insight problems, depending on self-reported classification, e.g., Bowden and Jung-Beeman, 2003a; Salvi et al., 2012). It may also reflect the reduced solution accuracy.

# ACCURACY AND INSIGHT AFFECT: RELATIONSHIPS

Pearson correlations suggest that accuracy is related to degree of reported Aha for insight problems and CRAs but not for non-insight problems (see **Figure 2**). There were moderately strong, significant and positive relationships between feelings of Aha and accuracy on both classic insight problems (r = 0.50, BF<sub>10</sub> > 150) and CRAs (r = 0.41, BF<sub>10</sub> > 150); however, there was no relationship between accuracy and Aha on non-insight problems (r = 0.02, BF<sub>10</sub> < 1). This relationship supports the current assumptions within insight problem-solving literature that insight problems result in feelings of insight in their accurate solution but that non-insight problems do not (Gilhooly and Murphy, 2005; Chu and MacGregor, 2011). The relationship between accuracy and Aha on CRAs suggests that the difference between insight problems and CRAs was indicative of a lack of accuracy, rather than the use of CRAs as hybrid problems.
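For readers who want the statistic spelled out, the Pearson coefficient between per-participant accuracy and mean Aha rating can be computed as in the sketch below. The function name is hypothetical and the code makes no claim about how the reported values were computed (the Bayesian tests were run in JASP).

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two sequences of equal length (at least 2).")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("Correlation is undefined when a variable is constant.")
    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold each participant's proportion correct within one problem type and `ys` the same participant's mean Aha rating for that type.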

Across problem types, Aha was significantly and positively related to Confidence, Pleasure, and Surprise (see **Figure 2**, see Supplementary Materials for correlation matrices). Confidence was the most strongly related with Aha ratings across problem types, having a moderate to strong positive relationship with Aha in insight and non-insight problems, and in CRAs. The relationship between Aha and Impasse was negative and significant for insight problems, and negative but non-significant for both non-insight problems and CRAs. Interestingly, ratings of Surprise were significantly and positively related with ratings of Aha, but not with solution accuracy. This suggests the importance of surprise in an Aha experience.

### Summary

The results of Experiment 1 support the assumptions in the literature: Aha occurred more often in insight than non-insight problems. The moderate positive relationship between Aha and performance accuracy on both classic insight problems and CRAs indicated that performance accuracy was an important component of insight affect in problem solving. This may be indicative of the sudden ease of solution once the problem space has been restructured and the solution is easy to realize.

The positive relationship between Surprise and Aha indicated that Surprise may be an important component of the Aha experience, more than the previously considered Impasse.

Low levels of accuracy potentially indicate that students with English as a second language (ESL) may have experienced more difficulty on some of the problems, as these problems require high levels of English proficiency (Ansburg, 2000).

## EXPERIMENT 1A

We sought in Experiment 1a to replicate our Experiment 1 results using a sample that was explicitly selected with English as a first language.

# Methods

#### Participants

Undergraduates from the University of Melbourne (82: 64 female, age range, 16–47, mean, 19.60) completed the study for course credit. Eight participants were removed for errors on more than 20% of the tasks.

[Figure caption fragment] … remote associates. Only relationships with p < 0.05 have been graphed.

#### Materials, Procedure, and Design

The materials and procedure were identical to Experiment 1, save that participants were tested online, and ESL students were requested not to participate in the study.

# Results and Discussion

The results of Experiment 1a replicated the very strong support for differences between problem types on accuracy found in Experiment 1, F(2, 158) = 33.98, BF<sub>10</sub> > 150, η<sup>2</sup> = 0.30, with post-hoc analyses indicating significant differences between all problem types: Non-insight problems demonstrated significantly higher accuracy than both insight problems (mean difference = 0.15, BF<sub>10</sub> > 150) and CRAs (mean difference = 0.23, BF<sub>10</sub> > 150), and insight problems demonstrated higher accuracy than CRAs (mean difference = 0.08, BF<sub>10</sub> = 2.387). This marks a change from Experiment 1, in which the low accuracy on CRAs alone drove the observed difference. This change in results may arise from the filtering of ESL students.

The ANOVA conducted on the elicited Aha across problem type demonstrated strong support for differences between problem type, F(2, 158) = 33.98, BF<sub>10</sub> = 29.90, η<sup>2</sup> = 0.084, with insight problems eliciting significantly higher ratings of insight than non-insight problems (mean difference = 6.130, BF<sub>10</sub> = 75.662). As in Experiment 1, CRAs and non-insight problems demonstrated an anecdotal difference in reported insight (mean difference = 2.105, BF<sub>10</sub> = 1.592); however, in another marked difference from Experiment 1, there was no difference between insight problems and CRAs (mean difference = 5.025, BF<sub>10</sub> = 0.279). This fluctuation in results may be indicative of the "hybrid" nature of CRAs (i.e., as both an insight and non-insight problem), an indication of the filter of ESL students, or an indication of greater accuracy eliciting greater insight.

# Investigating the Relationship between Accuracy and Aha

The moderate positive relationship between accuracy and insight affect was replicated in insight problems (r = 0.40, BF<sub>10</sub> > 150) and CRAs (r = 0.40, BF<sub>10</sub> = 85.65), see **Figure 3**; however, in non-insight problems, the relationship between accuracy and insight shifted from no relationship to a positive relationship, albeit a weak one (r = 0.24, p = 0.04, BF<sub>10</sub> = 48.502). This marks a change from the current literature, in which non-insight problems are used as controls. However, it is congruent with statements from Weisberg (2014) indicating that both insight and non-insight problems can be solved through insightful or analytic processes.

# Investigating Relationships within Insight Affect

The direction of the relationships between accuracy and insight related affect was replicated, as was the direction of the relationships within insight related affect (i.e., Aha, Confidence, Impasse, Pleasure, and Surprise). The relationship of Surprise with performance accuracy and Aha ratings was again interesting in this dataset: There was a negative relationship between Surprise and performance accuracy, yet a positive weak-to-moderate relationship with Aha. Surprise may be the component of insight related affect that is able to differentiate an Aha experience from the pleasure and confidence of a solution.

# Summary

Investigation of differences in Experiment 1a replicated the findings in Experiment 1; that is, there is a significant difference in performance accuracy and reported insight across problem types. However, the relationship between accuracy and Aha ratings reflects the growing indication that problems can be solved with and without feelings of insight.

# EXPERIMENT 2

In this study, we investigated the consistency of the relationship between reported insight and problem type by replicating the results from Experiments 1 and 1a. We also investigated the effect of feedback on reported insight; this data is not presented here as we focus instead on the relationship between performance accuracy and reported insight. We replicated the analyses of Experiment 1a, and extended these analyses by combining the datasets of Experiments 1, 1a, and 2 to run a Multilevel Logistic Regression.

# Methods

#### Participants

Undergraduates from the University of Melbourne (129: 88 female; age range, 17–45, mean, 19.059) completed the tasks for course credit. Twelve participants were removed for errors in more than 20% of the tasks.

#### Materials, Procedure, and Design

The methods were the same as in Experiment 1a, but feedback was given regarding the correctness of the solution (this data is investigated in forthcoming papers). The affect-related questions were asked both before and after the solution feedback was given. In the current analysis, only the data from before accuracy feedback was used.

## Results and Discussion

#### Differences in Accuracy and Aha

As in the first two experiments, there was strong support for the effect of problem type on solution accuracy, F(2, 224) = 7.964, BF<sub>10</sub> = 47.61, η<sup>2</sup> = 0.066, with significantly higher accuracy on non-insight problems compared to classic insight problems (mean difference = 11.02, BF<sub>10</sub> = 36.184) and CRAs (mean difference = 8.94, BF<sub>10</sub> = 15.040). There was no significant difference between insight problems and CRAs (mean difference = 0.02, BF<sub>10</sub> = 0.133). This marks another change from both previous experiments: The accuracy across problem type is not consistent, but seems to follow a similar trend, with higher accuracy on non-insight problems, lower accuracy on insight problems.

As in the first two experiments, there was a significant difference in the reported insight in response to the problems, F(2, 205) = 5.370, BF<sub>10</sub> = 4.389. As in Experiment 1a, this main effect was explained by the higher feelings of insight in response to classic insight questions, compared to classic non-insight questions (mean difference = 6.13, BF<sub>10</sub> = 4.679, see **Figure 1B**). Similarly replicating Experiment 1a, there was no significant difference in reported insight between CRAs and insight problems (mean difference = 4.025, BF<sub>10</sub> = 1.572). There was also no significant difference between non-insight problems and CRAs (mean difference = 2.105, BF<sub>10</sub> = 0.199). This may again reflect the use of CRAs as hybrid problems. However, this inconsistency in differences in reported insight across problem types, even keeping the problems constant, flags potential problems in the use of non-insight problems as controls in insight problem studies, particularly without self-reported measures of insight.

# CORRELATIONAL ANALYSES

As in Experiment 1a, feelings of Aha were significantly, and positively, correlated with accuracy across all problem types (insight problems: r = 0.495, BF<sub>10</sub> > 150; CRAs: r = 0.39, BF<sub>10</sub> > 150; non-insight problems: r = 0.19, BF<sub>10</sub> = 40.007; see **Figure 4** for graphical representation, or Supplementary Materials for correlation matrices). This replication of the relationship between accuracy and Aha across all problem types emphasizes the requirement of self-report before use of non-insight problems as controls for insight problems.

The direction and strength of the relationships held for Experiment 2 within insight related affect. This consistency of a significant positive relationship with Surprise and Aha, compared to either a significant negative or non-significant relationship between Surprise and solution accuracy again indicates the importance of Surprise in the Aha experience.

### MULTILEVEL ANALYSIS

We sought to determine how well accuracy could be predicted from the subjective feeling of insight along with the other measures recorded in our study: Impasse, Pleasure, Surprise, and Problem Type. Due to high collinearity between ratings of Confidence and Aha, we removed Confidence from the analysis. We used a multilevel logistic regression in order to account for different overall levels of accuracy for each subject (i.e., by including different subject level intercepts) and different levels of accuracy across problem types. We modeled the binary-valued accuracy as a logistic function of these variables. Data from native-English speaking participants from across Experiments 1, 1a, and 2, were combined for this analysis. (One question, the Trace non-insight problem was removed from Experiments 1a and 2 and is not analyzed here).

We compared a number of different multilevel models: The first model included the rated feelings of: Insight, Impasse, Pleasure and Surprise, as well as problem type and is given by the equation:

$$\begin{aligned} y_{ij} &= \beta_0 + \beta_1 \text{Insight}_{ij} + \beta_2 \text{Impasse}_{ij} + \beta_3 \text{Pleasure}_{ij} \\ &\quad + \beta_4 \text{Surprise}_{ij} + \beta_5 \text{Type}_{ij} + \left(S_i + \varepsilon_{ij}\right) \end{aligned} \tag{1}$$

where y<sub>ij</sub> is the binary response accuracy indicating whether participant i makes a correct (1) or incorrect (0) response on item j. Each term in the model represents participant i's ratings on that trait for item j. Each model also includes a set of subject-specific random effects, S<sub>i</sub>, and an error term, ε<sub>ij</sub>.
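Because accuracy is modeled "as a logistic function" of the predictors, the linear combination on the right-hand side of Equation (1) is most naturally read as a log-odds that is passed through the logistic function to give the probability of a correct response. The sketch below illustrates that reading only; the function name, the coefficient values, and the treatment of the random intercept as a plain additive offset are assumptions, and none of the numbers are estimates from the study.

```python
import math


def p_correct(predictors, weights, intercept, subject_effect=0.0):
    """Probability of a correct response under a logistic link.

    predictors: dict of ratings (and dummy-coded problem type) for one trial.
    weights:    dict of hypothetical coefficients keyed by the same names.
    subject_effect: participant-specific intercept shift (the S_i term).
    """
    eta = intercept + subject_effect + sum(
        weights[name] * predictors[name] for name in weights
    )
    # Numerically stable logistic function.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    exp_eta = math.exp(eta)
    return exp_eta / (1.0 + exp_eta)


# Hypothetical coefficient: a positive weight on the Aha rating.
example_weights = {"insight": 0.02}
high_aha = p_correct({"insight": 80}, example_weights, intercept=-1.0)
low_aha = p_correct({"insight": 20}, example_weights, intercept=-1.0)
```

With a positive weight on the insight rating, a trial rated 80 receives a higher predicted probability of being correct than a trial rated 20, which is the direction of effect the correlational analyses suggest.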

We additionally fit a second model, which allowed for an interaction between insight and problem type. This model is based on the grounds that classic insight problems and CRAs are proposed to elicit greater amounts of Aha than non-insight problems:

$$\begin{aligned} y_{ij} &= \beta_0 + \beta_1 \text{Insight}_{ij} + \beta_2 \text{Impasse}_{ij} + \beta_3 \text{Pleasure}_{ij} \\ &\quad + \beta_4 \text{Surprise}_{ij} + \beta_5 \text{Type}_{ij} + \beta_6 \text{Insight}_{ij} \times \text{Type}_{ij} + \left(S_i + \varepsilon_{ij}\right) \end{aligned} \tag{2}$$

For both models, we compared accuracy across insight, CRAs, and non-insight problems through the inclusion of a categorical Type variable. This allowed us to use non-insight questions as a baseline and extract separate weights for insight problems and CRAs. Additionally, for both models, we systematically tested alternative random effects by allowing intercept to vary by participant (Models 1 and 3), by allowing intercept and problem type to vary by subject (Model 2), and by allowing intercept and the insight by problem type interaction to vary by subject (Model 4). These comparisons allow for (a) different overall performance between participants, (b) different performance on each type of problem, and (c) different levels of insight on each problem type to be expressed between participants. We determined the preferred model using the Bayesian Information Criterion (BIC). The results are presented in **Table 2**.
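Model selection by BIC can be sketched with the standard definition, BIC = k ln(n) − 2 ln(L), where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood; the model with the lowest BIC is preferred. The function names and the example values below are hypothetical and are not taken from Table 2.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def preferred_model(models, n_obs):
    """Return the name of the model with the lowest BIC.

    models: dict mapping a model name to (log_likelihood, n_params).
    """
    return min(models, key=lambda name: bic(*models[name], n_obs))


# Hypothetical fits: the richer model fits slightly better but is penalized
# for its additional parameters.
example_models = {"simple": (-100.0, 5), "rich": (-99.0, 9)}
example_choice = preferred_model(example_models, n_obs=1000)
```

The penalty term k ln(n) means that additional random effects are only favored when they improve the likelihood enough to offset the added complexity.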

Comparison of the BICs pointed to Model 4, which included the interaction between Insight and Problem Type both as a fixed and random effect, as the preferred model. As in all models, as might be expected, insight had a positive effect on accuracy, but the experience of impasse decreased accuracy. Further contrasting effects were found for pleasure, which increased accuracy, and surprise, which decreased accuracy. Accuracy was poorer on insight problems than on CRAs or non-insight problems, respectively. We also found higher interactions between insight and CRAs than between insight and insight problems.

# OVERALL SUMMARY

The three studies demonstrate a difference in problem solving and ability to elicit insight in insight and non-insight problems; however, the patterns elicited through correlation analyses indicate a relationship between performance accuracy and insight across problem types, when selected for English proficiency. Particularly, the consistent occurrence of a significant positive relationship between reported Aha and accuracy on non-insight problems is worthy of further investigation. The results of the multilevel regression indicate the importance of the problem type and components of insight (surprise, impasse, Aha) elicited in the accuracy of a problem solution. However, these results do not enable us to differentiate between a feeling of insight as co-occurring with a correct solution, and the process of arriving at a solution through insight problem solving vs. analytic problem solving.

All significant coefficients are shown in bold font. df, degrees of freedom; BIC, Bayesian Information Criterion.

# GENERAL DISCUSSION

In this work, we investigated the relationship between insight and accuracy across three different problem types, comparing classic "insight," "non-insight" and non-classic insight problems (CRAs). Insight related affect was predictive of correct solutions. We also reflected upon the comparison of insight and "non-insight" problems in the literature, finding that differences in reported insight between problem types make the distinctions in the literature seem valid; however, our secondary analysis revealed a consistent relationship between accuracy and insight ratings in non-insight problems, which emphasizes the issues in the comparison of insight and non-insight problems without self-report measures.

# ACCURACY AND INSIGHT

Salvi et al. (2012, 2016) reported that a solution accompanied by insight is more likely to be correct than a solution that is systematically and consciously deduced. That is, insightful problem solving is an all or none process, in which the problem solver arrives at a solution through processing which is subthreshold to awareness, and therefore unconscious. The implication is that a solution that has been obtained through an insightful process is not consciously accessible until the process of problem solving has been completed and therefore solutions are more likely to be either correct or omitted. Salvi et al.'s (2016) data contrast with Metcalfe's statement that feeling suddenly close to the solution often marks an incorrect solution (Metcalfe, 1986, p. 633). However, investigations of insight across experiments indicate that there was a greater proportion of problems correctly solved with insight than incorrectly solved with insight. The discrepancy may arise from the different self-report measures: Metcalfe used Feeling of Warmth (FOW) ratings, which were generated before the problem was solved and investigated pattern ratings; Salvi et al. (2016) used participant indications of within-experiment defined insight that were given after the problem was solved (i.e., participants agreed that the solution was sudden, surprising, and felt like a small Aha moment).

We asked participants to rate the strength of insight related affect, and were so able to investigate the relationships between accuracy and insight components on a continuous measure (as used in Danek et al., 2014a), compared to the more common dichotomous measures of insight<sup>3</sup> . Our results are congruent with Salvi et al. (2016): a feeling of Aha is associated with accuracy. Our data could be interpreted in a similar manner to Salvi et al. (2016); that is, that an Aha experience is elicited during an insightful problem solving process. However, our use of a rating scale rather than a binary response enables us to investigate the strength of an Aha experience, which varies with a moderate relationship with accuracy.

At this point we must raise the possibility of a distinction between a sudden insight as a process as opposed to an affect. Whether post-hoc self-reports of Aha reflect insightful processing is unclear. Our data indicate that the feeling of insight varies in strength and that the strength of an Aha is related to Surprise more than accuracy. Our results indicate that there are many components to problem solving that is accompanied by an Aha.

<sup>3</sup>In the examination of the frequency of responses on the response scale, we do see modes at the extreme values of 0 and 100; however, there is also substantial data that ranges between these values. For instance, 35% of the responses were in the inner quartile range and 70% of responses did not equal the extreme values. We therefore determined that dichotomising the continuous data would discard relevant information. Furthermore, the proportion of inner quartile range responses makes it unclear where the cut-off should be between "no insight" and "insight." Consequently, we determined that splitting the scale at 50 would be problematic since this would be grouping quite substantial feelings of aha into the no insight category.

# METHODOLOGICAL IMPLICATIONS

The use of classic insight problems as pure, hybrid, or non-insight problems arises primarily from the papers of Weisberg (1995) and Metcalfe and Wiebe (1987); however, there has been little investigation regarding the efficacy of these problem types to elicit insight. We found significant differences in the efficiency of problem types in eliciting Aha experiences, in the expected direction: pure insight problems elicited the greatest degree of Aha, then hybrid problems (CRAs), and finally non-insight problems elicited the lowest ratings of Aha. This may be a reflection of how well-defined a problem is (DeYoung et al., 2008). An ill-defined problem (i.e., an insight problem) may be more likely to result in a feeling of surprise in the solution, which is in turn related to an Aha experience.

A shift to using insight problems as ill-defined problems may help avoid a number of the issues in the literature. DeYoung et al. (2008) do not require subjective feedback regarding the feeling of insight as they use insight problems as stimuli that require restructuring, thereby acknowledging and utilizing the potential problems of these stimuli. Our experiments therefore compare well-defined to ill-defined problems. Well-defined problems contain sufficient information in the question to allow steady progress toward a solution (DeYoung et al., 2008), while ill-defined problems have insufficient information to allow for incremental progress and typically require restructuring in how the problem is approached (as in an insight problem). Thus, insight problems are used by DeYoung and colleagues as a subordinate set of ill-defined problems.

Nevertheless, the comparison of problems which have been defined as non-insight or insight problems, or even well-/ill-defined problems, retains the problem of trying to verbally define the processes of interest, rather than relying on computational approaches to identify latent cognitive processes that underlie task performance (e.g., Hélie and Sun, 2010).

# INSIGHT IN PROBLEM SOLVING

The majority of research conducted into the efficacy of insight-eliciting problems has been conducted on CRAs (see, e.g., Jung-Beeman et al., 2004; Kounios et al., 2006; Sandkühler and Bhattacharya, 2011; Wegbreit et al., 2012; Salvi et al., 2016). Our data are congruent with the finding that CRAs are able to elicit feelings of insight but do not do so necessarily. Furthermore, our results indicate accuracy is a significant factor of the Aha response.

The current data support the use of successfully solved insight problems as measures of elicited insight, yet they also call for caution; the positive relationship between Aha and accuracy in insight problems is moderate, and by no means very strong. Our data also provide indications of insight problems solved without feelings of insight.

Despite their use as a control for insight problems in research (Murray and Byrne, 2001; Ash and Wiley, 2006; Fleck, 2008; Gilhooly et al., 2010; Wieth and Zacks, 2011; Wen et al., 2013; DeCaro et al., 2015), "non-insight" problems demonstrated a significant positive correlation when completed by students with high English proficiency. These results are comparable to those of Davidson (1995), who reported that 12–13% of non-insight problems indicated an insight pattern of FOW ratings. They are also comparable to Metcalfe (1986), who reported insight problems and anagrams showing both an insight and incremental pattern of analysis (Feeling of Knowing ratings), again indicating that problems can be solved both with feelings of insight, and by working through each step (Bowden, 1997; Weisberg, 2014).

The positive relationship between accuracy and Aha in insight and non-insight problems alike is congruent with the literature that indicates that problems can be solved with and without feelings of insight (Danek et al., 2016), and calls for the use of some form of self-report in all studies investigating insight affect and insight processes.

# CONCLUSION

The current study indicates that accuracy is often heralded by feelings of insight and insight-related affect (such as Confidence, and Pleasure). We have indicated that Surprise may be a significant indicator of Aha experiences, as it has a moderate to strong positive relationship to Aha experiences while only a weak relationship to solution accuracy.

Further, we have shown that both well-defined and ill-defined problems (or non-insight and insight problems, respectively) can be solved both with feelings of insight, and by consciously working through each step (Weisberg, 2014). Without participant-driven feedback regarding the feeling or occurrence of insight, the assumption that some problem types elicit insight, and are solved using particular processes (i.e., insightful or analytic), is highly problematic. While these data cannot tease apart whether feelings of insight in problem solving are indicative of a "special" and separate process, they do provide evidence for insight and insight-related quale in insight and non-insight problems.

# AUTHOR CONTRIBUTIONS

MW: Write up, running participants, analysis and design of experiment. DL: Data analysis (advice and running), write up. SC: Write up, experimental design.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01424


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Webb, Little and Cropper. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Intuitive Feelings of Warmth and Confidence in Insight and Noninsight Problem Solving of Magic Tricks

Mikael R. Hedne<sup>1</sup> \*, Elisabeth Norman<sup>1</sup> and Janet Metcalfe<sup>2</sup>

<sup>1</sup> Department of Psychosocial Science, Faculty of Psychology, University of Bergen, Bergen, Norway, <sup>2</sup> Department of Psychology, Columbia University, New York, NY, USA

The focus of the current study is on intuitive feelings of insight during problem solving and the extent to which such feelings are predictive of successful problem solving. We report the results from an experiment (N = 51) that applied a procedure where the to-be-solved problems were 32 short (15 s) video recordings of magic tricks. The procedure included metacognitive ratings similar to the "warmth ratings" previously used by Metcalfe and colleagues, as well as confidence ratings. At regular intervals during problem solving, participants indicated the perceived closeness to the correct solution. Participants also indicated directly whether each problem was solved by insight or not. Problems that people claimed were solved by insight were characterized by higher accuracy and higher confidence than noninsight solutions. There was no difference between the two types of solution in warmth ratings, however. Confidence ratings were more strongly associated with solution accuracy for noninsight than insight trials. Moreover, for insight trials the participants were more likely to repeat their incorrect solutions on a subsequent recognition test. The results have implications for understanding people's metacognitive awareness of the cognitive processes involved in problem solving. They also have general implications for our understanding of how intuition and insight are related.

#### Edited by:

Kirsten G. Volz, University of Tübingen, Germany

#### Reviewed by:

Joachim Funke, Heidelberg University, Germany; Thora Tenbrink, Bangor University, UK

> \*Correspondence: Mikael R. Hedne post@mikael-hedne.no

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 01 May 2016; Accepted: 17 August 2016; Published: 31 August 2016

#### Citation:

Hedne MR, Norman E and Metcalfe J (2016) Intuitive Feelings of Warmth and Confidence in Insight and Noninsight Problem Solving of Magic Tricks. Front. Psychol. 7:1314. doi: 10.3389/fpsyg.2016.01314

Keywords: intuition, insight, magic, aha! experience, problem solving, metacognitive feelings, warmth ratings, confidence ratings

# INTRODUCTION

Experiences of insight may occur in many different domains, both in cognitive activities like perception, language comprehension, and problem solving, and during moments of self-awareness in clinical psychological settings (Kounios and Beeman, 2014). The focus of the current paper is on insight experiences in a special kind of problem solving during which the individual is trying to figure out how a magic trick was done. Sometimes, as in other kinds of problem solving, such solutions are characterized by their sudden appearance, and by a special feeling state, often referred to as an Aha! experience (e.g., Topolinski and Reber, 2010; Salvi et al., 2016). In line with the focus of the research topic, we ask whether problem solving of magic tricks that occurs with or without the Aha! experience is differentially reflected in intuitive, metacognitive feelings during and after the solution attempt. This would in turn shed light on whether the two types of problem solving differ in the availability of relevant conscious knowledge.

The question of how intuition and insight are related follows from existing debates concerning the involvement of automatic/unconscious vs. controlled/conscious processes in insight problem solving. To illustrate this debate, take two models that both focus on the processes that lead up to the change in problem representation preceding insight. According to progress monitoring theory/satisfaction progress theory (MacGregor et al., 2001), problem solving involves the conscious, step-by-step monitoring of one's problem solving behavior. Two mechanisms are proposed for how this monitoring occurs. One is mental simulation, in which the problem solver tries to look ahead and predict the consequences of future moves. The other is evaluation of prospective moves against an internal criterion, which makes it possible to estimate the likelihood of success or failure. For both mechanisms, the emphasis is on conscious and intentional planning, monitoring, and evaluation. In contrast, according to representational change theory (Ohlsson, 1992; Knöblich et al., 1999), insight problem solving initially involves the construction of an erroneous problem space. Representational change can then occur through constraint relaxation, i.e., the release of unnecessarily constraining assumptions, or chunk decomposition, i.e., deconstruction of perceptual chunks into smaller features, which may in turn be recombined into more productive representations. According to this model, neither the erroneous problem representation nor the mechanisms that resolve it need to involve intentional, conscious deliberation. Instead, they are assumed to be characterized by automatic and unconscious processes. Other theories that focus on unconscious mechanisms in problem solving include those of Smith and Kounios (1996), and Topolinski and Reber (2010).
The latter theory focuses on the interplay between conscious and unconscious mechanisms in problem solving, and provides a framework for understanding how the phenomenology of insight can be understood as the conscious correlate of processing fluency caused by a sudden appearance of the solution. It should be added that one could also assume a continuum of understanding, from shallow to deep, in which intermediate levels of understanding are possible. It could also be that the extent to which a problem representation may be understood in this way would depend on the complexity of the problem.

Among researchers who acknowledge the role of unconscious processes in insight problem solving, there is disagreement over whether insight occurs through a sudden/discontinuous or gradual/continuous process. Theories that focus on the mechanisms involved in cognitive restructuring (e.g., Kounios and Beeman, 2014) would often imply that insight is a product of non-deliberate, unconscious processing that is independent of conscious, analytic thought (Smith and Kounios, 1996). An alternative is to regard insight as resulting from a gradual, more continuous process. The idea is that, over the course of the problem solving attempt, the problem representation changes from being unconscious/vague to becoming conscious/verbalisable. Importantly, this latter view does not imply any sudden, qualitative shift in information-processing (e.g., Bowers et al., 1990; Zander et al., 2015). Central to either view is that the subjective experience of insight would involve the activation of relevant unconscious/implicit knowledge. For example, Bowers et al. (1990) referred to an insight/hunch as involving a behavioral preference for a certain solution before this solution can be verbalized/justified. Similarly, Kounios and Beeman (2014) argued for the involvement of unconscious knowledge in insight problem solving by referring to findings demonstrating that subliminal priming may facilitate insight problem solving. A different hypothesis that seems compatible with a discontinuous view is the one by Topolinski and Reber (2010), who argued that the subjective experience of insight reflects increased perceptual fluency associated with the sudden activation of a solution. Thus, even though it is commonly agreed that insight would involve implicit/unconscious knowledge, there is disagreement about the processes by which such knowledge gives rise to the subjective experience of insight.

Furthermore, when people solve incrementally by satisficing, or getting to a "good enough" answer, the answer itself may be less stable than when they solve by insight. Novick and Sherman (2003) refer to insight solutions as "pop-out" solutions. By the Gestalt view of problem solving (see Kounios and Beeman, 2014), insight solutions have a crystallized quality, resulting from a restructuring of an unstable organization into a new stable structure. The stability of the solution, and both the correctness of this new structure and the individual's confidence in it and willingness to change it, will be of interest in the present research.

One way to get a better understanding of the relationship between intuition and insight is to measure intuitive, metacognitive feelings associated with insight vs. noninsight solutions to a set of problems, and to measure the relationship between such feelings and aspects of problem solving. Whereas the relationship between subjective feelings and unconscious knowledge has been extensively studied in relation to other forms of implicit cognition, including implicit learning (e.g., Dienes and Scott, 2005; Norman and Price, 2015), the question of how subjective feelings relate to objective performance at the different stages of insightful vs. noninsightful problem solving is still under-explored. A demonstration of whether and how unconscious knowledge is related to insight requires a clearer understanding of how subjective feelings relate to objective performance in problem-solving situations. The focus of the current paper is on how the two forms of problem solving differ in terms of the relation between intuitive feelings and objective performance during and after problem solving, which would provide an important contribution to the ongoing debate on the cognitive mechanisms underlying insightful problem solving.

Metcalfe and Wiebe (1987) studied the relation between prospective intuitive feelings and objective performance by asking participants to provide warmth ratings at regular intervals whilst the person was working on each problem. The question was whether warmth ratings would predict problem solving differently depending on whether the problems were multistep problems/puzzles (e.g., the Tower of Hanoi task), or vignette descriptions previously demonstrated to give rise to insight solutions (e.g., the "water lilies problem"). Metcalfe and Wiebe found that warmth ratings increased gradually before people produced the correct solutions to the first type of problem (referred to as "incremental" problems), but did not increase much before people gave the correct solutions to the latter kinds of problems (referred to as "insight" problems). The authors argued that the difference in phenomenology accompanying insight and incremental problem solving could be used to define insight.

However, a limitation of this and other classical paradigms for studying insight vs. noninsight problem solving relates to the fact that they make use of two different sets of tasks. When, in studies like that of Metcalfe and Wiebe (1987), participants are presented with 2 sets of different problems that are predefined to be associated with either insight or not, behavioral or self-reported differences between the two could also be attributed to factors other than those related to information-processing differences associated with the presence or absence of insight. For example, tasks could differ in terms of difficulty, motivation/engagement, the number of steps needed for solution, or involvement of prior knowledge (see also Bowden, 1997; Bowden et al., 2005; Kounios and Beeman, 2014, for similar arguments). In addition, it has recently been argued that the use of pre-defined insight problems may be problematic because correct solutions to these problems are not always characterized by Aha! experiences (Danek et al., 2016).

Danek et al. (2013, 2014a,b) developed a novel experimental paradigm to counter these limitations. Rather than presenting participants with different sets of problems that were predefined to be associated with insight or not, their experimental stimuli were a series of video recordings of magic tricks. Their assumption was that magic tricks can potentially be solved with or without insight. They argued that magic tricks can sometimes be solved with sudden insight that occurs as a result of constraint relaxation. However, they may also be solved in a step-by-step manner, in which the person systematically considers different possibilities (Danek et al., 2014a). The researchers therefore asked participants to report, for each suggested solution, whether or not the solution was associated with the experience of insight. As predicted, Danek et al. found that some solutions were associated with insight whereas others were not. Importantly, they also found that the two types of solution were associated with measurable differences on a number of dependent variables. Insight solutions were more likely to be accurate, occurred after fewer presentations, and were associated with higher levels of confidence than noninsight solutions. Furthermore, in a different paper reporting results from the same experiment (Danek et al., 2013), it was found that insight solutions were also remembered more accurately. Danek et al. interpreted these results as supporting the idea that problem solving characterized by insight is qualitatively different from problem solving without insight. It should be noted that such a procedure does not make claims about which problems are more likely to be solved with or without insight based on, e.g., assumptions about the necessary problem solving steps involved. Instead, the focus is on the subjective experience of insight/Aha!
In the remainder of the paper, we refer to problem solving characterized by this form of subjective experience as "insight problem solving."

Importantly, such a procedure makes it possible to explore the relationship between intuitive feelings (of, e.g., warmth and confidence) and objective indices of problem solving across the two types of solution, without the possible confounding influence of task differences. Thus, the procedure can be used to address whether the two forms of problem solving differ in terms of conscious availability of relevant knowledge. However, the specific procedure used by Danek et al. also had some limitations. First, their definition of insight specifically stated that it is characterized by high confidence. Participants were told that an Aha! experience would be characterized by feeling "relatively confident that your solution is correct" (p. 662). To circumvent the potential risk of demand characteristics, in the experiment that we present here, we took care not to include any information concerning confidence in the definition we gave participants about what comprised an insight solution. Moreover, in the earlier work of Danek et al., the measure of solution time could be criticized for low precision. Because their measure was the number of presentations (from 1 to 3) rather than absolute solution time in seconds, the true difference in solution time within a single category might be larger than between categories. We standardized the duration of each video, and used milliseconds as the measurement of solution time<sup>1</sup>. Furthermore, they did not systematically assess the relationship between confidence and accuracy, which could have given insights into the conscious status of activated knowledge. To explore this, we measured the confidence related to the accuracy for each solution type. Additionally, their sole measure of intuitive feelings was retrospective confidence, and they did not include any measurement of intuitive feelings during the solution attempt.
In the present study we evaluate intuitive feelings of nearness to the solution before the solution is given, in a manner similar to Metcalfe and Wiebe's warmth ratings. Finally, they had no measure of the stability of the solutions once they had been given. If insight solutions were more crystallized than noninsight solutions it would be expected that people would be unlikely to change them. Therefore, the tendency to hold on to the suggested solution was measured by including a multiple choice task giving several options for possible solutions.

# Aims of Current Study

The main aim of the current study was to explore whether the relationship between intuitive feelings and behavioral measures differed for solutions characterized by insight vs. solutions that were not, when the to-be-solved problems were magic tricks. This would in turn contribute to our understanding of the availability of conscious knowledge in the two forms of problem solving. We asked participants to provide both prospective warmth ratings while working on each problem and confidence ratings after having provided a suggested solution. Based on previous findings (Danek et al., 2014b), we predicted that the two types of solution would differ with respect to solution time, accuracy, and confidence. If our subjective measures of confidence and warmth were found to be more strongly related to objective indices of problem solving for noninsight than insight problems, this would support the view that insight to a larger extent involves implicit/unconscious knowledge. Although our study alone is not designed to directly test whether insightful problem solving reflects a continuous or discontinuous process, a similar pattern of equally predictive warmth and confidence ratings across the two types of solution would be compatible with a continuous view of insight. We were also interested in whether the insight solutions were more stable than the noninsight solutions, and this was tested by comparing the stability between the suggested solution and the subsequent multiple choice. In conjunction with the multiple choice task, participants would also report their decision strategy, where one of the options described having chosen the alternative most closely resembling the already suggested solution.

<sup>1</sup> In our view, the advantages of controlling for duration are larger than the possible limitations associated with this procedure (e.g., that the complexity of tricks cannot be varied within a single experiment).

# METHODS

# Participants

Fifty-one students (14 male, 37 female), aged 19–31 (M = 21.81, SD = 2.55) were recruited from the University of Bergen (The Faculties of Humanities, Law, Mathematics and Natural Sciences, Medicine and Dentistry, and Psychology). Each participant received a gift card of NOK 150 (about 18 USD) as a compensation for participating. The total duration of the experiment was between 50 and 70 min, depending on how much time participants spent on individual trials. The research was conducted in accordance with the stipulations of the declaration of Helsinki, and conformed to the regulations of the Norwegian Data Protection Official for Research.

# Materials

The task was programmed in E-prime 2.0 (Schneider et al., 2002a,b) and displayed on a 19-inch monitor. All instructions were in Norwegian, and all written instructions relating to the experimental procedure were presented on screen. Participants were tested in groups of 3–5 in individual cubicles in a psychology testing room. The post-experimental questionnaire and instructions were presented in paper format.

We reviewed the list of magic tricks presented in Danek et al. (2014b), and selected tricks based on a number of criteria. These included timing of individual tricks and variability across tricks in terms of effect and method. A magic trick consists of an initial situation, a magic moment, and a revelation (de Ascanio, 1964/2005), and for a trick to be selected it had to be structured so that it was possible to clearly present all these three phases within the time frame of 15 s. The different tricks selected should also cover a variety of different basic magic effects, e.g., production, vanish, transformation, penetration (Fitzkee, 1944/1989). Additionally, the methods used to accomplish the different effects should vary across tricks. Some of the magic tricks used similar methods to accomplish different magical effects, whereas other magic tricks used different methods to accomplish similar effects. Most of the magic tricks were accomplished using methods specific to those magical effects, making sure the problems to be solved were all different. All methods used should be possible to describe in a simple and straightforward fashion using relatively few words. Each magic trick was presented as a problem solving task with little or no use of misdirection or superfluous gestures. Of the 32 magic effects selected, 20 were used in the study conducted by Danek et al. (2014b).

On each of the 32 trials, a video was presented that displayed a professional magician performing a magic trick. The videos were filmed in a photographic studio and each video clip had a duration of 15 s. The full clips of three of the tricks are available online, and are also illustrated by picture sequences in **Figures 1**–**3** (Example 1: https://www.youtube.com/watch?v=\_jE25LbLaoQ, **Figure 1**; Example 2: https://www.youtube.com/watch?v=YTvTFNnwDEg, **Figure 2**; Example 3: https://www.youtube.com/watch?v=VqNYrADykUk, **Figure 3**). As different magic tricks require different points of focus from the spectator, 13 of the videos were filmed viewing the magician standing upright (see Example 1), 6 displayed the magician standing behind the table (see Example 2), and 13 displayed the magician's hands and a tabletop (see Example 3). A full list describing all the 32 magic tricks is provided in the Appendix.

# Procedure

## Instructions

At the start of the experiment, participants were given verbal instructions relating to the overall procedure as well as to our definition of an Aha! experience. This was described as a solution appearing "out of nowhere" and as being different from other/previously suggested solutions. Furthermore, participants were instructed that if they could explain the entire reasoning process leading up to the solution, this would not be considered an Aha! experience. The definition was similar to the one used by Danek et al. (2013, 2014a,b), with the only difference being that we did not include reference to confidence. Before proceeding to the experimental procedure, each individual participant was asked by the experimenter whether they had understood the definition and whether they had any further questions.

## Practice Trials

Participants were first given a practice trial where they were shown a short and unrelated video clip before being asked to click on a visual analog scale (VAS). On the second practice trial they were to watch the unrelated video clip once more and were instructed to abort the video at a certain point by pressing the spacebar. Finally they were shown what would be the duration of the warmth rating (WR) scale in the following procedure (4000 ms), to inform them of how much time they would have to answer the warmth rating.

# Problem Solving Task

The videos of the 32 magic tricks were presented in a different randomized order for each participant. Each trial consisted of the initial presentation of the magic trick followed by a WR display where the participant was to indicate perceived closeness to the solution. WR was reported using mouse click on a VAS consisting of a bar colored with a blue ("cold") to red ("warm") gradient. The WR scale would disappear after 4000 ms if no response was given. The first WR scale was followed by a break of 11,000 ms before another WR scale was shown and then followed by another presentation of the video. Each video could be displayed a maximum of 3 times. Every trial sequence would thus include a maximum of 3 presentations of the given video clip, 2 breaks, and a total of 5 WRs between each of these presentations/breaks.

FIGURE 1 | Picture sequence illustrating Example 1. The full clip is available at https://www.youtube.com/watch?v=\_jE25LbLaoQ.

FIGURE 2 | Picture sequence illustrating Example 2. The full clip is available at https://www.youtube.com/watch?v=YTvTFNnwDEg.

FIGURE 3 | Picture sequence illustrating the magic trick Ball to cube (Example 3). The full clip is available at https://www.youtube.com/watch?v=VqNYrADykUk.

Participants were instructed to press the spacebar once they knew the solution for the magic trick being displayed. Pressing the spacebar would abort the ongoing sequence, and this could be done at any point after the first video presentation had been completed. If the participants did not press the spacebar, the sequence would run out for the aforementioned maximum duration. This procedure is depicted in **Figure 4**.
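As an illustration only, the longest possible trial sequence can be sketched from the durations given above. The exact ordering of events (video, rating, break, rating, video, and so on, ending with a rating after the last video) is our reading of the description and should be treated as an assumption, as should the names used in the sketch; the original task was programmed in E-prime, not Python.

```python
# Durations in milliseconds, taken from the Methods text.
VIDEO_MS = 15_000        # each clip lasted 15 s
WR_MS = 4_000            # warmth-rating scale disappeared after 4000 ms
BREAK_MS = 11_000        # break between the two ratings separating presentations
MAX_PRESENTATIONS = 3    # each video could be shown at most 3 times


def max_trial_timeline():
    """Return the longest possible trial as a list of (event, duration_ms).

    Assumed ordering (not stated verbatim in the text): video, rating,
    break, rating, video, ..., ending with a rating after the last video.
    """
    events = []
    for presentation in range(1, MAX_PRESENTATIONS + 1):
        events.append(("video", VIDEO_MS))
        events.append(("warmth_rating", WR_MS))
        if presentation < MAX_PRESENTATIONS:
            events.append(("break", BREAK_MS))
            events.append(("warmth_rating", WR_MS))
    return events


timeline = max_trial_timeline()
counts = {
    name: sum(1 for event, _ in timeline if event == name)
    for name in ("video", "warmth_rating", "break")
}
total_ms = sum(duration for _, duration in timeline)
print(counts)     # number of videos, ratings, and breaks in a full sequence
print(total_ms)   # upper bound on sequence duration in ms
```

Under these assumptions the sketch reproduces the 3 presentations, 2 breaks, and 5 WRs described above, and gives an upper bound of 87,000 ms for a sequence that is never aborted. This bound assumes every rating scale stays on screen for the full 4000 ms; whether an early click shortened the display is not specified.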

In all cases, whether the participant aborted the sequence or it ended by timeout, the participant was presented with the question "Did you have an Aha! experience?" They answered this by indicating "yes" or "no". An on-screen text box then appeared, in which they were to type in the solution for the magic trick, or write "don't know" if they did not have any hypotheses for how the trick was done. After having written the solution they were to report their confidence related to the suggested solution. This was done using a VAS similar to the WR scale with a bar colored in gradients from light gray ("not at all confident") to dark gray ("totally confident").

## Recognition

After reporting the confidence related to the written solution, participants were given a multiple choice task of four possible solutions of which one was the correct solution. This was followed by a confidence rating similar to that used in the problem solving task, but was now related to the chosen alternative. They were finally asked to report the strategy used for arriving at the chosen alternative, with the alternatives being: "After looking at and comparing all the four alternatives I chose the one I thought to be the most probable," "The moment I saw one of the alternatives I knew it had to be the correct one," "The alternative I chose was the one most similar to my written solution," "I felt equally uncertain of all the alternatives and chose one at random." The procedure for the problem solving task and recognition was then repeated until all 32 videos had been viewed.

Participants did not receive any feedback about the accuracy of their chosen solution for either the written description or the recognition task.

#### Questionnaires

After completing the 32 trials the participants were first given a questionnaire asking if they knew anyone who had, or had themselves, been doing magic as a hobby or professionally at any point in their lives. They were also asked if they had knowledge about magic beyond what they perceived to be the average.

# RESULTS

## Rating the Accuracy of Solutions

Initial data analyses excluded single trials where the participants had reported that they did not know how the magic trick was done or where no response was given. Two raters (both professional magicians) scored all the remaining solutions independently on a 4-alternative scale (completely incorrect-1, mostly incorrect-2, mostly correct-3, completely correct-4), with the cutoff for correct/incorrect being 2/3. The 4-alternative scale was used only for the purpose of scoring, making it evident which items required the most thorough discussions. Inter-rater reliability measured using Cronbach's alpha was 0.911. As a rule of thumb, if a magic trick involved several minor effects (such as the vanish and reappearance of a ball), all of these had to be accounted for if the solution provided were to be rated as correct. Trials where the raters had scored differently were discussed casewise if the ratings were different with regard to incorrect (1 or 2) vs. correct (3 or 4). For the remaining analyses, accuracy of solutions was measured as a dichotomous variable. Trials rated 1 or 2 were given the value 0, and trials rated 3 or 4 were given the value 1.
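The scoring steps described above can be illustrated with a minimal sketch. It assumes two lists of 4-point ratings (one per rater, same trial order); the function names and the example ratings are hypothetical and are not taken from the study's data or analysis scripts. Cronbach's alpha is computed with the standard formula, treating each rater as an "item".

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha for k raters who each scored the same trials."""
    k = len(ratings_by_rater)
    # Sum of the sample variances of each rater's scores.
    rater_variances = sum(variance(r) for r in ratings_by_rater)
    # Variance of the per-trial totals (sum across raters).
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    return (k / (k - 1)) * (1 - rater_variances / variance(totals))


def dichotomize(score):
    """Map the 4-point rating to accuracy: 1-2 -> 0 (incorrect), 3-4 -> 1 (correct)."""
    return 1 if score >= 3 else 0


def needs_discussion(score_a, score_b):
    """Raters disagree across the 2/3 cutoff, so the trial is discussed casewise."""
    return dichotomize(score_a) != dichotomize(score_b)


# Hypothetical ratings for eight trials (illustration only).
rater_a = [1, 2, 4, 3, 1, 4, 2, 3]
rater_b = [1, 3, 4, 3, 2, 4, 2, 4]

alpha = cronbach_alpha([rater_a, rater_b])
flagged = [i for i, (a, b) in enumerate(zip(rater_a, rater_b)) if needs_discussion(a, b)]
```

In this sketch only disagreements that cross the correct/incorrect boundary are flagged, mirroring the rule that ratings of 1 vs. 2 (or 3 vs. 4) did not require discussion.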

Time was measured in milliseconds, and warmth and confidence were measured in whole values ranging from 1 to 100.

# Filtering of Data

Several other cases were excluded for different reasons. Cases where the response contained more than one solution were excluded from the analysis, both when one of the suggested solutions was correct and when none of them was. The same applied to cases where the participant did not understand the magic effect. Cases where the participant did not abort the procedure (i.e., timeouts) were also excluded from further analyses, as these were considered errors of omission (Salvi et al., 2016). Data from 8 of the participants were excluded altogether as they did not report any Aha! experiences. Trials involving one of the magic tricks ("Three Card Monte") were excluded across all participants as no one reported the correct solution. Finally, several single trials were excluded in cases where participants reported, either in the text box during the procedure or in the post-experimental questionnaire, that they had prior knowledge of how the magic trick was accomplished. The reason for this filtering was to make sure that the two groups of solution types did not differ in any way which might cause erroneous results (e.g., neither timeout trials nor trials with the response "don't know" would occur for insight trials). After excluding trials not fulfilling the set criteria (661), a total of 971 trials were left for the remaining analyses.

# Insight vs. Noninsight Solutions

Of the included trials (N = 971), 29% were reportedly solved using insight, whereas 71% of the trials were not. There was substantial variability in the frequency with which different tricks were solved with or without insight. To illustrate, example video 1 (https://www.youtube.com/watch?v=\_jE25LbLaoQ, see also **Figure 1**) was the problem most frequently solved with insight. In contrast, example video 3 (https://www.youtube.com/watch?v=VqNYrADykUk, see also **Figure 3**) was the problem least frequently solved with insight.

We will now give an example of how a single magic trick could be solved both with and without insight. In the magic trick "Chop Cup" (https://www.youtube.com/watch?v=YTvTFNnwDEg, see also **Figure 2**), a ball is taken from under a cup, vanished, and then reappears under the cup. For this particular magic trick, an understanding of the premise involves understanding that the ball to vanish is not the same as the one reappearing under the cup (i.e., the trick involves using two identical balls). This understanding may take the form of an Aha! experience. If one has understood this core premise, one can then deduce from this how the first ball is vanished and the second one is produced. A noninsight route to the same solution would be to first realize that the magician does not place the ball to be vanished in his hand before showing the hand empty. This, however, will not explain how the ball can reappear under the cup. Only by then understanding that the ball to appear under the cup is in fact different from the one vanishing will the spectator have understood the premise.

A series of t-tests were conducted with self-reported solution type (insight vs. noninsight) as the independent variable, and accuracy (correct/incorrect), solution time, and confidence in the written solution as the dependent variables, respectively. We expected insight solutions to be associated with higher accuracy, shorter solution time, and higher confidence. As predicted, there was a significant difference in solution accuracy in each task for solutions reported as insight (n = 281, M = 0.57, SD = 0.50) and solutions reported as noninsight (n = 690, M = 0.37, SD = 0.48); t(506.9) = 5.78, p < 0.001, d = 0.51. The average time spent before aborting the procedure showed a non-significant trend in the predicted direction between trials characterized by insight (M = 38.23, SD = 18.69) vs. noninsight trials (M = 40.56, SD = 19.53); t(969) = 1.705, p = 0.089, d = 0.10. This borderline trend becomes significant (p < 0.05) with a one-tailed t-test.
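As a rough check on the accuracy comparison, the test statistic can be approximately recomputed from the reported summary statistics. The sketch below assumes Welch's unequal-variance t-test (suggested by the fractional degrees of freedom) and, as a further assumption not stated in the article, the common conversion d = 2t/√df for the effect size. Because the inputs are rounded means and standard deviations, the output only approximates the reported values.

```python
from math import sqrt


def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def d_from_t(t, df):
    """Effect size via d = 2t / sqrt(df); using this conversion is an assumption."""
    return 2 * t / sqrt(df)


# Reported accuracy: insight (n = 281, M = 0.57, SD = 0.50)
# vs. noninsight (n = 690, M = 0.37, SD = 0.48).
t, df = welch_t(0.57, 0.50, 281, 0.37, 0.48, 690)
d = d_from_t(t, df)
print(f"t({df:.1f}) = {t:.2f}, d = {d:.2f}")
```

The small discrepancy from the reported t(506.9) = 5.78 is expected, since the published means and SDs are rounded to two decimals.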

# Warmth Ratings

Analyses comparing the development of warmth rating across time for insight vs. noninsight trials only included trials containing 3 or 4 points of measure. Trials where WR was reported on all 5 points were already excluded due to the omission criterion. It was assumed that participants may sometimes wait for a short time between figuring out the solution and aborting the procedure<sup>2</sup>. To avoid this possible confounding influence, the first WR rating was compared with the second-to-last (rather than the last) rating. This corresponds to the procedure used by Metcalfe and Wiebe (1987), who compared the first WR rating to the last rating before the rating given with the answer. Trials containing fewer than 3 points of data were therefore also excluded from these particular analyses. Warmth ratings were analyzed in terms of two types of scores that corresponded to "differential" and "angular" warmth measures (Metcalfe and Wiebe, 1987). Differential warmth was calculated by subtracting the first value from the last, similar to Metcalfe and Wiebe's procedure. This raw score could range from -99 to 99. Angular warmth was calculated by dividing differential warmth by seconds. This is based on a similar reasoning as both methods will measure development in warmth controlled for time. We expected to find a higher value for differential and angular warmth rating on trials not associated with insight.

<sup>2</sup>This assumption was confirmed through a questionnaire distributed to a subset of participants after completion of the experiment.
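The two warmth scores can be sketched as follows. This is an illustration under stated assumptions: ratings are whole values from 1 to 100 in chronological order, the end point defaults to the second-to-last rating (with a flag for the alternative analysis that includes the final rating), and `seconds` is whatever elapsed time the analyst chooses to divide by, which the text does not specify further. All names are hypothetical.

```python
def differential_warmth(ratings, include_last=False):
    """End-point warmth minus first warmth for one trial.

    ratings: chronological warmth ratings (1-100) for the trial.
    include_last: if True, use the final rating as the end point;
                  otherwise use the second-to-last rating.
    """
    if len(ratings) < 3:
        raise ValueError("at least 3 warmth ratings are required")
    end = ratings[-1] if include_last else ratings[-2]
    return end - ratings[0]


def angular_warmth(ratings, seconds, include_last=False):
    """Differential warmth divided by elapsed time in seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return differential_warmth(ratings, include_last) / seconds


# Hypothetical trial with four ratings over 40 s (illustration only).
trial = [20, 35, 60, 90]
diff_default = differential_warmth(trial)            # 60 - 20
diff_with_last = differential_warmth(trial, True)    # 90 - 20
slope = angular_warmth(trial, seconds=40.0)
```

With a 1 to 100 scale, the most extreme possible difference is 100 - 1 = 99 in either direction, which matches the stated range of the raw score.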

A set of t-tests showed no significant difference in differential warmth ratings between trials with solutions characterized by insight (n = 50, M = 2.92, SD = 23.26) and noninsight (n = 162, M = 0.30, SD = 19.52); t(210) = 0.79, p = 0.429, d = 0.12. There was also no significant difference in angular warmth ratings between trials with solutions characterized by insight (M = 0.41, SD = 2.60) and noninsight (M = −0.004, SD = 0.29); t(49.36) = 1.12, p = 0.266, d = 0.22. Although, as noted above, the last warmth rating probably should not be included in the analysis, when we did include it, the means for insight and noninsight solutions with the different analyses were 12.93 (SD = 18.87) and 11.65 (SD = 16.38) (differential warmth, t(239.38) = 0.72, p = 0.47, d = 0.07), and 0.24 (SD = 0.36) and 0.22 (SD = 0.31) (angular warmth, t(239.84) = 0.82, p = 0.42, d = 0.08), respectively. Thus, these findings contrast with the earlier findings of Metcalfe and Wiebe.

# Confidence

There was a significant difference in mean confidence between insight (M = 78.32, SD = 20.35) and noninsight (M = 68.95, SD = 23.96); t(606.8) = 6.17, p < 0.001, d = 0.50. This finding is important as participants in previous studies were explicitly instructed that they would be more confident on insight than noninsight solutions. Our instruction did not mention confidence, and yet participants were, in fact, more confident about insight solutions.

In order to compare the relationship between confidence and accuracy separately for the different solution types, two sets of analyses were conducted, one of which used mean values (i.e., trial based) and the other signal detection statistics (i.e., participant based). First, t-tests were conducted examining each solution type respectively, with accuracy treated as if it were an independent variable. For insight solutions confidence was significantly higher for correct (n = 160, M = 81.34, SD = 17.16) than incorrect trials (n = 121, M = 74.32, SD = 23.41); t(210.97) = 2.78, p < 0.01, d = 0.34. The same was true for noninsight solutions, where mean confidence was higher for correct (n = 254, M = 74.31, SD = 22.73) than incorrect trials (n = 436, M = 65.83, SD = 24.14); t(688) = 4.55, p < 0.001, d = 0.36.

The relationship between confidence and accuracy in the two conditions was compared using the signal detection theory (SDT) statistic Az (Macmillan and Creelman, 2004; Norman and Price, 2015). This is calculated from performance across the different values of the rating scale, and corresponds to the area under the SDT ROC curve. This area expresses the "probability of being correct for a given level of confidence" and can be regarded as indicative of the individual's metacognitive ability (Song et al., 2011, p. 1789). An Az score of 1 indicates perfect discrimination between correct and incorrect answers, and an Az score of 0.5 indicates random responding. Note that Az scores need to be calculated for each individual subject; thus, the following analyses are subject-based rather than trial-based.
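To make the idea of an area under the ROC curve concrete, the sketch below computes a nonparametric area for one participant from confidence ratings and accuracy. This is a stand-in for illustration only: Az is usually estimated from a fitted SDT model, and the article does not describe its exact computation. The function name and example data are hypothetical. The area here equals the probability that a randomly chosen correct trial received higher confidence than a randomly chosen incorrect trial, counting ties as one half.

```python
def roc_area(confidence, correct):
    """Nonparametric area under the ROC curve for one participant.

    confidence: confidence rating per trial (e.g., 1-100).
    correct: 1/0 (or True/False) accuracy per trial.
    Returns None when either category is empty, since the area is undefined.
    """
    hits = [c for c, ok in zip(confidence, correct) if ok]
    misses = [c for c, ok in zip(confidence, correct) if not ok]
    if not hits or not misses:
        return None
    # Compare every correct trial with every incorrect trial.
    wins = sum(
        1.0 if h > m else 0.5 if h == m else 0.0
        for h in hits
        for m in misses
    )
    return wins / (len(hits) * len(misses))


# Hypothetical participant: confidence tracks accuracy perfectly.
perfect = roc_area([90, 80, 30, 20], [1, 1, 0, 0])
# Hypothetical participant: confidence carries no information.
chance = roc_area([50, 50], [1, 0])
```

Returning `None` for participants with only correct or only incorrect trials reflects the general requirement that both response categories be present for the measure to be defined.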

Comparing the Az scores between insight (M = 0.56, SD = 0.28) and noninsight trials (M = 0.63, SD = 0.18) in the 33 participants who had a valid Az score for both types of trials<sup>3</sup>, there was no significant difference between the two groups, t(32) = 1.06, p = 0.297, d = 0.30. There was also no significant difference from random responding (0.5) in mean Az score for trials associated with insight (M = 0.57, SD = 0.28)<sup>4</sup>; t(33) = 1.51, p = 0.142, d = 0.25. For trials not associated with insight, though, mean Az scores were significantly higher than what would result from random responding (M = 0.64, SD = 0.17); t(41) = 5.43, p < 0.001, d = 0.82. Thus, when a person solved with insight they seemed unable to judge whether they were right or wrong, whereas they could make this distinction when they produced a noninsight response.

## Recognition

To evaluate whether people were differentially persevering with the responses they had produced when they had experienced insight or not, we separated trials on which participants indicated that they chose the alternative most similar to their written solution, from those on which they claimed to have recognized the chosen alternative using any other strategy. Reported decision strategy was recoded as a dichotomous variable ("The alternative I chose was the one most similar to my written solution"— 1; "other strategies"—0). Comparing the two sets of strategies, there was a significant difference between trials associated with insight (M = 0.72, SD = 0.45) vs. noninsight attributions (M = 0.61, SD = 0.49); t(969) = 3.37, p = 0.001, d = 0.23. When analysing trials where the written solution was correct, there was no significant difference between insight (M = 0.79, SD = 0.41) and noninsight (M = 0.78, SD = 0.41); t(412) = 0.25, p = 0.80, d = 0.02. For trials where the written solution was incorrect, there was a significant difference between insight (M = 0.63, SD = 0.49) and noninsight (M = 0.51, SD = 0.50); t(196.58) = 2.41, p = 0.017, d = 0.24, indicating that participants had a stronger tendency to hold on to incorrect solutions for trials recognized by insight than noninsight.

# DISCUSSION

In the present study we explored whether the relationship between metacognitive, "intuitive" feelings and objective indices of problem solving differed for insight vs. noninsight solutions when the to-be-solved problems were magic tricks (cf. Danek et al., 2013). The aim was to increase our understanding of the conscious availability of relevant knowledge in the two forms of problem solving, thus contributing to ongoing debates regarding conscious vs. unconscious processes in problem solving. A methodological aim was to explore the applicability of magic tricks as a problem solving task.

# Accuracy and Solution Time

In line with previous findings, insight solutions were more likely to be correct than noninsight solutions. This result is consistent with Danek et al.'s findings (2014b) and with the notion that insight nearly always predicts correctness (Ohlsson, 1992; Salvi et al., 2016). In the present study, several of the trials solved by insight were incorrect. A reason for this could be that the participants were unfamiliar with magic tricks and their methods, as well as with how the responses were scored. A response was considered correct only if it described the actual method used to accomplish the magic effect. It might be that if a provided solution is feasible (Danek et al., 2014b), albeit incorrect, the participant has still understood the basic premise of the problem, without being aware of the particular details of the method itself. That said, for most of the problems presented in the current study, only one solution was possible given the presented context.

Contradicting the results of Danek et al. (2014b), there was little evidence supporting the hypothesis that solution time would be shorter for insight trials compared to noninsight trials. This could be due to differences in experimental design and time measurements, as the present study featured videos all with a duration of 15 s, and milliseconds as measurement for solution time. In the experimental procedure developed by Danek et al. (2013), the videos lasted between 6 and 80 s, and solution time was measured as the number of presentations for each video (1–3). Considering that the magic moment and revelation in a magic trick usually takes very little time and happens at the end of the entire magic trick, the initial situation of the magic trick (de Ascanio, 1964/2005) could then be used to contemplate how to solve the problem at hand. For shorter videos, participants would then be given less time to solve the problem.

It could be argued that limiting each video clip to 15 s limits the design to feature simple magic tricks. However, even with this constraint, one of the magic tricks (Three Card Monte) was not solved by any of the participants. Using more complex magic tricks as problems could also give rise to what is perceived as several possible solutions (Tamariz, 1988), whereas the magic tricks used would most often only have one possible solution, and as such be comparable to a puzzle.

# Warmth Ratings

Contrary to predictions, there were no differences in the development of warmth ratings for insight vs. noninsight solutions. One possible explanation is that the two types of solution were preceded by the same underlying problem-solving processes (Bowers et al., 1990; Zander et al., 2015). However, it could also be related to our measurement procedure. Due to the aforementioned exclusion criteria, several trials were dismissed when measuring warmth. Even though participants could report warmth up to 5 times for each trial, only trials including 3 or 4 warmth ratings were used in the analyses, resulting in the exclusion of 70% of all trials. 3 or 4 ratings constitute relatively few data points in this form of analysis, and by comparison, the original study by Metcalfe and Wiebe (1987) allowed for up to 40 warmth ratings per problem.

Another salient difference between Metcalfe and Wiebe's (1987) study and the present one is that in the former, participants had to be 100% confident in their answer before providing it. People were not free to give an answer with low confidence, as they were in the present study. As those authors noted and as is consistent with the present data, when a person is working on a problem they may come to a tentative solution without high confidence. In order to be allowed to provide that (wrong) answer in Metcalfe and Wiebe's experiment, they would have to convince themselves that the answer was correct, or maybe good enough, and increase their confidence rating about that answer. This increase in confidence, due to accepting that a solution that is not perfect is actually good enough (the acceptance of a "satisficing" solution), might itself have accounted for the incrementality seen in their noninsight condition, and also seen when people were solving insight problems but produced the wrong answer. Indeed, high confidence on insight problems just before the answer actually predicted that a mistake would be produced (Metcalfe, 1986), as if people might have been going through a self-deceptive process of convincing themselves that a wrong answer was acceptable. (Note that in the present study they would have been able to simply give the wrong response with low confidence.)

<sup>3</sup>For Az to be calculated, there needs to be at least 1 response in each category (correct vs. incorrect).

<sup>4</sup>The means and SD's differ from the above analyses due to casewise exclusion in the paired sampled t-tests.

# Confidence Ratings

Although insight and noninsight trials did not differ in terms of warmth ratings, they differed in terms of confidence ratings given after arriving at the solutions. This indicated that, cognitively, they were not identical. The results showed that confidence reflected solution accuracy more precisely for noninsight than insight trials. Confidence ratings have previously been used to measure awareness of knowledge used in problem solving (Metcalfe, 1986; Metcalfe and Wiebe, 1987) as well as in other types of cognitive tasks, including implicit learning (Shanks and St. John, 1994; Dienes and Berry, 1997).

In the present study, insight trials were characterized by an overall stronger conviction that one's solution was correct, as well as overall more accurate responding. This is in line with the claim by Topolinski and Reber (2010) that the experience of insight is accompanied by a feeling of being right. However, confidence was in fact less predictive of solution accuracy for insight when this relationship was compared for correct vs. incorrect trials within individual participants. The relatively stronger correspondence between confidence and accuracy on noninsight trials, combined with the fact that confidence did not predict accuracy above chance level for insight trials, could be interpreted as indicating that participants had more metacognitive awareness of the accuracy of the provided solution on trials not characterized by insight. The contention that there was a difference between the two types of problem solving is further supported by the self-reported decision strategies for recognition judgments. Participants perseverated more with their incorrect solutions for insight than noninsight trials, indicating they were more likely to adjust their solution for the latter.

The finding is also compatible with the idea of high-confidence responses reflecting higher-quality mental representations, and with Danek et al.'s (2013) findings that insight solutions were associated with better long-term recall. Even though there was no support for the hypothesis that access to metaknowledge preceding the solution was different for insight vs. noninsight, the results involving intuitive feelings and decision strategies occurring after arriving at the solution indicated that the two types of problem solving did indeed reflect qualitatively different processes.

# Insight As Reflecting Unconscious Knowledge

The aim of including metacognitive measures of warmth and confidence was to make it possible to draw inferences about the conscious availability of relevant knowledge in the two forms of problem solving (Norman and Price, 2015). Whereas a correspondence between confidence and accuracy indicates that behavior is influenced by conscious knowledge, the lack of such correspondence is normally taken to indicate unconscious knowledge (Dienes and Berry, 1997).

Our finding that confidence was less predictive of accuracy on insight trials could therefore indicate that such trials were characterized by relatively less conscious awareness of relevant knowledge. For example, insight trials may involve less access to conscious fragment knowledge and/or informative cues related to the provided solution (e.g., noticing a detail in the scene that one may use as a basis for subsequent hypothesis testing). Alternatively, it could be that insight trials are associated with a deeper understanding of the premise of the problem, but that this understanding is not fully available to conscious introspection/verbalisation at the time confidence is rated. If this is true, one could assume that when having an Aha! experience, participants first understand the core premise of the magic trick, and then "fill in the blanks" (Metcalfe and Wiebe, 1987; Smith and Kounios, 1996). The higher accuracy for insight trials could thus indicate that participants in these cases are more likely to have understood the problem "more fully", i.e., to have a more complete understanding of the problem<sup>5</sup> (Dominowski and Dallob, 1995), whereas for noninsight solutions they may be more likely to have understood and solved one piece of the problem whereas other parts are left unsolved. The relatively lower confidence for (incorrect) noninsight solutions could then reflect that on noninsight trials, participants were metacognitively aware that their knowledge/understanding was partial as opposed to complete. In contrast, on insight trials participants may intuitively have felt that they had understood the problem more fully. However, if they lacked conscious access to the details of this knowledge, they would be less able to metacognitively monitor its correctness, resulting in a lower correspondence between confidence and accuracy.

In sum, the confidence results suggest that problem solving by insight at least partly reflects unconscious knowledge. In other words, insight reflects more than just conscious, step-by-step monitoring (MacGregor et al., 2001). Instead, the results seem more compatible with theories that emphasize automatic/unconscious cognitive processes in insight problem solving (e.g., Ohlsson, 1992; Smith and Kounios, 1996; Knöblich et al., 1999; Topolinski and Reber, 2010).

<sup>5</sup> for a description of how this can manifest, see the description of the magic trick "Chop Cup" in the section "Insight vs. noninsight solutions" under Results.

Even though this conclusion would be stronger if also supported by the results involving warmth ratings, there are several reasons why the warmth measurement in the current experiment was not sensitive to possible differences in the cognitive processes preceding insight vs. noninsight solutions. Future studies should measure warmth in ways that avoid these limitations, which are accounted for in more detail earlier.

# Insight As Resulting from a Continuous or Discontinuous Process

Insight has been viewed as either a product of a discontinuous (e.g., Kounios and Beeman, 2014) or continuous process (e.g., Bowers et al., 1990; Zander et al., 2015), and a better understanding of whether insight is preceded by intuitive feelings or whether it reflects a sudden shift in information-processing is clearly needed. The fact that insight solutions were associated with higher accuracy and confidence compared to noninsight solutions, and also displayed a weak trend for shorter solution time, could be taken to support a discontinuous view. The same holds for the findings that insight solutions were characterized by a weaker correspondence between confidence and accuracy, and a stronger tendency to hold on to the provided solution, than noninsight solutions. Even though these findings are related to what happens after the insight has occurred, they could nevertheless be used to argue for qualitative differences between the two types of problem solving. In contrast, the lack of difference in warmth ratings between insight and noninsight trials does lend support to the continuous view. Thus, together the results do not give a clear answer to the question of continuity. In order to provide a clearer answer to this question, future studies should include additional measures of intuitive feelings and a larger number of measurement points. More specifically, additional points of data for intuitive feelings that occur before arriving at the solution would increase the experiment's sensitivity in reflecting possible differences in the development of warmth ratings across the two types of trials.

### Limitations and Future Directions

Even though self-reported Aha! experience is by many regarded as indicative of insight problem solving (e.g., Bowden et al., 2005; Bowden and Jung-Beeman, 2007; Sandkühler and Bhattacharya, 2008; Danek et al., 2016), there is still a concern that what we here classify as insight solutions were not necessarily arrived at exclusively through insight, or that noninsight solutions did not purely reflect an incremental process. Instead, some solutions may have been reached through a combination of both. The fact that the problems to be solved were all from the same set of tasks may even have increased the possibility that participants used largely similar strategies when appraising each problem across different trials. This could be due to the aforementioned issue relating to participants receiving feedback, as well as a consideration that magic tricks as a problem solving task cannot necessarily be separated into categories of purely insight or incremental problems. If this was the case, this may to a certain extent explain why warmth ratings were not more different across the two types of trials. However, the fact that the two types of solution were subjectively experienced by participants as being different, and the fact that participants tended to hold on to their suggested solutions more strongly on high-confidence insight trials, both go against this possible criticism.

# AUTHOR NOTE

We would like to express our gratitude to Mats Svalebjørg for performing the magic tricks and contributing to the scoring of the solutions. We would also like to thank the people at Myreze for their contribution in filming the magic tricks.

# AUTHOR CONTRIBUTIONS

This study was conducted within a student scholarship project granted to MH. The supervisor for this project was EN. MH and EN contributed to the research design, data analysis, interpretation, and critical revision of the manuscript. MH programmed the experiment, and had the main responsibility for data collection and handling, as well as drafting the manuscript. JM contributed to the data analysis, interpretation, and revision of the manuscript.

# FUNDING

The project was supported by a student research grant from the Faculty of Psychology at the University of Bergen and a grant from Skibsreder Jacob R. Olsen og hustru Johanne Georgine Olsens legat (grant no. 2016/11/FOL/KH).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Hedne, Norman and Metcalfe. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX


<sup>∗</sup>This or a similar magic trick was also reported being used by Danek et al. (2014b). The presentation of the trick as well as the method used to achieve the desired effect might be different.

† This magic trick was excluded from the analyses because no participant provided a correct response.

# What about False Insights? Deconstructing the Aha! Experience along Its Multiple Dimensions for Correct and Incorrect Solutions Separately

#### Amory H. Danek \* and Jennifer Wiley

Department of Psychology, University of Illinois at Chicago, Chicago, IL, USA

#### Edited by:

Eörs Szathmáry, Parmenides Foundation, Germany

#### Reviewed by:

Joachim Funke, Heidelberg University, Germany Rakefet Ackerman, Technion – Israel Institute of Technology, Israel

> \*Correspondence: Amory H. Danek danek@uic.edu

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 04 October 2016 Accepted: 26 December 2016 Published: 20 January 2017

#### Citation:

Danek AH and Wiley J (2017) What about False Insights? Deconstructing the Aha! Experience along Its Multiple Dimensions for Correct and Incorrect Solutions Separately. Front. Psychol. 7:2077. doi: 10.3389/fpsyg.2016.02077

The subjective Aha! experience that problem solvers often report when they find a solution has been taken as a marker for insight. If Aha! is closely linked to insightful solution processes, then theoretically, an Aha! should only be experienced when the correct solution is found. However, little work has explored whether the Aha! experience can also accompany incorrect solutions ("false insights"). Similarly, although the Aha! experience is not a unitary construct, little work has explored the different dimensions that have been proposed as its constituents. To address these gaps in the literature, 70 participants were presented with a set of difficult problems (37 magic tricks), and rated each of their solutions for Aha! as well as with regard to Suddenness in the emergence of the solution, Certainty of being correct, Surprise, Pleasure, Relief, and Drive. Solution times were also used as predictors for the Aha! experience. This study reports three main findings: First, false insights exist. Second, the Aha! experience is multidimensional and consists of the key components Pleasure, Suddenness and Certainty. Third, although Aha! experiences for correct and incorrect solutions share these three common dimensions, they are also experienced differently with regard to magnitude and quality, with correct solutions emerging faster, leading to stronger Aha! experiences, and higher ratings of Pleasure, Suddenness, and Certainty. Solution correctness proffered a slightly different emotional coloring to the Aha! experience, with the additional perception of Relief for correct solutions, and Surprise for incorrect ones. These results cast some doubt on the assumption that the occurrence of an Aha! experience can serve as a definitive signal that a true insight has taken place. On the other hand, the quantitative and qualitative differences in the experience of correct and incorrect solutions demonstrate that the Aha! experience is not a mere epiphenomenon. Strong Aha! experiences are clearly, but not exclusively linked to correct solutions.

Keywords: aha experience, insight, problem solving, false insights, phenomenology, suddenness, pleasure, confidence

# INTRODUCTION

Theoretically, false insights should not exist. The founders of insight research, the Gestalt psychologists, understood insight to be the result of a productive thinking process turning a problem, or "defective Gestalt," into a solution, a "good Gestalt" (Wertheimer, 1925, 1959; Duncker, 1945). This classical view of insight as being defined by a restructuring of the problem representation (Wertheimer, 1925) implies that an insight always results in a correct solution, as also postulated, for example, by Sandkühler and Bhattacharya (2008). The subjective Aha! experience that problem solvers often report when they find a solution has been taken as a marker for insight (e.g., Kaplan and Simon, 1990; Gick and Lockhart, 1995) and researchers have relied on self-reports of the Aha! experience to distinguish insight solutions from non-insight solutions (e.g., Jung-Beeman et al., 2004; Kounios et al., 2006; Subramaniam et al., 2009). If the Aha! experience is closely linked to insightful solution processes based on restructuring ("representational change" in terms of Ohlsson, 1992), then theoretically, an Aha! should only be experienced when the correct solution is found (i.e., a "true insight"). This implies that the Aha! experience should be different or even non-existent for incorrect solutions. On the other hand, Ohlsson had already theorized that "erroneous insights" could exist (Ohlsson, 1984b, p. 124) and that they would arise if a solution attempt that seems promising at first glance does not map onto the actual problem space. However, the question of the existence of false insights (experiences that feel like insights during incorrect solution attempts) has not received much attention so far. Empirical findings regarding the nature of Aha! experiences during false insights are sparse because incorrect solutions are typically discarded and not further analyzed. Exceptions are recent studies by Danek et al. (2014b), Salvi et al. (2016), and Webb et al. (2016), which will be discussed in detail further below.

Empirical support for the strong position that insight is linked to finding a correct solution comes from one study by Metcalfe (1986b). She was the first to look at metacognition during problem solving by using feeling-of-warmth ratings on a set of problems thought to require insight for solution. She found that warmth ratings differed as a function of solution correctness: 76% of all correct solutions were preceded by a "subjectively catastrophic process" (Metcalfe, 1986b, p. 633), measured as a sudden increase in warmth ratings upon finding a solution (from a previous flat line). In contrast, incorrect solutions were more likely to be preceded by a gradual increase in warmth. Although her results were not completely clear-cut (52% of all incorrect solutions also showed the pattern of a sudden increase), this initial study provided evidence that the subjective perception of solutions as sudden may be linked to correctness. However, although subjective perceptions were assessed with feeling-of-warmth ratings in this study, participants' subjective Aha! experiences were not.

Three more recent studies that did assess participants' subjective Aha! experiences using self-reports have found a small percentage of false insights, i.e., Aha! experiences that were reported for incorrect solutions (Danek et al., 2014b; Hedne et al., 2016; Salvi et al., 2016). Apart from trial-wise Aha! ratings, these studies did not examine the Aha! experience any further, so it remains an open question whether Aha! experiences reported after incorrect solutions differ from those reported after correct solutions. There is some evidence from a study by Sandkühler and Bhattacharya (2008) that correct solutions are processed differently than incorrect solutions with stronger gamma band activity (40 Hz) over parieto-occipital regions. Interestingly, Jung-Beeman et al. (2004) also reported a sudden burst of gamma band activity in the right anterior superior temporal gyrus about 0.3 s prior to solution (only for insight solutions as compared to non-insight solutions). Further, Salvi et al. (2016) found that Aha! experiences are more likely to be reported following correct solutions than incorrect ones. Similarly, but without splitting their analysis into correct and incorrect solutions, Webb et al. (2016, reported in the same Research Topic) found that a feeling of Aha! is positively associated with accuracy. Finally, it is important to note that in all of these studies (and in the present study, too), problem solvers did not receive any feedback about the correctness of their solutions which suggests that possible differences in the Aha! experience between correct and incorrect solutions were not due to solvers' awareness that they had suggested an incorrect solution. The aim of the present study was to more directly compare whether differences might be found in subjective Aha! experiences for correct vs. incorrect solutions.

# DEFINING THE DIMENSIONS OF AHA!

The Aha! experience is probably not a unitary construct, but has several different facets. This is reflected in the following typical instruction given to participants as part of self-report methods:

A feeling of insight is a kind of "Aha!" characterized by suddenness and obviousness. You may not be sure how you came up with the answer, but are relatively confident that it is correct without having to mentally check it. It is as though the answer came into mind all at once—when you first thought of the word, you simply knew it was the answer. This feeling does not have to be overwhelming, but should resemble what was just described. (Jung-Beeman et al., 2004, p. 507).

Such definitions of an Aha! generally include many different dimensions of experience, which clouds the interpretation of which dimensions are most important. In Jung-Beeman's Aha! prompt, the dimension of Suddenness in the emergence of the solution is described (literally, and also by "all at once"), as well as a feeling of Obviousness and Certainty (which both seem to refer to the same sensation, namely being sure about the correctness of a solution). Then there is the additional aspect of not having used a clear strategy ("You may not be sure how you came up with the answer" and "You simply knew it was the answer"). Other researchers focus on different dimensions. For example, based on earlier work that characterized insightful solutions as sudden and surprising (Metcalfe, 1986a,b; Metcalfe and Wiebe, 1987; Schooler et al., 1993; Davidson, 1995; Bowden, 1997), Cushen and Wiley (2012) used the following prompt: "If you figured out how to solve the puzzle, how surprised were you? How much did it feel like a sudden realization?" This prompt relies on only two dimensions, Suddenness and Surprise, to characterize the Aha! experience. There is no consensus about which components make up the Aha! experience, which unfortunately leads to a large variety in which dimensions are used across studies. In fact, with every research group creating their own definition of Aha! experiences, it is nearly impossible to find studies that use the same prompts. Therefore, a systematic analysis of how much each purported dimension predicts the overall Aha! experience would be useful.

A main goal of this study was to decompose the Aha! experience along its different dimensions in order to identify those dimensions that best predict a global Aha! rating. This would then allow for the investigation of which dimensions might differ in their relation to correct and incorrect solutions. Danek et al. (2014a) provided an initial attempt to determine which specific dimensions drive the Aha! experience. In this study, participants attempted to discover solutions to a set of magic tricks (a task which has been demonstrated to lead to Aha! experiences; Danek et al., 2013, 2014b). At the end of the study, participants were asked to think back to the Aha! experiences they had during the study, describe them in an open-ended response, and rate the importance of several individual dimensions. As shown in **Figure 1**, high endorsement implicated the dimensions of Happiness, Surprise, Certainty and Suddenness as important for the Aha! experience both at the end of the study (1st rating) and after 14 days (2nd rating). Open-ended responses also suggested Drive (being motivated to continue problem solving) and Relief (feeling relieved or relaxed) as two further dimensions. However, these data were collected only once at the end of the study, which means they could not be used to align performance and Aha! experiences on particular problems. In contrast, the present study will take trial-wise ratings after each solution attempt. Just recently, the same approach was chosen by Webb et al. (2016), who had participants solve sets of insight and non-insight problems and collected trial-wise ratings of Certainty ("Confidence" in their study), Pleasure, Surprise and Impasse along with a measure of the intensity of the insight experience ("Strength"), using the same visual analog scales as Danek et al. (2014a), namely a continuous scale from 0 to 100, which allows a more fine-grained assessment of these feelings than the typically used binary or Likert scales.

In the present study, after each trick, participants were asked to rate six dimensions of their solution experience, based on prior work and intended to represent both cognitive and affective dimensions. Each dimension is illustrated by a short quotation from participants' open-ended descriptions of "What an Aha! moment feels like" in Danek et al.'s study (Danek et al., 2014a).

**Suddenness.** Cognitive dimension. "The moment comes quite suddenly, as if the idea jumps directly into your mind and doesn't develop step by step by reflection."

That an insightful solution appears suddenly rather than incrementally is thought to be a key characteristic of insight, consistent with the findings of Metcalfe (1986b) and Metcalfe and Wiebe (1987) who demonstrated a discontinuous pattern of feeling-of-warmth ratings. The Gestalt psychologists encompassed the idea of Suddenness of insight in their writings (e.g., Duncker, 1945). This idea was further corroborated by Davidson (1995) and also by Sandkühler and Bhattacharya (2008) who reported high ratings of Suddenness for correct solutions.

**Certainty.** Cognitive dimension. "A feeling of definite knowledge or alternatively, a first sensation of knowledge that is not necessarily confirmed in the next step, but initially, feels certain and irrefutable."

The obviousness of insightful solutions, or the "intuitive sense of success" (Gick and Lockhart, 1995, p. 215) was emphasized as an important aspect of the Aha! experience by Jung-Beeman et al. (2004) and is also apparent in anecdotal reports of scientific discoveries (Irvine, 2015). By separately asking for a confidence rating and an Aha! rating after each solution, Danek et al. (2014b, Experiment 1) found that participants were more confident in the correctness of their Aha! solutions than in the correctness of their non-Aha! solutions. Hedne et al. (2016, reported in the same Research Topic as the present study) just recently replicated this effect (higher confidence about insight solutions compared to non-insight solutions) in a study using a very similar set of magic tricks (see Supplementary Material for full trick list).

**Pleasure.** Affective dimension. "I feel lively and happy to have figured it out. A feeling of bliss."

This dimension was included because problem solvers endorsed having pleasant feelings after a solution ("Happiness") more strongly than any other dimension in Danek's previous study (Danek et al., 2014a). Based on this finding, it was predicted that Pleasure would be the strongest predictor of the global Aha! rating. Of course, the emotional reaction to gaining an insight can also be negative. Wertheimer already described the example of a lawyer who suddenly realizes that he has burnt important documents (Wertheimer, 1925, p. 173). Gick and Lockhart (1995, p. 199) also pointed out the "groan response" or "feeling of chagrin" that sometimes comes with gaining insight, and recently, Hill and Kemp (2016) found evidence for such "Uh-oh moments" in reports of everyday insight experiences in an online study. The negative aspect of insight was included in the present study with the scale for Pleasure going from "unpleasant" to "pleasant," but not as an individual dimension.

**Surprise.** Affective dimension. "I feel surprised that I have understood something."

An insight is often thought to feel surprising, and Gick and Lockhart (1995) suggested that surprise might constitute one of the main components of the Aha! experience. However, empirical evidence for this dimension is lacking, with the exception of our previous study, where Surprise was endorsed significantly less than Happiness (Danek et al., 2014a), but on the same level as Certainty and Suddenness.

**Relief.** Affective dimension. "It was a feeling of relief combined with a feeling of happiness after a phase of strain caused by failure."

The idea that tension is released or that some kind of relaxation comes about with insight already figures in the Gestalt concept of insight (Duncker, 1945), as also noted by Ormerod et al. (2002). Relief could also reflect the overcoming of an impasse (see below), and therefore be a marker for the underlying representational change processes leading to correct solutions. Empirically, first evidence for this dimension came from open-ended questions about what an Aha! moment feels like (Danek et al., 2014a), where problem solvers repeatedly described feelings of relaxation and relief.

**Drive.** Affective dimension. "This feeling gives me wings that make me continue working on the problem which I had not been able to solve before. And, naturally, I immediately feel inclined to solve further problems, as it seems now you can do anything, no matter which task you have been set."

This is another new dimension that was derived from open-ended questions in a prior study on the same stimulus set (Danek et al., 2014a) and that has already been described on a theoretical level (as an "energizing effect on problem solving behavior"; Ohlsson, 1984a, p. 70).

# Excluded Dimensions

For the sake of completeness, further possible dimensions of the multi-faceted Aha! experience are listed here, together with an explanation why they were not included in the present study.

#### Impasse

A feeling of being stuck. This dimension was rated significantly lower than all other dimensions in Danek et al. (2014a), with ratings near the midline. Further, in Webb et al. (2016), impasse was shown to be negatively correlated with the strength of self-reported Aha! experiences, which supports the idea that although impasse might be part of the problem solving process, it is not part of the Aha! experience itself. Being in an impasse would also happen at a different point in time, namely before a solution is found.

#### Feelings of Frustration

As discussed above, by implementing the dimension Pleasure with the two poles "unpleasant" and "pleasant," a strong negative affective reaction is already contained in the Pleasure scale. Note that participants only see the scale with the two anchors, but not the title "Pleasure."

#### Processing Fluency

Topolinski and Reber (2010a) have argued that fluency (in the sense of a certain ease of thinking, when thoughts flow uninterruptedly and smoothly) might be the overarching feature of the Aha! experience, the "glue between its experiential features" (Topolinski and Reber, 2010a, p. 404). However, for the present purpose of regressing the Aha! experience on several dimensions (and avoiding multicollinearity between predictors), this aspect seemed already sufficiently captured by the Suddenness scale. In addition, while Topolinski and Reber (2010b) used an indirect way of assessing fluency (by varying the onset of shown solutions) that was not feasible within the present paradigm of self-generated solutions, self-reports on processing fluency seemed rather difficult to obtain.

# Overview of the Present Study

The present study aimed at identifying those dimensions that best predict a global Aha! rating specifically for correct solutions by using a large problem set from the domain of magic as problem solving task (Danek et al., 2014b) and asking participants to provide a solution, a global Aha! rating, and ratings on each of the six dimensions following each trick (i.e., trial-wise). Based on Danek et al. (2014b) and Salvi et al. (2016), it was predicted that correctly solved problems should be more likely to be accompanied by Aha! experiences than incorrectly solved problems. To the extent that longer solution times are due to the use of analytic or incremental solution processes, Aha! experiences could also be predicted to be more likely to accompany faster correct solutions. Further, if Aha! experiences are a marker for true insight, then there should be some distinction between the Aha! experiences that accompany correct solutions and incorrect solutions. Theoretically, one would expect that the thinking processes leading to incorrect solutions should be fundamentally different from those leading to correct solutions that involve representational change. However, if no quantitative or qualitative differences are found, this would suggest that the Aha! experience might be epiphenomenal rather than a defining characteristic, as some researchers have argued (e.g., Weisberg and Alba, 1981). One reason why the Aha! experience might be better considered as epiphenomenal is that problem solvers do not seem to have reliable access to their solution processes and thus cannot report on them (Ash et al., 2009). However, while it is true that several studies (e.g., Cushen and Wiley, 2012) have found a disconnect between the actual solution process and solvers' reportable experience of it, this might also simply be due to using incomplete prompts (e.g., missing important dimensions or stressing less important ones) about what an Aha! experience feels like. The present systematic dissection of Aha! will hopefully contribute to getting a clearer picture about this.

# METHODS

# Participants

Participants were 70 undergraduate students from the University of Illinois at Chicago Introduction to Psychology Subject Pool who received course credit for their participation (M = 19.6 (SD = 2.8) years of age; 22 males, 48 females). All of them were tested individually. Two additional subjects were tested, but could not be included in the analysis for failing to follow the instructions. In addition, on an individual trick level, whenever a participant had pressed the solution button without typing in an answer, their ratings were not analyzed, but treated as missing values. There were 35 participants in each of two conditions that counterbalanced the direction of the individual dimension ratings. Note that all participants solved at least three tricks correctly.

# Stimuli

# Magic Tricks

A set of 37 magic tricks (listed in the Supplementary Material) was presented to participants as a problem solving task using a paradigm established by Danek et al. (2014b). Students were told "Your task is to solve this puzzle and try to see through the magic trick." This large set of problems was used in order to generate many repeated solution events (with or without insight) that participants could report on. Short video clips (duration ranged from 6.3 to 72.5 s) were presented on a 19′′ computer screen through PsychoPy (Peirce, 2007). The tricks had been performed by a professional magician, Thomas Fraps (Abbott, 2005), and recorded in a standardized theatre setting (see https://www.youtube.com/watch?v=3B6ZxNROuNw for an example clip from the set). The stimulus set covered a wide range of different magic effects (e.g., transposition, restoration, vanish) and methods (e.g., misdirection, gimmicks, optical illusions) (for more details, see Danek et al., 2014b). Two additional tricks were used for practice trials. Two of the 37 tricks were not solved by anyone and were therefore not included in any analyses, resulting in a final problem set of 35 magic tricks.

# Rating Scale for Global Aha! Rating

Immediately after indicating that they had found a solution, participants were asked "Did you have an Aha! moment?" and gave an answer by selecting a point between "no" and "yes" on a visual analog scale (see Slide 3 in **Figure 2**). In previous work, self-reports of Aha! experiences have ranged from dichotomous measures (Yes - No), to Likert scales with 3, 5, or 7 points, to continuous scales. We agree with Webb et al. (2016) that binary ratings suffer from the problem that participants might use very different benchmarks for what constitutes an Aha! experience or not. Some might set the criterion for when they rate "Aha!" very high, others very low. Continuous scales allow participants to report a range of stronger and weaker Aha! experiences. Thus, the present study employed a continuous scale.

For the global Aha! scale, the "yes" anchor always appeared on the right-hand side of the scale. Participants were instructed to base their rating decision on the following description of what an Aha! moment typically feels like (translated with minor modifications from the German instruction of Danek et al., 2013; which had been originally adapted from Jung-Beeman et al., 2004):

"An Aha! moment is when the solution suddenly dawns on you and everything is clear immediately. << Experimenter snaps fingers. >> In a flash. You are relatively confident that your solution is correct without having to check it once more. In contrast, if the solution occurs to you slowly and in steps, and if you feel you still need to check it that would not be an Aha!. As an example, imagine a light bulb that is switched on all at once in contrast to slowly turning up the lights. Have you ever experienced an Aha! moment, perhaps during studying? For each solution, we ask for your subjective rating whether it felt like an Aha! moment or not. There is no right or wrong answer. Just follow your intuition."

# Rating Scales for Individual Dimensions of Solution Experience

For each trick, participants rated their subjective solution experiences with respect to six different dimensions, using visual analog scales with the following wording for the prompts and anchors:


The dimensions appeared in the order shown above for all participants. The direction of the anchors was counterbalanced across two groups of participants. For one half of the participants, the anchors of the Pleasure, Suddenness and Drive scales were reversed from the direction of the global Aha! rating [e.g., Pleasure: "At the moment of solution, my feelings were... (pleasant - unpleasant)"]. For the other half of the participants, the anchors of the remaining three scales (Surprise, Relief, Certainty) were reversed. This created the two counterbalancing conditions.
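One way to bring ratings from the two counterbalancing conditions onto a common direction before collapsing them is to mirror the reversed scales. The sketch below is an illustration only. It assumes (this is not stated in the text) that raw ratings are recorded as the 0 to 100 position from the left edge of the displayed scale and that reversed scales are recoded as 100 minus the raw value. The condition labels "A" and "B" and the function name are hypothetical.

```python
# Which scales had reversed anchors in each counterbalancing condition.
# The assignment of scales follows the description above; the labels
# "A" and "B" are hypothetical names for the two conditions.
REVERSED_SCALES = {
    "A": {"Pleasure", "Suddenness", "Drive"},
    "B": {"Surprise", "Relief", "Certainty"},
}


def align_rating(condition, dimension, raw):
    """Return a rating oriented like the global Aha! scale.

    Assumption: `raw` is the 0-100 position from the left edge of the
    scale as displayed, and a reversed scale is mirrored as 100 - raw.
    Scales that were not reversed (including the global Aha! rating)
    are returned unchanged.
    """
    if not 0 <= raw <= 100:
        raise ValueError("ratings are expected to lie between 0 and 100")
    if dimension in REVERSED_SCALES[condition]:
        return 100 - raw
    return raw


# Example: the same raw position of 30 on two different scales.
pleasure_a = align_rating("A", "Pleasure", 30)  # reversed in condition A
relief_a = align_rating("A", "Relief", 30)      # not reversed in condition A
```

With this recoding, higher values mean the same thing in both conditions, so the data can be pooled without the anchor direction distorting means or correlations.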

# Procedure

After signing an agreement form, participants were seated at a computer and instructed to watch the video clips and try to find the solution. It was stressed that they should only provide plausible solutions (nothing like "a magic powder lets the coin disappear"), but that if they had an idea what the solution could be, then they should type it in even if they were not sure about it. The latter was intended to help increase the low solution rates from previous studies and generate more events of interest. They were also told to press the space bar as soon as possible once they had a solution idea. This ended the video clip presentation and brought them to the first rating screen with the global Aha! rating (see **Figure 2** for the sequence of one trial). The global rating was followed by four more ratings (Pleasure, Surprise, Suddenness, and Relief). Then participants were prompted to type in their solution and finished the trial with two more ratings (Certainty and Drive). Participants did not receive any feedback on the correctness of their solutions. The procedure began with two practice trials. Then, the 37 experimental video clips were presented in randomized order. Each trick was shown a maximum of three times. If no button was pressed to indicate that a solution was found, the next trick followed. At the end of the experiment, participants filled in a demographic data sheet and were debriefed. The entire experiment lasted about 1 h.

# Response Coding

Responses were coded as correct or incorrect solutions by two independent raters using a solution coding manual based on prior work with this problem set (Danek et al., 2013, 2014a,b). Correct solutions were either the real solution (i.e., the method that the magician used) or alternative, but plausible solutions, while incorrect solutions were either implausible or partial (key solution element missing) solutions. The intraclass correlation coefficient was 0.83 indicating a satisfactory level of agreement between the two raters. Conflicting cases between the two raters were resolved by a third rater.

All rating scales including the global Aha! rating were measured in whole values from 0 to 100. Solution time was measured in milliseconds from the start of the video clip until participants pressed a button to indicate that they had found a solution. Previous viewings of the trick were included in the solution times for each trial.

# RESULTS

In total, 70 participants being presented with 35 tricks yielded 2450 observations. Of those, 603 were not solved (i.e., timeouts) and thus discarded, and an additional 69 observations were missing values due to computer errors or skipped trials. All analyses were based on the remaining 1778 observations where participants suggested a solution. Of these 1778 observations, 36.8% (654 occurrences) were correctly solved, and 63.2% (1124) were incorrectly solved. For all analyses, data were collapsed across the two counterbalancing conditions. The dataset of the present study will be made available at the open repository for psychology data "PsychData" (https://www.psychdata.de/index.php?main=none&sub=none&lang=eng).
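The bookkeeping in this paragraph can be retraced with a few lines of arithmetic. The snippet below uses only the counts reported above; the variable names are illustrative.

```python
# Counts reported in the text.
participants = 70
tricks = 35
unsolved = 603   # timeouts, discarded
missing = 69     # computer errors or skipped trials
correct = 654
incorrect = 1124

# Total observations and the subset that entered the analyses.
total = participants * tricks
analyzed = total - unsolved - missing

# Share of correct and incorrect solutions among analyzed observations.
pct_correct = round(100 * correct / analyzed, 1)
pct_incorrect = round(100 * incorrect / analyzed, 1)

print(total, analyzed, pct_correct, pct_incorrect)
```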

# Relationship between Solution Success, Solution Times, and Aha! Ratings

Before exploring the dimensions that predicted Aha! experiences, basic differences in the magnitudes of Aha! ratings and solution times were explored for correct and incorrect solutions.

Computing average ratings for correct and incorrect solutions at the participant level revealed that correct solutions led to higher Aha! ratings (M = 66.50, SD = 18.42) than did incorrect solutions (M = 52.34, SD = 18.78), t(69) = 10.21, p < 0.01, replicating Danek et al. (2014b) and Salvi et al. (2016); see **Figure 3**. This difference in the magnitude of Aha! ratings offers initial support for the position that Aha! experiences might differ following correct vs. incorrect solutions.
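A participant-level comparison of this kind can be sketched as follows: ratings are first averaged within each participant, separately for correct and incorrect solutions, and the two sets of means are then compared with a paired t-test. The 69 degrees of freedom reported above are consistent with such a paired test on 70 participants, but the code is only an illustration. The trial data are made up, and the function names and the tuple layout are assumptions.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def participant_means(trials):
    """Average Aha! ratings per participant, separately by correctness.

    `trials` is an iterable of (participant_id, is_correct, aha_rating).
    Returns two lists (correct means, incorrect means) aligned by
    participant; participants lacking either kind of solution are skipped.
    """
    by_pid = defaultdict(lambda: {True: [], False: []})
    for pid, is_correct, rating in trials:
        by_pid[pid][bool(is_correct)].append(rating)
    correct, incorrect = [], []
    for pid in sorted(by_pid):
        cells = by_pid[pid]
        if cells[True] and cells[False]:
            correct.append(mean(cells[True]))
            incorrect.append(mean(cells[False]))
    return correct, incorrect


def paired_t(xs, ys):
    """Paired-samples t statistic and its degrees of freedom (n - 1)."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Made-up trials for four hypothetical participants.
trials = [
    (1, True, 80), (1, True, 60), (1, False, 50),
    (2, True, 60), (2, False, 55),
    (3, True, 80), (3, False, 70), (3, False, 50),
    (4, True, 65), (4, False, 50),
]
correct_means, incorrect_means = participant_means(trials)
t_value, df = paired_t(correct_means, incorrect_means)
```

Averaging within participants first gives every participant equal weight, regardless of how many tricks they solved.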

However, it is notable that a substantial percentage of incorrect solutions (37% or 417 out of 1124) received Aha! ratings that were higher than the average for correct solutions. This shows that the Aha! experience is not an exclusive feature of correct solutions, but that it is also reported for incorrect solutions.

FIGURE 3 | Mean Aha! ratings as a function of solution correctness. Error bars denote SEM.

In terms of solution time, on average, correct solutions (M = 35.81, SD = 19.71) were significantly faster than incorrect solutions (M = 42.46, SD = 24.49), t(1776) = 6.26, p < 0.01. To understand the relation of solution time to Aha! ratings, a linear mixed-effects model was calculated to predict Aha! ratings, including solution time, solution correctness, and their interaction as fixed effects, and random intercepts for subjects. As shown in **Figure 4**, there was a main effect of solution time (t = 4.03, p < 0.01), with faster solutions more likely to be rated high on Aha! and longer solutions more likely to be rated low. There was also a main effect of solution correctness (t = −3.36, p < 0.01) with correct solutions more likely to be rated high on Aha! than incorrect solutions, as already reported above. The interaction was not significant (t = 1.58, p < 0.12). For fast incorrect solutions, it is possible that solution time is being misused as a cue because it leads to giving high Aha! ratings ("false insights"). But for longer incorrect solutions, problem solvers give low Aha! ratings, so they seem to realize that these are not "true insights."
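As a sketch, the mixed-effects model described in this section (fixed effects for solution time, solution correctness and their interaction, plus a random intercept per subject) can be written out as below. The notation is an assumption made for illustration: $i$ indexes trials, $j$ indexes subjects, and the text does not state how correctness was coded or whether predictors were centered.

```latex
\mathrm{Aha}_{ij} = \beta_0
  + \beta_1\,\mathrm{Time}_{ij}
  + \beta_2\,\mathrm{Correct}_{ij}
  + \beta_3\,(\mathrm{Time}_{ij} \times \mathrm{Correct}_{ij})
  + u_j + \varepsilon_{ij},
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $u_j$ is the subject-specific deviation from the overall intercept $\beta_0$, which absorbs stable differences in how liberally individual participants use the Aha! scale, and $\beta_3$ corresponds to the interaction term that was not significant.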

# Which Dimensions of Aha! Predict Global Aha?

The main aim of the present study was to test whether differences might be found in subjective Aha! experiences after correct vs. incorrect solutions. However, before proceeding to analyses that consider only correct or incorrect solutions, we first report correlations using Webb et al.'s approach (Webb et al., 2016) of analyzing both correct and incorrect solutions together, see **Table 1**. We find rather similar results to theirs, with all dimensions showing a relation with Aha! ratings in simple correlations, except for the Surprise dimension. Even though the relation was still significant, we find a much lower correlation between Surprise and the global Aha! rating (r = 0.07; in Webb et al., correlations ranged from 0.29 to 0.48). The dimensions Suddenness, Relief and Drive were assessed only in the present study and therefore not compared with Webb et al.'s results.

# What Predicts Aha! For Correct Solutions?

One of the main questions for this study was which dimensions of the Aha! experience specifically predict global Aha! ratings for correct solutions. As shown in **Table 2**, simple correlations showed that all dimensions but Surprise were significantly and positively correlated with the global Aha! rating for tricks with correct solutions.

Correlations between the six dimensions and the global Aha! rating were also computed for each individual and averaged across individuals. As shown in **Table 3**, this led to the same pattern of results as the simple correlations. Average correlations were significantly greater than 0 for all dimensions except Surprise.
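The intra-individual analysis described here can be sketched as follows: a Pearson correlation between one dimension and the global Aha! rating is computed separately for each participant, and the resulting coefficients are averaged. The code is an illustration on made-up data. The function names, the tuple layout, and the rule of skipping participants with fewer than three trials or with constant ratings are assumptions, not details taken from the study.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation; None if either variable has no variance."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def mean_within_person_r(trials, min_trials=3):
    """Average of per-participant correlations.

    `trials` is an iterable of (participant_id, dimension_rating, aha_rating).
    Participants with fewer than `min_trials` trials or an undefined
    correlation are skipped (an assumption made for this sketch).
    Returns (mean correlation, number of participants used).
    """
    by_pid = defaultdict(list)
    for pid, dim, aha in trials:
        by_pid[pid].append((dim, aha))
    rs = []
    for pairs in by_pid.values():
        if len(pairs) < min_trials:
            continue
        r = pearson_r([d for d, _ in pairs], [a for _, a in pairs])
        if r is not None:
            rs.append(r)
    return mean(rs), len(rs)


# Made-up ratings for three hypothetical participants.
trials = [
    (1, 10, 20), (1, 20, 40), (1, 30, 60),   # r = 1.0
    (2, 10, 30), (2, 20, 10), (2, 30, 20),   # r = -0.5
    (3, 50, 50), (3, 60, 70),                # too few trials, skipped
]
mean_r, n_used = mean_within_person_r(trials)
```

The averaged coefficients can then be tested against zero across participants, which is the kind of test referred to for Table 3.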

To understand the relation of each dimension to the Aha! ratings, a linear mixed-effects model was calculated to predict the Aha! ratings for just the correct solutions, including each of the dimensions as fixed effects, and random intercepts for subjects. As shown in **Table 4**, Pleasure, Suddenness, Certainty and Relief were found to be unique predictors of the Aha! experience for correct solutions.

TABLE 1 | Both correct and incorrect solutions: Simple correlations between participants' ratings of their problem solving experience (on the dimensions pleasure, surprise, suddenness, relief, certainty and drive) and one global Aha! rating.

Dimensions listed in the order that they were asked.

N = 1778. All values are Pearson correlation coefficients. \*p < 0.05. \*\*p < 0.01 (2-tailed).

TABLE 2 | Correct solutions: Simple correlations between participants' ratings of their problem solving experience (on the dimensions pleasure, surprise, suddenness, relief, certainty and drive) and one global Aha! rating.

Dimensions listed in the order that they were asked.

N = 654. All values are Pearson correlation coefficients. \*p < 0.05. \*\*p < 0.01 (2-tailed).

TABLE 3 | Average intra-individual correlations between dimensions and Aha! ratings.

N = 68. \*\*p < 0.01 one tailed t-test vs. 0. All values are mean correlations (i.e., the average of 68 individual correlation coefficients).

TABLE 4 | Linear mixed-effects model of predictors of the global Aha! rating, for correct solutions only.

N = 654.

# What Predicts Aha! For Incorrect Solutions?

As shown in **Table 5**, simple correlations showed that all dimensions were significantly and positively correlated with the global Aha! rating for tricks with incorrect solutions. However, when correlations were computed for each individual and averaged as shown in **Table 3**, the average correlation for Surprise was not significantly greater than 0.

To test which dimensions uniquely predicted Aha! ratings, a parallel linear mixed-effects model was calculated just for the incorrect solutions. As shown in **Table 6**, Pleasure, Suddenness, Certainty and Surprise were found to be unique predictors of the Aha! experience for incorrect solutions.
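The structure of the two mixed-effects models described above (one for correct, one for incorrect solutions) can be written out as a formula. The notation is an assumption introduced here for illustration and does not appear in the original analysis: $i$ indexes solutions, $j$ indexes participants, and $x_{kij}$ is the rating on the $k$-th of the six dimensions.

```latex
\[
\mathrm{Aha}_{ij} = \beta_0 + \sum_{k=1}^{6} \beta_k\, x_{kij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
\]
```

Under this reading, the $\beta_k$ are the fixed effects of the dimensions, $u_j$ is the random intercept of participant $j$, and a dimension counts as a unique predictor when its $\beta_k$ differs reliably from zero while the other five dimensions are in the model.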

# What Distinguishes the Aha! Experience between Correct and Incorrect Solutions?

The above analyses demonstrated that Pleasure, Suddenness and Certainty were the key dimensions that combined to uniquely predict the Aha! experience for both correct and incorrect solutions. This means that Pleasure, Suddenness and Certainty ratings always covaried with Aha! ratings, independent of solution correctness. Further, Relief emerged as the only dimension that uniquely predicted the Aha! experience for correct but not for incorrect solutions. On the other hand, Surprise was the dimension that predicted Aha! experiences only for incorrect solutions, and may be considered a misleading cue. These results suggest that all Aha! experiences may consist of a core of three dimensions, but that in addition, solution correctness may be associated with slightly different emotional coloring, with problem solvers feeling relieved for correct solutions, and feeling surprised for incorrect ones.

TABLE 5 | Incorrect solutions: Simple correlations between participants' ratings of their problem solving experience (on the dimensions pleasure, surprise, suddenness, relief, certainty and drive) and one global Aha! rating.

Dimensions listed in the order that they were asked.

N = 1124. All values are Pearson correlation coefficients. \*p < 0.05. \*\*p < 0.01 (2-tailed).

TABLE 6 | Linear mixed-effects model of predictors of the global Aha! rating, for incorrect solutions only.

N = 1124.

The other major difference between Aha! experiences for correct and incorrect solutions seems to be in magnitude. Although both were predicted by the Pleasure, Suddenness and Certainty dimensions, correct solutions were rated as more pleasant (M = 66.05, SD = 13.79) than incorrect (M = 56.67, SD = 15.63, t(69) = 7.17, p < 0.01), more sudden (M = 55.68, SD = 18.17) than incorrect (M = 47.19, SD = 16.67, t(69) = 5.90, p < 0.01), and solvers were more certain about being correct when they gave correct solutions (M = 70.55, SD = 14.15) than incorrect solutions (M = 56.14, SD = 16.0, t(69) = 9.83, p < 0.01), even though they never received feedback about their solution correctness<sup>1</sup>.

# Differences in Aha! Experiences Due to Solution Complexity

Ohlsson postulated that the perceived suddenness of a solution might be a function of how much problem solving is needed to complete the problem after the initial representational change has taken place (Ohlsson, 1984b, 1992, 2011). He claimed that whether a solution feels sudden or not is contingent upon how many thinking steps are still required once a potential solution element is identified. If the entire remaining solution can be "seen" in the mind's eye [i.e., if it lies within the horizon of mental look-ahead (MacGregor et al., 2001), which is limited by working memory capacity (Ohlsson, 2011)], the problem will seem to be solved very quickly after the initial breakthrough. This leads to the following hypothesis (stated in chapter 4 of Ohlsson, 2011): If several additional steps are required to achieve the full solution after the first realization of a crucial solution element (Weisberg and Alba, 1981), then the solution will feel less sudden. This hypothesis can be tested within our task domain of magic tricks. Thus, the current problem set of 35 magic tricks was analyzed for the number of steps that each trick required for solution. Tricks that required just one realization, after which the full solution should directly appear within the horizon of mental look-ahead, were coded as having a "single-step" solution (cf. Murray and Byrne, 2013). Alternatively, tricks that required several additional steps to reach a full solution after the first realization of the crucial solution element were coded as "multi-step" solutions. The set was found to contain both single-step (n = 24 tricks) and multi-step (n = 11) solutions. Item-level analyses showed that correctly solved magic tricks with single-step solutions received higher Suddenness ratings (M = 55.69, SD = 7.50) than magic tricks with multi-step solutions (M = 47.10, SD = 10.69, t(33) = 2.74, p < 0.05). This was independent of actual solution times, which did not differ between the two groups of tricks. This analysis was computed using data for correct solutions only (because incorrect solutions vary individually and can be single- or multi-step for the same problem). Single-step solutions did not differ from multi-step solutions in any other dimension nor in the global Aha! rating nor in solution time. In contrast to Murray and Byrne's study (Murray and Byrne, 2013), single-step tricks did not differ from multi-step tricks with regard to their difficulty (measured as mean solution rate for each trick).
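As a plausibility check, the item-level comparison of Suddenness ratings can be approximately recomputed from the reported summary statistics alone. The sketch below assumes a pooled-variance (Student) independent-samples t-test, which matches the reported 33 degrees of freedom (24 + 11 − 2); the function name is hypothetical.

```python
import math


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples Student t statistic from group summaries.

    Returns (t, degrees of freedom), using the pooled variance estimate.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Single-step tricks (n = 24) vs. multi-step tricks (n = 11), as reported.
t, df = pooled_t_from_summary(55.69, 7.50, 24, 47.10, 10.69, 11)
print(f"t({df}) = {t:.2f}")
```

The recomputed value is about 2.75 with 33 degrees of freedom, close to the reported t(33) = 2.74; a discrepancy of this size is what one would expect when working from means and standard deviations rounded to two decimals.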

# DISCUSSION

The starting point for the present study was the question whether false insights happen at all, i.e., whether high Aha! experiences are also reported for incorrect solutions. We found that overall, correct solutions were more likely to lead to Aha! experiences. However, some incorrect solutions (37%) also led to high Aha! experiences. Therefore, although the Aha! is linked to finding a correct solution, false insights clearly exist, too (as suggested by previous studies, Danek et al., 2014b; Hedne et al., 2016; Salvi et al., 2016). This shows that the Aha! experience is not an exclusive feature of correct solutions.

<sup>1</sup>On average, solvers were over-confident on tricks they solved incorrectly, and under-confident on tricks they solved correctly. Because the majority of tricks were solved incorrectly, participants were on average 23.22% over-confident.

The present finding that correct solutions led to higher Aha! ratings than incorrect solutions is in accordance with prior studies (Danek et al., 2014b; Hedne et al., 2016; Salvi et al., 2016). Further differences were apparent with regard to solution time, with correct solutions emerging significantly faster than incorrect solutions. Both of these results offer initial support for the position that Aha! experiences might feel different for correct vs. incorrect solutions. The reasoning was that if Aha! experiences are a marker for true insight, correct solutions should not only lead to higher ratings of Aha!, as found here, but also to qualitatively different ratings along the individual Aha! dimensions. If no such differences were found, this would suggest that Aha! is merely epiphenomenal, and not an indicator of different problem solving processes underlying correct and incorrect solutions.

With a systematic decomposition of the Aha! experience into its constituents, and by obtaining separate ratings for each of them, the present study found that Pleasure, Suddenness and Certainty uniquely predicted Aha! experiences for both correct and incorrect solutions. This means that when participants reported Aha!, they also had pleasant feelings in the moment of solution, felt that the solution had come to them all at once, and were certain that their solution was correct. These three dimensions seem to be at the core of Aha! experiences, independent of solution correctness. However, although these three dimensions are shared, correctness is reflected in major quantitative differences between Aha! experiences that follow correct and incorrect solutions: Compared to incorrect solutions, correct solutions were rated as more pleasant and more sudden, and solvers were more confident about being correct. Further, a small qualitative difference was found: for correct solutions, Relief also uniquely predicted Aha!, whereas for incorrect solutions, it was Surprise. This suggests a slightly different emotional coloring of the Aha! experience, with problem solvers who found the correct solution feeling relieved, and problem solvers who found an incorrect solution feeling surprised. Importantly, these differences were observed in the absence of any feedback about solution correctness. These findings speak against regarding the Aha! experience as only epiphenomenal (as for example suggested by Weisberg and Alba, 1981).

Looking at solution times, faster solutions were found to be more likely to be rated high on Aha! and slower solutions were more likely to be rated low, a result which is in accordance with several other studies (e.g., Aziz-Zadeh et al., 2009; Wegbreit et al., 2012; Chein and Weisberg, 2014; Danek et al., 2014b).

The results of the present study can be compared to the results of Webb et al.'s recent study (2016, reported in the same Research Topic). Although the motivation for the Webb et al. study was to explore how different dimensions underlying the Aha! experience might predict solution accuracy, and in contrast the motivation for the present study was to explore how different underlying dimensions might predict Aha! differently for correct and incorrect solutions, there are still a number of commonalities that can be noted across the results of the two studies. Differences between the two studies that might limit the comparability will be discussed later on, as well as unique insights that were gained from exploring relations for correct and incorrect solutions separately in the present study.

# Pleasure

There was a strong and positive relationship in simple correlations (r = 0.66) between Pleasure and the global feeling of Aha!. This finding seems to generalize across different problem solving tasks, with Webb et al. (2016) reporting r's in the range of 0.71 to 0.73 when using five classic, mostly verbal insight problems and r's ranging from 0.63 to 0.70 when using Compound Remote Associate (CRA) problems (Bowden and Jung-Beeman, 2003). It is also in accordance with another study on CRA problems by Kizilirmak et al. who report a more positive emotional response (measured on a 5-point graphical affective rating scale with smiley faces) for Aha! solutions compared to non-Aha! solutions (Kizilirmak et al., 2016b). It also matches our everyday experience of insight as a very pleasant event. Further, positive affect is known to facilitate insight (e.g., Isen et al., 1987; Bolte et al., 2003; Subramaniam et al., 2009; Sakaki and Niki, 2011). The present finding that feeling happy or in a good mood predicts a global rating of Aha! sheds some new light on these studies, at least on those where insight was assessed through self-reports. With positive emotions being a key aspect of the subjective Aha! experience, inducing positive mood prior to solving might simply lead participants to report more Aha! experiences. They may be more likely to say that any solution was an insight. This is in contrast to the hypothesis that being in a good mood increases the likelihood of insightful solutions (reflected in higher solving rates).

Another possible theoretical explanation for the prevailing role of Pleasure is offered by Thagard and Stewart's attempt to model the Aha! experience (Thagard and Stewart, 2011). Their EMOCON model conceptualizes the Aha! experience as a pattern of neural activity that arises through the convolution of an emotional reaction with a new combination of mental representations. Of course, a novel combination of representations (or restructuring) is just what is needed for solving a magic trick or other difficult problem solving tasks where solvers are lured into an inappropriate initial representation. The "ecstasy of discovery" (Thagard and Stewart, 2011, p. 10) is proposed to arise from automatic appraisal mechanisms that judge each new combination of mental representations with regard to its relevance. If the novel combination is non-trivial and highly relevant for the problem solver, a strong emotional response is triggered which is also reflected on a physiological level.

# Suddenness

The feeling that a solution appears all at once instead of stepwise was another unique predictor of Aha! in the present study, with a strong and positive simple correlation (r = 0.49) between Suddenness and the global Aha! rating. This means problem solvers who experienced the solution as very sudden were also likely to report a strong Aha! feeling. This supports the idea of different cognitive processing underlying solutions with stronger or weaker reported Aha! experiences. In the case of strong Aha! experiences, the solution pops into mind all at once, as a whole. Webb and colleagues did not gather data on this dimension, so it is unclear whether it might generalize across problem solving tasks. Further, perceived Suddenness depended on the degree of complexity of the solution, with single-step solutions feeling more sudden than multi-step solutions, independent of trick difficulty or time to solution.

Of course, because Suddenness was explicitly mentioned in the Aha! prompt that participants were given, that could be the reason for the strong relation between Suddenness and the global Aha! rating in this study. However, this simple explanation seems less likely when one considers that Suddenness was found to be more of a factor for tricks that required single-step solutions as opposed to multi-step solutions. This shows that there was not a simple positive relation between Suddenness and Aha! ratings, which would have been more consistent with a bias or demand characteristic resulting from Suddenness being included as part of the Aha! prompt. It also highlights the importance of careful task analyses when selecting which problems to study, even with the recognition that any problem solving task can be solved with or without Aha! experience (Bowden et al., 2005; Öllinger et al., 2014; Kizilirmak et al., 2016a; Danek et al., 2016; Webb et al., 2016). Clearly, the aim for researchers who want to study insight and Aha! is to select tasks that not only have a high probability of leading to an initially biased problem representation which is false and must be improved through a representational change, but also have a high probability of triggering Aha! experiences. The present data indicate that mainly problems with single-step solutions will yield the feeling of Suddenness. This important new finding converges with a recent study on three classical insight problems (9 Dot, 8 Coin and one Matchstick Arithmetic Problem) reporting that problems with solutions for which only one constraint needs to be relaxed feel more like an "Aha!" than multi-step solutions with several constraints (Danek et al., 2016). The prototypical example of a multi-step solution problem is the classic 9 Dot Problem (Maier, 1930), which Kershaw and Ohlsson (2004) as well as Öllinger et al. (2014) have shown involves multiple causes of difficulty. These types of problems are not what insight researchers should aim for if they are trying to study Aha! experiences.

# Certainty

Confidence in the correctness of the proposed solution (in the absence of feedback) also uniquely predicted the strength of the global Aha! rating, with a simple correlation of r = 0.58 between Aha! and Certainty. Again, this finding seems to generalize across different problem solving tasks, with Webb et al. (2016) reporting r's ranging from 0.60 to 0.65 (classic insight problems) and from 0.52 to 0.63 (CRAs). On one hand, the strong relation between Certainty and the global Aha! rating could be due to the fact that, like Suddenness, Confidence was stressed in the Aha! prompt that participants were given in both this study and the Webb et al. study ("You are relatively confident that your solution is correct without having to check it once more."). However, other studies that have not included Certainty in their prompt (Hedne et al., 2016) have also found that Certainty is higher for Aha! trials than non-Aha! trials, which suggests that it may be an essential dimension of the Aha! experience even without explicit prompting.

# Relief

The affective dimension of Tension Release or Relief has not been widely explored previously. Webb et al. (2016) did include it by mentioning relief in the Aha! prompt, but did not collect data on it. Relief was found to be highly correlated with Aha! in this study (r = 0.49). The fact that it also correlated strongly with Pleasure (r = 0.64) suggests that the dimensions of Pleasure and Relief might be measuring similar emotional constructs. However, it is also possible that Relief is related to the cognitive process of representational change that allows the solver to resolve an impasse, overcome a difficulty, or escape fixation. Relief was the only dimension unique to correct solutions. This means that if a correct solution was found, problem solvers' Aha! ratings covaried with Relief ratings. This was not the case for incorrect solutions.

# Surprise

The overall relation between ratings on the Surprise dimension and Aha! was only 0.07 in simple correlations in this study, while the Webb study reports r's ranging from 0.29 to 0.48 (classic insight problems) and from 0.15 to 0.25 (CRAs) for their Surprise dimension. There are a number of possible ways to interpret these differences. One possibility is that the Surprise ratings in the Webb study are capturing the same underlying perception as the Suddenness ratings in the present study, and our results turned out differently because we asked participants to rate both dimensions. Alternatively, because Webb et al. did not counterbalance the direction of their scales (all dimensions were aligned with the global Aha! rating), they may have inflated the positive relations among the dimensions. Of course, differences between the problem types (magic tricks vs. puzzles and CRAs) could also be responsible for differences in Aha! experiences, but this seems less likely given the high consistency with regard to the other dimensions.

Most importantly, the Surprise dimension was one of two dimensions (the other one was Relief) to suggest that Aha! experiences triggered by correct solutions slightly differ from those triggered by incorrect solutions, as the Surprise dimension was a unique predictor only for incorrect solutions. This result questions the wisdom of the established approach of using a multi-component operational definition for Aha! that encompasses Suddenness, Certainty and Surprise. Studies relying on Surprise in their Aha! prompts might actually have encouraged participants to use a misleading cue and therefore obtained invalid self-reports of insight.

# Drive

The overall relation between ratings on the Drive dimension and Aha! was 0.28 in this study (no Drive dimension was included in the Webb study). Interestingly, Drive was canceled out and did not predict the Aha! rating at all when variance due to subjects was removed (by fitting random intercepts for subjects in our mixed model analysis). These results suggest that Drive is just an individual factor that is experienced differently by each person, but that it is not a relevant part of the Aha! experience.

In future studies, it would be interesting to investigate possible cues problem solvers might be using for their subjective dimension ratings. For the dimension Suddenness, this study provides first evidence that solution complexity (single vs. multistep solutions) plays a role in judging a solution as emerging suddenly or not. However, it remains unclear what leads problem solvers to feel that a solution is pleasant or relieving or surprising.

# Differences between the Present Study and Webb et al. (2016)

Comparing the present results with the study by Webb et al. (2016), who used a very similar methodology on completely different problem sets, offers the exciting possibility to scale the findings up to different tasks. This comparability might be somewhat limited, however, due to differences in the way the Aha! experience was assessed. Instead of a global Aha! rating like the one used here ("Did you have an Aha! moment?", with a sliding scale from No to Yes), their "Aha" variable was measured as "Strength of the insight experience" (with a sliding scale from very weak to very strong). At first glance, this might seem like only a small difference, but in particular the lower end of the scale does not seem fully equivalent. The wording of the strength rating scale might suggest to participants that some form of insight always takes place, because the lowest possible rating would still mean "a very weak insight experience." Thus, there is no room for "no Aha's", only for weak Aha's. Similarly, the ratings for the underlying dimensions were not counterbalanced for their direction, meaning that they were always aligned with the Aha! rating. If some subjects simply had a leftward or rightward bias when using the scales, this may have inflated both the ratings of Aha! and the relation between Aha! and each dimension, and might also explain why Webb et al. tended to find slightly higher correlations. Yet, despite these differences, a number of commonalities were found.

In contrast to the present study, Webb et al. (2016) did not analyze incorrect and correct solutions separately. This makes sense given that the aim of Webb et al. (2016) was not to decompose the Aha! experience, but was instead to predict solution accuracy from the individual dimensions as well as from a global measure of Aha! (strength of the insight experience). However, the fact that the present study found different dimensions serving as unique predictors of Aha! for correct and incorrect solutions shows that it is important to consider these solution types separately. The unique insights that emerged from exploring relations for correct and incorrect solutions separately included a better understanding of the Surprise dimension and its relation to both Aha! experiences and solution accuracy. Webb et al. found a consistently positive relationship between Surprise and Aha!, which led them to conclude that Surprise is an important factor in the Aha! experience. At the same time, they reported a negative or non-significant correlation between Surprise and accuracy across three experiments, and in their powerful multilevel regression model (combining data from 674 subjects), they found that Surprise decreased solution accuracy. This suggests a disconnect between the way Surprise relates to Aha! experiences and accuracy. By splitting solutions based on their correctness, in the present analysis it becomes clear that the relation between Surprise and Aha! may be specific for incorrect solutions. In other words, feelings of Surprise that accompany a solution may relate more strongly to false insights than to true ones. Finally, the analysis for only correct solutions reveals Relief as the one dimension that relates more to correct than incorrect solutions, suggesting a slightly different emotional coloring of the Aha! experience, dependent on solution correctness.

# CONCLUSION

In sum, this study reports three main findings: First, false insights exist. Second, the Aha! experience is truly multidimensional, centered around both affect (Pleasure) and cognition (evaluating solutions as emerging suddenly and feeling confident about them). Third, although Aha! experiences for correct and incorrect solutions share these three common dimensions, they are also experienced somewhat differently with regard to magnitude and quality. Correct solutions emerged faster and led to stronger Aha! experiences; higher ratings of Pleasure, Suddenness, and Certainty; and were more associated with Relief, while incorrect solutions were more associated with Surprise.

Taken as a whole, these results cast some doubt on the assumption that the occurrence of an Aha! experience can serve as a definitive signal that a true insight has taken place. Theoretically, this would have suggested that Aha! experiences should have only resulted from correct solutions. Although the present study measured a rather comprehensive set of six dimensions, more work is needed to determine if there may be other specific aspects of the Aha! experience that may be more indicative of only true insights. Moreover, if we adopt the Gestalt psychologists' original definition of insight as being based on restructuring (Wertheimer, 1925), future studies should try to include some measure of restructuring. On the other hand, the quantitative and qualitative differences in the experience of correct and incorrect solutions demonstrate that the Aha! experience is not a mere epiphenomenon. To conclude, strong Aha! experiences are clearly, but not exclusively linked to correct solutions, and consist of three key components: joy of discovery, confidence in being correct and a feeling that the solution appears all at once.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Institutional Review Board and the Office for the Protection of Research Subjects of the University of Illinois at Chicago with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Institutional Review Board and the Office for the Protection of Research Subjects of the University of Illinois at Chicago.

# AUTHOR CONTRIBUTIONS

AD and JW designed the experiment. AD developed the magic trick material, conducted the study and wrote the first draft of the manuscript. AD and JW analyzed the data. Both authors were critically involved in the interpretation of the results and in revising the manuscript.

# FUNDING

This work was funded by a grant to AD from the DFG (German Research Foundation), grant # DA 1683/1-1. The Research Open Access Publishing (ROAAP) Fund of the University of Illinois at Chicago provided financial support toward the open access publishing fee for this article.

# REFERENCES


# ACKNOWLEDGMENTS

We are grateful to Stellan Ohlsson for discussion and helpful methodological insights. We are also indebted to magician Thomas Fraps (http://www.thomasfraps.com) for performing the magic tricks used in this study. We thank Franziska Konitzer for writing the PsychoPy code for this experiment and Shannon Menard and Jocelyn Rodriguez for assistance in data collection and coding of solutions.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.02077/full#supplementary-material

The dataset of the present study will be made available at the open repository for psychology data "PsychData" (https://www.psychdata.de/index.php?main=none&sub=none&lang=eng).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Danek and Wiley. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Cognitive Architecture with Evolutionary Dynamics Solves Insight Problem

Anna Fedor<sup>1,2,3</sup>\*, István Zachar<sup>3,4</sup>, András Szilágyi<sup>2,3</sup>, Michael Öllinger<sup>1</sup>, Harold P. de Vladar<sup>3,5</sup> and Eörs Szathmáry<sup>3,4</sup>

<sup>1</sup> Parmenides Center for the Study of Thinking, Parmenides Foundation, Pullach am Isartal, Germany, <sup>2</sup> MTA-ELTE Theoretical Biology and Evolutionary Ecology Research Group, Budapest, Hungary, <sup>3</sup> Institute of Advanced Studies Kőszeg (iASK), Kőszeg, Hungary, <sup>4</sup> Department of Plant Systematics, Ecology and Theoretical Biology, Eötvös Loránd University (ELTE), Budapest, Hungary, <sup>5</sup> Center for the Conceptual Foundations of Science, Parmenides Foundation, Pullach am Isartal, Germany

In this paper, we show that a neurally implemented cognitive architecture with evolutionary dynamics can solve the four-tree problem. Our model, called Darwinian Neurodynamics, assumes that the unconscious mechanism of problem solving during insight tasks is a Darwinian process. It is based on the evolution of patterns that represent candidate solutions to a problem, and are stored and reproduced by a population of attractor networks. In our first experiment, we used human data as a benchmark and showed that the model behaves comparably to humans: it shows an improvement in performance if it is pretrained and primed appropriately, just like the human participants in the experiment of Kershaw et al. (2013). In the second experiment, we further investigated the effects of pretraining and priming in a two-by-two design and found a beginner's luck type of effect: solution rate was highest in the condition that was primed, but not pretrained, with patterns relevant for the task. In the third experiment, we showed that deficits in computational capacity and learning abilities decreased the performance of the model, as expected. We conclude that Darwinian Neurodynamics is a promising model of human problem solving that deserves further investigation.

Keywords: insight, Darwinian Neurodynamics, attractor networks, four-tree problem, evolutionary search

# INTRODUCTION

# Darwinian Neurodynamics

The Bayesian brain is an increasingly popular idea in cognitive science. According to this theory, the mind assigns probabilities to hypotheses and updates them based on observations. Bayesian cognitive models have been used successfully in many different areas of cognition, such as learning, memory, reasoning and decision making. However, the "Bayesian brain falls short in explaining how the brain creates new knowledge" (Friston and Buzsáki, 2016): it does not account for the generation of new hypotheses; it only accounts for the selection of already existing variant hypotheses.

It has been pointed out that Bayesian update effectively implements a process analogous to selection (Harper, 2009), where the prior distribution is equivalent to an existing set of hypotheses, the likelihood function acts as the selection landscape, and the posterior distribution is the output population of hypotheses after a round of selection. If selection acts on units that can replicate and inherit their traits with variability we get full-blown evolution (Maynard Smith, 1986). We believe that the Bayesian paradigm for modeling cognition, especially problem solving, could be successfully complemented with replication and inheritance to explain where new hypotheses come from.

#### Edited by:

Dietmar Heinke, University of Birmingham, UK

#### Reviewed by:

Diarmuid Patrick O'Donoghue, Maynooth University, Ireland

Davide Marchiori, University of Southern Denmark Odense, Denmark

#### \*Correspondence:

Anna Fedor fedoranna@gmail.com

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 04 July 2016 Accepted: 07 March 2017 Published: 29 March 2017

#### Citation:

Fedor A, Zachar I, Szilágyi A, Öllinger M, de Vladar HP and Szathmáry E (2017) Cognitive Architecture with Evolutionary Dynamics Solves Insight Problem. Front. Psychol. 8:427. doi: 10.3389/fpsyg.2017.00427

Problem solving can be conceptualized as search for the solution in a search space (sometimes also called the hypothesis space, state space, or problem space). The search space is the space of all hypotheses that are possible within the dimensions that the problem solver considers. Cognitive search mechanisms must be very effective in exploring the search space and must account for the generation of new hypotheses. Evolutionary search (Maynard Smith, 1986) fulfills those requirements, as it implements parallel, distributed search with a population of competing evolutionary units and it also explains the generation of these units that depends on fitness. Evolutionary search as a model for creative cognitive processes is not a new idea (see e.g., Campbell, 1960; Simonton, 1999, 2011; Fernando et al., 2012). Some of us have previously proposed (Fernando and Szathmáry, 2009, 2010; Fernando et al., 2010) the framework of Darwinian Neurodynamics (previously called the Neural Replicator Hypothesis) as a cognitive model for problem solving in the brain. In this framework, hypotheses or candidate solutions to a problem play the role of evolutionary units: they are selected based on their fitness just like in Bayesian update, but they also multiply with heredity and variation, thus the model implements a full evolutionary search and explains the generation of new hypotheses.
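The correspondence between Bayesian updating and selection noted above (Harper, 2009) can be made concrete with a small numerical sketch. The hypothesis labels and probabilities below are invented for illustration, and the function names are hypothetical; the point is only that one Bayesian update and one step of discrete replicator dynamics are the same computation when the likelihood is read as fitness.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior times likelihood, renormalized."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: w / evidence for h, w in unnormalized.items()}


def replicator_step(frequencies, fitness):
    """Discrete replicator dynamics: x_i' = x_i * f_i / mean fitness."""
    mean_fitness = sum(frequencies[h] * fitness[h] for h in frequencies)
    return {h: frequencies[h] * fitness[h] / mean_fitness
            for h in frequencies}


# Illustrative values only: three existing hypotheses A, B and C.
prior = {"A": 0.5, "B": 0.3, "C": 0.2}
likelihood = {"A": 0.1, "B": 0.4, "C": 0.8}

posterior = bayes_update(prior, likelihood)
selected = replicator_step(prior, likelihood)
```

Both calls return the same distribution, and neither can assign weight to a hypothesis that was absent from the prior. This is the limitation the text points to: selection alone reweights existing variants, whereas replication with variation is needed to introduce new ones.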

In de Vladar et al. (2016) and Szilágyi et al. (2016), we describe an instance of a neural implementation for a cognitive architecture and show how the synergy between selection and learning can solve pattern-matching problems. Here, we take these ideas a step further to demonstrate the problem solving capabilities of Darwinian Neurodynamics in a task that is more relevant to understanding cognition. For this purpose, we apply the Darwinian Neurodynamics framework to a classic insight task, namely, the four-tree problem.

# The Four-Tree Problem

Insight problems are used by cognitive scientists to study insight problem solving behavior. While most agree that insight tasks can be solved analytically, these tasks usually trigger a different route of problem solving that can be characterized by typical problem solving stages, including impasse and insight (Chronicle, 2004). After an initial phase of search, when problem solving is mostly conscious and analytical, most problem solvers enter a phase of impasse when they feel that they are not getting closer to the solution (Ohlsson, 1992; Öllinger et al., 2014). Search and impasse can alternate several times (Fedor et al., 2015). While most researchers agree on the behavioral correlates of impasse (repeating previous solution attempts or becoming inactive, Ohlsson, 1992), what happens at the cognitive level remains unknown. Yet, it can be assumed that the search goes on unconsciously, because some problem solvers emerge from the impasse phase with an insight, when they figure out how to proceed.

We chose an insight task to test our cognitive architecture, because such tasks usually have vast search spaces and their solutions are new and unusual in some sense. This is a case where evolutionary search can be very effective, because it implements parallel, distributed search and explains the generation of new hypotheses. We do not think that evolutionary search can account for all aspects of cognition, but it could have huge benefits in certain problems, where the search space is large and/or where the solution is new.

The four-tree problem is posed for participants in the following way: A landscape gardener is given instructions to plant four special trees so that each one is exactly the same distance from each of the others. How is he able to do it? (de Bono, 1967). The solution is that he plants the trees on the apices of a regular tetrahedron, so that one of the trees is on top of a hill (or at the bottom of a valley), and the other three trees are at ground level in a shape of a triangle (any other rotation of a tetrahedron would do, but this is the easiest solution in terms of the amount of landscaping that must be done).

The four-tree problem belongs to the class of 2D constraint problems (Katona, 1940; Ormerod et al., 2002), in which problem solvers implicitly impose on themselves the constraint that the problem should be solved in two-dimensional space, although the solution is three-dimensional. Most insight tasks are misleading in some way and most problem solvers unnecessarily constrain the initial search space. Restructuring (Ohlsson, 1992) happens when the problem solver, either consciously or unconsciously, lifts the constraint and starts searching in a new, unrestricted (or less restricted) search space. While these dynamics might not be true for all insight tasks (Metcalfe and Wiebe, 1987; Kershaw and Ohlsson, 2004), many other insight problems (e.g., nine-dot problem, five-square problem, ten-penny problem) can be described in this way.

We propose that the difference between conscious search and search during impasse can be modeled as search based on previous experiences vs. search during which entirely new hypotheses are generated that broaden the effective search space, respectively. We speculate that the futility of trying to solve the problem and the frustration it causes make problem solvers stop conscious search. This might lead to a different kind of search, which is mainly unconscious (and which might have been going on in parallel all along), and which might lead to restructuring. In the case of the four-tree problem, the behavioral correlate of restructuring is the appearance of the first three-dimensional solution attempt.

Kershaw et al. (2013) recently conducted a study of the four-tree problem. Their pilot work revealed that the main sources of difficulty in the four-tree problem were participants' geometric misconceptions (e.g., "believing that the diagonal of a square is the same length as the sides") as well as their "perceptual bias of constructing a two-dimensional problem space". In their experiments, Kershaw et al. attempted to relax the knowledge constraint with direct instructions and the perceptual constraint with analogy training. Direct instructions included teaching participants about the properties of squares, equilateral triangles and tetrahedrons. During analogy training participants had to solve three problems that were isomorphic to the four-tree problem, i.e., four objects had to be placed equidistant from each other in a shape of a tetrahedron. They conducted two experiments, which differed only in the analogy training: in Experiment 1 analogy training only posed the problems, but participants did not get feedback from the experimenter; in Experiment 2, the first two problems were presented together with their solutions, and participants were encouraged to compare these examples, then participants got feedback on their solution attempts to the third problem. Additionally, after receiving instructions for the four-tree problem half of the participants received picture clues, including pictures of trees on mountaintops, in an attempt to prime participants to think about three-dimensional landscapes and prevent unhelpful prior knowledge, activated by the task, from restricting the problem representation to two dimensions. They compared the solution rates of groups of participants who either received direct instructions, analogy training, both (combined groups) or none (control group). Experiment 1 revealed that the direct instruction and the combined groups performed better than the analogy and the control group. In Experiment 2 they found, among other things, that participants with the analogy training and the combined training were more likely to solve the task than the control group and that within the combined group, participants who received picture clues were more likely to solve the task than participants who did not receive picture clues.

Kershaw et al. argue that the bias to represent the problem in two dimensions arises from prior experiences of problem solvers. We think that giving participants pen and paper to solve the problem is also a factor, in fact, it can be thought of as a misleading element in the task. We think that presenting the problem in a less misleading manner, for example asking participants to plant small model trees in a sandbox, would increase the frequency of three-dimensional solution attempts. While Kershaw et al. did not manipulate the misleading component in the task, their priming through picture clues might have influenced how much the same misleading component (i.e., giving them paper and pencil) actually misled participants.

# Motivation for the Present Study and Predictions

## Experiment 1

The aim of our first experiment was to benchmark the behavior of our cognitive architecture with evolutionary dynamics based on human data. Our second and third experiments provide new predictions about human behavior that are yet to be tested.

Kershaw et al. (2013)'s direct instruction training addressed gaps in prior knowledge, while their analogy training increased participants' experience with problems involving tetrahedrons. Since both training types occurred right before participants were given the four-tree problem, in our view, both served to prime participants to think about three-dimensional shapes, and particularly tetrahedrons. The picture clues can be thought of as additional and pure priming that affects the two-dimensional bias (without training), but they were only given to half of their combined training group in Experiment 2. To sum up, all their experimental groups received training with tetrahedrons and priming with tetrahedrons to some degree, while their control group received neither training, nor priming.

In our simulation experiments, we could not differentiate between the different types of trainings (direct instructions vs. analogy training), because these require higher order cognitive functions that we do not model here. Instead, we aimed at explaining the mechanistic effect of training and priming on problem solving. In our Experiment 1, we tried to reproduce the difference between the control group (1 out of 31 participants, 3% solved the problem in the given 4 min) and the combined training group with picture clues (16 participants out of 28, 57% solved the task; Kershaw, 2016, Personal communication, 28 June) in Kershaw et al.'s (2013) experiment, to provide a benchmark for our cognitive architecture (de Vladar et al., 2016; Szilágyi et al., 2016). We ran 30 simulations in both conditions and compared the problem solving behavior and performance of the models.

## Experiment 2

In Experiment 2, we were interested in teasing apart the effects of prior experience and priming on problem solving. In a 2 × 2 design, we investigated the effects of two-dimensional vs. three-dimensional training and two-dimensional vs. three-dimensional priming. Accordingly, in the first condition, the models received two-dimensional training and two-dimensional priming; in the second condition, the models received two-dimensional training and three-dimensional priming; in the third condition, the models received three-dimensional training and two-dimensional priming; and in the fourth condition, the models received three-dimensional training and three-dimensional priming (we explain how these manipulations were implemented for the model in the Methods section). We ran 30 simulations in each of the four conditions. We predicted that the group that received two-dimensional training and priming would perform worst and that the group that received three-dimensional training and priming would perform best.

## Experiment 3

In Experiment 3, we wanted to compare the problem solving abilities of different populations of models. Specifically, we wanted to model how different cognitive abilities might influence problem solving behavior. Chein et al. (2010) showed that a large spatial working memory capacity is beneficial for solving the nine-dot problem, another multi-step insight problem. Ash and Wiley (2006) also found that individual differences in working memory had an effect on insight problem solving. Apart from differences in working memory, we do not know of other cognitive abilities that have been investigated in connection with insight problem solving, but we assume that learning speed and synaptic efficiency could also have an effect. To investigate this question, we ran simulations with different parameter settings, one group being the control group, and three other groups representing different cognitive "deficits," i.e., parameter settings that we think would negatively influence problem solving. These deficits were lower working memory, slower learning and less effective synapses between layers of neurons. We predicted that the deficit groups would perform worse than the control group.

# METHODS

# The Cognitive Architecture for Darwinian Neurodynamics

#### Architecture of the Model

Our model is also described in de Vladar et al. (2016) and Szilágyi et al. (2016). The MATLAB code of the model, the parameters and scripts for running and analyzing the experiments can be downloaded from osf.io/vjfv9.

The core component of our model (**Figure 1**) is a population of attractor networks. Attractor networks are recurrent autoassociative artificial neural networks with only one layer of units (artificial neurons). Attractor networks are fully connected, i.e., each unit is connected to all the other units within the same network (but self-connections are missing) with weighted connections (weights are real values). In these simulations, the population consisted of 100 attractor networks and each attractor network consisted of 300 units (N = 300).

Attractor networks can be provoked or trained with input patterns. Input patterns are binary vectors of the same length as the number of neurons in the network. When the network is trained with input pattern ξ at timestep m, the weight of the connection between unit i and j is calculated according to the learning rule (Storkey, 1998, 1999):

$$w_{ij}^{m} = \begin{cases} w_{ij}^{m-1} + \frac{1}{N} \xi_{i}^{m} \xi_{j}^{m} - \frac{1}{N} \xi_{i}^{m} g_{j}^{m} - \frac{1}{N} g_{i}^{m} \xi_{j}^{m} & \text{if } i \neq j, \\ 0 & \text{if } i = j, \end{cases}$$

with g<sub>i</sub><sup>m</sup> being:

$$g_{i}^{m} = \sum_{k=1}^{N} w_{ik}^{m-1} \xi_{k}^{m}.$$

We used a forgetting rate of f = 0.1, which means that the weights were multiplied by (1 − f) before each learning event to prevent the saturation of weights. The result of training is that the network learns (stores) the training (input) pattern. It means that when the network is later provoked (see later) with noisy versions of the training pattern, it outputs the original pattern or a pattern very similar to it (pattern completion). The learning rule we used is a modified Hebbian rule, which enables palimpsest memory (Storkey, 1998, 1999; Storkey and Valabregue, 1999), meaning that the networks can be retrained sequentially with different patterns, without inducing catastrophic forgetting. When the networks reach their memory capacity, they forget earlier patterns, but they are still able to learn new ones.
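As a rough illustration of the learning rule and the forgetting step described above, the following Python sketch applies one learning event to a small weight matrix. It is not the MATLAB implementation of the model; the function name `storkey_train`, the nested-list representation, and the choice to compute g from the already decayed weights are assumptions made for this example.

```python
def storkey_train(weights, pattern, forgetting_rate=0.1):
    """Return new weights after one learning event with a +1/-1 pattern."""
    n = len(pattern)
    # Forgetting: multiply the old weights by (1 - f) before learning.
    old = [[(1.0 - forgetting_rate) * w for w in row] for row in weights]
    # g_i = sum_k w_ik^(m-1) * xi_k (assumption: taken from the decayed weights).
    g = [sum(old[i][k] * pattern[k] for k in range(n)) for i in range(n)]
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # self-connections stay at zero
                new[i][j] = old[i][j] + (
                    pattern[i] * pattern[j]
                    - pattern[i] * g[j]
                    - g[i] * pattern[j]
                ) / n
    return new


# Usage: store one pattern in an empty 6-unit network.
pattern = [1, -1, 1, 1, -1, -1]
weights = [[0.0] * 6 for _ in range(6)]
weights = storkey_train(weights, pattern)
```

Starting from zero weights, g vanishes and the rule reduces to the plain Hebbian term; the two correction terms only matter once earlier patterns have been stored.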

When an attractor network is provoked by a pattern, the pattern is clamped on the neurons and then the state of the neurons is recalculated according to the update rule. First, the local field h<sub>i</sub> of neuron i is calculated as the weighted sum of recurrent signals from other neurons:

$$h_{i} = \sum_{\substack{j=1 \\ j \neq i}}^{N} w_{ij} x_{j}(t),$$

where N is the number of neurons in the network, x<sub>j</sub>(t) is the state of neuron j (active or inactive) in update step t and w<sub>ij</sub> is the weight of the connection between neuron i and neuron j. Then, the state of neuron i is calculated as x<sub>i</sub>(t + 1) = sgn(h<sub>i</sub>). The neuron is said to be active, if its state is +1, and inactive otherwise. The state of neurons is updated asynchronously in random order (i.e., N neurons are chosen randomly with replacement to be updated). After N updates, the collective state of neurons is called the activation pattern of the network, which is a binary vector of length N.

The output of the neurons is then fed back as input for the next update step and the neurons are updated again. Recurrent update cycles go on until the output converges to a stable pattern or until the limit is reached (33 cycles in these experiments). The final activation pattern of the network is called the output pattern.
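The provoking procedure can be sketched in Python as follows. This is an illustrative sketch, not the code used in the simulations: the names `recall` and `is_stable` are hypothetical, mapping a zero local field to +1 is an assumption, and "converges to a stable pattern" is interpreted here as every unit already agreeing with the sign of its local field.

```python
import random


def sgn(x):
    # Assumption: a local field of exactly zero maps to +1.
    return 1 if x >= 0 else -1


def local_field(weights, state, i):
    # Weighted sum of the signals from all other units (no self-connection).
    return sum(weights[i][j] * state[j] for j in range(len(state)) if j != i)


def is_stable(weights, state):
    # Stable means that no single update would change any unit.
    return all(sgn(local_field(weights, state, i)) == state[i]
               for i in range(len(state)))


def recall(weights, pattern, max_cycles=33, rng=random):
    """Provoke a network with `pattern` and return its output pattern."""
    n = len(pattern)
    state = list(pattern)  # clamp the provoking pattern on the units
    for _ in range(max_cycles):
        for _ in range(n):  # N updates, units drawn with replacement
            i = rng.randrange(n)
            state[i] = sgn(local_field(weights, state, i))
        if is_stable(weights, state):
            break
    return state
```

Because units are drawn with replacement, some units may not be updated within a given cycle, which is why the sketch checks stability over all units rather than comparing consecutive activation patterns.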

All networks in the model produce output patterns simultaneously. These patterns constitute one generation of output patterns. The fitness of each output pattern is then calculated by a fitness function (see later), where fitness is a real value between 0 and 1. The best patterns (patterns with the highest fitness; three patterns in these simulations) are selected and then fed back to the networks as input patterns; the rest of the patterns are deleted. Some random noise is added to the patterns during this step to simulate imperfect copying. We implemented this by randomly flipping (changing −1 to +1, and vice versa) each bit in the patterns with a probability of m (mutation rate).
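The selection and copying step can be sketched as below. The sketch assumes that each network receives a mutated copy of one randomly chosen selected pattern; the text only states that the selected patterns are fed back in random order, so this distribution scheme, like the function names, is an assumption made for illustration.

```python
import random


def select_best(patterns, fitness, k=3):
    """Return the k patterns with the highest fitness (k = 3 in the text)."""
    return sorted(patterns, key=fitness, reverse=True)[:k]


def mutate(pattern, mutation_rate, rng=random):
    """Flip each +1/-1 bit independently with probability `mutation_rate`."""
    return [-bit if rng.random() < mutation_rate else bit for bit in pattern]


def next_inputs(patterns, fitness, n_networks, mutation_rate, k=3, rng=random):
    """Build the input patterns for the next generation."""
    best = select_best(patterns, fitness, k)
    # Assumption: every network gets a noisy copy of a random selected pattern.
    return [mutate(rng.choice(best), mutation_rate, rng)
            for _ in range(n_networks)]
```

Keeping mutation separate from selection makes it easy to apply the same noisy copying to patterns used for provoking and for retraining.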

Initializing a simulation means that we randomly generate training and provoking patterns for each network. First, each network is pre-trained with a different set of random patterns, i.e., each network has different weights at the beginning. Then, each network is provoked with a different random pattern and the first generation of the simulation begins. The networks produce output patterns, and the best patterns are selected based on the fitness function. The selected patterns are randomly ordered and fed back to the networks. The networks are either trained with these new patterns, or not (see later). If training happens, it is called retraining, to differentiate it from the initial pre-training. The selected patterns are randomly ordered again to provoke the networks and the second generation of the simulation begins. The simulation goes on until one of the selected patterns reaches fitness = 1 or until time out.

#### Evolution and Selection Modes

As we mentioned above, input patterns are either used to retrain some of the networks or not. We call these two different working modes of the model evolution mode and selection mode, respectively. In evolution mode, networks can be retrained with the selected patterns with a probability of r (retraining probability). The term "evolution mode" makes sense, if we consider that in this mode the whole system effectively implements evolutionary search for the pattern with the highest fitness.

Evolutionary units have three essential traits: multiplication, inheritance and variability (Maynard Smith, 1986). In a population of evolutionary units, if these units are multiplied with variation and if their hereditary traits influence their fitness, evolution takes place. In our model the evolutionary units are the patterns. In each step of the simulation, a new generation of output patterns is produced by the attractor networks. Output patterns are similar to the input provoking patterns if a similar pattern is stored in the network. This step implements inheritance with variability. A few patterns are selected based on their fitness and these are copied with errors (mutations) back to the attractor networks as inputs. These patterns multiply when they are used as retraining inputs. They get stored in more networks, which in turn will be able to reproduce these patterns if they are provoked with a correlated pattern.

There are several sources of variation of patterns in this architecture. The first one is a result of the stochastic asynchronous update of the attractor networks. This means that an attractor network usually produces slightly different output patterns when repeatedly provoked by the same input. Second, each attractor network in the population has a unique training history, thus they produce different outputs when provoked with the same input. Third, copy connections are error-prone, i.e., when the selected patterns are copied back to the networks to provoke and to retrain them, they go through mutations. Finally, networks sometimes produce so-called spurious patterns, which are different from any of the previously trained patterns or even the input pattern. This usually happens when the input pattern is quite far from the training patterns, thus none of the stored patterns can be retrieved.

In evolution mode, the model performs evolutionary search, and it can be thought of as an evolutionary algorithm in the sense that it is "based on the model of natural evolution as an optimization process" (Bäck et al., 1993). The attractor networks take care of multiplication with inheritance and variation. The selected patterns are copied back with errors to the networks as inputs through neural afferents. These are the components that are neurally implemented, while the fitness function and selection mechanism are symbolic. One novelty of the model is that in fact, it is possible to semi-neurally implement evolutionary search through a population of attractor networks. Inheritance is different from that of other evolutionary algorithms because selected patterns are not directly replicated but instead trained to networks which can in turn reproduce them. Our experiments show that this kind of indirect replication results in evolutionary dynamics similar to that of asexual populations of evolutionary units (there is no cross-over).

In selection mode, the networks are not retrained and thus their output is solely dependent on their pretraining and on the input (provoking) pattern. We call this selection mode, because it is based purely on selection over the standing variation: the best patterns are selected but they do not reproduce, they do not spread to new networks. Because of this, the model can only search in the space of already available patterns and their close neighbors (there is still mutation during copying of provoking patterns).

In selection mode, our model is similar to Bayesian cognitive models of learning and problem solving, where output patterns play the role of hypotheses (Griffiths et al., 2010; Tenenbaum et al., 2011), because Bayesian update is analogous to selection (Harper, 2009) as we described in the Introduction.

### Problem Solving as Evolutionary Search

We think of this model as a cognitive process model for problem solving which is also neurally plausible to some extent. Patterns represent hypotheses, or candidate solutions to a problem that a problem solver might entertain during problem solving. Patterns are either stored in the long-term memory represented by the weight matrices of attractor networks or in the working memory that consists of the maintained activation of the networks. We call the pattern with the highest fitness in each generation a candidate solution. We suggest that most hypotheses are unconscious and only a small sample emerges into consciousness. Solution attempts are candidate solutions that the problem solver acts out, i.e., draws on the given paper or describes verbally. They allow us a very limited peek into the thought processes of participants in insight experiments. We propose that human participants sample their solution attempts from the candidate solutions, and only a small subset of the candidate solutions become conscious, especially during impasse. Human participants probably generate new hypotheses at different rates, but for the sake of simplicity, we equate generations of patterns in the model with time steps.

We conceptualize priming as an effect on the initial assumptions of the problem solver. These initial assumptions are modeled by the first set of patterns by which the attractor networks are provoked before the first generation of output patterns emerges. By manipulating how these initial provoking patterns are generated, we can model different priming conditions.

Pre-training patterns are analogous to prior experiences of problem solvers and possible solutions to problems that are stored in long-term memory. Selection and evolution modes model two different thinking modes in humans: selection mode is when the problem solver searches for the solution in long-term memory and evolution mode is when the problem solver generates new hypotheses.

When solving insight tasks, humans first try to solve the problem based on their previous experiences (selection mode). Insight tasks are constructed in a way that previous experiences combined with some misleading elements in the task drive problem solvers to unnecessarily restrict the search space. For example, when the four-tree problem is presented on a piece of paper, it misleads participants to think that the solution must be two-dimensional. This coincides with the fact that most people have more experience in two-dimensional paper-and-pencil type tasks than in three-dimensional tasks. To find the solution, problem solvers need to switch to a different thinking mode, where they consider new hypotheses (evolution mode). This might lead to extending their search space to three dimensions through representational change (restructuring).

To model this process, we start simulations in selection mode and then switch to evolution mode with a certain probability. Before the switch between modes, the model only searches based on its previous experiences, whereas after the switch, new candidate solutions can evolve. Without the switch, finding a solution is only possible if long-term memory already contained the solution. We implement switching in a probabilistic way so that it can occur any time during problem solving with a certain probability. The probability of switching is calculated in each generation of patterns by the following equation:

$$s = \frac{1}{r^{c}} \left(1 - a^{b g}\right),$$

where r is the number of repeated candidate solutions so far, g is the number of generations so far, and a, b, and c are constants, which were set to 0.7, 0.03, and 1.0, respectively. We suggest that these parameters can be adjusted when the architecture is used to model different tasks. Switching happens only once during a simulation, which is a simplification. We plan to implement back-and-forth probabilistic switching in our future work.

As indicated, the first term of the equation (1/r<sup>c</sup>) depends on the number of repeated candidate solutions. The probability of switching decreases as the number of repetitions increases, and selection mode also increases the probability of producing a repeated candidate solution. Repeated candidate solutions are patterns that represent solutions to the problem that have already occurred in a previous generation. It has been shown (Kershaw et al., 2013; Fedor et al., 2015) that in human problem solvers the number of repeated solution attempts is inversely proportional to the probability of solving the task. In fact, repetitions are one of the two behavioral correlates of impasse. One possibility is that repetitions cause impasse as a self-induced mental set (Luchins, 1942; Lovett and Anderson, 1996; Öllinger et al., 2008). A second possibility is that repetitions are a direct consequence of either a saturated working memory (the problem solver forgets that he has already tried a solution attempt) or an inability to generate new hypotheses, which makes it less probable that a solution is found. The first term of the switching probability equation implements a causative relationship between repetitions and the inability of getting out of impasse. However, the other factor, namely a poor working memory, is also present indirectly (see Experiment 3).

The second part of the equation (1 − a<sup>bg</sup>) increases with the number of generations, i.e., with the time spent trying to solve the task. We assume that as time passes, problem solvers become more likely to realize that their initial search space is insufficient and that they need to look for a solution in a different search space. **Figure 2** shows the probability of switching through the generations in one of our simulations. It can be seen that if the model fails to switch in the first few tens of generations, switching becomes quite improbable.
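To make the two factors of the switching probability concrete, the following sketch evaluates the equation with the constants given above (a = 0.7, b = 0.03, c = 1.0). The function name is hypothetical, and flooring r at 1 is an assumption added here so that the first term is defined before any repetition has occurred; the text does not say how r = 0 is handled.

```python
def switching_probability(repeats, generation, a=0.7, b=0.03, c=1.0):
    """Probability of switching from selection mode to evolution mode."""
    # Assumption: treat r = 0 as r = 1 to avoid division by zero.
    r = max(repeats, 1)
    repetition_term = 1.0 / r ** c            # shrinks as repetitions accumulate
    time_term = 1.0 - a ** (b * generation)   # grows from 0 toward 1 over time
    return repetition_term * time_term


# Usage: the time term pushes the probability up, repetitions pull it down.
early = switching_probability(repeats=1, generation=10)
late_few_repeats = switching_probability(repeats=1, generation=100)
late_many_repeats = switching_probability(repeats=20, generation=100)
```

With a single repetition the probability after 100 generations is 1 − 0.7³ ≈ 0.66, whereas 20 repetitions at the same point divide it by 20, which is consistent with switching becoming improbable once repetitions have piled up.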

# Implementing the Four-Tree Problem

#### Adaptation of the Task for the Model

In the original four-tree problem the task of the landscaper is to plant all four trees. Here, we modified this task so that only one of the trees must be placed; the rest of the trees are already planted in a shape of a triangle on a flat surface (**Figure 3**). While there have not been human experiments with this modification, we can safely assume that the main problem difficulty (the two-dimensional bias) remains the same. We represented the trees in a three-dimensional coordinate system, where each axis ranged from 0 to 100. The distance between each pair of trees was 80 units. The coordinates of the four trees were rounded to the nearest integer: (15, 10, 0), (15, 90, 0), (84, 50, 0), and (38, 50, 65). The last set of coordinates represents the fourth tree that the model has to place in order to solve the task.
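The rounded coordinates can be checked with a few lines of Python. The sketch below computes all six pairwise distances and the height of a regular tetrahedron with edge length 80; variable names are chosen for illustration only.

```python
import math
from itertools import combinations

# Coordinates of the four trees as given in the text.
trees = [(15, 10, 0), (15, 90, 0), (84, 50, 0), (38, 50, 65)]

# All six pairwise Euclidean distances between the trees.
distances = [math.dist(p, q) for p, q in combinations(trees, 2)]

# Height of a regular tetrahedron with edge length 80: edge * sqrt(2/3).
apex_height = 80 * math.sqrt(2 / 3)

# Centroid of the three planted trees, above which the apex sits.
centroid = tuple(sum(axis) / 3 for axis in zip(*trees[:3]))
```

Because the coordinates are rounded to integers, the pairwise distances are slightly below 80 rather than exactly 80, but each of them rounds to 80, and the apex height of about 65.3 rounds to the z coordinate of the fourth tree.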

### Representation of the Task

An important aspect of modeling problem solving behavior is how to translate the human-readable puzzle to a problem defined within the model and how to translate the outputs of the model to candidate solutions. The output patterns of attractor networks are necessarily binary patterns so we need a representation where these patterns (300-bit-long binary vectors) can be unambiguously converted to a point in space where the fourth tree is placed.

This conversion should take into consideration the properties of attractor networks. For example, attractor networks have probabilistic outputs, i.e., they can have slightly different outputs when provoked with the same input. Because of this, slight differences in the output should not translate to major differences in the candidate solution: outputs that only differ in a few bits should represent points in space that are close to each other. A cumulative conversion, where the number of active neurons (or the sum of the vector) is proportional to some kind of effort or movement that the efferent of the system (that we do not model here) exhibits in order to place the tree, seems to be a natural way of representing this problem.

The conversion that we used is very simple: the output patterns of networks represented the x, y, and z coordinates of the fourth tree. Each coordinate was coded by its own group of units within the 300-unit pattern (100 units per coordinate, matching the 0–100 range of each axis), and the value of a coordinate was the number of active units in the corresponding group.

If we put the fourth tree in the (0,0,0) position before each solution attempt, the output pattern can be interpreted as an instruction about moving the tree to its final position. The number of active neurons equals the number of units of movement in the three dimensions. While this representation is probably not how a location in three-dimensional space is represented in the brain, the details of the model are not essential to the evolutionary argument.
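A cumulative conversion of this kind can be sketched as follows. The sketch assumes that the 300 units are split into three consecutive blocks of 100 units coding x, y, and z in that order; the block size follows from the 0–100 range of each axis, but the ordering of the blocks and the helper names `decode` and `encode` are assumptions made for this example.

```python
UNITS_PER_AXIS = 100  # assumption: 300 units split evenly over x, y, z


def decode(pattern):
    """Convert a +1/-1 pattern to (x, y, z) by counting active units per block."""
    return tuple(
        sum(1 for state in pattern[k * UNITS_PER_AXIS:(k + 1) * UNITS_PER_AXIS]
            if state == 1)
        for k in range(3)
    )


def encode(point):
    """Build one of the many patterns that decode to `point`."""
    pattern = []
    for value in point:
        # The first `value` units of the block are active, the rest inactive.
        pattern.extend([1] * value + [-1] * (UNITS_PER_AXIS - value))
    return pattern


# Usage: flipping a single unit moves the decoded point by one unit of distance.
solution = encode((38, 50, 65))
neighbor = list(solution)
neighbor[0] = -1
```

Counting active units makes the mapping many-to-one and smooth: patterns that differ in a few bits decode to nearby points, which is the property the conversion is meant to guarantee.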

#### Fitness Function

The fitness of patterns was based on the hypothetical instructions ("Plant the fourth tree so that it is the same distance from all other trees as they are from each other"), i.e., on how close the distances between the fourth tree and the other trees are to the target distance:

$$\text{Fitness} = 1 - \frac{\sum_{i=1}^{3} \left| \operatorname{round}\left(\operatorname{distance}(\text{tree}_{i}, \text{tree}_{4})\right) - \text{target} \right|}{3 \cdot \text{target}},$$

where tree<sub>i</sub> (i = 1, 2, 3) are the already planted trees, tree<sub>4</sub> is the tree whose coordinates the model has to find, and target is the target distance between trees (target = 80 in these simulations).
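The fitness function translates directly into a few lines of Python. The sketch below uses the tree coordinates and target distance given in the text; the constant and function names are chosen for illustration.

```python
import math

PLANTED = [(15, 10, 0), (15, 90, 0), (84, 50, 0)]  # the three planted trees
TARGET = 80                                        # target distance


def fitness(tree4, planted=PLANTED, target=TARGET):
    """Fitness of a candidate position for the fourth tree (1.0 = solved)."""
    # Sum of absolute deviations of the rounded distances from the target.
    deviation = sum(abs(round(math.dist(tree, tree4)) - target)
                    for tree in planted)
    return 1.0 - deviation / (3 * target)


# Usage: the tetrahedron apex vs. the best-centered two-dimensional attempt.
apex_fitness = fitness((38, 50, 65))   # all three distances round to 80
flat_fitness = fitness((38, 50, 0))    # the centroid of the planted triangle
```

Rounding the distances is what allows the integer-valued apex (38, 50, 65) to reach a fitness of exactly 1, whereas the centroid of the triangle at ground level reaches only 0.575.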

#### Initializing Simulations

Pre-training patterns and initial provoking patterns were generated probabilistically with three different sparseness values representing the probability that a unit responsible for the x, y, or z coordinate is active. For example, a sparseness of [0.5, 0.5, 0.0] means that within a pattern, each x and y neuron has a state of +1 with a probability of 0.5 and −1 with a probability of 0.5, while all z neurons are inactive. Within each simulation, 90 pre-training patterns and 1 initial pattern were generated for each attractor network. The sparseness of these patterns differed across conditions.
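Pattern generation from sparseness values can be sketched as follows, again assuming three consecutive blocks of 100 units for the x, y, and z coordinates. The function name `random_pattern` and the block layout are assumptions for this illustration.

```python
import random


def random_pattern(sparseness, units_per_axis=100, rng=random):
    """Draw a +1/-1 pattern; `sparseness` holds P(active) for x, y, z units."""
    pattern = []
    for p_active in sparseness:
        # Each unit in the block is active (+1) with probability p_active.
        pattern.extend(1 if rng.random() < p_active else -1
                       for _ in range(units_per_axis))
    return pattern


# Usage: 90 pre-training patterns and 1 initial provoking pattern for one
# network, both with the two-dimensional sparseness used as an example above.
rng = random.Random(42)
pre_training = [random_pattern([0.5, 0.5, 0.0], rng=rng) for _ in range(90)]
initial = random_pattern([0.5, 0.5, 0.0], rng=rng)
```

With a z sparseness of 0.0, every generated pattern lies in the plane of the three planted trees, so any three-dimensional candidate has to arise later through mutation or spurious patterns.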

# Simulation Experiments

## Experiment 1

In this experiment, we simulated the positive effects of training and priming on solution rates. We wanted to reproduce the results of Kershaw et al. (2013), more specifically, the difference between their control group and their combined group with picture clues. We ran two groups of simulations, where each simulation can be thought of as one individual in the experiment. The combined condition received pre-training in two dimensions and on tetrahedrons and priming on tetrahedrons; the control condition received pre-training and priming in two dimensions.

The question might arise why we pre-trained the control group at all. In simulations, we have to simulate participants' previous experiences (i.e., their "training" that happened throughout their lives, before they arrived at the experiment) and also the training that they might receive as an experimental manipulation. Human participants who do not receive training during the experiment are left with their previous experiences, which we suppose are predominantly two-dimensional regarding paper-and-pencil type tasks, because most people do not solve three-dimensional tasks very often (this might be one of the reasons for the low solution rates in the four-tree problem). These predominantly two-dimensional experiences are modeled as pre-training with two-dimensional patterns in our simulations. These patterns were generated with a sparseness of [0.5, 0.5, 0.0] (90 pre-training patterns for each network), which was meant to represent general two-dimensional experiences.

The combined group received both two-dimensional pretraining (sparseness = [0.5, 0.5, 0.0] for 45 patterns), and pretraining on patterns representing tetrahedrons (sparseness = [0.38, 0.50, 0.65] for 45 patterns). This pre-training regime modeled that participants in the combined condition had similar two-dimensional experiences as the control group, but they were trained with exercises involving tetrahedrons before they were given the main task.

We conceptualized successful priming as an effect on participants' initial hypotheses about the task. This is a starting point for subsequent hypotheses, as it initializes the thought process. Successful priming with tetrahedrons results in initial hypotheses that are close to tetrahedrons. No priming means that the misleading presentation of the task takes over, and the initial hypotheses are two-dimensional. In this sense, we can think of the control group in Kershaw et al.'s (2013) experiment as a group that received two-dimensional priming in the form of the misleading presentation of the task. To reflect this difference, our control group was "primed" (initialized) with two-dimensional patterns (sparseness = [0.38, 0.5, 0.0]), and the combined group was initialized with patterns representing tetrahedrons (sparseness = [0.38, 0.50, 0.65]). The sparseness of the initializing patterns for the control group was derived from the coordinates of the already planted three trees: the x, y, and z sparseness values were calculated as the averages of the x, y, and z coordinates of the trees. This was meant to model that when there is no deliberate priming, participants draw their initial assumptions from the presentation of the task.

In both conditions, we ran 30 simulations, initialized with the same random seed across conditions, to be able to easily compare our results with the results of Kershaw et al. (2013) who had 31 participants in their control condition and 28 participants who received combined training and picture clues.

# Experiment 2

In this experiment, we investigated the effect of prior experiences and priming in a two-by-two design: **Table 1** shows the resulting four conditions.

Condition 2DD (**2D** pre-training, **D**erived patterns for initializing) was identical to the control condition in Experiment 1 (but initialized with different random seeds): it was pre-trained with two-dimensional patterns (sparseness = [0.5, 0.5, 0]) and initialized with two-dimensional patterns derived from the task (sparseness = [0.38, 0.5, 0], calculated as the averages of the coordinates of the three planted trees).

Condition 2DR (**2D** pre-training, **R**andom patterns for initializing) received the same two-dimensional pre-training patterns (sparseness = [0.5, 0.5, 0]) as condition 2DD, but was initialized with three-dimensional patterns with sparseness = [0.5, 0.5, 0.5]. These patterns model the result of either priming with three-dimensional shapes, or a less misleading presentation of the task (sandbox).

Condition 3DD (**3D** pre-training, **D**erived patterns for initializing) was pre-trained with three-dimensional patterns (sparseness = [0.5, 0.5, 0.5]) and initialized with two-dimensional patterns derived from the task (sparseness = [0.38, 0.5, 0], just like condition 2DD).

Condition 3DR (**3D** pre-training, **R**andom patterns for initializing) was pre-trained with three-dimensional patterns (sparseness = [0.5, 0.5, 0.5]) and initialized with three-dimensional patterns (sparseness = [0.5, 0.5, 0.5]). In some sense, this condition is similar to the combined condition of Experiment 1 as both training and priming were three-dimensional, but both manipulations were weaker (meaning, probably less effective in increasing performance compared to the control condition). Here, three-dimensional pre-training involved general three-dimensional patterns, not tetrahedrons as in Experiment 1, and was not mixed with two-dimensional patterns. Three-dimensional priming was also more general than in Experiment 1, because it did not involve tetrahedrons per se, but general three-dimensional patterns.

#### TABLE 1 | Treatment conditions in Experiment 2.

In each condition, we ran 30 simulations. Simulations were initialized with the same random seed across conditions; thus, conditions can be thought of as repeated manipulations on the same group of participants (but with the effects of previous conditions erased).

# Experiment 3

In this experiment, we modified some parameters of the model in a way that we suspected to cause a deficit in the problem solving abilities of the model. The result of deficits could be lower probability of solving the problem, or slower problem solving. These modifications model the problem solving abilities of different human problem solvers.

To model these differences, we ran simulations in four different groups of models. The control group (CC) had identical parameters to the control condition in Experiment 1, but was initialized with a different random seed. In each of the other three groups one of the default parameters was changed (all the default parameters can be seen at the repository link given at the beginning of the methods section). The MC group (Memory Capacity) had a lower memory capacity: the number of attractor networks was 10 instead of 100. The MR (Mutation Rate) group had 10 times higher mutation rate (0.3 instead of 0.03) on the copying connections than the CC group. The RR group (Retraining Rate) had 10 times lower retraining probability than the control group (0.07 instead of 0.7).
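The four groups can be summarized as overrides on a shared set of defaults. The sketch below is illustrative only: the parameter names are hypothetical (the actual names live in the authors' repository), while the values are the ones stated above.

```python
# Default values as stated in the text; the key names are hypothetical.
DEFAULTS = {
    "n_networks": 100,        # memory capacity (number of attractor networks)
    "mutation_rate": 0.03,    # mutation rate on the copying connections
    "retraining_prob": 0.7,   # retraining probability
}

# Each deficit group changes exactly one default parameter.
GROUP_OVERRIDES = {
    "CC": {},                          # control
    "MC": {"n_networks": 10},          # lower memory capacity
    "MR": {"mutation_rate": 0.3},      # 10 times higher mutation rate
    "RR": {"retraining_prob": 0.07},   # 10 times lower retraining probability
}

def group_parameters(group):
    """Return the full parameter set for one group."""
    return {**DEFAULTS, **GROUP_OVERRIDES[group]}
```

Keeping the overrides separate from the defaults makes it easy to check that each deficit group differs from the control group in one parameter only.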

Similarly to the previous experiments, in each group we ran 30 simulations, initialized with the same random seed.

# Analysis

Each simulation was run for a maximum of 200 generations, i.e., 200 subsequent candidate solutions were selected. This timeframe was chosen because our previous simulations showed that in most simulations, fitness reached a plateau by this point. We do not assert that this timeframe is equivalent to the time limit given to human participants in experiments, for example, 4 min in Kershaw et al.'s experiment (Kershaw et al., 2013). We do not know of any study that measures how human solution rates change with time in a more extended timeframe, but we speculate that 200 generations are equivalent to several hours of thinking time in humans. A simulation was scored as a "solver" if the model found the correct position for the fourth tree within this timeframe. We would like to point out that by setting a time out, we turn a possibly quantitative difference between individuals (the speed of problem solving) into a qualitative difference (solver vs. non-solver). To make our results more comparable to human data, we also calculated solution rates at the time point when the first solver appeared in the control condition, because that is how many people solved the task in the control condition of Kershaw et al. (2013) within 4 min.

We also looked at the time spent with the task, measured as the number of generations that the model went through until it either solved the task or reached time out. In the former case, time spent with the task equals solution time; in the latter case, time spent with the task equals time out (200 generations). Of course, we cannot assume that every person comes up with new candidate solutions at the same rate, but this is a simplification we made, because we did not want to overcomplicate the model at this initial stage by modeling time. Time spent with the task can be broken down into a selection phase and an evolution phase. The selection phase starts with the first generation and lasts until the switch to the evolution phase. The evolution phase starts from the switch and lasts until the model either solves the problem or reaches time out.
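The breakdown of time spent with the task can be written down as a small helper. This is a sketch under stated assumptions: the function and argument names are hypothetical, and the exact counting convention at the switch (whether the switching generation belongs to the selection or the evolution phase) is not specified in the text.

```python
from typing import Optional, Tuple

MAX_GENERATIONS = 200  # time out used in all experiments

def time_on_task(switch_gen: Optional[int],
                 solution_gen: Optional[int]) -> Tuple[int, int, int]:
    """Return (total, selection_length, evolution_length) in generations.

    switch_gen:   generation at which evolution mode was switched on,
                  or None if the simulation never switched.
    solution_gen: generation at which the task was solved,
                  or None if the simulation timed out.
    """
    total = solution_gen if solution_gen is not None else MAX_GENERATIONS
    if switch_gen is None:
        # Never switched: the whole run is selection phase.
        return total, total, 0
    return total, switch_gen, total - switch_gen
```

For example, a simulation that switched at generation 20 and solved the task at generation 50 spent 20 generations in the selection phase and 30 in the evolution phase; a simulation that never switched has an evolution phase of 0 generations.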

Since simulations in each condition were initialized with the same random seed within experiments, conditions can be thought of as different treatments given to the same group of individuals. Thus, we used repeated measures statistics to compare time spent with the task, the length of selection phase and the length of evolution phase. The data in one or more conditions were not normally distributed so we used nonparametric tests.

We also looked at the number of repetitions. A repetition is a candidate solution that has already been selected before. It is a repetition of the coordinates of the fourth tree, not a repetition of output patterns, i.e., many output patterns can code the same coordinates.
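The distinction between repeated coordinates and repeated output patterns can be made concrete with a short sketch. The decoding rule used here (counting the active units in each coordinate block) is an assumption chosen only to illustrate additive coding; the authors' actual decoding may differ.

```python
def decode(pattern, units_per_coord=100):
    """Hypothetical additive decoding: number of active (+1) units per coordinate block."""
    return tuple(
        sum(1 for state in pattern[i:i + units_per_coord] if state == 1)
        for i in range(0, len(pattern), units_per_coord)
    )

def count_repetitions(selected_patterns, units_per_coord=100):
    """Count selected candidates whose decoded coordinates were already selected before."""
    seen = set()
    repetitions = 0
    for pattern in selected_patterns:
        coords = decode(pattern, units_per_coord)
        if coords in seen:
            repetitions += 1
        seen.add(coords)
    return repetitions
```

Because many different unit configurations sum to the same coordinate, two distinct output patterns can count as a repetition of the same candidate solution.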

Finally, we also looked at the dimensions of candidate solutions. The interesting question is whether three-dimensional candidate solutions are present from the beginning, or whether they only appear later during problem solving. If candidate solutions are three-dimensional from the beginning, it means that the problem solver did not need representational change, because the initial search space was already three-dimensional.

# RESULTS AND DISCUSSION

# Experiment 1

#### Number of Solvers

Almost all simulations found the solution: 28 in the control condition (out of 30) and 29 in the combined condition (also out of 30). This means that 200 generations are too long compared to the 4 min given to human participants, because only one human participant (out of 31 participants; 3.2%) solved the task in the control condition (Kershaw et al., 2013) in the given 4 min. In our simulations, the first solution (3.3%) in the control condition appeared at generation 33. In the combined condition, there were already 22 solutions by that time (73.3%), see **Figure 4**. We compared the number of solvers in the two conditions at generation 33 with a chi-square test and found a significant difference: χ²(1) = 28.202, p < 0.0001.
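As a worked check, the reported statistic can be recomputed from the solver counts at generation 33 (1 of 30 vs. 22 of 30). The value 28.202 is obtained when Yates' continuity correction is applied; that the correction was used is inferred from this match and is not stated in the text. The function name is hypothetical.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Yates-corrected chi-square for the 2x2 table [[a, b], [c, d]] (df = 1)."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: control, combined; columns: solvers, non-solvers at generation 33.
chi2, p = chi2_yates_2x2(1, 29, 22, 8)
print(round(chi2, 3), p < 0.0001)  # 28.202 True
```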

#### Dimension of Candidate Solutions

When we looked at the candidate solutions, we found that all simulations in the control condition had two-dimensional candidate solutions at the beginning, and successful problem solvers later started to use three-dimensional patterns. In contrast, all simulations in the combined condition used three-dimensional patterns from the very beginning. This means that priming and pre-training with three-dimensional patterns removed the bias to solve the task in two dimensions. We have no comparable data from the human experiment.

# Experiment 2

#### Number of Solvers

The number of solvers was 25, 29, 26, and 23 in the 2DD, 2DR, 3DD, and 3DR conditions out of 30 simulations, respectively. According to the χ² test, the row and column variables are not significantly associated in the contingency table: χ²(df = 3) = 5.140, p = 0.1618.

Looking at the number of solvers through time (**Figure 5**) shows that earlier differences between conditions tend to disappear halfway through the simulations, except for condition 2DR, which always has the highest number of solvers. To reveal earlier differences, we also compared the number of solvers at the time point where the first solver appeared in the control condition. This happened in generation 35, when the number of solvers was 2, 26, 17, and 11 in the 2DD, 2DR, 3DD, and 3DR conditions out of 30 simulations, respectively. According to the χ² test, the row and column variables are significantly associated in the contingency table: χ²(3) = 40.982, p < 0.0001.
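Both 4 × 2 statistics reported for Experiment 2 can be recomputed from the solver counts with an uncorrected Pearson chi-square test. The sketch below uses only the standard library; the function names are hypothetical, and the closed-form survival function is specific to 3 degrees of freedom.

```python
import math

def chi2_pearson(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def chi2_sf_df3(x):
    """Survival function (p-value) of the chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def solver_table(solvers, n=30):
    """Rows: conditions; columns: solvers, non-solvers."""
    return [[s, n - s] for s in solvers]

# Conditions in the order 2DD, 2DR, 3DD, 3DR.
chi2_end = chi2_pearson(solver_table([25, 29, 26, 23]))   # after 200 generations
chi2_gen35 = chi2_pearson(solver_table([2, 26, 17, 11]))  # at generation 35
print(round(chi2_end, 3), round(chi2_sf_df3(chi2_end), 4))
print(round(chi2_gen35, 3), chi2_sf_df3(chi2_gen35) < 0.0001)
```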

The number of solvers at generation 35 shows an unexpected rank order: 2DD < 3DR < 3DD < 2DR. **Table 2** shows the results of pair-wise comparisons with a series of six χ² tests. We used Bonferroni correction to compensate for multiple comparisons: α = 0.05/6 = 0.0083. The difference between consecutive conditions in the rank order was not significant, but all other differences were significant. The 2DD condition had the fewest solvers, as we predicted, but the order of the 3DR and 2DR conditions was swapped compared to our predictions.
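The pair-wise procedure can be sketched as follows. The sketch assumes 2 × 2 chi-square tests with Yates' continuity correction; the text does not say which variant was used, so this is an assumption. Under it, the pattern of significant and non-significant pairs agrees with the description above. All names are hypothetical.

```python
import math
from itertools import combinations

solvers = {"2DD": 2, "2DR": 26, "3DD": 17, "3DR": 11}  # solvers at generation 35
N = 30                  # simulations per condition
ALPHA = 0.05 / 6        # Bonferroni-corrected threshold for six comparisons

def yates_p(s1, s2, n=N):
    """p-value of a Yates-corrected 2x2 chi-square test (df = 1) on two solver counts."""
    a, b, c, d = s1, n - s1, s2, n - s2
    total = 2 * n
    chi2 = total * max(abs(a * d - b * c) - total / 2, 0) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return math.erfc(math.sqrt(chi2 / 2))

significant = {
    (g1, g2): yates_p(solvers[g1], solvers[g2]) < ALPHA
    for g1, g2 in combinations(solvers, 2)
}
for pair, is_significant in significant.items():
    print(pair, "significant" if is_significant else "not significant")
```

In this sketch the three pairs that are adjacent in the rank order (2DD vs. 3DR, 3DR vs. 3DD, 3DD vs. 2DR) fall above the corrected threshold, and the remaining three pairs fall below it.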

#### Length of Selection and Evolution Phases

To reveal what could have caused superior performance in the 2DR condition, we checked when the switch between selection mode and evolution mode happened and how long each phase took (**Figure 6**). The models were not pre-trained with the solution, so finding the solution without switching to evolution mode was very unlikely. It seems that the 2DR group performed better than expected, because only one simulation did not switch to evolution mode (it is the outlier in the figure, for which the evolution phase was 0 generations long). In the 2DD, 3DD, and 3DR conditions, 4, 3, and 6 simulations failed to switch, respectively. **Figure 6** also shows that most simulations in the 2DR condition switched very early to evolution mode, whereas the time of switching is more widely spread in the other conditions.

TABLE 2 | Results of pair-wise comparisons with a series of χ² tests on the number of solvers at generation 35 in Experiment 2.

For the length of the selection phase, according to the Friedman test, variation among condition medians is significantly greater than expected by chance, Fr = 8.883, p = 0.0309, but pairwise comparisons with Dunn's multiple comparisons test did not show significant differences between conditions. For the evolution phase, the Friedman test was also significant, Fr = 37.653, p < 0.0001, and Dunn's pairwise comparisons showed that the evolution phase in the 2DD condition was significantly longer than in the other conditions (rank sum difference was 45.5, 38.5, and 56.0 for 2DD vs. 2DR, 2DD vs. 3DD, and 2DD vs. 3DR, respectively; p < 0.001 for all three comparisons), and there were no other significant differences between conditions.

#### Number of Repetitions

The probability of switching depends on the number of repetitions before the switch, so we compared the number of repeated candidate solutions during the selection phase in the four conditions to see whether this could have caused the advantage of the 2DR condition, see **Figure 7** and **Table 3**. According to the Friedman test, variation among column medians was significantly greater than expected by chance, Fr = 28.093, p < 0.0001. Dunn's multiple comparisons test showed that condition 2DR had significantly fewer repetitions than the other conditions, and there were no other significant differences between conditions.

FIGURE 6 | Length of selection phase and evolution phase in the four conditions of Experiment 2. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are plotted individually (red +). Notches represent comparison intervals: two medians are significantly different at the 5% significance level if their intervals do not overlap.

This means that the advantage of condition 2DR came from earlier switching to evolution mode because of very few repetitions. Probably the weight matrix trained on two-dimensional patterns and then provoked with three-dimensional patterns resulted in very hectic behavior, where the selected patterns of subsequent generations were very dissimilar. This is because the provoking patterns were very far from the attractor basins of the networks, so that the output was more or less random until evolution was switched on. **Figure 8** shows the first simulation from each condition: it can be seen that in condition 2DR the fitness is very variable at the beginning, compared to the other conditions. The reason for condition 3DR performing worse than expected is the opposite: the interaction of three-dimensional pre-training and three-dimensional initial provoking patterns resulted in too uniform candidate solutions and many repetitions, and thus late switching to evolution. Even though the initial fitness was the highest among conditions, late switching resulted in inferior performance.


TABLE 3 | Results of Dunn's multiple comparisons test on the number of repeated candidate solutions during the selection phase in Experiment 2.


#### Dimensions of Candidate Solutions

We also looked at the dimensions of candidate solutions. All simulations in the 2DD condition started with two-dimensional candidate solutions, whereas the rest of the conditions had three-dimensional candidate solutions from the very beginning. This explains why the evolution phase in the 2DD condition was longer than in the other conditions: when evolution started, candidate solutions were still two-dimensional, and it took longer to accumulate activations in the z coordinate through mutations starting from 0 than in the other conditions, where the z coordinate was already above 0 at the time of switching.

# Experiment 3

#### Number of Solvers

The number of solvers after 200 generations was 26, 17, 2, and 25 in the CC, MC, MR, and RR groups, respectively. According to the χ² test, the row and column variables are significantly associated in the contingency table: χ²(df = 3) = 50.606, p < 0.0001. **Figure 9** shows that group CC had the most solvers at all generations as expected, group MC was the second until about generation 110, when group RR caught up with it, and group MR had the fewest solvers most of the time.

#### Time Spent with the Task

We compared the time spent with the task in the four conditions (**Figure 10**). We used the Friedman test because the data were not normally distributed and then compared all groups to the control group with Dunn's multiple comparisons test. The Friedman test showed that variation among group medians is significantly greater than expected by chance, Fr = 54.477, p < 0.0001. Pairwise comparisons showed a significant difference between the control group and all deficit groups, see **Table 4**.

#### Length of Selection and Evolution Phase

We also compared the length of selection and evolution phase between groups, as in Experiment 2, see **Figure 11**. Since mutation rate and retraining rate did not influence the simulations in the selection mode, all simulations in the CC, MR and RR groups were identical until evolution switched on. That is why we only compared the length of the selection phase in the control group and group MC. The Wilcoxon matched-pairs signed ranks test showed that the median of the differences between the two groups differed significantly from zero: W = −185, p = 0.0176. For the evolution phase, the Friedman test showed that variation among group medians was significantly greater than expected by chance, Fr = 56.740, p < 0.0001, and pairwise comparisons with Dunn's multiple comparisons test between the deficit groups and the control group revealed that the MR and RR groups spent more time with the task in the evolution phase than the control group, see **Table 4**.

# CONCLUSIONS

# Summary of Results

We developed a model for human problem solving that is based on the selection and evolution of hypotheses (de Vladar et al., 2016; Szilágyi et al., 2016). The model is a possible cognitive architecture for Darwinian Neurodynamics and it is based on a population of attractor networks that store and reproduce the hypotheses which are then selected for reproduction according to their fitness. We assumed that search for the solution starts in a computationally cheaper selection mode, when the model only explores previously learnt candidate solution patterns. If the model has not met the given task before, selection generally does not find the solution. If the model switches to evolution mode, it can explore new hypotheses, and has a chance to go through restructuring. In evolution mode (1) better candidate solutions get stored in more and more attractor networks by cross-network learning and (2) novel candidate solutions are introduced by mutations. The probability of switching between selection and evolution increases with time, but decreases with the number of repeated candidate solutions because of a self-induced mental set.

TABLE 4 | Results of Dunn's multiple comparisons test on the time spent with the task and on the length of evolution phase in Experiment 3.

In this paper, we applied this cognitive architecture to an insight task, the four-tree problem. Experiment 1 served as a benchmark to test our model against human data from Kershaw et al.'s experiment (Kershaw et al., 2013). The model performed similarly to human participants, i.e., there were more solvers in the combined group, which was pre-trained and primed with tetrahedrons, than in the control group, which did not receive these treatments. In Experiment 2, three-dimensional training and priming were supposedly less efficient than in Experiment 1, because they involved three-dimensional patterns instead of tetrahedrons per se. However, we predicted that training and priming would still have a positive effect on problem solving. This proved to be true, but combined pre-training and priming with three-dimensional patterns was not as effective as we thought; instead, the group that received two-dimensional pre-training and three-dimensional priming performed best. This is a prediction that we plan to test in human experiments. In Experiment 3, we showed that deficits in computational capacity and learning abilities of the model decreased solution rate, as expected.


# Limitations and Future Work

The simplification (planting only the fourth tree) and the representation of the problem (100 neurons additively code each coordinate) might be overly simplistic in this study. We plan to work out a more complex representation, where the model searches for the position of all four trees, and with a more realistic coding. We would like to implement this in embodied robots that could physically solve the problem.

We did not explicitly model time in this model (time steps equalled generations). This makes it impossible to model inactivity, which is an important behavioral correlate of impasse. In fact, we did not model impasse per se in these simulations. However, we propose that impasse starts sometime before the switch to evolution mode and ends around representational change, because impasse is the phase of problem solving where unconscious thought processes lead to representational change.

Future work should address sampling of candidate solutions to represent solution attempts of human participants. The apparent jump between the goodness of solution attempts of human problem solvers right before the solution can be a result of two different processes. One possibility is that hypotheses gradually increase their fitness through time, but a series of solution attempts does not become conscious, so when one emerges into consciousness, there is an apparent discontinuity. Another possibility is that there is a real jump in the fitness of unconscious hypotheses.

In the present paper, a switch from selectionist to evolutionary dynamics leads to representational change. We are aware of other possibilities, however. A prime candidate could be the re-rendering of the associated adaptive landscape (going beyond adding one more dimension), which would correspond to representational change. Analysis of such alternatives is a task for the future. Another limitation is that switching is unidirectional and happens only once. It would be more realistic to implement a mechanism that can switch back and forth between selection mode and evolution mode.

In Experiment 2, we found that the group that was pre-trained with two-dimensional patterns and initialized with three-dimensional patterns performed best, which is unexpected. This might be a limitation of the model, or a valid prediction about a behavior that is like beginner's luck. We plan to test human participants in conditions similar to our Experiment 2 to find out.

We think that the realization of evolutionary processes in the human brain is not impossible. We speculate about the possible components of the cognitive architecture elsewhere (Szilágyi et al., 2016). Here, we would just like to point out that it should be different from Neural Darwinism as it was proposed by Edelman (1987), because he only proposes selection on pre-existing variants, which is a mere one-shot game.

This study shows how semi-neurally implemented evolutionary processes can solve the four-tree problem, and that manipulations lead to increased solution rates just like in human problem solvers. We have some interesting predictions about human behavior, which we will test later. We would also like to implement a more realistic version of the four-tree problem, as well as implementing other insight problems. Our investigations so far show that Darwinian Neurodynamics and its implementation in our cognitive architecture is a promising model for human problem solving.

# AUTHOR CONTRIBUTIONS

AF was responsible for implementing, running and analyzing the experiments and writing the manuscript. IZ and ASz were responsible for implementing the model and writing the manuscript. HV and MÖ were responsible for writing the manuscript. ESz was responsible for the conceptual definition of the model, providing guidance during the implementation of the model and the experiments, writing the manuscript and supervising the work.

# ACKNOWLEDGMENTS

This work was financially supported by the EU FET Open project "Insight," grant agreement no. 308943, and by the European Research Council project "EvoEvo," grant agreement no. 294332.

# REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Fedor, Zachar, Szilágyi, Öllinger, de Vladar and Szathmáry. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neural Correlates of Learning from Induced Insight: A Case for Reward-Based Episodic Encoding

Jasmin M. Kizilirmak<sup>1</sup>\*, Hannes Thuerich<sup>2</sup>\*, Kristian Folta-Schoofs<sup>1</sup>, Björn H. Schott<sup>3,4</sup> and Alan Richardson-Klavehn<sup>2</sup>

<sup>1</sup> Cognitive Neuroscience Lab, Institute of Psychology, University of Hildesheim, Hildesheim, Germany, <sup>2</sup> Memory and Consciousness Research Group, Department of Neurology, Otto-von-Guericke University of Magdeburg, Magdeburg, Germany, <sup>3</sup> Leibniz Institute for Neurobiology, Department of Behavioral Neurology, Magdeburg, Germany, <sup>4</sup> Department of Psychiatry, Charité University Hospital, Berlin, Germany

#### Edited by:

Kirsten G. Volz, University of Tübingen, Germany

#### Reviewed by:

Chunyan Guo, Capital Normal University, China

Amory H. Danek, University of Illinois at Chicago, USA

#### \*Correspondence:

Jasmin M. Kizilirmak jasmin.kizilirmak@scienceforfun.org

Hannes Thuerich hannes.thuerich@outlook.com

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 09 June 2016 Accepted: 13 October 2016 Published: 01 November 2016

#### Citation:

Kizilirmak JM, Thuerich H, Folta-Schoofs K, Schott BH and Richardson-Klavehn A (2016) Neural Correlates of Learning from Induced Insight: A Case for Reward-Based Episodic Encoding. Front. Psychol. 7:1693. doi: 10.3389/fpsyg.2016.01693

Experiencing insight when solving problems can improve memory formation for both the problem and its solution. The underlying neural processes involved in this kind of learning are, however, thus far insufficiently understood. Here, we conceptualized insight as the sudden understanding of a novel relationship between known stimuli that fits into existing knowledge and is accompanied by a positive emotional response. Hence, insight is thought to comprise associative novelty, schema congruency, and intrinsic reward, all of which are separately known to enhance memory performance. We examined the neural correlates of learning from induced insight with functional magnetic resonance imaging (fMRI) using our own version of the compound-remote-associates-task (CRAT) in which each item consists of three clue words and a solution word. (Pseudo-)Solution words were presented after a brief period of problem-solving attempts to induce either sudden comprehension (CRA items) or continued incomprehension (control items) at a specific time point. By comparing processing of the solution words of CRA with control items, we found induced insight to elicit activation of the rostral anterior cingulate cortex/medial prefrontal cortex (rACC/mPFC) and left hippocampus. This pattern of results lends support to the role of schema congruency (rACC/mPFC) and associative novelty (hippocampus) in the processing of induced insight. We propose that (1) the mPFC not only responds to schema-congruent information, but also to the detection of novel schemata, and (2) that the hippocampus responds to a form of associative novelty that is not just a novel constellation of familiar items, but rather comprises a novel meaningful relationship between the items, which was the only difference between our insight and no insight conditions.
To investigate episodic long-term memory encoding, we compared CRA items whose solution word was recognized 24 h after encoding to those with forgotten solutions. We found activation in the left striatum and parts of the left amygdala, pointing to a potential role of brain reward circuitry in the encoding of the solution words. We propose that learning from induced insight mainly relies on the amygdala evaluating the internal value (as an affective evaluation) of the suddenly comprehended information, and striatum-dependent reward-based learning.

Keywords: problem solving, long-term memory, encoding, fMRI, hippocampus, mPFC, insight

# INTRODUCTION


Insight has been an important subject of investigation in the field of Cognitive Psychology for around 100 years (Mayer, 1995). By insight, we refer to the phenomenon that sometimes the solution to a previously unsolvable problem is comprehended suddenly as opposed to gradually, usually accompanied by a positive feeling, while being convinced of the correctness of the solution. Several studies suggest that insight can enhance long-term memory (LTM) encoding (Auble et al., 1979; Dominowski and Buyer, 2000; Ash et al., 2012; Danek et al., 2013; Kizilirmak et al., 2015). However, the neural mechanisms that mediate this link between insight and successful encoding are largely unknown. Previous studies suggest that the positive emotional response to insight may play an important role, because successful encoding of an insight solution is associated with higher activation of the amygdala (Ludmer et al., 2011). The hippocampus is critically important for the neural manifestation of explicit memory, and its role in memory includes the detection and encoding of novel stimuli, contexts, and associations (Ranganath and Rainer, 2003). Notably, the hippocampus has also been shown to be involved in the processing of insights (Luo and Niki, 2003), which may provide a further explanation for the facilitated LTM encoding of insight-related information. The aim of the current study is to identify neural correlates of successful encoding of insight solutions, that is, suddenly comprehended presented solutions, via functional magnetic resonance imaging (fMRI) and to stimulate further research by proposing a theory of a neural network involved in learning from induced insight.

When investigating insight, it is important to be aware of the fact that the operationalization of "insight" varies considerably between studies. These variations can be boiled down to two main operationalizations: a relatively objective one, in which the experimenter classifies given problems as either insight or no-insight problems in advance (Auble et al., 1979; Metcalfe, 1986; Bowden and Jung-Beeman, 1998; Wills et al., 2000), and a subjective one, in which participants classify their solution either as being conceived via insight or not after they solved the problem (Bowden and Jung-Beeman, 2003b; Danek et al., 2013, 2014; Kizilirmak et al., 2016). In the experimenter-based approach, insight problems are designed to make it very difficult to solve them gradually by incorporating problem features that usually lead to initial solution attempts, which in turn lead to a dead end, necessitating "thinking outside of the box" to break the fixation on the dead end solution attempt (Öllinger et al., 2014). For example, when one needs to find a common association for words that are only remotely associated, such as "tennis, manners, and cloth" (Bowden and Jung-Beeman, 2003b), one may become fixated on the close associations of the single words, such as tennis (ball, racket, player, match) and manners (saying thank you, holding the door open, gentlemen). Solving this task is very difficult for problem solvers, because it is difficult to think of more remote associations. However, this is necessary to solve the problem (the solution is "table": table tennis, table manners, and table cloth). In the participant-based approach, insight is assessed by asking participants whether they had an "aha!" experience during the solution of the problem. The subjective "aha!" experience is usually defined as the feeling that the solution was comprehended suddenly, while feeling surprised and convinced of the correctness of the solution.
Moreover, once comprehended, the solution appears to be very easy to understand. A few studies suggest that the subjective "aha!" experience does not depend on solving the problem, but that it can also be perceived when confronted with the solution after having unsuccessfully attempted to solve the problem (Bowden and Jung-Beeman, 2003a; Kizilirmak et al., 2015, 2016). It should be noted that both approaches to investigating insight, the experimenter-based classification of insight and no-insight problems as well as the participant-based classification of solutions accompanied by a feeling of "aha!" (insight) and no "aha!" (no insight), are equally important to gain a better understanding of the mechanisms behind insight. Evidence exists that both cognitive and neural processing differ considerably when comparing insight and no insight with either approach (Auble et al., 1979; Jung-Beeman et al., 2004; Ludmer et al., 2011; Danek et al., 2014). Both approaches have their own merits: While the number of "insight" and "no-insight" items as well as the trial and time point at which "insight" occurs can be controlled in the experimenter-based approach, only the participant-based approach provides information as to whether the participant actually consciously perceives a qualitative difference between insight ("aha!") and no insight (no "aha!"). We therefore intended to combine both approaches for the current study.

Traditionally, the problems used to study insight are tasks with only one trial, such as the 9-dot problem (Maier, 1930) or the widely known problem with the candle, book of matches, and box of thumbtacks (Duncker, 1945). Although such tasks are well-suited for studying behavioral manifestations or the subjective phenomenology of the insight experience, different tasks are necessary to study the neural underpinnings of insight. Measures of underlying neural activity require multiple measurements of the same kind, that is, many insight problems with minimal differences that all engage comparable cognitive processing strategies. One such task is the CRAT, developed by Bowden and Jung-Beeman (2003b), which is based on the Remote Associates Task by Mednick (1962), originally developed to study creativity. For each trial of this task, a triad of words is presented which seem completely unrelated at first glance (e.g., "death, drain, and stem"). Participants are required to find a fourth word that, used as a prefix or suffix, forms a compound word with each of the three initially presented words (here: brain, i.e., brain death, brain drain, and brain stem). The CRAT is especially suited for fMRI studies, as a large number of such triads and solutions can be generated, and the solution can be presented to induce insight (i.e., sudden comprehension) at a defined moment in time, after participants had the opportunity to think about the solution for a short time. This also facilitates fMRI data analysis as the variation between participants and trials is relatively small.

For the current study, we used a modified German version of the CRAT where not only solvable (true CRA), but also unsolvable (control) items were presented. Solutions were presented after a short while [4 s presentation of the riddle + 2–8 s fMRI jitter (fixation cross)] to induce insight (sudden comprehension) or not (continued incomprehension) at a well-defined time point. Unsolvable items were created by shuffling the triad and solution words of a subset of originally solvable items that was equal in solution rate (when given 30 s to solve an item), probability of experiencing a subjective "aha!" as defined above, and probability to be rated as plausible, based on a prior normative study which assessed these features. Which of four subsets of items was used to create unsolvable items was counterbalanced across participants. This procedure ensured that all differences between the solvable insight condition and the unsolvable control condition could be attributed to sudden comprehension vs. continued incomprehension and not to item-related differences (e.g., word length, frequency, or any other perceptual differences between items). To avoid any misconceptions, we would like to point out that in the current study, we investigated induced insight, that is, sudden comprehension following a state of incomprehension induced by presenting the solution, as did Ludmer et al. (2011). This is important to note, because many recent studies operationalized insight as "generating the correct solution to a problem accompanied by a feeling of 'aha!'" (e.g., Bowden and Jung-Beeman, 2003b; Jung-Beeman et al., 2004; Danek et al., 2013).

To investigate the neural correlates of successful encoding into LTM, learning trials are usually compared based on whether the encoded items were later successfully remembered or not. Such contrasts are often referred to as "difference due to memory" or DM contrasts, for short (Paller et al., 1987). Importantly, neural correlates of LTM formation are not only determined by the encoding task, but also by the memory retrieval task used to test for encoding success. Depending on how memory is tested, the DM contrast can reflect the encoding of different aspects of the encoded information. In the current study, we used the modified CRAT described above as an encoding task. Participants were not informed that their memory would later be tested, thus, successful encoding was incidental (as opposed to intentional; see Richardson-Klavehn, 2010). The information in the focus of the encoding process during this task may be subdivided into several aspects, namely the triad, the problem, the association between the triad and the problem, as well as episode-specific aspects such as how participants felt when they suddenly comprehended a solution. Memory was tested 24 h later by presenting solution words without their triads, which were either old (i.e., presented during the learning phase) or were new. The task was to decide whether a given solution word had been presented during the encoding task ("old" or "new"). Although recollecting the associated triad or any other contextual information about the encoding episode would almost certainly be helpful during the decision whether a solution was old or new, it was not a necessary requirement for the task. Thus, contrasting learning trials of later recognized and later forgotten solutions should primarily reflect successful encoding of the solution. 
If successful encoding of the solution were mainly independent of whether the presented solution was comprehended suddenly or not, one would expect no difference between the successful encoding of a CRA or control item's solution. However, if induced insight, that is, sudden comprehension, facilitated encoding, as we hypothesized, CRA solutions would be expected to be associated with higher recognition memory, higher recollection rates, and differences in neural correlates of successful encoding.

While a number of previous studies investigated the neural correlates of insight, only very few studies have addressed the relationship between the occurrence of an insight and episodic memory at a neural level. Insight as compared with no insight (with differing operationalizations) has been associated with increased activations of the medial temporal lobe (MTL) memory system (right hippocampus, and bilateral amygdala/parahippocampal gyrus) as well as prefrontal brain structures (bilateral inferior frontal gyrus, IFG, middle frontal gyrus, MFG), of the salience network [right insula, right anterior to dorsal cingulate cortex (ACC)], and a temporo-parietal network including the precuneus, the bilateral angular gyrus (ANG), the right superior temporal gyrus (STG), and the right temporal pole (Jung-Beeman et al., 2004; Aziz-Zadeh et al., 2009, also using the CRAT; Luo and Niki, 2003; Qiu et al., 2010)<sup>1</sup>. Brain areas implicated in the processing of insight solutions were the right anterior STG, which may reflect the integration of information across distant semantic relations (Jung-Beeman et al., 2004), the hippocampus, which has been linked to the formation of novel associations (Luo and Niki, 2003), and IFG and ACC, which have been associated with more meta-cognitive processes controlling the search and evaluation of (potential) solutions (Aziz-Zadeh et al., 2009). Sandkühler and Bhattacharya (2008), who also used a version of the CRAT, further suggest that the right temporal activation may reflect retrieval of the novel solution.

Regarding episodic memory for presented insight solutions, analyzed by comparing later recognized old solutions with later forgotten old solutions, the amygdala has been proposed to play an important role due to the positive emotional response to sudden comprehension (Ludmer et al., 2011). These researchers further reported activation of the left medial prefrontal cortex (mPFC), ACC, and precuneus in the same contrast (Ludmer et al., 2011). The precuneus is a region which previously has been associated with successful episodic memory retrieval (Shallice et al., 1994; Miller and D'Esposito, 2012) as well as effortful semantic integration (Hagoort et al., 2009; Shimamura, 2011; Seghier, 2013). This may reflect the phenomenon that a solution that could be semantically integrated more easily was more likely to be remembered later on. The mPFC has recently been suggested to play an important role in the detection and encoding of schema-consistent information, that is, information which can be easily integrated into pre-existing knowledge (Tse et al., 2011; van Kesteren et al., 2012). In this context, insight could also be understood as the rapid formation of a novel schema (Mayer, 1995). Thus, the sudden formation of a novel schema may further support learning from insight.

In short, here, we investigated induced insight and the successful episodic encoding of insight solutions by using a version of the CRAT. To this end, we compared behavioral and fMRI responses to solvable (CRA = induced insight condition) vs. unsolvable items (control condition) and further contrasted later recognized with later forgotten solution words. We hypothesized that induced insight would facilitate encoding via

<sup>1</sup> Studies contrasting an insight condition with a null-event baseline were not included in this summary as such a contrast may include unspecific effects and therefore not accurately reflect the differential activation of insight solutions compared to no-insight solutions.


Accordingly, we further hypothesized that, at a neural level, insight-based encoding would engage brain regions previously associated with reward-based learning such as the ventral and dorsal striatum (nucleus accumbens/caudate) (Ikemoto and Panksepp, 1999; Haruno et al., 2004) as well as brain structures previously implicated in schema-based memory formation, most prominently the mPFC (Tse et al., 2011; van Kesteren et al., 2012). We used an episodic recognition memory test, because such tests have often been used to study the influence of reward-related areas on hippocampus-dependent encoding (Wittmann et al., 2005; Krebs et al., 2009a,b; Chowdhury et al., 2012). With respect to the hippocampus, we predicted that activations would primarily relate to the detection of novel relationships (Davachi, 2006), rather than successful schema-consistent encoding, which has previously been demonstrated to bypass the hippocampus (Tse et al., 2007; van Kesteren et al., 2013).

# MATERIALS AND METHODS

# Participants

Twenty-eight graduate and undergraduate students volunteered to participate in our study. Two participants were excluded due to illness or technical problems during scanning. The remaining 26 participants (15 male, 11 female) had an average age of 25 years (SD = 3.7, range = 18 to 32 years) and were German native speakers with normal or corrected-to-normal vision. All participants gave written informed consent to participate in the study. At the end of the study, they received financial compensation, and the purpose of the study was explained if requested. The study was approved by the Ethics Committee of the University of Magdeburg, Faculty of Medicine, and was conducted in accordance with the Declaration of Helsinki.

# Material

We used our own German version of the CRAT (Kizilirmak et al., 2016), which is based on the version published by Bowden and Jung-Beeman (2003b) and contains 180 items. Each item consists of three clue words (triad) and a solution word that can be used to form a compound word with each of the triad words (**Figure 1**). The solution word could either be used as a prefix or suffix to build a compound with the other words. Whether the same compound rule (only prefix/suffix or mixed) could be applied to all triad words varied; about half of the items were mixed. All words (triad words and solutions) were nouns or color words. Solution words were only presented in singular form. Due to the German grammatical rules regarding the formation of compound words, some solution words had to be slightly altered (for example by appending an 's') to combine them with the triad words. Solution words were unique while some triad words could appear in up to two different triads.

Normative data from a previously conducted behavioral study<sup>2</sup> were used to divide the 180 items into six pools of 30 CRA items that had approximately equal means with regard to item difficulty, plausibility, and 'aha' ratings. Two of these pools were used as solvable CRA items and two other pools as unsolvable control items presented in the encoding phase and the memory test. The remaining two pools were used to provide new solutions for the memory test, which yielded the false alarm rate (new solutions incorrectly categorized as old). Assignment of pools to conditions was counterbalanced across participants by means of a reduced Latin square, such that each pool was used in the CRA, control, and new conditions. This procedure, along with the careful matching of item-pools, ensured that old and new items at test had highly similar normative properties (e.g., a priori solution probability).
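The counterbalancing logic can be illustrated with a minimal sketch. The exact reduced Latin square used in the study is not specified here, so the rotation scheme below (three schemes obtained by cyclically shifting the condition labels by two pools) is an assumption, as are the names `CONDITIONS` and `assign_pools`.

```python
from collections import Counter

# Two pools per condition, as described in the text.
CONDITIONS = ["CRA", "CRA", "control", "control", "new", "new"]


def assign_pools(participant_index, n_pools=6):
    """Return {pool_id: condition} for one participant.

    Assumption: the reduced Latin square is approximated by rotating the
    condition labels by two positions per scheme, which yields three
    schemes that participants cycle through.
    """
    shift = 2 * (participant_index % 3)
    return {pool: CONDITIONS[(pool + shift) % n_pools] for pool in range(n_pools)}


if __name__ == "__main__":
    for participant in range(3):
        print(participant, assign_pools(participant))
```

Under this scheme every pool serves once as CRA, once as control, and once as new across the three schemes, which is the property the text requires of the assignment.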

Unsolvable items for the control condition were created by taking all triad words and solution words of each pool and shuffling them separately (i.e., triad words among triad words and solution words among solution words), using the random permutation algorithm from MATLAB 7.1 (The MathWorks, Inc., Natick, MA, USA). The resulting control items were manually inspected for accidental plausibility, meaning that words of an item were semantically associated or could incidentally still be combined to create compound words. In those cases, shuffling was repeated until the triad and solution words could no longer be combined. This way, six matched pools of 30 items were created, each composed of four words, in which the triad words could not be combined in a meaningful way with each other or the fourth "solution" word. When an item-pool fell into the control condition according to the counterbalancing scheme just described, the shuffled version of that item-pool was used. Owing to the counterbalancing scheme and this shuffling procedure, differences between the CRA and control condition could only be due to differences in cognitive processing, and not to differences in perceptual, semantic, or affective properties between individual words, or differences in word frequency in the language.
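The shuffling procedure described above can be sketched as follows. This is an illustration in Python, not the original MATLAB code. The function `shuffle_pool` and the callback `is_plausible` are hypothetical names; the callback stands in for the manual inspection for accidental plausibility, and reshuffling the whole pool on failure is a simplifying assumption.

```python
import random


def shuffle_pool(items, is_plausible, rng, max_tries=1000):
    """Create control items from a pool of (triad, solution) pairs.

    Triad words are shuffled among triad words and solution words among
    solution words. The pool is reshuffled until no resulting item is
    flagged by is_plausible (a stand-in for manual inspection).
    """
    triad_words = [word for triad, _ in items for word in triad]
    solutions = [solution for _, solution in items]
    for _ in range(max_tries):
        rng.shuffle(triad_words)
        rng.shuffle(solutions)
        control = [
            (tuple(triad_words[3 * i:3 * i + 3]), solutions[i])
            for i in range(len(items))
        ]
        if not any(is_plausible(triad, sol) for triad, sol in control):
            return control
    raise RuntimeError("no implausible arrangement found")


if __name__ == "__main__":
    # English example items taken from the Introduction.
    pool = [
        (("death", "drain", "stem"), "brain"),
        (("tennis", "manners", "cloth"), "table"),
    ]
    known = {(w, s) for triad, s in pool for w in triad}
    check = lambda triad, sol: any((w, sol) in known for w in triad)
    print(shuffle_pool(pool, check, random.Random(0)))
```

Because the same words appear in both the solvable and the shuffled version of a pool, word-level properties are held constant between conditions by construction.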

Four items from the CRA and control conditions each were used in practice trials during the encoding phase. The remaining items were assigned to the two functional MRI runs in equal proportions (56 items per run, 28 items per condition per run).

# Design

The solvable CRA and unsolvable control conditions were presented in an event-related manner. For the analysis, the items were further split into conditions according to the participants' responses. During the encoding phase in the scanner, participants were asked to decide whether a presented solution was plausible, and, if so, whether they experienced a feeling of "aha!" or not, or whether a solution was implausible, once it was presented. During the test phase, participants were presented with new and old solution words and asked to (1) decide whether the solution word was old or new and, if old, (2) whether they remembered something from the encoding context (be it remembering what they thought when they saw the target word, remembering some of the triad words associated with it, etc.), whether they knew due to a feeling of familiarity that the item was old, or whether they were actually not at all sure and simply guessed it was old (Gardiner and Richardson-Klavehn, 2000; Yonelinas et al., 2005). This was done to get information about the quality of the participants' memory.

# Procedure

### Encoding Phase

The encoding of the stimuli was performed while participants underwent fMRI scanning at 3 Tesla. Before entering the scanner, participants first ran through a training session with four items from the control and CRA conditions each. This was done to ensure that they understood their task correctly. During scanning, they saw 56 CRA items and 56 control items equally split into two runs. An exemplary trial is shown in **Figure 1A**. Each trial began with a fixation cross which was presented for a duration between 2 and 8 s (pseudo-exponential distribution) which was followed by the triad, presented for 4 s. Participants were instructed to try to think of a solution during that time. From the normative study we knew that only 7% of the CRA items are usually (median solution rate) solved under 6 s. After the triad, another fixation cross was displayed for a variable delay of 2 to 8 s. Thus, with the median duration of the fixation cross jitter following the triad being 4 s and the triad not being displayed during that time, we approximated that probably less than 10% of all items could be solved during that time. The fixation cross was followed by the target word, which was presented for 6 s. Participants were instructed to provide one of three responses during the presentation of the solution: plausible with "aha!," plausible without "aha!," and implausible. Plausible with "aha!" and plausible without "aha!" were assigned to either the left or right index finger on a response box (counterbalanced across participants), while implausible required a bimanual response. The definition of the "aha!" experience was an adaptation of the definition provided by Topolinski and Reber (2010) and highlighted that comprehension of the solution should be sudden and unexpected, and that the solution appears to be crystal clear, once it is understood. The description was adapted to be applicable to all solutions, whether they were presented or found by the participant. The "aha!" 
definition (rough English translation of the German original, cf. Supplementary) read as follows: "You have most likely already had an 'aha!' experience yourself. These are moments in which you surprisingly find the solution for a previously incomprehensible problem. You are often unsure how you came up with this solution, but you are convinced of its truth. However, you cannot only experience such an 'aha!' when coming up with a solution on your own, but also when you are provided with the solution after you have unsuccessfully thought about the problem on your own. For example, a friend tells a joke which you do not get. He then explains the missing piece of information, and suddenly it all makes sense and you may even ask yourself why you did not comprehend it immediately. In our experiment, the 'aha!' experiences may qualitatively diverge from your real-life experiences. It is therefore important to know the following characteristics of an 'aha!' experience to make a decision during the task: (1) The solution to the verbal riddle is comprehended **suddenly** and with surprise. (2) The solution, once understood, is comprehended with **ease** and seems very clear. (3) You are **convinced of the correctness** of the solution and do not need to question it. (4) The sudden comprehension is often associated with a **positive feeling**. Importantly, we are not referring to pride, but to the positive feeling which is based on the dissolved tension upon comprehension."

<sup>2</sup>The normative data can be requested via e-mail from the corresponding author: jasmin.kizilirmak@scienceforfun.org.

#### Test Phase

Retrieval took place outside of the scanner, approximately 24 h (mean = 24.24 h, SD = 1.08 h, range: 22.75–27.67 h) after encoding. During the test phase (see **Figure 1B** for an exemplary trial), participants were presented with solution words from the encoding phase randomly intermixed with new solution words (solution words from CRA items not presented at encoding). Solution words from the eight practice trials were not presented. Presentation of a solution word was preceded by a fixation cross displayed for 1000 ms. Solution words were presented until a response was made. During the presentation of each solution word, participants were to decide whether the word was either "old" or "new" (left or right cursor buttons, counterbalanced across participants). They were specifically instructed that they should only choose "old" when they were sure that the word was seen during the encoding phase the day before in the scanner. This served to split items for the fMRI analysis into later recognized and later not recognized to investigate brain processes during successful encoding (DM effect). When a participant rated a word as "old," they were further asked to decide whether they remembered it and could recollect contextual information or if they only knew that the word was old on the next screen (displayed until a response was made). In case "old" was chosen although the participant did not feel confident about the item being old, they should respond with "guess" instead of "know" or "remember." The remember/know/guess differentiation is a standard procedure to differentiate between familiarity (e.g., recognizing a person as someone who you know, but not knowing who it is or where you know him from) and recollection (e.g., remembering that this is Paul who was sitting in the row in front of you during your last lecture). Please see Gardiner and Richardson-Klavehn (2000) or Yonelinas et al. 
(2005) for further information and our Supplementary Material for a decision tree for the remember/know/guess/new decision provided as part of the test phase's instruction sheet.

# Image Acquisition

Scanning sessions were conducted with a 3 Tesla Siemens Magnetom Prisma Syngo MR D13D at the University Hospital of Magdeburg, Germany, with a 64 channel head coil. The MRI session consisted of two anatomical and two functional runs. The first image acquired was a non-distortion corrected T1-weighted image with a resolution of 1.1 mm × 1.1 mm × 7 mm that served as a localizer to set orientation for the following anatomical scan [MP-RAGE sequence, resolution of 1 mm × 1 mm × 1 mm, field of view (FOV) = 256 mm<sup>3</sup>, 192 slices, time to repetition (TR) = 2500 ms, time to echo (TE) = 2.82 ms, flip angle = 7°], which was used for co-registration of the subsequently acquired functional images. During the two functional MRI runs, blood oxygen level-dependent (BOLD) signal-sensitive T2<sup>∗</sup>-weighted echo-planar images (EPIs) were acquired (voxel size = 2 mm × 2 mm × 3 mm including 10% inter-slice gap; FOV = 216 mm<sup>3</sup>; 34 axial slices aligned to the AC-PC line; TR = 2000 ms, TE = 30 ms, flip angle = 90°). EPIs covered most parts of the brain except for the most dorsal parts of the parietal lobe, sensory and motor cortices. Both functional runs contained 500 scans.

# Image Analysis

Data pre-processing and analysis were done in FSL 5.0 (FMRIB's Software Library<sup>3</sup>; Smith et al., 2004). Anatomical data were processed with FSL's brain extraction tool (Smith, 2002) to strip the skull from the cerebral tissue. The functional images were first motion-corrected with the aid of the FSL tool MCFLIRT (Motion Correction FMRIB's Linear Registration Tool; Jenkinson et al., 2002), followed by slice-time correction as integrated in FEAT, which uses (Hanning-windowed) sinc interpolation to shift each time-series by an appropriate fraction of the TR relative to the middle of the TR period. EPIs were then smoothed with a Gaussian kernel of 6 mm full width at half maximum. To remove low-frequency signal drifts, a high-pass filter with a cut-off at 100 s was applied to the data. Participants' functional scans were coregistered with their brain-extracted anatomical scans using FSL FLIRT (FMRIB's Linear Registration Tool; Jenkinson and Smith, 2001) and spatially transformed into the Montreal Neurological Institute (MNI) standard reference frame.

First level (single-subject) analyses were carried out with multiple regression (parameter estimation via least squares method). Statistical time series analysis was performed using FILM (FMRIB's Improved Linear Model; Woolrich et al., 2001) implemented in FSL, which includes a local correction of autocorrelations. Two different general linear models (GLMs) were generated: One model contrasted CRA and control conditions, and the other was used to contrast later recognized with later forgotten CRA solutions. Because control solutions yielded too few recognized items, recognized and not recognized items could not be modeled separately, but were collapsed for the control condition. As it could not be assumed that participants would stop trying to solve the triads or think about their solutions before stimulus-offset, we used a stimulus-convolved approach to compare CRA and control processing. To this end, we included the presentation times of the triads (4 s) and the (pseudo-)solutions (6 s) for both conditions as predictors in our first GLM. Owing to the concern that conditions with ambiguous responses, that is, "implausible" for CRA and "plausible" for the control condition, may introduce additional variance, we computed the GLM also with additional regressors for the triad and target interval for those "unfitting" combinations (hence called ambiguous trials). While there were too few trials to model separate regressors for CRA + implausible and control + plausible (zero trials in at least one of these conditions in 16 subjects), the composite regressor would nevertheless capture the variance explained by ambiguous trials. The results of the GLM with and without the regressor modeling ambiguous trials were nearly identical. We report the data of the first GLM with the regressor for ambiguous trials included.

<sup>3</sup>http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/

In the second GLM, the CRA condition was further split into later recognized and later not recognized items. Because the number of trials in each category was already rather low, we did not model ambiguous items (here: CRA items judged as implausible) in a separate regressor in this model. Just like Jung-Beeman et al. (2004), who also investigated neural correlates of insight with the CRAT, we modeled the presentation of the solution, meaning the moment of deciding the response, as response-locked, starting 2 s before the response and ending 2 s after it (4 s interval). In modeling this regressor, it was irrelevant whether the 2 s after button press were already part of the presentation of the fixation cross that followed the presentation of the solution. The rationale behind this approach was to capture the moment of comprehension vs. deciding that the solution is not meaningful, especially as the BOLD response is slow. In both GLMs, all regressors were convolved with a gamma model of the hemodynamic response function, and temporal derivatives were added to the model. Functional analysis was done with z-statistics, which had been corrected at cluster level according to random field theory. Unless otherwise stated, z-threshold was 2.8 and cluster significance threshold was 0.05 (Worsley, 2001).
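To make the response-locked modeling concrete, the following sketch builds a 4 s boxcar centered on each button press and convolves it with a gamma-shaped response function. It is a simplified illustration, not FSL's implementation: the gamma parameters (mean lag 6 s, SD 3 s), the 0.1 s integration grid, and the function names are assumptions that are not reported in the text, and temporal derivatives are omitted.

```python
import math


def gamma_hrf(t, mean_lag=6.0, sd=3.0):
    """Gamma density at time t (s); mean_lag and sd are assumed values."""
    if t <= 0:
        return 0.0
    shape = (mean_lag / sd) ** 2
    scale = sd ** 2 / mean_lag
    return (t ** (shape - 1) * math.exp(-t / scale)
            / (math.gamma(shape) * scale ** shape))


def response_locked_regressor(onsets, rts, n_scans, tr=2.0,
                              half_window=2.0, dt=0.1):
    """Predicted BOLD time course sampled once per TR.

    onsets: solution onset times (s); rts: response times (s) relative to
    onset. Each event is a boxcar from RT - 2 s to RT + 2 s.
    """
    n_fine = int(round(n_scans * tr / dt))
    boxcar = [0.0] * n_fine
    for onset, rt in zip(onsets, rts):
        start = int(round((onset + rt - half_window) / dt))
        stop = int(round((onset + rt + half_window) / dt))
        for i in range(max(0, start), min(n_fine, stop)):
            boxcar[i] = 1.0
    # Discretized kernel covering 30 s after event onset.
    kernel = [gamma_hrf(j * dt) * dt for j in range(int(round(30.0 / dt)))]
    regressor = []
    for scan in range(n_scans):
        i = int(round(scan * tr / dt))
        regressor.append(sum(boxcar[i - j] * kernel[j]
                             for j in range(min(i + 1, len(kernel)))))
    return regressor


if __name__ == "__main__":
    # One hypothetical trial: solution shown at 10 s, response after 3 s.
    print([round(v, 3) for v in response_locked_regressor([10.0], [3.0], 20)])
```

Because the window is anchored to the response rather than to stimulus onset, trials with different response times contribute identically shaped predictors that are merely shifted in time.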

Due to a programming error, the CRA and control trials of the second run were not presented in a randomized but block-wise manner (first, all solvable CRA items were presented and then all unsolvable control items). Hence, all data were analyzed for both runs separately and not collapsed (28 trials per condition per run). The data from the second run are reported in the Supplementary, as a block-wise presentation of first the CRA and then the control items may have led to different effects. All data reported below are from the first run in which conditions were presented in a randomized order, that is, in an event-related design. Although the results from the first and second runs were overall comparable with respect to the behavioral data and also for the fMRI contrast between CRA and control (see Neural Correlates of Induced Insight vs. Control here and Test Phase 24 h Later of the Supplementary Materials), the results did differ for the DM contrast (see Neural Correlates of Learning from Induced Insight here and Test Phase 24 h Later of the Supplementary Materials). This suggests that the blocked presentation had an influence on LTM encoding, at least on the neural level.

# RESULTS

# Behavioral Data Encoding Phase

First, we analyzed the distribution of responses across solvable CRA and unsolvable control items on a purely descriptive level (see **Figure 2** for an overview). A total of 0.75 (SD = 0.18) of all CRA items were rated as "plausible" and accompanied by an "aha!" experience, 0.19 (SD = 0.18) were rated as "plausible" without being accompanied by an "aha!" experience. Only 0.05 (SD = 0.06) of all CRA items were rated as implausible, and participants failed to respond before the start of the next trial in less than 0.01 (SD = 0.01) of all CRAT trials. With respect to control items, the majority of items were rated as "implausible" with 0.88 (SD = 0.15), while 0.07 (SD = 0.12) were judged as "plausible" with "aha!" and 0.05 (SD = 0.09) as "plausible" without "aha!." Again, in less than 0.01 (SD = 0.01) of all control items, participants failed to respond.

The high proportion of "plausible" responses for CRA items confirmed a successful induced insight manipulation (seeing an item as "implausible" would preclude sudden comprehension, hence insight), whereas the high proportion of "implausible" responses for control items corroborates the control condition as a successful no-insight (continued incomprehension) manipulation. Because many response categories contained only very few trials (e.g., no "aha!" | CRA, no response | CRA, implausible | CRA, "aha!" | control), we did not split trials according to the participants' responses for the analysis of either the behavioral or the fMRI data.

For the analysis of response times (RT), median RTs were calculated on an individual level and were then averaged across participants for each condition. For the CRA condition, mean RT was 3453 ms (SD = 695 ms) and for the control condition it was 4465 ms (SD = 1219 ms). A paired-samples t-test confirmed the statistical difference of mean RTs [t(25) = 3.68, p = 0.001, Cohen's d = 0.748<sup>4</sup>].
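The aggregation and test described above can be illustrated with a minimal sketch. It assumes trial-level data as (participant, RT) pairs and uses invented toy values, not the study's data; the function names and the label `d_z` for the ratio of the mean difference to the standard deviation of the differences are ours. The further conversion given in footnote 4 is not reproduced here.

```python
import math
import statistics

def median_rt_per_participant(trials):
    """Return {participant: median RT} from a list of (participant, rt_ms) pairs."""
    by_participant = {}
    for participant, rt in trials:
        by_participant.setdefault(participant, []).append(rt)
    return {p: statistics.median(rts) for p, rts in by_participant.items()}

def paired_t_and_dz(cond_a, cond_b):
    """Paired-samples t statistic, degrees of freedom, and d_z = M_D / SD_D."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    d_z = mean_diff / sd_diff
    t = d_z * math.sqrt(n)  # equivalent to M_D / (SD_D / sqrt(n))
    return t, n - 1, d_z

# Toy data (hypothetical): three participants, three trials per condition.
cra_trials = [("p1", 3000), ("p1", 3400), ("p1", 3200),
              ("p2", 3600), ("p2", 3500), ("p2", 4100),
              ("p3", 2900), ("p3", 3100), ("p3", 3300)]
control_trials = [("p1", 4000), ("p1", 4400), ("p1", 4200),
                  ("p2", 4300), ("p2", 4500), ("p2", 4900),
                  ("p3", 3500), ("p3", 3700), ("p3", 3600)]

cra_medians = median_rt_per_participant(cra_trials)
control_medians = median_rt_per_participant(control_trials)
participants = sorted(cra_medians)

t, df, d_z = paired_t_and_dz(
    [control_medians[p] for p in participants],
    [cra_medians[p] for p in participants],
)
```

Taking the median within each participant limits the influence of single very slow trials before the condition means are compared across participants.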

#### Test Phase

We analyzed memory performance with respect to the CRA and control conditions. For these analyses, the conditions were not further split depending on the response categories (i.e., plausible with "aha!" / plausible without "aha!" / implausible). All means and standard deviations are reported in **Table 1**.

Compared to the control condition (M = 0.39, SD = 0.17), participants correctly recognized more old solutions from the CRA condition (M = 0.48, SD = 0.16). A paired t-test confirmed this difference to be statistically significant [t(25) = 4.36, p < 0.001, Cohen's d = 0.955]. Moreover, significantly more solutions were given a "remember" response in the CRA compared to the control condition [t(25) = 4.74, p < 0.001, Cohen's d = 1.061]. The CRA (M = 0.29, SD = 0.11) and control (M = 0.29, SD = 0.13) conditions did not differ in regard to their rate of "know" responses, as supported by a repeated-measures t-test [t(25) = −0.25, p = 0.823, Cohen's d = −0.063]. In other words, recognition memory only differs for our CRA and control conditions due to a higher remember rate for CRA (see **Figure 3**). This suggests that CRA solutions leave a more detailed memory trace, enabling participants to recollect some information about the encoding episode.
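How a recognition rate splits into "remember" and "know" components can be sketched as follows. This is an illustration with invented responses, assuming that hits are counted as "remember" plus "know" responses (as in the hit bar of Figure 3) and that "guess" responses are tallied separately; the function names are hypothetical.

```python
from collections import Counter

RESPONSE_LABELS = ("remember", "know", "guess", "miss")

def response_proportions(responses):
    """Proportion of each test-phase response among the old items of one condition."""
    counts = Counter(responses)
    total = len(responses)
    return {label: counts[label] / total for label in RESPONSE_LABELS}

def hit_rate(proportions):
    """Hits are taken here as 'remember' plus 'know' responses (assumption)."""
    return proportions["remember"] + proportions["know"]

# Toy data (hypothetical): ten old items from one condition for one participant.
responses = ["remember", "remember", "know", "know", "know",
             "guess", "miss", "miss", "miss", "miss"]

proportions = response_proportions(responses)
hits = hit_rate(proportions)
```

With such a decomposition, two conditions can have the same "know" rate and still differ in overall recognition, which is the pattern reported above.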

To ensure that these results did not depend on the few cases in which participants responded to CRA items as "implausible" (mean number of trials = 1.4, SD = 1.6) or control items as "plausible" (M = 3.0, SD = 4.8), we ran these analyses again without those trials. The pattern of the results was basically the same. CRA (M = 0.47, SD = 0.16) and control conditions (M = 0.34, SD = 0.15) differed significantly in regard to recognition rates [t(25) = 5.83, p < 0.001, Cohen's d = 1.335]. CRA and control significantly differed in regard to their rate of "remember" responses [M = 0.19, SD = 0.15 vs. M = 0.09, SD = 0.08; t(25) = 4.65, p < 0.001, Cohen's d = 1.096], but not in regard to their "know" response rates [M = 0.28, SD = 0.11 vs. M = 0.25, SD = 0.11; t(25) = 1.30, p = 0.205, Cohen's d = 0.348].

New items were correctly identified in 0.77 (SD = 0.17) of all cases.

<sup>4</sup>Cohen's d was always calculated via d<sub>4</sub> = M<sub>D</sub>/SD<sub>D</sub>, Cohen's d = d<sub>4</sub>/√(r).

TABLE 1 | Memory performance (proportion of responses) for Run 1 during the test phase 24 h after the encoding phase.


split for response category (remember, know, guess, miss). To provide an overview, all responses considered hits, i.e., correctly recognizing items as old, are represented in the "hit (remember and know responses)" bar.

# Functional Imaging Data

Due to the low number of no "aha!" CRA trials reported as proportions under the section "Encoding Phase" (in absolute numbers of trials, we had <16 trials in 25 participants, and even <10 trials in 20 participants), we could not model "aha!" and no "aha!" separately for CRA items.

#### Neural Correlates of Induced Insight vs. Control

The comparison of brain activity during presentation of the triad in the CRA and control conditions revealed no significant differences (p > 0.05) in either direction, suggesting similar search processes in both conditions. In other words, participants did not notice whether an item was solvable or not during the presentation of the problem, supporting the comparability of our CRA and control conditions. During the presentation of solution words, however, an increased activation (Z-threshold = 3.3, p = 0.05) was observed for correct solution words compared to pseudo-solution words (contrast CRA > control) in frontal as well as mediotemporal and inferior parietal regions (**Figure 4**, yellow–red activations). We found higher activation in prefrontal cortical areas, including inferior frontal gyrus, mPFC, and ACC, as well as in the left hippocampus, and in temporo-parietal cortices, including bilateral middle temporal gyrus (MTG), ANG, and supramarginal gyrus (SMG). All activation clusters are summarized in **Table 2**.

Calculation of the reverse contrast control > CRA (Z-threshold = 3.3, p = 0.05) revealed significant activations in brain structures primarily implicated in sensory-motor processing, such as the bilateral sensory-motor cortices (postcentral gyrus and precentral gyrus) and the supplementary motor area (SMA) (**Table 3**; **Figure 4**, green–blue activations), which is probably due to the high rate of bimanual "implausible" responses for the control condition.

FIGURE 4 | Functional magnetic resonance imaging (fMRI) contrast for the CRA vs. control condition during the presentation of the solution. Unfitting responses (i.e., "plausible" for control and "implausible" for CRA) were excluded. White–red activation clusters indicate CRA > control and green–blue activations indicate CRA < control.



MNI coordinates are provided for the peak voxel as well as for the center of activation (in parentheses).

#### Neural Correlates of Learning from Induced Insight

Second, to investigate neural activation during successful episodic encoding of presented CRA solutions (difference due to memory effect, DM-effect), BOLD responses to later recognized vs. later forgotten CRA solution words were compared (Z-threshold = 2.3, p = 0.05). Because of a relatively low number of remember trials in each run, we compared only hits and misses without further differentiating between remember, know and guess. The results are reported in **Table 4**.
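The trial sorting behind such a DM contrast can be sketched as below. This is an illustrative helper, not the analysis pipeline used in the study: the function name, the data layout, and the placeholder words are hypothetical, and it assumes that "remember", "know" and "guess" responses all count as later recognized, which is one reading of the description above.

```python
def sort_onsets_by_subsequent_memory(encoding_trials, test_responses):
    """Split encoding-trial onsets by whether the solution was later recognized.

    encoding_trials: list of (solution_word, onset_in_seconds) tuples.
    test_responses: dict mapping solution_word to the test-phase response.
    """
    recognized_labels = {"remember", "know", "guess"}  # assumption, see lead-in
    onsets = {"later_recognized": [], "later_forgotten": []}
    for word, onset in encoding_trials:
        # Words without a recorded response are treated as misses in this sketch.
        response = test_responses.get(word, "miss")
        key = "later_recognized" if response in recognized_labels else "later_forgotten"
        onsets[key].append(onset)
    return onsets

# Toy data (hypothetical placeholder words and onsets).
encoding_trials = [("word_a", 12.0), ("word_b", 30.5), ("word_c", 49.0), ("word_d", 67.5)]
test_responses = {"word_a": "remember", "word_b": "miss", "word_c": "know", "word_d": "guess"}

dm_onsets = sort_onsets_by_subsequent_memory(encoding_trials, test_responses)
```

The two onset lists would then serve as separate event regressors, so that the contrast later recognized > later forgotten can be computed at the time of solution presentation.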

Neural activation was observed in the left amygdala (**Figure 5B**), left putamen and left caudate nucleus (**Figure 5A**), bilaterally in the anterior and dorsomedial thalamus (**Figure 4**), and in the left inferior/middle frontal gyrus (**Figure 5C**). Further activation clusters were observed in temporo-parietal regions, namely within the posterior part of the left inferior temporal gyrus (ITG), and in the inferior parietal lobe (IPL), spanning the right SMG and ANG (**Figure 6**). All activation clusters for the DM-effect are summarized in **Table 4**.

# DISCUSSION

The current study aimed to illuminate the neural correlates of induced insight and successful explicit memory formation for presented insight solutions by comparing solvable CRA (insight) and unsolvable control problems, as well as by contrasting encoding trials of later recognized and later not recognized CRA solutions. We had hypothesized that induced insight, that is, sudden comprehension of a previously incomprehensible problem upon the presentation of the solution, would evoke a positive feeling which may serve as an intrinsic reward, thereby facilitating successful encoding. Though this positive emotional response would probably be considerably weaker compared to generating the solution themselves, most likely due to the missing pride of solving the puzzle, evidence exists that it is still often accompanied by a moderate positive response (Kizilirmak et al., 2016). We had further hypothesized that CRA items would be associated with better semantic integration due to the formation of novel schemata, facilitating the integration of the new information into existing knowledge.

# Induced Insight Is Associated with Better Learning of the Solution

Of all solvable CRA items (induced insight condition), almost three quarters were rated to have elicited an "aha!" experience (as described under the section "Encoding Phase"). This finding supports the idea that, even when correct solutions are presented rather than found by the participants themselves, a subjective experience of "aha!" can be induced. On the other hand, unsolvable control items were correctly identified as implausible in almost 90% of all cases. Slightly more than 10% of those items were rated as plausible (either with or without "aha!"), suggesting that participants may, in some instances, have failed to press both buttons for the implausible response simultaneously, as required, or that participants might have occasionally found their own creative individual associations between the triad words.

TABLE 3 | Activation clusters for the insight (solvable CRAT) < no-insight (control) contrast during the presentation of the solution.

MNI coordinates are provided for the peak voxel as well as for the center of activation (in parentheses).

TABLE 4 | Activation clusters for the contrast between successfully encoded > not successfully encoded insight solutions.

MNI coordinates are provided for the peak voxel as well as for the center of activation (in parentheses). There were no differential activations in the opposite direction for the chosen significance threshold.

successfully encoded (later recognized) > unsuccessfully encoded (later forgotten) CRA solutions. (A) Axial view, z = 10, (a1) left caudate nucleus, (a2) anterior thalamus, (a3) putamen, (a4) inferior temporal gyrus. (B) Axial view, z = −12, (b1) amygdala, (b2) inferior temporal gyrus. (C) Sagittal view. (c1) inferior frontal gyrus, (c2) inferior/medial temporal gyrus.

In line with the assumption that insight facilitates encoding into LTM (Auble et al., 1979; Dominowski and Buyer, 2000; Ash et al., 2012; Danek et al., 2013; Kizilirmak et al., 2015), we observed higher recognition rates for CRA solutions compared to the control condition's "solutions" as well as higher recollection rates for the CRA solution. The higher recollection rates indicate that memories for CRA solutions were associated with a more elaborate recollective experience, and were in this sense more "episodic" (Gardiner and Richardson-Klavehn, 2000; Yonelinas et al., 2005). In fact, the superior memory performance for CRA items could be almost exclusively attributed to recollection (**Figure 3**). This observation is similar to the commonly reported preferential contribution of deep (i.e., semantic and/or elaborate) study processing to recollection compared with familiarity (Gardiner et al., 1996; Gardiner and Richardson-Klavehn, 2000).

One limitation of the present study is that we did not collect further information with respect to the content of the contextual information recollected, for example, whether participants recollected triad words associated with a solution or what they felt when they saw the solution during encoding. We suggest that the most likely information retrieved would be the triad words associated with the solution, but we cannot exclude that this particular information could also be retrieved when participants correctly recognized the target word based on familiarity. Therefore, we can only speculate that induced insight was most likely associated with higher positive emotional responses (Danek et al., 2013, 2014; Kizilirmak et al., 2015) and better integration of the novel information into preexisting knowledge (van Kesteren et al., 2012). These potential explanations are supported at a neural level by the higher activation of the amygdala and striatum as well as the mPFC for insight vs. no insight, as discussed below.

# Neural Correlates of Induced Insight and Insight-Related Memory Encoding

Regarding the neural correlates of induced insight vs. control and successful (later recognition of old items) vs. unsuccessful encoding (later misses), it is remarkable that differences were only found for the presentation of the solution but not for the presentation of the problem itself. This suggests (1) that our control condition was not obviously unsolvable when presented without the (pseudo-)solution, but well comparable to the actual remote associations from the CRA condition which also seem not associated at first glance, (2) that the CRA and control items differed only in regard to sudden comprehension vs. continued incomprehension when the solution was processed, and (3) that the relevant encoding processes, which either led to later recognition or non-recognition of the solution, occurred during the processing of the solution.

The increased activations observed for CRA compared to control items were largely consistent with previous findings. In line with the idea that insight reflects the sudden comprehension of a novel relationship between the solution word and the triad, we found that insight was associated with a higher activation of the left hippocampus. This activation is compatible with the finding by Luo and Niki (2003), who interpreted the observed hippocampal activation as reflecting reorienting processes, implying both the breaking of mental fixations on unsuccessful solution attempts and the formation of novel associations. The present finding could be explained analogously. Importantly, in the context of the CRAT, neither the triad or solution words nor the compound words per se were novel to the participants. Instead, the novelty of the relationship between triad and solution words is a purely associative one, as it is solely defined by the sudden comprehension that the triad words have a common link in the target word. Our data thus conform with our initial hypothesis that the primary role of the hippocampus in insight processing is the detection of novel associations. This is also in line with earlier studies that have more generally implicated the hippocampus in the detection of associative novelty, as defined by a novel combination of familiar items (Düzel et al., 2003; Schott et al., 2004; Davachi, 2006). Because novel combinations of familiar items occurred in both the solvable and the unsolvable condition, our data extend these findings by further suggesting that the hippocampus may be particularly sensitive to novel meaningful relationships between familiar items.

The induced insight condition differed from the control condition also with respect to prefrontal cortical activations, specifically in the mPFC, both rostral and dorsal ACC and IFG. Similar to the interpretation by Aziz-Zadeh et al. (2009), the ACC and IFG may have been involved in the evaluation of the presented solution (Aziz-Zadeh et al., 2009). The dorsal ACC would most likely act as a salience detector here (Seeley et al., 2007), whereas the rostral ACC would rather be part of the mPFC schema encoding network (van Kesteren et al., 2013). More generally, particularly the left IFG has been implicated in the semantic analysis of verbal information (Demb et al., 1995; Poldrack et al., 1999; Schott et al., 2013; Soch et al., 2016) and also in the retrieval of information from semantic (as opposed to episodic) memory (Düzel et al., 1999). In the present study, IFG activation might constitute a neural correlate of retrieving semantic information regarding the compound words from semantic memory (e.g., by checking with the pre-existing English vocabulary whether "brain" and "death" can be combined to a meaningful known compound word).

The mPFC on the other hand has not yet been implicated in the context of insight as compared to no insight, although it has been reported to be associated with successful encoding of insight solutions (Ludmer et al., 2011). We suggest that the stronger activation of this region for insight as compared to no insight items most likely reflects the processing of schema-congruency. Previous studies have demonstrated that the mPFC is critically involved in the rapid encoding of novel information into pre-existing schemata (van Kesteren et al., 2010, 2012). In those studies, however, participants had acquired a schema prior to the study, and mPFC involvement could therefore only be demonstrated for encoding of novel, but schema-congruent stimuli. Here, on the other hand, providing the solutions to the triad words presented before most likely resulted in the almost instantaneous formation of previously non-existing schemata. We therefore suggest that, in addition to its by now well-established role in the encoding of schema-congruent information, the mPFC is also involved in the initial formation of a schema, at least when this occurs at a rapid time scale. In addition to schema congruency, mPFC activation in response to CRA as compared to control solutions might, to some extent, be associated with reflecting on pre-existing semantic associations, which contributes more to deep as compared to shallow memory encoding (Schott et al., 2013).

In contrast to studies by Jung-Beeman and Bowden (2000), Jung-Beeman et al. (2004) and Kounios et al. (2006), we did not find activation in the anterior STG. In the aforementioned studies, this region had been found when contrasting CRA items that were solved and accompanied by a subjective "aha!" experience with CRA items solved without "aha!" experience. Considering that we compared CRA items collapsed across "aha!"/no "aha!" trials (due to the low number of no "aha!" responses) with unsolvable CRA-like control items, this difference is not surprising. It moreover suggests that there is a neural processing difference between insight and no insight, depending on whether this refers to sudden comprehension vs. continued incomprehension or the subjective experience vs. nonexperience of an "aha!," that is, the feeling that the solution is comprehended suddenly, accompanied by a positive emotional response, being convinced of the correctness of the solution, and feeling that the solution is very clear and easily comprehensible once understood.

The DM contrast revealed that brain regions involved in successful LTM encoding of CRA items overlapped only partially with those involved in the CRA condition per se. Specifically, the only robust overlap between induced insight processing and successful encoding of insight solutions was observed in inferior parietal regions (ANG, SMG), which might be explicable by attentional processes (Corbetta and Shulman, 2002; Cabeza et al., 2012). Alternatively, or additionally, the temporo-parietal junction (i.e., ANG and SMG) has also been implicated in level of processing (LOP) during episodic encoding (Schott et al., 2013). Strikingly, whereas in that study, we observed encoding-related functional connectivity increase of the hippocampus with the left IFG, the mPFC (see above) and the TPJ, only the hippocampal-TPJ connectivity increase predicted the degree of the LOP effect at the level of individual participants. Given this somewhat comparable involvement of overlapping brain structures in deep encoding and in the processing and encoding of CRA items, we tentatively suggest that insight-related encoding might, to some extent, reflect a special case of deep (i.e., semantic, associative) encoding.

In line with our hypothesis that encoding of insight-associated information might be related to positive feelings during sudden comprehension, the DM contrast revealed increased activation of the amygdala during successful encoding of presented insight solutions, a finding in line with a previous study by Ludmer et al. (2011). This supports their idea that emotional arousal during processing of the solutions may contribute to successful encoding. Furthermore, successful encoding of CRA solutions was also associated with activations of the striatum, particularly the caudate nucleus, extending into the ventral striatum (**Figure 3**). The role of the ventral striatum in reward processing is a well-replicated finding (Knutson et al., 2001; Wittmann et al., 2005), and activation of more dorsal portions of the caudate has been associated with short-term reward (Haruno et al., 2004) and with reinforcement-based learning (Kahnt et al., 2009). Given the previously reported improved explicit encoding of reward-associated stimuli (Wittmann et al., 2005; Adcock et al., 2006; Krebs et al., 2009a,b), a rather straight-forward explanation for the striatal activation observed in the present study would be the notion that learning from insight may in part be driven by the processing of intrinsically rewarding information, which has also been associated with recruitment of the mesolimbic reward system (Daniel and Pollmann, 2010). Notably, participants did not solve items on their own, but were presented with a solution word (after an interval of generally unsuccessful solution attempts) for which they needed to comprehend how it could be combined with the triad words to build compound words. Thus, even though the rewarding feeling of sudden comprehension was probably lower than one might expect for generated solutions, it seems to have been strong enough to elicit increased striatal activation.

Somewhat surprisingly, neither the hippocampus nor the mPFC differentiated between successfully vs. unsuccessfully encoded CRA solutions. Although one has to be careful with null effects, one potential explanation for this finding could be the way memory was tested. Memory for the solution was probed via an old/new recognition test. Only the solution was presented, and it was not necessary to retrieve any associated information (e.g., the triad words), in order to decide whether a solution word had been presented 24 h earlier. Such a decision could be achieved solely on the basis of familiarity, although the behavioral results clearly indicate that the recognition memory advantage for CRA solutions could be largely attributed to recollection. We tentatively suggest that the hippocampus might have already been strongly engaged by the encoding of the novel meaningful relationship, irrespective of later recognition (**Figure 4**), such that subtle differences in hippocampal activation might not have been picked up by the DM contrast. Given the recent identification of the differential contribution of hippocampal input and output structures to novelty detection and successful LTM encoding (Maass et al., 2014), we propose that future research should employ high-field fMRI to detect a potential contribution of hippocampal output structures (i.e., pyramidal CA1, deep entorhinal cortex) to successful encoding of insight-associated information.

# Limitations

One limitation of our study is that the number of trials per condition in the DM analysis was rather low, and we were therefore not able to further separately consider ambiguous trials (i.e., CRA trials rated as implausible) in that analysis. It should be noted, on the other hand, that the inclusion of a separate regressor for ambiguous trials did not qualitatively affect the results of our main statistical model (if at all, we observed somewhat larger clusters when including the regressor; see Materials and Methods section for details). We therefore consider it unlikely that considering ambiguous trials separately in the DM analysis would substantially affect the results.

Along the same line, it must be acknowledged that the number of subsequently recalled items in the control condition was too low to allow for a DM type analysis. We can therefore not completely exclude the possibility that successful encoding of the control items might engage a comparable network of brain structures. Given the predominant engagement of the hippocampal-prefrontal networks observed in more "classic" DM studies and the previously demonstrated role of the striatum in intrinsic reward (Daniel and Pollmann, 2010), along with the unlikeliness of the control items to elicit intrinsic reward responses, we nevertheless suggest that the involvement of the mesolimbic network in successful encoding is at least to some extent specific to the insight-inducing task used here.

Another limitation concerns the activation of the amygdala and the striatum during the successful encoding of insight solutions. While activation of these brain regions has repeatedly been linked to reward processing and/or emotional arousal, we did not record an objective measure of arousal, such as skin conductance or pupillary dilation, in the present study. Such a psychophysiological measure would be of particular interest when comparing "aha!" and "non-aha!" items, and future research should be aimed at differentiating objective and subjective insight manipulations also at the level of psychophysiology.

# CONCLUSION

The findings of the current study suggest that encoding of solutions to verbal riddles is more successful when the solution is comprehended suddenly (CRA = induced insight) as compared to continued incomprehension (control). We further found that induced insight was associated with higher activation of several frontal, temporal, and parietal brain regions of which we would like to point out the hippocampus and mPFC. The hippocampus has been known to be involved in associative novelty, however, never in the sense of detecting a novel meaningful combination of known items (insight condition) as compared to just a novel combination of known items (control condition). Thus, the hippocampus may play a special role during insight processing, by detecting novel meaningful relationships. The mPFC on the other hand has been associated with detecting schema-consistency and may be associated with the detection that a novel meaningful relationship is consistent with existing knowledge. Regarding the neural correlates of successful encoding of CRA items, our current findings suggest that (1) the positive emotional response toward sudden comprehension (insight) as reflected by higher activation of the amygdala and (2) intrinsic reward as reflected by higher activations of the striatum play key roles in learning from insight. Our findings suggest that encoding insight-related information is different from the encoding of non-insight-related information, because it seems to rely on reward learning, which is not typical for information that is not associated with external rewards. We would therefore propose that insight, that is, sudden comprehension of a solution, may itself be rewarding, thereby facilitating LTM encoding of insight-related information.

# AUTHOR CONTRIBUTIONS

AR-K conceived this line of research, and JK and AR-K conceived and designed the experiment. JK and HT developed the CRAT materials and programmed the experiment. BS gave advice on fMRI-related design questions, and contributed significantly to the writing of theoretical sections on neural correlates of LTM formation and dopamine. JK wrote the first draft of the section "Introduction and Discussion" and coordinated the writing of the manuscript. HT conducted the study, wrote the first draft of the section "Materials and Methods," and analyzed the data. KF-S wrote the first draft of the section "Results." AR-K, BS, and KF-S funded the study/publication. All authors were critically involved in the interpretation of the results and in revising first versions of the manuscript.

# FUNDING

This work was supported by a grant assigned to AR-K and BS by the German Research Foundation (Deutsche Forschungsgemeinschaft) for project TPA10N, part of the Collaborative Research Center "Neurobiology of Motivated Behavior" SFB779, awarded to the University of Magdeburg.

# ACKNOWLEDGMENTS

Research was carried out at the Department of Neurology, Otto-von-Guericke-University, Magdeburg, Germany. We thank Dr. Claus Tempelmann for help with the MR scanning protocols and medical technical assistant Denise Göttert for assistance with the MR data collection.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01693/full#supplementary-material

# REFERENCES

network during self-reference versus reference to others. Cereb. Cortex doi: 10.1093/cercor/bhw206 [Epub ahead of print].


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Kizilirmak, Thuerich, Folta-Schoofs, Schott and Richardson-Klavehn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Incubation and Intuition in Creative Problem Solving

#### Kenneth J. Gilhooly1,2 \*

<sup>1</sup> Psychology Department, University of Hertfordshire, Hatfield, UK, <sup>2</sup> Department of Clinical Sciences, Brunel University London, London, UK

Creative problem solving, in which novel solutions are required, has often been seen as involving a special role for unconscious processes (Unconscious Work) which can lead to sudden intuitive solutions (insights) when a problem is set aside during incubation periods. This notion of Unconscious Work during incubation periods is supported by a review of experimental studies and particularly by studies using the Immediate Incubation paradigm. Other explanations for incubation effects, in terms of Intermittent Work or Beneficial Forgetting are considered. Some recent studies of divergent thinking, using the Alternative Uses task, carried out in my laboratory regarding Immediate vs. Delayed Incubation and the effects of resource competition from interpolated activities are discussed. These studies supported a role for Unconscious Work as against Intermittent Conscious work or Beneficial Forgetting in incubation.

#### Keywords: creativity, intuition, problem-solving, incubation effect, insight problem solving

What form might unconscious work take? On theoretical grounds, the notion that Unconscious Work involves the same processing steps as Conscious Work but minus conscious awareness is discounted, despite some recent arguments that the unconscious can duplicate any conscious function. A candidate account in terms of spreading activation, coupled with below-threshold but active goal representations, is put forward. This account could explain the emergence of subjectively sudden intuitive solutions (Aha-insight solutions) as a result of unconscious processes (Unconscious Work) during incubation periods.

"Intuition: the power of the mind by which it immediately perceives the truth of things without reasoning or analysis; a truth so perceived, immediate, instinctive knowledge or belief.

Latin, in, into, upon, and tueri, tuitus, to look." The Chambers Dictionary, 9th Edition, 2003, p. 778. Edinburgh: Chambers Harrap.

Creative problem solving involves the production of approaches and solutions that are novel to the solver even if not historically novel (Boden, 2004). Explaining the generation of personally novel solutions is an unresolved issue for the psychology of thinking and problem solving. Sometimes, problems seem to be solved by an immediate intuition or insight (e.g., Salvi et al., 2016) but, with difficult problems, a period of conscious analysis is usually needed, even if it does not directly lead to solution and the problem is set aside before solution. Why might setting a problem aside facilitate solution? One popular explanation is that setting creative problems aside for a period can allow unconscious processes to generate solution ideas, which are then experienced either as spontaneous breakthroughs into consciousness while attention is focussed on other matters, or as very rapid solutions on returning to previously intractable problems. These solutions, occurring apparently rapidly and without awareness of intermediate steps, will be experienced as akin to the dictionary idea of an intuition as a truth (a solution in this case) perceived without reasoning or analysis.

The value of setting a problem aside for facilitating solutions has been a concern of theorists in the area for at least the past 100 years. Wallas (1926, p. 80) drew on Poincaré's (1910) earlier

#### Edited by:

Michael Öllinger, Parmenides Foundation, Germany

#### Reviewed by:

Gary Jones, Nottingham Trent University, UK

Ut Na Sio, Carnegie Mellon University, USA

> \*Correspondence: Kenneth J. Gilhooly k.j.gilhooly@herts.ac.uk

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 28 April 2016 Accepted: 01 July 2016 Published: 22 July 2016

#### Citation:

Gilhooly KJ (2016) Incubation and Intuition in Creative Problem Solving. Front. Psychol. 7:1076. doi: 10.3389/fpsyg.2016.01076

analysis of mathematical creation and labeled the stage in which a problem is not consciously processed as "Incubation." (It is noteworthy that Poincaré himself did not use the term "Incubation" in his 1910 paper, although he reported four examples of incubation periods from his own experience of creative work in mathematics). In Wallas's analysis, Incubation is proposed as a useful stage after conscious Preparation but preceding Illumination (or Inspiration) and Verification. Clues to processes underlying creative thinking should be found from analyses of when and why Incubation can be useful. Subjective reports by acknowledged creative thinkers over many areas of work have supported the existence of incubation phenomena (e.g., Poincaré, 1910; Ghiselin, 1952; Csikszentmihalyi, 1996). However, since such personal reports have often been given many years after the events described, the reliability of such reports is highly questionable. For example, frequently cited accounts by Coleridge (composition of poem Kubla Khan in a dream), Mozart (complete compositions coming to mind without error) and Kekulé (discovery of benzene ring in a dream) have proven to be false (Weisberg, 2006, pp. 73–78). Poincaré (1910) himself based his own analysis of creative thinking on self reports of problem solving episodes he had experienced nearly 30 years previously. This is actually rather curious, as Poincaré was an active researcher in mathematics at the time of making his analysis of creative thinking and could presumably have drawn on more recent episodes which would be less susceptible to recall problems. 
However, after Poincaré (1910) and Wallas (1926), who had relied on their own introspections and on subjective reports by others (e.g., Wallas drew on daydream reports by Varendonck, 1921), a substantial body of experimental research has been carried out using both (a) insight problems, in which the solver has to develop a re-structuring of the task to reach a unique solution, and (b) divergent problems, which have no single unique solution but in which many novel potential solutions are to be generated. A typical divergent task, often used in research studies, is the Alternative Uses Task. In this task, participants are to produce as many uses as they can which are different from the normal use in response to one or more everyday items, such as a house building brick, a coat hanger, a pencil, a paperclip, and so on (Guilford, 1967; Guilford et al., 1978; Gilhooly et al., 2007).

Early work on incubation used a laboratory paradigm, known as the Delayed Incubation Paradigm, in which participants work on the target problem for an experimenter-set preparation time before being given an interpolated activity different from the target task for a set incubation period, and then return to the target problem for a set post-incubation work time. Performance in the incubation condition is compared with that of a control condition in which participants work without a break on the target task for a time equal to the sum of preparation and post-incubation conscious working times in the incubation condition. A recent alternative, the Immediate Incubation paradigm, has an interpolated task immediately after the instructions on the main problem, before any conscious work has been undertaken on that problem, followed by uninterrupted work on the main problem for a set time (Dijksterhuis and Meurs, 2006).
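The timing logic of these paradigms can be laid out in a short Python sketch. It is a minimal illustration only: the function names and the durations in minutes are hypothetical and are not taken from any of the studies cited. The point it makes is the one stated above, namely that the control condition is matched to the incubation conditions on total conscious working time on the target task.

```python
def delayed_schedule(preparation, incubation, post_incubation):
    """Delayed Incubation: target work, then an interpolated task, then target work."""
    return [("target", preparation),
            ("interpolated", incubation),
            ("target", post_incubation)]

def immediate_schedule(incubation, work):
    """Immediate Incubation: interpolated task straight after the instructions."""
    return [("interpolated", incubation), ("target", work)]

def control_schedule(preparation, post_incubation):
    """Control: uninterrupted target work for the summed conscious working time."""
    return [("target", preparation + post_incubation)]

def target_time(schedule):
    """Total time spent consciously working on the target task."""
    return sum(duration for phase, duration in schedule if phase == "target")

# Hypothetical durations in minutes, chosen only for illustration.
delayed = delayed_schedule(preparation=5, incubation=4, post_incubation=5)
immediate = immediate_schedule(incubation=4, work=10)
control = control_schedule(preparation=5, post_incubation=5)

for name, schedule in [("delayed", delayed), ("immediate", immediate), ("control", control)]:
    print(name, schedule, "target minutes:", target_time(schedule))
```

With conscious working time equated in this way, any advantage of an incubation condition over control cannot be attributed simply to more time on the target task.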

# DELAYED AND IMMEDIATE INCUBATION EFFECTS

There is now considerable evidence from laboratory studies for the benefits of Delayed Incubation, i.e., that setting a problem aside after a period of work is beneficial (see Dodds et al., 2012, for a qualitative review). A quantitative meta-analysis of 117 studies by Sio and Ormerod (2009) identified a positive effect of Delayed Incubation, where the overall average effect size was in the low-medium band (mean d = 0.29) over a range of insight and divergent tasks. Sio and Ormerod's review also revealed that the benefits of an incubation period are greater when participants are occupied by an undemanding interpolated task than when they engage in a demanding interpolated task or no task at all. Overall, from narrative reviews and meta-analysis, it can be concluded that the basic existence of Delayed Incubation effects is clearly established, especially for divergent problem solving.

Concerning the effectiveness of Immediate Incubation opportunities, Dijksterhuis and Nordgren (2006) found better performance when Immediate Incubation occurred after decision problems or divergent tasks were initially presented. Indeed, Nordgren et al. (2011) reported that Delayed Incubation resulted in better decisions than Immediate Incubation and both types of incubation were beneficial relative to No Incubation.

A meta-analysis (Strick et al., 2011) of 92 decision studies found a significant beneficial aggregate effect size of g = 0.224 for Immediate Incubation. Their results also pointed to a number of moderating factors; for example, beneficial effects were greater with more options, with shorter presentation times, with shorter incubation times, and with induction of a configural mindset vs. a feature based mindset.
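For readers less familiar with the effect size measures quoted in these meta-analyses, the following Python sketch shows how Cohen's d and Hedges' g are conventionally computed for two independent groups. It is an illustration under stated assumptions: the scores are invented and are not data from any cited study, the function names are hypothetical, and the small-sample correction used for g is the common approximation 1 − 3/(4(n1 + n2) − 9).

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * variance(treatment) +
                  (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)

def hedges_g(treatment, control):
    """Cohen's d with the usual approximate small-sample bias correction."""
    n1, n2 = len(treatment), len(control)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d(treatment, control) * correction

# Invented scores (e.g., number of uses produced); illustration only.
incubation_scores = [6, 8, 7, 9, 10]
control_scores = [5, 7, 6, 8, 9]

d = cohens_d(incubation_scores, control_scores)
g = hedges_g(incubation_scores, control_scores)
print(f"d = {d:.3f}, g = {g:.3f}")
```

Because the correction factor is below 1, g is always slightly smaller in magnitude than d, and the two converge as sample sizes grow.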

In creative divergent tasks, Dijksterhuis and Meurs (2006) reported that responses were more creative on average when the divergent task instructions were followed immediately by a short distracting task before producing uses for a brick, compared to a control condition. We may note that the instructions in this study did not ask for unusual uses, which is the norm in divergent thinking tasks, and so it is not clear whether participants had the goal of being creative. Participants may have been reporting infrequent uses that they happened to know, rather than generating uses novel to them at the time of test. Raters tend to score infrequent responses as creative, although such uses may have been pre-known and therefore could reflect memory retrieval rather than generation of subjectively novel responses (Quellmalz, 1985). However, Gilhooly et al. (2012), using more standard instructions with a stress on unusual uses, found a stronger beneficial effect of Immediate Incubation than of Delayed Incubation, with both incubation effects being superior to control effects, scored for fluency and novelty of responses. Thus, the benefit of immediate incubation was also found when the task involved novelty (Gilhooly et al., 2012) as well as fluency (Dijksterhuis and Meurs, 2006).

Zhong et al. (2008) applied the Immediate Incubation paradigm to the Remote Associates Task (RAT), in which solvers have to generate an associate common to three words (e.g., cottage, blue, mouse? Answer: cheese), and found that Immediate Incubation activated solution words more on unsolved trials, compared to solution word activation on unsolved trials where there had been no Immediate Incubation.

Overall, it may be concluded from both meta-analyses (Sio and Ormerod, 2009; Strick et al., 2011) and from recent studies (Gilhooly et al., 2012, 2013, 2015) that incubation periods, whether delayed or immediate, do have beneficial effects. The main theories regarding mechanisms underlying incubation effects will now be outlined.

# THEORIES OF INCUBATION EFFECTS

# Intermittent Conscious Work

This approach proposes that participants carry out intermittent conscious work during the incubation period despite instructions to be fully engaged on the interpolated task used to fill the incubation period (Seifert et al., 1995, p. 82; Weisberg, 2006, pp. 443–445). Any conscious work during the supposed incubation period would help reduce the time required when the target problem was re-addressed – but conscious work on the target task would be expected to impair performance on the interpolated task. This theory has the merit of parsimony and essentially explains incubation away as not involving any special processes, such as intuitive unconscious thinking.

# Beneficial Forgetting

This view (e.g., Woodworth, 1938; Simon, 1966; Smith and Blankenship, 1991; Smith, 1995; Segal, 2004; see also, Dijksterhuis and Meurs, 2006) argues that "mental sets" weaken during the incubation period. Such "beneficial forgetting" facilitates fresh starts or "set shifting" when the problem is taken up again after the incubation period. As well as decay and interference, misleading approaches may conceivably be weakened through inhibition as proposed in the theory of retrieval-induced forgetting (Anderson et al., 1994; Storm and Angello, 2010). Segal (2004) proposed a variant (known as "Fresh Look") in which simply switching attention away from the main task allowed a new start, with no forgetting or unconscious work proposed. The Fresh Look view does not predict effects of Immediate Incubation because, in that condition, there is insufficient opportunity for sets or fixations to develop that need to be forgotten to enable later progress.

# Unconscious Work

On this account incubation effects involve active, but unconscious, or intuitive processing. The term "unconscious work" seems to first appear in the problem solving literature in Poincaré's (1910) paper (p. 328). Related phrases such as "non-conscious idea generation" (Snyder et al., 2004) and "unconscious thought" (Dijksterhuis and Nordgren, 2006; Ritter and Dijksterhuis, 2014) are also used in the literature, but I will use the phrase "unconscious work" throughout the present paper.

Theoretically, what form might unconscious work take? For example, could unconscious work be exactly like conscious work, but with just one difference, namely that it is carried out without any conscious awareness? Or is unconscious work better thought of as some form of automatic spreading activation along associative links, as against a conscious rule or strategy governed activity? Wallas (1926) proposed the idea of spreading "associative chains" as being active during incubation, which can be seen as anticipating modern ideas of spreading activation. Poincaré (1910) argued for quite specific mechanisms of automatic idea generation and selection tailored to his domain of interest which was mathematical creation. Both Poincaré and Wallas argued that the suddenness of Illumination or Inspiration coupled with the feeling of confidence in the sudden insight arose from prolonged unconscious work. Wallas's analysis is often labeled as a Four Stage theory, incorporating Preparation, Incubation, Illumination, and Verification, but he also proposed a sub-stage of Illumination which he dubbed "Intimation" (Wallas, 1926, p. 97). This sub-stage is often overlooked in discussions of Wallas's analysis, although Wallas considered it was important, practically and theoretically (see also, Sadler-Smith, 2015, for an extended discussion of Intimation in Wallas's model). Intimation is the moment at the very start of the Illumination period when the solver becomes aware that a flash of success is imminent. Theoretically, Wallas saw Intimation as reflecting increasing activation of a successful association train which was about to become conscious. Thus, Intimation was consistent with the view that Incubation involved unconscious work. Practically, Wallas felt it was important that the solver recognize the Intimation feeling and desist from distracting activities to allow the solution to continue rising into consciousness. 
Overall, unconscious work has long been favored as a possible explanation of incubation effects. The question of what specific processes might be involved in unconscious work will be considered further in the Theoretical Discussion section.

The possible mechanisms indicated above are not mutually exclusive (or exhaustive). Delayed Incubation could involve all three suggested mechanisms, with some intermittent conscious work taking place when attention strays from the distracting task during the incubation period and with some beneficial forgetting and unconscious work also occurring when the solver is consciously attending to the distracting incubation task. However, a beneficial effect of Immediate Incubation would not be consistent with the Beneficial Forgetting hypothesis, in that there is no time in the Immediate paradigm for sets or misleading directions to be established, but the Immediate paradigm would permit some intermittent conscious work and/or some unconscious work.

# THEORIES OF INCUBATION: EMPIRICAL EVIDENCE

# Intermittent Work

As a check for intermittent conscious work during an incubation period, performance on the interpolated task during incubation should be compared with performance by a control group using the interpolated task as a stand-alone activity. Impaired interpolated task performance during incubation would be consistent with the hypothesis of some conscious work on the target task during incubation. The argument here is that intermittent conscious work represents a diversion of resources away from the interpolated task, which should impair performance on the interpolated task. Although this may seem a basic methodological check for intermittent conscious work, it does not appear to have been carried out (Sio and Ormerod, 2009; Dodds et al., 2012) until quite recently. In particular, Gilhooly et al. (2012, 2015) incorporated suitable checks for intermittent conscious work on a target divergent thinking task during the incubation period. In an experiment involving delayed and immediate incubation and two different interpolated activities (Gilhooly et al., 2012), there was no evidence of impairment to the interpolated incubation period tasks (which were mental rotations and anagram solving) as a result of the tasks being carried out during incubation periods, as against being carried out as stand-alone tasks in control conditions. These studies also found positive incubation effects, despite a lack of evidence for intermittent conscious work. If anything, the trends in the data were the opposite of those that would be predicted by the intermittent work hypothesis. Mental rotation and anagrams were somewhat (but not significantly) facilitated by being carried out as distractor tasks during incubation. None of the one tail predictions of the intermittent conscious work hypothesis were upheld. An additional analysis examined the correlations between performance scores on the interpolated tasks and post-incubation scores on the target, divergent thinking task. The Intermittent Work Hypothesis would predict negative correlations in that the more attention given to the interpolated task, the better the interpolated task scores would be, and the worse would be the target task scores.
Over eight Pearson correlations examined, two were negative and six positive; the average Pearson correlation between target task and interpolated task performance measures was 0.11. Only one correlation was significant (r = 0.36, p < 0.05, two tail) and this was in the direction opposite to that predicted by the Intermittent Work Hypothesis. This analysis of correlations between interpolated task and target task performance measures thus did not support the Intermittent Work hypothesis. A later study (Gilhooly et al., 2015) using a target divergent thinking task and mental rotations as the interpolated task in a delayed incubation paradigm, also found no impairment in the interpolated task relative to controls. Indeed, mental rotations were significantly better performed as an interpolated task as against as a stand-alone task, contrary to the Intermittent Work Hypothesis.
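The logic of this correlational check can be sketched in a few lines of Python. This is a minimal illustration, assuming invented scores for six hypothetical participants; the function names are not from the cited studies, and no attempt is made to reproduce the reported values.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

def predicted_by_intermittent_work(r):
    """The Intermittent Work Hypothesis predicts a negative correlation."""
    return r < 0

# Invented scores for six hypothetical participants; illustration only.
interpolated_task_scores = [12, 15, 9, 20, 17, 11]
target_task_scores = [7, 9, 6, 10, 8, 8]

r = pearson_r(interpolated_task_scores, target_task_scores)
print(f"r = {r:.2f}; direction predicted by Intermittent Work: "
      f"{predicted_by_intermittent_work(r)}")
```

A positive r, as in most of the correlations reported above, runs against the trade-off between interpolated and target task performance that the hypothesis requires.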

In a related study, Baird et al. (2012), using thought monitoring techniques, found that frequency of target task related intermittent thoughts during incubation was not related to quality of performance after the incubation period. So, it seems that even if intermittent thoughts about the target task occurred they were ineffective and did not explain the beneficial effects of incubation. In conclusion, from Baird et al. (2012) and Gilhooly et al. (2012, 2015), it seems safe to rule out the Intermittent Work explanation of incubation effects.

# Beneficial Forgetting

On this view, solvers often develop initial approaches that are misleading and become fixated on these approaches. A break allows such tendencies to become weaker and so a fresh start is possible when the problem is resumed after an incubation break.

Smith (1995) investigated this possibility using word problems presented either with helpful or with misleading cues. After failures to solve, participants were given breaks of varying lengths and then on returning to the task tried to recall the cues and to solve. In the case of misleading cues, participants were more likely to solve when they had forgotten the cues and likelihood of forgetting increased with length of the break. The results thus supported the idea that beneficial forgetting of misleading information could be a factor underlying incubation effects.

Segal (2004) examined a variant of the Beneficial Forgetting approach which may be labeled the Fresh Look hypothesis. On this variant, simply switching attention from the target task is enough and length of break is not important. His study involved a spatial insight problem, in which a square has a parallelogram superimposed on it and the task is to find the sum of the areas of the two shapes. The problem is made easier when the solver realizes that the shapes can be restructured as two equal sided right angle triangles which, if slid, form a rectangle whose area is easily calculated. Participants engaged in this target task until they felt they were experiencing an impasse.

After impasse, participants were given 4 or 12 min on either a demanding verbal task (crossword) or undemanding task (browsing through newspapers) and then returned to the main task for up to 6 more min.

Results indicated significant benefits for incubation break vs. no break, but no effects for length of break or for the demandingness of the activity during the break. Segal argued that these results were consistent with the Fresh Look view, that simply removing attention from the target task was sufficient and that it was not important what was done in the incubation period or how long it was. This study thus supports a role for attentional shifting as a mechanism for Delayed Incubation. Together, Smith (1995) and Segal (2004) are consistent with a role for Beneficial Forgetting in the Delayed Incubation paradigm.

# Unconscious Work

In contrast to Smith (1995) and Segal (2004), Dijksterhuis and Meurs (2006) argued that in the Immediate Incubation paradigm, the Beneficial Forgetting approach may be ruled out as there is no period of initial work in which misleading fixations and sets could be developed. Thus, if Immediate Incubation is shown to be effective, the unconscious work hypothesis must remain in contention for Immediate Incubation effects at least and would also be a candidate explanation as one possible mechanism for Delayed Incubation. Dijksterhuis and Meurs (2006) took the beneficial effects of the Immediate Incubation paradigm on a divergent task in their Experiment 3 as support for the role of unconscious work in incubation. However, as already mentioned, the task in this study did not clearly meet the usual criteria for a creative task and the scoring did not distinguish infrequent from genuinely novel responses. Hence, this study did not unequivocally address creative thinking as against free recall of possibly rare but previously experienced events from episodic and semantic memory.

Gilhooly et al. (2012) using explicit instructions to generate novel responses did find that both delayed and immediate incubation were effective in the Alternative Uses task and that immediate incubation produced more facilitation than delayed incubation. These results were consistent with a role for unconscious work in divergent thinking, particularly for Immediate Incubation, to which the Beneficial Forgetting approach is not applicable.

Snyder et al. (2004) investigated the role of unconscious work in the Delayed Incubation paradigm using a surprise return to the target task. In this case, beneficial effects of incubation emerged, consistent with the view that an automatic, unconscious continuation of work may have occurred after the task was set aside. We may note that Snyder et al.'s (2004) task simply required production of uses for a piece of paper, as against novel uses. Thus, this study did not necessarily require creative thinking as against recall of previously known uses.

The interpolated tasks used by Segal (2004) and by Dijksterhuis and Meurs (2006) were different in modality from the main tasks. Segal's main task was spatial while the interpolated tasks were verbal, and Dijksterhuis and Meurs's study showed the opposite pattern in that their target task was verbal but the interpolated task was spatial. The similarity–dissimilarity relationship between target and interpolated tasks could be important theoretically, as the main competing hypotheses suggest different effects of similarity between target and interpolated tasks. If unconscious work is the main process, then interpolated tasks similar to the target task should interfere with any unconscious work using the same mental resources and so lead to weaker (or even reversed) incubation effects when compared with effects of dissimilar interpolated tasks. The unconscious work hypothesis suggests that when it comes to incubation it would be helpful to "do something different" from the target task. On the other hand, a forgetting account would suggest that interpolated tasks similar to the target task would cause greater interference, which would lead to more forgetting of misleading approaches and thus enhanced incubation benefits.

Helie et al. (2008) explored the effects of different interpolated tasks on the reminiscence paradigm in free recall. This is relevant to our present concerns because the reminiscence paradigm is analogous to incubation, in that an initial free recall is followed by interpolated tasks for a set period and then the same free recall is attempted a second time. The reminiscence score is the number of items recalled on re-test that were not recalled on the initial free recall. Helie et al. (2008) found that the more executively demanding the interpolated tasks were, the lower were the reminiscence scores for picture recall. These results fitted well with Helie and Sun's (2010) Explicit–Implicit Interaction model, which envisages unconscious implicit processes running in parallel with conscious explicit processes. Helie et al.'s (2008) result is consistent with the Unconscious Work hypothesis for incubation in that more demanding interpolated tasks will leave fewer resources available for unconscious work. However, Helie et al.'s (2008) focus was free recall from episodic memory rather than creative thinking, which requires novel combinations, and so, although suggestive and consistent with Unconscious Work, this result does not directly address creative thinking, which is the focus of the present paper.
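The reminiscence score defined above amounts to a set difference, as the following Python sketch shows. The recalled items are invented for illustration and the function name is hypothetical; the sketch follows only the verbal definition given in the text, not any scoring procedure reported by Helie et al. (2008).

```python
def reminiscence_score(first_recall, second_recall):
    """Count items recalled on re-test that were absent from the initial recall."""
    return len(set(second_recall) - set(first_recall))

# Invented recall protocols for one hypothetical participant.
first_recall = ["dog", "tree", "cup"]
second_recall = ["dog", "cup", "boat", "lamp"]

score = reminiscence_score(first_recall, second_recall)
print("reminiscence score:", score)  # "boat" and "lamp" are newly recalled
```

Note that items lost between the two recalls ("tree" in this example) do not enter the score, which counts only new recoveries.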

Ellwood et al. (2009) found a beneficial effect on number of responses post-incubation of a dissimilar interpolated task in a Delayed Incubation experiment. However, this study used a fluency of uses task rather than a novel uses task. Also, as Ellwood et al. (2009) pointed out, although their findings are consistent with an explanation in terms of unconscious work, an explanation in terms of selective relief of fatigue could also be invoked to account for the effects of similarity between incubation and target tasks. On this view, for example, a spatial Delayed Incubation task very different from a main verbal task could facilitate more recovery from fatigue specific to verbal processing than might an interpolated verbal task. Gilhooly et al. (2013) included tests of the effects of the similarity between incubation and target tasks in an Immediate Incubation paradigm, so that fatigue as an explanation could be examined. The Gilhooly et al. (2013) study factorially varied incubation activities (verbal – anagram solving vs. spatial – mental rotations), used either a clearly creative verbal divergent task (alternate uses) or a clearly spatial divergent task (mental synthesis), and both divergent tasks were scored for novelty as well as fluency. Significant incubation effects were found, but of most interest were the interactions, in that spatial incubation benefitted verbal divergent thinking more than did verbal incubation activity and verbal incubation activity benefitted spatial divergent thinking more than did spatial incubation activity. These results supported a role for unconscious work during incubation periods in creative thinking tasks and did not support the hypotheses that incubation effects are due to Beneficial Forgetting or attention shifting. The Beneficial Forgetting account predicted the opposite pattern of facilitation (i.e., that similar incubation and target tasks would be more beneficial than different modality incubation and target tasks).

# THEORETICAL DISCUSSION

From recent research discussed above relating to the three main explanations for incubation effects, viz., Unconscious Work, Intermittent Work, and Beneficial Forgetting, it seems that given the effectiveness of Immediate Incubation, in which sets are unlikely to have been developed, the Beneficial Forgetting hypothesis can be ruled out for immediate incubation at least. In addition, Gilhooly et al. (2012, 2015) found no support for the idea of Intermittent Work, from studies in which suitable control conditions were included. Unconscious Work thus remains as the best candidate explanation for the effects of Immediate Incubation periods and it handles the effects of similarity between incubation and target task (Gilhooly et al., 2013). Gilhooly et al. (2013) found that Delayed Incubation was beneficial, but less so than Immediate Incubation in a divergent thinking task (Alternative Uses).

It could be that in Delayed Incubation, sets do build up during the initial period of conscious work, and are then reduced by Beneficial Forgetting, after which useful unconscious work could come into play. In contrast, with Immediate Incubation, there are no sets to be overcome and beneficial unconscious work can start sooner than in the Delayed paradigm leading to better performance than with Delayed incubation. Overall, however, the Unconscious Work hypothesis is in contention for both Delayed and Immediate Incubation.

However, the question still arises of what processes might be involved in unconscious work. Could unconscious work processes be identical to conscious work processes, with the sole difference that they are executed without conscious awareness? This issue will now be addressed.

# Unconscious Work?

Conscious work is generally rule or strategy governed. Could unconscious work also be rule governed? Poincaré (1910, p. 329) considered the possibility of a "subliminal self" that worked in the same way as the conscious self, but without consciousness, and might even be a superior "self" since it could find solutions that evaded the conscious mind. Kounios and Beeman (2015) illustrate this notion of a subliminal self by supposing that a man has the job of solving long anagrams during office hours. Suppose the person concerned works systematically all day, on the day shift, from 9 am to 5 pm, trying to solve, say, "iaiaeiaeiiamsnrtnmhslbtssdtn," but when he leaves at 5 pm it is still not solved. Another worker takes over and continues the systematic search on the night shift, from where the first worker left off. At 7 pm the night shift worker phones through to the day shift worker with the answer (cf., insight), saying "It's 'antidisestablishmentarianism'!" In this example, the second shift worker represents the unconscious and works in just the same way, using systematic search, as the day shift worker; but the day shift worker is not aware of the night shift worker's activities until the answer is phoned through.

To explore further the idea that unconscious work might be a subliminal version of conscious work, let us consider conscious processing in the Alternate Uses task. This was addressed in a think aloud study of the Brick Uses task by Gilhooly et al. (2007), in which it was found that participants used strategies, such as scanning the target object's properties ("Bricks are heavy") and using the retrieved properties to cue and infer uses from semantic memory ("Heavy objects can hold down things like sheets, rugs, tarpaulin and so on, so a heavy brick could do those things too"). Could unconscious work essentially duplicate this form of conscious work but with no awareness? As we have argued previously (Gilhooly et al., 2012, p. 976):

"The standard view in cognitive science is (a) that mental contents vary in activation levels, (b) that above some high activation level mental contents become available to consciousness, (c) that we are conscious of only a limited number of highly activated mental elements at any one time (that is, the contents of working memory) and (d) that strategy or rule based processing, as found in Gilhooly et al.'s (2012) think aloud study, requires such highly activated (conscious) material as inputs and generates highly activated (conscious) outputs."

On the standard view then, conscious work requires the highly activated contents of working memory and highly activated material is necessarily in consciousness. Overall, it seems impossible that unconscious processes could really be exactly like conscious processes in every respect except that of being conscious. For example, using the rules of arithmetic and temporary working memory storage processes to multiply two 3 digit numbers (e.g., 364 × 279 = ?) is surely impossible without highly activated representations in working memory of the numbers, goals, and intermediate results. The short term representations involved in mental arithmetic would seem to be necessarily conscious. It seems impossible to carry out unconscious multiplication of two or three digit numbers. (With practice of course, one can learn and store three digit multiplication results in long term memory which can be directly retrieved by a type of unconscious process. However, this is not mental multiplication). Poincaré (1910, p. 334) made a very similar point when he wrote "It never happens that the unconscious work gives us the result of a somewhat long calculation all made, where we only have to apply fixed rules." In conclusion, the idea that unconscious work or thought processes could be just the same as conscious work processes, with the sole difference that they lack awareness of any mental content, seems unlikely.
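The bookkeeping that such a calculation demands can be made explicit with a short Python sketch of long multiplication for the example above. It is offered only as an illustration of how many exact intermediate values must be kept available at once, not as a model of mental arithmetic; the function name is hypothetical.

```python
def partial_products(a, b):
    """Partial products of a x b, one per digit of b, least significant digit first."""
    products = []
    for place, digit_char in enumerate(reversed(str(b))):
        products.append(a * int(digit_char) * 10 ** place)
    return products

parts = partial_products(364, 279)  # 364 x 9, 364 x 70, 364 x 200

running_total = 0
for part in parts:
    # Each exact partial product and the running total must be held until used.
    running_total += part
    print(f"partial product {part:>6}  running total {running_total:>6}")

print("364 x 279 =", running_total)
```

Even this compact procedure needs three exact partial products (3276, 25480, and 72800) and two intermediate sums before the answer, 101556, is reached, and each partial product itself rests on several single digit retrievals and carries.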

However, a challenge to this conclusion has been recently put forward by Hassin (2013), who argues in favor of what he labels a "Yes, It Can" (YIC) principle. According to YIC, unconscious processes can perform the same fundamental, high level functions that conscious processes perform. It would be generally accepted that the elementary (fundamental?) component processes in carrying out 364 × 279 = ? are unconscious (e.g., the first step of 364 × 279 is likely to be 9 × 4 = 36, which involves a direct retrieval process that occurs without conscious concomitants, in adults practiced in basic multiplication at least), and many such steps and processes are needed. Yet precise results need to be held in working memory and precise goals need to be formulated in an organized way (executive processes), all of which seems impossible without mental contents activated to conscious levels. Hassin cites some experiments (Sklar et al., 2012) which appear to show priming in subliminally presented additions and subtractions involving two and even three digits. However, these are far from the long calculations with intermediate results that Poincaré discussed as difficult for the subliminal self. Exact calculations cannot realistically be made purely by priming, which would activate associatively related numbers and not just the correct ones which are needed at every step of a long calculation if it is to be successful. Similar points apply to all types of problem solving which require multiple steps to be carried out and multiple intermediate results to be held along the way between presentation and solution.

Assuming unconscious work cannot actually be just the same in terms of processing steps as conscious work, of what then, might unconscious work consist?

Poincaré (1910, p. 333) drew on Epicurus's (341–270 BC) ancient-world theory of atoms as having hooks so that these elementary building blocks of nature could combine with each other. He imagined ideas like hooked atoms hanging on a wall before relevant ideas/atoms are set in motion during Preparation and continue in motion during Incubation. As with molecules of a gas in a container, the atoms/ideas collide at random and sometimes the hooks snag and a new combination is formed. The atoms initially set in motion can strike atoms at rest and may combine with them. This would represent initial ideas being combined with new ideas so that the products of random combination would always have some relation to the starting conditions of the problem.

Campbell (1960) drew on a range of precursors of his view who had stressed the role of extensive trial-and-error in creative work (Bain, 1874; James, 1880), and he was strongly influenced by Poincaré (1910). Campbell argued that creative problem solving involves a quasi-random generation of associations between mental elements ("Blind Variation") to produce novel combinations of ideas, some of which may be useful and so be subject to Selective Retention. This approach draws an analogy with biological evolution, in which random changes in genetic material lead to changes in organisms, some of which are useful and hence retained by natural selection. Similarly, it is argued that ideas are modified in creative problem solving in ways which are blind to the final solution and only by chance lead ultimately to modifications that solve the problem and are retained for future use. Campbell (1960) quoted extensively from Poincaré's (1910) account of creative thinking in mathematics as involving extensive quasi-random search, although Campbell did not stress any special role for unconscious processing. His concern was very much with the role of blind trial-and-error, whether carried out at a conscious or an unconscious level. It could be argued that Campbell saw productive conscious creative thinking as like the unconscious work proposed by Poincaré (1910).
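The logic of Blind Variation and Selective Retention can be illustrated with a toy sketch. It assumes that ideas are pairs of "mental elements" drawn at random and that usefulness is judged only after a combination has been formed; the function name, the pairwise combination rule, and the `is_useful` test are hypothetical simplifications rather than Campbell's own formalization.

```python
import random


def blind_variation_selective_retention(elements, is_useful, trials, seed=0):
    """Generate random pairings of elements and retain those that pass a usefulness test."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    retained = []
    for _ in range(trials):
        # Blind variation: the pairing is drawn without reference to the goal.
        combo = tuple(sorted(rng.sample(elements, 2)))
        # Selective retention: usefulness is assessed only after generation.
        if is_useful(combo) and combo not in retained:
            retained.append(combo)
    return retained


ideas = ["a", "b", "c", "d"]
kept = blind_variation_selective_retention(
    ideas, is_useful=lambda combo: combo == ("a", "d"), trials=500, seed=1
)
```

The generator never consults `is_useful` when forming a combination, which is the sense in which variation is "blind" in this sketch.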

Simonton (1995, 2003) developed Campbell's ideas and used the notion of "mental elements" which are similar to Poincaré's (1910) "hooked atoms." However, unlike Campbell, Simonton stresses the role of unconscious processes which lead to new combinations, some of which are retained and selected to enter consciousness on the basis of their "stability."

In terms of current approaches to cognitive processing, how might novel combinations come about? Parallel spreading activation processes in a semantic network could lead to remote and unusual associations (Jung-Beeman et al., 2004). One specific proposal is Helie and Sun's (2010) Explicit–Implicit Interaction model. In this model, incubation is regarded as involving unconscious, implicit, stochastic associative processes that demand little attentional capacity, in contrast with conscious, explicit, rule-governed, attentionally demanding processes that run in parallel. In this model, activation spreading through implicit networks during incubation periods leads to novel associations which could facilitate later work when conscious processing resumes and the explicit-level processes and knowledge interact with the implicit-level processes and knowledge. The model does not seem to deal with incubation leading to a breakthrough of solutions into consciousness without an explicit return to the task. According to Dijksterhuis and Nordgren's (2006) Unconscious Thought Theory (UTT), unconscious thought, or work, is parallel, bottom-up, inexact, and divergent, whereas conscious thought is serial, exact, and convergent. Thus, the characteristics of unconscious thought as envisaged by UTT are consistent with incubation effects.

Overall, there is general agreement among many theorists that unconscious thinking, or unconscious work, in the form of implicit associative processes involving spreading activation [similar to Wallas's (1926) concept of "associative trains"], is a possible explanation of incubation effects.

How might the suddenness of inspiration be explained? Both Poincaré and Wallas saw this feature of creative thinking as indicative of prolonged unconscious work that found a solution and delivered it to consciousness. However, here Poincaré identified a problem for the unconscious work account. How did the good idea become selected for promotion to consciousness? Poincaré was focussed on mathematical creation, and he proposed that in this domain selection was based on the mathematician's special intuitive sensibility to beauty in mathematics and, further, that the subliminal self possessed this intuitive sensibility. Poincaré's theory, as stated in the 1910 paper, is narrow in solely addressing mathematical creation; generalization to other fields, such as poetry, music, physics, and so on, would require specific intuitive sensibilities to be proposed for those fields. An alternative possibility that has general applicability is that when a problem is set aside, a goal representation remains active for extended time periods, although below the threshold for consciousness. The active goal representation would tend to boost activation flow into associated solution-relevant paths, and when a solution combination of associations or a single relevant association became active, the solution and the goal representations would mutually activate each other in a positive feedback loop, leading both to become conscious as their activations pass threshold levels. It is suggested that this rising activation (or "rising train of association," as Wallas put it) is experienced as Intimation. The present account has the benefit of automaticity and is parsimonious in not requiring special sensibilities to be invoked. The sub-threshold but active goal representation automatically does the work of selecting promising solution-relevant associations.
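The proposed positive feedback loop between a sub-threshold goal representation and a newly active solution can be sketched as a two-unit toy simulation. All parameter values (`goal`, `spark`, `coupling`, `threshold`, `found_at`) and the linear update rule are illustrative assumptions; the account in the text is verbal and does not specify them.

```python
def simulate_intimation(goal=0.4, spark=0.1, coupling=0.5, threshold=1.0,
                        found_at=5, max_steps=50):
    """Return (step at which both units cross threshold or None, activation trace)."""
    solution = 0.0
    trace = []
    for step in range(max_steps):
        if step == found_at:
            # A solution-relevant association becomes weakly active.
            solution = spark
        if solution > 0:
            # Mutual excitation: each unit is boosted in proportion to the other.
            goal, solution = goal + coupling * solution, solution + coupling * goal
        trace.append((step, goal, solution))
        if goal >= threshold and solution >= threshold:
            return step, trace
    return None, trace


crossed_at, trace = simulate_intimation()
# With these assumed values the goal stays at 0.4 until step 5, then both
# activations rise together and cross the threshold at step 8.
```

In this sketch the goal unit stays below threshold indefinitely when no relevant association is found, and the short run of rising activation before threshold is crossed corresponds to what the text calls Intimation.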

# CONCLUDING COMMENTS AND LIMITATIONS

Overall, it can be concluded that the field, although still acknowledging the pioneering work of Poincaré and Wallas, has made considerable progress. The existence of incubation as a beneficial stage in creative thinking has been established through a large number of empirical studies (Sio and Ormerod, 2009), so that the field does not depend on potentially unreliable introspective accounts. New paradigms, such as Immediate Incubation, have been established and have helped justify a role for implicit Unconscious Work. Theoretical ideas have been sharpened and refined, and the joint effects of spreading activation and subconscious goal activation provide a candidate explanation for insight or intuitive solutions following incubation. The approach put forward here, in terms of spreading activation and goal representations, is most applicable to relatively small-scale but knowledge-rich problems such as divergent thinking tasks. Further work is needed to develop the present approach for knowledge-lean problems, such as laboratory insight problems, on the one hand, and for larger-scale real-life problems on the other hand.

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

# FUNDING

This paper is based on research funded by grants from UK Economic and Social Research Council (RES-000-22-2191) and Leverhulme Trust (F008281G) to KG.

# REFERENCES

Smith, S. M. (1995). "Getting into and out of mental ruts: a theory of fixation, incubation, and insight," in The Nature of Insight, eds R. J. Sternberg and J. E. Davidson (Cambridge, MA: MIT Press), 121–149.

Snyder, A., Mitchell, J., Ellwood, S., Yates, A., and Pallier, G. (2004). Nonconscious idea generation. Psychol. Rep. 94, 1325–1330. doi: 10.2466/pr0.94.3c.1325-1330

Storm, B. C., and Angello, G. (2010). Overcoming fixation: creative problem solving and retrieval-induced forgetting. Psychol. Sci. 21, 1263–1265.

Strick, M., Dijksterhuis, A., Bos, M. W., Sjoerdsma, A., Van Baaren, R. B., and Nordgren, L. F. (2011). A meta-analysis on unconscious thought effects. Soc. Cogn. 29, 738–762. doi: 10.1521/soco.2011.29.6.738

Varendonck, J. (1921). The Psychology of Daydreams. New York, NY: Macmillan.

Wallas, G. (1926). The Art of Thought. New York, NY: Harcourt Brace.

Weisberg, R. W. (2006). Creativity: Understanding Innovation in Problem Solving, Science, Invention, and the Arts. New York, NY: J. Wiley & Sons.

Woodworth, R. (1938). Experimental Psychology. New York, NY: Holt.

Zhong, C.-B., Dijksterhuis, A., and Galinsky, A. D. (2008). The merits of unconscious thought in creativity. Psychol. Sci. 19, 912–918. doi: 10.1111/j.1467-9280.2008.02176.x

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Gilhooly. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Role of Intuition in the Generation and Evaluation Stages of Creativity

#### Judit Pétervári<sup>1</sup> \*, Magda Osman<sup>1</sup> and Joydeep Bhattacharya<sup>2</sup>

<sup>1</sup> Biological and Experimental Psychology, School of Biological and Chemical Sciences, Queen Mary University of London, London, UK, <sup>2</sup> Department of Psychology, Goldsmiths, University of London, London, UK

Both intuition and creativity are associated with knowledge creation, yet a clear link between them has not been adequately established. First, the available empirical evidence for an underlying relationship between intuition and creativity is sparse in nature. Further, this evidence is arguable as the concepts are diversely operationalized and the measures adopted are often not validated sufficiently. Combined, these issues make the findings from various studies examining the link between intuition and creativity difficult to replicate. Nevertheless, the role of intuition in creativity should not be neglected as it is often reported to be a core component of the idea generation process, which in conjunction with idea evaluation are crucial phases of creative cognition. We review the prior research findings in respect of idea generation and idea evaluation from the view that intuition can be construed as the gradual accumulation of cues to coherence. Thus, we summarize the literature on what role intuitive processes play in the main stages of the creative problem-solving process and outline a conceptual framework of the interaction between intuition and creativity. Finally, we discuss the main challenges of measuring intuition as well as possible directions for future research.

#### Edited by:

Kirsten G. Volz, University of Tübingen, Germany

#### Reviewed by:

Haiyan Geng, Peking University, China

Verena Nitsch, Bundeswehr University Munich, Germany

> \*Correspondence: Judit Pétervári j.petervari@qmul.ac.uk

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 13 June 2016; Accepted: 05 September 2016; Published: 20 September 2016

#### Citation:

Pétervári J, Osman M and Bhattacharya J (2016) The Role of Intuition in the Generation and Evaluation Stages of Creativity. Front. Psychol. 7:1420. doi: 10.3389/fpsyg.2016.01420

Keywords: idea generation, evaluation, creativity, intuitive judgment, intuition

# INTRODUCTION

Celebrated mathematicians, scientists, and painters alike often credit intuition as part of the creative process behind their discoveries (e.g., Hadamard, 1954; Gardner and Nemirovsky, 1991; Miller, 2000). For example, intuition was described as being at the core of the creative visions of Steve Jobs, one of the foremost creative professionals in recent history (Isaacson, 2011). Yet despite this seemingly obvious connection between intuition and creativity, Dane and Pratt (2007), in their influential article, noted that "with the exception of a few studies (e.g., Raidl and Lubart, 2001), little empirical research has connected intuition to creativity" (p. 48–49), and this has been echoed by other researchers as well (Sinclair, 2010; Dörfler and Ackermann, 2012). In this article, we propose that, although a strong conclusion cannot yet be drawn, there are good conceptual grounds for proposing a link between the two, and promising evidence to suggest that intuition and creativity are linked, at least on a minimal level.

The principal aim of the present review is to explore the potential link between intuition and creativity in a process-centric framework, in order to consider how intuition would be implicated in different phases of creative problem-solving. By intuition, we refer to its traditional characterization (Hogarth, 2001; Sadler-Smith, 2008; Dörfler and Ackermann, 2012), which treats the process as one which is rapid (also labeled as instantaneous), spontaneous (does not require extensive effort and cannot be voluntarily controlled), and alogical (does not necessarily follow the rules of logic). Further, the outcomes generated from the intuitive process are generally holistic (also labeled as Gestalt, as it is mainly concerned with the whole situation instead of its parts), tacit (the intuitive process cannot be verbalized or articulated in sufficient detail), and made with high confidence. When a problem is complex and multidimensional, and no pre-established, clearly defined rules are available for solving it, a solution (i.e., a novel idea) is often based on the problem solver's judgment of what is an appropriate solution in the absence of any clear, reasoned path. It is the contrast with developing a solution in a linear, logical manner that makes idea generation characteristically intuitive and the idea itself opaque and inaccessible to the problem solver. Before establishing how intuition slots into different stages of the creative process, we first attempt to establish our conceptualization of creativity.

# Creative Problem-Solving Process

Creativity is a multifaceted construct and notoriously difficult to capture by a single definition (Runco and Jaeger, 2012). We conceptualize creativity as a process that is broadly similar to problem solving: in both, information is coordinated toward reaching a specific goal (Wiggins and Bhattacharya, 2014), and the information is organized in a novel, unexpected way. For instance, Plucker et al. (2004) define creativity as "the interaction among aptitude, process, and environment by which an individual or group produces a perceptible product that is both novel and useful as defined within a social context" (p. 90). Problems which require creative solutions are ill-defined, primarily because there are multiple hypothetical solutions that would satisfy the goals (Reitman, 1965). Therefore, embarking on a solution to an ill-defined problem requires the problem solver to frame and interpret what might be relevant as a possible goal and then to establish a solution that meets that goal (Hayes, 1989; Mumford et al., 1994).

For a creative problem, an original solution is often unthinkable in advance; thus, assessing creative solutions (i.e., creative ideas) occurs in the absence of objective criteria against which a creative product can be measured. As Amabile (1983, p. 359) put it, "current definitions of creativity are conceptual rather than operational; their conceptualizations have not been translated into actual assessment criteria" yet. Due to this "criterion problem," it is difficult to objectively evaluate the extent to which a particular goal is met (Runco and Smith, 1992; Runco and Chand, 1995). Instead, various indirect features are used, which often include, among others, the fluency, flexibility, originality, and elaboration of the solution (Torrance, 1966). It is questionable whether adding up these features into a single score does, in fact, capture creativity, or whether the features should instead serve as the criteria by which the creative problem solver assesses a creative solution (Amabile, 1982).

The features by which a creative product is evaluated typically fall into categories that include novelty, feasibility, relevance, and specificity (Dean et al., 2006). It is here that intuitive judgments have been implicated, in each of the categories related to evaluation. A creative problem solver may intuitively judge how novel the combination of information in the creative product is, intuitively recognize its feasibility and appropriateness, and sense the extent to which it seems like a good fit.

Turning now to the actual composition of the creative problem-solving process, there have been several ways in which this has been described. Most theorists assert that there are several consecutive stages (e.g., blind variation and selective retention model, Campbell, 1960; associative hierarchy theory, Mednick, 1962; three-process theory of creativity, Davidson and Sternberg, 1986; geneplore model, Finke et al., 1992). The number of stages differs by theory, and this is largely dependent on the ways in which theorists describe the critical components of the stages (e.g., preparation, incubation, illumination, and verification by Wallas, 1926; whereas problem formulation, preparation, idea generation, idea evaluation, and idea selection by Amabile, 1983). However, regardless of these variations, researchers agree on two main essential operations of the creative problem-solving process: (1) the generation of ideas and (2) the evaluation and selection of (an) appropriate outcome(s) (e.g., Finke et al., 1992; Lubart, 2001; Reiter-Palmon and Illies, 2004).

Because these two stages are common to all theories of creativity and are relatively uncontroversial, this review focuses on them as central to the creative problem-solving process. However, it is worth noting that the majority of the available literature tends only to investigate creative idea generation rather than idea evaluation (Amabile and Müller, 2009; Rietzschel et al., 2010). A further rationale for focusing exclusively on these two stages is that they can be explicitly related to how creative processes are measured empirically, and that they make it easier to conceptualize where intuition as a process is directly associated with each stage, which we present in our framework in the concluding section of this review. Here we propose that both idea generation and evaluation are critical for shaping the product of the creative process, that the two stages are tightly linked (neither makes sense without the other), and that the creative process is a dynamic one which can involve several iterations of generation and evaluation of ideas that a problem solver goes through before reaching an end state (Runco, 2003; Lonergan et al., 2004; Kozbelt and Durmysheva, 2007).

Regarding the underlying cognitive mechanisms, two antithetical types of thinking, convergent and divergent thinking (Guilford, 1956, 1967), are speculated to underlie both generation and evaluation of ideas in the creative problem-solving process. It has been proposed that problem solvers use convergent thinking for selecting a single (best) solution in response to a well-defined problem by applying standard procedures to existing knowledge. By contrast, divergent thinking can be utilized in more ambiguous situations, where a range of alternative solutions are possible; therefore, responses may vary individually (Cropley, 2006). The popularity of the concept of divergent thinking has meant that for some it has been translated into a measurement tool of creativity itself (Zeng et al., 2011; Kaufman and Baer, 2012), though this approach has been severely criticized (e.g., Dietrich, 2007; Piffer, 2012). Among others, Cropley (2006) reset the balance by noting that both convergent and divergent thinking are necessary for producing creative ideas and that it is not simply contingent on divergent thinking alone.

Thus, to sum up, idea generation and idea evaluation are two essential stages in creative problem solving, and in both stages divergent and convergent thinking are utilized. Yet no theory has provided the specific characteristics of intuition in these phases, despite the speculation that intuitive judgment features throughout the creative process (Dane and Pratt, 2009). We propose here that intuitive judgment can be characterized in both idea generation and idea evaluation, and we spell out in our framework how this is the case.

# Intuition

Reaching a coherent perception of how to proceed toward solving an ill-defined problem is the key goal during both the idea generation and the idea evaluation phases. We now outline how intuition is defined and conceptualized in relation to this key goal. Bowers et al.'s (1990) classical model describes the process of intuition in two stages. In the first, guiding stage, clues (such as words, shapes, voices, odors, etc.) are accumulated from a complex, noisy environment and synthesized into a pattern in a gradual manner, resulting in a vague perception of coherence. If the spreading activation of relevant mnemonic networks exceeds a threshold, the perception of coherence becomes robust enough to enter awareness and results in a reportable hunch or judgment. This is interpreted as the second, integrative stage (see Volz and von Cramon, 2006; Zander et al., 2015 for neuroscientific evidence of this model).

We suggest that a perception of coherence underlies the finding of novel solutions. During the creative process, separate bits of information are acquired gradually. When embarking on a creative problem-solving process, the relevant prior representations/memories get activated from the accumulated prior experiences. These fragments are converted into a new unit that eventually reaches coherence. The novel organized whole (Gestalt) is assembled via associations, in a non-analytic and non-effortful manner. That is, a deliberate elaboration on how a novel product should be constructed would not count as intuitive.

Association-based information processing was found more appropriate than applying explicit algorithms or pre-established rules for solving complex problems by Dijksterhuis and colleagues (Dijksterhuis, 2004; Dijksterhuis and Nordgren, 2006; Dijksterhuis et al., 2006). Keeping in mind the task-specific goal but being distracted from it, coined as "unconscious thought," was affiliated with association-based, bottom-up processing, as well as with a high processing capacity for solving multidimensional problems.

With regard to creativity, association-based processing serves as a good foundation for generating original responses. As noted by Gallate and Keen (2011), using intuition means not pursuing "a consciously deductive path and is, therefore, more likely to be original because it does not build on something that is already 'known'" (p. 686). Taking these claims as a point of departure, the big leaps often found in the creative process might be thought to happen when creative problem solvers are not fixed on the rules of a current paradigm (e.g., setting out to optimize aspects of an already existing structure). Rather, they happen when solutions are generated independently, keeping in mind the desired end state and making individual judgments on how to get there instead of relying on what has been put forward already. Individually tailored responses are more diverse and more likely to converge toward a unique outcome than those building upon existing structures.

Many times, individual, association-based responses must be formed to complete a task-specific goal. Intuitive processes are even categorized based on the domains to which these goals are connected: (1) problem-solving, (2) creativity, and (3) moral judgments (Dane and Pratt, 2009), as well as (4) social judgments (Gore and Sadler-Smith, 2011). As an alternative typology, Glöckner and Witteman (2010) unpack the sub-categories of intuition based on its underlying cognitive mechanisms, i.e., they lay out associative intuition, matching intuition, accumulative intuition and constructive intuition as partly overlapping but differently focused intuitive processes. Glöckner and Witteman's (2010) approach is distinct from the domain-based approach yet still consistent with it, e.g., matching intuition can be easily related to problem-solving intuition, or constructive intuition appears to form part of creative intuition. We consider creative intuition as key to idea generation, and problem-solving intuition as key to idea evaluation.

# The Link between Intuition and Creativity

Although various researchers have reported a close connection between intuition and creativity (e.g., Perkins, 1992; Boden, 1994; Policastro, 1999), precisely how these two constructs are linked has not yet been adequately established. The main reason is the common observation that there is only scarce direct evidence on the particular role of intuition in the creative problem-solving process (e.g., Agor, 1989; Policastro, 1995; Shirley and Langan-Fox, 1996; Dane and Pratt, 2009; Eubanks et al., 2010; Sinclair, 2010; Stierand and Dörfler, 2015); given this lack of evidence, more empirical work is needed (e.g., Raidl and Lubart, 2001; Dollinger et al., 2004; Dane and Pratt, 2007).

As we have proposed earlier, idea generation and evaluation are stages of creative problem solving. They are both found in unstructured and ill-defined problems that have no predefined objective criterion against which to measure the product of the creative process. As mentioned in the previous section, the complication is that stating explicit rules is unworkable when it comes to creating novel and/or original solutions, not least because often there are no objective rules. Thus, we propose that intuitive judgment is an important feature of the creative process, because people often lack insight into how they generated a novel solution and experience surprise; that is, the violation of previous expectations related to the solution is phenomenologically often at the heart of perceiving something as creative (cf. effective surprise, Bruner, 1962; Wiggins and Bhattacharya, 2014). Because there are no objective rules on how to reach a solution to a creative problem, a combinatorial explosion of possible choices occurs (Simon, 1989; Simonton, 2010). Relying on intuition is a common tool for coping with such a complex and noisy environment; somatic signals often guide the early stages of the creative process (Finke et al., 1992; Hodgkinson et al., 2008).

During the integration of information both while looking for novel patterns (idea generation phase) and while assessing them against prior experiences (idea evaluation phase), an internal sensing of which choice alternatives have the most potential can direct attention away from selecting predictable solutions. A creator proposing ideas which rely heavily on previously acquired information is more likely to generate solutions that are predictable, as compared to a creator relying on hunches about unknown, new directions which would more likely lead to surprising solutions (Simonton, 2012). These hunches cannot be well described with words (Sadler-Smith and Shefy, 2004) and are largely different from having a sudden stroke of insight (e.g., Hogarth, 2001; Dane and Pratt, 2007). As insight is often considered a hallmark of creative problem solving, and there are common practices of using these two words in an interchangeable fashion, we note that there are considerable differences between these concepts. In contrast to the aforementioned characteristics of intuition, we propose that gaining an insight means that the problem solver obtains an explicit understanding of how to reach the goal (Lieberman, 2000), and is capable of articulating it too (Dane and Pratt, 2007). While intuitions unfold gradually, "Aha!" moments are experienced in a discontinuous manner (Zander et al., 2015), as if a light bulb is switched on in the problem solver's head (Jung-Beeman et al., 2004; Slepian et al., 2010).

In contrast to the definitiveness of an insight, intuitions are more indefinite. For example, creative intuition is described as "a vague anticipatory perception that orients creative work in a promising direction" (Policastro, 1995, p. 99). What is more, it has been conceptualized as "a tacit form of knowledge that broadly constrains the creative search by setting its preliminary scope" (p. 100), as well as a guide for discovering new ideas and assessing whether the idea is appropriate for a problem (Dollinger et al., 2004). However, creative intuition utilized at the early stages of the creative process seems to be only one side of the coin (Policastro, 1995; Raidl and Lubart, 2001; Dane and Pratt, 2009).

We suggest that not only creative intuition but other types of intuition too are relevant for creativity. Namely, we propose that problem-solving intuition (Dane and Pratt, 2009; Gore and Sadler-Smith, 2011) is employed during the later stages of the creative process. This type of intuition is defined as a "domain-specific, expertise-based response to a tightly-structured problem based on the non-conscious processing of information, activated automatically, eliciting matching of complex patterns of multiple cues against previously acquired prototypes and scripts held in long-term memory" (Gore and Sadler-Smith, 2011, p. 307).

At first sight, the two functions on which our conceptualization of intuition rests can seem contradictory. The contrast is that creative intuition employed during the idea generation phase relies chiefly on synthesis, while problem-solving intuition operating during the evaluation phase is frequently tied to analysis. That is, in the idea generation phase, creative intuition can work as an associative process linking together distinct pieces of stored information and restructuring/combining them into a coherent, task-relevant unit. Akin to constructive intuition (Glöckner and Witteman, 2010), mental representations are constructed based on both current information and traces activated from long-term memory.

In the idea evaluation phase, expertise related to the recognition of novel contributions and judgment regarding whether the product would be perceived as appropriate in a given social context must be drawn upon. Usually, this operation is performed by matching stimuli to already acquired prototypes; however, creative solutions may be special in that they are likely to differ from previous prototypes. In extreme cases, a surprising creation might not fit any existing prototypes, which can also make it difficult to assess its significance in the context in which it was generated. If an idea is unlike the judge's earlier experience, clues to its coherence must be evaluated.

# Reviewing the Evidence on the Link between Intuition and Creativity

Before we go on to lay out our proposed framework, we now consider the extant empirical findings on the link between intuition and creativity. The empirical findings are presented according to the type of research (qualitative/quantitative) and the phase of the creative problem-solving process (idea generation/evaluation) they explore. What follows after the review is a summary of the main difficulties of measuring and assessing the association between intuition and creativity, a recommendation of a way forward based on our new conceptual framework, and possible future research directions that logically follow from it.

# METHODS

# Literature Search

We first performed an extensive search of relevant databases, namely Web of Science, PubMed, PsycINFO, Google Scholar, and Scopus. The search was conducted using the following keywords: creative, creativity, creative evaluation, insight, innovation, and divergent thinking, with the Boolean operator AND linking intuition, intuitive problem solving, and decision-making to them. Through the use of these broader terms, we incorporated studies focused on more specific ideas within these terms, such as idea generation and idea evaluation. Although we did not specifically use idea evaluation, in the wider literature this term is used interchangeably with one of our selected keywords, creative evaluation.

For selecting keywords, we started at baseline terms: creativity and intuition. After conducting a literature search with these, we chose to include additional terms which were both common and could possibly incorporate further relevant studies in our search. Additionally, theses and dissertations were retrieved from the British Library EThOS and from the Open Access Theses and Dissertations databases. The citations of studies were examined in order to obtain further relevant empirical work regarding the link of intuition and creativity.
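The keyword pairing described in this section can be sketched in a few lines of Python. This is only an illustration of how creativity-related and intuition-related terms might be joined with AND; the function names are hypothetical, the OR grouping inside each set of terms is our assumption (the text mentions only the AND operator), and actual query syntax differs between databases.

```python
from itertools import product

# Keyword sets as listed in the Literature Search section.
CREATIVITY_TERMS = [
    "creative", "creativity", "creative evaluation",
    "insight", "innovation", "divergent thinking",
]
INTUITION_TERMS = ["intuition", "intuitive problem solving", "decision-making"]


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def pairwise_queries(left, right):
    """One query per (creativity term, intuition term) pair, joined with AND."""
    return [f"{quote(a)} AND {quote(b)}" for a, b in product(left, right)]


def combined_query(left, right):
    """Single query: (any left term) AND (any right term).

    Grouping alternatives with OR is an assumption made for this sketch.
    """
    left_part = " OR ".join(quote(t) for t in left)
    right_part = " OR ".join(quote(t) for t in right)
    return f"({left_part}) AND ({right_part})"


queries = pairwise_queries(CREATIVITY_TERMS, INTUITION_TERMS)
print(len(queries), "pairwise queries")
print(combined_query(CREATIVITY_TERMS, INTUITION_TERMS))
```

Generating the pairs explicitly makes it easy to record which combination retrieved which study, which helps when documenting how a pool of candidate papers was assembled.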

# Inclusion Criteria


Two criteria were applied for inclusion of studies: the research must (1) be empirical work and (2) take both intuition and creativity into account. Thus, research investigating only intuition or only creativity was not included in this review. Purely phenomenological descriptions and work diaries lacking any qualitative or quantitative analysis were filtered out, as were parapsychological investigations, since they did not fit the scope of the article. Individual testimonies, historical studies, and biographies (e.g., Policastro, 1995) were also not included here. Further, creative performance must have been demonstrated either by professional track record or by completing creative problem-solving tests; studies relying solely on self-report questionnaires to determine creative potential were not considered here. These procedures yielded a pool of 70 potential studies, of which 11 fulfilled all of the aforementioned criteria. **Table 1** includes the list of papers organized by the timeline of the creative process.

# FINDINGS

Studies found in our literature review are presented below according to their relation to the main stages of creativity, i.e., idea generation or idea evaluation.

# Studies on Intuition and Creative Idea Generation

Experts from different domains have been interviewed in order to gain insight into the role of intuition in their idea generation process. Dörfler and Eden (2014) reported the common patterns emerging from face-to-face interviews with 17 Nobel laureates and two Eckert–Mauchly prize winners. Marton et al. (1994) analyzed answers to short, prearranged interview questions in a larger sample drawn from 14 years of footage of the television program "Science and Man," totaling 93 Nobel laureates from physics, chemistry, and medicine. Marton et al. (1994) grouped the reported experiences according to whether intuition was defined as (1) an outcome, (2) an act or event, or (3) a capability. Seventy-two of the 93 respondents expressed a belief that scientific intuition does exist; of those, 28 saw it as a capability, 20 as an act or event, and eight as an outcome, and even these last respondents suggested that it formed part of the starting stage of the creative process. Apart from describing the frequencies of the responses given by the Nobel laureates, Marton et al.'s (1994) study only reflected the scientists' naïve understanding of the issue and was inconclusive about the interpretation of the results with regard to a precise link between intuition and creativity.

In contrast, Dörfler and Eden (2014) analyzed the transcripts of lengthy interviews conducted with a smaller sample (n = 19). They identified three common themes: (1) the role of a "big leap" and how intuition contributes to big scientific discoveries, (2) the significance of having a dual view, i.e., processing information both globally and locally (Dijkstra et al., 2012; Förster, 2012), and (3) the common structure of successful research teams. All of the respondents confirmed that they utilize their intuition during scientific inquiry, even if they avoided using the exact term due to its pejorative connotation. Instead, Dörfler and Eden (2014) treated references to big leaps as situations showing evidence of intuition, "where a step in thinking is made that does not logically follow from a process of analysis; rather the process of analysis follows the big leap and is used to justify the 'big leap'" (p. 5).

There has been some work examining professions connected to artistic creativity, namely the creation of haute cuisine served by fine dining restaurants, and filmmaking. While the aim of Stierand and Dörfler's (2015) study was to find out more about the creative process of turning raw ingredients into delicious dishes, the theme of intuition emerged from their interviews. The in-depth reports from renowned European chefs revealed that they rely on intuition both during the generation and the screening of ideas. The self-reported experiences were classified as either (1) intuitive insight or (2) intuitive judgment (Dörfler and Ackermann, 2012). Intuitive insight was conceptualized as a resource during which chefs mentally combined ingredients and developed a gut feeling about which combination should be tested. The researchers identified the role of intuition as a rapid coupling between the idea generation and the idea evaluation phases, providing feedback loops for the iterative creation process.

With regard to film production, Sinclair (2012) interviewed 47 filmmakers between the ages of 26 and 71 with 8–42 years of domain-related experience, who classified their job as primarily creative (11 directors, three architects, three screenwriters, six directors of photography), primarily technical/operational (12 production managers), or primarily strategic (nine executive producers, two studio directors). The responses recorded in the interviews were clustered into three main categories: (1) intuitive expertise, (2) intuitive creation, and (3) intuitive foresight. The extent to which filmmaking professionals utilized intuition differed according to job specialization. Intuitive creation was demonstrated only by creative film professionals when they approached the story or visualized the set, conceived characters/shots, created (visual) storylines, or gave instructions to actors. Taken together, qualitative studies revealed personal insights regarding the experiences of intuition in the creative process amongst professionals across a variety of sectors. In the main, the common insight appears to be that interviewees spontaneously report that intuition is an essential part of the creative process. Moreover, they rely on their intuitive capacity to find new directions of inquiry leading to discoveries they would not otherwise have made, as well as to judge the success of their creative solutions.

In contrast to qualitative methods, quantitative study designs can capture a larger, albeit non-expert, sample. In practice, the most common approach has been to use psychometric assessments to capture individual differences in intuitive processing in creativity through questionnaires. Because intuition and creativity are heterogeneous concepts whose particular components are likely to be correlated in various ways, Raidl and Lubart's (2001) study involved several measures. As a measure of creativity, they used Torrance's Unusual Uses Test (Torrance, 1966). This involved participants generating as many and as rare uses as possible for a cardboard box. Amabile's (1982) Consensual Assessment Technique was used to assess two further creative production tasks, which involved participants producing a drawing from a set of graphical elements and creating a short story from just a title.

Intuition, on the other hand, was assessed using the Rational-Experiential Inventory (REI, Epstein et al., 1996), in which preferences for rational versus experiential information processing were scored on a Likert scale, and the Intuitive Behavior Questionnaire (IBQ), in which participants faced a problem and selected a solution that could be either intuitive or analytic. In addition, two behavioral measures of intuition were presented. In one of them, participants had to group 8 abstract images in multiple ways, giving a title to each grouping. The responses were analyzed by judges who classified the groupings as either intuitive or analytical. The other involved presenting participants with 10 items taken from the Metaphoric Triads Task (Kogan et al., 1980), each item corresponding to three words or three images which could be associated either via a metaphorical or a functional link. Preference for the metaphorical and not the physical link was counted as an intuitive response.

The results from this battery of tests presented to 76 undergraduate psychology students revealed that IBQ scores correlated with drawing production, and with the fluency and mean originality scores on the Unusual Uses Test. The high intuition group, assessed by the IBQ, scored higher on the creativity measures than the low IBQ group. REI test performance correlated positively with the drawing production task performance, the metaphor preference test performance, and the mean originality score on the Unusual Uses Test.

In a further study by Garfield et al. (2001), intuition was assessed with the most commonly used measure of intuition, the Myers–Briggs Type Indicator (MBTI). The MBTI makes use of binary distinctions of personality types based on the scores of its extraversion–introversion, sensing–intuition, thinking–feeling, and judging–perceiving subscales (Myers and McCaulley, 1985). The MBTI takes Jung's idea that personality types are connected to conscious and unconscious working methods of the mind (1921/1971), and adapts it to assess dimensions of personality, of which the "intuitive type" is one. Myers and McCaulley (1985) conceptualized intuitive types as those that form perceptions which are oriented to the future and concerned with seeing previously undetected patterns.

Garfield et al. (2001) used the MBTI with participants who were trained in either an analytical or an intuitive problem-solving technique (VanGundy, 1988; Couger, 1995). Creativity was measured by the Kirton score (Carne and Kirton, 1982), which categorizes problem solvers as either adaptors or innovators and expects them to come up with paradigm-preserving or paradigm-modifying ideas, respectively, and was manipulated by presenting the 219 participating undergraduate business students with novel or not novel ideas "from others." The group which used the intuitive problem-solving technique came up with more novel and paradigm-modifying ideas than those who used the analytical technique. Also, participants exposed to novel and paradigm-modifying ideas from "others" generated more novel and paradigm-modifying ideas themselves, and vice versa.

The influence of intuition on the idea generation process was also examined by Eubanks et al. (2010). This research aimed to show direct evidence for the link between intuition and creative problem-solving by manipulating affect and level of training, both treated as facilitators for using intuition. Participants' affect was manipulated at the beginning of the experiment by playing music that was designed to induce positive affect in one group, and a neutral experience in the other group. All participants, except the control group, were then trained through instructional exercises to use their intuition to solve a series of creative problems. Participants were classified as being intuitive if they were above the group average in providing correct answers, below the average in solution time, and below the average in utilizing optional additional information for the problems. Training made a strong positive contribution to creative problem-solving performance (measured according to the quality, originality, and elegance of solutions to the problems) in general. When a neutral affect was induced, intuition scores were strongly associated with enhanced creative problem-solving performance. When positive affect was induced, the association between intuition and problem-solving performance was undermined, and positive affect alone did not lead to any creative performance advantage, including in the control group, which received no instructional training.
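The three-part classification rule reported for Eubanks et al. (2010) can be written down as a short Python sketch. The field names (`correct`, `time_s`, `info_requests`) and the sample values are hypothetical and chosen only for illustration; the rule itself (above the group average in correct answers, below it in solution time and in use of optional information) follows the description above, with strict inequalities assumed.

```python
from statistics import mean


def classify_intuitive(participants):
    """Return {participant id: True/False} using group averages as cut-offs.

    A participant counts as intuitive only if all three conditions hold:
    more correct answers than average, shorter solution time than average,
    and fewer requests for optional information than average.
    """
    avg_correct = mean(p["correct"] for p in participants)
    avg_time = mean(p["time_s"] for p in participants)
    avg_info = mean(p["info_requests"] for p in participants)
    return {
        p["id"]: (
            p["correct"] > avg_correct
            and p["time_s"] < avg_time
            and p["info_requests"] < avg_info
        )
        for p in participants
    }


# Hypothetical data, not taken from the study.
sample = [
    {"id": "P1", "correct": 9, "time_s": 40.0, "info_requests": 1},
    {"id": "P2", "correct": 5, "time_s": 70.0, "info_requests": 4},
    {"id": "P3", "correct": 8, "time_s": 65.0, "info_requests": 2},
    {"id": "P4", "correct": 4, "time_s": 45.0, "info_requests": 5},
]

labels = classify_intuitive(sample)
print(labels)
```

In this toy sample only P1 meets all three conditions: P3 is accurate but slower than average, and P4 is fast but inaccurate. This shows that the rule treats intuition as the conjunction of accuracy, speed, and low reliance on extra information.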

These studies have been grouped on the basis that they employed questionnaires to quantify the intuitive and creative abilities of students. All demonstrated a positive association between generating new ideas and relying on intuitive resources, including the production of more novel, higher quality, and more diverse ideas.

# Studies on Intuition and Creative Idea Evaluation

Idea evaluation is a more scarcely used term within the literature, with a few studies combining this concept with idea generation, and even fewer assessing this concept in isolation. We have introduced two studies (Sinclair, 2012; Stierand and Dörfler, 2015) in the previous section which predominantly discuss the concept of idea generation but also include short passages on idea evaluation. Both studies introduce new terminology to describe similar concepts with functional differences. We coordinate these with our framework.

Stierand and Dörfler (2015) introduced intuitive insight and intuitive judgment as mechanisms underlying creative discoveries. Of these, intuitive judgment may be applied in the creative evaluation stage, e.g., deciding the array of dishes on a menu. An additional term introduced by Sinclair (2012), intuitive foresight, can also be connected to idea evaluation. According to her data, both intuitive expertise and intuitive foresight were used by all filmmaking professionals. Intuitive expertise functioned as a way to create unity amongst crew members, whereas intuitive foresight was crucial for making decisions regarding the selection of projects and topics/scripts, and for helping spot talent or market trends.

Two studies we examined focused exclusively on the idea evaluation stage. In the first one (Magnusson et al., 2014), expert judges carried out the evaluations of products. Intuitive idea evaluation was compared with analytical idea evaluation against predefined criteria in the context of developing new products. Clients of a big telecommunications operator were asked to submit their ideas on developing future mobile services. Eighty-three separate ideas were evaluated by four experts (one of whom also provided qualitative data as part of a thinking-out-loud protocol, but due to the limited sample size this data is not reported here). All four judges evaluated each idea first in a holistic manner (intuitively), and then 2 weeks later according to formal criteria (analytically). Intuitive evaluations were made while keeping first a radical and then an incremental market in mind, while analytical evaluations were made according to three formal criteria, namely originality, user value, and producibility. A link between the two techniques was shown with linear regression. The analysis showed that the scores on the three formal criteria predicted approximately 50% of the variance in the holistic evaluations. Furthermore, two innovation indexes (based on Magnusson, 2009) were calculated, with which the best ideas from both the incremental and radical perspectives were selected.
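To make the "approximately 50% of the variance" statement concrete, the following sketch fits an ordinary least squares regression of a holistic score on three criterion scores and reports the coefficient of determination (R²). It is a minimal illustration under stated assumptions: the ratings are synthetic (not Magnusson et al.'s data), the function names are hypothetical, and the noise level is chosen so that R² lands near 0.5 on average; the exact model specification used in the original study is not given here.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def r_squared(predictors, outcome):
    """Fit OLS with an intercept; return the share of outcome variance explained."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(outcome) / len(outcome)
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


# Synthetic ratings for 83 ideas: originality, user value, producibility.
random.seed(0)
criteria = [tuple(random.uniform(1, 10) for _ in range(3)) for _ in range(83)]
# Holistic score = mean of the criteria plus noise of similar spread (assumed).
holistic = [sum(c) / 3 + random.gauss(0, 1.5) for c in criteria]

r2 = r_squared(criteria, holistic)
print(f"R^2 = {r2:.2f}")
```

An R² near 0.5 means that about half of the variation in the holistic scores is unrelated to the three formal criteria, which is the basis for the later argument that intuitive evaluation involves more than a rapid application of those criteria.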

In a similar vein, Eling et al. (2015) also investigated the role of intuitive and analytical evaluation processes during early idea screening by utilizing Dijksterhuis' (2004) research design. Fifty professionals who were qualified in product development were presented with four new product ideas, each consisting of 12 attributes. After briefly reading one of the new product ideas, participants could either perform a rational analysis (i.e., deliberately assess the idea in a logical manner) or complete a distractor task for the equivalent length of time (i.e., 3 min), after which they were required to rely on their "intuition and gut feeling" about the new product idea. Another group was exposed to both, in the order of rational analysis then intuition (via the distractor task), and a final group was exposed to intuition then rational analysis. The combined approach of intuition and rational analysis increased the speed and quality of the evaluation of the new product ideas compared with rational analysis or intuition alone; an advantage for intuition alone would have been predicted by Dijksterhuis (2004).

In conclusion, larger creative outcomes can only be examined by breaking them down into smaller building blocks and tracking how they influence the final product. These studies followed real-life examples of creative achievement from beginning to end, interpreting evaluation through the attrition of lower quality ideas within each building block. In addition, it was shown that there is more to intuitive evaluation than a rapid use of criteria since an analytical evaluation could explain only half of the variance shown in the intuitive assessment. Combining intuitive and analytical approaches led to higher quality and faster idea evaluation than relying on one of the approaches only. Considering the low number of studies conducted on idea evaluation, further research efforts would be necessary to explore the exact role of intuition within this stage.

# Studies on Intuition and Creativity (with no Differentiation between the Stages)

Two of the studies found did not decompose the creative process into multiple stages, but made general claims and focused on the details of intuitive processes. Sundgren and Styhre (2004) focused their work on scientific research and narrowed their scope to a case study of pharmaceutical research. In particular, the organization of pre-clinical drug development, employees' understanding of the concept of intuition, intuition's role in the discovery of new drugs, and moderating organizational factors were recorded. The narrative analysis of the interviews resulted in a list of characteristic experiences; however, the contents were neither quantified nor fitted into a larger context. Nevertheless, the key quotes served as valuable sources for enhancing insider understanding and inspiring further research.

Dollinger et al. (2004) used the MBTI along with several other creative performance measurements to explore the link between intuition and creativity. In their study, 94 college students completed a shortened version of the Creative Behavior Inventory (Hocevar, 1979; Dollinger, 2003) and the Creative Personality Scale (Gough, 1979), and produced a drawing as part of the Test for Creative Thinking–Drawing Production (Urban, 1991; Urban and Jellen, 1996). Consistent with past research (Myers, 1998), participants who were classified as both intuitive and feeling types scored the highest on the creativity tests, while the lowest scores were associated with those identified on the MBTI as Sensor-Feeler types. Though these studies reinforced general notions about intuition contributing to discoveries/creative productions, they were unable to outline new directions for further expansion.

# DISCUSSION

The aim of the present review was to examine the link between creativity and intuition with a special emphasis on how intuition fits into the specific stages of creative processes. We decomposed creativity into idea generation and idea evaluation phases and considered two types of intuition: creative intuition and problem-solving intuition. Creative intuition was linked to the idea generation phase, whereas problem-solving intuition was linked to the idea evaluation phase. It was hypothesized that a gradual accumulation of clues to coherence underlies the generation and recognition of creative ideas, as reaching a coherent perception of how to proceed is the key goal during both the idea generation and the idea evaluation phases in the absence of consensually accepted rules.

We categorized available research literature into three sections based on the proposed conceptual framework. The majority of our findings were concerned with idea generation, which could reflect the common belief that creativity arises from idea generation. Qualitative studies suggested that intuition was relevant for creativity but this was based on introspection and anecdotal evidence, albeit given by professionals in their own respective fields. What we could infer here is that these two constructs are likely to be connected but it is not known how they are connected. Correlational studies showed a reasonable correlation between intuition and creativity, but there may well be conflation given that the creativity and intuition measuring instruments may include similar items. Finally, empirical studies showed that intuition may guide idea generation and evaluation, and optimal performance was achieved when analytical and intuitive judgments were combined.

Taking these findings into consideration, we can conclude that the exact ways through which intuition is connected to the different stages of the creative process still need to be empirically demonstrated. However, they do suggest that for ill-defined problem scenarios where the number of possible solutions increases to near-infinity, creative thought starts with intuition and intuition is inherently part of the process. In order to examine this connection, there needs to be a clear set of hypotheses to test regarding the precise nature of the relationship. We propose a framework that makes this possible and that is also informed by the current evidence reviewed. We draw attention to the fact that thus far, no existing theories of creativity have included intuition as a component prior to our framework. Our aim is to lay out a framework which establishes the timing and magnitude of the contribution that intuitive processes make to the creative process. But, before we present the framework, we discuss a few limitations.

# Limitations

To begin, the review represents specific literature that may be construed as biased in the following ways. We only considered the period after the first landmark review of the psychological evidence connecting creativity and intuition (Policastro, 1995). Further, our selection criteria were strict, which in turn means that only a handful of studies could be included in the review. Furthermore, this review does not represent the entire spectrum of studies relevant to the main topic, because the stringent exclusion criteria left out some mainstream lines of research (e.g., studies featuring self-reports only). We wanted to keep a sharp focus on the most directly relevant evidence available on the topic of the connection between intuition and creativity, with the view to only including high-quality literature that provided insights that directly concerned the connection between the two phenomena of interest. Thus, while we have indeed used self-imposed filters in this review, we presented a clear justification for them earlier in the Section "Methods" of this article. The goal was to gain a deeper understanding of the connection between the two concepts and to be able to start moving forward with the experimental work from there.

One concern regarding the use of the reviewed literature to inform our framework is the difficulty in synthesizing it. Questions can be raised about what we can take away from the findings discussed, given the different conceptualizations and operationalizations of the core phenomena being investigated. A further related problem concerns the misaligned assumptions surrounding both intuition and creativity and the way in which they are measured. Another issue concerns the topic of examining the connection between intuition and creativity itself, which confronts the edges of our discipline's current understanding of the operations of knowledge integration at a cognitive and neural level (Park and Friston, 2013).

Thus, for now our review, while broadly informed by the empirical literature, does not have a dedicated set of studies to support it. However, the aim here is to find common ground in theoretical and empirical work, in order to provide testable hypotheses about the linkage of the processes couched in a detailed conceptual framework.

# Conceptual Framework of the Link between Intuition and Creativity

Our aim here is to present a framework that is able to consolidate the essential features of the creative problem-solving process, and intuition (more specifically intuitive judgment), and to lay out how the two are connected. Moreover, the aim is to show sensitivity to the insights from theoretical and empirical work that has speculated a link between intuition and creativity. In order to follow our proposals, **Figure 1** presents a schematic of our conceptual framework, and the elaboration of the framework that follows discusses the components from left to right as they appear in **Figure 1**.

Ill-defined problems are the starting point of the creative problem-solving process, and once a creator faces such a problem, they can begin tackling it in one of two possible ways: (1) they may refer to existing prescribed paradigms (these may be institutional depending on the context in which the problem arises) to define the problem space and the possible strategies that could be taken, or (2) they may use their individual judgment based on prior experiences to define the problem's characteristics.

# Path 1

Selecting an established work frame to tackle a problem may initially seem efficient, but it may also be unsuitable for reaching the goal and thus ultimately lead to an insufficient solution or no solution at all. However, the advantage, along with efficiency, is that later in the process of creative problem solving, solutions/innovations may be achieved by committing to established paradigms and inserting new elements into the framework or finding a beneficial variation of existing elements based on accumulated cues to solve a problem. The underlying assumption is that the existing framework is sufficient for reaching the goal (in many cases it is the optimization of the process by which the goal was achieved already), thus it is used as a starting template to build upon.

Within an already established framework, it is relatively easier to assess the potential and actual value of new propositions. These newly proposed alternatives are comparable with the prior less elegant/optimal solutions and often there is a general set of criteria for judging their value. During this first pathway, intuition may be employed to recognize new elements or variation of elements by recognizing their value based on gut feeling. However, rational analysis may yield the same results through a less elegant, more time-consuming procedure. It is thus Path 2 in which intuition is more obviously featured in both idea generation and evaluation.

# Path 2

In contrast, big leaps in knowledge occur if problem solvers create a novel paradigm to solve a problem, and this can serve as the basis for solving future, related problems. The motivation for doing so is that the existing framework proved to be unproductive for reaching a specific goal; for example, there may be empirical evidence at hand which does not fit the theoretical assumptions, or a problem must be solved which cannot be asked/answered under the existing frame. It is also possible that a creator is not knowledgeable about existing procedures and thus establishes their own. Deliberate analysis is ruled out here because a thorough evaluation of a vast number of randomly generated possibilities would not be feasible due to a lack of resources (time, funding, etc.). The same applies to relying on chance and selecting ideas completely randomly. Rather, what happens is that a creator gains a starting hypothesis relying on a gut feeling. He/she combines separate chunks of gradually acquired information about what could be working and boils them down to form a new coherent construct via associations. Intuition does not solve the entire problem but grants an idea which is purposefully selected. In this path, intuition cannot be replaced with analysis and it sometimes even precedes analysis (Dörfler and Eden, 2014). It is tightly linked to establishing new paradigms, not only in the idea generation phase but in the evaluation phase too. Initial ideas need refinement and must be monitored based on how close the current state is to the desired end state. Experts of a particular domain must rely on their perception of coherence to judge the explanatory potential of a new framework (whether it is suitable for addressing the question and what further problems may get answered with it).

# Directions for Future Research

Further experimental studies are necessary to investigate the proposal here. In particular, based on the predictions made, future investigations should explore whether well-defined problems involve intuitive solutions. In addition, it would be useful to test whether a truly creative paradigm, which incorporates three essential criteria, i.e., originality, utility, and surprise (e.g., Simonton, 2012), can be generated by relying solely on analytical methods. Furthermore, to answer the question of whether intuition is indispensable for creative achievements, scenarios in which only intuitive processing of the problem, only analytical processing of the problem, and both intuitive and analytical processing of the problem are carried out should be contrasted (cf. Eling et al., 2015). Studies usually contrast intuitive judgment with analytical judgment, so it could be worthwhile to look specifically at association- versus rule-based judgments during creative problem-solving. Experiments targeting both the idea generation and the evaluation phases could manipulate the number of explicit rules participants are provided with and/or the extent to which making associations is necessary to complete the task. Finally, ecologically valid environments could be simulated by providing participants with a vast amount of information and observing how intuition is used to find the relevant clues to the solution.

# CONCLUSION

Our review showed that intuition is associated with both the idea generation and the idea evaluation phases of the creative problem-solving process. Data was pooled together to obtain a more fine-grained picture about where and how intuitive processes are linked with specific stages of creative problem solving. It was found that previous studies connected intuition chiefly to the idea generation phase. Two possible pathways were sketched out explaining the use of intuition in response to ill-defined problems. Finally, intuition, despite being increasingly investigated in psychological research, is still interpreted in a broad, vague manner, and we suggest future empirical research should be directed to test specific hypotheses such as those offered here or by Sadler-Smith (2015) in order to reveal its underlying working mechanisms in creative problem solving.

# REFERENCES

Amabile, T. M. (1983). The Social Psychology of Creativity. New York, NY: Springer.

# AUTHOR CONTRIBUTIONS

JP performed the literature review and wrote the manuscript. MO and JB supervised the project and edited the manuscript.

# ACKNOWLEDGMENTS

This research was supported by the Queen Mary University of London. JB was partially supported by the European Commission (Grant Agreement No. 612022). This publication reflects the views only of the authors, and the funders cannot be held responsible for any use that may be made of the information contained therein.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Pétervári, Osman and Bhattacharya. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Neurocognitive Framework for Human Creative Thought

#### Arne Dietrich<sup>1</sup> \* and Hilde Haider<sup>2</sup>

<sup>1</sup> Department of Psychology, American University of Beirut, Beirut, Lebanon, <sup>2</sup> Department of Psychology, University of Cologne, Cologne, Germany

We are an intensely creative species. Creativity is the fountainhead of our civilizations and a defining characteristic of what makes us human. But for all its prominence at the apex of human mental faculties, we know next to nothing about how brains generate creative ideas. With all previous attempts to tighten the screws on this vexed problem unsuccessful – right brains, divergent thinking, defocused attention, default mode network, alpha enhancement, prefrontal activation, etc. (Dietrich and Kanso, 2010) – the neuroscientific study of creativity finds itself in a theoretical arid zone that has perhaps no equal in psychology. We propose here a general framework for a fresh attack on the problem and set it out under 10 foundational concepts. Most of the ideas we favor are part and parcel of the standard conceptual toolbox of cognitive neuroscience but their combination and significance to creativity are original. By outlining, even in such broad strokes, the theoretical landscape of cognitive neuroscience as it relates to creative insights, we hope to bring into clear focus the key enabling factors that are likely to have a hand in computing ideational combinations in the brain.

#### Edited by:

Eörs Szathmáry, Parmenides Foundation, Germany

#### Reviewed by:

Michał Wierzchoń, Jagiellonian University, Poland; Caroline Di Bernardi Luft, Queen Mary University of London, UK

#### \*Correspondence:

Arne Dietrich arne.dietrich@aub.edu.lb

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 13 June 2016 Accepted: 26 December 2016 Published: 10 January 2017

#### Citation:

Dietrich A and Haider H (2017) A Neurocognitive Framework for Human Creative Thought. Front. Psychol. 7:2078. doi: 10.3389/fpsyg.2016.02078

Keywords: connectionist architecture, consciousness, creativity, default network, emulation/simulation, evolutionary psychology, prediction, task set

# INTRODUCTION

The last half century has seen a veritable explosion of knowledge about the mind and how it works. Perhaps the single most glaring exception in this success story is creative thinking. Indeed, it is hard to think of a mental phenomenon so central to the human condition that we understand so little. Careful reviews of the recent literature on the neuroscience of creativity (Dietrich and Kanso, 2010; Sawyer, 2011; Weisberg, 2013; Yoruk and Runco, 2014) have shown that the field is heavily fragmented, with data being selectively recruited to support concepts that are theoretically incoherent and cannot do the explanatory work we require in neuroscience (Dietrich, 2015). At this point, there is not a single cognitive or neural mechanism we can rely on to explain the extraordinary creative capacities of an Einstein or a Shakespeare.

The principal reason for this situation is that all current psychometric tests used to look for creativity in the brain are based on divisions – divergent thinking, defocused attention, remote associations, for instance – that (1) are false category formations, given that their exact opposites – convergent thinking, focused attention, or close associations, in this case – also precipitate creative ideas (Dietrich, 2007b) and (2) result in constructs that still consist of many separate mental processes that are distributed in the brain. For neuroimaging studies, the combination of both theoretical problems – false category formation and compound construct – makes defeat certain. Simply put, these so-called creativity tests, such as the Alternative Uses Test (AUT), that are based on the division of divergent thinking cannot identify the cognitive or neural processes that turn "normal" thinking into creative thinking. And if you fail to isolate the subject of interest in your study, you cannot use neuroimaging to hunt for mechanisms. You just don't know what the brain image shows (Dietrich and Kanso, 2010; Sawyer, 2011).

# AIM AND SCOPE

The central, motivating intent of this paper is to show a possible way out of the disciplinary insolvency in which the neuroscientific study of creativity currently finds itself. There is a whole host of evidently relevant concepts that, despite being securely anchored in the bedrock of mainstream cognitive neuroscience, have so far been ignored in creativity research. As a matter of tactics, we confine ourselves in this first approach to 10 such concepts. Together they form a neurocognitive framework less intended to offer a specific set of hypotheses but rather to inform future experiments on, and theorizing about, the creative process occurring in human brains. Although the framework does intimate testable hypotheses, our general strategy at this early stage is to survey the theoretical landscape and highlight those concepts that might open altogether new avenues of research in the field of creativity.

To suitably constrain the scope of our framework further, we focus exclusively on the computation of creative insights. Creativity typically refers to a product – something useful, novel and surprising (Simonton, 2012). An ingenious idea is often the first step toward a creative product but this is neither necessary nor sufficient. Creative products come into existence without the incidence of an antecedent, conscious thought, and creative thoughts have no impact unless converted into an actual product. Specifically, we leave to one side steps of the creative process dealing with the implementation of a creative idea which often requires additional creative thinking. We also pass over higher-order evaluative processes, that is, those cognitive processes that assess the original idea's merits after it manifested itself in consciousness. Those processes required for the idea's successful execution as well as those carrying out appraisals at the explicit information-processing level are likely to engage different cognitive processes and different brain regions (Dietrich, 2004b).

To pursue the question of how brains compute creative ideas, we bring to the fore a number of well-established neuroscientific concepts whose explanatory power with respect to creative cognition has not been realized. Collectively, they sketch out the contours of a broad framework consisting of what might be called "foundational concepts" for human creativity. We present the foundational concepts under 10 headings as follows.

# THE FRAMEWORK'S 10 FOUNDATIONAL CONCEPTS

# Evolutionary Algorithms

More than half a century ago, Campbell (1960) proposed that creative thoughts result from the twofold Darwinian process of blind variation (BV) followed by selective retention (SR), or BVSR (see also Campbell, 1974; Popper, 1984; Simonton, 1999). A long debate on the exact parameters of the evolutionary algorithm, and especially the matter of blindness, has recently settled on a broad consensus (see Kronfeldner, 2010; Dietrich, 2015) that culture is a variational system with some coupling between variation and selection. This partial coupling means that human cultural transmission, and thus human creativity, is partially directed and thus fits, strictly speaking, neither into the rigid category requirements of Neo-Darwinian (total) blindness nor Lamarckian (total) sightedness (Richerson and Boyd, 2005; Kronfeldner, 2010). Despite this common denominator on the basic mechanism of human creativity, the two-step evolutionary rationale has been nearly universally ignored in setting up empirical protocols in neuroscience. All current psychometric measures of creativity collapse the two fundamental constituent elements of the creative process, and it is hard to imagine useful neuroimaging data from studies blending variation with selection, given that both likely engage different cognitive processes and different brain areas (Dietrich, 2004b).

The understanding of creativity as a partially sighted variation-selection process should guide the search for the brain mechanisms underlying creativity. One place to start this quest is with the four features that distinguish evolutionary algorithms occurring in brains from those transforming nature, as it is these four features that can be linked to a neural mechanism (Dietrich and Haider, 2015). They are: (1) cognitive coupling providing degrees of sightedness, (2) establishment of fitness values for hypothetical selection processes, (3) cognitive scaffolding for multistep thought trials, and (4) the experiences of foresight and intention.

We have proposed that the main neural mechanism that enables the cognitive coupling of variation to selection is the brain's prediction machinery (Dietrich, 2015; Dietrich and Haider, 2015). In computational terms, this results in advanced heuristic algorithms that can boost the effectiveness of the blind, ex-post-facto search algorithm of the biosphere by orders of magnitude. This partial sightedness must necessarily be driven via predictive processes.
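As a toy illustration of what coupling variation to selection buys in computational terms, the following Python sketch compares a blind variation-selection search with a partially sighted one on a simple bit-string problem. Everything in it is an assumption made for illustration and not part of the framework itself: the `evolve` and `fitness` functions, the `sightedness` parameter, and the reduction of "prediction" to knowing which positions still differ from a goal.

```python
import random


def fitness(candidate, target):
    """Hypothetical merit criterion: number of positions matching the goal."""
    return sum(c == t for c, t in zip(candidate, target))


def evolve(target, sightedness, seed=0, max_generations=10_000):
    """Count the variation-selection cycles needed to reach `target`.

    sightedness = 0.0: variation is blind (any position may be changed).
    sightedness = 1.0: variation is fully coupled to selection
                       (only positions that still miss the goal are changed).
    Values in between give partial sightedness.
    """
    rng = random.Random(seed)
    current = [0] * len(target)
    for generation in range(max_generations):
        wrong = [i for i, (c, t) in enumerate(zip(current, target)) if c != t]
        if not wrong:
            return generation
        # Variation: with probability `sightedness` the change is guided.
        if rng.random() < sightedness:
            i = rng.choice(wrong)
        else:
            i = rng.randrange(len(target))
        candidate = current.copy()
        candidate[i] = 1 - candidate[i]
        # Selective retention: keep the variant only if it is no worse.
        if fitness(candidate, target) >= fitness(current, target):
            current = candidate
    return max_generations


goal = [1] * 20
print(evolve(goal, sightedness=0.0))  # blind search
print(evolve(goal, sightedness=1.0))  # 20
```

With `sightedness=1.0` every variation is aimed at a remaining error, so the 20-bit goal is reached in exactly 20 cycles. With `sightedness=0.0` many blind variants are generated and discarded, so the same goal typically takes several times as many cycles.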

For the mind's second adaptation, we first need to describe a complication inherent in thought trials. Evolutionary algorithms require a fitness function. In the biosphere, this is done by causal factors in the environment; that is, selection occurs in the real world, on individuals made flesh. But in simulations, or hypothetical thought trials, the selection process depends on merit criteria that must also be modeled. On what basis is this done? Since the very essence of creativity is to go into uncharted territory, how do we know what would be adaptive in that unknown topography?

A third adaptation that enhances the basic evolutionary algorithm is scaffolding. In nature, every variation-selection cycle in a species' trajectory is actualized and must, in its own right, be a viable form. The basic move in Darwinian evolution, in other words, is to generate-and-field-test. Brains, on the other hand, can short-circuit instantiation and breed multiple generations in a hypothetical manner. The basic move, then, becomes to generate-and-hypothesis-test. This produces a striking effect.

Because some designs require elements that cannot be realized without a temporary scaffold, a mechanism that includes an instant pay-off requirement, such as biological evolution, cannot build them either. What scaffolding permits is that trajectories through the infosphere can bypass impossible intermediates. The benefit is a plethora of higher-order, discontinuous design solutions. Cognitive scaffolding also has important implications for the debate on continuous versus discontinuous processing in insight formation.
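The contrast between generate-and-field-test and generate-and-hypothesis-test can be sketched on a small, made-up fitness landscape. In the Python sketch below, the `LANDSCAPE` values, the function names, and the `depth` parameter (how many steps can be simulated before anything is instantiated) are all hypothetical choices for illustration, not quantities proposed by the framework.

```python
# Hypothetical landscape: index = design, value = pay-off of that design.
# Design 3 has zero pay-off and stands in for an "impossible intermediate".
LANDSCAPE = [3, 4, 2, 0, 1, 6, 9]


def field_test(start):
    """Generate-and-field-test: every single step must pay off at once."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(LANDSCAPE)]
        best = max(neighbours, key=lambda p: LANDSCAPE[p])
        if LANDSCAPE[best] <= LANDSCAPE[pos]:
            return pos  # no viable improvement next door: the search stops
        pos = best


def hypothesis_test(start, depth):
    """Generate-and-hypothesis-test: simulate up to `depth` steps offline
    and instantiate only the best endpoint, skipping the intermediates."""
    pos = start
    while True:
        lo = max(0, pos - depth)
        hi = min(len(LANDSCAPE) - 1, pos + depth)
        best = max(range(lo, hi + 1), key=lambda p: LANDSCAPE[p])
        if LANDSCAPE[best] <= LANDSCAPE[pos]:
            return pos
        pos = best


print(field_test(0))                # 1
print(hypothesis_test(0, depth=4))  # 6
```

The field-tested search halts on the local peak at design 1, because every route to the better designs passes through worse ones. The simulated search reaches design 6 by stepping from design 1 straight to design 5 without ever instantiating designs 2 to 4.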

Finally, the creative process in the biosphere is not teleological. It serves no end, and its designs are neither premeditated nor deliberately initiated in response to a perceived need. Human creators, by contrast, act on purpose; they create with intent and with an objective in mind. Although one might expect such improvements in a process that inexorably bootstraps, this argument is typically framed in cognitive psychology in terms of expert systems and often falsely considered at odds with evolutionary models of creativity.

# Predictive Processing

The second foundational concept, the assumption of predictive computation, holds that the universal principle of brain function is to generate predictions (Wolpert et al., 2003; Grush, 2004; Bar, 2007; Clark, 2013), making a perpetual variation-selection search process the brain's default operating mode. The core idea is that for behavior to be purposeful and timely in a high-dimensional environment, we must continuously, automatically, and unconsciously generate expectations that meaningfully inform – constrain – perception and action at every turn (Wolpert et al., 1995; Llinas and Roy, 2009; Clark, 2013). Even when not engaged in a specific task, the brain actively produces predictions that anticipate future events (Moulton and Kosslyn, 2009).

The brain's prediction machinery offers a mechanistic explanation for the complex properties of cultural evolutionary algorithms running in brains that we outlined in the previous section. Specifically, our idea is that internal representations of the emulated future, which we call ideational RPGs, or Representations of Predictive Goals, provide the neural mechanism for the four special properties of our mind's evolutionary algorithms. They can address (1) partial sightedness or coupling, (2) the ability to set a fitness function in an unknown solution space, (3) cognitive scaffolding, and (4) the feeling of foresight and intention (Dietrich, 2015).

Finally, well-established Bayesian inference techniques could tell us how such advanced evolutionary algorithms converge on a predicted goal state or the potential creative solution (Dietrich and Haider, 2015). We think that the prediction perspective, especially when embedded into a larger evolutionary frame, offers a promising direction in our search for the creative process taking place in brains.

# No Single Place; No Single Process

In foundational concept 3, we set forth the vaudeville conception of creativity (Dietrich, 2015). The vaudeville conception of creativity is based on two fundamental notions in neuroscience: modularity and non-linearity. The brain's functional specificity, or modularity, suggests that the recombination of bits and pieces of content into novel configurations must come from the same neural circuits that normally handle those bits and pieces of content. This must also be conceded as part of our understanding of the brain as a non-linear information processor.

The tacit assumption that has been driving creativity research, however, is the opposite. Creativity is obviously special and there must be something, somewhere, that makes it so. This way of thinking betrays the commitment to a distinct factor, an extra something – the creative bit, if you like – that's specifically added to the plain mix to make the sparkling difference. Powered by this instinctive hunch, creativity is routinely treated as a monolithic entity and assigned to some brain network (e.g., default mode network, DMN) or associated with a particular cognitive process (e.g., divergent thinking). The fact that such conclusions are based on "creativity tests" that combine a false category formation with a compound construct effectively renders this research paradigm phrenology.

In our view, any global statement about creativity per se being located in specific brain areas or networks is devoid of meaning and would border on an outright violation of the modular conception of brain function. What the vaudeville conception of creativity does is to shift the focus from mistaking colorful brain images as a substitute for an explanation to the software side of things, that is, the cognitive and computational processes that implement variation-selection runs leading to creative thoughts.

# Network Dynamics of Global Competition

The fourth foundational concept is the brain's connectionist architecture. It takes the conventional position that information processing – selective attention, working memory, or cognitive control – involves large-scale competition between widely distributed representations that are biased by top-down, prefrontal activity (e.g., Baars, 1988; Dehaene and Changeux, 2011).

One important element that might shed light on the computation of creative ideas is the strengthening mechanism of connectionist models, as it is this mechanism that helps transient coalitions to reach threshold levels and turns them into conscious representations. The release of dopamine from neurons in the ventral tegmental area, and their subsequent activity in prefrontal and limbic regions, is currently the main proposal for such a mechanism (Schultz, 2000; Rose et al., 2010). The possibility that a dopamine signal precedes the emergence of a creative insight might inform more precise neuroscience research on creativity.

# Dual Systems

For foundational concept 5, we add one more layer of complexity to the basic connectionist platform, the view that two distinct systems for knowledge representation exist, one implicit and one explicit (Reber, 1993; Dienes and Perner, 1999). This distinction seems to matter a great deal for the urgently needed task of parsing creativity into different types that have some validity.

The explicit system is a sophisticated system that is tied to consciousness and thus capable of representing knowledge in a higher-order format. In contrast, the implicit system is inaccessible to consciousness. It is stimulus-driven and, as its information is encapsulated, it cannot form such higher-order representations (Dienes and Perner, 2002; Haider and Frensch, 2009; Haider et al., 2011). Due to these encapsulated representations, the explicit system, or any other functional system in the brain, does not know about knowledge imprinted in the implicit system. However, implicit knowledge can affect performance by, for instance, biasing our current predictions.

The differences between these two systems in terms of creative capacity have been treated elsewhere (Dietrich, 2004a, 2015). Here, we only briefly highlight some differences related to the predictive machinery of each system (Downing, 2009). Due to the implicit system's inability to represent hypothetical future scenarios, implicit prediction is online; that is, it works only in known and currently present solution spaces. In general, the implicit system uses a stochastic process to optimize behavior, simply testing out, by trial and error, solutions to environmental contingencies (Perruchet and Vinter, 2002; Haider et al., 2011). The implicit system does use prediction – in the motor system, for instance – but can only do so for already learned actions. It cannot launch ideational RPGs into abstract and unknown solution spaces. In terms of sightedness, prediction in the implicit system only possesses (partial) sightedness for known problem spaces, a situation that does not really qualify as creativity. For explorations in terra incognita, we have proposed that the implicit system can still be creative, but this creativity must be limited to the blind algorithms in nature (Dietrich and Haider, 2015). We have also hypothesized that these features are more consistent with the flow state (Dietrich, 2015).

The game-changing advantage of explicit prediction is that the explicit system can generate ideational RPGs that can be used to gain some sightedness in unknown problem spaces. Explicit prediction can thus operate offline; that is, on problems that are hypothetical and that can be solved outside real time (Grush, 2004; Moulton and Kosslyn, 2009). Ideational RPGs, in other words, internalize the selection process since the parameters of that goal state prediction work as a fitness function. With the ability to simulate a complete internal model, we can imagine – predict the effects of – events in uncharted territory.

# Task Sets

Foundational concept 6 is the construct of a task set (Allport et al., 1994; Monsell, 2003; Dreisbach and Haider, 2009). A task set denotes the configuration of mental resources that goes with a task. Through instructions or schemas, it defines those aspects of the task to which we selectively attend, the features of the stimulus that are bound to certain dimensions of the response as well as the response selection. The construct was formulated in response to experiments providing evidence that switching between different tasks produces substantial performance costs (Allport and Wylie, 2000; Monsell, 2003).

We cannot perform a task until the cognitive system is properly attuned and organized. If the task changes, the new task set must first be uploaded, so to speak. It is a kind of mindset containing the elements and their values that are tagged as temporarily belonging together in the network because they played a role in completing the task in the past. By facilitating, top-down, certain task-relevant cognitive operations and inhibiting others, the implementation of a task set affects the processing of all stimuli associated with that task and, by extension, of a problem space.

A task set guarantees internal stability to keep the ongoing task free from interference and disruption by other task sets (Dreisbach and Haider, 2009). At the same time, task-set activation must also allow enough flexibility for mental gear changing so that we can adjust should the context necessitate it (Neumann, 1984).

The importance of this theoretical construct to the phenomena of creativity should be immediately self-evident. The task representation governs, for instance, how we would initially approach a problem-solving task (Knoblich et al., 1999; Öllinger et al., 2013). It also maps the shape of the solution space and establishes critical search parameters. These settings are, in effect, predictions about the kinds of solutions that are likely. Moreover, task set strength determines the degree of functional fixedness, or cognitive flexibility, and the probability for remote associations.

# Task-Set Inertia

Foundational concept 7 is the related notion of task-set inertia. Task-set inertia (Allport et al., 1994) was introduced to explain an unexpected asymmetry in task-switching studies that could not be accounted for with task-set reconfiguration alone. Like task sets, it is also a concept that, as far as we know, has not yet been applied to creative thinking, despite its obvious relevance to several creativity phenomena.

Since neural networks are not on/off switches, we can expect that a strongly interacting coalition of neurons does not instantly decay back to baseline. Any disintegration phase would take time, during which a new task set would be subjected to interference from the previous one. It would seem obvious that task-set inertia, extended to creative thinking, holds precious clues for understanding incubation. The fact that the removal of a problem from conscious awareness can break the impasse that often frustrates the problem-solving process shows that the task-set coalition associated with it must continue to reverberate with purpose.

But carryover activation in the knowledge structure is unlikely to be the only mechanism here. For instance, creative insights have a way of popping up long after we last worked on a problem and it is hard to see how transient task-set inertia could linger for days or weeks. Also, the fact that there remains a problem in need of a solution is unlikely to be embedded at the level of the knowledge structure itself. This is a type of goal representation and it should require higher-order brain regions, such as the prefrontal cortex.

One way to address these complications might involve the notion of fringe working memory (Cowan, 1999, 2005). Working memory is thought to have a focal center and a fringe, with the latter containing information that still has some conscious properties. Following a task switch, a goal representation could remain active in the fringes of working memory and continue to provide, via top-down projections, some organizational control to steer the spreading activation in the task-set coalition toward a solution. Task-set inertia and fringe working memory are concepts that would seem to provide propulsive help in understanding the mechanism that rearranges bits of information into ideational combinations while the conscious mind is otherwise applied.

# Large-Scale Networks

Foundational concept 8 relates large-scale brain networks – the central executive and the DMN, in particular – to creativity (Raichle et al., 2001; Bressler and Menon, 2010). We explore what we can, and cannot, say about them in the context of creative thinking.

The DMN is a set of neural regions that shows heightened activity during resting states as well as during a number of directed mental tasks, which led to the idea that DMN activity supports mind-wandering or moments of introspective self-talk and thought (Mason et al., 2007). More recently, the DMN is often characterized as being involved in predictive processing and the ability to simulate worlds that differ mentally, temporally, and physically from the present. It includes medial temporal lobe structures, especially the hippocampus and parahippocampal cortex, the medial parietal and lateral temporal cortices, especially the temporal-parietal junction, as well as the medial prefrontal cortex, cerebellum and thalamus (Buckner, 2012).

We have the foreboding sense that the recent proposals that link the DMN to creativity have appealed to some for the unfortunate reason that they feed into old and misbegotten category formations about creativity, such as divergent thinking or daydreaming. But there is no reason to presume that the other, central-executive network (CEN) is not also involved in creative thinking.

The CEN is anchored in the dorsolateral prefrontal cortex and several areas of the posterior parietal cortex (Bressler and Menon, 2010). It controls executive functions and shows activity whenever we have to focus our attention on a specific task. Indeed, activity in the CEN is inversely correlated with activity in the DMN. They operate like two components of a flip-flop circuit; while the DMN is associated with endogenous activity, the CEN is driven by exogenous input. As might be expected, predictive processing has also been associated with the CEN (Downing, 2009; Clark, 2013).

The notion that the DMN is a proactive system implies that there must be a continuous search process taking place that reduces uncertainty even when no task is at hand. This constant anticipatory drive in the DMN during moments of passive contemplation brings it into close contact with the concept of task-set inertia and the specific phenomenon of incubation. If the DMN is active during introspective simulations of the future and, by extension, simulations of possible alternative solutions to a problem, we can assume that it also shows inertia once a problem is incubated. This possibility could inform neuroimaging research on incubation.

We are careful not to associate one network, or some kind of back-and-forth interplay between them, with creativity per se, or divergent thinking for that matter (Beaty et al., 2016). This, we think, is just another false category formation and a version of the monolithic entity fallacy. Rather, we consider that the two processing modes, or the two core networks, support different types of creative thinking.

# The Deliberate Mode

Finally, we close by defending, under headings 9 and 10, the proposal that there are two distinct modes, or types, of creativity that emanate from the explicit system, a deliberate, top-down mode (foundational concept 9) and a spontaneous, bottom-up mode of processing (foundational concept 10; Dietrich, 2004b, 2007a). The decomposition of creativity into variation and selection aside, this deliberate-spontaneous partition of creativity, along with a third flow mode that emanates from the implicit system, is the only one that we think has empirical and theoretical support. We also suggest that a mapping of the two modes on to the CEN and DMN might provide more hypotheses for future imaging studies.

The deliberate problem-solving mode is strongly biased by top-down pathways from the prefrontal cortex so that the rearrangement of informational units has built-in predispositions that are likely constrained by biases, expectancies, schemas, and previous experiences. In other words, the search function is restricted to more commonsense solutions that are more paradigmatic and rely on closer associations. But being tied to effortful and conscious processing, the deliberate mode also enables us to bring the full toolbox of our higher cognitive function to bear on the problem, including focusing attention, retrieval of relevant memories, and the recombination of knowledge by sustaining several representations in mind at once.

The advantage of such advanced heuristic algorithms is, of course, efficiency. But trimming the vast search space also has a drawback. The deliberate mode only works well if the solution is indeed located in the predicted area of the problem space. To quip, while the deliberate mode has the advantage of limiting the solution space, it has the disadvantage of limiting the solution space!
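The trade-off can be made concrete with a deliberately simple search sketch in Python. The `search` function, the 1,000-item solution space, and the `predicted` region are hypothetical stand-ins chosen for illustration; they are not a model of the prefrontal biasing described above.

```python
def search(space, is_solution, predicted=None):
    """Return (solution, number of candidates evaluated).

    If `predicted` is given, only candidates inside that region of the
    space are considered, mimicking a top-down restriction of the search.
    """
    if predicted is None:
        candidates = list(space)
    else:
        candidates = [x for x in space if x in predicted]
    for n, candidate in enumerate(candidates, start=1):
        if is_solution(candidate):
            return candidate, n
    return None, len(candidates)


space = range(1000)
predicted = range(900, 1000)  # hypothetical "likely" region

# Solution inside the predicted region: the restricted search is cheaper.
print(search(space, lambda x: x == 950))             # (950, 951)
print(search(space, lambda x: x == 950, predicted))  # (950, 51)

# Solution outside the predicted region: the restricted search fails.
print(search(space, lambda x: x == 100, predicted))  # (None, 100)
```

The same restriction that cuts the cost from 951 evaluations to 51 in the first case is what makes the solution unreachable in the second.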

# The Spontaneous Mode

For foundational concept 10, we contrast the deliberate mode with novel ideas that emerge from a spontaneous problem-solving mode in which top-down influences are weakened and the search function is less directional. Although this comes with a speed and efficiency tradeoff, the spontaneous mode has the potential to chance upon more paradigm-shifting ideas or remote associations. During incubation or various altered states of consciousness, the brain shifts a problem from a deliberate to a more spontaneous mode of processing that is not controlled by intentional reasoning. This significantly weakens the supervisory, top-down biases from the prefrontal cortex that guided the effortful deliberations. The drawback, however, is that a spontaneous mode does not benefit from the higher-order and efficient forecasting ability of conscious thought.

# CONCLUSION

Creativity has a dubious distinction in the psychological sciences. For no other mental phenomenon so central to the human condition do we know so little as to how the brain does it. Reviews of the existing literature (e.g., Dietrich and Kanso, 2010; Sawyer, 2011) have shown that the field is heavily fragmented and its neuroscientific findings are invalidated by false category formations and compound constructs. The aim of the present paper was to suggest alternative ways to attack the problem.

Our framework of human creative thought consists of 10 foundational concepts organized under 10 separate headings. The ideas we favor are all part and parcel of cognitive psychology and neuroscience: evolutionary algorithms, predictive representations, distributed processing, connectionist architecture, explicit-implicit distinction, task set, task-set inertia, large-scale networks, and top-down vs. bottom-up processing. However, their significance to creativity, especially the crossties we developed here among them, is original. Together they form a neurocognitive framework that provides a fresh attack on the possible mechanisms that compute ideational combinations in the brain.

As a matter of tactics, we limited ourselves to those concepts that we think hold the greatest potential for progress. The framework is not intended to be complete. But for our purposes, the degree of completeness is not important, so long as it is agreed that the combination of concepts we bring to the fore is fundamental to creative cognition and possesses eminent explanatory power that has not been realized. We hope that our framework helps revitalize research on an issue that defines our humanity.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Dietrich and Haider. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Can Contraries Prompt Intuition in Insight Problem Solving?

Erika Branchini<sup>1</sup>\*, Ivana Bianchi<sup>2</sup>, Roberto Burro<sup>1</sup>, Elena Capitani<sup>3</sup> and Ugo Savardi<sup>1</sup>

<sup>1</sup> Department of Human Sciences, University of Verona, Verona, Italy, <sup>2</sup> Department of Humanities (Section Philosophy and Human Sciences), University of Macerata, Macerata, Italy, <sup>3</sup> Department of Education, Cultural Heritage and Tourism, University of Macerata, Macerata, Italy

This paper aims to test whether the use of contraries can facilitate spatial problem solving. Specifically, we examined whether a training session which included explicit guidance on thinking in contraries would improve problem solving abilities. In our study, the participants in the experimental condition were exposed to a brief training session before being presented with seven visuo-spatial problems to solve. During training it was suggested that it would help them to find the solution to the problems if they systematically transformed the spatial features of each problem into their contraries. Their performance was compared to that of a control group (who had no training). Two participation conditions were considered: small groups and individuals. Higher success rates were found in the groups exposed to training as compared to the individuals (in both the training and no training conditions), even though the time required to find a solution was longer. In general, participants made more attempts (i.e., drawings) when participating in groups than individually. The number of drawings done while the participants were trying to solve the problems did not increase after training. In order to explore if the quality (if not the number) of drawings was modified, we sampled one problem out of the seven we had used in the experiment (the "pigs in a pen" problem) and examined the drawings in detail. Differences between the training and no training conditions emerged in terms of properties focused on and transformed in the drawings. Based on these results, in the final discussion possible explanations are suggested as to why training had positive effects specifically in the group condition.

#### Edited by:

Michael Öllinger, Parmenides Foundation, Germany

#### Reviewed by:

Rakefet Ackerman, Technion – Israel Institute of Technology, Israel

Karsten Werner, University of Potsdam, Germany

> \*Correspondence: Erika Branchini erika.branchini@univr.it

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 13 June 2016 Accepted: 01 December 2016 Published: 26 December 2016

#### Citation:

Branchini E, Bianchi I, Burro R, Capitani E and Savardi U (2016) Can Contraries Prompt Intuition in Insight Problem Solving? Front. Psychol. 7:1962. doi: 10.3389/fpsyg.2016.01962

Keywords: insight, problem solving, contrast class, heuristics, contraries, spatial properties

# INTRODUCTION

In this paper we explore the impact of explicitly guiding people to think in contraries when searching for solutions to visuo-spatial insight problems (examples of this kind of problem are provided in **Table 1**). We held a training session during which participants in the experiment were provided with demonstrations of how manipulating the representation of a problem in terms of contraries might be helpful. They were then asked to apply this "way of thinking" in a test phase where they were presented with other visuo-spatial problems. The aim of the experiment was to investigate the effects of specific hints or training on insight problem solving. The impact of general meta-cognitive training on performance has been addressed in previous literature (e.g., Walinga et al., 2011; Patrick and Ahmed, 2014; Patrick et al., 2015), as has the impact of more specific hints which have been customized to the contents of a specific problem (e.g., Chronicle et al., 2001, Experiment 3; Weisberg and Alba, 1981; Grant and Spivey, 2003; Kershaw and Ohlsson, 2004; Kershaw et al., 2013; Öllinger et al., 2013, 2014). What we aimed to focus on here, and to further test based on the results of the experiment, was the hypothesis that thinking in contraries might support transformations in the mental representation of a problem, as required by insight problem solving. Clear evidence of this has yet to be provided, but there are some precursors to the present study which suggest that the question is worth investigating. We will briefly review these in the next section, contextualizing the underlying processing in terms of special-process and business-as-usual perspectives.

# Contrariety: A Radical Change while Maintaining Continuity

In the special process theory, insight is conceived of as a process arising from a sudden restructuring of the representation of a problem occurring at an unconscious level (Siegler, 2000; Kershaw and Ohlsson, 2004; Bowden et al., 2005; Murray and Byrne, 2013). From this point of view, insight is a discontinuous process since it implies a break with previous constraints and attempts. On the other hand, in the business-as-usual theory, insight is seen as a continuous, step by step, conscious process which is similar in nature to the processes underlying the solving of non-insight problems (Newell and Simon, 1972; MacGregor et al., 2001; Chronicle et al., 2004; Ormerod et al., 2013). An integrated perspective has also been put forward based on the argument that the two alternative views are not mutually exclusive and that they both contribute to insight although perhaps in different ways and/or at different moments (Fleck and Weisberg, 2013; Weisberg, 2015).

Using contraries as a strategy in problem solving seems to necessitate an integrated process of this type. Breaking things up into perceptual chunks and reorganizing them into opposite patterns means producing a radical change (i.e., a sharp discontinuity). However, at the same time this change is data-driven, that is, it is anchored on and driven by the inherent features of whatever is represented and in this sense the process implied is gradual. For instance, in Gale and Ball's studies (2003, 2006, 2009, 2012) on people's thought processes during hypothesis testing in Wason's (1960) 2-4-6 rule discovery task, the participants' performance improved when they were given a contrast class cue. In the original form of the task associated with Wason's rule, participants are asked to discover the rule (known only to the experimenter) that in this case governs the production of series of three numbers. The rule is "any ascending sequence." Participants are then given 2-4-6 as a seed triple and are asked to generate further series of three numbers, which are then assessed by the experimenter as either conforming or not conforming to the target rule. When the participants are confident, they announce that they have discovered the rule. In their studies, Gale and Ball (2003, 2006, 2009, 2012) used a dual-goal variant of this task, in which participants are asked to discover two complementary rules, one labeled "DAX" (i.e., the standard "ascending numbers" rule) and the other labeled "MED" (i.e., "any other triples"). The aim was to test whether providing "contrast class cues" for the MED rule might facilitate participants' performance. They provided the participants with different types of contrast class cues as MED exemplars (see in particular Gale and Ball, 2012). One of these was the 6-4-2 triple, which contrasted with the original 2-4-6 triple on a salient and crucial dimension, i.e., an "ascending" series versus a "descending" series.
The other exemplar series, i.e., 4-4-4 and 9-8-1, contrasted with the original series of numbers in terms of dimensions which were irrelevant to the task. Namely, 4-4-4 contrasts with 2-4-6 on the "same-different" dimension (i.e., "three identical numbers" versus "three different numbers") while 9-8-1 contrasts with 2-4-6 on the "mixed-homogeneous" dimension (i.e., "mixed odd and even numbers" vs. "only even numbers"), as well as on the dimensions relating to "equal-unequal" intervals and whether the middle number "is-is not" the arithmetic mean of the two numbers which flank it. Participants who had been presented with examples of series of three ascending versus descending numbers recognized the oppositional nature of the two rules implied, explored fewer confirmatory alternatives and more frequently found the solution, suggesting that contrasts play a facilitatory role. As a result of the evident contrast between the ascending and descending series of numbers, the thought process that was then triggered apparently focused on a marked discontinuity. However, at the same time, the cue also prompted the recognition of a straightforward relationship connecting the two example series, suggesting that a continuous thought process was involved here too. In addition, a continuous, step by step process was suggested by the participants' tendency to generate from time to time hypotheses that varied along just one dimension. This latter feature is in agreement with hypothesis testing in general, as conceived by Oaksford and Chater in their iterative counterfactual model (1994).
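The contrast between the seed triple and the cue triples described above can be made concrete with a short sketch. This is only an illustration of the task logic, not the procedure used by Gale and Ball: the function names and the feature dictionary are hypothetical, and only the rule ("any ascending sequence") and the example triples come from the text. Note that, for a triple, having equal intervals is arithmetically the same as the middle number being the mean of its neighbors, so the sketch encodes these as a single feature.

```python
from typing import Dict, List, Tuple

Triple = Tuple[int, int, int]

def is_dax(triple: Triple) -> bool:
    """DAX rule as stated in the text: any strictly ascending sequence."""
    a, b, c = triple
    return a < b < c

def label(triple: Triple) -> str:
    """MED is the complementary rule: any triple that is not DAX."""
    return "DAX" if is_dax(triple) else "MED"

def features(triple: Triple) -> Dict[str, bool]:
    """Hypothetical feature coding of the dimensions named in the text."""
    a, b, c = triple
    return {
        "ascending": a < b < c,
        "all_different": len({a, b, c}) == 3,
        "all_even": all(x % 2 == 0 for x in triple),
        # Equal intervals <=> middle number is the mean of its neighbors.
        "equal_intervals": (b - a) == (c - b),
    }

def contrasts(seed: Triple, cue: Triple) -> List[str]:
    """Dimensions on which the cue triple differs from the seed triple."""
    seed_f, cue_f = features(seed), features(cue)
    return [name for name in seed_f if seed_f[name] != cue_f[name]]

if __name__ == "__main__":
    seed = (2, 4, 6)
    for cue in [(6, 4, 2), (4, 4, 4), (9, 8, 1)]:
        print(cue, label(cue), contrasts(seed, cue))
```

Under this coding, every MED exemplar necessarily differs from the seed on the ascending feature; what sets 6-4-2 apart is that it differs on that feature alone, whereas 4-4-4 and 9-8-1 also differ on features that are irrelevant to the rule.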

The fact that there is a clear and straightforward relationship linking two "contrast classes" is part of the definition of this psychological construct (see how "contrast class" is defined by Oaksford and Stenning, 1992; Oaksford, 2002). More generally, the characterization of contrast/contrariety/opposition in terms of maximum distance with at the same time a high degree of affinity is a common feature in research in the fields of both Psycholinguistics and Experimental Psychology. In the areas of Cognitive Semantics and Linguistics, opposites refer to the extremes of an underlying dimension (e.g., Lehrer and Lehrer, 1982; Cruse, 1986; Jones et al., 2012). Antonyms are at the same time minimally and maximally different from one another. They are associated with the same conceptual domain, but they denote opposite poles or parts of that domain (Cruse, 1986; Paradis, 1997, 2001; Murphy, 2003, pp. 43–45; Willners, 2001; Croft and Cruse, 2004, pp. 164–192; Paradis et al., 2009). These two features (maximum distance and invariance) also characterize contrariety/opposition from a perceptual point of view. Various studies on the perception of this relationship in a number of different types of visual configuration have shown that a necessary condition for two events under observation to be perceived as contrary is that a maximum transformation of a salient feature (which in these studies was usually orientation) is manifested among overall invariant configurations. This has been formalized in the perceptual principles of non-additivity and invariance in Bianchi and Savardi (2006; 2008a; see also Bianchi and Savardi, 2008b,c; Savardi et al., 2010; Bianchi et al., 2014).

The duality that thinking in terms of contraries seems to imply (i.e., maximum variation in an overall invariant configuration, extremes of a common underlying dimension and discontinuity in a clearly continuous pattern) also emerges when we consider negation and counterfactual thinking. In natural language and reasoning, humans tend to use negation in precise ways, following cognitive rules. One of the roles of negation is that of being a modifier of degree. This happens, for instance, when we say that "the water is not hot" about water that may be warm, lukewarm or cool (Bolinger, 1972; Horn and Kato, 2000; Israel, 2001; Giora et al., 2005a,b). Negation presupposes a polar dimension along which a shift away from the adjective to which "not" is applied occurs (Kaup et al., 2006, 2007; Paradis and Willners, 2006; Fraenkel and Schul, 2008; Bianchi et al., 2011c). Negated propositions are assumed to evoke two contrasting spaces, a factual and a counterfactual space (Langacker, 1991; Fauconnier and Turner, 2002; Hasson and Glucksberg, 2006). Counterfactual thinking requires a capacity to imagine alternatives to events, actions or states in order to test and validate hypotheses (Roese, 1997; Byrne, 2005). Counterfactual strategies are employed in falsification processes which are central to inductive and deductive reasoning (Wason, 1960, 1966, 1968; Farris and Revlin, 1989; Oaksford and Chater, 1994; Evans, 2002; Augustinova et al., 2005; Augustinova, 2008). In this case both confirmatory hypotheses and disconfirmatory hypotheses (that according to Wason's definition, 1960, literally contradict the previous ones) are generated. Counterfactual thinking is also implied in decision making.
Inducing participants to take into account possibilities which are diametrically opposite to their initial assumptions is a de-biasing strategy which allows them to counteract the tendency to not consider adequately those alternatives which are at odds with their beliefs and perceptions, and this leads to more accurate decisions (Lord et al., 1984; Mussweiler et al., 2000). The ability to imagine contrasting alternatives is also related to creativity and analytical problem solving. Additive counterfactuals, i.e., the addition of different antecedent elements to reconstruct reality (Roese and Olson, 1993), enhance performance in creative generation tasks that are facilitated by an expansive processing style, whereas subtractive counterfactuals, i.e., removing antecedent elements to reconstruct reality (Roese and Olson, 1993), enhance performance in analytical problem solving tasks that are facilitated by a relational process style (Markman et al., 2007).

The idea that both discontinuity and continuity are involved in the re-organization which takes place in insight problem solving was somehow prefigured by Gestalt psychologists. They did not explicitly discuss it in terms of contrariety/contrast, but in a sense they paved the way toward the hypothesis put forward in this paper, i.e., that contraries support the representational change that is required for an insight problem to be solved. As Wertheimer (1945) was the first to point out, a solution process requires problem solvers to reorganize the phenomenological features of the problem and this apparently occurs as a sudden "aha" moment. But it is less often remembered that Wertheimer also explicitly specified that this reorganization is based on the requirements of the initial phenomenological structure of the problem and is as such guided by them (representing the continuity element). According to him, the key operations in this reorganization are dividing elements that are unified and unifying elements that are separated while transforming their orientation and position in space [see Wertheimer's classic parallelogram problem, reported in Appendix 1 (Supplementary Material)]. From Duncker's perspective too (Duncker, 1945), productive thinking implies creating a break with the original formulation and representation of the problem and the usual way of thinking and using its inherent features in an unusual, sometimes even contrary way (representing the discontinuity element). In line with Wertheimer, he also explicitly stated that the solution process is suggested by and guided along directions emerging from the original structure of the problem.

If one adds to Wertheimer's and Duncker's premises the evidence that the human direct experience of space is grounded in oppositional structures which mostly refer to the human body (e.g., Howard and Templeton, 1966; Golledge, 1992; Shelton and McNamara, 1997; Tversky and Hard, 2009) such as near-far, high-low, vertical-horizontal, in front of-behind, above-below, left-right, etc. (Savardi and Bianchi, 2009; Bianchi et al., 2011a,b, 2013), one can see first of all why contraries support the transformation of a problem's spatial representation and foster its reorganization and secondly why they do so while remaining anchored to the structure of the problem (this has been partially discussed in Branchini et al., 2009, 2015b).

# Aware vs. Unaware Processes

One of the factors implied in business-as-usual versus special-process perspectives concerns the consciousness vs. unconsciousness of the underlying thought processes in problem solving. This is one of the basic dichotomies characterizing thinking and reasoning processes even beyond problem solving, as acknowledged in dual-process theories (for an updated review of this see Evans and Stanovich, 2013; Weisberg, 2015).

The issue is also discussed in the literature investigating the effects of hints or training in problem solving. In most of these studies the hints provided by the experimenters, which aim to bring to the fore the critical feature of the problem, consist of implicit suggestions to problem solvers. For example, the solution to Duncker's radiation problem speaks of multiple low-intensity lasers being directed at the tissue from several angles rather than concentrating them onto a limited area (and thus risking damage to the skin in that area). In Grant and Spivey's study (2003) the hint came from an animation of the whole oval perimeter representing the skin. In Bröderbauer et al.'s study (2013) on Katona's five square problem, the hint provided to participants in the experiment consisted of a "wave form" (the shape represented in the solution) hidden in the logo of the research group. In Öllinger et al.'s study (2013) on the eight-coin problem, the implicit suggestion to use the third dimension to find the solution was provided by a variety of different initial configurations of the eight-coin problem (some of which cued the use of the third dimension).

Conversely, training tends to work on an explicit level because the aim is to make participants aware of how to solve a specific set of problems. For example, Dow and Mayer (2004) developed four different training programs (i.e., a verbal insight problem training packet, a mathematical insight problem packet, a spatial insight problem training packet and a combined verbal and spatial insight problem training packet). Each of these included information about the critical features of the specific set of problems and a description of the three-step procedure to be followed in order to solve that particular type of problem. Patrick and Ahmed (2014) and Patrick et al. (2015) developed various training programs in which participants were informed about the nature of verbal insight problems and were then instructed to use a specific procedure to solve that specific type of insight problem.

The study presented in this paper represents a conceptual development of a previous study (Branchini et al., 2015a). In that study participants working in small groups were given implicit guidance during the search phase in order to help them to analyze the spatial properties of the problems they had been presented with in terms of contraries. Contraries acted as an implicit heuristic since participants were only "primed" to consider contraries in one experimental condition and "prompted" with a vague hint in another condition. They were not specifically told how or why doing this might help. The suggestion led to shorter periods of time needed to find the solution, increased success rates and it also modified the kind of operations performed during the solution process: there were more goal directed behaviors, more reformulations of the problem and more operations directed toward a modification of the visual structure of the problem (e.g., changing orientation and localization, and reciprocal positioning of parts of the overall structure). In Gale and Ball's study too (2012), a contrast cue (ascending vs. descending triples) acted on an implicit level. Why the two exemplars should facilitate the discovery of the rules by pointing to the salient dimension (ascending-descending) was not made explicit to participants.

The study presented in this paper aimed to provide an expanded analysis of the impact of contraries on visual-spatial problem solving by foreseeing and testing the possibility that contraries might have a beneficial effect when used as part of a conscious and explicit strategy. If one keeps in mind Öllinger et al.'s model of the phases characterizing problem solving (Öllinger et al., 2008; but see also Ohlsson, 1992; Knoblich et al., 1999, 2001), one could foresee contraries to be beneficial in three different stages:


in problem representation will reach the threshold of awareness.

In this paper we do not put forward a hypothesis regarding the specific stage at which the effect of the training should come into play. Prompting participants to think in terms of opposites from the very beginning of the solution search phase might have an impact already at the level of the initial representation formed in the problem solver's mind, but it might also have a later effect and support representational changes following the experiences of impasse, by suggesting the "new" starting point to be considered in the new attempt (at this level it would act as an intentional process). However, the preliminary training might also trigger an arousal toward oppositional thinking that also operates at an unconscious level during the impasse phases. A different experimental design from that used in the present study would be needed in order to answer the question of whether the effect concerns exclusively one of these phases or all of them. The aim of this study was to verify whether or not the training has an effect.

# THE PRESENT STUDY

In this study we explore whether explicit training aimed at increasing the awareness of a heuristic based on contraries has a positive effect on the reasoning processes related to visuo-spatial insight problems. A brief training program was developed which demonstrated that the systematic manipulation of the features relating to a problem (in this case transforming them into their contraries) might facilitate the search for a solution. We focused on problem solving in a group setting (with groups of three people) since a previous study which demonstrated the positive effects of providing implicit guidance to use opposites in problem solving (Branchini et al., 2015a) was conducted with groups. It is also well known from previous literature that problem solving in groups does not necessarily follow the same path as individual problem solving (for a review, see for example Laughlin et al., 2006). Although our main interest was in the group condition, we also added an individual condition in order to have a comparative indication of the effect of training in this latter case.

We tested the effects of the training in terms of success rates, the time needed to find the solution and the number of attempts made in the search phase. Each drawing done by the participants in the search phase was considered an attempt to find a solution. In order to further tap into the ways in which training influenced the thought processes of the participants, we also studied the spatial characteristics of the drawings done by the groups for one of the seven problems we had given them to solve. We randomly selected the "pigs in a pen" problem. The decision was made to analyze the drawings (as a dependent variable) rather than the discussions between the participants since drawings can be regarded as behavioral correlates of the cognitive search space and as such reveal participants' aware and unaware cognitive processes (see Fedor et al., 2015). Moreover, drawings are often the best way to share thoughts when people work together in a group.

Specifically, we aimed to explore whether the training impacted on their performance in terms of:


Moreover, by means of an in-depth analysis of the drawings done by the participants working in groups while trying to solve the "pigs in a pen" problem, we aimed to gain an insight into how the training impacted on:


In order to help us to interpret the findings which had emerged from the analyses of the drawings in the group condition, a comparative analysis of the dimensions manipulated by participants in the individual (baseline versus training) conditions was also conducted.

# MATERIALS AND METHODS

### Participants

One hundred and thirty-six participants (46 male, 90 female, M = 25.74 years, SD = 2.45 years) took part in the experiment individually (62 in the baseline condition and 74 in the training condition) during university classes on topics not related to the study. Another one hundred and twenty participants (33 male, 87 female, M = 21.73 years, SD = 2.19 years) took part in the experiment in groups of three. They were divided into forty inter-observational groups (20 groups, i.e., 60 participants, in the baseline condition and 20 groups, i.e., 60 participants, in the training condition). All of the participants were undergraduate students at the University of Verona and gave written informed consent to participate in the study. The study was approved by the ethical committee of the Department of Human Sciences of the University of Verona (Italy) and conforms to the ethical principles of the declaration of Helsinki (World Medical Association, 2013).

### Materials/Problems

Seven spatial geometrical problems were used in all conditions (see **Table 1**). The order of the seven problems was randomized between participants.

### Procedure

The experiment consisted of two Participation conditions (individually vs. in groups) and two Training conditions (the training condition vs. the no-training condition or baseline).

#### TABLE 1 | The seven problems used in the study.


In the baseline condition, participants were presented with seven spatial geometrical problems and were asked to find a solution. In the training condition, participants attended a brief training session (duration: 10 min) before being shown the problems. During the training session, one of the experimenters explained how a strategy based on the manipulation of contraries could help with spatial geometrical problems. This was done by showing how three spatial geometrical problems, i.e., the "parallelogram" problem (Wertheimer, 1945), the "nine-dot" problem (Maier, 1930) and the "altar-window" problem (Wertheimer, 1945), could be solved by applying this strategy (to understand precisely what "changing a property into its contrary" means in relation to the three example problems we refer to, see Appendix 1 in Supplementary Material). The participants were then requested to apply the strategy to seven new problems (see **Table 1**). They were specifically invited to identify and list all the spatial features which characterized the problem and then transform them into their contraries (the first step) before embarking on the search for a solution (the second step). Before being given the seven problems, the participants were requested to rate on a 0–10 point scale how well they had understood the training and to what extent they considered it to be useful.

In all the conditions, participants were provided with pens and sheets of paper to use for drawings or notes. They were given seven and a half minutes for each problem<sup>1</sup>. When they thought they had found the solution, they were instructed to raise their hands. The experimenters took note of their response time before ascertaining whether the solution was correct or not. If it was, the response time was recorded; if not, they were encouraged to keep searching. All the sessions were video-recorded.

# RESULTS

There are two points regarding methodology which need to be noted before the discussion of the results. First, the survey carried out after the training session showed that the participants exposed to training reported that they felt they had understood it (mean rating of understanding in both conditions: M = 7.4, SD = 1.67) and that they considered it to be potentially useful (mean rating of predicted usefulness in both participation conditions: M = 7.33, SD = 1.86). This confirms that the participants had not only been exposed to training, but that they had also taken it in.

Secondly, all the statistical analyses presented in the following sections were carried out using Generalized Linear Mixed Models (GLMMs). This meant it was possible to deal with the variability related to the Problems and the Subjects as random effects while considering the two experimental conditions (Participation condition and Training condition) as fixed effects. Random effects have factor levels that do not exhaust the possibilities. If one of the levels of a variable were replaced by another level, the study would be essentially unchanged (Borenstein et al., 2009). For the purposes of the hypotheses tested in our study, the problems used in the experiment were simply exemplars of a general category (i.e., visuo-spatial geometrical problems) and they were interchangeable with any other problems of the same type. We did not select them because they differed on some feature that we expected to interact systematically with the fixed effects manipulated in the study; we chose these problems as random exemplars of visuo-spatial problems of varying degrees of difficulty. According to item response theory (Baker, 2001), every item can be described by two characteristics: item discrimination and item difficulty. These express the relationship between a latent ability (in our case insight problem solving ability) and the probability of correct responses for an item (in our case, a problem). Since our study aimed to test whether the experimental conditions (Participation and Training) affected performance, one of the minimum desirable conditions to start with was to use a set of items (problems to solve) which were characterized by varying degrees of difficulty ranging in probability from 0 to 1. As can be seen in **Table 2**, the frequency of correct solutions associated with each problem did in effect vary across problems. It was particularly high for some problems, particularly low for others and in between for some others.
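For readers unfamiliar with item response theory, the relationship between latent ability, item difficulty and item discrimination mentioned above is commonly written as a two-parameter logistic function. The notation below is an assumption introduced for illustration (the symbols $\theta$, $a_i$ and $b_i$ do not appear in the study), and it is offered as a standard textbook form rather than as a model fitted to these data:

$$
P(\text{correct on item } i \mid \theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}
$$

Here $\theta$ stands for the latent ability (insight problem solving ability), $b_i$ for the difficulty of problem $i$ and $a_i$ for its discrimination. When $\theta = b_i$ the probability of a correct response is 0.5, so a set of problems with widely spread $b_i$ values covers a wide range of solution probabilities.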

# Success Rates

To begin with, we studied the effects on the success rate (i.e., the number of correct responses over the total number of responses) of the Training condition, i.e., training versus baseline, and the Participation condition, i.e., individual versus group, using a GLMM (binomial family, with Subjects and Problems as random effects).
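One way to write down a model of this kind is sketched below. The notation is hypothetical and the random-effects structure (random intercepts only) is an assumption, since the text specifies only the family, the fixed effects and the grouping factors:

$$
\operatorname{logit} P(y_{sp} = 1) = \beta_0 + \beta_1 T_s + \beta_2 G_s + \beta_3 T_s G_s + u_s + v_p, \qquad u_s \sim \mathcal{N}(0, \sigma_u^2), \quad v_p \sim \mathcal{N}(0, \sigma_v^2)
$$

In this sketch $y_{sp}$ is 1 if subject $s$ (an individual or a group) solved problem $p$, $T_s$ codes the Training condition, $G_s$ codes the Participation condition, and $u_s$ and $v_p$ are the random intercepts for Subjects and Problems. The coefficient $\beta_3$ corresponds to the interaction between the two conditions.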

No significant main effect of the Training condition emerged, meaning that training did not lead per se to better results independently of the Participation condition (i.e., in groups or individually). A main effect of the Participation condition emerged [$\chi^2(1, N = 176) = 11.6301$, $p < 0.001$], suggesting that groups perform better than individuals. However, there was a significant interaction between the Participation condition and the Training condition [$\chi^2(1, N = 176) = 3.673$, $p = 0.05$; see **Figure 1**] indicating that groups did not perform better than individuals in the baseline condition (Bonferroni post-hoc baseline-group vs. baseline-individual: EST = 0.407, SE = 0.389, z ratio = 1.044, p = 1.000). Therefore, being part of a group did not in itself guarantee a better success rate. Higher success rates emerged exclusively when the groups were exposed to training: their performance was significantly better than the performance of the individual participants in the training condition (post-hoc training-group vs. training-individual: EST = 1.452, SE = 0.384, z ratio = 3.778, p < 0.001) and also the individual participants in the baseline condition (post-hoc training-group vs. baseline-individual: EST = 1.064, SE = 0.387, z ratio = 2.744, p < 0.05).
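The post-hoc statistics above can be read with the help of a small sketch showing how a z ratio relates to an estimate and its standard error, and how a Bonferroni adjustment scales a p value. This is an illustration under stated assumptions, not a reconstruction of the authors' analysis: the function names are hypothetical, a normal approximation is assumed for the p value, and the number of comparisons (six, i.e., all pairwise contrasts in a 2 × 2 design) is an assumption that is not stated in the text. Because the reported EST and SE values are rounded, the recomputed z ratios differ slightly from the reported ones.

```python
import math

def z_ratio(estimate: float, standard_error: float) -> float:
    """Wald-type z ratio: estimate divided by its standard error."""
    return estimate / standard_error

def two_sided_p(z: float) -> float:
    """Two-sided p value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def bonferroni(p: float, n_comparisons: int) -> float:
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p * n_comparisons)

# Assumption: all six pairwise contrasts of a 2 x 2 design were tested.
N_COMPARISONS = 6

# EST and SE values as reported in the text (rounded).
reported = {
    "baseline-group vs. baseline-individual": (0.407, 0.389),
    "training-group vs. training-individual": (1.452, 0.384),
    "training-group vs. baseline-individual": (1.064, 0.387),
}

if __name__ == "__main__":
    for contrast, (est, se) in reported.items():
        z = z_ratio(est, se)
        p_adj = bonferroni(two_sided_p(z), N_COMPARISONS)
        print(f"{contrast}: z = {z:.3f}, adjusted p = {p_adj:.4f}")
```

Under these assumptions the adjusted p value for the first contrast is capped at 1, which is in line with the reported p = 1.000, while the other two fall below the thresholds reported in the text.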

# Time Needed to Find a Solution

A higher success rate did not necessarily mean that participants were also faster. On the contrary, a GLMM carried out on the time taken to reach the correct solution (Gaussian family, with Training condition and Participation condition as fixed effects; Subjects and Problems as random effects) revealed that it took the participants longer to find the correct solution in the training condition than in the baseline condition [main effect of Training condition: χ²(1, N = 176) = 6.144, p < 0.02; see **Figure 2**]. This was independent of whether they were working individually or in groups (i.e., no interaction between the Training and Participation conditions emerged). In the training condition they were asked to start by listing all the opposite spatial properties they could identify in the structure of the problem. As we considered this phase to already constitute part of the analysis of the problem, in the experimental design the time taken up for this analysis was included in the seven and a half minutes they had at their disposal. The longer solution times may thus be a consequence of their having spent some time on this initial phase. In other words, training is effective in terms of success rates but it is nonetheless time consuming.

# Number of Attempts, i.e., the Number of Drawings Done

We analyzed the number of attempts made by each group by means of a GLMM (Poisson family, with Frequency as a dependent variable, Training condition and Participation condition as fixed effects, Subjects and Problems as random effects). There was a significant effect relating to the Participation condition [χ²(1, N = 176) = 63.671, p < 0.0001]: in groups, participants made more attempts than when participating individually (**Figure 3**). There was no significant effect of the Training condition and no interaction between the two fixed effects, thus indicating that training did not lead to a difference in terms of the number of attempts. The subsequent analyses were conducted in order to ascertain whether there were any differences in the quality, rather than the quantity, of the drawings.

<sup>1</sup> Six to ten minutes have been used to test insight problem solving in a thinking aloud condition, for example, in studies done by Ball et al. (2015), Schooler et al. (1993) and Fleck and Weisberg (2013).

TABLE 2 | Success rate (i.e., the proportion of correct responses over the total number of responses) for the seven problems used in the study, in the Training and Participation conditions.

# Behavior during the Search for a Solution: Spatial Features Manipulated in the Drawings Done by the Groups When Trying to Solve the "Pigs in a Pen" Problem (in the Baseline and Training Conditions)

As part of this study, we also examined the drawings done by the participants in their search for a solution to the "pigs in a pen" problem (randomly chosen out of the seven presented). We studied whether and how the training and baseline conditions differed in terms of the set of spatial properties explored in the drawings (Section The space relating to the problem: relevant and non-relevant properties) and whether both poles of a dimension were considered (Section The search space in terms of dimensions). We also assessed the degree of changeability/fixedness of the properties considered in each of the attempts (Section Degree of changeability/fixedness of the properties considered in each of the attempts). These analyses were meant to help explain how the training had modified the procedures followed by problem solvers in the search phase. We acknowledge the limits of an analysis conducted on only one of the seven problems. However, analyses of a single problem are not uncommon in insight-problem solving studies (e.g., Grant and Spivey, 2003; Kershaw et al., 2013; Öllinger et al., 2013, 2014). Moreover, the results which emerged from this analysis were not meant to be conclusive, our intention was merely to offer some further indications on how training might have modified the direction which the participants' search took.

FIGURE 1 | Fixed effect plot of the interaction between the Training condition (baseline, training) and the Participation condition (group, individual) on the success rate (logit-scale). The bars represent 95% confidence intervals.

Two independent judges analyzed 313 drawings and determined which spatial properties were displayed in each drawing using an ad hoc classification grid made up of 42 pairs of opposite spatial properties, i.e., 84 properties in total (e.g., symmetrical-asymmetrical, angular-rounded, left-right, dense-sparse; see Appendix 2 in Supplementary Material). The grid was an adaptation of a list of 37 basic dimensions characterizing direct experiences of space (Bianchi et al., 2011b) in terms of extension, shape, localization and orientation. The degree of inter-rater agreement reached by the two independent judges turned out to be very high (Cohen's κ = 0.85).
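For readers unfamiliar with the agreement index, Cohen's kappa corrects the observed proportion of agreement between two judges for the agreement expected by chance. The sketch below is illustrative only: the function name and the toy ratings are assumptions and do not reproduce the judges' data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1.0:
        return 1.0  # Both raters used one identical label throughout.
    return (observed - expected) / (1.0 - expected)


# Toy example: 1 = property displayed in the drawing, 0 = not displayed.
judge_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
judge_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(judge_1, judge_2), 2))
```

In the toy data the judges agree on 9 of 10 items while chance agreement is 0.5, so kappa is (0.9 − 0.5) / (1 − 0.5) = 0.8.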

#### The Space Relating to the Problem: Relevant and Non-relevant Properties

The task in the "pigs in a pen" problem is to add two more square pens so as to ensure that each pig ends up in a pen of its own. The square pen shown in the initial figure is represented in its typical orientation, i.e., a square with two horizontal sides and two vertical sides (**Figure 4**, diagram on the left). By adding two differently oriented, progressively smaller squares inside the original pen, the solution can be found (**Figure 4**, diagram on the right). In terms of the classification grid used in this study, 37 of the 84 spatial properties listed are relevant (as indicated in Appendix 2 in Supplementary Material). We created a new independent variable (Relevance, with two levels: relevant, non-relevant) and analyzed the drawings done by the participants in the two Training conditions in terms of use or non-use (a dichotomous dependent variable) of the various relevant and non-relevant properties.

A GLMM was conducted on the Use of the 84 properties made by the groups in the baseline and training conditions (binomial family, with Training condition and Relevance as fixed effects, Group and Property as random effects). The analysis revealed a main effect of Relevance [χ²(1, N = 40) = 108.173, p < 0.0001], i.e., relevant properties were used more frequently than non-relevant properties, but it also revealed a significant interaction between the Training condition and Relevance [χ²(1, N = 40) = 64.725, p < 0.0001; see **Figure 5**]. Post-hoc tests clearly showed that relevant properties were more likely to appear in the drawings produced by the participants exposed to training as compared to the baseline condition (EST = 0.371, SE = 0.142; z ratio = 2.602, p < 0.05), whereas no significant difference was found for non-relevant properties (EST = 0.201, SE = 0.141; z ratio = 1.423, p = 0.928).

#### The Search Space in Terms of Dimensions

In the training session, it was explicitly suggested that in the search phase participants should consider not only the properties pertaining to the initial representation of the problem but also their contraries. Therefore, we expected participants exposed to training to more frequently use both of the two opposite properties in their drawings, i.e., both of the two poles forming a dimension (e.g., large and small, inside and outside) than the participants in the baseline condition.

We defined a new variable (Dimension Use) on 4 levels: Dimension Within Attempt (DWA), i.e., both properties were used within the same drawing (e.g., a straight sided square pig pen and an obliquely oriented pig pen); Dimension Between Attempts (DBA), i.e., a property was used in one drawing and the opposite property was used in another drawing (e.g., one drawing exclusively showed straight sided square pig pens and another drawing displayed an oblique pig pen); Pole (P), i.e., participants never referred to a whole dimension in any of their drawings (e.g., they drew only straight sided pens and never changed the orientation of the pen); None (N), i.e., neither of the two poles were used in any of the drawings. Since in some cases both contrary properties were relevant to the solution, in other cases only one of the two properties was relevant and in yet other cases neither of the two properties was relevant, the analyses were made taking into account the Relevance of the dimension on the three levels mentioned earlier (relevant, partially relevant, non-relevant). For each of the 42 dimensions forming the classification grid, we calculated how frequently (in proportion to the total number of drawings done) the dimension was used in one of the four modalities (DWA, DBA, N, P). We then conducted a GLMM on this data (binomial family, with Training Condition, Relevance and Dimension Use as fixed effects and Dimensions and Groups as random effects).
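The four levels of Dimension Use can be made concrete with a small classification sketch. It is a hypothetical illustration: the function name, the representation of a drawing as a set of property labels, and the rule that DWA takes precedence over DBA when both apply are assumptions, not the coding procedure actually used.

```python
def classify_dimension_use(drawings, pole_a, pole_b):
    """Classify how one dimension (a pair of opposite properties) was used.

    `drawings` is a list of sets, each holding the property labels that the
    judges found in one drawing. Returns "DWA", "DBA", "P" or "N".
    """
    both_in_one_drawing = any(pole_a in d and pole_b in d for d in drawings)
    used_a = any(pole_a in d for d in drawings)
    used_b = any(pole_b in d for d in drawings)

    if both_in_one_drawing:
        return "DWA"  # both poles within the same attempt
    if used_a and used_b:
        return "DBA"  # both poles, but in different attempts
    if used_a or used_b:
        return "P"    # only one pole ever used
    return "N"        # neither pole used


# Hypothetical drawings for the straight-oblique orientation dimension.
print(classify_dimension_use([{"straight", "oblique"}], "straight", "oblique"))
print(classify_dimension_use([{"straight"}, {"oblique"}], "straight", "oblique"))
print(classify_dimension_use([{"straight"}, {"straight"}], "straight", "oblique"))
print(classify_dimension_use([{"large"}], "straight", "oblique"))
```

The four calls return DWA, DBA, P and N in that order, matching the examples given in the definitions above.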

The interaction between the Training condition and Dimension Use turned out to be significant [χ²(3, N = 40) = 39.784, p < 0.0001]. Bonferroni post-hoc tests revealed that the drawings done in the two Training conditions did not differ either in terms of the probability of the whole dimension being used within the same drawing (DWA, EST = 0.077, SE = 0.043, z ratio = 1.774, p = 1.000) or in terms of a dimension never being used (N, EST = −0.083, SE = 0.048, z ratio = −1.699, p = 1.000). The differences we found concerned the use of only one of the two poles of a dimension (P) and the use of one dimension divided between attempts (DBA). Participants more frequently used only one pole (P) in the baseline condition as compared to the training condition (EST = 0.383, SE = 0.094, z ratio = 4.052, p = 0.001). This is in line with our prediction that training would prompt the exploration of both poles of a dimension. The use of one dimension divided between two attempts (DBA) also turned out to be more probable in the baseline as compared to the training condition (EST = 0.291, SE = 0.051, z ratio = 5.664, p < 0.0001). This is apparently in contrast with our predictions, but a significant interaction between the Training condition, Dimension Use and Relevance [χ²(6, N = 40) = 104.871, p < 0.0001; see **Figure 6**] and corresponding post-hoc tests revealed that this held specifically for dimensions which were not relevant to the solution (EST = 0.621, SE = 0.118, z ratio = 5.226, p < 0.0001). No significant differences were found between the training and baseline conditions with regard to the relevant dimensions (EST = −0.124, SE = 0.079, z ratio = −1.569, p = 1.000) or the partially relevant dimensions (EST = −0.129, SE = 0.058, z ratio = −2.204, p = 1.000).

Post-hoc tests also revealed that the training had reduced the use of only one pole (P) specifically for the relevant dimensions (EST = 2.153, SE = 0.401, z ratio = 5.362, p < 0.0001). No significant differences were found between the training and baseline conditions with regard to the irrelevant dimensions (EST = 0.018, SE = 0.347, z ratio = 0.054, p = 1.000) or the partially relevant dimensions (EST = 0.177, SE = 0.324, z ratio = 0.547, p = 1.000).

#### Degree of Changeability/Fixedness of the Properties Considered in Each of the Attempts

The changeability/fixedness of a property in the search space considered by participants was expressed in terms of the number of drawings done which displayed the property in question, as a proportion of the total number of drawings done by that group. The greater the proportional value, the greater the degree of fixedness, e.g., a value of 1 would indicate that the property was used in all of the drawings done by a particular group, representing maximum fixedness. A GLMM was conducted on the values of changeability/fixedness for each property (binomial family, with Training condition and Relevance as fixed effects, Group and Property as random effects). A significant main effect of Relevance emerged [χ²(1, N = 40) = 68.268, p < 0.0001]: relevant properties were kept fixed more frequently across attempts than non-relevant properties. However, a significant interaction between Relevance and the Training condition also emerged [χ²(1, N = 40) = 26.099, p < 0.0001; see **Figure 7**]. Post-hoc tests revealed that the groups exposed to training did not differ from those in the baseline condition in terms of their aptitude toward changing relevant properties (EST = 0.092, SE = 0.047, z ratio = 1.933, p = 0.319). Conversely, non-relevant properties were less fixed (in other words more changeable) across attempts in the training condition than in the baseline condition (EST = −0.178, SE = 0.058, z ratio = −3.081, p < 0.01).
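The fixedness index defined at the start of this section is a simple proportion, which the following sketch computes for one group. The function name and the example drawings are hypothetical and serve only to illustrate the definition.

```python
def fixedness(drawings, prop):
    """Proportion of a group's drawings that display the given property.

    A value of 1 means the property appeared in every drawing (maximum
    fixedness); lower values mean the property was more changeable.
    """
    if not drawings:
        raise ValueError("At least one drawing is required.")
    return sum(prop in d for d in drawings) / len(drawings)


# Hypothetical group with four drawings.
group_drawings = [
    {"square", "straight"},
    {"square", "straight"},
    {"square", "oblique"},
    {"square", "straight"},
]
print(fixedness(group_drawings, "square"))
print(fixedness(group_drawings, "oblique"))
```

In this example "square" has a fixedness of 1.0 because it appears in all four drawings, whereas "oblique" has a fixedness of 0.25.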

# The Search Space in Terms of Dimensions Manipulated in the Drawings Done by the Participants in the Individual Condition When Trying to Solve the "Pigs in a Pen" Problem (in the Baseline and Training Conditions)

In order to help us explain why the effects of the training had emerged in the group condition but not in the individual condition, we explored whether the drawings made by individual participants while solving the "pigs in a pen" problem manifested similar trends to those found with the groups. In particular, we explored whether participants exposed to the training made use of only one pole of the dimension (P) less frequently than the participants in the baseline condition, or use of both of the two opposite properties more frequently in their drawings (i.e., Dimension Within Attempt, DWA, and/or Dimension Between Attempts, DBA). This might be considered a clue that they succeeded in applying the training, even though this did not lead to a higher solution rate—it is clear from literature on the subject that the effects of training are not necessarily manifested by better success rates (e.g., Patrick and Ahmed, 2014; Patrick et al., 2015).

As in the analysis of the groups' results, for each of the 42 dimensions forming the classification grid we calculated how frequently (in proportion to the total number of drawings done) the dimension was used in one of the four modalities (DWA, DBA, N, and P). We then conducted a GLMM on this data (binomial family, with Training Condition, Relevance and Dimension Use as fixed effects and Dimensions and Individuals as random effects). The interaction between the Training condition and Dimension Use turned out to be significant [χ²(3, N = 136) = 16.1313, p = 0.001]. Bonferroni post-hoc tests revealed that the drawings done in the baseline and training conditions did not differ either in terms of the probability of the whole dimension being used within the same drawing (DWA, EST = −0.071, SE = 0.060, z ratio = −1.193, p = 1.000) or between drawings (DBA, EST = 0.102, SE = 0.074, z ratio = 1.380, p = 1.000), nor did they differ in terms of a dimension never being used (N, EST = −0.134, SE = 0.069, z ratio = −1.939, p = 1.000). The differences concerned the use of only one of the two poles of a dimension (P): similarly to the results found for the groups, individual participants more frequently used only one pole in the baseline condition as compared to the training condition (EST = 0.388, SE = 0.083, z ratio = 4.654, p < 0.0001). The significant interaction between the Training condition, Dimension Use and Relevance [χ²(6, N = 136) = 48.940, p < 0.0001], and corresponding post-hoc tests, revealed that this held specifically for dimensions which were relevant to the solution (EST = 1.045, SE = 0.183, z ratio = 5.691, p < 0.0001). No significant differences were found between the training and baseline conditions with regard to the other categories of responses (DBA, DWA, and N). Therefore, in both the group and individual conditions, the training led to a reduction in the partial explorations of the solution space in terms of relevant properties (i.e., those limited to only one property, P).

A second GLMM was conducted to compare the two participation conditions (with Participation condition, Training Condition, Relevance and Dimension Use as fixed effects and Dimensions and Groups as random effects). A significant interaction between Participation Condition, Training condition and Dimension Use emerged [χ²(3, N = 136) = 32.7819, p < 0.0001]. The use of only one pole (P) was significantly less frequent when participants exposed to the training solved the problems in groups as compared to when they did it individually (EST = 0.448, SE = 0.109, z ratio = 4.108, p < 0.005).

# DISCUSSION AND CONCLUSIONS

The study presented in this paper aimed to further explore the hypothesis that reasoning in terms of contrast/contrariety/opposition might facilitate problem solving. Our results from the explicit guidance condition add to the previous literature based on implicit guidance which we mentioned in the introduction (e.g., Gale and Ball, 2012; Branchini et al., 2015a). The participants in our study were exposed to a brief training session in which it was suggested that they approach the task by systematically transforming the spatial features of a problem into their contraries. Examples were provided in order to demonstrate how this strategy might help to guide the participants' exploration of new representations throughout the solution search phase. Participants were then asked to transfer the strategy they had learned to seven other problems.

Four main findings emerged from the analyses. First, in terms of success rates (i.e., the number of problems for which participants were able to find a solution), the groups exposed to training performed better than the individuals. Second, exposure to training did not lead to an increase in the number of attempts made (i.e., the number of drawings). Third, our in-depth analysis of the characteristics of the drawings the participants had completed when trying to solve the "pigs in a pen" problem revealed that the search space on which they had concentrated did not in general expand; rather, they focused more on properties which were relevant to the solution, while the non-relevant properties that they had examined were more readily disregarded in subsequent drawings. Fourth, the participants exposed to training made fewer "incomplete" explorations of the possible manipulations of the relevant properties related to the structure of the problem, i.e., explorations limited to only one pole. This last finding, in particular, was tested and verified in both the group and individual participation conditions.

In conclusion, our in-depth analysis of the effects of training in the case of the "pigs in a pen" problem suggests that in the group condition the training expanded the search space in a focused way, i.e., it did not lead to a disoriented multiplication of attempts and participants kept close to the properties which were relevant to the problem (on the relationship between "antonymous reasoning" and originality of solutions, rather than fluency, see also Dumas et al., 2016). We interpret this focused process (which is in line with Öllinger et al., 2013, but also Gale and Ball, 2012) as a consequence of the element of continuity that is implied in the idea of contrariety, as we pointed out in the introduction. In terms of continuity versus discontinuity in reasoning processes activated by thinking in terms of opposites, our results support discontinuity, as the participants in the training condition were more likely to investigate various different paths (i.e., they less frequently limited their transformation to only one pole of the spatial dimensions they explored), each time discarding non-relevant properties. They also kept more relevant properties fixed across the attempts and here too continuity is implied. Further investigations are needed in order to ascertain the extent to which these last results (which are based on an in-depth exploration of only one problem) can be generalized. What emerged is, however, in agreement with studies that suggest that the search phase of problem solving is evolutionary in nature, with several search processes being launched simultaneously and their results being tested against a criterion of success which is defined by the structure of the problem. The most promising candidates are copied and modified until a solution is found or a dead-end is reached (Fernando et al., 2010; Dietrich and Haider, 2015).

The training we exposed participants to was not specific to a given problem (as in, for example, Chronicle et al., 2001, Experiment 3; Weisberg and Alba, 1981; Grant and Spivey, 2003; Kershaw and Ohlsson, 2004; Kershaw et al., 2013; Öllinger et al., 2013, 2014) but rather provided advice on how to search for a solution to a set of (spatial) problems and in this sense it resembles meta-cognitive training. However, it differs from other types of domain-specific meta-cognitive training investigated in previous literature (some of which is domain specific, e.g., Walinga et al., 2011; Patrick and Ahmed, 2014; Patrick et al., 2015) in that the participants in the present study were asked to use the "oppositional reasoning" strategy they had been told about in their exploration of the spatial domain. For both of these reasons it can be said to represent yet another method to add to the types of training whose facilitating effects on insight problem solving have been tested. It should be clear from the experimental design adopted in our study (with the control condition being no training and not another type of training) that the goal of the study was not to verify whether prompting participants to think in terms of opposites is more effective as compared to other types of training. Our goal was to collect evidence of whether explicitly showing participants how thinking in terms of contraries supports representational change leads to a better result than leaving them to work on their own. We wanted to ascertain whether this type of advice is useful and in fact the results of our study were positive.

As stated in the introduction, the study presented in this paper represents a conceptual development of a previous study in which contraries were used as an unaware, implicit strategy (Branchini et al., 2015a). A comparison between the effects of providing contraries as implicit versus explicit guidance might tell us something more about how this heuristic impacts on the solution process. It would also be interesting to explore new ways of stimulating both implicit processing (e.g., using dynamic visual tasks) and explicit processing (e.g., using different types of training). Given the current state of the art, only a provisional comparison can be made between the improvement due to prompting participants to use contraries in implicit and explicit guidance conditions, based on the findings from the present study and that carried out by Branchini et al. (2015a). There are obvious limits when one compares two experiments which do not perfectly coincide in terms of their experimental design. The problems used in Branchini et al. (2015a) were of a similar type to those used in the present study but only two were exactly the same; moreover, in the previous study the participants had no time limits whereas in the present study they were given seven and a half minutes. These differences are reflected in the percentage of correct responses in the baseline conditions in the two studies: 32% in the present study versus 67% in Branchini et al. (2015a). However, if we compare the improvement in the success rates associated with the experimental conditions in both of the studies, a similar effect emerges. Providing implicit guidance (as in Branchini et al., 2015a) led to 79% of correct solutions, which means an increment of 12 percentage points with respect to the corresponding baseline. Providing explicit guidance (as in the present study) led to 42% of correct solutions, which means an increment of 10 percentage points with respect to the corresponding baseline.
The similarity between these two increments is thought-provoking. It might indicate that aware or unaware processing of contraries led to similar results or, alternatively, it might indicate that it was not the explanation given to participants about which mechanism to apply that was relevant in the training condition. The participants were asked to look for contraries before embarking on the solution process (and this is exactly the same as in the implicit guidance condition in the study done by Branchini et al., 2015a), and this, rather than the training as a whole, might have implicitly stimulated the expansion of the search space and the relaxation of the constraints relating to the mental representation of the problems.
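The two increments compared above can be retraced from the four success rates given in the text. The sketch below uses only those reported percentages; the function name is hypothetical, and the relative change is an additional derived quantity shown for completeness, not a figure reported by either study.

```python
def increments(baseline_pct: float, condition_pct: float):
    """Return the absolute gain (percentage points) and the relative gain (%)."""
    points = condition_pct - baseline_pct
    relative = points / baseline_pct * 100.0
    return points, relative


# Success rates reported in the text (baseline, experimental condition).
studies = {
    "implicit guidance (Branchini et al., 2015a)": (67, 79),
    "explicit guidance (present study)": (32, 42),
}

for label, (baseline, condition) in studies.items():
    points, relative = increments(baseline, condition)
    print(f"{label}: +{points} percentage points, {relative:.1f}% relative to baseline")
```

The absolute gains are 12 and 10 percentage points, which is the comparison made in the text. Because the two baselines differ, the same gains correspond to different relative changes, so the comparison depends on which measure is chosen.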

A further question raised by the findings of the experiment presented in this paper concerns why training had positive effects specifically in the group condition. Participants also worked in small groups in the previous study, where a positive effect of an implicit prompt to use opposites was found (Branchini et al., 2015a). The fact that groups work more effectively than individuals in problem solving also emerged in Augustinova's study (2008) on the benefits of falsification cueing in Wason's selection task, and in Laughlin et al.'s studies on letters-to-numbers problems (Laughlin et al., 2002, 2003). Attention has been devoted to how groups process information and a key factor seems to be the high degree of social sharedness of information at group level (e.g., Larson and Christensen, 1993; Wittenbaum and Stasser, 1996; Tindale and Kameda, 2000; Tindale et al., 2001; Galinsky and Kray, 2004). In order to help us to explain why, in our study, the training specifically affected success rates in the performance of the groups, but not in the performance of the individuals, we explored the dimensions manipulated in the drawings made by participants in the individual condition and matched them to those made by the groups [Section The search space in terms of dimensions manipulated in the drawings done by the participants in the individual condition when trying to solve the "pigs in a pen" problem (in the baseline and training conditions)]. We found that, similarly to what happened in groups, participants in the individual condition who were exposed to the training limited their explorations of the properties relevant to the solution to only one pole (P) less frequently than individuals in the baseline condition. Therefore, at least for the "pigs in a pen" problem, the training seems to have a similar effect in both the individual and group conditions. It was simply stronger in the latter case.
Moreover, when they were in groups, the participants made more attempts than when participating individually (Section Number of attempts, i.e., the number of drawings done). These two results taken together suggest that the difference in success rates between individuals and groups might actually lie in the fact that individuals made fewer solving attempts than groups (although in the right direction), rather than in the fact that the attempts made to apply what they had learned were less effective.

Three further considerations on the effects of "thinking in opposites" in groups can, however, be put forward. Firstly, the general hypothesis underlying our training is that referring to contraries helps people to deal with the complexity of the problem structure by showing them "what is in there," not only in terms of actual properties but also in terms of their potential variations. The observation that every variation occurs within the framework of contraries is not only an intuition for which we are indebted to Aristotle (ed. Aristotle, 1984, Cat. 5, 4a 30-34). It is a principle that models the human direct experience of space (as pointed out by Savardi and Bianchi, 2009; Bianchi et al., 2011b) and also goes well beyond that, as testified by the pervasiveness of antonyms in every natural language (e.g., Jones, 2002; Murphy, 2003; Paradis et al., 2013). The training session in our study aimed to prompt divergent thinking and the creation of alternative representations by suggesting changes to the structure of the problem which were radical but at the same time also anchored to it. Working in groups might have facilitated this process since it has been demonstrated that the information which is more likely to be brought up in the discussion and more likely to influence decisions made is that which is shared by all group members (e.g., Wittenbaum and Stasser, 1996); the structure of the problem is the information shared by everyone in the group.

Secondly, in the training session participants were advised, as a first step, to identify all the spatial features characterizing the configuration of the problem. The better the descriptive analyses conducted in this initial phase were (in terms of exhaustiveness and precision), the richer the list of constraints to be relaxed in the following steps was when they transformed each feature into its contrary. We know from research into the Psychology of Perception that inter-observation in small groups of three to four members leads to more accurate descriptions of the facts under observation (see Bozzi, 1978; Bozzi and Martinuzzi, 1989; Kubovy, 2002). Therefore, in our case working in small groups might have improved the quality of the initial analysis of the structure of the problem.

Lastly, our training consisted of prompting participants to explore the structure of the problem in disconfirmatory rather than confirmatory terms. Confirmation biases are more likely to be prominent when people use their own problem solving strategies in an individual condition. On the contrary, if people in groups are asked to think in terms of opposites, not only do they do so on an individual basis, they also apply this strategy to suggestions coming from other members of the group. Moreover, it has already been shown that in argumentative discourse the ability to address opposing positions is crucial in order for people to coordinate their own perspective to that of other people (Kuhn and Udell, 2007) and that groups benefit more than individuals from the use of falsification cueing in reasoning (Augustinova

# REFERENCES


et al., 2005; Augustinova, 2008). Whether these data, taken together, are general evidence that small groups provide a better context for "thinking in terms of opposites" is an intriguing question, but as yet it is still premature for conclusions to be drawn.

# ETHICS STATEMENT

The study conforms to the ethical principles of the declaration of Helsinki and was approved by the ethical committee of the Department of Human Sciences of the University of Verona. Participants volunteered in the study. They signed the informed consent form approved by the ethical committee of the Department of Human Sciences of the University of Verona.

# AUTHOR CONTRIBUTIONS

EB, and EC substantially contributed to the conception of the work, the acquisition and coding of the data, the revision of the literature on the topic and the drafting of the study. IB, RB, and US substantially contributed to the conception and design of the study and the interpretation and analysis of the data. They also contributed to the drafting and revision of the work. EB, IB, RB, EC, and US approved the final version to be published and agree to be accountable for all aspects of the work in terms of the accuracy or integrity of any part of the study.

# ACKNOWLEDGMENTS

This work was funded by the Department of Human Sciences, University of Verona (Italy).

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01962/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Branchini, Bianchi, Burro, Capitani and Savardi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Losing Your Gut Feelings. Intuition in Depression

#### Carina Remmers<sup>1,2</sup>\* and Johannes Michalak<sup>3</sup>

<sup>1</sup> Vivantes Wenckebach Clinic – Clinic for Psychiatry, Psychotherapy and Psychosomatics, Berlin, Germany, <sup>2</sup> Department of Clinical Psychology, University of Hildesheim, Hildesheim, Germany, <sup>3</sup> Department of Clinical Psychology, Witten/Herdecke University, Witten, Germany

Whereas intuition has become a topic of great interest in basic research, clinical research, and depression research in particular, has not yet addressed the topic. This is astonishing because a well-known phenomenon during depression is that patients have difficulty judging and deciding. In contrast to healthy individuals, who make most daily life decisions intuitively (Kahneman, 2011), depressed individuals seem to have difficulty coming to fast and adaptive decisions. The current article pursues three goals. First, our aim is to establish the hypothesis that intuition is impaired in depression against the background of influential theoretical accounts as well as empirical evidence from basic and clinical research. The second aim of the current paper is to provide explanations for recent findings on the depression-intuition interplay and to present directions for future research that may help to broaden our understanding of decision difficulties in depression. Third, we seek to propose ideas on how therapeutic interventions can support depressed individuals in making better decisions. Even though our knowledge regarding this topic is still limited, we will tentatively launch the idea that an important first step may be to enhance patients' access to intuitions. Overall, this paper seeks to introduce the topic of intuition to clinical research on depression and thereby to set the stage for upcoming theory and practice.

#### Edited by:

Michael Öllinger, Parmenides Foundation, Germany

#### Reviewed by:

Kinga Morsanyi, Queen's University Belfast, UK; Nicola Baumann, University of Trier, Germany

\*Correspondence: Carina Remmers remmers.carina@gmail.com

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 30 April 2016 Accepted: 12 August 2016 Published: 23 August 2016

#### Citation:

Remmers C and Michalak J (2016) Losing Your Gut Feelings. Intuition in Depression. Front. Psychol. 7:1291. doi: 10.3389/fpsyg.2016.01291

Keywords: intuition, depression, automatic processes, decision-making, mood

# INTRODUCTION

In many situations, individuals judge and decide without long reflection on the problem at hand. Despite the lack of long deliberation, the decisional and judgmental outcomes are often smart and satisfactory (Gigerenzer, 2007). In other words, we sometimes know what is right even if we cannot explain why. In many situations, this is because we use our intuition. Even though intuition is a cognitive capacity that influences many decisions and subsequent actions in daily life (Kahneman, 2011), it has received little attention so far within clinical research on cognitive processes and decision-making in depression. This seems unfortunate because individuals with depression often report difficulty coming to decisions (American Psychiatric Association [APA], 2013). In line with its preceding versions, the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) includes indecisiveness as a diagnostic criterion for Major Depressive Disorder (American Psychiatric Association [APA], 2013). A question that may follow from this often reported phenomenon is whether intuitive decision-making is impaired during depression. Research on this hypothesis, however, is scarce.

# AIMS OF THE CURRENT PAPER

The current paper addresses three objectives. Our first aim is to develop the hypothesis that intuition is impaired in depression and that considering intuition in the scope of depression research may have important theoretical and practical consequences. The second aim of the article is to point to methodological issues and open questions. Against the background of novel findings on the depression-intuition interplay, we will propose directions for future research and specific ideas as to how research on intuition may broaden our knowledge regarding decision difficulties in depression. The third aim of the current paper is to adopt a practical and therapeutic point of view by addressing the question of how decision-making of depressed patients may be enhanced. Given that the overall knowledge regarding the interplay between depression and intuition is still limited, we will raise novel questions rather than give conclusive answers on this topic and thereby seek to set the stage for future research.

# DECISION-MAKING DURING DEPRESSION

When facing a decision, patients suffering from an acute episode of Major Depression often cannot make up their minds about what to do. The depressive mind is narrowed to a tunnel vision in which patients tend to circle around the same (negative) pieces of information. This processing style is called rumination (Nolen-Hoeksema et al., 2008; Lyubomirsky et al., 2015). Rumination means repetitively overthinking the causes and consequences of one's situation or mood state (Nolen-Hoeksema et al., 2008). It has been shown that a ruminative self-focus predicts and prolongs depression and that it impairs problem solving (see Lyubomirsky et al., 2015 for a review). Thus, rumination may interfere with intuitive processes as it fosters a narrow and analytical information processing style (Watkins and Teasdale, 2004).

Research has shown that depressed individuals make poor decisions (Leykin and DeRubeis, 2010) or no decision at all (Okwumabua et al., 2003) – a phenomenon from which they suffer severely (American Psychiatric Association [APA], 2013). Moreover, it has been shown that depressive symptomatology is significantly associated with reduced search for information (Leykin et al., 2011). Individuals with higher levels of depression further report that ambiguities and uncertainties remain unresolved after they make a decision (Leykin et al., 2011). Furthermore, higher levels of depression are associated with reduced perception of existing resources (e.g., assistance of other people; own talents) and reduced satisfaction with decisions (Leykin et al., 2011). Individuals with depression are at high risk of feeling uncertain about decisions they have made (Stacey et al., 2008). Moreover, depressed individuals report more anticipatory regret (Schwartz et al., 2002; Monroe et al., 2005). Whereas anticipatory regret may serve as a warning mechanism that protects a person from bad decisions (McCormack et al., 2015), it seems that depressed individuals experience such high levels of anticipatory regret that this results in passivity and inaction. In line with this, depressed patients report less confidence and self-esteem regarding their decision-making capacities (Leykin and DeRubeis, 2010) and tend to make decisions that have had negative outcomes previously (Leykin et al., 2011). Thus, patients with depression seem to have difficulty learning from prior decision-making experiences and tend to use maladaptive strategies repeatedly.

The findings reviewed above show that decision-making during depression seems to be fraught with difficulties. The question of which component of decision-making is impaired during depression remains open. It should be noted that there are a number of factors that may be related to decision-making difficulties, such as impaired reasoning capacities in depression (Radenhausen and Anker, 1988; Sedek and von Hecker, 2004; Perham and Rosser, 2012; Jung et al., 2014), lacking appreciation of information (Hindmarch et al., 2013), limitations in working memory capacities (Channon and Baker, 1994), and increased ruminative processes (Nolen-Hoeksema et al., 2008). However, in the following, we aim to address the topic of potentially impaired intuitive processes because intuition is an important and – amongst healthy samples – often used decision-making tool (Gigerenzer, 2007; Kahneman, 2011) with highly adaptive features, especially in personally relevant, complex decision-making situations (Kuhl, 2001; Topolinski and Strack, 2009a).

# WHAT IS INTUITION?

Ever since researchers began to investigate decision-making and judgment, there has been a fascination with the topic of intuition. Ancient philosophers used the term nous (Greek: noein) to refer to the ability of human beings to grasp what is real or true. Nous (or noesis) is often translated as 'good sense' or 'intuition' and it stands in contrast to rational, conscious reasoning. In this ancient definition, intuition is understood as a vehicle by which one can become aware of what one already knows. As such, it may allow people to gain access to pre-existing knowledge. Intuition further received considerable attention in the scope of psychoanalysis. Jung (1921) conceptualized intuition as a means by which a person can see the bigger picture. According to Jung (1921), intuition strives for new possibilities in what is objectively given. Intuition is the vehicle that automatically operates as soon as no other psychological function is able to find a way out of a complex situation (Jung, 1921). Along this line, for Jung, intuition is about discovering – a facet that still applies to current conceptions of intuition (Bowers et al., 1990).

Modern social and cognitive psychology operationalize intuition as a specific product in which puzzle pieces are quickly put together. Intuitions result from information processes that operate quickly, associatively, and unconsciously (Kahneman, 2011). Prior experiences and their mental representations build the basis for intuitive judgments and decisions. Thus, operating like a pattern completion mechanism, it appears that intuitive judgments are related to prior learning experiences and arise through unconscious holistic spreading processes (Sadler-Smith, 2008). They are often experienced as if they had come out of nowhere and enable individuals to detect coherences and patterns (Kahneman and Klein, 2009). Intuitions are typically described with the phenomenon of knowing something without knowing how (Epstein, 2010).

In order to directly elicit and measure intuition in the laboratory, researchers developed paradigms such as the semantic coherence task, a well-established experimental paradigm developed by Bowers et al. (1990). In the semantic coherence task, intuition is operationalized as the sudden perception or realization of coherence based on unconscious activation spread within associative networks (Bolte et al., 2003; Bolte and Goschke, 2005). During the task, participants see triads of words. Each word triad consists of three words, presented in a stacked format on a computer screen. Participants are asked to judge intuitively whether the presented word triad shares a common denominator (e.g., SALT DEEP FOAM; all words are associated with the solution concept SEA; coherent triad) or whether the triad consists of randomly selected words (DREAM BALL BOOK; no common denominator; incoherent triad). The intuitive performance is reflected by the degree to which participants can differentiate between coherent and incoherent word triads without being able to explicitly name the solution word (which would be indicative of insight and not intuition; Bolte and Goschke, 2005; Topolinski and Strack, 2009a,b,c; Topolinski and Reber, 2010). It has been shown that healthy participants are generally able to detect semantic coherence above chance level (Bolte and Goschke, 2005). They know when a triad is coherent, without being able to explicitly name the underlying solution word. This is even shown in experimental designs in which participants have less than 3 s for their decision, a time window during which the operation of explicit processes is very unlikely (Bolte and Goschke, 2005). Thus, the semantic coherence task operationalizes and measures intuition by assessing the activation of information (solution word) which is not consciously accessible (Bolte et al., 2003; Bolte and Goschke, 2005).
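The scoring logic of such a task can be illustrated with a short sketch. It assumes a simple discrimination index (hit rate minus false-alarm rate, computed only over triads whose solution word was not named). This particular index, the `Trial` record, the function name `intuition_index`, and the example trials are illustrative assumptions and not the scoring procedure or data of the cited studies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trial:
    coherent: bool         # True if the triad shares a common associate
    judged_coherent: bool  # the participant's fast coherence judgment
    solution_named: bool   # True if the solution word was explicitly named


def intuition_index(trials: List[Trial]) -> float:
    """Hit rate minus false-alarm rate over trials without a named solution.

    Assumed index for illustration; 0 means no discrimination between
    coherent and incoherent triads, positive values mean some discrimination.
    """
    # Explicitly solved triads are excluded, since naming the solution
    # would reflect explicit knowledge rather than an intuitive judgment.
    unsolved = [t for t in trials if not t.solution_named]
    coherent = [t for t in unsolved if t.coherent]
    incoherent = [t for t in unsolved if not t.coherent]
    if not coherent or not incoherent:
        raise ValueError("need unsolved coherent and incoherent trials")
    hit_rate = sum(t.judged_coherent for t in coherent) / len(coherent)
    false_alarm_rate = sum(t.judged_coherent for t in incoherent) / len(incoherent)
    return hit_rate - false_alarm_rate


# Hypothetical responses of one participant (not real data).
example_trials = [
    Trial(coherent=True, judged_coherent=True, solution_named=False),
    Trial(coherent=True, judged_coherent=True, solution_named=False),
    Trial(coherent=True, judged_coherent=False, solution_named=False),
    Trial(coherent=True, judged_coherent=True, solution_named=True),  # excluded
    Trial(coherent=False, judged_coherent=False, solution_named=False),
    Trial(coherent=False, judged_coherent=True, solution_named=False),
    Trial(coherent=False, judged_coherent=False, solution_named=False),
    Trial(coherent=False, judged_coherent=False, solution_named=False),
]
example_index = intuition_index(example_trials)
```

Excluding explicitly solved triads keeps the index aligned with the distinction drawn above between intuition and insight; other indices (for example, signal-detection measures) would serve the same illustrative purpose.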

According to the continuous model (Bowers et al., 1990), intuitions arise from a gradual two-stage process. Within the first stage, information spreads and accumulates. This results in the activation of an associated network. Because of its activation, the mnemonic network is processed more fluently (Topolinski and Strack, 2009a,c), which in turn is accompanied by subtle positive affective changes (see Topolinski and Strack, 2009a,c for empirical demonstrations of processing fluency). It is during this first stage, the guiding stage, that a person may experience the feeling of coherence – an intuition (Bowers et al., 1990). If the unconscious activation spread of coherent information exceeds a certain threshold, the initial intuitive feeling of coherence may evolve into the explicit representation of the solution. This second stage, in which a person can explicitly reason about the decision or action taken within the guiding stage, is called the integrative stage (see Zander et al., 2015 showing distinct brain activation patterns for each stage of the intuition generation process).
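The threshold idea can be made concrete with a toy sketch. It is not the model of Bowers et al. (1990): activation simply accumulates in discrete steps, and two hypothetical thresholds mark the point at which a feeling of coherence might arise (guiding stage) and the point at which the solution might become explicit (integrative stage). The function name, both thresholds, and all numeric values are assumptions chosen for illustration.

```python
from typing import Iterable


def stage_reached(
    increments: Iterable[float],
    feeling_threshold: float = 1.0,   # hypothetical: feeling of coherence
    explicit_threshold: float = 3.0,  # hypothetical: explicit solution
) -> str:
    """Return 'none', 'guiding', or 'integrative' for accumulated activation."""
    if not 0 < feeling_threshold < explicit_threshold:
        raise ValueError("thresholds must satisfy 0 < feeling < explicit")
    activation = 0.0
    stage = "none"
    for increment in increments:
        # Activation from associated concepts accumulates step by step.
        activation += increment
        if activation >= explicit_threshold:
            return "integrative"
        if activation >= feeling_threshold:
            stage = "guiding"
    return stage
```

In this toy picture, the impairments hypothesized in the text could correspond to smaller increments (reduced activation spread) or to a failure to act on the state labeled `"guiding"`; the sketch itself does not distinguish between them empirically.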

The theoretical conception of a continuous two-stage progression from intuition to explicit insight allows us to hypothesize at which stage impairments may occur in individuals who have limited intuitive capacities. For example, the inability to make decisions based on intuitive processing may be attributable to impairments at very early stages of the intuition generation process, such as reduced spreading activation within the semantic networks. However, it is also conceivable that intuitive impairments may occur because individuals are not able to make use of subtle positive affective cues, normally elicited by coherence perception (Topolinski and Strack, 2009a,b). At a later stage, it may be that an individual has the intuition but does not use it because of low confidence in his or her decisional abilities. Overall, it becomes clear that the conceptualization of intuition generation as a two-stage process may have important consequences regarding further theorizing.

# THE ADVANTAGES OF INTUITION

For a long time, intuition was the black box of modern experimental psychology (Catty and Halberstadt, 2008) and initial research programs in this field focused on instances in which non-deliberate, heuristic problem solving strategies lead to erroneous and suboptimal outcomes (Tversky and Kahneman, 1974). However, within the past decades, research on potential advantages of intuitive decision-making has received particular attention. Studies in the scope of the Naturalistic Decision Making Paradigm (Klein, 1998, 2008), for example, demonstrated that subjects from various professional backgrounds such as firefighters, doctors, chess players, nurses, and judges use their intuition in complex situations and under high stress and time pressure. Especially in situations in which rational-analytical processing is not possible (e.g., under stress or uncertainty) and when experience with the problem at hand is high, intuitions can lead to impressively adaptive outcomes. When large amounts of information need to be encoded, intuitive decisions yield better outcomes and lead to more diagnostic judgments than extensive reasoning. A vivid demonstration of this was provided by Betsch et al. (2001). In their study, participants were given large amounts of information concerning the numerical increases and decreases of five hypothetical shares. Seventy-five units of information were briefly presented on a computer screen. Even though participants could not explicitly tell what, for example, the average money returns were, they had developed a gut feeling of what the best and worst options were.

Subsequent studies bolstered the idea that relying on intuitive hunches is especially useful when the problem at hand is complex in nature (Dijksterhuis, 2004; Dijksterhuis and van Olden, 2006; see also Wilson and Schooler, 1991; Topolinski and Strack, 2008) and that deliberate processes such as searching for solutions or memorizing may even impair decision-making performance (Topolinski and Strack, 2008). Also in the context of social cognition, intuition has received considerable attention (Lieberman, 2000). Studies that operationalized intuition with the semantic coherence task found that intuitive processing seems to be especially relevant for the enactment of affiliation motives (Maldei et al., under review) and that intuitive performance is positively associated with meaning in life (Hicks et al., 2010).

Moreover, it has been shown that people are also more satisfied with decisions that were based on their gut feeling. In their seminal study, Wilson et al. (1993) let participants choose a poster that they could take home. Subjects could choose either intuitively or after thinking through the reasons why they liked or disliked each alternative. Results revealed that subjects in the rational-reasoning condition were less satisfied with their choice when asked about 3 weeks after the experimental session compared to subjects who chose a poster intuitively. Reduced levels of satisfaction in the analytical group may have occurred because analytic processing typically abstracts from the emotional and personal meaning of a decision at hand (Kuhl et al., 2015). In other words, analytic processes reduce the complexity of a problem by breaking ambiguous information down to one aspect that is important in a particular situation (Dijksterhuis, 2004; Kuhl et al., 2015). This is an advantage for logical problem solving but a disadvantage when the problem includes divergent aspects that need to be considered (e.g., solving a complex personal problem; interpersonal relationships; dealing with an illness; see Kuhl et al., 2015). For the latter problem type, intuitive decision-making seems to be advantageous.

In the context of personality psychology, too, intuitions that are based on holistic and associative processing sequences are conceived as highly adaptive. More specifically, Personality Systems Interaction Theory (PSI; Kuhl, 2000, 2001) distinguishes between low-level and high-level intuitions. Low-level intuitions help people to execute concrete actions and typically arise under high levels of positive affect (Kuhl, 2001). They are guided by a system called intuitive behavior control. One of the ontogenetically earliest observations of such processes is the automatic imitation and contagion of emotional expressions in newborn children (Meltzoff and Moore, 1994). So whereas low-level intuitions help to implement intentions and to enact automatized behavioral programs, high-level intuitions derive from what PSI theory calls extension memory, a system that stores all experiences of a person and that integrates new information (Kuhl, 2001; Kuhl et al., 2015; see Lieberman et al., 2004 for neuropsychological evidence for intuition-based self-knowledge). The extension memory operates on the basis of unconsciously operating processes of activation spread, which enable a person to effortlessly include a vast amount of information regarding experiences, needs and goals simultaneously into the decision-making process (Kuhl et al., 2015). Thus, high-level intuitions are conceptualized as feelings or hunches in which diverging aspects of the self can be integrated. Intuitions help people to reconcile many – maybe even conflicting – aspects of a decision at hand and thereby lead to adaptive and helpful outcomes even when a person has not explicitly thought about all relevant aspects.

Altogether, the foregoing illustrates that the ability to make use of high-level intuitive processes may lead to adaptive outcomes in complex situations and connects us to ourselves in an integrated manner. In the following, we will thus further elaborate our main assumption of impaired intuition in depression by referring to influential theoretical accounts and empirical demonstrations from basic psychology.

# THEORETICAL AND EMPIRICAL INDICATIONS FOR IMPAIRED INTUITION IN DEPRESSION

Even though – normally – intuitions guide us through everyday life, there seem to be psychological states in which individuals are less intuitive and therefore less able to come to adaptive decisions without long reflection. On the one hand, research has focused on external factors that intuitive processes may depend on, such as time pressure or complexity of the problem (Klein, 1998, 2008). On the other hand, there are intra-individual conditions under which it is more or less likely that people will use their intuition. The question is thus in which psychological states people easily decide intuitively and when they are blocked and unable to decide from the gut. Because Major Depression is an affective disorder that is first and foremost characterized by sustained negative mood, we will refer to empirical evidence from basic research on the interplay between mood and cognition in order to consolidate our assumptions in the following.

# Intuition and Mood

Regarding the question of how people's intuitive capacities are associated with their current mood state, considerations originating from affect-as-information theory (Schwarz, 2002) and broaden-and-build theory (Fredrickson, 2001) provide important insights. According to these accounts, the associative, flexible information processes needed for intuitions to develop are more likely to operate under positive mood. Indeed, it has been shown that positive mood makes individuals find unusual (but reasonable) associations and fosters categorizations of material in a more flexible manner (Isen, 2001). The effects of positive mood on problem solving, flexibility and innovation are observable in a broad field of settings and among various populations (Isen, 2001). Most importantly for the current argument, it has robustly been found that positive mood fosters the activation of remote semantic associations (Isen et al., 1985, 1987; Estrada et al., 1994; Fredrickson and Branigan, 2005) and that participants' intuitive coherence judgments benefit from positive mood (Bolte et al., 2003; Balas et al., 2012). In addition, being in a positive mood makes it more probable that feelings and intuitive hunches are used in the decision-making process (see affect-as-information theory; Schwarz and Clore, 2007). Converging with this, there are several studies showing that individuals are more likely to rely on their intuitions when they are in a positive mood (Bless et al., 1990; Elsbach and Barr, 1999; Ruder and Bless, 2003; King et al., 2007) and intuitions themselves are accompanied by subtle positive affective cues (Topolinski and Strack, 2009a). Thus, positive mood enlarges our thought-action repertoire, widens the associative field and makes us consider more and new information (Csikszentmihalyi, 1990). As a result, individuals approach and explore their environment during positive mood states and consequently engage in activities (Diener and Diener, 1996; Fredrickson, 2001).

Negative mood states, in contrast, signal that the environment is problematic. This in turn narrows the thought-action repertoire (Fredrickson, 2001). Consequently, more analytical and systematic decision-making approaches are selected and the flexible processing needed for intuitions to develop is inhibited. In line with this, affect-as-information theory (Schwarz, 2002) posits that negative mood states such as sadness foster cognitive analytic reasoning, which makes individuals attend to few details rather than the bigger picture. Thus, whereas positive affectivity cues top-down processes, negative affectivity prompts bottom-up, data-driven and item-specific processing (Clore et al., 2001; Clore and Storbeck, 2006). Converging with this, an influential study showed that in happy moods, participants match geometric figures on the basis of global similarities whereas in sad moods, subjects tend to match figures on the basis of local similarities (Gasper and Clore, 2002). Consequently, it is assumed that intuitive processes are impaired during negative mood states, because negative mood fosters analytic reasoning. Baumann and Kuhl (2002) investigated the interplay between intuition, affect and affect regulation ability and found that intuitions of semantic coherence were impaired by negative affect in participants who reported difficulty down-regulating negative mood states. In contrast, intuitive performance was not impaired by negative mood in participants who were generally successful in down-regulating negative affective states (Baumann and Kuhl, 2002).

From a clinical perspective, these findings are worth noting, as one of the main features of psychological disorders and especially Major Depression is the sustained experience of negative affectivity as well as the inability to down-regulate dysphoric mood states. Thus, enduring states of negative affectivity as well as the inability to experience positive affective states may be aspects of depression that inhibit the open and flexible ways of processing information needed for intuition. To summarize, the assumption of impaired intuitive processing during depression is substantiated from several different theoretical perspectives.

# Depression and Intuition: Preliminary Findings

In the following, we will present three recent studies that have empirically tested the hypothesis of impaired intuition in depression. We will outline the study designs as well as findings of these three studies. Moreover, we will critically discuss the pattern of results and will then conclude which future studies should be done in order to further elucidate the interplay between depression and intuitive decision-making. The first study that investigated intuition in depression (Remmers et al., 2015a) compared the intuitive performance of depressed inpatients (n = 29) to a healthy control sample (n = 27). Both samples were comparable in terms of gender distribution, although the depressed sample was slightly younger than the control group. To assess intuition, the well-established intuition measure described above, namely the semantic coherence task, was used. Results revealed that depressed inpatients were less able to detect semantic coherence than healthy control participants. In addition, depressed patients who fulfilled criterion A8 of the DSM-5 (American Psychiatric Association [APA], 2013), reflecting patients' difficulties in thinking, concentrating and deciding, had significantly lower intuitive accuracy than patients without those symptoms. Thus, this first study on intuitive performance during depression supported the hypothesis of impaired intuition in depressed patients.

Two follow-up studies aimed to replicate the finding that semantic coherence intuitions are impaired in depression and to generalize this finding to another intuition measure. In their first study, Remmers et al. (2016a) used a sample of depressed patients (N = 39) from a day-clinic. To replicate the finding of impaired semantic coherence detection, patients' severity of depressive symptoms measured with the Beck Depression Inventory (BDI-II, Beck et al., 1996) was correlated with their performance in the semantic coherence task. To generalize the impairment in intuitive processing to another intuition measure, patients further completed the visual coherence task (Bowers et al., 1990; Bolte and Goschke, 2008; Topolinski and Strack, 2009c), which is similar to the semantic coherence task because it operationalizes intuition as fast, non-analytical coherence detection. However, the tasks differ in terms of stimulus type, as in the visual coherence task participants see blurred pictures (instead of the word triads presented in the semantic coherence task). One half of the stimulus pool is coherent because it contains distorted pictures that are meaningful but very rarely explicitly identified. For the other half of the stimuli, the pixel information of the coherent pictures is rotated to such a degree that no meaningful gestalt is preserved. Thus, coherent as well as incoherent pictures contain the same pixel information but they differ in their arrangement. During the task, subjects are asked to judge whether the presented picture is coherent (depicting a real object) or incoherent (depicting no object). Similar to the semantic coherence task, it has been shown that participants are able to differentiate between coherent and incoherent pictures without being able to explicitly name the depicted pictures (Bowers et al., 1990). In line with the study of Remmers et al. (2015a), Remmers et al. (2016a) found that higher levels of depression were associated with less intuitive accuracy in the semantic coherence task. However, findings regarding the visual coherence task ran counter to the initial hypothesis. Patients with higher levels of depression showed enhanced ability to detect visual coherence. Notably, there was a near-zero correlation between the two intuition measures across the sample.
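The kind of correlational analysis described here can be sketched in a few lines. The sketch assumes a Pearson product-moment correlation, whereas the text says only that scores were "correlated". The function name `pearson_r` and all numbers below are synthetic and purely illustrative; they are not data or results from any cited study.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cross / math.sqrt(ss_x * ss_y)


# Synthetic, perfectly linear values for illustration only (not study data):
# higher symptom severity paired with a lower semantic intuition index.
synthetic_severity = [10, 20, 30, 40]
synthetic_semantic_index = [0.40, 0.30, 0.20, 0.10]
r_semantic = pearson_r(synthetic_severity, synthetic_semantic_index)
```

The same function could be applied to the two intuition indices themselves; a value near zero there would correspond to the lack of association between the semantic and visual measures noted in the text.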

In order to explore the unexpected finding that visual coherence detection is enhanced in patients with higher levels of depression, the authors conducted a second study in which they compared the performance in the visual coherence task of depressed inpatients (n = 27) to a matched healthy control sample (n = 30). Similar to the study design of Remmers et al. (2015a), the diagnostic status of subjects was determined with the SCID interview. Results revealed that depressed patients not only performed as well as healthy subjects, but that they outperformed the healthy control sample in discriminating coherent from incoherent blurred pictures. Granted that both measures assess the same construct, namely intuition (see discussion on this below), it may tentatively be concluded that for depressed individuals, processes underlying visual and semantic coherence detection are distinct from each other and that only language-based semantic intuitions seem to be impaired in depression. Visual coherence detection, in contrast, seems to profit from depressed mood. However, given the preliminary nature of these results, future research should replicate these findings before firm conclusions are drawn.

# How to Explain Detrimental and Beneficial Aspects of Intuition during Depression?

The novel dissociation between semantic and visual coherence intuitions during depression raises questions regarding the differential decisional consequences of depression and regarding the construct validity of intuition measures. Even though it has previously been postulated that successful performance in the semantic as well as in the visual coherence task results from equivalent processes, this assumption needs further investigation. For example, the near-zero correlation between the semantic intuition index and the visual intuition index in Remmers et al. (2016a) raises doubts as to whether both tasks measure the same construct. Furthermore, the deleterious effect of negative mood on coherence intuitions has only been shown for semantic coherence intuitions so far (Baumann and Kuhl, 2002).

Specific stimulus features and the processes needed for successful performance may explain the dissociation between depressed patients' performance in the semantic and visual coherence tasks. A core difference between the two tasks used in Remmers et al. (2016a) is that one is based on visual processing whereas the other requires language-based, semantic processing. It has been assumed that – despite this difference in stimulus type – the two tasks measure the same construct, namely intuitive coherence detection (e.g., Topolinski and Strack, 2009a). However, the current pattern of findings regarding this capacity during depression suggests that the differences outweigh the commonalities between the tasks – at least as far as individuals with depression are concerned.

First, the finding that language-based intuitions are impaired, whereas visual intuitions are not, may be related to empirical evidence showing that biased responses in implicit memory tasks are only consistently found in depression for tasks that require processing of the meaning of stimuli (Watkins, 2002). Implicit memory tasks that require attention to perceptual features, in contrast, are not biased during depression. Applied to the results in Remmers et al. (2016a), it may thus be that semantic coherence intuitions in particular are impaired because they require the processing of semantic meaning, whereas intuitions about the blurred pictures in the visual coherence task do not require such processing and therefore remain intact.

Along this line, studies using magnetoencephalography (MEG) to investigate the neural mechanisms underlying intuitive coherence perception are worth noting, as they elaborate the idea that semantic and visual coherence intuitions may be distinguished regarding their underlying mechanisms and the processes needed for successful performance. Horr et al. (2015) found that the orbitofrontal cortex (OFC) serves as a crucial integrator of incomplete stimulus input for semantic as well as visual intuitions. However, there seems to be a striking difference in terms of temporal dynamics. Whereas in visual coherence detection the OFC is one of the earliest regions to show differential activation (Horr et al., 2014), OFC activation was comparatively delayed in semantic intuition. In line with the foregoing, the authors point to a conceptual difference between the two tasks. Visual coherence intuitions are specific to one sensory domain and based on low-level stimulus features, which can directly be integrated by the OFC into a coarse holistic representation of the pixel information. In contrast, for semantic coherence intuitions, higher-level semantic processing needs to take place prior to or in parallel with the spreading activation process that signals coherence or incoherence, because each word of the word triad is itself a meaningful concept that needs encoding (Horr et al., 2015).

Furthermore, the dissociation between semantic and visual intuitions in depression may be related to the phenomenon that patients with depression tend to get caught in cycles of rumination (see Watkins and Teasdale, 2004). Rumination is largely language-based, and it may be suspected that during depression the language-based processing mode is under high load, which may become evident in poor performance in tasks that require this capacity.

Another important task-specific particularity that should be discussed is that the detection of a Gestalt in the visual coherence task requires the isolation of an object within a stimulus. As such, successful performance in the visual coherence task requires that subjects attend to what is already there (the object within the blurred picture). From the angle of PSI theory, this process may be assigned to what Kuhl (2000) calls the object recognition system. Importantly, this system is specialized in isolating elements from the context. It benefits from negative mood and fosters analytic, detailed processing on the one hand, but impairs holistic processing and self-compatibility checking on the other hand (Kazén et al., 2014). In line with this, it has been shown that subjects with emotion regulation difficulties are better at detecting spelling errors in words (detail-oriented attention; isolating elements from the context) when they are in a negative mood than subjects who do not have difficulties in emotion regulation (Kazén et al., 2014). In the semantic coherence task, subjects focus on what is there, too: the three words written on the screen. However, in contrast to the object within the blurred picture in the visual coherence task, which is present during the task, the solution word (the common denominator) is not present on the screen in the semantic coherence task. Successful performance in the semantic coherence task thus requires letting the attention move away from the stimulus in order to integrate and finally use activated associations in the subsequent judgment. Unlike the detection of an object within the blurred picture, this processing sequence may be assigned to extension memory (Kuhl, 2000), a system that fosters the integration of single elements (DEEP SALT FOAM) into a coherent whole (SEA) via high-level intuitive holistic processing sequences and that is connected to the integrated self.
Thus, in line with the theoretical assumptions in the foregoing, this extended memory system including the parallel-holistic, flexible processing sequences that it relies on seems to be impaired during depression.

Finally, yet importantly, the findings of enhanced visual coherence judgments during depression may further be embedded into research showing that negative mood in general fosters detail-oriented and early visual processing (Bocanegra and Zeelenberg, 2009). For example, Phelps et al. (2006) found that participants' contrast sensitivity is enhanced after viewing fearful faces. Furthermore, negative affective states have been shown to foster spatial working memory capacities whereas they impair verbal working memory capacities (Gray, 2001; Storbeck, 2012).

In conclusion, a fine-grained analysis of stimulus features as well as of the cognitive and emotional processes required for successful task performance can help to understand to what extent different tasks measure the same or distinct outcomes and how different task characteristics interact with psychological processes. From the current evidence, it may be concluded that depressed individuals have impairments in intuitions that rely on flexible, associative processes of semantic spread, but that depression might have no effect, or even a beneficial effect, on visual processes and visual gestalt perception. If these findings were consolidated in future studies, important practical implications could be drawn. For example, therapeutic interventions may need to take into account that depressed individuals have difficulty drawing on holistic semantic associations when solving problems. Supplementing therapy sessions with visually based material may thus help patients to see the bigger picture and integrate information in a holistic manner.

However, for the moment, we think that conclusions should be drawn with care, as the empirical basis is not sufficiently robust. Even though current findings suggest that in some instances intuitions are enhanced in depression whereas in others they are impaired, we think that a definitive conclusion would be premature. For example, we cannot conclude from the current studies whether impairments in other faculties, such as analytical processes, have influenced the operation of intuitive processes. Upcoming research would do well to examine the interplay between intuitive processes and rational-analytic processes that may also be impaired and biased in depression (Beevers, 2005). In addition, future research should first of all elucidate the construct validity of the intuition tasks. Moreover, it should be examined to what extent the operationalization of intuition used in the former studies is related to depressed individuals' decision-making styles in daily life. On the basis of these considerations, we outline in the following suggestions for future research that seeks to further elucidate the interplay between intuition and depression.

# DIRECTIONS FOR FUTURE RESEARCH

# Which Mechanisms Underlie Intuitive Decision-Making in Depression?

The investigation of intuition and depression is still in an early phase. Concluding from current findings it seems important that future research first of all elucidates whether different intuition tasks effectively measure the same psychological phenomena. Furthermore, from the perspective of a continuous conceptualization of intuition (Bowers et al., 1990), future research should explore at what stage within the intuition generation process impairments occur. First, it should be explored whether the underlying process of semantic spreading activation is impaired in depression or whether this is intact, which would become obvious in successful performance in semantic priming tasks (see Topolinski and Strack, 2009a,c). In a next step it should be examined whether the impairment in intuitive performance is attributable to patients' low confidence in their intuitive hunches (see for example Rolison et al., in press for a study on the effects of anxiety and reduced confidence on decision-making). If underlying processes of activation spread were shown to be intact in depression, and intuitive performance deficits mostly stem from low confidence levels, this would have important implications for therapeutic interventions that may consequently be directed to enhance patients' trust in their intuitive capacities. Moreover, it should be explored whether activation spread is negatively biased in depression. This could be examined by using affectively laden word triads. One assumption may be that negative word triads are processed more fluently in depression, which would result in better intuitive accuracy for negative stimuli compared to positive stimuli (see Topolinski and Strack, 2009a for stimulus pool).

It should further be explored whether intuition deficits in depression are related to the diminished ability of depressed individuals to experience positive affect (Heller et al., 2009; Joormann and Vanderlind, 2014). This would be important to study, as intuitive hunches have been shown to be accompanied by subtle positive affective changes (Topolinski and Strack, 2009a,b) and intuitive decision-making itself is boosted by emotional information (Bolte et al., 2003; Lufityanto et al., 2016). Along this line, a recent study has found that especially people with affect regulation difficulties benefit from positive mood when making intuitive decisions (Maldei and Baumann, 2015). However, depressed individuals may have problems making use of, or even experiencing, the positive affective cues needed for intuitive decisions. In other words, whereas in healthy people intuitive decisions just feel right, depressed patients may lack the ability to experience such positive feelings of coherence. This in turn may lead to less favorable decisions or no decision at all. Investigating these ideas would provide important insights into why depressed individuals struggle to come to decisions that feel right.

Moreover, future investigations would do well to also assess the effortful, analytical decision-making capacities of depressed patients. It would be of interest to examine how impairments in one capacity influence the other. For example, it should be explored whether intuitive processes are related to limitations in reasoning or working memory capacities. In addition, it should be explored to what extent the generation of irrelevant thoughts or ruminative processes impairs intuitive decision-making, as for deliberate reasoning it has been shown that irrelevant thoughts elicited by negative mood impair performance (Perham and Rosser, 2012). In addition to these ideas, it would be interesting to investigate in future studies which neurophysiological impact antidepressants have on unconscious processes of coherence detection. Altogether, there is a set of research questions resulting from the current empirical evidence on intuition in depression that are specific to the experimental tasks used in former studies (Remmers et al., 2015a; Remmers et al., 2016a).

Apart from these specific issues that concern well-established intuition measures such as the visual and semantic coherence tasks, upcoming research should continue to explore intuitive capacities in depression by using measures that tap into other facets of intuition (Sinclair, 2011). For example, investigating intuitions based on stimuli that are more self-relevant may be especially important in order to increase the ecological validity of empirical findings (Lieberman et al., 2004). This line of research would take into account that intuition is highly influenced by experience, as it is 'nothing more and nothing less than recognition' (Kahneman and Klein, 2009, p. 520). Thus, even though intuitions are interindividually comparable in terms of the processes they are based on (i.e., associative, unconscious, fast), people can differ regarding the content of these processes and the products that result from them. It thus becomes evident that some intuitions, such as semantic coherence intuitions assessed with the semantic coherence task, are interindividually comparable (most participants would agree that DEEP SALT FOAM are all semantically connected to SEA), whereas others are largely idiosyncratic, as persons may differ in their associative networks and in the memory contents that are activated in certain situations (Lieberman, 2000; Lieberman et al., 2004). In line with this, neurophysiological research has found distinct brain activation patterns for self-representations that are based on intuition (Lieberman et al., 2004). Biases in this domain would provide important insight, especially because intuition-based self-representations are likely to change slowly and are relatively insensitive to explicit feedback from others (Lieberman et al., 2004). Moreover, using more self-relevant stimuli for intuitive decision-making is important because we do not know to what extent intuition assessed with experimental paradigms such as the semantic coherence task relates to intuitive decision-making in daily life.

Along this line, it would be of interest to differentiate between low-level and high-level intuitive processing suggested by PSI theory (Kuhl, 2000). Future research would do well in elucidating how low-level intuitive processing sequences that are related to automatized behavioral programs and that help to put plans into action are affected by depression. Moreover, and importantly, it would be of interest to examine how the activation or inhibition of self-regulatory systems such as the extension memory and intuitive behavior control system interact with each other within depressed patients and to what degree they play a role in predicting the onset of depressive episodes.

# Investigating Real-Life Decision-Making

Investigations that track idiosyncratic decision-making profiles of depressed or vulnerable subjects would help to understand how subjects decide when facing major or minor daily life decisions, such as whether to accept a job offer or whether to go out to meet friends. Do they go with their intuition? Or do they reflect analytically about these issues? Do depressed individuals have decision difficulties in complex situations in which intuitions may help? Or does indecisiveness also occur for rather simple decisions, in which no high loads of information have to be integrated? To answer these questions, experience sampling methods may constitute a usable option, as they can assess decision-making modes more directly by prompting subjects to render reports many times a day (Larson and Csikszentmihalyi, 1983; Hektner et al., 2006). Future studies could thereby also investigate in which decision-making areas (work, relationship, leisure time, health) patients report more or fewer difficulties. In a nutshell, obtaining a more precise picture of how individuals suffering from depression decide in daily life would inform our understanding of decision-making difficulties and may broaden our understanding of intuition in depression.

Another method that may be used for this purpose is the retrospective report, assessed via interviews or survey questionnaires (Klein, 1998; Dane and Pratt, 2009). These methods allow participants to describe how they approached a decision-making problem and allow researchers to assess factors such as the complexity of the problem and the mood state prior to and after the decision. Retrospective reports may further inform us when individuals with depression tend to make functional or dysfunctional decisions and whether the decision was grounded in intuitive processes, rational processes, or both. It should be noted, however, that despite the advantage of high ecological validity, retrospective reports are limited in terms of accuracy. For researchers it would be difficult to verify whether decisions were actually made intuitively (see Dane and Pratt, 2009 for a discussion on this).

Last but not least, research should examine the etiological role of high-level intuitive capacities. For example, from a clinical perspective it would be of interest to explore whether the impairment in intuitive capacities specifically, and decision difficulties more generally, remain after remission. Additionally, the question arises whether vulnerable individuals are less intuitive even before the onset of depression. Therefore, longitudinal designs may be advisable for future research.

# Are there Maladaptive Intuitions in Depression?

In the foregoing, we have considered research and experimental paradigms in which intuition is conceptualized as an adaptive capacity that allows fast coherence detection as well as quick and effortless decision-making (Gigerenzer, 2007; Klein, 2008). For the purpose of a clear demarcation and operationalization of this construct in future research and theorizing in clinical psychology, it would be important to examine intuition and its relation to other depression-related cognitive phenomena. Emotional reasoning, for example, describes the tendency to conclude from an emotional reaction that something is proven or true (Beck et al., 1979). It guides decisions and judgments and resembles intuition on a phenomenological level as well as regarding the processes it is based on. Intuition and emotional reasoning have in common that they are influenced by affect, appear automatically, and are experienced as self-evident.

The risk of confounding intuition with other cognitive phenomena is illustrated in the following example. Imagine a woman walks down a street and sees two friends sitting in a coffee place. Without thinking about the situation, the woman has the immediate hunch to walk by the café, trying to stay unseen. On the one hand, one may argue that this is no example of intuition because the underlying processes were not operating holistically. Using her intuition, the woman would have seen the bigger picture. She would have integrated implicit goals and wishes (e.g., the need to interact with other people) into her decision. Moreover, more positive associations regarding those two friends would have been taken into account (Kuhl et al., 2015). This, in turn, may have resulted in the intuitive decision to join the friends. Thus, the reaction of the woman may be interpreted as the product of an emotional reasoning process. The current mood state might have influenced the way information was processed (Klein, 1998; Hogarth, 2001; Kahneman and Klein, 2009) and served as evidence for the correctness of the decision ('it does not feel good to join them, therefore I will not join them'; Schwarz, 2002). From this perspective, the decision reflects an automatic response that followed from emotional reasoning and from the activation of subconscious negative schemas. The access to otherwise adaptive intuitions was, from this point of view, impaired in this example. On the other hand, it is also conceivable to argue that this was indeed an example of intuition, one showing that intuitive decisions and judgments may be biased and flawed. Therefore, future research would do well to disentangle intuition from other emotion- and experience-driven processes influencing decisions and judgments.

Apart from these delimitation problems, it appears to be an important step for future research to examine to what extent intuitions in depressed patients may be influenced by negative distortions and imprints of the implicit memory structure. Along this line, current dual-process models of depression (Beevers, 2005) assume that cognitive vulnerability to depression stems from biased associative, implicit processing (Beevers, 2005; but also see Teachman et al., 2012). Importantly, it is claimed that whenever biased self-referent associative processing remains uncorrected (e.g., when cognitive resources are not available to engage reflective correcting processing) cognitive vulnerability to depression is given (Beevers, 2005). Thus, the question arises whether intuitions may become dysfunctional or unrealistic when they result from biased underlying implicit memory. As biases in implicit memory have mostly been shown in the semantic domain (Watkins, 2002) and especially intuitions based on semantic networks seem to be impaired in depression (Remmers et al., 2015a, 2016a), investigations that connect these two lines of research (e.g., how do implicit memory biases influence intuition?) seem very promising.

# IMPLICATIONS FOR CLINICAL TREATMENT

It is clear from the foregoing that intuitions influence people's decisions and subsequent actions. On the one hand, the problem in depression may be that patients do not use functional intuitions stemming from holistic information processing (see Kuhl, 2000, 2001). A consequence of this may be that they have difficulty reaching decisions that integrate great amounts of information and reconcile different aspects of the self. Being unable to use these kinds of intuitions may further result in actions and behaviors that are inconsistent with needs, wishes and goals. Moreover, decisions that result from a rather nonintegrative process may be experienced as dissatisfying and alienating (see Baumann and Kuhl, 2003). On the other hand, negative self-schemas and dysfunctional core beliefs may not only stabilize depressive symptomatology but may also nourish the development of dysfunctional intuitions (Beevers, 2005). Even though this latter assumption still needs examination, we tentatively conclude that gaining access to intuitions may be an important practical implication of the current theorizing. From a practical point of view, establishing awareness of intuitive hunches seems important because this would enable individuals to differentiate between those intuitions that are functional and may be acted upon and those intuitions that should be dismissed or corrected, as they might lead to dysfunctional and depressogenic actions (Shapiro and Spence, 1997; Beevers, 2005). This idea is in line with Kahneman and Klein (2009), who posit that 'when there are cues that an intuitive judgment could be wrong, System 2 [rational-analytic processes] can impose a different strategy, replacing intuition by careful reasoning' (p. 519). In other words, it may be advisable to become aware of and examine intuitions before acting upon or dismissing them.

From a clinical and practical perspective, the question of how such awareness of intuitions may be enhanced directly follows. Interestingly, the wisdom that lies within the everyday expression 'go with your gut' corresponds to a widely established therapeutic conception stating that 'listening' to inner voices and to the body may be helpful when we are trying to understand what we need or when we are trying to change what makes us suffer. Along this line, investigations within the scope of embodiment research (Niedenthal, 2007) have shown that the association between the body and cognitive-affective responses is bidirectional. It has further been shown that the degree to which individuals are able to correctly perceive body signals (interoception) influences intuitive decision-making (Damasio, 1994; Dunn et al., 2010). Thus, it may be concluded that knowing which signals (bodily, intuitive) may be trusted and which should be dismissed is an important capacity.

One approach that stresses this aspect of careful listening to bodily experiences is the Focusing method introduced by Gendlin (1981). The main premise is that Focusing helps patients to get in touch with the felt sense. The felt sense entails pre-verbal knowledge about 'something,' such as what one needs or wants, and it may be accessed through the body. The felt sense is not an emotion or a mood state, and it entails an implicit complexity. By getting in touch with the felt sense, patients may become more aware of what a difficult situation or a pending decision evokes, and they may then gently explore this bodily felt experience and its meaning. In the next step, they are encouraged to find a word, phrase, or picture for the bodily felt experience and to examine whether this word or phrase matches the not-yet-articulated knowing. If the verbal representation matches the feeling, the bodily experience generally changes, which may be called a felt shift. This alteration in the felt sense may be a result of the preceding process of intuiting and careful examining. As such, the felt sense may be understood as a 'holistic, implicit, bodily sense of a complex situation' (Gendlin, 1996, p. 20) which goes beyond intellectual reflections on a problem.

Another approach by which access to intuitions may be gained via the body is mindfulness. In mindfulness exercises, individuals learn to listen to sensations in the here-and-now in a non-defensive, non-reactive way. Based on the definition of mindfulness as a form of attention that focuses on present feelings, thoughts and bodily sensations (Kabat-Zinn, 1990), we tested the assumption that mindfulness also enhances access to intuitive responses in one of our own investigations (Remmers et al., 2015b). After a sad mood induction, healthy participants (N = 94) were randomly assigned to perform either a mindfulness, distraction or rumination exercise. To assess the effect of the respective exercise on intuition, participants then performed the semantic coherence task. Even though mindfulness was successful in down-regulating negative mood, it did not have any impact on task performance (see Remmers et al., 2015b for a detailed discussion). In addition, it was found that self-reported levels of trait mindfulness, assessed with the Kentucky Inventory of Mindfulness Skills (Baer et al., 2004), were negatively associated with intuitive performance. As such, the hypothesis was not supported, and results even pointed in the opposite direction.

A number of methodological aspects may explain this pattern of findings. For example, the intuition task requires participants to decide and judge instantly. This in turn may stand in contrast to a facet of mindfulness that requires individuals to adopt a non-reactive, non-judgmental, observing attitude. Indeed, results in Remmers et al. (2015b) revealed that the overall negative correlation between trait mindfulness and intuitive accuracy was driven by a strong negative correlation between the accepting without judgment subscale and intuitive performance. Furthermore, the sample consisted of subjects who were naïve to mindfulness practice, and it has been shown that the degree of mindfulness experience may explain differential effects of trait mindfulness on cognitive tasks (Jha et al., 2007). For example, mindfulness novices train to narrow their attentional focus (attention to the breath) whereas experienced meditators widen their attentional field (Jha et al., 2007). Thus, the low mindfulness experience of the sample may have influenced the pattern of findings in the study of Remmers et al. (2015b).

Another approach that may foster access to intuitive processes is psychodynamic psychotherapy (Shedler, 2010). This approach stems from psychoanalysis, the central goal of which was, according to Freud (1916/1917), to gain access to implicit or unconscious representations and experiences. In line with this, a key focus of psychodynamic treatment is to enhance patients' access to initially non-conscious knowledge about the self (Hayes et al., 1996). Therefore, it may be concluded that intuitions, too, are accessed more easily as a consequence of psychodynamic treatment. However, the exact relationship between unconscious processes, as defined in psychoanalysis, and intuition as investigated with the experimental paradigms described above remains to be determined.

More generally, all treatments mentioned above seem to cultivate a form of self-focus that retains the advantages of self-knowledge (Watkins and Teasdale, 2004, p. 6; see also Kuhl, 2000). As such, it may be assumed that directing attention toward oneself is helpful when it is done in a more adaptive manner than during rumination (Watkins and Teasdale, 2004). In such instances, it may enable individuals not to think about inner experiences but to become aware of the (self in the) present moment in an intuitive, experiential way (see Watkins and Teasdale, 2004, p. 2). Approaches that foster this kind of experiential self-focus (e.g., the different humanistic-experiential approaches; see Elliott et al., 2013) may create the basic requirements for the access to intuitions. Furthermore, it may be suspected that individuals who have access to intuitions may become aware of subtle conflicts between formerly unconscious, intuitive responses and conscious elaborations. Resolving such conflicts or discrepancies between intuitive and rational responses may thus be another adaptive consequence of gaining access to intuitions. Indeed, a number of studies have shown that one means by which mindfulness exerts its beneficial effects is by enhancing the alignment between implicit and explicit responses (Brown and Ryan, 2003; Koole et al., 2009; Crescentini and Capurso, 2015; Remmers et al., 2016b).

In conclusion, we would like to state that intuitions have an impact on what we decide and do and how we subsequently feel. Thus, addressing the question of how intuitive decision-making operates during psychopathological states such as depression is an important endeavor for science and practice. In the long run, this line of work may help depressed individuals to make adaptive decisions and to find a way out of indecisiveness.

# AUTHOR CONTRIBUTIONS

CR conceived and wrote the current manuscript. JM contributed to the design of the paper and revised the manuscript critically for intellectual content. Both authors gave their final approval of the version to be published. All authors take full responsibility for the content of the paper.

# FUNDING

This work was supported by a fund of the University of Witten-Herdecke, Department of Clinical Psychology.

# REFERENCES

American Psychiatric Association [APA] (2013). Diagnostic and Statistical Manual of Mental Disorders, 5th Edn. Arlington, VA: American Psychiatric Publishing.

Baer, R. A., Smith, G. T., and Allen, K. B. (2004). Assessment of mindfulness by self-report. Assessment 11, 191–206. doi: 10.1177/1073191104268029

Balas, R., Sweklej, J., Pochwatko, G., and Godlewska, M. (2012). On the influence of affective states on intuitive coherence judgements. Cogn. Emot. 26, 312–320. doi: 10.1080/02699931.2011.568050

Baumann, N., and Kuhl, J. (2002). Intuition, affect, and personality: unconscious coherence judgments and self-regulation of negative affect. J. Pers. Soc. Psychol. 83, 1213–1223. doi: 10.1037/0022-3514.83.5.1213




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Remmers and Michalak. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.