# APPLICATIONS OF QUANTUM MECHANICAL TECHNIQUES TO AREAS OUTSIDE OF QUANTUM MECHANICS, 2nd Edition

EDITED BY : Emmanuel Haven and Andrei Khrennikov PUBLISHED IN : Frontiers in Physics

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88963-150-6 DOI 10.3389/978-2-88963-150-6

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# APPLICATIONS OF QUANTUM MECHANICAL TECHNIQUES TO AREAS OUTSIDE OF QUANTUM MECHANICS, 2nd Edition

Topic Editors: Emmanuel Haven, Memorial University, Canada Andrei Khrennikov, Linnaeus University, Sweden

Image by Philippe Haven.

This book deals with applications of quantum mechanical techniques to areas outside of quantum mechanics, so-called quantum-like modeling. Research in this area has grown over the last 15 years. But even already more than 50 years ago, the interaction between Physics Nobelist Pauli and the psychologist Carl Jung in the 1950's on seeking to find analogous uses of the complementarity principle from quantum mechanics in psychology needs noting.

This book does NOT want to advance that society is quantum mechanical! The macroscopic world is manifestly not quantum mechanical. But this rules not out that one can use concepts and the mathematical apparatus from quantum physics in a macroscopic environment.

A mainstay ingredient of quantum mechanics, is 'quantum probability' and this tool has been proven to be useful in the mathematical modelling of decision making. In the most basic experiment of quantum physics, the double slit experiment, it is known (from the works of A. Khrennikov) that the law of total probability is violated. It is now well documented that several decision making paradoxes in psychology and economics (such as the Ellsberg paradox) do exhibit this violation of the law of total probability. When data is collected with experiments which test 'nonrational' decision making behaviour, one can observe that such data often exhibits a complex non-commutative structure, which may be even more complex than if one considers the structure allied to the basic two slit experiment. The community exploring quantum-like models has tried to address how quantum probability can help in better explaining those paradoxes. Research has now been published in very high standing journals on resolving some of the paradoxes with the mathematics of quantum physics. The aim of this book is to collect the contributions of world's leading experts in quantum like modeling in decision making, psychology, cognition, economics, and finance.

Publisher's note: In this 2nd edition, the following article has been updated: Flender C (2016) Information and Temporality. *Front. Phys.* 4:40. doi: 10.3389/fphy.2016.00040

Citation: Haven, E., Khrennikov, A., eds. (2019). Applications of Quantum Mechanical Techniques to Areas Outside of Quantum Mechanics, 2nd Edition. Lausanne: Frontiers Media. doi: 10.3389/978-2-88963-150-6

# Table of Contents


	- Diederik Aerts, Massimiliano Sassoli de Bianchi and Sandro Sozzo

William F. Lawless


Catarina Moreira and Andreas Wichert

*118 Topological and Orthomodular Modeling of Context in Behavioral Science*

Louis Narens


# Editorial: Applications of Quantum Mechanical Techniques to Areas Outside of Quantum Mechanics

Emmanuel Haven<sup>1</sup> \* and Andrei Khrennikov <sup>2</sup>

*<sup>1</sup> Faculty of Business Administration, Memorial University, St. John's, NL, Canada, <sup>2</sup> Department of Mathematics, International Center for Mathematical Modeling, Linnaeus University, Vaxjo, Sweden*

Keywords: quantum-like paradigm, quantum field theory, quantum probability, quantum probability cognition models, quantum information

#### **Editorial on the Research Topic**

#### **Applications of Quantum Mechanical Techniques to Areas Outside of Quantum Mechanics**

The recent quantum information revolution has tremendous consequences not only for physics. It stimulates the use of quantum formalisms in various areas outside of quantum physics: cognition, psychology, economics and finance, microbiology and genetics. This approach is known as quantum-like modeling. For cognition, this modeling should be sharply distinguished from attempts to represent information processing by the brain through quantum physical processes (cf. with works of Penrose and Hameroff). For microbiology and genetics, quantum-like modeling should be distinguished from quantum biophysics. In the quantum-like approach a biological system (brain, cell) is considered as a black box processing information in accordance with the laws of quantum information and probability.

In psychology one can now claim that quantum probability has reached the mainstream. Ideas from quantum field theory now reach into applications to biology and medicine and economics and finance. One of the papers in this special issue, by Marcer and Rowlands, does look at so called "nilpotent quantum mechanics" a form of quantum field theory. The use of a functor in natural language semantics, as proposed in the work of Sadrzadeh derives also from quantum field theory.

The overarching theme in the applications considered here in this special issue, is the use of the so called "quantum-like paradigm." As pointed out in the article by Khrennikov, social science is confronted with probabilistic and "entangled" systems. In this special issue, the paper by Lawless looks specifically at entanglement in his treatment of the interdependence of teams.

Each of the papers accepted for publication under our research topic "Applications of quantum mechanical techniques to areas outside of quantum mechanics," highlights a particular facet of this new multi-faceted area of research.

Plotnitsky's paper in our collection of papers is maybe the contribution which provides for an overarching thinking template on all the work published here. The papers in our special issue assume that the mathematical modeling of a social science bound phenomenon is possible. But interestingly enough, as Plotnitsky remarks, even if we were to question such an assumption, it will not necessarily lead to halting the use of mathematics in such modeling but rather it may result in new modeling and even, maybe, new mathematics? The paper by Aerts et al. far from claiming that mathematical modeling is impossible, does propose that new mathematical structures may well be needed (structures which go beyond quantum structures) to model cognition.

#### Edited and reviewed by:

*Alex Hansen, Norwegian University of Science and Technology, Norway*

> \*Correspondence: *Emmanuel Haven ehaven@mun.ca*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *12 September 2017* Accepted: *09 November 2017* Published: *24 November 2017*

#### Citation:

*Haven E and Khrennikov A (2017) Editorial: Applications of Quantum Mechanical Techniques to Areas Outside of Quantum Mechanics. Front. Phys. 5:60. doi: 10.3389/fphy.2017.00060*

**5**

The idea of "probability waves," a novel intuitive concept when quantum mechanics was being formulated, was born out of the double slit experiment. This brings us neatly to think about the multiple cases of violations of the law of total probability. The wave function is a fantastic device which helps us to understand that there is no unique position, until a measurement is made. As the paper by Flender carefully lays out, understanding this uncertainty lies at the heart of so called temporality. Temporality is a key ingredient in non-chronological time (what Flender calls "time of acausality") and it seems to define also information. The paper by Yukalov and Sornette comes back to the interference effect which is the result, from what they call, an inconclusive event. Inconclusive events, as they rightly point out, underlie many human decisions too. One may argue that their paper transcends some of the results presented in this special issue, as the model they propose can be used for both quantum measurements AND decision making.

In the paper by Broekaert and Busemeyer the authors propose a Hamiltonian based quantum-like model which allows for the temporal evolution of memory states. Time is not the usual physical time variable, but it is rather used for the temporal ordering of states. The paper also carefully spells out the issue of closed and open systems. Open systems have now also been considered in areas other than psychology, such as political science and economics. The paper by Khrennikov provides for an overview.

Narens considers orthomodular lattice modeling in behavioral science. The article queries why there may be a link between the conservation principle and psychology and it also wonders why Hilbert space based quantum probability may be relevant to psychology.

Gabora and Kitto's contribution develops a so-called quantum theory of humor (i.e., the cognitive aspect of humor is considered). An experimental study is set up to start defining the state space of "humor."

The work by Moreira and Wichert compare several models which explain violations of the well know sure-thing principle in expected utility.

We hope this special issue provides for a rich addition to the problem of modeling social science phenomena with the help of the quantum-like paradigm.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Haven and Khrennikov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Hamiltonian Driven Quantum-Like Model for Overdistribution in Episodic Memory Recollection

Jan B. Broekaert 1, 2 \* and Jerome R. Busemeyer <sup>3</sup>

*<sup>1</sup> Center Leo Apostel for Interdisciplinary Studies, Vrije Universiteit Brussel, Brussels, Belgium, <sup>2</sup> Department of Psychology, City, University of London, London, United Kingdom, <sup>3</sup> Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, United States*

While people famously forget genuine memories over time, they also tend to mistakenly over-recall equivalent memories concerning a given event. The memory phenomenon is known by the name of *episodic overdistribution* and occurs both in memories of disjunctions and partitions of mutually exclusive events and has been tested, modeled and documented in the literature. The total classical probability of recalling exclusive sub-events most often exceeds the probability of recalling the composed event, i.e., a *subadditive* total. We present a Hamiltonian driven propagation for the Quantum Episodic Memory model developed by Brainerd et al. [1] for the episodic memory overdistribution in the experimental immediate *item false memory* paradigm [1–3]. Following the Hamiltonian method of Busemeyer and Bruza [4] our model adds time-evolution of the perceived memory state through the stages of the experimental process based on psychologically interpretable parameters—γ*<sup>c</sup>* for *recollection capability* of cues, κ*<sup>p</sup>* for bias or description-dependence by probes and β for the average gist component in the memory state at start. With seven parameters the Hamiltonian model shows good accuracy of predictions both in the EOD-disjunction and in the EOD-subadditivity paradigm. We noticed either an outspoken preponderance of the gist over verbatim trace, or the opposite, in the initial memory state when β is real. Only for complex β a mix of both traces is present in the initial state for the EOD-subadditivity paradigm.

#### Edited by:

*Emmanuel E. Haven, University of Leicester, United Kingdom*

#### Reviewed by:

*Kirsty Kitto, Queensland University of Technology, Australia Yoshiharu Tanaka, Tokyo University of Science, Japan*

> \*Correspondence: *Jan B. Broekaert jan.broekaert@vub.ac.be*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *12 November 2016* Accepted: *30 May 2017* Published: *23 June 2017*

#### Citation:

*Broekaert JB and Busemeyer JR (2017) A Hamiltonian Driven Quantum-Like Model for Overdistribution in Episodic Memory Recollection. Front. Phys. 5:23. doi: 10.3389/fphy.2017.00023* Keywords: episodic over distribution, disjunction fallacy, subadditivity, quantum cognition, Hamiltonian operator

# 1. INTRODUCTION - THE EPISODIC MEMORY

In an early effort to systematize the developing science of memory, Tulving [5] aimed to provide operative definitions for presumed various categories of memory. Continuing a dichotomic approach, he proposed to complement the previously coined "semantic" memory with the "episodic" memory. While our "semantic" memory would allow us to regain facts and abstract knowledge about our world, our "episodic" memory would let us recall personally lived events in a specific spatio-temporal context from our past. While distinct, both were still considered partially overlapping information processing systems. With Mandler's [6] dual process approach it became more clear to distinguish the more contrived recollection by details with respect to the recall of facts [7]. In the dual recollection-familiarity process models a cue is processed respectively either in terms of remembering an event's details up to its genuine recollection, or by retrieving a feature which is associated to the cue so it becomes familiar and conflated with a truly episodic memory.

Broekaert and Busemeyer A Hamiltonian Quantum Like Model for EOD

Jacoby [8] pointed out a confusion of the recollectionfamiliarity process with the retrieval task itself. He urged for explicit process dissociation providing two separate parameters for the aspects of recollection—or intentional memory use and for familiarity—or automatic memory use—in the dual process. In a further developed dual process approach the "conjoint recognition" model of Brainerd et al. [9] proposes separate parameters for the processes of; identity judgment, similarity judgment, and response bias. The latter model is able to implement the "fuzzy trace" theory (FTT) with its identity vs. similarity distinction. Reyna and Brainerd [10] crucially distinguishes verbatim and gist dimensions to memories. Verbatim traces hold the detailed contextual features of a past event, while gist traces hold its semantic—"fuzzy" details. Our brain would analyze a past event by accessing its stored verbatim and gist trace in parallel. On the one hand the verbatim trace of a verbal cue handles it "surface" content—i.e., orthography and phonology for words—with its contextual features, while the verbal cue's gist trace will encode "relational" content—i.e., semantic content for words—with its contextual features. In more recent work the FTT model has received a quantum probabilistic formalization to cope with overdistribution in memory tests [1, 4, 11, 12]. While we are essentially connecting to this line of research with our present quantum model, a wide variety of recollection memory models have been developed in the literature that are not discussed here. We do refer to one specific semantic network approach by Nelson et al. [13] and Bruza et al. [14] which also infers quantum structures for its development. In essence their model proposes a semantically associated network, in which a target word is adjacent to all associated terms according the natural language of its users. It has been shown best prediction of memory performance is obtained by implementing the network in a quantum superposition state of either complete activation—amplitude 1—or non-activation—amplitude 0. The model provides weighed directional word associations, and a quantumlike entanglement between nodes is invoked to predict parallel instead of serial activation of neighbors. We have not included the Nelson and McEvoy model in our present discussion since it has not been developed to explicitly implement a gistverbatim distinction with respect to which the EOD effect we target here is developed.

**The EOD effect.** One striking phenomenon concerning memory is the Episodic Over-distribution effect—or EOD. More or less this effect expresses a person's proneness to conflate memories of distinct events. More precisely the effect points out we tend to affiliate "alien" memories to true memories concerning a given event, leading to an "exaggeration" of memories concerning that event.

In Brainerd et al. [3] the disjunction fallacy is modeled for the item false memory paradigm while the source false memory version is covered in Brainerd et al. [15]. Brainerd et al. [1] exposes the more common case of subadditivity of episodic memory.

These EOD effects are shown in specifically designed experiments: the item false memory experiment in 2015 is a modification (also [2, 3, 9]) of a classical paradigm in which a single "instruction" (or probe) would be given to measure whether "a given cue is a target (or not)."

**EOD–subadditivity.** In the item false memory (IFM) experiment three possible cues "old"—or o, "new-similar"—or ns, and "new-dissimilar"—or nd are presented. These cues are crossed with three "probes" namely o?, ns?, and nd?. These probes "o?, ns?, and nd?," respectively, enquire the participant "is this probe old?" (studied before), "is this probe new but similar?" (semantically related to the old cues but not literally among them) and "is this probe new and dissimilar?" (has nothing to do with with the studied cues, even not semantically). In this experimental paradigm after exposure to an unidentified cue the participant is enquired by one of three distinct probes.

In practice most of the IFM experiments turn out subadditive acceptance probabilities:

$$p(o\mathbb{1}) + p(ns\mathbb{1}) + p(nd\mathbb{1}) > 1.\tag{1}$$

That is, the disjoint partial features are over recalled with respect to its encompassing event. Notice that even if the law of total probability would be satisfied Brainerd mentions the possible issue of compensations; a systematic change in remembering ns as such may compensate a reverse change to remember ns as o, restoring the classical addition to 1.

**EOD–disjunction fallacy.** In the 2010-version of the IFM experiment a disjunctive probe was presented to the participants instead of the nd? probe. This probe questioned whether the cue was either "old" or "new-similar," leaving unnecessary the answer to the question which one of both types the cue really was. A comparison of acceptance probability under the disjunctive probe o or ns? and the summed acceptance probabilities under the separate probes o? and ns? revealed a subadditive relation

$$p(o\text{?}) + p(ns\text{?}) \text{ > } p(o\text{ or }ns\text{?}).\tag{2}$$

This relation amounts to a disjunction fallacy since both cue types are mutually exclusive categories. The EOD effect was further identified using the unpacking factor of [16]

$$\frac{p(o\text{?}) + p(ns\text{?})}{p(o\text{ or }ns\text{?})} > 1.\tag{3}$$

the excess value of the fraction above one gives a measure of the amplitude of the effect.

**Explanations of EOD effects.** A number of theoretical explanations have been provided to interpret this phenomenon:

The fuzzy trace theory was implemented in QEM—the Quantum Episodic Memory model—by Brainerd et al. [1, 11]. By processing the perception of the verbatim and gist memory trace as separate components of a state vector, QEM allows to encompass the non-classical EOD probability effects of episodic memory. This capacity, we will see in the next section, is essentially due to the ubiquity of the gist component and its implementation in the corresponding outcome projectors for acceptance. Another quantum-like modeling perspective has been proposed by Busemeyer and Bruza [4, Ch.6] which provides unitary transformation matrices based on Feynman path analysis and ordering of the gist/verbatim processing of cues, which we will discuss below. Finally a complementarity based quantumlike development was done by Denolf and Subadditivity [17], and Denolf and Lambert-Mogiliansky [18]. Bohr's complementarity provides the gist–verbatim features by implementing for each an alternative basis of the Hilbert space. Our Hamiltonian development follows more closely the outline of the QEM model for FTT.

We will at present not make comparisons to either Markovian models [4, 19, 20] but focus on the possibility of Hamiltoniandriven time propagation in QEM, and compare to the Feynman path model and the original QEM model itself.

**Experimental paradigms.** A number of experimental paradigms have been proposed to test the over-distribution effect and the episodic disjunction effect. In this paper we mainly refer to Brainerd et al. [1, 3] which build and expand on "item false memory" and "source false memory" experimental paradigms, but only the former IFM paradigm will be modeled here. We shortly describe both paradigms of 2010 and 2015 for the IFM case. As we have mentioned, each of these paradigms consist of two consecutive stages.

**2010 experimental paradigm.** In the first stage participants studied a set {o} of memory targets consisting of words from a list of the Deese–Roediger–McDermott paradigm (DRM). The presented DRM lists are abbreviated sequences of the original 15 semantically related words that all associate forward to one common word. That latter word does not appear in the list and is therefore known as the distractor [21, 22]. We will in our approach not include the issue of the preliminary orienting task based on qualifying adjectives as positive, neutral, or negative, which "increase the processing of semantic content during subsequent exposure to word lists" [3]. After this memorization stage either immediate testing ensues or a time delay of a week is inserted. Subsequently a cue is presented to the participant and finally an instruction to respond to the cue is given.

Three possible types of cue are used; a studied target from {o} consisting of a word from one of the 24 lists, a related nontarget from set {ns} consisting of words on the list but not learned ({o} ∩ {ns} = φ) and finally a new-dissimilar non-target from set {nd} with words not related in any sense to the selected DRM lists ({nd} ∩ {ns} = φ = {nd} ∩ {o}).

These cues are crossed with one of three instructions per participant<sup>1</sup> . Either; the first instruction o? (or old?) to accept only an **exact** target from {o} and otherwise reject, or the second instruction ns? (new-similar?) to accept only a **related** nontarget from {ns} otherwise reject, or a third instruction. The third instruction is o or ns? (or old or new-similar? ) to accept either an **exact** target from {o} or a **related** non-target from {ns} and otherwise reject.

**2015 experimental paradigm.** The alternative version of the IFM paradigm of Brainerd et al. [1] follows precisely the two stages of the 2010-version except for the final stage. First the participants studied cues c of memory target words (24 times 6 in total). Then a time delay is either inserted or not. in the test phase participants are first exposed to a cue which is either a studied target from o, a new-similar non-target ns, or a new-dissimilar non-target nd. Finally the participant is asked to respond to one of three probes querying to which category the cue belongs; that is o?, ns?, or nd?. In comparison to the 2010-paradigm the o or ns? probe has been replaced by the nd? probe.

**About the source false memory paradigm.** In source false memory experiments the experimental paradigm focuses on the origin of the cue. It probes the source recollection in memory of cues originating from either List 1 or List 2 and crossed with probes List1?, List2?, and nd?. We will only focus our present Hamiltonian based quantum model on the IFM setting, it is however very possible to adapt the model to SFM requirements as well.

Besides the QEM model, this specific paradigm has been alternatively modeled by Denolf and Lambert–Mogiliansky using Bohr's quantum approach to consider gist and verbatim traces as complementary properties, each trace represented by an alternative bases of the same Hilbert space [18].

**Experimental data in 2010 and 2015 paradigms.** Since Brainerd et al. [1] focuses on subadditivity with the probes o?, ns?, or nd?, there is no interest in o or ns? thus it is not measured nor reported. While vice versa [3] has a focus on the disjunction fallacy, which reports o? , ns?, and o or ns? but does no reporting of nd cue data. We therefore take the data of Brainerd et al. [9] from which a full 3 × 3 grid of data can be reconstructed using an intervention proposed by Busemeyer and Bruza [4, p. 171].

In sum we have no data set which shows the subadditivity and the disjunction effect at the same time. We will thus adapt the parameters of the Hamiltonian model to each data set separately (see **Tables 1**, **2**). We adopt Busemeyer's solution to complete the data set in the paradigm for the EOD disjunction effect by supplementing the nd probe data in the set through their response bias measures bT, bR, and bT+<sup>R</sup> ([9]–Table 6). Moreover, we will fit to the average values over six experimental conditions here<sup>2</sup> . While for the EOD subadditivity effect we will use ([1], Table 3, p. 233) in which we take the values for ns cue as the averages of ns-critical and ns-related cues, distinguished by Brainerd et al. [1].

# 2. QUANTUM MODELS

Probabilistic anomalies with respect to classical probability law have in many cases been successfully covered by models using quantum formalism, likewise the anomalies of EOD have been modeled in quantum-like manner. We shortly present some of these developments, mainly focussing on QEM.

# 2.1. The QEM Model by Brainerd et al.

**Memory state vectors.** QEM provides three orthogonal vectors in Hilbert space, respectively one for (verbal) surface form,

<sup>1</sup>We adopted the notation of Brainerd et al. [1] in the context of Brainerd et al. [3] as well. Always cues will be denoted o, ns, and nd for old, new-similar, and new-dissimilar, and their respective enquiring probes are o?, ns?, and nd?. Memory traces are denoted by V, G, and N for verbatim, gist, and neither.

<sup>2</sup> In the conjoint recognition model the probabilities for acceptance for unrelated distractors are: pu,<sup>T</sup> = bT, pu,<sup>R</sup> = bR, and pu,T+<sup>R</sup> = bT+R, ([9], Equations 19–21).


TABLE 1 | EOD-disjunction fallacy: Experimental and predicted acceptance probabilities by probe and cue in item false memory paradigm - immediate test [9].

*The experimental p*(*p?*|*c*) *are averages over six experimental conditions. Corresponding unpacking factor and "conjunction probability" values are listed. The Hamiltonian model has a very good prediction RMSE* <sup>=</sup> *0.0073 for* β > *0, and a less good RMSE* <sup>=</sup> *0.0341 for* β < *0. Fitting attempts for a* <sup>β</sup> <sup>∈</sup> <sup>C</sup> *gravitated toward the* β < *0 solution, i.e., phase*(β) <sup>→</sup> <sup>π</sup> *and have therefore not been included. The predictions of the Feynman path based model [4] have an RMSE* = *0.0298 and are indicated with Pred-BB2012. The re-calculated QEM predictions [1] are indicated with Pred-Br2015 and have an RMSE* = *0.1112.*

one for semantical relatedness, and one for the case when neither of both previous are relatedly present. In line with the FTT these respective dimensions acquire probability amplitudes that represent the participant's mental state on the cue in the experiment, which we order as (v<sup>c</sup> , g<sup>c</sup> , nc) τ 3 . The fact that these features are expressed by orthogonal vectors, reflects that these are perceived distinct properties of a word in memory. This orthogonality property should be differentiated from associative relationships of words like e.g., for a target word in a semantic memory network [13], which dominantly hinges on related gist but mostly leaves out related verbatim features. Brainerd et al. [1] and Brainerd and Reyna [2] describe the "perceived memory state" spanned by vectors in three-dimensional Hilbert space corresponding to verbatim, gist, and non-matching dimensions of the respective fuzzy traces for the set of words in the experimental paradigm in the brain:

$$|\mathcal{S}\_{\mathfrak{c}}\rangle = \nu\_{\mathfrak{c}}|V\rangle + \mathcal{g}\_{\mathfrak{c}}|G\rangle + n\_{\mathfrak{c}}|N\rangle\tag{4}$$

Where c can be any cue type, o, ns, or nd, and each basis vector corresponds to respectively the fuzzy trace of form (V), semantic relation (G), and complete unrelatedness (N) 4 . According the model requirements—exhaustiveness and exclusiveness of the cues—the respective probabilities add up to unity

$$|\nu\_{\mathfrak{c}}|^2 + |\mathfrak{g}\_{\mathfrak{c}}|^2 + |n\_{\mathfrak{c}}|^2 = 1. \tag{5}$$

With these three normalizations constraints QEM requires the parameters {vo, go, no, vns, gns, nns, vnd, gnd, nnd} of which six are independent. We discuss some related fitting issues in QEM at the end of Section (2).

**Acceptance projectors.** A probe o?, ns?, or nd? is affirmatively—"yes" (y)—answered by applying the corresponding projection operator

$$M\_{\mathbf{y},o\mathbf{?}} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \ M\_{\mathbf{y},n\mathbf{?}} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \ M\_{\mathbf{y},nd\mathbf{?}} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{6}$$

on the state |Sci. These respective projector matrices are simply obtained by considering the final outcome vectors which they need to produce. In VGN space the projector My,nd? should lead to a vector proportional to (0, 0, 1), representing perception of no related verbatim nor gist of the nd cue. The form of this expected outcome vector (0, 0, 1) is directly related to the projector expression Equation (6, c). Similarly the projector My,ns? , Equation (6, b), is constructed from the expected outcome vector (0, 1, 0) representing only perception of related gist in the ns cue. For the projector matrix My,o? the outcome should

<sup>3</sup>We use the symbol τ to designate the transpose of a vector or matrix. Basically transposition turns columns into rows and vice versa.

<sup>4</sup>We recall that state functions or vectors in quantum-like models for cognitive processes will always represent averages of the participant group. Individual memory state vectors are not envisaged in this approach: as all humans are allegedly equal but rather existentially different the average state function does not reflect the individual's memory state. We emphasize the difference with the situation in the micro-physical realm; e.g., the state function of an ensemble of

identically prepared electrons does reflect the behavior of an individual electron since all electrons are equal, not just allegedly.


TABLE 2 | EOD-subadditivity: Experimental and predicted acceptance probabilities by probe and cue in item false memory paradigm—immediate test [1].

*The experimental p*(*p?*|*ns*) *are the averages of critical and related distractor cues. The Hamiltonian model with* β > *0 constraint has a prediction accuracy RMSE* = *0.0489, the* β < *0 constraint gives slightly improved RMSE* <sup>=</sup> *0.0191, while for* <sup>β</sup> <sup>∈</sup> <sup>C</sup> *we have a strongly improved RMSE* <sup>=</sup> *0.0032. The re-calculated QEM predictions [1] are indicated with Pred-Br2015 and have an RMSE* = *0.0963.*

lead into the space spanned by both related verbatim and gist components for the perception of the o cue. The latter is a two dimensional space spanned by the basis vectors (1, 0, 0) and (0, 1, 0), and is the outcome space of projector Equation (6, a). The β-parameter—present in the vector |oi for old cues—will allow to navigate such vectors in this two-dimensional subspace, altering the relative weight of the verbatim to gist components (see functioning of β in the description of the initial state, below).

In the experimental paradigm for the EOD disjunction fallacy the operator My,b? for the probe o or ns ? is used;

$$\begin{split} M\_{\mathsf{y},b\mathsf{?}} &= M\_{\mathsf{y},o\mathsf{?}} + M\_{\mathsf{y},ns\mathsf{?}} - M\_{\mathsf{y},o\mathsf{?}}M\_{\mathsf{y},ns\mathsf{?}} \\ &= M\_{\mathsf{y},o\mathsf{?}} \end{split} \tag{7}$$

since My,o?My,ns? = My,ns? . In QEM the memory state for the experimental paradigm is posited to be |Sci following the exposure to cue c. After providing the probe p the state collapses to the eigenvector of the projection operators My,p: "the cue elicits the memory state, and the probe determines the projector used to answer [affirmatively to] the question." ([1], p. 243).

**The origin of EOD in QEM.** Notice that the form of the projectors My,o? and My,ns? show that subadditivity is an immediate consequence of measuring the presence of the common gist trace in both operations. Which also implies as Brainerd points out—the cases in which a gist trace would be lacking will not produce subadditivity. Similarly, we could remark that in the dual trace approach ns? is a subspace of o?. Therefore, the operator for the disjunctive probe coincides with the operator for the o? probe. As a consequence the EOD disjunction fallacy is not due to an interference dynamics in the QEM model, but follows from "double counting" the gist component in the outcome of disjunctive probe. Also the EOD subadditivity is due to this same double counting of gist. Both subadditivity and the disjunction fallacy are therefore considered ''parameter-free" features of the QEM model [1]. In Section 2.3, we will cover the origin of the EOD effect more extensively and show how in our Hamiltonian approach of QEM one is not restricted to subadditive nor fallacious disjunctive scenarios.

**The initial state.** A short discussion on the initial state vector in the QEM model is needed since it plays an important role, both in Brainerd et al.'s development of QEM and our Hamiltonian driven version of the model. At the start of the experiment the participants of the experiment are informed about the equal probability by which each type of cue o, ns, or nd, will be presented [1]. It can be easily seen however that this is not possible to implement exactly without forcing this initial perceived memory state to be voided of all of its verbatim trace<sup>5</sup> . We claim a more appropriate representation of this initial state is done by addressing this uncertitude on the level of the probability amplitudes, not the probabilities themselves. More specifically, we implement each component probability amplitude is attributed equal weight 1/ √ N in the initial state

$$|\psi\_0\rangle = \frac{1}{\sqrt{\mathcal{N}}} \left( |o\rangle + |ns\rangle + |nd\rangle \right), \text{ with } \| |\psi\_0\rangle \| = 1. \tag{8}$$

<sup>5</sup>Brainerd et al. [1, p. 239], mentions participants would have roughly p(o) = p(ns) = p(nd) = 1/3 as baseline probabilities prior to study of {o}. Let the initial state be represented in VGN space as (α, β, γ ) τ . Using the appropriate projection operators, Equation(6), we find p(o) = |α| <sup>2</sup> + |β<sup>|</sup> 2 , p(ns) = |β| 2 and p(nd) = |γ | 2 . Equating them all to 1/3 requires α = 0, reducing the perceived related verbatim trace to nought.

where N is the vector's normalization factor. This initial state can be expressed in terms of perceived verbatim, gist, and unrelated components. The o-state is composed of components of verbatim and gist in the perceived memory according a superposition of both; |oi = α|Vi+β|Gi or explicitly normalized ( p 1 − |β| 2 , β, 0)<sup>τ</sup> , where <sup>β</sup> <sup>∈</sup> <sup>C</sup> 6 . Thus both aspects V and G contribute with a variable amplitude to the targeted cue o—which a priori should have been expected since the relative magnitude of both traces seem variably dependent on the particular instance of the o-type cue. The memory states for ns and nd on the other hand do not decompose over multiple traces, and coincide with the unambiguous eigenvectors of the respective operators My,ns? and My,nd? , i.e., |nsi = |Gi and |ndi = |Ni. The initial state prior to cue and probe presentation can thus be expressed in terms of orthogonal states for V, G, and N 7 :

$$|\psi\_0\rangle = \frac{1}{\sqrt{\mathcal{N}}} \left( \alpha |V\rangle + (1+\beta)|G\rangle + |N\rangle \right), \quad \text{with } |\alpha|^2 + |\beta|^2 = 1. \tag{9}$$

from which we find an expression of the normalization factor

$$\mathcal{N} = \sqrt{|\alpha|^2 + |1 + \beta|^2 + 1} = \sqrt{3 + 2\Re(\beta)}$$

of the initial state. Most importantly, we have a variable β in our description, which stands for the average amplitude of gist trace in the true target set {o} of the experimental paradigm. We assume that o cues with little relevance to the participants will correspond to low β, while o cues common to the participants will increase β.

Given the participant is informed she will be exposed to an equal amount of o, ns, and nd cues, overall she will expect an excess of gist in comparison to verbatim or unrelated features. A "constructive interference" in 1 + β with β > 0 would be expected (when <sup>β</sup> <sup>∈</sup> <sup>R</sup>). In the present experimental paradigm the cues are semantically forward related words (to its target word) of the DRM lists, therefore we would expect low or moderate associated gist traces here, certainly not really intense gist traces as for instance the Madeleine-cue provoked in Marcel Proust.

# 2.2. The Feynman Path Model for EOD by Busemeyer et al.

The Feynman path model by Busemeyer and Trueblood [23] and Busemeyer and Bruza [4, Ch. 6] introduces a four dimensional Hilbert space to encompass the two orderings of the types of process; verbatim before gist on o cues, and gist before verbatim on ns and nd cues. This model thus provides a cue dependent construction of the memory state.

As in the QEM model, the Feynman path approach does not concatenate reflection time periods. The exposure of the cue or the probe to the participant does not engender a unitary time evolution of the memory state. Notably this model provides cuedependence of evolution by ordering verbatim and gist stages in the process of recollection and depends on interference of probability amplitudes to form the acceptance probability in the disjunctive b? probe. Busemeyer and Bruza [4] model requires only 6 parameters for a satisfactory prediction of the 9 data points of the disjunctive EOD paradigm. The predictions of the Feynman path based model by Busemeyer et al. have been included in the data (**Table 1**). A short comparative discussion of the model's prediction capacity is given at the end of Section 2.

We summarize the Feynman paths in this model and have adapted the notation of Busemeyer and Bruza [4, Ch. 6] to conform with the present context<sup>8</sup> . We inserted the question mark to distinguish a probe—o?—from a cue o. The negation of a probe is indicated by the tilde sign—e.g., o˜?—and corresponds to the negation of the instruction "Is this not an o cue?" This allows to express the complementarity of the cases o? and o˜? according:

$$|o\rangle\langle o| + |\tilde{o}\rangle\langle \tilde{o}| = \mathbb{T} \tag{10}$$

For o cues, verbatim is treated before gist, which means first o? operates on the initial state |Soi for o cues, then followed by the operation of ns?:

$$\begin{aligned} p(o \!\!/ o) &= |\langle o \!\!/ S\_o \rangle|^2, \\ p(ns \!\!/ o) &= |\langle ns \!\!/ S\_o \rangle|^2 = |\langle ns \!\!/ o \rangle \langle o \!\!/ S\_o \rangle + \langle ns \!\!/ \tilde o \rangle \langle \tilde{o} \!\!/ S\_o \rangle|^2, \\ p(b \!\!/ \vert o) &= p(o \!\!/ \vert o) + p(\tilde{o} \!\!/ \vert o) p(ns \!\!/ \tilde o) \\ &= |\langle o \!\!/ S\_o \rangle|^2 + |\langle \tilde{o} \!\!/ S\_o \rangle|^2 |\langle ns \!\!/ \tilde o \rangle|^2, \\ &= 1 - |\langle \tilde{n} \!/ \tilde{o} \!\!/ \rangle|^2 |\langle \tilde{o} \!\!/ S\_o \rangle|^2, \end{aligned}$$

requiring two parameters; hns|oi and hns|˜oi. For ns cues gist is treated before verbatim, then first ns? operates on the initial state |Snsi for ns cues, followed by o?

$$\begin{aligned} p(o \rhd | ns) &= |\langle o | \mathbf{S\_{ns}} \rangle|^2 = |\langle o | ns \rangle \langle ns | \mathbf{S\_{ns}} \rangle + \langle o | \tilde{n} \rangle \langle \tilde{n} | \mathbf{S\_{ns}} \rangle|^2, \\ p(ns \rhd | ns) &= |\langle ns | \mathbf{S\_{ns}} \rangle|^2, \\ p(b \rhd | ns) &= p(ns \rhd | ns) + p(\tilde{n} \rhd | ns)p(o \rhd | \tilde{n} \mathbf{s}) \\ &= |\langle ns | \mathbf{S\_{ns}} \rangle|^2 + |\langle \tilde{n} \mathbf{s} | \mathbf{S\_{ns}} \rangle|^2 |\langle o | \tilde{n} \mathbf{s} \rangle|^2, \\ &= 1 - |\langle \tilde{o} | \tilde{n} \mathbf{s} \rangle|^2 |\langle \tilde{n} \mathbf{s} | ns \rangle|^2, \end{aligned}$$

requiring two more parameters ho| ˜nsi and ho|nsi. Also for nd cues gist is treated before verbatim

$$\begin{aligned} p(o\sharp|nd) &= |\langle o|\mathbf{S}\_{nd}\rangle|^2 = |\langle o|ns\rangle\langle ns|\mathbf{S}\_{nd}\rangle + \langle o|\tilde{n}s\rangle\langle \tilde{n}s|\mathbf{S}\_{nd}\rangle|^2, \\ p(ns\sharp|nd) &= |\langle ns|\mathbf{S}\_{nd}\rangle|^2, \\ p(b\sharp|nd) &= p(ns\sharp|nd) + p(\tilde{n}s\sharp|nd)p(o\sharp|\tilde{n}d) \\ &= |\langle ns|\mathbf{S}\_{nd}\rangle|^2 + |\langle \tilde{n}s|\mathbf{S}\_{nd}\rangle|^2|\langle o|\tilde{n}d\rangle|^2, \\ &= 1 - |\langle \tilde{o}|\tilde{n}d\rangle|^2|\langle \tilde{n}\tilde{s}|\mathbf{S}\_{nd}\rangle|^2, \end{aligned}$$

<sup>6</sup>An explicit eigenvector <sup>|</sup>o(α, <sup>β</sup>)<sup>i</sup> of <sup>M</sup>y,o? is given by My,o? |o(α, β)i = |o(α, β)i = [α, <sup>β</sup>, 0]<sup>τ</sup> <sup>=</sup> <sup>α</sup>|Vi + <sup>β</sup>|Gi, with <sup>|</sup>α<sup>|</sup> <sup>2</sup> + |β<sup>|</sup> <sup>2</sup> <sup>=</sup> 1. Evidently there is a possible denomination issue caused by the relative weight of both components, since diminishing α will eventually turn an o state indiscernibly into an ns state.

<sup>7</sup> The equally weighed initial state 1/N( p 1 − |β| 2 , 1 <sup>+</sup> <sup>β</sup>, 1)<sup>τ</sup> was obtained by giving each type of cue's vector |oi, |nsi and |ndi equal weight at start. Our implementation however does neither reflect equal baseline probability of o, ns, and nd in the participants memory state as aimed for by Brainerd et al. [1], also here one cannot have p(o) = p(ns) = p(nd) at the start. For real-valued β, the initial probabilities come at the closest for β = −2 + √ 3 at p<sup>o</sup> = 0.59, pns = 0.22, pnd = 0.41.

<sup>8</sup>The original notation V for "verbatim," R for "related," and U for "unrelated" cues are here replaced by o, ns, and nd, respectively. One should be attentive to the fact that V stood for "is the cue verbatim?" (actually meaning old), it does not stand for the verbatim trace of QEM.

without new parameter requirements. The parameters appear as elements of unitary transformations and must satisfy unitarity. Leaving <sup>h</sup>ns|oi = <sup>p</sup> 1 − |hns|˜oi|<sup>2</sup> = h ˜ns|˜oi and h ˜ns|oi = −hns|˜oi ⋆ . The initial state is described in a four dimensional Hilbert space, in which the initial state depends on the presented cue:

$$\begin{split} \left| \left| S \right\rangle &= \left| o\_o \right\rangle \langle o\_o \left| S \right\rangle + \left| \tilde{o}\_o \right\rangle \langle \tilde{o}\_o \left| S \right\rangle + \left| o\_{\tilde{o}} \right\rangle \langle o\_{\tilde{o}} \left| S \right\rangle + \left| \tilde{o}\_{\tilde{o}} \right\rangle \langle \tilde{o}\_{\tilde{o}} \left| S \right\rangle, \\ &= \left| n s\_o \rangle \langle n s\_o \left| S \right\rangle + \left| \tilde{n} s\_o \right\rangle \langle \tilde{n} s\_o \left| S \right\rangle + \left| n s\_{\tilde{o}} \right\rangle \langle n s\_{\tilde{o}} \left| S \right\rangle + \left| \tilde{n} \tilde{s}\_{\tilde{o}} \right\rangle \langle \tilde{n} s\_{\tilde{o}} \left| S \right\rangle. \end{split}$$

where the index represents the cue type. The first expression is applicable for target cues from {o}, thus |Soi (where the index o˜ indicates either ns or nd). And the second expression for the initial state is applicable when the cue is not a target but comes from {ns} ∪ {nd}, thus for |Snsi and |Sndi. Therefore, hoo˜ |Soi = h˜oo|Soi = 0 and hnso|So˜i = h ˜nso|So˜i = 0. A simplification of the formalism is obtained by chosing the phase ϕ of hns|˜oi and the phase θ of ho| ˜nsi equal to each other. This choice corresponds to a simplification of the dynamics in the subspaces of the 4-dimensional Hilbert space, in which the gistbefore-verbatim states and the verbatim-before-gist states differ. Equating the phases on the transition components is considered a compromise between reducing parameters and prediction accuracy.

The final six parameters for the Feynman path model for EOD of Busemeyer et al. then are {h po|Soi,hns|Snsi,hns|Sndi, |hns|˜oi|, |ho| ˜nsi|, <sup>θ</sup>}, where <sup>h</sup>o|Soi = <sup>p</sup>(o?|o), <sup>h</sup>ns|Snsi = <sup>p</sup> <sup>p</sup>(ns?|ns), <sup>h</sup>ns|Sndi = <sup>p</sup> p(ns?|nd), while |hns|˜oi|, |ho| ˜nsi| and the phase angle θ are used to fit the remaining six data points.

#### 2.3. The Hamiltonian Driven QEM Model

A Hamiltonian based quantum-like model allows the description of temporal evolution of the perceived memory state of participants through the stages of the experiment. Although, the explicit time-dependence of states in this approach would in principle allow response time predictions, the main goal is to describe increasing and decreasing tendencies building up toward the point of decision. We emphasize that while we let "t" stand for time in our model, it is rather to be considered as an indicative parameter for temporal ordering than "physical" time [24].

**States and probabilities.** Following the Hilbert space construction of the QEM model, the memory states are conceived to have one component for accepting o (memory target, old cues), one component for accepting ns (new semantically related cues), and one component for accepting nd (unrelated, new-dissimilar cues). Expressed on the orthogonal basis vectors for the fuzzy traces the state function following the VGN ordering of components is denoted as

$$
\Psi\_{probe|cue}(t) = [\psi\_{p|c\_V}(t), \psi\_{p|c\_G}(t), \psi\_{p|c\_N}(t)]^\varepsilon,
$$

with

$$|\psi\_{\mathcal{P}|c\_{V}}|^{2} + |\psi\_{\mathcal{P}|c\_{G}}|^{2} + |\psi\_{\mathcal{P}|c\_{N}}|^{2} = 1.\tag{11}$$

A state vector is thus defined separately for each combination of cue in {o, ns, nd} and probe in {o?, ns?, nd?} for partition subadditivity and in {o?, ns?, o or ns ?} for disjunction EOD. In contrast with Brainerd et al.'s QEM approach our method results in nine different state vectors—for each the experimental paradigms—that are obtained by adapting the Hamiltonian depending on the choice of probe and the choice of cue. Under a specific instruction probe and cue, the acceptance probabilities are obtained by applying the projectors (Equation 6) to the final state and take the modulus square of the outcome. All acceptance probabilities for both paradigms are then explicitly given by:

$$p(o \!? \vert o) = \left| \psi\_{o \!? \vert o \!v\_{V}} \right|^{2} + \left| \psi\_{o \!v\_{\!} \vert o\_{G}} \right|^{2}, \qquad p(o \!? \vert ns) = \left| \psi\_{o \!? \vert ns\_{N}} \right|^{2},$$

$$p(ns \!? \vert o) = \left| \psi\_{n \!? \vert o \!v\_{G}} \right|^{2}, \qquad p(ns \!? \vert ns) = \left| \psi\_{n \!? \vert ns\_{G}} \right|^{2},$$

$$p(nd \!? \vert o) = \left| \psi\_{n \!? \vert o\_{N}} \right|^{2}, \qquad p(nd \!? \vert ns) = \left| \psi\_{n \!? \vert ns\_{N}} \right|^{2},$$

$$p(b \!? \vert o) = \left| \psi\_{b \!? \vert o \!v\_{V}} \right|^{2} + \left| \psi\_{b \!? \vert ns \!} \right|^{2}, \qquad p(b \!? \vert ns) = \left| \psi\_{b \!? \vert ns \!} \right|^{2} \tag{12}$$

$$\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \tag{12}$$

and

$$p(o \mathbb{1} | nd) = |\psi\_{o \mathbb{1} | nd \, V}|^2 + |\psi\_{o \mathbb{1} | nd \, G}|^2$$

$$\begin{split} p(ns \mathbb{1} | nd) &= |\psi\_{ns \mathbb{1} | nd \, G}|^2 \\ p(nd \mathbb{1} | nd) &= |\psi\_{nd \mathbb{1} | nd \, N}|^2 \\ p(b \mathbb{1} | nd) &= |\psi\_{b \mathbb{1} | nd \, V}|^2 + |\psi\_{b \mathbb{1} | nd \, G}|^2 \end{split} \tag{13}$$

where the instruction o or ns? is denoted by shorthand b? for "both" o? or ns?. We have noted previously in FTT theory under b? probe the amplitudes of the V component and the G component both are in the acceptance subspace. This leads to formal similarity but not numerical equivalence of the probabilities p(o?|o) and p(b?|o)—idem for conditionalization on probes ns and nd—since memory is description-dependent ([1, 3])<sup>9</sup> . The quantum model can thus provide explicit expressions for both the unpacking factor and the subadditivity.

**Unpacking factor and subadditivity.** First we discuss the expression for the subadditivity, Equation (1), for some cue c;

$$\begin{split} p(o\!\!/c) + p(ns\!\!/c) + p(nd\!\!\!/c) &= \left| \psi\_{o\!\!\!/c\!\!/c} \right|^2 + \left| \psi\_{o\!\!\!/c\!\!/c} \right|^2 \\ &+ \left| \psi\_{ns\!\!\!/c\!\!/c\!\!/} \right|^2 + \left| \psi\_{nd\!\!\!\!/c\!\!/c\!\!/} \right|^2 \\ &= \left| \psi\_{o\!\!\!/c\!\!/c\!\!/} \right|^2 + \left| \psi\_{o\!\!\!\!/c\!\!\!/c\!\!} \right|^2 \\ &+ \left( \left| \psi\_{o\!\!\!/c\!\!/c\_N} \right|^2 - \left| \psi\_{o\!\!\!\!\!/c\_N} \right|^2 \right) \\ &+ \left| \psi\_{ns\!\!\!\!/c\_G} \right|^2 + \left| \psi\_{nd\!\!\!\!\!/c\_N} \right|^2 \\ &= 1 + \left| \psi\_{ns\!\!\!\!/c\_G} \right|^2 + \left| \psi\_{nd\!\!\!\!\!\!\!/c\_N} \right|^2 \\ &- \left| \psi\_{o\!\!\!\!\!\!/c\_N} \right|^2 \end{split} \tag{14}$$

We remark that our Hamiltonian driven account of QEM does not necessarily imply subadditivity of total acceptance probability10. Mostly the QEM model will imply subadditivity

<sup>9</sup>Meaning probe-dependence by differing κ<sup>b</sup> and κ<sup>o</sup> in the second stage propagation.

<sup>10</sup>Brainerd et al. [1, p. 19]) mentions: "[. . . ] a distinct memory state vector is generated for each of the three types of cues, with corresponding amplitudes vC,

of total acceptance probability as long as a gist component is present in the memory state [1]. If however in some instance the gist trace is weak and the unrelated trace N is strongly description-dependent such that—unexpectedly—the probe o? engenders stronger response than the probe nd?, it is possible to have superadditivity in our Hamiltonian driven QEM model.

Next we shortly discuss the expression for the unpacking factor, Equation (3), for some cue c, which we find by replacing the respective acceptance probabilities by their modulus squared amplitude components, Equations (12, 13);

$$\begin{split} \frac{p(o\!\!/\!\!/\!c) + p(ns\!\!\!\!/\!c)}{p(b\!\!\!\!\!/\!c)} &= 1 + \frac{|\psi\_{ns\!\!\!^\ast|\!\!\!c}|^2}{|\psi\_{b\!\!\!^\ast|\!\!\!\!/\!.}|^2 + |\psi\_{b\!\!\!\!\!\!/\!\!\!\/}\_{b\!\!\!\!\!/}|^2} \\ &+ \frac{|\psi\_{o\!\!\!\!\!/\!c\_V}|^2 - |\psi\_{b\!\!\!\!\!\!\!/\!\!\/\!\!\/}\_{b\!\!\!\!\!\!/\!\!\/}\_{c\!\!\!\!\!}|^2}{|\psi\_{b\!\!\!\!\!\!\!}\_{b\!\!\!\!\!\!\/\!\!\/\!\!\/}|^2} \end{split} \tag{15}$$

We thus remark that our Hamiltonian QEM approach will mostly show an EOD disjunction fallacy when a gist component is present in the memory state. However again, when the gist trace is weak in the ns? condition and the verbatim V and gist trace G are strongly description-dependent such that the probe b? engenders stronger response than the probe o?, it is possible to avoid EOD in the QEM model (rhs will be less than 1 as the second fraction becomes very small and the final fraction becomes negative and sufficiently dominant).

**Subsequent reflection periods.** The experimental paradigm essentially shows two reflection periods in the participants; the first period involves processing a cue from {o, ns, nd} after it is presented, the second period concerns the processing of a probe from {o?, ns?, nd?} after that one has been presented.

In the first period the participant will do a descriptionindependent effort to evolve the equally weighed initial state (Equation 8)—a-expressed in VGN base (Equation 9)– as good as the participant can to one that corresponds to the presented cue. This type of reflection will be represented by a dedicated cuedependent Hamiltonian H<sup>c</sup> , thus requiring three parameters in total to cover the first stage of the full experimental paradigm.

In the second period the participant receives the probe instruction and possibly changes her attitude toward the perception in the first stage, allowing for description-dependent processing. The input of new information by the probe in the participant's mind engenders a change of dynamics (e.g., [25]). This second type of reflection will thus proceed along a different Hamiltonian Hp? also requiring three parameters to cover the experimental paradigm.

**First reflection period.** We specify now the Hamiltonians describing the reflection of the first period following the presentation of the respective cues. This stage will change the memory state from an undecided equally weighed one to a state that reflects the recognition of the cue's nature by the participant. Since the Hamiltonian is the generator of change over infinitesimal time we can model it to cause the required transitions<sup>11</sup> .

**Reflection following ns and nd cue.** For the reflection following the presentation of the cue we will construct a superposition of 2 × 2-dimensional Hadamard gates that transfer probability amplitude mass toward the targeted components of the state vector in VGN-space ([4, Ch. 8], [24, 26]). One can see however that higher matrix powers of such Hamiltonians will not show the simple closure of transitions we find when using single parametrized Hadamard gates. Except for shedding the possibility of simple analytical calculation of the unitary evolution operator this does not alter the essence of the dynamics. In the present model we will use parametrized Hadamard gates with off-diagonal appearance of the parameter12. We derive Hamiltonians for the presentation of the ns and nd cue based on their respective target states (0, 1, 0)<sup>τ</sup> and (0, 0, 1)<sup>τ</sup> . On the presentation of an ns cue to the participant the amplitude mass has to shift from verbatim trace to gist trace and from the unrelated trace to the gist trace. In the perceived memory state vector of the VGN-space this means the Hamiltonian must transfer amplitude from 1st to 2nd entry and from 3rd to 2nd entry of the perceived memory state vector:

$$\begin{split} H\_{\rm ns}(\boldsymbol{\wp\_{\rm ns}}) &= \, \, G\_{12}(\boldsymbol{\wp\_{\rm ns}}) + \, G\_{32}(\boldsymbol{\wp\_{\rm ns}}), \\ &= \, \frac{1}{\sqrt{\boldsymbol{\wp\_{\rm ns}^{2}} + 1}} \begin{pmatrix} -1 & \boldsymbol{\wp\_{\rm ns}} & 0 \\ \boldsymbol{\nu\_{\rm ns}^{\star}} & 2 & \boldsymbol{\wp\_{\rm ns}} \\ 0 & \boldsymbol{\nu\_{\rm ns}^{\star}} & -1 \end{pmatrix}. \end{split} \tag{16}$$

Where γns will be the parameter describing the participant's ability to recognize an ns cue (γns <sup>∈</sup> <sup>R</sup>).

Similarly when an nd cue is presented to the participant the amplitude mass has to shift according the targeted vector from, from verbatim to unrelated and from gist to unrelated. This means that the dedicated Hamiltonian must transfer amplitude from 1st to 3rd entry and from 2nd to 3rd entry of the perceived memory state in VGN-space.

$$\begin{split} H\_{nd}(\boldsymbol{\chi}\_{nd}) &= \, \, G\_{13}(\boldsymbol{\chi}\_{nd}) + G\_{23}(\boldsymbol{\chi}\_{nd}), \\ &= \, \, \frac{1}{\sqrt{\boldsymbol{\chi}\_{nd}^{2} + 1}} \begin{pmatrix} -1 & 0 & \boldsymbol{\chi}\_{nd} \\ 0 & -1 & \boldsymbol{\chi}\_{nd} \\ \boldsymbol{\chi}\_{nd}^{\star} & \boldsymbol{\chi}\_{nd}^{\star} & 2 \end{pmatrix}. \end{split} \tag{17}$$

<sup>11</sup>Applying the Hamiltonian to the initial state gives a first-order approximation of the change of the state vector for an infinitesimal time interval:

$$
\psi\_{\delta t} - \psi\_0 \approx \frac{i\delta t}{\hbar} H \psi\_0
$$

This allows us to design the Hamiltonian according the needs of the cognitive process. <sup>12</sup>E.g.,:

$$G\_{21}(h) = \frac{1}{\sqrt{1+|h|^2}} \begin{pmatrix} 1 & h & 0 \\ h^\bullet & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{with} \quad G\_{21}(h)^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

This modification retains the rotation effects of the operator and squares to the unity operator in VG-space. A main advantage of the present form is the oscillations of probability over time stop when the parameter is set equal to zero.

gC, and nC." (see Equations 4, 5). Our present Hamiltonian take of the QEM structure provides nine memory state vectors. Starting from one single initial state our Hamiltonian dynamics provides a distinct state vector for each of the nine configurations of the three cues crossed with the three probes. Therefore we have nine normalization conditions of the vectors (Equation 11), and can have some modulation in the unpacking factor and in the subadditivity expression.

Where γnd will be the parameter describing the participant's ability to recognize an nd cue, (γnd <sup>∈</sup> <sup>R</sup>).

**Reflection following o cue.** The Hamiltonian for the dynamics after o cue presentation to the participant is again based on its target state vector (p 1 − |β| 2 , β, 0)<sup>τ</sup> . When the o cue is presented to the participant the amplitude mass has to shift from unrelated to verbatim and from unrelated to gist. In this case with the o cue however, both processes must not occur at the same rate. The dedicated Hamiltonian has to transfer amplitude form 3rd to 1st and from 3rd to 2nd with the respective rates p 1 − |β| <sup>2</sup> and β in accordance with the target vector state. Moreover the initial gist component needs to be redistributed according the target vector as well, leading to a complementary transfer from 2nd to 1st entry with rate p 1 − |β| 2 :

$$H\_o(\chi\_o, \beta) = G\_{21}(\chi\_o \sqrt{1 - |\beta|^2}) + G\_{31}(\chi\_o \sqrt{1 - |\beta|^2})$$

$$+ G\_{32}(\chi\_o \beta). \tag{18}$$

Where γ<sup>o</sup> will be the parameter describing the participant's ability to recognize an <sup>o</sup> cue, (γ<sup>o</sup> <sup>∈</sup> <sup>R</sup>, <sup>β</sup> <sup>∈</sup> <sup>C</sup>) 13 .

**Second reflection period.** Following the reflection period after the presentation of the cue, the participant is presented with a probe stemming from o?, ns?, nd? and matches it with her recollection memory state post first stage. This comparison can either lead to an affirmation of the probe or a challenge. Coincidence of perceived cue and probe may induce to some degree a tendency to affirm one's memory state, while contrasting cue and probe may to some degree induce a challenge or cognitive dissonance. We remark that affirmation and challenge are relative; in an o-run a participant with an ns-recollection will consider the o?-probe as a challenge rather than a confirmation. The terms affirmation and challenge clearly take their meaning only for the inter-participant average of acceptance probabilities, not in general for individual intra-participant occasions (see **Table 3**). In the second reflection period the probe thus either affirms or challenges the recollection effort of the first stage, dynamically this corresponds to either an amplified continuation of the first stage dynamics or a reversed evolution with regard to the probe:

$$H\_{o\text{\textquotedblleft}}(\kappa\_o, \wp\_0, \beta) = H\_o(\kappa\_o \wp\_o, \beta), \tag{19}$$

$$H\_{ns\text{\textquotedblleft}}(\kappa\_{ns}, \chi\_{ns}) = H\_{ns}(\kappa\_{ns}\chi\_{ns}),\tag{20}$$

$$H\_{nd\!\!\!\!=}(\kappa\_{nd}, \chi\_{nd}) \;= H\_{nd}(\kappa\_{nd}\chi\_{nd}),\tag{21}$$

$$H\_{b\circ}(\kappa\_b, \chi\_o, \chi\_{ns}, \beta) \,= \, H\_b(\kappa\_b \chi\_o, \kappa\_b \chi\_{ns}, \beta). \tag{22}$$

Where κ is the parameter expressing affirmation (κ > 0) or challenge κ < 0 of the cue by the probe when the parameter γ<sup>c</sup> > 0, and the other way round when γ<sup>c</sup> < 0. Multiplication of the driving parameter γ<sup>c</sup> leads to a modified composed parameter κpγ<sup>c</sup> in the Hamiltonian to affirm or mitigate the participant's initial recollection of the cue. We want to emphasize that the second stage Hamiltonians for the probes are thus structured exactly in the same way as the Hamiltonians for

TABLE 3 | Affirmation and challenge of cues by probes; + sign indicates corresponding features, − sign indicates challenge.


*The subindex indicates the conflicting or affirming feature.*

the corresponding cues, except that the driving parameters γ are modulated by multiplying them with dedicated tweaking parameters κ.

**Reflection following b? probe.** While it is not needed in the first reflection stage, under the disjunctive probe b? in the second stage a dedicated Hamiltonian, Equation (22), is still required. Also the Hamiltonian proper to the exposure of the b? probe is based on its target state vector (p 1 − |β| 2 , 1 <sup>+</sup> <sup>β</sup>, 0)<sup>τ</sup> . This consists of the actions of Ho(γo) and Hns(γns) where the parameters in the corresponding gates have been added, or subtracted if the transport is in opposite direction14;

$$\begin{aligned} H\_b(\chi\_o, \chi\_{\rm ns}, \beta) &= G\_{21} (\chi\_o \sqrt{1 - |\beta|^2} - \chi\_{\rm ns}) + G\_{31} (\chi\_o \sqrt{1 - |\beta|^2}) \\ &+ G\_{32} (\chi\_{\rm ns} + \beta \chi\_o) \end{aligned} \tag{23}$$

The Hamiltonian for the b? probe thus uses three parameters γo, γns and β which it inherits from the Hamiltonians for its component probes o? or ns?.

**Unitary evolution and time of measurement.** An issue with quantum-like models is the typical appearance of oscillations of probability over time. These oscillations in the evolution are essentially due to the inherent periodicity of a finite dimensional and energetically closed quantum system. Simply put, such systems will always evolve back to their initial state and do over the exact same itinerary in their Hilbert space—ad infinitum. Evidently, in the domain of cognition, when quantum-like modeling of experimental paradigms is done, only within-period evolution should be given meaningful interpretation [24]. In that sense a guideline for the time of measurement would be to keep the reflection times short with respect to the full period. Another option to arrest the characteristic probability oscillation is to include a third stage in the experimental paradigm driven by a 'grab coat and leave' Hamiltonian, which would be dedicated to freeze the perceived memory state (set all driving parameters γ equal to zero). More elegantly a termination should be formalized to damp the memory state vector back into its baseline uninformed state by using Lindblad evolution for an open system (e.g., [27, 28] Broekaert et al., under review).

A number of alternative criteria could be put forward to decide this instance of measurement, though at present we keep to an ad hoc cut to the unitary time propagation as proposed by Busemeyer et al. [23] and Busemeyer and Bruza [4] <sup>15</sup>. With

<sup>13</sup>With β a complex number one must take care to keep the Hadamard gate Hermitian.

<sup>14</sup>One must take into account <sup>G</sup>12(<sup>γ</sup> ) = −G21(−<sup>γ</sup> ).

<sup>15</sup>Choice <sup>t</sup> <sup>=</sup> π 2 corresponds to a first extremum when significant parameters in the Hamiltonian are set equal to zero, i.e., when the actual psychological dynamics is "turned off " in the model.

the intent of the possibility of tweaking the observed acceptance probabilities by description dependence in the second reflection period, we have taken the ad hoc reflection durations of both stages somewhat shorter; <sup>π</sup> 3 for each stage.

The first stage ends at t = π 3 , the unitary operator of the second stage picks up there after. The vector of the perceived memory state at time t after probe presentation is then obtained by propagating the initial state, Equation (9), by the concatenated Schrödinger propagators;

$$\begin{aligned} \Psi\_{\mathbb{P}|\epsilon}(t) &= e^{-iH\_{\mathbb{P}}t} e^{-iH\_{\mathbb{C}}\frac{\pi}{3}} \Psi\_0, \text{ and} \\ \Psi\_0 &= \frac{1}{\sqrt{3 + 2\Re(\beta)}} (\sqrt{1 - |\beta|^2}, (\beta + 1), 1)^{\mathbb{T}} \end{aligned} \tag{24}$$

Also the second stage ends at t = π 3 , after the first stage. Time evolution prior to the second stage can be obtained by deleting the propagator of the second stage and letting the first propagator have the argument t. The acceptance probabilities p(p?|c) can then be derived using their expressions, Equations (12, 13), in terms of state vector components and will be fitted to the observed data by SSE optimization of the seven free parameters of our Hamiltonian driven QEM model.

#### 3. FITTING THE MODELS TO THE EOD DATA

**The data fitted post-hoc parameters for Brainerd et al.'s QEM.** QEM provides three amplitudes per cue {v<sup>c</sup> , g<sup>c</sup> , ndc}, which satisfy normalization (Equation 5). Therefore six numbers should cover the experimental data sets. Our prescription for acceptance probabilities, Equations (12, 13), coincide with Brainerd et al. [1, p. 243]:

$$\begin{aligned} ||M\_{\mathcal{O}, \mathcal{Y}} | \mathcal{S}\_{\mathcal{C}} ||^2 &= |\nu\_{\mathcal{C}}|^2 + |\mathcal{g}\_{\mathcal{C}}|^2 ||M\_{\mathcal{m}, \mathcal{Y}} |\mathcal{S}\_{\mathcal{C}}| ||^2 \\ &= |\mathcal{g}\_{\mathcal{C}}|^2 ||M\_{\mathcal{N}\mathcal{D}, \mathcal{Y}} |\mathcal{S}\_{\mathcal{C}}| ||^2 = |\nu\_{\mathcal{C}}|^2 \end{aligned}$$

and in the same logic we have in QEM, see Equation (7);

$$\left\| \left| M\_{\mathcal{B}, \mathcal{Y}} | \mathcal{S}\_{\mathcal{C}} \right\| \right\|^2 = \left\| \left| M\_{\mathcal{O}, \mathcal{Y}} | \mathcal{S}\_{\mathcal{C}} \right\rangle \right\|^2 = \left| \nu c \right|^2 + \left| \mathcal{g} c \right|^2.$$

We notice that in QEM we will always predict p(o?|C) ≥ p(ns?|C), which is of course only the case for the o cue data. Since the modulus square amplitudes are positive numbers, data with p(o?|C) < p(ns?|C) cannot be accommodated in the original version of QEM.

Similarly in the disjunction paradigm QEM would always predict p(o?|C) = p(ns?|C), which is not apparent in the experimental disjunction data (**Table 1**) and certainly not so for ns and nd cues. Without any other means to fine tune the acceptance probabilities we would expect low accuracy of prediction for them, while we expect pronounced total probability and unpacking factor in the subadditivity paradigm and the disjunction paradigm respectively, **Tables 1**, **2**. Optimized QEM parameters appear in **Tables 4**, **5**.

**The data fitted parameters of the Feynman path based model.** The Feynman path model required six parameters to obtain the nine acceptance probabilities of disjunction paradigm ([4], Ch. 6). The model allows to reproduce very well the general required pattern of acceptance probabilities at RMSE = 0.0298, which turn out the precise EOD effects, except for the new dissimilar cues {nd}, **Table 1**. In the latter case the unpacking factor turns out smaller than 1, i.e., the conjunction value turns out negative.

The Feynman path model was not adapted yet to the subadditivity paradigm, but since it uses interference of amplitudes and reversed gist/verbatim processing depending on the type of cue, the model should be applicable in that paradigm as well.

**The data fitted Hamiltonian driven QEM parameters.** With both experiments reporting different data for similar expressions, we have fitted the Hamiltonian model to each separately16. For the EOD-disjunction paradigm the model obtained closely fitted parameters to the experimental data, with RMSE = 0.0073 with β > 0. When β < 0 constraint was imposed a less good RMSE = 0.0341 was obtained. The nine predicted probabilities p(p?|c) by the parameters of **Table 6** are shown in **Table 1**.

For the EOD-subadditivity paradigm the model obtained a less efficient fit of parameters to the experimental data, with RMSE = 0.0565 for β > 0. When β < 0 the parameter fit allowed an improved RMSE = 0.0191. The Hamiltonian model for the EOD-paradigm allowed a very good data fit using complex β at RMSE = 0.0032. We recall that complex numbers consist of a modulus and a phase, therefore one complex parameter should actually be counted as two real parameters. We shortly comment on this issue in the discussion, Section 4. The nine predicted probabilities p(p?|c) following the parameters of **Table 7** for the three cases are shown in **Table 2**.

**The temporal evolution of acceptance probabilities.** With the optimized values of the driving parameters calculated, the temporal progression of the acceptance probability can be graphed (**Figures 1**–**5**). The dashed lines represent the first-stage evolutions when the participant is shown the cue for recognition, while the full lines represent the second-stage evolutions when the probes enquire for accepting the type of a cue. The ultimate instance of measurement happens at the end of the second stage (t = 2π/3). In all graphs, in the first stage the color indicates the "probability value" of the traces; red codes for perceiving the cue's unrelated features (N), orange codes for perceiving the cue's gist (G) and green codes for perceiving the cue's verbatim and gist (V + G)—one can quickly check that for the same cue the dashed red and dashed green values add up to 1 at each moment in the first stage. Evidently these first stage "probability values" should not be conflated with the participants acceptance probabilities. Only after the probe has enquired the participant do these values evolve as the nine acceptance probabilities. It is worthwhile to note that the optimalization of parameters has returned initial states which either contain no gist or no verbatim perception in the cases with real-valued β (at t = 0 respectively; orange has almost value zero or, green and orange almost coincide). Only when β is complex-valued does the initial value show

<sup>16</sup>Matlab's fmincon function on SSE was used with a 3<sup>6</sup> 2 1 (<sup>β</sup> <sup>∈</sup> <sup>R</sup>) or 3<sup>6</sup> 2 2 (<sup>β</sup> <sup>∈</sup> <sup>C</sup>) grid for the initial vector in the parameter space.



TABLE 5 | EOD-subadditivity paradigm: Optimized fit of independent QEM parameters providing RMSE <sup>=</sup> 0.0963, (*vc*, *<sup>g</sup><sup>c</sup>* <sup>∈</sup> <sup>R</sup> +).


substantial gist and verbatim trace. We have provided two graphs for the EOD-disjunction paradigm (**Figures 1**, **2**) the first one was constrained to have β > 0 while the second had to satisfy β < 0. In this EOD-disjunction paradigm no complex-valued β offered an optimized fit. For the EOD-subadditivity paradigm we provide three graphs (**Figures 3**–**5**), respectively with the constraints β > 0, β < 0, and <sup>β</sup> <sup>∈</sup> <sup>C</sup>.

One observes that in the first stage the dynamics is mostly monotonic—except for the one case where <sup>β</sup> <sup>∈</sup> <sup>C</sup> (**Figure 5**). In the second stage dynamics some intermediary extrema do appear, which from a cognitive point of view are not to be expected. The factor of description dependence was expected to be a smaller modification of the first stage recognition. The second stage extrema however need to be understood with respect to the ad hoc instance of measurement at t = π/3 after enquiry, adopting a shorter measurement time could have mitigated this temporal behavior. Finally we note that also the outspoken VGN spread of the initial vector could be related to a too extended period for evolution. While the fitting of the experimental acceptance probability data in the Hamiltonian driven QEM has shown good accuracy, the concomitant intermediate temporal evolution leaves room for improving the measurement protocol.

# 4. DISCUSSION

We had set out to develop a Hamiltonian driven model that would provide temporal evolution of the memory state of the Quantum Episodic Model of Brainerd et al. [1, 9]. The model uses nine different state vectors for the three cues crossed with three probe paradigms, and requires six parameters to drive the Hamiltonians and one parameter to tweak the gist in the initial state. We provided psychological interpretation of the parameters fitting the experimental process. Initially the memory state prior to cue and probe presentation is an equally weighed mix of o, ns and nd states leading to an overall amount of the gist component monitored by the parameter β. In first stage the ability to recognize the type of the cue c is driven by the cue-specific parameter γc in the Hamiltonian. In the second stage the instruction probe p? engenders an amplified or mitigated evolution driven by the probe-specific parameter κ<sup>p</sup> for descriptiondependence.

TABLE 6 | EOD-disjunction paradigm: Optimized fit of Hamiltonian parameters under β > 0 (RMSE = 0.0073), and β < 0 (RMSE = 0.0341) constraint.


*Fitting attempts for a* <sup>β</sup> <sup>∈</sup> <sup>C</sup> *gravitated toward the* β < *0 solution, i.e., phase*(β) <sup>→</sup> <sup>π</sup>*.*

TABLE 7 | EOD-subadditivity paradigm: Optimized fit of Hamiltonian parameters under constraint β > 0 (RMSE <sup>=</sup> 0.0565), β < 0 (RMSE <sup>=</sup> 0.0191) and <sup>β</sup> <sup>∈</sup> <sup>C</sup> (RMSE = 0.0032).


Our Hamiltonian driven account of QEM shows that the subadditivity and disjunction fallacy are not a priori guaranteed or "parameter free" in our model. The occasions in which these effects would not occur are however very improbable in practice, Equations (14, 15). This possibility is due to the fact that the two-staged Hamiltonian evolution produces nine state vectors 9probe|cue(t) instead of regular QEM's three cue-dependent state vectors 9cue.

Using two reported experimental data sets showing subadditivity and over-distribution of the disjunction in acceptance probabilities for episodic memory recollection, we were able to provide parameter values with good prediction capacity in the Hamiltonian model. In practice we provided values for seven parameters {κo, κns, κnd, γo, γns, γnd, β} to predict nine acceptance probabilities {po?o, pns?o, pnd?<sup>o</sup> , po?ns, pns?ns, pnd?ns, po?nd, pns?nd, pnd?nd} in the subadditivity paradigm and did the same for the disjunction paradigm, **Tables 6**, **7**. Rigorously one should discern the parametrization case when <sup>β</sup> <sup>∈</sup> <sup>C</sup>, which should be counted for two parameters even if the function of the real and imaginary part of the parameter take the same position in the model. The present model thus uses one extra parameter in comparison to the Feynman path model of Busemeyer and Bruza but provides better EOD prediction for all type of cues. Moreover the parameters in the Hamiltonian model do allow psychological interpretation. The predictions of acceptance probabilities following the original QEM formulation by Brainerd et al. showed to be flawed by systematic features. In the disjunction paradigm QEM's acceptance probabilities for the both? probe and the old?-probe can only be identical, and in both experimental paradigms QEM's acceptance probability for the old?-probe can only be larger than or equal to the acceptance probability for the new-similar?-probe, whatever the cue type.

The issue of "description-dependence" effect seems crucial in obtaining final acceptance probabilities; the κ factors are rather large in comparison to the driving parameters γ and cause

FIGURE 1 | β > 0 case: temporal evolution of acceptance probabilities for the EOD-disjunction paradigm [3]. Red indicates N probability component, orange indicates G probability component and green indicates V <sup>+</sup> G probability component. In the second stage brown indicates the acceptance probability for the *<sup>b</sup>*? probe (*V* + *G*). Notice the near absence of verbatim in the initial state.

outspoken evolution in second stage. This fact is rather counter intuitive as a priori we had expected small corrective modulation in second stage evolution (see **Tables 6**, **7**, **Figures 1**–**5**).

We found it remarkable that β ≈ ±1 is needed for best fit in both experimental paradigms when keeping <sup>β</sup> <sup>∈</sup> <sup>R</sup>. This would suggest that the verbatim trace is almost negligible in comparison to the gist in the set of true cues {o} in the initial state, or just the inverse. When β is allowed to be complex a mix of both traces is present in the best fitting initial state for the EOD-subadditivity paradigm. In the EOD-disjunction paradigm

complex β did not provide a best fit (the limit value became real).

Superposed Hadamard gates with off-diagonal parameters show to be a viable method in the construction of Hamiltonians. The "description-dependence" factor κ can indeed mitigate probability oscillations. The best example can be seen in the β < 0 subadditivity Graph 4 where a small κ<sup>o</sup> = 0.022764 acts on an average γ<sup>o</sup> = 2.5625 and gives in the second stage nearly unmodulated continuation for po?|o(t), po?|ns(t) and po?|nd(t) (solid green lines).

The ad hoc time <sup>π</sup> 3 avoided most intermediate extrema in the probabilities in the second reflection stage, except for <sup>β</sup> <sup>∈</sup> <sup>C</sup>.

We remark that lower time of measurement could bring about the problem of not being able to spread open to a range of probabilities in time when starting from some pre-defined e.g., equal– probability configuration, or just trade of with ever growing driving parameters γ . A longer time of measurement would increase the well-known issue of intermediate extrema.

We have used the equally weighed initial state 1/N [ p 1 − |β| 2 , 1 <sup>+</sup> <sup>β</sup>, 1]<sup>τ</sup> in VGN space to give each vector |oi, |nsi and |ndi equal weight at start which we consider reflected best the information communicated by the experimenter. The optimized data fit shows e.g., β ≈ ±1 in both paradigms with the perceived implicit probabilities at the start at p(o) ≈ 0.8, p(ns) ≈ 0.8 and p(nd) ≈ 0.2. Which one can observe at t = 0 in both the subadditivity paradigm (**Figure 3**) and disjunction paradigm (**Figure 1**). The precise nature of the initial vector for the memory state of the participant after studying {o} and having heard 'all type of cues will be presented with equal probability' but prior to cue and probe presentation remains somewhat puzzling.

In sum we consider to have constructed an acceptable Hamiltonian driven QEM version, with good prediction capacity for acceptance probabilities. Future work could include covering the model fitting of a data set which covers both the subadditivity and disjunction paradigm at once –eight parameters for twelve datapoints– to verify its further prediction capacity, and to monitor more closely the initial memory state in the experimental paradigms and the meaurement protocol.

# AUTHOR CONTRIBUTIONS

JBB has designed the Hamiltonian model for the EOD paradigm and provided data fitting and interpretation. JRB is the author of the Feynman path model for the EOD-paradigm and provided prior knowledge on the QEM model and Hamiltonian design, and did critical revision of the work.

# ACKNOWLEDGMENTS

JBB gratefully thanks JRB for extensive discussions on Hamiltonian and Markov dynamical decision models and their relation to the EOD effect, and also thanks Cole Rodman for insightful discussions on 3 × 3 categorizationdecision paradigms in quantum-like modeling. This work was made possible by FWO-Vlaanderen mobility grant V410016N. Further thanks go to an reviewer for inciting clarifications.

# REFERENCES


Notes in Computer Science, Vol. 8951. Cham: Springer (2015). p. 67–77. doi: 10.1007/978-3-319-15931-7\_6


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Broekaert and Busemeyer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# On the Foundations of the Brussels Operational-Realistic Approach to Cognition

#### Diederik Aerts <sup>1</sup> , Massimiliano Sassoli de Bianchi <sup>2</sup> and Sandro Sozzo<sup>3</sup> \*

*<sup>1</sup> Center Leo Apostel for Interdisciplinary Studies, Free University of Brussels, Brussels, Belgium, <sup>2</sup> Laboratorio di Autoricerca di Base, Lugano, Switzerland, <sup>3</sup> School of Management, Institute for Quantum Social and Cognitive Science, University of Leicester, Leicester, UK*

The scientific community is becoming more and more interested in the research that applies the mathematical formalism of quantum theory to model human decision-making. In this paper, we expose the theoretical foundations of the quantum approach to cognition that we developed in Brussels. These foundations rest on the results of two decade studies on the axiomatic and operational-realistic approaches to the foundations of quantum physics. The deep analogies between the foundations of physics and those of cognition lead us to investigate the validity of quantum theory as a general and unitary framework for cognitive phenomena, and the empirical success of the Hilbert space models derived by such investigation provides a strong theoretical confirmation of this validity. However, two situations in the cognitive realm, "question order effects" and "response replicability," indicate that even the Hilbert space framework could be insufficient to reproduce the expected pattern. This does not mean that the mentioned operational-realistic approach would be incorrect, but simply that a larger class of measurements would be in force in human cognition, so that an extended quantum formalism may be needed to deal with all of them. As we will explain, the recently derived "extended Bloch representation" of quantum theory (and the associated "general tension-reduction" model) precisely provides such extended formalism, while remaining within the same unitary interpretative framework.

Keywords: human cognition, cognitive modeling, quantum structures, foundations of quantum theory, tension-reduction model

# 1. INTRODUCTION

A fundamental problem in cognition concerns the identification of the principles guiding human decision-making. Identifying the mechanisms of decision-making would indeed have manifold implications, from psychology to economics, finance, politics, philosophy, and computer science. In this regard, the predominant theoretical paradigm rests on a classical conception of logic and probability theory. According to this paradigm, people take decisions by following the rules of Boole's logic, while the probabilistic aspects of these decisions can be formalized by Kolmogorov's probability theory [1]. However, increasing experimental evidence on conceptual categorization, probability judgments, and behavioral economics confirms that this classical conception is fundamentally problematical, in the sense that the cognitive models based on these mathematical structures are not capable of capturing how people concretely take decisions in situations of uncertainty.

#### Edited by:

*Andrei Khrennikov, Linnaeus University, Sweden*

#### Reviewed by:

*Jerome Busemeyer, Indiana University, USA Irina Basieva, General Physics Institute, Russia Giuseppe Sergioli, University of Cagliari, Italy*

> \*Correspondence: *Sandro Sozzo ss831@le.ac.uk*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *02 February 2016* Accepted: *19 April 2016* Published: *06 May 2016*

#### Citation:

*Aerts D, Sassoli de Bianchi M and Sozzo S (2016) On the Foundations of the Brussels Operational-Realistic Approach to Cognition. Front. Phys. 4:17. doi: 10.3389/fphy.2016.00017*

**22**

In the last decade, an alternative scientific paradigm has caught on which applies a different modeling scheme. The research that uses the mathematical formalism of quantum theory to model situations and processes in cognitive science is becoming more and more accepted in the scientific community, having attracted the interest of renowned scientists, funding institutions, media, and popular science. And, quantum models of cognition showed to be more effective than traditional modeling schemes to describe situations like the "Guppy effect," the "combination problem," the "prisoner's dilemma," the "conjunction and disjunction fallacies," "similarity judgments," the "disjunction effect," "violations of the Sure-Thing principle," "Allais," "Ellsberg," and "Machina paradoxes" (see, e.g., [2–22]). Recently, quantum computational semantics were applied to natural and musical languages in a novel approach [23, 24].

There is a general acceptance that the use of the term "quantum" is not directly related to physics, neither this research in "quantum cognition" aims to unveil the microscopic processes occurring in the human brain. The term "quantum" rather refers to the mathematical structures that are applied to cognitive domains. The scientific community engaged in this research does not instead have a shared opinion on how and why these quantum mathematical structures should be employed in human cognition. Different hypotheses have been put forward in this respect. Our research team in Brussels has been working in this domain since early nineties, providing pioneering and substantial contributions to its growth, and we think it is important to expose the epistemological foundations of the quantum theoretical approach to cognition we developed in these years. This is the main aim of the present paper.

Our approach was inspired by a two decade research on the mathematical and conceptual foundations of quantum physics, quantum probability, and the fundamental differences between classical and quantum structures [25–28]. We followed an axiomatic and operational-realistic approach to quantum physics, in which we investigated how the mathematical formalism of quantum theory in Hilbert space can be derived from more intuitive and physically justified axioms, directly connected with empirical situations and facts. This led us to elaborate a "State Context Property" (SCoP) formalism, according to which any physical entity is expressed in terms of the operationally well defined notions of "state," "context," and "property," and functional relations between these notions [29]. If suitable axioms are imposed to such a SCoP structure, then one obtains a mathematical representation that is isomorphic to a Hilbert space over complex numbers (see, e.g., [30]).

Let us shortly explain the "operational-realistic" connotation characterizing our approach, because doing so we can easily point out its specific strength, and the reason why it introduces an essentially new element to the domain of psychology. "Operational" stands for the fact that all fundamental elements in the formalism are directly linked to the measurement settings and operations that are performed in the laboratory of experimentation. "Realistic" means that we introduce in an operational way the notion of "state of an entity," considering such a "state" as representing an aspect of the reality of the considered entity at a specific moment or during a specific time-span. Historically, the notion of "state of a physical entity" was the "easy" part of the physical theories that were the predecessors of quantum theory, and it was the birth of quantum theory that forced physicists to take also seriously the role of measurement and hence the value of an operational approach. The reason is that "the reality of a physical entity" was considered to be a simple and straightforward notion in classical physics and hence the "different modes of reality of a same physical entity" were described by its "different states." That measurements would intrinsically play a role, also in the description of the reality of a physical entity, only became clear in quantum physics for the case of micro-physical entities.

In psychology, things historically evolved in a different way. Here, one is in fact confronted with what we call "conceptual entities," such as "concepts" or "conceptual combinations," and more generally with any cognitive situation which is presented to the different participants in a psychology experiment. Due to their nature, conceptual entities and cognitive situations are "much less real than physical entities," which makes the notion of "state of a conceptual entity" a highly non-obvious one in psychology. And, as far as we know, the notion of state is never explicitly introduced in psychology, although it appears implicitly within the reasoning that is made about experiments, their setups and results. Possibly, the notion of "preparation of the experiment" will be used for what we call "the state of the considered conceptual entity" in our approach. Often, however, the notion of state is also associated with the "belief system" of the participant in the experiment. In our approach we keep both notions of "state" and "measurement" on equal footing, whether our description concerns a physical entity or a conceptual entity. In this way, we can make optimal use of the characteristic methodological strengths of each one of the notions. It is in doing so that we observed that there is an impressive analogy between the operational-realistic description of a physical entity and the operational-realistic description of a conceptual entity, in particular for what concerns the measurement process and the effects of context on the state of the entity. As a matter of fact, one can give a SCoP description of a conceptual entity and its dynamics [4, 8, 9]. This justifies the investigation of quantum theory as a unified, coherent and general framework to model conceptual entities, as quantum theory is a natural candidate to model context effects and contextinduced state transformations. Hence, the quantum theoretical models that we worked out for specific cognitive situations strictly derive from such investigation of quantum theory as a scientific paradigm for human cognition. In this respect, we think that each predictive success of quantum modeling can be considered as a confirmation of such general validity. It is however important to observe that, recently, potential deviations from Hilbert space modeling were discovered in two cognitive situations, namely, "question order effects" [31] and "response replicability" [32]. According to some authors, question order effects can be represented by sequential quantum measurements of incompatible properties [14, 18, 31]. However, such a representation seems to be problematical, as it cannot reproduce the pattern that would be observed in response replicability, in case the effect were confirmed experimentally [32], nor it

can fit the experimental data, when non-degenerate models are considered [33, 34], or "exactly" fit the data, when degenerate models are used, as for instance the quantum identity called the QQ-equality (see Section 5) is never "perfectly" obeyed by the data (although it is remarkably almost obeyed by measurements not including background information [14, 18, 31]).

We put forward an alternative solution for these effects within a "hidden measurement formalism" elaborated by ourselves (see, e.g., [26, 35–39] and references therein), which goes beyond the Hilbert space formulation of quantum theory (probabilities), though it remains compatible with our operational-realistic description of conceptual entities [34, 40].

For the sake of completeness, we summarize the content of this paper in the following.

In Section 2, we present the epistemological foundations of the quantum theoretical approach to human cognition we developed in Brussels. We operationally describe a conceptual entity in terms of concrete experiments that are performed in psychological laboratories. Specifically, the conceptual entity is the reality of the situation which every participant in an experiment is confronted with, and the different states of this conceptual entity are the different modes of reality of this experimental situation. There are contexts influencing the reality of this experimental situation, and the relevant ones of these contexts are elements of the SCoP structure, the theory of our approach, and their influence on the experimental situation is described as a change of state of the conceptual entity under consideration. There are also properties of this experimental situation, the relevant ones being elements of the SCoP structure, and they can be actual or potential, their "amount of actuality" (i.e., their "degree of availability in being actualized") being described by a probability measure. The operational analogies between physical and conceptual entities suggest to represent the latter by means of the mathematical formalism of quantum theory in Hilbert space. Hence, we assume, in our research, the validity of quantum theory as a scientific paradigm for human cognition. On the basis of this assumption, we provide a unified presentation in Section 3 of the results obtained within a quantum theoretical modeling in knowledge representation, decision theory under uncertainty and behavioral economics. We emphasize that our research allowed us to identify new unexpected deviations from classical structures [41–43], as well as new genuine quantum structures in conceptual combinations [44–46], which could not have been identified at the same fundamental level as it was possible in our approach if we would have adopted the more traditional perspective only inquiring into the observed deviations from classical probabilistic structures. In Section 4, we analyze question order effects and response replicability and explain why a quantum theoretical modeling in Hilbert space of these situations is problematical. Finally, we present in Section 5 a novel solution we recently elaborated for these cognitive situations [34, 40]. The solution predicts a violation of the Hilbert space formalism, more specifically, the Born rule for probabilities is put at stake. We however emphasize that this solution remains compatible with the general operational and realistic description of cognitive entities and their dynamics given in Section 2. In Section 6, we conclude our article by offering a few additional remarks, further emphasizing the coherence and advantage of our theoretical approach. We stress, to conclude this section, that the deviation above from Hilbert space modeling should not be considered as an indication that we should better come back to more traditional classical approaches. On the contrary, we believe that new mathematical structures, more general than both pure classical and pure quantum structures, will be needed in the modeling of cognitive processes.

# 2. AN OPERATIONAL-REALISTIC FOUNDATION OF COGNITIVE PSYCHOLOGY

Many quantum physicists agree that the phenomenology of microscopic particles is intriguing, but what is equally curious is the quantum mathematics that captures the mysterious quantum phenomena. Since the early days of quantum theory, indeed, scholars have been amazed by the success of the mathematical formalism of quantum theory, as it was not clear at all how it had come about. This has inspired a long-standing research on the foundations of the Hilbert space formalism of quantum theory from physically justified axioms, resting on well defined empirical notions, more directly connected with the operations that are usually performed in a laboratory. Such an operational justification would make the formalism of quantum theory more firmly founded.

One of the well-known approaches to the foundations of quantum physics and quantum probability is the "Geneva-Brussels approach", initiated by Jauch [47] and Piron [48], and further developed by our Brussels research team (see, e.g., [25, 28]). This research produced a formal approach, called "State Context Property" (SCoP) formalism, where any physical entity can be expressed in terms of the basic notions of "state," "context," and "property," which arise as a consequence of concrete physical operations on macroscopic apparatuses, such as preparation and registration devices, performed in spatio-temporal domains, such as physical laboratories. Measurements, state transformations, outcomes of measurements, and probabilities can then be expressed in terms of these more fundamental notions. If suitable axioms are imposed on the mathematical structures underlying the SCoP formalism, then the Hilbert space structure of quantum theory emerges as a unique mathematical representation, up to isomorphisms [30].

There are still difficulties connected with the interpretation of some of these axioms and their physical justification, in particular for what concerns compound physical entities [25]. But, this research line was a source of inspiration for the operational approaches applying the quantum formalism outside the microscopic domain of quantum physics [49, 50]. In particular, as we already mentioned in Section 1, a very similar realistic and operational representation of conceptual entities can be given for the cognitive domain, in the sense that the SCoP formalism can again be employed to formalize the more abstract conceptual entities in terms of states, contexts, properties, measurements, and probabilities of outcomes [4, 8, 9].

Let us first consider the empirical phenomenology of cognitive psychology. Like in physics, where laboratories define precise spatio-temporal domains, we can introduce "psychological laboratories" where cognitive experiments are performed. These experiments are performed on situations that are specifically "prepared" for the experiments, including experimental devices, and, for example, structured questionnaires, human participants that interact with the questionnaires in written answers, or each other, e.g., an interviewer and an interviewed. Whenever empirical data are collected from the responses of several participants, a statistics of the obtained outcomes arises. Starting from these empirical facts, we identify in our approach entities, states, contexts, measurements, outcomes, and probabilities of outcomes, as follows.

The complex of experimental procedures conceived by the experimenter, the experimental design and setting and the cognitive effect that one wants to analyze, define a conceptual entity A, and are usually associated with a preparation procedure of a state of A. Hence, like in physics, the preparation procedure sets the initial state p<sup>A</sup> of the conceptual entity A under study. Let us consider, for example, a questionnaire where a participant is asked to rank on a 7-point scale the membership of a list of items with respect to the concepts Fruits, Vegetables and their conjunction Fruits and Vegetables. The questionnaire defines the states pFruits, pVegetables, and pFruits and Vegetables of the conceptual entities Fruits, Vegetables, and Fruits and Vegetables, respectively. It is true that cognitive situations exist where the preparation procedure of the state of a conceptual entity is hardly controllable. Notwithstanding this, the state of the conceptual entity, defined by means of such a preparation procedure, is a "state of affairs." It indeed expresses a "reality of the conceptual entity," in the sense that, once prepared in a given state, such condition is independent of any measurement procedure, and can be confronted with the different participants in an experiment, leading to outcome data and their statistics, exactly like in physics.

A context e is an element that can provoke a change of state of the conceptual entity. For example, the concept Juicy can function as a context for the conceptual entity Fruits leading to Juicy Fruits, which can then be considered as a state of the conceptual entity Fruits. A special context is the one introduced by the measurement itself. Indeed, when the cognitive experiment starts, an interaction of a cognitive nature occurs between the conceptual entity A under study and a participant in the experiment, in which the state p<sup>A</sup> of the conceptual entity A generally changes, being transformed to another state p. Also this cognitive interaction is formalized by means of a context e. For example, if the participant is asked to choose among a list of items, say, Olive, Almond, Apple, etc., the most typical one with respect to Fruits, and the answer is Apple, then the initial state pFruits of the conceptual entity Fruits changes to pApple, i.e., the state describing the situation "the fruit is an apple," as a consequence of the contextual interaction with the participant.

The change of the state of a conceptual entity due to a context may be either "deterministic," hence in principle predictable under the assumption that the state before the context acts is known, or "intrinsically probabilistic," in the sense that only the probability µ(p, e, pA) that the state p<sup>A</sup> of A changes to the state p is given. In the example above on typicality estimations, the typicality of the item Apple for the concept Fruits is formalized by means of the transition probability µ(pApple, e, pFruits), where the context e is the context of the typicality measurement.

Like in physics, an important role is played by experiments with only two outcomes, the so-called "yes-no experiments." Suppose that in an opinion poll a participant is asked to answer the question: "Is Gore honest and trustworthy?" Only two answers are possible: "yes" and "no." Suppose that, for a given participant, the answer is "yes." Then, the state pHonesty of the conceptual entity Honesty and Trustworthiness (which we will denote by Honesty, for the sake of simplicity) changes to a new state pGy, which is the state describing the situation "Gore is honest." Hence, we can distinguish a class of yes-no measurements on conceptual entities, as we do in physics.

The third step is the mathematical representation. We have seen that the Hilbert space formalism of quantum theory is general enough to capture an operational description of any entity in the micro-physical domain. Then, the strong analogies between the realistic and operational descriptions of physical and conceptual entities, in particular for what concerns the measurement process, suggest us to apply the same Hilbert space formalism when representing cognitive situations. Hence, each conceptual entity A is associated with a Hilbert space H, and the state <sup>p</sup><sup>A</sup> of <sup>A</sup> is represented by a unit vector <sup>|</sup>Ai ∈ <sup>H</sup>. A yesno measurement is represented by a spectral family {M, <sup>1</sup> <sup>−</sup> <sup>M</sup>}, where M denotes an orthogonal projection operator over the Hilbert space H, and 1 denotes the identity operator over H. The probability that the "yes" outcome is obtained in such a yes-no measurement when the conceptual entity A is in the state represented by |Ai is then given by the Born rule µ(A) = hA|M|Ai. For example, M may represent an item x that can be chosen in relation to a given concept A, so that its membership weight is given by µ(A).

The Born rule obviously applies to measurement with more than two outcomes too. For example, a typicality measurement involving a list of n different items x1, . . . , x<sup>n</sup> with respect to a concept A can be represented as a spectral measure {M1, . . . , <sup>M</sup>n}, where <sup>P</sup><sup>n</sup> <sup>k</sup>=<sup>1</sup> <sup>M</sup><sup>k</sup> <sup>=</sup> <sup>1</sup> and <sup>M</sup>kM<sup>l</sup> <sup>=</sup> <sup>δ</sup>klM<sup>k</sup> , such that the typicality µ<sup>k</sup> (A) of the item x<sup>k</sup> with respect to the concept A is again given by the Born rule µ<sup>k</sup> (A) = hA|M<sup>k</sup> |Ai.

An interesting aspect concerns the final state of a conceptual entity A after a human judgment. As above, we can assume the existence of a nonempty class of cognitive measurements that are ideal first kind measurements in the standard quantum sense, i.e., that satisfy the "Lüders postulate." For example, if the typicality measurement of a list of items x1, . . . , x<sup>n</sup> with respect to a concept A gave the outcome x<sup>k</sup> , then the final state of the conceptual entity after the measurement is represented by the unit vector |Ak i = <sup>M</sup><sup>k</sup> |Ai √ hA|M<sup>k</sup> |Ai . This means that the weights µ<sup>k</sup> (A) given by the Born rule can actually be interpreted as transition probabilities µ(p<sup>k</sup> , e, pA), where e is the context producing the transitions from the initial state p<sup>A</sup> of the conceptual entity A, represented by the unit vector |Ai, to one of the n possible outcome states p<sup>k</sup> , represented by the unit vectors |A<sup>k</sup> i.

Thus, how can a Hilbert space model be actually constructed for a cognitive situation? To answer this question let us consider again a conceptual entity A, in the state pA, a cognitive measurement on A described by means of a context e, and suppose that the measurement has n distinct outcomes, x1, x2, . . . , xn. A quantum theoretical model for this situation can be constructed as follows. Let us assume, for the sake of simplicity, that the measurement outcomes can be considered to be nondegenerate—this is a special situation which does not hold for a wide class of cognitive measurements, see Section 3. Then, we associate A with a n-dimensional complex Hilbert space H, and then consider an orthonormal base {|e1i, <sup>|</sup>e2i, . . . , <sup>|</sup>eni} in <sup>H</sup> (since H is isomorphic to the Hilbert space C n , the orthonormal base of H can be the canonical base of C n ). Next, we represent the cognitive measurement described by e by means of the spectral family {M1, M2, . . . , Mn}, where M<sup>k</sup> = |e<sup>k</sup> ihe<sup>k</sup> |, k = 1, 2, . . . , n. Finally, the probability that the measurement e on the conceptual entity A in the state p<sup>A</sup> gives the outcome x<sup>k</sup> is given by µ<sup>k</sup> (A) = hA|M<sup>k</sup> |Ai = hA|e<sup>k</sup> ihe<sup>k</sup> |Ai = |he<sup>k</sup> <sup>|</sup>Ai|<sup>2</sup> .

What about the interpretation of the Hilbert space formalism above? Two major points should now be reminded, namely:


This means that, as we mentioned already, the state p<sup>A</sup> of the conceptual entity A is represented in the Hilbert space formalism by the unit vector |Ai, the possible outcomes x<sup>k</sup> of the experiment by the base vectors |e<sup>k</sup> i, and the action of a participant (or the overall action of the ensemble of participants) as the state transformation |Ai → |e<sup>k</sup> i induced by the orthogonal projection operator M<sup>k</sup> = |e<sup>k</sup> ihe<sup>k</sup> |, if the outcome x<sup>k</sup> is obtained, so that the probability of occurrence of x<sup>k</sup> can also be written as µ<sup>k</sup> (A) = µ(|e<sup>k</sup> i, e, |Ai), where e is the measurement context associated with the spectral family {M1, M2, . . . , Mn}.

It follows from (i) and (ii) that a state, hence a unit vector in the Hilbert space representation of states, does not describe the subjective beliefs of a person, or collection of persons, about a conceptual entity. Such subjective beliefs are rather incorporated in the cognitive interaction between the cognitive situation and the human participants deciding on that cognitive situation. In this respect, our operational quantum approach to human cognition is also a realistic one, and thus it departs from other approaches that apply the mathematical formalism of quantum theory to model cognitive processes [12, 14, 17, 18, 31, 32]. Of course, one could say that the difference between interpreting the quantum state as a "state of belief " of a participant in the experiment, or as a "state of a conceptual entity," i.e., a "state of the situation which the participant is confronted with during an experiment," is only a question of philosophical interpretation, but comes to the same when it concerns the methodological development of the approach. Although this is definitely partly true, we do not fully agree with it. Interpretation and methodology are never completely separated. A certain interpretation, hence giving rise to a specific view on the matter, will give rise to other ideas of how to further develop the approach, how to elaborate the method, etc., than another interpretation, with another view, will do. We believe that an operational-realistic approach, being balanced between attention for idealist as well as realist philosophical interpretations, carries in this sense a particular strength, precisely due to this balance. A good example of this is how we were inspired to use the superposition principle of quantum theory in our modeling of concepts as conceptual entities. We represented the combination of two concepts by a state that is the linear superposition of the states describing the component concepts. This way of representing combined conceptual entities captures the nature of emergence, exactly like in physics. It would not be obvious to put forward this description when state of beliefs are the focus of what can be predicted.

We stress a third point that is important, in our opinion. For most situations, we interpret the effect of the cognitive context on a conceptual entity in a decision-making process as an "actualization of pure potentiality." Like in quantum physics, the (measurement) context does not reveal pre-existing properties of the entity but, rather, it makes actual properties that were only potential in the initial state of the entity (unless the initial state is already an eigenstate of the measurement in question, like in physics) [4, 8, 9].

It follows from the previous discussion that our research investigates the validity of quantum theory as a general, unitary and coherent theory for human cognition. Our quantum theoretical models, elaborated for specific cognitive situations and data, derive from quantum theory as a consequence of the assumptions about this general validity. As such, these models are subject to the technical and epistemological constraints of quantum theory. In other terms, our quantum modeling rests on a "theory based approach," and should be distinguished from an "ad hoc modeling based approach," only devised to fit data. In this respect, one should be suspicious of models in which free parameters are added on an "ad hoc" basis to fit the data more closely in specific experimental situations. In our opinion, the fact that our "theory derived model" reproduces different sets of experimental data constitutes in itself a convincing argument to support its advantage over traditional modeling approaches and to extend its use to more complex cognitive situations (in that respect, see also our final remarks in Section 6).

We present in Section 3 the results obtained in our quantum theoretical approach in the light of the epistemological perspective of this section.

#### 3. ON THE MODELING EFFECTIVENESS OF HILBERT SPACE

The quantum approach to cognition described in Section 2 produced concrete models in Hilbert space, which faithfully matched different sets of experimental data collected to reveal "decision-making errors" and "probability judgment errors." This allowed us to identify genuine quantum structures in the cognitive realm. We present a reconstruction of the attained results in the following.

The first set of results concerns knowledge representation and conceptual categorization and combination. James Hampton collected data on how people rate membership of items with respect to pairs of concepts and their combinations, conjunction [51], disjunction [52], and negation [53]. By using the data in Hampton [52], we reconstructed the typicality estimations of 24 items with respect to the concepts Fruits and Vegetables and their disjunction Fruits or Vegetables. We showed that the concepts Fruits and Vegetables interfere when they combine to form Fruits or Vegetables, and the state of the latter can be represented by the linear superposition of the states of the former. This behavior is analogous to that of quantum particles interfering in the doubleslit experiment when both slits are open. The data are faithfully represented in a 25-dimensional Hilbert space over complex numbers [15, 16].

In the data collected on the membership estimations of items with respect to pairs (A, B) of concepts and their conjunction "A and B" and disjunction "A or B," Hampton found systematic violations of the rules of classical (fuzzy set) logic and probability theory. For example, the membership weight of the item Mint with respect to the conjunction Food and Plant is higher than the membership weight of Mint with respect to both Food and Plant ("overextension"). Similarly, the membership weight of the item Ashtray with respect to the disjunction Home Furnishing or Furniture is lower than the membership weight of Ashtray with respect to both Home Furnishing and Furniture ("underextension"). We showed that overextension and underextension are natural expressions of "conceptual emergence" [10, 16]. Namely, whenever a person estimates the membership of an item x with respect to the pair (A, B) of concepts and their combination C(A, B), two processes act in the person's mind. The first process is guided by "emergence," that is, the person estimates the membership of x with respect to the new emergent concept C(A, B). The second process is guided by "logic," that is, the person separately estimates the membership of x with respect to A and B and applies a probabilistic logical calculus to estimate the membership of x with respect to C(A, B) [54]. More important, the new concept C(A, B) emerges from the concepts A and B, exactly as the linear superposition of two quantum states emerges from the component states. A twosector Fock space faithfully models Hampton's data, and was later successfully applied to the modeling of more complex situations involving concept combinations (see e.g., [54, 55]).

It is interesting to note that the size of deviation of classical probabilistic rules due to overextension and underextension generally depends on the item x and the specific combination C(A, B) of the concepts A and B that are investigated. However, we recently performed a more general experiment in which we asked the participants to rank the membership of items with respect to the concepts A, B, their negations "not A," "not B," and the conjunctions "A and B," "A and not B," "not A and B," and "not A and not B." We surprisingly found that the size of deviation from classicality in this experiment does not depend on either the item or the pair of concepts or the specific combination, but shows to be a numerical constant. Even more surprisingly, our two-sector Fock space model correctly predicts the value of this constant, capturing in this way a deep non-classical mechanism connected in a fundamental way with the mechanism of conceptual formation itself rather than only specifically with the mechanism of conceptual combination [42, 43].

Different concepts entangle when they combine, where "entanglement" is meant in the standard quantum sense. We proved this feature of concepts in two experiments. In the first experiment, we asked the participants to choose the best example for the conceptual combination The Animal Acts in a list of four examples, e.g., The Horse Growls, The Bear Whinnies, The Horse Whinnies, and The Bear Growls. By suitably combining exemplars of Animal and exemplars of Acts, we performed four joint measurements on the combination The Animal Acts. The expectation values violated the "Clauser-Horne-Shimony-Holt" version of Bell inequalities [56, 57]. The violation was such that, not only the state of The Animal Acts was entangled, but also the four joint measurements were entangled, in the sense that they could not be represented in the Hilbert space C 4 as the (tensor) product of a measurement performed on the concept Animal and a measurement performed on the concept Acts [44]. In the second experiment, performed on the conceptual combination Two Different Wind Directions, we confirmed the presence of quantum entanglement, but we were also able to prove that the empirical violation of the marginal law in this type of experiments is due to a bias of the participants in picking wind directions. If this bias is removed, which is what we did in an ensuing experiment on Two Different Space Directions, one can show that people pick amongst different space directions exactly as coincidence spin measurement apparatuses pick amongst different spin directions of a compound system in the singlet spin state. In other words, entanglement in concepts can be proved from only the statistics of the correlations of joint measurements on combined concepts, exactly as in quantum physics [45].

Since concepts exhibit genuine quantum features when they combine pairwise, it is reasonable to expect that these features should be reflected in the statistical behavior of the combination of several identical concepts. Indeed, we detected quantum-type indistinguishability in an experiment on the combination of identical concepts, such as the combination 11 Animals. More specifically, we found significant evidence of deviation from the predictions of classical statistical theories, i.e., "Maxwell-Boltzmann distribution." This deviation has clear analogies with the deviation of quantum mechanical from classical mechanical statistics, due to indistinguishability of microscopic quantum particles, that is, we found convincing evidence of the presence of "Bose-Einstein distribution." In the experiment, indeed, people do not seem to distinguish two identical concepts in the combination of N identical concepts, which is more evident in more abstract than in more concrete concepts, as expected [46].

The second set of results concern "decision-making errors under uncertainty." In the "disjunction effect" people prefer action x over action y if they know that an event A occurs, and also if they know that A does not occur, but they prefer y over x if they do not know whether A occurs or not. The disjunction effect violates a fundamental principle of rational decision theory, Savage's "Sure-Thing principle" and, more generally, the total probability rule of classical probability [58]. This preference of sure over unsure choices violating the Sure-Thing principle was experimentally detected in the "two-stage gamble" and in the "Hawaii problem" [59]. In the experiment on a gamble that can be played twice, the majority of participants prefer to bet again when they know they won in the first gamble, and also when they know they lost in the first gamble, but they generally prefer not to play when they do not know whether they won or lost. In the Hawaii problem, most students decide to buy the vacation package when they know they passed the exam, and also when they know they did not pass the exam, but they generally decide not to buy the vacation package when they do not know whether they passed or not passed the exam. We recently showed that, in both experimental situations, this "uncertainty aversion" can be explained as an effect of underextension of the conceptual entities A and "not A" with respect to the conceptual disjunction "A or not A," where the latter describes the situation of not knowing which event, A or "not A," will occur. The concepts A and "not A" interfere in the disjunction "A or not A," which determines its underextension. A Hilbert space model in C 3 allowed us to reproduce the data in both experiments on the disjunction effect [55].

Ellsberg's thought experiments, much before the disjunction effect, revealed that the Sure-Thing principle is violated in concrete decision-making under uncertainty, as people generally prefer known over unknown probabilities, instead of maximizing their expected utilities. In the famous "Ellsberg three-color example," an urn contains 30 red balls and 60 balls that are either yellow or black, in unknown proportion. One ball will be drawn at random from the urn. The participant is firstly asked to choose between betting on "red" and betting on "black." Then, the same participant is asked to choose between betting on "red or yellow" and betting on "black or yellow." In each case, the "right" choice will be awarded with \$100. As the events "betting on red" and "betting on black or yellow" are associated with known probabilities, while their counterparts are not, the participants will prefer betting on the former than betting on the latter, thus revealing what Ellsberg called "ambiguity aversion," and violating the Sure-Thing principle [60]. This pattern of choice has been confirmed by several experiments in the last 30 years [61]. Recently, Machina identified in a couple of thought experiments, the "50/51 example" and the "reflection example," a similar mechanism guiding human preferences in specific ambiguous situations, namely, "information symmetry" [62, 63], which was experimentally confirmed in L'Haridon and Placido [64]. In our quantum theoretical approach, ambiguity aversion and information symmetry are two possible cognitive contexts influencing human preferences in uncertainty situations and changing the states of the "Ellsberg and Machina conceptual entities," respectively. Hence, an ambiguity aversion context will change the state of the Ellsberg conceptual entity in such a way that "betting on red" and "betting on black or yellow" are finally preferred. In other terms, the novel element of this approach is that the initial state of the conceptual entity, in its Hilbert space representation, can also change because of the pondering of the participants in relation to certain choices, before being collapsed into a given outcome. This opens the way to a generalization of rational decision theory with quantum, rather than classical, probabilities [65].

The results above provide a strong confirmation of the quantum theoretical approach presented in Section 2, and we expect that further evidence will be given in this direction in the years to come. In the next section we instead intend to analyze some situations where deviations from Hilbert space modeling of human cognition apparently occur. We will see in Section 5 that these deviations are however compatible with the general operational-realistic framework portrayed in Section 2.

#### 4. DEVIATING FROM HILBERT SPACE

As mentioned in Section 2, if suitable axioms are imposed on the SCoP formalism, the Hilbert space structure of quantum theory can be shown to emerge uniquely, up to isomorphisms [30]. However, we also know that certain experimental situations can violate some of these axioms. This is the case for instance when we consider entities formed by experimentally separated sub-entities, a situation that cannot be described by the standard quantum formalism [25].Similarly, one may expect that the structural shortcomings of the standard quantum formalism can also manifest in the ambit of psychological measurements, in the form of data that may not be exactly modelable (or jointly modelable) by means of the specific Hilbert space geometry and the associated Born rule. The purpose of this section is to describe two paradigmatic examples of situations of this kind: "question order effects" and "response replicability." In the following section, we then show how the quantum formalism can be naturally completed to also faithfully model these data, in a way that remains consistent with our operational-realistic approach.

Let us first remark that the mere situation of having to deal with a set of data for which we do not yet have a faithful Hilbert space model should not make one necessarily search for an alternative more general quantum-like mathematical structure as a modeling framework. Indeed, it is very well possible that the adequate Hilbert space model has not yet been found. Recently, however, a specific situation was identified and analyzed indicating that the standard quantum formalism in Hilbert space would not be able to be used to model it [32]. This situation combines two phenomena: "question order effects" and "response replicability." We start by explaining "question order effects" and how the cognitive situation in which they appear can be represented in Hilbert space.

For this we come back to the yes-no experiment of Section 2, where participants are asked: "Is Gore honest and trustworthy?" This experiment gives rise to a two-outcome measurement performed on the conceptual entity Honesty in the initial state <sup>p</sup>H, represented by the unit vector <sup>|</sup>Hi ∈ <sup>H</sup>, where <sup>H</sup> is a two-dimensional Hilbert space if we assume the measurement to be non-degenerate, or more generally a n-dimensional Hilbert space if we also admit the possibility of sub-measurements. Denoting {MG, <sup>M</sup>¯ <sup>G</sup> <sup>=</sup> <sup>1</sup> <sup>−</sup> <sup>M</sup>G} the spectral family associated with this measurement, the probability of the "yes" outcome (i.e., to answer "yes" to the question about Gore's honesty and trustworthiness) is then given by the Born rule µGy(H) = hH|MG|Hi, and of course µGn(H) = hH|M¯ <sup>G</sup>|Hi = 1−µGy(H) is

the probability for the "no" outcome. We then consider a second measurement performed on the conceptual entity Honesty, but this time associated with the question: "Is Clinton honest and trustworthy?" We denote {MC, <sup>M</sup>¯ <sup>C</sup> <sup>=</sup> <sup>1</sup> <sup>−</sup> <sup>M</sup>C} the spectral family associated with this second measurement, so that the probabilities for the "yes" and "no" outcomes are again given by µCy(H) = hH|MC|Hi and µCn(H) = hH|M¯ <sup>C</sup>|Hi, respectively.

Starting from these two measurements, it is possible to conceive sequential measurements, corresponding to situations where the respondents are subject to the Gore and Clinton questions in a succession, one after the other, in different orders. Statistical data about "Clinton/Gore" sequential measurements were reported in a seminal article on question order effects [66] and further analyzed in Busemeyer and Bruza [14, 67]. More precisely, after fixing a rounding error in Wang and Busemeyer [67], we have the following sequential (or conditional) probabilities [34]:

$$
\begin{aligned}
\mu\_{\text{CylG}}(H) &= 0.4899 & \mu\_{\text{CylG}}(H) &= 0.0447 & \mu\_{\text{CylG}}(H) \\
&= 0.1767 & \mu\_{\text{CylG}}(H) &= 0.2887 & \text{(1)}
\end{aligned}
$$

$$
\begin{aligned}
\mu\_{\text{GyCr}}(H) &= 0.5625 & \mu\_{\text{GyCr}}(H) &= 0.1991 & \mu\_{\text{GnfCy}}(H) \\
&= 0.0255 & \mu\_{\text{GnCn}}(H) &= 0.2129
\end{aligned}
\tag{2}
$$

where Equation (1) corresponds to the sequence where first the Clinton and then the Gore measurements are performed, whereas Equation (2) corresponds to the reversed order sequence for the measurements. Considering that the probabilities in each of the four columns above are sensibly different, these data describe typical "question order effects."

Quantum theory is equipped with a very natural tool to model question order effects: "incompatible measurements," as expressed by the fact that two self-adjoint operators, and the associated spectral families, in general do not commute. More precisely, the Hilbert space expression for the probability that, say, we obtain the answer CyGn when we perform first the Clinton measurement and then the Gore one, is [14, 67]: µCyGn(H) = hH|MCM¯ <sup>G</sup>MC|Hi. Similarly, the probability to obtain the outcome GnCy, for the sequential measurement in reversed order, is: µGnCy(H) = hH|M¯ <sup>G</sup>MCM¯ <sup>G</sup>|Hi. Since we have the operatorial identity M¯ <sup>G</sup>MCM¯ <sup>G</sup> − MCM¯ <sup>G</sup>M<sup>C</sup> = (M<sup>G</sup> − MC)[MG, MC], the difference µGnCy(H) − µCyGn(H) will generally be non-zero if [MG, MC] 6= 0, i.e., if the spectral families associated with the two measurements do not commute. In the following we will analyze whether non-compatibility within a standard quantum approach can cope in a satisfying way with these question order effects, and show that a simple "yes" to this question is not possible. Indeed, a deep problem already comes to the surface in relation to the phenomenon of "response replicability."

Consider again the Gore/Clinton measurements: if a respondent says "yes" to the Gore question, then is asked the Clinton question, then again is asked the Gore question, the answer given to the latter is expected to be "yes," independently of the answer given to the intermediary Clinton question. This conjectured phenomenon, still necessitating a clear experimental confirmation, is called "response replicability<sup>1</sup> ." If, in addition, to question order effects also response replicability is jointly modeled in Hilbert space quantum mechanics, a contradiction can be detected, as shown in Khrennikov et al. [32]. Let us indicate what are the elements that produce this contradiction. In standard quantum mechanics only if a state is an eigenstate of the considered measurement the outcome "yes" will be certain in advance. Also, measurements that can transform an arbitrary initial state into an eigenstate are ideal measurements called of the first kind. According to response replicability, outcomes that once have been obtained for a measurement will have to become certain in advance if this same measurement is performed a second time. This means that the associated measurements should be ideal and of the first kind. For the case of the Gore/Clinton measurements, and the situation of response replicability mentioned above, this means that the Gore measurement should be ideal and of the first kind. But one can also consider the situation where first the Clinton measurement is performed, then the Gore measurement and afterwards the Clinton measurement again. A similar analysis leads then to the Clinton measurement needing to be ideal and of the first kind. This means however that after more than three measurements that alternate between Clinton and Gore, the state needs to have become an eigenstate of both measurements. As a consequence, both measurements can be shown to be represented by commuting operators. The proof of the contradiction between "response replicability" and "non-commutativity" worked out in Khrennikov et al. [32] is formal and also more general than the intuitive reasoning presented above—for example, the contradiction is also proven when measurements are represented by positive-operator valued measures instead of projection valued measures, which is what we have considered here—and hence indicates that the noncommutativity of the self-adjoint operators needed to account for the question order effects cannot be realized together with the "ideal and first kind" properties needed to account for the response replicability within a standard quantum Hilbert space setting.

Although refined experiments would be needed to reveal the possible reasons for response replicability, it is worth to put forward some intuitive ideas, as we have been developing a quantum-like but more general than Hilbert space formalism within our Brussels approach to quantum cognition [35–37], and we believe that we can cope with the above contradiction within this more general quantum-like setting in a very natural way. It seems to be a plausible hypothesis that response replicability is, at least partly, due to a multiplicity of effects, that however take place during the experiment itself, such as desire of coherence, learning, fear of being judged when changing opinion, etc. And a crucial aspect for both question order effects and response replicability appearing in the Gore/Clinton situation is that the sequential measurements need to be carried out with the same participant, who has to be tested again and again. This is different

<sup>1</sup>We stress here that such a conjecture does require that "all' psychological measurements should satisfy "response replicability." It rather claims that the latter should hold for a non-empty class of these measurements.

than the situation in quantum physics, where order effects appear for non-commuting observables also when sequential measurements are performed with different apparatuses. Hence, both question order effects and response replicability seem to be the consequence of "changes taking place in the way each subject responds probabilistically to the situation—described by the state of the conceptual entity in our approach—he or she is confronted with during a measurement." Since the structure of the probabilistic response to a specific state is fixed in quantum mechanics, being determined by the Born rule, it is clear that such a change of the probabilistic response to a given measurement, when it is repeated in a sequence of measurements, cannot be accounted for by the standard quantum formalism. And it is exactly such structure of the probabilistic response to a same measurement with respect to a given state that can be varied in the generalized quantum-like theory that we have been developing [35–37]. This is the reason that, when we became aware of the contradiction identified in Khrennikov et al. [32], we were tempted to investigate whether in our generalized quantum-like theory the contradiction would vanish, and response replicability would be jointly modelizable with question order effects. And indeed, we could obtain a positive result with respect to this issue [34], which we will now sketch in the next section.

# 5. BEYOND-QUANTUM MODELS

We presented in Section 4 two paradigmatic situations in human cognition that cannot be modeled together using the standard quantum formalism. We want now to explain how the latter can be naturally extended to also deal with these situations, still remaining in the ambit of a unitary and coherent framework for cognitive processes.

For this, we introduce a formalism where the probabilistic response with respect to a specific experimental situation, i.e., a state of the conceptual entity under consideration, can vary, and hence can be different than the one compatible with the Born rule of standard quantum theory. This formalism, called the "extended Bloch representation" of quantum mechanics [35], exploits in its most recent formulation the fact that the states of a quantum entity (described as ray-states or density matrix-states) can be uniquely mapped into a convex portion of a generalized unit Bloch sphere, in which also measurements can be represented in a natural way, by means of appropriate simplexes having the eigenstates as vertex vectors. A measurement can then be described as a process during which an abstract point particle (representing the initial state of the quantum entity) enters into contact with the measurement simplex, which then, as if it was an elastic and disintegrable hyper-membrane, can collapse to one of its vertex points (representing the outcomes states) or to a point of one of its sub-simplexes (in case the measurement would be degenerate).

We do not enter here into the details of this remarkable process, and refer the reader to the detailed descriptions in Aerts and Sassoli de Bianchi [34–37]. For our present purposes, it will be sufficient to observe that a measurement simplex, considered as an abstract membrane that can collapse as a result of some uncontrollable environmental fluctuations, can precisely model that aspect of a measurement that in the quantum jargon is called "wave function collapse." More precisely, when the abstract point particle enters into contact with the "potentiality region" represented by such membrane, it creates some "tension lines" partitioning the latter into different subregions, one for each possible outcome. The collapse of the membrane toward one of the vertex points (see **Figure 1**) then depends on which subregion disintegrates first, so that the different outcome probabilities can be expressed as the relative Lebesgue measures of these subregions (the larger a subregion, the higher the associated probability). In other terms, this membrane's mechanism, with the tension lines generated by the abstract point particle, is a mathematical representation of a sort of "weighted symmetry breaking" process. Now, thanks to the remarkable geometry of simplexes, it can be proven that if the membrane is chosen to be uniform, thus having the same probability of disintegrating in any of its points (describing the different possible measurementinteractions), the collapse probabilities are exactly given by the Born rule. In other terms, the latter can be derived, and explained, as being the result of a process of actualization of potential hidden-measurement interactions, so that the extended Bloch representation constitutes a possible solution to the measurement problem.

Thus, when the membrane is uniform, the "way of choosing" an outcome is precisely the "Born way." However, a uniform membrane is a very special situation, and it is natural to also consider membranes whose points do not all have the same probability of disintegrating, i.e., membranes whose disintegrative processes are described by non-uniform probability densities ρ, which we simply call ρ-membranes. Non-uniform ρ-membranes can produce outcome probabilities different from the standard quantum ones and give rise to probability models different from the Hilbertian one (even though the state space is a generalized Bloch sphere derived from the Hilbert space geometry<sup>2</sup> ). But this is exactly what one needs in order to account, in a unified framework, for the situation we encounter when combining the phenomena of "response replicability" and "question order effects," as previously described and analyzed in Khrennikov et al. [32].

We thus see that it is possible to naturally complete the quantum formalism to obtain a finer grained description of psychological experiments in which the probabilistic response of a measurement with respect to a state can be different to the one described by the Born rule. Additionally, our generalized quantum-like theory also explains why, despite the fact that individual measurements are possibly associated with different non-Born probabilities, the Born rule nevertheless appears to be a very good approximation to describe numerous experimental situations. This is related to the notion of "universal measurement," firstly introduced by one of us in Aerts [38] and further analyzed in Aerts and Sassoli de Bianchi [35–37, 68]. In a nutshell, a universal measurement is a measurement whose

<sup>2</sup>More general state spaces can also be considered, in what has been called the "general tension-reduction" (GTR) model [36, 37, 40].

probabilities are obtained by averaging over the probabilities of all possible quantum-like measurements sharing a same set of outcomes, in a same state space. In other terms, a universal measurement corresponds to an average over all possible nonuniform ρ-membranes, associated with a given measurement simplex. Following a strategy similar to that used in the definition of the "Wiener measure," it is then possible to show that if the state space is Hilbertian (more precisely, a convex set of states inscribed in a generalized Bloch sphere, inherited from a Hilbert space), then the probabilities of a universal measurement are precisely those predicted by the Born rule.

In Aerts and Sassoli de Bianchi [34] we could show that the joint situation of question order effects and response replicability for the data collected with respect to the Gore/Clinton measurements, and others, is modelizable within our generalized quantum theory by introducing non-Born type measurements. However, we were also able to provide a better modeling of the question order effects data as such. Indeed, using standard Born-probability quantum theory it was only possible to model approximately these data in earlier studies [67]. This is due to the existence of a general algebraic equality about sequential measurements in Hilbert space quantum theory which is the following [34, 67, 69]:

$$\begin{aligned} Q & \equiv M\_G M\_C M\_G - M\_C M\_G M\_C + \bar{M}\_G \bar{M}\_C \bar{M}\_G \\ & - \bar{M}\_C \bar{M}\_G \bar{M}\_C = 0 \end{aligned} \tag{3}$$

where {MG, <sup>M</sup>¯ <sup>G</sup> <sup>=</sup> <sup>1</sup> <sup>−</sup> <sup>M</sup>G} and {MC, <sup>M</sup>¯ <sup>C</sup> <sup>=</sup> <sup>1</sup> <sup>−</sup> <sup>M</sup>C} are the spectral families associated with the Hilbert model of the Gore and Clinton measurements introduced in Section 4. Taking the average q = hH|Q|Hi, one thus obtains, more specifically:

$$q \equiv \mu\_{\text{CyCy}}(H) - \mu\_{\text{CyGy}}(H) + \mu\_{\text{CuCu}}(H) - \mu\_{\text{CuCu}}(H) = 0. \tag{4}$$

This equality has been called the "QQ-equality," and can be used as a test for the quantumness of the probability model, but only in the sense that a quantum model, necessarily, has to obey it, although the fact that it does so is not a guarantee that the model will be Hilbertian. Inserting the experimental values Equations (1) and (2) into Equation (4), one finds q = 0.0032 6= 0. This value is small (being only 0.32% of the maximum value q can take, which is 1), which is the reason that approximate modeling can be obtained within Hilbert space quantum theory [67]. Note however that Equation(3) does not depend on the dimension of the Hilbert space considered, which means that even in higher dimensional Hilbert spaces, if degenerate measurements are considered, an exact modeling would still be impossible to obtain. We have reasons to believe that also question order effects, with the QQ-equality standing in the way of an exact modeling of the data, contain an indication for the need to turn to a more general quantum-like theory, such as the one we used to cope with the joint phenomenon of question order effects and response replicability. We present some arguments in this regard in the following of this section.

First, we note that in case one chooses a two-dimensional Hilbert space, additional equalities can be written which are strongly violated by the data this time. As an example, consider the quantity [34]:

$$q' = \mu\_{\text{CyGn}}(H)\mu\_{\text{CrCuSn}}(H) - \mu\_{\text{CrGy}}(H)\mu\_{\text{CyGy}}(H) \tag{5}$$

$$= \langle H|M\_{\text{C}}\bar{M}\_{\text{G}}M\_{\text{C}}|H\rangle\langle H|\bar{M}\_{\text{C}}\bar{M}\_{\text{G}}\bar{M}\_{\text{C}}|H\rangle$$

$$- \langle H|\bar{M}\_{\text{C}}M\_{\text{G}}\bar{M}\_{\text{C}}|H\rangle\langle H|M\_{\text{C}}M\_{\text{G}}M\_{\text{C}}|H\rangle \tag{6}$$

If the Hilbert space is two-dimensional, one can write M<sup>G</sup> = |GihG|, M¯ <sup>G</sup> = |G¯ihG¯ |, as well as M<sup>C</sup> = |CihC|, M¯ <sup>C</sup> = |C¯ihC¯|. Replacing these expressions into Equation (6) one finds, after some easy algebra, that q ′ = 0. However, inserting the experimental values Equations (1) and (2) into Equation (6), one finds q ′ = −0.073 6= 0, which not only is not zero, but also 29.2% of the maximum value that q ′ can take (which is 0.25).

Second, let us repeat our intuitive reasoning as to why measurements in the situation of response replicability carry non-Bornian probabilities. Due to the local contexts of the collection of sequential measurements, Gore, Clinton, and then Gore again, the third measurement internally changes into a non-Bornian one, and more specifically a deterministic one for the considered state, since response replicability means that for all subsequent Gore measurements the same outcome is assured. It might well be the case, although an intuitive argument would be more complex to give in this case, that also for the situation of question order effects, precisely because they only appear if a same human mind is sequentially interrogated, non-Bornian probabilities would be required. An even stronger hypothesis, which we plan to investigate in the future, is that most individual human minds, and perhaps even all, would carry in general non-Bornian probabilities, so that the success of Hilbert space quantum theory and Bornian probabilities would be mainly an effect of averaging over a sufficiently large set of different human minds, which effectively is what happens in a standard psychological experiment. If this last hypothesis is true, the violation of the Born rule for question order effects and response replicability would be quite natural, since the same human mind is needed to provoke these effects. Indeed, our analysis in Aerts and Sassoli de Bianchi [36, 37] shows that standard quantum probabilities in the modeling of human cognition can be explained by considering that in numerous experimental situations the average over the different participants will be quite close to that of a universal measurement, which as we observed is exactly given by the Born rule. In other terms, even if the probability model of an individual psychological measurement could be non-Hilbertian, it will generally admit a first order approximation, and when the states of the conceptual entity under investigation can be described by means of a Hilbert space structure, this first order approximation will precisely correspond to the quantum mechanical Born rule.

If the above considerations provide an interesting piece of explanation as to why the Born rule is generally successful also beyond the micro-physical domain, at the same time it also contains a plausible reason of why it will possibly be not successful in all experimental situations, i.e., when the average is either not large enough, or when the experiment is so conceived that it doesn't apply as such. This could be the typical situation of question order effects and response replicability, since in this case we do not consider an average over single measurements, but over sequential (conditional) measurements. And this could be an explanation of why Hilbertian symmetries like those described above can be easily violated and that it will not be possible, by means of the Born rule, to always obtain an exact fit of the data [34, 40].

Additionally, as we said, it allowed us to precisely fit the data by using the extended Bloch representation, and more specifically simple one-dimensional locally uniform membranes inscribed in a 3-dimensional Bloch sphere that can disintegrate (i.e., break) only inside a connected internal region [34]. Thanks to this modeling, we could also understand that the reason the Clinton/Gore and similar data appear to almost obey the QQequality (Equation 4) is quite different from the reason the equality is obeyed by pure quantum probabilities. Indeed, in a pure quantum model two specific contributions to the q-value (Equation 4), called the "relative indeterminism" and "relative asymmetry" contributions, are necessarily both identically zero, whereas we could show, using our extended model, that for the data (Equation 2), and similar data, these two contributions are both very different from zero, but happen to almost cancel each other, thus explaining why the q = 0 equality is almost obeyed, although the probabilities are manifestly non-Bornian [34].

#### 6. FINAL CONSIDERATIONS

In this article we explained the essence of the operationalrealistic approach to cognition developed in Brussels, which in turn originated from the foundational approach to quantum physics elaborated initially in Geneva and then in Brussels (in what has become known as the "Geneva-Brussels school"). Our emphasis was that this approach is sufficiently general, and fundamental, to provide a unitary framework that can be used to coherently describe, and realistically interpret, not only quantum theory, but also its natural extensions, like the extended Bloch model and the GTR-model. In this final section we offer some additional comments on our approach to cognition, taking into consideration the confusion that sometimes exists between "ad hoc (phenomenological) models' and "theoretical (first principle) models," as well as the critique that a Hilbertian model (and a fortiori its possible extensions) is suspicious because it allows "too many free parameters' to obtain an exact fit (and not just an approximate fit) for all the experimental data.

In that respect, it is worth emphasizing that the principal focus of our "theory of human cognition" is not to model as precisely as possible the data gathered in psychological measurements. A faithful modeling of the data is of course an essential part of it, but our aim is actually more ambitious. In putting forward our methodology, consisting in looking at instances of decisionmaking as resulting from an interaction of a decision-maker with a conceptual entity, we look first of all for a theory truly describing "the reality of the cognitive realm to which a conceptual entity belongs," and additionally also "how human minds can interact with the latter so that decision-making can occur."

In this sense, each time we have put forward a model for some specific experimental data, it has always been our preoccupation to also make sure that (i) the model was extracted following the logic that governs our theory of human cognition, and (ii) that whatever other experiments would be performed by a human mind interacting with that same cognitive-conceptual entity under consideration, also the data of these hypothetical additional experiments could have been modeled exactly in the same way. Clearly, this requirement—that "all possible experiments and data" have to be modeled in an equivalent way—poses severe constraints to our approach, and it is not a priori evident that this would always be possible. However, we are convinced that the fundamental idea underlying our methodology, namely that of looking upon a decision as an interaction of a human mind with a conceptual entity in a specific state (with such state being independent of the human minds possibly interacting with it), equips the theory of exactly those degrees of freedom that are needed to model "all possible data from all possible experiments."

As we already explained in the foregoing, in all this we have been guided by how physical theories deal with data coming from the physical domain. They indeed satisfy this criterion and are able to model all data from all possible experiments that can be executed on a given physical entity. What we have called "conceptual entity" is what in physics corresponds to the notion of "physical entity." Now, in our approach we might be classified as adhering to an idealistic philosophy, i.e., believing that the conceptual entities "really exist," and are not mere creations of our human culture. Our answer to this objection is the following: to profit of the strength of the approach it is not mandatory to take a philosophical stance in the above mentioned way, in the sense that we are not obliged to attribute more existence to what we call a conceptual entity than that attributed, for example, to "human culture" in its entirety. The importance of the approach lies in considering such a conceptual entity as independently existing from any interaction with a human mind, and describe the continuously existing interactions with human minds as processes of the "change of state of the conceptual entity," and whenever applicable also as processes of the "change of context." And again, let us emphasize that this "hiddeninteraction" methodology is inspired by its relevance to physical theories. Our working hypothesis is that in this way it will be possible to advantageously model, and better understand, all of human cognition experimental situations.

Having said this, we observe that the interpretation of the quantum formalism that is commonly used in cognitive domains is a subjectivist one, very similar to that interpretation of quantum theory known as "quantum Bayesianism," or "QBism" [70]. In a sense, this interpretation is the polar opposite of our realistic (non-subjectivistic) operational approach. Indeed, QBism originates from a strong critique [71] of the famous Einstein-Podolsky-Rosen reality criterion [72], whereas at the foundation of the Geneva-Brussels approach there is the idea of taking such criterion not only extremely seriously, but also of using it more thoroughly, as a powerful demarcating tool separating "actually existing properties" from "properties that are only available to be brought into actual existence," and therefore exist in a potential sense [73]. In other terms, a quantum state is not considered in QBism as a description of the actual properties of a physical entity, but of the beliefs of the experimenter about it. Similarly, for the majority of authors in quantum cognition, a quantum state is a description of the state of belief of a participant, and not of the actual state of the conceptual entity that interacts with the participants. In ultimate analysis, this difference of perspectives is about taking a clear position regarding the key notion of "certainty": is certainty (probability 1 assignments) just telling us something about the very firm belief of a subject, or also about some objective properties of the world (be it physical or cultural)? In the same way, are probabilities only shared personal beliefs, based on habit, or also elements of reality (considering that in principle their values can be predicted with certainty)? Although we certainly agree that it is not necessary to take a final stance on these issues to advantageously exploit the quantum mathematics in the modeling of many experimental situations, both in physics and cognition, we also think that the explicative power of a pure subjectivist view rapidly diminishes when we have to address the most remarkable properties of the physical and conceptual entities, like non-locality (nonspatiality) and the non-compositional way with which they can combine.

It is important to emphasize that the subjectivist view is also a consequence of the absence, in the usual quantum formalism, of a meaningful description of what goes on "behind the scenes" during a measurement. On the other hand, the hiddenmeasurement paradigm, as implemented in the extended Bloch representation [35], or even more generally in the GTR-model [36, 37, 40], offers a credible description of the dynamics of a measurement process, in terms of a process of actualization of potential interactions, thus explaining a possible origin of the quantum indeterminism. This certainly allows understanding the so-called "collapse of the state vector" as an objective process, either produced by a macroscopic apparatus in a physics laboratory, or by a mind-brain apparatus in a psychological laboratory. As we tried to motivate in the second part of this article, this completed version of the quantum formalism also allowed us to describe those aspects of a psychological measurements—the possible different ways participants can choose an outcome—that would be impossible to model by remaining within the narrow confines, not only of the quantum formalism, but also of a strict subjectivistic interpretation of it.

To conclude, a final remark is in order. Quantum cognition is undoubtedly a fascinating field of investigation also for physicists, as it offers the opportunity to take a new look at certain aspects of the quantum formalism and use them to possibly make discoveries also in the physical domain. We already mentioned the example of "entangled measurements," that were necessary to exactly model certain correlations. Entangled (nonseparable) measurements are usually not considered in the physics of Bell inequalities, while they are widely explored in quantum cryptography, teleportation and information. However, it is very possible that this stronger form of entanglement will prove to be useful for the interpretation of certain nonlocality tests and the explanation of "anomalies" that were identified in EPR-Bell experiments [44]. Also, for what concerns the notion of "universal measurement," which is quite natural in psychological measurements, since data are obtained from a collection of different minds, could it be that "universal averages" also happen in the physical domain? In other terms, could it be that a single measurement apparatus is actually more like "a collection of different minds" than "a single Born-like mind"? Considering that the origin of the observed deviations from the Born rule, in situations of sequential measurements, can be understood as the ineffectiveness of the averaging process in producing the Born prescription, is it possible to imagine, in the physics laboratory, similar experimental situations where these deviations would be equally observed, thus confirming that the hypothesis of "hidden measurement-interactions" would be a pertinent one also beyond the psychological domain? Whatever the verdict

### REFERENCES


will be, we certainly live in a very stimulating time for foundational research; a time where the conceptual tools that once helped us building a deeper understanding of the "microscopic layer" of our physical reality are now proving to be instrumental for understanding our human "mental layer"; but also a time where all this is also coming back to physics, not only in the form of possible new experimental findings, but also of possible new and deeper understandings [74, 75].

#### AUTHOR CONTRIBUTIONS

All authors listed, have made substantial, direct and intellectual contribution to the work, and approved it for publication.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The Handling Editor declared recent co-publications, though no other collaboration, with the reviewers (IB) and (JB) and states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Aerts, Sassoli de Bianchi and Sozzo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Information and Temporality

#### Christian Flender\*

*Faculty of Economics and Behavioral Sciences, University of Freiburg, Freiburg, Germany*

Being able to give reasons for what the world is and how it works is one of the defining characteristics of modernity. Mathematical reason and empirical observation brought science and engineering to unprecedented success. However, modernity has reached a post-state where an instrumental view of technology needs revision with reasonable arguments and evidence, i.e., without falling back to superstition and mysticism. Instrumentally, technology bears the potential to ease and to harm. Easing and harming can't be controlled like the initial development of technology is a controlled exercise for a specific, mostly easing purpose. Therefore, a revised understanding of information technology is proposed based upon mathematical concepts and intuitions as developed in quantum mechanics. Quantum mechanics offers unequaled opportunities because it raises foundational questions in a precise form. Beyond instrumentalism it enables to raise the question of essences as that what remains through time what it is. The essence of information technology is acausality. The time of acausality is temporality. Temporality is not a concept or a category. It is not epistemological. As an existential and thus more comprehensive and fundamental than a concept or a category temporality is ontological; it does not simply have ontic properties. Rather it exhibits general essences. Datability, significance, spannedness and openness are general essences of equiprimordial time (temporality).

#### Edited by:

*Emmanuel E. Haven, University of Leicester, UK*

#### Reviewed by:

*Ignazio Licata, ISEM - Institute for Scientific Methodology, Italy Paavo Pylkkanen, University of Helsinki, Finland*

#### \*Correspondence:

*Christian Flender mail@christian-flender.de*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *31 May 2016* Accepted: *22 August 2016* Published: *12 September 2016*

#### Citation:

*Flender C (2016) Information and Temporality. Front. Phys. 4:40. doi: 10.3389/fphy.2016.00040* Keywords: information, technology, temporality, acausality, quantum mechanics

# 1. INTRODUCTION

In Plato's famous allegory of the cave chained prisoners only see shadows of things projected on the wall they are forced to look at. As one of their fellows is freed from the cave, he comes to see reality and returns to inform about what he experienced. Nobody believes his report. Plato's idealism stems from the presupposition that there are pure ideas apart from humanity (the cave) which only sees instances and appearances (shadows) of perfect shapes. The truth is judged according to perceptions and conceptions matching or corresponding to a perfect idea (eidos) which may never be attainable.

With the advent of modern science in the Sixteenth and Seventeenth century correspondence started to bear fruits again. Descartes was the first who assured himself of what things really are by claiming cogito ergo sum (I think, therefore I am) [1]. He pulled Plato's ideas to his cognitive faculty and made thinking and reasoning the ultimate means for determining being of the self<sup>1</sup> . His thoughts eluded doubt and became the subject-pole (res cogitans) as opposed to objects in the external world (res extensa). Correspondence was redefined as the relation between propositions uttered by the thinking ego and properties of things out there in the external world. The truth of the res extensa depended on its matching the res cogitans. Modern dualism was born.

With the rise of commercial information technology and the Internet in the second half of the last century dualism has been a fruitful engine for business innovation and economic prosperity. The template for digital information processing is social phenomena in the analog world.

**36**

Flender Information and Temporality

Communication, coordination, cooperation and competition are metaphors employed for building information systems. Calculating machines, digital storages, and information highways facilitate and support human activities from the viewpoint of input-output relations and state transitions. Information is coded, transformed, stored, and transmitted at high speed over large distances. Symbol representation and manipulation are at the heart of computation and information exchange exercised by digital machines and human minds. Typical artifacts are algorithms and data structures. Brain-like sub-symbolic networks are trained to represent and simulate symbolic information and problem-solving abilities at a higher human-like level of reasoning. A huge amount of tools and services emerged to support social activities like manufacturing, information search, or relationship management. Such informational artifacts have become pervasive and ubiquitous and alter entrenched norms of social activities at an increasing level of speed and sophistication. Their utility and usability can be determined anthropologically. The way they are developed and used within a given cultural context partly determines their significance. With the advent of commercial online social networks in the late 1990s managing contacts was a major utility. Today, they serve millions of businesses to advertise their products and services. Essentially, informational artifacts are instrumental. Their inner-causality serves humanity as means to an end. Both human causation and inner-causal-functioning follow design principles: purposive ideas described in (in-) formal terms for the sake of computational and material instantiation and support.

Engineering as problem-solving reduces technology to a means; technology is instrumental. In contrast, science is dedicated to find out about what nature and technology is—its essence (Wesen)—including the essence of utility and problem-solving. The essence of technology is neither simply anthropological nor merely instrumental [5, 6]. The essence of something is its enduring as presence<sup>2</sup> . It is temporal and goes beyond meaning in the sense of a correspondence between an idea or formal description and its material instantiation and computational enactment. Engineering builds upon science and science makes use of technological artifacts. Science wants to know what things really are. It wants to know essences as that what remains through time what it is.

From a foundational and scientific point of view it is reasonable to question instrumental conceptions of information technology. However, nothing is gained if we play off utility against foundation. Both applied research and basic science are legitimate. In the history of science the latter was often a precursor of the former. Who expected that after the discovery of the quantum in 1900 transistors and micro-electronics would enable mobile access to global and personalized services as we find them today?

In the early Twenty-first century we stand at the brink of a fourth industrial revolution [9]. After mechanical production with power from steam and water in the late Eighteenth century, electricity and mass production a 100 years later, and production automation through information technology in the second half of last century, today, cyber-physical systems such as augmented reality appliances, Industry 4.0, autonomous cars, and the Internet of Things mark the cornerstone of a next revolution. Interpreting data truthfully is a key competence in this context. They call it the cognitive era. When it comes to explain how the cognitive and the embodied, the mechanical and the enlivened, humans and machines, actually correlate and interact with each other interdisciplinary approaches involving disciplines such as engineering and philosophy appear as firstclass candidates to clarify the very nature of what it means to make sense of the world both from an applied and foundational point of view.

However, up to this day we still lack a sound and coherent understanding of what it means to be a conscious, autonomous, freedom-loving, situated and culturally-embedded individual. There are many debates about whether a computing machine one day will be able to turn into a conscious being like a human [10]. Of course, this depends on our definition of consciousness. For a panpsychist even a dead stone or a river is somehow enlivened. Another extreme demarcates certain pathological observations of people having lost control of their autonomy. Is it possible for a human to turn into a deterministic machine totally controlled from outside? These and many other questions will increasingly pop up the more we advance and extend our industry and culture with information technology.

What is information? Many answers to this question are spatial. They refer to a location. For instance, a dialectic approach may distinguish information from matter and energy and locate it in the human mind or a storage device such as the front page of a newspaper or the magnetic tape of a hard disk drive. But even for matter and energy it is far from clear and settled if and where they are located. Think of non-local correlations in quantum physics. For two classically correlated observables usually a change of property A (e.g., acceleration) causes property B (e.g., position) to change<sup>3</sup> . The time it takes for A to have

<sup>1</sup>Descartes' ontological mind-matter dualism stems from his understanding of God as ens perfectissimum. This being is substantial, i.e., self-sustaining; it needs nothing else than itself. He took this understanding of being as substance and applied it to thinking and the world, i.e., ens creatum. There is an infinite difference between the creator and its creations; however, human beings are as self-sustaining as their creator. More precisely, the res cogitans (ego) and the res extensa (world) are substances. Ego and world are ontological in the sense of substantial. Like Kant at a later point in time he acknowledged that these self-sustaining things are not knowable how they are in-themselves. Therefore, value predicaments are necessary. However, he like the scholastics and the ancient Greeks presupposed substances as self-sustaining things nevertheless. Heidegger was the first who questioned this presupposition of being as substance and came up with a new ontology (I call it Twenty-first century ontology [2] because it will be our current millennium and century that Heidegger's ontology will be understood properly) for which the essence of being is not infinite like God or finite like ego and world. The essence of being is temporality. Temporality temporalizing itself is not selfsustaining though remaining and resting-in-itself. There is no infinite difference between temporality and human beings like Descartes' ontology presupposed. If you want God is through and with us. See §20 (The Fundaments of the Ontological Definition of the "World") in Heidegger [3, 4].

<sup>2</sup>Gumbrecht [7] uses the term presence to signify effects fusing with meaning. Noë [8] refers to presence in the context of sensor-motor activity. Both ground the intellect in the physical and socio-cultural world, the place where essences are to be found [2].

<sup>3</sup>This is not to say that A is necessary and sufficient for B to change.

an effect on B is constrained by the speed of light. A and B are spatially localized whereas a change of A exerts a force leading to a change of B. Non-local correlations between observables (e.g., spin of photons) are faster than light and thus instantaneous. A and B change at two (even far distant) places at the same time without a local force between them. In other words, A and B are at two spatially separated locations simultaneously. Their relation is acausal.

Again what is information? Some scientists claim information is matter and energy. All information about matter and energy is encoded in their respective wave function. But where is the wave function of my information about the latest stock market news? In my head or in the weekly magazine I read to gather information about the stock exchange? If information is nonlocal the relevant news about the shareholder value of a particular company may be distributed among both physical devices, my brain and the magazine, and thus it may be localized at two places simultaneously.

Besides the many problems a reductive view on information raises, it is hard to deny the experience of information being something extra-physical. It resides over, above or beyond the material. The development of information technology starts with an abstract idea—let's say a diagram of the main classes and their relationships of an object-oriented software application to be developed—and ends with the implementation of a prototype ready to run and be presented at the customer's in-house hardware infrastructure. The software and its design are essentially separated from its implementation and hardware. But what about the users, how do they relate and interact with the software and its interfaces? While using a smart phone, can we clearly separate the device from its user? Do we transfer information from our minds into the database of a software application and vice versa? Is there a correspondence between information in the mind and information stored on a physical device? Is the essence of information a correspondence between thinking and the external world (adaequatio intellectus et rei)? What is the essence of information technology: software, hardware, interface design, or usability as experienced?

In this article I'll argue that the essence of information technology is temporality. Temporality is the time of acausality. Acausality is introduced by means of the mathematical apparatus of quantum mechanics (QM) [11] and takes into account the current state of what natural science revealed to be form and matter and how humans actually come to know what form and matter is.

The paper proceeds as follows. In the next section anthropology and instrumentalism demarcate the starting point for a discussion of information technology. Causality is revealed as a unity of four causes including an anthropological dimension which philosophy has taught for centuries [5, 6]. In Section 3 quantum concepts are presented in light of causality as a precise acausal means for revealing the essence of information technology. Section 4 argues that temporality is the time of acausality and temporalizes information technology ecstatically and horizontally [3, 4]. Finally, Section 5 concludes the paper.

# 2. ANTHROPOLOGY AND INSTRUMENTALISM

We use information technology in manifold ways. Take browsers as an example. Through browsers we access web pages, fill out forms, view statistics, retrieve search results, or leave traces. Our active engagement with browsers partly determines utility and results we get out of the web. With our decisions and actions, clicks and hand movements, we cause browsers to perform a variety of tasks serving our purposes. In antiquity thinkers already knew about causation and causality for which effects were partly determined by a performer (causa efficiens). The browser is a window through which we trigger calculations and visualizations. The results we retrieve are not fully determined by this triggering. A search algorithm implemented on a server we are connected with takes our queries as input, interprets our request, and processes results according to its causalfunctioning. This causal-functioning is formally described by an engineer (causa efficiens) in terms of a counting and calculating procedure instructed to determine a ranked list of web pages most relevant to our query (causa formalis). However, a formal search procedure like the famous page-rank algorithm is not sufficient for the essence of searching the web. Its materialization and instantiation on a physical machine is required (causa materialis). Like the calculating human mind is indebted to its physical realization—body, arms, hands, fingers, pen, and paper—a search procedure is caused by its material correlate. Moreover, the instantiated and materialized algorithm follows certain rules and these rules were designed to guarantee an end (telos)—the search result—with respect to degrees of freedom (causa finalis). Also an end is a cause for which means were developed and implemented. Together, these four causes make up the essence of causality. Anthropologically, this essence encompasses the causa efficiens in terms of an engineer who designed and implemented browser and search algorithm and an end-user who formulates and puts queries in order to retrieve results. The former is the original performer who adopts the perspective of the latter. All four causes make up instrumentalism. Together with performers who trigger design, implementation and usage an instrumental and anthropological conception of information technology stands.

Philosophy has taught these four causes for centuries [5, 6]. It becomes clear that the essence of searching the web is neither a general idea or form (eidos) of a search algorithm and the data structures it operates upon formalized as means to an end. Nor is it its physical implementation and readiness to be used. Essentially, at the heart of instrumentalism causality is anthropological too with the performers (engineer and user) being an integral part of technology.

In antiquity techne was not simply a technological artifact like a browser or a search procedure. Techne was a way of revealing truth (aletheia). Revealing was more than a craft. It also meant knowing (episteme)—the working of the mind—and artistic work like poetry. Poetic work stems from poiesis and means revealing in the sense of bringing-forth or disclosing. The essence of technology is revealing as it shows itself in the world. This self-revealing encompasses but stands in sharp contrast to a correspondence theory of truth (adaequatio intellectus et rei). Correspondence starts with a proposition—a linguistic expression—which is either true or false. Truth and falsity is decided by referring to an object which either fulfills the proposition or fails to do so. For instance, the proposition Plato was a genius refers to the philosopher Plato who either was a genius or not. Plato himself would have reduced this proposition to a form or general essence—the proposition as eidos—from which truth or falsity would have emanated. He would have reduced his uniqueness and situatedness to an abstract idea. In contrast techne as revealing and poetic production brings-forth possibilities for action and affordances to act as remaining and resting-in-themselves. Possibilities and affordances of material (hyle), form (eidos) and purpose (telos) reveal themselves into unconcealment (aletheia). This revealing bears a concealing (letheia), i.e., a revealing that hides itself, for instance by means of context-annihilating propositions or ideas. The key of techne as revealing is to re-contextualize the hidden or concealed toward the essence of technology. Quantum mechanics provides the acausal means to do so.

# 3. QUANTUM MECHANICS AND INFORMATION TECHNOLOGY

Quantum mechanics is increasingly applied to areas outside of physics [11]. This has made it possible to investigate quantumlike effects in domains such as computer science, economics, and psychology. Since the discovery of the quantum in the beginning of the last century, physics has raised questions far beyond what has been traditionally conceived as physical. Determinism, reductionism and physical realism are usually concepts in philosophy. With the advent of quantum theory they became entangled not only with physics but a lot of other disciplines and even popular science. Today, with the success and economic significance of information technology, a large number of disciplines related to information exist side by side. Many of them claim to be an applied science. Institutions offering information-related research and education may wish to clarify their subject matter with respect to current scientific progress and questions related to techne.

This section gives credit to the current state of what natural science revealed to be form and matter and it takes into account how humans actually come to know what form and matter is. It shows that there are physical forms or shapes for which there is no cause in the sense of causality discussed in the previous section. Acausality (technology) reveals the essence of physical forms (information). Acausality has its own time, a primoridal time (temporality) that temporalizes itself.

What makes formalisms of quantum mechanics interesting is that they can't fully abstract from the material world. Pure mathematics is neither required to put its formal statements to empirical test<sup>4</sup> . Nor does it derive necessarily its formalisms from empirical data. Symbolic descriptions are supposed to stand on their own feet<sup>5</sup> . Their application to engineering and the natural and social sciences is of secondary importance. In quantum mechanics, however, the notion of wave function collapse or state reduction enforces empirical context. Phenomenological observation or measurement creates or constructs real states, which beforehand were indeterminate or didn't exist.

The so called quantum enigma [12]—also known as the measurement problem—is one of the outstanding mysteries in physics and the sciences as a whole. Last but not least, its explosiveness stems from the fact that causality breaks at the most fundamental level of objective, third-personal and contextfree descriptions of nature. The way a human experimenter sets up a measurement device—a decision made by human consciousness—determines whether he will find matter and energy behaving like waves spread out and extended in space or discrete particles whose real existence is determined with the actual measurement performed. It appears that human decision making is inseparably connected with the perspective taken upon one or the other experimental setting and its outcomes. Last but not least, inseparability of mind and matter is reasonable since humans have a body and sensing organs built out of atoms and forces guiding them.

#### 3.1. Inseparability and Acausality

A scientifically and philosophically informed means for revealing the essence of information technology does not simply take presuppositions about causality for granted. Therefore, a first step toward developing such a means is questioning if there are causes beyond causa efficiens,causa formalis,causa materialis, and causa finalis. In quantum mechanical systems there are inseparable states. Such states occur within combined systems composed of two or more individual systems. Inseparable states can't be reduced causally to states of individual systems. They seem to have no cause; they are acausal.

Mathematically, a combined system is described in multidimensional vector space. Vectors and linear operators in combined vector spaces represent states, properties, and measurements of systems [13]. For instance, suppose one operator represents two alternating decisions of a human (the anthropological part)—let's say observe a (Plato was a genius) or observe b (Plato was not a genius)—and another one (the instrumental part) represents outcome a (Plato was a genius) or outcome b (Plato was not a genius). These two operators interact in such a way that alternating decisions and alternating outcomes mix up, entangle, and evolve toward inseparable states. Such inseparable states of combined operators can't be factorized into the states of the individual systems they emerged from or were a part of all the way long.

$$\begin{aligned} \boldsymbol{AB\_{General}} &= \begin{pmatrix} \boldsymbol{p} & \boldsymbol{q} \\ \boldsymbol{r} & \boldsymbol{s} \end{pmatrix} \otimes \begin{pmatrix} \boldsymbol{l} & \boldsymbol{m} \\ \boldsymbol{n} & \boldsymbol{o} \end{pmatrix} \\ &= \begin{pmatrix} \boldsymbol{p}l & \boldsymbol{p}m & \boldsymbol{q}l & \boldsymbol{q}m \\ \boldsymbol{p}n & \boldsymbol{p}o & \boldsymbol{q}n & \boldsymbol{o}o \\ \boldsymbol{r}l & \boldsymbol{r}m & \boldsymbol{s}l & \boldsymbol{s}m \\ \boldsymbol{r}n & \boldsymbol{o}s & \boldsymbol{s}o \end{pmatrix} \end{aligned}$$

The 4-dimensional matrix above shows a combined operator representing decisions and decision outcomes in a general form.

<sup>4</sup>Ontologically, the empirical is first and foremost phenomenological.

<sup>5</sup>This does not deny the necessity of embodied cognitive skills. If symbolic representation and manipulation are agnostic toward syntax, i.e., physical form or shape, it is bad phenomenology.

For instance, if A is represented as an operator in 2-dimensional vector space with two decisions a = (1, 0) and b = (0, 1) and B is represented as an operator in 2-dimensional vector space with two decision outcomes a = (1, 0) and b = (0, 1), then the following 4-dimensional operator represents the state of a combined system that is separable.

$$AB\_{Separable} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

However, the combined state space of decisions A and outcome alternatives B (ABGeneral) embraces states which are not separable into the operators of the individual spaces. Take the following example:

$$AB\_{Inseparable} = \begin{pmatrix} \ \ \ \ \ \end{pmatrix} \otimes \begin{pmatrix} \ \ \ \ \ \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$

pl = 1 and so = −1 and po = 0 and sl = 0. If pl = 1, then p 6= 0. If po = 0 and p 6= 0, then o = 0. But so = −1 and therefore o 6= 0. It is no surprise that some states in AB are inseparable with regard to A and B. This is a purely structural consideration. It accounts for the fact that there are (higher) combined states and properties which are not reducible to (lower) individual states and properties. ABInseparable is causally not reducible to operators of the individual systems as it is the case for ABSeparable. Using the words of a correspondence theorist, the separation between the proposition that Plato was either a genius or not (A) and its verification or falsification by referring to Plato as a putative genius (B) is not tenable anymore.

So far, from a dynamic point of view, there is nothing said about how such an inseparable state came up in the first place. How did A and B interact over time in order to end up in ABInseparable? Perhaps ABInseparable is presupposed all the way long? In several previous contributions it was argued that inseparability is an indicator for phenomena where presuppositions are at work [2, 14–17]. We are always already situated in the world and many skills are not propositional in nature. For instance, we notice that it is raining not by formulating a proposition and verifying this proposition by observation. When we are walking on the street, the sky is cloudy, the air is wet, and our skin is sensing water drops, we understand that it is raining. Phenomenologically there is no experience of—and therefore no empirical evidence for—a matchmaking (adaequatio) between an intellectual understanding of what it is like to walk in the rain (intellectus) and the ontic fact that it is raining outside (rei). Nevertheless, in many situations we separate an idea or proposition from its referential object. This ontic perspective facilitates a separation between cause and effect. However, it is not primordial.

It turns out that ABSeparable is an ontic case of ABInseparable. The former requires an attitude to describe things, in this case decisions and decision outcomes, as existing independent of observation. Here techne is a revealing that hides itself. A leveling down or crossing over (letheia) annihilates context and reveals propositions or ideas and their referential objects stripped off their situatedness in the world. This is the positivist viewpoint in epistemology and realism in ontology both of which are ontic views in a presence culture [2]. A presence culture acknowledges that decisions are always already situated within decision situations; they are always already connected with potential outcomes determined with the actual decision made. In order to acknowledge causality and thus things in sequential time, i.e., cause and effect as separate entities, the ontic viewpoint separates or disentangles inseparable states if certain structural aspects hold. In the next subsection it will be demonstrated that these structural aspects bind together exponents of exponential functions to describe things as separate entities and things in sequential time. Here there are no absolute zero points and no derivatives with respect to time. The only derivative is temporality itself where things appear as being open

#### 3.2. Acausality in Time

to others and even to everyone else.

In a presence culture things are given as available or readyto-hand within a horizon [2]. Things are open and accessible to others and even to everyone else [15]. They don't stand in isolated opposition like the proposition about Plato stands over and against the historical person Plato, or a glass of wine stands in direct opposition to a bottle on its right hand side. A piece of paper and the symbols written on it are real in the sense of being open to be read by any other reader. While reading letters, words and sentences on the paper, however, the meaning of a text shows up as an unbroken reading experience. Therefore, it doesn't show up as independent of me reading them, but as meaningful in alignment with my background knowledge. The meaning of the text is present within a chronotope spanning across the gap or separation between me—the reader—and the symbols on the paper. I do not count time steps while enjoying a poem. Reading and grasping a poem come with their rhythm and tone, perhaps even their smell. But temporality is not structured like a causal chain of arguments where a premise is clearly antecedent to a conclusion. Time is not spatially located on a horizontal line with points indicating what was before and what will be after. Temporality embraces sequential time separated into discrete steps or continuous events. However, it will be shown that this is a special case.

A continuous time line can be read from the exponential function<sup>6</sup> . It describes growth and decay in space without absolute zero points. Its derivative is the function itself. There is no absolute beginning and no absolute end. The exponential function is transcendental in the sense of inexhaustible (Euler's number is an inexhaustible number). There is no absolute benchmark for discrete time steps and therefore there is no absolute causal relation where an antecedent event causes a subordinate event. Time is acausal, or better, the time of

<sup>6</sup>From a temporal point of view, the dynamics of combined quantum systems are prescribed by evolution equations, which, in their general form, consist of exponentiations [13].

acausality is temporality. Temporality temporalizes or befalls itself (cf. Section 4).

Acausality spans or broadens the present. Presence is broadening [7]. In a presence culture the future is increasingly inaccessible and the past increasingly difficult to let behind. In a meaning culture<sup>7</sup> going far back in time is equally transcendental and inaccessible as future predictions. The very far past and the far future remain highly speculative, difficult to reproduce, and impossible to anticipate. Therefore, they are not definite or determinate. One way to cope with this uncertainty is to admit that the past and the future are simply inexhaustible or infinite<sup>8</sup> . Transcendentalism embraces uncertainty and openness to interpretation. It eludes certainty. Causality gives certainty. However, it is a special case; an ontic viewpoint that establishes a clear antecedent event and a clear subordinate event, i.e., sequential time. It modifies primordial time (temporality) when exponents P, S, Q, and R relate to each other in such a way that inseparable states of combined spaces evolve toward separability as the result of a forgetting, leveling down, or crossing over (letheia).

$$\begin{aligned} \boldsymbol{AB}\_{\text{Separable}}^{\text{InTime}} &= \begin{pmatrix} \boldsymbol{e}^{\boldsymbol{P}t} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{e}^{\boldsymbol{Q}t} \end{pmatrix} \otimes \begin{pmatrix} \boldsymbol{e}^{\boldsymbol{R}t} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{e}^{\boldsymbol{S}t} \end{pmatrix} \\ &= \begin{pmatrix} \boldsymbol{e}^{(\boldsymbol{P}+\boldsymbol{R})t} & \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{e}^{(\boldsymbol{P}+\boldsymbol{S})t} & \boldsymbol{0} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{e}^{(\boldsymbol{Q}+\boldsymbol{R})t} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{e}^{(\boldsymbol{Q}+\boldsymbol{S})t} \end{pmatrix} \end{aligned}$$

ABInTime is separable if <sup>P</sup> <sup>+</sup> <sup>S</sup> <sup>=</sup> <sup>Q</sup> <sup>+</sup> <sup>R</sup> at each instant of time. <sup>P</sup>, S, Q, and R are linear operators and can be thought of as a matrix counterpart of a real number. That P + S = Q + R must not hold in general and is rather a special ontic case.

$$\begin{aligned} \boldsymbol{A} \boldsymbol{B}\_{\textit{Inseparable}}^{\textit{InTime}} &= \left( \boldsymbol{?} \right) \otimes \left( \boldsymbol{?} \right) \\ &= \begin{pmatrix} \boldsymbol{\varepsilon^{(P+R)t}} & \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{\varepsilon^{(P+S)t}} & \boldsymbol{0} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{\varepsilon^{(Q+R)t}} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{\varepsilon^{(Q+S)t}} \end{pmatrix} \end{aligned}$$

ABInTime is inseparable if <sup>P</sup> <sup>+</sup> <sup>S</sup> 6= <sup>Q</sup> <sup>+</sup> <sup>R</sup> at each instant of time. ABInTime Inseparable is primordial. Up to this date, ontological emergence of mental causation from material causal laws has been witnessed nowhere [18]. There are no ontological causes leading from ABInTime Separable to ABInTime Inseparable. Therefore, it is reasonable to assume that the former is a special ontic case of the latter. If structural aspects hold together exponents distributed among both individual spaces at each instant of time, a leveling down or crossing over (letheia) of primordial time (temporality) separates A and B and provides the condition of the possibility for experiencing vulgar time<sup>9</sup> as a succession of present moments (cf. Section 4). However, in primordial time subjective decision (A) and objective outcome (B) are always already combined. Instead of being an aggregate or a unity over time, A and B are combined acausally and thus equiprimordially. The dynamic viewpoint of A and B provides a higher degree of inseparability and therefore a stronger evidence for acausality as each component of A refers to a component of B at each single moment simultaneously and thus equiprimordially [13].

At this point, an acausal means for revealing the essence of information technology stands. ABInTime Inseparable adds to the essence of causality as presented in Section 2. Causa formalis and causa materialis are too sides of the same coin. The formal representation of an inseparable evolution is not supreme. Quantum concepts can't fully abstract from causa materialis. Vice versa, body, matter and syntax alone are not sufficient for causing form or even the essence of information technology. Performer is the information engineer. He or she is the anthropological component and causa efficiens as part of an acausal means for revealing the essence of information technology. Eventually, causa finalis is the essence itself. It brings itself into unconcealment by resting and remaining-initself. This telos is not simply an end but an end-in-itself. It is neither subjective (a preference, desire, or value) nor objective (a common good or value) in the sense of opposed to a subject. It comes into being out of temporality. Temporality will be discussed in the next section in more detail. So far it was introduced as the time of acausality.

Acausality is associated with synchronicity, a term introduced by Jung and Pauli who searched for correlated events with no causal link [19]. In physics such events are known under the label of entanglement and activation at a distance. In the life and psychological sciences, there are phenomena like social mirroring or contagious yawning offering an acausal interpretation [15]. They can't be proved or disproved by means of statistical methods. Statistics may kill acausal events. Synchronistic events or acausal means require a non-willing or releasement [2], whereas correspondence tests enforce separability, a leveling down or crossing over (letheia) of primordial time (temporality).

The next section introduces temporality as the time of acausality and the essence of information technology. Primordial time is neither a subjective stream of present moments in the observer's mind nor is it an objective though relative flow of events in the external world. Temporality temporalizes ecstatically and horizontally [3, 4]. Temporality is not a concept or a category. It is not epistemological. As an existential and thus more comprehensive and fundamental than a concept or category temporality is ontological; it does not simply have ontic properties. Rather it exhibits general essences. Datability, significance, spannedness, and openness are general essences of equiprimordial time (temporality).

#### 4. TEMPORALITY AND INFORMATION

Science has a natural inclination to strive from epistemology to ontology. It does not only want to know how we as scientists, consumers, citizens, etc. come to know; it wants to know how things really are. A statement as simple as "it is gold" is

<sup>7</sup> In a meaning culture the meaning of concepts (e.g., privacy) stands for or represents something (e.g., a right). Meaning is attributed, predicaments are made. In contrast a presence culture takes linguistic expressions as a medium that overcomes the separation between subject and object, mind and matter, physics and metaphysics.

<sup>8</sup>Primordial time (temporality) is finite and the boundary, end or frontier of this finiteness is authentic future or indeterminacy (cf. Section 4.1).

<sup>9</sup>The term "vulgar" is not meant to be a value judgment.

ontological. Being is at stake. Epistemology is concerned with the ways we come to know that "it is gold," e.g., by way of understanding how sensory stimulation from golden surface material changes as a function of movement. Cognition is at stake.

In the introduction (cf. Section 1) information technology was introduced in terms of causality. In the previous section it was argued that besides traditional causes (instrumentalism and anthropology) acausality adds to the essence of information technology. Our implicit understanding of information is often purely instrumental. We are less interested in what information thematically is—its essence—and more concerned about its utility. Information is for processing, education, entertainment, notification, reporting, etc. Semiotics agrees with such an instrumental view. There is a pragmatic aspect to information besides syntax and semantics. Syntax is simply the physical form of information. Think of the linear operators in Section 3.1 with components 1 and 0. On a semantic level operators and their components have a meaning. Operators represent two alternating decisions or observations: Plato was a genius (1, 0) or Plato was not a genius (0, 1) and two alternating outcomes: Plato was a genius (1, 0) or Plato was not a genius (0, 1). This information turns pragmatic once it is used to explain acausality.

However, there is more to information than its meaningfulness and usefulness. Meaning attributions are arbitrary. Conventionalized meaning, however, often conceals arbitrariness. Attributing a trait of Plato's intellect to 1 and 0 is arbitrary. Certainly genius is not reducible to bits. Acausality reveals this non-reducible character of traits and information in general. In many situations of circumspect taking care and skillful coping we do not attribute meaning and usefulness to physical forms. Rather meaning and usefulness are made present [2]. Information encountered shows itself as what it is in a meaningful and pragmatic way. Semantics and pragmatics are not something extra to syntax. They are to be found and made present within the physical form itself. Unless a conspicuous encounter with information makes me wonder what it really means—for instance, I may find Chinese letters underneath a painting without the slightest understanding of Chinese language—I do not start grappling with meaning and pragmatics in an explicit and thematic way.

In summary, the essence of information technology is far more than a (causal) processing of information on different layers of abstraction (syntax, semantics, and pragmatics). Information is temporal. Temporality is the time of acausality. This time is not chronological. Causality requires chronological time. Cause and effect are separate entities in time. Effect comes after cause and, vice versa, cause is prior to effect. Chronological time and causality derive from temporality. They are released by an awaiting that retains.

### 4.1. Making Present, Awaiting, and Retaining

What makes QM particularly apt for modeling and understanding decision making and other cognitive phenomena is indeterminacy10. In contrast to classical uncertainty, indeterminacy in QM does not presuppose a particle—its properties like angular momentum or position—to be predetermined though not yet known. Like in the Plato example in Section 3.1 states and properties represented as vectors or linear operators may be superposed. Before a decision is made about Plato's intellect two even mutually exclusive options constitute one state of potentiality11. In quantum physics properties like position may be superposed and have contradicting values or values violating the law of total probability. A wave function is distributed or spread out though the particle it represents can only exist at one discrete position. The probability that a particle exists at one particular position A and not at another position B may not add up to the total probability of 1. It looks like that in a wave scenario a particle can be at several positions simultaneously. Unless a measurement determines position with the measurement made it is undetermined. Grasping this outmost uncertainty is at the heart of temporality.

Temporality temporalizes out of an authentic future. Authentic future is not something outstanding. It is not something missing or lacking. There are no information deficits in primordial time. For instance, if I want to buy a new car and my savings already cover 3/4 of the full price, then 1/4 is still outstanding and expected to add to my savings within the coming months. My expectation of the remaining amount of money to be saved is always already in foresight of the full price for the car to be saved. Future savings are outstanding. Indeterminacy and authentic future, however, are not outstanding because uncertainty (position of a particle, Plato's genius, etc.) is not epistemological (due to a lack of knowledge) but ontological. In the car savings case uncertainty is epistemological. I do not yet know if the remaining amount will add up to my savings within the coming months. However, I do know what the remaining amount is: 1/4 of the full price. The full price is pre-determined and my uncertainty is relative to it.

Authentic future is not chronological. But future chronologically conceived is founded in indeterminacy or authentic future. Savings of 3/4 of the full price of a car is prior to savings of the full price I will or will not have in my account in the future. For indeterminacy or authentic future there is no full price. It is not the case that a full price is not known. It doesn't exist. Indeterminacy is an end or a future that is not outstanding. It is always already given though most of time hidden, concealed, or forgotten12. Being-toward-indeterminacy is presupposed but leveled down or crossed over when time is experienced as a succession of present moments. Such a flow of events or stream of experience finds its formal expression in separable entities or ABInTime Separable (cf. Section 3.1), a requirement for chronological

<sup>10</sup>cf. Flender and Müller [16] for an application of QM to privacy decision making. <sup>11</sup>If Plato really was a genius or not, is, of course, a matter of debate. It is not predetermined. Therefore, such historical examples lend themselves for illustrations of effects as found in QM.

<sup>12</sup>For Heidegger this outmost uncertainty is death or being-toward-the-end [3, 4]. He acknowledges that a common understanding of death is demise. I prefer not to use the term death as a synonym for indeterminacy. The reason is that the common or vulgar connotation of death as demise is most difficult to shake off, a requirement for its transformation into authentic future.

being. However, ABInTime Inseparable is primordial. Temporalized components refer to each other instantaneously, simultaneously, or equiprimordially. Primordial time (temporality) is finite and the boundary, end or frontier of this finiteness is indeterminacy. Having-been, presence and authentic future are equiprimordial in temporality. The present is released in an awaiting that retains. Once P, Q, R, S relate to each other (P + S = Q + R) equiprimordiality is modified in such a way that a succession of present moments arises. The immediate future is constantly anticipated and the immediate past is constantly slipping away. A condition of the possibility for transcendentalism, inexhaustibility or infinity is that the equiprimordial awaiting that retains is annihilated or de-contextualized. The making present of "now and now and now" is predominant; the awaiting that retains fades into the background. A constant making present is released without conceiving its origin in an awaiting (authentic future) that retains (having-been). A stream of present moments conceals horizontal ecstasies (awaiting, retaining, and making present) of temporality. This concealment (letheia) constitutes the modus operandi of everydayness.

As scientists, managers, consumers, citizens, etc. we are always already in-the-world. This "always already" refers to presuppositions which are not necessarily resolute, grasped or conceived but leveled down or crossed over due to one's being within a common factual world. In everyday taking care—our business as usual as scientists, managers, consumers, citizens, etc.—we observe, manage, consume and participate as "one" does it. The time of the "one" is a making present that forgets. It forgets an awaiting that retains as the condition of the possibility for its release. A good example is taking care of time itself as one coordinates one's behavior with other people.

Suppose you have booked a one week meditation retreat together with a friend. In the evening of the first day of your stay you make an appointment for the next day. You agree with your friend on having a first meditative exercise at sunrise. Both of you and possibly most of the population on earth know what a sunrise is. In our shared and common world the sun as a natural clock is always already discovered. Before sundials as well as mechanized, electrified and digitized clockwork were invented, the sun was a thing encountered at hand ready to be used. In circumventive taking care it was used as a natural pointer to sunrise, noon and sunset according to which everyday activities were coordinated. The next morning you and your friend wake up at sunrise. Both of you look into the sky and you see the sun at the horizon. "Now it is time to have a meditative exercise" is what both of you understand and share publicly in measuring time with the oldest clock on earth, the sun. Usually and most of the time we take care of things and time itself as "one" does it. Implicitly and unthematically we understand what time it is and what we have to do. Although primordial time is leveled down or crossed over we understand temporality temporalizing itself ecstatically and horizontally. With every "now, that it is time to have a meditative exercise" (sunrise), an "on that former occasion" (earlier when the sun rose, yesterday, the days before, etc.) and a "then, when the sun will have reached its peak or will set" (later on at noon or sunset) are presupposed and equiprimordially understood though not explicitly articulated.

Saying "now it is time to have a meditative exercise" is a discoursing articulation of a making present that temporalizes itself in unity with an awaiting that retains [3, 4]. In measuring time, the sun gets made public in such a way that it is encountered for you, your friend and perhaps other practitioners joining you as "now" and not later, earlier, tomorrow, or yesterday. Time is a stream of present moments. Chronology, whether discrete or continuous, requires a sense of what was before and what will be after. The "now it is time to have a meditative exercise" is a present moment within a flow of time, an inner duration or a continuous time experience whereby the equipromordial making present of an awaiting that retains is leveled down, crossed over, or forgotten. This vulgar understanding of time levels down or forgets the having-been and the awaiting and just reveals at sunrise "now it is time to have a meditative exercise."

#### 4.2. Essences of Temporalized Information

Temporality is not a concept or a category. It is not ontic. As an existential and thus more comprehensive and fundamental than a concept or category temporality is ontological; it does not simply have ontic properties. Rather it exhibits general essences<sup>13</sup> .

#### 4.2.1. Datability

Datability is a general essence of equiprimordial time (temporality). In taking care of time itself (time measurement) every making present or saying "now" is accompanied by a "then, when" and "on that former occasion, when." Every ontic statement like "it is gold" implies a "now, it is gold" and, equiprimordially, an awaiting ("then, it will still be gold") and a retaining ("on that former occasion, it already was gold"). An athlete who is always already in the flow of what he is doing (e.g., the running, jumping, or dribbling of a basketball player) is making present by awaitingly retaining. In taking care of the game he is within time as a succession of an immediate past (the not-anymore), the present now, and an immediate future (the not-yet). However, this flow is derivative or vulgar if datability is hidden, concealed, crossed over, or leveled down. "If circumspect taking care were simply a succession of experiences occurring in time, and even if these experiences were associated with each other as intimately as possible, letting a conspicuous, unusable tool be encountered would be ontologically impossible" [3, 4].

#### 4.2.2. Significance

Time is likewise derivative or vulgar if significance is nullified. In average everydayness, if I wake up in the morning and have an appointment at sunrise, I do not ponder or reason why I have this appointment, what it is good for, or for the sake of which desire or preference I made it. I just have it. Like datability significance is crossed over or leveled down in circumventive taking care of a situation. In primordial time, however, temporality temporalizes "in-order-to" take care of a situation. Its significance tells that it is time for what shows itself or is given, which may either be appropriate or inappropriate. For instance, it is appropriate to catch up with my friend for having our first meditative exercise and it is inappropriate to go back to bed and have a couple of hours extra sleep. For a basketball player it is appropriate to

<sup>13</sup>An apt German word for general essences is "Wesensmomente".

take a three-point shot when it's time for taking the lead, or, it's inappropriate, when there is a bad defense in the zone depending on the situation and his circumspect taking care of it.

#### 4.2.3. Spannedness

Temporality broadens the presence. Authentic making present is broadening<sup>14</sup> [20, 21]. Temporal ecstasies broaden the presence in the sense that they span across future, past, and present. Making present, awaiting and retaining are equiprimordial ecstasies. The present is released from an awaiting that retains. This horizontal spannedness of future, past and present has its primary moment in an anticipation of indeterminacy (authentic future). Through temporality temporalizing itself ecstatically and horizontally out of authentic future information comes into existence. The meaning or significance of information in general is temporality. To say that information is this or that is to let-it-come-toward-itself (awaiting), let-it-beas-it-already-was (retaining), and let-it-be-encountered-as-it-is (making present). To say that information "is" admits the existence of information. Existence means being-ahead-ofitself. Being-ahead-of-itself temporalizes out of authentic future. Indeterminacy shines into what it is not: information. This spanning or broadening of temporal ecstasies finds it formal expression in ABInTime Inseparable (cf. Section 3.1). Here temporalized components refer to each other instantaneously, simultaneously, or equiprimordially. Past, present and future are equiprimordial as long as P, Q, R, S do not relate to each other (P + S 6= Q + R).

#### 4.2.4. Openness

Last but not least, primordial time is open to others and even to everyone else. It is public or shared. Openness is a condition of the possibility for coordinating our behavior with others. Like things and others encountered in circumspect taking care and scientific investigation we are always already together with others no matter if they are physically present or not. Time shows itself as open or public and thus it is shared like things and others encountered in everyday taking care. Being together with things and others encountered is always already being-in-time. For instance, we share an astronomical calendar and use it for coordinating our behaviors, from the planning of our careers to weekly meetings. Perhaps in the natural sciences and historiography such shared and agreed upon conceptions of time are shaken more than in any other realm of human life. Temporality is prior to any specialized discipline and prior to any distinction. It is the condition of the possibility for information to come into being out of indeterminacy (authentic future) by temporalizing itself. It is the time of acausality, a techne for revealing the essence of information as that what remains and rests-initself.

# 5. CONCLUSION

In his 1946 foreword of Brave New World, in retrospective, Aldous Huxley speculates about how he would have had rewritten his dystopian novel 15 years earlier [22]. He reasons about a third alternative between an insane life in Utopia where genetic engineering, brainwashing and recreational use of drugs produce happy consumers who appear to be plugged into a universal happiness machine and a lunatic world of primitive people who resisted any economic and technological progress.

"Religion would be the conscious and intelligent pursuit of man's Final End, the unitive knowledge of the immanent Tao or Logos, the transcendent Godhead or Brahman. And the prevailing philosophy of life would be a kind of High Utilitarianism, in which the Greatest Happiness principle would be secondary to the Final End principle—the first question to be asked and assured in every contingency of life being: 'How will this thought or action contribute to, or interfere with, the achievement, by me and the greatest possible number of other individuals, of man's final end?"'

Huxley's dream of a society composed of freely co-operating individuals devoted to the pursuit of sanity has a Final End, a causa finalis, in mind. Today, 70 years after he wrote his foreword, science may be in a position to enter the middle way, a third alternative between naive technological enthusiasm and nostalgic, ultra-conservative, or even total rejection of progress. Perhaps it is an irony of fate that science—the prestigious and success-laden project of modernity and representative of an enlightened, reasonable and secularized world—offers reconciliation with the spiritual, enchanted, and numinous.

For a long time the presupposition of knowledge being freed from value has been responsible for scientific progress. Objective, third-personal, and context-free knowledge—the fruit of science—is rid of subjective ends, motives, desires, interests, and feelings, all of which can be subsumed as being valuable. Value-free knowledge, however, is a fallacy. There is no science without presuppositions. There is only science whose foundations so far have remained unexamined. This is not to deny that even a traditional (meta-) discipline like philosophy presupposes conditions upon which its interpretations rest. However, it makes a difference whether presuppositions are simply taken for granted or if they are well-founded by means of reasonable arguments and phenomenological evidence.

Arguments and evidence employed in this contribution draw from QM [11]. QM offers unequaled opportunities because it raises foundational questions in a precise form. So far, applying QM to phenomena and problems outside of physics has been highly successful and, last but not least, its explanatory power for concepts (i.e., existentials to be precise) as general and specific as information and temporality has been substantiated in this article. There is growing evidence that effects and laws of QM also hold for macroscopic phenomena. However, far more revolutionary is the fact that applying QM to cognition is not equivalent but the same as altering and refining the cognitive apparatus of the scientist as an acausal measuring instrument.

<sup>14</sup>Heidegger calls this making present "Moment" (Augenblick). See §65 (Temporality as the Ontological Meaning of Care) in [3, 4]. I prefer speaking of broadening because it captures the other ecstasies (awaiting and retaining) more elegantly.

Now we live in an age with unprecedented possibilities for extending our capacities and abilities to reveal. Information technology and the Internet are extensions. They challenge us and we challenge them. For a long time we thought about technological apparatus being something purely instrumental, i.e., causality as traditionally conceived (cf. Section 2). Galilei was the first who employed an apparatus—a telescope—to verify a scientific hypothesis. He wanted to observe and verify if earth is indeed orbiting the sun. Today, our apparatus still extends into the material world but acausality and temporality as developed in this article challenge us to become part of observation.

In this article I tried to be as objective and value-free as possible. The method of doing so may appear unconventional. Questioning the very nature of causes, means, ends and values the presumptive opposition of subjectivity and objectivity seems like a deconstruction. Perhaps it is a deconstruction with the supposition of a value-free connotation—a branch of consciousness studies—which bears the potential to bring science and technology forward and guide us through the cognitive era that just started.

Investigating the relationship of first-person experience and third-person facts has been at the center of consciousness studies for quite a while. Unfortunately, we are still in the dark when it comes to give causally necessary and sufficient conditions for consciousness to arise. We are able to package reasons and causal chains of arguments into narratives explaining how causality, means, ends and values may have evolved. However, explanations after the fact still lack causally necessary and sufficient conditions as desired for a full-blown materialist theory of consciousness. For instance, retinal cells and the visual cortex may be necessary for seeing shape and color. However, they will never be sufficient for explaining visual consciousness. Repeatable and reproducible observations of a particular constellation of firing cells in the

#### REFERENCES


visual cortex prior to or "simultaneous with" a visual experience of a particular object is not sufficient evidence for visual consciousness of that particular object (post hoc ergo propter hoc). Non-materialist accounts are equally problematic as nobody can convincingly deny having a body and living in a material world.

Abstaining from causal material explanations leaves at least two other options for making sense of consciousness [23]. First, including one or many intentional beings into narratives is subject of various religious traditions, in particular monotheism, e.g., the Judeo-Christian tradition, or polytheism of antiquity. Second, there are (pantheist) approaches neither claiming mechanistic laws and necessity underlying consciousness nor referring to a higher being or eternal creator as the missing link to the mystery.

A vision like Huxley's third alternative supposes a final end: a deity or end-in-itself. Such an approach abandons valuefree explanations in the traditional sense. It acknowledges individualism and relativities of preferences, ends, and values. However, relativities are not merely subjective as opposed to objective. They are objective in the sense of information and temporality as developed in this contribution.

#### AUTHOR CONTRIBUTIONS

The author confirms being the contributor of this work who wrote and conducted what is laid-out and approved it for publication.

### ACKNOWLEDGMENTS

For their support the author would like to thank all individuals and institutions who know they made an essential contribution to the author.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Flender. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Corrigendum: Information and Temporality

#### Christian Flender\*

*Faculty of Economics and Behavioral Sciences, University of Freiburg, Freiburg, Germany*

Keywords: information, technology, temporality, acausality, quantum mechanics

#### **A Corrigendum on**

#### **Information and Temporality**

by Flender, C. (2016). Front. Phys. 4:40. doi: 10.3389/fphy.2016.00040

In the original article, there was an error in the formal expression of separability and inseparability. Components of a combined evolution matrix are exponential functions and if their exponents are structured in a specific way it is possible to factorize the matrix into constituents of a Tensor product. This was expressed formally with direct reference to exponents P, Q, R, and S.

For describing separability and inseparability exponents relate to each other in combination. To be precise, an evolution equation is separable if e(P+R)<sup>t</sup> <sup>+</sup> <sup>e</sup> (Q+S)<sup>t</sup> <sup>=</sup> <sup>e</sup> (P+S)<sup>t</sup> <sup>+</sup> <sup>e</sup> (Q+R)t .

Therefore, in order to avoid confusion, correct symbolic expression, and to re-establish the integrity of the contribution, it is necessary to use different symbols, for instance W, X, Y, and Z where W = e (P+R)t , X = e (P+S)t , Y = e (Q+R)t , and Z = e (Q+ S)t .

If W + Z = X + Y the matrix is separable. If W + Z 6= X + Y it is inseparable.

A correction has been made to **3. Quantum Mechanics and Information Technology**, section **3.2 Acausality in time**, paragraph four and five:

"ABInTime is separable if W <sup>+</sup> <sup>Z</sup> <sup>=</sup> <sup>X</sup> <sup>+</sup> Y at each instant of time where W <sup>=</sup> <sup>e</sup> (P+R)t , X = e (P+S)t , Y = e (Q+R)t , and Z = e (Q+S)t . That W + Z = X + Y must not hold in general and is rather a special ontic case."

"ABInTime is inseparable if W <sup>+</sup> <sup>Z</sup> 6= <sup>X</sup> <sup>+</sup> Y at each instant of time where W <sup>=</sup> <sup>e</sup> (P+R)t , X = e (P+S)t , Y = e (Q+R)t , and Z = e (Q+S)t . ABIn Time Inseparable is primordial. Up to this date, ontological emergence of mental causation from material causal laws has been witnessed nowhere [18]. There are no ontological causes leading from ABInTime Separable to ABIn Time Inseparable. Therefore, it is reasonable to assume that the former is a special ontic case of the latter. If structural aspects hold together exponents distributed among both individual spaces at each instant of time, a leveling down or crossing over (letheia) of primordial time (temporality) separates A and B and provides the condition of the possibility for experiencing vulgar time<sup>9</sup> as a succession of present moments (cf. Section 4). However, in primordial time subjective decision (A) and objective outcome (B) are always already combined. Instead of being an aggregate or a unity over time, A and B are combined acausally and thus equiprimordially. The dynamic viewpoint of A and B provides a higher degree of inseparability and therefore a stronger evidence for acausality as each component of A refers to a component of B at each single moment simultaneously and thus equiprimordially [13]."

A correction has been made to **4. Temporality and Information**, section **4.1 Making Present, Awaiting, and Retaining**, paragraph three:

"Authentic future is not chronological. But future chronologically conceived is founded in indeterminacy or authentic future. Savings of 3/4 of the full price of a car is prior to savings of the full price I will or will not have in my account in the future. For indeterminacy or authentic future there is no full price. It is not the case that a full price is not known.

#### Edited by:

*Emmanuel E. Haven, University of Leicester, United Kingdom*

#### Reviewed by:

*Frontiers Editorial Office, Frontiers Media SA, Switzerland Sandro Sozzo, University of Leicester, United Kingdom*

#### \*Correspondence:

*Christian Flender mail@christian-flender.de*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *26 May 2019* Accepted: *26 July 2019* Published: *14 August 2019*

#### Citation:

*Flender C (2019) Corrigendum: Information and Temporality. Front. Phys. 7:113. doi: 10.3389/fphy.2019.00113*

**47**

It doesn't exist. Indeterminacy is an end or a future that is not outstanding. It is always already given though most of time hidden, concealed, or forgotten12. Being-toward-indeterminacy is presupposed but leveled down or crossed over when time is experienced as a succession of present moments. Such a flow of events or stream of experience finds its formal expression in separable entities or ABIn Time Separable (cf. Section 3.1), a requirement for chronological being. However, ABIn Time Inseparable is primordial. Temporalized components refer to each other instantaneously, simultaneously, or equiprimordially. Primordial time (temporality) is finite and the boundary, end or frontier of this finiteness is indeterminacy. Having-been, presence and authentic future are equiprimordial in temporality. The present is released in an awaiting that retains. Once W, X, Y, Z relate to each other (W + Z = X + Y) equiprimordiality is modified in such a way that a succession of present moments arises. The immediate future is constantly anticipated and the immediate past is constantly slipping away. A condition of the possibility for transcendentalism, inexhaustibility or infinity is that the equiprimordial awaiting that retains is annihilated or de-contextualized. The making present of "now and now and now" is predominant; the awaiting that retains fades into the background. A constant making present is released without conceiving its origin in an awaiting (authentic future) that retains (having-been). A stream of present moments conceals horizontal ecstasies (awaiting, retaining, and making present) of temporality. This concealment (letheia) constitutes the modus operandi of everydayness."

#### A correction has been made to **4. Temporality and Information**, section **4.2 Essences of Temporalized Information**, subsection **4.2.3 Spannedness**:

"Temporality broadens the presence. Authentic making present is broadening<sup>14</sup> [20, 21]. Temporal ecstasies broaden the presence in the sense that they span across future, past, and present. Making present, awaiting and retaining are equiprimordial ecstasies. The present is released from an awaiting that retains. This horizontal spannedness of future, past and present has its primary moment in an anticipation of indeterminacy (authentic future). Through temporality temporalizing itself ecstatically and horizontally out of authentic future information comes into existence. The meaning or significance of information in general is temporality. To say that information is this or that is to let-it-come-toward-itself (awaiting), let-it-be-as-it-already-was (retaining), and let-it-beencountered-as-it-is (making present). To say that information "is" admits the existence of information. Existence means being-ahead-of-itself. Being-ahead-of-itself temporalizes out of authentic future. Indeterminacy shines into what it is not: information. This spanning or broadening of temporal ecstasies finds it formal expression in ABIn Time Inseparable (cf. Section 3.1). Here temporalized components refer to each other instantaneously, simultaneously, or equiprimordially. Past, present and future are equiprimordial as long as W, X, Y, Z do not relate to each other (W + Z 6= X + Y)."

A correction has been made to **4. Temporality and Information**, section **4.1 Making Present, Awaiting, and Retaining**, paragraph five:

"Suppose you have booked a 1 week meditation retreat together with a friend. In the evening of the first day of your stay you make an appointment for the next day. You agree with your friend on having a first meditative exercise at sunrise. Both of you and possibly most of the population on earth know what a sunrise is. In our shared and common world the sun as a natural clock is always already discovered. Before sundials as well as mechanized, electrified and digitized clockwork were invented, the sun was a thing encountered at hand ready to be used. In circumspect taking care it was used as a natural pointer to sunrise, noon and sunset according to which everyday activities were coordinated. The next morning you and your friend wake up at sunrise. Both of you look into the sky and you see the sun at the horizon. "Now it is time to have a meditative exercise" is what both of you understand and share publicly in measuring time with the oldest clock on earth, the sun. Usually and most of the time we take care of things and time itself as "one" does it. Implicitly and unthematically we understand what time it is and what we have to do. Although primordial time is leveled down or crossed over we understand temporality temporalizing itself ecstatically and horizontally. With every "now, that it is time to have a meditative exercise" (sunrise), an "on that former occasion" (earlier when the sun rose, yesterday, the days before, etc.) and a "then, when the sun will have reached its peak or will set" (later on at noon or sunset) are presupposed and equiprimordially understood though not explicitly articulated."

#### A correction has been made to **4. Temporality and Information**, section **4.2 Essences of Temporalized Information**, subsection **4.2.2 Significance**:

"Time is likewise derivative or vulgar if significance is nullified. In average everydayness, if I wake up in the morning and have an appointment at sunrise, I do not ponder or reason why I have this appointment, what it is good for, or for the sake of which desire or preference I made it. I just have it. Like datability significance is crossed over or leveled down in circumspect taking care of a situation. In primordial time, however, temporality temporalizes "in-order-to" take care of a situation. Its significance tells that it is time for what shows itself or is given, which may either be appropriate or inappropriate. For instance, it is appropriate to catch up with my friend for having our first meditative exercise and it is inappropriate to go back to bed and have a couple of hours extra sleep. For a basketball player it is appropriate to take a three-point shot when it's time for taking the lead, or, it's inappropriate, when there is a bad defense in the zone depending on the situation and his circumspect taking care of it."

The author apologizes for this error and state that this does not change the scientific conclusions of the article in any way. The original article has been updated.

Copyright © 2019 Flender. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Toward a Quantum Theory of Humor

#### Liane Gabora<sup>1</sup> \* and Kirsty Kitto<sup>2</sup>

<sup>1</sup> Department of Psychology, University of British Columbia, Kelowna, BC, Canada, <sup>2</sup> Department of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia

This paper proposes that cognitive humor can be modeled using the mathematical framework of quantum theory. We begin with brief overviews of both research on humor, and the generalized quantum framework. We show how the bisociation of incongruous frames or word meanings in jokes can be modeled as a linear superposition of a set of basis states, or possible interpretations, in a complex Hilbert space. The choice of possible interpretations depends on the context provided by the set-up vs. the punchline of a joke. We apply the approach to a verbal pun, and consider how it might be extended to frame blending. An initial study of that made use of the Law of Total Probability, involving 85 participant responses to 35 jokes (as well as variants), suggests that the Quantum Theory of Humor (QTH) proposed here provides a viable new approach to modeling humor.

#### Edited by:

Andrei Khrennikov, Linnaeus University, Sweden

#### Reviewed by:

Haroldo Valentin Ribeiro, Universidade Estadual de Maringá, Brazil Raimundo Nogueira Costa Filho, Federal University of Ceará, Brazil Irina Basieva, Graduate School for the Creation of New Photonics Industries, Russia

> \*Correspondence: Liane Gabora liane.gabora@ubc.ca

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 01 September 2016 Accepted: 21 December 2016 Published: 26 January 2017

#### Citation:

Gabora L and Kitto K (2017) Toward a Quantum Theory of Humor. Front. Phys. 4:53. doi: 10.3389/fphy.2016.00053

Keywords: bisociation, context, humor, incongruity, law of total probability, pun, quantum cognition, quantum interaction

#### 1. INTRODUCTION

Humor has been called the "killer app" of language [1]; it showcases the speed, playfulness, and flexibility of human cognition, and can instantaneously put people in a positive mood. For over a 100 years scholars have attempted to make sense of the seemingly nonsensical cognitive processes that underlie humor. Despite considerable progress with respect to categorizing different forms of humor (e.g., irony, jokes, cartoons, and slapstick) and understanding what people find funny, there has been little investigation of the question: What kind of formal theory do we need to model the cognitive representation of a joke as it is being understood?

This paper attempts to answer this question with a new model of humor that uses a generalization of the quantum formalism. The last two decades have witnessed an explosion of applications of quantum models to psychological phenomena that feature ambiguity and/or contextuality [2–4]. Many psychological phenomena have been studied using quantum models, including the combination of words and concepts [5–10], similarity and memory [11, 12], information retrieval [13, 14], decision making and probability judgment errors [15–19], vision [20, 21], sensation–perception [22], social science [23, 24], cultural evolution [25, 26], and creativity [27, 28]. These quantum inspired approaches make no assumption that phenomena at the quantum level affect the brain, but rather, draw solely on abstract formal structures that, as it happens, found their first application in quantum mechanics. They utilize the structurally different nature of quantum probability. While in classical probability theory events are drawn from a common sample space, quantum models define states and variables with reference to a context, represented using a basis in a Hilbert space. This results in phenomena such as interference, superposition and entanglement, and ambiguity with respect to the outcome is resolved with a quantum measurement and a collapse to a definite state.

**49**

This makes the quantum inspired approach an interesting new candidate for a theory of humor. Humor often involves ambiguity due to the presence of incongruous schemas: internally coherent but mutually incompatible ways of interpreting or understanding a statement or situation. As a simple example, consider the following pun:

"Time flies like an arrow. Fruit flies like a banana."

This joke hangs on the ambiguity of the phrase FRUIT FLIES, where the word FLIES can be either a verb or a noun. As a verb, FLIES means "to travel through the air." However, as a noun, FRUIT FLIES are "insects that eat fruit." Quantum formalisms are highly useful for describing cognitive states that entail this form of ambiguity. This paper will propose that the quantum approach enables us to naturally represent the process of "getting a joke."

We start by providing a brief overview of the relevant research on humor.

# 2. BRIEF BACKGROUND IN HUMOR RESEARCH

Even within psychology, humor is approached from multiple directions. Social psychologists investigate the role of humor in establishing, maintaining, and disrupting social cohesion and social status, developmental psychologists investigate how the ability to understand and generate humor changes over a lifetime, and health psychologists investigate possible therapeutic aspects of humor. This paper deals solely with the cognitive aspect of humor. Much cognitive theorizing about humor assumes that it is driven by the simultaneous perception [29, 30] or "bisociation" [31] of incongruent schemas. Schemas can be either static frames, as in a cartoon, or dynamically unfolding scripts, as in a pun. For example, in the "time flies" joke above, interpreting the phrase FRUIT FLIES as referring to the insect is incompatible with interpreting it as food traveling through the air. Incongruity is generally accompanied by the violation of expectations and feelings of surprise. While earlier approaches posited that humor comprehension involves the resolution of incongruous frames or scripts [32, 33], the notion of resolution often plays a minor role in contemporary theories, which tend to view the punchline as activating multiple schemas simultaneously and thereby underscoring ambiguity (e.g., 34, 35).

There are computational models of humor detection and understanding (e.g., 36), in which the interpretation of an ambiguous word or phrase changes as new surrounding contextual information is parsed. For example, in the "time flies" joke, this kind of model would shift from interpreting FLIES as a verb to interpreting it as a noun. There are also computational models of humor that generate jokes through lexical replacement; for example, by replacing a "taboo" word with a similar-sounding innocent word (e.g., [37, 38]). These computational approaches to humor are interesting, and occasionally generate jokes that are laugh-worthy. However, while they tell us something about humor, we claim that they do not provide an accurate model of the cognitive state of a human mind at the instant of perceiving a joke. As mentioned above, humor psychologists believe that humor often involves not just shifting from one interpretation of an ambiguous stimulus to another, but simultaneously holding in mind the interpretation that was perceived to be relevant during the set-up and the interpretation that is perceived to be relevant during the punchline. For this reason, we turned to the generalized quantum formalism as a possible approach for modeling the cognitive state of holding two schemas in mind simultaneously.

# 3. BRIEF BACKGROUND IN GENERALIZED QUANTUM MODELING

Classical probability describes events by considering subsets of a common sample space [39]. That is, considering a set of elementary events, we find that some event e occurred with probability pe. Classical probability arises due to a lack of knowledge on the part of the modeler. The act of measurement merely reveals an existing state of affairs; it does not interfere with the results.

In contrast, quantum models use variables and spaces that are defined with respect to a particular context (although this is often done implicitly). Thus, in specifying that an electron has spin "up" or "down," we are referring to experimental scenarios (e.g., Stern-Gerlach arrangements and polarizers) that denote the context in which a measurement occurred. This is an important subtlety, as many experiments have shown that it is impossible to attribute a pre-existing reality to the state that is measured; measurement necessarily involves an interaction between a state and the context in which it is measured, and this is traditionally modeled in quantum theory using the notion of projection. The state |9i representing some aspect of interest in our system is written as a linear superposition of a set of basis states {|φii} in a Hilbert space, denoted H, which allows us to define notions such as distance and inner product. In creating this superposition we weight each basis state with an amplitude term, denoted a<sup>i</sup> , which is a complex number representing the contribution of a component basis state P |φii to the state |9i. Hence |9i = i ai |φii. The square of the absolute value of the amplitude equals the probability that the state changes to that particular basis state upon measurement. This non-unitary change of state is called collapse. The choice of basis states is determined by the observable, Oˆ , to be measured, and its possible outcomes o<sup>i</sup> . The basis states corresponding to an observable are referred to as eigenstates. Observables are represented by self-adjoint operators on the Hilbert space. Upon measurement, the state of the entity is projected onto one of the eigenstates.

It is also possible to describe combinations of two entities within this framework, and to learn about how they might influence one another, or not. Consider two entities A and B with Hilbert spaces <sup>H</sup><sup>A</sup> and <sup>H</sup>B. We may define a basis <sup>|</sup>ii<sup>A</sup> for <sup>H</sup><sup>A</sup> and a basis <sup>|</sup>ji<sup>B</sup> for <sup>H</sup>B, and denote the amplitudes associated with the first as a A i and the amplitudes associated with the second as a B j . The Hilbert space in which a composite of these entities exists is given by the tensor product <sup>H</sup><sup>A</sup> <sup>⊗</sup> <sup>H</sup>B. The most general state in <sup>H</sup><sup>A</sup> <sup>⊗</sup> <sup>H</sup><sup>B</sup> has the form

$$|\Psi\rangle\_{AB} = \sum\_{i,j} a\_{ij} |i\rangle\_A \otimes |j\rangle\_B \tag{1}$$

This state is separable if aij = a A i a B j . It is inseparable, and therefore an entangled state, if aij 6= a A i a B j .

In some applications, the procedure for describing entanglement is more complicated than what is described here. For example, it has been argued that the quantum field theory procedure, which uses Fock space to describe multiple entities, gives a kind of internal structure that is superior to the tensor product for modeling concept combination [5]. Fock space is the direct sum of tensor products of Hilbert spaces, so it is also a Hilbert space. For simplicity, this initial application of the quantum formalsm to modeling humor will omit such refinements, but such a move may become necessary in further developments of the model.

Quantum models can be useful for describing situations involving potentiality, in which change of state is nondeterministic and contextual. The concept of potentiality has broad implications across the sciences; for example, every biological trait not only has direct implications for existing phenotypic properties such as fitness, but both enables and constrains potential future evolutionary changes for a given species. The quantum approach been used to model the biological phenomenon of exaptation—wherein a trait that originally evolved for one purpose is co-opted for another (possibly after some modification) [40]. The term exaptation was coined by Gould and Vrba [41] to denote what Darwin referred to as preadaptation<sup>1</sup> . Exaptation occurs when selective pressure causes this potentiality to be exploited. Like other kinds of evolutionary change, exaptation is observed across all levels of biological organization, i.e., at the level of genes, tissue, organs, limbs, and behavior. Quantum models have also been used to model the cultural analog of exaptation, wherein an idea that was originally developed to solve one problem is applied to a different problem [40]. For example, consider the invention of the tire swing. It came into existence when someone re-conceived of a tire as an object that could form the part of a swing that one sits on. This re-purposing of an object designed for one use for use in another context is referred to as cultural exaptation. Much as the current structural and material properties of an organ or appendage constrain possible re-uses of it, the current structural and material properties of a cultural artifact (or language, or art form, etc.) constrain possible re-uses of it. We suggest that incongruity humor constitutes another form of exaptation; an ambiguous word, phrase, or situation, that was initially interpreted one way is revealed to have a second, incongruous interpretation. Thus, it is perhaps unsurprising that, as with other forms of exaptation, a quantum model is explored.

# 4. A QUANTUM INSPIRED MODEL OF HUMOR

A quantum theory of humor (QTH) could potentially inherit several core concepts from previous cognitive theories of humor while providing a unified underlying model. Considering the past work discussed in Section 2, it seems reasonable to focus on the notion that cognitive humor involves an ambiguity brought on by the bisociation of internally consistent but mutually incongruous schemas. Thus, cognitive humor appears to arise from the double think that is brought about by being forced to reconsider some currently held interpretation of a joke in light of new information: a frame shift. Such an insight opens humor upto quantum-like models, as a frame shift of an ambiguous concept is well modeled by the notion of a quantum superposition described using two sets of incompatible basis states within some underlying Hilbert space structure.

In what follows we sketch out a preliminary quantum inspired model of humor and discuss what would be required for a full-fledged formal QTH. Next, we outline a study aimed at discovering whether humor behaves in a quantum-like manner. The last section discusses how the QTH opens up avenues for future investigation in a field that to date has not been well modeled.

#### 4.1. The Mathematical Structure of QTH

We start our journey toward a QTH by building upon an existing model of conceptual combination [8]: the State–COntext– Property (SCOP) model. As per the standard approach used in most quantum-like models of cognition, |9i represents the state of an ambiguous element, be it a word, phrase, object, or something else, and its different possible interpretations are represented by basis states. Core to the SCOP model is a treatment of the context in which every measurement of a state occurred, and the resultant property that was measured. These three variables are stored as a triple in a lattice.

#### 4.1.1. The State Space

Following Aerts and Gabora [6], the set of all possible interpretation states for the ambiguous element of a joke is given by a state space 6. Specific interpretations of a joke are denoted by |pi, |qi, |ri, · · · ∈ 6 which form a basis in a complex Hilbert space H. Before the ambiguous element of the joke is resolved, it is in a state of potentiality, represented by a superposition state of all possible interpretations. Each of these represents a possible understanding arising due to activation of a schema associated with a particular interpretation of an ambiguous word or situation. The interpretations that are most likely are most heavily weighted. The amplitude term associated with each basis state represented by a complex number coefficient a<sup>i</sup> gives a measure of how likely an interpretation is given the current contextual information available to the listener. We assume that all basis states have unit length, are mutually orthogonal, and are complete, thus P i |ai | <sup>2</sup> <sup>=</sup> 1.

#### 4.1.2. The Context

In the context of a traditional verbal joke, the context consists primarily of the setup, and the setup is the only contextual element considered in the study in Section 5. However, it should be kept in mind that several other contextual factors not considered in our analysis can affect perceived funniness. Prominent amongst these is the delivery; the way

<sup>1</sup>The terms exaptation, preadaptation and co-option are often used interchangeably.

in which a joke is delivered can be everything when it comes to whether or not it is deemed funny. Other factors include the surroundings, the person delivering the joke, the power relationships among different members of the audience, and so forth.

As a first step, we might represent the set of possible contexts for a given joke as <sup>c</sup><sup>i</sup> <sup>∈</sup> <sup>C</sup>. Each possible interpretation of a joke comes with a set <sup>f</sup><sup>i</sup> <sup>∈</sup> <sup>F</sup> of properties (i.e., features or attributes), which may be weighted according to their relevance with respect to this contextual information. The weight (or renormalized applicability) of a certain property given a specific interpretation <sup>|</sup>p<sup>i</sup> in a specific context <sup>c</sup><sup>i</sup> <sup>∈</sup> <sup>C</sup> is given by <sup>ν</sup>. For example, ν(p, f1) is the weight of feature f<sup>i</sup> for state |pi, which is determined by a function from the set <sup>6</sup>×<sup>F</sup> to the interval [0, 1]. We write:

$$\nu: \Sigma \times \mathcal{F} \to [0, 1] \tag{2}$$

$$(p, f\_l) \mapsto \nu(p, f\_l).$$

#### 4.1.3. Transition Probabilities

A second function µ describes the transition probability from one state to another under the influence of a particular context. For example, µ(q, e, p) is the probability that state |pi under the influence of context c<sup>i</sup> changes to state |qi. Mathematically, µ is a function from the set <sup>6</sup> <sup>×</sup> <sup>C</sup> <sup>×</sup> <sup>6</sup> to the interval [0, 1], where µ(q, e, p) is the probability that state |pi under the influence of context |ei changes to state |qi. We write:

$$
\mu: \Sigma \times \mathcal{C} \times \Sigma \to [0, 1] \tag{3}
$$

$$
(q, e, p) \mapsto \mu(q, e, p).
$$

Thus, a first step toward a full quantum model of humor consists of the 3-tuple (6, C, F), and the functions ν and µ. Next we address a key question that should be asked of any cognitive theory of humor: what is the underlying cognitive model of the funniness of a joke?

#### 4.2. The Humor of a Joke

As the listener hears a joke, more context is provided, and in our model the listener's understanding evolves according to the transition probabilities associated with the cognitive state and the emerging context. When the listener hear the joke a bisociation of meaning is percieved; that is, the listener realizes that a second way of interpreting it is possible. A projective measurement onto a funniness frame is the mechanism that we use to model the likelihood that a given joke is considered funny.

Thus, in our model, funniness plays the role of a measurement operator, and it is affected by the shift that occurs in the understanding of a joke with respect to two possible framings: one created by the setup, and one by the punchline. The probability of a joke being regarded as funny or not is proportional to the projection of the individual's understanding of the joke (|9i) onto a basis representing funniness. This means that the probability of a joke being considered as funny, p<sup>F</sup> is given by a projection onto the <sup>|</sup>1<sup>i</sup> axis in <sup>H</sup><sup>2</sup> F , a two-dimensional Hilbert sub-space where |0i represents "not funny" and |1i represents "funny."

$$p\_F = |\langle 1 \rangle \langle 1 | \Psi \rangle|^2 \tag{4}$$

Similarly, the probability of a joke being regarded as not funny is represented by

$$\rho\_{\bar{F}} = \left| \left| 0 \right\rangle \langle 0 | \Psi \rangle \right|^2. \tag{5}$$

Note that |9i evolves as the initial conceptualization of the joke is reinterpreted with respect to the frame of the punchline. This is a difficult process to model, and we consider the work in this paper to be an early first step toward an eventually more comprehensive theory of humor that includes predictive models.

We now present two examples in which specific instances of humor are considered within the perspective of this basic quantum inspired model. First the approach is applied to a pun. Second, the approach is applied to a cartoon that is a frame blend. Both scenarios will help to deepen our understanding of the significant complexity of humor, and the difficulties associated with creating a mathematical model of this important human phenomenon.

#### 4.3. Applying QTH to a Pun

Consider the pun: "Why was 6 afraid of 7? Because 789." The humor of this pun hinges on the fact that the pronunciation of the number EIGHT, a noun, is identical to that of the verb ATE. We refer to this ambiguous word, with its two possible meanings, as EYT. An individual's interpretation of the word EYT is represented by |9i, a vector of length equal to 1. This is a linear superposition of basis states in the semantic subspace H<sup>2</sup> <sup>M</sup> which represents possible states (meanings) of the word EYT: EIGHT or ATE<sup>2</sup> . The interpretation of EYT as a noun, and specifically the number EIGHT, is denoted by the unit vector |ni. The verb interpretation, ATE, is denoted by the unit vector <sup>|</sup>vi. The set {|ni, <sup>|</sup>vi} forms a basis in <sup>H</sup><sup>2</sup> <sup>M</sup>. Thus, we have now expanded our original two-dimensional funniness space with an additional two-dimensional semantic space, where the full space <sup>H</sup><sup>4</sup> <sup>=</sup> <sup>H</sup><sup>2</sup> <sup>F</sup> <sup>⊗</sup> <sup>H</sup><sup>2</sup> <sup>M</sup>. We note that these two spaces should not be considered as mutually orthogonal, but that they will overlap. If they were orthogonal then the funniness of a joke would be independent of the interpretation that a person attributes to it.

With this added mathematical structure, we can represent the interpretation of the joke as a superposition state in H<sup>2</sup> M

$$|\Psi\rangle = a\_n|n\rangle + a\_\nu|\nu\rangle,\tag{6}$$

where a<sup>n</sup> and a<sup>v</sup> are amplitudes which, when squared, represent the probability of a listener interpreting the joke in a noun or a verb form (|ni and |vi) respectively. This state is depicted in **Figure 1A**, which shows a superposition state in the semantic space. When given no context in the form of the actual presentation of the joke, these amplitudes represent the prior

<sup>2</sup>We acknowledge that other interpretations are possible, and so this is a simplified model. It is straightforward to extend the model into higher dimensions by adding further interpretations as basis states.

likelihood of a listener interpreting the uncontextualized word (i.e., EYT) in either of the noun or verb senses (e.g., a free association probability; see [12] for a review). However, we would expect to see these probabilities evolving throughout the course of the pun as more and more context is provided (in the form of additional sentence structure). Throughout the course of the joke, the state vector |9i therefore evolves to a new position in H<sup>4</sup> .

Since the set-up of the joke,"Why was 6 afraid of 7?," contains two numbers, it is likely that the numbers interpretation of EYT is activated (a situation represented in **Figure 1A**). The listener is biased toward an interpretation of EYT in this sense, and so we would expect that a<sup>n</sup> >> av. However, a careful listener will feel confused upon considering this set-up because we do not think of numbers as beings that experience fear. This keeps the interpretation of EYT shifted away from an equivalence with the eigenvector |ni. As the joke unfolds, the predator interpretation that was hinted at in the set-up by the word "afraid," and reinforced by "789," activates a more definite alternative meaning, ATE, represented by |vi. This generates an alternative interpretation of the punchline: that the number 7 ate the number 9. The cognitive state |9i has evolved to a new position in H<sup>4</sup> , a scenario that is represented in **Figure 1B**. At this point a measurement occurs: the individual either considers the joke as funny or not within the context represented by the funniness sub space H<sup>2</sup> F , and a collapse to the relevant funniness basis state occurs (see **Figure 1C**). Note that this final state still contains a superposition within the meaning subspace H<sup>2</sup> <sup>M</sup>—the funniness judgment merely shifts the interpretation of the joke, it does not eliminate the bisociation. Rather, it depends upon it.

If we consider the set of properties associated with EYT then we would expect to see two very different prototypical characteristics associated with each interpretation. For example, the EIGHT interpretation is difficult to map into properties such as "food" denoted f1, and "not living" denoted f<sup>2</sup> (since when something is eaten it is usually no longer alive). Because "food" and "not living" are not properties of EIGHT, ν(p, f0) << ν(n, f0), and similarly ν(p, f1) << ν(n, f1). However, "food" and "not living" are properties of EYT, ν(p, f0) << ν(v, f0), and similarly ν(p, f1) << ν(v, f1).

We can now start to construct a model of humor that could be correlated with data. If jokes satisfy the law of total probability (LTP) then their funniness should satisfy the distributive axiom, which states that the total probability of some observable should be equal to the sum of the probabilities of it under sets of more specific conditions. Thus, considering a funniness observable Oˆ F (with eigenstates {|1i, |0i} and the semantic observable Oˆ <sup>M</sup> (with a simplified two eigenstate structure {|Mi, |M¯ i} representing two possible meanings that could be attributed to the joke). We can take the spectral decomposition of Oˆ <sup>M</sup> = m|MihM| + ¯m|M¯ ihM¯ |, where m, m¯ are eigenvalues of the two eigenstates {|Mi, |M¯ i}. Doing this, we should find that if this system satisfies the LTP then the probability of the joke being judged as funny is equal to the sum of the probability of it being judged funny given either semantic interpretation

$$\mathfrak{p}(F) = \mathfrak{p}(|1\rangle) = \mathfrak{p}(M) \cdot \mathfrak{p}(F|M) + \mathfrak{p}(\bar{M}) \cdot \mathfrak{p}(F|\bar{M}).\tag{7}$$

We can manipulate the interpretation that a participant is likely to attribute to a joke by changing the semantics of the joke itself. Thus, changing the joke should change the semantics, and so affect the humor that is attributed to the joke. We shall return to this idea in Section 5.

This section has demonstrated that a formal approach to concept interactions that has been previously shown to be consistent with human data [5] can be adapted to the simultaneous perception of incongruous meanings of an ambiguous word or phrase in the understanding of a pun.

# 4.4. Applying QTH to a Frame Blend

Although our first example used a pun for simplicity, we believe that quantum inspired models may also be useful for more elaborate forms of humor, such as jokes and cartoons referred to as frame blends. A frame blend involves the merging of incongruous frames of reference [42]. A common example of a frame blend is a cartoon in which animals are engaged in some kind of human behavior (such as a cartoon of a cow with all her teats pierced saying "Just gotta be me"). In a frame blend rather than being led "down the garden path" by the setup and subsequent re-interpretation in light of the punchline, the humor results from the simultaneous presentation of seemingly incompatible frames. Using QTH, the two interpretations of the incongruous situation would be designated by the unit vectors {|di, |oi}. The cognitive state of perceiving the blended frames is represented as a superposition of the two frames. As with phenomena such as conceptual combination, there are likely to be constraints on how frames can be successfully blended, and it will be necessary to consider this when constructing models of frame blends. We reserve further exploration of this interesting class of humor for future work.

# 5. PROBING THE STATE SPACE OF HUMOR

Returning to the question raised by Equation (7), a QTH should be justified by considering whether humor does indeed violate the Law of Total Probability (LTP) [3]. However, the complexity of language makes it difficult to test how humor might violate the LTP using a method similar to those followed for decision making [11]. Past work on humor is unlikely to yield the data required to perform tests such as this. For example, we currently have no experimental understanding of how the semantics of a joke interplays with its perceived funniness. It seems reasonable to suppose that the two are related, but how? We are not aware of any data that provide a way to evaluate this relationship. This is problematic, as there are a number of interdependencies in the framing of a joke that make it difficult to construct a model (even before considering factors such as the context in which the joke is made, and the socio-cultural background of the teller and the listener). In this section we present results from an exploratory study designed to start unpacking whether humor should indeed be considered within the framework of quantum cognition. As an illustrative example, consider the following joke:

VO: "Time flies like an arrow. Fruit flies like a banana."

As with the joke discussed in Section 4.3, the humor arises from the ambiguity of the words FRUIT and FLIES. The first frame (F1, the set-up), leads one to interpret FLIES as a verb and LIKE as a preposition, but the second frame (F2, the punchline), leads one to interpret FRUIT FLIES as a noun and LIKE as a verb. A QTH must be able to explain how the funniness of the joke depends upon a shift in the semantic understanding of the two frames, F1 and F2.

We now outline a preliminary study that has helped us to explore the state space of humor.

# 5.1. Stimuli

We collected a set of 35 jokes and for each joke we developed a set of joke variants. A V<sup>S</sup> variant consisted of the set-up only for the original, VO. Thus, the V<sup>S</sup> variant of the V<sup>O</sup> joke is

VS: "Time flies like an arrow."

A V<sup>P</sup> variant consists of the original punchline only. Thus, the V<sup>P</sup> variant of the V<sup>O</sup> joke is

VP: "Fruit flies like a banana."

We then considered the notion of a congruent punchline as one that does not introduce a new interpretation or context for an ambiguous element of the set-up (or punchline). Congruence was achieved by modifying the set-up to make it congruent with the punchline, or by modifying the punchline to make it congruent with the set-up. Thus, if the original set-up makes use of a noun, then so does a congruent modification (and similarly for the punchline).

A CP variant consists of the original set-up followed a congruent version of the punchline. Thus, a CP variant of the O joke is:

CP: "Time flies like an arrow; time flies like a bird."

A CS variant consists of the original punchline preceded by a congruent version of the set-up. Thus, a CS variant of the O' joke is

CS: "Horses like carrots; fruit flies like a banana."

For some jokes we had a fifth kind of variant. A IS variant consists of the original set-up followed an incongruent version of the punchline that we believed was comparable in funniness to the original. Thus, considering the joke discussed in Section 4.3:

O: "Why was 6 afraid of 7? Because 789."

A IS variant of this joke is:

IS: "Why was 6 afraid of 7? Because 7 was a six offender."

Thus the stimuli consisted of a questionnaire containing original jokes, and the above variants presented in randomized order. The complete collection of jokes and their variants is presented in the Appendix (Supplementary Material).

#### 5.2. Participants

The participants in this study were 85 first year undergraduate students enrolled in an introductory psychology course at the University of British Columbia (Okanagan campus). They received partial course credit for their participation.

#### 5.3. Procedure

Participants signed up for the study using the SONA recruitment system, and subsequently responded at their convenience to an online questionnaire hosted by FluidSurveys. They were informed that the study was completely voluntary, and that they were free to withdraw at any point in time. They were also informed that the researcher would not have any knowledge of who participated in the study, and that their participation would not affect their standing in the psychology class or relationship with the university. Participants were told that the purpose of the study was to investigate humor, and to help contribute to a better understanding the cognitive process of "getting" a joke. Participants were asked to fill out consent forms. If they agreed to participate, they were provided a questionnaire consisting of a series of jokes and joke variants (as described above) and asked to rate the funniness of each using a Likert scale, from 1 (not funny) to 5 (hilarious). The questionnaire took approximately 25 min to complete. They received partial course credit for their participation.

#### 5.4. Results

The mean funniness ratings across all participants for the entire collection of jokes and their variants (as well as the jokes and variants themselves) is provided in the Appendix (Supplementary Material). **Table 1** provides a summary of this information (the mean funniness rating of each kind of joke variant across all participants) aggregated across all joke sets. As expected, the original joke (O) was funniest (mean funniness = 2.70), followed by those jokes that had been intentionally modified to be funny: Incongruent Setup (IS) (mean funniness = 2.37) and Incongruent Punchline (IP) (mean funniness = 2.12). Next in funniness were the jokes that had been modified to eradicate the incongruency and thus the source of the humor: Congruent Setup (CS) (mean funniness = 1.41) and Congruent Punchline (CP) (mean funniness = 1.47). The joke fragments without a counterpart–i.e., either Setup (S) or Punchline (P) alone–were considered least funny of all (the mean funniness of both was 1.22). The dataset is entirely consistent with the view that the humor derives from incongruence due to bisociation.

#### 5.5. Toward a Test of the QTH

Recall that the Law of Total Probability (LTP) as represented by Equation (7) suggests that the mean funniness of a joke should be equal to the sum of its funniness as judged under all possible semantic interpretations. This is not an equality that we can directly test given our current understanding of language and how it might interplay with humor. However, the



O, Original; S, Set-up only; P, Punchline only; CS, Congruent Set-up; CP, Congruent Punchline; IS, Incongruent Set-up; IP, Incongruent Punchline.

dataset reported here gives us some initial ways to address this. With a methodology for converting the Likert scale ratings into projective measurements of a joke being funny or not, we can start to consider the relative frequency that an original joke is judged as funny and compare this result with the individual components.

We start by translating the Likert scale responses into a simplified measurement of funniness, by mapping the funniness ratings into a designation of funny or not. In order to run a quick comparison between the relative frequencies that participants decided the full joke (VO) was funny when compared to the components of the joke (V<sup>S</sup> and VP), we took the mean value of the components for each subject. Given that puns are not generally considered particularly funny (a result backed up by our participant ratings) we used a fairly low threshold value of 2.5 (i.e., if the mean was less than 2.5 then the components were judged as unfunny, and vice versa). Exploring the results of this mapping gives us the data reported in **Figure 2** for the VO, V<sup>S</sup> and V<sup>P</sup> variants of the jokes, listing the frequency at which participants judged the joke and subcomponents funny. A mean value for the joke fragments is also presented. All data uses confidence intervals at the 95% level.

We see a significant discrepancy between the funniness of the original and the combined funniness of its components. This is not a terribly surprising result; jokes are not funny when the set-up is not followed by the punchline, and participants usually rated V<sup>S</sup> and V<sup>P</sup> variants as unfunny (i.e., scoring them at 1). Table 2 in the Appendix (Supplementary Material) shows that in the participant pool of 85, the set-up and punchline variants of the joke rarely had a mean funniness rating above 1.5. However, to extract a violation of the LTP for this scenario, we would need to construct expressions such as the following

$$\mathfrak{p}(F) = \mathfrak{p}(EIGHT).\mathfrak{p}(F|EIGHT) + \mathfrak{p}(ATE).\mathfrak{p}(F|ATE). \tag{8}$$

How precisely could such a relationship be tested? Two forms of data are required to test whether the simple puns used in our experiment actually violate the LTP:


We have demonstrated a method for extracting the funniness ratings above. How might we obtain data for the semantic probabilities? We must consider the precise interpretation of what these probabilities might actually be. Firstly, we note that it seems likely participants will interpret just a set-up or a punchline in the sense that the fragment represents. The bisociation that humor relies upon is not present for a fragment, and so a person hearing a fragment will be primed by its surrounding context toward interpreting an ambiguous word in precisely the sense intended for that fragment. Indeed, the incongruity that results from having to readjust the interpretation of the joke, and the

resulting bisociation, lies at the very base of the humor that arises. Free association probabilities will not give these values. To test the LTP, it would be necessary to extract information about how a participant is interpreting core terms in the joke as it progresses; some form of nondestructive measurement is required, and a new experimental protocol will have to be defined. We reserve this for future work.

However, the significant difference between the rated funniness of the fragments and that of the original joke allows us to formulate an alternative mechanism for testing equations of the form (7) and (8). We can do this by asking whether there is any way in which the semantic probabilities could have values that would satisfice the LTP? An examination of **Figure 2** for the setup and punchline variants of the jokes suggests that there is no way in which to chose semantic probabilities that will satisfy the LTP. Thus, we have preliminary evidence that humor should perhaps be treated using a quantum inspired model.

# 6. DISCUSSION

It would appear that there is some support for the hypothesis that the humor arising from bisociation can be modeled by a quantum inspired approach. Furthermore, the experimental results presented in section 5 suggest that this model might more appropriate than one grounded in classical probability. However, much work remains to be completed before we can consider these findings anything but preliminary.

Firstly, the model presented in Section 4 is simple, and will need to be extended. While an extension to more senses for an ambiguous element of a joke is straightforward with a move to higher dimensions, the model is currently not well suited to the set of variants discussed in Section 5.3. A model that can show how they interrelate, and how their underlying semantics affects the perceived humor in a joke is desirable. Furthermore, the funniness of the joke was simplistically represented by a projection onto the "funny"/"not funny" axis. A more theoretically grounded treatment of the Likert data is desirable. For example, the current threshold value of 2.5 was chosen somewhat arbitrarily [although could be justified by a consideration of the mean values for funniness scores reported in the Appendix (Supplementary Material)—see Table 2]. A more systematic way of considering the Likert scale measures to allow for a normalization of funniness ratings at the level of an individual is also desirable. As a highly subjective phenomenon, funniness is liable to be judged by different individuals inconsistently and so it will be important that we control for this effect in comparing Likert responses among individuals.

Considering experimental results, the sample size of the data set is somewhat small (85 participants), although our funniness ratings appear to be reasonably stable for this cohort. A more concerning problem revolves around the construction of a LTP relationship for our simple model. There are many alternative ways in which a LTP could be constructed for puns, and more sophisticated models need to be investigated before we can be confident that our results conclusively demonstrate that humor must be modeled using a quantum inspired approach. In particular, we require a more sophisticated method that facilitates the extraction of data about the semantics attributed by a participant to a joke. A two stage protocol may be the answer for obtaining the necessary semantic information for a more rigorously founded test of the violation of LTP. It would be useful to construct a systematic study of the manner in which adjusting the congruence of the set-ups and punchlines influences perception of the joke. The quantum inspired semantic space approaches of Van Rijsbergen [13] and Widdows [43] may prove fruitful in this regard, as they would facilitate the creation of similarity models such as those explored by Aerts et al. [44] and Pothos and Trueblood [45].

In summary, humor is complex, and it will take an ongoing program of research to understand the interplay between the semantics of a joke and its perceived funniness. However, at this point we might pause to consider the broader question of why humor might be better modeled by a quantum inspired approach than by one grounded in classical probability? To Gabora and Kitto Quantum Theory of Humor

this end we return to the discussion of Section 3. As we saw, the humor of a pun involves the bisociation of incongruent frames, i.e., re-viewing a setup frame in light of new contextual information provided by a punchline frame. Moreover, the broader contextuality of humor means that even the funniest of jokes can become markedly unfunny if delivered in the wrong way (e.g., a monotone voice), or in the wrong situation (e.g., after receiving very bad news). Funniness is not a preexisting "element of reality" that can be measured; it emerges from an interaction between the underlying nature of the joke, the cognitive state of the listener, and other social and environmental factors. This makes the quantum formalism an excellent candidate for modeling humor, as this interaction is well described by the concept of a vector state embedded in a space which is represented using basis states that can be reoriented according to the framing of the joke. However, this paper only provides a preliminary indication that a QTH may indeed provide a good theoretical underpinning for this complex process. Much more work remains to be done.

# 7. CONCLUSIONS

This paper has provided a first step toward a quantum theory humor (QTH). We constructed a model where frame blends are represented in a Hilbert space spanned by two sets of basis states, one representing the ambiguous framing of a joke, and the other representing funniness. The process of "getting a joke" then consists of a dual stage scenario, where the cognitive state of a person evolves toward a re-interpretation of the meaning attributed to the joke, followed by a measurement of funniness. We conducted a study in which participants rated the funniness of jokes as well as the funniness of variants of those jokes consisting of setting or punchline by alone. The results demonstrate that the funniness of the jokes is significantly greater than that of their components, which is not particularly surprising, but does show that there is something cognitive taking place above and beyond the information content delivered in the joke. A preliminary test to see whether the humor in a joke violates the law of total probability appears to suggest that there

# REFERENCES


is reason to suppose that a quantum inspired model is indeed appropriate.

Our QTH is not proposed as an all-encompassing theory of humor; for example, it cannot explain why laughter is contagious, or why children tease each other, or why people might find it funny when someone is hit in the face with a pie (and laugh even if they know it will happen in advance). It aims to model the cognitive aspect of humor only. Moreover, despite the intuitive appeal of the approach, it is still rudimentary, and more research is needed to determine to what extent it is consistent with empirical data. Nevertheless, we believe that the approach promises an exciting step toward a formal theory of humor. It is hoped that future research will build upon this modest beginning.

# ETHICS STATEMENT

This research was approved by the Behavioral Research Ethics Board at the University of British Columbia (Okanagan Campus).

# AUTHOR CONTRIBUTIONS

LG had the idea for the paper and designed and conducted the study. Both authors contributed equally to all other aspects of the research and the writing of the paper.

# ACKNOWLEDGMENTS

This work was supported by a grant (62R06523) from the Natural Sciences and Engineering Research Council of Canada. We are grateful to Samantha Thomson who assisted with the development of the questionnaire and the collection of the data for the study reported here.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphy. 2016.00053/full#supplementary-material


Empirical Issues. New York, NY: Academic Press (1972). p. 81–100. doi: 10.1016/B978-0-12-288950-9.50010-9


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Gabora and Kitto. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Quantum-like modeling of cognition

#### Andrei Khrennikov \*

Department of Mathematics, Linnaeus University, Växjö, Sweden

This paper begins with a historical review of the mutual influence of physics and psychology, from Freud's invention of psychic energy inspired by von Boltzmann' thermodynamics to the enrichment quantum physics gained from the side of psychology by the notion of complementarity (the invention of Niels Bohr who was inspired by William James), besides we consider the resonance of the correspondence between Wolfgang Pauli and Carl Jung in both physics and psychology. Then we turn to the problem of development of mathematical models for laws of thought starting with Boolean logic and progressing toward foundations of classical probability theory. Interestingly, the laws of classical logic and probability are routinely violated not only by quantum statistical phenomena but by cognitive phenomena as well. This is yet another common feature between quantum physics and psychology. In particular, cognitive data can exhibit a kind of the probabilistic interference effect. This similarity with quantum physics convinced a multi-disciplinary group of scientists (physicists, psychologists, economists, sociologists) to apply the mathematical apparatus of quantum mechanics to modeling of cognition. We illustrate this activity by considering a few concrete phenomena: the order and disjunction effects, recognition of ambiguous figures, categorization-decision making. In Appendix 1 of Supplementary Material we briefly present essentials of theory of contextual probability and a method of representations of contextual probabilities by complex probability amplitudes (solution of the "inverse Born's problem") based on a quantum-like representation algorithm (QLRA).

#### Edited by:

Wei-Xing Zhou, East China University of Science and Technology, China

#### Reviewed by:

Zhi-Qiang Jiang, East China University of Science and Technology, China Qing Yun Wang, Beihang University, China

#### \*Correspondence:

Andrei Khrennikov, Department of Mathematics, Linnaeus University, PJ vägen 1, Växjö S-35195, Sweden andrei.Khrennikov@lnu.se

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 16 July 2015 Accepted: 28 August 2015 Published: 22 September 2015

#### Citation:

Khrennikov A (2015) Quantum-like modeling of cognition. Front. Phys. 3:77. doi: 10.3389/fphy.2015.00077 Keywords: quantum-like models, cognition and psychology, two slit experiment, order and disjunction effects

# 1. Introduction

Recently, scientists working in various disciplines (physicists, psychologists, economists, sociologists) started to apply the mathematical apparatus of quantum mechanics (QM), [1, 2] especially quantum probability calculus [3] (based on Born's rule), to multi-disciplinary problems [4–36]. Some physicists regard such an activity as totally "illegal." They argue that the mathematical apparatus of QM was designed specifically for description of particular physical phenomena and it cannot be used in, e.g., psychology. Why? Some elaborate that the apparatus of QM is relevant to micro phenomena only (though this viewpoint is debatable even in the quantum physics community). One aim of this paper is to convince physicists, especially those working in the quantum information theory and quantum probability, that applications of the methods of QM to cognition can be justified. We show that the present sharp separation of subjects of physics and psychology/cognition is only a peculiarity of the present moment, that 19th and the first part of 20th century were characterized by mutual influence of physical and psychological theories and the fruitful exchange of ideas between the brightest representatives from both sides. One of the best known examples is the impact made by psychology on QM which resulted in borrowing the

**59**

principle of complementarity [37] by Niels Bohr from William James' book [38], see also books of Plotnitsky [39–41]. It may be less known that, in turn, the idea of complementarity was elaborated by James under the influence of the 19th century studies in thermodynamics which led him (as well as later Freud [42, 43]) to the notion of psychic energy; initially, complementarity in psychology was about complementarity of different representations of psychic energy [38].

Meanwhile, we point out that quantum-like modeling of cognition considered here must be distinguished from theories of physical quantum brain in the spirit of Hameroff [44], Penrose [45, 46]. We work in the purely operational framework: it was found that some experimental studies in cognitive psychology, economics, and social science generate statistical data which match well quantum description of measurements and the corresponding probabilistic outputs (see e.g., [4, 5, 9, 13, 14, 18]). Therefore, it is natural to model cognition with the aid of QM formalism. The quantum cognition project does not try to explain the physiological origin of quantum rules for information processing and probability, similarly to Copenhageners in QM (following Bohr [37]). As in physics, this approach does not exclude a possibility to go beyond the operational quantum formalism. However, for the moment, there is no commonly accepted "prequantum model of cognition," cf., however, with Khrennikov [47].

In this paper we also mark the turning points in the development of mathematical models for laws of thought starting with the book of Boole [48] and considering the foundations of classical probability theory as established by Kolmogorov [49] in 1933.

Then, we briefly review the violations of the laws of classical logic and probability in quantum statistical experiments, in particular we discuss the probabilistic structure of the two slit experiment [50] and adress no-go theorems [1, 51, 52] (von Neumann, Kochen-Specker, Bell), see also [53]. We demonstrate that such violations (including the interference effect) also occur in statistics collected in cognitive experiments. This similarity with effects in quantum physics convinced scientists from physics and cognitive science to apply the mathematical apparatus of QM to modeling of cognition. For illustration we use two concrete applications [12–18]: the order and disjunction effects. The paper is concluded with a short review of recent research in quantum(-like) cognition, in particular, cognitive applications of the theory of open quantum systems [23, 24, 30, 31] and positive operator valued measures [4, 7, 36].

We remark that the use of the mathematical apparatus of QM for problems of cognition is motivated not only by the existence of non-classical statistical data collected in cognitive psychology, but also by similarities of basic features of (1) states of a system under study and (2) possible observations performed on the system, in physics and cognition. First feature concerns the representation of a state (e.g., a mental state) as a superposition of other (basis) states. In quantum(-like) modeling of cognition, superpositions play the crucial role because they represent states of very deep uncertainty which can not be modeled by classical probability distributions. Secondly, the representation of incompatible quantum physical observables by non-commuting operators also corresponds well to psychological intuition, since the majority of observables used in psychology, in particular, in the theory of decision making, exhibit the order effect. The property of entanglement of the states of two (or more) different systems is crucial for most peculiar QM effects (such as quantum teleportation and quantum computing). Entanglement also plays an important role in cognitive studies but as an exhibition of contextuality of cognitive phenomena (in the spirit of Cabello [54]) rather than physical non-locality (see also [53, 55–58]).

The problem of a proper interpretation of a quantum state (represented by a wave function) is still one of the most intriguing problems of quantum foundations [53]. The present situation is characterized by a huge diversity of interpretations (which can be considered as a sign of deep foundational crisis). Working with applications of the QM formalism in new fields of science one also meets this problem. In QM there are, roughly speaking, two big classes of interpretations: (a) quantum state is a physical state of an individual system; (b) quantum state is a special (probabilistic) representation of information about the results of possible measurements on an ensemble of (identically prepared) systems. The first one can be called the physical interpretation and the second one the information interpretation. Recently, the latter became very popular in quantum information theory and led (in its extreme forms) to subjective interpretation of quantum states, including quantum Bayesianism of Fuchs [59–61] and the information interpretation of Zeilinger [62, 63], Brukner [64]. Such interpretations match the ideology of quantum(-like) cognition. (Though, as we have seen in QM, the problem of interpretation is very complex, and it would be too risky to try to fix firmly the interpretation of quantum(-like) states used in cognitive studies.) Meanwhile, there is one crucial difference between conventional QM and quantum cognition. In QM, in accordance with Bohr's views, there is a system and an observer, the latter considered as external with respect to the system. This ideology, although working successfully in experimental studies of micro-world phenomena, is problematical where the possibility of separation between a system under observation and the observer is questionable, e.g., in quantum cosmology. Trying to solve this problem, the problem of interpretation of the "wave function of the Universe," Hugh Everett proposed the many worlds interpretation of the wave function, probably the most exotic among all interpretations. In fact, in quantum cognition we meet the same problem. The brain is a self-observer; here it is not easy to separate the system under measurement from the observer. However, it seems that the information interpretation in the spirit of Zeilinger-Brukner-Fuchs gives a possibility to resolve it: in the brain, one information subsystem makes predictions about the result of the observation on another information subsystem. Still, the problem of interpretation of the "mental wave function" is complicated. In this paper, we do not keep to any fixed interpretation, while we are most sympathetic to the information interpretation. At the same time we are very cautious (maybe, too cautious) with respect to the use of the many worlds interpretation for quantum cognition, in spite of novel possibilities and yet unexplored ways.

# 2. From Psychology to Physics and Back

Reviewing a variety of definitions from dictionaries and encyclopedias, we believe that we can safely state the following. Physics is the science that deals with the properties of matter. Psychology is the science that deals with mental processes and behavior. In accordance with the views of Rene Descartes there are two basic types of substance, material and mental, and one is not reduced to the other<sup>1</sup> . Although during the last century physical reductionism captured the headlines in psychology, Descartes' ideology still penetrates the body of modern science. Naturally, physics and psychology are considered as different fields of science as they can be, each with its specific theoretical and experimental methodologies. It seems that there is nothing or very little in common between them. Most physics students would probably not like to spend their time studying psychology courses and vice versa. However, developments in physics and psychology are connected much stronger than one can imagine. We can point to a few big names who contributed to establishing a connection between the two most fundamental sciences (one about nature and the other about psyche): Hermann von Helmholtz (**Figure 1**), Sigmund Freud (**Figure 2**), Gustav Theodor Fechner, William James (**Figure 4**), Niels Bohr (**Figure 3**), Carl Jung, Wolfgang Pauli, Albert Einstein,....

Freud was strongly influenced by works of von Helmholtz on thermodynamics and especially on the energy conservation law<sup>2</sup> . He noted similarities between thermodynamics and the human psyche and developed a kind of mental thermodynamics known as psycho-dynamics [42, 43]. Freud actively used the notion of psychic energy (libido) and the law of its conservation. (Primarily libido represents the sexual energy. However, according to

FIGURE 2 | Sigmund Freud.

FIGURE 3 | Niels Bohr.

Freud, the sexual energy is one of the forms of the psychic energy which can be transformed into other forms.) At the first stage of his psycho-dynamical studies Freud was influenced by the ideas of Fechner: considering physical facts (related to human body) and mental facts as sides of one reality. Fechner concluded that both physical and mental phenomena has to be described by the same mathematical apparatus [65]. (This remark is very important for us as foretelling the main idea of this paper: behavior of both mind and matter nicely fits the framework of the mathematical formalism of quantum theory).

The notion of psychic energy played an important role in theorizing of James [38]. Following physicists (who at that time were already using the field theory) he started to operate with the notion of psychic field. This psychic field as well as a physical field can have different modes. This analogy led James [38] to the fundamental principle of complementary of information belonging to different modes of consciousness (the words of James are italicized):

<sup>1</sup> It is also a Buddhist dogma that life is comprised of mind and matter.

<sup>2</sup>Of course, when discussing this law we have to mention the works of Germain Henri Hess, James Prescott Joule, and Rudolf Clausius. But, for Freud, the influence of von Helmholtz's ideas was especially strong. He started his research in physiology under the supervision of Ernst Brucke who previously worked with Hermann von Helmholtz.

"It must be admitted, therefore that in certain persons, at least, the total possible consciousness may be split into parts which coexist but mutually ignore each other, and share the object of knowledge beteen them. More remarkable still, they are complementary. Give an object to one of the consciousnesses, and by this very act you remove it from the other or others. Barring a certain common fund of information , like the command of language, etc., what the upper self knows the under self is ignorant of, and vice versa."

Above we pointed to the "knowledge transfer" in one direction, from physics to psychology. However, the opposite also took place. In particular, the principle of complementarity was invented in quantum physics by Bohr under the strong influence of James' "Principles of Psychology" [38] (cf. the above citation with the principle of complementarity in QM).

Now we point to the famous correspondence between Pauli and Jung [66] on comparative analysis of foundations of physics and psychology. These letters were written in a free style of discussion between friends (and, in part, a patient and a psychoanalyst)<sup>3</sup> . This freedom allowed them to express (in psychoanalytic manner) many thoughts which would be never presented in formal scientific discussions and publications. From the letters it is clear that Jung was deeply influenced by quantum theory in Pauli's presentation; e.g., Jung wrote to Pauli:

"As the phenomenal world is an aggregate of the processes of atomic magnitude, it is naturally of the greatest importance to find out whether, and if so how, the photons (shall we say) enable us to gain a definite knowledge of the reality underlying the mediative energy processes Light and matter both behave like separate particles and also like waves. This ... obliged us to abandon, on the plane of atomic magnitudes, a causal description of nature in the ordinary space-time system, and in its place to set up invisible fields of probability in multidimensional spaces."

Inspired by acausal features of quantum mechanics, Jung developed his famous theory of synchronicity [67]; the theory about the experiences of two or more events as meaningfully related, where they are unlikely to be causally related (The subject sees it as a meaningful coincidence). The use by quantum physicists of "invisible fields of probability on multidimensional spaces" strongly supported Jung's interest in psychic fields, invisible, probabilistic, and defined not on the physical space time **R** 4 , but on some kind of "mental space," cf. [68]. This was a clue to unification of psychic and quantum physical fields in one psycho-physical field. The idea was very appealing to both Pauli and Jung and it was one of the topics of their correspondence. Jung also discussed field models with Einstein, and Einstein's attempts to create a unified pure field model of physical reality (see e.g., [69]), also supported Jung's studies on psychical fields. Finally, however, neither the Einstein dream about a purely field description of physical reality nor the Jung-Pauli dream about the unified (quantum) psycho-physical field found a rigorous mathematical realization.

Our discussion on mutual influence of physics and psychology can be shortly represented as the following (of course, incomplete) diagram:

[Hess, Joule, Clausius, and von Helmholtz] → [Freud, Fechner, James] → [Bohr] ↔ [Pauli] ↔ [Jung] ← [Einstein]...

# 3. Modeling of Cognition with Classical-nonclassical Logic vs. Classical-nonclassical Probability

Now we concentrate on problems in cognition (keeping in mind our ultimate goal—the quantum modeling in cognitive psychology). Recall that "cognition" usually treats psychological functions of an indvidual from the viewpoint of information processing. (Sometimes "cognition" is treated more tendentiously as the "science of mind"). We shall use mathematics as an instrument for linkage of cognition and physics.

#### 3.1. From Boolean Logic to Kolmogorovian Probability

In 19th century George Boole wrote the book "An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities" [48], see also [70]. This was the first mathematical model of the thinking process based on the laws of reasoning nowadays known as the Boolean logic. The role of Boolean logic in modern science is impossible to overestimate, it plays the crucial role in information theory, decision making, artificial intelligence, digital electronics. Boolean logic is the basic mathematical model of classical logic.

One of the most important features of Boolean logic is that it serves as the basis of the modern probability theory [49]: representation of events by sets, subsets of some set , the so-called sample space, or space of elementary events. The

<sup>3</sup>At the beginning Pauli wanted to discuss with Jung his psychical problems which might be a subject of psychoanalytic treatment. However, Jung smartly redirected Pauli to a young female psychoanalyst and the most part of Pauli-Jung correspondence is about psyche-physics inter-relation.

system of sets representing events, say F, allows operations of Boolean logics; F is the so-called σ-algebra of sets<sup>4</sup> . It is closed with respect to the (Boolean) operations of (countable) union, intersection, and complement (or in logical terms "and," "or," "no"). Thus, the first lesson for a physics student is that by applying any theorem of probability theory, e.g., the law of large numbers, one has to be aware that paradigm of Boolean logic is being used. The set-theoretic model of probability was presented by Kolmogorov in 1933 [49]; it is based on the following two natural (from the Boolean viewpoint) axioms:


We remind that a probabilistic measure p is a (countably) additive function on a <sup>σ</sup>-algebra <sup>F</sup>: <sup>p</sup>(∪<sup>∞</sup> <sup>j</sup> <sup>=</sup>Aj) <sup>=</sup> P∞ <sup>j</sup> <sup>=</sup> <sup>p</sup>(Aj) for <sup>A</sup><sup>j</sup> <sup>∈</sup> <sup>F</sup>, <sup>A</sup><sup>i</sup> <sup>∩</sup> <sup>A</sup><sup>j</sup> = ∅, <sup>i</sup> 6= <sup>j</sup>, which is valued in [0, 1] and normalized by 1. We also recall the definition of a random variable as a measurable function, <sup>a</sup> : <sup>→</sup> **<sup>R</sup>** 5 . In classical probability theory random variables represent observables.

Thus, the second lesson for a physics student is that probability is an axiomatic theory, as, e.g., geometry. (My experience of probabilistic discussions with physicists is that only a few of them understand this. Majority tries to treat probability heuristically, e.g., as frequency. This approach may work well in applied research, e.g., with experimental data. However, it may lead to paradoxic conclusions in foundational studies, as e.g., in the case of violation of Bell's inequality, see [53], for details)<sup>6</sup> .

#### 3.2. Formula of Total Probability, Bayesian Analysis

One of the basic laws of the Kolmogorovian model, the formula of total probability (FTP), will play very important role in our further considerations. Before addressing FTP, we point to an exceptional role which is played by conditional probability in the Kolmogorov model. This sort of probabilities is not derived in any way from "usual probability"; conditional probability is per definition given by the Bayes formula:

$$p(B|C) = p(B \cap C) / p(C), \; p(C) > 0. \tag{1}$$

By Kolmogorov's interpretation it is the probability of an event B to occur under the condition that an event C has occurred. One can immediately see that this formula is one of strongest exhibitions of the Boolean structure of the model; one cannot even assign conditional probability to an event without using the Boolean operation of intersection.

Let us consider a countable family of disjoint sets A<sup>k</sup> belonging to F such that their union is equal to and p(A<sup>k</sup> ) > 0, k = 1, .... Such a family is called a partition of the space .

**Theorem 1** Let {A<sup>k</sup> } be a partition. Then, for every set B <sup>∈</sup> <sup>F</sup>, the following formula of total probability holds

$$\rho(B) = \sum\_{k} p(A\_k) p(B|A\_k) \tag{2}$$

Especially interesting for us is the case where a partition is induced by a discrete random variable a taking values {α<sup>k</sup> }. Here, <sup>A</sup><sup>k</sup> == {<sup>ω</sup> <sup>∈</sup> : <sup>a</sup>(ω) <sup>=</sup> <sup>α</sup><sup>k</sup> }. Let b be another discrete random variable. It takes values {βj}. For any βj, we have

$$p(b=\beta\_{\dot{\jmath}}) = \sum\_{k} p(a=\alpha\_{k}) p(b=\beta\_{\dot{\jmath}}|a=\alpha\_{k}).\tag{3}$$

This formula plays a crucial role in classical decision theory: knowing probabilities of the a-variable and the corresponding conditional probabilities for the b-variable one can obtain the "total probability" for any value of the latter. We also point out that FTP is the cornerstone for the Bayesian procedure for probability updating which is also widely used in decision making.

#### 3.3. Probability-geometry: Comparison of Evolutions

To understand better the role of the axiomatic nature of the modern set-theoretic model of probability it is useful to make comparison with another axiomatic theory - geometry. We can learn a lot from history of development of geometry. Of course, the biggest name in geometry is Euclid. His axiomatics of geometry was considered as the only possible for about 2000 years. It became so common that people started to identify Euclidean model of geometry with physical space. In particular, Immanuel Kant presented deep philosophic arguments [71] that physical space is Euclidean. The Euclidean dogma was rejected as the result of internal mathematical activity, the study of a possibility of derivation of one of axioms from others. This axiom was the famous fifth postulate: given a line and a point not on the line, there is precisely one line parallel to the given one and containing the given point. Nikolay Ivanovich Lobachevsky was the first to understand that this postulate can be replaced with one of its negations. This led him to a new geometric axiomatics, the model which nowadays is known as Lobachevsky geometry (or hyperbolic geometry). Thus, the Euclidean geometry started to be treated as just one of possible models of geometry. This discovery revolutionized, first, mathematics (with contributions of Gauss, Bolyai, and especially Riemann) and then physics (Minkowski, Einstein, Hilbert).

This geometry lesson tells us that there is no reason to expect that the Kolmogorovian model is the only possible axiomatic model of probability. One can expect that by modifying the Kolmogorovian axioms in the same spirit as Lobachevsky modified the Euclidean axiomes, mathematicians could create non-Kolmogorovian models of probability which may be useful for various applications, in particular in physics. However, in the case

<sup>4</sup>Here the symbol σ encodes "countable." In American terminology such systems of subsets are called σ-fields.

<sup>5</sup>Here measurability has the following meaning. The set of real numbers **R** is endowed with the Borel σ-algebra B: the minimal σ-algebra containing all open and closed intervals. Then for any <sup>A</sup> <sup>∈</sup> <sup>B</sup> its inverse image <sup>a</sup> −1 (A) <sup>∈</sup> <sup>F</sup>. This gives a possibility to define on B the probability distribution of a random variable, pa(A) = p(a −1 (A)).

<sup>6</sup> I see a big problem in the absence of mathematically advanced courses in probability theory for physics students. It seems that education in physics suffers from this problem throughout the world.

of probability the historical pathway of development of geometry was not repeated. Mathematicians did not have 2000 years to rethink the Kolmogorovian axiomatics...

#### 3.4. Non-Kolmogorovian Nature of Quantum Probability; No-go Theorems

New physics, QM, intervened brutally in the mathematical kingdom. The probabilistic structure of QM did not match classical probability theory based on the set-theoretic approach of Kolmogorov. At the first stage of development of QM this mismatching was not so visible. The first sign can be seen in Born's rule:

$$p(\mathbf{x}) = |\psi(\mathbf{x})|^2,\tag{4}$$

where ψ(x) is the wave function and p(x) is the probability to detect a particle at point x. The wave function is primary here, not the probability. What is encoded in these complex amplitudes pre-existing behind probabilities obtained in quantum measurements? One of the most evident consequences of Equation (4) is violation of the formula of total probability (FTP), one of the basic laws of classical probability theory, see Section 4 for details. In the two slit experiment constructive and destructive interference of the wave functions corresponding to passing through different slits is probabilistically represented as violation of FTP, so to say, interference of probabilities. (Moreover, in QM only such interference of probabilities can be observed, nothing closer to probability amplitudes, since "quantum waves" are not directly approachable).

John von Neumann was the first to pay attention to the peculiar probabilistic structure of QM as compared to the probabilistic structure of classical statistical mechanics [1]. In particular, he generalized Born's rule to quantum observables represented by Hermitian operators. For an observable represented by an operator with purely discrete spectrum, the probability to obtain the value λ as the result of measurement is given as

$$p(\lambda) = \|P\_{\lambda}\psi\|^2,\tag{5}$$

where P<sup>λ</sup> is the projector corresponding to the eigenvalue λ. (Here A = P λ λPλ).

In his seminal book [1] von Neumann pointed out that, opposite to classical statistical mechanics where randomness of the results of measurements is a consequence of variability of physical parameters such as, e.g., the position and momentum of a classical particle, in QM the assumption about the existence of such parameters (for the moment, probably, still hidden and unapproachable by the existing measurement devices) leads to a contradiction. This statement presented in Von Neumann [1] is known as von Neumann no-go theorem, theorem about impossibility to go beyond the description of quantum phenomena based on quantum states: it is impossible to construct a theoretical model providing a finer description of those phenomena than given by QM<sup>7</sup> . Thus, von Neumann was sure that it is impossible to construct a classical probability measure on the space of some hidden variables which would reproduce probabilities obtained in quantum measurements. Later this statement was confirmed by other no-go theorems, e.g., of Kochen and Specker [51] and Bell [52].

#### 3.5. Quantum Logic

These "theorems" are consequences of the mathematical structure of QM. While classical probability theory is based on the set-theoretical description, QM is founded on the premise that events are associated with subspaces (or orthogonal projectors on these subspaces) of a vector space, complex Hilbert space. The adoption of subspaces as the basis for predicting events also entails a new logic, the logic of subspaces (projectors), which relaxes some of the axioms of classical Boolean logic (e.g., commutativity and distributivity).

First time this viewpoint that QM is based on a new type of logic, quantum logic, was expressed in the book of Von Neumann [1], where he treated projectors corresponding to the eigenvalues of quantum observables (represented by Hermitian operators) as propositions (see also [72]). The explicit formulation of logic of QM as a special quantum logic is based on the lattice [73] of all orthogonal projectors. For reader's convenience, below we present the mathematical structure of quantum logic (see [74], for details). However, in principle one can jump directly to Section 3.6.

#### 3.5.1. Logical Operations on for Projectors

For an orthogonal projector P, we set H<sup>P</sup> = P(H), its image, and vice versa, for subspace L of H, the corresponding orthogonal projector is denoted by the symbol PL.

The set of orthogonal projectors is a lattice with the order structure: P ≤ Q iff H<sup>P</sup> ⊂ H<sup>Q</sup> or equivalently, for any ψ ∈ H, hψ|Pψi ≤ hψ|Qψi.

We recall that the lattice of projectors is endowed with operations "and" (∧) and "or" (∨). For two projectors P1, P2, the projector R = P<sup>1</sup> ∧ P<sup>2</sup> is defined as projector onto the subspace H<sup>R</sup> = HP<sup>1</sup> ∩ HP<sup>2</sup> and the projector S = P<sup>1</sup> ∨ P<sup>2</sup> is defined as projector onto the subspace H<sup>R</sup> defined as the minimal linear subspace containing the set-theoretic union HP<sup>1</sup> ∪ HP<sup>2</sup> of subspaces HP<sup>1</sup> , HP<sup>2</sup> : this is the space of all linear combinations of vectors belonging these subspaces. The operation of negation is defined as the orthogonal complement: P <sup>⊥</sup> = {<sup>y</sup> <sup>∈</sup> <sup>H</sup> : <sup>h</sup>y|xi = 0 for all x ∈ HP}.

In the language of subspaces the operation "and" coincides with the usual set-theoretic intersection, but the operations "or" and "not" are non-trivial deformations of the corresponding set-theoretic operations. It is natural to expect that such deformations can induce deviations from classical Boolean logic.

Consider the following simple example. Let H be two dimensional Hilbert space with the orthonormal basis (e1, e2) and let v = (e<sup>1</sup> + e2)/ √ 2. Then P<sup>v</sup> ∧ Pe<sup>1</sup> = 0 and P<sup>v</sup> ∧ Pe<sup>2</sup> = 0, but P<sup>v</sup> ∧ (Pe<sup>1</sup> ∨ Pe<sup>2</sup> ) = Pv. Hence, for quantum events, in general the distributivity law is violated:

<sup>7</sup>This theorem was criticized for unphysical assumptions used by von Neumann to approach his no-go conclusion; especially strong critique was from the side of

Bell [52], the author of another famous no-go theorem; calmer critical arguments were presented by Ballentine [2]. (We also remark that, although in the modern literature the von Neumann statement is called "theorem," in the German edition it was called an "ansatz").

$$P \land \{P\_1 \lor P\_2\} \neq \{P \land P\_1\} \lor \{P \land P\_2\} \tag{6}$$

As can be seen from our example, even mutual orthogonality of the events P<sup>1</sup> and P<sup>2</sup> does not help to save the Boolean laws<sup>8</sup> .

We remark that for commuting projectors quantum logical operations have the Boolean structure. Thus, non-commutativity can be considered as algebraic representation of non-classicality of quantum logic. In particular, for a single observable (with purely discrete spectrum) A = P λ λPλ, projectors corresponding to different eigenvalues are orthogonal and, hence, commutative. Therefore, deviations from classical logic and probability can be found only through analysis of results of a few incompatible measurements.

The idea that cognition and quantumness have something in common has been discussed during last 80 years, starting with the philosophic studies of Alfred North Whitehead.

#### 3.6. Toward Quantum Modeling of Cognition

As we have seen, quantum logic relaxes some of the axioms of classical Boolean logic, e.g., commutativity and distributivity. Human judgments are not always commutative (order effects are pervasive) and often violate the probabilistic implications of the distributive axiom. The principles of QM resonate with deeply rooted psychological intuitions and conceptions about human cognition and decision. Therefore, it is natural to try to use the mathematical apparatus, developed to describe the aforementioned quantum deformations of Boolean logics, to model cognition and, in particular, to apply quantum measurement theory to model decision making. Also, the mathematical apparatus of QM is actively applied to probabilistic problems of psychology, cognitive science, social science, economics, and finances (see e.g., the monographs [4–7]).

We remark that non-commutativity of incompatible observables can be considered as the algebraic representation of the principle of complementarity. Thus, the loop in the inter-relation of physics and psychology was finally closed: complementarity came back to psychology, but in the advanced mathematical form.

We remark that in QM probabilities can only be expressed through elements of quantum logics, see Equation (5). Thus, non-classicality is a statistical effect. In the same way nonclassicality of human reasoning can be observed only as a statistical effect. In fact, such an effect has been well known in psychology for long, but it was interpreted as irrational behavior of people which was statistically exhibited in the form of various probability fallacies. Their role (both in psychology and economics) was emphasized in the influential Tversky (over 30,000 citations), Kahneman (Nobel prize in economics) research program [77]: the conjunction and disjunction fallacies, order effects in decisions, over- and under- extension errors in conceptual combinations, and ambiguous concepts [78, 79].

In author's works [4, 10] it was pointed out that violation of FTP can serve as a statistical test of non-classicality of data generated by both physical and cognitive phenomena. The coefficient of interference expressed in the probabilistic terms, see Equation (9) Section 4 can be interpreted as quantitative measure of non-classicality (non-Kolmogorovness). These papers deal with an important case of dichotomous observables of the inverse Born problem: a complex probability amplitude ψ is reconstructed with the aid of the interference coefficient, see Appendix 1 in Supplementary Material for a detailed presentation. This constructive wave function method is especially important for cognitive applications. In QM the space geometry is often used to construct the corresponding wave functions, e.g., for a free particle with a fixed momentum p, ψ(x) = e ixp; generally one can use the Schrödinger equation in **R** <sup>3</sup> with a potential V(x) and initial and boundary conditions. The main problem of the quantum cognition project is that a proper notion of mental space has not yet been elaborated (cf. [68]). We cannot directly use physics methods, such as introducing functions (e.g., energy) on physical space. A possibility to construct a "mental wave function" directly from data is properly justified. The author designed an algorithm for inversion of Born's rule, the so-called Quantum-like representation algorithm (QLRA) [4], see Section 8 for a few applications.

Author's article [10] served as the theoretical basis for a series of experiments on contextual effect (of Gestalt type) in recognition of ambiguous figures by Conte et al. [13, 14, 18], see Section 8.1 for brief presentation of these results. Analysis of obtained statistical data showed that classical FTP is violated and that the "belief state" of students participated in the experiment can be described by a complex amplitude ψ and observables by non-commutative Hermitian operators.

Busemeyer et al. performed extended studies [5, 12, 16, 25– 27, 32–34], see also the monograph of Busemeyer and Bruza [5], on violation of FTP for well-known data on probability fallacies obtained in experiments by Shafir and Tversky, Hofstader, Grosson and other cognitive psychologists [80–84]. It was shown that such data can be modeled with the aid of the mathematical formalism of QM [5]. Besides, Busemeyer et al. lauched the project on quantum(-like) decision making; see also the pioneer work of Aerts and Aerts [8], the paper of Phothos and Busemeyer [16] and the series of works of Asano et al. [23, 24, 30, 31, 35].

# 4. Violation of Formula of Total Probability and Non-Kolmogorov Probability Theory

The two slit experiment is the basic example demonstrating that QM describes statistical properties of microscopic phenomena, to which the classical probability theory seems to be not applicable (see e.g., Feynman and Hibbs [50]). In this section, we consider the experiment with the symmetric setting: the slits are located symmetrically with respect to the source of photons, **Figure 5**. Consider a pair of random variables a and b. We select a as the

<sup>8</sup>At first glance, representation of events by projectors/linear subspaces may look exotic because of the very common use of the set-theoretic representation of events in the modern classical probability theory. We want to fight this prejudice and support the view that alternatives are possible and sometimes desirable. The tradition to represent events by subsets was firmly established by Kolmogorov [49] only in 1933. We remark that before him the basic classical probabilistic models were not of the set-theoretic nature. For example, the main competitor of the Kolmogorov model, the von Mises frequency model [75], was based on the notion of a collective (see [76], for formulation of QM on the basis of the von Mises model).

slit variable, i.e., a = 0 (the photon passes through the upper slit), a = 1 (the photon passes through the lower slit), see **Figure 5**, and b as the position on the photo-sensitive plate, see **Figure 5**. Remark that the b-variable has the continuous range of values, the position x on the photo-sensitive plate.

For the experimental context with both slits open, see **Figure 6**, by Born's rule Equation (4) the probability that a photon is detected at position x on the photo-sensitive plate is represented as

$$p(b=\mathbf{x}) = \left| \frac{1}{\sqrt{2}} \psi\_0(\mathbf{x}) + \frac{1}{\sqrt{2}} \psi\_1(\mathbf{x}) \right|^2 = \frac{1}{2} \left| \psi\_0(\mathbf{x}) \right|^2$$

$$+ \frac{1}{2} \left| \psi\_1(\mathbf{x}) \right|^2 + \left| \psi\_0(\mathbf{x}) \right| \left| \psi\_1(\mathbf{x}) \right| \cos \theta,\qquad(7)$$

where ψ<sup>0</sup> and ψ<sup>1</sup> are two wave functions, whose absolute values <sup>ψ</sup>i(x) 2 give the distributions of photons passing through the slit i = 0, 1, respectively, see **Figures 7**, **8** The term

$$\delta(\mathfrak{x}) = \left| \psi\_0(\mathfrak{x}) \right| \left| \psi\_1(\mathfrak{x}) \right| \cos \theta$$

represents quantitively the interference effect of two wave functions. Let us denote <sup>ψ</sup>i(x) 2 by p(b = x|a = i), then Equation (7) is represented as

$$p(b=x) = p(a=0)p(b=x|a=0) + p(a=1)p(b=x|a=1)$$

where the "interference term" δ has the form:

$$\delta(\mathbf{x}) = \begin{array}{c} 2\sqrt{p(a=0)p(b=\mathbf{x}|a=0)p(a=1)}\\ \times p(b=\mathbf{x}|a=1)\cos\theta. \end{array} \tag{9}$$

Here the values of probabilities p(a = 0) and p(a = 1) are equal to 1/2, since we consider the symmetric setting. For a general experimental setting, p(a = 0) and p(a = 1) can be taken as the arbitrary non-negative values satisfying p(a = 0) + p(a = 1) = 1. In the above form, the classical probability law—FTP, see Equation (3),

$$p(b=\mathbf{x}) = \sum\_{i} p(a=i)p(b=\mathbf{x}|a=i),\tag{10}$$

is violated, and the interference term Equation (9) quantifies the violation. The additional interference term appears not

only in the two slit experiment, but in any experiment with arbitrary incompatible quantum observables represented by noncommuting Hermitian operators <sup>A</sup>, <sup>B</sup> : [A, <sup>B</sup>] 6= 0 (see [53], for details).

Now consider two random variables of any origin, from physics, cognitive science, biology, sociology. Let FTP be violated. Of course, for a classical probabilist this is impossible, but plenty of such data exist, see Section 3.6. Here p(b = x) 6= P i p(a = i)p(b = x|a = i), i.e., a kind of (probabilistic) interference term appears:

$$\delta(\mathbf{x}) = p(b=\mathbf{x}) - \sum\_{i} p(a=i)p(b=\mathbf{x}|a=i),\qquad(11)$$

The point is that we cannot use the Kolmogorov probability model. For example, psychologists can look for special psychological explanations of such strange data, e.g., altruism. However, such a psychological "resolution" does not change the mathematical problem: how to describe such data mathematically? The previous analysis of quantum measurements of the interference type (more generally of pairs of incompatible quantum observables) demonstrated that the appearance of the interference type term matches the predictions of quantum probability theory, where probabilities are based on complex probability amplitudes. Therefore, it is natural to use this non-classical probability theory to model phenomena generating data with non-trivial interference terms which violate FTP. This was one of the starting points for quantum probability theory to impact mathematical modeling of cognition [4, 5, 10, 12].

We remark that (Equation 11) can be (tautologically) rewritten in the form similar to the formula for quantum interference (Equation 8) and the interference term can be always represented similarly: Equation (9):

$$\delta(\mathbf{x}) = 2\lambda(\mathbf{x})\sqrt{p(a=0)p(b=\mathbf{x}|a=0)p(a=1)p(b=\mathbf{x}|a=1)}.\tag{12}$$

The only difference is that for arbitrary data we cannot guarantee that |λ(x)| ≤ 1. Thus, for arbitrary statistical data, we have FTP with the interference term:

$$\begin{split} p(b=\mathbf{x}) &= \sum\_{i} p(a=i)p(b=\mathbf{x}|a=i) \\ &+ \ 2\lambda(\mathbf{x})\sqrt{p(a=0)p(b=\mathbf{x}|a=0)p(a=1)p(b=\mathbf{x}|a=1)}. \end{split} \tag{13}$$

## 5. Savage Sure Thing Principle, Disjunction Effect

**STP** [85] If you prefer prospect B<sup>0</sup> to prospect B<sup>1</sup> if a possible future event A happens, and you prefer prospect B<sup>0</sup> still if future event A does not happen, then you should prefer prospect B<sup>0</sup> despite having no knowledge of whether or not event A will happen.

Savage's illustration refers to a person deciding whether or not to buy a certain property shortly before a presidential election, the outcome of which could radically affect the property market. "Seeing that he would buy in either event, he decides that he should buy, even though he does not know which event will obtain," [85], p. 21.

The crucial point is that the decision maker is assumed to be rational. Thus, the sure thing principle was used as one of foundations of rational decision making and rationality in general. It plays an important role in economics in the framework of Savage's utility theory. Mathematically Savage's STP is a simple consequence of FTP. Thus, this principle, widely used in economics, is mathematically based on the classical probability (and Boolean logic). In particular, the Bayes formula for conditional probabilities (Equation 1) plays the crucial role. Therefore, rationality determined by this principle is Bayesian rationality.

Experimentally observed [80, 81] violations of STP were interpreted by Shafir and Tversky as a new effect, the disjunction effect (see also Hofstader [82, 83] and Croson [84]). STP was also confronted by a number of famous (in cognitive psychology, economics, and decision making) paradoxes, Ellsberg, Allais, and Simpson paradoxes [6].

As was discovered by professor of cognitive psychology Jerome Busemeyer, statistical exhibiting the disjunction effect can be treated as non-classical, violating FTP, and hence these data has to be described by some non-Kolmogorovian probability model, e.g., quantum probability. Detailed analysis of data collected in Shafir and Tversky [81] and Tversky and Shafir [80] experiments as well as experiments of other cognitive psychologists was performed by the author Khrennikov [4]: FTP is violated; the corresponding quantum representations were constructed. Below we consider the simplest experiment.

In Section 8.3 we produce the quantum-like representation for statistical data obtained in one of experiments on disjunction effect which was performed by Tversky and Shafir [80]. By using the constructive wave function approach and QLRA, see Section 3.6 and Appendix 1 Supplementary Material, we construct the representation of data with the aid of a complex probability amplitude, "belief state," "mental wave function," such that experimental probabilities (frequencies) are given by the Born rule.

# 6. The General Scheme of Representation of Measurements in Quantum Physics and Cognition

In this section we repeat the discussion [36] on similarity between the schemes of representation of measurements in quantum physics and cognition.

On a very general level, QM accounts for the probability distributions of measurement results using two kinds of entities, called observables A and states ψ (of the system on which the measurements are made). Let us assume that measurements are performed in a series of consecutive trials numbered 1, 2,.... In each trial t the experimenter decides what measurement to make (e.g., what question to ask), and this amounts to choosing an observable A. Despite its name, the latter is not an observable per se, in the colloquial sense of the word. Still, it is associated with a certain set of values, which are the possible results one can get when measuring A. In a psychological experiment these are the responses that a participant is allowed to give, such as Yes and No.

The probabilities of these outcomes in trial t (conditioned on all the previous measurements and their outcomes) are computed as some function of the observable A and of the state ψ (t) in which the system (a particle in quantum physics, or a participant in psychology) is at the beginning of trial t,

$$p(A = \nu \text{ in trial } t \mid \text{measurement in trials 1, } \dots, t - 1) = \newline \qquad F\left(\psi^{(t)}, A, \nu\right). \tag{14}$$

This measurement changes the state of the system, so that at the end of trial t the state is ψ (t+1) , generally different from ψ (t) . The change ψ (t) <sup>→</sup> <sup>ψ</sup> (t+1) depends on the observable A, the state ψ (t) , and the value v = v (A) observed in trial t,

$$
\psi^{(t+1)} = G\left(\psi^{(t)}, A, \nu\right). \tag{15}
$$

On this level of generality, a psychologist will easily recognize in Equations (14, 15) a probabilistic version of the time-honored Stimulus-Organism-Response (S-O-R) scheme for explaining behavior [86]. This scheme involves stimuli (corresponding to A), responses (corresponding to v), and internal states (corresponding to ψ). It does not matter whether one simply identifies A with a stimulus, or interprets A as a kind of internal representation thereof, while interpreting the stimulus itself as part of the measurement procedure (together with the instructions and experimental set-up, that are usually fixed for the entire sequence of trials). What is important is that the stimulus determines the observable A uniquely, so that if the same stimulus is presented in two different trials t and t ′ , one can assume that A is the same in both of them.

QM is characterized by linear representation of observables by Hermitian operators; pure states are represented by normalized vectors of complex Hilbert space H. Consider an observable which is mathematically represented by the Hermitian operator A with purely discrete spectrum: A = P v vPv, where P<sup>v</sup> is the projector onto the eigensubspace corresponding to the eigenvalue v. Then

$$p(A = \nu \text{ in trial } t \mid \text{measurement in trials } 1, \dots, t - 1) = \newline F\left(\psi^{(t)}, A, \nu\right) = \|P\_{\nu}\psi^{(t)}\|^2 \tag{16}$$

and

$$\psi^{(t+1)} = G\left(\psi^{(t)}, A, \nu\right) = \frac{P\_{\nu}\psi^{(t)}}{\|P\_{\nu}\psi^{(t)}\|}. \tag{17}$$

This state transform expresses the von Neumann-Lüders projection postulate of QM and represents the quantum state update as a back reaction on measurement.

Nowadays these transformations are actively used in psychology; for example, to describe the order effect [32].

# 7. Short Review on Various Directions of Research on Quantum Modeling of Cognition

As was emphasized in Khrennikov [4], some statistical data from psychology cannot be described by the standard von Neumann model in which observables are represented by Hermitian operators and state transformations (resulting from the back actions of measurements) by the von Neumann-Lüders projection postulate. As well as in quantum physics, one have to use generalized quantum observables represented by positive operator valued measures (POVMs) with corresponding state transformers [4, 36]. In quantum physics POVM-type observables naturally arise in the framework of theory of open quantum systems describing interaction of a quantum system with an environment; especially useful is the Markovian approximation in the form of the Gorini-Kossakowski-Sudarshan-Lindblad equation. This advanced formalism was widely applied to problems of cognition, in psychology, social and political sciences [23, 24, 30, 31, 87]. In this framework the process of decision making is represented as the process of interaction of a concrete psychological function with a mental environment: decision making as decoherence. This approach was used to model irrational behavior of players in games of the Prisoner's Dilemma type. In such games the rational behavior is associated with selection of the Nash equilibrium as the optimal strategy. However, there were found numerous experimental evidences that players can select strategies different from the Nash equilibrium [80, 81]. Such behaviors were modeled with the aid of theory of open quantum systems in a series of works of Asano et al. [23, 24, 30, 31].

As was already pointed out, no-go theorems play a crucial role in distinguishing classical and quantum probabilistic behaviors. In quantum physics the Bell-type inequalities are explored as experimental tests. In cognitive science the first experimental violation of a Bell-type inequality (in the form of the Wigner inequality) was reported in the article of Conte et al. [14, 18], see also [5]. In quantum physics the Leggett-Garg inequality was explored to test compatibility of macroscopic realism with QM. Harald Atmanspacher and Thomas Filk used this inequality [28] to study the problem of bistable perception (see also [88]).

Violations of the Bell-type inequalities can be coupled to the problem of contextuality, e.g., [53]. The contextual interpretation of the aforementioned results on violations of these inequalities in cognitive science and psychology is most natural. Cognition is irreducibly contextual. The contextual modeling of cognition was performed on the large scale in the monograph [4] in which a general contextual theory of probability was developed. Theory of contextual probability contains quantum probability as a special case. Recently Ehtibar Dzhafarov initiated extended studies on contextuality and Bell-type inequalities in psychology and psychophysics [89, 90].

# 8. Examples of Applications of the Mathematical Formalism of Quantum Theory

Here we present some examples of application the mathematical formalism of quantum theory to psychology and decision making.

#### 8.1. Recognition of Ambiguous Figures

Let us explain our experiment on recognition of ambiguous figures [13], see also [4], and its connection with Gestalt psychology.

It is well known that, starting in 1912, Gestalt psychology moved a devastating attack against the structuralism formulations of perception in psychology. The classical structuralism theory of perception was based on a reductionistic and mechanistic conception that was assumed to regulate the mechanism of perception. For any perception there exists a set of elementary defining features that are at the same time necessary (each of them) and jointly sufficient in order to characterize perception also in cases of more complex conditions. The Gestalt approach introduced instead a holistic new approach, showing that the whole perception behavior of complex images can never be reduced to the simple identification and sum of elementary defining features defined in the framework of our experience.

During the 1920s and 1930s Gestalt psychology dominated in the study of perception. Its aim was to identify the natural units of perception, explaining it in a revised picture of the manner in which the nervous system works. Gestalt psychology's main contributions have provided some understanding of the elements of perception through the systematic investigation of some fascinating features, such as the causes of optical illusions, the manner in which the space around an object is involved in the perception of the object itself, and, finally the manner in which ambiguity plays a role in the identification of the basic laws of the perception.

In particular, Gestalt psychology also made important contributions to the question of how it is that sometimes we see movements even though the object we are looking at is not really moving. As we know, when we look at something we never see just the thing we look at. We see it in relation to its surroundings (underlying context). An object is seen against its background. In each case we distinguish between the figure, the object or the shape, and the space surrounding it, which we call background or ground, see **Figures 9**, **10**, **11**.

The psychologist Rubin was the first to systematically investigate this phenomenon, and he found that it was possible to identify any well-marked area of the visual field as the figure, leaving the rest as the ground.

However, there are cases in which the figure and the ground may fluctuate and one is forced to consider the dark part as the figure and the light part as the ground, and vice versa, alternately.

Subjects of the experiment respond (recognize the image) based on subjective and context-dependent factors, and output of the experiment is principally probabilistic. The early work

FIGURE 9 | Ambiguity Figure 1A.

of Rubin, which observed the importance of the figure– ground relationship, marked the starting point from which Gestalt psychologists began to explain what today is known as the organizing principles of perception. A number of organizing or grouping principles emerged from such studies of ambiguous stimuli. Three identified principles may be expressed

as similarity, closure and proximity. Gestalt psychologists attempted to extend their work also at a more physiological level, postulating the existence of a strong connection between the sphere of the experience and the physiology of the system, by admitting the well-known principle of isomorphism. This principle establishes that the subjective experience of a human being and the corresponding nervous event have substantially the same structure.

In our experiment, we examined subjects by Tests a and b in order to test quantum-like behavior. For Tests a and b we used the ambiguity figures of **Figures 9**, **10** as they were widely employed in Gestalt studies:


Thus, the a-test is based on the following cognitive task: look at **Figure 9** and reply to question (a). The b-test is based on **Figure 10**: look a this figure and reply to question (b).

The reasons for using such ambiguity tests here for analyzing quantum-like behavior in perception may be summarized as it follows. First of all, the Gestalt approach was based on the fundamental acknowledgment of the importance of the context in the mechanism of perception. Quantum-like behavior also postulates this basic importance and role of the context in the evolution of the considered mechanism, see Section 4. Finally, we have seen that in ambiguity tests, the figure and the ground may fluctuate during the perception. Consequently, a nondeterministic (a quantum-like) behavior should be involved.

Ninety-eight medical students of University of Bari (Italy) were enrolled in this study, with about equal distribution of females and males, aged between 19 and 22 years, after giving their informed consent to participate in the experiment. In the first experiment a group of 53 students was subjected in part to Test b (presented with Test b only) and in part to Tests a and b (presented with Test a and soon after presented with Test b with prefixed time separation of about 2 s between the two tests). The same procedure was employed in the second and third experiments for groups of 24 and 21 students, respectively. All the students of each group were subjected to Test b or to Test a followed by Test b. The ambiguity figures of Test b or Test a followed by b appeared on a large screen for a time of only 3 s, and simultaneously the students were asked to mark on a previously prepared personal schedule their decision as to whether the figures were equal or not. Test a after Test b presentation had the objective of evaluating whether the perception of the first image (Test a) can alter the perception of the subsequent image (Test b). All the experiments were computer assisted and in each phase of the experiment the following probabilities were calculated:

$$\begin{aligned} p^b(+), p^b(-), &\ p^a(+), & p^a(-),\\ p(b=+|a=+), &\ p(b=-|a=+), & p(b=+|a=-),\\ & &\ p(b=-|a=-). \end{aligned}$$

Here the role of context, say C, is played by the selection procedure of a sample for the experiment. All probabilities depend on C.

A statistical analysis of the results was performed in order to ascertain whether coefficients of interference λ<sup>β</sup> are non-zero or zero in Tests b, a and b|a. The first experiment gave the following results

$$\begin{aligned} \text{Test } b: p^b(+) &= 0.6923; \ p^b(-) = 0.3077, \\ \text{Test } a: p^a(+) &= 0.9259; \ p^a(-) = 0.0741, \\ \text{Test } b | a: p(b = + | a = +) &= 0.68; \ p(b = - | a = +) = 0.32, \\ \text{ } p(b = + | a = -) &= 0.5; \ p(b = - | a = -) = 0.5. \end{aligned}$$

The calculation of conditional probability gave the following result with regard to p b (+):

$$p^a(+)p(b=+|a=+) + p^a(-)p(b=+|a=-) = 0.6666.\tag{19}$$

The second experimentation gave the following results:

$$\begin{aligned} \text{Test } b: p^b(+) &= 0.5714; \ p^b(-) = 0.4286, \\ \text{Test } a: p^a(+) &= 1.0000; \ p^a(-) = 0.0000, \\ \text{Test } b | a: p(b = + | a = +) &= 0.7000; \ p(b = - | a = +) = 0.3000, \end{aligned}$$

$$p(b=+|a=-)=1.0000; \; p(b=-|a=-)=0.0000. \quad \text{(20)}$$

The calculation of the conditional probability gave the following result with regard to p b (+):

$$p^a(+)p(b=+|a=+) + p^a(-)p(b=+|a=-) = 0.7.\quad \text{(21)}$$

Finally, the third experimentation gave the following results:

$$\begin{aligned} \text{Test } b: p^b(+) &= 0.4545; \ p^b(-) = 0.5455, \\ \text{Test } a: p^a(+) &= 0.7000; \ p^a(-) = 0.3000, \\ \text{Test } b | a: p(b = + | a = +) &= 0.4286; \ p(b = - | a = +) = \\ &= 0.5714; \end{aligned}$$

$$p(b=+|a=-)=1.0000, p(b=-|a=-)=0.0000.\tag{22}$$

The calculation of the conditional probability with regard to p b (+) gave the following result:

$$p^a(+)p(b=+|a=+) + p^a(-)p(b=+|a=-) = 0.6000.\tag{23}$$

Khrennikov Quantum-like modeling of cognition

The mean value ± SD of p b (+) resulted in p b (+) = 0.5727 ± 0.1189 in Test b and calculated using Equations (18), (20), and (22), while instead a mean value of 0.6556 ± 0.0509 resulted for p b (+) when calculated in Test b|a and thus using Equations (19), (21), and (23). The two calculated mean values are different and thus give evidence of quantum-like behavior of cognitive mental states as they were measured by testing mental observables by Tests b, a, and b|a. Student's t-test showed that the probability that the obtained differences between the two estimated values of p b (+) by Test b and by Test b|a are accidental, does not exceed 0.30. Thus, with probability 0.70 the coefficients of supplementarity are non-zero and, hence, students behave (think) in a quantum-like way (with respect to observables based on the ambiguous figures). We also found that these coefficients are bound by 1, so behavior is trigonometric, see Appendix 1 in Supplementary Material.

As the final step, we calculate cos θ<sup>β</sup> on the basis of the coefficient of interference λ<sup>β</sup> given by Equation (13) in Supplementary Material. In our experiments we obtained

$$
\cos \theta\_+ = -0.2285, \theta\_+ = 1.80131
$$

and

$$\cos \theta\_- = 0.0438, \theta\_- = 1.5270, 1$$

which are quite satisfactory phase results indicating quantumlike behavior for the investigated mental states.

The above results present a preliminary evidence of the existence of quantum-like behavior in the dynamics of some mental states. Luckily, we were able to capture mental conditions of subjects in which the context influenced decision making in an essential way. We have established equivalence between quantum-like entities and corresponding cognitive entities.

As the performed experiment suggests a quantum-like behavior of cognitive entities, a consequence could be that cognitive entities as well as quantum entities exhibit a highly contextual nature. In the same manner as quantum entities are influenced by the routine physical act of measurement, cognitive entities are influenced by the act of measurement (decision). In the case of cognitive entities, the measurement is characterized by cognitive interaction.

Mathematical modeling of the experiment considered above was beased on a behavioral similarity between cognitive and quantum-like entities, so we were able to make direct use of an abstract quantum-like formalism and apply it to cognitive entities. Moreover, we were able to account for quantum-like dynamics of the cognitive entities. The numerical results of the previous experiment give us an opportunity to delineate basic features of cognitive entities not known in the past. Let us outline this approach in more detail. We can introduce a complex quantum-like amplitude, which represents the state of our cognitive entity expressed in relation to some selected mental observables. Let us suppose that we selected the mental observable b, belonging to a given cognitive entity. Suppose also that b can assume only two possible values (b = +, −). This complex quantum-like amplitude can be produced by QLRA, Appendix 1 in Supplementary Material. The Born rule holds

$$|\psi(\pm)|^2 = p^b(\pm). \tag{24}$$

The complex quantum-like amplitude can represent the state of our cognitive entity in relation to the considered mental observable b.

The experiment indicates a methodological way for quantumlike processing of future experiments. We will briefly reconsider the case of the experiment we have performed, showing how to calculate quantum-like complex amplitudes and thus to give a quantum-like characterization of the state of the cognitive entity that was employed in the experiment. Let us consider in detail the model entities of our experiment. As we indicated previously, we managed to calculate two different values for cos θ(+) and cos θ(−), whose meaning is now clear. In our case, as we found above, cos θ<sup>+</sup> = −0.2285,θ<sup>+</sup> = 1.8013 and cos θ<sup>−</sup> = 0.0438,θ<sup>−</sup> = 1.5270, which nicely corresponds to quantum-like behavior of the investigated cognitive entity. As a final step, we present a detailed calculation of the quantum-like model of the mental state of the cognitive entity as characterized during the course of the experiment.

By using the obtained data, we can write a mental wave function ψ = ψ<sup>C</sup> of the mental state C of the group of students who participated in the experiment—corresponding to a mental context denoted by the same symbol C. QLRA, see Appendix 1 in Supplementary Material, produces

$$\begin{array}{rcl} \psi(\beta) &=& \sqrt{\not p(a=+)p(b=\beta|a=+)} \\ &+ e^{\not \vartheta(\beta)} \sqrt{p(a=-)p(b=\beta|a=-)}. \end{array} \tag{25}$$

The ψ is a function from the range of values {+, −} of the mental observable b to the field of complex numbers. Since b may assume only two values, such a function can be represented by two-dimensional vectors with complex coordinates. Our experimental data give

$$\begin{split} \psi(+) &= \sqrt{0.8753 \times 0.6029} \\ &+ e^{i\theta(+)} \sqrt{0.1247 \times 0.5} \approx 0.7193 + i0.2431 \end{split} \tag{26}$$

and

$$\begin{aligned} \psi(-) &= \sqrt{0.8753 \times 0.3971} \\ &+ e^{i\theta(-)} \sqrt{0.1247 \times 0.5} \approx 0.5999 + i0.2494. \end{aligned} \tag{27}$$

#### 8.2. Quantum Representation of Order Effect in Psychology

For example, in a typical opinion-polling experiment, a group of participants is asked one question at a time, e.g., A = "Is Bill Clinton honest and trustworthy?" and then B = "Is Al Gore honest and trustworthy?" or in the opposite order, B and then A[91]. The corresponding probability distributions, p(A = i, B = j) - "first the B-question with the result j and then the A-question with the result i" and p(B = j, A = i) - "first the A-question with the result i and then the B-question with the result j" do not coincide.

For classical probability theory this is a problem. Here the observables A and B have to be represented by functions A, B : <sup>→</sup> <sup>0</sup>, 1 (random variables). Set <sup>A</sup><sup>i</sup> = {<sup>ω</sup> <sup>∈</sup> : <sup>A</sup>(ω) <sup>=</sup> <sup>i</sup>}, <sup>B</sup><sup>j</sup> <sup>=</sup> {<sup>ω</sup> <sup>∈</sup> : <sup>B</sup>(ω) <sup>=</sup> <sup>j</sup>}. Then

$$p(A=i, B=j) = p(A\_i \cap B\_j) = p(B\_j \cap A\_i) = p(B=j, A=i). \tag{28}$$

The order effect is washed out as the result of commutativity of conjunction. For comparison with the quantum approach, it is useful to write the previous equality by using conditional probabilities:

$$p(A=i, B=j) = p(B=j)p(A=i|B=j)$$

$$= p(A=i)p(B=j|A=i) = p(B=j, A=i).\tag{29}$$

In the quantum model of the opinion poll, observables are represented by Hermitian operators, A = P i = 0,1 P iPi, B = j = 0,1 jQ<sup>j</sup> . Here

$$p(A=i, B=j) \equiv p(B=j)p(A=i|B=j),\tag{30}$$

$$\mathfrak{p}(B=j, A=i) \equiv \mathfrak{p}(A=i)\mathfrak{p}(B=j|A=i). \tag{31}$$

Opposite to Equation (29) which is a consequence of Equation (28), these are the definitions of the "sequential probabilities." Here the joint probability distribution is, in general, not well defined. Quantum conditional probability is defined as the probability with respect to the state obtained as the update of the initial state ψ after the first measurement (and crucially dependent on the first measurement result)

$$p(A=i|B=j) = \frac{\|P\_iQ\_j\psi\|^2}{\|Q\_j\psi\|^2}, \ p(B=j|A=i) = \frac{\|Q\_jP\_i\psi\|^2}{\|P\_i\psi\|^2}.$$

The order effect takes place if and only if kPiQjψk <sup>2</sup> 6= kQjPiψ<sup>k</sup> 2 , orh[Pi, Qj]ψ|ψi 6= 0. If the operators do not commute, then such a state ψ exists.

#### 8.3. "Hawaii Experiment"

Tversky and Shafir [80] considered the following psychological test demonstrating the disjunction effect. They showed that significantly more students report that they would purchase a non-refundable Hawaii vacation if they knew that they had passed or failed an important exam than report they would purchase if they did not know the outcome of the exam (So, a student is going to travel to Hawaii in any event, whether she passed exam or not, but only under the condition that she knows the result).

There can be introduced the following two variables; a = 1 (exam passed), a = 0 (exam failed) and b = 1 (go to Hawaii), b = 0 (not to go to Hawaii). The data [80] has the form:

p(b = 1) = 0.32 and hence p(b = 0) = 0.68 (these are the probabilities in the context of uncertainty). Then we also have p(a = 0) = p(a = 1) = 0.5. In the experiment 50% of students were informed that they passed/not passed the exam. The general structure of the experiment was the following. There were two groups of students; one was used for the unconditional measurement of the b-variable and generated the probabilities p(b = 0), p(b = 1) and the second group was used for the conditional measurement of <sup>b</sup>: under the conditions <sup>a</sup> <sup>=</sup> 1 or a = 0. The data collected in the second setting was

$$\begin{aligned} p(b=1|a=1) &= 0.54, & p(b=1|a=0) &= 0.57; \\ p(b=0|a=1) &= 0.46, & pb = 0|a=0 &= 0.43. \end{aligned}$$

The transition probabilities can be represented in the form of the following matrix: **P** <sup>b</sup>|<sup>a</sup> <sup>=</sup> 0.54 0.57 0.46 0.43 . These data violate FTP and the degree of violation is given by the coefficients of interference, see Equation (11): δ(1) = 0.17,δ(0) = −0.17. (We remark that always P x δ(x) = 0). These coefficients can be represented in the form Equation (9) (as for interference of wave functions in the two slit experiment) with θ<sup>1</sup> = 1.3,θ<sup>0</sup> = 2. For dichotomous variables, the data easily allow to reconstruct the quantum(-like) state and observables, by using the constructive wave function approach and QLRA, see Appendix 1 in Supplementary Material. We present the formula giving the "belief state" ψ of students in the basis of eigenvectors of the Hermitian operator B representing the bp observable, i.e., B = diag(0, 1). It has the form: ψ(x) = p(a = 0)p(b = x|a = 0) + e iθx p p(a = 1)p(b = x|a = 1). By inserting the values of probabilities and angles into this expression we obtain the vector with complex coordinates, x = 0, 1. The direct calculation shows that Born's rule Equation (4) holds, i.e., p(b = x) = |ψ(x)| 2 , x = 0, 1. Thus, statistical data from this cognitive psychology experiment can be mathematically represented with the aid of the quantum formalism.

#### 8.4. Categorization-decision Experiment

One of the most elucidating examples of quantum theory as applied to psychology is the experiment on interference of categorization in decision making. Statistical data collected in such experiments exhibits non-classical feature in the form of violation of FTP with high statistically significance. In particular, it is impossible model such data with the aid of classical Markov dynamics. Therefore, it is natural to proceed with the quantumlike model justifying violation of laws of classical probability theory. In coming presentation of this model we follow the paper [92].

Often decision makers need to make categorizations before choosing an action. For example, a military operator has to categorize an agent as an enemy before attacking with a drone. How does this overt report of the category affect the later decision? This paradigm was originally designed to test a Markov model of decision making that is popular in psychology [93]. Later it was adapted to investigate quantum-like interference effects in psychology [17, 92].

We begin by briefly summarizing the methods used in the experiments (see [92], for details) . On each trial of several hundred training trials, the participant is first shown a picture of a face that may belong to a "good guy" category (category G) or a "bad guy" category (category B), and they have to decide whether to "attack" (action A) or "withdraw" (action W). The trial ends with feedback indicating the category and appropriate action that was assigned to the face on that trial. There are many different faces, and each face is probabilistically assigned to a category, and the appropriate action is probabilistically dependent on the category assignment. Some of the faces are usually assigned to the "good guy" category, while other faces are usually assigned to the "bad guy" category. The category is important because participants are usually rewarded (win points worth money) for "attacking" faces assigned to "bad guys" and they are usually punished (lose points worth money) for "attacking" faces assigned to the "good guys;" likewise they are usually rewarded for "withdrawing" from "good guys" and punished for "withdrawing" from "bad guys." Participants are given ample training during which they learn to first categorize a face and then decide an action, and feedback is provided on both the category and the decision. Although the feedback given at the end of each trial is probabilistic, the optimal decision is to always "attack" when the face is usually assigned to a "bad guy" category, and always "withdraw" when the face is usually assigned to a "good guy" category. The key manipulation occurs during a transfer test phase which includes the standard "categorization– decision" (C-D) trials followed by either "category alone" (Calone) trials or "decision alone" (D-alone) trials. For example, on a "decision alone" trial, the person is shown a face, and simply decides to "attack" or "withdraw," and recieves feedback on the decision. The categorization of the face on the D-alone trials remains just as important to the decision as it is on C-D trials, and some implicit inference about the category is necessary before making the decision, but the person does not overtly report this implicit inference.

Note that the C-D condition in the psychology experiment allows the experimenter to observe which "path" the participant follows before reaching a final decision. This is analogous to a "double slit" physics experiment in which the experimenter observes which "path" a particle follows before reaching a final detector. In contrast, for the D-alone condition in the psychology experiment, the experimenter does not observe which "path" the decision maker follows before reaching a final decision. This is analogous to the "double slit" physics experiment in which the experimenter does not observe which "path" the particle follows before reaching a final detector<sup>9</sup> .

According to the Markov model proposed in Townsend et al. [93], for the D-alone condition, the person implicitly performs the same task as explicitly required by the C-D condition. More specifically, for the D-alone condition, once a face (denoted f) is presented, there is a probability that the person implicitly categorizes the face as a "good" or "bad" guy. From each category inference state, there is a probability of transiting to the "attack" or "withdraw" decision state. So the probablity of "attack" in the D-alone condition (denoted as p(A|f)) should equal the total probability of "attacking" in the C-D condition (denoted as pT(A|f)). The latter is defined by the probability that the person categorizes a face as a "good guy" and then "attacks" plus the probability that the person categorizes the face as a "bad guy" and then "attacks" (pT(A|f) = p(G ∩ A|f) + p(B ∩ A|f)). Using this categorization-decision paradigm, one can examine how the overt report of the category interferes with the subsequent decision. An interference effect of categorization on decision making occurs when the probability of "attacking" for D-alone trials differs from the total probability pooled across C-D trials. The Markov model for this task originally investigated by Townsend et al. [93] predicts that there should be no interference, and the law of total probability should be satisfied.

Beginning with our first study [17], we have conducted a series of four experiments on this paradigm (see 92, for review). All results of these experiments show similar results, but we briefly report a summary of findings from the fourth experiment that included 246 participants (a minimum 34 observations per person per condition). For a face more likely assigned to the "god guy" category (we denote these faces as g), the law of total probability is approximately satisfied (pT(A|g) = 0.36, p(A|g) = 0.37). However, for a face more often assigned to the "bad guy" category (we denote these faces as b), the probability of "attack" (i.e., the optimal decision with respect to the average payoff) is systematically greater for the D-alone condition as compared to the C-D condition" violating the law of total probability (p(A|b) <sup>=</sup> 0.62 <sup>&</sup>gt; <sup>p</sup>T(A|b) <sup>=</sup> 0.56)10. More surprising, the probability of "attack" for the D-alone condition (which leaves the "good" or "bad" guy category unresolved) was even greater than the probability of "attack" given that the person previously categorized the face as a "bad guy" (p(A|b) = 0.62 > p(A|b, B) = 0.61) on a C-D trial! For some reason, the overt categorization response interfered with the decision by reducing the tendency to "attack" faces that most likely belonged to the "bad guy" category. These violations of the law of total probability contradict the predictions of the Markov model proposed by Townsend et al. [93] for this task.

A detailed quantum-like model for the categorizationdecision task is presented in [17], and here we only present a brief summary following the paper [92]. The human decision system is represented by a unit length state vector |ψi belonging to 4-dimensional Hilbert space spanned by four basis vectors. (Here we use Dirac's symbolic notations, see Appendix 2 in Supplementary Material).

Each basis vector represents one of the four combinations of categories and actions (e.g., |GAi is a basis vector corresponding to category G and action A). The state ψf = ψGA |GAi + ψGW |GWi + ψBA |BAi + ψBW |BWi is prepared by the face stimulus f that is presented during the trial. The question about the category is represented by a pair of projectors for good and bad categories C<sup>G</sup> = |GAi hGA| + |GWi hGW| , C<sup>B</sup> = (I − CG). The question about the action is represented by a pair of projectors for attack and withdraw actions D<sup>A</sup> = UDC |GAi hGA| U † DC + UDC |BAi hBA| U † DC, D<sup>W</sup> = (I − DA), where UDC is a unitary operator of transformation from the categorization basis to the decision basis.

<sup>9</sup>We remark that here the picture of path is used only for illustrative purpose; therefore we placed path in quotation marks. In QM there is no such a concept as a "path" (trajectory) of a particle. We can only ascertain, and this is only statistically, a singular event of an electron "passing" through a slit. In fact this way of seeing the situation provides an even better parallel here.

<sup>10</sup>This difference is statistically significant: <sup>t</sup>(245) <sup>=</sup> 4.41, <sup>p</sup> <sup>=</sup> 0.0004. Also the same effect was replicated in 4 independent experiment

Following [92], we obtain that the probability of first categorizing the face as a "bad guy" and then "attacking" equals p(B, A|f) = p (B) · p (A|B) = <sup>C</sup><sup>B</sup> ψf 2 · <sup>k</sup>D<sup>A</sup> <sup>|</sup>ψBik<sup>2</sup> , with |ψBi = CB ψf <sup>C</sup><sup>B</sup> ψf , and combining the terms in the product we obtain p(B, A|f) = <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>B</sup> · ψf 2 ; similarly, the probability of first categorizing the face as a "good guy" and then "attacking" equals p(G, A|f) = <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>G</sup> · ψf 2 ; and so the total probability of attacking under the C-D condition equals <sup>p</sup>T(A|f) <sup>=</sup> <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>G</sup> · ψf 2 + <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>B</sup> · ψf 2 .

 The probability of attack in the D-alone condition equals [92] p(A|f) = <sup>D</sup><sup>A</sup> · ψf <sup>2</sup> <sup>=</sup> <sup>D</sup><sup>A</sup> · (C<sup>G</sup> <sup>+</sup> <sup>C</sup>B) ψf <sup>2</sup> <sup>=</sup> <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>G</sup> ψf + D<sup>A</sup> · C<sup>B</sup> ψf <sup>2</sup> <sup>=</sup> <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>G</sup> ψf <sup>2</sup> <sup>+</sup> <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>B</sup> ψf <sup>2</sup> <sup>+</sup> Int, where Int <sup>=</sup> <sup>2</sup> · Re - ψf |CGDACB|ψ<sup>f</sup> . If the projectors for categorization commute with the projectors for action (e.g., UDC = I), then the interference is zero, Int = 0, and we obtain p(A|f) = <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>G</sup> ψf 2 + <sup>D</sup><sup>A</sup> · <sup>C</sup><sup>B</sup> ψf 2 = pT A|f , and the law of total probability is satisfied. However, if the projectors do not commute (e.g., UDC 6= I), then we obtain an interference term. We can select the unitary operator UDC which produces an inner product Int = −0.06, and account for the observed violation of the law of total probability.

#### 9. Conclusion

We demonstrated that the mathematics developed to solve QM problems is highly suitable to solving particular problems

#### References


(and paradoxes) in psychology and social sciences in general [6], cf. with the views of G. T. Fechner, see Section 1. The reason is that psychologists, like quantum physicists, must work with contextualized probabilistic systems that are highly sensitive to measurement, as well as "entangled" systems that are strongly interconnected and difficult to decompose into separate and independent parts. Our point (we call it the quantumlike paradigm [4], see also [94, 95]) is that the mathematical formalisms of quantum theory are highly suitable for such complex systems.

This inspires us to say that Descartes' dualism between the two substances, material and mental, can be resolved through construction of the general mathematical model based on quantum information and probability and applicable both to physics and cognition.

#### Acknowledgments

I would like to thank I. Basieva for fruitful discussions on the structure of this paper and inter-relation between physics and cognition and J. Busemeyer and E. Dzhafarov for discussions on analogy between quantum measurement theory and Stimulus-Organism-Response (S-O-R) scheme [96].

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphy. 2015.00077


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Khrennikov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Physics of Teams: Interdependence, Measurable Entropy, and Computational Emotion

William F. Lawless \*

*Math & Psychology, Paine College, Augusta, GA, United States*

Most of the social sciences, including psychology, economics, and subjective social network theory, are modeled on the individual, leaving the field not only a-theoretical, but also inapplicable to a physics of hybrid teams, where hybrid refers to arbitrarily combining humans, machines, and robots into a team to perform a dedicated mission (e.g., military, business, entertainment) or to solve a targeted problem (e.g., with scientists, engineers, entrepreneurs). As a common social science practice, the ingredient at the heart of the social interaction, interdependence, is statistically removed prior to the replication of social experiments; but, as an analogy, statistically removing social interdependence to better study the individual is like statistically removing quantum effects as a complication to the study of the atom. Further, in applications of Shannon's information theory to teams, the effects of interdependence are minimized, but even there, interdependence is how classical information is transmitted. Consequently, numerous mistakes are made when applying non-interdependent models to policies, the law and regulations, impeding social welfare by failing to exploit the power of social interdependence. For example, adding redundancy to human teams is thought by subjective social network theorists to improve the efficiency of a network, easily contradicted by our finding that redundancy is strongly associated with corruption in non-free markets. Thus, built atop the individual, most of the social sciences, economics, and social network theory have little if anything to contribute to the engineering of hybrid teams. In defense of the social sciences, the mathematical physics of interdependence is elusive, non-intuitive and non-rational. However, by replacing determinism with bistable states, interdependence at the social level mirrors entanglement at the quantum level, suggesting the applicability of quantum tools for social science. We report how our quantum-like models capture some of the essential aspects of interdependence, a tool for the metrics of hybrid teams; as an example, we find additional support for our model of the solution to the open problem of team size. We also report on progress with the theory of computational emotion for hybrid teams, linking it qualitatively to the second law of thermodynamics. We conclude that the science of interdependence advances the science of hybrid teams.

#### Edited by:

*Emmanuel E. Haven, University of Leicester, United Kingdom*

#### Reviewed by:

*Ignazio Licata, ISEM- Institute for Scientific Methodology, Italy Nicolas Francisco Lori, LANEN, INCYT, INECO Foundation, Argentina*

> \*Correspondence: *William F. Lawless w.lawless@icloud.com*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *30 April 2016* Accepted: *05 July 2017* Published: *02 August 2017*

#### Citation:

*Lawless WF (2017) The Physics of Teams: Interdependence, Measurable Entropy, and Computational Emotion. Front. Phys. 5:30. doi: 10.3389/fphy.2017.00030*

Keywords: social reality, hybrid teams, Von Neumann entropy, interference, interdependence

# INTRODUCTION

One of the major conclusions from modern game theorists, based on findings in the laboratory, is that the societies that cooperate have better social welfare [[1], p. 7–8]. The evidence from the field, however, does not support this claim: Cooperation between competitors is often considered by the judiciary to be collusion [2]; consensus-seeking permits a minority faction to control a majority (e.g., in European Union politics, see [3], p. 29); and central decision-making promotes corruption [4]. Unexplained by traditional theory, misallocations of resources by corrupt activities abound across the globe [5]. In our research, we have concluded that corruption is more likely unchecked in countries, businesses and teams that impede the interdependence spontaneously arising among citizens in a nation with functional checks and balances; China is an example of the corruption that occurs from blocking interdependence (e.g., censorship), replaced by central decision-making [6]:

Much about the Hong'ao dump was not as it appeared on paper, a reconstruction of the disaster shows. The duplicity, involving doctored documents and false identities, illustrates systemic gaps in China's efforts to prevent industrial and transportation accidents, which claim tens of thousands of lives annually and have galvanized public anger over official corruption . . . like the deadly explosions last year at a toxic chemical storage site in Tianjin . . . the disaster in Shenzhen suggests that dark pools of mismanagement and corruption persist even in the most developed parts of the country.

Conceptually, interdependence has been known for some time. According to Smith's [7] "invisible hand," a service provided by one worker exploiting an opportunity is interdependent with another worker providing food, housing, transport, and on in an endless iteration of services across a market free to respond to market demands and signals by the movement of capital and labor sufficient to satisfy demand. But free movement is impeded by barriers established by centralized commands, decisions or procedures (e.g., Dodd-Frank rules in the USA), authoritarian governments (e.g., China), or violent gangs (e.g., Palestine's Hamas; Lebabon's Hezbollah; the US's Mara Salvatrucha).

What we know so far from this our work-in-progress is that reducing interdependence increases errors and the misallocation of resources [8]. We also know from the National Academy of Sciences ([9], p. 33) that while interdependence is important to effective teamwork, the size of a team for a given problem remains an open question. The Academy then contradicted itself by stating "many hands make light work," indicating its belief that redundancy has a positive effect on teams. Traditional models of subjective social network theory also predict that an increase in redundancy in social networks increases efficiency [10]. We approach team size with our quantum-like model of interdependence. By treating oil firms as teams [11], we theorized that the best size for teams is the least size possible that maintains interdependence across a team to solve a problem identified by a society when its labor and capital are free to move. We discovered that by overstaffing, redundancy reduces interdependence. In this paper, we extend our finding to the size of a nation's military.

Even in American bureaucracies, consensus-seeking, corruption, and mismanagement appear to go hand in hand (e.g., for a cover-up by the Veterans Affairs, see [12]; for unjustified rule-making by the US Treasury, see [13]; for Department of Energy guidance that citizen advisors should "strive for consensus," see [14]). As an example of the mismanagement associated with consensus-seeking (i.e., minority control; in Lawless [15]), DOE planned to vitrify high-level radioactive wastes into glass for its eventual geologic disposal starting in the 1980s at both its Hanford facility in Washington State and at its Savannah River Site in South Carolina. However, the consensus-seeking Hanford Citizens Advisory Board has not formally motivated DOE to accelerate its Hanford vitrification facility, a project plagued by gross mismanagement now delayed until 2033 [16]. Compare that with the majority-ruled Savannah River Site's Citizen Advisory Board that formally motivated SRS to start its vitrification facility in 1996 and has overseen its safe operations for the more than 20 years hence.

In the literature, Khrennikov [17] suggests now is the time to apply quantum-like models to address open questions in many fields; e.g., business, psychology, and social systems. Busemeyer and Wang ([18], p. 43) add that "Quantum cognition is an emerging field that uses mathematical principles of quantum theory to help formalize and understand cognitive systems and processes." Wang and Busemeyer [19] reintroduce the concept of complementarity to account for order effects; we use complementarity to account for the stable gap between physical (objective) measures of behavior and self-reported (subjective) observations of behavior (e.g., [20]), as well as for the different interpretations of reality common to individuals (e.g., present-day supporters of Einstein's views on quantum theory vs. Bohr's Copenhagen interpretation; in Lawless [2]), and the different skills held by members of a team, where each may have subjective interpretations and beliefs (e.g., in the search for justice, construing the courtroom as a team, prosecutors, and defense attorneys work together by pursuing different theories of a crime; [11, 21]).

The phenomenon that links these examples is the interdependence between behavior and its interpretations; interdependence between multiple interpretations of social reality; and the interdependence among members of a team multitasking to solve a problem. In its review of teams, the National Academy of Sciences repeatedly cited the presence of interdependence but without addressing the phenomenon theoretically [9]. In this study, we apply quantum-like models to the study of interdependence (e.g., [22], p. 147). From Wendt [23], "humans live in highly interdependent societies" (p. 150); interdependence, however, creates a measurement problem [2], which Wendt ([23], p. 67) describes as "the apparent impossibility of an objective measurement," and which we have linked to the behavior-cognition gap, for example, between the objective measures of behavior and the self-reported subjective accounts of behavior (e.g., [20]). Wendt ([23], p. 34) adds that a quantum-like model "offers the potential for revealing new social phenomena," which we demonstrate by determining the size of teams, heretofore an open problem ([9], p. 33).

In the 1940s, Von Neumann and Morgenstern's ([24], Section 4.8.2) theory of games introduced to generations of social scientists a mathematical model of static interdependence in a configuration of arbitrary rewards and punishments promoted as tradeoffs among the choices offered to players with values determined by scientists, not by social reality, producing decades of biased social and political policies from these toy models. Unlike Smith's [7] "invisible hand" or the physical sciences where "reality is not as it appears" to human observers [25], game theory and wide swaths of social science are based on, at best, simple observations of individuals and, at worst, self-reported observations ([26]; e.g., questionnaires, surveys, interviews). The value of actual behaviors vs. self-reports of constructs poorly correlate, if at all, with most of the variance between actual behavior and self-reported behavior unaccounted [20].

Bohr, the quantum physicist, criticized game theory on foundational grounds, leading [24] to decry that if Bohr was correct, how to proceed was "inconceivable" (p. 148). Generalizing from quantum theory, Bohr [27] conceived of humans as dual agents constituted of two independent but interdependent parts in the brain (e.g., motor control and vision; from Rees et al. [28]), viz., a human can serve to enact (objectively) a behavior or to observe (subjectively) a behavior; or a human can hold belief #1 (e.g., conservative) or opposing belief #2 (e.g., liberal), the degree of complementarity between these two parts affecting the tradeoffs common to making decisions in social reality [11], consequently creating a measurement problem long ignored by social scientists [23]; viz., game theory does not recognize the existence of a measurement problem in social reality. Specifically, when measuring a social object interdependent with another, both are affected (e.g., [8]). When playing games, as scientists feed choices to subjects to test preferences and responses, they avoid seeing this problem, one of the reasons that game theorists lament that the "evidence of mechanisms for the evolution of cooperation in laboratory experiments . . . [has not yet been found in] real-world field settings" ([29], p. 422). Later, Von Neumann ([30], p. 420, footnote 217) grappled with the social science implications of Bohr's ideas for the quantum interaction: "Bohr . . . was first to point out that the dual description which is necessitated by the formalism of the quantum mechanical description of nature is fully justified by the physical nature of things . . . [that] may be connected with the principle of psycho-physical parallelism."

Kelley [31], an eminent social psychologist who spent most of his career studying interdependence with games, finally abandoned the study of interdependence because he could never bridge the gap between the game matrices presented to a pair of players versus the "invisible" matrices subjects responded to during games; i.e., no matter how strongly held, the preferences self-reported by subjects before they participated in games were repeatedly contradicted by their choices made during games.

The inability of scientists to determine the value of the social interaction at the heart of games is mirrored across the social sciences by practitioners who base their theories on observations of the processes of how the best teams should operate, often with self-reported (subjective) surveys that tell us nothing new (e.g., the surveys and interviews of teams at Google; in Duhigg [32]). By infusing social science with the normative values that happen to agree with religious beliefs, presently, social science is, unfortunately, of no value in the engineering of hybrid teams. An exception of sorts is the report by the National Academy of Sciences on the value of interdependence to scientific teams [9]; but, by being non-mathematical, the Academy report offers no guidance to engineer hybrid teams.

In comparison to game theory and other traditional approaches to the study of interdependence in teams, we define interdependence as responsive or reactive to signals in nature between non-independent organisms (e.g., elk grazing in a forest with predators leads to healthier forest grasses; from Carroll [33]). Our physics of interdependence as mutual responsiveness is similar to that of entanglement, where the factors that produce interdependence cannot be factored, remaining opaque or invisible to even well-trained observers [2]. But although the effects of interdependence are often "invisible" to rational human observers [7], we recognize that humans manage or exploit it with the competition between at least two teams vying for the support of each team's ideological beliefs or skills before an audience of neutral individuals freely able to choose, thereby entangling them in one belief and then countered by its contrary belief as they process the information generated by the opposing sides of an argument (viz., a Nash equilibrium), exactly what dictators first seek to suppress [2].

When measuring states of interdependence, the measurement problem's "apparent impossibility of an objective measurement" ([23], p. 67) makes social reality non-deterministic. As an example, Cohen [34] reported that women with HIV partners voluntarily participated in the trial of a new drug designed to prevent HIV infection. Ninety-five percent of the women self-reported to medical staff that they had faithfully taken the medication, but, if true, because the infection spread to many of these women, the results indicated that the trial had failed. Inadvertently, the medical research team recalled that they had also collected blood samples from the women during the trial. Once investigated, researchers discovered that only about 26% of the women had actually taken the drug, saving the trial.

From the HIV example, if "quantum-like effects exist in the social world, expressed as interdependence" ([22], p. 147), the interdependence should produce a complementarity in socialpsychological systems that causes interference between the two factors of a human's physical behavior and its very different observation of behavior, a difference ignored by traditional social scientists' belief in the independence of these effects. The interpretations of observations by individuals and scientists are impacted by their beliefs, biases, and experience, producing, for example, interference illusions [35]. Unlike quantum systems where angles of separation between two beams of light produce replicable effects, and whereas we can reliably reproduce Adelson's interferences to create his checkerboard illusion, at this early stage of social application, much remains unknown and surprising as in the example above reported by Cohen [34]. It is likely the reason that wide swaths of social science have recently come under suspicion for being unable to be replicated [36]. The goal of our research project is thus to find a way to objectively study the interference between "behavior" and "observed behavior."

As another example of how interdependence makes social reality non-deterministic, consider self-esteem, one of the major foci for the clinical practice of psychology over the decades. In the book published by the American Psychological Association (APA), [37] began:

Although, relatively little is known about self-esteem, it is generally considered to be a highly favorable personal attribute, the consequences of which are assumed to be as robust as they are desirable. Books and chapters on mental hygiene and personality development consistently portray self-esteem as one of the premier elements in the highest levels of human functioning . . . Its general importance to a full spectrum of effective human behaviors remains virtually uncontested. We are not aware of a single article in the psychological literature that has identified or discussed any undesirable consequences that are assumed to be a result of realistic and healthy levels of personal self-regard.

Despite this bold claim by Bednar and Peterson under the imprimatur of the APA, a 30-year meta-analysis of all of the known experimental studies where self-esteem could be measured against actual physical performance for both academics and the workplace by Baumeister [38] found a negligible correlation, confirming that self-reports of self-esteem are unrelated to actual behavior.

As a result, we adopt the spirit imbued in game theory to model interdependence, but we reject game theory as fundamentally observational and a-theoretical. Instead, by using Von Neumann's model of quantum interference and Bohr, we review herein our advances: by taking limits, we derived a quantitative measure in the limit of what constitutes a perfect team, another for the worst team, and another we found as a relative metric of team performance modeled after Kullback– Leibler divergence where redundancy in teams is characterized by the divergence in team size from comparable free market teams [11]. Finally, we review our past research to lay the groundwork for a computational model of emotion in teams characterized as a phase shift between overstaffed and rightly-sized teams.

In his theory of self-replicating automata, Von Neumann [39] addressed energy costs and thermodynamics; Shannon information theory; an individual, traditional, and rational perspective of reality; stability (p. 70); errors (p. 71); parts of self-replicating automata (p. 74); the difficulty of choosing the parts of a self-replicating automata in the right order (p. 76); and common sense in assembling the parts (p. 77). In contrast [11], we use a phase shift in the production of maximum entropy to demarcate teams with good allocations of resources from those that misallocate; interdependence between ideologically opposed power centers reflected as a point of social stability that drives information processing (what we have named as Nash equilibria; e.g., Republican and Democrat political parties; defense and district attorneys; Einstein's and Bohr's view of quantum reality); and a metric for the assembly of teams measured by a decrease in structural entropy production. Regarding Nash equilibria, we exemplify them as checks and balances, the source for the best possible government. Contradicting the results of toy games by game theorists ([1], p. 7) and social scientists, Madison [40] established that good governance occurs where "Ambition must be made to counteract ambition."

In summary, briefly, our goal is to apply our findings to determine mathematically the performance of hybrid teams. Traditional, but normative, models centered around cooperation, while of value in the creation of stories or religious homilies, are of little value for the engineering of hybrid teams. By extending our research to team emotion, we hope to generalize our research where our most recent goal was to use hybrid team performance as a guide to minimize human error (e.g., [41]).

### REVIEW OF PRIOR RESEARCH. MATHEMATICAL PHYSICS

Martyushev [42] theorized that maximum entropy production (MEP) drove the evolution of systems. Wissner-Gross and Freer [43] added that intelligence maximizes the entropic force with Equation (1),

$$F(\mathbf{X}\_0) = T \nabla\_{\mathbf{x}} \mathbf{S}(\mathbf{X}) | \mathbf{X}\_0 \tag{1}$$

where F is the entropic force associated with macrostate **X**, T the temperature and S the Shannon entropy for state **X.** To apply Equation (1) to a social system, say with a team of scientists seeking to operate at MEP, we would expect a scientific team to use its intelligence to be able to devote its available energy to the fullest exploration of its chosen problem space in the search for a solution, but barriers encumber exploration, reducing MEP, motivating the need for intelligence to overcome barriers (e.g., bureaucracy; corruption; arbitrary rules; censorship; etc.). We conclude that teams use their collective intelligence to seek MEP to overcome barriers; e.g., to seek the path where multitasking applies the maximum effort to solve a difficult problem. Building on Wissner–Gross and Freer that barriers impede MEP, intelligence in a team is needed to navigate around or to overcome these barriers, helping top teams to better compete to succeed. If, as we hypothesize, redundancy acts as a barrier that increases destructive interference in a team, reducing the "force" in Equation (1), then overstaffing in a team is a barrier that frustrates the application of intelligence to decisions. As one of our steps, we will adopt a method that helps us to look for a sign of the collective effects of intelligence.

Our theory is that excluded spaces are governed by the politics in play operating in a social reality, with bistable interpretations of social reality determined by neutral supporters [44]. In contrast to our approach with bistability, others have suggested that stable beliefs could be implemented with epistemic logics, comprising a Hilbert space semantics of belief states that could lead to a formal derivation of social entanglement. Instead, we let the beliefs held by one subgroup attempting to force its interpretation of social reality on the whole group to be |0>, and its complementary, orthogonal view held by a second subgroup to be |1>, giving as the state (Equation 2) for the combined group:

$$|\Psi> = 1/\sqrt{2}(|0> + |1>);\tag{2}$$

the factors in Equation (2) of non-separable states [22] become separable by measurement [2], but the measurement problem ([23], p. 67) means that as we determine the state of one factor, we no longer have reliable information on the state of the other factor. If a state's subsystems are not separable, it is entangled; however, if a state is, or has been made, separable, it cannot be entangled [45, 46].

A social system that controls, stops or blocks the bistable interdependence in Equation (2) should be modeled by Shannon entropy. Pure states are product states, where S(ρ) = 0. Product states are uncorrelated; e.g., , where AB is the Hilbert space of a composite system [46]. The measurement of one subsystem in a composite, product state system has no affect on its second subsystem ([45], [[47], p. 61–3]). If the interdependence among skill sets does not exist in two or more subsystems, Shannon information governs, the data is iid<sup>1</sup> , no correlations exist, and joint entropy is likely greater than the contributions of subsystems, i.e.,

$$H(X,Y) \ge H(X), H(Y) \tag{3}$$

To reflect correlations caused by interference among the sources of information, unlike Shannon entropy, interdependence can be destructive or constructive, captured by Von Neumann's density matrix, ρ, with entropy depicted by Equation (4):

$$\mathcal{S} = -\text{Tr}(\rho \log \rho) \tag{4}$$

If a team is successful in producing a team with members who multitask together to form what appears to be a team with "single mind," its degrees of freedom (dof) go to 1 (from the equation for cognitive interdependence by Kenny et al. [48], p. 235), accounting for "invisibility," giving:

$$\text{S} = \lim\_{\text{dof} \to \sim 1} \log(\text{dof}) = 0 \tag{5}$$

Interestingly, Einstein was the first to discover the reduction in dof at the quantum level, a critical insight that he shared with Schrodinger ([49], p. 238–9). Like the "single mind" of a team, an example of constructive interference occurs with the melding of the brain into a single mind was given in an interview of Donald Hoffman [50], a cognitive scientist with an evolutionary perspective,

We have two hemispheres in our brain . . . [that form a] unified single mind. . . . But when you do a split-brain operation, a complete transection of the corpus callosum, you get clear evidence of two separate consciousnesses.

Interference may be constructive, as when the members of a team work well together. In contrast to Equation (3), to represent Hoffman's "unified single mind" and to further account for the invisibility of interdependence effects, we use subaddivity to get:

$$\mathcal{S}(\rho\_{\rm AB}) \preceq \mathcal{S}(\rho\_{\rm A}) + \mathcal{S}(\rho\_{\rm B}) \tag{6}$$

Working from Von Neumann's perspective, correlations in joint entropy can become greater or equal to their differences, reflected by Equation (7):

$$\mathcal{S}(\rho\_{\rm AB}) \succeq |\mathcal{S}(\rho\_{\rm A}) - \mathcal{S}(\rho\_{\rm B})| \tag{7}$$

Equation (7) implies that social groups engage in tradeoffs to choose the more fit members of a team, where the best fit is signified by a reduction in joint entropy. Shannon states for subadditivity in a composite system can also be expressed as: H x, y ≤ H (x) + H(y)([51], p. 515–6). From our perspective, subadditivity holds when subsystems are correlated, indicating that the components are interdependent with offsetting entropies, justifying our comments that teams need coaches to compensate for a team's invisible information as it multitasks. At the atomic level, the trace of a density matrix, ρ, isTr(ρ) = 1; for a pure state, ρ <sup>2</sup> <sup>=</sup> <sup>ρ</sup>(idempotent). IfTr ρ 2 = 1, ρis pure and |ψ>AB is separable; however, ifTr ρ 2 < 1, ρis mixed and |ψ>AB is entangled ([52], p. 207–8). The degree of mixing determines the departure from a pure state. Based on these considerations, we theorize that interdependence among teammates produces subadditivity, where interdependence specifically means a lack of separability.

Returning to Equation (2), if the two factions in a group, represented by the operators A and B, have reached a single consensus without compromise, the eigenvalues for the operators representing both factions in the group are the same ([[53], p. 256), giving:

$$[A,B] = AB \cdot BA = 0\tag{8}$$

But interference from social interaction may be destructive; e.g., the rupture of a sports team; a married couple undergoing a divorce; the splitting apart of a business striving to survive a market turndown, like the Maersk Conglomerate [54]. When a group with two factions holds opposed viewpoints, a gap occurs in the group's interpretations of (social, physical) reality. In social-psychological systems, if a binary operator fails to commute ([[53], p. 343], [55] and [56],), it may produce order effects (e.g., [19], p. 2), uncertainty or incompleteness [11], giving:

$$[A,B] = AB \text{ - } BA = iC \tag{9}$$

where C represents a gap, a quantum gap at the atomic level [27] and the incommensurable political gap at the social level, the latter relabeled by us as a Nash equilibrium [4]. This gap at the social level offers a rich, new view of social reality. When the gap is fully driven by both factions with no neutrals on either side, conflict becomes likely [44]. But when neutrals must be wooed by both sides to win a debate, the solicitation of neutrals compels a compromise for the two sides to reach a decision, magnifying the power of neutrals freely able to influence both sides of a debate by helping to avoid a rupture [57]. As a merger of ideas, a compromise satisfies the decision at hand in the heat of the moment, releasing the emotional energy pent up by both factions (emotions are discussed later), energy that had

<sup>1</sup> iid: independent and identically distributed random variables

been reserved to continue an intellectual battle associated with a decision under uncertainty. The self-interests of the two sides of a Nash equilibrium act as a quasi-team with neutrals to process information that serves to check (control) the ambitions of both sides [40].

As they form an audience, neutrals, we argue, are the only social element to enter into a superposition (Equation 2), driven into the superposition by the Nash equilibrium that acts like the two cylinders of an engine. As they are wooed to and fro, once neutrals are measured, the trail they leave behind forms limit cycles [4]. Other than the trail left behind, neither side fully grasps the social reality sufficiently well enough to control the neutrals, why dictators, gangs and command economies suppress Nash equilibria and free speech [4]. But in a free society, the result is multiple tradeoffs that a free society exploits to evolve [2], such as finding the optimum size of a team.

Wang and Busemeyer [19] use the concept of complementarity to account qualitatively for order effects; we use complementarity to account for the gap between behavior and self-reported observations of behavior (e.g., [20]), as well as the different interpretations of reality by members of opposing teams (e.g., present-day supporters of Einstein's views on quantum theory vs. Bohr's acausal Copenhagen interpretation; in Lawless [2] and Bohr [27]).

Cohen [58] revised Equation (10) in signal detection theory to arrive at transformations between Fourier pairs, concluding that a "narrow waveform yields a wide spectrum, and a wide waveform yields a narrow spectrum and that both the time waveform and frequency spectrum cannot be made arbitrarily small simultaneously" (p. 45), giving:

$$
\sigma\_{\mathbf{A}} \sigma\_{\mathbf{B}} \ge 1/2,\tag{10}
$$

where σ<sup>A</sup> is the standard deviation of variable A (often events), σ<sup>B</sup> for variable B (often the time for when events occur) both modeled in **Figure 1**.

In quantum theory, the uncertainty relation Equation (10) follows directly from the non-commutativity of Hilbert space operators (Equation 9). Similar relations appear for Fourier pairs in classical field theory as well. By itself, the application of Equation (10) to what follows for the action of teams (Equations 10, 11) can be criticized as a mere analogy and not formally motivated. However, a new discovery of redundancy or overstaffing among oil producers as teams coupled with another discovery (e.g., flawed DOE nuclear waste management teams of scientists and their managers with Equation 14 below; in Lawless [15]) add credibility to our formulation (e.g., [59]).

Based on our prior findings, when the goal of tradeoffs is to find the team members with the right skills for the best team fit, we begin to extend our findings with a revision of Equation (10) to:

$$
\sigma\_{\rm A} \sigma\_{\rm B} - \geqslant \sigma\_{\rm skills} \sigma\_{\rm interprodetations} \geq 1/2 \tag{11}
$$

Along with the claims of Smith [7] and Bohr [27], the struggles of Kelley [31] and the findings of Zell and Krizan [20], Equations (5, 11) help us to see that, based strictly on physics, reducing the standard deviation for skills (action) to improve teamwork increases the standard deviation for the interpretations (observations) of the performance of a team's members, accounting for the "invisible" loss of awareness. Accordingly, if the skills of a team approach perfection, the width of different interpretations widens, making it difficult to see what makes a team effective, motivating the need for a coach to improve the performance of a poorly performing team in the search for more successful outcomes for a team's

actions (e.g., efforts have recently begun at NSF to train teams of scientists to become better teams of scientists; e.g., [9, 60]).

To study the implications of Equation (10), we decompose a team into a (static) structure that directs its efforts, and its efforts at performing its mission (i.e., dynamic skill roles; actions based on those roles). Assume that the structure of a team is functioning perfectly, allowing the team to use an optimum amount of its available energy to solve the problems that the team was designed to address. Building on our prior success, but speculating, we convert Equation (11) into two components representing the least entropy production (LEP) for the structure of a team and maximum entropy (MEP) to perform a team's mission:

$$\sigma\_{\text{LEPPOMEP}} \ge 1/2 \tag{12}$$

Taking limits with the variables in Equation (12) gives us an equation that captures a team's excellence; i.e., as a team's consumption of energy by its structure goes to zero, it's ability to maximize its ability to problem-solve itself becomes a maximum:

$$S = \lim\_{\sigma \perp \to -\simeq 0} \sigma\_{\text{MEP}} = \infty \tag{13}$$

With Equation (13) in hand, by inverting it, an account is discovered for what happens when a team fails, splits apart, or implodes [2], giving

$$S = \lim\_{} \mathfrak{a}\_{\mathsf{oMEP} \to \succ 0} \mathfrak{a}\_{\mathsf{LEP}} = \infty \tag{14}$$

The teams represented by Equation (14) might be a couple undergoing divorce; a business team failing (e.g., Maersk Shipping; in Chopping [54]); or a team of scientists forced by managers to not follow rigid scientific practices, exactly what [61] was concerned about. Such an example of scientific malfeasance, driven by management, happened in 1983 with the Department of Energy (DOE) at its Savannah River Site (SRS) in Aiken, SC. Despite its many scientific claims to Congress that DOE waste management practices were safe and equivalent to commercial ones, the file photograph in **Figure 2** from SRS points out that from the 1950s until 1983, DOE's waste management practices permitted 90% of its military solid radioactive wastes to be disposed of in ordinary cardboard boxes, allowing these boxes to sit in open trenches exposed to the weather for months at a time, becoming one of the primary sources of radioactively contaminated groundwater across DOE's complex. Public awareness stopped DOE's use of cardboard boxes in 1985. After DOE had openly admitted its past errors and had begun to rectify them, during the cleanup in 2000, renewed public support for DOE accelerated the closure of the same radioactive waste burial ground at SRS [15].

#### MATERIALS AND METHODS

The National Academy of Sciences [9] concluded that the problem of team size was an open question, yet implicitly supported redundancy with their consensus speculation that "more hands make light work" (Ch. 1, p. 13). In contrast, to our examples of excluded volumes we add redundancy as a

FIGURE 2 | At DOE's Savannah River Site from the 1950s until 1985, DOE's waste management permitted 90% of its military solid radioactive wastes to be buried in ordinary cardboard boxes, allowing these boxes to sit in open trenches exposed to the weather for months at a time.

cause of poorly performing oil firm teams [11]. We had found that GDP/capita, our surrogate for the competitiveness of a nation's oil firm teams, was significantly related to its freedom index, less teamwork redundancy, and less redundancy in the number of employees per oil firm. Then with divergence for a distribution of oil firms compared to another for a comparable freedom index, we found a significant regression to indicate that worker redundancy decreased per unit of oil produced as the oil firms were freer to optimize their teams to deploy their capital and labor as they saw fit when drilling for oil. For example, Exxon's production with 15.5 employees/M BBL of oil compares to Sinopec's 124.6 employees/M BBL of oil produced, illustrating that redundancy creates inefficiency.

We first define our four factors: redundancy, economic freedom, military power, and corruption. These factors are mixed objective and subjective, meaning the results will include varying levels of subjectivity.

#### Redundancy

Redundancy is a quantity measured for interacting human autonomous systems and interfering with other autonomous systems [11]. Redundant are any number of mates that exceed the minimum number of members of a team designed to solve the problem assigned to a team; e.g., a baseball team with more than 9 members on a baseball field has redundant members by that many. In quantum theory, redundant copies of quantum states violate the no-cloning rule ([62], p. 77), and, we argue, interdependent states [11]; e.g., compare Sinopec's 124.6 employees/M BBL of oil produced with Exxon's 15.5, illustrating that authoritarian regimes creates inefficiency with redundancy.

#### Military Power Ranking

We used the ranking devised by Global Firepower (http:// www.globalfirepower.com). Its ranking is based on a nation's weapon diversity and conventional forces without relying on nuclear stockpiles. It includes geographic factors, logistics, natural resources, and industry.

#### Economic Freedom

An index established by the Heritage foundation based on four broad factors to measure liberty and free markets for 186 nations: rule of law; government size; regulatory efficiency; and open markets. Each factor has three sub-factors (http://www.heritage. org/index/ranking).

# Corruption

An index of nations established by Transparency International (https://www.transparency.org). Its factors determine the abuse of power for private gain, and whether the abuse is covert or concealed. The assessment is first to the branches of government, then the public sector, law enforcement, media, businesses, and then other factors.

We measure redundancy with divergence from a Kullback– Leibler-type equation for relative entropy, where DKL(Q||P) is the Kullback–Leibler's divergence of probability distribution Q from P:

$$D\_{\rm KL}(Q||P) = \sum\_{i} Q(i) \log(Q(i)/P(i))\tag{15}$$

The sum of Equation (15) reflects the divergence of distribution Qi , from distribution P<sup>i</sup> , with both distributions normalized. For example, log (P(i)/P(i)) = log 1 = 0. Thus, the more divergence, the larger the separation between two distributions. Based on Equation (15), assuming that a relatively perfect team is possible to solve the problem at hand, we also assume that some structures for desired teams may be closed-ended for a solved problem like those that exist for sports teams; e.g., for a baseball team, designated members take the role of pitcher, catcher, first baseman, etc. Unlike the relatively simple problem of designing a sports team, most business and scientific teams are open-ended whenever competition or innovation are factors. To solve this kind of a structural problem, in business, we look to an industry leader for the best team structure possible for the problem at hand.

To extend these findings to militaries, we hypothesize that redundancy is associated with less freedom in the marketplace and with more corruption. We test this hypothesis with correlations and Kullback–Leibler divergence. We expect that distributions in the real world range from minimum to maximum redundancy; from minimum to maximum freedom; and from minimum to maximum corruption. The nations used in this problem are footnoted below<sup>2</sup> **,** as is the data for each of them<sup>3</sup> .

Example:

As an example of the calculations with Equation (15), for China's Military Power Distribution (MPD, or PMPD), we divided its military power rank (3) by its population in billions (1.374) = 2.183 and the result we divided by 8.1, China's GDP/capita in thousands = 0.370; we summed this result for our top 22 nations = 82.28, which we divided into 0.370 to get the fraction for China, PChina =.0045. We repeated to calculate the Free Market Distribution (FMD) for China (59.4) by dividing by the sum (1296.9) for our Q1. Next we entered the calculations stepwise into Equation (15) to get for China the following calculation:

$$FMD^\*Ln(FMD/MPD) = 0.044^\* 
ln(0.044/0.0045) = 0.101...$$

In addition, as one of our methods, we will look for a sign of the collective effects of intelligence.

# RESULTS

For a pilot run, we used a convenience sample of 12 nations consisting of some of the largest militaries in the world<sup>4</sup> . We assumed that a country's military could be represented as a team. We compared military size with a country's economic freedom index and its corruption index. For this sample, we first calculated correlations to obtain the following results: As a country's economic freedom index increased positively, its military size per GDP decreased significantly (r = −0.78, p < 0.005) and its corruption level decreased significantly (r = −0.59, p < 0.05). We also found that economic freedom and corruption were inversely correlated significantly (r = −0.77, p < 0.025), indicating that an increase in freedom was associated with a decrease in corruption.

Heartened by these pilot results, we were ready to test our hypothesis with Equation (15). For Q1, we summed the result of FMD versus MPD to get 1.78. We repeated the process for Q<sup>2</sup> to get another distribution for corruption levels versus MPD for a sum of 1.95. Then we regressed the FMD results individually nation by nation versus MPD (Q1) with the divergence of

<sup>2</sup>Nations: the top 20 militaries in the world plus Cuba and North Korea were used: China, USA, Russia, Brazil, UK, India, France, Japan, Turkey, Germany, Italy, South Korea, Egypt, Pakistan, Indonesia, Israel, Vietnam, Poland, Taiwan, and Iran.

<sup>3</sup>For the actual study, we used the top 20 militaries in the world per capita (from http://www.globalfirepower.com) and GDP per capita from the IMF (https:// en.wikipedia.org/wiki/List\_of\_countries\_by\_GDP\_(nominal)\_per\_capita) versus national Free Market Economy ranking (http://www.heritage.org/index/ranking) and Transparency International's corruption index (https://www.transparency. org/news/feature/corruption\_perceptions\_index\_2016).

<sup>4</sup>For the pilot study, we used the following sample: USA, China, Cuba, North Korea, India, Israel, Iran, Japan, Mexico, Pakistan, Russia & Turkey; economic freedom index from 2016 http://www.heritage.org/index/ranking; b: the number of military personnel was derived from the 2014 edition of "The Military Balance" published annually by the International Institute for Strategic Studies, except for Cuba, North Korea and Pakistan, with data from http://www.tradingeconomics. com/; and the corruption index was from Transparency International at https:// en.wikipedia.org/wiki/Transparency\_International.

corruption from MPD (Q2) and plotted the result in **Figure 3**. The result is significant (R 2 , 0.926, p < 0.0001).

As a side issue, we also looked for signs of intelligence. We found it in our calculations. Consider an abstract from our data in column 4 of the **Table 1** below where we see a demarcation between authoritarian and democratically governed militaries. Although fuzzy, we argue that the results in the table's column 4 are signs of intelligence based on information processing; i.e., the military power rank per capita per gdp per capita for China is 0.370 vs. that for the USA of 0.017 and for the UK of 0.15, indicating greater protection per capita by the USA and UK compared to China, Russia, Brazil, and Cuba.

#### DISCUSSION

We had hypothesized and found in two separate distributions, one for the divergence of GDP for a country's Index of Economic Freedom with its military power ranking per capita, and the other for the divergence of a country's corruption index versus its military power ranking per capita, a significant regression. This indicates that, even with real-world data containing subjective estimations, redundancy increases the more authoritarian is a country's decision-making. As a corollary, the collective effects of intelligence in a society operate best under the freedom to allocate capital and labor for its best uses.

Our results for this study, also backed with correlations, support theory and justify our use of quantum-like models. We found less divergence with our hypothesis for military team size and economic freedom, but more divergence with military team size and corruption, indicating that National Defense improves under the collective effects of intelligent decisions at the level of the team in free markets. It means that a military is leaner and more effective under democracies that under autocracies.

We suspect that redundancy in the market of teams isolates excess teammates from interdependent effects, reducing responsiveness, and converting co-workers into featherbedders. Barriers, like authoritarian leadership and corruption, impede reaching MEP by intelligent teams. And, as we have found, redundancy increases under authoritarian governments, for the possible but corrupt political payoffs that may become necessary to keep civil peace. For example, corruption has stymied the reform of scientific practices in Russia [63] and the transformation of Russian businesses attempting to reduce redundancy [64].

Our model is different from the traditional model, specifically, the cognitive model. As a representative example of the influence of the cognitive model transported from social science to history in the hands of a popular historian<sup>5</sup> , Harari [65] concludes that human groups of no more than 150 can be held together, primarily with gossip, but that larger groups, like Peugeot SA, are "a figment of our collective imagination" (p. 29) based on shared stories, a social construct that forms the "imagined realities" of the cognitive model. But if Harari's account is true, the differences in team distributions between those domiciled in authoritarian

FIGURE 3 | In this figure, we regressed the divergence of freedom from a military distribution with the divergence of corruption from a military distribution. The result indicates a significant regression (*R* <sup>2</sup> <sup>=</sup> 0.926, *<sup>p</sup>* <sup>&</sup>lt; 0.0001). The nations used in this regression are listed in section Materials and Methods.

TABLE 1 | Data rounded off to three significant decimals.


*(Military power ranking and population from Global Firepower; Freedom Index from Heritage Foundation; GDP per capita from the International Monetary Fund; and Corruption Index from Transparency International, footnoted and defined above.)*

regimes versus democracies should be random. Nor would there be any path forward to build teams of machines or robots that could reasonably be expected to advance social welfare in any meaningful way. If Harari's perspective is true, the success of any one's story may be no more than a matter of taste, preference or culture, not a matter for physics or engineering.

That is not what we have found. Our results establish the meaningful differences that interdependent information plays in the interactions and affairs of humans under any and every form of government. Information constraints (barriers) under authoritarian regimes are less able to direct the movement of labor and capital to best solve targeted problems, an added constraint for innovation, one reason the Chinese rely on the theft of intellectual property (see the interview of General M. Hayden, the former CIA and NSA chief, by the editor-in-chief of the Wall Street Journal, [66]). Certainly, obstacles exist in democracies, especially when they become less free to allocate resources to solve the problems targeted (e.g., the Department of Energy's practices included cardboard boxes, seepage basins and other shortcuts to dispose of its radioactive wastes to save money that may eventually cost DOE well over two hundred billions of dollars to remediate its Hanford Site and its Savannah River Site; in Lawless [15]).

<sup>5</sup>His book is a New York Times Bestseller

Unlike Google's survey of teams [32], guided as it was by traditional social science, we have conjectured and found evidence that an improved theory of human behavior includes both cognitive (subjective surveys like Google's) and behavioral (physical) data which our quantum-like model handles well. By reporting that interdependence is a factor in the best scientific teams, the National Academy has made a nice corrective ([9]; also see [67]). While we agree with the Academy about the value of interdependence, it would have been better for it to have addressed the theoretical value of interdependence as we did with our quantum-like models to shift the focus from individuals to teams, to how teams disarm "imagined realities" to improve their, and their society's, situational awareness of reality, and to better justify the social tools that humans use to produce superior decisions (e.g., political debate). From our perspective, independent individuals or neutrals are valued as critical to the determination of the winners and losers in a contest where the uncertainty associated with an outcome is high and depends on the persuasion of an audience of neutrals (e.g., in the competition between two equally competent teams competing against each other in politics, in courtrooms, or for the philosophical meaning of quantum reality). The added benefit is that we can generalize these results to mathematical metrics for hybrid teams.

#### NEW WORK: EMOTION

In the HRI community, a lot of research with reinforcement learning (RL) is designed to assist in social interaction where "emotions obviously are important for social interaction" ([68], p. 29). For their research, RL agents require few assumptions, are easy to apply in all kinds of domains, and allow for learning. In contrast, our theory is designed to determine when teams are working well and when not.

In his magisterial review of the literature on emotion, supported by our theory, Zajonc [69] saw that emotion may be interpersonal rather than individual (p. 593), especially during communication (p. 604); emotion exists independently of cognition or is even disconnected from awareness (p. 607) and correlates poorly with self-reports (p. 612), supporting the concept of a mind-body duality (p. 596); habituation indicates a low level of emotion (p. 614); positive emotions lower temperature, T, negative ones increase T (p. 616); and deaf subjects respond more emotionally to spoken texts than normals (p. 619), an effect that, ceteris paribus, suggests expressing a skill is less evident to observers than its absence (p. 619).

In addition, a rise in T occurs with cognitive or social dissonance [2]; energy doubles when expressing a statement in a normal versus an angry voice [70]. Emotions reflect an individual's self-interest ([71], p. 439; i.e., less dissonance) and serve to guide social behavior (p. 442) by minimizing marginal expenditures of energy (see also [69], p. 592 and 622).

What if judgments about reality are not rational but guided primarily by experience (where a culture has been ushered into being and molded by experiential learning; [35])? Letting LEP represent the ground state for the structure of a team, a team with its structure at its ground state can devote its availability energy to solving problems, giving experience time to develop into a successful culture. For example, a perfect business team is able to devote its available energy to addressing the problems life offers to it. By way of contrast, when a team's structure exists at an excited state, a business team splitting apart is expending most of its free energy on ripping apart the culture and structure of its team (e.g., Maersk; in Chopping [54]), leaving little available energy to solving the problems it encounters.

Applied to teams by integrating Zajonc and others, we can see that the structure of a team is in a relatively stable state (dof −> 1), and that independent, asocial individuals are in a freer state (maximum dof) than team members. Based on the second law of thermodynamics, comparing a solid substance (ice) and its liquid form (water), energy must be emitted by a group of individuals as a team is formed (e.g., those mergers that reduce redundant employees; in Bunge [72]) and absorbed by a team if it breaks apart.

Initially, we use a sigmoid function to model the effort required to hold a team together (see **Figure 4**). In Equation (16), the effort, f(effort), applied to a team's structure to channel its interactions into a single whole becomes

$$f(\text{effort}) = 1/(1 + \exp(-\text{effort})) \tag{16}$$

Results from a Monte Carlo simulation of Equation (16) shown in **Figure 4** below indicates that as effort to hold together a team's structure increases beyond a critical point, the team's structure begins to fail. In this simple model, we consider the effort as

FIGURE 4 | A Monte Carlo simulation of Equation (16) with the y-axis intercept at (0, ½) in the center, with *y* ranging from 0 to +1 (listed vertically on the far left side). From the *y* intercept to the right along the *x*-axis (with *x* = 0, +1, +2 units) represents increasing effort and emotion; from the left of the intercept along the *x*-axis represents stability and a team's ground state (where *x* = -2, -1, 0 units). As the effort to maintain a team's structure approaches zero in the middle of the graph, a critical point is reached. As more effort is required to hold the team's structure together (i.e., moving to the right), it begins to break apart as team members begin to act more and more like redundant, independent individuals.

the potential energy of the team; a well-fitted team then has negative potential energy. Conversely, the less successful the team at solving its designated problems, the more its teammates begin to act as individuals, the more the strength of the team's structure becomes random, causing the cohesion of the team to dissipate. Once the team reaches a critical point (near the y-intercept) in its dissipation, a phase shift has occurred, requiring more and more energy to maintain the team, offsetting a team's successes, destroying the team's structure. Once that happens, joint entropy begins to resume a Shannon-like nature (i.e., Equation 3; e.g., the coming collapse of Sears; in Halzack [73]).

We have also found evidence that a well-fitted team having success at solving the problems it was designed to solve exhibits more intelligence than an under-fitted or overfitted team with redundant members. The well-fitted team generates less entropy than its individual contributors, an indication that a state of maximum interdependence exists inside of the team, where each member is responsive to every other team member and to the team's mission as well. The state of maximum interdependence, however, can be reversed or blocked. Like quantum computations, the state of interdependence is a resource for a team but also for the society within which the well-fitted team is embedded and to which the well-fitted team contributes. Once a wellfitted team establishes a point of stability, an emotionless baseline, it is operating in a ground state (**Figure 5**, bottom left). If the joint entropy generated by the team begins to exceed the entropy of any single contributor, the team's interdependence and structure have begun to dissipate (**Figure 5**, upper right).

We have not addressed the characteristics of the problem targeted, but we suspect that a team must be designed to match its designated problem (e.g., a well-fitted 5-member baseball team is of value in playing against an equally competent 5-member baseball team, but of little value when playing against an equally competent but nine-person baseball team).

# CONCLUSIONS

Significant impediments exist in the formulation of a science of teams using traditional theories. Specifically, Shannon's information theory and the social sciences, including economics, assume that the human observation of human behavior records the actual behavior that has occurred, even for selfreports of self-observed behavior. In computational social science, this phenomenon has been labeled informally as the "god's eye view," indicating that the "computer" within which computational action occurs knows immediately whatever action a computational agent takes. In the social sciences, this phenomenon manifests as an observational bias; it allows social scientists to assume that self-reported behavior is actual behavior (e.g., if this assumption was true, deception or denial, such as alcoholic denial or spying, would not exist). We claim that this assumption is unsupported by the evidence, as is the "knowledge" gathered in support, such as the conclusion consonant with widespread religious beliefs that cooperation provides for the best social good. At the heart of these rational, but false models, interdependence is seen as a constraint (information or communication theory) or experimental confound (cognitive science) that must be overcome by traditional social scientists to confirm a theoretical models based on methodological individualism (MI; [26]).

By replacing MI with quantum-like models, we have found computational metrics for good and poor teamwork performance, and a third finding that redundancy is associated with corruption by using relative entropy to model divergence from an oil market leading team, now supported in this study by the size of a nation's military. We have also proposed a new model for a team's emotion as it shifts from a ground state to an excited state. We conclude that, like entanglement at the atomic level, interdependence at the social level is the primary social resource that ordinary humans exploit to innovate and promote social welfare.

Wendt [23] said that quantum-like models should be given a chance to make new discoveries. Who would have even thought that redundancy is a problem, or that it could give insight into the structure of what constitutes the best team. The National Academy of Sciences report on teams points out that team size is an open problem, but while it did not mention redundancy [9], it did speculate that "many hands make light work," a speculation faulted by our results.

We reject the traditional model of redundancy (e.g., [10]). Cummings [67] found that the more interdisciplinary a science team, the least productive it was as a science team; however, he also found that the best science teams were highly interdependent; i.e., highly responsive to each other. We agree with Cummings, and our results support him.

Excessive team emotion is observable to external observers; e.g., a divorce; a business breakup; a team's collapse. More difficult to observe is the critical point, the transition from a team arguing appropriately [74] over an "invisible" structural issue that, if not resolved, may represent a transition from being a wellfitted team past the critical point until "visible" to those observing a team's transition along the path to becoming ill-fitted as a team's structure breaks down.

For a mathematical physics of teams, a significant impediment has too long existed from accepting the traditional belief that social truth can be established by observing individuals. As exemplars, both built around the statistics of independent, identically distributed data (i.i.d.), information theory and social science, including economics, assume that self-reports record actual behavior, especially self-observed behavior. But the traditional social science model simply does not generalize to hybrid teams; to evolve, to design hybrid teams, this idea that "self-observations record actual behavior" must be rejected.

In contrast, based on our model where interdependence reduces a team's degrees of freedom (dof), thereby obscuring this effect by making it "invisible" to viewers, we propose that ordinary teamwork is characterized by the search for an optimum in the tradeoff between maximum entropy production (MEP) and least entropy production (LEP), where MEP reflects team performance (dynamics; e.g., productivity), LEP determines team structure (statics), and, unexpectedly, the tradeoff generalizes to represent a new and computational model of team emotion. With our theory, we are able to draw several conclusions. First, as a resource, social humans exploit interdependence to innovate and promote social welfare, suggesting that, by increasing and aligning the MEP density across teams, a culture of competition among teams predictably improves social intelligence, innovation and social welfare. Second, however, interdependence precludes replication, causality and truth, exactly what is commonly found in social reality, including social science. And, finally, our

#### REFERENCES


local theory of teams appears to scale without limit, limiting the value of independent individuals; but, we theorize, value returns when independent individuals enter into states of superposition driven by the opposed worldviews of competing teams, interdependently entangled until these now individuals superposed to both views are measured to determine the winner of the competition that they are most responsive to.

The best teams have the least redundancy so that they are maximally interdependent among teammates to be responsive to each other as they multitask to solve the problems that they face intelligently. In conclusion, we have found support for our quantum-like model with the solution to the open problem of team size.

#### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

### FUNDING

Some of the work was performed while the corresponding author was a senior faculty researcher at the Naval Research Laboratory over the past 2 years, including the Summer 2016 and 2017.

### ACKNOWLEDGMENTS

The author thanks the reviewers of his manuscript for their very helpful comments, suggestions and corrections.

and Sensory Sciences; Division of Behavioral and Social Sciences and Education; National Research Council. Washington, DC: National Academies Press (2015).


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Lawless. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Nilpotent Quantum Mechanics: Analogs and Applications

#### Peter Marcer <sup>1</sup> and Peter Rowlands <sup>2</sup> \*

1 Independent Researcher, St. Raphael, France, <sup>2</sup> Oliver Lodge Laboratory, Department of Physics, University of Liverpool, Liverpool, United Kingdom

The most significant characteristic of nilpotent quantum mechanics is that the quantum system (fermion state) and its environment (vacuum) are, in mathematical terms, mirror images of each other. So a change in one automatically leads to corresponding changes in the other. We have used this characteristic as a model for self-organization, which has applications well beyond quantum physics. The nilpotent structure has also been identified as being constructed from two commutative vector spaces. This zero square-root construction has a number of identifiable characteristics which we can expect to find in systems where self-organization is dominant, and a case presented after the publication of a paper by us on "The 'Logic' of Self-Organizing Systems" [1], in the organization of the neurons in the visual cortex. We expect to find many more complex systems where our general principles, based, by analogy, on nilpotent quantum mechanics, will apply.

#### Edited by:

Emmanuel E. Haven, University of Leicester, United Kingdom

#### Reviewed by:

Raimundo Nogueira Costa Filho, Federal University of Ceará, Brazil Diego Lucio Rapoport, Universidad Nacional de Quilmes (UNQ), Argentina

> \*Correspondence: Peter Rowlands p.rowlands@liverpool.ac.uk

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 12 April 2016 Accepted: 28 June 2017 Published: 18 July 2017

#### Citation:

Marcer P and Rowlands P (2017) Nilpotent Quantum Mechanics: Analogs and Applications. Front. Phys. 5:28. doi: 10.3389/fphy.2017.00028 Keywords: universal rewrite system, self-organization, nilpotent quantum mechanics, renormalization group

# INTRODUCTION

Three main developments form the background to this work. The first is a universal rewrite system, which is a scale-independent and fractal computational process of generating zerototality alphabets, with seemingly very general application [1], The most immediate applications of the rewrite structure have been found in physics and biology, which brings us to the second development. Nilpotent quantum mechanics is a form of relativistic quantum mechanics/quantum field theory which can be derived from the rewrite system and which minimalizes the whole quantum apparatus to a single operator acting on a universal environment, which is its mirror image. The third development is that both the rewrite structure and nilpotent quantum mechanics require a combination of two vector spaces, each dual to the other, which provides a powerful model for self-organization [1].

Nilpotent quantum mechanics is the most immediately successful application of the universal rewrite system and serves as an almost perfect model for other applications. It is not so much that these applications derive directly from nilpotent quantum mechanics, rather that they derive from the structure which makes this form of quantum mechanics possible. Many characteristics can be described as identifiers of both the rewrite and the nilpotent structures, whether at the quantum mechanical level, or applicable in mathematics, chemistry, biology, or other areas of physics. We have proposed a number of such features as being detectable in systems of very different kinds and as thus being signatures of quantum-like organization or behavior, especially where selforganization is dominant [1], and we have identified a new one in the organization of the neurons in the visual cortex.

#### THE UNIVERSAL REWRITE SYSTEM

The universal rewrite system provides a computational approach to both mathematics and physics based on the idea of a zero totality alphabet [1]. We successively create alphabets in which "conjugation" represented by ∗ ensures totality zero, and in which new creation is ensured by the regular appearance of anticommutative pairs A, B; C, D; etc., each of which is commutative to all the others:

$$(R, R^\*) (R, R^\*) \implies (R, R^\*, A, A^\*) \tag{1}$$

$$\{R, R^\*, A, A^\*\} \{R, R^\*, A, A^\*\} \implies \{R, R^\*, A, A^\*, B, B^\*, AB, AB^\*\} \tag{2}$$

Successive alphabets absorb the previous ones in the sequence, so creating a new cardinality. We may start at any arbitrary zerototality alphabet but there is no natural beginning or end to the process, which can be summarized in **Table 1**.

#### An Algebra for the Rewrite Process

The rewrite process is more general than any particular mathematical interpretation, but such interpretations include both binary integers and digital logic, in addition to the algebraic series:

$$\begin{aligned} &(1, -1) \\ &(1, -1) \times (1, i\_1) \\ &(1, -1) \times (1, i\_1) \times (1, j\_1) \\ &(1, -1) \times (1, i\_1) \times (1, j\_1) \times (1, i\_2) \\ &(1, -1) \times (1, i\_1) \times (1, j\_1) \times (1, i\_2) \times (1, j\_2) \\ &(1, -1) \times (1, i\_1) \times (1, j\_1) \times (1, i\_2) \times (1, j\_2) \times (1, i\_3) \dots \end{aligned}$$

In this interpretation, the anticommutative pairs A, B; C, D; E ... are expressed as quaternion units, **i**1, **j**1; **i**2, **j**2; **i**3..., each of which is commutative to all the others. By the fourth stage, we have repetition, which then continues indefinitely. An incomplete set of quaternion units (for example, **i**<sup>3</sup> in the sixth alphabet) becomes equivalent to the algebra of complex numbers. Mathematically, we can see the process of the creation of the zero totality alphabets as one of conjugation, followed by repeated cycles of complexification and dimensionalization (where each **i** is paired with a **j**).

At the point where the cycle repeats, we have what can be recognized as a Clifford algebra—the algebra of 3-D space, where the vectors **i**, **j**, **k** are constructed from **i**<sup>1</sup> **i**2, **j**<sup>1</sup> **i**2, **i**<sup>1</sup> **j**<sup>1</sup> **i**2, and **i**1, **j**1, **i**<sup>1</sup> **j**<sup>1</sup> = **k**<sup>1</sup> and **i**2, **j**2, **i**<sup>2</sup> **j**<sup>2</sup> = **k**<sup>2</sup> are (mutually commutative) quaternion algebras of the form **i**, **j**, **k**.

$$\begin{aligned} &(1, -1) \\ &(1, -1) \times (1, i) \\ &(1, -1) \times (1, i) \times (1, j) \\ &(1, -1) \times (1, i) \times (1, j) \times (1, i) \\ &(1, -1) \times (1, i) \times (1, j) \times (1, i) \times (1, j) \\ &(1, -1) \times (1, i) \times (1, j) \times (1, i) \times (1, j) \times (1, i) \dots \end{aligned}$$

In this algebraic structure, the unit vectors **i**, **j**, **k** have the multiplication rules

$$\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{l} \tag{3}$$

$$\mathbf{i}\mathbf{j} = -\mathbf{j}\mathbf{i} = i\mathbf{k}; \mathbf{j}\mathbf{k} = -\mathbf{k}\mathbf{j} = i\mathbf{i}; \mathbf{k}\mathbf{i} = -\mathbf{i}\mathbf{k} = i\mathbf{j} \tag{4}$$



The ∆ symbols, here, represent the alphabets:

∆<sup>a</sup> (R)

...

∆<sup>b</sup> (R, R\*)

∆<sup>c</sup> (R, R\*, A, A\*)

∆<sup>d</sup> (R, R\*, A, A\*, B, B\*, AB, AB\*) ∆<sup>e</sup> (R, R\*, A, A\*, B, B\*, AB, AB\*, C, C\*, AC, AC\*, BC, BC\*, ABC, ABC\*)


which are essentially those of complexified quaternions, with multiplication rules

$$(\text{ii})^2 = (\text{ij})^2 = (i\mathbf{k})^2 = 1 \tag{5}$$

compared to those for pure quaternions

$$
\hbar^2 = \mathfrak{j}^2 = \mathfrak{k}^2 = \mathfrak{i}\mathfrak{j}\mathfrak{k} = -1.\tag{6}
$$

In the Clifford vector algebra, there is a full product between vectors **a** and **b** which combines vector and scalar products

$$\mathbf{a}\mathbf{b} = \mathbf{a}\mathbf{b} + i\mathbf{a} \times \mathbf{b} \tag{7}$$

It has been shown by Hestenes [2] and others, that using a Clifford vector algebra is a natural way of incorporating spin into quantum mechanics as an automatic consequence of the vector structure of space and momentum. The units are, significantly, isomorphic to those of Pauli matrices.

Clifford vector algebra produces three subalgabras from the products of its basic units (see **Table 2**). Bivectors (for example, area and angular momentum, in physics) are products of two orthogonal vector units (such as **i** and **j**); they are also called pseudovectors and are isomorphic to quaternion units. Trivectors (for example, volume) are products of three orthogonal vector units (**i**, **j**, **k**), and are also called pseudoscalars; their full algebra is that of complex numbers.

Standard Clifford vector algebra notably produces these subalgebras in the reverse order to the universal rewrite system, which generates, in its first four alphabets, scalars, pseudoscalars, quaternions and vectors, along with the scalar subalgebras of pseudoscalars and quaternions.

Significantly, if we take all these algebras as independently true, and hence commutative, as the rewrite structure seems to suggest we should, since each is a complete description of zero totality, then we require an algebra that is a commutative combination of vectors, bivectors, trivectors and scalars, or vectors, quaternions, pseudoscalars and scalars. This turns out to be equivalent to the algebra of the sixth alphabet, a group structure of order 64 with elements, as in **Table 3**.

The algebra represented by these group elements is isomorphic to the gamma matrix algebra of the Dirac equation, which defines conventional relativistic quantum mechanics.

#### Characteristics of the Rewrite Process

The universal rewrite process is characterized by duality, self-similarity, scale-independence and holism; it proceeds by bifurcation at every stage. It can be distinguished from nonuniversal rewrite processes by the fact that it has no fixed starting or ending point, and by the endless reconstruction of both alphabet and production rules at every bifurcation. The self-similarity follows immediately from the absence of a fixed starting point. It implies that physical applications at some particular level will be matched by applications at other levels, and that the scaling up from small to larger and more complex systems is governed by a principle analogous to the renormalization group in physics, where the structures which the rewrite process generates are maintained by new emergent physical principles.

The observer, who is always placed within the zero totality system (variously described as "the universe," "nature," or "reality"), must necessarily start from the minimum representation of a zero totality alphabet, say (R, R ∗ ). The stages in the process are all zeros. During the process we go from one zero cardinality or totality to the next. The cardinalities are like Cantor's cardinalities of infinity, but are cardinalities of zero instead. We ensure that they are cardinalities by always including the previous cardinality or alphabet. So (R, R ∗ , A, A ∗ ) includes (R, R ∗ ). Because they are cardinalities or zero totality alphabets (descriptions of the universe in physical terms), the process is always holistic. We have to include everything.

Now, we have to assume that (R, R ∗ ) is not necessarily the beginning, though it is the point where we as observers start from. So, this already "bifurcated" state will have started from a previous alphabet, which we assume we can't access directly. If we describe this as R, then the ∗ or R ∗ character creates the doubling process. Before we create (R, R ∗ ), we have to assume that (R) is a zero totality alphabet, but it is a zero we can't access because we have no structure for it. In effect, we are trying to posit an ontology that exists before the epistemology or observation, begins with (R, R ∗ ). So, we assume that it must happen without being able to observe it.

Duality is intrinsic to the process. The operation () () ⇒ (,) describes how we proceed from one description of the entire or universe zero totality alphabet to the next alphabet in the hierarchy. The (,) becomes a "bifurcation" or "doubling." So

$$(R, R^\*)(R, R^\*) \Rightarrow (R, R^\*, A, A^\*)\tag{8}$$



can be expressed as (R, R ∗ , A, A ∗ ), where (R, R ∗ ) (R, R ∗ ) is the bifurcation that creates the new zero cardinality (R, R ∗ , A, A ∗ ), in effect transforming the second (R, R ∗ ) into the new (A, A ∗ ). All the "doublings" or "bifurcations" in the process are, in this sense, similar to the initial creation of (R, R ∗ ), even when they involve complexification or dimensionalization rather than conjugation. So

$$(R, R^\*, A, A^\*) (R, R^\*, A, A^\*) \Rightarrow (R, R^\*, A, A^\*, B, B^\*, AB, AB^\*) \tag{9}$$

can also be written as

$$(R, R^\*, A, A^\*) (R, R^\*, A, A^\*) \Rightarrow (R, R^\*, A, A^\*, \quad B, B^\*, A B, A B^\*) \tag{10}$$

where the original alphabet (R, R ∗ , A, A ∗ ) and its dual (B, B ∗ , AB, AB∗ ) are created by a process similar to the one which created R and R ∗ . For practical purposes, a new character (B) is introduced.

#### Application to Physics

In applying to physics, we note that the universal rewrite process creates successive models for a zero-totality universe. This is what we mean by physical parameters. We can recognize the algebras of the fundamental parameters mass, time, charge and space as being, respectively, scalar, pseudoscalar, quaternion, and vector, exactly as are generated by the first four alphabets of the universal rewrite system. Here, mass is the source of gravity and includes energy, while charge is a term used to represent the sources of the three non-gravitational interactions [3]. The four alphabets seem to be independent descriptions of the universe which must be simultaneously true, as should all subsequent alphabets following these.


Now, if we combine the algebras of these quantities, we obtain the 64-part algebra isomorphic to the Dirac gamma algebra that we tabulated in Section An Algebra for the Rewrite Process. But the 8 units of time, space, mass, and charge are not the minimum number of starting units to generate the 64-part algebra. This is, in fact, 5 and its construction always involves the breaking of the symmetry of one of the two 3-D components, space or charge. Typically, we "combine" one of the units of charge with each of those of time, space and mass, to obtain:


The units i**k**, **ii**, **ij**, **ik**, **j** correspond to the five base units of the gamma algebra, and the new terms, energy, momentum and rest mass, can be seen to take on aspects of the original charge units, together with the pseudoscalar, vector and scalar properties of their other parent quantities. Now, when we combine the momentum terms into a single vector **p** and take the complete package (**ik**E + **ip** + **j**m) to represent the properties of a fundamental physical unit (particle or fermion), we find that we can, in this case, immediately solve the problem of indefinite extension of the alphabets since (**ik**E + **ip** + **j**m) is a nilpotent or square root of zero. The equation

$$(i\mathbf{k}E + \mathbf{i}\mathbf{p} + \mathbf{j}m)(i\mathbf{k}E + \mathbf{i}\mathbf{p} + \mathbf{j}m) = E^2 - p^2 - m^2 = 0 \quad (11)$$

is then simply the relativistic and quantum mechanical conservation of energy and momentum. So, if we take (**ik**E + **ip** + **j**m) as incorporating all the alphabets needed to create a repetitive sequence, when we seek to generate the next alphabet by squaring, we will find that it is zeroed automatically, so zeroing all higher alphabets which incorporate it, and we can describe the world through an indefinite succession of such units.

Simultaneously when we create energy, momentum and rest mass as concepts, the combination of time, space, mass, and charge breaks the symmetry between the weak, strong and electric charges, which then take up the algebraic characteristics of their associated parameters:


The reduction of the original 8 units to the composite set of 5 is, in fact, a characteristic symmetry-breaking operation in nature, which is found also in mathematics, chemistry and biology as well as several other aspects of physics. In all these areas, 5 seems to be the number at which symmetry is necessarily broken.

#### NILPOTENT QUANTUM MECHANICS

Nilpotent quantum mechanics is founded on a nilpotent operator which can be expressed in the form (± i**k**E ± **ip** + **j**m), which is an abbreviated expression for a row or column vector, whose 4 terms encompass the four sign variations in E and **p**. We can use a canonical quantization procedure to replace E and **p** as operators by <sup>E</sup> <sup>→</sup> <sup>i</sup><sup>∂</sup> /∂t, **<sup>p</sup>** <sup>→</sup> – <sup>i</sup>▽, for a free fermion, or by covariant derivatives such as <sup>E</sup> <sup>→</sup> <sup>i</sup><sup>∂</sup> /∂<sup>t</sup> <sup>+</sup> <sup>e</sup><sup>8</sup> ..., **<sup>p</sup>** <sup>→</sup> – <sup>i</sup>▽ <sup>+</sup> e**A** +..., for a fermion constrained by any number of potentials of any type or even by curvature terms. The structure of the operator then determines both the complete quantum behavior of the fermion and also that of its environment or "vacuum" by defining a unique phase term which, when operated on, produces an amplitude which squares to zero:

$$\text{(operator acting on unique phase term)}^2 = \text{amplitude}^2 = 0$$
 
$$\text{(12)}$$

The process incorporates both Dirac and Klein-Gordon equations in the form

$$(\pm \ i k E \pm \not m + j m) \left( \pm \ i k E \pm \not m \pm j m \right) \to 0 \tag{13}$$

where (± i**k**E ± **ip** + **j**m) can stand for either operator or amplitude. This would make the Dirac equation for a free fermion

$$\left(\mp k\partial/\partial t \mp i\mathbf{i}\nabla + \mathbf{j}m\right)\left(\pm ikE \pm i\mathbf{p} + \mathbf{j}m\right)e^{-i(Et-\mathbf{p}.\mathbf{r})} = 0 \tag{14}$$

For a fermion under the constraint of a potential or any number of potentials, the phase factor would take a different form but the result would still be a term with the same structure as (± i**k**E ± **ip** + **j**m) being squared to zero.

## Characteristics of Nilpotent Quantum Mechanics

Nilpotent quantum mechanics is relativistic and is concerned with fermions. It shares all the standard characteristics of relativistic quantum mechanics using the more conventional formalisms of the Dirac equation, and can be easily transformed into these formalisms using the one-to-one correspondence between the algebraic operators and gamma matrices. However, it also has some characteristics which only become apparent in this mathematical form, but which are necessary for understanding how the process can be scaled up in higher order systems.

Spin ½ and zitterbewegung are among the shared characteristics and can be easily derived using variants of the standard formalisms. Chirality (or the intrinsic left-handedness of fermions and right-handedness of antifermions) emerges in the same way. Fermion uniqueness or Pauli exclusion is obvious in the nilpotent formalism, as any combination state of identical fermions will automatically vanish. However, the nilpotent also creates a completely new meaning for the concept. Because the totality of experience is defined always to be zero, if we take a fermion in any state, say (**ik**E + **ip** + **j**m), subject to any number of constraints that can be built into its operator, and imagine that we can create it from absolutely nothing, then the "vacuum" which defines the rest of the universe for that fermion, must be a kind of mirror image, −(**ik**E + **ip** + **j**m), so that both the superposition and the combination of vacuum and fermion remain at zero:

$$-(i\mathbf{k}E + i\mathbf{p} + jm) + (i\mathbf{k}E + i\mathbf{p} + jm) = 0\tag{15}$$

$$-(i\mathbf{k}E + i\mathbf{p} + jm)\,\left(i\mathbf{k}E + i\mathbf{p} + jm\right) = 0\tag{16}$$

To maintain this zero totality in all circumstances, any change in either the fermion or its environment must be reflected in a corresponding change in the other. In effect, this creates a principle of self-organization which can be imagined in systems on a much larger scale, and which will be identifiable by strongly characteristic features which originate in the nilpotent structure and the universal rewrite process.

Another significant aspect of quantum mechanics is that it involves both locality and non-locality. The distinction between the two processes is clear in the nilpotent form. Everything inside the bracket is local; everything outside the bracket is non-local. So the conservation laws of energy and angular momentum are local; superpositions and combination states and interactions with vacuum are non-local. Both processes, however, are holistic in requiring the cooperation of the entire universe, and each produces consequences which affect the other. In nilpotent quantum mechanics, the individual fermion conserves its energy only with respect to the rest of the universe. The fermion is an open system and intrinsically dissipative. The first law of thermodynamics must be accompanied by the second.

Fermions are, in a very fundamental way, incomplete. They have half-integral spin, are only observable when interacting in a pairing with other fermions, and are square roots of algebraic operators which only have meaning when multiplied with other objects of the same kind. In the nilpotent formalism, bosons of spin 1 and spin 0 are formed from fermion-antifermion combinations of the form (± i**k**E ± **ip** + **j**m) (∓ i**k**E ± **ip** + **j**m) and (± i**k**E ± **ip** + **j**m) (∓ i**k**E ∓ **ip** + **j**m), while a fermion-fermion combination can exist in the form (± i**k**E ± **ip** + **j**m) (± i**k**E ∓ **ip** + **j**m) in Cooper pairs, Bose-Einstein condensates and other applications of Berry phase. All these expressions become scalars when multiplied out. All the tendency for aggregation in nature can be seen as stemming from the need for fermions to acquire partners to remove this incompleteness, and it can be linked to the action of a harmonic oscillator, of which the zitterbewegung is a special instance. The same pattern emerges at higher levels, suggesting that the nilpotent model applies well beyond the direct application of quantum principles. It is very likely that a major role in providing the "staircase" that we hope to show leading from the smallest systems to the largest will be provided by the renormalization group procedure.

### Dual Space

The most significant aspect of the structure is that it incorporates two full vector spaces with the full Clifford algebra of each. The 64-part algebra requires a Clifford vector algebra for space commutatively combined with its three subalgebras, representing time, mass, and charge. If we take these three subalgebras together, we find that they have the mathematical characteristics of another vector space, entirely commutative to the first. This "space," however, as a composite of three other parameters, is not an observable quantity. So, the nilpotent structure emerges from combination of two vector spaces, only one of which is observable.

We can call the unobserved space "vacuum space," and its effects are immediately apparent in spin ½ and the 4-component structure of the fermion wavefunction. Here, the fermion also includes two terms associated with antifermion states. These are a manifestation of the fermion's vacuum, and are responsible for the fermion spending half its time as a real particle and half as a vacuum particle (zitterbewegung), which is also one of many ways of accounting for the fermion's ½ spin.

Another way of looking at this is to relate it to Berry phase, and to attribute this to the fact that the fermion is a singularity with respect to ordinary space. As is well-known, Berry or geometric phase can be described in purely topological terms. If we parallel transport a vector around any complete circuital path in ordinary or simply-connected space, we can expect it to leave the vector pointing in the same direction at both beginning and end of the circuit. However, if the space of the circuit contains a singularity or is multiply-connected, then the vector will gain a phase change of π and end up pointing in the opposite direction from its starting position.

Spin ½ could be seen as indicating that the fermion singularity rotates in its own multiply-connected space. So, we can attribute the same effect to the fact that the fermion is defined as a singularity and that it is defined by a nilpotent connection between two spaces, leading to the conclusion that the dual space structure is actually responsible for the existence of discrete matter in the form of physical singularities. In our understanding, the Berry phase/spin ½/zitterbewegung results from defining a localized point particle simultaneously with defining the nonlocalized vacuum that determines its relation to the rest of the universe, and that carries the information about its future evolution. We can consider Berry phase to be a particularly significant indicator of the presence of some kind of dual space, nilpotent-related behavior, especially in systems subject to selforganization.

### The Holographic Principle

The nilpotent dual spaces are genuinely dual, in that they contain precisely the same information, though in different forms. This duality has many manifestations. For example, the uniqueness of the nilpotent (**ik**E + **ip** + **j**m) and Pauli exclusion could be determined by the "direction" of a line drawn from the origin if iE, p, and m re represented as coordinates on the quaternion axes **k**, **i,** and **j**. Alternatively, we could express Pauli exclusion by the more conventional method of defining fermion wavefunctions as antisymmetric. This leads to a truly remarkable result if we take (ψ1ψ<sup>2</sup> – ψ2ψ1) for two fermions in the nilpotent formalism:

$$\begin{aligned} \left(\pm ikE\_1 \pm i\mathbf{p}\_1 + jm\_1\right) \left(\pm ikE\_2 \pm i\mathbf{p}\_2 + jm\_2\right) \\ - \left(\pm ikE\_2 \pm i\mathbf{p}\_2 + jm\_2\right) \left(\pm ikE\_1 \pm i\mathbf{p}\_1 + jm\_1\right) \\ = 4\mathbf{p}\_1\mathbf{p}\_2 - 4\mathbf{p}\_2\mathbf{p}\_1 = 8i\mathbf{p}\_1 \times \mathbf{p}\_2, \end{aligned}$$

for this only has a non-zero value if the fermion spins are oriented in different directions. In effect, the complete information about a fermion state is contained in its instantaneous spin direction, or in the plane to which this is perpendicular. In principle, the orientation of the fermion in real space and in the "vacuum space" created by the quaternion axes **k**, **i,** and **j** carries the same information. Exactly the same duality occurs in the derivation of spin ½ either from the anticommutativity of the momentum operator, which uses real space, or from the Thomas precession, which uses vacuum space, and the duality again informs the holographic principle.

The holographic principle, in which the entire information about a system is found on the bounding area, is thought to be a significant organizing principle for many systems. We have already considered it as "a characteristic signature of a nilpotent, self-organizing system with its planar fractality" [1]. Essentially, it uses the information coded in the E and **p** terms of the operator (**ik**E + **ip** + **j**m), that is, in two components of the vacuum space, as nilpotency makes the third term redundant. This then becomes equivalent to using the information coded in two components of the dual real space. Significantly, this can also be coded in one dimension of space and one of time, which would be equivalent to using the vacuum space. As space and momentum are conjugate variables, area is also a conjugate of angular momentum, and (**ik**E + **ip** + **j**m) is recognizably an angular momentum operator, with the E term determining the handedness, **p** the direction and m the magnitude. Since any system which conserves angular momentum or which operates according to the holographic principle (for example, galaxies acting collectively under gravity) can be expressed in this form, then any such system can be seen as a direct analog of the nilpotent fermion, even though it is not intrinsically quantum or relativistic.

The application of the nilpotent operator to the holographic principle also suggests that it can itself be regarded as a quantum hologram, with phase **ik**E, amplitude **ip** and reference phase **j**m. Again, we can recover the entire structure from just two terms, for example, phase and reference phase. Quantum holography has now been officially recognized as occurring in the case of "quantum holographic encoding in a two-dimensional electron gas" [4], but the work of Walter Schempp has already shown that it has extensive practical application in Magnetic Resonance Imaging based on harmonic analysis on the 3D Heisenberg Lie group [5]. The universal rewrite system shows that the repeating unit that we need for the description of a quantum or quantumlike system is a double vector space. The two three dimensional spaces make quantum holography possible via Fourier transform action, and relate to the 3D Heisenberg Lie Group and its nilpotent Lie algebra and their dual/inverses [1].

The holographic paradigm is particularly significant in that the wavefunction is defined only up to an arbitrary fixed phase, which provides a direct meaning for the quantum vacuum in quantum field theory, as ensuring that only relative phases, which encode the 3 + 1 space-time geometries, can be measured. This phase becomes the fixed, though arbitrary, measurement standard for all subsequent measurements, and acts as the holographic basis for a universal and self-organized quantum process in which new fermionic states of matter are produced. After each new emergence, a new arbitrary standard is created, providing a complete history, i.e., hologram or holographic record of past events, which our senses perceive as an unending irreversible evolution. Nature allows us to use our arbitrary standard as the new beginning of a rewrite process (as in Section Characteristics of the Rewrite Process) and to conceive of how self-organization can take place at each new level of complexity. The universal rewrite system implies that the only valid mathematical representations of nilpotent quantum-like systems are all automorphisms of the universe itself, and that this is the mathematical meaning of quantum entanglement [6].

# SELF-ORGANIZATION: A NEW APPLICATION

The universal rewrite system provides a blueprint for selforganization mediated through the nilpotent relation between the defined system and the rest of the universe, which emerges in this form of quantum mechanics. The particular characteristics of nilpotent quantum mechanics provide us with a number of identifiers which we have already linked to selforganization, citing specific examples, and which include double 3-dimensionality; a five-fold broken symmetry; geometric phase; spin ½ or equivalent double helical structure; uniqueness of the objects and unique birthordering; irreversibility; dissipation; chirality, harmonic oscillator mechanism, zitterbewegung; fractality of dimension 2; the holographic principle and quantum holography [1]. Self-organization appears to be very general in nature and an almost obvious consequence of a universal rewrite system which reappears at each new level of complexity, so we can expect to find increasing evidence of such identifiers in systems that have been shown to be self-organizing.

A completely new application emerged immediately after our last QI presentation [1]. Our previous work has indicated that geometrical or Berry phase is a particularly significant identifier of nilpotent-like behavior in a system that need not necessarily be quantum. In an earlier paper, we wrote, concerning nilpotent structure (where X <sup>2</sup> <sup>=</sup> 0): "each <sup>X</sup> 2 signifies a return (in terms of a corresponding unique dual Dirac annihilation operator) to the quantum mechanical vacuum state which takes the form of a universal attractor of fractal dimension 2 ..., where the uniqueness of each of the nilpotent quantum mechanical Dirac operators is carried by means of quantum phase, in the form a unique gauge invariant Berry/geometric phase able to encode the requisite relativistic 3 + 1 space time geometric information about the unique fermion state vector, and is 'scale free"' [7].

Now, new research, by Kaschube et al. published just after "The Logic of Self-Organizing Systems," shows that the neurons in the visual cortex in the brain of three distantly-related mammals have a quasiperiodic structure. Orientations of the neurons in the flat sheets of the cortex change continuously, repeating over a length known as the "map period" (λ), while appearing to converge on centers known as "pinwheels," while the pinwheel density per λ 2 appears to equal π to within a few percent [8]. According to Miller, writing in the same issue: "The result offers insight into the development and evolution of the visual cortex, and strongly suggests that key architectural features are self-organized rather than genetically hard-wired" (our emphasis). Miller also says that "The universality of self-organizing behavior provides a simple and compelling explanation for the arrival of widely divergent evolutionary lines at this common design [9]."

In our interpretation, π might well appear in the density of the squared "map period" of the neurons because the spatial structure of the system requires a geometrical phase. If the "pinwheel" is taken as a "singularity" in the physical space, then we need a double circuit through the "map period" or cycle of orientations to re-establish the original phase state. The singularity would then generate a double map period (2λ) in any direction of the two-dimensional cortical sheet, and each pinwheel singularity would be situated in a circle with radius length λ in this twodimensional space, creating a pinwheel density/λ <sup>2</sup> of π. This would coincide directly with our proposal that a characteristic structure for the space of self-organizing systems at all levels of complexity results from a dual vector system, or equivalent, for which a geometrical phase of π becomes an identifying feature.

This is referred to in many publications and very explicitly in the biological context, with the relevant identifying structures, in "A Computational Unification of Scientific Law" and references therein [10]. The underlying Clifford algebra suggests that analogous mathematical models are also possible, one of which is the Klein bottle structure proposed by one of us in earlier work [3]. This has been developed further by Rapoport, along with the appearance of pinwheel structures and the appearance of the identifying π which we associate with the Berry phase, and Rapoport has proposed extensions of the analogy to many seemingly unrelated areas [11, 12]. The fundamental dualities involved can be expressed in many different ways and have been discussed by the authors as the universal basis for physical, chemical, biological, and other systems and their organization in many previous publications. Essentially, where energy is a conserved or near-conserved quantity in any structure, or where there is a definable energy flow, a nilpotent relation can be found between that structure and the rest of the universe, and the analogies presented in Zero to Infinity and earlier works will automatically apply, together with certain identifying characteristics [3]. The structure of the visual cortex is just one example where our prior predictions appear to have been vindicated in a seemingly unexpected and visually striking way. It is our belief that examination of other selforganizing systems will reveal the presence of other characteristic identifiers of a structure analogous to nilpotent quantum mechanics.

# CONCLUSION

Self-organization in Nature has been posited as resulting from a universal rewrite system, which manifests itself at each new structural level. In this system, the totality of the entire universe or anything that can be applied universally is taken to be zero. New zero totality structures or alphabets emerging from a previous one always include it, leading to what has been described here as a succession of alphabets, with zero cardinality by analogy with the well-known succession of infinite cardinalities in mathematics. The process is universal, so is not confined to specific interpretations, but one such interpretation is an algebraic series which becomes a form of Clifford algebra, or an infinite series of sets of quaternion units, with the full set of terms produced by squaring out or multiplying to a higher order. The series of zero totalities has a fractal quality in that combinations of all the alphabets in the series up to any order, as independent units, leads to an alphabet higher up in the series. This appears to be applicable to physics where the first few terms in the series correspond to the successive algebraic properties of the fundamental physical parameters mass, time, charge and space.

A combination of these leads to a higher algebra which appears to correspond to that used in the Dirac equation of relativistic quantum mechanics, which describes the fermionic state, the only known fundamental entity in physics. Interpreting this algebra as a group of order 64, allows us to select sets of 5 generators for the entire combination, which we can show correspond precisely to the algebraic terms that define the Dirac state and that we conventionally identify as the gamma matrices. In addition, the combination of the terms as used in physics has only nilpotent solutions, squaring to zero, suggesting that all higher alphabets incorporating these will automatically produce zero squared and higher order products as well.

The higher order alphabet which incorporates all the alphabets corresponding to the parameters appears to be equivalent to that which would be produced by a double vector or dual space. The nilpotent structure of the fermion (which can also be derived from the conventional form of the Dirac equation using gamma matrices) immediately explains Pauli exclusion and interprets vacuum as corresponding to the "rest of the universe" (zero totality—fermion) which allows a fermion to exist in any particular state, with the fermion and vacuum occupying the two "spaces" required by our algebra. Quantum mechanics can then be structured as the interaction between a nilpotent fermion and the rest of the universe acting like a mirror image creating a totality of zero. The many powerful applications of this kind of quantum mechanics have already been extensively described [3]. If we now interpret the nilpotent fermion plus vacuum combination as an example of a more universal condition, produced by the universal rewrite system, we can extend the application to self-organizing systems in general, and we suggest, among other things, that it is the explanation of the holographic principle being applicable to such systems. We also suggest how it should apply to biological systems, giving an example from the structure of the visual cortex, which we propose is an example of the Berry phase which results from the system and its entire environment occupying two different mathematical "spaces."

# AUTHOR NOTE

This paper is a revised and expanded version of an AAAI technical report on "The 'Logic' of Self-Organizing Systems" (2010-08-020). The authors hold the copyright and no permission is required from AAAI for the use and reproduction of material from this report.

# AUTHOR CONTRIBUTIONS

PR: The original ideas of universal rewrite system, the associated algebra and nilpotent quantum mechanics. Joint contribution with PM on recognizing the wide application of these ideas in particular to many areas outside of physics.

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Marcer and Rowlands. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Quantum Probabilistic Models Revisited: The Case of Disjunction Effects in Cognition

Catarina Moreira\* and Andreas Wichert

Instituto Superior Técnico/INESC-ID, Oeiras, Portugal

Recent work in cognitive psychology has revealed that quantum probability theory provides another method of computing probabilities without falling into the restrictions that classical probability has in regard to modeling cognitive systems and decision-making. This enables the explanation of paradoxical scenarios that are difficult, or even impossible, to explain through classical probability theory. In this work, we perform an overview of the most important quantum models in the literature that are used to make predictions under scenarios where the Sure Thing Principle is being violated (the Quantum-Like Approach, the Quantum Dynamical Model, the Quantum Prospect Theory and Quantum-Like Bayesian Networks). We evaluated these models in terms of three metrics: interference effects, parameter tuning and scalability. The first examines if the analyzed model makes use of any type of quantum interferences to explain human decision-making. The second is concerned with the assignment of values to a large number of quantum parameters. The last one consists of analyzing the ability of the models to be extended and generalized to more complex scenarios. We also studied the growth of the quantum parameters when the complexity and the levels of uncertainty of the decision scenario increase. Finally, we compared these quantum models with traditional classical models from the literature. We conclude with a discussion of the manner in which the models addressed in this paper can only deal with very small decision problems and why they do not scale well to larger, more complex decision scenarios.

Keywords: quantum cognition, quantum-like approach, quantum dynamical model, quantum prospect theory, quantum-like Bayesian networks

# 1. INTRODUCTION

The process of decision-making is a research field that has always triggered a vast amount of interest among several fields of the scientific community. Throughout time, many frameworks for decision-making have been developed, namely the Expected Utility hypothesis, which is characterized by a specific set of axioms that enable the computation of the person's preferences with regard to choices under uncertainty [1]. Later, Savage [2] proposed an extension to this theory: the Subjective Expected Utility theory. In this extension, uncertainty is described by subjective probabilities, since not all uncertainty can be described using an objective probability distribution. However, human behavior tends to violate the axioms of Expected Utility, leading to the well known Allais paradox [3]. Human behavior also tends to violate the axioms of the Subjective Expected Utility framework, leading to the Ellsberg paradox [4].

#### Edited by:

Emmanuel E. Haven, University of Leicester, UK

#### Reviewed by:

Jan Broekaert, Vrije Universiteit Brussel, Belgium Irina Basieva, General Physics Institute, Russia

> \*Correspondence: Catarina Moreira catarina.p.moreira@ist.utl.pt

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 22 March 2016 Accepted: 10 June 2016 Published: 28 June 2016

#### Citation:

Moreira C and Wichert A (2016) Quantum Probabilistic Models Revisited: The Case of Disjunction Effects in Cognition. Front. Phys. 4:26. doi: 10.3389/fphy.2016.00026

**99**

# 1.1. Background

In the 70s, the cognitive psychologists Amos Tversky and Daniel Kahneman decided to put to the test the axioms of the Expected Utility hypothesis. They performed a set of experiments in which they demonstrated that people usually violate the Expected Utility hypothesis and the laws of logic and probability in decision scenarios under uncertainty [5–9]. This means that, when people need to make a decision under scenarios with high levels of uncertainty, ambiguity and risk, they tend to violate the laws of probability theory, leading to decision paradoxes [3, 4].

One of these paradoxes was demonstrated in the article of Tversky and Shafir [10] and corresponds to the violation of Savage's Sure Thing Principle, also known as disjunction effects, under the Prisoner's Dilemma Game. This principle is fundamental in classical probability theory and states that, if one prefers action A over B under the state of the world X, and if one also prefers A over B under the complementary state of the world X, then one should always prefer action A over B even when the state of the world is unspecified [2]. Violations of the Sure Thing Principle imply violations of the classical law of total probability [11].

Quantum cognition has emerged as a research field that aims to build cognitive models using the mathematical principles of quantum mechanics. Given that classical probability theory is very rigid in the sense that it poses many constraints and assumptions (single trajectory principle, obeys set theory, etc.), it becomes too limited (or even impossible) to provide simple models that can capture human judgments and decisions since people are constantly violating the laws of logic and probability theory [12–14].

In this sense, psychological (and cognitive) models benefit from the usage of quantum probability principles because they have many advantages over classical counterparts [15]. They can represent events in vector spaces through a superposition state, which comprises the occurrence of all events at the same time. In quantum mechanics, the superposition principle refers to the property that particles must be in an indefinite state. That is, a particle can be in different states at the same time. Under a psychological point of view, a quantum superposition can be related to the feeling of confusion, uncertainty or ambiguity [16]. This vector space representation does not obey the distributive axiom of Boolean logic and to the law of total probability. This enables the construction of more general models that can mathematically explain cognitive phenomena such as violations of the Sure Thing Principle [17, 18], which is the focus of this study. Quantum probability principles have also been successfully applied in many different fields of the literature, namely in biology [19, 20], economics [21, 22], perception [23, 24], jury duty [25], etc.

One of the pioneering contributions to the Quantum Cognition field comes from the works of Aerts and Aerts [26]. The authors designed a quantum machine, which consists in a particle that can move across the surface of a sphere. An elastic, representing some experiment is introduced in this sphere. The particle then moves orthogonally to the elastic and the elastic breaks uniformly into two parts. With this geometric representation, one can easily compute the probabilities of the particle falling into each side of the elastic. The model was extended with an ǫ parameter that represents the evolution from a quantum structure to a classical one. This parameter varies between [0, 1], where 0 corresponds to maximum lack of knowledge (quantum structure) and 1 to zero lack of knowledge (classical knowledge). Between this interval, there is the possibility of exploring other types of structures that are neither classical nor quantum. The authors also made several experiments to test the variation of probabilities when posing yes/no questions. According to their experiment, most participants formed their answer at the moment the question was posed. This behavior goes against classical theories, because in classical probability, it would be expected that the participants have a predefined answer to the question and not form it at the moment of the question. A further discussion about this study can be found in the works of Aerts [27–29], Gabora and Aerts [30], and Aerts et al. [31].

In other subsequent works, namely in Aerts [32], the author uses the formalisms of quantum mechanics in order to accommodate disjunction effects. The author, represents concepts as vectors and membership weights as quantum weights, in a complex Hilbert Space. By using quantum interference effects and quantum superpositions, the author was able to model accurately the disjunction of concepts present in experimental data.

# 1.2. The Article's Main Statement

In this article, we provide an overview and discussion of the most important state-of-the-art quantum cognitive models that are able to explain the paradoxical findings of experiments that violate the Sure Thing Principle (ex: the Prisoner's Dilemma game [33]). We conduct a deep comparison of and discussion on several quantum models: the Quantum-Like Approach [34], the Quantum Dynamical Model [35], the Quantum Prospect Decision Theory [36] and Quantum Bayesian Networks [37–40]. We discuss these models in terms of three metrics: (1) incorporation of quantum interference effects, (2) how to find values for quantum parameters, and (3) scalability of the model for more complex decision problems.

The first metric checks if the model uses quantum interference effects to predict actions chosen under uncertainty. Following the work of Yukalov and Sornette [36], toward uncertainty, human beings tend to have aversion preferences. They prefer to choose an action that brings them a certain but lower propensity/utility instead of an action that is uncertain but can yield a higher propensity/utility [41]. This can be simulated through quantum interference effects, in which one outcome is enhanced (or diminished) toward the opposite outcome.

The second metric takes into account the problem of finding values for quantum parameters. In quantum mechanics, a quantum state is modeled by probability amplitudes [42]. These amplitudes are a component of the wave function and this wave function represents a quantum state. Associated with each

probability amplitude is a quantum parameter representing the phase of the wave. The interpretation of this parameter under the psychology literature is still not clear, although various works have presented interpretations [17]. Moreover, when applying quantum principles to cognition (or to any other subject), one will need to set these quantum parameters in such a manner that they will lead to accurate predictions. In this metric, we will check how easy it is for the analyzed models to set these parameters.

The third and last metric consists of determining if the model can be extended to more complex scenarios. Although there are many experiments that report violations of the Sure Thing Principle [17, 35, 43, 44], these experiments consist of very small scenarios that are modeled by, at most, two random variables. Therefore, many of the proposed models in the literature are only effective under such small scenarios and become intractable (or even cannot be applied) under more complex situations. These metrics will be analyzed with more detail in Section 8 of the present work.

It is important to note that the goal of this work is the following: we have collected a set of models from the literature that attempt to tackle violations of the Sure Thing Principle in a quantum fashion, and then we compare the collected models. For this comparison, we just show, through a mathematical description of each model, their advantages and disadvantages. That is, we compare these models with the three metrics proposed: number of parameters involved in the model, the scalability of the quantum interference effects and their usage. We will also show that classical models also suffer from the same parameter growth problem as quantum approaches. However, because these models must obey set theory and the laws of classical probability, it is not possible to use them to make predictions in situations where the Sure Thing Principle is being violated.

# 1.3. Outline

We will start this article with a motivational problem, in which the Sure Thing Principle is found to be violated under the Prisoner's Dilemma Game (Section 2). In Section 3, we will show that a classical approach cannot accommodate violations of the Sure Thing Principle because these approaches obey set theory and consequently the laws of probability theory. We will make a full step-by-step description of the most influential models of the literature. We will show how one could apply them to predict the results concerned with violations of the Sure Thing Principle in the Prisoner's Dilemma Game. In Section 4, we will cover the Quantum-Like Approach [34]. In Section 5, we will analyze the Quantum Dynamical Model [17]. In Section 6, we will describe the Quantum Prospect Decision Theory [36]. In Section 7, we will provide an overview of Quantum-Like Bayesian Networks [37– 40]. We then engage in a deeper discussion of these approaches and give thought to the advantages/disadvantages of each model in Section 8. We finish this article by presenting the main conclusions of this work by providing some insights regarding various trends in quantum probabilistic models (Section 9).

# 2. VIOLATION OF THE SURE THING PRINCIPLE: THE PRISONER'S DILEMMA GAME

The Prisoner's Dilemma game corresponds to an example of the violation of the Sure Thing Principle. In this game, there are two prisoners who are in separate solitary confinements with no means of speaking to or exchanging messages with each other. The police offer each prisoner a deal: they can either betray each other (defect) or remain silent (cooperate). For understanding purposes, we provide an example of a payoff matrix for the Prisoner's Dilemma Game (**Figure 1**). The payoff matrix represents the rewards that each player receives for a given action.

The dilemma of this game is the following. Taking into account the payoff matrix, the best choice for both players would be to cooperate. However, the action that yields a bigger individual reward is to defect. If player A has to make a choice, he has two options: if B has chosen to cooperate, the best option for A is to defect because he will be set free; if B has chosen to defect, then the best action for A is also to choose to defect because he will spend less time in jail than if he cooperates.

To test the veracity of the Sure Thing Principle under the Prisoner's Dilemma game, several experiments were performed in the literature in which three conditions were tested:


**Table 1** summarizes the results of several works in the literature that have performed this experiment using different payoffs. Note that all entries of **Table 1** show a violation of the Sure Thing Principle and, consequently, the law of total probability. In a


Game.

classical setting, assuming neutral priors, it is expected that:

$$\begin{aligned} &\Pr\left(P\_2 = \text{Defect} \mid P\_1 = \text{Defect}\right) \ge \Pr\left(P\_2 = \text{Defect}\right) \\ &\ge \Pr\left(P\_2 = \text{Defect} \mid P\_1 = \text{Cooperate}\right) \end{aligned}$$

However, this is not consistent with the experimental results reported in **Table 1**. Note that Pr(P<sup>2</sup> = Defect | P<sup>1</sup> = Defect) corresponds to the probability of the second player choosing the Defect action given that he knows that the first player chose to Defect. In **Table 1**, this corresponds to the entry Known to Defect. In the same manner, Pr(P<sup>2</sup> = Defect | P<sup>1</sup> = Cooperate) corresponds to the entry Known to Cooperate. The observed probability during the experiments concerned with player 2 choosing to defect, Pr(P2 = Defect), corresponds to the unknown entry of **Table 1** because there is no evidence regarding the first player's actions. Finally, the entry Classical Probability corresponds to the classical probability Pr(P<sup>2</sup> = Defect), which is computed through the law of total probability assuming neutral priors (a 50% chance of a player choosing either to cooperate or to defect):

$$\begin{aligned} \Pr\left(P\_2 = Defect\right) &= \Pr\left(P\_1 = Defect\right) \\ \cdot \Pr\left(P\_2 = Defect \, | P\_1 = Defect\right) \\ + \Pr\left(P\_1 = Conperate\right) \cdot \Pr\left(P\_2 = Defect | P\_1 = Conperate\right) \end{aligned}$$

For simplicity, we will use the following notation. The probability of Player 2 choosing to defect will be Pr ( P2 = D ). In the same way, the probability of Player 2 choosing to cooperate will be Pr (P2 = C).

In the next sections, we will introduce the most representative models in the quantum cognition literature that are able to solve problems concerning violations of the Sure Thing Principle and also show that a classical model cannot accommodate violations of the Sure Thing Principle. We will also demonstrate how quantum models work when trying to predict the probabilities of the average results of the Prisoner's Dilemma Game, reported in **Table 1**.

#### 3. A CLASSICAL MARKOV MODEL OF THE PRISONER'S DILEMMA GAME

A Markov Model can be generally defined as a stochastic probabilistic undirected graphical model that satisfies the Markov property. This means that the probability distribution of the next state depends on the current state and not on previous states. These probabilistic models are very useful for modeling systems that change states according to a transition matrix that specifies some probability distribution or some transition rules that depend solely on the current state.

One can apply a dynamical Markov process to model the Prisoner's Dilemma Game in the following manner. Having as reference the work of Pothos and Busemeyer [17], the Prisoner's Dilemma is a 2-person game and can be modeled in a fourdimensional classical Markov model. Initially, the states can be represented by all possible actions of the players: Cooperate (C) and Defect (D). These are represented in a state vector in which all possible actions are equally likely to be chosen:


The probability of the second player choosing to Defect given that the action of the other player is unknown is given by Equation (1) and consists of the multiplication of this initial probability state P<sup>I</sup> by a transition function T(t):

$$P\_F = T(t) \cdot P\_I \tag{1}$$

The transition function T(t) is represented by a matrix containing positive real numbers and with the constraint that each row must sum to one (normalization axiom). In other words, this matrix represents the new probability distribution across the player's possible actions over some time period t [17].

$$\frac{d}{dt}T(t) = K \cdot T(t) \Rightarrow T(t) = e^{K.t} \tag{2}$$

In Equation (2), the matrix K corresponds to an intensity matrix. It is a matrix representation of all payoffs of the players. A solution to the above equation is given by T(t) = e K.t , which allows one to construct a transition matrix for any time point from the fixed intensity matrix. These intensities can be defined in terms of the evidence and payoffs for actions in the task. In other words, the intensity matrix performs a transformation

TABLE 1 | Works of the literature reporting the probability of a player choosing to defect under several conditions.


<sup>a</sup> corresponds to the average of the results reported in the first two payoff matrices of the work of Crosson [45].

<sup>b</sup> corresponds to the average of all seven experiments reported in the work of Li and Taplin [46].

on the probabilities of the current state to favor defection or cooperation, which are represented by the parameters µ<sup>d</sup> and µ<sup>c</sup> , respectively [17].

$$KA\_d = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \otimes \begin{bmatrix} \mu\_d & 1 \\ 1 & -\mu\_d \end{bmatrix} \qquad KA\_c = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \otimes \begin{bmatrix} \mu\_c & 1 \\ 1 & -\mu\_c \end{bmatrix} \tag{3}$$

$$KA = KA\_d + KA\_c = \begin{bmatrix} -1 & \mu\_D & 0 & 0 \\ 1 & -\mu\_D & 0 & 0 \\ 0 & 0 & -1 & \mu\_C \\ 0 & 0 & 1 & -\mu\_C \end{bmatrix} \tag{4}$$

In the work of Pothos and Busemeyer [17], the authors proposed the incorporation of dissonance effects to simulate the change of mind to dissolve contradictory beliefs that a player can experience. This is given by the parameter γ and corresponds to the payoffs of the players (Equation 5).

$$KB = \begin{bmatrix} -1 & 0 & \mathcal{Y} & 0 \\ 0 & -\mathcal{Y} & 0 & 1 \\ 1 & 0 & -\mathcal{Y} & 0 \\ 0 & \mathcal{Y} & 0 & -1 \end{bmatrix} \tag{5}$$

Thus, the final intensity matrix K is given by:

$$K = KA + KB = \begin{bmatrix} -2 & \mu\_D & \mathcal{\vee} & 0 \\ 1 & -\mathcal{\vee} - \mu\_D & 0 & 1 \\ 1 & 0 & -1 - \mathcal{\vee} & \mu\_C \\ 0 & \mathcal{\vee} & 1 & -\mu\_C - 1 \end{bmatrix} \tag{6}$$

To compute the final probability of a player defecting, we need to sum the components of the column vector P<sup>F</sup> that correspond to the second player choosing the action Defect. Note that the four components of the column vector P<sup>F</sup> correspond to [ DD DC CD CC ], where C corresponds to Cooperate and D to Defect. The first letter represents the action chosen by the first player, and the second letter corresponds to the action of the second player. Thus, the probability of player 2 choosing the action Defect corresponds to the summation of the first and the third components of the column vector PF:

$$Pr(P\_2 = \text{Defect }) = P\_{\text{F}\left[1st\_{-}dim\right]} + P\_{\text{F}\left[3rd\_{-}dim\right]} \tag{7}$$

In Equation (7), we do not need to perform any normalization in the end because the operation in Equation (1) together with the intensity matrix K ensures that the values computed are already probability values. Moreover, there is no possible combination of parameters resulting from Equation (7) that will satisfy the results observed in **Table 1**. This occurs because, although we have parameterized the Markov Model, the model will always satisfy the laws of classical probability theory. Thus, there is no possible optimization that can predict the violation of the Sure Thing Principle in such situations. This was already noticed in the previous works of Pothos and Busemeyer [17] and Busemeyer et al. [35].

In the next sections, we explain several quantum approaches proposed in the literature that can accommodate violations ofthe Sure Thing Principle.

# 4. THE QUANTUM-LIKE APPROACH

The Quantum-Like Approach has its roots in contextual probabilities. This model was proposed by A. Khrennikov and corresponds to a general contextual probability space from which the classical and quantum probability models can be derived [34, 49].

## 4.1. Contextual Probabilities: The Växjö Model

In the Växjö Model, the context relates to the circumstances that form the setting for an event in terms of which it can be fully understood, clarifying the meaning of the event. For instance, in domains outside of physics, such as cognitive science, one can have mental contexts. In social sciences, we can have a social context. The same idea is applied to many other domains, such as economics, politics, game theory, and biology.

Associated with a context, there is a set of observables. In quantum mechanics, an observable corresponds to a self-adjoint operator on a complex Hilbert Space. Under the Växjö Model, these observables correspond to the set of possible events with their respective values.

$$Pr\_{\text{context}} = \langle \mathcal{C}, \mathcal{O}, \pi \rangle \tag{8}$$

For instance, for a context <sup>C</sup> <sup>∈</sup> <sup>C</sup> and for an observable <sup>a</sup> <sup>∈</sup> <sup>O</sup> having values α, the probability of the value of one observable is expressed in terms of the conditional (contextual) probability involving the values of an observable. That is, the probability distribution π is given by:

$$\pi(\mathcal{O}, \mathcal{C}) = \Pr(a \;= \; \alpha \mid \mathcal{C} \;) \tag{9}$$

If we move into the quantum mechanics realm, Equation (9) can be interpreted as the selection with respect to the result a = α of a measurement performed in a.

For the contextual probability model, the Växjö framework corresponds to a model <sup>M</sup> described by M <sup>=</sup> (C, <sup>O</sup>, <sup>π</sup>(O, <sup>C</sup>)). Again, C is a set of contexts, O is the set of observables, and π(O, C) corresponds to a probability distribution of some observables belonging to a specific context.

In addition, assume for a context <sup>C</sup> <sup>∈</sup> <sup>C</sup> that there are two dichotomous observables <sup>a</sup>, <sup>b</sup> <sup>∈</sup> <sup>O</sup> and that each of these observables can take some values α ∈ a and β ∈ b, respectively.

The Växjö Model can be built from the general structure of the quantum law of total probability. That is, the formula is a combination of the classical probability theory with a supplementary term called the interference term (Equation 10). This term does not exist in classical probability and enables the representation of interferences between quantum states.

$$Pr(b=\beta) = \text{Classical\\_Probability}(b=\beta) + \text{Interference\\_Term} \tag{10}$$

Under this representation, we can replace Classical\_Probability by the classical total probability and also replace the quantum Interference\_Term by a supplementary measure, represented by δ(β | a, C). Under the Växjö Model, the term δ(β | a, C) corresponds to:

$$\delta(\beta|a,\mathcal{C}) = Pr(b=\beta) - \sum\_{a \in a} Pr(a=\alpha|\mathcal{C}) Pr(b=\beta|a=a,\mathcal{C}) \tag{11}$$

Equation (11) can be written in a similar way to the classical probability in the following manner:

$$Pr(b=\beta|\mathcal{C}) = \sum\_{a \in a} Pr(a=\alpha|\mathcal{C}) Pr(b=\beta|a=\alpha, \mathcal{C}) + \delta(\beta|a, \mathcal{C}) \tag{12}$$

If we perform the normalization of the probability measure of supplementary δ(β | a, C) by the square root of the product of all probabilities, we obtain:

$$\lambda\_{\theta} = \frac{\delta(\beta|a, \mathcal{C})}{2\sqrt{\prod\_{a \in a} Pr(a = \alpha|\mathcal{C}) Pr(b = \beta|a = \alpha, \mathcal{C})}} \tag{13}$$

From Equation (13), the general probability formula of the Växjö Model can be derived. For two variables, it is given by:

$$\Pr(b=\beta|\mathcal{C}) = \sum\_{a \in a} \Pr(a=\alpha|\mathcal{C}) \Pr(b=\beta|a=\alpha, \mathcal{C})$$

$$+2\lambda\_{\theta} \sqrt{\prod\_{a \in a} \Pr(a=\alpha|\mathcal{C}) \Pr(b=\beta|a=\alpha, \mathcal{C})} \tag{14}$$

If we look closely at Equation (14), we can see that the first summation of the formula corresponds to the classical law of total probability. The second term of the formula (the one that contains the λ<sup>θ</sup> parameter) does not exist in the classical model and is called the interference term.

#### 4.2. The Hyperbolic Interference

Although the Quantum-Like Approach provides great possibilities compared with the classical one, it appears that it cannot completely cover data from psychology and that a quantum formalism was not enough to explain some paradoxical findings (see [50]), so hyperbolic spaces were proposed [51–53].

From Equation (14), if Pr(b = β)− P α∈a Pr(a = α|C)Pr(b = β|a = α, C) is different from zero, then various interference effects occur. To determine which type of interference occurred, one tests the Växjö Model for quantum probabilities. This can be determined by normalizing the supplementary measure in a quantum fashion, just as presented in Equation (13).

If we are under a quantum context, then the quantum interference term will be:

$$\delta(\beta|a,C) = 2\sqrt{\prod\_{a \in a} Pr(a = \alpha | C) Pr(b = \beta | a = \alpha, C)} \cos(\theta) \tag{15}$$

In a quantum context because the supplementary term δ(β | a, C) is being normalized in a quantum fashion, then we automatically know that the indicator term λ<sup>θ</sup> will always have to be smaller than 1 to obtain quantum probabilities, λ<sup>θ</sup> ≤ 1. Thus, under trigonometric contexts, the Växjö Model for quantum probabilities becomes:

$$\lambda\_{\theta} = \cos(\theta) \quad \rightarrow \quad Pr(\beta|C) = \sum\_{\alpha \in a} Pr(\alpha|C) Pr(\beta|\alpha, C)$$

$$+ 2 \sqrt{\prod\_{\alpha \in a} Pr(\alpha|C) Pr(\beta|\alpha, C) \cos(\theta)} \tag{16}$$

If, however, the probability Pr(b = β) was not computed in a trigonometric space (that is, it is not quantum), then, it is straightforward that the quantum normalization applied in Equation (13) will yield a value larger than 1. Because we are not in the context of quantum probabilities, the quantum normalization factor will fail to normalize the interference term and will produce a number larger than the normalization factor. Under these circumstances, the Växjö Model incorporates the generalization of hyperbolic probabilities, arguing that the context in which these probabilities were computed was Hyperbolic [49, 53, 54].

Under Hyperbolic contexts, the Växjö Model contextual probability formula becomes:

$$\lambda\_{\theta} = \cosh(\theta) \quad \rightarrow \quad Pr(\beta|C) = \sum\_{\alpha \in a} Pr(\alpha|C) Pr(\beta|\alpha, C)$$

$$\pm 2 \sqrt{\prod\_{\alpha \in a} Pr(\alpha|C) Pr(\beta|\alpha, C) \cosh(\theta)} \tag{17}$$

In summary, according to the values computed by the indicator function λ<sup>θ</sup> , the Växjö Model enables the computation of probabilities in the following contexts:


#### 4.3. Quantum-Like Probabilities as an Extension of the Växjö Model

The probabilities that emerge from the Växjö model for trigonometric spaces (i.e., quantum probabilities), do not provide a complete description of a quantum system because it can violate the positivity axiom of probability theory [49].

In this sense, an algorithm was proposed in the literature that extends the Växjö model and is able to accommodate the positivity axiom. The algorithm proposed is the Quantum-Like Representation Algorithm (QLRA), and it was proposed by Khrennikov [55–59].

As already mentioned, quantum complex amplitudes can be obtained from classical probability by using Born's rule [60, 61]. In the QLRA, for any trigonometric context C, one can simplify Born's rule for two dichotomous variables using (Equation 19) [49].

$$\begin{split} Pr(\beta|\mathcal{C}) &= Pr(\alpha\_1|\mathcal{C}) Pr(\beta|\alpha\_1, \mathcal{C}) + Pr(\alpha\_2|\mathcal{C}) Pr(\beta|\alpha\_2, \mathcal{C}) + \\ &\quad + 2\sqrt{Pr(\alpha\_1|\mathcal{C}) Pr(\beta|\alpha\_1, \mathcal{C})} \\ &\quad \sqrt{Pr(\alpha\_2|\mathcal{C}) Pr(\beta|\alpha\_2, \mathcal{C})} \cos\theta \end{split} \tag{18}$$

Equation (18) can be simplified in the following manner:

$$\begin{split} \Pr(\beta|\mathcal{C}) &= \left| \sqrt{\Pr(\alpha\_1|\mathcal{C}) \Pr(\beta|\alpha\_1, \mathcal{C})} \right. \\ &\left. + e^{i\theta\beta|\alpha, \mathcal{C}} \sqrt{\Pr(\alpha\_2|\mathcal{C}) \Pr(\beta|\alpha\_2, \mathcal{C})} \right|^2 \end{split} \tag{19}$$

Equation (19) corresponds to the representation of the quantum law of total probability through the Växjö model. In this equation, the angle θβ|α,<sup>C</sup> corresponds to the phase of a random variable and incorporates the phase of both A = α<sup>1</sup> and A = α<sup>2</sup> in the following manner: θβ|α, <sup>C</sup> = θβ|α<sup>1</sup> − θβ|α<sup>2</sup> .

One should note that the Quantum-Like Approach can be extended to more complex decision scenarios, that is, with more than two random variables. However, this will lead to the very difficult task of tuning an exponential number of quantum θ parameters. Peter Nyman noticed this problem when he generalized the Quantum-Like Approach for 3 dichotomous variables [52, 62–64].

### 4.4. Modeling the Prisoner's Dilemma using the Quantum-Like Approach

If we want to compute the average probabilities reported in **Table 1** for the Prisoner's Dilemma game, then we would need to make the following substitutions to Equation (18):

$$\begin{aligned} \Pr\left(\alpha\_1|\mathcal{C}\right) \cdot \Pr\left(\beta|\alpha\_1, \mathcal{C}\right) &= \Pr\left(P\_1 = \text{Defect}|\mathcal{C}\right) \\ \cdot \Pr\left(P\_2 = \text{Defect}|P\_1 = \text{Defect}\right) &= 0.5 \times 0.87 = 0.435 \\ \Pr\left(\alpha\_2|\mathcal{C}\right) \cdot \Pr\left(\beta|\alpha\_2, \mathcal{C}\right) &= \Pr\left(P\_1 = \text{Cooperate}|\mathcal{C}\right) \\ \cdot \Pr\left(P\_2 = \text{Defect}|P\_1 = \text{Cooperate}\right) \\ &= 0.5 \times 0.74 = 0.37 \end{aligned}$$

The main problem of the Växjö model and the Quantum-Like Approach is that it can only address very small decision scenarios and the fitting of the θ parameter has to be done fitted to data. To compute the probability of a player choosing to defect, Pr P<sup>2</sup> = Defect , one would proceed as follows:

$$Pr(P\_2 = Defect) = 0.435 + 0.37 + 2 \cdot \sqrt{0.435} \cdot \sqrt{0.37} \cdot \cos(\theta)$$

To achieve the observed result, θ must be equal to 1.7779 to achieve the final probability Pr(P<sup>2</sup> = Defect) = 0.64. However, this method does not provide any other means to find this θ parameter except by extrapolating the observed data.

### 5. THE QUANTUM DYNAMICAL MODEL

In the works of Busemeyer et al. [11], Pothos and Busemeyer [17], and Busemeyer et al. [35], the authors present a model to perform quantum time evolution. This model requires the creation of a doubly stochastic matrix, which represents the rotation of the participants' beliefs. The double stochasticity is a requirement to preserve unit length operations and to obtain a probability value that does not require normalization. The participants' actions are represented by a superposition vector with all possible actions: [ψDD ψDC ψCD ψCC], where C corresponds to Cooperate and D to Defect.

The doubly stochastic matrix that the model requires can only be computed by the use of an auxiliary Hamiltonian matrix, which needs to be self-adjoint. For instance, to explain the average results of the Prisoner's Dilemma game, the Hamiltonian matrix is given by Equation (20), where µ<sup>D</sup> and µ<sup>C</sup> correspond to parameters representing the payoffs of the defect and cooperate actions, respectively.

$$\begin{aligned} HA\_d &= \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \otimes \begin{bmatrix} \mu\_D & 1 \\ 1 & -\mu\_D \end{bmatrix} \frac{1}{\sqrt{1 + \mu\_D^2}} \\ HA\_c &= \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \otimes \begin{bmatrix} \mu\_C & 1 \\ 1 & -\mu\_C \end{bmatrix} \frac{1}{\sqrt{1 + \mu\_C^2}} \\ HA &= HA\_d + HA\_c = \begin{bmatrix} \frac{\mu\_D}{\sqrt{1 + \mu\_D^2}} & \frac{1}{\sqrt{1 + \mu\_D^2}} & 0 & 0 \\ \frac{1}{\sqrt{1 + \mu\_D^2}} & -\frac{\mu\_D}{\sqrt{1 + \mu\_D^2}} & 0 & 0 \\ 0 & 0 & \frac{\mu\_C}{\sqrt{1 + \mu\_C^2}} & \frac{1}{\sqrt{1 + \mu\_C^2}} \\ 0 & 0 & \frac{1}{\sqrt{1 + \mu\_C^2}} & -\frac{\mu\_C}{\sqrt{1 + \mu\_C^2}} \end{bmatrix} \tag{20} \end{aligned} \tag{21}$$

The dynamical model also takes dissonance effects into account. That is, the participants might have been confronted by some information that conflicted with his/her existing beliefs to simulate the dissonance effect when the participants had to decide on an action. Thus, the Quantum Dynamical Model makes use of a second Hamiltonian matrix, HB.

$$HB\_d = \begin{bmatrix} +1 & 0 & +1 & 0 \\ 0 & 0 & 0 & 0 \\ +1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \cdot \frac{-\gamma}{\sqrt{2}} \quad HB\_c = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & +1 \\ 0 & 0 & 0 & 0 \\ 0 & +1 & 0 & +1 \end{bmatrix} \cdot \frac{-\gamma}{\sqrt{2}}$$

$$HB = HB\_d + HB\_c = \begin{bmatrix} \frac{-\gamma}{\sqrt{2}} & 0 & \frac{-\gamma}{\sqrt{2}} & 0 \\ 0 & \frac{\gamma}{\sqrt{2}} & 0 & \frac{-\gamma}{\sqrt{2}} \\ \frac{-\gamma}{\sqrt{2}} & 0 & \frac{\gamma}{\sqrt{2}} & 0 \\ 0 & \frac{-\gamma}{\sqrt{2}} & 0 & \frac{-\gamma}{\sqrt{2}} \end{bmatrix} \tag{21}$$

The general Hamiltonian matrix combines the matrices from Equations (20) and (21). In the end, the final matrix needs to be self-adjoint and, consequently, symmetric. To explain the average results of the Prisoner's Dilemma game, the final Hamiltonian matrix is given by:

$$\begin{aligned} H &= HA + HB = \\ \begin{bmatrix} \frac{-\mathcal{V}}{\sqrt{2}} + \frac{\mu\_{D}}{\sqrt{1 + \mu\_{D}^{2}}} & \frac{1}{\sqrt{1 + \mu\_{D}^{2}}} & \frac{-\mathcal{V}}{\sqrt{2}} & 0\\ \frac{1}{\sqrt{1 + \mu\_{D}^{2}}} & \frac{\mathcal{V}}{\sqrt{2}} - \frac{\mu\_{D}}{\sqrt{1 + \mu\_{D}^{2}}} & 0 & \frac{-\mathcal{V}}{\sqrt{2}}\\ \frac{-\mathcal{V}}{\sqrt{2}} & 0 & \frac{\mathcal{V}}{\sqrt{2}} + \frac{\mu\_{C}}{\sqrt{1 + \mu\_{C}^{2}}} & \frac{1}{\sqrt{1 + \mu\_{C}^{2}}}\\ 0 & \frac{-\mathcal{V}}{\sqrt{2}} & \frac{1}{\sqrt{1 + \mu\_{C}^{2}}} & \frac{-\mathcal{V}}{\sqrt{2}} - \frac{\mu\_{C}}{\sqrt{1 + \mu\_{C}^{2}}} \end{bmatrix} \end{aligned} \tag{21}$$

Next, we need to create a unitary matrix. In quantum mechanics, a unitary matrix restricts the allowed evolution of quantum systems, ensuring that the sum of probabilities of all possible outcomes of any event is always 1. This means that the matrix must be doubly stochastic (all rows and columns sum to 1). In the Quantum Dynamical Model, this matrix encodes all state transitions that a person can experience while choosing a decision. A unitary matrix is computed by a differential equation called Schrödinger's equation:

$$\frac{\delta}{\delta t}U(t) = -i \cdot H \cdot U(t) \Rightarrow U(t) = e^{-i \cdot H \cdot t} \tag{23}$$

The parameter t corresponds to the time evolution. Under the Dynamical Quantum Model, this parameter was set to π/2, corresponding to the average time that a participant takes to make a decision (approximately 2 seconds) [17, 35]. Also, in the book of Busemeyer and Bruza [16], the authors state that the time parameter was set to π/2, because it produces a probability that reaches its maximum.

The initial belief state corresponds to a quantum state representing a superposition of the participant's beliefs.

$$Q\_i = \frac{1}{2} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \tag{24}$$

By multiplying the unitary matrix with the initial superposition belief state, one can compute the transition of the participants' beliefs at each time. The final vector Q<sup>f</sup> represents the amplitude distribution across states after deliberation.

$$Q\_{\mathcal{F}} = U \cdot Q\_i = U \cdot \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \cdot \frac{1}{2} \tag{25}$$

Having the final state QF, one can compute probabilistic inferences by computing the sum squared magnitude of the rows of interest in the final belief state. Note that the four components of the column vector Q<sup>F</sup> respectively correspond to [ DD DC CD CC ], where C corresponds to Cooperate and D to Defect. The first letter represents the action chosen by the first player, and the second letter corresponds to the action of the second player. Thus, the probability of player 2 choosing

the action Defect corresponds to the summation of the squared magnitude of the first and the third components of the column vector QF:

$$\Pr(P\_2 = \text{Defect}) = \left| Q\_{F\left[1st\_{\\_dim}\right]} \right|^2 + \left| Q\_{F\left[3rd\_{\\_dim}\right]} \right|^2$$

$$\Pr(P\_2 = \text{Cooperate}) = \left| Q\_{F\left[2nd\_{\\_dim}\right]} \right|^2 + \left| Q\_{F\left[4th\_{\\_dim}\right]} \right|^2 \text{ (26)}$$

To explain the average results observed in the Prisoner's Dilemma Game, in the work of Pothos and Busemeyer [17], the authors chose the following parameters:


Using the above parameters, one can estimate the average results of **Table 1** to be Pr(P<sup>2</sup> = Defect) = 0.64. The Quantum Dynamical model shows that quantum probability is a very general framework and can lead to many different probabilities. These probabilities just depend on the way one chooses to fit these free parameters. This has also been shown in the previous study of Moreira and Wichert [65]. To illustrate this concept, we decided to fix one of the parameters µD, µ<sup>C</sup> or γ and vary the others between the interval [−1, 1]. **Figures 2**–**4** show all possible probabilities that can be obtained with the presented Dynamical Quantum Model for the Prisoner's Dilemma game<sup>1</sup> The value of these figures is to show how sensitive quantum parameters are and how challenging it is to find values for these parameters.

In the Quantum Dynamical Model, the parameters used are based on a psychological setting. The incorporation of parameters to model dissonance effects and the payoffs of the players provide an approximation for the psychology of the problem that is not observed in other quantum cognitive

<sup>1</sup>These graphs were plotted using the Wolfram Mathematica 10.4.1 software.

Frontiers in Physics | www.frontiersin.org

**107**

June 2016 | Volume 4 | Article 26

)

Pr(πA=a<sup>1</sup> ) = Pr(A = a1, B = b1) + Pr(A = a1, B = b2) + q(πA=a<sup>1</sup> ) = |ψ11| <sup>2</sup> <sup>+</sup> <sup>|</sup>ψ12<sup>|</sup> <sup>2</sup> <sup>+</sup> <sup>q</sup>(πA=a<sup>1</sup> Pr(πA=a<sup>2</sup> ) = Pr(A = a2, B = b1) + Pr(A = a2, B = b2) (29)

+ q(πA=a<sup>2</sup> ) = |ψ21| <sup>2</sup> <sup>+</sup> <sup>|</sup>ψ22<sup>|</sup> <sup>2</sup> <sup>+</sup> <sup>q</sup>(πA=a<sup>2</sup> )

where the interference term q is defined by:

$$q(\pi\_{A=a\_1}) = 2 \cdot \varphi(\pi\_{A=a\_1}) \sqrt{Pr(A=a\_1, B=b\_1)}$$

$$\begin{split} \cdot \sqrt{Pr(A=a\_1, B=b\_2)}\\ q(\pi\_{A=a\_2}) = 2 \cdot \varphi(\pi\_{A=a\_2}) \sqrt{Pr(A=a\_2, B=b\_1)}\\ \cdot \sqrt{Pr(A=a\_2, B=b\_2)} \end{split} \tag{30}$$

models of the literature. However, one great disadvantage of the Quantum Dynamical Model is related to Hamiltonian matrices. Creating a manual Hamiltonian is a very hard problem because it is required that all possible interactions of the decision problem are known, and this specification must be made in such a way that the matrix is doubly stochastic. A recent work from Yearsley and Busemeyer [66] describes how to construct Hamiltonians for quantum models of cognition. The Hamiltonian matrix grows exponentially with the complexity of the decision problem, and the computation of a unitary operator from such matrices is a very complex process. Most of the time, approximations are used because of the complexity of the calculations involved in the matrix exponentiation operation.

#### 6. THE QUANTUM PROSPECT DECISION THEORY

The Quantum Prospect Decision Theory was developed by Yukalov and Sornette [36, 67] and developed throughout many other works [68–71]. The foundations of this theory are very similar to the previously presented Quantum-Like Approach.

In the Quantum-Like Approach, we start with two dichotomous observables. In the Quantum Prospect Decision Theory, these observables are referred to as intensions. An intension can be defined by an intended action, and a set of intended actions is defined as a prospect.

Each prospect can contain a set of action modes, which are concrete representations of an intension. Making a comparison with the Quantum-Like Approach, a prospect can be seen as a random variable, and the set of action modes are the assignments that each random variable can have. For instance, the intension to play can have two representations: play action A or play action B.

Following the work of Yukalov and Sornette [36], two intensions A and B have the respective representations: A = x where x ∈ a1, a<sup>2</sup> and B = y, where y ∈ b1, b2. The corresponding

Equation (27) represents a linear combination of the prospect basis states. From a psychological perspective, the state of mind is a fixed vector characterizing a particular decision-maker with his/her beliefs, habits, principles, etc. That is, it describes each decision-maker as a unique subject.

The prospect states corresponding to the intensions A and B are given by Equation (28). The ψ symbol corresponds to quantum amplitudes associated with the prospect state. Under the Quantum Prospect Decision Theory, these amplitudes represent the weights of the intended actions while a person is still deliberating about them.

$$\begin{aligned} \vert \pi\_{A=a\_1} \rangle &= \varepsilon\_{11} \vert A = a\_1 B = b\_1 \rangle + \varepsilon\_{12} \vert A = a\_1 B = b\_2 \rangle \\ \vert \pi\_{A=a\_2} \rangle &= \varepsilon\_{21} \vert A = a\_2 B = b\_1 \rangle + \varepsilon\_{22} \vert A = a\_2 B = b\_2 \rangle \end{aligned} \tag{28}$$

The probabilities of the prospects can be obtained by computing the squared magnitude of the prospects states (just as in the Quantum-Liek approach and the Quantum Donaldson Model).

Consequently, the final probabilistic are given by:

$$\begin{array}{c|c|c} \text{net} & \text{FIGURE 4 | llusrration of all possible probabilities, } Pr(\text{P}) \\ \hline \\ \text{can be obtained by varying the parameters } \mu\_{\mathcal{D}} \text{ and } \mu\_{\mathcal{C}}. \\ \text{can the } \sigma \text{-deformed is minimum loss.} \end{array}$$

$$\text{state of mind is given by:}$$

$$\left| \left| \psi\_{s} \left( t \right) \right> = \sum\_{i,j} c\_{i,j} \left( t \right) \left| A\_{i} \ B\_{j} \right> \tag{27}$$

In Equation (30), the symbol ϕ corresponds to the uncertainty factor and is given by Equation (31).

$$\begin{aligned} \varphi(\pi\_{A=a\_1}) &= \cos \left( \arg \left( \psi\_{11} \cdot \psi\_{12} \right) \right) \\ \varphi(\pi\_{A=a\_2}) &= \cos \left( \arg \left( \psi\_{21} \cdot \psi\_{22} \right) \right) \end{aligned} \tag{31}$$

The interference term corresponds to the effects that emerge during the process of deliberation, that is, while a person is making a decision. These interference effects result from conflicting interests, ambiguity, emotions, etc. [36].

One can notice that the Quantum Prospect Decision Theory is very similar to the Quantum-Like Approach proposed by Khrennikov [72]. Both theories end up with the same quantum probability formula. However, the Quantum Prospect Decision Theory provides some heuristics for how to choose the uncertainty factors. This information will be addressed in the next section.

#### 6.1. Choosing the Uncertainty Factor

To accommodate the violations of the Sure Thing Principle, the uncertainty factor must be set in such a way that it will enable accurate predictions. Two methods were proposed by Yukalov and Sornette [36] to estimate the uncertainty factor: the Interference Alternation method and the Interference Quarter Law.

• **Interference Alternation** - Under normalized conditions, the probabilities of the prospects p πj must sum to 1. This normalization only occurs if one characterizes the interference term as an alternation such that the interference effects disappear while summing the probability of the prospects. This results in the property of the interference alternation, given by:

$$\sum\_{j} q\left(\pi\_{j}\right) = 0\tag{32}$$

The interference alternation property is in accordance with the findings of Epstein [41]: the destructive interference effects can be associated with uncertainty aversion. This leads to a less probable action under uncertainty conditions. In contrast, the probabilities of other actions that contain less uncertainty are enhanced through constructive quantum interference effects. This uncertainty aversion happens quite frequently in situations where the Sure Thing Principle is violated. This implies that one of the probabilities of the prospects must be enhanced, whereas the other must be decreased.

$$\begin{aligned} \operatorname{sign}\left[\varphi(\pi\_{A=a\_1})\right] &= -\operatorname{sign}\left[\varphi(\pi\_{A=a\_2})\right] \\ \text{where } \left|\varphi(\pi\_{A=a\_l})\right| &\in \left[0, 1\right] \end{aligned} \tag{33}$$

• **Interference Quarter Law** - The interference terms generated by quantum probabilistic inferences have a free quantum parameter, which is the uncertainty factor (Equation 31). The Interference Quarter Law corresponds to a quantitative estimation of this parameter. The modulus of the interference term q can be quantitatively estimated by computing the expectation value of the probability distribution of a random variable ξ in the interval [0, 1]:

$$q \equiv \int\_0^1 \xi \cdot \operatorname{pr} \left( \xi \right) \, d\xi = \frac{1}{4} \tag{34}$$

The probability distribution p (ξ ) is given by Equation (35) and can be computed by taking the average of two probability distributions.

$$\left[\rho r\left(\xi\right) = \frac{1}{2}\left[\rho r\_1\left(\xi\right) + \rho r\_2\left(\xi\right)\right] = \delta\left(\xi\right) + \frac{1}{2}\Theta\left(1 - \xi\right) \quad \text{(35)}\right]$$

One of the probability distributions, (p<sup>1</sup> (ξ )), is concentrated in the center and is described by a Dirac function δ (ξ ).

$$pr\_1\left(\xi\right) = 2 \cdot \delta\left(\xi\right) \tag{36}$$

The other probability distribution,(p<sup>2</sup> (ξ )), is a uniform distribution in the interval [0, 1].

$$pr\_2\left(\xi\right) = \Theta\left(1-\xi\right) \quad \text{where } \Theta\left(\xi\right) = \begin{cases} 0, \text{ if } \xi < 0 \\ 1, \text{ if } \xi \ge 0 \end{cases} \tag{37}$$

For a more detailed proof of the Interference Quarter Law, the reader should refer to Yukalov and Sornette [36].

### 6.2. The Quantum Prospect Decision Theory Applied to the Prisoner's Dilemma Game

In this section, we apply the Quantum Prospect Decision Theory to try to predict the average results for the Prisoner's Dilemma Game reported in **Table 1**.

The probability of a player defecting (and cooperating), given that one does not know what the action of the other player was, is given by Equation (38). For simplicity, we will assume the following notation: Defect (D) and Cooperate (C).

$$\begin{aligned} Pr(P\_2 = D) &= Pr(P1 = D, P2 = D) \\ &+ Pr(P1 = C, P2 = D) + Interference\_d \\ Pr(P\_2 = C) &= Pr(P1 = D, P2 = C) \\ &+ Pr(P1 = C, P2 = C) +Interference\_c \end{aligned}$$

The interference terms are given by:

$$\begin{aligned} \text{Interference}\_d &= 2 \cdot \varphi \left( P2 = D \right) \\ &\cdot \sqrt{Pr(P1 = D, P2 = D) \cdot Pr(P1 = C, P2 = D)} \\ \text{Interference}\_d &= 2 \cdot \varphi \left( P2 = C \right) \\ &\cdot \sqrt{Pr(P1 = D, P2 = C) \cdot Pr(P1 = C, P2 = C)} \end{aligned} \tag{39}$$

The uncertainty factors are given by:

$$\wp\left(P\_{2} = D\right) = \frac{interference\_{d}}{2 \cdot \sqrt{pr(P\_{1} = D, P\_{2} = D) \cdot Pr(P\_{1} = C, P\_{2} = D)}}$$

$$\wp\left(P\_{2} = D\right) = \frac{interference\_{c}}{2 \cdot \sqrt{pr(P\_{1} = D, P 2 = C) \cdot Pr(P1 = C, P2 = C)}}\tag{40}$$

According to the Interference Quarter Law and to the Alternation Law, the probabilities for acting under uncertainty are given by:

$$\begin{aligned} Pr(P\_2 = D) &= Pr(P\_1 = D, P\_2 = D) \\ &+ Pr(P\_1 = C, P\_2 = D) - 0.25 \\ Pr(P\_2 = C) &= Pr(P\_1 = D, P\_2 = C) \\ &+ Pr(P\_1 = C, P\_2 = C) + 0.25 \end{aligned}$$

For the Prisoner's Dilemma Game,

$$\begin{aligned} Pr(P\_1 = D, P\_2 = D) &= Pr(P1 = D) \cdot Pr(P\_2 = D | P1 = D) \\ &= 0.5 \times 0.87 = 0.435 \\ Pr(P\_1 = C, P\_2 = D) &= Pr(P1 = C) \cdot Pr(P\_2 = D | P1 = C) \\ &= 0.5 \times 0.74 = 0.37 \end{aligned}$$

Then, the final predicted probabilities are given by:

$$\begin{aligned} Pr(P\_2 = D) &= 0.435 + 0.37 - 0.25 = 0.555\\ Pr(P\_2 = C) &= 0.065 + 0.13 + 0.25 = 0.445 \end{aligned} \tag{42}$$

The average probability to defect for the Prisoner's Dilemma Game in **Table 1** when the first player's action is unknown is 0.64. That means that, with the Quarter Interference Law together with the Interference Alternation property, the Prospect Quantum Decision Theory obtained an error of 13%.

#### 7. PROBABILISTIC GRAPHICAL MODELS

In this section, we introduce the concepts of classical and Quantum-Like Bayesian Networks, as well as some approaches in the literature that formalized traditional Bayesian Networks into a Quantum-Like Approach.

#### 7.1. Classical Bayesian Networks

A classical Bayesian Network can be defined by a directed acyclic graph structure in which each node represents a different random variable from a specific domain and each edge represents a direct influence from the source node to the target node. The graph represents independence relationships between variables, and each node is associated with a conditional probability table that specifies a distribution over the values of a node given each possible joint assignment of values of its parents [73].

The full joint distribution [74] of a Bayesian Network, where X is the list of variables, is given by:

$$\Pr(\mathbf{X}\_1, \dots, \mathbf{X}\_n) = \prod\_{i=1}^n \Pr(\mathbf{X}\_i | \mathbf{Pareents}(\mathbf{X}\_i)) \tag{43}$$

The formula for computing classical exact inferences on Bayesian Networks is based on the full joint distribution (Equation 43). Let e be the list of observed variables and let Y be the remaining unobserved variables in the network. For some query X, the inference is given by:

$$\Pr(X|e) = \alpha \Pr(X, e) = \alpha \left[ \sum\_{\mathcal{Y} \in Y} \Pr(X, e, \mathcal{Y}) \right] \tag{44}$$

$$\text{Where} \quad \alpha = \frac{1}{\sum\_{x \in X} Pr(X = x, e)}$$

The summation is over all possible y, i.e., all possible combinations of values of the unobserved variables y. The α parameter corresponds to the normalization factor for the distribution Pr(X|e) [74]. This normalization factor comes from some assumptions that are made in Bayes rule.

# 7.2. Classical Bayesian Networks for the Prisoner's Dilemma Game

We represent the Prisoner's Dilemma Game under a Bayesian Network structure in which we assume neutral priors: there is a 50% of a player choosing the actions Defect or Cooperate (**Figure 5**). The decision of the first participant is then followed by the decision of the second participant. The probability distribution of the second player is obtained (or learned) from the experimental data for the averaged results in **Table 1** when the actions of the first player are observed. Using this data, the goal is to try to determine the probability of the second player choosing to defect given that it is not known what action the first player chose.

To compute the probability Pr(P<sup>2</sup> = Defect), two operations are required: the computation of the full joint probability distribution (Equation 43) and the computation of the marginal probability.

The full joint probability distribution can be easily computed by multiplying all possible assignments of the network

FIGURE 5 | Bayesian Network representation of the Average of the results reported in the literature (last row of Table 1). The random variables that were considered are P1 and P2 , corresponding to the actions chosen by the first participant and second participant, respectively.



with each other. **Table 2** shows the computation of these probabilities.

The marginalization formula is used when we want to perform queries to the network. For instance, in the Prisoner's Dilemma Game, we want to know what the probability is of the second player choosing to defect given that we do not know what the other player has chosen, Pr(P<sup>2</sup> = Defect). This is obtained by summing the entries of the full joint probability (**Table 2**) that have P<sup>2</sup> = Defect. That is, we sum up the first and third rows of this table. Equation (45) shows this operation. For simplicity, we have used the following notation: D = Defect and C = Cooperate.

$$\Pr(P2 = D) = \Pr(P1 = D) \cdot \Pr(P2 = D|P1 = D) + \Pr(P1 = C)$$

$$\cdot \Pr(P2 = D|P1 = C) = 0.8050\tag{45}$$

In Equation (45), one can see that the classical Bayesian Network was not able to predict the observed results in **Table 1** using classical inference. One might think that, if we parameterize the Bayesian Network to take into account the player's actions and dissonance effects, there could be a possibility of obtaining the required results. This line of thought is legitimate, but one must take into account that, in the end, the probabilistic inferences computed through the Bayesian Network must obey set theory and the law of total probability. This means that, even if we parameterize the network, we cannot find any closed form optimization that could lead to the desired results. This happened with the previous example of the Markov Model in Section 3. Although we parameterized the player's actions and dissonance effects, we could not arrive at the desired results because they go against the laws of probability theory, and Markov Models (as well as Bayesian Networks) must obey these laws.

### 7.3. Quantum-Like Bayesian Networks in the Literature

There are two main works in the literature that have contributed to the development and understanding of Quantum Bayesian Networks. One belongs to Tucci [37] and the other to Leifer and Poulin [38].

In the work of Tucci [37], it is argued that any classical Bayesian Network can be extended to a quantum one by replacing real probabilities with quantum complex amplitudes. This means that the factorization should be performed in the same manner as in a classical Bayesian Network. Thus, the Bayesian Network of **Figure 5** could be represented by a Quantum Bayesian Network with the following matrices tables (the ordering of the probability amplitudes in the matrices are the same as the ones in **Figure 5**):

$$\begin{aligned} P\_1 &= \left[ \begin{array}{cc} a \cdot e^{i\theta\_1} & \sqrt{1 - \left| a \cdot e^{i\theta\_1} \right|^2} \cdot e^{i\theta\_2} \end{array} \right] \\ P\_2 &= \left[ \begin{array}{cc} b \cdot e^{i\theta\_3} & \sqrt{1 - \left| b \cdot e^{i\theta\_3} \right|^2} \cdot e^{i\theta\_4} \\ c \cdot e^{i\theta\_5} & \sqrt{1 - \left| c \cdot e^{i\theta\_5} \right|^2} \cdot e^{i\theta\_6} \end{array} \right] \end{aligned}$$

One significant problem with Tucci's work is related to the nonexistence of any methods to set the phase parameters e iθ . The author states that one could have infinite Quantum Bayesian Networks representing the same classical Bayesian Network depending on the values that one chooses to set the parameter. This requires that one knows a priori which parameters would lead to the desired solution for each node queried in the network (which we never know). Thus, for these experiments, Tucci's model cannot predict the results observed because one does not have any information about the quantum parameters.

In the work of Leifer and Poulin [38], the authors argue that, to develop a Quantum Bayesian Network, quantum versions of probability distributions, quantum marginal probabilities and quantum conditional probabilities are required (**Table 3**). The authors performed a preliminary study of these concepts. Generally speaking, a quantum probability distribution corresponds to a density matrix contained in a Hilbert space, with the constraint that the trace of this matrix must sum to 1. In quantum probability theory, a full joint distribution is given by a density matrix, ρ. This matrix provides the probability distribution of all states that a Bayesian Network can have. The marginalization operation corresponds to a quantum partial trace [75, 76].

In the end, the models of Tucci [37] and Leifer and Poulin [38] fail to provide any advantage relative to the classical models because they cannot take into account interference effects between random variables. Thus, they provide no advantages in modeling decision-making problems that try to predict decisions that violate the laws of total probability.

A more recent work from Moreira and Wichert [65] suggested defining the Quantum-Like Bayesian Network in the same manner as in the work of Tucci [37], replacing real probability numbers by quantum probability amplitudes.

In this sense, the quantum counterpart of the full joint probability distribution corresponds to the application of Born's rule to Equation (43):

$$\Pr(X\_1, \ldots, X\_n) = \left| \prod\_{i=1}^{N} \psi\_{(X\_i|P\_{i\text{-}P\_i}(X\_i))} \right|^2 \tag{46}$$

The general idea of a Quantum-Like Bayesian network is that, when performing probabilistic inference, the probability amplitude of each assignment of the network is propagated and influences the probabilities of the remaining nodes. In other words, every assignment of every node of the network is propagated until the node representing the query variable is reached. Note that, by taking multiple assignments and paths

#### TABLE 3 | Relation between classical and quantum probabilities used in the work of Leifer and Poulin [38].


at the same time, these trails influence each other in producing interference effects.

The quantum counterpart of the Bayesian exact inference formula corresponds to the application of Born's rule to Equation (44), leading to:

$$Pr(X|e) = \alpha \left| \sum\_{\mathcal{Y}} \prod\_{\mathbf{x}=1}^{N} \psi\_{(\mathbf{X}\_{\mathbf{x}}|P\_{\text{parents}}(\mathbf{X}\_{\mathbf{x}}), e, \mathcal{y})} \right|^2 \tag{47}$$

Expanding Equation (47), it will lead to the quantum interference formula:

$$Pr(X|\mathcal{e}) = \alpha \left( \sum\_{i=1}^{|Y|} \left| \prod\_{\mathbf{x}}^{N} \psi\_{(X\_{\mathbf{x}}|P\_{\text{Purchts}}(X\_{\mathbf{x}}), \epsilon, \mathbf{y} = i)} \right|^2 + 2 \cdot Interference\right)$$

$$\begin{aligned} Interference &= \sum\_{i=1}^{|Y|-1} \sum\_{j=i+1}^{|Y|} \left| \prod\_{x}^{N} \psi\_{(X\_{x}|P\_{2}rents(X\_{x}), e, \mathbf{y} = i)} \right| \\ &\cdot \left| \prod\_{x}^{N} \psi\_{(X\_{x}|P\_{2}rents(X\_{x}), e, \mathbf{y} = j)} \right| \cdot \cos(\theta\_{i} - \theta\_{j}) \text{(48)} \end{aligned}$$

In the Quantum Dynamical Model, because it uses unitary operators, the double symmetric property of these operators does not require the normalization of the computed values. However, in this approach, because we do not have the constraints of double stochasticity operators, we need to normalize the final scores that are computed to achieve a probability value. In classical Bayesian inference, normalization of the inference scores is also necessary due to assumptions made in Bayes rule. The normalization factor corresponds to α in Equation (48).

Note that, in Equation (48), if one sets (θ<sup>i</sup> − θj) to π/2, then cos(θ<sup>i</sup> − θj) = 0, which means that the quantum Bayesian Network collapses to its classical counterpart. That is, they can behave in a classical way if one sets the interference term to zero. Moreover, in Equation (48), if the Bayesian Network has N binary random variables, we will end up with 2<sup>N</sup> free quantum θ parameters. We represent each set of quantum parameters as a single parameter of the full joint probability distribution just like it is presented in **Table 4**. Approaches to tune those parameters under a Quantum-Like Bayesian Network approach are still an open research question.

In the model of Moreira and Wichert [65], if there are many unobserved nodes in the network, then the levels of uncertainty are very high and the interference effects produce changes in the final likelihoods of the outcomes. However, in the opposite scenario, when there are very few unobserved nodes, then the proposed quantum model tends to collapse into its classical counterpart because the uncertainty levels are very low. This work only provides a study on the impact of the quantum parameters in complex decision scenarios. On later works, the same authors have proposed the usage of heuristics to automatically assign values to quantum parameters [39, 77].

### 7.4. Application of the Quantum-Like Formalism to the Prisoner's Dilemma Game

In this section, we will demonstrate how the proposed Bayesian Network can be applied to the average results presented in **Table 1** for the Prisoner's Dilemma game, just as was proposed in the work of Moreira and Wichert [65].

We begin applying the Quantum-Like formalism by creating a Bayesian Network out of the decision problem, in which real classical probabilities are replaced by quantum amplitudes (**Figure 6**). In the Prisoner's Dilemma Game, if nothing is told to the participants, then there is a 50% chance of the first participant choosing to defect or cooperate. The decision of the first participant is then followed by the decision of the second participant.

To compute the probability Pr(P<sup>2</sup> = Defect), two operations are required: the computation of the quantum version of the full joint probability distribution (Equation 46) and the computation of the quantum version of the marginal probability (Equation 48).

The full joint probability distribution can be easily computed by multiplying all possible assignments of the network with each other. For instance, the quantum full joint probability amplitude ψ(P1=Defect,P2=Defect) is given by multiplying the prior probability amplitude ψ(P1=Defect) with the conditional probability amplitude ψ(P2=Defect|P1=Defect) . **Table 4** shows the computation of these quantum probability amplitudes.

From the quantum version of the full joint probability distribution, one can compute the quantum version of the marginal probability distribution by summing all the entries of **Table 4** that contain the assignment P2 = Defect (Equation 49). For simplification purposes, we will consider the following abbreviations: Defect = D and Cooperate = C.

$$\begin{split} Pr\left(P\_{2} = D\right) &= \alpha \left| \left| \psi\_{(P\_{1} = D, P\_{2} = D)} \right|^{2} + \left| \psi\_{(P\_{1} = C, P\_{2} = D)} \right|^{2} \right. \\ &\left. + 2 \cdot \psi\_{(P\_{1} = D, P\_{2} = D)} \cdot \psi\_{(P\_{1} = C, P\_{2} = D)} \cos\left(\theta\_{A} - \theta\_{B}\right) \right| \end{split} \tag{49}$$

$$\begin{split} Pr\left(P\_{2} = D\right) &= \alpha \mathbb{I}\left|\left|\psi\_{(P\_{1} = D)} \cdot \psi\_{(P\_{2} = D|P\_{1} = D)}\right|\right|^{2} + \\ &\quad \left|\psi\_{(P\_{1} = C)} \cdot \psi\_{(P\_{2} = D|P\_{1} = C)}\right|^{2} + 2 \cdot \psi\_{(P\_{1} = D)} \\ &\quad \cdot \psi\_{(P\_{2} = D|P\_{1} = D)} \cdot \psi\_{(P\_{1} = C)} \cdot \psi\_{(P\_{2} = D|P\_{1} = C)} \\ &\quad \cdot \cos\left(\theta\_{A} - \theta\_{B}\right) \left[ \end{split} \tag{50}$$


TABLE 4 | Quantum full joint probability amplitude distribution representation of the Bayesian Network in Figure 5.

FIGURE 6 | Bayesian Network representation of the Average of the results reported in the literature (last row of Table 1). The random variables that were considered are P1 and P2, corresponding to the actions chosen by the first participant and second participant, respectively.

$$\Pr\left(P\_2 = D\right) = \alpha \left[ |0.6595|^2 + |0.6083|^2 + 2 \times 0.6595$$

$$\begin{aligned} & \quad \times 0.6083 \cdot \cos\left(\theta\_A - \theta\_B\right) \\ & = \alpha \left[ 0.8050 + 0.8023 \cdot \cos\left(\theta\_A - \theta\_B\right) \right] \end{aligned} \right] \quad \text{(51)}$$

To compute the normalization factor α, we also need to compute Pr(P2 = C):

$$\Pr\left(P\_2 = C\right) = \alpha \mathbb{I}\left|\left\langle \psi\_{(P\_1 = D)} \cdot \psi\_{(P\_2 = C|P\_1 = D)} \right\rangle^2 + \left|\psi\_{(P\_1 = C)} \right. \right.$$

$$\begin{split} \left. \left. \psi\_{(P\_2 = C|P\_1 = C)} \right|^2 + & 2 \cdot \left. \psi\_{(P\_1 = D)} \cdot \psi\_{(P\_2 = C|P\_1 = D)} \right. \\ \left. \left. \psi\_{(P\_1 = C)} \cdot \psi\_{(P\_2 = C|P\_1 = C)} \cdot \cos \left(\theta\_A - \theta\_B\right) \right] \right. \end{split} \tag{52}$$

$$\Pr\left(P\_2 = C\right) = \alpha \left[ |0.255|^2 + |0.3606|^2 + 2 \times 0.255 \times 0.3606 \right]$$

$$\cdot \cos\left(\theta\_A - \theta\_B\right) = \alpha \left[ 0.195 + 0.1839 \right]$$

$$\cdot \cos\left(\theta\_A - \theta\_B\right) \tag{53}$$

The normalization factor α is given by Equation (54).

$$\alpha = \frac{1}{\Pr\left(P\_2 = D\right) + \Pr\left(P\_2 = C\right)} = \frac{1}{1 + 0.9862 \cdot \cos\left(\theta\_A - \theta\_B\right)}\tag{54}$$

Equation (54) contains two quantum parameters θ. Setting these parameters is still an open research question in the literature, although in some works, various heuristics have been proposed to address this problem [39, 40, 77].

### 8. DISCUSSION OF THE PRESENTED MODELS

The purpose of this section is to present discussion of and a comparison between the existing quantum models in terms of the proposed evaluation metrics: terms of interference, parameter tuning and scalability. The discussion will be mainly focused on the set of parameters that the current quantum cognitive models have that need to be fitted to match the desired predictions. For instance, the Quantum Dynamical Model requires three parameters for such small decision scenarios, whereas the Quantum-Like Approach only needs one, and the Quantum Prospect Decision Theory does not need any parameters because it has a static heuristic to replace the interference term. Note that the Quantum Dynamical Model uses three parameters µ<sup>c</sup> , µd, γ to predict three probabilities to defect when { known to defect, known to cooperate, unknown }. While the Quantum-Like Approach uses one chosen parameter and two probabilities to defect { known to defect, known to cooperate } to predict one probability to defect when { unknown }. In the end, we will see that the problems that we note for the quantum models are similar to many other classical cognitive models.

## 8.1. Discussion in Terms of Interference, Parameter Tuning and Scalability

In this section, we analyze the presented works in the literature regarding three different metrics: interference effects, parameter tuning, and scalability.


• **Scalability.** Most problems of the current models of the literature are concerned with their inability to scale to more complex decision scenarios. Most of these models are built to explain very small paradoxical findings (for example, the Prisoner's Dilemma Game and the Two-Stage Gambling Game). Therefore, this metric consists of analyzing the presented models with respect to their ability to extend and generalize to more complex scenarios.

**Table 5** presents a summary of the evaluation of the models presented in this work with respect to the three metrics described above. The parameter growth column is based on the number of parameters that each model generates when we increase the number of unknown random variables in the decision model

Starting the discussion with the classical models presented in Sections 3 and 7.1, the probabilistic inferences computed through Bayesian/Markov Networks must obey set theory and the law of total probability. This means that, even if we parameterize the networks, we cannot find any closed form optimization that could lead to the desired results. These networks can be modeled with no parameters (just as was presented in Sections 3 and 7.1), or they can be parameterized. This parameterization can end up with the same size as the full joint probability distribution of the networks. Although these models do not make use of any quantum interference effects and consequently cannot accommodate violations of the Sure Thing Principle, it is worth noting that one can always classically explain behavioral results through appropriate conditionalizations and extensions of classical probabilistic models [16].

The Quantum-Like Approach [72] is based on the direct mapping of classical probabilities to quantum probability amplitudes through Born's rule. This means that one can perform inferences for more complex decision-making scenarios by using the quantum counterpart of the classical marginal probability formula. Thus, the model generates quantum interference effects. The main problem of the Quantum-Like approach concerns the quantum parameters. The current works of the literature do not provide any means to assign values to these quantum parameters. They have to be fitted to explain the observed outcome. Thus, the Quantum-Like approach, although it can be (mathematically) extended to more complex decision scenarios, does not provide any means to assign quantum parameters. Note that, in the Quantum-Like approach, just like in many other models, it is required a mathematical fitting of a set of parameters to make an optimal prediction of the probabilities. So, the Quantum-Like Approach is considered to be a predictive model.

The Quantum Dynamical Model proposed by Pothos and Busemeyer [17] and Busemeyer et al. [35] incorporates quantum interference effects not from the quantum law of probability but by the usage of unitary operators and Hamiltonians. One of the main disadvantages of this model concerns the definition of the Hamiltonian matrices. Creating a Hamiltonian is a very hard problem. It is required that all possible interactions of the decision problem are known, and this specification must be made in such a way that the matrix is doubly stochastic. The unitary matrix also grows exponentially with the complexity of the decision problem, and the computation of a unitary operator from such matrices is a very complex process. Most of the time, approximations are used because of the complexity of the calculations involved in the matrix exponentiation operation. Just as in the Quantum-Like Approach, one needs to fit the quantum parameters so that the final model can give the observed outcome. It is important to note that, in the Quantum Dynamical Model, the parameters used are based on a psychological setting. The incorporation of parameters to model dissonance effects and the payoffs of the players provide an approximation to the psychology of the problem that is not observed in other quantum cognitive models in the literature.

Finally, the Quantum Prospect Theory proposed by Yukalov and Sornette [36] also incorporates quantum interference effects from the quantum law of total probability. This model is very similar, from a mathematical point of view, to the Quantum-Like Approach, with the difference that it proposes laws to compute the quantum interference parameters: the alternation and the quantum quarter laws. Although the model is very precise for very small decision problems (such as the Prisoner's Dilemma), it is not clear how the quantum quarter law and the alternation law would work for more complex problems. For this reason, the Quantum Prospect Theory is a model that enables the usage of quantum interference terms to make predictions under paradoxical scenarios and also provides an automatic mechanism to set the quantum parameters under very small scenarios with a static interference term (q = ±0.25). That is, the interference term is always the same, even for different contextual problems. For this reason, the model is not able to generalize well for more complex decision scenarios.

Regarding Bayesian Networks, it is hard to apply the model proposed in the work of Tucci [37] in paradoxical findings that violate the Sure Thing Principle because the author makes no mention of how to set these parameters. He even argues that a classical Bayesian Network can be represented by an infinite number of quantum Bayesian Networks depending on how one tunes the quantum parameters. Because the model is a Bayesian Network, one is able to perform inferences for any scenario by using the quantum counterpart of the classical marginal probability formula. Thus, in the end, the quantum Bayesian Network proposed by Tucci [37] is scalable and takes into account quantum interference effects; however, it does not give any insights into how to set the quantum parameters that result from the interference.

In the work of Leifer and Poulin [38], the authors create a direct mapping from classical probability to quantum theory. Because they made a quantum Bayesian Network, this model enabled probabilistic inference, and consequently, it can be generalized for any number of random variables through the use of the quantum part of the marginal probability formula. By making the direct mapping from classical to quantum probabilities, the full joint probability distribution is mapped into a density matrix. This means that the interference terms are canceled. The authors also take into account the order in which the operations are performed. Because, the commutativity axiom is not valid in quantum mechanics, we obtain different outcomes if the calculations are performed in a different order. Thus, the quantum Bayesian Network proposed by Leifer and Poulin [38]



is scalable and takes into account quantum interference effects; however, by making a direct mapping from classical to quantum, these interference effects will cancel because the network will collapse into its classical counterpart. Thus, in the end, this model does not take advantage of quantum interferences to explain paradoxical decision scenarios.

In the work of Moreira and Wichert [65], the authors also make a direct mapping from classical theory to quantum probability by replacing classical real probability values by complex quantum probability amplitudes using Born's rule. They also applied the same mechanism to derive a quantum-like full joint probability distribution formula and a quantum-like marginal probability distribution for exact inference. In the end, the model is very similar to the Quantum-Like Approach, and it can be modeled for more complex decision-making scenarios very easily due to its graphical structure. Because this model uses quantum probability amplitudes, quantum interference effects arise from the quantum-like exact inference formula. However, the number of parameters grows exponentially large when the levels of uncertainty are high, that is, when there are many unobserved nodes in the network. Although the authors have proposed some dynamic heuristics to address this problem in recent works [39, 40, 77], one needs to take into account that they are heuristics, which means that it can lead to the expected outcome, but it can also lead to completely inaccurate results.

Note that we are aware that the problems that we note in this discussion section about the quantum models are the same in many cognitive science models. However, we are not claiming that it is difficult to find the parameters for a game such as the Prisoner's Dilemma. What we are claiming is that the several models analyzed in this work (Quantum-Like Approach, Quantum Dynamical Model, Quantum-Like Bayesian Networks) contain a set of parameters that need to be fitted to match the desired predictions.

For instance, the Quantum Dynamical Model requires three parameters for such a small decision scenario, whereas the Quantum-Like Approach only needs one, and the Quantum Prospect Decision Theory does not need any parameters, because it has a static heuristic to replace the interference term. The purpose of this discussion section is simply to compare the existing quantum models in terms of the evaluation metrics specified in **Table 5**.

# 8.2. Discussion in Terms of Parameter Growth

All models analyzed in this work present different growth rates in what concerns parameters. For instance, the Dynamical Model parameterizes the player's actions plus an additional parameter to model cognitive dissonance effects. Thus, the number of parameters would be static if we consider the N-Person Prisoner's Dilemma Game. That is, instead of having only 2 players, it is extended to N players. In the case of the Quantum-Like Approach, we would have 2<sup>N</sup> parameters for the N-Person Prisoner's Dilemma Game. The number 2 comes from the fact that each player has two actions (either defect or cooperate). The same applies to the Classical Networks, the Quantum-Like Bayesian Networks and the Quantum Prospect Theory Model. However, because the authors of this last model presented the Quantum Quarter Law of Interference as a static heuristic, this model does not require any parameters.

At this point, the reader might be thinking that the Quantum Dynamical Model provides great advantages vs. the existing models because the number of parameters required corresponds to the player's actions with an additional cognitive dissonance parameter. Although this line of thought is correct, one should also take into account how the model unfolds. Although the numbers of parameters do not grow exponentially large as in the Quantum-Like Approach, the size of the Hamiltonian does. In fact, it grows exponentially large with the following size: N Nplayers actions × N Nplayers actions , where Nactions represents the number of actions of the players and Nplayers corresponds to the number of players.

We conclude this section by clarifying that most of the quantum cognitive models proposed in the literature have been directed toward small decision scenarios because of the scarcity of datasets representing complex decision scenarios and violations of the Sure Thing Principle. Consequently, the models proposed are simply overfitting simple decision scenarios. Moreover, we believe that the violations of the Sure Thing Principle tend to diminish with the complexity of the decision scenario. Imagine, for instance, a Three-Stage Gambling game. It will be very hard to find significant data that shows a player wishing to play the last gamble given that he has lost the two previous gambles. More experimental data and more studies are needed for more complex decision scenarios to test the viability of quantum models vs. their classical counterparts.

### 9. CONCLUSION

Recent work in cognitive psychology has revealed that quantum probability theory provides another method of computing probabilities without falling into the restrictions that classical probability has in modeling cognitive systems of decisionmaking. Quantum probability theory can also be seen as a generalization of classical probability theory, because it also includes the classical probabilities as a special case (when the interference term is zero).

Quantum probability has the particularity of enabling the representation of events in a geometric structure. The main advantage of this geometrical representation is the ability to rotate from one basis to another to contextualize and interpret events. This ability does not exist in the classical probability theory and provides great flexibility for decision-making systems. Consequently, quantum probability can be more expressive than its traditional classical counterpart. Under quantum theory, these paradoxical findings can simply be seen as consequences of the geometric flexibility that quantum probability theory offers.

We have collected a set of models from the literature that attempt to tackle violations of the Sure Thing Principle in a Quantum fashion, and then we compared the collected models. To illustrate this comparison, we provided a mathematical description of each model and how they could be applied in a decision scenario. We compared the models in terms of three proposed metrics: the number of parameters involved in the model, the scalability and the usage of the quantum interference effects. We have also performed a more detailed study concerning the growth of the number of quantum parameters when the complexity and the levels of uncertainty of the decision scenario increase. We have also performed this comparison with classical models, namely a Markov Model and a Bayesian Network. The main statement of this work is not to express that quantum models are preferred with respect to the classical models. With this work, we have concluded that purely classical models suffer from the same exponential parameterization growth as quantum models, with the added difficulty that they are not capable of simulating results that violate the Sure Thing Principle. It is worth

## REFERENCES

1. Friedman M, Savage L. The expected-utility hypothesis and the measurability of utility. J Polit Econ (1952) **50**:463–74. doi: 10.1086/2 57308

noting that one can always classically explain behavioral results through appropriate conditionalizations of the classical law of total probability. In the end, classical models are constrained to obey set theory and the laws of probability theory, so there is no closed optimization form that could lead to the paradoxical results found in the experiments violating the Sure thing Principle.

The proposed models of the literature only work for very small decision problems. Most of them do not provide any means to fit the quantum parameters that are required in their models. These models are useful to accommodate the paradoxical violations reported in the literature, but are not able to predict the decisions of the players without a manual fit of the parameters. One should also note that it is very difficult to validate these types of models, especially when the complexity of the decision problem increases. Thus far, in the literature, there are almost no demonstrations of violations of the Sure Thing Principle for more complex decision scenarios. More studies are needed in this direction to validate the viability of quantum models.

This work provides a technical overview of the proposed quantum models of the literature and a discussion of many key aspects of the original studies. With the proposed evaluation metrics, we were able to discuss many key aspects that have been ignored in the literature, namely how the quantum interference terms affect the complexity of the decision problems. Most of the quantum cognitive models proposed in the literature cannot predict the results observed in several experiments of the literature without first knowing the outcome of the experiment. Having this information, they can then fit their models to the desired outcome. Thus, the primary goal of these models is to accommodate the violations of the Sure Thing Principle. The usage of parameters, in some models, with a more clear psychological interpretation are also considered to be explicative. The discussions addressed turn this work into a complement to the study of the original works.

# AUTHOR CONTRIBUTIONS

All authors listed, have made substantial, direct and intellectual contribution to the work, and approved it for publication.

# ACKNOWLEDGMENTS

This work was supported by national funds through Fundação para a Ciência e a Tecnologia (FCT) with reference UID/CEC/50021/2013 and through the PhD grant SFRH/BD/92391/2013. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Moreira and Wichert. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Topological and Orthomodular Modeling of Context in Behavioral Science

Louis Narens \*

*Department of Cognitive Sciences, University of California, Irvine, Irvine, CA, USA*

Two non-boolean methods are discussed for modeling context in behavioral data and theory. The first is based on intuitionistic logic, which is similar to classical logic except that not every event has a complement. Its probability theory is also similar to classical probability theory except that the definition of probability function needs to be generalized to unions of events instead of applying only to unions of disjoint events. The generalization is needed, because intuitionistic event spaces may not contain enough disjoint events for the classical definition to be effective. The second method develops a version of quantum logic for its underlying probability theory. It differs from Hilbert space logic used in quantum mechanics as a foundation for quantum probability theory in variety of ways. John von Neumann and others have commented about the lack of a relative frequency approach and a rational foundation for this probability theory. This article argues that its version of quantum probability theory does not have such issues. The method based on intuitionistic logic is useful for modeling cognitive interpretations that vary with context, for example, the mood of the decision maker, the context produced by the influence of other items in a choice experiment, etc. The method based on this article's quantum logic is useful for modeling probabilities across contexts, for example, how probabilities of events from different experiments are related.

#### Edited by:

*Emmanuel E. Haven, University of Leicester, UK*

#### Reviewed by:

*Irina Basieva, Graduate School for the Creation of New Photonics Industries, Russia Tomas Veloz, University of British Columbia, Canada*

> \*Correspondence: *Louis Narens lnarens@uci.edu*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *26 May 2016* Accepted: *18 January 2016* Published: *14 February 2017*

#### Citation:

*Narens L (2017) Topological and Orthomodular Modeling of Context in Behavioral Science. Front. Phys. 5:4. doi: 10.3389/fphy.2017.00004* Keywords: non-boolean methods, Hilbert space, intuitionistic logic, quantum logic, event lattices

# 1. INTRODUCTION

Probability functions are special kind of functions on event algebras. Following Birkhoff and von Neumann [1], a lattice event algebra is a structure of the form,

$$\mathfrak{X} = \langle \mathcal{X}, \subseteq, \Downarrow, \cap, X, \mathcal{Q} \rangle,$$

where <sup>X</sup> is a nonempty set, <sup>X</sup> is a set of subsets of <sup>X</sup>, <sup>⊆</sup> is the set-theoretic subset relation, <sup>X</sup> and the empty set <sup>∅</sup> are in <sup>X</sup> , and for all <sup>A</sup> and <sup>B</sup> in <sup>X</sup> , <sup>A</sup> <sup>⋒</sup> <sup>B</sup> is the <sup>⊆</sup>-least upper bound in <sup>X</sup> of <sup>A</sup> and <sup>B</sup>, and <sup>A</sup> <sup>⋓</sup> <sup>B</sup> is the <sup>⊆</sup>-greatest lower bound in <sup>X</sup> of <sup>A</sup> and <sup>B</sup>. <sup>X</sup> is said to be complemented if and only if for all <sup>A</sup> in <sup>X</sup> there exists a <sup>B</sup> in <sup>X</sup> , called the complement of A, such that <sup>A</sup> <sup>⋒</sup> <sup>B</sup> <sup>=</sup> <sup>X</sup> and <sup>A</sup> <sup>⋓</sup> <sup>B</sup> <sup>=</sup> <sup>∅</sup>. (Throughout this article, <sup>⋒</sup> and <sup>⋓</sup> will always denote, respectively, the <sup>⊆</sup>-least upper bound and ⊆-greatest lower bound operators on some collection of sets. The complement of A will often be denoted by A⊥.) A special kind of lattice event algebra has been used throughout science and mathematics to describe the domain of finitely additive probability functions. It is where

$$\mathfrak{X} = \langle \mathcal{X}, \cup, \cap, -, X, \mathcal{Q} \rangle$$

i.e., where <sup>⋒</sup> = set-theoretic union, <sup>∪</sup>, <sup>⋓</sup> = set-theoretic intersection, ∩, and set-theoretic complementation, −, is a complementation operation for X. This special event algebra is called a set-theoretic boolean algebra.

Probability theory began in the seventeenth century with the study of gambling games. Part of the assumptions underlying such games was that the occurrence of each event that was the basis of a wager could be determined to have happened or could be determined not to have happened. The non-happening of an event A was viewed as the occurrence of another event, the complement of A, −A. Ambiguous or indefinite outcomes were not allowed. In the nineteenth century Boole formulated the logical structure underlying such gambling situations as a settheoretic boolean algebra. One principle of this algebra is the Law of the Excluded Middle: For each event A, either A happens or −A happens, or in algebraic notation, A ∪ −A = X, where X is the sure event. Another is the Distributive Law, for all A, B, C, A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

During the late 1920s to early 1930s, the validity of the Law of the Excluded Middle and the Distributive Law were called into question as general logical principles: The mathematician Brouwer concluded that the Law of the Excluded Middle was improper for some kinds of mathematical inference, and the mathematician von Neumann found the Distributive Law to be too restrictive for the structure of events in quantum physics. Both Brouwer and von Neumann constructed new logics that generalized boolean algebras.

Brouwer's logic became known as intuitionistic logic. This article uses the special form of it that is a topology open sets. Brouwer developed his intuitionistic logic for philosophical considerations in the foundations of mathematics. Here intuitionistic logic is used for entirely different purposes: It has a more flexible algebraic structure than boolean algebras, and this flexibility is exploited to described how context can affect probability in organized manners.

#### 2. TOPOLOGICAL ALGEBRA OF EVENTS

<sup>T</sup> = h<sup>T</sup> ,∪,∩, – , ˙ <sup>X</sup>, <sup>∅</sup><sup>i</sup> is said to be a topological algebra if and only if T is a topology of open subsets with universal set X, and for each A in T ,

$$
\neg A \text{ is the } \subseteq \text{-largest element of } T \text{ that is in } X - A,
$$

that is,

$$\div A = \bigcup\_{B \cap A = \mathcal{B}} B$$

–˙ A is called the pseudo complement of A. For the special case where T is a boolean algebra (and thus each element of T is both an open and closed set), – is set-theoretic complementation, ˙ −. A "topological probability function" is defined on T as follows:

Definition 1. A topological probability function P is a function from T into the closed interval [0,1] of the reals such that for all A and B in T ,


If T is a boolean algebra, then topological finite additivity is logically equivalent to the usual concept of finite additivity for probability functions. In this article, a finitely additive probability function on a set-theoretic boolean algebra is called a boolean probability function.

A topology with a topological probability function is a generalization of a set-theoretic boolean algebra with a finitely additive probability function. Topologies are much richer algebraically than boolean algebras, and this richness is useful for describing probabilistic concepts that are difficult or impossible to formulate in a boolean algebra, for example, various concepts of ambiguity, vagueness, and incompleteness. This article uses topologies to formulate a specific concept of "context" that applies to some decision situations. This is done through the use of properties of the pseudo complementation operation – . ˙

Definition 2. Let <sup>T</sup> = h<sup>T</sup> ,∪,∩, –˙ , <sup>X</sup>, <sup>∅</sup><sup>i</sup> be a topological algebra. Then A in T is said to be a refutation if and only if there exists a B in <sup>T</sup> such that A = ˙– B.

One interpretation of – is based on the operations of ˙ "verification" and "refutation" used in the philosophy of science. For this interpretation, an underlying empirical domain is assumed along with a scientific theory about its events. An event is said to be "verified" if its occurrence is empirically verified or it is a direct consequence of the underlying theory. An event A is said to be "refuted" if and only if the assumption of its occurrence is inconsistent with known facts and theory about its occurrence. Event A can be refuted by verifying an event B such that A ∩ B = ∅. A can also be refuted by showing its occurrence is inconsistent with known verifiable events and fundamental tenets of the theory underlying the empirical domain. Under this interpretation, the refutation of A is the largest open set S in the topology that refutes A. It follows that A ∩ S = ∅ and thus S = ˙– A. The refutation of –˙ A, –˙ –˙ A, is the largest open set T that refutes –˙ A. Because A ∩ ˙– A = ∅, A refutes –˙ A. However, is often the case that –˙ –˙ A is not verifiable—i.e., it is only the case that –˙ A is refutable. In such a situation A ⊂ ˙– –˙ A. Because of this, it is often the case that for verifiable A, A∪ ˙– A is not the sure event. This reflects that in most cases that verifiability should not be identified with truth and refutation with falsehood.

Refutations play a different role in defining context for topological algebras. Their key properties for this are given in the following theorem.

**Theorem 1.** Let <sup>T</sup> = h<sup>T</sup> ,∪,∩, –˙ , <sup>X</sup>, <sup>∅</sup><sup>i</sup> be a topological algebra. Then the following six statements hold for all A and B in X .


**Proof**. Statements 1–4 follow from Theorem 3.13 of Narens [2]. Statements 5 and 6 follow from Theorem 8 of Narens [3].

The key difference between set-theoretic boolean algebras and topological algebras is that in set-theoretic boolean algebras

$$B \; = \; - \; -B$$

for all B in the algebra, whereas for topological algebras that are not boolean it can be shown that there exist events A and D such that

$$A \subset D \subset \dot{-} \dot{-} A. \tag{1}$$

By Statement 5 of Theorem 1, such a D in Equation 1 cannot be a refutation.

Let <sup>T</sup> = h<sup>T</sup> ,∪,∩, – , ˙ <sup>X</sup>, <sup>∅</sup><sup>i</sup> be a topological algebra. Define <sup>≡</sup> on <sup>T</sup> as follows: For all <sup>A</sup> and <sup>B</sup> in <sup>T</sup> , <sup>A</sup> <sup>≡</sup> <sup>B</sup> if and only if –˙ –˙ <sup>A</sup> = ˙– –˙ <sup>B</sup>. Then <sup>≡</sup> is an equivalence relation on <sup>T</sup> , and each ≡-class is called a contextual class. The ≡-class to which A belongs, A <sup>≡</sup>, is called the contextual class for A. Note if A∩B = ∅, then A and B belong to different contextual classes by Statement 5 of Theorem 1.

In psychology, context viewed as an operation that changes an event's interpretation. This is often done in formalizations by making a distinction between a description of an event E (gamble, etc.) given in an instruction and the interpretation of that description in a context C, EC, that can vary with instructions, emotional states, or other forms of context.

The contextual classes of a topological algebra are highly structured. In particular, each contextual class A <sup>≡</sup> has a ⊆ maximal element, the refutation –˙ –˙ A, and that these maximal elements form the following boolean algebra.

**Theorem 2.** Define <sup>⊎</sup> on the set of refutations <sup>R</sup> as follows: For all A and B in <sup>R</sup>, A <sup>⊎</sup> <sup>B</sup> = ˙– –˙(<sup>A</sup> <sup>∪</sup> <sup>B</sup>). Then

$$\mathfrak{R} = \langle \mathcal{R}, \uplus, \cap, \dot{\cdot}, X, \mathcal{Q} \rangle$$

is a boolean lattice, that is, for all A, B, C in <sup>R</sup>, A <sup>∩</sup> (<sup>B</sup> <sup>⊎</sup> <sup>C</sup>) <sup>=</sup> (A ∩ B) ⊎ (A ∩ C).

**Proof**. Theorem 3.16 of Narens [2].

In general R is not a set-theoretic boolean algebra, because there may exists an element <sup>A</sup> in <sup>R</sup> such that <sup>A</sup>∪ ˙– <sup>A</sup> is a proper subset of <sup>X</sup>. When –˙ <sup>A</sup>∪<sup>A</sup> <sup>=</sup> <sup>X</sup> for all <sup>A</sup> in <sup>R</sup>, <sup>R</sup> is called a stone algebra, and it can be shown that –˙ = − on <sup>R</sup>, that is, <sup>R</sup> is a set-theoretic boolean algebra. Stone algebras are useful in applications, because a topological probability function P on a stone algebra T is also a finitely additive probability function on R, and for each A in <sup>T</sup> , <sup>P</sup>(–˙ –˙ <sup>A</sup>) can be viewed as the upper boolean probability of the topological probabilities of the events in the contextual class A ≡.

There are many ways contextual classes can be used in psychology. One way is to provide generalizations of the standard theory for rational decision making, SEU (Subjective Expected Utility.) For gambling situations, SEU assumes a gamble g = (a1, A<sup>1</sup> · · · an, An) is composed of a series of terms of the form ai , A<sup>i</sup> , where a<sup>i</sup> , A<sup>i</sup> stands for receiving outcome a<sup>i</sup> if the event A<sup>i</sup> occurs, and where S<sup>n</sup> <sup>i</sup> <sup>=</sup> <sup>1</sup> <sup>A</sup><sup>i</sup> is a partition of the sure event X. In determining the utility of gambles in SEU, the subjective probability P(Ai) of A<sup>i</sup> is independent of the outcome a<sup>i</sup> across gambles. That is, SEU requires that if b<sup>i</sup> , A<sup>i</sup> is a term in another gamble h that partitions X, then P(Ai) is also the probability assigned to A<sup>i</sup> in the computation of h. Some in the literature have question whether this is a valid rationality principle. In any case, one might want to investigate psychological models where such independence is violated. This is done in a model of Narens [3] called "DSEU" ("Descriptive Subjective Expected Utility"). In DSEU, the nature of the outcome a in a term a, A can influence the implied subjective judgment of the probability of the event A, e.g., where a is a catastrophe such as losing one's life vs. a is winning \$5. Narens models the various interpretations of an event occurring in different gambles as events in a contextual class of a topological algebra. Strong disjointness (i.e., –˙ –˙ C ∩ –˙ –˙ D) guarantees that contextual interpretations of gambles remain gambles. Narens [3] shows that subjective judgments of the utilities of the contextual interpretations of gambles and their associated subjective probability of events are rational in the sense that there is a SEU model that has a submodel that is isomorphic to the judgments made on the subjective interpretations of gambles. The existence of such a submodel shows that any irrationality observed in the DSEU model by standard tests (e.g., making a Dutch Book) will transfer to SEU, making SEU irrational by such tests, which is impossible by known results.

# 3. A BEHAVIORAL QUANTUM PROBABILITY THEORY

#### 3.1. Orthomodular Event Lattices

In making decisions involving probabilistic phenomena, people's behavior often violate economic and philosophic principles of rationality. Various theories in economics and psychology have been developed to account for these violations, Prospect Theory of Kahneman and Tversky [4] being currently the most influential. Almost always the accounts assumed an underlying boolean algebra of events. The deviations from SEU are modeled by changing or generalizing characteristics of a finitely additive probability function. Relatively recently, a different approach has been taken: Change the event space to accommodate the violations of economic and philosophic rationality. Topological event spaces of the previous section are one example of such an approach. More commonly in the literature are modeling techniques inspired by von Neumann's approach to quantum mechanics, for example, Busemeyer and Bruza [5].

In his classic Mathematische Grundlagen der Quantenmechanik, von Neumann [6] modeled probabilistic quantum phenomena using closed subspaces of a Hilbert space as events. The seminal article by Birkhoff and von Neumann [1], "The Logic of Quantum Mechanics," isolated the algebraic properties of the event spaces that von Neumann thought underly the probability theory inherent in quantum phenomena. The logic consisted of the following:


$$(A \uplus B)^{\perp} = A^{\perp} \uplus B^{\perp} \text{ and } (A \uplus B)^{\perp} = A^{\perp} \uplus B^{\perp}.\tag{2}$$

A complementation operation that satisfies DeMorgan's Laws is called an othrocomplementation operation.

• <sup>X</sup> satisfies the modular law, that is for all <sup>A</sup>, <sup>B</sup>, and <sup>C</sup> in <sup>X</sup> ,

$$\text{if } B \subseteq A \text{ then } B \Downarrow (A \oplus C) = (B \Downarrow A) \oplus (B \Downarrow C) \dots$$

For lattice algebras, the modular law is a generalization of the distributive law, B <sup>⋒</sup> (<sup>A</sup> <sup>⋓</sup> <sup>C</sup>) <sup>=</sup> (<sup>B</sup> <sup>⋒</sup> <sup>A</sup>) <sup>⋓</sup> (<sup>B</sup> <sup>⋒</sup> <sup>C</sup>). Thus, the von Birkhoff-von Neumann logic is a generalization of a boolean lattice algebra, that is, of an orthocomplemented lattice algebra satisfying the above distributive law. It applies to the lattice algebra of all subspaces of a finite dimensional Hilbert space. However, as Husimi [7] pointed out, the lattice algebra of closed subspaces of an infinite dimensional Hilbert space does not satisfy the modular law. He suggested replacing the modular law with the following consequence of it that he called the orthomodular law: For all A, B, C,

$$\text{if } B \subseteq A \text{ then } A = B \Downarrow \text{( $B^\perp$  \cap A)}\text{. }$$

Today, Husimi's suggestion won out and the term quantum logic applies to lattice event algebras with orthocomplementation satisfying the orthomodular law. In this article, lattice terminology is used instead, and such lattices are called orthomodular lattices.

In psychology, ideas derived from quantum mechanics have been implemented in various ways, from borrowing methods that assume some physics, to using only Hilbert space probability theory, to using only orthomodular lattices. All of these have foundational issues: Why should methods based on physical laws, e.g., methods based on the conservation of energy, apply to psychology? How does one derive the geometrical properties of Hilbert space used in quantum probability from psychological considerations? What does orthomodularity have to do with how experiments are designed and conducted? To my knowledge, the first two questions has not been adequately addressed in the literature. This article makes some progress on the third.

# 3.2. Counterfactuals in Behavioral Experiments

The behavioral modeling described in this section concerns a simplified experimental situation. It differs from a similar model presented in Narens [8] in that minor errors and ambiguities in the construction of that model are eliminated and the material is presented in a more clear manner. The version presented here is also more general.

The assumed simplified situation makes for easier mathematical modeling and philosophical analysis, which are the principal goals of this article. The cost for this is a loss of realism and a design that may require much larger numbers of subjects than is practical for usual psychological experimentation.

The experimental situation under consideration has a large population of subjects, where each is put into exactly one of a finite number of experiments. In psychology, this is called a between-subject paradigm.

Each experiment has a finite, nonempty set of choices—called outcomes—and each of an experiment's subjects must choose exactly one of the experiment's outcomes. Different experiments are assumed to have different outcomes. Thus, each outcome occurs only in one experiment.

To simplify the presentation, only a specific case involving two experiments is considered throughout most of this article. The definitions, concepts, and methods of proof developed for this specific case are formulated in manners so that they generalize to the case of finitely many experiments. Such a generalization is briefly discussed in Section 3.5.

The (experimental) paradigm (P) has two experiments, (A) and (B). Experiment (A) has a set of 3 outcomes, <sup>O</sup><sup>A</sup> = {a, <sup>b</sup>,c}, and experiment (B) has a set of 3 outcomes, <sup>O</sup><sup>B</sup> = {d, <sup>e</sup>, <sup>f</sup>}. (P), which spans (A) and (B), has the set six outcomes, <sup>O</sup> <sup>=</sup> {a, <sup>b</sup>,c, <sup>d</sup>, <sup>e</sup>, <sup>f</sup>}. The set of (P)'s subjects, <sup>S</sup>, is randomly divided in half, with one of the halves participating in (A) and the other in (B). In each experiment, the identity of each subject is recorded along with the outcome she chose. Thus, the number of subjects, N, and the number N<sup>x</sup> of subjects who chose outcome x, x = a, b,c, d, e, f , are known. This is the collected data of (P).

Paradigm (P) also has a theory that connects its experiments (A) and (B). This connection is described counterfactually, for example,

If subject s in experiment (A) chose an outcome in event E in the power-set ℘(OA) of O<sup>A</sup> were instead originally put in experiment (B), then she would have chosen an outcome in event F in the power-set ℘(OB) of OB.

Such counterfactuals exist only in theory, not in data: For a subject <sup>s</sup> who chose some outcome of <sup>E</sup> in <sup>℘</sup>(OA) and <sup>E</sup> 6= <sup>O</sup><sup>A</sup> in experiment (A), it is not possible to determine from (P)'s data alone whether or not s's choice would have been in F ∈ <sup>℘</sup>(OB), where in experiment (B), <sup>F</sup> 6= <sup>O</sup><sup>B</sup> and <sup>F</sup> 6= <sup>∅</sup>. Such a determination must be a consequence of the theory posited by paradigm (P).

Definition 3. Let s be a subject in paradigm (P) and o be an outcome in O. Then s is said to have actually chosen o if and only if o is an outcome in an experiment of (P), s is a subject in that experiment, and s chose o. s is said to have counterfactually chosen o if and only if


Let E be an event in ℘(O). Then s is said to have paradigmatically chosen E if and only if s actually chose some element of E or s counterfactually chose some element of E.

Theoretical assumptions of (P). The following three theoretical assumptions are made about (P):


(T1) is a general theoretical assumption that extends to paradigms having finitely many experiments. (T2) and (T3) are theoretical assumptions that are specific to properties of (P). Section 3.5 describes modified versions of them that apply more widely to paradigms having finitely many experiments.

By assumption, <sup>O</sup><sup>A</sup> <sup>∩</sup> <sup>O</sup><sup>B</sup> <sup>=</sup> <sup>∅</sup>. However, there are situations where outcomes in O<sup>A</sup> and O<sup>B</sup> are needed to be identified. This accomplished through the use of counterfactual statements. Assumption (T2) above is an example of this: For the purposes of analysis and drawing conclusions about (P), it counterfactually identifies c and d as being the same outcome.

The following notation and concepts are useful.

Definition 4. The following notation is used throughout this article.


The following definition provides a method for identifying events across experiments.

Definition 5. Throughout this article for each G <sup>⊆</sup> <sup>O</sup>, <sup>σ</sup>(G) denotes the event in <sup>℘</sup>(O) such that G <sup>⊆</sup> <sup>σ</sup>(G) and for each of (P)'s subjects p, if p has a paradigmatic choice in G then all of her paradigmatic choices are in σ(G). (From the latter, it follows that she has no paradigmatic choices in <sup>O</sup> <sup>−</sup> <sup>σ</sup>(G).) H is said to be a proposition if and only if for some K and H = σ(K). Such a H is also called the proposition associated with K. Note that for each K in ℘(O), the proposition associated with K, σ(K), exists.

**Notation** For each <sup>H</sup> <sup>∈</sup> <sup>℘</sup>(O), let <sup>&</sup>lt; <sup>−</sup><sup>H</sup> <sup>&</sup>gt; <sup>=</sup> <sup>&</sup>lt; <sup>O</sup> <sup>−</sup> <sup>H</sup> <sup>&</sup>gt; and <sup>−</sup> <sup>&</sup>lt; <sup>H</sup> <sup>&</sup>gt; <sup>=</sup> <sup>&</sup>lt; <sup>O</sup> <sup>&</sup>gt; <sup>−</sup> <sup>&</sup>lt; <sup>H</sup> <sup>&</sup>gt;.

The following lemma is a simple consequence of Definition 5.

**Lemma 1.** Let H be the proposition. Then < −H > = − < H >. **Proof**. Each subject p makes one unique actual choice. If this choice is in −H then p is in < −H > and therefore p must be in − < H >, i.e., < −H > ⊆ − < H >. If p is in − < H >, then her actual choice is in −H, i.e., − < H > ⊆ < −H >.

Definition 6. The following notation is used throughout this article:


Elements of P are described later in **Figure 1**.

It follows from P's theory and data that ∅ is a proposition and O is a proposition.

It will be shown that the proposition **a** is {a, e, f}.

{a} ⊆ {a, e, f}. Because each subject paradigmatically selects exactly one outcome in <sup>O</sup>A, it follows that <sup>b</sup> <sup>∈</sup>/ **<sup>a</sup>** and <sup>c</sup> <sup>∈</sup>/ **<sup>a</sup>**, and thus by assumption (T2), d ∈/ **a**. By assumption (T1) each subject who paradigmatically chooses some element of **a** must also paradigmatically choose some element in OB. This element cannot be d. Therefore, it is either e or f . If it is e, then another subject who paradigmatically chose a must have chosen f , for otherwise **a** ⊆ **e**, contradicting assumption (T3). Similarly if a subject who chose an element of **a** paradigmatically chose f , then another subject who paradigmatically chose and element of **a** must have paradigmatically chosen e. Thus, it has been shown that **a** = {a, e, f}. Similarly **b** = {b, e, f}. Note that

$$\mathbf{a} \cap \mathbf{b} = \{e, f\}\,\mathrm{s}$$

However{e, f}is not a proposition. Note that the proposition that is the ⊆-greatest-lower bound of **a** and **b** is the empty set, ∅.

Definition 7. Let **E** and **F** be, respectively, the propositions associated with E and F. Then the following definitions hold:


Note by Lemma 1 and the meanings of "< >" and "∪" that for all E and F in ℘(O),

$$<-E> = - < E> \text{ and }  \cup  =  \tag{3}$$

**Lemma 2.** Let **C**, **D**, and **E**, respectively be, respectively, propositions associated with C, D, and E. Then **C** ⊥ is a proposition, **C** = **C** ⊥⊥, and **D** ⊆ **E** iff **E** <sup>⊥</sup> ⊆ **D**⊥.

**Proof**. **C** <sup>⊥</sup> = σ(C) is a proposition by Definition 7. By Equation (3),

$$\begin{split} \mathbf{C}^{\mathsf{L}\mathsf{L}} &= \sigma \big[ \mathsf{<} - (\mathsf{C}^{\mathsf{L}}) > \big] = \sigma \big[ - < \mathsf{C}^{\mathsf{L}} > \big] = \sigma \big[ - < -\mathsf{C} > \big] \\ &= \sigma \big[ - < \mathsf{C} > \big] = \sigma \big[ < \mathsf{C} > \big] = \mathsf{C} . \end{split}$$

Because

$$D \subseteq E \quad \text{iff} \quad  \subseteq  $$

$$\text{iff} \quad \sigma() \subseteq \sigma(E) $$

$$\text{iff} \quad \mathbf{D} \subseteq \mathbf{E} $$

and

$$D \subseteq E \quad \text{iff} \ - < E > \subseteq \ - < D > \ \ \ \ \ \ \ \text{iff} \ \ \ \sigma(< -E >) \subseteq \ \sigma(-D) \\ \ \ \ \text{iff} \ \ \ \ \mathsf{E}^{\mathsf{L}} \subseteq \mathsf{D}^{\mathsf{L}},$$

it follows that **D** ⊆ **E** iff **E** <sup>⊥</sup> ⊆ **D**⊥.

**Lemma 3.** <sup>⋒</sup> is the <sup>⊆</sup>-least upper bound operation on <sup>P</sup>.

**Proof**. Let **F** and **G**, respectively, be propositions associated with F and G. Then, because

$$\cup = ,$$

it follows that

$$\mathbf{F} \uplus \mathbf{G} = \sigma \{ < F \cup G > \}$$

and therefore **F** ⋒ **G** is a proposition. Then

$$\mathcal{F} = \sigma() \subseteq \sigma( \cup )$$

and

$$\mathbf{G} = \sigma() \subseteq \sigma( \cup ),$$

making **F** ⋒ **G** an upper bound of **F** and **G**. Suppose **H** is the proposition associated with H and is such that **F** ⊆ **H** and **G** ⊆ **H**. Then

$$<\mathbf{F} > \subseteq <\mathbf{H} > \quad \text{and} \quad <\mathbf{G} > \subseteq <\mathbf{H} > \dots$$

Thus

$$ \cup  \subseteq ,$$

and therefore,

$$\mathbf{F} \Downarrow \mathbf{G} = \sigma(\lhd F > \sqcup \lhd G > ) \sqsubseteq \sigma(\lhd H > ) = \mathbf{H}, \rho$$

showing that **F** ⋒ **G** is the least upper bound of **F** and **G**.

**Lemma 4.** <sup>⋓</sup> is the <sup>⊆</sup>-greatest lower bound operation on <sup>P</sup>. **Proof**. Let **F** and **G**, respectively, be propositions associated with F and G. Then, because < **F** > ∩ < **G** > = < **F**∩**G** >, it follows that

$$\mathbf{F} \oplus \mathbf{G} = \sigma(<\mathbf{F} \cap \mathbf{G} >)$$

and therefore **F** ⋓ **G** is a proposition. Thus,

$$\mathcal{F} = \sigma(\lhd F \succ) \supseteq \sigma(\lhd F \succ \sqcap \lhd G \succ) = \mathcal{F} \circledr \mathcal{G}$$

and

$$\mathbf{G} = \sigma(\text{}) \supseteq \sigma(\text{} \cap \text{}) = \mathbf{F} \oplus \mathbf{G},$$

making **F**⋓**G** a lower bound of **F** and **G**. Suppose the proposition **H** associated with H is such that

$$<\mathbf{F} > \supseteq <\mathbf{H} > \quad \text{and} \quad <\mathbf{G} > \supseteq <\mathbf{H} > \dots$$

Then < F > ⊇ < H > and < G > ⊇ < H >, and thus

$$ \cap  \supseteq ,$$

and therefore,

$$\mathcal{F} \cap \mathcal{G} = \sigma(\prec F > \cap < G >) = \sigma(H) = \mathcal{H},$$

showing that **F** ⋓ **G** is the greatest lower bound of **F** and **G**.

**Lemma 5.** <sup>P</sup> = hP,⋒,⋓, <sup>⊥</sup>, <sup>O</sup>, <sup>∅</sup><sup>i</sup> is a complemented lattice event algebra.

**Proof**. <sup>O</sup> is clearly the <sup>⊆</sup>-largest element of <sup>P</sup> and <sup>∅</sup> is clearly the <sup>⊆</sup>-smallest element of <sup>P</sup>.

Because, by Lemmas 3 and Lemma 4, ⋒ and ⋓ are, respectively, the ⊆-least upper bound and ⊆-greatest lower bound operators on P, P is a lattice event algebra. The following shows that <sup>⊥</sup> is a complementation operation on P.

Let **E** and **F**, respectively, be the propositions associated with E and F. Then, by **E** <sup>⊥</sup> = σ(−E) and Equation (3),

$$\begin{aligned} \mathbf{E} \uplus \mathbf{E}^{\mathsf{L}} &= \sigma(\mathsf{<} E) \; \mathsf{U} \; \mathsf{<} \; \mathsf{<} - E \; \mathsf{>} \; \mathsf{>} \\ &= \sigma(\mathsf{<} E \; \mathsf{>} \; \mathsf{U} \; - \; \mathsf{<} E \; \mathsf{>}) = \sigma(\mathsf{<} \; \mathcal{O} \; \mathsf{>}) = \mathcal{O}, \; \mathsf{U} \end{aligned}$$

and

$$\begin{aligned} \mathbf{E} \oplus \mathbf{E}^{\mathsf{L}} &= \sigma(\mathsf{} \cap \mathsf{<} - \mathsf{E}) \rhd \\ &= \sigma(\mathsf{<} E \rhd \sqcap - \mathsf{<} E \rhd) = \sigma(\mathsf{<} \otimes \succ) = \mathsf{B}, \end{aligned}$$

**Lemma 6.** The complemented lattice event algebra P = <sup>h</sup>P,⋒,⋓, <sup>⊥</sup>, <sup>O</sup>, <sup>∅</sup><sup>i</sup> satisfies DeMorgan's Laws.

**Proof**. It is a well-known result of lattice theory (e.g., Theorem 2.14 of [2]) that DeMorgan's Laws for P are equivalent to the following: For all **E** and **F** in P,

$$\mathbf{E}^{\mathsf{L}\mathsf{L}} = \mathsf{E} \text{ and } \langle \mathsf{E} \subseteq \mathsf{F} \text{ iff } \mathsf{F}^{\mathsf{L}} \subseteq \mathsf{E}^{\mathsf{L}} \rangle. \tag{4}$$

Equation (4) follows from Lemma 2.

The above lemmas show that the description of the experimental situation gives rise to an orthocomplemented lattice. Aerts and Gabora [9] have a similar result for a different psychological paradigm: They show that their empirical data is representable as an orthocomplemented lattice that they imbed in a Hilbert space.

Theorem 3 given later shows that <sup>P</sup> = hP,⋒,⋓, <sup>⊥</sup>, <sup>O</sup>, <sup>∅</sup><sup>i</sup> also satisfies the Orthomodular Law. The proof, which generalizes to a wide class of paradigms with finitely many experiments, uses a probability function that is defined on the set of P's propositions. The probability theory for this function is developed in the following two sections.

# 3.3. Probability Theory for P

Definition 8. Throughout the rest of this article, let P be the following function on P: For each **E** in P,

$$\mathbb{P}(E) = \frac{||}{|<\mathcal{O}>|} \cdot$$

P is called (P)'s propositional probability function.

The following is the intended interpretation of P: For each proposition **E** in P, P(**E**) is the probability that a randomly chosen paradigm subject actually chose some outcome e in **E**. If the subjects in < E > are known through data and theory, then the value of P(E) completely computable from data.

Propositions **E** that span experiments are necessarily partially based on counterfactuals. Because of this, they are theoretical in nature. Nevertheless, as discussed at the end of Section 3, for paradigm (P), P's value for a proposition **F** is estimable from data to a good approximation. As discussed at the end of Section 3 this is not generally true of other paradigms. However, for the special case where a proposition comes from one of a paradigm's experiments it is generally true by an analog of the following argument given for (P).

Because of the large numbers of subjects participating in (P)'s experiments and the way they were randomly assigned in equal numbers to each of (P)'s experiments, it follows that for each E in ℘(OA),

$$|| \approx || ,$$

where ≈ stands for "approximately" and | < E <sup>⋆</sup> <sup>&</sup>gt; <sup>|</sup> for "the number of subjects in (B) that counterfactually chose some outcome in E." Thus, for E in ℘(OA),

$$\mathbb{P}(\mathbf{E}) = \frac{|<\mathbf{E}>|}{|<\mathcal{O}>|} = \frac{||+||}{|<\mathcal{O}>|} \approx \frac{2||}{|<\mathcal{O}>|},$$

which is computable since <sup>|</sup> <sup>&</sup>lt; <sup>E</sup> <sup>&</sup>gt; <sup>|</sup> and <sup>|</sup> <sup>&</sup>lt; <sup>O</sup> <sup>&</sup>gt; <sup>|</sup> are known from data.

Thus for each proposition **E** in ℘(OA) and similarly for each proposition **F** in ℘(OB), P(**E**) and P(**F**) are estimable to a good approximation from data. For <sup>G</sup> <sup>∈</sup> <sup>℘</sup>(O) where **<sup>G</sup>** spans experiments can be more complicated. For such spanning propositions, theoretical assumptions as well as data are needed to calculate P's probabilities. As discussed at the end of Section 3, this is possible for paradigm (P) but may not be possible for other paradigms where the the theory may not complete enough to estimate all spanning propositions.

### 3.4. Logical and Probabilistic Structure of Orthomodular Event Lattices

**Figure 1** is a Hasse diagram of the lattice <sup>P</sup> = hP,⋒,⋓, <sup>O</sup>, <sup>∅</sup>i. The set-theoretic boolean algebra generated by <sup>O</sup> has 2<sup>6</sup> <sup>=</sup> <sup>64</sup> elements. The elements at the bottom of **Figure 1** but above ∅ are called atoms. They are lattice elements **E** such that there there does not exist a lattice element **F** such that ∅ ⊂ **F** ⊂ **E**. **Figure 1** has 5 atoms, **a**, **b**, **k**, **e**, **f** . The set-theoretic boolean algebra generated by these atoms has 2<sup>5</sup> <sup>=</sup> 32 elements. <sup>P</sup> has 12 elements—a substantial reduction from 64 or 32.

In **Figure 1**, the lattice-theoretic intersection ⋓ of atoms, e.g., **a** ⋓ **f**, is the proposition ∅. This is a consequence of assumption (T3).

The following concepts are useful for the understanding of the structure of orthomodular lattices.

Definition 9. <sup>X</sup> = h<sup>X</sup> ,⋒,⋓, <sup>⊥</sup> , X, ∅i is said to be an ortholattice if and only if X is a complemented lattice event algebra satisfying DeMorgan's Laws.

Definition 10. Let <sup>X</sup> = h<sup>X</sup> ,⋒,⋓, <sup>⊥</sup> , X, ∅i be an ortholattice. Then the following definitions hold.


(higher nodes).

• for all C and D in <sup>X</sup> , if C <sup>⊥</sup> D then <sup>Q</sup>(<sup>C</sup> <sup>⋒</sup> <sup>D</sup>) <sup>=</sup> <sup>Q</sup>(C) <sup>+</sup> C(D).

Q is said to be ⊂-monotonic if and only if for all C and D in <sup>X</sup> , if C <sup>⊂</sup> D then <sup>Q</sup>(C) <sup>&</sup>lt; <sup>Q</sup>(D).


**Figure 2** shows a Hasse diagram of an O<sup>6</sup> subalgebra of X when X is a set of propositions.

**Lemma 7.** Suppose <sup>X</sup> = h<sup>X</sup> ,⋒,⋓, <sup>⊥</sup> , X, ∅i is an ortholattice that has no O<sup>6</sup> subalgebra. Then X is orthomodular.

**Proof**. Theorem 2 (pp. 22–23) of Kalmbach [10]. (Also Theorem 2.25 of [2].)

**Lemma 8.** P is a ⊂-monotonic orthoprobability function on P = <sup>h</sup>P,⋒,⋓, <sup>⊥</sup>, <sup>O</sup>, <sup>∅</sup>i.

**Proof**. Suppose **F** and **G** are arbitrary elements of P such that **G** ⊆ **F** ⊥. We first show

$$<\mathbf{F} \Downarrow \mathbf{G} > \quad \color{red}{\bf x} \texttt{F} > \color{red}{\bf x} \texttt{G} > \color{red}{\bf x} \texttt{G} > \color{red}{\bf x} \texttt{G} > \color{red}{\bf x} \texttt{G} > \color{red}{\bf x}$$

It is immediate that <sup>&</sup>lt; **<sup>F</sup>** <sup>&</sup>gt; <sup>⊆</sup> <sup>&</sup>lt; **<sup>F</sup>** <sup>⋒</sup> **<sup>G</sup>** <sup>&</sup>gt; and <sup>&</sup>lt; **<sup>G</sup>** <sup>&</sup>gt; <sup>⊆</sup> <sup>&</sup>lt; **F** ⋒ **G** >. Thus,

$$<\mathbf{F}>\downarrow\quad\text{Í}\quad\mathbf{G}>\subseteq\!<\mathbf{F}\,\Downarrow\;\mathbf{G}>\;.\tag{6}$$

Suppose s is in

$$<\mathbf{F} \uplus \mathbf{G}> = <\sigma(F \cup \mathbf{G})> = <\sigma(F) \cup \sigma(\mathbf{G})> \dots$$

Then s is in σ(F) or s is in σ(G). Without loss of generality, suppose s is in σ(F) = **F**. Then

$$<\mathbf{F} \uplus \mathbf{G} > \subseteq <\mathbf{F} > \cup <\mathbf{G} >,\tag{7}$$

and Equation (5) follows from Equations (6) and (7).

By the definitions of "proposition" and "⊥" and Equation (3), < **F** <sup>⊥</sup> > = < −**F** > = − < **F** >. Thus, < **F** > ∩ < **F** ⊥ > = ∅, and therefore, because **G** ⊆ **F** <sup>⊥</sup>, < **F** > ∩ < **G** > = ∅. Thus,

$$<\mathbf{F} \uplus \mathbf{G} > = <\mathbf{F} > \cup <\mathbf{G} > = <\mathbf{F} > +<\mathbf{G} > \dots$$

Therefore,

$$\begin{aligned} \mathbb{P}(\mathbf{F} \Downarrow \mathbf{G}) &= \frac{| < \mathbf{F} \Downarrow \mathbf{G} > |}{|\mathcal{O}|} = \frac{| < \mathbf{F} > |}{|\mathcal{O}|} + \frac{| < \mathbf{G} > |}{|\mathcal{O}|}, \\ &= \mathbb{P}(\mathbf{F}) + \mathbb{P}(\mathbf{G}), \end{aligned}$$

showing ortho-additivity.

To show monotonicity suppose **F** and **H** are arbitrary elements of <sup>P</sup> such that **<sup>F</sup>** <sup>⊂</sup> **<sup>H</sup>**. Then <sup>&</sup>lt; **<sup>F</sup>** <sup>&</sup>gt; <sup>⊂</sup> <sup>&</sup>lt; **<sup>H</sup>** <sup>&</sup>gt;. Then, by the definition of P, P(**F**) < P(**H**).

**Theorem 3.** <sup>P</sup> = hP,⋒,⋓, <sup>⊥</sup>, <sup>O</sup>, <sup>∅</sup><sup>i</sup> is an orthomodular lattice and P is an orthoprobability function on P.

**Proof**. By Lemma 8, P is a monotonic orthoprobability function on P. Suppose P is not an orthomodular lattice. A contradiction will be shown. Then by Lemma 7 there exists a sublattice of P that has a Hasse diagram of the form displayed in **Figure 2**. By the monotonicity of P,

$$\mathbb{P}(\mathbf{F}) < \mathbb{P}(\mathbf{G}),\tag{8}$$

and by the ortho-additivity of P,

$$\mathbb{P}(\mathbf{F}) + \mathbb{P}(\mathbf{G}^{\mathsf{L}}) = \mathbb{P}(\mathbf{F} \uplus \mathbf{G}^{\mathsf{L}}) = \mathbb{P}(X) = 1 = \mathbb{P}(\mathbf{F}) + \mathbb{P}(\mathbf{F}^{\mathsf{L}}) . \tag{9}$$

Equations (8) and (9) contradict one another, because, by the monotonicity of P, P(**G**⊥) < P(**F** ⊥) .

The literature has studied orthomodular lattices as generalizations of the logic underlying quantum mechanics. Unfortunately, not all orthomodular lattices admit orthoprobability functions [11]. This in itself is a clue that for science something more than general orthomodular lattices are needed. For P, the probability function P was derived directly from (P)'s theory and empirical considerations.

#### 3.5. Generalizations and Properties of Paradigm Probability Functions

Thus far, our analysis has focussed on the paradigm (P) and the probabilistic structure P. Although the analysis sometimes used special features of them, care was taken to present, when possible, concepts and methods of proof that generalized to a wider class experimental situations and a wider class of probabilistic structures. There are, however, some conditions special to P and P that do not apply to all between-subject paradigms involving finitely many experiments. These are concerned with the use of P's atoms.

The boolean algebra <sup>B</sup> = h℘(O),∪,∩, <sup>−</sup>, <sup>O</sup>, <sup>∅</sup><sup>i</sup> spans (P)'s experiments. Its set of atoms is <sup>O</sup> = {a, <sup>b</sup>,c, <sup>d</sup>, <sup>e</sup>, <sup>f</sup>}, which is the set of outcomes of (P)'s experiments. (P)'s data consist of records of the choices in O made by its subjects. In shifting the analysis from B to P, it is desirable to keep the data intact. (P)'s theoretical axioms and concepts does this by making the propositions **a**, **b**, **k**, **e**, and **f** the atoms of P. (**k** results from theoretical assumption (T2) that requires the identification of c and d.) This allows the collected data about O to be transferred to **a**, **b**, **k**, **e**, **f**. Concepts and theorems can exploit this transfer. For example, this transfer is needed to implement the important concept of "actually determined" in obtaining consequences of (P)'s theory and data.

Many of the previous results about P generalize to a paradigm (Q) involving finitely many experiments, (A1), . . . , (An), where (Q) has disjoint experimental outcomes, disjoint subject populations, and where for 1 ≤ i, j ≤ n, A<sup>i</sup> 's subject population is randomly sampled from (Q)'s subject pool and is the same size as A<sup>j</sup> 's subject population. In particular, with appropriate generalizations of (P)'s theory, Lemmas 2 to 8 and Theorem 3 generalize to (Q)'s lattice of propositions using the methods of proof similar to the those presented for P and P.

(Q)'s theory consists of a set of statements describing relationships among subjects' responses across experiments. Among these are statements that generalize (T1), (T2), and (T3) of (P)'s theory in the following manner:


"< >" and P have analogous definitions and results for (Q) to those for (P).

### 3.6. Comparison with Quantum Probability

Many researchers of the formal foundations of quantum mechanics have speculated that the underlying probability theory for quantum mechanics is not interpretable in a physically acceptable manner into a boolean probability theory (e.g., [1, 12–14]). Others have disagreed (e.g., [15]), producing a long-running controversy that continues to the present (e.g., 16).

Von Neumann was well aware of foundational difficulties presented in his seminal 1932 book, Mathematische Grundlagen der Quantenmechanik. It appears to me that such difficulties are sharply increased and compounded by the importation of formalisms involving probability from quantum mechanics to cope with the difficult contextual issues presented in the behavioral sciences.

Rédei [17] writes the following about the evolution of von Neumann's position about the nature of probability in quantum mechanics.

What von Neumann aimed at in his quest for quantum logic in the years 1935–1936 was establishing the quantum analog of the classical situation, where a Boolean algebra can be interpreted as being both the Tarski-Lindenbaum algebra of a classical propositional logic and the algebraic structure representing the random events of a classical probability theory, with probability being an additive normalized measure on the Boolean algebra satisfying [monotonicity], and where the probabilities can also be interpreted as relative frequencies. The problem is that there exist no "properly non-commutative" versions of this situation: The only (irreducible) examples of non-commutative probability spaces probabilities of which can be interpreted via relative frequencies are the modular lattices of the finite (factor) von Neumann algebras with the canonical trace; however, the non-commutativity of these examples is somewhat misleading because the noncommutativity is suppressed by the fact that the trace is exactly the functional that insensitive for the non-commutativity of the underlying algebra. So it seems that while one can have both a non-classical (quantum) logic and a mathematically impeccable non-commutative measure theory, the conceptual relation of these two structures cannot be the same as in the classical commutative case—as long as one views the measure as probability in the sense of relative frequency. This must have been the main reason why after 1936 von Neumann abandoned the relative frequency view of probability in favor of what can be called a "logical interpretation." In this interpretation, advocated by von Neumann explicitly in his address to the 1954 Amsterdam Conference, (quantum) logic determines the (quantum) probability, and vice versa, i.e., von Neumann sees logic and probability emerging simultaneously.

Von Neumann did not think, however, that this rather abstract idea had been worked out by him as fully as it should. Rather, he saw in the unified theory of logic, probability, and quantum mechanics a problem area that he thought should be further developed. He finishes his address to the Amsterdam Conference with these words [18]:

I think that it is quite important and will probably shade a great deal of new light on logics and probably alter the whole formal structure of logics considerably, if one succeeds in deriving this system from first principles, in other words for a suitable set of axioms. All the existing axiomatizations of this system are unsatisfactory in this sense, that they bing in quite arbitrarily algebraical laws which are not clearly related to anything that one believes to be true or that one has observed in quantum theory to be true. So, while one has very satisfactorily formalistic foundations of projective geometry of some infinite generalizations of it, including orthogonality, including angles, none of them are derived from intuitively plausible first principles in the manner in which axiomatizations in other areas are.

Now I think that at this point lies a very important complex of open problems, about which one does not know well of how to formulate them now, but which are likely to give logics and the whole dependent system of probability a new slam.

Von Neumann's concerns about probability theory in quantum mechanics do not hold for the multi-experiment behavioral paradigms presented in this article. The paradigms' orthomodular lattice event structures follows directly from their experimental designs and theories linking experiments. This produces an orthoprobability probability function Q for a paradigm's lattice of propositions, Q. Because for propositions, actual probabilities coincide paradigmatic probabilities, Q can be estimated through a relative-frequency process for events for which the underlying theory and collected data specify to a good approximation which subjects paradigmatically chose outcomes for those events. Paradigm (P) is an example where such a relative frequency approach applies to all of its events: Its event lattice has twelve elements. Of these, the probabilities of two, O and ∅, are determined by definition. Five others, **a**, **b**, **k**, **e**, **f**, are atoms and their actual probabilities are estimable by collected data and thus, as described earlier, their paradigmatic probabilities are estimable. The remaining five are complements of the five atoms and these have as probabilities 1 minus the probability of its atom, and thus they too are estimable. Now consider the general case Q where **F** and **G** are lattice disjoint propositions where it is known which subjects chose an element of **F** and which chose an element of **G**. If it is the case that **<sup>F</sup>**<sup>∪</sup> **<sup>G</sup>** <sup>⊂</sup> **<sup>F</sup>**<sup>⋒</sup> **<sup>G</sup>** then more information is required to estimate the number of subjects who are in **F**⋒ **G**. The additional information has to come from the paradigm's theory. For (P), its theory tells us that **<sup>a</sup>** <sup>⋒</sup> **<sup>f</sup>** = {a, <sup>b</sup>, <sup>e</sup>, <sup>f</sup>}, which is the complement of **<sup>k</sup>** and thus has number of subjects <sup>|</sup>O| − |**k**|. This number is known because <sup>|</sup>O<sup>|</sup> and <sup>|</sup>**k**<sup>|</sup> are known.

# 4. CONCLUSIONS

Both the topological probability and the quantum-like paradigm theories presented here are applicable to a variety of psychological experimental situations where Kolmogorov probability theory appears inadequate for modeling cognitive processes. Although very different in how they handle probabilities, they both can often offer explanations for puzzling behavioral phenomena. From a modeling point of view, this is not entirely surprising: After all, both are generalizations of Kolmogorov probability, and, as such, both have greater freedom to model behavioral data than the Kolmogorov theory. However, because of their algebraic structural differences, they are likely to suggest different cognitive mechanisms producing the data. Topological probability functions are arguably "rational" in the sense that they do not violate the key ideas of rationality inherent in the Dutch Book Argument and the SEU model.

The probability theory of quantum mechanics and the psychological paradigm probability theory developed here share many formal characteristics, but at a fundamental level they are about different kinds of uncertainty. The uncertainty in paradigm probability theory is manufactured by the random assignment of subjects to experiments by the scientist. It is not an inherent part of the subjects, outcomes, or of the paradigm's theory. The subjects in an experiment have actual and counterfactual choices. These choices, as well as the theory connecting the paradigm's experiments, are modeled in deterministic manners. All of this is very different than the probability theory of quantum mechanics, where the uncertainty results from the randomness inherent an ensemble of particles.

Systems satisfying the Kolmogorov axioms for probability produce a probability theory founded on a σ-additive boolean probability function. Such probability functions have come to dominate the probability theories of mathematics, statistics, and science. They are usually conceptualized as a single boolean probability function defined over all relevant situations. They are often interpreted as measuring the propensity of an event to occur or a subjective degree of belief that an event will occur. Such a propensity or degree are considered to be completely associated with the event, and, as a consequence, does not depend on the situation to which an event belongs. In this sense, propensity (or degree of belief) is noncontextual.

Kolmogorov probability theory can be generalized to become contextual by allowing events that belong to different situations to have different propensities (or different degrees of belief). These situations are characterized as having different probability functions. This causes various challenging issues in behavioral science, e.g., the identification of random variables across situations, or descriptions of the relationships of random variables across different probabilistic situations. Dzharafov and Kujala [19] and Dzharafov et al. [20] have laid out a foundation for such a generalization. It produces an alternative to the single probability function interpretation of the Kolmogorov theory that have many features in common with the probability theory underlying quantum mechanics. There are several other quantum-like probability theories in the literature that are not discussed in this article (e.g., the probability theory of [21]). It is beyond the scope of this article to go into their foundations or relationships to the alternative probability theories described in this article.

Context is an ill-understood concept in the behavioral sciences. While there are many psychological experiments illustrating its ubiquity and importance in psychological phenomena, e.g., framing effects in cognitive psychology, there is very little theory and experimentation describing the relationship of contexts across different experiments. I believe part of the reason for this has been the lack of mathematical theories designed to model contextual relationships. For particle physics, this kind of modeling was accomplished by von Neumann. His method has been imported by Busemeyer and colleagues and others into the behavioral science (e.g., [5]). This has produced some interesting new phenomena (e.g., [22]) and has been used as a unifying foundation for explaining many puzzling psychological phenomena. Not surprisingly, this importation has raised new, serious foundational and methodological issues.

Narens [2] interprets many results from lattice theory most known in the 1930s—as suggesting there are not many alternatives to boolean algebras that are useful event spaces for modeling probabilistic experimental phenomena, except for those that are distributive (e.g., topological algebras) or orthomodular (e.g., closed subspaces of a Hilbert space). This means that rich mathematical theories of probabilistic context are likely very limited without giving up much more structure from Kolmogorov probability theory, particularly, without greatly reducing the parts of the event space displaying forms of "probabilistic additivity."

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

# REFERENCES


# FUNDING

The research for this article was supported by grant SMA-1416907 from NSF.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphy. 2017.00004/full#supplementary-material


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Narens. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Real and the Mathematical in Quantum Modeling: From Principles to Models and from Models to Principles

#### Arkady Plotnitsky \*

*Theory and Cultural Studies Program, Purdue University, West Lafayette, IN, United States*

The history of mathematical modeling outside physics has been dominated by the use of classical mathematical models, C-models, primarily those of a probabilistic or statistical nature. More recently, however, quantum mathematical models, Q-models, based in the mathematical formalism of quantum theory have become more prominent in psychology, economics, and decision science. The use of Q-models in these fields remains controversial, in part because it is not entirely clear whether Q-models are necessary for dealing with the phenomena in question or whether C-models would still suffice. My aim, however, is not to assess the necessity of Q-models in these fields, but instead to reflect on what the possible applicability of Q-models may tell us about the corresponding phenomena there, vis-à-vis quantum phenomena in physics. In order to do so, I shall first discuss the key reasons for the use of Q-models in physics. In particular, I shall examine the fundamental principles that led to the development of quantum mechanics. Then I shall consider a possible role of similar principles in using Q-models outside physics. Psychology, economics, and decision science borrow already available Q-models from quantum theory, rather than derive them from their own internal principles, while quantum mechanics was derived from such principles, because there was no readily available mathematical model to handle quantum phenomena, although the mathematics ultimately used in quantum did in fact exist then. I shall argue, however, that the principle perspective on mathematical modeling outside physics might help us to understand better the role of Q-models in these fields and possibly to envision new models, conceptually analogous to but mathematically different from those of quantum theory, that may be helpful or even necessary there or in physics itself. I shall, in closing, suggest one possible type of such models, singularized probabilistic models, SP-models, some of which are time-dependent, TDSP-models. The necessity of using such models may change the nature of mathematical modeling in science and, thus, the nature of science, as it happened in the case of Q-models, which not only led to a revolutionary transformation of physics but also opened new possibilities for scientific thinking and mathematical modeling beyond physics.

Keywords: principles, models, probability, statistics, reality, realism

#### Edited by:

*Emmanuel E. Haven, University of Leicester, United Kingdom*

#### Reviewed by:

*Marco G. Mazza, Max Planck Institute for Dynamics and Self Organization (MPG), Germany Gregg Jaeger, Boston University, United States*

> \*Correspondence: *Arkady Plotnitsky plotnits@purdue.edu*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *16 November 2016* Accepted: *26 May 2017* Published: *19 June 2017*

#### Citation:

*Plotnitsky A (2017) The Real and the Mathematical in Quantum Modeling: From Principles to Models and from Models to Principles. Front. Phys. 5:19. doi: 10.3389/fphy.2017.00019*

# INTRODUCTION

The history of mathematical modeling outside physics has been dominated by classical mathematical models, C-models, based on mathematical models developed in classical physics, especially probabilistic or statistical models, borrowed from classical statistical physics or chaos and complexity theories. More recently, however, models based in the mathematical formalism of quantum theory, Q-models, primarily borrowed from quantum mechanics but occasionally also quantum field theory, became more current outside physics, specifically in psychology, economics, and decision science, the fields (beyond physics) with which I will be primarily concerned here [e.g., 1, 2] 1 . My abbreviations follows P. Dirac's distinction between c-numbers (classical numbers) and q-numbers (quantum numbers), because the variables used in Q-models are in fact q-numbers. Quantum mechanics and Q-models are based in the mathematics of Hilbert spaces over complex numbers, **C**, with Hilbert-space operators used as physical variables in the equations of quantum mechanics, as against functions of real (mathematical) variables, c-numbers, that serve as physical variables in classical physics. The use of Q-models in these fields remains controversial, because it is not entirely clear whether they are necessary for dealing with the phenomena in question or whether Cmodels would suffice. It is true that debates and sometimes controversies have also accompanied quantum mechanics since its birth in 1925. These debates, initiated by the famous confrontation between N. Bohr and A. Einstein on, in Bohr's phrase, "epistemological problems in atomic physics," used in the title of his account of this confrontation, have never lost their intensity and appear to be interminable [3, v. 2, pp. 32–66]. However, as Bohr's phrase indicates, the reasons for these controversies have been primarily philosophical. The effectiveness of quantum mechanics or higher-level quantum theories, such as quantum field theory, has not been in question: they are among the best-confirmed theories in physics. The situation is different in psychology, economics, and decision science, where it is the scientific effectiveness or at least necessity of Q-models that is doubted. My aim here, however, is not to assess this effectiveness or necessity, but instead to reflect on what the possible applicability of Q-models may tell us about the corresponding phenomena in these fields vis-à-vis quantum phenomena in physics. In order to do so, I shall first consider the key reasons for the use of Q-models in physics. In particular, I shall examine the fundamental principles that grounded and indeed led to the development of quantum theory. Then I shall consider a possible role of similar principles in using Q-models beyond quantum theory. My emphases are due to the fact that psychology, economics, and decision science borrow already available Q-models from quantum theory, rather than derive them from their own fundamental principles, while quantum mechanics and then quantum field theory were derived from such principles. This is not surprising because there was at the time no available mathematical model or (a more general concept, which includes an interpretation of the model used) theory to effectively handle quantum phenomena. The "old quantum theory" of M. Planck, A. Einstein, N. Bohr, and A. Sommerfeld, which ushered in the quantum revolution, became manifestly inadequate by the time W. Heisenberg began his work on quantum mechanics that he discovered in 1925 [4]. For the reasons explained below (mostly a search for a more rigorous derivation of the formalism), the research in quantum foundations is still concerned with deriving quantum theory from such principles, a project in part motivated by the rise of quantum information theory. That does not appear to be a significant concern outside physics where the use of Q-models is motivated primarily by their predictive capacities, which is of course a crucial consideration in physics as well. It may, however, be beneficial to consider the deeper reasons for the possible use of Q-models in these fields, or, in terms of my title, the real that gives rise to the mathematical of Q-models there. The principle perspective on mathematical modeling beyond physics might help us to do this and possibly to envision new, post-quantum, models there or even in physics. I shall, in closing, suggest one possible type of such models, singularized probabilistic models, SP-models, some of which are time-dependent, TDSP-models, and consider their implications for mathematical modeling in science and for our understanding of the nature of science<sup>2</sup> .

# PHYSICAL PRINCIPLES AND MATHEMATICAL MODELS IN QUANTUM MECHANICS

# Theories, Principles, and Models in Fundamental Physics

I would like to begin by outlining the key features of the standard mathematical model of quantum mechanics, more customarily used as a probabilistically or statistically predictive model in view of the difficulties of in maintaining its representational capacities, which continue to be debated:


<sup>1</sup> I shall only discuss the standard quantum mechanics or quantum field theory, bypassing alternative theories of quantum phenomena, such as Bohmian theories, which are sometimes used in mathematical modeling outside physics, but which would require a separate consideration. By "quantum phenomena" I refer to those physical phenomena in considering which Planck's constant, h, must be taken into account, and by "quantum objects" (thus different from quantum phenomena) to those entities in nature that are responsible for the appearance of quantum phenomena, manifested in measuring instruments involved in quantum experiments or in certain natural phenomena.

<sup>2</sup>The discussion to follow in part builds on two previous articles [5, 6], but only in part: overall the present argument is different, especially (but not exclusively) by virtue of considering SP-models.

associated, in terms of probabilistic or statistical predictions, with physically observable quantities;


In the development of quantum mechanics, discovered in 1925, these features were not initially assumed, but were derived from certain physical features of quantum phenomena and principles arising from these features. The formalism was only given a properly Hilbert-space form by J. von Neumann, in 1932, in The Mathematical Foundations of Quantum Mechanics, a standard text ever since [7] 4 .

I shall now explain the concepts of theory, principle, and model, as they will be understood here. By a theory, I mean an organized assemblage of concepts, explanations, principles, and models by means of which one is able to relate, in one way or another, to the phenomena or (they are not always the same) objects the theory considers. In defining principles, I follow Einstein's distinction between "constructive" and "principle" theories, two contrasting, although in practice often intermixed, types of theories [8, 9, pp. 35–50]. "Constructive theories" aim "to build up a picture of the more complex phenomena out of the materials of a relatively simple formal scheme from which they start out" [8, p. 228]. Thus, according to Einstein, the kinetic theory of gases, as a constructive theory in classical physics, "seeks to reduce mechanical, thermal, and diffusional processes to movements of molecules—i.e., to build them up out of the hypothesis of molecular motion," described by the laws of classical mechanics [8, p. 228]. By contrast, principle theories "employ the analytic, not the synthetic, method. The elements which form their basis and starting point are not hypothetically constructed but empirically discovered ones, general characteristics of natural processes, principles that give rise to mathematically formulated criteria which the separate processes or the theoretical representations of them have to satisfy" [8, p. 228]. Thus, thermodynamics, a classical principle theory (parallel to the kinetic theory of gases as a constructive theory), "seeks by analytical means to deduce necessary conditions, which separate events have to satisfy, from the universally experienced fact that perpetual motion is impossible" [8, p. 228].

Principles, then, are "empirically discovered, general characteristics of natural processes, ... that give rise to mathematically formulated criteria which the separate processes or the theoretical representations of them have to satisfy." I shall adopt this definition, but with the following qualification, which is likely to have been accepted by Einstein. Principles are not empirically discovered but formulated, constructed, on the basis of empirically established evidence. "The impossibility of perpetual motion" is hardly empirically given; it is as a principle formulated on the basis of such evidence.

Constructive theories are, more or less by definition, realist theories, and conversely, many realist theories are constructive. Realist theories represent, commonly causally, the phenomena or objects they consider and their behavior, in science by mathematical models, assumed to idealize how nature or reality works, in the case of constructive theories at the simpler, or deeper, level of reality constructed by a theory. In other words, a constructive theory offer a representation of the processes underlying and connecting the observable phenomena considered, commonly by understanding the ultimate character of these processes on the model of classical mechanics or classical electrodynamics, as in the kinetic theory of gases, as described above or other forms of classical statistical physics. All such theories assume that the individual behavior of the ultimate constituents of the systems they consider is described by the laws of classical mechanics. A realist theory may represent objects or phenomena it considers in a more direct, if still idealized, manner, as classical mechanics (which deals with individual or sufficiently small systems) or classical electrodynamics do. I shall discuss the concepts of reality and realism, which encompasses that of realist theory, in more detail below. First, however, I shall define a mathematical model.

By a "mathematical model" I refer to a mathematical structure or set of mathematical structures that enables any type of relation to the (observed) phenomena or objects considered. (As I shall only deal with mathematical models here, the term "model" hereafter refers to mathematical models.) All modern, post-Galilean, physical theories are defined by their uses of such models. The requirement of using mathematical models may be seen as a principle, the mathematization principle, "the M principle," arguably the single defining principle of all modern physics, from Galileo on. Such models may be realist, representational, as in classical physics, specifically classical mechanics, or predictive, as in classical statistical physics (the models of which are, however, underlain by representational models of classical mechanics), or in quantum mechanics, without assuming realism and causality even in considering elementary individual quantum processes, such as those concerning elementary quantum objects, "elementary particles." This assumption is expressly abandoned or even precluded in non-realist interpretations of quantum phenomena and quantum mechanics, following Bohr and "the spirit of

<sup>3</sup> I bypass more technical definitions, found in standard texts and reference sources. <sup>4</sup>There are alternative formalisms, such as those in terms of C<sup>∗</sup> -algebras or more recently category theory, thus far, all mathematically equivalent to the Hilbertspace formalism.

Plotnitsky The Real and the Mathematical

Copenhagen," as Heisenberg called it [10, p. iv]<sup>5</sup> . The M principle is upheld in quantum mechanics, but, in non-realist interpretations, in a way different from how it is used in realist theories.

The probabilistic or statistical character of quantum predictions must also be maintained by realist interpretations of these theories or alternative theories (such as Bohmian theories) of quantum phenomena, in conformity with quantum experiments, in which only probabilistic or statistical predictions are possible. The reasons for this is that the repetition of identically prepared quantum experiments in general leads to different outcomes, a difference that cannot be improved beyond a certain limit (defined by Planck's constant, h) by improving the conditions of measurement, which is possible in classical physics. This fact is also manifested in Heisenberg's uncertainty relations, which are statistical in character as well. This situation leads to the quantum probability or (depending on interpretation) quantum statistics principle, the QP/QS principle, arguably the single defining principle in Q-models in physics and beyond, keeping in mind that in psychology, economics, and decision science, we do not have anything corresponding to elementary individual physical processes, involving the ultimate elementary constituents of nature, "elementary particles." Nor do we have anything analogous to h. The probabilities themselves necessary for making correct predictions, in either quantum mechanics or in using Q-models elsewhere, are, thus far, calculated by using the Hilbert-space or mathematically equivalent formalisms and the (non-additive) procedure described above that uses quantum amplitudes and Born's or a similar rule<sup>6</sup> .

Realist models are, then, representational models, idealizing the nature of objects or phenomena they consider. The term "realism" will be primarily understood here as referring to the possibility, at least, again, in principle, of such models, and, in the first place, theories allowing for such models. One could define another type of realism, which would refer to theories that presuppose an independent architecture of reality they consider, while allowing that this architecture cannot be represented, either at a given moment in history or perhaps ever, but if so, only due to practical human limitations [9, pp. 11–23]. In the first case, a theory that is strictly predictive may be accepted, but with the hope that a future theory will do better, by being a realist theory of the representational type. Einstein adopted this attitude toward quantum mechanics, which he expected to be eventually replaced by a (representational) realist theory. Even in the second case, the ultimate nature of reality is commonly deemed to be conceivable on realist models of classical physics, possibly adjusting them to accommodate new phenomena. However, this type of realism implies that there is no representational theory or model of the ultimate nature of the phenomena or objects considered. Either type of realism is abandoned or even precluded in quantum mechanics, when interpreted in the spirit of Copenhagen. However, such interpretations do assume the concept of reality, by which I refer to what exists or is assumed to exist, without making any claim upon the character of this existence, which type of claims defines realist theories. By existence I refer to a capacity to have effects on the world, ultimately, which also assume the existence of the world by virtue of its capacity to have effects upon itself, effects which establish by means of and thus in terms as effects of our interactions with the world. In physics, the primary reality considered is that of nature or matter. It is generally assumed to exist independently of our interaction with it, which also assumes that it has existed when we did not exist and will continue to exist when we will no longer exist. This assumption is also made in non-realist interpretations of quantum mechanics, in the absence of a representation or even (as against the second, non-representational type of realism defined above) conception of the character of this existence. Thus, if realism presupposes a representation or at least a conception of reality, this concept of reality is that of "reality without realism" [9, 11]. The assumption of this concept of reality is a principle, the RWR principle. The existence or reality of quantum objects, a form of reality beyond representation or even conception, is inferred from effects they have on our world, specifically on experimental technology. It has not been possible, at least thus far, to observe a moving electron or photon, or for that matter even stationary electrons (there are no stationary photons, which only exist in motion before they are absorbed by other forms of matter, such as electrons). It is only possible to observe traces of their interactions with measuring instruments, traces that do not allow us to reconstitute the independent behavior of quantum objects movement, an impossibility reflected in Heisenberg's uncertainty relations. In non-realist, RWR-principle-based, interpretations, quantum mechanics only predicts, in probabilistic or statistical terms (no other predictions are, again, possible on experimental grounds), effects manifested in measuring instruments impacted by quantum objects.

While a principle theory, which, as I explained, need not be constructive in Einstein's sense, could be either realist or nonrealist, a constructive theory is by definition realist. Realist or, it follows, constructive theories do involve principles, such as the equivalence principle in general relativity, or the principle of causality, which, to adopt Kant's definition, commonly used ever since, states that, if an event takes place, it has a cause of which it is an effect [12, p. 305, 308]<sup>7</sup> . Asymmetrically, however,

<sup>5</sup>The designation "the spirit of Copenhagen" is preferable to a more common "the Copenhagen interpretation," because there is no single Copenhagen interpretation. <sup>6</sup>That does not mean that an alternative way of doing so, for example, by bypassing amplitudes or by using some an alternative formalism (not mathematically equivalent to the standard one) is impossible.

<sup>7</sup>Causality is, thus, an ontological category, characterizing the nature of reality. It proceeds by connecting a cause (an event, phenomenon, a state of a system, or force) to an effect, while the principle of causality connects an event to a cause. Determinism is assumed here to be an epistemological category. It designates our ability to predict the state of a system (ideally) exactly at any moment of time once we know its state at a given moment of time. In classical mechanics (which deals with a small number of objects), causality and determinism coincide. Once a classical system is large, one can no longer predict its causal behavior exactly. In other words, a system may be causal without our theory of its behavior being deterministic, as is the case, for example, in classical statistical physics or chaos theory. Causal influences are generally, although not always, assumed to propagate from past or present towards future. Relativity theory further precludes the propagation of physical influences faster than the speed of light in a vacuum, c. Principle theories do not require causality, which is, again, difficult to assume

a principle theory need not involve constructive aspects or be realist. In non-realist, RWR-principle-based, interpretations, quantum mechanics is a principle theory by definition, by virtue of the RWR principle. It is not possible, in such interpretations, to have a constructive theorization of the ultimate entities, quantum objects, which are responsible for the observable quantum phenomena, unless one sees quantum objects as constructed as in principle unconstructible. According to Bohr, thus formulating the RWR principle, "in quantum mechanics we are not dealing with an arbitrary renunciation of a more detailed analysis of atomic phenomena, but with a recognition that such an analysis is in principle excluded," beyond a certain point [3, v. 2, p. 62]. In this interpretation, quantum mechanics divorces itself from the representation of the connections between observed quantum phenomena, which it only relates in terms of predictions, in general probabilistic or statistical in character, thus fulfilling the M principle under the conditions of the RWR principle.

Finally, the present view does not assume a permanent, Platonist, essence to any given principle, which can always be abandoned under the pressure of new experimental findings or new ways of theorizing previously available experimental findings. Indeed, one might argue that the greatest form of creative thinking in science or other theoretical fields is that which lead to the invention of new principles, which implies the transformation of principles, rather than any Platonist permanence to them.

# The Physical Principles of the Quantum Theory

The RWR principle and the corresponding interpretation of quantum mechanics emerged only in the 1930s. Heisenberg's discovery of quantum mechanics in 1925 and Bohr's initial interpretation of it, proposed in 1927, were based on the following principles, with Bohr's complementarity principle added in 1927:


classical limit, but was given by Heisenberg a new and more rigorous form of "the mathematical correspondence principle," which required that the equations of quantum mechanics convert into those of classical mechanics in the classical limit, thus, in accordance with the M principle.

I speak of the proto-RWR principle because Heisenberg saw the project of describing the motion of electrons as unachievable at the time, rather than "in principle excluded," as Bohr assumed a decade later [3, v. 2, p. 62]. This was, nevertheless, a radical move on Heisenberg's part, as Bohr was the first to realize: "In contrast to ordinary [classical] mechanics, the new quantum mechanics does not deal with a space–time description of the motion of atomic particles. It operates with manifolds of quantities [matrices] which replace the harmonic oscillating components of the motion and symbolize the possibilities of transitions between stationary states in conformity with the correspondence principle. These quantities satisfy certain relations which take the place of the mechanical equations of motion and the quantization rules [of the old quantum theory]" [3, v. 1, p. 48].

Quantum discreteness was eventually (as part of Bohr's ultimate interpretation) recast by Bohr in terms of his concept of "phenomenon," defined in terms of what is observed in measuring instruments under the impact of quantum objects, in contradistinction to quantum objects themselves, which cannot be observed or represented [3, v. 2, p. 64]. Quantum phenomena are, in Bohr's interpretation, irreducibly discrete in relation to each other, and there is no continuous or any other conceivable process that could be assumed to connect them. Probability has a temporal structure by virtue of its futural and discrete nature: one can only verifiably estimate future discrete events. Such events may, however, be continuously and causally connected, as they are in classical physics, even though we may not be able to track these connections to make exact predictions, as happens in classical statistical mechanics or chaos theory. By contrast, in non-realist, RWR-principle-based, interpretations, the nature of quantum phenomena and events precludes us from causally (or otherwise) connecting them. This means that only probabilistic or statistical predictions are possible, even ideally and in principle, and even in dealing with elementary individual quantum objects, such as those known as "elementary particles," and the processes and events they lead to, objects and processes that cannot be decomposed into a smaller objects and processes. This qualification distinguishes quantum mechanics from classical probabilistic or statistical theories, or of course classical mechanics where such predictions could, at least ideally, be exact in dealing with individual classical objects or a small number of classical objects. In quantum mechanics, in non-realist interpretations, this type of idealization is not possible, a fact reflected in the uncertainty relations. The theory only estimates the probabilities or statistics of the outcomes of discrete future events, on the basis of previous events, and tells us nothing about what happens between events. Nor does it describe the data observed in measuring instruments and hence quantum phenomena. They are described by classical physics, which, however, cannot predict them.

in quantum physics without, however, violating relativity or more generally the principle of locality, which requires that all physical influences are local (still under the assumption that they cannot, locally, propagate faster than c).

The QP/QS principle was mathematically expressed in Heisenberg's scheme by matrices containing the necessary probability amplitudes cum Born's rule. Heisenberg only formulated this rule in the case of electrons' quantum jumps in the hydrogen atom, rather than as universally applicable in quantum mechanics, as Born did. Born's rule is not inherent in the formalism but is added to it—it is postulated.

The correspondence principle was central to Heisenberg's derivation of quantum mechanics. In its mathematical form, introduced by Heisenberg, the principle required that both the equations of quantum mechanics, which were formally those of classical mechanics, and the variables used, which were different, convert into those of classical mechanics in the classical limit, a conversion automatic in the case of equations but not variables. (The processes themselves, however, are still quantum even in this limit.) Thus, the principle gave Heisenberg a half of the mathematical architecture he needed.

An important qualification is in order. Heisenberg's derivation of quantum mechanics from principles cannot be considered a strictly rigorous derivation, especially in a mathematical sense. As he noted in The Physical Principles of the Quantum Theory (from which title I borrow my title of this section): "The deduction of the fundamental equation of quantum mechanics is not a deduction in the mathematical sense of the word, since the equations to be obtained form themselves the postulates of the theory. Although made highly plausible, their ultimate justification lies in the agreement of their predictions with the experiment" [10, p. 108]. While Heisenberg, again, borrowed the form of equations themselves from classical mechanics by the mathematical correspondence principle, he virtually guessed the variables he needed—one of the most extraordinary guesses in the history of physics. A more rigorous derivation of quantum mechanics from fundamental principles may, thus, be pursued. More recent work in this direction has been in quantum information theory in the case of discrete quantum variables, such as spin, which require finite-dimensional Hilbert spaces, as opposed to infinite-dimensional ones for continuous variables, such as position and momentum (e.g., 13–15) 8 . I shall comment on this work below.

Bohr's interpretation of quantum phenomena and quantum mechanics added a new principle, the complementarity principle. It arises from Bohr's concept of complementarity and may be defined as requiring: "(a) a mutual exclusivity of certain phenomena, entities, or conceptions; and yet (b) the possibility of considering each one of them separately at any given point, and (c) the necessity of considering all of them at different moments for a comprehensive account of the totality of phenomena that one must consider in quantum physics" [9, p. 70].

In Bohr's ultimate interpretation, this concept applies strictly to what is observed in measuring instruments, quantum phenomena, and not to quantum objects, placed beyond representation or even conception. Complementarity is a reflection of the fact that, in a radical departure from classical physics or relativity, the behavior of quantum objects of the same type, say, electrons, is not governed by the same physical law, especially a representational physical law, in all possible contexts, specifically in complementary contexts. In other words, the behavior of quantum objects has mutually incompatible effects in complementary set-ups, although this mutual incompatibility is, generally, manifested collectively, in multiple identically prepared experiments. On the other hand, the mathematical formalism of quantum mechanics offers correct probabilistic or statistical predictions of quantum phenomena in all contexts, in non-realist interpretations, under the assumption, that quantum objects and processes are beyond representation or even conception, by the RWR principle.

In some non-realist interpretations, such as the one the present author would favor, following W. Pauli, individual quantum events are not subject even to the probabilistic laws of quantum mechanics. This makes these laws collective, statistical [9, pp. 173–186; 11]. The QP/QS principle, accordingly, becomes strictly the QS principle. According to Pauli:

As this indeterminacy is an unavoidable element of every initial state of a system that is at all possible according to the [quantummechanical] laws of nature, the development of the system can never be determined as was the case in classical mechanics. The theory predicts only the statistics of the results of an experiment, when it is repeated under a given condition. Like the ultimate fact without any cause, the individual outcome of a measurement is, however, in general not comprehended by laws. This must necessarily be the case, if quantum or wave mechanics is interpreted as a rational generalization of classical physics, which take into account the finiteness of the quantum of action [h]. The probabilities occurring in the new laws have then to be considered to be primary, which means not deducible from deterministic laws. [19, p. 32]

Thus, in Pauli or the present view, this "beyond the law" includes the probabilistic or, in this view, statistical laws of quantum mechanics, laws that, thus, only apply to statistical multiplicities of repeated quantum events. Individual quantum events are not subject to laws, even to the probabilistic or statistical laws of quantum mechanics. Their outcomes cannot, in general, be assigned a probability: they are strictly random<sup>9</sup> . Only the statistics of multiple (identically prepared) experiments could be predicted and repeated, which repeatability appears to have been, thus far, necessary for scientific practice. Whether, however, one interprets quantum mechanics on such statistical lines or on the Bayesian lines, by assigning probability to individual events, we are compelled to rethink the concept of physical law as unavoidably contextual. This is "an entirely new situation as regards the description of physical phenomena that, the notion of complementarity aims at characterizing" [20, p. 700].

There are other important features of quantum phenomena, mathematically expressed in the quantum-mechanical formalism, in particular, the so-called "quantum non-locality," which refers to the existence of the statistical correlations

<sup>8</sup>Among the key earlier approaches are [16], Fuchs's work, which "mutated" to the program of quantum Bayesianism or QBism [17], and [18].

<sup>9</sup>Randomness may be defined by this impossibility. This concept of randomness is not ontological, because one cannot ascertain the reality of this randomness, but epistemological. It is ultimately a matter of assumption or belief, practically justified in a given interpretation.

between spatially separated quantum events, and "quantum entanglement," which reflects these correlations in the formalism. These features were discovered later and played no role in the initial derivation of quantum mechanics by either Heisenberg or Schrödinger. They do figure significantly in quantum information theory and recent attempts, mentioned above, to derive quantum mechanics from the principles of quantum information. Their analysis would require a treatment beyond my scope10. A few key points may, however, be mentioned. First, while quantum entanglement is a clearly defined feature of the formalism, the situation is different in the case of quantum non-locality. Although originating in the experimentally well-confirmed fact that certain spatially separated quantum phenomena or events exhibit statistical correlations (not found in classical physics), quantum non-locality is a complex and much debated issue. The problematic was in effect introduced in 1935 in the famous article by Einstein et al. [22]. I qualify because neither EPR's article nor Bohr's equally famous reply to it [20] used the language of correlations or entanglement. The latter term was introduced, in both German [Verschränkung] and English, by Schrödinger in his response to EPR's article, known as "the cat-paradox paper," after the paradox found there [23]. The subject remained dormant until the 1960s, when it was rekindled by the Bell and Kochen-Specker theorems, even to the point of nearly defining the current debate concerning quantum foundations. The theoretical and experimental research on the subject during the last decades has been massive and literature concerning it is immense. The term "non-locality" is not uniformly used in referring to quantum correlations, because it may suggest some sort of instantaneous physical connections between distant events, a "spooky action at a distance," as Einstein called it. Such connections are incompatible with relativity, although the principle of locality, which prohibits such connections, is independent of relativity. This type of physical non-locality, which is found, for example, in Bohmian mechanics, is commonly viewed as undesirable. The absence of realism allows one to avoid physical non-locality, as Bohr argued in his reply to EPR's article, which contended that quantum mechanics is either incomplete or physically nonlocal [20, 22].

# FROM MODELS TO PRINCIPLES IN Q-MODELING OUTSIDE PHYSICS

# Q-Models, Fundamental Principles, and Reality without Realism Outside Physics

In addressing Q-models in physics in preceding discussion, my main question, arising from the history of quantum theory, was: Given certain fundamental physical principles, established on the basis experimental evidence, in particular the QD and QP/QS principles, and perhaps adopting additional principles, such as the correspondence principle or the RWR (or proto-RWR) principle, what are the mathematical models that would enable us to handle this evidence? In turning now to the Qmodels beyond physics, my main question is reverse: Assuming that mathematical Q-models apply in psychology, economics, and decision science, which features and which fundamental principles are behind such models, and how they accord with the fundamental principles of quantum mechanics? There are two sets of principles I have in mind. The first contains the principles that led to the emergence of quantum mechanics; and the second the principles of quantum information theory, which are, however, in accord with most principles of the first set. I shall be primarily concerned with this first set (apart from the correspondence principle, unique to quantum theory), but will also comment on the second<sup>11</sup> .

But why is this question important in the first place? As noted from the outset, if there are phenomena outside physics that appear to require Q-models, one need, unlike at the time of the introduction of quantum mechanics, not invent such models at this point. One can borrow them, "ready-made," from quantum theory, which is what happed in the case of Q-modeling outside physics. Nevertheless, establishing, now inferentially, fundamental principles behind Q-models might allow us to make important conclusions about the nature of the phenomena handled by these models. To put it in stronger terms, finding the fundamental principles behind a given model, even if this model is already available, is important because otherwise we don't have a rigorous theory or a rigorous model, which is true even if a constructive theory is available, but is all the more important if it is not. Otherwise, we don't really know what our models are models of, especially, again, in the absence of a constructive theory and realism, which absence is likely if Qmodels apply and is my main interest here. These considerations are also relevant in pursuing projects of more rigorous derivation of quantum mechanics from principles in physics, for example on lines of quantum information theory, even though the theory itself is already established. Part of the reason is, again, that doing so can give us a deeper understanding of quantum phenomena and quantum theory. More, however, is at stake. The main value of such projects lies in solving outstanding problems of fundamental physics, as in quantum field theory (which still has unresolved problems, its extraordinary successes notwithstanding) or quantum gravity, which has no model as yet [24, 25]. The same argument applies to Q-modeling beyond physics. The future of mathematical modeling there is at stake as well.

Before addressing the relationships between fundamental principles and Q-models in psychology, economics, and decision science, it may be helpful to summarize the non-realist, the RWRprinciple-based, interpretation of quantum phenomena and quantum mechanics outlined in Section Physical Principles and Mathematical Models in Quantum Mechanics. While quantum objects are assumed to exist, the character of this existence or reality is, by the RWR principle, assumed to be beyond representation and even conception. As such, this reality is different from the reality of quantum phenomena, which are

<sup>10</sup>I have discussed the subject, also in relation to complementarity, in Plotnitsky (9, pp. 136–54). These connections also bring in a related (EPR-correlation) concept, "contextuality." This concept plays a significant role in Q-modeling beyond physics [1, pp. 363–5, 21].

<sup>11</sup>I have discussed the role of principles of quantum information theory beyond physics in Plotnitsky [6].

defined by what is observed in measuring instruments under the impact of quantum objects and, thus, can be represented. There are no mathematically expressed physical laws corresponding to the behavior of quantum objects. There are, however, mathematical laws that, expressing the QP/QS principle, enable correct probabilistic or statistical predictions of the outcomes of quantum experiments, manifested in measuring instruments, in all contexts. In addition, there are two interpretations of these mathematical laws. The first is probabilistic, along Bayesian lines, in which case these laws are seen as allowing one to assign probabilities to the outcomes of individual quantum events in accordance with one or the other law of the available set of laws, specifically those applicable in complementary situations. The second is statistical, when no such probabilities could be assigned because the outcomes of individual quantum experiments are not comprehended even by these laws and are seen as random, while these laws are assumed to predict the statistics of multiple identically prepared experiments in the corresponding contexts.

It is clear, however, that this conceptual architecture, in either the Bayesian or statistical interpretation, cannot apply unaltered in considering, along non-realist lines, human phenomena found in psychology, economics, or decision science and the possible Q-models there. This is because, while there are individual objects or, the case may be, (human) subjects and processes to consider, there are no elementary objects of the type found in quantum physics. There is nothing analogous to elementary particles, such as electrons or photons, and there is rarely a completely random individual behavior. When one deals in these fields with large multiplicities one can, either in using Cor Q-models, average the individual behavior and statistically disregard the differences in this behavior, differences defined by psychological or other human and social factors, in which case one could apply either a Bayesian or statistical interpretation of the Q-model used. While, however, this averaging is sometimes possible in psychology, economics, and decision science, there are often significant obstacles in using it. Each sequence of events considered in such situations is singular, unique. Accordingly, if a Q-model applies in a given class of such cases, it would have to be interpreted on Bayesian lines, if one can establish such a class. If not, then, as discussed below, another type of models may be possible, the singularized probabilistic (SP) models, some of which are time-dependent (TDSP). Each such model is unique to the individual situation considered, rather than applicable to a class of individual situations; and this uniqueness may pose difficulties for scientific use of such models.

# The QP/QS Principle and the Complementarity Principle

Beginning with Tversky and Kahneman's work in the 1970–80's [e.g., 26], it has been primarily the presence of probabilistic data akin to those encountered in quantum physics that suggested using Q-models in cognitive psychology, decision science, and economics [e.g., 1, 2] <sup>12</sup>. Economic behavior may also involve psychological factors of the type analyzed by Tversky and Kahneman. (Kahneman was eventually awarded a Nobel Prize in economics.) The recourse to Q-models is motivated by the fact that one could not effectively use the classical (additive) rules but could use the quantum-mechanical-like (non-additive) rules for predicting the probabilities of the outcomes of certain psychological experiments, such as those involving responses to certain specific questions, asked sequentially. These responses were found to be statistically dependent on the order in which they were asked, which, again, in parallel with quantum mechanics, suggested that a non-commutative model and, in combination with the non-additive rules for calculating the probabilities involved, a Q-model could be more effective13. To clarify this parallel, in quantum mechanics, simultaneously measuring, or simultaneously asking questions concerning, two or more complementary variables, such as the position and the momentum of a given quantum object, are mutually exclusive or incompatible. Correlatively, changing the order of measuring (of asking the question concerning) the position and then the momentum of a quantum object, in general, changes the outcomes and hence our predictions concerning them. This circumstance is reflected, experimentally, in the uncertainty relations, and mathematically, in the noncommutativity of the multiplication of the corresponding Hilbert-space operators in the formalism, and epistemologically, in the complementarity of these two measurements. One can, analogously, consider psychologically incompatible and, thus, complementary questions in psychology and attempt to handle the corresponding events statistically by a Q-model [e.g., 1, pp. 259–260]. The situation involves further complexities in and outside quantum physics, which I put aside here. I would like, however, to mention R. Spekkens's article, which introduced "a toy theory," based on the following principle, linked to complementarity: "the number of questions about the physical state of a system that are answered must always be equal to the number that are unanswered in a state of maximal knowledge. Many quantum phenomena are found to have analogs within this toy theory." Many but not all! For the theory expressly fails to reproduce some among the crucial features of quantum theory, specifically and intriguingly some of those related to correlations and entanglement, such as "violations of Bell inequalities and the existence of a Kochen-Specker theorem" [27, p. 032110]. This failure reminds us that models based on the existence of incompatible questions, in and outside physics, may mathematically differ from quantum mechanics.

Q-models are, then, used to predict probabilities and correlations found in such experiments, without being expressly concerned with the principles characterizing the situations considered, but only assuming certain mathematical principles inherent in the quantum-mechanical formalism. Some among the principles of the first kind are, nevertheless, implicitly at work, specifically the QP/QS principle or the principle of incompatibility, in effect complementarity14. Whether these Qmodels are required or C-models, models derived from the

<sup>12</sup>I also refer to these works for more detailed discussions of the ways in which Q-models are used in these fields.

<sup>13</sup>As noted earlier, this does not mean that such probabilities could not be predicted by means of alternative models even in quantum physics.

<sup>14</sup>Complementarity has received some attention outside physics, beginning with Bohr's own (tentative) suggestions. Inspired by Bohr and others did propose using the concept in philosophy, biology, and psychology. See Plotnitsky [28, pp. 158–66] and [29].

mathematics of classical physics, suffice remains, again, an open question, although it is difficult to assume that C-models could provide the non-additive probabilities necessary in such cases. A model alternative to that of quantum mechanics, possibly also free of quantum amplitudes and dealing directly with probabilities, is, in principle, possible even, as noted earlier, in quantum physics, but such a model is unlikely to be akin to those of classical physics. Thus, while they are both realist and causal, Bohmian models are mathematically different from those of classical physics. It may also be possible to construct a realist and causal mathematical model that would represent a deeper level of reality and that would have quantum mechanics as its limit, and then extend this model beyond physics [e.g., 30].

In any event, one can see the QP/QS principle, in part in conjunction with complementarity, as the main principle behind the use of Q-models beyond physics, accompanied, as in quantum mechanics, by the specific (non-additive) calculus of probability. Indeed, the QP/QS principle, along with the QD principle, was the starting principle for Heisenberg. The role of complementarity, only implicit initially by virtue of the non-commutative nature of Heisenberg's scheme, became apparent shortly thereafter, helped by Heisenberg's discovery of the uncertainty relations in 1927. It became clear that noncommutativity, the uncertainty relations, and complementarity were correlative, representing, respectively, the mathematical, physical, and epistemological aspects of the quantum-mechanical situation, defined by quantum discreteness (the QD principle). As noted earlier, quantum discreteness was eventually rethought by Bohr in terms of quantum phenomena, defined by what is observed in measuring instruments impacted by quantum objects, as opposed to the nature of quantum objects and processes, which are beyond conception and, hence, cannot be thought of as either discrete or continuous.

The psychological, economic, and decision-making phenomena treated by means of Q-models do not exhibit this type of irreducible discreteness or individuality. The processes that connect these phenomena are more akin to processes considered in classical physics, especially in chaos or complexity theory, again, often providing mathematical models, C-models, used in these fields. Now, assuming the defining role of, jointly, the QP/QS principle and the complementarity principle in considering these phenomena, could some form of the QD principle, correlative to the QP/QS principle in quantum mechanics, find its place in considering or even in order to derive Q-models in these fields? And if so, or in the first place, would the RWR principle, or a proto-RWR principle of the type used by Heisenberg, also be applicable? There are reasons to believe that such might be the case.

# The RWR and QD Principles

Bohr thought that, along with the complementarity principle, the RWR principle might apply in biology and psychology. In considering biology, he argued as follows:

appears as an irrational element from the point of view of the classical mechanical physics, taken together with the existence of elementary particles, forms the foundation of atomic physics. The asserted impossibility of a physical or chemical explanation of the function peculiar to life would in this sense be analogous to the insufficiency of the mechanical analysis for the understanding of the stability of atoms. [31, p. 458; emphasis added]

The ultimate character of biological processes may, thus, be beyond representation or even conception, in accord with the RWR principle. Once the theory suspends accounting for the connections between the phenomena considered, these phenomena are unavoidably discrete, leading to the QD principle, and our predictions concerning them are unavoidably probabilistic, leading to the QP/QS principle. Our predictions concerning them are likely to follow a (non-additive) probability calculus of the type used in quantum probability, and thus are likely to require a Q-model. This is because, by the RWR or proto-RWR principle, it would be difficult or even impossible to treat the processes connecting the phenomena considered as either continuous or causal. Bohr's appeal to "an irrational element" is noteworthy, and I shall comment on it below. It is important that, as Bohr clearly implies here, this approach is possible even if the nature of biological processes is not physically quantum in the sense of being able to have physically quantum effects. (The ultimate constitution of all matter is quantum, but this constitution does not manifest itself apart from quantum experiments.) If they were quantum, such processes would be unrepresentable or inconceivable in Bohr's interpretation. At stake here, however, are parallel, rather than physically connected, situations that may require using the same type of mathematical models, Q-models, without possible connections between the systems defining these situations<sup>15</sup> .

A recent article by Haven and Khrennikov provides an instructive example for possible roles of both the RWR and QD principle in market economics in their Q-modeling of market phenomena involving arbitrage as analogous to quantum tunneling [33]. The term "quantum tunneling" refers to a quantum object's capacity to "tunnel" through an energy barrier that it would not be able to surmount if it behaved classically. It is a quantum phenomenon par excellence. The quantum process itself behind any given case of quantum tunneling cannot be observed. One only ascertains that a particle can be found beyond the barrier, which is to say, that the corresponding measurement will register an impact of this particle on the measuring instrument beyond the barrier. Thus, in accord with the general situation that obtains in quantum mechanics, one deals with two discrete phenomena, connected by probabilistic or (in which case, we need multiple trials) statistical predictions concerning the second event on the basis of the first. "Arbitrage" is the practice of taking advantage of a price difference between two or more markets: striking a combination of matching deals that capitalize on the imbalance, the profit being the difference

The existence of life must be considered as an elementary fact that cannot be explained, but must be taken as a starting point in biology, in a similar way as the quantum of action, which

<sup>15</sup>There are several recent arguments for such connections, most prominent of which is arguably that by Penrose [32] and developed in several subsequent studies. The model itself that Penrose has in mind is, thus far, only mathematically conjectured, following certain approaches to quantum gravity.

between the market prices. An arbitrage is a transaction that involves no negative cash flow at any probabilistic or temporal state and a positive cash flow in at least one state; in simple terms, it is the possibility, ideally, of a risk-free profit at zero cost. In practice, there are always risks in arbitrage, sometimes minor (such as fluctuation of prices decreasing profit margins) and sometimes major (such as devaluation of a currency or derivative). In most ideal models, an arbitrage involves taking advantage of differences in price of a single asset or identical cash-flows.

Now, if arbitrage can be modeled analogously to quantum tunneling in physics, one might expect features analogous to those found in quantum tunneling, which dramatically exhibits the character of quantum phenomena. Haven and Khrennikov are primarily concerned with the use of Q-models in predicting the probabilities involved, by QP/QS principle (accompanied by the non-additive calculus of probabilities), rather than with the QD and the RWR, or proto-RWR, principles. They do, however, offer some considerations concerning discreteness:

We believe that the equivalent of quantum discreteness in this paper corresponds to the idea that each act of arbitrage is a discrete event corresponding to the detection of a quantum system after it passed ... the barrier. In reality arbitrage opportunities do not occur on a continuous time scale. They appear at discrete time spots and often experience very short lives. We would like to argue that it is the tunneling effect which is closely associated to the occurrence of arbitrage. ... We also mentioned the wave function in the discussion above, and quantum discreteness is narrowly linked with quantum probabilities. [33, p. 4095]

This view at least allows for an interpretation of the phenomenon of arbitrage in terms of the QD and the RWR principles, even if it does not require it. Haven and Khrennikov, while, again, allowing for the applicability of the QD principle, do not appear to subscribe to the RWR principle, or even to the proto-RWR principle16. In effect, however, they follow the proto-RWR principle, insofar as they are not concerned with representing how arbitrage actually occurs, any more than Heisenberg was concerned with representing the behavior of the electron in the hydrogen atom in deriving his formalism. They are only concerned with predicting the probabilities or statistics of future events of arbitrage.

Thus, situations governed the QD, QP/QS, and RWR (or proto-RWR) principles are possible in economics, psychology, and decision science, and just as in quantum mechanics, they may allow for either a statistical or Bayesian view of the Qmodel used. When finite-dimensional Q-models (dealing with discrete variables, such a spin) are used, as they often are in these fields, one can also consider the application of the principles of quantum information theory. While I cannot address the subject in detail, the operational framework, used in this field, merits a brief detour. This framework allows one to arrive at Q-models in a more rigorous and first-principle-like way, by using the rules governing the structure of operational devices, "circuits," via recent work on monoidal categories and linear logic [13–15, 34].

According to Chiribella et al.: "The operational-probabilistic framework combines the operational language of circuits with the toolbox of probability theory: on the one hand experiments are described by circuits resulting from the connection of physical devices, on the other hand each device in the circuit can have classical outcomes and the theory provides the probability distribution of outcomes when the devices are connected to form closed circuits (that is, circuits that start with a preparation and end with a measurement)" [13, p. 3]. A circuit is an arrangement of measuring instruments capable of quantum measurements and predictions, which are, again, probabilistic or statistical, and sometimes, as in the EPR type of experiments, are correlated, which gives a circuit a very specific architecture, corresponding only to quantum but not classical experiments. A realist representation of a circuit is possible because a circuit is described by classical physics, even though it interacts with quantum objects, and thus has a quantum stratum, enabling this interaction. Hence, the information obtained by means of a circuit is physically classical, too, but the architecture and mode of transmission of this information is quantum: they cannot be generated by a classical process.

As discussed earlier, Heisenberg found the formalism of quantum mechanics by adopting, in addition to the QD, QP/QS, and proto-RWR principles, the mathematical correspondence principle and, by the latter principle, using the equations of classical mechanics, while changing the variables in these equations. This principle was not exactly the first principle. In particular, it depended on formally adopting the equations of classical mechanics, while one might prefer these equations to be a consequence of fundamental quantum principles. Heisenberg's variables were new, which was his great discovery. But they were new more of a guess, a logical guess, fitting the probabilities of transitions between the energy levels of the electron in the hydrogen atom he worked with. In the operational framework, one derives finite-dimensional quantum theory in a more first-principle-like way, in particular, independently of classical mechanics (which does not exist for discrete variables, such as spin). This derivation is made possible by applying the rules that define the operational language of circuits, as the language of monoidal categories and linear logic, and thus giving a mathematical structure to operational circuits themselves and thus, in effect, to measuring instruments [13, p. 4, 33]. These rules are more empirical, but they are not completely empirical (which no rules may ever be), because circuits are given a mathematical structure, from which the mathematical architecture of the theory emerges17. The resulting formalism is equivalent to the standard Hilbert-space formalism. As in Heisenberg, one only deals with "mathematical representations" providing the probabilities or statistics of the outcomes of discrete quantum experiments, in accord with the QD and QP/QS principles, without providing a representation of quantum processes themselves, in accord with the RWR principle.

<sup>16</sup>As indicated earlier, elsewhere Khrennikov argued for a classical-like model at the ultimate level of the constitution of nature in physics [30].

<sup>17</sup>See also Plotnitsky [9, pp. 248–58] and Hardy [15].

In the areas of social science, which concerns human subjects, establishing the mathematical architecture for such "circuits" is a formidable task. However, given important recent work along the lines of category theory beyond physics [e.g., 35], this approach may prove to be viable in enabling a principle approach in Q-modeling outside physics<sup>18</sup> .

# Q-Theories as Rational Theories of the Irrational

As indicated earlier, while the main reasons for using Q-models in psychology, economics, and decision science are due to the quantum-like nature or calculus of the probabilities associated with predicting certain phenomena, the underlying dynamics of the cognitive or psychological processes leading to each such phenomenon individually might, in principle, be causal or partially causal. This dynamics might also not be causal, especially given the quantum (non-additive) character of the probabilities involved. If it is causal or partially causal, then, unlike quantum processes, in non-realist interpretations, an analysis of these psychological processes may be possible, rather than "in principle excluded" [3, v. 2, p. 62]. This is because one might expect psychological, social, or economic reasons shaping these situations, and one of the tasks of analyzing them to explain these reasons, an imperative that is hard to avoid, as is clearly apparent in Tversky and Kahneman's articles [26, 37] or in Pothos and Buseymeyer's survey [1].

Psychological, social, or economic research using Q-models may renounce this task, especially in statistical analysis, thus in effect assuming a form of proto-RWR principle, akin to that used by Heisenberg. Even in this case, however, the question would still arise to what degree the QP/QS, QD, and (strictly) RWR principles, or the principles of quantum information theory, could apply in these fields, in particular in considering individual situations. As explained earlier, in quantum mechanics, in non-realist interpretations, the latter could either be treated on Bayesian lines or, in statistical interpretations, assumed to be random, which assumption would, again, be difficult in the fields in question at the moment. Some considerations of discreteness are unavoidable because, as noted, probability has an irreducibly futural and discrete character by dealing with estimates concerning discrete future events.

It is a more complex question whether one can renounce, as one does in quantum mechanics, in non-realist interpretations, considering or even assuming the existence of continuous processes connecting these events. I would surmise that such may be the case and that our brains may work, at least sometimes, in accordance with the QD, the QP/QS, and the RWR principles. This means they would not be relying on and calculating hidden causality connecting events but would instead functions by relying on the quantum-like workings of probabilities and correlations. This type of brain functioning would define what may be called a Bayesian Q-brain, which would require the corresponding Bayesian models. Importantly, however, this kind of Bayesian brain is fundamentally different from rational Bayesian agents, associated with the term Bayesian in cognitive psychology. Indeed, Q-models there are in part advanced in these fields against this concept of human agency. A Bayesian Q-brain need not always function "rationally," at least, not in accordance with any single concept of rationality. A corresponding Bayesian Q-model, if possible, would allow one to predict the outcomes of decisions governed by the brain processes of the individual subjects involved without having, even conjecturally, a full access to these processes, by the RWR principle. Nor do those who make these decisions have this access: these processes are unconscious, and, if one assumes the RWR principle, this part of the unconscious is not causal or "rational" (in its own way), as S. Freud, for example, saw it [38]. Freud's thinking on this point was, however, ultimately more complex, even if against his own grain.

It is instructive to return, in this context, to Bohr's invocation of "an irrational element," in the passage cited above and repeated elsewhere in his writings. The idea and even the language of irrationality have often been seen as problematic by Bohr's critics and even by some of his advocates. I would argue this assessment to be a result of misunderstanding Bohr's meaning. This "irrationality" is not any "irrationality" of quantum mechanics, which Bohr saw as a rational theory, a "rational quantum mechanics," and argued for its rational character throughout his writing (e.g., 3, v. 1, p. 48; 3, v. 2, p. 63). However, he did see it as a rational theory of something—the nature of quantum objects and processes—that is inaccessible to rational thinking, or at least to a rational representation. If, as he says, "the quantum of action [h], which appears as an irrational element from the point of view of the classical mechanical physics," it only means that cannot be rationally incorporated into the latter [31, p. 458].

Tversky and Kahneman's and related arguments are, too, sometimes seen as related to "irrational" elements in decisionmaking. This decision-making replaces purportedly "rational" Bayesian agents with at least partially "irrational" Bayesian agents. The "rational" Bayesian agents, as explained above, use probabilistic reasoning subject to updating their estimates on the basis of new information (which defines the Bayesian approach to probability). The irrationality of "irrational" Bayesian agents may be divided into three main, sometimes overlapping, types. The first type is in effect a form of rationality. This rationality is, however, different from rationality presumed to be dominant in the class of situations considered, say, the rationality of maximizing one's monetary benefits. In addition, this alternative rationality may be unconscious. The second type of irrationality refers to something that could be explained. However, it defies explaining it as anything assumed to be rational, say, as a form of rational behavior, beforehand. This irrationality may, upon further analysis, reveal itself to be the irrationality of the first type, but it may also be an alternative form of rationality19. Finally, the third type of irrationality is that invoked by Bohr: a realist

<sup>18</sup>See also a recent approach to representing sensation-perception dynamics in terms of quantum-like mental instruments, which are akin to "circuits," in Khrennikov [36].

<sup>19</sup>Some might still see, as Freud did, this "irrationality" as a form of unconscious "rationality." Once again, however, Freud, against his own grain, could not ultimately avoid giving the unconscious a stratum that is beyond representation, if not conception.

theory cannot incorporate it in its handling of the corresponding phenomena, while a non-realist Q-model or theory can make it part of its probabilistically predictive scheme without explaining it. In this way, QD, QP (or, if averaging is possible QS), and RWR principles can be brought together in this domain.

There is yet another possibility, which leads to a different type of models or theories, conforming to the QD, QP (but not QS), and RWR principles. I shall call such models or theories singularized probabilistic (SP) models or theories, keeping in mind their non-realist, RWR-principle-based, character. Realist SP models are possible, but I shall not be concerned with them. SP-models may also be time-dependent (TDSP). Such models can only be briefly sketched here in conceptual and somewhat abstract terms, but their possibility is intriguing. SP- or TDSPmodels need not be mathematically related to Q-models, but they might be, given the shared principles in which they are based.

# Singularized Probabilistic (SP) Theories and Models

Let us recall that, as reflected in the complementarity principle, in quantum mechanics there is no single, uniform physical law applicable to quantum behavior in all contexts, while the same mathematical formalism or model can be used in all contexts. Depending on whether an interpretation is statistical or (Bayesian) probabilistic, the individual quantum behavior is either assumed to be random or to be subject to the probabilistic law, the application of which is defined by the context. By contrast, in the case an SP-model or theory, the following situation obtains. While, as in quantum physics, there is no single uniform physics law, realist or not, each individual behavior obeys its own singular law, defined by its own mathematical model, rather than conforms to one or another contextual probabilistic or statistical law, from a (determinable) set of such laws determined by the theory, using a single mathematical model. Under the RWR principle, assumed here for SP-models, such a model still does not represent the reality of the ultimate processes considered, which makes the absence of not only determinism but also causality automatic, just as in quantum mechanics under the RWR principle. One cannot, however, any longer adopt a statistical view, which assumed multiplicities of events that could be averaged (in quantum mechanics, contextually). In each case, only a Bayesian view of the corresponding (unique) model is possible. Such individual laws and accompanying mathematical models may also be changing in time, a change observed each time a new observation occurs. If so, the corresponding model or theory becomes time dependent, TDSP.

The concept of an SP and especially a TDSP model or theory is a radical idea, to my knowledge, rarely, if ever, entertained, at least in science20. Indeed, it is not clear whether such theories and, especially, the mathematical models defined by them are scientifically viable, particularly if the corresponding mathematical laws are assumed to be changing in time, possibly on small scales. For an effective scientific practice to be possible, one might need regularities beyond those found in each singular situation, for which a mathematical model, unique to it, would be introduced, say, in order to predict the outcome of events. Such changes of laws and models could, in principle, be governed mathematically, have an overall mathematical model. Thus, one could have a set of models mathematically parameterized so as to allow one to use them for different individual situations and to adjust them to make effective predictions in all of these situations. If not, then each case would require its own mathematical model. Would mathematical-experimental sciences, as they are practiced now, still be possible, then?

Furthermore, there might, in a given domain, be individual cases the character of which will defeat our attempt to treat them by mathematical means. Indeed, this is already so in the case individual quantum processes if one adopts a statistical view, according to which each individual process is random, beyond the law. Now, however, there would not be statistical regularities, of the type found in quantum physics, applicable to multiplicities of repeatable cases (handled, moreover, by the same model, even if contextually), because there would be no repeatable cases in any meaningful sense. There would be neither statistical averaging, nor individual mathematical probabilistic treatment. This situation may be more familiar in literature, which is concerned with the particular or the singular, for example, with a unique life history of a novel's protagonist. One also encounters this singularity or uniqueness in life itself. Such histories resist and even preclude statistical averaging, again, allowed by, otherwise equally unique, histories (which cannot be thought of as classical trajectories of motion) of individual quantum objects, as well as mathematical handling. But they may become, at least outside physics, perhaps especially, in psychology (which often deals with the same human conditions as literature), part of science, a science that will combine science and non-science, or at least mathematical, both of the more standard or the SP/TDSP type, and nonmathematical modeling. Indeed, as just indicated, the SD/TDSP-modeling already poses complexities for scientific practice. Could this situation also emerge in physics, for example, in dealing with quantum gravity? This is not inconceivable. If it does, it will not end mathematical modeling in physics or, again, beyond, or the mathematical-experimental character of modern science, which has defined it beginning with Galileo. It might, however, change both, just as it happened in the case of quantum theory, which not only led to a revolutionary transformation—physical, mathematical, and philosophical—of physics itself but also opened new possibilities for scientific thinking and mathematical modeling beyond physics.

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

# FUNDING

This work was funded by The Purdue Distinguished Professorship Research Fund.

<sup>20</sup>Something akin to this possibility has been suggested in physics in Ungar and Smolin [39], but in a different context and based it on a very different set of principles than those adopted here, most especially because, as against the present argument, they assume realism and causality.

### ACKNOWLEDGMENTS

I would like to thank Mauro G. D'Ariano, Emmanuel Haven, Gregg Jaeger, and Andrei Khrennikov for helpful discussions concerning the subjects considered in this article. I am grateful

#### REFERENCES


to both readers of the article for their constructive criticisms, especially one of these readers, who made helpful specific suggestions for revisions and who directed my attention to Robert Spekkens's article "Evidence for the epistemic view of quantum states: A toy theory."


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Plotnitsky. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Quantization, Frobenius and Bi Algebras from the Categorical Framework of Quantum Mechanics to Natural Language Semantics

#### Mehrnoosh Sadrzadeh\*

*Theory Group and Computational Linguistics Lab, School of Electronic Engineering and Computer Science, Queen Mary University, London, United Kingdom*

#### Edited by:

*Emmanuel E. Haven, University of Leicester, United Kingdom*

#### Reviewed by:

*Yousef Azizi, Institute for Advanced Studies in Basic Sciences, Iran Jan Sladkowski, University of Silesia in Katowice, Poland Alexander Vladimirovich Bogdanov, Saint Petersburg State University, Russia*

#### \*Correspondence:

*Mehrnoosh Sadrzadeh mehrnoosh.sadrzadeh@qmul.ac.uk*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *12 December 2016* Accepted: *23 May 2017* Published: *04 July 2017*

#### Citation:

*Sadrzadeh M (2017) Quantization, Frobenius and Bi Algebras from the Categorical Framework of Quantum Mechanics to Natural Language Semantics. Front. Phys. 5:18. doi: 10.3389/fphy.2017.00018* Compact Closed categories and Frobenius and Bi algebras have been applied to model and reason about Quantum protocols. The same constructions have also been applied to reason about natural language semantics under the name: "categorical distributional compositional" semantics, or in short, the "DisCoCat" model. This model combines the statistical vector models of word meaning with the compositional models of grammatical structure. It has been applied to natural language tasks such as disambiguation, paraphrasing and entailment of phrases and sentences. The passage from the grammatical structure to vectors is provided by a functor, similar to the *Quantization* functor of Quantum Field Theory. The original DisCoCat model only used compact closed categories. Later, Frobenius algebras were added to it to model long distance dependancies such as relative pronouns. Recently, bialgebras have been added to the pack to reason about quantifiers. This paper reviews these constructions and their application to natural language semantics. We go over the theory and present some of the core experimental results.

Keywords: compact closed categories, frobenius algebras, bialgebras, quantization functor, categorical quantum mechanics, compositional distributional semantics, pregroup grammars, natural language processing

# 1. INTRODUCTION

Categorical compositional distributional semantics is a model of natural language that combines the statistical vector models of word meanings with the compositional models of grammar. The grammatical structures are modeled as morphisms of a compact closed category of grammatical types, the vector representations of word meanings are modeled as morphisms of the category of finite dimensional vector spaces, which is also a compact closed category. The passage from grammatical structure to vectorial meaning is by connecting the two categories with a structure preserving map, in categorial words, a functor.

<sup>F</sup> : Grammar H⇒ Meaning

This passage allows us to build vector representations for meanings of phrases and sentences, by using the vectors of the words and the grammatical structure of the phrase or sentence. Formally, this procedure is the application of the image of the functor on the grammatical structure to the meaning vectors of the words. Still more formally, given a string of words w1w<sup>2</sup> · · · wn, one first formalizes their grammatical structure as a morphism α in the compact closed category of grammar, introduced by Lambek and Lambek and Preller, as the categorical semantics of pregroup type-logical grammars, see Lambek [1, 2]. We denote these below by **Preg**. The vector meanings of words live the category of finite dimensional vector spaces, denoted below by **FVect**. The more concrete version of the above functor is thus as follows:

$$F \colon \mathbf{Prog} \implies \mathbf{Fect}$$

The functor F transforms α to a linear map in **FVect**. This linear map is applied to the vectors of the words within the phrase or sentence. The whole procedure is formalized below:

$$F(\*) \quad \overrightarrow{\boldsymbol{w}\_1 \boldsymbol{w}\_2 \cdots \boldsymbol{w}\_n} = F(\overrightarrow{\boldsymbol{w}}\_1 \otimes \overrightarrow{\boldsymbol{w}}\_2 \otimes \cdots \otimes \overrightarrow{\boldsymbol{w}}\_n)$$

Vectors of words, i.e., the −→w <sup>i</sup> 's are represented by morphims I → V of the category of finite dimensional vector spaces, for V the vector space in which the meaning of the word lives. The tensor product ⊗ between these morphisms is the categorical way of packing them together. This model, referred to by DisCoCat, for Categorical Compositional Distributional, was the first model that put together the vector meanings of words by taking into account their grammatical structure, in order to build a vector for the phrase or sentence containing the words.

DisCoCat relates to the categorical models of Quantum phenomena in two ways. One is through the function F; quoting from Coecke et al. [3]:

"A structure preserving passage to the category of vector spaces is not a one-off development especially tailored for our purposes. It is an example of a more general construction, namely, a passage long-known in Topological Quantum Field Theory (TQFT). This general passage was first developed by Atiyah [4] in the context of TQFT and was given the name "Quantization," as it adjoins "quantum structure" (in terms of vectors) to a purely topological entity, namely the cobordisms representing the topology of manifolds. Later, this passage was generalized to abstract mathematical structures and recast in terms of functors whose co-domain was **FVect** by Baez and Dolan [5] and Kock [6]. This is exactly what is happening in our [DisCoCat] semantic framework: the sentence formation rules are formalized using type-logics and assigned quantitative values in terms of vector composition operations. This procedure makes our passage from grammatical structure to vector space meaning a "Quantization" functor. Hence, one can say that what we are developing here is a grammatical quantum field theory for Lambek pregroups. "

The other connection is that the DisCoCat model, i.e., the tuple **Preg**, **FVect** , F , was originally inspired by the categorial model of Quantum Mechanics, as developed by Abramsky and Coecke [7]. CQM, for Categorical Quantum Mechanics, models Quantum protocols using compact closed categories and their vector space instantiations (more specifically they use dagger compact closed categories and category of Hilbert spaces, which have also been used in DisoCat, e.g., see [8]). The aim of this review is to briefly describe the DisCoCat model and its recent extensions with Frobenius and Bi algebras. These extensions were inspired by extensions to categorical Quantum Mechanics: the work of Coecke et al. [9, 10] for the use of Frobenius Algebras and Coecke and Duncan [11] for Bialgebras. These extensions have enabled us to reason, in a structured way, about logical words of language such as relative pronouns "who, whom, what, that, which, etc." and quantifiers "all, some, at least, at most, none, etc."

In what follows, we will first review the advances made in the DisCoCat model in a chronological order; then go through the the core of theoretical underpinnings of the model and finally present some of the main experiments performed to validate the theoretical predictions.

# 2. A CHRONOLOGICAL OVERVIEW OF DISCOCAT

The origins of the DisCoCat model goes back to the work of Clark and Pulman [12], presented in the AAAI Spring Symposium on Quantum Interaction (QI) in 2007. The paper discussed vector models of word meaning, otherwise known as distributional semantics, and outlined an open problem they faced. The open problem is how to extend distributional models so that they can assign vector meanings to phrases and sentences of language. In their proposed extended model, Clark and Pulman, inspired by the Harmonic Grammars of Smolensky [13], argue for the use of tensors. The vector meaning of a string of words w1w<sup>2</sup> · · · wn, as defined by them, is as follows:

$$\overrightarrow{\boldsymbol{\omega}\_{!}} \overrightarrow{\boldsymbol{\omega}\_{1} \cdots \boldsymbol{\omega}\_{n}} = \sum\_{i} \overrightarrow{\boldsymbol{\omega}\_{i}} \otimes \overrightarrow{\boldsymbol{\tau}}\_{i}$$

where −→r <sup>i</sup> is a vector representing the grammatical role played by word w<sup>i</sup> in the string. The problem with this model is that firstly, vector meanings of sentences grow as the sentence becomes larger and building tensor models for them becomes costly. Secondly, sentences that have different grammatical meanings live in different spaces and thus their meanings cannot be compared to each other. Further, Clark and Pulman do not provide any experimental support for their models. Subsequently, in a paper presented in QI in 2008, Stephen et al. [14], addressed the former two of these problems by presenting the first DisCoCat model. An extended version of this paper later appeared in Lambek's 90'th Festschrift [15] in 2010. The original DisCoCat model presented there worked along side the following triangle:

where the space between the **FVect** and **Preg** was interpreted as pairing. That is, instead of working with a functor between the two categories of grammar and meaning, there is only one category: the category whose objects are pairs (V, p) of V a vector

space and p its grammatical type (i.e., the grammatical type of the words living in that category), and whose morphisms are also pairs (<sup>f</sup> : <sup>V</sup> <sup>→</sup> <sup>W</sup>, <sup>p</sup> <sup>≤</sup> <sup>q</sup>), for <sup>f</sup> : <sup>V</sup> <sup>→</sup> <sup>W</sup> a linear map and <sup>≤</sup> :<sup>p</sup> <sup>→</sup> <sup>q</sup> the partial ordering of the pregroup grammar. In this model, pregroups where treated as partial order categories. The functorial form of DisCoCat, described in the introduction, was fist introduced by Preller and Sadrzadeh [16] in 2010 and later connected to the Quantization of TQFT by Coecke et al. [3] in 2013. Similar connections, albeit not in a functorial form and not to TQFT but to CQM in general, were also drawn in a paper by Lambek [17]. It should be noted that the main contribution of Coecke et al. [3] was however, to extend the functorial passage and thus the DisCoCat from Lambek Pregroups to the original monoidal calculus of Lambek [18].

# <sup>F</sup> : **Monoidal Closed Cats** H⇒ **FVect**

Although grammar-aware vector space models of meaning existed for adjective noun phrases via the work of Baroni and Zamparelli [19], but the above DisCoCat model was the first one where this grammar-awareness was theoretically defined for all language constructs. Later, in 2013, in the paper by Grefenstette et al. [20], it was shown how the concrete constructions of Baroni and Zamparelli [19] can be used in the DisCoCat to build matrices and tensors for intransitive and transitive verbs. But extending these concrete models to words such as relative pronouns, quantifiers and prepositions proved to be problematic, due to data sparsity, as also shown in Baroni et al. [21].

The theoretical predictions of DisCoCat were first experimentally verified in a paper by Grefenstette and Sadrzadeh [22]. They presented an algorithm to build the matrices/tensors of the model and implemented it on intransitive and transitive verbs and further applied these to a disambiguation task. The intransitive version of the task was originally developed by Mitchell and Lapata [23], the extension to transitive was a novel contribution, leading to a dataset that was later used by the community in many occasions. In a short paper in the same year [24] presented a few extensions to the constructions of the latter model. The full set of results, together with extensions to adjective-noun phrases and sentences containing them, appeared in the journal article by Grefenstette and Sadrzadeh [25].

The model described above had one flaw, namely that sentences of different types acquired vectors that lived in different vectors spaces. This made it impossible to fully benefit from DisCoCat, one of whose original promises was an extension of the model of Clark and Pulman to one that one can actually compare sentences. This shortcoming was later overcome by Kartsaklis et al. [26], where it was shown that two possible applications of Frobenius algebras on the concrete model of Grefenstette and Sadrzadeh [22] solves the problem and leads us to a uniform sentence space.

After the above, DisCoCat has been extended to cover larger fragments of language and also it has been implemented on different vector models with different implementation parameters and experimented with in different settings. These latter contributions include the following:


We have also extended the model theoretically, where the major contributions are as follows:

• The use of Frobenius and Bi algebras to model linguistic phenomena that involve certain types of rearranging of information within phrases and sentences. An example of such a phenomena is relative and quantified clauses, such as "all men ate some cookies," "the man who ate the cookies" and "the man whose dogs ate the cookies." We extended the DisCoCat model from compact closed categories to ones with Frobenius and Bi algebras over them and showed how the copying

<sup>1</sup>https://www.tensorflow.org/versions/r0.11/tutorials/word2vec/index.html

and merging operations of these algebras allow us to model meanings of quantified and relative clauses and sentences. The work done here includes that of Clark et al. [35], Sadrzadeh et al. [36, 37], Hedges and Sadrzadeh [38] and Sadrzadeh [39].

• Density matrices are the core of vector space models of Quantum Mechanics and indeed it has been shown that the category of density matrices and completely positive maps is also a compact closed category. The question arises so as whether and how these matrices have applications in linguistics. The work of Piedeleu et al. [8], Balkir et al. [40, 41] and Bankova et al. [42] showed that density matrices can model different meanings of ambiguous words and that they can also model a hierarchy of these meanings and thus be applied to entailment.

#### 3. OVERVIEW OF THEORY

This section reviews the theoretical framework of a DisCoCat. Its structure is as follows: in Section 3.1, we will review the distributional semantic models. We show how the motivating idea of these models are formalized in terms of vector representations and describe some theoretical and experimental parameters of the model and some of the major applications thereof. In Section 3.2. we review the grammatical model that was first used as a basis for compositional distributional models, namely the pregroup grammars of Lambek. We review the theory of pregroup algebras and exemplify its applications to reasoning about grammatical structures in natural language. In Section 3.3, we show how a functorial passage can be developed between a pregroup grammar, seen as a compact closed category, and the category of finite dimensional vector spaces and linear maps. We describe how this passage allows one to assign compositional vector semantics to words and sentences of language. This passage is similar to the one used in TQFT, where the grammatical part is replaced by the category of manifolds and cobordisms. Section 3.4, describes the theory of Frobenius and Bi algebras over compact closed categories. In Section 3.5, we show how these algebras can model meanings of relative and quantified clauses and sentences. In Section 3.6, we go through the graphical calculus of compact closed categories and Frobenius and Bi algebras over them. We exemplify how these are used in linguistics, where they depict flows of information between words of a sentence.

#### 3.1. Vector Models of Natural Language

Given a corpus of text, a set of contexts and a set of target words, the vector models of words work with a so called co-occurrence matrix. This has at each of its entries "the degree of co-occurrence between the target word and the context," developed amongst other by Salton et al. [43] and Rubenstein and Goodenough [44]. This degree is determined using the notion of a window: a span of words or grammatical relations that slides across the corpus and records the co-occurrences that happen within it. A context can be a word, a lemma, or a feature. A lemma is the canonical form of a word; it represents the set of different forms a word can take when used in a corpus. A feature represents a set of words that together express a pertinent linguistic property of a word. Given an m × n co-occurrence matrix, every target word t can be represented by a row vector of length n. For each feature c, the entries of this vector are a function of the raw co-occurrence counts, are computed as follows:

$$\text{raw}\_f(t) = \frac{\sum\_{c} N(f, t)}{k}$$

for N(f , t) the number of times the t and f have co-occurred in the window. Based on L, the total number of times that t has occurred in the corpus, the raw count is turned into various normalized degrees. Some common examples are probability, conditional probability, likelihood ratio and its logarithm: The lengths of the corpus, window, and normalization scheme are parameters of the model, as are the sizes of the feature and target sets, there has been a plentiful of papers who study these parameters, for example see Lapesa and Evert [45], Bullinaria and Levy [46], and Turney [47].

The distance between the meaning vectors, for instance the cosine of their angle, provides an experimentally successful measure of similarity of their meanings. For example, in the vector space of **Figure 1**, cited from Coecke et al. [3], the angle between meaning vectors of "cat" and "dog" is small and so is the angle between meaning vectors of "kill" and "murder." Such similarity measures have been implemented on large scale data (up to a billion words) to build high dimensional vector spaces (tens of thousands of basis vectors). These have been successfully applied to automatic generation of thesauri and other natural language tasks such as automatic indexing, meaning induction from text, and entailment, for example see Curran [48], Lin [49], Landauer and Dumais [50], Geffet and Dagan [51], and Weeds et al. [52].

#### 3.2. Pregroup Grammars

A pregroup algebra, as defined by Lambek [1], is a partially ordered monoid (P, ≤, ·, 1) where every element has a left and a right adjoint, which means that for every element p ∈ P we have a p <sup>r</sup> <sup>∈</sup> <sup>P</sup> and a <sup>p</sup> <sup>l</sup> <sup>∈</sup> <sup>P</sup> such that the following four inequalities hold:

$$p \cdot p^r \le 1 \le p^r \cdot p \qquad p^l \cdot p \le 1 \le p \cdot p^l$$

An example of a pregroup in arithmetics is the set of unbounded monotone functions on integers, where the monoid multiplication is function composition with the identity function its unit, and the left and right adjoints defined using min and max of integers. For reasons of space, we will not give these definitions here and refer the reader to Lambek [1, 2].

Pregroup algebras are applied to natural language via the notion of a pregroup grammar, defined to be a pair hD, Si, where D is a pregroup lexicon and S ⊂ B is a set of designated types, containing types such as that of a declarative sentence s, and a question q. A pregroup lexicon is a binary relation D, defined as

$$D \subseteq \Sigma \times T(B)$$

where T(B) is the free pregroup generated over B (for the free construction see [1]).

Given a pregroup grammar, as specified in Lambek [1], one says that a string of words w1w<sup>2</sup> · · ·w<sup>n</sup> of language is grammatical iff for 1 ≤ i ≤ n, there exists a (w<sup>i</sup> , ti) ∈ D, such that we have a type t ∈ T(B) ∩ D[6] such that the following partial order holds in T(B):

$$t\_1 \cdot t\_2 \cdot \cdots \cdot t\_n \le t$$

An example of a pregroup lexicon is presented in **Table 1**:

The pregroup reductions corresponding sentences (1) "men kill dogs," (2) "men kill cute dogs," and (3) "men do not kill dogs" are as follows (all cited from [3]):

$$\begin{aligned} (1) \ n \cdot n^r \cdot s \cdot n^l &\le 1 \cdot s \cdot 1 = s\\ (2) \ n \cdot n^r \cdot s \cdot n^l \cdot n \cdot n^l \cdot n &\le 1 \cdot s \cdot 1 \cdot 1 = s\\ (3) \ n \cdot n^r \cdot s \cdot j^l \cdot \sigma \cdot \sigma^r \cdot j \cdot j^l \cdot \sigma \cdot \sigma^r \cdot j \cdot n^l \cdot n\\ &\le 1 \cdot s \cdot j^l \cdot 1 \cdot j \cdot j^l \cdot 1 \cdot j \cdot 1 = s \cdot j^l \cdot j \cdot j^l \cdot j \le s \end{aligned}$$

#### 3.3. Quantization

In order to formalize the structure preserving passage between syntax: pregroup grammars and semantics: vector models, we formalize both of these in the language of compact closed categories [53]. For this reason, we very briefly recall some definitions. A compact closed category has objects A, B; morphisms <sup>f</sup> : <sup>A</sup> <sup>→</sup> <sup>B</sup>; a monoidal tensor <sup>A</sup> <sup>⊗</sup> <sup>B</sup> that has a unit I; and for each object A two objects A r and A l together with the following morphisms:

$$A \otimes A^r \stackrel{\epsilon\_A^r}{\longrightarrow} I \stackrel{\eta\_A^r}{\longrightarrow} A^r \otimes A \qquad A^l \otimes A \stackrel{\epsilon\_A^l}{\longrightarrow} I \stackrel{\eta\_A^l}{\longrightarrow} A \otimes A^l$$

These morphisms have to satisfy certain other conditions, among which are the four yanking equations, which for reasons of space

TABLE 1 | Type assignments for a toy language in a Lambek pregroup; table from Coecke et al. [3].


we will not give here. It is evident (and has also been proven, see for example [17, 54]), that pregroup algebras are compact closed categories. This is by taking the above ǫ and η maps to be the four adjoint inequalities of a pregroup algebra. Finite dimensional vector spaces with linear maps as morphisms are also compact closed categories, this has been shown by Kelly and Laplaza [53]. This category is symmetric, thus the left and right adjoints collapse to one, that is for V a finite dimensional vector space, we have V <sup>l</sup> <sup>=</sup> <sup>V</sup> <sup>r</sup> <sup>=</sup> <sup>V</sup> ∗ , where V ∗ is the dual space of V. In the presence of a fixed basis, however, one obtains the equivalence V ≡ V ∗ . Assuming so, the ǫ and η maps are defined as follows, for { −→<sup>r</sup> <sup>i</sup>}<sup>i</sup> a fixed basis:

$$\begin{aligned} \epsilon &= \epsilon^l = \epsilon^r \colon V \otimes V \to \mathbb{R} \\ &\vdots : \sum\_{ij} c\_{ij} \stackrel{\scriptstyle \rightharpoonup}{r}\_i \otimes \stackrel{\scriptstyle \rightharpoonup}{r}\_j \mapsto \sum\_{ij} c\_{ij} (\stackrel{\scriptstyle \rightharpoonup}{r}\_i \mid \stackrel{\scriptstyle \rightharpoonup}{r}\_j) \\ \eta &= \eta^l = \eta^r \colon \mathbb{R} \to V \otimes V^\* \qquad \colon \colon 1 \mapsto \sum\_i \stackrel{\scriptstyle \rightharpoonup}{r}\_i \otimes \stackrel{\scriptstyle \rightharpoonup}{r}\_i \end{aligned}$$

Now we can define the structure preserving map via the following Quantization functor:

$$F \colon \mathbf{Prog} \implies \mathbf{Fect}$$

explicitly defined as follows:


We have now formally defined a DisCoCat: the tuple (**Preg**, **FVect**, F), as defined above. It is in this setting that one obtains vector representations for sentences by applying the definition (∗) of introduction. For example, the vector representations of two of our example sentences above become as follows:

$$\begin{aligned} \overrightarrow{\text{men}\,\text{kill}\,\text{dogs}} &= (\epsilon\_W \otimes 1\_S \otimes \epsilon\_W) \left( \overrightarrow{\text{meh}} \otimes \overrightarrow{k\,l} \mathbb{I} \otimes \overrightarrow{\text{dogs}} \right) \\ &= \sum\_{ijk} \epsilon\_{ijk} \left< \overrightarrow{\text{meh}} \mid \overrightarrow{\text{w}\_i} \rangle \langle \overrightarrow{\text{w}\_k} \mid \overrightarrow{\text{dogs}} \rangle \right>\_{\overrightarrow{s}} \end{aligned}$$

−−−−−−−−−−−−→ men kill cute dogs <sup>=</sup>

$$\begin{split} & (\epsilon\_{W} \otimes 1\_{S} \otimes \epsilon\_{W} \otimes \epsilon\_{W}) (\overrightarrow{\text{me}} \overleftarrow{\text{h}} \otimes \overrightarrow{\text{k}} \overleftarrow{\text{l}} \otimes \overrightarrow{\text{cute}} \otimes \overrightarrow{\text{dogs}}) \\ & = \sum\_{ijk} \sum\_{lm} c\_{ijk} \epsilon\_{lm} \langle \overrightarrow{m\epsilon\hbar} \mid \overrightarrow{\text{w}} \overleftarrow{\text{i}} \rangle \langle \overrightarrow{\text{w}}\_{i} \rangle \langle \overrightarrow{\text{w}}\_{k} \mid \overrightarrow{\text{w}}\_{l} \rangle \langle \overrightarrow{\text{w}}\_{m} \mid \overrightarrow{\text{dogs}} \rangle \overrightarrow{\text{s}}\_{j} \rangle \end{split}$$

An important observation is that in this setting one obtains that, vector representations of words that have atomic types, e.g., men and dogs with type n are vectors −−→men, −−→dogs <sup>∈</sup> <sup>W</sup>. The representations of other words, e.g., cute and kill with types n r s and n r snl are matrices P ij cij −→<sup>w</sup> <sup>i</sup> <sup>⊗</sup> −→<sup>w</sup> <sup>j</sup> <sup>∈</sup> <sup>W</sup> <sup>⊗</sup><sup>W</sup> for { −→wi}<sup>i</sup> a basis for W and tensors P ijk cijk −→<sup>w</sup> <sup>i</sup> <sup>⊗</sup> −→<sup>s</sup> <sup>j</sup> <sup>⊗</sup> −→<sup>w</sup> <sup>k</sup> <sup>∈</sup> <sup>W</sup> <sup>⊗</sup> <sup>S</sup> <sup>⊗</sup> <sup>W</sup>, for { −→<sup>s</sup> <sup>j</sup>}<sup>j</sup> a basis in <sup>S</sup>.

#### 3.4. Frobenius and Bi Algebras

Both Frobenius and Bi algebras are defined over a symmetric monoidal category C. Frobenius Algebras were developed in their current from by Kock [6] and McCurdy [55], bialgebras by McCurdy [56] and Bonchi et al. [57]. Formally, they are both denoted by tuples (X, δ, ι,µ, ζ ) where, for X an object of C, the triple (X, δ, ι) is an internal comonoid and the triple (X,µ, ζ ) an internal monoid; i.e., the following are coassociative and counital, respectively associative and unital morphisms of C:

$$\delta\_X \colon X \to X \otimes X \quad \iota\_X \colon X \to I \quad \mu\_X \colon X \otimes X \to X \quad \zeta\_X \colon I \to X$$

One difference between these two is that the Frobenius algebra satisfies the following so-called Frobenius condition (due to [58] who originally introduced the algebraic form of these in the context of representation theorems for group theory):

$$(\mu\_X \otimes 1\_X) \circ (1\_X \otimes \delta\_X) = \delta\_X \circ \mu\_X = (1\_X \otimes \mu\_X) \circ (\delta\_X \otimes 1\_X)$$

The bialgebras satisfy a weaker version of this condition, referred to by Q3 in McCurdy [56]

$$\delta\_X \circ \mu\_X = (\mu\_X \otimes \mu\_X) \circ (1\_X \otimes \sigma\_{X,X} \otimes 1\_X) \circ (\delta\_X \otimes \delta\_X)$$

for σX,<sup>X</sup> the symmetry morphism of the category C. Both these algebras do satisfy other conditions, which we will not give here.

In **FVect**, any vector space V with a fixed basis { −→v<sup>i</sup> }<sup>i</sup> has a Frobenius algebra over it, explicitly given by:

$$\begin{aligned} \delta\_V &: \overrightarrow{\nu\_i} \mapsto \overrightarrow{\nu\_i} \otimes \overrightarrow{\nu\_i} \\ \mu\_V &: \overrightarrow{\nu\_i} \otimes \overrightarrow{\nu\_j} \mapsto \delta\_{ij} \overrightarrow{\nu\_i} \end{aligned} \qquad \begin{aligned} \iota\_V &: \overrightarrow{\nu\_i} \mapsto 1 \\ \xi\_V &: 1 \mapsto \sum\_i \overrightarrow{\nu\_i} \end{aligned}$$

where δij is the Kronecker delta. These definitions were introduced in Coecke et al. [9, 10] to characterize vector space bases.

Bialgebras over vectors spaces were introduced in Coecke and Duncan [11] to characterize phases. For linguistic purposes, however, we use a different definition, first introduced in Hedges and Sadrzadeh [38]. Let V be a vector space with basis P(U), where U is an arbitrary set. We give V a bialgebra structure as follows:

$$\begin{aligned} \mu\_{\mathcal{P}(U)}|A &= 1 & \delta\_{\mathcal{P}(U)}|A &= |A \otimes |A| \\ \xi\_{\mathcal{P}(U)} &= |U & \mu\_{\mathcal{P}(U)}(|A \otimes |B| &= |A \cap B|) \end{aligned}$$

The Frobenius and the bialgebra δ act similarly here: they both copy their input, that is given a vector −→υ , the produce two copies of it −→<sup>υ</sup> <sup>⊗</sup> −→υ . The slight difference in this special natural language instantiation is that the inputs to the bialgebra δ's are vectors whose basis are subsets of a universal set U, whereas the inputs to Frobenius algebra δ's can be any vector. The main difference between these two algebras are in their µ maps. The Frobenius µ, when inputted with two same vectors, returns one of them, the bialgebra µ acts on any two input vectors, but of course these both have to have as basis subsets of P(U), and returns the "intersection" of these two vectors. By "intersection of vectors" we mean (as defined above), a vectors whose basis is the intersection of the basis of the input vectors.

The reason for working with the above bialgebras is that they are there to model generalized quantifiers of Barwise and Cooper [59]. These quantifiers are defined as maps with the type <sup>P</sup>(U) <sup>→</sup> PP(U). In order to see why, consider the following definition for the logical quantifiers "all" and "some":

$$\begin{aligned} \text{([some]) (A) = \{X \subseteq U \mid X \cap A \neq \emptyset\}}\\ \text{([every]) (A) = \{X \subseteq U \mid A \subseteq X\}} \end{aligned}$$

A similar method is used to define non-logical quantifiers, for example "most A" is defined to be the set of subsets of U that has "most" elements of A, "few A" is the set of subsets of U that contain "few" elements of A, and similarly for "several" and "many." These functions can be formalized as relations over <sup>P</sup>(U), where they will thus obtain the type <sup>P</sup>(U) <sup>→</sup> <sup>P</sup>(U). These relations can be formalized in the category of sets and relations, which is also compact closed. The above definitions are vector space generalizations of the bialgebras defined for relations. They enable us to work with intersection of these relations. This is an operation that allows Barwise and Cooper to formalize an important property of generalized quantifiers of natural language, i.e., that they are conservative. For details and the from-relation-to-vector embedding, see Hedges and Sadrzadeh [38].

#### 3.5. Relative Pronouns and Quantifiers

In order to model relative pronouns and quantifiers, according to the developments of Clark et al. [35], Sadrzadeh et al. [36, 37], and Hedges and Sadrzadeh [38], one adds to the pregroup lexicon, the following assignments:

To subject relative pronouns "who, that, which," we assign type n <sup>r</sup>nsln

To object relative pronouns "whom, that, which," we assign type n <sup>r</sup>nnlls l

To determiners of any role "a, the, all, every, some, none, at most, many, · · · ," we assign type nn<sup>l</sup>

The vectorial representations of the subject and object relative pronouns are as follows, respectively for each case:

$$\begin{aligned} \overrightarrow{\text{Sbj Rel}} &:= (1\_W \otimes \mu\_W \otimes \zeta\_S \otimes 1\_W) \circ (\eta\_W \otimes \eta\_W) \\ \overrightarrow{\text{Obj Rel}} &:= (1\_W \otimes \mu\_W \otimes 1\_W \otimes \xi\_S) \circ (\eta\_W \otimes \eta\_W) \end{aligned}$$

where the µ<sup>W</sup> and ζ<sup>S</sup> are maps of the Frobenius algebras defined over the W and S spaces. The vectorial representation of the determiners are as follows:

$$\begin{aligned} \widetilde{\text{determine}} &:= \{ \mathbf{1}\_W \otimes \epsilon\_W \} \circ \{ \mathbf{1}\_W \otimes \mu\_W \otimes \epsilon\_W \otimes \mathbf{1}\_W \} \\ &\circ \mathbf{(1}\_W \otimes \overline{\boxed{d}} \otimes \delta\_W \otimes \mathbf{1}\_{W \otimes W} \} \\ &\circ \{ \mathbf{1}\_W \otimes \eta\_W \otimes \mathbf{1}\_{W \otimes W} \} \circ \{ \eta\_W \otimes \mathbf{1}\_W \} \end{aligned}$$

where the µ<sup>W</sup> and δ<sup>W</sup> maps are bialgebraic maps defined over the space W = VP(U) , which is notation for a vector space spanned by the subsets of the set U. The -d map has type W → W, it is a linear map that directly encodes the relational graph of the generalized quantifier d.

By applying definition (∗) from the introduction, one obtains vectorial representations for relative clauses and quantified sentences. An example of the former is "men who eat cake," which acquires the following vectorial representation:

$$\begin{array}{l}\mathrel{\mathop{\text{men}\,\text{wo}}}\text{\huge{\text{eat}}\,\text{cak}}\mathrel{\mathop{\text{eat}}}\text{\huge{\text{eat}}}\mathrel{\mathop{\text{eat}}}\text{\huge{\text{eat}}}\text{\huge{\text{e}}}\mathrel{\mathop{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{\text{e}}}\text{\hfil{}}\text{\hfil{\text{e}}}\text{\hfil{}}\text{\hfil{\text{e}}}\text{\hfil{$$

This simplifies as follows, after opening up the meaning of "who" using the "Obj Rel" definition above:

$$\begin{split} (\mu\_{W} \otimes \epsilon\_{W}) \left( \overrightarrow{\text{m}} \overleftrightarrow{\text{e}} \otimes \overrightarrow{\text{e}} \overleftrightarrow{\text{e}} \overleftrightarrow{\text{e}} \overleftrightarrow{\text{e}} \right) \\ &= (\mu\_{W} \otimes \epsilon\_{W}) \left( \sum\_{k \in K} \overrightarrow{\text{w}}\_{k} \otimes (\sum\_{ij} \alpha\_{ij} \overrightarrow{\text{w}}\_{i} \otimes \overrightarrow{\text{w}}\_{j}) \otimes \sum\_{l \in L} \overrightarrow{\text{w}}\_{l} \right) \\ &= \sum\_{ij, k \in K, l \in L} \alpha\_{ij} \mu\_{W} (\overrightarrow{\text{w}}\_{k} \otimes \overrightarrow{\text{w}}\_{i}) \otimes \epsilon\_{W} (\overrightarrow{\text{w}}\_{j} \otimes \overrightarrow{\text{w}}\_{l}) \\ &= \sum\_{ij, k \in K, l \in L} \alpha\_{ij} \delta\_{ki} \overrightarrow{\text{w}}\_{i} \delta\_{jl} = \sum\_{k \in K, l \in L} \alpha\_{kl} \overrightarrow{\text{w}}\_{k} \end{split}$$

An example of a quantified sentence is "most cats snore," which acquires the following vectorial representation,

$$(\epsilon\_W \otimes \mathbf{l}\_S) \circ (\mathbf{l}\_W \otimes \epsilon\_W \otimes \mathbf{l}\_{W \otimes S}) (\overrightarrow{\text{möst}} \otimes \overrightarrow{\text{cáts}} \otimes \overrightarrow{\text{snior's}})$$

The above simplifies to the following

$$(\epsilon\_W) \circ (\overrightarrow{\|\text{most}\|} \otimes \mu\_W \otimes 1\_S) \circ (\epsilon\_W \otimes 1\_S) (\overrightarrow{\text{méh}} \otimes \overrightarrow{\text{snieré}})$$

Different instantiations for U and S are provided in Hedges and Sadrzadeh [38], as an example consider U to be the set of words of language, in which case P(U) represents the set of, what is called "lemmas" of language, i.e., the set of canonical forms of words. One takes S = VP(<sup>S</sup> ′ ) ,for S ′ an abstract sentence space, denoted by the symbol S in our previous examples. In this case, the above categorical definition takes a concrete instantiation as follows:

$$\sum\_{ijk} \sum\_{B \in \{\text{most\\_}\} \cup \{\text{cat}\} \cup} c\_i^{cat} c\_{jk}^{snore} c\_B^{most} \langle B \mid A\_i \cap A\_j \rangle \vert s\_k \rangle$$

where we have −→cats: <sup>=</sup> P i c cat i |Ai for A<sup>i</sup> ⊆ U and P −−−→snore: <sup>=</sup> jk c snore jk |A<sup>j</sup> ⊗ A<sup>k</sup> , for A<sup>j</sup> ⊆ U and |A<sup>k</sup> a basis vector of S.

#### 3.6. Diagrams

The compact closed categorical setting of Abramsky and Coecke comes equipped with a diagrammatic calculus, originally developed in Joyal and Street [60] and referred to by string diagrams. This calculus allows one to draw diagrams that depict the protocols of Quantum mechanics and simplify the computations thereof. For example see **Figure 2** for the diagram for teleportation:

These diagrams depict the flow of information between the parties involved in the protocol and also simplify the tensor contraction computations. In the setting of language, every language construct can be seen as a protocol, with words as the involved parties. The same diagrammatic calculus has been widely used to show how information flows amongst the words of a phrase or sentence and to depict the meaning of the language unit resulting from it. In the interest of space, we will not introduce this diagrammatic calculus here, but provide examples, via the following **Figures 3**–**5**.

# 4. OVERVIEW OF EXPERIMENTS

Our first set of experiments was on two datasets, both consisting of pairs of transitive sentences with ambiguous verbs and their two eminent meanings. One of the datasets had adjectives modifying the subjects and objects, the other contained bare subjects and objects. The goal was to disambiguate verbs, based on the sentences in which they occurred. A non-compositional distributional approach to this task would be to build vector representations for verbs and their different meanings (in this case the two most eminent ones); then measure the cosine of the angle between the vector of the verb and those of the meanings and use this as a measure of disambiguation. In other words, if the vector of the verb was closer to one of the meaning vectors, that meaning would be chosen as the right meaning for the verb. This non-compositional method, however, does not take into account the specific sentence in which the verb has occurred. In our compositional version, we build a vector representation for each sentence of the dataset; specifically, we build a vector for the sentence with the verb in it, and two for the two sentences where the verb is replaced with one of its two eminent meanings. Then we compare the distances between these sentence vectors. The sentence vectors were built using different composition operators, and the non-compositional verb vector was taken as

FIGURE 3 | Diagram of information flow in the negative transitive sentence, cited from Preller and Sadrzadeh [16].

TABLE 2 | Example entries from the transitive dataset, cited from Grefenstette and Sadrzadeh [25].


a comparison base line. The results show that one of the tensor composition methods, namely our Kronecker model, performed best. This was the first time 3 and 4 word sentences were used to disambiguate a single word. A precursor to this experiment, was that of Mitchell and Lapata [23], where ambiguous verbs were disambiguated using 2-word "Sbj Verb" or "Verb Obj" phrases.

A snapshot of two of these datasets are presented in **Tables 2**, **3**. The first two entries are the two sentences in question and the last entry is a tag we gave to the sentences based on how similar the meanings of the sentences in the pair were. As you can see, the first dataset consists of "Sbj Verb Obj" sentences, the second dataset consists of sentences of the form "Adj Sbj Verb Adj Obj" where the subject and object are moreover modified by adjectives:

We asked human annotators (on Amazon Turk) to assign a degree of similarity to each pair of the dataset, using a number from 1 to 7, ranking the degree of similarity of the sentences therein. If the sentence "Sbj Verb1 Obj" was ranked TABLE 3 | Example entries from the adjective-transitive dataset, cited from Grefenstette and Sadrzadeh [25].


to have an average high similarity with the sentence "Sbj Verb2 Obj," then we concluded that "Verb1" had the same meaning as "Verb2," thus disambiguating it. We implemented different models to compute vectors for sentences and used the cosine of the angle between them as a measure of similarity. The results are presented in **Table 4**. In the "Sbj Verb Obj" dataset, the vectors built via the **Kronecker** model achieve the highest correlation with the annotators' judgments. This model is one of the DisCoCat models, a model that has consistently performed very well. In the "Adj Sbj Verb Adj Sbj" dataset, the model referred to by **Categorical Adj** has consistently performed the best. This model builds a matrix for the adjective and matrix multiplies it with the vector of the noun to obtain a vector for the adjective noun phrase. The exact results denote the degree of correlation (computed by using Spearman's ρ) between the human judgments and the judgments predicted by the models. These seem quite low, but so is the inter annotators agreement, presented in the last line of each table. This agreement is an upper bound for the experiment, denoting the degree to which the human annotators agreed amongst themselves about their similarity judgments. Having this upper bound in mind, we see that the "Adj Sbj Verb Adj Sbj" dataset performed better than the "Sbj Verb Obj" dataset (since it had larger compositional contexts), as it aligns to human judgment in about 60% of the time.

A criticism to this first set of experiments was that they relied on human judgments and that these were not done according to clear guidelines. One argument against lack of such a judgment was that human annotators were asked to judge the degree of similarity between sentences and that is a hard task. It was argued that similarity has different interpretations in different contexts and annotators might have had different interpretations (different to ours) in mind when judging the dataset. In a second task, we avoided this weakness by working on term-definition pairs mined from a junior dictionary. The terms were words and TABLE 4 | Model correlation coefficients with human judgments, cited from Grefenstette and Sadrzadeh [25].


TABLE 5 | Sample of the dataset for the term/definition comparison task, cited from Kartsaklis et al. [26].


TABLE 6 | Accuracy results for the term/definition comparison task, Kartsaklis et al. [26].


*The best performing models are highlighted in boldface.*

TABLE 7 | Examples from a Term-Description dataset, cited from Sadrzadeh et al. [37].


*The best performing models are highlighted in boldface.*

the definitions were short descriptions given by the dictionary as the meaning of the word. The goal was to distinguish which definition was describing which word. We built word vectors for the terms and phrase vectors for the definitions and used the cosine of the angle between these vector as a classifier. We collected five definitions whose vectors were closest to the vector of the verb and then verified whether the correct dictionary definition was amongst these five. The model that classified more terms to their correct definitions was considered to be the better model.

A snapshot of the dataset is presented in **Table 5**. The accuracy results are presented in **Table 6**, where the DisCoCat **CopyObj** model achieves the highest accuracy for the terms that are nouns, and the second best for terms that are verbs (28% vs. the accuracy of 30% reached by multiplying the word vectors).

Based on this experiment, we formed a toy experiment and did a preliminary evaluation of the application of Frobenius algebras to modeling relative clauses. Similar to the above experiment, we mined term-description pairs from a dictionary, but this time the terms were chosen such that their descriptions had a relative pronoun in them. We then proceeded as before: built vectors for the term and for the description. A snapshot of the dataset is presented in **Table 7**.

The latter used three different composition operators: simple addition and point wise multiplication, i.e., we just added and point wise multiplied the word vectors within the descriptions to obtain a vector representation for the whole relative clause. We also built vector representations using the Frobenius model presented above and an extension of it to possessive relative pronoun "whose." In the first two models, we had a choice of either building a vector for the relative pronoun or dropping it and thus treating it as noise. We presented results for both

Sadrzadeh CQM and Natural Language

TABLE 8 | Results for the Term-Description dataset, cited from Sadrzadeh et al. [37].


*The best performing models are highlighted in boldface.*

TABLE 9 | Models correlation with human judgments for the disambiguation task with normal and neural (NWE) vectors, cited from Milajevs et al. [32].


*The best performing models are highlighted in boldface.*

of these options. For the possessive Frobenius case, we had the choice of either building a vector for 's, or treating it as the unit vector. Again, we presented the results of both cases. We tested which model achieved a better accuracy (Acc) and a mean reciprocal rank (MRR). The results are presented in the **Table 8**, where both of the Frobenius models achieve the highest accuracy and MRR.

The differences between the two Frobenius models only applies to the possessive relative clauses, which were not reviewed in this article, in the benefit of space. In these models, one has to build a linear map for the " 's " phoneme. In one model we summed all the nouns that were modified by this morpheme, in the other, we simply took it to be the identity map, i.e., the unit map. In either case, the Frobenius model performed better than our other implemented models, e.g., the other two models in which we added the vectors of the words, taking into account the vector of the relative pronoun or ignoring it, and two other similar models where we multiplied them.

As another set of experiments, we used the neural word embeddings of Mikolov et al. [30]. The motivation behind this task was the popularity and success of the word embeddings. Often, when vector representations are built from scratch using count-based methods and on a given corpus, many parameters have to be taken into account (e.g., size of the window, the normalization scheme for the counts, the dimensions of the vector spaces and its size). The preprepared word embeddings provides a platform wherein new vectors need not be built for each task and parameters need not be individually tuned by each experimenter and for each experiment. The word embeddings provide a standard framework (to some extent) for all experimenters to do experiments and compare their results in a more unified manner. It also relieves us from the task of building the vectors ourselves.

We used the neural noun vectors of Mikolov et al. [30] and built adjective matrices and verb tensors and re-experimented with the disambiguation task presented above. The results are presented in **Table 9**. **GS11** and **KS14** denote the "Sbj Verb Obj" and "Adj Sbj Verb Adj Obj" datasets with count-based vectors, and **NWE** denotes both of these datasets together with the word embedding vectors.

Our hope was that one of the tensor-based models would do better here; this would indicate that the tensor models worked better regardless of their underlying vectors, count-baed or neural. This was shown to be indeed the case, as the DisCoCat CopyObj model achieves the highest correlation with human judgments.

### 5. BRIEF SUMMARY AND FUTURE WORK

In this paper, we reviewed the general field of categorical compositional distributional semantics, to which we referred as DisCoCat. This field introduces grammar awareness into vectors models of language, otherwise known as distributional semantics, thus enabling these models to build vectors for phrases and sentences, using the vectors of the words therein and their corresponding grammatical relations. The setting of a DisCoCat is that of a compact closed category, to which later Frobenius and Bi algebras were added to reason about relative pronouns and quantifiers. Compact closed categories, Frobenius and Bi algebras are also the building blocks of the categorical approach to Quantum Mechanics, known under the acronym CQM. Another connection to Quantum formalisms, is the structure preserving passage from grammatical structure to vectorial meaning, which is through a functor similar to the Quantization functor of Topological Quantum Field Theory. In this paper, we presented a chronology of the developments of the DisCoCat, briefly went through its theoretical underpinnings and its experimental validations.

What remains to be done is to relate the setting of DisCoCat to the Quantum logical approaches to language, such as the work done by Preller [61], by Widdows [62], and the original seminal work of Van Rijsbergen [63].

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

## FUNDING

Support by EPSRC for Career Acceleration Fellowship EP/J002607/1 and AFOSR International Scientific Collaboration grant FA9550-14-1-0079 is gratefully acknowledged by MS.

## REFERENCES


on Empirical Methods in Natural Language Processing (EMNLP), Stroudsburg, PA: Association for Computational Linguistics (2011). p. 1394–404.


Computational Linguistics (2004). doi: 10.3115/1220355.1220501


**Conflict of Interest Statement:** The author declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Sadrzadeh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Inconclusive Quantum Measurements and Decisions under Uncertainty

Vyacheslav I. Yukalov 1, 2 \* and Didier Sornette1, 3

<sup>1</sup> Department of Management, Technology and Economics, ETH Zürich, Zürich, Switzerland, <sup>2</sup> Bogolubov Laboratory of Theoretical Physics, Joint Institute for Nuclear Research, Dubna, Russia, <sup>3</sup> Swiss Finance Institute, University of Geneva, Geneva, Switzerland

We give a mathematical definition for the notion of inconclusive quantum measurements. In physics, such measurements occur at intermediate stages of a complex measurement procedure, with the final measurement result being operationally testable. Since the mathematical structure of Quantum Decision Theory (QDT) has been developed in analogy with the theory of quantum measurements, the inconclusive quantum measurements correspond, in QDT, to intermediate stages of decision making in the process of taking decisions under uncertainty. The general form of the quantum probability for a composite event is the sum of a utility factor, describing a rational evaluation of the considered prospect, and of an attraction factor, characterizing irrational, subconscious attitudes of the decision maker. Despite the involved irrationality, the probability of prospects can be evaluated. This is equivalent to the possibility of calculating quantum probabilities without specifying hidden variables. We formulate a general way of evaluation, based on the use of non-informative priors. As an example, we suggest the explanation of the decoy effect. Our quantitative predictions are in very good agreement with experimental data.

#### Edited by:

Emmanuel E. Haven, University of Leicester, UK

#### Reviewed by:

Jan Sladkowski, The University of Silesia, Poland Salvatore Micciche', Universitá Degli Studi di Palermo, Italy

#### \*Correspondence:

Vyacheslav I. Yukalov yukalov@theor.jinr.ru

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 25 January 2016 Accepted: 24 March 2016 Published: 14 April 2016

#### Citation:

Yukalov VI and Sornette D (2016) Inconclusive Quantum Measurements and Decisions under Uncertainty. Front. Phys. 4:12. doi: 10.3389/fphy.2016.00012 Keywords: quantum measurements, decision theory, inconclusive events, quantum probability, non-informative priors, decoy effect

# 1. INTRODUCTION

The standard theory of quantum measurements [1] is based on the projection operator measure corresponding to operationally testable events. Simple measurements really have to be operationally testable in order to possess physical meaning. However, if a measurement is composite, consisting of several parts, the intermediate stages do not have to necessarily be operationally testable, but can be inconclusive.

As a typical example, we can recall the known double-slit experiment, when particles pass through a screen with two slits and then are registered by particle detectors some distance away from the screen. This experiment can be treated as a composite event consisting of two parts, one being the passage through one of the slits and second, registration by detectors. The registration of a particle by a detector is an operationally testable event, since the particle is either detected or not, with the result being evident for the observer. But the passage of the particle through one of the slits is not directly observed, and the experimentalist does not know which of the slits the particle has passed through. In that sense, the passage of the particle through a slit is an inconclusive event. The existence of this inconclusive event, occurring at the intermediate stage of the experiment,

**154**

is intimately associated with an interference effect. Otherwise, if the experimentalist would precisely determine the slit through which the particle has passed, the interference pattern registered by the particle detectors would be destroyed. The existence of interference is precisely due to the presence of the inconclusive event that happened at the intermediate stage.

The occurrence of inconclusive events in decision making is even more frequent and important. Practically any decision, before it is explicitly formulated, passes through a stage of deliberation and hesitation accompanying the choice. That is, any decision is actually a composite event consisting of an intermediate stage of deliberation and of the final stage of taking a decision. The final stage of decision making is equivalent to an operationally testable event in quantum measurements. While the intermediate stage of deliberation is analogous to an inconclusive event.

The analogy between the theory of quantum measurements and decision theory has been mentioned by von Neumann [1]. Following this analogy, Quantum Decision Theory (QDT) has been advanced [2–7], with the mathematical structure that is applicable to both decision making as well as to quantum measurements. The generality of our framework, being equally suitable for quantum measurements and decision making, is its principal difference from all other attempts that employ quantum techniques in psychological sciences. An extensive literature on various quantum models in psychology and cognitive science can be found in the books [8–11] and review articles [12–15].

Any approach, applying quantum techniques to decision theory, is naturally based on the notion of probability. This is because quantum theory is intrinsically probabilistic. Respectively, the intrinsically probabilistic nature of QDT is what makes it principally different from stochastic decision theory, where the choice is treated as always being deterministic, while in the process of choosing the decision maker acts with errors [16–20]. Such stochastic decision theories can be termed as"deterministic theories embedded into an environment with stochastic noise." The standard way of using a stochastic approach is to assume a probability distribution over the values characterizing the errors made by the subjects in the process of decision making. Then the parameters entering the distribution are fitted to a posteriori empirical data by maximizing the loglikelihood function. Such a procedure allows one to better fit the given set of data to the assumed basic deterministic decision theory, in particular due to the introduction of additional fitting parameters. However, it does not essentially change the structure of the underlying deterministic theory, although improving it slightly. And, being descriptive, the classical stochastic approach does not provide quantitative predictions.

Contrary to classical stochastic theory, in the quantum approach, we do not assume that the choice of a decision maker is deterministic, with just some weak disturbance by errors. Following the general quantum interpretation, we consider the choice process, including deliberations, hesitations, and subconscious estimation of competing outcomes, as intrinsically random. The probabilistic decision, in the quantum case, is not just a stochastic decoration of a deterministic process, but it is an unavoidable random part of any choice. The existence of the hidden, often irrational subconscious feelings and deliberations, results in the appearance of quantum interference and entanglement. The difference between classical stochastic decision theory and QDT is similar to the difference between classical statistical physics and quantum theory. In the former, all processes are assumed to be deterministic, with statistics coming into play because of measurement uncertainties, such as no precise knowledge of initial conditions and the impossibility of measuring exactly the locations and velocities of all particles. In contrast, quantum theory is principally probabilistic, which becomes especially important for composite measurements.

A detailed mathematical theory of quantum measurements in the case of composite events has been developed in our previous papers [21–23]. In the present paper, we concentrate our attention on composite measurements including intermediate inconclusive events and on the application of this notion for characterizing decision making under risk and uncertainty. The importance of composite events, including intermediate inconclusive events, in decision theory makes it necessary to pay a special attention to the correct mathematical formulation of such events and to the description of their properties allowing for the quantitative evaluation of the corresponding quantum probabilities. We show that, despite uncertainty accompanying inconclusive events, it is possible to give quantitative evaluations for quantum probabilities in decision theory, based on noninformative priors. Considering, as an illustration, the decoy effect, we demonstrate that even the simple non-informative priors provide predictions in very good agreement with experimental data.

# 2. COMPOSITE QUANTUM MEASUREMENTS AND EVENTS

In this section, we give a brief summary of the general scheme for defining quantum probabilities for composite events. As we have stressed above, in our approach, the mathematics is the same for describing either quantum measurements or decision making. Therefore, when referring to an event, we can keep in mind either a fact of measurement or a decision action.

Let A<sup>n</sup> be a conclusive operationally testable event labeled by an index n. And let B = {Bα} be a set of inconclusive events labeled by α. Defining the space of events as a Hilbert space H, we associate with an event A<sup>n</sup> a state |ni in this Hilbert space and an event operator Pˆ n,

$$A\_n \to |n\rangle \to \hat{P}\_n = |n\rangle\langle n|\,. \tag{1}$$

The event operator for an operationally testable event is a projector.

The set of inconclusive events B generates in the Hilbert space <sup>H</sup> the state <sup>|</sup>B<sup>i</sup> and the event operator <sup>P</sup><sup>ˆ</sup> B,

$$|B \to |B\rangle \to \hat{P}\_B = |B\rangle\langle B|\ ,\tag{2}$$

where the state reads

$$|B\rangle = \sum\_{\alpha} b\_{\alpha} |\alpha\rangle \,, \tag{3}$$

with coefficients bα being random complex numbers. The event operator for an inconclusive event is not necessarily a projector, but a member of a positive operator-valued measure [7, 21–23].

The space of events, in the quantum approach, is the Hilbert space

$$\mathcal{H} = \mathcal{H}\_A \bigotimes \mathcal{H}\_B \tag{4}$$

that is a tensor product of the spaces

$$\mathcal{H}\_A = \text{span}\{ |n\rangle \}\,, \qquad \mathcal{H}\_B = \text{span}\{ |\alpha\rangle \}\, .$$

Each decision maker is characterized by an operator ρˆ that can be termed the strategic state of a decision maker, which, in quantum theory, corresponds to a statistical operator. The pair {H, <sup>ρ</sup>ˆ}, in physics, is named a statistical ensemble, and in decision theory, it is a decision ensemble.

A composite event is called a prospect and is denoted as

$$
\pi\_n = A\_n \bigotimes B \,. \tag{5}
$$

A prospect π<sup>n</sup> generates a state |πni in the Hilbert space of events H and a prospect operator Pˆ(πn),

$$|\pi\_n \to |\pi\_n\rangle \to \hat{P}(\pi\_n) = |\pi\_n\rangle\langle\pi\_n|\ ,\tag{6}$$

with the prospect state

$$|\pi\_n\rangle = |n\rangle \bigotimes |B\rangle = \sum\_{\alpha} b\_{\alpha} |n\alpha\rangle \,. \tag{7}$$

The prospect operator is a member of a positive operator-valued measure, which implies that these operators satisfy the resolution of unity [21, 23]. Since they contain random quantities bα, the corresponding random resolution has to be understood not as a direct equality between numbers, but, e.g., as the equality in mean [24].

The prospect probability is

$$\rho(\pi\_n) = \text{Tr } \hat{\rho} \hat{P}(\pi\_n) \; , \tag{8}$$

with the trace over the space H. To form a probability measure, the prospect probabilities are to be normalized:

$$\sum\_{n} p(\pi\_n) = 1 \; , \qquad 0 \le p(\pi\_n) \le 1 \; . \tag{9}$$

Taking explicitly the trace in expression (Equation 8) and separating diagonal and off-diagonal terms, we see that the prospect probability

$$
\rho(\pi\_n) = f(\pi\_n) + q(\pi\_n) \tag{10}
$$

is represented as a sum of a positive-definite term

$$f(\pi\_n) = \sum\_{\alpha} |b\_{\alpha}|^2 \langle n\alpha | \hat{\rho} | n\alpha \rangle \tag{11}$$

and a sign-undefined term

$$q(\pi\_n) = \sum\_{\alpha \neq \beta} b\_{\alpha}^\* b\_{\alpha} \langle n\alpha | \hat{\rho} | n\beta \rangle \,. \tag{12}$$

The appearance of the sign-undefined term is a purely quantum effect responsible, in quantum measurements, for interference patterns. The attenuation of this quantum term is called decoherence. In quantum theory, decoherence can be due to external as well as to internal perturbations and the influence of measurements [25–27]. And in QDT, decoherence can occur due to the accumulation of information [28].

The disappearance of the quantum term implies the transition to classical theory. This is formulated as the quantum-classical correspondence principle [29], which in our case reads as

$$p(\pi\_n) \to f(\pi\_n) \,, \qquad q(\pi\_n) \to 0 \,\, . \tag{13}$$

This principle tells us that the term f(πn) plays the role of classical probability, hence is to be normalized:

$$\sum\_{n} f(\pi\_n) = 1 \; , \qquad 0 \le f(\pi\_n) \le 1 \; . \tag{14}$$

When decisions concern a choice between lotteries, the classical term f(πn) has to be defined according to classical decision theory based either on expected utility theory or on some non-expected value functional. This suggests to call f(πn) a utility factor, since it is defined on rational grounds and reflects the utility of a choice. The quantum term is caused by the interference and entanglement effects in quantum theory, which correspond, in decision making, to irrational effects describing the attractiveness of choice. Therefore, it can be called the attraction factor. From Equations (9) and (14), it follows the alternation law

$$\sum\_{n} q(\pi\_n) = 0 \quad , \qquad -1 \le q(\pi\_n) \le 1 \,. \tag{15}$$

Note that, in quantum theory, the definition of the composite event in the form of prospect (Equation 5) is valid for any type of operators, since they are defined on different spaces. No problem with non-commutativity of operators defined on a common Hilbert space arises. This way makes it possible to introduce joint quantum probabilities for several measurements [21, 23]. Contrary to this, considering operators on the same Hilbert space does not allow one to define joint probabilities for non-commuting operators. Sometimes, one treats the Lüders probability of consecutive measurements as a conditional probability. This, however, is not justified from the theoretical point of view [21, 23, 30] and also contradicts experimental data [31, 32]. But defining the quantum joint probability according to expression (Equation 8) contains no contradictions.

In the present section, the general scheme of QDT is presented. Being limited by the length of this paper, we cannot go into all mathematical details that have been thoroughly described in our previous publications. However, we would like to stress that for the purpose of practical applications, it is not necessary to study all these mathematical peculiarities, but it is sufficient to employ the final formulas following the prescribed rules. One can use the formulated rules as given prescriptions, without studying their justification. The main formulas of this section, which are necessary for the following application, are: the form of the quantum probability (Equation 10), the normalization conditions (Equations 9 and 14), and the alternation law (Equation 15). More details required for practical application will be given in the sections below.

#### 3. NON-INFORMATIVE PRIOR FOR UTILITY FACTORS

To make the above scheme applicable to decision theory, it is necessary to specify how one should quantify the values of the utility factors and attraction factors. Here we show how these values can be defined as non-informative priors.

Let us consider a set of lotteries <sup>L</sup><sup>n</sup> = {xi, <sup>p</sup>n(xi) : <sup>i</sup> <sup>=</sup> 1, 2, . . . , Nn}, enumerated by the index n = 1, 2, . . . , NL, with payoffs x<sup>i</sup> and their probabilities pn(xi). The related expected utilities U(Ln) = P i u(xi)pn(xi) can be defined according to the expected utility theory [33]. For the present consideration, the utility functions u(x) do not need to be specified. For instance, they can be taken as linear functions, since this choice has the advantage of making the utility factors independent from the units measuring the payoffs.

In QDT, the act of choosing a lottery Ln, denoted as An, together with the accompanying set of inconclusive events B, including the decision-maker hesitations [6, 30], compose the prospect (Equation 5). Depending on whether the expected utilities are positive on negative, there can be two cases.

If the expected utilities of the considered set of lotteries are all positive (non-negative), such that

$$U(L\_n) \ge 0 \qquad \text{ ( $n = 1, 2, \dots, N\_L$ ) },\tag{16}$$

then it is reasonable to require that zero utility corresponds to zero utility factor:

$$f(\pi\_n) \to 0 \,, \qquad U(L\_n) \to 0 \,\, . \tag{17}$$

The case where the utility factor is simply proportional to the related expected utility trivially obeys this condition (Equation 17). Taking into account the normalization condition (Equation 14) gives the utility factor

$$f(\pi\_n) = \frac{U(L\_n)}{\sum\_n U(L\_n)}\,. \tag{18}$$

When the expected utilities are negative, which happens in the domain of losses, such that

$$U(L\_n) < 0 \qquad \text{ (}n = 1, 2, \dots, N\text{) },\tag{19}$$

the required condition is that an infinite loss corresponds to zero utility factor:

$$f(\pi\_n) \to 0 \,, \qquad |U(L\_n)| \to \infty \,. \tag{20}$$

The simplest way to satisfy this condition (Equation 20) is that the utility factor is inversely proportional to the related expected utility. Taking into account the normalization condition, we get

$$f(\pi\_n) = \frac{|U(L\_n)|^{-1}}{\sum\_n |U(L\_n)|^{-1}}\,. \tag{21}$$

The utility-factor forms (Equations 18 and 21) coincide with the choice probabilities in the Luce choice axiom [34]. It is possible to show that generalized forms for the utility factors can be derived by maximizing a conditional Shannon entropy or from the principle of minimal information [12, 35, 36].

In the case of positive expected utilities, we consider the information functional, taking into account the normalization condition (Equation 14) and the expected log-likelihood 3. This functional reads as

$$I[f(\pi\_n)\ ] = \sum\_{n} f(\pi\_n) \ln f(\pi\_n) \ + \ \lambda \left[ \sum\_{n} f(\pi\_n) - 1 \right]$$

$$\quad + \alpha \left[ \sum\_{n} f(\pi\_n) \Lambda(\pi\_n) - \Lambda \right], \tag{22}$$

where

$$\Lambda(\pi\_n) = -\ln U(L\_n) \quad , \qquad U(L\_n) \ge 0 \; . $$

Minimizing functional (Equation 22) results in the utility factor

$$f(\pi\_n) = \frac{U^\alpha(L\_n)}{\sum\_n U^\alpha(L\_n)} \qquad (\alpha > 0) \; , \tag{23}$$

in which the positive sign of α is prescribed by the condition that the larger utility implies the larger factor.

In the case of negative expected utilities, the information functional takes the form

$$I[f(\pi\_n)\ ] = \sum\_{n} f(\pi\_n) \ln f(\pi\_n) \ + \ \lambda \left[ \sum\_{n} f(\pi\_n) - 1 \right] \\
$$

$$+ \ \chi \left[ \Lambda - \sum\_{n} f(\pi\_n) \Lambda(\pi\_n) \right], \tag{24}$$

where

$$\Lambda(\pi\_n) = -\ln |U(L\_n)| \ , \qquad U(L\_n) < 0 \ . \ .$$

Then its minimization yields the utility factor

$$f(\pi\_n) = \frac{|U(L\_n)|^{-\gamma}}{\sum\_n |U(L\_n)|^{-\gamma}} \qquad (\gamma > 0) \,, \tag{25}$$

with the positive sign of γ prescribed by the requirement that the larger cost implies the smaller factor.

The utility factors (Equations 23 and 25) are the examples of power-law distributions that are known in many applications [35–37].

#### 4. NON-INFORMATIVE PRIOR FOR ATTRACTION FACTORS

Although the attraction factor characterizes irrational features of decision making, it can be estimated by invoking noninformative prior assumptions. An important consequence of the latter is the quarter law derived earlier [4, 5, 12]. Here we first give the new, probably the simplest, derivation of the quarter law and, second, we show how this law can be used for estimating the attraction factors in the case of an arbitrary number of prospects.

Let us consider the sum

$$\frac{1}{N\_L} \sum\_{n=1}^{N\_L} |q(\pi\_n)| = \int\_0^1 \varphi(\mathbf{x}) \mathbf{x} \, d\mathbf{x} \tag{26}$$

of the attraction factor moduli, where

$$\varphi(\mathbf{x}) \equiv \frac{1}{N\_L} \sum\_{n=1}^{N\_L} [\; \delta(\mathbf{x} - q(\pi\_n)) + \delta(\mathbf{x} + q(\pi\_n)) \; ] \tag{27}$$

plays the role of the attraction-factor distribution. The latter is normalized as

$$\int\_{-1}^{1} \varphi(\mathbf{x}) \, d\mathbf{x} = 1 \; , \tag{28}$$

since the attraction factors, in view of condition (Equation 15), vary in the interval [−1, 1]. If q(πn) does not equal zero, then normalization (Equation 28) is evident. And when q(πn) = 0, then one should use the identity

$$\int\_0^1 \delta(x) \, dx = \frac{1}{2}$$

for the semi-integral of the Dirac function.

The use of a non-informative prior implies that the values of the attraction factor are not known. A full ignorance is captured by a uniform distribution, which, according to normalization (Equation 28), gives

$$
\varphi(\mathbf{x}) = \frac{1}{2} \; . \tag{29}
$$

In that case, integral (Equation 26) results in the quarter law

$$\frac{1}{N\_L} \sum\_{n=1}^{N\_L} |q(\pi\_n)| = \frac{1}{4} \,\, . \tag{30}$$

If the prospect lattice <sup>L</sup> = {πn} consists of <sup>N</sup><sup>L</sup> prospects, we can always arrange the corresponding attraction factors in the ascending order, such that

$$q(\pi\_n) \succ q(\pi\_{n+1}) \qquad (n = 1, 2, \ldots, N\_L - 1) \;. \tag{31}$$

We denote the largest attraction factor as

$$q\_{\max} = q(\pi\_1) > 0 \; . \tag{32}$$

Given the unknown values of the attraction factors, the noninformative prior assumes that they are uniformly distributed and at the same time they must obey the ordering constraint (Equation 31). Then, the joint cumulative distribution of the attraction factors is given by

$$\begin{split} \Pr[q(\pi\_1) < \eta\_1, q(\pi\_2) < \eta\_2, \dots, q(\pi\_{N\_L}) < \eta\_{N\_L} | \eta\_1 \le \eta\_2 \le \dots \\ \quad \le \eta\_{N\_L} \} = \int\_0^{\eta\_1} d\mathbf{x}\_1 \int\_{\mathbf{x}\_1}^{\eta\_2} d\mathbf{x}\_2 \dots \int\_{\mathbf{x}\_{N\_L - 1}}^{\eta\_{N\_L}} d\mathbf{x}\_{N\_L} \,, \end{split} \tag{33}$$

where the series η<sup>1</sup> ≤ η<sup>2</sup> ≤ ... ≤ ηN<sup>L</sup> of inequalities ensure the ordering. It is then straightforward to show that the average values of the q(πn) are equidistant, i.e., the difference between any two neighboring factors is on average

$$\Delta \equiv \langle q(\pi\_n) \rangle - \langle q(\pi\_{n+1}) \rangle = const \qquad \text{(independent of } n\text{)}\text{. (34)}$$

Taking their average values as determining their typical values, we omit the symbol h.i representing the average operator and use Equation (34) to represent the n-th attraction factor as

$$q(\pi\_n) = q\_{\max} - (n-1)\Delta \,. \tag{35}$$

With notations (Equations 32 and 34), the alternation condition (Equation 15) yields

$$q\_{\max} = \frac{N\_L - 1}{2} \text{ } \Delta \text{ }. \tag{36}$$

And the quarter law (Equation 30) leads to the gap

$$
\Delta = \frac{N\_L}{2\sum\_n |N\_L + 1 - 2n|}\tag{37}
$$

If N<sup>L</sup> is even, then

$$\sum\_{n=1}^{N\_L} |N\_L + 1 - 2n| = \frac{N\_L^2}{2} \qquad \text{( $N\_L$  even)}\text{ ,}$$

while when N<sup>L</sup> is odd, then

$$\sum\_{n=1}^{N\_L} |N\_L + 1 - 2n| = \frac{N\_L^2 - 1}{2} \qquad \text{(N}\_L \text{ odd)}\dots$$

This allows us to represent gap (Equation 37) as

$$\Delta = \begin{cases} \frac{1}{N\_L} & \text{(N\_L \text{ }e\nu en)} \\\\ \frac{N\_L}{N\_L^2 - 1} & \text{(N\_L \text{ }odd)} \end{cases} \tag{38}$$

And for the largest attraction factor, we find

$$q\_{\max} = \begin{cases} \frac{N\_L - 1}{2N\_L} & \text{(N\_L \ even n)}\\\\ \frac{N\_L}{2(N\_L + 1)} & \text{(N\_L \ odd d)} \end{cases} \tag{39}$$

Frontiers in Physics | www.frontiersin.org

The above expressions make it possible to evaluate, on the basis of the non-informative prior, the whole set

$$Q\_{\mathcal{N}\_{\mathcal{L}}} = \{q(\pi\_n) \colon n = 1, 2, \dots, N\_{\mathcal{L}}\} \tag{40}$$

.

of the attraction factors:

$$q(\pi\_n) = \begin{cases} \frac{1}{2N\_L} \text{ (N}\_L - 2n + 1) & \text{(N}\_L \text{ } \text{even} \text{)} \\\\ \frac{N\_L}{2(N\_L^2 - 1)} \text{ (N}\_L - 2n + 1) & \text{(N}\_L \text{ } \text{odd)} \text{ .} \end{cases} \tag{41}$$

For example, in the case of two prospects, we have

$$
\Delta = \frac{1}{2}\ , \qquad q\_{\text{max}} = \frac{1}{4} \qquad \text{(N}\_L = 2\ )\ ,
$$

which yields the attraction set

$$Q\_2 = \left\{\frac{1}{4}, \, - \, \frac{1}{4}\right\}$$

For three prospects, we get

$$
\Delta = \frac{3}{8} \text{ , } \qquad q\_{\text{max}} = \frac{3}{8} \text{ } \qquad \text{(N}\_L = 3\text{) , }
$$

hence

$$Q\_3 = \left\{\frac{3}{8}, \ 0, \ -\ \frac{3}{8}\right\} \ . $$

Similarly, for four prospects, we find

$$
\Delta = \frac{1}{4} \quad , \qquad q\_{\text{max}} = \frac{3}{8} \quad \quad \text{(N}\_L = 4\text{)} \; ,
$$

with the attraction set

$$Q\_4 = \left\{\frac{3}{8}, \,\,\frac{1}{8}\,,\,\, -\,\,\frac{1}{8}\,,\,\, -\,\,\frac{3}{8}\right\}\,\,.$$

When there are five prospects, then

$$
\Delta = \frac{5}{24} \text{ , } \qquad q\_{\text{max}} = \frac{5}{12} \qquad \text{(N}\_L = 5\text{) , }
$$

from where

$$Q\_5 = \left\{\frac{5}{12}, \frac{5}{24}, \ 0, \ -\ \frac{5}{24}, \ -\ \frac{5}{12}\right\}.$$

Thus, we can evaluate the attraction factors for any number of prospects, obtaining a kind of a quantized attraction set. In the case of an asymptotically large number N<sup>L</sup> of prospects, we have

$$
\Delta \simeq \frac{1}{N\_L} \,, \qquad q\_{\text{max}} \simeq \frac{1}{2} \qquad \text{(N}\_L \gg 1\text{)} \,, \tag{42}
$$

and

$$q(\pi\_n) \simeq \frac{1}{2} - \frac{2n - 1}{2N\_L} \,. \tag{43}$$

The non-informative priors can be employed for predicting the results of decision making. This makes the principal difference compared with the introduction into expected utility of adjustment parameters that are fitted post-hoc to the given experimental data [38].

# 5. QUANTITATIVE EXPLANATION OF DECOY EFFECT

We now show how the non-informative priors of the attraction factors can be employed to explain the decoy effect and for quantitative prediction in decision-making. Throughout this section, we denote, for simplicity, the objects of choice, say A, as well as the act of choosing an object A, by the same letter A. As has been emphasized above, the act of choice under uncertainty is a composite prospect. But, again for simplicity, we employ the same letter for denoting the action A and the related prospect (Equation 5).

The decoy effect was first studied by Huber et al. [39], who called it the effect of asymmetrically dominated alternatives. Later this effect has been confirmed in a number of experimental investigations [40–43]. The meaning of the decoy effect can be illustrated by the following example. Suppose a buyer is choosing between two objects, A and B. The object A is of better quality, but of higher price, while the object B is of slightly lower quality, while less expensive. As far as the functional properties of both objects are not drastically different, but B is cheaper, the majority of buyers value the object B higher. At this moment, the salesperson mentions that there is a third object C, which is of about the same quality as A, but of a much higher price than A. This causes the buyer to reconsider the choice between the objects A and B, while the object C, having the same quality as A but being much more expensive, is of no interest. Choosing now between A and B, the majority of buyers prefer the higher quality but more expensive object A. The object C, being not a choice alternative, plays the role of a decoy. Experimental studies confirmed the decoy effect for a variety of objects: cars, microwave ovens, shoes, computers, bicycles, beer, apartments, mouthwash, etc. The same decoy effect also exists in the choice of human mates distinguished by attractiveness and sense of humor [44]. It is common as well for animals, for instance, in the choice of female frogs of males with different attraction calls characterized either by low-frequency and longer duration or by faster call rates [45].

The decoy effect contradicts the regularity axiom in decision making telling that if B is preferred to A in the absence of C, then this preference has to remain in the presence of C.

In the frame of QDT, the decoy effect is explained as follows. Assume buyers consider an object A, which is of higher quality but more expensive, and an object B, which is of moderate quality but cheaper. Suppose the buyers have evaluated these objects A and B, which implies that the initial values of the objects are described by the utility factors f(A) and f(B). In experiments, the latter correspond to the fractions of buyers evaluating higher the related object. When the decoy C, of high quality but essentially more expensive, is presented, it attracts the attention of buyers to the quality characteristic. The role of the decoy is well understood as attracting the attention of buyers to a particular feature of the considered objects, because of which the decoy effect is sometimes named the attraction effect [40]. In the present case, the decoy attracts the buyer attention to quality. The attraction, induced by the decoy, is described by the attraction factors q(A) and q(B). Hence the probabilities of the related choices are now

$$p(A) = f(A) + q(A)\ , \qquad p(B) = f(B) + q(B)\ .$$

Since the quality feature becomes more attractive, q(A) > q(B). According to the non-informative prior, we can estimate the attraction factors as q(A) = 1/4 and q(B) = −1/4.

To be more precise, let us take numerical values from the experiment of Ariely and Wallsten [43], where the objects under sale are microwave ovens. The evaluation without a decoy results in f(A) = 0.4 and f(B) = 0.6. In the presence of the decoy, we predict that the choice probabilities can be evaluated as

$$p(A) = f(A) + 0.25\,, \qquad p(B) = f(B) - 0.25\,\,.$$

This gives p(A) = 0.65 and p(B) = 0.35. The experimental values for the choice between A and B, in the presence of but excluding C, correspond to the fractions pexp(A) = 0.61 and pexp(B) = 0.39, which is close to the predicted probabilities.

Another example can be taken from the studies of the frog mate choice [45], where frog males have attraction calls differing in either low-frequency sound or call rate. The males with lower frequency calls are denoted as A, while those with high call rate, as B. In an experiment with 80 frog females, without a decoy, it was found that females evaluate higher the fastest call rate, so that f(A) = 0.35 and f(B) = 0.65. In the presence of an inferior decoy, attracting attention to the low-frequency characteristic, the non-informative prior predicts the probabilities

$$
\rho(A) = 0.35 + 0.25 = 0.6 \,\, , \qquad \rho(B) = 0.65 - 0.25 = 0.4 \,\, .
$$

The empirically observed fractions are found to be pexp(A) = 0.6 and pexp(B) = 0.4, in remarquable agreement with our predictions.

To make it clear how the decoy effect fits the title of the paper "Inconclusive quantum measurements and decisions under uncertainty," it is worth extending the comments that have been mentioned in the Introduction.

Our principal point of view is that decision making, generally, almost always deals with composite events, since any choice is accompanied by subconscious feelings and irrational biases. The latter are often difficult to formalize and, even more, their weights usually are not known and are practically unmeasurable. This is why these subconscious irrational factors can be treated as what is called inconclusive events. When choosing between several possibilities, say An, one actually considers composite prospects, as defined in Equation (5). And the composite nature of choices requires the use of quantum techniques, as has been explained in our previous paper [30]. Otherwise, the probabilities of simple events could be characterized by classical theory. It is the composite nature of the considered prospects that yields the appearance of the quantum term q(πn) related to interference and coherence effects. In that way, the choice between the objects in the decoy effect is also a composite prospect, being composed of the choice as such and accompanying subconscious feelings forming an inconclusive set. This is why the use of QDT here is necessary and why it gives so good results.

It is admissible to give a schematic picture of the choice in the decoy effect by analogy with the double-slit experiment in physics, which is mentioned in the Introduction. Thus, making a concrete selection of either an object A or B is the analog of the registration of the particle by a detector. But before such a selection is done, there exists the uncertainty of deciding which of the object features are actually more important. These not precisely defined acts of hesitation play the role of the slits, with the uncertainty associated with which of them the particle has passed through. When it is known which of the slits the particle has passed through, then the interference effects in physics disappear. Similarly, in decision theory, if the values of each object are clearly defined, there are no hesitations, no interference, and the selection can be based on classical rules. Such an objective evaluation in the decoy effect happens in the absence of any decoy, when a decision maker rationally evaluates the features of the given objects, say quality and price. The appearance of a decoy induces hesitations concerning which of the features are actually more important. These hesitations before the choice are the analogs of the uncertainty of which slits will be visited by the traveling particle. The uncertainty results in the interference and the arising quantum term, whether in the registration of a particle or in the final choice of a decision maker.

#### 6. DISCUSSION

We have presented a mathematical formulation for the concept of inconclusive quantum measurements and events. This type of measurements in physics happens at intermediate stages of composite measuring procedures, while the final measurement stage is operationally testable. In decision making, inconclusive events correspond to the intermediate stage of deliberations. Invoking non-informative priors, it is possible to estimate the prospect probabilities, thus, predicting the results of decision making.

Generally, invoking more information on the properties of the attraction factor, it is possible to define its form more accurately than the value given by non-informative prior. For example, from condition (Equation 9) it follows that

$$-f(\pi\_n) \le q(\pi\_n) \le 1 - f(\pi\_n) \text{ .}$$

Hence, for a positive q(πn), we have

$$0 \le q(\pi\_n) \le 1 - f(\pi\_n) \text{ .}$$

While for a negative q(πn), we get

$$-f(\pi\_n) \le q(\pi\_n) \le 0 \text{ .}$$

Therefore, the attraction factor has to satisfy the limits

$$q(\pi\_n) \to +0 \,, \qquad f(\pi\_n) \to 1 \,,$$

$$q(\pi\_n) \to -0 \,, \qquad f(\pi\_n) \to 0 \,\,.$$

This suggests that the absolute value of the attraction factor can be modeled by an expression proportional to

$$q(\pi\_n) \propto f^{\mu}(\pi\_n)[1 - f(\pi\_n)]^{\nu},$$

with µ and ν being positive parameters and the sign defined by the ambiguity and risk aversion principle [4–6, 12]. More detailed study of such a form will be given in a separate paper.

But it turns out that even the simple non-informative prior provides us a rather good estimate allowing for quantitative predictions in decision making. And we have illustrated the approach by the decoy effect for which the non-informative priors yield quantitative predictions in very good agreement with empirical data.

In this paper, decision making by separate subjects is considered. We think that the theory can be generalized by considering societies of decision makers. The exchange of information in a society should certainly influence the decisions of the society members. To develop a theory of many agents, it is necessary to generalize the apporach by treating a dynamical model of agents exchanging information. Then, we think, it would be feasible to describe the behavior of the agents operating

#### REFERENCES


in a financial market and taking decisions about buying or selling shares in the presence of information asymmetry. And it would be possible to explain the known stylized facts in financial markets, such as, for example, the fat tails of return distributions, and volatility clustering, as well as transient bubbles and crashes, which are connected with herding behavior. Some first results in that direction are reported in our previous papers [6, 28], where the role of additional information, received by decision makers, is analyzed and it is shown that the amount of the additional information essentially influences the value of the quantum term. Further work on the generalization of the approach toward a dynamical theory of decision-maker societies is in progress.

### AUTHOR CONTRIBUTIONS

All authors listed, have made substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

Financial support from the ETH Zürich Risk Center is appreciated. We are greateful for discussions to E.P. Yukalova.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Yukalov and Sornette. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

digital media

of impactful research

article's readership