Probabilities and Shannon's Entropy in the Everett Many-Worlds Theory

Wichert, Andreas; Moreira, Catarina

doi:10.3389/fphy.2016.00047

HYPOTHESIS AND THEORY article

Front. Phys., 06 December 2016

Sec. Interdisciplinary Physics

Volume 4 - 2016 | https://doi.org/10.3389/fphy.2016.00047

Andreas Wichert^*

Catarina Moreira

Department of Informatics, INESC-ID/Instituto Superior Técnico - University of Lisboa, Porto Salvo, Portugal

Following a controversial suggestion by David Deutsch that decision theory can solve the problem of probabilities in the Everett many-worlds theory, we suggest that the probabilities are induced by Shannon's entropy, which measures the uncertainty of events. We argue that a rational person prefers certainty to uncertainty due to the fundamental biological principle of homeostasis.

1. Introduction

The Everett many-worlds theory views reality as a many-branched tree in which every possible quantum outcome is realized [1–8]. The modern version of the wave-function collapse is based on decoherence and leads to the multiverse interpretation of quantum mechanics [9]. Every time a quantum experiment with different possible outcomes is performed, all outcomes are obtained. If a quantum experiment is performed with two outcomes, with quantum mechanical probability 1/100 for outcome A and 99/100 for outcome B, then both the world with outcome A and the world with outcome B will exist. A person should not expect any difference between the experience in world A and in world B. The open question is the following: given the deterministic nature of the branching, why should a rational person care about the corresponding probabilities? Why not simply assume that the outcomes are equally probable, given the deterministic nature of the branching [10]? How can we solve this problem without introducing additional structure into the many-worlds theory?

David Deutsch suggested that decision theory can solve this problem [11–13]. A person identifies the consequences of decision theory with things that happen to its individual future copies on particular branches. A person who does not care about receiving 1 on the first branch A and 99 on the second branch B labels each of them with a probability of 1/2. A rational person who cares assigns the probability 1/100 to outcome A and 99/100 to outcome B. It should be noted that David Deutsch introduced a rational person into the explanation of the corresponding problem. The probability rule within unitary quantum mechanics is based on the rationality of the person. However, since the branching is deterministic and no uncertainty is present, how can this rational person justify the application of decision-making?

In this work, we intend to give an alternative and simpler explanation of Deutsch's decision-theoretic argument, motivated by biological mechanisms and based on Epstein's idea that human beings tend to be averse to uncertainty. They prefer to choose an action that brings them a certain but lower utility instead of an action that is uncertain but can yield a higher utility [14]. We proceed in two steps.

In the first step we propose to use Shannon's entropy as the expected utility in Deutsch's approach. Probabilities in Shannon's entropy function can be seen as frequencies; they can be measured only by performing an experiment many times, and they reflect past experience. Surprise is inversely related to probability. The larger the probability that we receive a certain message, the less surprised we are. For example, the message “Dog bites man” is quite common, has a high probability, and usually does not surprise us. However, the message “Man bites dog” is unusual and has a low probability. The more we are surprised by the occurrence of an event, the more difficult it is to explain that event. The surprise is defined in relation to previous events, in our example those involving men and dogs.

In the second step we introduce the experience of identity derived from homeostasis as a fundamental biological principle. It is performed subconsciously by our brain as a coherent explanation of events in a temporal window [15]. Events with higher surprise are more difficult to explain and require more energy. Before an event happens an explanation has to be initiated, so that after the event has happened it can be integrated into the present explanation in the temporal window. A rational person may not care about the attached weights during deterministic branching, but our brain machinery cares. This information is essential for the ability to give a continuous explanation of our “self” identity.

The paper is organized as follows:

• We review Deutsch's decision-theoretic argument.

• We propose Shannon's entropy as the expected utility function and surprisal as the utility function.

• We argue that probabilities that are the basis of surprise are essential for the ability to give a continuous explanation.

2. Review of Deutsch's Decision-Theoretic Argument

Decision theory according to Savage is a theory designed for the analysis of rational decision-making under conditions of uncertainty [11]. A rational person faces a choice among acts, where each act is a function from the set of possible states to the set of consequences. There are some constraints on the acts of the rational person; for example, the preferences must be transitive. It can then be shown that there exists a probability measure p on states s and a utility function U on the set of consequences of an act A so that the expected utility is defined as

\begin{matrix} E U (A) := \sum_{s} p (s) \cdot U (A (s)) . & (1) \end{matrix}

It follows that a rational person prefers act A to act B if the expected utility of A is greater than that of B. The behavior corresponds to the maximization of the expected utility with respect to some probability measure [16].
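As a minimal sketch of Equation (1), the following Python fragment computes the expected utility of two acts and compares them. The state labels, consequence names, numerical values and the helper names `expected_utility` and `prefers` are illustrative assumptions and are not part of the original argument.

```python
from typing import Dict


def expected_utility(p: Dict[str, float],
                     act: Dict[str, str],
                     utility: Dict[str, float]) -> float:
    """EU(A) = sum over states s of p(s) * U(A(s)), following Equation (1)."""
    return sum(p[s] * utility[act[s]] for s in p)


def prefers(p: Dict[str, float],
            act_a: Dict[str, str],
            act_b: Dict[str, str],
            utility: Dict[str, float]) -> bool:
    """True if act_a has the strictly greater expected utility."""
    return expected_utility(p, act_a, utility) > expected_utility(p, act_b, utility)


# Hypothetical probability measure on two states (assumed values).
p = {"s1": 0.25, "s2": 0.75}
# Hypothetical utilities of three consequences (assumed values).
utility = {"low": 0.0, "mid": 2.0, "high": 4.0}
# An act maps each state to a consequence.
act_a = {"s1": "high", "s2": "low"}
act_b = {"s1": "mid", "s2": "mid"}

print(expected_utility(p, act_a, utility), expected_utility(p, act_b, utility))
print("prefers B to A:", prefers(p, act_b, act_a, utility))
```

In this toy setting the act with the safe middle payoff has the larger expected utility, so it is the one a maximizer of Equation (1) would pick.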

2.1. Decision-Theoretic Argument

In the context of the many-worlds theory, the rational person is able to describe each of its acts as a function from the set of possible future branches that will result from a given quantum measurement to the set of consequences [11]. Consequences are the things that happen to individual future copies of the person on a particular branch. An act is a function from states to consequences, and the preferences must be transitive: if a rational person prefers act A to act B, and prefers act B to act C, then the same person must prefer act A to act C. This can be summarized by assigning a real number to each possible outcome in such a way that the preferences are transitive. The corresponding number is called the utility or value. The deterministic process of branching is identified as a chance setup for the rational person by a quantum game with a payoff function P associating a consequence with each eigenvalue of the observable $\hat{X}$ . When the measurement is performed, the state vector collapses into one or the other eigenstate of the observable being measured; a projection onto an eigenstate is performed. For the observable $\hat{X}$ and the state |y〉, the expression for the probability reduces to |〈x|y〉|², in which |x〉 is an eigenvector of $\hat{X}$ . A quantum game is specified by the triple

(| y 〉, \hat{X}, P) .

It is assumed that the utilities of the possible payoffs have an additivity property [11]. The approach is based on two non-probabilistic axioms of decision theory. The principle of substitutability constrains the values of sub-games: if any of the sub-games is replaced by a game of equal value, then the value of the composite game is unchanged. The other axiom concerns two-player zero-sum games. If there are two possible acts A and B with payoff c for A and −c for B, then playing A and B results in zero. The expected utility or value for playing A and B is

E U (A) + E U (B) = c - c = 0.

For any two acts A, B, the rational person prefers A to B if

E U (A) > E U (B) .

The person acts as if she regarded her multiple future branches as multiple possible futures. In classical decision theory two rational persons may be represented by different probability measures on the set of states. This is not so in the approach suggested by David Deutsch for the many-worlds theory [10]. The expected value with an observable $\hat{X}$ is

\begin{matrix} E U (| y 〉) = 〈 y | \hat{X} | y 〉 & (2) \end{matrix}

and since

\begin{matrix} 〈 y | \hat{X} | y 〉 = \sum_{i} | 〈 x_{i} | y 〉 |^{2} \cdot x_{i} = \sum_{i} p_{i} \cdot x_{i} & (3) \end{matrix}

representing the weighted mean over the eigenvalues of $\hat{X} .$ In quantum physics $〈 y | \hat{X} | y 〉$ is called the expected value. A rational person who makes decisions about outcomes of a measurement believes that each possible eigenvalue x_i has the probability $| 〈 x_{i} | y 〉 |^{2} = p_{i}$ due to the process of maximizing the probabilistic expectation value of the payoff [11].
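The weighted mean in Equations (2) and (3) can be illustrated with a short Python sketch. It assumes that the state is given by its amplitudes in the eigenbasis of the observable; the eigenvalues 1 and 99, the phases, and the function names `born_weights` and `expectation` are hypothetical choices made for the illustration only.

```python
import cmath
import math
from typing import List, Sequence


def born_weights(amplitudes: Sequence[complex]) -> List[float]:
    """Squared moduli |<x_i|y>|^2 of the amplitudes of |y> in the eigenbasis."""
    return [abs(a) ** 2 for a in amplitudes]


def expectation(amplitudes: Sequence[complex],
                eigenvalues: Sequence[float]) -> float:
    """<y|X|y> = sum_i |<x_i|y>|^2 * x_i, the weighted mean of the eigenvalues."""
    return sum(w * x for w, x in zip(born_weights(amplitudes), eigenvalues))


# Weights 1/100 and 99/100, as in the two-outcome example of the introduction.
amplitudes = [math.sqrt(1 / 100), math.sqrt(99 / 100)]
# Hypothetical payoffs attached to the two eigenstates (assumed values).
eigenvalues = [1.0, 99.0]
value = expectation(amplitudes, eigenvalues)

# Multiplying each amplitude by a phase factor leaves the weights unchanged.
rotated = [a * cmath.exp(1j * theta) for a, theta in zip(amplitudes, [0.3, 1.2])]

print(value, expectation(rotated, eigenvalues))
```

Only the squared moduli enter the result, which is why the phases chosen for `rotated` have no effect on the value.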

2.1.1. The Derivation of Probabilities for Real Amplitudes

We sketch the proof by David Elieser Deutsch, see Deutsch [11]. If |y〉 is an eigenstate |x〉 of $\hat{X}$ it follows that

E U (| y 〉) = x

is equal to the eigenvalue x. By appealing twice to the additivity of utilities for eigenvectors |x_i〉 and adding a constant k we arrive at

\begin{matrix} E U (\sum_{i} ω_{i} \cdot | x_{i} + k 〉) = E U (\sum_{i} ω_{i} \cdot | x_{i} 〉) + k . & (4) \end{matrix}

The rational person is indifferent between receiving the separate payoffs with utilities x₁ and x₂ and receiving a single payoff with utility x₁ + x₂. The expected utility of |x_i + k〉 has by additivity the same value as the expected utility of |x_i〉 followed by k. This is the central equation on which the proof of David Deutsch is based. The constant k on the left side corresponds to a combination of eigenstates, because it is required that the payoffs are present in each branch. The constant k on the right side corresponds to a combination of the corresponding eigenvalues.

According to additivity, the left side of the equation has the same expected utility as the superposition of possible branches, and the payoffs represented by a corresponding combination of eigenvalues are on the right side. The other equation follows from the axiom concerning two-player zero-sum games

\begin{matrix} E U (\sum_{i} ω_{i} \cdot | x_{i} 〉) + E U (\sum_{i} ω_{i} \cdot | - x_{i} 〉) = 0. & (5) \end{matrix}

If |y〉 is in a superposition

\begin{matrix} | y 〉 = \frac{1}{\sqrt{2}} \cdot | x_{1} 〉 + \frac{1}{\sqrt{2}} \cdot | x_{2} 〉 . & (6) \end{matrix}

and with

\begin{matrix} k = - x_{1} - x_{2} & (7) \end{matrix}

it follows

\begin{matrix} \begin{array}{l} E U (\frac{1}{\sqrt{2}} (| x_{1} - x_{1} - x_{2} 〉 + | x_{2} - x_{1} - x_{2} 〉)) \\ = E U (\frac{1}{\sqrt{2}} (| - x_{2} 〉 + | - x_{1} 〉)) \\ = - E U (\frac{1}{\sqrt{2}} (| x_{1} 〉 + | x_{2} 〉)) \\ = E U (\frac{1}{\sqrt{2}} (| x_{1} 〉 + | x_{2} 〉)) - x_{1} - x_{2} \end{array} & (8) \end{matrix}

and, by equating the last two lines of Equation (8), the value of the expected utility is derived as

\begin{matrix} \frac{1}{2} (x_{1} + x_{2}) = E U (\frac{1}{\sqrt{2}} (| x_{1} 〉 + | x_{2} 〉)) . & (9) \end{matrix}

For a superposition of n eigenstates

\begin{matrix} \frac{1}{n} (x_{1} + x_{2} + \dots + x_{n}) = E U (\frac{1}{\sqrt{n}} (| x_{1} 〉 + | x_{2} 〉 + \dots + | x_{n} 〉)) & (10) \end{matrix}

the proof is based on induction from the principle of substitutability and additivity. For n = 2^m the proof follows from substitutability by inserting a two-equal-amplitude game into the remaining 2^(m−1) equal-amplitude outcomes. Otherwise the proof follows from the inductive hypothesis and additivity by replacing n − 1 by n; for details see Deutsch [11]. For unequal amplitudes

\begin{matrix} \frac{m}{n} \cdot x_{1} + \frac{n - m}{n} \cdot x_{2} = E U (\sqrt{\frac{m}{n}} \cdot | x_{1} 〉 + \sqrt{\frac{n - m}{n}} \cdot | x_{2} 〉) . & (11) \end{matrix}

we introduce an auxiliary system that can be in two different states, either

\begin{matrix} \sqrt{\frac{1}{m}} \cdot \sum_{a = 1}^{m} | z_{a} 〉 & (12) \end{matrix}

or

\begin{matrix} \sqrt{\frac{1}{n - m}} \cdot \sum_{a = m + 1}^{n} | z_{a} 〉 & (13) \end{matrix}

with eigenstates |z_a〉 and eigenvalues z_a of the observable Ẑ that are all distinct. Then the joint state is given by

\begin{matrix} \sqrt{\frac{1}{n}} \cdot (\sum_{a = 1}^{m} | x_{1} 〉 | z_{a} 〉 + \sum_{a = m + 1}^{n} | x_{2} 〉 | z_{a} 〉) . & (14) \end{matrix}

When we measure the observable Ẑ, we know from the index a of the eigenvalue z_a that the value of $\hat{X}$ is x₁ if a < m + 1 and x₂ otherwise. With additional properties of the eigenvalues

\begin{matrix} \sum_{a = 1}^{m} z_{a} = \sum_{a = m + 1}^{n} z_{a} = 0 & (15) \end{matrix}

and all n values

\begin{matrix} x_{1} + z_{a}, 1 \leq a \leq m & (16) \end{matrix}

and

\begin{matrix} x_{2} + z_{a}, m + 1 \leq a \leq n & (17) \end{matrix}

are all distinct, the composite measurement with observables $\hat{X}$ and Ẑ has the same value as the one with observable $\hat{X}$ alone. Because of additivity, the state is equivalent to the equal-amplitude superposition

\begin{matrix} \sqrt{\frac{1}{n}} \cdot (\sum_{a = 1}^{m} | x_{1} 〉 | z_{a} 〉 + \sum_{a = m + 1}^{n} | x_{2} 〉 | z_{a} 〉) & (18) \end{matrix}

and it follows

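\begin{matrix} E U (\sqrt{\frac{1}{n}} \cdot (\sum_{a = 1}^{m} | x_{1} 〉 | z_{a} 〉 + \sum_{a = m + 1}^{n} | x_{2} 〉 | z_{a} 〉)) = \frac{1}{n} \cdot (\sum_{a = 1}^{m} (x_{1} + z_{a}) + \sum_{a = m + 1}^{n} (x_{2} + z_{a})) = \frac{m}{n} \cdot x_{1} + \frac{n - m}{n} \cdot x_{2} . & (19) \end{matrix}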
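The arithmetic behind the auxiliary-system step can be checked numerically. The sketch below is an illustration under stated assumptions, not a reproduction of Deutsch's proof: it takes the equal-amplitude result of Equation (10) as given, applies it to the n combined eigenvalues x₁ + z_a and x₂ + z_a, and uses the conditions of Equations (15) to (17). The function name `unequal_amplitude_value` and the numerical values are hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def unequal_amplitude_value(x1: int, x2: int, m: int, n: int,
                            z: Sequence[int]) -> Fraction:
    """Mean of the n combined eigenvalues x1 + z_a (a <= m) and x2 + z_a (a > m).

    By the equal-amplitude result, this mean is the value of the joint game.
    """
    if len(z) != n or len(set(z)) != n:
        raise ValueError("need n distinct auxiliary eigenvalues")
    if sum(z[:m]) != 0 or sum(z[m:]) != 0:
        raise ValueError("each group of auxiliary eigenvalues must sum to zero")
    combined = [x1 + za for za in z[:m]] + [x2 + za for za in z[m:]]
    if len(set(combined)) != n:
        raise ValueError("combined eigenvalues must all be distinct")
    return Fraction(sum(combined), n)


# Hypothetical values: m = 2 of n = 5 branches carry x1, the rest carry x2.
x1, x2, m, n = 10, 20, 2, 5
z = [1, -1, 2, 3, -5]  # distinct; each group sums to zero

lhs = unequal_amplitude_value(x1, x2, m, n, z)
rhs = Fraction(m, n) * x1 + Fraction(n - m, n) * x2
print(lhs, rhs)
```

Because each group of auxiliary eigenvalues sums to zero, they drop out of the mean and only the weights m/n and (n − m)/n remain.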

A rational person would choose the expected value of |y〉 with an observable $\hat{X}$

\begin{matrix} E U (| y 〉) = 〈 y | \hat{X} | y 〉 & (20) \end{matrix}

and since

\begin{matrix} 〈 y | \hat{X} | y 〉 = \sum_{i} | 〈 x_{i} | y 〉 |^{2} \cdot x_{i} = \sum_{i} p_{i} \cdot x_{i} & (21) \end{matrix}

represents the weighted mean over the eigenvalues of $\hat{X}$ , the rational person interprets p_i as probabilities. For complex amplitudes it is assumed that the unitary transformation

\begin{matrix} | x_{a} 〉 \to e^{i \cdot θ_{a}} | x_{a} 〉 & (22) \end{matrix}

with a corresponding phase θ_a does not alter the payoff; the player is indifferent as to whether it occurs or not. The proof can be extended to irrational numbers; it is based on the idea that the state undergoes some unitary evolution before the measurement [11]. The unitary evolution leads to real amplitudes with eigenvalues that exceed those of the original state. Each game played after the unitary transformation is at least as valuable as the original game, so the value of the original game is a lower bound on the values of such games.

3. Expected Utility and Entropy

A fundamental property of rational persons is that they prefer certainty to uncertainty. Humans prefer to choose an action that brings them a certain, but lower utility instead of an action that is uncertain, but can bring a higher utility [14].

We measure the uncertainty by the entropy of the experiment. The experiment starts at t₀ and ends at t₁. At t₀, we have no information about the results of the experiment, and at t₁, we have all of the information, so that the entropy of the experiment is 0. We can describe an experiment by probabilities. For the outcome of the flip of a fair coin, the probability for a head or tail is 0.5, p = (0.5, 0.5). A person A knows the outcome, but person B does not. Person B could ask A about the outcome of the experiment. If the questions are of the most basic nature, then we could measure the minimal number of questions B must pose to know the result of the experiment. A most basic question corresponds to the smallest information unit, that is, a question with a yes or no answer. For a fair coin, we pose just one question, for example, is it a tail? For a card game, to determine if a card is either red, clubs or spades, we have a different number of possible questions. If the card is red, then we need only one question. However, in the case in which the card is not red, we need another question to determine whether it is a spade or a club. The probability of being red is 0.5, of clubs 0.25 and spades 0.25, p = (0.5, 0.25, 0.25). For clubs and spades, we need two questions. On average, we must ask 1 · 0.5 + 2 · 0.25 + 2 · 0.25 questions, which results in 1.5 questions. The entropy is represented by Shannon's entropy H for an experiment A

\begin{matrix} H (A) = - \sum_{i} p_{i} \log_{2} p_{i} . & (23) \end{matrix}

It indicates the minimal number of optimal binary yes/no questions that a rational person must pose to know the result of an experiment [17, 18]. We can describe the process of measuring a state by an observable as measuring the entropy of the experiment. Before the measurement of a state |x〉 by an observable we are uncertain about the outcome. We measure the uncertainty by Shannon's entropy. After the measurement the state is in an eigenstate and the entropy is zero. Shannon's entropy is defined for any observable and any probability distribution; according to Ballentine [19, p. 617], “It measures the maximum amount of information that may be gained by measuring that observable.”
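The coin and card examples can be reproduced with a few lines of Python. This is a small sketch of Equation (23); the function names `shannon_entropy` and `expected_questions` are illustrative, and the question counts are those given in the text (one question for red, two for clubs or spades).

```python
import math
from typing import Sequence


def shannon_entropy(p: Sequence[float]) -> float:
    """H = -sum_i p_i * log2(p_i) in bits; zero-probability terms are skipped."""
    return -sum(p_i * math.log2(p_i) for p_i in p if p_i > 0)


def expected_questions(p: Sequence[float], questions: Sequence[int]) -> float:
    """Average number of yes/no questions for a fixed questioning strategy."""
    return sum(p_i * q for p_i, q in zip(p, questions))


coin = [0.5, 0.5]
cards = [0.5, 0.25, 0.25]      # red, clubs, spades
card_questions = [1, 2, 2]     # questions needed for each outcome

print(shannon_entropy(coin))
print(shannon_entropy(cards), expected_questions(cards, card_questions))
```

For the card example the entropy and the average number of questions coincide, because every probability is a power of 1/2 and the questioning strategy is optimal.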

Assume that a Hilbert space H_n can be represented as a collection of orthogonal subspaces

\begin{matrix} H_{n} = E_{1} \oplus E_{2} \oplus \dots \oplus E_{f} & (24) \end{matrix}

with f ≤ n. A state |y〉 can be represented with |x_i〉 ∈ E_i as

| y 〉 = ω_{1} \cdot | x_{1} 〉 + ω_{2} \cdot | x_{2} 〉 + \dots + ω_{f} \cdot | x_{f} 〉 .

For one-dimensional subspaces, f = n and the value |x_k〉 is observed with a probability ||ω_k · |x_k〉||² = |ω_k|². Shannon's entropy is defined as

\begin{matrix} E (| y 〉) = - \sum_{i = 1}^{n} (| ω_{i} |^{2} \cdot \log_{2} | ω_{i} |^{2}) = - \sum_{i} p_{i} \log_{2} p_{i} . & (25) \end{matrix}

3.1. Weighted Sum of Surprisals

We say that events that seldom happen, for example, the letter x in a message, have a higher surprise. Some letters are more frequent than others; an e is more frequent than an x. The larger the probability that we receive a character, the less surprised we are. Surprise is inversely related to probability.

s_{i} = \frac{1}{p_{i}}

The logarithm of surprise

\begin{matrix} I_{i} = \log_{2} \frac{1}{| ω_{i} |^{2}} = - 2 \cdot \log_{2} | ω_{i} | = - \log_{\sqrt{2}} | ω_{i} | = \log_{2} s_{i} & (26) \end{matrix}

is the self-information or surprisal I_i. Shannon's entropy H represents the weighted sum of surprisals.

\begin{matrix} H (A) = \sum_{i} p_{i} \cdot \log_{2} s_{i} = \sum_{i}^{n} p_{i} \cdot I_{i} . & (27) \end{matrix}

It can be interpreted as an expected utility

\begin{matrix} E U (A) := \sum_{i} p_{i} \cdot U (A (i)) = \sum_{i}^{n} p_{i} \cdot I_{i} . & (28) \end{matrix}

with the utility function U(A(i))

\begin{matrix} U (A (i)) = I_{i} . & (29) \end{matrix}

For acts the expected utility is identified with the representation of the entropy of an action, represented by H

H (A) := E U (A) .
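The identification of entropy with an expected utility whose utility function is the surprisal can be sketched as follows. The code takes the surprisal as I_i = log₂(1/p_i) with p_i = |ω_i|², so that the weighted sum of surprisals is Shannon's entropy; the function names and the example amplitudes are assumptions made for the illustration.

```python
import math
from typing import List, Sequence


def surprisal(p_i: float) -> float:
    """I_i = log2(1 / p_i) = -log2(p_i), the self-information in bits."""
    return -math.log2(p_i)


def probabilities(amplitudes: Sequence[complex]) -> List[float]:
    """p_i = |omega_i|^2 for amplitudes omega_i of a normalized state."""
    return [abs(w) ** 2 for w in amplitudes]


def entropy_as_expected_utility(amplitudes: Sequence[complex]) -> float:
    """EU = sum_i p_i * U_i with the utility U_i taken to be the surprisal I_i."""
    return sum(p * surprisal(p) for p in probabilities(amplitudes) if p > 0)


# Equal superposition of two eigenstates: each surprisal is one bit.
equal = [1 / math.sqrt(2), 1 / math.sqrt(2)]
# Hypothetical unequal superposition with weights 1/2, 1/4 and 1/4.
unequal = [math.sqrt(0.5), 0.5, 0.5]

print(entropy_as_expected_utility(equal))
print(entropy_as_expected_utility(unequal))
```

A rare outcome has a large surprisal but a small weight, so its contribution p_i · I_i to the sum stays bounded.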

That the rewards are determined by the negative information content is already known and is used for the utility representation problems in economics [20–22]. If there are two possible acts A and B with payoff c for A and −c for B then playing A and B results in zero. The expected utility or value for playing A and B is

H (A) + H (B) = c - c = 0,

but for any two acts A, B, the rational person prefers A to B if

H (A) < H (B)

because the uncertainty is lower. The decision-theoretic proof of David Deutsch can be applied, for example

\begin{matrix} | y 〉 = \frac{1}{\sqrt{2}} \cdot | x_{1} 〉 + \frac{1}{\sqrt{2}} \cdot | x_{2} 〉 . & (30) \end{matrix}

and with

\begin{matrix} k = - x_{1} - x_{2} & (31) \end{matrix}

the right side of the equation with eigenvalues x₁ = 1 and x₂ = 1

\begin{matrix} k = - x_{1} - x_{2} = - 2 & (32) \end{matrix}

or U(A(i)) = I_i = log₂2 = 1. Therefore

\begin{matrix} \frac{1}{2} (1 + 1) = \frac{1}{2} (x_{1} + x_{2}) = E U (\frac{1}{\sqrt{2}} (| x_{1} 〉 + | x_{2} 〉)) & (33) \end{matrix}

and

\begin{matrix} 1 = 0.5 \cdot 1 + 0.5 \cdot 1. & (34) \end{matrix}

However, we can also recover the probabilities from the definition of Shannon's entropy. This strengthens the idea even more.

3.2. Recovering Probabilities from Entropy

The entropy or the uncertainty is maximal in the case in which all probabilities are equal, which means that p = (1/n, 1/n, …, 1/n). In this case

\begin{matrix} H (F) = - \sum_{i} p_{i} \log_{2} p_{i} = - \log_{2} 1 / n = \log_{2} n . & (35) \end{matrix}

the surprisal I_i is equal to the entropy

\begin{matrix} H (F) = \log_{2} n = I_{i} . & (36) \end{matrix}

If |y〉 is in a superposition of n eigenstates

\begin{matrix} | y 〉 = \frac{1}{\sqrt{n}} (| x_{1} 〉 + | x_{2} 〉 + \dots + | x_{n} 〉) & (37) \end{matrix}

it follows that a surprisal $I_{i}^{*}$ of each state represented by amplitude $ω_{i} = \frac{1}{\sqrt{n}}$ is

\begin{matrix} I_{i}^{*} = \log_{2} \sqrt{n} = \frac{1}{2} \cdot \log_{2} n . & (38) \end{matrix}

We know that in the case of equal amplitudes log₂ n yes/no questions have to be asked, with H(F) = I_i, so the function f(I*) amounts to multiplication by two

\begin{matrix} H (| y 〉) = I_{i} = f (I_{i}^{*}) = 2 \cdot \log_{2} \sqrt{n} = \log_{2} n = - \log_{2} {(\frac{1}{\sqrt{n}})}^{2} & (39) \end{matrix}

that is equivalent to

\begin{matrix} p_{i} = 2^{- I_{i}} = {(\frac{1}{\sqrt{n}})}^{2} . & (40) \end{matrix}

In the case of complex amplitudes with phase θ_a

\begin{matrix} ω_{i} = e^{i \cdot θ_{a}} \frac{1}{\sqrt{n}} & (41) \end{matrix}

\begin{matrix} I_{i}^{*} = \log_{2} e^{i \cdot θ_{a}} \sqrt{n} = \frac{i \cdot θ_{a}}{\log 2} + \frac{1}{2} \cdot \log_{2} n, & (42) \end{matrix}

the operations described by f(I*) would be: first get rid of the complex term i·θ_a/log 2 by adding −i·θ_a/log 2, and then multiply by 2. Or

\begin{matrix} p_{i} = 2^{- I_{i}} = {(\frac{1}{e^{- i \cdot θ_{a}} e^{i \cdot θ_{a}} \cdot \sqrt{n}})}^{2} = {(\frac{1}{\sqrt{n}})}^{2} . & (43) \end{matrix}

For unequal amplitudes, we multiply the surprisal $I_{i}^{*}$ of each state by two and recover from I_i the value $p_{i} = 2^{- I_{i}}$ . This operation leads to the minimal number of optimal binary yes/no questions. For example, for

\begin{matrix} | y 〉 = \sqrt{\frac{m}{n}} \cdot | x_{1} 〉 + \sqrt{\frac{n - m}{n}} \cdot | x_{2} 〉 & (44) \end{matrix}

we get by operations described by f(I*)

\begin{matrix} H (| y 〉) = - \frac{m}{n} \cdot \log_{2} \frac{m}{n} - (\frac{n - m}{n}) \cdot \log_{2} (\frac{n - m}{n}), & (45) \end{matrix}

the correct value of Shannon's entropy. We assume

\begin{matrix} p_{i} = | ω_{i} |^{2} . & (46) \end{matrix}
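The recovery of probabilities from amplitude-level surprisals can be illustrated with the following Python sketch. It rests on sign conventions that are assumptions of this example: the amplitude-level surprisal is taken as I* = −log₂ ω_i (complex when ω_i carries a phase), the operation f discards the imaginary phase term and multiplies by two, and the probability is then p_i = 2^(−I_i). The function names `half_surprisal`, `f` and `recovered_probability` are hypothetical.

```python
import cmath
import math


def half_surprisal(omega: complex) -> complex:
    """I* = -log2(omega); omega must be nonzero. Complex if omega has a phase."""
    return -cmath.log(omega) / math.log(2)


def f(i_star: complex) -> float:
    """Discard the imaginary phase term, then multiply by two."""
    return 2.0 * i_star.real


def recovered_probability(omega: complex) -> float:
    """p = 2^(-I) with I = f(I*), which equals |omega|^2."""
    return 2.0 ** (-f(half_surprisal(omega)))


# Equal amplitudes with an arbitrary phase (assumed values).
n = 4
theta = 0.3
omega = cmath.exp(1j * theta) / math.sqrt(n)
p_equal = recovered_probability(omega)

# Unequal amplitudes sqrt(m/n) and sqrt((n-m)/n) with m = 1, n = 4.
m = 1
amplitudes = [math.sqrt(m / n), math.sqrt((n - m) / n)]
probs = [recovered_probability(w) for w in amplitudes]
entropy = sum(p * f(half_surprisal(w)) for p, w in zip(probs, amplitudes))

print(p_equal, probs, entropy)
```

With these conventions the recovered weights agree with the squared moduli of the amplitudes, and the weighted sum of the doubled surprisals reproduces the entropy of Equation (45).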

3.3. Biological Principle of Energy Minimization

Identity is a concept that defines the properties of a rational person over time [23]. It is a unifying concept based on the biological principle of homeostasis [24, 25]. Organisms have to be kept stable to guarantee the maintenance of life, for example through the regulation of body temperature. This principle was extended by allostasis [26], the regulation of bodily functions over time. To perform this task, efficient mechanisms for the prediction of future states are needed to anticipate future environmental constellations [27, 28]. This is necessary because the homeostatic state may be violated by unexpected changes in the future. It also means that every organism implies a kind of self-identity over time [29]. This identity requires a time interval of finite duration within which sensory information is integrated. Different sensory information arrives at different times, so the fusion process has to be carried out over some time window. Similar problems are present during a sensor fusion task in a mobile robot. For example, in visual and auditory perception in humans, the transduction of acoustic information is much shorter than that of visual information [30]. It is suggested that in humans a temporal window with a duration of 3 s is created [15]. This window represents the psychological concept of “now” [29]. The conscious concept of “now” represented by the temporal window is shifted backward in time relative to consciousness itself, since a subconscious mechanism is required to perform the integration task.

3.3.1. Prediction of Events

One of the brain's functions is to provide a causally consistent explanation of events in order to maintain self-identity over time, leading to the psychological concept of “now.” Split-brain research and the stimulation of brain regions during awake operations suggest that the brain generates an explanation of effects that were not initiated by consciousness [31, 32]. Before an event happens, an explanation has to be initiated by the subconscious parts of the brain so that it is possible to integrate it into the temporal window of the self when the event happens. Other functions of the organism also need to be put on alert in response to predicted possible events.

Events with higher surprise are more difficult to explain than events with low surprise values. An explanation has to be possible. When the surprise is too high, an explanation may be impossible and the identity of self could break. The idea is related to the general constructor theory of David Elieser Deutsch, see Deutsch [33]. The metabolic cost of the neural information processing needed to explain higher-surprise events is greater than that for lower-surprise events. Fechner's law states that there is a logarithmic relation between the intensity of a stimulus and its perceived intensity [34]. We assume as well that there is a logarithmic relation between the cost of initiating an explanation of an event and its surprise value s_i, that is, log s_i. Neuronal computation is energetically expensive [35]. Consequently, the brain's limited energy supply imposes constraints on its information processing capability. The costs should be fairly divided among the explanations of all predicted possible branches, since the organism will be deterministically present in all of them. A possible solution is given by Shannon's entropy. The corresponding value I_i is weighted in relation

\begin{matrix} p_{i} = 2^{- I_{i}} & (47) \end{matrix}

and the resulting costs are p_i · I_i. The resulting costs of initiating n explanations for action A of n predicted branches before a split are

\begin{matrix} H (A) = \sum_{i}^{n} p_{i} \cdot \log_{2} s_{i} = \sum_{i}^{n} p_{i} \cdot I_{i} & (48) \end{matrix}

For the human (subconscious) brain it makes sense to choose A over B if

H (A) < H (B) .

since it requires less energy.
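The cost comparison can be sketched in Python under the assumption stated in the text that the cost of preparing an explanation for branch i is p_i · I_i, with I_i = log₂(1/p_i). The act labels, their branch weights and the function names are hypothetical and serve only to illustrate the selection rule H(A) < H(B).

```python
import math
from typing import Dict, List, Sequence


def branch_costs(p: Sequence[float]) -> List[float]:
    """Cost p_i * I_i of initiating an explanation for each predicted branch."""
    return [p_i * math.log2(1.0 / p_i) for p_i in p if p_i > 0]


def total_cost(p: Sequence[float]) -> float:
    """H = sum_i p_i * I_i, the total cost over all predicted branches."""
    return sum(branch_costs(p))


def preferred_act(acts: Dict[str, Sequence[float]]) -> str:
    """Return the label of the act with the lowest total explanation cost."""
    return min(acts, key=lambda name: total_cost(acts[name]))


# Hypothetical branch weights for two acts (assumed values).
acts = {"A": [0.9, 0.1], "B": [0.5, 0.5]}

for name, weights in acts.items():
    print(name, branch_costs(weights), total_cost(weights))
print("preferred:", preferred_act(acts))
```

The act whose branches are more predictable has the smaller total cost, which is the sense in which the lower-entropy act requires less energy.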

4. Conclusion

Since the branching is deterministic and no uncertainty is present, how can a rational person justify the application of decision-making? Why not simply assume that the outcomes are equally probable, given the deterministic nature of the branching? David Deutsch's probability rule within unitary quantum mechanics is based on rationality. Instead of rationality we introduced the biological principle of homeostasis. Before an event happens an explanation has to be prepared. The costs of the explanation should be fairly divided among the explanations of all predicted possible branches, since the organism will be deterministically present in all of them. The costs are described by the negative expected utility represented by Shannon's entropy. The probabilities can be recovered from Shannon's entropy.

Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Funding

This work was supported by national funds through Fundação para a Ciência e Tecnologia (FCT) with reference UID/CEC/50021/2013 and through the PhD. grant SFRH/BD/92391/2013. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Everett H. “relative state” formulation of quantum mechanics. Rev Mod Phys. (1957) 29:454–62. doi: 10.1103/RevModPhys.29.454

2. Wheeler J. Assessment of everett's “relative state.” Rev Mod Phys. (1957) 29:463–5. doi: 10.1103/RevModPhys.29.463

3. Dewitt BS, Graham N (eds.). The Many-Worlds Interpretation of Quantum Mechanics. Princeton, NJ: Princeton University Press (1973).

4. Deutsch D. The Fabric of Reality. London: Penguin Group (1997).

5. Deutsch D. The structure of the multiverse. Proc R Soc A (2002) 458:2911–23. doi: 10.1098/rspa.2002.1015

6. Wallace D. Worlds in the everett interpretation. Stud Hist Philos Mod Phys. (2002) 33:637–61. doi: 10.1016/S1355-2198(02)00032-1

7. Wallace D. Everett and structure. Stud Hist Philos Mod Phys. (2003a) 34:87–105. doi: 10.1016/S1355-2198(02)00085-0

8. Byrne P. The many worlds of hugh everett. Sci Am. (2007) 297:98–105. doi: 10.1038/scientificamerican1207-98

9. Bousso R, Susskind L. Multiverse interpretation of quantum mechanics. Phys Rev D (2012) 85:045007. doi: 10.1103/PhysRevD.85.045007

10. Greaves H. Probability in the everett interpretation. Philos Comp. (2007) 1:109–128.

11. Deutsch D. Quantum theory of probability and decisions. Proc R Soc A (1999) 455:3129–97. doi: 10.1098/rspa.1999.0443

12. Wallace D. Everettian rationality: defending deutsch's approach to probability in the everett interpretation. Stud Hist Philos Mod Phys. (2003b) 34:415–39. doi: 10.1016/S1355-2198(03)00036-4

13. Wallace D. Quantum probability from subjective likelihood: improving on deutsch's proof of the probability rule. Stud Hist Philos Mod Phys. (2007) 38:311–32. doi: 10.1016/j.shpsb.2006.04.008

14. Epstein LG. A definition of uncertainty aversion. Rev Econ Stud. (1999) 66:579–608. doi: 10.1111/1467-937X.00099

15. Pöppel E. Pre-semantically defined temporal windows for cognitive processing. Philos Trans R Soc Lond B Biol Sci. (2009) 364:1887–1896. doi: 10.1098/rstb.2009.0015

16. Russell SJ, Norvig P. Artificial Intelligence: A Modern Approach, 2nd Edn. Harlow: Prentice-Hall (2003).

17. Shannon CE. A mathematical theory of communication. Bell Syst Tech J. (1948) 27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x

18. Topsoe F. Informationstheorie. Stuttgart: Teubner Studienbücher (1974). doi: 10.1007/978-3-322-94886-1

19. Ballentine LE. Quantum Mechanics: A Modern Development, 2nd Edn. Singapore: World Scientific (2012).

20. Candeal JC, De Miguel JR, Indura E, Mehta GB. Utility and entropy. Econ Theory (2001) 17:233–8. doi: 10.1007/PL00004100

21. Ortega PA, Braun DA. A conversion between utility and information. In: Proceedings of the Third Conference on Artificial General Intelligence. Mountain View, CA: Springer; Heidelberg (2009). p. 115–120.

22. Robert FN, Jose VRR, Winkler RL. Duality between maximization of expected utility and minimization of relative entropy when probabilities are imprecise. In: 6th International Symposium on Imprecise Probability: Theories and Applications. Durham (2009).

23. Pöppel E. Perceptual identity and personal self: neurobiological reflections. In: Fajkowska M, Eysenck MM, editors. Personality From Biological, Cognitive, and Social Perspectives. Clinton Corners, NY: Eliot Werner Public (2010). p. 75–82.

24. Bernard C. An Introduction to the Study of Experimental Medicine. New York, NY: Dover (1957).

25. Gross CG. Claude bernard and the constancy of the internal environment. Neuroscientist (1998) 4:380–5. doi: 10.1177/107385849800400520

26. Sterling P. Principles of allostasis: optimal design, predictive regulation, pathophysiology and rational therapeutics. In: Schulkin J, editor. Allostasis, Homeostasis, and the Costs of Adaptation. Cambridge: University Press (2004). p. 17–64. doi: 10.1017/CBO9781316257081.004

27. von Holst E, Mittelstaedt H. Das reafferenzprinzip (wechselwirkungen zwischen zentralnervensystem und peripherie). Naturwissenschaften (1950) 37:464–76. doi: 10.1007/BF00622503

28. Bao Y, Pöppel E, Liang W, Yang T. When is the right time? a little later! – delayed responses show better temporal control. Proc Soc Behav Sci. (2014) 126:199–200. doi: 10.1016/j.sbspro.2014.02.370

29. Zhou B, Pöppel E, Bao Y. In the jungle of time: the concept of identity as a way out. Front Psychol. (2014) 5:844. doi: 10.3389/fpsyg.2014.00844

30. Pöppel E, Schill K, von Steinbüchel N. Sensory integration within temporally neutral system states: a hypothesis. Naturwissenschaften (1990) 77:89–91. doi: 10.1007/BF01131783

31. Libet B. Mind Time - The Temporal Factor in Consciousness. Cambridge, MA; London: Harvard University Press (2004).

32. Coon D, Mitterer JO. Introduction to Psychology Gateways to Mind and Behavior, 13th Edn. Boston, MA: Wadsworth Publishing (2012).

33. Deutsch D. Constructor theory. Synthese (2013) 190:4331–4359. doi: 10.1007/s11229-013-0279-z

34. Frisby JP, Stone JV. Seeing, The Computational Approach to Biological Vision, 2nd Edn. Cambridge, MA: MIT Press (2010).

35. Laughlin SB, de Ruyter van Steveninck RR, Anderson JC. The metabolic cost of neural information. Nat Neurosci. (1998) 1:36–41. doi: 10.1038/236

Keywords: decision theory, Everett many-worlds, homeostasis, probability, Shannon's entropy

Citation: Wichert A and Moreira C (2016) Probabilities and Shannon's Entropy in the Everett Many-Worlds Theory. Front. Phys. 4:47. doi: 10.3389/fphy.2016.00047

Received: 23 September 2016; Accepted: 22 November 2016;
Published: 06 December 2016.

Edited by:

Lev Shchur, Landau Institute for Theoretical Physics, Russia

Reviewed by:

Ignazio Licata, Institute for Scientific Methodology, Italy
Arkady M. Satanin, N. I. Lobachevsky State University of Nizhny Novgorod, Russia

Copyright © 2016 Wichert and Moreira. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Andreas Wichert
