# COORDINATION AND COOPERATION IN COMPLEX ADAPTIVE SYSTEMS: THEORY AND APPLICATION

EDITED BY: Xiaojie Chen, Tatsuya Sasaki and Isamu Okada

PUBLISHED IN: Frontiers in Physics and Frontiers in Ecology and Evolution

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714; ISBN 978-2-88945-844-8; DOI 10.3389/978-2-88945-844-8

#### Topic Editors:

Xiaojie Chen, University of Electronic Science and Technology of China, China

Tatsuya Sasaki, F-Power Inc., Japan

Isamu Okada, Soka University, Japan

Image: Lightspring/Shutterstock.com

Over the past decade, a growing number of studies have addressed coordination and cooperation problems in complex adaptive systems. This Research Topic eBook brings together 14 papers by 39 authors, most of which present current research illustrating the depth and breadth of ongoing work on these problems. It thus provides a timely discussion of the hotspots and challenges in the study of coordination and cooperation in both theoretical models and applied systems.

Citation: Chen, X., Sasaki, T., Okada, I., eds. (2019). Coordination and Cooperation in Complex Adaptive Systems: Theory and Application. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-844-8

# Table of Contents

### SECTION 1

### EVOLUTIONARY DYNAMICS OF COOPERATION

*Evolution of Public Cooperation in a Risky Society With Heterogeneous Assets*

Linjie Liu and Xiaojie Chen

*A Theoretical Approach to Norm Ecosystems: Two Adaptive Architectures of Indirect Reciprocity Show Different Paths to the Evolution of Cooperation*

Satoshi Uchida, Hitoshi Yamamoto, Isamu Okada and Tatsuya Sasaki

### SECTION 2

### EVOLUTIONARY DYNAMICS OF SOCIETIES

*Evolution of Human-Like Social Grooming Strategies Regarding Richness and Group Size*

Masanori Takano and Genki Ichinose

### SECTION 3

### EXPERIMENTS

Masahiko Higashi, Reiji Suzuki and Takaya Arita

### SECTION 4

### APPLICATIONS

*Agent-Based Self-Service Technology Adoption Model for Air-Travelers: Exploring Best Operational Practices*

Keiichi Ueda and Setsuya Kurahashi

Shin-Ichiro Kumamoto and Takashi Kamihigashi

*Sociophysics Analysis of the Dynamics of Peoples' Interests in Society*

Akira Ishii and Yasuko Kawahata

# Evolution of Public Cooperation in a Risky Society with Heterogeneous Assets

#### Linjie Liu and Xiaojie Chen\*

School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu, China

The phenomenon of asset heterogeneity is widespread in human society. However, it is unclear what role heterogeneous assets play in the evolution of cooperation in a collective-risk society. In this paper, we therefore introduce asset heterogeneity into a threshold public goods game with collective risk, and we divide the population into the rich and the poor according to individual assets. We show that asset heterogeneity hinders public cooperation regardless of whether the temptation to defect is high or low. We find that cooperation collapses under low risk, a large gap between the rich and the poor, and a high threshold. Moreover, increasing individual assets can significantly enhance the level of public cooperation even when the conditions for the evolution of cooperation are harsh. Our work contributes to a better understanding of the emergence of cooperation in a risky society with heterogeneous assets.

#### Edited by:

Matjaž Perc, University of Maribor, Slovenia

#### Reviewed by:

Zhen Wang, Hong Kong Baptist University, Hong Kong

Jun Tanimoto, Kyushu University, Japan

> \*Correspondence: Xiaojie Chen xiaojiechen@uestc.edu.cn

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 30 September 2017 Accepted: 07 December 2017 Published: 04 January 2018

#### Citation:

Liu L and Chen X (2018) Evolution of Public Cooperation in a Risky Society with Heterogeneous Assets. Front. Phys. 5:67. doi: 10.3389/fphy.2017.00067

Keywords: asset heterogeneity, collective-risk, threshold public goods game, individual assets, public cooperation

### 1. INTRODUCTION

The emergence and maintenance of cooperative behavior is fundamental for a society to thrive [1–17]. However, cooperation is often threatened by selfish individuals who are concerned only with short-term interests [18–20]. Not surprisingly, if every individual aims to maximize their own fitness regardless of the consequences for the whole population, a dilemma of cooperation arises [21–28]. One typical dilemma underlying the tragedy of the commons is described by the public goods game (PGG) [29–35]. In the PGG, an individual obtains a higher payoff by contributing nothing, no matter what the other players do. Rational players therefore have no incentive to contribute; instead, they choose to free ride on the benefits produced by others. Although the PGG illustrates that defection is the evolutionarily stable strategy and that cooperators are prone to be exploited, abundant examples of altruistic behavior exist in animal and human societies [36–39].

To resolve this inconsistency, the PGG model has been extended by adding the risk of a collective failure to ensure the emergence of cooperative behavior [40–44]. In addition, several mechanisms have been proposed in the past decades for supporting the emergence of public cooperation [45–65].

However, these works treated all individuals as equivalent in all respects, in sharp contrast with real-life situations, in which diversity is ubiquitous. Indeed, modern societies are grounded in great diversity, with individuals playing radically different roles depending on their social positions [66–79]. Recently, such heterogeneity has attracted considerable attention. For example, one study suggested that resource heterogeneity may enable cooperators to spread and persist if the temptation to defect is not too large [80]. Other studies assumed that players participate in the PGG with different wealth distributions [70, 81, 82]. More specifically, Wang et al. [70] showed that participants with lower initial wealth may choose to cooperate only if all the rich are cooperators. Subsequently, Vasconcelos et al. [82] studied the evolution of cooperation in two different scenarios, namely, with and without wealth inequality, and showed that the former leads to more global cooperation than the latter.

Interestingly, previous studies involving wealth inequality typically assume that individuals are endowed with dichotomous initial wealth before participating in the PGG [32, 70, 82]. In the real world, however, acquired wealth, such as wage earnings, can only be regarded as one part of personal assets. The implications of heterogeneous assets for cooperation have so far remained unexplored. Since uneven distributions of personal assets are ubiquitous, it remains unclear how evolutionarily stable levels of cooperation are influenced by asset heterogeneity.

In this study, we thus introduce asset heterogeneity in a threshold public goods game (TPGG) with collective risk to investigate how cooperation evolves. Specifically, we first explore the impact of asset heterogeneity on social cooperation in the conditions of low and high temptation to defect, and find that asset heterogeneity can hinder cooperation no matter whether the temptation to defect is high or low. Then we study the role of increased asset values in social cooperation at the same asset heterogeneity level, and observe that the gradual increase of assets significantly promotes the emergence of cooperative behavior. Finally, we verify how social cooperation depends on other important parameters, such as risk, threshold, and the proportion of the poor.

### 2. MODEL AND METHOD

We consider the collective-risk dilemma game in a well-mixed population. We divide the individuals into the poor and the rich, where the fraction of the poor in the population is p. We assume that each rich individual has an initial asset a<sub>r</sub> and each poor individual has an initial asset a<sub>p</sub> (a<sub>r</sub> > a<sub>p</sub>). Each individual y either pays a cost c as a cooperator with strategy s<sub>y</sub> = 1 or pays nothing as a defector with strategy s<sub>y</sub> = 0. Denote the proportions of rich cooperators, poor cooperators, rich defectors, and poor defectors by x<sub>r</sub>, x<sub>p</sub>, y<sub>r</sub>, and y<sub>p</sub>, respectively. Then x<sub>r</sub> + y<sub>r</sub> = 1 − p and x<sub>p</sub> + y<sub>p</sub> = p. The collective target is reached if the number of individuals who choose to contribute to the common pool reaches the threshold T. In that case each individual gains the benefit b, so that the payoff is p<sub>y</sub> = b − cs<sub>y</sub>. However, if the collective target is not reached, all the individuals within the group lose their investment and their assets with probability r. Accordingly, the payoff of individual y with strategy s<sub>y</sub> in a group with i cooperators can be written as:

$$\begin{aligned} p_y &= b\,\theta(i - T) + b(1 - r)[1 - \theta(i - T)] - a_p r[1 - \theta(i - T)]\varphi \\ &\quad - a_r r[1 - \theta(i - T)](1 - \varphi) - c s_y, \end{aligned}$$

where θ(u) = 0 if u < 0 and θ(u) = 1 otherwise. The indicator ϕ selects which asset is at stake: since ϕ multiplies the a<sub>p</sub> term and 1 − ϕ multiplies the a<sub>r</sub> term, ϕ = 1 denotes that the participant is poor, and ϕ = 0 indicates that the participant is rich.
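The payoff rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names, the boolean flag `is_rich` (used here in place of the indicator ϕ to pick the asset at stake), and the value b = 1 are assumptions made for this sketch; the other values are taken from those quoted in the figure captions.

```python
def theta(u):
    """Step function: 0 if u < 0, 1 otherwise."""
    return 0 if u < 0 else 1


def payoff(i, s, is_rich, T, b, c, r, a_r, a_p):
    """Payoff of one individual in a group containing i cooperators.

    s is the individual's own strategy (1 = cooperate, 0 = defect) and
    is_rich selects which asset (a_r or a_p) is at stake on failure.
    """
    reached = theta(i - T)
    asset = a_r if is_rich else a_p
    return (b * reached
            + b * (1 - r) * (1 - reached)
            - asset * r * (1 - reached)
            - c * s)


# Illustrative values; b = 1 is an assumption made for this sketch.
params = dict(T=3, b=1.0, c=0.1, r=0.5, a_r=2.0, a_p=0.2)
success_cooperator = payoff(i=3, s=1, is_rich=True, **params)
failure_poor_defector = payoff(i=2, s=0, is_rich=False, **params)
failure_rich_cooperator = payoff(i=2, s=1, is_rich=True, **params)
```

With these values, a cooperator in a successful group receives b − c, while in a failed group a rich cooperator fares worse than a poor defector because the larger asset a<sub>r</sub> is at stake.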

We further apply a replicator system for the dynamic analysis, based on preferentially imitating strategies of the more successful individuals [83–86]. Unless otherwise specified, problem formulation and modeling are presented in Supplementary Material S1. Results are proved analytically in Supplementary Materials S2, S3.
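Supplementary Material S1 is not reproduced here, so the exact replicator equations are not shown. As a rough sketch of how a gradient of selection can be computed for a model of this kind, the Python code below assumes (i) groups of size N sampled at random from an infinite well-mixed population, so that the number of cooperators among the other N − 1 group members is binomially distributed, and (ii) imitation within each wealth class only. Both assumptions, all function names, and the choice b = 1 are illustrative and may differ from the authors' formulation.

```python
from math import comb


def payoff(i, s, asset, T, b, c, r):
    """Payoff with i cooperators in the group, own strategy s, own asset."""
    if i >= T:
        return b - c * s
    return b * (1 - r) - asset * r - c * s


def expected_payoff(x, s, asset, N, T, b, c, r):
    """Average payoff when each co-player cooperates with probability x."""
    total = 0.0
    for k in range(N):  # k cooperators among the N - 1 co-players
        prob = comb(N - 1, k) * x**k * (1 - x)**(N - 1 - k)
        total += prob * payoff(k + s, s, asset, T, b, c, r)
    return total


def gradient(x_r, x_p, p, N, T, b, c, r, a_r, a_p):
    """Assumed within-class replicator gradient (dx_r/dt, dx_p/dt), 0 < p < 1."""
    x = x_r + x_p  # overall fraction of cooperators
    y_r, y_p = (1 - p) - x_r, p - x_p
    gain_r = (expected_payoff(x, 1, a_r, N, T, b, c, r)
              - expected_payoff(x, 0, a_r, N, T, b, c, r))
    gain_p = (expected_payoff(x, 1, a_p, N, T, b, c, r)
              - expected_payoff(x, 0, a_p, N, T, b, c, r))
    return x_r * y_r * gain_r / (1 - p), x_p * y_p * gain_p / p


# Illustrative parameters (b = 1 assumed).
dx_r, dx_p = gradient(x_r=0.15, x_p=0.35, p=0.7, N=6, T=3,
                      b=1.0, c=0.1, r=0.5, a_r=2.0, a_p=0.2)
```

Under these assumptions the gain from cooperating reduces to r(b + a) times the probability that exactly T − 1 co-players cooperate, minus c, so individuals with larger assets have a stronger incentive to contribute.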

### 3. RESULTS

We begin by showing the stationary distribution and the gradient of selection for different values of the asset heterogeneity a<sub>p</sub>/a<sub>r</sub> and of the asset a<sub>r</sub>. As shown in **Figure 1**, for low a<sub>r</sub> (for example, a<sub>r</sub> = 2), when the gap between the rich and the poor is relatively large, there are nine fixed points but only two are stable (**Figure 1A**); the stability analysis of the equilibria can be found in Supplementary Material S3.2.2(9). We find that the basin of attraction of the stable equilibrium in which most of the poor and all the rich are cooperators is larger than that of the other stable point, which denotes full defection. As a<sub>p</sub>/a<sub>r</sub> increases, the upper stable fixed point moves toward full cooperation and the basin of attraction of full defection rapidly shrinks to nearly zero (see **Figures 1A–C**). For intermediate a<sub>r</sub> (for example, a<sub>r</sub> = 10), we find that the tendency of individuals to choose defection weakens as the gap between the rich and the poor shrinks (see **Figures 1D–F**). For even larger a<sub>r</sub> (for example, a<sub>r</sub> = 50), individuals, whether rich or poor, face a higher expected loss than the cost of cooperation (**Figures 1G–I**). In particular, very few individuals choose to defect when the gap between the rich and the poor is small (**Figure 1I**); the specific theoretical analysis can be found in Supplementary Material S3.2.2(10).

We then explore the effect of asset heterogeneity on cooperation when the temptation to defect is high. In **Figure 2**, we find that the main conclusions drawn from **Figure 1** are unchanged. Concretely, an increase in a<sub>p</sub>/a<sub>r</sub> encourages the poor to contribute to the common pool even when personal assets are very low. In addition, the proportion of cooperators increases with personal assets, regardless of whether the gap between the rich and the poor is large or small. More importantly, the inhibitory effect of asset heterogeneity on cooperative behavior persists.

In what follows, we show that public cooperation can be destroyed when a large gap between the rich and the poor and a relatively high threshold T are combined with a low value of r. From **Figure 3** we can see that there is only one stable point, which represents full defection (a more detailed analysis of the equilibria is presented in Supplementary Materials S2, S3.2.1(3)). In this case, low risk causes individuals to worry less about losing all their assets when the target is not reached. In addition, the large gap between the rich and the poor makes the poor reluctant to contribute. Moreover, the rich are no longer willing to cooperate if they need to meet a relatively high target.

*Figure 1 caption (beginning not recovered):* most likely direction of evolution. For each arrow, we use a continuous color bar associated with the likelihood of such a transition (red lines denote the highest speed of transition while purple lines represent the lowest speed of transition). The initial assets for the rich and the poor individuals are (A) a<sub>r</sub> = 2 and a<sub>p</sub> = 0.2; (B) a<sub>r</sub> = 2 and a<sub>p</sub> = 1; (C) a<sub>r</sub> = 2 and a<sub>p</sub> = 1.8; (D) a<sub>r</sub> = 10 and a<sub>p</sub> = 1; (E) a<sub>r</sub> = 10 and a<sub>p</sub> = 5; (F) a<sub>r</sub> = 10 and a<sub>p</sub> = 9; (G) a<sub>r</sub> = 50 and a<sub>p</sub> = 5; (H) a<sub>r</sub> = 50 and a<sub>p</sub> = 25; (I) a<sub>r</sub> = 50 and a<sub>p</sub> = 45. Other parameter values are N = 6, T = 3, r = 0.5, p = 0.7, and c/b = 0.1.

In **Figure 3** we mainly study the effect of a relatively high threshold on cooperation under specific conditions. However, it remains of interest to show how different combinations of threshold and asset heterogeneity affect the stationary distribution. As shown in **Figure 4**, for a low value of T (top row), the system can converge to the state where all the rich and nearly half of the poor choose to contribute when the gap between the rich and the poor is large (**Figure 4A**); for more details see Supplementary Materials S2, S3.1(8). Moreover, we find that the proportion of poor cooperators increases with a<sub>p</sub>/a<sub>r</sub> (see **Figures 4A–C**). When T takes an intermediate value (second row), the basin of attraction of the full-defection state grows with increasing T. In particular, when T is sufficiently large (third row) and a<sub>p</sub>/a<sub>r</sub> is low, there are three stable fixed points; the newly added one, located at the top left, represents the state in which all the rich are cooperators but poor cooperators cannot survive (see **Figure 4G** and Supplementary Material S3.2.1(9)). This stable equilibrium disappears when we increase the value of a<sub>p</sub>/a<sub>r</sub>.

Furthermore, we investigate how the risk value influences the stationary fraction of cooperators at an intermediate threshold value, as shown in **Figure 5**. We find that for a relatively small a<sub>p</sub>/a<sub>r</sub> (for example, a<sub>p</sub>/a<sub>r</sub> = 0.1), poor cooperators cannot survive when r is low (see **Figure 5A** and Supplementary Material S3.2.2(3)). In this case, the expected loss for the poor is less than the cost of cooperation. This adverse situation is reversed if we raise the risk r (see **Figure 5D** and Supplementary Material S3.2.2(9)). More specifically, a higher risk moves the upper stable point toward full cooperation (see **Figure 5G** and Supplementary Material S3.2.2(10)). In addition, the effect of asset heterogeneity on cooperation is consistent with our conclusion above, namely, narrowing the gap between the rich and the poor can promote public cooperation (see **Figures 5G–I**).
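The comparison between expected loss and cost of cooperation can be made concrete with the payoff defined in Section 2. As a hedged illustration (the symbol a stands for either a<sub>r</sub> or a<sub>p</sub>, and the "pivotal individual" framing is an interpretation offered here, not taken from the Supplementary Material), consider an individual whose contribution decides whether the threshold is met, i.e., exactly T − 1 of the other group members cooperate:

$$\begin{aligned} \text{cooperate:}\quad & p_y = b - c, \\ \text{defect:}\quad & p_y = b(1 - r) - a r, \\ \text{cooperation pays if}\quad & c < r(a + b). \end{aligned}$$

For fixed b and c, this condition is harder to satisfy when r or a is small, which is in line with the poor (small a<sub>p</sub>) ceasing to cooperate at low risk.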

*Figure caption fragment (figure not reproduced):* N = 6, T = 3, r = 0.5, p = 0.7, and c/b = 0.5.

To study how the fraction of cooperators depends on the proportion of the poor p, **Figure 6** shows the stationary distribution of cooperators as a function of p at r = 0.5 and T = 3 for three different values of a<sub>p</sub>/a<sub>r</sub>. For low p (top row), all the poor choose to free ride even when the gap between the rich and the poor is very small (see **Figures 6A–C**). Moreover, not all the rich are willing to contribute, which means that free-riders exist among the rich if they constitute the vast majority of the group. For an intermediate value of p (second row), poor cooperators can survive, and as a<sub>p</sub>/a<sub>r</sub> increases, the proportion of poor cooperators increases as well [more details can be found in Supplementary Material S3.1(10)]. For much larger p (third row), the stable point in the upper left corner disappears when a<sub>p</sub>/a<sub>r</sub> is sufficiently high [see **Figures 6G–I** and Supplementary Material S3.2.2(6) and (8)].

As also shown in **Figure 6**, the proportion of the poor p acts as an important factor in supporting cooperation. More specifically, when p is particularly small, changing the asset heterogeneity has no effect on cooperation. When the proportions of the poor and the rich in the group are equal, poor cooperators can survive. At the same time, the region of attraction of full defection expands slightly in comparison with a smaller p. When p increases further, so that the poor account for 90 percent of the population, the contributions from the rich are far from meeting the target. To avoid losing their assets, the majority of the poor then contribute to the common pool. In addition, narrowing the gap between the rich and the poor can effectively reduce the occurrence of defection as long as the proportion of the poor is not too small.

### 4. DISCUSSION

We have introduced asset heterogeneity into the collective-risk social dilemma game and studied its effects on the evolution of public cooperation. We have been motivated by the fact that an uneven distribution of personal assets is common in human societies, as well as by the fact that recent research on a similar variant of the collective-risk social dilemma game in a well-mixed population has shown that heterogeneous wealth distributions can affect public cooperation [70]. By considering personal assets rather than wealth, we mainly investigate the effects of asset heterogeneity on cooperation. Our research reveals that asset heterogeneity hinders cooperation regardless of whether the temptation to defect is high or low. In addition, four important parameters have been considered in our work, namely, personal assets, threshold, risk, and the proportion of the poor. Specifically, we have shown that increases in personal assets and in risk can both significantly promote social cooperation [43, 44]. Furthermore, the cooperation level increases with the proportion of the poor. However, a small number of the rich are no longer willing to contribute when the rich make up a large proportion of the population. Our model also shows an interesting phenomenon: an increase in the threshold can increase the proportion of poor cooperators. However, under some special conditions, a higher threshold can destroy cooperation.

*Figure caption fragment (figure not reproduced):* 1.8 (right column). Other parameters are p = 0.7, N = 6, and c/b = 0.1.

The temptation to defect has been seen as a key factor in exploring the effect of heterogeneity on cooperation in recent years [80, 87, 88]. Kun and Dieckmann [80] revealed that resource heterogeneity decreases the level of cooperation when the temptation to defect is significantly lowered, whereas otherwise heterogeneity facilitates the maintenance of cooperation. Unlike that study, our model introduces a threshold and the risk of collective failure into the public goods game, and shows that asset heterogeneity can hinder cooperation regardless of whether the temptation to defect is high or low (see **Figures 1**, **2**).

It is also worth noting that an increase in the threshold has a two-sided impact on public cooperation. On the one hand, a higher threshold enlarges the region of attraction of full defection. On the other hand, it raises the proportion of poor cooperators (see **Figure 4**). In addition, social cooperation collapses under low risk, a large poverty gap, and a high threshold (see **Figure 3**). Recently, the effects of the threshold value have been studied theoretically and experimentally [72, 82, 89]. Vasconcelos et al. [82], for instance, verified that threshold uncertainty has a disruptive effect on cooperation when all individuals in the group are equivalent, but they neglected the presence of wealth inequality. Our model shows that, under specific conditions, a larger target value will destroy cooperation in a risky society with heterogeneous assets.

As noted earlier, our model is partly inspired by realistic situations, and it is relatively straightforward to come up with examples where it could apply. One widely considered example is the problem of climate change. The Paris climate agreement aims at holding global warming to well below 2°C and to "pursue efforts" to limit it to 1.5°C [90]. To accomplish this, countries, whether developed or developing, have submitted national plans that spell out their intentions for addressing the climate change challenge. Targets and actions for reducing greenhouse gas (GHG) emissions are core components of these plans [91, 92]. Therefore, it is of great importance for countries to set measurable emission reduction targets. Moreover, since action by all countries is effective in averting climate catastrophes, it is also a challenge for policy makers to enhance the level of cooperation among different countries. Our research may contribute to a better understanding of the emergence of cooperative behavior in a risky society with heterogeneous assets, and may thus provide some insights into how to address the climate change problem in a world that includes both developed and developing countries.

### AUTHOR CONTRIBUTIONS

LL performed the research. LL and XC designed the research and wrote the paper.

### FUNDING

This research was supported by the National Natural Science Foundation of China (Grant No. 61503062).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphy.2017.00067/full#supplementary-material

### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Liu and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Theoretical Approach to Norm Ecosystems: Two Adaptive Architectures of Indirect Reciprocity Show Different Paths to the Evolution of Cooperation

Satoshi Uchida<sup>1</sup>\*, Hitoshi Yamamoto<sup>2</sup>, Isamu Okada<sup>3</sup> and Tatsuya Sasaki<sup>4,5</sup>

<sup>1</sup> Research Center for Ethi-Culture Studies, RINRI Institute, Tokyo, Japan, <sup>2</sup> Faculty of Business Administration, Rissho University, Tokyo, Japan, <sup>3</sup> Faculty of Business Administration, Soka University, Tokyo, Japan, <sup>4</sup> Faculty of Mathematics, University of Vienna, Vienna, Austria, <sup>5</sup> F-Power Inc., Tokyo, Japan

#### Edited by:

Víctor M. Eguíluz, Instituto de Física Interdisciplinar y Sistemas Complejos (IFISC), Spain

#### Reviewed by:

Kunal Bhattacharya, Aalto University, Finland

Matjaž Perc, University of Maribor, Slovenia

> \*Correspondence: Satoshi Uchida s-uchida@rirni-jpn.or.jp

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 01 October 2017 Accepted: 05 February 2018 Published: 20 February 2018

#### Citation:

Uchida S, Yamamoto H, Okada I and Sasaki T (2018) A Theoretical Approach to Norm Ecosystems: Two Adaptive Architectures of Indirect Reciprocity Show Different Paths to the Evolution of Cooperation. Front. Phys. 6:14. doi: 10.3389/fphy.2018.00014

Indirect reciprocity is one of the basic mechanisms for sustaining mutual cooperation, by which beneficial acts are returned, not by the recipient, but by third parties. This mechanism relies on the ability of individuals to know the past actions of others, and to assess those actions. There are many different systems of assessing others, which can be interpreted as rudimentary social norms (i.e., views on what is "good" or "bad"). In this paper, we investigate the impact on indirect reciprocity of different adaptive architectures, i.e., ways for individuals to adapt to environments. We examine two representative architectures: one based on replicator dynamics and the other on a genetic algorithm. Unlike the replicator dynamics, the genetic algorithm requires describing the mixture of all possible norms in the norm space under consideration. Therefore, we also propose an analytic method to study norm ecosystems in which all possible second-order social norms potentially exist and compete. The analysis reveals that the different adaptive architectures show different paths to the evolution of cooperation. In particular, we find that so-called Stern-Judging, one of the best studied norms in the literature, behaves differently in the two architectures. On the one hand, in the replicator dynamics, Stern-Judging survives and steadily holds a majority once the population reaches a cooperative state. On the other hand, in the genetic algorithm, it gains a majority only temporarily and becomes extinct in the end.

Keywords: evolutionary game theory, evolution of cooperation, indirect reciprocity, social norms, ecosystems, adaptive systems

### INTRODUCTION

Cooperative relationships such as I-help-you-because-you-help-me relations can often be found in both biological systems and human societies. Cooperative behaviors are obviously essential to make societies effective and smooth. However, evolutionary biologists and social scientists have long been puzzled about the origin of cooperation. Recently, scientists from a variety of fields such as economics, mathematics and physics have been tackling the puzzle using tools developed in each discipline.

According to a recent thorough review written from the viewpoint of statistical physics [1], physicists have made numerous contributions to this area over the past decade. In those studies, diverse methods developed in statistical physics for handling many interacting particles are used to investigate interactions of biological and social elements.

Following the context of the physics literature, in this paper we deal with interactions of "social norms." Social norms are interpreted as views on what is "good" or "bad" and play an essential role in indirect reciprocity based on reputation systems. Indirect reciprocity is known as one of the main mechanisms for the emergence of cooperation. It has a long history and has been amply documented in human populations [2–12]. One feature of indirect reciprocity is that helpful acts are returned, not by the recipient as in direct reciprocity, but by third parties [13–15]. Deciding whether to help therefore requires information on others, who may be recipients in the future.

As mentioned in Nowak and Sigmund [16], there are two main motivations for investigating indirect reciprocity. One concerns the evolution of human communities: how can cooperation emerge in villages and small-scale societies (see, for instance, [17, 18])? The other motivation is related to the recent rapid growth of anonymous interactions on a global scale, made possible by the spread of communication networks on the internet: how can cheating be avoided in on-line trading [19]? In both cases, simple, robust methods for assessing others, i.e., social norms, are necessary.

A large body of work on indirect reciprocity in the framework of evolutionary game theory has discovered various types of norms or assessment rules that enhance the evolution of cooperation in modern society with highly mobile interactions. Theoretically, assuming that the same norm is adopted by all members of a population, Ohtsuki and Iwasa have shown that only eight out of the 4,096 possible norms lead to a stable regime of mutual cooperation. These are called the "leading eight" [20, 21]. In this context, "stable" means that the corresponding population cannot be invaded by other action rules. However, this does not settle the question of whether the focal norm can be invaded by other norms (i.e., assessment rules).

Many theoretical studies have also considered another stability criterion. Those studies focus on whether the corresponding population can resist invasion by, or can itself invade, unconditional strategies such as perfect cooperators and perfect defectors [21–24]. Clearly, these previous studies do not allow us to fully compare different norms either.

If one wants to analyze the evolution of even the simplest system of morals, one has to consider the interaction of several assessment rules in a population. Some studies address this theme. For example, comparing Simple-Standing with Stern-Judging, both members of the leading eight, is an important step toward identifying a champion among the assessment rules that use second-order information. Uchida and Sigmund [25] analyzed the competition between these two rudimentary norms and established significant findings.

Despite the theoretical developments of Uchida and Sigmund [25] in analyzing multiple rules, their approach cannot describe a mixture of more than a few rules. Real society, however, comprises a melting pot of various norms that interact with each other. Therefore an imperative next step in the study of indirect reciprocity is to develop an analytical tool that can deal with "norm ecosystems" in which more than a few norms coexist, interact and compete. Although some insights have been derived in a study using individual-based simulations [26], a new theoretical approach may capture the co-evolution of diverse norms in more detail.

Therefore the main focus of the present paper is on developing a systematic analytical methodology with which the entanglement of all sixteen norms using second-order information can be formulated as an equation system. Extending the methodology proposed by Uchida and Sigmund [25], we see that the key problem, i.e., determining the average payoff of each norm surrounded by other norms in order to determine its fitness, comes down to a linear problem (i.e., the task of solving an inhomogeneous linear equation system). Thus it is computationally feasible to calculate the payoffs even when dealing with a mixture of many norms. Uchida and Sigmund [25] treated a special case of the linear problem analyzed here.

This development is useful not only for the rigorous analysis of norm ecosystems but also for comparing different "adaptive architectures." Here an adaptive architecture means a way for individuals to adapt to their environments. In this paper, we take up two representative architectures, replicator dynamics and the genetic algorithm. Although these architectures are popular in the literature, they have been studied independently in different domains, and they have not yet been compared within the framework of evolutionary game theory because no technical method has been available to capture all strategies in a norm space at once, as the study of the genetic algorithm requires. Our approach offers a first opportunity to compare replicator dynamics and the genetic algorithm theoretically in evolutionary game theory.

The analysis reveals that the two representative adaptive architectures show different paths to the evolution of cooperation. We find that Stern-Judging, one of the best studied norms in the literature, plays important but different roles in both cases [25, 27, 28]. In the replicator dynamics, Stern-Judging remains alive and gets a majority whenever the population reaches a cooperative state. On the other hand, in the genetic algorithm, it gets a majority just before cooperation rate starts rising but becomes extinct after the cooperation has been accomplished.

In the next section, we describe the model ecosystem, derive the equation to analyze it and introduce the adaptive architectures. Then we present the results and discuss them.

#### MATERIALS AND METHODS

### Game, Norm, and Payoff

An infinitely large, well-mixed population of individuals (or players) is considered. From time to time, one potential donor and one potential recipient are chosen at random from the population and they engage in a donation game: the donor decides whether to help the recipient at a personal cost c. If the donor chooses to help, the recipient receives a benefit b > c; otherwise the recipient obtains nothing. Each individual in the population faces such decisions many times, both as a donor and as a recipient [29–31]. From here on, we denote the action "help" by "1" and "refuse" by "0."

Individuals in the population have the ability to observe and assess others following their assessment rules (or social norms). Here "assess" means that players label other players "good" or "bad" according to their actions as a donor in their last interactions. The images of players are also denoted by "1" (for "good") and "0" (for "bad"). The assessment is done privately but the information needed for the assessment is so easily accessed that all individuals have the same information (on private information see for example [29–31]).

A donor determines whether or not to help the recipient depending on the donor's current image of the recipient (i.e., whether the recipient is labeled as 1 or 0). If the recipient is viewed as 1 in the eyes of the potential donor, the recipient will be given help; otherwise the recipient will not be offered help. Note that we do not assume any kind of error in the model because this is a first attempt to describe competition among all norms in the focal norm space (for the role of errors, see [32–34]). Moreover, we assume that all individuals are trustful and therefore initially good.

The social norms in the present research are at most of second order, i.e., they take the image of the recipient as well as the action of the donor into consideration. Denoting the action of the donor by $\alpha \in \{0, 1\}$ and the image of the recipient by $\beta \in \{0, 1\}$, the new image of the donor after the game from the viewpoint of some norm is a binary function of $\alpha$ and $\beta$: $\beta_{\mathrm{new}} = f(\alpha, \beta) \in \{0, 1\}$. Hence a second-order norm can be identified by a four-bit string $(f(1, 1), f(1, 0), f(0, 1), f(0, 0))$. There are 16 possible norms and we number them by defining $i = f(1, 1) \cdot 2^3 + f(1, 0) \cdot 2^2 + f(0, 1) \cdot 2^1 + f(0, 0) \cdot 2^0 + 1$. The 16 norms include some well-studied norms in the literature: the 9th norm (1000) is known as Shunning (SH), the 10th norm (1001) is called Stern-Judging (SJ), the 13th norm (1100) Image-Scoring (IS) (which is of first order) and the 14th norm (1101) Simple-Standing (SS). The first norm (0000) and the last one (1111) are unconditional norms and are called AD and AC, respectively.
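The numbering scheme can be checked with a short script. The following is only an illustrative sketch: the helper names `norm_index` and `assess` and the dictionary `NAMED` are assumptions introduced here, not part of the original analysis.

```python
from itertools import product

# Norms named in the text, keyed by their bit string (f(1,1), f(1,0), f(0,1), f(0,0)).
NAMED = {"0000": "AD", "1000": "SH", "1001": "SJ",
         "1100": "IS", "1101": "SS", "1111": "AC"}


def norm_index(bits):
    """Return the 1-based norm number i defined in the text."""
    f11, f10, f01, f00 = bits
    return 8 * f11 + 4 * f10 + 2 * f01 + f00 + 1


def assess(bits, action, recipient_image):
    """New image of the donor under norm `bits`, i.e. f(action, recipient_image)."""
    # Position 0 stores f(1,1), 1 stores f(1,0), 2 stores f(0,1), 3 stores f(0,0).
    return bits[2 * (1 - action) + (1 - recipient_image)]


# itertools.product enumerates the bit strings in increasing binary order,
# so the k-th entry (0-based) is norm number k + 1.
ALL_NORMS = list(product((0, 1), repeat=4))

for bits in ALL_NORMS:
    label = "".join(map(str, bits))
    print(norm_index(bits), label, NAMED.get(label, ""))
```

For example, `assess((1, 0, 0, 1), 0, 0)` evaluates Stern-Judging on a refusal toward a bad recipient and returns 1 (good), whereas Shunning, `(1, 0, 0, 0)`, returns 0 in the same situation.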

We denote by $x_i$ the frequency of individuals that follow social norm $i$ ($\sum_{i=1}^{16} x_i = 1$). Note that individuals using the same norm have the same opinion on others, since all individuals have the same information without errors.

As individuals play the game, the images of the individuals gradually change. At the equilibrium of images, the average payoff of individuals with norm $i$ depends on the frequencies of the other norms in the population and on how many individuals are good. The average payoff $P_i$ at the equilibrium of images is in fact given by

$$P_i = \sum_{j=1}^{16} x_j \left( s_{ji} b - s_{ij} c \right), \tag{1}$$

where $s_{ij}$ is the probability that a random player with norm $i$ has a good image of a random player with norm $j$. We call $(s_{ij}) \in [0, 1]^{16 \times 16}$ the "image matrix." Thus specifying the image matrix provides the average payoff with the frequencies $x_j$ fixed. The outline of the calculation for the image matrix is shown in the Results section (the full calculation is found in the Supplementary Material).
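As a minimal sketch of Equation (1), assume the image matrix is already available as a nested list `s`, with `s[i][j]` the probability that norm `i` regards a player of norm `j` as good (0-based indices). The function name `average_payoffs` and the toy two-norm example are illustrative assumptions only.

```python
def average_payoffs(s, x, b, c):
    """Average payoff P_i = sum_j x_j * (s[j][i] * b - s[i][j] * c).

    s[j][i] * b : benefit received when a donor of norm j sees i as good.
    s[i][j] * c : cost paid when i, as donor, sees a recipient of norm j as good.
    """
    n = len(x)
    return [
        sum(x[j] * (s[j][i] * b - s[i][j] * c) for j in range(n))
        for i in range(n)
    ]


# Toy example with only two norms (an assumption made for brevity):
# index 0 behaves like AD (sees everyone as bad), index 1 like AC (sees everyone as good).
toy_s = [[0.0, 0.0],
         [1.0, 1.0]]
toy_x = [0.5, 0.5]
print(average_payoffs(toy_s, toy_x, b=7.0, c=1.0))  # [3.5, 2.5]
```

In the toy example the defector-like norm earns more than the cooperator-like norm because it receives help from half of the population without ever paying the cost.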

#### Adaptive Architectures

The players adaptively switch their assessment rules, aiming at higher payoffs. We examine two different switching processes: adaptive changes due to social learning by imitation, described by the replicator dynamics, and changes of norms modeled by the genetic algorithm.

#### Replicator Dynamics

In the case of replicator dynamics, an individual occasionally has a chance to change its norm by imitating another individual (i.e., adopting its norm as a model). The probability that an individual (with norm $i$) is chosen as a model is proportional to the norm's frequency $x_i$ and that model's fitness $F_i = F + P_i$. Here, $F$ is a baseline fitness (the same for all) and will be set to $c$ in all simulations (we also normalize $F_i$ in simulations).

With some probability, an individual selects a norm totally at random and adopts that norm. This occurs due to mutation. The resulting dynamics is given by the replicator-mutation equation $\dot{x}_i = x_i \left( P_i - \bar{P} \right) + \mu \left( \frac{1}{16} - x_i \right)$, where $\bar{P} = \sum_{k=1}^{16} x_k P_k$ is the average payoff in the population (see [35]) and $\mu$ is a parameter that measures the strength of mutation. In fact, we use the discretized version of replicator dynamics to compare with the genetic algorithm: $x_i(t + dt) = x_i(t) + dt\, x_i(t) \left( P_i(t) - \bar{P}(t) \right) + dt\, \mu \left( \frac{1}{16} - x_i(t) \right)$.
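One update of the discretized replicator-mutation equation can be sketched as follows. This is a minimal illustration, assuming the payoffs have already been computed for the current frequencies; the function name `replicator_mutation_step` is hypothetical, and the sketch uses the raw payoffs as in the equation stated in the text rather than the normalized fitness used in the simulations.

```python
def replicator_mutation_step(x, payoffs, mu, dt):
    """Return x(t + dt) from x(t) for the discretized replicator-mutation equation."""
    n = len(x)  # 16 in the text; kept general here
    mean_payoff = sum(xk * pk for xk, pk in zip(x, payoffs))
    return [
        xi + dt * xi * (pi - mean_payoff) + dt * mu * (1.0 / n - xi)
        for xi, pi in zip(x, payoffs)
    ]


# Illustrative call with four hypothetical norms and made-up payoffs.
x_next = replicator_mutation_step([0.25] * 4, [1.0, 2.0, 3.0, 4.0], mu=0.01, dt=0.1)
print(x_next, sum(x_next))
```

Both the selection term and the mutation term sum to zero over all norms, so the frequencies remain normalized after each step as long as they were normalized before it.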

#### Genetic Algorithm

In the case of the genetic algorithm, an individual decomposes a norm into a collection of bits and changes its norm "bit-wise" by imitating the norms of two randomly selected individuals (called parents) [36]. Following [26], the probability that an individual with norm $i$ is selected as a parent is proportional to the norm's frequency $x_i$ and the square of the fitness of norm $i$ (rule 1).

After the parents have been chosen [now, the norms of the parents are $j = (a, b, c, d)$ and $k = (e, f, g, h)$, respectively], a uniform crossover is performed: the first bit of the child's norm is either $a$ or $e$ with the same probability, the second bit is $b$ or $f$, and so on. The uniform crossover generates the norms $(a, b, c, d), (a, b, c, h), \cdots, (e, f, g, d), (e, f, g, h)$ with the same probability, which is 1/16 (rule 2).

From rules 1 and 2, we can derive the probability that any norm $i = (p, q, r, s)$ is generated at the next generation (which we denote by $w_i$). However, due to mutation, a bit of the generated norm can be flipped with probability $\mu$. Note that $\mu$ in the RD and that in the GA have different meanings. We assume that at most one bit can be inverted because of the small mutation probability. Thus the probability that none of the 4 bits is flipped is $1 - 4\mu$. Therefore the probability that norm $i = (p, q, r, s)$ is actually generated is $v_i = (1 - 4\mu) w_i + \mu I$. Here $I$ is the total of the probabilities that the neighboring norms $(1 - p, q, r, s)$, $(p, 1 - q, r, s)$, $(p, q, 1 - r, s)$, $(p, q, r, 1 - s)$ are generated before mutation.

The frequency of norm $i$ at the next generation $t + dt$ is given by $x_i(t + dt) = (1 - dt) x_i(t) + dt\, v_i$, where $dt$ is the proportion of individuals that change their norms between the generations $t$ and $t + dt$.
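Rules 1 and 2 together with the mutation step can be sketched as a single update function. This is an illustration under stated assumptions: the function names `crossover_prob` and `ga_step` are hypothetical, and the fitness values are taken as a given list of positive numbers.

```python
from itertools import product

# Position k (0-based) holds norm number k + 1; bits are (f(1,1), f(1,0), f(0,1), f(0,0)).
NORMS = list(product((0, 1), repeat=4))


def crossover_prob(child, parent_a, parent_b):
    """Probability that uniform crossover of the two parents yields `child`."""
    prob = 1.0
    for bit, a, b in zip(child, parent_a, parent_b):
        # Each bit is copied from either parent with probability 1/2.
        prob *= 0.5 * (bit == a) + 0.5 * (bit == b)
    return prob


def ga_step(x, fitness, mu, dt):
    """Return x(t + dt) under the genetic algorithm with two parents."""
    # Rule 1: parents are drawn proportionally to frequency times squared fitness.
    weights = [xi * fi ** 2 for xi, fi in zip(x, fitness)]
    total = sum(weights)
    parent = [wt / total for wt in weights]
    # Rule 2: distribution w_i of the child's norm before mutation.
    w = [
        sum(parent[j] * parent[k] * crossover_prob(child, NORMS[j], NORMS[k])
            for j in range(16) for k in range(16))
        for child in NORMS
    ]
    # Mutation with at most one flipped bit: v_i = (1 - 4 mu) w_i + mu * I,
    # where I sums w over the four norms at Hamming distance 1 from i.
    v = [
        (1 - 4 * mu) * w[i] + mu * sum(w[i ^ (1 << pos)] for pos in range(4))
        for i in range(16)
    ]
    return [(1 - dt) * xi + dt * vi for xi, vi in zip(x, v)]
```

Because the list index of a norm equals the binary value of its bit string, flipping one bit corresponds to an XOR of the index with a power of two, which is how the four neighboring norms are located.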

#### Replicator Dynamics with Multiple Models

In addition to the ordinary adaptive architectures well-studied in the literature mentioned above, we consider two other adaptive architectures that are modified versions of the conventional replicator dynamics and the genetic algorithm, respectively. The first one is replicator dynamics with multiple models.

In this adaptive architecture, an individual learns each bit of its norm independently, possibly from different models. The probability that an individual having norm $j = (p, q, r, s)$ flips its first bit is proportional to the average fitness of the individuals that follow a norm whose first bit is $1 - p$. This fitness is given by $\sum_{V \in X_1} x_V F_V$. Here $X_1$ is the set of norms whose first bit is $1 - p$ [i.e., norms of the form $(1 - p, *, *, *)$] and $F_V$ is the normalized fitness of $V$. The sets $X_2$, $X_3$ and $X_4$ are defined analogously for the second, third and fourth bits. Then, for instance, the probability that an individual changes its norm from $j = (p, q, r, s)$ to $k = (1 - p, q, r, s)$ is given by $\left( \sum_{V \in X_1} x_V F_V \right) \left( 1 - \sum_{V \in X_2} x_V F_V \right) \left( 1 - \sum_{V \in X_3} x_V F_V \right) \left( 1 - \sum_{V \in X_4} x_V F_V \right)$. In this formula, for example, the second factor $1 - \sum_{V \in X_2} x_V F_V$ is the probability that the individual does not flip its second bit. By considering all possibilities, we can calculate the inflow to norm $j$ from norm $i$ ($w_{ij}$) and the outflow from $j$ to $i$ ($w_{ji}$). Then the rate of increase of norm $j$ is given by $\sum_i (x_i w_{ij} - x_j w_{ji})$.

As in ordinary replicator-mutation dynamics, we also include a mutation term in addition to the switching process described above. But here, we assume that mutation occurs "bit-wise" as assumed in the genetic algorithm. That is, by $\mu$, we denote the probability that each bit is flipped by mutation. Then the inflow to $j$ due to mutation is given by $\mu \sum_k x_k$, with $k$ running over the 4 neighboring norms of $j$ (i.e., the norms at Hamming distance 1 from $j$), and the outflow from $j$ by $4\mu x_j$.

The resulting dynamics is given by $\dot{x}_j = \sum_i (x_i w_{ij} - x_j w_{ji}) + \mu \left( \sum_k x_k - 4 x_j \right)$. Note that "$4\mu$" in this dynamics corresponds to "$\mu$" in the ordinary replicator-mutation dynamics. As for the other adaptive architectures, we use a discretized version of the dynamics.
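A possible discretization of these dynamics is sketched below. Two points are assumptions made only for this illustration: "normalized fitness" is read as a rescaling such that $\sum_V x_V F_V = 1$ (so that every flip probability lies between 0 and 1), and the discretization is a plain Euler step. The function name `multi_model_rd_step` is hypothetical.

```python
from itertools import product

NORMS = list(product((0, 1), repeat=4))  # position k holds norm number k + 1


def multi_model_rd_step(x, fitness, mu, dt):
    """One Euler step of replicator dynamics with multiple models (sketch)."""
    # Assumption: fitness is normalized so that sum_V x_V F_V = 1.
    mean_f = sum(xv * fv for xv, fv in zip(x, fitness))
    weight = [xv * fv / mean_f for xv, fv in zip(x, fitness)]
    # share[m][b]: total normalized weight of the norms whose m-th bit equals b.
    share = [
        [sum(wt for wt, bits in zip(weight, NORMS) if bits[m] == b) for b in (0, 1)]
        for m in range(4)
    ]

    def switch_prob(src, dst):
        """Probability that an individual with norm src switches to norm dst."""
        prob = 1.0
        for m in range(4):
            flip = share[m][1 - src[m]]  # weight of norms carrying the opposite bit
            prob *= flip if dst[m] != src[m] else 1.0 - flip
        return prob

    new_x = []
    for j, nj in enumerate(NORMS):
        # Net flow into j: sum_i (x_i w_ij - x_j w_ji); the i = j term cancels.
        flow = sum(
            x[i] * switch_prob(ni, nj) - x[j] * switch_prob(nj, ni)
            for i, ni in enumerate(NORMS) if i != j
        )
        # Bit-wise mutation: inflow from the 4 neighbors, outflow 4 * mu * x_j.
        neighbours = sum(x[j ^ (1 << pos)] for pos in range(4))
        new_x.append(x[j] + dt * (flow + mu * (neighbours - 4 * x[j])))
    return new_x
```

Under the normalization assumed here, the probability of keeping a bit equals the weight of the norms that share that bit, so each bit of the new norm is effectively drawn independently from the fitness-weighted population, which mirrors the bit-wise learning of the genetic algorithm.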

#### Genetic Algorithm with a Single Parent

The other one is genetic algorithm with a single parent. In this architecture, only one individual is chosen as the unique parent of an individual. Then the child copies the norm of the parent. That is, the child adopts the entire norm of the single parent. Mutation effects and the probability that an individual is chosen as a parent are calculated in the same way as in the ordinary genetic algorithm mentioned above.
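For comparison with the two-parent case, the single-parent variant can be sketched in a few lines. The function name `single_parent_ga_step` is hypothetical and the fitness values are again assumed to be given as positive numbers; without crossover, the pre-mutation distribution $w_i$ is simply the parent-selection probability.

```python
def single_parent_ga_step(x, fitness, mu, dt):
    """Return x(t + dt) when each child copies the whole norm of one parent."""
    # Parent selection as in the ordinary genetic algorithm:
    # proportional to frequency times squared fitness.
    weights = [xi * fi ** 2 for xi, fi in zip(x, fitness)]
    total = sum(weights)
    w = [wt / total for wt in weights]
    # Same one-bit mutation rule: v_i = (1 - 4 mu) w_i + mu * I, where the
    # neighbors of norm index i are found by flipping one of its four bits.
    v = [
        (1 - 4 * mu) * w[i] + mu * sum(w[i ^ (1 << pos)] for pos in range(4))
        for i in range(16)
    ]
    return [(1 - dt) * xi + dt * vi for xi, vi in zip(x, v)]
```

The only difference from the two-parent update is the absence of the crossover step, so norms are transmitted as indivisible units.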

All four corresponding evolution equations depend on expected payoffs. We assume that images are always at equilibrium at each time step of the evolution equations. Under this assumption in the next section, we derive the equation system to specify image matrices (thus expected payoffs of norms) and show time evolutions of norms based on the above mentioned adaptive architectures.

#### RESULTS

#### Image Matrix

Images of individuals change in time as well as frequencies of norms. But we assume that the time scale of the changes of images is much faster than that of norm changes. As a result, images are always at equilibrium and norm frequencies are treated as constant in estimating image matrices, as is assumed in the literature (See [37]).

To calculate the image matrix $s_{ij}$, we introduce the "image profile" $s^{(j)}_{e_1 e_2 e_3 \cdots e_{16}} \in [0, 1]^{2^{16}}$, which is the joint probability distribution of the images of a random player with norm $j$ from the viewpoints of all norms. Thus the value of $s^{(j)}_{e_1 e_2 e_3 \cdots e_{16}}$ is the probability that a random player with norm $j$ is assigned image $e_1 \in \{0, 1\}$ by the first norm, $e_2$ by the second norm, ..., and $e_{16}$ by the 16th norm. Note that, since the first norm is the unconditional AD, the probability that $e_1 = 1$ is zero. Thus $s^{(j)}_{e_1 = 1, e_2 e_3 \cdots e_{16}} = 0$. Similarly, $s^{(j)}_{e_1 e_2 e_3 \cdots e_{16} = 0} = 0$.

The image profile is a joint probability distribution and contains the finest probabilistic information about the system. For example, the image matrix is interpreted as the marginal distribution:

$$s_{ij} = \sum_{e_1 e_2 \cdots e_{i-1} e_{i+1} \cdots e_{16}} s^{(j)}_{e_1 e_2 \cdots \tilde{e}_i \cdots e_{16}} \tag{2}$$

with $\tilde{e}_i = 1$.

Now we define the joint distribution in the whole population by

$$R_{f_1 f_2 \cdots f_{16}} = \sum_{j=1}^{16} x_j s^{(j)}_{f_1 f_2 \cdots f_{16}}, \tag{3}$$

which gives the proportion of those individuals in the whole population who are assigned image $f_1$ from the viewpoint of the first norm, $f_2$ from the second norm, $f_3$ from the third norm and so on. Since $R_{f_1 f_2 \cdots f_{16}}$ is a probability distribution, there is a constraint on $R_{f_1 f_2 \cdots f_{16}}$:

$$\sum_{f_1 f_2 \cdots f_{16}} R_{f_1 f_2 \cdots f_{16}} = 1. \tag{4}$$

According to our analysis, it is possible to derive the equation system that yields the values of all image profiles $s^{(j)}_{e_1 e_2 \cdots e_{16}}$ (the joint probability distribution). More concretely, we can find an expression of $s^{(j)}_{e_1 e_2 \cdots e_{16}}$ as a linear function of $R_{f_1 f_2 \cdots f_{16}}$. Then, inserting those relations between $s^{(j)}_{e_1 e_2 \cdots e_{16}}$ and $R_{f_1 f_2 \cdots f_{16}}$ into Equations (3) and (4), we obtain an inhomogeneous linear equation system for $R_{f_1 f_2 \cdots f_{16}}$. Solving this equation system yields the values of $s^{(j)}_{e_1 e_2 \cdots e_{16}}$, because $s^{(j)}_{e_1 e_2 \cdots e_{16}}$ is expressed as a function of $R_{f_1 f_2 \cdots f_{16}}$. See the Supplementary Material for the details of the derivation.

We remark that the equation system with respect to $R_{f_1 f_2 \cdots f_{16}}$ includes $2^{16} - 1$ unknowns in principle, but the fact that the equation system contains some trivial variables, such as $R_{f_1 f_2 \cdots f_{16}} = 0$ with $f_1 = 1$ or $f_{16} = 0$, reduces the dimension of the equation system.

Moreover, the case where $f_{13} = f_{(1,1,0,0)} = 1$ indicates that action C (help) has been taken; here $f_{(a,b,c,d)}$ denotes the image assigned by the norm with bit string $(a, b, c, d)$. In this case, the following conditions must be satisfied: $f_{(1,1,0,1)} = f_{(1,1,1,0)} = 1$ and $f_{(0,0,1,1)} = f_{(0,0,0,1)} = f_{(0,0,1,0)} = 0$. Situations in which these conditions are violated never happen. For those situations, $R_{f_1 f_2 \cdots f_{16}} = 0$.

Similarly, if $f_{13} = f_{(1,1,0,0)} = 0$, which implies that action D (refuse) has been chosen, then $f_{(0,1,0,0)} = f_{(1,0,0,0)} = 0$ and $f_{(0,0,1,1)} = f_{(0,1,1,1)} = f_{(1,0,1,1)} = 1$. Therefore $R_{f_1 f_2 \cdots f_{16}} = 0$ for the situations in which these conditions are not satisfied.

As a result the dimension of the equation system reduces to $2^9 - 1$, which can be handled computationally.
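The count behind this reduction can be reproduced by brute-force enumeration. The sketch below is only a plausibility check under a simplifying assumption: the recipient's image profile is allowed to be any of the $2^{16}$ combinations, so the script counts the donor profiles that are compatible with the action-based constraints described above. The function name `feasible_profiles` is hypothetical.

```python
from itertools import product

# Bits are (f(1,1), f(1,0), f(0,1), f(0,0)); position k holds norm number k + 1.
NORMS = list(product((0, 1), repeat=4))


def feasible_profiles():
    """Collect all donor image profiles (f_1, ..., f_16) that some game can produce."""
    profiles = set()
    for action in (1, 0):  # 1 = help (C), 0 = refuse (D)
        # beta[n] is the recipient's image in the eyes of norm n + 1 (assumed free).
        for beta in product((0, 1), repeat=16):
            profile = tuple(
                bits[2 * (1 - action) + (1 - b)]  # f(action, b) for this norm
                for bits, b in zip(NORMS, beta)
            )
            profiles.add(profile)
    return profiles


profiles = feasible_profiles()
print(len(profiles))  # 512, i.e. 2**9
```

For each action, eight norms assign an image that is fixed by the action alone and the remaining eight depend on the recipient's image, which gives $2^8$ profiles per action and $2^9$ in total; normalization (Equation 4) then removes one unknown.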

Note that the solution depends on the frequencies of norms in the population. In **Figure 1**, we can compare an image matrix obtained by an individual-based simulation with an image matrix calculated by the above-mentioned method with all frequencies equal: $x_i = 1/16$. We see that the simulation and the analytic method generate closely matching results.
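The individual-based simulation described in the caption of Figure 1 can be sketched as follows. This is a minimal illustration, not the code used for the figure: the function name `simulate_image_matrix`, its parameters and the random seed are assumptions, and the sketch exploits the fact that all individuals sharing a norm hold identical opinions, so one image list per norm suffices.

```python
import random
from itertools import product

# Bits are (f(1,1), f(1,0), f(0,1), f(0,0)); position k holds norm number k + 1.
NORMS = list(product((0, 1), repeat=4))


def simulate_image_matrix(per_norm=200, games_per_agent=100, seed=0):
    """Estimate the image matrix s[i][j] by simulating donation games."""
    rng = random.Random(seed)
    n_agents = 16 * per_norm
    norm_of = [agent // per_norm for agent in range(n_agents)]
    # image[n][a]: image of agent a in the eyes of norm n; everyone starts good.
    image = [[1] * n_agents for _ in range(16)]
    for _ in range(n_agents * games_per_agent):
        donor, recipient = rng.sample(range(n_agents), 2)
        # The donor helps exactly when its own norm sees the recipient as good.
        action = image[norm_of[donor]][recipient]
        # Every norm reassesses the donor from the action and its own view
        # of the recipient; no errors are included, as in the text.
        for n, bits in enumerate(NORMS):
            beta = image[n][recipient]
            image[n][donor] = bits[2 * (1 - action) + (1 - beta)]
    # s[i][j]: fraction of agents with norm j that norm i regards as good.
    return [
        [sum(image[i][a] for a in range(j * per_norm, (j + 1) * per_norm)) / per_norm
         for j in range(16)]
        for i in range(16)
    ]
```

With the default arguments the run mirrors the setting in the caption (3,200 agents and 320,000 games); smaller values are convenient for a quick check, for instance `simulate_image_matrix(per_norm=20, games_per_agent=30)`.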

#### Time Evolutions of Norms

Frequencies of norms in a population change in time according to its adaptive architecture. The equations describing such changes depend on payoffs. Therefore, calculating image matrices by the above-mentioned method makes it possible to investigate the evolution of multiple norms caused by both switching processes, replicator dynamics and the genetic algorithm.

In **Figure 2A**, we show a typical pattern of time evolutions of norms produced by (ordinary) replicator-mutation dynamics for a case where cooperation is achieved. **Figure 2B** shows its initial part (the first 100 steps). Similarly in **Figure 3**, a time series produced by (ordinary) genetic algorithm (for a case where cooperation is reached) and its initial part (the first 30 steps) are displayed. We note that whether or not the population evolves to cooperation depends on initial conditions. It can happen that a population evolves into non-cooperative states in both architectures. In this paper, we discuss typical situations in which cooperation is achieved.

As **Figures 2B**, **3B** show, the initial parts of both architectures are similar, in that the cooperation rate declines at first as defective norms such as AD (blue solid line) and Shunning (SH; green dashed line) spread in the population. These norms then gradually decline and the frequency of Stern-Judging (SJ; red dashed line) rises instead. In parallel, the cooperation rate increases.

However, the long-term behavior of Stern-Judging differs between the two architectures. In replicator dynamics, Stern-Judging gets a majority after defective norms have disappeared and cooperation has been realized. This trend is stably preserved after the transition from noncooperative to cooperative states (**Figure 2A**). In the genetic algorithm, Stern-Judging gets a majority during the transition but becomes extinct once cooperation has been achieved.

Generally, from **Figure 3A**, we see that genetic algorithm prefers tolerant norms to strict norms in cooperative states. In fact, after cooperation has been established, AC (the 16th norm; green dotted line) gets a majority and the 15th norm (blue dotted line) is the second best, then the 14th (Simple-Standing = SS; gray dashed line) and the 13th (Image-Scoring = IS; yellow dashed line). The more tolerant a norm is, the higher the frequency of the norm becomes in the population.

But this is not true for replicator dynamics. In replicator dynamics, Stern-Judging (red dashed line) is the best, Simple-Standing (gray dashed line) is the second best and Image-Scoring (yellow dashed line) is the third. All these norms are well known in the literature. Note that in both architectures, Image-Scoring survives in the long run. This is a significant finding since, in the literature, Image-Scoring is known as an unstable strategy [32] and is not included in the leading eight [21].

**Figure 4A** shows a typical pattern of time evolutions of norms produced by replicator dynamics with multiple models for a case where cooperation is achieved. Its initial part (the first 500 steps) is shown in **Figure 4B**. We see that the evolutionary path is similar to that of ordinary genetic algorithm (**Figure 3**) rather than ordinary replicator dynamics (**Figure 2**). Conversely, a typical pattern of time evolutions of norms produced by genetic algorithms with a single parent (**Figures 5A,B**) is similar to that of ordinary replicator dynamics. Thus in **Figure 4**, Stern-Judging becomes extinct and in **Figure 5**, Stern-Judging gets a majority in the end.

### DISCUSSION

In the last section we found that the norm ecosystems based on different architectures show similarity and dissimilarity. Although the norm ecosystems investigated here are complex systems, their analyses enable us to gain deep understanding of a simple single norm. For instance, an unstable norm, Image-Scoring, evolves and survives in the melting pot of competing norms regardless of architectures individuals are based on. This insight cannot be obtained if we solely analyze the single norm.

The main difference between the two representative architectures (ordinary replicator dynamics and genetic algorithm) appears in the roles of Stern-Judging, whose local stability is well studied in the literature. The analysis revealed that Stern-Judging wins the competition against other norms and stays alive in ordinary replicator dynamics even after cooperation is achieved. That is, Stern-Judging is not only locally stable but can evolve from a mixture of diverse norms and gets a majority in the end, as long as ordinary replicator dynamics is assumed. In this sense, we say that Stern-Judging plays the role of a "leading" norm in the framework of replicator dynamics.

This norm also plays a vital role in the genetic algorithm since it gets a majority just before the cooperation rate starts rising. This occurs because Stern-Judging can defeat defective norms such as AD or Shunning and can increase its frequency in defective states. In other words, Stern-Judging kick-starts the evolution toward cooperation. In Yamamoto et al. [26], in which the genetic algorithm is adopted as an adaptive architecture, it is reported that cooperation cannot evolve without Stern-Judging. However, it is not a stable leading norm because it becomes extinct after cooperation has been achieved. Thus Stern-Judging takes the role of a "go-between" (between defective states and cooperative states) in the genetic algorithm.

FIGURE 1 | Image matrix $s_{ij}$ ($i, j = 1, \cdots, 16$) produced by (A) an individual-based simulation and (B) the analytical method described in the text. In order to generate (A), an individual-based simulation with 3,200 agents was run (each norm has 200 individuals). In the simulation, each individual plays the donation game as the donor 100 times on average with different randomly chosen recipients (i.e., 320,000 games in total). This number of games is large enough for the process to reach the equilibrium. After each game, all individuals, following their own norms, assess the donor and assign the label "1" or "0" to the individual. After 320,000 games, the number of individuals with norm $j$ of whom the individuals with norm $i$ have image "1" is counted and the number is divided by 200 (the total number of individuals with norm $j$) to obtain $s_{ij}$. The value of $s_{ij}$ is shown in gray scale, in which white corresponds to "1" and black to "0."

But why do these architectures show such different results? What is the essential difference between the two? In the genetic algorithm, individuals divide norms into smaller parts (bits) and learn the parts more or less independently (from their mother and father). So we can call the learning process "analytic." For individuals using the genetic algorithm, the first bit of a norm represents the pro-sociality of the norm, the second bit tolerance, the third anti-sociality and the fourth intolerance (i.e., punitive nature), and they imitate each of these aspects from their parents separately.

On the other hand, individuals based on (ordinary) replicator dynamics do not analyze norms into parts but treat norms as a whole. The learning process based on replicator dynamics can therefore be called "synthetic." And whether or not the adaptive architecture is analytic or synthetic has a large impact on the results.

In fact, we modified the genetic algorithm so that an individual learns how to assess others from only one parent (i.e., the norm is not divided into parts), and we obtained results similar to those of ordinary replicator dynamics. Moreover, we extended replicator dynamics so that an individual decomposes norms into four bits and imitates each part from different models. As a result, we found results similar to those of the ordinary genetic algorithm (with two parents). From these results, we can conclude that whether Stern-Judging can survive in the long run in cases where cooperation is achieved does not depend on the switching process (i.e., whether replicator dynamics is assumed or the genetic algorithm is used). Rather, it depends on whether norms are treated as a whole or "bit-wise" in the corresponding switching processes.

FIGURE 3 | [...] (the first 30 steps) (B) for a case where cooperation is achieved. Parameters: c = 1, b = 7, µ = 0.01, dt = 1.

FIGURE 4 | [...] (A) and its initial part (the first 500 steps) (B) for a case where cooperation is achieved. Parameters: c = 1, b = 7, µ = 0.005, dt = 0.2.

In spite of the findings mentioned so far, we have to remark that much remains to be studied. The model studied in this present research especially has many limitations, which offers some tasks for future research from physics perspectives. First of all, we omitted implementation errors in the model to simplify the analysis. Whether and how errors change the results is interesting and necessary research yet to be done.

Moreover, we assumed well-mixed populations in the analysis and ignored the effects of structured populations and group formation on the cooperative behavior of individuals. Interactions between the heterogeneity of populations and reciprocal behavior have recently been investigated from the viewpoint of physics. For example, Nax et al. [38] studied interactions among groups and found that the importance of Image-Scoring for the emergence of cooperation, relative to "group scoring," depends on the population size. Szolnoki et al. [39] introduced facilitators, a special type of player, on interaction networks and showed that the facilitators reveal the optimal interplay between information exchange and reciprocity. These studies provide evidence that structured populations in fact affect reciprocal functions. Conversely, some papers have shown that indirect reciprocity affects population structures. For instance, it is reported that indirect reciprocity can function as a boosting mechanism for group formation and in-group favoritism, which is another aspect of cooperation [40–43].

Another factor that is out of scope in this research is the imperfectness of information. From the players' viewpoint, although the same interaction can be interpreted differently by players with distinct norms, different individuals that share the same norm always have the same opinion since all individuals are based on the same information in the model. In the literature, the imperfectness of information has been studied in several ways [29, 30, 33, 44–46], and examining the effect of such imperfectness may lead us to understand the moral ecosystem more deeply. Obviously the present paper is just a first step toward theoretically investigating the competition and cooperation among multiple norms.

#### AUTHOR CONTRIBUTIONS

All authors conceived and designed the project. SU built and analyzed the model and wrote the manuscript. All authors discussed the results, helped draft and revise the manuscript, and approved the submission.

#### FUNDING

Part of this work was supported by JSPS (Grants-in-Aid for Scientific Research) 15KT0133 (HY), 16H03120 (HY), 17H02044 (HY), 16H03120 (IO), 26330387 (IO), 17H02044 (IO) and the Austrian Science Fund (FWF) P27018-G11 (TS).

#### ACKNOWLEDGMENTS

SU wishes to thank Voltaire Cang for his useful comments.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphy.2018.00014/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Uchida, Yamamoto, Okada and Sasaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evolutionary Dynamics of Coordinated Cooperation

#### Hisashi Ohtsuki\*

Department of Evolutionary Studies of Biosystems, School of Advanced Sciences, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan

In social evolution theory, unconditional cooperation has been seen as an evolutionarily unsuccessful strategy unless there is a mechanism that promotes positive assortment between like individuals. One such example is kin selection, where individuals sharing common ancestry and therefore having the same strategy are more likely to interact with each other. Conditional cooperation, on the other hand, can be successful if interactions with the same partners last long. In many previous models, it has been assumed that individuals act conditionally on the past behavior of others. Here I propose a new model of conditional cooperation, namely the model of coordinated cooperation. Coordinated cooperation means that there is a negotiation before an actual game is played, and that each individual can flexibly change their decision, either to cooperation or to defection, according to the number of those who show the intention of cooperation/defection. A notable feature of my model is that individuals play an actual game only once but can still use conditional strategies. Since such a negotiation is cognitively demanding, the target of my model here is exclusively human behavior. I have analyzed cultural evolutionary dynamics of conditional strategies in this framework. Results for an infinitely large population show that conditional cooperation not only works as a catalyst for the evolution of cooperation, but sustains a polymorphic attractor with unconditional cooperators, unconditional defectors, and conditional cooperators being present. A finite population analysis is also performed. Overall, my results provide one explanation of why people tend to take into account others' decisions even when doing so gives them no payoff consequences at all.

Edited by:

Tatsuya Sasaki, F-Power Inc., Japan

#### Reviewed by:

The Anh Han, Teesside University, United Kingdom Michael Taborsky, Universität Bern, Switzerland Xiaojie Chen, University of Electronic Science and Technology of China, China

#### \*Correspondence:

Hisashi Ohtsuki ohtsuki\_hisashi@soken.ac.jp

#### Specialty section:

This article was submitted to Social Evolution, a section of the journal Frontiers in Ecology and Evolution

Received: 28 September 2017 Accepted: 30 April 2018 Published: 23 May 2018

#### Citation:

Ohtsuki H (2018) Evolutionary Dynamics of Coordinated Cooperation. Front. Ecol. Evol. 6:62. doi: 10.3389/fevo.2018.00062

Keywords: conditional cooperation, evolutionary game theory, negotiation, replicator dynamics, finite population

### 1. INTRODUCTION

The prevalence of altruistic traits in nature has been an evolutionary paradox since Darwin (1859). This is because defectors, also called cheaters or free-riders, avoid the cost of cooperation but enjoy its benefit, and hence act detrimentally against the evolution of cooperation. Now there is a consensus among evolutionary biologists that positive assortment is a key to its evolution (Lehmann and Keller, 2006; Nowak, 2006b; West et al., 2007; Fletcher and Doebeli, 2009). Positive assortment means cooperators are more likely to meet and interact with other cooperators than by chance, and so are defectors.

A viscous population (Hamilton, 1964; Taylor, 1992; Wilson et al., 1992) provides an excellent occasion for such positive assortment to occur. Limited dispersal creates an environment where those who share the common ancestry tend to cluster in a spatially structured population. In such a situation, whether kin recognition is present or not, cooperation with neighbors tends to result in cooperation with another cooperator. This process is known as kin selection.

In contrast, conditional cooperation is another mechanism to achieve positive assortment (Fletcher and Doebeli, 2009). The success of the famous Tit-for-Tat strategy (Axelrod and Hamilton, 1981; Nowak and Sigmund, 1992) and other variants (Nowak and Sigmund, 1993) suggests that helping only those who have helped in the past (Trivers, 1971; Axelrod, 1984; Alexander, 1987; Nowak and Sigmund, 2005) is a strong driving force for the evolution of cooperation. In these cases, positive assortment does not necessarily mean genetic assortment but means behavioral assortment; whatever different genetic architecture is behind cooperation, those who behave cooperatively at a phenotypic level come together and interact with each other.

A vast majority of previous models of evolution of conditional cooperation has assumed repeated interactions, where the same group of individuals interact repeatedly, or, in the case of indirect reciprocity, one repeatedly interacts with different others, but their past history of actions is available as reputation. In either case, it is a well established fact that a long repetition is a key to success (Nowak, 2006b).

However, an experiment suggests that people behave conditionally on others' choices even in a one-shot interaction (Fischbacher et al., 2001). In their four-player public goods game experiment, Fischbacher et al. (2001) asked each of the four players to submit a contribution table, which describes how much one would like to contribute to a public good for each of the 21 possible average contributions by the other three players. If one assumes that everyone behaves rationally, two predictions follow. Firstly, the best choice is to contribute nothing irrespective of others' decisions. Secondly, and more interestingly, there should be no incentive at all to base one's contribution on others', because the game used in that experiment was a linear public goods game. To understand the second point better, here is the payoff function used in their experiment;

$$
\pi\_i = 20 - g\_i + 0.4 \sum\_{j=1}^{4} g\_j,\tag{1}
$$

where π<sub>i</sub> is the payoff of the i-th player, and 0 ≤ g<sub>j</sub> ≤ 20 is the j-th player's contribution to the public good. This functional form shows that for each additional unit of contribution, the i-th player loses 0.6 units irrespective of others' decisions, and hence that taking others into account makes no sense. Despite these predictions, Fischbacher et al. (2001) found that a significant fraction of participants made a positive contribution in this experiment, and that 50% of participants were "conditional cooperators" who monotonically increased their contribution with increased average contribution by the others. Interestingly, they also found the existence of "unconditional defectors" who persistently contributed nothing.
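The claim that others' contributions cannot change one's own incentive can be checked numerically. The following is a minimal Python sketch of Equation (1); the function names and the baseline contribution of 5 units are illustrative assumptions and are not taken from the original experiment.

```python
def payoff(i, contributions, endowment=20.0, return_rate=0.4):
    """Payoff of player i under Equation (1)."""
    return endowment - contributions[i] + return_rate * sum(contributions)


def marginal_effect(others, own=5.0, step=1.0):
    """Change in player 0's payoff when he raises his own contribution by `step`.

    `own` is an arbitrary baseline contribution (an assumption for this demo).
    """
    before = payoff(0, [own] + list(others))
    after = payoff(0, [own + step] + list(others))
    return after - before


# The loss per extra unit is the same whether the others give nothing or everything.
loss_when_others_defect = marginal_effect([0.0, 0.0, 0.0])
loss_when_others_cooperate = marginal_effect([20.0, 20.0, 20.0])
```

Both values come out at about −0.6 (up to floating-point rounding), which is the point made in the text: the incentive to contribute does not depend on what the other three players do.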

The experiment by Fischbacher et al. (2001) suggests that people have a strong preference to coordinate their behavior with others, if possible, even in a one-shot interaction. One may think that a one-shot interaction in the real world is truly "one-shot" in the sense that no communication outside the game is allowed, but this is not necessarily true. There is sometimes a stage of negotiation or discussion by the participants before the actual game is played, where they talk with each other and can coordinate their behavior. One good example is international negotiation about global climate change, where many hours of discussion take place before participants finally decide whether or not to cooperate (Smead et al., 2014).

The aim of this paper is to explicitly model the process of negotiation that occurs prior to the game to understand its potential role in the evolution of cooperation. In that sense, my model is specific to human behavior because it is hard to imagine that non-human animals are engaged in negotiation before social interactions. I am in particular interested in whether it explains the emergence and maintenance of conditional cooperators in a linear public goods game. As a result of my analysis, I find that conditional cooperators and unconditional ones are sustained in the population through frequency dependent selection for a wide range of parameters. I will also discuss my model limitations in Discussion.

### 2. MODEL

#### 2.1. Public Goods Game

I study a linear public goods game played by n(≥ 2) players. Each player ultimately chooses one action, either cooperation (hereafter abbreviated as C) or defection (abbreviated as D). Each cooperator pays the cost c for a public good, but defectors do not. The total payment is aggregated, multiplied by the factor r, and equally redistributed to the participants of the game irrespective of their contribution to the public good. Therefore, when there are k cooperators and n − k defectors in the game, their payoffs are given respectively as

$$\begin{aligned} W\_{\rm C} &= -c + \frac{rkc}{n}, \\ W\_{\rm D} &= \frac{rkc}{n}. \end{aligned} \tag{2}$$

When one pays the cost c, it yields the net benefit of rc to the group. Equivalently, for each additional contribution c, each individual obtains the benefit of rc/n. Hereafter I assume 1 < r < n such that contribution to the public good is beneficial to a group (i.e., rc > c) but not to an individual (i.e., rc/n < c).
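The two inequalities behind the assumption 1 < r < n can be illustrated with a few lines of Python. This is only a sketch of Equation (2); the function names and the values n = 4, r = 2 are assumptions chosen for the example.

```python
def public_goods_payoffs(k, n, r, c=1.0):
    """Return (W_C, W_D) from Equation (2) when k of the n players cooperate."""
    share = r * k * c / n
    return share - c, share


def gain_from_switching_to_c(k, n, r, c=1.0):
    """Payoff change for a defector who becomes the (k + 1)-th cooperator."""
    w_c_after, _ = public_goods_payoffs(k + 1, n, r, c)
    _, w_d_before = public_goods_payoffs(k, n, r, c)
    return w_c_after - w_d_before          # equals r * c / n - c


n, r = 4, 2.0
# Negative whenever r < n: contributing never pays for the individual.
individual_gain = gain_from_switching_to_c(1, n, r)
# Positive whenever r > 1: full cooperation beats full defection for everyone.
group_gain = public_goods_payoffs(n, n, r)[0] - public_goods_payoffs(0, n, r)[1]
```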

#### 2.2. Strategies

In order to consider coordinated actions by players, here I assume that players in the game possess a conditional strategy. More specifically, a player refers to the actions of the other n − 1 players and conditions its own action (C or D) on the number of cooperators among those n − 1 players. Because the number of cooperators excluding self can be 0, 1, · · · , or n − 1 (that is, n possibilities), a conceivable strategy takes the form of an n-letter sequence of C's and D's, the k-th letter (1 ≤ k ≤ n) of which is the action prescribed by that strategy when the number of cooperators excluding self is exactly k − 1. For example, CCC· · · CC is the strategy that always prescribes cooperation irrespective of others' actions, the so-called ALLC strategy. The strategy DDD· · · DD always defects, so it is called the ALLD strategy. Of course more complicated strategies are possible; for example, the strategy CDCDCD· · · prescribes cooperation when the number of cooperators excluding self is even, and defection when it is odd. There are 2<sup>n</sup> possible strategies in total.

Out of all conceivable strategies, I especially pay attention to simple ones; those which have a minimum threshold level for cooperation. In other words, I consider strategies in the form of

$$\underbrace{\mathbf{D}\cdots\mathbf{D}}\_{k}\underbrace{\mathbf{C}\cdots\mathbf{C}}\_{n-k}.\quad\text{(}0\le k\le n\text{)}\tag{3}$$

The strategy represented by Equation (3) cooperates when the number of cooperators excluding self is at least k, and otherwise defects. Let us call this strategy C<sub>k</sub>. Obviously C<sub>0</sub> is the ALLC strategy and C<sub>n</sub> is the ALLD strategy. In between are strategies that cooperate only if some others cooperate. In other words, the index k represents the degree of resistance against cooperation. In the following I will consider only those (n + 1) strategies, from C<sub>0</sub> to C<sub>n</sub>.
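The link between the letter sequence of Equation (3) and the threshold rule can be made concrete with a small Python sketch. The helper names below are hypothetical and serve only to illustrate the definition.

```python
def strategy_string(k, n):
    """Letter sequence of the threshold strategy in Equation (3): k D's, then n - k C's."""
    if not 0 <= k <= n:
        raise ValueError("threshold k must satisfy 0 <= k <= n")
    return "D" * k + "C" * (n - k)


def prescribed_action(k, others_cooperating):
    """Action of a threshold-k player who sees `others_cooperating` other cooperators."""
    return "C" if others_cooperating >= k else "D"


def consistent(k, n):
    """Check that the (m + 1)-th letter is the action prescribed for m other cooperators."""
    letters = strategy_string(k, n)
    return all(letters[m] == prescribed_action(k, m) for m in range(n))


examples = [strategy_string(k, 3) for k in range(4)]   # thresholds 0, 1, 2, 3 for n = 3
```

For n = 3 this lists the four strategies studied later, from unconditional cooperation (threshold 0) to unconditional defection (threshold 3).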

#### 2.3. How Negotiation Proceeds

Since players condition their actions on other players' actions, which in turn depend on other players' actions, it is not straightforward to predict the final consequence of the game interaction. Therefore I model the negotiation stage prior to the actual game in the following way. First, each of the n players is assigned an initial thought, either C or D, by some specific rule. Here, a thought means one's tentative but not final decision, which is observable to everyone but does not affect one's payoff at all. It is instructive to imagine, for example, n human agents at a negotiation table. Those agents simultaneously announce their initial thoughts, and therefore I can assume that perceiving others' thoughts is easy and costless. A combination of all players' thoughts, usually expressed by an n-tuple of C or D, is called a state. Given an initial state in the negotiation stage, a player is randomly chosen and is given an opportunity to change his/her thought, from C to D or from D to C, if his/her conditional strategy prescribes so. For example, if a C<sub>3</sub> strategist currently having thought C finds only two other C's in the group, he/she changes his/her thought to D, because he/she needs at least three other C players to keep his/her current thought C. This change of thought is announced to everyone. In the next step, a player is randomly chosen again for an update, and this procedure is repeated until no one wants to change his/her thought. I call such a final state a stationary state. In Section A in the Supplementary Material I prove that there always exists at least one stationary state, so this negotiation surely ends. However, multiple stationary states are possible, and which stationary state is reached depends on the players' initial thoughts and the order of updates. Once a stationary state is reached, all players transform their thoughts into actual actions in the public goods game, they obtain payoffs, and the game ends. Here I exclude the possibility of lying (that is, taking the action opposite to one's thought at the stationary state of the negotiation); this point will be discussed further in the Discussion.
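The negotiation procedure described above can be sketched as a short simulation. This is a minimal illustration under the stated rules (one randomly chosen player may revise at a time, and thoughts are honest); the function names are hypothetical, and the initial state is simply passed in by the caller.

```python
import random


def preferred_thought(threshold, state, i):
    """Thought that player i, using the given threshold, prefers in the current state."""
    others_c = sum(t == "C" for j, t in enumerate(state) if j != i)
    return "C" if others_c >= threshold else "D"


def is_stationary(thresholds, state):
    """True if no player wants to change his/her thought."""
    return all(preferred_thought(thresholds[i], state, i) == state[i]
               for i in range(len(state)))


def negotiate(thresholds, initial_state, rng):
    """Repeat random asynchronous updates until a stationary state is reached."""
    state = list(initial_state)
    while not is_stationary(thresholds, state):
        i = rng.randrange(len(state))      # one randomly chosen player may revise
        state[i] = preferred_thought(thresholds[i], state, i)
    return tuple(state)


rng = random.Random(2018)
# Two unconditional cooperators and one unconditional defector: the outcome is unique.
outcome = negotiate((0, 0, 3), ("D", "D", "C"), rng)
```

The loop relies on the result quoted above that the negotiation ends; with random selection of the updating player it stops as soon as nobody wants to revise.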

#### 2.4. Example

To facilitate a better understanding of the model, consider an example of the three-person game played by individuals X, Y, and Z. Suppose that X and Y adopt strategy C<sub>1</sub> while Z adopts strategy C<sub>2</sub>. Below I will represent the tentative thoughts of those three players by a triplet, such as (X, Y, Z) = (D, C, C).

Suppose that the initial state is (D, C, C). If players chosen randomly in the first four update steps are Y, Z, X, and Z in this order, the following state transition occurs;

$$\begin{array}{c} \text{(D, C, C)} \xrightarrow[\text{Y chosen}]{} \text{(D, C, C)} \xrightarrow[\text{Z chosen}]{} \text{(D, C, D)}\\ \xrightarrow[\text{X chosen}]{} \text{(C, C, D)} \xrightarrow[\text{Z chosen}]{} \text{(C, C, C)}. \end{array} \tag{4}$$

In the first step Y is chosen. Y finds there is one cooperator, Z, and that satisfies his threshold. Therefore Y does not change his thought. In the second step Z is chosen. Z finds there is one cooperator, Y, but that does not satisfy his threshold. Therefore Z changes from C to D. In the third step, X is chosen. X finds that there is one cooperator, Y, and that satisfies his threshold. Therefore X changes from D to C. In the fourth step, Z is chosen. In contrast to the second step, Z finds two cooperators, X and Y, which satisfies his threshold. Therefore Z changes from D to C. It is easy to see that (C, C, C) is a stationary state for them. Thus they play an actual public goods game, all of them pay the cost of cooperation, and enjoy the benefit from the public good.

It is notable that in the transition shown in Equation (4), individual Z made two changes, first from C to D and then back from D to C. Such a transition is possible depending on the order of updates. In addition, it is not difficult to see that (D, D, D) is another stationary state. For example, if players randomly chosen in the first two steps are Z and Y in this order, the following transition occurs, leading to no cooperation.

$$\text{(D,C,C)} \xrightarrow[\text{Z chosen}]{} \text{(D,C,D)} \xrightarrow[\text{Y chosen}]{} \text{(D,D,D)}.\tag{5}$$
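The two update orders of this example can be replayed step by step. The sketch below is illustrative only: the helper names are hypothetical, players X, Y and Z are indexed 0, 1 and 2, and their thresholds are 1, 1 and 2 as in the text.

```python
def update(thresholds, state, i):
    """State after player i is given the chance to revise his thought."""
    others_c = sum(t == "C" for j, t in enumerate(state) if j != i)
    revised = list(state)
    revised[i] = "C" if others_c >= thresholds[i] else "D"
    return tuple(revised)


def replay(thresholds, state, order):
    """Sequence of states visited when players are chosen in the given order."""
    path = [state]
    for i in order:
        state = update(thresholds, state, i)
        path.append(state)
    return path


X, Y, Z = 0, 1, 2
thresholds = (1, 1, 2)
start = ("D", "C", "C")

path_to_cooperation = replay(thresholds, start, [Y, Z, X, Z])   # order used in the first transition
path_to_defection = replay(thresholds, start, [Z, Y])           # order used in Equation (5)
```

The first path ends in (C, C, C) and the second in (D, D, D), so the same initial state leads to different stationary states depending only on who updates first.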

#### 2.5. Initial State

As seen above, initial states have a great impact on the consequence of negotiation. Players may have a predisposition toward either C or D, but here I assume that each player independently has initial thought C with probability p, and initial thought D with probability 1 − p. When p = 0, the default action is D. This is appropriate when cooperation takes the form of active contribution; cooperation means doing something and defection means doing nothing. For example, monetary investment in a public good often takes this form. In contrast, p = 1 means that the default action is C. This is appropriate when defection takes the form of active exploitation; defection means doing something and cooperation means doing nothing. Forest conservation can be a good example of this. Cutting trees and selling timber is exploitative defection, whereas not cutting trees is passive cooperation. Therefore I cannot necessarily set the value of p a priori. Instead, I treat p as a model parameter.

Another rationale behind the parameter p, especially when it is between 0 and 1, is that it reflects some uncertainty in the game. It could be the case that players do not perfectly know the payoff structure of the game at the beginning, in which case they may temporarily choose either one of the actions.

Although the value of p can possibly be chosen independently and strategically by different strategies, here I assume for simplicity that p is common among all the strategies. Therefore p is not an evolutionary trait but a model constant in this paper.

#### 2.6. Population Game and Evolutionary Dynamics

So far I have explained the public goods game played by n players, but I will also consider a population of players. Suppose that there is a population of players of size M (either infinitely large or finite). For each public goods game n players are randomly chosen from the population, they play a one-shot public goods game with a negotiation stage described above, and return to the population. Such n-person games are played many times, and each individual obtains an average payoff per game, which I will denote by w.

The change of strategy frequencies over time can be studied by evolutionary dynamics, which are based on the simple criterion that successful strategies increase in frequency. Note that equations of evolutionary dynamics can describe both genetic evolution, in which information is transmitted through genetic materials, and cultural evolution, in which information such as ideas or norms is transmitted culturally, in quite similar forms (Traulsen et al., 2009, 2010); here it is natural to consider cultural evolution. For an infinitely large population, M = ∞, the evolutionary dynamics of the (n + 1) strategies, from C<sub>0</sub> to C<sub>n</sub>, are described by the replicator equation (Taylor and Jonker, 1978; Hofbauer and Sigmund, 1998; Nowak, 2006a);

$$
\dot{x}_k = x_k (w_k - \bar{w}), \tag{6}
$$

where x<sub>k</sub> and w<sub>k</sub> are the frequency and the average payoff of strategy C<sub>k</sub>, respectively. The average payoff in the population, $\bar{w}$, is calculated as $\bar{w} \equiv \sum_{\ell=0}^{n} x_\ell w_\ell$. The dynamics are defined on the n-dimensional simplex, S<sub>n+1</sub>, where the x<sub>k</sub>'s are non-negative and sum to unity.
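For readers who wish to explore Equation (6) numerically, a single integration step can be sketched as follows. The forward-Euler scheme, the step size and the frequency-independent demo payoffs are assumptions made for this illustration; the paper does not specify a numerical method.

```python
def replicator_step(x, payoffs, dt=0.01):
    """One forward-Euler step of the replicator equation.

    x       -- list of strategy frequencies (non-negative, summing to one)
    payoffs -- function mapping x to the list of average payoffs w_k
    """
    w = payoffs(x)
    w_bar = sum(xk * wk for xk, wk in zip(x, w))        # population average payoff
    updated = [xk + dt * xk * (wk - w_bar) for xk, wk in zip(x, w)]
    total = sum(updated)                                # guard against rounding drift
    return [xk / total for xk in updated]


# Demo with constant payoffs (illustration only): the best strategy takes over.
x = [0.5, 0.3, 0.2]
for _ in range(1000):
    x = replicator_step(x, lambda freq: [1.0, 0.5, 0.0])
```

In the model itself the `payoffs` argument would return frequency-dependent values computed from the payoff matrix of the negotiation game.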

For a finite population, M < ∞, a frequency-dependent Moran process (Nowak et al., 2004) and pairwise comparison processes (Traulsen et al., 2005, 2007) are standard models to describe its evolutionary dynamics. Here I use a pairwise comparison process. Similarly to the infinite population case, players are engaged in many n-person games and obtain average payoffs. In each elementary step of updating, two players are randomly chosen from the population (with replacement). The first player compares his payoff with that of the second player. Let Δ be the payoff of the second player minus that of the first. Then the first player copies the strategy of the second player with probability

$$\frac{1}{1+\exp[-s\Delta]},\tag{7}$$

otherwise he stays with the current strategy. Here, the parameter s > 0 is called the intensity of selection (or inverse temperature).

TABLE 1 | Stationary states in the two-person game.


The functional form of Equation (7) comes from the Fermi distribution function in physics (Traulsen et al., 2006, 2007), so this process is sometimes referred to as the Fermi process (Traulsen and Hauert, 2009). Equation (7) suggests that the first player is more likely to copy the strategy of the second player if the payoff difference, Δ, is larger.

Because of the finite population size, once all players adopt the same strategy, no other strategy can appear in the population. Such a phenomenon is called fixation. In order to avoid permanent fixation of strategies, I consider mutation in strategies. With a positive probability, µ > 0, the first player who is chosen in an elementary step of updating changes his strategy to another random strategy, irrespective of the payoff difference, Δ. Under the limit of µ → 0, a newly arising mutant in a resident population either goes extinct or takes over the whole population before the next mutant arises. Such a limit is sometimes referred to as the adiabatic limit (Sigmund et al., 2010, 2011). In the adiabatic limit the only possible transitions are those from one monomorphic population to another, so fixation probabilities between two strategies characterize the process (see Section B in the Supplementary Material).
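One elementary updating step of this process, including mutation, can be sketched in Python as follows. This is a minimal illustration rather than the simulation code used for the figures: `average_payoff` is a hypothetical user-supplied function, and the demo payoffs are an assumption.

```python
import math
import random


def fermi(delta, s):
    """Imitation probability of Equation (7), evaluated in a numerically stable way."""
    z = s * delta
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def elementary_step(population, average_payoff, s, mu, n_strategies, rng):
    """Mutation with probability mu, otherwise pairwise comparison.

    population     -- list of strategy indices, modified in place
    average_payoff -- hypothetical function (strategy, population) -> payoff
    """
    first = rng.randrange(len(population))
    second = rng.randrange(len(population))          # chosen with replacement
    if rng.random() < mu:
        alternatives = [k for k in range(n_strategies) if k != population[first]]
        population[first] = rng.choice(alternatives)
        return
    delta = (average_payoff(population[second], population)
             - average_payoff(population[first], population))
    if rng.random() < fermi(delta, s):
        population[first] = population[second]


# Demo (assumed payoffs): strategy 1 always earns more, so it fixates without mutation.
rng = random.Random(1)
population = [0] * 5 + [1] * 5
for _ in range(5000):
    elementary_step(population, lambda k, pop: float(k), 1000.0, 0.0, 2, rng)
```

Splitting the logistic function into two branches avoids an overflow in `math.exp` when s is large, which matters for the strong-selection limit considered later.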

### 3. TWO-PERSON GAME

#### 3.1. Payoffs

First I study the n = 2 person game. Let a<sub>k,ℓ</sub> be the payoff of a C<sub>k</sub> player matched with a C<sub>ℓ</sub> player. There are six different types of encounters, (C<sub>0</sub>, C<sub>0</sub>), (C<sub>0</sub>, C<sub>1</sub>), (C<sub>0</sub>, C<sub>2</sub>), (C<sub>1</sub>, C<sub>1</sub>), (C<sub>1</sub>, C<sub>2</sub>), and (C<sub>2</sub>, C<sub>2</sub>). It is easy to confirm that the stationary state of each encounter except (C<sub>1</sub>, C<sub>1</sub>) is unique. According to **Table 1**, payoffs are

$$\begin{aligned} a_{0,0} &= -c + rc, \\ a_{0,1} &= a_{1,0} = -c + rc, \\ a_{0,2} &= -c + \tfrac{1}{2}rc, \qquad a_{2,0} = \tfrac{1}{2}rc, \\ a_{1,2} &= a_{2,1} = 0, \\ a_{2,2} &= 0. \end{aligned} \tag{8a}$$

On the other hand, the encounter (C<sub>1</sub>, C<sub>1</sub>) needs consideration. There are two stationary states, (C, C) and (D, D). If the initial state is (C, C) (which occurs with probability p<sup>2</sup>) or (D, D) (which occurs with probability (1 − p)<sup>2</sup>), it is already a stationary state. If the initial state is (C, D) or (D, C) (which occurs with probability 2p(1 − p)), however, who updates first matters. If the one with thought C is chosen for an update, he changes to D and mutual defection results. If the one with thought D is chosen for an update, he changes to C and mutual cooperation results. These chances are even. Therefore, for the encounter (C<sub>1</sub>, C<sub>1</sub>), the probability that they arrive at mutual cooperation is p<sup>2</sup> + (1/2) · 2p(1 − p) = p, and that of mutual defection is (1 − p)<sup>2</sup> + (1/2) · 2p(1 − p) = 1 − p. As a result, I obtain

$$a\_{1,1} = p(-c + rc) + (1 - p) \cdot 0 = p(-c + rc). \tag{8b}$$
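The probability of mutual cooperation between two threshold-1 players can also be obtained mechanically, by following every possible order of updates with exact fractions. The sketch below is an illustration and not the calculation in the Supplementary Material; the function name is hypothetical, thoughts are coded as 1 for C and 0 for D, and the recursion assumes that the negotiation never revisits a state.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product


def outcome_distribution(thresholds, p):
    """Exact distribution over stationary states when each initial thought is C with prob. p."""
    n = len(thresholds)

    def preferred(state, i):
        others_c = sum(state) - state[i]
        return 1 if others_c >= thresholds[i] else 0

    @lru_cache(maxsize=None)
    def absorb(state):
        # Players who would change if chosen; the others only delay the process.
        active = [i for i in range(n) if preferred(state, i) != state[i]]
        if not active:
            return {state: Fraction(1)}
        dist = {}
        for i in active:                       # each active player is equally likely to move first
            nxt = state[:i] + (1 - state[i],) + state[i + 1:]
            for final, q in absorb(nxt).items():
                dist[final] = dist.get(final, Fraction(0)) + q / len(active)
        return dist

    total = {}
    for init in product((0, 1), repeat=n):
        weight = Fraction(1)
        for thought in init:
            weight *= p if thought else 1 - p
        for final, q in absorb(init).items():
            total[final] = total.get(final, Fraction(0)) + weight * q
    return total


p = Fraction(1, 3)
dist = outcome_distribution((1, 1), p)         # two threshold-1 players
```

With p = 1/3 the state (1, 1), that is mutual cooperation, receives probability 1/3 and mutual defection receives 2/3, in agreement with the value p derived in the text. The function is written for any group size.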

To summarize, I have obtained the following payoff matrix of the game;

$$A = \begin{array}{c|ccc} & \mathrm{C}_0 & \mathrm{C}_1 & \mathrm{C}_2 \\ \hline \mathrm{C}_0 & -c + rc & -c + rc & -c + \frac{1}{2}rc \\ \mathrm{C}_1 & -c + rc & p(-c + rc) & 0 \\ \mathrm{C}_2 & \frac{1}{2}rc & 0 & 0 \end{array} \tag{9}$$

#### 3.2. Infinite Population

Evolutionary game dynamics based on the payoff matrix, Equation (9), are shown in **Figure 1** for the three separate cases, (a) p = 0, (b) 0 < p < 1, and (c) p = 1.

Firstly I look at the two extreme cases. When p = 0 (see **Figure 1A**), everyone initially chooses defection. Therefore no cooperation arises unless there is at least one C<sub>0</sub> player. Obviously C<sub>1</sub> is invaded by C<sub>0</sub>, which in turn is invaded by C<sub>2</sub>. In the absence of the C<sub>0</sub> strategy, C<sub>1</sub> and C<sub>2</sub> are neutral. There is a continuum of fixed points on the C<sub>1</sub>-C<sub>2</sub> edge. The part of this edge that includes the C<sub>1</sub> corner consists of unstable fixed points; introduction of C<sub>0</sub> players drives the population away from these fixed points. The other part, which includes the C<sub>2</sub> corner, consists of stable fixed points. The mirror image is obtained in the case of p = 1 (**Figure 1C**), where C<sub>2</sub> invades C<sub>0</sub> but is itself invaded by C<sub>1</sub>. The C<sub>0</sub>-C<sub>1</sub> edge consists of unstable and stable segments.

Dynamics are in between these extreme cases when 0 < p < 1 (see **Figure 1B**). There is an internal fixed point, surrounded by a family of closed orbits. Strategy C<sub>1</sub> invades a population of C<sub>2</sub>, C<sub>0</sub> invades a population of C<sub>1</sub>, and C<sub>2</sub> invades a population of C<sub>0</sub>. The edges of the simplex constitute a heteroclinic cycle. The frequencies of strategies at the internal fixed point are given as

$$(\mathbf{x}\_0^\*, \mathbf{x}\_1^\*, \mathbf{x}\_2^\*) = \left(\frac{2p(r-1)}{r}, \frac{2-r}{r}, \frac{2(1-p)(r-1)}{r}\right). \tag{10}$$
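That the point in Equation (10) is indeed a rest point of the replicator dynamics can be checked by confirming that all three strategies earn the same payoff there. The following Python sketch does this with exact fractions; the function names and the parameter values p = 1/2, r = 3/2, c = 1 are assumptions for the example.

```python
from fractions import Fraction


def payoff_matrix(p, r, c=1):
    """Payoff matrix of Equation (9); rows and columns are ordered by threshold 0, 1, 2."""
    full = -c + r * c
    return [
        [full, full, -c + r * c / 2],
        [full, p * full, 0],
        [r * c / 2, 0, 0],
    ]


def average_payoffs(x, A):
    """Average payoff of each strategy under random pairwise matching."""
    return [sum(A[k][l] * x[l] for l in range(3)) for k in range(3)]


def interior_point(p, r):
    """Candidate internal fixed point from Equation (10)."""
    return [2 * p * (r - 1) / r, (2 - r) / r, 2 * (1 - p) * (r - 1) / r]


p, r = Fraction(1, 2), Fraction(3, 2)
x_star = interior_point(p, r)
w_star = average_payoffs(x_star, payoff_matrix(p, r))
```

For these parameters each frequency equals 1/3 and each strategy earns 1/4, so no strategy grows at the expense of another.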

It is worthwhile to mention that the two-player game dynamics are equivalent to the dynamics of ALLC, ALLD, and Tit-For-Tat (TFT) strategies in a repeated Prisoner's Dilemma game (Brandt and Sigmund, 2006; Sigmund, 2010). To see this, consider a two-person Prisoner's Dilemma game with the following payoff matrix;

$$
\begin{array}{c|cc}
 & \mathrm{C} & \mathrm{D} \\ \hline
\mathrm{C} & -C + B & -C \\
\mathrm{D} & B & 0
\end{array}
\tag{11}
$$

where C is the cost and B is the benefit of cooperation, and consider a repeated game of this Prisoner's Dilemma with a discounting factor, δ. ALLC players always cooperate. ALLD players always defect. TFT players cooperate in the first round, and then imitate whatever the opponent did in the previous round. I also consider errors; I assume that an erroneous defection occurs with probability (1 − k)ε when one intends cooperation, and that an erroneous cooperation occurs with probability kε when one intends defection. Let A′ be the payoff matrix of this repeated game, each entry representing a payoff per round. In the double limit of δ → 1 and then ε → 0, it turns out to be

$$\lim_{\epsilon \to 0} \lim_{\delta \to 1} A' = \begin{array}{c|ccc} & \text{ALLC} & \text{TFT} & \text{ALLD} \\ \hline \text{ALLC} & -C + B & -C + B & -C \\ \text{TFT} & -C + B & k(-C + B) & 0 \\ \text{ALLD} & B & 0 & 0 \end{array} \tag{12}$$

which is formally equivalent to Equation (9) with the transformation of B ≡ rc/2, C ≡ c − (rc/2) and k ≡ p (see Figure 3 of Brandt and Sigmund, 2006). This correspondence makes sense, because the negotiation stage in my model can be interpreted as hypothetical rounds of the repeated game where payoffs are not counted. The limit δ → 1 means that I count only payoffs in future rounds after a stationary state is reached.
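The stated correspondence can be verified entry by entry. The sketch below builds both matrices as they are read from Equations (9) and (12) and applies the substitution B = rc/2, C = c − rc/2, k = p; the function names and parameter values are assumptions for the example.

```python
from fractions import Fraction


def negotiation_game(p, r, c):
    """Payoff matrix of Equation (9); order: thresholds 0, 1, 2."""
    full = -c + r * c
    return [[full, full, -c + r * c / 2],
            [full, p * full, 0],
            [r * c / 2, 0, 0]]


def repeated_game_limit(k, B, C):
    """Limit payoff matrix of Equation (12); order: ALLC, TFT, ALLD."""
    return [[-C + B, -C + B, -C],
            [-C + B, k * (-C + B), 0],
            [B, 0, 0]]


p, r, c = Fraction(2, 5), Fraction(3, 2), Fraction(1)
matrices_agree = negotiation_game(p, r, c) == repeated_game_limit(p, r * c / 2, c - r * c / 2)
```

Exact fractions are used so that the comparison is not affected by rounding.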

I also find differences between the two models. In my model, players' thoughts are updated asynchronously such that at most one player can change his thought (C to D, or D to C) in one updating event. In contrast, players in the repeated game change their actions (C to D, or D to C) in a synchronous fashion; each player takes into account the previous action by the partner. Another difference is that, while errors are not assumed in my model, the model of the repeated game does consider erroneous defection and cooperation. It is interesting that my parameter p, that is, the probability that the initial intention is C, corresponds exactly to the parameter k in the repeated game model, which represents the fraction of erroneous cooperation among all erroneous moves.

M = 36, p = 0.5, r = 1.5 and c = 1.0. Mutation rate was set to µ = 10<sup>−4</sup> per elementary updating step. M elementary steps constitute one generation. Time average was taken over 10<sup>8</sup> generations.

#### 3.3. Finite Population

To simplify the analysis, I consider the adiabatic limit, µ → 0, and strong selection, s → ∞. More precisely speaking, I first take the limit µ → 0 and then take the limit s → ∞.

Under the adiabatic limit, the population is almost always monomorphic in strategies. Therefore I can consider the stationary distribution over the three strategies, namely the proportion of time that the stochastic game dynamics spend at each monomorphic state. Let q<sub>k</sub> (k = 0, 1, 2) represent the fraction of time that the stochastic process spends at the monomorphic population of strategy C<sub>k</sub>. Calculations in Section C in the Supplementary Material show that for M ≥ 3, the following result holds;

$$(q\_0, q\_1, q\_2) = \begin{cases} (\frac{1}{M+3}, \frac{1}{M+3}, \frac{M+1}{M+3}) & \text{if } p = 0\\ (\frac{1}{4}, \frac{1}{4}, \frac{1}{2}) & \text{if } 0 < p < 1\\ (\frac{1}{M+4}, \frac{M+1}{M+4}, \frac{2}{M+4}) & \text{if } p = 1. \end{cases} \tag{13}$$

TABLE 2 | Stationary states in the three-person game.

Computer simulations confirm the validity of this result (**Figure 2**). To understand the significance of the result, it is instructive to consider a traditional framework of social dilemma, where only C<sub>0</sub> and C<sub>2</sub> strategies are possible. In this case, irrespective of the value of p, the stationary distribution of the Fermi process under the adiabatic limit and strong selection is

$$(q\_0, q\_2) = (0, 1). \tag{14}$$

Equation (13) thus suggests that the existence of coordinated cooperators, C<sub>1</sub>, has a great impact on evolutionary dynamics. For 0 < p < 1, unconditional cooperation (C<sub>0</sub>) is attained a quarter of the time during evolution. This is because a C<sub>1</sub> mutant in a population of C<sub>2</sub> players has a 50% chance of fixation; once a C<sub>1</sub> player replicates to two by chance, those two C<sub>1</sub> players have a positive (= p) chance of establishing mutual cooperation and thus they can outcompete C<sub>2</sub> players. However, C<sub>1</sub> players are invaded by C<sub>0</sub> players, because a dyad of C<sub>1</sub> players sometimes fails to establish mutual cooperation, which is disadvantageous compared with C<sub>0</sub>. Obviously C<sub>0</sub> is invaded by C<sub>2</sub>, and such an evolutionary cycle repeats. In other words, coordinated cooperators C<sub>1</sub> work as a catalyst of cooperation. If they exist, sociality is promoted and rationality is hindered.

Such an effect is much more dramatic when p = 1. In this case strategies C<sub>0</sub> and C<sub>1</sub> are completely neutral to each other, and the only difference between them is whether or not they can establish mutual cooperation in a population of C<sub>2</sub> without being cheated. In fact, strategy C<sub>0</sub> is easily exploited by C<sub>2</sub> but C<sub>1</sub> is not. Therefore, for a large M, the population is dominated by C<sub>1</sub> most of the time.

### 4. THREE-PERSON GAME

#### 4.1. Payoffs

Next I consider the n = 3 person game. Let $a_{k,\ell_1\ell_2}$ be the payoff of a C<sub>k</sub> player matched with a $\mathrm{C}_{\ell_1}$ player and a $\mathrm{C}_{\ell_2}$ player. Obviously $a_{k,\ell_1\ell_2} = a_{k,\ell_2\ell_1}$ holds. In the three-person game with four different strategies from C<sub>0</sub> to C<sub>3</sub> there are 20 possible encounters, which are listed in **Table 2**. For each case, the probability with which negotiation reaches each possible stationary state is calculated (see Section D in the Supplementary Material). As a result I arrive at the following payoff matrix;

$$
A = \begin{array}{c|ccccc}
 & \mathrm{C}_0\mathrm{C}_0 & \mathrm{C}_0\mathrm{C}_1 & \mathrm{C}_0\mathrm{C}_2 & \mathrm{C}_0\mathrm{C}_3 & \mathrm{C}_1\mathrm{C}_1 \\ \hline
\mathrm{C}_0 & -c + rc & -c + rc & -c + rc & -c + \frac{2}{3}rc & -c + rc \\
\mathrm{C}_1 & -c + rc & -c + rc & -c + rc & -c + \frac{2}{3}rc & p(2-p)(-c + rc) \\
\mathrm{C}_2 & -c + rc & -c + rc & -\frac{p(1+p)}{2}c + \frac{p^2+p+1}{3}rc & \frac{1}{3}rc & \frac{p(3-p)}{2}(-c + rc) \\
\mathrm{C}_3 & \frac{2}{3}rc & \frac{2}{3}rc & \frac{1}{3}rc & \frac{1}{3}rc & \frac{p(3-p)}{3}rc \\[2ex]
 & \mathrm{C}_1\mathrm{C}_2 & \mathrm{C}_1\mathrm{C}_3 & \mathrm{C}_2\mathrm{C}_2 & \mathrm{C}_2\mathrm{C}_3 & \mathrm{C}_3\mathrm{C}_3 \\ \hline
\mathrm{C}_0 & -c + rc & -c + \frac{2}{3}rc & -c + \frac{p^2+p+1}{3}rc & -c + \frac{1}{3}rc & -c + \frac{1}{3}rc \\
\mathrm{C}_1 & \frac{p(3-p)}{2}(-c + rc) & -\frac{p(3-p)}{2}c + \frac{p(3-p)}{3}rc & \frac{p(1+p)}{2}(-c + rc) & 0 & 0 \\
\mathrm{C}_2 & \frac{p(1+p)}{2}(-c + rc) & 0 & p^2(-c + rc) & 0 & 0 \\
\mathrm{C}_3 & 0 & 0 & 0 & 0 & 0
\end{array} \tag{15}
$$

Because I assume random matching of players, the average payoff of a C<sub>k</sub> player is calculated as

$$w_{k} = \sum_{\ell_{2}=0}^{3} \sum_{\ell_{1}=0}^{3} a_{k,\ell_{1}\ell_{2}} x_{\ell_{1}} x_{\ell_{2}},\tag{16}$$

where x<sub>ℓ</sub> represents the frequency of C<sub>ℓ</sub> players in the population.

#### 4.2. Infinite Population

As before I consider the replicator equation, Equation (6). Since the payoff is already quadratic in x, as in Equation (16), the resulting replicator dynamics are highly non-linear. As a result, I have to largely rely on numerical simulations to study the whole dynamics. However, the evolutionary dynamics restricted to any of the six edges of the simplex S<sub>4</sub> are rather easy to study, because they are essentially reduced to a one-dimensional system.

I will hereafter consider the case 0 < p < 1. The analysis in Section E in the Supplementary Material shows that behavior on four of the six edges is straightforward; C<sub>2</sub> increases on the C<sub>3</sub>-C<sub>2</sub> edge, C<sub>1</sub> increases on the C<sub>2</sub>-C<sub>1</sub> edge, C<sub>0</sub> increases on the C<sub>1</sub>-C<sub>0</sub> edge, and C<sub>3</sub> increases on the C<sub>0</sub>-C<sub>3</sub> edge. Therefore, there always exists a heteroclinic cycle connecting the four vertices of the simplex: C<sub>3</sub> → C<sub>2</sub> → C<sub>1</sub> → C<sub>0</sub> → C<sub>3</sub>. As for the C<sub>0</sub>-C<sub>2</sub> edge, if

$$\frac{3}{2} < r < \frac{3+3p}{1+2p} \tag{17}$$

holds there exists one unstable equilibrium (which I hereafter call P<sub>02</sub>) and the system shows bistability. If r is smaller than 3/2, strategy C<sub>2</sub> dominates C<sub>0</sub>. If r is greater than (3 + 3p)/(1 + 2p), strategy C<sub>0</sub> dominates C<sub>2</sub>.

On the C<sub>1</sub>-C<sub>3</sub> edge, in contrast, if

$$\frac{3}{2} < r < \frac{6 - 3p}{3 - 2p} \tag{18}$$

holds there exists one stable equilibrium (which I hereafter call Q<sub>13</sub>) and the system allows the coexistence of the two strategies. If r is smaller than 3/2, strategy C<sub>3</sub> dominates C<sub>1</sub>. If r is greater than (6 − 3p)/(3 − 2p), strategy C<sub>1</sub> dominates C<sub>3</sub>.

Numerical simulations suggest that when r < 3/2 the dynamics converge either to a trimorphic equilibrium or to an evolutionary cycle with strategies C<sub>1</sub>, C<sub>2</sub> and C<sub>3</sub> present but C<sub>0</sub> absent (see **Figure 3**). When r > 3/2, the outcome of evolutionary dynamics seems to rely on the stability of the dimorphic rest point, Q<sub>13</sub>. It is possible to show that Q<sub>13</sub> is always stable against the invasion of C<sub>2</sub>. However, it is stable against the invasion of C<sub>0</sub> only when r is below some threshold, r<sub>c</sub> = r<sub>c</sub>(p). When 3/2 < r < r<sub>c</sub> the system converges to the dimorphic equilibrium, Q<sub>13</sub>, with strategies C<sub>1</sub> and C<sub>3</sub> present. When r > r<sub>c</sub>, the system converges to a trimorphic equilibrium with strategies C<sub>0</sub>, C<sub>1</sub> and C<sub>3</sub> present but C<sub>2</sub> absent. **Figure 3** shows the phase diagram in the (p, r)-space according to this classification as well as long term consequences of evolutionary dynamics. It is easy to see there that the instability/stability of Q<sub>13</sub> accurately predicts whether strategy C<sub>0</sub> is present or absent after a long run.
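The edge analysis for unconditional cooperators (threshold 0) against threshold-2 players can be reproduced from the payoff entries alone. The sketch below is an illustration under assumptions: it sets c = 1, uses the six entries of Equation (15) that involve only these two strategies, and classifies the edge by the sign of the payoff difference close to each corner. The function names and the small offset `eps` are hypothetical choices.

```python
from fractions import Fraction


def advantage_of_c0(x, p, r):
    """Payoff of threshold 0 minus payoff of threshold 2 when a fraction x plays threshold 0 (c = 1)."""
    a0 = {"00": r - 1, "02": r - 1, "22": -1 + (p * p + p + 1) * r / 3}
    a2 = {"00": r - 1,
          "02": -p * (1 + p) / 2 + (p * p + p + 1) * r / 3,
          "22": p * p * (r - 1)}

    def average(a):
        # Both co-players are drawn at random, so mixed pairs are counted twice.
        return a["00"] * x * x + 2 * a["02"] * x * (1 - x) + a["22"] * (1 - x) ** 2

    return average(a0) - average(a2)


def classify_edge(p, r, eps=Fraction(1, 1000)):
    """Qualitative outcome on the edge, from the signs near the two corners."""
    invades = advantage_of_c0(eps, p, r) > 0        # threshold 0 is rare
    resists = advantage_of_c0(1 - eps, p, r) > 0    # threshold 2 is rare
    if invades and resists:
        return "threshold 0 dominates"
    if not invades and not resists:
        return "threshold 2 dominates"
    return "coexistence" if invades else "bistable"


p = Fraction(1, 2)                                   # upper bound in Equation (17) is then 9/4
outcomes = [classify_edge(p, r) for r in (Fraction(6, 5), Fraction(2), Fraction(14, 5))]
```

For p = 1/2 the three values of r fall below 3/2, inside the interval of Equation (17), and above its upper bound, and the classification changes accordingly from dominance of the threshold-2 strategy to bistability to dominance of unconditional cooperation.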

#### 4.3. Finite Population

As before, I consider the adiabatic limit and strong selection. The analysis for an infinite population above showed that a coexisting equilibrium (Q<sub>13</sub>) can exist on the C<sub>1</sub>-C<sub>3</sub> edge. In this case a C<sub>1</sub> mutant appearing in a finite population of C<sub>3</sub>, or vice versa, is highly likely to lead the population to a stable mixture of C<sub>1</sub> and C<sub>3</sub>, and the population will be trapped there for a considerably long time. Nevertheless, stochasticity eventually causes one of the two strategies to fixate in the population, and the assumption of the adiabatic limit guarantees that no second mutation occurs before the first mutant either disappears or fixates in the population.

Section F in the Supplementary Material shows the full analysis of the Fermi process. For 0 < p < 1, I find that the stationary distribution differs between the four parameter regions shown in **Figure 4**. Similarly to section 3.3, let q<sub>k</sub> (k = 0, 1, 2, 3) be the fraction of time that the Fermi process spends at the monomorphic population of strategy C<sub>k</sub> in the stationary distribution. For a large M, the following result holds:

$$\mathbf{q} \equiv (q_{0},q_{1},q_{2},q_{3}) = \begin{cases} \frac{1}{16}(5,4,1,6) = (0.3125,0.2500,0.0625,0.3750) & \text{if } r > \frac{3+3p}{1+2p}, \\ \frac{1}{18}(5,5,2,6) = (0.2778,0.2778,0.1111,0.3333) & \text{if } \frac{7-3p}{4-2p} < r < \frac{3+3p}{1+2p}, \\ \frac{1}{10}(1,1,2,6) = (0.1000,0.1000,0.2000,0.6000) & \text{if } \frac{3}{2} < r < \frac{7-3p}{4-2p}, \\ \frac{1}{26}(2,3,6,15) = (0.0769,0.1154,0.2308,0.5769) & \text{if } r < \frac{3}{2}. \end{cases} \tag{19}$$

Compare this result with the result of a conventional model that allows only C<sub>0</sub> and C<sub>3</sub>, which is

$$(q_0, q_3) = (0, 1). \tag{20}$$

Obviously the existence of strategies C<sub>1</sub> and C<sub>2</sub> dramatically increases the possibility of cooperation. For example, Equation (19) states that evolution favors strategies other than full defection (C<sub>3</sub>) 62.5% of the time when r is large. Remember that without C<sub>1</sub> and C<sub>2</sub> full defection (C<sub>3</sub>) prevails over full cooperation (C<sub>0</sub>) because the former exploits the benefit yielded by the latter. However, as I saw in section 4.2, strategies C<sub>1</sub> and C<sub>2</sub> always create an invasion path of C<sub>3</sub> → C<sub>2</sub> → C<sub>1</sub> → C<sub>0</sub>. Additionally, when r is large there are other invasion paths, such as C<sub>3</sub> → C<sub>2</sub> → C<sub>0</sub> and C<sub>3</sub> → C<sub>1</sub> → C<sub>0</sub>. These paths contribute to the evolutionary success of more cooperative strategies.

I have confirmed the validity of the analytical results [Equation (19)] by computer simulations for parameters that do not allow the existence of stable equilibrium Q<sub>13</sub> in the corresponding infinite population model (**Figures 5**, **6**). Note that when Q<sub>13</sub> exists and when the population size M is large, it takes enormous time to numerically confirm Equation (19) due to the reason described in the beginning of this subsection. Analyses for the cases of p = 0 and p = 1 are found in Section F in the Supplementary Material.

robust against the invasion of C<sub>0</sub>. Parameters studied: p ∈ {0.1, 0.2, · · · , 0.8, 0.9} and r ∈ {1.2, 1.4, · · · , 2.6, 2.8}.
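The piecewise result in Equation (19) can be read as a lookup over the (p, r) plane. The following is a minimal Python sketch of that lookup, not the author's code; the function name `stationary_distribution` is hypothetical, and the treatment of parameter values lying exactly on a boundary (which Equation (19) leaves undefined) is an arbitrary choice made here.

```python
from fractions import Fraction


def stationary_distribution(p, r):
    """Return (q0, q1, q2, q3) as given by Equation (19) for 0 < p < 1, large M.

    Values of r exactly on a region boundary are not covered by Equation (19);
    this sketch simply assigns them to the lower region.
    """
    if not 0 < p < 1:
        raise ValueError("Equation (19) is stated for 0 < p < 1 only")
    upper = (3 + 3 * p) / (1 + 2 * p)   # boundary between the two top regions
    middle = (7 - 3 * p) / (4 - 2 * p)  # boundary between the two middle regions
    if r > upper:
        counts = (5, 4, 1, 6)
    elif r > middle:
        counts = (5, 5, 2, 6)
    elif r > 1.5:
        counts = (1, 1, 2, 6)
    else:
        # These counts sum to 26, matching the decimal values in Equation (19).
        counts = (2, 3, 6, 15)
    total = sum(counts)
    return tuple(Fraction(c, total) for c in counts)


# Parameters used in the simulation caption: p = 0.5, r = 2.5.
q = stationary_distribution(0.5, 2.5)
share_not_full_defection = 1 - q[3]  # time spent away from C3
```

For p = 0.5 and r = 2.5 the sketch selects the first case of Equation (19), so the share of time spent away from full defection equals the 62.5% quoted in the text.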

#### 5. DISCUSSION

This paper explicitly models the process of negotiation among players, including conditional cooperators, to study its evolutionary consequences. There is much similarity between my model here and previous models of repeated games. In particular, my strategy C<sub>k</sub>, which changes his/her own thought to cooperation if and only if k or more others show the thought of cooperation, corresponds to strategy T<sub>a</sub> proposed by Boyd and Richerson (1988), which cooperates in the next round of the repeated Prisoner's Dilemma game if and only if a or more others play cooperation in the current round. A very similar formulation is also found in Segbroeck et al. (2012), where their R<sub>M</sub> strategy cooperates if M or more individuals (including self) cooperated in the previous round. Two major differences between the current model and those previous models are that (i) only the final state of negotiation affects one's payoff in my model, whereas each round of the repeated game yields a payoff to players in the models of Boyd and Richerson (1988) and Segbroeck et al. (2012), and that (ii) players update their thought asynchronously in the negotiation stage in my model, whereas all players update their actions synchronously in Boyd and Richerson (1988) and Segbroeck et al. (2012). Conditional cooperators in my model can detect unconditional defectors during negotiation at no cost and avoid being exploited by them, while conditional cooperators in Boyd and Richerson (1988) and Segbroeck et al. (2012) can detect unconditional defectors only after being exploited by them in the first round of the repeated game and hence detection of unconditional defectors is costly there (compare Figures 2, 3 of Brandt and Sigmund, 2006 to understand how the payoff in the first round qualitatively changes evolutionary dynamics). Similar phenomena, though the modeling framework is quite different from the current one, were found in the continuous-time, two-player "coaction" model by van Doorn et al. (2014), where the authors found that (i) real time coaction in response to partner's behavior (analogous to my negotiation stage) generally favors cooperation but that (ii) once delay in information about the behavior of one's partner is introduced, as is often the case with discrete-round repeated Prisoner's Dilemma games, achieving cooperation becomes more difficult. Therefore, the introduction of a negotiation stage, if the possibility of lying is suppressed by some mechanism such as punishment (Sigmund et al., 2010; Quiñones et al., 2016) or ostracism (Nakamaru and Yokoyama, 2014), contributes to enhancing the efficiency of conditional cooperation.

FIGURE 4 | A stationary distribution of the Fermi process of the three-person game in a finite population of size M(≫1). Each small pie chart represents the fraction of time the Fermi process stays at each monomorphic state. The three solid lines represent r = 3/2, (7 − 3p)/(4 − 2p) and (3 + 3p)/(1 + 2p), respectively. Parameters studied: p ∈ {0.1, 0.2, · · · , 0.8, 0.9} and r ∈ {1.2, 1.4, · · · , 2.6, 2.8}.

dotted lines in the right show the stationary distribution of strategies, (q<sub>0</sub>, q<sub>1</sub>, q<sub>2</sub>, q<sub>3</sub>) = (1/16)(5, 4, 1, 6), predicted by Equation (19) for the adiabatic limit. When s is close to zero, each strategy has approximately the frequency of one fourth. Parameters: M = 36, p = 0.5, r = 2.5 and c = 1.0. Mutation rate was set to µ = 10<sup>−4</sup> per elementary updating step. M elementary steps constitute one generation. Time average was taken over 10<sup>8</sup> generations.

It is notable that my model explains the presence of conditional cooperation not as an evolutionarily stable strategy (ESS). For example, a classical ESS analysis of the Tit-For-Tat strategy (Axelrod and Hamilton, 1981) predicts that everyone should adopt conditional cooperation at an evolutionary equilibrium. However, recent experiments strongly suggest that there is wide variation in behavior among people (Fischbacher et al., 2001; Martinsson et al., 2013). My analysis here, in contrast, predicts evolutionary coexistence of many types of players. In fact, I have found, for both two-player and three-player games and in both infinite and finite population analyses, that the existence of conditional cooperators creates a cycle of invasion, in which unconditional defectors are invaded by conditional cooperators, which are invaded by unconditional cooperators, which are then invaded by unconditional defectors. As a result, cooperation is sustained to some degree in the population. Note that, although my model predicts such cyclical invasion over time, it is best interpreted as the possibility of polymorphism, because the evolutionary model here inevitably simplifies other factors of human decision making. A similar evolutionary cycle has been found in Segbroeck et al. (2012). Conditional cooperators work as an evolutionary catalyst; they create an evolutionary advantage of being a cooperator, and self-sustain their presence in the population. This is quite in contrast to a population with unconditional defectors and unconditional cooperators only, where defection is a dominant strategy.

As mentioned in the Model section, my negotiation model makes a very strong assumption: that players can never change the action (i.e., never tell a lie) once the negotiation reaches a stationary state. This can be understood as players making a commitment before the game is actually played. Recently, a series of papers analyzed the effect of such pre-commitments on evolution of cooperation (Han et al., 2013, 2015a,b, 2017a,b; Sasaki et al., 2015; Han and Lenaerts, 2016) and found that pre-commitments were effective in enhancing cooperation. Those works typically assume that players can choose whether they make a costly commitment before the game. If one breaks the commitment, he or she has to pay a fine. It has been shown that a large fine enhances the success of commitment strategies (Han et al., 2013, 2015a, 2017a). Another possible way to suppress those who make a fake commitment would be to exclude them from other games in the future. I have not modeled these "outside-game" possibilities in this paper but have concentrated on describing the one-shot negotiation game.

Among those papers on pre-commitments, Han et al. (2017a) has notable similarity to my current model, because both study public goods games and consider conditional cooperators who are sensitive to the behavior of others in the group. Through a finite population game dynamics analysis, Han et al. (2017a) essentially found a similar evolutionary cycle, from unconditional defectors to conditional cooperators, then to unconditional cooperators, and then to unconditional defectors again. However, there is a notable difference between these two models. In my model players make "commitments" to cooperate or to defect depending on the number of other cooperators and defectors during a process of dynamic negotiation. In the model of Han et al. (2017a), by contrast, all players except pure defectors first make commitments to cooperate, and then count the number of committers to see if this number exceeds their threshold to actually play the public goods game.

My model does not rely on the mechanism of direct reciprocity in the sense that the same individuals do not have to interact repeatedly. This feature is shared by models of generalized reciprocity (Hamilton and Taborsky, 2005; Pfeiffer et al., 2005; Chiong and Kirley, 2015), where individuals make decisions based on the previous encounter with other group members. A driving force of evolution of generalized reciprocity is assortment of cooperative strategies (Rankin and Taborsky, 2009) based on contingent movement of individuals between groups (Hamilton and Taborsky, 2005), a small group size (Pfeiffer et al., 2005) (but see Barta et al. (2011), where random drift helps generalized reciprocity to overcome initial disadvantage in a large group), or network structure (van Doorn and Taborsky, 2012). Generalized reciprocity has been proposed as a mechanism that does not require high cognitive ability, and hence is applicable to cooperation by non-human animals (Rutte and Taborsky, 2007; Schneeberger et al., 2012; Leimgruber et al., 2014; Gfrerer and Taborsky, 2017) as well as empathy-based cooperation by humans (Bartlett and DeSteno, 2006; Stanca, 2009). In contrast, a driving force of cooperation in my model is coordination of behavior based on negotiation and pre-commitments. Therefore, its scope of application is rather cognition-based cooperation (and defection), which characterizes another aspect of human sociality (Knoch et al., 2006; Baumgartner et al., 2011; Ruff et al., 2013; Yamagishi et al., 2016).

A technical advantage of employing the finite population analysis is that, in contrast to replicator dynamics analysis for an infinitely large population where outcomes can depend on initial conditions and many complexities can arise due to high dimensionality, it can predict a stationary probability distribution that is independent of initial conditions. There is a limitation in my analysis based on the adiabatic limit and strong selection, though, because the mutation rate must be unrealistically low for the Fermi process to reach either end of the C<sub>1</sub>-C<sub>3</sub> edge (i.e., fixation of one strategy) despite the tendency toward evolutionary coexistence of those two strategies due to negative frequency-dependent selection. Nevertheless, I believe that this methodology can give us some insights that would not have been derived by replicator dynamics analyses.

There is a growing interest in studying negotiation processes to see how flexibility in behavior shapes an evolutionary outcome (McNamara et al., 1999; McNamara, 2013; Quiñones et al., 2016; Ito et al., 2017). My negotiation model here is such an attempt to reveal the origin of conditional cooperators and to explain why we observe both cooperation and defection in the real world.


### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

### FUNDING

This work was supported by JSPS KAKENHI (JP25118001, JP25118006, JP16H06324) to HO.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00062/full#supplementary-material


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Ohtsuki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Rare Third-Party Punishment Promotes Cooperation in Risk-Averse Social Learning Dynamics

#### Mitsuhiro Nakamura\*

Organization for the Strategic Coordination of Research and Intellectual Properties, Meiji University, Tokyo, Japan

Third-party punishment is a common mechanism to promote cooperation in humans. Theoretical models of evolution of cooperation predict that punishment maintains cooperation if it is sufficiently frequent. On the other hand, empirical studies have found that participants who frequently punish others do not succeed in comparison with those not eager to punish others, suggesting that punishment is suboptimal and thus should not be frequent. That being the case, our question is what mechanism, if any, can sustain cooperation even if punishment is rare. The present study proposes that one possible mechanism is risk-averse social learning. Using the method of evolutionary game dynamics, we investigate the effect of the risk attitude of individuals on this question. In our framework, individuals select a strategy based on its risk, i.e., the variance of the payoff, as well as its expected payoff; risk-averse individuals prefer to select a strategy with a less variable payoff. Using the framework, we examine the evolution of cooperation in two-player social dilemma games with punishment. We study two models: in the first, cooperators and defectors compete, while defectors may be punished by an exogenous authority; in the second, cooperators, defectors, and cooperative punishers compete, while defectors may be punished by the cooperative punishers. We find that in both models, risk-averse individuals achieve stable cooperation with a significantly low frequency of punishment. We also examine three punishment variants: in each game, all defectors are punished; only one of the defectors is punished; or only a defector who exploits a cooperator or a cooperative punisher is punished. We find that the first and second variants effectively promote cooperation. Comparing the first and second variants, each can be more effective than the other depending on punishment frequency.

Keywords: evolutionary game dynamics, cooperation, third-party punishment, social learning, risk aversion

## 1. INTRODUCTION

Cooperation is observed in various species, although it seems unfavorable in view of selfishness [1–3]. Among others, human cooperation is unique in that humans enforce cooperation among themselves by means of social norms and institutions: norm violators are punished by community members and thus cooperation is maintained [4–10]. In human cooperation, a punisher is often a third party who does not directly suffer from a norm violation. From the viewpoint of rationality, the third-party punisher has no incentive to vicariously punish the norm violator at a personal cost [7, 8]; therefore, third-party punishment is another dilemma of cooperation [6, 10–12]. Despite that, empirical studies suggest that third-party punishment is ubiquitous across humans [8, 13].

#### Edited by:

Tatsuya Sasaki, F-Power Inc., Japan

#### Reviewed by:

Boyu Zhang, Beijing Normal University, China Xiaojie Chen, University of Electronic Science and Technology of China, China Víctor M. Eguíluz, Institute of Interdisciplinary Physics and Complex Systems (IFISC), Spain

\*Correspondence:

Mitsuhiro Nakamura nakamuramh@meiji.ac.jp

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 30 September 2017 Accepted: 18 December 2018 Published: 29 January 2019

#### Citation:

Nakamura M (2019) Rare Third-Party Punishment Promotes Cooperation in Risk-Averse Social Learning Dynamics. Front. Phys. 6:156. doi: 10.3389/fphy.2018.00156

Numerous evolutionary models have been proposed to solve the dilemma of third-party punishment: group selection [14], reputation as a signal to induce the others' cooperation toward the punisher [15], social structure that localizes interactions [16, 17], conformist bias whereby a majority strategy is imitated in social learning [18], an option to opt out of joint enterprise [12, 19], second-order punishment [20, 21], commitment to cooperation before playing a game [22], and implicated punishment in which members of the same group as a wrongdoer are also punished [23]. In all these models, punishment should be sufficiently frequent to maintain cooperation. On the other hand, laboratory studies found that participants who frequently engaged in punishment did not succeed in comparison with those not eager to punish, suggesting that punishing others too frequently is maladaptive [24, 25]. If so, how can cooperation be maintained with only occasional third-party punishment?

In this study, we propose an idea to promote cooperation even when third-party punishment is rare: risk aversion. An obvious psychological fact is that norm violation is a risky choice: it may provoke the anger of community members, which can lead to actual punishment of the norm violator [26]. In fact, public executions were common in pre-modern societies, intended by rulers to instill fear of committing a norm violation. Moreover, experimental studies suggest that the mere threat of punishment can promote cooperation [27, 28].

To incorporate risk psychology into evolutionary game theory, we extend the canonical evolutionary game dynamics with a risk-sensitive utility function, which can describe risk-prone and risk-averse strategy selection. To summarize our results, risk aversion promotes cooperation even with only a small amount of third-party punishment.

## 2. AUTHORITATIVE THIRD-PARTY PUNISHMENT

We first introduce a simple model of competition between cooperators and defectors in an infinite, well-mixed population, in which defectors are probabilistically punished by a third-party authority. From time to time, two randomly sampled individuals play a social dilemma game called the weak prisoner's dilemma game [29], in which players have two options: cooperation (C) and defection (D). Its payoff matrix is given by

$$\begin{array}{cc} & \begin{array}{cc} \text{C} & \text{D} \end{array} \\ \begin{array}{c} \text{C} \\ \text{D} \end{array} & \begin{bmatrix} 1 & 0 \\ T & 0 \end{bmatrix} \end{array}, \tag{1}$$

where T > 1. In this game, mutual cooperation provides payoff 1 to both players, while they have temptation to enjoy one-sided defection as it provides better payoff T (> 1). However, each game is observed by an authoritative third-party punisher with probability z, and those who have selected defection are fined by an amount F (> 0). The population evolves according to replicator dynamics [30, 31].

#### 2.1. Evolutionary Stability of Cooperators

Here, we consider evolutionary stability of a monomorphic population of cooperators against invasion by defectors. Our finding is that risk aversion of individuals lowers the required frequency of observation to maintain cooperation; the authority's cost for punishment is significantly lower than predicted by the risk-neutral theory.

The ordinary evolutionary game dynamics assume that players change strategies based on their expected payoffs. Let us consider our model on this line. In a monomorphic population of cooperators, the expected payoff of resident cooperators is 1—they mutually cooperate—and that of mutant defectors is T − zF—they enjoy one-sided defection but are punished with probability z. Therefore, the population of cooperators is evolutionarily stable against invasion by defectors, i.e., ESS, if 1 > T − zF, i.e.,

$$z > \frac{T - 1}{F} =: z\_{\text{neutral}}^\*. \tag{2}$$

The infimum of the required probability of observation, $z^{\ast}_{\text{neutral}}$, is a power function of the amount of fine, F; i.e., $z^{\ast}_{\text{neutral}} \propto F^{-1}$ (the dashed line in **Figure 1**). This implies that even if the authority imposes a heavy fine on defectors, the authority needs to punish defectors quite often to maintain cooperation; the cost of maintaining cooperation should be considerable.

We extend the ordinary theory by assuming that players change strategies according to their utility. Given that using strategy s results in a stochastic payoff represented by random variable $R_s$ (the realization of $R_s$ is $R_s^{(i)}$ with probability $p_s^{(i)}$, where i indicates each outcome), its utility is defined by

$$
u_s = \frac{1}{\beta} \log \mathbb{E} \left[ \mathrm{e}^{\beta R_s} \right], \tag{3}
$$

FIGURE 1 | The infimum of the required probability of observation by an authority to maintain cooperation: individuals are risk neutral (dashed line; $z^{\ast}_{\text{neutral}}$, where β = 0), risk averse (solid line; $z^{\ast}_{\text{biased}}$ with β = −1), or risk prone (dotted line; $z^{\ast}_{\text{biased}}$ with β = 1). The red dot-dashed line represents the asymptote to which $z^{\ast}_{\text{biased}}$ with β = 1 converges. Parameters: T = 2.

where $\mathbb{E}\left[\mathrm{e}^{\beta R_s}\right] = \sum_i p_s^{(i)} \mathrm{e}^{\beta R_s^{(i)}}$ represents the expected value of random variable $\mathrm{e}^{\beta R_s}$. Equation (3) is a well-known exponential utility function developed by Pratt [32], Howard and Matheson [33], Coraluppi and Marcus [34] and Mihatsch and Neuneier [35]. It can be expanded to

$$\mathbb{E}\left[R_s\right] + \frac{\beta}{2}\text{Var}\left[R_s\right] + O(\beta^{2}),\tag{4}$$

where the first term is the expected value of the payoff and the second term is proportional to the variance of the payoff. Thus, if β = 0, the utility is equal to the expected value, implying riskneutral utility; if β < 0, the utility is decreased by the second term, implying risk-averse utility with which an individual finds a strategy less preferable if it produces a highly variable payoff; and if β > 0, the utility is increased by the second term, implying risk-prone utility with which an individual finds a strategy more preferable if it produces a highly variable payoff.
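The relation between the exponential utility of Equation (3) and its mean-variance expansion in Equation (4) can be checked numerically. The following Python sketch is illustrative only: the function names and the two-outcome lottery are hypothetical and are not taken from the paper.

```python
import math


def exponential_utility(outcomes, beta):
    """Equation (3) for a list of (payoff, probability) pairs.

    For beta = 0 the limit of Equation (3), the expected payoff, is returned.
    """
    if beta == 0:
        return sum(prob * payoff for payoff, prob in outcomes)
    expectation = sum(prob * math.exp(beta * payoff) for payoff, prob in outcomes)
    return math.log(expectation) / beta


def mean_variance_utility(outcomes, beta):
    """First two terms of Equation (4): mean plus (beta / 2) times variance."""
    mean = sum(prob * payoff for payoff, prob in outcomes)
    variance = sum(prob * (payoff - mean) ** 2 for payoff, prob in outcomes)
    return mean + 0.5 * beta * variance


# Hypothetical lottery: payoff 2 with probability 0.8, payoff -1 with probability 0.2.
lottery = [(2.0, 0.8), (-1.0, 0.2)]
expected_payoff = exponential_utility(lottery, 0.0)

for beta in (-0.01, 0.01):
    exact = exponential_utility(lottery, beta)
    approx = mean_variance_utility(lottery, beta)
    print(f"beta={beta:+.2f}  exact={exact:.6f}  approx={approx:.6f}")
```

For small |β| the two values agree closely, while for larger |β| the higher-order terms in Equation (4) matter; a risk-averse agent (β < 0) values the lottery below its expected payoff and a risk-prone agent (β > 0) above it.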

In the case of risk aversion or proneness (i.e., if β ≠ 0), the utility of being a cooperator and that of being a defector are, from Equation (3), given by

$$
u_{\rm C} = \frac{1}{\beta} \log \left[ 1 \cdot \mathrm{e}^{\beta \cdot 1} \right] = 1 \tag{5a}
$$

and

$$u_{\rm D} = \frac{1}{\beta} \log \left[ z\, \mathrm{e}^{\beta(T-F)} + (1-z)\, \mathrm{e}^{\beta T} \right],\tag{5b}$$

respectively. A straightforward calculation leads to the ESS condition corresponding to Equation (2): $u_{\rm C} > u_{\rm D}$, i.e.,

$$z > \frac{1 - \mathrm{e}^{-\beta(T-1)}}{1 - \mathrm{e}^{-\beta F}} =: z_{\text{biased}}^{\ast}.\tag{6}$$

Note that $\lim_{\beta \to 0} z^{\ast}_{\text{biased}} = z^{\ast}_{\text{neutral}}$ holds true. If β < 0, its asymptotic form is an exponential function of F (i.e., $z^{\ast}_{\text{biased}} \propto \mathrm{e}^{\beta F}$), and it rapidly approaches 0 as F increases (the solid line in **Figure 1**). This implies that for maintaining cooperation among risk-averse individuals, the authority needs to punish defectors only occasionally. Compared to the ordinary theory, the cost to maintain cooperation should be significantly lower. If β > 0, $z^{\ast}_{\text{biased}}$ approaches $1 - \mathrm{e}^{-\beta(T-1)}$ (> 0) as F increases; punishment must remain frequent however heavy the fine (the dotted line in **Figure 1**).
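The thresholds in Equations (2) and (6) are simple enough to evaluate directly. The Python sketch below is a minimal illustration; the function names are hypothetical, T = 2 follows the caption of Figure 1, and the fine values are chosen here only for illustration.

```python
import math


def z_neutral(T, F):
    """Risk-neutral threshold of Equation (2)."""
    return (T - 1) / F


def z_biased(T, F, beta):
    """Risk-sensitive threshold of Equation (6); falls back to Equation (2) at beta = 0."""
    if beta == 0:
        return z_neutral(T, F)
    # 1 - exp(-a) equals -expm1(-a); expm1 stays accurate when |a| is tiny.
    return math.expm1(-beta * (T - 1)) / math.expm1(-beta * F)


T = 2.0
for F in (3.0, 10.0):  # illustrative fines, not values from the paper
    print(
        f"F={F:4.1f}  neutral={z_neutral(T, F):.4f}  "
        f"averse={z_biased(T, F, -1.0):.6f}  prone={z_biased(T, F, 1.0):.4f}"
    )
```

A threshold above 1 (which occurs, for instance, when F < T − 1) means that no observation probability can stabilize cooperation for those parameters.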

#### 2.2. Dimorphism of Cooperators and Defectors

If Equation (2) or (6) is violated, defectors invade the population of cooperators. After that, they may form a stable dimorphic population with cooperators. Here, we study the effect of the risk attitude of individuals on such dimorphism. We find that risk aversion increases the frequency of cooperators. Moreover, we introduce three variants of punishment relevant in dimorphism: (a) to punish all defectors (most costly); (b) to punish one of them as a warning for others [less costly than variant (a)]; or (c) to punish only one-sided defectors (cheapest). We find that the first and second variants, but not the third, achieve cooperative dimorphism. Surprisingly, the first variant can be the most cost-effective solution to maintain cooperation with a reasonably small probability of observation.

Unlike the case of monomorphism (Section 2.1), in which defection by a mutant is always toward a resident cooperator, mutual defection between two defectors is also likely in dimorphism. Consequently, a problem arises: how should the third-party authority treat mutual defection? Should the authority punish both defectors? This might be too costly. Should it punish only one of them as a warning for others to inhibit defection in the future? This is less costly but could be insufficient. As the two defectors obtain nothing in mutual defection, should it punish neither of them? To address this, we consider three variants that rule differently on mutual defection (**Figure 2**): (a) the authority punishes all defectors; (b) the authority punishes one of the defectors selected at random; and (c) the authority punishes only a one-sided defector, so that under mutual defection neither defector is punished. Hereafter, we call them the ALL, ONE, and ONE-SIDED variants, respectively.

FIGURE 2 | Three variants of third-party punishment. Blank circles represent players and those with "P" represent punishers as observers. Each line connecting blank circles, above which "D D" or "D C" is attached, represents mutual defection or one-sided defection in a game, respectively. Arrows represent that punishment is executed. (A) ALL defectors are punished: in case of mutual defection, the punisher pays cost 2C and each of the two defectors pays fine F. (B) Only ONE of defectors is punished: in case of mutual defection, the punisher pays cost C and one of the two defectors, selected at random, pays fine F. (C) Only a ONE-SIDED defector is punished: the punisher does not care about mutual defection. In all the three variants, the punisher pays cost C and the defector pays fine F in case of one-sided defection.

For each variant with different risk attitudes, we numerically find stable points of the replicator dynamics of cooperators and defectors, i.e.,

$$
\dot{x} = x(1-x) \left[ u_{\rm C} - u_{\rm D} \right], \tag{7}
$$

where x is the frequency of cooperators, and $u_{\rm C}$ and $u_{\rm D}$ are the utility of being a cooperator (Equation A1) and that of being a defector (Equation A2), respectively. The ALL and ONE variants achieve stable dimorphism of cooperators and defectors (**Figures 3A,B**). In these variants, the smaller β is, the higher the stable frequency of cooperators. As expected, the ALL variant achieves higher cooperation than the ONE variant. The ONE-SIDED variant does not achieve dimorphism because $u_{\rm C} > u_{\rm D}$ with Equations (A1, A2c) is equivalent to Equation (6); a stable population in this variant consists of all defectors if the ESS condition (i.e., Equation 2 or 6) is violated.
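A numerical search for the stable point of Equation (7) might look like the Python sketch below. Equations (A1) and (A2) are not reproduced in this section, so the payoff lotteries in `utility_gap` are an assumption reconstructed from the verbal description of the three variants (a defector who meets a cooperator is fined with probability z; after mutual defection a given defector is fined with probability z, z/2, or 0 in the ALL, ONE, and ONE-SIDED variants). All function names, the value z = 0.05, and the integration settings are hypothetical.

```python
import math


def utility(outcomes, beta):
    """Exponential utility (Equation 3) of a list of (payoff, probability) pairs."""
    if beta == 0:
        return sum(prob * payoff for payoff, prob in outcomes)
    return math.log(sum(prob * math.exp(beta * payoff) for payoff, prob in outcomes)) / beta


def utility_gap(x, z, T, F, beta, variant):
    """u_C - u_D at cooperator frequency x, under the ASSUMED payoff lotteries."""
    # Assumed chance that a given defector is fined after mutual defection.
    fined = {"ALL": z, "ONE": z / 2, "ONE-SIDED": 0.0}[variant]
    cooperator = [(1.0, x), (0.0, 1 - x)]
    defector = [
        (T - F, x * z),                # exploits a cooperator, observed and fined
        (T, x * (1 - z)),              # exploits a cooperator, not observed
        (-F, (1 - x) * fined),         # mutual defection, fined
        (0.0, (1 - x) * (1 - fined)),  # mutual defection, not fined
    ]
    return utility(cooperator, beta) - utility(defector, beta)


def stable_cooperator_frequency(z, T, F, beta, variant, x0=0.5, dt=0.5, steps=20_000):
    """Forward-Euler integration of Equation (7) from x0; returns the final x."""
    x = x0
    for _ in range(steps):
        x += dt * x * (1 - x) * utility_gap(x, z, T, F, beta, variant)
        x = min(max(x, 0.0), 1.0)  # keep x inside [0, 1]
    return x


# Illustrative parameters: risk-averse individuals, z below the threshold of Equation (6).
params = dict(z=0.05, T=2.0, F=3.0, beta=-1.0)
x_all = stable_cooperator_frequency(variant="ALL", **params)
x_one = stable_cooperator_frequency(variant="ONE", **params)
x_one_sided = stable_cooperator_frequency(variant="ONE-SIDED", **params)
```

Under these assumptions the ONE-SIDED gap has the same sign as the ESS condition of Equation (6) at every interior x, which is consistent with the statement above that this variant cannot sustain dimorphism.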

Although the ALL variant achieves higher cooperation than the ONE variant does, the authority might have to punish more defectors, and thus pay a higher cost, in the ALL variant than in the ONE variant. This concern turns out to be unnecessary for risk-averse individuals with a sufficiently large, but still reasonably small, probability of observation, z. Given that the stable frequency of cooperators is $x^{\ast}$, the probability for the observing authority to find one-sided defection is $2x^{\ast}(1 - x^{\ast})$ and that to find mutual defection is $(1 - x^{\ast})^2$. Thus, the expected number of punishments per observation in the ALL variant is $2x^{\ast}(1 - x^{\ast}) \times 1 + (1 - x^{\ast})^2 \times 2 = 2(1 - x^{\ast})$ and that in the ONE variant is $2x^{\ast}(1 - x^{\ast}) \times 1 + (1 - x^{\ast})^2 \times 1 = 1 - (x^{\ast})^2$. At first glance, the former looks larger than the latter. However, since $x^{\ast}$ in the ALL variant is larger than that in the ONE variant (see **Figures 3A,B**), the effective number of punishments per observation in the ALL variant can be smaller than that in the ONE variant (**Figures 3C,D** plot them and **Figure 3E** shows their difference). If individuals are risk averse, a reasonably small z makes the ALL variant less expensive; i.e., the branching point at which the sign of the difference changes becomes smaller as β decreases (in **Figure 3E**, the branching point in the case of β = −1 is located around z = 0.15).
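The comparison of punishment counts in the preceding paragraph can be made concrete with a short Python sketch. The function names are hypothetical, and the two stable frequencies used in the example (0.9 and 0.8) are made-up illustrative values, not numbers read off Figure 3.

```python
def expected_punishments_all(x_star):
    """Expected punishments per observation when ALL defectors are punished."""
    one_sided = 2 * x_star * (1 - x_star)  # probability of observing one-sided defection
    mutual = (1 - x_star) ** 2             # probability of observing mutual defection
    return one_sided * 1 + mutual * 2


def expected_punishments_one(x_star):
    """Expected punishments per observation when only ONE defector is punished."""
    one_sided = 2 * x_star * (1 - x_star)
    mutual = (1 - x_star) ** 2
    return one_sided * 1 + mutual * 1


# Hypothetical stable frequencies: higher cooperation under ALL than under ONE.
cost_all = expected_punishments_all(0.9)  # equals 2 * (1 - 0.9)
cost_one = expected_punishments_one(0.8)  # equals 1 - 0.8 ** 2
print(f"ALL: {cost_all:.2f} punishments per observation, ONE: {cost_one:.2f}")
```

With these illustrative frequencies the ALL variant requires fewer punishments per observation, even though it would require more at any common value of $x^{\ast}$ below 1.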

### 3. ENDOGENOUS THIRD-PARTY PUNISHMENT

So far, we have assumed that the punisher is an exogenous authority that exists outside the population dynamics. Although this assumption seems reasonable for societies in which a mature institution for authoritative punishment exists, small-scale societies such as hunter-gatherers may require a different scenario. Our next question is what happens without any such leviathan. Our finding is that if individuals are risk averse, a few endogenous third-party punishers, who evolve within the population dynamics, can maintain high cooperation.

Here, we consider another model of competition among cooperators, defectors, and endogenous third-party punishers. From time to time, three randomly sampled individuals interact: as in Section 2, two of them, selected at random, play the weak prisoner's dilemma game; the remaining one observes the game and can punish each defector at cost C (> 0). Again, those being punished pay fine F. Cooperators select C in a game and do nothing as observers; defectors select D in a game and do nothing as observers; and punishers select C in a game and punish any defectors they observe. The individuals change strategies according to replicator–mutator dynamics [36–38] based on their risk-sensitive utilities (Equation 3), given by

$$\dot{x} = x f_{\mathrm{C}}(1-\mu) + y f_{\mathrm{D}}\frac{\mu}{2} + z f_{\mathrm{P}}\frac{\mu}{2} - x\langle f \rangle,\tag{8a}$$

$$\dot{y} = y f_{\mathrm{D}}(1-\mu) + z f_{\mathrm{P}}\frac{\mu}{2} + x f_{\mathrm{C}}\frac{\mu}{2} - y\langle f \rangle,\tag{8b}$$

and

$$\dot{z} = z f_{\mathrm{P}}(1-\mu) + x f_{\mathrm{C}}\frac{\mu}{2} + y f_{\mathrm{D}}\frac{\mu}{2} - z\langle f \rangle,\tag{8c}$$

where x, y, and z are frequencies of cooperators, defectors, and punishers,

$$
\langle f \rangle = x f_{\mathrm{C}} + y f_{\mathrm{D}} + z f_{\mathrm{P}} \tag{9}
$$

is the average fitness and

$$f_s = 1 - w + w u_s \tag{10}$$

is the fitness of strategy s (= C, D, or P), where u<sub>s</sub> is given by Equations (A3, A4). In Equation (8), µ is the probability with which an individual mutates his/her strategy to another by chance: one keeps his/her strategy s with probability 1 − µ; otherwise, his/her strategy after mutation is one of the other strategies s′ (≠ s), each with probability µ/2, where 2 is the number of the other strategies. In Equation (10), w (0 ≤ w ≤ 1) is a parameter that controls the intensity of selection; large (small) w implies strong (weak) selection.

As our main interest is not in their effects, we fix w = 0.1 and µ = 0.01 throughout Section 3. Our motivation for employing replicator–mutator dynamics here is (1) to avoid artificial neutral stability between cooperators and punishers when defectors are not present and (2) to make the model more realistic: in social learning, humans often explore different strategies at random [39].
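Equations (8)–(10) translate directly into code. The sketch below evaluates the right-hand side of the replicator–mutator dynamics and takes one forward-Euler step; the utilities u<sub>C</sub>, u<sub>D</sub>, and u<sub>P</sub> are supplied by a caller-provided function `utility_fn`, which is a hypothetical stand-in for Equations (A3, A4) (not reproduced here). The Euler scheme and the step size `dt` are likewise assumptions made for illustration, not the integration method used in the study.

```python
def fitness(u, w=0.1):
    """Equation (10): fitness from utility u under selection intensity w."""
    return 1.0 - w + w * u


def replicator_mutator_rhs(state, utilities, w=0.1, mu=0.01):
    """Right-hand side of Equations (8a)-(8c).

    state     : (x, y, z) frequencies of cooperators, defectors, punishers
    utilities : (u_C, u_D, u_P) risk-sensitive utilities at this state
    """
    x, y, z = state
    f_c, f_d, f_p = (fitness(u, w) for u in utilities)
    mean_f = x * f_c + y * f_d + z * f_p                      # Equation (9)
    dx = x * f_c * (1 - mu) + y * f_d * mu / 2 + z * f_p * mu / 2 - x * mean_f
    dy = y * f_d * (1 - mu) + z * f_p * mu / 2 + x * f_c * mu / 2 - y * mean_f
    dz = z * f_p * (1 - mu) + x * f_c * mu / 2 + y * f_d * mu / 2 - z * mean_f
    return dx, dy, dz


def euler_step(state, utility_fn, dt=0.01, w=0.1, mu=0.01):
    """One forward-Euler step; utility_fn(x, y, z) -> (u_C, u_D, u_P)."""
    rhs = replicator_mutator_rhs(state, utility_fn(*state), w, mu)
    return tuple(v + dt * d for v, d in zip(state, rhs))
```

Because each strategy loses a fraction µ of its offspring and the other two strategies each gain µ/2 of it, the three derivatives sum to zero whenever x + y + z = 1, so the frequencies stay normalized.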

We numerically examine the replicator–mutator dynamics (Equation 8) for each variant with different risk attitudes. As a reference point, we choose a set of parameters (T = 2, C = 1, F = 3, w = 0.1, and µ = 0.01) with which defectors are frequent in a population of risk-neutral (i.e., β = 0) individuals (**Figures 4G–I**). Then, we check the effect of changing parameter β: with extreme risk aversion (β = −10), individuals achieve almost full cooperation (**Figures 4A–C**); on the other hand, with extreme risk proneness (β = 10), they reach almost full defection (**Figures 4M–O**). We can understand these two extreme cases by examining lim<sub>β→±∞</sub> u<sub>s</sub> for s = C, D, and P (see **Appendix B** in Supplementary Material): because lim<sub>β→−∞</sub> u<sub>C</sub> = 0, lim<sub>β→−∞</sub> u<sub>D</sub> = −F (ALL and ONE variants) or T − F (ONE-SIDED variant), and lim<sub>β→−∞</sub> u<sub>P</sub> = −2C (ALL variant) or −C (ONE and ONE-SIDED variants) in the interior of the state space (apply Equation B1a to Equations A3, A4), the utility of cooperators is the largest if individuals are extremely risk averse (and if T < F in the ONE-SIDED variant); similarly, because lim<sub>β→∞</sub> u<sub>C</sub> = lim<sub>β→∞</sub> u<sub>P</sub> = 1 and lim<sub>β→∞</sub> u<sub>D</sub> = T (apply Equation B1b to Equations A3, A4), the utility of defectors is the largest if individuals are extremely risk prone. With moderate risk aversion (β = −1) or risk proneness (β = 1), the outcomes are in between (**Figures 4D–F** or **Figures 4J–L**, respectively). In the case of risk aversion, a population of frequent cooperators and a few punishers establishes stable and high cooperation; as in Section 2, the required frequency of observation (i.e., the frequency of punishers in this model) is small.
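The limiting behavior can be illustrated with a small numerical sketch. It assumes the exponential form of risk-sensitive utility, u = (1/β) log E[e<sup>βπ</sup>], which is the form the defector's utility takes in Section 4; the lottery used below is a hypothetical example and not one of the utilities in Equations (A3, A4).

```python
import math


def risk_sensitive_utility(payoffs, probs, beta):
    """u = (1/beta) * log(sum_i p_i * exp(beta * payoff_i)).

    beta < 0: risk averse, beta > 0: risk prone, beta = 0: expected payoff.
    Uses the log-sum-exp trick so that large |beta| does not overflow.
    """
    pairs = [(p, v) for p, v in zip(probs, payoffs) if p > 0]
    if beta == 0:
        return sum(p * v for p, v in pairs)
    m = max(beta * v for _, v in pairs)
    total = sum(p * math.exp(beta * v - m) for p, v in pairs)
    return (m + math.log(total)) / beta


# Hypothetical lottery: payoff -3 w.p. 0.1, 0 w.p. 0.5, 2 w.p. 0.4.
lottery = ([-3.0, 0.0, 2.0], [0.1, 0.5, 0.4])
for beta in (-200.0, -1.0, 0.0, 1.0, 200.0):
    print(beta, risk_sensitive_utility(*lottery, beta))
```

As β → −∞ the utility approaches the worst possible payoff, and as β → ∞ it approaches the best one, which is why extreme risk aversion is governed by worst-case outcomes such as being punished.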

Comparing the three variants of punishment, the ALL and ONE variants promote cooperation more easily than the ONE-SIDED variant in the case of risk aversion (clearly observed in **Figures 4D–F**). This is because, among the three variants, only the ONE-SIDED variant lacks the term e<sup>−βF</sup> in the defector's utility (see Equations A4a, A4c, A4e). In the other two variants with sufficiently strong risk aversion, the largest term in the defector's utility is e<sup>−βF</sup>, meaning that the defector's utility is most affected by the worst-case scenario in which he/she obtains nothing from cheating but is punished. In the ONE-SIDED variant, the largest term in the defector's utility is e<sup>β(T−F)</sup>, meaning that the dominant scenario in the utility is one in which the defector at least enjoys cheating but is punished. It is most difficult to promote cooperation in the ONE-SIDED variant because the defectors' worst-case scenario in this variant is milder than that in the ALL and ONE variants.

FIGURE 4 | [...] (G–I) β = 0. (J–L) β = 1. (M–O) β = 10. Parameters: T = 2, F = 3, C = 1, w = 0.1, and µ = 0.01.

### 4. THE DONATION GAME

Throughout the analyses, we have assumed that individuals play the weak prisoner's dilemma game (i.e., Equation 1). The so-called donation game, i.e., payoff matrix

$$
\begin{array}{c|cc}
 & \text{C} & \text{D} \\
\hline
\text{C} & b-c & -c \\
\text{D} & b & 0
\end{array}
\tag{11}
$$

where b > c > 0, has been adopted in many studies [3, 16, 17, 24, 25]. For those interested in the difference between the two games, in **Appendix C** (Supplementary Material), we note the results if we assume the donation game instead of the weak prisoner's dilemma game.

The two games have similar results except for the case of authoritative third-party punishment in which observation by the authority is not sufficiently frequent to stabilize cooperation: in this case, only the weak prisoner's dilemma game with the ALL or ONE variant achieves dimorphism of cooperators and defectors (Section 2.2). Technically, this is because in the weak prisoner's dilemma game, one-sided cooperation (i.e., selecting C against an opponent's D) and mutual defection (i.e., selecting D against an opponent's D) have the same payoff. Consider a monomorphic population of defectors. We denote by S and P, respectively, the payoff for selecting C and the payoff for selecting D in this population. Assuming authoritative third-party punishment of the ALL or ONE variant, the utility of being a cooperator is u<sub>C</sub> = S and that of being a defector is u<sub>D</sub> = (1/β) log[(z/k) e<sup>β(P−F)</sup> + (1 − z/k) e<sup>βP</sup>], where k = 1 in the ALL variant and k = 2 in the ONE variant. Thus, u<sub>C</sub> > u<sub>D</sub> ⇐⇒ z/k > (1 − e<sup>β(S−P)</sup>)/(1 − e<sup>−βF</sup>). In the case of the weak prisoner's dilemma game (i.e., if S = P), cooperators can invade the population of defectors if the authority watches individuals with any frequency (i.e., z > 0); in the case of the donation game (i.e., if S − P = −c), cooperators can invade the population of defectors if z > kz<sup>∗</sup><sub>DG</sub> (see Equation C2).
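The invasion condition can be verified numerically. The sketch below implements the defector's utility and the threshold on z/k exactly as written in this section; the function names and the parameter values in the example (β = −1, c = 1, F = 3) are illustrative assumptions, and the risk-neutral case β = 0 is handled by the limit (P − S)/F.

```python
import math


def defector_utility(z, k, beta, P, F):
    """u_D in a monomorphic defector population (k = 1: ALL, k = 2: ONE)."""
    q = z / k                                  # probability of being punished
    if beta == 0:
        return P - q * F                       # risk-neutral limit
    return math.log(q * math.exp(beta * (P - F))
                    + (1 - q) * math.exp(beta * P)) / beta


def invasion_threshold(beta, S, P, F):
    """Value of z/k above which u_C = S exceeds u_D.

    (1 - e^{beta (S - P)}) / (1 - e^{-beta F}), written with expm1
    for numerical accuracy near beta = 0.
    """
    if beta == 0:
        return (P - S) / F
    return math.expm1(beta * (S - P)) / math.expm1(-beta * F)


# Weak prisoner's dilemma (S = P): any z > 0 lets cooperators invade.
print(invasion_threshold(-1.0, 0.0, 0.0, 3.0))
# Donation game with c = 1 (S - P = -c): a strictly positive threshold.
print(invasion_threshold(-1.0, -1.0, 0.0, 3.0))
```

In this example the risk-averse threshold for the donation game is well below the risk-neutral value c/F, in line with the observation that risk aversion reduces the observation frequency needed to support cooperation.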

### 5. DISCUSSION

In this work, we have investigated the effect of risk attitude on social learning dynamics of third-party punishment. We studied two models: in the first model, the third-party punisher is an external authority that stands outside the competition of individuals; in the second model, those individuals endogenously perform third-party punishment so that the third-party punishers compete against non-punishers. In both models, risk-averse individuals achieved higher cooperation with a significantly lower frequency of punishment than risk-neutral or risk-prone individuals. In the first model, this means that a strong leviathan who constantly watches people and severely punishes norm violators is not needed; in the second model, it implies that not everyone needs to be an enforcer.

We also examined the effects of three variants of third-party punishment, the ALL, ONE, and ONE-SIDED variants, on the social learning dynamics. In the ALL variant, all defectors are punished; in the ONE variant, only one of the defectors is punished as a warning to others; and in the ONE-SIDED variant, only a defector who actually enjoyed cheating against a cooperator is punished. We found that since the worst-case payoff of defectors in the ONE-SIDED variant is milder than that in the other two variants, it is most difficult to promote cooperation in the ONE-SIDED variant: even if cheating is directed at a cheater, it should be punished in order to maintain cooperation. We also found that in the case of authoritative punishment, the ALL variant can be more efficient than the ONE variant with a reasonably small frequency of observation: punishment as a warning to others is efficient only if the authority can watch people very rarely.

Risk aversion has been directly or indirectly observed in laboratory experiments of social dilemma games with punishment opportunity [27, 28, 40]. Yamagishi [27] reported that in his study, the mere existence of punishment was sufficient to promote cooperation in early trials of the social dilemma experiments. The participants might not sufficiently realize the reward structure in their early trials, so that uncertainty of punishment would increase participants' cooperation. This is in line with the present study predicting that risk aversion promotes cooperation under the existence of punishment. Qin and Wang studied the effect of probabilistic punishment. In their study, they observed an inverted U-shaped relationship between the probability of punishment and the level of cooperation, suggesting that the participants' utility function was risk averse [40]. Moreover, children seem to be risk averse under the threat of punishment [28].

A number of experimental studies reported that punishing just one individual, the worst contributor in a game, was enough to maintain cooperation [27, 40–44]. These observations are consistent with the present study, in which the ONE variant as well as the ALL variant is effective in promoting cooperation among risk-averse individuals. Comparing the two variants, punishing one and punishing all, Andreoni and Gee [41] and Kamijo et al. [42] suggested that punishing one is a more efficient solution to promote cooperation. In the present study, however, the ONE variant was more effective than the ALL variant only when the frequency of watching by the authority was low. Because in their studies the amount of the fine varied depending on the amount of contribution, their studies and ours are not directly comparable. More investigations would be required to clarify this point.

Finally, we mention some concerns about the assumptions in our model. One is the assumption that the risk attitude of individuals is homogeneous so that they have an identical utility function of stochastic payoffs. In reality, however, people have a variety of personalities and heterogeneous attitudes toward risk [45, 46]. Risk takers might tend to be norm violators or punishers, while cautious people might tend to be non-punishing cooperators who avoid risky things. It would be interesting to incorporate such a correlation between risk attitudes and strategies into an extended model. Another concern is the assumption that the risk attitude of individuals is constant over time whereas their strategies evolve. It could be justified by considering the importance of risk aversion in evolutionary history. In fact, risk aversion is widely observed among animals [47], implying that it is a crucial concern across species. Risk aversion could be subject to far stronger selection pressure than punishment norms.

### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and has approved it for publication.

#### FUNDING

This work was supported in part by Monbukagakusho grant 16H06412 to Joe Yuichiro Wakano.

### ACKNOWLEDGMENTS

We would like to thank the reviewers for their constructive comments to improve the manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphy.2018.00156/full#supplementary-material


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Nakamura. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evolution of Human-Like Social Grooming Strategies Regarding Richness and Group Size

Masanori Takano<sup>1</sup> \* and Genki Ichinose<sup>2</sup>

<sup>1</sup> Akihabara Laboratory, CyberAgent, Inc., Tokyo, Japan, <sup>2</sup> Department of Mathematical and Systems Engineering, Shizuoka University, Hamamatsu, Japan

Human social strategies have evolved as an adaptation for behaving in complex societies. In such societies, humans tend to cooperate intensively with their closer friends, because they have to distribute their limited resources (e.g., time, food, etc.) through cooperation. This also makes it difficult to maintain uniform social relationships (through social grooming) with all friends. Thus, social relationship strengths often show a highly skewed distribution (a power law distribution). Here we aim to show the adaptivity of such social grooming strategies in order to explore the evolution of human social intelligence. We use a model in the framework of evolutionary games where the social grooming strategies evolve via building social relationships with cooperators. Simulation results demonstrate four evolutionary trends. One of the trends is similar to the strategy that humans use. We find that these trends depend on three parameters: individuals' richness, group size, and the amount of social grooming. The human-like strategy evolves in large poor groups. Moreover, an increase in the amount of social grooming makes the group size larger. Conversely, this implies that the same strategy evolves when the amount of social grooming is properly adjusted even if the group sizes are different. Our results are important in the sense that, between human and non-human primates, the differences in group size and the amount of social grooming are significant.

#### Edited by:

Isamu Okada, Soka University, Japan

#### Reviewed by:

Matjaž Perc, University of Maribor, Slovenia The Anh Han, Teesside University, United Kingdom

\*Correspondence: Masanori Takano takano\_masanori@cyberagent.co.jp

#### Specialty section:

This article was submitted to Social Evolution, a section of the journal Frontiers in Ecology and Evolution

Received: 12 September 2017 Accepted: 16 January 2018 Published: 31 January 2018

#### Citation:

Takano M and Ichinose G (2018) Evolution of Human-Like Social Grooming Strategies Regarding Richness and Group Size. Front. Ecol. Evol. 6:8. doi: 10.3389/fevo.2018.00008

Keywords: social grooming, evolutionary game, social structure, Yule–Simon process, cooperation

## 1. INTRODUCTION

Cooperation is common among humans and it is fundamental to our society (Smith and Szathmáry, 2000; Fehr and Fischbacher, 2003). The amount of cooperation by other people is limited because they have to pay costs (e.g., money, time, opportunities, food, etc.) (Santos et al., 2006; Xu and Wang, 2015). Therefore, people carefully choose their friends in order to receive intensive cooperation (Rand et al., 2011; Grujić et al., 2012; Wang et al., 2012).

Actually, people tend to cooperate with close friends. An experimental study using the Donation Game shows that participants tend to cooperate more with closer friends (Harrison et al., 2011). Another study using the Public Goods Game shows that friend groups are more cooperative with each other than with other groups (Haan et al., 2006). Additionally, in a data analysis study dealing with the data set of a social network game, people's frequent communication increases their cooperative behavior (Takano et al., 2016a,b).

Thus, it is important that humans have stronger social relationships in greater numbers with cooperators than with others. We define social grooming as the behavior that constructs social relationships. Primarily, social grooming is the act of cleaning or maintaining the body of a social partner in primates (Dunbar, 2000, 2004; Nakamura, 2003). Social bonding is part of the functional aspect of social grooming. Therefore, human social bonding behavior is also called social grooming (Dunbar, 2000, 2004), as a hypothetical extrapolation of the findings in nonhuman animals.

The behavior constructing social relationships is not limited to humans but widely observed in primates (Kobayashi and Kohshima, 1997; Dunbar, 2000, 2004; Nakamura, 2003; Kobayashi and Hashiya, 2011; Takano et al., 2016a,b; Takano and Fukuda, 2017). In doing so, they face cognitive constraints (Dunbar, R. I. 2012) (e.g., memory and processing capacity) and time constraints (i.e., time costs) in constructing and maintaining social relationships. These time constraints are not negligible, as people spend a fifth of their day in social grooming (Dunbar, 1998) for maintaining the relationship (Hill and Dunbar, 2003; Roberts and Dunbar, 2011). Therefore, the strength of existing social relationships exhibits a negative correlation with the total number of social relationships (Roberts et al., 2009; Miritello et al., 2013b).

On the other hand, it is important to select cooperative partners in the evolution of cooperation because cooperators tend to be exploited by defectors (Axelrod, 2006). To select appropriate cooperative partners, it is known that reading others' intentions plays an important role (Han et al., 2012, 2015; Arechar et al., 2017). Arechar et al. (2017) revealed that when subjects play repeated games, sending a message about their intentions (i.e., the strategy they select) promotes cooperation even when errors are incorporated. Han et al. (2012, 2015) showed, using theoretical models, that others' intentions, which are formed by past interactions in repeated games, enhance cooperation. Moreover, commitments (e.g., prior agreements to cooperate) are another mechanism for building long-term cooperative relationships, which enable cooperation to evolve by natural selection (Nesse, 2001; Martinez-Vaquero et al., 2015, 2017). Han et al. (2015) emphasized that the balance between intention and commitments is important for cooperative relationships. These are the mechanisms working in direct reciprocity. Spatial reciprocity and network reciprocity also suggest the necessity of fixed relationships (Perc and Szolnoki, 2010; Perc et al., 2017). Therefore, it is reasonable to consider that humans and other social animals tend to cooperate with their close partners (Haan et al., 2006; Harrison et al., 2011; Takano et al., 2016a,b).

Humans must construct and maintain social relationships within the constraints of this trade-off. We expect that strategies are employed to distribute the limited time resources to maximize benefits from their social relationships (Brown and Brown, 2006; Miritello et al., 2013a; Saramaki et al., 2014). As a result of such strategies, social relationship strengths, as measured by frequency of social grooming (Roberts and Dunbar, 2011; Arnaboldi et al., 2012, 2013; Song et al., 2013; Fujihara and Miwa, 2014; Saramaki et al., 2014; Takano and Fukuda, 2017), may often show a skewed distribution (Zhou et al., 2005; Arnaboldi et al., 2013), in particular a distribution following a power law (Hossmann et al., 2011; Arnaboldi et al., 2012; Hu et al., 2012; Pachur et al., 2012; Song et al., 2013; Fujihara and Miwa, 2014; Takano and Fukuda, 2017). Moreover, it has been demonstrated that social structures of nonhuman primates (Kanngiesser et al., 2011; Tung et al., 2015; Levé et al., 2016; Dunbar, R. I. M. 2012) are also skewed.

The skewed distributions of the relationships could be generated by a strategy where individuals select social grooming partners in proportion to the strength of their social relationships (Pachur et al., 2012; Takano and Fukuda, 2017), known as the Yule–Simon process (Yule, 1925; Simon, 1955; Newman, 2005). Individuals should pay time costs to win the competition with others by strengthening their social relationships with cooperators, assuming that strong social relationships are needed to receive cooperation.

Human societies using these strategies are much larger than those of non-human primates. Based on the social brain hypothesis, human intelligence has evolved to adapt to large societies. Therefore, the evolution of human strategies of social relationship construction may explain the origin of human intelligence. However, the evolutionary stability of these strategies, i.e., the Yule–Simon process, is still an open question.

In this paper, we aim to show the adaptivity of the social grooming strategies in order to explore the evolution of human social intelligence predicted by the social brain hypothesis. In particular, we focus on how environments drive the evolution of a social grooming strategy that humans use in their daily life. The evolution should depend on group size and the amount of resources for cooperation. For this purpose, we simulate the evolution of the strategy to receive cooperation from others under different environmental conditions for cooperation. We show that strategies evolve depending on the strength of social relationships.

### 2. METHODS

We expand the model of Takano and Fukuda (2017) to an evolutionary game. They consider two types of individuals: social groomers and cooperative groomees (**Figure 1**; Takano and Fukuda, 2017). In the real world, individuals are groomers and groomees simultaneously. For simplicity, they use this classification to focus on the social grooming strategies for social structures. In this paper, we focus only on the evolution of social grooming strategies, while cooperation from groomees is static. This is because cooperative behaviors are common in humans and primates (Silk, 2009; Rand and Nowak, 2013). Given that cooperation from groomees is static, we can consider the evolution of groomers' strategies. While the evolutionary dynamics of cooperation are well known (Nowak, 2006; Perc and Szolnoki, 2010; Rand and Nowak, 2013; Perc et al., 2017), there are few studies on the evolutionary dynamics of social grooming. Groomers construct their social relationships with groomees depending on their social grooming strategies in a "grooming stage." Cooperative groomees cooperate with groomers depending on social relationship strengths in a "cooperation stage." Groomer strategies evolve based on their fitness, which is the amount of cooperation from groomees in each generation. Groomees' cooperation strategies are static. **Table 1** shows the parameters of this model.

FIGURE 1 | [...] groomees depending on their social grooming strategies. Cooperative groomees cooperate with social groomers who are in the top R<sub>c</sub> by strength of social relationship. Groomer strategies evolve based on their fitness, which is the amount of cooperation from groomees.

#### TABLE 1 | Descriptions of model parameters.

In a grooming stage, groomer i repeatedly interacts with cooperative groomees R<sub>g</sub> times depending on its social grooming strategy (s<sub>i</sub>, q<sub>i</sub>). q<sub>i</sub> is the ratio at which i constructs a new social relationship with a stranger (a new groomee j), and s<sub>i</sub> is a parameter of a probabilistic function p(d<sub>ij</sub>; s<sub>i</sub>) that selects an existing social grooming partner j depending on d<sub>ij</sub> (d<sub>ij</sub> > 0). We used the following function (**Figure 2**) as a simple function to express various strategies depending on d<sub>ij</sub>, including concentrated investment in strong relationships (s = 4), diversified investment in weak relationships (s = −4), random selection (s = 0), and the Yule–Simon process (s = 1; i.e., the human-like strategy).

$$p(d_{ij}; s_i) = b(d_{ij}; \alpha_i, \beta_i) \Big/ \sum_{k=1}^{M} b(d_{ik}; \alpha_i, \beta_i), \tag{1}$$

where α<sub>i</sub> = 1 + s<sub>i</sub>, β<sub>i</sub> = 1 when s<sub>i</sub> ≥ 0, while α<sub>i</sub> = 1, β<sub>i</sub> = 1 − s<sub>i</sub> when s<sub>i</sub> < 0. d<sub>ij</sub> is w<sub>ij</sub>/max({w<sub>i1</sub>, w<sub>i2</sub>, …, w<sub>iM</sub>}), where w<sub>ij</sub> is the strength of the social relationship, i.e., the number of social grooming actions from i to j. This function depends only on d<sub>ij</sub>, because previous studies have revealed that people select their social grooming partners depending on the strength of social relationships (Pachur et al., 2012; Takano and Fukuda, 2017). Therefore, this function can simply represent human-like social grooming strategies. M is the number of groomees. b(x; α, β) is a normalized beta distribution x<sup>α−1</sup>(1 − x)<sup>β−1</sup>/B(α, β), where B(·, ·) is the beta function. While it is possible to use other functions that make fewer assumptions by using more dimensions (e.g., nonparametric functions), we used Equation (1) because it is simple and expressive enough to represent various social grooming strategies (**Figure 2**).

FIGURE 2 | [...] large s tend to interact with a groomee in a strong social relationship (large d). On the other hand, groomers with small s tend to interact with a groomee in a weak social relationship (small d). When s = 0, groomers' interaction is independent of d. When s = 1, groomers interact in proportion to the strength of social relationships, i.e., the Yule–Simon process.
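Equation (1) can be sketched in a few lines of Python. Because α<sub>i</sub> and β<sub>i</sub> are the same for every partner k, the normalizing constant B(α, β) cancels in the ratio, so the sketch works with the unnormalized density. The function name is hypothetical, and the uniform fallback used when every density value is zero (e.g., s < 0 with a single existing partner, for which d = 1) is an assumption made here, not something specified in the text.

```python
def grooming_probabilities(weights, s):
    """Probability of grooming each existing partner j (Equation 1).

    weights : positive relationship strengths w_ij of groomer i
    s       : strategy parameter s_i
    """
    if not weights or min(weights) <= 0:
        raise ValueError("weights must be non-empty and positive (d_ij > 0)")
    w_max = max(weights)
    d = [w / w_max for w in weights]              # d_ij in (0, 1]
    if s >= 0:
        alpha, beta = 1 + s, 1
    else:
        alpha, beta = 1, 1 - s
    # Unnormalized beta density; B(alpha, beta) cancels in the ratio.
    b = [x ** (alpha - 1) * (1 - x) ** (beta - 1) for x in d]
    total = sum(b)
    if total == 0:                                # assumption: uniform fallback
        return [1 / len(weights)] * len(weights)
    return [v / total for v in b]


print(grooming_probabilities([1, 2, 3, 4], s=1))   # proportional to w (Yule-Simon)
print(grooming_probabilities([1, 2, 3, 4], s=0))   # uniform (random)
print(grooming_probabilities([1, 2, 3, 4], s=-4))  # favors weak relationships
```

With s = 1 the density reduces to b(d) ∝ d, so partners are chosen in proportion to relationship strength, which is the Yule–Simon rule.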

In a cooperation stage, groomee j cooperates with the groomers ranked in the top R<sub>c</sub> by {w<sub>1j</sub>, w<sub>2j</sub>, …, w<sub>Nj</sub>}. The total payoff (i.e., fitness) of each groomer is the number of cooperative acts it receives (i.e., the number of times it is ranked in the top R<sub>c</sub> of a cooperator). That is, cooperators cooperate within their close relationships according to their resources R<sub>c</sub>. R<sub>c</sub>M represents all resources in the environment (R<sub>c</sub>, M), i.e., the total amount of cooperation.
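The cooperation stage amounts to a top-R<sub>c</sub> ranking per groomee. The following sketch computes each groomer's payoff from a matrix of grooming counts; the function name is hypothetical, and two details are assumptions because the text does not specify them: ties are broken in favor of the lower groomer index, and every groomer is ranked even if its w<sub>ij</sub> is zero.

```python
def cooperation_payoffs(w, r_c):
    """Fitness of each groomer after a cooperation stage.

    w   : w[i][j] = number of grooming actions from groomer i to groomee j
    r_c : number of groomers each groomee cooperates with (R_c)
    """
    n_groomers = len(w)
    n_groomees = len(w[0])
    payoff = [0] * n_groomers
    for j in range(n_groomees):
        # Rank groomers by relationship strength with groomee j (strongest first).
        # sorted() is stable, so ties keep index order (an assumption).
        ranked = sorted(range(n_groomers), key=lambda i: w[i][j], reverse=True)
        for i in ranked[:r_c]:
            payoff[i] += 1      # groomee j cooperates with groomer i
    return payoff


# Three groomers, two groomees (hypothetical grooming counts).
w = [[3, 0],
     [1, 2],
     [2, 5]]
print(cooperation_payoffs(w, r_c=1))
print(cooperation_payoffs(w, r_c=2))
```

Under these assumptions, and provided R<sub>c</sub> does not exceed the number of groomers, the payoffs always sum to R<sub>c</sub>M, the total amount of cooperation in the environment.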

The next generation is generated by sampling with replacement in proportion to the groomers' fitness, i.e., roulette wheel selection. In each generation, s mutates according to a Gaussian distribution (µ = 0, σ = 0.2) and q mutates according to a Gaussian distribution (µ = 0, σ = 0.05), where µ is the mean and σ is the standard deviation of the distribution, and q ∈ [0, 1] (if q falls out of range through mutation, it is set to the nearest bound, 0 or 1). Groomers' s and q in the initial generation are drawn from a Gaussian distribution (µ = 0, σ = 5.0) and from a uniform distribution on [0, 1], respectively. Cooperators do not evolve.
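The selection and mutation step described above can be sketched with Python's standard `random` module. The function names are hypothetical, and two points are assumptions: mutation is taken to add Gaussian noise to the parent's value, and selection falls back to uniform sampling if every fitness is zero. This is not the authors' published source code.

```python
import random


def initial_population(n, rng):
    """Initial (s, q) pairs: s ~ N(0, 5.0), q ~ Uniform[0, 1)."""
    return [(rng.gauss(0.0, 5.0), rng.random()) for _ in range(n)]


def next_generation(population, fitness, rng, sigma_s=0.2, sigma_q=0.05):
    """Roulette wheel selection followed by Gaussian mutation.

    population : list of (s, q) strategies
    fitness    : non-negative fitness of each groomer
    """
    weights = fitness if sum(fitness) > 0 else [1] * len(population)
    # Sampling with replacement, proportional to fitness.
    parents = rng.choices(population, weights=weights, k=len(population))
    children = []
    for s, q in parents:
        s_new = s + rng.gauss(0.0, sigma_s)             # additive noise (assumption)
        q_new = q + rng.gauss(0.0, sigma_q)
        q_new = min(1.0, max(0.0, q_new))               # clamp q to [0, 1]
        children.append((s_new, q_new))
    return children


rng = random.Random(0)                                   # seeded for reproducibility
population = initial_population(100, rng)
population = next_generation(population, [1] * 100, rng)
```

Passing an explicit `random.Random` instance keeps runs reproducible, which matters when the same parameter setting is simulated repeatedly.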

We conducted evolutionary simulations 30 times for each R<sub>c</sub> and M by using this model (R<sub>c</sub> ∈ {5, 10, …, 50}, M ∈ {5, 10, …, 200}). The number of groomers N is 100, the number of social grooming actions R<sub>g</sub> in each grooming stage is 300 (we also use R<sub>g</sub> = 100 in the experiments shown in Figures S1, S3, and S4), and the number of generations T is 200. The source code is available at https://doi.org/10.6084/m9.figshare.5526850.v1.

We set the mutation parameters to be small so that evolution converges at the equilibrium point. At the same time, we set those parameters to be large enough that evolution reaches the equilibrium point within T generations. The initial range of parameters is widely distributed to cover the whole search space. All those values were determined based on the results of preliminary experiments.

FIGURE 3 | [...] and the ratio of each cooperator's resources to the number of cooperators R<sub>c</sub>/M (see details in Figure 4, Figures S1, S2). (A) shows the results of evolution with parameters R<sub>c</sub> and M. Each color shows the most frequent trend at that parameter point. This was created based on Figure 4. (B) is the concept diagram. Trend 1 evolved when R<sub>c</sub>M was small. Trend 4 evolved when R<sub>c</sub>M was large. Trends 2 and 3 evolved in the intermediate range between trends 1 and 4, where R<sub>c</sub>/M determined whether groomers evolved to trend 2 or 3. The behavior of trends 2 and 3 was similar to human strategies, although trend 2 was closer, as described.

### 3. RESULTS

We found four evolutionary trends in the results of the simulations (**Figure 3**). These trends are explained by total resources RcM and the ratios of each cooperator's resources to the number of cooperators Rc/M (**Figure 4**, Figure S1).

Groomers evolved to trend 1 when RcM was small. Their s evolved to be larger and their q evolved to be smaller. This strategy concentrates investment into strong social relationships (e.g., s = 4 in **Figure 2**). Groomers tended to evolve to trend 4, with s < 0, when RcM was large. This strategy widely invests in many weak social relationships (e.g., s = −4 in **Figure 2**). In these two trends, s does not converge, meaning that it has no characteristic value.

On the other hand, s converged to 0 < s < 2 in trends 2 and 3. Trends 2 and 3 evolved in the intermediate range between trends 1 and 4, and Rc/M determined whether groomers evolved to trend 2 or 3. Groomers evolved to trend 2 when Rc/M was large, where q evolved to be larger. They evolved to trend 3 when Rc/M was small, where q evolved to be smaller. s in trend 2 tends to be larger than s in trend 3. Both strategies are diversified investments (e.g., s = 1 and s = 0.5 in **Figure 2**), where groomers intensively invest in strong social relationships while also widely investing in weak social relationships. Additionally, the M at which groomers evolved to trends 2 and 3 is larger when R<sub>g</sub> is large (see **Figure 4**, Figure S1).

FIGURE 6 | Strategies of social grooming (A–D), i.e., probability p of social grooming after each strength of social relationship w, and social structures of each trend (E–H), i.e., distribution of w in each trend (R<sub>g</sub> = 300). These figures show trends 1, 2, 3, and 4 from the left. The trends for R<sub>g</sub> = 100 are similar (see Figure S4). In (A–D), the orange points are the 25th percentile, the green points are the 50th percentile, and the blue points are the 75th percentile. In (A–D), we drew w only when the number of samples was more than 20. Panels (F,G), for trends 2 and 3, use a logarithmic scale on both axes. In the social structure of trend 1 (E), many weak relationships were caused by mutation noise in q.

Next, we demonstrate how the four trends emerged throughout the evolution and how groomers constructed social structures in each trend. Regarding the former, **Figure 5** and Figure S3 show the evolutionary pressures (ds, dq) for each combination of s and q, and the typical orbits of evolution. Evolutionary pressures were calculated using the method of the average gradient of selection (AGoS) (Pinheiro et al., 2012). That is, we calculated the mean difference of s and q of the next generation of a population in which individuals' s and q obeyed the Gaussian distribution [(µ = s, σ = 0.2) and (µ = q, σ = 0.2)] on each cell (s, q). These orbits were drawn based on the average selection pressures and noise drawn from a normal distribution with µ = 0 and σ = 0.01. Incidentally, there is no cell with (ds, dq) = (0, 0). For the latter, **Figure 6** and Figure S4 show strategies of social grooming, i.e., probability p of social grooming after each strength of social relationship w (**Figures 6A–D**), and social structures of each trend, i.e., distributions of w (**Figures 6E–H**).

Trend 1 evolved in environments with small RcM. Groomers are in intense competition for receiving cooperation from groomees in these environments. Therefore, they evolved to concentrate investments in a few poor groomees, i.e., large s and small q [(R<sub>c</sub>, M) = (5, 5) in **Figures 5**, **6A**]. The results show that they only had very strong social relationships in environments with small RcM (**Figure 6E**). That is, most w were very large and the number of relationships was low.

Trend 4 evolved in environments with large RcM. Groomers easily receive cooperation from groomees in these environments. Thus, they constructed many weak social relationships with many rich cooperators [(R<sub>c</sub>, M) = (50, 200) in **Figures 5**, **6D,H**]. That is, most w were very small and the number of relationships was high.

Trends 2 and 3 evolved between trend 1 and trend 4. Their s converge to (0, 2); this means that groomers with these strategies intensively invest in strong social relationships while they also widely invest in weak social relationships [(R<sup>c</sup> , M) = (15, 45) and (5, 200) in **Figure 5**]. Their social grooming probability is proportional to the strength of each social relationship (**Figures 6B,C**), so their construction processes of social relationships are similar to the Yule–Simon process. As a result, their social structures were similar to power law distributions (**Figures 6F,G**).
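To illustrate why reinforcement in proportion to relationship strength produces skewed distributions, the following is a minimal sketch of a generic Yule–Simon process. It is not the authors' grooming model: the parameters `steps` and `alpha` (the probability of starting a new relationship) are hypothetical and do not correspond to s, q, R<sup>c</sup> or M.

```python
import random

def yule_simon(steps, alpha, seed=0):
    """Generic Yule-Simon process (illustrative, not the paper's model).

    At each step, with probability alpha a new relationship of strength 1
    is created; otherwise an existing relationship is reinforced with
    probability proportional to its current strength.
    """
    rng = random.Random(seed)
    weights = [1]   # strength of each relationship
    tokens = [0]    # one entry per unit of strength, for proportional choice
    for _ in range(steps - 1):
        if rng.random() < alpha:
            weights.append(1)
            tokens.append(len(weights) - 1)
        else:
            i = rng.choice(tokens)  # picked with probability weights[i] / total
            weights[i] += 1
            tokens.append(i)
    return weights

if __name__ == "__main__":
    w = sorted(yule_simon(steps=5000, alpha=0.1), reverse=True)
    print("relationships:", len(w))
    print("strongest five:", w[:5])
    print("median strength:", w[len(w) // 2])
```

In such a run a few relationships typically accumulate most of the total strength while the majority stay weak, which is the qualitative pattern described for trends 2 and 3.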

The main difference between trends 2 and 3 is how q is affected by Rc/M. When Rc/M is small, groomers have to confine the number of social relationships with groomees to construct strong social relationships, because they compete intensively in each social relationship (i.e., small Rc). Therefore, they evolved to small q with small Rc/M [trend 3; (R<sup>c</sup> , M) = (5, 200) in **Figure 5**]. In contrast, when Rc/M is large, they do not have to restrict the number of social relationships with groomees, because their competition is not intense in each social relationship (i.e., large Rc) and the maximum number of their social relationships is small (i.e., small M). Thus, they evolved to large q with large Rc/M [trend 2; (R<sup>c</sup> , M) = (15, 45) in **Figure 5**]. Interestingly, these trends of evolution show non-continuous transition (see Figure S5).

### 4. DISCUSSION

We analyzed the evolutionary dynamics of social grooming strategies and social structures. As a result, we find that the evolutionary dynamics depend on total resources (i.e., RcM) and the ratios of each cooperator's resources to the number of cooperators (i.e., Rc/M). In the poor small groups, individuals' strategies evolved to concentrate investment among strong social relationships. In the rich large groups, their strategies evolved to wide investment among many weak social relationships. In the middle groups, their strategies evolved according to the Yule–Simon process. These strategies invest intensively in strong social relationships while also investing widely in weak social relationships. As a result of these strategies, skewed distributions of social relationship strengths were generated.

There are two trend strategies which are similar to the Yule–Simon process (Pachur et al., 2012; Takano and Fukuda, 2017). One evolved in relatively rich and small groups in the middle groups. Individuals with this strategy constructed social relationships with all group members, and reinforced their relationships in proportion to the strength of social relationships. The other one evolved in relatively poor and large groups in the middle groups. Individuals with this strategy constructed social relationships with parts of their groups, and reinforced their relationships. In primitive human groups, individuals belong to large groups and interact in small cliques within them (Dunbar, 2012). Hence, humans' social grooming strategy may have evolved in the latter group. Non-human primates may also have similar strategies, because they also construct skewed social structures even though their group sizes are different from those of humans (Kanngiesser et al., 2011; Dunbar, 2012; Tung et al., 2015; Levé et al., 2016). The similarity of their strategies may be explained by the difference in the amount of social grooming R<sup>g</sup>. Our experiments show that an increase in the amount of social grooming R<sup>g</sup> results in an increase of the group sizes M in which social grooming strategies evolve according to the Yule–Simon process (see **Figure 4**). The same social grooming strategies are thus stable in different group sizes. Indeed, there is a positive correlation between group sizes and the amount of social grooming in primates (Dunbar, 1993, 2016).

If a social grooming strategy based on the Yule–Simon process is universal in primates not limited to humans, and group sizes depend on external factors (e.g., predators, food, etc.), then social grooming strategies of humans and non-human primates evolved to the same strategies by automatically adjusting their amount of social grooming. This relationship between group sizes and strategies may be clearly demonstrated by comparison among humans, non-human primates, and other social animals. This will contribute toward an explanation of the evolution of humans' large social groups.

It is also important how cooperators select other cooperators as their interaction partners (Hauert et al., 2002). For example, if cooperators maintain relationships with other cooperators and break relationships with exploiters, their reciprocal relationships will be maintained and their inegalitarian relationships will be broken (Perc and Szolnoki, 2010; Perc et al., 2017). This mechanism to keep cooperation is known as network reciprocity. Social grooming strategies are network construction strategies. Actually, social grooming has a beneficial effect on the construction of reciprocal relationships (Takano et al., 2016a,b). Our results suggest that the evolution of humanlike strategies for network construction depends on the resources of environments and their group size. In this paper, we focused on the evolutionary dynamics of social grooming with stable cooperative behavior. The co-evolutionary dynamics of both behaviors is an issue to be addressed in the future.

Comparison among data sets from various species will be needed in order to clarify the relationships between environments and the four evolutionary scenarios of social grooming strategies.

### AUTHOR CONTRIBUTIONS

MT: Designed the research, Constructed the model and Performed the simulation; MT and GI: Discussed and analyzed the results and Wrote the main manuscript text; All authors reviewed the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00008/full#supplementary-material


**Conflict of Interest Statement:** MT is an employee of CyberAgent, Inc.

The other author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Takano and Ichinose. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# From Continuous to Discontinuous Transitions in Social Diffusion

#### Paula Tuzón<sup>1</sup> , Juan Fernández-Gracia<sup>2</sup> and Víctor M. Eguíluz <sup>2</sup>

<sup>1</sup> Departament de Didàctica de les Ciències Experimentals i Socials, Facultat de Magisteri, Universitat de València, València, Spain, <sup>2</sup> Instituto de Física Interdisciplinar y Sistemas Complejos (CSIC-UIB), Palma de Mallorca, Spain


Models of social diffusion reflect processes of how new products, ideas, or behaviors are adopted in a population. These models typically lead to a continuous or a discontinuous phase transition of the number of adopters as a function of a control parameter. We explore a simple model of social adoption where the agents can be in two states, either adopters or non-adopters, and can switch between these two states interacting with other agents through a network. The probability of an agent to switch from non-adopter to adopter depends on the number of adopters in her network neighborhood, the adoption threshold T and the adoption coefficient a, two parameters defining a Hill function. In contrast, the transition from adopter to non-adopter is spontaneous at a certain rate µ. In a mean-field approach, we derive the governing ordinary differential equations and show that the nature of the transition between the global non-adoption and global adoption regimes depends mostly on the balance between the probability to adopt with one and two adopters. The transition changes from continuous, via a transcritical bifurcation, to discontinuous, via a combination of a saddle-node and a transcritical bifurcation, through a supercritical pitchfork bifurcation. We characterize the full parameter space. Finally, we compare our analytical results with Monte Carlo simulations on annealed and quenched degree regular networks, showing a better agreement for the annealed case. Our results show how a simple model is able to capture two seemingly very different types of transitions, i.e., continuous and discontinuous and thus unifies underlying dynamics for different systems. Furthermore, the form of the adoption probability used here is based on empirical measurements.

Keywords: adoption, phase transition, mean-field, social contagion, spreading

## 1. INTRODUCTION

Spreading processes are ubiquitous in nature: the contagion of diseases [1], herd behavior in animals [2], the diffusion of innovations [3], rumor spreading [4], the evolution of social movements [5], the propagation of hashtags in Twitter [6], etc. All these processes share similar dynamics; in a population of initially neutral (disease-free, unaware of some information, etc.) agents (humans, animals, or even bots), some of them start carrying some information, pathogen, or behavior, i.e., they adopt this innovation. Through a transmission process they can pass it on to other agents, starting in this way the process of adoption diffusion.

The diffusion of adoption has been extensively studied and modeled in several fields including Biology, Physics and Social Sciences [7–10]. In general, new adopters have been in contact with one or several adopters, with two main mechanisms: in disease-like models [11, 12], adoption takes place with an adoption probability per contact with an adopter which is constant irrespective of the number of adopters; in threshold-like models [8, 11–13], adoption happens only after a critical number of adopters has been reached. There are also models of "generalized contagion" [14], where both disease-like and threshold behaviors are special cases.

#### Edited by:

Tatsuya Sasaki, F-Power Inc., Japan

#### Reviewed by:

Vasileios Basios, Université Libre de Bruxelles, Belgium Marco G. Mazza, Max Planck Institute for Dynamics and Self Organization (MPG), Germany

\*Correspondence:

Víctor M. Eguíluz victor@ifisc.uib-csic.es

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 13 October 2017 Accepted: 16 February 2018 Published: 07 March 2018

#### Citation:

Tuzón P, Fernández-Gracia J and Eguíluz VM (2018) From Continuous to Discontinuous Transitions in Social Diffusion. Front. Phys. 6:21. doi: 10.3389/fphy.2018.00021

However, while the models describe individual adoption probabilities, most of the related empirical research was based on aggregated data, typically cumulative adoption curves [15, 16]. Recent studies have focused on individuals' behavior, where the number of adopters accessed by each individual can be measured [17–20]. These measurements have a direct connection with the form of the adoption probability. In this paper we explore the probability function obtained by Milgram et al. [17] from a social experiment. They analyzed the correlation between the size of a group looking at the same point in the street and the number of passersby that joined in looking at that point. The results of the experiment can be fitted with a Hill function for the probability of adoption [20]. We will show that the shape of the adoption probability leads to two different behaviors depending on the parameter values: either a continuous or a discontinuous phase transition. This provides a simple model that describes both regimes within the same framework, depending only on two parameters, with a probability function linked to empirical data.

#### 2. RESULTS

An agent that has not yet adopted adopts with some probability when interacting with an adopter, and thereby becomes able to spread the adoption herself. After adoption, the agent "recovers" at a constant rate µ and becomes a potential adopter again. Here, we study the consequences of the form of the adoption probability.

In the standard SIS (susceptible-infected-susceptible) model [1], the adoption probability (from susceptible to infected, S → I) β is constant for each interaction with an adopter. In general, the adoption probability can be a general function of the number of adopted neighbors, n:

$$P(n) = \lambda^{\prime} f(n) \,. \tag{1}$$

In this contribution we will consider the function proposed by Gallup et al. [20]

$$f(n) = \frac{n^a}{T^a + n^a} \; , \tag{2}$$

where λ′ is the persuasion capacity (similar to β = λ′ for T = 0 and a = 1), a is the adoption coefficient (or Hill coefficient), which controls how fast this probability increases with n, and T is the adoption threshold, which fixes the number of adopters needed to reach half the persuasion limit. λ′, T and a are positive real numbers. This type of function is known as a Hill function and has been used in models of population growth and decline [21–23]. The evolution of such a system in an annealed degree regular network (a network where all the nodes have the same number of neighbors or degree k but where they are chosen randomly in the population at each interaction) is determined by

$$\frac{d\rho}{dt'} = -\mu\rho + (1 - \rho)A,\tag{3}$$

where ρ is the density of adopters and A is the probability of adoption given the density ρ and is given by

$$A = \sum\_{n=0}^{k} P(n) \binom{k}{n} \rho^n (1 - \rho)^{k - n} \,. \tag{4}$$

The number of infected neighbors is assumed to be binomially distributed with a success probability equal to the global density of infected agents. Without loss of generality we get rid of parameter µ by changing the timescale and rescaling the persuasion capacity λ ′

$$t = \mu t'\tag{5}$$

$$
\lambda = \frac{\lambda'}{\mu},
\tag{6}
$$

which is equivalent to setting µ = 1. The equilibrium solutions for the system are determined by the condition

$$-\rho^\* + (1 - \rho^\*)A^\* = 0 \,. \tag{7}$$

Given a particular value of a and T, there are at most three possible solutions for ρ<sup>∗</sup> (**Figure 1**): (i) ρ<sup>∗</sup> = 0, corresponding to the adoption-free regime, (ii) ρ<sup>∗</sup> = ρ up, represented by the upper branch, and (iii) ρ<sup>∗</sup> = ρ down, the lower branch.
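As a numerical illustration, the equilibrium condition in Equation (7) can be solved on a grid. The sketch below is only one possible way to do this, not the authors' code: the function names, the grid resolution and the bisection tolerance are assumptions, and roots closer to zero than one grid spacing would be missed.

```python
from math import comb

def hill(n, T, a):
    """Hill function f(n) = n^a / (T^a + n^a), Equation (2)."""
    return n**a / (T**a + n**a)

def adoption_probability(rho, lam, T, a, k):
    """A(rho) from Equation (4), with P(n) = lam * f(n)."""
    return sum(
        lam * hill(n, T, a) * comb(k, n) * rho**n * (1 - rho) ** (k - n)
        for n in range(k + 1)
    )

def rhs(rho, lam, T, a, k):
    """Right-hand side of Equation (3) after rescaling time (mu = 1)."""
    return -rho + (1 - rho) * adoption_probability(rho, lam, T, a, k)

def fixed_points(lam, T, a, k, grid=2000, tol=1e-12):
    """Return rho* = 0 plus every sign change of rhs on (0, 1], refined by bisection."""
    roots = [0.0]
    xs = [i / grid for i in range(1, grid + 1)]
    for lo, hi in zip(xs[:-1], xs[1:]):
        f_lo = rhs(lo, lam, T, a, k)
        if f_lo * rhs(hi, lam, T, a, k) < 0:
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                f_mid = rhs(mid, lam, T, a, k)
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return roots

if __name__ == "__main__":
    # T = 3 and k = 10 as in Figure 1; the lambda values are illustrative choices.
    print(fixed_points(lam=0.3, T=3, a=1.2, k=10))
    print(fixed_points(lam=2.0, T=3, a=1.2, k=10))
    print(fixed_points(lam=0.8, T=3, a=1.8, k=10))
```

Scanning λ with such a routine traces out the branches ρ up and ρ down for a given pair (a, T).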

The stability of the fixed points can be easily checked by linear stability analysis. The solution ρ <sup>∗</sup> = 0 changes stability at

$$
\lambda\_0 = \frac{1}{k f(1)},\tag{8}
$$

being stable for λ < λ<sub>0</sub> and unstable otherwise. As can be seen in **Figure 1**, if the solution ρ<sup>∗</sup> = 0 intersects the upper branch, then that branch is stable and the solution ρ<sup>∗</sup> = 0 changes stability via a transcritical bifurcation. Then for λ > λ<sub>0</sub> and for any initial ρ<sub>0</sub> ≠ 0 the system will end up in the fixed point ρ up (**Figure 1A**). If, on the contrary, the solution ρ<sup>∗</sup> = 0 intersects the lower branch, this branch is unstable and there is a region λ<sub>1</sub> < λ < λ<sub>0</sub> in which two stable solutions (ρ<sup>∗</sup> = 0 and ρ up) coexist, separated by an unstable solution ρ down (**Figure 1C**). At λ = λ<sub>1</sub> the two fixed points of opposite stability annihilate through a saddle-node bifurcation, while at λ = λ<sub>0</sub> we still have a transcritical bifurcation. Therefore, in that region the final state of the system will be the upper branch solution ρ up if the initial density ρ<sub>0</sub> > ρ down and 0 otherwise, and we can observe hysteresis. For λ > λ<sub>0</sub> and for any initial ρ<sub>0</sub> > 0 the system will end at ρ up. Note that λ<sub>0</sub> is the critical point only for continuous transitions, while for discontinuous ones it is λ<sub>1</sub>. The sign of the derivative of ρ<sup>∗</sup> with respect to λ at the intersection of ρ<sup>∗</sup> = 0 with the other branches determines the type of transition. If the derivative is positive (ρ<sup>∗</sup> = 0 intersects ρ up), the transition is continuous, while if it is negative (ρ<sup>∗</sup> = 0 intersects ρ down), the transition is discontinuous (Equations 9a,b, respectively).

$$\left. \frac{d\rho^\*}{d\lambda} \right|\_{\lambda\_0} > 0 \implies f(2) < \frac{2k}{k-1}f(1) \tag{9a}$$

$$\left. \frac{d\rho^\*}{d\lambda} \right|\_{\lambda\_0} < 0 \implies f(2) > \frac{2k}{k-1} f(1) \,. \tag{9b}$$

FIGURE 1 | Complete solutions of Equation (7) are shown in black for T = 3, k = 10, and a = 1.2, 1.53, 1.8 (A–C, respectively). Continuous lines represent stable solutions. Note that when λ<sub>0</sub> intersects the upper branch, the transition is continuous (A). When λ<sub>0</sub> intersects the lower branch (C), two stable solutions, 0 and ρ up, coexist in the region λ<sub>1</sub> < λ < λ<sub>0</sub>, and the transition is discontinuous. Simulations of the microscopic model are shown as blue points in (A,B). For (C) the simulation is shown in (D), which magnifies the region λ<sub>1</sub> < λ < λ<sub>0</sub>, showing the hysteresis of the system. (B) illustrates the case when λ<sub>0</sub> = λ<sub>1</sub>.

For the particular case when f(2) = 2k/(k−1) f(1), λ<sub>0</sub> and λ<sub>1</sub> coincide. For this condition one can show, by approximating Equation (7) to third order in ρ<sup>∗</sup>, that the bifurcation diagram is that of a supercritical pitchfork bifurcation, i.e., the equation is equivalent to ẋ = rx − x<sup>3</sup> (**Figure 1B**). In this case, the final fate of the system is similar to the continuous case. For λ < λ<sub>0</sub> there is no global adoption and the system ends at ρ<sup>∗</sup> = 0, while for λ > λ<sub>0</sub> any initial condition ρ<sub>0</sub> ≠ 0 will bring the system to ρ up.
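For readers who want the intermediate step, the critical point in Equation (8) and the conditions in Equations (9a,b) can be recovered by expanding the right-hand side of Equation (3) in powers of ρ. The following is a sketch of that expansion, assuming µ = 1 and using f(0) = 0, which holds for the Hill function in Equation (2) when T > 0:

```latex
\begin{align}
A &= \lambda \sum_{n=0}^{k} f(n)\binom{k}{n}\rho^{n}(1-\rho)^{k-n}
   = \lambda\left[k f(1)\,\rho
     + \frac{k(k-1)}{2}\bigl(f(2)-2f(1)\bigr)\rho^{2}\right] + O(\rho^{3}), \\
-\rho + (1-\rho)A &= \bigl(\lambda k f(1) - 1\bigr)\rho
   + \frac{\lambda k}{2}\bigl[(k-1)f(2) - 2k f(1)\bigr]\rho^{2} + O(\rho^{3}).
\end{align}
% The linear coefficient changes sign at \lambda_0 = 1/(k f(1)).
% Setting the expansion to zero for \rho \neq 0 gives, close to \lambda_0,
\begin{equation}
\rho^{*} \simeq \frac{2\bigl(\lambda k f(1) - 1\bigr)}
                     {\lambda k\bigl[2k f(1) - (k-1) f(2)\bigr]} .
\end{equation}
```

For λ slightly above λ<sub>0</sub> the numerator is positive, so a small positive branch exists only if 2k f(1) > (k−1) f(2), which is the continuous case; if the inequality is reversed, the small positive branch lies at λ < λ<sub>0</sub> and the transition is discontinuous. When the bracket vanishes the quadratic term drops out, which is why the third-order term decides the behavior in the special case.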

Simulations using a microscopic model are also included in the plots of **Figure 1**. This microscopic model simulates an SIS dynamics in a degree regular network of k = 10 that changes at each time step. From one step to another, an agent is selected; if it is an adopter it recovers with probability µ, if not, it adopts with probability P(n), where n is the number of adopters among k randomly chosen agents. There is an initial seed of infected agents which we fix to 1% of the total population.
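The update rule described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' simulation code: the population size `N`, the number of sweeps, the random seed and the function names are hypothetical, and the k contacts are drawn afresh among the other agents at every update to mimic the annealed network.

```python
import random

def hill(n, T, a):
    """Hill function f(n) = n^a / (T^a + n^a), Equation (2)."""
    return n**a / (T**a + n**a)

def simulate_annealed(lam_prime, mu, T, a, k, N=400, sweeps=150,
                      seed_fraction=0.01, seed=1):
    """Random sequential SIS-like dynamics on an annealed degree regular network.

    Returns the final fraction of adopters. lam_prime must not exceed 1 so
    that P(n) = lam_prime * f(n) is a valid probability.
    """
    rng = random.Random(seed)
    state = [False] * N
    for i in rng.sample(range(N), max(1, round(seed_fraction * N))):
        state[i] = True  # initial seed of adopters
    for _ in range(sweeps * N):
        i = rng.randrange(N)
        if state[i]:
            # an adopter recovers with probability mu
            if rng.random() < mu:
                state[i] = False
        else:
            # k distinct contacts chosen at random among the other N - 1 agents
            others = rng.sample(range(N - 1), k)
            n = sum(state[j if j < i else j + 1] for j in others)
            if rng.random() < lam_prime * hill(n, T, a):
                state[i] = True
    return sum(state) / N

if __name__ == "__main__":
    # lambda = lam_prime / mu; with T = 3, a = 1.2, k = 10, lambda_0 is about 0.47
    print(simulate_annealed(0.05, 0.5, T=3, a=1.2, k=10))   # lambda = 0.1
    print(simulate_annealed(0.5, 0.25, T=3, a=1.2, k=10))   # lambda = 2
```

Because only the ratio λ = λ′/µ enters the rescaled mean-field equation, runs with the same ratio should settle near the same density, up to finite-size fluctuations.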

In **Figures 1A,B** results of the simulations are shown as blue dots over the analytical solution. For **Figure 1C**, simulations are shown in **Figure 1D**. As can be seen, the system exhibits hysteresis in the region λ<sub>1</sub> < λ < λ<sub>0</sub>, where there is bistability. The system ends at ρ up or at ρ<sup>∗</sup> = 0 depending on the initial condition.

**Figure 1** also illustrates the two different kinds of transitions. The density of adopters stays at zero until a critical value of λ, where the system goes to ρ up by either a continuous transition or a discontinuous transition. As can be observed, for a given value of T, the size of the jump increases with a. For values of a ∼ 1 the system resembles the epidemic-like models, while for values a > 1 the transition is threshold-like.

For our choice of f(n) (Equation 2) the conditions in Equation (9) give bounds for the parameter region in which the transition is of one type or the other:

$$\text{Cont.:} \quad T < \left(\frac{2^a(k+1)}{2^a(k-1)-2k}\right)^{\frac{1}{a}}\tag{10a}$$

$$\text{Disc.:} \quad T > \left(\frac{2^a(k+1)}{2^a(k-1) - 2k}\right)^{\frac{1}{a}}.\tag{10b}$$

**Figure 2** shows this parameter space for k = 5, 10, 20. The white region represents the parameter combinations for a continuous transition while the light gray region corresponds to a discontinuous transition. The dark gray region is determined by the condition that λ<sub>0</sub> ≤ 1 in Equation (8), that is, that the value where both curves meet is in the range λ ≤ 1,

$$T < (k-1)^{\frac{1}{a}}.\tag{11}$$

This constraint implies that in the dark gray region of the plot there is only one possible solution, ρ<sup>∗</sup> = 0.

Both conditions together, Equations (10, 11), predict the values of the parameters for which the model shows one type of transition or another, or none. For example, in **Figure 2B**, a continuous transition is allowed for all values of a ∈ [1, 2] and some values of T ∈ [0, 10], while the discontinuous transition is only possible for values of a higher than 1.25 and values of T higher than 1.5. As can be seen in **Figure 2**, for small values of k there are only continuous transitions, while for higher values of k discontinuous transitions are also allowed. Besides, the higher the value of k, the larger the region of parameter space that allows for ρ ≠ 0 solutions.
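The conditions in Equations (8)–(10) are simple enough to evaluate directly. The sketch below uses hypothetical helper names and is offered only as an illustration; it classifies a parameter set by comparing f(2) with 2k/(k−1) f(1) and reports λ<sub>0</sub> separately, so that the constraint in Equation (11) can be checked by the reader.

```python
def hill(n, T, a):
    """Hill function f(n) = n^a / (T^a + n^a), Equation (2)."""
    return n**a / (T**a + n**a)

def lambda_0(T, a, k):
    """Critical point of the rho* = 0 solution, Equation (8)."""
    return 1.0 / (k * hill(1, T, a))

def transition_type(T, a, k):
    """Classify the transition using Equations (9a,b)."""
    lhs = hill(2, T, a)
    rhs = 2 * k / (k - 1) * hill(1, T, a)
    if lhs < rhs:
        return "continuous"
    if lhs > rhs:
        return "discontinuous"
    return "pitchfork"  # exact equality, rarely hit with floating point numbers

def critical_T(a, k):
    """Boundary value of T in Equations (10a,b).

    Returns None when 2^a (k - 1) - 2k <= 0, in which case Equation (9a)
    holds for every T > 0 and the transition is always continuous.
    """
    denom = 2**a * (k - 1) - 2 * k
    if denom <= 0:
        return None
    return (2**a * (k + 1) / denom) ** (1 / a)

if __name__ == "__main__":
    for a in (1.2, 1.53, 1.8):  # the three panels of Figure 1 (T = 3, k = 10)
        print(a, transition_type(3, a, 10), lambda_0(3, a, 10), critical_T(a, 10))
```

For T = 3 and k = 10 the boundary returned by `critical_T` is close to 3 when a = 1.53, in line with the marginal case shown in Figure 1.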

Finally, we perform simulations to characterize numerically the behavior of the system using a similar microscopic model on a quenched regular random network. Again, at each time step an agent is selected; if she is an adopter she recovers with probability µ, and if not, she adopts with probability P(n), where now n refers to the number of adopters in her network neighborhood, which is now fixed. There is an initial seed of infected agents equal to 1% of the total population. The long term values of the fraction of adopters ρ<sub>∞</sub> are shown in **Figure 3** for 10 realizations and different values of a for T = 1.2, 3. The realizations are not averaged, in order to show the low dispersion (inset of upper panel in **Figure 3** and lower panel of **Figure 3**).

As **Figure 2** indicates for T = 1.2 and k = 10, the system always exhibits a continuous transition for any value of a ∈ [1, 2] (inset of the upper panel). For T = 3 and k = 10, the transition is discontinuous for values of a higher than 1.5, as shown in **Figure 2**. The upper panel of **Figure 3** zooms in on the region of the critical point for the case of a = 1.0. It shows the simulations of the microscopic model on a quenched degree regular random network (pink), on an annealed degree regular random network (blue) and the exact solution of the equation (black). As can be seen, there is a small discrepancy for the model on the quenched version of the network. This is because correlations appear when the topology is fixed; in particular, the approximation that the infected agents are binomially distributed among the neighbors with a success probability equal to the global fraction of infected agents breaks down. As in the cases presented above, the simulations on the annealed network and the exact solution agree. For both microscopic models, the type of transition is predicted by the parameter space represented in **Figure 2**.

#### 3. CONCLUSIONS

We have analyzed a model of social contagion (SIS-like) on degree regular random networks with an adoption probability measured in empirical data in Gallup et al. [20] that interpolates between the cases of epidemic-like spreading and threshold-like dynamics. We show that this simple model displays both continuous and discontinuous transitions from a disease-free state to an endemic state. We find the values of the parameters that separate these transitions and the critical persuasion capacities λ by applying standard linear stability and bifurcation theory tools.

FIGURE 3 | … critical point for the simulations of the microscopic model on the quenched network (pink), the simulations on the annealed network (blue) and the exact solution (black line) of the equation for a = 1.0, respectively. For T = 3 there are continuous or discontinuous transitions depending on the value of a.

The simplicity of the model studied here allows for relaxing some of its assumptions. For example, the stability condition given by Equation (8) resembles the structure of the critical point in the SIS model in uncorrelated random networks with arbitrary degree distributions. Following this similarity, we conjecture that the solution of our model in complex networks will be given by λ<sub>0</sub> = ⟨k⟩/(⟨k<sup>2</sup>⟩ f(1)). Thus degree heterogeneity will lead to a vanishing threshold unless f(1) → 0 as N → ∞. This can be achieved, for example, by considering that T = c k<sub>max</sub>. Alternatively, an interesting variation is to consider that the adoption probability depends not on the absolute number of adopters but on the fraction of them. Besides, heterogeneity can emerge not only at the degree level, but also in the distributions of the adoption threshold T and adoption coefficient a, and furthermore these can be correlated with the degree of the nodes. How heterogeneity affects the nature of the transition needs to be explored in detail. Another possible line of research is adding non-Markovianity to the dynamics, for example by letting the adoption probability depend not only on the state of the neighboring agents, but also on some internal time which takes into account when an agent tries to convince another one to adopt the innovation.

Our results highlight that neither the structure of the interaction network nor the dynamics alone is responsible for the type of transition that the system displays. This simplified framework is able to capture these seemingly disparate types of transition, which are usually taken as a signature of different dynamics. Furthermore, the choice of the adoption probability curve is based on empirical measurements from Gallup et al. [20], which highlights the relevance of our results for realistic modeling of social phenomena.

#### AUTHOR CONTRIBUTIONS

PT, VE, and JF-G designed the research, performed the calculations, and wrote the article. PT performed the simulations.

#### FUNDING

JF-G and VE received funding from Agencia Estatal de Investigación (AEI) and Fondo Europeo de Desarrollo Regional (FEDER) through project SPASIMM [FIS2016-80067-P (AEI/FEDER, UE)]. Research reported in this publication was supported by research funding from King Abdullah University of Science and Technology (KAUST).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Tuzón, Fernández-Gracia and Eguíluz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evolution of Global Cooperation in Multi-Level Threshold Public Goods Games With Income Redistribution

Jinming Du<sup>1,2</sup>\* and Baokui Wang<sup>3</sup>

*<sup>1</sup> Liaoning Engineering Laboratory of Operations Analytics and Optimization for Smart Industry, Northeastern University, Shenyang, China, <sup>2</sup> Institute of Industrial and Systems Engineering, Northeastern University, Shenyang, China, <sup>3</sup> Joint Exercises and Training Center, Joint Operations College, National Defense University, Beijing, China*

Income redistribution is a feasible means to adjust the income among individuals, which could reduce the gap between the rich and the poor and realize the social equity. By means of taxation and public services, the income could be transferred from some individuals to others directly or indirectly. We study how income redistribution affects the evolution of global cooperation through proposing a multi-level threshold public goods game model and introducing two kinds of income redistribution mechanisms. We find that both of the income redistribution mechanisms promote global cooperation. Furthermore, the global income redistribution is more in favor of the emergence of global cooperative behaviors than the local income redistribution mechanism. On the other hand, the fixation time of global cooperation is sharply shortened after introducing income redistribution mechanisms. In threshold public goods games, only when the amount of collected public goods reaches a certain threshold, the income of individuals can be guaranteed. Hence, the influences of thresholds of different levels on strategies are investigated in the paper.

#### Edited by:

*Tatsuya Sasaki, F-Power Inc., Japan*

#### Reviewed by:

*Attila Szolnoki, Research Centre for Natural Sciences (MTA), Hungary Lin Wang, University of Hong Kong, Hong Kong*

\*Correspondence:

*Jinming Du dujinming@ise.neu.edu.cn*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *01 March 2018* Accepted: *13 June 2018* Published: *03 July 2018*

#### Citation:

*Du J and Wang B (2018) Evolution of Global Cooperation in Multi-Level Threshold Public Goods Games With Income Redistribution. Front. Phys. 6:67. doi: 10.3389/fphy.2018.00067*

Keywords: evolutionary game theory, public goods game, stochastic processes, human cooperation, income redistribution

## 1. INTRODUCTION

Collective actions, such as a group of neighborhood residents donating money to construct a public project, require voluntary contributions to collect public goods [1]. Voluntary contribution is widespread and substantial [2–5]. It is beneficial for the group but costly for individuals. Performing an altruistic act can weigh heavily on individual wellbeing and prosperity, and selfish individuals always have an advantage over cooperators. Such a social dilemma can be represented as the public goods game (PGG) [6–8]. In human societies, people are often required to sacrifice personal benefits for the common good and work together to achieve what they are unable to achieve alone [9, 10]. This is especially so when people face the option of contributing voluntarily to a collective goal where public goods cannot be provided in part, but only in whole after a certain cost (threshold) is covered. Threshold public goods game (TPGG) models nicely capture the main features of these social phenomena [11–18]. In the typical TPGG, the size of a proposed project and the associated total cost (threshold) are predetermined. The public goods are provided if the total contributions meet or exceed the threshold; otherwise, no goods are provided and all individuals are left with nothing, irrespective of whether they contributed or not. Since cooperation forms the bedrock of our efforts for a sustainable and better future, understanding cooperative behaviors in complex interactive systems has been one of the grand scientific challenges of the global society [19–24]. The problem is in many ways unnatural. Given that free riders can enjoy the same benefits for free, what kind of mechanism can motivate individuals to care for and contribute to the public goods, if only the fittest survive?

Most governments devote considerable resources to the provision of public goods available for all citizens to consume, such as national defense, environmental protection, health insurance and highways. Such universal provision schemes can redistribute income from the rich to the poor [25] and thereby promote fairness in society [26–28]. Redistribution of income may provide a nonexcludable benefit to those who give, and many such schemes are universal in the sense that everyone is eligible and the provision is free [29]. One of the classic forms of income redistribution is the tax system, in which people are taxed at fixed rates. People who make more money pay higher taxes, thereby forfeiting more of their income to the government. The government uses tax funds to benefit society as a whole by providing a variety of public and social services, and a direct transfer of income may occur in the case of welfare payments and other forms of cash assistance made to low-income members of society [30]. Previous works on physical models of collective dilemmas, however, have seldom analyzed theoretically how income redistribution influences the evolution of cooperation in complex social-economic systems.

Motivated by this, we propose a multi-level threshold public goods game model, where global and local public goods are clearly distinguished. Although pure public goods are defined as being non-rival in consumption and non-excludable [31], impure public goods also exist in reality. Owing to geographic space, some classes of goods are globally public, and others are only locally public. Global public goods are available to the entire population, while local public goods may be available only to the residents of a very small neighborhood [32, 33]. Thus, players in the model can choose among selfishness, contributing to global public goods, or contributing to local public goods. In particular, both the global and the local public goods involve a threshold. If the collected public goods are insufficient, a danger may occur. For example, coastal inhabitants may be inundated owing to failed fundraising for a dam [34], disease may spread because of inadequate voluntary vaccination [35–39], and a regional defense system may collapse due to insufficient finance [40].

We then consider two kinds of income redistribution mechanisms in the multi-level TPGG. In the model with local income redistribution, players have to pay part of their idealized income to the focal group according to a given income expenditure proportion after each round. To distinguish it from the contribution made during the PGG itself, this compulsory payment is called the second-order payment. Subsequently, the accumulated income is redistributed to all the members of this group regardless of their strategies and the amounts of their second-order payments. For global income redistribution, on the other hand, players pay part of their idealized income to the whole population, and the accumulated income is then redistributed to all the players in the whole population. In reality, local income redistribution resembles a special transaction tax in an economic system, which is collected in proportion to the volume of trade; the revenue is then redistributed uniformly to the group members, which amounts to a fiscal subsidy for a particular industry. Global income redistribution, in turn, corresponds to collecting and redistributing the gross revenue of personal income tax for a whole country. Based on this model, we theoretically investigate the evolution of cooperation at different levels and of free-riding under collective risks, and focus on the influence of diverse income redistribution mechanisms on global cooperation.

### 2. MODEL

In this paper, we study a finite population of N players. The whole population is divided into M groups, so there are m = N/M players in each group [41, 42]. Player x can choose a strategy S<sub>x</sub> ∈ {G, L, S}, where G, L and S represent global cooperation, local cooperation and selfishness, respectively. Each player has one unit of money at the beginning of the game and must decide whether to put it into the Global account, the Local account or the Personal account. Money put into the Personal account is saved without multiplication, so the player finally owns the single unit of money. The money put into the Local account is summed and multiplied by a local gain-factor r<sub>1</sub> (1 < r<sub>1</sub> < m), and then equally distributed to the players in the focal group. The money put into the Global account is summed and multiplied by a global gain-factor r<sub>2</sub> (1 < r<sub>2</sub> < N), and then distributed to all the players in the whole population irrespective of whether they are global cooperators or not.

After the game interaction, players take part in the process of income redistribution. We consider two cases: local redistribution and global redistribution. For local redistribution, each player has to pay part of its income to its group according to the given income expenditure proportion p<sub>1</sub>. The parameter p<sub>1</sub> is thus the fraction of the income obtained in the PGG that each player in the focal group hands over as the second-order payment. Subsequently, the accumulated income expenditure of players in this group is redistributed to the m group members irrespective of their strategies and the amounts of their second-order payments. Thus, the actual income of player x in this group after local redistribution of income can be calculated as

$$
\pi\_{\mathbf{x}}^{l} = \pi\_{\mathbf{x}} \times (1 - p\_1) + \frac{p\_1}{m} \times \sum\_{\mathbf{x} = 1}^{m} \pi\_{\mathbf{x}},\tag{1}
$$

where π<sub>x</sub> denotes the income of a player after one PGG, and p<sub>1</sub> Σ<sub>x=1</sub><sup>m</sup> π<sub>x</sub> is the sum of the accumulated second-order payments of the group members. Denote π<sup>1</sup><sub>G</sub>(i), π<sup>1</sup><sub>L</sub>(l) and π<sup>1</sup><sub>S</sub>(m − i − l) as the final payoff of each G, L and S player, respectively, when there are i G players and l L players in the local group and the other N − i − l players all hold the S strategy in the whole population [43]:

$$
\pi\_S^1(m-i-l) = (\frac{i \times r\_2}{N} + \frac{l \times r\_1}{m} + 1) \times (1 - p\_1) + \frac{p\_1}{m} \times \sum\_{x=1}^m \pi\_x \tag{2}
$$

$$
\pi_G^1(i) = \left(\frac{i \times r_2}{N} + \frac{l \times r_1}{m}\right) \times (1 - p_1) + \frac{p_1}{m} \times \sum_{x=1}^m \pi_x \tag{3}
$$

$$
\pi_L^1(l) = \left(\frac{i \times r_2}{N} + \frac{l \times r_1}{m}\right) \times (1 - p_1) + \frac{p_1}{m} \times \sum_{x=1}^m \pi_x, \tag{4}
$$

where π<sub>x</sub> = i r<sub>2</sub>/N + l r<sub>1</sub>/m + 1 for an S player and π<sub>x</sub> = i r<sub>2</sub>/N + l r<sub>1</sub>/m for G and L players.
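The redistribution step in Equations (1)–(4) can be illustrated with a minimal Python sketch. The function names `pgg_income` and `redistribute`, the example group composition and the value p<sub>1</sub> = 0.4 are assumptions made for illustration only; they do not come from the paper. The sketch shows that redistribution leaves the group's total income unchanged while shrinking the payoff gap between an S player and a G player by the factor 1 − p<sub>1</sub>.

```python
def pgg_income(strategy, i, l, r1, r2, m, N):
    """Income after one PGG round, before redistribution (text below Eq. 4).

    i: number of global cooperators, l: number of local cooperators in the group.
    An S player additionally keeps its own unit of money.
    """
    shared = i * r2 / N + l * r1 / m
    return shared + 1.0 if strategy == "S" else shared


def redistribute(incomes, p):
    """Eq. (1): every player keeps a share (1 - p) and receives an equal part of the pool."""
    pool_share = p * sum(incomes) / len(incomes)
    return [x * (1.0 - p) + pool_share for x in incomes]


# Hypothetical example: one group of m = 5 players with 1 G, 2 L and 2 S players,
# all players outside the group being selfish.
m, N, r1, r2, p1 = 5, 100, 2.0, 3.0, 0.4
group = ["G", "L", "L", "S", "S"]
i, l = group.count("G"), group.count("L")

before = [pgg_income(s, i, l, r1, r2, m, N) for s in group]
after = redistribute(before, p1)

gap_before = before[3] - before[0]  # S income minus G income before redistribution
gap_after = after[3] - after[0]     # the same gap after redistribution
print(gap_before, gap_after)
```

Because Equation (5) has the same form with m replaced by N and p<sub>1</sub> by p<sub>2</sub>, the same `redistribute` helper could be applied to the incomes of the whole population to sketch global redistribution.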

On the other hand, for global redistribution, players are required to pay part of their income to the whole population according to a fixed proportion p<sub>2</sub>, which is the second-order payment in global redistribution. We emphasize that p<sub>2</sub> denotes the fraction of the income obtained in the PGG that each player in the whole population hands over. Subsequently, the accumulated income is redistributed to all the players irrespective of their strategies or the amounts of their second-order payments. Thus, the payoff of each player after the global income redistribution can be calculated as

$$
\pi\_\mathbf{x}^\mathbf{g} = \pi\_\mathbf{x} \times (1 - p\_2) + \frac{p\_2}{N} \times \sum\_{\mathbf{x} = 1}^N \pi\_\mathbf{x},\tag{5}
$$

where π<sub>x</sub> is the income of a player after the PGG, and p<sub>2</sub> Σ<sub>x=1</sub><sup>N</sup> π<sub>x</sub> is the sum of the accumulated second-order payments of all the players.

We consider the two-level TPGG, so the payoffs of players are threatened by risks at two levels. Here, we denote s<sub>1</sub> as the local threshold and s<sub>2</sub> as the global threshold. Then we introduce the threshold functions:

$$
\theta_1(l) = \begin{cases} q_1 & \text{for } l \times r_1 < s_1 \\ 0 & \text{for } l \times r_1 \ge s_1 \end{cases} \tag{6}
$$

$$\theta\_2(i) = \begin{cases} q\_2 & \text{for } i \times r\_2 < s\_2 \\ 0 & \text{for } i \times r\_2 \ge s\_2 \end{cases} \tag{7}$$

where i is the number of global cooperators and l is the number of local cooperators in the focal group. If the amount of collected global public goods is less than s<sub>2</sub>, a world-wide danger occurs with probability q<sub>2</sub>. Once such a danger happens, the payoffs of all the individuals are zero. If the amount of public goods in the Global account reaches or exceeds s<sub>2</sub>, the collective target is achieved and no disaster happens. In this case, all the players receive their payoffs from the public goods game. Similarly, if the amount of local public goods in a group is less than s<sub>1</sub>, a local risk occurs with probability q<sub>1</sub>. Once this risk strikes, the players in the focal group lose their payoffs.
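The threshold functions of Equations (6, 7) can be written as one small helper. The sketch below is illustrative only: the names `theta` and `expected_payoff` are hypothetical, and multiplying the two survival probabilities assumes that the local and global risks act independently, which matches the product form used later in Equations (17)–(20) but is stated here as an assumption.

```python
def theta(count, gain, threshold, risk):
    """Eqs. (6)-(7): the risk probability applies only while count * gain < threshold."""
    return risk if count * gain < threshold else 0.0


def expected_payoff(pi, i_global, l_local, r1, r2, s1, s2, q1, q2):
    """Expected payoff when both risks are treated as independent (an assumption)."""
    survive_local = 1.0 - theta(l_local, r1, s1, q1)
    survive_global = 1.0 - theta(i_global, r2, s2, q2)
    return pi * survive_local * survive_global


# Parameter values taken from the caption of Figure 1.
r1, r2, s1, s2, q1, q2 = 2.0, 3.0, 2.0, 160.0, 0.8, 0.8

# One local cooperator already reaches the local threshold (1 * 2 >= 2),
# whereas 54 global cooperators are needed globally (54 * 3 = 162 >= 160).
print(theta(0, r1, s1, q1), theta(1, r1, s1, q1))
print(theta(53, r2, s2, q2), theta(54, r2, s2, q2))
```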

We use an imitation process to describe the evolution of strategies. Players tend to learn the strategies of their more successful counterparts. First, we randomly select a player A from the whole population of N players. Then, another player B is chosen: with probability ϕ, B is selected from the whole population; otherwise, with probability 1 − ϕ, B is chosen only from A's local group. In other words, the larger ϕ is, the more likely players are to interact with each other globally. In daily life, interaction within a group is much more frequent than that between groups. A learns B's strategy with probability 1/[1 + e<sup>−ω(π<sub>B</sub>−π<sub>A</sub>)</sup>] [44–48], where π<sub>x</sub> is the payoff of individual x and ω denotes the imitation intensity [49–52], which measures how strongly decision making depends on the payoff comparison. Here, we define two different imitation intensities: ω<sub>1</sub> is the imitation intensity within a group and ω<sub>2</sub> is the imitation intensity between groups. During the evolutionary process, each player has the chance of switching its strategy to a different one with probability µ. In this paper, we assume the exploration rate µ → 0. The parameters in the model are listed in **Table 1**.
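The imitation probability above is the Fermi function. A minimal Python sketch follows; the function name `imitation_probability` and the payoff values in the example are hypothetical and chosen only to illustrate how the imitation intensity shapes the outcome.

```python
import math


def imitation_probability(pi_a, pi_b, omega):
    """Probability that player A adopts the strategy of player B (Fermi function)."""
    return 1.0 / (1.0 + math.exp(-omega * (pi_b - pi_a)))


# Hypothetical payoffs: B earns one unit more than A.
weak = imitation_probability(1.0, 2.0, omega=0.005)   # weak selection, close to 1/2
strong = imitation_probability(1.0, 2.0, omega=5.0)   # strong selection, close to 1
print(weak, strong)
```

For small ω the choice is nearly random, which is the regime of the figure captions (ω<sub>1</sub> = ω<sub>2</sub> = 0.005); for large ω the better-performing strategy is copied almost with certainty.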

#### 3. METHODS

We are interested in how global cooperation evolves. To this end, we study the stationary distribution and the fixation time. It is common that the interaction within a group is much more frequent than that between groups [53, 54], thus the fixation process of a single mutant in the population goes through two steps: the fixation of this mutant in its local group and the fixation of such group in the whole population. We theoretically analyze the two kinds of income redistributions respectively.

#### 3.1. Local Income Redistribution

TABLE 1 | Parameters in the model.

We consider a single local group composed of m − i S players and i G players. All the other groups in the whole population are full of S players. Based on Equations (2, 3), the payoffs of each G player and each S player in the focal group are:

$$
\pi_G^1(i) = \frac{i \times r_2}{N} \times (1 - p_1) + \frac{p_1}{m} \times \sum_{x=1}^m \pi_x \tag{8}
$$

$$
\pi\_S^1(m-i) = (\frac{i \times r\_2}{N} + 1) \times (1 - p\_1) + \frac{p\_1}{m} \times \sum\_{\mathbf{x}=1}^m \pi\_\mathbf{x}. \tag{9}
$$

The number of G players changes from i to i ± 1 in one time step with a probability:

$$
T^{\pm}(i) = (1 - \varphi) \times \frac{i}{m} \times \frac{m - i}{m} \times \left\{ 1 + e^{\pm \omega_1 \left[ \pi_S^1(m - i) - \pi_G^1(i) \right]} \right\}^{-1}, \tag{10}
$$

where ω<sub>1</sub> is the imitation intensity within a group. The fixation probability of a single G mutant invading a group of S players is denoted by P<sup>1</sup><sub>SG</sub>, which is given by Traulsen et al. [45] and Wu et al. [54]:

$$
\begin{split} P_{SG}^{1} &= \left[ 1 + \sum_{j=1}^{m-1} \prod_{i=1}^{j} \frac{T^{-}(i)}{T^{+}(i)} \right]^{-1} \\ &= \left\{ 1 + \sum_{j=1}^{m-1} e^{\omega_1 \sum_{i=1}^{j} \left[ \pi_S^1(m-i) - \pi_G^1(i) \right]} \right\}^{-1}. \end{split} \tag{11}
$$
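The two forms of Equation (11) agree because T<sup>−</sup>(i)/T<sup>+</sup>(i) = e<sup>ω<sub>1</sub>[π<sub>S</sub>(m−i) − π<sub>G</sub>(i)]</sup>. The following Python sketch checks this numerically. The function names, the value ϕ = 0.1 and the choice p<sub>1</sub> = 0.4 are assumptions for illustration; the payoff gap 1 − p<sub>1</sub> follows from subtracting Equation (8) from Equation (9).

```python
import math


def transition_probabilities(i, m, gap, omega1, phi):
    """Eq. (10): (T_plus, T_minus) for i G players in a group of size m.

    gap is the payoff difference pi_S(m - i) - pi_G(i).
    """
    pair = (1.0 - phi) * (i / m) * ((m - i) / m)
    t_plus = pair / (1.0 + math.exp(omega1 * gap))
    t_minus = pair / (1.0 + math.exp(-omega1 * gap))
    return t_plus, t_minus


def fixation_from_rates(gap_of, m, omega1, phi):
    """First line of Eq. (11): sum over products of T_minus / T_plus."""
    total, product = 1.0, 1.0
    for j in range(1, m):
        t_plus, t_minus = transition_probabilities(j, m, gap_of(j), omega1, phi)
        product *= t_minus / t_plus
        total += product
    return 1.0 / total


def fixation_closed_form(gap_of, m, omega1):
    """Second line of Eq. (11): exponential of the accumulated payoff gap."""
    total, cumulative = 1.0, 0.0
    for j in range(1, m):
        cumulative += gap_of(j)
        total += math.exp(omega1 * cumulative)
    return 1.0 / total


# Illustrative values: from Eqs. (8)-(9) the gap is constant and equal to 1 - p1.
m, omega1, phi, p1 = 5, 0.005, 0.1, 0.4
gap_of = lambda i: 1.0 - p1

via_rates = fixation_from_rates(gap_of, m, omega1, phi)
via_formula = fixation_closed_form(gap_of, m, omega1)
print(via_rates, via_formula)
```

With a zero payoff gap both functions return the neutral value 1/m, and a larger p<sub>1</sub> moves the fixation probability of the G mutant closer to that neutral value.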

We define the fixation probability of a local group full of G players invading the whole population full of S players as P<sup>2</sup><sub>SG</sub>. The payoff of each G player is denoted by π<sup>2</sup><sub>G</sub>(i) and that of each S player by π<sup>2</sup><sub>S</sub>(M − i) when there are i local groups full of G players and the other M − i groups are full of S players.

$$
\pi\_G^2(i) = \frac{i \times r\_2}{M} \times (1 - p\_1) + \frac{p\_1}{m} \times \sum\_{x=1}^m \frac{i \times r\_2}{M} = \frac{i \times r\_2}{M} \tag{12}
$$

$$
\pi_S^2(M - i) = \left(\frac{i \times r_2}{M} + 1\right) \times (1 - p_1) + \frac{p_1}{m} \times \sum_{x=1}^m \left(\frac{i \times r_2}{M} + 1\right) = \frac{i \times r_2}{M} + 1. \tag{13}
$$

A new group full of G players arises when two players with different strategies from different local groups are chosen, and the S player alters its strategy through imitation, then it takes over its local group. Thus, the probability to increase the number of local groups full of G players by one is given by:

$$
\Gamma^{+}(i) = \varphi \times \frac{i}{M} \times \frac{M-i}{M} \times \left\{ 1 + e^{\omega_2 \left[ \pi_S^2(M-i) - \pi_G^2(i) \right]} \right\}^{-1} \times P_{SG}^1(k), \tag{14}
$$

where ω<sub>2</sub> is the imitation intensity between groups and P<sup>1</sup><sub>SG</sub>(k) represents the fixation probability of a single G mutant invading a group of S players when there already exist k groups full of G players. Similarly, the probability to decrease the number of G groups by one is:

$$
\Gamma^{-}(i) = \varphi \times \frac{i}{M} \times \frac{M-i}{M} \times \left\{ 1 + e^{\omega_2 \left[ \pi_G^2(i) - \pi_S^2(M-i) \right]} \right\}^{-1} \times P_{GS}^1(k). \tag{15}
$$

Hence, the fixation probability of a G group in the whole population is obtained as follows:

$$
P_{SG}^2 = \left( 1 + \sum_{j=1}^{M-1} \left\{ e^{\omega_2 \sum_{i=1}^j \left[ \pi_S^2(M-i) - \pi_G^2(i) \right]} \prod_{k=1}^j \left[ \frac{P_{GS}^1(k)}{P_{SG}^1(k)} \right] \right\} \right)^{-1}. \tag{16}
$$

We aim to analyze the multi-level TPGG. Thus, the payoffs above are conditional. Once global danger happens, all the individuals will lose their wealth. On the other hand, if local danger strikes, the players in the focal group lose their wealth. By utilizing the threshold functions, Equations (6, 7), the revised payoffs are as follows:

$$
\pi_G^1(i) = \left[ \frac{i \times r_2}{N} \times (1 - p_1) + \frac{p_1}{m} \times \sum_{x=1}^m \pi_x \right] \times (1 - q_1) \times [1 - \theta_2(i)] \tag{17}
$$

$$
\pi_S^1(m - i) = \left[ \left(\frac{i \times r_2}{N} + 1\right) \times (1 - p_1) + \frac{p_1}{m} \times \sum_{x=1}^m \pi_x \right] \times (1 - q_1) \times [1 - \theta_2(i)] \tag{18}
$$

$$
\pi_G^2(i) = \frac{i \times r_2}{M} \times (1 - q_1) \times [1 - \theta_2(i \times m)] \tag{19}
$$

$$
\pi_S^2(M - i) = \left(\frac{i \times r_2}{M} + 1\right) \times (1 - q_1) \times [1 - \theta_2(i \times m)]. \tag{20}
$$

Inserting Equations (17, 18) into Equation (11) and Equations (19, 20) into Equation (16), we can get the following equations when there are k groups full of G players:

$$
P_{SG}^{1}(k) = \left\{ 1 + \sum_{j=1}^{m-1} e^{\omega_1 (1 - q_1)(1 - p_1) \sum_{i=1}^{j} \left[ 1 - \theta_2(i + m k) \right]} \right\}^{-1} \tag{21}
$$

$$
P_{SG}^2 = \left( 1 + \sum_{j=1}^{M-1} \left\{ e^{\omega_2 (1 - q_1) \sum_{i=1}^j \left[ 1 - \theta_2(m i) \right]} \prod_{k=1}^j \left[ \frac{P_{GS}^1(k)}{P_{SG}^1(k)} \right] \right\} \right)^{-1}. \tag{22}
$$
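Equation (21) can be evaluated directly once the global threshold function is available. The Python sketch below is a hedged illustration: the function names `theta2` and `p1_sg` are hypothetical, p<sub>1</sub> = 0.4 is an arbitrary choice, and the remaining parameter values are those listed in the caption of Figure 1. It also checks that P<sup>1</sup><sub>SG</sub>(0) reduces to the first factor of Equation (24) when the global threshold cannot be reached within a single group.

```python
import math


def theta2(n_global, r2, s2, q2):
    """Eq. (7): global risk probability for n_global global cooperators."""
    return q2 if n_global * r2 < s2 else 0.0


def p1_sg(k, m, omega1, p1, q1, q2, r2, s2):
    """Eq. (21): fixation of one G mutant in an S group, given k groups full of G."""
    total, cumulative = 1.0, 0.0
    for j in range(1, m):
        cumulative += 1.0 - theta2(j + m * k, r2, s2, q2)
        total += math.exp(omega1 * (1.0 - q1) * (1.0 - p1) * cumulative)
    return 1.0 / total


# Values from the caption of Figure 1; p1 = 0.4 is an arbitrary illustrative choice.
m, omega1, p1, q1, q2, r2, s2 = 5, 0.005, 0.4, 0.8, 0.8, 3.0, 160.0

no_g_groups = p1_sg(0, m, omega1, p1, q1, q2, r2, s2)
many_g_groups = p1_sg(11, m, omega1, p1, q1, q2, r2, s2)

# First factor of Eq. (24): theta_2 = q2 for every i < m when the threshold is out of reach.
first_factor = 1.0 / (1.0 + sum(
    math.exp(omega1 * (1.0 - p1) * (1.0 - q1) * (1.0 - q2) * j) for j in range(1, m)
))
print(no_g_groups, many_g_groups, first_factor)
```

With these values the global threshold is met once 54 players cooperate globally, so for k = 11 the global risk has vanished and the payoff gap between S and G is larger, which lowers the fixation probability of a further G mutant.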

The fixation probability of a single G mutant invading the whole global population full of S players is defined as ρSG. Hence, we have:

$$
\rho\_{SG} \approx P\_{SG}^1(0) \times P\_{SG}^2. \tag{23}
$$

Accordingly, we can get the fixation probability ρ<sub>SG</sub>, and also ρ<sub>GS</sub>, ρ<sub>SL</sub>, ρ<sub>LS</sub>, ρ<sub>LG</sub> and ρ<sub>GL</sub>, which are given as follows:

$$
\rho_{SG} \approx \left[ 1 + \sum_{j=1}^{m-1} e^{\omega_1 (1 - p_1)(1 - q_1)(1 - q_2) j} \right]^{-1} \times \left( 1 + \sum_{j=1}^{M-1} \left\{ e^{\omega_2 (1 - q_1) \sum_{i=1}^j \left[ 1 - \theta_2(m i) \right]} \prod_{k=1}^j \left[ \frac{P_{GS}^1(k)}{P_{SG}^1(k)} \right] \right\} \right)^{-1} \tag{24}
$$

Frontiers in Physics | www.frontiersin.org July 2018 | Volume 6 | Article 67

$$
\rho_{GS} \approx \left[ 1 + \sum_{j=1}^{m-1} e^{-\omega_1 (1 - p_1)(1 - q_1)(1 - q_2)} \right]^{-1} \times \left( 1 + \sum_{j=1}^{M-1} \left\{ e^{-\omega_2 (1 - q_1) \sum_{i=1}^j \left[ 1 - \theta_2(m i) \right]} \prod_{k=1}^j \left[ \frac{P_{SG}^1(k)}{P_{GS}^1(k)} \right] \right\} \right)^{-1} \tag{25}
$$

$$
\rho_{SL} \approx \left\{ 1 + \sum_{j=1}^{m-1} e^{\omega_1 (1 - p_1)(1 - q_2) \sum_{i=1}^j \left[ 1 - \theta_1(i) \right]} \right\}^{-1} \times \left\{ 1 + \sum_{j=1}^{M-1} e^{\left[ \omega_2 (1 - q_1 - r_1) + \omega_1 (m - 1)(1 - p_1) \right] (1 - q_2)} \right\}^{-1} \tag{26}
$$

$$
\rho_{LS} \approx \left\{ 1 + \sum_{j=1}^{m-1} e^{-\omega_1 (1 - p_1)(1 - q_2) \sum_{i=1}^{j} \left[ 1 - \theta_1(i) \right]} \right\}^{-1} \times \left\{ 1 + \sum_{j=1}^{M-1} e^{\left[ -\omega_2 (1 - q_1 - r_1) - \omega_1 (m - 1)(1 - p_1) \right] (1 - q_2)} \right\}^{-1} \tag{27}
$$

$$
\rho_{LG} \approx \frac{1}{m} \times \left\{ 1 + \sum_{j=1}^{M-1} e^{\omega_2 \sum_{i=1}^j \left[ 1 - \theta_2(m i) \right] \left( r_1 + q_1 \frac{i r_2}{M} \right)} \right\}^{-1} \tag{28}
$$

$$
\rho_{GL} \approx \frac{1}{m} \times \left\{ 1 + \sum_{j=1}^{M-1} e^{-\omega_2 \sum_{i=1}^j \left[ 1 - \theta_2(m i) \right] \left( r_1 + q_1 \frac{i r_2}{M} \right)} \right\}^{-1}. \tag{29}
$$

During the evolutionary process, players have the chance of exploring strategies with a probability µ. Since we assume the exploration rate µ → 0, it assures that a single mutant vanishes or fixates in the population before the next one appears [55, 56]. Thus, the evolutionary process can be approximated by a Markov chain where the state space is composed of homogeneous states full of each type of players (G, L or S). The corresponding transition probability matrix T is:

$$T = \begin{pmatrix} T\_{\rm SS} & \frac{\mu}{2} \,\rho\_{\rm SL} & \frac{\mu}{2} \,\rho\_{\rm SG} \\\\ \frac{\mu}{2} \,\rho\_{\rm LS} & T\_{LL} & \frac{\mu}{2} \,\rho\_{\rm LG} \\\\ \frac{\mu}{2} \,\rho\_{\rm GS} & \frac{\mu}{2} \,\rho\_{\rm GL} & T\_{\rm GG} \end{pmatrix} . \tag{30}$$

Here, T<sub>ii</sub> = 1 − Σ<sub>k≠i</sub> (µ/2) ρ<sub>ik</sub>, where i, k ∈ {G, L, S}. The stationary distribution describes the percentage of time spent by the population in each homogeneous state in the long run, and is determined by the normalized left eigenvector corresponding to the eigenvalue 1 of the transition matrix. The stationary distribution for Equation (30) can be calculated as follows:

$$
X_S = \frac{\rho_{GS} \rho_{LG} + \rho_{GS} \rho_{LS} + \rho_{LS} \rho_{GL}}{\Delta} \tag{31}
$$

$$
X_L = \frac{\rho_{GS} \rho_{SL} + \rho_{SL} \rho_{GL} + \rho_{SG} \rho_{GL}}{\Delta} \tag{32}
$$

$$X\_G = \frac{\rho\_{\rm SG}\rho\_{\rm LS} + \rho\_{\rm SL}\rho\_{\rm LG} + \rho\_{\rm SG}\rho\_{\rm LG}}{\Delta},\tag{33}$$

where X<sub>S</sub>, X<sub>L</sub>, and X<sub>G</sub> represent the probability of finding the population in the homogeneous state consisting entirely of selfish players, local cooperators, and global cooperators, respectively. The normalization factor Δ ensures that X<sub>S</sub> + X<sub>L</sub> + X<sub>G</sub> = 1.
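The closed-form expressions in Equations (31)–(33) can be checked against the transition matrix of Equation (30). In the Python sketch below, the function names and the numerical fixation probabilities are hypothetical values chosen only for illustration; they are not results from the paper. The check confirms that the vector (X<sub>S</sub>, X<sub>L</sub>, X<sub>G</sub>) is a left eigenvector of T with eigenvalue 1.

```python
def stationary_distribution(rho):
    """Eqs. (31)-(33); rho["SG"] is the fixation probability written rho_SG in the text."""
    x_s = rho["GS"] * rho["LG"] + rho["GS"] * rho["LS"] + rho["LS"] * rho["GL"]
    x_l = rho["GS"] * rho["SL"] + rho["SL"] * rho["GL"] + rho["SG"] * rho["GL"]
    x_g = rho["SG"] * rho["LS"] + rho["SL"] * rho["LG"] + rho["SG"] * rho["LG"]
    delta = x_s + x_l + x_g  # normalization factor
    return {"S": x_s / delta, "L": x_l / delta, "G": x_g / delta}


def transition_matrix(rho, mu):
    """Eq. (30) with states ordered S, L, G; diagonal entries make rows sum to 1."""
    order = "SLG"
    matrix = []
    for a in order:
        row = [0.0 if a == b else 0.5 * mu * rho[a + b] for b in order]
        row[order.index(a)] = 1.0 - sum(row)
        matrix.append(row)
    return matrix


# Hypothetical fixation probabilities, for illustration only.
rho = {"SL": 0.30, "SG": 0.10, "LS": 0.15, "LG": 0.05, "GS": 0.25, "GL": 0.20}
x = stationary_distribution(rho)
t = transition_matrix(rho, mu=0.01)

# Left-eigenvector check: (x T)_j should equal x_j for every state j.
vec = [x["S"], x["L"], x["G"]]
image = [sum(vec[i] * t[i][j] for i in range(3)) for j in range(3)]
print(vec, image)
```

When all six fixation probabilities are equal, the function returns 1/3 for each state, which is the random-drift limit discussed in the results for p<sub>2</sub> → 1.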

On the other hand, the average time to reach a certain state for the first time can be derived analytically in the limit of rare explorations. For example, we denote fixation time τGS as the average time starting in pure state of G to reach S. This fixation time satisfies:

$$
\tau_{GS} = 1 + r_{GL} \tau_{LS} + r_{GG} \tau_{GS}, \tag{34}
$$

where r<sub>ij</sub> = δ<sub>ij</sub> + (µN/2)(ρ<sub>ij</sub> − δ<sub>ij</sub>) represents the transition probability from the homogeneous state i to the homogeneous state j, ρ<sub>ij</sub> is the fixation probability, and δ<sub>ij</sub> denotes the Kronecker delta. The factor µN/2 is the rate at which mutants of type j are born (only two types of mutants can be produced, with equal probability), since on average it takes a time of 1/(µN) per mutation. Then, we can get the average time of reaching the homogeneous state S from the initial pure states G and L:

$$\tau\_{\rm GS} = 1 + \frac{\mu \, N}{2} \rho\_{\rm GL} \, \tau\_{\rm LS} + \left[ 1 - \frac{\mu \, N}{2} \left( \rho\_{\rm GS} + \rho\_{\rm GL} \right) \right] \tau\_{\rm GS} \tag{35}$$

$$\tau\_{LS} = 1 + \frac{\mu \, N}{2} \, \rho\_{LG} \, \tau\_{GS} + \left[1 - \frac{\mu \, N}{2} \left(\rho\_{LS} + \rho\_{LG}\right)\right] \tau\_{LS}. \tag{36}$$

Solving Equations (35)-(36), we have:

$$\tau\_{\rm GS} = \frac{2\left(\rho\_{\rm GL} + \rho\_{\rm LG} + \rho\_{\rm LS}\right)}{\mu \, N \left(\rho\_{\rm GS} \, \rho\_{\rm LG} + \rho\_{\rm GL} \, \rho\_{\rm LS} + \rho\_{\rm GS} \, \rho\_{\rm LS}\right)}\tag{37}$$

$$\tau\_{LS} = \frac{2\left(\rho\_{\rm GL} + \rho\_{\rm GS} + \rho\_{\rm LG}\right)}{\mu \, N \left(\rho\_{\rm GS} \, \rho\_{\rm LG} + \rho\_{\rm GL} \, \rho\_{\rm LS} + \rho\_{\rm GS} \, \rho\_{\rm LS}\right)}.\tag{38}$$
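As a consistency check, the closed-form times of Equations (37, 38) can be substituted back into the recursions of Equations (35, 36). The Python sketch below does this for hypothetical fixation probabilities; the function names, the values of µ and N, and the entries of `rho` are illustrative assumptions, not values from the paper.

```python
def fixation_times_to_s(rho, mu, n):
    """Eqs. (37)-(38): mean times to reach the all-S state from all-G and all-L."""
    denom = rho["GS"] * rho["LG"] + rho["GL"] * rho["LS"] + rho["GS"] * rho["LS"]
    tau_gs = 2.0 * (rho["GL"] + rho["LG"] + rho["LS"]) / (mu * n * denom)
    tau_ls = 2.0 * (rho["GL"] + rho["GS"] + rho["LG"]) / (mu * n * denom)
    return tau_gs, tau_ls


def recursion_residuals(rho, mu, n, tau_gs, tau_ls):
    """Right-hand side minus left-hand side of Eqs. (35) and (36)."""
    a = mu * n / 2.0
    rhs_35 = 1.0 + a * rho["GL"] * tau_ls + (1.0 - a * (rho["GS"] + rho["GL"])) * tau_gs
    rhs_36 = 1.0 + a * rho["LG"] * tau_gs + (1.0 - a * (rho["LS"] + rho["LG"])) * tau_ls
    return rhs_35 - tau_gs, rhs_36 - tau_ls


# Hypothetical inputs, for illustration only.
rho = {"SL": 0.30, "SG": 0.10, "LS": 0.15, "LG": 0.05, "GS": 0.25, "GL": 0.20}
mu, n = 1e-3, 100

tau_gs, tau_ls = fixation_times_to_s(rho, mu, n)
res_35, res_36 = recursion_residuals(rho, mu, n, tau_gs, tau_ls)
print(tau_gs, tau_ls, res_35, res_36)
```

Both residuals vanish up to rounding error, which confirms that Equations (37, 38) solve the linear system formed by Equations (35, 36).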

Similarly, expressions for other fixation time can be shown as follows:

$$
\tau_{SL} = \frac{2\left(\rho_{GL} + \rho_{GS} + \rho_{SG}\right)}{\mu N \left(\rho_{GL} \rho_{SG} + \rho_{GL} \rho_{SL} + \rho_{GS} \rho_{SL}\right)} \tag{39}
$$

$$
\tau_{GL} = \frac{2\left(\rho_{GS} + \rho_{SG} + \rho_{SL}\right)}{\mu N \left(\rho_{GL} \rho_{SG} + \rho_{GL} \rho_{SL} + \rho_{GS} \rho_{SL}\right)} \tag{40}
$$

$$
\tau_{SG} = \frac{2\left(\rho_{LG} + \rho_{LS} + \rho_{SL}\right)}{\mu N \left(\rho_{LG} \rho_{SG} + \rho_{LS} \rho_{SG} + \rho_{LG} \rho_{SL}\right)} \tag{41}
$$

$$
\tau_{LG} = \frac{2\left(\rho_{LS} + \rho_{SG} + \rho_{SL}\right)}{\mu N \left(\rho_{LG} \rho_{SG} + \rho_{LS} \rho_{SG} + \rho_{LG} \rho_{SL}\right)}. \tag{42}
$$

Based on the fixation probabilities obtained in Equations (24–29), we can deduce the stationary distribution and the fixation times in closed form.

#### 3.2. Global Income Redistribution

As in the analysis for local income redistribution, we consider a single local group in which there are i G players and m − i S players, and assume that all the other groups are full of S players. In contrast to Equations (17–20) for local income redistribution, the payoffs of each G and S player for global income redistribution are:

$$
\pi_G^1(i)' = \left[ \frac{i \times r_2}{N} \times (1 - p_2) + \frac{p_2}{N} \times \sum_{x=1}^N \pi_x \right] \times (1 - q_1) \times [1 - \theta_2(i)] \tag{43}
$$

$$
\pi_S^1(m-i)' = \left[ \left(\frac{i \times r_2}{N} + 1\right) \times (1 - p_2) + \frac{p_2}{N} \times \sum_{x=1}^N \pi_x \right] \times (1 - q_1) \times [1 - \theta_2(i)] \tag{44}
$$

$$
\pi_G^2(i)' = \left[ \frac{i \times r_2}{M} \times (1 - p_2) + \frac{p_2}{N} \times \sum_{x=1}^N \pi_x \right] \times (1 - q_1) \times [1 - \theta_2(m \times i)] \tag{45}
$$

$$
\pi_S^2(M - i)' = \left[ \left(\frac{i \times r_2}{M} + 1\right) \times (1 - p_2) + \frac{p_2}{N} \times \sum_{x=1}^N \pi_x \right] \times (1 - q_1) \times [1 - \theta_2(m \times i)]. \tag{46}
$$

Based on the payoffs, we can get the fixation probabilities ρ′<sub>SG</sub>, ρ′<sub>GS</sub>, ρ′<sub>SL</sub>, ρ′<sub>LS</sub>, ρ′<sub>LG</sub>, and ρ′<sub>GL</sub> for global income redistribution, which are given as follows:

$$
\rho_{SG}' \approx \left[ 1 + \sum_{j=1}^{m-1} e^{\omega_1 (1 - p_2)(1 - q_1)(1 - q_2) j} \right]^{-1} \times \left( 1 + \sum_{j=1}^{M-1} \left\{ e^{\omega_2 (1 - p_2)(1 - q_1) \sum_{i=1}^j \left[ 1 - \theta_2(m i) \right]} \prod_{k=1}^j \left[ \frac{P_{GS}^{1}(k)}{P_{SG}^{1}(k)} \right] \right\} \right)^{-1} \tag{47}
$$

$$
\rho_{GS}' \approx \left\{ 1 + \sum_{j=1}^{m-1} e^{-\omega_1 (1 - p_2)(1 - q_1)(1 - q_2)} \right\}^{-1} \times \left( 1 + \sum_{j=1}^{M-1} \left\{ e^{-\omega_2 (1 - p_2)(1 - q_1) \sum_{i=1}^j \left[ 1 - \theta_2(m i) \right]} \prod_{k=1}^j \left[ \frac{P_{SG}^{1}(k)}{P_{GS}^{1}(k)} \right] \right\} \right)^{-1} \tag{48}
$$

$$
\rho_{SL}' \approx \left\{ 1 + \sum_{j=1}^{m-1} e^{\omega_1 (1 - p_2)(1 - q_2) \sum_{i=1}^j \left[ 1 - \theta_1(i) \right]} \right\}^{-1} \times \left\{ 1 + \sum_{j=1}^{M-1} e^{\left[ \omega_2 (1 - q_1 - r_1) + \omega_1 (m - 1) \right] (1 - p_2)(1 - q_2)} \right\}^{-1} \tag{49}
$$

$$
\rho_{LS}' \approx \left\{ 1 + \sum_{j=1}^{m-1} e^{-\omega_1 (1 - p_2)(1 - q_2) \sum_{i=1}^{j} \left[ 1 - \theta_1(i) \right]} \right\}^{-1} \times \left\{ 1 + \sum_{j=1}^{M-1} e^{\left[ -\omega_2 (1 - q_1 - r_1) - \omega_1 (m - 1) \right] (1 - p_2)(1 - q_2)} \right\}^{-1} \tag{50}
$$

$$
\rho_{LG}' \approx \frac{1}{m} \times \left[ 1 + \sum_{j=1}^{M-1} e^{\omega_2 (1 - p_2) \sum_{i=1}^{j} \left[ 1 - \theta_2(m i) \right] \left( r_1 + q_1 \frac{i r_2}{M} \right)} \right]^{-1} \tag{51}
$$

$$
\rho_{GL}' \approx \frac{1}{m} \times \left[ 1 + \sum_{j=1}^{M-1} e^{-\omega_2 (1 - p_2) \sum_{i=1}^{j} \left[ 1 - \theta_2(m i) \right] \left( r_1 + q_1 \frac{i r_2}{M} \right)} \right]^{-1} \tag{52}
$$

Based on these fixation probabilities, it is easy to deduce the corresponding stationary distribution and the fixation time with a complete form for the case of global income redistribution.

#### 4. RESULTS AND DISCUSSION

Sustainable development calls for more and more global cooperation. Former collective risk dilemma models, however, seldom distinguish global cooperators from local ones. In this paper, we explicitly consider different cooperators arising from the group structured population to address how global cooperative behavior is affected by collective risk and income redistribution mechanisms. Income redistribution is a means of adjusting the income among individuals, which could make full use of social capital. We explore how income redistribution of different levels influence the evolution of global contribution in multi-level threshold public goods games.

We first study the local income redistribution mechanism. The stationary distributions of the three strategies are compared in **Figure 1A**. As the local income expenditure proportion p<sub>1</sub> increases, X<sub>G</sub> (the stationary distribution of G) and X<sub>L</sub> show an ascending trend while X<sub>S</sub> descends. Global cooperation is thus promoted by local income redistribution compared with the typical TPGG, which corresponds to p<sub>1</sub> = 0. In the PGG model, the Nash equilibrium predicts zero provision, so selfishness is the dominant strategy while global cooperation is inferior. When public goods can only be provided if global contributions reach a minimum threshold, Pareto efficient outcomes can become Nash equilibria. In the TPGG, however, we still see significant underprovision of the global public goods. After introducing local income redistribution, players share part of their payoffs. The mechanism changes the payoff comparison between different strategies in the local group, which reduces the inferiority of global cooperators. Under high risk in particular, global cooperation becomes a Nash equilibrium (if the collective target is so large that it requires almost all the players to contribute). For different gain-factors, **Figure 1B** shows that global cooperative behavior is promoted as p<sub>1</sub> rises. It is well known that, in the context of the PGG, small values of the gain-factor favor defectors and large values benefit cooperators. In our work, a larger r<sub>1</sub>/r<sub>2</sub> indicates a much worse condition for global contributors. Even in such situations, compared with the frequencies of global cooperation at p<sub>1</sub> = 0, global cooperators still have a much better chance of survival when p<sub>1</sub> > 0. That is because the wealth gap between global cooperators and others narrows with increasing p<sub>1</sub>, which gives cooperative behaviors more chance to prevail in the multi-level TPGG than without such a mechanism. Local income redistribution balances the income difference among individuals in the same group. As a consequence, the final payoff of each player depends strongly on the quality of its group. Obviously, more local cooperators make larger contributions in a group with a fixed number of participants, so a player in a group with more local cooperators has a competitive advantage over the other players. Thus, local income redistribution, which acts as a driving force for promoting cooperation in the local group, especially local cooperation, gives prominence to the role of groups in the evolution of cooperation.

We then probe how the global income redistribution mechanism influences the evolution of different strategies in the multi-level TPGG model. As illustrated in **Figure 2**, the stationary distribution of G (X<sub>G</sub>) shows an ascending trend while X<sub>S</sub> and X<sub>L</sub> descend as p<sub>2</sub> increases. Unlike under local income redistribution, only global cooperation is clearly improved. Under global income redistribution, the accumulated second-order payments are redistributed to the whole population irrespective of strategies and contributions. Thus, on the one hand, the payoff differences among strategies are reduced. On the other hand, the evolutionary advantage of compact cooperative clusters cannot spread to the whole population. Global income redistribution therefore inhibits the heterogeneity of groups. When p<sub>2</sub> → 1, almost all the players share all of their fortunes. Under such circumstances, the whole population is in a state of random drift, and each strategy holds a stationary distribution of 1/3.

In the following, we study how long the population takes to fixate at each state under both income redistribution mechanisms. We focus on the fixation time of each strategy, especially that of the G strategy. **Figure 3** shows how the average time for a mutant of each strategy to invade a population consisting entirely of each of the other two strategies changes as the income expenditure proportions, p<sub>1</sub> and p<sub>2</sub>, increase. After introducing income redistribution (either global or local) into the multi-level TPGG model, the time for G to invade the other two strategies is clearly shortened. The larger p<sub>1</sub> (or p<sub>2</sub>) is, the more likely global cooperation is to be learned and adopted by holders of the other strategies, so the G strategy can occupy the entire population more quickly. By comparison, the G strategy fixates faster under global income redistribution than under local income redistribution. The fixation time of S changes in the opposite direction: the time for S to invade the other two strategies is remarkably prolonged. It is known that the fixation time of L is shortened in TPGG compared with PGG. For local income redistribution, the fixation time of L declines, and it becomes more difficult for the other strategies to invade L. For global income redistribution, however, although the time for L to invade S is shortened, the time for L to invade G becomes longer as p<sub>2</sub> rises. Compared with its promotion of global cooperation and inhibition of selfishness, global income redistribution has only a small impact on local cooperation. In the limiting case p<sub>2</sub> = 1, fixation is equally difficult for all the strategies.

FIGURE 1 | The influence of the local income expenditure proportion on the stationary distribution of strategies. (A) shows how the stationary distributions of selfishness, local contribution, and global cooperation (*X<sub>S</sub>*, *X<sub>L</sub>*, and *X<sub>G</sub>*) change as the local income expenditure proportion *p*<sub>1</sub> increases. *X<sub>G</sub>* and *X<sub>L</sub>* are promoted as *p*<sub>1</sub> increases, while *X<sub>S</sub>* decreases. This means that the effect of local income redistribution on promoting global cooperation becomes more pronounced as the proportion of redistribution in groups increases. Parameters are *m* = 5, *M* = 20, *N* = 100, *q*<sub>1</sub> = *q*<sub>2</sub> = 0.8, *r*<sub>1</sub> = 2, *r*<sub>2</sub> = 3, *s*<sub>1</sub> = 2, *s*<sub>2</sub> = 160, and ω<sub>1</sub> = ω<sub>2</sub> = 0.005. (B) shows how the stationary distribution of global cooperation changes as *p*<sub>1</sub> increases under different gain factors. Three *r*<sub>1</sub>/*r*<sub>2</sub> ratios are studied.

FIGURE 2 | The influence of the global income expenditure proportion on the stationary distribution of strategies. (A) shows how the stationary distributions of *S*, *L*, and *G* (*X<sub>S</sub>*, *X<sub>L</sub>*, and *X<sub>G</sub>*) change as the global income expenditure proportion *p*<sub>2</sub> increases. *X<sub>G</sub>* is promoted as *p*<sub>2</sub> increases, while *X<sub>S</sub>* and *X<sub>L</sub>* decrease. This means that the effect of global income redistribution on promoting global cooperation becomes more pronounced as the proportion of redistribution in the whole population increases. Parameters are *m* = 5, *M* = 20, *N* = 100, *q*<sub>1</sub> = *q*<sub>2</sub> = 0.8, *r*<sub>1</sub> = 2, *r*<sub>2</sub> = 3, *s*<sub>1</sub> = 2, *s*<sub>2</sub> = 160, and ω<sub>1</sub> = ω<sub>2</sub> = 0.005. (B) shows how the stationary distribution of global cooperation changes as *p*<sub>2</sub> increases under different gain factors. Three *r*<sub>1</sub>/*r*<sub>2</sub> ratios are explored.

We further investigate how decision-making is affected by changes in the thresholds. As shown in **Figure 4**, global cooperative behavior is promoted as the global threshold increases. Adding a threshold to the global PGG turns the game from a social dilemma into a kind of coordination game. In particular, with a large threshold, players face a sufficiently severe potential crisis, namely the risk that all the players may lose their wealth. A higher global threshold means a bigger target that has to be reached to avoid the risk. Global cooperation is necessary for public safety and becomes more and more important as the risk increases. Once the disaster happens, all the individuals are equally wealthy. Thus, global cooperators can gain a foothold. Because income redistribution narrows the payoff differences among strategies, global cooperation has more opportunity to be adopted in high-risk cases. This paves the way for global cooperators to emerge in the population. In comparison, for the same income expenditure proportion, p<sub>1</sub> = p<sub>2</sub>, the growth of global cooperation is more pronounced under the global income redistribution mechanism than under the local one. This hints that when more wealth is shared with the whole population, the relationship among individuals becomes closer. Hence, individuals are more inclined to cooperate globally to collect global public goods and resist the disaster. Moreover, we study the influence of the local threshold on the results. As shown in **Figure 5**, increasing s<sub>1</sub> clearly promotes local cooperation while inhibiting selfishness. Meanwhile, it has only a small impact on global cooperation, which drops slightly as s<sub>1</sub> rises. The results are obtained under the global income redistribution mechanism, and similar results are found under local income redistribution. Compared with the global threshold, the local threshold has a much smaller effect on global cooperation. Since we focus on global cooperation, we mainly study the impact of the global threshold in this paper.

FIGURE 3 | The fixation time changes with the income expenditure proportions. (A) reflects the local income redistribution; (B) reflects the global income redistribution. In each panel, the average fixation times of each strategy invading the others are shown. When a mutant *G* invades an *S* population, τ<sub>*SG*</sub> denotes the average time to go from the pure *S* state to the pure *G* state. When a mutant *G* invades an *L* population, τ<sub>*LG*</sub> denotes the average time to go from the pure *L* state to the pure *G* state. Both τ<sub>*SG*</sub> and τ<sub>*LG*</sub> decline as *p*<sub>2</sub> and *p*<sub>1</sub> increase. Likewise, the cases in which a mutant *L* invades an *S* population, a mutant *L* invades a *G* population, a mutant *S* invades an *L* population, and a mutant *S* invades a *G* population are denoted τ<sub>*SL*</sub>, τ<sub>*GL*</sub>, τ<sub>*LS*</sub>, and τ<sub>*GS*</sub>, respectively. The results show that larger *p*<sub>2</sub> and *p*<sub>1</sub> benefit the fixation of global cooperation. Parameters are: *m* = 5, *M* = 20, *N* = 100, *r*<sub>1</sub> = 2, *r*<sub>2</sub> = 3, *s*<sub>1</sub> = 2, *s*<sub>2</sub> = 160, *q*<sub>1</sub> = *q*<sub>2</sub> = 0.8, and ω<sub>1</sub> = ω<sub>2</sub> = 0.005.

FIGURE 4 | The stationary distribution changes with the global threshold *s*<sub>2</sub>. Global cooperation is promoted as the global threshold *s*<sub>2</sub> increases, while selfishness and local cooperation decrease. This hints that players are more apt to cooperate globally under high global risks. Both income redistribution mechanisms are investigated, and the stationary distributions of strategies are calculated for each mechanism under the same parameter values. Parameters are: *m* = 5, *M* = 20, *N* = 100, *r*<sub>1</sub> = 2, *r*<sub>2</sub> = 3, *s*<sub>1</sub> = 2, *q*<sub>1</sub> = *q*<sub>2</sub> = 0.8, *p*<sub>1</sub> = 0.5, *p*<sub>2</sub> = 0.5, and ω<sub>1</sub> = ω<sub>2</sub> = 0.005.

FIGURE 5 | The stationary distribution changes with the local threshold *s*<sub>1</sub>. Local cooperation is promoted as *s*<sub>1</sub> increases, while selfishness decreases. The local threshold has little influence on global cooperation, which drops slightly as *s*<sub>1</sub> increases. Local risk makes the *L* strategy a better behavior to choose. The results are obtained under the global income redistribution mechanism, and similar results are found under local income redistribution. Parameters are: *m* = 5, *M* = 20, *N* = 100, *r*<sub>1</sub> = 2, *r*<sub>2</sub> = 3, *s*<sub>2</sub> = 160, *q*<sub>1</sub> = *q*<sub>2</sub> = 0.8, *p*<sub>1</sub> = 0.5, *p*<sub>2</sub> = 0.5, and ω<sub>1</sub> = ω<sub>2</sub> = 0.005.

Under the global income redistribution mechanism, we further investigate the joint influence of the global threshold s<sub>2</sub> and p<sub>2</sub> on the evolution of strategies. As shown in **Figure 6**, as s<sub>2</sub> and p<sub>2</sub> increase, global cooperation shows an increasing trend, while selfishness declines correspondingly. It is worth noting that the trend of local cooperation with p<sub>2</sub> depends on the size of the global threshold s<sub>2</sub>. For smaller s<sub>2</sub>, local cooperation decreases as p<sub>2</sub> rises; for larger s<sub>2</sub>, local cooperation increases as p<sub>2</sub> rises. For intermediate s<sub>2</sub>, local cooperation rises at lower p<sub>2</sub> and then falls at higher p<sub>2</sub>, so there exists an optimal value of p<sub>2</sub> for local cooperators under global income redistribution. This phenomenon means that, on the one hand, global income redistribution reduces income inequality within the whole population, which is generally regarded as a positive improvement to society. On the other hand, it may negatively affect the efficiency of the social-economic system (local economic development). Therefore, the income expenditure proportion, which can be described as the social tolerance, should be limited in such cases. Beyond these limits, the enthusiasm for individual investment and the rapid development of the social-economic system may be on the brink of collapse.

FIGURE 6 | The stationary distribution changes with *p*<sub>2</sub> under different global thresholds. Global cooperation is promoted as *p*<sub>2</sub> and *s*<sub>2</sub> increase, while selfishness decreases. This hints that players are more apt to cooperate globally under high global risks. The trend of local cooperation depends on *s*<sub>2</sub>. For lower *s*<sub>2</sub>, local cooperation decreases as *p*<sub>2</sub> rises. For higher *s*<sub>2</sub>, local cooperation increases as *p*<sub>2</sub> rises. For moderate *s*<sub>2</sub>, local cooperation first rises and then drops as *p*<sub>2</sub> increases. Parameters are: *m* = 5, *M* = 20, *N* = 100, *r*<sub>1</sub> = 2, *r*<sub>2</sub> = 3, *s*<sub>1</sub> = 2, *q*<sub>1</sub> = *q*<sub>2</sub> = 0.8, and ω<sub>1</sub> = ω<sub>2</sub> = 0.005.

### 5. CONCLUSION

In this paper, we have studied the evolution of strategies in multi-level threshold public goods games, where global and local cooperation are clearly distinguished. By introducing two kinds of income redistribution mechanisms, we investigate how the income expenditure proportions (p<sub>1</sub> and p<sub>2</sub>) and risks (thresholds) influence the average abundance of strategies and the fixation time. It is shown that with larger income redistribution proportions, players are more apt to cooperate globally, especially under high collective risks. When individuals are conscious of an even greater calamity, they are apt to form an alliance to prevent the risk through global cooperation. The more disruptive the danger is, the more likely they are to reach the collective target. Selfishness is effectively inhibited under both income redistribution mechanisms. This implies that an income redistribution mechanism may be effective for solving the social dilemma of free-riding and promoting social equity. We further compare the influences of local and global income redistribution on global and local cooperation. It is found that, compared with local income redistribution, global income redistribution is more favorable to global cooperation. In contrast, local income redistribution is more beneficial for local cooperation. Our model is relatively simple compared with actual situations, but it characterizes some main features of systems with income redistribution and shows that the frequency of global cooperation may be promoted in some cases. This study may provide some useful implications for investors, fundraisers and government officials. The theoretical analysis in this work is only a first step toward models of the learning process. Since learning and interaction between players should be on the same scale, we hope more accurate theoretical methods for this kind of model can be explored in the future.

#### AUTHOR CONTRIBUTIONS

JD and BW designed and performed the research as well as wrote the paper.

#### FUNDING

This research was supported by the National Key Research and Development Program of China (2016YFB0901900), the National Natural Science Foundation of China (NSFC) (Grant No. 61703082), the Fundamental Research Funds for the Central Universities (Grant No. N160403001), the Fund for Innovative Research Groups of the National Natural Science Foundation of China (Grant No. 71621061), the Major Program of National Natural Science Foundation of China (Grant No. 71790614), the Major International Joint Research Project of the National Natural Science Foundation of China (Grant No. 71520107004), the 111 Project (B16009), and the Joint Funds of the National Natural Science Foundation of China (Grant No. U1435218).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Du and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Behavioral Heterogeneity Affects Individual Performances in Experimental and Computational Lowest Unique Integer Games

#### Takashi Yamada\*

*Faculty of Global and Science Studies, Yamaguchi University, Yamaguchi, Japan*

This study computationally examines (1) how the behaviors of subjects are represented, (2) whether the classification of subjects is related to the scale of the game, and (3) what kind of behavioral models are successful in small-sized lowest unique integer games (LUIGs). In a LUIG, *N* (≥ 3) players submit a positive integer up to *M* (> 1), and the player choosing the smallest number not chosen by anyone else wins. For this purpose, the author considers four LUIGs with *N* = {3, 4} and *M* = {3, 4} and uses the behavioral data obtained in the laboratory experiment by Yamada and Hanaki [1]. For computational experiments, the author calibrates the parameters of typical learning models for each subject and then runs round robin competitions. The main findings are as follows. First, the subjects whose play did not differ from the mixed-strategy Nash equilibrium (MSE) prediction tended to make use of not only their own choices but also the game outcomes. Meanwhile, those who deviated from the MSE prediction attended only to their own choices as the complexity of the game increased. Second, the heterogeneity of player strategies depends on both the number of players (*N*) and the upper limit (*M*). Third, when groups consist of different agents, as in the earlier laboratory experiment, sticking behavior is quite effective for winning.

#### Edited by:

*Isamu Okada, Soka University, Japan*

#### Reviewed by:

*Tom Langen, Clarkson University, United States Kazuki Tsuji, University of the Ryukyus, Japan*

\*Correspondence: *Takashi Yamada, tyamada@yamaguchi-u.ac.jp*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *09 November 2017* Accepted: *04 December 2017* Published: *19 December 2017*

#### Citation:

*Yamada T (2017) Behavioral Heterogeneity Affects Individual Performances in Experimental and Computational Lowest Unique Integer Games. Front. Phys. 5:65. doi: 10.3389/fphy.2017.00065*

Keywords: lowest unique integer games, laboratory experiment, heterogeneity of strategies, learning, agent-based simulation

### 1. INTRODUCTION

In social and economic systems, individuals, groups, firms and so on make their decisions based on the rules they must obey. For example, call markets, continuous double auctions and other trading mechanisms are seen in financial markets, and investors trade by taking into consideration which mechanism is in place [2]. Likewise, first- and second-price formats are usually employed in auction markets, and the theoretical bid differs with the auction format [3]. On the other hand, new types of social and economic systems have also been proposed, and some of them have been introduced in practice. Among these, the Swedish lottery (SL) game Limbo and Lowest/Highest Unique Bid Auctions (LUBA/HUBA), such as the Auction Air or Juubeo websites, are new systems in which participants are required to be unique while taking the risk of not being so.

Lowest Unique Integer Games (LUIGs) are highly simplified versions of SL and LUBA/HUBA. In a LUIG, N (≥ 3) players simultaneously submit a positive integer up to M. The player choosing the smallest number that is not chosen by anyone else is the winner. In cases where no player chooses a unique number, there is no winner. For instance, suppose there is a LUIG with N = 3 and M = 3. There are three players, A, B, and C, who each submit an integer between 1 and 3. If the integers chosen by A, B, and C are 1, 2, and 3, respectively, then A wins the game. If the integers chosen by A, B, and C are 1, 1, and 2, respectively, then C is the winner. And, as noted, if all of them choose the same integer, there is no winner.
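The winning rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `luig_winning_number` and `luig_winner` are hypothetical, and players A, B, and C are represented by list positions 0, 1, and 2.

```python
from collections import Counter


def luig_winning_number(choices):
    """Return the smallest integer chosen by exactly one player, or None."""
    counts = Counter(choices)
    unique = [number for number, count in counts.items() if count == 1]
    return min(unique) if unique else None


def luig_winner(choices):
    """Return the list position of the winner, or None if nobody is unique."""
    winning = luig_winning_number(choices)
    return None if winning is None else choices.index(winning)


# The three examples from the text (N = 3, M = 3).
print(luig_winner([1, 2, 3]))  # 0: A wins with the integer 1
print(luig_winner([1, 1, 2]))  # 2: C wins with the integer 2
print(luig_winner([2, 2, 2]))  # None: no winner
```

Returning `None` for the no-winner case keeps that outcome distinct from any valid player index.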

LUIGs are more tractable than the above-mentioned real systems because the exact number of players or participants and the options are known when decisions are made. In this sense, these types of real systems have recently been attracting much attention from scholars of various disciplines<sup>1</sup>. In addition, several social or economic systems have characteristics of LUIGs. As Östling et al. [4] have pointed out, "choices of traffic routes and research topics, or buyers and sellers choosing among multiple markets" (p. 3) are probable examples. The Braess paradox can also be explained by LUIGs [1]. While previous studies have investigated these related systems theoretically and empirically, the behaviors of the bidders and participants and the dynamics of game outcomes are not so clear. Likewise, experimental studies on LUIGs and related systems are still scarce, except for Östling et al. [4] and Rapoport et al. [5]. Östling et al. conducted a laboratory experiment on SL and found that four main kinds of behavior are observed: random, stick, lucky and strategic. Based on their findings, Mohlin et al. proposed two learning models, global cumulative imitation and similarity-based imitation, in which players make use of not only their own choice but also the game outcome to update their attractions [6]. On the other hand, Rapoport et al. experimentally studied a version of LUBA/HUBA with (N, M) ∈ {(5, 4), (5, 25), (10, 25)} and found that only a small fraction of subjects behaved as theoretically predicted [5].

Yamada and Hanaki experimentally studied LUIGs to determine if and how subjects self-organized into different behavioral classes, in order to obtain insights into choice patterns that can shed light on the alleviation of congestion problems [1]. They considered four LUIGs with N = {3, 4} and M = {3, 4} and ran a laboratory experiment with a total of 192 subjects. Each subject played two separate LUIGs, which differed in either N or M. Therefore, each LUIG had 96 subjects, who were equally split into two parties: those who played it in Game 1 and those who played it in Game 2<sup>2</sup>. Accordingly, 48 subjects played each of the four LUIGs in Game 1, which yielded 16 groups in three-person LUIGs and 12 groups in four-person LUIGs. Yamada and Hanaki found that (a) choices made by more than 1/3 of subjects were not significantly different from what a symmetric mixed-strategy Nash equilibrium (MSE) predicts; however, (b) subjects who behaved significantly differently from what the MSE predicts won the game more frequently.

These early experimental studies suggest that the strategies and decision-making of subjects are heterogeneous and that following the theoretical predictions may not be an effective way to win more often. Yet, because the number of samples is limited, it is necessary to examine intensively the relations between the behavior and learning of individuals, which can be an origin of heterogeneity, and their performances. This study extends the past experimental work to check whether such successful or unsuccessful behaviors also hold in games with different opponents. For this purpose, the author pursues a computational approach in which the calibrated agents play with all agents including themselves (round robin contest) and compares the laboratory and computational experiments. Here, several typical learning models are employed to express the behaviors of subjects in the laboratory experiment. Then, the one with the best likelihood for every subject in each game setup is used for computational experiments.

Several studies have employed both experimental and computational approaches to computationally test the experimental results and vice versa. According to Duffy, the advantages are summarized as "the agent-based approach to understand results obtained from laboratory studies with human subjects" and "to understand findings from agent-based simulations with follow-up experiments involving human subjects" (p. 951) [7]. The necessity of combining the two approaches has been argued, and methodologies have been proposed over the last decade (e.g., [8–11]). A few studies indeed employ the combined approach to computationally test the validity of experimental findings in the laboratory, implement an intensive computational experiment, and extend the experimental design by using the laboratory data [12–14].

### 2. MATERIALS AND METHODS

In their laboratory experiment, Yamada and Hanaki [1] observed that continuing to choose the same number was an effective way to win LUIGs. However, it was not certain at that time whether such sticking behavior was really successful. Here, a computational experiment of round robin competition is employed to examine its effectiveness. Before the competition, several typical learning models are employed, and the parameters of the models are then estimated for each subject.

#### 2.1. Learning Models

The learning models are as follows:

	- An AL1 player i has a propensity a<sub>i</sub><sup>k</sup>(t) for number k (k = 1, · · · , M) at the beginning of round t. Before the start of a game, she is assumed to have the same non-negative propensities for all the possible integers, namely a<sub>i</sub><sup>j</sup>(0) = a<sub>i</sub><sup>k</sup>(0) ≥ 0 for j ≠ k.

In every round, she chooses one integer according to the following exponential selection rule

$$p\_i^k(t) = \frac{\exp(\lambda\_a \cdot a\_i^k(t))}{\sum\_{k'=1}^M \exp(\lambda\_a \cdot a\_i^{k'}(t))}$$

where p<sub>i</sub><sup>k</sup>(t) is i's selection probability for integer k in round t, and λ<sub>a</sub> is a positive constant called the sensitivity parameter [15, 16].

<sup>1</sup>The list of related work is found in Yamada and Hanaki [1].

<sup>2</sup>The whole explanation for the experimental design and the mixed-strategy Nash equilibrium in each LUIG are given in Yamada and Hanaki [1].

TABLE 1 | Classification of subjects by observed behavior in the laboratory and the estimated learning model.

Cramer's coef. = 0.194; Cramer's coef. = 0.386; Cramer's coef. = 0.388

After a round, propensities are updated as

$$a\_i^k(t+1) = (1 - \phi\_a)a\_i^k(t) + \mathbf{1}\_{\{k, s\_i(t)\}} \psi\_a R\_i$$

where φ<sub>a</sub> and ψ<sub>a</sub> are positive constants called learning parameters [15, 16], and **1**<sub>{·}</sub> is the indicator function that takes the value 1 if k = s<sub>i</sub>(t) and 0 otherwise. Here s<sub>i</sub>(t) is the number that player i actually chose in round t, and R<sub>i</sub> is the payoff received. Note that the model is called "cumulative" if ψ<sub>a</sub> = 1 and "averaging" if ψ<sub>a</sub> = φ<sub>a</sub>.
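The exponential selection rule and the AL1 updating rule above can be sketched as follows. This is a minimal illustration, not the author's C++ implementation: the function names are hypothetical, propensities are stored in a plain list whose position k − 1 holds the propensity for integer k, and subtracting the maximum before exponentiating is a standard numerical safeguard that does not change the probabilities.

```python
import math
import random


def choice_probabilities(propensities, sensitivity):
    """Exponential selection rule over the M propensities."""
    peak = max(propensities)  # shift by the maximum to avoid overflow
    weights = [math.exp(sensitivity * (a - peak)) for a in propensities]
    total = sum(weights)
    return [w / total for w in weights]


def update_al1(propensities, chosen, payoff, phi, psi):
    """Decay every propensity by (1 - phi) and reinforce only the chosen integer.

    `chosen` is the integer actually submitted (between 1 and M).
    """
    return [
        (1.0 - phi) * a + (psi * payoff if k == chosen else 0.0)
        for k, a in enumerate(propensities, start=1)
    ]


def draw_choice(propensities, sensitivity, rng=random):
    """Sample one integer between 1 and M from the selection probabilities."""
    probs = choice_probabilities(propensities, sensitivity)
    return rng.choices(range(1, len(probs) + 1), weights=probs)[0]


# Equal initial propensities give a uniform choice; a win on 2 reinforces 2.
start = [0.0, 0.0, 0.0]
after_win = update_al1(start, chosen=2, payoff=1.0, phi=0.1, psi=1.0)
print(choice_probabilities(start, sensitivity=2.0))
print(after_win)
print(choice_probabilities(after_win, sensitivity=2.0))
```

With ψ<sub>a</sub> = 1 this corresponds to the "cumulative" variant; passing `psi=phi` gives the "averaging" variant.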

• Three variables adaptive learning (AL3)

Players using this model take into consideration two additional psychological assumptions, experimentation and forgetting. Here, propensities are updated as

$$a\_i^k(t+1) = (1 - \phi\_b)a\_i^k(t) + \mathbf{1}\_{\{k, s\_i(t)\}} \psi\_b R\_i$$

when they win and

$$a\_i^j(t+1) = (1 - \phi\_b)a\_i^j(t) + \psi\_b \epsilon R/(M-1) \quad (j \ne s\_i(t))$$

when they lose<sup>3</sup>. φ<sub>b</sub> and ψ<sub>b</sub> are also learning parameters, and ε is an experimentation parameter. Here ε is set to 1.0.

• Naive imitation (NI)

Players using this model follow the winning number regardless of whether they are the winner or not. When a "no-winner" situation occurs, they choose the preceding number<sup>4</sup>.

While the selection rule is the same as that in the AL1 and AL3 models, the updating rule is as follows:

$$a\_i^k(t+1) = (1 - \phi\_n)a\_i^k(t) + \mathbf{1}\_{\{k, \nu(t)\}} \psi\_n R\_i$$

where ν(t) is the winning number in round t, and φ<sub>n</sub> and ψ<sub>n</sub> are also learning parameters.

• Stick

Players using this model always choose only one number<sup>5</sup>.

#### 2.2. The Data to Calibrate

Since the subjects in the earlier laboratory experiment were asked to choose and submit one of the M integers, the experimental data for calibration include rounds, the choices of subjects and the winning number for every group in every LUIG. In other words, they were not asked to imagine what numbers their opponents would choose or to determine the probability distribution so that one number would be randomly chosen.

<sup>3</sup>A similar learning model for the Swedish lottery is proposed by Mohlin et al. [6]. In their model, players pay attention to the numbers around the winning number when they lose. However, since the number of options in the LUIGs here is much smaller, it may be possible to take into account all numbers except the chosen one in the same situation. If the players consider only the winning number, the following "naive imitation" model applies.

<sup>4</sup>Since there is no information about the winning number at the beginning of the computational experiments, they choose one integer in accordance with the exponential selection rule.

<sup>5</sup>In LUIGs, level-k thinking chooses a number randomly (k = 0), chooses 1 (odd k), or chooses 2 (even k).

FIGURE 1 | Generated dendrogram (LUIG33).

To determine a learning model for every subject, the author set one condition and made one assumption. First, only the experimental data in Game 1 were used for calibration. This is because learning across the games cannot be clearly treated. For example, when subjects play a LUIG with M = 3 in Game 1 and one with M = 4 in Game 2, it is not clear how the initial propensity for the integer 4 should be given. Besides, even if the calibration is done, it is not preferable that the initial state differs across subjects. Second, all initial propensities in Game 1 are set to zero; namely, the subjects did not have any prior belief about others or view of the game. Then, the learning model with the best log likelihood is employed for the simulation<sup>6</sup>. Note that the subjects who did not change their number at all in Game 1 belong to "stick."
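The selection step described above can be illustrated with a short sketch that evaluates the log likelihood of one subject's choice sequence under an AL1-type model with zero initial propensities and then picks the model with the largest value. It only evaluates the likelihood for given parameter values; the parameter search itself (done with R's "optim" in the paper) is not reproduced. The function names and the example numbers are hypothetical.

```python
import math


def al1_log_likelihood(choices, payoffs, M, sensitivity, phi, psi):
    """Log likelihood of an observed choice sequence under an AL1-type model.

    Initial propensities are all zero, as assumed in the text.
    """
    propensities = [0.0] * M
    log_lik = 0.0
    for chosen, payoff in zip(choices, payoffs):
        peak = max(propensities)
        weights = [math.exp(sensitivity * (a - peak)) for a in propensities]
        log_lik += math.log(weights[chosen - 1] / sum(weights))
        # Decay all propensities and reinforce only the chosen integer.
        propensities = [
            (1.0 - phi) * a + (psi * payoff if k == chosen else 0.0)
            for k, a in enumerate(propensities, start=1)
        ]
    return log_lik


def select_model(choices, log_likelihoods):
    """Return "Stick" for subjects who never changed, else the best-fitting model."""
    if len(set(choices)) == 1:
        return "Stick"
    return max(log_likelihoods, key=log_likelihoods.get)


# Hypothetical subject: five rounds of a LUIG with M = 3, payoff 1 for a win.
choices = [1, 1, 2, 1, 1]
payoffs = [1.0, 0.0, 0.0, 1.0, 1.0]
print(al1_log_likelihood(choices, payoffs, M=3, sensitivity=1.5, phi=0.2, psi=1.0))
print(select_model(choices, {"AL1": -4.1, "AL3": -4.6, "NI": -5.0}))
```

In practice each candidate model's log likelihood would first be maximized over its own parameters before the comparison is made.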

#### 2.3. Computational Round Robin Contest

The experimental design is as follows:


<sup>6</sup> "optim" function in R was used for calibration.

combinations is <sub>48</sub>H<sub>N</sub> and an agent faces 1,174 (three-person LUIGs) and 29,329 (four-person LUIGs) patterns of opponents.


TABLE 2 | Expected behaviors of representative agents in each cluster (ID: Subject ID in the session).



All numerical results in the next section were computed in double precision on a 2.4 GHz PC with 8 GB of RAM and a Linux OS (kernel 4.4.52-2vl6). All the source code was written in C++ and compiled and optimized with GNU g++ version 4.9.3<sup>7</sup>.

### 3. RESULTS

#### 3.1. Classification of Subjects

Before discussing the results of computational round robin competitions, the author needs to pay attention to how the subjects were classified and whether there are relations between their calibrated learning model and their behaviors observed in the laboratory.

**Table 1** shows the relation between the calibrated learning model and the choice and change criteria given in Yamada and Hanaki [1]<sup>8</sup>. The two updating rules, cumulative and averaging, are merged into one category. Cramer's coefficient of association for each LUIG is also given. Note that the abbreviation "LUIG34" means that the number of players N is 3 and the upper limit M is 4. Thus, the first number following "LUIG" is N, and M comes next.
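For reference, Cramer's coefficient of association for an r × c contingency table can be computed from the chi-squared statistic as V = sqrt(χ² / (n · (min(r, c) − 1))). The sketch below uses this textbook definition without any bias correction; whether the paper applied a correction is not stated, and the example tables are hypothetical, not the data of **Table 1**.

```python
import math


def cramers_v(table):
    """Cramer's coefficient of association for a contingency table (list of rows).

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    smaller_dim = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))


# Hypothetical 2 x 2 tables: perfect association and no association.
print(cramers_v([[10, 0], [0, 10]]))  # 1.0
print(cramers_v([[5, 5], [5, 5]]))    # 0.0
```

Values near 0 indicate that the calibrated model and the behavioral classification are nearly independent, while values near 1 indicate a strong association.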

Cramer's coefficient of association seems to depend on both N and M. When N and M are small, the value is relatively low (0.193 for LUIG 33). On the other hand, if N and/or M are large, the coefficient becomes larger. In particular, Cramer's coefficient of association for LUIG 44 is 0.561; namely, many of the subjects whose play did not differ from the MSE prediction are classified as NI players, whereas those who deviated from the MSE prediction took into account only their own choices. This means that, since larger N and M make it more difficult to infer what numbers one's opponents chose from one's own choice and the winning number, some of the subjects came to rely on the available information.

Next, the author takes a look at how the subjects in the laboratory would have played if the game had continued. To answer this question, the author employed cluster analysis. By doing so, the expected behaviors of subjects would be quantitatively categorized and characterized.

To conduct the analysis, the following procedure was employed. First, the propensities in round 50 of the laboratory experiment were calculated from the game log. Second, the probability of choosing each integer in round 51 was obtained. Third, the updated choice probability was calculated for all the possible cases. Here, "case" means that a subject's choice is k and the winning number is w. Accordingly, there are M(M + 1) cases in total in a LUIG. Lastly, the author set the following values as inputs:

	- Updated probability to choose the same integer in round 52 when there is no winner in round 51
	- Updated probability to choose the same integer in round 52 when s/he wins in round 51
	- Updated probability to choose the same integer in round 52 when s/he loses in round 51
	- Updated probability to choose the winning integer in round 52 when s/he loses in round 51

After obtaining a dendrogram<sup>9</sup> for each LUIG, the author split the subjects into four or five clusters and obtained the inputs of the "median" agents in each cluster<sup>10</sup> (**Figures 1**–**4**).

**Table 2** summarizes how the representative agents in each cluster would play and update their propensities in round 51<sup>11</sup>.

There are mainly three choice patterns observed: continuing to choose one number, completely or relatively randomized behavior with fluctuation, and completely or relatively randomized behavior without fluctuation. The first pattern includes sticking behavior and behavior resulting from reinforcement. The remaining two patterns stem from the fact that the corresponding subjects failed to reinforce their propensities and that they were sensitive to the winning number. In addition, the value of the sensitivity parameter was small, so that every number was about equally likely to be chosen at any time. Hence, whether or not a subject stuck to a number played an important role in LUIGs, which may support the results of the earlier laboratory experiment.

<sup>7</sup>The source code is available upon request.

<sup>8</sup>The choice criterion is whether the relative frequencies of the chosen numbers differed from those in the MSE prediction, while the change criterion is whether the frequency of changing numbers differed from that in theory.

<sup>9</sup>The agglomeration method was "ward.D2" in R.

<sup>10</sup>The resulting dendrograms are given in the appendix.

<sup>11</sup>The meaning of string "10c" is "When number 1 is chosen and the winning number is 0 (= no-winner), the probability to choose the same number (= 1)." Likewise, the meaning of string "12w" is "When number 1 is chosen and the winning number is 2, the updated probability to choose the winning number."

#### 3.2. Experimental Results

Agents in the round robin competition faced all the agents, including themselves. This allows the author to compare their performances when they played with different opponents and when their opponents included copies of themselves.

**Table 3** shows the summary statistics of each LUIG in terms of the agent structure. The data include the frequency of game outcomes, the number of wins, the number of changes, and Pearson's correlation between the numbers of wins and changes. This table also provides the results of the laboratory experiment and the theoretical prediction for comparison. The partitions of agents are as follows:

Three-person LUIGs:

• 3: Three identical agents exist;

• 2–1: Two identical agents and one different agent exist; and

• 1–1–1: Three different agents exist.

Four-person LUIGs:

• 4: Four identical agents exist;

• 3–1: Three identical agents and one different agent exist;

• 2–2: Two different pairs of two identical agents exist;

• 2–1–1: Two identical agents and two other different agents exist; and

• 1–1–1–1: Four different agents exist.

Identical agents are agents that played with one or more agents whose learning model and parameter values were the same as their own, although the updating process differed between them. Different agents are agents for which at least the learning model or its parameter values differ from those of the others in the group.

TABLE 4 | Differences of performance with respect to the constitution of players (*p*-values are from Wilcoxon signed rank test).

(C) LUIG 43

(D) LUIG 44

*p* < 0.001 (#wins), *p* < 0.001 (#changes).
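The partition labels listed above can be derived mechanically from the composition of a group. The following is a minimal sketch, assuming each calibrated agent carries an identifier (the identifiers and the function name `partition_label` are hypothetical, not taken from the author's code) and that two agents count as identical when they share the same identifier:

```python
from collections import Counter


def partition_label(agent_ids):
    """Return a partition label such as '2–1–1' for a group of agents.

    agent_ids: one identifier per seat in the group; equal identifiers
    stand for agents with the same learning model and parameter values.
    """
    # Count how many copies of each agent sit in the group, largest first.
    counts = sorted(Counter(agent_ids).values(), reverse=True)
    return "\u2013".join(str(c) for c in counts)  # joined with an en dash


# Hypothetical identifiers, used only for illustration.
print(partition_label(["s01", "s01", "s07"]))
print(partition_label(["s01", "s02", "s03", "s04"]))
```

Sorting the counts in descending order makes the label independent of seating order, so a group of two copies plus one singular agent is always labeled 2–1.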

The above partitions of agents are related to behavioral heterogeneity. When heterogeneity was high, "no-winner" situations were observed less frequently, and the average number of wins was therefore larger. This is especially true for three-person LUIGs. In four-person LUIGs, things are slightly different: when there are only two kinds of agents and one agent is singular, the average number of wins per agent is about 8.96 (LUIG43) and 9.55 (LUIG44), whereas when all the agents are different, the value is lower, at 8.89 (LUIG43) and 9.32 (LUIG44). In addition, when one compares LUIGs with the same N but different M, the average number of wins per agent may depend on heterogeneity. More concretely, it is more difficult to win when agents are homogeneous, while there are more chances to win when heterogeneity exists.

Similar results and discussions apply to the correlation between the numbers of wins and changes. When heterogeneity is low and there are no singular agents, not changing numbers may lead to more wins in both three-person and four-person LUIGs. As heterogeneity increases, the negative correlation becomes stronger, which suggests that continuing to choose the same number is effective in such groups, as in the earlier laboratory experiment.

Next, **Table 4** shows the differences in performance between identical agents and different agents for each agent constitution, from which one can see how each type of agent behaved and how often it won. An apparent fact is that the different agents won more than the identical agents. This is statistically confirmed by Wilcoxon's rank sum test, and all the p-values are less than 0.001. However, the superiority of uniqueness disappears when there are more different agents. This is because the identical agents tended to behave similarly, meaning that their choices were often not unique, and the different agent(s) learned to avoid such choices. There is also a clear difference between the two types of agents with respect to the number of changes and Pearson's correlation. Identical agents, on the one hand, changed more often and are expected to do so in order to win more. This may be because they learn to play differently and to change more often. Different agents, on the other hand, changed less frequently than identical agents when both identical and different agents were present. When there are more different agents, they need not change their strategy to win.

TABLE 5 | Differences between types of subjects with respect to the numbers of wins and changes in computational round robin contest.

(C) LUIG 43

(D) LUIG 44

*a, MSE (both) – non-MSE, MSE (choice) – non-MSE.*

*b, MSE (both) – MSE (choice), MSE (both) – non-MSE.*

*c, MSE (both) – non-MSE.*

*d, MSE (both) – non-MSE, MSE (change) – non-MSE.*

*e, MSE (both) – non-MSE, MSE (choice) – non-MSE, MSE (change) – non-MSE.*

One point needs to be addressed. Reviewing **Table 4**, the reader may notice a difference in the sign of Pearson's correlation for the partition 1–1–1–1 of LUIG43 and LUIG44: the correlations are negative in the experimental results but positive in the computational results. This is because the computational correlations are obtained from 17,296 (three-person LUIGs) or 194,590 (four-person LUIGs) groups, not only from those which were played in the laboratory (16 groups in three-person LUIGs and 12 groups in four-person LUIGs). Hence, if one calculates the correlations by picking only the corresponding groups, the values are −0.820 in LUIG43 and −0.767 in LUIG44. Likewise, the correlations are −0.755 in LUIG33 and −0.737 in LUIG34. This means that the computational experiment supported the experimental findings for the groups generated in the laboratory and, at the same time, that the earlier laboratory experiment might have needed more subjects. Alternatively, a possible reason why the sign of Pearson's correlation is opposite is that the relative frequencies of game outcomes in four-person LUIGs were not reproduced, which might stem from the learning of the calibrated agents.
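The group counts quoted above can be related to the size of the agent pool by a short calculation. The sketch below assumes, which the text does not state explicitly, that the pool holds 48 calibrated agents (16 groups of three, or equivalently 12 groups of four) and that a group of all-different agents is an unordered selection from that pool:

```python
from math import comb

N_AGENTS = 48  # assumption: 16 groups x 3 subjects = 12 groups x 4 subjects

# Number of unordered groups in which all agents are different.
three_person_groups = comb(N_AGENTS, 3)
four_person_groups = comb(N_AGENTS, 4)

print(three_person_groups, four_person_groups)
```

Under these assumptions the three-person count is 17,296, matching the figure in the text, while the four-person count comes out as 194,580, which is ten fewer than the quoted 194,590. The quoted figure may therefore rest on a counting convention not described here.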

Finally, **Table 5** shows the differences in the numbers of wins and changes between the types of subjects in each partition of LUIGs. The tables report average values, and the p-values are from the Kruskal-Wallis test. The last column of each table gives the results of multiple comparisons when the corresponding pairs differ significantly (at the 5% level); the details are given in the footnote of each panel.

When agents are identical in the group, MSE (both) agents seemed to win more than non-MSE agents while they changed more frequently. On the other hand, when the agents are different, non-MSE agents won more than MSE (both) agents by not changing their choices. Since the subjects were all different in every group, one will experimentally and computationally find that sticking behavior is quite effective so long as there are no identical players in small-sized LUIGs.

To summarize, the extent of behavioral heterogeneity may depend on the scale of LUIGs, the number of players in a group and the upper limit. In addition, the observed game outcomes and individual performances depend on the constitution of agents. In particular, behavioral heterogeneity may improve the chances of winning. When there is a mixture of identical agents and different agents, different agents win more than identical agents. However, full diversity lessens the winning opportunities for each different agent. With respect to individual performance, the computational experiment shows that continuing to choose the same number leads the agents to win more, which supports the experimental findings.

#### 4. DISCUSSION

This study computationally examines (1) how the behaviors of subjects are represented, (2) whether the classification of subjects is related to the scale of the game, and (3) what kind of behavioral models are successful in small-sized LUIGs, using the earlier experimental data of Yamada and Hanaki [1]. For these purposes, the behavior of each subject is calibrated and assigned to one of several typical learning models. Then, a computational round-robin competition is run, including games in which every agent faces not only different agents but also itself. The main findings are as follows. First, the subjects who played not differently from the MSE prediction tended to make use of not only their choices but also the game outcomes, whereas those who deviated from the MSE prediction paid attention only to their choices as the complexity of the game increased. Second, when groups consist of different agents, as in the earlier laboratory experiment, sticking behavior is quite effective for winning LUIGs.

Since this study deals with estimated learning models, unlike Linde et al. [17], there may be better models for some of the behavioral data from the laboratory experiment. Hence, as done by Linde et al., it is necessary to conduct another laboratory experiment in which the subjects' decisions for playing LUIGs are elicited. Another direction for future work is a larger-sized experiment to see whether similar behaviors and game dynamics are also observed. This is motivated by the empirical findings of Östling et al. [4] and Mohlin et al. [18].

#### AUTHOR CONTRIBUTIONS

TY formulated the research questions, wrote and ran the computer programs, analyzed the experimental and computational results, and wrote the manuscript.

#### ACKNOWLEDGMENTS

Financial support from Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Young Scientists (B) (24710163) and Grant-in-Aid (C) (15K01180), from Canon Europe Foundation under a 2013 Research Fellowship Program, and from JSPS and ANR under the Joint Research Project, Japan – France CHORUS Program, "Behavioural and cognitive foundations for agent-based models (BECOA)" (ANR-11-FRJA-0002) is gratefully acknowledged.

#### REFERENCES

example from prediction markets research. Comput Math Organ Theory (2012) **18**:63–90. doi: 10.1007/s10588-011-9098-2


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Yamada. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### APPENDIX

This section gives the dendrograms generated to classify the calibrated agents in the computational round-robin contests. The x-axis stands for the subject ID (session–subject) and the y-axis shows the distance between the calibrated agents. The expected decision-making of the "median" agents in each cluster is summarized in **Table 2**.

# An Online Experimental Framework for Cooperative Relationships With a Real-Time Decision-Making and Rewarding Environment

Reiji Suzuki<sup>1</sup>\*, Momoka Ito<sup>1</sup>, Shunnya Kodera<sup>2</sup>, Keita Nishimoto<sup>3</sup> and Takaya Arita<sup>1</sup>

*<sup>1</sup> Graduate School of Informatics, Nagoya University, Nagoya, Japan, <sup>2</sup> Institute of Investment Technology, Nikko Research Center, Inc., Tokyo, Japan, <sup>3</sup> Graduate School of Information Science, Nagoya University, Nagoya, Japan*

#### Edited by:

*Tatsuya Sasaki, F-Power Inc., Japan*

#### Reviewed by:

*Christopher X. Jon Jensen, Pratt Institute, United States Alberto Antonioni, Universidad Carlos III de Madrid, Spain*

> \*Correspondence: *Reiji Suzuki reiji@nagoya-u.jp*

#### Specialty section:

*This article was submitted to Social Evolution, a section of the journal Frontiers in Ecology and Evolution*

> Received: *13 October 2017* Accepted: *15 May 2018* Published: *22 June 2018*

#### Citation:

*Suzuki R, Ito M, Kodera S, Nishimoto K and Arita T (2018) An Online Experimental Framework for Cooperative Relationships With a Real-Time Decision-Making and Rewarding Environment. Front. Ecol. Evol. 6:74. doi: 10.3389/fevo.2018.00074*

This paper investigates interactions between game theoretical strategies and social relationships in real-time decision-making and rewarding environments. We propose an experimental framework based on techniques of web-based multiplayer online games for this purpose. In our framework, multiple human players, represented as particles in a two-dimensional space of social interactions, can modify their positions and game strategies for the prisoner's dilemma in real time, and receive benefit or cost emerging from both game theoretical and social relationships with neighboring players. We report on experiments with human participants in different conditions of the payoff matrix, which reflects game structures, and the speed of each player, which reflects the ability to change her social relationship. We show that cooperative relationships emerge in real human groups regardless of experimental settings, and show their basic behavioral patterns. We further discuss relationships between behavioral characters of participants in the experiments and their psychological characters to see how their personalities can be reflected in their behavior in such a game theoretical framework, and show that a few psychological characters of participants might reflect their behavioral characters at least in part, but there were variations in these relationships between experimental groups.

Keywords: social particle swarm, social dynamics, multiplayer online game-based experiments, prisoner's dilemma, the Big Five personality traits, the relational mobility scale

#### INTRODUCTION

Understanding human behavior in real-time decision-making environments is getting much attention, because such situations are ubiquitous in both real-world activities (e.g., stock markets, team works, school activities) and social networks (e.g., Facebook, Twitter, Instagram). While traditional game theoretical approaches have mainly focused on discrete interactions (e.g., standard repeated games) (Maynard Smith, 1982; Hofbauer and Sigmund, 1998), recent studies have shown human behavior in real-time decision-making environments is different from that in cases with discrete interactions (Friedman and Oprea, 2012; Hawkins and Goldstone, 2016).

Hawkins and Goldstone (2016) conducted a version of a two-player asymmetric coordination game, termed the Battle of the Exes, in both real-time decision-making environments and traditional staged environments. In their environment, players were placed at opposite ends of a two-dimensional virtual world and allowed to move toward one of two destinations, each corresponding to a decision of the player, with full freedom to change that destination at any time. They reported that players who were allowed to interact continuously within rounds achieved outcomes with greater efficiency and fairness than players who were forced to make simultaneous decisions. Friedman and Oprea (2012) also considered a case of continuous interactions based on a prisoner's dilemma in which players can switch between cooperation and defection at any point in time and receive a flow of payoffs that changes in continuous time according to the changes in their strategies. They showed that the proportion of cooperative behavior in this real-time decision-making and rewarding environment was much higher than that in a case with standard discrete and repeated interactions.

It has also been shown theoretically that the structure of social networks can influence the emergence of cooperative behavior (Nowak and May, 1992; Nowak, 2006; Pinheiro et al., 2012). Recent experimental studies with interaction networks of human populations based on repeated games suggested that the population structure can affect the evolution of cooperative behavior as theoretically expected (Rand et al., 2014), or may not affect it as significantly as theoretically expected (Grujić et al., 2014), because participants might adopt different strategy-updating criteria (e.g., moody conditional cooperation, Grujić et al., 2014; reinforcement learning, Horita et al., 2017) rather than an imitation-based criterion (e.g., imitating the best), which is a common assumption in theoretical models.

In addition, theoretical studies showed that dynamic changes in network structures can affect the global dynamics of human behaviors (Zimmermann and Eguíluz, 2005; Pacheco et al., 2006; Suzuki et al., 2008), and recent experimental studies with human participants have also shown that cooperative clusters can emerge when participants can modify the network structure of their interactions (Fehl et al., 2011; Rand et al., 2011; Wang et al., 2012; Antonioni et al., 2014; Yonenoh and Akiyama, 2014). This is because participants tend to keep cooperative relationships while severing connections with defectors, and thus form cooperative and highly connected clusters in general (Rand et al., 2011). Recently, Cuesta et al. (2015) showed that the availability of the reputation of neighbors (i.e., the history of their actions in the past few rounds) can facilitate the emergence of cooperative clusters, and Antonioni et al. (2016) further showed that two types of participants, reliable subjects and cheaters, existed when participants were allowed to falsify their own reputation at a cost.

There have also been studies that focused on the effects of the mobility of agents on the evolution of cooperation in spatial environments. Meloni et al. (2009) showed that an intermediate speed of random movement on a continuous 2D space can facilitate the evolution of cooperation. Sicardi et al. (2009) also showed that random movement on a 2D diluted grid, in which vacant cells are allowed to exist, can affect different types of 2-person games differently. Antonioni et al. (2015) first conducted an experimental study with human participants in such a situation, in which each participant can move toward a vacant neighboring cell, and showed that cooperative clusters formed temporarily but dissolved due to invasion by defectors. Efferson et al. (2016) also showed that participants can establish cooperative clusters by running away from bad behavior even when they do not have much information about potential new neighbors.

However, these previous studies on the evolution of cooperation in dynamically networking or spatially interacting populations assumed discrete interactions between individuals in the sense that relationships between individuals are discrete (i.e., "connected or not" or "neighbor or not") and their relationships also change in time in a discrete manner while real human relationships could be continuous and can change in continuous time, as described above.

Our purpose is to understand how both game theoretical strategies and social relationships among humans change in real time decision-making and rewarding environments. For this purpose, we are developing an experimental framework based on techniques of web-based multiplayer online games (Kodera et al., 2017).

We use a simple multi-player game that is adapted from Nishimoto et al.'s computational model for investigating dynamically changing social relationships, termed the social particle swarm (SPS) model (Nishimoto et al., 2013, 2014). See Nishimoto et al. (2013) for details. They assumed that individuals were in a two-dimensional and toroidal plane. This represents a social or psychological space in which the proximity between two individuals reflects their social or psychological closeness. Each particle has a strategy for the prisoner's dilemma (PD) game, and moves according to the force vector generated from the payoffs in the game. The behavior of the particles in each step consists of two sequential processes. First, all particles simultaneously decide whether to select cooperation or defection in the current step in a tit-for-tat fashion, based on the proportion of cooperators among their neighboring agents within a fixed range in the previous time step. If this proportion is larger than an attribute value of each individual, termed the cooperation threshold, the focal individual cooperates, and otherwise it defects. Second, each individual receives an attractive (repulsive) force from each neighbor who gives a positive (negative) payoff according to the payoff matrix of the PD game, the magnitude of which is proportional to the payoff value and inversely proportional to the distance between the focal individual and the neighbor. Then, each individual moves in the direction of the resultant vector of all the forces at a fixed speed. They observed repeated occurrences of explosive dynamics that consisted of the formation of an altruistic cluster followed by its collapse with explosive dispersal of defective particles. While Antonioni et al. (2015) showed the formation and collapse of cooperative clusters in a 2D diluted grid environment, it is unclear how human groups behave in a situation in which many individuals continuously change their relationships in real time.
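The two processes of an SPS step described above can be sketched in a few lines. This is an illustrative reading of the verbal description, not the implementation of Nishimoto et al.: the plane size, neighborhood range, speed, and payoff values are hypothetical placeholders, and the treatment of a particle with no neighbors (proportion of cooperators taken as zero) is an assumption.

```python
import math

SIZE = 500.0    # side length of the toroidal plane (hypothetical)
RADIUS = 100.0  # neighborhood range (hypothetical)
SPEED = 1.0     # distance moved per step (hypothetical)
# Hypothetical PD payoffs to the focal particle: (own move, neighbor's move).
PAYOFF = {("C", "C"): 1.0, ("C", "D"): -1.5, ("D", "C"): 1.5, ("D", "D"): -1.0}


def torus_delta(a, b):
    """Shortest signed displacement from a to b along one toroidal axis."""
    d = (b - a) % SIZE
    return d - SIZE if d > SIZE / 2 else d


def sps_step(pos, prev_moves, thresholds):
    """One step: choose moves from last step's neighbors, then move by forces."""
    n = len(pos)
    nbrs = []
    for i in range(n):
        found = []
        for j in range(n):
            if j == i:
                continue
            dx = torus_delta(pos[i][0], pos[j][0])
            dy = torus_delta(pos[i][1], pos[j][1])
            dist = math.hypot(dx, dy)
            if dist <= RADIUS:
                found.append((j, dx, dy, dist))
        nbrs.append(found)

    # Process 1: cooperate if the share of cooperating neighbors in the
    # previous step exceeds the particle's cooperation threshold.
    moves = []
    for i in range(n):
        if nbrs[i]:
            share = sum(prev_moves[j] == "C" for j, *_ in nbrs[i]) / len(nbrs[i])
        else:
            share = 0.0  # assumption: isolated particles see no cooperators
        moves.append("C" if share > thresholds[i] else "D")

    # Process 2: sum forces (payoff / distance, along the unit vector toward
    # the neighbor) and move a fixed distance along the resultant.
    new_pos = []
    for i in range(n):
        fx = fy = 0.0
        for j, dx, dy, dist in nbrs[i]:
            if dist == 0.0:
                continue  # direction undefined for coincident particles
            mag = PAYOFF[(moves[i], moves[j])] / dist
            fx += mag * dx / dist
            fy += mag * dy / dist
        norm = math.hypot(fx, fy)
        x, y = pos[i]
        if norm > 0.0:
            x = (x + SPEED * fx / norm) % SIZE
            y = (y + SPEED * fy / norm) % SIZE
        new_pos.append((x, y))
    return new_pos, moves
```

A positive payoff pulls the focal particle toward the neighbor and a negative payoff pushes it away, so mutual cooperators drift together while a cooperator exploited by a defector retreats.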

In our preliminary framework, multiple human players, represented as particles in a shared two-dimensional space of social interactions, can modify their positions and game strategies for the prisoner's dilemma in real time, and receive benefit or cost emerging from both game theoretical and social relationships with neighboring players. In preliminary experiments (Kodera et al., 2017), we did not observe stable emergence of cooperative relationships, and simple analyses showed that this could be due to several model settings such as no limitation of visibility of other players (i.e., all players can observe all others), and the lower limit of the accumulated score.

In this paper, we propose an updated framework to investigate dynamic changes in continuously changing social relationships in real time decision-making and rewarding environments by considering these factors that might negatively affect the emergence of cooperation in the previous one. We discuss benefits of this framework for this purpose by conducting several experiments with human participants, showing that cooperative relationships can emerge regardless of parameter settings relating to the game structure and the mobility of players, and analyzing their behavioral patterns.

We further discuss relationships between behavioral characters of participants in the experiments and their psychological characters to see how their personalities can be reflected in their behavior in such a game theoretical framework. Relationships between the behaviors of users in online social networks (e.g., Facebook) and their personality have been discussed (Gosling et al., 2011; Seidman, 2013). Gosling et al. (2011) found several connections between the Big Five personality traits and self-reported Facebook-related behaviors, suggesting that the users extended their offline personalities into the domains of online social networks. We conducted a survey on the Big Five personality traits (Oshio et al., 2012) and the relational mobility of the participants' social environments (Yuki et al., 2007) after the experimental sessions. We analyzed the correlation between behavioral characters in the experiments and these psychological characters, showing that a few psychological characters of participants might reflect their behavioral characters at least in part, but there were variations in these relationships between experimental groups.

### MATERIALS AND METHODS

### A Multi-Player Online Game-Based Experimental Framework

We first introduce a multi-player game based on the SPS model to observe continuous and dynamic relationships. Then, we explain how we implemented this framework to simulate this game situation with human participants.

We assume that N human subjects (players) participate in an experimental trial. Each player is represented as a point and arranged in a 500 × 500 two dimensional and toroidal space. **Figure 1A** shows an example interface showing the distribution of 7 players in the neighboring area of a player in the plane.

The position of each player represents her social state relative to the other players, which approximates her physical, social and psychological properties that may affect her interest in her neighbors. The proximity between two players reflects their social closeness. Each player can move freely in this space and change her strategy for the prisoner's dilemma (cooperate or defect) at any time during a trial. The score arising from her social relationships with neighbors is accumulated through a session, and the objective of each player is to maximize her own accumulated score.

To simulate such a real-time decision-making and rewarding environment, we implemented a server-client framework using WebSocket and HTML5, which are used for developing Webbased online games.

### Client Application

**Figure 1A** shows an example of the web-based client application for each human player. It enables a player to log in to a server application with a handle name. During an experimental session, each player can see the current spatial distribution of neighboring players as shown in the square panel. The circle with a radius R = 100 in the plane represents the neighboring area. The focal player can observe other players within this area, and recognize them as neighbors. Cooperators are represented as blue points, and defectors are represented as red points. The focal player is always placed at the center of the panel, and she is connected to the neighboring players by lines to emphasize the distance to them. The color of each connecting line indicates the strategy of the other player (orange: cooperator, gray: defector) and its width is inversely proportional to the distance. The handle name and the current accumulated score of the focal player are indicated around the player in the plane. There is also a leaderboard showing the current ranking of all the players.

A player can specify her direction of movement using the mouse cursor (**Figure 1B**). If the focal player places the mouse cursor outside of the small staying area around her on the panel, she moves toward the cursor in the space. Note that her position in the space changes, but she is kept in the center of the panel, which always shows her neighboring area. The strategy of the focal player is flipped when the "c" key is pressed. In addition, the strategy is also flipped with a small probability of 0.2% at every time step to make players pay close attention to their strategy. Specifically, the client application asynchronously sends the xy-coordinate of the mouse cursor on the window to the server application at every time step of 0.2 s. It also sends a key event every time the "c" key is pressed.

#### Server Application

The server application conducts two procedures at every time step with a short time interval of DT = 0.5 s. First, it updates the accumulated scores of all players. The strategy of each player is updated using the information that was sent from client applications. Each player i gets a score depending on her current social relationships with all neighbors, which is defined by **Table 1** and Equation (1):

$$\mathrm{Score} = \sum_{j \in \text{neighbors of } i} \frac{pd_{i,j}}{d_{i,j} + 1}, \tag{1}$$

where $pd_{i,j}$ represents the payoff, given in **Table 1**, that player i gets by playing the game with j, and $d_{i,j}$ represents the distance between i and j in the plane. This equation means that the basic game theoretical relationship between players is based on the prisoner's dilemma, but the net score decreases with the distance between them, which reflects the effect of the social relationship between them. The score is accumulated over the whole game playing time.

Entries of **Table 1** are given as *(player's score, opponent's score).*
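As an illustration of Equation (1), the per-step score of one player can be computed as follows. This is a minimal sketch rather than the server's code: the plane size (500) and neighborhood radius (R = 100) come from the text, but the payoff values other than the temptation T are hypothetical placeholders because the payoff table is not reproduced here, and treating the radius as inclusive is an assumption.

```python
import math

SIZE = 500.0    # side length of the toroidal plane (from the text)
RADIUS = 100.0  # neighborhood radius R (from the text)


def make_payoff(T):
    """Payoff to the focal player for (own strategy, neighbor's strategy).

    Only T (temptation to defect) is taken from the text; the other three
    values are hypothetical and chosen to satisfy T > R > P > S.
    """
    return {("C", "C"): 1.0, ("C", "D"): -1.0, ("D", "C"): T, ("D", "D"): 0.0}


def torus_distance(p, q):
    """Euclidean distance on the torus (shortest wrap-around path)."""
    dx = abs(p[0] - q[0]) % SIZE
    dy = abs(p[1] - q[1]) % SIZE
    return math.hypot(min(dx, SIZE - dx), min(dy, SIZE - dy))


def step_score(i, positions, strategies, payoff):
    """Score of player i in one time step, following Equation (1)."""
    total = 0.0
    for j, q in enumerate(positions):
        if j == i:
            continue
        d = torus_distance(positions[i], q)
        if d <= RADIUS:  # assumption: the boundary counts as a neighbor
            total += payoff[(strategies[i], strategies[j])] / (d + 1.0)
    return total
```

The "+ 1" in the denominator keeps the score finite when two players overlap, so a pair of mutual cooperators at distance zero each earn exactly the table payoff per step.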

Then, the position of each player is updated according to the information sent from the corresponding client. If a player has specified a direction with the mouse cursor, she moves V pixels in that direction, and thus can move at a speed of V/DT pixels per second. Finally, the states and positions of all players and their accumulated scores are sent to all the clients and reflected in their interfaces.
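The position update can be sketched in the same spirit. The function names and the representation of "no movement" as `None` are hypothetical; the plane size and DT = 0.5 s are taken from the text.

```python
import math

SIZE = 500.0  # side length of the toroidal plane (from the text)
DT = 0.5      # server time step in seconds (from the text)


def move_player(pos, direction, V):
    """Move V pixels along `direction` (dx, dy), wrapping around the torus.

    `direction` is None when the cursor lies inside the staying area,
    in which case the player does not move (hypothetical representation).
    """
    if direction is None:
        return pos
    dx, dy = direction
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return pos
    return ((pos[0] + V * dx / norm) % SIZE, (pos[1] + V * dy / norm) % SIZE)


def speed_px_per_s(V):
    """Speed implied by moving V pixels every DT seconds."""
    return V / DT
```

With DT = 0.5 s, the two speed settings used in the experiments, V = 3 and V = 6, correspond to 6 and 12 pixels per second.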

### Experimental Procedures

The experimental procedures with human participants have been approved by the planning and evaluation committee in the Graduate School of Information Science, Nagoya University (GSIS-H28-3). Informed consent was obtained from all participants before the experiments. We recruited N = 23 undergraduate or graduate school students at Nagoya University as participants and conducted experiments (E1) on July 14th, 2017. All the participants were gathered in a computer room of our department. Each participant was assigned a standard desktop PC and asked to use an interface on a web browser with a mouse and a keyboard. They were not allowed to talk with each other during the experiment. They were asked to maximize their own accumulated score regardless of its value relative to others', and were told that they would receive 1,000 yen for taking part in the experiment and that an additional bonus of at most 500 yen would be paid according to the accumulated score. However, all participants received 1,500 yen after the experiment.

After an introduction of the game and the user interface to participants, 23 participants participated in three experimental sessions (S1–S3) and 21 participants participated in one experimental session (S4). The sessions were conducted sequentially and differed in the payoff matrix, with a temptation to defect T = 1.5 (S1, S2) or 1.2 (S3, S4), and in the speed, V = 3 (S1, S3) or 6 (S2, S4). The experimental instruction is shown as Data sheet 1 in the Supplementary Materials. Each player was asked to maximize her own accumulated score. Each session lasted for about 10 min, but players were not informed of the exact time limit in advance. We used data for the initial 265 s (i.e., 530 time steps) for the analysis of each session, which is the minimal experimental duration among all the sessions. We also held a practice session (for 1 min) before S1, and a short break (for 2 min) between S2 and S3.

Two players did not participate in S4 and did not complete the survey questionnaire (explained below). Another player participated in all the sessions but did not complete the survey questionnaire. Thus, we used the data from all the 23 participants for the behavioral analysis in S1, S2, and S3, and used the data from 21 participants in S4 in section Behavioral and psychological characteristics of players. We used the behavioral and personal data from the 20 participants who answered the questionnaire in the next section.

We also conducted another experiment (E2) on February 6th, 2018 with N = 13 undergraduate or graduate school students at Nagoya University, using a different order of experimental settings T = 1.5 (S1, S3) and 1.2 (S2, S4), and V = 3 (S1, S2), 6 (S3, S4). We conducted this experiment to grasp a general behavioral tendency that could be observed in both experiments. In this experiment, all participants participated in all the sessions and completed a survey questionnaire. We used data for initial 305 s (i.e., 610 time steps) for the analysis.

### A Survey on the Big Five Traits and the Relational Mobility

After S4, we also conducted a survey questionnaire on Big Five personality traits and relational mobility of their social environments. Specifically, we conducted the Big Five personality (or Five-Factor Model) test on all participants using a Japanese version of Ten-Item Personality Inventory (TIPI-J) (Oshio et al., 2012), which is based on Ten Item Personality Inventory (TIPI) proposed by Gosling et al. (2003). TIPI is a 10-item (questions) measure of the Big Five dimensions, which is commonly used to describe personality according to five traits: openness (inventive/curious vs. consistent/cautious), conscientiousness (efficient/organized vs. easy-going/careless), extraversion (outgoing/energetic vs. solitary/reserved), agreeableness (friendly/compassionate vs. challenging/detached), and neuroticism (sensitive/nervous vs. secure/confident).

The relational mobility is the degree to which individuals in a given society have the option to form new relationships and end old relationships (Yuki et al., 2007). When an individual perceives their social environment to be low in relational mobility, they perceive it as difficult and costly to leave current relationships and to establish new ones. We focus on how the relational mobility of their local environments can affect their behaviors in our experiments. The relational mobility scale (Yuki et al., 2007; Schug et al., 2010), a 12-item measure, was used to assess the two components of the relational mobility of participants.

Participants were asked to rate 22 statements (10 for the Big Five and 12 for the relational mobility) on a 7-point scale for the Big Five traits and a 6-point scale for the relational mobility scale (options ranged from 1, strongly disagree, to 6 or 7, strongly agree). We then calculated scores of the Big Five personality traits (OPE: openness to experience, CON: conscientiousness, EXT: extraversion, AGR: agreeableness, and NEU: neuroticism) and two components of the relational mobility (MNP: meeting new people, and CIP: choosing one's own interaction partners) for each individual. The questionnaire is shown as Data sheet 2 in the Supplementary Materials.

### RESULTS

### General Behavioral Tendency

**Figure 2A** shows the temporal dynamics of the proportion of cooperators and the average number of neighbors in E1-S3. While both indices fluctuated through the session, we observed that cooperative clusters emerged and collapsed locally. **Figure 2B** shows an example transition of the social dynamics in S3 of a kind that was often observed in all experimental sessions. We see that a cooperative cluster with a small number of players forms spontaneously (t = 25), and keeps or grows its size by increasing mutual benefit among players (t = 61). However, when some defectors find and approach them, or some players change their states from cooperative to defective, cooperative players escape from defectors and try to find other players to establish cooperative relationships (t = 90). Such emergence and collapse of cooperative clusters occurred repeatedly in all the experimental sessions.

**Figure 3** shows the average proportion of cooperators and moving players (i.e., the proportion of players who decided to move toward any direction) among all players at each time step, in each session. It should be noted that the proportion of cooperation was between about 0.63 and 0.82, meaning that many individuals tended to be cooperative in this experimental framework.

We also see that the proportion of cooperators increased as the experimental sessions proceeded from S1 to S4 (except for S3 and S4 in E2). It is highly possible that this trend is, at least in part, due to the effect of increased learning experience of game environments because experimental sessions were conducted sequentially. Having this in mind, we still observed a negative relationship between the proportion of cooperators and the proportion of moving players in both experiments. This implies that more successful cooperators tended to move less often. Also, the smaller temptation to defect (T) tended to contribute to the higher proportion of cooperation (except for S3 and S4 in E2), and the larger speed of movement (V) tended to contribute to the lower proportion of moving players. These might reflect the effect of the temptation to defect as expected, and also reflect that the ability to more quickly modify each player's social state contributed to form stable cooperative relationships.

**Figure 4** shows the proportion of cooperative and moving players at step t, split by whether the proportion of cooperative neighbors in the previous step t-1 was lower than 0.5 or not. In all the sessions, players tended to be more cooperative and to move less frequently when at least half of their neighbors were cooperative (Kolmogorov-Smirnov test, p-value < 0.001). This simple rule is expected to be a basic mechanism that contributed to the emergence of stable cooperation in these experiments.
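The conditioning behind this comparison can be sketched as follows. This is only a minimal illustration under stated assumptions, not the analysis code used in the study: the record layout (`coop_frac_prev`, `cooperated`, `moved`) and the function name are hypothetical, players without neighbors are assumed to have been excluded beforehand, and the Kolmogorov-Smirnov test itself is omitted.

```python
from typing import Dict, Iterable, Tuple

# One record per player and time step t (hypothetical layout):
#   coop_frac_prev : proportion of cooperative neighbors at step t-1
#   cooperated     : True if the player was cooperative at step t
#   moved          : True if the player moved at step t
Record = Tuple[float, bool, bool]


def conditional_proportions(records: Iterable[Record],
                            threshold: float = 0.5) -> Dict[str, Dict[str, float]]:
    """Split records by whether coop_frac_prev < threshold and average each group."""
    groups = {"low": [], "high": []}
    for frac, cooperated, moved in records:
        groups["low" if frac < threshold else "high"].append((cooperated, moved))

    result = {}
    for name, rows in groups.items():
        n = len(rows)
        result[name] = {
            "n": n,
            # Proportions are undefined (NaN) for an empty group.
            "cooperating": sum(c for c, _ in rows) / n if n else float("nan"),
            "moving": sum(m for _, m in rows) / n if n else float("nan"),
        }
    return result
```

Comparing the `"low"` and `"high"` groups per session gives the two bars of each panel; a two-sample test would then be applied to the per-player values in each group.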

### Behavioral and Psychological Characteristics of Players

Next, we focus on relationships among behavioral and psychological characteristics of each participant in experimental sessions. Specifically, we defined 5 behavioral indices that represent different aspects of each individual's behavior as follows: COO: the ratio of a focal player's cooperation (i.e., the proportion of time during which her state was cooperative), NEI: the number of neighbors, MOV: the proportion of steps at which the focal player moved, CHA: the number of time steps in which the focal player changed her strategy, SCO: the total score that the focal player obtained, in each session.

We conducted a correlation analysis (Spearman's rank correlation coefficient) among these 5 indices and the scores of the 7 components of psychosocial properties explained in section A Survey on the Big Five Traits and the Relational Mobility, to grasp the overall correlation among behavioral and psychological characters. We focused on statistically significant (p-value < 0.05) pairs of these indices in each experimental session.
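For readers unfamiliar with the statistic, Spearman's rank correlation coefficient is the Pearson correlation computed on ranks, with tied values sharing the mean of their ranks. The sketch below is a plain-Python illustration of that definition only; the function names are our own, it does not compute the p-value used for the significance filter, and it is not the software used in the study.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)
```

In the setting above, `x` and `y` would be the per-player values of two indices (for example COO and SCO) within one session.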

**Table 2** shows results of the analysis in each experimental session. In E1, there is a strong correlation between SCO and COO in all the experimental sessions, meaning that more cooperative players obtained higher scores. This is because cooperative players successfully established stable cooperative relationships as discussed in the previous section.

There is also a negative relationship between "SCO or COO" and "MOV and CHA", meaning that players who frequently changed their strategy and moved in the plane were less cooperative or obtained lower scores. This implies that defectors who were seeking and trying to exploit cooperative clusters were not successful probably because they were avoided by cooperators.

As for NEI, its relationship with SCO differed among sessions. In S1, NEI had a negative relationship with SCO, and in S2 it had no clear relationship with SCO. This could be because smaller clusters of cooperators (e.g., two or three cooperators) were frequently invaded by defectors when the experimental setting was beneficial for defectors (S1 and S2), or because players tended to be defectors due to their limited game experience. On the other hand, NEI had a positive relationship with SCO in both S3 and S4. This means that players who tended to form large cooperative clusters obtained higher scores when the experimental setting was beneficial for cooperators (S3 and S4).

As for the relationship between these behavioral characters of players and their psychological characters, it was not easy to see general trends across all experimental sessions. However, it should be noted that AGR had a positive relationship with NEI in S2 and S3 and with SCO in S3, and a negative relationship with MOV in S4. This implies that agreeable players tended to cluster and to move less frequently, obtaining a higher score. In S3 and S4, in which the proportion of cooperators was larger, COO had a negative relationship with CIP. This implies that players who do not have many chances to choose their own partners in their social environments tended to be cooperative.


*The value shows statistically significant (p-value < 0.05) correlation between two indices in each session. COO, the ratio of a focal player's cooperation; DIS, the average distance between the focal player and the others; NEI, the number of neighbors; MOV, the proportion of steps at which the focal player moved; CHA, the proportion of time steps in which the focal player changed her strategy; SCO, the total score that the focal player obtained, in each session; OPE, openness to experience; CON, conscientiousness; EXT, extraversion; AGR, agreeableness; NEU, neuroticism; MNP, meeting new people, and CIP, choosing one's own interaction partners.*

In E2, we see fewer significant correlations between indices than in E1, which is expected to be due to the small number of players. However, we still see a strong correlation between SCO and COO in all the experimental sessions, and tendencies similar to those observed in E1, such as a positive relationship between "COO or SCO" and NEI in S2, S3, and S4. On the other hand, we observed different relationships between behavioral and psychological indices: there was a positive correlation between CHA and NEU, meaning that more sensitive players tended to change their strategy more frequently. We also see an opposite relationship, namely the positive relationship between COO and CIP in S4. This implies that psychological characters of participants might reflect their behavioral characters at least in part, but there were variations in these relationships between experimental groups.

### DISCUSSION

We proposed and constructed an experimental framework to observe continuous and dynamic relationships in a group of human participants by applying techniques of web-based multiplayer online games. We implemented a multi-player game based on Nishimoto et al.'s SPS model in which human participants, represented as particles in a shared space, can change their positions and game theoretical strategies in real time, according to the benefits or costs arising from social relationships with neighboring players.

We found that cooperative clusters emerged in parallel in all experimental sessions, and found a strong positive assortativity between cooperators in some sessions. This is quite different from the cases in our preliminary experiments (Kodera et al., 2017). In those experiments, defectors dominated the population, chasing cooperators throughout the experimental sessions. This is probably because each player could observe all the other players in those cases, and thus defectors could exploit cooperators more easily. This implies that spatial locality is an essential factor for the emergence of cooperation in our framework. The fact that there was no incentive to avoid mutual defection when their scores were at the lower limit (0) could be another reason why defectors successfully dominated the population.

We also found a general behavioral tendency of participants: they tend to be cooperative and tend not to move when the proportion of neighboring cooperators is high. This fact supports the validity of the behavioral rule of particles adopted in the SPS model (Nishimoto et al., 2013), at least in part, in that their game strategy is based on the proportion of cooperators among neighbors and they tend to get close when they are cooperators.

It should be noted that psychological characters of participants reflected their behavioral characters in the three experimental sessions in E1, in part. That is, agreeable players established stable and cooperative clusters and obtained higher scores. Also, we found that players who have fewer chances to choose partners in their social environments tended to be cooperative. This may be due to the experimental settings in which cooperative clusters emerged easily. These results imply that our experimental framework can be a platform to conduct psychological experiments with many participants to see how psychological characters can affect global dynamics of social relationships emerging from interactions among them. However, at the same time, we also found that there were variations in these relationships between experimental groups. This implies that these relationships can be strongly affected by the social settings such as the number of participants and their distribution of psychological characters.

These results were obtained from two small groups of participants, and the sessions were conducted in a sequential order, and thus the order of sessions could have affected the results. We believe that more detailed analysis with many groups can clarify general behavioral strategies of humans in real-time decision-making and rewarding environments.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the planning and evaluation committee in the Graduate School of Information Science, Nagoya University, Japan with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the planning and evaluation committee in the Graduate School of Information Science, Nagoya University, Japan (GSIS-H28-3).

### AUTHOR CONTRIBUTIONS

RS, MI, and TA designed the experimental procedures, conducted experiments, and analyzed the results. KN constructed the original SPS model and SK constructed the web-based framework, and they advised on the experiments and analyses. RS and MI wrote the manuscript with support from all authors.

### FUNDING

This work was supported in part by Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research (JSPS KAKENHI) Grant number JP15K00335, JP15K00304, JP17KT0001 and JP17H06383 in #4903; and Topic-Setting Program to Advance Cutting-Edge Humanities and Social Sciences Research Grant number JP17J0011b.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00074/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Suzuki, Ito, Kodera, Nishimoto and Arita. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Role of Social Learning in the Evolution on a Rugged Fitness Landscape

#### Masahiko Higashi\*, Reiji Suzuki and Takaya Arita

*Graduate School of Informatics, Nagoya University, Nagoya, Japan*

The role and importance of social learning have been investigated by many researchers because it is observed in many animals and is expected to play a significant role in cultural phenomena. We explore the coevolution between individual learning and social learning on a rugged fitness landscape as a realistic condition in which they can interact with each other. We demonstrate that social learning allows individuals not to have adaptive traits innately, and thus, has two important roles to enhance individual fitness. First, social learning spreads and keeps the adaptive phenotypes acquired by individual learning. Second, social learning enables individuals to explore a wide range of fitness landscape by the increased population diversity. Based on the difference of the roles of individual and social learning, they can work complementarily in the course of adaptive evolution on the rugged fitness landscape.

#### Edited by:

*Tatsuya Sasaki, F-Power Inc., Japan*

#### Reviewed by:

*Eduardo J. Izquierdo, Indiana University Bloomington, United States Xiaojie Chen, University of Electronic Science and Technology of China, China Marija Mitrovic Dankulov, University of Belgrade, Serbia*

\*Correspondence: *Masahiko Higashi higashi@alife.cs.is.nagoya-u.ac.jp*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *20 November 2017* Accepted: *23 July 2018* Published: *28 August 2018*

#### Citation:

*Higashi M, Suzuki R and Arita T (2018) The Role of Social Learning in the Evolution on a Rugged Fitness Landscape. Front. Phys. 6:88. doi: 10.3389/fphy.2018.00088*

Keywords: social learning, individual learning, coevolution, Baldwin effect, fitness landscape

### INTRODUCTION

Animals adapt to their environment by two different mechanisms working on two levels, evolution and learning. Evolution is a population level mechanism and learning is an individual level mechanism. There has been a lot of discussion on the effects of learning on the course of evolution. Baldwin, a pioneer in epigenetic evolutionary theory, proposed a possible scenario, now called the Baldwin effect, that explains how evolution and learning interact with each other [1]. It consists of the following two steps [2]. (1) Some agents acquire adaptive phenotypes by learning, and then they increase in the population. (2) Because of the learning cost, agents which have the adaptive phenotypes innately become more adaptive than other agents, so the population evolves to have adaptive phenotypes innately, i.e., a genetic assimilation of adaptive phenotypes. Through these two steps, learning facilitates evolution.

Hinton and Nowlan devised a simple computational model that shows learning can accelerate evolution, and they associated this phenomenon with the Baldwin effect [3]. However, individuals in the model only used individual learning based on trial-and-error. Learning can be classified into individual learning (e.g., trial-and-error process) and social learning (e.g., imitation process). Via individual learning, animals adapt to their environment by using only their own experience, while via social learning they adapt to their environment by using other animals' experience. In general, it is considered that social learning affects the evolution of animals significantly, because it allows animals to acquire adaptive behavior without paying the cost of the trial-and-error process, and also because the adaptive behavior can evolve cumulatively through generations by imitation between adults and offspring. These types of transmission are necessary to create culture, and, driven by the interest in cultural evolution, social learning has been investigated for a long time from many points of view.



The two major focuses of the research on social learning have been the conditions under which social learning evolves and the way of social learning. In the research focusing on the conditions, researchers mainly investigated the effects of the fluctuation and structure of the environment, and successfully showed that social learning is favored in stable and simple environments. For example, on the effect of environmental fluctuation, Rendell et al. [4] and Jones et al. [5] found that when the environment is varied or intense, social learning is disfavored. On the effect of the structure of the environment, Tamura et al. [6] developed a mathematical model to explore the effect of social networks on social learning and revealed that social networks disfavor social learning. Kobayashi et al. [7] also developed an island migration model and revealed that spatial structures disfavor social learning. The way of social learning can be approached from at least two aspects: "when" and "whom" individuals learn from (Laland [8]). As a study focusing on "when," Enquist et al. [9] found that "critical social learning," which performs social learning while it has no information about the environment and then performs individual learning, is superior to pure social learning. Rendell et al. [4] also revealed that "conditional social learning," which performs social learning only when individual learning fails, is superior to pure social learning. As a study focusing on "whom," Mesoudi [10] found, by experimental simulation using human subjects, that the "copy-successful-individuals" strategy is more adaptive than individual learning. The "whom" aspect is linked to the biases in information transmission. Specifically, it has been shown that the conformist bias is adaptive under a broad range of environmental conditions [11–14].

However, most computational or mathematical research has assumed the transfer of very simple information (typically, which of two behaviors is correct) that sometimes becomes absolute [6, 7, 9, 11–14]. This situation could be interpreted as evolution on a fitness landscape with a single peak whose location might change occasionally. In the real world, however, the fitness landscape should in general have many peaks as local optima. Therefore, we explore the interactions between individual learning and social learning on a rugged fitness landscape as a more realistic condition. The purpose of our study is to clarify the evolutionary roles of individual and social learning on a rugged fitness landscape in the context of the Baldwin effect. We adopted a minimal fitness function [15] that represents a multi-modal fitness landscape in which there is a trade-off between the adaptivity of individuals and the strength of nonlinear epistatic interactions among multiple phenotypes. We constructed an agent-based evolutionary model in which each individual can adjust its plastic phenotypes using both individual learning based on trial-and-error and social learning based on imitation of multiple phenotypes from the most adaptive individual.

#### MODEL

#### Rugged Fitness Landscape

There are N individuals in a population and each individual has M traits t<sub>i</sub> (i = 0, ..., M − 1) as shown in **Table 1**. Each gene g<sub>i</sub> (i = 0, ..., M − 1) in an M-length chromosome GI represents the initial value of the corresponding trait t<sub>i</sub>, taking an integer value within the range [1, M]. Each individual has another M-length chromosome GP (p<sub>i</sub> (i = 0, ..., M − 1)) which decides whether the corresponding trait is plastic ("1") or not ("0"). Each row of plastic traits is highlighted in **Table 1**. Plastic traits can be changed through the individual or social learning process (described later). Each individual also has a gene s which represents the probability of performing social learning instead of individual learning; s takes a real value in the range [0, 1]. The trait values t<sub>i</sub> are determined in the range of g<sub>i</sub> ± 1 by learning. To evaluate the fitness of each set of traits, we adopted the following fitness function [15]:

$$fitness = \max_{n} f\left(n\right),\tag{1}$$

$$f(n) = \begin{cases} n \text{ if } num(n) \ge n, \\ 0 \text{ otherwise,} \end{cases} \tag{2}$$

where num(n) represents the number of traits whose phenotypic value is n. The fitness is determined by a group of traits which have the same value, using Equations (1, 2). Equation (2) shows that the trait group of n yields the fitness value n if its group size (num(n)) is greater than or equal to n, and Equation (1) shows that the highest f(n) among the trait groups defined by Equation (2) is adopted as the fitness of the trait set. For example, the fitness of the trait set in **Table 1** is 6 because the number of traits whose value is 6 is 6 and, at the same time, 6 is the highest value among those which satisfy the condition in Equation (2), as illustrated in **Table 2**.
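The evaluation described by Equations (1, 2) can be written in a few lines. The following is a minimal sketch of our reading of the text (the highest n for which at least n traits take the value n, and 0 if no such n exists); the function name and the 10-trait example are hypothetical and are not taken from **Table 1**.

```python
from collections import Counter
from typing import Sequence


def fitness(traits: Sequence[int]) -> int:
    """Return the largest n such that at least n traits have the value n, or 0."""
    counts = Counter(traits)  # counts[n] plays the role of num(n)
    satisfied = [n for n, num_n in counts.items() if n >= 1 and num_n >= n]
    return max(satisfied, default=0)


# Hypothetical trait set: six traits equal 6, so f(6) = 6;
# only one trait equals 7, so f(7) = 0; f(1) = 1 is also satisfied but lower.
example = [6, 6, 6, 6, 6, 6, 7, 3, 3, 1]
print(fitness(example))  # 6
```

Note how replacing a single 6 by a 7 in this example drops the fitness from 6 to 1, which is the kind of valley between peaks that makes the landscape rugged.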

This fitness function has the following two characteristics, and thus the fitness landscape is rugged as illustrated in **Figure 1**.


The benefit of using this fitness landscape is that we can explicitly grasp the contribution of each phenotypic value to the fitness and the progress of the evolution while keeping the ruggedness of the landscape high.


TABLE 2 | Fitness evaluation of the trait set in Table 1.


#### Individual Learning and Social Learning

We assume an intergenerationally overlapped population that consists of N/2 parents and N/2 offspring. **Figure 2** illustrates the population structure composed of two types of (i.e., parent and offspring) individuals. In each generation, all individuals simultaneously learn individually or socially L times regardless of being parents or offspring. In other words, the individuals learn L times with their parents and themselves after they are born in a generation, and then they become parents and learn L times with their offspring and themselves in the next generation. In each learning step, each individual chooses social learning with its genetically determined probability s, meaning it chooses individual learning with the probability 1-s. The way of learning is defined as follows.

#### Social Learning

The individual who chose social learning selects and imitates another individual who got the highest fitness in the last learning step. It makes each plastic trait closer to the corresponding trait of the selected individual, by adding −1 or +1 to the genetically determined initial value.

#### Individual Learning

The individual who chose individual learning changes possibly all of its plastic traits by adding a value selected randomly from {−1, 0, 1} to the genetically determined initial value. Selecting 0 means that the corresponding trait is not changed by learning.

The fitness of the acquired trait set is evaluated after each learning process. We define the step fitness as the highest value among all the fitness values of each individual's trait sets evaluated until the current time step. This means that individuals can keep and adopt the most adaptive trait set at each time step.
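The two learning rules and the step fitness can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: all function names are our own, the `models` argument (the trait set of the fittest individual in the previous step) is supplied by the caller rather than computed from a population, a plastic trait that already equals the model's value is assumed to stay unchanged, and the step fitness is assumed to start from 0.

```python
import random
from collections import Counter
from typing import List, Sequence


def fitness(traits: Sequence[int]) -> int:
    """Equations (1, 2): largest n with at least n traits equal to n, else 0."""
    counts = Counter(traits)
    return max((n for n, c in counts.items() if n >= 1 and c >= n), default=0)


def individual_learning(g: Sequence[int], p: Sequence[int],
                        rng: random.Random) -> List[int]:
    """Each plastic trait becomes g_i plus a random offset from {-1, 0, 1}."""
    return [gi + rng.choice((-1, 0, 1)) if pi else gi for gi, pi in zip(g, p)]


def social_learning(g: Sequence[int], p: Sequence[int],
                    model: Sequence[int]) -> List[int]:
    """Each plastic trait moves one unit from g_i toward the model's trait.

    Assumption: if g_i already equals the model's value, it is left unchanged.
    """
    def toward(gi: int, mi: int) -> int:
        return gi + (mi > gi) - (mi < gi)

    return [toward(gi, mi) if pi else gi for gi, pi, mi in zip(g, p, model)]


def learn(g, p, s, models, rng):
    """Run len(models) learning steps and return the step fitness after each."""
    best = 0  # assumption: the running maximum starts from 0
    history = []
    for model in models:
        if rng.random() < s:  # social learning with probability s
            traits = social_learning(g, p, model)
        else:                 # individual learning with probability 1 - s
            traits = individual_learning(g, p, rng)
        best = max(best, fitness(traits))  # step fitness never decreases
        history.append(best)
    return history
```

For instance, with `g = [3, 3, 2]`, `p = [0, 0, 1]` and a model `[3, 3, 3]`, the innate fitness is 0, but a single step of social learning yields the trait set `[3, 3, 3]` and a step fitness of 3.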

**Figure 3** illustrates an example of the learning process. This individual adopted individual learning and obtained the fitness 3 at time step t. At the next step, it obtained the higher fitness 4 by imitating the phenotypes of the best individual in the previous step through social learning, which made the step fitness increase. At step t+3, this individual obtained a trait set whose fitness was 3, but its step fitness was kept at 4, as defined.

### Evolution

After completing L steps of learning, offspring grow up to become parents, and new offspring are produced by the following genetic operations.

(1) The lifetime fitness of each individual is defined as the average step fitness over all the learning steps during its lifetime. Two parents are independently selected from the population by roulette wheel (fitness proportionate) selection based on the lifetime fitness.

(2) For GI and GP, we apply a single-point crossover operation on a pair of cloned chromosomes from the parents, which produce two offspring chromosomes for GI and GP, respectively.

(3) Each value of the cloned genes g<sub>i</sub>, p<sub>i</sub>, and s from the parents is mutated with the probabilities m<sub>g</sub>, m<sub>p</sub>, and m<sub>s</sub>, respectively. A mutation occurring in g<sub>i</sub> adds +1 or −1 to the current value, and if the value exceeds its domain, the mutation is repeated until the condition is satisfied. A mutation in p<sub>i</sub> flips the current binary value. A mutation in s adds a random value from a normal distribution N(0, σ<sup>2</sup>). If the value goes lower than 0, the mutation is also repeated until the condition is satisfied.
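The three genetic operations above can be sketched as small helper functions. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the uniform fallback when every lifetime fitness is 0 is our own assumption, and `mutate_s` also redraws when the value would exceed 1, which we assume from the stated range [0, 1] of s.

```python
import random
from typing import List, Sequence, Tuple


def roulette_select(lifetime_fitness: Sequence[float], rng: random.Random) -> int:
    """Return an index chosen with probability proportional to lifetime fitness."""
    total = sum(lifetime_fitness)
    if total <= 0:  # assumption: uniform choice if all fitness values are 0
        return rng.randrange(len(lifetime_fitness))
    pick = rng.random() * total  # uniform in [0, total)
    cumulative = 0.0
    for index, value in enumerate(lifetime_fitness):
        cumulative += value
        if pick < cumulative:
            return index
    return len(lifetime_fitness) - 1  # guard against floating-point round-off


def single_point_crossover(a: Sequence[int], b: Sequence[int],
                           rng: random.Random) -> Tuple[List[int], List[int]]:
    """Swap the tails of two equal-length chromosomes at one random cut point."""
    cut = rng.randint(1, len(a) - 1)  # requires chromosomes of length >= 2
    return list(a[:cut]) + list(b[cut:]), list(b[:cut]) + list(a[cut:])


def mutate_g(value: int, low: int, high: int, rng: random.Random) -> int:
    """Add +1 or -1, redrawing until the result stays within [low, high]."""
    while True:
        candidate = value + rng.choice((-1, 1))
        if low <= candidate <= high:
            return candidate


def mutate_p(value: int) -> int:
    """Flip a binary plasticity gene."""
    return 1 - value


def mutate_s(value: float, sigma: float, rng: random.Random) -> float:
    """Add Gaussian noise N(0, sigma^2), redrawing until s stays in [0, 1]."""
    while True:
        candidate = value + rng.gauss(0.0, sigma)
        if 0.0 <= candidate <= 1.0:
            return candidate
```

In a full generation loop, the caller would draw two parents with `roulette_select`, apply `single_point_crossover` to GI and to GP, and then call each mutation helper on a gene only when a uniform random number falls below the corresponding mutation probability.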

### RESULTS

We conducted computational experiments to explore the coevolution between individual and social learning, using the parameters shown in **Table 3**. The initial population was composed of N/2 individuals whose g<sub>i</sub> were all 1 and whose p<sub>i</sub> and s were randomly determined. We assumed two cases of experiments: one in which individuals were allowed to perform individual learning only, and the other in which the proportion of social learning could evolve (as described above). Experiments were conducted 20 times for each case; the average lifetime fitness at the final generation was 6.91 in the former case and 8.98 in the latter. **Table 4** shows the breakdown of the dominant values of the fitness function in the last generation in the 20 trials for each case. The average fitness tends to be slightly smaller than the dominant values in the population due to the deviation of the distribution. Thus, if the average lifetime fitness exceeded a certain value n − 0.1, we regarded it as having reached n. In the former case, it reached 7 in all the trials, whereas in the latter case it reached values higher than 7 in all the trials and increased up to 10, which is the highest value in this model. Therefore, social learning can facilitate the adaptive evolution of the population on a rugged fitness landscape.

TABLE 3 | Default parameter values.

TABLE 4 | Breakdown of how much the average fitness increased (Total 20).

*Even if the average fitness approached a certain value, it rarely matched that value perfectly, so if the average fitness exceeded a certain value n − 0.1, we say that the average fitness had reached n.*

#### Experiments Only With Individual Learning

First, we show the details of the experiments with individual learning only. We fixed the gene s of all individuals to 0. **Figure 4** shows a result of the experiments, which indicates the typical dynamics of the evolution process in this case. In **Figure 4A**, the horizontal axis represents the generation. The green and red lines show the highest fitness and the average of the lifetime fitness, respectively. The blue line shows the average innate fitness, which represents the average fitness of the initial phenotypic values g<sub>i</sub>. In **Figure 4B**, the blue line shows the proportion of plastic phenotypes and the light blue line shows the average of the variances of gene g<sub>i</sub> in each locus. The red line shows the average plasticity contribution. We used the plasticity contribution in order to see how and when learning effectively worked. Specifically, we calculated this index as the number of learned traits (i.e., traits changed from their initial values) which contributed to the fitness (in the sense of Equations 1, 2) divided by the number of plastic traits, in the most adaptive phenotype attained by the individual (which equals the phenotype in the last learning step). **Figure 4C** represents the distribution of the innate fitness, **Figure 4D** represents the distribution of the fitness calculated through individual learning in learning steps, and **Figure 4E** represents the enlarged view of **Figure 4D**.
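One way to compute the plasticity contribution for a single individual is sketched below. It rests on an assumption that is not spelled out in the text: a trait is taken to have "contributed to the fitness" when its value equals the fitness value n of the phenotype. The function name and the example values are hypothetical.

```python
from typing import Sequence


def plasticity_contribution(g: Sequence[int], p: Sequence[int],
                            traits: Sequence[int], fitness_value: int) -> float:
    """Fraction of plastic traits that were changed by learning and contribute.

    g             : genetically determined initial trait values
    p             : plasticity genes (1 = plastic, 0 = not plastic)
    traits        : the most adaptive phenotype attained by the individual
    fitness_value : fitness of that phenotype under Equations (1, 2)
    """
    plastic = [i for i, pi in enumerate(p) if pi]
    if not plastic or fitness_value <= 0:
        return 0.0
    learned_and_contributing = sum(
        1 for i in plastic
        # Assumed meaning of "contributed": the trait equals the fitness value.
        if traits[i] != g[i] and traits[i] == fitness_value
    )
    return learned_and_contributing / len(plastic)


# Hypothetical individual: six innate 7s plus one plastic trait learned 6 -> 7.
g = [7, 7, 7, 7, 7, 7, 6, 6, 5, 5]
p = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
learned = [7, 7, 7, 7, 7, 7, 7, 6, 5, 5]
print(plasticity_contribution(g, p, learned, 7))  # one of three plastic traits
```

Averaging this value over the population in each generation would give a curve of the kind described for the red line in **Figure 4B**.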

We see from **Figure 4A** that individuals evolved through repeated occurrences of the two steps of the Baldwin effect during the first 2200 generations. (1) The highest fitness increased (discovering adaptive traits by individual learning) and the average fitness increased while the average innate fitness remained steady or decreased (agents which can learn adaptive traits increased in the population). This corresponds to the 1st step of the Baldwin effect. (2) Then, the innate fitness increased (because of the cost of learning failure, agents evolved to have adaptive traits innately). This corresponds to the 2nd step of the Baldwin effect. As a result, the average lifetime fitness increased to 7.0 by around the 2200th generation and converged to this value, meaning that the population got stuck in a local optimum of the rugged fitness landscape.

**Figure 5** represents the enlarged view of **Figure 4**. **Figure 5(1–4)** illustrate typical phases of the evolution of the population on the rugged fitness landscape, each corresponding to the duration indicated by a double-headed arrow. Each individual is represented as a pair of a circle, showing its innate fitness, and a square, showing its lifetime fitness. The circle and the square are connected with a directional arrow, representing its learning process. In general, individuals are classified into three types a, b, c. Thick circles and squares represent dominant individuals in the population. The string of numbers around each individual represents its example phenotypes. The underlined values are plastic traits.

First, in phase (1), most agents, classified as **type-a**, had the same fitness values before and after learning, meaning that they stayed on a peak of the landscape throughout their lifetime. This corresponds to around the 1000th to 1250th generation. We can see from **Figure 5C** that most agents had the fitness 6 innately, and from **Figure 5D** that few agents acquired the fitness 7 by learning. This is because they had few "7" traits and it was difficult to satisfy the condition num(7) ≥ 7 by learning.

In phase (2), individuals classified as **type-b**, who had a higher lifetime fitness than their innate fitness, increased in the population, meaning that they jumped over the valley of the fitness landscape by learning. This phase corresponds to around the 1250th to 1500th generation. These individuals had more plastic traits than **type-a** individuals and also had more "7" traits innately. As a result, they could satisfy the condition num(7) ≥ 7 by learning. We can see in **Figure 5B** that the proportion of plastic traits and the plasticity contribution increased. The increase of "7" traits in innate phenotypes decreased the probability of acquiring the fitness 6 by learning, as shown in **Figure 5D**. In **Figure 5E**, we can confirm that the proportion of fitness 7 acquired by learning increased.

In phase (3), individuals that were born in the valley of the fitness landscape but could reach the higher peak by learning increased. They are classified as **type-c**. This phase corresponds to around the 1500th to 1900th generation in the graph. It arose because individuals came to have more "7" traits innately, which increased the probability of acquiring fitness 7. As a result, they no longer satisfied the condition num (n) ≥ n innately for any n, and thus their innate fitness became 0. The proportion of innate fitness 0 increased in **Figure 5C**, and the proportion of fitness 7 acquired by learning increased in **Figure 5D**.

In phase (4), individuals located on the top of the higher peak increased. This phase corresponds to around the 1900th to 2100th generation. They satisfied the condition num (7) ≥ 7 innately and could not acquire more adaptive phenotypes by learning. Thus, they were **type-a** agents. In **Figure 5C**, the proportion of plastic traits decreased to around 0.3 because nonplastic "7" traits increased the probability of acquiring fitness 7 phenotypes by learning. We can confirm that the proportion of innate fitness 7 increased in **Figure 5C**.

After phase (4), the population converged to the top of the fitness 7 peak, which means that the evolution process returned to phase (1). The population climbed the rugged fitness landscape by repeating these four phases until around the 2200th generation. However, as seen in **Figure 5**, the evolution process then completely converged. This is because the population could not stably acquire fitness 8, as shown by the repeated temporary increases of the highest fitness in **Figure 5A**.
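The four phases can be made concrete with a small sketch. It assumes, based on the condition num (n) ≥ n quoted above, that the fitness of a phenotype is the largest n for which at least n traits take the value n, and 0 when no n qualifies (a point in a valley). The function names, the 10-trait example phenotypes, and the assumption that learning never lowers fitness are illustrative and are not taken from the authors' implementation.

```python
from collections import Counter


def fitness(phenotype):
    """Largest n such that at least n traits equal n; 0 if no n qualifies.

    This reading of the num(n) >= n condition is an assumption.
    """
    counts = Counter(phenotype)
    satisfied = [n for n, count in counts.items() if n > 0 and count >= n]
    return max(satisfied, default=0)


def classify(innate_fitness, lifetime_fitness):
    """Map a pair of fitness values to the three types used in the text."""
    if lifetime_fitness <= innate_fitness:
        return "type-a"  # stays on its peak; learning is assumed never to lower fitness
    if innate_fitness == 0:
        return "type-c"  # born in a valley, reaches a peak only by learning
    return "type-b"      # born on a peak, reaches a higher peak by learning


# Hypothetical 10-trait phenotypes for illustration only.
on_lower_peak = [6] * 6 + [7] * 4   # six "6" traits satisfy num(6) >= 6
in_valley = [6] * 4 + [7] * 6       # neither num(6) >= 6 nor num(7) >= 7 holds
on_higher_peak = [6] * 3 + [7] * 7  # seven "7" traits satisfy num(7) >= 7

print(fitness(on_lower_peak), fitness(in_valley), fitness(on_higher_peak))
print(classify(fitness(in_valley), fitness(on_higher_peak)))
```

Under this reading, gaining more "7" traits innately first pushes an individual off the fitness 6 peak into the valley (innate fitness 0) before it reaches the fitness 7 peak, which is the sequence that phases (2) to (4) describe.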

### Experiments With Individual and Social Learning

Next, we show the details of the experiments with social learning. **Figure 6** shows a typical example in which the average fitness increased to 10. This behavior was common to every experiment in which fitness increased. The representation is the same as in **Figure 4**, with some changes. In **Figure 6A**, the highest fitness is replaced by the highest fitness acquired by individual learning (green line) and that acquired by social learning (purple line). When the two take the same value, they are represented by a black line. In **Figure 6B**, the proportion of social learning is added, and in **Figure 6F**, the proportion of fitness acquired by social learning is added. In this model, the population finally reached fitness 10, the maximum value of this fitness function.

The gene s, which is the probability of social learning, evolved to high values in early generations and remained high. This is because imitating the phenotypes of the best agents was more adaptive than acquiring adaptive phenotypes by trial and error. The value of s stayed high more stably as the lifetime fitness increased. This is because, as the fitness landscape became more rugged, acquiring adaptive phenotypes by individual learning became more difficult. In addition, once such adaptive phenotypes were acquired by individual learning and then came to be maintained in the population by social learning, social learning became more adaptive than individual learning.
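The role of the gene s can be illustrated with a single learning iteration. This is a minimal sketch under stated assumptions: with probability s the agent imitates the best agent, otherwise it learns by trial and error, and in both cases only plastic traits can change (the text attributes failed imitation to a lack of plasticity). The helper name `learning_step`, the trait value range, and the one-trait-per-trial rule are hypothetical.

```python
import random


def learning_step(phenotype, plastic, best_phenotype, s, rng, trait_values=range(11)):
    """Return the phenotype after one learning iteration (hypothetical helper).

    phenotype      -- current trait values of the learner
    plastic        -- booleans, True where the trait may change by learning
    best_phenotype -- phenotype of the agent being imitated
    s              -- probability of choosing social over individual learning
    rng            -- a random.Random instance, passed in for reproducibility
    """
    updated = list(phenotype)
    plastic_idx = [i for i, is_plastic in enumerate(plastic) if is_plastic]
    if rng.random() < s:
        # Social learning: copy the model's values, but only on plastic traits.
        for i in plastic_idx:
            updated[i] = best_phenotype[i]
    elif plastic_idx:
        # Individual learning: trial and error on one randomly chosen plastic trait.
        i = rng.choice(plastic_idx)
        updated[i] = rng.choice(list(trait_values))
    return updated


rng = random.Random(0)
learner = [6] * 6 + [5] * 4
plastic = [False] * 6 + [True] * 4  # only the last four traits are plastic
best = [7] * 10
print(learning_step(learner, plastic, best, s=1.0, rng=rng))
```

In this example the learner copies "7" into its four plastic traits but keeps six nonplastic "6" traits, so it cannot reproduce the model's phenotype in full. This is one way to picture why imitation can fail when plasticity is too low.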

This adaptive evolution process was caused by complex interactions between individual learning and social learning. **Figure 6(1–4)** illustrate typical phases of the evolution of the population on the rugged fitness landscape as in **Figure 4**. A green directional arrow, which connects a circle and a square, represents a change in the fitness by social learning, and a dotted square represents the best phenotypes shared in the population through imitation from parent individuals to offspring individuals in the population.

First, in phase (1), most agents, classified as **type-c**, acquired adaptive phenotypes by social learning although they were innately non-adaptive. They had almost the same lifetime fitness as **type-a** agents, so they could coexist with them, and the genetic diversity increased.

In phase (2), because of the increased genetic diversity due to social learning, some **type-c** individuals occasionally had higher-numbered values in their innate phenotypes, meaning that they were born in the valley near the higher peak of the fitness landscape. They could find new adaptive phenotypes by individual learning, and became the best individuals to be imitated by others.

However, other **type-c** individuals often failed to imitate such new adaptive phenotypes, mainly due to the lack of plasticity, as illustrated in **Figure 6(2′)**. This made the population lose the adaptive phenotypes, and **type-a** individuals dominated the population again. Thus, the population went back to phase (1). These transitions occurred repeatedly from around the 3200th to the 6000th generation in this trial. The phases **Figure 6(1–2′)** correspond to the increase in the innate fitness 0 in **Figure 6C**, the increase in the acquired fitness 9 in **Figure 6E**, and the increase in the innate fitness 8, respectively.

On the other hand, once individuals successfully imitated the new adaptive phenotypes by social learning and these phenotypes were maintained in the population, **type-b** individuals, who could acquire such new adaptive phenotypes while keeping innate adaptive phenotypes, increased in the population, as illustrated in **Figure 6(3)**. This phase corresponds to around the 6000th to 6300th generation. The enlarged view in **Figure 6F**, which is marked by a square, shows that individuals that could imitate fitness 9 phenotypes increased slightly. This phase is similar to phase (2) in the case with individual learning only.

In phase (4), **type-c** individuals, who could acquire new adaptive phenotypes more quickly by discarding innate adaptive phenotypes, increased in the population, as in phase (3) in the case with individual learning only. This phase corresponds to around the 6300th to 7000th generation. The proportion of plastic traits and the plasticity contribution in **Figure 6B** took very high values, around 0.9, compared with those in the case with individual learning only. This means that individuals relied heavily on social learning and needed high plasticity to imitate precisely. In **Figure 6B**, the average variance of each gene increased, which shows that **type-c** individuals increased in the population.

Finally, the evolution process went back to phase (1), but the population was now located on a more adaptive peak. **Type-c** individuals appeared on the other side of the valley and dominated the population, and a few **type-a** individuals appeared. Therefore, the evolutionary process was cyclic, and individuals evolved through this process on the rugged fitness landscape.

In addition, we conducted experiments with different parameter settings and found that the basic scenario of the evolution process did not change under plausible parameter settings. We also found that some parameters affect the speed of evolution (i.e., of the fitness increase). For example, a larger number of learning iterations L increased the speed of evolution, presumably because it increases the chances of acquiring new adaptive phenotypes, and a smaller L decreased it. On the other hand, higher values of the mutation parameters m<sub>g</sub>, m<sub>p</sub>, m<sub>s</sub>, and σ generally decreased the speed of evolution. This is mainly because strong mutation prevents the population from keeping adaptive sets of genotypes, plasticity, and a high social learning rate. However, a lower m<sub>g</sub> also slowed evolution because of the smaller genetic diversity.

#### DISCUSSION

We constructed a computational evolutionary model with individuals that can learn individually or socially on a multimodal fitness landscape, which is a more realistic situation than those used in previous research. Comparing the results with only individual learning and with both types of learning, we found essential differences between the two types of learning, which can be described at a more general level as follows.

In general, learning expands the individuals' search range in phenotypic space. At the same time, it enables individuals with different genotypes to have similar phenotypes and fitness values, which means that, at the population level, learning brings genetic diversity to the population. Comparing individual and social learning, the characteristic of individual learning is the ability to find new adaptive phenotypes, which cannot be achieved by social learning. On the other hand, social learning produces the above-described effects of learning to a greater degree, especially the increase in genetic diversity, by allowing individuals to imitate, without trial and error, the adaptive phenotype already found in the population by individual learning.

Based on these differences, individual and social learning work complementarily in the course of adaptive evolution on the rugged fitness landscape, as follows. Individual learning can find new adaptive phenotypes thanks to the diversity of genetic expressions created by social learning. This is illustrated in the transition from **Figure 5(1)**, in which individuals that were born in the valleys on either side of a peak (8) reach the peak by social learning, to **Figure 5(2)**, in which an individual that was born in the valley on the right side of the peak found a new fitness peak (9) by individual learning. On the other hand, social learning can keep a new adaptive phenotype found by individual learning in the population. This is illustrated in the transition in **Figures 5(2,3)**, in which individuals on the lower peak (8) can find the higher peak (9) by social learning, thus keeping the new peak found by individual learning in the population. However, if every attempt at social learning is unsuccessful because non-plastic traits keep different values, the peak found by individual learning is lost and the population moves back to **Figure(1)** via **Figure(2′)**.

We have described how individual and social learning interact with each other and how this interaction enables individuals to find adaptive phenotypes on a rugged fitness landscape with valleys that cannot be crossed by individual learning alone. In recent years, theoretical and empirical research to predict and explain social learning strategies of humans and other animals has been conducted [16]. One promising direction would be to introduce several typical social learning strategies into the model and investigate the effect of the interaction between the strategies on the evolutionary scenario of cooperation. Another future direction would be to consider network structures of social interactions so as to make the model more realistic in terms of the "whom" aspect of social learning.

#### AUTHOR CONTRIBUTIONS

MH, RS, and TA designed the experimental procedures, conducted experiments, analyzed the results, and wrote the manuscript.

#### FUNDING

This work was supported by MEXT/JSPS KAKENHI Grant Number JP17H06383 in #4903 (Evolinguistics), JP15K00335, JP15K00304, and JP18K11467.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Higashi, Suzuki and Arita. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Agent-Based Self-Service Technology Adoption Model for Air-Travelers: Exploring Best Operational Practices

#### Keiichi Ueda\* and Setsuya Kurahashi

*Graduate School of Systems Management, University of Tsukuba, Tokyo, Japan*

The continuous development of the service economy and an aging society with fewer children is expected to lead to a shortage of workers in the near future. In addition, the growth of the service economy would require service providers to meet various service requirements. In this regard, self-service technology (SST) is a promising alternative to securing labor in both developed and emerging countries. SST is expected to coordinate the controllable productive properties in order to optimize resources and minimize consumer stress. As services are characterized by simultaneity and inseparability, a smoother operation in cooperation with the consumer is required to provide a certain level of service. This study focuses on passenger handling in an airport departure lobby with the objective of optimizing multiple service resources comprising interpersonal service staff and self-service kiosks. Our aim is to elucidate the passenger decision-making mechanism of choosing either interpersonal service or self-service as the check-in option, and to apply it to analyze several scenarios to determine the best practice. The experimental space is studied and an agent-based model is proposed to analyze the operational efficiency via a simulation. We expand on a previous SST adoption model, which is enhanced by introducing the concept of individual traits. We focus on the decision-making of individuals who are neutral toward the service option, by tracking the actual activity of passengers and mapping their behavior into the model. A new method of validation that follows a different approach is proposed to ensure that this model approximates real-world situations. A scenario analysis is then carried out with the aim of exploring the best operational practice to minimize the stress experienced by the air travelers and to meet the business needs of the airline managers at the airport. 
We collected actual data from the Departure Control System of an airline to map the real-world data to the proposed model. Passenger behavior was extracted by front-line service experts and clarified through consecutive on-site observations.

Keywords: ABM, airport, airline, self-service technology, fuzzy, scenario analysis, simulation, multi-agent simulation

#### Edited by:

*Isamu Okada, Soka University, Japan*

#### Reviewed by:

*Nicola Lettieri, Istituto Nazionale per l'analisi delle Politiche Pubbliche (INAPP), Italy Allard C. R. Van Riel, Radboud University Nijmegen, Netherlands*

> \*Correspondence: *Keiichi Ueda s1645001@u.tsukuba.ac.jp*

#### Specialty section:

*This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics*

Received: *30 September 2017* Accepted: *22 January 2018* Published: *23 February 2018*

#### Citation:

*Ueda K and Kurahashi S (2018) Agent-Based Self-Service Technology Adoption Model for Air-Travelers: Exploring Best Operational Practices. Front. Phys. 6:5. doi: 10.3389/fphy.2018.00005*

## 1. INTRODUCTION

### 1.1. Background

The service economy continues to grow globally. Both developed and emerging countries are expected to face difficulties in securing workers in the future. Developed countries need to address this progressive and imminent issue in their aging societies. In these countries, the working population enjoys improved health care and has fewer children; thus, industries are required to secure their workforces in new ways. Clearly, we need some mechanism to offer enhanced service and improved interactions with consumers. Self-service technology (SST) is a promising alternative for fulfilling future customer service requirements. However, unless SST is recognized and accepted by the customers, the implementation of SST is unlikely to be successful, resulting in neither customers nor firms enjoying the benefits of service investment.

This study focuses on self-service kiosks at the airport, as these are a familiar alternative for air travelers. In general, service is characterized by simultaneity and inseparability, which, in our situation, means that service agents and passengers need to work together to achieve a common goal in each of the processes that constitute air travel. In particular, because the check-in process is a critical starting point for improving the travel experience, smoother operation in cooperation with the passenger is required to provide a certain level of service. It is essential for both airlines and passengers to utilize the SST at the airport because it reduces stress and waiting time. Time and suitable opportunities are required to identify the best practice for fully utilizing the resources, including SST. Conducting a trial is a common strategy to evaluate how a new policy functions, and the performance of such a policy is challenging to verify under new circumstances. However, it is difficult for on-site managers to understand statistical results and analyses and to apply the findings to a service operation in practice. Further, because an operational trial involves a certain degree of risk in that it may adversely affect the service quality, local managers would prefer to avoid sacrificing the customer's experience as much as possible rather than conduct operational trials in the field. An experimental space helps managers to understand the circumstances of a new handling policy without conducting actual trial-and-error trials, provided that it stably reproduces real-world phenomena.

### 1.2. Purpose of this Study

This study investigates the way in which coordination and cooperation take place in the airport departure lobby. The optimization of current resources is key in achieving the same goal for both passengers and the airline. We investigate how effectively both SST and service staff engaging in interpersonal service can achieve a harmonized performance.

To achieve this, two steps are undertaken to uncover the fundamentals of smoother passenger handling operation: constructing the experimental space and conducting scenario analysis by simulation. The first step consists of proposing an experimental space onto which the real-world situation is mapped. In this regard, we utilize agent-based modeling (ABM) to emulate the departure lobby of an airport with existing data collected from an airline system. We focus on the decision-making process of a passenger who is neutral toward SST use. Because it is important to know what causes an individual to opt for SST when they have not yet determined an attitude toward it, we develop an SST adoption model by introducing an aggregate analysis of passenger records from an airline system. We also propose a methodology to validate the proposed ABM in order to clarify that the core mechanism is robust and that the designed concept functions as intended. Then, we carefully construct several scenarios to determine the best combination of SST and service staff to ensure a certain level of quality.

This research is based on computer simulation experiments. This study was carried out in accordance with the guideline of "Researcher's ethics" of the University of Tsukuba. This study complies with Regulation for Faculty of Business Science Research Ethics Committee. An ethics approval was not required as per institutional and national guidelines and regulations.

## 2. RELATED WORK AND FINDINGS

SST has been examined from various perspectives. First, as SST adoption is an individual decision to adopt a new method, we review studies pertaining to the adoption and diffusion of innovation. Subsequently, an overview of the field of services marketing is given to provide the necessary background against which to understand the development of SST studies. Then we review ABM as a tool to elucidate the dynamics of the phenomenon of SST adoption. We specifically consider the ABM SST adoption model for reviewing the progress of current findings and research problems.

### 2.1. Innovation Diffusion

Rogers [1] defined innovation as the introduction of something new: a new idea, method, or device. The OECD Oslo Manual defines four types of innovation: product innovation, process innovation, marketing innovation, and organizational innovation<sup>1</sup>. However, innovation is often also viewed as the application of improved solutions to meet new requirements, unarticulated needs, or existing market needs. This is accomplished by using more effective products, processes, services, technologies, or business models that are readily available in markets, governments, and society. The existence of designated variables to define the speed of diffusion is well-known. Greater relative advantage, higher compatibility, less complexity, higher trialability, and greater observability are known to accelerate the diffusion of innovation. The change agent has also been claimed to promote innovation and to play an important role in increasing the speed of diffusion [1]. The variables responsible for enhancing the diffusion speed indicate what we should be looking at in this study because SST adoption is an individual decision to accept innovation.

### 2.2. Service-Marketing Framework

The study of SST can be traced back to the study of convenience. Berry et al. [2] examined and discussed convenience from two main perspectives: (1) wait time and its management and (2) what consumers find convenient. Davis [3] developed a "technology acceptance model (TAM)," which is specifically meant to explain computer usage behavior. The developed methodology emphasizes the necessity of evaluating proposed new systems prior to their implementation. The author concluded that perceived usefulness and ease of use create favorable attitudes toward SST. Davis [4] empirically examined the ability of TAM to predict and explain user acceptance and rejection of computer-based technology.

<sup>1</sup>OECD Innovation strategy Defining innovation, https://www.oecd.org/site/innovationstrategy/defininginnovation.htm.

Bitner et al. [5] explored the changing nature of service, with an emphasis on ways in which encounters can be improved through the effective use of technology. They focused on the benefits of thoughtfully managed and effectively implemented technology applications [5]. In the same year, self-service technologies were described as technological interfaces that enable customers to produce a service without a service employee's involvement [6]. The use of SSTs can drive up productivity and efficiency [7–9], and additionally, reduce and avoid high labor cost.

Through various means including surveys, interviews, and questionnaires, a large number of studies have found factors that influence the usage of SST. Meuter et al. [6] concluded that service convenience through SST resulted in consumer satisfaction when it was "better than the alternatives" and they appreciated "time saving" the most. It was also claimed that SST usage depends on customer readiness for SST [10].

Liljander et al. [8] reviewed SST adoption from the perspective of consumer readiness. Meuter et al. [9] explored usage patterns and the benefits of using SSTs and their findings indicate that "technology anxiety" is a superior, more consistent predictor of SST usage than demographic variables.

Dabholkar and Bagozzi [11] extended the attitudinal model of technology-based self-service (TBSS) and proposed that the moderating variables affect the attitude toward SST and intention to use SST. Their extended framework of TBSS was well supported and captured a variety of consumer traits and situational factors.

### 2.3. ABM

Agent-based models, also known as multi-agent systems or agent-based simulations, are (computational) models of a heterogeneous population of agents and their interactions (CoMSES)<sup>2</sup>.

Technical instruments enable each agent to behave autonomously. By locating players in the experimental space and approximating it to the real world, ABM is developed and its effectiveness enhanced. A social multi-agent system represents phenomena of complex social systems [12]. Kawai [13, 14] utilized ABM and demonstrated the diffusion of new products and services using an ABM abstract model. These studies illustrate important facts and concepts for the diffusion of innovation. However, they merely illustrated the concept, and neglected to describe the reproducible mechanism of decision-making for choosing several options.

Stylized facts are empirical regularities in search of theoretical, causal explanations, such as statistical features. In the financial markets, there are a few stylized facts, such as volatility clustering and the power law decay of the tails of the return distribution [15]. As Watts and Gilbert introduced ABM literature by mentioning stylized facts, it is common to evaluate the simulation results of various fields using stylized facts [16]. Grimm et al. [17] propose a general framework for designing, testing, and analyzing bottom-up models, which is pattern-oriented modeling. The framework attempts to enhance the rigor and comprehensiveness of bottom-up modeling by explaining observed patterns. It is claimed that patterns are the defining characteristics of a system and indicators of the essential underlying processes and structures.

### 2.4. SST Study by Utilizing ABM

In this section, we briefly review, step by step, the SST adoption model built with ABM. The building stages of the proposed ABM are reviewed to illustrate the features of passenger handling at the airport and the basic idea of the decision-making mechanism for using SST.

#### 2.4.1. Concept of SST Adoption Model

Ueda and Kurahashi [18] created an ABM to demonstrate how air travelers choose SST at the airport. The experimental space emulates an existing airport departure lobby (**Figure 1**) and has three check-in options: interpersonal check-in service, baggage drop, and self-service kiosk.

The default check-in option is the interpersonal check-in service. Travelers who use the self-service kiosk can check in their baggage at the baggage drop position. The baggage drop positions are also utilized as check-in positions when there is no-one waiting in front of them.

The ABM creates passenger agents at the same volume and with the same arrival timing as in the system log collected from the airline, and assigns them certain properties accordingly. The ratios of passenger demographics and travel conditions, such as checked baggage, are also copied into the model. Each passenger agent is given a variable that represents its "hesitation status" toward using SST. The passenger agent reduces its "hesitation value" when a service agent interacts with it, because it is commonly observed that passengers are encouraged to use SST on learning that the self-service kiosk exists and functions and that it may reduce their waiting time. According to the various records (system log, on-site observation, etc.), the numbers of active check-in counters, self-service kiosks, and lobby service agents are also implemented as they were in practice (**Table 1**).

**Figure 4** shows the volume and variables of the data collected mainly from the airline system.

#### 2.4.2. Decision-Making Mechanism for SST Use

**Figure 2** illustrates the core mechanism of the proposed model. The passenger agent is created randomly or stochastically with certain traits. Each agent moves toward check-in options in the lobby and decides its direction autonomously by collecting surrounding information.

The perceived waiting time and visibility of SST are key inputs for fuzzy inference methodology to determine the direction of each agent. Based on the knowledge of service experts, the "Self-service Preference Index (SPI)" is calculated by following the simple fuzzy rule and membership function (**Figure 3**).

If SPI is positive, the agent moves toward SST. If SPI is negative, the agent moves toward the conventional check-in counter. Sixteen days of on-site observation clarified that passengers' actual behavior followed the defined rule almost consistently.

<sup>2</sup>OpenABM FAQ: What are agent-based models (ABMs)? https://www.openabm.org/faq-page#t780n3730

**Figure 3** illustrates that the fuzzy system calculates the SPI score by using the max-min inference method and the simplified centroid method.
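The SPI calculation can be sketched as follows. Figure 3 is not reproduced here, so the membership functions, their ranges, the rule table, and the output singletons below are placeholders chosen only to show the mechanics: each rule fires with the minimum of its antecedent memberships, rules sharing an output are aggregated with the maximum, and the crisp SPI is the firing-strength-weighted average of the output singletons (the simplified centroid). None of the numbers are the authors' values.

```python
def ramp_up(x, lo, hi):
    """Linear membership rising from 0 at lo to 1 at hi, clamped outside."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def memberships(x, lo, hi):
    """Two complementary fuzzy sets, 'low' and 'high' (placeholder shapes)."""
    up = ramp_up(x, lo, hi)
    return {"low": 1.0 - up, "high": up}


# Hypothetical rule table: (label for W, label for V, output singleton for SPI).
RULES = [
    ("high", "high", 1.0),   # counter wait much longer and SST clearly visible
    ("high", "low", 0.0),
    ("low", "high", 0.0),
    ("low", "low", -1.0),    # no waiting advantage and SST hardly visible
]


def spi(eqt, visibility):
    """Self-service Preference Index via max-min inference and simplified centroid."""
    w = memberships(eqt, -5.0, 5.0)        # assumed input range for W
    v = memberships(visibility, 0.0, 1.0)  # assumed input range for V
    strength = {}
    for w_label, v_label, output in RULES:
        fire = min(w[w_label], v[v_label])                        # min for AND
        strength[output] = max(strength.get(output, 0.0), fire)   # max across rules
    total = sum(strength.values())
    if total == 0.0:
        return 0.0
    return sum(output * s for output, s in strength.items()) / total


def choose_direction(spi_value):
    """Positive SPI sends the agent to SST, negative to the conventional counter."""
    if spi_value > 0:
        return "SST"
    if spi_value < 0:
        return "counter"
    return "undecided"  # the text does not say what happens at exactly zero


print(choose_direction(spi(4.0, 0.9)), choose_direction(spi(-4.0, 0.1)))
```

Using singleton outputs keeps defuzzification to a single weighted average, which is what makes the centroid step "simplified" compared with integrating over clipped output sets.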

Equation (1) defines the calculation of EQT, which is the input value used to find the membership score of variable W. A passenger agent moves toward the interpersonal check-in service by default. However, in the decision-making zone, it estimates and compares the waiting times of the two check-in options.

EQT is defined as the predicted difference between the waiting time at the interpersonal check-in and the waiting time for using SST. It has weighting parameters for check-in preference ("p1" and "p2"). If p1 and p2 have the same value, the preference for the two options is the same; however, few passengers prefer SST.

$$EQT = \left(\frac{NCP}{CCPs}\right) \times p1 - \left(\frac{NSSQ}{SSUs}\right) \times p2\tag{1}$$

The variable V reflects how the passenger perceives SST. The number of passenger agents in front of SSTs is the input value of V. When there is no agent using SST, V is low; when there are more agents using SST, V becomes higher. If the number of passengers in front of SSTs exceeds the number of SSTs, V decreases, because passengers occupy the self-service area and the visibility of the SST decreases significantly.

This model places the customer-service staff in the experimental space because the front-line staff's experience and the findings of previous work indicate that passengers respond positively to using SST under the guidance of the airport staff [7, 19]. Similar to the different preferences of actual individuals, each created agent is assigned a different attitude toward using the SST. The ABM assigns each passenger agent a random "hesitation" value from 0 to 20. If the "hesitation" value is high, the chance of using the SST is lower. If the passenger agent reduces its "hesitation" value by contacting the customer service agent, there is a greater chance of using the SST.

TABLE 1 | Experimental dataset.

*IPSC, Interpersonal Service (conventional); SSC, Self-service; SST, Self-service Kiosk; IPC, Conventional Check-in; BagDrop, Baggage check-in (interpersonal); CSR, Customer Service agent who guides passenger to SST;* \**dataset412, Training dataset.*
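Equation (1) and the qualitative description of V can be written down as a short sketch. The symbols in Equation (1) are not spelled out in this passage; the code reads NCP as the number of passengers waiting for the interpersonal counters, CCPs as the number of open counter positions, NSSQ as the number of passengers queuing for self-service, and SSUs as the number of self-service units. These readings are assumptions, and the piecewise-linear shape of `visibility` is a hypothetical stand-in for the rule that V rises with the number of users and falls once the kiosks are crowded.

```python
def eqt(ncp, ccps, nssq, ssus, p1=1.0, p2=1.0):
    """Equation (1): weighted queue load per counter minus weighted load per kiosk.

    The meaning attached to each argument is an assumed reading of the symbols.
    """
    if ccps <= 0 or ssus <= 0:
        raise ValueError("at least one counter position and one kiosk are required")
    return (ncp / ccps) * p1 - (nssq / ssus) * p2


def visibility(n_at_sst, ssus):
    """Hypothetical V in [0, 1]: rises until every kiosk is in use, then falls."""
    if ssus <= 0:
        raise ValueError("at least one kiosk is required")
    load = n_at_sst / ssus
    if load <= 1.0:
        return load               # more visible users make the SST more noticeable
    return max(0.0, 2.0 - load)   # crowding hides the kiosks again


# Illustrative numbers only: 12 passengers at 4 counters, 2 passengers at 2 kiosks.
print(eqt(ncp=12, ccps=4, nssq=2, ssus=2))
print(visibility(n_at_sst=2, ssus=4), visibility(n_at_sst=6, ssus=4))
```

A positive EQT in this sketch means the interpersonal queue is relatively longer, which pushes the fuzzy input W toward favoring SST; raising p1 relative to p2 exaggerates the perceived wait at the counter.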


#### 2.4.3. Model Development

The replicated SST adoption model [22] introduces the concept of the attitudinal model of [11], which claims that several moderating variables, such as consumer traits and situational factors, directly influence the attitude toward and the intention of using the SST.

An aggregate analysis of the airline records (DatasetB: **Figure 4**) supports the idea that situational factors and passenger traits influence the attitudes of individuals toward SST. The regression analysis demonstrates the significance of explanatory variables representing travel conditions such as the volume of held baggage and busyness of passenger handling.

The variable "recent use of self-service kiosk" indicates the service that is chosen the most, whereas the other variable "flight frequency of passenger" occupies the second position. These findings are reflected by modifying the SST adoption model to assign each produced passenger agent an individual trait: Non-SST user 35%, Strong SST user 14.8%. The ABM assigns each agent a random non-negative number up to 100. Depending on the score the agent holds, the trait category of passenger agent is defined stochastically.
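The stochastic trait assignment described above can be sketched as a threshold lookup on the random score. The two shares (35% and 14.8%) come from the text; the label `neutral` for the remaining agents, the order of the thresholds, and the function names are assumptions made for illustration.

```python
import random

NON_SST_SHARE = 35.0      # percent of agents that never choose SST (from the text)
STRONG_SST_SHARE = 14.8   # percent of agents that strongly prefer SST (from the text)


def assign_trait(score):
    """Map a score in [0, 100] to a trait category (threshold order is assumed)."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must lie between 0 and 100")
    if score < NON_SST_SHARE:
        return "non_sst_user"
    if score < NON_SST_SHARE + STRONG_SST_SHARE:
        return "strong_sst_user"
    return "neutral"  # assumed label for all remaining agents


def sample_traits(n_agents, seed=0):
    """Draw one uniform score per agent and classify it."""
    rng = random.Random(seed)
    return [assign_trait(rng.uniform(0.0, 100.0)) for _ in range(n_agents)]


traits = sample_traits(10_000)
for label in ("non_sst_user", "strong_sst_user", "neutral"):
    print(label, traits.count(label) / len(traits))
```

With a uniform score, the long-run share of each category equals the width of its interval, so the simulated population approaches the stated proportions as the number of agents grows.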

#### 2.4.4. Simulation Result

Verification and validation processes are carried out carefully using six datasets: one dataset (dataset412) is used for training and the remaining datasets are used for validation.

Various parameters, including the baggage holder rate (0.7), and the processing times for the different service options (interpersonal service, self-service, and baggage check-in), are set to represent real-world conditions. After fitting the two parameters (p1 and Speedmax: maximum moving speed of passenger agent) by calibration, we conducted experiments using the other datasets with different circumstances. In each experiment, the number of check-in counters and staff is mapped according to the actual situations on those days. Experiments were conducted by using 50 runs each for the five test datasets, which differ completely in terms of the timing of passenger arrival. In these experiments, the self-service usage rate, the quotient of passengers using self-service divided by all passengers, is observed.

The results of the simulation for each model are provided in **Table 2**. The simulation result is sufficiently close to the real-world situation and is persuasive for modeling actual passenger handling at the airport. **Table 2** indicates that in the replicated 2017 model, the RMSE (Root Mean Squared Error) for the self-service usage rate versus the real data is 0.039, which is higher than that in the 2014 model. However, the core of the SST adoption model supports the concept developed by previous services marketing literature and innovation studies, namely the effect of technology readiness and anxiety, and moderating variables with fuzzy inference systems are used to demonstrate the dynamic mechanism of SST adoption. In addition, the result of experiments finds that calibration results of two parameters are different: the speed of the passenger agent (Speedmax) is linearly related to the self-service usage rate, whereas the interpersonal preference of an individual (p1) is non-linearly related to the self-service usage rate.
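The two quantities compared in this validation, the self-service usage rate and its RMSE against the recorded data, can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and it assumes that the RMSE is taken over one simulated and one observed usage rate per test dataset, which the text does not state explicitly.

```python
import math


def usage_rate(self_service_passengers, all_passengers):
    """Self-service usage rate: passengers using self-service / all passengers."""
    if all_passengers <= 0:
        raise ValueError("all_passengers must be positive")
    return self_service_passengers / all_passengers


def rmse(simulated, observed):
    """Root mean squared error between paired simulated and observed rates.

    Assumption: one pair of rates per test dataset.
    """
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower RMSE means that the simulated usage rates track the recorded ones more closely across the datasets.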

#### 2.5. Topics of Related Work

Previous innovation diffusion studies typically describe the introduction of a new method and introduce the variables that determine the rate of adoption. The services marketing literature explores and specifies factors that promote the use of technology-based self-service (TBSS). Dabholkar and Bagozzi [11] expanded the concept of a technology acceptance model to explicate that situational factors and consumer traits have a direct effect on promoting a positive attitude toward TBSS and the intention to use TBSS. They conducted an on-site survey at a fast food restaurant to examine the behavior of consumers. However, as Dabholkar noted, the results would change if one of the situational factors were to change. Those studies are based on statistical methods, and this means that the analysis is static rather than dynamic. Inductive approaches based on statistical models are often inadequate for elucidating "complicated consumer behavior mechanisms" and "complex phenomena occurrence mechanisms" [20].

TABLE 2 | Experimental results of the SST adoption models of 2014 and 2017.

Applying the TAM proposed by Davis [4] in user acceptance testing would involve demonstrating system prototypes to potential users and measuring their motivation to use the alternative systems. Although "consumer behavior experiments can provide theoretical insight on consumer decision making and response to marketing measures, it is practically difficult to experiment with a large number of subjects and to examine the complicated interaction among consumers [20]." Studies in the services marketing field have not determined the mechanism by which predictable results can be reliably reproduced.

Kawai proposed the diffusion of new products and services utilizing ABM [13, 14]. Although his model demonstrated the phenomenon of diffusion, the model does not use data observed in the real world. Therefore, the model succeeds in illustrating the concepts but it does not represent the actual phenomenon of diffusion or reveal consumers' motivations for selecting a new alternative.

The SST adoption model [18, 22] was used to conduct simulation experiments with actual data collected from an airline and approximated the experimental space to the real world (**Table 2**). These models were validated by comparing the self-service usage rate of the simulation results against the actual recorded activity, under different circumstances and various patterns of passenger volume and arrival timing (**Table 1**). The results clarified that the statistical features of mass behavior are similar. However, it is necessary to find other stylized facts shared by the two spaces, which can render more credibility to the proposed SST adoption model.

The above-mentioned previous results suggest that a model capable of addressing the concerns of on-site managers would need to stably reproduce the situation and operations of the departure lobby. The service operations consist of coordination and cooperation of service resources and interactions among passengers, which ABM is inherently capable of representing.

#### 3. ENHANCING SST ADOPTION ABM

The results of experiments with the SST adoption model [18, 22] closely reproduce the real-world situation and indicate that there is room for improvement. In section 3.1, we focus on individuals who have not determined their attitude toward SST and their decision-making, by using a detailed analysis of the data from the records of the airline system. We clarify the effectiveness of this proposed agent-based model by utilizing logistic regression analysis. At the end of this section, we discuss the fact that the experimental space creates stylized facts similar to those in the real world.

### 3.1. Narrowing the Data Scope

The experimental results support the idea that individuals' choices depend on situations, as claimed by Dabholkar. However, we reconsider the scope of the data because the results of the 2017 model, in which traits are given stochastically, are not as close to the actual recorded activity as the results of the 2014 model, which assigns random traits to passenger agents.

The examined data for the replicated SST adoption model [18] was collected during five days of operation time and covers both quiet and busy times. As the experiment tries to explicate the peak hours of the real world, we extract data from the peak hours of operation (7:00–8:15) from DatasetB (**Figure 4**).

We examine data from a total of 4,440 passengers and divide these data into 36 segments using two variables that explicate the usage of the self-service kiosk the most according to multiple regression analysis (**Table 3**) [18]. The flight frequency is divided into 6 categories; FF-0 classified passengers have no flight record during the last 24 months from the departure date. Passengers who fly more frequently are divided into higher classes. Recent SST usage is also divided in the same manner: passengers classified as SR-0 have no record of using SST within the past 2 years, and SR-5 passengers have used SST recently. **Table 3** lists the SST usage rate in each passenger segment. We classified passengers into three categories: Weak-SST-user, Neutral-SST-user, and Frequent-SST-user by investigating the SST usage rate of each segment. In this particular dataset, 15.9% were Weak-SST-users and 6.4% were Frequent-SST-users with SST usage rates of 22 and 78%, respectively.

TABLE 3 | Distribution of SST usage rate.

*FF-0, No flight record; FF-5, more flight record.*

*SR-0, No SST usage record; SR-5, Using SST most recently.*

*Weak-SST-user use SST, usage\_rate < 0.37; Frequent\_SST\_user use SST, usage\_rate > 0.68.*

*Frequent\_SST\_user* = *6.4% of all users; 78% of Frequent\_SST\_user selected SST. Neutral\_SST\_user* = *77.7% of all users.*

### 3.2. Mapping the Stepwise Decision-Making of Service Selection

The replicated adoption model based on ABM is modified such that each agent is generated with a categorized trait toward SST according to the result shown in section 3.1. This takes into consideration that actual passengers with certain traits decide whether to use SST after arriving in the departure lobby. This sequence of step-wise decision-making is introduced into the proposed ABM. Further, a Weak-SST-user is stochastically allocated a 12.5% intention of using interpersonal service, because 78% of Weak-SST-users select not to use SST in this particular dataset. In the same way, a Frequent-SST-user is assigned a stochastic SST-using intention of 5%, because 78% of them ultimately used SST. The remaining agents move in the experimental space without antecedent conditions. They perceive the situation and make decisions according to the behavior rules. We would like to discern the behavior of those passengers who have not made up their mind. In the next section, we focus on validating how those who have a neutral trait toward SST make their decisions.
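The generation of agents with a categorized trait can be sketched as below, using the shares reported in section 3.1 (15.9% Weak-SST-users, 6.4% Frequent-SST-users, 77.7% Neutral-SST-users). This is an illustrative sketch rather than the implementation used in the study: the function names, the ordering of the categories, and the use of a uniform score between 0 and 100 are assumptions, and the subsequent intention values are deliberately left out.

```python
import random

# Trait shares in percent, taken from section 3.1; the ordering is an assumption.
TRAIT_SHARES = [
    ("Weak-SST-user", 15.9),
    ("Frequent-SST-user", 6.4),
    ("Neutral-SST-user", 77.7),
]


def assign_trait(score):
    """Map a score in [0, 100] to a trait category via cumulative shares."""
    cumulative = 0.0
    for trait, share in TRAIT_SHARES:
        cumulative += share
        if score < cumulative:
            return trait
    # Scores at the upper boundary fall into the last category.
    return TRAIT_SHARES[-1][0]


def generate_traits(n_agents, seed=0):
    """Draw one uniform score per agent and convert it to a trait category."""
    rng = random.Random(seed)
    return [assign_trait(rng.uniform(0, 100)) for _ in range(n_agents)]
```

With a large number of agents, the generated proportions should approach the shares listed in `TRAIT_SHARES`.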

### 3.3. Validating ABM from a Different Perspective

It has long been known that a single pattern observed at a specific scale and hierarchical level in a complex system is not sufficient for reducing the uncertainty in the model structure and parameters [17]. In this section, we illustrate that a phenomenon not incorporated in the model has emerged from ABM. Its occurring pattern is statistically similar to the pattern in the real world, which is another stylized fact of the proposed model. The outline of the experiment is illustrated in section 3.3.1. The results of the experiment are shown in section 3.3.2, in order to discuss what they indicate. We summarize the experiments and discussion to evaluate the proposed model in section 3.3.3.

#### 3.3.1. Outline of ABM Validation

According to the experimental result in section 2.4.4, we can assume that the proposed model closely approximates the real-world situation because the SST usage rate of various simulation results is approximately equal to the actual recorded activity. In section 3.2, deeper insight into passenger traits and decision-making processes enhances the model. We validate the core of the decision-making mechanism by examining the simulation results of the enhanced model.

Thirty simulation runs with one of the datasets are conducted to accumulate the activity record; 2,700 passenger agents are generated for the experimental space. Dataset412 is chosen because on that day, we observed that lobby service staff had not interacted with or guided passengers much and little positive feedback was received from passengers. Equivalent amounts of records representing (1) the actual passengers and (2) the generated passenger agents are randomly selected from both the real world and ABM experiments to form six datasets. These datasets are analyzed by logistic regression, and we examine the ability of the model to accurately predict the usage or non-usage of SST (that is the objective variable). We focus on the passengers and agents with neutral traits because we aim to identify the decision-making mechanism of those who have not determined their choice.

We compare and discuss the accuracy of the predictions for the two groups by using the same explanatory variables (**Figure 5**).

#### 3.3.2. Experimental Results and Discussion

The ABM experimental space emulates the real, existing world; however, it is difficult to cover all the necessary data to represent complex real-world scenarios. The airline stores its vast volume of activity records in their data warehouse. However, it does not cover all passenger activities. The question as to what types of information we need in order to represent the real world always remains. We select three explanatory variables (travel condition, individual traits, and operational busyness) to conduct the logistic regression analysis for predicting SST use. These variables were selected to help explain the objective variable: SST use or not (**Table 4**).

A total of 1,200 records are selected randomly to form three datasets containing the real-world data. Each dataset contains an equal number of two different passenger groups: those using and those not using SST. The results of the logistic regression analysis of each experiment are listed in **Table 5**.

It displays the correct rate of SST usage prediction (Equation 2) of neutral trait groups for each experiment. We can see that the experimental results of real-world data (0.549) are close to those of the ABM data (0.537). This result, a slight gap in prediction accuracy by logistic regression, implies that the two different spaces hold data that have the same degree of difficulty explaining the complex world.

$$\text{Correct\_rate} = \frac{\text{passenger\_with\_correct\_predict}}{\text{All\_passenger}} \tag{2}$$
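The balanced sampling and the correct rate of Equation 2 can be sketched as follows. The record layout (a dictionary with a `used_sst` flag) and both function names are hypothetical, and the logistic regression itself is not reproduced here; the sketch only shows how a set of predictions would be scored.

```python
import random


def balanced_sample(records, n_per_group, seed=0):
    """Randomly pick an equal number of SST users and non-users."""
    rng = random.Random(seed)
    users = [r for r in records if r["used_sst"]]
    non_users = [r for r in records if not r["used_sst"]]
    return rng.sample(users, n_per_group) + rng.sample(non_users, n_per_group)


def correct_rate(predicted, actual):
    """Equation 2: passengers with a correct prediction / all passengers."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)
```

Because each sample holds equal numbers of users and non-users, a predictor that always guesses one class would score 0.5, which gives a reference point for the reported rates of 0.549 and 0.537.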

#### 3.3.3. Summary of the Validation

We focus on passengers who have not decided whether to use the SST to validate the extent to which the proposed ABM approximates the real world and to assess the robustness of its core mechanism. Passenger traits are examined with the narrowed range of data, namely the busy peak time. The model is thereby enhanced because it provides deeper insight into passenger trait analysis as a result of stepwise decision-making, as implemented in section 3.2.

In addition to the SST usage rate, focusing on the SST usage prediction accuracy, we observe that the experimental space contains variables to explain it to the same degree as the real world. This means that the logistic regression analysis outcomes of the real world and ABM both indicate that each space has almost the same complexity to predict. Through our experiments, we observe multiple patterns in the real system at different hierarchical levels; the SST usage rate is a statistical feature of the group behavior, and the prediction accuracy is another statistical feature of individual behavior. With two dimensions of observable simplified presentation of empirical findings (in other words, the stylized facts), it can be presumed that the ABM experimental space approximates the real world.

*# Common variable, ## ABM-generated variable.*

TABLE 5 | Summary of logistic regression analysis of experimental results.

If a model is overly complex, the analysis of its results is likely to be cumbersome and likely to be complicated by details. Conversely, an over-simplified model would neglect the essential mechanisms of the real system, thus limiting its potential to provide an understanding of and testable predictions regarding the problem it addresses. Thus, we need a method that would optimize the model complexity [17].

This study demonstrates a method suitable for extracting essential principles and minimal information from real-world situations to represent existing phenomena. It also implies that the knowledge of front-line experts is helpful in constructing an ABM.

#### 4. SCENARIO ANALYSIS

In this section, we conduct scenario experiments using the proposed model and discuss cooperation and coordination in the passenger service operation. Section 4.1 explains the series of scenarios. The experimental results are presented in section 4.2; the results are analyzed, discussed, and evaluated in section 4.3, followed by a summary of the scenario analysis in section 4.4.

#### 4.1. Experimental Scenarios

We conducted experiments with different scenarios based on the proposed model (2014 model) and discuss cooperation and coordination in passenger service operations. The main purpose of the scenario analysis is to examine the extent to which the coordination of service resources is effectively managed and to determine the approach the service provider could follow to cooperate with service recipients.

The scenario analysis involves examining the effect of (1) increasing and decreasing the quantity of service functions and (2) replacing the role of service staff. We discuss the simulation results in terms of the business needs of airport managers, including the cost effectiveness of current staff, increasing the number of future SST users, and moderating the impact of customer service. **Table 6** displays the series of scenarios and a reference case as the benchmark.

(1) The impact of reducing the number of service staff is examined in the following cases.

• Scenario 3: Reducing 1 BD and 1 IPC.

#### TABLE 6 | Results of experimental scenario.

*IPC, Interpersonal check-in counter; BD, Baggage check-in counter. If there is no passenger waiting for baggage check-in, a passenger waiting for check-in at IPC can be pulled in. CSR, Customer Service Representative (lobby service agent) who guides and supports passengers.*

*Usage rate, Self-service usage rate.*

*WP, The number of waiting passenger agents (peak).*

*Signif. codes: 0 "\*\*\*" 0.001, "\*\*" 0.01, "\*" 0.05, "." 0.1, "\_" 1.*

(2) We examine the effect of replacing the position and role of service agents in the following scenarios.


#### 4.2. Results for Experimental Scenario

Each scenario is simulated 50 times, and the average values of the following items are observed: the self-service usage rate and the total number of waiting passenger agents recorded at each step of the simulation (a proxy variable for waiting time).

The simulation results of the planned scenarios and a reference case are shown in **Table 6**. The table presents the average SST utilization of 50 experimental results for each scenario, and the average peak numbers of the total number of passenger agents waiting for check-in options.
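The aggregation over repeated runs can be sketched as below. The per-run record layout (`ss_users`, `passengers`, `waiting_per_step`) and the function name are hypothetical; the sketch assumes that the peak is taken within each run before averaging across runs.

```python
from statistics import mean


def summarise_runs(runs):
    """Average the usage rate and the peak number of waiting agents over runs.

    Each run is a dict with hypothetical keys:
      'ss_users'         - passengers who used self-service
      'passengers'       - all passengers in the run
      'waiting_per_step' - total waiting agents recorded at each step
    """
    usage_rates = [run["ss_users"] / run["passengers"] for run in runs]
    peak_waiting = [max(run["waiting_per_step"]) for run in runs]
    return {
        "usage_rate": mean(usage_rates),
        "peak_waiting": mean(peak_waiting),
    }
```

Applying such a summary to the 50 runs of each scenario would yield one row of the kind reported per scenario.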

The following result was obtained regarding the impact of reducing the number of staff. The results of scenario 1 show that a reduction in the number of IPC staff does not necessarily cause service quality to deteriorate because of an increase in waiting time. In the reference case, it can be assumed that there was a margin in the processing capacity of check-in options for the amount of work required to manage arriving passengers. Regarding the location and role changes in service staff, scenario 5 shows that the SST usage rate significantly increases when IPC staff are relocated to the lobby and used instead to guide passengers and assist them with SST operation. In scenario 4, we presume that the waiting time has decreased because the lobby service staff redeployed from IPC serve to guide passengers to utilize SST and other check-in options.

#### 4.3. Scenario Analysis and Discussion

We comprehensively evaluated the scenario experiments by considering whether they contributed from the following three perspectives: (1) increasing the self-service usage rate, (2) reducing cost, and (3) moderating the waiting time.

#### TABLE 7 | Comprehensive scenario evaluation.

Scenario 6 has a relative advantage over scenarios 4 and 5 in terms of cost efficiency and self-service usage rate. **Table 6** shows that scenario 7 is the best in terms of self-service usage rate, and Scenario 6 is the second best. Similarly, the best scenario in terms of cost efficiency is scenario 3, and the second best are scenarios 1, 2, and 6, which entail a reduction in one member of the service staff. We narrow down the target of evaluation by selecting the top scenario in perspective (1) or (2) and the scenario that is ranked higher than the second-place scenario for perspectives (1) and (2).

This ranked result is displayed in **Table 7** with the relative rank of the third perspective among them, which shows that scenario 6 generates the smallest queue among the three scenarios.

The comprehensive evaluation of the narrowed-down scenarios from three viewpoints suggests that scenario 6 is relatively balanced and superior among the three.
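The narrowing-down rule described above can be sketched as a small worked example. Only the ranks stated in the text are used (scenario 7 first and scenario 6 second for usage rate; scenario 3 first and scenarios 1, 2, and 6 second for cost efficiency); scenarios without a stated rank are treated as below second place, which is an assumption, as is the function name.

```python
# Ranks stated in the text; scenarios without a stated rank are omitted.
usage_rank = {7: 1, 6: 2}            # perspective (1): self-service usage rate
cost_rank = {3: 1, 1: 2, 2: 2, 6: 2}  # perspective (2): cost efficiency


def shortlist(usage_rank, cost_rank):
    """Keep scenarios that are top in (1) or (2), or second or better in both."""
    selected = set()
    for scenario in set(usage_rank) | set(cost_rank):
        u = usage_rank.get(scenario)
        c = cost_rank.get(scenario)
        top_in_either = u == 1 or c == 1
        second_or_better_in_both = (
            u is not None and c is not None and u <= 2 and c <= 2
        )
        if top_in_either or second_or_better_in_both:
            selected.add(scenario)
    return sorted(selected)


candidates = shortlist(usage_rank, cost_rank)
```

Under these stated ranks the shortlist consists of scenarios 3, 6, and 7, which is consistent with the three scenarios compared on the third perspective.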

The experimental space is presumed to reproduce the following situation with scenario 6.


The scenario analysis indicates that a reduction in the quantity of service resources (e.g., check-in counters, service staff) does not simply cause a deterioration of service. The experiment illustrates that the full utilization of each service resource is a key to maintaining good service quality. This is what Chang and Yang [21] indicate: "potential kiosk users expect their check-in environment to be highly controlled". The scenario analysis outcome and the result of the calibration of agent speed [18] imply that, to promote the usage of SST, we need to control the walking speed and course of passengers and keep them aware of the "usefulness" and "ease of use" of SST.

#### 4.4. Summary of Scenario Analysis

In the context of service delivery at the airport, passengers definitely have a choice of service options. Airline staff need to inform passengers about the additional options, especially those who are unaware of them, and encourage those who have not made up their mind to attempt to use these options even though they may already know about them. This effort helps both the airline and the customer to reach the same goal of minimizing stress. This model demonstrates that service staff respond to passengers with different traits arriving at different times without overlapping, which is similar to the real world. We demonstrate that this model is capable of emulating the situation of congestion at the airport. Passengers who move autonomously capture external factors that are necessary for decision-making, and change their internal status by interacting (or not interacting) with the service staff diligently working with the passenger.

It is also important to determine the appropriate allocation of service resources to handle the expected passenger volume. The service quality depends on the extent to which local management prepares before passenger arrival; this includes determining the number of check-in positions, self-service kiosks, and service staff. This model enables us to perform the trials under different conditions. Exploring best practices involves the examination of how we consider redeploying multiple service options and increasing the processing capacity of the lobby as a whole.

ABM simulation helps us to optimally balance the quantity of different service options. In other words, it is a useful tool for exploring the structure of cooperation and coordination by mapping the existing phenomenon and its mechanism. Cooperation between service staff and passengers to minimize the waiting time determines the degree of optimization of check-in options. Coordination of service resources is critical in maintaining a certain level of service that is intangible and perishable. It is difficult for airport managers to quantitatively measure the handling quality and grasp the handling situation in daily operation. However, ABM can provide, via a simulation result, quantitative facts and the process of service operations. ABM offers a tool for desktop trial simulation that would enable those who are involved in service operations to obtain a solution that would not cause the service quality to deteriorate.

## 5. CONCLUSIONS

### 5.1. Summary

#### 5.1.1. Constructing the ABM

In this study, we reviewed and enhanced the SST adoption model and propose a new validation methodology for the agent-based model. The enhanced ABM SST adoption model uses logistic regression analysis and provides statistical features similar to those in the real world.

This exploratory approach is based on the interdisciplinary outcomes of multiple academic disciplines, including innovation diffusion, services marketing, and ABM. Innovation diffusion provides the basic viewpoint from which to promote the adoption of a new service model. Large-scale studies and implications in the field of services marketing are reported in the literature regarding SST. We chose to utilize the concept of the "core" Technology Acceptance Model, which is refined by Dabholkar and Bagozzi [11]. ABM plays the role of a combining device for dynamically reproducing these studies.

In constructing an experimental space that maps the real world, many on-site observations of passenger behavior have been conducted, and we explored and extracted the data from the actual system log. Knowledge of front-line experts was drilled down to a simple rule. Along with a thorough survey, we include the necessary functions in ABM, such as the promoting agent and productive properties (mapping the same amount of self-service kiosks, check-in positions, and baggage drops). Utilizing the actual data, the aggregated analysis is also mapped to the ABM as much as possible. Following the bottom-up simulation modeling approach, we compile relevant information about entities at a lower level of the system, formulate theories about passenger behavior, and implement these theories in a computer simulation [17].

The approach we followed to enhance the validity of the ABM is mainly discussed in this study. One of the measures of approximation is the SST usage rate. We compared the results of the simulation with those obtained in the real world, and showed that they are almost identical to each other. However, the experimental results showed that the replicated model, which introduces individual traits, has a slightly higher RMSE against the real world than the original model [22]. Another measure is the correct prediction rate of SST usage, which is calculated by logistic regression. We found the outcomes of the prediction accuracy from both spaces to be almost equivalent, using the same variable combination as in the real world to explain SST use. We analyze multiple patterns in the real system at different hierarchical levels and demonstrate similar statistical features. The group behavior is examined by the SST usage rate and the individual behavior is analyzed by the prediction accuracy. We presume that the experimental space with the proposed ABM approximates the real world, because two dimensions of the stylized facts, observable simplified presentation of empirical findings, are sufficiently close.

The unique feature of this study is that it utilized data collected from the airline system. The system contains actual passenger demographics with historical activity records, and the huge amount of data could have been used to build this model, if necessary. The critical findings obtained through the series of SST adoption models indicate that it is important to avoid building models that are too simple in structure and mechanism, or too complex and uncertain [17]. We need to carefully extract the relevant elements and an adequate range of data in order to explicitly formulate a rigorous and comprehensive strategy.

ABM is a deductive approach and its advantages are that it not only enables the individual theory of behavior to be explored but it can also be used to verify large-scale phenomena [20]. This study demonstrates the advantage of using ABM as a combining function for interdisciplinary outcomes of multiple academic disciplines.

#### 5.1.2. Scenario Analysis with ABM

Service encounters are critical moments of truth in which customers often develop indelible impressions of a firm [5]. For air-travelers, the departure lobby of the airport is the first physical point of contact with the airlines. It is important for airlines to deliver sufficient service levels to retain their customers. By using the proposed model with ABM, several scenarios are examined. The scenarios are carefully prepared by focusing on the topic of airline on-site managers of passenger-handling operations. We evaluate the experimental results from three perspectives, in which both the airline management and its customers achieve a mutual goal. The best scenario is derived through multiple simulation experiments, which are literally difficult to conduct in the real world.

We analyzed the best practice to explain the improved result compared to other scenarios. The role of service staff guiding passengers toward the SST is important, because passengers may not be able to choose the additional option unless they know about the available service option. An appropriate allocation of service resources for the expected work volume was found to be critical. The best practice is observed with scenario 6, which eliminates one staff member and relocates check-in staff to provide lobby service. The elimination of one staff member from the interpersonal check-in service does not increase the waiting time much, and the usage of the self-service kiosk increases by 7.6% (**Table 6**). On the basis of our scenario analysis, we recognized the importance of cooperation between service staff and their customers in achieving a certain level of service, which is the result of their work toward more optimal customer throughput. It is also clear that the coordination of service resources is one of the largest success factors for the optimal utilization of their capacity.

ABM is a powerful instrument for exploring the structure of cooperation and coordination because it is capable of equipping the mechanisms of reproducing an existing phenomenon in a simplified context. It also provides on-site managers with facts, simulation results, and animations of ongoing experiments, which helps them to understand the level of service provided and the possibility for improvement. Cooperation between service staff and passengers to minimize the waiting time determines the degree of optimization of check-in options. The coordination of current resources is critical for maintaining and improving the level of service. This experiment demonstrated a trial-and-error method that neither sacrifices passenger convenience nor services.

ABM provides a reproduction of dynamic phenomena visually and quantitatively, which promotes an understanding of factors and countermeasures for on-site management. Although a service is characterized by properties such as intangibility, heterogeneity, perishability, and inseparability, ABM can visualize the process and outcome of the operations. Therefore, ABM may help to explore the existing situation and possible future solutions with regard to customer service improvement.

### 5.2. For Future Study

This paper presented suggestions as to how to improve the validity of the ABM. This study proposed that two stylized facts of an experimental space and the real world are quite similar. As mentioned in section 2.3, many stylized facts can be used to explain the extent to which the model approximates the real world. In this regard, it is important to determine the appropriate stylized facts, as they help to explain that the simulation represents the real world. As there are vast volumes of stored data in the active system, we need to explore more empirical findings that explain how the experimental space approximates the real world. ABM can observe and report the activity of an experiment involving autonomous agents who can hold multiple variables. This study calculates the prediction correctness within experiments and compares them with real-world prediction correctness rates to assess the extent to which they approximate each other. It therefore seems worthwhile to find ways in which the respective results of ABM can relate to real-world individual data. Connecting the ABM agent with the real-world individual data may enable us to simulate the phenomena more precisely.

This model is applicable and can be expanded to discuss observable phenomena with multiple variables that interact with one another. However, this model is not designed to take into account customers' emotions and satisfaction, which are invisible. In general, despite firms' continued efforts to improve service delivery, not all encounters are successful. Effective service recovery is expected by customers and failure to accomplish this effectively results in losing customers. Moreover, it is evident that positive employee responses to service failures can lead directly to customer satisfaction [5]. Even though this model can examine operational excellence, it does not imply that it enhances customer experience.

The real world is difficult to map comprehensively. We need to continue to pursue a method to extract the essence of circumstances that would enable us to understand important phenomena.

### AUTHOR CONTRIBUTIONS

KU designed the study and wrote the initial draft of the manuscript. SK has critically reviewed the manuscript. All authors approved the final version of the manuscript, and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Ueda and Kurahashi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dynamic Microsimulation Model of Impoverishment Among Elderly Women in Japan

#### Seiichi Inagaki\*

Education Center, International University of Health and Welfare, Otawara, Japan

The increasing poverty rate for elderly women is a growing concern in Japan and is generally due to their lifestyle changes and the public pension system based on the pre-1980s (old) lifestyle. At that time, women were expected to get married and become homemakers. Therefore, the public pension system is generous for married women and widows but not for never-married and divorced women. Using a dynamic microsimulation model, the Integrated Analytical Model for Household Simulation (INAHSIM), previous research has shown that poverty rates for elderly people will increase significantly in the future due to changes in nuptiality behavior after the 1980s. However, this approach is an indirect method, and the mechanism of impoverishment remains unclear. This study uses the same dynamic microsimulation model but attempts a more direct approach to interpret the effects of these behavioral changes on poverty rates for elderly women. Specifically, under the baseline scenario, it makes future projections on key distributions related to poverty by marital status and illustrates how they will face the poverty problem. It shows the future projections of (1) the distribution of pension amounts by gender and marital status, (2) poverty rates for elderly women by marital status, and (3) poverty rates for elderly people by gender. After the 1980s in Japan, the marriage rate decreased and the divorce rate increased significantly. Nevertheless, society still suffers from wage inequality between men and women. As a result, the number of never-married or divorced women will increase and these women will receive poor pension benefits due to an unfavorable public pension system. In addition, they have a higher risk of living in a single-person household because they have no or very few children. In the end, they will face the risk of poverty and raise the overall poverty rate.

#### Keywords: microsimulation, poverty rate, public pension, nuptiality, marital status, family, household

### INTRODUCTION

The increasing poverty rate for elderly women<sup>1</sup> is a growing concern in Japan and is generally due to lifestyle changes and an inadequate public pension system for those women. Using a dynamic microsimulation model for Japan, the Integrated Analytical Model for Household Simulation (INAHSIM), Inagaki [1] projected that poverty rates by gender for elderly people will increase significantly in the future and showed that changes in nuptiality behaviors after the 1980s will affect poverty rates for elderly women but not for elderly men.

<sup>1</sup>In this article, "elderly people" is defined as those aged 65 and over.

Edited by:

Isamu Okada, Soka University, Japan

#### Reviewed by:

Rajagopalan Srinivasan, Indian Institute of Technology Madras, India

Benoit Gaudou, Université fédérale de Toulouse, France

> \*Correspondence: Seiichi Inagaki s.inagaki@iuhw.ac.jp

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 24 October 2017 Accepted: 19 February 2018 Published: 14 March 2018

#### Citation:

Inagaki S (2018) Dynamic Microsimulation Model of Impoverishment Among Elderly Women in Japan. Front. Phys. 6:22. doi: 10.3389/fphy.2018.00022

future, then the poverty rate for women will be much lower than that for the baseline scenario while that for men will not change. Based on these simulation results, he concluded that behavioral changes will raise the poverty rates for elderly women in the future.

Ideally, one would show what would happen to future poverty rates if nuptiality behaviors had not changed after the 1980s. However, the initial population used in INAHSIM is the population in 2004, and it is difficult to simulate the population backward. In other words, it is difficult to simulate the population under the assumption that nuptiality behaviors from 1980 to 2004 were the same as those in the 1970s. Therefore, he took an indirect approach.

This study uses the same dynamic microsimulation model but attempts a more direct approach to interpret the effects of these behavioral changes on poverty rates in elderly women. Specifically, under the baseline scenario, it makes future projections of (1) the distribution of pension amounts by gender and marital status, (2) poverty rates for elderly women by marital status, and (3) poverty rates for elderly people by gender. Then, it clarifies the mechanism of their impoverishment based on these results. Marital status is the key to understanding the impoverishment of elderly women.

The rest of this paper proceeds as follows. Section Public Pension System in Japan illustrates the public pension system in Japan. Section Method outlines the dynamic microsimulation model used to estimate poverty rates and related indicators for elderly women in the future by their marital status. Section Results presents the results of the future estimates and discusses why elderly women will be impoverished. Section Conclusion concludes<sup>3</sup>.

#### PUBLIC PENSION SYSTEM IN JAPAN

Japan achieved dramatic economic growth from 1954 to 1973. At that time, people's lifestyles were uniform, and families were often referred to as "post-war families." The various kinds of social systems in Japan today developed during this period of high economic growth. These systems were established based on standard households and the division of gender roles. The public pension system is one of those systems.

Gender roles were as follows: (1) most women resigned from their jobs during their twenties; (2) women got married, had children, and took care of their families as homemakers; and (3) women were generally employed to do simple clerical work, with restrictions on their advancement. Therefore, the social systems, including the public pension system, are very generous for homemakers.

Japan's public pension scheme, depicted in **Figure 1**, is a two-tier system that consists of a flat-rate benefit called the basic pension and an earnings-related pension for regular employees (Category 2 subscribers). Category 2 subscribers receive both the basic pension and earnings-related pension, whereas subscribers to Categories 1 and 3 receive only the basic pension. Category 1 subscribers include self-employed workers, non-regular employees, and the unemployed. Category 2 subscribers include regular employees. Category 3 subscribers include the dependent spouses of Category 2 subscribers.

**Table 1** summarizes the contributions and benefits by category in 2017. Category 1 subscribers pay a monthly contribution of JPY16,490 (\$146)<sup>4</sup> in exchange for receiving a basic pension of JPY64,900 (\$576). However, if they do not pay their contributions for any period, their basic pensions are reduced according to the length of the non-payment period.

The basic pension contributions for Category 2 subscribers are included in the insured person's contribution to the Employees' Pension Insurance (EPI). The 2017 contribution is 18.3% of a Category 2 subscriber's pensionable remuneration, for which the employer and employed are liable in equal amounts. The average monthly employee contribution is JPY38,223 (\$339) in exchange for receiving pension benefits totaling JPY155,486 (\$1,380). Employers deduct employees' contributions from their salaries and pay those contributions to the insurers. Therefore, the problem of the reduction of pension benefits due to nonpayment, as in the case of Category 1 subscribers, does not exist.

Category 3 subscribers do not have to contribute to the basic pension, but they are deemed to have paid their contributions; thus, they are entitled to full basic pensions. In addition, if their spouse dies, they will be entitled to the survivors' pension, which amounts to three-quarters of the spouse's earnings-related pension benefit. The total amount of the basic pension and survivors' pension is JPY132,840 (\$1,179) on average.
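The contribution and benefit rules described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the author's calculation: the function and constant names are illustrative, and it assumes that the average employee contribution is based on the men's average pensionable remuneration reported with Table 1 and that the average survivor's pension is derived from the average Category 2 benefit.

```python
# Figures quoted in the text (2017, per month, JPY).
BASIC_PENSION = 64_900              # full basic pension
CATEGORY2_TOTAL_BENEFIT = 155_486   # basic + earnings-related pension (average)
AVERAGE_PR_MEN = 417_743            # average pensionable remuneration for men
EPI_RATE = 0.183                    # total EPI contribution rate
JPY_PER_USD = 112.67                # exchange rate on October 10, 2017


def employee_contribution(pr, rate=EPI_RATE):
    """Employee's half of the EPI contribution (the employer pays the other half)."""
    return pr * rate / 2


def survivor_total(total_benefit, basic=BASIC_PENSION):
    """Own basic pension plus three-quarters of the spouse's earnings-related part."""
    earnings_related = total_benefit - basic
    return basic + 0.75 * earnings_related


def to_usd(jpy):
    """Convert JPY to USD at the exchange rate used in the article."""
    return jpy / JPY_PER_USD


contribution = employee_contribution(AVERAGE_PR_MEN)
widow_income = survivor_total(CATEGORY2_TOTAL_BENEFIT)
print(round(contribution), round(widow_income), round(to_usd(BASIC_PENSION)))
```

Under these assumptions, the results are close to the JPY38,223 average contribution and the JPY132,840 average survivor total reported in the text.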

As illustrated above, the public pension system is particularly generous toward dependent spouses (mainly homemakers) and widows/widowers. From the viewpoint of the public pension system, men and women are treated equally. However, the division of gender roles remains, and wage inequality between men and women still exists. Consequently, there exists inequality in pension benefits between men and women because the wage inequality leads to inequality in pension benefits.

**Table 2** shows the difference in employment by gender. The percentages of regular employees are 72.9% for men and 44.0% for women. Category 2 subscribers receive the earnings-related pension and the basic pension, whereas Category 1 subscribers receive only the basic pension (at most, JPY64,900 [\$576] per month). The basic pension for Category 1 subscribers may be reduced according to the non-payment period. In addition, a significant difference exists in the pensionable remuneration (PR) that affects the earnings-related pension. The average PR for men is JPY417,743 (\$3,708) while the average PR for women is JPY273,645 (\$2,429). As a result, it is anticipated that women who are never married or divorced will suffer from low pension benefits. Married women or widows will not face this problem because married women will live with their husbands and share their pension benefits and widows will receive survivors' pension benefits.

<sup>2</sup>The baseline scenario assumes that people's current behaviors will continue in the future.

<sup>3</sup>An ethics approval was not required as per institutional and national guidelines and all data used are publicly available or have been provided to the authors in a de-identifiable format.

<sup>4</sup>The exchange rate was \$1 = JPY112.67 on October 10, 2017.

FIGURE 1 | Public pension scheme in Japan. Source: Inagaki [2]. The amount of pension benefit is recalculated pursuant to the National Pension Act.

TABLE 1 | Contributions and benefits (per month).

Source: Inagaki [2]. The amounts of pension benefit are recalculated pursuant to the National Pension Act and Employees' Pension Insurance Act. Average Pensionable Remuneration (PR) for men was JPY417,743 (\$3,708) in 2015 (Ministry of Health, Labor, and Welfare [3]). (\*) PR is an abbreviation for pensionable remuneration.

Currently, elderly people enjoy their pension benefits because they are married and seldom get divorced. Their lifestyle during their working years was that of the "post-war family" that the current pension system presupposes. In contrast, the lifestyles of future elderly people, that is, the current working-age population, are diversified. If their lifestyle is that of the "post-war family," they can receive an adequate amount of pension benefits. If not, they may be unable to receive an adequate amount of pension benefits.

TABLE 2 | Numbers of subscribers by gender and category (in thousands).

Source: Ministry of Health, Labor, and Welfare [3].

Sustainability and adequacy are important points for a pension system. Because Japan is a super-aging society, discussion of the public pension system focuses mainly on its sustainability. Its adequacy receives less attention, and the discussion of adequacy focuses on post-war families only. According to the 2009 actuarial valuation [4], the current pension system ensures a replacement rate of 50% at the age of 65, when the pension is newly awarded, for a specific single-income couple<sup>5</sup> covered by the EPI.

### METHODS

The method used in this study is the INAHSIM dynamic microsimulation model. The INAHSIM was originally developed in the early 1980s as a household simulation model tailored to Japanese society [5]. The first version was a tool for household simulation that only incorporated demography and household changes after demographic events. It continues to be upgraded, and the current version, INAHSIM 3.8, is a comprehensive model for Japanese society. As illustrated in **Figure 2**, it can simulate not only individual incomes but also living arrangements with one's family.

<sup>5</sup>In the post-war family, the husband is covered by the EPI from 20 to 59 years and the wife, who is the same age as her husband, has always been dependent on him.

The first version of INAHSIM includes only three elements: "demography," "young people leaving home," and "living with elderly parents." However, these are both necessary and sufficient for simulating families and households in Japan.

"Demography" includes not only demographic events but also the household changes that follow them: (1) newborn babies join their mother's household; (2) after marriage, couples decide whether to live with the groom's parents, live with the bride's parents, or start a new household; and (3) after a divorce, once custody is settled, the divorced husband or wife decides whether to return to his/her parents' household or form a new household.

The transition probabilities in the first version of the model were based on people's behavior in the early 1980s. Those of the current version are determined based on people's behaviors in the early 2000s.

Regarding household changes (2), the probabilities used in the first version were 58% that the couple lives with the groom's parents, 25% that they live with the bride's parents, and 17% that they form a new household. In contrast, those used in the current version are 20, 5, and 75%, respectively. One sees that the probability that the couple forms a new household has increased dramatically.
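A transition probability of this kind can be applied as a categorical random draw for each newly married couple. The following is a minimal sketch of that idea using the probabilities quoted above; it is not the INAHSIM program, and the names `ARRANGEMENTS`, `PROBS`, `draw_arrangement`, and `simulate_shares` are illustrative assumptions.

```python
import random

# Post-marriage living arrangements and their probabilities, as quoted in the text.
ARRANGEMENTS = ("groom_parents", "bride_parents", "new_household")
PROBS = {
    "first_version": (0.58, 0.25, 0.17),
    "current_version": (0.20, 0.05, 0.75),
}


def draw_arrangement(version, rng):
    """Draw one living arrangement for a newly married couple."""
    return rng.choices(ARRANGEMENTS, weights=PROBS[version], k=1)[0]


def simulate_shares(version, n_couples=100_000, seed=0):
    """Simulate many couples and return the observed share of each arrangement."""
    rng = random.Random(seed)
    counts = dict.fromkeys(ARRANGEMENTS, 0)
    for _ in range(n_couples):
        counts[draw_arrangement(version, rng)] += 1
    return {name: count / n_couples for name, count in counts.items()}


for version in PROBS:
    print(version, simulate_shares(version))
```

With a large number of simulated couples, the observed shares should settle close to the input probabilities, which is the property a Monte Carlo microsimulation relies on.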

Regarding custody, the first version assumes that 69% of wives obtain custody rights. The corresponding figure in the current version is 80%. Regarding household changes (3), the first version assumes that 50% of divorced wives return to their parents' household, and 50% of divorced husbands return to their parents' household. The corresponding figures for the current version are 35 and 43%, respectively.

Although marriage rates overall have decreased significantly, it is difficult to compare the two versions. This is because marriage rates in the current version are controlled by employment status, in addition to age and sex. Similarly, although divorce rates have increased significantly, it is difficult to compare the two versions. This is because divorce rates in the current version are controlled by whether the couple has dependent children.

"Young people leaving home" refers to young people leaving their parents' households for higher education, to find employment, or to change jobs. The probability of unmarried men aged 20–24 leaving home in the first version is about 4%. Although this is lower in the current version, it is difficult to compare with the first version because the probabilities are controlled by employment status in the current version<sup>6</sup>.

"Living with elderly parents" refers to those situations in which children move in with their elderly parents to take care of them. This is an important event to secure the life of the elderly in Japan. These probabilities in the first version are 10% for those aged 70 and over. In contrast, those in the current version are controlled by age and sex, and are 1.5–20.7% for those aged 65 and over.

The current version adds "Changes in the need for long-term care," "Changes in employment status," "Estimating earnings," "Determining pension benefits," "Entering an institution," and "Social security premium/tax"<sup>7</sup>. These new elements help estimate the poverty rates for elderly people.

In Japan, there are few dynamic microsimulation models other than INAHSIM. Shiraishi [6] developed a model for pension benefits, and Koshio [7] developed one for long-term care needs. However, their models simulate pension benefits or long-term care needs at a personal level, not a family or household level.

Studies that use INAHSIM are also few. The initial population of INAHSIM is based on micro data of the Comprehensive Survey of Living Conditions (CSLC) conducted by the Ministry of Health, Labor, and Welfare. However, the use of micro data from government surveys is very restricted, and only a few studies can use them. In addition, the computer program of INAHSIM<sup>8</sup> is too complicated for the average researcher. Only Fukawa [8] estimated health and long-term expenditure using INAHSIM. Although he revised the second version of INAHSIM for simulations without using micro data of the CSLC [9], this is also not used extensively.

The poverty rate is the share of the population whose income falls below the poverty line. One-half of the median household income of the total population is typically used as the poverty line. This household income is adjusted by household size; specifically, the household income is divided by the square root of the household size. However, the minimum standard of living varies with household composition. For example, it differs between elderly couple households and single-mother families. Livelihood assistance is determined based on household size; the composition of the household, e.g., ages and characteristics of members; and place of residence. Therefore, this study uses the livelihood assistance level as the poverty line to evaluate poverty more appropriately. The amounts of livelihood assistance in 2012 for some specified households in Japan are shown in **Table 3**.
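For reference, the conventional relative poverty measure described at the start of this paragraph can be sketched as follows. This illustrates the typical half-median method, not the livelihood-assistance line that this study actually uses. The household data are hypothetical, and taking the median over persons rather than over households is an assumption of the sketch.

```python
import math
import statistics


def equivalized_incomes(households):
    """Return one equivalized income per person.

    households: iterable of (household_income, household_size) pairs.
    Each member is assigned the household income divided by sqrt(size).
    """
    per_person = []
    for income, size in households:
        per_person.extend([income / math.sqrt(size)] * size)
    return per_person


def relative_poverty_rate(households):
    """Share of persons whose equivalized income is below half the median."""
    incomes = equivalized_incomes(households)
    poverty_line = 0.5 * statistics.median(incomes)
    poor = sum(1 for y in incomes if y < poverty_line)
    return poor / len(incomes)


# Hypothetical households: (annual income in million yen, number of members).
sample = [(4.0, 4), (2.0, 4), (3.0, 1), (0.6, 1)]
rate = relative_poverty_rate(sample)
print(rate)
```

Because the same line applies to every household type, this measure cannot reflect the differences in minimum living standards that motivate the livelihood-assistance approach.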

To analyze poverty, it is necessary to simulate not only individual incomes but also living arrangements with one's family to measure the poverty rate. In other words, it is necessary to simulate people's life events, such as demographic events (e.g., birth, death, marriage, divorce, and international migration), employment status, earnings, and pension amounts simultaneously and individually. INAHSIM is a suitable tool for this kind of simulation.

Individual incomes, household income, and living arrangements are necessary to evaluate poverty rates. The life events of changes in employment status, estimating earnings, determining pensions, and social security premium affect people's incomes. Demographic changes, such as young people leaving home and residing in an institution, affect people's living arrangements. In addition, there are many mutual interactions among life events, e.g., employment status affects household changes and the number of children affects demographic events. INAHSIM incorporates these mutual interactions. Inagaki [1, 10–12] summarized these life events and transition probabilities considering those interactions.

<sup>6</sup>The probabilities in the first version are controlled by sex and age but those in the current version are controlled by sex, age, and employment status.

<sup>7</sup>See Aoi et al. [5], Inagaki [10], Inagaki [11], Inagaki [12], and Inagaki [1] for details.

<sup>8</sup>Published in Inagaki [11].

Source: Inagaki [2]. The amounts of livelihood assistance are recalculated pursuant to the Public Assistance Act.

The initial population is prepared using micro data from the 2004 Comprehensive Survey of Living Conditions (CSLC)<sup>9</sup> conducted by the Ministry of Health, Labor, and Welfare. As in the previous studies, the initial population includes 126,570 household members in 49,307 private households and 1,212 elderly people in institutional households. The initial population reflects Japan's society on a 1/1,000 scale. All results are averages over 100 simulation runs, so the stochastic errors are expected to be small. Inagaki [11] estimated the stochastic error arising from the Monte Carlo method and pointed out that the standard error rate for the number of elderly people is only 0.2%. In other words, its 95% confidence interval is about ±0.4%<sup>10</sup>.
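The step from a 0.2% standard error to a ±0.4% interval follows from the usual normal approximation, in which the 95% interval is about 1.96 standard errors on either side of the mean. The sketch below shows that arithmetic, along with one common way to summarize repeated simulation runs. The function names are illustrative assumptions, and the article does not state whether the interval was computed exactly this way.

```python
import math
import statistics


def summarize_runs(run_results, z=1.96):
    """Mean, standard error of the mean, and normal-approximation 95% interval."""
    n = len(run_results)
    mean = statistics.fmean(run_results)
    se = statistics.stdev(run_results) / math.sqrt(n)
    return mean, se, (mean - z * se, mean + z * se)


def ci_half_width(standard_error_rate, z=1.96):
    """Half-width of the 95% interval for a given standard error (same units)."""
    return z * standard_error_rate


half_width = ci_half_width(0.2)  # standard error rate of 0.2%, from the text
print(round(half_width, 2))
```

The half-width comes out just under 0.4%, consistent with the "about ±0.4%" stated above.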

### RESULTS

### Distribution of Public Pension Amounts by Gender and Marital Status

**Figure 3** compares the distribution of pension amounts in 2012 between men and women. There are two peaks of 0.75–0.99 and 1.75–1.99 million yen for men. For elderly men, Category 1 subscribers comprise the first peak and Category 2 subscribers form the second peak.

On the other hand, there is only one peak, at 0.50–0.74 million yen, for women. The peak is formed by elderly women in Categories 1 or 3. The second peak is not formed because the percentage of Category 2 subscribers for women is much lower than that for men. In addition, women do not continue to work over lengthy periods and their wages are relatively low. As a result, the women's earnings-related pensions are not high enough to form a peak.

<sup>9</sup>The data used in this study were made available to the author by the Ministry of Health, Labor, and Welfare of Japan, notice No. 0531-2, dated May 31, 2016.

<sup>10</sup>The stochastic error of the poverty rates estimated in this article should be larger than this. This is left to future research.

**Figure 4** compares the distribution of pension amounts in 2030 between men and women. The shapes of the distributions are similar to those in 2012. However, the distributions will shift to the left because an automatic pension amount reduction system is incorporated to ensure the sustainability of the public pension system. This reduction system, called the "macroeconomic slide system," reduces pension benefits by about 1% every year until financial equilibrium is achieved. According to the 2009 actuarial valuation, the reduction will last until 2038.
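To get a feel for how a reduction of about 1% per year accumulates, the following minimal sketch compounds a constant annual reduction. It is a purely illustrative simplification: the function name is hypothetical, the 1% rate is treated as exactly constant, and the horizons of 18 and 26 years are simply the gaps from 2012 to 2030 and to 2038. The sketch does not claim that the reduction was applied in every one of those years.

```python
def cumulative_slide_factor(years, annual_reduction=0.01):
    """Remaining fraction of benefits after `years` of a constant annual reduction."""
    return (1.0 - annual_reduction) ** years


# Illustrative horizons only (see the assumptions stated above).
for years in (10, 18, 26):
    print(years, round(cumulative_slide_factor(years), 3))
```

Even a small annual rate compounds into a visible leftward shift of the whole distribution over two or three decades.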

**Figure 5** shows the trends in the quartiles of pension amounts by gender. The median of women's pension amounts will be lower than the first quartile of men's. Women's median will be 0.7 million yen, and their first quartile will be 0.5 million yen. These are only 15% and 11%, respectively, of average disposable wages for men. This means most elderly women may be unable to live alone.

**Figure 6** shows the trends in the first quartiles of pension amounts for women by marital status. Married women usually live with their husbands; thus, the first quartile is calculated using half of couples' pension amounts. A significant difference exists in the pension amounts among marital statuses. Married women and widows will receive a higher amount of pension benefits than never-married and divorced women.

### Poverty Rates for Elderly Women by Marital Status

In addition to their lower amounts of pension benefits, never-married or divorced women will have a higher risk of living in a single-person household. After the death of their parents, they will be more likely to live alone because they have no or very few children.

**Figure 7** shows the trends in the percentages of single-person households (including institutional households) for elderly women by marital status. Never-married women will be most likely to live in a single-person household, and divorced women will be the second most likely after they reach old age. There will be a small difference in the percentage between divorced women and widows, but most widows will receive survivors' pension benefits while divorced women will receive only their own pension benefits.

In simpler terms, never-married and divorced women will receive lower pension benefits and will be more likely to live in single-person households than married women and widows. As a result, never-married and divorced women are more likely to live in poverty when they reach old age. **Figure 8** shows the trends in poverty rates for elderly women by marital status. As anticipated, their poverty rates will reach around 50% in the future.

### Poverty Rates for Elderly People by Gender

There are significant differences in poverty rates among elderly women by marital status. However, if the percentage of never-married and divorced women is small, then the poverty rate for elderly people will not be high. According to the Population Census 2010, the percentages of never-married and divorced women among elderly women were only 4.0 and 4.7%, respectively. Therefore, the poverty rates for elderly people in 2010 were 6.1% for men and 11.1% for women. The poverty problem is not serious currently.

As Inagaki [1] estimated, the percentages of never-married and divorced women among elderly women will increase significantly to 15.4 and 11.9% in 2050 and 17.6 and 12.7% in 2100, respectively. This is due to the changes in nuptiality behaviors that occurred among young people after the 1980s. It takes time for such behaviors to affect the percentages of marital statuses for elderly women. The effect of the changes on the marital statuses of elderly women will appear after two or three decades.
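The argument that the overall poverty rate depends on the size of the high-risk groups can be made concrete with a simple share-weighted calculation. The sketch below uses the shares quoted above and assumes, for illustration only, that a poverty rate of 50% applies to both never-married and divorced elderly women in every year. The text reports "around 50%" only for the future, and the function name is hypothetical.

```python
# Shares of never-married and divorced women among elderly women (%), from the text.
SHARES = {
    2010: (4.0, 4.7),
    2050: (15.4, 11.9),
    2100: (17.6, 12.7),
}

# Assumption: "around 50%" applied to both groups and to all years.
ASSUMED_POVERTY_RATE = 0.50


def contribution_points(year, rate=ASSUMED_POVERTY_RATE):
    """Percentage points of elderly women's poverty rate attributable to the two groups."""
    never_married, divorced = SHARES[year]
    return (never_married + divorced) * rate


for year in sorted(SHARES):
    print(year, round(contribution_points(year), 2))
```

Under this assumption, the two groups alone account for roughly 4 percentage points in 2010 but about 14 to 15 points in 2050 and 2100, which shows how the change in marital composition raises the overall rate.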

**Figure 9** shows trends in poverty rates for elderly people by gender. Those for elderly women will increase by around 25%. This is due to the increase in the percentages of never-married or divorced women, who are very likely to live in poverty. If the poverty rate is low, then the problem is limited, and those people's lives can be secured by social assistance programs. Of course, if their pension benefits are adequate for their lives, the problem is also limited. However, the public pension system is not generous for women other than homemakers or widows.

### CONCLUSION

The increasing poverty rate for elderly women is a growing concern in Japan. Inagaki [1] quantitatively projected future poverty rates using a dynamic microsimulation model (INAHSIM). He then pointed out that changes in nuptiality behaviors after the 1980s will cause the increase.

However, his approach is an indirect method, and the mechanism of impoverishment has remained unclear. This study attempts to reveal this mechanism more directly. Specifically, it breaks down the simulation results into key factors such as individual pension benefits and incomes and living arrangements by marital status. The results showed that marital status is the most important factor that will lead to poverty for the elderly in the future.

One reason behind this is the concept of the public pension system in Japan. The public pension system was established in the mid-1980s as a social insurance system based on the division of gender roles established during the period of high economic growth from 1954 to 1973. The system is very generous for dependent wives and widows but not for never-married or divorced women. At that time, most women resigned from their jobs during their twenties, got married, and became homemakers.

However, after the period of high economic growth, nuptiality behaviors and lifestyles changed completely. Nevertheless, the division of gender roles remains, and wage inequality between men and women still exists. Consequently, inequality exists in pension benefits between men and women because the wage inequality leads to inequality in pension benefits. Thus, never-married and divorced women will suffer from poverty when they reach old age.

Using a dynamic microsimulation model, this study makes future projections of (1) the distribution of pension amounts by gender and marital status, (2) poverty rates for elderly women by marital status, and (3) poverty rates for elderly people by gender. The results indicate a huge difference in poverty rates among elderly women by their marital status and illustrate the mechanism by which the poverty rate for elderly women will increase significantly.

FIGURE 8 | Trends in poverty rates of elderly women by marital status. Simulation results.

At this moment, this poverty problem is hidden and not very serious because many in this cohort live with their parents. However, when they reach old age, the poverty problem will come to the surface because their parents will pass away and their public pensions will not be enough to sustain them. These problems will become apparent to everyone in the near future. Pension reform for a super-aging society is important. However, it is also urgent to reform the public pension system to be consistent with people's current behaviors.

## REFERENCES


## AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

## ACKNOWLEDGMENTS

This work was supported by JSPS KAKENHI Grant Numbers JP15548069, JP15649809, and JP17921805 and Health Labour Sciences Research Grant Number H27-seisaku-ippan-004.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Inagaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Power Laws in Stochastic Processes for Social Phenomena: An Introductory Review

Shin-Ichiro Kumamoto\* and Takashi Kamihigashi

Research Institute for Economics and Business Administration, Kobe University, Kobe, Japan

Many phenomena with power laws have been observed in various fields of the natural and social sciences, and these power laws are often interpreted as the macro behaviors of systems that consist of micro units. In this paper, we review some basic mathematical mechanisms that are known to generate power laws. In particular, we focus on stochastic processes including the Yule process and the Simon process as well as some recent models. The main purpose of this paper is to explain the mathematical details of their mechanisms in a self-contained manner.

Keywords: power law, Zipf's law, Pareto's law, preferential attachment, geometric Brownian motion

#### Edited by:

Isamu Okada, Soka University, Japan

#### Reviewed by:

Francisco Welington Lima, Federal University of Piauí, Brazil

Renaud Lambiotte, University of Oxford, United Kingdom

> \*Correspondence: Shin-Ichiro Kumamoto kumamoto@rieb.kobe-u.ac.jp

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 13 October 2017 Accepted: 14 February 2018 Published: 15 March 2018

#### Citation:

Kumamoto S-I and Kamihigashi T (2018) Power Laws in Stochastic Processes for Social Phenomena: An Introductory Review. Front. Phys. 6:20. doi: 10.3389/fphy.2018.00020

### 1. INTRODUCTION

Many phenomena with power laws have been observed in various fields of the natural and social sciences: physics, biology, earth and planetary science, computer science, economics, and so on. These power laws can be interpreted as the macro behaviors of systems that consist of micro units (i.e., agents, individuals, particles, and so on). In other words, the ensemble of the dynamics of these micro units is observed as the behavior of the whole system, such as a power law<sup>1</sup>. To obtain a deep understanding of a phenomenon in such a system, we must first observe the behavior on the macro side, then assume the stochastic dynamics on the micro side, and finally reveal the theoretical method connecting both sides. Thus, the mechanisms generating power laws have been studied as the second and final steps in the study of power laws.

Next, we mathematically define the power law. When the probability density function p(x) for a continuous random variable<sup>2</sup> Xˆ is given by

$$p(x) = C x^{-\alpha} \quad (x \ge x\_{\text{min}}), \tag{1}$$

we say that Xˆ satisfies the power law. Here α is called the exponent of the power law, C is the normalization constant, and xmin is the minimum value of x for which the power law holds. The power law is the only function satisfying the scale-free property [1]

$$p(bx) = f(b)p(x) \quad \text{for any } b. \tag{2}$$

Then we define the cumulative distribution function P>(x) as

$$P\_{>}(x) := \mathrm{P}\{\hat{X} \ge x\} = \int\_{x}^{\infty} p(x')\,\mathrm{d}x'.\tag{3}$$

<sup>1</sup>For the example of a city, the micro dynamics correspond to immigration, emigration, births, and deaths, and the macro behavior is the distribution of the population.

<sup>2</sup>The hat of Oˆ means that Oˆ is a random variable.

When the probability density function satisfies the power law p(x) = Cx<sup>−α</sup> with α > 1, the cumulative distribution function also follows a power law:

$$P\_{>}(x) \propto x^{-\alpha+1}.\tag{4}$$

The behavior of the cumulative distribution function with the power law is a straight line in a log–log plot for x ≥ xmin (**Figure 1**).
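The relation between Equations (1) and (4) can be checked numerically. The following is a minimal sketch, assuming inverse-transform sampling and the standard maximum-likelihood estimator for a continuous power law; neither appears in the text above, and the function names and parameter values are illustrative only.

```python
import math
import random


def sample_power_law(alpha, x_min, n, seed=0):
    """Draw n samples with density p(x) = C x^(-alpha) for x >= x_min (alpha > 1).

    Inverse-transform sampling: P_>(x) = (x / x_min)^(-(alpha - 1)),
    so x = x_min * u^(-1 / (alpha - 1)) with u uniform on (0, 1].
    """
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def estimate_alpha(samples, x_min):
    """Maximum-likelihood estimate of alpha for samples >= x_min."""
    log_sum = sum(math.log(x / x_min) for x in samples)
    return 1.0 + len(samples) / log_sum


def empirical_ccdf(samples, x):
    """Fraction of samples that are >= x, an estimate of P_>(x)."""
    return sum(1 for s in samples if s >= x) / len(samples)


samples = sample_power_law(alpha=2.5, x_min=1.0, n=100_000)
alpha_hat = estimate_alpha(samples, x_min=1.0)
# Slope of log P_>(x) against log x between x = 2 and x = 8.
# Equation (4) predicts a value close to -(alpha - 1) = -1.5.
slope = (math.log(empirical_ccdf(samples, 8.0)) - math.log(empirical_ccdf(samples, 2.0))) / (
    math.log(8.0) - math.log(2.0)
)
print(alpha_hat, slope)
```

The estimate of α comes from the density, while the slope comes from the cumulative distribution, so the two printed values are expected to differ by about one in magnitude.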

Next, we list some examples of power laws in various phenomena.


Furthermore, we list some of the generating mechanisms that are important for applications, together with the phenomena in the above list to which they are applied, in the form "mechanism ⇒ phenomena."

- Yule process [20] ⇒ (e);
- Simon process [21] ⇒ (a), (b), (c), (e), and (g);
- Barabási–Albert (BA) model [22] ⇒ (d) and (f);
- Geometric Brownian motion (GBM) with a reflecting wall [23] ⇒ (a), (g), and (h);
- GBM with reset events [24, 25] ⇒ (g);
- Kesten process [26] ⇒ (g).


Although there are many other generating mechanisms<sup>3</sup>, the mechanisms in the above list are particularly well known and widely applied to phenomena in various fields.

In this paper, we focus on the generating mechanisms based on the stochastic processes in the above list<sup>4</sup>: growth and preferential attachment, and the stochastic models based on the GBM, which are particularly widely applied in social science. In addition, we explain the combination of exponentials, which is related to the mechanism of the Yule process. We give full details of the mathematical formalisms for these mechanisms in a self-contained manner, because understanding them is important for researchers in any field who wish to create new models that generate the power laws found in empirical data. The necessary mathematical supplements to understand these mechanisms are given in the Appendix at the end of this paper.

### 2. GROWTH AND PREFERENTIAL ATTACHMENT

As the name suggests, this mechanism consists of two characteristics: growth and preferential attachment. In the example of a city, the meanings of growth and preferential attachment are as follows.


In this section, we deal with the Yule process, the Simon process, and the BA model, which all have these two characteristics. The Yule process generates the power law about the number of species within genera in biology. The Simon process generates the power laws about the frequency of use of a word in a text, the population of cities, and so on (see the list in the Introduction for details). The BA model generates the power law about the number of edges incident to nodes in the network. We now explain in detail how these three mechanisms mathematically generate the power laws.

<sup>3</sup>Readers interested in more phenomena with power laws and their generating mechanisms should refer to the reviews and textbooks by Mitzenmacher [35], Newman [1], Sornette [36], Hayashi et al. [37], Farmer and Geanakoplos [38], Gabaix [39, 40], Simkin and Roychowdhury [41], Pinto et al. [42], Piantadosi [43], Machado et al. [44], and Slanina [45].

<sup>4</sup>Though the multiplicative process [46] is also the stochastic process, it is not explained in this paper because the multiplicative process is interpreted as the discrete-time version of the GBM of the continuous-time stochastic process. Namely, the multiplicative process is essentially equivalent to the GBM (see Appendix A.4).

<sup>5</sup>Preferential attachment is also called the Matthew effect [47] or the cumulative advantage [48].

#### 2.1. Yule Process

The Yule process [20] was introduced as a model of speciation in biology: it describes stochastic population growth with preferential attachment. In this process, new species and genera are born by biological mutations that are interpreted as the branchings from the lines of existing species in the evolutionary tree (**Figures 2**, **3**).

These branchings occur as Poisson processes and add lines of new genera or species to the evolutionary tree. The Yule process mathematically corresponds to the stochastic process in which the numbers of genera and species increase independently, each following a linear birth process (see Appendix A.3)<sup>6</sup>. In other words, we consider the evolutionary tree of species (**Figure 2**) and that of genera (**Figure 3**) separately.

In short, the Yule process is the combination of the stochastic processes for the numbers of species and genera (**Figure 4**) [41, 49, 50].


constant and n<sup>g</sup> is the number of genera within the family at that time.

To obtain the probability distribution of the number of species within genera at a large time<sup>7</sup> , we need the conditional probability distribution of the number of species included in the genus whose age (i.e., the time intervals elapsed since the birth) is t. Let us use rs(n, t) to denote its conditional probability distribution, where n (∈ N) is the number of species and t (∈ R) is the age of the genus.

First, rs(1, t) is equivalent to the probability that no new species is born in (a, a + t] after the genus is born<sup>8</sup> at an arbitrary time a. Accordingly, we obtain rs(1, t) from (A.2) as

$$r\_s(1, t) = \mathrm{P}\{\hat{N}\_s(a + t) - \hat{N}\_s(a) = 0; \text{ rate } \lambda\_s\} = \mathrm{e}^{-\lambda\_s t}, \tag{5}$$

where Nˆs(t) is the number of species born in (0, t] by the Poisson process with the Poisson rate λs.

<sup>6</sup>The characteristic of growth is the increase in the number of genera. The characteristic of preferential attachment is that the more species a genus contains, the more new species are born in it.

FIGURE 4 | An example of the evolutionary tree for the Yule process. The black solid lines show the branchings of species. The black broken lines show the branchings of genera. One genus is represented by the part surrounded by the red dotted lines. Although in this figure the probability of branching for a new genus seems to depend on the number of species in the original genus, the Poisson rate for the branching of a genus is in fact constant in the Yule process.

<sup>7</sup>We consider the probability distribution only at a large time for the stationary state.

<sup>8</sup>This new genus is equivalent to the first species born in its own genus. Therefore, the new genus is counted as one for the number of species.

Second, we calculate rs(2, t). It is equivalent to the probability that exactly one new species is born in (a, a + t] after the genus is born at an arbitrary time a. We assume that this new species is born in the infinitesimal time interval [a + τ1, a + τ1 + dτ1). After this birth the genus contains two species, so the rate becomes 2λs. From (A.2) and (A.3), we obtain the probabilities for one birth or no birth in each of the three divided time intervals:

$$\begin{aligned} \mathrm{P}\{\text{no birth in } (a, a + \tau\_1); \text{ rate } \lambda\_s\} &= \mathrm{e}^{-\lambda\_s \tau\_1}, \\ \mathrm{P}\{\text{one birth in } [a + \tau\_1, a + \tau\_1 + \mathrm{d}\tau\_1); \text{ rate } \lambda\_s\} &= \mathrm{P}\{\hat{N}\_s (a + \tau\_1 + \mathrm{d}\tau\_1) - \hat{N}\_s (a + \tau\_1) = 1; \text{ rate } \lambda\_s\} \\ &= \mathrm{e}^{-\lambda\_s \mathrm{d}\tau\_1} \lambda\_s \mathrm{d}\tau\_1 \simeq \lambda\_s \mathrm{d}\tau\_1, \\ \mathrm{P}\{\text{no birth in } [a + \tau\_1 + \mathrm{d}\tau\_1, a + t]; \text{ rate } 2\lambda\_s\} &= \mathrm{e}^{-2\lambda\_s (t - \tau\_1 - \mathrm{d}\tau\_1)} \simeq \mathrm{e}^{-2\lambda\_s (t - \tau\_1)}. \end{aligned} \tag{6}$$

Integrating the product of these probabilities with respect to τ1, we obtain

$$r\_s(2,t) = \int\_0^t \mathrm{e}^{-\lambda\_s \tau\_1} \lambda\_s \mathrm{e}^{-2\lambda\_s(t-\tau\_1)} \mathrm{d}\tau\_1 = \mathrm{e}^{-\lambda\_s t} (1 - \mathrm{e}^{-\lambda\_s t}).\tag{7}$$

Similarly, rs(3, t) is given by

$$\begin{split} r\_{s}(3,t) &= \int\_{0}^{t} \mathrm{e}^{-\lambda\_{s}\tau\_{1}} \lambda\_{s} \mathrm{d}\tau\_{1} \int\_{\tau\_{1}}^{t} \mathrm{e}^{-2\lambda\_{s}(\tau\_{2}-\tau\_{1})} (2\lambda\_{s}) \mathrm{e}^{-3\lambda\_{s}(t-\tau\_{2})} \mathrm{d}\tau\_{2} \\ &= \mathrm{e}^{-\lambda\_{s}t} (1 - \mathrm{e}^{-\lambda\_{s}t})^{2}. \end{split} \tag{8}$$

Finally, repeating the same procedure, we obtain rs(n, t), that is, the conditional probability distribution of the number of species included in the genus at the age of t. Here the product denotes nested integrals, in which the integral over τk is performed for each fixed τk−1:

$$\begin{split} r\_{s}(n,t) &= \mathrm{e}^{-n\lambda\_{s}t} \prod\_{k=1}^{n-1} \left[ \int\_{\tau\_{k-1}}^{t} \mathrm{e}^{\lambda\_{s}\tau\_{k}} k \lambda\_{s} \mathrm{d}\tau\_{k} \right] \quad (\tau\_{0}:=0)\\ &= \mathrm{e}^{-n\lambda\_{s}t} (n-1)! \prod\_{k=1}^{n-1} \left[ \int\_{x\_{k-1}}^{\mathrm{e}^{\lambda\_{s}t}} \mathrm{d}x\_{k} \right] \quad (x\_{k}:= \mathrm{e}^{\lambda\_{s}\tau\_{k}},\ x\_{0}:= 1)\\ &= \mathrm{e}^{-\lambda\_{s}t} (1 - \mathrm{e}^{-\lambda\_{s}t})^{n-1}. \end{split} \tag{9}$$

Next, let ℓg(t) be the probability density function for the age of genera at a large time in the linear birth process. It is given by (A.15) as

$$\ell\_{g}(t) = \lambda\_{g} \mathrm{e}^{-\lambda\_{g}t}.\tag{10}$$

Consequently, the probability distribution of the number of species within genera at a large time, denoted by q(n), is given by integrating the product of the conditional probability distribution of the number of species within a genus of age t and the probability density function for the age of genera at a large time:

$$\begin{split} q(n) &= \int\_{0}^{\infty} r\_{s}(n, t) \ell\_{g}(t) \mathrm{d}t = \int\_{0}^{\infty} \mathrm{e}^{-\lambda\_{s}t} (1 - \mathrm{e}^{-\lambda\_{s}t})^{n-1} \lambda\_{g} \mathrm{e}^{-\lambda\_{g}t} \mathrm{d}t \\ &= \frac{\lambda\_{g}}{\lambda\_{s}} \int\_{0}^{1} x^{\frac{\lambda\_{g}}{\lambda\_{s}}} (1 - x)^{n-1} \mathrm{d}x \quad (x := \mathrm{e}^{-\lambda\_{s}t}) \\ &= \frac{\lambda\_{g}}{\lambda\_{s}} \mathrm{B} \left( \frac{\lambda\_{g}}{\lambda\_{s}} + 1, n \right), \end{split} \tag{11}$$

where the beta function B(a, b) is defined as

$$\mathrm{B}(a,b) := \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)} = \int\_0^1 x^{a-1}(1-x)^{b-1} \mathrm{d}x \quad \left(\Gamma(a) := \int\_0^\infty t^{a-1} \mathrm{e}^{-t} \mathrm{d}t\right). \tag{12}$$

When b takes a large value, the beta function is approximately

$$\text{B}(a,b) \propto b^{-a} \quad \text{( $b \gg 1$ )}.\tag{13}$$

Therefore, for a large number of species, the probability distribution of the number of species within genera at a large time satisfies the power law

$$q(n) \propto n^{-\left(\frac{\lambda\_{g}}{\lambda\_{s}} + 1\right)} \quad (n \gg 1),\tag{14}$$

where the exponent of the power law is λg/λs + 1.
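The approach of q(n) to a power law can be inspected numerically. The sketch below evaluates the beta-function expression of Equation (11) through the log-gamma function and measures the local log–log slope; the rates λg = 1 and λs = 2 are illustrative values chosen here, not taken from the text.

```python
import math


def yule_q(n, lam_g, lam_s):
    """q(n) = (lam_g / lam_s) * B(lam_g / lam_s + 1, n), as in Equation (11)."""
    rho = lam_g / lam_s
    log_beta = math.lgamma(rho + 1.0) + math.lgamma(n) - math.lgamma(rho + 1.0 + n)
    return rho * math.exp(log_beta)


def local_exponent(n, lam_g, lam_s):
    """Minus the log-log slope of q between n and 2n."""
    ratio = math.log(yule_q(2 * n, lam_g, lam_s)) - math.log(yule_q(n, lam_g, lam_s))
    return -ratio / math.log(2.0)


lam_g, lam_s = 1.0, 2.0  # illustrative rates (assumption of this sketch)
total = sum(yule_q(n, lam_g, lam_s) for n in range(1, 200_001))
exponent = local_exponent(10_000, lam_g, lam_s)
print(total)     # close to 1; the remainder is the truncated tail
print(exponent)  # close to lam_g / lam_s + 1 = 1.5
```

Working with `math.lgamma` rather than the gamma function itself avoids overflow for large n.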

#### 2.2. Simon Process

The Simon process [21] is interpreted as a discrete-time stochastic process for the growth in the numbers of urns and of balls contained in those urns: an urn and the number of balls in the urn correspond to a word and the number of times that the word is used. In this stochastic process, at each time step a certain number of balls are newly added and stochastically distributed to the existing urns, which already contain some balls. After that, one new urn containing a certain number of balls (not necessarily the same as the number of balls added above) is also added. Repeating this procedure, the numbers of balls and urns grow stochastically.

We calculate the stationary probability distribution of balls contained in urns at a large time.

First, we define all quantities for the Simon process by using the following notation:


Next, we provide the detailed procedure with the stochastic rule as follows (**Figure 5**).


 

 

Then, from the above stochastic rule, we obtain the expectation values E[fˆ(k, t + 1)] (k ≥ k0) as

$$\begin{aligned} \mathrm{E}[\hat{f}(k, t+1)] &= \hat{f}(k, t) - \frac{mk\hat{f}(k, t)}{B(t)} + \frac{m(k-1)\hat{f}(k-1, t)}{B(t)} \\ &\quad \text{( $k > k\_0$ )}, \end{aligned}$$

$$\mathrm{E}[\hat{f}(k\_0, t+1)] = \hat{f}(k\_0, t) - \frac{mk\_0\hat{f}(k\_0, t)}{B(t)} + 1. \tag{15}$$

At a large time t, we can make the approximation fˆ(k, t) ≃ E[fˆ(k, t)] for k ≥ k0 and obtain

$$\begin{aligned} \mathrm{E}[\hat{f}(k,t+1)] &\simeq \mathrm{E}[\hat{f}(k,t)] - \frac{mk\,\mathrm{E}[\hat{f}(k,t)]}{B(t)} + \frac{m(k-1)\,\mathrm{E}[\hat{f}(k-1,t)]}{B(t)} \quad (k > k\_0), \\ \mathrm{E}[\hat{f}(k\_0, t+1)] &\simeq \mathrm{E}[\hat{f}(k\_0, t)] - \frac{mk\_0\,\mathrm{E}[\hat{f}(k\_0, t)]}{B(t)} + 1. \end{aligned} \tag{16}$$

The probability distribution of the number of balls in urns, denoted by p(k, t), can be expressed in terms of E[fˆ(k, t)]:

$$p(k,t) = \frac{\operatorname{E}[\hat{f}(k,t)]}{U(t)}.\tag{17}$$

Consequently, the master equation for p(k, t) is given by

$$\begin{aligned} U(t+1)p(k,t+1) &= U(t)p(k,t) - \frac{mkU(t)}{B(t)}p(k,t) + \frac{m(k-1)U(t)}{B(t)}p(k-1,t) \quad (k > k\_0), \\ U(t+1)p(k\_0, t+1) &= U(t)p(k\_0, t) - \frac{mk\_0U(t)}{B(t)}p(k\_0, t) + 1. \end{aligned} \tag{18}$$

We are interested in only the stationary distribution function p(k) that is defined as p(k, t) in the limit of large time:

$$\mathfrak{p}(k) := \lim\_{t \to \infty} \mathfrak{p}(k, t). \tag{19}$$

Then, considering

 

$$\lim\_{t \to \infty} \frac{U(t)}{B(t)} = \frac{1}{m + k\_0} \tag{20}$$

and taking the limit t → ∞ in Equation (18) with U(t + 1) = U(t) + 1 (one urn is added at each time step), we obtain

$$\begin{cases} p(k) = \dfrac{k-1}{k+1+\frac{k\_0}{m}}\, p(k-1) & (k > k\_0), \\ p(k\_0) = \dfrac{m+k\_0}{k\_0(m+1)+m}. \end{cases} \tag{21}$$

We can solve these equations recursively:

$$\begin{split} p(k) &= \frac{(k-1)(k-2)\cdots k\_{0}}{\left(k+1+\frac{k\_{0}}{m}\right)\left(k+\frac{k\_{0}}{m}\right)\cdots\left(k\_{0}+2+\frac{k\_{0}}{m}\right)} p(k\_{0}) \\\\ &= \frac{(k-1)(k-2)\cdots k\_{0}}{\left(k-1+\alpha\right)\left(k-2+\alpha\right)\cdots\left(k\_{0}+\alpha\right)} p(k\_{0}) \quad \left(\alpha := 2+\frac{k\_{0}}{m}\right) \\\\ &= \frac{\Gamma(k)\Gamma(k\_{0}+\alpha)}{\Gamma(k\_{0})\Gamma(k+\alpha)} p(k\_{0}) \\\\ &= \frac{\mathrm{B}(k,\alpha)}{\mathrm{B}(k\_{0},\alpha)} p(k\_{0}). \end{split} \tag{22}$$

For large k, the stationary probability distribution of the number of balls in urns satisfies the power law

$$p(k) \propto k^{-\left(\frac{k\_0}{m} + 2\right)} \quad (k \gg 1),\tag{23}$$

where the exponent of the power law is k0/m + 2.
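As a consistency check, the recursion of Equation (21) can be iterated directly and compared with the closed form of Equation (22). This is a minimal sketch; the function names and the parameter values k0 = 1, m = 1 are illustrative assumptions.

```python
import math


def simon_recursive(k_max, k0, m):
    """Stationary p(k) for k0 <= k <= k_max from the recursion in Equation (21)."""
    p = {k0: (m + k0) / (k0 * (m + 1) + m)}
    for k in range(k0 + 1, k_max + 1):
        p[k] = (k - 1) / (k + 1 + k0 / m) * p[k - 1]
    return p


def simon_closed_form(k, k0, m):
    """Stationary p(k) from the beta-function expression in Equation (22)."""
    alpha = 2.0 + k0 / m
    log_ratio = (math.lgamma(k) + math.lgamma(k0 + alpha)
                 - math.lgamma(k0) - math.lgamma(k + alpha))
    return math.exp(log_ratio) * (m + k0) / (k0 * (m + 1) + m)


k0, m = 1, 1  # illustrative values; here alpha = 3
p = simon_recursive(2000, k0, m)
print(p[10], simon_closed_form(10, k0, m))  # the two values should agree
print(sum(p.values()))                      # close to 1
```

The sum over k is close to one, which is consistent with p(k) being a probability distribution over urn sizes.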

#### 2.3. Barabási–Albert Model

The BA model [22] is one of the scale-free network models for the growth in the numbers of nodes and edges. Mathematically, the BA model can be interpreted as a special case of the Simon model. In particular, the nodes and edges in the BA model correspond to the urns and balls in the Simon model, respectively (**Figure 6**). In this model, one node with a certain number of edges is newly added at each time step. Then, following a stochastic rule, the edges of the new node are connected to the existing nodes. Repeating this procedure, the numbers of nodes and edges grow stochastically.

<sup>9</sup>Since we finally take the limit t → ∞, the initial state does not actually affect the stationary state. However, to make it easier to imagine the procedure, we set the initial state in this manner.

<sup>10</sup>This shows the characteristic of growth.

<sup>11</sup>This shows the characteristic of preferential attachment.

<sup>12</sup>When distributing balls in the group-(k), we do not set the probability that each urn in the group gets one ball. To obtain a master equation later, we only have to know the number of the balls distributed to the group-(k) under the condition that each urn can only get one ball at most. Namely, setting those probabilities is equivalent to imposing too strong a condition to obtain the master equation.

<sup>13</sup>Though one urn can get two or more balls, this possibility is small enough in the limit of large time. This is because the number of urns is large enough in a large time so that this possibility is ignored. Similarly, though more balls can be distributed than the number of urns in a group-(k), this possibility is also small enough in the limit of large time.

We calculate the stationary probability distribution of edges connecting to nodes at a large time. First, we define all quantities for the BA model by using the following notation:


Next, we give the detailed procedure with the stochastic rule for the BA model as follows (**Figure 7**).


Consequently, we obtain the same master equation for the probability distribution of edges as (18) with m = k0:

$$\begin{aligned} U(t+1)p(k,t+1) &= U(t)p(k,t) - \frac{k\_0 k U(t)}{B(t)} p(k,t) + \frac{k\_0 (k-1) U(t)}{B(t)} p(k-1,t) \quad (k > k\_0), \\ U(t+1)p(k\_0, t+1) &= U(t)p(k\_0, t) - \frac{k\_0^2 U(t)}{B(t)} p(k\_0, t) + 1. \end{aligned} \tag{24}$$

We can solve this master equation in the same way as in Equations (21–23) and obtain the stationary distribution function p(k) := lim<sub>t→∞</sub> p(k, t) for large k:

$$p(k) \propto k^{-\alpha} \quad \left(\alpha := 2 + \frac{k\_0}{k\_0} = 3\right),\tag{25}$$

where the exponent of power law is 3.
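The growth rule described in this section can be illustrated with a small simulation. The sketch below is not the authors' procedure; it assumes a complete graph on k0 + 1 nodes as the initial state and allows a new node to pick the same target more than once, which becomes rare as the network grows. All names and parameter values are illustrative.

```python
import random
from collections import Counter


def simulate_ba(n_nodes, k0, seed=0):
    """Grow a network by preferential attachment and return the list of node degrees."""
    rng = random.Random(seed)
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list selects a node with probability proportional to
    # its degree.
    endpoints = []
    for i in range(k0 + 1):
        for j in range(i + 1, k0 + 1):
            endpoints += [i, j]
    for new in range(k0 + 1, n_nodes):
        # Draw all targets before adding the new edges, so the new node
        # cannot attach to itself.
        targets = [rng.choice(endpoints) for _ in range(k0)]
        for target in targets:
            endpoints += [new, target]
    degree = Counter(endpoints)
    return [degree[node] for node in range(n_nodes)]


degrees = simulate_ba(n_nodes=20_000, k0=3)
for k in (3, 6, 12, 24):
    tail = sum(1 for d in degrees if d >= k) / len(degrees)
    # For p(k) ~ k^(-3), the tail fraction is expected to fall roughly as k^(-2).
    print(k, tail)
```

Storing one list entry per edge endpoint is a common way to sample in proportion to degree without recomputing weights at every step.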

### 3. STOCHASTIC MODELS BASED ON GEOMETRIC BROWNIAN MOTION

In this section, we look at five stochastic processes that generate power laws and can be represented by stochastic differential equations (SDEs). They are all mathematically based on the GBM, accompanied by either a constraint (i.e., an additional condition) or additional terms in the SDE. The constraints correspond to a reflecting wall<sup>19</sup> as a boundary condition [23] and reset events (i.e., a birth and death process<sup>20</sup>) [25]. The stochastic processes with additional terms in the SDE of the GBM are the Kesten process, the generalized Lotka–Volterra (GLV) model, and the BM model. Though the effect of the additional term in the Kesten process is similar to that of a reflecting wall, the additional terms of the GLV model and the BM model correspond to interactions between particles, agents, or individuals. We mainly explain the mathematical formalisms and properties of these qualitatively different stochastic processes.

<sup>14</sup>Since we finally take the limit t → ∞ as in the Simon model, the initial state does not actually affect the stationary state.

<sup>15</sup>This shows the characteristic of growth.

<sup>16</sup>This shows the characteristic of preferential attachment.

<sup>17</sup>This setting of probability is equivalent to the balls distributed to the group-(k) being further distributed to the urns within the group with equal probabilities in step 4 in the Simon model. That is, the stochastic rule for the BA model is stronger than that of the Simon model as a condition.

<sup>18</sup>Though one node can actually connect two or more nodes, this possibility is small enough in the limit of large time. This is because the number of nodes is large enough in a large time so that this possibility is ignored.

<sup>19</sup>The reflecting wall means that there is the minimum value for a random variable (e.g., population of a city).

#### 3.1. Geometric Brownian Motion

The GBM, on which many models for power laws are based, is one of the most important stochastic processes. It is mathematically defined by the SDE

$$\mathbf{d}\hat{X}(t) = \mu \hat{X}(t)\mathbf{d}t + \sigma \hat{X}(t)\mathbf{d}\hat{B}(t),\tag{26}$$

where Bˆ(t) is a standard Brownian motion, µ is the drift, and σ is the volatility.

The SDE (Equation 26) gives us the partial differential equation (PDE), that is, the Fokker–Planck equation (FPE) [51]:

$$\frac{\partial p(x,t)}{\partial t} = -\frac{\partial}{\partial x} \{\mu x p(x,t)\} + \frac{\partial^2}{\partial x^2} \left\{\frac{\sigma^2 x^2}{2} p(x,t)\right\},\tag{27}$$

where p(x, t) is the probability density function. The solution of Equation (27) with the initial distribution p(x, 0) = δ(x − x0) is

$$p(x,t) = \frac{1}{x\sqrt{2\pi\sigma^2 t}} \exp\left[-\frac{\left\{\log x - \log x\_0 - \left(\mu - \frac{\sigma^2}{2}\right)t\right\}^2}{2\sigma^2 t}\right],\tag{28}$$

where x0 is the initial position of the particle. This solution is the log-normal distribution, whose expectation value and variance are

$$\mathrm{E}[\hat{X}(t)] = x\_0 \mathrm{e}^{\mu t}, \quad \mathrm{Var}[\hat{X}(t)] = x\_0^2 \mathrm{e}^{2\mu t} (\mathrm{e}^{\sigma^2 t} - 1). \tag{29}$$

In the limit t → ∞, the log-normal distribution does not converge to a stationary solution. To obtain one, we therefore need to impose additional conditions on the SDE (Equation 26) or modify the SDE itself. We introduce these conditions and modifications in the following sections.
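The moments in Equation (29) can be compared with samples of the GBM. The sketch below assumes the well-known exact solution X(t) = x0 exp((µ − σ²/2)t + σB(t)) of Equation (26), which is not written out in the text; the parameter values are illustrative.

```python
import math
import random


def sample_gbm(x0, mu, sigma, t, n, seed=0):
    """Draw n samples of X(t) from the exact solution of the GBM,
    X(t) = x0 * exp((mu - sigma^2 / 2) t + sigma * B(t)), with B(t) ~ N(0, t)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * t
    scale = sigma * math.sqrt(t)
    return [x0 * math.exp(drift + scale * rng.gauss(0.0, 1.0)) for _ in range(n)]


def gbm_mean(x0, mu, t):
    """Expectation value from Equation (29)."""
    return x0 * math.exp(mu * t)


def gbm_variance(x0, mu, sigma, t):
    """Variance from Equation (29)."""
    return x0 ** 2 * math.exp(2.0 * mu * t) * (math.exp(sigma ** 2 * t) - 1.0)


x0, mu, sigma, t = 1.0, 0.05, 0.2, 1.0  # illustrative values
xs = sample_gbm(x0, mu, sigma, t, n=200_000)
sample_mean = sum(xs) / len(xs)
sample_var = sum((x - sample_mean) ** 2 for x in xs) / len(xs)
print(sample_mean, gbm_mean(x0, mu, t))
print(sample_var, gbm_variance(x0, mu, sigma, t))
# The variance keeps growing with t, so no stationary distribution is reached.
print([gbm_variance(x0, mu, sigma, s) for s in (1.0, 10.0, 100.0)])
```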

#### 3.2. GBM With a Reflecting Wall

We consider the GBM with the reflecting wall (see Appendix A.5 for details). The stationary solution p(x) for the FPE (Equation 27) is defined by

$$\frac{\partial p(\mathbf{x})}{\partial t} = \mathbf{0},\tag{30}$$

which is equivalent to the second-order ordinary differential equation (ODE):

$$0 = -\frac{\mathrm{d}}{\mathrm{d}x} \{\mu x p(x)\} + \frac{\mathrm{d}^2}{\mathrm{d}x^2} \left\{ \frac{\sigma^2 x^2}{2} p(x) \right\}. \tag{31}$$

Integrating once with respect to x, we obtain the first-order ODE:

$$\mu x p(x) - \frac{\mathrm{d}}{\mathrm{d}x} \left\{ \frac{\sigma^2 x^2}{2} p(x) \right\} = D,\tag{32}$$

where D is an arbitrary constant. We take D = 0 to obtain a normalizable power-law probability distribution. The solution of Equation (32) with D = 0 is

$$p(\mathbf{x}) = \mathbf{C} \mathbf{x}^{-\alpha} \quad \left(\text{C} := p(\mathbf{x}\_0) \mathbf{x}\_0^{\alpha}, \ \alpha := 2 - \frac{2\mu}{\sigma^2}\right), \tag{33}$$

where x0 is an arbitrary constant. For this stationary solution p(x) to exist, it must satisfy the normalization condition:

$$1 = \int\_{x\_{\min}}^{x\_{\max}} p(x)\mathrm{d}x.\tag{34}$$

We set the reflecting wall at x = xmin (> 0) and take xmax = ∞. The existence of the reflecting wall is mathematically equivalent to the conditions Xˆ(t) ≥ xmin and p(x) = 0 for x < xmin. Then we assume α > 1. The normalization condition

$$1 = \int\_{x\_{\min}}^{\infty} p(x)\mathrm{d}x = \frac{C}{\alpha - 1} (x\_{\min})^{-\alpha + 1} \tag{35}$$

determines the constant C as

$$C = (\alpha - 1)(x\_{\min})^{\alpha - 1}.\tag{36}$$

Thus, we have the normalized stationary solution

$$p(x) = (\alpha - 1)(x\_{\min})^{\alpha - 1} x^{-\alpha} \quad \left(\alpha = 2 - \frac{2\mu}{\sigma^2} > 1\right), \tag{37}$$

where the exponent of the power law is 2 − 2µ/σ<sup>2</sup>.

<sup>20</sup>The birth and death process means that a new unit (e.g., city or firm) can be born at a rate and die at the same rate.
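The stationary solution can be checked numerically against Equation (32) with D = 0 and against the normalization condition. The sketch below uses the constant that follows from Equation (35), C = (α − 1) xmin^(α − 1); the finite-difference step, the integration grid, and the parameter values are assumptions made for illustration.

```python
import math


def wall_exponent(mu, sigma):
    """Exponent alpha = 2 - 2 mu / sigma^2 of Equation (33)."""
    return 2.0 - 2.0 * mu / sigma ** 2


def wall_pdf(x, mu, sigma, x_min):
    """Normalized stationary density for x >= x_min, assuming alpha > 1."""
    alpha = wall_exponent(mu, sigma)
    if x < x_min:
        return 0.0
    return (alpha - 1.0) * x_min ** (alpha - 1.0) * x ** (-alpha)


def probability_flux(x, mu, sigma, x_min, h=1e-4):
    """Left-hand side of Equation (32), using a central finite difference."""
    def g(z):
        return 0.5 * sigma ** 2 * z ** 2 * wall_pdf(z, mu, sigma, x_min)
    return mu * x * wall_pdf(x, mu, sigma, x_min) - (g(x + h) - g(x - h)) / (2.0 * h)


def total_probability(mu, sigma, x_min, log_range=40.0, steps=40_000):
    """Midpoint rule for the integral of the density, in the variable u = log x."""
    du = log_range / steps
    u0 = math.log(x_min)
    total = 0.0
    for i in range(steps):
        x = math.exp(u0 + (i + 0.5) * du)
        total += wall_pdf(x, mu, sigma, x_min) * x * du
    return total


mu, sigma, x_min = -0.02, 0.2, 1.0  # illustrative values giving alpha = 3
print(wall_exponent(mu, sigma))
print(probability_flux(2.0, mu, sigma, x_min))  # close to 0
print(total_probability(mu, sigma, x_min))      # close to 1
```

A negative drift µ is used here so that α > 1 holds and the density is normalizable.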

Next, we generalize this formalism from the GBM to an Itô process that can have a stationary solution [52]:

$$\mathrm{d}\hat{X}(t) = a(\hat{X}(t))\mathrm{d}t + b(\hat{X}(t))\mathrm{d}\hat{B}(t). \tag{38}$$

The stationary solution (see Appendix A.5 for details) is given by

$$p(\mathbf{x}) = \frac{C}{b(\mathbf{x})^2} \exp\left[\int\_{\mathbf{x}\_0}^{\mathbf{x}} \frac{2a(\mathbf{x}')}{b(\mathbf{x}')^2} d\mathbf{x}'\right],\tag{39}$$

where C is the normalization constant. Following Yakovenko and Rosser [53] and Banerjee and Yakovenko [54], we take a(x) and b(x) as

$$a(\mathbf{x}) = \mu \mathbf{x} + \mu^\*, \ b(\mathbf{x}) = \sigma \sqrt{2(\mathbf{x}^2 + \mathbf{x}^{\*2})},\tag{40}$$

which is interpreted as a kind of qualitative combination of the generalized Wiener process<sup>21</sup> and GBM. Consequently, we obtain the stationary solution

$$p(x) = C \left[ 1 + \left( \frac{x}{x^\*} \right)^2 \right]^{\frac{\mu}{2\sigma^2} - 1} \exp\left[ \frac{\mu^\*}{\sigma^2 x^\*} \arctan\left(\frac{x}{x^\*}\right) \right]. \tag{41}$$

For x ≪ x∗, the stationary solution becomes the exponential distribution, while for large x it satisfies the power law

$$p(x) \propto x^{-\left(2-\frac{\mu}{\sigma^2}\right)} \quad (x \gg x^\*),\tag{42}$$

where the exponent of the power law is 2 − µ/σ<sup>2</sup>.
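As a worked step, the integral in Equation (39) can be evaluated explicitly for the coefficients of Equation (40). Assuming the lower limit x0 = 0 (any other choice only changes the constant C), we have

$$\int\_{0}^{x} \frac{2a(x')}{b(x')^2} \mathrm{d}x' = \int\_{0}^{x} \frac{\mu x' + \mu^\*}{\sigma^2 (x'^2 + x^{\*2})} \mathrm{d}x' = \frac{\mu}{2\sigma^2} \log\left[1 + \left(\frac{x}{x^\*}\right)^2\right] + \frac{\mu^\*}{\sigma^2 x^\*} \arctan\left(\frac{x}{x^\*}\right).$$

Exponentiating this result and multiplying by 1/b(x)<sup>2</sup> ∝ [1 + (x/x∗)<sup>2</sup>]<sup>−1</sup> reproduces Equation (41), with all constant factors absorbed into C.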

#### 3.3. GBM With Reset Events

We consider particles that follow the GBM with reset events, that is, birth and death events<sup>22</sup>. For simplicity, we assume that particles can disappear with a certain probability by following a Poisson process and immediately reappear at a fixed point, so that the number of particles is conserved. Because of these reset events, the FPE (Equation 27) is changed into

$$\begin{split} \frac{\partial p(\mathbf{x},t)}{\partial t} &= -\frac{\partial}{\partial \mathbf{x}} \{ \mu \mathbf{x} p(\mathbf{x},t) \} + \frac{\partial^2}{\partial \mathbf{x}^2} \left\{ \frac{\sigma^2 \mathbf{x}^2}{2} p(\mathbf{x},t) \right\} \\ &+ \eta \delta(\mathbf{x} - \mathbf{x}^\*) - \eta p(\mathbf{x},t), \end{split} \tag{43}$$

where η is the probability per unit time that a particle in [x, x + dx) disappears, and the particle reappears immediately at x = x∗ (> 0). Accordingly, we obtain the second-order ODE for the stationary solution p(x):

$$0 = -\frac{\mathrm{d}}{\mathrm{d}x} \{\mu x p(x)\} + \frac{\mathrm{d}^2}{\mathrm{d}x^2} \left\{\frac{\sigma^2 x^2}{2} p(x)\right\} - \eta p(x),\tag{44}$$

which holds except at x = x∗. To solve this equation easily, we change the variable x into y := log x. The new probability density function q(y) is determined by

$$q(y) = p(x) \left| \frac{\mathrm{d}x}{\mathrm{d}y} \right|. \tag{45}$$

Then we obtain the ODE for q(y):

$$0 = -\left(\mu - \frac{\sigma^2}{2}\right) \frac{\mathrm{d}q(\mathsf{y})}{\mathrm{d}\mathsf{y}} + \frac{\sigma^2}{2} \frac{\mathrm{d}^2 q(\mathsf{y})}{\mathrm{d}\mathsf{y}^2} - \eta q(\mathsf{y}), \tag{46}$$

except at y = y∗ (y∗ := log x∗). The general solution of this second-order ODE is

$$\begin{cases} q(y) = C\_1 \mathrm{e}^{\lambda\_1 y} + C\_2 \mathrm{e}^{\lambda\_2 y}, \\ \lambda\_1 = \frac{1}{\sigma^2} \left( \mu - \frac{\sigma^2}{2} + \sqrt{\left(\mu - \frac{\sigma^2}{2}\right)^2 + 2\sigma^2 \eta} \right) > 0, \\ \lambda\_2 = \frac{1}{\sigma^2} \left( \mu - \frac{\sigma^2}{2} - \sqrt{\left(\mu - \frac{\sigma^2}{2}\right)^2 + 2\sigma^2 \eta} \right) < 0, \end{cases} \tag{47}$$

where C1 and C2 are arbitrary constants determined by the normalization condition:

$$1 = \int\_0^\infty p(x) \mathrm{d}x = \int\_{-\infty}^\infty q(y) \mathrm{d}y. \tag{48}$$

To normalize the solution (Equation 47), we impose the boundary conditions q(∞) = q(−∞) = 0, which result in C1 = 0 for y ≥ y∗ and C2 = 0 for y < y∗, that is,

$$q(y) = \begin{cases} C\_1\mathrm{e}^{\lambda\_1 y} & (y < y^\*), \\ C\_2\mathrm{e}^{\lambda\_2 y} & (y \ge y^\*). \end{cases} \tag{49}$$

Accordingly, the normalization condition

$$1 = \int\_{-\infty}^{y^\*} C\_1 \mathrm{e}^{\lambda\_1 y} \mathrm{d}y + \int\_{y^\*}^{\infty} C\_2 \mathrm{e}^{\lambda\_2 y} \mathrm{d}y \tag{50}$$

and the continuity condition at y = y∗, namely, C1e<sup>λ1y∗</sup> = C2e<sup>λ2y∗</sup>, give us the normalized solution of Equation (46) as

$$q(y) = \begin{cases} \frac{\lambda\_1 \lambda\_2}{\lambda\_2 - \lambda\_1} \mathrm{e}^{\lambda\_1(y - y^\*)} & (y < y^\*), \\ \frac{\lambda\_1 \lambda\_2}{\lambda\_2 - \lambda\_1} \mathrm{e}^{\lambda\_2(y - y^\*)} & (y \ge y^\*). \end{cases} \tag{51}$$

<sup>21</sup>The SDE of generalized Wiener process is represented by dXˆ (t) = adt + bdBˆ(t), where a and b are constants.

<sup>22</sup>Following Gabaix [39] and Toda [55], we derive the stationary probability density function.

Consequently, we obtain the solution of Equation (44):

$$p(\mathbf{x}) = \frac{q(\log \mathbf{x})}{\mathbf{x}} = \begin{cases} \frac{\lambda\_1 \lambda\_2}{\lambda\_2 - \lambda\_1} (\mathbf{x}^\*)^{-\lambda\_1} \mathbf{x}^{\lambda\_1 - 1} & (0 < \mathbf{x} < \mathbf{x}^\*), \\\\ \frac{\lambda\_1 \lambda\_2}{\lambda\_2 - \lambda\_1} (\mathbf{x}^\*)^{-\lambda\_2} \mathbf{x}^{\lambda\_2 - 1} & (\mathbf{x}^\* \le \mathbf{x}), \end{cases} \tag{52}$$

which is called the double Pareto distribution [25]. The exponents of the power laws are 1 − λ1 (for 0 < x < x∗) and 1 − λ2 (for x∗ ≤ x).

Next, we derive the probability density function (Equation 52) by another method [56]. The lifetimes of particles are independently distributed with the exponential distribution ℓLT(τ) = ηe<sup>−ητ</sup>, because the death events occur as a Poisson process with rate η, which has the time-reversal symmetry property. Accordingly, the ages of particles (i.e., the time intervals elapsed since their births) at a large time are also independently distributed with the exponential distribution:

$$\ell\_{\mathcal{A}}(t) = \eta \mathbf{e}^{-\eta t}. \tag{53}$$

The probability density function of particles of age t, as the conditional probability distribution, is given by the log-normal distribution (Equation 28) with x0 = x∗. Consequently, the probability density function of the coordinate of a particle at a large time, denoted by p(x), is given by integrating the product of Equations (53) and (28):

$$p(x) = \int\_0^\infty \eta \mathrm{e}^{-\eta t} \frac{1}{x \sqrt{2\pi \sigma^2 t}} \exp\left[ -\frac{\left\{ \log x - \log x^\* - \left( \mu - \frac{\sigma^2}{2} \right) t \right\}^2}{2\sigma^2 t} \right] \mathrm{d}t.\tag{54}$$

We can calculate this with the change of variable u<sup>2</sup> := t and the identity [35]

$$\int\_0^\infty \exp\left(-au^2 - \frac{b}{u^2}\right) \mathrm{d}u = \frac{1}{2} \sqrt{\frac{\pi}{a}} \exp(-2\sqrt{ab}).\tag{55}$$

Thus, we obtain the same result as Equation (52)<sup>23</sup> without solving the ODE (Equation 44).
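The agreement between the two derivations can also be checked numerically. The sketch below evaluates the integral of Equation (54) by the midpoint rule after the substitution t = u² and compares it with the closed form of Equation (52); the grid size, the cutoff `u_max`, and the parameter values are assumptions chosen for illustration.

```python
import math


def pareto_exponents(mu, sigma, eta):
    """lambda_1 > 0 and lambda_2 < 0 from Equation (47)."""
    nu = mu - 0.5 * sigma ** 2
    root = math.sqrt(nu ** 2 + 2.0 * sigma ** 2 * eta)
    return (nu + root) / sigma ** 2, (nu - root) / sigma ** 2


def double_pareto_pdf(x, mu, sigma, eta, x_star):
    """Closed form of Equation (52)."""
    lam1, lam2 = pareto_exponents(mu, sigma, eta)
    norm = lam1 * lam2 / (lam2 - lam1)
    lam = lam1 if x < x_star else lam2
    return norm * x_star ** (-lam) * x ** (lam - 1.0)


def mixture_pdf(x, mu, sigma, eta, x_star, u_max=8.0, steps=20_000):
    """Equation (54) by the midpoint rule after substituting t = u^2."""
    nu = mu - 0.5 * sigma ** 2
    log_ratio = math.log(x / x_star)
    prefactor = 2.0 * eta / (x * math.sqrt(2.0 * math.pi * sigma ** 2))
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        t = u * u
        exponent = -eta * t - (log_ratio - nu * t) ** 2 / (2.0 * sigma ** 2 * t)
        # dt / sqrt(t) = 2 du, which removes the 1 / sqrt(t) singularity at t = 0.
        total += prefactor * math.exp(exponent) * du
    return total


mu, sigma, eta, x_star = 0.02, 0.3, 0.5, 1.0  # illustrative values
for x in (0.5, 2.0, 5.0):
    print(x, mixture_pdf(x, mu, sigma, eta, x_star), double_pareto_pdf(x, mu, sigma, eta, x_star))
```

The cutoff `u_max` has to be large enough that exp(−η u_max²) is negligible; for much smaller η it would need to be increased.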

#### 3.4. Kesten Process

The Kesten process [26] is defined as a stochastic process whereby an additional term is added to the SDE of the GBM; namely, the SDE is represented by

$$\mathbf{d}\hat{X}(t) = \mu \hat{X}(t)\mathbf{d}t + \sigma \hat{X}(t)\mathbf{d}\hat{B}(t) + \hat{c}\mathbf{d}t,\tag{56}$$

where cˆ, in the additional term, is a random variable. We can expect that the additional term prevents Xˆ(t) from decreasing toward −∞ in a similar way to the reflecting wall in section 3.2.

<sup>23</sup>The two solutions in Equation (52) result from √((log(x/x∗))<sup>2</sup>) = −log(x/x∗) for 0 < x < x∗, and √((log(x/x∗))<sup>2</sup>) = log(x/x∗) for x∗ ≤ x.

Here, we simply take cˆ as a positive constant: cˆ = c (> 0). We then obtain the FPE for the probability density function:

$$\frac{\partial p(\mathbf{x},t)}{\partial t} = -\frac{\partial}{\partial \mathbf{x}} \{ (\mu \mathbf{x} + c) p(\mathbf{x}, t) \} + \frac{\partial^2}{\partial \mathbf{x}^2} \left\{ \frac{\sigma^2 \mathbf{x}^2}{2} p(\mathbf{x}, t) \right\}. \tag{57}$$

The ODE for the stationary solution p(x) is given by

$$0 = -\frac{\mathrm{d}}{\mathrm{d}x} \{ (\mu x + c) p(x) \} + \frac{\mathrm{d}^2}{\mathrm{d}x^2} \left\{ \frac{(\sigma x)^2}{2} p(x) \right\}.\tag{58}$$

Consequently, we obtain the normalized stationary solution of Equation (58)<sup>24</sup>:

$$p(x) = \frac{1}{\Gamma(\alpha - 1)} \left(\frac{2c}{\sigma^2}\right)^{\alpha - 1} \exp\left[-\frac{2c}{\sigma^2 x}\right] x^{-\alpha} \quad \left(\alpha := 2 - \frac{2\mu}{\sigma^2}\right),\tag{59}$$

where Γ(α) is the gamma function defined in Equation (12). For large x, the stationary solution satisfies the power law

$$p(x) \propto x^{-\alpha} \quad (x \gg 1), \tag{60}$$

where the exponent of the power law is α = 2 − 2µ/σ<sup>2</sup>. Although the constant c in the additional term is what makes the stationary state possible, the exponent does not depend on it: the exponent is determined only by the drift µ and volatility σ of the GBM, and the additional term affects only the lower tail of the probability density function. These properties also hold when c is a random variable.
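The stationary density of Equation (59) can be checked numerically. The sketch below assumes the zero-flux form of Equation (58), as in Appendix A.5, and verifies that the density integrates to one and that the flux (µx + c)p − d/dx{(σx)²p/2} vanishes. Parameter values, function names, and the finite-difference step are illustrative assumptions.

```python
import math

def kesten_density(x, mu, sigma, c):
    """Stationary density of Equation (59); needs alpha > 1, i.e. mu < sigma^2 / 2."""
    alpha = 2.0 - 2.0 * mu / sigma**2
    k = 2.0 * c / sigma**2
    return k ** (alpha - 1.0) / math.gamma(alpha - 1.0) * math.exp(-k / x) * x ** (-alpha)

def flux(x, mu, sigma, c, h=1e-5):
    """Flux (mu*x + c)*p - d/dx{(sigma*x)^2 * p / 2}, derivative by central differences."""
    def diffusion(z):
        return 0.5 * (sigma * z) ** 2 * kesten_density(z, mu, sigma, c)
    drift = (mu * x + c) * kesten_density(x, mu, sigma, c)
    return drift - (diffusion(x + h) - diffusion(x - h)) / (2.0 * h)

def total_mass(mu, sigma, c, s_min=-10.0, s_max=15.0, steps=20_000):
    """Integral of the density over x = e^s, midpoint rule in s = log x."""
    h = (s_max - s_min) / steps
    total = 0.0
    for k in range(steps):
        x = math.exp(s_min + (k + 0.5) * h)
        total += kesten_density(x, mu, sigma, c) * x  # dx = x ds
    return total * h

mu, sigma, c = -0.5, 1.0, 1.0  # illustrative values giving alpha = 3
print(total_mass(mu, sigma, c))                       # close to 1
print([flux(x, mu, sigma, c) for x in (0.5, 1.0, 3.0)])  # each close to 0
```

Changing c in this sketch rescales the density through the factor 2c/σ² but leaves α unchanged, which is the point made in the text.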

#### 3.5. Generalized Lotka–Volterra Model

The GLV model was introduced for the analysis of individual income distribution. We consider the dynamical system composed of N agents (individuals) with incomes that grow by the GBM process and have the interactions for the redistribution of wealth [27–29]. The GLV model is represented by the system of SDEs called the GLV equations:

$$\mathrm{d}\hat{X}\_{i}(t) = \mu \hat{X}\_{i}(t)\mathrm{d}t + \sigma \hat{X}\_{i}(t)\mathrm{d}\hat{B}\_{i}(t) + \xi \hat{U}(t)\mathrm{d}t - \eta \hat{U}(t)\hat{X}\_{i}(t)\mathrm{d}t$$

$$\left(\hat{U}(t) := \frac{1}{N} \sum\_{i=1}^{N} \hat{X}\_{i}(t), \,\xi > 0\right), \tag{61}$$

where X̂<sub>i</sub>(t) is the individual income of agent i (i = 1, 2, · · · , N) at t, and Û(t) is the average income for the whole system. To keep X̂<sub>i</sub>(t) > 0, the third term on the RHS of Equation (61) redistributes a fraction of the total income for the whole system. This term can be interpreted as the effect of a tax or social security policy. The fourth term controls the growth of the whole system and can be interpreted as the effect of finiteness of resources, technological innovations, wars, natural disasters, and so on.

The GLV equations have no stationary solution, and the total income for the entire system is not constant with time. Here, we introduce the new random variable as the relative value:

$$
\hat{Y}\_i(t) := \frac{\hat{X}\_i(t)}{\hat{U}(t)}.\tag{62}
$$

<sup>24</sup>Following Slanina [45], we solve the ODE.

Then we obtain the SDEs for Ŷ<sub>i</sub>(t) as

$$\begin{split} \mathrm{d}\hat{Y}\_{i}(t) &= \frac{\mathrm{d}\hat{X}\_{i}(t)}{\hat{U}(t)} - \frac{\hat{X}\_{i}(t)\mathrm{d}\hat{U}(t)}{\hat{U}(t)^{2}} \\ &= \xi \{1 - \hat{Y}\_{i}(t)\} \mathrm{d}t + \sigma \,\hat{Y}\_{i}(t) \mathrm{d}\hat{B}\_{i}(t) - \frac{\sigma \,\hat{Y}\_{i}(t)}{N\hat{U}(t)} \sum\_{j=1}^{N} \hat{X}\_{j}(t) \mathrm{d}\hat{B}\_{j}(t), \end{split} \tag{63}$$

where the last term in the second row is of the order N<sup>−1/2</sup>, because the standard deviation of Σ<sub>j=1</sub><sup>N</sup> X̂<sub>j</sub>(t)dB̂<sub>j</sub>(t) is of the order √N.

We then take the large N limit as the mean field approximation and obtain the new system of SDEs:

$$\mathrm{d}\hat{Y}\_i(t) \simeq -\xi \,\hat{Y}\_i(t)\mathrm{d}t + \sigma \,\hat{Y}\_i(t)\mathrm{d}\hat{B}\_i(t) + \xi \,\mathrm{d}t,\tag{64}$$

which has the same form as the SDE of Equation (56) in the Kesten process. We can use the result of Equation (59) to obtain the normalized stationary probability density:

$$q(y) = \frac{1}{\Gamma(\alpha - 1)} \left(\frac{2\xi}{\sigma^2}\right)^{\alpha - 1} \exp\left[-\frac{2\xi}{\sigma^2 y}\right] y^{-\alpha} \quad \left(\alpha := 2 + \frac{2\xi}{\sigma^2}\right). \tag{65}$$

For large y, the stationary solution satisfies the power law as follows:

$$q(y) \propto y^{-\alpha} \quad (y \gg 1), \tag{66}$$

where the exponent of the power law is 2 + 2ξ/σ<sup>2</sup>. Consequently, by a change of variables and the mean field approximation, the GLV model with interactions gives us the same result as that obtained by the Kesten process without interactions.
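The correspondence with the Kesten process can be made explicit. As a worked check (a sketch that only matches coefficients, assuming ĉ is the positive constant c of Equation 57), comparing Equation (64) with Equation (56) term by term gives

$$\mu \to -\xi, \qquad c \to \xi,$$

so that the exponent and the scale factor of Equation (59) become

$$\alpha = 2 - \frac{2\mu}{\sigma^2} \to 2 + \frac{2\xi}{\sigma^2}, \qquad \frac{2c}{\sigma^2} \to \frac{2\xi}{\sigma^2},$$

which are the values appearing in Equation (65).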

#### 3.6. Bouchaud–Mézard Model

The BM model was introduced for the analysis of wealth distribution [30, 57–59]. We suppose there is an economic network composed of N agents (individuals) with wealth that grows by the GBM process and is redistributed by the exchanges between agents. The BM model is represented by the system of SDEs as follows:

$$\mathbf{d}\hat{X}\_i(t) = \mu \hat{X}\_i(t)\mathbf{d}t + \sigma \hat{X}\_i(t)\mathbf{d}\hat{B}\_i(t) + \sum\_{j(\neq i)} a\_{ij} (\hat{X}\_j(t) - \hat{X}\_i(t))\mathbf{d}t \tag{67}$$

where X̂<sub>i</sub>(t) is the individual wealth of agent i at t, and a<sub>ij</sub> is the positive exchange rate between agents i and j. The wealth is exchanged by the third term on the RHS of Equation (67), which can be interpreted as a kind of trading in the economic network.

For simplicity, we take a<sub>ij</sub> as the constant a/N (> 0) in preparation for the mean field approximation. Here, we again introduce the new random variables as the average of wealth and the relative value:

$$
\hat{Y}\_i(t) := \frac{\hat{X}\_i(t)}{\hat{U}(t)} \qquad \left(\hat{U}(t) := \frac{1}{N} \sum\_{i=1}^N \hat{X}\_i(t)\right). \tag{68}
$$

We then obtain the SDEs for Ŷ<sub>i</sub>(t) in the mean field approximation:

$$\begin{split} \mathrm{d}\hat{Y}\_{i}(t) &= \frac{\mathrm{d}\hat{X}\_{i}(t)}{\hat{U}(t)} - \frac{\hat{X}\_{i}(t)\mathrm{d}\hat{U}(t)}{\hat{U}(t)^{2}} \\ &\simeq -a\hat{Y}\_{i}(t)\mathrm{d}t + \sigma\,\hat{Y}\_{i}(t)\mathrm{d}\hat{B}\_{i}(t) + a\mathrm{d}t, \end{split} \tag{69}$$

which has the same form as the SDE of Equation (64) in the GLV model. Consequently, we obtain the normalized stationary solution:

$$q(\boldsymbol{y}) = \frac{1}{\Gamma(\alpha - 1)} \left(\frac{2a}{\sigma^2}\right)^{\alpha - 1} \exp\left[-\frac{2a}{\sigma^2 \boldsymbol{y}}\right] \boldsymbol{y}^{-\alpha} \quad \left(\alpha := 2 + \frac{2a}{\sigma^2}\right). \tag{70}$$

For large y, the stationary solution satisfies the following power law:

$$q(y) \propto y^{-\alpha} \quad (y \gg 1), \tag{71}$$

where the exponent of the power law is 2 + 2a/σ<sup>2</sup>. It is worth noting that though the forms of the additional terms in the GLV model and BM model are quantitatively different from those of the Kesten process, the results are eventually the same in the mean field approximation.
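The mean field SDE of Equation (69) is easy to simulate. The following is a minimal Euler–Maruyama sketch, not the authors' procedure: the step size, path count, seed, and parameter values are illustrative assumptions. It checks one simple property, namely that the relative wealth has mean 1, which is also the mean of the density in Equation (70).

```python
import math
import random

def simulate_mean_field(a, sigma, dt=0.01, steps=500, paths=1000, seed=0):
    """Euler-Maruyama paths of dY = a(1 - Y)dt + sigma*Y dB, all started at Y = 1."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    finals = []
    for _ in range(paths):
        y = 1.0
        for _ in range(steps):
            y += a * (1.0 - y) * dt + sigma * y * sqrt_dt * rng.gauss(0.0, 1.0)
        finals.append(y)
    return finals

a, sigma = 1.0, 0.5                 # illustrative values, not from the paper
alpha = 2.0 + 2.0 * a / sigma**2    # exponent in Equation (70)
samples = simulate_mean_field(a, sigma)
sample_mean = sum(samples) / len(samples)
print(alpha, sample_mean)           # the sample mean should be near 1
```

A histogram of `samples` on logarithmic axes could then be compared with the tail exponent α, although reliable tail estimates need many more paths than this sketch uses.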

#### 4. COMBINATION OF EXPONENTIALS

When we have a probability density or distribution function for a random variable, we can obtain a new distribution by a change of variable. In particular, we can obtain a power law function from an exponential distribution by taking a new variable as the exponential function of the original variable. This mechanism was used to interpret the observed power law for the frequency of use of words, under the assumption of random typing on a typewriter [31]. In this section, we first formalize this mechanism. We then give examples of applications to the Yule process and critical phenomena in physics.

#### 4.1. General Formalism

Suppose the probability density function for a continuous random variable x is given by

$$p(x) = A \mathrm{e}^{ax} \quad (A > 0).\tag{72}$$

We change the variable x into the new variable y as

$$y = B \mathrm{e}^{bx} \quad (B > 0).\tag{73}$$

Thus the new probability density function q(y) is obtained as

$$q(y) = p(x) \left| \frac{\mathrm{d}x}{\mathrm{d}y} \right| = \frac{A}{|b|B^{\frac{a}{b}}} y^{\frac{a}{b}-1} \propto y^{\frac{a}{b}-1},\tag{74}$$

where the exponent of the power law is a/b − 1.

Similarly, when x is a discrete random variable, the new probability distribution function q(y) is obtained as

$$q(y) = p(x) = \frac{A}{B^{\frac{a}{b}}} y^{\frac{a}{b}} \propto y^{\frac{a}{b}},\tag{75}$$

where the exponent of the power law is a/b.
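The continuous case of Equation (74) can be checked against a direct differentiation of the cumulative distribution. The sketch below assumes the normalized case a < 0 < b with x ≥ 0 and A = −a, which is one convenient choice and not a restriction stated in the text; all function names and parameter values are illustrative.

```python
import math

def q_closed_form(y, A, a, B, b):
    """Density of y = B*exp(b*x) when p(x) = A*exp(a*x), as in Equation (74)."""
    return A / (abs(b) * B ** (a / b)) * y ** (a / b - 1.0)

def cdf_y(y, a, B, b):
    """CDF of y for the normalized case p(x) = -a*exp(a*x) on x >= 0, with a < 0 < b."""
    x = math.log(y / B) / b
    return 1.0 - math.exp(a * x)

def q_numeric(y, a, B, b, h=1e-6):
    """Central-difference derivative of the CDF of y."""
    return (cdf_y(y + h, a, B, b) - cdf_y(y - h, a, B, b)) / (2.0 * h)

a, B, b = -1.5, 2.0, 0.5   # illustrative values; a/b - 1 = -4
A = -a                     # normalizes p(x) on x >= 0
for y in (2.5, 5.0, 20.0):
    print(y, q_closed_form(y, A, a, B, b), q_numeric(y, a, B, b))
```

With these values the closed form reduces to 24·y<sup>−4</sup>, so the two printed columns should coincide.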

#### 4.2. Application to Yule Process

The power law of the Yule process can be interpreted using a combination of exponentials with a rough approximation [41]. Firstly, by changing the Poisson rate λ<sub>s</sub> into λ<sub>g</sub> in (A.15), the probability density function of the age of genera at a large time is obtained as

$$
p(t) = \lambda\_{g} \mathrm{e}^{-\lambda\_{g}t}.\tag{76}
$$

Then, from (A.12) with n<sub>s0</sub> = 1, we approximately obtain the number of species within the genus of age t as

$$n(t) \cong \operatorname{E}[\hat{N}\_s(t)] = \mathrm{e}^{\lambda\_s t}.\tag{77}$$

Finally, taking A = λ<sub>g</sub>, a = −λ<sub>g</sub>, B = 1, and b = λ<sub>s</sub> in Equation (74), the probability density function of the number of species within genera is

$$q(n) = \frac{\lambda\_{g}}{\lambda\_s} n^{-\left(\frac{\lambda\_g}{\lambda\_s} + 1\right)},\tag{78}$$

where the exponent of the power law is λ<sub>g</sub>/λ<sub>s</sub> + 1. This exponent coincides with Equation (14). Thus the generating mechanism of the power law in the Yule process can be roughly interpreted as a combination of exponentials as well.
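The rough argument above can be illustrated by sampling. This sketch draws genus ages from Equation (76), maps each age to a size with the deterministic approximation of Equation (77), and estimates the exponent of q(n) by maximum likelihood for a continuous power law on n ≥ 1. The rates, sample size, seed, and estimator are illustrative assumptions, and the sketch inherits the approximation n(t) ≈ E[N̂<sub>s</sub>(t)].

```python
import math
import random

def sample_genus_sizes(lambda_g, lambda_s, count, seed=0):
    """Draw genus ages from Equation (76) and map them to sizes with Equation (77)."""
    rng = random.Random(seed)
    return [math.exp(lambda_s * rng.expovariate(lambda_g)) for _ in range(count)]

def density_exponent_mle(sizes):
    """Maximum-likelihood exponent of q(n) ∝ n^(-exponent) for continuous n >= 1."""
    mean_log = sum(math.log(n) for n in sizes) / len(sizes)
    return 1.0 + 1.0 / mean_log

lambda_g, lambda_s = 1.0, 2.0   # illustrative rates, not from the paper
sizes = sample_genus_sizes(lambda_g, lambda_s, 20_000)
estimate = density_exponent_mle(sizes)
expected = lambda_g / lambda_s + 1.0   # exponent stated after Equation (78)
print(estimate, expected)
```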

#### 4.3. Application to Critical Phenomena

It is well-known that in certain critical phenomena, some physical quantities (e.g., correlation length, susceptibility, and specific heat) take the form of power functions of the reduced temperature (T − T<sub>c</sub>)/T<sub>c</sub> near the critical temperature T<sub>c</sub>. By the renormalization group analysis [60], this property can be interpreted as emerging from a combination of exponentials [41].

We consider two physical quantities x and y whose scaling dimensions are d<sub>x</sub> and d<sub>y</sub>, respectively. When we perform the scale transformation (i.e., renormalization group transformation) by the scaling factor b near the critical point, we suppose that x and y are multiplied by λ<sub>x</sub> and λ<sub>y</sub>, respectively. By the dimensional analysis, we obtain

$$
\lambda\_{\mathbf{x}} = b^{d\_{\mathbf{x}}}, \ \lambda\_{\mathbf{y}} = b^{d\_{\mathbf{y}}} \quad \left(\frac{\log \lambda\_{\mathbf{y}}}{\log \lambda\_{\mathbf{x}}} = \frac{d\_{\mathbf{y}}}{d\_{\mathbf{x}}}\right). \tag{79}
$$

Then we obtain geometric progressions for the transformed x and y:

$$\begin{cases} x\colon \ x\_{0} \to \lambda\_{x} x\_{0} \to (\lambda\_{x})^{2} x\_{0} \to \cdots, \\ y\colon \ y\_{0} \to \lambda\_{y} y\_{0} \to (\lambda\_{y})^{2} y\_{0} \to \cdots, \end{cases} \tag{80}$$

where x<sub>0</sub> and y<sub>0</sub> are the initial values of the transformation. Let us denote x and y transformed n times by x<sub>n</sub> and y<sub>n</sub>, respectively. Accordingly, x<sub>n</sub> and y<sub>n</sub> are defined as

$$\begin{cases} x\_n := (\lambda\_x)^n x\_0 = x\_0 \mathrm{e}^{(\log \lambda\_x)n}, \\ y\_n := (\lambda\_y)^n y\_0 = y\_0 \mathrm{e}^{(\log \lambda\_y)n}, \end{cases} \tag{81}$$

which constitute a combination of exponentials. Therefore, taking A = y<sub>0</sub>, a = log λ<sub>y</sub>, B = x<sub>0</sub>, and b = log λ<sub>x</sub> in Equation (75), we can write down y<sub>n</sub> as a function of x<sub>n</sub> as

$$y\_n = y\_0 \left(\frac{x\_n}{x\_0}\right)^{\frac{\log \lambda\_y}{\log \lambda\_x}} \propto x\_n^{\frac{d\_y}{d\_x}},\tag{82}$$

where the exponent of the power law is −d<sub>y</sub>/d<sub>x</sub>. We emphasize that this is a simple consequence of the dimensional analysis.
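The two geometric progressions can be iterated directly to see the power-law relation between y<sub>n</sub> and x<sub>n</sub> emerge. In this minimal sketch the scaling factor, scaling dimensions, and initial values are illustrative assumptions; the slope of log y against log x should equal d<sub>y</sub>/d<sub>x</sub>, the power appearing in Equation (82).

```python
import math

def rg_orbit(x0, y0, b, d_x, d_y, n_steps):
    """Iterate Equation (80): each step multiplies x by b**d_x and y by b**d_y."""
    lam_x, lam_y = b ** d_x, b ** d_y
    xs = [x0 * lam_x ** n for n in range(n_steps + 1)]
    ys = [y0 * lam_y ** n for n in range(n_steps + 1)]
    return xs, ys

def log_log_slope(xs, ys):
    """Slope of log y against log x between the first and last points."""
    return (math.log(ys[-1]) - math.log(ys[0])) / (math.log(xs[-1]) - math.log(xs[0]))

# Illustrative values, not taken from the paper.
xs, ys = rg_orbit(x0=0.1, y0=3.0, b=2.0, d_x=1.0, d_y=-0.5, n_steps=10)
slope = log_log_slope(xs, ys)
print(slope)  # equals d_y / d_x up to rounding
```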

Furthermore, if y := p(x), the two geometric progressions (Equation 80) lead to

$$
\lambda\_y y = p(\lambda\_x x), \tag{83}
$$

which satisfies the scale-free property (Equation 2) with b = λ<sub>x</sub> and f(b) = λ<sub>y</sub>. Namely, the two geometric progressions, equivalent to a combination of exponentials by the scale transformation, ensure that the scale-free property holds<sup>25</sup>.

#### 5. CONCLUSIONS

We have reviewed nine generating mathematical mechanisms of power laws (i.e., Yule process, Simon process, Barabási–Albert model, geometric Brownian motion with a reflecting wall and reset events, Kesten process, Generalized Lotka–Volterra model, Bouchaud–Mézard model, and the combination of exponentials) that are widely applied in the social sciences. Since these mechanisms are only prototypes, the exponents of the power laws derived from them may not match those of real phenomena (e.g., number of links on the WWW, and so on). As explained in this paper, however, these mechanisms have been improved so that the exponents match those of real phenomena, while the basic principles of the improved mechanisms remain the same. Though many power laws as macro behaviors of systems have been studied, the mechanisms generating them from micro dynamics are not yet completely understood. In physics, however, the understanding of thermodynamics of macro behavior from quantum mechanics of micro dynamics has been advanced considerably based on statistical mechanics. A similar development may also be possible in the study of power laws.

#### AUTHOR CONTRIBUTIONS

TK: Designed the overall direction of this review paper and identified the existing models that are highly applicable from the viewpoint of computational social science; S-IK: Surveyed existing model studies and summarized the mathematical mechanisms of those models; S-IK and TK: Wrote the manuscript.

<sup>25</sup>We thank Prof. Ken-Ichi Aoki for pointing out this observation.

### ACKNOWLEDGMENTS

The authors greatly appreciate stimulating discussions about the scale-free property in critical phenomena with Prof. Ken-Ichi Aoki. Financial support from the Japan Society for the Promotion of Science (JSPS KAKENHI Grant Number 15H05729) is gratefully acknowledged.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Kumamoto and Kamihigashi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### APPENDIX

Some mathematical supplements are given in this appendix.

### A.1. Poisson Process

We consider the Poisson process [61, 62] with the Poisson rate λ (a positive constant), that is, the events occur on average λ times per unit time. The probability that an event occurs n times in (t, t + h] follows the Poisson distribution:

$$\text{P}\{\hat{N}(t+h) - \hat{N}(t) = n; \text{ rate } \lambda\} = \text{e}^{-\lambda h} \frac{(\lambda h)^n}{n!}, \quad (n = 0, 1, 2, \cdots) \tag{A.1}$$

where N̂(t) denotes the number of times that the events occur in (0, t]. When h is an infinitesimal time interval, the probabilities of event occurrence can be expressed in terms of o(h<sup>k</sup>). The probability that no event occurs in (t, t + h] is

$$\begin{aligned} \text{P}\{\hat{N}(t+h) - \hat{N}(t) = 0; \text{ rate } \lambda\} &= \text{e}^{-\lambda h} \\ &= 1 - \lambda h + o(h) \quad \left(\lim\_{h \to 0} \frac{o(h^k)}{h^k} = 0\right). \end{aligned} \tag{A.2}$$

Similarly, the probability that one event occurs is

$$\begin{aligned} \text{P}\{\hat{N}(t+h) - \hat{N}(t) = 1; \text{ rate } \lambda\} &= \text{e}^{-\lambda h} \lambda h = (1 - \lambda h + o(h))\lambda h \\ &= \lambda h + o(h), \end{aligned} \tag{A.3}$$

and the probability that two or more events occur is

$$\begin{aligned} \text{P}\{\hat{N}(t+h) - \hat{N}(t) \ge 2; \text{ rate } \lambda\} &= \sum\_{n=2}^{\infty} \text{e}^{-\lambda h} \frac{(\lambda h)^n}{n!} \\ &= \text{e}^{-\lambda h} \left( \sum\_{n=0}^{\infty} \frac{(\lambda h)^n}{n!} - 1 - \lambda h \right) \\ &= 1 - \text{e}^{-\lambda h} (1 + \lambda h) \\ &= 1 - (1 - \lambda h + o(h))(1 + \lambda h) \\ &= o(h). \end{aligned} \tag{A.4}$$
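The small-h estimates (A.2) to (A.4) can be illustrated numerically: each remainder, divided by h, should shrink as h decreases. The sketch below is only an illustration; the rate λ = 2 and the values of h are assumptions, and `prob_two_or_more` is a hypothetical helper name.

```python
import math

def poisson_pmf(n, lam, h):
    """Equation (A.1): probability of n events in an interval of length h."""
    return math.exp(-lam * h) * (lam * h) ** n / math.factorial(n)

def prob_two_or_more(lam, h):
    """1 - exp(-lam*h) * (1 + lam*h), using expm1 to limit round-off for small h."""
    x = lam * h
    return -math.expm1(-x) - x * math.exp(-x)

lam = 2.0  # illustrative rate
for h in (1e-1, 1e-2, 1e-3):
    p0 = poisson_pmf(0, lam, h)
    p1 = poisson_pmf(1, lam, h)
    p2 = prob_two_or_more(lam, h)
    # Remainders of (A.2), (A.3), and (A.4), each divided by h.
    print(h, (p0 - (1.0 - lam * h)) / h, (p1 - lam * h) / h, p2 / h)
```

Each printed ratio decreases roughly in proportion to h, which is what the o(h) notation expresses.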

#### A.2. Pure Birth Process

We generalize the Poisson process so that the Poisson rate depends on the number of times that the events have already occurred. To apply this generalized Poisson process to the evolution model in biology, we interpret the occurrence of events as the births of new species without deaths [61, 62].

First, we are interested in the probability that the number of species becomes n (∈ N) at time t (∈ R) with the initial number of species n<sub>s0</sub> at time 0. It is denoted by p(n, t) := P{N̂<sub>s</sub>(t) = n}, where N̂<sub>s</sub>(t) is the number of species at time t. Then, we derive the time evolution equation of p(n, t). The probability p(n, t + h) is given as the sum of the following probabilities:

• the probability that N̂<sub>s</sub>(t) = n and no birth occurs in (t, t + h] with rate Λ<sub>s</sub>(n);

• the probability that N̂<sub>s</sub>(t) = n − 1 and one birth occurs in (t, t + h] with rate Λ<sub>s</sub>(n − 1)<sup>26</sup>;

⋮

• the probability that N̂<sub>s</sub>(t) = n − k and k births occur in (t, t + h]<sup>27</sup>;

⋮

where Λ<sub>s</sub>(n) is the Poisson rate when the number of species is n. Accordingly, we obtain

$$\begin{aligned} p(n, t + h) &= \text{P}\{ \hat{N}\_{s}(t) = n \ \cap \ \text{no birth occurs in } (t, t + h] \text{ with } \Lambda\_{s}(n) \} \\ &\quad + \text{P}\{ \hat{N}\_{s}(t) = n - 1 \ \cap \ \text{1 birth occurs in } (t, t + h] \text{ with } \Lambda\_{s}(n - 1) \} \\ &\quad + \sum\_{k = 2}^{n - n\_{s0}} \text{P}\{ \hat{N}\_{s}(t) = n - k \ \cap \ k \text{ births occur in } (t, t + h] \}. \end{aligned} \tag{A.5}$$

From equations (A.2), (A.3), and (A.4), the probabilities on the right-hand side of (A.5) are expressed respectively as the orders of h:

$$\begin{cases} \text{P}\{\hat{N}\_{s}(t) = n \ \cap \ \text{no birth occurs in } (t, t+h] \} = p(n, t) \times \left\{ 1 - \Lambda\_{s}(n)h + o(h) \right\}, \\ \text{P}\{\hat{N}\_{s}(t) = n - 1 \ \cap \ \text{1 birth occurs in } (t, t+h] \} = p(n-1, t) \times \left\{ \Lambda\_{s}(n-1)h + o(h) \right\}, \\ \sum\_{k=2}^{n-n\_{s0}} \text{P}\{\hat{N}\_{s}(t) = n - k \ \cap \ k \text{ births occur in } (t, t+h] \} = o(h). \end{cases} \tag{A.6}$$

We combine (A.5) with (A.6), and obtain the difference equation:

$$\frac{p(n,t+h) - p(n,t)}{h} = -\Lambda\_s(n)p(n,t) + \Lambda\_s(n-1)p(n-1,t) + \frac{o(h)}{h}.\tag{A.7}$$

<sup>26</sup>Strictly speaking, the Poisson rate is not constant at Λ<sub>s</sub>(n − 1) in (t, t + h]; that is, the Poisson rate changes from Λ<sub>s</sub>(n − 1) to Λ<sub>s</sub>(n) when the birth occurs. Therefore, the accurate probability is P{N̂<sub>s</sub>(t) = n − 1} × P{one species is born in (t, t + j] with rate Λ<sub>s</sub>(n − 1)} × P{no species is born in (t + j, t + h] with rate Λ<sub>s</sub>(n)}, where t + j (0 < j < h) is the time of the birth. However, since we take the limit h → 0 at the end, even if we deal with the probability this strictly, the time evolution equation of the final result will be the same.

<sup>27</sup>Even if we precisely consider the changing Poisson rate with births, this probability will eventually be o(h). Therefore, we do not need the precise values for the exact Poisson rate and the probabilities that k(≥ 2) species are born in (t, t + h].

We take the limit h → 0 and obtain the ODEs with the initial conditions:

$$\begin{aligned} \text{for } n > n\_{s0}: \quad & \begin{cases} \dfrac{\partial p(n,t)}{\partial t} = -\Lambda\_s(n)p(n,t) + \Lambda\_s(n-1)p(n-1,t), \\ p(n,0) = 0, \end{cases} \\ \text{for } n = n\_{s0}: \quad & \begin{cases} \dfrac{\partial p(n\_{s0},t)}{\partial t} = -\Lambda\_s(n\_{s0})p(n\_{s0},t), \\ p(n\_{s0},0) = 1, \end{cases} \end{aligned} \tag{A.8}$$

which are called Kolmogorov's forward equations. The ODEs (A.8) can be solved and yield

$$\begin{cases} p(n,t) = \displaystyle\int\_0^t \mathrm{e}^{-\Lambda\_s(n)(t-s)} \Lambda\_s(n-1)p(n-1,s)\,\mathrm{d}s \quad \text{for } n > n\_{s0}, \\ p(n\_{s0},t) = \mathrm{e}^{-\Lambda\_s(n\_{s0})t}. \end{cases} \tag{A.9}$$

#### A.3. Linear Birth Process

Next, we consider the linear birth process [61, 62] that is mathematically defined as a special case of the pure birth process. When the Poisson rate Λ<sub>s</sub>(n) is proportional to the number of species n,

$$
\Lambda\_s(n) = \lambda\_s n, \tag{A.10}
$$

where λ<sub>s</sub> is a positive constant, this pure birth process is called the linear birth process<sup>28</sup>. Then, we can interpret the birth of new species in this process as the occurrence of branching in the evolutionary tree (**Figure 2**). In particular, the linear birth process means that the branchings occur independently on each line of a species as Poisson processes with the Poisson rate λ<sub>s</sub>, which is common for all existing species.

The solutions of (A.9) for the linear birth process can be recursively calculated and yield

$$\begin{aligned} p(n,t) &= \binom{n-1}{n-n\_{s0}} \left(\mathrm{e}^{-\lambda\_{s}t}\right)^{n\_{s0}} \left(1-\mathrm{e}^{-\lambda\_{s}t}\right)^{n-n\_{s0}} \quad \left(\binom{n}{m}:=\frac{n!}{m!(n-m)!}\right) \quad \text{for } n > n\_{s0}, \\ p(n\_{s0},t) &= \mathrm{e}^{-\lambda\_{s}n\_{s0}t}. \end{aligned} \tag{A.11}$$

Then, the expectation value and the variance of the number of species at time t are given by

$$\begin{aligned} \mathrm{E}[\hat{N}\_{\mathrm{s}}(t)] &= \sum\_{n=n\_{\mathrm{s}0}}^{\infty} np(n,t) = n\_{\mathrm{s}0} \mathbf{e}^{\lambda\_{\mathrm{s}}t}, \\ \mathrm{Var}[\hat{N}\_{\mathrm{s}}(t)] &= \mathrm{E}[\hat{N}\_{\mathrm{s}}(t)^2] - \mathrm{E}[\hat{N}\_{\mathrm{s}}(t)]^2 = n\_{\mathrm{s}0} \mathbf{e}^{\lambda\_{\mathrm{s}}t} (\mathbf{e}^{\lambda\_{\mathrm{s}}t} - 1). \end{aligned} \tag{A.12}$$
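The closed forms (A.11) and (A.12) can be compared with a direct numerical integration of the forward equations (A.8) using the linear rate (A.10). This is a minimal sketch under stated assumptions: explicit Euler stepping, a truncation of the state space at a hypothetical `n_max`, and illustrative values of λ<sub>s</sub>, n<sub>s0</sub>, and t.

```python
import math

def integrate_forward_equations(lambda_s, n0, t_end, n_max=60, dt=1e-4):
    """Explicit Euler integration of (A.8) with rate (A.10), truncated at n_max."""
    p = [0.0] * (n_max + 1)   # p[n] approximates p(n, t); entries below n0 stay 0
    p[n0] = 1.0
    for _ in range(round(t_end / dt)):
        new = p[:]
        for n in range(n0, n_max + 1):
            inflow = lambda_s * (n - 1) * p[n - 1] if n > n0 else 0.0
            new[n] = p[n] + dt * (inflow - lambda_s * n * p[n])
        p = new
    return p

def closed_form(n, t, lambda_s, n0):
    """Negative-binomial solution (A.11); for n = n0 it reduces to exp(-lambda_s*n0*t)."""
    e = math.exp(-lambda_s * t)
    return math.comb(n - 1, n - n0) * e ** n0 * (1.0 - e) ** (n - n0)

lambda_s, n0, t_end = 1.0, 1, 1.0   # illustrative values
p = integrate_forward_equations(lambda_s, n0, t_end)
mean = sum(n * p[n] for n in range(n0, len(p)))
print(p[3], closed_form(3, t_end, lambda_s, n0))   # should agree closely
print(mean, n0 * math.exp(lambda_s * t_end))       # compare with (A.12)
```

The truncation is harmless here because the probability of exceeding `n_max` by time t is negligible for these values; larger λ<sub>s</sub>t would need a larger `n_max`.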

Let P<sub>s</sub>{0 < age ≤ t at τ} be the probability that the age of a species, that is, the time elapsed since its birth, is t or less at time τ (> t). This probability is given by

$$\begin{aligned} \text{P}\_{s}\{0 < \text{age} \le t \text{ at } \tau\} &= \text{E}\left[\frac{\hat{N}\_{s}(\tau) - \hat{N}\_{s}(\tau - t)}{\hat{N}\_{s}(\tau)}\right] \\ &= 1 - \text{E}\left[\frac{\hat{N}\_{s}(\tau - t)}{\hat{N}\_{s}(\tau)}\right] \\ &\simeq 1 - \frac{\text{E}[\hat{N}\_{s}(\tau - t)]}{\text{E}[\hat{N}\_{s}(\tau)]} = 1 - \mathrm{e}^{-\lambda\_{s}t}, \end{aligned} \tag{A.13}$$

where the approximate equality holds only for a large time<sup>29</sup> τ. Therefore, the probability no longer depends on τ. Let ℓ<sub>s</sub>(s) denote the probability density function for the age s of species at a large time. It is defined through the probability that the age of a species is t or less at a large time:

$$\int\_0^t \ell\_s(s)ds = \mathbb{P}\_s\{0 < \text{age} \le t \text{ at a large time}\}. \tag{A.14}$$

Differentiating both sides of (A.14) with respect to t, we obtain

$$\ell\_{s}(t) = \frac{\mathrm{d}\text{P}\_{s}\{0 < \text{age} \le t \text{ at a large time}\}}{\mathrm{d}t} = \lambda\_{s} \mathrm{e}^{-\lambda\_{s}t}. \tag{A.15}$$

#### A.4. Multiplicative Process

The multiplicative process is the discrete-time stochastic process defined as

$$
\hat{X}(t+1) = \hat{r}(t)\hat{X}(t) \quad (t = 0, 1, 2, \cdots), \tag{A.16}
$$

where r̂(t), for all times t, are independent and identically distributed random variables with ν := E[log r̂(t)] and σ<sup>2</sup> := Var[log r̂(t)]. This process is essentially equivalent to the GBM because both probability density functions are log-normal distributions in the large time limit.

We can easily obtain the solution of (A.16) in the logarithmic form as follows:

$$\log \hat{X}(t) = \sum\_{i=0}^{t-1} \log \hat{r}(i) + \log x\_0,\tag{A.17}$$

where x<sub>0</sub> is the initial value of X̂(t). We then define the new variable Ŷ(t) as

$$\hat{Y}(t) := \frac{\log \hat{X}(t) - \log x\_0 - t\nu}{\sqrt{t}} = \frac{\sum\_{i=0}^{t-1} \left(\log \hat{r}(i) - \nu\right)}{\sqrt{t}}.\tag{A.18}$$

 

<sup>28</sup>Though this process is also called the Yule–Furry process, we call it the linear birth process in this paper to distinguish it from the Yule process that generates a power law.

<sup>29</sup>We consider only the probability in a large time, because we are interested in only the power-law distribution as the stationary state at a large time.

By the central limit theorem, we obtain the probability density function of Ŷ(t) in the time limit t → ∞:

$$q(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{y^2}{2\sigma^2}\right],\tag{A.19}$$

which is the normal distribution. Consequently, by a change of variables, we obtain the probability density function of X̂(t) as follows:

$$p(\mathbf{x}) = \frac{1}{\mathbf{x}\sqrt{2\pi\sigma^2 t}} \exp\left[-\frac{\left\{\log\mathbf{x} - \log\mathbf{x}\_0 - t\boldsymbol{\nu}\right\}^2}{2\sigma^2 t}\right],\tag{A.20}$$

which is the same log-normal distribution as Equation (28) of the GBM with ν = µ − σ<sup>2</sup>/2.
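The central limit argument can be illustrated with a multiplicative process whose factors are far from log-normal. In the sketch below log r̂(t) is taken to be uniform on an interval, which is an assumption made only for illustration, as are the interval, the horizon t = 50, the number of paths, and the seed. The standardized variable of (A.18) should then have mean near 0 and variance near σ<sup>2</sup>.

```python
import math
import random

def standardized_log(t, low, high, rng):
    """One sample of Y(t) in (A.18) when log r(i) is uniform on [low, high]."""
    nu = 0.5 * (low + high)                       # nu = E[log r]
    total = sum(rng.uniform(low, high) for _ in range(t))  # log X(t) - log x0, by (A.17)
    return (total - t * nu) / math.sqrt(t)

low, high = -0.5, 0.7                 # illustrative choice, not from the paper
sigma2 = (high - low) ** 2 / 12.0     # variance of a uniform variable on [low, high]
rng = random.Random(0)
ys = [standardized_log(50, low, high, rng) for _ in range(5000)]
mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / (len(ys) - 1)
print(mean_y, var_y, sigma2)          # mean near 0, variance near sigma2
```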

### A.5. Stationary Solution of the Fokker–Planck Equation With Reflecting Wall

Here we provide a stationary solution of the FPE with reflecting wall [23, 39, 55].

The SDE<sup>30</sup> of an Itô process for the random variable X̂(t) is given by

$$\mathrm{d}\hat{X}(t) = a(\hat{X}(t), t)\mathrm{d}t + b(\hat{X}(t), t)\mathrm{d}\hat{B}(t),\tag{A.21}$$

where B̂(t) is a standard Brownian motion; E[dB̂(t)] = 0, Var[dB̂(t)] = dt. This SDE is equivalent to the Langevin equation [51]:

$$\frac{d\hat{X}(t)}{dt} = a(\hat{X}(t), t) + b(\hat{X}(t), t)\hat{\Gamma}(t),\tag{A.22}$$

where the noise term Γ̂(t) satisfies

$$\begin{cases} \operatorname{E}[\hat{\Gamma}(t)] = 0, \\ \operatorname{E}[\hat{\Gamma}(t)\hat{\Gamma}(t')] = \delta(t - t'). \end{cases} \tag{A.23}$$

We can obtain the FPE for the random variable X̂(t) with the probability density p(x, t) as

$$\frac{\partial p(\mathbf{x},t)}{\partial t} = -\frac{\partial}{\partial \mathbf{x}} \{a(\mathbf{x},t)p(\mathbf{x},t)\} + \frac{\partial^2}{\partial \mathbf{x}^2} \left\{\frac{b(\mathbf{x},t)^2}{2} p(\mathbf{x},t)\right\}.\tag{A.24}$$

Then we define the flux J(x, t) as

$$J(\mathbf{x},t) := a(\mathbf{x},t)p(\mathbf{x},t) - \frac{\partial}{\partial \mathbf{x}} \left\{ \frac{b(\mathbf{x},t)^2}{2} p(\mathbf{x},t) \right\},\tag{A.25}$$

so that we can interpret (A.24) as the continuity equation

$$\frac{\partial p(\mathbf{x},t)}{\partial t} + \frac{\partial J(\mathbf{x},t)}{\partial \mathbf{x}} = \mathbf{0}.\tag{A.26}$$

<sup>30</sup>In the Stratonovich convention this SDE is represented by dX̂(t) = {a(X̂(t), t) − (1/2) b(X̂(t), t) ∂b(X̂(t), t)/∂X̂(t)} dt + b(X̂(t), t) ∘ dB̂(t).

When a(x, t) and b(x, t) are the time-independent functions, that is, a(x, t) = a(x) and b(x, t) = b(x), the stationary solution p(x) is defined by the condition<sup>31</sup>

$$\frac{\partial p(\mathbf{x})}{\partial t} = \mathbf{0},\tag{A.27}$$

that is equivalent to

$$\frac{\partial J(\mathbf{x})}{\partial \mathbf{x}} = \mathbf{0},\tag{A.28}$$

where J(x) is the stationary flux. Accordingly, the stationary flux J(x) must be constant.

When the stationary flux J(x) takes a nonzero value, the stationary state means that particles flow in from one side of infinity and out the other side. This situation causes the stationary probability density function p(x) to be nonzero at x = ±∞. Consequently, the nonzero stationary flux cannot give us a power-law probability density function that can be normalized, because any power function blows up at one side of infinity. In contrast, when the stationary flux J(x) vanishes everywhere, we can set the reflecting wall at x = x<sub>min</sub> so that the stationary probability density function p(x) vanishes outside of the wall. The reflecting wall enables us to obtain a power-law probability density function that can be normalized, because we can cut out the side of infinity where the power function blows up. For this reason, we consider only the case that the flux vanishes at a boundary, that is, the reflecting wall.

In this case, we obtain the first-order ODE

$$J(\mathbf{x}) = a(\mathbf{x})p(\mathbf{x}) - \frac{\mathbf{d}}{\mathbf{d}\mathbf{x}} \left\{ \frac{b(\mathbf{x})^2}{2} p(\mathbf{x}) \right\} = \mathbf{0} \tag{A.29}$$

that the stationary solution p(x) satisfies.

 

The stationary solution is obtained as the solution of (A.29):

$$\begin{cases} p(x) = p(x\_0) b(x\_0)^2 \mathrm{e}^{f(x)}, \\ f(x) := -2 \log \{ b(x) \} + \displaystyle\int\_{x\_0}^{x} \frac{2a(x')}{b(x')^2} \mathrm{d}x', \end{cases} \tag{A.30}$$

where x<sub>0</sub> (≥ x<sub>min</sub>) is an arbitrary constant. If a(x) and b(x) are the power functions that satisfy the condition

$$\frac{a(x)}{b(x)^2} \propto \frac{1}{x},\tag{A.31}$$

namely,

$$\begin{aligned} a(x) &= a x^{2n-1} \quad (a\colon \text{constant}), \\ b(x) &= b x^n \quad (b\colon \text{constant}), \end{aligned} \tag{A.32}$$

we obtain the stationary solution as the power function of x:

$$p(\mathbf{x}) = \mathbf{C} \mathbf{x}^{-\alpha} \quad \left(\text{C} := p(\mathbf{x}\_0) \mathbf{x}\_0^{\alpha}, \ \alpha := 2n - \frac{2a}{b^2}\right). \tag{A.33}$$

<sup>31</sup>Though the existence of the stationary solution is nontrivial, we assume it here.

This stationary solution p(x) must satisfy the normalization condition

$$1 = \int\_{x\_{\min}}^{\infty} p(x)\,\mathrm{d}x,\tag{A.34}$$

where we set the reflecting wall at x = x<sub>min</sub> (> 0) and assume α > 1. The normalization condition

$$1 = \int\_{x\_{\min}}^{\infty} p(x)\,\mathrm{d}x = \frac{C}{\alpha - 1} (x\_{\min})^{-\alpha + 1}\tag{A.35}$$

determines the constant C as

$$C = (\alpha - 1)(x\_{\min})^{\alpha - 1}.\tag{A.36}$$

Thus, we have the stationary solution

$$p(x) = (\alpha - 1)(x\_{\min})^{\alpha - 1} x^{-\alpha} \quad \left(\alpha = 2n - \frac{2a}{b^2} > 1\right). \tag{A.37}$$
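The power-law solution can be checked against the zero-flux condition (A.29) and the normalization condition (A.34). The sketch below fixes C by solving (A.35) for it, evaluates the flux with a central difference, and integrates the density numerically. The GBM-like choice n = 1, a = µ, b = σ and all numerical values are illustrative assumptions.

```python
import math

def power_law_density(x, a, b, n, x_min):
    """Density C*x**(-alpha) on x >= x_min for a(x) = a*x**(2n-1), b(x) = b*x**n."""
    alpha = 2.0 * n - 2.0 * a / b**2
    c_norm = (alpha - 1.0) / x_min ** (1.0 - alpha)   # solves (A.35) for C
    return c_norm * x ** (-alpha)

def stationary_flux(x, a, b, n, x_min, h=1e-5):
    """Flux J(x) of (A.29), with the derivative taken by central differences."""
    def diffusion(z):
        return 0.5 * (b * z**n) ** 2 * power_law_density(z, a, b, n, x_min)
    drift = a * x ** (2 * n - 1) * power_law_density(x, a, b, n, x_min)
    return drift - (diffusion(x + h) - diffusion(x - h)) / (2.0 * h)

def total_mass(a, b, n, x_min, s_max=20.0, steps=20_000):
    """Integral of the density over x >= x_min, midpoint rule in s = log(x / x_min)."""
    h = s_max / steps
    total = 0.0
    for k in range(steps):
        x = x_min * math.exp((k + 0.5) * h)
        total += power_law_density(x, a, b, n, x_min) * x  # dx = x ds
    return total * h

a, b, n, x_min = -0.5, 1.0, 1, 2.0   # illustrative values giving alpha = 3
print(total_mass(a, b, n, x_min))                                   # close to 1
print([stationary_flux(x, a, b, n, x_min) for x in (2.5, 5.0, 10.0)])  # each close to 0
```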

# Sociophysics Analysis of the Dynamics of Peoples' Interests in Society

Akira Ishii <sup>1</sup> \* and Yasuko Kawahata<sup>2</sup>

<sup>1</sup> Department of Applied Mathematics and Physics, Tottori University, Tottori, Japan, <sup>2</sup> Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan

As a method of analyzing and predicting social phenomena using social media as data, we present analyses based on the mathematical model of the hit phenomenon, which is one of the established models of sociophysics. The dynamics of the number of social media posts for movies, events, and a YouTube movie are explained. For entertainment topics, the direct communication strength, "D," indicates the satisfaction of the current interested people or supporters, whereas the indirect communication strength, "P," indicates the power to acquire a new support layer. Thus, this is effective not only for the analysis of entertainment and marketing strategy but also for burst analysis on the social media.

#### Edited by:

Isamu Okada, Soka University, Japan

#### Reviewed by:

Francisco Welington Lima, Federal University of Piauí, Brazil Reik Donner, Potsdam-Institut für Klimafolgenforschung (PIK), Germany

> \*Correspondence: Akira Ishii ishii@damp.tottori-u.ac.jp

#### Specialty section:

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Received: 29 September 2017 Accepted: 25 July 2018 Published: 08 October 2018

#### Citation:

Ishii A and Kawahata Y (2018) Sociophysics Analysis of the Dynamics of Peoples' Interests in Society. Front. Phys. 6:89. doi: 10.3389/fphy.2018.00089

Keywords: social media, blog, Twitter, advertisement, popularity, mathematical model of the hit phenomena, sociophysics, rumor-spread

### INTRODUCTION

In the present age, where consumer behavior is recorded through the internet, the purchase and action records of numerous consumers are available. Analyses reveal that there are many cases where the methodology of natural science, such as physics, can be incorporated alongside conventional social science. Therefore, sociophysics, which studies society using physics, has developed significantly of late [1, 2]. In this paper, we present certain results based on sociophysics for analyzing and forecasting social phenomena, and the possibility of applying them to marketing and related fields, using the voice of society recorded in blogs and Twitter as data. Sociophysics is a new frontier of physics alongside econophysics; if vast amounts of data are available, the methodology of physics, which has been developed for experimental data on natural phenomena, can also be applied to social science. Nowadays, we can use the vast stock of digital data on human communication as the observation data of real society [3–6]. Therefore, sociophysics is progressing rapidly.

As a sociophysics theory for analyzing society based on social media writing, a mathematical model for the hit phenomenon has been developed by Ishii et al. [7]. Here, mathematical equations inspired by physics are used to explain people's interest due to the influence of advertisement and communication with other people. Ishii et al. [7] utilized the effects of advertisement and verbal communication to form a model that successfully predicted the outcome of each film that was screened. For the analysis of the film market in marketing science, several studies use regression analysis [8–15]; however, it is very difficult to analyze the dynamics of consumers using this approach.

In the mathematical theory of the hit phenomenon, the effect of advertisement and the propagation of reputation and rumors by human communication are incorporated into the statistical physics of human dynamics. The propagation of information, reputation, and rumors has been studied in several works. For example, the SIR model is a simple mathematical model for epidemics [16], which is applicable not only to the spread of infectious diseases but also to the spread of information. The equations of the SIR model are as follows:

$$\begin{aligned} \frac{dS}{dt} &= -\beta S\left(t\right)I\left(t\right), \\ \frac{dI}{dt} &= \beta S\left(t\right)I\left(t\right) - \gamma I\left(t\right), \\ \frac{dR}{dt} &= \gamma I\left(t\right), \end{aligned}$$

where S is the number of susceptible individuals, I is the number of infectious individuals, and R is the number of recovered individuals. In the case of information spread, S denotes the non-adopters, I the contagious adopters, and R the non-contagious adopters.
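As an illustration of how the SIR equations behave when read as an information-spread model, the following minimal sketch integrates them with the forward Euler method. The function name `simulate_sir`, the step size, and all parameter values are assumptions chosen for illustration; they are not taken from any of the cited works.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.01):
    """Integrate the SIR equations with the forward Euler method.

    Returns one (day, S, I, R) tuple per day, starting at day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps_per_day = int(round(1.0 / dt))
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_adopters = beta * s * i * dt   # flow S -> I
            new_inactive = gamma * i * dt      # flow I -> R
            s -= new_adopters
            i += new_adopters - new_inactive
            r += new_inactive
        history.append((float(day), s, i, r))
    return history


# Illustrative values only (hypothetical, not fitted to any data).
trajectory = simulate_sir(beta=0.0005, gamma=0.1, s0=999, i0=1, r0=0, days=60)
peak_day, _, peak_i, _ = max(trajectory, key=lambda row: row[2])
print(f"contagious adopters peak on day {peak_day:.0f} at about {peak_i:.0f}")
```

Because every person moves only between the three compartments, S + I + R stays constant throughout the run, which is a convenient sanity check for any implementation.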

The other famous model for the spread of information is the Bass model [17, 18]. The equation of the Bass model is as follows:

$$\frac{dN(t)}{dt} = a\left(m - N(t)\right) + b\left(m - N(t)\right)N(t),$$

where m is the total number of people and N(t) is the number of adopters. The first term of the equation indicates a constant propensity to adopt, independent of the number of customers, who have adopted the innovation before time, t. The second term is proportional to the number of customers, who have already adopted the innovation by t; this term represents the extent of favorable exchange of word-of-mouth (WOM) communication between the innovators and the other adopters of the new product.
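The Bass equation above can be checked numerically. The sketch below integrates it with the forward Euler method and compares the result with the well-known closed-form solution for N(0) = 0. The function names and the values of a, b, and m are hypothetical and serve only as an example.

```python
import math


def bass_euler(a, b, m, days, dt=0.001):
    """Forward-Euler integration of dN/dt = a(m - N) + b(m - N)N with N(0) = 0."""
    n = 0.0
    steps = int(round(days / dt))
    for _ in range(steps):
        n += (a * (m - n) + b * (m - n) * n) * dt
    return n


def bass_closed_form(a, b, m, t):
    """Analytic solution of the same equation for N(0) = 0."""
    decay = math.exp(-(a + b * m) * t)
    return m * (1.0 - decay) / (1.0 + (b * m / a) * decay)


a, b, m = 0.03, 0.0004, 1000.0   # illustrative values, not fitted to data
numeric = bass_euler(a, b, m, days=10)
exact = bass_closed_form(a, b, m, t=10)
print(f"Euler: {numeric:.2f}, closed form: {exact:.2f}")
```

In both versions N(t) saturates at m, which reflects the upper limit on adoption that the text contrasts with the hit-phenomenon model.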

There are several problems with the above two models. In the SIR model, the spread of information is assumed to occur through communication between an adopter and a non-adopter, and mass media effects are not included. Moreover, the exchange of WOM communication is assumed to be proportional to the number of adopters. In the Bass model, it is assumed that once a consumer adopts a new product, he or she influences other non-adopters to adopt the product at all later times. In order to overcome these disadvantages, the Bass-SIR model was presented [19].

Another similar mathematical model for calculating the spread of information is the opinion dynamics model by Galam [1], based on the Ising model of statistical physics. Its accumulation of the opinions of individuals (agents) is similar to the mathematical model of the hit phenomenon in that it considers the interaction between people. In Galam's opinion dynamics model, a variable, c<sub>i</sub> = ±1, represents the choice of agent i, with Yes = 1 and No = −1. Galam expressed the group conflict function, G, as

$$G = J \sum\_{i,j} c\_i c\_j + S \sum\_i c\_i + \sum\_i S\_i c\_i,$$

where the first term corresponds to the direct communication between people and the second to the external field effect. The third term can correspond to more complex communication. This model has been applied to modern politics [20].

Our approach is different. Here, we use the mathematical model of the hit phenomenon [7], where the intention of people for a certain topic is calculated. In this model, the calculated intention is not on/off. Thus, the value of the intention in the mathematical model of the hit phenomena has no upper limit. The upper-limit value of the intentions of N people is not N. On the contrary, for the SIR model, the Bass model, or the model of Galam, the calculated value of each person is on/off or in-between on and off; the upper limit is unity and is N for N persons. Hence, in the mathematical model of the hit phenomenon, the calculated intention of an adopter can increase to a very large value far beyond unity, depending on mass media advertisement and communication, although the upper limit of each adopter is unity for the SIR model, Bass model, and the opinion dynamics model of Galam.

The target of the mathematical model of the hit phenomenon is the "hits" phenomenon. The hits on social media are similar to the burst phenomenon, which is found to evolve through non-Poissonian dynamics [21]. The similarity between the burst and hit phenomena is that specific topics that are referred to widely, occur frequently in the social media. The difference is that the burst phenomenon is spontaneous, whereas the hit phenomenon is artificially drawn. Nonetheless, the hit and burst phenomena are similar, and the study of the hit phenomenon is useful for the research of the burst phenomenon.

There are many investigations on the hit phenomenon, other than our works [22–34]. In contrast to other works on the hit phenomenon, in our model, the effect of advertisements, and the propagation of reputation and rumors by human communication are incorporated into the statistical physics of human dynamics. The mathematical model has been applied to the motion picture business in the Japanese market, and the calculations have been compared to the reported revenue and observed number of blog posts for each film. Furthermore, in several recent papers, it has been shown that the theory is not only applicable to the box office, but also to other social entertainment, such as local events [35], animated dramas on TV [36], the "general election" of the Japanese girl-group, AKB48 [37], online music [38], plays [39], music concerts [40, 41], Japanese stage actors [42], Kabuki players of the nineteenth century [43], and TV drama [44]. In these works, an extended mathematical theory of the hit phenomena was used to apply the model to general entertainment in society. Thus, it is very natural to use this theory for the prediction of the motion picture business.

In this paper, we apply the mathematical model of the hit phenomenon, modified slightly from the original model of Ishii et al. [7], to the reputation of movies and dramas after their release and to the spread of topics of social incidents, and we examine its results for the analysis and prediction of social dynamics.

## THEORY

### Mathematical Model of the Hit Phenomenon

The mathematical model of the hit phenomenon within a society is presented as a stochastic process of the interaction of human dynamics as in the many-body theory in physics [7]. In this model, we assume that the intentions of humans in society are affected by three mechanisms: advertisement, communication with friends, and rumor. Advertisements are the external forces for each person in society. Communication with friends is called the direct communication effect and is considered as interaction with the intentions of friends. The rumor effect is considered as interaction among three persons and called indirect communication, as described in Galam [1]. In the model, we use only the time distribution of the advertisement budget as the input, and the WOM represented by posts on social network systems (SNS) is used as the observed data for comparison with the calculated results. The parameters in the model are adjusted in comparison with the calculation and observed SNS posting data.

Here, we introduce the intention of a person, "i," as I<sub>i</sub>(t), where this quantity is assumed to be a real number. Although I<sub>i</sub>(t) itself is not expected to be measured directly in experiments or in social media analysis, we expect it to be proportional to the number of postings in blogs or on Twitter. Following Ishii et al. [7], we express the equation for the intention of each person, assuming exponential decay, as

$$\frac{dI\_{i}\left(t\right)}{dt} = -aI\_{i}\left(t\right) + \sum\_{j} D\_{ij}I\_{j}\left(t\right) + \sum\_{j} \sum\_{k} P\_{ijk}I\_{j}\left(t\right)I\_{k}\left(t\right) + f\_{i}\left(t\right), \tag{1}$$

where D<sub>ij</sub>, P<sub>ijk</sub>, and f<sub>i</sub>(t) are the coefficient of direct communication, the coefficient of indirect communication, and the random external force effect for person i, respectively. We consider the above equation for every consumer, i = 1, . . . , N<sub>p</sub>. Considering the effect of direct communication, indirect communication, and the decline of the audience, we obtain the above equation for the mathematical model of the hit phenomenon. The advertisement and publicity effect for each person can be described as the mean field value of the random external force effect, ⟨f<sub>i</sub>(t)⟩. Here, it is assumed that the level of people's interest, I(t), attenuates exponentially. Exponential attenuation is known to occur for movies, as mentioned in Allsop et al. [3], whereas attention to events and anniversaries is known to attenuate as a power function [45, 46]. Social interest in general attenuates in a manner intermediate between the exponential and power functions [47], but here we adopt exponential decay.

Generally, information spreads through WOM, which sometimes has a significant effect on the spread of topics. The WOM effect can be divided into two types: direct WOM from friends and indirect WOM as rumors. We call the WOM effect between friends "direct communication" because customers obtain information directly from their friends. In previous marketing theories based on the Bass model [17, 18], only communication from adopters to non-adopters is generally taken into account. In this paper, we additionally include the communication between non-adopters. We consider here that person i hears information from person j. The probability per unit time for the information to affect the purchase intention of person i can be written as D<sub>ij</sub>I<sub>j</sub>(t), where I<sub>j</sub>(t) is the purchase intention of person j and D<sub>ij</sub> is the coefficient of direct communication. Thus, we can describe the effect of direct communication as follows:

$$\sum\_{j=1}^{N} D\_{ij} I\_{j} \left( t \right),$$

where the summation excludes j = i.

In this paper, the rumor is called indirect communication. In this form of communication, a person hears a rumor while chatting on the street, overhearing a conversation from the next table in a restaurant or on a train, or finding the rumor in blogs or on Twitter. To construct a mathematical model, we focus on a person who listens to a conversation happening around him/her. We consider that person i overhears the conversation between person j and person k. The strength of the effect of the conversation of j and k can be described as D<sub>jk</sub>I<sub>j</sub>(t)I<sub>k</sub>(t). The probability per unit time for the conversation between j and k to affect the purchase intention of person i is defined as Q<sub>ijk</sub>D<sub>jk</sub>I<sub>j</sub>(t)I<sub>k</sub>(t), where Q<sub>ijk</sub> is the coefficient of indirect effect to i. Thus, the indirect communication coefficient can be defined as P<sub>ijk</sub> = Q<sub>ijk</sub>D<sub>jk</sub>.
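To make the individual-level Equation (1) concrete, the following sketch advances the intentions of a small group of people by one forward-Euler step at a time. It is only an illustration under stated assumptions: the population size, the uniform values of D<sub>ij</sub> and P<sub>ijk</sub>, the shape of the advertisement pulse, and the Gaussian individual deviation are all hypothetical choices, and the random force is simply held constant within each time step.

```python
import numpy as np


def step_intentions(I, D, P, f, a, dt):
    """One forward-Euler step of Equation (1) for all persons at once."""
    direct = D @ I                                # sum_j D_ij I_j
    indirect = np.einsum("ijk,j,k->i", P, I, I)   # sum_j sum_k P_ijk I_j I_k
    return I + dt * (-a * I + direct + indirect + f)


rng = np.random.default_rng(0)
N = 5                         # hypothetical number of persons
a, dt = 0.5, 0.01             # decay rate and time step (in days)
D = np.full((N, N), 0.02)     # assumed uniform direct coefficients
np.fill_diagonal(D, 0.0)      # the direct sum excludes j = i
P = np.full((N, N, N), 0.001) # assumed uniform indirect coefficients
I = np.zeros(N)

for step in range(1000):      # 10 days in total
    t = step * dt
    advert = 1.0 if t < 3.0 else 0.0              # collective external force
    f = advert + rng.normal(0.0, 0.1, size=N)     # plus individual deviation
    I = step_intentions(I, D, P, f, a, dt)

print(I.round(3))
```

The `einsum` call evaluates the three-person term directly, which costs on the order of N<sup>3</sup> operations per step; this is one practical reason why an averaged description is preferable for large populations.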

Equation (1) is for individuals; however, it is not convenient for analysis. Thus, we consider the ensemble average of the purchase intention of individuals, as follows:

$$
\langle I(t) \rangle = \frac{1}{N} \sum\_{i} I\_i \,(t). \tag{2}
$$

Considering the effect of direct and indirect communication, and the decline of the audience, we obtain the above equation for the mathematical model of the hit phenomenon. The advertisement and publicity effect for each person can be described by the random effect, fi(t).

For the ensemble average of Equation (1), we obtain for the left-hand side,

$$\left\langle \frac{dI\_i}{dt} \right\rangle = \frac{1}{N} \sum\_i \frac{dI\_i(t)}{dt} = \frac{d}{dt} \left( \frac{1}{N} \sum\_i I\_i(t) \right) = \frac{d \,\langle I \rangle}{dt}. \tag{3}$$

For the right-hand side, the ensemble averages of the first, second, and third terms are as follows:

$$\langle -aI\_{i} \rangle = -a\frac{1}{N}\sum\_{i}I\_{i}\left(t\right) = -a\langle I(t) \rangle,\tag{4}$$

$$\begin{aligned} \langle \sum\_{j} D\_{ij} I\_{j}(t) \rangle &= \langle \sum\_{j} D I\_{j}(t) \rangle = \frac{1}{N} \sum\_{i} \sum\_{j} D I\_{j}(t) \\ &= \sum\_{i} D \frac{1}{N} \sum\_{j} I\_{j}(t) = N D \langle I(t) \rangle, \end{aligned} \tag{5}$$

$$\begin{aligned} \langle \sum\_{j} \sum\_{k} P\_{ijk} I\_{j} \left( t \right) I\_{k} \left( t \right) \rangle &= \langle P \sum\_{j} \sum\_{k} I\_{j} \left( t \right) I\_{k} \left( t \right) \rangle \\ &= \frac{1}{N} \sum\_{i} P \sum\_{j} \sum\_{k} I\_{j} \left( t \right) I\_{k} \left( t \right) \\ &= \sum\_{i} P \frac{1}{N} \sum\_{j} \sum\_{k} I\_{j} \left( t \right) I\_{k} \left( t \right) \\ &= N P \sum\_{i} \frac{1}{N} \sum\_{j} I\_{j} \left( t \right) \frac{1}{N} \sum\_{k} I\_{k} \left( t \right) \\ &= N^{2} P \langle I \left( t \right) \rangle^{2}, \end{aligned} \tag{6}$$

where we assume that the coefficients of the direct and indirect communication can be approximated by

$$\begin{aligned} D\_{ij} &\cong D, \\ Q\_{ijk} D\_{jk} = P\_{ijk} &\cong P \end{aligned}$$

under the ensemble average.
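The averaging steps in Equations (5) and (6) can be verified numerically for uniform coefficients. The short check below is a sketch with arbitrary (hypothetical) values of N, D, P, and the individual intentions; as in the derivation, the sums run over all j and k.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
D, P = 0.03, 0.002                   # uniform coefficients (assumed values)
I = rng.uniform(0.0, 5.0, size=N)    # arbitrary individual intentions
mean_I = I.mean()                    # ensemble average, Equation (2)

# Left-hand sides: average over i of the full sums over j (and k).
direct_avg = np.mean([sum(D * I[j] for j in range(N)) for i in range(N)])
indirect_avg = np.mean(
    [sum(P * I[j] * I[k] for j in range(N) for k in range(N)) for i in range(N)]
)

# Right-hand sides of Equations (5) and (6).
print(np.isclose(direct_avg, N * D * mean_I))
print(np.isclose(indirect_avg, N**2 * P * mean_I**2))
```

Both comparisons hold for any choice of the intentions, because the double sum over j and k factorizes into the square of the single sum once P no longer depends on the indices.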

For the fourth term, which is the random effect term, we consider that the random effect can be divided into two parts: the collective and individual effects:

$$f\_i(t) = \left< f(t) \right> + \Delta f\_i(t) \,. \tag{7}$$

$$
\left< f\_i \left( t \right) \right> = \frac{1}{N} \sum\_i f\_i \left( t \right) = \left< f \left( t \right) \right>, \tag{8}
$$

where Δf<sub>i</sub>(t) is the deviation of the individual external effects from the collective effect, ⟨f(t)⟩. Thus, we consider here that the collective external effect term, ⟨f(t)⟩, corresponds to the advertisements and publicity for the persons in society. The deviation term, Δf<sub>i</sub>(t), corresponds to the deviation effect from the collective advertisement and publicity effect for individuals, which can be assumed to satisfy

$$
\left< \Delta f\_i \left( t \right) \right> = \frac{1}{N} \sum\_i \Delta f\_i \left( t \right) = 0. \tag{9}
$$

Taking the above ensemble average of Equation (1), we obtain the following form as the intention of society as a collective mode:

$$\frac{d\langle I(t) \rangle}{dt} = -a\langle I(t) \rangle + D\langle I(t) \rangle + P\langle I(t) \rangle^2 + \sum\_{\xi} C\_{\xi} A\_{\xi}\left(t\right),\tag{10}$$

where the factors of N are absorbed into the coefficients, that is, ND and N<sup>2</sup>P are rewritten as D and P, respectively. The detailed derivation is shown in Allsop et al. [3]. We represent the external effect as ⟨f(t)⟩ = Σ<sub>ξ</sub> C<sub>ξ</sub>A<sub>ξ</sub>(t). Hereafter, we denote ⟨I(t)⟩ as I(t). Equation (10) is modified slightly from the original model in Ishii et al. [7]. In the original mathematical model for the hit phenomenon, direct and indirect communication are distinguished according to the roles of known and unknown people, respectively, for a certain topic. Thus, in the original model, five parameters are to be determined for direct and indirect communication. Moreover, in the original model, we assume different parameters for the before-open and after-open periods. Thus, at least 10 parameters need to be adjusted in the original model for the communication effects. In the model of this paper, for simplicity, we do not distinguish between known and unknown people for the topic of concern. Hence, the number of parameters that should be adjusted using real data is only two, D and P. The decay rate, a, in Equation (10) can be assumed to be 0.5, which is the same as that in the original model [7]. The strength of each medium, C<sub>ξ</sub>, should be determined separately. Thus, if the number of media is one, the number of parameters that should be adjusted using real data is only three.
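Equation (10) is an ordinary differential equation for a single quantity and can be integrated directly. The sketch below does so for one medium with a daily advertisement series as input, using a = 0.5 and a day as the unit of time, as in the text. The function name `simulate_hit_model`, the sub-stepping scheme, the advertisement series, and the values of C, D, and P are hypothetical and chosen only to produce a readable example.

```python
def simulate_hit_model(adverts, C, D, P, a=0.5, I0=0.0, substeps=100):
    """Integrate Equation (10) for one medium; returns I at the end of each day."""
    dt = 1.0 / substeps          # fraction of a day per Euler step
    I = I0
    daily = []
    for A in adverts:            # A: advertisement exposure on that day
        for _ in range(substeps):
            I += dt * (-a * I + D * I + P * I * I + C * A)
        daily.append(I)
    return daily


# Made-up daily TV exposure in seconds (not real data).
adverts = [0, 0, 120, 300, 60, 0, 0, 0, 0, 0]
curve = simulate_hit_model(adverts, C=0.01, D=0.1, P=0.01)
for day, value in enumerate(curve):
    print(day, round(value, 3))
```

With these values the interest rises while the exposure is large and then decays once the advertisement stops. Note that the quadratic term can make the solution grow without bound if P or the forcing is large enough, so parameter ranges should be checked when experimenting.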

In the following calculation, coefficients C, D, and P are determined such that the calculated value according to the Equation (10) coincides with the daily change in the observed tweet number; the Monte Carlo method is used, and the details are available in Ishii et al. [7], as given below.

The advertisement and publicity effects are included in A<sub>ξ</sub>(t), which is treated as an external force. The index ξ runs over the mass media exposures. The WOM, represented by posts on SNSs, such as blogs or Twitter, is used as the observed data, which can be compared with the calculated results of the model. The unit of time is a day.

The advertisement and publicity effects are obtained from M Data Co. Ltd (http://mdata.tv/en/) as the TV metadata of real-time advertisement and publicity on television for a certain topic. TV metadata includes text data containing the summary of TV programs and commercials with time stamps. M Data records them immediately after broadcast. It captures TV metadata by verifying the aired content with human eyes and ears. This metadata contains the summary of the broadcast contents, performer's name, brand name, company name, place name, and duration of exposure. The WOM, represented by posts on SNSs, is observed using the social media analysis tool, Kuchikomi @ Kakaricyo, by Hottolink Co. Ltd (https://www.hottolink.co.jp/english/).

#### Parameter Estimation

For reliability, we introduce the "R-factor" (reliability factor), which is well-known in the field of low-energy electron diffraction (LEED) [48]. In LEED experiments, the experimentally observed curve of the current vs. voltage is compared to the corresponding theoretical curve, using the R-factor.

For our purpose, we define the R-factor as follows:

$$R = \frac{\sum\_{i} \left( f(i) - g(i) \right)^{2}}{\sum\_{i} \left[ f^{2} \left( i \right) + g^{2}(i) \right]},\tag{11}$$

where f(i) and g(i) correspond to the calculated I(t) and the observed number of blog posts or tweets, respectively. The smaller the value of R, the better the agreement between the functions f and g. Thus, we use a stochastic method to search for the parameter set that minimizes R. This random number technique is similar to the Metropolis method [49], which we have used previously [7]. In the actual calculation, we change each parameter within 10% of its value per turn, using a random number. We perform such calculations for more than one hundred thousand turns, similar to the Metropolis method for molecular dynamics. In molecular dynamics, one tries to obtain the configuration that gives the lowest total energy. In our case, we try to obtain the parameter configuration with the smallest R-factor.
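The R-factor of Equation (11) is straightforward to compute from two series of equal length. The helper below is a minimal sketch; the function name `r_factor` and the sample numbers are illustrative and not taken from the paper's data.

```python
def r_factor(calculated, observed):
    """R-factor of Equation (11); 0 means the two series coincide."""
    if len(calculated) != len(observed):
        raise ValueError("series must have the same length")
    numerator = sum((f - g) * (f - g) for f, g in zip(calculated, observed))
    denominator = sum(f * f + g * g for f, g in zip(calculated, observed))
    return numerator / denominator


model_curve = [1.0, 2.0, 3.0]     # hypothetical calculated I(t)
posts = [1.1, 1.9, 3.2]           # hypothetical observed daily posts
print(r_factor(model_curve, model_curve))
print(r_factor(model_curve, posts))
```

The definition is symmetric in f and g and is normalized by the squared magnitudes of both series, so it does not depend on the overall scale of the post counts.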

In the real calculation, when adjusting parameters C<sub>ξ</sub>, D, and P, trapping in a local minimum needs to be avoided, as in first-principles calculations in materials physics. There are several ways to search for the minimum, including the steepest descent, equation of motion, and conjugate gradient methods. Even in actual first-principles or density functional theory calculations, local minimum trapping needs to be avoided. In this paper, we simply repeat the Metropolis-like calculation from several initial values to avoid local minimum trapping. To check the accuracy of the parameter adjustment, we use the R-factor value. For every calculation shown in this paper, the R-factor is below 0.01.
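The search procedure described above can be sketched as follows. This is a simplified, greedy variant written for illustration, not the authors' code: each turn perturbs every parameter by up to 10% and keeps the trial only if the R-factor decreases, and the search is restarted from several initial values. The synthetic "observed" series, the starting points, and the number of turns are all assumptions.

```python
import random


def simulate(adverts, C, D, P, a=0.5, substeps=20):
    """Euler integration of Equation (10) for one medium (daily values)."""
    dt = 1.0 / substeps
    I, daily = 0.0, []
    for A in adverts:
        for _ in range(substeps):
            I += dt * (-a * I + D * I + P * I * I + C * A)
        daily.append(I)
    return daily


def r_factor(f, g):
    """R-factor of Equation (11)."""
    num = sum((x - y) * (x - y) for x, y in zip(f, g))
    den = sum(x * x + y * y for x, y in zip(f, g))
    return num / den


def fit(adverts, observed, start, turns=2000, seed=0):
    """Greedy random search: perturb each parameter within 10% per turn."""
    rng = random.Random(seed)
    best = dict(start)
    best_r = r_factor(simulate(adverts, **best), observed)
    for _ in range(turns):
        trial = {k: v * (1.0 + rng.uniform(-0.1, 0.1)) for k, v in best.items()}
        r = r_factor(simulate(adverts, **trial), observed)
        if r < best_r:           # a diverged trial gives nan and is rejected
            best, best_r = trial, r
    return best, best_r


adverts = [0, 0, 120, 300, 60, 0, 0, 0, 0, 0]            # made-up exposure
observed = simulate(adverts, C=0.01, D=0.1, P=0.01)      # synthetic target

starts = [                                               # several initial values
    {"C": 0.02, "D": 0.05, "P": 0.02},
    {"C": 0.005, "D": 0.2, "P": 0.005},
]
results = [fit(adverts, observed, s, seed=n) for n, s in enumerate(starts)]
best_params, best_r = min(results, key=lambda pair: pair[1])
print(best_params, best_r)
```

A full Metropolis scheme would also accept some trials that increase R, with a probability controlled by a temperature-like parameter; the greedy rule used here is the simplest choice and relies on the multiple starting points to reduce the risk of local minimum trapping.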

Although parameters C<sub>ξ</sub>, D, and P in Equation (10) can be considered as functions of time, we retain C<sub>ξ</sub>, D, and P as constant values to examine whether Equation (10) can be applied to any social phenomena.

on TV (Green) and the internet (Purple). The blue curve corresponds to the observed number of daily Blog postings and the red curve is our calculation.

of daily blog posts and the corresponding calculated values. The histogram for "Case C" [ADV (s)] indicates the daily exposure on TV of information about "Case C," in seconds. The histogram for "Case D" [ADV (s)] indicates the daily exposure on TV (green) and the internet (orange).

## RESULTS

In this section, we present the analysis results of social-media posts, using the mathematical model of the hit phenomenon. The actual analysis of the direct and indirect communication, which are critical in the mathematical model of the hit phenomenon, is presented. Other examples include Japanese group events, the reputation of popular videos, and the results before the conclusion of event-ticket reservation.

### Case A

In **Figure 1**, the mathematical model of the hit phenomenon is applied to "Case A." Our target is the WOM of a famous Japanese food, "Case A." The horizontal axis in the figure represents the date from 2014/1/1 to 2014/12/31. The peaks in the figure correspond to the number of daily exposures of "Case A" on blogs and the internet. As can be seen, our calculation reproduces the number of social media posts with sufficient precision. Thus, the mathematical model of the hit phenomenon can be applied to the WOM of the famous Japanese food, "Case A."

### Case B

A movie example is shown in **Figure 2**, where our calculation for the American movie, "Case B" (2015), is depicted [50]. The histogram indicates the number of seconds of exposure by advertisement or publicity on television. Excluding small fluctuations, the calculation result matches the number of Twitter postings accurately. Thus, the mathematical model of the hit phenomenon can be used for movies as well.

### Strong Indirect Communication

In the analysis, we present an example, where indirect communication, which is a characteristic action in the mathematical model of the hit phenomenon, has significant effect. According to the mathematical-model analysis of the hit phenomenon, several movies show large indirect communication. The results for movies, "Case C" and "Case D," are depicted in **Figure 3**. The calculation result by the mathematical model of the hit phenomenon matches well with the number of blog posts. Considerable indirect communication enhances the movie's reputation.

Another typical break is "Case E," a Japanese "talent" (TV personality) who became a hit in September 2016. **Figure 4** shows the analysis of the reputation of "Case E" using the number of blog posts, compared with the values calculated using the mathematical model of the hit phenomenon. The calculation agrees well with the actual measurement [50].

**Figure 5** compares the coefficients determined before and after the day an influencer introduced the video. The strength, "P," of the indirect communication increased significantly after the introduction. The response to media exposure also increased considerably. Thus, it is considered that an explosive epidemic appears as an increase in the strength, "P," of the indirect communication.

On the other hand, the direct communication strength, "D," is rather weak. Although this movie spread through the break, there was no increase in the number of core interested people of "Case E." This indicates that even in the midst of an explosive epidemic, the direct communication strength, i.e., the satisfaction of the core interested people, does not necessarily increase. We also performed a similar calculation on the explosive epidemic for "Pokemon GO" [51].

### Direct and Indirect Communication

The following example concerns the reputation of a famous Japanese group, "Case F," at an event. We analyzed the reputation before and after the event, held in summer 2015, using the mathematical model of the hit phenomenon [52]. The result is shown in **Figure 6**.

Several famous groups participated in this event and gathered a vast audience. Before the event, the indirect communication strength, "P," was large, whereas the direct communication strength, "D," increased after the event. Indirect communication shows the strength with which people other than the already interested people become interested, and direct communication indicates the interest of the interested people. In "Case F," it appears that those who were interested before the event became core interested people of "Case F" after the event.

In hit contents, indirect communication increases after publication. The "Case E" epidemic is consistent with the increase in break and indirect communication. Hence, the reputation breaks, when indirect communication increases rapidly. Additionally, convergence occurs, when indirect communication decreases.

On the other hand, the strength of direct communication shows the enthusiasm of core interested people but does not imply that there are numerous core interested people. The increase in direct communication after the event in "Case F" indicates an increase in the number of enthusiastic interested people. However, the fact that there is no increase in direct communication in "Case E" indicates that "Case E" is only a passing topic and that there is no increase in the core interested people of "Case E."

### DISCUSSION

Using the mathematical model of the hit phenomenon, we analyzed the reputations of a movie, a YouTube movie that became a global topic, and the trend of a popular event in Japan. Important factors in the mathematical model of the hit phenomenon include the direct communication strength, "D," the indirect communication strength, "P," and the coefficient, "C," of the media response strength.

The results indicate that for the reputation of the "Case E" movie, the indirect communication strength, "P," increased with the world-wide reputation. "P" tends to be large for hit movies as well. Therefore, the indirect communication strength, "P," was found to be related to the wide propagation of the topic.

On the other hand, the comparison of the reputation, before and after the group event, shows that the direct communication strength, "D," appears to be the satisfaction level of the support layer.

Hence, "D" indicates whether the current support layer is satisfied, and P indicates the power to acquire a new support layer. This can be said to be effective not only for the analysis of entertainment and marketing strategy but also for political election analysis.

As the mathematical model of the hit phenomenon is a theory of sociophysics, it is possible to describe how interest arises in a person in society and to follow the time change of this interest. Therefore, the model is easy to extend. For example, to determine which of two competing topics attracts interest, a theory that solves two mathematical models of the hit phenomenon simultaneously has already been proposed [53, 54].

In addition, it is possible to solve the influence of social media on the market share of products by combining the theory of market share in economics with the mathematical model of the hit phenomenon [55].

The hits on social media are similar to the burst phenomenon, which evolves through non-Poissonian dynamics.

#### CONCLUSION

In this paper, using the mathematical model of the hit phenomenon, which is one of the theories of sociophysics, the rise and convergence of topics in society were calculated for movies, events, and the reputation of a YouTube movie that became a global topic. This establishes that the mathematical model of the hit phenomenon can explain the spread of topics as a social phenomenon. Using this model, it can be determined whether the topic is spread beyond clusters by social dynamics; if the indirect communication is considerable, it becomes a hit. Additionally, it is possible to quantitatively analyze the propagation mechanism of popular topics, using the mathematical model of the hit phenomenon. It may be possible to clarify the mechanism for information propagation as a social epidemic phenomenon by using the corresponding parameters.

#### AUTHOR CONTRIBUTIONS

AI conceived the model and selected the targets. YK performed the actual computations.

#### REFERENCES

to newly launched movies. Technol Forecast Soc Change (2016) **109**:35–49. doi: 10.1016/j.techfore.2016.05.013


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Ishii and Kawahata. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
