Optimal distribution of incentives for public cooperation in heterogeneous interaction environments

In the framework of evolutionary games with institutional reciprocity, limited incentives are available for rewarding cooperators and punishing defectors. In the simplest case, it can be assumed that, depending on their strategies, all players receive equal incentives from the common pool. The question arises, however, what is the optimal distribution of institutional incentives? How should we best reward and punish individuals for cooperation to thrive? We study this problem for the public goods game on a scale-free network. We show that if the synergetic effects of group interactions are weak, the level of cooperation in the population can be maximized simply by adopting the simplest “equal distribution” scheme. If synergetic effects are strong, however, it is best to reward high-degree nodes more than low-degree nodes. These distribution schemes for institutional rewards are independent of payoff normalization. For institutional punishment, however, the same optimization problem is more complex, and its solution depends on whether absolute or degree-normalized payoffs are used. We find that degree-normalized payoffs require that high-degree nodes be punished more leniently than low-degree nodes. Conversely, if absolute payoffs count, then high-degree nodes should be punished more strongly than low-degree nodes.


INTRODUCTION
In human societies, cooperation is essential for the maintenance of public goods. However, cooperation often collapses in the public goods dilemmas we face today, such as protecting the global climate or avoiding overfishing of our oceans (Hardin, 1968; O'Neill and Oppenheimer, 2002). To avoid the tragedy of the commons (Hardin, 1968), we often rely on institutions to enforce public cooperation and acceptable behavior. Although institutionalized punishment appears to be more common than institutionalized reward, both concepts are in use throughout the world. Recently, ample research efforts have been devoted to the study of the emergence of institutions and their effectiveness in promoting prosocial behavior (Yamagishi, 1986; Ostrom, 1990; Gurerk et al., 2006; Henrich, 2006; Cuesta et al., 2008; Sigmund et al., 2010; Baldassarri and Grossman, 2011; Sasaki and Unemi, 2011; Szolnoki et al., 2011a,b; Cressman et al., 2012; Isakov and Rand, 2012; Sasaki et al., 2012; Bechtel and Scheve, 2013; Cressman et al., 2013; Vasconcelos et al., 2013; Vukov et al., 2013). It has been shown, for example, that institutional rewarding promotes the evolution of cooperation in the linear public goods game, the nonlinear public goods games (Chen et al., 2013), and in structured populations in general (Jiménez et al., 2009; Szolnoki et al., 2011a,b). However, institutional punishment is less costly and thus more effective in warranting a given level of public cooperation, especially if participation in the public goods game is optional (Sasaki et al., 2012; Sasaki, 2013).
Besides the obvious stick vs. carrot dilemma (Hilbe and Sigmund, 2010; Perc, 2012, 2013), the question emerges how to best make use of the available resources, which inevitably are finite (Perc, 2012; Chen and Perc, 2014). In particular, we wish to make optimal use of different forms of reciprocity to promote human cooperation (Poteete et al., 2010; Gracia-Lázaro et al., 2012a,b; Exadaktylos et al., 2013; Rand and Nowak, 2013). One plausible approach is to allocate the resources depending on the properties of the interaction network that describes the connections among us. Surprisingly, few studies have thus far considered the problem of the optimal allocation of incentives for maximizing public cooperation. Traditionally, all groups and all individuals are considered equal, and depending on their strategies thus deserving of the same reward or punishment (Sasaki et al., 2012; Chen et al., 2013). This simple assumption, however, does not agree with the fact that in social networks individuals have different roles, which depend significantly on the degree of the node that they occupy. Indeed, the prominent role of heterogeneous interaction networks for the successful evolution of cooperation is firmly established and well known (Santos and Pacheco, 2005; Santos et al., 2006a,b; Gómez-Gardeñes et al., 2007; Fu et al., 2007; Masuda, 2007; Tanimoto, 2007; Tomassini et al., 2007; Assenza et al., 2008; Fu et al., 2008; Santos et al., 2008; Floría et al., 2009; Fu et al., 2009; Pacheco et al., 2009; Peña et al., 2009; Poncela et al., 2009; Tanimoto, 2009; Brede, 2011; Gómez-Gardeñes et al., 2011; Kovářík et al., 2012; Pinheiro et al., 2012; Tanimoto et al., 2012; Simko and Csermely, 2013; Tanimoto, 2013), and the reasonable assumption is that the incentives should likely also be distributed accordingly for optimal evolutionary outcomes.
In the framework of evolutionary graph theory the interaction groups are diverse, and naturally thus the provided incentives within each group should also be different. The number of links an individual player has is traditionally assumed to be a good proxy for that player's influence and importance. In this sense, it is interesting and highly relevant to determine how to distribute the incentives in the light of this heterogeneity.
Here we consider the spatial public goods game (Szolnoki et al., 2009) with institutional reciprocity on a scale-free network (Barabási and Albert, 1999; Albert and Barabási, 2002), where the assumption is that the incentives available for rewarding cooperators and punishing defectors are limited. We assume that the budget available to each group is proportional to its size, and that the distribution of the incentives depends on the number of individual links within the group. Our aim is to arrive at a thorough understanding of how the incentives should be best distributed to maximize public cooperation. In what follows, we present the results obtained with the model described in the Methods section, to which we refer for details. As we will show, if the enhancement factor r is small, the level of cooperation can be maximized simply by adopting the simplest "equal distribution" scheme. If the value of r is large, however, it is best to reward high-degree nodes more than low-degree nodes. Unlike for institutionalized rewards, the optimal distribution of resources within the framework of institutional punishment depends on whether absolute or degree-normalized payoffs are used. High-degree nodes should be punished more leniently than low-degree nodes if degree-normalized payoffs apply, while high-degree nodes should be punished more strongly than low-degree nodes if absolute payoffs count.

RESULTS
We perform Monte Carlo simulations of the public goods game described in the Methods section, whereby we consider separately institutional rewarding with absolute payoffs and institutional punishment with absolute payoffs, as well as institutional rewarding with degree-normalized payoffs and institutional punishment with degree-normalized payoffs. As the key parameters we consider the enhancement factor r, the average amount of available incentives δ, and the distribution strength of incentives α (see Methods for details). We determine the stationary fraction of cooperators on networks comprising N = 1000 to 10,000 nodes. The final results are averaged over 100 independent initial conditions to further enhance accuracy.
Figure 1 shows the stationary fraction of cooperators for two different values of the enhancement factor r. When the enhancement factor is small (Figures 1A,C), defectors always dominate if δ < 0.2, regardless of the value of α. For intermediate values of δ, the cooperation level is maximized at an intermediate value of the distribution strength α, which is close to zero. This indicates that an equal distribution of positive incentives, regardless of the degree of players within the group, is the optimal distribution scheme for public cooperation. For high values of δ, the cooperation level increases with increasing α. If the enhancement factor r increases (Figures 1B,D), defectors still dominate for small values of δ, regardless of the value of α. However, the nonmonotonous dependence of the fraction of cooperators on the distribution strength α disappears for intermediate values of δ. Instead, the highest cooperation level is attained for large values of α.

INSTITUTIONAL REWARDING WITH ABSOLUTE PAYOFFS
Intuitively, it is possible to understand that when the enhancement factor is small, a modest positive incentive is not enough to reverse the doom of cooperators, no matter which distribution scheme is used (Sasaki et al., 2012). Conversely, if the incentives are large and targeted preferentially toward influential players, they can have a high payoff even if the part stemming from the public goods game is small. In agreement with the traditional argument of network reciprocity (Nowak and May, 1992; Santos and Pacheco, 2005; Wang et al., 2013), only cooperators are able to forge a long-term advantage out of this favorable situation and build sizable cooperative groups. Thus, for high-enough values of α, which favor the distribution of rewards toward high-degree nodes, the evolution of public cooperation is successful.
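As a small illustration of how the distribution strength shifts the incentive pool between recipients, the following Python sketch computes the share k^α / Σ k^α that each recipient obtains, which is the weighting defined in the Methods section. The function name `incentive_shares` and the example degrees are illustrative assumptions and are not taken from the simulations.

```python
def incentive_shares(degrees, alpha):
    """Fraction of a group's incentive pool assigned to each recipient.

    degrees : degrees k of the eligible recipients in one group
              (cooperators for rewards, defectors for fines)
    alpha   : distribution strength
    """
    weights = [k ** alpha for k in degrees]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical group with one hub (k = 10) and two peripheral recipients.
example_degrees = [1, 2, 10]
for alpha in (-1, 0, 1):
    shares = incentive_shares(example_degrees, alpha)
    print(alpha, [round(s, 3) for s in shares])
```

In this hypothetical group every recipient obtains the same share for α = 0, the hub obtains 10/13 of the pool for α = 1, and the lowest-degree recipient obtains the largest share for α = −1.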
Understanding the optimal intermediate value of the average amount of available incentives δ requires more effort. First, we show in Figure 2 the payoff differences between cooperators and defectors as well as the fraction of cooperators as a function of degree k during different typical evolutionary stages of the game. We observe that for α = 0, which implies an "equal distribution" scheme irrespective of the degree of players, the payoff of cooperators is higher than that of the corresponding defectors, regardless of k. Thus, cooperators can successfully occupy all nodes of the network. In contrast, for negative values of α, the payoff of cooperators with high or middle degree is less than that of the corresponding defectors, while cooperators with low degree have a higher mean payoff than defectors with low degree. Because there are interconnections among different types of nodes, and because the Fermi strategy updating rule is adopted, cooperators can coexist with defectors at equilibrium. But defectors can occupy most of the nodes in the network, since low-degree cooperators do not obtain a high enough payoff to spread their strategy across the network. For positive values of α, the payoff of cooperators with high and middle degree is larger than that of the corresponding defectors, while cooperators with low degree have a lower mean payoff than defectors with low degree. In addition, high-degree cooperators obtain a sufficiently high payoff through institutional rewarding, which enables them to spread the cooperative strategy also toward some of the auxiliary low-degree nodes. Accordingly, cooperative behavior prevails over defection, but the stationary state is still a mixed C + D phase: cooperators are unable to dominate completely.
To further corroborate our arguments, we show in Figure 3 the payoff difference between cooperators and defectors as well as the fraction of cooperators as a function of degree k, as obtained for a value of r twice as large as that used in Figure 2. In comparison, the results for α ≤ 0 remain unchanged. For α > 0, on the other hand, the evolutionary process differs from that presented in Figure 2. During the early stages of evolution (Figures 3A,B), cooperators with low degree can have a lower mean payoff than low-degree defectors, while cooperators with middle and high degree can have a higher mean payoff than the corresponding defectors. Further in time, cooperators succeed in occupying all high-degree nodes (Figure 3G), and even low-degree cooperators have a payoff comparable to that of low-degree defectors (Figure 3C). Cooperators can eventually invade the whole network (Figure 3H), thus giving rise to the absorbing C phase at r = 2, which emerges for sufficiently large values of α and intermediate values of δ.
Figure 4 shows that when the enhancement factor is small, cooperators are unable to survive for small values of δ, irrespective of the value of α. For intermediate values of δ, the highest cooperation level is attained at an intermediate value of the distribution strength α, which is almost equal to zero, as for institutional rewarding in the preceding subsection. This indicates that, for small enhancement factors, an "equal distribution" scheme works best for the evolution of public cooperation in the case of institutional punishment too. If δ is large, that is, if resources for punishment abound, cooperators can always dominate, regardless of the value of α. For a value of r twice as large, cooperators are favored even more, so that the nonmonotonous dependence of the cooperation level on α at intermediate values of δ vanishes. Instead, the fraction of cooperators simply increases with increasing values of α. Thus, the more the high-degree defectors are punished, the better for the evolution of public cooperation.

INSTITUTIONAL PUNISHMENT WITH ABSOLUTE PAYOFFS
It is understandable that low values of r, paired with modest resources for punishing defectors, lead to the dominance of defectors, regardless of the value of α. Conversely, if the resources abound, defectors are punished severely and cooperators dominate. In this limiting case the distribution of fines between low, middle and high degree nodes does not play an important role. If, however, the combination of values of r and δ just barely, or not at all, supports the survival of cooperators, then the value of α, and thus the particular distribution of incentives (in this case fines), plays a significant role. With the aim of explaining this nontrivial dependence on α, we show in Figure 5 the payoff difference between cooperators and defectors as well as the fraction of cooperators as a function of degree k during different stages of the evolutionary process. It can be observed that for α = 0 cooperators can always have a higher payoff than defectors of the same degree (Figures 5A-C).

Cooperators can thus rise to complete dominance. For α < 0, however, low-degree cooperators can have a higher mean payoff than low-degree defectors, while cooperators with middle or high degree cannot match the corresponding defectors during the early stages of evolution (Figure 5A). Defectors can therefore, over time, occupy the high-degree and middle-degree nodes (Figure 5F). This invasion can decrease the fraction of cooperators on low-degree nodes (Chen et al., 2008). Accordingly, the payoff difference between cooperators and defectors on these nodes continues to be negative, although low-degree defectors receive the negative incentives. Ultimately cooperators therefore die out. For α > 0, on the other hand, defectors with higher degree are punished preferentially: they receive a bigger share of fines from the available fund than low-degree defectors. Due to the small enhancement factor and the institutional punishment, both high-degree cooperators and high-degree defectors have negative payoffs. In fact, low-degree players can have a higher payoff than high-degree players, despite the fact that we use absolute payoffs in this particular case. Either way, defectors can easily invade low-degree nodes (Figure 5F), and they can spread further toward middle and high degree nodes, although at the beginning of evolution cooperators have a higher payoff than defectors on these nodes. The ultimate consequence is that defectors dominate completely (Figure 5H).

FIGURE 2 | Time evolution of the mean payoff difference between cooperators and defectors (A-D) and the fraction of cooperators (E-H) as a function of degree k for three typical values of α. Institutional rewarding and absolute payoffs apply. The insets of (A,B) show the mean payoff difference between cooperators and defectors for low-degree and middle-degree nodes during the early stages of evolution. During the evolutionary process, if the enhancement factor is small, cooperators always have a higher mean payoff than defectors at an intermediate value of α. Parameter values are r = 1 and δ = 0.5.
It remains of interest to explain why the dependence on α disappears at intermediate values of δ for r = 2. For this purpose, we show in Figure 6 the same quantities as in Figure 5, from which it follows that the results do not change for α ≤ 0. For α > 0, however, the differences are clear. For r = 2 the high-degree cooperators can obtain a positive payoff, and naturally they then have a higher payoff than high-degree defectors, because the latter receive ample negative incentives from the institutional punishment pool (Figures 6A-C). Cooperators can therefore occupy high-degree nodes and from there spread across the whole population (Figure 6G), and they do so more effectively the higher the value of α.

INSTITUTIONAL REWARDING WITH DEGREE-NORMALIZED PAYOFFS
From here onwards we consider degree-normalized payoffs, which can have important negative consequences for the evolution of public cooperation in heterogeneous environments compared to absolute payoffs (Masuda, 2007; Tomassini et al., 2007; Wu et al., 2007; Szolnoki et al., 2008; Maciejewski et al., 2014). As shown in Figure 7, the fraction of cooperators unsurprisingly increases with increasing δ for various values of the distribution strength α (top panels). Furthermore, at small values of r, defectors always dominate for small values of δ, regardless of the value of α, while for intermediate values of δ cooperators recover gradually as α increases. For high δ values there exists an intermediate close-to-zero value of α that maximizes the stationary fraction of cooperators (Figures 7A,D). When the enhancement factor is larger, the extent of the parameter region where the nonmonotonous phenomenon can be observed decreases. Instead, for high values of δ, the cooperation level increases with increasing values of α and the area of complete cooperator dominance increases as well (Figures 7E,F). Comparing the results presented in Figure 7 with those presented in Figure 1, we find that the nonmonotonous phenomenon still exists, and it can appear even at larger values of r because of the consideration of degree-normalized payoffs. In general, however, the explanation of these results and the evolutionary mechanisms behind them are the same as those described for institutional rewarding with absolute payoffs.

FIGURE 5 | Time evolution of the mean payoff difference between cooperators and defectors (A-D) and the fraction of cooperators (E-H) as a function of degree k for three typical values of α. Institutional punishment and absolute payoffs apply. The insets of (A,B) show the mean payoff difference between cooperators and defectors for low-degree and middle-degree nodes during the early stages of evolution. During the evolutionary process, if the enhancement factor is small, cooperators always have a higher mean payoff than defectors at an intermediate value of δ. Parameter values are r = 1 and δ = 0.5.

FIGURE 6 | Time evolution of the mean payoff difference between cooperators and defectors (A-D) and the fraction of cooperators (E-H) as a function of degree k for three typical values of α, as obtained for a larger enhancement factor but a smaller average amount of available incentives than used in Figure 5. Parameter values are r = 2 and δ = 0.3.

Frontiers in Behavioral Neuroscience
www.frontiersin.org July 2014 | Volume 8 | Article 248 | 6

INSTITUTIONAL PUNISHMENT WITH DEGREE-NORMALIZED PAYOFFS
Lastly, we consider institutional punishment with degree-normalized payoffs. From the results presented in Figure 8 it follows that the stationary fraction of cooperators increases with increasing values of δ, regardless of the value of α (top panels). When the enhancement factor is small, the nonmonotonous dependence of the fraction of cooperators on α exists at intermediate values of the average incentive δ (Figure 8D). When the enhancement factor increases, this phenomenon still exists, but the extent of the parameter region where the nonmonotonous dependence can be observed decreases (Figure 8E). Surprisingly, when the enhancement factor increases further, the nonmonotonous dependence disappears. Instead, in a narrow region of intermediate δ values, the fraction of cooperators decreases with increasing values of α. At the same time, the extent of the full cooperation area increases while the full defection region decreases in the considered (α, δ) parameter space (Figure 8F). While the underlying mechanism for the nonmonotonous dependence on α is qualitatively identical to that reported before for institutional punishment with absolute payoffs, the decrease of the level of cooperation at intermediate values of δ and r = 3 as α increases requires special attention. We note that when the value of r is large, low-degree cooperators can still have a positive payoff. For negative α, these low-degree cooperators can even have the highest payoffs, because payoffs are degree-normalized and the values of δ are sufficiently high to weigh heavily on the defectors. Therefore, cooperators dominate on all low-degree nodes and from there spread further across the whole network and rise to dominance. This atypical spreading is a unique consequence of the consideration of the optimal distribution of negative incentives from the punishment pool, and it highlights the importance of the parameter α. For positive values of α, in contrast, most of the negative incentives are assigned to high-degree defectors, of which there are only a few in the entire population (Barabási and Albert, 1999), so the majority of low-degree defectors is not punished at all. The previously described spreading of cooperators from the low-degree nodes outwards is therefore impaired, which ultimately results in an overall lower stationary fraction of cooperators. Instead of cooperation, for larger values of α the low-degree nodes "emit" defection throughout the population.

DISCUSSION
To summarize, we have studied how to best distribute limited institutional incentives in order to maximize public cooperation on scale-free networks. We have considered both institutional rewarding of cooperators and institutional punishment of defectors, and we have also distinguished between absolute and degree-normalized payoffs. Our key assumption was that, since in heterogeneous environments players have a different number of partners, the incentives ought to be distributed by taking this into account. This would be in agreement with the established importance of degree heterogeneity for cooperation in evolutionary games (Santos et al., 2006a, 2008). Traditionally, however, previous research has assumed that the limited budget is distributed equally among all the potential recipients of the incentives, irrespective of each player's status and influence within the network. Accordingly, how to distribute the incentives to optimize public cooperation was an important open problem.
We have found interesting solutions on how to optimally distribute the incentives based on each player's social influence level, the proxy for which is the number of social ties the player has within the interaction network. We have shown that sharing the incentives equally among all regardless of status is optimal only if the social dilemma is strong and the propensity to contribute to the common pool is thus weak, and if in addition the available amount of incentives is intermediate. This result is valid for both institutional punishment and institutional rewarding, and it does not depend on whether absolute or degree-normalized payoffs count toward evolutionary fitness. However, if the environment already favors cooperative behavior, that is, when the public goods game is characterized by a high enhancement factor, then it is best to reward influential players more than low-degree players, regardless of whether absolute or degree-normalized payoffs apply. For institutional punishment, on the other hand, the solution of the optimization problem depends on whether absolute or degree-normalized payoffs are used. We have shown that degree-normalized payoffs require that high-degree nodes be punished more leniently than low-degree nodes, while if absolute payoffs count, then high-degree nodes should be punished more strongly than low-degree nodes. In general, rewarding influential cooperators strongly and punishing auxiliary defectors leniently appears to be optimal for the successful evolution of public cooperation.
In terms of solving actual common goods problems, our work might have merit in situations with strong diversity in roles and group sizes. One representative example of such a situation is climate change governance, where existing research has shown that local institutions are an effective way to promote the emergence of widespread cooperation (Vasconcelos et al., 2013). Since our results are derived not only from local institutions, but take into account also the heterogeneous interaction environment, they could offer further advice on how to arrive at globally acceptable climate policies (Vasconcelos et al., 2014).
While the evolution of institutions remains a puzzle, their importance for enforcing socially acceptable behavior in human societies can hardly be overstated. Although institutionalized punishment appears to be prevailing, recent research concerning the effectiveness of punishment, for example related to antisocial punishment (Herrmann et al., 2008; Rand and Nowak, 2011), reciprocity (Ohtsuki et al., 2009), and reward (Rand et al., 2009; Hilbe and Sigmund, 2010), is questioning the aptness of sanctioning for elevating collaborative efforts and raising social welfare. Indeed, although the majority of previous studies addressing the "stick vs. carrot" dilemma concluded that punishment is more effective than reward in sustaining public cooperation (Sigmund, 2007), evidence is mounting that rewards may be as effective as punishment and lead to higher total earnings without potential damage to reputation or fear of retaliation (Dreber et al., 2008; Perc, 2010, 2012). In particular, Rand and Nowak (2011) argue convincingly that healthy levels of cooperation are likelier to be achieved through less destructive means. We hope that our study will prove to be inspirational for further research aimed at discerning the importance of positive and negative reciprocity for human cooperation, as well as for looking closely at their correlated effects (Chen et al., submitted).

METHODS
We consider the evolutionary public goods game on the Barabási-Albert scale-free network (Barabási and Albert, 1999; Albert and Barabási, 2002). Each player x occupies one node of the network, and it can choose between cooperation ($s_x = 1$) and defection ($s_x = 0$) as the two competing strategies. To each public goods game cooperators contribute the cost $c = 1$, while defectors contribute nothing. The payoff of player x who is a member of the group $G_y$, which is centered on player y, depends on the size of the group $k_y + 1$ (here $k_y$ is also the degree of node y), on the number of cooperators $n_c$ in the group, and on the enhancement factor r. In addition to the payoffs stemming from the public goods game, each group receives institutional incentives $I_x = (k_x + 1)\delta$ to be used either for rewarding cooperators or for punishing defectors, where δ is the average amount of available incentives.
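The network and the per-group budget described above can be sketched in a few lines of Python. This is a minimal preferential-attachment generator written for illustration only, not the code used for the reported simulations. The function names, the fully connected seed core, and the parameter `m` (the number of links each new node brings, which is not specified in this section) are assumptions.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow an undirected scale-free graph by preferential attachment.

    Returns a dict mapping each node to the set of its neighbors.
    `m` is an assumed parameter: the number of links added per new node.
    """
    rng = random.Random(seed)
    # Assumed seed: a fully connected core of m + 1 nodes.
    neighbors = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    # Each node appears in this list once per link end, so a uniform
    # draw from it selects a node with probability proportional to degree.
    ends = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        neighbors[new] = set(targets)
        for t in targets:
            neighbors[t].add(new)
            ends.extend((t, new))
    return neighbors


def group_budget(k_x, delta):
    """Incentive pool I_x = (k_x + 1) * delta of the group centered on x."""
    return (k_x + 1) * delta
```

For example, `barabasi_albert(1000, 2, seed=0)` yields a network of the smallest size considered in the Results, and a hub with degree 50 at δ = 0.5 would manage a pool of `group_budget(50, 0.5)`, that is 25.5.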
When the incentives are used for rewarding, a cooperator y with degree $k_y$ that is a member of the group $G_x$ thus receives the payoff
$$P_{y,x} = \frac{r c n_c}{k_x + 1} - c + \frac{s_y k_y^{\alpha}}{\sum_{z \in G_x} s_z k_z^{\alpha}} I_x,$$
while a defector in the same group receives
$$P_{y,x} = \frac{r c n_c}{k_x + 1},$$
where α is the distribution strength. According to the definition of the payoffs, for α > 0 high-degree nodes obtain larger rewards than low-degree nodes, while for α < 0 low-degree nodes receive a larger share from the incentive pool. If the incentives are used for punishing defectors rather than rewarding cooperators, then a cooperator y with degree $k_y$ that is a member of the group $G_x$ receives
$$P_{y,x} = \frac{r c n_c}{k_x + 1} - c,$$
while a defector in the same group receives
$$P_{y,x} = \frac{r c n_c}{k_x + 1} - \frac{(1 - s_y) k_y^{\alpha}}{\sum_{z \in G_x} (1 - s_z) k_z^{\alpha}} I_x.$$
As for institutionalized rewarding, here too α > 0 implies that high-degree nodes are punished more strongly than low-degree nodes, and vice versa for α < 0.
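The payoff rules for a single group can be illustrated with the following Python sketch. It assumes that fines are shared among defectors with the same $k^{\alpha}$ weighting as rewards among cooperators, and that the pool simply remains unspent when a group contains no eligible recipient. The function name, the dictionary-based data layout, and the `mode` argument are hypothetical.

```python
def group_payoffs(members, center, strategy, degree, r, delta, alpha,
                  mode="reward", c=1.0):
    """Payoffs from one public goods game in the group centered on `center`.

    members  : nodes in the group (the center plus its neighbors)
    strategy : dict node -> 1 (cooperate) or 0 (defect)
    degree   : dict node -> degree k
    mode     : "reward" shares the pool among cooperators,
               "punish" charges it to defectors
    """
    size = degree[center] + 1
    n_c = sum(strategy[z] for z in members)
    pool = size * delta                       # I_x = (k_x + 1) * delta
    base = r * c * n_c / size                 # equal share of the public good
    target = 1 if mode == "reward" else 0     # strategy that receives the incentive
    weights = {z: degree[z] ** alpha for z in members if strategy[z] == target}
    total = sum(weights.values())
    sign = 1.0 if mode == "reward" else -1.0
    payoffs = {}
    for z in members:
        p = base - c * strategy[z]            # cooperators pay the cost c
        if z in weights:                      # total > 0 whenever weights is non-empty
            p += sign * pool * weights[z] / total
        payoffs[z] = p
    return payoffs
```

Summing these group payoffs over all groups in which a player participates gives that player's total payoff. Because the incentive term always adds up to the full pool within a group, changing α only changes who receives the incentives, not how much is spent.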
Each player x participates in $k_x + 1$ public goods games, which are staged in groups that are centered on player x itself and on its $k_x$ neighbors, respectively. The total payoff player x obtains is thus $P_x = \sum_{y \in G_x} P_{x,y}$. After playing the games, a player is allowed to learn from one of its randomly chosen neighbors y and update its strategy accordingly. The probability of strategy change is given by the Fermi function (Szabó and Tőke, 1998; Szabó and Fáth, 2007)
$$f = \frac{1}{1 + \exp[(P_x - P_y)/K]} \tag{1}$$
if we assume that absolute payoffs are considered. However, previous research has also emphasized the importance of degree-normalized payoffs (Masuda, 2007; Tomassini et al., 2007; Wu et al., 2007; Szolnoki et al., 2008; Maciejewski et al., 2014), in which case the probability of strategy change is
$$f = \frac{1}{1 + \exp[(P_x/(k_x + 1) - P_y/(k_y + 1))/K]}. \tag{2}$$
We consider both absolute (Equation 1) and degree-normalized (Equation 2) payoffs to be representative of the evolutionary fitness of individual players. Especially for institutional punishment, the solution of the considered optimization problem depends significantly on this difference. Without loss of generality we set the uncertainty in the strategy adoption process to K = 0.1 (Szolnoki et al., 2009), so that it is very likely that the better performing players will be imitated, although it is also possible that players will occasionally learn from those performing worse.
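The strategy update can be sketched as follows. The degree-normalized branch follows Equation 2, while the absolute-payoff branch assumes the same Fermi form applied to the raw payoffs. The function names, the data layout, and the overflow guard are illustrative choices rather than details taken from the simulations.

```python
import math
import random


def imitation_probability(p_x, p_y, k_x, k_y, K=0.1, normalized=False):
    """Probability that player x adopts the strategy of neighbor y."""
    if normalized:
        # Degree-normalized payoffs: divide by the number of games played.
        p_x, p_y = p_x / (k_x + 1), p_y / (k_y + 1)
    arg = (p_x - p_y) / K
    if arg > 700.0:
        # math.exp would overflow here; the probability is effectively zero.
        return 0.0
    return 1.0 / (1.0 + math.exp(arg))


def update_step(x, neighbors, strategy, total_payoff, rng, K=0.1,
                normalized=False):
    """Let x imitate one randomly chosen neighbor with the Fermi probability."""
    y = rng.choice(sorted(neighbors[x]))
    f = imitation_probability(total_payoff[x], total_payoff[y],
                              len(neighbors[x]), len(neighbors[y]),
                              K, normalized)
    if rng.random() < f:
        strategy[x] = strategy[y]
```

With K = 0.1 the rule is close to deterministic: a payoff advantage of 1 for the neighbor already gives an imitation probability above 0.9999, whereas equal payoffs give exactly 0.5.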