For whom will the Bayesian agents vote?

Within an agent-based model where moral classifications are socially learned, we ask if a population of agents behaves in a way that may be compared with conservative or liberal positions in the real political spectrum. We assume that agents first experience a formative period, in which they adjust their learning style acting as supervised Bayesian adaptive learners. The formative phase is followed by a period of social influence by reinforcement learning. By comparing data generated by the agents with data from a sample of 15000 Moral Foundation questionnaires we found the following. 1. The number of information exchanges in the formative phase correlates positively with statistics identifying liberals in the social influence phase. This is consistent with recent evidence that connects the dopamine receptor D4-7R gene, political orientation and early age social clique size. 2. The learning algorithms that result from the formative phase vary in the way they treat novelty and corroborative information, with more conservative-like agents treating them more equally than liberal-like agents. This is consistent with the correlation between political affiliation and the Openness personality trait reported in the literature. 3. Under the increase of a model parameter interpreted as an external pressure, the statistics of liberal agents more closely resemble those of conservative agents, consistent with reports on the consequences of external threats on measures of conservatism. We also show that in the social influence phase liberal-like agents readapt much faster than conservative-like agents when subjected to changes in the relevant set of moral issues. This suggests a verifiable dynamical criterion for attaching liberal or conservative labels to groups.


Introduction
A central controversy in moral psychology and sociology deals with understanding the variety of moral values and whether adherence to one set or another has a genetic origin or arises from social interactions.
Political affiliation has been associated with social interaction, with genetics and with the combination of both (e.g. [1][2][3][4][5]). We address questions about early age socialization, cognitive styles and political orientation within a Moral Foundation theory (MFT) perspective using agent-based modelling and techniques from information theory. The present work is culturally situated within the fields of sociophysics [6,7] and computational social sciences [8][9][10] and is a companion to our previous work [11][12][13].
In a series of papers Haidt and coworkers [14][15][16][17][18][19][20] have described MFT, an empirically driven theory dealing with the foundations of moral psychology. It aims to understand statistically significant differences in moral valuations of social issues and their association to coordinates of the political spectrum. The core tenet of the theory is that moral issues, which are valued mostly in an intuitive manner, can be parsed into a number of discrete dimensions, at least five, possibly six or even more. According to Kohlberg [21,22] and Gilligan [23] dimensions representing care/harm and fairness/cheating should be enough to span the space of moral issues. Shweder et al [24] argued that the dimensions should be three instead.
The MFT states that dimensions representing loyalty/betrayal, authority/subversion and sanctity/degradation should also be included in the moral space. The care/harm and fairness/cheating dimensions are statistically more important for liberals than the rest, and each dimension of the entire set is of similar importance for conservatives. Culture wars would be a consequence of these differences.
Consideration of other political cultures, such as libertarians, leads to yet other dimensions, such as liberty/oppression [25]. Political affiliation is also correlated with some characteristics of the Big Five personality traits. Openness and liberal values appear together frequently while Conscientiousness and conservatism are positively associated (e.g. [26]). Further associations between cognitive learning styles and political affiliation have been suggested by EEG experiments [27].
In constructing the Motivated Social-Cognitive perspective Jost et al. [28,29] make the assumption "that conservative ideologies - like virtually all other belief systems - are adopted in part because they satisfy some psychological needs". We have also followed in our previous work [12,13] a motivation-driven approach with a totally different methodology: studying mathematically the dynamics of agent-based models using information theory. We considered the discomfort associated with disagreement [30], and the motivating pressure was to reduce the pain associated with social exclusion. This was implemented by a learning dynamics designed to maximize a utility function or, equivalently, minimize an energy-like function. Haslam [31] correctly argues that not all social figuring is or should be a matter of cost/benefit calculation. In a third person description, within a mathematical language, we calculate, but the social agent does not calculate, it just acts.
In our previous approach we characterized, in a simplified society of agents, the effects of different learning styles on the statistics of their opinions about a set of issues. We will call the data obtained by simulation of the agents the artificial data set. The analytical and numerical results were compared to data gathered by the Moral Foundation Questionnaire project of Haidt and collaborators [32], to which we will refer as the empirical data set. Agents learning with an algorithm that treated new and corroborative information in the same way exhibited (a) less dispersion of opinions, (b) longer times to readapt under changes of the issues under discussion and (c) histograms of opinions very similar to those of self-declared conservatives in the empirical data set. On the other end of the spectrum of cognitive styles, agents that could be thought to score higher in an Openness personality trait, since they gave more importance to new data than to corroborative data, (a) showed greater dispersion of opinions, (b) readapted faster after changes of the issues and (c) were statistically similar to self-declared liberals.
Note that we avoided the difficult task of theoretically predefining conservative or liberal. We just took a pragmatic route, comparing the results of our model with empirical data where subjects had declared their belief about their positions in the political spectrum. In other words, a society of agents is classified as conservative or liberal by the proximity of its statistical signatures to those obtained from the Moral Foundation Questionnaires of groups who declared themselves to be of a certain political affiliation.
In this paper we address the following question: why are different cognitive strategies present in the population? Distal causes could include the advantages to societies of a more cohesive set of values due to conservatives and of shorter readaptation times due to liberals. If we ask for more proximate causes, genetics or heterogeneous social interactions are possible explanations. A discussion by Smith et al [33] illustrates the long path between genetics and opinions about specific issues, including four intermediate levels: biological, cognitive/information processing, personality/values and ideology, with the environment influencing each one.
Fowler and collaborators presented evidence for interactions between genetics and politics. In [34] they link the DRD2 dopamine receptor to partisanship heritability. More relevant to our present study is their analysis of data [35] from the National Longitudinal Adolescent Health study, indicating that a certain allele (the long allele with 7 repeats) of the dopamine receptor gene DRD4 may have just that kind of influence. For those having two copies of the allele, the number of friends during early age conditions the probability of their self-declared political affiliation as an adult. The direction is such that those who had a larger number of friends have a larger probability of being liberal as adults.
Here we aim at explaining the diversity in moral valuation within our agent-based framework by adopting an information theory point of view. In particular, we consider an artificial society composed of interacting Bayesian information processing agents. Each agent has a set of social neighbors and exchanges information in the form of opinions about issues. Learning means that when the information brought by the opinion of a social neighbor arrives, there are certain changes in the weights attributed to each moral dimension.
The main results about the learning process following from this approach are two. First, that the learning algorithm is not static but adaptive. It depends on the number of opinions to which an agent has been exposed in social interactions. Second, that for different numbers of such opinions, the difference of the ensuing learning algorithms can be described by the different modulation given to opinions that carry novelty of information relative to opinions that carry corroborative information. The agents of our model are Bayesian during an early window of time we call the formative phase.
Each 'young' agent is exposed to a random number of social information exchanges. At the end of the formative phase the learning algorithm stops evolving and agents enter the social influence phase. Agents, each with its particular fixed learning algorithm determined by the random socialization in the formative phase, exchange information about a set of issues and continue learning. After a time at which a steady state has been achieved, we collect statistical information about the state of the society in the form of histograms of opinions (the artificial data set, ADS). A similar set of statistics can be extracted from the set of questions about moral issues (the empirical data set, EDS) collected by Haidt and collaborators from the Moral Foundation project [32], as done in [12,13]. Numerical comparisons of the statistics permit identifying a class of agents with a group of respondents with a given declared political affiliation. The conclusion is that the number of opinion exchanges in the formative phase is correlated with the political affiliation of the corresponding group of respondents. Agents with a large number of opinion exchanges in the formative phase are identified with liberals after the social influence phase; those with a small number are identified with conservatives.
In section 2 and appendices we present the mathematical aspects of the theory, first the Bayesian algorithm of learning that evolves during the formative phase, then the description of the social influence phase where agents interact. The rest of the paper has a descriptive approach where no mathematical formalism is used. In section 3 we present the results and describe the comparison to the data obtained from the Moral Foundation questionnaires. We end this paper with a discussion of the results, the limitations of the theory and possible extensions.

Formative phase
Here we describe within a Bayesian framework the way agents process information. We suppose that issues are parsed into a set of five numbers. An issue labeled $\mu$ is represented by $x_\mu = \{x_{a,\mu}\}_{a=1,\dots,5}$, each $x_{a,\mu}$ describing the bearing of its content on a moral dimension. Agents emit opinions in a fast, automatic, intuitive manner independently of intricate if-then rules. In the model this is done by summing over the five dimensions the content of each moral dimension of the issue, weighted by the importance the agent attributes to each foundation. The moral state of agent $i$ at time $t$, called the moral matrix in MFT, is also a vector $\omega_i(t) = \{\omega_{a,i}(t)\}_{a=1,\dots,5}$. The opinion of agent $i$ about issue $\mu$ is $h_{i,\mu} = \sum_{a=1}^{5} \omega_{a,i} x_{a,\mu}$ and its sign $\sigma_{i,\mu} = \mathrm{sign}(h_{i,\mu})$ shows whether an agent is for or against an issue.
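As a minimal sketch of the opinion rule just described, the following computes the opinion h as the weighted sum over the five moral dimensions and its sign. The function name `opinion`, the example numbers, and the convention of mapping h = 0 to +1 are illustrative assumptions, not part of the model specification.

```python
def opinion(weights, issue):
    """Return (h, sigma): h = sum_a w_a * x_a and sigma = sign(h)."""
    if len(weights) != len(issue):
        raise ValueError("weights and issue must have the same dimension")
    h = sum(w * x for w, x in zip(weights, issue))
    # Assumption: a tie at h == 0 is reported as +1.
    sigma = 1 if h >= 0 else -1
    return h, sigma


# Hypothetical moral vector and issue, five dimensions each.
omega = [0.6, 0.5, 0.1, 0.1, 0.2]
x = [1.0, 0.0, -1.0, 0.0, 0.5]
h, sigma = opinion(omega, x)
```

Here the agent is in favor of the issue (sigma = +1) because the dimensions it weighs most heavily contribute positively.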
During a social encounter in the formative phase an agent $i$ receives information $y_\mu = (\sigma_{j,\mu}, x_\mu)$ emitted by the social partner $j$. Learning occurs in order to decrease disagreement over issues. Within this learning scenario, we hypothesize that evolutionary pressures to improve the prediction of the opinions of others would select learning algorithms near Bayesian optimality (see [36]). As shown in Appendix A, the resulting learning algorithm, which approximates a full Bayesian use of the available information, can be described in two different ways. One is as a motivational algorithm where a cost or energy-like function $E$ is decreased by the changes elicited by learning. The other is as a modulated Hebbian learning algorithm whose central concept, the modulation function $F_{mod}$ (figure 1), is a measure of the importance attributed to a given issue and the opinion of the interlocutor. In terms of the moral matrix $\hat\omega_a(t)$ and a measure of the full social experience $C(t)$, both ways are:

$$\hat\omega_a(t+1) = \hat\omega_a(t) - C(t) \frac{\partial E_\mu}{\partial \hat\omega_a} \qquad (1)$$

$$\hat\omega_a(t+1) = \hat\omega_a(t) + F_{mod}(z_\mu, C)\, \sigma_{j,\mu+1}\, x_{a,\mu+1} \qquad (2)$$

The modulation function and the cost are related by $F_{mod}(z, C) = -C(t)\, \partial E_\mu / \partial z$, where $z_\mu = \sigma_{j,\mu+1} h_{i,\mu}$ measures the concurrence/disagreement between agent $i$, the receiving agent, and agent $j$, the opinion-emitting agent. $C(t)$ is related to the width of the posterior distribution and decreases as learning occurs.
We also use $\rho(t) = 1/\sqrt{1 + C(t)^2}$, a convenient variable since it takes values between zero and one. It is close to zero when an agent has had a small number of social encounters and approaches one as the number increases. Hence the modulation function and the cost are functions of $z$ and $\rho$.
The main results of this paper derive from two facts about the modulation function of the Bayesian algorithm: (1) it is not the same throughout the learning period, changing as more information is incorporated, and thus depends on the number of social encounters; and (2) it depends on the novelty that the opinion of the social partner carries. These two aspects are clear in figure 1-Right, where the modulation function is plotted as a function of $z = h_i \sigma_j$ for different fixed values of $\rho$, which measures the number of social interactions. Note that $z = |h_i| \sigma_i \sigma_j$ combines the strength $|h_i|$ of the opinion held by $i$ with the product $\sigma_i \sigma_j$, which is positive if the opinion $\sigma_i$ of agent $i$ prior to learning is the same as that of agent $j$, so that the information is corroborative, and negative if the opinions are opposite, so that $z < 0$ and the arriving information is considered a novelty.
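The distinction between corroborative and novel information can be illustrated with a short sketch that computes z from the receiving agent's opinion and the sign emitted by the partner. The function name `concurrence` and the label "neutral" for the boundary case z = 0 are illustrative assumptions.

```python
def concurrence(h_i, sigma_j):
    """Return (z, kind) with z = h_i * sigma_j.

    z > 0: the partner's opinion agrees with the agent's (corroborative).
    z < 0: the partner's opinion is opposite (novelty).
    """
    if sigma_j not in (-1, 1):
        raise ValueError("sigma_j must be +1 or -1")
    z = h_i * sigma_j
    if z > 0:
        kind = "corroborative"
    elif z < 0:
        kind = "novelty"
    else:
        kind = "neutral"  # assumption: the text does not discuss z == 0
    return z, kind


# An agent mildly in favor (h_i = 0.3) meets a partner who is against.
z, kind = concurrence(0.3, -1)
```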

Social influence phase
We consider the number of information exchanges or socialization events in the formative phase as a random number, not the same for all agents, and thus the effective $\rho$ for each agent is a number between zero and one. The agents in the formative phase learned to learn and now they just learn from each other with a frozen modulation function. The validity of this supposition as something that represents the development of adolescents has to be investigated in an independent way. It loosely rings with Piagetian overtones [37]. We also consider the fact that people tend to interact with those like themselves [38]. So we consider, as a nonessential simplification, a system of agents all with the same $\rho$, each one at a site of a social lattice, exchanging information, and then investigate the effect of changing $\rho$. The dynamics of information exchange is analogous to that considered in [11][12][13], the only difference being that the learning occurs with the Bayesian algorithm described above.
We suppose that a society discusses a set of $P$ issues. Parsing of an issue into a vector might be subjective, expressed by the fact that agent $i$ obtains a vector $x_i$. Exchange of information between agents is about the average vector, which we suppose reasonable to be independent of the agent, since fluctuations due to subjective parsing, if unbiased, tend to cancel out. We call $Z$ the Zeitgeist vector since it captures the contributions of all issues that are currently being discussed by the model society. Without any loss of generality it will be normalized to unit length. The opinion of agent $k$ about the Zeitgeist is $h_k = \sum_{a=1}^{5} \omega_{a,k} Z_a$ and its sign is denoted by $\sigma_k = \mathrm{sign}(h_k)$. We now consider a Metropolis-like stochastic dynamics of information exchange. The conjugate parameter $\beta$ determines the scale of tolerance to fluctuations in the cost $E$, that is, it determines how important it is to conform to the opinions of other agents and eventually sets the scale of fluctuations of an agent's moral vector around the Zeitgeist.

Simulation
The artificial data is generated by the following procedure. We suppose that agents are characterized by a learning algorithm parametrized by ρ depending on the number of social interactions they experienced during the formative phase (see appendices A and B for details). We also suppose that agents only interact with counterparts holding equal ρ. We choose a random social undirected graph from an ensemble here taken to be generated by a Barabasi-Albert model with N = 400 and m = 10. Our results are not strongly dependent on the details of the social graph topology [13].
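For readers who want to reproduce a social graph of this kind, the sketch below builds an undirected preferential-attachment graph with the standard library only. It follows one common construction of the Barabási-Albert model (the first new node links to all m seed nodes, later nodes pick m distinct targets with probability proportional to degree). The source does not specify these generator details, so the function `barabasi_albert` and its choices are assumptions.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Return the edge set {(i, j), i < j} of a preferential-attachment graph."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = set()
    targets = list(range(m))  # the first new node attaches to the m seed nodes
    repeated = []             # each node appears once per incident edge
    for new in range(m, n):
        for t in targets:
            edges.add((t, new))  # t < new always holds here
        repeated.extend(targets)
        repeated.extend([new] * m)
        # Draw m distinct targets, each with probability proportional to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges


# Parameters quoted in the text: N = 400 agents, m = 10 links per new agent.
social_edges = barabasi_albert(400, 10, seed=1)
```

With this construction every node added after the seeds contributes exactly m edges, so the graph has m(N - m) edges in total.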
Agents start the social influence phase with moral weights that are represented by unit vectors $\omega_i$ with random positive overlaps with a fixed Zeitgeist vector $Z$. The social influence dynamics is implemented as a Markov Chain Monte Carlo process as follows. At each step an edge $ij$ of the social graph is chosen uniformly at random. One of its vertices (let it be $i$) is then marked as the influenced agent with probability 1/2. The influenced agent chooses a random unit vector as a candidate and changes its moral vector to it according to a Metropolis-like acceptance rule. Note that the agent has complete access to its own opinion $h_i$, but only knows the sign $\sigma_j$ of the influencer's opinion. Observe also that the pressure parameter $\beta$ regulates the acceptance rate in the transition.
High pressure β makes moral representation changes more difficult.
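A single influence step can be sketched as follows. The acceptance rule used here is the textbook Metropolis rule, accepting a proposal with probability min(1, exp(-β ΔE)); the exact rule and cost of the model are not reproduced in this section, so both the rule and the `cost` argument (a caller-supplied function of z = h σ_j) are assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def random_unit_vector(dim, rng):
    """Uniformly distributed direction, from normalized Gaussian components."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


def metropolis_step(omega_i, sigma_j, zeitgeist, beta, cost, rng):
    """Propose a new moral vector for agent i; return (vector, accepted)."""
    proposal = random_unit_vector(len(omega_i), rng)
    z_old = dot(omega_i, zeitgeist) * sigma_j
    z_new = dot(proposal, zeitgeist) * sigma_j
    delta = cost(z_new) - cost(z_old)
    # Always accept a decrease in cost; accept an increase with
    # probability exp(-beta * delta), so large beta suppresses changes.
    if delta <= 0 or rng.random() < math.exp(-beta * delta):
        return proposal, True
    return omega_i, False


rng = random.Random(0)
Z = [1.0, 0.0, 0.0, 0.0, 0.0]
linear_cost = lambda z: -z  # hypothetical cost: disagreement is penalized
free_move, accepted_free = metropolis_step(Z, 1, Z, 0.0, linear_cost, rng)
stuck, accepted_stuck = metropolis_step(Z, 1, Z, 1e12, linear_cost, rng)
```

In the example the agent starts perfectly aligned with the Zeitgeist, so any proposal increases the cost: it is accepted when β = 0 and rejected when β is very large.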
Data are collected after the system reaches equilibrium. We typically wait $T_{term} = 6 \times 10^4 N$ interactions before gathering uncorrelated samples of the time-averaged opinions $h_i$ that are used to build the histograms depicted in Figure 2. To guarantee that samples are uncorrelated we calculate autocorrelation times $\tau$ and then select $T_{term}/\tau$ properly spaced samples. The whole procedure is repeated $n$ times until 500 independent samples are drawn ($n = 4$ being the minimum for the data we report).

Confrontation between artificial and empirical data
A society of agents is characterized by the values of ρ, measuring the effective socialization in the formative phase, and of β that sets the pressure on the society during the social influence phase. While in a society different agents with different ρ's and feeling different β's will interact, it is a reasonable first approximation to consider that people will more likely interact in a meaningful manner with those that are more similar.
In a steady state of a society of agents, changes in the moral matrices still occur, but the distribution $P_{ADS}(h|\rho, \beta)$ of opinions about the Zeitgeist is stable in time. From 15000 MFT questionnaires (see [13] for a description of this data set) we obtained the data [32] and, from them, the empirical distributions of opinions for groups of each self-declared political affiliation. A distance between the two distributions is measured by summing the quadratic difference over a set of bins of $h$. If the smallest such distance is larger than a threshold value of identification (e.g. 0.1), then the point $(\rho, \beta)$ is not identified with any political affiliation.
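The identification step can be sketched in a few lines: build normalized histograms, sum the squared bin differences, and assign the closest affiliation unless the distance exceeds the threshold. The function names and the reference histograms below are made up for illustration and are not taken from the empirical data set.

```python
def histogram(samples, edges):
    """Normalized histogram over bins [edges[k], edges[k+1])."""
    counts = [0] * (len(edges) - 1)
    for s in samples:
        for k in range(len(counts)):
            if edges[k] <= s < edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def quadratic_distance(p, q):
    """Sum over bins of the squared difference between two histograms."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def identify(p_ads, references, threshold=0.1):
    """Return (label, distance); label is None above the threshold."""
    label, dist = min(
        ((name, quadratic_distance(p_ads, q)) for name, q in references.items()),
        key=lambda pair: pair[1],
    )
    return (label if dist <= threshold else None), dist


# Hypothetical four-bin reference histograms, for illustration only.
references = {
    "liberal": [0.1, 0.2, 0.4, 0.3],
    "conservative": [0.0, 0.1, 0.3, 0.6],
}
label, dist = identify([0.0, 0.1, 0.35, 0.55], references)
unmatched, far = identify([1.0, 0.0, 0.0, 0.0], references)
```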

Learning dynamics
We started with Bayesian learning and obtained two equivalent descriptions of the learning dynamics describing changes in the weights of the moral dimensions. The dynamics described in equation 1 can be seen to be a gradient descent: changes of the weights are in the direction of decreasing a quantity E that can be interpreted as an energy, a cost or a pain.
We claim that this motivational (or utilitarian) form of learning can be useful to understand better what is occurring. Then for each example the change occurs in the direction which tends to reduce the error of classification, to increase conformism or to reduce pain derived from disagreement. But it is just a mathematical fact that may go along uninterpreted and be described just as a Bayesian inspired learning. We can describe the falling rock as moving along a trajectory that decreases potential energy.
It is not the rock that is being utilitarian or motivated to reduce an energy, but it is our description using energy that seems utilitarian. The motivation lies in our third person description.

The modulation function
By using the idea of the modulation function we described (eq. 2) the same learning dynamics differently.
The modulation function measures the importance of the information carried by the example. It could be thought of, in a loose way, as representing the signal from something like an amygdala, which would signal more strongly in case the example causes surprise due to the novelty of an unexpected result.
In addition to measuring surprise, the modulation function depends on $\rho(t) = 1/\sqrt{1 + C(t)^2}$. What is striking about a $\rho$-dependent modulation is that in a static scenario and for an agent with only one social partner we can prove [40] that $\rho$ increases with the number of information exchanges, and this still holds numerically when learning from several correlated social partners.
We now analyze the case shown in Figure 1-Left for noiseless communication. At the beginning of the learning process the modulation function is flat. Every piece of information, every example receives the same modulation. Being right or wrong is of little consequence in the manner in which the information is incorporated. As learning occurs, from the information exchange with social partners, the modulation function decreases for positive z and increases for negative z. Examples that carry new information start getting a higher modulation. Those that were predicted correctly, are less effective in fostering changes in the weights of the moral vector. Examples carrying new information make larger impacts, those that corroborate the opinion of the agent, have a smaller influence. As ρ increases this effect is amplified.
In This increases with the value of the noise level and with ρ.
To sum up, the modulation function has three characteristics, which we list in decreasing order of importance. The modulation function depends on
1. Novelty/Corroboration: a measure of whether the example carries new information ($z < 0$) or is corroborative ($z > 0$),
2. Socialization in the formative phase: a measure of the number of information exchanges ($\rho$),
3. Trust/distrust: a measure of the reliability attributed to the social partners. Given noise in the communication, if $z$ is too negative, the example is not considered new information but rather it is distrusted and its effect is small.
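These three dependences can be explored numerically with a small sketch. It assumes a specific cost, $E(z) = -\ln[\epsilon + (1 - 2\epsilon)\Phi(z/\sqrt{C})]$, which is what a Gaussian posterior with scalar variance $C$ gives when a fraction $\epsilon$ of opinions is inverted; this explicit form, the symbol `eps`, and the function names are assumptions and should be checked against Appendix A. The modulation is then obtained from $F_{mod} = -C\, \partial E / \partial z$.

```python
import math


def gauss_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def gauss_cdf(t):
    # erfc keeps accuracy in the lower tail.
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def cost(z, C, eps=0.0):
    """Assumed cost E(z) = -ln[eps + (1 - 2 eps) Phi(z / sqrt(C))]."""
    return -math.log(eps + (1.0 - 2.0 * eps) * gauss_cdf(z / math.sqrt(C)))


def f_mod(z, C, eps=0.0):
    """Modulation F_mod = -C dE/dz for the assumed cost."""
    s = math.sqrt(C)
    evidence = eps + (1.0 - 2.0 * eps) * gauss_cdf(z / s)
    return C * (1.0 - 2.0 * eps) * gauss_pdf(z / s) / (s * evidence)


# Small C (much social experience, large rho): novelty dominates.
experienced = f_mod(-1.0, 0.25) / f_mod(1.0, 0.25)
# Large C (little experience, small rho): the modulation is nearly flat.
naive = f_mod(-1.0, 100.0) / f_mod(1.0, 100.0)
# With noise, a strongly contradicting opinion is distrusted.
distrusted = f_mod(-3.0, 0.25, eps=0.1)
surprising = f_mod(-0.5, 0.25, eps=0.1)
```

Under these assumptions the ratio between the modulation of a novel and of a corroborating example grows as $C$ decreases, and with noise the modulation falls back toward zero for very negative $z$.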
We have analyzed the simple dynamics where the covariance is represented by a single parameter $C$ or equivalently $\rho$ (see appendices for details). This is probably a good approximation but it is reasonable to assume that the dimensions may be interdependent. For example caring for a member of the group may be larger than for a member of another group, also cheating an authority figure may be different than group.
This is similar to the methodology we used in [12] and [13]. This results in the identification, for fixed $\beta$, of the measure $\rho$ of the formative phase with the self-declared political affiliation of the respondents of the questionnaires. This is done for several values of $\beta$ and the result is shown in Figures 2 and 3. It is clear that the populations of agents with a small value of $\rho$, or a small number of social information exchanges, are close to conservatives and those populations with high $\rho$, or a large number of social information exchanges, are more likely to be identified with liberals. Note that this is not a one-to-one identification. We are not saying that a given agent's value of $\rho$ determines political affiliation, but rather that this subset of the population will have a distribution of opinions consistent with such identification.

The phase diagram

Readaptation times
What is it that conservatives conserve? If a society of agents identified with conservatives (low ρ) were to readapt after changes faster than one identified with liberals, our theory would have to be thrown away.
But it is a result of our theory that liberal-like societies readapt faster than conservative-like ones.
Several approximately equivalent measures of readaptation times can be defined; we have looked at two such measures and obtained similar results. After a steady state is achieved and the steady state distribution $P_{ADS}(h|\rho, \beta)$ is measured, the Zeitgeist $Z_{old}$ is changed to a new Zeitgeist $Z_{new}$. Call this time $t = 0$. After each sweep of information exchanges over all the agents, $t$ increases by one unit and the distribution $P_t(h)$ of opinions about the new Zeitgeist is measured.
A distance $D(t)$ between the two distributions is measured by summing the quadratic difference over a set of bins. As usual the relaxation is exponential, so we parametrize $D(t) = D_0 e^{-t/T}$ in terms of the adaptation time $T$, which depends on $\rho$ and $\beta$. For more about this measure see [13].
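Given a measured series of distances, the adaptation time can be estimated by a straight-line fit to ln D(t). The sketch below uses ordinary least squares on the logarithm; this fitting choice and the function name `fit_relaxation_time` are assumptions, and the data in the example are synthetic.

```python
import math


def fit_relaxation_time(times, distances):
    """Fit ln D(t) = ln D0 - t / T by least squares; return (D0, T)."""
    points = [(t, math.log(d)) for t, d in zip(times, distances) if d > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive distances")
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic relaxation curve with D0 = 0.8 and T = 12.5 sweeps.
sweeps = list(range(50))
curve = [0.8 * math.exp(-t / 12.5) for t in sweeps]
d0_fit, t_fit = fit_relaxation_time(sweeps, curve)
```

On real simulation output the late-time points, where D(t) is dominated by sampling noise, would typically be excluded before fitting.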

Threats: Conservative shift under increase of pressure
The pressure parameter $\beta$ determines how important it is to conform to the opinions of others. A more detailed modelling of the agents could distinguish between informational and normative peer pressure. We can model the effect of an external event that threatens the group to which the agent belongs by considering that the pressure $\beta$ increases. The effect of the threat on the political affiliation of the agents, shown in Figure 5-Left, is that the population will shift towards the conservative end of the spectrum.
We supposed a fixed distribution of $\rho$, the measure of the number of social information exchanges, and looked at the distribution of political affiliations before and after a threat which increases the peer pressure $\beta$. Our model also predicts that under a perceived decrease of an external threat the population will shift towards the more liberal region.
We have defined the effective number of moral dimensions of a group with a given political affiliation. This is done by averaging the weights over all members of the population and multiplying by the number of moral dimensions $d_m = 5$. For groups of agents that are identified with conservatives, the effective moral dimension is near 4.8. For those identified with liberals it is near 3.5. Both increase under increase of the peer pressure parameter $\beta$, as shown in Figure 5-Right. This is in qualitative agreement with experiments reported in [41][42][43] and further work in [44].

Discussion and Conclusions
The main characteristic of Entropic inference and the Bayesian approach to information theory is that the mathematical structure to represent beliefs in the absence of complete information [45,46], if manifest inconsistency is to be avoided, is probability theory. As presented in section 2 and in appendices A and B a Bayesian study of the learning dynamics of moral classifications can be described as the changes in the weight parameters for each dimension that lead to a decrease in the cost, interpreted as psychological discomfort, caused by differences of opinions.
The main result we presented here is that the cognitive style of the Bayesian agent depends on the complexity of the social interactions in the formative phase, and that cognitive style induces a statistical association with political affiliation. The formative phase mimics the pre-adolescent phase in the life of an agent and the social influence phase mimics post-adolescence. During the social influence phase the agent's cognitive style is crystallized, so that it ceases to change, although the agent is still capable of learning. It then follows that, when the agents are statistically identified with respondents of the MFTQ, the social complexity of the formative phase is positively correlated with liberalism. This is exciting since in Settle et al. [35] the number of childhood friends is positively correlated with liberalism, at least for those that have two alleles of the DRD4-R7 gene. They cautiously refrain from claiming that a gene for political ideology was identified and just claim that the evidence points to a gene-environment interaction.
Within the context of [35] what is the genetic interpretation of our results? Our methods do not address this problem. Genetically, having two long R7 DRD4 alleles may contribute to making the number of friends a proxy for social complexity in the formative phase. But some other genes may contribute to Openness, with influence on the number of friends, thereby influencing the cognitive style with respect to the differences between learning from novelty and from corroboration. Our approach does not address this mechanism nor those by which other phenotypes become conservative or liberal. What we say is that Bayesian optimal learning predicts that the number of social interactions in the formative phase will correlate with liberalism in the social influence phase. But why should agents be Bayesian optimal? An answer can be given based on the results of, first, [40], where the functional optimization of the learning process was obtained; second, [47], where a related algorithm was shown to be the online version of the Bayes algorithm; and third, [36], where, using evolutionary programming, the authors showed that perceptrons evolving under pressure for having larger generalization ability were driven to learning algorithms that resembled the Bayesian optimal ones.
Another prediction of the model is that under an increase of $\beta$, the peer pressure, a society as a group will tend to seem statistically more conservative, as shown in Figure 5. This effect of peer pressure increase might be behind the results of Bonanno and Jost [41] and Nail et al [42] about the increased conservatism of subjects that were exposed to the 9/11 attack. However Nail et al [43] show that there is no need for social interaction in order to become more conservative, suggesting that our interpretation of $\beta$ as peer pressure could be extended to a self-regulated parameter that is adjusted dynamically from information about social context.
An empirical definition and consequent measure of pressure might be done following the methodology of [48] where nations were classified on a tight/loose scale. Analysis of morality data sets for individual countries could point out if our pressure and their tight/loose scale are related. Since we use only USA citizens questionnaires, we are not able to address this question here, leaving the issue for a forthcoming paper.
An important characteristic of our model is that it is semantically free. Just or loyal, in the mathematical space where the agents are defined, are concepts devoid of meaning. We believe that this aspect has to be addressed from an evolutionary perspective in order to understand the emergence of the dimensions and hence provide our mathematical backbone with a semantic dressing.

A The model and methods
A short description of the learning theory is presented below and in the following appendices.
Each agent is endowed with a learning system and a set of weights. They exchange information, learning and teaching at different instances, about a set of issues, represented each by a set of numbers.
We model social encounters in which agent $i$ receives information $y_\mu = (\sigma_{j,\mu}, x_\mu)$ emitted by the social partner $j$. Since the length of the vector $x_\mu$ does not alter the opinions $\sigma$, we take all issues to have unit length.
To take into account our limited access to information we have to use a probabilistic framework. Let P (ω|D µ ) describe our knowledge of the vector of moral dimensions ω conditional on the information the agent received until now D µ , composed of all the pairs up to time µ. Now a new pair y µ+1 is received and the probability of having a particular moral dimension ω changes. That is the essence of learning.
The basic relation of inference is drawn from Bayes theorem. If $P(\omega|D_\mu)$ is the probability posterior to the consideration of the data set $D_\mu$ and prior to the inclusion of the information contained in the pair $y_{\mu+1}$, the basic assumption in Bayesian learning is to use the old posterior $P(\omega|D_\mu)$ as the new prior.
For simplicity we consider an approximation where the probability distributions are multivariate Gaussians. This family can be described by two objects: a mean vector $\hat\omega$ and a covariance matrix $C$. The dynamics of learning can then be written simply by giving the changes in these two quantities due to the incorporation of the information in the example $y_{\mu+1}$. After some manipulations (see B below and [47]), the learning dynamics of agent $i$ is described in terms of the components of $\hat\omega$ and $C$ and of a quantity $E_\mu$ that can be called the learning energy, cost or pain. In these expressions $h_\mu = \sum_a \hat\omega_{a,\mu} x_{a,\mu+1}$ is the opinion of the agent about issue $x_{\mu+1}$ before receiving the opinion of the social partner. The average, represented by angular brackets, is over the Gaussian variable $u$ with zero mean and covariance $C_{ab,\mu}$. Note that the average of the likelihood over $u$ is also called the evidence. It is through the likelihood that the information enters about how an issue and a moral vector give rise to an opinion, and about the noise process that corrupts the communication.
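For orientation, the updates referred to above can be sketched in the form they take in the online Gaussian approximation of [47]. This is a reconstruction under stated assumptions rather than a transcription of the equations of the main text: it assumes that $E_\mu$ is minus the logarithm of the evidence, and the index placement is an assumption as well.

```latex
\begin{align}
E_\mu &= -\ln \big\langle P(y_{\mu+1} \mid \hat\omega_\mu + u) \big\rangle_u , \\
\hat\omega_{a,\mu+1} &= \hat\omega_{a,\mu}
   - \sum_b C_{ab,\mu}\, \frac{\partial E_\mu}{\partial \hat\omega_{b,\mu}} , \\
C_{ab,\mu+1} &= C_{ab,\mu}
   - \sum_{c,d} C_{ac,\mu}\,
     \frac{\partial^2 E_\mu}{\partial \hat\omega_{c,\mu}\,\partial \hat\omega_{d,\mu}}\,
     C_{db,\mu} .
\end{align}
```

Read this way, a steep learning energy produces a large change in the mean, which is why $E_\mu$ can be interpreted as a cost or pain.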

A.1 Bayesian learning dynamics in the formative phase
Different types of noise can enter the communication process. Here we consider the case of multiplicative noise, where a fraction of the opinions are inverted. The learning potential can then be written in terms of $z = \sigma_{j,\mu+1} h_{i,\mu}$ and $\Phi$, the cumulative distribution of the Gaussian $N(0, 1)$. To simplify the interpretation of the results, at the expense of a small degradation in the performance of the learning algorithm, we consider the case where the covariance matrix has the form $C_\mu \mathbb{1}$, an overall scalar factor $C_\mu$ times a unit matrix. In this approximation $x_{\mu+1}^T (C_\mu \mathbb{1})\, x_{\mu+1} = C_\mu$, and the dynamics simplifies accordingly. This dynamics and its variations for other learning scenarios have been extensively analyzed in [40,47,49-56]. We now make some comments that are relevant for our present purposes.
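To make the role of the noise concrete, the following is a minimal sketch of one learning potential of this kind. It assumes a form that is common for a perceptron whose labels are inverted with probability `eps`, namely $E(z) = -\ln[\epsilon + (1 - 2\epsilon)\Phi(z/\sqrt{C})]$ for unit-length issues and isotropic covariance. The function names and this exact expression are assumptions for illustration and are not necessarily the expression of the main text.

```python
import math


def Phi(t):
    """Cumulative distribution of the standard Gaussian N(0, 1)."""
    # erfc keeps precision in the far negative tail, unlike 1 + erf.
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def learning_energy(z, C, eps):
    """Assumed potential E(z) = -ln[eps + (1 - 2 eps) Phi(z / sqrt(C))].

    z   : sigma * h, positive for agreement, negative for disagreement
    C   : scalar covariance factor (isotropic approximation)
    eps : assumed fraction of inverted opinions, 0 <= eps < 1/2
    """
    return -math.log(eps + (1.0 - 2.0 * eps) * Phi(z / math.sqrt(C)))


if __name__ == "__main__":
    # Compare the noiseless case (eps = 0) with a noisy one (eps = 0.1).
    for z in (-8.0, -1.0, 0.0, 1.0):
        print(z, learning_energy(z, 1.0, 0.0), learning_energy(z, 1.0, 0.1))
```

Under this assumed form, with `eps = 0` the cost of a strong disagreement grows without bound, whereas with `eps > 0` it saturates at $-\ln\epsilon$, so that a sufficiently strong disagreement is attributed to noise.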

A.2 The learning algorithm
We introduce the modulation function (Figure 1) $F_{mod}(z) = -C_\mu\, \partial E_\mu / \partial z$ and rewrite the dynamics in terms of it. Learning is now seen as a modulated Hebbian learning, where changes in the weights are made in the direction of the vector $x_{\mu+1}$ if the social partner's opinion $\sigma_{\mu+1}$ about it is positive, and in the opposite direction if the opinion is negative. In Figure 1 in the main text, the modulation function is plotted as a function of $z$. Note that $z$ takes positive values if the opinions of the agent and of its social partner are the same, and is negative if there is disagreement. If the absolute value of $z$ is large, the agent can be said to be very sure about its opinion, since small changes in the issue will not change its classification.
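The modulated Hebbian step described above can be sketched as follows. The sketch assumes the noise model $E(z) = -\ln[\epsilon + (1 - 2\epsilon)\Phi(z/\sqrt{C})]$, for which $F_{mod}(z) = -C\,\partial E/\partial z$ has a closed form. The names `energy`, `f_mod` and `hebbian_step`, the update $\hat\omega \to \hat\omega + F_{mod}(z)\,\sigma\,x$ without any further normalization, and the noise model itself are assumptions for illustration.

```python
import math


def phi(t):
    """Density of the standard Gaussian."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def Phi(t):
    """Cumulative distribution of the standard Gaussian."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def energy(z, C, eps):
    """Assumed learning energy for a fraction eps of inverted opinions."""
    return -math.log(eps + (1.0 - 2.0 * eps) * Phi(z / math.sqrt(C)))


def f_mod(z, C, eps):
    """Closed form of -C dE/dz for the assumed energy above.

    For eps = 0 and z below roughly -38 sqrt(C), Phi underflows to zero,
    so this sketch should be used with moderate values of z.
    """
    s = math.sqrt(C)
    return s * (1.0 - 2.0 * eps) * phi(z / s) / (eps + (1.0 - 2.0 * eps) * Phi(z / s))


def hebbian_step(w, x, sigma, C, eps):
    """Move w along sigma * x by an amount set by the modulation function."""
    h = sum(wa * xa for wa, xa in zip(w, x))  # prior opinion of the agent
    step = f_mod(sigma * h, C, eps)           # large for novelty, small for corroboration
    return [wa + step * sigma * xa for wa, xa in zip(w, x)]


if __name__ == "__main__":
    w = [0.5, 0.5, 0.0, 0.0, 0.0]
    x = [1.0, 0.0, 0.0, 0.0, 0.0]
    print(hebbian_step(w, x, -1.0, 1.0, 0.1))  # partner disagrees: novelty
    print(hebbian_step(w, x, +1.0, 1.0, 0.1))  # partner agrees: corroboration
```

In this sketch a disagreeing opinion produces a larger weight change than an agreeing one, and for `eps > 0` the change vanishes again for very negative `z`.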
But more strikingly, the modulation function depends on $C$. In Figure 1 we present $F_{mod}(z)$ for different values of $\rho = 1/\sqrt{1 + C^2}$, a convenient variable since it takes values between zero and one.
The variable $\rho$ is close to zero when the agent's opinion has probability around one half of agreeing with that of the social partner. As learning occurs, $\rho$ increases towards one. It can be shown that $\rho$ is related to the probability $e_g$ of the opinions being different on a random issue, and $e_g$ goes to zero as $\rho \to 1$.
In particular, $e_g = \frac{1}{\pi} \cos^{-1} \rho$ for large $d_m$ and uniformly and independently distributed examples, and $\rho$ remains a useful variable in other conditions.
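The relation between $\rho$ and $e_g$ can be checked numerically with a short sketch. It assumes that $\rho$ plays the role of the normalized overlap between the moral vectors of the agent and of the social partner, and that issues have independent Gaussian components. The function names and the choice of five dimensions in the simulation are illustrative.

```python
import math
import random


def generalization_error(rho):
    """e_g = arccos(rho) / pi, the probability of disagreeing on a random issue."""
    return math.acos(rho) / math.pi


def simulated_disagreement(rho, n_issues=20000, dim=5, seed=0):
    """Monte Carlo estimate of the disagreement rate between two unit
    vectors with overlap rho, for issues with independent Gaussian components."""
    rng = random.Random(seed)
    partner = [1.0] + [0.0] * (dim - 1)
    agent = [rho, math.sqrt(1.0 - rho * rho)] + [0.0] * (dim - 2)
    disagreements = 0
    for _ in range(n_issues):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        h_partner = sum(a * b for a, b in zip(partner, x))
        h_agent = sum(a * b for a, b in zip(agent, x))
        if h_partner * h_agent < 0.0:
            disagreements += 1
    return disagreements / n_issues


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.8, 0.99):
        print(rho, generalization_error(rho), simulated_disagreement(rho))
```

The two columns printed by the example should agree up to sampling fluctuations, with $e_g = 1/2$ at $\rho = 0$ and $e_g \to 0$ as $\rho \to 1$.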

A.3 Social influence phase
We consider that the information exchanges in the formative phase occur at random and thus the effective $\rho$ for each agent is a random number. Now we freeze the evolution of the modulation function: $\rho$, or equivalently $C$, is fixed at a particular value for each agent. We consider the agents to start a new phase in their lives in which the value of $\rho$ no longer changes. The agents learned to learn in the formative phase and now they just learn from each other. The validity of this supposition as a representation of the development of adolescents has to be investigated independently.
The dynamics of information exchange is analogous to that considered in [12,13], the only difference being that the learning occurs with the Bayesian algorithm described above.
We suppose that a society discusses a set of $P$ issues. The parsing of an issue into a vector might be subjective, which is expressed by the fact that agent $i$ obtains a vector $x_i$. Exchange of information between agents is about the average vector, which we reasonably suppose to be independent of the agent, since fluctuations due to subjective parsing, if unbiased, tend to cancel out. We call $Z$ the Zeitgeist vector since it captures the contributions of all issues that are currently being discussed by the society. Without any loss of generality it will be normalized to unit length. The opinion of agent $k$ about the Zeitgeist is $h_k = w_k \cdot Z$ and its sign is denoted by $\sigma_k = \mathrm{sign}(h_k)$. We now consider a Metropolis-like stochastic dynamics of information exchange. Pick at random one agent, call it $i$. Pick its social partner, call it $j$, uniformly from its social neighbors. Now choose a $d_m$-dimensional vector $u$ drawn uniformly on a ball of radius $\kappa$. A trial weight vector $T$ is constructed from $w_i$ and $u$, and whether it is accepted as the new weight vector, $w_i(t+1) = T$, is decided by the learning energy.
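One step of such a dynamics can be sketched as follows. Several ingredients are assumptions, since they are not fixed by the description above: the trial vector is taken as $T = (w_i + u)/|w_i + u|$, $u$ is drawn from the volume of the ball, the learning energy is the assumed noisy form $E(z) = -\ln[\epsilon + (1 - 2\epsilon)\Phi(z/\sqrt{C})]$ with $z = \sigma_j\, w \cdot Z$, and the standard Metropolis rule is used, accepting with probability $\min(1, e^{-\beta \Delta E})$. The ring network in the example is a hypothetical stand-in chosen for brevity, and all function names are illustrative.

```python
import math
import random


def Phi(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def energy(z, C, eps):
    """Assumed learning energy; eps > 0 keeps it finite for any z."""
    return -math.log(eps + (1.0 - 2.0 * eps) * Phi(z / math.sqrt(C)))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [c / n for c in v]


def random_in_ball(dim, kappa, rng):
    """Uniform point inside a dim-dimensional ball of radius kappa."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = math.sqrt(dot(g, g))
    radius = kappa * rng.random() ** (1.0 / dim)
    return [radius * c / n for c in g]


def metropolis_step(weights, neighbors, Z, C, eps, beta, kappa, rng):
    """Try to update one randomly chosen agent; return True if accepted."""
    i = rng.randrange(len(weights))
    j = rng.choice(neighbors[i])
    sigma_j = 1.0 if dot(weights[j], Z) >= 0.0 else -1.0
    u = random_in_ball(len(Z), kappa, rng)
    trial = normalize([w + du for w, du in zip(weights[i], u)])
    delta = (energy(sigma_j * dot(trial, Z), C[i], eps)
             - energy(sigma_j * dot(weights[i], Z), C[i], eps))
    if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
        weights[i] = trial
        return True
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    n_agents, dim = 20, 5
    Z = normalize([1.0] * dim)
    weights = [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
               for _ in range(n_agents)]
    # Hypothetical ring network: each agent talks to its two neighbors.
    neighbors = [[(k - 1) % n_agents, (k + 1) % n_agents] for k in range(n_agents)]
    C = [0.5] * n_agents  # frozen learning style, one value per agent
    accepted = sum(metropolis_step(weights, neighbors, Z, C, 0.1, 3.8, 0.3, rng)
                   for _ in range(5000))
    print("accepted moves:", accepted)
    print("mean opinion:", sum(dot(w, Z) for w in weights) / n_agents)
```

The per-agent list `C` encodes the frozen learning style of the social influence phase; heterogeneous populations can be explored by giving different agents different values.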

B Bayesian inspired learning algorithms
For the learning set $D_t = (y_0, \dots, y_{t-1})$ of independently chosen vectors and their opinions, the likelihood is a product, $P(D_t|\omega) = \prod_{s=0}^{t-1} P(y_s|\omega)$, where $\omega$ is the set of parameters to be inferred. The data come in ordered pairs $y_t = (\sigma_{j,t}, x_t)$, where $\sigma_{j,t}$ is positive if agent $j$ considers issue $x_t$ morally acceptable and negative otherwise; $x_t = (x_t^1, \dots, x_t^N)$ is a five-dimensional vector. Our choice of $N = 5$ is determined by Moral Foundation theory.
Bayesian inference derives from the application of Bayes theorem in order to incorporate information that permits updating from a prior to a posterior distribution. In online learning we consider the updating of the distribution due to the addition of a single example pair $y_t$: $P(\omega|D_{t+1}) = \frac{P(\omega|D_t)\, P(y_t|\omega)}{\int d\omega'\, P(\omega'|D_t)\, P(y_t|\omega')}$.
The amount of memory needed to store the whole posterior can be prohibitively large and following Opper [47] we consider a simplification where the posterior is constrained to belong to a parametric family, which we take to be the N dimensional multivariate Gaussian.
If at a certain stage our knowledge is codified into one such Gaussian, a Bayesian update will in general take the posterior out of the Gaussian space. A new Gaussian posterior is then chosen in such a way that the information loss is minimized. Thus the learning step comprises two sub-steps:
• A new example drives the posterior out of the Gaussian space: $P(\omega|D_{t+1}) := P(\omega|D_t, y_t) = \frac{P_G(\omega|D_t)\, P(y_t|\omega)}{\int d\omega'\, P_G(\omega'|D_t)\, P(y_t|\omega')}$ (25)
• Project back to Gaussian space: the projection step is done by minimizing the Kullback-Leibler divergence or, equivalently, by maximizing the cross entropy, $D_{KL} = \int d\omega\, P(\omega|D_t, y_t) \log \frac{P(\omega|D_t, y_t)}{P_G(\omega|D_{t+1})}$.
The minimization of the KL divergence results in projecting onto the Gaussian with the same mean and covariance matrix as the non-Gaussian posterior: $\hat\omega_{t+1} = \int d\omega\, \omega\, P(\omega|D_t, y_t)$ and $C_{t+1} = \int d\omega\, (\omega - \hat\omega_{t+1})(\omega - \hat\omega_{t+1})^T P(\omega|D_t, y_t)$. Now change variables, introducing the fluctuations $u$ around the mean, $\omega = \hat\omega_t + u$. Using that for Gaussians with zero mean $\mathbb{E}(x f(x)) = \mathbb{E}(x^2)\, \mathbb{E}(f'(x))$, and that $\partial f(x+y)/\partial x = \partial f(x+y)/\partial y$, it follows that the new mean and covariance change as described by equations 1, 2 (main text), 8 and 9 in A.
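The projection step can be illustrated in one dimension. The sketch below compares the mean of the non-Gaussian posterior, computed by direct numerical integration, with the closed form $\hat\omega + C\, \partial \ln \langle P \rangle / \partial \hat\omega$ that follows from the Gaussian identity quoted above. It assumes a single issue $x = 1$ and a noisy sign likelihood $P(\sigma|\omega) = \epsilon + (1 - 2\epsilon)\Theta(\sigma\omega)$. Both the likelihood and the function names are assumptions for illustration.

```python
import math


def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def Phi(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def likelihood(sigma, omega, eps):
    """Assumed noisy sign likelihood for the single issue x = 1."""
    return eps + (1.0 - 2.0 * eps) * (1.0 if sigma * omega > 0.0 else 0.0)


def posterior_mean_numeric(m, C, sigma, eps, n=200000, width=10.0):
    """Mean of prior N(m, C) times likelihood, by midpoint integration."""
    s = math.sqrt(C)
    lo = m - width * s
    d = 2.0 * width * s / n
    num = den = 0.0
    for k in range(n):
        w = lo + (k + 0.5) * d
        p = math.exp(-0.5 * ((w - m) / s) ** 2) * likelihood(sigma, w, eps)
        num += w * p
        den += p
    return num / den


def posterior_mean_formula(m, C, sigma, eps):
    """m + C d/dm ln(evidence), with evidence = eps + (1 - 2 eps) Phi(sigma m / sqrt(C))."""
    s = math.sqrt(C)
    evidence = eps + (1.0 - 2.0 * eps) * Phi(sigma * m / s)
    d_evidence = (1.0 - 2.0 * eps) * phi(sigma * m / s) * sigma / s
    return m + C * d_evidence / evidence


if __name__ == "__main__":
    # The agent holds opinion m = 0.5 and hears a disagreeing partner (sigma = -1).
    print(posterior_mean_numeric(0.5, 1.0, -1.0, 0.1))
    print(posterior_mean_formula(0.5, 1.0, -1.0, 0.1))
```

The two printed values should coincide up to the discretization error of the integration, which shows that matching the first moment of the non-Gaussian posterior reproduces the gradient form of the update for the mean in this simple case.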

C Comparing ERN and the modulation function
The modulation function determines the size of the weight changes during learning. We define the averages of the modulation function over novelty ($z < 0$) and over corroboration ($z > 0$), $F_{mod}^{novelty}$ and $F_{mod}^{corroboration}$, taken with respect to the distribution $P(z)$ of $z = h\sigma$. For a uniform distribution of examples, and with the normalization of $\omega$, $P(z)$ is the Gaussian distribution with zero mean and unit variance. Since the modulation function depends on $\rho$, the difference $\Delta F = F_{mod}^{novelty} - F_{mod}^{corroboration}$ (29) can be associated with a political affiliation. This is shown in figure 3.c. This is the closest we can come theoretically to defining within the model a quantity similar to the Error Related Negativity (ERN) measured by Amodio et al [27], who report differences between measured EEG signals of unexpected and expected situations conditional on self-declared political affiliations. In figure 3.d we show the results from [27] for the magnitude of the ERN signal versus political affiliation.
The vertical axes are the modulation functions and on the horizontal axes appears the product of the prior opinion of the agent (h) times the sign (σ) of the arriving opinion information. Social interactions with opinions where hσ < 0 bring new information, those with hσ > 0 are corroborative. The different curves are each drawn for a different total number of opinions to which the agent has been exposed, measured by ρ, which increases as shown by the arrow (↓). The modulation function changes from almost a constant, for a very small number of social opinion exchanges, to a very asymmetrical form where repetitive information causes almost no change at all and novelty gives rise to a very high modulation of changes, except when a level of distrust has been surpassed, as in the very negative region of hσ in the right panel. See Appendix B for details.
Figure 2. Comparison with empirical data. Empirical opinions (histograms in orange) correspond to the overlap between moral weights obtained by MFT questionnaires and the average weight of the most conservative group (Zeitgeist direction Z). The histograms obtained by simulating social influence in a social network with homogeneous learning styles (homogeneous ρ) and computing overlaps between the moral weights of the agents and a given Zeitgeist direction (Z = (1, 1, 1, 1, 1) in the simulation) are represented by the black line. In each graph we find the ρ that best fits the empirical histogram for pressure β = 3.8 and for each political affiliation group. Simulations are performed on a Barabási-Albert network with N = 400 and average degree 20.
The stripes represent regions of the space of parameters where agents could be statistically identified with a group with a given political affiliation. The lower line represents the boundary between ordered (above) and disordered (below) societies. Below the transition line, and for very large β, no identification with MFT questionnaire respondents was found. Right: Color coded relaxation times after changes in the set of moral issues. Note that at the transition relaxation times are very large. This is called critical slowing down. For the agents identified with respondents of the MFTQ, the lowest times correspond to liberal identified agents and the largest times to conservative identified agents. The line just above the transition shows the locus of minimum correlation time as a function of β, for fixed ρ.
The effective number of moral dimensions for two values of the pressure, before and after an external threat. If a threat leads to increased pressure, the statistical signature of liberal agents will look more like that of conservatives.