# Wealth Distribution in Villages. Transition From Socialism to Capitalism in View of Exhaustive Wealth Data and a Master Equation Approach

^{1}Department of Physics, Babeş-Bolyai University, Cluj-Napoca, Romania^{2}Wigner Research Centre for Physics, Budapest, Hungary

Socio-economic inequalities derived from an exhaustive wealth distribution is studied in a closed geographical region from Transylvania (Romania). Exhaustive wealth data is computed from the agricultural records of the Sancraiu commune for three different economic periods. The data is spanning two different periods from the communist economy and gives a glance to the present situation after 31 years of free market economy in Romania. The local growth and reset model based on an analytically solvable master equation is used to describe the observed data. The model with realistically chosen growth and reset rates is successful in describing both the experimentally observed distributions and the inequality indexes (Lorenz curve, Gini coefficient, and Pareto point) derived from this data. The observed changes in the inequality measures are discussed in the context of the relevant socio-economic conditions.

## 1 Introduction

Fascinating statistics related to cities or smaller settlements have been intensely studied by physicists in the last few decades. On the data mining and analyzing side, the studies coming from physics revealed many universalities and scaling laws. Modeling was done with simple physics inspired models that were able to prove by their success the relevancy of some socio-economic processes in the investigated phenomenon. Numerous universalities have been revealed and successfully modeled [1–7].

It has been shown that despite the large differences in their population, cities are all qualitatively similar from the point of view of most of the sociometric indices. Many of the urban sociometric measures, such as length of roads, total income, GDP, total wages, gross urban product etc are scaling with the population of the settlement [1, 3, 5]. Based on the value of the scaling exponent, for cities from the United States, Germany and China many of these urban indicators were categorized by Bettencourt et al. in three regimes: sublinear, linear and superlinear [3]. An elegant explanation for the reason why larger cities are growing on the expense of smaller ones were offered by interpreting this different scaling regimes.

The social inequality in a settlement’s population is also affected by the size of the population, however the methodology of measuring and modeling social inequalities is more complex than the one used for most of the other urban indices mentioned above. Social inequalities have been studied beginning from the early days of Economics and presently is one of the main direction in Econophysics [8, 9]. Everybody is aware of Vilfredo Pareto’s seminal works [10], revealing the universality of the fat-tailed distributions in income and wealth and the related 80–20*%* law. Econophysics targets the problem of inequalities in a society starting from the experimental distribution of income or wealth of individuals (or groups of individuals) of a well-delimited population [11].

The most commonly accepted measure that characterizes the inequality in a society is the Gini index [12, 13]. It has been shown that the value of the Gini index grows with the size of the city, meaning that inequality is more pronounced in larger cities, than in case of smaller ones [6]. Recently, a more profound study focusing on the effect of city size on the income distribution was published by E. Heinrich Mora et. al. [2], showing that the income does not scale in the same manner for all regions of the society. The scaling is sub-linear for the lower regions of the society and scales super-linear in case of the higher regions. This result is in agreement with the wealth data from a small settlement (about 1,000 households) processed in this article. The results of Mora et. al. illustrates nicely the effect of increasing inequality with the city size.

Not only the universality of the fat-tail characterizing wealth and income distribution, but also understanding the underlying mechanism leading to the unusual distribution of the relevant socioeconomic measures are fascinating questions. It is known nowadays that the distribution of income and wealth is not a simple one, different type of distributions apply to different regions of income or wealth. The richer end of both distributions can be described with the power-law type distribution [10, 11]. The low and middle classes of income and wealth can be described better with an exponential distribution. In case of wealth a third region should be considered as well, the region of negative wealth (debts), which is also characterized by an exponential trend, different from the one which applies for the low and middle classes [14].

Modeling the experimental data is done by many different methods, ranging from simple mean-field type models, to more complex models involving network-science approach or agent based computational simulations [11, 15–29].

Recently, we proposed a simple master-equation approach based on a Local Growth and Global Reset (LGGR) [30] process to describe in a similar manner the distribution of both income and wealth [14, 28]. The novelty of this approach is also its ability to describe all income and wealth regions in a unified manner, proposing a compact form of the distribution function for all income/wealth categories and capturing the regions of the negative wealth as well.

Although income and wealth seems to be related and can be modeled with similar approaches, from data mining side there are big differences between them. While incomes are present in many electronically available data (taxation for example), wealth is more difficult to quantify and measure directly. Such data collection raises many private issue problems as well. Wealth is usually estimated via some proxies (see for ex., [31]) and it is hard to find exhaustive data for a well-delimited and statistically significant community. In our previous studies targeting income distributions [28, 32] we have shown the advantages of having a complete statistics of income (exhausting data) in a population for many consecutive years. Such data allows not only to test the statistics offered by the model, but it also allows verification of some hypothesis used in modeling the dynamics of the system.

Exhaustive real-world wealth data are rare to find in the literature and as a consequence, these could be extremely precious from the point of view of modeling socio-economics problems related to social-inequalities. Here we construct such an exhaustive wealth database extracted from taxation and ownership data that is available at the mayor’s office in a Romanian village community, *comuna Sâncraiu,* (*Kalotaszentkirály*), having about 1,000 households. The database includes data from three different years: 1961, 1989, and 2021. The really interesting side of this data is that the studied years are relevant snapshots for three very different political periods, influencing in a different manner private wealth. The year 1961 is exactly the year before collectivization, when the lands given to the peasants by the communists after 1947 were again confiscated and collectivized. 1989 was the last year of the communism regime in Romania and finally 2021 is the present situation after more than 30 years after the fall of communism, offering a picture for the effects of the free market economy in Romanian villages.

Besides presenting and discussing the relevant wealth distributions derived from the data, our additional aim is also to validate once again the modeling power of the LGGR model [14] in understanding social inequalities. By possessing exhaustive wealth data for three different economic periods in Romania we are also able to show how social inequalities varied in these turbulent times and how the growth and reset rates should be adjust in the LGGR model to account for the measured wealth-distributions. Besides the relevant distribution functions we discuss and compare the Lorenz curves, Pareto points and the value of the Gini indices for all the studied periods.

The rest of this manuscript is organized as follows: 1) we present and discuss the data; 2) the LGGR modeling framework is briefly discussed; 3) the LGGR model is applied to the obtained wealth data by using different growth and reset rates; 4) the obtained results are discussed in view of socio-economic inequalities, and 5) finally we summarize our findings.

## 2 The Exhaustive Wealth Data

Wealth is hardly quantifiable since it is composed by all valuables that a person possesses. Previous studies regarding wealth distributions were based either on the inheritance tax data [33], estimated wealth data provided by media companies (Forbes) or organizations such as the one managing the World Inequality Database (WID) [28]. A first shortcoming of these data is their incomplete nature and a biased sampling: targeting the aging part of the population or the rich, for example. Analysis performed on exhaustive wealth data is lacking from the literature to our knowledge. Importance of such exhaustive data was recognized in our previous works on income distribution. The exhaustive data for income derived from a anonymized social-security database in ten consecutive years in the Cluj county (Romania) [14, 32], allowed us to model realistically the dynamics of the income and to describe successfully its distribution.

Here we aim to understand wealth distribution by first constructing an exhaustive data collection for a delimited geographical territory in Romania. The used data and the compiling method that is presented in the following allowed to infer the wealth of each household in a commune of Cluj county. Wealth datasets from three radically different economic life situations are considered:

• 1961 The year under the communist regime where most of the agriculture destined land was confiscated and collectivized, following the land reform from 1945.

• 1989 The last year of the communist regime, when most of the private property is already abolished, leaving households with limited valuables.

• 2021 The current year, 31 years after the abolishment of the communism and transition to a free market economy.

Following an agreement between the Babeş-Bolyai University of Cluj, and commune Sancraiu (in Hungarian: Kalotaszentkirály—Zentelke) records regarding the wealth of households in the commune were obtained in anonymized format. The commune is a well-delimited region in Transylvania, consisting of five villages: Alunisu (Magyarókereke), Braisoru (Malomszeg), Domosu (Kalotadámos), Horlacea (Jákótelke) and the eponymous Sancraiu. The region and these villages can be located on the map of Figure 1. The current population of the commune is 1,628 inhabitants according to the census from 2011. The 1956 census enumerates 3,557 inhabitants [34] and the census from 1992 records a population of 2.053 individuals [34].

**FIGURE 1**. The publicly available map of commune Sancraiu (Cluj, Romania). The map presents the boundaries of the commune, the neighboring communes and city along with the constituent villages, intra-village territory, and agricultural land parcels. Courtesy of Sancraiu Mayor’s office.

The wealth component data for 1961 and 1989 was obtained by digitalizing in an anonymous manner the agricultural registers of the mayor office (Agricultural Register of Sancraiu commune for 1959-1960-1961, Agricultural Register of Sancraiu commune for 1959-1960-1961), while for the year 2021 we used the anonymized taxation database for land and buildings.

The “agricultural register” is a complete agricultural record kept by the local authorities. These records contain each household in the commune along with the owned land area, house area, auxiliary building area (such as barns etc.), along with the number of owned livestock in different category (horses, cows, and sheep, etc.). As the commune is located historically in an agricultural area, we assumed that before 1990 the agricultural records contain the bulk of the valuables that a household owns. In the data there is no estimation of the total wealth of the households, this can be approximated only by transforming all the mentioned valuables with appropriate weights in a hypothetical good. With a realistic weighted sum of all recorded valuables one can create thus a reasonable proxy, that estimates the wealth of an individual household. The obtained wealth is thus a quantity expressed in arbitrary units (*a*.*u*.), and their values cannot be compared between the different years, because the wealth components and also their share in the total wealth is changing in time. In the following we describe each dataset in detail, together with the estimation method for the total wealth.

### 2.1 Wealth Data for 1961

The Agricultural Registers 1959-1960-1961 contain information about each household in the commune. The total number of individual households that are recorded are 1,133. For each household the following data are used to construct a proxy for the wealth:

• *A*_{L}–total land owned in hectares (*ha*). All types of land are included. The weight parameter will be denoted by

• *A*_{Bh}–area of the house owned in *m*^{2}. Weight parameter:

• *A*_{Ba}–total area of the auxiliary buildings (such as barns, etc.). Weight parameter:

• *N*_{C}–number of cows owned. Weight parameter:

• *N*_{WB}–number of domestic water buffaloes owned. Weight parameter:

• *N*_{H}–number of horses. Weight parameter:

• *N*_{P}–number of pigs. Weight parameter:

• *N*_{S}–number of sheep. Weight parameter:

• *N*_{G}–number of goats. Weight parameter:

The dataset is diverse enough to allow a robust approximation of the wealth. The total wealth of household *i* is calculated as a simple weighted sum of these valuable categories

with the *P*_{i} weight factors satisfying the normalization: *∑*_{i}*P*_{i} = 1.

Since the values of the weighting parameters are largely disputable, we estimate the individual wealth values with 10 different (but realistic) parameter sets, allowing also an estimation for the uncertainty of the wealth estimation method. The chosen parameter sets are presented in Table 1. To fully understand the effect of different parametrization, we calculate also the share of a single “category” from the total wealth in the commune. The total wealth is calculated as *w* = *∑*_{i}*w*_{i}. The share of the category *X* (this can be the land *A*_{L}, house *A*_{Bh} etc.) with weighting parameter *P*_{X} in the total wealth in percents will be

where:

We present the shares by categories for the parameter sets from Table 1, in Table 2. The data from this table indicates that the used parameter sets cover a wide range of acceptable methods to compound a wealth proxy and the uncertainty estimated from here is realistic.

The experimental wealth distribution *ρ*(*w*) is calculated for each parameter set. The probability density function is built up from the individual household wealth values with the histogram method. For all parameter sets we used the same number of bins for the histogram, however the middle point of each bin becomes slightly different, leading to an average value and also a corresponding error bar. The error bars for the probability density function is estimated from the different bin intervals and histogram values similarly. The error bars on the points in the direction of both axes signify the greatest deviation from the mean of the bin. The use of these error bars allows us to represent in a compact manner the ensemble of distributions obtained with the used parameter sets.

The probability density function obtained with the discussed methodology is presented in Figure 2. The figure suggest that the different parameter sets introduce some uncertainties affecting the location of the data points, however the overall qualitative shape of the distribution is well delimited.

**FIGURE 2**. Probability density for wealth distribution in 1961. Error bars are obtained by combining the results of the different weight parameter sets shown in Table 1. Wealth values are given in arbitrary units (a.u.) as it is explained in Section 2.1. The theoretical distribution (9) fitted to the averaged experimental distribution (*k* = 1.91 and *r* = 0.82) is shown by the continuous line.

### 2.2 Wealth Data for 1989

The records form the Agricultural register 1989 were also digitalized and anonymized. There are in total 921 individual households in the records. Before proceeding, one has to note that the composition of the wealth will be very different from the year 1961. First, this is because the agricultural lands were all fully collectivized and for each household was allowed a home garden of maximum 40 acres. This area also included the surface of the buildings. Secondly, new categories appear in the records, such as the owned cars. The methodology for estimating the total wealth is the same as for 1961. We assume again that the wealth of a household is the weighted sum of the following relevant “categories”:

• *A*_{L}–total land area in acres (*a*, 1*a* = 100 *m*^{2}). Weight parameter:

• *A*_{Bh}–area of the house owned in *m*^{2}. Weight parameter:

• *A*_{Ba}–total area of the auxiliary buildings (such as barns, etc.) owned. Weight parameter:

• *N*_{cc}–number of carriages. Weight parameter:

• *N*_{ca} - number of cars owned. Weight parameter:

• *N*_{C}–number of the cow. Weight parameter:

• *N*_{WB}–number of domestic water buffaloes. Weight parameter:

• *N*_{H}–number of horses. Weight parameter:

• *N*_{P}–the number of pigs. Weight parameter:

• *N*_{S}–the number of sheep. Weight parameter:

The used different sets of weigh parameters are presented in Table 3. The shares in the total wealth of the different categories, as it was discussed for 1961, are presented in Table 4.

The probability densities of wealth distribution *ρ*(*w*) was computed in the same manner as for 1961 and the error bars were estimated again by using the different weight parameter sets. The result is shown on Figure 3.

**FIGURE 3**. Probability density for wealth distribution in 1989. Error bars are obtained by combining the results of the different weight parameter sets shown in Table 3. Wealth values are given in arbitrary units (a.u.) as it is argued in Section 2.2. The theoretical distribution (Eq. 9) fitted to the averaged experimental distribution (*k* = 0.75 and *r* = 1.22) is shown by the continuous line.

### Wealth Data for 2021

The wealth distribution for year 2021 was derived from the anonymized local taxation records. These records contain all the estimated value of the owned real estates, and the size of the owned land inside the villages (possible built area, with running water, electricity and waste management). The land sizes and types owned outside of the village are also present in the database. For 2021 there are 1,595 taxpaying individuals in these records. The tax an individual pays after their owned valuables are calculated from these records, however we did not have the information regarding the payed total amount of tax. Since the values of different land types are not fixed, again realistic weights are needed to combine different assets. The selected “categories” and weights for computing the total wealth of an individual are:

• *V*_{b}–total estimated value of the owned buildings. It is calculated by taking into account the type of the owned house, the heating apparatus, building materials etc. by the local authority. Weight parameter:

• *A*_{pb}–the area of built-up territories inside the villages in hectares (ha). Weight parameter:

• *A*_{a}–the arable field outside the villages in *ha*. Weight parameter:

• *A*_{p}–the pasture land outside the villages in *ha*. Weight parameter:

• *A*_{g}–the owned grassland outside the villages in *ha*. Weight parameter:

• *A*_{f}–the owned forest outside the villages in *ha*. Weight parameter:

In Table 5 we present the different parameter sets for the weight factors used for 2021. The percentile shares for the different wealth categories calculated according to 2) are presented in Table 6. The methodology of the individual wealth calculation remains the same as in the previous cases. The probability density for wealth distribution with the error bars calculated with the used parameter sets, *ρ*(*w*), is given in Figure 4.

**FIGURE 4**. Probability density for wealth distribution in 2021. Error bars are obtained by combining the results of the different weight parameter sets shown in Table 5. Wealth values are given in arbitrary units (a.u.) as it is argued in Section 2.2. The theoretical distribution (11) fitted to the averaged experimental distribution (*γ* = 5.3 and *β* = 2.1) is shown by the continuous line. Please note the logarithmic scales.

## 3 The LGGR Modeling Framework

A master equation approach describing uni-directional local growth and global reset processes (LGGR) was recently introduced for modeling various distributions that are frequently encountered in complex systems. The appropriateness of such a simple model to describe income and wealth data in modern societies was also discussed in a couple of recent articles [14, 28]. For a better understanding of the LGGR model let us consider a system composed of identical entities, each of them characterized by the amount of quanta they posses. An immediate example of such a system would be the individuals of a society owning different amount of wealth. Let us denote by *P*_{n}(*t*) the probability that a person has *n* quanta of wealth at time *t*. The *P*_{n}(*t*) probability has to satisfy normalization: *∑*_{{n}}*P*_{n}(*t*) = 1. For some fixed *μ*_{n} growth rates and *γ*_{n} reset rates some possible dynamical scenarios are sketched in Figure 5. In the scenario from Figure 5A, the reset rate *γ*_{n} is positive for all *n* values. The model allows for a scenario with *γ*_{n} state-dependent rates where *γ*_{n} < 0 if *n* < *n*_{c}, and *γ*_{n} > 0 for *n* > *n*_{c}. This would mean as a general rule actors with low wealth *n* < *n*_{c} are coming in the system, and those leaving the system would have *n* > *n*_{c} amount of quanta, in general. The schematic representation of the process in case of this scenario is displayed on Figure 5B. In [14, 28] it has been shown, that this second scenario is extremely appropriate to model socio-economic inequalities. Independently, whether we are in the case (a) or (b), the analytical investigation of the LGGR model follows the same route.

**FIGURE 5**. Illustration of the growth and reset process: **(A)** the general mechanism of the process with a positive reset rate, **(B)** the process considering a reset rate which can be both positive and negative.

In case when only local unidirectional transitions (*n* → *n* + 1) are considered for growth, the dynamical evolution equation will have the following form:

The term containing the <*γ*> quantity, guarantees the normalization of *P*_{n}(*t*) by feeding the system at the state *n* = 0 if needed:

In our previous studies we have shown the generalization of the dynamical Eq. 4 to continuous states by converting it into a partial differential equation in the limit when *dt* → 0 [35]. Generalizing the growth and reset rates to continuous states (*μ*_{n} → *μ*(*w*), *γ*_{n} → *γ*(*w*)), the evolution equation corresponding to Eq. 4 has the following form:

Here *ρ*(*w*, *t*) is the normalized probability density (*∫*_{{w}}*ρ*(*w*, *t*)*dw* = 1) for an individual possessing *w* amount of wealth at time moment *t*. The feeding term at 0 is similar to the discrete limit, and it is described with the *δ*(*w*) Dirac functional. The <*γ>* feeding at 0 should be:

In [30, 35] it was proven, that the above dynamical evolution equation converges to a steady-state with a *ρ*_{s}(*w*) stationary probability density:

According to this, one can see that the equation for *ρ*_{s}(*w*) is practically independent of the time-scale on which the *μ*(*w*) growth rate and *γ*(*w*) reset rate is considered, since they appear only in fractions where the time-scale factor is eliminated. It worth mentioning however that in case one has access to a long-term exhaustive data in the same geographical region, it is possible to determine also the functional for of both *μ*(*w*) and *γ*(*w*) from the data related to the relevant time-scales. Such a study was recently performed by us for income distribution in the Cluj county (Romania) [28].

By correctly choosing the *μ*(*w*) growth- and *γ*(*w*) reset rates, the LGGR model will lead to *ρ*_{s}(*w*) distributions that are frequently encountered in complex systems [35].

In case one is interested in the transient dynamics and the convergence to the stationary state, some relevant results are given in [30, 36]. In [30] it is proven under quite general conditions the convergence to the stationary state. In [36] it is shown for discrete states and two special choice of the growth and reset rates that the convergence to the stationary distribution is exponential. For constant growth rate, *μ*, and constant reset rate, *γ*, it is found that the convergence is faster than exp(−*γ t*). For constant reset rate, *γ*, and linearly increasing growth rate, *μ* = *σ*(*n* + *b*), the convergence is faster than exp[−(*σ* + *μ*) *t*].

We will apply thus the LGGR modeling framework to describe the wealth distributions obtained in the mentioned three different economic period of the studied geographical region. We will assume for each three studied years that stationarity is reached. Since all these 3 years were selected so that they represent the end of a peculiar and long-enough socio-economic condition, this assumption is realistic.

## 4 Application of the LGGR Model

Using particularized growth and reset rates we apply now the LGGR model for the three distinct economic situations in order to explain the obtained wealth distributions. First, we identify the proper kernels for the reset and growth rates and then calculate the resulting stationary probability density functions. We then adjust the model parameters to obtain a qualitatively acceptable overlap between the experimental and model data. In a later section we will discuss some aspects related to the chosen growth and reset kernels and derive also some accepted economic indicators of wealth inequalities, like the Gini index, Lorentz curve or the much-discussed Pareto point.

As a starting point in our endeavor, we note that unlike to our last work related to wealth inequalities [14], in our present data there is no information on negative wealth, i.e., debts. As a consequence of this, the application of the LGGR model for describing our data should be possible using much simpler kernels for the growth and reset rates.

### 4.1 Wealth Distribution in Communism–Constant Growth and Linear Reset Rates

For modeling the data in years 1961 and 1989 we considered the following rates:

• a constant growth rate, resulting in a slow growth independently of the existing wealth amount, in agreement with the principles of communism, *μ*(*w*) = *k*.

• a linear reset rate in the form *γ*(*w*) = *w* − *r*. Assuming that only positive wealth exists such a rate will have a negative value in the interval [0, *r*), and becomes positive on *w*, while negative reset means in average an entering into the system with wealth *w*. In this manner, the value of the *r* value marks the *n* = *n*_{c} boundary. Wealth above this value favor a reset, while wealth under it is considered as starting assets for a new individual incoming in the statistics.

Using the above growth and reset rates, Eq. 8 leads to the stationary probability density:

This is a normalized normal distribution restricted on the *w* ≥ 0 interval with its peak shifted into *r*. We denoted here by erf() the well-known error-function. By properly selecting the *r* and *k* parameters, this approach leads to a good fit for the wealth distributions both for the 1961 and 1989 data.

For the data and wealth distributions in year 1961 the best fit can be achieved with *k* = 1.91 and *r* = 0.82. The fit along with the experimental results for the wealth distribution are presented on Figure 2. Using the median points for the data, the fit with the above parameters results in a coefficient of determination *R*^{2} = 0.98. The analytically calculated average wealth for this normal-like distribution is <*w>*_{theo} = 1.459 *a*.*u.*, which can be compared with the average calculated from the experimental data, for each different parametrization <*w>*_{exp} ∈ [1.36 *a*.*u*.; 1.49 *a*.*u*.]. (Here *a*.*u.* stands for the arbitrary units in wealth resulting from our estimation method).

The same normal-like distribution offers a good fit also for the wealth distribution derived from the 1989 data. The best fit parameters are however: *k* = 0.75 and *r* = 1.22. This fit is illustrated on Figure 3. Using again the median points of the experimentally estimated data the above fit leads to a coefficient of determination *R*^{2} = 0.93. The average wealth obtained from the fitted distribution is <*w>*_{theo} = 1.359 *a*.*u.*, which is in good agreement with the values calculated from our parametrizations of the weight coefficients: <*w>*_{exp} ∈ [1.199 *a*.*u*.; 1.391 *a*.*u*.]

From Figures 2, 3 and from the fit statistics we learn that the chosen LGGR method with the proper *k* and *r* parameters describes well the wealth distribution in the communism periods, more specifically the ones derived for 1961 and 1989. We note that the average wealth in this case can be given analytically:

For small *k* values the mean shifts towards the value of *r*, which is consistent with our data and fit for 1961 and 1989.

### 4.2 Wealth Distribution for the Free Market–Preferential Growth With Constant Reset Rate

It was shown that for the free market economy the growth in wealth should be preferential [35]. This leads to the kernel used in our recent study on wealth distribution in modern societies [14]. A constant reset rate is the simplest approach, assuming that growth starts from zero and there is a constant probability of getting out from the considered statistics (either by relocating or by death). For modeling the wealth distribution observed in year 2021 we used thus the LGGR model with the *μ*(*w*) = *w* + *β* growth rate and *γ*(*w*) = *γ* reset rate. The preferential growth rate is in agreement with the much discussed Matthew principle [37], while the constant reset rate does not differentiate between wealth values, each individual in the system can be resetted at any given time with the same probability.

The above growth and reset rates results in Eq. 8 in the Lomax II type (or Tsallis-Pareto) stationary probability density function:

For the data from 2021 we found that the best fit can be achieved with *γ* = 5.3 and *β* = 2.1. On Figure 4 we present the experimental probability density function for wealth along with the Tsalis-Pareto distribution obtained with the previously mentioned parameters. The used regression has a coefficient of determination *R*^{2} = 0.91. The average wealth computed from the Tsalis-Pareto distribution is <*w>*_{theo} = 0.488 *a*.*u*. in comparison with the average wealth obtained from the data: <*w>*_{exp} ∈ [0.279 *a*.*u*.; 0.682 *a*.*u*.].

## 5 Socio-Economic Inequality Measures

We turn now our attention on estimating also the well-known inequality measures used by social sciences.

We will first construct both the experimental and theoretical Lorenz curves [38, 39] calculated from the experimental and fitted probability density functions, respectively. The Lorenz curve indicates the relation between the cumulative share of wealth owned by the households with wealth above *x*, *F*(*x*), and the cumulative share of households with wealth greater than *x*, *C*(*x*).

with:

One obtains therefore the Lorenz curve by plotting *F*(*x*) as a function of *C*(*x*). For a totally uniform wealth distribution, i.e., no socio-economic inequality, the Lorenz curve would be the first diagonal (“equity line”) in the *F*—*C* square. Deviation from this line characterizes the social inequalities. The area between the Lorenz curve and the equity line (Γ) is related to the well-known Gini index [12, 13], *G*, by *G* = 2 Γ. We recall here that if one has a discrete set of wealth values, *x*_{i} in a society (the case of our experimental data) composed by *N* individuals, the Gini index is a number between 0 and 1, defined as

with <*w>* the average wealth:

A 0 Gini index means no social inequality (all wealths are equal) why a Gini index 1 means that all wealth is owned by one individual, i.e., the most extreme inequality. For a more pronounced social inequality the Gini index is higher.

Alternatively, if one has a continuous probability density function for the wealth distribution, the Gini index is computable as:

Following the above definitions one can construct both the Lorenz curve and Gini index from the experimental and model results. The Lorentz curves for the studied years are plotted on Figure 6. The regions spanned by the experimentally observed Lorenz curves for different weight parameters are indicated with a lighter shedded region. The theoretical Lorenz curve computed with the fitted probability density is indicated with a bold continuous line. As expected our theoretical model describes in an acceptable manner also the Lorenz curves. Some deviations are however observed for year 2021 in the limit of high wealths.

**FIGURE 6**. Experimental and theoretical Lorenz curves for the years 1961, 1989, and 2021. The shaded region indicates the region spanned by the experimental results for different weight parameter sets. The theoretical curves were calculated from the fitted distributions, and are plotted with the continuous bold line.

The Gini index can be estimated from the experimental data for all weight parameter sets using the definition (Eq. 15). One can compute also a theoretical Gini index using Eq. 17 with the probability density function given by the LGGR model and best fit parameters. Results in such sense are summarized in the columns corresponding to *G* in Table 7. There is a good agreement between the experimentally calculated and theoretically computed Gini index, confirming once again the applicability of the theoretically derived probability density functions.

Another possibility to quantify social inequalities is by defining the *P* Pareto point. The original Pareto law states that in any society in general 20*%* of individuals own 80*%* of the total wealth. Starting from this hypothesis for any specific wealth distribution data in a society one can define a *P* Pareto point from the Lorenz curve, assuming that the *P* point is that *P* = *C* value for which *F* = 1−*P*. Naturally, for a society that confirms the 80–20 Pareto law *P* = 0.2. The Pareto point should be a number between 0 and 0.5, smaller values meaning higher social inequalities. Our experimental and theoretically constructed Lorenz curves allow the identification of the *P* Pareto point. From our data we get the Pareto points specified in the columns corresponding to *P* in Table 7. One will note again that similarly with the Gini index the theoretical and experimental values are in good agreement.

## 6 Discussion and Conclusion

Before discussing in detail the obtained results and derive the usual socio-economic inequality measures, let us recall here that the chosen weight parameters were crucial in giving a realistic estimate over the total wealth of an individual or household. Looking back to the chosen values of the weight parameters summarized in Tables 2, 4, 6 one will observe that the chosen sets indeed affects the share of wealth categories. This uncertainty will have a direct effect in the wealth averages too, however the error bars from Figures 2–4 suggests that the overall shape of the probability density functions are not altered in a qualitative manner. The reason for this is relatively simple. As previously said the commune is primarily agricultural in its economy, meaning that for the year 1961 all the different wealth categories may be considered as proportional to the size of the owned land. Indeed, the number of livestock, the size of the barn and house should all be proportional to it, as the land is needed to support the number of livestock, and dictates the size of a barn owned by a household. In the 1961 social situation the distribution of wealth is already quite egalitarian due to the land reforms from 1921 to 1945. As a consequence of these, in 1961 the larges owned estate is only 11.76 ha of land.

After 1961 all lands were forcefully collectivized, leaving households with a garden, the owned smaller buildings and the house. The maximal size of the garden was limited to 40 acres, although some larger ones exist in the records (these gardens were located at places where they could not be meaningfully used.) Anecdotal evidence from the commune says that after the collectivization most of the earned yearly income was invested in rebuilding, extending and building new houses. This is well supported by the data also. While in 1961 the total area of houses owned by households in the commune was 44273 *m*^{2}, in 1989 this number became 65396 *m*^{2} which means a 47.7*%* growth.

As it is observable in Figure 4, in 2021 one will observe a larger scattering of the probability densities, caused by the modified and uncorrelated wealth components used in the estimation of the total wealth. This is a results of less categories in the taxation database and also reflects that the agriculture based society diversified to other sectors such as tourism. As a result of this second effect the proportionality between the size of the land owned and the other categories may not be true anymore. Clearly, the shape of the distribution is shifted towards the characteristic power-law like tail by the current year.

We found once again that the LGGR model provides a useful modeling base for many complex systems. For the considered problem in particular, it allowed a realistic modeling of wealth distribution at each historically significant economic period. For the communism years (1961 and 1989) the growth rate was chosen as a fixed *k* constant. A state-dependent growth rate would have raised ideological problems for the communist regime. The constant growth rate leads to a scenario where people may only produce enough for themselves and only trade for the essentials, instead of investing for a greater profit. The reset rate was considered to be linear, with an *r* offset. This can be interpreted in the context of a wealth control mechanism, imposing a desired wealth amount, above which a resetting is favored. Households enter in their wealth evolution dynamics bellow this *r* value and leave the system over *r* in average, in agreement with the ideology supported by a communistic regime. As the communist economy evolves our regression results suggests that the growth slows from *k* = 1.91 in 1961 to *k* = 0.75 in 1989, while the reset threshold grows from *r* = 0.82 to *r* = 1.22. The results also suggest that for the smaller *k* growth rate and the larger *r* reset threshold the resulting wealth distributions tend to peak at around *r*, with an average wealth also around *r*, as seen in the results for 1989 on Figure 3.

With the fall of communism in 1989 the dynamics governing wealth inequalities has changed. The rules of the open market economy allowed the “rich gets richer” effect which is modeled here by the linear growth rate. Such a linear growth rate was used in our previous approaches to describe the distribution of income and wealth [14, 28]. The constant reset rate implemented by us means that everybody can leave the system with the same likelihood. The linear growth rate and the constant reset rate in the LGGR model leads to the Tsalis-Pareto stationary distribution. The power-law tail of this distribution is consistent with the accepted experimental results in the current literature. It is interesting to note that the tail of the distribution observed for the studied commune in 2021 is *b* = −6.3 which is much steeper than the tails observed on larger population scales. One can compare this exponent for the one observed in the wealth distributions on country level, where for USA and Russia one gets *b* = −2.4 and France with *b* = −2.68 [14]. This steeper decay indicates that the smaller community studied here is much more homogeneous in wealth, and therefore also the socio-economic inequality indicators should be smaller.

Concerning the inequality measures used in social sciences our results are consistent with the ones expected for the investigated economic periods. The experimentally and theoretically constructed Lorenz curves seems to be in good agreement, confirming from another perspective the validity of the theoretically proposed probability density functions. The experimentally and theoretically calculated *G* Gini index are also in good agreement, the theoretical values being in the intervals spanned by the experimental values for different parameter sets. The same observations hold for the value of the *P* Pareto point. The *G* and *p* values in Table 7 shows that social inequalities were low in the communist regime. The *G* value is around ∼*G* increased in this region to a value around ∼*P* Pareto point. The value of *P* is ∼*P*, which nowadays is in this region ∼

In summary, we can state that the experimental and theoretical investigation of wealth distribution in a well-delimited social environment (traditional commune in Transylvania) for economically different situations proved expected variations in social inequality measures. Along with the obtained extremely precious inequality data, the study proves once again the modeling power of the LGGR model in various interdisciplinary problems related to complex systems. It would be definitely interesting also to perform such studies on more years and to reveal by these the functional form of the *μ*(*w*) and *γ*(*w*) rates, similarly with the study performed for wealth [28]. Repeating such a study for other regions in Romania or other geographical parts of the world with a well-delimited community, would also be interesting.

## Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://doi.org/10.6084/m9.figshare.17013467.v2.

## Author Contributions

Conceptualization and research design by ZN; theoretical model by TB and ZN; data mining by IG; data analysis by IG and SK; Figures by IG and SK; interpretation by ZN and TB; first draft of the manuscript by IG, SK, and ZN; all author contributed to the final form of the manuscript.

## Funding

Work supported by the UEFISCDI PN-III-P4-ID-PCCF-2016-0084 research grant. TB has been supported by the Hungarian National Bureau for Research, Innovation and Development under project Nr. K123815. The work of SK is supported by the Collegium Talentum Program of Hungary.

## Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

## Acknowledgments

We acknowledge the mayor office of the Sancraiu commune for the help and openness in providing us the Agricultural records from the archives, and the help they gave for the extraction of the anonymized taxation data. We also acknowledge senior IG for the enlightening suggestions on the data sources.

## References

1. Bettencourt LMA. The Origins of Scaling in Cities. *Science* (2013) 340:1438–41. doi:10.1126/science.1235823

2. Heinrich Mora E, Heine C, Jackson JJ, West GB, Yang VC, Kempes CP. Scaling of Urban Income Inequality in the USA. *J R Soc Interf* (2021) 18:20210223. doi:10.1098/rsif.2021.0223

3. Arcaute E, Hatna E, Ferguson P, Youn H, Johansson A, Batty M. Constructing Cities, Deconstructing Scaling Laws. *J R Soc Interf* (2015) 12:20140745. doi:10.1098/rsif.2014.0745

4. Bettencourt LMA, Lobo J, Helbing D, Kühnert C, West GB. Growth, Innovation, Scaling, and the Pace of Life in Cities. *Proc Natl Acad Sci* (2007) 104:7301–6. doi:10.1073/pnas.0610172104

5. Van Raan AFJ, Van Der Meulen G, Goedhart W. Urban Scaling of Cities in the Netherlands. *PLoS One* (2016) 11:e0146775. doi:10.1371/journal.pone.0146775

6. Behrens K, Robert-Nicoud F. Survival of the Fittest in Cities: Urbanisation and Inequality. *Econ J* (2014) 124:1371–400. doi:10.1111/ecoj.12099

7. Depersin J, Barthelemy M. From Global Scaling to the Dynamics of Individual Cities. *Proc Natl Acad Sci USA* (2018) 115:2317–22. doi:10.1073/pnas.1718690115

8. Chakraborti A, Chatterjee A, Chakrabarti B, Chakravarty SR. *Econophysics of Income and Wealth Distributions*. Cambridge: Cambridge Univ. Press (2013).

9. Yakovenko VM, Rosser JB. Colloquium: Statistical Mechanics of Money, Wealth, and Income. *Rev Mod Phys* (2009) 81:1703–25. doi:10.1103/RevModPhys.81.1703

11. Drăgulescu A, Yakovenko VM. Evidence for the Exponential Distribution of Income in the USA. *Eur Phys J B* (2001) 20:585–9. doi:10.1007/PL00011112

12. Gini C. On the Measure of Concentration with Special Reference to Income and Statistics. Colorado College Publication. *Gen Ser* (2001) 208:73–9.

14. Gere I, Kelemen S, Tóth G, Biró TS, Néda Z. Wealth Distribution in Modern Societies: Collected Data and a Master Equation Approach. *Physica A: Stat Mech its Appl* (2021) 581:126194. doi:10.1016/j.physa.2021.126194

15. Cui L, Lin C. A Simple and Efficient Kinetic Model for Wealth Distribution with Saving Propensity Effect: Based on Lattice Gas Automaton. *Physica A: Stat Mech its Appl* (2021) 561:125283. doi:10.1016/j.physa.2020.125283

16. Cardoso B-HF, Gonçalves S, Iglesias JR. Wealth Distribution Models with Regulations: Dynamics and Equilibria. *Physica A: Stat Mech its Appl* (2020) 551:124201. doi:10.1016/j.physa.2020.124201

17. Park J, Park∗ Y. Wealth Distribution for the Spin Agent Model of the Stock Market. *New Phys Sae Mulli* (2020) 70:292–8. doi:10.3938/NPSM.70.292

18. Lim G, Min S. Analysis of Solidarity Effect for Entropy, Pareto, and Gini Indices on Two-Class Society Using Kinetic Wealth Exchange Model. *Entropy* (2020) 22:386. doi:10.3390/e22040386

19. Levy M, Levy H. Investment talent and the Pareto Wealth Distribution: Theoretical and Experimental Analysis. *Rev Econ Stat* (2003) 85:709–25. doi:10.1162/003465303322369830

20. Jones CI. Pareto and Piketty: The Macroeconomics of Top Income and Wealth Inequality. *J Econ Perspect* (2015) 29:29–46. doi:10.1257/jep.29.1.29

21. Sorger G. Income and Wealth Distribution in a Simple Model of Growth. *Econ Theor* (2000) 16:23–42. doi:10.1007/s001990050325

22. Clementi F, Gallegati M, Kaniadakis G. A Generalized Statistical Model for the Size Distribution of Wealth. *J Stat Mech* (2012) 2012:P12006. doi:10.1088/1742-5468/2012/12/P12006

23. Bouchaud J-P, Mézard M. Wealth Condensation in a Simple Model of Economy. *Physica A: Stat Mech its Appl* (2000) 282:536–45. doi:10.1016/S0378-4371(00)00205-3

24. Chatterjee A, Chakrabarti BK, Stinchcombe RB. Master Equation for a Kinetic Model of a Trading Market and its Analytic Solution. *Phys Rev E* (2005) 72:026126. doi:10.1103/PhysRevE.72.026126

25. de Oliveira PMC. Investment/taxation/redistribution Model Criticality. *Eur Phys J B* (2020) 93:1–6. doi:10.1140/epjb/e2020-10308-x

26. Néda Z, Davidova L, Újvári S, Istrate G. Gambler's Ruin Problem on Erdős-Rényi Graphs. *Physica A: Stat Mech its Appl* (2017) 468:147–57. doi:10.1016/j.physa.2016.10.056

27. Coelho R, Néda Z, Ramasco JJ, Augusta Santos M. A Family-Network Model for Wealth Distribution in Societies. *Physica A: Stat Mech its Appl* (2005) 353:515–28. doi:10.1016/j.physa.2005.01.037

28. Néda Z, Gere I, Biró TS, Tóth G, Derzsy N. Scaling in Income Inequalities and its Dynamical Origin. *Physica A: Stat Mech its Appl* (2020) 549:124491. doi:10.1016/j.physa.2020.124491

29. Cieśla M, Snarska M. A Simple Mechanism Causing Wealth Concentration. *Entropy* (2020) 22:1148. doi:10.3390/e22101148

30. Biró TS, Néda Z, Telcs A. Entropic Divergence and Entropy Related to Nonlinear Master Equations. *Entropy* (2019) 21:993. doi:10.3390/e21100993

31. Podder N, Kakwani NC. Distribution of Wealth in Australia^{∗}. *Rev Income Wealth* (1976) 22:75–92. doi:10.1111/j.1475-4991.1976.tb01143.x

32. Derzsy N, Néda Z, Santos MA. Income Distribution Patterns from a Complete Social Security Database. *Physica A: Stat Mech its Appl* (2012) 391:5611–9. doi:10.1016/j.physa.2012.06.027

33. Drăgulescu A, Yakovenko VM. Exponential and Power-Law Probability Distributions of Wealth and Income in the United Kingdom and the United States. *Physica A: Stat Mech its Appl* (2001) 299:213–21. doi:10.1016/S0378-4371(01)00298-9

34. Varga EA. *Erdély Etnikai És Felekezeti Statisztikája. IV. Fehér, Beszterce-Naszód És Kolozs Megye. Népszámlálási Adatok 1850-1992 Között*. Csíkszereda: Pro-Print (2001).

35. Biró TS, Néda Z. Unidirectional Random Growth with Resetting. *Physica A: Stat Mech its Appl* (2018) 499:335–61. doi:10.1016/j.physa.2018.02.078

36. Biró TS, Csillag L, Néda Z. Transient Dynamics in the Random Growth and Reset Model. *Entropy* (2021) 23:306. doi:10.3390/e23030306

37. Perc M. The Matthew Effect in Empirical Data. *J R Soc Interf* (2014) 11:20140378. doi:10.1098/rsif.2014.0378

38. Lorenz MO. Methods of Measuring the Concentration of Wealth. *Publications Am Stat Assoc* (1905) 9:209–19. doi:10.2307/2276207

Keywords: socio-economic inequalities, wealth distribution, master equation, gini index, lorenz curve, pareto point

Citation: Gere I, Kelemen S, Biró TS and Néda Z (2022) Wealth Distribution in Villages. Transition From Socialism to Capitalism in View of Exhaustive Wealth Data and a Master Equation Approach. *Front. Phys.* 10:827143. doi: 10.3389/fphy.2022.827143

Received: 01 December 2021; Accepted: 10 January 2022;

Published: 02 February 2022.

Edited by:

Haroldo V. Ribeiro, State University of Maringá, BrazilReviewed by:

Claudius Gros, Goethe University Frankfurt, GermanyMarcel Ausloos, Bucharest Academy of Economic Studies, Romania

Copyright © 2022 Gere, Kelemen, Biró and Néda. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zoltán Néda, zoltan.neda@ubbcluj.ro