The Three Extreme Value Distributions: An Introductory Review

Hansen, Alex

doi:10.3389/fphy.2020.604053

REVIEW article

Front. Phys., 10 December 2020

Sec. Interdisciplinary Physics

Volume 8 - 2020 | https://doi.org/10.3389/fphy.2020.604053

Alex Hansen ^*

PoreLab, Department of Physics, Norwegian University of Science and Technology, Trondheim, Norway

Abstract

The statistical distribution of the largest value drawn from a sample of a given size has only three possible shapes: it is either a Weibull, a Fréchet or a Gumbel extreme value distribution. I describe in this short review how to relate the statistical distribution followed by the numbers in the sample to the associated extreme value distribution followed by the largest value within the sample. Nothing I present here is new. However, from experience, I have found that a simple, short and compact guide on this matter written for the physics community is missing.

1 Introduction

Extreme value statistics offers a powerful tool box for the theoretical physicist. But it is the kind of tool box that is not missed before one has been introduced to it, perhaps a little like the smart phone. It concerns the statistics of extreme events and it aims to answer questions like “if the strongest signal I have observed over the last hour had the value x, what would the strongest signal be expected to be if measured over a hundred hours?” Furthermore, if I divide up this hundred-hour interval into a hundred 1-h intervals, what would be the statistical distribution of the strongest signal in each 1-h interval?

It is the latter question which is the focus of this mini-review.

There is no lack of literature on extreme value statistics, see e.g., [1–5] or simply google the term. We find it used in connection with spin glasses and disordered systems [6], in connection with noise [7], in connection with optics [8], in connection with fracture [9] or the fiber bundle model [10], in diffusion processes [11] etc. There are plenty of examples from diverse fields of physics.

So, there is no lack of material for the novice who has seen a need for this tool. The problem is that the literature is not easy to penetrate, as it is often cast in rather mathematical language. The aim of this mini-review is to present the theory behind and the main results concerning the extreme value distributions in a simple and compact way. We will present nothing new. For a longer, wider and more detailed review of extreme value statistics, see Fortin and Clusel [12] or Majumdar et al. [13].

We have a statistical distribution $p(x)$ and its associated cumulative probability

$$P(x) = \int_{-\infty}^{x} p(x')\,dx', \tag{1}$$

which is the probability to find a number smaller than or equal to x. We draw N numbers from this distribution and record the largest of the N numbers. We repeat this procedure M times and thereby obtain M largest numbers, one for each sequence. What is the distribution of these M largest numbers in the limit when $M \to \infty$, which then defines the extreme value distribution?
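The sampling procedure just described can be sketched in a few lines of Python. This is only an illustration: the helper name `block_maxima`, the uniform choice for the underlying distribution, and the values of N and M are assumptions made for the example, not taken from the text.

```python
import random

def block_maxima(draw, n, m, rng):
    """Return m numbers, each the largest of n independent draws."""
    return [max(draw(rng) for _ in range(n)) for _ in range(m)]

rng = random.Random(0)                 # fixed seed so the run is repeatable

def uniform(r):
    # Illustrative underlying distribution: uniform on [0, 1)
    return r.random()

N, M = 100, 2000                       # hypothetical sequence length and count
maxima = block_maxima(uniform, N, M, rng)

# A histogram of `maxima` approximates the distribution of the largest value.
print(len(maxima), min(maxima), max(maxima))
```

A histogram of `maxima` is the numerical counterpart of the distribution asked for above.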

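It turns out that depending on $p(x)$, the extreme value distribution will have one of three functional forms:

The Weibull cumulative probability

$$\Phi(u) = e^{-(-u)^{b}}, \tag{2}$$

where we assume $b > 0$. Note that $u \le 0$. The corresponding Weibull extreme value distribution is

$$\phi(u) = \frac{d\Phi(u)}{du} = b\,(-u)^{b-1}\,e^{-(-u)^{b}}. \tag{3}$$

The Fréchet cumulative probability

$$\Phi(u) = e^{-u^{-b}}. \tag{4}$$

Also here we assume $b > 0$. Note that $u > 0$. The Fréchet extreme value distribution is

$$\phi(u) = b\,u^{-b-1}\,e^{-u^{-b}}. \tag{5}$$

The Gumbel cumulative probability

$$\Phi(u) = e^{-e^{-u}}, \tag{6}$$

where $-\infty < u < \infty$, so that $\Phi(-\infty) = 0$ and $\Phi(\infty) = 1$. The corresponding Gumbel extreme value distribution is given by

$$\phi(u) = e^{-u - e^{-u}}. \tag{7}$$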

The questions are 1. which classes of distributions $p(x)$ lead to which of the three extreme value distributions and 2. what is the connection between $x$ and $u$ in each case? It turns out that:

distributions where $p(x) = 0$ for $x > x_0$ and $p(x) \sim (x_0 - x)^{b-1}$ as $x \to x_0^-$, see Eq. 10, lead to the Weibull extreme value distribution,
distributions where $p(x) \sim x^{-b-1}$ as $x \to \infty$, see Eq. 24, lead to the Fréchet extreme value distribution,
and distributions where $p(x)$ falls off faster than any power law as $x \to \infty$, see Eq. 53, lead to the Gumbel extreme value distribution.

Furthermore, we will find that:

for the Weibull extreme value distribution, u is given in terms of x in Eq. 13,
for the Fréchet extreme value distribution, u is given in terms of x in Eq. 27,
for the Gumbel extreme value distribution, u is given in terms of x in Eqs 51 and 43.

We summarize these results in Table 1.

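TABLE 1

Class	Underlying distribution $p(x)$	Extreme value distribution $\phi(u)$	Scaled variable $u$
Weibull	$ab\,(x_0 - x)^{b-1}$ for $x \to x_0^-$; 0 for $x > x_0$	$b\,(-u)^{b-1} e^{-(-u)^{b}}$ for $u \le 0$; 0 for $u > 0$	$u = (aN)^{1/b}(x - x_0)$
Fréchet	$ab\,x^{-b-1}$ for $x \to \infty$	$b\,u^{-b-1} e^{-u^{-b}}$ for $u > 0$; 0 for $u \le 0$	$u = x/(aN)^{1/b}$
Gumbel	$f'(x)\,e^{-f(x)}$ for $x \to \infty$, where $f''(x)/f'(x)^2 \to 0$	$e^{-u - e^{-u}}$ for $-\infty < u < \infty$	$u = f'(x_N)(x - x_N)$, where $P(x_N) = 1 - 1/N$

Summary of main results.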

The discussion that now follows is built on the following relation. We draw N numbers from the probability distribution $p(x)$: $x_1, x_2, \ldots, x_N$. The probability that all the N numbers are smaller than or equal to a value x is

$$\Phi_N(x) = P(x)^N, \tag{8}$$

where $P(x)$ is the cumulative probability 1. Our task is to figure out the limit of $\Phi_N(x)$ as $N \to \infty$, and what $u$ is as we approach this limit.

Rather than the conventional approach (see e.g., [10]) to this subject based on the Fréchet, Fisher and Tippett stability criterion [1], I will base the entire discussion on the relation

$$\lim_{N \to \infty}\left(1 + \frac{z}{N}\right)^{N} = e^{z}. \tag{9}$$

I believe this to be the simpler and more intuitive way.
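A quick numerical check can make this starting point concrete. The sketch below assumes the relation meant here is the elementary limit $(1 + z/N)^N \to e^z$. The function name `finite_n` and the test value of `z` are illustrative choices.

```python
import math

def finite_n(z, n):
    """Finite-N expression (1 + z/N)**N, which tends to exp(z) as N grows."""
    return (1.0 + z / n) ** n

z = -1.5                               # arbitrary negative argument
for n in (10, 100, 1000, 10000):
    approx = finite_n(z, n)
    # The last column is the absolute error relative to the limit exp(z).
    print(n, approx, abs(approx - math.exp(z)))
```

The error shrinks roughly as 1/N, which is why the extreme value forms are already reasonable at moderate N when the underlying distribution has a simple tail.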

2 Weibull Class

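We consider here probability distributions having the form

$$p(x) = \begin{cases} ab\,(x_0 - x)^{b-1} & \text{as } x \to x_0^-, \\ 0 & \text{for } x > x_0, \end{cases} \tag{10}$$

where b is positive. We note that $b < 1$ leads to a diverging probability density as $x \to x_0^-$. We furthermore note that $b = 1$ implies that $p(x)$ approaches a constant when $x \to x_0^-$, which for example is the case when the distribution is uniform. The corresponding cumulative probability is given by

$$P(x) = \begin{cases} 1 - a\,(x_0 - x)^{b} & \text{as } x \to x_0^-, \\ 1 & \text{for } x > x_0. \end{cases} \tag{11}$$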

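The extreme value cumulative probability for N samplings is given by

$$\Phi_N(x) = P(x)^N = \left[1 - a\,(x_0 - x)^{b}\right]^{N} \tag{12}$$

for $x \le x_0$. We introduce the variable change

$$u = (aN)^{1/b}\,(x - x_0), \tag{13}$$

where the reader should note that b is defined by the original distribution 10. Equation 12 then becomes

$$\Phi_N = \left[1 - \frac{(-u)^{b}}{N}\right]^{N}. \tag{14}$$

In the limit of $N \to \infty$, this becomes

$$\lim_{N \to \infty}\left[1 - \frac{(-u)^{b}}{N}\right]^{N} = e^{-(-u)^{b}} \tag{15}$$

for negative u. Hence, we have that

$$\Phi(u) = \begin{cases} e^{-(-u)^{b}} & \text{for } u \le 0, \\ 1 & \text{for } u > 0, \end{cases} \tag{16}$$

which is the Weibull cumulative probability, valid for all values of u even though we only know the behavior of $p(x)$ close to $x_0$. The Weibull probability density is given by

$$\phi(u) = \frac{d\Phi(u)}{du} = \begin{cases} b\,(-u)^{b-1}\,e^{-(-u)^{b}} & \text{for } u \le 0, \\ 0 & \text{for } u > 0. \end{cases} \tag{17}$$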

We note that the Weibull distribution resembles a stretched exponential. This is correct for $b < 1$. However, $b > 1$ is much more common in the wild.

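We express the Weibull cumulative probability in terms of the original variable x using Eq. 13,

$$\Phi(x) = \exp\left[-aN\,(x_0 - x)^{b}\right] \quad \text{for } x \le x_0. \tag{18}$$

Hence, in terms of the original variable x, the Weibull extreme value distribution becomes

$$\phi(x) = \frac{d\Phi(x)}{dx} = abN\,(x_0 - x)^{b-1}\exp\left[-aN\,(x_0 - x)^{b}\right] \quad \text{for } x \le x_0. \tag{19}$$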

2.1 Weibull: An Example

We now work out a concrete example. Let us assume that is given byi.e., and in Eq. 10. The cumulative probability is thenFrom Eq. 19 and we have that

We show the distribution 20 with together with the corresponding extreme value distributions for and , Eq. 19 in Figure 1A.

FIGURE 1

Using a random number generator producing IID numbers¹r uniformly distributed on the unit interval, we may stochastically generate numbers that are distributed according to the probability density given in 20. We do this by inverting the expression , where the cumulative probability is given by 21. Hence, we havewhere we have also used that r may be substituted for in 21. We generate a sequence of sequences of numbers using this algorithm, each sequence having length N. We then identify the largest value within each sequence. We chose and , in each case generating such sequences. The histograms based on the random numbers themselves, and of the extreme values for each sequence of length either 100 or 1,000 we show in Figure 1B. This figure should be compared to Figure 1A.
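The inversion procedure can be sketched as follows. The parameter values here are hypothetical and are not claimed to be those of the example in the text: the sketch assumes a cumulative probability $P(x) = 1 - a(x_0 - x)^b$ on $0 \le x \le x_0$ with $x_0 = 1$, $a = 1$, $b = 2$. It compares the fraction of sequence maxima below a test point with the Weibull form $\exp[-aN(x_0 - x)^b]$.

```python
import math
import random

X0, A, B = 1.0, 1.0, 2.0               # hypothetical parameters for illustration

def draw(rng):
    """Invert r = P(x) = 1 - A*(X0 - x)**B for x, with r uniform on [0, 1)."""
    r = rng.random()
    return X0 - ((1.0 - r) / A) ** (1.0 / B)

def weibull_cdf(x, n):
    """Weibull extreme value cumulative probability in the variable x."""
    return math.exp(-A * n * (X0 - x) ** B) if x <= X0 else 1.0

rng = random.Random(1)
N, M = 100, 5000                       # sequence length and number of sequences
maxima = [max(draw(rng) for _ in range(N)) for _ in range(M)]

x_test = 0.9                           # point at which the two are compared
empirical = sum(v <= x_test for v in maxima) / M
print(empirical, weibull_cdf(x_test, N))
```

With these choices the two numbers should agree to within sampling noise, since $(1 - 1/N)^N$ is already close to $e^{-1}$ at N = 100.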

The Weibull distribution, Eq. 17, is much used in connection with material strength [15]. This is no coincidence. Consider a chain. Each link in the chain can sustain a load up to a certain value, above which it fails. This maximum value is distributed according to some probability distribution. When the chain is loaded, it will be the link with the smallest failure threshold that will break first, causing the chain as a whole to fail. Hence, the strength distribution of an ensemble of chains is an extreme value distribution, but with respect to the smallest rather than the largest value. The link strength must be a positive number. Hence, the link strength distribution is cut off at zero or some positive value. The distribution close to this cutoff value must behave as a power law in the distance to the cutoff, e.g., due to a Taylor expansion around the cutoff. The corresponding extreme value distribution, which is the chain strength distribution, must then be a Weibull distribution.

3 Fréchet Class

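We now assume that the probability distribution behaves as

$$p(x) \to ab\,x^{-b-1} \quad \text{as } x \to \infty, \tag{24}$$

and the corresponding cumulative probability behaves as

$$P(x) \to 1 - a\,x^{-b} \quad \text{as } x \to \infty. \tag{25}$$

The extreme value cumulative probability for N samplings is given by

$$\Phi_N(x) = P(x)^N = \left[1 - a\,x^{-b}\right]^{N} \tag{26}$$

for $x \to \infty$. We introduce the variable change

$$u = \frac{x}{(aN)^{1/b}}, \tag{27}$$

where b comes from the original distribution 24. We now plug this change of variables into Eq. 26 to find

$$\Phi_N = \left[1 - \frac{u^{-b}}{N}\right]^{N}. \tag{28}$$

In the limit of $N \to \infty$, this becomes

$$\Phi(u) = e^{-u^{-b}}, \tag{29}$$

where $u$ is given by Eq. 27. We see that $\Phi(u) \to 0$ as $u \to 0^+$. Furthermore, for $u < 0$, the function is no longer real. Hence, we define $\Phi(u) = 0$ for $u \le 0$. The ensuing extreme value cumulative probability is then given by

$$\Phi(u) = \begin{cases} e^{-u^{-b}} & \text{for } u > 0, \\ 0 & \text{for } u \le 0, \end{cases} \tag{30}$$

which is the Fréchet cumulative probability. The Fréchet probability density is given by

$$\phi(u) = \frac{d\Phi(u)}{du} = \begin{cases} b\,u^{-b-1}\,e^{-u^{-b}} & \text{for } u > 0, \\ 0 & \text{for } u \le 0. \end{cases} \tag{31}$$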

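We express the Fréchet cumulative probability in terms of the original variable x using Eq. 27,

$$\Phi(x) = \exp\left[-aN\,x^{-b}\right] \quad \text{for } x > 0. \tag{32}$$

Hence, in terms of the original variable x, the Fréchet extreme value distribution becomes

$$\phi(x) = \frac{d\Phi(x)}{dx} = abN\,x^{-b-1}\exp\left[-aN\,x^{-b}\right] \quad \text{for } x > 0. \tag{33}$$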

3.1 Fréchet: An Example

We consider the distributionThe corresponding cumulative probability is given by

Using Eq. 33, we find the corresponding Fréchet extreme value distribution to bevalid for all . We show and the corresponding for and and in Figure 2A.

FIGURE 2

In order to compare with numerical results, we generate numbers distributed according to 34 by solving the equation where r is drawn from a uniform distribution on the unit interval. From Eq. 35, we get

We generate a sequence of numbers using this algorithm, grouping them together in sequences of or . We generate such sequences. The histograms based on the random numbers themselves generated with Eq. 37, and of the extreme values for each sequence of length either 100 or 1,000 we show in Figure 2B. This figure should be compared to Figure 2A.
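A comparable numerical experiment for the Fréchet class is sketched below. The underlying distribution is an assumption made for illustration and is not claimed to be the one used in the text: a pure power law with $P(x) = 1 - x^{-b}$ for $x \ge 1$, i.e., $a = 1$, with the hypothetical value $b = 1.5$.

```python
import math
import random

A, B = 1.0, 1.5                        # hypothetical tail parameters

def draw(rng):
    """Invert r = P(x) = 1 - x**(-B) for x >= 1, with r uniform on [0, 1)."""
    return (1.0 - rng.random()) ** (-1.0 / B)

def frechet_cdf(x, n):
    """Frechet extreme value cumulative probability in the variable x."""
    return math.exp(-A * n * x ** (-B)) if x > 0 else 0.0

rng = random.Random(2)
N, M = 100, 5000                       # sequence length and number of sequences
maxima = [max(draw(rng) for _ in range(N)) for _ in range(M)]

x_test = (A * N) ** (1.0 / B)          # corresponds to u = 1 in the scaled variable
empirical = sum(v <= x_test for v in maxima) / M
print(empirical, frechet_cdf(x_test, N))
```

The test point is chosen at u = 1, where the Fréchet cumulative probability equals $e^{-1}$ regardless of b.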

4 Gumbel Class

We now assume we have a probability distribution that takes the form

$$p(x) = f'(x)\,e^{-f(x)}, \tag{38}$$

where $x \ge x_0$. We have that $x_0$ is any number, positive or negative, and $f(x)$ is an increasing function of x. We will later on introduce a sufficient criterion imposed on $p(x)$ to produce the Gumbel distribution, see Eq. 53. This criterion is equivalent to $f(x)$ fulfilling

$$\lim_{x \to \infty} \frac{f''(x)}{f'(x)^{2}} = 0. \tag{39}$$

This criterion is, e.g., fulfilled by any polynomial $f(x)$.

The cumulative probability is

$$P(x) = 1 - e^{-f(x)}. \tag{40}$$

We do not care about the form of $p(x)$ or $P(x)$ for $x < x_0$.

The extreme value cumulative probability for N samplings is given byfor . We introduce the variable changewhere is given byEven though is defined by 43, we may interpret its meaning. We do so in the conclusion, see Eq. 71. From Eq. 40 we then have thatLet us now defineWe then expand around ,where . If we now setso that the first order term in the expansion becomes constant as N increases, we will have thatHence, if we have thatfor , then in this limit, we will findwhere we defineHere we have used Eqs (40) and (44).
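One way the expansion argument of this paragraph can be written out is sketched below. It assumes a cumulative probability of the form $P(x) = 1 - e^{-f(x)}$ with $f$ increasing, and a reference point $x_N$ fixed by $P(x_N) = 1 - 1/N$. The intermediate steps are an assumed reading of the argument, not a transcription of the numbered equations.

```latex
% Assumption: P(x) = 1 - e^{-f(x)}, and x_N is fixed by P(x_N) = 1 - 1/N.
\begin{align*}
  e^{-f(x_N)} &= \frac{1}{N}
    \quad\Longrightarrow\quad f(x_N) = \ln N, \\
  f(x) &= \ln N + f'(x_N)\,(x - x_N)
    + \sum_{n \ge 2} \frac{f^{(n)}(x_N)}{n!}\,(x - x_N)^n, \\
  \text{with } u &= f'(x_N)\,(x - x_N): \quad
  f(x) = \ln N + u
    + \sum_{n \ge 2} \frac{f^{(n)}(x_N)}{f'(x_N)^{n}}\,\frac{u^{n}}{n!}, \\
  \text{if } & \lim_{N \to \infty} \frac{f^{(n)}(x_N)}{f'(x_N)^{n}} = 0
    \ \text{ for all } n \ge 2: \quad
  P(x)^N = \left[1 - e^{-f(x)}\right]^{N}
    \to \left[1 - \frac{e^{-u}}{N}\right]^{N}
    \to e^{-e^{-u}} .
\end{align*}
```

The scaling $u = f'(x_N)(x - x_N)$ is what keeps the first-order term independent of N, so that only the vanishing of the higher-order ratios needs to be checked.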

4.1 Sufficient Criterion for the Gumbel Class

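If we combine Eq. 49 for $n = 2$ with Eqs 38 and 40, we find

$$\lim_{x \to \infty} \frac{\left[1 - P(x)\right]p'(x)}{p(x)^{2}} = -1, \tag{52}$$

which is equivalent to

$$\lim_{x \to \infty} \frac{d}{dx}\left[\frac{1 - P(x)}{p(x)}\right] = 0. \tag{53}$$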

Equation 53, which is equivalent to Eq. 39, is in fact a sufficient condition for 49 to hold for all . We may show this through induction. We have that

If condition 52 is fulfilled, that is when the expression above is zero in the limit for , we also have thatsince both terms on the right hand side of Eq. 54 are zero in this limit. We now assume Eq. 49 to be true for some . We then have thatagain due to both terms on the right hand side of Eq. 54 are zero in this limit. This completes the proof.

4.2 Return to the Derivation

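We now combine Eq. 42 with Eq. 41 to find

$$\Phi_N = \left[1 - \frac{e^{-u}}{N}\right]^{N}. \tag{57}$$

In the limit of $N \to \infty$, this becomes

$$\Phi(u) = e^{-e^{-u}}, \tag{58}$$

which is the Gumbel cumulative probability. Here $-\infty < u < \infty$. The Gumbel probability density is given by

$$\phi(u) = \frac{d\Phi(u)}{du} = e^{-u - e^{-u}}. \tag{59}$$

We express the Gumbel cumulative probability in terms of the original variable x using Eq. 51, $u = f'(x_N)(x - x_N)$,

$$\Phi(x) = \exp\left[-e^{-f'(x_N)(x - x_N)}\right]. \tag{60}$$

Hence, in terms of the original variable x, the Gumbel extreme value distribution becomes

$$\phi(x) = f'(x_N)\exp\left[-f'(x_N)(x - x_N) - e^{-f'(x_N)(x - x_N)}\right]. \tag{61}$$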
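A simple case where everything can be done by hand is the exponential distribution, which is not treated in the text and is chosen here only for illustration. Assuming $P(x) = 1 - e^{-x}$, i.e., $f(x) = x$, the condition $P(x_N) = 1 - 1/N$ gives $x_N = \ln N$ and $f'(x_N) = 1$, so the Gumbel form reads $\exp[-e^{-(x - \ln N)}]$. The sketch below compares this with simulated maxima.

```python
import math
import random

def draw(rng):
    """Exponential variate by inversion of P(x) = 1 - exp(-x)."""
    return -math.log(1.0 - rng.random())

def gumbel_cdf(x, n):
    """Gumbel cumulative probability with x_N = ln(n) and f'(x_N) = 1."""
    u = x - math.log(n)
    return math.exp(-math.exp(-u))

rng = random.Random(3)
N, M = 100, 5000                       # sequence length and number of sequences
maxima = [max(draw(rng) for _ in range(N)) for _ in range(M)]

x_test = math.log(N) + 1.0             # corresponds to u = 1
empirical = sum(v <= x_test for v in maxima) / M
print(empirical, gumbel_cdf(x_test, N))
```

For this distribution all higher derivatives of $f$ vanish, so the only approximation is the replacement of $(1 - e^{-u}/N)^N$ by its limit.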

4.3 An Example: The Gaussian

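Here is an example: the Gaussian. The Gaussian probability density is given by

$$p(x) = \frac{1}{\sqrt{2\pi\sigma}}\,e^{-x^{2}/(2\sigma)}, \tag{62}$$

where σ is the square of the standard deviation. The cumulative probability is

$$P(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2\sigma}}\right)\right], \tag{63}$$

where $\operatorname{erf}(z)$ is the error function. In order to verify that the Gaussian generates the Gumbel extreme distribution, we use the sufficient condition 53,

$$\lim_{x \to \infty} \frac{d}{dx}\left[\frac{1 - P(x)}{p(x)}\right] = \lim_{x \to \infty}\left[\frac{x}{\sigma}\,\frac{1 - P(x)}{p(x)} - 1\right] = 0. \tag{64}$$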

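The Gaussian cumulative probability in Eq. 63 has the asymptotic form

$$P(x) \approx 1 - \sqrt{\frac{\sigma}{2\pi}}\,\frac{e^{-x^{2}/(2\sigma)}}{x} \tag{65}$$

for large x. We determine $x_N$ by solving Eq. 43, $P(x_N) = 1 - 1/N$, using this asymptotic form. We find

$$x_N = \sqrt{\sigma\,W\!\left(\frac{N^{2}}{2\pi}\right)}, \tag{66}$$

where $W(z)$ is the Lambert W function, also known as the product logarithm, which is the solution to the equation $z = W e^{W}$. For large arguments, it approaches the natural logarithm, $W(z) \to \ln z$ as $z \to \infty$ [16]. This gives us

$$f'(x_N) = N p(x_N) = \sqrt{\frac{1}{\sigma}\,W\!\left(\frac{N^{2}}{2\pi}\right)} \tag{67}$$

when inserting the expression for $x_N$, Eq. 66, into Eq. 62. Thus we may now express the variable u in the Gumbel cumulative probability 58 in terms of the variables x, σ and N using Eq. 51,

$$u = \sqrt{\frac{1}{\sigma}\,W\!\left(\frac{N^{2}}{2\pi}\right)}\;x - W\!\left(\frac{N^{2}}{2\pi}\right). \tag{68}$$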
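The Lambert W expression can be evaluated without special libraries. The sketch below rests on two assumptions: that $x_N$ is defined by $P(x_N) = 1 - 1/N$, and that the asymptotic tail $1 - P(x) \approx \sqrt{\sigma/2\pi}\,e^{-x^2/(2\sigma)}/x$ is used, which leads to $x_N = \sqrt{\sigma W(N^2/2\pi)}$. The helper `lambert_w` is a plain Newton iteration written for this example, and σ = 1 is an arbitrary choice.

```python
import math
from statistics import NormalDist

def lambert_w(z, tol=1e-12):
    """Principal branch W(z) for z > 0, by Newton iteration on w*exp(w) = z."""
    w = math.log(1.0 + z)              # start above the root, Newton then descends
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def x_asymptotic(n, variance=1.0):
    """x_N from the asymptotic Gaussian tail, via the Lambert W function."""
    return math.sqrt(variance * lambert_w(n * n / (2.0 * math.pi)))

def x_exact(n, variance=1.0):
    """x_N from the exact Gaussian quantile P(x_N) = 1 - 1/N."""
    return NormalDist(0.0, math.sqrt(variance)).inv_cdf(1.0 - 1.0 / n)

for n in (100, 1000):
    print(n, x_asymptotic(n), x_exact(n))
```

The asymptotic value lies above the exact quantile because the asymptotic form overestimates the Gaussian tail, and the relative gap narrows only slowly with N.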

We show in Figure 3A the Gaussian and the corresponding Gumbel distributions for and and . We find that and . These are the confidence intervals for 99% and 99.9%.

FIGURE 3

We show in Figure 3B a histogram based on numbers distributed according to a Gaussian distribution using the Box-Müller algorithm [14]. These numbers were grouped together in sets of either or elements. I generated such sets. The figure displays the two extreme distributions for the two set sizes. This figure should be compared to Figure 3A. In contrast to the two other extreme value distributions, we see that there are visible discrepancies between the calculated Gumbel distributions in Figure 3A and the extreme value histograms in Figure 3B. We see furthermore that the histogram for is closer to the calculated Gumbel distribution than the histogram for . This is due to the very slow convergence induced by the Lambert W functions. Slow convergence is typical for the Gumbel extreme value distributions. This slow convergence has been analyzed and recently and through clever use of scaling methods remedied [17].

5 Concluding Remarks

We summarize the main results presented in this mini-review in Table 1.

We have only discussed the distributions associated with the largest values of x, except for the remark on the smallest value in connection with the Weibull extreme value distribution, Section 2. The distribution of the smallest value is, however, easy to work out: just transform $x \to -x$. Otherwise, the story presented here is rather complete.

There is one remark that needs to be made, though. In the derivation of the Gumbel extreme value distribution, Section 4, we defined a variable $x_N$ in Eq. 43. The variable $x_N$ may in fact be calculated for any cumulative probability $P(x)$, and it has an interpretation that makes it very useful.

The probability density for the largest among N numbers drawn using the probability distribution $p(x)$ is given by

$$\frac{d}{dx}P(x)^{N} = N P(x)^{N-1} p(x). \tag{69}$$

We calculate the average of the cumulative probability for the extreme value based on N samples,

$$\langle P \rangle = \int_{-\infty}^{\infty} P(x)\,N P(x)^{N-1} p(x)\,dx = \frac{N}{N+1}. \tag{70}$$

For large N, we may write this as

$$\langle P \rangle \approx 1 - \frac{1}{N} = P(x_N), \tag{71}$$

using here Eq. 43. Hence, we may interpret $x_N$ as the x value corresponding to the average confidence interval of the largest observed value in sequences of N numbers. It is essentially the typical size of the extreme value for a sample of size N.
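This average is easy to check numerically. The sketch assumes uniform numbers on the unit interval, for which $P(x) = x$, so the average of $P$ at the largest value is just the mean of the sequence maxima. It should come out close to $N/(N+1)$, and hence close to $1 - 1/N$ for large N. The values of N and M are arbitrary.

```python
import random

rng = random.Random(4)
N, M = 100, 5000                       # sequence length and number of sequences

# For the uniform distribution P(x) = x, so P(largest value) is the maximum itself.
maxima = [max(rng.random() for _ in range(N)) for _ in range(M)]
average_p = sum(maxima) / M

print(average_p, N / (N + 1), 1.0 - 1.0 / N)
```

Because $P(x)$ maps any continuous distribution onto the uniform one, the same average is obtained for every underlying distribution, which is why the interpretation of $x_N$ is general.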

Funding

This work was partly supported by the Research Council of Norway through its Centers of Excellence funding scheme, project number 262644.

Statements

Author contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Acknowledgments

I thank Eivind Bering, Astrid de Wijn, H. George, E. Hentschel, Srutarshi Pradhan, and Itamar Procaccia for numerous interesting discussions on this topic.

Conflict of interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Footnotes

1.^IID variables. Independent and identically distributed random variables, a terminology used in some communities.

REFERENCES

1. Gumbel EJ. Statistics of extremes. New York: Columbia University Press (1958).
2. David HA. Order statistics. 2nd ed. New York: Wiley (1981).
3. Galambos J. The asymptotic theory of extreme order statistics. Malabar, FL: Krieger (1987).
4. Embrechts P, Klüppelberg C, Mikosh T. Modeling extreme events for insurance and finance. Berlin: Springer (1997).
5. Coles S. An introduction to statistical modeling of extreme events. Berlin: Springer (2001).
6. Bouchaud J-P, Mézard M. Universality classes for extreme-value statistics. J Phys Math Gen (1997) 30:7997. doi:10.1088/0305-4470/30/23/004
7. Antal T, Droz M, Györgyi G, Rácz Z. 1/f noise and extreme value statistics. Phys Rev Lett (2001) 87:240601. doi:10.1103/physrevlett.87.240601
8. Randoux S, Suret P. Experimental evidence of extreme value statistics in Raman fiber lasers. Opt Lett (2012) 37:500. doi:10.1364/OL.37.000500
9. Taloni A, Vodret M, Costantini G, Zapperi S. Size effects on the fracture of microscale and nanoscale materials. Nat Rev Mater (2018) 3:211–24. doi:10.1038/s41578-018-0029-4
10. Hansen A, Hemmer PC, Pradhan S. The fiber bundle model. Berlin: Wiley VCH (2015).
11. Pal A, Eliazar I, Reuveni S. First passage under restart with branching. Phys Rev Lett (2019) 122:020602. doi:10.1103/PhysRevLett.122.020602
12. Fortin J-Y, Clusel M. Applications of extreme value statistics in physics. J Phys Math Theor (2015) 48:183001. doi:10.1088/1751-8113/48/18/183001
13. Majumdar SN, Pal A, Schehr G. Extreme value statistics of correlated random variables: a pedagogical review. Phys Rep (2020) 840:1. doi:10.1016/j.physrep.2019.10.005
14. Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical recipes. 3rd ed. Cambridge: Cambridge University Press (2007).
15. Rinne H. The Weibull distribution. Boca Raton: CRC Press (2008).
16. Corless RM, Gonnet GH, Hare DEG, Jeffrey DJ, Knuth DE. On the Lambert W function. Adv Comput Math (1996) 5:329–59. doi:10.1007/BF02124750
17. Zarfaty L, Barkai E, Kessler DA. Accurately approximating extreme value statistics (2020). arXiv:2006.13677.

Summary

Keywords

extreme value statistics, statistical analysis, Weibull analysis, Gumbel distribution, Frechet distribution, Weibull distribution

Citation

Hansen A (2020) The Three Extreme Value Distributions: An Introductory Review. Front. Phys. 8:604053. doi: 10.3389/fphy.2020.604053

Received

08 September 2020

Accepted

22 October 2020

Published

10 December 2020

Volume

8 - 2020

Edited by

Matjaž Perc, University of Maribor, Slovenia

Reviewed by

Arnab Pal, Tel Aviv University, Israel

Haroldo V. Ribeiro, State University of Maringá, Brazil

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Alex Hansen, Alex.Hansen@ntnu.no

This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
