Abstract
The statistical distribution of the largest value drawn from a sample of a given size has only three possible shapes: it is either a Weibull, a Fréchet or a Gumbel extreme value distribution. I describe in this short review how to relate the statistical distribution followed by the numbers in the sample to the associated extreme value distribution followed by the largest value within the sample. Nothing I present here is new. However, from experience, I have found that a simple, short and compact guide on this matter written for the physics community is missing.
1 Introduction
Extreme value statistics offers a powerful tool box for the theoretical physicist. But it is the kind of tool box that is not missed before one has been introduced to it, perhaps a little like the smart phone. It concerns the statistics of extreme events and it aims to answer questions like "if the strongest signal I have observed over the last hour had the value x, what would the strongest signal be expected to be if measured over a hundred hours?" Furthermore, if I divide up this hundred-hour interval into a hundred 1-h intervals, what would be the statistical distribution of the strongest signal in each 1-h interval?
It is the latter question which is the focus of this mini-review.
There is no lack of literature on extreme value statistics, see e.g., [1–5] or simply google the term. We find it used in connection with spin glasses and disordered systems [6], in connection with noise [7], in connection with optics [8], in connection with fracture [9] or the fiber bundle model [10], in diffusion processes [11] etc. There are plenty of examples from diverse fields of physics.
So, there is no lack of material for the novice who has seen a need for this tool. The problem is that the literature is not easy to penetrate, as it is often cast in a rather mathematical language. The aim of this mini-review is to present the theory behind and the main results concerning the extreme value distributions in a simple and compact way. We will present nothing new. For a longer, wider and more detailed review of extreme value statistics, see Fortin and Clusel [12] or Majumdar et al. [13].
We have a statistical distribution $p(x)$ and its associated cumulative probability
$$P(x) = \int_{-\infty}^{x} p(x')\,dx', \tag{1}$$
which is the probability to find a number smaller than or equal to x. We draw N numbers from this distribution and record the largest of the N numbers. We repeat this procedure M times and thereby obtain M largest numbers, one for each sequence. What is the distribution of these M largest numbers in the limit when $M \to \infty$, which then defines the extreme value distribution?
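It turns out that depending on $p(x)$, the extreme value distribution will have one of three functional forms:
The Weibull cumulative probability
$$\Phi(u) = e^{-(-u)^{\alpha}}, \tag{2}$$
where we assume $\alpha > 0$. Note that $-\infty < u \le 0$. The corresponding Weibull extreme value distribution is
$$\phi(u) = \frac{d\Phi(u)}{du} = \alpha(-u)^{\alpha-1}e^{-(-u)^{\alpha}}. \tag{3}$$
The Fréchet cumulative probability
$$\Phi(u) = e^{-u^{-\alpha}}. \tag{4}$$
Also here we assume $\alpha > 0$. Note that $0 \le u < \infty$. The Fréchet extreme value distribution is
$$\phi(u) = \alpha u^{-\alpha-1}e^{-u^{-\alpha}}. \tag{5}$$
The Gumbel cumulative probability
$$\Phi(u) = e^{-e^{-u}}, \tag{6}$$
where $-\infty < u < \infty$, so that $\Phi(-\infty) = 0$ and $\Phi(\infty) = 1$. The corresponding Gumbel extreme value distribution is given by
$$\phi(u) = e^{-u}e^{-e^{-u}}. \tag{7}$$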
The questions are (1) which classes of distributions $p(x)$ lead to which of the three extreme value distributions, and (2) what is the connection between $x$ and $u$ in each case? It turns out that
distributions where $p(x) = 0$ for $x > x_0$ and $p(x) \sim (x_0 - x)^{\alpha-1}$ as $x \to x_0^-$, see Eq. 10, lead to the Weibull extreme value distribution,
distributions where $p(x) \sim x^{-\alpha-1}$ as $x \to \infty$, see Eq. 24, lead to the Fréchet extreme value distribution,
and distributions where $p(x)$ falls off faster than any power law as $x \to \infty$, see Eq. 53, lead to the Gumbel extreme value distribution.
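Furthermore, we will find how $u$ depends on $x$ and $N$ in each case. We summarize these results in Table 1.
TABLE 1
| Class | $p(x)$ | $\phi(u)$ | $u$ |
| --- | --- | --- | --- |
| Weibull | $\alpha b (x_0 - x)^{\alpha-1}$ for $x \to x_0^-$; 0 for $x > x_0$ | $\alpha(-u)^{\alpha-1}e^{-(-u)^{\alpha}}$ for $u \le 0$; 0 for $u > 0$ | $(bN)^{1/\alpha}(x - x_0)$ |
| Fréchet | $\alpha b\, x^{-\alpha-1}$ for $x \to \infty$ | $\alpha u^{-\alpha-1}e^{-u^{-\alpha}}$ for $u > 0$; 0 for $u \le 0$ | $x/(bN)^{1/\alpha}$ |
| Gumbel | $f'(x)e^{-f(x)}$ for $x \to \infty$, where $\lim_{x\to\infty} f''(x)/f'(x)^2 = 0$ | $e^{-u}e^{-e^{-u}}$ for $-\infty < u < \infty$ | $N p(x_N)(x - x_N)$, where $P(x_N) = 1 - 1/N$ |
Summary of main results.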
The discussion that follows is built on the following relation. We draw N numbers from the probability distribution $p(x)$: $x_1, x_2, \ldots, x_N$. The probability that all the N numbers are smaller than or equal to a value x is
$$\Phi_N(x) = P(x)^N, \tag{8}$$
where $P(x)$ is the cumulative probability, Eq. 1. Our task is to figure out the limit of $\Phi_N(x)$ as $N \to \infty$, and what $u$ is as we approach this limit.
Rather than the conventional approach (see e.g., [10]) to this subject based on the Fréchet, Fisher and Tippett stability criterion [1], I will base the entire discussion on the relation
$$\lim_{N\to\infty}\left(1 + \frac{z}{N}\right)^{N} = e^{z}. \tag{9}$$
I believe this to be the simpler and more intuitive way.
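The starting relation, that the largest of N independent numbers is at most x with probability $P(x)^N$, is easy to check numerically. The following is a minimal sketch rather than anything taken from the original analysis: the unit-rate exponential distribution, the function names and the parameter values are all illustrative assumptions.

```python
import math
import random


def fraction_max_below(x, n, m, seed=1):
    """Fraction of m samples of size n whose largest value is <= x.

    The numbers are drawn from a unit-rate exponential distribution
    (an illustrative choice), for which P(x) = 1 - exp(-x).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(m):
        largest = max(rng.expovariate(1.0) for _ in range(n))
        if largest <= x:
            hits += 1
    return hits / m


def predicted(x, n):
    """P(x)**N for the unit-rate exponential distribution."""
    return (1.0 - math.exp(-x)) ** n


if __name__ == "__main__":
    x, n, m = 2.5, 10, 20000
    print("empirical:", fraction_max_below(x, n, m))
    print("P(x)^N   :", predicted(x, n))
```

The two printed numbers should agree to within the sampling noise, which shrinks as the number of samples m grows.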
2 Weibull Class
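We consider here probability distributions having the form
$$p(x) = \begin{cases} \alpha b\,(x_0 - x)^{\alpha-1} & \text{as } x \to x_0^-, \\ 0 & \text{for } x > x_0, \end{cases} \tag{10}$$
where b is positive. We note that $\alpha < 1$ leads to a diverging probability density as $x \to x_0^-$. We furthermore note that $\alpha = 1$ implies that $p(x)$ approaches a constant when $x \to x_0^-$, which for example is the case when the distribution is uniform. The corresponding cumulative probability is given by
$$P(x) = 1 - b\,(x_0 - x)^{\alpha} \quad \text{as } x \to x_0^-. \tag{11}$$
The extreme value cumulative probability for N samplings is given by
$$\Phi_N(x) = P(x)^N = \left[1 - b\,(x_0 - x)^{\alpha}\right]^{N} \tag{12}$$
for $x \to x_0^-$. We introduce the variable change
$$u = (bN)^{1/\alpha}(x - x_0), \tag{13}$$
where the reader should note that b is defined by the original distribution, Eq. 10. Equation 12 then becomes
$$\Phi_N = \left[1 - \frac{(-u)^{\alpha}}{N}\right]^{N}. \tag{14}$$
In the limit $N \to \infty$, this becomes
$$\Phi(u) = e^{-(-u)^{\alpha}} \tag{15}$$
for negative u. Hence, we have that
$$\Phi(u) = \begin{cases} e^{-(-u)^{\alpha}} & \text{for } u \le 0, \\ 1 & \text{for } u > 0, \end{cases} \tag{16}$$
which is the Weibull cumulative probability, valid for all values of u even though we only know the behavior of $p(x)$ close to $x_0$; any fixed u corresponds to $x \to x_0^-$ when $N \to \infty$. The Weibull probability density is given by
$$\phi(u) = \frac{d\Phi(u)}{du} = \begin{cases} \alpha(-u)^{\alpha-1}e^{-(-u)^{\alpha}} & \text{for } u \le 0, \\ 0 & \text{for } u > 0. \end{cases} \tag{17}$$
We note that the Weibull distribution resembles a stretched exponential. This is correct for $\alpha < 1$. However, $\alpha > 1$ is much more common in the wild.
We express the Weibull cumulative probability in terms of the original variable x using Eq. 13,
$$\Phi_N(x) = \begin{cases} \exp\left[-bN(x_0 - x)^{\alpha}\right] & \text{for } x \le x_0, \\ 1 & \text{for } x > x_0. \end{cases} \tag{18}$$
Hence, in terms of the original variable x, the Weibull extreme value distribution becomes
$$\phi_N(x) = \frac{d\Phi_N(x)}{dx} = \begin{cases} \alpha b N (x_0 - x)^{\alpha-1}\exp\left[-bN(x_0 - x)^{\alpha}\right] & \text{for } x \le x_0, \\ 0 & \text{for } x > x_0. \end{cases} \tag{19}$$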
2.1 Weibull: An Example
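We now work out a concrete example. Let us assume that $p(x)$ is given by
$$p(x) = \begin{cases} \alpha(1 - x)^{\alpha-1} & \text{for } 0 \le x \le 1, \\ 0 & \text{otherwise}, \end{cases} \tag{20}$$
i.e., $x_0 = 1$ and $b = 1$ in Eq. 10. The cumulative probability is then
$$P(x) = 1 - (1 - x)^{\alpha} \quad \text{for } 0 \le x \le 1. \tag{21}$$
From Eq. 19 and $x_0 = b = 1$ we have that
$$\phi_N(x) = \alpha N (1 - x)^{\alpha-1}\exp\left[-N(1 - x)^{\alpha}\right]. \tag{22}$$
We show the distribution, Eq. 20, for a fixed value of $\alpha$ together with the corresponding extreme value distributions for $N = 100$ and $N = 1000$, Eq. 19, in Figure 1A.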
FIGURE 1

(A) The curve that has its maximum at $x = 0$ is the probability distribution, Eq. 20, for a fixed value of $\alpha$. The curve that has its maximum in the middle is $\phi_N(x)$, Eq. 22, with $N = 100$ and the curve that has its maximum to the right is $\phi_N(x)$ with $N = 1000$. (B) The histograms shown here are based on data generated according to the probability distribution, Eq. 20, with the same $\alpha$. The histogram having its maximum to the left shows all the generated data. The histogram having its maximum in the middle shows the largest number among each sequence of numbers of length 100, and the histogram having its maximum to the right shows the largest number among each sequence of numbers of length 1,000. We generated $M$ sequences for both cases.
Using a random number generator producing IID numbers (see footnote 1) $r$ uniformly distributed on the unit interval, we may stochastically generate numbers that are distributed according to the probability density given in Eq. 20. We do this by inverting the expression $r = P(x)$, where the cumulative probability is given by Eq. 21. Hence, we have
$$x = 1 - r^{1/\alpha}, \tag{23}$$
where we have also used that $r$ may be substituted for $1 - r$ in Eq. 21. We generate $M$ sequences of numbers using this algorithm, each sequence having length N. We then identify the largest value within each sequence. We chose $N = 100$ and $N = 1000$, in each case generating $M$ such sequences. The histograms based on the random numbers themselves, and on the extreme values for each sequence of length either 100 or 1,000, are shown in Figure 1B. This figure should be compared to Figure 1A.
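The numerical procedure described above can be sketched in a few lines. This is a hedged illustration, not the code behind Figure 1: it assumes the example density $\alpha(1-x)^{\alpha-1}$ on the unit interval, so that inverting the cumulative probability gives $x = 1 - r^{1/\alpha}$, and it compares the largest values with the limiting cumulative probability $\exp[-N(1-x)^{\alpha}]$. The value $\alpha = 2$, the number of sequences and all function names are assumptions made for the sketch.

```python
import math
import random


def draw(alpha, rng):
    """Invert r = 1 - (1 - x)**alpha, with r standing in for 1 - r."""
    return 1.0 - rng.random() ** (1.0 / alpha)


def largest_values(alpha, n, m, seed=2):
    """Largest value in each of m sequences of length n."""
    rng = random.Random(seed)
    return [max(draw(alpha, rng) for _ in range(n)) for _ in range(m)]


def weibull_cdf(x, alpha, n):
    """Limiting cumulative probability exp(-N (1 - x)^alpha) for x <= 1."""
    return math.exp(-n * (1.0 - x) ** alpha) if x < 1.0 else 1.0


def cdf_gap(values, cdf):
    """Largest distance between the empirical and the model cumulative probability."""
    xs = sorted(values)
    m = len(xs)
    return max(
        max(abs((i + 1) / m - cdf(x)), abs(i / m - cdf(x)))
        for i, x in enumerate(xs)
    )


if __name__ == "__main__":
    alpha, n, m = 2.0, 100, 2000
    values = largest_values(alpha, n, m)
    print("gap:", cdf_gap(values, lambda x: weibull_cdf(x, alpha, n)))
```

A small gap indicates that the histogram of largest values follows the Weibull form already at N = 100; the remaining difference is mostly sampling noise from the finite number of sequences.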
The Weibull distribution, Eq. 17, is much used in connection with material strength [15]. This is no coincidence. Consider a chain. Each link in the chain can sustain a load up to a certain value, above which it fails. This maximum value is distributed according to some probability distribution. When the chain is loaded, it will be the link with the smallest failure threshold that will break first, causing the chain as a whole to fail. Hence, the strength distribution of an ensemble of chains is an extreme value distribution, but with respect to the smallest rather than the largest value. The link strength must be a positive number. Hence, the link strength distribution is cut off at zero or some positive value. The distribution close to this cutoff value must behave as a power law in the distance to the cutoff, e.g., due to a Taylor expansion around the cutoff. The corresponding extreme value distribution, which is the chain strength distribution, must then be a Weibull distribution.
3 Fréchet Class
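We now assume that the probability distribution behaves as
$$p(x) \to \alpha b\, x^{-\alpha-1} \quad \text{as } x \to \infty, \tag{24}$$
where $\alpha$ and b are positive, and the corresponding cumulative probability behaves as
$$P(x) \to 1 - b\, x^{-\alpha} \quad \text{as } x \to \infty. \tag{25}$$
The extreme value cumulative probability for N samplings is given by
$$\Phi_N(x) = P(x)^N = \left[1 - b\, x^{-\alpha}\right]^{N} \tag{26}$$
for $x \to \infty$. We introduce the variable change
$$u = \frac{x}{(bN)^{1/\alpha}}, \tag{27}$$
where b comes from the original distribution, Eq. 24. We now plug this change of variables into Eq. 26 to find
$$\Phi_N = \left[1 - \frac{u^{-\alpha}}{N}\right]^{N}. \tag{28}$$
In the limit $N \to \infty$, this becomes
$$\Phi(u) = e^{-u^{-\alpha}}, \tag{29}$$
where $u$ is given by Eq. 27. We see that $\Phi(u) \to 0$ as $u \to 0^+$. Furthermore, for $u < 0$, the function is no longer real. Hence, we define $\Phi(u) = 0$ for $u \le 0$. The ensuing extreme value cumulative probability is then given by
$$\Phi(u) = \begin{cases} e^{-u^{-\alpha}} & \text{for } u > 0, \\ 0 & \text{for } u \le 0, \end{cases} \tag{30}$$
which is the Fréchet cumulative probability. The Fréchet probability density is given by
$$\phi(u) = \frac{d\Phi(u)}{du} = \begin{cases} \alpha u^{-\alpha-1}e^{-u^{-\alpha}} & \text{for } u > 0, \\ 0 & \text{for } u \le 0. \end{cases} \tag{31}$$
We express the Fréchet cumulative probability in terms of the original variable x using Eq. 27,
$$\Phi_N(x) = \begin{cases} \exp\left[-bN x^{-\alpha}\right] & \text{for } x > 0, \\ 0 & \text{for } x \le 0. \end{cases} \tag{32}$$
Hence, in terms of the original variable x, the Fréchet extreme value distribution becomes
$$\phi_N(x) = \frac{d\Phi_N(x)}{dx} = \begin{cases} \alpha b N x^{-\alpha-1}\exp\left[-bN x^{-\alpha}\right] & \text{for } x > 0, \\ 0 & \text{for } x \le 0. \end{cases} \tag{33}$$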
3.1 Fréchet: An Example
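We consider the distribution
$$p(x) = \begin{cases} \alpha x^{-\alpha-1} & \text{for } x \ge 1, \\ 0 & \text{for } x < 1. \end{cases} \tag{34}$$
The corresponding cumulative probability is given by
$$P(x) = 1 - x^{-\alpha} \quad \text{for } x \ge 1. \tag{35}$$
Using Eq. 33, we find the corresponding Fréchet extreme value distribution to be
$$\phi_N(x) = \alpha N x^{-\alpha-1}\exp\left[-N x^{-\alpha}\right], \tag{36}$$
valid for all $x > 0$. We show $p(x)$ and the corresponding $\phi_N(x)$ for a fixed value of $\alpha$ and $N = 100$ and $N = 1000$ in Figure 2A.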
FIGURE 2

(A) The curve that has its maximum at $x = 1$ is the probability distribution, Eq. 34, for a fixed value of $\alpha$. The curve that has its maximum in the middle is $\phi_N(x)$, Eq. 36, with $N = 100$ and the curve that has its maximum to the right is $\phi_N(x)$ with $N = 1000$. (B) The histograms shown here are based on data generated according to the probability distribution, Eq. 34, with the same $\alpha$. The histogram having its maximum to the left shows all the generated data. The histogram having its maximum in the middle shows the largest number among each sequence of numbers of length 100, and the histogram having its maximum to the right shows the largest number among each sequence of numbers of length 1,000. For each sequence length, $M$ such sequences were generated.
In order to compare with numerical results, we generate numbers distributed according to Eq. 34 by solving the equation $r = P(x)$, where $r$ is drawn from a uniform distribution on the unit interval. From Eq. 35, we get
$$x = (1 - r)^{-1/\alpha}. \tag{37}$$
We generate a sequence of numbers using this algorithm, grouping them together in sequences of $N = 100$ or $N = 1000$. We generate $M$ such sequences. The histograms based on the random numbers themselves, generated with Eq. 37, and on the extreme values for each sequence of length either 100 or 1,000, are shown in Figure 2B. This figure should be compared to Figure 2A.
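As a quick check of this procedure, one can compare the median of the simulated largest values with the median implied by the Fréchet form. The sketch below is an illustration under stated assumptions: it takes the example density to be $\alpha x^{-\alpha-1}$ for $x \ge 1$, so that inverting $r = 1 - x^{-\alpha}$ gives $x = (1-r)^{-1/\alpha}$ and the limiting cumulative probability is $\exp(-N x^{-\alpha})$. The value $\alpha = 2$, the number of sequences and the function names are hypothetical choices.

```python
import math
import random
import statistics


def draw_power_law(alpha, rng):
    """Invert r = 1 - x**(-alpha) for x >= 1; 1 - r lies in (0, 1]."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)


def largest_values(alpha, n, m, seed=3):
    """Largest value in each of m sequences of length n."""
    rng = random.Random(seed)
    return [max(draw_power_law(alpha, rng) for _ in range(n)) for _ in range(m)]


def frechet_median(alpha, n):
    """Solve exp(-N x^(-alpha)) = 1/2 for x."""
    return (n / math.log(2.0)) ** (1.0 / alpha)


if __name__ == "__main__":
    alpha, n, m = 2.0, 100, 2000
    values = largest_values(alpha, n, m)
    print("simulated median:", statistics.median(values))
    print("Fréchet median  :", frechet_median(alpha, n))
```

The median is used here instead of the mean because the largest values are heavy-tailed, which makes sample means noisy for small $\alpha$.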
4 Gumbel Class
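We now assume we have a probability distribution that takes the form
$$p(x) = f'(x)\,e^{-f(x)} \quad \text{as } x \to \infty, \tag{38}$$
where $f'(x) = df(x)/dx$. We have that $x$ is any number, positive or negative, and $f(x)$ is an increasing function of x. We will later on introduce a sufficient criterion imposed on $p(x)$ to produce the Gumbel distribution, see Eq. 53. This criterion is equivalent to $f(x)$ fulfilling
$$\lim_{x\to\infty}\frac{f''(x)}{f'(x)^2} = 0. \tag{39}$$
This criterion is, e.g., fulfilled by any polynomial $f(x)$.
The cumulative probability is
$$P(x) = 1 - e^{-f(x)} \quad \text{as } x \to \infty. \tag{40}$$
We do not care about the form of $p(x)$ or $P(x)$ for smaller $x$.
The extreme value cumulative probability for N samplings is given by
$$\Phi_N(x) = P(x)^N = \left[1 - e^{-f(x)}\right]^{N} \tag{41}$$
for $x \to \infty$. We introduce the variable change
$$u = f(x) - f(x_N), \tag{42}$$
where $x_N$ is given by
$$P(x_N) = 1 - \frac{1}{N}. \tag{43}$$
Even though $x_N$ is defined by Eq. 43, we may interpret its meaning. We do so in the conclusion, see Eq. 71. From Eq. 40 we then have that
$$f(x_N) = \ln N. \tag{44}$$
Let us now define
$$\delta = x - x_N. \tag{45}$$
We then expand $f(x)$ around $x_N$,
$$f(x) = f(x_N) + f'(x_N)\,\delta + \sum_{n=2}^{\infty}\frac{f^{(n)}(x_N)}{n!}\,\delta^{n}, \tag{46}$$
where $f^{(n)}(x) = d^n f(x)/dx^n$. If we now set
$$\delta = \frac{v}{f'(x_N)}, \tag{47}$$
so that the first order term in the expansion becomes constant as N increases, we will have that
$$f(x) = \ln N + v + \sum_{n=2}^{\infty}\frac{1}{n!}\frac{f^{(n)}(x_N)}{f'(x_N)^{n}}\,v^{n}. \tag{48}$$
Hence, if we have that
$$\lim_{N\to\infty}\frac{f^{(n)}(x_N)}{f'(x_N)^{n}} = 0 \tag{49}$$
for $n > 1$, then in this limit, we will find
$$f(x) = \ln N + u, \tag{50}$$
where we define
$$u = f'(x_N)(x - x_N) = N p(x_N)(x - x_N). \tag{51}$$
Here we have used Eqs (40) and (44).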
4.1 Sufficient Criterion for the Gumbel Class
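If we combine Eq. 49 for $n = 2$ with Eqs 38 and 40, we find
$$\lim_{x\to\infty}\left[1 + \frac{\left(1 - P(x)\right)p'(x)}{p(x)^2}\right] = 0, \tag{52}$$
which is equivalent to
$$\lim_{x\to\infty}\frac{d}{dx}\left[\frac{1 - P(x)}{p(x)}\right] = 0. \tag{53}$$
Equation 53, which is equivalent to Eq. 39, is in fact a sufficient condition for Eq. 49 to hold for all $n > 1$. We may show this through induction. We have that
$$\frac{f^{(n+1)}}{(f')^{n+1}} = \frac{1}{f'}\frac{d}{dx}\left[\frac{f^{(n)}}{(f')^{n}}\right] + n\,\frac{f^{(n)}}{(f')^{n}}\,\frac{f''}{(f')^{2}}. \tag{54}$$
If condition 52 is fulfilled, that is, when $f''/(f')^2$ is zero in the limit $x \to \infty$, we also have that
$$\lim_{x\to\infty}\frac{f'''(x)}{f'(x)^{3}} = 0, \tag{55}$$
since both terms on the right hand side of Eq. 54 with $n = 2$ are zero in this limit. We now assume Eq. 49 to be true for some $n > 2$. We then have that
$$\lim_{x\to\infty}\frac{f^{(n+1)}(x)}{f'(x)^{n+1}} = 0, \tag{56}$$
again because both terms on the right hand side of Eq. 54 are zero in this limit. This completes the proof.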
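Two short worked cases may help to see what the criterion selects. They are illustrative checks added here, assuming the criterion takes the form $f''/(f')^2 \to 0$ with $p(x) = f'(x)e^{-f(x)}$; the prefactor $b$ and the exponents $k$ and $\alpha$ are generic symbols chosen for the illustration.

```latex
% Case 1: a power f(x) = x^k with k > 0 (stretched or compressed exponential tail)
f'(x) = k x^{k-1}, \qquad f''(x) = k(k-1)x^{k-2}, \qquad
\frac{f''(x)}{f'(x)^2} = \frac{k-1}{k}\,x^{-k} \;\longrightarrow\; 0
\quad (x \to \infty).

% Case 2: a power-law tail, 1 - P(x) = b x^{-\alpha}, i.e. f(x) = \alpha \ln x - \ln b
f'(x) = \frac{\alpha}{x}, \qquad f''(x) = -\frac{\alpha}{x^2}, \qquad
\frac{f''(x)}{f'(x)^2} = -\frac{1}{\alpha} \neq 0 .
```

The first case satisfies the criterion for every positive $k$, so such tails belong to the Gumbel class. The second case fails it for every finite $\alpha$, in line with power-law tails leading to the Fréchet rather than the Gumbel distribution.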
4.2 Return to the Derivation
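We now combine Eq. 42 with Eqs 41 and 44 to find
$$\Phi_N = \left[1 - \frac{e^{-u}}{N}\right]^{N}. \tag{57}$$
In the limit $N \to \infty$, this becomes
$$\Phi(u) = e^{-e^{-u}}, \tag{58}$$
which is the Gumbel cumulative probability. Here $-\infty < u < \infty$. The Gumbel probability density is given by
$$\phi(u) = \frac{d\Phi(u)}{du} = e^{-u}e^{-e^{-u}}. \tag{59}$$
We express the Gumbel cumulative probability in terms of the original variable x using Eq. 51,
$$\Phi_N(x) = \exp\left[-e^{-N p(x_N)(x - x_N)}\right]. \tag{60}$$
Hence, in terms of the original variable x, the Gumbel extreme value distribution becomes
$$\phi_N(x) = N p(x_N)\,\exp\left[-N p(x_N)(x - x_N) - e^{-N p(x_N)(x - x_N)}\right]. \tag{61}$$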
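The simplest member of the Gumbel class is the unit-rate exponential distribution, which is not treated in the text but makes a convenient sanity check. Assuming $P(x) = 1 - e^{-x}$, the condition $P(x_N) = 1 - 1/N$ gives $x_N = \ln N$ and $N p(x_N) = 1$, so that $u = x - \ln N$. The sketch below, with hypothetical function names and grid parameters, compares the exact $P(x)^N$ with the Gumbel cumulative probability on a grid of $u$ values.

```python
import math


def exact_cdf_of_max(x, n):
    """P(x)**N for the unit-rate exponential distribution."""
    return (1.0 - math.exp(-x)) ** n if x > 0.0 else 0.0


def gumbel_cdf(u):
    """Gumbel cumulative probability exp(-exp(-u))."""
    return math.exp(-math.exp(-u))


def largest_gap(n, u_min=-3.0, u_max=8.0, steps=1100):
    """Largest difference between the exact and the Gumbel form on a u grid."""
    x_n = math.log(n)  # solves 1 - exp(-x_n) = 1 - 1/n
    gap = 0.0
    for i in range(steps + 1):
        u = u_min + (u_max - u_min) * i / steps
        x = x_n + u  # u = N p(x_N) (x - x_N) with N p(x_N) = 1
        gap = max(gap, abs(exact_cdf_of_max(x, n) - gumbel_cdf(u)))
    return gap


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, largest_gap(n))
```

For this distribution the gap shrinks roughly in proportion to 1/N, which is much faster than what is found for the Gaussian.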
4.3 An Example: The Gaussian
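Here is an example: the Gaussian. The Gaussian probability density is given by
$$p(x) = \frac{1}{\sqrt{2\pi\sigma}}\,e^{-x^2/2\sigma}, \tag{62}$$
where σ is the square of the standard deviation, i.e., the variance. The cumulative probability is
$$P(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2\sigma}}\right)\right], \tag{63}$$
where $\operatorname{erf}$ is the error function. In order to verify that the Gaussian generates the Gumbel extreme distribution, we use the sufficient condition 53,
$$\lim_{x\to\infty}\frac{d}{dx}\left[\frac{1 - P(x)}{p(x)}\right] = \lim_{x\to\infty}\left(-\frac{\sigma}{x^2}\right) = 0. \tag{64}$$
The Gaussian cumulative probability in Eq. 63 has the asymptotic form
$$P(x) \approx 1 - \sqrt{\frac{\sigma}{2\pi}}\,\frac{e^{-x^2/2\sigma}}{x} \tag{65}$$
for large x. We determine $x_N$ by solving Eq. 43 using this asymptotic form. We find
$$x_N = \sqrt{\sigma\,W\!\left(\frac{N^2}{2\pi}\right)}, \tag{66}$$
where $W(z)$ is the Lambert W function, also known as the product logarithm, which is the solution to the equation $z = W e^{W}$. For large arguments, it approaches the natural logarithm, $W(z) \to \ln z$ as $z \to \infty$ [16]. This gives us
$$N p(x_N) = \sqrt{\frac{1}{\sigma}\,W\!\left(\frac{N^2}{2\pi}\right)} \tag{67}$$
when inserting the expression for $x_N$, Eq. 66, into Eq. 62. Thus we may now express the variable u in the Gumbel cumulative probability, Eq. 58, in terms of the variables x, σ and N using Eq. 51,
$$u = \sqrt{\frac{1}{\sigma}\,W\!\left(\frac{N^2}{2\pi}\right)}\left[x - \sqrt{\sigma\,W\!\left(\frac{N^2}{2\pi}\right)}\right]. \tag{68}$$
We show in Figure 3A the Gaussian and the corresponding Gumbel distributions for a fixed σ and $N = 100$ and $N = 1000$. By Eq. 43, the corresponding values $x_{100}$ and $x_{1000}$ are the upper limits of the confidence intervals for 99% and 99.9%.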
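To see how well the Lambert W expression for $x_N$ works, it can be compared with the exact Gaussian quantile. The sketch below is an illustration under stated assumptions: it takes $x_N = \sqrt{\sigma W(N^2/2\pi)}$ with σ the variance, evaluates W with a plain Newton iteration (a generic numerical method chosen here, not something from the text), and uses `statistics.NormalDist` from the Python standard library for the exact quantile. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def lambert_w(z, tol=1e-12, max_iter=100):
    """Principal branch of W for z >= 0: the solution w of w * exp(w) = z."""
    if z < 0.0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting guess at or above the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))  # Newton step
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def x_n_asymptotic(n, variance=1.0):
    """x_N from the large-x form of the Gaussian cumulative probability."""
    return math.sqrt(variance * lambert_w(n * n / (2.0 * math.pi)))


def x_n_exact(n, variance=1.0):
    """x_N from solving P(x_N) = 1 - 1/N exactly."""
    return NormalDist(0.0, math.sqrt(variance)).inv_cdf(1.0 - 1.0 / n)


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, x_n_asymptotic(n), x_n_exact(n))
```

The asymptotic value lies slightly above the exact quantile, and the difference decreases only slowly with N.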
FIGURE 3

(A) The Gaussian and the corresponding Gumbel distributions for $N = 100$ and $N = 1000$ at fixed σ. (B) The histograms shown here are based on data generated using the Box-Müller algorithm, which produces numbers distributed according to a Gaussian. The histogram with the maximum to the left shows all the generated data. The histogram with its maximum in the middle shows the largest number among each sequence of numbers of length 100, and the histogram with the rightmost maximum shows the largest number among each sequence of numbers of length 1,000. For each sequence length, $M$ such sequences were generated.
We show in Figure 3B a histogram based on numbers distributed according to a Gaussian distribution, generated using the Box-Müller algorithm [14]. These numbers were grouped together in sets of either $N = 100$ or $N = 1000$ elements. I generated $M$ such sets. The figure displays the two extreme distributions for the two set sizes. This figure should be compared to Figure 3A. In contrast to the two other extreme value distributions, we see that there are visible discrepancies between the calculated Gumbel distributions in Figure 3A and the extreme value histograms in Figure 3B. We see furthermore that the histogram for $N = 1000$ is closer to the calculated Gumbel distribution than the histogram for $N = 100$. This is due to the very slow convergence induced by the Lambert W functions. Slow convergence is typical for the Gumbel extreme value distributions. This slow convergence has recently been analyzed and, through clever use of scaling methods, remedied [17].
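The discrepancy described above can be reproduced with a small simulation. The sketch below is not the code behind Figure 3. It assumes unit variance, a basic Box-Müller transform, the Lambert W form $x_N = \sqrt{W(N^2/2\pi)}$ with $u = x_N(x - x_N)$, and a Newton iteration for W; all function names and sample sizes are hypothetical. It measures how far the simulated largest values are from the exact $P(x)^N$ and from the Gumbel form.

```python
import math
import random
from statistics import NormalDist


def box_muller(rng):
    """One standard Gaussian number from two uniform numbers."""
    r1 = 1.0 - rng.random()  # in (0, 1], avoids log(0)
    r2 = rng.random()
    return math.sqrt(-2.0 * math.log(r1)) * math.cos(2.0 * math.pi * r2)


def largest_gaussians(n, m, seed=4):
    """Largest value in each of m sequences of n Gaussian numbers."""
    rng = random.Random(seed)
    return [max(box_muller(rng) for _ in range(n)) for _ in range(m)]


def lambert_w(z, iterations=60):
    """Solution w of w * exp(w) = z for z >= 0, by Newton iteration."""
    w = math.log1p(z)
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - z) / (ew * (w + 1.0))
    return w


def gumbel_cdf_x(x, n):
    """Gumbel form in terms of x, for unit variance."""
    x_n = math.sqrt(lambert_w(n * n / (2.0 * math.pi)))
    return math.exp(-math.exp(-x_n * (x - x_n)))


def exact_cdf_x(x, n):
    """Exact cumulative probability P(x)**N of the largest value."""
    return NormalDist().cdf(x) ** n


def cdf_gap(values, cdf):
    """Largest distance between the empirical and the model cumulative probability."""
    xs = sorted(values)
    m = len(xs)
    return max(
        max(abs((i + 1) / m - cdf(x)), abs(i / m - cdf(x)))
        for i, x in enumerate(xs)
    )


if __name__ == "__main__":
    n, m = 100, 2000
    values = largest_gaussians(n, m)
    print("gap to exact :", cdf_gap(values, lambda x: exact_cdf_x(x, n)))
    print("gap to Gumbel:", cdf_gap(values, lambda x: gumbel_cdf_x(x, n)))
```

The gap to the exact form reflects sampling noise only, while the gap to the Gumbel form also contains the systematic finite-N error.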
5 Concluding Remarks
We summarize the main results presented in this mini-review in Table 1.
We have only discussed the distributions associated with the largest values of x, except for the Weibull extreme value distribution, Section 2. The distributions associated with the smallest values are, however, easy to work out: just transform $x \to -x$. Otherwise, the story presented here is rather complete.
There is one remark that needs to be made, though. In the derivation of the Gumbel extreme value distribution, Section 4, we defined a variable $x_N$ in Eq. 43. First of all, $x_N$ defined in Eq. 43 may be calculated for any cumulative probability $P(x)$ and it has an interpretation making it very useful.
The probability density for the largest among N numbers drawn using the probability distribution $p(x)$ is given by
$$\phi_N(x) = \frac{d}{dx}P(x)^{N} = N P(x)^{N-1}p(x). \tag{69}$$
We calculate the average of the cumulative probability for the extreme value based on N samples,
$$\langle P \rangle = \int_{-\infty}^{\infty} P(x)\,N P(x)^{N-1}p(x)\,dx = N\int_0^1 P^{N}\,dP = \frac{N}{N+1}. \tag{70}$$
For large N, we may write this as
$$\langle P \rangle \approx 1 - \frac{1}{N} = P(x_N), \tag{71}$$
using here Eq. 43. Hence, we may interpret $x_N$ as the x value corresponding to the average confidence interval of the largest observed value in sequences of N numbers. It is essentially the typical size of the extreme value for a sample of size N.
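The statement that the average cumulative probability of the largest value is close to $1 - 1/N$ can be checked directly. The sketch below assumes uniform numbers on the unit interval, an illustrative choice for which $P(x) = x$, so that averaging the largest value itself averages its cumulative probability. The exact average of $P$ over the density $N P^{N-1} p$ is $N/(N+1)$. The function name and sample sizes are hypothetical.

```python
import random


def mean_cumulative_probability_of_max(n, m, seed=5):
    """Average of P(largest value) over m sequences of n uniform numbers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        # For uniform numbers on [0, 1), P(x) = x, so the largest value
        # is its own cumulative probability.
        total += max(rng.random() for _ in range(n))
    return total / m


if __name__ == "__main__":
    n, m = 100, 5000
    print("simulated:", mean_cumulative_probability_of_max(n, m))
    print("N/(N+1)  :", n / (n + 1))
    print("1 - 1/N  :", 1.0 - 1.0 / n)
```

Because the cumulative probability of any continuous random number is itself uniformly distributed, the same average is expected for other choices of $p(x)$.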
Funding
This work was partly supported by the Research Council of Norway through its Centers of Excellence funding scheme, project number 262644.
Statements
Author contributions
The author confirms being the sole contributor of this work and has approved it for publication.
Acknowledgments
I thank Eivind Bering, Astrid de Wijn, H. George, E. Hentschel, Srutarshi Pradhan, and Itamar Procaccia for numerous interesting discussions on this topic.
Conflict of interest
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Footnotes
1. IID variables: independent and identically distributed random variables, a terminology used in some communities.
References
1. Gumbel EJ. Statistics of extremes. New York: Columbia University Press (1958).
2. David HA. Order statistics. 2nd ed. New York: Wiley (1981).
3. Galambos J. The asymptotic theory of extreme order statistics. Malabar, FL: Krieger (1987).
4. Embrechts P, Klüppelberg C, Mikosch T. Modelling extremal events for insurance and finance. Berlin: Springer (1997).
5. Coles S. An introduction to statistical modeling of extreme values. Berlin: Springer (2001).
6. Bouchaud J-P, Mézard M. Universality classes for extreme-value statistics. J Phys Math Gen (1997) 30:7997. 10.1088/0305-4470/30/23/004
7. Antal T, Droz M, Györgyi G, Rácz Z. 1/f noise and extreme value statistics. Phys Rev Lett (2001) 87:240601. 10.1103/physrevlett.87.240601
8. Randoux S, Suret P. Experimental evidence of extreme value statistics in Raman fiber lasers. Opt Lett (2012) 37:500. 10.1364/OL.37.000500
9. Taloni A, Vodret M, Costantini G, Zapperi S. Size effects on the fracture of microscale and nanoscale materials. Nat Rev Mater (2018) 3:211–24. 10.1038/s41578-018-0029-4
10. Hansen A, Hemmer PC, Pradhan S. The fiber bundle model. Berlin: Wiley VCH (2015).
11. Pal A, Eliazar I, Reuveni S. First passage under restart with branching. Phys Rev Lett (2019) 122:020602. 10.1103/PhysRevLett.122.020602
12. Fortin J-Y, Clusel M. Applications of extreme value statistics in physics. J Phys Math Theor (2015) 48:183001. 10.1088/1751-8113/48/18/183001
13. Majumdar SN, Pal A, Schehr G. Extreme value statistics of correlated random variables: a pedagogical review. Phys Rep (2020) 840:1. 10.1016/j.physrep.2019.10.005
14. Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical recipes. 3rd ed. Cambridge: Cambridge University Press (2007).
15. Rinne H. The Weibull distribution. Boca Raton: CRC Press (2008).
16. Corless RM, Gonnet GH, Hare DEG, Jeffrey DJ, Knuth DE. On the Lambert W function. Adv Comput Math (1996) 5:329–59. 10.1007/BF02124750
17. Zarfaty L, Barkai E, Kessler DA. Accurately approximating extreme value statistics (2020). arXiv:2006.13677.
Summary
Keywords
extreme value statistics, statistical analysis, Weibull analysis, Gumbel distribution, Frechet distribution, Weibull distribution
Citation
Hansen A (2020) The Three Extreme Value Distributions: An Introductory Review. Front. Phys. 8:604053. doi: 10.3389/fphy.2020.604053
Received
08 September 2020
Accepted
22 October 2020
Published
10 December 2020
Volume
8 - 2020
Edited by
Matjaž Perc, University of Maribor, Slovenia
Reviewed by
Arnab Pal, Tel Aviv University, Israel
Haroldo V. Ribeiro, State University of Maringá, Brazil
Copyright
© 2020 Hansen.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Alex Hansen, Alex.Hansen@ntnu.no
This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics