Extreme Value
Michael P. Mersic February 24, 2007
1 What is Extreme Value Theory?
What happened on February 1, 1953? There was a severe storm in the Netherlands [2]. This caused several dykes to fail. The flooding killed over 1800 people. After this natural disaster, the Dutch government formed a Delta Committee. The purpose of the Delta Committee was, among other things, to determine how high the sea dykes need to be to protect against a 1 in 10,000 year storm. The obvious problem with this is that there is only one hundred or so years of historical data.
Traditionally, statistics looks at the mean values of a data set. The Central Limit Theorem states that, for large samples, sample means will be approximately normally distributed around the population mean [4]. But, for many problems, the mean doesn't matter. What matters is the max event or the min event. Events like:
• The strength of a 1 in 10,000 year storm.
• The probability the stock market will crash tomorrow.
• The premium to charge for insurance over a large threshold.
There are two questions that I will ask in this paper. The first is, "What are the Extreme Value distributions?" This is the Extremal Limit problem [1]. The other question is, "To what distributions does EVT apply?" This is the Domain of Attraction problem [1]. Before answering these questions I will discuss some preliminaries, including Quantile-Quantile plots, and a more formal formulation of the Extremal Limit problem and the Domain of Attraction problem.
2 Comparing distributions with QQ-Plots
Before getting into EVT, I want to describe a method of comparing distributions called a Quantile-Quantile Plot, or QQ-Plot. A QQ-Plot allows us to quickly see if two sets of data seem to come from the same distribution [1]. It is important to note that it is not possible to prove that two sets of data came from the same distribution, but with a QQ-Plot we can see if it is a likely hypothesis. Let's take a set of independent and identically distributed (iid) data X1, X2, ..., Xn. The data comes from the distribution F(x) = P(X ≤ x). How can we tell if this distribution is like the normal distribution? One way is to take the Quantiles of the two distributions and plot them. The Quantile is essentially the inverse of F(x),
Q(p) = inf{x : F(x) ≥ p}.
Given a set of data, X1, X2, ..., Xn, from F(x), is the data normally distributed? To answer this question with a QQ-Plot we will first find the probability points, p1 = F(X1), p2 = F(X2), ..., pn = F(Xn). Then find the Quantiles of each probability point from the normal distribution, Y1 = QN(p1), Y2 = QN(p2), ..., Yn = QN(pn). Finally, plot each point (Y1, X1), (Y2, X2), ..., (Yn, Xn). If the data does come from the normal distribution, the points will fall close to the 45 degree line in the plot.
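The QQ-Plot construction above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method used to produce the plots in [1]: since F is unknown in practice, the probability points are approximated by the empirical plotting positions p_i = (i - 0.5)/n for the sorted data (an assumed convention), the sample is synthetic, and all function names are hypothetical. The Pearson correlation of the plotted points is used here as one simple way to judge how close they lie to a straight line.

```python
import math
import random
from statistics import NormalDist


def plotting_positions(n):
    """Assumed empirical probability points p_i = (i - 0.5) / n."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def normal_quantile(p):
    """Q_N(p) for the standard normal distribution."""
    return NormalDist().inv_cdf(p)


def exponential_quantile(p):
    """Q(p) = -log(1 - p) for the standard exponential distribution."""
    return -math.log(1.0 - p)


def qq_points(data, quantile):
    """Return the QQ-Plot points (Y_i, X_i): theoretical quantile vs. sorted data."""
    xs = sorted(data)
    ps = plotting_positions(len(xs))
    return [(quantile(p), x) for p, x in zip(ps, xs)]


def correlation(points):
    """Pearson correlation of the plotted points; close to 1 means nearly linear."""
    n = len(points)
    mean_y = sum(y for y, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    sxy = sum((y - mean_y) * (x - mean_x) for y, x in points)
    syy = sum((y - mean_y) ** 2 for y, _ in points)
    sxx = sum((x - mean_x) ** 2 for _, x in points)
    return sxy / math.sqrt(syy * sxx)


if __name__ == "__main__":
    # Synthetic exponential sample (an assumption; not the wind speed data).
    random.seed(0)
    sample = [random.expovariate(1.0) for _ in range(500)]
    print("vs normal:     ", correlation(qq_points(sample, normal_quantile)))
    print("vs exponential:", correlation(qq_points(sample, exponential_quantile)))
```

Passing a different quantile function to `qq_points` compares the same data against a different reference distribution, which is the comparison made in the example that follows.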
As an example I will plot the daily maximum wind speed, for days with maximum wind speed above 82 km/hr, for Zaventem, Belgium. This example is taken from [1]. In the next two graphs the Wind Speed data is plotted against the Normal Distribution and against the Exponential Distribution. First, the Normal Distribution with a least squares regression line:
Here the plot shows a very poor fit for the Wind Speed data against...
Next, the Exponential Distribution with a least squares regression line:
This is a much better fit.
The linearity of the data in these graphs can be measured by the correlation coefficient, rQ. The correlation coefficient is bounded, 0 ≤ rQ ≤ 1. If the data is perfectly linear, then rQ = 1. In the above examples, rQ for Wind Speed against the Normal Distribution is 0.8969 and rQ for Wind Speed against the Exponential Distribution is 0.9912. As expected, rQ for the Exponential Distribution is much closer to 1 than rQ for the Normal Distribution.
3 Formulation of the Extremal Value Problem
3.1 Maximum and Minimum are the same problem
It is easy to show that the maximum and the minimum distribution problems are the same. Consider the maximum value...