Hansen-Hurwitz (H-H) Estimator with Replacement

Available only on BuenasTareas
Published: August 23, 2012
Unequal Probability Sampling (Chapter 6)
Unequal probability sampling occurs when some units in the population have different probabilities of being selected than others. This handout introduces the Hansen-Hurwitz (H-H) estimator and the Horvitz-Thompson (H-T) estimator, examines the properties of both estimators for the population total and mean, and compares the two estimators by way of an example.

The Hansen-Hurwitz (H-H) Estimator for Random Sampling with Replacement

• Suppose a sample of size $n$ is selected randomly with replacement from a population of $N$ units, but that on each draw, unit $i$ has probability $p_i$ of being selected, where $\sum_{i=1}^{N} p_i = 1$.

The probability $p_i$ is called the selection probability for the $i$th unit. Let $y_i$ be the response variable measured on each unit selected. Note that if a unit is selected more than once, it is used as many times as it is selected. An unbiased estimator of the population total $\tau = \sum_{i=1}^{N} y_i$ is given by:

$$\hat{\tau}_p = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i}.$$

An unbiased estimator of the population mean is $\hat{\mu}_p = (1/N)\hat{\tau}_p$.

• Dividing by $p_i$ gives higher weight to units less likely to be selected.
• What happens to this estimator if $p_i = 1/N$, $i = 1, \ldots, N$, so that each unit has an equal chance of selection?
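As a quick check on the equal-probability question, here is a worked substitution (an added illustration, not part of the original handout). Setting $p_i = 1/N$ in the H-H estimator gives:

```latex
\hat{\tau}_p
  = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{1/N}
  = \frac{N}{n} \sum_{i=1}^{n} y_i
  = N \bar{y},
\qquad
\hat{\mu}_p = \frac{1}{N} \hat{\tau}_p = \bar{y}.
```

Here $\bar{y}$ denotes the sample mean of the $n$ draws, so under equal selection probabilities the H-H estimator reduces to the familiar expansion estimator for simple random sampling with replacement.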

Example: Consider a population of size $N = 3$, with values and corresponding selection probabilities given in the first two columns of the table below. Note that the true population total is $\tau = 14$. Consider taking a sample of size 1. The H-H estimates of the total for each of the 3 values (samples) are given in the third column of the table.

Values       Probabilities    $\hat{\tau}_p$
$y_1 = 3$    $p_1 = 0.2$      15
$y_2 = 2$    $p_2 = 0.5$      4
$y_3 = 9$    $p_3 = 0.3$      30

The expected value of $\hat{\tau}_p$ is: $E(\hat{\tau}_p) = 0.2(15) + 0.5(4) + 0.3(30) = 14 = \tau$.

• So, in $\hat{\tau}_p = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i}$, each $\frac{y_i}{p_i}$ is unbiased for $\tau$.
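The expected-value calculation above can be reproduced with a short script. This is a minimal sketch, assuming a helper named `hansen_hurwitz_total` (a name chosen here for illustration, not from the handout); it uses exact fractions so the check does not depend on floating-point rounding.

```python
from fractions import Fraction


def hansen_hurwitz_total(y_sample, p_sample):
    """H-H estimate of the population total from draws made with replacement.

    y_sample: responses for the drawn units (repeats included).
    p_sample: selection probabilities of those same units.
    """
    if len(y_sample) != len(p_sample) or not y_sample:
        raise ValueError("need one selection probability per drawn unit")
    n = len(y_sample)
    return sum(y / p for y, p in zip(y_sample, p_sample)) / n


# Population from the handout's example (N = 3, tau = 14).
y = [3, 2, 9]
p = [Fraction(2, 10), Fraction(5, 10), Fraction(3, 10)]

# With n = 1 there are only three possible samples, one per unit.
estimates = [hansen_hurwitz_total([yi], [pi]) for yi, pi in zip(y, p)]

# Expected value: weight each possible estimate by its selection probability.
expected = sum(pi * est for pi, est in zip(p, estimates))
print(expected)  # 14
```

The three entries of `estimates` match the third column of the table (15, 4, 30), and their probability-weighted average equals the true total.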

Why would you want to select units with unequal probabilities?

• It may be the most convenient way to sample. Recall the example of taking a sample of ponds by selecting a random point on a map. If the point lands in a pond, then that pond is selected for the sample. It would require a lot more effort to enumerate all the ponds so that an SRS could be selected. See also the farm example below.

• If the response variable is positively correlated with the selection probability, then the Hansen-Hurwitz estimator can have lower variance than the estimator based on an SRS.
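To make the variance comparison concrete, here is a minimal sketch using the three-unit population from the example and the with-replacement variance formula $\mathrm{Var}(\hat{\tau}_p) = \frac{1}{n} \sum_{j=1}^{N} p_j (y_j/p_j - \tau)^2$ given under the properties of the estimator in this handout. The function name `hh_variance` and the two alternative probability vectors (`equal_p`, `proportional_p`) are illustrative assumptions, not part of the original handout.

```python
from fractions import Fraction


def hh_variance(y, p, n=1):
    """Exact variance of the H-H total estimator for n draws with replacement."""
    tau = sum(y)
    return sum(pj * (yj / pj - tau) ** 2 for yj, pj in zip(y, p)) / n


y = [3, 2, 9]  # population values from the handout's example

handout_p = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
equal_p = [Fraction(1, 3)] * 3                        # equal-probability draws
proportional_p = [Fraction(yj, sum(y)) for yj in y]   # p_j proportional to y_j

print(hh_variance(y, handout_p))       # 127
print(hh_variance(y, equal_p))         # 86
print(hh_variance(y, proportional_p))  # 0
```

With the handout's probabilities the variance for a single draw (127) is higher than under equal probabilities (86), because the largest selection probability is attached to the smallest value. Probabilities exactly proportional to $y_j$ drive the variance to zero, which is the favorable case the bullet describes.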

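Properties of the Hansen-Hurwitz Estimator

$$E(\hat{\tau}_p) = \frac{1}{n} \sum_{i=1}^{n} E\left(\frac{y_i}{p_i}\right) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{N} p_j \frac{y_j}{p_j} = \tau$$

$$\mathrm{Var}(\hat{\tau}_p) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i}\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}\left(\frac{y_i}{p_i}\right) \quad \text{(indep. due to sampling with replacement)}$$

$$= \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{N} p_j \left(\frac{y_j}{p_j} - \tau\right)^2 = \frac{1}{n} \sum_{j=1}^{N} p_j \left(\frac{y_j}{p_j} - \tau\right)^2$$

where $\tau$ is unknown, so we need to estimate it in this variance. An unbiased estimate of the variance can be computed as:

$$\widehat{\mathrm{Var}}(\hat{\tau}_p) = \frac{1}{n(n-1)} \sum_{j=1}^{n} \left(\frac{y_j}{p_j} - \hat{\tau}_p\right)^2$$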
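The unbiasedness of the variance estimate can be checked by brute force on a small case. The sketch below is an added illustration under stated assumptions: the helper names `hh_total` and `hh_variance_estimate` are hypothetical, the population is the three-unit example from this handout, and the sample size $n = 2$ is chosen only because it keeps the enumeration of all ordered samples short.

```python
from fractions import Fraction
from itertools import product


def hh_total(y_sample, p_sample):
    """H-H estimate of the total from draws made with replacement."""
    n = len(y_sample)
    return sum(y / p for y, p in zip(y_sample, p_sample)) / n


def hh_variance_estimate(y_sample, p_sample):
    """Estimated variance of the H-H total: sum of squared deviations / (n(n-1))."""
    n = len(y_sample)
    if n < 2:
        raise ValueError("need at least two draws to estimate the variance")
    tau_hat = hh_total(y_sample, p_sample)
    squared_devs = sum((y / p - tau_hat) ** 2 for y, p in zip(y_sample, p_sample))
    return squared_devs / (n * (n - 1))


y = [3, 2, 9]
p = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
n = 2

# Average the variance estimate over every ordered sample of size n,
# weighting each sample by its probability under independent draws.
expected_estimate = Fraction(0)
for draws in product(range(len(y)), repeat=n):
    prob = Fraction(1)
    for j in draws:
        prob *= p[j]
    ys = [y[j] for j in draws]
    ps = [p[j] for j in draws]
    expected_estimate += prob * hh_variance_estimate(ys, ps)

# Exact variance from the formula (1/n) * sum_j p_j (y_j/p_j - tau)^2.
true_variance = sum(pj * (yj / pj - sum(y)) ** 2 for yj, pj in zip(y, p)) / n

print(expected_estimate, true_variance)  # 127/2 127/2
```

The two printed values agree, which is what unbiasedness means here: averaged over all possible samples, the estimate equals the true variance of $\hat{\tau}_p$.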

Note that the properties of $\hat{\mu}_p = (1/N)\hat{\tau}_p$ follow easily: $E(\hat{\mu}_p) = (1/N)\tau = \mu$ (unbiased), $\mathrm{Var}(\hat{\mu}_p) = (1/N)^2 \mathrm{Var}(\hat{\tau}_p)$, and $\widehat{\mathrm{Var}}(\hat{\mu}_p) = (1/N)^2 \widehat{\mathrm{Var}}(\hat{\tau}_p)$.

Notes on the Hansen-Hurwitz Estimator

1. We only need $p_i$ for the units in the sample (not the whole population).
2. We need not know $N$ in order to estimate $\tau$.
3. If we let $y_i = 1$, $i = 1, \ldots, N$, then $\tau = N$ and $\hat{\tau}_p = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{p_i} = \hat{N}$ is an estimator of $N$.
4. If there is low variability between the values of $y_j/p_j$, then the H-H estimator will have low variance, with the extreme case being when $y_j$ and $p_j$ are exactly proportional to each other. On the other hand, the H-H estimator will have high variance when there is high variability among the values of $y_j/p_j$.

Example: Consider a population of farms of varying sizes and shapes on a 25x25 grid, as given on the last page of this handout. If we randomly select a single square on this grid, then letting $x_i$ = the area of farm $i$ and $A = 625$ total units, the probability that farm $i$ is selected is: $p_i = \frac{x_i}{A} = \frac{x_i}{625}$. Let $y_i$ = the response variable of interest (which might be $x_i$).


• If $y_i = x_i$, then $\tau = \sum_{i=1}^{N} y_i$ = the total area of all farms. In this...
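The farm selection scheme can be sketched in code. This is only an illustration under stated assumptions: the farm areas in `areas` are made up (the handout's actual farm map is not reproduced here), the helper name `hh_total`, the seed, and the sample size are arbitrary, and the farms are assumed to cover the whole grid so that the $p_i$ sum to 1.

```python
import random
from fractions import Fraction

GRID_SQUARES = 625  # A = 25 x 25, from the handout

# Hypothetical farm areas in grid squares (not the handout's data).
areas = [100, 225, 50, 150, 100]
assert sum(areas) == GRID_SQUARES  # farms are assumed to tile the grid

# Selection probability of farm i when one square is picked at random.
p = [Fraction(x, GRID_SQUARES) for x in areas]


def hh_total(y_sample, p_sample):
    """H-H estimate of the total from draws made with replacement."""
    n = len(y_sample)
    return sum(y / q for y, q in zip(y_sample, p_sample)) / n


rng = random.Random(0)  # arbitrary seed, for repeatability only
n = 4
# Picking a random square and taking the farm that contains it is the same
# as drawing farm i with probability x_i / A, with replacement.
draws = rng.choices(range(len(areas)), weights=areas, k=n)
p_drawn = [p[i] for i in draws]

# y_i = x_i: every ratio y_i / p_i equals A, so the estimate never varies.
area_estimate = hh_total([areas[i] for i in draws], p_drawn)
print(area_estimate)  # 625

# y_i = 1: the same draws give an estimate of the number of farms N.
count_estimate = hh_total([1] * n, p_drawn)

# Averaged over a single draw, the count estimator equals N exactly.
expected_count = sum(q * (1 / q) for q in p)
print(expected_count)  # 5
```

The area estimate is 625 for any sample because $y_i$ and $p_i$ are exactly proportional, the zero-variance extreme case of note 4. The count estimate varies from sample to sample but is unbiased for $N$, as in note 3.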