Model Based Small Domain Estimation, Nikos Tzavidis
Recent Methods & Applications in National Statistics Institutes in Europe
Nikos Tzavidis, Social Statistics & S3RI, University of Southampton
Mexico City, October 2012
Model-based domain estimation in NSIs in Europe
Outline
Motivation & overview of domain estimation in European NSIs
Estimation of averages (totals) & "complex" indicators
Nested error regression model
Design-consistent estimation
Alternative approaches to outlier robust estimation
Empirical Best Prediction
Two recent applications in European NSIs
Domain estimation for business surveys in the Netherlands
Estimating income deprivation in the UK
Current debate in the UK: Beyond 2011 Census and the role of SAE methods
Motivation
Direct estimation: Use only domain-specific data
Problems with direct estimation
1. Direct estimates may suffer from low precision
2. Not applicable with zero sample sizes
Potential solution: Use of models to improve efficiency
Domain Estimation in NSIs in Europe
Examples of Domain Estimates:
Average income
Labour market activity (employment/unemployment)
Income deprivation
Business statistics
Requires use of different methodologies and auxiliary data
Increasingly relying on model-based estimation
Estimates published as experimental or National statistics
Two Popular Estimators of Domain Averages
Synthetic estimator
\[ \hat{\bar{y}}_j = \bar{X}_j^T \hat{\beta}_w \]
$\hat{\beta}_w$ is the probability weighted estimator
Can be biased (relies on a homogeneity assumption) BUT stable
Survey regression estimator
\[ \hat{\bar{y}}_j = \hat{\bar{Y}}_{j,HT} + (\bar{X}_j - \hat{\bar{x}}_{j,HT})^T \hat{\beta}_w \]
Corrects the potential bias of the synthetic estimator BUT can be unstable
Review paper: Pfeffermann (2012), Statistical Science
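The two estimators on this slide can be sketched in Python as follows. This is a minimal illustration, assuming NumPy; the function names, the simulated data, the domain size `N_j` and the population covariate means `Xbar_pop_j` are hypothetical. The Horvitz-Thompson domain means are taken here as weighted sample totals divided by a known `N_j`, which is one possible reading of the HT subscript.

```python
import numpy as np

def weighted_beta(X, y, w):
    """Probability-weighted least squares: solves (X' W X) b = X' W y."""
    XtW = X.T * w                      # each column of X.T scaled by its weight
    return np.linalg.solve(XtW @ X, XtW @ y)

def synthetic(Xbar_pop_j, beta_w):
    """Synthetic estimator: population covariate means times beta_w."""
    return Xbar_pop_j @ beta_w

def survey_regression(X_j, y_j, w_j, N_j, Xbar_pop_j, beta_w):
    """Survey regression estimator: HT mean plus a regression adjustment."""
    ybar_ht = np.sum(w_j * y_j) / N_j  # HT estimate of the domain mean of y
    xbar_ht = (w_j @ X_j) / N_j        # HT estimate of the domain means of x
    return ybar_ht + (Xbar_pop_j - xbar_ht) @ beta_w

# Hypothetical data: 3 domains, 20 sampled units each, one covariate
rng = np.random.default_rng(0)
n = 60
domain = np.repeat([0, 1, 2], 20)
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])   # intercept + covariate
w = rng.uniform(5, 15, n)              # hypothetical design weights
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)

beta_w = weighted_beta(X, y, w)

j = 1
in_j = domain == j
N_j = 200                              # hypothetical domain population size
Xbar_pop_j = np.array([1.0, 5.2])      # hypothetical known population means

syn_j = synthetic(Xbar_pop_j, beta_w)
sreg_j = survey_regression(X[in_j], y[in_j], w[in_j], N_j, Xbar_pop_j, beta_w)
print(syn_j, sreg_j)
```

The synthetic estimate uses no domain-specific responses, which is why it is stable but depends on the model holding in every domain; the survey regression estimate adds a domain-specific correction whose variance grows as the domain sample shrinks.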
Model-based Methods
Nested Error Regression Model
Key Idea (Battese et al., 1988; Rao, 2003)
Improve efficiency of domain estimators
Use random area-specific effects
Capture area variation beyond that explained by covariates
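\[ y_{ij} = x_{ij}^T \beta + v_j + \varepsilon_{ij}, \quad i = 1, \dots, n_j, \quad j = 1, \dots, d \]
\[ v \sim N(0, \Sigma_v), \quad \varepsilon \sim N(0, \Sigma_\varepsilon), \quad \text{OR} \quad \varepsilon \sim N(0, \Sigma_\varepsilon = f(x)) \]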
Estimator of Domain Average
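\[ \hat{\bar{y}}_j^{EBLUP} = N_j^{-1} \Big[ \sum_{i \in s_j} y_{ij} + \sum_{i \in r_j} ( x_{ij}^T \hat{\beta} + \hat{v}_j ) \Big] \]
where $s_j$ and $r_j$ denote the sampled and non-sampled units of domain $j$.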
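A minimal Python sketch of the EBLUP of a domain average under the nested error regression model, assuming NumPy. To keep it short, the variance components `sigma2_v` and `sigma2_e` are treated as known, whereas in practice they would be estimated (for example by ML or REML), and the errors are taken as homoskedastic. The shrinkage form of the predicted area effect, $\hat{v}_j = \gamma_j(\bar{y}_j - \bar{x}_j^T\hat{\beta})$ with $\gamma_j = \sigma_v^2/(\sigma_v^2 + \sigma_e^2/n_j)$, is the standard result for this model and is not spelled out on the slide. All function names and the simulated population are hypothetical.

```python
import numpy as np

def fit_beta_gls(X, y, area, sigma2_v, sigma2_e):
    """GLS estimate of beta with V_j = sigma2_e * I + sigma2_v * 11'."""
    p = X.shape[1]
    A, b = np.zeros((p, p)), np.zeros(p)
    for j in np.unique(area):
        Xj, yj = X[area == j], y[area == j]
        nj = len(yj)
        Vj = sigma2_e * np.eye(nj) + sigma2_v * np.ones((nj, nj))
        A += Xj.T @ np.linalg.solve(Vj, Xj)
        b += Xj.T @ np.linalg.solve(Vj, yj)
    return np.linalg.solve(A, b)

def predict_area_effect(Xs_j, ys_j, beta, sigma2_v, sigma2_e):
    """Predicted random effect: shrunken mean residual of the area."""
    gamma_j = sigma2_v / (sigma2_v + sigma2_e / len(ys_j))
    return gamma_j * np.mean(ys_j - Xs_j @ beta)

def eblup_mean(Xs_j, ys_j, Xr_j, beta, v_j):
    """Observed y for sampled units plus predictions for non-sampled units."""
    N_j = len(ys_j) + len(Xr_j)
    return (ys_j.sum() + (Xr_j @ beta + v_j).sum()) / N_j

# Hypothetical population: d areas of N_j units, n_j sampled per area
rng = np.random.default_rng(42)
d, N_j, n_j = 15, 100, 8
sigma2_v, sigma2_e = 1.0, 4.0          # treated as known in this sketch
beta_true = np.array([2.0, 1.5])

area_pop = np.repeat(np.arange(d), N_j)
X_pop = np.column_stack([np.ones(d * N_j), rng.uniform(0, 10, d * N_j)])
v = rng.normal(0, np.sqrt(sigma2_v), d)
y_pop = (X_pop @ beta_true + v[area_pop]
         + rng.normal(0, np.sqrt(sigma2_e), d * N_j))

# Units are generated in random order, so take the first n_j of each area
sampled = (np.arange(d * N_j) % N_j) < n_j
Xs, ys, area_s = X_pop[sampled], y_pop[sampled], area_pop[sampled]

beta_hat = fit_beta_gls(Xs, ys, area_s, sigma2_v, sigma2_e)
estimates = np.empty(d)
for j in range(d):
    s_j = sampled & (area_pop == j)
    r_j = ~sampled & (area_pop == j)
    v_j = predict_area_effect(X_pop[s_j], y_pop[s_j], beta_hat,
                              sigma2_v, sigma2_e)
    estimates[j] = eblup_mean(X_pop[s_j], y_pop[s_j], X_pop[r_j],
                              beta_hat, v_j)

true_means = np.array([y_pop[area_pop == j].mean() for j in range(d)])
print(np.round(estimates - true_means, 2))
```

Note that the predictor needs the covariates of the non-sampled units (or at least their domain totals), which is the auxiliary data requirement mentioned earlier in the talk.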
Extensions based on the Nested Error Regression Model I
Nested Error Regression Model & Design Weights (You and Rao, 2002)
Aim: Design consistency
Incorporates design weights in estimation
Pseudo EBLUP
\[ \hat{\bar{y}}_{jw}^{P\text{-}EBLUP} = \hat{\gamma}_{jw} \bar{y}_{jw} + ( \bar{X}_j^T - \hat{\gamma}_{jw} \bar{x}_{jw}^T ) \hat{\beta}_w \]
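The Pseudo EBLUP combination can be sketched in Python as below, assuming NumPy. The form of the shrinkage factor used here, $\gamma_{jw} = \sigma_v^2 / (\sigma_v^2 + \sigma_e^2 \sum_i \tilde{w}_{ij}^2)$ with weights normalised to sum to one within the domain, follows You and Rao (2002) and does not appear on the slide, so treat it as an assumption. The variance components and `beta_w` are passed in as given; the function names and the data are hypothetical.

```python
import numpy as np

def gamma_w(w_j, sigma2_v, sigma2_e):
    """Design-weighted shrinkage factor (assumed form, see lead-in)."""
    w_tilde = w_j / w_j.sum()                  # normalised weights, sum to 1
    return sigma2_v / (sigma2_v + sigma2_e * np.sum(w_tilde ** 2))

def pseudo_eblup_mean(X_j, y_j, w_j, Xbar_pop_j, beta_w, sigma2_v, sigma2_e):
    """gamma * weighted mean of y + (Xbar - gamma * weighted mean of x)' beta."""
    w_tilde = w_j / w_j.sum()
    ybar_w = w_tilde @ y_j                     # weighted domain mean of y
    xbar_w = w_tilde @ X_j                     # weighted domain means of x
    g = gamma_w(w_j, sigma2_v, sigma2_e)
    return g * ybar_w + (Xbar_pop_j - g * xbar_w) @ beta_w

# Hypothetical domain sample
rng = np.random.default_rng(1)
n_j = 12
X_j = np.column_stack([np.ones(n_j), rng.uniform(0, 10, n_j)])
y_j = 1.0 + 0.8 * X_j[:, 1] + rng.normal(0, 1, n_j)
w_j = rng.uniform(5, 20, n_j)                  # hypothetical design weights
Xbar_pop_j = np.array([1.0, 4.7])              # hypothetical population means
beta_w = np.array([1.1, 0.75])                 # hypothetical weighted estimate

est = pseudo_eblup_mean(X_j, y_j, w_j, Xbar_pop_j, beta_w,
                        sigma2_v=1.0, sigma2_e=4.0)
print(est)
```

Two limiting cases show how the estimator moves between the earlier ones: with no between-area variance the factor is 0 and the result is the synthetic estimate, while with no unit-level variance the factor is 1 and the result has the survey regression form.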
Extensions based on the Nested Error Regression Model II
Outlier Robust estimation with the Nested Error Regression Model (Sinha and Rao, 2009)
Effects of outliers are controlled via an influence function ψ
Replace $\hat{\beta}$ by $\hat{\beta}^{\psi}$
Replace $\hat{v}_j$ by $\hat{v}_j^{\psi}$
\[ \hat{\bar{y}}_j^{REBLUP} = N_j^{-1} \Big[ \sum_{i \in s_j} y_{ij} + \sum_{i \in r_j} ( x_{ij}^T \hat{\beta}^{\psi} + \hat{v}_j^{\psi} ) \Big] \]
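To illustrate how an influence function limits the effect of outliers, here is a minimal Python sketch using Huber's $\psi$ with tuning constant 1.345, assuming NumPy. The slide does not say which $\psi$ is used, so the Huber choice is an assumption, and the example is a simple M-estimate of location rather than the robust estimating equations of the REBLUP. Function names and data are hypothetical.

```python
import numpy as np

def huber_psi(u, c=1.345):
    """Huber influence function: identity on [-c, c], constant outside."""
    return np.clip(u, -c, c)

def huber_location(y, c=1.345, tol=1e-8, max_iter=100):
    """M-estimate of location by iteratively reweighted averaging.

    Assumes the MAD of y is non-zero (it is held fixed as the scale).
    """
    mu = np.median(y)
    s = np.median(np.abs(y - mu)) / 0.6745     # MAD-based scale
    for _ in range(max_iter):
        u = (y - mu) / s
        safe_u = np.where(u == 0, 1.0, u)      # avoid 0/0; weight is 1 at u = 0
        w = np.where(u == 0, 1.0, huber_psi(u, c) / safe_u)
        mu_new = np.sum(w * y) / np.sum(w)
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

y = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 50.0])   # last value is an outlier
print(np.mean(y), huber_location(y))
```

Observations with standardised residuals inside $[-c, c]$ keep full weight, while the outlier is downweighted in proportion to its distance, so the robust estimate stays close to the bulk of the data.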
Alternative Outlier Robust Estimation
M-quantile Model (Chambers & Tzavidis, 2006)
\[ MQ_y(q \mid x_{ij}) = x_{ij}^T \beta_{\psi}(q) \]
q denotes a quantile; conventionally it is chosen a priori and held fixed.
$q_{ij}$ are random variables such that $y_{ij} = x_{ij}^T \beta_{\psi}(q_{ij})$
Estimate empirical domain effects $\hat{\theta}_j = E(\hat{q}_{ij})$
$\hat{\theta}_j$ captures between-area variation
No explicit parametric assumptions on θj
\[ \hat{\bar{y}}_j^{MQ} = N_j^{-1} \Big[ \sum_{i \in s_j} y_{ij} + \sum_{i \in r_j} x_{ij}^T \hat{\beta}_{\psi}(\hat{\theta}_j) \Big] \]
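The M-quantile steps can be sketched in Python as follows, assuming NumPy. The sketch makes several simplifying assumptions that are not on the slide: Huber's $\psi$ with the asymmetric form $\psi_q(u) = 2\psi(u)\{q\,I(u > 0) + (1 - q)\,I(u \le 0)\}$, fitting by iteratively reweighted least squares with a MAD scale, unit-level coefficients $\hat{q}_{ij}$ taken as the nearest point on a grid of $q$ values (a crude stand-in for interpolation), and $\hat{\theta}_j$ taken as the area average of the $\hat{q}_{ij}$. Function names and data are hypothetical.

```python
import numpy as np

def mq_fit(X, y, q=0.5, c=1.345, tol=1e-8, max_iter=200):
    """M-quantile regression coefficients beta_psi(q) by IRLS."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # least squares start
    for _ in range(max_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745  # MAD scale
        u = r / s
        # Huber weight psi(u)/u, equal to min(1, c/|u|)
        huber_w = np.minimum(1.0, c / np.maximum(np.abs(u), 1e-12))
        w = 2.0 * np.where(u > 0, q, 1.0 - q) * huber_w
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def area_coefficients(X, y, area, grid):
    """theta_j: area average of unit-level q found on a grid (nearest fit)."""
    betas = np.array([mq_fit(X, y, q) for q in grid])    # shape (K, p)
    fitted = X @ betas.T                                 # shape (n, K)
    q_unit = grid[np.argmin(np.abs(fitted - y[:, None]), axis=1)]
    return np.array([q_unit[area == j].mean() for j in np.unique(area)])

def mq_mean(ys_j, Xr_j, beta_theta_j):
    """Observed y for sampled units plus M-quantile predictions for the rest."""
    N_j = len(ys_j) + len(Xr_j)
    return (ys_j.sum() + (Xr_j @ beta_theta_j).sum()) / N_j

# Hypothetical sample: 3 areas (labelled 0, 1, 2) with different levels
rng = np.random.default_rng(7)
d, n_j = 3, 30
area = np.repeat(np.arange(d), n_j)
x = rng.uniform(0, 5, d * n_j)
X = np.column_stack([np.ones(d * n_j), x])
area_shift = np.array([-2.0, 0.0, 2.0])   # hypothetical between-area differences
y = 1.0 + 2.0 * x + area_shift[area] + rng.normal(0, 0.5, d * n_j)

grid = np.linspace(0.05, 0.95, 19)
theta = area_coefficients(X, y, area, grid)

j = 2
beta_j = mq_fit(X, y, q=theta[j])
Xr_j = np.column_stack([np.ones(70), rng.uniform(0, 5, 70)])  # non-sampled units
est_j = mq_mean(y[area == j], Xr_j, beta_j)
print(np.round(theta, 2), est_j)
```

No random effect is specified anywhere: areas whose units sit above the bulk of the data get a $\hat{\theta}_j$ above 0.5, and the area-specific regression line $\hat{\beta}_{\psi}(\hat{\theta}_j)$ plays the role that $\hat{\beta} + \hat{v}_j$ plays in the nested error model.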
Alternative Outlier Robust Estimation (Cont'd)
$\hat{\bar{y}}_j^{MQ}$ can be biased
Bias correction: Robust predictive approach (Welsh & Ronchetti, 1998; Tzavidis et al., 2010; Chambers et al., 2012)
\[ \hat{\bar{y}}_j^{BC} = N_j^{-1} \Big[ \sum_{i \in s_j} y_{ij} + \sum_{i \in r_j} \hat{y}_{ij}^{\psi} + \frac{N_j - n_j}{n_j} \, \phi \dots