Statistics

Posted: May 27, 2011
© 1993 American Statistical Association

Journal of Business & Economic Statistics, October 1993, Vol. 11, No. 4

Two-Phase Sampling of Tax Records for Business Surveys
John Armstrong, Clayton Block, and K. P. Srinath
Business Survey Methods Division, Statistics Canada, Ottawa, Ontario K1A 0T6, Canada
A two-phase design used to sample tax records of businesses to obtain annual estimates of Canadian economic production is described. Classification information obtained from units in the first-phase sample is used during stratification for selection of the second-phase sample. Bernoulli sampling is employed. Given multiple precision constraints, optimal allocation requires solution of a convex programming problem. An approximate method is described and compared to the optimal approach. The efficiency of the two-phase design relative to one-phase alternatives is examined.

KEY WORDS: Convex programming; Efficiency; Optimal allocation.

A new strategy for the collection and integration of economic data is being implemented at Statistics Canada. According to the strategy, annual economic data for large businesses are collected through mail-out sample surveys and data for small businesses are obtained from a sample of tax records. Estimates of financial variables for the business population are obtained by combining estimates from the two sources. Tax data are used to obtain small-business estimates for two reasons. First, the use of administrative information is substantially cheaper than collection of data through direct contact with respondents. Second, use of tax data helps Statistics Canada meet requirements to reduce response burden. Most surveys of large businesses are in fact censuses, and the contribution of small businesses to overall estimates is relatively small. The tax sample is designed to take into account precision requirements for overall estimates.

Canadian industrial activity is classified according to the four-digit Standard Industrial Classification (SIC) code (Statistics Canada 1980). Tax records in the population can be accurately stratified using the first two digits of SIC (SIC2). SIC2 codes classify business activity into 76 groups. Four-digit SIC (SIC4) codes provide classification into finer categories within each group. For example, the SIC2 code of a business might indicate that its major activity is in the communication industry, but the SIC4 code describes the activity as radio broadcasting.

There are two reasons why sampling of tax records is used rather than tabulation of all records. First, the cost of accurately assigning SIC4 codes to all tax records in the population would be substantial. Second, estimates are required for many variables that are not available in machine-readable form and must be obtained from source documents. The cost of obtaining this information for all records would be prohibitive.

A two-phase approach to sampling of tax records was adopted to facilitate accurate estimation of economic production at the SIC4 level. Optimal allocation of the two-phase sample requires determination of first- and second-phase sampling fractions that minimize cost subject to constraints on the coefficients of variation of estimates of gross business income for SIC4 domains. This problem involves a series of constraints within each SIC2 cell. It differs from the allocation problems for two-phase designs involving a single variance function considered by Rao (1973), Cochran (1977, pp. 327-332), and Smith (1989).

A description of the two-phase sample design is given in Section 1. Methods used for estimation are described in Section 2. Sample allocation is considered in Section 3. An iterative method that can be used to obtain optimal sampling fractions is described. This method, involving the solution of a series of convex programming problems, is compared to an approximately optimal closed-form alternative. The efficiency of the two-phase design relative to one-phase alternatives...
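
The two-phase Bernoulli selection outlined above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record layout, the function names (`bernoulli_sample`, `two_phase_estimate`), and the SIC codes in the usage note are hypothetical. The simple expansion estimator, which weights each second-phase unit by the inverse of the product of its two sampling fractions, is an assumption made for illustration, since the estimation methods of Section 2 are not reproduced in this excerpt.

```python
import random
from collections import defaultdict


def bernoulli_sample(units, prob_of, rng):
    """Bernoulli sampling: keep each unit independently with its own probability."""
    return [u for u in units if rng.random() < prob_of(u)]


def two_phase_estimate(population, f1_by_sic2, f2_by_sic4, seed=0):
    """Estimate income totals by SIC4 domain from a two-phase Bernoulli sample.

    population : list of dicts with keys "sic4" (4-digit string) and "income"
    f1_by_sic2 : first-phase sampling fraction for each SIC2 stratum
    f2_by_sic4 : second-phase sampling fraction for each SIC4 stratum
    (All names and the estimator itself are illustrative assumptions.)
    """
    rng = random.Random(seed)

    # Phase 1: stratify on SIC2 only (taken here as the first two SIC digits).
    phase1 = bernoulli_sample(
        population, lambda u: f1_by_sic2[u["sic4"][:2]], rng
    )

    # Phase 2: SIC4 is treated as known only for first-phase units,
    # so the second-phase strata are formed from the phase-1 sample.
    phase2 = bernoulli_sample(phase1, lambda u: f2_by_sic4[u["sic4"]], rng)

    # Expansion estimator: weight = 1 / (f1 * f2) for each selected unit.
    totals = defaultdict(float)
    for u in phase2:
        weight = 1.0 / (f1_by_sic2[u["sic4"][:2]] * f2_by_sic4[u["sic4"]])
        totals[u["sic4"]] += weight * u["income"]
    return dict(totals)
```

As a usage note, calling `two_phase_estimate(records, {"48": 0.5}, {"4811": 0.4, "4821": 1.0})` on a list of such records would weight each selected unit in the hypothetical domain "4811" by 1 / (0.5 × 0.4) = 5. Because Bernoulli sampling selects units independently, the realized sample size is random, which is one reason the allocation problem is stated in terms of sampling fractions rather than fixed sample sizes.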