Mathematics

Pages: 8 (1794 words) Published: April 4, 2011
Entropy Principle in Direct Derivation of Benford's Law
Oded Kafri, Varicom Communications, Tel Aviv 68165, Israel, oded@varicom.co.il

The uneven distribution of digits in numerical data, known as Benford's law, was discovered in 1881. Since then, the law has been shown to hold in copious numerical data relating to economics, physics and even prime numbers. Although it attracts considerable attention, there is no a priori probabilistic criterion for when a data set should or should not obey the law. Here a general criterion is suggested, namely that any file of digits in the Shannon limit (that is, having maximum entropy) has a Benford's law distribution of digits.


Benford's law is an empirical, uneven distribution of digits found in many random numerical data sets. Numerical data from natural sources that are expected to be random exhibit an uneven distribution of first digits that fits the equation

$$\rho(n) = \log_{10}\left(1 + \frac{1}{n}\right), \quad \text{where } n = 1, 2, 3, 4, 5, 6, 7, 8, 9 \qquad (1)$$

That is, digit 1 appears as the first digit with probability ρ(1), which is about 6.6 times higher than the probability ρ(9) of digit 9 (Fig. 1).

Fig. 1. Benford's law predicts a decreasing frequency of first digits, from 1 through 9.
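
As a quick numerical check of Eq. (1), the following Python sketch tabulates ρ(n) for the nine digits, confirms that the probabilities sum to 1, and compares ρ(1) with ρ(9). The function name `benford` is chosen here for illustration and does not come from the paper.

```python
import math


def benford(n: int) -> float:
    """First-digit probability from Eq. (1): log10(1 + 1/n)."""
    if not 1 <= n <= 9:
        raise ValueError("n must be a digit from 1 to 9")
    return math.log10(1 + 1 / n)


# Probability of each possible first digit.
probabilities = {n: benford(n) for n in range(1, 10)}

# The sum telescopes to log10(10), so it should equal 1 up to rounding.
total = sum(probabilities.values())

# How much more likely a leading 1 is than a leading 9.
ratio = probabilities[1] / probabilities[9]

for n, p in probabilities.items():
    print(f"{n}: {p:.4f}")
print(f"sum = {total:.4f}, rho(1)/rho(9) = {ratio:.2f}")
```

The sum equals 1 because the product of the factors (1 + 1/n) for n = 1 to 9 is exactly 10.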

Eq. (1) was suggested by Newcomb in 1881 from observations of the physical wear and tear of books containing logarithmic tables [1]. Benford explored the phenomenon further in 1938, checked it empirically for a wide range of numerical data [2], and unsuccessfully attempted to present a formal proof. Since then, Benford's law has also been found in prime numbers [3], physical constants, Fibonacci numbers and many more [3,4,5].
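
The Fibonacci case can be checked in a few lines. The sketch below is an illustration and is not taken from the cited references. It counts the leading digits of the first 1000 Fibonacci numbers (the sample size is an arbitrary choice) and compares the observed frequencies with Eq. (1).

```python
import math
from collections import Counter


def fibonacci_first_digits(count: int) -> list[int]:
    """Leading decimal digit of each of the first `count` Fibonacci numbers."""
    digits = []
    a, b = 1, 1
    for _ in range(count):
        digits.append(int(str(a)[0]))
        a, b = b, a + b
    return digits


sample_size = 1000  # arbitrary choice for this illustration
counts = Counter(fibonacci_first_digits(sample_size))

# Observed frequency versus the prediction of Eq. (1) for each digit.
observed = {n: counts[n] / sample_size for n in range(1, 10)}
expected = {n: math.log10(1 + 1 / n) for n in range(1, 10)}
max_gap = max(abs(observed[n] - expected[n]) for n in range(1, 10))

for n in range(1, 10):
    print(f"{n}: observed {observed[n]:.3f}, Eq. (1) {expected[n]:.3f}")
print(f"largest absolute difference: {max_gap:.4f}")
```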


Benford's law attracts considerable attention [6]. Attempts at explanation are based on scale invariance [7] and base invariance [8,9,10] principles. However, there are no a priori well-defined probabilistic criteria for when a data set should or should not obey the law [4]. Benford's distribution of digits is counterintuitive, as one expects that random numbers would have a uniform distribution of digits, namely ρ(n) = 1/9, as in the case of an unbiased lottery. This is the reason why Benford's law is used by income tax agencies of several nations and states for fraud detection in large companies and accounting businesses [4,11,12]. Usually, when fraud is committed, the digits are invoked in equal probabilities and the distribution of digits does not follow Eq. (1).

In this paper Benford's law is derived according to a standard probabilistic argumentation. It is assumed, counter to the common intuition that digits are the logical units that comprise numbers, that the logical units are the 1's. For example, the digit 8 comprises 8 units of 1, and so on. This model can easily be viewed as a model of balls and boxes, namely:

A) Digit n is equivalent to a "box" containing n non-interacting balls.
B) A sequence of N such "boxes" is equivalent to a number or a numerical file.
C) All possible configurations of the boxes and balls, for a given number of balls, have equal probability.

The last assumption is the definition of equilibrium and randomness in statistical physics. In information theory it means that the file is in the Shannon limit (a compressed file).

A number is written as a combination of ordered digits assuming a given base B. When we have a number with N digits of base B, we can describe the number as a set of N boxes, each containing a number of balls n, where n can be any integer from 0 to B − 1.
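
The box model can be explored by brute force for small cases. The sketch below assumes that the balls are indistinguishable, so that a configuration is simply the list of occupancies of the N boxes. This reading, the function name and the parameter values are assumptions made for illustration. It enumerates every configuration with a fixed total number of balls, weights them equally as in assumption C, and reports how often the first box holds n balls.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def first_box_distribution(
    num_boxes: int, base: int, total_balls: int
) -> dict[int, Fraction]:
    """Probability that the first box holds n balls when every allowed
    configuration is equally likely (assumption C).

    Assumption of this sketch: balls are indistinguishable, so a
    configuration is a tuple of occupancies, each between 0 and base - 1.
    """
    counts = Counter()
    for config in product(range(base), repeat=num_boxes):
        if sum(config) == total_balls:
            counts[config[0]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no configuration holds that many balls")
    return {n: Fraction(counts[n], total) for n in range(base)}


# Illustrative values only: 4 boxes (digits) in base 10 sharing 9 balls.
dist = first_box_distribution(num_boxes=4, base=10, total_balls=9)
for n, p in dist.items():
    print(f"{n} balls: {p} ({float(p):.3f})")
```

For 4 boxes in base 10 holding 9 balls in total, small occupancies come out more probable than large ones. This is only a numerical illustration of assumption C and is not the derivation given in the paper.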

We designate the total number of balls in a number as P. An unbiased distribution of balls in boxes means an equal probability for any ball to be in any box. Hereafter, it is shown that this assumption is equivalent to assumption C and yields Benford's law. The "intuitive" distribution in which each box has an equal probability of holding any digit n (n balls) does not mean an equal probability for any single ball to be in any box, but an equal probability for...