Lempelziv

Páginas: 18 (4355 palabras) Publicado: 30 de septiembre de 2012
IEEE

TRANSACTIONS

ON

INFORMATION

THEORY,

VOL.

IT-23, NO. 3,

MAY

1977

337

A Universal Algorithm for Sequential Data Compression
JACOB ZIV, FELLOW,
IEEE, AND

ABRAHAM

LEMPEL,

MEMBER,

IEEE

Abstract-A universal algorithm for sequential data compression is presented. Its performance is investigated with respect to a nonprobabilistic model of constrainedsources. The compression ratio achieved by the proposed universal code uniformly approaches the lower bounds on the compression ratios attainable by block-to-variable codes and variable-to-block codes designed to match a completely specified source.

I. INTRODUCTION N MANY situations arising in digital communications and data processing, the encountered strings of data display various structuralregularities or are otherwise subject to certain constraints, thereby allowing for storage and time-saving techniques of data compression. Given a discrete data source, the problem of data compression is first to identify the limitations of the source, and second to devise a coding scheme which, subject to certain performance criteria, will best compress the given source. Once the relevant sourceparameters have been identified, the problem reduces to one of minimum-redundancy coding. This phase of the problem has received extensive treatment in the literature [l]-[7]. When no a priori knowledge of the source characteristics is available, and if statistical tests are either impossible or unreliable, the problem of data compression becomes considerably more complicated. In order to overcomethese difficulties one must resort to universal coding schemes whereby the coding process is interlaced with a learning process for the varying source characteristics [8], [9]. Such coding schemes inevitably require a larger working memory space and generally employ performance criteria that are appropriate for a wide variety of sources. In this paper, we describe a universal coding scheme whichcan be applied to any discrete source and whose performance is comparable to certain optimal fixed code book schemes designed for completely specified sources. For lack of adequate criteria, we do not attempt to rank the proposed scheme with respect to other possible universal coding schemes. Instead, for the broad class of sources defined in Section III, we derive upper bounds on the compressionefficiency attainable with full a priori knowledge of the source by fixed code book schemes, and
Manuscript received June 23, 1975; revised July 6, 1976. Paper previously presented at the IEEE International Symposium on Information Theory, Ronneby, Sweden, June 21-24,1976. J. Ziv was with the Department of Electrical Engineering, TechnionIsrael Institute of Technology, Haifa, Israel. He is nowwith the Bell Telephone Laboratories, Murray Hill, NJ 07974. A. Lempel was with the Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel. He is now with the Sperry Research Center, Sudbury, M A 01776.

I

then show that the efficiency of our universal code with no a priori knowledge of the source approaches those bounds. The proposed compression algorithmis an adaptation of a simple copying procedure discussed recently [lo] in a study on the complexity of finite sequences. Basically, we employ the concept of encoding future segments of the source-output via maximum-length copying from a buffer containing the recent past output. The transmitted codeword consists of the buffer address and the length of the copied segment. W ith a predeterminedinitial load of the buffer and the information contained in the codewords, the source data can readily be reconstructed at the decoding end of the process. The main drawback of the proposed algorithm is its susceptibility to error propagation in the event of a channel error. II. THE
COMPRESSION ALGORITHM

The proposed compression algorithm consists of a rule for parsing strings of symbols from a...
Leer documento completo

Regístrate para leer el documento completo.

Conviértase en miembro formal de Buenas Tareas

INSCRÍBETE - ES GRATIS