Compresion de datos
Guy E. Blelloch Computer Science Department Carnegie Mellon University
blelloch@cs.cmu.edu
October 16, 2001
Contents
1 2 Introduction Information Theory 2.1 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 The Entropy of the English Language . . . . . . . . . . . . . . . . . . . . . . . . 2.3 Conditional Entropyand Markov Chains . . . . . . . . . . . . . . . . . . . . . . . Probability Coding 3.1 Prefix Codes . . . . . . . . . . . . . . . . . 3.1.1 Relationship to Entropy . . . . . . 3.2 Huffman Codes . . . . . . . . . . . . . . . 3.2.1 Combining Messages . . . . . . . . 3.2.2 Minimum Variance Huffman Codes 3.3 Arithmetic Coding . . . . . . . . . . . . . 3.3.1 Integer Implementation . . . . . . .Applications of Probability Coding 4.1 Run-length Coding . . . . . . 4.2 Move-To-Front Coding . . . . 4.3 Residual Coding: JPEG-LS . . 4.4 Context Coding: JBIG . . . . 4.5 Context Coding: PPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 5 5 6 7 9 10 11 12 14 14 15 1821 24 25 25 26 28
3
4
This is an early draft of a chapter of a book I’m starting to write on “algorithms in the real world”. There are surely many mistakes, and please feel free to point them out. In general the Lossless compression part is more polished than the lossy compression part. Some of the text and figures in the Lossy Compression sections are from scribe notes taken by BenLiblit at UC Berkeley. Thanks for many comments from students that helped improve the presentation. c 2000, 2001 Guy Blelloch
1
¡
¢
5
The Lempel-Ziv Algorithms 31 5.1 Lempel-Ziv 77 (Sliding Windows) . . . . . . . . . . . . . . . . . . . . . . . . . . 31 5.2 Lempel-Ziv-Welch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 Other Lossless Compression 36 6.1Burrows Wheeler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Lossy Compression Techniques 7.1 Scalar Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2 Vector Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3 Transform Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 39 41 426 7
8
A Case Study: JPEG and MPEG 42 8.1 JPEG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 8.2 MPEG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 Other Lossy Transform Codes 9.1 Wavelet Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2 Fractal Compression . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . 9.3 Model-Based Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 49 51 54
9
2
1
Introduction
Compression is used just about everywhere. All the images you get on the web are compressed, typically in the JPEG or GIF formats, most modems use compression, HDTV will be compressed using MPEG-2, and several file systemsautomatically compress files when stored, and the rest of us do it by hand. The neat thing about compression, as with the other topics we will cover in this course, is that the algorithms used in the real world make heavy use of a wide set of algorithmic tools, including sorting, hash tables, tries, and FFTs. Furthermore, algorithms with strong theoretical foundations play a critical role in...
Regístrate para leer el documento completo.