Intermediate Information Structures


Modified from Maria's lectures
CPSC 335 Intermediate Information Structures
LECTURE 11: Compression and Huffman Coding
Jon Rokne, Computer Science, University of Calgary, Canada

Lecture Overview
--Codes and Optimal Codes
--Huffman Coding
--Non-determinism of the algorithm
--Implementations: singly-linked list, doubly-linked list, recursive top-down, using a heap
--Adaptive Huffman coding

CODES

A fixed-length code for the letters A through H uses three bits per symbol:

A 000    C 010    E 100    G 110
B 001    D 011    F 101    H 111

With this code, the message BACADAEAFABBAAAGAH is encoded as the string of 54 bits

001000010000011000100000101000001001000000000110000111

It is sometimes advantageous to use variable-length codes, in which different symbols may be represented by different numbers of bits. For example, Morse code does not use the same number of dots and dashes for each letter of the alphabet. In particular, E, the most frequent letter, is represented by a single dot. In general, if our messages are such that some symbols appear very frequently and some very rarely, we can encode data more efficiently (i.e., using fewer bits per message) if we assign shorter codes to the frequent symbols. Consider the following alternative code for the letters A through H:

A 0      C 1010   E 1100   G 1110
B 100    D 1011   F 1101   H 1111

With this code, the same message as above is encoded as the string

100010100101101100011010100100000111001111

This string contains 42 bits, so it saves more than 20% in space in comparison with the fixed-length code shown above.
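The two bit counts above can be checked mechanically. The following is a minimal Python sketch; the names FIXED, VARIABLE, encode and decode are illustrative and do not come from the lecture. The decoder relies on the variable-length code being a prefix code, so a codeword can be emitted as soon as the buffered bits match one.

```python
# Code tables copied from the slide; all names here are illustrative.
FIXED = {"A": "000", "B": "001", "C": "010", "D": "011",
         "E": "100", "F": "101", "G": "110", "H": "111"}
VARIABLE = {"A": "0", "B": "100", "C": "1010", "D": "1011",
            "E": "1100", "F": "1101", "G": "1110", "H": "1111"}

MESSAGE = "BACADAEAFABBAAAGAH"


def encode(message, code):
    """Concatenate the codeword of every symbol in the message."""
    return "".join(code[symbol] for symbol in message)


def decode(bits, code):
    """Decode a prefix code by buffering bits until they match a codeword."""
    inverse = {word: symbol for symbol, word in code.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            decoded.append(inverse[buffer])
            buffer = ""
    return "".join(decoded)


fixed_bits = encode(MESSAGE, FIXED)
variable_bits = encode(MESSAGE, VARIABLE)
print(len(fixed_bits), len(variable_bits))       # 54 42
print(decode(variable_bits, VARIABLE) == MESSAGE)  # True
```

The saving is 12 bits out of 54, or about 22%.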

Optimal codes

Huffman Coding

The algorithm assigns a codeword to each character in the text according to the character's frequency. The codeword is usually represented as a bit string. The algorithm starts with a set of individual trees, each consisting of a single node, sorted in order of increasing character probability. The two trees with the smallest probabilities are selected and made the left and right subtrees of a new parent node, whose probability is the sum of theirs. This step is repeated until a single tree remains. In the end, 0 is assigned to every left branch of the tree, 1 to every right branch, and the codewords for all leaves (characters) of the tree are generated.
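The description above can be sketched in a few lines of Python. This is only an illustration, not the lecture's own implementation: the function name huffman_codes, the tuple representation of trees, and the tie-breaking counter are all assumptions. The forest is kept as a plain sorted list, which is simple but not the most efficient choice.

```python
from collections import Counter


def huffman_codes(freqs):
    """Return {symbol: bit string} for a dict mapping symbols to frequencies.

    A tree is either a symbol (leaf) or a (left, right) tuple, so symbols
    themselves are assumed not to be tuples.
    """
    # Each forest entry is (weight, tie_breaker, tree); the unique
    # tie_breaker keeps Python from ever comparing two trees directly.
    forest = sorted((w, i, sym) for i, (sym, w) in enumerate(freqs.items()))
    next_id = len(forest)
    while len(forest) > 1:
        w1, _, t1 = forest.pop(0)          # smallest weight
        w2, _, t2 = forest.pop(0)          # second smallest weight
        forest.append((w1 + w2, next_id, (t1, t2)))
        next_id += 1
        forest.sort(key=lambda entry: entry[:2])

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")    # left branch
            walk(tree[1], prefix + "1")    # right branch
        else:
            codes[tree] = prefix or "0"    # single-symbol alphabet

    walk(forest[0][2], "")
    return codes


message = "BACADAEAFABBAAAGAH"
codes = huffman_codes(Counter(message))
total_bits = sum(len(codes[symbol]) for symbol in message)
print(total_bits)  # 42
```

Different tie-breaking rules can produce different codewords, but the total encoded length stays the same.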

6 pages copied from Cormen et al.

Huffman tree building exercise

Huffman Code Construction (example from UWisc)

Character count in text:

Char  Freq
E     125
T      93
A      80
O      76
I      73
N      71
S      65
R      61
H      55
L      41
D      40
C      31
U      27

At each step the two trees with the smallest weights are removed from the sorted list and merged, and the weight of the new tree is inserted back into the list:

Step  Left subtree    Right subtree   New weight
 1    C (31)          U (27)           58
 2    D (40)          L (41)           81
 3    C+U (58)        H (55)          113
 4    R (61)          S (65)          126
 5    N (71)          I (73)          144
 6    A (80)          O (76)          156
 7    D+L (81)        T (93)          174
 8    E (125)         113             238
 9    126             144             270
10    156             174             330
11    270             238             508
12    330             508             838

Huffman Code Construction

[Figure: the completed Huffman tree, with 0 on each left branch and 1 on each right branch.]

Char   Freq  Fixed  Huff
E      125   0000   110
T       93   0001   011
A       80   0010   000
O       76   0011   001
I       73   0100   1011
N       71   0101   1010
S       65   0110   1001
R       61   0111   1000
H       55   1000   1111
L       41   1001   0101
D       40   1010   0100
C       31   1011   11100
U       27   1100   11101
Total  838   4.00   3.62   (average bits per character)
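The averages in the last row follow directly from the table: each codeword length is weighted by the character's frequency and the sum is divided by the total count. A short Python check, with the table transcribed by hand and all names chosen for illustration:

```python
# (frequency, Huffman codeword) for each character, copied from the table.
TABLE = {
    "E": (125, "110"),  "T": (93, "011"),  "A": (80, "000"),
    "O": (76, "001"),   "I": (73, "1011"), "N": (71, "1010"),
    "S": (65, "1001"),  "R": (61, "1000"), "H": (55, "1111"),
    "L": (41, "0101"),  "D": (40, "0100"), "C": (31, "11100"),
    "U": (27, "11101"),
}

total_chars = sum(freq for freq, _ in TABLE.values())
huffman_bits = sum(freq * len(word) for freq, word in TABLE.values())
fixed_bits = 4 * total_chars   # 13 symbols need 4 bits each in a fixed code

print(total_chars)                           # 838
print(huffman_bits)                          # 3036
print(round(huffman_bits / total_chars, 2))  # 3.62
print(fixed_bits / total_chars)              # 4.0
```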

Non-determinism of the Huffman Coding

For another example, let's encode an excerpt from Michael Jackson's song Bad:

Because I'm bad, I'm bad-- come on
Bad, bad-- really, really bad
You know I'm bad, I'm bad-- you know it
Bad, bad-- really, really bad
You know I'm bad, I'm bad-- come on, you know
Bad, bad-- really, really bad

Thanks to Jeff Boyd, who pointed me to the paper PVRG-MPEG CODEC 1.1 by Andy C. Hung, from which 4 slides have been taken.

The frequency of words in the song Bad.

The Huffman tree for the lyrics to Bad.

The Huffman codes for the words in Bad.

Implementation: Linked List

The implementation depends on how the priority queue is represented. The queue must support removing the two smallest probabilities and inserting the new probability in its proper position. The first way to implement the priority queue is a singly linked list of references to trees, which resembles the algorithm presented in the previous slides. The tree with the smallest probability is replaced by the newly created tree. Among trees with the same probability, the first trees encountered are chosen.

Doubly Linked List

All probability nodes are first ordered, and the first two trees are always removed. The new tree is inserted into its sorted position, searching from the end of the list. A doubly linked list of references to trees, with immediate access to the beginning and to the end of the list, is used.
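As a rough illustration of this variant, the sketch below uses Python's collections.deque as a stand-in for a doubly linked list with access to both ends. It tracks only the tree weights, not the trees themselves, and the function name merge_weights is an assumption made for this example.

```python
from collections import deque


def merge_weights(freqs):
    """Return the weight of each newly created tree, in creation order."""
    trees = deque(sorted(freqs))        # smallest weight at the front
    created = []
    while len(trees) > 1:
        # The first two trees are always the ones removed.
        new_weight = trees.popleft() + trees.popleft()
        created.append(new_weight)
        # Search backward from the end of the list for the sorted position.
        position = len(trees)
        while position > 0 and trees[position - 1] > new_weight:
            position -= 1
        trees.insert(position, new_weight)
    return created


exercise = [125, 93, 80, 76, 73, 71, 65, 61, 55, 41, 40, 31, 27]
print(merge_weights(exercise))
# [58, 81, 113, 126, 144, 156, 174, 238, 270, 330, 508, 838]
```

Searching from the end tends to be short because each new weight is at least as large as the two weights just removed.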

Doubly Linked-List implementation

Recursive Implementation

A top-down approach builds the tree starting from the highest probability. The root probability is known once the lower probabilities in the root's children have been determined; these in turn are known once the probabilities below them have been computed, and so on. Thus, a recursive algorithm can be used.

HEAP

A binary tree has the heap property iff it is empty, or the key in the root is larger than that in either child and both subtrees have the heap property. A binary tree is complete if all its leaves are on the same level or on two adjacent levels, and all nodes at the lowest level are as far to the left as possible.


If we number the nodes from 1 at the root and place:
--the left child of node k at position 2k
--the right child of node k at position 2k+1
then the 'fill from the left' nature of the complete tree ensures that the heap can be stored in consecutive locations in an array.
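The index arithmetic above can be sketched in Python as follows. The list leaves slot 0 unused so that positions start at 1 as on the slide; the helper names and the sample keys are illustrative, and equal keys are allowed here as a simplifying assumption.

```python
def parent(k):
    return k // 2


def left(k):
    return 2 * k


def right(k):
    return 2 * k + 1


def heap_insert(heap, key):
    """Insert key into a 1-based max-heap stored in a list (heap[0] unused)."""
    heap.append(key)                 # fill the next free slot from the left
    k = len(heap) - 1
    # Move the new key up while it is larger than its parent.
    while k > 1 and heap[parent(k)] < heap[k]:
        heap[parent(k)], heap[k] = heap[k], heap[parent(k)]
        k = parent(k)


def is_heap(heap):
    """Check that every node's key is no larger than its parent's key."""
    return all(heap[parent(k)] >= heap[k] for k in range(2, len(heap)))


heap = [None]                        # position 0 is a placeholder
for key in [40, 81, 55, 80, 27]:
    heap_insert(heap, key)
print(heap)           # [None, 81, 80, 55, 40, 27]
print(is_heap(heap))  # True
```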

INSERT into HEAP

Implementation using Heap

A min-heap of probabilities is built, so the smallest probability is at the root. The smallest probability is removed and the heap property is restored, which brings the second-smallest probability to the root. The root probability is then set to the sum of the two smallest probabilities, and the heap property is restored again. The processing is complete when only one node is left in the heap.
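A minimal sketch of the heap-based variant, using Python's standard heapq module as the min-heap. The function name huffman_code_lengths, the tuple representation of trees, and the tie-breaking counter are assumptions made for illustration. The function returns only the codeword length of each symbol, which is enough to compute the size of the encoded text.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: codeword length} for a dict of symbol frequencies.

    A tree is a symbol (leaf) or a (left, right) tuple, so symbols are
    assumed not to be tuples.
    """
    tie = count()   # unique tie-breaker so trees are never compared
    heap = [(weight, next(tie), symbol) for symbol, weight in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # remove the smallest
        w2, _, t2 = heap[0]               # second smallest is now the root
        # Replace the root with the merged tree and restore the heap.
        heapq.heapreplace(heap, (w1 + w2, next(tie), (t1, t2)))

    lengths = {}
    stack = [(heap[0][2], 0)]
    while stack:
        tree, depth = stack.pop()
        if isinstance(tree, tuple):
            stack.append((tree[0], depth + 1))
            stack.append((tree[1], depth + 1))
        else:
            lengths[tree] = depth
    return lengths


freqs = {"E": 125, "T": 93, "A": 80, "O": 76, "I": 73, "N": 71, "S": 65,
         "R": 61, "H": 55, "L": 41, "D": 40, "C": 31, "U": 27}
lengths = huffman_code_lengths(freqs)
print(sum(freqs[s] * lengths[s] for s in freqs))  # 3036
```

Each iteration costs O(log n) heap work, compared with a linear scan or insertion in the list-based versions.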

Huffman implementation with a heap

Huffman Coding for pairs of characters

Adaptive Huffman Coding

Devised by Robert Gallager and improved by Donald Knuth. The algorithm is based on the sibling property: if each node has a sibling, and the breadth-first right-to-left tree traversal generates a list of nodes with nonincreasing frequency counters, then the tree is a Huffman tree. In adaptive Huffman coding, the tree includes a counter for each symbol, updated every time the corresponding symbol is coded. Checking whether the sibling property holds ensures that the tree under construction is a Huffman tree. If the sibling property is violated, the tree is restructured to restore it.
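The sibling property can be tested with a breadth-first traversal that visits the right child before the left one. The sketch below covers only this check, not the tree update of adaptive Huffman coding; the Node class and the function name has_sibling_property are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    weight: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def has_sibling_property(root):
    """Check the sibling property with a right-to-left breadth-first scan."""
    weights = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        weights.append(node.weight)
        children = [c for c in (node.right, node.left) if c is not None]
        if len(children) == 1:
            return False             # a node without a sibling
        queue.extend(children)       # right child is visited first
    # The visited weights must never increase.
    return all(a >= b for a, b in zip(weights, weights[1:]))


good = Node(6, left=Node(2), right=Node(4, left=Node(2), right=Node(2)))
bad = Node(3, left=Node(2), right=Node(1))
print(has_sibling_property(good))  # True
print(has_sibling_property(bad))   # False
```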


Sources

Web links:
--MP3 Converter: http://www.mp3-converter.com/mp3codec/huffman_coding.htm
--Practical Huffman Coding: http://www.compressconsult.com/huffman/

Drozdek Textbook, Chapter 11

Shannon-Fano

In the field of data compression, Shannon-Fano coding, named after Claude Shannon and Robert Fano, is a technique for constructing a prefix code based on a set of symbols and their probabilities (estimated or measured). It is suboptimal in the sense that it does not achieve the lowest possible expected codeword length, as Huffman coding does; however, unlike Huffman coding, it does guarantee that all codeword lengths are within one bit of their theoretical ideal length.

Shannon-Fano Coding

1. For a given list of symbols, develop a corresponding list of probabilities or frequency counts so that each symbol's relative frequency of occurrence is known.
2. Sort the list of symbols according to frequency, with the most frequently occurring symbols at the left and the least common at the right.
3. Divide the list into two parts, with the total frequency count of the left part being as close to the total of the right as possible.
4. The left part of the list is assigned the binary digit 0, and the right part is assigned the digit 1. This means that the codes for the symbols in the first part will all start with 0, and the codes in the second part will all start with 1.
5. Recursively apply steps 3 and 4 to each of the two halves, subdividing groups and adding bits to the codes until each symbol has become a corresponding code leaf on the tree.
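The steps above translate fairly directly into a recursive function. The following Python sketch is one possible reading of them; the function name shannon_fano and the sample frequencies are illustrative and are not taken from the lecture. When two split points are equally balanced, this sketch keeps the leftmost one, which is an arbitrary choice.

```python
def shannon_fano(freqs):
    """Return {symbol: bit string} built by recursive balanced splitting."""
    # Step 2: most frequent symbols first.
    symbols = sorted(freqs, key=freqs.get, reverse=True)
    codes = {symbol: "" for symbol in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(freqs[s] for s in group)
        # Step 3: find the split whose two totals are closest.
        running, best_index, best_diff = 0, 1, None
        for i in range(1, len(group)):
            running += freqs[group[i - 1]]
            diff = abs(total - 2 * running)   # |right total - left total|
            if best_diff is None or diff < best_diff:
                best_index, best_diff = i, diff
        # Step 4: 0 for the left part, 1 for the right part.
        for s in group[:best_index]:
            codes[s] += "0"
        for s in group[best_index:]:
            codes[s] += "1"
        # Step 5: recurse on both parts.
        split(group[:best_index])
        split(group[best_index:])

    split(symbols)
    return codes


sample = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}   # illustrative counts
print(shannon_fano(sample))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
```

For these counts the encoded text takes 89 bits, while a Huffman code for the same counts takes 87, which illustrates the suboptimality mentioned above.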

Shannon-Fano example

References

Shannon, C.E. (July 1948). "A Mathematical Theory of Communication". Bell System Technical Journal 27: 379-423. http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf

Fano, R.M. (1949). "The transmission of information". Technical Report No. 65 (Cambridge (Mass.), USA: Research Laboratory of Electronics at MIT).