Entropy, Coding and Data Compression


Data vs. Information: yes, not, yes, yes, not, not. In ASCII, each item is 3 × 8 = 24 bits of data. But if the only possible answers are "yes" and "not", there is only one bit of information per item.

Compression = Squeezing out the Air. Suppose you want to ship pillows in boxes and are charged by the size of the box. To use as few boxes as possible, squeeze out all the air, pack the pillows into boxes, and fluff them up at the other end. Lossless data compression = the pillows are perfectly restored. Lossy data compression = some damage to the pillows is OK (MP3 is a lossy compression standard for music); loss may be OK if it is below the human perceptual threshold. Entropy is a measure of the limit of lossless compression.

Example: Telegraphy. Source: English letters -> Morse code. The sender in Hokkaido encodes the letter D as -.. ; the receiver in Tokyo decodes -.. back to D.

Coding Messages with Fixed-Length Codes. Example: 4 symbols A, B, C, D: A = 00, B = 01, C = 10, D = 11. In general, with n symbols, codewords need to be of length log2 n, rounded up. For English text, 26 letters + space = 27 symbols, so the length is 5, since 2^4 < 27 < 2^5 (replace all punctuation marks by space).
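A quick numeric check of the fixed-length bound (a minimal Python sketch; the symbol counts are the ones from the examples above):

```python
import math

def fixed_length_bits(num_symbols: int) -> int:
    """Bits per codeword needed for a fixed-length code over num_symbols symbols."""
    return math.ceil(math.log2(num_symbols))

print(fixed_length_bits(4))   # 4 symbols (A, B, C, D) -> 2 bits
print(fixed_length_bits(27))  # 26 letters + space     -> 5 bits, since 2**4 < 27 <= 2**5
```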

Modeling the Message Source. [Diagram: Source → Destination.] The characteristics of the stream of messages coming from the source affect the choice of coding method. We need a model for a source of English text that can be described and analyzed mathematically.

Uniquely Decodable Codes. If any encoded string has only one possible source string producing it, then we have unique decodability. An example of a uniquely decodable code is the prefix code.

Prefix Coding. A prefix code is defined as a code in which no codeword is the prefix of some other codeword. A prefix code is uniquely decodable. [Table: candidate Codes A, B and C for the source symbols s0-s3, illustrating uniquely decodable codes.]
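Whether a set of codewords has the prefix property can be checked mechanically. A minimal sketch, with illustrative codewords (not the ones from the slide's table):

```python
def is_prefix_code(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)  # after sorting, a codeword that is a prefix of another is immediately followed by such an extension
    return all(not words[i + 1].startswith(words[i]) for i in range(len(words) - 1))

print(is_prefix_code(["0", "10", "110", "111"]))  # True: a prefix code
print(is_prefix_code(["0", "01", "011", "111"]))  # False: "0" is a prefix of "01"
```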

Decoding of a Prefix Code. [Figure: decision tree for Code B with an initial state; each received bit moves one step down the tree, reaching a leaf emits the corresponding source symbol s0-s3, and decoding then restarts from the initial state.] Example: the slide traces a sample bit string through the tree, producing a five-symbol sequence.

Prefix Codes. There is only one way to decode, left to right, when the message is received. Example: symbols A, B, C, D with probabilities 0.7, 0.1, 0.1, 0.1 and their prefix codewords; the received message is decoded codeword by codeword.

Prefix Codes, Example 2. Source symbols A, B, C, D and their Code E codewords c_k. Is Code E a prefix code? No. Why not? The codeword of D is a prefix of the codeword of C.

Average Code Length. Information source emits symbols s_k; the source encoder outputs codewords c_k. The source has K symbols, each symbol s_k has probability p_k, and each symbol s_k is represented by a codeword c_k of length l_k bits. Average codeword length: $\bar{L} = \sum_{k=0}^{K-1} p_k l_k$.

Shannon's First Theorem: The Source Coding Theorem. $\bar{L} \ge H(S)$: the outputs of an information source cannot be represented by a source code whose average length is less than the source entropy.

Average Code Length Example. Symbols A, B, C, D with probabilities 0.7, 0.1, 0.1, 0.1. Average bits per symbol: $\bar{L} = 0.7 \cdot 1 + 0.1 \cdot 3 + 0.1 \cdot 3 + 0.1 \cdot 3 = 1.6$ bits/symbol (down from 2). Another prefix code for A, B, C, D is better: $\bar{L} = 0.7 \cdot 1 + 0.1 \cdot 2 + 0.1 \cdot 3 + 0.1 \cdot 3 = 1.5$.

Source Entropy Examples: Robot Example. A 4-way random walk with prob(x = S) = 1/2, prob(x = N) = 1/4, prob(x = E) = prob(x = W) = 1/8. $H(X) = \frac{1}{2}\log_2 2 + \frac{1}{4}\log_2 4 + \frac{1}{8}\log_2 8 + \frac{1}{8}\log_2 8 = 1.75$ bits per symbol. [Diagram: compass directions N, S, E, W.]
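A short sketch verifying the random-walk entropy from the probabilities above:

```python
import math

def entropy(probs):
    """Entropy in bits of a discrete distribution."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

walk = {"S": 1/2, "N": 1/4, "E": 1/8, "W": 1/8}
print(entropy(walk.values()))  # 1.75 bits per symbol
```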

Source Entropy Examples: Robot Example. Symbols S, N, E, W with probabilities p_k = 0.5, 0.25, 0.125, 0.125; each symbol gets a 2-bit fixed-length codeword, or a variable-length codeword of length 1, 2, 3, 3 bits respectively. Symbol stream: S S N W S E N N N W S S S N E S S. Fixed length: 32 bits; variable length: 28 bits; 4 bits of savings achieved by the VLC (redundancy eliminated).

Entropy, Compressibility, Redundancy. Lower entropy <=> more redundant <=> more compressible; higher entropy <=> less redundant <=> less compressible. A source of "yes"s and "not"s takes 24 bits per symbol but contains at most one bit per symbol of information.

Entropy and Compression. First-order entropy is the theoretical minimum on code length when only symbol frequencies are taken into account. For A, B, C, D with probabilities 0.7, 0.1, 0.1, 0.1: $\bar{L} = 0.7 \cdot 1 + 0.1 \cdot 2 + 0.1 \cdot 3 + 0.1 \cdot 3 = 1.5$, while the first-order entropy is about 1.357 bits/symbol. The first-order entropy of English is about 4 bits/character, based on typical English texts.

Bits. You are watching a set of independent random samples of X. You see that X has four possible values, with P(X=A) = P(X=B) = P(X=C) = P(X=D) = 1/4. So you might see the output BAACBADCDADDDA. You transmit the data over a binary serial link; you can encode each reading with two bits (e.g. A = 00, B = 01, C = 10, D = 11), i.e. 2 bits on average per symbol.

Fewer Bits. Someone tells you that the probabilities are not equal: P(X=A) = 1/2, P(X=B) = 1/4, P(X=C) = 1/8, P(X=D) = 1/8. Is it possible to invent a coding for your transmission that only uses 1.75 bits on average per symbol? How?

Fewer Bits. Someone tells you that the probabilities are not equal: P(X=A) = 1/2, P(X=B) = 1/4, P(X=C) = 1/8, P(X=D) = 1/8. It is possible to invent a coding for your transmission that only uses 1.75 bits on average per symbol. How? For example A = 0, B = 10, C = 110, D = 111 (this is just one of several ways).
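A small sketch that encodes a simulated stream with this code and checks the cost; the codeword table is the example code above, and the stream length is an arbitrary choice:

```python
import random

code = {"A": "0", "B": "10", "C": "110", "D": "111"}   # prefix code matched to the probabilities
probs = {"A": 1/2, "B": 1/4, "C": 1/8, "D": 1/8}

random.seed(0)
stream = random.choices(list(probs), weights=list(probs.values()), k=100_000)
encoded = "".join(code[s] for s in stream)

print(len(encoded) / len(stream))                      # close to 1.75 bits per symbol
print(sum(p * len(code[s]) for s, p in probs.items())) # exactly 1.75 (expected length)
```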

Fewer Bits. Suppose there are three equally likely values: P(X=A) = P(X=B) = P(X=C) = 1/3. Here is a naïve coding, costing 2 bits per symbol: A = 00, B = 01, C = 10. Can you think of a coding that would need only 1.6 bits per symbol on average? In theory, it can in fact be done with 1.58496 bits per symbol.
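One standard way to approach log2(3) ≈ 1.585 bits per symbol is block coding: since 3^5 = 243 ≤ 256 = 2^8, five ternary symbols fit in one byte, giving 8/5 = 1.6 bits per symbol. A sketch of this packing (one possible construction, not necessarily the one intended on the slide):

```python
import math

def pack5(block):
    """Pack 5 symbols from {'A', 'B', 'C'} into one byte via base-3, since 3**5 = 243 <= 256."""
    digits = {"A": 0, "B": 1, "C": 2}
    value = 0
    for s in block:
        value = value * 3 + digits[s]
    return value  # an integer in 0..242, i.e. one byte

print(pack5("ABCCA"))  # 8 bits for 5 symbols -> 1.6 bits/symbol
print(math.log2(3))    # theoretical limit: ~1.58496 bits/symbol
```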

Kraft-McMillan Inequality: $\sum_{k=0}^{K-1} 2^{-l_k} \le 1$. If the codeword lengths of a code satisfy the Kraft-McMillan inequality, then a prefix code with these codeword lengths can be constructed. For Code D, whose codewords for s0, s1, s2, s3 have lengths 1, 2, 3 and 2: $2^{-1} + 2^{-2} + 2^{-3} + 2^{-2} = 9/8 > 1$. This means that Code D IS NOT A PREFIX CODE.

Use of the Kraft-McMillan Inequality. We may use it when the number of symbols is large, so that we cannot simply judge by inspection whether a given code is a prefix code or not. What the Kraft-McMillan inequality CAN do: it can determine that a given code IS NOT a prefix code, and it can establish that a prefix code COULD be constructed from a given set of codeword lengths. What the Kraft-McMillan inequality CANNOT do: it cannot guarantee that a given code is indeed a prefix code.
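A minimal sketch of the check; the first set of lengths is Code D's from the previous slide, the second is an illustrative set that satisfies the inequality:

```python
def kraft_sum(lengths):
    """Kraft-McMillan sum: a prefix code with these lengths exists iff the sum is <= 1."""
    return sum(2 ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 2]))  # Code D: 9/8 > 1, so it cannot be a prefix code
print(kraft_sum([1, 2, 3, 3]))  # 1.0 <= 1, so a prefix code with these lengths exists
```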

Example: Code E assigns the symbols s0, s1, s2, s3 codewords of lengths 1, 3, 3 and 2. For Code E, $2^{-1} + 2^{-2} + 2^{-3} + 2^{-3} = 1 \le 1$, so the lengths satisfy the Kraft-McMillan inequality. Is Code E a prefix code? NO. Why? The codeword of s3 is a prefix of the codeword of s2.

Code Efficiency: $\eta = \dfrac{H(S)}{\bar{L}}$. An efficient code is one with $\eta$ close to 1.

Examples. Source symbols s0, s1, s2, s3 with probabilities 1/2, 1/4, 1/8, 1/8. Source entropy: $H(S) = \frac{1}{2}\log_2 2 + \frac{1}{4}\log_2 4 + \frac{1}{8}\log_2 8 + \frac{1}{8}\log_2 8 = 7/4$ bits/symbol. Code I assigns every symbol a 2-bit codeword: $\bar{L} = 2$, $\eta = \frac{7/4}{2} = 0.875$. Code II assigns codewords of lengths 1, 2, 3 and 3: $\bar{L} = 1\cdot\frac{1}{2} + 2\cdot\frac{1}{4} + 3\cdot\frac{1}{8} + 3\cdot\frac{1}{8} = \frac{7}{4}$, $\eta = \frac{7/4}{7/4} = 1$.
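The two efficiencies can be reproduced directly from the definitions; the probabilities and codeword lengths below are those of Codes I and II above:

```python
import math

probs = [1/2, 1/4, 1/8, 1/8]
entropy = sum(p * math.log2(1 / p) for p in probs)  # H(S) = 1.75 bits/symbol

code_I_lengths = [2, 2, 2, 2]                       # fixed-length code
code_II_lengths = [1, 2, 3, 3]                      # variable-length code

for lengths in (code_I_lengths, code_II_lengths):
    avg = sum(p * l for p, l in zip(probs, lengths))
    print(avg, entropy / avg)                       # (2.0, 0.875) and (1.75, 1.0)
```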

For a Prefix Code, Shannon's First Theorem gives $H(S) \le \bar{L} < H(S) + 1$, with $\bar{L} = H(S)$ if $p_k = 2^{-l_k}$ for some integer $l_k$ for every $k$. What is the efficiency? $\eta = 1$ if every $p_k$ is a negative power of 2; otherwise $\eta < 1$. However, we may increase efficiency by extending the source.

Increasing Efficiency by Source Extension. By extending the source we may potentially increase efficiency; the drawback is increased decoding complexity. For the n-th extension $S^n$: $H(S^n) \le \bar{L}_n < H(S^n) + 1$, and since $H(S^n) = nH(S)$, $nH(S) \le \bar{L}_n < nH(S) + 1$, i.e. $H(S) \le \frac{\bar{L}_n}{n} < H(S) + \frac{1}{n}$, so $\eta_n = \frac{nH(S)}{\bar{L}_n} \to 1$ as $n \to \infty$.

Extension of a Discrete Memoryless Source: treat blocks of n successive symbols as single symbols. Information source: $S = \{s_0, s_1, \ldots, s_{K-1}\}$ with $\Pr\{s_k\} = p_k$, $k = 0, 1, \ldots, K-1$, and $\sum_{k=0}^{K-1} p_k = 1$. Extended information source: $S^n = \{\sigma_0, \sigma_1, \ldots, \sigma_{K^n-1}\}$ with $\Pr\{\sigma_i\} = q_i$, $i = 0, 1, \ldots, K^n - 1$, and $\sum_{i=0}^{K^n-1} q_i = 1$.

Example 2. S = {s0, s1, s2}, p0 = 1/4, p1 = 1/4, p2 = 1/2. H(S) = (1/4)log2(4) + (1/4)log2(4) + (1/2)log2(2) = 3/2 bits. Second-order extended source: the symbols of S^2 are σ0, σ1, ..., σ8, corresponding to the sequences s0s0, s0s1, s0s2, s1s0, s1s1, s1s2, s2s0, s2s1, s2s2, with probabilities 1/16, 1/16, 1/8, 1/16, 1/16, 1/8, 1/8, 1/8, 1/4. By computing: H(S^2) = 3 bits.
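For a memoryless source the pair probabilities are products of the marginals, so H(S^2) = 2H(S). A short numerical check with the probabilities above:

```python
import math
from itertools import product

p = {"s0": 1/4, "s1": 1/4, "s2": 1/2}

H1 = sum(q * math.log2(1 / q) for q in p.values())
# Pairs of independent symbols: the probability of a pair is the product of the marginals.
H2 = sum(p[a] * p[b] * math.log2(1 / (p[a] * p[b])) for a, b in product(p, repeat=2))

print(H1, H2)  # 1.5 and 3.0 bits
```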

Example 3. Calculate the entropy of the English language if: 1. all alphabet letters are equally probable; 2. for a, e, o, t, P{s_k} = 0.1; for h, i, n, r, s, P{s_k} = 0.07; for c, d, f, l, m, p, u, y, P{s_k} = 0.02; for b, g, j, k, q, v, w, x, z, P{s_k} = 0.01. Answers: 1. H(S) = log2 26 ≈ 4.7 bits; 2. H(S) ≈ 4.17 bits.

Source Encoding: efficient representation of information sources. Source coding requirements: uniquely decodable codes; prefix codes (no codeword is a prefix of some other codeword). Code efficiency: $\eta = H(S)/\bar{L}$. Kraft's inequality: $\sum_{k=0}^{K-1} 2^{-l_k} \le 1$. Source coding theorem: $H(S) \le \bar{L} < H(S) + 1$.

Source Coding Techniques: 1. Huffman Code. 2. Two-pass Huffman Code. 3. Lempel-Ziv Code. 4. Shannon Code. 5. Fano Code. 6. Arithmetic Code.


Source Coding Techniques: 1. Huffman Code. With the Huffman code, in the binary case, the two least probable source output symbols are joined together, resulting in a new message alphabet with one fewer symbol.

Huffman Coding: Example. Compute the Huffman code for the source shown: five symbols s0-s4 whose probabilities p_k are 0.4 (for s2), 0.2, 0.2, 0.1 and 0.1. $H(S) = 0.4\log_2(1/0.4) + 2\,(0.2\log_2(1/0.2)) + 2\,(0.1\log_2(1/0.1)) \approx 2.122$ bits.

Solution A: list the symbols in order of decreasing probability (Stage I: 0.4, 0.2, 0.2, 0.1, 0.1). At each stage, add the two lowest probabilities and move the combined entry as high as possible in the re-sorted list: Stage II: 0.4, 0.2, 0.2, 0.2; Stage III: 0.4, 0.4, 0.2; Stage IV: 0.6, 0.4. Assigning a 0 and a 1 to the pair combined at each stage and reading back from the last stage to the first gives the codewords.

Solution A (cont'd). The symbol with probability 0.4 and the two symbols with probability 0.2 get 2-bit codewords; the two symbols with probability 0.1 get 3-bit codewords. $H(S) \approx 2.122$ bits and $\bar{L} = 0.4\cdot2 + 0.2\cdot2 + 0.2\cdot2 + 0.1\cdot3 + 0.1\cdot3 = 2.2$ bits/symbol, so $H(S) \le \bar{L} < H(S) + 1$. THIS IS NOT THE ONLY SOLUTION!

Alternate Solution B: the same merging stages (Stage I: 0.4, 0.2, 0.2, 0.1, 0.1 down to Stage IV: 0.6, 0.4), but with the combined entry placed as low as possible at each stage, which yields codeword lengths 1, 2, 3, 4 and 4.

Alternative Solution B (cont'd). Symbol probabilities 0.4, 0.2, 0.2, 0.1, 0.1 with codewords of lengths 1, 2, 3, 4 and 4. $H(S) \approx 2.122$ bits and $\bar{L} = 0.4\cdot1 + 0.2\cdot2 + 0.2\cdot3 + 0.1\cdot4 + 0.1\cdot4 = 2.2$ bits/symbol, so again $H(S) \le \bar{L} < H(S) + 1$.

What is the difference between the two solutions? They have the same average length, but they differ in the variance of the codeword length, $\sigma^2 = \sum_{k=0}^{K-1} p_k (l_k - \bar{L})^2$: Solution A has $\sigma^2 = 0.16$, Solution B has $\sigma^2 = 1.36$. The solution with the smaller variance is usually preferred.
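A compact Huffman construction for this example, using a heap. The probability values are the ones on the slides; assigning 0.4 to s2 and the remaining 0.2/0.1 values to the other symbols is an assumption for illustration, and the bit patterns (and, as Solutions A and B show, even the lengths) are not unique — this heap-based sketch happens to produce lengths 2, 2, 2, 3, 3 and the average of 2.2 bits/symbol:

```python
import heapq
from itertools import count

def huffman(probs):
    """Build a binary Huffman code; returns a dict symbol -> codeword."""
    tick = count()  # tie-breaker so the heap never compares dicts
    heap = [(p, next(tick), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # the two least probable groups...
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tick), merged))  # ...are joined into one
    return heap[0][2]

probs = {"s0": 0.2, "s1": 0.2, "s2": 0.4, "s3": 0.1, "s4": 0.1}
code = huffman(probs)
print(code)
print(sum(probs[s] * len(w) for s, w in code.items()))  # 2.2 bits/symbol (up to float rounding)
```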

Source Coding Techniques: 1. Huffman Code. 2. Two-pass Huffman Code. 3. Lempel-Ziv Code. 4. Shannon Code. 5. Fano Code. 6. Arithmetic Code.

Source Coding Techniques: 2. Two-pass Huffman Code. This method is used when the symbol probabilities of the information source are unknown. We first estimate the probabilities by counting the occurrences of each symbol in the given message, and then construct the possible Huffman codes from these estimates. This can be summarized in two passes. Pass 1: measure the frequency of each character in the message. Pass 2: build the possible Huffman codes.

Source Coding Techniques: 2. Two-pass Huffman Code. Example: consider the input ABABABABABACADABACADABACADABACAD.
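For this input, pass 1 is just a frequency count; the estimated probabilities turn out to be 1/2, 1/4, 1/8, 1/8, so the Huffman codeword lengths produced in pass 2 are 1, 2, 3, 3. A sketch of pass 1 and the resulting bit count (the lengths are filled in by hand here rather than re-running the Huffman construction):

```python
from collections import Counter

message = "ABABABABABACADABACADABACADABACAD"
counts = Counter(message)
total = len(message)

print(counts)  # Counter({'A': 16, 'B': 8, 'C': 4, 'D': 4}) -> probabilities 1/2, 1/4, 1/8, 1/8
# These are powers of two, so the Huffman codeword lengths are 1, 2, 3, 3 bits:
lengths = {"A": 1, "B": 2, "C": 3, "D": 3}
bits = sum(counts[s] * lengths[s] for s in counts)
print(bits, "bits vs", 2 * total, "bits with a 2-bit fixed-length code")  # 56 vs 64
```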

Source Coding Techniques: 1. Huffman Code. 2. Two-pass Huffman Code. 3. Lempel-Ziv Code. 4. Shannon Code. 5. Fano Code. 6. Arithmetic Code.

Lempel-Ziv Coding. Huffman coding requires knowledge of a probabilistic model of the source, which is not always feasible. The Lempel-Ziv code is an adaptive coding technique that does not require prior knowledge of the symbol probabilities. Lempel-Ziv coding is the basis of the well-known ZIP format for data compression.

Lempel-Ziv Coding Example. Information bits (input sequence): 000101110010100101. The encoder parses the sequence into subsequences not seen before and builds a codebook, one entry at a time: Index 1-9; Subsequences 0, 1, 00, 01, 011, 10, 010, 100, 101. Each new subsequence is a previously stored subsequence followed by one innovation bit, so entries 3-9 have the numerical representations 11, 12, 42, 21, 41, 61, 62 (pointer to the prefix, pointer to the last bit) and the source-encoded blocks 0010, 0011, 1001, 0100, 1000, 1100, 1101 (3-bit pointer to the prefix followed by the innovation bit).
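A sketch of the parsing step, assuming the convention used above: the codebook is pre-loaded with the single bits 0 and 1, and each new subsequence is emitted as (index of its known prefix, innovation bit), which corresponds to the slide's numerical representation. The input is the reconstructed example sequence.

```python
def lz_parse(bits: str):
    """LZ parsing: the codebook is seeded with '0' and '1';
    every new subsequence is a known subsequence plus one innovation bit."""
    codebook = {"0": 1, "1": 2}
    output, phrase = [], ""
    for b in bits:
        if phrase + b in codebook:
            phrase += b                           # still a known subsequence, keep extending
        else:
            output.append((codebook[phrase], b))  # (index of known prefix, innovation bit)
            codebook[phrase + b] = len(codebook) + 1
            phrase = ""
    # Note: a trailing subsequence that is already in the codebook would be dropped here;
    # that does not occur for this input.
    return output, codebook

pairs, book = lz_parse("000101110010100101")
print(book)   # subsequences 00, 01, 011, 10, 010, 100, 101 get indices 3..9
print(pairs)  # [(1,'0'), (1,'1'), (4,'1'), (2,'0'), (4,'0'), (6,'0'), (6,'1')]
```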

How Come this is Compression?! The hope is: if the bit sequence is long enough, eventually the fixed-length codewords will be shorter than the subsequences they represent. When applied to English text, Lempel-Ziv achieves a compression of approximately 55%, while Huffman coding achieves approximately 43%.