
Communication Theory II
Lecture 13: Information Theory (cont'd)
Ahmed Elnakib, PhD
Assistant Professor, Mansoura University, Egypt
March 22nd, 2015

Lecture Outlines
o Source Code Generation
o Source Coding Theorem
o Lossless Data Compression Algorithms
o Prefix Coding
o Huffman Coding
o Lempel-Ziv Coding

Source Encoder: How to design?

Source Code Generation: Source Encoder
o How do we represent data generated by a discrete source of information?
  Process: source encoding
  Device: source encoder
o Requirements:
  The codeword produced by the encoder is in binary form
  The codeword is represented with the minimum number of bits (as low as H(S))
o Perfect source encoder: the source code is uniquely decodable
  The original source sequence can be reconstructed perfectly from the encoded binary sequence

Shannon's 1st Theorem: Source-Encoding Theorem
o Given a discrete memoryless source whose output is denoted by the random variable S,
o the entropy H(S) imposes the following bound on the average codeword length $\bar{L}$ for any source encoding scheme:
  $$\bar{L} \ge H(S)$$
o According to this theorem, the entropy H(S) represents a fundamental limit on the average number of bits per source symbol necessary to represent a discrete memoryless source.

When is $\bar{L} = H(S)$? (the optimal value)
o If each time we transmit a sequence ($S^N$) of N messages (N-th order extended source),
  $$H(m) = E[I_i] = \sum_{i=1}^{n} P_i I_i = -\sum_{i=1}^{n} P_i \log_2(P_i) \ \text{bits}$$

Average Codeword Length of a Source Encoder
[Figure from the slide not reproduced in the transcription]
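Since the slide's figure is unavailable, the quantity named in the title is restated here from its conventional definition, as a reference rather than as the slide's own content:
$$\bar{L} = \sum_{i=1}^{n} P_i L_i \ \text{bits/symbol}$$
where $P_i$ is the probability of the i-th source symbol and $L_i$ is the length of the codeword assigned to it.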

Efficiency/Redundancy of a Source Encoder
o The coding efficiency is defined as
  $$\eta = \frac{L_{\min}}{\bar{L}} = \frac{H(m)}{\bar{L}}$$
o The redundancy $\gamma$ is defined as
  $$\gamma = 1 - \eta$$
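As a small illustration of these definitions, the sketch below computes the entropy, average codeword length, efficiency, and redundancy for a hypothetical source and an assumed set of codeword lengths (none of these numbers come from the lecture):

```python
from math import log2

def source_stats(probs, lengths):
    """Entropy, average codeword length, efficiency, and redundancy."""
    H = -sum(p * log2(p) for p in probs)            # H(m), bits/symbol
    L = sum(p * l for p, l in zip(probs, lengths))  # average codeword length
    eta = H / L                                     # efficiency
    gamma = 1 - eta                                 # redundancy
    return H, L, eta, gamma

# Hypothetical source with dyadic probabilities and a matching prefix code;
# here the average length equals H(m), so the efficiency is 1 and the redundancy is 0.
print(source_stats([0.5, 0.25, 0.125, 0.125], [1, 2, 3, 3]))
```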

Lossless Data Compression
o A common characteristic of signals generated by physical sources is that, in their natural form, they contain a significant amount of redundant information
  E.g., a record of business transactions constitutes a redundant sequence in the sense that any two adjacent symbols are typically correlated with each other
o Lossless data compression: an operation performed on a digital signal that removes redundant information from the signal prior to transmission (with no loss of information)
  It produces an output code that efficiently represents the source with the minimum average number of bits per symbol
  The original data can be reconstructed with no loss of information
o Limits of lossless data compression: the entropy of the source establishes the fundamental limit on the removal of redundancy from the data, $\bar{L} \ge H(S)$
o Procedure: assign short descriptions to the most frequent outcomes of the source output and longer descriptions to the less frequent ones

Types of Source Codes: Prefix Codes
A prefix code is a code in which no codeword is the prefix of any other codeword.
Example: which of the following represents a prefix code?
[Table from the slide: source symbol, probability of occurrence, Code I, Code II, Code III; entries not reproduced in the transcription]
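Since the table entries are not reproduced, here is a minimal sketch of how the prefix-free property can be checked; the two candidate codes are hypothetical examples, not the slide's Code I-III:

```python
def is_prefix_code(codewords):
    """True if no codeword is a prefix of any other codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)

# Hypothetical candidate codes (not the table from the slide)
print(is_prefix_code(["0", "10", "110", "111"]))   # True:  a prefix code
print(is_prefix_code(["0", "01", "011", "0111"]))  # False: "0" is a prefix of "01"
```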

Properties of Prefix Codes
o Uniquely decodable: the end of each codeword is always recognizable
o An instantaneous code (why?)
o Offers the possibility of realizing an average codeword length that can be made arbitrarily close to the source entropy, where $\bar{L}$ represents the average codeword length of the prefix code (the inequality shown on the slide is not reproduced in the transcription)
[Figure from the slide: decision tree for Code II, with leaves s0, s1, s2, s3]
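The missing inequality is most likely the standard prefix-code bound found in textbook treatments; it is restated here as an assumption about what the slide showed:
$$H(S) \le \bar{L} < H(S) + 1$$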

Extended Prefix Codes
o Higher-order extended prefix codes can offer higher efficiency
  For an n-th order extension, the average length per message can be bounded accordingly (the slide's equations are not reproduced in the transcription; see the sketch below)
o The average codeword length of an extended prefix code can be made as small as the entropy of the source, provided that the extended code has a high enough order, in accordance with the source-coding theorem
o The price we have to pay for decreasing the average codeword length is increased decoding complexity, which is brought about by the high order of the extended prefix code
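A short derivation of the standard bound, assuming the prefix-code inequality above and the fact that the entropy of the n-th order extension of a memoryless source is $n\,H(S)$ (this reconstructs what the slide most likely stated, not a verbatim copy of it):
$$H(S^n) \le \bar{L}_n < H(S^n) + 1, \qquad H(S^n) = n\,H(S) \;\;\Longrightarrow\;\; H(S) \le \frac{\bar{L}_n}{n} < H(S) + \frac{1}{n}$$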

Optimal Prefix Codes: Huffman Encoding
Optimal in the sense that the code has the shortest expected length.
Huffman Coding Algorithm
1. Splitting stage: the source symbols are listed in order of decreasing probability.
   o The two source symbols of lowest probability are assigned 0 and 1.
2. These two source symbols are then combined into a new source symbol with probability equal to the sum of the two original probabilities.
   o The list of source symbols, and therefore the source statistics, is thereby reduced in size by one.
   o The probability of the new symbol is placed in the list in accordance with its value.
3. The procedure is repeated until we are left with a final list of only two source statistics (symbols), to which the symbols 0 and 1 are assigned.
4. The code for each (original) source symbol is found by working backward and tracing the sequence of 0s and 1s assigned to that symbol as well as its successors.
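A compact Python sketch of this procedure, using a heap to repeatedly combine the two least probable entries; the probabilities at the bottom are purely illustrative and are not the ones used in the lecture's examples:

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Build a binary Huffman code for a dict {symbol: probability}."""
    tie = count()  # tie-breaker so the heap never has to compare the dicts
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Combine the two least probable entries, prefixing their codewords
        # with 0 and 1 (the "working backward" step, done incrementally).
        p0, _, code0 = heapq.heappop(heap)
        p1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

# Illustrative probabilities only (not the lecture's Example 1 values)
probs = {"m1": 0.30, "m2": 0.25, "m3": 0.20, "m4": 0.12, "m5": 0.08, "m6": 0.05}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(code, avg_len)
```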

Example 1
(a) Construct a Huffman code for a source that produces 6 messages with the following probabilities:
[Table from the slide: messages and their probabilities; entries not reproduced in the transcription]
(b) Compute its efficiency.
(c) Plot the decision tree.
(d) Use the constructed Huffman code to encode and decode the given sequence (not reproduced in the transcription).

Example 1 Solution
[Table from the slide: messages and their probabilities; the Huffman reduction is not reproduced in the transcription]

Example 1 Solution
[Worked construction not reproduced in the transcription]

Example 1 Solution (cont'd)
(a), (b): [worked solution not reproduced in the transcription]

Example 1 Solution (cont'd)
(c) Decision tree
[Figure from the slide: decision tree starting from the initial state, with branches labeled 0 and 1 leading to the leaves m1, m2, m3, m4, m5, m6]

Example 1 Solution (cont'd)
(d) [Encoding and decoding of the sequence not reproduced in the transcription]
Notes:
(1) The Huffman code is a prefix code, i.e., uniquely decodable.
(2) The Huffman code is not unique. Why? The average length remains constant across the alternatives, but a lower codeword-length variance is achieved by moving the probability of each combined symbol as high as possible in the list.
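To make the variance remark concrete, the sketch below compares two prefix codes for a hypothetical five-symbol source (not the lecture's Example 1); both have the same average length, but the second set of lengths has much lower variance:

```python
def length_stats(probs, code):
    """Average codeword length and codeword-length variance of a code."""
    avg = sum(p * len(code[s]) for s, p in probs.items())
    var = sum(p * (len(code[s]) - avg) ** 2 for s, p in probs.items())
    return avg, var

# Hypothetical source; both codes below are prefix codes with average length 2.2
probs = {"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}
code_high_var = {"a": "0",  "b": "10", "c": "110", "d": "1110", "e": "1111"}
code_low_var  = {"a": "00", "b": "01", "c": "10",  "d": "110",  "e": "111"}
print(length_stats(probs, code_high_var))  # -> about (2.2, 1.36)
print(length_stats(probs, code_low_var))   # -> about (2.2, 0.16)
```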

Example 2
[Table from the slide: messages and their probabilities; entries not reproduced in the transcription]

Example 2 Solution
o The number of messages must equal 4 + 3k; for k = 1 (one reduction), the number of messages should be 7.
o We should therefore add one dummy message with a probability equal to 0.
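The 4 + 3k condition is the standard requirement for a quaternary (4-ary) Huffman code, which appears to be what this example uses; under that assumption, the number of zero-probability dummy messages can be computed as in this sketch:

```python
def dummy_symbols_needed(num_messages, r=4):
    """Dummy messages (probability 0) needed so that every r-ary Huffman
    reduction combines exactly r symbols, i.e. total = r + (r - 1) * k."""
    return (1 - num_messages) % (r - 1)

print(dummy_symbols_needed(6))  # -> 1, i.e. pad 6 messages up to 7
```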

Example 2 Solution (cont'd)
[Worked reduction not reproduced in the transcription]

Example 3

Example 3 Solution
Messages    Probabilities
m1          0.8
m2          0.2

Example 3 Solution: Second-order extension
[Worked extension not reproduced in the transcription]

Example 3 Solution: Third-order extension
[Worked extension not reproduced in the transcription]

Drawbacks of Huffman Coding
o Requires knowledge of a probabilistic model of the source
  Source statistics are not always known a priori
o Not suitable for modeling text sources
  The number of codewords grows exponentially fast with the size of each super-symbol (grouping of letters)
  Impractical storage requirements
o How can we overcome these limitations? By using Lempel-Ziv coding

Lempel-Ziv Coding
o Currently, it is the standard algorithm for file compression. Why?
  Simpler
  Adaptive
  Fixed-length code
  Suitable for synchronous transmission
  Accommodates practical considerations
o Nowadays, fixed blocks of 12 bits are used (a codebook of $2^{12}$ possible entries)
o Idea: the source data stream is parsed into segments that are the shortest subsequences not encountered previously

Example
Consider the binary sequence 000101110010100101..
a) Construct the Lempel-Ziv code for this sequence.
b) Show how to decode the sequence at the receiver.

Solution
[Encoding table from the slide not reproduced in the transcription]
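Since the slide's encoding table is missing, the sketch below reconstructs the encoding step as described in these slides: each parsed subsequence is represented by a pointer to its root subsequence followed by a one-bit innovation symbol. The single bits 0 and 1 being pre-stored in the codebook, the 3-bit pointer width, and the numbering of positions from 1 are assumptions inferred from the decoding discussion on the next slide:

```python
def lz_parse(bits):
    """Parse the stream into the shortest subsequences not encountered previously,
    assuming the single bits 0 and 1 are already stored (positions 1 and 2)."""
    book = ["0", "1"]
    i = 0
    while i < len(bits):
        j = i + 1
        while j <= len(bits) and bits[i:j] in book:
            j += 1                      # grow until the subsequence is new
        book.append(bits[i:j])
        i = j
    return book

def lz_encode(bits, pointer_bits=3):
    """Emit one block per parsed subsequence: pointer to root + innovation bit."""
    book = lz_parse(bits)
    blocks = []
    for phrase in book[2:]:             # the pre-stored bits 0 and 1 are not encoded
        root, innovation = phrase[:-1], phrase[-1]
        pointer = book.index(root) + 1  # positions are numbered from 1
        blocks.append(format(pointer, f"0{pointer_bits}b") + innovation)
    return blocks

print(lz_encode("000101110010100101"))
# -> ['0010', '0011', '1001', '0100', '1000', '1100', '1101']
```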

Solution (cont'd)
[Figure from the slide: each encoded block is split into a pointer to the root subsequence followed by an innovation symbol]
o The Lempel-Ziv decoder uses the pointer to identify the root subsequence and then appends the innovation symbol.
o E.g., for the binary encoded block 1101 in position 9:
  The last bit, 1, is the innovation symbol.
  The remaining bits, 110, point to the root subsequence 10 in position 6.
  Hence, the block 1101 is decoded into 101, which is correct.
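A matching decoder sketch, under the same assumptions as the encoder above (pre-stored bits 0 and 1, 3-bit pointers, positions numbered from 1); the block list in the usage line is the output of that encoder sketch, not the slide's table:

```python
def lz_decode(blocks, pointer_bits=3):
    """Rebuild the stream: each block is a pointer to a root subsequence
    plus an innovation bit that is appended to it."""
    book = ["0", "1"]                 # same pre-stored symbols as the encoder
    out = []
    for block in blocks:
        pointer = int(block[:pointer_bits], 2)
        innovation = block[pointer_bits:]
        phrase = book[pointer - 1] + innovation   # positions numbered from 1
        book.append(phrase)
        out.append(phrase)
    return "".join(out)

blocks = ["0010", "0011", "1001", "0100", "1000", "1100", "1101"]
print(lz_decode(blocks))  # -> "000101110010100101"
```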

Questions