LECTURE VI: LOSSLESS COMPRESSION ALGORITHMS DR. OUIEM BCHIR

1 LECTURE VI: LOSSLESS COMPRESSION ALGORITHMS DR. OUIEM BCHIR

2 STORAGE SPACE Uncompressed graphics, audio, and video data require substantial storage capacity. Storing uncompressed video is not feasible with today's technology (CD & DVD). Transmitting uncompressed video over digital networks requires very high bandwidth. To be cost-effective and feasible, multimedia systems must use compressed video and audio streams.

3 INTRODUCTION Compression: the process of coding that effectively reduces the total number of bits needed to represent certain information. General data compression scheme:

4 INTRODUCTION If the compression and decompression processes induce no information loss, then the compression scheme is lossless; otherwise, it is lossy. Compression ratio = B0 / B1, where B0 is the number of bits before compression and B1 is the number of bits after compression.

COMPRESSION STEPS 5

6 TYPES OF COMPRESSION Symmetric compression: the same time is needed for the encoding and decoding phases. Asymmetric compression: the compression process is performed once and enough time is available, so compression can take longer; decompression is performed frequently and must be fast.

STATISTICAL ENCODING (FREQUENCY DEPENDENT) 7 Fixed-length coding: an equal number of bits is used to represent each symbol; an alphabet of N symbols requires L >= log2(N) bits per symbol. Good encoding when all symbols are equally probable; not efficient when the symbol probabilities differ. Variable-length encoding: frequently occurring characters are represented with shorter strings than seldom occurring characters. Statistical encoding depends on the frequency of occurrence of a character or of a sequence of data bytes. You are given a sequence of symbols S1, S2, S3, ... and the probability of occurrence of each symbol P(Si) = pi.
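As a quick illustration of the fixed-length bound above (a sketch added here, not part of the original slides; the function name is ours):

import math

# Fixed-length coding of an alphabet with N distinct symbols needs
# ceil(log2(N)) bits per symbol, regardless of the symbol probabilities.
def fixed_length_bits(alphabet_size):
    return math.ceil(math.log2(alphabet_size))

print(fixed_length_bits(256))  # 8 bits, e.g. for 256 gray levels
print(fixed_length_bits(5))    # 3 bits, even though log2(5) is about 2.32

The second call shows that fixed-length coding can exceed log2(N) when N is not a power of two; the larger inefficiency, addressed by variable-length codes, appears when the symbol probabilities are unequal.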

BASICS OF INFORMATION THEORY 8 The entropy η of an information source with alphabet S = {s1, s2, ..., sn} is η = Σ_i pi log2(1/pi) = -Σ_i pi log2(pi), where pi is the probability that symbol si will occur in S, and log2(1/pi) is the amount of information contained in si, which corresponds to the number of bits needed to encode si.
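A minimal sketch (ours, not from the slides) of how this formula is evaluated in practice:

import math

def entropy(probabilities):
    # eta = sum over i of p_i * log2(1/p_i); terms with p_i = 0 contribute nothing
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

# Uniform distribution over 256 symbols (slide 9's example): log2(256) = 8 bits.
uniform = [1.0 / 256] * 256
print(entropy(uniform))            # 8.0

# A skewed source carries less information per symbol than a uniform one.
print(entropy([0.5, 0.25, 0.25]))  # 1.5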

9 EXAMPLE Uniform distribution over 256 gray levels: pi = 1/256 for every symbol; hence the entropy of the image is log2(256) = 8 bits per symbol.

10 ENTROPY AND CODE LENGTH The entropy η is a weighted sum of the terms log2(1/pi); it represents the average amount of information contained per symbol in the source S. The entropy η specifies the lower bound for the average number of bits needed to code each symbol in S, i.e. η <= l̄, where l̄ is the average length (measured in bits) of the codewords produced by the encoder.

11 RUN-LENGTH CODING Memoryless source: an information source that is independently distributed, i.e. the value of the current symbol does not depend on the values of previously appeared symbols. Run-Length Coding (RLC) is not memoryless: it exploits memory present in the information source. Rationale for RLC: if the information source has the property that symbols tend to form continuous groups, then such a symbol and the length of its group can be coded.

RUN-LENGTH CODING (RLC) 12 Content-dependent coding. RLC replaces a sequence of identical consecutive bytes with the byte and its number of occurrences; the number of occurrences is introduced by a special flag, '!'. RLC algorithm: if the same byte occurs at least 4 times in a row, count the number of occurrences and write the compressed data in the format: the counted byte, '!', the number of occurrences. Example: uncompressed sequence ABCCCCCCCCCDEFFFFGGG; compressed sequence ABC!9DEF!4GGG (from 20 to 13 bytes).
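A minimal sketch of this flag-based RLC (our illustration, assuming runs of at least 4 identical characters are replaced by the character, the flag '!', and the decimal run length, while shorter runs are copied unchanged):

def rlc_encode(data, flag="!", threshold=4):
    out = []
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i]:
            run += 1                              # measure the current run
        if run >= threshold:
            out.append(f"{data[i]}{flag}{run}")   # e.g. nine C's -> C!9
        else:
            out.append(data[i] * run)             # short runs stay literal
        i += run
    return "".join(out)

print(rlc_encode("ABCCCCCCCCCDEFFFFGGG"))  # ABC!9DEF!4GGG (20 bytes -> 13 bytes)

Note that a decoder must be able to tell a literal '!' in the data from the flag; practical RLC schemes handle this by escaping the flag byte, which this sketch omits.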

VARIABLE-LENGTH CODING (VLC) 13 Shannon-Fano Algorithm: a top-down approach 1. Sort the symbols according to the frequency count of their occurrences. 2. Recursively divide the symbols into two parts, each with approximately the same number of counts, until all parts contain only one symbol. Example: coding of HELLO

14

15

Another coding tree for HELLO by Shannon-Fano 16

17
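The following sketch (ours, not the slides' code) implements the two-step Shannon-Fano procedure from slide 13: sort by frequency, then recursively split into two groups of roughly equal total count, prepending 0 to one group and 1 to the other:

from collections import Counter

def shannon_fano(symbol_counts):
    symbols = sorted(symbol_counts.items(), key=lambda kv: kv[1], reverse=True)
    codes = {}

    def split(group, prefix):
        if len(group) == 1:
            codes[group[0][0]] = prefix or "0"
            return
        total = sum(c for _, c in group)
        running, cut = 0, 1
        for i, (_, c) in enumerate(group):        # find the ~half-count split point
            running += c
            if running >= total / 2:
                cut = max(1, min(i + 1, len(group) - 1))
                break
        split(group[:cut], prefix + "0")
        split(group[cut:], prefix + "1")

    split(symbols, "")
    return codes

print(shannon_fano(Counter("HELLO")))
# {'L': '00', 'H': '01', 'E': '10', 'O': '11'} -- 10 bits for HELLO in total

Because the near-equal split can be made in more than one way, a different tie-break (putting L alone in the first group) yields the other valid tree for HELLO; both trees encode the whole string in 10 bits.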

HUFFMAN CODING ALGORITHM 18 Characters are stored with their probabilities. The number of bits of the coded characters differs: the shortest code is assigned to the most frequently occurring character. To determine the Huffman code, we construct a binary tree. Leaves are the characters to be encoded; nodes contain the occurrence probabilities of the characters belonging to their subtrees. 0 and 1 are assigned to the branches of the tree arbitrarily, so different Huffman codes are possible for the same data. A Huffman table is generated; Huffman tables must be transmitted with the compressed data.
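A minimal sketch of the bottom-up construction (ours, not the slides' code): repeatedly merge the two least frequent nodes of a priority queue until a single tree remains, then read each code off the root-to-leaf path.

import heapq
from collections import Counter

def huffman_codes(symbol_counts):
    if len(symbol_counts) == 1:                  # degenerate single-symbol source
        return {s: "0" for s in symbol_counts}
    # each heap entry: (count, tie_breaker, {symbol: partial code})
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(symbol_counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)        # least frequent node
        c2, _, right = heapq.heappop(heap)       # second least frequent node
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

print(huffman_codes(Counter("HELLO")))
# {'H': '00', 'E': '01', 'O': '10', 'L': '11'} -- one of several valid Huffman codes

For HELLO the average code length is 2 bits per symbol, which lies between the source entropy (about 1.92 bits) and entropy + 1, consistent with the bound stated on slide 20.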

19 EXAMPLE OF HUFFMAN CODING

PROPERTIES OF HUFFMAN CODING 20 Unique prefix property: no Huffman code is a prefix of any other Huffman code; this precludes any ambiguity in decoding. Optimality: it is a minimum-redundancy code, proved optimal for a given data model (i.e., a given, accurate probability distribution). The two least frequent symbols have Huffman codes of the same length, differing only in the last bit. Symbols that occur more frequently have shorter Huffman codes than symbols that occur less frequently. The average code length l̄ for an information source S is strictly less than η + 1; combined with the lower bound from slide 10, η <= l̄ < η + 1.

21 ARITHMETIC CODING Each symbol is coded by considering the prior data, so the encoded sequence must be read from the beginning; no random access is possible. The message is represented by a portion (an interval) of the real numbers between 0 and 1. As the message becomes longer, the length of the interval shortens and the number of bits needed to represent the interval increases.

ARITHMETIC VS. HUFFMAN 22 Arithmetic encoding does not encode each symbol separately; Huffman encoding does. Arithmetic encoding transmits only the length of the encoded string; Huffman encoding transmits the Huffman table. The compression ratios of the two are similar.

23

24 ARITHMETIC CODING ENCODER

25 Example: Encode Symbols CAEE$

26

27

28 The final step in arithmetic encoding calls for the generation of a number that falls within the range [low, high). The above algorithm ensures that the shortest binary codeword is found.
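Since the encoder pseudocode of slide 24 is not reproduced in this transcription, here is a minimal floating-point sketch of the interval-narrowing idea. The probability table below is an assumption chosen only for illustration (it is not the table used on the slides), and real coders use integer arithmetic with renormalization rather than floats:

PROBS = {"A": 0.2, "C": 0.2, "E": 0.4, "$": 0.2}   # assumed table; '$' terminates

def cumulative_ranges(probs):
    # assign each symbol a sub-interval [low, high) of [0, 1) of width p
    ranges, low = {}, 0.0
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges

def arithmetic_encode(message, probs):
    ranges = cumulative_ranges(probs)
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        sym_low, sym_high = ranges[sym]
        low, high = low + span * sym_low, low + span * sym_high
    return low, high          # any number in [low, high) identifies the message

low, high = arithmetic_encode("CAEE$", PROBS)
print(low, high)              # transmit a short binary fraction inside [low, high)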

29

30 ARITHMETIC CODING DECODER

Decoding symbols CAEE$ 31
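A matching decoder sketch (again ours, reusing the assumed PROBS table and cumulative_ranges from the encoder sketch above): repeatedly find the symbol whose sub-interval contains the received value, emit it, rescale the value into that sub-interval, and stop at the terminator:

def arithmetic_decode(value, probs, terminator="$"):
    ranges = cumulative_ranges(probs)
    out = []
    while True:
        for sym, (sym_low, sym_high) in ranges.items():
            if sym_low <= value < sym_high:
                out.append(sym)
                if sym == terminator:
                    return "".join(out)
                value = (value - sym_low) / (sym_high - sym_low)  # rescale to [0, 1)
                break

print(arithmetic_decode(0.228, PROBS))   # CAEE$ (0.228 lies in the encoder's final interval)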