Chapter 9 Image Compression Standards

9.1 The JPEG Standard
9.2 The JPEG2000 Standard
9.3 The JPEG-LS Standard
IT342

Image Compression Standards
An image compression standard specifies the codec, which defines how an image is compressed into a stream of bytes and decompressed back into an image.

9.1 The JPEG Standard
JPEG is an image compression standard that was developed by the Joint Photographic Experts Group. JPEG was formally accepted as an international standard in 1992. The JPEG compression algorithm is at its best on photographs and paintings of realistic scenes with smooth variations of tone and color. For web usage, where the amount of data used for an image is important, JPEG is very popular. On the other hand, JPEG is less well suited to line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts. The JPEG standard also includes a lossless coding mode, but that mode is not supported in most products.

Because JPEG is typically used as a lossy compression method, which somewhat reduces image fidelity, it should not be used where exact reproduction of the data is required (such as some scientific and medical imaging applications and certain technical image processing work). JPEG employs a transform coding method based on the DCT (Discrete Cosine Transform). An image is a function f(i, j) of i and j (or conventionally x and y) in the spatial domain. The 2D DCT is used as one step in JPEG to yield a frequency response, a function F(u, v) in the spatial frequency domain (which reflects the level of detail in the image), indexed by two integers u and v.

Observations for JPEG Image Compression
The effectiveness of the DCT transform coding method in JPEG relies on three major observations.
Observation 1: Useful image contents change relatively slowly across the image, i.e., it is unusual for intensity values to vary widely several times in a small area, for example, within an 8×8 image block. Much of the information in an image is therefore repeated; this is called spatial redundancy or intraframe redundancy.

Observation 2: Psychophysical experiments suggest that humans are much less likely to notice the loss of very high spatial frequency components than the loss of lower frequency components. The spatial redundancy can therefore be reduced by largely discarding the high spatial frequency contents. It is not necessary to represent each pixel in an image frame independently; instead, one can predict a pixel from its neighbors. Removing a large amount of the redundancy within an image frame saves a lot of data in representing the frame, thus achieving data compression.

Observation 3: Visual acuity (accuracy in distinguishing closely spaced lines) is much greater for gray ("black and white") than for color. Chroma subsampling (4:2:0) is used in JPEG to reduce the spatial resolution of the chroma components.

Fig. 9.1: Block diagram for JPEG encoder.

Main Steps in JPEG Image Compression
1. Picture Preparation
2. DCT [Discrete Cosine Transform] (Zig-zag ordering)
3. Quantization
4. Run-length encoding (Zig-zag scanning)
5. Entropy Encoding

1. Picture Preparation
The representation of the colors in the image is converted from RGB to Y′CbCr, consisting of one luma component (Y′), representing brightness, and two chroma components (Cb and Cr), representing color.
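The conversion can be sketched for a single pixel as follows. The slides do not give the conversion constants, so this sketch assumes the full-range coefficients commonly used with JPEG files (the JFIF convention); the function name `rgb_to_ycbcr` is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range Y'CbCr.

    Assumption: JFIF-style constants; other conventions use different ones.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128

    def clamp(v):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)


white = rgb_to_ycbcr(255, 255, 255)
gray = rgb_to_ycbcr(128, 128, 128)
black = rgb_to_ycbcr(0, 0, 0)
print(white, gray, black)
```

Neutral colors (white, gray, black) carry no color information, so both chroma components sit at the midpoint value 128 and only Y′ changes.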

The resolution of the chroma data (both Cb and Cr) is reduced using 4:2:0 chroma subsampling, which halves the chroma resolution both horizontally and vertically. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details. As a result of the subsampling, one Cb and one Cr sample are sent for every four Y samples (each 2×2 group of pixels).
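A minimal sketch of 4:2:0 subsampling on one chroma plane is shown below. It assumes the plane is a list of rows with even dimensions and that each 2×2 group is replaced by its rounded average; real encoders may use other downsampling filters, and the name `subsample_420` is illustrative.

```python
def subsample_420(plane):
    """Halve a chroma plane in both dimensions by averaging each 2x2 group."""
    h, w = len(plane), len(plane[0])
    if h % 2 or w % 2:
        raise ValueError("this sketch assumes even dimensions")
    return [
        [
            # "+ 2" rounds the integer average to the nearest value.
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1] + 2) // 4
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


cb = [
    [100, 102, 50, 50],
    [104, 106, 50, 54],
    [200, 200, 10, 20],
    [200, 200, 30, 40],
]
small = subsample_420(cb)
print(small)  # 16 chroma samples reduced to 4
```

The luma plane is left at full resolution, which is why the loss is hard to see: brightness detail is preserved while only color detail is reduced.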

2. DCT (Discrete Cosine Transform)
Each image is divided into 8×8 blocks. Using blocks, however, has the effect of isolating each block from its neighboring context. This is why JPEG images look choppy ("blocky") when a high compression ratio is specified by the user.

To reduce the number of bits required to represent each 8×8 block, the DCT is applied to each block image f(i, j), with the output being the DCT coefficients F(u, v) for that block. The result of a 64-element DCT is 1 DC coefficient and 63 AC coefficients. The DC coefficient is proportional to the average value of the 8×8 region. The 63 AC coefficients represent the change in value across the block.
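The slides do not spell out the transform, so the sketch below assumes the usual 8×8 DCT-II definition, F(u, v) = (C(u) C(v) / 4) Σ_i Σ_j cos((2i+1)uπ/16) cos((2j+1)vπ/16) f(i, j), with C(0) = √2/2 and C(k) = 1 otherwise. It is a direct, unoptimized evaluation of that formula; the name `dct2` is illustrative.

```python
import math

N = 8  # JPEG block size


def dct2(block):
    """Direct 2D DCT-II of an 8x8 block given as a list of 8 rows."""
    def c(k):
        return math.sqrt(2) / 2 if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16))
            out[u][v] = c(u) * c(v) / 4 * s
    return out


# A perfectly flat block: all of its energy lands in the DC coefficient.
flat = [[10] * N for _ in range(N)]
F = dct2(flat)
print(round(F[0][0], 6))  # 8 times the block average
```

With this normalization the DC coefficient equals 8 times the block average, and every AC coefficient of a constant block is zero (up to floating-point error).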

2. DCT (Zig-zag ordering)
These 64 results are written in zig-zag order, with the DC coefficient first, followed by the AC coefficients in order of increasing frequency. Low-numbered coefficients represent low-frequency change, that is, gradual change across the region. High-numbered coefficients represent high-frequency change, that is, values that change rapidly from one pixel to another within the block.
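The zig-zag order can be generated by walking the anti-diagonals of the block and alternating direction on each one. The sketch below returns (row, column) positions; the name `zigzag_order` is illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked bottom-left to top-right, odd ones the
        # other way round.
        for r in (reversed(rows) if s % 2 == 0 else rows):
            order.append((r, s - r))
    return order


order = zigzag_order()
print(order[:6])
```

A block stored as a list of rows can then be flattened with `[block[r][c] for r, c in zigzag_order()]`, which puts the DC coefficient first and the highest-frequency coefficient last.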

Why is this ordering important? If you think of a block of 8×8 pixels out of a coherent image, the pixels are likely to be very similar. If you run the DCT on 64 pixels which are very similar, you will get a DC coefficient and some values for the low-frequency AC coefficients; the remaining coefficients will likely be at or near zero. To give you an idea of how small an 8×8 region is, consider the following example:

The 8×8 region of pixels highlighted above looks like this (magnified 1600 times). As you can see, this region does not deviate much from its average color. In addition, the change is slow and gradual across the block rather than sharp and abrupt from pixel to pixel.

This observation about images allows us to place much greater importance on the DC and first few AC coefficients (the beginning of the zig-zag sequence), and to assume that the high-frequency AC coefficients (the remainder of the sequence) will be at or near zero. Logically, if these values are of little importance, we should be able to assign fewer bits to them in order to achieve greater compression. This leads naturally to the stages of quantization and entropy encoding, which are covered next.

3. Quantization
The quantization step is the main source of loss in JPEG compression. Quantization lets us define which coefficients receive fine quantization and which receive coarse quantization. Coefficients that receive finer quantization are reconstructed exactly or nearly so, while coefficients that receive coarse quantization are reconstructed less accurately, if at all. JPEG applies non-uniform quantization to the DCT coefficients, giving higher resolution to the DC and low-frequency coefficients; as a result, most of the higher-frequency coefficients usually quantize to 0. JPEG uses a quantization table to quantize the results of the DCT.

$$\hat{F}(u, v) = \operatorname{round}\left(\frac{F(u, v)}{Q(u, v)}\right)$$
F(u, v) represents a DCT coefficient, Q(u, v) is a quantization matrix entry, and $\hat{F}(u, v)$ represents the quantized DCT coefficient that JPEG uses in the succeeding entropy coding. The entries of Q(u, v) tend to have larger values towards the lower right corner. This introduces more loss at the higher spatial frequencies, a practice supported by Observations 1 and 2.

Table 9.1 The Luminance Quantization Table
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

Table 9.2 The Chrominance Quantization Table
17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99

Tables 9.1 and 9.2 show the default Q(u, v) values obtained from psychophysical studies with the goal of maximizing the compression ratio while minimizing perceptual losses in JPEG images.

A quantization table is an 8×8 matrix of integers, one entry for each DCT coefficient. To quantize the data, one divides each DCT coefficient by the corresponding quantization value and rounds the result to the nearest integer. Therefore, the higher the integer in the quantization table, the coarser and more compressed the result becomes. The JPEG standard does not mandate a particular quantization table; the table is chosen by the compression application. A great deal of work goes into creating "good" quantization tables that achieve both good image quality and good compression. Because the table is not fixed by the standard, it must be stored with the compressed image so that the image can be decompressed.

An 8×8 block from the Y image of Lena

f(i, j):
200 202 189 188 189 175 175 175
200 203 198 188 189 182 178 175
203 200 200 195 200 187 185 175
200 200 200 200 197 187 187 187
200 205 200 200 195 188 187 175
200 200 200 200 200 190 187 175
205 200 199 200 191 187 187 175
210 200 200 200 188 185 187 186

F(u, v):
515  65 -12   4   1   2  -8   5
-16   3   2   0   0 -11  -2   3
-12   6  11  -1   3   0   1  -2
 -8   3  -4   2  -2  -3  -5  -2
  0  -2   7  -5   4   0  -1  -4
  0  -3  -1   0   4   1  -1   0
  3  -2  -3   3   3  -1  -1   3
 -2   5  -2   4  -2   2  -3   0

Fig. 9.2: JPEG compression for a smooth image block.
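As a worked example, the F(u, v) values of this smooth block can be quantized with the luminance table (Table 9.1) and then de-quantized. This is a sketch under stated assumptions: the helper names `quantize` and `dequantize` are illustrative, and ties are rounded upwards, which does not matter here because no quotient falls exactly on a half.

```python
import math

Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# DCT coefficients F(u, v) of the smooth block.
F = [
    [515, 65, -12, 4, 1, 2, -8, 5],
    [-16, 3, 2, 0, 0, -11, -2, 3],
    [-12, 6, 11, -1, 3, 0, 1, -2],
    [-8, 3, -4, 2, -2, -3, -5, -2],
    [0, -2, 7, -5, 4, 0, -1, -4],
    [0, -3, -1, 0, 4, 1, -1, 0],
    [3, -2, -3, 3, 3, -1, -1, 3],
    [-2, 5, -2, 4, -2, 2, -3, 0],
]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[math.floor(coeffs[u][v] / table[u][v] + 0.5) for v in range(8)]
            for u in range(8)]


def dequantize(quantized, table):
    """Decoder side: multiply back by the table entry (the loss remains)."""
    return [[quantized[u][v] * table[u][v] for v in range(8)]
            for u in range(8)]


Fq = quantize(F, Q_LUMA)
nonzero = sum(1 for row in Fq for x in row if x != 0)
restored = dequantize(Fq, Q_LUMA)
print(Fq[0], nonzero, restored[0][:3])
```

Only 7 of the 64 quantized coefficients are nonzero, all in the low-frequency corner, and the de-quantized DC value is 512 rather than 515, which illustrates where the loss is introduced.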

Fig. 9.2 (cont'd): JPEG compression for a smooth image block.

Another 8×8 block from the Y image of Lena

f(i, j):
 70  70 100  70  87  87 150 187
 85 100  96  79  87 154  87 113
100  85 116  79  70  87  86 196
136  69  87 200  79  71 117  96
161  70  87 200 103  71  96 113
161 123 147 133 113 113  85 161
146 147 175 100 103 103 163 187
156 146 189  70 113 161 163 197

F(u, v):
 -80  -40   89  -73   44   32   53   -3
-135  -59  -26    6   14   -3  -13  -28
  47  -76   66   -3 -108  -78   33   59
  -2   10  -18    0   33   11  -21    1
  -1   -9  -22    8   32   65  -36   -1
   5  -20   28  -46    3   24  -30   24
   6  -20   37  -28   12  -35   33   17
  -5  -23   33  -30   17   -5   -4   20

Fig. 9.3: JPEG compression for a textured image block.

Fig. 9.3 (cont'd): JPEG compression for a textured image block.

4. Run-length encoding (Zig-zag scanning)
Run-length coding (RLC) turns the AC coefficient values into pairs {#-zeros-to-skip, next non-zero value}. To make long runs of zeros as likely as possible, a zig-zag scan is used to turn the 8×8 matrix $\hat{F}(u, v)$ into a 64-vector.
Fig. 9.4: Zig-Zag Scan in JPEG.
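The pairing can be sketched as below for the 63 AC coefficients of one block, already in zig-zag order. Two assumptions go beyond the slide: the pair (0, 0) is used as an end-of-block marker, and the special handling that real JPEG applies to runs longer than 15 zeros is left out. The AC values are illustrative.

```python
def rlc_encode(ac):
    """Turn zig-zag ordered AC coefficients into (zeros_to_skip, value) pairs."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append((0, 0))  # end-of-block marker (assumption, see lead-in)
    return pairs


def rlc_decode(pairs, length=63):
    """Rebuild the AC vector; everything after the marker is zero."""
    ac = []
    for run, value in pairs:
        if (run, value) == (0, 0):
            break
        ac.extend([0] * run)
        ac.append(value)
    ac.extend([0] * (length - len(ac)))
    return ac


# Illustrative AC sequence: a few low-frequency values, then all zeros.
ac = [6, -1, -1, 0, -1, 0, 0, 0, -1, 0, 0, 1] + [0] * 51
pairs = rlc_encode(ac)
print(pairs)
```

Here 63 coefficients collapse to 7 pairs, which is the benefit the zig-zag scan is designed to produce.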

DPCM on DC coefficients
The DC coefficients are coded separately from the AC ones, using Differential Pulse Code Modulation (DPCM). If the DC coefficients for the first 5 image blocks are 150, 155, 149, 152, 144, then DPCM produces 150, 5, -6, 3, -8, assuming $d_i = DC_i - DC_{i-1}$ and $d_0 = DC_0$.
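The DPCM rule above can be checked with a few lines of code. The function names are illustrative; the predictor starts at 0 so that the first difference equals the first DC value.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous one."""
    prev, diffs = 0, []
    for value in dc_values:
        diffs.append(value - prev)
        prev = value
    return diffs


def dpcm_decode(diffs):
    """Undo DPCM with a running sum."""
    total, values = 0, []
    for d in diffs:
        total += d
        values.append(total)
    return values


dc = [150, 155, 149, 152, 144]
diffs = dpcm_encode(dc)
print(diffs)
```

Because neighboring blocks tend to have similar averages, the differences are small numbers, which the entropy coder can represent with fewer bits than the raw DC values.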

5. Entropy Coding
The DC and AC coefficients finally undergo an entropy coding step to gain possible further compression. Using DC as an example: each DPCM-coded DC coefficient is represented by (SIZE, AMPLITUDE), where SIZE indicates how many bits are needed to represent the coefficient, and AMPLITUDE contains the actual bits. In the example we're using, the codes 150, 5, -6, 3, -8 are turned into (8, 10010110), (3, 101), (3, 001), (2, 11), (4, 0111). A negative value is represented by the one's complement of its magnitude, so -6 becomes 001 (the complement of 110). SIZE is Huffman coded, since smaller SIZEs occur much more often. AMPLITUDE is not Huffman coded: its value can change widely, so Huffman coding has no appreciable benefit.
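The (SIZE, AMPLITUDE) mapping for the example values can be sketched as follows. The function name `size_amplitude` is illustrative, and the Huffman coding of SIZE is not shown.

```python
def size_amplitude(value):
    """Return (SIZE, AMPLITUDE bits as a string) for one coefficient."""
    size = abs(value).bit_length()  # bits needed for the magnitude
    if size == 0:
        return 0, ""  # zero needs no amplitude bits
    if value < 0:
        # One's complement of the magnitude within SIZE bits.
        value = ((1 << size) - 1) ^ abs(value)
    return size, format(value, "0{}b".format(size))


codes = [size_amplitude(v) for v in [150, 5, -6, 3, -8]]
print(codes)
```

Positive amplitudes always start with a 1 bit and negative ones with a 0 bit, so the decoder can recover the sign from SIZE and the bits alone.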

Summary
Figure 1: The JPEG Compression Algorithm

This image was reconstructed with 1 coefficient (that is, 1/64):
This image was reconstructed with 3% of the coefficients:
This image was reconstructed with 34% of the coefficients:

9.1.2 Four Commonly Used JPEG Modes
1. Sequential Mode
2. Progressive Mode
3. Hierarchical Mode
4. Lossless Mode

1. Sequential Mode
The image is encoded in the order in which it is scanned. This is the default JPEG mode, implicitly assumed in the discussions so far. Each gray-level image or color image component is encoded in a single left-to-right, top-to-bottom scan.

2. Progressive Mode
The image is encoded in multiple passes. Progressive JPEG delivers low-quality versions of the image quickly, followed by higher-quality passes.

3. Hierarchical Mode
The image is encoded at multiple resolutions to accommodate different types of displays. The encoded image at the lowest resolution is basically a compressed low-pass filtered image, whereas the images at successively higher resolutions provide additional details (differences from the lower resolution images).

Fig. 9.5: Block diagram for Hierarchical JPEG.

Encoder for a Three-level Hierarchical JPEG
1. Reduction of image resolution: Reduce the resolution of the input image $f$ (e.g., 512×512) by a factor of 2 in each dimension to obtain $f_2$ (e.g., 256×256). Repeat this to obtain $f_4$ (e.g., 128×128).
2. Compress low-resolution image $f_4$: Encode $f_4$ using any other JPEG method (e.g., Sequential, Progressive) to obtain $F_4$.
3. Compress difference image $d_2$: Decode $F_4$ to obtain $\tilde{f}_4$. Use any interpolation method to expand $\tilde{f}_4$ to the same resolution as $f_2$ and call it $E(\tilde{f}_4)$. Encode the difference $d_2 = f_2 - E(\tilde{f}_4)$ using any other JPEG method (e.g., Sequential, Progressive) to generate $D_2$.
4. Compress difference image $d_1$: Decode $D_2$ to obtain $\tilde{d}_2$; add it to $E(\tilde{f}_4)$ to get $\tilde{f}_2 = E(\tilde{f}_4) + \tilde{d}_2$, which is a version of $f_2$ after compression and decompression. Encode the difference $d_1 = f - E(\tilde{f}_2)$ using any other JPEG method (e.g., Sequential, Progressive) to generate $D_1$.

Decoder for a Three-level Hierarchical JPEG
1. Decompress the encoded low-resolution image $F_4$: Decode $F_4$ using the same JPEG method as in the encoder to obtain $\tilde{f}_4$.
2. Restore image $\tilde{f}_2$ at the intermediate resolution: Use $E(\tilde{f}_4) + \tilde{d}_2$ to obtain $\tilde{f}_2$.
3. Restore image $\tilde{f}$ at the original resolution: Use $E(\tilde{f}_2) + \tilde{d}_1$ to obtain $\tilde{f}$.
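The three-level encoder and decoder can be sketched end to end as follows. Everything concrete here is an assumption made for illustration: resolution is reduced by averaging 2×2 groups, the expansion E(·) is pixel replication, and `codec` is a stand-in that models "encode with a lossy JPEG mode, then decode" as uniform quantization with step 8. The encoder therefore returns decoded stand-ins for F4, D2 and D1 rather than real bitstreams.

```python
import math


def reduce2(img):
    """Halve the resolution by averaging each 2x2 group of pixels."""
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
             for c in range(0, len(img[0]), 2)]
            for r in range(0, len(img), 2)]


def expand2(img):
    """E(.): double the resolution by pixel replication."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out += [wide, list(wide)]
    return out


def codec(img, step=8):
    """Stand-in for 'encode, then decode' with a lossy JPEG mode."""
    return [[step * math.floor(v / step + 0.5) for v in row] for row in img]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def encode(f):
    f2 = reduce2(f)
    f4 = reduce2(f2)
    f4_dec = codec(f4)                        # decoded F4
    d2_dec = codec(sub(f2, expand2(f4_dec)))  # decoded D2
    f2_dec = add(expand2(f4_dec), d2_dec)     # f2 as the decoder will see it
    d1_dec = codec(sub(f, expand2(f2_dec)))   # decoded D1
    return f4_dec, d2_dec, d1_dec


def decode(f4_dec, d2_dec, d1_dec):
    f2_dec = add(expand2(f4_dec), d2_dec)     # intermediate resolution
    return add(expand2(f2_dec), d1_dec)       # original resolution


f = [[(r * 37 + c * 11) % 256 for c in range(8)] for r in range(8)]
rec = decode(*encode(f))
max_err = max(abs(rec[r][c] - f[r][c]) for r in range(8) for c in range(8))
print(max_err)
```

The key design point is that the encoder computes each difference image against the decoded lower-resolution image, not the original one, so errors from the coarser levels do not accumulate: the final error is bounded by the loss of the last stage alone (at most half the quantization step in this sketch).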

4. Lossless Mode
Discussed in Chapter 7; to be replaced by JPEG-LS. It is a very special case of JPEG which has no loss in image quality. It employs only a simple differential coding method, involving no transform coding. It is rarely used, since its compression ratio is very low compared to the lossy modes. The more recent JPEG-LS standard is aimed at lossless image compression (discussed later in this lecture).
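A minimal sketch of differential (predictive) coding on a small image follows. The slide does not say which predictor is used, so this sketch assumes the predictor A + B - C (left neighbor plus upper neighbor minus upper-left neighbor), with the left neighbor alone on the first row, the upper neighbor alone in the first column, and 128 for the very first pixel. The function names are illustrative, and the entropy coding of the residuals is not shown.

```python
def predict(img, r, c):
    """Predict pixel (r, c) from neighbors that are already known."""
    if r == 0 and c == 0:
        return 128                      # assumed starting value for 8-bit data
    if r == 0:
        return img[r][c - 1]            # first row: left neighbor (A)
    if c == 0:
        return img[r - 1][c]            # first column: upper neighbor (B)
    return img[r][c - 1] + img[r - 1][c] - img[r - 1][c - 1]  # A + B - C


def encode(img):
    """Residuals: actual value minus prediction, in raster order."""
    return [[img[r][c] - predict(img, r, c) for c in range(len(img[0]))]
            for r in range(len(img))]


def decode(res):
    """Rebuild the image exactly by repeating the same predictions."""
    img = [[0] * len(res[0]) for _ in res]
    for r in range(len(res)):
        for c in range(len(res[0])):
            img[r][c] = res[r][c] + predict(img, r, c)
    return img


img = [
    [200, 202, 189, 188],
    [200, 203, 198, 188],
    [203, 200, 200, 195],
]
res = encode(img)
print(res[0], res[1][:2])
```

No transform or quantization is involved, so decoding reproduces the input exactly; compression comes only from the residuals being smaller than the original pixel values.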

9.2 The JPEG2000 Standard
Unlike the DCT-based JPEG standard, JPEG2000 is built on the discrete wavelet transform.
Design goals:
To provide a better rate-distortion tradeoff and improved subjective image quality.
To provide additional functionalities lacking in the current JPEG standard.

To provide lossless and lossy compression in a single bitstream (e.g., different parts of the image get coded differently).
Region of Interest Coding: The new standard allows the specification of Regions of Interest (ROI), which can be coded with higher quality than the rest of the image. For example, one might like to code the face of a speaker with higher quality than the surrounding furniture.

Fig. 9.6: Comparison of JPEG and JPEG2000. (a) Original image.

Fig. 9.6 (cont'd): Comparison of JPEG and JPEG2000. (b) JPEG (left) and JPEG2000 (right) images compressed at 0.75 bpp. (c) JPEG (left) and JPEG2000 (right) images compressed at 0.25 bpp.

9.3 The JPEG-LS Standard
JPEG-LS is the current ISO/ITU standard for lossless or near-lossless compression of continuous-tone images. It is part of a larger ISO effort aimed at better compression of medical images. It uses the LOCO-I (LOw COmplexity LOssless COmpression for Images) algorithm proposed by Hewlett-Packard. The design is motivated by the observation that complexity reduction is often more important than the small increases in compression offered by more complex algorithms. Main advantage: low complexity.

9.4 JBIG and JBIG-2: Bi-level Image Compression Standards
Main goal: to enable the handling of documents in electronic form. These standards are primarily used to code scanned images of printed or hand-written text, computer-generated text, and facsimile transmissions. JBIG is a lossless compression standard. It also offers progressive encoding/decoding capability: the resulting bitstream contains a set of progressively higher resolution images. JBIG-2 additionally supports lossy compression well.