Fast Mode Decision using Global Disparity Vector for Multiview Video Coding


2008 Second International Conference on Future Generation Communication and Networking Symposia

Dong-Hoon Han and Yung-Lyul Lee
Digital Media System Lab., Department of Computer Engineering, Sejong University, 98, Kunja-Dong, Kwangjin-Gu, Seoul, Korea
dhhan@dms.sejong.ac.kr, yllee@sejong.ac.kr

Abstract

Since multiview video coding (MVC) based on H.264/AVC uses a prediction scheme that exploits inter-view correlation among the views, an MVC encoder compresses multiple views more efficiently than a simulcast H.264/AVC encoder. However, as the number of views to be encoded increases, the total encoding time increases greatly. To reduce the computational complexity of MVC, a fast mode decision using both MB (macroblock)-based region segmentation information and the global disparity vector among views is proposed. The proposed method achieves on average a 40% reduction of total encoding time with a PSNR (Peak Signal to Noise Ratio) degradation of about 0.05 dB.

1. Introduction

Multiview video coding (MVC), which is being standardized in the Joint Video Team (JVT) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), is expected to become a new video coding standard for future video applications such as 3D-TV and free viewpoint video [1]. The MVC group in the JVT has selected the H.264/AVC [2]-based MVC method proposed in [3] as the MVC reference model, since this method showed better coding efficiency than H.264/AVC simulcast coding and the other methods presented in response to the call for proposals made by MPEG [4]. The main difference between conventional video coding and MVC is the availability of multiple camera views of the same scene.

Figure 1. Variable block size in H.264/AVC.
[Diagram labels: Input Video; Encode all MB modes; Selection of the best MB mode; Encoding]
As the coding efficiency of hybrid video coding depends on the temporal prediction method, more coding gain can be achieved in MVC by combining inter-view (spatial) and temporal prediction. To improve coding efficiency, H.264/AVC-based MVC uses the Rate-Distortion Optimization (RDO) technique of H.264/AVC to find the most efficient coding mode for each macroblock (MB).

Figure 2. Computation process of RD cost for RDO.
[Block labels: Intra/Inter, SKIP, Direct modes; residual data; integer transform / quantization; entropy coding (rate); inverse quantization / inverse integer transform (distortion); compute RD cost]

Figure 1 shows the variable block-size MB modes in H.264/AVC, and Figure 2 shows the computation of the Rate-Distortion cost (RD cost) for the variable block-size Inter MB modes, the Intra MB modes of 4×4, 8×8, and 16×16, SKIP, and Direct mode in H.264/AVC. In Figure 2, the H.264/AVC encoder calculates the RD cost of all MB modes and selects the MB mode with the minimum RD cost, and this process is repeated for every MB. Therefore, the computational complexity of H.264/AVC is higher than that of conventional video coding standards such as MPEG-2 Video, H.263, or MPEG-4 Video. Furthermore, MVC, which takes into account both temporal prediction in H.264/AVC and inter-view prediction among views to improve coding efficiency, requires much higher computational complexity than H.264/AVC.

In this paper, a fast mode decision scheme is proposed for the MVC encoder to reduce computational complexity. The paper is organized as follows. In Section 2, the multiview video coding scheme is introduced briefly. In Section 3, the proposed fast mode decision method is described in detail. In Section 4, the performance of the proposed method is compared with that of the MVC reference model on the basis of experimental results. Finally, conclusions are given in Section 5.

978-0-7695-3546-3/08 $25.00 2008 IEEE, DOI 10.1109/FGCS.2008.58

2. Feature of Multiview Video Coding

2.1. Hierarchical B picture and prediction structure

The increased flexibility of H.264/AVC compared with previous video coding standards contributes to its improved coding efficiency. In contrast to the previous standards, the coding order and display order of pictures in H.264/AVC are completely decoupled. These features allow the selection of arbitrary coding/prediction structures, which are not supported by the previous standards. The hierarchical B picture structure [5], a typical example of this structural flexibility, was introduced through scalable video coding (SVC) [6]. This structure obtains higher coding efficiency than the existing IBBP structure. A typical hierarchical prediction structure with 4 dyadic hierarchy stages is depicted in Figure 3.

Figure 3. Hierarchical coding structure with 4 temporal levels.

The H.264/AVC-based MVC method that was chosen as the reference model [7] for the standardization of MVC has shown significant coding efficiency. Figure 4 depicts an example of the H.264/AVC-based MVC structure with eight parallel views. As shown in Figure 4, this structure uses hierarchical B pictures, which not only improve the coding efficiency but also provide temporal scalability. The structure can be divided into three kinds of picture sets: the picture set predicted from inter-view pictures on the view axis, the picture set predicted from temporal pictures on the temporal axis, and the picture set predicted from view-temporal (spatiotemporal) pictures on both the view and temporal axes. In this structure, the I_k and P_k pictures are used only at random access points. The B_k pictures are used between the I and P pictures that are random access points, and also in all other positions except these random access points, where the subscript k denotes the temporal decomposition level. The B_1 pictures on the temporal axis T0 are predicted spatially (inter-view), the B_k pictures on the view axes V0, V2, V4, and V6 are predicted temporally, and the B_k pictures on the view axes V1, V3, V5, and V7 are predicted both temporally and spatially. To allow synchronization, I-frames start each GOP (S0/T0, S0/T8, etc.).

Figure 4. Inter-view/temporal prediction structure using hierarchical B pictures.

In the example above, a GOP length of 8 is shown to explain the coding scheme, but GOP lengths of 12 and 15 were used for the coding experiments.

2.2. Disparity Vector

Since there is redundant information among the views in multiview video, disparity estimation (DE) is a key technology for eliminating the view (spatial) redundancy between the views in MVC. Just as motion estimation (ME) removes temporal redundancy in conventional video coding, DE can be used to remove the redundancy among the views in MVC. As an example, views at different camera positions are shown for the Exit and Ballroom MVC sequences in Figure 5(a)-(b) and Figure 5(c)-(d), respectively. It can be inferred that view 2 in Figure 5(b) and (d) is located to the right of view 0 in Figure 5(a) and (c). The disparity information among the views can be estimated by the DE process.
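To make the analogy between ME and DE concrete, the following is a minimal sketch of block-matching disparity estimation with a SAD (Sum of Absolute Differences) criterion. It is not taken from the paper or from JMVM: the function names, the horizontal-only search, the block size, and the search range are all assumptions chosen for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(frame, x, y, size):
    """Extract a size-by-size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def estimate_disparity(cur, ref, x, y, size=4, search=3):
    """Return (d, cost): the horizontal offset d in [-search, search] that
    minimises the SAD between the block at (x, y) in the current view and
    the block at (x + d, y) in the reference view (hypothetical helper)."""
    width = len(ref[0])
    target = crop(cur, x, y, size)
    best_d, best_cost = 0, None
    for d in range(-search, search + 1):
        rx = x + d
        if rx < 0 or rx + size > width:
            continue  # candidate block would fall outside the picture
        cost = sad(target, crop(ref, rx, y, size))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# Toy data: the "current view" is the reference view shifted 2 pixels left.
ref = [[(3 * c + 7 * r) % 31 for c in range(16)] for r in range(4)]
cur = [row[2:] + row[:2] for row in ref]

print(estimate_disparity(cur, ref, 4, 0))  # (2, 0)
```

In this toy case the block at column 4 of the current view is found at column 6 of the reference view, so the estimated disparity is 2 pixels with zero residual.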

Figure 5. View disparity of Exit and Ballroom sequences: (a) 0th frame in view 0 of the Exit sequence, (b) 0th frame in view 2 of the Exit sequence, (c) 0th frame in view 0 of the Ballroom sequence, (d) 0th frame in view 2 of the Ballroom sequence.

3. Fast Mode Decision using Global Disparity Vector

3.1. Region Partition

The H.264/AVC-based MVC performs the ME/DE process on the variable block sizes shown in Figure 1. In an H.264/AVC codec, large-block ME is usually performed on homogeneous areas, and small-block ME is performed on detailed areas containing edge information [8]. In addition, most video sequences contain moving objects in front of a stationary background: object regions are mostly encoded with small block sizes, while stationary background regions are mostly encoded with a large block size such as Inter16×16, Direct mode, or SKIP mode in inter-frame coding. Table 1 shows the proposed segmentation into background and object block modes for fast mode decision in inter-view prediction.

Table 1. Segmentation of background and object block modes

Region            MB mode
Background mode   P_Skip, B_Skip, Direct, Inter16×16
Object mode       all other modes

In Table 1, an MB is classified as a background block if its MB mode is P_SKIP or B_SKIP, or if its mode is Direct or Inter16×16 and the derived motion vector is smaller than 1/4 in integer-pixel units.

Figure 6. Region segmentation map of Exit and Ballroom sequences: (a) 4th frame in view 0 of the Exit sequence, (b) region segmentation map of (a), (c) 4th frame in view 0 of the Ballroom sequence, (d) region segmentation map of (c).

Figure 6(a) and (c) show the 4th frames of view 0 of the Exit and Ballroom sequences, and Figure 6(b) and (d) show the region segmentation maps of Figure 6(a) and Figure 6(c), respectively, created according to Table 1. In Figure 6(b) and (d), dark blocks are object regions (blocks) and the other regions are background regions.

3.2. Motion Skip Mode and Global Disparity Vector

The global disparity vector (GDV) represents the overall disparity between views. Although there are various ways to derive a GDV, the proposed method uses the GDV of motion skip mode [9], which is derived in MB units (±16 integer-pixel units) from an already decoded neighboring view, as shown in Figure 7, so that full pixel-based disparity estimation is not necessary. Motion skip mode, which has been adopted in the Joint Multiview Video Model (JMVM), the reference model of MVC, copies the motion information of the corresponding neighbor MB indicated by the GDV in the reference view (view 0) to the current MB in the current view (view 2).
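The copy operation of motion skip mode can be sketched as a simple lookup in MB coordinates. This is only an illustration under stated assumptions: `MotionInfo`, `motion_skip`, the field layout, and the sign convention of the GDV are hypothetical and are not taken from the JMVM software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionInfo:
    """Per-MB motion information (hypothetical container for this sketch)."""
    mb_mode: str
    mv: tuple       # (mv_x, mv_y) of the MB
    ref_idx: int    # reference picture index


def motion_skip(ref_view_motion, mb_x, mb_y, gdv_mb):
    """Return the motion info of the reference-view MB indicated by the GDV.

    ref_view_motion is a 2-D grid [row][col] of MotionInfo for the reference
    view; (mb_x, mb_y) is the current MB position and gdv_mb the GDV, both
    in MB units. The offset is added to the current position (assumed sign
    convention). None is returned when the corresponding MB falls outside
    the picture, so the caller can fall back to normal mode decision.
    """
    rows, cols = len(ref_view_motion), len(ref_view_motion[0])
    rx, ry = mb_x + gdv_mb[0], mb_y + gdv_mb[1]
    if not (0 <= rx < cols and 0 <= ry < rows):
        return None
    return ref_view_motion[ry][rx]


# Toy reference view: one row of three MBs.
grid = [[MotionInfo("B_SKIP", (0, 0), 0),
         MotionInfo("INTER16x16", (3, -1), 0),
         MotionInfo("INTER8x8", (5, 2), 1)]]

print(motion_skip(grid, 0, 0, (1, 0)))
```

Because the GDV is expressed in whole MBs, the lookup needs no sub-MB alignment or pixel-level search, which is why the text notes that full pixel-based disparity estimation is unnecessary here.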

Figure 7. Derivation process of global disparity vector in motion skip mode.
[Diagram labels: reference view (view 0); view using inter-view prediction (view 2); anchor pictures at POC_ahead and POC_behind with GDV_ahead and GDV_behind; non-anchor picture at POC_cur with GDV_cur; time axis]

Figure 7 shows the derivation of GDV_cur. The two GDVs of the anchor pictures, GDV_ahead and GDV_behind, are each derived independently from the two anchor pictures by using the SAD (Sum of Absolute Differences) in MB units. GDV_cur of a non-anchor picture between the anchor pictures is derived by Eq. (1) from POC_cur, POC_ahead, and POC_behind, where POC denotes the picture order count (display order) on the time axis.

\[
GDV_{cur} = GDV_{ahead} + \frac{POC_{cur} - POC_{ahead}}{POC_{behind} - POC_{ahead}} \left( GDV_{behind} - GDV_{ahead} \right) \qquad (1)
\]

3.3. Fast Mode Decision in View for Inter-view Prediction

Figure 8 shows the flowchart of the proposed method. Using the region partitioning of Section 3.1, a region segmentation map separating object and background is generated from the pictures of the reference view. The regions of the views that use inter-view prediction (e.g., V1 and V2 in Figure 4) are then estimated from the MB-based GDV and the region segmentation map of the reference view V0.

Figure 9. Region estimation at inter-view in Exit sequence: (a) region segmentation information of the base view, (b) region segmentation information of the non-base view, estimated using the global disparity vector and (a).

Figure 9 shows an example of region segmentation estimation. Figure 9(a) is the region segmentation map of the reference view, where black blocks are classified as object regions. Figure 9(b) is the region segmentation map estimated from the GDV and the reference-view map in Figure 9(a); black blocks in Figure 9(b) are estimated object regions. Based on the estimated region information, the RDO process is performed on only the four background modes listed in Table 1 if the current MB is classified as background. Otherwise, RDO is performed on all modes.
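The steps of Sections 3.1 to 3.3 can be put together in a short sketch: interpolate the GDV with Eq. (1), classify reference-view MBs according to Table 1, shift the resulting map by the GDV, and restrict the RDO candidate set for background MBs. All names below are hypothetical. The rounding of the interpolated GDV to whole MBs, the reading of the 1/4-pixel threshold as a per-component test, the treatment of MBs without a counterpart as object, and the exact list of non-background modes are assumptions, not details confirmed by the paper.

```python
BACKGROUND_MODES = ("P_SKIP", "B_SKIP", "DIRECT", "INTER16x16")
# Assumed naming of the remaining H.264/AVC MB modes.
ALL_MODES = BACKGROUND_MODES + ("INTER16x8", "INTER8x16", "INTER8x8",
                                "INTRA16x16", "INTRA8x8", "INTRA4x4")


def interpolate_gdv(gdv_ahead, gdv_behind, poc_cur, poc_ahead, poc_behind):
    """Eq. (1) applied per component; rounding to MB units is an assumption
    (Python's round() resolves exact .5 ties to the nearest even integer)."""
    w = (poc_cur - poc_ahead) / (poc_behind - poc_ahead)
    return tuple(round(a + w * (b - a)) for a, b in zip(gdv_ahead, gdv_behind))


def classify_mb(mb_mode, mv=(0.0, 0.0)):
    """Table 1: SKIP is background; Direct/Inter16x16 is background only if
    both MV components are below 1/4 pixel (assumed interpretation)."""
    if mb_mode in ("P_SKIP", "B_SKIP"):
        return "background"
    if mb_mode in ("DIRECT", "INTER16x16") and max(abs(mv[0]), abs(mv[1])) < 0.25:
        return "background"
    return "object"


def estimate_region_map(ref_map, gdv_mb):
    """Shift the reference-view region map by the GDV (MB units). MBs with
    no counterpart in the reference view default to 'object' (assumption)."""
    rows, cols = len(ref_map), len(ref_map[0])
    out = []
    for y in range(rows):
        row = []
        for x in range(cols):
            rx, ry = x + gdv_mb[0], y + gdv_mb[1]
            inside = 0 <= rx < cols and 0 <= ry < rows
            row.append(ref_map[ry][rx] if inside else "object")
        out.append(row)
    return out


def best_mode(region, rd_cost):
    """RDO over four modes for background MBs, over all modes otherwise."""
    candidates = BACKGROUND_MODES if region == "background" else ALL_MODES
    return min(candidates, key=rd_cost)


gdv_cur = interpolate_gdv((2, 0), (4, 0), poc_cur=4, poc_ahead=0, poc_behind=8)
ref_map = [[classify_mb(m) for m in ("P_SKIP", "INTER8x8", "B_SKIP", "DIRECT")]]
cur_map = estimate_region_map(ref_map, (1, 0))
toy_costs = {"DIRECT": 12.0, "INTER8x8": 10.0}   # made-up RD costs
print(gdv_cur, cur_map, best_mode(cur_map[0][1], lambda m: toy_costs.get(m, 20.0)))
```

For a background MB the encoder evaluates four RD costs instead of the full mode list, which is where the reduction in encoding time comes from; the price is that a cheaper non-background mode (here the made-up INTER8x8 cost) is never considered.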
When the current MB is classified as background, the amount of RDO computation is greatly reduced because RDO selects the best MB mode from only the four background modes.

4. Experimental results

Four MVC test sequences recommended for MVC experiments [10] are simulated to verify the performance of the proposed scheme. For each sequence, 2 GOPs in 3 views are encoded. The proposed method is compared with the MVC reference software JMVM 4.0. As shown in Figure 10 and Table 2, the proposed method reduces the encoding time by approximately 40% compared with JMVM 4.0 in views using inter-view prediction (e.g., V1 and V2 in Figure 4), with an average BD (Bjøntegaard delta)-PSNR [11] degradation of 0.05 dB.

Table 2. Experimental results

                   JMVM 4.0                Proposed (P, B view)
Sequence    QP     Bitrate     PSNR        Bitrate     PSNR        BD-PSNR
                   (kbps)      (dB)        (kbps)      (dB)        (dB)
Ballroom    22     1476.79     39.43       1494.27     39.41
            27      720.18     37.37        731.18     37.33
            32      368.40     34.85        374.67     34.79       -0.09
            37      206.14     32.22        208.19     32.15
Exit        22      804.92     40.35        807.70     40.33
            27      327.47     39.07        331.22     39.05
            32      166.72     37.32        168.50     37.28       -0.05
            37       98.38     35.13         99.03     35.09
Flamenco2   22     1930.97     41.54       1935.90     41.52
            27     1028.76     38.69       1031.20     38.66
            32      538.03     35.64        537.27     35.59       -0.04
            37      288.36     32.62        288.13     32.56
Race1       22     1564.40     40.29       1566.12     40.28
            27      737.30     37.69        737.66     37.67
            32      369.63     35.02        369.81     34.99       -0.02
            37      214.64     32.25        215.32     32.23
Avg.                                                               -0.05
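As a quick consistency check on Table 2, the reported average BD-PSNR can be recomputed from the four per-sequence values, and the bitrate change of an individual operating point can be expressed as a percentage. The helper name `bitrate_change_percent` is hypothetical; the numbers are copied from Table 2.

```python
# Per-sequence BD-PSNR values (dB) from Table 2.
bd_psnr = {"Ballroom": -0.09, "Exit": -0.05, "Flamenco2": -0.04, "Race1": -0.02}

average = sum(bd_psnr.values()) / len(bd_psnr)
print(f"average BD-PSNR: {average:.2f} dB")  # average BD-PSNR: -0.05 dB


def bitrate_change_percent(reference_kbps, proposed_kbps):
    """Relative bitrate change of the proposed method, in percent."""
    return 100.0 * (proposed_kbps - reference_kbps) / reference_kbps


# Ballroom at QP 22: JMVM 4.0 versus the proposed method.
ballroom_qp22 = bitrate_change_percent(1476.79, 1494.27)
# Flamenco2 at QP 32: the proposed method uses slightly fewer bits here.
flamenco2_qp32 = bitrate_change_percent(538.03, 537.27)
print(f"{ballroom_qp22:.2f}% {flamenco2_qp32:.2f}%")
```

The mean of the four values reproduces the -0.05 dB average reported in the table.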

Figure 8. Flowchart for the proposed method.
[Flowchart labels: Encode MB; Check MB mode using GDV; Inter-view prediction?; R-D optimization; RDO about all modes; Finish encoding MB]

Figure 10. Speed-up ratio for P and B views, where the horizontal axis represents QP values.

5. Conclusions

This paper presents a fast mode decision method for MVC that uses both the GDV and region partitioning. The method exploits the correlation among views to speed up multiview video coding. Experimental results show that the proposed scheme reduces encoding time by about 40% on average with a BD-PSNR degradation of 0.05 dB.

References

[1] A. Smolic and D. McCutchen, "3DAV exploration of video-based rendering technology in MPEG," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 3, pp. 348-356, Mar. 2004.
[2] G. Sullivan, T. Wiegand, and A. Luthra, "Draft of Version 4 of H.264/AVC (ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG-4 Part 10) Advanced Video Coding)," ISO/IEC JTC1/SC29/WG11 and ITU-T Q6/SG16, Doc. JVT-050d1, 2005.
[3] K. Mueller, P. Merkle, A. Smolic, and T. Wiegand, "Multiview Coding using AVC," ISO/IEC JTC1/SC29/WG11, Bangkok, Thailand, Doc. M12945, Jan. 2006.
[4] "Subjective Test Results for the CfP on Multi-View Video Coding," ISO/IEC JTC1/SC29/WG11, Bangkok, Thailand, Doc. 7779, 2006.
[5] H. Schwarz, D. Marpe, and T. Wiegand, "Hierarchical B pictures," ISO/IEC JTC1/SC29/WG11 and ITU-T Q6/SG16, Doc. JVT-P014, Poznan, Poland, July 2005.
[6] J. Reichel, M. Wien, and H. Schwarz, "Joint Scalable Video Model JSVM-7," ISO/IEC JTC1/SC29/WG11 and ITU-T Q6/SG16, Doc. JVT-T202, Jul. 2006.
[7] A. Vetro, P. Pandit, H. Kimata, and A. Smolic, "Joint Multiview Video Model (JMVM) 5.0," ISO/IEC JTC1/SC29/WG11 and ITU-T Q6/SG16, Doc. JVT-X207, Jul. 2007.
[8] Iain E. G. Richardson, H.264 and MPEG-4 Video Compression, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, UK, 2003.
[9] H.-S. Koo, Y.-J. Jeon, and B.-M. Jeon, "MVC Motion Skip Mode," ISO/IEC JTC1/SC29/WG11 and ITU-T Q6/SG16, Doc. JVT-W081, Apr. 2007.
[10] Y. Su, A. Vetro, and A. Smolic, "Common Test Conditions for Multiview Video Coding," ISO/IEC JTC1/SC29/WG11 and ITU-T Q6/SG16, Doc. JVT-U211, Oct. 2006.
[11] G. Bjontegaard, "Calculation of average PSNR differences between RD-curves," ITU-T Q6/SG16, Doc. VCEG-M33, Apr. 2001.