Design of Coupling Coding in MPEG-4 HE-AAC


Student: Chi-Ming Chang
Advisors: Dr. Chi-Min Liu, Dr. Wen-Chieh Lee

A Thesis Submitted to the Institute of Computer Science and Engineering, College of Computer Science, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Master in Computer Science

June 2006
Hsinchu, Taiwan, Republic of China


The coupling coding in SBR is adopted to transform the envelope data into a de-correlated domain and thereby save bits. However, because of the inherent constraint of the coupling coding, some side information needs to be shared by the stereo channels. This sharing raises two quality issues: the determination of the shared T/F grid and of the shared chirp factor. In addition, the risk of quality degradation caused by the quantization process also needs to be inspected. This thesis considers the possible artifacts to examine the decision of the shared parameters, and proposes a coupling decision method for the tradeoff between high band quality and bit demand. Both subjective and objective tests are conducted to check the quality improvement. The objective test measure used is the recommendation system by ITU-R Task Group 10/4.


Contents

Contents
Figure List
Table List
Chapter 1 Introduction
Chapter 2 Backgrounds
  2.1 MPEG-4 High Efficiency AAC
  2.2 Related Modules in SBR to Coupling Coding
    2.2.1 Time/Frequency Grid in HE-AAC
    2.2.2 Chirp Factor of Inverse Filtering
    2.2.3 Noise Floor Scale Factor Q
Chapter 3 Design of Coupling Coding in SBR
  3.1 Overview of Coupling Coding Schemes in HE-AAC
  3.2 Decision of Shared T/F Grid
    3.2.1 Design of T/F Grid by Dynamic Programming in Non-coupling Mode
    3.2.2 Design of T/F Grid by Dynamic Programming in Coupling Mode
  3.3 Decision of Shared Inverse Filtering Intensity
    3.3.1 Decision of Inverse Filtering Intensity in Non-coupling Mode
    3.3.2 Decision of Inverse Filtering Intensity in Coupling Mode
  3.4 Decision of Noise Floor Scalefactor
  3.5 Coupling Switch Method
    3.5.1 Quantization Error Analysis
    3.5.2 Energy Abnormal Phenomenon
    3.5.3 Summary
Chapter 4 Experiments
  4.1 Experiment Environment
  4.2 Objective Quality Measurement in MPEG Test Tracks
  4.3 Objective Quality Measurement in Music Database
  4.4 Subjective Quality Measurement
Chapter 5 Conclusion
References

Figure List

Figure 1: Diagram of the HE-AAC and HE-AAC v2. [9]
Figure 2: Quality comparison at different bit-rates among AAC, HE-AAC, and HE-AAC v2. [6]
Figure 3: Birdies effect occurring in LF of HE-AAC due to the insufficient bits
Figure 4: Improvement of the birdies effect by taking advantage of coupling coding
Figure 5: Basic architecture of HE-AAC encoder
Figure 6: An example of reconstruction process of SBR
Figure 7: Block diagram of HE-AAC encoder
Figure 8: An example of the frequency table and time segment
Figure 9: An example of VAR frame border shifted by a few sample points
Figure 10: HF adjustment process of noise-demand high resolution grid with three subbands b_0, b_1, b_2
Figure 11: HF adjustment process of tone-demand high resolution grid with three subbands b_0, b_1, b_2
Figure 12: The syntax of the SBR extension data elements in coupling and non-coupling modes
Figure 13: Diagram of shared time segment in coupling and non-coupling modes
Figure 14: An example of the optimal partition from i to j time unit with 3 time borders and 2 high resolution envelopes
Figure 15: Flowchart of the DP method proposed in [9]
Figure 16: Decision flowchart of inverse filtering mode
Figure 17: The search range of quantized noise floor for different modes
Figure 18: Flowchart of the proposed search method of the noise floor scale factor decision
Figure 19: The test track has high energy difference in high frequency
Figure 20: The high reconstruction error in weaker channel
Figure 21: The low reconstruction error in stronger channel
Figure 22: The relationship between mean of relative error and Ψ when Ψ is more than one in the right channel
Figure 23: The relationship between mean of relative error and Ψ when Ψ is less than one in the right channel
Figure 24: The relationship between variance of relative error and Ψ when Ψ is more than one in the right channel
Figure 25: The relationship between variance of relative error and Ψ when Ψ is less than one in the right channel
Figure 26: The relationship between mean of relative error and Ψ when Ψ is more than one in the left channel
Figure 27: The relationship between mean of relative error and Ψ when Ψ is less than one in the left channel
Figure 28: The relationship between variance of relative error and Ψ when Ψ is more than one in the left channel
Figure 29: The relationship between variance of relative error and Ψ when Ψ is less than one in the left channel
Figure 30: Coupling Switch Flowchart
Figure 31: The variance in the ODGs of proposed coupling approaches at 80 kbps
Figure 32: The variance in the ODGs of proposed coupling approaches at 64 kbps
Figure 33: The variance in the ODGs of proposed coupling approaches at 48 kbps
Figure 34: The average ODGs of method M0 and M1 at 80 kbps in 16 categories
Figure 35: The average ODGs of method M0 and M1 at 64 kbps in 16 categories
Figure 36: The average ODGs of method M0 and M1 at 48 kbps in 16 categories
Figure 37: The suffered spectrum of the silence in the impulse_m0_0db track
Figure 38: The spectrum of the triangle1 track in the TonalSignals set
Figure 39: Reconstructed spectrogram of the triangle1 track in normal mode
Figure 40: Reconstructed spectrogram of the triangle1 track in coupling mode
Figure 41: The result of the subjective test for coupling coding at 48 kbps

Table List

Table 1: The parameter newbw decided by inverse filtering mode
Table 2: The scenarios of the ten bit-consuming stages
Table 3: Comparison of the grid criterion in the normal/coupling mode
Table 4: The twelve tracks recommended by MPEG
Table 5: Objective measurements through the ODGs for proposed coupling approach at 80 kbps
Table 6: Objective measurements through the ODGs for proposed coupling approach at 64 kbps
Table 7: Objective measurements through the ODGs for proposed coupling approach at 48 kbps
Table 8: The PSPLab audio database [13]

Chapter 1 Introduction

MPEG-4 HE-AAC (High Efficiency Advanced Audio Coding) extends the conventional AAC [1] by supporting SBR (Spectral Band Replication) [2][3][4][5]. The basic principle of SBR is to reconstruct the high frequency spectral bands by replicating the low frequency spectral bands. The resulting codec is referred to as MPEG-4 HE-AAC or AACplus. Besides taking SBR as the bandwidth extension tool, PS (parametric stereo) [6][7][8] coding is further incorporated as the channel reduction tool. The integrated codec is referred to as MPEG-4 HE-AAC version 2. Figure 1 illustrates the scheme of HE-AAC and HE-AAC version 2.

Figure 1: Diagram of the HE-AAC and HE-AAC v2. [9] (AAC with SBR forms HE-AAC v1; AAC with SBR and PS forms HE-AAC v2)

To fit a variety of demands, the three coding schemes are applied at different bit rates. In order to maintain the audio quality at low bit rates, HE-AAC is adopted between 48 and 96 kbps. Furthermore, to satisfy the requirement of very low bit rates below 48 kbps, HE-AAC version 2 is proposed. Figure 2 illustrates the relationship between the bit rate and the perceptual quality. However, the efficiency of the complete system is determined largely by the cooperation of the three modules. An unsuitable design of any one of the three modules affects the others, and hence seriously degrades the quality of HE-AAC version 2.

This thesis focuses on the coupling coding in SBR. The principle of the coupling coding is to transform the left/right (L/R) energy signals into average/ratio (A/R) mode to eliminate signal correlation, and to take advantage of parameter sharing to save bits. Especially at very low bit rates, the bit shortage results in many annoying artifacts at the low frequencies in AAC. Hence, the purpose of coupling coding is to save bits in the high frequency and spend them on the low frequency quality. Figure 3 shows an example of a common artifact, known as the birdies effect [10], in the low frequency part due to a lack of bits in AAC. The spectral valley is visible in the low frequency spectrum encoded by AAC. Figure 4 shows that the artifact is largely alleviated once AAC is supplied with enough bits.

Figure 2: Quality comparison at different bit-rates among AAC, HE-AAC, and HE-AAC v2. [6] (axes: quality from Bad to Excellent versus stereo bit-rate [kbit/sec])

Figure 5 illustrates the block diagram of the HE-AAC encoder. This thesis considers the coupling coding design through four design issues. The first and second issues are the decision of the shared T/F grid [9] parameter and of the shared inverse filtering intensity [11][12]. Furthermore, according to the constraint in the coupling mode, the difference between the two channels in the quantized values of specific parameters, named noise floor scale factors [11][12], is limited. Therefore, selecting suitable quantized noise floor scale factors for the two channels is the third issue. Finally, the last issue is the coupling switch method, which needs to consider the tradeoff between high band quality and bit demand, as well as some possible risks of quality degradation.

Figure 3: Birdies effect occurring in LF of HE-AAC due to the insufficient bits

Figure 4: Improvement of the birdies effect by taking advantage of coupling coding

Figure 5: Basic architecture of HE-AAC encoder (blocks: PCM input, QMF analysis, QMF synthesis, AAC core encoder, SBR encoder, bitstream formatter, bitstream output)

This thesis is organized as follows. Chapter 2 introduces the fundamental background of MPEG-4 HE-AAC. In Chapter 3, the four design issues of the coupling coding in MPEG-4 HE-AAC are investigated: the coupling switch method, the decision of the shared T/F grid parameter, the decision of the shared inverse filtering intensity, and the decision of the noise floor scalefactors. Chapter 4 conducts experiments to verify the performance of the proposed coupling coding method. Chapter 5 concludes this thesis.

Chapter 2 Backgrounds

This chapter introduces the fundamental background of MPEG-4 HE-AAC. In particular, the three modules related to the coupling coding in SBR are described.

2.1 MPEG-4 High Efficiency AAC

Through the cooperation of SBR and the conventional AAC, HE-AAC can maintain high audio quality at very low bit rates. SBR takes care of the high frequency content, while the AAC encoder compresses the low frequency content. Because SBR consumes few bits, most of the available bits are supplied to AAC to maintain the quality of the low frequency. Figure 6 illustrates the reconstruction process of SBR.

Figure 6: An example of reconstruction process of SBR. (a) Original spectrum; (b) decoded LF spectrum from AAC; (c) HF regeneration by replicating LF bands; (d) reconstructed HF after envelope adjustment

The HE-AAC decoder reconstructs the high frequency by replicating the low frequency decoded from AAC, and then adjusts the tonality of the replicated low band to be close to the tonality of the original high band, using the recorded difference between the content of the high and low bands. There are two main modules in the SBR encoder: the time/frequency grid module and the high frequency adjustment module. The time/frequency grid module splits the high bands into several T/F grids. Each T/F grid records the average energy for the HE-AAC decoder. The HF adjustment module records the difference between the original and replicated contents of the high frequency part. Besides the basic reconstruction operation, a data-rate reduction tool, coupling coding, is adopted to further eliminate the signal correlation and save bits in the high frequency through a domain transformation of the envelope data and parameter sharing. The bits saved by the coupling coding mechanism can be supplied to AAC to effectively improve the low frequency quality and achieve the optimal overall quality.

Figure 7: Block diagram of HE-AAC encoder (from input signal to coded audio stream)
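The domain transformation mentioned above can be illustrated with a minimal Python sketch. It assumes the simplest reading of the average/ratio mode, namely average = (E_L + E_R)/2 and ratio = E_L/E_R on linear energies; the function names are hypothetical, and the standard's actual quantized, logarithmic representation is not reproduced here.

```python
def lr_to_ar(e_left, e_right):
    """Map left/right envelope energies to (average, ratio).

    Assumption for illustration: linear energies, e_right > 0.
    """
    average = 0.5 * (e_left + e_right)
    ratio = e_left / e_right
    return average, ratio


def ar_to_lr(average, ratio):
    """Invert lr_to_ar: recover the left/right energies."""
    e_right = 2.0 * average / (1.0 + ratio)
    e_left = ratio * e_right
    return e_left, e_right


# Round trip on one grid value.
a, r = lr_to_ar(8.0, 2.0)
print(a, r, ar_to_lr(a, r))
```

When the two channels are similar, the ratio stays close to one and changes slowly, which is what makes the second channel cheap to code.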

2.2 Related Modules in SBR to Coupling Coding

In this section, three important modules related to coupling coding are introduced: the time/frequency grid, the chirp factor used in the inverse filtering operation, and the tonality controlling factor Q. How these three modules are handled largely affects the design of the coupling coding.

2.2.1 Time/Frequency Grid in HE-AAC

This subsection introduces the protocol of the time/frequency grid in HE-AAC. Adaptive time and frequency resolution are incorporated into SBR for the envelope coding and adjustment. SBR replicates the low frequency signal to the high frequency signal. The QMF subbands in the SBR range are segmented into several grids by divisions along the time and frequency dimensions. Successive samples in the QMF subbands are integrated into a time envelope. Successive subbands are segmented into several uniform or non-uniform bands with different bandwidths by choosing one of the frequency band tables. The T/F grid, which describes the time segments and the associated frequency tables, is the basic reconstruction unit in the subsequent SBR coding process. The average energy in a grid is used to obtain the rescale ratio, which is taken over the energy of the duplicated low bands. It implies that the locations of the time borders and the resolution of the envelopes determine the accuracy of the replication and the audio quality. The time/frequency grid module has three components: the frequency table, the time segment, and the frame class.

The frequency domain resolution is determined by choosing among the different frequency tables in SBR. The frequency table affects the precision of tone addition and the low frequency that will be replicated to reconstruct the high frequency in the decoder. There are five frequency tables in SBR. The high resolution frequency band table and the low resolution frequency band table are the two resolution tables that can be selected for every envelope of an SBR frame. The noise floor frequency band table and the limiter frequency band table correspond to the noise floor and the limiter, respectively. All frequency band tables are derived from the master frequency band table. The master frequency table is defined by functions whose arguments are transmitted in the SBR header.

The time borders affect the resolution in the time domain, and they are more flexible than the frequency band tables. The time segment contains the number of envelopes in an SBR frame and the locations of the time borders. There are 32 samples in the time domain of one SBR frame, but only 16 locations in the SBR frame are available for the time borders. Because of the constraint in SBR, there are 5 time borders if the frame class is VARVAR. The other frame classes have at most 4 time borders.

There are four SBR frame classes in the SBR time/frequency grid: FIXFIX, FIXVAR, VARFIX, and VARVAR. The four classes indicate whether the locations of the leading and trailing SBR frame boundaries are variable. HE-AAC allows the boundary between two frames to shift by a few samples, which makes the time domain segmentation more flexible. Figure 9 is an example of a boundary shifted by a few samples to match the signal.

Figure 8: An example of the frequency table and time segment (time segments T along the time axis, with low or high resolution frequency tables F per segment)

Figure 9: An example of VAR frame border shifted by a few sample points
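As a rough illustration of how a T/F grid turns QMF energies into envelope values, the following Python sketch averages the energy inside each grid cell. The function name `grid_energies`, the nested-list layout `qmf_energy[t][k]`, and the two border lists are assumptions made for illustration; they are not taken from the standard or from [9].

```python
def grid_energies(qmf_energy, time_borders, band_borders):
    """Average energy of every T/F grid cell.

    qmf_energy[t][k] : energy of the QMF sample at time slot t, subband k
    time_borders     : ascending slot indices delimiting the envelopes
    band_borders     : ascending subband indices from a frequency band table
    Returns one row per envelope and one column per frequency band.
    """
    result = []
    for t0, t1 in zip(time_borders, time_borders[1:]):
        row = []
        for k0, k1 in zip(band_borders, band_borders[1:]):
            total = sum(qmf_energy[t][k]
                        for t in range(t0, t1)
                        for k in range(k0, k1))
            row.append(total / ((t1 - t0) * (k1 - k0)))
        result.append(row)
    return result


# Toy frame: 4 time slots, 4 subbands, two envelopes, two bands.
toy = [[1, 1, 3, 3],
       [1, 1, 3, 3],
       [2, 2, 4, 4],
       [2, 2, 4, 4]]
print(grid_energies(toy, [0, 2, 4], [0, 2, 4]))
```

Moving a time border or switching between a low and a high resolution band table only changes the two border lists, which is why the grid decision directly controls how finely the envelope follows the signal.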

2.2.2 Chirp Factor of Inverse Filtering

The high bands in the SBR decoder are reconstructed from the low bands. Hence, if the tonality of the replicated low bands does not match the features of the original high bands, inverse filtering is applied to eliminate the excess tonal components of the replicated low bands. The inverse filtering process is performed in two steps. First, linear prediction is applied to the replicated low bands. Then the actual inverse filtering is performed for each of the replicated low bands patched to the high bands. The resultant high band generated in the SBR decoder is obtained as

$$X_{High}(k, l) = X_{Low}(p, l) - \alpha \left\{ a_0 X_{Low}(p, l-1) + \alpha\, a_1 X_{Low}(p, l-2) \right\}, \tag{1}$$

where $\alpha$ is the chirp factor that controls the inverse filtering level, $X_{Low}$ is the low band signal analyzed from the output of the AAC decoder, and $a_0$, $a_1$ are the prediction coefficients used to filter the subband signal in the inverse filtering. The chirp factor moves the location of the poles of the inverse filter and affects the degree of harmonic attenuation in the high frequency generator. According to the standard [2], the chirp factor is calculated as

$$\alpha(i) = \begin{cases} 0 & \text{if } tempbw(i) < 0.015625 \\ tempbw(i) & \text{if } tempbw(i) \ge 0.015625 \end{cases}, \quad 0 \le i < N_Q, \tag{2}$$

where $N_Q$ is the number of noise floor bands, and $tempbw(i)$ is calculated as

$$tempbw(i) = \begin{cases} 0.75 \cdot newbw + 0.25 \cdot \alpha'(i) & \text{if } newbw < \alpha'(i) \\ 0.90625 \cdot newbw + 0.09375 \cdot \alpha'(i) & \text{if } newbw \ge \alpha'(i) \end{cases}, \quad 0 \le i < N_Q, \tag{3}$$

where $\alpha'$ is the $\alpha$ value calculated in the previous SBR frame, and newbw is decided by the inverse filtering modes of the current frame and the previous frame according to Table 1.

Table 1: The parameter newbw decided by inverse filtering mode

bs_invf_mode'(i)      bs_invf_mode(i) (current frame)
(previous frame)      Off     Low     Intermediate   Strong
Off                   0.0     0.6     0.9            0.98
Low                   0.6     0.75    0.9            0.98
Intermediate          0.0     0.75    0.9            0.98
Strong                0.0     0.75    0.9            0.98

2.2.3 Noise Floor Scale Factor Q

The noise floor scale factor is the parameter that affects the magnitude adjustment of each subband in the high resolution grid and the tone-noise additive level. There are two types of envelope scaling, one for the noise-demand grid and one for the tone-demand grid, according to their content requirements. In the noise-demand grid, the magnitudes of the replicated bands are adjusted by a gain control factor defined as [11]

$$G_{ND}^{k} = \sqrt{\frac{E_o^{k}}{E_r^{k}} \cdot \frac{1}{1+Q}}, \tag{4}$$

where $E_o^{k}$ and $E_r^{k}$ are the average energy of the original high frequency signal and of the replicated low frequency signal, respectively, in the k-th high resolution grid, and Q is the noise floor scale factor. After the envelope scaling, random noise is added to the high frequency with a level defined as

$$C_n^{k} = \sqrt{E_o^{k} \cdot \frac{Q}{1+Q}}. \tag{5}$$

Similarly, the gain control factor in the tone-demand grid is defined as

$$G_{TD}^{k} = \sqrt{\frac{E_o^{k}}{E_r^{k}} \cdot \frac{Q}{1+Q}}. \tag{6}$$

Also, the energy amount for the compensated tone is defined as

$$C_t^{k} = \sqrt{E_o^{k} \cdot \frac{1}{1+Q}}. \tag{7}$$

Figure 10 and Figure 11 illustrate the two different adjustment results for the noise-demand and tone-demand grids.

Figure 10: HF adjustment process of noise-demand high resolution grid with three subbands b_0, b_1, b_2 (steps: scaling by the gain G_ND, then noise compensation C_n)
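The two kinds of envelope scaling in (4) to (7) can be checked numerically with a short Python sketch. It assumes the square-root (amplitude) form of the gains and levels, as used for the corresponding quantities in the SBR standard; the function names are hypothetical.

```python
import math


def noise_demand(e_orig, e_repl, q):
    """Gain and added-noise level for a noise-demand grid, after (4) and (5)."""
    gain = math.sqrt(e_orig / e_repl / (1.0 + q))
    noise_level = math.sqrt(e_orig * q / (1.0 + q))
    return gain, noise_level


def tone_demand(e_orig, e_repl, q):
    """Gain and added-tone level for a tone-demand grid, after (6) and (7)."""
    gain = math.sqrt(e_orig / e_repl * q / (1.0 + q))
    tone_level = math.sqrt(e_orig / (1.0 + q))
    return gain, tone_level


# In both cases the scaled replica plus the compensation restores e_orig.
e_orig, e_repl, q = 16.0, 4.0, 3.0
for gain, level in (noise_demand(e_orig, e_repl, q),
                    tone_demand(e_orig, e_repl, q)):
    print(e_repl * gain ** 2 + level ** 2)
```

Under this assumption the two cases split the original energy in complementary proportions: a share 1/(1+Q) and a share Q/(1+Q), with the roles of the scaled replica and the compensation swapped between the noise-demand and tone-demand grids.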

Figure 11: HF adjustment process of tone-demand high resolution grid with three subbands b_0, b_1, b_2 (steps: scaling by the gain G_TD, then tone compensation C_t)
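For the chirp factor of Section 2.2.2, the update in (2) and (3) together with Table 1 can be sketched in Python as follows. The table orientation (outer key: mode of the previous frame, inner key: mode of the current frame) and all names are assumptions made for this illustration.

```python
# Table 1: NEW_BW[previous mode][current mode]
NEW_BW = {
    "off":          {"off": 0.0, "low": 0.6,  "intermediate": 0.9, "strong": 0.98},
    "low":          {"off": 0.6, "low": 0.75, "intermediate": 0.9, "strong": 0.98},
    "intermediate": {"off": 0.0, "low": 0.75, "intermediate": 0.9, "strong": 0.98},
    "strong":       {"off": 0.0, "low": 0.75, "intermediate": 0.9, "strong": 0.98},
}


def chirp_factor(prev_mode, cur_mode, prev_alpha):
    """Chirp factor of one noise floor band, following (2) and (3)."""
    new_bw = NEW_BW[prev_mode][cur_mode]
    if new_bw < prev_alpha:
        # Filtering is being relaxed: release smoothing, equation (3), first case.
        temp_bw = 0.75 * new_bw + 0.25 * prev_alpha
    else:
        # Filtering is being strengthened: attack smoothing, second case.
        temp_bw = 0.90625 * new_bw + 0.09375 * prev_alpha
    # Equation (2): very small values are clamped to zero.
    return 0.0 if temp_bw < 0.015625 else temp_bw


# A band switching from "off" to "strong", then staying "strong".
alpha = chirp_factor("off", "strong", 0.0)
print(alpha, chirp_factor("strong", "strong", alpha))
```

The smoothing means the chirp factor approaches the table value over a few frames instead of jumping, so a shared inverse filtering mode in coupling mode also shares this frame-to-frame memory.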

Chapter 3 Design of Coupling Coding in SBR

As the foundation of the coupling coding design, this chapter first reviews the coupling coding schemes defined in the HE-AAC standard [2]. Then, based on our related work [9][11] on the design of the T/F grid and tonality compensation, the thesis extends that work to the coupling coding method. The extension should consider the risk of parameter sharing. For example, sharing the T/F grid results in either a coarse reconstructed envelope or higher bit consumption. Also, sharing the chirp factor and the noise-floor factor may destroy the tonality accuracy of one channel and lead to quality degradation. Finally, a huge energy ratio between the left and right channels reduces the data correlation and causes a large quantization error. A coupling switching method that considers this abnormal phenomenon is proposed to balance the tradeoff between quality and bit reduction.

3.1 Overview of Coupling Coding Schemes in HE-AAC

For the energy data of the spectral envelope extracted from the time/frequency (T/F) grids, several processes, including quantization, DPCM, and Huffman entropy coding, are applied in turn to reduce the data rate. Furthermore, to reduce the data redundancy of the stereo energy channels, the coupling mode is adopted to transform the Left/Right (L/R) energy channels into Average/Ratio (A/R) mode to eliminate signal correlation. To meet the inherent requirement of the coupling, the parameters for the T/F grid and inverse filtering need to be shared by the A/R channels in coupling mode. Figure 12 illustrates the syntax of the SBR extension data elements in the two modes. It shows that both the T/F grid and the parameter controlling the inverse filtering level, that is, the chirp factor, are shared. In summary, there are five critical points when the coupling mode is switched on:

- The L/R mode is transformed into the A/R mode.
- The time/frequency grid is shared.
- The chirp factor of inverse filtering is shared.
- The DPCM values of the ratio channel are further quantized with step size 2.
- The difference of the quantized noise-floor scalefactors between the two channels is restricted to the range from 0 to 1 by the syntax constraint.

Figure 12: The syntax of the SBR extension data elements in coupling and non-coupling modes

3.2 Decision of Shared T/F Grid

Instead of an individual set of time segments per channel as in the normal L/R mode, there can be only one common segment set in the coupling mode. Although sharing the side information saves bits, quality artifacts may occur due to inaccurate segments. Hence, the quality degradation should be considered. For the optimal time segments of the signal subbands of the L/R channels in the normal mode, a decision method based on dynamic programming has been proposed in our other work [9]. In this section, a modified decision method for the coupling mode is proposed to determine the optimal common segment set and to measure its effect on quality.

Figure 13: Diagram of shared time segment in coupling and non-coupling modes (non-coupling mode: $G_L = \mathrm{Arg\,Min}_{G_L}(O(G_L))$ and $G_R = \mathrm{Arg\,Min}_{G_R}(O(G_R))$; coupling mode: $G_C = \mathrm{Arg\,Min}_{G_C}(O(G_C))$)

3.2.1 Design of T/F Grid by Dynamic Programming in Non-coupling Mode

In [9], a decision method for the T/F grid based on dynamic programming (DP) in non-coupling mode has been proposed. The basic concept of the DP method is to search for the optimal grid among all possible grids of an individual channel by an efficient recursive procedure. The resultant grid G found by the method is the optimal solution that minimizes the average ratio of the energy difference (the reconstructed energy error) to the original signal energy (DSR) over all quality measurement units, named critical units. That is,

$$G = \mathrm{Arg\,Min}_{G} \left( DSR(G) \right), \tag{8}$$

where $DSR(G)$ is defined as

$$DSR(G) = \frac{\sum_{c \in G} DSR_c}{\#(c)}, \tag{9}$$

where c is a critical unit, $\#(c)$ is the number of critical units, and $DSR_c$ is the reconstruction error of the critical unit c in the frame. In [9], a critical unit spans four sample points in the time direction and one critical band in the frequency direction. The number of time borders and the associated frequency resolution determine the total number of grids and also affect the resultant DSR. The dynamic programming recursion for the DSR analysis is

$$D_{i,j}^{k,u} = \min_{i+1 \le t \le j-1} \left\{ D_{i,t}^{0,0} + D_{t,j}^{k-1,u},\; D_{i,t}^{0,1} + D_{t,j}^{k-1,u-1} \right\}, \quad 0 \le i < j \le 8;\; 0 \le k \le 4;\; 0 \le u \le k+1, \tag{10}$$

where i and j are borders of time slots, each slot consisting of two samples, k is the number of time borders, and u is the number of high resolution envelopes. The notation $D_{i,j}^{k,u}$ denotes the optimal DSR from i to j with k time borders and u high resolution envelopes. According to [9], ten bit-consuming stages are defined in the dynamic programming method. Each stage corresponds to a different number of time borders and high resolution envelopes in one SBR frame. The scenarios of the ten stages are described in Table 2. Figure 14 illustrates the optimal partition from i to j with 3 time borders and 2 high resolution envelopes.

Table 2: The scenarios of the ten bit-consuming stages
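The recursion in (10) can be sketched in Python with memoization. This is only an illustration of the recurrence, not the implementation of [9]: the callable `cost(i, j, high_res)`, which stands for the DSR of a single envelope from slot i to slot j, and the toy cost used below are hypothetical.

```python
import math
from functools import lru_cache


def make_dp(cost):
    """Return d(i, j, k, u): the best summed cost on [i, j] with k time
    borders and u high resolution envelopes, following recursion (10)."""

    @lru_cache(maxsize=None)
    def d(i, j, k, u):
        # Infeasible: u outside 0..k+1, or too few slots for k inner borders.
        if u < 0 or u > k + 1 or j - i < k + 1:
            return math.inf
        if k == 0:
            # A single envelope, at low (u == 0) or high (u == 1) resolution.
            return cost(i, j, u == 1)
        best = math.inf
        for t in range(i + 1, j):  # position of the first inner border
            low_first = cost(i, t, False) + d(t, j, k - 1, u)
            high_first = cost(i, t, True) + d(t, j, k - 1, u - 1)
            best = min(best, low_first, high_first)
        return best

    return d


def toy_cost(i, j, high_res):
    # Toy stand-in for a measured DSR: long envelopes cost more,
    # and high resolution halves the error.
    return (j - i) ** 2 * (0.5 if high_res else 1.0)


d = make_dp(toy_cost)
for u in range(3):
    print(u, d(0, 8, 1, u))
```

Each additional border or high resolution envelope lowers the achievable cost but consumes more bits, which is the tradeoff the ten bit-consuming stages enumerate.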

Figure 14: An example of the optimal partition $D_{i,j}^{3,2}$ from i to j time unit with 3 time borders and 2 high resolution envelopes (axes: frequency versus time)

Figure 15 is the flowchart of the dynamic programming method for searching the optimal T/F grid. The loop considers all possible resolution grids. Within the loop, an objective function determines the optimal T/F grid in the same bit-consuming stage, and a separate efficiency check decides whether to switch to a different stage. The dynamic programming method searches for the optimal grid from the lower bit-consuming stages to the higher bit-consuming stages. Because the stages have different bit requirements, the grid decision across stages must consider the tradeoff between bits and quality.

Figure 15: Flowchart of the DP method proposed in [9] (boxes: Begin; Stage > 9?; Find optimal T/F grid; Check efficiency; Record efficient grid; Increase stage; End)

3.2.2 Design of T/F Grid by Dynamic Programming in Coupling Mode

Because the T/F grid is shared, the two criteria used in the above DP method must be modified to consider the content of both channels simultaneously in the coupling mode. The first criterion is the objective function that measures the grids in the DP search. In the normal mode, the objective function is the DSR value described above. To consider the two DSR values from the L/R channels in the coupling mode, the objective function is modified as

$$O(G) = \mathrm{Max}(DSR_0, DSR_1), \tag{11}$$

where $DSR_0$ and $DSR_1$ are the DSR values of the left and right channel, respectively. To ensure the quality of the worse channel, this criterion makes a conservative choice of the resultant grid. The optimal grid is the one that minimizes the objective function (11).

The second criterion is the iteration criterion of the DP method. In the normal mode, it involves the improvement of DSR in the current resolution. If the improvement of DSR exceeds a threshold that depends on the bit rate, the method moves to the higher resolution T/F grid to improve the quality. The improvement is defined as

$$\Delta = DSR' - DSR, \tag{12}$$

where $DSR'$ is the optimal DSR for the preceding bit-consuming stage. Similarly, in the coupling mode, there are two improvements of DSR, defined as

$$\Delta_0 = DSR_0' - DSR_0, \qquad \Delta_1 = DSR_1' - DSR_1. \tag{13}$$

The modified iteration criterion requires both of the following conditions to be satisfied:

$$\Delta_{max} = \max(\Delta_0, \Delta_1) > \Phi_1, \tag{14}$$

$$\Delta_{min} = \min(\Delta_0, \Delta_1) > \Phi_2, \tag{15}$$

where $\Phi_1$ and $\Phi_2$ are the thresholds of the iteration criterion. The modified criterion ensures that the DSR improvements of both channels exceed a lower bound, and that at least one of the DSR improvements is large enough to show the efficiency of the new stage.
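The two coupling-mode criteria, (11) and (13) to (15), can be written compactly in Python. The function names and the threshold values in the usage lines are hypothetical; the text only states that the thresholds depend on the bit rate.

```python
def coupling_objective(dsr_left, dsr_right):
    """Objective (11): a shared grid is judged by its worse channel."""
    return max(dsr_left, dsr_right)


def accept_next_stage(prev_dsr, cur_dsr, phi1, phi2):
    """Iteration criterion (14) and (15).

    prev_dsr, cur_dsr : (DSR_0, DSR_1) of the preceding and current stage
    phi1, phi2        : thresholds for the larger and smaller improvement
    """
    delta0 = prev_dsr[0] - cur_dsr[0]  # equation (13), left channel
    delta1 = prev_dsr[1] - cur_dsr[1]  # equation (13), right channel
    return max(delta0, delta1) > phi1 and min(delta0, delta1) > phi2


# The left channel improves a lot, the right channel only slightly.
print(coupling_objective(0.5, 0.7))
print(accept_next_stage((1.0, 0.8), (0.5, 0.7), phi1=0.3, phi2=0.05))
print(accept_next_stage((1.0, 0.8), (0.5, 0.7), phi1=0.3, phi2=0.2))
```

Taking the maximum in the objective and requiring a minimum improvement in both channels are both conservative choices: a finer shared grid is paid for by both channels, so it is accepted only if neither channel is left behind.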

Table 3: Comparison of the grid criterion in the normal/coupling mode

3.3 Decision of Shared Inverse Filtering Intensity

In SBR, inverse filtering is adopted to eliminate excess tones in the low bands so that they fit the tonality of the high bands. The chirp factors, which are the parameters that control the intensity level of the inverse filtering, are assigned separately to the L and R channels in the normal mode. According to the standard [2], however, only a single chirp factor can be used in the coupling mode. When the tonality of the two channels differs substantially, artifacts caused by unsuitable inverse filtering under this constraint can be anticipated. Hence, the risk of applying the same inverse filtering intensity to both stereo channels in the coupling mode should be considered.

3.3.1 Decision of Inverse Filtering Intensity in Non-coupling Mode

In our other work [11][12], a method that compensates with additional tone and noise to maintain the dB-difference between the tone and noise components has been proposed. In the non-coupling mode, based on that method, the inverse filtering level is selected according to the tonality of the high bands and the low bands. According to the syntax, each noise band has its own inverse filtering mode. Furthermore, a noise band may include several high-resolution grids. Therefore, the tonality of the noise band is defined in [11][12] as

\[ \tau_{NB} = \max_{i \in NB} \frac{T_i}{N_i}, \tag{16} \]

where $T_i$ is the energy of the tone in the $i$-th high-resolution grid of the noise band, $N_i$ is the energy of the noise floor in the $i$-th high-resolution grid of the noise band, and $NB$ is the noise band in the SBR frame. From (16), the maximum tonality among the high-resolution grids stands for the tonality of the noise band. The goal of the inverse filtering mode is to make the tonality of the replicated low-frequency band approach the tonality of the original high-frequency band, that is,

\[ \hat{\tau}^{\,l}_{NG} = \tau^{h}_{NG}, \tag{17} \]

where $\hat{\tau}^{\,l}_{NG}$ is the tonality of the replicated low-frequency band and $\tau^{h}_{NG}$ is the tonality of the original high-frequency band. The optimal chirp factor can be evaluated from

\[ (1 - \alpha)\,\frac{T_l}{N_l} = \frac{T_h}{N_h}, \tag{18} \]

where $\alpha$ is the chirp factor. The practical chirp factor can then be looked up in Table 1 to approximate the optimal one.

3.3.2 Decision of Inverse Filtering Intensity in Coupling Mode

In the coupling mode, however, there is only one inverse filtering mode for the two channels. To measure the influence on the resulting tonalities, the distortion function of the shared chirp factor is defined as

\[ f(x) = \left(x\,\tau^{0}_{l} - \tau^{0}_{h}\right)^{2} + \left(x\,\tau^{1}_{l} - \tau^{1}_{h}\right)^{2}, \tag{19} \]

where $x = 1 - \alpha$, $\tau^{0}_{l}$ and $\tau^{1}_{l}$ are the low-frequency tonalities of the left and right channel, respectively, and $\tau^{0}_{h}$ and $\tau^{1}_{h}$ are the high-frequency tonalities of the left and right channel, respectively. Hence, the optimal chirp factor shared by the two channels in the coupling mode is the one that minimizes the distortion function (19). The optimal solution can be calculated by setting the first derivative of the distortion function to zero,

\[ \tau^{0}_{l}\left(x\,\tau^{0}_{l} - \tau^{0}_{h}\right) + \tau^{1}_{l}\left(x\,\tau^{1}_{l} - \tau^{1}_{h}\right) = 0. \tag{20} \]

From (20), the optimal chirp factor in the coupling mode can be evaluated from

\[ x = 1 - \alpha = \frac{\tau^{0}_{l}\tau^{0}_{h} + \tau^{1}_{l}\tau^{1}_{h}}{\left(\tau^{0}_{l}\right)^{2} + \left(\tau^{1}_{l}\right)^{2}}. \tag{21} \]

There is an exception when the original signal is tone-rich. The inverse filtering mechanism must be turned off for a tone-rich frame, because a tone-rich signal must not be processed if its structure is to remain intact. Figure 16 illustrates the inverse filtering mode decision in the coupling mode.

Figure 16: Decision flowchart of inverse filtering mode (nodes: Tone-rich signal detection; if yes, inverse filtering mode is zero; otherwise calculate tonality of noise grid; calculate optimal chirp factor; find corresponding inverse filter mode)

3.4 Decision of Noise Floor Scalefactor

The quantized noise floor scale factor q, which is the quantized value of the noise floor factor Q, is the parameter that adjusts the reconstructed tone-noise content. However, according to the syntax constraint, the difference between the q values of the two channels cannot be larger than twelve in the coupling mode. Because of this restriction, the optimal q values in the coupling mode may differ from the ones in the normal mode. In this section, a modified decision method for the coupling mode is proposed to determine the optimal quantized noise floor scale factor.

In our other work [11][12], the method proposed for the normal mode decides the optimal q so as to minimize the distortion of the dB-differences among all the high-resolution grids contained in a single noise grid. There is only one q value shared in a single noise grid, which includes several high-resolution grids. The distortion function is defined as

\[ D(Q) = \sum_{k \in ND} \left(\delta_k - \hat{\delta}^{ND}_k(Q)\right)^{2} + \sum_{k \in TD} \left(\delta_k - \hat{\delta}^{TD}_k(Q)\right)^{2}, \tag{22} \]

where $\delta_k$ is the ideal tone/noise dB-difference of the original signal in the $k$-th high-resolution grid, and $\hat{\delta}^{ND}_k(Q)$ and $\hat{\delta}^{TD}_k(Q)$ are the tone/noise dB-differences resulting from q if the $k$-th high-resolution grid is a noise-demand grid or a tone-demand grid, respectively. Hence, the optimal q value must be chosen to minimize the distortion function:

\[ Q^{*} = \arg\min_{Q} D(Q). \tag{23} \]

In the coupling mode, before the noise floor factors Q are quantized, they should first be converted from the L/R mode into the A/R mode to eliminate signal correlation. As mentioned above, in the coupling mode the ratio channel of the quantized noise floor is restricted to

\[ q_{ratio} \in [0, 24], \tag{24} \]

and the quantization formula of $q_{ratio}$ is defined as

\[ q_{ratio}(k, l) = \mathrm{INT}\!\left(\log_2 \frac{Q_{left}(k, l)}{Q_{right}(k, l)} + 0.5\right) + 12, \tag{25} \]

where $Q_{left}$ and $Q_{right}$ are the noise floors of the left and right channel, respectively, $k$ is the index of the frequency table band in SBR, and $l$ is the index of the envelope. On the other hand, the dequantization formula of the q value in the L/R mode is defined as

\[ Q = 2^{6 - q}. \tag{26} \]

Substituting (26) into (25) results in

\[ q_{ratio} = \log_2 \frac{2^{6 - q_L}}{2^{6 - q_R}} + 12 = q_R - q_L + 12, \tag{27} \]

where $q_L$ and $q_R$ are the quantized noise scale factors of the left and right channel, respectively. Hence, from (24) and (27), the constraint on the noise floor scale factor pair in the L/R mode is derived as

\[ q_R - q_L \in [-12, 12]. \tag{28} \]

The search range for candidate pairs of quantized L/R noise scale factors is therefore smaller in the coupling mode than in the normal mode. As shown in Figure 17, the range $\kappa$ is the search scope in the normal mode and the contracted range $\chi$ is the search scope in the coupling mode, where

\[ \kappa = \{(q_L, q_R) \in \mathbb{Z}^2 \mid 0 \le q_L, q_R \le 30\}, \tag{29} \]

\[ \chi = \{(q_L, q_R) \in \kappa \mid |q_L - q_R| \le 12\}. \tag{30} \]

Figure 17: The search range of quantized noise floor for different modes

The distortion measure therefore needs to be modified in the coupling mode to consider the distortion of both channels under this inherent constraint. As an extension of (22), the modified distortion function for the coupling mode is defined as

\[ D_C(q_L, q_R) = D_L\big(Q(q_L)\big) + D_R\big(Q(q_R)\big), \qquad (q_L, q_R) \in \chi, \tag{31} \]

where $Q(q)$ is the de-quantization function of q. Furthermore, the optimal $(Q_L, Q_R)$ is decided by choosing the pair $(q_L, q_R)$ that minimizes the distortion function, that is,

\[ (q_L^{*}, q_R^{*}) = \arg\min_{(q_L, q_R) \in \chi} D_C(q_L, q_R). \tag{32} \]

Instead of a brute-force search for the optimal solution, a modified search method is proposed to reduce the time cost. Figure 18 illustrates the proposed noise floor scale factor decision. We expect that, in most cases, the pair formed by the individually optimal quantized noise floor scale factors of the two channels already conforms to the constraint of the coupling mode, so this pair is the first search target. If it satisfies the constraint, it is also the final optimal solution and the search can stop. In summary, after the distortion calculation we find the optimal scale factor of each channel and check the difference between the two channels. If the individually optimal scale factors cannot conform to the constraint, the remaining candidate pairs are searched for the optimal solution of the coupling distortion function.
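The two-step search described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the thesis code: the per-channel distortions are assumed to be precomputed into two lists indexed by q (0 to 30), and the function name and the quadratic toy distortions are hypothetical.

```python
def search_noise_floor_pair(dist_left, dist_right, max_diff=12):
    """Pick (q_L, q_R) minimising dist_left[q_L] + dist_right[q_R]
    subject to |q_L - q_R| <= max_diff.

    dist_left[q] and dist_right[q] hold the per-channel distortion for
    each candidate q (assumed to be computed beforehand).
    """
    # Step 1: the individually optimal scale factors of each channel.
    best_l = min(range(len(dist_left)), key=lambda q: dist_left[q])
    best_r = min(range(len(dist_right)), key=lambda q: dist_right[q])

    # Step 2: the cost is separable, so if this pair satisfies the
    # constraint it is also the constrained optimum and we can stop.
    if abs(best_l - best_r) <= max_diff:
        return best_l, best_r

    # Step 3: otherwise search the remaining pairs inside the range chi.
    best_pair, best_cost = None, float("inf")
    for q_l in range(len(dist_left)):
        lo = max(0, q_l - max_diff)
        hi = min(len(dist_right) - 1, q_l + max_diff)
        for q_r in range(lo, hi + 1):
            cost = dist_left[q_l] + dist_right[q_r]
            if cost < best_cost:
                best_pair, best_cost = (q_l, q_r), cost
    return best_pair


# Toy distortions (hypothetical): the channels prefer q = 2 and q = 28,
# which violates the constraint, so the fallback search is used.
toy_left = [(q - 2) ** 2 for q in range(31)]
toy_right = [(q - 28) ** 2 for q in range(31)]
constrained_pair = search_noise_floor_pair(toy_left, toy_right)

# When the individual optima are close enough, step 2 returns at once.
toy_right_near = [(q - 10) ** 2 for q in range(31)]
direct_pair = search_noise_floor_pair(toy_left, toy_right_near)
```

The early exit is safe because the coupled distortion is a sum of one term per channel, so the unconstrained minimum cannot be beaten by any pair inside the constrained range.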

Figure 18: Flowchart of the proposed search method for the noise floor scale factor decision (nodes: Calculate D(q) for all q values; Find Min_q_L and Min_q_R; test |Min_q_L - Min_q_R| > 12; set q_L = Min_q_L and q_R = Min_q_R; find the pair (q_R, q_L) in χ with minimum D_C(q_R, q_L))

3.5 Coupling Switch Method

Human hearing is relatively more sensitive to low frequencies than to high frequencies. To improve the quality of the LF component, the basic objective of coupling is to save as many of the bits consumed by the HF part as possible while still meeting the minimum quality requirement. Unlike intensity coding or M/S coding, the main risk of coupling coding comes from a less accurate envelope of the reconstructed high bands, not from the loss of fine structure. Because of the robustness of SBR, coupling coding can in general save a large number of bits and improve the overall quality at very small risk. However, with respect to both the degree of data correlation and the variation of the quantization error, a large energy difference between the left and right channels may degrade the coding gain and increase the reconstruction error. Based on these two points, a coupling switch decision method is proposed to trade off HF quality against the demanded bits. Taking into account the above risks and the lowest bit consumption, the optimal representation domain of the signal, either the L/R or the A/R mode, is chosen.

3.5.1 Quantization error analysis

In the following, the variation of the quantization error in the different modes is analyzed. The spectral energy is quantized into the scale factor, with the time-frequency grids of the current frame as the recording units. The normal mode and the coupling mode use different quantization methods. The quantization formula in the normal mode is defined as

\[ E_Q(k, l) = \mathrm{INT}\!\left(a \cdot \max\!\left(\log_2 \frac{E(k, l)}{64},\, 0\right) + 0.5\right), \qquad 0 \le l < L, \tag{33} \]

where

\[ a = \begin{cases} 2, & \text{if } bs\_amp\_res = 0 \\ 1, & \text{if } bs\_amp\_res = 1, \end{cases} \]

$k$ is the index of the frequency table band in SBR, $l$ is the index of the envelope, $L$ is the number of envelopes in the current SBR frame, and $E(k, l)$ is the energy of the input signal in the normal mode and the average energy of the two stereo channels in the coupling mode. Furthermore, the right channel quantization formula in the coupling mode is defined as

\[ E_{QRight}(k, l) = \mathrm{INT}\!\left(a \cdot \log_2 E(k, l) + 0.5\right) + panOffset(bs\_amp\_res), \tag{34} \]

where $E(k, l)$ here is the energy ratio between the stereo channels. On the other hand, the de-quantization method for the normal mode in the HE-AAC decoder is defined as

\[ E_{Orig}(k, l) = 64 \cdot 2^{E(k, l)/a}, \qquad 0 \le l < L,\; 0 \le k < n(r(l)), \tag{35} \]

where $E(k, l)$ is the decoded envelope scale factor. With the de-A/R operation incorporated, the de-quantization method in the coupling mode is defined as

\[ E_{LeftOrig}(k, l) = 64 \cdot \frac{2^{E_0(k, l)/a + 1}}{1 + 2^{\left(panOffset(bs\_amp\_res) - E_1(k, l)\right)/a}}, \qquad 0 \le l < L,\; 0 \le k < n(r(l)), \tag{36} \]

\[ E_{RightOrig}(k, l) = 64 \cdot \frac{2^{E_0(k, l)/a + 1}}{1 + 2^{\left(E_1(k, l) - panOffset(bs\_amp\_res)\right)/a}}, \qquad 0 \le l < L,\; 0 \le k < n(r(l)), \tag{37} \]

where $E_0$ and $E_1$ represent the decoded A/R envelope scale factors.

The reconstruction error can be estimated from the quantization formulas above. From (33) and (35), taking $a = 1$, the reconstruction value in the normal mode can be calculated as

\[ E' = 64 \cdot 2^{\mathrm{int}\left(\max\left(\log_2 \frac{E}{64},\, 0\right) + 0.5\right)}, \tag{38} \]

where $E$ is the energy of the original signal. From (38), if $E$ is smaller than 64, $E'$ is always 64. If $E$ is greater than 64, $E'$ is calculated as

\[ E' = 64 \cdot 2^{\mathrm{int}\left(\log_2 \frac{E}{64} + 0.5\right)}. \tag{39} \]

From (39), the relative error in the normal mode can be evaluated from

\[ \frac{E'}{E} = 2^{\varepsilon}, \tag{40} \]

where $\varepsilon$ lies in the range from -0.5 to 0.5. If $\varepsilon$ is zero, there is no reconstruction error.

From (33), (34), (36), and (37), the reconstruction values of the two channels in the coupling mode are derived as

\[ E'_L = 64 \cdot \frac{2^{\mathrm{int}\left(\log_2 \frac{E_L + E_R}{128} + 0.5\right) + 1}}{1 + 2^{-\mathrm{int}\left(\log_2 \frac{E_L}{E_R} + 0.5\right)}}, \tag{41} \]

\[ E'_R = 64 \cdot \frac{2^{\mathrm{int}\left(\log_2 \frac{E_L + E_R}{128} + 0.5\right) + 1}}{1 + 2^{\mathrm{int}\left(\log_2 \frac{E_L}{E_R} + 0.5\right)}}, \tag{42} \]

where $E_L$ and $E_R$ are the energies of the original signal. The ratios of the reconstructed and original energies in the coupling mode can be evaluated from (41) and (42), that is,

\[ \frac{E'_L}{E_L} = \frac{\Psi\, 2^{\varepsilon_1 + \varepsilon_2} + 2^{\varepsilon_1 + \varepsilon_2}}{\Psi\, 2^{\varepsilon_2} + 1}, \tag{43} \]

\[ \frac{E'_R}{E_R} = \frac{\Psi\, 2^{\varepsilon_1} + 2^{\varepsilon_1}}{1 + \Psi\, 2^{\varepsilon_2}}, \tag{44} \]

where $\Psi = E_L / E_R$, $\varepsilon_1 \in [-0.5, 0.5]$, and $\varepsilon_2 \in [-1.5, 0.5]$. The constant $\varepsilon_1$ is the quantization error of the energy quantization process, and the constant $\varepsilon_2$ is the reconstruction error that results from the quantization process and the DPCM operation in SBR.

3.5.2 Energy Abnormal Phenomenon

From (43) and (44), if $\Psi$ is very large, the reconstruction error approaches the following limits:

\[ \lim_{\Psi \to \infty} \frac{E'_L}{E_L} = 2^{\varepsilon_1}, \tag{45} \]

\[ \lim_{\Psi \to \infty} \frac{E'_R}{E_R} = 2^{\varepsilon_1 - \varepsilon_2}. \tag{46} \]

On the other hand, if $\Psi$ is very small, the reconstruction error approaches the following limits:

\[ \lim_{\Psi \to 0} \frac{E'_L}{E_L} = 2^{\varepsilon_1 + \varepsilon_2}, \tag{47} \]

\[ \lim_{\Psi \to 0} \frac{E'_R}{E_R} = 2^{\varepsilon_1}. \tag{48} \]

The closer the values $E'_L / E_L$ and $E'_R / E_R$ are to one, the closer the relative error is to zero. However, from (40), (46), and (47), the two exponents $\varepsilon_1 - \varepsilon_2$ and $\varepsilon_1 + \varepsilon_2$ occurring in the coupling mode may be much larger in magnitude than the exponent $\varepsilon$ in the normal mode. This is because $\varepsilon$ lies in the range from -0.5 to 0.5, whereas $\varepsilon_1 - \varepsilon_2$ and $\varepsilon_1 + \varepsilon_2$ are spread over wider ranges, namely $\varepsilon_1 - \varepsilon_2 \in [-1, 2]$ and $\varepsilon_1 + \varepsilon_2 \in [-2, 1]$. This implies that the reconstruction error of the weaker channel may become abnormally large when the energy difference between the two channels is large.

For example, Figure 19 shows the spectrum of a test stereo signal that illustrates this error augmentation phenomenon. The test track has a high energy difference in HF between the L and R channels. Figure 20 compares the reconstructed signals of the coupling mode and the normal mode; the energy difference between them is about 3 dB. Figure 21 indicates that, for the stronger channel, the reconstructed signals of the coupling mode and the normal mode are closer to each other.
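To make the limiting behaviour concrete, the ratios (43) and (44) can be evaluated directly. The short Python sketch below is only an illustration: the function names are hypothetical, and the chosen error pair (0.5, -1.5) is simply the corner of the stated ranges where the weaker right channel is hit hardest.

```python
def ratio_left(psi, eps1, eps2):
    """Reconstructed/original energy ratio of the left channel, as in (43)."""
    return (psi + 1.0) * 2.0 ** (eps1 + eps2) / (psi * 2.0 ** eps2 + 1.0)


def ratio_right(psi, eps1, eps2):
    """Reconstructed/original energy ratio of the right channel, as in (44)."""
    return (psi + 1.0) * 2.0 ** eps1 / (1.0 + psi * 2.0 ** eps2)


# Corner of the error ranges: eps1 = 0.5, eps2 = -1.5.
EPS1, EPS2 = 0.5, -1.5

# Sweep psi = E_L / E_R from "right channel dominates" to "left dominates".
sweep = [
    (psi, ratio_left(psi, EPS1, EPS2), ratio_right(psi, EPS1, EPS2))
    for psi in (1e-9, 1.0, 8.0, 1e9)
]

for psi, left, right in sweep:
    print(f"psi={psi:g}  E'_L/E_L={left:.4f}  E'_R/E_R={right:.4f}")
```

For very large psi the right-channel ratio tends to 2 raised to (eps1 - eps2), which is 4 at this corner, while the left-channel ratio stays at 2 raised to eps1, about 1.41. This is the asymmetry between the weaker and the stronger channel described above.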

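To limit the reconstruction error to a tolerable range, a reasonable range of $\Psi$ within which the coupling mode is allowed must be found. Figure 22 and Figure 23 illustrate the relationship between $\Psi$ and the mean of the relative error in the right channel. The mean of the relative error is calculated as

\[ \mathrm{mean}(\Psi) = \frac{1}{2} \int_{-1.5}^{0.5} \int_{-0.5}^{0.5} \left( \frac{\Psi\, 2^{\varepsilon_1} + 2^{\varepsilon_1}}{1 + \Psi\, 2^{\varepsilon_2}} - 1 \right) d\varepsilon_1\, d\varepsilon_2. \tag{49} \]

The mean of the relative error rapidly approaches its upper bound when $\Psi$ exceeds 8, and the coupling mechanism should then be turned off to avoid the annoying error augmentation phenomenon. Figure 24 and Figure 25 show the relationship between $\Psi$ and the variance of the relative error in the right channel. The variance of the relative error is calculated as

\[ \mathrm{variance}(\Psi) = \frac{1}{2} \int_{-1.5}^{0.5} \int_{-0.5}^{0.5} \left( \frac{\Psi\, 2^{\varepsilon_1} + 2^{\varepsilon_1}}{1 + \Psi\, 2^{\varepsilon_2}} - 1 - \mathrm{mean}(\Psi) \right)^{2} d\varepsilon_1\, d\varepsilon_2. \tag{50} \]

The variance of the relative error also rapidly approaches its upper bound when $\Psi$ exceeds 8. There is a trade-off between the reconstruction error and the saved bits, and hence a switch threshold is required to avoid extreme reconstruction errors.

Similarly, Figure 26 and Figure 27 illustrate the relationship between $\Psi$ and the mean of the relative error in the left channel, and Figure 28 and Figure 29 show the relationship between $\Psi$ and the variance of the relative error in the left channel. For the left channel, the mean and variance of the relative error are calculated as

\[ \mathrm{mean}(\Psi) = \frac{1}{2} \int_{-1.5}^{0.5} \int_{-0.5}^{0.5} \left( \frac{\Psi\, 2^{\varepsilon_1 + \varepsilon_2} + 2^{\varepsilon_1 + \varepsilon_2}}{\Psi\, 2^{\varepsilon_2} + 1} - 1 \right) d\varepsilon_1\, d\varepsilon_2 \tag{51} \]

and

\[ \mathrm{variance}(\Psi) = \frac{1}{2} \int_{-1.5}^{0.5} \int_{-0.5}^{0.5} \left( \frac{\Psi\, 2^{\varepsilon_1 + \varepsilon_2} + 2^{\varepsilon_1 + \varepsilon_2}}{\Psi\, 2^{\varepsilon_2} + 1} - 1 - \mathrm{mean}(\Psi) \right)^{2} d\varepsilon_1\, d\varepsilon_2, \tag{52} \]

respectively. According to the mean and variance of the relative error, the switch threshold is set to 8 to avoid a high relative error.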
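The mean and variance of the relative error can be approximated numerically. The following Python sketch makes several assumptions that should be read as such: the two errors are treated as independent and uniformly distributed over [-0.5, 0.5] and [-1.5, 0.5], the relative error is taken as the signed quantity "ratio minus one", and the integrals are approximated with a midpoint rule. The function names and the step count are hypothetical.

```python
import math


def energy_ratio(psi, eps1, eps2, channel):
    """Reconstructed/original energy ratio in the coupling mode."""
    if channel == "left":
        return (psi + 1.0) * 2.0 ** (eps1 + eps2) / (psi * 2.0 ** eps2 + 1.0)
    return (psi + 1.0) * 2.0 ** eps1 / (1.0 + psi * 2.0 ** eps2)


def mean_and_variance(psi, channel="right", steps=100):
    """Midpoint-rule estimate of the mean and variance of (ratio - 1).

    eps1 is sampled on [-0.5, 0.5] and eps2 on [-1.5, 0.5]; averaging
    over equally sized cells corresponds to a uniform density of 1/2.
    """
    h1 = 1.0 / steps
    h2 = 2.0 / steps
    errors = []
    for i in range(steps):
        eps1 = -0.5 + (i + 0.5) * h1
        for j in range(steps):
            eps2 = -1.5 + (j + 0.5) * h2
            errors.append(energy_ratio(psi, eps1, eps2, channel) - 1.0)
    mean = sum(errors) / len(errors)
    variance = sum((e - mean) ** 2 for e in errors) / len(errors)
    return mean, variance


# Right-channel mean error for a few energy ratios psi = E_L / E_R.
means_right = {psi: mean_and_variance(psi, "right")[0] for psi in (1.0, 8.0, 1e9)}

# Closed-form limit for psi -> infinity: E[2^eps1] * E[2^-eps2] - 1.
limit_right = (
    (2.0 ** 0.5 - 2.0 ** -0.5) / math.log(2.0)
    * 0.5 * (2.0 ** 1.5 - 2.0 ** -0.5) / math.log(2.0)
    - 1.0
)
```

Under these assumptions the right-channel mean grows with psi and saturates at the closed-form limit, which is the qualitative behaviour the switch threshold is meant to guard against.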

Figure 19: The test track has a high energy difference in high frequency

Figure 20: The high reconstruction error in the weaker channel

Figure 21: The low reconstruction error in the stronger channel

Figure 22: The relationship between the mean of the relative error and Ψ when Ψ is more than one in the right channel

Figure 23: The relationship between the mean of the relative error and Ψ when Ψ is less than one in the right channel

Figure 24: The relationship between the variance of the relative error and Ψ when Ψ is more than one in the right channel

Figure 25: The relationship between the variance of the relative error and Ψ when Ψ is less than one in the right channel

Figure 26: The relationship between the mean of the relative error and Ψ when Ψ is more than one in the left channel

Figure 27: The relationship between the mean of the relative error and Ψ when Ψ is less than one in the left channel

Figure 28: The relationship between the variance of the relative error and Ψ when Ψ is more than one in the left channel

Figure 29: The relationship between the variance of the relative error and Ψ when Ψ is less than one in the left channel

3.5.3 Summary

The criterion of the proposed coupling switch method therefore focuses on the average energy difference in the SBR range to avoid the abnormal phenomenon. The average energy difference is calculated from the energy ratio taken per critical unit. Because the number of samples in a critical band at high frequency differs from the number at low frequency, the energy ratio must be normalized. Equation (53) defines the average difference Diff from the logarithmic energies $\log \sum_j E^{c}_{l,j}$ and $\log \sum_j E^{c}_{r,j}$ of the left and right channels over the critical units, where $c$ is the critical unit used in Section 3.2, $E^{c}_{i,j}$ is the $j$-th sample energy of critical unit $c$ of channel $i$, and $N_c$ is the number of samples of the critical unit. Figure 30 illustrates the proposed coupling switch method, which detects a large average energy difference.

Figure 30: Coupling switch flowchart (nodes: Calculate average energy difference; if average energy difference > threshold, use normal mode; otherwise use coupling mode)
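A rough sketch of such a switch is given below. It is not the formula (53) of this thesis: the code assumes that each critical unit's energy is normalized by its sample count, that the per-unit difference is the absolute base-2 logarithm of the left/right energy ratio, that the differences are averaged over the units, and that the threshold is log2(8) = 3 in line with the switch value of 8 chosen for the energy ratio. All names are hypothetical.

```python
import math

# Assumed threshold: log2(8), mirroring the energy-ratio limit of 8.
DEFAULT_THRESHOLD = 3.0

# Small floor that keeps the logarithm defined for silent units.
ENERGY_FLOOR = 1e-12


def average_energy_difference(left_units, right_units):
    """Average absolute log2 energy ratio over the critical units.

    left_units and right_units are lists of critical units; each unit
    is a list of per-sample energies (lengths may differ between units,
    which is why each unit is normalised by its own sample count).
    """
    diffs = []
    for left, right in zip(left_units, right_units):
        mean_left = max(sum(left) / len(left), ENERGY_FLOOR)
        mean_right = max(sum(right) / len(right), ENERGY_FLOOR)
        diffs.append(abs(math.log2(mean_left) - math.log2(mean_right)))
    return sum(diffs) / len(diffs)


def use_coupling(left_units, right_units, threshold=DEFAULT_THRESHOLD):
    """Return True for the coupling mode, False for the normal mode."""
    return average_energy_difference(left_units, right_units) <= threshold


# Similar channels: coupling is kept.
similar = use_coupling([[4.0, 4.0], [2.0, 2.0, 2.0]], [[5.0, 3.0], [2.0, 1.0, 3.0]])

# Left channel 100 times stronger: the switch falls back to normal mode.
unbalanced = use_coupling([[400.0, 400.0], [200.0] * 3], [[4.0, 4.0], [2.0] * 3])
```

The decision direction follows Figure 30: a difference above the threshold selects the normal mode, and anything else keeps the coupling mode.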

Chapter 4 Experiments

In this chapter, the quality measurement is conducted on the NCTU_HEAAC platform. Extensive experiments are performed on the MPEG test tracks and on the music database [13] collected in PSPLAB to verify the enhancement provided by the proposed methods.

4.1 Experiment Environment

Objective Quality Measurement Tool: The tool called EAQUAL [14] is chosen to measure the audio quality in the objective test. EAQUAL stands for Evaluation of Audio Quality. Its purpose is to supply objective audio quality measurements for coded/decoded audio signals, which is especially useful for audio codec development. The implementation of EAQUAL is based on ITU-R Recommendation BS.1387 [15].

Subjective Quality Measurement Tool: In the subjective quality test, we use MUSHRA [16] to assist the assessment. The multi-stimulus test with hidden anchors and reference has been designed to give a reliable and repeatable measure of the audio quality of intermediate-quality signals. MUSHRA has the advantage that it provides an absolute measure of the audio quality of the codec, which can be compared directly with the reference. MUSHRA follows the test method and impairment scale recommended by ITU-R BS.1116 [17].

4.2 Objective Quality Measurement in MPEG Test Tracks

The twelve MPEG tracks include critical music covering percussion, string, and wind instruments, and human vocals. The features of these twelve tracks are shown in Table 4. This section verifies the quality enhancement of the proposed methods at different bit rates based on the MPEG test tracks.

Table 4: The twelve tracks recommended by MPEG

Track   Signal   Description                   Mode     Time (sec)   Remark
1       es01     Vocal (Suzanne Vega)          stereo   10           (c)
2       es02     German speech                 stereo   8            (c)
3       es03     English speech                stereo   7            (c)
4       sc01     Trumpet solo and orchestra    stereo   10           (b) (d)
5       sc02     Orchestral piece              stereo   12           (d)
6       sc03     Contemporary pop music        stereo   11           (d)
7       si01     Harpsichord                   stereo   7            (b)
8       si02     Castanets                     stereo   7            (a)
9       si03     Pitch pipe                    stereo   7            (b)
10      sm01     Bagpipes                      stereo   11           (b)
11      sm02     Glockenspiel                  stereo   10           (a) (b)
12      sm03     Plucked strings               stereo   13           (a) (b)

Remarks:
(a) Transients: pre-echo sensitive, smearing of noise in temporal domain.
(b) Tonal/Harmonic structure: noise sensitive, roughness.
(c) Natural vocal (critical combination of tonal parts and attacks): distortion sensitive, smearing of attacks.
(d) Complex sound: stresses the device under test.

Table 5: Objective measurements through the ODGs for the proposed coupling approach at 80 kbps

Codec: NCTU-HEAAC    Bit rate: 80 kbps

Track     M0        M1
es01      -0.68     -0.67
es02      -0.58     -0.56
es03      -0.68     -0.64
sc01      -0.95     -0.93
sc02      -1.08     -1.05
sc03      -1.1      -1.09
si01      -1.56     -1.55
si02      -1.0      -1.0
si03      -1.6      -1.6
sm01      -1.56     -1.5
sm02      -1.55     -1.51
sm03      -1.29     -1.27
Max       -0.58     -0.56
Min       -1.6      -1.6
Average   -1.139    -1.1175

M0: coupling mode disabled
M1: coupling mode enabled

Figure 31: The variance in the ODGs of the proposed coupling approaches at 80 kbps

Table 6: Objective measurements through the ODGs for the proposed coupling approach at 64 kbps

Codec: NCTU-HEAAC    Bit rate: 64 kbps

Track     M0        M1
es01      -0.9      -0.9
es02      -0.8      -0.77
es03      -0.95     -0.89
sc01      -1.6      -1.54
sc02      -1.66     -1.63
sc03      -1.59     -1.56
si01      -1.89     -1.87
si02      -1.35     -1.3
si03      -2.07     -2.04
sm01      -2.11     -2.06
sm02      -2.19     -2.19
sm03      -1.64     -1.63
Max       -0.8      -0.77
Min       -2.19     -2.19
Average   -1.564    -1.5333

M0: coupling mode disabled
M1: coupling mode enabled

Figure 32: The variance in the ODGs of the proposed coupling approaches at 64 kbps

Table 7: Objective measurements through the ODGs for the proposed coupling approach at 48 kbps

Codec: NCTU-HEAAC    Bit rate: 48 kbps

Track     M0        M1
es01      -1.47     -1.37
es02      -1.34     -1.27
es03      -1.64     -1.51
sc01      -2.36     -2.35
sc02      -2.53     -2.49
sc03      -2.28     -2.2
si01      -2.69     -2.69
si02      -2.25     -2.19
si03      -3.16     -3.11
sm01      -3.15     -3.07
sm02      -3.23     -3.21
sm03      -2.22     -2.2
Max       -1.34     -1.27
Min       -3.23     -3.21
Average   -2.36     -2.3067

M0: coupling mode disabled
M1: coupling mode enabled

Figure 33: The variance in the ODGs of the proposed coupling approaches at 48 kbps

According to the results of the objective test, the encoder with the coupling mode enabled has better quality. In particular, because of the similarity of the L/R channels, the voice tracks es01, es02 and es03 show the most conspicuous improvement among all the test tracks. Moreover, the improvement grows as the bit rate decreases, because the bits saved by coupling coding matter more at lower bit rates.

4.3 Objective Quality Measurement in Music Database

To confirm the possible risk and the robustness of the proposed methods, extensive tests are adopted to verify their quality. The PSPLab audio database [13] contains 15 sets, which are described in Table 8.

Table 8: The PSPLab audio database [13]

No.   Bitstream category     # of tracks   Remark
1     Ff123                  103           Killer bitstream collection from ff123 [18].
2     Gpsycho                24            LAME quality test bitstream [19].
3     HA64KTest              39            64 kbps test bitstream for multi-format in HA forum [20].
4     HA128KTestV2           12            128 kbps test bitstream for multi-format in HA forum [20].
5     Horrible_song          16            Collections of critical songs among all bitstreams in PSPLab.
6     Ingets1                5             Bitstream collection from the test of OGG Vorbis pre 1.0 listening test [21].
7     MPEG                   12            MPEG test bitstream set for 48000 Hz.
8     MPEG44100              12            MPEG test bitstream set for 44100 Hz.
9     Phong                  8             Test bitstream collection from Phong [22].
10    PSPLab                 37            Collections of bitstream from early age of PSPLab. Some are good as killer.
11    Sjeng                  3             Small bitstream collection by sjeng.
12    SQAM                   16            Sound quality assessment material recordings for subjective tests [23].
13    TestingSong14          14            Test bitstream collection from rshong, PSPLab.
14    TonalSignals           15            Artificial bitstream that contains sine wave etc.
15    VORBIS_TESTS_Samples   8             Eight Vorbis testing samples from HA [20].

M0: coupling mode disabled
M1: coupling mode enabled

Figure 34: The average ODGs of method M0 and M1 at 80 kbps in 16 categories

Figure 35: The average ODGs of method M0 and M1 at 64 kbps in 16 categories

Figure 36: The average ODGs of method M0 and M1 at 48 kbps in 16 categories

Figures 34 to 36 illustrate the experiments for the 15 bitstream sets at the different bit rates, where the bars indicate the average ODG of each bitstream set. In most cases, the quality under the coupling mode is better than under the non-coupling mode. The main characteristic of the improved tracks is that the L/R channels are similar in the HF component. On the other hand, two tracks degrade in quality in the coupling mode: impulse_m0_0db and triangle1 in the TonalSignals set. As shown in Figure 37, the degradation of the impulse_m0_0db track occurs because the silence suffers from spreading noise in the coupling mode. Furthermore, the noise-floor correction mechanism [11] misses its detection in the coupling mode for triangle1. Figure 39 and Figure 40 show the spectra reconstructed under the normal mode and the coupling mode, respectively.

Figure 37: The suffered spectrum of the silence in the impulse_m0_0db track

Figure 38: The spectrum of the triangle1 track in the TonalSignals set

Figure 39: Reconstructed spectrogram of the triangle1 track in normal mode

Figure 40: Reconstructed spectrogram of the triangle1 track in coupling mode

4.4 Subjective Quality Measurement

After the objective quality measurement, a subjective listening test is performed to verify the quality improvement and the possible risk of the methods proposed in this thesis. The twelve test tracks are selected from the MPEG test tracks. The subjective listening test is performed on the NCTU-HEAAC codec and uses the MUSHRA tool to assist the assessment. The result of the subjective quality measurement is shown in the following figure.

M0: coupling mode disabled
M1: coupling mode enabled