LOSS CONCEALMENTS FOR LOW-BIT-RATE PACKET VOICE IN VOIP


Benjamin W. Wah
Department of Electrical and Computer Engineering and the Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA
December 15, 2004

Outline
- Voice-over-IP: status and problems
- Loss concealment problem
- Previous work
- IP voice traffic loss characteristics
- Loss concealments for low bit-rate coded speech
- Parameter-based MDC
- Improving MDC quality
- Summary
- Future outlook

Motivations: Application Areas of VoIP (in chronological order)
- Internet telephony
- PC-to-phone services
- Teleconferencing
- Phone-to-phone calling cards
- Telecommunication industry using VoIP
- Consumer broadband telephony
- Hosted IP PBX
- Wi-Fi VoIP
- 4G wireless communications

Motivations: Size of Business
Current status of VoIP
- Accounts for 10% of long-distance phone traffic around the world
- Homes with a broadband network using VoIP: 1% in 2004 (17% projected for 2009)
- Sustainable expansion of market share is predicted (1% per year)
- Large long-distance carriers have started using VoIP because of its cost efficiency
- Competition from small companies offering free or inexpensive VoIP
Business strategy
- Initial strategy: low-cost, low-quality alternative to PSTN
- Current strategy: equivalent quality, inexpensive substitute, additional features
Requirements for VoIP to be mainstream
- VoIP technology must be transparent (and easy to use) to users
- Toll quality, inexpensive
- No PC should be required; IP phones should be inexpensive
- Extra features: image transfers, multicasting, broadcasting

Motivations: Interactive Real-Time Communications: VoIP Speech Quality
- End-to-end delay: due to codec, network, and jitter buffer
  - ITU G.114: one-way delay < 150 ms acceptable, < 400 ms noticeable
  - Mobile loop one-way delay: about 100 ms; Mobile-VoIP-Mobile: about 300 ms
- Acoustic echo: due to PSTN wiring or PC setup
  - Noticeable for delays of more than 30 ms
- Loss: some degradation of voice samples is tolerable
  - Low bandwidth/congestion: due to dial-up connections, other streaming media
  - Long-burst or frequent short-burst losses are intolerable
- Codec
  - Codecs in tandem: code conversions at hosts or gateways, causing degraded quality and increased delay
  - Using a PC as a phone: speaker and microphone not optimal for phone conversation
  - Standard low bit-rate speech codecs: error propagation

Motivations: Voice Codecs

Codec      Kbps      Coding technique
G.711      64        Pulse code modulation (PCM)
G.726      40-16     Adaptive differential PCM (ADPCM)
G.728      16        Low-delay code-excited linear prediction (LD-CELP)
G.729      8         Algebraic code-excited linear prediction (ACELP)
G.723.1    6.3/5.3   Multi-pulse maximum likelihood quantization (MP-MLQ)/ACELP
GSM FR     13        Regular pulse-excited long-term predictor (RPE-LTP)
GSM EFR    12.2      Algebraic code-excited linear prediction (ACELP)

Ref: Reynolds and Rix: Quality VoIP

Motivations: Network Environments: Packet Network
- IPv4: best-effort, no real-time support
  - Packet size: less than MTU to avoid fragmentation
  - Packet rate: 20-30 packets per second
- IPv6: best-effort, may support real-time traffic
- Wireless: future IP-based
- Loss is unavoidable in packet networks

Motivations: Network Environments: Transport-Layer Protocol
- TCP
  - Reliable but not suitable for real-time
  - Connection oriented, more secure
  - Allowed through firewalls
- UDP
  - Lossy and unreliable
  - No congestion control mechanism to slow the flow
  - Not permitted through firewalls
- TCP in real-time mode
  - Provides connection-oriented transmission without congestion avoidance
  - Suitable for current VoIP systems for firewall penetrability
- Loss of real-time voice is not handled at the transport layer

Motivations: Network Environments: Application-Layer Protocol
- H.323: umbrella standard for interoperability
- RTP: no loss recovery scheme
- Loss of real-time voice is not handled
- Packet losses in real-time voice communications are left for end-point applications

Motivations: Solutions for Improving Speech Quality
- Echo cancellation implemented in software for VoIP applications
- Jitter buffer at the receiving end
- Easier access to broadband connections
- Both ends agree on a codec while initializing a VoIP session
- Dedicated IP phones
- Improved codecs with low delay and lower bit-rate requirements
- New speech coding standards developed for IP networks

Problem Addressed: Loss Concealment Problem
- Design, analyze and evaluate robust end-to-end loss-concealment schemes
- Allow reliable and real-time low bit-rate voice transmissions
- Target unreliable IP networks, like the Internet and wireless wide area networks

[Roadmap diagram: Previous Work; Traffic Study; Loss Concealment for Low Bit-Rate Coders (Sample MDC, Parameter MDC, Correlation analysis, LP-based MDC); Improving MDC Quality (LSP reconstruction, Relate LR with LSP reconstruction error, Optimal interpolations, Decoding quality, Noise shaping, Modification)]

Previous Work: Coder-Independent Schemes
- Schemes depending on priority support from the network
  - Different priorities for different frames, e.g., voiced, unvoiced [DaSilva 89]
  - Two-pass coding: the first pass codes the original signals, the second the residue [Yong 92]
- Schemes adding explicit redundancy
  - Send extracted information of a packet in its following packet [Valenzuela 89]
  - Use forward error correction (FEC) [Shacham 89]
- Schemes exploiting inherent redundancy in voice streams
  - Replay, pad by silence or white noise (receiver-only) [Tucker 85]
  - Waveform substitution (receiver-only) [Wasem 88]
  - Sample-based MDC (sender-receiver with no redundancy) [Jayant 81]

Previous Work: Coder-Dependent Schemes
- Schemes depending on priority support from the network
  - LP coder: assign different priorities to parameters [Yong 92]
- Schemes adding explicit redundancy
  - LP coder: FEC for the most sensitive parameters [Atungsiri 93]
  - LP coder: duplicate base information, e.g., LP [Anandakumar 00]
- Schemes exploiting inherent redundancy in voice streams
  - LP coder: single description
    - Parameter reconstruction (receiver-only) [Atungsiri 93]
    - Parameter re-initialization (sender-receiver) [Montminy 00]
  - No existing non-redundant MDC for LP coders

IP Voice Traffic Loss Characteristics

Example connections:

Connection   UIUC-Berkeley   UIUC-W. China   UIUC-Central Europe
Loss rate    low-medium      medium-high     high

Loss behavior:
[Figure: loss rate (0 to 0.5) vs. time of day (hour) for UIUC-Berkeley, UIUC-W. China, and UIUC-C. Europe]
- Loss rate can go up to 50%
[Figure: distribution (0.4 to 1) vs. burst length (1 to 7) for the same three connections]
- Most losses have short burst lengths

Reducing Unrecoverable Loss by Interleaving
- Bursty losses are difficult to handle
- Interleaving: disperse bursty losses to isolated losses
- P(fail | i): probability of losses that cannot be recovered under interleaving factor i
[Figure: three panels of P(fail | i) vs. time of day (hour) for i = 1, 2, 3, 4: UIUC-Berkeley, UIUC-W. China, UIUC-C. Europe]
- A small interleaving factor of 2-4 is enough
- Multiple-description coding is promising
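The effect of the interleaving factor on unrecoverable loss can be illustrated with a small sketch. The slides do not define how P(fail | i) is computed, so the rule used here is an assumption: i consecutive packets carry the i descriptions of one interleaving set, and a set counts as unrecoverable only when all i of its packets are lost. The function name `unrecoverable_fraction` and the loss trace are hypothetical.

```python
def unrecoverable_fraction(lost, i):
    """Fraction of packets falling in interleaving sets whose packets are all lost.

    lost: list of booleans, True if the packet at that position was lost.
    i:    interleaving factor (i = 1 means no interleaving).
    Assumption (not from the slides): a set of i consecutive packets is
    unrecoverable only when every packet in the set is lost.
    """
    complete = len(lost) - len(lost) % i  # ignore a trailing partial set
    if complete == 0:
        return 0.0
    failed = 0
    for start in range(0, complete, i):
        if all(lost[start:start + i]):
            failed += i
    return failed / complete


# Hypothetical trace with isolated losses and short bursts.
trace = [False, True, True, False, False, True,
         False, False, True, True, True, True]

for factor in (1, 2, 3, 4):
    print(factor, unrecoverable_fraction(trace, factor))
```

On real traces the benefit depends on the burst-length distribution, which is why the measured distribution of burst lengths matters for choosing i.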


Testing Coders and Streams

Coders:

Coder              Bit rate (bps)   Quantization of LSP    Excitation
FS CELP            4800             scalar                 stochastic code/adaptive code
ITU G.723.1 (I)    5300             predictive-split VQ    algebraic code/adaptive code
ITU G.723.1 (II)   6300             predictive-split VQ    multi pulse/adaptive code
FS MELP            2400             multi-stage VQ         mixed pulse- and noise-like

Streams:

Index   Length (ms)   Speakers
1       21432         2 male, 1 female
2       22560         2 male, 1 female
3       4424          1 female
4       5091          1 female
5       4160          1 male
6       4082          1 male
7       4867          1 male, 1 female
8       73615         1 male, 1 female

Objective Measures

Itakura-Saito likelihood ratio:

$$ LR = \frac{a_r R_o a_r^T}{a_o R_o a_o^T} $$

- a_r: vector of LP coefficients of reconstructed speech
- a_o: vector of LP coefficients of original speech
- R_o: correlation matrix derived from original speech

Cepstral distance:

$$ CD = 4.34 \left[ (c_{o,0} - c_{r,0})^2 + 2 \sum_{i \ge 1} (c_{o,i} - c_{r,i})^2 \right]^{1/2} \ \text{[dB]} $$

- c_{o,i}: cepstra of original samples
- c_{r,i}: cepstra of reconstructed samples
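The two objective measures can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the evaluation code behind the slides: the LP coefficients are obtained with the autocorrelation method and the Levinson-Durbin recursion, the vectors a_o and a_r include the leading 1 of the prediction-error filter, the cepstral sum is truncated to however many coefficients are passed in, and the test signals are synthetic.

```python
import math
import random


def autocorr(x, order):
    """Autocorrelation lags r[0..order] of one frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson(r, order):
    """Levinson-Durbin: returns [1, a1, ..., ap] of the prediction-error filter."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def quad_form(a, r):
    """a R a^T, with R the Toeplitz matrix built from the lags r."""
    return sum(a[i] * a[j] * r[abs(i - j)]
               for i in range(len(a)) for j in range(len(a)))


def likelihood_ratio(orig, recon, order=10):
    """LR = (a_r R_o a_r^T) / (a_o R_o a_o^T) for one frame."""
    r_o = autocorr(orig, order)
    a_o = levinson(r_o, order)
    a_r = levinson(autocorr(recon, order), order)
    return quad_form(a_r, r_o) / quad_form(a_o, r_o)


def cepstral_distance(c_o, c_r):
    """CD in dB from two cepstral vectors (index 0 first), truncated sum."""
    d0 = (c_o[0] - c_r[0]) ** 2
    rest = sum((o - r) ** 2 for o, r in zip(c_o[1:], c_r[1:]))
    return 4.34 * math.sqrt(d0 + 2.0 * rest)


# Synthetic 240-sample frames (hypothetical data, seeded for repeatability).
rng = random.Random(0)
orig = [math.sin(0.3 * n) + 0.3 * rng.uniform(-1, 1) for n in range(240)]
recon = [0.9 * x + 0.2 * rng.uniform(-1, 1) for x in orig]

print(likelihood_ratio(orig, orig))   # identical frames give a ratio of 1
print(likelihood_ratio(orig, recon))  # never below 1 for a distorted frame
```

Because a_o minimizes the quadratic form over all filters with a leading 1, LR is at least 1, and larger values indicate a larger spectral mismatch.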

Loss concealments for low bit-rate coded speech: Typical Linear Predictive Coder

[Block diagram: select or form excitation, gain, synthesis filter H(w) producing Ŝ(n), linear prediction analysis of S(n), perceptually weighted mean square error fed back to excitation selection]

$$ H(\omega) = \frac{1}{A(\omega)} = \frac{1}{1 - \sum_{k=1}^{10} a_k e^{-j\omega k}} $$

Major techniques:
- Frame-oriented
- Linear prediction analysis, coefficients generally represented by LSP
- Excitations: pitch information and random noise
- Can be open-loop or closed-loop
- FS CELP, ITU G.723.1 ACELP, ITU G.723.1 MP-MLQ, and MELP

Loss concealments for low bit-rate coded speech: Coder-Independent Sample-Based MDC

[Flow diagram: original speech sequence; sample-based interleaving into a frame with even samples and a frame with odd samples; coding and packetization into Description 0 (coded UDP packet) and Description 1 (coded UDP packet, lost); depacketization and decoding yield the frame with even samples and a reconstructed frame with odd samples; deinterleaving; played speech sequence]
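The sample-based interleaving in the flow above can be sketched as follows. The speech coder itself is omitted, and the concealment rule (each missing odd sample is the average of its two even neighbors) is an assumption, since the slides only state that the lost frame is reconstructed. All function names are hypothetical.

```python
def split_descriptions(samples):
    """Sample-based interleaving: even samples -> description 0, odd -> description 1."""
    return samples[0::2], samples[1::2]


def merge(even, odd):
    """Deinterleave two equal-length descriptions back into one frame."""
    out = []
    for e, o in zip(even, odd):
        out.extend((e, o))
    return out


def conceal_odd(even):
    """Estimate the lost odd samples from the received even samples.

    Assumption: average of the two neighboring even samples; the final odd
    sample has no right neighbor, so the last even sample is repeated.
    """
    odd = []
    for k, e in enumerate(even):
        nxt = even[k + 1] if k + 1 < len(even) else e
        odd.append((e + nxt) / 2)
    return odd


frame = list(range(8))
even, odd = split_descriptions(frame)
played = merge(even, conceal_odd(even))  # description 1 lost
print(played)
```

Each description is a half-rate version of the signal, so coding it separately is what exposes the coder to aliasing and to frames that span twice the time.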

Loss concealments for low bit-rate coded speech: Performance of Coder-Independent Sample-Based MDC

[Figure: likelihood ratio vs. audio file index (1 to 8), original CELP vs. sample-based MDC]
[Figure: cepstral distance vs. audio file index (1 to 8), original CELP vs. sample-based MDC]
- Coding quality degrades dramatically
- Drawbacks:
  - Aliasing: caused by down sampling
  - Coding-frame time span lengthened

Loss concealments for low bit-rate coded speech: Coder-Dependent Parameter-Based MDC

[Flow diagram: original speech sequence; modified coding; coded frame; parameter-based interleaving into Description 0 (coded UDP packet with parameter set S1) and Description 1 (coded UDP packet with parameter set S2, lost); depacketization yields the frame with parameter set S1 and a reconstructed frame with parameter set S2; deinterleaving; modified decoding; decoded frame; played speech sequence]

Parameters of linear predictive coders:
- Linear predictor equivalent representations: reflection coefficient (RF), log area ratio (LAR), LSP
- Excitation
MDC design by correlation analysis

Loss concealments for low bit-rate coded speech: Correlations of Linear Predictor Representations

Correlations of LSP:

Frame distance   x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
1                0.82   0.81   0.75   0.72   0.81   0.76   0.74   0.73   0.73   0.74
2                0.61   0.64   0.50   0.45   0.59   0.46   0.43   0.43   0.45   0.55
3                0.46   0.52   0.35   0.26   0.40   0.24   0.21   0.24   0.26   0.42

- Correlations of RF and LAR are similar
- Comparable to voice-sample correlations:

Sample distance   1      2      3
Correlation       0.83   0.60   0.35
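A correlation analysis of this kind can be sketched as the Pearson correlation between a parameter track and a copy of itself shifted by d frames. The slides do not say which estimator was used, so Pearson correlation is an assumption, and the function names and the example track are hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def correlation_at_distance(track, d):
    """Correlation between a parameter in frame n and in frame n + d.

    track: one value per frame, e.g. the first LSP coefficient of each frame.
    """
    return pearson(track[:-d], track[d:])


# Hypothetical slowly varying parameter track (one value per frame).
track = [0.30, 0.32, 0.35, 0.33, 0.31, 0.36, 0.40, 0.41, 0.38, 0.35]
for d in (1, 2, 3):
    print(d, round(correlation_at_distance(track, d), 2))
```

A parameter whose correlation stays high at distance 1 or 2 can be estimated from its neighbors when lost, which is what makes it a candidate for interleaving across descriptions.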

Loss concealments for low bit-rate coded speech: Correlations of FS CELP Excitation Parameters

Adaptive codewords (2-element vector):

Distance   ac delay corr.   ac gain corr.
1          0.57             0.004
2          0.22             0.007
3          0.21             0.006

Stochastic codewords (60-element vector):
[Figure: correlation (-0.06 to 0.08) vs. element index (0 to 50) for distances 1, 2, and 3]

Very low or no correlation for excitation parameters

Coder-dependent LP-based MDC for linear predictive coders: FS CELP SDC and LP-Based Two-Way MDC (US Patent 6754203 B2)

FS CELP SDC:
- Each 240-sample frame is coded into four subframe excitations (ac1, sc1), (ac2, sc2), (ac3, sc3), (ac4, sc4), 110 bits in total, plus an LP vector (34 bits)
- Quantization and packetization: one UDP packet (144 bits + header)
- Packet sequence: one packet every 30 ms

Construction of two-way MDC (with the same bandwidth as SDC):
- Interleaving set: two consecutive 240-sample frames of the original speech sequence
- Modified CELP encoding: (ac0,1, sc0,1), (ac0,2, sc0,2), LP0 for the first frame; (ac1,1, sc1,1), (ac1,2, sc1,2), LP1 for the second frame
- Interleave LP vectors, replicate excitation vectors:
  - Description 0: LP0 vector; ac0,1, ac0,2, ac1,1, ac1,2; sc0,1, sc0,2, sc1,1, sc1,2
  - Description 1: LP1 vector; ac0,1, ac0,2, ac1,1, ac1,2; sc0,1, sc0,2, sc1,1, sc1,2
- Quantization and packetization: each description in one UDP packet (144 bits + header)
- Packet sequence: one packet every 30 ms

Coder-dependent LP-based MDC for linear predictive coders: LP-Based Four-Way MDC
- Interleaving set: four consecutive 240-sample frames of the original speech sequence
- Modified CELP encoding: (ac0,1, sc0,1), LP0; (ac1,1, sc1,1), LP1; (ac2,1, sc2,1), LP2; (ac3,1, sc3,1), LP3
- Interleave LP vectors, replicate excitation vectors:
  - Description 0: LP0 vector; ac0,1, ac1,1, ac2,1, ac3,1; sc0,1, sc1,1, sc2,1, sc3,1
  - Description 1: LP1 vector; ac0,1, ac1,1, ac2,1, ac3,1; sc0,1, sc1,1, sc2,1, sc3,1
  - Description 2: LP2 vector; ac0,1, ac1,1, ac2,1, ac3,1; sc0,1, sc1,1, sc2,1, sc3,1
  - Description 3: LP3 vector; ac0,1, ac1,1, ac2,1, ac3,1; sc0,1, sc1,1, sc2,1, sc3,1
- Quantization and packetization: each description in one UDP packet (144 bits + headers)
- Packet sequence: one packet every 30 ms
- Further quality degradation with longer subframes and the same packet size
- No quality degradation if 60-bit subframes but larger packet size are used
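The interleave-LP, replicate-excitation construction can be sketched with plain data structures. This only illustrates how the descriptions are assembled and what a receiver is left with after a loss; it does not encode speech. The class and function names are hypothetical, and the bit counts are the ones quoted for FS CELP (34 bits for LP, 110 bits for excitation).

```python
from dataclasses import dataclass

LP_BITS = 34           # LP vector, as quoted for FS CELP
EXCITATION_BITS = 110  # all excitation parameters in one packet


@dataclass
class Description:
    index: int         # which frame of the interleaving set this LP vector belongs to
    lp: tuple          # LP vector of that frame only
    excitation: tuple  # excitation of the whole set, replicated in every description


def build_descriptions(lp_vectors, excitation):
    """One description per frame: its own LP vector plus the shared excitation."""
    return [Description(k, tuple(lp), tuple(excitation))
            for k, lp in enumerate(lp_vectors)]


def receive(arrived, n_frames):
    """Collect what survived; missing LP vectors are marked None for reconstruction."""
    if not arrived:
        return None  # whole interleaving set lost: unrecoverable
    lp = [None] * n_frames
    for d in arrived:
        lp[d.index] = d.lp
    return lp, arrived[0].excitation


# Two-way example with placeholder parameter values.
excitation = ("ac0,1", "ac0,2", "ac1,1", "ac1,2", "sc0,1", "sc0,2", "sc1,1", "sc1,2")
descs = build_descriptions([(0.1, 0.2), (0.3, 0.4)], excitation)

lp, exc = receive([descs[0]], n_frames=2)  # description 1 lost
print(lp)                          # LP1 is None and must be reconstructed
print(LP_BITS + EXCITATION_BITS)   # payload bits per description
```

Since the excitation travels in every description, a single loss costs only one LP vector, which is the parameter the correlation analysis showed to be predictable from neighboring frames.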

Coder-dependent LP-based MDC for linear predictive coders: Synthetic Tests Without Loss
- All descriptions are received
- No aliasing
- Linear prediction precision same as SDC
- Excitation quality degraded due to extended subframe size
- Performance evaluation by LR and CD
[Figure: likelihood ratio vs. audio file index for original, 2-way LP MDC, 4-way LP MDC, and 2-way sample MDC]
[Figure: cepstrum distance vs. audio file index for the same four schemes]
- Much better than the sample-based MDC method

Coder-dependent LP-based MDC for linear predictive coders: Results of Two-Way MDC With One Description Received
- Reconstruction of lost LP vectors based on one of the three representations
- Comparison using two extra measures: spectral distortion and correlation

$$ SD = E\left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| 10 \log_{10} |H_{o,n}(\omega)|^2 - 10 \log_{10} |H_{r,n}(\omega)|^2 \right| d\omega \right] \ \text{dB} $$

- LSP gives the best reconstruction quality
[Figure: spectral distance vs. audio file index for LSP, RF, and LAR]
[Figure: correlation vs. coefficient number (1 to 10) for LSP, RF, and LAR]
[Figure: likelihood ratio vs. audio file index for LSP, RF, and LAR]

Coder-dependent LP-based MDC for linear predictive coders: Internet Test Setup
- Components:
  - Sender
  - Receiver: 200 ms jitter buffer, clock starts when the first packet arrives
  - Internet simulator: delays and drops packets according to traffic traces
- Comparison between:
  - SDC
  - Adaptive MDC: dynamically switches between two-way and four-way MDC depending on loss conditions
- Comparison metrics:
  - Quality in LR and CD
  - Fractions of unrecoverable losses

Coder-dependent LP-based MDC for linear predictive coders: Internet Tests, UIUC-Central Europe
[Figure: fraction of unrecovered losses vs. UIUC time of day (hour), SDC vs. A-MDC]
[Figure: likelihood ratio (LR) vs. UIUC time of day (hour) for SDC, A-MDC(LSP), A-MDC(RF), A-MDC(LAR)]
[Figure: cepstral distance (CD) vs. UIUC time of day (hour) for SDC, A-MDC(LSP), A-MDC(RF), A-MDC(LAR)]
- Summary of adaptive MDC:
  - Recovering the decoding state
  - LSP best overall
  - Effective in reducing unrecovered losses
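The receiver behavior in this setup (a 200 ms jitter buffer whose clock starts when the first packet arrives) can be sketched as a playout-deadline check. The deadline rule below is an assumption about how such a buffer is typically modeled, not a description of the actual test harness: packet k must arrive no later than the first arrival time plus the buffer depth plus k packet periods, and late packets count as lost. The 30 ms period matches the FS CELP packet interval; the function name and arrival times are hypothetical.

```python
def playout_losses(arrivals, buffer_ms=200.0, period_ms=30.0):
    """Mark each packet as lost (True) or playable (False).

    arrivals: arrival time in ms per sequence number, or None if the
              network dropped the packet.
    Assumed rule: the clock starts at the first packet that arrives; packet k
    is playable if it arrives by first_arrival + buffer_ms + offset * period_ms,
    where offset is its distance in sequence numbers from that first packet.
    """
    first_seq = next((k for k, t in enumerate(arrivals) if t is not None), None)
    if first_seq is None:
        return [True] * len(arrivals)
    start = arrivals[first_seq]
    lost = []
    for k, t in enumerate(arrivals):
        deadline = start + buffer_ms + (k - first_seq) * period_ms
        lost.append(t is None or t > deadline)
    return lost


# Hypothetical trace: packet 2 dropped, packet 3 delayed past its deadline.
arrivals = [0.0, 40.0, None, 300.0, 310.0]
print(playout_losses(arrivals))
```

Under this model a late packet and a dropped packet are indistinguishable to the decoder, so both feed into the fraction of unrecoverable losses.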

[Roadmap diagram, annotated with results so far: Previous Work; Traffic Study; Loss Concealment for Low Bit-Rate Coders (Sample MDC: not good; Parameter MDC; Correlation analysis: useful; LP-based MDC: good); Improving MDC Quality (LSP reconstruction, Relate LR with LSP reconstruction error, Optimal interpolations, Decoding quality, Noise shaping, Modification)]

Improving MDC quality, LSP reconstruction: Optimized Two-Point Linear Interpolation

[Diagram: LSP coefficients x1 to x10 on the frequency axis from 0 to π, shown for frames n-1, n, and n+1 along the time axis]

$$ x_n^r = \alpha x_{n-1} + \beta x_{n+1} $$

- Averaging: α = β = 0.5
[Figure: spectral distortion vs. audio file index, averaging vs. 2-pt linear]
- Averaging is almost optimal for two-point linear interpolation

Improving MDC quality, LSP reconstruction: Optimized Six-Point Interpolations

[Diagram: LSP coefficients x1 to x10 on the frequency axis from 0 to π, shown for frames n-1, n, and n+1 along the time axis; panels for six-point linear interpolation and six-point second-order interpolation]
[Figure: spectral distortion vs. audio file index, averaging vs. 6-pt linear]
- Averaging is near optimal
[Figure: spectral distortion vs. audio file index, averaging vs. 6-pt 2-order]
- Not too much improvement
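The two-point rule x_n^r = α x_{n-1} + β x_{n+1} and the search for its best weights can be sketched as follows. The slides evaluate interpolators by spectral distortion; this sketch substitutes a simpler criterion, least squares on the coefficient values themselves, and that substitution is an assumption made to keep the example short. Function names and data are hypothetical.

```python
def interpolate(prev, nxt, alpha=0.5, beta=0.5):
    """Reconstruct a lost LSP vector from the previous and next frames."""
    return [alpha * p + beta * q for p, q in zip(prev, nxt)]


def fit_alpha_beta(triples):
    """Least-squares alpha, beta from (x_prev, x_cur, x_next) training samples.

    Solves the 2x2 normal equations of min sum (x_cur - alpha*x_prev - beta*x_next)^2.
    Assumption: squared coefficient error stands in for spectral distortion.
    """
    spp = spn = snn = spc = snc = 0.0
    for p, c, n in triples:
        spp += p * p
        spn += p * n
        snn += n * n
        spc += p * c
        snc += n * c
    det = spp * snn - spn * spn
    alpha = (spc * snn - spn * snc) / det
    beta = (spp * snc - spn * spc) / det
    return alpha, beta


# Averaging (alpha = beta = 0.5) applied to two neighboring LSP vectors.
print(interpolate([0.20, 0.40], [0.40, 0.80]))

# Hypothetical training data generated with alpha = 0.3, beta = 0.7.
data = [(1.0, 1.7, 2.0), (2.0, 1.3, 1.0), (3.0, 4.4, 5.0)]
print(fit_alpha_beta(data))
```

If weights fitted on real LSP tracks come out close to 0.5 each, that would agree with the observation that plain averaging is almost optimal.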

Improving MDC quality, Decoding quality: Causes for Quality Degradation

[Figure: magnitude (dB, 0 to 100) vs. normalized frequency (0 to 1)]

Speech perception:
- Valley noise more noticeable
- Formant important
Significantly higher coding noise inside formant regions due to MDC

a) Two-way MDC

              E RF        E RF
SDC           1.1591e+8   2.3976e+7
Two-way MDC   2.3049e+8   4.4340e+7
Ratio         1.99        1.85

b) Four-way MDC

               E RF        E RF
SDC            2.3507e+8   4.4775e+7
Four-way MDC   8.6771e+8   1.2685e+8
Ratio          4.69        3.83

Improving MDC quality, Decoding quality: Perceptual Weighting Filter
- Goal: noise shaping
- De-emphasize coding noise inside formant regions
[Figure: magnitude (dB, -10 to 25) vs. normalized frequency (0 to 1), showing LP, formant frequency, lower bound, upper bound, and perceptual weight]

Improving MDC quality, Decoding quality: Modification to Perceptual-Weighting Filter
- Modification:
  - Decrease suppression of noises inside formant regions
  - Shift noise outside formant regions
  - Use SDC noise balancing as reference
- Choosing a suitable perceptual-weighting filter (γ is the filter parameter):

                          E RF                 E RF
                          Mag.        Ratio    Mag.        Ratio
SDC                       1.1591e+8            2.3976e+7
two-way MDC (γ = 0.8)     2.3049e+8   1.99     4.4340e+7   1.85
two-way MDC (γ = 0.82)    2.2368e+8   1.93     4.3659e+7   1.82
two-way MDC (γ = 0.84)    2.1501e+8   1.85     4.4372e+7   1.85
two-way MDC (γ = 0.86)    2.1810e+8   1.88     4.5733e+7   1.91
two-way MDC (γ = 0.88)    2.0096e+8   1.73     4.6161e+7   1.92
two-way MDC (γ = 0.9)     2.0098e+8   1.73     4.7287e+7   1.97
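The role of γ can be illustrated with the weighting filter form commonly used in CELP coders, W(z) = A(z) / A(z/γ), where A(z/γ) is obtained by scaling the k-th LP coefficient by γ^k. The slides name γ as the filter parameter but do not spell out the filter, so this form is an assumption, as are the function names and the first-order example coefficients. The sign convention follows A(ω) = 1 - Σ a_k e^{-jωk}.

```python
import cmath


def expand(a, gamma):
    """Coefficients of A(z/gamma): a_k is scaled by gamma**k (k starts at 1)."""
    return [ak * gamma ** k for k, ak in enumerate(a, start=1)]


def a_response(a, w):
    """A(w) = 1 - sum_k a_k exp(-j w k) at angular frequency w."""
    return 1 - sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a, start=1))


def weighting_gain(a, gamma, w):
    """|W(w)| for the assumed form W(z) = A(z) / A(z/gamma)."""
    return abs(a_response(a, w)) / abs(a_response(expand(a, gamma), w))


# Hypothetical first-order predictor with a spectral peak at w = 0.
a = [0.9]
for gamma in (0.8, 0.9, 1.0):
    print(gamma, round(weighting_gain(a, gamma, 0.0), 3))
```

In this form the gain at the spectral peak rises toward 1 as γ approaches 1, so the error near the peak is de-emphasized less and the coder is pushed to place less noise there.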

Improving MDC quality, Decoding quality: Synthetic Tests for Improved Perceptual-Weighting Filter
- Two-way MDC when both descriptions are received
[Figure: likelihood ratio vs. audio file index for SDC, MDC(PWF) with 2 descriptions received, and MDC with 2 descriptions received]
[Figure: cepstral distance vs. audio file index for the same three schemes]
- LR similar
- Noticeable improvements in CD

Improving MDC quality, Decoding quality: Internet Tests for Improved Perceptual-Weighting Filter, UIUC-Central Europe
[Figure: likelihood ratio (LR) vs. UIUC time of day (hour) for SDC, A-MDC, and A-MDC(PWF)]
[Figure: cepstral distance (CD) vs. UIUC time of day (hour) for SDC, A-MDC, and A-MDC(PWF)]
- SDC with no loss: LR = 1.33, CD = 5.55
- Improved CD

Summary
- MDC design by correlation analysis
- LP-based MDC for low bit-rate linear predictive speech coders
- Optimizing LSP reconstruction
- Improve MDC excitation quality
Future work
- Further improve MDC quality
- Bandwidth and quality tradeoff
- Rate adaptation

Future Outlook
- Commercial products and services available
  - VoIP solutions for dial-up with poor quality and broadband with acceptable quality
  - Net2Phone, Skype, NetMeeting
  - Broadband connection and low delay success
- Future research directions
  - Mobile endpoint increases delay
  - Wireless broadband delay problem not solved
- New application areas
  - The use of VoIP technology combined with other multimedia for a complete virtual meeting
  - Microsoft announced Project Istanbul for Summer 2005