Voice and Audio Compression for Wireless Communications


by © L. Hanzo, F.C.A. Somerville, J.P. Woodard, H-T. How
School of Electronics and Computer Science, University of Southampton, UK

Contents
Preface and Motivation Acknowledgements
I Speech Signals and Waveform Coding 1 Speech Signals and Coding Motivation of Speech Compression Basic Characterisation of Speech Signals Classification of Speech Codecs Waveform Coding Time-domain Waveform Coding Frequency-domain Waveform Coding Vocoders Hybrid Coding Waveform Coding Digitisation of Speech Quantisation Characteristics Quantisation Noise and Rate-Distortion Theory Non-uniform Quantisation for a Known PDF: Companding PDF-independent Quantisation using Logarithmic Compression The µ-law Compander The A-law Compander Optimum Non-uniform Quantisation Chapter Summary Predictive Coding Forward Predictive Coding DPCM Codec Schematic Predictor Design

Problem Formulation Covariance Coefficient Computation Predictor Coefficient Computation Adaptive One-word-memory Quantization DPCM Performance Backward-Adaptive Prediction Background Stochastic Model Processes The 32 kbps G.721 ADPCM Codec Functional Description of the G.721 Codec Adaptive Quantiser G.721 Quantiser Scale Factor Adaptation G.721 Adaptation Speed Control G.721 Adaptive Prediction and Signal Reconstruction Speech Quality Evaluation G.726 and G.727 ADPCM Coding Motivation Embedded G.727 ADPCM coding Performance of the Embedded G.727 ADPCM Codec Rate-Distortion in Predictive Coding Chapter Summary
II Analysis by Synthesis Coding 3 Analysis-by-synthesis Principles Motivation Analysis-by-synthesis Codec Structure The Short-term Synthesis Filter Long-Term Prediction Open-loop Optimisation of LTP parameters Closed-loop Optimisation of LTP parameters Excitation Models Adaptive Post-filtering Lattice-based Linear Prediction Chapter Summary Speech Spectral Quantization Log-area Ratios Line Spectral Frequencies Derivation of the Line Spectral Frequencies Computation of the Line Spectral Frequencies Chebyshev-description of Line Spectral Frequencies Spectral Vector Quantization Background Speaker-adaptive Vector Quantisation of LSFs

Stochastic VQ of LPC Parameters Background The Stochastic VQ Algorithm Robust Vector Quantisation Schemes for LSFs LSF Vector-quantisers in Standard Codecs Spectral Quantizers for Wideband Speech Coding Introduction to Wideband Spectral Quantisation Statistical Properties of Wideband LSFs Speech Codec Specifications Wideband LSF Vector Quantizers Memoryless Vector Quantization Predictive Vector Quantization Multimode Vector Quantization Simulation Results and Subjective Evaluations Conclusions on Wideband Spectral Quantisation Chapter Summary RPE Coding Theoretical Background The 13 kbps RPE-LTP GSM Speech encoder Pre-processing STP analysis filtering LTP analysis filtering Regular Excitation Pulse Computation The 13 kbps RPE-LTP GSM Speech Decoder Bit-sensitivity of the GSM Codec A Tool-box Based Speech Transceiver Chapter Summary Forward-Adaptive CELP Coding Background The Original CELP Approach Fixed Codebook Search CELP Excitation Models Binary Pulse Excitation Transformed Binary Pulse Excitation Excitation Generation TBPE Bit Sensitivity Dual-rate Algebraic CELP Coding ACELP Codebook Structure Dual-rate ACELP Bitallocation Dual-rate ACELP Codec Performance CELP Optimization Introduction Calculation of the Excitation Parameters Full Codebook Search Theory

Sequential Search Procedure Full Search Procedure Sub-Optimal Search Procedures Quantization of the Codebook Gains Calculation of the Synthesis Filter Parameters Bandwidth Expansion Least Squares Techniques Optimization via Powell's Method Simulated Annealing and the Effects of Quantization CELP Error-sensitivity Introduction Improving the Spectral Information Error Sensitivity LSF Ordering Policies The Effect of FEC on the Spectral Parameters The Effect of Interpolation Improving the Error Sensitivity of the Excitation Parameters The Fixed Codebook Index The Fixed Codebook Gain Adaptive Codebook Delay Adaptive Codebook Gain Matching Channel Codecs to the Speech Codec Error Resilience Conclusions Dual-mode Speech Transceiver The Transceiver Scheme Re-configurable Modulation Source-matched Error Protection Low-quality 3.1 kbd Mode High-quality 3.1 kbd Mode Packet Reservation Multiple Access kbd System Performance kbd System Summary Multi-slot PRMA Transceiver Background and Motivation PRMA-assisted Multi-slot Adaptive Modulation Adaptive GSM-like Schemes Adaptive DECT-like Schemes Summary of Adaptive Multi-slot PRMA Chapter Summary Standard Speech Codecs Background The US DoD FS kbits/s CELP Codec Introduction LPC Analysis and Quantization The Adaptive Codebook The Fixed Codebook

Error Concealment Techniques Decoder Post-Filtering Conclusion The IS-54 DAMPS speech codec The JDC speech codec The Qualcomm Variable Rate CELP Codec Introduction Codec Schematic and Bit Allocation Codec Rate Selection LPC Analysis and Quantization The Pitch Filter The Fixed Codebook Rate 1/8 Filter Excitation Decoder Post-Filtering Error Protection and Concealment Techniques Conclusion Japanese Half-Rate Speech Codec Introduction Codec Schematic and Bit Allocation Encoder Pre-Processing LPC Analysis and Quantization The Weighting Filter Excitation Vector Excitation Vector Channel Coding Decoder Post Processing The half-rate GSM codec Half-rate GSM codec outline Half-rate GSM Codec's Spectral Quantisation Error protection The 8 kbits/s G.729 Codec Introduction Codec Schematic and Bit Allocation Encoder Pre-Processing LPC Analysis and Quantization The Weighting Filter The Adaptive Codebook The Fixed Algebraic Codebook Quantization of the Gains Decoder Post Processing G.729 Error Concealment Techniques G.729 Bit-sensitivity Turbo-coded OFDM G.729 Speech Transceiver Background System Overview Turbo Channel Encoding

OFDM in the FRAMES Speech/Data Sub Burst Channel model Turbo-coded G.729 OFDM Parameters Turbo-coded G.729 OFDM Performance Turbo-coded G.729 OFDM Summary G.729 Summary The Reduced Complexity G.729 Annex A Codec Introduction The Perceptual Weighting Filter The Open Loop Pitch Search The Closed Loop Pitch Search The Algebraic Codebook Search The Decoder Post Processing Conclusions The Enhanced Full-rate GSM codec Codec Outline Operation of the EFR-GSM Encoder Spectral Quantisation in the EFR-GSM Codec Adaptive Codebook Search Fixed Codebook Search The IS-136 Speech Codec IS-136 codec outline IS-136 Bitallocation scheme Fixed Codebook Search IS-136 Channel Coding The ITU G Dual-Rate Codec Introduction G Encoding Principle Vector-Quantisation of the LSPs Formant-based Weighting Filter The 6.3 kbps High-rate G Excitation The 5.3 kbps low-rate G excitation G Bitallocation G Error Sensitivity Advanced Multi-rate JD-CDMA Transceiver Multi-rate codecs and systems System Overview The Adaptive Multi-Rate Speech Codec AMR Codec Overview Linear Prediction Analysis LSF Quantization Pitch Analysis Fixed Codebook With Algebraic Structure Post-Processing The AMR Codec's Bit Allocation Codec Mode Switching Philosophy

The AMR Speech Codec's Error Sensitivity Redundant Residue Number System Based Channel Coding Redundant Residue Number System Overview Source-Matched Error Protection Joint Detection Code Division Multiple Access Overview Joint Detection Based Adaptive Code Division Multiple Access System Performance Subjective Testing Conclusions Chapter Summary Backward-Adaptive CELP Coding Introduction Motivation and Background Backward-Adaptive G728 Schematic Backward-Adaptive G728 Coding G728 Error Weighting G728 Windowing Codebook Gain Adaption G728 Codebook Search G728 Excitation Vector Quantization G728 Adaptive Postfiltering Adaptive Long-term Postfiltering G728 Adaptive Short-term Postfiltering Complexity and Performance of the G728 Codec Reduced-Rate 16-8 kbps G728-Like Codec I The Effects of Long Term Prediction Closed-Loop Codebook Training Reduced-Rate 16-8 kbps G728-Like Codec II Programmable-Rate 8-4 kbps CELP Codecs Motivation kbps Codec Improvements kbps Codecs - Forward Adaption of the STP Synthesis Filter kbps Codecs - Forward Adaption of the LTP Initial Experiments Quantization of Jointly Optimized Gains kbps Codecs - Voiced/Unvoiced Codebooks Low Delay Codecs at 4-8 kbits/s Low Delay ACELP Codec Backward-adaptive Error Sensitivity Issues The Error Sensitivity of the G728 Codec The Error Sensitivity of Our 4-8 kbits/s Low Delay Codecs The Error Sensitivity of Our Low Delay ACELP Codec A Low-Delay Multimode Speech Transceiver

Background kbps Codec Performance Transmission Issues Higher-quality Mode Lower-quality Mode Speech Transceiver Performance Chapter Summary
III Wideband Coding and Transmission Wideband Speech Coding Subband-ADPCM Wideband Coding Introduction and Specifications G722 Codec Outline Principles of Subband Coding Quadrature Mirror Filtering Analysis Filtering Synthesis Filtering Practical QMF Design Constraints G722 Adaptive Quantisation and Prediction G722 Coding Performance Wideband Transform-Coding at 32 kbps Background Transform-Coding Algorithm Subband-Split Wideband CELP Codecs Background Subband-based Wideband CELP coding Motivation Low-band Coding Highband Coding Bit allocation Scheme Fullband Wideband ACELP Coding Wideband ACELP Excitation Wideband 32 kbps ACELP Coding Wideband 9.6 kbps ACELP Coding Turbo-coded Wideband Speech Transceiver Background and Motivation System Overview System Parameters Constant Throughput Adaptive Modulation Adaptive Wideband Transceiver Performance Multi mode Transceiver Adaptation Transceiver Mode Switching The Wideband G Codec Audio Codec Overview

Detailed Description of the Audio Codec Wideband Adaptive System Performance Audio Frame Error Results Audio Segmental SNR Performance and Discussions G Audio Transceiver Summary and Conclusions Turbo-Detected IRCC AMR-WB Transceivers Introduction The AMR-WB Codec's Error Sensitivity System Model Design of Irregular Convolutional Codes An Example Irregular Convolutional Code UEP AMR IRCC Performance Results UEP AMR Conclusions The AMR-WB+ Audio Codec Introduction Audio requirements in mobile multimedia applications Summary of audio-visual services Bit rates supported by the radio network Overview of the AMR-WB+ codec Encoding the high frequencies Stereo encoding Complexity of AMR-WB Transport and file format of AMR-WB Performance of AMR-WB Summary of the AMR-WB+ codec Chapter Summary Advanced Multi-Rate Speech Transceivers Introduction The Adaptive Multi-Rate Speech Codec Overview Linear Prediction Analysis LSF Quantization Pitch Analysis Fixed Codebook With Algebraic Structure Post-Processing The AMR Codec's Bit Allocation Codec Mode Switching Philosophy Speech Codec's Error Sensitivity System Background System Overview Redundant Residue Number System (RRNS) Channel Coding Overview Source-Matched Error Protection Joint Detection Code Division Multiple Access Overview

Joint Detection Based Adaptive Code Division Multiple Access System Performance Subjective Testing A Turbo-Detected Irregular Convolutional Coded AMR Transceiver Motivation The AMR-WB Codec's Error Sensitivity System Model Design of Irregular Convolutional Codes An Example Irregular Convolutional Code UEP AMR IRCC Performance Results UEP AMR Conclusions Chapter Summary MPEG-4 Audio Compression and Transmission Overview of MPEG-4 Audio General Audio Coding Advanced Audio Coding Gain Control Tool Psychoacoustic Model Temporal Noise Shaping Stereophonic Coding AAC Quantization and Coding Noiseless Huffman Coding Bit-Sliced Arithmetic Coding Transform-domain Weighted Interleaved Vector Quantization Parametric Audio Coding Speech Coding in MPEG-4 Audio Harmonic Vector Excitation Coding CELP Coding in MPEG LPC Analysis and Quantization Multi Pulse and Regular Pulse Excitation MPEG-4 Codec Performance MPEG-4 Space-Time Block Coded OFDM Audio Transceiver System Overview System parameters Frame Dropping Procedure Space-Time Coding Adaptive Modulation System Performance Turbo-Detected STTC Aided MPEG-4 Audio Transceivers Motivation and Background Audio Turbo Transceiver Overview The Turbo Transceiver Turbo Transceiver Performance Results MPEG-4 Turbo Transceiver Summary Turbo-Detected STTC Aided MPEG-4 Versus AMR-WB Transceivers

Motivation and Background The AMR-WB Codec's Error Sensitivity The MPEG-4 TwinVQ Codec's Error Sensitivity The Turbo Transceiver Performance Results AMR-WB and MPEG-4 TwinVQ Turbo Transceiver Summary Chapter Summary
IV Very Low Rate Coding and Transmission Overview of Low-rate Speech Coding Low Bit Rate Speech Coding Analysis-by-Synthesis Coding Speech Coding at 2.4kbps Background to 2.4kbps Speech Coding Frequency Selective Harmonic Coder Sinusoidal Transform Coder Multiband Excitation Coders Subband Linear Prediction Coder Mixed Excitation Linear Prediction Coder Waveform Interpolation Coder Speech Coding Below 2.4kbps Linear Predictive Coding model Short Term Prediction Long Term Prediction Final Analysis-by-Synthesis Model Speech Quality Measurements Objective Speech Quality Measures Subjective Speech Quality Measures kbps Selection Process Speech Database Chapter Summary Linear Predictive Vocoder Overview of a Linear Predictive Vocoder Line Spectrum Frequencies Quantization Line Spectrum Frequencies Scalar Quantization Line Spectrum Frequencies Vector Quantization Pitch Detection Voiced-Unvoiced Decision Oversampled Pitch Detector Pitch Tracking Computational Complexity Integer Pitch Detector Unvoiced Frames

13.5 Voiced Frames Placement of Excitation Pulses Pulse Energy Adaptive Postfilter Pulse Dispersion Filter Pulse Dispersion Principles Pitch Independent Glottal Pulse Shaping Filter Pitch Dependent Glottal Pulse Shaping Filter Results for Linear Predictive Vocoder Chapter Summary Wavelets and Pitch Detection Conceptual Introduction to Wavelets Fourier Theory Wavelet Theory Detecting Discontinuities with Wavelets Introduction to Wavelet Mathematics Multiresolution Analysis Polynomial Spline Wavelets Pyramidal Algorithm Boundary Effects Preprocessing the Wavelet Transform Signal Spurious Pulses Normalization Candidate Glottal Pulses Voiced-Unvoiced Decision Wavelet Based Pitch Detector Dynamic Programming Autocorrelation Simplification Chapter Summary Zinc Function Excitation Introduction Overview of Prototype Waveform Interpolation Zinc Function Excitation Coding Scenarios U-U-U Encoder Scenario U-U-V Encoder Scenario V-U-U Encoder Scenario U-V-U Encoder Scenario V-V-V Encoder Scenario V-U-V Encoder Scenario U-V-V Encoder Scenario V-V-U Encoder Scenario U-V Decoder Scenario U-U Decoder Scenario V-U Decoder Scenario

V-V Decoder Scenario Zinc Function Modelling Error Minimization Computational Complexity Reducing the Complexity of Zinc Function Excitation Optimization Phases of the Zinc Functions Pitch Detection Voiced-Unvoiced Boundaries Pitch Prototype Selection Voiced Speech Energy Scaling Quantization Excitation Interpolation Between Prototype Segments ZFE Interpolation Regions ZFE Amplitude Parameter Interpolation ZFE Position Parameter Interpolation Implicit Signalling of Prototype Zero Crossing Removal of ZFE Pulse Position Signalling and Interpolation Pitch Synchronous Interpolation of Line Spectrum Frequencies ZFE Interpolation Example Unvoiced Speech Adaptive Postfilter Results for Single Zinc Function Excitation Error Sensitivity of the 1.9kbps PWI-ZFE Coder Parameter Sensitivity of the 1.9kbps PWI-ZFE coder Line Spectrum Frequencies Voiced-Unvoiced Flag Pitch Period Excitation Amplitude Parameters Root Mean Square Energy Parameter Boundary Shift Parameter Degradation from Bit Corruption Error Sensitivity Classes Multiple Zinc Function Excitation Encoding Algorithm Performance of Multiple Zinc Function Excitation A Sixth-rate, 3.8 kbps GSM-like Speech Transceiver Motivation The Turbo-coded Sixth-rate 3.8 kbps GSM-like System Turbo Channel Coding The Turbo-coded GMSK Transceiver System Performance Results Chapter Summary

16 Mixed-Multiband Excitation Introduction Overview of Mixed-Multiband Excitation Finite Impulse Response Filter Mixed-Multiband Excitation Encoder Voicing Strengths Mixed-Multiband Excitation Decoder Adaptive Postfilter Computational Complexity Performance of the Mixed-Multiband Excitation Coder Performance of a Mixed-Multiband Excitation Linear Predictive Coder Performance of a Mixed-Multiband Excitation and Zinc Function Prototype Excitation Coder A Higher Rate 3.85kbps Mixed-Multiband Excitation Scheme A 2.35 kbit/s Joint-Detection CDMA Speech Transceiver Background The Speech Codec's Bit Allocation The Speech Codec's Error Sensitivity Channel Coding The JD-CDMA Speech System System performance Conclusions on the JD-CDMA Speech Transceiver Chapter Summary Sinusoidal Transform Coding Below 4kbps Introduction Sinusoidal Analysis of Speech Signals Sinusoidal Analysis with Peak Picking Sinusoidal Analysis using Analysis-by-Synthesis Sinusoidal Synthesis of Speech Signals Frequency, Amplitude and Phase Interpolation Overlap-Add Interpolation Low Bit Rate Sinusoidal Coders Increased Frame Length Incorporating Linear Prediction Analysis Incorporating Prototype Waveform Interpolation Encoding the Sinusoidal Frequency Component Determining the Excitation Components Peak-Picking of the Residual Spectra Analysis-by-Synthesis of the Residual Spectrum Computational Complexity Reducing the Computational Complexity Quantizing the Excitation Parameters Encoding the Sinusoidal Amplitudes Vector Quantization of the Amplitudes Interpolation and Decimation

Vector Quantization Vector Quantization Performance Scalar Quantization of the Amplitudes Encoding the Sinusoidal Phases Vector Quantization of the Phases Encoding the Phases with a Voiced-Unvoiced Switch Encoding the Sinusoidal Fourier Coefficients Equivalent Rectangular Bandwidth Scale Voiced-Unvoiced Flag Sinusoidal Transform Decoder Pitch Synchronous Interpolation Fourier Coefficient Interpolation Frequency Interpolation Computational Complexity Speech Coder Performance Chapter Summary Conclusions on Low Rate Coding Summary Listening Tests Summary of Very Low Rate Coding Further Research Comparison of Speech Transceivers Background to Speech Quality Evaluation Objective Speech Quality Measures Introduction Signal to Noise Ratios Articulation Index Cepstral Distance Cepstral Example Logarithmic likelihood ratio Euclidean Distance Subjective Measures Quality Tests Comparison of Quality Measures Background Intelligibility tests Subjective Speech Quality of Various Codecs Speech Codec Bit-sensitivity Transceiver Speech Performance Chapter Summary
A Constructing the Quadratic Spline Wavelets
B Zinc Function Excitation

C Probability Density Function for Amplitudes
Bibliography
Index
Author Index

Preface and Motivation

The Speech Coding Scene

Despite the emergence of sophisticated high-rate multimedia services, voice communications remain the predominant means of human communications, although the compressed voice signals may be delivered via the Internet. The large-scale, pervasive introduction of wireless Internet services is likely to promote the unified transmission of both voice and data signals using the Voice over Internet Protocol (VoIP) even in the third-generation (3G) wireless systems, despite wasting much of the valuable frequency resources on the transmission of packet headers. Even when the predicted surge of wireless data and Internet services becomes a reality, voice will remain the most natural means of human communications.

This book is dedicated to audio and voice compression issues, although the aspects of error resilience, coding delay, implementational complexity and bitrate are also at the centre of our discussions, characterising many different speech codecs incorporated in source-sensitivity matched wireless transceivers. A unique feature of the book is that it also provides cutting-edge, research-oriented, turbo-transceiver-aided design examples and a chapter on the VoIP protocol.

Here we attempt a rudimentary comparison of some of the codec schemes treated in the book in terms of their speech quality and bitrate, in order to provide a road map for the reader with reference to Cox's work [1, 2]. The formally evaluated Mean Opinion Score (MOS) values of the various codecs portrayed in the book are shown in Figure 1. Observe in the figure that over the years a range of speech codecs have emerged, which attained the quality of the 64 kbps G.711 PCM speech codec, although at the cost of significantly increased coding delay and implementational complexity.
The 8 kbps G.729 codec is the most recent addition to this range of the International Telecommunication Union's (ITU) standard schemes, which significantly outperforms all previous standard ITU codecs in robustness terms. The performance target of the 4 kbps ITU codec (ITU4) is also to maintain this impressive set of specifications. The family of codecs designed for various mobile radio systems - such as the 13 kbps Regular Pulse Excited (RPE) scheme of the Global System for Mobile communications known as GSM, the 7.95 kbps IS-54 and the IS-95 Pan-American schemes, the 6.7 kbps Japanese Digital Cellular (JDC) and the 3.45 kbps half-rate JDC arrangement (JDC/2) - exhibits slightly lower MOS values than the ITU codecs.

Let us now consider the subjective quality of these schemes in a little more depth. The 2.4 kbps US Department of Defence Federal Standard codec known as FS-1015 is the only vocoder in this group and it has a rather synthetic speech quality, associated with the lowest subjective assessment in the figure. The 64 kbps G.711 PCM codec and the G.726/G.727 Adaptive Differential PCM (ADPCM) schemes are waveform codecs. They exhibit a low implementational complexity associated with a modest bitrate economy. The remaining codecs belong to the so-called hybrid coding family and achieve significant bitrate economies at the cost of increased complexity and delay.

[Figure 1: Subjective speech quality of various codecs [1] © IEEE, 1996. Vertical axis: MOS (Poor, Fair, Good, Excellent); horizontal axis: bit rate (kb/s); complexity and delay are also indicated. Codecs plotted: ITU4, G.723, G.729, G.728, G.726, G.711 PCM, JDC/2, GSM, IS54, IS96, JDC, MELP, In-M, FS1016, FS1015, plus a "New Research" region.]

Specifically, the 16 kbps G.728 backward-adaptive scheme maintains a similar speech quality to the 32 and 64 kbps waveform codecs, while also maintaining an impressively low, 2 ms delay. This scheme was standardised during the early nineties. The similar-quality, but significantly more robust 8 kbps G.729 codec was approved in March 1996 by the ITU. Its standardisation overlapped with the G codec developments. The G codec's 6.4 kbps mode maintains a speech quality similar to the G.711, G.726, G.727, G.728 and G.729 codecs, while its 5.3 kbps mode exhibits a speech quality similar to the cellular speech codecs of the late eighties. The standardisation of a 4 kbps ITU scheme, which we refer to here as ITU4, is also a desirable design goal at the time of writing.

In parallel to the ITU's standardisation activities a range of speech coding standards have been proposed for regional cellular mobile systems. The standardisation of the 13 kbps RPE-LTP full-rate GSM (GSM-FR) codec dates back to the second half of the eighties, representing the first standard hybrid codec. Its complexity is significantly lower than that of the more recent Code Excited Linear Predictive (CELP) based codecs. Observe in the figure that there is also a similar-rate Enhanced Full-Rate GSM codec (GSM-EFR), which matches the speech quality of the G.729 and G.728 schemes. The original GSM-FR codec's development was followed a little later by the release of the 7.95 kbps Vector Sum Excited Linear Predictive (VSELP) IS-54 American cellular standard. Due to advances in the field the 7.95 kbps IS-54 codec achieved a similar subjective speech quality to the 13 kbps GSM-FR scheme. The definition of the 6.7 kbps Japanese JDC VSELP codec was almost coincident with that of the IS-54 arrangement. This codec development was also followed by a half-rate standardisation process, leading to the 3.2 kbps Pitch-Synchronous Innovation CELP (PSI-CELP) scheme.

The IS-95 Pan-American CDMA system also has its own standardised CELP-based speech codec, which is a variable-rate scheme, supporting bitrates between 1.2 and 14.4 kbps, depending on the prevalent voice activity. The perceived speech quality of these cellular speech codecs, contrived mainly during the late eighties, was found to be subjectively similar under the perfect channel conditions of Figure 1. Lastly, the 5.6 kbps half-rate GSM codec (GSM-HR) also met its specification in terms of achieving a similar speech quality to the 13 kbps original GSM-FR arrangement, although at the cost of quadruple complexity and higher latency.

Recently the advantages of intelligent multimode speech terminals (IMT), which can reconfigure themselves in a number of different bitrate, quality and robustness modes, attracted substantial research attention in the community, which led to the standardisation of the High-Speed Downlink Packet Access (HSDPA) mode of the 3G wireless systems. The HSDPA-style transceivers employ both adaptive modulation and adaptive channel coding, which result in a channel-quality dependent bit-rate fluctuation, hence requiring reconfigurable multimode voice and audio codecs, such as the Adaptive Multi-Rate codec referred to as the AMR scheme. Following the standardisation of the narrowband AMR codec, the wideband AMR scheme referred to as the AMR-WB arrangement and encoding the 0-7 kHz band was also developed, which will also be characterised in the book.
Finally, the most recent AMR codec, namely the so-called AMR-WB+ scheme, will also be the subject of our discussions. Recent research on sub-2.4 kbps speech codecs is also covered extensively in the book, where the aspects of auditory masking become more dominant. Finally, since the classic G.722 subband-ADPCM based wideband codec has become obsolete in the light of exciting new developments in compression, the most recent trend is to consider wideband speech and audio codecs, providing substantially enhanced speech quality. Motivated by early seminal work on transform-domain or frequency-domain based compression by Noll and his colleagues, in this field the wideband G codec - which can be programmed to operate between 10 kbps and 32 kbps and hence lends itself to employment in HSDPA-style near-instantaneously adaptive wireless communicators - is the most attractive candidate. This codec is portrayed in the context of a sophisticated burst-by-burst adaptive wideband turbo-coded Orthogonal Frequency Division Multiplex (OFDM) IMT in the book. This scheme is also capable of transmitting high-quality audio signals, behaving essentially as a high-quality waveform codec.

Milestones in Speech Coding History

Over the years a range of excellent monographs and text books have been published, characterising the state-of-the-art at its various stages of development and constituting significant milestones. The first major development in the history of speech compression can be considered the invention of the vocoder, dating back to as early as [...]. Delta modulation was contrived in 1952 and later it became well established following Steele's monograph on the topic in 1975 [3]. Pulse Code Modulation (PCM) was first documented in detail in Cattermole's classic contribution in 1969 [4]. However, it was realised in 1967 that predictive coding provides advantages over memory-less coding techniques, such as PCM. Predictive techniques were analysed in depth by Markel and Gray in their 1976 classic treatise [5]. This was shortly followed by the often cited reference [6] by Rabiner and Schafer. Lindblom and Ohman also contributed a book in 1979 on speech communication research [7]. The foundations of auditory theory were laid down as early as 1970 by Tobias [8], but these principles were not exploited to their full potential until the invention of the analysis by synthesis (AbS) codecs, which were heralded by Atal's multi-pulse excited codec in the early eighties [9]. The waveform coding of speech and video signals has been comprehensively documented by Jayant and Noll in their 1984 monograph [10].

During the eighties the speech codec developments were fuelled by the emergence of mobile radio systems, where spectrum was a scarce resource: halving the bitrate could potentially double the number of subscribers and hence the revenue. The RPE principle - as a relatively low-complexity analysis by synthesis technique - was proposed by Kroon, Deprettere and Sluyter in 1986 [11], which was followed by further research conducted by Vary [12, 13] and his colleagues at PKI in Germany and IBM in France, leading to the 13 kbps Pan-European GSM codec. This was the first standardised AbS speech codec, which also employed long-term prediction (LTP), recognising the important role that pitch determination plays in efficient speech compression [14, 15]. It was in this era that Atal and Schroeder invented the Code Excited Linear Predictive (CELP) principle [16], leading to perhaps the most productive period in the history of speech coding during the eighties.
Some of these developments were also summarised, for example, by O'Shaughnessy [17], Papamichalis [18], and Deller, Proakis and Hansen [19]. It was during this era that the importance of speech perception and acoustic phonetics [20] was duly recognised, for example in the monograph by Lieberman and Blumstein. A range of associated speech quality measures were summarised by Quackenbush, Barnwell III and Clements [21]. Nearly concomitantly Furui also published a book related to speech processing [22]. This period witnessed the appearance of many of the speech codecs seen in Figure 1, which found applications in the emerging global mobile radio systems, such as IS-54, JDC, etc. These codecs were typically associated with source-sensitivity matched error protection, where for example Steele, Sundberg and Wong [23-26] have provided early insights on the topic. Further sophisticated solutions were suggested, for example, by Hagenauer [27].

Both the narrow-band and wide-band AMR codecs, as well as the AMR-WB+ codec [28, 29], are capable of adaptively adjusting their bitrate. This also allows the user to adjust the ratio between the speech bit rate and the channel coding bit rate constituting the error protection oriented redundancy according to the prevalent near-instantaneous channel conditions in HSDPA-style transceivers. When the channel quality is inferior, the speech encoder operates at low bit rates, thus accommodating more powerful forward error control within the total bit rate budget. By contrast, under high-quality channel conditions the speech encoder may benefit from using the total bit rate budget, yielding high speech quality, since in this high-rate case low redundancy error protection is sufficient. Thus, the AMR concept allows the system to operate in an error-resilient mode under poor channel conditions, while benefitting from a better speech quality under good channel conditions. Hence, the source coding scheme must be designed for seamless switching between the available rates without annoying artifacts.
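The rate trade-off described above can be illustrated with a minimal sketch. It assumes a fixed gross bit rate budget of 22.8 kbps (the gross rate of a GSM full-rate traffic channel, used here purely as an illustrative assumption) and the eight narrowband AMR codec rates. The switching thresholds and the function name `select_mode` are hypothetical and are not taken from the book or from any standard, and a practical scheme would also add hysteresis to avoid rapid mode toggling.

```python
# Narrowband AMR speech codec rates in kbps, lowest to highest.
AMR_MODES_KBPS = (4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2)

# Assumption: fixed gross bit rate shared by speech bits and channel coding redundancy.
GROSS_BUDGET_KBPS = 22.8

# Hypothetical channel-quality thresholds in dB; crossing each one moves
# the codec up by one mode. There is one threshold fewer than there are modes.
SWITCH_THRESHOLDS_DB = (4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0)


def select_mode(channel_quality_db):
    """Return (speech_kbps, redundancy_kbps, code_rate) for a channel quality estimate."""
    # Count how many thresholds the channel quality meets or exceeds.
    index = sum(channel_quality_db >= t for t in SWITCH_THRESHOLDS_DB)
    speech_kbps = AMR_MODES_KBPS[index]
    # Whatever the speech codec does not use is left for error protection.
    redundancy_kbps = GROSS_BUDGET_KBPS - speech_kbps
    code_rate = speech_kbps / GROSS_BUDGET_KBPS
    return speech_kbps, redundancy_kbps, code_rate


if __name__ == "__main__":
    for quality in (2.0, 9.0, 20.0):
        speech, redundancy, rate = select_mode(quality)
        print(f"{quality:5.1f} dB: speech {speech} kbps, "
              f"redundancy {redundancy:.2f} kbps, code rate {rate:.2f}")
```

In this sketch a poor channel selects the lowest speech rate and leaves most of the budget for forward error control, while a good channel selects the highest speech rate with correspondingly weaker protection, mirroring the error-resilient and high-quality operating modes described in the text.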


More information

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Monika S.Yadav Vidarbha Institute of Technology Rashtrasant Tukdoji Maharaj Nagpur University, Nagpur, India monika.yadav@rediffmail.com

More information

Information. LSP (Line Spectrum Pair): Essential Technology for High-compression Speech Coding. Takehiro Moriya. Abstract

Information. LSP (Line Spectrum Pair): Essential Technology for High-compression Speech Coding. Takehiro Moriya. Abstract LSP (Line Spectrum Pair): Essential Technology for High-compression Speech Coding Takehiro Moriya Abstract Line Spectrum Pair (LSP) technology was accepted as an IEEE (Institute of Electrical and Electronics

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 12 Speech Signal Processing 14/03/25 http://www.ee.unlv.edu/~b1morris/ee482/

More information

NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC

NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC Jimmy Lapierre 1, Roch Lefebvre 1, Bruno Bessette 1, Vladimir Malenovsky 1, Redwan Salami 2 1 Université de Sherbrooke, Sherbrooke (Québec),

More information

A BURST-BY-BURST ADAPTIVE JOINT-DETECTION BASED CDMA SPEECH TRANSCEIVER. H.T. How, T.H. Liew, E.L Kuan and L. Hanzo

A BURST-BY-BURST ADAPTIVE JOINT-DETECTION BASED CDMA SPEECH TRANSCEIVER. H.T. How, T.H. Liew, E.L Kuan and L. Hanzo A BURST-BY-BURST ADAPTIVE JOINT-DETECTION BASED CDMA SPEECH TRANSCEIVER H.T. How, T.H. Liew, E.L Kuan and L. Hanzo Dept. of Electr. and Comp. Sc.,Univ. of Southampton, SO17 1BJ, UK. Tel: +-173-93 1, Fax:

More information

The Channel Vocoder (analyzer):

The Channel Vocoder (analyzer): Vocoders 1 The Channel Vocoder (analyzer): The channel vocoder employs a bank of bandpass filters, Each having a bandwidth between 100 Hz and 300 Hz. Typically, 16-20 linear phase FIR filter are used.

More information

Enhanced Waveform Interpolative Coding at 4 kbps

Enhanced Waveform Interpolative Coding at 4 kbps Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression

More information

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley University of California Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Professors : N.Morgan / B.Gold EE225D Spring,1999 Medium & High Rate Coding Lecture 26

More information

Transcoding of Narrowband to Wideband Speech

Transcoding of Narrowband to Wideband Speech University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2005 Transcoding of Narrowband to Wideband Speech Christian H. Ritz University

More information

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder COMPUSOFT, An international journal of advanced computer technology, 3 (3), March-204 (Volume-III, Issue-III) ISSN:2320-0790 Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech

More information

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing

More information

Comparison of CELP speech coder with a wavelet method

Comparison of CELP speech coder with a wavelet method University of Kentucky UKnowledge University of Kentucky Master's Theses Graduate School 2006 Comparison of CELP speech coder with a wavelet method Sriram Nagaswamy University of Kentucky, sriramn@gmail.com

More information

LOSS CONCEALMENTS FOR LOW-BIT-RATE PACKET VOICE IN VOIP. Outline

LOSS CONCEALMENTS FOR LOW-BIT-RATE PACKET VOICE IN VOIP. Outline LOSS CONCEALMENTS FOR LOW-BIT-RATE PACKET VOICE IN VOIP Benjamin W. Wah Department of Electrical and Computer Engineering and the Coordinated Science Laboratory University of Illinois at Urbana-Champaign

More information

10 Speech and Audio Signals

10 Speech and Audio Signals 0 Speech and Audio Signals Introduction Speech and audio signals are normally converted into PCM, which can be stored or transmitted as a PCM code, or compressed to reduce the number of bits used to code

More information

APPLICATIONS OF DSP OBJECTIVES

APPLICATIONS OF DSP OBJECTIVES APPLICATIONS OF DSP OBJECTIVES This lecture will discuss the following: Introduce analog and digital waveform coding Introduce Pulse Coded Modulation Consider speech-coding principles Introduce the channel

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 Speech and telephone speech Based on a voice production model Parametric representation

More information

Analysis/synthesis coding

Analysis/synthesis coding TSBK06 speech coding p.1/32 Analysis/synthesis coding Many speech coders are based on a principle called analysis/synthesis coding. Instead of coding a waveform, as is normally done in general audio coders

More information

CHAPTER 7 ROLE OF ADAPTIVE MULTIRATE ON WCDMA CAPACITY ENHANCEMENT

CHAPTER 7 ROLE OF ADAPTIVE MULTIRATE ON WCDMA CAPACITY ENHANCEMENT CHAPTER 7 ROLE OF ADAPTIVE MULTIRATE ON WCDMA CAPACITY ENHANCEMENT 7.1 INTRODUCTION Originally developed to be used in GSM by the Europe Telecommunications Standards Institute (ETSI), the AMR speech codec

More information

Lesson 8 Speech coding

Lesson 8 Speech coding Lesson 8 coding Encoding Information Transmitter Antenna Interleaving Among Frames De-Interleaving Antenna Transmission Line Decoding Transmission Line Receiver Information Lesson 8 Outline How information

More information

An objective method for evaluating data hiding in pitch gain and pitch delay parameters of the AMR codec

An objective method for evaluating data hiding in pitch gain and pitch delay parameters of the AMR codec An objective method for evaluating data hiding in pitch gain and pitch delay parameters of the AMR codec Akira Nishimura 1 1 Department of Media and Cultural Studies, Tokyo University of Information Sciences,

More information

Adaptive time scale modification of speech for graceful degrading voice quality in congested networks

Adaptive time scale modification of speech for graceful degrading voice quality in congested networks Adaptive time scale modification of speech for graceful degrading voice quality in congested networks Prof. H. Gokhan ILK Ankara University, Faculty of Engineering, Electrical&Electronics Eng. Dept 1 Contact

More information

1. MOTIVATION AND BACKGROUND

1. MOTIVATION AND BACKGROUND Turbo-Detected Unequal Protection Audio and Speech Transceivers Using Serially Concantenated Convolutional Codes, Trellis Coded Modulation and Space-Time Trellis Coding N S Othman, S X Ng and L Hanzo School

More information

SILK Speech Codec. TDP 10/11 Xavier Anguera I Ciro Gracia

SILK Speech Codec. TDP 10/11 Xavier Anguera I Ciro Gracia SILK Speech Codec TDP 10/11 Xavier Anguera I Ciro Gracia SILK Codec Audio codec desenvolupat per Skype (Febrer 2009) Previament usaven el codec SVOPC (Sinusoidal Voice Over Packet Coder): LPC analysis.

More information

Scalable Speech Coding for IP Networks

Scalable Speech Coding for IP Networks Santa Clara University Scholar Commons Engineering Ph.D. Theses Student Scholarship 8-24-2015 Scalable Speech Coding for IP Networks Koji Seto Santa Clara University Follow this and additional works at:

More information

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008 R E S E A R C H R E P O R T I D I A P Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain Sriram Ganapathy a b Petr Motlicek a Hynek Hermansky a b Harinath

More information

QUESTION BANK EC 1351 DIGITAL COMMUNICATION YEAR / SEM : III / VI UNIT I- PULSE MODULATION PART-A (2 Marks) 1. What is the purpose of sample and hold

QUESTION BANK EC 1351 DIGITAL COMMUNICATION YEAR / SEM : III / VI UNIT I- PULSE MODULATION PART-A (2 Marks) 1. What is the purpose of sample and hold QUESTION BANK EC 1351 DIGITAL COMMUNICATION YEAR / SEM : III / VI UNIT I- PULSE MODULATION PART-A (2 Marks) 1. What is the purpose of sample and hold circuit 2. What is the difference between natural sampling

More information

Wireless Communications

Wireless Communications Wireless Communications Lecture 5: Coding / Decoding and Modulation / Demodulation Module Representive: Prof. Dr.-Ing. Hans D. Schotten schotten@eit.uni-kl.de Lecturer: Dr.-Ing. Bin Han binhan@eit.uni-kl.de

More information

Speech Synthesis; Pitch Detection and Vocoders

Speech Synthesis; Pitch Detection and Vocoders Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech

More information

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals ISCA Journal of Engineering Sciences ISCA J. Engineering Sci. Vocoder (LPC) Analysis by Variation of Input Parameters and Signals Abstract Gupta Rajani, Mehta Alok K. and Tiwari Vebhav Truba College of

More information

Ninad Bhatt Yogeshwar Kosta

Ninad Bhatt Yogeshwar Kosta DOI 10.1007/s10772-012-9178-9 Implementation of variable bitrate data hiding techniques on standard and proposed GSM 06.10 full rate coder and its overall comparative evaluation of performance Ninad Bhatt

More information

NOVEL PITCH DETECTION ALGORITHM WITH APPLICATION TO SPEECH CODING

NOVEL PITCH DETECTION ALGORITHM WITH APPLICATION TO SPEECH CODING NOVEL PITCH DETECTION ALGORITHM WITH APPLICATION TO SPEECH CODING A Thesis Submitted to the Graduate Faculty of the University of New Orleans in partial fulfillment of the requirements for the degree of

More information

Speech Coding using Linear Prediction

Speech Coding using Linear Prediction Speech Coding using Linear Prediction Jesper Kjær Nielsen Aalborg University and Bang & Olufsen jkn@es.aau.dk September 10, 2015 1 Background Speech is generated when air is pushed from the lungs through

More information

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals 16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract

More information

Flexible and Scalable Transform-Domain Codebook for High Bit Rate CELP Coders

Flexible and Scalable Transform-Domain Codebook for High Bit Rate CELP Coders Flexible and Scalable Transform-Domain Codebook for High Bit Rate CELP Coders Václav Eksler, Bruno Bessette, Milan Jelínek, Tommy Vaillancourt University of Sherbrooke, VoiceAge Corporation Montreal, QC,

More information

Digital Signal Processing

Digital Signal Processing Digital Signal Processing Fourth Edition John G. Proakis Department of Electrical and Computer Engineering Northeastern University Boston, Massachusetts Dimitris G. Manolakis MIT Lincoln Laboratory Lexington,

More information

Wideband Speech Coding & Its Application

Wideband Speech Coding & Its Application Wideband Speech Coding & Its Application Apeksha B. landge. M.E. [student] Aditya Engineering College Beed Prof. Amir Lodhi. Guide & HOD, Aditya Engineering College Beed ABSTRACT: Increasing the bandwidth

More information

3GPP TS V5.0.0 ( )

3GPP TS V5.0.0 ( ) TS 26.171 V5.0.0 (2001-03) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Speech Codec speech processing functions; AMR Wideband

More information

OFDM and MC-CDMA A Primer

OFDM and MC-CDMA A Primer OFDM and MC-CDMA A Primer L. Hanzo University of Southampton, UK T. Keller Analog Devices Ltd., Cambridge, UK IEEE PRESS IEEE Communications Society, Sponsor John Wiley & Sons, Ltd Contents About the Authors

More information

Speech Compression Using Voice Excited Linear Predictive Coding

Speech Compression Using Voice Excited Linear Predictive Coding Speech Compression Using Voice Excited Linear Predictive Coding Ms.Tosha Sen, Ms.Kruti Jay Pancholi PG Student, Asst. Professor, L J I E T, Ahmedabad Abstract : The aim of the thesis is design good quality

More information

UNIVERSITY OF SURREY LIBRARY

UNIVERSITY OF SURREY LIBRARY 7385001 UNIVERSITY OF SURREY LIBRARY All rights reserved I N F O R M A T I O N T O A L L U S E R S T h e q u a l i t y o f t h i s r e p r o d u c t i o n is d e p e n d e n t u p o n t h e q u a l i t

More information

ON-LINE LABORATORIES FOR SPEECH AND IMAGE PROCESSING AND FOR COMMUNICATION SYSTEMS USING J-DSP

ON-LINE LABORATORIES FOR SPEECH AND IMAGE PROCESSING AND FOR COMMUNICATION SYSTEMS USING J-DSP ON-LINE LABORATORIES FOR SPEECH AND IMAGE PROCESSING AND FOR COMMUNICATION SYSTEMS USING J-DSP A. Spanias, V. Atti, Y. Ko, T. Thrasyvoulou, M.Yasin, M. Zaman, T. Duman, L. Karam, A. Papandreou, K. Tsakalis

More information

Syllabus. osmania university UNIT - I UNIT - II UNIT - III CHAPTER - 1 : INTRODUCTION TO DIGITAL COMMUNICATION CHAPTER - 3 : INFORMATION THEORY

Syllabus. osmania university UNIT - I UNIT - II UNIT - III CHAPTER - 1 : INTRODUCTION TO DIGITAL COMMUNICATION CHAPTER - 3 : INFORMATION THEORY i Syllabus osmania university UNIT - I CHAPTER - 1 : INTRODUCTION TO Elements of Digital Communication System, Comparison of Digital and Analog Communication Systems. CHAPTER - 2 : DIGITAL TRANSMISSION

More information

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal

More information

Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech

Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech INTERSPEECH 5 Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech M. A. Tuğtekin Turan and Engin Erzin Multimedia, Vision and Graphics Laboratory,

More information

6/29 Vol.7, No.2, February 2012

6/29 Vol.7, No.2, February 2012 Synthesis Filter/Decoder Structures in Speech Codecs Jerry D. Gibson, Electrical & Computer Engineering, UC Santa Barbara, CA, USA gibson@ece.ucsb.edu Abstract Using the Shannon backward channel result

More information

A 600 BPS MELP VOCODER FOR USE ON HF CHANNELS

A 600 BPS MELP VOCODER FOR USE ON HF CHANNELS A 600 BPS MELP VOCODER FOR USE ON HF CHANNELS Mark W. Chamberlain Harris Corporation, RF Communications Division 1680 University Avenue Rochester, New York 14610 ABSTRACT The U.S. government has developed

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

Audio /Video Signal Processing. Lecture 1, Organisation, A/D conversion, Sampling Gerald Schuller, TU Ilmenau

Audio /Video Signal Processing. Lecture 1, Organisation, A/D conversion, Sampling Gerald Schuller, TU Ilmenau Audio /Video Signal Processing Lecture 1, Organisation, A/D conversion, Sampling Gerald Schuller, TU Ilmenau Gerald Schuller gerald.schuller@tu ilmenau.de Organisation: Lecture each week, 2SWS, Seminar

More information

Waveform interpolation speech coding

Waveform interpolation speech coding University of Wollongong Research Online University of Wollongong Thesis Collection 1954-2016 University of Wollongong Thesis Collections 1998 Waveform interpolation speech coding Jun Ni University of

More information

Msc Engineering Physics (6th academic year) Royal Institute of Technology, Stockholm August December 2003

Msc Engineering Physics (6th academic year) Royal Institute of Technology, Stockholm August December 2003 Msc Engineering Physics (6th academic year) Royal Institute of Technology, Stockholm August 2002 - December 2003 1 2E1511 - Radio Communication (6 ECTS) The course provides basic knowledge about models

More information

Speech Synthesis using Mel-Cepstral Coefficient Feature

Speech Synthesis using Mel-Cepstral Coefficient Feature Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

DEPARTMENT OF DEFENSE TELECOMMUNICATIONS SYSTEMS STANDARD

DEPARTMENT OF DEFENSE TELECOMMUNICATIONS SYSTEMS STANDARD NOT MEASUREMENT SENSITIVE 20 December 1999 DEPARTMENT OF DEFENSE TELECOMMUNICATIONS SYSTEMS STANDARD ANALOG-TO-DIGITAL CONVERSION OF VOICE BY 2,400 BIT/SECOND MIXED EXCITATION LINEAR PREDICTION (MELP)

More information

COMMUNICATION SYSTEMS

COMMUNICATION SYSTEMS COMMUNICATION SYSTEMS 4TH EDITION Simon Hayhin McMaster University JOHN WILEY & SONS, INC. Ш.! [ BACKGROUND AND PREVIEW 1. The Communication Process 1 2. Primary Communication Resources 3 3. Sources of

More information

ON THE PERFORMANCE OF WTIMIT FOR WIDE BAND TELEPHONY

ON THE PERFORMANCE OF WTIMIT FOR WIDE BAND TELEPHONY ON THE PERFORMANCE OF WTIMIT FOR WIDE BAND TELEPHONY D. Nagajyothi 1 and P. Siddaiah 2 1 Department of Electronics and Communication Engineering, Vardhaman College of Engineering, Shamshabad, Telangana,

More information

Improving Sound Quality by Bandwidth Extension

Improving Sound Quality by Bandwidth Extension International Journal of Scientific & Engineering Research, Volume 3, Issue 9, September-212 Improving Sound Quality by Bandwidth Extension M. Pradeepa, M.Tech, Assistant Professor Abstract - In recent

More information

Voice Activity Detection for Speech Enhancement Applications

Voice Activity Detection for Speech Enhancement Applications Voice Activity Detection for Speech Enhancement Applications E. Verteletskaya, K. Sakhnov Abstract This paper describes a study of noise-robust voice activity detection (VAD) utilizing the periodicity

More information

Waveform Coding Algorithms: An Overview

Waveform Coding Algorithms: An Overview August 24, 2012 Waveform Coding Algorithms: An Overview RWTH Aachen University Compression Algorithms Seminar Report Summer Semester 2012 Adel Zaalouk - 300374 Aachen, Germany Contents 1 An Introduction

More information

Nonuniform multi level crossing for signal reconstruction

Nonuniform multi level crossing for signal reconstruction 6 Nonuniform multi level crossing for signal reconstruction 6.1 Introduction In recent years, there has been considerable interest in level crossing algorithms for sampling continuous time signals. Driven

More information

Improved signal analysis and time-synchronous reconstruction in waveform interpolation coding

Improved signal analysis and time-synchronous reconstruction in waveform interpolation coding University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2000 Improved signal analysis and time-synchronous reconstruction in waveform

More information

Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding

Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding Nanda Prasetiyo Koestoer B. Eng (Hon) (1998) School of Microelectronic Engineering Faculty of Engineering and Information Technology Griffith

More information

DEPARTMENT OF INFORMATION TECHNOLOGY QUESTION BANK. Subject Name: Information Coding Techniques UNIT I INFORMATION ENTROPY FUNDAMENTALS

DEPARTMENT OF INFORMATION TECHNOLOGY QUESTION BANK. Subject Name: Information Coding Techniques UNIT I INFORMATION ENTROPY FUNDAMENTALS DEPARTMENT OF INFORMATION TECHNOLOGY QUESTION BANK Subject Name: Year /Sem: II / IV UNIT I INFORMATION ENTROPY FUNDAMENTALS PART A (2 MARKS) 1. What is uncertainty? 2. What is prefix coding? 3. State the

More information

Modern Quadrature Amplitude Modulation Principles and Applications for Fixed and Wireless Channels

Modern Quadrature Amplitude Modulation Principles and Applications for Fixed and Wireless Channels 1 Modern Quadrature Amplitude Modulation Principles and Applications for Fixed and Wireless Channels W.T. Webb, L.Hanzo Contents PART I: Background to QAM 1 Introduction and Background 1 1.1 Modulation

More information

SNR Scalability, Multiple Descriptions, and Perceptual Distortion Measures

SNR Scalability, Multiple Descriptions, and Perceptual Distortion Measures SNR Scalability, Multiple Descriptions, Perceptual Distortion Measures Jerry D. Gibson Department of Electrical & Computer Engineering University of California, Santa Barbara gibson@mat.ucsb.edu Abstract

More information

Adaptive Forward-Backward Quantizer for Low Bit Rate. High Quality Speech Coding. University of Missouri-Columbia. Columbia, MO 65211

Adaptive Forward-Backward Quantizer for Low Bit Rate. High Quality Speech Coding. University of Missouri-Columbia. Columbia, MO 65211 Adaptive Forward-Backward Quantizer for Low Bit Rate High Quality Speech Coding Jozsef Vass Yunxin Zhao y Xinhua Zhuang Department of Computer Engineering & Computer Science University of Missouri-Columbia

More information

International Journal of Advanced Engineering Technology E-ISSN

International Journal of Advanced Engineering Technology E-ISSN Research Article ARCHITECTURAL STUDY, IMPLEMENTATION AND OBJECTIVE EVALUATION OF CODE EXCITED LINEAR PREDICTION BASED GSM AMR 06.90 SPEECH CODER USING MATLAB Bhatt Ninad S. 1 *, Kosta Yogesh P. 2 Address

More information

Surveillance Transmitter of the Future. Abstract

Surveillance Transmitter of the Future. Abstract Surveillance Transmitter of the Future Eric Pauer DTC Communications Inc. Ronald R Young DTC Communications Inc. 486 Amherst Street Nashua, NH 03062, Phone; 603-880-4411, Fax; 603-880-6965 Elliott Lloyd

More information

Enhanced Variable Rate Codec, Speech Service Options 3, 68, 70, and 73 for Wideband Spread Spectrum Digital Systems

Enhanced Variable Rate Codec, Speech Service Options 3, 68, 70, and 73 for Wideband Spread Spectrum Digital Systems GPP C.S00-D Version.0 October 00 Enhanced Variable Rate Codec, Speech Service Options,, 0, and for Wideband Spread Spectrum Digital Systems 00 GPP GPP and its Organizational Partners claim copyright in

More information

TELECOMMUNICATION SYSTEMS

TELECOMMUNICATION SYSTEMS TELECOMMUNICATION SYSTEMS By Syed Bakhtawar Shah Abid Lecturer in Computer Science 1 MULTIPLEXING An efficient system maximizes the utilization of all resources. Bandwidth is one of the most precious resources

More information

An Interactive Multimedia Introduction to Signal Processing

An Interactive Multimedia Introduction to Signal Processing U. Karrenberg An Interactive Multimedia Introduction to Signal Processing Translation by Richard Hooton and Ulrich Boltz 2nd arranged and supplemented edition With 256 Figures, 12 videos, 250 preprogrammed

More information

Auditory modelling for speech processing in the perceptual domain

Auditory modelling for speech processing in the perceptual domain ANZIAM J. 45 (E) ppc964 C980, 2004 C964 Auditory modelling for speech processing in the perceptual domain L. Lin E. Ambikairajah W. H. Holmes (Received 8 August 2003; revised 28 January 2004) Abstract

More information

Speech Enhancement using Wiener filtering

Speech Enhancement using Wiener filtering Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing

More information

EC 2301 Digital communication Question bank

EC 2301 Digital communication Question bank EC 2301 Digital communication Question bank UNIT I Digital communication system 2 marks 1.Draw block diagram of digital communication system. Information source and input transducer formatter Source encoder

More information

ENHANCED TIME DOMAIN PACKET LOSS CONCEALMENT IN SWITCHED SPEECH/AUDIO CODEC.

ENHANCED TIME DOMAIN PACKET LOSS CONCEALMENT IN SWITCHED SPEECH/AUDIO CODEC. ENHANCED TIME DOMAIN PACKET LOSS CONCEALMENT IN SWITCHED SPEECH/AUDIO CODEC Jérémie Lecomte, Adrian Tomasek, Goran Marković, Michael Schnabel, Kimitaka Tsutsumi, Kei Kikuiri Fraunhofer IIS, Erlangen, Germany,

More information

ECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2

ECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 ECE 556 BASICS OF DIGITAL SPEECH PROCESSING Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 Analog Sound to Digital Sound Characteristics of Sound Amplitude Wavelength (w) Frequency ( ) Timbre

More information

Techniques for low-rate scalable compression of speech signals

Techniques for low-rate scalable compression of speech signals University of Wollongong Research Online University of Wollongong Thesis Collection University of Wollongong Thesis Collections 2002 Techniques for low-rate scalable compression of speech signals Jason

More information

Open Access Improved Frame Error Concealment Algorithm Based on Transform- Domain Mobile Audio Codec

Open Access Improved Frame Error Concealment Algorithm Based on Transform- Domain Mobile Audio Codec Send Orders for Reprints to reprints@benthamscience.ae The Open Electrical & Electronic Engineering Journal, 2014, 8, 527-535 527 Open Access Improved Frame Error Concealment Algorithm Based on Transform-

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 213 http://acousticalsociety.org/ ICA 213 Montreal Montreal, Canada 2-7 June 213 Signal Processing in Acoustics Session 2pSP: Acoustic Signal Processing

More information

Bandwidth Extension for Speech Enhancement

Bandwidth Extension for Speech Enhancement Bandwidth Extension for Speech Enhancement F. Mustiere, M. Bouchard, M. Bolic University of Ottawa Tuesday, May 4 th 2010 CCECE 2010: Signal and Multimedia Processing 1 2 3 4 Current Topic 1 2 3 4 Context

More information

QUESTION BANK. SUBJECT CODE / Name: EC2301 DIGITAL COMMUNICATION UNIT 2

QUESTION BANK. SUBJECT CODE / Name: EC2301 DIGITAL COMMUNICATION UNIT 2 QUESTION BANK DEPARTMENT: ECE SEMESTER: V SUBJECT CODE / Name: EC2301 DIGITAL COMMUNICATION UNIT 2 BASEBAND FORMATTING TECHNIQUES 1. Why prefilterring done before sampling [AUC NOV/DEC 2010] The signal

More information

IMPROVED SPEECH QUALITY FOR VMR - WB SPEECH CODING USING EFFICIENT NOISE ESTIMATION ALGORITHM

IMPROVED SPEECH QUALITY FOR VMR - WB SPEECH CODING USING EFFICIENT NOISE ESTIMATION ALGORITHM IMPROVED SPEECH QUALITY FOR VMR - WB SPEECH CODING USING EFFICIENT NOISE ESTIMATION ALGORITHM Mr. M. Mathivanan Associate Professor/ECE Selvam College of Technology Namakkal, Tamilnadu, India Dr. S.Chenthur

More information

MASTER'S THESIS. Speech Compression and Tone Detection in a Real-Time System. Kristina Berglund. MSc Programmes in Engineering

MASTER'S THESIS. Speech Compression and Tone Detection in a Real-Time System. Kristina Berglund. MSc Programmes in Engineering 2004:003 CIV MASTER'S THESIS Speech Compression and Tone Detection in a Real-Time System Kristina Berglund MSc Programmes in Engineering Department of Computer Science and Electrical Engineering Division

More information

Page 0 of 23. MELP Vocoder

Page 0 of 23. MELP Vocoder Page 0 of 23 MELP Vocoder Outline Introduction MELP Vocoder Features Algorithm Description Parameters & Comparison Page 1 of 23 Introduction Traditional pitched-excited LPC vocoders use either a periodic

More information

Transcoding free voice transmission in GSM and UMTS networks

Transcoding free voice transmission in GSM and UMTS networks Transcoding free voice transmission in GSM and UMTS networks Sara Stančin, Grega Jakus, Sašo Tomažič University of Ljubljana, Faculty of Electrical Engineering Abstract - Transcoding refers to the conversion

Speech Coding in the Frequency Domain

Speech Processing, Advanced Topics. Tom Bäckström, Aalto University. October 215. Introduction: The speech production model can be used to efficiently encode speech signals.

Voice Excited Lpc for Speech Compression by V/Uv Classification

IOSR Journal of VLSI and Signal Processing (IOSR-JVSP), Volume 6, Issue 3, Ver. II (May-Jun. 2016), pp. 65-69. e-ISSN: 2319-4200, p-ISSN: 2319-4197. www.iosrjournals.org

The Opus Codec To be presented at the 135th AES Convention 2013 October New York, USA

To be presented at the 135th AES Convention, 2013 October 17-20, New York, USA. This paper was accepted for publication at the 135th AES Convention. This version of the paper is from

Analog and Telecommunication Electronics

Politecnico di Torino - ICT School. D5 - Special A/D converters: Differential converters; Oversampling, noise shaping; Logarithmic conversion; Approximation, A and

Technical Specification Group Services and System Aspects Meeting #7, Madrid, Spain, March 15-17, 2000 Agenda Item: 5.4.3

TSGS#7(00)0028. Source: TSG-S4. Title: AMR Wideband Permanent project document WB-4:

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt

Christian-Albrechts-Universität zu Kiel, Faculty of Engineering, Institute of Electrical and Information Engineering, Digital Signal Processing and System Theory

An Approach to Very Low Bit Rate Speech Coding

Computing For Nation Development, February 26-27, 2009, Bharati Vidyapeeth's Institute of Computer Applications and Management, New Delhi. Hari Kumar Singh

EEE 309 Communication Theory

Semester: January 2016. Dr. Md. Farhad Hossain, Associate Professor, Department of EEE, BUET. Email: mfarhadhossain@eee.buet.ac.bd. Office: ECE 331, ECE Building. Part 05: Pulse Code

ARIB STD-T63-26.290 V12.0.0 Audio codec processing functions; Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions

ARIB STD-T63-26.290 V12.0.0 (Release 12). Refer to Industrial Property Rights (IPR) in the
