Super-Wideband Fine Spectrum Quantization for Low-rate High-Quality MDCT Coding Mode of The 3GPP EVS Codec

1 Super-Wideband Fine Spectrum Quantization for Low-rate High-Quality MDCT Coding Mode of The 3GPP EVS Codec. Presented by Srikanth Nagisetty and Hiroyuki Ehara, 15th Dec 2015

2 Topics of this Presentation: Background; EVS Encoder Overview; LR-MDCT Coder; Envelope Coding; Spectrum Coding; Bit Allocation; Gap Filling; Evaluation Results; Summary

3 Background. Challenge: encoding super-wideband (SWB) signals at low delay and low bit rate. Content: mixed content and music sampled at 32 kHz. Frame length: 20 ms. Too few bits are available for quantizing the SWB spectrum (560 bins must be quantized using around 256 bits), so a highly efficient quantization algorithm is needed.
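The bit budget quoted on this slide can be made concrete with a little arithmetic. The sketch below uses only the figures on the slide (32 kHz sampling, 20 ms frames, 560 bins, about 256 bits). The bin width additionally assumes an MDCT with one coefficient per input sample in the frame, which the slide does not state.

```python
# Figures taken from the slide.
SAMPLE_RATE_HZ = 32_000
FRAME_MS = 20
SWB_BINS = 560
BITS_PER_FRAME = 256  # "around 256 bits" on the slide

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # samples in one 20 ms frame
bitrate_bps = BITS_PER_FRAME * 1000 // FRAME_MS          # bits per second
bits_per_bin = BITS_PER_FRAME / SWB_BINS                 # average bits per spectral bin

# Assumption (not on the slide): one MDCT coefficient per frame sample,
# so the coefficients cover 0 .. SAMPLE_RATE_HZ / 2 uniformly.
bin_width_hz = (SAMPLE_RATE_HZ / 2) / samples_per_frame
coded_bandwidth_hz = SWB_BINS * bin_width_hz

print(samples_per_frame, bitrate_bps, round(bits_per_bin, 2), coded_bandwidth_hz)
```

Under these assumptions the average budget is well below one bit per bin, which is the motivation for the bit allocation and gap-filling techniques on the later slides.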

4 EVS Encoder Overview. [Block diagram: speech signals; mixed and music signals.] The Low-Rate High-Quality (LR-HQ) MDCT coding is one of the modes in the EVS MDCT coder.

5 LR-HQ MDCT SWB Encoder (1/2). [Block diagram labels: Signal Classification (Transient, Normal, Harmonic); Energy Coding; Huffman Coding; Spectrum Coding; Advanced bit allocation; quantized LF information for gap filling.]

6 LR-HQ MDCT SWB Coder - Envelope coding
Band energies: $E(b) = \log_2\left(\sum_{k=k_{start}(b)}^{k_{end}(b)-1} X(k)^2 + \mathrm{Epsilon}\right), \quad b = 0, \ldots, N_{bands}-1$
Scaled energies: $I(b) = \mathrm{round}\left(E(b) / q_{int}\right), \quad b = 0, \ldots, N_{bands}-1$
Differential indices: $\Delta I(0) = I(0) - I_{ref}$ and $\Delta I(b) = I(b) - I(b-1), \quad b = 1, \ldots, N_{bands}-1$
Selection based on: (1) range of indices; (2) bit consumption.
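The envelope-coding chain on this slide (band energies, scaled indices, differential indices) can be sketched in a few lines. This is only an illustration: the band edges, the step size `q_int`, the reference index `i_ref`, and the value of `eps` (standing in for Epsilon) are made-up values, and bands are treated as half-open ranges.

```python
import math

def envelope_indices(spectrum, band_edges, q_int=1.0, i_ref=0, eps=1e-9):
    """Return (scaled indices I(b), differential indices dI(b)).

    band_edges[b] .. band_edges[b + 1] - 1 are the bins of band b.
    q_int, i_ref and eps are illustrative assumptions, not codec constants.
    """
    indices = []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        # E(b) = log2(sum of squared coefficients + epsilon)
        energy = math.log2(sum(x * x for x in spectrum[start:end]) + eps)
        # I(b) = round(E(b) / q_int)
        indices.append(round(energy / q_int))
    # dI(0) = I(0) - I_ref, dI(b) = I(b) - I(b - 1)
    diffs = [indices[0] - i_ref]
    diffs += [cur - prev for prev, cur in zip(indices, indices[1:])]
    return indices, diffs

spectrum = [4.0, 4.0, 2.0, 2.0, 1.0, 1.0, 0.5, 0.5]
band_edges = [0, 2, 4, 6, 8]
indices, diffs = envelope_indices(spectrum, band_edges)
print(indices, diffs)
```

For a smoothly decaying envelope the differential indices cluster in a much narrower range than the indices themselves, which is what makes entropy coding of the differences attractive.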

7 Spectral Peak Tracking. [Figure labels: energy; spectral peak; average energy; peak-to-average energy; HF bands; n HF bands for quantization; non-zero bit allocation.] Spectral peak tracking in bit allocation reduces the width for the current frame, giving bit savings and inter-frame continuity. Temporal discontinuity: if the HF spectral peak is quantized in the current frame but was not quantized in the previous frame, the degradation is perceivable.

8 LR-HQ MDCT Coder - Bit allocation (1/2). [Flow diagram: Bands(1), ..., Bands(n) -> Dynamic Bit Allocation -> Bits(1), ..., Bits(n) -> threshold checks Bits(1) > T1, ..., Bits(n) > Tn -> Combination Logic -> bit allocation result.] The procedure is iterative: bands whose bits fall below the threshold are identified, and their bits are relocated to other bands.
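The iterative idea on this slide (find bands whose allocation falls below a threshold, then hand their bits to the remaining bands) can be sketched as follows. The initial rule used here, bits proportional to band energy, and the choice to drop one band per iteration are assumptions made for illustration; the slide does not specify the codec's actual allocation rule or combination logic.

```python
def allocate_bits(energies, total_bits, thresholds):
    """Iteratively allocate bits, dropping bands that fall below their threshold.

    Assumes positive energies. Returns one (fractional) bit count per band.
    """
    active = list(range(len(energies)))
    bits = {}
    while active:
        total_energy = sum(energies[b] for b in active)
        # Hypothetical rule: share the budget in proportion to band energy.
        bits = {b: total_bits * energies[b] / total_energy for b in active}
        starved = [b for b in active if bits[b] < thresholds[b]]
        if not starved:
            break
        # Drop the weakest starved band; its bits are re-shared next iteration.
        active.remove(min(starved, key=lambda b: bits[b]))
    return [bits[b] if b in active else 0.0 for b in range(len(energies))]

allocation = allocate_bits([8.0, 4.0, 2.0, 1.0, 1.0], total_bits=32,
                           thresholds=[3.0] * 5)
print([round(x, 2) for x in allocation])
```

In this toy run the two weakest bands end up with zero bits and the full budget is spent on the three bands that can each be coded above the threshold.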

9 LR-HQ MDCT Coder - Bit allocation (2/2). Adaptive bit allocation based on grouping. [Flow diagram: Bands(1), ..., Bands(n) -> Adaptive Grouping -> Group(1), ..., Group(k) -> group energies -> identify dominant group -> bit allocation to group -> dynamic bit allocation to bands in a group -> bit allocation results; iterative: identify the next dominant group using group energy and variance.] Adaptive grouping identifies the dominant bands (local maxima) and their valleys to form groups. Bits are allocated by (1) adaptively grouping the bands and (2) exploiting the relationship between the groups. This approach is more suitable for tonal (harmonic) signals, because the energy of the bands is mainly concentrated at discrete tones.
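The grouping step can be illustrated with a small sketch that cuts the band sequence at every valley (local minimum), so that each group contains one local maximum, and then ranks the groups by energy. Assigning the valley band to the group on its left and ranking by summed energy alone are simplifying assumptions; the slide also mentions group variance, which is not modelled here.

```python
def group_bands(energies):
    """Split band indices into groups, closing a group at each valley."""
    groups, current = [], []
    last = len(energies) - 1
    for b, e in enumerate(energies):
        current.append(b)
        # A valley is lower than its left neighbour and not higher than its right one.
        is_valley = 0 < b < last and energies[b - 1] > e <= energies[b + 1]
        if is_valley:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups

def dominant_order(energies, groups):
    """Return group indices sorted from most to least energetic."""
    totals = [sum(energies[b] for b in g) for g in groups]
    return sorted(range(len(groups)), key=lambda g: totals[g], reverse=True)

energies = [3, 9, 4, 2, 6, 7, 1, 5]
groups = group_bands(energies)
order = dominant_order(energies, groups)
print(groups, order)
```

Bits would then be assigned group by group in this order, and distributed among the bands inside each group.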

10 LR-HQ MDCT Coder - Gap Filling. [Figure: spectrum over frequency (kHz).] TCQ/USQ is performed on non-zero bands. Noise filling is performed on LF bands that were allocated 0 bits. Spectral gaps: zero-bit bands cause spectral gaps, which lead to audible artifacts if left alone, so a gap-filling technique is used.
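A minimal sketch of noise filling for zero-bit bands is given below: bands with a non-zero allocation keep their decoded coefficients, while zero-bit bands receive pseudo-random noise scaled to the band energy known from the envelope. The uniform noise source, the fixed seed and the use of linear (not log2) band energies are illustrative assumptions, not the codec's actual procedure.

```python
import math
import random

def noise_fill(coeffs, band_edges, band_bits, band_energies, seed=0):
    """Fill zero-bit bands with noise whose energy matches band_energies[b]."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    out = list(coeffs)
    for b, (start, end) in enumerate(zip(band_edges[:-1], band_edges[1:])):
        if band_bits[b] > 0:
            continue  # quantized band: keep the decoded coefficients
        noise = [rng.uniform(-1.0, 1.0) for _ in range(start, end)]
        noise_energy = sum(n * n for n in noise)
        gain = math.sqrt(band_energies[b] / noise_energy) if noise_energy > 0 else 0.0
        out[start:end] = [gain * n for n in noise]
    return out

decoded = [1.0, -2.0, 0.0, 0.0, 0.0, 0.0]
filled = noise_fill(decoded, band_edges=[0, 2, 6], band_bits=[10, 0],
                    band_energies=[5.0, 0.4])
print(filled)
```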

11 LR-HQ MDCT Coder - Gap Filling, Normal Mode. [Figure: spectrum over frequency (kHz); label: Normalized LF.] TCQ/USQ is performed on non-zero bands. Noise filling is performed on LF bands that were allocated 0 bits. The best-match spectrum is filled into the high-frequency region.

12 LR-HQ MDCT Coder - Gap Filling, Harmonic Mode. [Figure: spectrum over frequency (kHz); labels: Normalized LF; Gap Filled Spectrum; Mirroring.] TCQ/USQ is performed on non-zero bands. Noise filling is performed on LF bands that were allocated 0 bits. The best-match spectrum is filled into the high-frequency region. Problems: 1. The harmonic relation is not maintained between the reproduced HF and the input HF. 2. Roughness is perceived in the boundary regions.

13 LR-HQ MDCT Coder - Spectral Correction. [Figure: f(n) over frequency for the synthesized LF spectrum and the replicated HF spectrum; labels: last spectral peak in LF; 1st, 2nd and 3rd spectral peak in HF; estimated harmonic (Est Harmonic).] Roughness is perceived when 2 tones fall within 30 Hz and their amplitudes are rapidly changing. Spectral correction uses harmonic transposition. This method retains both the fine structure of the spectrum and the harmonic relationships between the low-frequency tones and the replicated high-frequency tones.
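One hypothetical way to picture the correction: estimate the harmonic spacing from the peaks of the synthesized LF spectrum, then shift the replicated HF spectrum so that its first peak lands one spacing above the last LF peak. The function name, the averaging of peak spacings and the single rigid shift are all assumptions made for illustration; the slide does not describe how the estimated harmonic or the transposition is actually computed.

```python
def harmonic_shift(lf_peaks, hf_first_peak):
    """Return (estimated harmonic spacing, shift in bins for the replicated HF).

    lf_peaks: bin positions of the spectral peaks in the LF spectrum (ascending).
    hf_first_peak: bin position of the first peak in the replicated HF spectrum.
    """
    spacings = [b - a for a, b in zip(lf_peaks, lf_peaks[1:])]
    est_harmonic = round(sum(spacings) / len(spacings))
    # Place the first HF peak one harmonic spacing above the last LF peak.
    target = lf_peaks[-1] + est_harmonic
    return est_harmonic, target - hf_first_peak

est_harmonic, shift = harmonic_shift(lf_peaks=[40, 80, 121, 160], hf_first_peak=187)
print(est_harmonic, shift)
```

Because the whole replicated segment is moved rather than re-synthesized, its fine structure is untouched while its peaks line up with the LF harmonic grid.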

14 LR-HQ MDCT Coder - Listening Test Setup. Methodology: Degradation Category Rating (DCR) (ITU-T P.800). Material: 24 mixed and music samples recorded in Japanese. Listeners: 16 Japanese naïve listeners. Codec under test: EVS SWB LR-HQ mode. Reference codec: AMR-WB+.

15 LR-HQ MDCT Coder - Evaluation Results. LR-HQ SWB performance is equal to or better than that of AMR-WB+, whose algorithmic delay is more than twice that of EVS (32 ms).

16 Summary. For encoding the SWB spectral coefficients at a low bit budget: spectral band energies are quantized using efficient Huffman coding methods; advanced bit allocation methods are used for an efficient representation of the spectrum; spectral holes in the full-spectrum coding are filled using gap-filling techniques; the gap-filling techniques are improved by introducing a fine spectrum normalization and adaptive sparse BWE coding. Conclusion: the EVS LR-HQ SWB coder meets the performance requirements and is adopted as part of the multi-mode MDCT coding in the EVS codec.

18 APPENDIX

19 LR-HQ MDCT SWB Encoder (2/2). [Figure: energy over frequency (kHz).] The spectral coefficients are grouped into spectral bands, giving the spectral band energies E(1), E(2), ..., E(b-1), E(b).

20 LR-HQ MDCT Coder - Envelope coding (2/2). The indices I(0), ..., I(b) are coded by one of two methods, selected by range. Large Symbol Huffman (range > [-64, 63]): Pulse Mode and Scale Mode, using an 8-symbol Huffman table; if ΔI(b) exceeds [-4, 3], its (1) position and (2) amplitude are coded directly. Small Symbol Huffman (range < [-64, 63]): Context Based Huffman and Resized Huffman. Context Based Huffman uses 32-symbol Huffman tables, and ΔI(b-1) determines the best Huffman table for encoding the current band ΔI(b). In Resized Huffman, the span of ΔI(b) is narrowed down to fewer code words (21 symbols).

21 LR-HQ MDCT Coder - Envelope coding (3/3). Small-symbol coding method.
Context-based Huffman: ΔI(b-1) determines the best Huffman table for encoding the current band ΔI(b).
Resized Huffman: ΔI(b) is narrowed to a smaller range so that a Huffman table with fewer symbols (21 symbols) can be used:
$\Delta I'(b) = \Delta I(b) + \min(\Delta I(b-1) - T, 3)$ if $\Delta I(b-1) > T$
$\Delta I'(b) = \Delta I(b) + \max(\Delta I(b-1) - T_1, -3)$ if $\Delta I(b-1) < T_1$
where $\Delta I'(b)$ is the new differential index for band b, $T = 15 + thr$ and $T_1 = 15 - thr$.
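The resizing rule can be sketched as a small function, under one reading of the slide's formula: when the previous differential index lies above T = 15 + thr or below T_1 = 15 - thr, a correction clamped to +3 or -3 is added to the current index, and otherwise the index is left unchanged. The clamping signs, the "unchanged" case and the value thr = 3 used in the example are assumptions, not values confirmed by the slide.

```python
def resize_index(d_cur, d_prev, thr):
    """Return the resized differential index for the current band.

    d_cur:  differential index of the current band, dI(b)
    d_prev: differential index of the previous band, dI(b - 1)
    thr:    threshold offset (hypothetical value chosen by the caller)
    """
    t_high = 15 + thr  # T on the slide
    t_low = 15 - thr   # T_1 on the slide
    if d_prev > t_high:
        return d_cur + min(d_prev - t_high, 3)
    if d_prev < t_low:
        return d_cur + max(d_prev - t_low, -3)
    return d_cur

after_high = resize_index(10, 22, thr=3)   # previous index well above T
after_low = resize_index(20, 8, thr=3)     # previous index well below T_1
unchanged = resize_index(15, 15, thr=3)    # previous index inside [T_1, T]
print(after_high, after_low, unchanged)
```

The intuition is that a large jump in one band is usually followed by a jump back, so the correction pulls the next index toward the centre of the table and lets a smaller Huffman table cover it.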

22 LR-HQ MDCT Coder - Quantization (TCQ and USQ). Magnitudes are quantized by USQ and TCQ. Position, number and sign are coded by arithmetic coding.

23 LR-HQ MDCT Coder - Sparse Band Search (gap filling for Normal mode). The sparse search reduces the computational load. The best-match spectrum is filled into the high-frequency region. The best-match index is identified by finding the k' that maximizes the correlation measure S(k'), k' = 0, ..., Nlag - 1. S(k') is computed from the cross-correlation corr(k') and the energy Ene(k'), each a sum over k = 0, ..., Ncnt - 1 taken at the sparse positions Idx[k] shifted by the candidate lag lag[k'].
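A sparse best-match search of this kind can be sketched as follows. The score used here, corr squared divided by Ene, is the usual normalized cross-correlation and is an assumption, since the exact formula is not recoverable from the slide. The function and argument names are likewise hypothetical. Only the positions listed in `idx` enter the sums, which is what keeps the search cheap.

```python
def best_match_lag(target, source, idx, lags):
    """Return (k', score) of the lag that best matches target at sparse positions.

    target: HF spectrum segment to be matched
    source: normalized LF spectrum that is searched
    idx:    sparse positions within the segment (the Idx[k] of the slide)
    lags:   candidate start offsets into source (the lag[k'] of the slide)
    """
    best_k, best_score = None, -1.0
    for k_prime, lag in enumerate(lags):
        corr = sum(target[i] * source[lag + i] for i in idx)
        ene = sum(source[lag + i] ** 2 for i in idx)
        # Assumed score: squared correlation normalized by candidate energy.
        score = corr * corr / ene if ene > 0 else 0.0
        if score > best_score:
            best_k, best_score = k_prime, score
    return best_k, best_score

source = [0.1, 0.9, -0.2, 0.5, 0.0, -0.7, 0.3, 0.8, -0.1, 0.4]
target = [1.0, 0.0, -1.4, 0.6]          # equals 2 * source[3:7]
best_k, best_score = best_match_lag(target, source, idx=[0, 2, 3], lags=range(6))
print(best_k, round(best_score, 2))
```

In the example only three of the four positions are compared per lag, and the search still recovers the offset from which the target was taken.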


Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

Carrier Frequency Offset Estimation in WCDMA Systems Using a Modified FFT-Based Algorithm

Carrier Frequency Offset Estimation in WCDMA Systems Using a Modified FFT-Based Algorithm Carrier Frequency Offset Estimation in WCDMA Systems Using a Modified FFT-Based Algorithm Seare H. Rezenom and Anthony D. Broadhurst, Member, IEEE Abstract-- Wideband Code Division Multiple Access (WCDMA)

More information

Audio Compression using the MLT and SPIHT

Audio Compression using the MLT and SPIHT Audio Compression using the MLT and SPIHT Mohammed Raad, Alfred Mertins and Ian Burnett School of Electrical, Computer and Telecommunications Engineering University Of Wollongong Northfields Ave Wollongong

More information

Modulation analysis in ArtemiS SUITE 1

Modulation analysis in ArtemiS SUITE 1 02/18 in ArtemiS SUITE 1 of ArtemiS SUITE delivers the envelope spectra of partial bands of an analyzed signal. This allows to determine the frequency, strength and change over time of amplitude modulations

More information

Voice Excited Lpc for Speech Compression by V/Uv Classification

Voice Excited Lpc for Speech Compression by V/Uv Classification IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 3, Ver. II (May. -Jun. 2016), PP 65-69 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Voice Excited Lpc for Speech

More information

Bandwidth Extension for Speech Enhancement

Bandwidth Extension for Speech Enhancement Bandwidth Extension for Speech Enhancement F. Mustiere, M. Bouchard, M. Bolic University of Ottawa Tuesday, May 4 th 2010 CCECE 2010: Signal and Multimedia Processing 1 2 3 4 Current Topic 1 2 3 4 Context

More information

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing

More information

Dilpreet Singh 1, Parminder Singh 2 1 M.Tech. Student, 2 Associate Professor

Dilpreet Singh 1, Parminder Singh 2 1 M.Tech. Student, 2 Associate Professor A Novel Approach for Waveform Compression Dilpreet Singh 1, Parminder Singh 2 1 M.Tech. Student, 2 Associate Professor CSE Department, Guru Nanak Dev Engineering College, Ludhiana Abstract Waveform Compression

More information

(51) Int Cl.: G10L 19/24 ( ) G10L 21/038 ( )

(51) Int Cl.: G10L 19/24 ( ) G10L 21/038 ( ) (19) TEPZZ 48Z 9B_T (11) EP 2 48 029 B1 (12) EUROPEAN PATENT SPECIFICATION (4) Date of publication and mention of the grant of the patent: 14.06.17 Bulletin 17/24 (21) Application number: 117746.0 (22)

More information

3GPP TS V5.0.0 ( )

3GPP TS V5.0.0 ( ) TS 26.171 V5.0.0 (2001-03) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Speech Codec speech processing functions; AMR Wideband

More information

Subjective Voice Quality Evaluation of Artificial Bandwidth Extension: Comparing Different Audio Bandwidths and Speech Codecs

Subjective Voice Quality Evaluation of Artificial Bandwidth Extension: Comparing Different Audio Bandwidths and Speech Codecs INTERSPEECH 01 Subjective Voice Quality Evaluation of Artificial Bandwidth Extension: Comparing Different Audio Bandwidths and Speech Codecs Hannu Pulakka 1, Anssi Rämö, Ville Myllylä 1, Henri Toukomaa,

More information

22. Konferenz Elektronische Sprachsignalverarbeitung (ESSV), September 2011, Aachen, Germany (TuDPress, ISBN )

22. Konferenz Elektronische Sprachsignalverarbeitung (ESSV), September 2011, Aachen, Germany (TuDPress, ISBN ) BINAURAL WIDEBAND TELEPHONY USING STEGANOGRAPHY Bernd Geiser, Magnus Schäfer, and Peter Vary Institute of Communication Systems and Data Processing ( ) RWTH Aachen University, Germany {geiser schaefer

More information

Spatial Audio Transmission Technology for Multi-point Mobile Voice Chat

Spatial Audio Transmission Technology for Multi-point Mobile Voice Chat Audio Transmission Technology for Multi-point Mobile Voice Chat Voice Chat Multi-channel Coding Binaural Signal Processing Audio Transmission Technology for Multi-point Mobile Voice Chat We have developed

More information

Tempo and Beat Tracking

Tempo and Beat Tracking Lecture Music Processing Tempo and Beat Tracking Meinard Müller International Audio Laboratories Erlangen meinard.mueller@audiolabs-erlangen.de Introduction Basic beat tracking task: Given an audio recording

More information

Broadcast Notes by Ray Voss

Broadcast Notes by Ray Voss Broadcast Notes by Ray Voss The following is an incomplete treatment and in many ways a gross oversimplification of the subject! Nonetheless, it gives a glimpse of the issues and compromises involved in

More information

SOURCE CONTROLLED CHANNEL DECODING FOR GSM-AMR SPEECH TRANSMISSION WITH VOICE ACTIVITY DETECTION (VAD) C. Murali Mohan R. Aravind

SOURCE CONTROLLED CHANNEL DECODING FOR GSM-AMR SPEECH TRANSMISSION WITH VOICE ACTIVITY DETECTION (VAD) C. Murali Mohan R. Aravind SOURCE CONTROLLED CHANNEL DECODING FOR GSM-AMR SPEECH TRANSMISSION WITH VOICE ACTIVITY DETECTION (D C. Murali Mohan R. Aravind Department of Electrical Engineering Indian Institute of Technology, Madras

More information

Frequency Domain Representation of Signals

Frequency Domain Representation of Signals Frequency Domain Representation of Signals The Discrete Fourier Transform (DFT) of a sampled time domain waveform x n x 0, x 1,..., x 1 is a set of Fourier Coefficients whose samples are 1 n0 X k X0, X

More information

Speech Synthesis; Pitch Detection and Vocoders

Speech Synthesis; Pitch Detection and Vocoders Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech

More information

DEPARTMENT OF INFORMATION TECHNOLOGY QUESTION BANK. Subject Name: Information Coding Techniques UNIT I INFORMATION ENTROPY FUNDAMENTALS

DEPARTMENT OF INFORMATION TECHNOLOGY QUESTION BANK. Subject Name: Information Coding Techniques UNIT I INFORMATION ENTROPY FUNDAMENTALS DEPARTMENT OF INFORMATION TECHNOLOGY QUESTION BANK Subject Name: Year /Sem: II / IV UNIT I INFORMATION ENTROPY FUNDAMENTALS PART A (2 MARKS) 1. What is uncertainty? 2. What is prefix coding? 3. State the

More information

E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21

E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21 E85.267: Lecture 8 Source-Filter Processing E85.267: Lecture 8 Source-Filter Processing 21-4-1 1 / 21 Source-filter analysis/synthesis n f Spectral envelope Spectral envelope Analysis Source signal n 1

More information

Information. LSP (Line Spectrum Pair): Essential Technology for High-compression Speech Coding. Takehiro Moriya. Abstract

Information. LSP (Line Spectrum Pair): Essential Technology for High-compression Speech Coding. Takehiro Moriya. Abstract LSP (Line Spectrum Pair): Essential Technology for High-compression Speech Coding Takehiro Moriya Abstract Line Spectrum Pair (LSP) technology was accepted as an IEEE (Institute of Electrical and Electronics

More information

ETSI EN V1.2.1 ( )

ETSI EN V1.2.1 ( ) Candidate Harmonized European Standard (Telecommunications series) Electromagnetic compatibility and Radio spectrum Matters (ERM); ElectroMagnetic Compatibility (EMC) standard for radio equipment and services;

More information

Outline. Communications Engineering 1

Outline. Communications Engineering 1 Outline Introduction Signal, random variable, random process and spectra Analog modulation Analog to digital conversion Digital transmission through baseband channels Signal space representation Optimal

More information

ETSI EN V1.2.1 ( )

ETSI EN V1.2.1 ( ) EN 301 489-19 V1.2.1 (2002-11) Candidate Harmonized European Standard (Telecommunications series) Electromagnetic compatibility and Radio spectrum Matters (ERM); ElectroMagnetic Compatibility (EMC) standard

More information

Ninad Bhatt Yogeshwar Kosta

Ninad Bhatt Yogeshwar Kosta DOI 10.1007/s10772-012-9178-9 Implementation of variable bitrate data hiding techniques on standard and proposed GSM 06.10 full rate coder and its overall comparative evaluation of performance Ninad Bhatt

More information

ETSI TS V ( )

ETSI TS V ( ) TS 126 171 V14.0.0 (2017-04) TECHNICAL SPECIFICATION Digital cellular telecommunications system (Phase 2+) (GSM); Universal Mobile Telecommunications System (UMTS); LTE; Speech codec speech processing

More information

RECOMMENDATION ITU-R BS User requirements for audio coding systems for digital broadcasting

RECOMMENDATION ITU-R BS User requirements for audio coding systems for digital broadcasting Rec. ITU-R BS.1548-1 1 RECOMMENDATION ITU-R BS.1548-1 User requirements for audio coding systems for digital broadcasting (Question ITU-R 19/6) (2001-2002) The ITU Radiocommunication Assembly, considering

More information

3GPP TS V8.0.0 ( )

3GPP TS V8.0.0 ( ) TS 46.022 V8.0.0 (2008-12) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Half rate speech; Comfort noise aspects for the half rate

More information

Speech Quality Assessment for Wideband Communication Scenarios

Speech Quality Assessment for Wideband Communication Scenarios Speech Quality Assessment for Wideband Communication Scenarios H. W. Gierlich, S. Völl, F. Kettler (HEAD acoustics GmbH) P. Jax (IND, RWTH Aachen) Workshop on Wideband Speech Quality in Terminals and Networks

More information

AUDL GS08/GAV1 Auditory Perception. Envelope and temporal fine structure (TFS)

AUDL GS08/GAV1 Auditory Perception. Envelope and temporal fine structure (TFS) AUDL GS08/GAV1 Auditory Perception Envelope and temporal fine structure (TFS) Envelope and TFS arise from a method of decomposing waveforms The classic decomposition of waveforms Spectral analysis... Decomposes

More information

The Channel Vocoder (analyzer):

The Channel Vocoder (analyzer): Vocoders 1 The Channel Vocoder (analyzer): The channel vocoder employs a bank of bandpass filters, Each having a bandwidth between 100 Hz and 300 Hz. Typically, 16-20 linear phase FIR filter are used.

More information