BANDWIDTH EXTENSION OF NARROWBAND SPEECH BASED ON BLIND MODEL ADAPTATION

15th European Signal Processing Conference (EUSIPCO 2007), Poznan, Poland, September 3-7, 2007, copyright by EURASIP

Sheng Yao and Cheung-Fat Chan
Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong

ABSTRACT

Traditional telephone transmission networks limit speech to frequencies below 4 kHz. Narrowband telephone speech (0-4 kHz) sounds muffled compared with the original wideband speech (0-8 kHz). Artificial bandwidth extension is an economical way of enhancing the quality of narrowband speech without modifying the network infrastructure. Existing bandwidth extension methods usually include an off-line learning phase and an on-line enhancement phase. The performance of these systems depends largely on the consistency between the wideband training data and the actual narrowband input data. In real situations, the input speech usually mismatches the off-line training speech, leading to serious model errors. To avoid this data mismatch, we propose a method based on blind adaptation of a linear dynamic model. The benefit of our method is that it excludes the off-line training phase, and experimental results show that our system is comparable with data-oriented systems in terms of high-band spectral distortion. When data mismatch occurs, our system outperforms those systems.

1. INTRODUCTION

With the gradual development of wideband voice terminals such as the adaptive multi-rate wideband codec (AMR-WB) and the variable-rate multimode wideband codec (VMR-WB), the current speech transmission network is a mixture of traditional narrowband terminals and new wideband terminals. During this transition period, bandwidth extension (BWE) systems help to enhance the perceived quality of narrowband speech without the cost of replacing the old narrowband infrastructure.
The authors of [4] indicate that existing BWE systems perform reasonably well not because they accurately retrieve the original missing high-band information, but rather because they extend the high band such that the signal sounds perceptually pleasant. Reported BWE systems can basically be classified into two categories: memoryless systems and memory systems. Memoryless methods are the earlier development of BWE, with members such as VQ codebook mapping [1], linear mapping [3] and Gaussian mixture model (GMM) conversion [2]. These methods are usually criticized for disregarding inter-frame correlation, which causes relatively large hissing artefacts. Recently, more attention has been paid to the development of memory systems. Candidates are the hidden Markov model (HMM) method [5], HMM with state mapping [7] and the linear dynamic model [8]. These systems are able to estimate the missing high-band information based on previous estimates. They focus more on retrieving the trajectory of spectrum evolution, so the hissing artefact is greatly reduced.

This research is supported by a Strategic Research Grant (Project [...]) of City University of Hong Kong.

However, all the aforementioned systems are data-oriented. They perform well when the input narrowband speech is consistent with the training database, i.e. the same speaker or a similar recording environment. This is not the case in real applications, where data mismatch often occurs. In this paper, we propose a memory system based on a linear dynamic model whose parameters adapt to the input narrowband speech in a blind manner. Off-line model training is not required except for the initial model. Experimental results show that the proposed method is superior to memoryless systems and comparable with memory systems in terms of high-band spectral distortion.

The rest of the paper is organized as follows. Section 2 presents the use of the linear dynamic model in bandwidth extension systems.
Section 3 explains how the proposed system makes itself adaptive to the input narrowband speech in a blind sense. In Section 4, the objective performance is compared. The last section concludes the paper.

2. LINEAR DYNAMIC MODEL

The model is also termed the state space model. In the linear state space model, the hidden speech state vector $x(k) \in R^p$ is presumed to evolve linearly according to equation (1):

$$x(k+1) = A x(k) + u + w(k) \tag{1}$$

Here $k$ is the time index, i.e. the speech frame index. The transformation matrix $A$ and the deterministic control vector $u$ are pre-trained model parameters. $w(k)$ is an uncorrelated zero-mean Gaussian noise vector with covariance $E[w(k) w(l)^T] = Q \delta_{kl}$. The observation vector $o(k) \in R^m$ is a noisy linear transformation of the state vector $x(k)$ according to equation (2):

$$o(k) = C x(k) + v(k) \tag{2}$$

This equation is static in nature, since the vectors $o$ and $x$ share the same index. $v(k)$ is also uncorrelated zero-mean Gaussian noise, with covariance $E[v(k) v(l)^T] = R \delta_{kl}$.

By taking $o(k)$ as the input narrowband speech feature vector and $x(k)$ as the unknown wideband feature vector, the linear state space model is employed in the speech bandwidth extension system. Such an assumption is reasonable for three reasons:

1. Human speech is a non-stationary random process. The hidden state of the process is never static. State equation (1) is one possible representation of state evolution, chosen for ease of mathematical treatment.
2. The linear transformation between $o(k)$ and $x(k)$ assumed in observation equation (2) has already been applied in the memoryless linear mapping system. Its performance is satisfactory apart from the limitation imposed by the memoryless nature of the system.
3. Due to the presence of noise and the possible non-invertibility of matrix $C$ in equation (2), the state vector $x(k)$ cannot be uniquely estimated given the observation vector $o(k)$, which reflects the one-to-many mapping between narrowband and wideband speech features [6].

We extract 10th-order line spectral frequencies (LSFs) as the narrowband feature $o$. The target wideband feature $x$ is defined as the 18th-order LSFs.

3. BLIND MODEL ADAPTATION

Given a sequence of narrowband feature vectors $o$ and a trained linear state space model $\theta = \{A, u, C, Q, R\}$, if we assume that $\theta$ is stationary and well trained for the sequence, a good estimate of the $x$ sequence can be obtained via the Kalman filter algorithm shown below.

For $k = 1, 2, \ldots, L$:

Kalman prediction:

$$\hat{x}_{k|k-1} = A \hat{x}_{k-1|k-1} + u$$
$$\Sigma_{k|k-1} = A \Sigma_{k-1|k-1} A^T + Q$$

Kalman gain:

$$K_k = \Sigma_{k|k-1} C^T \left(C \Sigma_{k|k-1} C^T + R\right)^{-1}$$

Kalman correction:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left(o(k) - C \hat{x}_{k|k-1}\right)$$
$$\Sigma_{k|k} = \Sigma_{k|k-1} - K_k \left(C \Sigma_{k|k-1} C^T + R\right) K_k^T$$

The recursion is usually initialized by

$$\hat{x}_{0|0} = E[x(0)] = \mu(0), \qquad \Sigma_{0|0} = E[x(0) x(0)^T] = \Sigma(0)$$

The Kalman gain matrix $K_k$ is formulated to minimize the trace of the state error matrix $\Sigma_{k|k}$. Therefore the Kalman filter algorithm is an MMSE estimator of the hidden wideband speech vector $x(k)$. Note that the Kalman filter algorithm is sequential.

However, $\theta = \{A, u, C, Q, R\}$ should not be stationary. The method described in [8] provides the state space model with several different modes. Block by block, the system chooses the best-fitting mode for the input narrowband sequence via clustering techniques. Although such a treatment offers the state space model a certain degree of dynamics, and the subjective and objective performance of the system is satisfactory, a relatively large amount of training data is required, as with other data-oriented methods. Besides, when the input narrowband speech is quite different from the training database (e.g. a different speaker or recording environment), severe model error occurs, leading to an unacceptable level of hissing. For example, a false formant trajectory may appear in the high-band spectrogram when a male model is applied to a female input (see figure (1)), and vice versa.

[Figure (1): illustration of model mismatch. (a) Original wideband female speech from speaker clh. (b) Estimate of wideband speech with the mismatched model of male speaker bjm.]

We propose a model updating mechanism that does not require off-line training. The basic assumption is that the system is confident about previously estimated wideband features and, by utilizing those results, can update the model parameters.
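The state space model of equations (1) and (2) and the Kalman prediction, gain and correction steps can be illustrated with a small numerical sketch. This is not the authors' implementation: it assumes NumPy, and the dimensions (p = 4, m = 2), the parameter values and the function names `simulate` and `kalman_filter` are hypothetical stand-ins for the 18th-order and 10th-order LSF vectors and the trained model used in the paper.

```python
import numpy as np


def simulate(A, u, C, Q, R, x0, num_frames, rng):
    """Draw states and observations from equations (1) and (2)."""
    p, m = A.shape[0], C.shape[0]
    x, states, obs = x0, [], []
    for _ in range(num_frames):
        x = A @ x + u + rng.multivariate_normal(np.zeros(p), Q)
        states.append(x)
        obs.append(C @ x + rng.multivariate_normal(np.zeros(m), R))
    return np.array(states), np.array(obs)


def kalman_filter(obs, A, u, C, Q, R, mu0, Sigma0):
    """Sequentially estimate the hidden states from observations of shape (L, m)."""
    x_hat, Sigma = mu0.copy(), Sigma0.copy()
    estimates = []
    for o_k in obs:
        # Kalman prediction
        x_pred = A @ x_hat + u
        S_pred = A @ Sigma @ A.T + Q
        # Kalman gain
        innov_cov = C @ S_pred @ C.T + R
        K = S_pred @ C.T @ np.linalg.inv(innov_cov)
        # Kalman correction
        x_hat = x_pred + K @ (o_k - C @ x_pred)
        Sigma = S_pred - K @ innov_cov @ K.T
        estimates.append(x_hat)
    return np.array(estimates)


# Toy model (all values hypothetical, chosen only for illustration).
rng = np.random.default_rng(0)
p, m = 4, 2
A = 0.9 * np.eye(p)
u = 0.1 * np.ones(p)
C = rng.standard_normal((m, p))   # m < p, so C cannot be inverted
Q = 0.01 * np.eye(p)
R = 0.01 * np.eye(m)
mu0, Sigma0 = np.zeros(p), np.eye(p)

states, obs = simulate(A, u, C, Q, R, mu0, 200, rng)
x_est = kalman_filter(obs, A, u, C, Q, R, mu0, Sigma0)
print("mean squared state error:", np.mean((x_est - states) ** 2))
```

Because the toy C has fewer rows than columns, the observations alone do not determine the state, which mirrors the one-to-many mapping mentioned in Section 2; the filter resolves the ambiguity through the state equation.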
The concept is illustrated in figure (2).

[Figure (2): the sequence length is fixed to N frames and the updating block moves forward frame by frame.]

For an arbitrary input narrowband vector sequence of length N, $o(k), o(k+1), o(k+2), \ldots, o(k+N-1)$, the linear state space model is assumed stationary, $\theta = \theta_k$. The corresponding wideband estimates $\hat{x}(k), \hat{x}(k+1), \hat{x}(k+2), \ldots, \hat{x}(k+N-1)$ are obtained from the previous estimation. Consider the next narrowband input vector $o(k+N)$. First, the sequential Kalman filter algorithm continues to estimate $\hat{x}(k+N)$ with the model $\theta = \theta_k$. Given the narrowband observation sequence $o(k), o(k+1), \ldots, o(k+N)$ and the wideband state estimates up to $\hat{x}(k+N)$, i.e. $\hat{x}(k), \hat{x}(k+1), \ldots, \hat{x}(k+N)$, the linear state space model $\theta = \theta_{k+1}$ for the sequence from $k+1$ to $k+N$ is updated in the maximum likelihood sense.

[The closed-form update equations for $\hat{A}_{k+1}$, $\hat{u}_{k+1}$, $\hat{C}_{k+1}$, $\hat{Q}_{k+1}$ and $\hat{R}_{k+1}$ are not recoverable from this transcription. They express the new parameters through eight accumulated quantities, numbered 1 to 8, which are sums over the current window, with $\hat{Q}_{k+1}$ and $\hat{R}_{k+1}$ normalized by $N+1$.]

With $\theta = \theta_{k+1}$ and the next narrowband input $o(k+N+1)$, $\hat{x}(k+N+1)$ can be estimated via the Kalman filter algorithm. Then $\theta = \theta_{k+2}$ is updated with

$$o(k+1), o(k+2), o(k+3), \ldots, o(k+N+1)$$
$$\hat{x}(k+1), \hat{x}(k+2), \hat{x}(k+3), \ldots, \hat{x}(k+N+1)$$

The procedure continues until the end of the input is reached, which effectively conducts a timely on-line training of the linear state space model. The computation is not as burdensome as the model training in [8], since the eight accumulated quantities require full calculation only once (for the very first update). In each subsequent update, an addition and a subtraction are enough. The value of N is set to 60 frames (about [...] ms) for our codec configuration. If N is too small (say less than 50), the matrices that are inverted in the update may become singular. The initial linear state space model is trained off-line with environmental signals collected when speakers are not talking.

The method is named blind because the updating of the state space model is localized within N consecutive speech frames. Therefore the model parameters are optimized merely for these N frames. Besides, the wideband training data are the previously estimated data rather than the true data. One may wonder whether the updating is correct, since there are estimation errors. We performed the following experiment and found that such blind adaptation is trustworthy. The first part of the experiment enhances the narrowband speech with the initial model not adapted. In this case, the model error is quite large. The other part is normal operation (allowing blind adaptation). As depicted in figure (3), under normal operation of the proposed system the general shape of the high-band feature trajectories can be retrieved. The average distortion of the line spectral frequencies is listed in table (1). For reference, the third column is the result of the well-trained and source-matched memory system presented in [8]. Note that perceived quality is more closely related to small distortions of the high-order LSFs.

[Figure (3): system capability of tracking wideband features. (a) With model parameter updating. (b) Without model parameter updating. (c) Original wideband LSF trajectory.]

[Figure (4): spectral envelope comparison. Case (a), dotted curve: proposed method. Case (b), dashed curve: memoryless VQ method. Solid curve: original.]

[Table (1): LSF distortions along orders. Columns: with adaptation, without adaptation, LDS reference. Rows: lsf(1) to lsf(18). The numeric entries are lost in this transcription.]

Conceptually, in the beginning, the proposed system enhances the silent narrowband input, producing spectrally flat, noise-like high-band signals, as an ordinary linear state space system does. When speech content comes in, the high-band spectral distortion will be large if the initial model does not change accordingly. With adaptation, the model parameters are timely and locally optimized for the current voiced narrowband input, driving the underlying model towards a voiced model, frame by frame.
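The frame-by-frame update of Section 3 can be sketched in code. The paper's own update equations are not legible in this transcription, so the sketch substitutes the textbook closed-form maximum likelihood estimates for a linear Gaussian model whose states are treated as known; the grouping of the running sums, the window bookkeeping and the names `step_terms`, `accumulate` and `ml_estimate` are assumptions and may differ from the authors' eight quantities. With $z(i) = [x(i)^T, 1]^T$, the estimates used are $[\hat{A}\ \hat{u}] = (\sum x(i+1) z(i)^T)(\sum z(i) z(i)^T)^{-1}$ and $\hat{C} = (\sum o(i) x(i)^T)(\sum x(i) x(i)^T)^{-1}$, with $\hat{Q}$ and $\hat{R}$ taken as the averaged residual covariances. The toy data stands in for previously estimated wideband features.

```python
import numpy as np


def step_terms(x_prev, x_cur, o_cur):
    """Sums contributed by one transition x_prev -> x_cur and one frame (x_cur, o_cur)."""
    z = np.append(x_prev, 1.0)            # augmented vector [x_prev, 1]
    return {
        "xz": np.outer(x_cur, z),         # used for [A u]
        "zz": np.outer(z, z),             # used for [A u]
        "xx": np.outer(x_cur, x_cur),     # used for Q and C
        "ox": np.outer(o_cur, x_cur),     # used for C and R
        "oo": np.outer(o_cur, o_cur),     # used for R
    }


def accumulate(sums, terms, sign=1.0):
    """Add (sign=+1) or remove (sign=-1) one step from the running sums."""
    return {name: sums[name] + sign * terms[name] for name in sums}


def ml_estimate(sums, n):
    """Closed-form ML parameters from running sums over n steps."""
    Au = sums["xz"] @ np.linalg.inv(sums["zz"])
    A, u = Au[:, :-1], Au[:, -1]
    Q = (sums["xx"] - Au @ sums["xz"].T) / n
    C = sums["ox"] @ np.linalg.inv(sums["xx"])
    R = (sums["oo"] - C @ sums["ox"].T) / n
    return A, u, C, Q, R


# Toy data (hypothetical): x plays the role of previously estimated features.
rng = np.random.default_rng(1)
p, m, N, T = 3, 2, 60, 200
A_true, u_true = 0.8 * np.eye(p), 0.2 * np.ones(p)
C_true = rng.standard_normal((m, p))
x = np.zeros((T, p))
for i in range(1, T):
    x[i] = A_true @ x[i - 1] + u_true + 0.1 * rng.standard_normal(p)
o = x @ C_true.T + 0.1 * rng.standard_normal((T, m))

# Full calculation once, for the first window of N steps.
sums = {name: np.zeros_like(v) for name, v in step_terms(x[0], x[1], o[1]).items()}
for i in range(1, N + 1):
    sums = accumulate(sums, step_terms(x[i - 1], x[i], o[i]))

# Afterwards each slide costs one addition and one subtraction per sum.
for k in range(1, T - N):
    sums = accumulate(sums, step_terms(x[k + N - 1], x[k + N], o[k + N]))
    sums = accumulate(sums, step_terms(x[k - 1], x[k], o[k]), sign=-1.0)
    A_k, u_k, C_k, Q_k, R_k = ml_estimate(sums, N)

print("A estimated from the last window:\n", A_k)
```

The two matrix inversions in `ml_estimate` are where a short window causes trouble: with too few frames the summed outer products are rank deficient, which is consistent with the remark that a small N may lead to singular matrices.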
Since the wideband features used for adaptation are previous estimates, which are spectrally flat, the new model parameters are actually optimized for wideband output speech that has a narrowband portion similar to the input and a spectrally flat high band. Recall that the speech sounds that suffer most under bandwidth limitation are fricatives and plosives. These sounds have a relatively flat high band and little voicing content in the high-band portion. The objective of bandwidth extension is to artificially extend the bandwidth so that the speech becomes perceptually better. See the illustration in figure (4). In case (a), the speech quality is still enhanced even though the high-band formant structure is not recovered. But if case (b) occurs (due to model error or the limitation of a memoryless design), the human ear is quite sensitive to the resulting noise. Finally, figure (5) shows the performance difference between the proposed method and the conventional LDS system [8].

[Figure (5): performance illustration of blind model adaptation. (a) Original wideband female speech. (b) Wideband speech estimated by the linear state space model with the correct speaker model. (c) Wideband speech estimated by blind model adaptation.]

4. PERFORMANCE EVALUATION

The objective measure is the high-band spectral distortion D defined in equation (3).
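The distortion measure and the outlier percentages reported in this section can be sketched numerically. This is a minimal illustration under stated assumptions rather than the authors' evaluation code: it assumes NumPy, treats the inputs as power spectra sampled uniformly over [0, pi], takes the high band to be the upper half of that axis, and uses a root-mean-square form for equation (3), which is damaged in this transcription. The function names `highband_spectral_distortion` and `outlier_rate` are hypothetical.

```python
import numpy as np


def highband_spectral_distortion(s_org, s_ext):
    """RMS log-spectral difference in dB over the upper half band [pi/2, pi].

    s_org and s_ext are power spectra on the same uniform grid over [0, pi].
    """
    s_org = np.asarray(s_org, dtype=float)
    s_ext = np.asarray(s_ext, dtype=float)
    omega = np.linspace(0.0, np.pi, len(s_org))
    high = omega >= np.pi / 2
    diff_db = 10.0 * np.log10(s_org[high]) - 10.0 * np.log10(s_ext[high])
    # On a uniform grid the mean approximates (2/pi) times the integral.
    return float(np.sqrt(np.mean(diff_db ** 2)))


def outlier_rate(d_per_frame, threshold_db):
    """Fraction of frames whose distortion exceeds the threshold."""
    return float(np.mean(np.asarray(d_per_frame, dtype=float) > threshold_db))


# Toy spectra (hypothetical): a smooth envelope and a copy that is 3 dB louder.
omega = np.linspace(0.0, np.pi, 257)
s_org = 1.0 / (1.1 + np.cos(omega))
s_ext = s_org * 10 ** 0.3
d = highband_spectral_distortion(s_org, s_ext)
print("distortion for a uniform 3 dB offset:", d)
print("share of frames above 5 dB:", outlier_rate([1.0, 6.0, 8.0, 2.0], 5.0))
```

A per-frame distortion computed this way can be averaged over all test frames to give a mean D, and thresholded at 5 dB and 7.5 dB to give outlier percentages of the kind listed in the result tables.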

$$D = \sqrt{\frac{2}{\pi} \int_{\pi/2}^{\pi} \left(10 \log_{10} S_{org}(\omega) - 10 \log_{10} S_{ext}(\omega)\right)^2 d\omega} \tag{3}$$

where $S_{org}(\omega)$ and $S_{ext}(\omega)$ are the spectra of the original wideband speech and of the bandwidth-extended speech.

The BWE systems [1][2][3][5][7][8] are implemented and trained in a speaker-dependent manner. The training data is from the phonetically balanced IViE corpus. An 8-minute speaker-dependent paragraph of read speech (about [...]00,000 frames according to our speech analyzer) is picked out to train all six systems. The silence segments are collected for training the initial model of the proposed system, which is the only off-line training requirement of the system. The performance is listed in tables (2) and (3). The mismatched test data is collected from another speaker of a different gender.

Table (2): Test A (test data matches training data)

System                         D (dB)   Outliers (>5 dB)   Outliers (>7.5 dB)
VQ [1]                         [...]    [...]%             .0%
Linear mapping [3]             .963     .5%                .%
GMM [2]                        [...]    [...]%             0.98%
HMM [5]                        .06      .0%                0.7%
HMM state mapping [7]          .357     .%                 0.0%
Linear state space model [8]   .33      .05%               0.6%
Proposed                       .55      3.%                0.77%

Table (3): Test B (test data mismatches training data)

System                         D (dB)   Outliers (>5 dB)   Outliers (>7.5 dB)
VQ [1]                         [...]    [...]%             3.9%
Linear mapping [3]             [...]    [...]%             .07%
GMM [2]                        [...]    [...]%             .78%
HMM [5]                        [...]    [...]%             0.99%
HMM state mapping [7]          [...]    [...]%             .0%
Linear state space model [8]   [...]    [...]%             0.8%
Proposed                       [...]    [...]%             0.80%

Entries in tables (2) and (3) are shown as they survive in this transcription: [...] marks a value that is entirely lost, and digits may be missing from any other entry.

As tables (2) and (3) show, the proposed method performs similarly under the two circumstances. When the test data is consistent with the training database, its performance is better than that of memoryless systems and comparable with that of memory systems. When model mismatch occurs, it outperforms all the data-oriented methods.

5. CONCLUSION

In this paper we present a bandwidth extension system based on blind adaptation of a linear state space model. In terms of high-band spectral distortion, the proposed system is comparable with data-oriented memory systems and better than memoryless systems. When data mismatch occurs, its performance is better than that of all the data-oriented systems, on condition that the background environment does not change dramatically.
Moreover, off-line training is not required, and the efficient computation of the on-line model adaptation ensures that the system delay is not too large.

REFERENCES

[1] N. Enbom and W. B. Kleijn, "Bandwidth Expansion of Speech Based on Vector Quantization of the Mel Frequency Cepstral Coefficients," Proc. Speech Coding, pp. 171-173, 1999.
[2] K. Y. Park and H. S. Kim, "Narrowband to Wideband Conversion of Speech Using GMM Based Transformation," Proc. ICASSP, pp. [...], 2000.
[3] Y. Nakatoh, M. Tsushima, and T. Norimatsu, "Generation of Broadband Speech from Narrowband Speech Based on Linear Mapping," Electronics and Communications in Japan, Part [...], Vol. 85, No. 8, pp. [...]-53, 2002.
[4] M. Nilsson, H. Gustafsson, S. V. Anderson, and W. B. Kleijn, "Gaussian Mixture Model Based Mutual Information Estimation between Frequency Bands in Speech," Proc. ICASSP, pp. I525-I528, 2002.
[5] P. Jax and P. Vary, "On Artificial Bandwidth Extension of Telephone Speech," Signal Processing, pp. [...], 2003.
[6] Y. Agiomyrgiannakis and Y. Stylianou, "Combined Estimation/Coding of Highband Spectral Envelopes for Speech Spectrum Expansion," Proc. ICASSP, pp. 469-472, 2004.
[7] S. Yao and C. F. Chan, "Block-based Bandwidth Extension of Narrowband Speech Signal by Using CDHMM," Proc. ICASSP, pp. I793-I796, 2005.
[8] S. Yao and C. F. Chan, "Speech Bandwidth Enhancement Using State Space Speech Dynamics," Proc. ICASSP, pp. I489-I492, [...].


More information

Using RASTA in task independent TANDEM feature extraction

Using RASTA in task independent TANDEM feature extraction R E S E A R C H R E P O R T I D I A P Using RASTA in task independent TANDEM feature extraction Guillermo Aradilla a John Dines a Sunil Sivadas a b IDIAP RR 04-22 April 2004 D a l l e M o l l e I n s t

More information

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis

More information

L19: Prosodic modification of speech

L19: Prosodic modification of speech L19: Prosodic modification of speech Time-domain pitch synchronous overlap add (TD-PSOLA) Linear-prediction PSOLA Frequency-domain PSOLA Sinusoidal models Harmonic + noise models STRAIGHT This lecture

More information

DERIVATION OF TRAPS IN AUDITORY DOMAIN

DERIVATION OF TRAPS IN AUDITORY DOMAIN DERIVATION OF TRAPS IN AUDITORY DOMAIN Petr Motlíček, Doctoral Degree Programme (4) Dept. of Computer Graphics and Multimedia, FIT, BUT E-mail: motlicek@fit.vutbr.cz Supervised by: Dr. Jan Černocký, Prof.

More information

CS 188: Artificial Intelligence Spring Speech in an Hour

CS 188: Artificial Intelligence Spring Speech in an Hour CS 188: Artificial Intelligence Spring 2006 Lecture 19: Speech Recognition 3/23/2006 Dan Klein UC Berkeley Many slides from Dan Jurafsky Speech in an Hour Speech input is an acoustic wave form s p ee ch

More information

ON BEDROSIAN CONDITION IN APPLICATION TO CHIRP SOUNDS

ON BEDROSIAN CONDITION IN APPLICATION TO CHIRP SOUNDS 15th European Signal Processing Conference (EUSIPCO 7), Poznan, Poland, September 3-7, 7, copyright by EURASIP ON BEDROSIAN CONDIION IN APPLICAION O CHIRP SOUNDS E. HERMANOWICZ 1 ) ) and M. ROJEWSKI Faculty

More information

core signal feature extractor feature signal estimator adding additional frequency content frequency enhanced audio signal 112 selection side info.

core signal feature extractor feature signal estimator adding additional frequency content frequency enhanced audio signal 112 selection side info. US 20170358311A1 US 20170358311Α1 (ΐ9) United States (ΐ2) Patent Application Publication (ΐο) Pub. No.: US 2017/0358311 Al NAGEL et al. (43) Pub. Date: Dec. 14,2017 (54) DECODER FOR GENERATING A FREQUENCY

More information

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing Project : Part 2 A second hands-on lab on Speech Processing Frequency-domain processing February 24, 217 During this lab, you will have a first contact on frequency domain analysis of speech signals. You

More information

Automatic Morse Code Recognition Under Low SNR

Automatic Morse Code Recognition Under Low SNR 2nd International Conference on Mechanical, Electronic, Control and Automation Engineering (MECAE 2018) Automatic Morse Code Recognition Under Low SNR Xianyu Wanga, Qi Zhaob, Cheng Mac, * and Jianping

More information

Modulation Spectrum Power-law Expansion for Robust Speech Recognition

Modulation Spectrum Power-law Expansion for Robust Speech Recognition Modulation Spectrum Power-law Expansion for Robust Speech Recognition Hao-Teng Fan, Zi-Hao Ye and Jeih-weih Hung Department of Electrical Engineering, National Chi Nan University, Nantou, Taiwan E-mail:

More information

Bandwidth Extension of Speech Signals: A Catalyst for the Introduction of Wideband Speech Coding?

Bandwidth Extension of Speech Signals: A Catalyst for the Introduction of Wideband Speech Coding? WIDEBAND SPEECH CODING STANDARDS AND WIRELESS SERVICES Bandwidth Extension of Speech Signals: A Catalyst for the Introduction of Wideband Speech Coding? Peter Jax and Peter Vary, RWTH Aachen University

More information

Nonuniform multi level crossing for signal reconstruction

Nonuniform multi level crossing for signal reconstruction 6 Nonuniform multi level crossing for signal reconstruction 6.1 Introduction In recent years, there has been considerable interest in level crossing algorithms for sampling continuous time signals. Driven

More information

IMPROVED CODING OF TONAL COMPONENTS IN MPEG-4 AAC WITH SBR

IMPROVED CODING OF TONAL COMPONENTS IN MPEG-4 AAC WITH SBR IMPROVED CODING OF TONAL COMPONENTS IN MPEG-4 AAC WITH SBR Tomasz Żernici, Mare Domańsi, Poznań University of Technology, Chair of Multimedia Telecommunications and Microelectronics, Polana 3, 6-965, Poznań,

More information

ADAPTIVE NOISE LEVEL ESTIMATION

ADAPTIVE NOISE LEVEL ESTIMATION Proc. of the 9 th Int. Conference on Digital Audio Effects (DAFx-6), Montreal, Canada, September 18-2, 26 ADAPTIVE NOISE LEVEL ESTIMATION Chunghsin Yeh Analysis/Synthesis team IRCAM/CNRS-STMS, Paris, France

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations

More information

ON THE POTENTIAL FOR ARTIFICIAL BANDWIDTH EXTENSION OF BONE AND TISSUE CONDUCTED SPEECH: A MUTUAL INFORMATION STUDY

ON THE POTENTIAL FOR ARTIFICIAL BANDWIDTH EXTENSION OF BONE AND TISSUE CONDUCTED SPEECH: A MUTUAL INFORMATION STUDY Authors' accepted manuscript of the article published in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) http://dx.doi.org/10.1109/icassp.2015.7178944 ON THE POTENTIAL

More information

Voice Activity Detection

Voice Activity Detection Voice Activity Detection Speech Processing Tom Bäckström Aalto University October 2015 Introduction Voice activity detection (VAD) (or speech activity detection, or speech detection) refers to a class

More information

Mikko Myllymäki and Tuomas Virtanen

Mikko Myllymäki and Tuomas Virtanen NON-STATIONARY NOISE MODEL COMPENSATION IN VOICE ACTIVITY DETECTION Mikko Myllymäki and Tuomas Virtanen Department of Signal Processing, Tampere University of Technology Korkeakoulunkatu 1, 3370, Tampere,

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 12 Speech Signal Processing 14/03/25 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Joint recognition and direction-of-arrival estimation of simultaneous meetingroom acoustic events

Joint recognition and direction-of-arrival estimation of simultaneous meetingroom acoustic events INTERSPEECH 2013 Joint recognition and direction-of-arrival estimation of simultaneous meetingroom acoustic events Rupayan Chakraborty and Climent Nadeu TALP Research Centre, Department of Signal Theory

More information

Speech and Music Discrimination based on Signal Modulation Spectrum.

Speech and Music Discrimination based on Signal Modulation Spectrum. Speech and Music Discrimination based on Signal Modulation Spectrum. Pavel Balabko June 24, 1999 1 Introduction. This work is devoted to the problem of automatic speech and music discrimination. As we

More information

Flexible and Scalable Transform-Domain Codebook for High Bit Rate CELP Coders

Flexible and Scalable Transform-Domain Codebook for High Bit Rate CELP Coders Flexible and Scalable Transform-Domain Codebook for High Bit Rate CELP Coders Václav Eksler, Bruno Bessette, Milan Jelínek, Tommy Vaillancourt University of Sherbrooke, VoiceAge Corporation Montreal, QC,

More information

Students: Avihay Barazany Royi Levy Supervisor: Kuti Avargel In Association with: Zoran, Haifa

Students: Avihay Barazany Royi Levy Supervisor: Kuti Avargel In Association with: Zoran, Haifa Students: Avihay Barazany Royi Levy Supervisor: Kuti Avargel In Association with: Zoran, Haifa Spring 2008 Introduction Problem Formulation Possible Solutions Proposed Algorithm Experimental Results Conclusions

More information

ANALOGUE TRANSMISSION OVER FADING CHANNELS

ANALOGUE TRANSMISSION OVER FADING CHANNELS J.P. Linnartz EECS 290i handouts Spring 1993 ANALOGUE TRANSMISSION OVER FADING CHANNELS Amplitude modulation Various methods exist to transmit a baseband message m(t) using an RF carrier signal c(t) =

More information

Enhancement of Speech Signal by Adaptation of Scales and Thresholds of Bionic Wavelet Transform Coefficients

Enhancement of Speech Signal by Adaptation of Scales and Thresholds of Bionic Wavelet Transform Coefficients ISSN (Print) : 232 3765 An ISO 3297: 27 Certified Organization Vol. 3, Special Issue 3, April 214 Paiyanoor-63 14, Tamil Nadu, India Enhancement of Speech Signal by Adaptation of Scales and Thresholds

More information

ON THE PERFORMANCE OF WTIMIT FOR WIDE BAND TELEPHONY

ON THE PERFORMANCE OF WTIMIT FOR WIDE BAND TELEPHONY ON THE PERFORMANCE OF WTIMIT FOR WIDE BAND TELEPHONY D. Nagajyothi 1 and P. Siddaiah 2 1 Department of Electronics and Communication Engineering, Vardhaman College of Engineering, Shamshabad, Telangana,

More information

HUMAN speech is frequently encountered in several

HUMAN speech is frequently encountered in several 1948 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 20, NO. 7, SEPTEMBER 2012 Enhancement of Single-Channel Periodic Signals in the Time-Domain Jesper Rindom Jensen, Student Member,

More information

NOISE ESTIMATION IN A SINGLE CHANNEL

NOISE ESTIMATION IN A SINGLE CHANNEL SPEECH ENHANCEMENT FOR CROSS-TALK INTERFERENCE by Levent M. Arslan and John H.L. Hansen Robust Speech Processing Laboratory Department of Electrical Engineering Box 99 Duke University Durham, North Carolina

More information

A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder

A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder Jing Wang, Jingg Kuang, and Shenghui Zhao Research Center of Digital Communication Technology,Department of Electronic

More information

THERE is a constant need for speech codecs with decreased

THERE is a constant need for speech codecs with decreased IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 2, FEBRUARY 2007 377 Conditional Vector Quantization for Speech Coding Yannis Agiomyrgiannakis and Yannis Stylianou Abstract In

More information

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution PAGE 433 Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution Wenliang Lu, D. Sen, and Shuai Wang School of Electrical Engineering & Telecommunications University of New South Wales,

More information

A Comparative Study of Formant Frequencies Estimation Techniques

A Comparative Study of Formant Frequencies Estimation Techniques A Comparative Study of Formant Frequencies Estimation Techniques DORRA GARGOURI, Med ALI KAMMOUN and AHMED BEN HAMIDA Unité de traitement de l information et électronique médicale, ENIS University of Sfax

More information

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor BEAT DETECTION BY DYNAMIC PROGRAMMING Racquel Ivy Awuor University of Rochester Department of Electrical and Computer Engineering Rochester, NY 14627 rawuor@ur.rochester.edu ABSTRACT A beat is a salient

More information

A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION

A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION 17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION

More information

Can binary masks improve intelligibility?

Can binary masks improve intelligibility? Can binary masks improve intelligibility? Mike Brookes (Imperial College London) & Mark Huckvale (University College London) Apparently so... 2 How does it work? 3 Time-frequency grid of local SNR + +

More information

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile 8 2. LITERATURE SURVEY The available radio spectrum for the wireless radio communication is very limited hence to accommodate maximum number of users the speech is compressed. The speech compression techniques

More information

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals ISCA Journal of Engineering Sciences ISCA J. Engineering Sci. Vocoder (LPC) Analysis by Variation of Input Parameters and Signals Abstract Gupta Rajani, Mehta Alok K. and Tiwari Vebhav Truba College of

More information

Speech Signal Enhancement Techniques

Speech Signal Enhancement Techniques Speech Signal Enhancement Techniques Chouki Zegar 1, Abdelhakim Dahimene 2 1,2 Institute of Electrical and Electronic Engineering, University of Boumerdes, Algeria inelectr@yahoo.fr, dahimenehakim@yahoo.fr

More information

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991 RASTA-PLP SPEECH ANALYSIS Hynek Hermansky Nelson Morgan y Aruna Bayya Phil Kohn y TR-91-069 December 1991 Abstract Most speech parameter estimation techniques are easily inuenced by the frequency response

More information

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2 Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter

More information

Chapter 2 Direct-Sequence Systems

Chapter 2 Direct-Sequence Systems Chapter 2 Direct-Sequence Systems A spread-spectrum signal is one with an extra modulation that expands the signal bandwidth greatly beyond what is required by the underlying coded-data modulation. Spread-spectrum

More information

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal

More information

The Partly Preserved Natural Phases in the Concatenative Speech Synthesis Based on the Harmonic/Noise Approach

The Partly Preserved Natural Phases in the Concatenative Speech Synthesis Based on the Harmonic/Noise Approach The Partly Preserved Natural Phases in the Concatenative Speech Synthesis Based on the Harmonic/Noise Approach ZBYNĚ K TYCHTL Department of Cybernetics University of West Bohemia Univerzitní 8, 306 14

More information

Enhancement of Speech in Noisy Conditions

Enhancement of Speech in Noisy Conditions Enhancement of Speech in Noisy Conditions Anuprita P Pawar 1, Asst.Prof.Kirtimalini.B.Choudhari 2 PG Student, Dept. of Electronics and Telecommunication, AISSMS C.O.E., Pune University, India 1 Assistant

More information

PDF hosted at the Radboud Repository of the Radboud University Nijmegen

PDF hosted at the Radboud Repository of the Radboud University Nijmegen PDF hosted at the Radboud Repository of the Radboud University Nijmegen The following full text is an author's version which may differ from the publisher's version. For additional information about this

More information

Chapter 2 Channel Equalization

Chapter 2 Channel Equalization Chapter 2 Channel Equalization 2.1 Introduction In wireless communication systems signal experiences distortion due to fading [17]. As signal propagates, it follows multiple paths between transmitter and

More information

SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT

SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT RASHMI MAKHIJANI Department of CSE, G. H. R.C.E., Near CRPF Campus,Hingna Road, Nagpur, Maharashtra, India rashmi.makhijani2002@gmail.com

More information

Separating Voiced Segments from Music File using MFCC, ZCR and GMM

Separating Voiced Segments from Music File using MFCC, ZCR and GMM Separating Voiced Segments from Music File using MFCC, ZCR and GMM Mr. Prashant P. Zirmite 1, Mr. Mahesh K. Patil 2, Mr. Santosh P. Salgar 3,Mr. Veeresh M. Metigoudar 4 1,2,3,4Assistant Professor, Dept.

More information

COM 12 C 288 E October 2011 English only Original: English

COM 12 C 288 E October 2011 English only Original: English Question(s): 9/12 Source: Title: INTERNATIONAL TELECOMMUNICATION UNION TELECOMMUNICATION STANDARDIZATION SECTOR STUDY PERIOD 2009-2012 Audience STUDY GROUP 12 CONTRIBUTION 288 P.ONRA Contribution Additional

More information

An Efficient Extraction of Vocal Portion from Music Accompaniment Using Trend Estimation

An Efficient Extraction of Vocal Portion from Music Accompaniment Using Trend Estimation An Efficient Extraction of Vocal Portion from Music Accompaniment Using Trend Estimation Aisvarya V 1, Suganthy M 2 PG Student [Comm. Systems], Dept. of ECE, Sree Sastha Institute of Engg. & Tech., Chennai,

More information

Machine recognition of speech trained on data from New Jersey Labs

Machine recognition of speech trained on data from New Jersey Labs Machine recognition of speech trained on data from New Jersey Labs Frequency response (peak around 5 Hz) Impulse response (effective length around 200 ms) 41 RASTA filter 10 attenuation [db] 40 1 10 modulation

More information

Subjective Voice Quality Evaluation of Artificial Bandwidth Extension: Comparing Different Audio Bandwidths and Speech Codecs

Subjective Voice Quality Evaluation of Artificial Bandwidth Extension: Comparing Different Audio Bandwidths and Speech Codecs INTERSPEECH 01 Subjective Voice Quality Evaluation of Artificial Bandwidth Extension: Comparing Different Audio Bandwidths and Speech Codecs Hannu Pulakka 1, Anssi Rämö, Ville Myllylä 1, Henri Toukomaa,

More information

Automatic Text-Independent. Speaker. Recognition Approaches Using Binaural Inputs

Automatic Text-Independent. Speaker. Recognition Approaches Using Binaural Inputs Automatic Text-Independent Speaker Recognition Approaches Using Binaural Inputs Karim Youssef, Sylvain Argentieri and Jean-Luc Zarader 1 Outline Automatic speaker recognition: introduction Designed systems

More information

High-speed Noise Cancellation with Microphone Array

High-speed Noise Cancellation with Microphone Array Noise Cancellation a Posteriori Probability, Maximum Criteria Independent Component Analysis High-speed Noise Cancellation with Microphone Array We propose the use of a microphone array based on independent

More information

A Hybrid Synchronization Technique for the Frequency Offset Correction in OFDM

A Hybrid Synchronization Technique for the Frequency Offset Correction in OFDM A Hybrid Synchronization Technique for the Frequency Offset Correction in OFDM Sameer S. M Department of Electronics and Electrical Communication Engineering Indian Institute of Technology Kharagpur West

More information