SPEECH ENHANCEMENT USING ADAPTIVE FILTERS AND INDEPENDENT COMPONENT ANALYSIS APPROACH


Tomasz Rutkowski*, Andrzej Cichocki* and Allan Kardec Barros**

* Brain Science Institute RIKEN, Wako-shi, Saitama, JAPAN
** Depto. de Engenharia Eletrica, Universidade Federal do Maranhao, Sao Luis - MA - BRAZIL

tomek@bsp.brain.riken.go.jp, cia@brain.riken.go.jp, allan@biomedica.org

ABSTRACT

In this paper we consider the problem of enhancement and extraction of the speech signal of one speaker corrupted by environmental acoustic noise/interference and by other speakers, using an array of at least two microphones. The preprocessing unit mimics the human auditory system by roughly emulating the cochlea with a nonuniform bandpass filter bank. We construct the filter bank with the center frequencies of the subbands based on an approximation of the target speaker's fundamental frequency. We then apply a blind signal separation method to each subband signal to extract the maximum information representing the target speaker's speech. Finally, the desired signal is reconstructed from the independent components representing every subband. Experiments with office room recordings are presented to confirm the validity and good performance of the proposed method in a real-world environment.

1. INTRODUCTION

We propose a system that approximately emulates the human auditory system using a nonuniform bandpass filter bank that tracks the fundamental frequency of the target speaker. The filter bank consists of bandpass filters whose bandwidth starts at 100 Hz to 200 Hz for the first filter and doubles for each subsequent filter. For the experiments we use telephone-quality speech signals with a sampling frequency of 8 kHz. We assume the hearing system performs a spectrographic analysis of an auditory stimulus at the cochlea, which can be regarded as a bank of nonuniform self-adaptive filters whose outputs are ordered tonotopically.
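As a rough illustration of this band layout (our own sketch, not code from the paper), the passband edges implied by a first 100-200 Hz band whose bandwidth doubles from filter to filter can be enumerated up to the Nyquist frequency; the function and parameter names below are ours:

```python
# Illustrative sketch (not the authors' code): band edges for a nonuniform
# filter bank whose first passband is 100-200 Hz and whose bandwidth doubles
# for each subsequent filter, up to the Nyquist frequency fs/2.

def band_edges(first_lo=100.0, first_hi=200.0, fs=8000.0):
    """Return a list of (low, high) passband edges in Hz."""
    edges = []
    lo, hi = first_lo, first_hi
    while hi <= fs / 2:
        edges.append((lo, hi))
        width = 2 * (hi - lo)  # the next filter is twice as wide
        lo, hi = hi, hi + width
    return edges

print(band_edges())  # five bands: 100-200, 200-400, 400-800, 800-1600, 1600-3200 Hz
```

For 8 kHz telephone-quality speech this yields exactly five bands below the 4 kHz Nyquist frequency, consistent with the five-subband filter bank shown later in Fig. 3.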
At first the fundamental frequency of the target speaker in the mixture of convolved speech signals is estimated in order to properly design the center frequencies of the filters. The bank of adaptive band-pass filters processes the available microphone signals around the fundamental frequency of the target speaker and around its harmonics. In the next stage of processing we perform blind source separation (BSS), blind source extraction (BSE) or independent component analysis (ICA) for each frequency sub-band (bin). Efficient learning algorithms are used to perform BSS/BSE/ICA. Finally, a set of switches is implemented to perform temporal masking and the selection/classification of the independent components whose specific features enhance the voice of the target speaker. The main problem in this stage is to decide which of the obtained subband independent components should be discarded and which ones contain essential information from the target speaker. We apply a spectral measure in every subband to solve this problem. After such processing we have subband signals that carry speech with enhanced target speaker information. The last stage of our signal processing system performs signal reconstruction: an inverted filter bank is applied to correctly reconstruct the target speaker's voice, avoiding aliasing of the subband filtered components. Extensive computer simulation results with arrays of two or more microphones confirm the validity and performance of the proposed approach. We present results for real room recordings with natural reverberation from the walls and objects in the room.

2. AUDITORY SYSTEM FILTER BANK

Fig. 1. Conceptual block diagram of the algorithm, which roughly mimics the auditory system. First it estimates the fundamental frequency f_0 and processes the microphone signals using a bank of band-pass filters (as in the inner ear). Then it processes the mixed/convolved signals by a BSE or blind signal deconvolution (BSD) algorithm for each frequency bin. Finally, it performs masking by a set of switches and an inverse filter bank.

It is well known that the human auditory system can roughly be described as a nonuniform bandpass filter bank, consisting of strongly overlapping bandpass filters with bandwidth in the order of 50 to 100 Hz for signals below 500 Hz and up to 5000 Hz for signals at higher frequencies [1]. The hearing system performs a spectrographic analysis of any auditory stimulus at the cochlea, which can be regarded as a bank of nonuniform self-adaptive filters whose outputs are ordered tonotopically. Recently, many sophisticated and biologically plausible models of such auditory filter banks have been proposed. In this paper we employ the idea of speaker voice extraction/enhancement from natural mixtures recorded in real environments. Following this idea, we suggest a preprocessing unit for blind signal separation that transforms the microphone signals into subbands with center frequencies around the fundamental frequency f_0 and its harmonics. The fundamental frequency of the target speaker can be estimated on the basis of the microphone signals x̃_1 and x̃_2 and/or the estimated signal y(t). Unvoiced speech is noisy due to the random nature of the signal generated at a narrow constriction in the vocal tract for such sounds. For both voiced and unvoiced excitation, the vocal tract, acting as a filter, amplifies certain sound frequencies while attenuating others.
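The f_0 estimate that drives the filter bank is obtained in the spectral domain. The sketch below is a deliberately simplified stand-in for such an estimator (peak picking on DFT magnitudes of a short frame, restricted to a plausible pitch range); the function and parameter names are ours, not the paper's:

```python
# Hedged sketch of spectral-domain f0 estimation (not the authors' algorithm):
# compute DFT magnitudes of one short frame and pick the strongest bin in a
# plausible pitch range. O(N^2) DFT is used for clarity, not speed.
import math, cmath

def estimate_f0(frame, fs, fmin=80.0, fmax=300.0):
    """Estimate the fundamental frequency (Hz) of one quasi-stationary frame."""
    n = len(frame)
    k_min = max(1, int(fmin * n / fs))
    k_max = min(n // 2, int(fmax * n / fs))
    best_k, best_mag = k_min, -1.0
    for k in range(k_min, k_max + 1):
        # one DFT bin: X[k] = sum_t x[t] * exp(-2*pi*j*k*t/n)
        xk = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(xk) > best_mag:
            best_k, best_mag = k, abs(xk)
    return best_k * fs / n

# Synthetic voiced-like frame: harmonics of 125 Hz sampled at 8 kHz.
fs = 8000.0
frame = [sum(math.sin(2 * math.pi * h * 125.0 * t / fs) / h for h in (1, 2, 3))
         for t in range(512)]
print(estimate_f0(frame, fs))  # -> 125.0
```

A practical estimator would also exploit the harmonic structure (e.g., checking energy at 2f_0 and 3f_0) to distinguish voiced from unvoiced frames, as the system described here requires.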
As a periodic signal, voiced speech has a spectrum consisting of harmonics of the fundamental frequency of the vocal fold vibration; this frequency, often abbreviated f_0, is the physical aspect of speech corresponding to perceived pitch. In our algorithm we use this feature to track the speaker, assuming that the fundamental frequency f_0 is known or can be estimated. Our filter bank adapts the central frequencies of the subband bins to enhance the signal of the target speaker. An important property of speech signals is that they are non-stationary, but can be regarded as locally stationary. Roughly speaking, speech can be divided in the time domain into voiced and unvoiced sounds, the former having more structure in the frequency domain than the latter; indeed, voiced sounds are generally regarded as quasi-periodic. Some experiments have indicated that humans may use this voiced structure to separate sounds from the background. Moreover, humans can more easily understand a voiced than an unvoiced sound. One open problem is how the auditory system segregates those sounds at a higher level (at the auditory cortex). In this work, we suggest that this can be carried out by exploiting the local statistical independence of a pair of sub-band sounds for each frequency sub-band (bin). Our final objective is to develop an efficient algorithm whose output signal y(t) is a modified version of a speech signal s_i(t), i.e. y(t) = g(s_i(t)), where g(·) represents an unavoidable distortion filter and a non-linear transformation operator. Our algorithm also includes the temporal masking characteristic of the auditory system; this is managed by a set of switches which, after the blind separation/extraction algorithm, select the speaker-related information from the subband components. Fig. 1 shows the conceptual block diagram of the system. It is composed of four parts.
The first part is the fundamental frequency estimation algorithm, operating in the spectral domain, which estimates the fundamental frequency from the mixed signals and also indicates which parts of the speech are voiced. It should be noted that the estimation of f_0 becomes rather difficult when all speakers are at the same distance from the microphones. The second processing unit is a bank of FIR band-pass filters with center frequencies adapted to the estimated f_0 of the target speaker. The filter banks process the signals around the fundamental frequency and around its harmonics. The third section is a bank of BSE (blind signal extraction) units that enhance the desired signal in each frequency sub-band. Finally, the last section performs signal reconstruction from the subband signals. The reconstruction section is the inverse of the auditory filter bank used in the second unit of our system; in this way we reconstruct the signal from the subbands while avoiding aliasing and frequency distortion problems.

3. ADAPTIVE FILTER BANKS

The filter bank consists of slightly overlapping bandpass filters whose bandwidth starts at 100 Hz to 200 Hz for the first filter and doubles for each subsequent filter. First, the fundamental frequency of the target speaker in the mixture of convolved speech signals is estimated in order to properly design the center frequencies of the filters. The bank of adaptive band-pass filters then processes the available microphone signals around the fundamental frequency of the target speaker and around its harmonics. Following the frequency sensitivity of the human auditory system, we construct filter banks with center frequencies f_0, 4f_0, 10f_0, 22f_0, ..., making each subsequent subband twice as wide toward higher frequencies, because human speech has less sound representation there. The lowpass and highpass FIR filters are implemented in a cascade configuration. In order to achieve perfect reconstruction of the signal and avoid aliasing, we design the analysis and synthesis filters with the following constraints [2] in every section (see Fig. 2 for reference).

Fig. 2. Filter bank analysis (a) and synthesis (b) sections for only 2 subbands.

Fig. 3. Filter bank frequency characteristic for 5 subbands.
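The center-frequency pattern f_0, 4f_0, 10f_0, 22f_0, ... corresponds to doubling the spacing between consecutive centers (3f_0, 6f_0, 12f_0, ...). A small sketch under that reading, with hypothetical names of our own, generates the centers up to the Nyquist frequency:

```python
# Illustrative sketch (our own, not the paper's code): subband center
# frequencies f0, 4f0, 10f0, 22f0, ... obtained by doubling the spacing
# between consecutive centers, stopping below the Nyquist frequency fs/2.

def subband_centers(f0, fs=8000.0):
    centers = [f0]
    step = 3 * f0  # spacing to the next center; doubles each time
    while centers[-1] + step < fs / 2:
        centers.append(centers[-1] + step)
        step *= 2
    return centers

print(subband_centers(150.0))  # -> [150.0, 600.0, 1500.0, 3300.0]
```

For a typical male f_0 around 100-150 Hz this gives four f_0-centered subbands within telephone bandwidth, matching the multiples 1, 4, 10, 22 quoted in the text.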
The following constraint prevents distortion:

F_0(z)H_0(z) + F_1(z)H_1(z) = 2 z^{-l},    (1)

and to avoid aliasing problems we apply:

F_0(z)H_0(-z) + F_1(z)H_1(-z) = 0.    (2)

An exemplary frequency characteristic of a five-section filter bank is presented in Fig. 3. The center frequency of every subband, constructed from a pair of lowpass and highpass filters, is always around f_0 and its higher harmonics f_0, 4f_0, 10f_0, 22f_0, ... for every target speaker.

4. BLIND SOURCE SEPARATION

In order to separate the subband signals we can use any algorithm for BSS/ICA, such as the natural gradient algorithm, SOBI, JADE, RICA, etc. [3]-[16]. In this paper we use a novel algorithm for blind source extraction (BSE). Due to space limits we present here only the final algorithm, without its theoretical justification. Let us consider a single processing unit for the extraction of a voiced independent subcomponent (see Fig. 4):

y_i(k) = w_i^T x_i(k) = Σ_{j=1}^{m} w_{ij} x_{ij}(k),    (3)

ε_i(k) = y_i(k) − Σ_{p=1}^{L} b_{ip} y_i(k−p) = w_i^T x_i(k) − ỹ_i(k),    (4)

where m is the number of microphones, i = 1, 2, ..., N (N is the number of frequency bins), w_i = [w_{i1}, ..., w_{im}]^T, and ỹ_i(k) = Σ_{p=1}^{L} b_{ip} y_i(k−p) is the output of a FIR bandpass filter with suitably chosen center frequency and bandwidth. The coefficients b_{ip} are fixed. It can easily be shown that the weights of the BSE processing unit can be iteratively updated as

w_i = R̂_{x_i x_i}^{-1} R̂_{x_i ỹ_i},    w_i ← w_i / ||w_i||,    (5)

where

R̂_{x_i x_i} = (1/M) Σ_{k=1}^{M} x_i(k) x_i^T(k),    (6)

R̂_{x_i ỹ_i} = (1/M) Σ_{k=1}^{M} x_i(k) ỹ_i(k),

with M the number of available samples.

Fig. 4. Single processing unit for blind extraction of an independent voiced component (m is the number of microphones, typically m = 2).

It should be noted that by changing the central frequency of the bandpass filter we can, in general, extract different components. Moreover, using the above concept we always extract the desired independent components with higher energy, so the masking set of switches is not necessary.

Fig. 6. Result for extraction of one speaker from a mixture of three speakers recorded using only two microphones (target speaker was close to microphone #2).

5. MULTICHANNEL BLIND DECONVOLUTION/EQUALIZATION

Instead of instantaneous blind extraction of the subband signals we can apply blind deconvolution/equalization, especially if the bandwidth is relatively large. A simple model for multichannel deconvolution/equalization is shown in Fig. 8. For each subband i we perform the following processing with the update

y_i(k) = x_{i1}(k) − Σ_{p=0}^{L} b_{ip} x_{i2}(k−p),    (7)

Δb_{ip} = η y_i(k) x_{i2}(k−p),    (8)

where η is a learning rate.

6. SIGNAL RECONSTRUCTION FROM INDEPENDENT COMPONENTS

The components carrying maxima around the f_0 harmonics, obtained in the previous section, are used for reconstruction. These subband signals carry speech with enhanced target speaker information. The last part of our signal processing system performs the signal reconstruction: the inverted filter bank is applied to correctly reconstruct the target speaker's voice, avoiding aliasing of the subband filtered components. Extensive computer simulation results presented in the next section confirm the validity and performance of the proposed approach. We present results for real room recordings with natural reverberation from the walls and objects in the room.

7. EXPERIMENTS WITH SPEECH SIGNALS RECORDED IN REAL ENVIRONMENT

Fig. 5. Room recording plan.

The real room recordings were made in an empty experimental room, without carpet or any sound-absorbing elements, with many reverberations (easy to notice even during a usual conversation). We used two or three cardioid condenser boundary microphones

audio-technica PRO44, which can record sounds from a half-cardioid space. Such a configuration lets us record sounds from many directions, similarly to how a human senses with the ears. Boundary microphones make the task more complicated, because they record more reverberation from the surroundings than directional ones. The microphones were connected to a high-class microphone line amplifier and a professional 20-bit multitrack digital recording system in a PC-class computer. The system allows us to record up to 8 channels simultaneously with 20-bit resolution and a sampling frequency of 44.1 kHz.

Fig. 8. Processing unit for blind deconvolution/equalization for m = 2 microphones.

The following recordings were made using natural voices and sounds from the speakers: (i) two mixed male and female voices speaking different phrases in English; (ii) three male voices speaking different phrases in English; (iii) mixed recordings of male and female voices speaking different phrases in English; (iv) mixed human voices and natural sounds (rain, waterfall) or music. We conducted all experiments with the target speaker positioned closer to the microphones than the other sources. The scheme of our recording conditions is presented in Fig. 5.

Exemplary computer simulation results are shown in Fig. 10 and Fig. 7. In each experiment essential enhancement of the target speaker was obtained; for all performed experiments considerable speech enhancement has been achieved. Due to space limits, more details and audio demonstrations will be given during the workshop presentation.

Fig. 7. Result for extraction of one speaker from a mixture of four speakers recorded using three microphones (target speaker was close to microphone #1).

Fig. 9. Result for extraction of one speaker from a mixture of five speakers recorded using four microphones (target speaker was close to microphone #3).

8. CONCLUSIONS AND DISCUSSION

In this paper we have described a multistage subband-based system for extraction and enhancement of a speech signal corrupted by other speakers and other acoustic interference. The proposed approach can be extended to other applications, such as the extraction of biomedical signals with a reduced number of sensors. An open problem is how to extract a speaker with lower energy than the other speakers, or a speech signal with specific features, independently of how far it is from the microphones.

9. REFERENCES

[1] D. O'Shaughnessy, Speech Communication - Human and Machine, IEEE Press, New York, second edition.

[2] G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, MA, USA.

[3] S. Amari, "ICA of temporally correlated signals - learning algorithm," in Proceedings of ICA'99:

Fig. 10. Result for extraction of one speaker from a mixture of two speakers talking during heavy rain, recorded using only two microphones (target speaker was close to microphone #2).

International workshop on blind signal separation and independent component analysis, Aussois, France, Jan. 1999.

[4] S. Amari and A. Cichocki, "Adaptive blind signal processing - neural network approaches," Proceedings of the IEEE, vol. 86, no. 10, October 1998 (invited paper).

[5] A. K. Barros and A. Cichocki, "RICA - reliable and robust program for independent component analysis," report and Matlab program, Brain Science Institute RIKEN, 2-1 Hirosawa, Wako-shi, Saitama, Japan.

[6] A. Belouchrani, K. A. Meraim, and J.-F. Cardoso, "A blind source separation technique using second order statistics," IEEE Transactions on Signal Processing, vol. 45, February.

[7] A. Cichocki, R. Thawonmas, and S. Amari, "Sequential blind signal extraction in order specified by stochastic properties," Electronics Letters, vol. 33, no. 1, January.

[8] N. Delfosse and P. Loubaton, "Adaptive blind separation of independent sources: a deflation approach," Signal Processing, vol. 45.

[9] A. Hyvärinen and E. Oja, "A fast fixed-point algorithm for independent component analysis," Neural Computation, vol. 9.

[10] S. C. Douglas and S.-Y. Kung, "KuicNet algorithm for blind deconvolution," in Proceedings of the 1998 IEEE Workshop on Neural Networks for Signal Processing, New York, 1998.

[11] C. Jutten and J. Herault, "Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture," Signal Processing, vol. 24, pp. 1-20.

[12] L. Molgedey and H. G. Schuster, "Separation of a mixture of independent signals using time-delayed correlations," Physical Review Letters, vol. 72, no. 23.

[13] B. A. Pearlmutter and L. C. Parra, "Maximum likelihood blind source separation: A context-sensitive generalization of ICA," in Proceedings of NIPS'96, vol. 9, 1997.

[14] L. Tong, V. C. Soon, R. Liu, and Y. Huang, "AMUSE: a new blind identification algorithm," in Proceedings of ISCAS'90, New Orleans, LA.

[15] J. K. Tugnait, "Blind spatio-temporal equalization and impulse response estimation for MIMO channels using a Godard cost function," IEEE Transactions on Signal Processing, vol. 45, January.

[16] S. Choi and A. Cichocki, "Blind separation of nonstationary sources in noisy mixtures," Electronics Letters, vol. 36, April.

[17] A. K. Barros and N. Ohnishi, "Removal of quasi-periodic sources from physiological measurements," in Proceedings of ICA'99: International workshop on blind signal separation and independent component analysis, Aussois, France, Jan. 1999.

[18] J. Huang, K.-C. Yen, and Y. Zhao, "Subband-based adaptive decorrelation filtering for co-channel speech separation," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 4, July.

[19] A. K. Barros, H. Kawahara, A. Cichocki, S. Kojita, T. Rutkowski, M. Kawamoto, and N. Ohnishi, "Enhancement of a speech signal embedded in noisy environment using two microphones," in Proceedings of the Second International Workshop on ICA and BSS, ICA 2000, Helsinki, Finland, June 2000.


More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

Live multi-track audio recording

Live multi-track audio recording Live multi-track audio recording Joao Luiz Azevedo de Carvalho EE522 Project - Spring 2007 - University of Southern California Abstract In live multi-track audio recording, each microphone perceives sound

More information

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Alfredo Zermini, Qiuqiang Kong, Yong Xu, Mark D. Plumbley, Wenwu Wang Centre for Vision,

More information

Adaptive Filters Application of Linear Prediction

Adaptive Filters Application of Linear Prediction Adaptive Filters Application of Linear Prediction Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Electrical Engineering and Information Technology Digital Signal Processing

More information

Speech Enhancement Based On Noise Reduction

Speech Enhancement Based On Noise Reduction Speech Enhancement Based On Noise Reduction Kundan Kumar Singh Electrical Engineering Department University Of Rochester ksingh11@z.rochester.edu ABSTRACT This paper addresses the problem of signal distortion

More information

Enhanced Waveform Interpolative Coding at 4 kbps

Enhanced Waveform Interpolative Coding at 4 kbps Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 14 Quiz 04 Review 14/04/07 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Orthonormal bases and tilings of the time-frequency plane for music processing Juan M. Vuletich *

Orthonormal bases and tilings of the time-frequency plane for music processing Juan M. Vuletich * Orthonormal bases and tilings of the time-frequency plane for music processing Juan M. Vuletich * Dept. of Computer Science, University of Buenos Aires, Argentina ABSTRACT Conventional techniques for signal

More information

INSTANTANEOUS FREQUENCY ESTIMATION FOR A SINUSOIDAL SIGNAL COMBINING DESA-2 AND NOTCH FILTER. Yosuke SUGIURA, Keisuke USUKURA, Naoyuki AIKAWA

INSTANTANEOUS FREQUENCY ESTIMATION FOR A SINUSOIDAL SIGNAL COMBINING DESA-2 AND NOTCH FILTER. Yosuke SUGIURA, Keisuke USUKURA, Naoyuki AIKAWA INSTANTANEOUS FREQUENCY ESTIMATION FOR A SINUSOIDAL SIGNAL COMBINING AND NOTCH FILTER Yosuke SUGIURA, Keisuke USUKURA, Naoyuki AIKAWA Tokyo University of Science Faculty of Science and Technology ABSTRACT

More information

Single Channel Speaker Segregation using Sinusoidal Residual Modeling

Single Channel Speaker Segregation using Sinusoidal Residual Modeling NCC 2009, January 16-18, IIT Guwahati 294 Single Channel Speaker Segregation using Sinusoidal Residual Modeling Rajesh M Hegde and A. Srinivas Dept. of Electrical Engineering Indian Institute of Technology

More information

TIMIT LMS LMS. NoisyNA

TIMIT LMS LMS. NoisyNA TIMIT NoisyNA Shi NoisyNA Shi (NoisyNA) shi A ICA PI SNIR [1]. S. V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction, Second Edition, John Wiley & Sons Ltd, 2000. [2]. M. Moonen, and A.

More information

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction Human performance Reverberation

More information

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception HCS 7367 Speech Perception Dr. Peter Assmann Fall 212 Power spectrum model of masking Assumptions: Only frequencies within the passband of the auditory filter contribute to masking. Detection is based

More information

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing Project : Part 2 A second hands-on lab on Speech Processing Frequency-domain processing February 24, 217 During this lab, you will have a first contact on frequency domain analysis of speech signals. You

More information

SUPERVISED SIGNAL PROCESSING FOR SEPARATION AND INDEPENDENT GAIN CONTROL OF DIFFERENT PERCUSSION INSTRUMENTS USING A LIMITED NUMBER OF MICROPHONES

SUPERVISED SIGNAL PROCESSING FOR SEPARATION AND INDEPENDENT GAIN CONTROL OF DIFFERENT PERCUSSION INSTRUMENTS USING A LIMITED NUMBER OF MICROPHONES SUPERVISED SIGNAL PROCESSING FOR SEPARATION AND INDEPENDENT GAIN CONTROL OF DIFFERENT PERCUSSION INSTRUMENTS USING A LIMITED NUMBER OF MICROPHONES SF Minhas A Barton P Gaydecki School of Electrical and

More information

source signals seconds separateded signals seconds

source signals seconds separateded signals seconds 1 On-line Blind Source Separation of Non-Stationary Signals Lucas Parra, Clay Spence Sarno Corporation, CN-5300, Princeton, NJ 08543, lparra@sarno.com, cspence@sarno.com Abstract We have shown previously

More information

SEPARATION AND DEREVERBERATION PERFORMANCE OF FREQUENCY DOMAIN BLIND SOURCE SEPARATION. Ryo Mukai Shoko Araki Shoji Makino

SEPARATION AND DEREVERBERATION PERFORMANCE OF FREQUENCY DOMAIN BLIND SOURCE SEPARATION. Ryo Mukai Shoko Araki Shoji Makino % > SEPARATION AND DEREVERBERATION PERFORMANCE OF FREQUENCY DOMAIN BLIND SOURCE SEPARATION Ryo Mukai Shoko Araki Shoji Makino NTT Communication Science Laboratories 2-4 Hikaridai, Seika-cho, Soraku-gun,

More information

Speech Coding using Linear Prediction

Speech Coding using Linear Prediction Speech Coding using Linear Prediction Jesper Kjær Nielsen Aalborg University and Bang & Olufsen jkn@es.aau.dk September 10, 2015 1 Background Speech is generated when air is pushed from the lungs through

More information

Reducing comb filtering on different musical instruments using time delay estimation

Reducing comb filtering on different musical instruments using time delay estimation Reducing comb filtering on different musical instruments using time delay estimation Alice Clifford and Josh Reiss Queen Mary, University of London alice.clifford@eecs.qmul.ac.uk Abstract Comb filtering

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations

More information

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal

More information

Multiple Sound Sources Localization Using Energetic Analysis Method

Multiple Sound Sources Localization Using Energetic Analysis Method VOL.3, NO.4, DECEMBER 1 Multiple Sound Sources Localization Using Energetic Analysis Method Hasan Khaddour, Jiří Schimmel Department of Telecommunications FEEC, Brno University of Technology Purkyňova

More information

Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech

Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech INTERSPEECH 5 Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech M. A. Tuğtekin Turan and Engin Erzin Multimedia, Vision and Graphics Laboratory,

More information

Automotive three-microphone voice activity detector and noise-canceller

Automotive three-microphone voice activity detector and noise-canceller Res. Lett. Inf. Math. Sci., 005, Vol. 7, pp 47-55 47 Available online at http://iims.massey.ac.nz/research/letters/ Automotive three-microphone voice activity detector and noise-canceller Z. QI and T.J.MOIR

More information

KONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM

KONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM KONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM Shruthi S Prabhu 1, Nayana C G 2, Ashwini B N 3, Dr. Parameshachari B D 4 Assistant Professor, Department of Telecommunication Engineering, GSSSIETW,

More information

Introduction to Audio Watermarking Schemes

Introduction to Audio Watermarking Schemes Introduction to Audio Watermarking Schemes N. Lazic and P. Aarabi, Communication over an Acoustic Channel Using Data Hiding Techniques, IEEE Transactions on Multimedia, Vol. 8, No. 5, October 2006 Multimedia

More information

A Comparison of the Convolutive Model and Real Recording for Using in Acoustic Echo Cancellation

A Comparison of the Convolutive Model and Real Recording for Using in Acoustic Echo Cancellation A Comparison of the Convolutive Model and Real Recording for Using in Acoustic Echo Cancellation SEPTIMIU MISCHIE Faculty of Electronics and Telecommunications Politehnica University of Timisoara Vasile

More information

MINUET: MUSICAL INTERFERENCE UNMIXING ESTIMATION TECHNIQUE

MINUET: MUSICAL INTERFERENCE UNMIXING ESTIMATION TECHNIQUE MINUET: MUSICAL INTERFERENCE UNMIXING ESTIMATION TECHNIQUE Scott Rickard, Conor Fearon University College Dublin, Dublin, Ireland {scott.rickard,conor.fearon}@ee.ucd.ie Radu Balan, Justinian Rosca Siemens

More information

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis

More information

Source Separation and Echo Cancellation Using Independent Component Analysis and DWT

Source Separation and Echo Cancellation Using Independent Component Analysis and DWT Source Separation and Echo Cancellation Using Independent Component Analysis and DWT Shweta Yadav 1, Meena Chavan 2 PG Student [VLSI], Dept. of Electronics, BVDUCOEP Pune,India 1 Assistant Professor, Dept.

More information

A Method for Voiced/Unvoiced Classification of Noisy Speech by Analyzing Time-Domain Features of Spectrogram Image

A Method for Voiced/Unvoiced Classification of Noisy Speech by Analyzing Time-Domain Features of Spectrogram Image Science Journal of Circuits, Systems and Signal Processing 2017; 6(2): 11-17 http://www.sciencepublishinggroup.com/j/cssp doi: 10.11648/j.cssp.20170602.12 ISSN: 2326-9065 (Print); ISSN: 2326-9073 (Online)

More information

Filter Banks I. Prof. Dr. Gerald Schuller. Fraunhofer IDMT & Ilmenau University of Technology Ilmenau, Germany. Fraunhofer IDMT

Filter Banks I. Prof. Dr. Gerald Schuller. Fraunhofer IDMT & Ilmenau University of Technology Ilmenau, Germany. Fraunhofer IDMT Filter Banks I Prof. Dr. Gerald Schuller Fraunhofer IDMT & Ilmenau University of Technology Ilmenau, Germany 1 Structure of perceptual Audio Coders Encoder Decoder 2 Filter Banks essential element of most

More information

HUMAN speech is frequently encountered in several

HUMAN speech is frequently encountered in several 1948 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 20, NO. 7, SEPTEMBER 2012 Enhancement of Single-Channel Periodic Signals in the Time-Domain Jesper Rindom Jensen, Student Member,

More information

Speech Enhancement using Wiener filtering

Speech Enhancement using Wiener filtering Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing

More information

Neural Blind Separation for Electromagnetic Source Localization and Assessment

Neural Blind Separation for Electromagnetic Source Localization and Assessment Neural Blind Separation for Electromagnetic Source Localization and Assessment L. Albini, P. Burrascano, E. Cardelli, A. Faba, S. Fiori Department of Industrial Engineering, University of Perugia Via G.

More information

Dual Transfer Function GSC and Application to Joint Noise Reduction and Acoustic Echo Cancellation

Dual Transfer Function GSC and Application to Joint Noise Reduction and Acoustic Echo Cancellation Dual Transfer Function GSC and Application to Joint Noise Reduction and Acoustic Echo Cancellation Gal Reuven Under supervision of Sharon Gannot 1 and Israel Cohen 2 1 School of Engineering, Bar-Ilan University,

More information

Sub-band Envelope Approach to Obtain Instants of Significant Excitation in Speech

Sub-band Envelope Approach to Obtain Instants of Significant Excitation in Speech Sub-band Envelope Approach to Obtain Instants of Significant Excitation in Speech Vikram Ramesh Lakkavalli, K V Vijay Girish, A G Ramakrishnan Medical Intelligence and Language Engineering (MILE) Laboratory

More information

Speech Recognition using FIR Wiener Filter

Speech Recognition using FIR Wiener Filter Speech Recognition using FIR Wiener Filter Deepak 1, Vikas Mittal 2 1 Department of Electronics & Communication Engineering, Maharishi Markandeshwar University, Mullana (Ambala), INDIA 2 Department of

More information

Gerhard Schmidt / Tim Haulick Recent Tends for Improving Automotive Speech Enhancement Systems. Geneva, 5-7 March 2008

Gerhard Schmidt / Tim Haulick Recent Tends for Improving Automotive Speech Enhancement Systems. Geneva, 5-7 March 2008 Gerhard Schmidt / Tim Haulick Recent Tends for Improving Automotive Speech Enhancement Systems Speech Communication Channels in a Vehicle 2 Into the vehicle Within the vehicle Out of the vehicle Speech

More information

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping Structure of Speech Physical acoustics Time-domain representation Frequency domain representation Sound shaping Speech acoustics Source-Filter Theory Speech Source characteristics Speech Filter characteristics

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing SGN 14006 Audio and Speech Processing Introduction 1 Course goals Introduction 2! Learn basics of audio signal processing Basic operations and their underlying ideas and principles Give basic skills although

More information

Perceptual Speech Enhancement Using Multi_band Spectral Attenuation Filter

Perceptual Speech Enhancement Using Multi_band Spectral Attenuation Filter Perceptual Speech Enhancement Using Multi_band Spectral Attenuation Filter Sana Alaya, Novlène Zoghlami and Zied Lachiri Signal, Image and Information Technology Laboratory National Engineering School

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Architectural Acoustics Session 1pAAa: Advanced Analysis of Room Acoustics:

More information

ROOM AND CONCERT HALL ACOUSTICS MEASUREMENTS USING ARRAYS OF CAMERAS AND MICROPHONES

ROOM AND CONCERT HALL ACOUSTICS MEASUREMENTS USING ARRAYS OF CAMERAS AND MICROPHONES ROOM AND CONCERT HALL ACOUSTICS The perception of sound by human listeners in a listening space, such as a room or a concert hall is a complicated function of the type of source sound (speech, oration,

More information

REpeating Pattern Extraction Technique (REPET)

REpeating Pattern Extraction Technique (REPET) REpeating Pattern Extraction Technique (REPET) EECS 32: Machine Perception of Music & Audio Zafar RAFII, Spring 22 Repetition Repetition is a fundamental element in generating and perceiving structure

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb 2009. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb 2008. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum,

More information

Almost Perfect Reconstruction Filter Bank for Non-redundant, Approximately Shift-Invariant, Complex Wavelet Transforms

Almost Perfect Reconstruction Filter Bank for Non-redundant, Approximately Shift-Invariant, Complex Wavelet Transforms Journal of Wavelet Theory and Applications. ISSN 973-6336 Volume 2, Number (28), pp. 4 Research India Publications http://www.ripublication.com/jwta.htm Almost Perfect Reconstruction Filter Bank for Non-redundant,

More information

Psychology of Language

Psychology of Language PSYCH 150 / LIN 155 UCI COGNITIVE SCIENCES syn lab Psychology of Language Prof. Jon Sprouse 01.10.13: The Mental Representation of Speech Sounds 1 A logical organization For clarity s sake, we ll organize

More information

A102 Signals and Systems for Hearing and Speech: Final exam answers

A102 Signals and Systems for Hearing and Speech: Final exam answers A12 Signals and Systems for Hearing and Speech: Final exam answers 1) Take two sinusoids of 4 khz, both with a phase of. One has a peak level of.8 Pa while the other has a peak level of. Pa. Draw the spectrum

More information

COM325 Computer Speech and Hearing

COM325 Computer Speech and Hearing COM325 Computer Speech and Hearing Part III : Theories and Models of Pitch Perception Dr. Guy Brown Room 145 Regent Court Department of Computer Science University of Sheffield Email: g.brown@dcs.shef.ac.uk

More information

Lecture 14: Source Separation

Lecture 14: Source Separation ELEN E896 MUSIC SIGNAL PROCESSING Lecture 1: Source Separation 1. Sources, Mixtures, & Perception. Spatial Filtering 3. Time-Frequency Masking. Model-Based Separation Dan Ellis Dept. Electrical Engineering,

More information

IN a natural environment, speech often occurs simultaneously. Monaural Speech Segregation Based on Pitch Tracking and Amplitude Modulation

IN a natural environment, speech often occurs simultaneously. Monaural Speech Segregation Based on Pitch Tracking and Amplitude Modulation IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 15, NO. 5, SEPTEMBER 2004 1135 Monaural Speech Segregation Based on Pitch Tracking and Amplitude Modulation Guoning Hu and DeLiang Wang, Fellow, IEEE Abstract

More information

REAL-TIME BROADBAND NOISE REDUCTION

REAL-TIME BROADBAND NOISE REDUCTION REAL-TIME BROADBAND NOISE REDUCTION Robert Hoeldrich and Markus Lorber Institute of Electronic Music Graz Jakoministrasse 3-5, A-8010 Graz, Austria email: robert.hoeldrich@mhsg.ac.at Abstract A real-time

More information

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2 Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter

More information

The effects of the excitation source directivity on some room acoustic descriptors obtained from impulse response measurements

The effects of the excitation source directivity on some room acoustic descriptors obtained from impulse response measurements PROCEEDINGS of the 22 nd International Congress on Acoustics Challenges and Solutions in Acoustical Measurements and Design: Paper ICA2016-484 The effects of the excitation source directivity on some room

More information

ESE531 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Signal Processing

ESE531 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Signal Processing University of Pennsylvania Department of Electrical and System Engineering Digital Signal Processing ESE531, Spring 2017 Final Project: Audio Equalization Wednesday, Apr. 5 Due: Tuesday, April 25th, 11:59pm

More information

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991 RASTA-PLP SPEECH ANALYSIS Hynek Hermansky Nelson Morgan y Aruna Bayya Phil Kohn y TR-91-069 December 1991 Abstract Most speech parameter estimation techniques are easily inuenced by the frequency response

More information

Design and Implementation on a Sub-band based Acoustic Echo Cancellation Approach

Design and Implementation on a Sub-band based Acoustic Echo Cancellation Approach Vol., No. 6, 0 Design and Implementation on a Sub-band based Acoustic Echo Cancellation Approach Zhixin Chen ILX Lightwave Corporation Bozeman, Montana, USA chen.zhixin.mt@gmail.com Abstract This paper

More information

McGraw-Hill Irwin DIGITAL SIGNAL PROCESSING. A Computer-Based Approach. Second Edition. Sanjit K. Mitra

McGraw-Hill Irwin DIGITAL SIGNAL PROCESSING. A Computer-Based Approach. Second Edition. Sanjit K. Mitra DIGITAL SIGNAL PROCESSING A Computer-Based Approach Second Edition Sanjit K. Mitra Department of Electrical and Computer Engineering University of California, Santa Barbara Jurgen - Knorr- Kbliothek Spende

More information