Speaker Isolation in a Cocktail-Party Setting


M.K. Alisdairi
Columbia University
M.S. Candidate, Electrical Engineering
Spring

Abstract - The human auditory system is capable of performing many interesting tasks, several of which could find useful applications in engineering settings. One such capability is the ability to perceptually separate sound sources, allowing a listener to focus on a single speaker in a noisy environment. This effect is often referred to as the cocktail-party effect (in reference to a cocktail-party environment where several simultaneous conversations are taking place in the background) or as Auditory Scene Analysis. This paper introduces two methodologies for isolating a desired speaker's audio stream from a binaural recording of multiple speakers in conversation. An implementation of a system for speaker isolation based on one of these methods is also presented. Note that some of the graphics presented in this document are best viewed in a color format; for the electronic version please visit

INTRODUCTION

Systems capable of performing Auditory Scene Analysis (ASA) [10] could find numerous useful applications. The most evident application is as a front-end for speech recognition systems. The development of systems capable of ASA could provide improvements in speech recognition in unconstrained auditory environments [7][9].

Figure 1 - Speaker Extraction System as a Front-End to Voice Recognition (Noisy Environment -> Sound Input -> Speaker Extraction System -> Voice Recognition System)

Another possible use for an ASA-capable system could be in theatrical/movie settings as a substitute for wireless microphones. In such instances sound engineers could have a versatile means of controlling audio quality without the physical imposition of hardware on the speaker's person.

This paper discusses two methodologies for speech extraction. The first method is based on the Interaural Intensity Difference, and the second on the Time Difference of Arrival. After the preliminary discussion an implementation of the TDOA-based method is presented. Analysis and implementation are carried out on sound recordings from the ShATR corpus.

The ShATR Corpus

The sound files used were taken from the ShATR corpus of dummy-head recordings. The recordings are of five speakers (Guy, Martin, Phil, Inge Marie, and Malcolm) oriented around the dummy head. Included in the ShATR corpus are two files; one file contains a recording of each of the five speakers introducing themselves, the other a recording of the five carrying on in conversation. These two files are of primary interest in this document.

THEORETICAL BACKGROUND

THE HRTF

The theoretical undergirding of speaker isolation is the fact that the left and right channels traverse different paths, resulting in differing filtering for each channel. The following figure and equations illustrate the concept.

Figure 2 - Depiction of the Path-Dependent Filtering Effect (sound source, dummy head, and the left and right ear channels)

    Y_Left(ω, t) = H_Left(ω, φ, θ) X(ω, t)      (Equation 1)
    Y_Right(ω, t) = H_Right(ω, φ, θ) X(ω, t)    (Equation 2)

Where:

    Y_Left, Y_Right - The signals received by the left and right ears.
    H_Left, H_Right - The impulse responses of the paths to the left and right ears.
    X(ω, t) - The original speech signal.

The two transfer functions H_Left and H_Right are called the Head-Related Transfer Functions (HRTF) and are functions of position as well as frequency (note the HRTF is more precisely a function of frequency ω, azimuth φ, and elevation θ) [5].

It is averred that by decomposing a multi-speaker signal into several time/frequency (TF) cells and weighting each cell appropriately, a desired speaker's speech data may be extracted from an aggregate of speakers [3]. The localization cues that are implanted in the received signals by the HRTF may be used to determine the appropriate weightings for each of the TF cells.

Figure 3 - The TF Cell Concept (sample spectrogram, frequency vs. time, with a single TF cell highlighted)

There are two methods which may be implemented towards the goal of categorizing and weighting the TF cells appropriately: the interaural intensity difference (IID) method and the time delay of arrival (TDOA) method.

The Inter-aural Intensity Difference

The IID method of cell-weight estimation is based on intensity differentials as a function of frequency [6]. For example, a sound originating from a source in the first quadrant of the figure below will be detected by the left ear as a signal that has undergone a low-pass filtering effect. The low-pass effect is a result of the shadowing of high-frequency components by the head. On the other hand, low frequencies are able to wrap around the head with little attenuation.

Figure 4 - Sound Source Plane (head at the origin with the first and second quadrants marked)

The interaural intensity difference may be obtained by taking the ratio of the left and right channel magnitudes in the frequency domain (or, correspondingly, taking the difference between the frequency-domain magnitudes in dB). The math follows:

    iid(ω, t) = log( |Y_Left(ω, t)| / |Y_Right(ω, t)| )
              = log( |H_Left(ω, φ, θ)| / |H_Right(ω, φ, θ)| )
              = log|Y_Left(ω, t)| - log|Y_Right(ω, t)|      (Equation 3)
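As a concrete illustration of Equation 3, the short sketch below computes an IID value for every TF cell of a stereo signal. It is only a minimal sketch: the variables yl and yr (left- and right-ear signals) and the 256-sample window are assumptions, and MATLAB's built-in spectrogram() stands in for the stft() routine listed in Appendix A.

    % Minimal IID sketch (Equation 3); yl/yr are assumed left/right channels.
    win = 256;                                       % illustrative window length
    YL = spectrogram(yl, hamming(win), win/2, win);  % left-channel TF cells
    YR = spectrogram(yr, hamming(win), win/2, win);  % right-channel TF cells
    iid = 20*log10(abs(YL) ./ (abs(YR) + eps));      % IID in dB per TF cell

Cells with large positive iid values would be attributed to sources on the left of the head, and vice versa.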

By comparing incoming speech data to predetermined categorical information, the previously mentioned TF cells may be classified appropriately [4].

The Time Delay of Arrival

The distance differential between the propagation paths for the right and left ears causes a phase distortion between the two channels. This phase distortion may be used as a localization cue. As with the IID method, these localization cues may be used to derive the necessary weighting information for each TF cell. Theoretically, extraction of the desired time-delay information may be accomplished through direct analysis of the phase components as follows:

    itd(ω, t) = arg{ Y_Left(ω, t) Y_Right*(ω, t) }      (Equation 4)

Despite the plausibility of the above approach, it is avoided due to difficulties resulting from the nonlinearity of the phase functions. An alternative means towards the same end is the use of the cross-correlation function. The TDOA is obtained by retaining the lag index of maximum cross-correlation between the left and right channels.

Figure 5 - Determining the TDOA by Cross-Correlation (correlation of the left and right channels vs. correlation index)

IMPLEMENTATION

This paper concentrates on the TDOA method of speaker isolation. The first step is the analysis of the introduction sound samples from the ShATR corpus to develop a source model that describes each speaker's localization cues. The second step involves use of the localization cues extracted by the previously attained models to determine the appropriate weighting for each TF cell.

Analysis - Source Models

The first step in building the speech isolation system is to study the behavior of the TDOA as a speaker (position) dependent feature. This analysis was conducted on the speech samples of each speaker's introduction, such as Inge Marie's depicted below.

Figure 6 - Inge's Introduction Sample (spectrogram of Inge Marie on microphone four; frequency vs. time)

The sound sample is first passed through a Bark-scaled filter bank and each band-limited output is broken into short time windows. The left and right channel windows are then cross-correlated, and the index of the maximum cross-correlation is retained for each time window. The filter bank is a four-channel filter bank; the justification for this selection is based on the results given in [3], which show a maximization of the SNR for four frequency bands. Initially, the time window length was based solely on calculations of the maximum possible lag index. However, [3] also shows a maximization of SNR for a window length of 256 samples, thus a window length of 256 samples was used.

The following figure depicts a histogram of the lag indexes that result from the process described above:

Figure 7 - Histogram of Lag Index of Maximum Correlation

As the above histograms show, the lag indexes of maximum cross-correlation appear to be normally distributed.
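The per-window TDOA estimation just described can be summarized in a few lines. The following is a minimal sketch, assuming bl and br hold one band-limited left/right pair (e.g., one band of the bands() output in Appendix A) and win is the 256-sample window length; it also fits the Gaussian source model used below.

    % Minimal per-window TDOA sketch; bl/br are an assumed band-limited pair.
    nwin = floor(length(bl)/win);
    lag  = zeros(1,nwin);
    for k = 1:nwin
        seg = (k-1)*win + (1:win);
        c = xcorr(bl(seg), br(seg));     % cross-correlation, length 2*win-1
        [cmax, imax] = max(c);           % peak of the cross-correlation
        lag(k) = imax - win;             % lag index of maximum correlation
    end
    mu = mean(lag); sigma = std(lag);    % Gaussian source model for this band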

The appropriate mean and standard deviation are then manually extracted for each speaker. The results for Guy and Inge Marie follow:

Speaker: Guy

    Channel    Mean (µ)    Standard Deviation (σ)
    One        -8
    Two        -6.5
    Three
    Four       5

Speaker: Inge Marie

    Channel    Mean (µ)    Standard Deviation (σ)
    One
    Two
    Three      7.5
    Four

The weight of a particular TF cell as a function of lag index (i) is as follows:

    w_ch(i) = e^( -(i - µ_ch)² / (2σ_ch²) )      (Equation 5)

Speech Isolation System

Once the distributions of the lag indexes for each speaker have been determined, the sound file containing simultaneous speech may be analyzed and a desired speaker extracted. The simultaneous speech sample is first passed through the previously mentioned filter bank. Each of the four band-limited signals is then broken into time windows of length 256 samples. The left and right channels are then cross-correlated for each time frame, and the index of maximum correlation is retained. The sequence of lag indexes is then compared to the desired speaker's lag-index distribution model, and a weight corresponding to the likelihood of each TF cell belonging to the desired speaker is used as that cell's weight.

Figure 8 - Block Diagram of Speech Isolation System (Speech Data -> Filter Bank -> TDOA Analysis -> Weighting Based on TDOA Model -> Isolated Speech, with the Desired Speaker selecting the model)

RESULTS

The above methodology of speech extraction was successful in isolating the desired speaker's signal. The following is a spectrogram of a sample of cocktail-party speech and the resulting extraction of Guy's stream.

Figure 9 - Extracting Guy's Speech Signal

Although the system performed the desired task of extracting a single speech track from the sequence, the weighting process introduced some undesirable noise. It was postulated that the source of the noise was the discontinuity of the weighting matrix over time, and that the problem could be ameliorated by filtering the weight matrix. The following figure illustrates short sections of the four weight sequences before and after filtering.

Figure 10 - Illustration of the Weight Matrix Filtering Process (original vs. filtered weight sequences; weight vs. time)
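The weighting and smoothing steps reduce to a few operations per band. The sketch below applies Equation 5 to one band's lag-index sequence and then low-pass filters the resulting weight sequence; since the paper does not specify the smoothing filter, the five-point moving average used here is purely an assumption.

    % Minimal weighting/smoothing sketch for one band; lagidx, mu, and sigma
    % are assumed: the band's lag-index sequence and its speaker model.
    w = exp(-(lagidx - mu).^2 ./ (2*sigma^2));   % TF-cell weights (Equation 5)
    h = ones(1,5)/5;                             % assumed 5-point moving average
    wsmooth = filter(h, 1, w);                   % smoothed weight sequence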

Reconstruction of the desired track using the newly filtered weighting matrix resulted in the desired improvement in quality. The following figure illustrates the reconstructed signals for Guy with and without weight matrix filtering.

Figure 11 - Before and After Filtering the Weighting Function

CONCLUDING REMARKS

The design presented in this paper illustrates a simple implementation of a speech extraction system and establishes the feasibility of such a system. This design was successful in achieving the objective of extracting a single speaker's track from a group recording. Included in the appendices are the code for the implementation as well as larger spectrogram depictions of the results.

Despite the successes presented in this paper, there exists room for enhancements in future work. For example, the source model in this implementation was obtained manually. Automation of the source model would allow the system to behave in a more versatile manner, possibly allowing the relaxation of the a priori assumption that the speakers' positions remain constant. A second potential improvement could be the incorporation of a broader feature set, possibly including the IID in addition to the TDOA.

REFERENCES

[1] S. Mitra, Digital Signal Processing: A Computer-Based Approach, McGraw-Hill/Irwin.
[2] R. Ziemer and W. Tranter, Signals and Systems: Continuous and Discrete, Prentice Hall, 1998.
[3] E. Tessier and F. Berthommier, "Speech Enhancement and Segregation Based on the Localisation Cue for Cocktail-Party Processing."
[4] W. Chau and R. Duda, "Combined Monaural and Binaural Localization of Sound Sources," IEEE Proceedings of ASILOMAR.
[5] R. Duda, "Modeling Head Related Transfer Functions," IEEE Proceedings of ASILOMAR, 1993.
[6] K. Martin, "Estimating Azimuth and Elevation from Interaural Differences."
[7] S. Choi, H. Glotin, F. Berthommier, and E. Tessier, "A CASA Front-End Using the Localization Cue for Segregation and then Cocktail-Party Speech Recognition."
[8] F. Berthommier and S. Choi, "Evaluation of CASA and BSS Models for Subband Cocktail-Party Speech Separation."
[9] D. Wang and G. Brown, "Separation of Speech from Interfering Sounds Based on Oscillatory Correlation," IEEE Transactions on Neural Networks, Vol. 10, No. 3, May 1999.
[10] A. Bregman, Auditory Scene Analysis, Cambridge, MA: MIT Press, 1990.

APPENDIX A: MATLAB CODE

function banddat = bands(dd,nb)
% M.K. Alisdairi, Spring
% b = bands(ysound,number_bands)
% Given original data 'dd', bands() returns an nb-by-length(dd)-by-2 matrix
% containing versions of 'dd' bandlimited according to the Bark scale.
% nb is the number of bands used. Note nb may be {8 4 2 1}.

if(nb~=1 & nb~=2 & nb~=4 & nb~=8)            % Make sure it fits the criteria
    clc
    disp(sprintf('Error: nb must be 8, 4, 2, or 1'))
    banddat = [];
    return
end

home
disp(sprintf('Working...'))

N = 2^9;                                     % Number of frequency points for the STFT

% Separate into right and left channels
d_left  = dd(:,1)';
d_right = dd(:,2)';

% Calculate the STFTs
DL = stft(d_left,N,N,N/2);
DR = stft(d_right,N,N,N/2);

% Define the Bark scaled windows in Hz, then convert to FFT index (w).
% Note: eight channels max.
F = [ ];
w = floor(N*F./48000)+1;

banddat = zeros(nb,((size(DL,2)+1)*N/2),2);  % Initializing is faster

% Go through and produce the proper bandlimited signals
inc = 8/nb;
for(i=1:inc:8)
    FtempL = zeros(size(DL));
    FtempL((w(i)):((w(i+inc)-1)),:) = DL((w(i)):((w(i+inc)-1)),:);
    FtempR = zeros(size(DR));
    FtempR((w(i)):((w(i+inc)-1)),:) = DR((w(i)):((w(i+inc)-1)),:);
    banddat(ceil(i/inc),:,1) = istft(FtempL,N,N,N/2);    % Forward the data
    banddat(ceil(i/inc),:,2) = istft(FtempR,N,N,N/2);
end

function [ii,yy,c] = tdoa(b,win,ch)
% M.K. Alisdairi, Spring
% [i,y,c] = tdoa(band_data,window_length,channels)
% Function accepts matrix 'b' (produced by bands()), which contains nb
% channels of bandlimited audio data. The function will conduct a cross
% correlation of left and right time windows with length 'win'.

bl = b(:,:,1);                       % Break data into left and right channels
br = b(:,:,2);
nb   = size(b,1);                    % Number of bands
stop = floor(size(b,2)/win);         % Number of windows

c = NaN*ones(2*win-1,stop);          % xcorr matrix
for k=1:length(ch)
    j = ch(k);
    for i=1:win:(stop*win)
        c(:,ceil(i/win)) = xcorr(bl(j,i:(i+win-1)),br(j,i:(i+win-1)))';
    end
    home
    disp(size(c))
    [y,i] = max(c);                  % Determine the TDOA for each frame
    yy(k,:) = y;                     % Forward actual xcorr
    ii(k,:) = i-win;                 % Forward lag indexes
end

function [ys,wgt] = extract(b,ii,person,win)
% M.K. Alisdairi, Spring
% [ys,wgt] = extract(band_data,lag_indexes,desired_person,window_size)
% This function extracts a desired voice from the sound data in b.
% If person == 1 then extract Guy. If == 2 then extract Inge.

u = [ ; ];        % Row 1: means for Guy; row 2: means for Inge Marie
s = [.5 5; 4 3.5];% Row 1: std. devs for Guy; row 2: std. devs for Inge Marie

u = u(person,:);                     % Take the correct data
s = s(person,:);

w   = zeros(1,length(ii)*win);       % Initialize short weight vector
len = min(length(b),length(w));
ys  = zeros(2,len);                  % Initialize place for new data
wgt = zeros(size(ii,1),length(w));   % Initialize actual weight matrix

for ch=1:size(ii,1)                  % Look at all channels
    wf = zeros(1,length(ii));
    wf = exp(-(ii(ch,:)-u(ch)).^2/(2*s(ch)^2));  % Weight based on the lag index
    w(:) = wf(ceil((1:length(w))/win));          % Convert to long vector
    wgt(ch,:) = w;                               % Forward the info
    ys(1,:) = ys(1,:)+b(ch,1:len,1).*w(1:len);   % Calculate the extracted voice
    ys(2,:) = ys(2,:)+b(ch,1:len,2).*w(1:len);   % Calculate the other channel
end
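For orientation, the sketch below shows how the three routines above might chain together into the system of Figure 8. It is a hedged usage example, not part of the original listing: the stereo matrix dd, the sampling rate fs, and the four-band, 256-sample configuration are assumptions taken from the text.

    % Assumed end-to-end usage of the Appendix A routines (cf. Figure 8).
    b  = bands(dd, 4);                    % Bark-scaled four-band decomposition
    ii = tdoa(b, 256, 1:4);               % lag index of max cross-correlation per window
    [ys, wgt] = extract(b, ii, 1, 256);   % weight TF cells against Guy's model
    soundsc(ys', fs);                     % audition the isolated stereo track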

APPENDIX B: ENLARGED SPECTROGRAMS

