THE MATLAB IMPLEMENTATION OF BINAURAL PROCESSING MODEL SIMULATING LATERAL POSITION OF TONES WITH INTERAURAL TIME DIFFERENCES


J. Bouše, V. Vencovský
Department of Radioelectronics, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic

Abstract

The implementation of a binaural auditory model able to reflect the lateral position of tones with interaural time differences is presented. The model is composed of two parts: a monaural processing model adapted from Dau [2] and a binaural processing model designed by the authors. The binaural processing model simulates the medial superior olive (MSO), a part of the human brain stem which is claimed to be responsible for coding the temporal differences between the signals in the two ears [1]. The designed model does not use the most widely used framework for binaural models, the Jeffress delay line; instead, an own approach inspired by Grothe's paper [3] is implemented. The output of the model was compared with listening tests taken from the literature [8]. The results show that the presented model is able to reliably reflect the data.

1 Introduction

The human hearing system allows us to localize sound sources in space. This ability is connected with the presence of two ears on our head. Since each ear lies on a contralateral side of the head, the signals at their inputs coming from an outside sound source generally differ in time (figure 1) and intensity. These cues are called interaural level differences (ILD) and interaural time differences (ITD). The decoding of spatial information based on the ITD is believed to be processed at the first junction of the ipsilateral and contralateral nerve paths, in the medial superior olive (MSO), a part of the human superior olivary complex in the brainstem [1]. During laboratory measurements, when the subject listens to a signal with an ITD through headphones, the sound image is perceived within the head. Increasing the ITD shifts the sound image intracranially towards one ear. This phenomenon is called lateralization [1]. This paper describes the implementation of an MSO model simulating the perception of lateralization caused by the ITD when listening through headphones.

Many different binaural models able to detect the ITD have been developed over the last years [7]. Most of them use as a framework the delay line proposed by Jeffress [5]. However, there is no evidence for the existence of such a structure in the mammalian superior olivary complex. Due to this fact, the Jeffress delay line is not used in the proposed model; instead, a realization inspired by the neurophysiological findings described in Grothe's paper [3] is used.

Figure 1: The differences in sound propagation paths resulting in the ITD

Grothe, in the cited paper, found evidence of inhibitory synapses in the mammalian MSO coming from the contralateral ear through auditory nerves with a very short transmission time. It means that spikes transferred by these nerves should reach the MSO earlier than the neural signals coming from both the ipsilateral and contralateral ear to the excitatory inputs of the MSO. He claimed that these contralaterally time-shifted inhibitory inputs are essential for the detection of the ITD by the mammalian auditory system.
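For illustration, a headphone stimulus carrying a given ITD can be generated as two copies of the same tone, one of them delayed. The following MATLAB lines are only a minimal sketch; the sampling frequency, tone frequency, duration and ITD value are assumed for the example and are not taken from the experiments discussed later.

    fs  = 44100;                    % sampling frequency [Hz] (assumed)
    f   = 500;                      % tone frequency [Hz] (assumed)
    itd = 500e-6;                   % interaural time difference [s] (assumed)
    t   = (0:1/fs:0.5).';           % 0.5 s time axis
    left  = sin(2*pi*f*t);          % tone to the left ear
    right = sin(2*pi*f*(t - itd));  % same tone delayed by the ITD (left ear leads)
    ipd = 360*f*itd;                % corresponding interaural phase difference [deg], here 90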

2 Model

The proposed model is composed of two main stages (figure 2). The first is a monaural auditory model adapted from Dau's paper [2]. The second part is a binaural model designed and implemented by the authors of this paper.

Figure 2: The model diagram

2.1 Monaural processing stage

The monaural model represents the input stage for the analyzed binaural signal. Since humans have two ears, there are two monaural parts in the overall model, each of them representing one ear. This model consists of the outer- and middle-ear, cochlear frequency selectivity, inner hair cell, and adaptation loops. The model was taken from Dau's paper [2]. The outer and middle ear are modeled as a linear system with a given frequency response by a cascade of FIR filters (5, 5 and 51th order). The FIR filters were designed to reflect the experimentally measured amplitude transfer functions of the outer and middle ear taken from the papers [4, 6]. The second part of the monaural processing stage is cochlear frequency selectivity. This stage consists of a 4th-order gammatone filterbank which divides the input signal into 3 frequency bands according to the equivalent rectangular bandwidth (ERB) function. The bandwidth of each gammatone filter is set equal to the ERB value of the corresponding center frequency [2]. The minimal and maximal frequencies of the filter bank were chosen to be 1 Hz and 15.5 kHz. The signal in each band is then processed by the inner hair cell model and the adaptation loops. The inner hair cell model consists of half-wave rectification followed by a 1st-order low-pass IIR filter with a cut-off frequency of 1 kHz. This process roughly simulates the transformation of the basilar membrane vibration into the membrane potential inside the inner hair cell. If the output signal of this stage is lower than a certain threshold value, it is replaced by this threshold value. This simulates the absolute threshold of the human auditory system. The next part of the model are the adaptation feedback loops. Five consecutive, non-linear feedback loops with time constants of 5 to 500 ms model the temporal masking and adaptation. The input-output function of the loops approximates a logarithmic compression. It is non-linear for stationary input signals and shows enhancement for rapid signal fluctuations (faster than the time constants in the feedback loops) [2].
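As an illustration of the inner hair cell stage described above, the following MATLAB sketch processes one filterbank channel by half-wave rectification, a 1st-order low-pass filter with a 1 kHz cut-off, and a lower limit simulating the absolute threshold. The function name, the filter realization and the threshold value are assumptions made for this example and are not taken from the published implementation.

    function y = ihc_stage(x, fs)
        % x ... signal in one gammatone channel, fs ... sampling frequency [Hz]
        thr = 1e-5;                     % assumed absolute-threshold value
        x   = max(x, 0);                % half-wave rectification
        fc  = 1000;                     % low-pass cut-off frequency [Hz]
        a   = exp(-2*pi*fc/fs);         % 1st-order IIR low-pass coefficient
        y   = filter(1 - a, [1 -a], x); % smooths the rectified signal
        y   = max(y, thr);              % values below the threshold are replaced by it
    end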

2.2 Binaural processing stage

The binaural processing stage diagram can be seen in figure 3. The input signals from both ears are first filtered by a first-order IIR filter (fc = 3 Hz). This roughly simulates the loss of synchronization of the nerve cells at higher frequencies. It is assumed that the output of the MSO is not affected by the level or the difference in level of the input signals [1]. To eliminate the influence of the signal level, the signal is normalized right after the half-wave rectification.

Figure 3: The binaural processing model diagram

The normalization is done by dividing the input signal by its envelope, which is extracted from the signal by the equation

    env(n, b) = max(filt(n, b); orig(n, b)),   (1)

where env is the envelope of the signal, filt is the input signal filtered by a first-order IIR filter with an experimentally set time constant equal to 5 ms, orig is the input signal, n is the sample number and b is the number of the ERB channel. The extraction of the envelope is shown in figure 4.

Figure 4: The extraction of the envelope from the signal (input signal and its envelope; amplitude versus time in ms)

The signals from both ears are then processed in two calculation units, each of them with 3 inputs. Two of the inputs are delayed signals representing the signals from the ipsilateral and contralateral ear coming to the MSO. The delay was experimentally set to 1 µs. The third input is not delayed and represents the inhibitory signal from the contralateral ear. This design was inspired by Grothe's paper [3]. In the calculation units, the following mathematical processing is done:

    MSO(n, b) = Ip(n, b) Con(n, b) CoInh(n, b) Ip(n, b) Con(n, b),   (2)

where Ip and Con are the delayed signals from the ipsilateral and contralateral ear respectively, CoInh is the contralateral inhibitory signal without delay, n is the sample number and b is the number of the ERB channel. The MSO signal from both hemispheres is then again half-wave rectified, and the average value in each of the 3 channels is computed.
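To make the envelope normalization of equation (1) concrete, the sketch below follows one ERB channel in MATLAB. The function name, the one-pole realization of the smoothing filter and the small constant guarding against division by zero are assumptions made for this example.

    function y = normalize_by_envelope(x, fs)
        % x ... half-wave rectified signal in one ERB channel, fs ... sampling frequency [Hz]
        tau  = 5e-3;                     % experimentally set time constant of 5 ms
        a    = exp(-1/(tau*fs));         % one-pole IIR smoothing coefficient (assumed form)
        filt = filter(1 - a, [1 -a], x); % low-pass filtered copy of the input
        env  = max(filt, x);             % equation (1): env(n,b) = max(filt(n,b), orig(n,b))
        y    = x ./ max(env, eps);       % divide by the envelope; eps avoids division by zero
    end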

The signal is then processed by a cognitive model in order to obtain data comparable to the lateralization experiments:

    L(b) = 1 - r1(b), if the left ear is leading,
           1 - r2(b), if the right ear is leading,   (3)

where

    r1(b) = MSO_l(b) / MSO_r(b),   (4)
    r2(b) = MSO_r(b) / MSO_l(b),   (5)

and MSO_r and MSO_l are the averaged signals from the right and left ear respectively. The obtained scale ranges from 0 to 1, where 0 represents the perception of the sound in the middle of the head and 1 near the analyzed ear. This processing is shown in figure 5.

Figure 5: The output from the MSO calculation units together with the lateralization for a phase shift equal to 5 (left and right MSO outputs, their averaged values, and the resulting lateralization as a function of sample number)
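The cognitive stage of equations (3)-(5) can be sketched in MATLAB as follows, reading r1 and r2 as ratios of the averaged MSO outputs as reconstructed above; the function name and the explicit leading-ear flag are assumptions made for this example.

    function L = lateralization_scale(MSOl, MSOr, left_leading)
        % MSOl, MSOr ... averaged MSO outputs per ERB channel
        % left_leading ... true if the left-ear signal leads
        if left_leading
            r = MSOl ./ MSOr;   % equation (4)
        else
            r = MSOr ./ MSOl;   % equation (5)
        end
        L = 1 - r;              % equation (3): 0 = middle of the head, 1 = at the analyzed ear
    end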

3 Results

Figure 6: Lateral position of the auditory event as a function of the interaural phase difference of the tone, for (a) 200 Hz, (b) 500 Hz, (c) 750 Hz and (d) 1 kHz. The red line represents the psychophysical data [8], the blue line the data obtained by the auditory model.

Since many lateralization experiments using signals with different ITDs have already been done and presented in the literature by other authors (see [1]), they were used in this paper as a comparison for the modelled results. In particular, the data obtained by Yost [8], who measured the lateralization of interaurally phase-shifted sinusoids, were used. The sinusoids were interaurally shifted from -180 to 180 degrees. The results were obtained for 50 dB SPL pure tones of four different frequencies: 200 Hz, 500 Hz, 750 Hz and 1 kHz. They can be seen in figure 6. Only the mean values of the psychophysical data are shown; since their variance is quite high, the modelled data lie in all cases inside this variance.

4 Conclusion

A binaural auditory model suitable for simulating the lateral position of sound sources via the ITD was designed and implemented in Matlab. The model was able to simulate the psychophysical lateralization experiment with tones. In contrast to the most often used binaural models, this model does not use Jeffress's delay line [5]. Instead, the time-shifted contralateral inhibition theory presented by Grothe [3] is applied. The presented results show good agreement between simulations and experiments. The presented model can thus serve as support for the findings of Grothe's paper [3]. Compared with models using Jeffress's delay line, its advantage is also in lower demands on computational power.

Acknowledgement

This work was supported by the Grant Agency of the Czech Technical University in Prague, grant No. SGS11/159/OHK3/3T/13.

References

[1] Jens Blauert and John S. Allen. Spatial Hearing: The Psychophysics of Human Sound Localization. MIT Press, Cambridge, rev. edition, 1997.
[2] Torsten Dau, Birger Kollmeier, and Armin Kohlrausch. A quantitative model of the effective signal processing in the auditory system. I. Model structure. Journal of the Acoustical Society of America, 99:3615-3622, 1996.
[3] B. Grothe. New roles for synaptic inhibition in sound localization. Nature Reviews Neuroscience, 4(7):540-550, July 2003.
[4] D. Hammershøi and H. Møller. Sound transmission to and within the human ear canal. Journal of the Acoustical Society of America, 100:408-427, July 1996.
[5] Lloyd A. Jeffress. A place theory of sound localization. Journal of Comparative and Physiological Psychology, 41:35-39, 1948.
[6] M. Kringlebotn and T. Gundersen. Frequency characteristics of the middle ear. Journal of the Acoustical Society of America, 77(1):159-164, 1985.
[7] R. Meddis. Computational Models of the Auditory System. Springer Handbook of Auditory Research, vol. 35. Springer US, 2010.
[8] W. A. Yost. Lateral position of sinusoids presented with interaural intensive and temporal differences. Journal of the Acoustical Society of America, 70(2):397-409, 1981.

Jaroslav Bouše
Katedra radioelektroniky, FEL ČVUT v Praze, Technická 2, 166 27 Praha 6
tel. 35 19, e-mail: bousejar@fel.cvut.cz

Václav Vencovský
Katedra radioelektroniky, FEL ČVUT v Praze, Technická 2, 166 27 Praha 6
tel. 35 19, e-mail: vencovac@fel.cvut.cz