Binaural Sound Localization Systems Based on Neural Approaches. Nick Rossenbach June 17, 2016


Outline:
Introduction
Barn Owl as Biological Example
Neural Audio Processing: Jeffress Model, Spence & Pearson
Artificial Owl Ruff Localization System
Effect of an Artificial Head on Human Acoustic Perception
Conclusion

Introduction

Introduction
Motivation: sound localization plays an important role for mobile robots, and binaural localization systems are common in nature.
Reference: "Biologically Inspired Binaural Sound Source Localization and Tracking for Mobile Robots", Calmes 2009:
- uses the barn owl as a biological example
- implements the system using an artificial barn owl ruff
- also uses statistical tracking and visual sensor aids

Barn Owl as Biological Example

Barn Owl
Tyto alba (photo by Peter Trimming, Creative Commons 2.0)
- one of nature's most precise sound localizers
- can hunt by hearing alone
- the special structure of its head makes 110-degree hearing possible
- asymmetric ears allow it to distinguish the elevation of sounds
- the first research on acoustic hunting was performed by Roger S. Payne in 1971

Neural Audio Processing

Neural Network Basics (Biological)
Neurons:
- build up a charge and release it when triggered/excited
- a stronger stimulus leads to a higher firing frequency
Synapses:
- transfer charges from one neuron to another
- can increase or reduce the excitation of the target node
- excitatory connections increase the excitation
- inhibitory connections decrease the excitation

Neural Network Basics (Technical)
- first mathematical description by McCulloch and Pitts in 1943
- a linear combination of weighted inputs (the equivalent of synapses)
- an activation function applied to that combination (the equivalent of the neuron):
    y = f(w_1·x_1 + w_2·x_2 + ... + w_n·x_n)
- a common activation function is the sigmoid: f(x) = 1 / (1 + e^(−x))
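A minimal sketch of the McCulloch-Pitts style unit described above (the weights and inputs are made-up illustrative values):

```python
import math

def sigmoid(x):
    # sigmoid activation f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def neuron(weights, inputs):
    # linear combination of weighted inputs (the "synapses"),
    # then the activation function (the "neuron"):
    # y = f(w_1*x_1 + w_2*x_2 + ... + w_n*x_n)
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

y = neuron([0.5, -0.25], [1.0, 2.0])  # f(0.5*1.0 - 0.25*2.0) = f(0) = 0.5
```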

Jeffress Model
- presented by Lloyd A. Jeffress in 1948
- implemented as a delay-line algorithm by Liu et al. in 2000
- a model for the ITD (interaural time difference) part of the brain
- uses I neurons with delayed inputs from the left and right ear for each time step n
- includes delay lines to match phase shifts
- the phase shift is computed for each frequency band m using the fast Fourier transform
- the azimuth spectrum is divided into I sectors

Jeffress Model Structure
[Figure: dual delay-line structure (Calmes, 2009)]

Jeffress Model Notation
For each node i, the signal is delayed by:
    τ_i = (ITD_max / 2) · sin(i·π / (I − 1) − π/2)
To shift a signal in the frequency domain, the complex vector is rotated:
    X_{L,n}^(i)(m) = X_{L,n}(m) · e^(−j·2π·f_m·τ_i)
The azimuth sector is selected by the minimal distance of the complex values:
    i_n(m) = arg min_i |X_n^(i)(m)|
where X_n^(i)(m) is the difference between the correspondingly delayed left and right spectra.
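The delay-line selection can be sketched in the frequency domain. This is a minimal sketch, not the thesis implementation: the left spectrum is rotated by each candidate delay τ_i and compared against the right spectrum over all frequency bins, whereas the model described above works per frequency band m; itd_max = 660 µs and I = 37 sectors are illustrative values.

```python
import numpy as np

def jeffress_sector(x_left, x_right, fs, itd_max=660e-6, n_sectors=37):
    """Pick the azimuth sector whose delay best aligns left and right.

    Rotating a spectrum by e^(-j*2*pi*f*tau) corresponds to delaying
    the signal by tau; "coincidence" is the candidate delay with the
    smallest complex distance between the shifted left spectrum and
    the right spectrum.
    """
    X_L = np.fft.rfft(x_left)
    X_R = np.fft.rfft(x_right)
    freqs = np.fft.rfftfreq(len(x_left), d=1.0 / fs)

    # tau_i = (ITD_max / 2) * sin(i*pi/(I-1) - pi/2), i = 0 .. I-1
    i = np.arange(n_sectors)
    taus = (itd_max / 2.0) * np.sin(i * np.pi / (n_sectors - 1) - np.pi / 2.0)

    # distance between the delayed left spectrum and the right spectrum,
    # summed over all frequency bins; the best-matching delay wins
    dists = [np.abs(X_L * np.exp(-2j * np.pi * freqs * tau) - X_R).sum()
             for tau in taus]
    return int(np.argmin(dists))
```

With a pure tone and the right channel delayed by one of the candidate τ_i, the function recovers exactly that sector; identical inputs map to the center sector (τ = 0).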

Jeffress Model Diagram
[Figure: 3D coincidence map (Calmes, 2009)]

Spence & Pearson
- a model for the ILD (interaural level difference) part of the brain (Spence & Pearson, 1989)
- simulates different parts of the barn owl brain:
- NA (nucleus angularis): frequency-filtered signal intensity
- VLVp (nucleus ventralis lemnisci lateralis, pars posterior): sigmoidal shaping of the intensity
- ICc (central nucleus of the inferior colliculus): peaked response curves determining the ILD sector
- parameters are tuned to achieve results similar to the barn owl's

Spence & Pearson - Nodes
Each neural node has a predefined activation function, equal for every node, with values determined by research on the barn owl. The voltage v is determined by the input conductances g:
    v = (g_e·v_e + g_i·v_i + g_l·v_l) / (g_e + g_i + g_l)
with e = excitatory, i = inhibitory and l = leakage. The activity a follows a sigmoid of the voltage:
    a = 1 / (1 + e^(−ln(s)·(v − v_t)))
with s determining the steepness of the sigmoidal slope and v_t the threshold voltage.

Spence & Pearson - Structure
[Figure: neural network structure of the implemented Spence & Pearson model: an NA layer fed by the left (L) and right (R) inputs, a VLVp layer, and an ICc layer, connected by excitatory (+) and inhibitory (−) weights.]
The excitatory ICc weights follow a Gaussian over the node-index distance:
    w⁺_icc(j,k) = 1/(σ·√(2π)) · e^(−(k−j)²/(2σ²))  for j − σ ≤ k ≤ j + σ
    w_icc(j,k) = (k − j)/σ  otherwise
The VLVp weights are normalized by the node index (w⁻_vlvp(k) = 1/k) and by the maximum input (w⁺_vlvp(k) = 1/max input).

Spence & Pearson - Parameters
- setting v_e = 0, v_i = −90, v_l = −65 and g_l = 1 achieves peak responses similar to the internal brain structure of the barn owl
- the activation function parameters may be randomized
- the most active ICc node determines the sound direction
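The node equations and the parameter values above can be sketched directly. Note that the threshold v_t and slope s below are illustrative assumptions; the talk only fixes v_e, v_i, v_l and g_l.

```python
import math

# parameter values from the talk: v_e = 0, v_i = -90, v_l = -65, g_l = 1
def node_voltage(g_e, g_i, g_l=1.0, v_e=0.0, v_i=-90.0, v_l=-65.0):
    # conductance-weighted mean of the reversal potentials:
    # v = (g_e*v_e + g_i*v_i + g_l*v_l) / (g_e + g_i + g_l)
    return (g_e * v_e + g_i * v_i + g_l * v_l) / (g_e + g_i + g_l)

def node_activity(v, v_t=-50.0, s=1.3):
    # sigmoidal activation a = 1 / (1 + e^(-ln(s) * (v - v_t)));
    # v_t (threshold) and s (slope) are illustrative, not from the talk
    return 1.0 / (1.0 + math.exp(-math.log(s) * (v - v_t)))
```

With no synaptic input (g_e = g_i = 0) the node rests at the leakage potential v_l = −65; excitatory input pulls the voltage toward v_e = 0 and raises the activity.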

Sound Localization Setup
- combine the dual-line/Jeffress model with the Spence & Pearson model
- select the most active nodes from both models
- assign nodes to azimuth and elevation sectors by testing
[Figure: ITD/ILD contour lines of a simple two-microphone setup (Calmes, 2009)]
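The combination step amounts to a lookup from the two models' winning nodes to a direction. A minimal sketch; the calibration table below is a made-up stand-in for the contour-line testing mentioned above.

```python
def localize(itd_sector, ild_sector, calibration):
    """Map the most active ITD node and ILD node to a direction.

    calibration: dict mapping (itd_sector, ild_sector) to
    (azimuth_deg, elevation_deg), obtained by testing the microphone
    setup against known source positions (the ITD/ILD contour lines).
    """
    return calibration[(itd_sector, ild_sector)]

# hypothetical calibration entries, for illustration only
calibration = {
    (18, 5): (0.0, 0.0),    # center ITD node, mid ILD node -> straight ahead
    (30, 7): (60.0, 10.0),  # right-shifted ITD, higher ILD -> right and up
}
```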

Artificial Owl Ruff Localization System

Artificial Owl Ruff
Aim: expand the azimuth range beyond 90 degrees.
- make the left ear more sensitive to sounds from higher elevations
- make the right ear more sensitive to sounds from lower elevations
- achieve frequency distortion with a custom HRTF
[Figure: artificial owl ruff setups (Calmes, 2009)]

ITD / ILD Analysis
[Figure: ITD/ILD contour lines of the artificial owl ruff setup (Calmes, 2009)]

Effects of the Artificial Owl Ruff
- succeeded in expanding the azimuth range beyond 90 degrees
- succeeded in focusing the ILD part on measuring elevation
- did not succeed in benefiting from a custom HRTF...
- ...but: the azimuth range increased further and the ILD sensitivity to elevation increased
- possibly the gain was too noisy to improve the localization

Effect of an Artificial Head on Human Acoustic Perception

Demo
(binaural listening demonstration)

Conclusion

Conclusion
- biologically inspired neural methods enhance sound localization systems: the Jeffress model for the ITD part, the Spence & Pearson model for the ILD part
- artificial microphone setups inspired by the barn owl enhance sound localization
- artificial structures have an important effect on acoustic perception, for localization systems as well as for humans

Thank you for your attention!

References
- Laurent Calmes: Biologically Inspired Binaural Sound Source Localization and Tracking for Mobile Robots. PhD thesis, I5 chair of the RWTH, 2009.
- Daniel Peger: Biologically Inspired Binaural Sound Localization using Interaural Level Differences. Diploma thesis, I5 chair of the RWTH, 2005.
- Clay D. Spence & John C. Pearson: The Computation of Sound Source Elevation in the Barn Owl. Advances in Neural Information Processing Systems 2, NIPS Conference, Denver, Colorado, USA, November 27-30, 1989.