Indoor Sound Localization

Fares Abawi
Universität Hamburg, MIN-Fakultät (Fakultät für Mathematik, Informatik und Naturwissenschaften), Fachbereich Informatik, Technische Aspekte Multimodaler Systeme
Monday, 12-12-2016

Contents
- Introduction
- Cross-Correlation
- Quality-Affecting Factors
- Sound Localization:
  - Time Difference of Arrival
  - Steered Beamforming
  - Bio-Inspired Sound Localization
- Comparison
- Summary
- References

Introduction
Definition: Sound localization is [...]

Introduction [4]

Introduction: The Jeffress Model
An oversimplified model of the mammalian medial superior olive (MSO) [VIDEO] [6]

Introduction [4]
- Lateral Superior Olive (LSO): where the interaural level difference (ILD) is computed
- Medial Superior Olive (MSO): where the interaural time difference (ITD) is computed

Introduction: Binaural cues [VIDEOS] [7]
- Varying ITD
- Varying ILD
- Varying ITD & ILD
- Trading ITD off against ILD

Cross-Correlation
- Estimates the delay between two signals by sliding one against the other: multiply -> sum -> shift -> repeat.
- Convolution theorem: convolution in the time domain is a pointwise multiplication in the frequency domain, and vice versa. Cross-correlation follows the same rule with one of the two spectra complex-conjugated.
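
The multiply, sum, shift, repeat loop can be sketched directly in the time domain. This is a minimal illustration in Python with NumPy, not code from the slides; the function name `xcorr_time` and the lag convention (the one `numpy.correlate` uses, c[k] = sum over n of a[n + k] * v[n]) are assumptions made for the example.

```python
import numpy as np

def xcorr_time(a, v):
    """Naive time-domain cross-correlation (hypothetical helper).

    Returns (lags, values) with values[i] = sum_n a[n + lags[i]] * v[n].
    """
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    lags = np.arange(-(len(v) - 1), len(a))
    values = np.zeros(len(lags))
    for i, k in enumerate(lags):
        # Shift: keep only the indices n where both a[n + k] and v[n] exist.
        n_lo = max(0, -k)
        n_hi = min(len(v), len(a) - k)
        # Multiply and sum over the overlapping part.
        values[i] = np.dot(a[n_lo + k:n_hi + k], v[n_lo:n_hi])
    return lags, values

lags, values = xcorr_time([1, 2, 3], [0, 1, 0.5])
print(lags.tolist(), values.tolist())
```

For the inputs `[1, 2, 3]` and `[0, 1, 0.5]` the lags run from -2 to 2 and the values are `[0.5, 2.0, 3.5, 3.0, 0.0]`, the same result as `np.correlate(a, v, mode="full")`.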

Cross-Correlation
Complexity:
- Cooley-Tukey FFT: O(n log n)
- Time-domain cross-correlation: O(n^2)
Notes on the time-to-frequency-domain transformation:
- By the Nyquist theorem, the sampling frequency must be more than twice the highest frequency the system needs to capture; otherwise aliasing occurs.
- A window function (analysis window), such as a Hann or Hamming window, should be applied to the signal before the transformation to reduce spectral leakage and smearing.
- Keep in mind: the full cross-correlation of two signals of lengths N and M has N + M - 1 samples. Both signals must be zero-padded to at least that length before the FFT; otherwise the result is distorted by circular wrap-around.
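
As a rough sketch of the frequency-domain route, the snippet below zero-pads both signals to at least N + M - 1 samples, multiplies one spectrum by the complex conjugate of the other, and transforms back. The helper name `xcorr_fft` and the power-of-two padding are assumptions made for this example; windowing is left out to keep the sketch short.

```python
import numpy as np

def xcorr_fft(a, v):
    """FFT-based linear cross-correlation (hypothetical helper).

    Matches np.correlate(a, v, mode="full") for real-valued inputs.
    """
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    n_full = len(a) + len(v) - 1            # length of the linear result
    n_fft = 1 << (n_full - 1).bit_length()  # next power of two >= n_full
    # rfft zero-pads each signal to n_fft, which prevents circular wrap-around.
    spec = np.fft.rfft(a, n_fft) * np.conj(np.fft.rfft(v, n_fft))
    circ = np.fft.irfft(spec, n_fft)
    # Negative lags sit at the end of the circular result; rotate them to the
    # front so the output runs from lag -(len(v) - 1) to lag len(a) - 1.
    return np.roll(circ, len(v) - 1)[:n_full]

rng = np.random.default_rng(0)
a = rng.standard_normal(37)
v = rng.standard_normal(50)
print(np.allclose(xcorr_fft(a, v), np.correlate(a, v, mode="full")))
```

Padding to a power of two is only a convenience for the FFT; any length of at least N + M - 1 avoids the wrap-around described above.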

Cross-Correlation
Example: two sinusoids offset by 7 samples. After cross-correlation, the peak is detected at a lag of x = -7.
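
A small reproduction of this kind of experiment, assuming a sinusoid with a period of 50 samples and a second copy delayed by 7 samples (the signal length and period are choices made here, not values from the slide). With the `numpy.correlate` lag convention, a second signal that lags the first shows up as a negative peak position.

```python
import numpy as np

n = np.arange(1000)
x = np.sin(2 * np.pi * n / 50.0)   # assumed period: 50 samples
delay = 7
# y is x delayed by 7 samples: y[n] = x[n - 7], with zeros at the start.
y = np.concatenate((np.zeros(delay), x[:-delay]))

corr = np.correlate(x, y, mode="full")
lags = np.arange(-(len(y) - 1), len(x))
peak_lag = int(lags[np.argmax(corr)])
print(peak_lag)  # -7
```

Because a pure sinusoid repeats every period, the correlation has further, slightly lower peaks one period apart; the delay can only be read off unambiguously when it is well below half a period.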

Quality-Affecting Factors: Echo and Reverb [ANIMATION] [8]

Quality-Affecting Factors: Noise
- Noise power spectral densities can be estimated by finding the minima over time-frequency bins that do not contain speech [4].
- Could this work for any sound signal? In any environment?
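
One simplified way to picture minimum-based noise estimation: compute short-time power spectra and, for each frequency bin, take the minimum over the most recent frames. The sketch below assumes a Hann window, 256-sample frames, and a 20-frame history; those values and the function names are illustrative only. Practical minimum-statistics estimators also smooth the spectra and correct the downward bias of the minimum, which this sketch does not do.

```python
import numpy as np

def stft_power(x, frame_len=256, hop=128):
    """Power spectrogram with a Hann analysis window, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def noise_psd_min_tracking(power, history=20):
    """Per-bin minimum over the last `history` frames (biased low)."""
    estimate = np.empty_like(power)
    for t in range(power.shape[0]):
        estimate[t] = power[max(0, t - history + 1):t + 1].min(axis=0)
    return estimate

# Weak background noise with a short, loud 1 kHz burst in the middle.
fs = 8000
rng = np.random.default_rng(0)
x = 0.01 * rng.standard_normal(fs)
x[4000:4800] += np.sin(2 * np.pi * 1000 * np.arange(800) / fs)

power = stft_power(x)
noise = noise_psd_min_tracking(power)
print(power[34, 32], noise[34, 32])  # burst frame, 1 kHz bin
```

During the burst the measured power in the 1 kHz bin jumps, while the tracked minimum stays near the noise floor as long as the burst is shorter than the history window. That dependence on the history length is one reason the slide's question about arbitrary signals and environments is a fair one.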

Quality-Affecting Factors: Doppler shift [VIDEO] [9]

Time Difference of Arrival
In-house Alert Sounds Detection and Direction of Arrival Estimation to Assist People with Hearing Difficulties [1]

Time Difference of Arrival [1]

Time Difference of Arrival
Calculating the delay with which sound arrives at microphones k and i of the circular microphone array:

$$\tau(k,i) = \frac{2R}{C}\,\sin\!\left(\frac{\theta_k - \theta_i}{2}\right)\sin\!\left(\frac{\theta_k - \theta_i}{2} + \theta_i - \varphi_s\right)$$

The angle is approximated by stepping φ_s from 0° to 360° and selecting the angle that minimizes the difference between the analytical delay and the delay acquired through cross-correlation [1].
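
The angle search can be sketched as a grid search over the source azimuth. The code below is an illustration under stated assumptions, not the implementation from [1]: it reads R as the array radius, C as the speed of sound (343 m/s here), θ_k and θ_i as the angular positions of the two microphones, and it uses a summed squared error over all microphone pairs as the mismatch measure. The function names are hypothetical.

```python
import numpy as np

def pair_delay(theta_k, theta_i, phi_s, radius, c=343.0):
    """Analytical delay between microphones k and i for source azimuth phi_s."""
    half = (theta_k - theta_i) / 2.0
    return (2.0 * radius / c) * np.sin(half) * np.sin(half + theta_i - phi_s)

def estimate_azimuth(measured, mic_angles, radius, c=343.0, step_deg=1.0):
    """Grid search over 0..360 degrees.

    measured: dict mapping a microphone pair (k, i) to its delay in seconds,
    e.g. as obtained from the cross-correlation peak of that pair.
    """
    candidates = np.deg2rad(np.arange(0.0, 360.0, step_deg))
    cost = np.zeros_like(candidates)
    for (k, i), tau in measured.items():
        model = pair_delay(mic_angles[k], mic_angles[i], candidates, radius, c)
        cost += (model - tau) ** 2   # assumed error measure: squared difference
    return float(np.rad2deg(candidates[np.argmin(cost)]))

# Synthetic check: four microphones on a 10 cm circle, source at 123 degrees.
mic_angles = np.deg2rad([0.0, 90.0, 180.0, 270.0])
true_phi = np.deg2rad(123.0)
measured = {(k, i): pair_delay(mic_angles[k], mic_angles[i], true_phi, 0.1)
            for k in range(4) for i in range(k + 1, 4)}
print(estimate_azimuth(measured, mic_angles, radius=0.1))
```

In practice the measured delays are quantized to whole samples, so the achievable angular resolution depends on the sampling rate and the array radius.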

Steered Beamforming
Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering [2]

Steered Beamforming [2]
- Detect the sound from an array of omnidirectional microphones
- Steer the beam towards all possible angles
- Use particle filtering to predict the motion of the sound source
- Can detect angle and position!
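
The steering step can be illustrated with a basic delay-and-sum scan: for each candidate angle, undo the delays that a plane wave from that angle would produce, sum the channels, and measure the output power. This is a simplified sketch, not the method of [2]; it assumes a circular array, a far-field source, and a speed of sound of 343 m/s, and it leaves out the particle filter entirely. All function names are hypothetical.

```python
import numpy as np

def steering_delays(mic_angles, phi, radius, c=343.0):
    """Far-field arrival delays (s) relative to the array centre.

    Microphones facing the source direction phi receive the wave earlier.
    """
    return -(radius / c) * np.cos(mic_angles - phi)

def apply_delays(spectra, freqs, delays):
    """Delay each channel by delays[m] seconds via a frequency-domain phase shift."""
    return spectra * np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])

def steered_power(signals, fs, mic_angles, radius, step_deg=5.0, c=343.0):
    """Delay-and-sum output power for every candidate azimuth.

    signals: array of shape (n_mics, n_samples).
    """
    spectra = np.fft.rfft(signals, axis=1)
    freqs = np.fft.rfftfreq(signals.shape[1], d=1.0 / fs)
    angles = np.arange(0.0, 360.0, step_deg)
    power = np.empty(len(angles))
    for j, phi in enumerate(np.deg2rad(angles)):
        # Compensate the hypothesised delays, then sum across microphones.
        aligned = apply_delays(spectra, freqs,
                               -steering_delays(mic_angles, phi, radius, c))
        power[j] = np.sum(np.abs(aligned.sum(axis=0)) ** 2)
    return angles, power
```

The estimated direction is the angle with the highest power, `angles[np.argmax(power)]`. Several distinct peaks in `power` would correspond to several simultaneous sources, which is what a tracker such as the particle filter in [2] would then follow over time.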

Steered Beamforming [5]

Bio-Inspired Sound Localization
Neural and Statistical Processing of Spatial Cues for Sound Source Localisation [3]

Bio-Inspired Sound Localization [3]
- Detect the direction of incoming sound
- Filter the sound signal (Gammatone filter bank)
- Detect ITD and ILD
- Reduce the dimensionality (Inferior Colliculus -> Naïve Bayes)
- Classify (feed-forward neural network, FFNN)
- Rotate the robot's head in the direction of the sound, aligning a single microphone with the sound source
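
To make the cue-extraction step concrete, the sketch below estimates a single broadband ITD and ILD from a left/right signal pair. It is a deliberately reduced illustration and not the pipeline of [3]: there is no Gammatone filter bank, no Naïve Bayes stage, and no neural network. The 1 ms search range and the function name `itd_ild` are assumptions made for the example.

```python
import numpy as np

def itd_ild(left, right, fs, max_itd=1e-3):
    """Broadband ITD (seconds) and ILD (dB) of a binaural pair.

    ITD is the cross-correlation peak within +/- max_itd; with the
    numpy.correlate convention it is negative when the right channel lags.
    ILD is the left-to-right energy ratio in decibels.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    corr = np.correlate(left, right, mode="full")
    lags = np.arange(-(len(right) - 1), len(left))
    keep = np.abs(lags) <= int(round(max_itd * fs))
    best_lag = lags[keep][np.argmax(corr[keep])]
    itd = best_lag / fs
    ild = 10.0 * np.log10(np.sum(left ** 2) / np.sum(right ** 2))
    return itd, ild

# Synthetic pair: the right channel is delayed by 5 samples and half as loud.
fs = 16000
rng = np.random.default_rng(0)
left = rng.standard_normal(2000)
right = 0.5 * np.concatenate((np.zeros(5), left[:-5]))
itd, ild = itd_ild(left, right, fs)
print(itd * fs, ild)
```

Here the ITD corresponds to -5 samples and the ILD is close to 6 dB, since halving the amplitude lowers the level by about 6 dB. A per-band version would run the same computation on each output channel of the filter bank.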

Comparison

|           | TDOA | Beamforming | Bio-Inspired SSL |
|-----------|------|-------------|------------------|
| Steps     | Cross-correlate and measure delay | Shift, cross-correlate, sum and measure power | Cross-correlate, reduce dimensionality, feed to network and predict |
| Speed     | Fast | Moderate | Slow |
| Accuracy  | Lowest | Moderate | Best |
| Resources | Low | High | High |
| Training  | Not required | Not required | Required |

Summary
- Mammals localize sound through binaural and monaural cues
- Interaural level difference (ILD): the difference in sound level (loudness) between the two inputs
- Interaural time difference (ITD): the difference in the arrival time of a sound between the two inputs
- The Lateral Superior Olive (LSO): where ILD is measured in the brain
- The Medial Superior Olive (MSO): where ITD is measured in the brain
- Cross-correlation measures the delay between two signals
- Cross-correlation is performed efficiently in the frequency domain
- Quality-affecting factors: echo, reverb, noise, Doppler shift

Summary
- Computerized systems can measure the direction of sound by:
  - Time difference of arrival or phase delay
  - Steered beamforming
  - Heuristic and statistical methods
- Beamforming can detect more than a single sound source
- Sound can be detected by binaural or multi-microphone array systems (circular or aligned)

References
[1] M. Daoud, M. Al-Ashi, F. Abawi, and A. Khalifeh, "In-house alert sounds detection and direction of arrival estimation to assist people with hearing difficulties," in IEEE/ACIS 14th International Conference on Computer and Information Science (ICIS), pp. 297-302, Nevada, US, June 2015.
[2] J.-M. Valin, F. Michaud, and J. Rouat, "Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering," Robotics Autonomous Syst. J. 55, pp. 216-228, 2007.
[3] J. Davila-Chacon, S. Magg, J. Liu, and S. Wermter, "Neural and statistical processing of spatial cues for sound source localization," in IEEE Intl. Conf. on Neural Networks (IJCNN-13), pp. 1-8, Dallas, US, 2013.

[4] B. Grothe, M. Pecka, and D. McAlpine, "Mechanisms of Sound Localization in Mammals," Physiological Reviews, vol. 90, no. 3, pp. 983-1012, published 1 July 2010. http://physrev.physiology.org/content/90/3/983
[5] A. Greensted, "Delay Sum Beamforming," The Lab Book Pages, 2012. http://www.labbookpages.co.uk/audio/beamforming/delaysum.html
[6] J. Schnupp, E. Nelken, A. King, "The Jeffress Model Animation," Auditory Neuroscience. https://auditoryneuroscience.com/topics/jeffress-model-animation
[7] J. Schnupp, E. Nelken, A. King, "Binaural Cues," Auditory Neuroscience. https://auditoryneuroscience.com/topics/binaural-cue-demos
[8] "Echo and Reverb" animation, The Physics Classroom. http://www.physicsclassroom.com/mmedia/waves/er.gif
[9] "Waves and Sound: The Doppler Effect," PHYSCLIPS, UNSW, School of Physics, Sydney. http://www.animations.physics.unsw.edu.au/jw/doppler.htm

Further Reading
[10] B. Clénet and H. Romsdorfer, "Circular microphone array based beamforming and source localization on reconfigurable hardware," Master's thesis, Graz University of Technology, 2010.
[11] J. Davila-Chacon, J. Twiefel, J. Liu, and S. Wermter, "Improving Humanoid Robot Speech Recognition with Sound Source Localisation," International Conference on Artificial Neural Networks, Springer International Publishing, 2014.

Questions? Thank you!