Spatial Audio & The Vestibular System


Gordon Wetzstein, Stanford University
EE 267 Virtual Reality, Lecture 13
stanford.edu/class/ee267/

Updates
- lab this Friday will be released as a video
- TAs will be in lab on Friday, but as extended office hours

Overview
- what is sound? how do we synthesize it?
- the human auditory system
- stereophonic sound
- spatial audio of point sound sources
- surround sound
- ambisonics
- brief overview of the vestibular system

What is Sound?
- sound is a pressure wave propagating in a medium
- speed of sound is $c = \sqrt{K / \rho}$, where $c$ is velocity, $\rho$ is density of medium and $K$ is elastic bulk modulus
- in air, speed of sound is 340 m/s
- in water, speed of sound is 1,483 m/s
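
The formula can be checked numerically. The sketch below is only an illustration: the bulk modulus and density values are approximate textbook figures for air and water near room temperature (they are assumptions, not values from the slides), and `speed_of_sound` is a hypothetical helper name.

```python
import math

def speed_of_sound(bulk_modulus_pa, density_kg_m3):
    """Speed of sound c = sqrt(K / rho) in m/s."""
    return math.sqrt(bulk_modulus_pa / density_kg_m3)

# Approximate material constants (assumed, roughly 20 degrees C):
# air:   adiabatic bulk modulus ~1.42e5 Pa, density ~1.204 kg/m^3
# water: bulk modulus ~2.2e9 Pa,            density ~998 kg/m^3
c_air = speed_of_sound(1.42e5, 1.204)
c_water = speed_of_sound(2.2e9, 998.0)

print(f"air:   {c_air:.0f} m/s")
print(f"water: {c_water:.0f} m/s")
```

Both results land close to the 340 m/s and 1,483 m/s quoted on the slide; the small differences come from the rounded constants.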

How do we Synthesize Sound?
- https://www.youtube.com/watch?v=adrs6eiefcm

The Human Auditory System
- pinna
- cochlea
- hair receptor cells pick up vibrations
(image credit: Wikipedia)

The Human Auditory System
- human hearing range: ~20 to 20,000 Hz
- variation between individuals and changes with age
(image credit: Wikipedia)

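Bone Conduction
- can stimulate the inner ear mechanically to create the illusion of audio, e.g. with bone conduction, where vibrations travel through the skull to the cochlea and bypass the eardrum
- http://www.goldendance.co.jp/english/boneconduct/01.html
(image credit: The Verge)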

Stereophonic Sound
- mainly captures differences between the ears:
  - interaural time difference
  - amplitude differences from body shape (nose, head, neck, shoulders, ...)
(figure: the signal "hello, vr!" reaches one ear at time $t$ and the other at $t + \Delta t$; panels L and R; image credit: Wikipedia)
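
To get a feel for the size of the interaural time difference, here is a minimal sketch using Woodworth's spherical-head approximation, ITD = (r / c) (θ + sin θ). This model is not part of the slides; the head radius, the speed of sound, and the function name `woodworth_itd` are assumptions chosen for illustration.

```python
import math

def woodworth_itd(azimuth_rad, head_radius_m=0.0875, c=343.0):
    """Approximate interaural time difference in seconds.

    Woodworth's rigid-sphere model for a distant source,
    intended for azimuths between 0 (straight ahead) and pi/2 (to the side).
    """
    return (head_radius_m / c) * (azimuth_rad + math.sin(azimuth_rad))

for deg in (0, 30, 60, 90):
    itd = woodworth_itd(math.radians(deg))
    print(f"azimuth {deg:2d} deg: ITD = {itd * 1e3:.3f} ms")
```

For a source directly to the side the model gives roughly two thirds of a millisecond, which is only a few dozen samples at common audio sample rates.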

Stereophonic Sound Recording
- use two microphones
- A-B technique captures differences in time-of-arrival
- other configurations work too and capture differences in amplitude, e.g. the X-Y technique
(image credits: Olympus, Wikipedia, Rode)

Head-related Impulse Response (HRIR)
- models phase and amplitude differences for all possible sound directions, parameterized by azimuth $\theta$ and elevation $\phi$
- can be measured with two microphones in the ears of a mannequin and speakers all around
(figure: L and R ears; source: Zhong and Xie, Head-Related Transfer Functions and Virtual Auditory Display)

Head-related Impulse Response (HRIR)
- CIPIC HRTF database: http://interface.cipic.ucdavis.edu/sound/hrtf.html
- elevation: -45° to 230.625°, azimuth: -80° to 80°
- need to interpolate between discretely sampled directions
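
One simple way to interpolate between sampled directions is bilinear interpolation over the azimuth/elevation grid. The sketch below assumes a hypothetical array layout `hrirs[azimuth_index, elevation_index, time]` and sorted grid vectors; it is not the CIPIC file format, and linear blending of time-domain HRIRs is a crude choice that real renderers often refine.

```python
import numpy as np

def interp_hrir(hrirs, az_grid, el_grid, az, el):
    """Bilinearly interpolate an HRIR at direction (az, el).

    hrirs:   array of shape (n_az, n_el, n_t), one HRIR per grid direction
    az_grid: sorted azimuth samples (length n_az)
    el_grid: sorted elevation samples (length n_el)
    """
    # Index of the grid cell that contains the query direction.
    i = int(np.clip(np.searchsorted(az_grid, az) - 1, 0, len(az_grid) - 2))
    j = int(np.clip(np.searchsorted(el_grid, el) - 1, 0, len(el_grid) - 2))
    # Fractional position inside the cell, clamped to [0, 1].
    a = np.clip((az - az_grid[i]) / (az_grid[i + 1] - az_grid[i]), 0.0, 1.0)
    b = np.clip((el - el_grid[j]) / (el_grid[j + 1] - el_grid[j]), 0.0, 1.0)
    return ((1 - a) * (1 - b) * hrirs[i, j]
            + a * (1 - b) * hrirs[i + 1, j]
            + (1 - a) * b * hrirs[i, j + 1]
            + a * b * hrirs[i + 1, j + 1])

# Toy grid (made-up data): each "HRIR" is constant and equal to az + 2 * el,
# so the interpolated value is easy to verify by hand.
az_grid = np.array([-80.0, -40.0, 0.0, 40.0, 80.0])
el_grid = np.array([-45.0, 0.0, 45.0, 90.0])
toy_hrirs = (az_grid[:, None, None] + 2.0 * el_grid[None, :, None]) * np.ones((1, 1, 4))
print(interp_hrir(toy_hrirs, az_grid, el_grid, az=10.0, el=20.0))
```

Because the function reads the grid values rather than assuming a fixed step, it also works for grids with uneven spacing.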

Head-related Impulse Response (HRIR)
measuring the HRIR
- ideal case: scaled & shifted Dirac peaks
- in practice: more complicated, includes scattering in the ear, shoulders etc.
(figure: amplitude over time for the left (L) and right (R) ear, ideal case and measured case)

Head-related Impulse Response (HRIR)
measuring the HRIR
- need one temporally-varying function for each angle: $\mathrm{hrir\_l}(\theta, \phi, t)$ and $\mathrm{hrir\_r}(\theta, \phi, t)$
- total of $2 \cdot N_\theta \cdot N_\phi \cdot N_t$ samples, where $N_\theta$, $N_\phi$, $N_t$ are the numbers of samples for azimuth, elevation, and time, respectively
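
As a quick worked instance of the sample count, the snippet below plugs in grid sizes commonly quoted for a CIPIC-style measurement (25 azimuths, 50 elevations, 200 time samples per response). Treat these numbers as an assumption for illustration rather than a statement from the slides.

```python
# Assumed grid sizes (illustrative, CIPIC-like):
n_azimuth, n_elevation, n_time = 25, 50, 200

# Factor 2: one response per ear.
total_samples = 2 * n_azimuth * n_elevation * n_time
bytes_float32 = total_samples * 4  # 4 bytes per 32-bit float

print(total_samples, "samples,", bytes_float32 / 1e6, "MB as float32")
```

A full set for one subject is therefore small enough to keep entirely in memory.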

Head-related Impulse Response (HRIR)
applying the HRIR: given a mono sound source $s(t)$ and its 3D position
1. compute $(\theta_L, \phi_L)$ and $(\theta_R, \phi_R)$ relative to center of listener
2. look up measured HRIR for left and right ear at these angles: $\mathrm{hrir\_l}(\theta_L, \phi_L, t)$ and $\mathrm{hrir\_r}(\theta_R, \phi_R, t)$
3. convolve signal with HRIRs to get response for each ear as
   $s_L(t) = \mathrm{hrir\_l}(\theta_L, \phi_L, t) \ast s(t)$
   $s_R(t) = \mathrm{hrir\_r}(\theta_R, \phi_R, t) \ast s(t)$
(figure: amplitude over time for both HRIRs)
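
The three steps (compute angles, look up HRIRs, convolve) can be sketched as follows. Several things here are assumptions for illustration: the axis convention (x forward, y left, z up), the use of a single direction relative to the head center for both ears, the helper names, and the toy HRIRs, which are the "ideal case" scaled and shifted Dirac peaks rather than measured data.

```python
import numpy as np

def direction_angles(source_pos, listener_pos=(0.0, 0.0, 0.0)):
    """Step 1: azimuth and elevation (radians) of the source.

    Assumed convention: x forward, y left, z up.
    """
    d = np.asarray(source_pos, dtype=float) - np.asarray(listener_pos, dtype=float)
    r = np.linalg.norm(d)
    azimuth = np.arctan2(d[1], d[0])
    elevation = np.arcsin(d[2] / r)
    return azimuth, elevation

def render_binaural(s, hrir_l, hrir_r):
    """Step 3: convolve the mono signal with the HRIR of each ear."""
    return np.convolve(hrir_l, s), np.convolve(hrir_r, s)

# Step 2 would be a database lookup; here we hard-code toy HRIRs for a
# source on the left: the right ear hears it 2 samples later and quieter.
toy_hrir_l = np.array([1.0, 0.0, 0.0, 0.0])
toy_hrir_r = np.array([0.0, 0.0, 0.5, 0.0])

mono = np.array([1.0, 2.0, 3.0])
az, el = direction_angles((0.0, 1.0, 0.0))  # source 1 m to the left
s_l, s_r = render_binaural(mono, toy_hrir_l, toy_hrir_r)
print(np.degrees(az), np.degrees(el))
print(s_l)
print(s_r)
```

With measured HRIRs the only change is the lookup in step 2; the convolution itself stays the same.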

Head-related Transfer Function (HRTF)
- HRTF is the Fourier transform of the HRIR (you'll find the term HRTF more often than HRIR)
- time domain:
  $s_L(t) = \mathrm{hrir\_l}(\theta_L, \phi_L, t) \ast s(t)$
  $s_R(t) = \mathrm{hrir\_r}(\theta_R, \phi_R, t) \ast s(t)$
- frequency domain, by the convolution theorem:
  $s_L(t) = \mathcal{F}^{-1}\{ \mathrm{hrtf\_l}(\theta_L, \phi_L, \omega_t) \cdot \mathcal{F}\{ s(t) \} \}$
  $s_R(t) = \mathcal{F}^{-1}\{ \mathrm{hrtf\_r}(\theta_R, \phi_R, \omega_t) \cdot \mathcal{F}\{ s(t) \} \}$
- properties of HRTF:
  - complex-valued
  - conjugate-symmetric in frequency (because HRIR is real-valued)
(figure: HRIR amplitude over time and HRTF amplitude over frequency for both ears)
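
The convolution theorem can be verified numerically. This is a minimal sketch with a made-up HRIR; `apply_hrtf` is a hypothetical helper. Both signals are zero-padded to the length of the linear convolution so that the FFT product does not wrap around.

```python
import numpy as np

def apply_hrtf(s, hrir):
    """Filter s in the frequency domain: F^-1{ HRTF * F{s} }."""
    n = len(s) + len(hrir) - 1        # length of the linear convolution
    hrtf = np.fft.fft(hrir, n)        # HRTF = Fourier transform of the (padded) HRIR
    return np.real(np.fft.ifft(hrtf * np.fft.fft(s, n)))

toy_hrir = np.array([0.0, 1.0, 0.5, 0.25])            # made-up impulse response
s = np.random.default_rng(0).standard_normal(16)      # arbitrary test signal

# Same result as time-domain convolution.
assert np.allclose(apply_hrtf(s, toy_hrir), np.convolve(toy_hrir, s))

# A real-valued HRIR gives a complex, conjugate-symmetric HRTF: H[k] = conj(H[N-k]).
hrtf = np.fft.fft(toy_hrir)
assert np.allclose(hrtf[1:], np.conj(hrtf[1:][::-1]))
```

For long signals the frequency-domain route is usually the cheaper one, which is one reason the HRTF form is so common in practice.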

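Spatial Sound of 1 Point Sound Source
- given $s(t)$ and its 3D position, follow the instructions from the last slides: multiply the Fourier transform of $s$ with the HRTF for each ear (equivalently, convolve $s$ with the HRIR for each ear)
(figure: source $s(t)$ and the paths to the left (L) and right (R) ear)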

Spatial Sound of N Point Sound Sources
- superposition principle holds, so just sum the contributions of each source:
  $s_L(t) = \sum_{i=1}^{N} \mathcal{F}^{-1}\{ \mathrm{hrtf\_l}(\theta_i^L, \phi_i^L, \omega_t) \cdot \mathcal{F}\{ s_i(t) \} \}$
  $s_R(t) = \sum_{i=1}^{N} \mathcal{F}^{-1}\{ \mathrm{hrtf\_r}(\theta_i^R, \phi_i^R, \omega_t) \cdot \mathcal{F}\{ s_i(t) \} \}$
(figure: two sources $s_1(t)$ and $s_2(t)$, each with its own path to the left (L) and right (R) ear)
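
Superposition translates directly into a loop that filters each source and accumulates the results. The sketch below works in the time domain (equivalent to the frequency-domain form above) and uses made-up signals and HRIRs; `mix_sources` is a hypothetical helper name.

```python
import numpy as np

def mix_sources(signals, hrirs_l, hrirs_r):
    """Sum the binaural contributions of N point sources.

    signals, hrirs_l, hrirs_r: lists of 1-D arrays, one entry per source.
    """
    n = max(len(s) + max(len(hl), len(hr)) - 1
            for s, hl, hr in zip(signals, hrirs_l, hrirs_r))
    out_l = np.zeros(n)
    out_r = np.zeros(n)
    for s, hl, hr in zip(signals, hrirs_l, hrirs_r):
        y_l = np.convolve(hl, s)
        y_r = np.convolve(hr, s)
        out_l[:len(y_l)] += y_l   # accumulate left-ear contribution
        out_r[:len(y_r)] += y_r   # accumulate right-ear contribution
    return out_l, out_r

# Two toy sources with toy HRIRs (pure delays).
signals = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
hrirs_l = [np.array([1.0]), np.array([0.0, 1.0])]
hrirs_r = [np.array([0.0, 1.0]), np.array([1.0])]
mix_l, mix_r = mix_sources(signals, hrirs_l, hrirs_r)
print(mix_l, mix_r)
```

The cost grows linearly with the number of sources, since each source needs its own pair of filters.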

Surround Sound
- approximate continuous wave field with discrete set of speakers
- most common: 5.1 surround sound = 5 full-range channels + 1 bass channel, i.e. 6 channels total
- can also use more speakers for wave field synthesis (i.e. audio hologram), see http://spatialaudio.net/
- for wave field synthesis, phase of speakers needs to be synchronized, i.e. a phased array
(image credit: UCSB)

Surround Sound & HRTF
- for all speaker-based (surround) sound, we don't need an HRTF because the listener's own ears apply it
- speaker setup usually needs to be calibrated

Spatial Audio for VR
- VR/AR requires us to re-think audio, especially spatial audio
- could use 5.1 surround sound and set up virtual speakers in the virtual environment: can use existing content, but it is not easy to capture new content, and it doesn't capture directionality from above/below

Spatial Audio for VR
Two primary approaches:
1. Real-time sound engine
   - render 3D sound sources via HRTF in real-time, just as discussed in the previous slides
   - used for games and synthetic virtual environments
   - a lot of libraries available: FMOD, OpenAL, ...
2. Spatial sound recorded from real environments
   - most widely used format now: ambisonics
   - simple microphones exist
   - relatively easy mathematical model
   - only need 4 channels for starters
   - used in YouTube VR and many other platforms

Ambisonics
- idea: represent sound incident at a point (i.e. the listener) with some directional information
- using all angles $\theta, \phi$ is impractical: need too many sound channels (one for each direction)
- some lower-frequency (in direction) components may be sufficient
- directional basis representation to the rescue

Ambisonics: Spherical Harmonics
- use spherical harmonics
- orthogonal basis functions on a sphere, i.e. full-sphere surround sound
- think Fourier transform acting on the directions of a sphere
- 1st order approximation: 4 channels W, X, Y, Z
(figure: spherical harmonics of 0th, 1st, 2nd, and 3rd order; the 0th and 1st order functions are labeled W, X, Y, Z)

Ambisonics: Spherical Harmonics
- can easily convert a point sound source $S$ to the 4-channel ambisonics representation
- given azimuth and elevation $\theta, \phi$, compute W, X, Y, Z as
  $W = S \cdot \frac{1}{\sqrt{2}}$ (omnidirectional component, angle-independent)
  $X = S \cos\theta \cos\phi$ (stereo in x)
  $Y = S \sin\theta \cos\phi$ (stereo in y)
  $Z = S \sin\phi$ (stereo in z)
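
The encoding equations map directly to a few lines of code. This sketch assumes angles in radians and uses a hypothetical function name, `encode_bformat`; the signal values are made up.

```python
import numpy as np

def encode_bformat(s, azimuth, elevation):
    """Encode a mono signal s into first-order ambisonics (W, X, Y, Z)."""
    s = np.asarray(s, dtype=float)
    w = s / np.sqrt(2.0)                          # omnidirectional component
    x = s * np.cos(azimuth) * np.cos(elevation)   # front-back
    y = s * np.sin(azimuth) * np.cos(elevation)   # left-right
    z = s * np.sin(elevation)                     # up-down
    return w, x, y, z

# A source straight ahead only excites W and X.
w, x, y, z = encode_bformat([1.0, 2.0], azimuth=0.0, elevation=0.0)
print(w, x, y, z)
```

Several sources can be encoded separately and their W, X, Y, Z channels added, since the encoding is linear in the signal.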

Ambisonics: Spherical Harmonics
- can also record 4-channel ambisonics via special microphone
- same format supported by YouTube VR and other platforms
(image credit: http://www.oktava-shop.com/)

Ambisonics: Spherical Harmonics
- easiest way to render ambisonics: convert W, X, Y, Z channels into 4 virtual speaker positions
- for a regularly-spaced square setup, this results in
  $LF = (2W + X + Y) / \sqrt{8}$
  $LB = (2W - X + Y) / \sqrt{8}$
  $RF = (2W + X - Y) / \sqrt{8}$
  $RB = (2W - X - Y) / \sqrt{8}$
(figure: listener with left (L) and right (R) ears surrounded by the four virtual speakers LF, LB, RF, RB)
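
A minimal sketch of the square decode follows, assuming the normalization factor is 1/√8 and that the layout is horizontal, so the Z channel is not used. The function name `decode_square` and the test direction are illustrative assumptions.

```python
import numpy as np

def decode_square(w, x, y):
    """Decode first-order ambisonics to four virtual speakers in a square.

    Returns a dict with feeds for left-front, left-back, right-front, right-back.
    The height channel Z is ignored because all speakers lie in the horizontal plane.
    """
    w, x, y = (np.asarray(ch, dtype=float) for ch in (w, x, y))
    k = 1.0 / np.sqrt(8.0)
    return {
        "LF": k * (2 * w + x + y),
        "LB": k * (2 * w - x + y),
        "RF": k * (2 * w + x - y),
        "RB": k * (2 * w - x - y),
    }

# Unit-amplitude source at 45 degrees to the front-left, encoded as
# W = 1/sqrt(2), X = cos(45 deg), Y = sin(45 deg).
a = np.radians(45.0)
feeds = decode_square(1.0 / np.sqrt(2.0), np.cos(a), np.sin(a))
for name, gain in feeds.items():
    print(name, float(gain))
```

The source ends up loudest in the front-left speaker and silent in the opposite (right-back) one. Each virtual speaker feed can then be rendered binaurally with the HRIR for that speaker's direction.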

Audio perception happens mostly in the inner ear. What else is happening there?

The Inner Ear
(figure: ear anatomy with labels "pinna", "hearing", and "what's this?"; image credit: Wikipedia)

Brief Overview of the Vestibular System
- provides sense of balance & gravity
- like IMUs, one in each ear
- in each ear, sense linear (3 dof from otolithic organs) and angular (3 dof from 3 semicircular canals) acceleration via hair cells

Vestibulo-Ocular Reflex (VOR)
- vestibular system and ocular system are directly coupled in a feedback system
- enables low-latency optical image stabilization of the visual system with head motion

Motion Sickness
3 types of motion sickness (all related to visual-vestibular conflict theory):
1. Motion sickness caused by motion that is felt but not seen. Example: car and sea sickness
2. Motion sickness caused by motion that is seen but not felt. Example: VR sickness or visually-induced motion sickness (VIMS)
3. Motion sickness caused when both systems detect motion but they do not correspond. Example: motion in low gravity

References and Further Reading
- Google's take on spatial audio: https://developers.google.com/vr/concepts/spatial-audio
- HRTF: Algazi, Duda, Thompson, Avendano, "The CIPIC HRTF Database", Proc. 2001 IEEE Workshop on Applications of Signal Processing to Audio and Electroacoustics
- download CIPIC HRTF database here: http://interface.cipic.ucdavis.edu/sound/hrtf.html
- Resources by Google:
  - https://github.com/googlechrome/omnitone
  - https://opensource.googleblog.com/2016/07/omnitone-spatial-audio-on-web.html
  - http://googlechrome.github.io/omnitone/#home
  - https://github.com/google/spatial-media/