The psychoacoustics of reverberation

Size: px
Start display at page:

Download "The psychoacoustics of reverberation"

Transcription

1 The psychoacoustics of reverberation Steven van de Par July 19, 2016 Thanks to Julian Grosse and Andreas Häußler 2016 AES International Conference on Sound Field Control

2 Introduction The psychoacoustics of reverberation, what is this talk about? Reverberation is nearly always present in our daily life It creates large distortions of the physical waveform Yet it mostly has only a small effect on (speech) perception

3 Amplitude Amplitude Introduction The psychoacoustics of reverberation, what is this talk about? Reverberation is nearly always present in our daily life It creates large distortions of the physical waveform Yet it mostly has only a small effect on (speech) perception 1 Clean speech Reverberated speech Clean speech Reverberated speech T60 = 250 ms Time(s) Time(s)

4 Introduction The psychoacoustics of reverberation, what is this talk about? Reverberation is nearly always present in our daily life It creates large distortions of the physical waveform Yet it mostly has only a small effect on (speech) perception Outline: Principles and mechanisms in perception that help beating reverberation Some ideas about controlling sound fields in a perceptually motivated manner

5 The Peripheral Auditory System

6 The inner ear Cochlea: Mechanical energy (oval window) is converted into a neural signal (auditory nerve) Performs a time-frequency analysis

7 The Cochlea 1. cochlear duct 2. scala vestibuli 3. scala tympani 4. spiral ganglion 5. auditory nerve fibres The red arrow is from the oval window The blue arrow points to the round window The cochlea is about 2 mm in diameter

8 Inner Ear: the Basilar Membrane Frequency-to-place transformation: Each point on BM acts as a band-pass filter

9 Cochleagram Simulates basilar-membrane filtering, and represents magnitudes in dbs. Brain captures a relatively coarse spectro-temporal representation

10 Auditory signal representation Cochleagram is a reasonable first order approximation of perception (loudness, timbre) Additional perceptual cues ( texture cues ): - Timing Information for binaural processing - ITDs, 20 s JND (source direction) - Interaural cross-correlation (source width, listener envelopment) - Temporal pitch cues - Modulation cues (e.g. roughness of a sound) Included in advanced models by e.g. Patterson, Meddis and colleagues and Dau et al. (1996, 1997)

11 Another function within the auditory system Source segregation: Often multiple sources are present simultaneously We can focus on one source Cocktail party processing: Listen to one speaker only Spatial separation helps How does the brain do it?

12 Frequency (Hz) Energy Frequency (Hz) Azimuth (deg) 5000 Complex acoustical scenes Time (sec) Time (sec) -50 Cochleagram of a mix of two speakers binary mask indicating source dominance Acoustic mixtures are often spectro-temporally sparse: For each time-frequency interval one source dominates in level Grouping of signal components is essential to make sense of the speech signal

13 Auditory grouping / segregation Bregman 1990: Auditory Scene Analysis Primitive grouping cues: Common onset Common pitch Common AM/FM modulation Common location All have to do with the physics of sound generation See also:

14 Auditory grouping / segregation Fusion by common frequency change Common frequency modulation is a grouping cue (Bregman,

15 Visual grouping / occlusion Apparent continuity Difficult to see what we are dealing with

16 Visual grouping / occlusion Apparent continuity Without providing extra parts of the letters we can now see the letter B We added information about where the letters are cut The overlay is a physically plausible cause for not seeing part of the letters

17 Frequency Frequency Frequency Is the auditory equivalent a two speaker situation? Speaker Female speaker Two 2 speakers Time Male Speaker x 10 4 Male speaker Time x

18 Is the auditory equivalent a two speaker situation? Female speaker 2 speakers 30 ms frames 1 critical band Linear Sum Male speaker Mask Dominating voice only Female mask Male mask

19 Role of low-snr speech glimpses? Schoenmaker and van de Par (Advances in experimental medicine and biology, 2016) Remove speech target tiles with and SNR below a criterion value Speech intelligibility impaired only beyond about 0 db SNR Only positive SNR parts of speech contribute to intelligibility

20 What about reverberation? Female speaker 2 speakers 30 ms frames 1 critical band T60 = 750 ms Linear Sum Male speaker Mask Reverberated Female mask Male mask

21 Reverberation and the Auditory Representation Reverberation will temporally smear the auditory signal Multiple delayed reflections will add to the direct sound Often reverberant field is stronger than direct sound (critical radius) Speech phonemes will start to overlap (Speech rate 10 Syllables/sec) Music is slower (Allegro 150 bpm 3 notes/sec) Segregation will become more difficult Remember the primitive grouping cues: Common onset (largely preserved) Common pitch (pitch unaffected, changes will be smeared) Common AM/FM modulation (high rates changed and converted) Common location (much reduced reliability)

22 Target at 10º Measure distribution of binaural cues

23 Target at 10º Reverberation Measure distribution of binaural cues

24 Sound localization: Precedence effect (Haas effect) Precedence effect: - The first arriving wave front determines perceived direction - Allows spatial cues to contribute to segregation in reverberant conditions

25 Intermediate summary How does perception cope with reverberation: Reverberation is not represented well in the brain due to coarse spectro-temporal resolution of the auditory system Important perceptual segregation cues are robust against reverb Common onset Pitch Common low-rate AM/FM Spatial cues (due to precedence effect)

26 How to use this knowledge for sound field control The auditory principles that cope with reverberation are implemented in the transformed auditory domain. It is not possible to apply these processing principles directly on acoustical signals. Two examples will be given that use perceptual processing knowledge for sound field control

27 Authentic Audio reproduction Recording room Playback room

28 Authentic Audio reproduction Approach to authentic reproduction: - Optimizing spatial parameters on a coarse spectro-temporal scale is enough: - Direct sound for directional information - Reverberant sound for ASW and LEV (IACC) - Texture cues are represented in microphone signals - Consider the (reverberant) acoustics at the reproduction side Grosse and van de Par (IEEE J. OF SELECTED TOPICS IN SIGNAL PROCESSING, 2015)

29 Room-in-room reproduction

30 Room-in-room reproduction Only direct sound can be reproduced optimally No control over reverberant sound field

31 Perceptual approach Perceptual Optimization

32 Perceptual approach Perceptual Optimization Optimization targeting perceptually relevant statistical properties of reverberant sound field

33 Perceptual approach Perceptual Optimization Optimization targeting perceptually relevant statistical properties of reverberant sound field The acoustics of the playback room is an integral part of the optimization

34 Optimization Optimize perceptually relevant statistical parameters: Auditory Transfer Function Direct sound (front loudspeakers) Reverberant sound (dipole loudspeakers) Interaural Cross Correlation (frequency dependent) Cross-talk dipole loudspeakers T60 Direct-to-reverberant ratio

35 Perceptual approach

36 Evaluation All reproduction methods simulated with Room Impulse Responses over headphones convolved with dry instrument recordings Compare objective parameters Recording: Seminar room & Church 699 (ms) 3040 (ms) Playback (PBR): Small Lab & Seminar Room 371(ms) 697 (ms)

37 L (db) IACC E (db) E RinR E RinR,Opt E mch Frequency (Hz) -0.5 IACC ref IACC RinR IACC RinR-Opt IACC mch Frequency (Hz) edc ref edc RinR edc RinR-Opt edc mch t (ms) Objective parameters RinR = Conventional reproduction without optimization RinR,Opt = Our proposed optimization mch = Multi-channel reproduction with surround speakers Ref = Recording room - Coloration can be reduced compared to RinR - Spatial properties (IACC) better conserved

38 L (db) IACC E (db) E RinR E RinR,Opt E mch Frequency (Hz) -0.5 IACC ref IACC RinR IACC RinR-Opt IACC mch Frequency (Hz) edc ref edc RinR edc RinR-Opt edc mch t (ms) Objective parameters RinR = Conventional reproduction without optimization RinR,Opt = Our proposed optimization mch = Multi-channel reproduction with surround speakers Ref = Recording room - Coloration can be reduced compared to RinR - Spatial properties (IACC) better conserved

39 Listening test All reproduction methods simulated with Room Impulse Responses over headphones convolved with dry instrument recordings MUSHRA test Ref = Recording room Recording: Seminar room & Church 699 (ms) 3040 (ms) Playback (PBR): Small Lab & Seminar Room 371(ms) 697 (ms)

40 Results and Conclusions Simple loudspeaker setup allows: Seminar room Perceptual authentic reproduction Individualization by considering playback acoustics Church Grosse and van de Par (2015) IEEE Journal of Selected Topics in Signal Processing

41 Perceptual dereverberation Scenario: - Speech reproduction in a reverberant room - Preprocessing of the speech signal to enhance speech intelligibility Speech signal Preprocessing Reverberant room

42 Amplitude Perceptual dereverberation Main Idea: - Conserve spectro-temporal pattern - Use time-variant filtering (Hodoshima et al., 2006) Reverberated sine Clean sine Time(s)

43 Perceptual dereverberation Approach: Preprocessing of Loudspeaker inputs Adapt current frame based on past Optimize algorithm parameters with perceptual model. (Jørgensen et al. 2013)

44 Perceptual dereverberation Listening test: - Reverberated (pre-processed) speech with reverberated noise - Measure Speech Reception Threshold

45 Perceptual dereverberation Listening test: - Robustness for position - Measure Speech Reception Threshold

46 Summary The auditory system: Uses low-resolution spectro-temporal representation Extracts some special texture cues Uses robust cues for segregation/grouping Two examples for sound field control were shown Authentic audio reproduction in a reverberant playback room Perceptual dereverberation

47 Thank you for your attention Questions

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction Human performance Reverberation

More information

Binaural Hearing. Reading: Yost Ch. 12

Binaural Hearing. Reading: Yost Ch. 12 Binaural Hearing Reading: Yost Ch. 12 Binaural Advantages Sounds in our environment are usually complex, and occur either simultaneously or close together in time. Studies have shown that the ability to

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb 2008. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum,

More information

Monaural and Binaural Speech Separation

Monaural and Binaural Speech Separation Monaural and Binaural Speech Separation DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction CASA approach to sound separation Ideal binary mask as

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb 2009. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence

More information

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner.

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner. Perception of pitch AUDL4007: 11 Feb 2010. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum, 2005 Chapter 7 1 Definitions

More information

Envelopment and Small Room Acoustics

Envelopment and Small Room Acoustics Envelopment and Small Room Acoustics David Griesinger Lexicon 3 Oak Park Bedford, MA 01730 Copyright 9/21/00 by David Griesinger Preview of results Loudness isn t everything! At least two additional perceptions:

More information

Accurate sound reproduction from two loudspeakers in a living room

Accurate sound reproduction from two loudspeakers in a living room Accurate sound reproduction from two loudspeakers in a living room Siegfried Linkwitz 13-Apr-08 (1) D M A B Visual Scene 13-Apr-08 (2) What object is this? 19-Apr-08 (3) Perception of sound 13-Apr-08 (4)

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 MODELING SPECTRAL AND TEMPORAL MASKING IN THE HUMAN AUDITORY SYSTEM PACS: 43.66.Ba, 43.66.Dc Dau, Torsten; Jepsen, Morten L.; Ewert,

More information

III. Publication III. c 2005 Toni Hirvonen.

III. Publication III. c 2005 Toni Hirvonen. III Publication III Hirvonen, T., Segregation of Two Simultaneously Arriving Narrowband Noise Signals as a Function of Spatial and Frequency Separation, in Proceedings of th International Conference on

More information

The Association of Loudspeaker Manufacturers & Acoustics International presents

The Association of Loudspeaker Manufacturers & Acoustics International presents The Association of Loudspeaker Manufacturers & Acoustics International presents MEASUREMENT OF HARMONIC DISTORTION AUDIBILITY USING A SIMPLIFIED PSYCHOACOUSTIC MODEL Steve Temme, Pascal Brunet, and Parastoo

More information

Exploiting envelope fluctuations to achieve robust extraction and intelligent integration of binaural cues

Exploiting envelope fluctuations to achieve robust extraction and intelligent integration of binaural cues The Technology of Binaural Listening & Understanding: Paper ICA216-445 Exploiting envelope fluctuations to achieve robust extraction and intelligent integration of binaural cues G. Christopher Stecker

More information

A binaural auditory model and applications to spatial sound evaluation

A binaural auditory model and applications to spatial sound evaluation A binaural auditory model and applications to spatial sound evaluation Ma r k o Ta k a n e n 1, Ga ë ta n Lo r h o 2, a n d Mat t i Ka r ja l a i n e n 1 1 Helsinki University of Technology, Dept. of Signal

More information

Psychoacoustic Cues in Room Size Perception

Psychoacoustic Cues in Room Size Perception Audio Engineering Society Convention Paper Presented at the 116th Convention 2004 May 8 11 Berlin, Germany 6084 This convention paper has been reproduced from the author s advance manuscript, without editing,

More information

Sound source localization and its use in multimedia applications

Sound source localization and its use in multimedia applications Notes for lecture/ Zack Settel, McGill University Sound source localization and its use in multimedia applications Introduction With the arrival of real-time binaural or "3D" digital audio processing,

More information

The Human Auditory System

The Human Auditory System medial geniculate nucleus primary auditory cortex inferior colliculus cochlea superior olivary complex The Human Auditory System Prominent Features of Binaural Hearing Localization Formation of positions

More information

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Loudness & Temporal resolution

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Loudness & Temporal resolution AUDL GS08/GAV1 Signals, systems, acoustics and the ear Loudness & Temporal resolution Absolute thresholds & Loudness Name some ways these concepts are crucial to audiologists Sivian & White (1933) JASA

More information

SOUND 1 -- ACOUSTICS 1

SOUND 1 -- ACOUSTICS 1 SOUND 1 -- ACOUSTICS 1 SOUND 1 ACOUSTICS AND PSYCHOACOUSTICS SOUND 1 -- ACOUSTICS 2 The Ear: SOUND 1 -- ACOUSTICS 3 The Ear: The ear is the organ of hearing. SOUND 1 -- ACOUSTICS 4 The Ear: The outer ear

More information

A CLOSER LOOK AT THE REPRESENTATION OF INTERAURAL DIFFERENCES IN A BINAURAL MODEL

A CLOSER LOOK AT THE REPRESENTATION OF INTERAURAL DIFFERENCES IN A BINAURAL MODEL 9th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, -7 SEPTEMBER 7 A CLOSER LOOK AT THE REPRESENTATION OF INTERAURAL DIFFERENCES IN A BINAURAL MODEL PACS: PACS:. Pn Nicolas Le Goff ; Armin Kohlrausch ; Jeroen

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception HCS 7367 Speech Perception Dr. Peter Assmann Fall 212 Power spectrum model of masking Assumptions: Only frequencies within the passband of the auditory filter contribute to masking. Detection is based

More information

Introduction. 1.1 Surround sound

Introduction. 1.1 Surround sound Introduction 1 This chapter introduces the project. First a brief description of surround sound is presented. A problem statement is defined which leads to the goal of the project. Finally the scope of

More information

Auditory modelling for speech processing in the perceptual domain

Auditory modelling for speech processing in the perceptual domain ANZIAM J. 45 (E) ppc964 C980, 2004 C964 Auditory modelling for speech processing in the perceptual domain L. Lin E. Ambikairajah W. H. Holmes (Received 8 August 2003; revised 28 January 2004) Abstract

More information

IN a natural environment, speech often occurs simultaneously. Monaural Speech Segregation Based on Pitch Tracking and Amplitude Modulation

IN a natural environment, speech often occurs simultaneously. Monaural Speech Segregation Based on Pitch Tracking and Amplitude Modulation IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 15, NO. 5, SEPTEMBER 2004 1135 Monaural Speech Segregation Based on Pitch Tracking and Amplitude Modulation Guoning Hu and DeLiang Wang, Fellow, IEEE Abstract

More information

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA Surround: The Current Technological Situation David Griesinger Lexicon 3 Oak Park Bedford, MA 01730 www.world.std.com/~griesngr There are many open questions 1. What is surround sound 2. Who will listen

More information

Auditory Localization

Auditory Localization Auditory Localization CMPT 468: Sound Localization Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University November 15, 2013 Auditory locatlization is the human perception

More information

Acoustics II: Kurt Heutschi recording technique. stereo recording. microphone positioning. surround sound recordings.

Acoustics II: Kurt Heutschi recording technique. stereo recording. microphone positioning. surround sound recordings. demo Acoustics II: recording Kurt Heutschi 2013-01-18 demo Stereo recording: Patent Blumlein, 1931 demo in a real listening experience in a room, different contributions are perceived with directional

More information

A classification-based cocktail-party processor

A classification-based cocktail-party processor A classification-based cocktail-party processor Nicoleta Roman, DeLiang Wang Department of Computer and Information Science and Center for Cognitive Science The Ohio State University Columbus, OH 43, USA

More information

Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model

Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model Sebastian Merchel and Stephan Groth Chair of Communication Acoustics, Dresden University

More information

1856 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 7, SEPTEMBER /$ IEEE

1856 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 7, SEPTEMBER /$ IEEE 1856 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 7, SEPTEMBER 2010 Sequential Organization of Speech in Reverberant Environments by Integrating Monaural Grouping and Binaural

More information

You know about adding up waves, e.g. from two loudspeakers. AUDL 4007 Auditory Perception. Week 2½. Mathematical prelude: Adding up levels

You know about adding up waves, e.g. from two loudspeakers. AUDL 4007 Auditory Perception. Week 2½. Mathematical prelude: Adding up levels AUDL 47 Auditory Perception You know about adding up waves, e.g. from two loudspeakers Week 2½ Mathematical prelude: Adding up levels 2 But how do you get the total rms from the rms values of two signals

More information

Auditory System For a Mobile Robot

Auditory System For a Mobile Robot Auditory System For a Mobile Robot PhD Thesis Jean-Marc Valin Department of Electrical Engineering and Computer Engineering Université de Sherbrooke, Québec, Canada Jean-Marc.Valin@USherbrooke.ca Motivations

More information

Hearing and Deafness 2. Ear as a frequency analyzer. Chris Darwin

Hearing and Deafness 2. Ear as a frequency analyzer. Chris Darwin Hearing and Deafness 2. Ear as a analyzer Chris Darwin Frequency: -Hz Sine Wave. Spectrum Amplitude against -..5 Time (s) Waveform Amplitude against time amp Hz Frequency: 5-Hz Sine Wave. Spectrum Amplitude

More information

Assessing the contribution of binaural cues for apparent source width perception via a functional model

Assessing the contribution of binaural cues for apparent source width perception via a functional model Virtual Acoustics: Paper ICA06-768 Assessing the contribution of binaural cues for apparent source width perception via a functional model Johannes Käsbach (a), Manuel Hahmann (a), Tobias May (a) and Torsten

More information

Spectral and temporal processing in the human auditory system

Spectral and temporal processing in the human auditory system Spectral and temporal processing in the human auditory system To r s t e n Da u 1, Mo rt e n L. Jepsen 1, a n d St e p h a n D. Ew e r t 2 1Centre for Applied Hearing Research, Ørsted DTU, Technical University

More information

University of Huddersfield Repository

University of Huddersfield Repository University of Huddersfield Repository Lee, Hyunkook Capturing and Rendering 360º VR Audio Using Cardioid Microphones Original Citation Lee, Hyunkook (2016) Capturing and Rendering 360º VR Audio Using Cardioid

More information

MUS 302 ENGINEERING SECTION

MUS 302 ENGINEERING SECTION MUS 302 ENGINEERING SECTION Wiley Ross: Recording Studio Coordinator Email =>ross@email.arizona.edu Twitter=> https://twitter.com/ssor Web page => http://www.arts.arizona.edu/studio Youtube Channel=>http://www.youtube.com/user/wileyross

More information

From Binaural Technology to Virtual Reality

From Binaural Technology to Virtual Reality From Binaural Technology to Virtual Reality Jens Blauert, D-Bochum Prominent Prominent Features of of Binaural Binaural Hearing Hearing - Localization Formation of positions of the auditory events (azimuth,

More information

AUDITORY ILLUSIONS & LAB REPORT FORM

AUDITORY ILLUSIONS & LAB REPORT FORM 01/02 Illusions - 1 AUDITORY ILLUSIONS & LAB REPORT FORM NAME: DATE: PARTNER(S): The objective of this experiment is: To understand concepts such as beats, localization, masking, and musical effects. APPARATUS:

More information

Pitch-based monaural segregation of reverberant speech

Pitch-based monaural segregation of reverberant speech Pitch-based monaural segregation of reverberant speech Nicoleta Roman a Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210 DeLiang Wang b Department of Computer

More information

Recent Advances in Acoustic Signal Extraction and Dereverberation

Recent Advances in Acoustic Signal Extraction and Dereverberation Recent Advances in Acoustic Signal Extraction and Dereverberation Emanuël Habets Erlangen Colloquium 2016 Scenario Spatial Filtering Estimated Desired Signal Undesired sound components: Sensor noise Competing

More information

PERFORMANCE COMPARISON BETWEEN STEREAUSIS AND INCOHERENT WIDEBAND MUSIC FOR LOCALIZATION OF GROUND VEHICLES ABSTRACT

PERFORMANCE COMPARISON BETWEEN STEREAUSIS AND INCOHERENT WIDEBAND MUSIC FOR LOCALIZATION OF GROUND VEHICLES ABSTRACT Approved for public release; distribution is unlimited. PERFORMANCE COMPARISON BETWEEN STEREAUSIS AND INCOHERENT WIDEBAND MUSIC FOR LOCALIZATION OF GROUND VEHICLES September 1999 Tien Pham U.S. Army Research

More information

Psycho-acoustics (Sound characteristics, Masking, and Loudness)

Psycho-acoustics (Sound characteristics, Masking, and Loudness) Psycho-acoustics (Sound characteristics, Masking, and Loudness) Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University Mar. 20, 2008 Pure tones Mathematics of the pure

More information

Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012

Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 o Music signal characteristics o Perceptual attributes and acoustic properties o Signal representations for pitch detection o STFT o Sinusoidal model o

More information

THE MATLAB IMPLEMENTATION OF BINAURAL PROCESSING MODEL SIMULATING LATERAL POSITION OF TONES WITH INTERAURAL TIME DIFFERENCES

THE MATLAB IMPLEMENTATION OF BINAURAL PROCESSING MODEL SIMULATING LATERAL POSITION OF TONES WITH INTERAURAL TIME DIFFERENCES THE MATLAB IMPLEMENTATION OF BINAURAL PROCESSING MODEL SIMULATING LATERAL POSITION OF TONES WITH INTERAURAL TIME DIFFERENCES J. Bouše, V. Vencovský Department of Radioelectronics, Faculty of Electrical

More information

Audio Engineering Society. Convention Paper. Presented at the 124th Convention 2008 May Amsterdam, The Netherlands

Audio Engineering Society. Convention Paper. Presented at the 124th Convention 2008 May Amsterdam, The Netherlands Audio Engineering Society Convention Paper Presented at the 124th Convention 2008 May 17 20 Amsterdam, The Netherlands The papers at this Convention have been selected on the basis of a submitted abstract

More information

Phase and Feedback in the Nonlinear Brain. Malcolm Slaney (IBM and Stanford) Hiroko Shiraiwa-Terasawa (Stanford) Regaip Sen (Stanford)

Phase and Feedback in the Nonlinear Brain. Malcolm Slaney (IBM and Stanford) Hiroko Shiraiwa-Terasawa (Stanford) Regaip Sen (Stanford) Phase and Feedback in the Nonlinear Brain Malcolm Slaney (IBM and Stanford) Hiroko Shiraiwa-Terasawa (Stanford) Regaip Sen (Stanford) Auditory processing pre-cosyne workshop March 23, 2004 Simplistic Models

More information

Robust Speech Recognition Based on Binaural Auditory Processing

Robust Speech Recognition Based on Binaural Auditory Processing INTERSPEECH 2017 August 20 24, 2017, Stockholm, Sweden Robust Speech Recognition Based on Binaural Auditory Processing Anjali Menon 1, Chanwoo Kim 2, Richard M. Stern 1 1 Department of Electrical and Computer

More information

Principles of Musical Acoustics

Principles of Musical Acoustics William M. Hartmann Principles of Musical Acoustics ^Spr inger Contents 1 Sound, Music, and Science 1 1.1 The Source 2 1.2 Transmission 3 1.3 Receiver 3 2 Vibrations 1 9 2.1 Mass and Spring 9 2.1.1 Definitions

More information

Binaural Cue Coding Part I: Psychoacoustic Fundamentals and Design Principles

Binaural Cue Coding Part I: Psychoacoustic Fundamentals and Design Principles IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 11, NO. 6, NOVEMBER 2003 509 Binaural Cue Coding Part I: Psychoacoustic Fundamentals and Design Principles Frank Baumgarte and Christof Faller Abstract

More information

the human chapter 1 Traffic lights the human User-centred Design Light Vision part 1 (modified extract for AISD 2005) Information i/o

the human chapter 1 Traffic lights the human User-centred Design Light Vision part 1 (modified extract for AISD 2005) Information i/o Traffic lights chapter 1 the human part 1 (modified extract for AISD 2005) http://www.baddesigns.com/manylts.html User-centred Design Bad design contradicts facts pertaining to human capabilities Usability

More information

Stefan Launer, Lyon, January 2011 Phonak AG, Stäfa, CH

Stefan Launer, Lyon, January 2011 Phonak AG, Stäfa, CH State of art and Challenges in Improving Speech Intelligibility in Hearing Impaired People Stefan Launer, Lyon, January 2011 Phonak AG, Stäfa, CH Content Phonak Stefan Launer, Speech in Noise Workshop,

More information

COM325 Computer Speech and Hearing

COM325 Computer Speech and Hearing COM325 Computer Speech and Hearing Part III : Theories and Models of Pitch Perception Dr. Guy Brown Room 145 Regent Court Department of Computer Science University of Sheffield Email: g.brown@dcs.shef.ac.uk

More information

Monaural and binaural processing of fluctuating sounds in the auditory system

Monaural and binaural processing of fluctuating sounds in the auditory system Monaural and binaural processing of fluctuating sounds in the auditory system Eric R. Thompson September 23, 2005 MSc Thesis Acoustic Technology Ørsted DTU Technical University of Denmark Supervisor: Torsten

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Architectural Acoustics Session 2aAAa: Adapting, Enhancing, and Fictionalizing

More information

Pitch-Based Segregation of Reverberant Speech

Pitch-Based Segregation of Reverberant Speech Technical Report OSU-CISRC-4/5-TR22 Department of Computer Science and Engineering The Ohio State University Columbus, OH 4321-1277 Ftp site: ftp.cse.ohio-state.edu Login: anonymous Directory: pub/tech-report/25

More information

Imagine the cochlea unrolled

Imagine the cochlea unrolled 2 2 1 1 1 1 1 Cochlea & Auditory Nerve: obligatory stages of auditory processing Think of the auditory periphery as a processor of signals 2 2 1 1 1 1 1 Imagine the cochlea unrolled Basilar membrane motion

More information

AUDL GS08/GAV1 Auditory Perception. Envelope and temporal fine structure (TFS)

AUDL GS08/GAV1 Auditory Perception. Envelope and temporal fine structure (TFS) AUDL GS08/GAV1 Auditory Perception Envelope and temporal fine structure (TFS) Envelope and TFS arise from a method of decomposing waveforms The classic decomposition of waveforms Spectral analysis... Decomposes

More information

Waves Nx VIRTUAL REALITY AUDIO

Waves Nx VIRTUAL REALITY AUDIO Waves Nx VIRTUAL REALITY AUDIO WAVES VIRTUAL REALITY AUDIO THE FUTURE OF AUDIO REPRODUCTION AND CREATION Today s entertainment is on a mission to recreate the real world. Just as VR makes us feel like

More information

Temporal resolution AUDL Domain of temporal resolution. Fine structure and envelope. Modulating a sinusoid. Fine structure and envelope

Temporal resolution AUDL Domain of temporal resolution. Fine structure and envelope. Modulating a sinusoid. Fine structure and envelope Modulating a sinusoid can also work this backwards! Temporal resolution AUDL 4007 carrier (fine structure) x modulator (envelope) = amplitudemodulated wave 1 2 Domain of temporal resolution Fine structure

More information

Acoustics, signals & systems for audiology. Week 9. Basic Psychoacoustic Phenomena: Temporal resolution

Acoustics, signals & systems for audiology. Week 9. Basic Psychoacoustic Phenomena: Temporal resolution Acoustics, signals & systems for audiology Week 9 Basic Psychoacoustic Phenomena: Temporal resolution Modulating a sinusoid carrier at 1 khz (fine structure) x modulator at 100 Hz (envelope) = amplitudemodulated

More information

Recurrent Timing Neural Networks for Joint F0-Localisation Estimation

Recurrent Timing Neural Networks for Joint F0-Localisation Estimation Recurrent Timing Neural Networks for Joint F0-Localisation Estimation Stuart N. Wrigley and Guy J. Brown Department of Computer Science, University of Sheffield Regent Court, 211 Portobello Street, Sheffield

More information

Chapter 3. Meeting 3, Psychoacoustics, Hearing, and Reflections

Chapter 3. Meeting 3, Psychoacoustics, Hearing, and Reflections Chapter 3. Meeting 3, Psychoacoustics, Hearing, and Reflections 3.1. Announcements Need schlep crew for Tuesday (and other days) Due Today, 15 February: Mix Graph 1 Quiz next Tuesday (we meet Tuesday,

More information

Robust Speech Recognition Based on Binaural Auditory Processing

Robust Speech Recognition Based on Binaural Auditory Processing Robust Speech Recognition Based on Binaural Auditory Processing Anjali Menon 1, Chanwoo Kim 2, Richard M. Stern 1 1 Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh,

More information

Robust Speech Recognition Group Carnegie Mellon University. Telephone: Fax:

Robust Speech Recognition Group Carnegie Mellon University. Telephone: Fax: Robust Automatic Speech Recognition In the 21 st Century Richard Stern (with Alex Acero, Yu-Hsiang Chiu, Evandro Gouvêa, Chanwoo Kim, Kshitiz Kumar, Amir Moghimi, Pedro Moreno, Hyung-Min Park, Bhiksha

More information

The analysis of multi-channel sound reproduction algorithms using HRTF data

The analysis of multi-channel sound reproduction algorithms using HRTF data The analysis of multichannel sound reproduction algorithms using HRTF data B. Wiggins, I. PatersonStephens, P. Schillebeeckx Processing Applications Research Group University of Derby Derby, United Kingdom

More information

The role of fine structure in bilateral cochlear implantation

The role of fine structure in bilateral cochlear implantation Acoustics Research Institute Austrian Academy of Sciences The role of fine structure in bilateral cochlear implantation Laback, B., Majdak, P., Baumgartner, W. D. Interaural Time Difference (ITD) Sound

More information

AUDL 4007 Auditory Perception. Week 1. The cochlea & auditory nerve: Obligatory stages of auditory processing

AUDL 4007 Auditory Perception. Week 1. The cochlea & auditory nerve: Obligatory stages of auditory processing AUDL 4007 Auditory Perception Week 1 The cochlea & auditory nerve: Obligatory stages of auditory processing 1 Think of the ear as a collection of systems, transforming sounds to be sent to the brain 25

More information

The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation

The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation Downloaded from orbit.dtu.dk on: Feb 05, 2018 The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation Käsbach, Johannes;

More information

Lateralisation of multiple sound sources by the auditory system

Lateralisation of multiple sound sources by the auditory system Modeling of Binaural Discrimination of multiple Sound Sources: A Contribution to the Development of a Cocktail-Party-Processor 4 H.SLATKY (Lehrstuhl für allgemeine Elektrotechnik und Akustik, Ruhr-Universität

More information

ScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking

ScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 122 126 International Conference on Information and Communication Technologies (ICICT 2014) Unsupervised Speech

More information

Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks

Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks Mariam Yiwere 1 and Eun Joo Rhee 2 1 Department of Computer Engineering, Hanbat National University,

More information

IMPLEMENTATION AND APPLICATION OF A BINAURAL HEARING MODEL TO THE OBJECTIVE EVALUATION OF SPATIAL IMPRESSION

IMPLEMENTATION AND APPLICATION OF A BINAURAL HEARING MODEL TO THE OBJECTIVE EVALUATION OF SPATIAL IMPRESSION IMPLEMENTATION AND APPLICATION OF A BINAURAL HEARING MODEL TO THE OBJECTIVE EVALUATION OF SPATIAL IMPRESSION RUSSELL MASON Institute of Sound Recording, University of Surrey, Guildford, UK r.mason@surrey.ac.uk

More information

INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR PROPOSING A STANDARDISED TESTING ENVIRONMENT FOR BINAURAL SYSTEMS

INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR PROPOSING A STANDARDISED TESTING ENVIRONMENT FOR BINAURAL SYSTEMS 20-21 September 2018, BULGARIA 1 Proceedings of the International Conference on Information Technologies (InfoTech-2018) 20-21 September 2018, Bulgaria INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR

More information

Enhancing 3D Audio Using Blind Bandwidth Extension

Enhancing 3D Audio Using Blind Bandwidth Extension Enhancing 3D Audio Using Blind Bandwidth Extension (PREPRINT) Tim Habigt, Marko Ðurković, Martin Rothbucher, and Klaus Diepold Institute for Data Processing, Technische Universität München, 829 München,

More information

NAME STUDENT # ELEC 484 Audio Signal Processing. Midterm Exam July Listening test

NAME STUDENT # ELEC 484 Audio Signal Processing. Midterm Exam July Listening test NAME STUDENT # ELEC 484 Audio Signal Processing Midterm Exam July 2008 CLOSED BOOK EXAM Time 1 hour Listening test Choose one of the digital audio effects for each sound example. Put only ONE mark in each

More information

THE TEMPORAL and spectral structure of a sound signal

THE TEMPORAL and spectral structure of a sound signal IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 13, NO. 1, JANUARY 2005 105 Localization of Virtual Sources in Multichannel Audio Reproduction Ville Pulkki and Toni Hirvonen Abstract The localization

More information

MULTICHANNEL CONTROL OF SPATIAL EXTENT THROUGH SINUSOIDAL PARTIAL MODULATION (SPM)

MULTICHANNEL CONTROL OF SPATIAL EXTENT THROUGH SINUSOIDAL PARTIAL MODULATION (SPM) MULTICHANNEL CONTROL OF SPATIAL EXTENT THROUGH SINUSOIDAL PARTIAL MODULATION (SPM) Andrés Cabrera Media Arts and Technology University of California Santa Barbara, USA andres@mat.ucsb.edu Gary Kendall

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

AN AUDITORILY MOTIVATED ANALYSIS METHOD FOR ROOM IMPULSE RESPONSES

AN AUDITORILY MOTIVATED ANALYSIS METHOD FOR ROOM IMPULSE RESPONSES Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-), Verona, Italy, December 7-9,2 AN AUDITORILY MOTIVATED ANALYSIS METHOD FOR ROOM IMPULSE RESPONSES Tapio Lokki Telecommunications

More information

Testing of Objective Audio Quality Assessment Models on Archive Recordings Artifacts

Testing of Objective Audio Quality Assessment Models on Archive Recordings Artifacts POSTER 25, PRAGUE MAY 4 Testing of Objective Audio Quality Assessment Models on Archive Recordings Artifacts Bc. Martin Zalabák Department of Radioelectronics, Czech Technical University in Prague, Technická

More information

I R UNDERGRADUATE REPORT. Stereausis: A Binaural Processing Model. by Samuel Jiawei Ng Advisor: P.S. Krishnaprasad UG

I R UNDERGRADUATE REPORT. Stereausis: A Binaural Processing Model. by Samuel Jiawei Ng Advisor: P.S. Krishnaprasad UG UNDERGRADUATE REPORT Stereausis: A Binaural Processing Model by Samuel Jiawei Ng Advisor: P.S. Krishnaprasad UG 2001-6 I R INSTITUTE FOR SYSTEMS RESEARCH ISR develops, applies and teaches advanced methodologies

More information

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,

More information

Spectro-Temporal Methods in Primary Auditory Cortex David Klein Didier Depireux Jonathan Simon Shihab Shamma

Spectro-Temporal Methods in Primary Auditory Cortex David Klein Didier Depireux Jonathan Simon Shihab Shamma Spectro-Temporal Methods in Primary Auditory Cortex David Klein Didier Depireux Jonathan Simon Shihab Shamma & Department of Electrical Engineering Supported in part by a MURI grant from the Office of

More information

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4 SOPA version 2 Revised July 7 2014 SOPA project September 21, 2014 Contents 1 Introduction 2 2 Basic concept 3 3 Capturing spatial audio 4 4 Sphere around your head 5 5 Reproduction 7 5.1 Binaural reproduction......................

More information

Two-channel Separation of Speech Using Direction-of-arrival Estimation And Sinusoids Plus Transients Modeling

Two-channel Separation of Speech Using Direction-of-arrival Estimation And Sinusoids Plus Transients Modeling Two-channel Separation of Speech Using Direction-of-arrival Estimation And Sinusoids Plus Transients Modeling Mikko Parviainen 1 and Tuomas Virtanen 2 Institute of Signal Processing Tampere University

More information

ROBUST SPEECH RECOGNITION. Richard Stern

ROBUST SPEECH RECOGNITION. Richard Stern ROBUST SPEECH RECOGNITION Richard Stern Robust Speech Recognition Group Mellon University Telephone: (412) 268-2535 Fax: (412) 268-3890 rms@cs.cmu.edu http://www.cs.cmu.edu/~rms Short Course at Universidad

More information

IMPROVED COCKTAIL-PARTY PROCESSING

IMPROVED COCKTAIL-PARTY PROCESSING IMPROVED COCKTAIL-PARTY PROCESSING Alexis Favrot, Markus Erne Scopein Research Aarau, Switzerland postmaster@scopein.ch Christof Faller Audiovisual Communications Laboratory, LCAV Swiss Institute of Technology

More information

URBANA-CHAMPAIGN. CS 498PS Audio Computing Lab. 3D and Virtual Sound. Paris Smaragdis. paris.cs.illinois.

URBANA-CHAMPAIGN. CS 498PS Audio Computing Lab. 3D and Virtual Sound. Paris Smaragdis. paris.cs.illinois. UNIVERSITY ILLINOIS @ URBANA-CHAMPAIGN OF CS 498PS Audio Computing Lab 3D and Virtual Sound Paris Smaragdis paris@illinois.edu paris.cs.illinois.edu Overview Human perception of sound and space ITD, IID,

More information

Predicting localization accuracy for stereophonic downmixes in Wave Field Synthesis

Predicting localization accuracy for stereophonic downmixes in Wave Field Synthesis Predicting localization accuracy for stereophonic downmixes in Wave Field Synthesis Hagen Wierstorf Assessment of IP-based Applications, T-Labs, Technische Universität Berlin, Berlin, Germany. Sascha Spors

More information

Enhanced Waveform Interpolative Coding at 4 kbps

Enhanced Waveform Interpolative Coding at 4 kbps Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression

More information

Final Exam Study Guide: Introduction to Computer Music Course Staff April 24, 2015

Final Exam Study Guide: Introduction to Computer Music Course Staff April 24, 2015 Final Exam Study Guide: 15-322 Introduction to Computer Music Course Staff April 24, 2015 This document is intended to help you identify and master the main concepts of 15-322, which is also what we intend

More information

Multiple Sound Sources Localization Using Energetic Analysis Method

Multiple Sound Sources Localization Using Energetic Analysis Method VOL.3, NO.4, DECEMBER 1 Multiple Sound Sources Localization Using Energetic Analysis Method Hasan Khaddour, Jiří Schimmel Department of Telecommunications FEEC, Brno University of Technology Purkyňova

More information

Sound Source Localization using HRTF database

Sound Source Localization using HRTF database ICCAS June -, KINTEX, Gyeonggi-Do, Korea Sound Source Localization using HRTF database Sungmok Hwang*, Youngjin Park and Younsik Park * Center for Noise and Vibration Control, Dept. of Mech. Eng., KAIST,

More information

Simultaneous Recognition of Speech Commands by a Robot using a Small Microphone Array

Simultaneous Recognition of Speech Commands by a Robot using a Small Microphone Array 2012 2nd International Conference on Computer Design and Engineering (ICCDE 2012) IPCSIT vol. 49 (2012) (2012) IACSIT Press, Singapore DOI: 10.7763/IPCSIT.2012.V49.14 Simultaneous Recognition of Speech

More information

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Alfredo Zermini, Qiuqiang Kong, Yong Xu, Mark D. Plumbley, Wenwu Wang Centre for Vision,

More information

Speaker placement, externalization, and envelopment in home listening rooms

Speaker placement, externalization, and envelopment in home listening rooms Speaker placement, externalization, and envelopment in home listening rooms David Griesinger Lexicon 3 Oak Park Bedford, MA 01730 dg@lexicon.com Abstract The ideal number and placement of low frequency

More information

Psychoacoustics of 3D Sound Recording: Research and Practice

Psychoacoustics of 3D Sound Recording: Research and Practice Psychoacoustics of 3D Sound Recording: Research and Practice Dr Hyunkook Lee University of Huddersfield, UK h.lee@hud.ac.uk www.hyunkooklee.com www.hud.ac.uk/apl About me Senior Lecturer (i.e. Associate

More information

Single Channel Speaker Segregation using Sinusoidal Residual Modeling

Single Channel Speaker Segregation using Sinusoidal Residual Modeling NCC 2009, January 16-18, IIT Guwahati 294 Single Channel Speaker Segregation using Sinusoidal Residual Modeling Rajesh M Hegde and A. Srinivas Dept. of Electrical Engineering Indian Institute of Technology

More information

Physics 101. Lecture 21 Doppler Effect Loudness Human Hearing Interference of Sound Waves Reflection & Refraction of Sound

Physics 101. Lecture 21 Doppler Effect Loudness Human Hearing Interference of Sound Waves Reflection & Refraction of Sound Physics 101 Lecture 21 Doppler Effect Loudness Human Hearing Interference of Sound Waves Reflection & Refraction of Sound Quiz: Monday Oct. 18; Chaps. 16,17,18(as covered in class),19 CR/NC Deadline Oct.

More information