
State of the Art and Challenges in Improving Speech Intelligibility in Hearing Impaired People
Stefan Launer, Lyon, January 2011. Phonak AG, Stäfa, CH

Content
Speech intelligibility in complex listening environments for hearing impaired persons
Noise reduction technologies in hearing instruments: de-reverberation, single-microphone technology, multi-microphone technology, FM systems
Results
Challenges

Speech Intelligibility in Noise??? Speech Intelligibility in Complex Listening Conditions!!
Different types of interfering sources
Different spatial arrangements of sources and interferers
Dynamic / room acoustics: reverberation, distance

Speech Intelligibility in Noise??? Speech Intelligibility in Complex Listening Conditions!!
Test methodology
Speech tests: short sentences, words, phonemes; target from the front, static
White noise from the back
Anechoic environment
Lab / real-life results: speech intelligibility, listening effort

Speech Intelligibility in Noise
[Figure: SNR (dB) versus hearing loss in dB (3-frequency average), with curves for mild, moderate and severe hearing loss. Killion 1997]

The physical structure of the interfering signal has a strong impact on speech intelligibility
Introducing...
spectral dips: SH 3-4 dB SRT benefit, NH 9-15 dB
temporal dips: SH 1-2 dB SRT benefit, NH 6-7 dB
combination of both: SH 4-5 dB SRT benefit, NH 15-20 dB
... improves speech intelligibility a lot for normal-hearing subjects, much less so for hearing-impaired subjects! (Peters, Moore and Baer 1998, JASA)

Speech Intelligibility in Multi-talker Environments
Speech intelligibility as a function of the number of interfering talkers (Fig. 2, Bronkhorst and Plomp, JASA 1992)

Spatial Release from Masking - Anechoic Chamber
NH vs. HI: 10 dB! (Beutelmann & Brand, JASA 2006)

Spatial Release from Masking - Office
Spatial release reduced by reverberation. NH vs. HI: 4 dB! (Beutelmann & Brand, JASA 2006)

Spatial Release from Masking - Cafeteria
NH vs. HI: 6-7 dB! (Beutelmann & Brand, JASA 2006)

Speech Intelligibility in Reverberant Environments
[Figure: percent correct versus reverberation time (sound suite, T = 0.54 s, T = 1.55 s) for normal, mild and moderate/severe hearing loss. Harris & Swenson, Audiology 1990, p. 314-321]

How to Mix a Speech in Noise Cocktail
Ingredients: noise cancelling, directional microphones -> speech in noise cocktail
Objectives for a hearing instrument: speech intelligibility improvement!!!! Ease of listening, listening effort, listening comfort

Noise Reduction Using a Single Microphone
Single-microphone noise cancellers in principle estimate the noise and subtract it from the noisy signal:
input (S + N), adaptive filter H = 1 - N*/(S + N), output (S + N) - N* ≈ S,
where the noise estimate N* comes from speech detection / noise estimation (statistical estimation, amplitude modulation, noise detection in speech pauses).
A single information source is used to separate two signals.

Reverberation Canceller (EchoBlock)
Reduces the smearing effects by de-blurring the speech signal.
[Figure: signal level over time, distinguishing the time span of early reflections from the time span of disturbing reflections attenuated by EchoBlock]
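The single-microphone noise canceller sketched above (adaptive gain H = 1 - N*/(S + N), noise estimated in speech pauses) can be illustrated with a minimal spectral-subtraction loop. This is a generic sketch with illustrative parameters and a deliberately crude energy-based pause detector, not Phonak's algorithm:

```python
import numpy as np

def single_mic_noise_reduction(x, fs, frame_len=256, hop=128,
                               alpha=0.95, gain_floor=0.1):
    """Spectral-subtraction-style canceller: gain H = 1 - N*/(S+N)."""
    window = np.hanning(frame_len)
    noise_psd = None                      # running noise estimate N*
    out = np.zeros(len(x))
    for start in range(0, len(x) - frame_len, hop):
        frame = x[start:start + frame_len] * window
        spec = np.fft.rfft(frame)
        psd = np.abs(spec) ** 2

        if noise_psd is None:
            noise_psd = psd.copy()        # initialise from the first frame
        # crude "speech pause" detector: update the noise estimate when the
        # frame energy is close to the current noise estimate
        if psd.sum() < 2.0 * noise_psd.sum():
            noise_psd = alpha * noise_psd + (1 - alpha) * psd

        # gain H = 1 - N*/(S+N), floored to limit musical noise
        gain = np.maximum(1.0 - noise_psd / np.maximum(psd, 1e-12), gain_floor)
        out[start:start + frame_len] += np.fft.irfft(gain * spec) * window
    return out
```

In practice the noise estimator (statistical estimation, modulation analysis, detection in speech pauses, as listed on the slide) and the gain smoothing matter far more than the exact gain rule.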

Single Microphone Noise Reduction - Summary
This technique performs well at eliminating stationary noises, such as a fan or car noise.
Reverberation: very reverberant rooms.
Speech-like noises can't be suppressed without degrading speech quality at the same time.
... ease of listening: improved listening comfort, reduced perceived noisiness, less annoyance.
Improvement of speech intelligibility??? Sound quality is a trade-off.

Delay & Sum Technique
The acoustic signal is picked up at two different locations by the front and the back microphones (spacing d).
The signal from the back is delayed (delay = d/c).
The signals from both microphones are combined.
Depending on the delay, different directions are attenuated.
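A minimal sketch of the two-microphone arrangement on the Delay & Sum slide: the back-microphone signal is delayed by the port travel time d/c and combined with the front signal so that one direction is cancelled. In typical hearing-aid implementations the delayed back signal is subtracted (first-order differential processing), which puts the null toward the rear; the spacing, sign and sampling details below are illustrative assumptions:

```python
import numpy as np

def two_mic_directional(front, back, fs, d=0.012, c=343.0, subtract=True):
    """First-order two-microphone processing: delay the back microphone by
    the acoustic travel time d/c, then combine with the front signal.
    With subtraction, sound arriving from the rear cancels (cardioid null
    at 180 degrees); the internal delay determines which direction is
    attenuated."""
    delay = d / c                                 # internal delay in seconds
    n = len(back)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    # fractional delay applied in the frequency domain
    back_delayed = np.fft.irfft(np.fft.rfft(back)
                                * np.exp(-2j * np.pi * freqs * delay), n)
    return front - back_delayed if subtract else front + back_delayed
```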

Digital Adaptive Directional Microphones
Adaptive: minimise the output energy of the two-microphone system.
[Block diagram: front and back microphones, each through an A/D converter, feeding a spatial processor with weighting factor α]
The spatial weighting factor (α) is continuously adapted; the Directivity Index is thereby optimised.

Digital Adaptive Directional Microphones
Amplify sounds from the front; adaptively attenuate the strongest noise source.
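A sketch of the adaptation principle on the Digital Adaptive Directional Microphones slide: a common realisation forms a forward- and a backward-facing cardioid from the two microphones and adapts the weighting factor α to minimise the output energy, which steers the rear null onto the strongest noise source. This is a generic gradient-descent formulation with illustrative parameters, not the product implementation:

```python
import numpy as np

def adaptive_directional(front, back, fs, d=0.012, c=343.0,
                         mu=0.05, block=64):
    """Adaptive first-order directional microphone.
    Output y = c_front - alpha * c_back, with alpha adapted per block to
    minimise the output energy (alpha limited to [0, 1] so the null stays
    in the rear half-plane)."""
    # internal delay rounded to whole samples for simplicity (assumes fs
    # is high enough that one sample roughly matches d/c)
    delay = max(1, int(round(fs * d / c)))
    # forward cardioid: front minus delayed back; backward cardioid: back minus delayed front
    c_f = front[delay:] - back[:-delay]
    c_b = back[delay:] - front[:-delay]
    alpha, out = 0.0, np.zeros(len(c_f))
    for i in range(0, len(c_f) - block, block):
        f, b = c_f[i:i + block], c_b[i:i + block]
        y = f - alpha * b
        out[i:i + block] = y
        # normalised gradient step on E{y^2} with respect to alpha
        alpha += mu * np.dot(y, b) / (np.dot(b, b) + 1e-12)
        alpha = np.clip(alpha, 0.0, 1.0)
    return out
```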

Frequency-Specific Beamforming
Directivity is computed in each frequency band.

Directional Microphones: Potential and Limitations
Significant speech intelligibility benefit compared to omnidirectional systems in complex listening conditions: noise from the side, asymmetric, diffuse or moving noises, reverberant environments and larger distances.
Lab results: 3-6 dB improvements.

Directional Microphones: Potential and Limitations
Positioning on the head
Microphone mismatch, ageing, etc.
More than two microphones
Noise floor
Is a narrow beam pattern acceptable?
Size constraint: low-frequency roll-off
Computational complexity

Directivity Index for Different Product Styles and Placements
[Figure: Directivity Index (DI, 0-5 dB) for different product styles and microphone placements, e.g. BTE]
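For reference, the Directivity Index plotted on this slide is conventionally defined (standard acoustics definition, not taken from the presentation) as the ratio of the sensitivity toward the look direction to the sensitivity averaged over all directions:

```latex
\mathrm{DI}(f) = 10 \log_{10}
\frac{|D(f,\theta_0,\phi_0)|^{2}}
     {\frac{1}{4\pi}\int_{0}^{2\pi}\!\!\int_{0}^{\pi} |D(f,\theta,\phi)|^{2}\,\sin\theta\,\mathrm{d}\theta\,\mathrm{d}\phi}
```

where D is the free-field directional response and (θ0, φ0) the frontal look direction.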

Factors Causing Beamformer Mismatch
Beamformer performance in our current products can be limited by level and phase mismatch caused by the following factors:
Time invariant: microphone production mismatch, HI assembly, clean W&W variability, the customer's individual head/pinna shape, device geometry (ITEs and micro BTEs have unfavorable microphone positions).
Time variant: microphone ageing, HI repair, W&W pollution, non-idealities of the current adaptive level matching block, customer HI positioning variance.

Effects of Microphone/Beamformer Mismatch
Microphone phase deviation: rotated null direction.
Microphone magnitude deviation: reduced suppression, target blocking.

Binaural Directional Microphones
Improving directivity by linear combination of the monaural directional microphone outputs:
each device computes Y = Σ_i w_i X_i from the beamformer outputs X_i exchanged over the wireless transmission link.
Maximum SNR improvement: 3 dB.

Test setup
Subjects: 20 adults, moderate to moderately-severe hearing loss, fitted with Exélia Art and Ambra microP BTEs.
Algorithms: Exélia Art (VoiceZoom), Ambra UltraZoom (monaural), Ambra StereoZoom (binaural).
Test setup: OLSA (speech intelligibility in noise), listening effort scaling, paired comparison.
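The garbled expression on the Binaural Directional Microphones slide is the linear combination of the left and right beamformer outputs. A sketch of why the quoted gain tops out at 3 dB (a textbook argument, assuming equal target components and uncorrelated, equal-power noise at the two devices):

```latex
Y(f) = \sum_{i \in \{L,R\}} w_i(f)\, X_i(f), \qquad X_i(f) = S(f) + N_i(f).
```

With w_L = w_R = 1/2 the target S adds coherently while the uncorrelated noises add only in power, so the output SNR improves by at most 10 log10(2) ≈ 3 dB; correlated noise or mismatched weights give less.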

Binaural Beamforming

OLSA
[Figure: SRT (50%, in dB SNR) for (A) OLSA with noise at 60° and (B) OLSA with noise at 45°, comparing Exélia Art VoiceZoom, Ambra UltraZoom and Ambra StereoZoom]

Paired Comparison - Subjective Speech Intelligibility
[Figure: "With which hearing aid do you understand better?" Number of comparisons won by Ambra UltraZoom, Ambra StereoZoom and Exélia VoiceZoom, for noise at 45° and 60°]

Paired Comparison - Subjective Listening Effort
[Figure: "With which hearing aid do you understand more easily?" Number of comparisons won by Ambra UltraZoom, Ambra StereoZoom and Exélia VoiceZoom, for noise at 45° and 60°]

User-Steered Directionality
Traditional beamforming systems focus only to the front.
Speech signals do not always come from the front, and facing the speaker is not always possible: in the car, in restaurants, in small groups.
ZoomControl, accessible through myPilot, allows Exélia wearers to select the direction in which to focus hearing.

Listen to the Side: User-Steered Directionality
Uses the four-microphone network of full-bandwidth binaural instruments.
Broadband audio data transfer between the devices focuses hearing in one specific direction while suppressing signals from other directions.

User-Steered Directionality
[Figure: SNR (dB) for the target talker at 0° (front), 90° (left), 180° (back) and 270° (right), and without processing, comparing adaptive multichannel directionality, steerable directionality and Exélia Art P]

Subjective Evaluation - Listening Effort
Which setting needs the least listening effort to understand well? (First-time and experienced users, n = 9)
[Figure: percentage of votes per setting (Without, ZoomControl, VoiceZoom, Omni) for male and female speech; ZoomControl is clearly preferred (88% / 78%)]

Binaural Noise Reduction Techniques
Different types of algorithms:
Beamformer: spatial information, timing differences
Binaural Wiener filter
Blind source separation: statistical information, estimating the room transfer function
Auditory processing schemes

Binaural Wiener Filter: Speech Intelligibility Weighted Gain
[Figures: speech-intelligibility-weighted gain as a function of the acoustic environment. PhD thesis van den Bogaert, 2008]
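The quantity plotted in these figures, the speech-intelligibility-weighted gain, is in this literature typically the per-band SNR improvement weighted by the SII band-importance function; a hedged sketch of that definition (standard form, not copied from the thesis):

```latex
\Delta \mathrm{SNR}_{\mathrm{SI}} = \sum_{k} I_k \left( \mathrm{SNR}_k^{\mathrm{out}} - \mathrm{SNR}_k^{\mathrm{in}} \right),
\qquad \sum_k I_k = 1,
```

where I_k is the band-importance weight and SNR_k the signal-to-noise ratio in frequency band k at the hearing aid input and output.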

Binaural Beamforming / Noise Reduction
No stereo output signal => loss of spatial sensation / localization. Artificially re-introduce it by split directionality or by mixing part of the original signal into the output.
Narrower beam width: how narrow should the beam be (head movement!)?
Complex environments: dynamic -> target tracking, target identification; reverberation and distance.
Expected improvements are situation-specific; there is no generic solution: single or few strong interfering sources in the frontal hemisphere, environments with little reverberation.

Technical Constraints
Delay over the link
Clock jitter
Noise floor, signal degradation
Microphone calibration (amplitude and phase)

Ear-level FM

Modern FM Technology
Dynamic Speech Extraction - automatic FM advantage: adjusts the FM gain depending on the environmental noise level (surrounding noise compensation, voice activity detector).
Multi-talker networks: new team-teaching concept using up to 10 transmitters.

SNR at Ear Level for Different Technologies
[Figure: SNR (dB) at the ear versus surrounding noise level (40-85 dB SPL) for No FM, traditional FM with fixed FM advantage, and adaptive FM advantage]
A 10 dB FM advantage gives good environmental awareness and audibility of one's own voice, but is a compromise at high noise levels.

Field Study with 48 Adults
[Setup: HINT sentences from a loudspeaker at 0° and the TX3 transmitter; correlated HINT noise from loudspeakers at 45°, 135°, 225° and 315°; loudspeakers 1 meter (39 in) from the center of the head, 7.6 cm (3 in) from loudspeaker to TX3 transmitter. Source: Valente, 2002]
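A toy model of the "automatic FM advantage" curve described on the SNR-at-ear-level slide: a fixed FM advantage in quiet and an advantage that grows with the surrounding noise level above a threshold. All numbers (threshold, slope, ceiling) are illustrative assumptions, not the actual Dynamic FM parameters:

```python
def fm_advantage_db(noise_level_db_spl,
                    fixed_advantage_db=10.0,
                    threshold_db_spl=57.0,
                    slope_db_per_db=1.0,
                    max_advantage_db=24.0):
    """Return an FM advantage (dB) for a given surrounding noise level:
    fixed advantage in quiet, growing above the threshold (illustrative
    parameters only)."""
    extra = max(0.0, noise_level_db_spl - threshold_db_spl) * slope_db_per_db
    return min(fixed_advantage_db + extra, max_advantage_db)


if __name__ == "__main__":
    for noise in (45, 55, 65, 75):
        print(noise, "dB SPL ->", fm_advantage_db(noise), "dB FM advantage")
```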

Speech Intelligibility Threshold
[Figure: mean RTS in dB (with SD) under normal listening conditions for the Unaided, Omni, Dual, FM-M and FM-B conditions. Source: Valente, 2002]

Auditory Scene Analysis vs. Hearing Instrument Processing

Auditory processing:
Bottom-up and top-down
No delay constraint, no real-time processing
Higher-resolution signal analysis, much higher computational power => stream segregation and source formation, working on several different time scales
No signal reconstruction! Perceptual attenuation, focus of attention, suppression of neuronal activity
Channel with full information capacity
A priori knowledge, situational knowledge - other sensory modalities, world knowledge and models of sources fill in information
Attention control: target signal identification and tracking, switching back and forth between objects, overcoming salient sources

Hearing instrument processing:
Bottom-up only
Delay constraint, real-time processing
Computational power constraint - limited signal analysis and spectro-temporal resolution
Signal reconstruction and signal modification: amplification and attenuation / filtering -> distortions
Channel with limited information capacity
Retrospective analysis
Dynamic aspects: head / source movement
Target signal assumption: in front

Conclusion
Hearing instruments offer several algorithms to improve speech intelligibility in complex listening environments.
Algorithms based mainly on ...
Speech intelligibility in complex listening environments remains a huge challenge:
Reverberation and distance
Dynamic target selection and tracking
Technical limitations
Realistic test setups and test procedures

Questions
Speech intelligibility: how much is top-down driven versus bottom-up processing?
Speech intelligibility: how fast is it really? How much information do we infer at the end of a sentence?
Which cues (pitch, temporal fine structure, location, ...) are the essential ones, and does it depend on the situation? How does the auditory system pick the relevant one?
How do we achieve perceptual constancy - voices in real life always sound the same, (almost) independent of the environment?

Thank you!!!

Speech Intelligibility in Reverberant Environments
[Figure: speech intelligibility (% correct) versus reverberation time (sound suite, T = 0.54 s, T = 1.55 s) for normal, mild and moderate/severe hearing loss. Harris & Swenson, Audiology 1990, p. 314-321]

Binaural Processing - Audio Delay
Group delay is mainly determined by radio bandwidth, ADC, CODEC and buffering (error correction).
The delay shall be deterministic and constant.
For binaural audio processing, the link delay adds to the other signal processing delays, i.e. FFT block processing, ADC.
The overall system delay should be less than 10 ms (Stone & Moore 2005).
Audio signals + control data: some more delay for gain control is acceptable (Hohmann 2009).

Jitter Example: 800 Hz Pure Tone
Acoustic delay from head dimensions: typically 500 µs for the ear-to-ear distance.
Normal-hearing minimum audible angle corresponds to a few µs.
Jitter should be smaller than 20 µs RMS -> allows binaural beamforming without significant localization errors.
[Figure: phase difference (deg) over time for a jitter of T_jitter = 30 µs; phase difference = 360° · f · T_jitter]
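As a quick illustration of the jitter requirement on this slide (my own arithmetic, not from the original): a timing jitter T_jitter shifts the phase of a tone of frequency f by

```latex
\Delta\varphi = 360^\circ \, f \, T_\mathrm{jitter}
\qquad\Longrightarrow\qquad
\Delta\varphi = 360^\circ \cdot 800\,\mathrm{Hz} \cdot 20\,\mu\mathrm{s} \approx 5.8^\circ,
```

while the 30 µs example in the figure corresponds to about 8.6° at 800 Hz.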