Introduction to distributed speech enhancement algorithms for ad hoc microphone arrays and wireless acoustic sensor networks

Introduction to distributed speech enhancement algorithms for ad hoc microphone arrays and wireless acoustic sensor networks
Part I: Array Processing in Acoustic Environments
Sharon Gannot (Faculty of Engineering, Bar-Ilan University, Israel) and Alexander Bertrand (KU Leuven, E.E. Department ESAT-STADIUS, Belgium)
EUSIPCO 2013, Marrakesh, Morocco

Introduction and Outline - Acoustic Spatial Processing

Multi-microphone solutions:
- Add the spatial domain to the time/frequency domain.
- Allow spatially selective algorithms for signal separation and noise suppression, which outperform single-microphone algorithms.
- Adapt array processing techniques to the acoustic world.

Distributed microphone arrays:
- Microphones can be placed randomly, avoiding tedious calibration.
- Very large numbers of microphones can be utilized, so increased spatial resolution may be expected.
- High probability of finding microphones close to a relevant sound source.
- Improved sampling of the sound field.

Introduction and Outline - Challenges of Distributed Beamforming

Distributed microphone array beamforming:
- Ad hoc sensor networks.
- Large volume (and many nodes).
- Robustness: high fault percentage.
- Arbitrary deployment of nodes.
- Sampling rate mismatch.

Introduction and Outline - Tutorial Outline

Part I: Array processing in acoustic environments.
Part II: DANSE-based distributed speech enhancement in WASNs.
Part III: GSC-based distributed speech enhancement in WASNs.
Part IV: Random microphone deployment: performance & sampling rate mismatch.

Array Processing in Speech Applications - Preliminaries: Spatial Filters

Beamforming (narrowband signals): $y(t) = w^H(t) z(t)$.
[Block diagram: microphone signals $z_0(t), \ldots, z_{M-1}(t)$ weighted by $w_0, \ldots, w_{M-1}$ and summed to produce $y(t)$.]
$w$: $M \times 1$ beamforming vector of complex gains.
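As a minimal illustration of the filter-and-sum operation above (not taken from the tutorial; the array size, weights, and snapshots are made up), the following NumPy sketch applies a complex weight vector to narrowband multichannel snapshots:

```python
import numpy as np

# Minimal narrowband filter-and-sum sketch: y(t) = w^H z(t).
# All quantities below (M, the weights, the snapshots) are illustrative.
M = 4                                   # number of microphones
rng = np.random.default_rng(0)

# T snapshots of the M microphone signals (complex baseband samples).
T = 100
z = rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))

# Beamforming vector of complex gains (here: uniform "delay-and-sum" weights).
w = np.ones(M, dtype=complex) / M

# Beamformer output, one sample per snapshot.
y = w.conj().T @ z                      # shape (T,)
print(y.shape)
```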

Array Processing in Speech Applications - Preliminaries: Beampattern Control

Beamformers:
- Discriminate between angles.
- Can be steered by setting $w$.
- Depend on the ratio $d/\lambda$.

[Polar plot: beampattern magnitude (0 dB to -40 dB rings) versus angle of arrival.]
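A small sketch of the idea, assuming a uniform linear array and a far-field plane-wave model (geometry, spacing, and frequency are illustrative choices, not taken from the tutorial): the beampattern is the array response as a function of angle and depends on $d/\lambda$.

```python
import numpy as np

# Beampattern sketch for a uniform linear array (ULA); all parameters illustrative.
M = 6            # microphones
d = 0.05         # inter-microphone spacing [m]
c = 343.0        # speed of sound [m/s]
f = 2000.0       # analysis frequency [Hz]
lam = c / f      # wavelength; the pattern depends on d / lam

def steering_vector(theta_rad):
    # Far-field plane-wave model; theta is measured from the array axis.
    m = np.arange(M)
    return np.exp(-2j * np.pi * d / lam * m * np.cos(theta_rad))

# Steer (delay-and-sum) towards 60 degrees and evaluate the response over all angles.
w = steering_vector(np.deg2rad(60.0)) / M
angles = np.deg2rad(np.arange(0, 181))
pattern_db = np.array(
    [20 * np.log10(np.abs(w.conj() @ steering_vector(a)) + 1e-12) for a in angles]
)
print(f"peak response at {np.rad2deg(angles[pattern_db.argmax()]):.0f} degrees")
```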

Array Processing in Speech Applications - Room Acoustics: Room Acoustics Essentials

Sound fields:
- Directional: the room impulse response relates the source and the microphones.
- Uncorrelated: signals at the microphones are uncorrelated.
- Diffuse: sound arrives from all directions [Dal-Degan and Prati, 1988]; [Habets and Gannot, 2007].

Reverberation:
- Late reflections tend to be diffuse.
- Deteriorates intelligibility.
- Degrades ASR performance.
- Beamforming becomes a cumbersome task.

Array Processing in Speech Applications - Room Acoustics: The Room Impulse Response (RIR)
[Allen and Berkley, 1979]; simulator: [Habets, 2006]; [Polack, 1993]; [Jot et al., 1997]

[Plot: RIR amplitude versus time (0 to 0.3 s), with the direct path, colouration, and tail regions marked.]

Three parts:
- Direct path.
- Colouration (early arrivals).
- Reverberation tail (late arrivals).

Reverberation should be taken into consideration while designing the algorithms, even if it does not deteriorate speech quality and intelligibility.
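To make the three-part structure concrete, here is a toy construction of an RIR with a direct tap, a few early reflections, and an exponentially decaying noise tail in the spirit of Polack's statistical model. It is only a sketch under these assumptions, not the image-method simulator of [Allen and Berkley, 1979] or [Habets, 2006]; all delays, gains, and the T60 value are made up.

```python
import numpy as np

# Toy RIR with the three parts named above; all values are illustrative.
fs = 16000                       # sampling rate [Hz]
length = int(0.3 * fs)           # 300 ms response
rng = np.random.default_rng(1)

h = np.zeros(length)

# 1) Direct path: a single dominant tap after the propagation delay.
delay = int(0.005 * fs)          # 5 ms source-to-microphone delay
h[delay] = 1.0

# 2) Colouration: a few discrete early reflections.
for t_ms, gain in [(12, 0.5), (19, -0.35), (27, 0.25)]:
    h[int(t_ms / 1000 * fs)] += gain

# 3) Reverberation tail: exponentially decaying Gaussian noise (statistical model).
t60 = 0.25                       # reverberation time [s]
t = np.arange(length) / fs
tail = rng.standard_normal(length) * np.exp(-3 * np.log(10) * t / t60)
tail[: int(0.03 * fs)] = 0.0     # the tail starts after the early part
h += 0.1 * tail

# A microphone signal is then the source convolved with h (plus noise).
```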

Array Processing in Speech Applications - Room Acoustics: From Geometry to Linear Algebra

Array design for speech propagating in acoustic environments:
- Beampattern: array response as a function of the angle of arrival (AoA).
- In reverberant environments (especially at low direct-to-reverberant ratio, DRR), sound propagation is more involved than merely the AoA.
- The steering vector generalizes to the acoustic transfer function (ATF); the beampattern becomes meaningless.
- The ATF summarizes all arrivals of the speech signal.
- The vector of received signals is treated as a vector in an abstract linear space, and linear algebra methods are utilized to construct beamformers.
- Blindly estimating the ATFs is a cumbersome task.

Array Processing in Speech Applications - Literature: Array Processing in Speech Applications I

1. Fixed beamforming: combine the microphone signals using a time-invariant (data-independent) filter-and-sum operation [Jan and Flanagan, 1996]; [Doclo and Moonen, 2003].
2. Blind Source Separation (BSS): considers the received microphone signals as a mixture of all sound sources filtered by the RIRs; utilizes Independent Component Analysis (ICA) techniques [Makino et al., 2007]; TRINICON [Buchner et al., 2004].
3. Adaptive beamforming: combines the spatial focusing of fixed beamformers with adaptive suppression of (spectrally and spatially time-varying) background noise. General reading: [Cox et al., 1987]; [Van Veen and Buckley, 1988]; [Van Trees, 2002].
4. Computational Auditory Scene Analysis (CASA): aims at performing sound segregation by modelling human auditory perceptual processing [Wang and Brown, 2006].

Array Processing in Speech Applications - Literature: Array Processing in Speech Applications II

Beamforming criteria:
1. Adaptive optimization [Sondhi and Elko, 1986]; [Kaneda and Ohga, 1986]; [Brandstein and Ward, 2001].
2. Minimum variance distortionless response (MVDR) and GSC [Van Compernolle, 1990]; [Affes and Grenier, 1997]; [Nordholm et al., 1993]; [Hoshuyama et al., 1999]; [Gannot et al., 2001]; [Herbordt, 2005]; [Gannot and Cohen, 2008].
3. Minimum mean square error (MMSE): GSVD-based spatial Wiener filter [Doclo and Moonen, 2002].
4. Speech distortion weighted multichannel Wiener filter (SDW-MWF) [Doclo and Moonen, 2002]; [Spriet et al., 2004]; [Doclo et al., 2005].
5. Maximum signal-to-noise ratio (SNR) [Warsitz and Haeb-Umbach, 2007].
6. Linearly constrained minimum variance (LCMV) [Markovich et al., 2009].

Array Processing in Speech Applications - Literature: Array Processing in Speech Applications III

Some books:
1. Acoustic Signal Processing for Telecommunication [Gay and Benesty, 2000].
2. Microphone Arrays: Signal Processing Techniques and Applications [Brandstein and Ward, 2001].
3. Speech Enhancement [Benesty et al., 2005].
4. Blind Speech Separation [Makino et al., 2007].
5. Microphone Array Signal Processing [Benesty et al., 2008a].
6. Springer Handbook of Speech Processing [Benesty et al., 2008b].
7. Handbook on Array Processing and Sensor Networks [Haykin and Liu, 2010].
8. Speech Processing in Modern Communication: Challenges and Perspectives [Cohen et al., 2010].

Array Processing in Speech Applications - Definitions: Multiple Wideband Signals (e.g. Speech)

Multiplicative transfer function (MTF) approximation:
- The time index $t$ maps under the STFT to frame and frequency indices $\{l, k\}$; convolution in time becomes multiplication in the STFT domain (for long enough frames).

Microphone signals ($m = 0, \ldots, M-1$):
$z_m(l,k) = \sum_{j=1}^{P^d} s_j^d h_{jm}^d + \sum_{j=1}^{P^i} s_j^i h_{jm}^i + \sum_{j=1}^{P^n} s_j^n h_{jm}^n + n_m$

Vector formulation:
$z(l,k) = H^d s^d + H^i s^i + H^n s^n + n \triangleq H s + n$, with $P = P^d + P^i + P^n \le M$.

Beamforming in the STFT domain: apply filter-and-sum beamforming independently in each frequency bin.
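The following SciPy/NumPy sketch illustrates "one filter-and-sum beamformer per frequency bin" under the MTF approximation. The microphone signals and the per-bin weights are placeholders (here trivial averaging weights); in the methods below they would come from the LCMV/MVDR/MWF criteria.

```python
import numpy as np
from scipy.signal import stft, istft

# Per-frequency-bin filter-and-sum beamforming under the MTF approximation.
# Microphone signals and beamforming weights below are placeholders.
fs, M, N = 16000, 4, 16000
rng = np.random.default_rng(2)
z_time = rng.standard_normal((M, N))          # M microphone signals (time domain)

nperseg = 512                                  # "long enough" frames for the MTF model
f, l, Z = stft(z_time, fs=fs, nperseg=nperseg) # Z has shape (M, K, L): mic x freq x frame

K = Z.shape[1]
W = np.ones((M, K), dtype=complex) / M         # one weight vector per frequency bin k

# y(l, k) = w^H(k) z(l, k), applied independently in every bin.
Y = np.einsum('mk,mkl->kl', W.conj(), Z)

_, y_time = istft(Y, fs=fs, nperseg=nperseg)   # back to the time domain
print(y_time.shape)
```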

Optimal Beamforming Criteria & Solutions: Linearly Constrained Minimum Variance Beamformer
[Er and Cantoni, 1983]; [Van Veen and Buckley, 1988]

LCMV criterion: $y(l,k) = w^H(l,k) z(l,k)$.
Let $\Phi_{nn} = E\{n n^H\}$ be the $M \times M$ correlation matrix of the unconstrained sources.
Minimize the noise power $w^H \Phi_{nn} w$ such that a linear constraint set is satisfied: $C^H w = g$.
$C$: $M \times P$ constraint matrix. $g$: $P \times 1$ response vector.

Closed-form solution:
$w(l,k) = \Phi_{nn}^{-1} C \left( C^H \Phi_{nn}^{-1} C \right)^{-1} g$
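A minimal NumPy sketch of this closed form in a single frequency bin; the covariance $\Phi_{nn}$, constraint matrix $C$, and response vector $g$ are synthetic placeholders chosen only to exercise the formula.

```python
import numpy as np

# Closed-form LCMV weights in one frequency bin:
#   w = Phi_nn^{-1} C (C^H Phi_nn^{-1} C)^{-1} g
# Phi_nn, C and g below are synthetic placeholders.
rng = np.random.default_rng(3)
M, P = 6, 2

A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_nn = A @ A.conj().T + np.eye(M)            # Hermitian positive-definite noise covariance
C = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))   # constraint matrix
g = np.array([1.0, 0.0], dtype=complex)        # desired response per constraint

Pi = np.linalg.solve(Phi_nn, C)                # Phi_nn^{-1} C
w = Pi @ np.linalg.solve(C.conj().T @ Pi, g)   # LCMV weights

print(np.allclose(C.conj().T @ w, g))          # the constraints are satisfied
```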

Optimal Beamforming Criteria & Solutions: Linearly Constrained Minimum Power (LCMP) Beamformer
[Van Trees, 2002]

LCMV vs. LCMP: assume $C = H$ (all directional signals constrained).
$w_{\mathrm{LCMP}} = \arg\min_w \{ w^H \Phi_{zz} w \ \text{ s.t. } \ H^H w = g \}$
$= \arg\min_w \{ w^H (H \Phi_{ss} H^H + \Phi_{nn}) w \ \text{ s.t. } \ H^H w = g \}$
$= \arg\min_w \{ g^H \Phi_{ss} g + w^H \Phi_{nn} w \ \text{ s.t. } \ H^H w = g \}$
$= \arg\min_w \{ w^H \Phi_{nn} w \ \text{ s.t. } \ H^H w = g \} = w_{\mathrm{LCMV}}$

If $H$ is not accurately estimated, the LCMP beamformer exhibits self-cancellation and hence severe speech distortion.
It is quite common in the literature to use only the term LCMV for both beamformers.
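A small numerical check of the equivalence derived above, under the same assumption (exactly known $H$ and $C = H$): minimizing the output power with $\Phi_{zz}$ gives the same weights as minimizing the noise power with $\Phi_{nn}$. All covariances and ATFs are synthetic.

```python
import numpy as np

# Numerical check of the LCMP / LCMV equivalence above (C = H, H known exactly).
rng = np.random.default_rng(4)
M, P = 6, 2

H = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))        # exact ATFs
Phi_ss = np.diag(rng.uniform(0.5, 2.0, P)).astype(complex)                # source PSDs
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_nn = A @ A.conj().T + np.eye(M)                                       # noise covariance
Phi_zz = H @ Phi_ss @ H.conj().T + Phi_nn                                 # microphone covariance
g = np.array([1.0, 0.0], dtype=complex)

def constrained_min(Phi):
    # argmin_w  w^H Phi w  s.t.  H^H w = g
    Pi = np.linalg.solve(Phi, H)
    return Pi @ np.linalg.solve(H.conj().T @ Pi, g)

w_lcmp = constrained_min(Phi_zz)    # minimum output power
w_lcmv = constrained_min(Phi_nn)    # minimum noise power
print(np.allclose(w_lcmp, w_lcmv))  # True: identical when H is exact
```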

Optimal Beamforming Criteria & Solutions: LCMV Minimization - Graphical Interpretation [Frost III, 1972]

[Figure: geometric interpretation in the weight space $(w_0, w_1, w_2)$: the LCMV solution $w_{\mathrm{LCMV}}$ is the point of the constraint plane $C^H w = g$ that touches the smallest contour of constant output power $w^H \Phi_{zz} w$.]

Optimal Beamforming Criteria & Solutions: The Minimum Variance Distortionless Response Beamformer
[Affes and Grenier, 1997]; [Hoshuyama et al., 1999]; [Gannot et al., 2001]

Beamformer design: one desired signal.
- Single constraint ($P = 1$).
- Steer a beam towards the desired source and minimize all other directions.
- $C = h^d$; $g = 1$.

Closed-form solution (MPDR eq. MVDR):
$w(l,k) = \dfrac{\Phi_{zz}^{-1} h^d}{(h^d)^H \Phi_{zz}^{-1} h^d} = \dfrac{\Phi_{nn}^{-1} h^d}{(h^d)^H \Phi_{nn}^{-1} h^d}$

Output signal: $y = s^d + \text{residual noise and interference signals}$
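A sketch of the MVDR closed form for one bin, with a synthetic ATF and noise covariance; it also checks the distortionless constraint $w^H h^d = 1$.

```python
import numpy as np

# MVDR weights in one frequency bin:  w = Phi_nn^{-1} h_d / (h_d^H Phi_nn^{-1} h_d).
# Phi_nn and h_d below are synthetic placeholders.
rng = np.random.default_rng(5)
M = 6

h_d = rng.standard_normal(M) + 1j * rng.standard_normal(M)    # desired-source ATF
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_nn = A @ A.conj().T + np.eye(M)                            # noise (+interference) covariance

num = np.linalg.solve(Phi_nn, h_d)               # Phi_nn^{-1} h_d
w_mvdr = num / (h_d.conj() @ num)                # normalize for a distortionless response

print(np.isclose(w_mvdr.conj() @ h_d, 1.0))      # w^H h_d = 1: desired signal passes undistorted
```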

Optimal Beamforming Criteria & Solutions: The Relative Transfer Function [Gannot et al., 2001]

Modified constraint set: $C(l,k) = h^d(l,k)$; $g(l,k) = (h_1^d(l,k))^*$, i.e. $(h^d(l,k))^H w = (h_1^d(l,k))^*$.

Equivalent to: $C(l,k) = \tilde{h}^d(l,k) \triangleq \dfrac{h^d}{h_1^d} = \left[ 1 \ \ \dfrac{h_2^d}{h_1^d} \ \cdots \ \dfrac{h_M^d}{h_1^d} \right]^T$; $g(l,k) = 1$,

with $\tilde{h}^d(l,k)$ the relative transfer function (RTF): the ratio of all ATFs to the reference ATF (microphone #1 in this case).

Output signal: $y = h_1^d s^d + \text{residual noise and interference signals}$
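A sketch, assuming the ATF is known so the RTF can simply be formed by normalization (in practice it must be estimated blindly, see the next slide): steering the MVDR beamformer with the RTF reproduces the desired speech as observed at the reference microphone. All quantities are synthetic.

```python
import numpy as np

# RTF-based MVDR sketch: steer with h_tilde = h_d / h_d[0] and g = 1, so the
# output reproduces h_d[0] * s_d (the desired speech at reference microphone #1).
# The ATF and covariance below are synthetic placeholders.
rng = np.random.default_rng(6)
M = 6

h_d = rng.standard_normal(M) + 1j * rng.standard_normal(M)    # ATFs of the desired source
h_tilde = h_d / h_d[0]                                         # relative transfer function (RTF)

A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_nn = A @ A.conj().T + np.eye(M)

num = np.linalg.solve(Phi_nn, h_tilde)
w = num / (h_tilde.conj() @ num)                 # MVDR with the RTF as steering vector

# For a noise-free snapshot z = h_d * s_d, the output equals h_d[0] * s_d.
s_d = 0.7 + 0.2j
print(np.isclose(w.conj() @ (h_d * s_d), h_d[0] * s_d))
```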

Optimal Beamforming Criteria & Solutions: The Importance of the RTF

[Figure: (a) two room impulse responses (AIR 1, AIR 2), normalized amplitude vs. time; (b) the corresponding relative impulse response.]

Features:
- Can be blindly estimated from data.
- No need to know the microphone positions (crucial in ad hoc applications).
- A multitude of estimation procedures exists.
- Usually exhibits better behaviour than the ATF.
- Drawback: non-causal (in severe cases can cause "pre-echo").

Optimal Beamforming Criteria & Solutions: RTF Estimation Procedures

- Utilizing speech non-stationarity and noise stationarity [Shalvi and Weinstein, 1996]; [Gannot et al., 2001]. An extension to two nonstationary sources in stationary noise exists [Reuven et al., 2008].
- Utilizing the speech presence probability and spectral subtraction [Cohen, 2004].
- Based on the eigenvalue decomposition (EVD) of the spatial correlation matrix for the multiple-source case [Markovich et al., 2009], assuming nonconcurrent desired and interference sources. An extension to concurrent desired and interference sources, based on ICA (TRINICON), exists [Reindl et al., 2013].
- Recursive extensions exist:
  - Single source: use PASTd [Yang, 1995] to recursively track the rank-1 eigenvector [Affes and Grenier, 1997].
  - Multiple sources: use a generalization of PASTd to recursively track the rank-P eigenvectors with arbitrary activity patterns [Markovich-Golan et al., 2010].
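To make one of these ideas concrete, here is a sketch of a simple covariance-subtraction estimator in the spirit of the EVD-based approach cited above: estimate the noisy and noise-only covariance matrices in one bin, take the principal eigenvector of their difference, and normalize by the reference entry. The data, the ATF, and the speech/noise segmentation (i.e. the VAD output) are all simulated; this is an illustration of the principle, not the exact algorithm of [Markovich et al., 2009].

```python
import numpy as np

# Covariance-subtraction RTF estimation sketch (one frequency bin).
# Data, ATF and VAD segmentation are simulated placeholders.
rng = np.random.default_rng(7)
M, L = 6, 5000

h_d = rng.standard_normal(M) + 1j * rng.standard_normal(M)     # true (unknown) ATF
rtf_true = h_d / h_d[0]

s = rng.standard_normal(L) + 1j * rng.standard_normal(L)        # desired-source STFT coefficients
n = 0.3 * (rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L)))

z_noisy = np.outer(h_d, s) + n          # speech-plus-noise frames (VAD: speech active)
z_noise = 0.3 * (rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L)))  # noise-only frames

Phi_zz = z_noisy @ z_noisy.conj().T / L
Phi_nn = z_noise @ z_noise.conj().T / L

# Rank-1 speech covariance estimate; its principal eigenvector is proportional to h_d.
eigvals, eigvecs = np.linalg.eigh(Phi_zz - Phi_nn)
h_est = eigvecs[:, -1]                  # eigenvector of the largest eigenvalue
rtf_est = h_est / h_est[0]              # normalize by the reference microphone

print(np.max(np.abs(rtf_est - rtf_true)))   # small estimation error
```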

Optimal Beamforming Criteria & Solutions: Multiple Speech Distortion Weighted Multichannel Wiener Filter (MSDW-MWF)
[Markovich-Golan et al., 2012]

Notation (reminder):
- Received signals: $z(l,k) = H s + n$.
- $P < M$ constrained sources: $s(l,k) \triangleq [s_1 \cdots s_P]^T$ and respective ATFs: $H(l,k) \triangleq [h_1 \cdots h_P]$.
- Source covariance matrix: $\Phi_{ss} = \mathrm{diag}\{\phi_{s_1 s_1}, \ldots, \phi_{s_P s_P}\}$.
- Microphone covariance matrix: $\Phi_{zz} \triangleq H \Phi_{ss} H^H + \Phi_{nn}$.

MSDW-MWF:
- Control the distortion of each individual source.
- Minimize the weighted mean square error (MSE).
- Desired response for all constrained signals: $d(l,k) \triangleq g^H s(l,k)$.
- Beamformer output: $y(l,k) = w^H z(l,k)$.
- MSE: $E\{|d(l) - y(l)|^2\}$.

Optimal Beamforming Criteria & Solutions: Speech Enhancement with a Single Source I
Speech Distortion Weighted Multichannel Wiener Filter (SDW-MWF) [Doclo and Moonen, 2002]; [Spriet et al., 2004]; [Doclo et al., 2005]

[Diagram: the SDW-MWF spans the range between the MWF (minimum MSE) and the MVDR (no speech distortion), trading off MSE against distortion.]

Optimal Beamforming Criteria & Solutions: Speech Enhancement with a Single Source II
Speech Distortion Weighted Multichannel Wiener Filter (SDW-MWF) [Doclo and Moonen, 2002]; [Spriet et al., 2004]; [Doclo et al., 2005]

The multichannel Wiener filter (MWF) criterion:
$J_w \triangleq E\{|d(l) - y(l)|^2\} = |g - (h^d)^H w|^2 \, \phi_{s^d s^d} + w^H \Phi_{nn} w$

The speech distortion weighted (SDW)-MWF criterion:
$J_{\text{SDW-MWF}} = |g - (h^d)^H w|^2 \, \phi_{s^d s^d} + \mu \, w^H \Phi_{nn} w$

SDW-MWF solution (requires VAD):
$w = \dfrac{\phi_{s^d s^d} \Phi_{nn}^{-1} h^d}{\mu + \phi_{s^d s^d} (h^d)^H \Phi_{nn}^{-1} h^d} \, g$
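A sketch of this closed form with synthetic quantities; the trade-off parameter $\mu$ balances noise reduction against speech distortion ($\mu = 1$ gives the MWF, $\mu \to 0$ approaches the distortionless MVDR). The speech PSD and noise covariance are placeholders that a VAD-driven estimator would normally provide.

```python
import numpy as np

# SDW-MWF weights in one frequency bin:
#   w = phi_sd * Phi_nn^{-1} h_d / (mu + phi_sd * h_d^H Phi_nn^{-1} h_d) * g
# All quantities are synthetic placeholders; a VAD is assumed to have provided Phi_nn.
rng = np.random.default_rng(8)
M = 6

h_d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_nn = A @ A.conj().T + np.eye(M)
phi_sd = 2.0                          # desired-speech PSD in this bin
g = 1.0                               # desired response at the reference microphone

def sdw_mwf(mu):
    num = phi_sd * np.linalg.solve(Phi_nn, h_d)
    return num / (mu + phi_sd * (h_d.conj() @ np.linalg.solve(Phi_nn, h_d))) * g

w_mwf = sdw_mwf(mu=1.0)               # classical multichannel Wiener filter
w_near_mvdr = sdw_mwf(mu=1e-8)        # mu -> 0 approaches the distortionless MVDR
print(np.abs(w_near_mvdr.conj() @ h_d))   # close to |g| = 1
```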

Optimal Beamforming Criteria & Solutions: Speech Enhancement with Multiple Sources I
[Markovich-Golan et al., 2012]

[Diagram: the MSDW-MWF spans the range between the MWF (minimum MSE) and the LCMV, trading off the MSE against the individual distortions (Dist. 1, Dist. 2, Dist. 3) of each constrained source.]

Optimal Beamforming Criteria & Solutions: Speech Enhancement with Multiple Sources II
[Markovich-Golan et al., 2012]

The MSDW-MWF criterion:
$J_{\text{MSDW-MWF}} \triangleq (g - H^H w)^H \Lambda \Phi_{ss} (g - H^H w) + w^H \Phi_{nn} w$

Diagonal weight matrix: $\Lambda \triangleq \mathrm{diag}\{\lambda_1, \ldots, \lambda_P\}$.

MSDW-MWF beamformer (requires VAD):
$w \triangleq \left( H \Lambda \Phi_{ss} H^H + \Phi_{nn} \right)^{-1} H \Lambda \Phi_{ss} \, g$
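A sketch of the MSDW-MWF beamformer with per-source distortion weights $\lambda_p$; the ATFs, covariances, and the particular $\lambda_p$ values are synthetic and only illustrate the formula.

```python
import numpy as np

# MSDW-MWF beamformer:  w = (H Lambda Phi_ss H^H + Phi_nn)^{-1} H Lambda Phi_ss g.
# ATFs, covariances and the weights lambda_p are synthetic placeholders.
rng = np.random.default_rng(9)
M, P = 6, 3

H = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))  # ATFs of the P constrained sources
Phi_ss = np.diag(rng.uniform(0.5, 2.0, P)).astype(complex)          # source PSDs
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_nn = A @ A.conj().T + np.eye(M)
g = np.array([1.0, 0.0, 0.0], dtype=complex)         # desired response per source

Lam = np.diag([10.0, 1.0, 1.0]).astype(complex)      # larger lambda_p => less distortion of source p

B = H @ Lam @ Phi_ss
w = np.linalg.solve(B @ H.conj().T + Phi_nn, B @ g)  # MSDW-MWF weights
print(w.shape)
```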

Optimal Beamforming Criteria & Solutions: Special Cases of $\Lambda$

MWF: $\Lambda = I$.
$w = \Phi_{zz}^{-1} H \Phi_{ss} \, g$.

SDW-MWF (reminder: single source of interest): $\Lambda = \mu^{-1}$.
$w = \left( h^d \phi_{s^d s^d} (h^d)^H + \mu \Phi_{nn} \right)^{-1} h^d \phi_{s^d s^d} \, g$.
$\lim_{\mu \to 0} w = \dfrac{\Phi_{nn}^{-1} h^d}{(h^d)^H \Phi_{nn}^{-1} h^d} \, g$ (MVDR eq. MPDR).

LCMV: $\Lambda = \mu^{-1} \Phi_{ss}^{-1}$.
$\lim_{\mu \to 0} w = \Phi_{nn}^{-1} H \left( H^H \Phi_{nn}^{-1} H \right)^{-1} g$ (LCMV eq. LCMP).
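A numerical check of the last special case, with synthetic quantities: setting $\Lambda = \mu^{-1} \Phi_{ss}^{-1}$ and letting $\mu$ become small drives the MSDW-MWF weights towards the LCMV solution (up to numerical conditioning for very small $\mu$).

```python
import numpy as np

# Check that the MSDW-MWF with Lambda = (1/mu) * Phi_ss^{-1} approaches the
# LCMV solution  w = Phi_nn^{-1} H (H^H Phi_nn^{-1} H)^{-1} g  as mu -> 0.
rng = np.random.default_rng(10)
M, P = 6, 2

H = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))
Phi_ss = np.diag(rng.uniform(0.5, 2.0, P)).astype(complex)
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_nn = A @ A.conj().T + np.eye(M)
g = np.array([1.0, 0.0], dtype=complex)

mu = 1e-6
Lam = np.linalg.inv(Phi_ss) / mu
B = H @ Lam @ Phi_ss                                    # equals H / mu
w_msdw = np.linalg.solve(B @ H.conj().T + Phi_nn, B @ g)

Pi = np.linalg.solve(Phi_nn, H)
w_lcmv = Pi @ np.linalg.solve(H.conj().T @ Pi, g)

print(np.max(np.abs(w_msdw - w_lcmv)))                  # small; vanishes as mu -> 0
```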

Bibliography: References and Further Reading I

- Affes, S. and Grenier, Y. (1997). A signal subspace tracking algorithm for microphone array processing of speech. IEEE Transactions on Speech and Audio Processing, 5(5):425-437.
- Allen, J. and Berkley, D. (1979). Image method for efficiently simulating small-room acoustics. Journal of the Acoustical Society of America, 65(4):943-950.
- Benesty, J., Chen, J., and Huang, Y. (2008a). Microphone Array Signal Processing. Springer.
- Benesty, J., Huang, Y., and Sondhi, M., editors (2008b). Springer Handbook of Speech Processing. Springer Verlag.
- Benesty, J., Makino, S., and Chen, J., editors (2005). Speech Enhancement. Signals and Communication Technology. Springer, Berlin.
- Brandstein, M. S. and Ward, D. B., editors (2001). Microphone Arrays: Signal Processing Techniques and Applications. Springer-Verlag, Berlin.
- Buchner, H., Aichner, R., and Kellermann, W. (2004). TRINICON: A versatile framework for multichannel blind signal processing. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), volume 3, pages iii-889, Montreal, Canada.

Bibliography: References and Further Reading II

- Cohen, I. (2004). Relative transfer function identification using speech signals. IEEE Transactions on Speech and Audio Processing, 12(5):451-459.
- Cohen, I., Benesty, J., and Gannot, S., editors (2010). Speech Processing in Modern Communication: Challenges and Perspectives. Topics in Signal Processing. Springer.
- Cox, H., Zeskind, R., and Owen, M. (1987). Robust adaptive beamforming. IEEE Transactions on Acoustics, Speech and Signal Processing, 35(10):1365-1376.
- Dal-Degan, N. and Prati, C. (1988). Acoustic noise analysis and speech enhancement techniques for mobile radio application. Signal Processing, 15(4):43-56.
- Doclo, S. and Moonen, M. (2002). GSVD-based optimal filtering for single and multimicrophone speech enhancement. IEEE Transactions on Signal Processing, 50(9):2230-2244.
- Doclo, S. and Moonen, M. (2003). Design of far-field and near-field broadband beamformers using eigenfilters. Signal Processing, 83(12):2641-2673.
- Doclo, S., Spriet, A., Wouters, J., and Moonen, M. (2005). Speech distortion weighted multichannel Wiener filtering techniques for noise reduction. Chapter in [Benesty et al., 2005], pages 199-228.

Bibliography: References and Further Reading III

- Er, M. and Cantoni, A. (1983). Derivative constraints for broad-band element space antenna array processors. IEEE Transactions on Acoustics, Speech and Signal Processing, 31(6):1378-1393.
- Frost III, O. L. (1972). An algorithm for linearly constrained adaptive array processing. Proceedings of the IEEE, 60(8):926-935.
- Gannot, S., Burshtein, D., and Weinstein, E. (2001). Signal enhancement using beamforming and nonstationarity with applications to speech. IEEE Transactions on Signal Processing, 49(8):1614-1626.
- Gannot, S. and Cohen, I. (2008). Adaptive beamforming and postfiltering. Chapter in Springer Handbook of Speech Processing and Speech Communication [Benesty et al., 2008b].
- Gay, S. L. and Benesty, J., editors (2000). Acoustic Signal Processing for Telecommunication. Kluwer Academic.
- Habets, E. and Gannot, S. (2007). Generating sensor signals in isotropic noise fields. The Journal of the Acoustical Society of America, 122:3464-3470.
- Habets, E. A. P. (2006). Room impulse response (RIR) generator. http://home.tiscali.nl/ehabets/rir_generator.html.

Bibliography: References and Further Reading IV

- Haykin, S. and Liu, K. R., editors (2010). Handbook on Array Processing and Sensor Networks, volume 63. Wiley-IEEE Press.
- Herbordt, W. (2005). Sound Capture for Human/Machine Interfaces: Practical Aspects of Microphone Array Signal Processing, volume 315 of Lecture Notes in Control and Information Sciences. Springer, Heidelberg, Germany.
- Hoshuyama, O., Sugiyama, A., and Hirano, A. (1999). A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters. IEEE Transactions on Signal Processing, 47(10):2677-2684.
- Jan, E. and Flanagan, J. (1996). Sound capture from spatial volumes: Matched-filter processing of microphone arrays having randomly-distributed sensors. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 917-920, Atlanta, Georgia, USA.
- Jot, J.-M., Cerveau, L., and Warusfel, O. (1997). Analysis and synthesis of room reverberation based on a statistical time-frequency model. In Audio Engineering Society Convention 103. Audio Engineering Society.
- Kaneda, Y. and Ohga, J. (1986). Adaptive microphone-array system for noise reduction. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(6):1391-1400.

Bibliography: References and Further Reading V

- Makino, S., Lee, T.-W., and Sawada, H. (2007). Blind Speech Separation. Springer, Heidelberg.
- Markovich, S., Gannot, S., and Cohen, I. (2009). Multichannel eigenspace beamforming in a reverberant noisy environment with multiple interfering speech signals. IEEE Transactions on Audio, Speech, and Language Processing, 17(6):1071-1086.
- Markovich-Golan, S., Gannot, S., and Cohen, I. (2010). Subspace tracking of multiple sources and its application to speakers extraction. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 201-204, Dallas, Texas, USA.
- Markovich-Golan, S., Gannot, S., and Cohen, I. (2012). A weighted multichannel Wiener filter for multiple sources scenarios. In The IEEE 27th Convention of IEEE Israel (IEEEI), Eilat, Israel. Best student paper award.
- Nordholm, S., Claesson, I., and Bengtsson, B. (1993). Adaptive array noise suppression of handsfree speaker input in cars. IEEE Transactions on Vehicular Technology, 42(4):514-518.
- Polack, J.-D. (1993). Playing billiards in the concert hall: The mathematical foundations of geometrical room acoustics. Applied Acoustics, 38(2):235-244.

Bibliography: References and Further Reading VI

- Reindl, K., Markovich-Golan, S., Barfuss, H., Gannot, S., and Kellermann, W. (2013). Geometrically constrained TRINICON-based relative transfer function estimation in underdetermined scenarios. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA.
- Reuven, G., Gannot, S., and Cohen, I. (2008). Dual-source transfer-function generalized sidelobe canceller. IEEE Transactions on Audio, Speech, and Language Processing, 16(4):711-727.
- Shalvi, O. and Weinstein, E. (1996). System identification using nonstationary signals. IEEE Transactions on Signal Processing, 44(8):2055-2063.
- Sondhi, M. and Elko, G. (1986). Adaptive optimization of microphone arrays under a nonlinear constraint. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), volume 11, pages 981-984, Tokyo, Japan.
- Spriet, A., Moonen, M., and Wouters, J. (2004). Spatially pre-processed speech distortion weighted multi-channel Wiener filtering for noise reduction. Signal Processing, 84(12):2367-2387.
- Van Compernolle, D. (1990). Switching adaptive filters for enhancing noisy and reverberant speech from microphone array recordings. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 833-836, Albuquerque, New Mexico, USA.
- Van Trees, H. L. (2002). Detection, Estimation, and Modulation Theory, volume IV: Optimum Array Processing. Wiley, New York.

Bibliography: References and Further Reading VII

- Van Veen, B. D. and Buckley, K. M. (1988). Beamforming: A versatile approach to spatial filtering. IEEE Acoustics, Speech, and Signal Processing Magazine, pages 4-24.
- Wang, D. and Brown, G. J. (2006). Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. Wiley-Interscience.
- Warsitz, E. and Haeb-Umbach, M. (2007). Blind acoustic beamforming based on generalized eigenvalue decomposition. IEEE Transactions on Audio, Speech, and Language Processing, 15(5):1529-1539.
- Yang, B. (1995). Projection approximation subspace tracking. IEEE Transactions on Signal Processing, 43(1):95-107.