Microphone Array project in MSR: approach and results


Microphone Array project in MSR: approach and results
Ivan Tashev, Microsoft Research
June 2004

Agenda
- Microphone Array project
- Beamformer design algorithm
- Implementation and hardware designs
- Demo

Motivation
- PCs today have pretty bad "ears": audio captured or recorded on a PC sounds terrible (especially on laptops) unless a good headset is used.
- Sound will play an increasingly important role in human-computer interaction, especially on devices without a keyboard (tablets, handhelds).
- Computers are increasingly used for collaboration and communication.
- Users don't like headsets or other tethered microphones, especially in a video call.
- Existing wireless solutions do not provide good enough sound quality, and they still have to be worn.

Microphone array project: goals
- Long-term goal: sound capture quality for an untethered user equal to that of a close-up microphone.
- Near-term goal: create technology for OS support and devices cheap enough to become a commodity on the market.
- Beamforming is the ability to make the microphone array listen to a given location while suppressing the signals coming from other locations.

Target scenarios
- Real-time communications
  - Good sound capture for Windows Messenger, MSN Messenger, and other applications built on top of the RTC stack
  - New applications for VoIP and enhanced telephony
- Collaboration and groupware
  - High-quality sound from meeting rooms for recording and broadcasting purposes (OneNote)
  - Voice messaging
- Speech recognition
  - Voice commands for Tablet PCs and handhelds
  - Voice control and dictation for PCs and laptops

Problems
- The "wear nothing" approach requires separate microphones, either connected or integrated.
- These microphones deliver poor sound capture quality:
  - Too much ambient and electronic noise
  - Reverberation and reflections
- The result is a poor user experience and bad speech recognition results.
- Noise suppression and de-reverberation are difficult with a single microphone channel.

The solution
- Use microphone arrays to capture the sound
  - A set of closely positioned microphones
  - Synchronous capture of the signals
- The microphone array acts as an acoustic antenna; this is called spatial filtering or beamforming
  - Listens only in the direction of the speaker
  - Reduces noise from other directions
  - Reduces reverberation

Beamforming: known approaches
- Fixed beam formation
  - Delay and sum: most intuitive, irregular beam shape
  - Parametric solutions: very complex
  - Fast real-time execution
- Adaptive beamformers
  - Generalized side lobe canceller
  - Vary with the target criteria (MVDR, etc.)
  - Slow adaptation, CPU-time intensive
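As an illustration of the delay-and-sum approach named on this slide, here is a minimal frequency-domain sketch. It is not taken from the talk: the function names, the linear array on the x axis, the far-field plane-wave model, the ideal omnidirectional microphones, and the 343 m/s speed of sound are all assumptions made for the example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(mic_x, angle_deg, freqs):
    """Far-field response d[f, i] of ideal omni mics placed on the x axis.

    A plane wave arriving from `angle_deg` (measured from the x axis)
    reaches a mic at position x earlier than the origin by x*cos(angle)/c.
    """
    mic_x = np.asarray(mic_x, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    lead = mic_x * np.cos(np.deg2rad(angle_deg)) / SPEED_OF_SOUND  # seconds
    return np.exp(2j * np.pi * freqs[:, None] * lead[None, :])


def delay_and_sum_weights(mic_x, angle_deg, freqs):
    """Weights W[f, i] that undo the per-mic phase lead and average the channels."""
    d = steering_vector(mic_x, angle_deg, freqs)
    return np.conj(d) / d.shape[1]


if __name__ == "__main__":
    mics = [0.0, 0.05, 0.10, 0.15]          # hypothetical 4-mic array, metres
    freqs = [500.0, 1250.0, 4000.0]
    W = delay_and_sum_weights(mics, 60.0, freqs)
    # Array gain towards the look direction and towards 120 degrees
    print(np.abs(np.sum(W * steering_vector(mics, 60.0, freqs), axis=1)))
    print(np.abs(np.sum(W * steering_vector(mics, 120.0, freqs), axis=1)))
```

The gain is exactly 1 in the look direction at every frequency, while the gain in other directions changes with frequency, which is one way to see the "irregular beam shape" mentioned above.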


Beamformer: canonical form
- Canonical form of the beamformer:
  \[ Y(f) = \sum_{i=0}^{M-1} W(f,i)\, X_i(f) \]
  - M: number of microphones
  - X_i(f): spectrum of the i-th channel
  - W(f,i): weight coefficients matrix
  - Y(f): output signal
- For each weight matrix there is a corresponding beam shape B(\varphi, \theta, f), the array gain as a function of direction.
- The goal is to find the weight matrix that satisfies certain criteria.
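The canonical form is a per-frequency weighted sum over the microphone channels. A minimal sketch for one spectral frame follows; the function name and the (bins, microphones) array layout are assumptions for illustration, not the talk's implementation.

```python
import numpy as np


def beamform_frame(W, X):
    """Apply Y(f) = sum_i W(f, i) * X_i(f) to one frame.

    W, X: complex arrays of shape (num_bins, M); returns Y of shape (num_bins,).
    """
    W = np.asarray(W)
    X = np.asarray(X)
    if W.shape != X.shape:
        raise ValueError("W and X must both have shape (num_bins, M)")
    return np.sum(W * X, axis=1)


if __name__ == "__main__":
    # Two bins, two microphones, plain averaging weights
    W = np.full((2, 2), 0.5)
    X = np.array([[1 + 1j, 1 - 1j],
                  [2.0, 4.0]])
    print(beamform_frame(W, X))
```

Because the weights are fixed per bin, the real-time cost is one complex multiply-add per microphone per frequency bin.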

Beamformer: array parameters
- Noise = ambient + non-correlated + correlated (jammers and reverberation)
- Ambient noise gain:
  \[ 20\log \int_{0}^{f_S/2} \int_{0}^{2\pi} \int_{-\pi/2}^{+\pi/2} N(f)\, B(\varphi,\theta,f)\, d\theta\, d\varphi\, df \]
- Non-correlated noise gain:
  \[ 20\log \int_{0}^{f_S/2} \sum_{i=0}^{M-1} W(f,i)^{2}\, df \]
- Correlated noise gain (from a given direction):
  \[ 20\log \frac{\int_{0}^{f_S/2} J(f)\, B(\varphi_J,\theta_J,f)\, df}{\int_{0}^{f_S/2} S(f)\, B(\varphi_S,\theta_S,f)\, df} \]
- The total noise gain is the combination of the first two.
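To make the non-correlated noise term concrete, the sketch below evaluates 20 log of the integral over frequency of the summed squared weights with a trapezoidal rule. Using |W|^2 for complex weights, the base-10 logarithm, and the function name are assumptions; the slide gives only the formula.

```python
import numpy as np


def noncorrelated_noise_gain_db(W, freqs):
    """20*log10 of the frequency integral of sum_i |W(f, i)|^2.

    W: complex array (num_bins, M); freqs: increasing bin frequencies.
    """
    W = np.asarray(W)
    freqs = np.asarray(freqs, dtype=float)
    per_bin = np.sum(np.abs(W) ** 2, axis=1)
    # Trapezoidal integration over the frequency axis
    integral = np.sum((per_bin[1:] + per_bin[:-1]) * np.diff(freqs)) / 2.0
    return 20.0 * np.log10(integral)


if __name__ == "__main__":
    freqs = np.linspace(0.0, 1.0, 11)   # normalized band, for illustration
    for m in (2, 4, 8):
        W = np.full((freqs.size, m), 1.0 / m)   # plain averaging weights
        print(m, round(noncorrelated_noise_gain_db(W, freqs), 2))
```

With averaging weights the gain falls as microphones are added, which matches the intuition that uncorrelated sensor noise averages out.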

Weights calculation
- Weights calculation as an optimization process
  - Minimization criterion: the total noise gain
- Multidimensional optimization
  - Slow, especially in real time (adaptive beamformers); can't follow the changes
  - Multimodal 2M-dimensional hypersurface with local minima
- In all cases the starting point is critical

Weights calculation (2)
- Our approach: deterministic beam formation
  - Use as much prior information as possible
  - Do your homework: calculate the weights in advance
  - Calculate a set of beams to cover the work volume
  - Fast real-time engine: switches the beams on the fly

Beamformer: prior info
- Prerequisites:
  - Microphone array geometry: microphone coordinates and orientation
  - Directivity response of the microphones, U_m(f, c)
  - Hardware noise model, N_I(f)
  - Ambient noise model, N_A(f)


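Pattern synthesis
- Design in the beamspace
- Define the target beam shape:
  \[ T(\rho,\varphi,\theta,\delta) = \cos\frac{\pi(\rho_T-\rho)}{k\delta}\,\cos\frac{\pi(\varphi_T-\varphi)}{\delta}\,\cos\frac{\pi(\theta_T-\theta)}{\delta} \]
- Define the weight function
- Combine the microphone directivity patterns using weighted MMSE:
  \[ T_{1\times L} = V_{1\times L}\, D_{M\times L}\, M_{M\times L}\, W_{1\times M} \]
- Do the design in 3D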
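One way to read "combine the microphone directivity patterns using weighted MMSE" is as a weighted least-squares fit of the array pattern to the target beam at each frequency. The sketch below does this in 2D (azimuth only) for a single frequency. It is an illustration under stated assumptions, not the talk's algorithm: ideal omnidirectional microphones, a far-field model, a target set to zero outside its main lobe, uniform direction weights, a response matrix stored as (L, M) rather than the slide's M x L, and hypothetical function names.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed


def target_beam(angles_deg, look_deg, width_deg):
    """Cosine target: 1 at the look direction, 0 at +/- width/2 and beyond."""
    diff = (np.asarray(angles_deg, dtype=float) - look_deg + 180.0) % 360.0 - 180.0
    shape = np.cos(np.pi * diff / width_deg)
    return np.where(np.abs(diff) <= width_deg / 2.0, shape, 0.0)


def mic_responses(mic_xy, angles_deg, freq_hz):
    """(L, M) far-field responses of ideal omni mics for L directions."""
    a = np.deg2rad(np.asarray(angles_deg, dtype=float))
    unit = np.stack([np.cos(a), np.sin(a)], axis=1)                  # (L, 2)
    lead = unit @ np.asarray(mic_xy, dtype=float).T / SPEED_OF_SOUND  # (L, M)
    return np.exp(2j * np.pi * freq_hz * lead)


def weighted_mmse_weights(D, T, V):
    """Weights W (M,) minimizing sum_l V[l] * |(D @ W)[l] - T[l]|^2."""
    s = np.sqrt(np.asarray(V, dtype=float))
    W, *_ = np.linalg.lstsq(D * s[:, None], np.asarray(T, dtype=complex) * s,
                            rcond=None)
    return W


if __name__ == "__main__":
    angles = np.arange(0.0, 360.0, 5.0)
    mics = [[-0.075, 0.0], [-0.025, 0.0], [0.025, 0.0], [0.075, 0.0]]
    D = mic_responses(mics, angles, 1250.0)
    T = target_beam(angles, look_deg=90.0, width_deg=120.0)
    W = weighted_mmse_weights(D, T, np.ones(angles.size))
    print(np.round(np.abs(D @ W)[::6], 2))   # fitted pattern every 30 degrees
```

A non-uniform V would let the designer demand a tighter fit in some directions, for example around the main lobe, at the cost of a looser fit elsewhere.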

[Figure: beams at 1250 Hz, gain vs. angle (deg); curves: desired, delay and sum]

[Figure: set of design beams, gains vs. angle (deg)]

Dimensions reduction
- Dimensions reduction: from 2M to 1
- Two competing effects:
  - Narrow beam: better ambient noise reduction
  - Wide beam: better internal noise reduction
- One-dimensional search over the beam width
- Cover the whole frequency band
- Calculate a set of beams
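The one-dimensional search can be pictured as a scan over candidate beam widths that keeps the width with the lowest total noise gain, repeated for each frequency bin. The sketch below uses made-up gain curves and combines the two gains as a power sum in dB; both the curves and the combination rule are assumptions for illustration, since the slides do not specify them.

```python
import numpy as np


def combine_db(*gains_db):
    """Power-sum of gains given in dB (assumed combination rule)."""
    return 10.0 * np.log10(sum(10.0 ** (g / 10.0) for g in gains_db))


def search_beam_width(widths_deg, ambient_gain_db, internal_gain_db):
    """Return (best_width, total_gain_db) over a grid of beam widths.

    ambient_gain_db, internal_gain_db: callables mapping width -> gain in dB.
    """
    widths = np.asarray(widths_deg, dtype=float)
    total = np.array([combine_db(ambient_gain_db(w), internal_gain_db(w))
                      for w in widths])
    k = int(np.argmin(total))
    return widths[k], total[k]


if __name__ == "__main__":
    # Toy curves: narrow beams reject ambient noise, wide beams reject
    # internal (sensor) noise. The numbers are invented.
    ambient = lambda w: -20.0 + 0.1 * w
    internal = lambda w: 10.0 - 0.2 * w
    print(search_beam_width(np.arange(5.0, 185.0, 5.0), ambient, internal))
```

Because only one parameter is scanned, an exhaustive grid is cheap and avoids the local-minimum problem of a 2M-dimensional search.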

Axes on the next charts
- Z-axis: noise gain in dB
- X-axis: frequency, logarithmic scale: 1 = 100 Hz, 2 = 200 Hz, 3 = 400 Hz, ..., 7 = 6400 Hz
- Y-axis: beam width, linear scale, 0° to 180° in 5° steps; 33 = 15°

[Chart: ambient noise gain vs. frequency and beam width]

[Chart: non-correlated noise gain vs. frequency and beam width]

[Chart: total noise gain vs. frequency and beam width]


Implementation: overall
- Offline: MASynthesis.exe designs the weights (files: MicArr.INI, Weights.dat)
- Real time: just use the pre-calculated weights
- Blocks: AEC, MABeamformer, Noise Suppression

Implementation: real-time engine
- Blocks: SSL, beam selection, gain calibration, gains correction, beamformer
- Input: N-channel input stream
- Output: mono output stream
- Data: geometry, weights
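The real-time path of beam selection followed by the beamformer can be sketched as a lookup into a bank of pre-calculated weights, indexed by the direction reported by the sound source localizer (SSL). The class below is hypothetical: its name, the azimuth-only beam grid, and the array layout are assumptions, and the gain calibration and correction steps are omitted.

```python
import numpy as np


class BeamBank:
    """Pre-calculated beams; picks the beam nearest to the SSL direction."""

    def __init__(self, beam_angles_deg, weights):
        # weights: complex array (num_beams, num_bins, M), designed offline
        self.angles = np.asarray(beam_angles_deg, dtype=float)
        self.weights = np.asarray(weights)
        if self.weights.shape[0] != self.angles.size:
            raise ValueError("one weight matrix is required per beam angle")

    def select(self, ssl_angle_deg):
        """Index of the beam whose direction is closest on the circle."""
        diff = np.abs((self.angles - ssl_angle_deg + 180.0) % 360.0 - 180.0)
        return int(np.argmin(diff))

    def process(self, X, ssl_angle_deg):
        """Beamform one frame X (num_bins, M) with the selected beam."""
        W = self.weights[self.select(ssl_angle_deg)]
        return np.sum(W * np.asarray(X), axis=1)


if __name__ == "__main__":
    angles = [0.0, 90.0, 180.0, 270.0]
    weights = np.full((4, 3, 2), 0.5)          # placeholder averaging beams
    bank = BeamBank(angles, weights)
    frame = np.ones((3, 2), dtype=complex)
    print(bank.select(350.0), bank.process(frame, 350.0))
```

Switching beams is then just a change of index, so no optimization runs in real time.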

Hardware designs
- USB MicArray prototypes
  - 4-mic desktop
  - 8-mic conference tabletop
- Bus-powered (no mains power needed)
- Compatible with USB audio (no device drivers to install)
- Integrated in laptops/monitors

Results: noise suppression
- Microphone array noise suppression
  - By itself provides 14-18 dB of ambient noise suppression
  - Helps the noise suppressor do a better job
  - More at http://micarray
- One of the best technologies on the market

  Device                               Noise, dB   Signal, dB   SNR, dB
  Omni-directional microphone          -45.53      -40.64        4.89
  Unidirectional microphone            -44.51      -33.91       10.6
  Close-up microphone                  -64.46      -30.04       34.42
  Andrea DA 400 2.0, 4 el. MA, $135    -51.72      -26.19       25.53
  Acoustic Magic, 8 element MA, $250   -62.39      -32.6        28.79
  MSR 4 elements + WinXP NS            -61.68      -33.86       27.82
  MSR 4 elements + New NS              -64.41      -32.14       33.27

Results: speech recognition
- Microphone arrays for speech recognition
  - Linear processing, speech-recognition friendly
  - Reduces ambient noise
  - Partial de-reverberation
- Results (speech recognition error):

  Device            Error rate, %   Time
  PC Mic            20.391          3:25
  VoiceTracker      17.9            3:17
  MSR MicArray      14.22           4:03
  MSR MicArray+NS   13.683          3:34
  Close-up           6.171          2:35

- Test setup: 4-element array, Yakima SAPI 5.2, 374 utterances, 7 speakers (4 male, 3 female), ages 25-53

Results: conclusions
- Ambient noise suppression
  - The current technology provides good noise suppression within the quality-requirement constraints
  - The telecommunication scenario has good sound quality
  - Meeting recordings for listening purposes are OK
- Speech recognition results
  - Need improvement
  - Reverberation is the major reason
  - Important for search technology over recorded meetings

Microphone Array: example
- Person speaking 3 ft from the microphones
  - Typical $10 PC microphone: SNR = 10.3 dB
  - PC mic + WinXP noise reduction: SNR = 18.4 dB
  - Competitor (HW DSP): SNR = 34.4 dB
  - MSR USB desktop array: SNR = 42.5 dB

Microphone array: demo
- First demo: records the output of the microphone array and of a regular PC microphone in parallel, then merges both WAV files into one file and plays it with CoolEdit.
- Second demo: ClearMessage application

Takeaways
- Most of our projects are optimization in one way or another:
  - Define the optimization criterion carefully
  - Reduce the number of dimensions as much as possible
  - Choose the method carefully, especially when there are many papers and no definitive answer

Finally
- Questions?
- Contact: ivantash@microsoft.com
- See: http://research.microsoft.com/users/ivantash/