Meeting Corpora Hardware Overview & ASR Accuracies

George Jose (153070011)
Guide: Dr. Preeti Rao
Indian Institute of Technology, Bombay
22 July, 2016

Outline
1. AMI Meeting Corpora
2.
3.

AMI Meeting Layout

Hardware Block Diagram

Audio Acquisition
24 Sennheiser MKE 2-5-C miniature electret microphones
Custom-built microphone power box
3 PreSonus Digimax preamplifiers/digitizers
1 Mark of the Unicorn 2408mkII PC interface
Cakewalk SONAR recording software

Audio Acquisition
Sennheiser MKE 2-5-C miniature electret microphone
  Linear frequency response between 20 Hz and 20 kHz
  Omnidirectional characteristics
  High sensitivity: 31 mV/Pa
Custom-built microphone power box
  MKE 2-5-C requires a separate DC bias voltage
  Provides a biasing voltage for all microphones

Audio Acquisition
PreSonus Digimax preamplifier/digitizer
  8-channel microphone preamplifier
  24-bit digitization
  Sample rates: 32 kHz, 44.1 kHz, 48 kHz
Mark of the Unicorn 2408mkII PC interface
  Provides an interface to the PC for hard-disk-based audio recording
  Supports 72 simultaneous input and audio channels
  Allows controlled acquisition through driver software on the PC
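As a rough illustration of what this acquisition chain has to sustain, the sketch below estimates the raw (uncompressed PCM) data rate for 24 microphone channels (three 8-channel preamplifiers) at 24 bits per sample, for each listed sample rate. The helper name is hypothetical and the calculation ignores any file-format or driver overhead.

```python
# Back-of-the-envelope estimate; channel count, bit depth and sample rates
# come from the slides, the helper name is invented for illustration.

def raw_data_rate_bytes_per_s(channels: int, bit_depth: int, sample_rate_hz: int) -> int:
    """Uncompressed PCM data rate in bytes per second."""
    return channels * (bit_depth // 8) * sample_rate_hz

channels = 3 * 8  # three 8-channel preamplifiers feed the 24 microphones
for fs in (32_000, 44_100, 48_000):
    rate = raw_data_rate_bytes_per_s(channels, 24, fs)
    print(f"{fs} Hz: {rate / 1e6:.3f} MB/s")
```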

Integrated Hardware

& Testing
Database: TIDigits (Adults)
Training Data: 55M + 57W = 112 speakers
Vocabulary: Digits 0-9, oh
Phones: 20
Monophone & Triphone Model

Test Data Simulation (REVERB Challenge Data)
Testing Data: 3M + 3W = 6 speakers
RIR: 8-channel circular array (diameter = 20 cm)
Convolved with RIRs of 3 different rooms
  1. SimRoom1: T60 = 0.25 s
  2. SimRoom2: T60 = 0.68 s
  3. SimRoom3: T60 = 0.75 s
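The simulation step (clean speech convolved with a room impulse response) can be sketched as follows. This is only an illustration: the slides use RIRs from the REVERB Challenge data, whereas the code below substitutes a synthetic single-channel RIR (exponentially decaying white noise whose energy drops 60 dB after T60). The 16 kHz sample rate, the function names and the random stand-in for speech are all assumptions.

```python
import numpy as np

def synthetic_rir(t60_s: float, fs: int = 16000, seed: int = 0) -> np.ndarray:
    """Stand-in RIR (assumption): white noise with a 60 dB energy decay over t60_s."""
    rng = np.random.default_rng(seed)
    n = int(round(t60_s * fs))
    t = np.arange(n) / fs
    # A 60 dB energy decay at t = T60 means the amplitude envelope is 10**(-3 t / T60).
    envelope = 10.0 ** (-3.0 * t / t60_s)
    rir = rng.standard_normal(n) * envelope
    return rir / np.max(np.abs(rir))

def reverberate(speech: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Reverberant signal = clean signal convolved with the RIR."""
    return np.convolve(speech, rir)

fs = 16000  # assumed sample rate
speech = np.random.default_rng(1).standard_normal(fs)  # 1 s placeholder for a clean utterance
for name, t60 in [("SimRoom1", 0.25), ("SimRoom2", 0.68), ("SimRoom3", 0.75)]:
    wet = reverberate(speech, synthetic_rir(t60, fs))
    print(name, "output length in samples:", len(wet))
```

With a multichannel array, the same convolution would be repeated once per microphone using that microphone's own RIR.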

SimRoom1 (T60 = 0.25 s)

SimRoom2 (T60 = 0.68 s)

SimRoom3 (T60 = 0.75 s)

BeamformIt Block Diagram
Wiener filter applied to each channel for noise reduction
Reference channel selection using the cross-correlation value
GCC-PHAT for TDOA estimation
TDOA post-processing to get a better estimate

BeamformIt: Parameters
Window Size: 64 ms
Hop Size: 32 ms
Reference Channel Selection: Based on cross-correlation
TDOA post-processing: Noise Threshold & Viterbi Decoding
Noise threshold: 10% of maximum cross-correlation
Performed Channel Elimination: Yes (to avoid bad frames)
Performed Weight Adaptation: Yes (to reduce noise)
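To make two of these parameters concrete, the sketch below converts the window and hop sizes to samples and applies a noise threshold to per-frame TDOA estimates. It takes the slide literally (frames whose cross-correlation peak falls below 10% of the maximum peak are treated as unreliable and reuse the previous frame's TDOA); the actual BeamformIt tool may define the threshold differently, and the Viterbi decoding step is not shown. The 16 kHz sample rate and all function names are assumptions.

```python
import numpy as np

def frame_params(fs: int, window_ms: float = 64, hop_ms: float = 32) -> tuple[int, int]:
    """Window and hop length in samples for a given sample rate."""
    return int(round(fs * window_ms / 1000)), int(round(fs * hop_ms / 1000))

def apply_noise_threshold(tdoas, peaks, fraction: float = 0.10) -> np.ndarray:
    """Replace TDOAs of low-confidence frames with the previous frame's value.

    A frame counts as low-confidence when its cross-correlation peak is below
    `fraction` times the largest peak (a literal reading of the slide).
    """
    tdoas = np.asarray(tdoas, dtype=float).copy()
    peaks = np.asarray(peaks, dtype=float)
    threshold = fraction * peaks.max()
    for i in range(1, len(tdoas)):
        if peaks[i] < threshold:
            tdoas[i] = tdoas[i - 1]
    return tdoas

print(frame_params(16000))  # window and hop in samples at an assumed 16 kHz
print(apply_noise_threshold([3, 9, 3, -7], [1.0, 0.05, 0.8, 0.02]))
```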

Word Error Rates (%): Before & After BeamformIt

Scenario: Clean Speech
                Monophone   Triphone
Clean Speech    0.40        0.59

Scenario: Noise Only
SNR     Condition   Monophone   Triphone
15 dB   Before      2.83        2.64
        After       1.58        1.45
10 dB   Before      8.04        6.26
        After       3.56        3.23
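For reference, word error rate is conventionally computed as the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The sketch below shows that standard definition; it is not the scoring script used for these slides, and the digit strings are made-up examples.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: word-level Levenshtein distance over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                      # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j                      # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
    return 100.0 * d[len(ref)][len(hyp)] / len(ref)

# Hypothetical digit strings: one substitution in four words.
print(word_error_rate("one two three oh", "one two nine oh"))
```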

Word Error Rates (%): Before & After BeamformIt

Scenario: Reverberation Only
T60      Condition   Monophone   Triphone
250 ms   Before      2.31        2.31
         After       1.45        1.84
680 ms   Before      9.68        15.35
         After       6.98        12.25
750 ms   Before      14.89       18.18
         After       11.26       13.90

Word Error Rates (%): Before & After BeamformIt

Scenario: SimRoom2 (680 ms)
SNR     Condition   Monophone   Triphone
15 dB   Before      39.06       39.92
        After       16.67       15.88
10 dB   Before      60.54       64.16
        After       33.20       33.99

Scenario: SimRoom3 (750 ms)
SNR     Condition   Monophone   Triphone
15 dB   Before      34.78       41.11
        After       15.15       20.82
10 dB   Before      56.85       66.73
        After       31.09       40.12
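One way to compare these conditions is the relative WER reduction, (before - after) / before. The sketch below computes it for the noisy reverberant results on this slide; the numbers are copied from the table, while the dictionary layout and function name are illustrative only.

```python
# (before, after) WER pairs in percent, copied from the slide.
wer = {
    ("SimRoom2", "15dB"): {"monophone": (39.06, 16.67), "triphone": (39.92, 15.88)},
    ("SimRoom2", "10dB"): {"monophone": (60.54, 33.20), "triphone": (64.16, 33.99)},
    ("SimRoom3", "15dB"): {"monophone": (34.78, 15.15), "triphone": (41.11, 20.82)},
    ("SimRoom3", "10dB"): {"monophone": (56.85, 31.09), "triphone": (66.73, 40.12)},
}

def relative_reduction(before: float, after: float) -> float:
    """Relative WER reduction in percent."""
    return 100.0 * (before - after) / before

for (room, snr), models in wer.items():
    for model, (before, after) in models.items():
        print(f"{room} {snr} {model}: {relative_reduction(before, after):.1f}% relative reduction")
```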